In today’s Programming Praxis problem we’re supposed to factor numbers both the naive way and using Wheel factorization. Let’s go.
First, let’s define a function that finds factors using trial division:
tdFactors :: Integer -> [Integer] -> [Integer] tdFactors _ [] = [] tdFactors r (x:xs) | x * x > r = [r] | mod r x == 0 = x : tdFactors (div r x) (x:xs) | otherwise = tdFactors r xs
Using that, the function for naive factorization is trivial:
factorNaive :: Integer -> [Integer] factorNaive n = tdFactors n [2..]
Now for the wheel factorization. First we define a function that tells us which columns of our wheel to use given the circumference.
spokes :: Integer -> [Bool] spokes c = map ((== 1) . gcd c) [1..c]
To construct our wheel, we keep repeating this pattern and use only the numbers from the correct columns (and correct the start of the wheel).
wheel :: [Integer] -> [Integer] wheel ps = (ps ++) . drop 1 . map snd . filter fst $ zip (cycle . spokes $ product ps) [1..]
The function to do wheel factorization is practically identical to the naive one:
factorWheel :: Integer -> [Integer] factorWheel n = tdFactors n $ wheel [2,3,5,7]
And of course we need to test if everything works:
main :: IO () main = do print $ factorNaive 600851475143 print $ factorWheel 600851475143
And we’re done. Pretty simple.
Tags: factorization, kata, praxis, programming, wheel
May 8, 2009 at 6:29 pm |
I submitted this code at http://codepad.org/xbZJhC9N and got a timeout error while factorNaive was running. Did I do something wrong?
Phil
May 8, 2009 at 8:07 pm |
Hm. I guess they’ve set their timeout limit pretty low.
A benchmark on my machine shows about 2.7 and 0.7 seconds for factorNaive and factorWheel when running in the interpreter and 0.2 and 0.1 seconds for the compiled version. How fast does your Scheme solution run?
May 8, 2009 at 8:15 pm |
On my two-year-old pc at work, 0 ms for both, running in the interpreter.
On the same machine as my Haskell attempt, at http://codepad.org/HScJryvX, 0 ms for both; I don’t know what the hardware is, but the Scheme system is an old version of MzScheme, an interpreter.
Phil
May 8, 2009 at 8:20 pm |
One speedup that works, assuming you only call tdFactors with an increasing list of numbers, is to replace factor with
factor r (x:xs) | x > r = []
| mod r x == 0 = x : factor (div r x) (x:xs)
| otherwise = factor r xs
(the only change is the added first guard).
That reduces execution time to 30 and 0 ms interpreted and 0 ms compiled.
May 8, 2009 at 8:24 pm |
Shouldn’t you stop as soon as x*x > r? But doesn’t the takewhile already do that?
Phil
May 8, 2009 at 8:29 pm |
No, if you do that then it only gives three of the four factors of the test number. The x > r clause does obsolete the takeWhile bit though (again, assuming increasing numbers).
May 8, 2009 at 8:48 pm |
Then something is wrong. You’re not handling the termination properly. You should be able to quit as soon as you exceed the square root of the remaining cofactor.
May 8, 2009 at 8:53 pm |
Correct. It should be
| x * x > r = [r]
On the bright side, now I understand why you were adding n to the factors at the end in your solution
I’ve updated the main post with this version.
May 8, 2009 at 8:54 pm |
I bet your timings are faster now.
May 8, 2009 at 9:04 pm |
Yeah, 0-15 and 0 ms interpreted as expected. And I realized that my original version only worked with random orderings in a very limited number of cases (it could only be in random order until the first case where x > sqrt(n), after that all xs had to be more than sqrt(n).
Thanks for the improvement
May 8, 2009 at 9:19 pm |
Please post your revised code at Programming Praxis, along with a discussion of why it is better than the original.
May 8, 2009 at 9:32 pm |
One further minor optimization: I interpreted the assignment so that we were supposed to find only the unique factors (e.g 60 results in [2,3,5] instead of [2,2,3,5]). Since your solution doesn’t, we can remove the nub function and the required import. Saves another two lines.
May 8, 2009 at 9:40 pm |
And with nub gone, the internal function factor becomes the entire function definition. Another line gone.
May 8, 2009 at 9:49 pm |
If you’re removing lines, you can take out tdFactors _ [] = []. It makes the function partial, but is unneeded.
May 8, 2009 at 9:57 pm |
Obviously, but I prefer my code not to throw any compilation warnings. Besides, this way it also works on non-infinite lists (not needed in this program, but still).