Programming Praxis – Primality Checking

Today’s Programming Praxis problem is about checking whether or not a number is prime. We’re supposed to implement the Miller-Rabin primality test, which is probably the most well-known algorithm to do this. Let’s go.

First the imports:

import Control.Arrow
import Data.Bits
import Data.List
import System.Random

The Miller-Rabin algorithm calls for random numbers. Since I didn’t want to make isPrime an IO function, we have to pass it a random number generator as well.

isPrime :: Integer -> StdGen -> Bool
isPrime n g =
    let (s, d) = (length *** head) . span even $ iterate (flip div 2) (n - 1)
        xs = map (expm n d) . take 50 $ randomRs (2, n - 2) g
    in all (\x -> elem x [1, n - 1] ||
                  any (== n - 1) (take s $ iterate (expm n 2) x)) xs

You may have noticed the expm in there. In theory all you have to do is mod (random_number ^ d) n. However, in our test case of 2^89 – 1, both the random number and d can easily be 10 digits. And 10-digit-number ^ 10-digit-number is a very big number, which takes a while to calculate. As a result, your algorithm is going to be terribly slow, as I discovered in my initial attempt. The solution lies in using modular exponentiation. This is a much faster way of calculating mod (a ^ b) m. In Scheme this function is apparently in the standard Prelude, but I couldn’t find a Haskell library that has it defined. No matter, we’ll just make our own:

expm :: Integer -> Integer -> Integer -> Integer
expm m e b = foldl' (\r (b', _) -> mod (r * b') m) 1 .
             filter (flip testBit 0 . snd) .
             zip (iterate (flip mod m . (^ 2)) b) $
             takeWhile (> 0) $ iterate (flip shiftR 1) e

And there’s our primality checking code. To test it, use:

main :: IO ()
main = print . isPrime (2 ^ 89 - 1) =<< getStdGen


11 Responses to “Programming Praxis – Primality Checking”

  1. programmingpraxis Says:

    There is no Standard Prelude for Scheme. As I write the exercises at Programming Praxis, I am gradually building my own Standard Prelude specific to these exercises.

    I’m surprised you couldn’t find a version for Haskell. Any crypto library has modular exponentiation, as does any math library that does number theoretic calculations. The square-and-multiply algorithm is standard stuff for many algorithms textbooks. Even SICP does modular exponentiation.

    The worse problem with defining a standard library function for modular exponentiation is the name. All of these are common: expm, expt-mod, mod-expt, exp-mod, mod-exp, power-mod, and variants of those without the dash.


  2. Remco Niemeijer Says:

    Just had another look. It turns out that searching for mod exp on Hayoo doesn’t find the expmod function from Codec.Encryption.RSA.NumberTheory. That same library also has an isPrime function, so the whole exercise could have been solved with one library call. Then again, using that I wouldn’t have learned anything, and where would be the fun in that? 🙂

  3. programmingpraxis Says:

    Your expm looks considerably more complicated than mine.

    In general, your entire solution looks considerably more complicated than mine. But I don’t speak Haskell (at least not well). Could you please break down your program step-by-step and explain how it works?

  4. Remco Niemeijer Says:

    The expm in my code is more complicated because I based it on the pseudocode in the wikipedia article on modular exponentiation, which is probably more appropriate for C than for Haskell. The expmod version in Codec.Encryption.RSA.NumberTheory uses the same algorithm you do, which is cleaner. It’s possible the one with bitshifting is slightly faster, but I can’t be sure without benchmarking.

    As for a step-by-step guide, sure.

    let (s, d) = (length *** head) . span even $ iterate (flip div 2) (n – 1)

    This is my way of getting s and d from this line in the wikipedia pseudocode:

    write n − 1 as 2^s·d with d odd by factoring powers of 2 from n − 1.

    We start with n – 1. Then we use the iterate function to build a list of the results of continuously dividing by two (flip div 2 being a slightly more elegant way of saying \n -> div n 2). For instance, if we start with 48 the list would be [48, 24, 12, 6, 3, 1, 0, 0, 0…]. Then we divide this list in two with span. Span divides the list in two when the argument (in this case even) no longer holds. So the list above would be split into the following tuple: ([48,24,12,6], [3,1,0,0,0…]). Finally we take the length of the first part and the first element of the last part (*** is a convenience function that applies its two arguments to the two sides of a tuple) to determine s and d: 48 is 2^4 (because the first list holds four items) times 3 (the first element of the second list).

    The next line,

    xs = map (expm n d) . take 50 $ randomRs (2, n – 2) g

    is the translation of the following lines from the pseudocode algorithm:

    LOOP: repeat k times:
    pick a randomly in the range [2, n − 2]
    x ← a^d mod n

    randomRs gives an infinite list of random numbers in the range defined by its first argument, in this case 2 to n – 2. Repeating k times is done by taking the first k (like you, I chose 50) random numbers. Since the random numbers are only used in x I decided to immediately calculate the needed xs by mapping expm over the random numbers.

    in all (\x -> elem x [1, n – 1] ||
    any (== n – 1) (take s $ iterate (expm n 2) x)) xs

    is the translation of

    if x = 1 or x = n − 1 then do next LOOP
    for r = 1 .. s − 1
    x ← x^2 mod n
    if x = 1 then return composite
    if x = n − 1 then do next LOOP
    return composite

    It says that for all the xs either x must 1 or n – 1 or the iteration of x^2 mod n must at some point be n – 1. If this is case then the number is prime, otherwise it is not.

    As for being more complicated, I’d say it’s a matter of taste and what you’re used to. Personally I find explicit for loops, especially nested ones, rather dangerous since I learned Haskell, because you have to keep a close eye on where and how the loop variables are changed. Functions like map and fold on the other hand work the same every time. An alternative approach to this algorithm can be found in Perhaps you will like that one better, though I must say I prefer mine.

    You might also enjoy this discussion on Reddit about Haskell being hard to read:

    If you have any more questions, let me know and I’ll be happy to try and answer them.

  5. Phil Says:

    What is the meaning of $ in three places?

    • zorg Says:

      in Haskell, function application associates to the left and has high precedence
      $ is an infix application operator with low precedence
      so a (b c) can be rewritten as a $ b c

  6. Phil Says:

    I’m not sure I like the RSA package you pointed to, regardless of the language. Isqrt should be implemented by Newton’s method, not by coercing an integer to floating point, taking the square root then the floor, and coercing back to integer. Primes should be generated by a sieve, not by trial division by odd numbers; likewise, trial division isn’t a very good way to find factors. The extended euclidean algorithm there repeats some tests, and isn’t very pretty; here’s my version in Scheme:

    (define (euclid x y)
    (let loop ((a 1) (b 0) (g x) (u 0) (v 1) (w y))
    (if (zero? w) (values a b g)
    (let ((q (quotient g w)))
    (loop u v w (- a (* q u)) (- b (* q v)) (- g (* q w)))))))

    I don’t believe David Sankel is really implementing RSA encryption with this code.

  7. Phil Says:

    I guess it’s an idiom of Haskell that laziness lets you separate the generation of values from stopping the generation after a particular count of values with take. I’m not sure if that’s a good thing or a bad thing. In C, all of the loop control is in one place, at the top of the loop. In Scheme, a do loop also has all the loop control in one place, but a named-let loop separates initialization, generation, and termination into three different places; that’s why named-let is sometimes hard to read. Nonetheless, tradition in Lisp/Scheme is to use recursion, and it’s baked into our genes — the first academic paper describing Lisp, by John McCarthy, is titled ‘Computation by recursive expressions’ or something like that.


    • zorg Says:

      > I guess it’s an idiom of Haskell that laziness lets you separate the generation
      > of values from stopping the generation after a particular count of values with
      > take. I’m not sure if that’s a good thing or a bad thing

      then read “Why FP matters” by John Hugues

  8. Remco Niemeijer Says:

    > What is the meaning of $ in three places?

    It’s just function application with a really low precedence, or, in terms of the type signature, a function of type (a -> b) -> a -> b. The main reason to use it is to cut down on the number of parentheses. All of the following are equivalent:

    f $ g x
    f (g x)
    f . g $ x
    (f . g) x
    f $ g $ x

    Using it or not is mainly a choice of style, though I think it is more common than using parentheses. Maybe to avoid being seen as just another lisp clone 🙂

  9. Programming Praxis – Monte Carlo factorization « Bonsai Code Says:

    […] we’re going to need to reuse the code from the Miller-Rabin primality test exercise, since we need to determine whether or not the number is […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: