Archive for July, 2009

Vacation announcement

July 14, 2009

Since I will be on vacation for the next 4 weeks with no internet access, I won’t be able to post anything in the meantime. I should be back for Programming Praxis puzzle number 59.

In the meantime, I leave you with the following two koans to ponder:

A student said to his master:
“Master, there is a bug in my code, but I cannot find it.”
The master took one look at the code and said:
“No wonder, your code is far too long.
As it is difficult to find a needle in a haystack,
so is it difficult to find bugs in lengthy code.”
The student spent the next year studying
abstractions, algorithms and refactoring.
After that year, he once again approached his master and said:
“Master, there is a bug in my code, but I cannot find it.”
The master took one look at the code and said:
“No wonder, your code, while short, is so abstract
that it is difficult to reason about what happens.”
The student was enlightened.

A student approached his master and said:
“Master, I have heard that a neighboring monastery
teaches a different philosophy from ours. I would
very much like to study there.”
The master nodded, and said “It is always a good idea
to learn of other philosophies. You have my permission.”
Two years later the student returned to the monastery.
“Master, in my time at the other monastery I have observed
a striking difference between the two philosophies:
whereas we teach a strict adherence to the rules at all times,
they are more interested with the result, and obey rules
only when convenient. What is your opinion on this?”
The master replied: “What do you use to eat meat?”
“Chopsticks”, replied the student.
“What do you use to eat soup?”
“A spoon”.
The student was enlightened.

Programming Praxis – The Daily Cryptogram

July 14, 2009

Today’s Programming Praxis problem is an interesting one: we have to write a program to solve monoalphebetic substitution ciphers.

The algorithm we’re going to use is by no means perfect, but for the example cryptogram, as well as a few other test cases, it generally gives a good enough approximation that only a little bit of human help is needed to get the full solution. It also has the benefit of being nice and short.

First, our imports:

import Data.Char
import Data.List
import qualified Data.Map as M
import GHC.Exts

We’re going to be using a dictionary-based approach. In order for that to work, we need a way to see if a given dictionary word can fit the pattern defined by the crypto word. We do this by looking at the first letter of the crypto and the dictionary word. If we already have a partial solution key, we check if the decoded letter is the same as that of the dictionary word. If we don’t, we add the relationship between the two letters to the solution key. Then we check the rest of the two words in the same way.

match :: M.Map Char String -> String -> String -> Bool
match _ []     []     = True
match k (c:cs) (p:ps) = M.findWithDefault [p] c k == [p] &&
                        match (M.insert c [p] k) cs ps
match _ _      _      = False

To apply the solution key to an encrypted string, we replace all the letters that exist in the key with their decoded equivalents, leaving the rest as is.

substitute :: M.Map Char String -> String -> String
substitute key = concatMap (\c -> M.findWithDefault [c] c key)

To get a series of potential keys, we look at every word of the encrypted text and see which dictionary words can still be substituted for them. If there are less than 100 options for a word, we generate all the keys that would result from those substitutions.

getKeys :: M.Map Char String -> [String] -> String -> [M.Map Char String]
getKeys k dict c = [M.fromList .
    unionBy (\(a,x) (b,y) -> a == b || x == y) (M.assocs k) .
    zip w $ map return p | w <- words c,
    let ps = filter (match k w) dict, length ps < 100, p <- ps]

We score keys by the amount of letters of they decrypt; the more, the better.

score :: M.Map Char String -> String -> Int
score k = negate . length . filter isLower . substitute k

We iterate this process a couple of times by feeding the resulting keys into a new iteration of the algorithm. In order to prevent combinatorial explosion, we only keep the 10 best keys of each generation.

findBestKeys :: Int -> [String] -> String -> [M.Map Char String]
findBestKeys n dict c = iterate (take 10 . sortWith (`score` c) .
    concatMap (\k -> getKeys k dict c)) [M.empty] !! n

‘Solving’ a crpytogram, i.e. showing the 10 best candidates and letting the user interpret them, means finding the best keys and showing the resulting substitutions. The first argument determines how many generations are performed, which determines the quality of the solution. For some cryptograms you may have to pay around with this a little. In my tests, something in the order of 3-7 usually works fairly well.

solve :: Int -> [String] -> String -> [String]
solve n dict c = map (`substitute` c) $ findBestKeys n dict $
                 filter (\x -> isAlpha x || isSpace x) c

Let’s see if our little algorithm works. For our dictionary we use english-words.10, a file that can be found in SCOWL, a Spell-Checker Oriented Word List (more info here). It is a list of common English words that offers a good balance between quality (it works better for my algorithm than the usr/dict/words file from Linux does) and speed (it’s just under 5000 words, which means that I get a solution in less than a second).

crypto = "P OYUUAOEXYW YM AEFGAD, FZGPEAG JAATUL, MYC ENA " ++
         "AGFOPEXYW PWG AWSYLVAWE YM ENA DPHHL ZCYICPVVAC"

main :: IO ()
main = do dict <- fmap lines $ readFile "english-words.10"
          mapM_ print $ solve 7 dict crypto

The resulting decrypted text is “a collection of etudew, updated JeeTls, for the education and enSosment of the waHHs programmer”, which looks strikingly similar to the old motto of Programming Praxis – “a collection of etudes, updated weekly, for the education and enjoyment of the savvy programmer”. As you can see, there are a few errors and it doesn’t get everything, but that is because these words are not in the dictionary file I used. Still, for an 18-line algorithm I’d say it works well enough. A trigram- or genetic algorithm-based solution might work better, but we’ll see that when the official solution is released.

Programming Praxis – The Golden Ratio

July 10, 2009

Today’s Programming Praxis problem is an easy one: all we have to do is make a function that calculates the golden ratio. Sadly, the provided solution already has the easiest way to do this, so all we can do is use the Haskell equivalent.

Since we want real fractions instead of floating point numbers, we’re going to need the Data.Ratio package.

import Data.Ratio

To calculate the golden ratio, we repeatedly take the reciprocal and add one, starting with 1 for the first step.

golden :: Int -> Rational
golden n = iterate (succ . recip) 1 !! n

A simple test to show it works correctly:

main :: IO ()
main = print $ golden 200

And that’s all there is to it. Piece of cake.

Programming Praxis – Modular Arithmetic

July 8, 2009

Yesterday’s Programming Praxis problem is about making a convenient way to do modular arithmetic. While the most elegant way to do this in Haskell is probably via a Num instance, my attempts at achieving this have failed because either the typechecker cannot infer everything anymore, or because you cannot use the same syntax for modular square roots, congruency and the rest of the operations. Maybe a better Haskell hacker than me can come up with a working solution.

In the meantime, we’re going to use a solution that’s consistent and plays nice with the type checker, even if it’s a tad less elegant.

First our imports:

import Data.Bits
import Data.List

We’ll be needing these two functions for divisions.

euclid :: Integral a => a -> a -> a
euclid x y = f 1 0 x 0 1 y where
    f a b g u v w | w == 0    = mod a y
                  | otherwise = f u v w (a-q*u) (b-q*v) (g-q*w)
                                where q = div g w

inv :: Integral a => a -> a -> a
inv x m | gcd x m == 1 = euclid x m
        | otherwise    = error "divisor must be coprime to modulus"

All the basic operations are pretty simple:

(<=>) :: Integral a => a -> a -> a -> Bool
a <=> b = \m -> mod a m == mod b m

(<+>), (<->), (<*>), (</>) :: Integral a => a -> a -> a -> a
a <+> b = \m -> mod (a + mod b m) m
a <-> b = a <+> (-b)
a <*> b = \m -> mod (a * mod b m) m
a </> b = \m -> (a <*> inv b m) m

For exponentiation we get to recycle the expm function, which was used in previous exercises.

expm :: Integer -> Integer -> Integer -> Integer
expm b e m = foldl' (\r (b', _) -> mod (r * b') m) 1 .
             filter (flip testBit 0 . snd) .
             zip (iterate (flip mod m . (^ 2)) b) $
             takeWhile (> 0) $ iterate (`shiftR` 1) e

And a little syntactic shortcut for expm:

(<^>) :: Integer -> Integer -> Integer -> Integer
(<^>) = expm

This modular square root implementation is undoubtedly a lot slower than the provided one. However, implementing that one would be boring (just translating Scheme to Haskell, with not much opportunity for size reduction) and would double or triple the size of the code. Therefore we use a more naive algorithm, that works well enough when the modulus is small. Copying the provided algorithm to Haskell is left as an exercise for the reader.

sqrtm :: Integral a => a -> a -> [a]
sqrtm x m = case [n | n <- [1..m], (n <*> n) m == mod x m] of
                (_:_:_:_) -> []
                s         -> s

The modulo function is just syntactic sugar for reverse function application.

modulo :: a -> (a -> b) -> b
modulo = flip ($)

This results in test code that looks as follows:

main :: IO ()
main = do mapM_ (modulo 12) [
              print . (17 <=> 5),
              print . (8 <+> 9),
              print . (4 <-> 9),
              print . (3 <*> 7),
              print . (9 </> 7)]
          mapM_ (modulo 13) [
              print . (6 <^> 2),
              print . (7 <^> 2),
              print . sqrtm 10]

Not terrible, but I believe a better solution exists. If anyone has one, please let me know.

Programming Praxis – The Playfair Cipher

July 3, 2009

Today’s Programming Praxis problem is about the Playfair Cipher, a way of encrypting messages that was used in the first and second World Wars. Our target is 44 lines, so let’s get cracking.

We’re going to need a bunch of imports on this one:

import Data.Char
import Data.List
import Data.List.HT
import Data.List.Split
import qualified Data.Map as M

And for convenience, we make a type definition for Keys.

type Key = M.Map (Int, Int) Char

In order for the algorithm to work, we’re going to have to filter all the input so that all we get is capital letters without a J.

process :: String -> String
process = replace "J" "I" . map toUpper . filter isLetter

Instead of working with a block, we turn the key into a map of of x-y coordinates to letters. This makes the rest of the algorithm simpler.

key :: String -> Key
key = M.fromList . concat .
      zipWith (\y -> zipWith (\x c -> ((x, y), c)) [0..]) [0..] .
      chunk 5 . (`union` delete 'J' ['A'..'Z']) . nub . process

bigram encodes or decodes two letters at a time according to the three given rules.

bigram :: Key -> Int -> Char -> Char -> String
bigram k dir c1 c2
    | y1 == y2  = get (x1 + dir, y1) : [get (x2 + dir, y2)]
    | x1 == x2  = get (x1, y1 + dir) : [get (x2, y2 + dir)]
    | otherwise = get (x2, y1)       : [get (x1, y2)]
    where (x1, y1) = head . M.keys $ M.filter (== c1) k
          (x2, y2) = head . M.keys $ M.filter (== c2) k
          get (x,y) = k M.! (mod x 5, mod y 5)

Encoding consists of inserting Xs in the correct places and then encoding all the bigrams.

encode' :: Key -> String -> String
encode' _ []       = []
encode' k [x]      = encode' k (x : "X")
encode' k (x:y:xs) | x == y    = encode' k [x] ++ encode' k (y:xs)
                   | otherwise = bigram k 1 x y ++ encode' k xs

Decoding is simpler, since there are no special cases there.

decode' :: Key -> String -> String
decode' k = concatMap (\[x,y] -> bigram k (-1) x y) . chunk 2

And finally, two convenience functions so we can work with strings directly.

encode :: String -> String -> String
encode k = encode' (key k) . process

decode :: String -> String -> String
decode k = decode' (key k) . process

All that remains is to test if everything works correctly:

main :: IO ()
main = do print $ encode "PLAYFAIR" "PROGRAMMING PRAXIS"
          print $ decode "PLAYFAIR" "LIVOBLKZEDOELIYWCN"

And we’re done. Interestingly, if you try to decode the 1943 message, you’ll notice two errors in the result: it says Blackess instead of Blackett and coce instead of cove. I wonder if the errors were made by the original sender or by the first person to type it on a computer.

Anyway, the solution is 24 lines, just over half the size of the Scheme version. That will do nicely.