Archive for November, 2010

Programming Praxis – Form Letters

November 30, 2010

In today’s Programming Praxis exercise, we have to write a program to generate form letters. Let’s get started, shall we?

First, some imports:

import Control.Applicative ((*>), (<*>), (<$>))
import Text.CSV
import Text.Parsec

The format for the message template is simple enough to do with a plain recursive algorithm, as I did initially, but in the end I decided to go with Parsec since it’s slightly cleaner.

fillWith :: String -> [String] -> String
fillWith text vars = either show concat $ parse form "" text where
    form = many $ escape <|> count 1 anyChar
    escape = char '$' *> (string "$" <|>
        ((vars !!) . read <$> option "0" (many1 digit)))

Once we have the function to fill in the message template with a record, doing this for multiple records is pretty simple.

formLetters :: FilePath -> FilePath -> IO [String]
formLetters schema vars = either (return . show) . map . fillWith <$>
    readFile schema <*> parseCSVFromFile vars

A quick test to see if everything is working properly:

main :: IO ()
main = mapM_ putStrLn =<< formLetters "schema.txt" "data.txt"

Yep. Now make sure never to use this code because nobody wants to see more form letters 🙂

Programming Praxis – Divisors And Totatives

November 26, 2010

In today’s Programming Praxis exercise,our goal is to make a small library to compute the following of a number: the divisors, the sum of the divisors, the number of divisors, the totatives and the totient. Let’s get started, shall we?

Some imports:

import Data.List
import Data.Numbers.Primes
import Data.Ratio

Calculating the divisors of a number could be done the compact but naive way (divisors n = filter ((== 0) . mod n) [1..n]), but let’s go with the slightly longer but more efficient version of multiplying every combination of prime factors.

divisors :: Integral a => a -> [a]
divisors = nub . sort . map product . subsequences . primeFactors

Getting the sum of divisors is trivial.

divisorSum :: Integral a => a -> a
divisorSum = sum . divisors

For the amount of divisors we’ll ignore the trivial method (length . divisors) and go with the more efficient algorithm.

divisorCount :: Integral a => a -> Int
divisorCount = product . map (succ . length) . group . primeFactors

The totatives can be be trivially calculated using trial division.

totatives :: Integral a => a -> [a]
totatives n = filter ((== 1) . gcd n) [1..n]

For the totient there is also a better way than the trivial one (length . totatives):

totient :: (Integral a, Integral b) => a -> b
totient n = floor . product $ n % 1 : [1 - 1 % f | f <- nub $ primeFactors n]

As usual, some tests to see if everything is working properly:

main :: IO ()
main = do print $ divisors 60 == [1,2,3,4,5,6,10,12,15,20,30,60]
          print $ divisorSum 60 == 168
          print $ divisorCount 60 == 12
          print $ totatives 30 == [1,7,11,13,17,19,23,29]
          print $ totient 30 == 8
          print $ totient 60 == 16

Looks like it is, and at five lines of code I’d call this a pretty small library 🙂 Fortunately, the non-trivial algorithms don’t add much in the way of length. Sure, the lines are a little longer, but they’re all still one-liners. I’d say it’s worth it for the improved speed.

Programming Praxis – String Subsets

November 23, 2010

In today’s Programming Praxis exercise,we have to write functions to determine if one string is a subset of another as if we were in an interview. Let’s get started, shall we?

First, some imports.

import Data.List
import qualified Data.Map as M
import qualified Data.IntMap as I

My first attempt doesn’t actually work, since the intersect function only checks whether an element is a member of the second list. It doesn’t keep track of duplicates.

subsetOf1 :: Eq a => [a] -> [a] -> Bool
subsetOf1 xs ys = intersect xs ys == xs

Since a call to a library function won’t suffice, we’ll have to whip up something ourselves. The obvious way is to get a count of all the characters in both strings and check if the second string has an equal or higher count for all the characters in the first string. Of course this method is O(n * m), so it’s not very efficient.

subsetOf2 :: Ord a => [a] -> [a] -> Bool
subsetOf2 xs ys = all (\(c, n) -> maybe False (n <=) .
                       lookup c $ count ys) $ count xs
    where count = map (\x -> (head x, length x)) . group . sort

To improve the big O complexity, we’re going to switch to a data type that has a faster lookup. Like the previous version, counting the frequency of each letter is O(n log n), but using Maps the comparison can now be done in O(m + n), meaning the overall complexity remains at O(n log n).

subsetOf3 :: Ord a => [a] -> [a] -> Bool
subsetOf3 xs ys = M.null $ M.differenceWith
    (\x y -> if x <= y then Nothing else Just x) (f xs) (f ys)
    where f = M.fromListWith (+) . map (flip (,) 1)

Since the keys of the map are characters, there’s a further optimization we can make. By converting the characters to integers we can use an IntMap instead of a plain Map. An IntMap has a lookup of O(min(n,W)), with W being the amount of bits in an Int. For any non-trivial n, this results in O(1) lookup. Counting all the letters can now be done in O(n). Since the comparison still takes O(m + n), the resulting complexity is O(m + n). This is the minimum we can achieve, as we need to fully evaluate both strings to produce an answer, which is an O(m + n) operation.

subsetOf4 :: Enum a => [a] -> [a] -> Bool
subsetOf4 xs ys = I.null $ I.differenceWith
    (\x y -> if x <= y then Nothing else Just x) (f xs) (f ys)
    where f = I.fromListWith (+) . map (flip (,) 1 . fromEnum)

A quick test shows that the first function indeed fails, but the other ones succeed.

main :: IO ()
main = do let test f = print (f "da" "abcd", not $ f "dad" "abcd")
          test subsetOf1
          test subsetOf2
          test subsetOf3
          test subsetOf4

I must say I hope that I have internet access during the interview, though. If not, I would have had to come up with an alternative for differenceWith and I might not have remembered the existence of IntMap. In that case I’d probably have gone with something along the lines of the array-based solution from number 4 of the Scheme solutions.

Boolean Minimization

November 22, 2010

A week ago I got the following email from Trevor Hill:

Dear Remco,

I have enjoyed reading some of your bonsai code postings on the Internet, thank you for posting these examples.

At present I am teaching myself Haskell programming.  I thought that writing a boolean expression minimizing program would be a good exercise and found an interesting technique in a paper by Adrian Duşa, University of Bucharest titled “A mathematical approach to the boolean minimization problem”  at http://www.compasss.org/files/WPfiles/Dusa2007.pdf

My Haskell skills are still very basic.  I managed to write some Haskell to generate Adrian’s  “Difference Matrix” but the rest of the implementation evades my current Haskell skills.

I was wondering if this problem may be something that you would enjoy using in one of your Haskell postings. I am sure that a really elegant Haskell program can be written to implement Dusa’s technique.

My crude implementation of a difference matrix generator and code to input a truth table is in the attached file.

Regards,
Trevor Hill.

Boolean minimization is a technique that takes a logic circuit and tries to find a smaller circuit that produces the same output. You can find a more detailed description here. Among others, it is used in circuit design. So let’s get started, shall we?

First, some imports:

import Data.Bits
import Data.List
import qualified Data.List.Key as K

The algorithm we’re going to be using is the Quine McCluckey minimization algorithm. The first step is to convert a list of minterms (the inputs for which the circuit produces 1, written as integers) to disjunctive normal form (the same thing, but written as the required state for each input).

dnf :: Int -> [Int] -> [[(Char, Bool)]]
dnf n = map (\m -> [(['a'..] !! (n - 1 - y), testBit m y)
                   | y <- [n-1, n-2..0]])

The principle behind the Quine McCluckey algorithm is to repeatedly use complementation to reduce the total number of terms in the expression. If only one variable differs between two terms, this variable can be eliminated, since its state has no influence on the result in that situation.

combine :: (Eq a, Eq b) => [(a, b)] -> [(a, b)] -> [(a, b)]
combine a@((x,onx):xs) ((y,ony):ys)
    | onx /= ony = if x == y && xs == ys then xs else a
    | x == y     = (x,onx) : combine xs ys
combine a _ = a

A single pass consists of trying to combine all the terms in the expression to end up with a simpler expression. Often, there are multiple candidates. This version takes the naive approach of evaluating all possibilities, since some may be reduced further than others in subsequent passes. This does make the algorithm rather inefficient when there are a lot of variables. To speed this up, some kind of heuristic would have to be used to reduce the solution space.

pass :: (Eq a, Eq b) => [[(a, b)]] -> [[(a, b)]] -> [[[(a, b)]]]
pass _ []     = [[]]
pass b (x:xs) = case filter ((< length x) . length . combine x) (xs ++ b) of
    [] -> map (x :) $ pass (x:b) xs
    ys -> (\y -> map (combine x y :) $ pass (x:y:b) (delete y xs)) =<< ys

Minimizing is just applying multiple passes until we’ve ended up with the minimum expressions and then choosing the shortest one out of all the options.

minimize :: [[(Char, Bool)]] -> [[(Char, Bool)]]
minimize = K.minimum (length . concat) . untilRepeat .
           iterate (pass [] =<<) . return where
    untilRepeat ~(x:y:xs) = if x == y then x else untilRepeat (y:xs)

To make the output a bit easier to read, we make a prettyprinting function. An apostrophe after a value negates the value, subsequent variables are ANDed together and + means OR.

pretty :: [[(Char, Bool)]] -> String
pretty = intercalate " + " . sort . map (prettyVar =<<) where
    prettyVar (var, on) = var : if on then "" else "'"

For the sake of convenience, we make a function that combines all the steps, printing the minimized expression for a series of minterms.

run :: Int -> [Int] -> IO ()
run n = putStrLn . pretty . minimize . dnf n

All that’s left is to test our algorithm:

main :: IO ()
main = do run 3 [1,2,4,5,6,7]
          run 5 [7,8,9,10,23,24,25,26]

          let test n xs r = print $ pretty (minimize $ dnf n xs) == r
          test 3 [1,2,4,5,6,7] "a + b'c + bc'"
          test 5 [7,8,9,10,23,24,25,26] "b'cde + bc'd' + bc'e'"
          test 3 [0,1,4,5] "b'"

Everything seems to be working properly, albeit not terribly fast for larger inputs. Thanks for the exercise, Trevor.

Programming Praxis – Topological Sort

November 19, 2010

In today’s Programming Praxis exercise, our goal is to write two functions related to directed acyclical graphs (DAGs). The first one is to check whether a given directed graph is acyclical. The second is to perform a topological sort of a DAG, which means to sort it so that no node precedes a node that leads to it. Let’s get started, shall we?

A quick import:

import Data.List

The following function is just a bit of syntactic sugar for an operation I use a few times.

with :: (a -> b) -> [a] -> (b -> b -> Bool) -> b -> [a]
with t xs eq x = filter ((eq x) . t) xs

Both functions need to find vertices with no incoming edges.

noIncoming :: Eq a => [(a, a)] -> [a] -> Maybe a
noIncoming es = find (null . with snd es (==))

Checking if a graph is cyclical is a simple matter of recursively removing nodes with no incoming edges to see if any remain, which would mean that the graph is cyclical.

isCyclic :: Eq a => [(a, a)] -> Bool
isCyclic = not . null . until (\x -> remove x == x) remove where
    remove es = maybe es (with fst es (/=)) . noIncoming es $ map fst es

The process for topologically sorting a list is roughly similar: Find a vertex with no incoming edges, remove the edges leading from it and repeat, returning the vertices in the correct order.

tsort :: Eq a => [(a, a)] -> [a]
tsort xs = if isCyclic xs then error "cannot sort cyclic list"
           else f xs . nub . uncurry (++) $ unzip xs where
    f es vs = maybe [] (\v -> v : f (with fst es (/=) v) (delete v vs)) $
              noIncoming es vs

Some tests to see if everything is working correctly:

main :: IO ()
main = do print $ isCyclic [(3,8),(3,10),(5,11),(7,8)
                           ,(7,11),(11,2),(11,9),(11,10)]
          print $ isCyclic [(3,8),(3,10),(5,11),(7,8),(7,11)
                           ,(10,5),(11,2),(11,9),(11,10)]
          print $ tsort [(3,8),(3,10),(5,11),(7,8)
                        ,(7,11),(11,2),(11,9),(11,10)]

We get a different order than the Scheme solution, but as the exercise mentions there are many different possible sorts. Since we’re using a different algorithm, we get different results.

Programming Praxis – RSA Cryptography

November 16, 2010

In today’s Programming Praxis exercise, our task is to implement a key generator and encryption/decription functions for RSA cryptography. Let’s get started, shall we?

A few imports:

import Data.Bits
import Data.List
import System.Random
import Data.Numbers.Primes

Like the Scheme solution, we’ll be recycling a few functions from previous solutions.

expm :: Integer -> Integer -> Integer -> Integer
expm b e m = foldl' (\r (b', _) -> mod (r * b') m) 1 .
             filter (flip testBit 0 . snd) .
             zip (iterate (flip mod m . (^ 2)) b) $
             takeWhile (> 0) $ iterate (`shiftR` 1) e

euclid :: Integral a => a -> a -> a
euclid x y = f 1 0 x 0 1 y where
    f a b g u v w | w == 0    = mod a y
                  | otherwise = f u v w (a - q*u) (b - q*v) (g - q*w)
                                where q = div g w

inv :: Integral a => a -> a -> a
inv x m | gcd x m == 1 = euclid x m
        | otherwise    = error "divisor must be coprime to modulus"

The keygen returns the modulus and the decryption key.

keygen :: Integer -> Integer -> IO (Integer, Integer)
keygen bits key = do p <- gen (div bits 2)
                     q <- gen (div bits 2)
                     let d = inv key ((p - 1) * (q - 1))
                     return (p * q, d) where
    gen k = fmap (until valid succ) $ randomRIO (2 ^ (k - 1), 2 ^ k)
    valid v = gcd key (v - 1) == 1 && mod v 4 == 3 && isPrime v

For encrypting and decrypting we could just use expm directly, but we flip the last two arguments to correspond to the Scheme solution.

crypt :: Integer -> Integer -> Integer -> Integer
crypt = flip . expm

Some tests to see if everything’s working correctly:

main :: IO ()
main = do let e = 65537
          (m, d) <- keygen 32 e
          print $ crypt (crypt 42 m e) m d == 42
          print $ crypt 42 1893932189 e == 1118138102

Everything seems to be working fine.

Programming Praxis – Rowland’s Prime-Generating Function

November 12, 2010

In today’s Programming Praxis exercise our goal is to implement a few functions that create integer sequences related to prime generation. Let’s get started, shall we?

As in the Scheme solution, the function names refer to their number in the On-Line Encyclopedia of Integers.

The first sequence can be written in a manner similar to the typical Fibonacci sequence solution.

a106108 :: [Integer]
a106108 = 7 : zipWith (\n r -> r + gcd n r) [2..] a106108

The next two just build on the previous one and are fairly trivial.

a132199 :: [Integer]
a132199 = zipWith (-) (tail a106108) a106108

a137613 :: [Integer]
a137613 = filter (/= 1) a132199

The shortcut function is slightly longer. We store the sum of all the previous numbers so that the function runs in O(n2) rather than O(n3).

shortcut :: [Integer]
shortcut = f 0 1 where
    f s n = r : f (s + r) (n + 1) where r = lpd (6 - n + s)
    lpd n = head $ filter (\d -> mod n d == 0) [3,5..]

Some tests to see if we’re getting the correct output:

main :: IO ()
main = do print $ take 65 a106108
          print $ take 104 a132199
          print $ take 72 a137613
          print $ take 72 shortcut

Yup. Piece of cake.

Programming Praxis – Subset Sums

November 9, 2010

In today’s Programming Praxis exercise, we have to solve the third part of the Greplin challenge, which is to find the amount of subsequences of a list of integers where the largest number is equal to the sum of the rest. Let’s get started, shall we?

A quick import:

import Data.List

In keeping with the spirit of the original challenge, we’re going to go with the naive but simple method, which is mostly just a matter of translating the problem description from English to Haskell syntax. The only extra element is tail, which is needed to get rid of an empty list element.

greplin3 :: [Integer] -> Int
greplin3 = length . filter (\s -> sum (init s) == last s) .
           tail . subsequences

Applying this function to the given list gives us the expected answer of 179.

main :: IO ()
main = print $ greplin3 [ 3, 4, 9,14,15,19,28,37,47,50,54
                        ,56,59,61,70,73,78,81,92,95,97,99]

The program runs in about 1.1 seconds, so for the challenge it’s fast enough. For the primes below 210 a more efficient algorithm will be needed though.

Programming Praxis – Weather Forecast

November 5, 2010

In today’s Programming Praxis exercise our task is to print the weather forecast for a given city in the United States. Let’s get started, shall we?

The Scheme solution uses wget to download the file first. In Haskell, however, we can just use a library for this purpose.

import Network.HTTP

Thanks to the library, the actual function is trivial: download the file and print either the error message if it can’t be found or the weather forecast if it can.

showWeather :: String -> String -> IO ()
showWeather state city = either print (putStrLn . rspBody) =<<
    simpleHTTP (getRequest $ "http://weather.noaa.gov/pub/data/\
        \forecasts/city/" ++ state ++ "/" ++ city ++ ".txt")

A quick to see if everything works correctly:

main :: IO ()
main = showWeather "mo" "st_louis"

Not very warm at 47 degrees Fahrenheit, but at least it’s dry.

Programming Praxis – Emirps

November 2, 2010

In today’s Programming Praxis, our task is to enumerate all the non-palindrome prime numbers that are still prime when their digits are reversed. Let’s get started, shall we?

Since we are to focus on maximum speed, the obvious choice is to use a library.

import Data.Numbers.Primes

I tried a couple different versions, and settled on a simple filter. This version can calculate all the emirps under one million in about 650 ms. One other version where the isPrime test was replaced by first putting all the primes in a Set and doing membership testing was a fraction faster (about 10 ms), but it is about twice as long and requires specifying the maximum up front, which is less convenient then the current infinite stream. I also tried to make it faster via parallelism, but I could only get it to go slower, so it looks like the benefits of doing everything in parallel don’t weigh up against the additional overhead. Or maybe I’m just not using the library right.

emirps :: [Integer]
emirps = [p | p <- primes, let rev = reverse (show p)
            , isPrime (read rev), show p /= rev]

All that’s left is to print all the emirps under a million.

main :: IO ()
main = print $ takeWhile (< 1000000) emirps

As mentioned, just evaluating the list takes about 650 ms, which is about the same as the Scheme version.