Archive for May, 2011

Programming Praxis – Big Numbers: Addition, Subtraction, And Multiplication

May 31, 2011

In today’s Programming Praxis exercise, we’re going to add addition, subtraction and multiplication to our big number library. Let’s get started, shall we?

Addition and multiplication are both functions of the Num typeclass, so we add the two functions (the fact that a – b is equal to a + (-b) is already present in the Num typeclass, so since we already defined negate we don’t need to define the subtraction function) in our Num instance:

instance Num BigNum where

For addition, we need to decide whether we’re adding or subtracting, which we do for each pair of digit groups. We could do this digit by digit, but I’m going to be lazy and do it per group. The conversion to and from Integers is not needed now, but will be required once the base is increased to prevent overflowing the Int type.

    a@(B l1 ds1) + b@(B l2 ds2) = B (length ds * signum l) ds where
        B l _ = if abs b > abs a then b else a
        ds = f 0 $ (if abs b > abs a then flip else id)
             (prep $ if signum l1 == -signum l2 then (-) else (+)) ds1 ds2
        prep op (x:xs) (y:ys) = op (toInteger x) (toInteger y) : prep op xs ys
        prep _  xs     ys     = map toInteger $ xs ++ ys
        f r (x:xs) = let (d,m) = divMod (r + x) base in fromIntegral m : f d xs
        f r []     = if r == 0 then [] else [fromIntegral r]

For multiplication we use the grade school method of multiplying each digit group (again, instead of per-digit) and summing them up at the end.

    (B l1 ds1) * (B l2 ds2) = B (signum l1 * signum l2 * sl) sds where
        B sl sds = sum $ mult ds1 ds2
        mult (x:xs) (y:ys) = fromIntegral (toInteger x * toInteger y) :
                             map shift (mult xs (y:ys)) ++
                             map shift (mult [x] ys)
        mult _     _  = []
        shift (B l ds) = B (l + 1) (0 : ds)

Some tests to see if everything is working properly:

main :: IO ()
main = do print $  12345678 +  987654321  == ( 999999999 :: BigNum)
          print $  12345678 -  987654321  == (-975308643 :: BigNum)
          print $ 987654321 -   12345678  == ( 975308643 :: BigNum)
          print $ -12345678 +  987654321  == ( 975308643 :: BigNum)
          print $ -12345678 -   87654321  == ( -99999999 :: BigNum)
          print $  12345678 *   87654321  == ( 1082152022374638 :: BigNum)
          print $  12345678 * (-87654321) == (-1082152022374638 :: BigNum)
          print $ -12345678 *   87654321  == (-1082152022374638 :: BigNum)
          print $ -12345678 * (-87654321) == ( 1082152022374638 :: BigNum)

Programming Praxis – Upside Up

May 27, 2011

In today’s Programming Praxis exercise, our task is to write a function to determine if a number remains the same if it is rotated 180 degrees. Let’s get started, shall we?

The function is pretty simple: reverse the number, and for each digit check if it the rotation of the corresponding digit in the original number. If a non-reversible digit is encountered, we can return false immediately. An optimization that could be made is to only check half of the number (rounding up), but in this case speed is not an issue so I opted for cleaner code.

upsideUp :: Show a => a -> Bool
upsideUp n = and . zipWith isRot (show n) . reverse $ show n where
    isRot a b = maybe False (== b) . lookup a $ zip "01689" "01986"

Two checks to see if everything is working properly:

main :: IO ()
main = do print $ head (filter upsideUp [1962..]) == 6009
          print $ length (filter upsideUp [0..9999]) == 39

Programming Praxis – Big Numbers: Getting Started

May 24, 2011

In today’s Programming Praxis exercise, our task is to implement the basics of a library for big numbers. Let’s get started, shall we?

A quick import:

import Data.List

We could represent a big number as a plain list like the Scheme version does. Using a custom data structure, however, has the advantage of being able to make it an instance of the standard numeric classes, which means shorter function names and easier literals. We also store the number of digits separately because I find it a bit cleaner.

data BigNum = B Int [Int] deriving (Eq, Show)

We’re using 1000 as our base for now.

base :: Integer
base = 1000

The Num class gives us some of the required functions. Since addition and multiplication haven’t been implemented yet the compiler will throw some warnings.

instance Num BigNum where
    negate (B l ds) = B (-l) ds
    abs (B l ds)    = B (abs l) ds
    signum (B l _)  = fromIntegral $ signum l
    fromInteger n | n < 0     = negate $ fromInteger (-n)
                  | otherwise = B (length ds) (map fromIntegral ds)
                  where ds = tail $ f (n,0)
                        f (0,m) = [m]
                        f (d,m) = m : f (divMod d base)

I personally don’t see much use for three separate functions for the sign of a number, since you can either use signum or the appropriate comparsion, but we’ll stick to the assignment.

positive, negative, zero :: (Num a, Ord a) => a -> Bool
positive = (> 0)
negative = (< 0)
zero     = (== 0)

If we make BigNum an instance of the Integral class we could use the default even and odd functions, but this requires implementing modulo arithmetic. We’ll use these for now until we have to do so in a following exercise.

bigOdd, bigEven :: BigNum -> Bool
bigEven (B l ds) = l == 0 || even (head ds)
bigOdd           = not . bigEven

The Integral class also has a toInteger function, but for now we’ll use a separate function.

fromBig :: BigNum -> Integer
fromBig (B _ ds) = foldr (\x a -> fromIntegral x + base * a) 0 ds

Here’s another advantage of using the standard type classes: we only need to implement the compare function to get all the others for free.

instance Ord BigNum where
    compare (B l1 ds1) (B l2 ds2) = case compare l1 l2 of
        EQ -> maybe EQ id . find (/= EQ) . reverse $ zipWith compare ds1 ds2
        c  -> c

Time for a whole bunch of tests.

main :: IO ()
main = do let a = 12345678
          let b = -87654321
          print $ 0               == B 0 []
          print $ 1               == B 1 [1]
          print $ (-1)            == B (-1) [1]
          print $ a               == B 3 [678,345,12]
          print $ b               == B (-3) [321,654,87]
          print $ fromBig a       == 12345678
          print $ fromBig (abs b) == 87654321
          print $ abs a           == a
          print $ positive a
          print $ negative b
          print $ zero (0 :: BigNum)
          print $ bigEven a
          print $ bigOdd b
          print $ b /= a
          print $ b < a

Looks like everything is working properly.

Programming Praxis – ISBN Validation

May 20, 2011

In today’s Programming Praxis exercise, our goal is to write a number of functions related to ISBN numbers. Let’s get started, shall we?

Some imports:

import Control.Applicative hiding ((<|>), optional)
import Data.Char
import Data.List
import Data.Map (elems)
import Network.HTTP
import Text.HJson
import Text.HJson.Query
import Text.Parsec

First, we need some parsers for ISBN and EAN numbers.

isbn = (++) <$> (concat <$> sepEndBy1 (many1 d) (oneOf " -"))
            <*> option [] ([10] <$ char 'X') where
    d = read . return <$> digit
ean = string "978" *> optional (oneOf " -") *> isbn

Since we need the check digits both for validation and conversion we make separate functions for them.

isbnCheck, eanCheck :: Integral a => [a] -> a
isbnCheck n = 11 - mod (sum $ zipWith (*) [10,9..] (take 9 n)) 11
eanCheck n = mod (sum $ zipWith (*) (cycle [1,3]) (take 9 n)) 10

Checking whether a number is valid is a matter of checking if the length and last digit are correct.

validISBN, validEAN :: String -> Bool
validISBN = valid isbn isbnCheck
validEAN = valid ean eanCheck

valid p c = either (const False) v . parse p "" where
    v ds = length ds == 10 && c ds == last ds

Conversion just requires changing the last digit.

toISBN, toEAN :: String -> Maybe String
toISBN = convert ean isbnCheck
toEAN = fmap ("978-" ++) . convert isbn eanCheck

convert p c = either (const Nothing) (Just . fixCheck) . parse p ""
    where fixCheck n = map intToDigit (init n) ++ [check $ c n]
          check n = if n == 10 then 'X' else intToDigit n

Since I don’t like APIs that require an access key, we’ll be using openlibrary instead of isbndb.

lookupISBN :: String -> IO [(String, [String])]
lookupISBN = get . ("http://openlibrary.org/api/books?format=json&\
                    \jscmd=data&bibkeys=ISBN:" ++) where
    f ~(JObject j) = map (\b -> (unjs $ key "title" b,
        map (unjs . key "name") . getFromArr $ key "authors" b)) $ elems j
    key k = head . getFromKey k
    unjs ~(JString s) = s
    get url = fmap (either (const undefined) f . fromString) .
              getResponseBody =<< simpleHTTP (getRequest url)

Some tests to see if everything is working properly:

main :: IO ()
main = do print $ validISBN "99921-58-10-7"
          print $ validISBN "80-902734-1-6"
          print $ validISBN "0-943396-04-2"
          print $ validISBN "0-9752298-0-X"
          print $ validISBN "0943396042"
          print $ not $ validISBN "99921-58-10-8"
          print $ not $ validISBN "99921-58-10-"
          print $ not $ validISBN "9"
          print $ validEAN "978-0-0700048-4-9"
          print $ validEAN "9780070004849"
          print $ not $ validEAN "9780070004848"
          print $ toISBN "9780070004849" == Just "0070004846"
          print $ toEAN "0070004846" == Just "978-0070004849"

          mapM_ (\(t,a) -> putStrLn ("Title: " ++ t) >>
                           putStrLn ("Authors: " ++ intercalate ", " a)) =<<
              lookupISBN "0070004846"

Programming Praxis – Two Bad Sorts

May 17, 2011

In today’s Programming Praxis exercise, our task is to implement two inefficient sorting algorithms. Let’s get started, shall we?

Some imports:

import Control.Arrow
import System.Random
import System.Random.Shuffle

Stoogesort is fairly bad at O(n^2.7).

stoogesort :: Ord a => [a] -> [a]
stoogesort []       = []
stoogesort xs@(h:t) = f $ if last xs < h then last xs : init t ++ [h] else xs
    where f = if length xs > 2 then s first 2 . s second 1 . s first 2 else id
          s p n = uncurry (++) . p stoogesort . splitAt (div (n * length xs) 3)

Bogosort is more interesting. It has the potential of sorting a list in O(n). The chance of this happening, however, is pretty low. The resulting average performance is a terrible O(n*n!).

bogosort :: Ord a => [a] -> IO [a]
bogosort [] = return []
bogosort xs = if and $ zipWith (<=) xs (tail xs) then return xs
              else bogosort . shuffle' xs (length xs) =<< newStdGen

Some tests to see if everything is working properly:

main :: IO ()
main = do print . (== [1..5]) =<< bogosort [3,1,5,4,2]
          print $ stoogesort [3,1,5,4,2] == [1..5]

Seems like it is. Having said that, never use either of these in practice.

Programming Praxis – Comm

May 10, 2011

In today’s Programming Praxis exercise, our goal is to implement the Unix command line utility comm. Let’s get started, shall we?

Some imports:

import Control.Monad
import System.Environment
import Text.Printf
import System.IO
import qualified System.IO.Strict as SIO
import GHC.IO.Handle

Determining the common lines isn’t too difficult. We go trough the two lists element by element, putting them in column 1,2 or 3 as appropriate. Afterwards, we filter out the specified columns.

comm :: (Num b, Ord a) => [b] -> [a] -> [a] -> [(a, b)]
comm flags zs = filter ((`notElem` flags) . snd) . f zs where
    f xs     []     = map (flip (,) 1) xs
    f []     ys     = map (flip (,) 2) ys
    f (x:xs) (y:ys) = case compare x y of 
        LT -> (x,1) : f xs     (y:ys)
        GT -> (y,2) : f (x:xs) ys
        EQ -> (x,3) : f xs     ys

Displaying the results in columns can be achieved with printf.

columns :: [(String, Int)] -> IO ()
columns xs = let width = maximum (map (length . fst) xs) + 2 in
    mapM_ (\(s,c) -> printf "%*s%-*s\n" ((c - 1) * width) "" width s) xs

Handling the arguments is fairly straightforward for the most part, with one exception: if the input for both files comes from stdin, the default getContents function will not work for two reasons: first, since the handle gets closed after the first one, the second call to getContents will fail. The way to resolve this is to duplicate the handle to stdin. Secondly, since getContents is lazy by default it will read the first file from stdin first, marking each line as unique to the first file, followed by doing the same thing for the second file. We therefore need to read both files strictly first. Both problems are resolved by the newStdIn function.

main :: IO ()
main = do args <- getArgs
          columns =<< case args of
              (('-':p:ps):fs) -> go (map (read . return) (p:ps)) fs
              fs              -> go [] fs
    where go args ~[f1, f2] = liftM2 (comm args) (file f1) (file f2)
          file src = fmap lines $ if src == "-" then newStdIn
                                                else readFile src
          newStdIn = catch (SIO.hGetContents =<< hDuplicate stdin)
                           (\_ -> return [])

Programming Praxis – Entab And Detab

May 6, 2011

In today’s Programming Praxis exercise, our goal is to write two functions to convert between text fragments indented with tabs and spaces. Let’s get started, shall we?

A required import:

import Text.Regex

Converting tabs to spaces is easy: just replace each tab with the desired amount of spaces. For the actual replacing we use a regular expression library.

detab :: Int -> String -> String
detab w s = subRegex (mkRegex "\t") s (replicate w ' ')

Converting spaces to tabs is a bit more complicated, particularly if the text is indented using a combination of tabs and spaces. First we count the total indent length, and then we insert the required amount of tabs and spaces.

entab :: Int -> String -> String
entab w = unlines . map f . lines where
    f s = replicate tabs '\t' ++ replicate spaces ' ' ++ line where
        (indent, line) = span (`elem` " \t") s
        (tabs, spaces) = divMod (sum $ map width indent) w
    width c = if c == '\t' then w else 1

For testing, we need to make sure we handle all the test cases: no indent, less than a tab of indent, one tab of indent, less than two tabs of indent, two tabs of indent and a line with mixed spaces and tabs.

main :: IO ()
main = do print $ entab 2 "a\n b\n  c\n   d\n    e\n \t f\n" ==
                          "a\n b\n\tc\n\t d\n\t\te\n\t\tf\n"
          print $ detab 2 "a\n b\n\tc\n\t d\n\t\te\n \t f\n" ==
                          "a\n b\n  c\n   d\n    e\n    f\n"

Everything seems to be working correctly. Personally, I’m on the spaces side of this particular holy war, specifically in the four spaces camp. Two spaces is still acceptable, every other setting is wrong 🙂

Programming Praxis – Squaring The Bishop

May 3, 2011

In today’s Programming Praxis exercise, our goal is to write a program that can create word squares. Let’s get started, shall we?

Some imports:

import qualified Data.ByteString.Char8 as B
import qualified Data.List.Key as K
import qualified Data.Map as M
import qualified Data.Trie as T

First we need to load the words into a practical data structure. The obvious one here is a trie. Rather than one big trie for the whole dictionary, we make group the words by length, making a trie for each different length.

loadWords :: IO (M.Map Int (T.Trie Int))
loadWords = fmap (M.fromList . map (\(w:ws) -> (snd w, T.fromList (w:ws))) .
                  K.group snd . K.sort snd . map (\w -> (w, B.length w)) .
                  B.words) $ B.readFile "words.txt"

Next, we need a function to find all the possible words of the correct length given a prefix.

findWords :: Int -> String -> M.Map Int (T.Trie a) -> [B.ByteString]
findWords l prefix = T.keys . T.submap (B.pack prefix) . (M.! l)

Finally, constructing the square is a matter recursively finding all the possible next words and keeping only the combinations that result in a full square.

square :: String -> M.Map Int (T.Trie a) -> [[B.ByteString]]
square word ds = f 1 [B.pack word] where
    f n ws = if n == length word then [ws] else 
             (\w -> f (n + 1) (ws ++ [w])) =<<
             findWords (length word) (map (`B.index` n) ws) ds

Some tests to see if everything is working properly:

main :: IO ()
main = do print . square "bonsai" =<< loadWords
          print . (== 122) . length . square "bishop" =<< loadWords

Looks like it. Interestingly, the word bonsai only has a single word square:

bonsai
osiers
nitril
serosa
arisen
island