Posts Tagged ‘string’

Programming Praxis – Big Numbers: Input And Output

June 14, 2011

In today’s Programming Praxis exercise, our task is to write functions to convert Big Numbers to and from strings. Let’s get started, shall we?

To convert from a string, we simply convert each digit to the correct value, multiplying them by the base as we go along.

readBase :: (Num a, Enum a) => a -> String -> a
readBase b ('-':xs) = - readBase b xs
readBase b xs       = foldl (\a x -> b * a + val x) 0 xs where
    val d = maybe (error "unrecognized digit") id . lookup d $ zip
            (['0'..'9'] ++ ['A'..'Z'] ++ ['a'..'z']) [0..]

To convert to a string, we divide by the base until we reach zero. The remainders form the digits of the output.

showBase :: Integral a => a -> a -> String
showBase b n = if n < 0 then '-' : showBase b (abs n) else
               map (digit . snd) $ m : reverse ms  where
    ((_:ms), (m:_)) = span ((> 0) . fst) $ iterate (flip divMod b . fst) (n, 0)
    digit d = maybe undefined id . lookup d . zip [0..] $
              ['0'..'9'] ++ ['A'..'Z'] ++ ['a'..'z']

While we’re at it let’s also make BigNum an instance of Read and Show, the typeclasses that normally handle conversion to and from strings.

instance Read BigNum where
    readsPrec _ = return . first (readBase 10) . split where
        split ('-':xs) = first ('-':) $ split xs
        split xs       = span (`elem` ['0'..'9'] ++ ['A'..'Z'] ++ ['a'..'z']) xs

instance Show BigNum where
    show = showBase 10

Some tests to see if everything is working properly:

main :: IO ()
main = do print $ readBase 10 "1234"   == ( 1234 :: BigNum)
          print $ readBase 10 "-1234"  == (-1234 :: BigNum)
          print $ readBase  2 "101010" == (   42 :: BigNum)
          print $ read "-1234"         == (-1234 :: BigNum)
          print $ showBase 10 ( 1234 :: BigNum) ==   "1234"
          print $ showBase 10 (-1234 :: BigNum) ==  "-1234"
          print $ showBase  2 (   42 :: BigNum) == "101010"
          print $ show        (-1234 :: BigNum) ==  "-1234"

Programming Praxis – String Subsets

November 23, 2010

In today’s Programming Praxis exercise,we have to write functions to determine if one string is a subset of another as if we were in an interview. Let’s get started, shall we?

First, some imports.

import Data.List
import qualified Data.Map as M
import qualified Data.IntMap as I

My first attempt doesn’t actually work, since the intersect function only checks whether an element is a member of the second list. It doesn’t keep track of duplicates.

subsetOf1 :: Eq a => [a] -> [a] -> Bool
subsetOf1 xs ys = intersect xs ys == xs

Since a call to a library function won’t suffice, we’ll have to whip up something ourselves. The obvious way is to get a count of all the characters in both strings and check if the second string has an equal or higher count for all the characters in the first string. Of course this method is O(n * m), so it’s not very efficient.

subsetOf2 :: Ord a => [a] -> [a] -> Bool
subsetOf2 xs ys = all (\(c, n) -> maybe False (n <=) .
                       lookup c $ count ys) $ count xs
    where count = map (\x -> (head x, length x)) . group . sort

To improve the big O complexity, we’re going to switch to a data type that has a faster lookup. Like the previous version, counting the frequency of each letter is O(n log n), but using Maps the comparison can now be done in O(m + n), meaning the overall complexity remains at O(n log n).

subsetOf3 :: Ord a => [a] -> [a] -> Bool
subsetOf3 xs ys = M.null $ M.differenceWith
    (\x y -> if x <= y then Nothing else Just x) (f xs) (f ys)
    where f = M.fromListWith (+) . map (flip (,) 1)

Since the keys of the map are characters, there’s a further optimization we can make. By converting the characters to integers we can use an IntMap instead of a plain Map. An IntMap has a lookup of O(min(n,W)), with W being the amount of bits in an Int. For any non-trivial n, this results in O(1) lookup. Counting all the letters can now be done in O(n). Since the comparison still takes O(m + n), the resulting complexity is O(m + n). This is the minimum we can achieve, as we need to fully evaluate both strings to produce an answer, which is an O(m + n) operation.

subsetOf4 :: Enum a => [a] -> [a] -> Bool
subsetOf4 xs ys = I.null $ I.differenceWith
    (\x y -> if x <= y then Nothing else Just x) (f xs) (f ys)
    where f = I.fromListWith (+) . map (flip (,) 1 . fromEnum)

A quick test shows that the first function indeed fails, but the other ones succeed.

main :: IO ()
main = do let test f = print (f "da" "abcd", not $ f "dad" "abcd")
          test subsetOf1
          test subsetOf2
          test subsetOf3
          test subsetOf4

I must say I hope that I have internet access during the interview, though. If not, I would have had to come up with an alternative for differenceWith and I might not have remembered the existence of IntMap. In that case I’d probably have gone with something along the lines of the array-based solution from number 4 of the Scheme solutions.

Programming Praxis – String Search: Rabin-Karp

September 1, 2009

Today’s Programming Praxis problem marks the end of the string search series. In this final entry, we have to implement the Rabin-Karp search algorithm. Let’s see what we can do.

First some imports:

import Data.Char
import Data.List

We need the ascii values of the characters in the string, and since the hash values we’re going to be using can get quite large, we’re going to use Integers.

asciis :: String -> [Integer]
asciis = map (fromIntegral . ord)

The hash function treats the list of ascii values as a base 256 number.

hash :: Num a => [a] -> a
hash = sum . zipWith (*) (iterate (* 256) 1) . reverse

For the algorithm, we need the hashes of all the pattern-length substrings of the search string.

hashes :: Int -> String -> [Integer]
hashes p xs = scanl (\s (f,t) -> 256*s - 256^p*f + t) (hash $ take p ascs) .
              zip ascs $ drop p ascs where ascs = asciis xs

With those helper functions defined, the search algorithm becomes fairly straightforward:

rabinKarp :: String -> Maybe Int -> String -> Maybe Int
rabinKarp p s = lookup (hash $ asciis p) . drop (maybe 0 id s) .
                flip zip [0..] . hashes (length p)

And as before, we run our algorithm through the test suite:

test :: (String -> Maybe Int -> String -> Maybe Int) -> IO ()
test f = do assertEqual (f ""   Nothing  "Hello World") (Just 0)
            assertEqual (f "He" Nothing  "Hello World") (Just 0)
            assertEqual (f "od" Nothing  "Bonsai Code") (Just 8)
            assertEqual (f "ef" Nothing  "Bonsai Code") (Nothing)
            assertEqual (f ""   (Just 1) "Hello World") (Just 1)
            assertEqual (f "He" (Just 1) "Hello World") (Nothing)
            assertEqual (f "od" (Just 1) "Bonsai Code") (Just 8)
            assertEqual (f "ef" (Just 1) "Bonsai Code") (Nothing)
         where assertEqual a b = print (a == b, a, b)

main :: IO ()
main = test rabinKarp

Everything’s working correctly. Not bad for 6 lines of code.

Programming Praxis – String Search: Boyer-Moore

August 29, 2009

In yesterday’s Programming Praxis problem we have to implement a more efficient string search algorithm than the brute force approach we did earlier, namely the Horspool variation of the Boyer-Moore algorithm. Let’s get started.

Our import:

import Data.Map (findWithDefault, fromList, (!))

The algorithm this time is a bit longer than the brute force one, but it’s nothing too bad. In lines 2-4 we cache some values to remove some duplication and possibly avoid recalculation. The last four lines are the actual algorithm and the first line just calls it with the proper initial arguments.

horspool :: Ord a => [a] -> Maybe Int -> [a] -> Maybe Int
horspool pat skip xs = f (lp - 1 + maybe 0 id skip) p' where
    (lp, lxs, p') = (length pat, length xs, reverse pat)
    t = fromList $ zip pat [lp - 1, lp - 2..]
    m = fromList $ zip [0..] xs
    f n []     = Just (n + 1)
    f n (p:ps) | n >= lxs   = Nothing
	       | p == m ! n = f (n - 1) ps
               | otherwise  = f (n + findWithDefault lp (m ! n) t) p'

When we test our algorithm with the same test suite as last time we can see everything is working correctly:

test :: (String -> Maybe Int -> String -> Maybe Int) -> IO ()
test f = do assertEqual (f ""   Nothing  "Hello World") (Just 0)
            assertEqual (f "He" Nothing  "Hello World") (Just 0)
            assertEqual (f "od" Nothing  "Bonsai Code") (Just 8)
            assertEqual (f "ef" Nothing  "Bonsai Code") (Nothing)
            assertEqual (f ""   (Just 1) "Hello World") (Just 1)
            assertEqual (f "He" (Just 1) "Hello World") (Nothing)
            assertEqual (f "od" (Just 1) "Bonsai Code") (Just 8)
            assertEqual (f "ef" (Just 1) "Bonsai Code") (Nothing)
         where assertEqual a b = print (a == b, a, b)

main :: IO ()
main = test horspool