Posts Tagged ‘tree’

Programming Praxis – Lowest Common Ancestor

March 11, 2011

In today’s Programming Praxis exercise, our goal is to write an algorithm to find the lowest common ancestor of two nodes in a binary tree. Let’s get started, shall we?

First, we define a binary tree data structure.

data BinTree a = Node a (BinTree a) (BinTree a) | Nil

The algorithm is pretty trivial: if the higher value is less than the current node, descend the left branch. If the lower value is higher than the current node, descend the right branch. Otherwise, we’ve found our result.

lca :: Ord a => a -> a -> BinTree a -> a
lca m n ~(Node v l r) | n < v     = lca m n l
                      | m > v     = lca m n r
                      | otherwise = v

Some tests to see if everything is working properly:

main :: IO ()
main = do let tip n = Node n Nil Nil
          let tree = Node 8 (Node 3 (tip 1) (Node 6 (tip 4) (tip 7)))
                            (Node 10 Nil (Node 14 (tip 13) Nil))
          print $ lca 4  7 tree == 6
          print $ lca 4 10 tree == 8
          print $ lca 1  4 tree == 3
          print $ lca 1  3 tree == 3
          print $ lca 3  6 tree == 3

Yup. Simple enough.

Programming Praxis – Carl Hewitt’s Same-Fringe Problem

August 3, 2010

In today’s Programming Praxis exercise we have to implement Carl Hewitt’s same-fringe algorithm, which determines if two binary trees have the same leaves in the same order, regardless if their respective structures. The simple solution of flattening both trees and comparing them doesn’t work in most languages, since their strict evaluation requires that the entire flattened list be loaded into memory, which will fail on big trees. Haskell on the other hand is lazy by default, so we don’t need to do anything complicated to have things work correctly. Let’s get started, shall we?

Since we’ll be working with binary trees, we’ll need to define a data structure for them:

data Tree a = Node (Tree a) (Tree a) | Leaf a

Flattening a tree is trivial.

flatten :: Tree a -> [a]
flatten (Node l r) = flatten l ++ flatten r
flatten (Leaf x)   = [x]

As mentioned in the intro, all we need to do is check whether the two flattened lists are equal.

sameFringe :: Eq a => Tree a -> Tree a -> Bool
sameFringe a b = flatten a == flatten b

As always, a test to see if things are working correctly:

main :: IO ()
main = print $ sameFringe (Node (Leaf 1) (Node (Leaf 2) (Leaf 3)))
                          (Node (Node (Leaf 1) (Leaf 2)) (Leaf 3))

Looks like they are. But how do we prove this method doesn’t require a big memory overhead? Well, let’s just create some big trees and find out. A binary tree of depth 28 has 2^28 = over 268 million leaves. At the very least, that would result in a list of 1 GB, and that’s assuming 32 bits per integer and no overhead. Since Haskell’s default lists are linked lists, you should probably at least double that. However, the following program

main = print $ sameFringe (treeOfDepth 28 (Leaf 1))
                          (treeOfDepth 28 (Leaf 1))
       where treeOfDepth n t = iterate (\x -> Node x x) t !! (n - 1)

hums along at a constant memory use of less than 20 MB when interpreted and less than 3 MB when compiled, the compiled version finishing in just under 32 seconds. To quote Larry Kersten, “Hard work often pays off after time, but laziness always pays off now.” 🙂

Programming Praxis – Minimum Spanning Tree: Prim’s Algorithm

April 9, 2010

In today’s Programming Praxis exercise we have to implement an algorithm for finding the minimum spanning tree of a graph. The Scheme solution weighs in at 15 lines, so let’s see if we can do better.

As usual, some imports:

import Data.List
import qualified Data.List.Key as K

Both the Scheme solution and the pseudocode algorithm on Wikipedia take both a list of vertices and a list of edges as input, but since the list of vertices is implicitly defined in the edges there’s really no point in specifying it separately. To get the starting vertex, we just take the first vertex of the first edge. Other than that, we do pretty much what the pseudocode algorithm says: check if there’s an edge with one connected and one unconnected point. If multiple exist, add the shortest. Add the unconnected vertex to the list of connected ones. Stop if there are no more edges with unconnected vertices.

prim :: (Eq a, Ord b) => [(a, a, b)] -> [(a, a, b)]
prim es = f [(\(v,_,_) -> v) $ head es] [] where
    f vs t = if null r then t else f (union vs [x,y]) (m:t) where
        r = filter (\(a,b,_) -> elem a vs /= elem b vs) es
        m@(x,y,_) = K.minimum (\(_,_,c) -> c) r

A quick test shows we get the same result as the Scheme version:

main :: IO ()
main = print $ prim [('A', 'D',  5), ('A', 'B', 7), ('B', 'D', 9)
                    ,('B', 'C',  8), ('C', 'E', 5), ('B', 'E', 7)
                    ,('D', 'E', 15), ('D', 'F', 6), ('E', 'F', 8)
                    ,('F', 'G', 11), ('E', 'G', 9)]

And with that we’ve reduced a 15-line solution to four lines. Not bad.

Programming Praxis – Binary Search Tree

March 5, 2010

In today’s Programming Praxis exercise we have to implement a Binary Search Tree. Let’s get started, shall we?

We need two imports:

import Control.Monad
import System.Random

The data structure is your run-of-the-mill binary tree.

data BTree k v = Node k v (BTree k v) (BTree k v) | Empty

Finding an element is pretty straightforward. Just keep taking the correct branch until we exhaust the tree or find what we want.

find :: (k -> k -> Ordering) -> k -> BTree k v -> Maybe v
find _   _ Empty          = Nothing
find cmp k (Node k' v' l r) = case cmp k k' of EQ -> Just v'
                                               LT -> find cmp k l
                                               GT -> find cmp k r

Inserting works the same way as find: move to the correct position and insert or replace the new value.

insert :: (k -> k -> Ordering) -> k -> v -> BTree k v -> BTree k v
insert _   k v Empty            = Node k v Empty Empty
insert cmp k v (Node k' v' l r) = case cmp k k' of
    EQ -> Node k v l r
    LT -> Node k' v' (insert cmp k v l) r
    GT -> Node k' v' l (insert cmp k v r)

Since the deletion algorithm calls for a random number, delete is an IO action. You can consider using unsafePerformIO to hide this (I did in my first draft), but I decided to stick with the honest, safer (though less convenient) version. Alternatively you could accept the occasional imbalance and just always start on the left.

delete :: (k -> k -> Ordering) -> k -> BTree k v -> IO (BTree k v)
delete _   _ Empty              = return Empty
delete cmp k t@(Node k' v' l r) = case cmp k k' of
    EQ -> fmap (flip deroot t . (== 0)) $ randomRIO (0,1 :: Int)
    LT -> fmap (flip (Node k' v') r) $ delete cmp k l
    GT -> fmap (      Node k' v'  l) $ delete cmp k r

For the deroot function we use a slightly different approach than the Scheme version. I’m not sure how that version deals with the case of one of the two branches being empty, but here they are explicitly included in the patterns. The rot-left and rot-right functions are rewritten as patterns.

deroot :: Bool -> BTree k v -> BTree k v
deroot _    Empty              = Empty
deroot _    (Node _ _ l Empty) = l
deroot _    (Node _ _ Empty r) = r
deroot True (Node k v l (Node rk rv rl rr)) =
    Node rk rv (deroot False $ Node k v l rl) rr
deroot _    (Node k v (Node lk lv ll lr) r) =
    Node lk lv ll (deroot True $ Node k v lr r)

Converting the search tree to a list is trivial.

toList :: BTree k v -> [(k, v)]
toList Empty          = []
toList (Node k v l r) = toList l ++ (k, v) : toList r

And, as always, a test to see if everything is working correctly:

main :: IO ()
main = do let t = foldr (uncurry $ insert compare) Empty $
                  [(n, n) | n <- [4,1,3,5,2]]
          print $ toList t
          print $ find compare 3 t
          print $ find compare 9 t
          print . toList =<< foldM (flip $ delete compare) t [4,2,3,5,1]

Power programming

January 27, 2010

Yesterday I read called Power Programming, in which the author gives solutions for this Google Code Jam problem in Python, Perl, Arc and C++. I figured I’d have a go at providing a solution in Haskell to see how it stacks up.

Since converting the given tree to a tuple won’t work in Haskell (it might with Data.Dynamic, but that’s not exactly standard practice), we’ll have to settle for writing a parser. Fortunately, Parsec makes this really easy.

First, we need some imports.

import Control.Applicative ((<*>), (<$>))
import Text.Parsec
import Text.Parsec.Token
import Text.Parsec.Language
import Text.Printf

We’ll also have to define the structure of the decision tree.

data Tree = Node Double String Tree Tree | Leaf Double

Since we need a parser for the decision trees anyway, we might as well write a parser for the complete test case input. Because the tokens we’ll be getting are pretty much the same as in most programming languages, we can just use the existing haskell tokenizer to parse the tree.

input    = count' testCase
testCase = (,) <$> (natural h >> tree) <*> count' animal
tree     = parens h $ try node <|> leaf
node     = Node <$> float h <*> identifier h <*> tree <*> tree
leaf     = Leaf <$> float h
animal   = identifier h >> count' (identifier h)
h        = haskell
count' p = flip count p =<< (fromIntegral <$> natural h)

Once we have the tree and the animals, calculating the cuteness of one of them is just a matter of taking the correct branches and multiplying all the values.

cute (Leaf x) _        = x
cute (Node x f l r) fs = x * cute (if elem f fs then l else r) fs

Showing the result just requires a bit of printf use.

output = mapM_ (\(i, (t, as)) -> printf "Case #%d:\n" i >>
             mapM_ (printf "%1.7f\n" . cute t) as) . zip [1::Int ..]

And finally a function that combines the required steps.

solve = either print output . parse input ""

A test to see if everything works correctly (for the sake of brevity, we read the input from a file, but using getContents to read from the console or a plain string literal will work as well):

main = solve =<< readFile "input.txt"

Not bad I reckon. Obivously, it’s not quite as brief as the Perl solution, but at least to me it’s a whole lot more readable. It’s roughly the same size as the Arc solution, which seems about right to me.

Programming Praxis – Treaps

June 26, 2009

Today’s Programming Praxis problem is about Treaps – binary trees that are more or less balanced thanks to random numbers. The provided solution is 58 lines, so we have quite a bit of work to do. Let’s get started.

First our imports:

import Control.Monad
import Data.Char
import qualified Data.List.Key as K
import System.Random

As mentioned in the intro, the data structure is, for all intents and purposes, a binary tree.

data Treap k a = Nil | Node Int k a (Treap k a) (Treap k a)

Nil nodes have a priority of -1.

priority :: Treap k a -> Int
priority Nil = -1
priority (Node p _ _ _ _) = p

Rotating left and right requires some node reshuffling:

rotLeft :: Treap k a -> Treap k a
rotLeft (Node p k a l (Node rp rk ra rl rr)) =
    Node rp rk ra (Node p k a l rl) rr
rotLeft t = t

rotRight :: Treap k a -> Treap k a
rotRight (Node p k a (Node lp lk la ll lr) r) =
    Node lp lk la ll (Node p k a lr r)
rotRight t = t

This function automatically applies the correct rotations:

rot :: Treap k a -> Treap k a
rot Nil = Nil
rot t@(Node p _ _ l r) | p < priority l = rotRight t
                       | p < priority r = rotLeft t
                       | otherwise      = t

Lookup works the same as in any binary tree.

find :: Ord k => k -> Treap k a -> Maybe a
find _  Nil = Nothing
find k' (Node _ k a l r) | k' < k    = find k' l
                         | k' > k    = find k' r
                         | otherwise = Just a

And updating is also not that different. The only annoyance is that we have to do it in the IO monad, since we need random numbers.

update :: Ord k => (a -> a -> a) -> k -> a -> Treap k a -> IO (Treap k a)
update _ k' a' Nil = fmap (\r -> Node r k' a' Nil Nil) $
                     randomRIO (0, maxBound)
update f k' a' (Node p k a l r)
    | k' < k    = fmap (\n -> rot $ Node p k a n r) (update f k' a' l)
    | k' > k    = fmap (rot . Node p k a l) (update f k' a' r)
    | otherwise = return $ Node p k' (f a' a) l r

Inserting is just a special case of updating:

insert :: Ord k => k -> a -> Treap k a -> IO (Treap k a)
insert = update const

For deleting, we need a way the combine the two remaining subtrees. This requires some rotation.

deroot :: Treap k a -> Treap k a
deroot Nil = Nil
deroot t@(Node _ _ _ l r)
    | priority l < priority r = d deroot id $ rotLeft t
    | otherwise               = d id deroot $ rotRight t
    where d fl fr = (\(Node p k a l' r') -> Node p k a (fl l') (fr r'))

That makes deleting a simple matter of applying deroot at the correct location.

delete :: Ord k => k -> Treap k a -> Treap k a
delete _ Nil = Nil
delete k' t@(Node p k a l r)
    | k' < k    = Node p k a (delete k' l) r
    | k' > k    = Node p k a l (delete k' r)
    | otherwise = deroot t

Converting a treap to a list is just simple recursion.

toList :: Treap k a -> [(k, a)]
toList Nil = []
toList (Node _ k a l r) = toList l ++ [(k, a)] ++ toList r

And there we go. Now to test it on the word frequency problem:

main :: IO ()
main = mapM_ print =<< wordFreqs 25 =<< readFile "bible.txt"

wordFreqs :: Int -> String -> IO [(String, Int)]
wordFreqs n = fmap (take n . reverse . K.sort snd . toList) .
              foldM (\a w -> update (+) w 1 a) Nil .
              map (filter isAlpha) . words

Works as expected. At 40 lines it’s about a 30% reduction compared to the scheme code. Not bad.