In today’s Programming Praxis exercise, our goal is to write a program that can create word squares. Let’s get started, shall we?
Some imports:
import qualified Data.ByteString.Char8 as B import qualified Data.List.Key as K import qualified Data.Map as M import qualified Data.Trie as T
First we need to load the words into a practical data structure. The obvious one here is a trie. Rather than one big trie for the whole dictionary, we make group the words by length, making a trie for each different length.
loadWords :: IO (M.Map Int (T.Trie Int)) loadWords = fmap (M.fromList . map (\(w:ws) -> (snd w, T.fromList (w:ws))) . K.group snd . K.sort snd . map (\w -> (w, B.length w)) . B.words) $ B.readFile "words.txt"
Next, we need a function to find all the possible words of the correct length given a prefix.
findWords :: Int -> String -> M.Map Int (T.Trie a) -> [B.ByteString] findWords l prefix = T.keys . T.submap (B.pack prefix) . (M.! l)
Finally, constructing the square is a matter recursively finding all the possible next words and keeping only the combinations that result in a full square.
square :: String -> M.Map Int (T.Trie a) -> [[B.ByteString]] square word ds = f 1 [B.pack word] where f n ws = if n == length word then [ws] else (\w -> f (n + 1) (ws ++ [w])) =<< findWords (length word) (map (`B.index` n) ws) ds
Some tests to see if everything is working properly:
main :: IO () main = do print . square "bonsai" =<< loadWords print . (== 122) . length . square "bishop" =<< loadWords
Looks like it. Interestingly, the word bonsai only has a single word square:
bonsai osiers nitril serosa arisen island
Tags: bishop, bonsai, code, Haskell, kata, praxis, programming, square, word
May 3, 2011 at 3:42 pm |
Out of interest, why the use of ByteString.Char8 instead of Char?
May 3, 2011 at 4:03 pm |
If you mean why I used ByteStrings instead of plain Strings, it’s because the Data.Trie package works with ByteStrings and not with Strings. If you mean why I used Data.ByteString.Char8 instead of Data.ByteString, it’s because the latter has no words function.