In today’s Programming Praxis exercise, we have to implement the famous MapReduce algorithm. Let’s get going, shall we?
First, some imports:
import Control.Arrow import Data.Char import Data.List import qualified Data.Map as M
Since I wasn’t here for the Red-Black Tree exercise, I’ll just use Maps.
mapReduce :: Ord k => (a -> (k, v)) -> (v -> v -> v) -> (k -> k -> Bool) -> [a] -> [(k, v)] mapReduce m r lt = sortBy (\(a,_) (b,_) -> if lt a b then LT else GT) . M.assocs . M.map (foldl1 r) . M.fromListWith (++) . map (second return . m)
With that, the version that works on files is trivial.
mapReduceInput :: Ord k => (a -> (k, v)) -> (v -> v -> v) -> (k -> k -> Bool) -> (String -> [a]) -> FilePath -> IO [(k, v)] mapReduceInput m r lt g = fmap (mapReduce m r lt . g) . readFile
In order to test our algorithm, let’s reproduce the tests from Programming Praxis:
anagrams = map snd . mapReduce (sort &&& id) (\a b -> a ++ " " ++ b) (<) getWords = concat . zipWith (\i -> map (\w -> (w, [i])) . words) [1..] . map (map clean) . lines where clean c = if isAlphaNum c || isSpace c then c else ' ' xref = mapReduceInput id (flip union) (<) getWords main = do print $ mapReduce (\x -> (x, 1)) (+) (<) "banana" print $ anagrams ["time", "stop", "pots", "cars", "emit"] print =<< xref "mapreduce.txt"
Everything’s working correctly.