In today’s Programming Praxis exercise, our goal is to implement the Unix command line utility comm. Let’s get started, shall we?
import Control.Monad
import System.Environment
import Text.Printf
import System.IO
import qualified System.IO.Strict as SIO
import GHC.IO.Handle
Determining the common lines isn’t too difficult. We go through the two sorted lists element by element, putting each line in column 1, 2 or 3 as appropriate. Afterwards, we filter out the columns the user asked to suppress.
-- Eq b is required by notElem; Num no longer implies Eq in current GHC.
comm :: (Eq b, Num b, Ord a) => [b] -> [a] -> [a] -> [(a, b)]
comm flags zs = filter ((`notElem` flags) . snd) . f zs where
    f xs     []     = map (flip (,) 1) xs
    f []     ys     = map (flip (,) 2) ys
    f (x:xs) (y:ys) = case compare x y of
        LT -> (x,1) : f xs (y:ys)
        GT -> (y,2) : f (x:xs) ys
        EQ -> (x,3) : f xs ys
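To see the merge at work, here is a small stand-alone sketch. The names tagLines and suppress are hypothetical and only split the function above into its two steps for illustration; both inputs are assumed to be sorted already.

```haskell
-- Hypothetical stand-alone copy of the merge step, for illustration only.
-- Both inputs are assumed to be sorted, as comm expects.
tagLines :: Ord a => [a] -> [a] -> [(a, Int)]
tagLines xs     []     = [(x, 1) | x <- xs]
tagLines []     ys     = [(y, 2) | y <- ys]
tagLines (x:xs) (y:ys) = case compare x y of
    LT -> (x, 1) : tagLines xs (y:ys)   -- only in the first file
    GT -> (y, 2) : tagLines (x:xs) ys   -- only in the second file
    EQ -> (x, 3) : tagLines xs ys       -- present in both files

-- Drop every line whose column number appears in the suppressed list.
suppress :: [Int] -> [(a, Int)] -> [(a, Int)]
suppress flags = filter ((`notElem` flags) . snd)

main :: IO ()
main = do
    let left  = ["apple", "banana", "cherry"]
        right = ["banana", "cherry", "date"]
    print (tagLines left right)
    print (suppress [1, 2] (tagLines left right))
```

The first print shows every line tagged with its column; the second keeps only the lines common to both inputs, which corresponds to calling comm with the flags -12.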
Displaying the results in columns can be achieved with printf.
columns :: [(String, Int)] -> IO ()
columns xs = let -- the leading 0 keeps maximum from failing on empty output
                 width = maximum (0 : map (length . fst) xs) + 2
             in mapM_ (\(s,c) -> printf "%*s%-*s\n" ((c - 1) * width) "" width s) xs
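The layout rule is easier to inspect as data. The following is a minimal sketch of the same indentation logic, using a hypothetical pure function render that builds the padding with replicate instead of printf. Unlike the printf version, it does not pad the text on the right.

```haskell
-- Hypothetical pure variant of the layout step, for illustration only.
-- Column c is indented by (c - 1) column widths; the width is the length
-- of the longest line plus two spaces of separation.
render :: [(String, Int)] -> [String]
render xs = [replicate ((c - 1) * width) ' ' ++ s | (s, c) <- xs]
  where width = maximum (0 : map (length . fst) xs) + 2

main :: IO ()
main = mapM_ putStrLn (render [("apple", 1), ("banana", 3), ("date", 2)])
```

With these three lines the width is 8, so "date" is indented by 8 spaces and "banana" by 16.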
Handling the arguments is fairly straightforward for the most part, with one exception: if the input for both files comes from stdin, the default getContents function will not work, for two reasons. First, the handle gets closed after the first read, so the second call to getContents will fail. The way to resolve this is to duplicate the handle to stdin. Second, since getContents is lazy by default, it will read the first file from stdin first, marking each line as unique to the first file, and then do the same thing for the second file. We therefore need to read both files strictly first. Both problems are resolved by the newStdIn function.
main :: IO ()
main = do args <- getArgs
          columns =<< case args of
              (('-':p:ps):fs) -> go (map (read . return) (p:ps)) fs
              fs              -> go [] fs
    where
    go args ~[f1, f2] = liftM2 (comm args) (file f1) (file f2)
    file src = fmap lines $ if src == "-" then newStdIn else readFile src
    -- catch is the Prelude's catch from older versions of base; newer
    -- versions need catchIOError from System.IO.Error instead.
    newStdIn = catch (SIO.hGetContents =<< hDuplicate stdin) (\_ -> return [])
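The flag handling can also be looked at in isolation. Below is a rough sketch with a hypothetical helper parseArgs that splits the argument list into suppressed columns and file names. It goes slightly beyond the code above by treating a first argument as flags only when everything after the dash is a digit, and it uses digitToInt instead of read.

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical helper, for illustration only: a first argument such as
-- "-12" yields the suppressed columns [1,2]; a lone "-" is left alone,
-- since it names stdin rather than a set of flags.
parseArgs :: [String] -> ([Int], [String])
parseArgs (('-':p:ps):fs)
    | all isDigit (p:ps) = (map digitToInt (p:ps), fs)
parseArgs fs             = ([], fs)

main :: IO ()
main = do
    print (parseArgs ["-12", "a.txt", "b.txt"])
    print (parseArgs ["-", "b.txt"])
```

The first call separates the flags from the two file names, while the second leaves both arguments in the file list.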