In today’s Programming Praxis exercise, our goal is to write a program that can tell us on which lines each identifier and operator in a program appears. Let’s get started, shall we?
import Data.List
import qualified Data.List.Key as K
import Language.Haskell.Lexer
Rather than muck about with brittle regular expressions or something to that effect, we’ll just use a proper Haskell lexer library. Note that the one we’re using comes from the haskell-lexer package, which shares a module name (Language.Haskell.Lexer) with the haskell-src package that comes with the Haskell Platform. To avoid the ambiguous import, pass -hide-package haskell-src as an argument to GHC when running this program. With that out of the way, all we need to do is read a file, list all the tokens, keep the identifiers and operators, and collect the line numbers on which each one appears.

main :: IO ()
main = do file <- readFile "test.hs"
          mapM_ putStrLn . map ((\((n:_), ls) -> unwords $ n : nub ls) .
              unzip) . K.group fst $ K.sort fst
              [(s, show $ line p) | (tok, (p,s)) <- lexerPass0 file,
              elem tok [Varid, Conid, Varsym, Consym]]
Running this program on its own source code (saved as test.hs) produces the following:
$ 7 8 9
. 7 8
Conid 10
Consym 10
IO 5
K 2
Varid 10
Varsym 10
as 2
elem 10
file 6 9
fst 8
lexerPass0 9
line 9
ls 7
main 5 6
map 7
mapM_ 7
n 7
nub 7
p 9
putStrLn 7
qualified 2
readFile 6
s 9
show 9
tok 9 10
unwords 7
unzip 8
Looks like everything is working properly.
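If you’d rather see the grouping step on its own, here is a minimal sketch that does the same bookkeeping with Data.Map and Data.Set from the containers package instead of Data.List.Key. It is not the code used above: the names crossRef, render and sample are made up for illustration, and the hard-coded (name, line) pairs merely stand in for what the lexer would produce.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- | For every name, collect the ascending list of distinct line numbers
-- on which it occurs. The input pairs are a stand-in for the
-- (token text, line number) pairs a lexer would deliver.
crossRef :: [(String, Int)] -> [(String, [Int])]
crossRef occurrences =
    [ (name, S.toAscList ls) | (name, ls) <- M.toAscList table ]
  where
    -- Merge the line sets of repeated names; the Set drops duplicates.
    table = M.fromListWith S.union
                [ (name, S.singleton l) | (name, l) <- occurrences ]

-- | Render one entry as "name l1 l2 ...".
render :: (String, [Int]) -> String
render (name, ls) = unwords (name : map show ls)

main :: IO ()
main = mapM_ (putStrLn . render) (crossRef sample)
  where
    -- Hypothetical sample data, not real lexer output.
    sample = [ ("main", 5), ("IO", 5), ("main", 6)
             , ("file", 6), ("file", 9), ("main", 6) ]
```

The Map keeps the names sorted and the Set removes repeated line numbers, so the explicit sort, group and nub steps are no longer needed. The trade-off is that the line numbers come out in ascending order rather than in order of appearance, which makes no difference when the tokens arrive in source order anyway.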