Posts Tagged ‘parsec’

Programming Praxis – Fix Something Annoying

November 6, 2012

In today’s Programming Praxis exercise, our goal is to fix something that annoys us in our programming environment. Let’s get started, shall we?

The first annoyance that came to mind for me is one I encountered in the previous exercise. While I love the Parsec library in general, it is not without its little niggles. Specifically, there were two points:

  • The default method of parsing numbers is to use the functions defined on GenTokenParser, which requires two additional import lines and requires an extra parameters after the integer and float parsers.
  • I normally use the operators from Control.Applicative for composing parsers, but they’re right-associative and have an impractical precedence, which results in lots of parentheses.

Today’s exercise was a good excuse to finally sit down and write a small library to fix these issues once and for all. And since I want to be able to easily use it in future projects and exercises, I’ve put it up on Hackage. Note: If you’re reading this shortly after I post it: documentation always takes a while to generate on Hackage, so you might need to wait a bit.

I won’t cover all the functions of the library here (that’s what the documentation is for), so instead we’ll see what the gas mileage program looks like when using this new library:

Some imports:

import Text.Parsec
import Text.Parsec.Utils
import Text.Printf

The showLog function is unchanged from last time.

showLog :: [(Float, Float)] -> [String]
showLog es = "Miles  Gals  Avg" : "------ ---- ----" : zipWith (\(m2,g) (m1,_) ->
    printf "%.0f %.1f %.1f" m2 g ((m2 - m1) / g)) (tail es) es

The parser is nice and simple now.

logParser = many $ (,) .: float -: space +: float -: newline

And the main function is also simpler than it would otherwise have been, since we don’t need to deal with parse errors anymore.

main = mapM_ putStrLn . showLog =<< parseFile logParser "input.txt"

Programming Praxis – Expression Evaluation

April 16, 2010

In today’s Programming Praxis exercise our task to write a parser for simple mathematical expressions. Let’s get started, shall we?

Some imports:

import Control.Applicative
import Text.Parsec hiding ((<|>))

We’re going to be using a somewhat different approach than the provided Scheme solution. Rather than doing everything ourselves, we will use the Parsec library, which is the go-to solution for writing parsers in Haskell. This, however, results in a small limitation. Parsec is a parser combinator library, and parser combinators cannot deal with left-recursive grammars. The grammar in the assignment is left-recursive, because if we were to enter the rule expr = expr + term (as I did in my first attempt in solving this), our program will enter an infinite loop. Fortunately, left-recursive grammars can be fairly easily rewritten using the chain functions in Parsec.

An expression is one or more terms, separated by addition and subtraction operators.

expr = chainl1 term ((+) <$ char '+' <|> (-) <$ char '-')

A term works just like an expression, but with multiplication and divison.

term = chainl1 fact ((*) <$ char '*' <|> div <$ char '/')

Factors need no change from the specified grammar: they’re either numbers or expressions in parentheses.

fact = read <$> many1 digit <|> char '(' *> expr <* char ')'

Evaluating an expression is trivial. The only extra step is to filter out all the spaces, since they have not been defined in the grammar but are present in the test cases.

eval :: String -> Int
eval = either (error . show) id . parse expr "" . filter (/= ' ')

A quick test shows that our rewritten grammar passes all of the test cases correctly:

main :: IO ()
main = mapM_ print [eval "6+2"            == 8
                   ,eval "6-2"            == 4
                   ,eval "6*2"            == 12
                   ,eval "6/2"            == 3
                   ,eval "6 * 2"          == 12
                   ,eval "2+3*4"          == 14
                   ,eval "2*3+4"          == 10
                   ,eval "2+3+4"          == 9
                   ,eval "2-3-4"          == -5
                   ,eval "2*3*4"          == 24
                   ,eval "(2+3)*4"        == 20
                   ,eval "(2*3)+4"        == 10
                   ,eval "2+(3*4)"        == 14
                   ,eval "2*(3+4)"        == 14
                   ,eval "12 * (34 + 56)" == 1080

Using a parser library and slightly rewriting the grammar reduces the amount of required lines from 48 to 4. That’s a good trade-off in my book.

Programming Praxis – Regular Expressions, Part 1

September 15, 2009

In today’s Programming Praxis problem our task is to write a parser for simple regular expressions. Since Haskell has a very good parser library called Parsec, we’re going to be using that. Let’s get started.

First, some imports:

import Control.Applicative ((<$>), (*>), (<*), (<*>))
import Data.Char
import Text.Parsec
import Text.Parsec.String

Next we define our data structure. There are seven constructs we have to implement, split into two groups based on whether or not they can be followed by a star or not.

data Elem = Lit Char | Esc Char | Any | Set Bool [Elem] deriving Show
data Chunk = Elem Elem | BoL | EoL | Star Elem deriving Show

The parser itself is not too difficult if you know how the operators from Control.Applicative work. <$> means apply the function on the left to the result of the parser on the right. <*, *> and <*> take only the result on the left, right and both sides respectively.

regex :: Parser [Chunk]
regex = (++) <$> bol <*> many chunk where
    bol = option [] (const [BoL] <$> char '^')
    chunk = choice [Star <$> try (element <* char '*'),
                    const EoL <$> try (char '$' <* eof),
                    Elem <$> element]
    element = choice [esc <$> try (char '\\' *> anyChar),
                      const Any <$> char '.',
                      Set False . expandSet <$> set "[^",
                      Set True . expandSet <$> set "[",
                      Lit <$> noneOf "]"]
    esc c = if elem c "nt" then Esc c else Lit c
    set s = try (string s *> many1 element <* char ']')
    expandSet (Lit a:Lit '-':Lit b:xs)
        | validRange a b = map Lit [a..b] ++ expandSet xs
    expandSet (x:xs) = x : expandSet xs
    expandSet _ = []
    validRange a b = b > a && ((isLower a && isLower b) ||
                               (isUpper a && isUpper b) ||
                               (isDigit a && isDigit b))

With the parser written, the function to parse a string is trivial:

parseRegex :: String -> Either ParseError [Chunk]
parseRegex = parse regex ""

Some tests to see if everything is working properly:

main :: IO ()
main = mapM_ print [parseRegex "[0-9][0-9]*",
                    parseRegex "^..*$",
                    parseRegex "hello",
                    parseRegex "^ *hello *$",
                    parseRegex "^[^x].*[0-9] *x$"]

Piece of cake. Next time we do the implementation.