Archive for the ‘Tips & tricks’ Category

Forcing evaluation in Haskell

April 27, 2009

As you might know, Haskell is a lazy language. This means that nothing is evaluated until it is actually needed. This allows you to do nice things like

foo = take 10 [0..]

without taking infinity to evaluate the list. Until yesterday, however, I didn’t fully appreciate quite how literally Haskell takes this concept. I was working on a BMP importer and had the following data structure:

data Image = Image { width :: Int, height :: Int, pixels :: [[Pixel]] }

data Pixel = Pixel { red :: Int, green :: Int, blue :: Int, alpha :: Int }
             deriving Show

Pretty simple. Now let’s make a function that generates a bitmap:

makeImage :: Int -> Int -> Image
makeImage w h = Image w h . replicate h . replicate w $ Pixel 255 0 0 255

Let’s jump to ghci to test working with a reasonable sized bitmap.

> :set +s

This makes ghci report how long everything takes and how much memory it requires.

> width $ makeImage 800 600
(0.05 secs, 527520 bytes)

As the more perceptive of you may have noticed, we have 800 * 600 = 480000 pixels. Each pixel has four ints, which are 4 bytes each, so the full data structure should take 480000 * 16 = 7680000 bytes, or 7 MB at the very least. Our command only took 527 KB, so obviously it’s not evaluating the whole thing, which is logical. It doesn’t need the pixels, so it doesn’t evaluate them.

So how do we make it evaluate all the pixels? Until yesterday i thought getting the last pixel via (last . last) would work, figuring it should have to evaluate the whole thing to give me the last element. Let’s try that.

> last . last . pixels $ makeImage 800 600
Pixel {red = 255, green = 0, blue = 0, alpha = 255}
(0.00 secs, 522388 bytes)

Well, that was fast. But if we look at the memory used we see the obvious problem: it’s still not nearly enough. GHC is a clever compiler, so it skips doing the work for all the other pixels because it sees it doesn’t have to. Unfortunately it took me a whole lot longer to see this problem since I was testing a bigger section of code, which took a lot more memory, obscuring this problem.

Fortunately the kind people in the #haskell IRC channel pointed out this problem and also provided a way to force the evaluation of an object, namely the rnf function in Control.Parallel.Strategies. So let’s try that:

import Control.Parallel.Strategies

Go to ghci again and type

> rnf . pixels $ makeImage 800 600
    No instance for (NFData Pixel)
      arising from a use of `rnf' at :1:0-2
    Possible fix: add an instance declaration for (NFData Pixel)
    In the first argument of `(.)', namely `rnf'
    In the first argument of `($)', namely `rnf . pixels'
    In the expression: rnf . pixels $ makeImage 800 600

Hm. We’ll need to make Pixel an instance of NFData. Had this instead been a [[Int]] or another common data type it would have worked out of the box. Now we could just say

instance NFData Image

but this would tell only part of the story. If we look at the source code for Control.Parallel.Strategies we see that rnf does nothing by default. Using this version we would indeed evaluate the whole list (because the instance for lists has been correctly defined), but leave the pixels themselves unevaluated. In my test this resulted in a reported time of about 5 seconds instead of the 9-10 it actually took to fully load the image. Another look at the source code reveals that the correct implementation is very simple: just seq everything together.

instance NFData Pixel where
    rnf (Pixel r g b a) = rnf r `seq` rnf g `seq` rnf b `seq` rnf a

Let’s also make Image an instance of NFData for good measure.

instance NFData Image where
    rnf (Image w h ps) = rnf w `seq` rnf h `seq` rnf ps

Let’s try evaluating the image again:

> rnf $ makeImage 800 600
(1.78 secs, 92503000 bytes)

Quite a difference. Now we have a realistic memory use and the actual time it takes to fully evalutate the image. So if you’re ever unsure if something has been fully evaluated or if your compiler is just being sneaky, remember the rnf function. It might save you a few hours of confusion 🙂