Forcing evaluation in Haskell

As you might know, Haskell is a lazy language. This means that nothing is evaluated until it is actually needed. This allows you to do nice things like

foo = take 10 [0..]

without taking infinity to evaluate the list. Until yesterday, however, I didn’t fully appreciate quite how literally Haskell takes this concept. I was working on a BMP importer and had the following data structure:

data Image = Image { width :: Int, height :: Int, pixels :: [[Pixel]] }

data Pixel = Pixel { red :: Int, green :: Int, blue :: Int, alpha :: Int }
             deriving Show

Pretty simple. Now let’s make a function that generates a bitmap:

makeImage :: Int -> Int -> Image
makeImage w h = Image w h . replicate h . replicate w $ Pixel 255 0 0 255

Let’s jump to ghci to test working with a reasonable sized bitmap.

> :set +s

This makes ghci report how long everything takes and how much memory it requires.

> width $ makeImage 800 600
(0.05 secs, 527520 bytes)

As the more perceptive of you may have noticed, we have 800 * 600 = 480000 pixels. Each pixel has four ints, which are 4 bytes each, so the full data structure should take 480000 * 16 = 7680000 bytes, or 7 MB at the very least. Our command only took 527 KB, so obviously it’s not evaluating the whole thing, which is logical. It doesn’t need the pixels, so it doesn’t evaluate them.

So how do we make it evaluate all the pixels? Until yesterday i thought getting the last pixel via (last . last) would work, figuring it should have to evaluate the whole thing to give me the last element. Let’s try that.

> last . last . pixels $ makeImage 800 600
Pixel {red = 255, green = 0, blue = 0, alpha = 255}
(0.00 secs, 522388 bytes)

Well, that was fast. But if we look at the memory used we see the obvious problem: it’s still not nearly enough. GHC is a clever compiler, so it skips doing the work for all the other pixels because it sees it doesn’t have to. Unfortunately it took me a whole lot longer to see this problem since I was testing a bigger section of code, which took a lot more memory, obscuring this problem.

Fortunately the kind people in the #haskell IRC channel pointed out this problem and also provided a way to force the evaluation of an object, namely the rnf function in Control.Parallel.Strategies. So let’s try that:

import Control.Parallel.Strategies

Go to ghci again and type

> rnf . pixels $ makeImage 800 600
    No instance for (NFData Pixel)
      arising from a use of `rnf' at :1:0-2
    Possible fix: add an instance declaration for (NFData Pixel)
    In the first argument of `(.)', namely `rnf'
    In the first argument of `($)', namely `rnf . pixels'
    In the expression: rnf . pixels $ makeImage 800 600

Hm. We’ll need to make Pixel an instance of NFData. Had this instead been a [[Int]] or another common data type it would have worked out of the box. Now we could just say

instance NFData Image

but this would tell only part of the story. If we look at the source code for Control.Parallel.Strategies we see that rnf does nothing by default. Using this version we would indeed evaluate the whole list (because the instance for lists has been correctly defined), but leave the pixels themselves unevaluated. In my test this resulted in a reported time of about 5 seconds instead of the 9-10 it actually took to fully load the image. Another look at the source code reveals that the correct implementation is very simple: just seq everything together.

instance NFData Pixel where
    rnf (Pixel r g b a) = rnf r `seq` rnf g `seq` rnf b `seq` rnf a

Let’s also make Image an instance of NFData for good measure.

instance NFData Image where
    rnf (Image w h ps) = rnf w `seq` rnf h `seq` rnf ps

Let’s try evaluating the image again:

> rnf $ makeImage 800 600
(1.78 secs, 92503000 bytes)

Quite a difference. Now we have a realistic memory use and the actual time it takes to fully evalutate the image. So if you’re ever unsure if something has been fully evaluated or if your compiler is just being sneaky, remember the rnf function. It might save you a few hours of confusion 🙂

Tags: , , ,

6 Responses to “Forcing evaluation in Haskell”

  1. 23Skidoo Says:

    Using Pixels made up of 4 Ints and storing them in a list is atrociously inefficient. You end up using 7x as much memory as opposed to a flat C array (with one uint32_t per pixel).

  2. TomMD Says:

    Why does the concept of an array have to be tied to C? Haskell arrays, while slightly clumsy to work with, would do fine too. But you’re right – if he’s memory concious he should probably use an array.

    On the other hand, if the data is read in order and written to disk then the entire image (list) might never be generated. In such a use case arrays would be a waste of space. Profiling has always proven to me to be a worthwhile step when trying to shave off seconds or bytes.

  3. Jedai Says:

    It seems to me that most of the usages would need to have at least portions (probably not just lines) of the image in memory at a time… But you ca

  4. Jedai Says:

    But you can have your cake and eat it too with some of the newest Arrays libraries (like uvector, vector, …) since they support stream fusion. They also try to be more efficient in space and time than the traditional arrays library in Haskell.

  5. medeamelana Says:

    Thanks for this! Succinct and clear explanation of rnf.

  6. Strictbench 0.1 « Bonsai Code Says:

    […] By Remco Niemeijer In the post ‘Forcing evaluation in Haskell‘ , I described how to fully evaluate a value to get around Haskell’s lazy evaluation. […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: