In today’s Programming Praxis exercise, our task is to see if Benford’s law (lower digits are more likely to be the first digit of numbers found i large collections of data) holds for the littoral areas of lakes in Minnesota. Let’s get started, shall we?
import Data.List import Text.Printf
The algorithm for calculating the distribution of leading digits given a group of numbers is virtually identical to the Scheme version, despite the fact that I didn’t look at the solution beforehand. It’s not too surprising though, since it’s simply the most obvious method. Note that the function argument is a list of floats. Initially I assumed all areas were integers, which resulted in incorrect results until I found that there were 10 floats hidden in the input (thank god for regular expressions).
firstDigits :: [Float] -> [(Char, Double)] firstDigits xs = map (\ds -> (head ds, 100 * toEnum (length ds) / toEnum (length xs))) . group . sort $ map (head . show) xs
With that function out of the way, the problem becomes trivial: just call firstDigits on all the appropriate numbers.
shriram :: [[String]] -> [(Char, Double)] shriram xs = firstDigits [n | [(n,_)] <- map (reads . (!! 3)) xs]
Of course we need to run the algorithm over the given data, using the parser from two exercises ago:
main :: IO () main = either print (mapM_ (uncurry $ printf "%c %f\n") . shriram) =<< readDB csv "csv.txt"
This produces the same list of percentages as the Scheme version. Looks like Benford’s Law holds in this case as well.