In today’s Programming Praxis exercise,we have to write functions to determine if one string is a subset of another as if we were in an interview. Let’s get started, shall we?
First, some imports.
import Data.List import qualified Data.Map as M import qualified Data.IntMap as I
My first attempt doesn’t actually work, since the intersect function only checks whether an element is a member of the second list. It doesn’t keep track of duplicates.
subsetOf1 :: Eq a => [a] -> [a] -> Bool subsetOf1 xs ys = intersect xs ys == xs
Since a call to a library function won’t suffice, we’ll have to whip up something ourselves. The obvious way is to get a count of all the characters in both strings and check if the second string has an equal or higher count for all the characters in the first string. Of course this method is O(n * m), so it’s not very efficient.
subsetOf2 :: Ord a => [a] -> [a] -> Bool subsetOf2 xs ys = all (\(c, n) -> maybe False (n <=) . lookup c $ count ys) $ count xs where count = map (\x -> (head x, length x)) . group . sort
To improve the big O complexity, we’re going to switch to a data type that has a faster lookup. Like the previous version, counting the frequency of each letter is O(n log n), but using Maps the comparison can now be done in O(m + n), meaning the overall complexity remains at O(n log n).
subsetOf3 :: Ord a => [a] -> [a] -> Bool subsetOf3 xs ys = M.null $ M.differenceWith (\x y -> if x <= y then Nothing else Just x) (f xs) (f ys) where f = M.fromListWith (+) . map (flip (,) 1)
Since the keys of the map are characters, there’s a further optimization we can make. By converting the characters to integers we can use an IntMap instead of a plain Map. An IntMap has a lookup of O(min(n,W)), with W being the amount of bits in an Int. For any non-trivial n, this results in O(1) lookup. Counting all the letters can now be done in O(n). Since the comparison still takes O(m + n), the resulting complexity is O(m + n). This is the minimum we can achieve, as we need to fully evaluate both strings to produce an answer, which is an O(m + n) operation.
subsetOf4 :: Enum a => [a] -> [a] -> Bool subsetOf4 xs ys = I.null $ I.differenceWith (\x y -> if x <= y then Nothing else Just x) (f xs) (f ys) where f = I.fromListWith (+) . map (flip (,) 1 . fromEnum)
A quick test shows that the first function indeed fails, but the other ones succeed.
main :: IO () main = do let test f = print (f "da" "abcd", not $ f "dad" "abcd") test subsetOf1 test subsetOf2 test subsetOf3 test subsetOf4
I must say I hope that I have internet access during the interview, though. If not, I would have had to come up with an alternative for differenceWith and I might not have remembered the existence of IntMap. In that case I’d probably have gone with something along the lines of the array-based solution from number 4 of the Scheme solutions.