Learn Haskell Part 5: Better errors and Applicative
Telling exactly what happened
For now, we have only one way to tell there was an error. This is the Nothing variant of the Maybe type. We may want to carry data about what was the error. For example, if there was a parsing error, we would like to know which symbol is the culprit.
We will create a type Result that can replace Maybe, but with an error variant carrying an error description as a String instead of just Nothing.
data Result a
= Error String
| Result a
For convenience, we would like to show values of this type, so we will implement Show for it. You can implement it yourself or derive the implementation:
data Result a
= Error String
| Result a
Deriving is kind of magic, it implements boring but useful traits automaticaly for our types. We will use it more often now, but keep in mind that some times you may want to implement traits yourself because the implementation matters.
Now let's rewrite the function readRPNSymbol so that it returns a Result and not a Maybe:
readRPNSymbol string = case string of
"+" -> Result Add
"-" -> Result Sub
"*" -> Result Mul
"/" -> Result Div
"dup" -> Result Dup
"drop" -> Result Drop
s -> case readMaybe s of
Just n -> Result (Number n)
Nothing -> Error ("Unknown symbol '" ++ string ++ "'")
You see now we can tell which symbol made the parsing fail! Try this:
λ> readRPNSymbol "10"
Result 10
λ> readRPNSymbol "nope"
Error "Unknown symbol 'nope'"
The type Result is a very common type, and it is so common that in Haskell, there is a similar type called Either that is used everywhere we need to carry a result from a source, or from another source. Either is defined as:
data Either a b
= Left a
| Right b
It is almost the same as Result, but instead of having a String for the first variant, we have a generic type. This allows people to use Either with their own error type, or even to carry an object that simply have two different shapes. We will now use the type Either String in place of our custom Result type. And yes, I just partialy applied a type function, Either is a function over types that needs 2 types, so Either String is a function that takes 1 type. I can now create a type alias:
type Result = Either String
And this new Result type will behave like the old one except now its variants are Left instead of Error, and Right instead of Result.
Ok, let's modify our functions (readRPNSymbol, traverseList, readRPNExpression and evalRPN) to use this type instead now.
readRPNSymbol string = case string of
"+" -> Right Add
"-" -> Right Sub
"*" -> Right Mul
"/" -> Right Div
"dup" -> Right Dup
"drop" -> Right Drop
s -> case readMaybe s of
Just n -> Right (Number n)
Nothing -> Left ("Unknown symbol '" ++ string ++ "'")
traverseList f [] = Just []
traverseList f (a:as) =
case f a of
Left e -> Left e
Right b -> case traverseList f as of
Left e -> Left e
Right bs -> Right (b:bs)
readRPNExpression = traverseList readRPNSymbol . words
evalRPN str = case readRPNExpression str of
Right exp -> Right (evalRPNExpression [] exp)
Left err -> Left err
Wow, this was a lot of refactored code ... But we now have error reports !.. But it still was a lot of code to change just for this feature. After all, the only thing that changed are the variants, but the logic is the same everytime: if you have an error (Nothing or Left variant), you pass the error as is, but if you have a good value (Just or Right) then you pass it to the rest of the transformation chain (the rest of the program).
Surely there is some way to generalise this pattern, don't you think ? If you've come this far, you should know the answer is yes, of course. And to introduce this concept, we will start with evalRPN. We will analyse the types of objects used in this function.
aaaand that's it, we just compose those two functions together, the rest of the objects are just parameters and outputs. To better understand , let me just re-arrange evalRPNExpression a bit.
evalRPNExpression' expr = evalRPNExpression [] expr
-- could you write the above function better with partial application ?
Because the input stack is constant, we can get rid of it in the type analysis, so we can focus on the real job here: taking the RPNExpression output from readRPNExpression, and evaluate it. So we virtualy have these objects:
A function that outputs a RPNExpression but wrapped in a type that tells us "this function might fail". And a totaly pure function that takes a RPNExpression (raw, no wrapped at all), and outputs a RPNStack. We would be tempted to use evalRPNExpression' on the result of readRPNExpression , but the types don't match: we have a RPNExpression -> RPNStack , and we would like a Result RPNExpression -> Result RPNStack (because the output of evalRPN is Result RPNStack).
So we need a way to transform a function RPNExpression -> RPNStack to a function Result RPNExpression -> Result RPNStack, and we could get rid of the case of:
evalRPN str = applyOnResult evalRPNExpression' (readRPNExpression str)
Does that sound familiar to you ?
This function has almost the same shape as the map function, but instead of working on lists, it works on Result. So we are searching for a generalisation of map. Note that before we use Result type, we had the same need, but the needed function would need to work on Maybe instead.
First abstraction level: Functor
And such a function exists, but only for types that implement a special type class: Functor. Let's look at this class:
λ> :i Functor
type Functor :: (* -> *) -> Constraint
We see that for a type f to be a Functor it needs to implement the function fmap, which type is:
If we add unnecessary parenthesis we clearly see it is the function we are searching for:
Let's play with it:
λ> fmap (+1) [1,2,3]
[2,3,4]
λ> fmap length Nothing
Nothing
λ> fmap length (Just "hello")
Just 5
λ> fmap not (Left "ah ha !")
Left "ah ha !"
λ> fmap not (Right True)
Right False
Functors have this in common: they add a context to a type. Lists are sequences of values so fmap over a list transforms all its elements. Maybe allows to represent the absence of value, so fmap over an object of type Maybe will do nothing if we had Nothing, and transform the value if one exists. Either is like Maybe, but carries a value that is forwarded instead of simply Nothing.
As always, it is good to try recreating the functions we use. We already know map, which is fmap but for lists. Try implementing fmap for Maybe and Either, let's call them applyOnMaybe :: (a -> b) -> Maybe a -> Maybe b and applyOnEither :: (a -> b) -> Either e a -> Either e b.
applyOnMaybe f Nothing = Nothing
applyOnMaybe f (Just x) = Just (f x)
applyOnEither f (Right e) = Right e
applyOnEither f (Just x) = Just (f x)
Ok so how would it help us ? Remember evalRPN function ? It composes readRPNExpression and evalRPNExpression. Because readRPNExpression can fail, it returns a Maybe RPNExpression (Or Either String RPNExpression in our last version). Until now, we did it "by hand", using case of expression, but we ecan implement it using fmap.
evalRPN s = fmap (evalRPNExpression []) (readRPNExpression s)
Because we used fmap, we can change the Functor returned by readRPNExpression as we want, and this function will still work. This would have saved us from refactoring the function. And if we ever change the Functor again for something a bit more complex, this function would stay exactly the same, the only thing changing would be the returned type which we can easily change because it already is an alias:
type Result = Either String
-- can change to:
type Result = Maybe
-- or something more complicated
type Result = SomeBigFunctor
As long as Result is a Functor, it will work. This is very interesting, because in Haskell we often change the functors we work on to add more and more needed context and effects as our application grows.
Not so fast
We missed a place where fmap would have been useful. Can you spot it ? I'll give you a hint: it is in the function readRPNSymbolSafe (that we renamed readRPNSymbol):
readRPNSymbolSafe string = case string of
"+" -> Just Add
"-" -> Just Sub
"*" -> Just Mul
"/" -> Just Div
"dup" -> Just Dup
"drop" -> Just Drop
s -> case readMaybe s of
Just n -> Just (Number n)
Nothing -> Nothing
readRPNSymbolSafe string = case string of
"+" -> Just Add
"-" -> Just Sub
"*" -> Just Mul
"/" -> Just Div
"dup" -> Just Dup
"drop" -> Just Drop
s -> fmap Number (readMaybe s)
But you could argue that is is not useful because we changed our type Maybe a to Either String a and readMaybe does not return a Either String a. And you are right, but intelligent people developing Haskell's standard library already thought about this and did the Either counterpart of readMaybe , cleverly called readEither😆.
So our Either version will be:
readRPNSymbolSafe string = case string of
"+" -> Right Add
"-" -> Right Sub
"*" -> Right Mul
"/" -> Right Div
"dup" -> Right Dup
"drop" -> Right Drop
s -> fmap Number (readEither s)
In the end, using fmap helped us again.
This was not a really big help though, as you can see, we still had to refactor a lot. But Functor is the first level of abstraction of a bigger abstraction tower. Bear with me, the fun is only beginning!
Trivia: Some funny Functors
Se we saw List, Maybe, and Either. But there are way more Functor that exist. Two of them seem to be totaly useless, but are used by very complex code in very specific occasions. Here they are:
data Identity a = Identity a
data Const c a = Const c
The functor Identity adds absolutely no special context to its wrapped value, it justs carries a value. If would be the same to have the bare value in our hands. It is still useful, because sometimes you need to use generic code that work on any functor, but you don't need a context. So what you can do is wrap your value in an Identity functor, and call the function on it.
Try implementing fmap on Indentity:
fmap f (Identity x) = Identity (f x)
The functor Const is a bit special. If you noticed, Const c a carries is like Either b a. They both have 2 type variables, but Functor have only one, so like Either, that is not a functor but Either SomeType is a functor, the type Const alone is not a Functor, but Const SomeType is one. There is a catch though.
A value of type Const c a does not carry any value of type a, only a value of type c. If we were to implement fmap for Const c a, it would have this type:
The only implementation that makes sense is this:
fmap f (Const c) = Const c
We never apply the function, because we just can't! This means whatever process we wanted to do, it is just ignored and we get the value we started with.
Identity and Const are the Functor equivalent of functions id :: a -> a and const :: b -> a -> b. Again, they may seem useless, but they are incredibly powerful in the right context. If you want to know more about them, see Monad Transformers and lense. In particular, they are used by the package lense to mimic syntax from imperative languages when accessing attributes of structs.
Evaluation errors
Now that we handled parsing errors, we need to handle evaluation errors. We will use the same type (Result), and do the same as before: first refactor basic functions, then refactor bigger functions. Starting with evalRPNSymbol.
evalRPNSymbol stack symbol = case symbol of
Number n -> n : stack -- numbers are just pushed on the stack
Add -> bin (+)
Sub -> bin (-)
Mul -> bin (*)
Div -> bin (/)
Dup -> case stack of (x:stack) -> x : x : stack
Drop -> case stack of (x:stack) -> stack
where
bin op = case stack of
(y:x:stack) -> op x y : stack
One obvious possible error is the stack underflow. Try to evaluate evalRPN "dup":
λ> evalRPN "dup"
Just *** Exception: RPN.hs:43:10-49: Non-exhaustive patterns in case
Let's modify the function to make it aware of underflows. This one is pretty simple, we only need to make our case of expression exhaustive, meaning add a catch-all pattern which result in Nothing.
evalRPNSymbol stack symbol = case symbol of
Number n -> Right (n : stack) -- numbers are just pushed on the stack
Add -> bin (+)
Sub -> bin (-)
Mul -> bin (*)
Div -> bin (/)
Dup -> case stack of
(x:stack) -> Right (x : x : stack)
others -> Left ("Stack underflow at symbol " ++ show symbol)
Drop -> case stack of
(x:stack) -> Right stack
others -> Left ("Stack underflow at symbol " ++ show symbol)
where
bin op = case stack of
(y:x:stack) -> Right (op x y : stack)
others -> Left ("Stack underflow at symbol " ++ show symbol)
Now that evalRPNSymbol changed its result type, evalRPNExpression also needs a change. We cannot use foldl anymore because the types do not match, so here we go again, return to the old version with explicit recursion:
evalRPNExpression stack [] = stack
evalRPNExpression stack (sym:syms) = evalRPNExpression (evalRPNSymbol stack sym) syms
Of course this won't compile either, because evalRPNSymbol changed. But from here we can see what to do:
- Change return type of
evalRPNExpression - Change expressions to handle potential failure of
evalRPNSymbol
Once again, the type system is your friend. You can use holes to help you.
evalRPNExpression stack [] = Right stack
evalRPNExpression stack (sym:syms) =
case evalRPNSymbol stack sym of
Left error -> Left error
Right stack -> evalRPNExpression stack syms
And finaly change evalRPN (the easiest):
evalRPN str = case readRPNExpression str of
Left error -> Left error
Right exp -> evalRPNExpression [] exp
Second abstraction level: Applicative
We have seen Functors, which are a way to add context to values:
Functor | Meaning |
|---|---|
List | Sequences of values |
Maybe | Absence of value |
Either | Computation error |
Identity | No context (yes, again, it can be useful) |
Const | Carry a value and discard the computation |
| ... | ... and more ... |
We can fmap over them to apply pure functions (a -> b) on objects of type f a, to modify (maybe) the value(s) inside (maybe) the so called context (yes, lots of maybe, but the point is: we don't need to know which context we are in when using fmap)...
For now we always used fmap on functions of 1 parameter. But for more complex programs, we can have multiple values coming with their own context to combine. For example, the function traverseList we built earlier:
traverseList f [] = Just []
traverseList f (a:as) =
case f a of
Left e -> Left e
Right b -> case traverseList f as of
Left e -> Left e
Right bs -> Right (b:bs)
If we decompose this function and analyse its types, we can see that we do exactly this:
-- we do pattern matching on these expressions
f a :: Result b
traverseList f as :: Result [b]
-- and we output a combination of their wrapped value in the happy path
Right (b:bs)
We managed to combine two Results with a pure function (:), and before that, we did the same with Maybe instead you can check, it has the same type pattern. How about we try to implement it with fmap?
Let's first see what would happen if we apply (:) partialy with only one fmap on Maybe objects:
λ> :t fmap (:) (Just 1)
fmap (:) (Just 1) :: Num a => Maybe ([a] -> [a])
Oh, now we trapped our function inside a Maybe, we are stuck with a Maybe ([a] -> [a]), and we still have to apply a Maybe [a] (For example: Just [2, 3]) to it... We cannot resolve this type equation:
fmap (:) (Just 1) :: Num a => Maybe ([a] -> [a])
Just [2, 3] :: Num a => Maybe [a]
-- first parameter is not wrapped
we would need a function a bit more powerful, that had a type similar to fmap, but more like this:
That way we could just resolve the problem like this:
mysteryFunction (fmap (:) (Just 1)) (Just [2, 3])
We are lucky, because there exist such "super Functor", and in Haskell we call them Applicative Functors. Let me show you Applicative definition:
λ> :i Applicative
type Applicative :: (* -> *) -> Constraint
(<*>) :: f (a -> b) -> f a -> f b
-- stop reading here
(*>) :: f a -> f b -> f b
(<*) :: f a -> f b -> f a
-- Defined in ‘GHC.Base’
-- Defined in ‘GHC.Base’
Applicative ((,,,) a b c)
-- Defined in ‘GHC.Base’
Don't really bother with all functions, we can focus on only 2. In fact , the second function is an infix operator (you can see it because it is inside parenthesis), it has 2 parameters, let's first analyse it.
(<*>) :: f (a -> b) -> f a -> f b
Let's add another function signature below to reveal this operator's purpose:
-- I ommitted typeclass constraints because it is not relevant here
(<*>) :: f (a -> b) -> f a -> f b
Oh my, if you fix f to the Maybe functor you get the same type as mysteryFunction! This is exactly what we were looking for, if Maybe is Applicative we can write:
λ> fmap (:) (Just 1) <*> Just [2, 3]
Just [1,2,3]
It worked! We computed a concatenation wrapped in Maybe! This means we can combine all combinations of Just and Nothing. Let's put this expression in a function over f, a and b, and try adding different Maybe values (including Nothing) and try to make sense of the result.
λ> app2F a b = fmap f a <*> b
λ> app2F (:) Nothing (Just [2, 3])
Nothing
λ> app2F (:) (Just 1) Nothing
Nothing
λ> app2F (:) Nothing Nothing
Nothing
This seems to forward an error as soon at it encounters one. This is normal, since (:) needs 2 objects to combine, if we don't have both of them, we cannot do anything.
This function would not only apply functions inside a context, but also merge two different contexts together. In this example, we have two distinct Maybe a values, each carrying their own context. The type of the context is the same, but not its value. We could just have one Just 1 and a Nothing, or two Nothing, they are two independent context-wrapped values. The implementation takes care of all possible combination, and returns a value only if both parameters contained a value.
Let's try to see what it means for List to merge their context:
-- this time we work with the (+) operator to add 2 numbers together
λ> app2F (+) [] [2]
[]
λ> app2F (+) [1, 2] [3]
[4,5]
λ> app2F (+) [1, 2] [3, 4]
[4,5,5,6]
It seems to perform a cartesian product, it computes all combinations between elements of the first list, and elements of the second list. The last example shows it did [1+3, 2+3, 1+4, 2+4]. Another example:
λ> app2F (&&) [False, True] [False, True]
[False, False, False, True]
Here we go, we have the complete truth table of the (&&) operator 😁.
How (<*>) works on functions with more than 2 arguments
For now we only saw how to use (<*>) with functions of 2 arguments. First apply the function to its first argument, then use (<*>) on the resulting function. But how about this function:
ifThenElse predicate trueValue falseValue = if predicate then trueValue else falseValue
It would be nice to apply it in a Functor. Let's analyse the type of fmap ifThenElse, and progressively add operators:
λ> :t fmap ifThenElse
fmap ifThenElse :: Functor f => f Bool -> f (a -> a -> a)
λ> :t fmap ifThenElse (Just True)
fmap ifThenElse (Just True) :: Maybe (a -> a -> a)
λ> :t fmap ifThenElse (Just True) <*> (Just 0)
fmap ifThenElse (Just True) <*> (Just 0) :: Num a => Maybe (a -> a)
Wait, we applied (<*>) to a a -> a -> a and it worked? I thought the (<*>) operator worked only on functions of type a -> b? Hehehe... Remember partial application? Well it is a consequence of this concept. Because a -> a -> a is the same as a -> (a -> a), you can apply (<*>) to any function of any number of argument, and you will get a value that can be a function but with 1 less argument.
This means we can apply (<*>) as many time as we need to apply all arguments the function needs. Try this out:
λ> fmap ifThenElse (Just True) <*> (Just 0) <*> (Just 2)
Just 0
It worked! We applied ifThenElse to Just True, Just 0, and Just 2, and it returned the right value, still wrapped in the Maybe type.
The pure function
So an Applicative is a Functor that has (<*>), but we also saw a second function:
This function takes any pure value, and "wraps" it in the context of f. It does not seem like it is useful at all. After all, why having a function for just putting a value in a Just, or Right constructor?
Well, it may be a simple function, but remember we define a generic function, this means that whatever the Functor we work on is, we will have this function available. If we work with Just constructor, we are tied to the Maybe Functor and can't generalise. So an Applicative Functor also needs a way to wrap pure values, and it is expressed as the pure function. Let's try it:
λ> pure 5 :: Maybe Int
Just 5
λ> pure True :: Either String Bool
Right True
λ> pure "help" :: [String]
["hello"]
We could have used pure to make our current code easier to refactor! If we used pure instead of Just on the first version, changing from Maybe to Either for the result would have been easier. I encourage you to change any instance of Right to pure. It will be important later, because we may change our parsing and execution Functor.
Back to traverseList
Now that we know how to combine multiple values, we can try using this concept to re-code traverseList. First let's see which function to apply, and to what arguments:
traverseList f [] = Just []
traverseList f (a:as) =
-- f a is the first argument, we match against it
-- to unwrap its inner value (if it exists)
case f a of
Left e -> Left e
-- The second argument is traverseList f as
Right b -> case traverseList f as of
Left e -> Left e
-- b is the "unwraped" first value
-- and "bs" is the second "unwrapped" value.
-- we see that we combine them with (:)
Right bs -> Right (b:bs)
Try to use the function app2F we coded before to code the new traverseList function:
traverseList f [] = pure []
traverseList f (a:as) = app2F (:) (f a) (traverseList f as)
And does this function already exist? The answer is, you guessed it, yes, of course and its name is traverse. Check its type in ghci. You see it is much more complicated than traverseList, it is because we abstract more things. The list type becomes any type that is Traversable.
λ> :t traverse
traverse
:: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)
As always, prefer using standard functions even though here our traverseList is enough.