Learn Haskell Part 3: A naïve Reverse Polish Notation calculator

2025-07-11

So what now, we played a bit with core concepts of Haskell, it was fun (for me at least), but we didn't do actualy useful programs. Only toy examples. A full program will need more than that ! So let's do it, we will build something useful ! A Reverse Polish Notation calculator !

Why this ? Because it is very comprehensive. It involves parsing, error handling (yeah I know if you come from JS it may be a weird concept 😇), inputs/outputs, software architecture... And we may even talk a bit about the PDF file format (which is based on RPN).

Once done, the software will look like this:

> 10 20 +
30
> dup
30:30
> 2 /
15:30
> +
45
> 0 /
Error: division by zero at symbol '/'
> drop
[]
> drop
Error: empty stack at symbol 'drop'

So let's begin by opening the file RPN.hs

First things first, a model of our world

Before doing anything with code, we need to think about what is the think we want to build. Reverse Polish Notation is a way to write arithmetic expressions without the need of parenthesis. It does so by having an unambiguous syntax where every symbol is read from left to right, and an action is done for each symbol. Numbers are put on a stack, and other symbols do things with the stack. Here is an example:

1 2 + dup *

will be evaluated like this:

push 1 on the stack
push 2 on the stack
take out the 2 topmost elements (1 and 2), add them (3), and push the result (3) on the stack
take out the topmost element (3) and push it twice (duplicate the stack top)
take out the 2 topmost elements (3 and 3), multiply them (9), and push the result on the stack

In the end, the stack contains the number 9, we computed the expression .

We see that the program will have:

a stack
a RPN expression
- an expression is a list of RPN symbols
- a RPN symbol is either a number, or an arithmetic operation, or dup (for now)
a function to modify the stack based on a given expression

The stack is the simplest type. Since we only deal with numbers, we can just define a stack as a list of numbers. Why a list ? Because it has the same shape as a stack:

to push, you add an element at the head of the list : element : stack.
to pop, you can match against the top : stack pattern.

So we can just create a type alias:

type RPNStack = [Double]

This is exactly how String is defined by the way. You can check with :i String:

λ> :i String
type String :: *
type String = [Char]
        -- Defined in ‘GHC.Base’

To represent our RPN symbol, we need to define our own type. In Haskell, most types are defined using the data syntax. It is used like this:

data Type
  = Constructor1 AttributeType11 AttributeType12 AttributeType13
  | Constructor2 AttributeType21 AttributeType22
  | Constructor3
  -- ...

A type consist of multiple variant, also called "constructors", and each variant can have 0 or more attributes. For example, our RPN symbol type could be defined as:

data RPNSymbol
  = Number Double
  | Add
  | Sub
  | Mul
  | Div
  | Dup
  | Drop

A RPNSymbol can be a Number, and a Number has 1 attribute of type Double, because it must contain which number to push. A RPNSymbol can also be an operation (for now we only defined 5 of them).

Printing

Let's try to play with this type:

λ> :l RPN.hs
λ> :t Add
Add :: RPNSymbol
λ> Add

<interactive>:56:1: error: [GHC-39999]
    • No instance for ‘Show RPNSymbol’ arising from a use of ‘print’
    • In a stmt of an interactive GHCi command: print it

Oh, ghci did not like it because RPNSymbol cannot be printed, it does not implement the Show typeclass. And Haskell tells us by saying • No instance for ‘Show RPNSymbol’ arising from a use of ‘print’, which means "the print function (used by ghci to print the result of an expression) needs to implement Show, but RPNSymbol does not".

Let's implement this function then ! Implementing a typeclass is just a bit of syntax:

instance Show RPNSymbol where
  show symbol = ...

We start with instance Class Type where, and then define the needed functions. In the case of Show, the only function to define is show. Don't forget to indent your functions !

So how would we implement it ? Pattern matching would be perfect here:

instance Show RPNSymbol where
  show (Number n) = show n
  show Add = "+"
  show Sub = "-"
  show Mul = "*"
  show Div = "/"
  show Dup = "dup"
  show Drop = "drop"

You can see here that constructors of our type can be used as patterns, this is why pattern matching is so useful. Did you see also that we called show on the attribute of Number ? It's ok to do this because Double implement show, so the show on the right is the Double version and not the RPNSymbol version.

Now try again:

λ> :l RPN.hs
λ> Add
+
λ> Mul
*
λ> Number 5
5.0
λ> [Number 1, Dup, Add]
[1.0,dup,+]

Did you see how constructors with attributes are used ? They are like functions, you simply put attributes next to the constructor. And in fact, they really are functions. Check the type of Number:

λ> :t Number
Number :: Double -> RPNSymbol

How would you implement show using case of instead of multiple definitions ?

Show solution

instance Show RPNSymbol where
  show symbol = case symbol of
    Number n -> show n
    Add -> "+"
    Sub -> "-"
    Mul -> "*"
    Div -> "/"
    Dup -> "dup"
    Drop -> "drop"

It is the same with lists, the (:) operator is a constructor for the list type, so it is usable as a pattern, AND as a function that creates a list with a head and a tail:

λ> :t (:)
(:) :: a -> [a] -> [a]
λ> [1,2,3] == 1:2:3:[]
True
λ> case [1,2,3,4] of (x:y:rest) -> (x, y, rest)
(1,2,[3,4])

Try implementing the functions swap, which swaps the two first elements of a list if the list has at least two elements, and does not modify the list eitherway.

Show solution

swap l = case l of
  (x:y:xs) -> y:x:xs
  l -> l

Parsing

Now that we can see a RPNSymbol, it would be nice to read it. We won't implement the Read typeclass, because it is a bit difficult. We will simply implement a readRPNSymbol function:

readRPNSymbol :: String -> RPNSymbol

Your turn! Try to implement it yourself. It should do this:

If the string is an operation symbol, the result is the corresponding operaton
If not, then try reading the string as a number
Some output examples:

λ> readRPNSymbol "+"
+ -- remember, this is the textual representation of Add constructor
λ> readRPNSymbol "12"
12.0 -- this is the textual representation of Number constructor

String literals ("blabla") are also valid String patterns. At the moment, we don't care about total functions. Your function can be partial. We will make it total soon.

Show solution

readRPNSymbol string = case string of
  "+" -> Add
  "-" -> Sub
  "*" -> Mul
  "/" -> Div
  "dup" -> Dup
  "drop" -> Drop
  s -> Number (read s)

Evaluating a symbol

For now we can only show and parse our symbols. But wouldn't it be nice to evaluate them ? Make them actualy do something to the RPN stack ? Let's define a function:

evalRPNSymbol :: RPNStack -> RPNSymbol -> RPNStack

Same as before, we will enumerate all possible patterns for the type RPNSymbol :

evalRPNSymbol :: RPNStack -> RPNSymbol -> RPNStack
evalRPNSymbol stack symbol = case symbol of
    -- ...

For a number, we simply put it on the stack (try implementing the Number case):

Show a possible solution

evalRPNSymbol :: RPNStack -> RPNSymbol -> RPNStack
evalRPNSymbol stack symbol = case symbol of
  Number n -> n : stack -- numbers are just pushed on the stack

For the Add case, we need to get the first and second elements of the stack. We can do it with pattern matching:

Show a possible solution

evalRPNSymbol :: RPNStack -> RPNSymbol -> RPNStack
evalRPNSymbol stack symbol = case symbol of
  Number n -> n : stack -- numbers are just pushed on the stack
  Add -> case stack of (y:x:stack) -> (x + y) : stack

Now try implementing the other cases:

Show a possible solution

evalRPNSymbol :: RPNStack -> RPNSymbol -> RPNStack
evalRPNSymbol stack symbol = case symbol of
  Number n -> n : stack -- numbers are just pushed on the stack
  Add -> case stack of (y:x:stack) -> (x + y) : stack
  Sub -> case stack of (y:x:stack) -> (x - y) : stack
  Mul -> case stack of (y:x:stack) -> (x * y) : stack
  Div -> case stack of (y:x:stack) -> (x / y) : stack
  Dup -> case stack of (x:stack)   -> x : x : stack
  Drop -> case stack of (x:stack)  -> stack

You feel like it is really annoying and verbose ? Don't worry, I do two. But for now, this is all we can do with what we know. It will be better soon. You can still try the function now:

λ> :l RPN.hs
λ> evalRPNSymbol [] (readRPNSymbol "1")
[1.0]
λ> evalRPNSymbol [1,2] (readRPNSymbol "+")
[3.0]

Loops using recursion

Now that we can process a single RPNSymbol, it would be nice to process a whole RPN expression. So first let's define what is a RPNExpression.

type RPNExpression = [RPNSymbol]

Simple, right ? Now let's read a whole expression. An expression is entered as a String by the user, and is a list of symbols separated by spaces. Ok, so we could first separate our input String into multiple Strings separated by space with the function words, then process each word using readRPNSymbol. This is a nice approach, chain together functions that do each a little bit of the whole process, and you have the whole process. We could implement our own words function, and if you feel like it be my guest, but we are not there yet.

So our whole process resembles this:

readRPNExpression :: String -> RPNExpression
readRPNExpression s = toSymbols (words s)

You can write this function in your module. What we need now is the function toSymbols. First of all, let's try to compute its type.

We know its output must be a RPNExpression (aka [RPNSymbol]) since its result is the result of readRPNExpression.
Its input is the result of words, so it is a [String] (list of strings)

So toSymbols has type [String] -> [RPNSymbol]. It seems obvious that this function reads each String of the input list into its RPNSymbol equivalent. So something like a for each ? No, we don't have these in Haskell. We will need to think ~~different~~ recursive.

A recursive function is a function that calls itself. Since it calls itself, the "self call" (recursive call) will do the same work again, and again, and again... until some exit condition is true and we just stop calling the function recursively. Oh but I just described a loop, didn't I?

To think recursive, we first need to understand how the function behaves with certain simple inputs (e.g. the empty list, a list with one element ...), and try to find a general case (which is often the recursive case), and at least one specific case which does not involve recursion (which we call the base case or terminal case) to stop the recursion once reached. It's like finding the termination condition of a loop.

So, let's think about the empty list first:

toSymbols [] = _what

Write this line and load the module in ghci...

λ> :l RPN.hs
[1 of 2] Compiling Main             ( RPN.hs, interpreted )

RPN.hs:5:16: error: [GHC-88464]
    • Found hole: _what :: RPNExpression
      Or perhaps ‘_what’ is mis-spelled, or not in scope
    • In an equation for ‘toSymbols’: toSymbols [] = _what
    • Relevant bindings include
        toSymbols :: [String] -> RPNExpression (bound at RPN.hs:5:1)
      Valid hole fits include
        [] :: forall a. [a]
          with [] @RPNSymbol
          (bound at <wired into compiler>)
        mempty :: forall a. Monoid a => a
          with mempty @RPNExpression
          (imported from ‘Prelude’ at RPN.hs:1:1
           (and originally defined in ‘GHC.Base’))
  |
5 | toSymbols [] = _what
  |                ^^^^^
Failed, no modules loaded.

Holes

Oh, it seems that our ghci does not like the name _what it says it's a hole. Holes are names starting with a _ and not bound to any value. When Haskell compiler gets an "unknown variable" error on a hole, it just tells us "oh I found a place where I think you wanted to put something, but did not know what, its type is the type. Oh and by the way, I found some values that could fit this hole...". Here, we see that _what is of type RPNExpression, and it found two values that could fit: [] (the empty list, of course), and mempty (a special object from the Monoid typeclass, which, for lists, is the empty list).

Thanks ghc, now that I think of it, if we have no String to parse, there should be no RPNSymbol, so an empty list should be the right value of toSymbols [] !

toSymbols [] = []

Nice, we can parse empty Strings ! Try it:

λ> :l RPN.hs
λ> readRPNExpression ""
[]

Now let's think about the other cases:

toSymbols (c:[]) = _1
toSymbols (a:(b:[])) = _2
toSymbols (a:(b:(c:[]))) = _3

For the first case, we will want to transform the string a to a symbol:

toSymbols (c:[]) = readRPNSymbol c

Oh, but since it's a list, we should build it in a list. We could just wrap our expression in [ ], but let me write it in a way that will show a pattern later:

toSymbols (c:[]) = readRPNSymbol c : []

It is ok to write it, (:) chains a head to a tail, here, the tail is just the empty list. Now let's think about the second and third case. Well it should be something like this:

toSymbols (b:(c:[])) = readRPNSymbol b : (readRPNSymbol c : [])
toSymbols (a:(b:(c:[]))) = readRPNSymbol a : (readRPNSymbol b : (readRPNSymbol c : []))

Hm... wait, there seem to be a repeating pattern here... let me re-arrange a bit:

toSymbols       (c:[])   =                                       readRPNSymbol c : []
toSymbols    (b:(c:[]))  =                    readRPNSymbol b : (readRPNSymbol c : [])
toSymbols (a:(b:(c:[]))) = readRPNSymbol a : (readRPNSymbol b : (readRPNSymbol c : []))

You see how the third case contains the second case ? The input of case _3 is c followed by the same input as case _2. And the output of case _3 is really just readRPNSymbol c followed by the output of case _2 ! We could rewrite case _3 based on case _2. Remember, the = symbol in Haskell truly means "equivalent", so if you see an occurrence of readRPNSymbol b : (readRPNSymbol c : []), it is strictly equivalent as toSymbols (b:(c:[])).

toSymbols (a:(b:(c:[]))) = readRPNSymbol a : toSymbols (b:(c:[]))

But wait, the same applies to case _2 and case _1 :

toSymbols (b:(c:[])) = readRPNSymbol b : toSymbols (c:[])

What about case _1 ? Well, remember the base case ? It is equivalent to the empty list, and what is readRPNSymbol c attached to ? The empty list. So we can rewrite it also:

toSymbols (c:[]) = readRPNSymbol c : toSymbols []

Generalize a pattern

We see that everytime we encounter a list of the shape x:xs (a x followed by the rest of the xs), we transform the x (readRPNSymbol x), and chain the transformed value to the transformation of the rest (toSymbols xs). This means instead of enumerating all kinds of lists, we could just write:

Maybe try to implement it before seeing the answer

-- don't forget the base case
toSymbols [] = []
toSymbols (x:xs) = readRPNSymbol x : toSymbols xs

Try it, you'll see that all cases we unfolded by hand are really just particular cases of this general one.

Wow, we now have a function that parses a whole RPN expression.

λ> readRPNExpression "10 10 +"
[10.0,10.0,+]

Hiding auxiliary functions

If you think about it, the function toSymbols has no other purpose outside of readRPNExpression, but right now it still pollutes our scope. People can access it, and its name may collide with other names in other modules. This is not clean.

We could define it in a let expression in readRPNExpression:

readRPNExpression s =
  let toSymbols [] = []
      toSymbols (x:xs) = readRPNSymbol x : toSymbols xs
  in toSymbols (words s)

But we can do better, with a special syntax for functions. The where clause:

readRPNExpression s = toSymbols (words s)
  where
    toSymbols [] = []
    toSymbols (x:xs) = readRPNSymbol x : toSymbols xs

In a where clause, you can put everything that needs to be in the scope of the function, and keep your main expression clean.

Evaluate a whole expression

Now we need a function that evaluates an expression.

evalRPNExpression :: RPNStack -> RPNExpression -> RPNStack

Evaluating a whole expression is pretty simple, we just need to evaluate each symbol from left to right. If we have an expression with 3 symbols: [a,b,c], we need to evaluate a then b then c. Unfolding this will result in:

evalRPNSymbol (evalRPNSymbol (evalRPNSymbol [] a) b) c

Let's do the same as before: doing a few cases by hand.

-- no symbol means nothing to evaluate, the stack does not change
evalRPNExpression s          []    = s
-- evaluate the symbol c
evalRPNExpression s       (c:[])   = evalRPNSymbol s c
evalRPNExpression s    (b:(c:[]))  = evalRPNSymbol (evalRPNSymbol b s) c
evalRPNExpression s (a:(b:(c:[]))) = evalRPNSymbol (evalRPNSymbol (evalRPNSymbol s a) b) c

It feels odd, it's like we do have a pattern, but it is backward. In the last example, we had a tree of operations that "combine the processing of the head with the processing of the tail". The pattern had the same order as the input ([a,b,c] gave (:) a ((:) b ((:) c []))). But now, the pattern seem to give eval (eval (eval s a) b) c).

Let's think about it for a second. evalRPNExpression stack expr is the result of evaluating an expression expr on a stack stack. This means the function takes a stack that could already be in some state.

If we rewrite the equations, maybe we can see a pattern:

evalRPNExpression s1          []    = (s1                                                  )
evalRPNExpression s2       (c:[])   = evalRPNSymbol (s2                                  ) c
evalRPNExpression s3    (b:(c:[]))  = evalRPNSymbol (evalRPNSymbol (s3                ) b) c
evalRPNExpression s4 (a:(b:(c:[]))) = evalRPNSymbol (evalRPNSymbol (evalRPNSymbol s4 a) b) c

If you compare the right hand side of lines 2 and 3:

evalRPNSymbol (s2                ) c
evalRPNSymbol (evalRPNSymbol s3 b) c

You see that the only difference here is s2 and evalRPNSymbol s3 b. If we transform s2 with evalRPNSymbol s3 b, we get the same value. So the 3rd line is the same as the second line but with s2 = (evalRPNSymbol s3 b). And when two things are equivalent, we can change one with the other. So we can rewrite line 3 like this:

evalRPNExpression s3 (b:(c:[])) = evalRPNExpression (evalRPNSymbol s3 b) (c:[])

Once again, we see a recursive pattern emerge. If we transform the pattern (b:(c:[])) with the pattern (sym:syms), we can rewrite our line:

evalRPNExpression s3 (sym:syms) = evalRPNExpression (evalRPNSymbol s3 sym) syms

Let's verify this by unfolding evalRPNExpression [] [a,b,c].

Show unfolding

  evalRPNExpression [] [a,b,c]
= evalRPNExpression [] (a:(b:(c:[])))
= evalRPNExpression (evalRPNSymbol [] a) (b:(c:[]))                             -- by recursive case
= evalRPNExpression (evalRPNSymbol (evalRPNSymbol [] a) b) (c:[])               -- by recursive case
= evalRPNExpression (evalRPNSymbol (evalRPNSymbol (evalRPNSymbol [] a) b) c) [] -- by recursive case
= evalRPNSymbol (evalRPNSymbol (evalRPNSymbol [] a) b) c                        -- by terminal case

In the end, the function looks like this:

-- never forget the base case
evalRPNExpression s [] = s
evalRPNExpression s (x:xs) = evalRPNExpression (evalRPNSymbol s x) xs

Iterative form to recursive form translation

evalRPNExpression is interesting. It is really simple to translate it in an imperative language. It would be the same as doing this in JS:

function evalRPNExpression(syms, stack) {
  for (const sym in syms) {
    stack = evalRPNSymbol(sym, stack);
  }
  return stack;
}

In JS, we use a mutable variable called stack, that we modify at each loop iteration. In Haskell, we carry the modified stack in the function's parameters. This is how you would translate a loop into a recursive function. If you have difficulties thinking about a recursive implementation, try first doing the iterative version, and then translate it.

If the iterative function needs temporary values, like this one:

function power(x, n) {
  let r = 1;
  while (n > 0) {
    n = n - 1;
    r = r * x;
  }
  return r;
}

You can see it as two functions. One having only the parameters, and the other having the parameters AND the temporary values:

-- the function we will expose
power x n = powerAux x n 1
  where
    -- the helper function, with "r" as the temporary variable
    powerAux x 0 r = r
    -- look at how n = n - 1 translates to "call powerAux with n-1 as the new n"
    -- same with r
    powerAux x n r = powerAux x (n - 1) (r * x)

If you need to practice, try translating these functions:

function factorial(n) {
  let r = 1;
  while (n > 0) {
    r = r * n;
    n = n - 1;
  }
  return r;
}

function fibonacci(n) {
  let a = 0;
  let b = 1;
  while (n > 0) {
    let tmp = a;
    a = b;
    b = tmp + b;
    n = n - 1;
  }
  return a;
}

Show solution

factorial n = factorialAux n 1
  where
    factorialAux 0 r = r
    factorialAux n r = factorialAux (n - 1) (r * n)

fibonacci n = fibonacciAux n 0 1
  where
    fibonacciAux 0 a b = a
    fibonacciAux n a b = let tmp = a in fibonacciAux (n - 1) b (b + tmp)

Printing an expression

We can already show a RPNExpression because it is a list, and we can show a list of Showable things. But this is not pretty. The textual representation of an expression is not [sym1,sym2,...], it is sym1 sym2.... So we need to define a pretty printing function that does exactly this.

Printing an expression is as simple as parsing it: You take a list of symbols, and print them one by one, separated by spaces. For the parsing part, we used the function words to split a String by spaces, for this one we will use unwords, which ... does the opposite: join String with spaces between them. So we should have the same function shape as for parsing, reversed:

-- because RPNExpression is not a type but a type alias, we cannot implement a type class for it
-- we then need our own function
showRPNExpression :: RPNExpression -> String
showRPNExpression expr = unwords (toStrings expr)
  where
    toStrings expr = _impl

Try doing it yourself now ;)

Show solution

toStrings [] = []
toStrings (x:xs) = show x : toStrings xs

Now that we have all these little functions, we can combine them to create an evaluator that takes an exprssion as a String and gives the result of the expression... or crash if it encounters any error.

Show solution

evalRPN :: String -> [Double]
evalRPN s = evalRPNExpression [] (readRPNExpression s)

No REPL, a lot of repetitions, no proper error handling, but still, it works pretty well! Good job!

Previous chapter Next chapter