Saturday, December 17, 2011

A Postfix Calculator, Step-by-step - Part 1

Goal

Recently, I was introducing the C# language to someone from work. I gave him a few challenges, but after a while a colleague of mine asked him if he would be able to make a postfix calculator, or a calculator which uses postfix notation. More than that, we wanted to design a postfix expression evaluator (instead of a common calculator which receives a single operation as input).

The objective of this series of posts will be to show a step-by-step analysis of the problem, the design, implementation and, finally, testing. It seems an ideal for first-timers, being simple to implement but requiring a more deep analysis and thinking.

Analysis

The cool thing about this exercise is that, once understood, it's very simple to implement, while at first it may seem very complex. Most of us (myself included) knows this type of calculator as an user: you first enter operands and then the operation. But the inner workings are deceiving: they look convoluted and counter-intuitive.

Again, we are deceived by our mind... this kind of calculator is much more simpler than an usual expression evaluator. Let's begin by analyzing an example... let's see an addition in postfix:

Table 1 - The Polish Reverse Notation
Common NotationPostfix Notation (PRN)
10 + 5
10 5 +

Looking at the example, you can see that you should change the reading order of the expression in order to understand it more easily. Our brain (probably, since I'm no psychiatrist here) usually reads the first expression as:

1. Read 10, hold the value.
2. The operation is "add", seek another value.
3. Feeds 5 into the "add" operation.
4. Returns 15.
Listing 1 - Evaluating an Expression

Or something like that. In the Occident people learn to read left-to-right, and it's expected that we apply the same rule to expression reading. Take a look at the steps above. Are they simple? Not really. To execute the second step, you must execute another! It seems that our brain in making a call to a subroutine.

Now, let's look at the second expression. How to read it? We could apply the same left-to-right reading rule. We would have to store the values in two areas first and later retrieve then in order to perform the addition. Insightful readers may already have realized that it's easier to reading the expression right-to-left:

1. The operation is add, seek two values.
2. Feeds 5 into the add operation.
3. Feeds 10 into the operation.
4. Returns 15.
Listing 2 - Postfix Evalutaion

Hmm, what changed? It's simple enough, you see. You first understand the operation, which leads you to seek enough operands to perform it. But wait! There's another notation that looks exactly like this, isn't it? Take a look below:

Table 2 - Enter C-Like Notation
Common Notation Postfix Notation (PRN) C-like Notation
10 + 5 10 5 + add(5, 10)

Wow! It looks like a common function (method) call, doesn't it? This is an advantage of postfix; All in all, it looks more like how a machine would work than the "common notation". Now, another advantage of the PRN is that it doesn't requires parenthesis. Let's see another example:

Table 3 - Parenthesis in PRN
Common NotationPostfix Notation (PRN)C-like Notation
10 + 510 5 +add(5, 10)
(10 + 5) * (10 + 2)10 5 + 10 2 + *???

It's becoming harder. I have involved another operation (multiplication) to exemplify the parenthesis. But now, how does one evaluates the PRN above? Again, I will risk trying to explain the brain process behind it:

1. The operation is multiplication, seek two values.
2. Ooops! Operation!?
3. ...
Listing 3 - How to multiply?

Things get complicated after you read the value to the multiplication. As it turns out, it's not a value, but another operation. What if we evaluate it and feed the result into the multiplication?

1. The operation is multiplication, seek two values.
2.1. The operation is add, seek two values.
2.2. Feeds 2 into the add operation.
2.3. Feeds 10 into the operation.
2.4. Returns 12.
3.1. The operation is add, seek two values.
3.2. Feeds 5 into the add operation.
3.3. Feeds 10 into the operation.
3.4. Returns 15.
4. Returns 180.
Listing 4 - Multiplication Complete

OK, it's working properly. The call to multiplication resulted in two other calls, both to, again, adds. It seems that whenever we are evaluating an operation, we might run into another one. Hmm... which may lead into yet another one and so forth. This looks a hell lot like recursion, doesn't it? Well, hold that thought for a moment; we still must fill the third column of the table.

How would the operation above looks like in "C-like notation"? Let's again assume a multiply operation, just like add.

Table 4 - C-Like for Multiplication
Common NotationPostfix Notation (PRN)C-like Notation
10 + 5
10 5 +
add(5, 10)
(10 + 5) * (10 + 2)
10 5 + 10 2 + *
multiply(add(2, 10), add(10, 5))

Looking good! We have already established the principles and already projected the calculations to a computer language.

So let's focus on the C-like notation and expand it, after all, we are building a program, aren't we? Let's take another look at the steps in listing 4. There is something that calls attention: "seek two values". We do that a lot, don't we? Let's us assume that we could have an operation that seeks those values and then evaluate it. Could that be possible? Enter X(), a function that does that for us.

So
multiply(add(2, 10), add(10, 5)) 
becomes
multiply(add(X(), X()), add(X(), X())) 
What's this X operation we are talking about? An operation that extracts something from the expression and returns its value. And doesn't both multiply and add do exactly that -- return a value from an expression? Oh yes, they do, and they are recursive, because, before returning a value, they will extract two more values from the expression. Yes! We got back to that recursive thing, didn't we? I told you to hold on to that, didn't I?

(I said that this was a good exercise for first timers, but recursion is an advanced topic. If you need help to follow it, go to Wikipedia.)

Anyway, if X takes something out of the expression (an operation or operand) and chooses whether to call multiply or add? Then, add and multiply can rely on X to retrieve more data. We have a two-step recursion (A calls B, B calls A). Let's check an algorithm that uses this X thing:

Let X:
1. a ← next_factor
2. If a is a multiplication
2.1     b ← X()   (recursion)
2.2     c ← X()   (recursion)
2.3.    Return b times c
3. If a is an addition
3.1     b ← X()   (recursion)
3.2     c ← X()   (recursion)
3.3.    Return b plus c
4. a must be a value, so, return a    (else)
Listing 5 - Recursion

So far I have avoided the non-commutative operation subtraction and division. That's because there's a particularity in PRN, which is that the operands to an operation are always read left-to-right. In addition and multiplication it doesn't matter, them being recursive, but we must reverse the operation order for subtraction and division. Let me clarify what I just said with an example:

Table 5 - Subtraction and Division
Common NotationPostfix Notation (PRN)C-like Notation
10 + 5
10 5 +
add(5, 10)
(10 + 5) * (10 + 2)
10 5 + 10 2 + *
multiply(add(2, 10), add(10, 5))
10 / 2
10 2 /
divide(10, 2)

We could have thought, from my earlier explanation, that 10 / 2 would actually be written as 2 10 /, because, following the recursive algorithm for addition or multiplication:

AlgorithmApplication
0. Let X:
1. a ← next_factor
2. If a is a multiplication
2.1.    b ← X()   (recursion)
2.2.    c ← X()   (recursion)
2.3.    Return b times c
3. If a is an addition
3.1.    b ← X()   (recursion)
3.2.    c ← X()   (recursion)
3.3.    Return b plus c
4. If a is a division
4.1    b ← X() 
4.2    c ← X()
4.3    Return b / c

4. If a is a subtraction
4.1    b ← X() 
4.2    c ← X()
4.3    Return b - c

5. a must be a value, so, return a    (else)
Expression is "10 2 /"

(0)  X();
(1)  a ← next_factor    {10 2 [/]}
(1') a ← /
(4)  Is (a="/") is a division:
(4.1)    b ← X()
(4.1 leads to 0)
(1)  a ← next_factor    {10 [2] /}
(1') a ← 2
(5) Return 2
(returns to 4.1)
(4.1')    b ← 2
(4.2)     c ← X()
(4.1 leads to 0)
(1)  a ← next_factor {[10] 2 /}
(1) a ← 10
(5) Return 10
(returns to 4.2)
(4.2') c ← 10
(4.3) Return b / c
(4.3') Return 2 / 10
Listing 6 - Non commutative operations

So, as you can see from the listing, we have to invert b and c. This should be simple enough, instead of "Return b / c", you just "Return c / b". I won't get back to the algorithm, but you can make the alteration by yourself.

Once the algorithm above is established, you can see that any PRN expression can be solved by applying it. We must turn our attention to the next_factor operation. We could parse the expression reversely, reading characters from right to left, but then  another problem arises, since we'll be reading the digits in the wrong order (e.g. 10 will be read as {0, 1}). This means taking care of reversing them back to the usual order, which can make the parsing become highly complex. So instead, let's look at our algorithm once more... we need a structure that will process first the last items stored. Hmm... Last In, First Out...? A LIFO, which is a stack! And, hear me, stacks are goooood! Whenever you can implement something with a stack, do so.They usually make for nice, clean algorithms.²

Anyway, so we can parse the expression left-to-right, pushing factors into a list and, later, as X calls for a new item, we pop them out.¹

Wrapping Up

Uffs! We finished analyzing the problem, resulting in devising an algorithm to process the data, the necessary data structure and rules. By checking our reasoning above, we could create a formal process of analysis:

  1. Gather requirements from the problem, do your homework! 
  2. Iterate in order to create a mathematical model to describe your problem. Most things are reduced to a series of calculations.
  3. Iterate in order to define data structures that can support your algorithm. You will usually end up with lists, queues and stacks. Or some cases (on data-centric problems) you will deal with dictionaries.
This is a very high-level process for analysis. As you can see, we have not used any kind of formalism to solve this problem. The goal here is exactly that: on a day-to-day basis, you won't be required to hold a degree in mathematics (or computer science, for that matter) in order to design problems.

Next On

On the next article we will produce an implementation for the algorithm above. As usual, after having a a well defined problem and a fully designed solution, building the program should be very, very simple. But I will try to iterate over the algorithm to show programming steps, like where you should focus first, then how to clean-up the code and where you should make heavy use of comments.



Notes

¹ Veterans might have realized earlier that this model is, in fact, a Stack Machine. I refrain to mention it so that I wouldn't scare of newcomers to programming. It was extremely helpful to have read the excellent article from Eric Lippert that discussed the IL as a stack machine.

² In my particular opinion, of course. This is by no means a rule.