Friday, March 4, 2011

Rosseta code tutorials:Arithmetic evaluation

Today, let's consider another rosetta code sample - Arithmetic evaluation

We have to evaluate an arithmetic expression by building the parsing tree. So let's define tree elements being used by the parser:

#subject parse_order.

// --- Token ---

#class Token
{
    #field theValue.
    
    #method parse_order'get = 0.
    
    #method += aChar
    [
        theValue += aChar.
    ]
    
    #method + aNode
    [
        ^ aNode += self.
    ]
    
    #method new
    [
        theValue := String.
    ]
    
    #method numeric = theValue save:Real64Convertor.
}

// --- Node ---

#class Node
{
    #field theLeft.
    #field theRight.
    
    #role Empty
    {
        #method += aNode
        [
            theLeft := aNode.
            
            $self $setLeftAssigned.
        ]
    }
    
    #role LeftAssigned
    {
        #method += aNode
        [
            theRight := aNode.
            
            #shift.
        ]
    }
    
    #method $setLeftAssigned
    [
        #shift LeftAssigned.
    ]

    #method + aNode
    [
        #if (self parse_order > aNode parse_order)?
        [
            self += aNode.
        ]
        | [
            aNode += self.
            
            ^ aNode.
        ].
    ]
        
    #method += aNode
    [
        #if (theRight parse_order > aNode parse_order)?
        [
            theRight += aNode.
        ]
        | [
            theRight := aNode += theRight.
        ].        
    ]
    
    #method new
    [
        #shift Empty.
    ]
}

// --- SummaryNode

#class SummaryNode (Node)
{
    #method parse_order'get = 2.
    
    #method numeric = theLeft numeric + theRight numeric.
}

// --- DifferenceNode ---

#class DifferenceNode (Node)
{
    #method parse_order'get = 2.
    
    #method numeric = theLeft numeric - theRight numeric.
}

// --- ProductNode ---

#class ProductNode (Node)
{
    #method parse_order'get = 1.
    
    #method numeric = theLeft numeric * theRight numeric.
}

// --- FractionNode ---

#class FractionNode (Node)
{
    #method parse_order'get = 1.
    
    #method numeric = theLeft numeric / theRight numeric.
}

Our tree will consist of Token instances (leaf nodes) and Node descendants (branch node). Any branch branch may have two children. So the expression 1+2+3 will be presented like this:

       --(+)----
      |         |
  -- (+)--     [3]
 |        |
[1]      [2]

To correctly parse the expression like 1+2*3 any node should return its precedence level (parsing order). That's why we declare a new property (actually new subject) - parse_order. The tree is evaluated by sending real'get message to the top node, which in turn sends it to its children until we reach terminal node (Token). Each brunch node (SummaryNode, DifferenceNode and so on) performs the associated operation with the results and at the end we've got the result of the expression.

Let's consider some parts of the code. Token's numeric'get method converts the literal value to a numeric one:
theValue save:Real64Convertor
Add message (operator +) is used to connect two nodes with each other in accordance with their precedence levels. To add children nodes to the brunch one append message (+= operator) should be used. Note the use of roles there. Instead of checking each time if the sub node was already attached we define two roles - Empty and LeftAssigned. Due to current implementation limitation we cannot switch from one role to another one inside the role so a private method is used ($setLeftAssigned).
To support precedence-control parentheses we have to declare a special node - SubExpression. It is actually a sub tree. Note, we have to use Parser class before its declaration. It can be done by providing the symbol namespace (if our module is named as arithmeval - arithmeval'Parser).
#class SubExpression
{
    #field theParser.
    #field theCounter.
    
    #role EOF
    {
        #method eof'is []
        
        #method += aChar [ $self fail. ]
    }
    
    #method parse_order'get = 0.
    
    #method + aNode
    [
        ^ aNode += self.
    ]
    
    #method append : aChar
    [
        #var aCode := Int32Value::aChar.

        #if control if:(aCode == 41)
        [
            theCounter -= 1.
        ]
        | if:(aCode == 40)
        [
            theCounter += 1.
        ].
        
        #if(theCounter == 0)?
            [ #shift EOF. ^ $self. ].
        
        theParser evaluate:aChar.
    ]
    
    #method numeric = theParser numeric.
    
    #method new
    [
        theParser := arithmeval'Parser.
        theCounter := Integer << 1.
    ]
}
The field theCounter is used to deal with nested parentheses. Now let's declare our parser class.
#class Parser
{
    #field theToken.
    #field theTopNode.
    
    #role Start
    {
        #method evaluate : aChar
        [
            #if (40 == aChar)?
            [
                theToken := SubExpression.
                theTopNode := theToken.
                
                $self $setBrackets.
            ]
            | [
                theToken := Token.
                theTopNode := theToken.
                
                theToken += aChar.
                
                #shift.
            ].
        ]
    }
    
    #role Brackets
    {
        #method evaluate : aChar
        [
            theToken += aChar.
            
            #if theToken eof'is
            [
                #shift.
            ].
        ]
    }
    
    #role Operator
    {
        #method evaluate : aChar
        [
            #if Control if:(48 < aChar) if:(58 > aChar)
            [
                theToken := (Token += aChar).
                
                theTopNode += theToken.
                
                #shift.
            ]
            | if:(40 == aChar)
            [
                theToken := SubExpression.
                theTopNode += theToken.
                
                #shift Brackets.
            ]
            | [ $self fail. ].
        ]
    }
    
    #method numeric = theTopNode numeric.
    
    #method evaluate : aChar
    [
        #if Control if:(48 < aChar) if:(58 > aChar)
        [
            theToken += aChar.
        ]
        | if:(42 == aChar)  // *
        [
            theTopNode := theTopNode + ProductNode.
            
            #shift Operator.
        ]
        | if:(47 == aChar)  // /
        [
            theTopNode := theTopNode + FractionNode.
            
            #shift Operator.
        ]
        | if:(43 == aChar)  // +
        [
            theTopNode := theTopNode + SummaryNode.
            
            #shift Operator.
        ]
        | if:(45 == aChar)  // -
        [
            theTopNode := theTopNode + DifferenceNode.
            
            #shift Operator.
        ]
        | if:(40 == aChar)
        [
            theToken := SubExpression.
            theTopNode := theToken.
            
            #shift Brackets.
        ]
        | [ $self fail. ].
    ]
    
    #method new
    [
        #shift Start.
    ]
    
    #method $setBrackets
    [
        #shift Brackets.
    ]
}
Parser is an action class used by a literal enumerator. Evaluate method is called for every expression character to build the parser tree. Once again the roles are used to deal with special cases (like parentheses).
Parser class gives us an opportunity to speak about conditional statements in ELENA. Because the language does not have any built-in types it is problematic to implement classical conditional statements. One possible solution is implemented in Smalltalk where closures are used (the similar could be done in ELENA as well). The second solution used by the language takes into account another ELENA feature - alternative flow: if the object does not react to the message (no appropriate record in VMT) the flow is considered to be broken and control goes to the alternative one. So the conditional statement in the language looks like this:
#if anObject message1:aParameter1 // first condition
[
   ....
]
| message2:aParameter2  // second condition (optional)
[ 
   ... 
]
|    // else condition (optional) 
[
   ...
].
Though conditional statement can be used with any message, operators ? (is) and ! (isnot) are used in ELENA API. It is presumed that a true value (std'basic'True symbol) reacts to ? operator and a false one (std'basic'False) to ! operator. So the conditional statement may look like this
#if aFlag?
[
   // true part
]
| ! [
   // false part
]
| [
    // error part
].
Comparison operators (==, !=, >, < and so on) returns either true or false values (or fails if the operands are not compatible). So we could write like this
#if (32 < aChar)?
[
   ...
].
Note that we compare an integer to the character rather than other way around - to convert character value to a numeric one.

If several conditions are required we could use 'and' and 'or' messages (supported by boolean symbols)
#if (48 < aChar) and:(58 > aChar)
[
].
Note, that in this case both condition are evaluated (unlike C++ for example). But it is still possible to achieve using std'patterns'Control symbol
#if Control if:(48 < aChar) if:(58 > aChar)
[
]
In this case if the first condition is not true the control goes immediately to the alternative part (if presented).
And finally, the conditional statement (as well as a loop one) does not break the flow if the message fails and there is no alternative statements (as opposed to a normal statement). That why we have to break it ourselves (if we want to inform the method caller that the operation is failed).
#if Control if:(48 < aChar) if:(58 > aChar)
[
   theToken += aChar.
]
...
| [
   $self fail.
].
It is presumed that fail message is never handled.
#symbol Program =>
[
    #var aText := String.
    
    #loop ((Console >> aText) length > 0)?
    [
        #var aParser := Parser.

        Console << "=" << aText then: aText =>
        [
            Scan::aText run:aParser.
            
            ^ aParser numeric.
        ]
        | << "Invalid Expression".
        
        Console << "%n".
    ].
].
The main program body is quite simple. We read the user input, parse it and prints the result (or an error message if the expression is invalid) until the user enters the empty line. ctrl'Control-then method evaluates the action (closure; a la Smalltalk) and returns the result.

2 comments:

  1. What is #shift, what is its function ?

    ReplyDelete
  2. ELENA supports the concept of class roles. Actually the role is an alternative set of methods (extension of the main one). #shift replaces the current set of methods with the one from the role. #shift (without role name) replaces the role set with the main one.

    ReplyDelete