Lab 16 - Parsing
Due by 11:59pm on 2023-08-01.
Starter Files
Download lab16.zip. Inside the archive, you will find starter files for the questions in this lab.
Topics
Calculator Language
In Project 3 you will build an interactive interpreter for the Calculator language.
The interpreter will allow the user to enter valid calculator expressions and evaluate them. For example:
calc >> (+ 3 4)
7
calc >> (+ 3 (+ 2 2))
7
To do that, we must first review the calculator language.
The Calculator language has primitive expressions, such as 2, -4, 5.6, and call expressions.
A call expression is a combination that begins with an operator (+, -, *, /) followed by 0 or more expressions.
For example:
(+ 1 2)(+ 1 2 3)(/ 3 (+ 4 5))
Notice that the operator comes first and the operands follow. (So, instead of having (3 + 4), we have (+ 3 4).)
Additionally, the operands can be another call expression.
The rules of evaluation are as follows:
- Evaluate the operator/function (in other words, what function is it?)
- Evaluate the operands
- If an operand is another call expression, then evaluate that call expression. For example, if the expression is
(/ 3 (+ 4 5)), evaluate(+ 4 5)to get9, then do(/ 3 9)to get0.33
- If an operand is another call expression, then evaluate that call expression. For example, if the expression is
- Apply the function to the operands and achieve the result
Using these rules of evaluation on the previous examples gives us,
(+ 1 2)= 3(+ 1 2 3)= 6(/ 3 (+ 4 5))= 0.33
Notice that the expression (+ 1 2 3) evaluates to $(1+2)+3 = 6$.
Likewise, (/ 20 5 4) evaluates to $(20 / 5) / 4 = 1$.
Practice
Evaluate the following calculator expressions (it may be useful to go to the expression tree segment if you are struggling).
(+ 3 2)(* (- 8 4) 4)(/ (+ 15 5) (* 2 2))(+ 1 2 3 4 5)(+ (- 10 2 2) (* (+ 3 2) 4))
Answers:
5- Add 3 to 2
16- Multiply
(- 8 4)or 4 by 4
- Multiply
5- Divide
(+ 15 5)or 20 by(* 2 2)or 4
- Divide
15(((1 + 2) + 3) + 4) + 5= 15
26- Add
(- 10 2 2)or 6 to(* (+ 3 2) 4)or 20
- Add
Expression Tree
Expression Trees allow us to break down complicated calculator expressions. If we have a call expression, we represent that as a box with edges or lines leading to the operator/function and the operands. If an operand is another call expression, then we represent that as a box with its own edges.
For example, if the expression is (+ 2 2), we diagram that like so:
Additionally, if the expression is (* 3 (+ 4 5) (* 6 7 8)):
Required Question
Q1: Tokenize
To evaluate user input, our program needs to format that input in such a way that make it easier to evaluate. Build a tokenize() function that takes a string expression like the ones above and returns a list where each item in the list is a parenthesis, one of the four operators, or a number literal.
def tokenize(expression):
""" Takes a string and returns a list where each item
in the list is a parenthesis, one of the four operators (/, *, -, +),
or a number literal.
>>> tokenize("(+ 3 2)")
['(', '+', '3', '2', ')']
>>> tokenize("(- 9 3 3)")
['(', '-', '9', '3', '3', ')']
>>> tokenize("(+ 10 100)")
['(', '+', '10', '100', ')']
>>> tokenize("(+ 5.5 10.5)")
['(', '+', '5.5', '10.5', ')']
>>> expr = "(* (- 8 4) 4)"
>>> tokenize(expr)
['(', '*', '(', '-', '8', '4', ')', '4', ')']
>>> expr = "(* (- 6 8) (/ 18 3) (+ 10 1 2))"
>>> tokenize(expr)
['(', '*', '(', '-', '6', '8', ')', '(', '/', '18', '3', ')', '(', '+', '10', '1', '2', ')', ')']
"""
# Write your code here
Hint: The
.split()or.replace()methods may be helpful here.
Representing Expression Trees
We can represent expression trees using slightly different version of a linked list. Instead of using the Link class, we will be using the Pair class to construct a pair list of the operators and operands. Like Link, every Pair has a .first and .rest attribute; however, all Pair lists end with nil. For example, Pair(2, nil) or Pair(1, Pair(2, nil)). Representing expressions as pair lists will make it easier to evaluate the expressions later down the road.
If necessary, refer to
pair.pyfor the implementation forPairandnil
To demonstrate how to represent expressions as pair lists, look at the following examples that convert a calculator expression to a tokens list and then to a pair list:
(+ 1 1) -> ['(', '+', '1', '1', ')'] -> Pair('+', Pair(1, Pair(1, nil)))
# pair list of operands
(+ 1.5 1.5) -> ['(', '+', '1.5', '1.5', ')'] -> Pair('+', Pair(1.5, Pair(1.5, nil)))
Notice that at the start of an expression, we immediately start with a Pair followed by the operator and a Pair list of operands where each .first value is an operand. Additionally, each operand is converted to a float or int accordingly. In the context of converting the python list to a pair list, the open parenthesis ( signifies the start of a pair list and the closing parenthesis ) signifies the end of the pair list or nil.
In the case that there is an operand is another expression, we represent that with another pair list. For example:
(* (+ 1 1) 4) --> Pair('*', Pair(Pair('+', Pair(1, Pair(1, nil))), Pair(4, nil)))
^-----------------------------^
(+ 1 1)
For Q2, here's the algorithm for converting a python list of tokens to a Pair list. It uses a recursive function, called parse_tokens() that has two parameters: a list containg the tokens and an index value.
- If the token at the provided
indexis an open parenthesis'('- The next token should be an operator. Store the operator in a variable.
- If the
indexis not zero, then the current tokens is a start of a sub-expression, so...- Call
parse_tokens()by passing in the list of tokenstokenswith theindexincremented by2(this puts the token after the operator as the next one to be processed). The recursive call will return two values: a new pair list and a new updated index. Save the new pair list in a variable and updateindexto equal the updated index returned by the function. - Reassign the variable you stored the operator in step 1.1 to a
Pairobject with the operator as.firstand the new pair list (returned in step 1.2.1) as.rest
- Call
- If the
indexof the current token is0- increment the
indexby2
- increment the
- After those conditions are checked call
parse_tokens()with the token listtokensandindex. Save the new pair list in a variable and updateindexto new index returned by the function. This function call will either return the rest of the syntax tree if the input index was 0 ornilif it wasn't. - Return a
Pairobject with the variable holding the operator from 1.1 (which was possibly updated in 1.2.2) as the.firstand the pair list returned in step 1.4 as the.restalong with theindexreturned in 1.4 (in that order).
- If the token is a close parenthesis
')'- return a
nilobject and theindexincremented by 1.
- return a
- All other tokens should be operands for expression as integers or floating point numbers. Do this entire part in a try/except
block that raises a
TypeErrorif converting the current token to float or integer fails.- If the current token has a decimal point
., convert it to afloat, and store it in a variable. - If the current token does not have a decimal point, convert it to an
int, and store it in a variable. - Call
parse_tokens()with the token listtokensand the index incremented by1to process the next token. Save the new pair list and updateindexto the new index returned. - Return a
Pairobject with the variable created in 3.1 or 3.2 as.firstand the pair object returned in step 3.3 as.restalong with the index returned in step 3.3.
- If the current token has a decimal point
Q2: Parse Tokens (Optional)
You will be writing this function in Homework 6, so if you have time get started on it in the lab!
Start implementing the parse_tokens() function that takes a list of tokens and an index and converts the list of tokens to a Pair list. Refer to the steps above for instructions how to implement it.
def parse_tokens(tokens, index):
""" Takes a list of tokens and an index and converts the tokens to a Pair list
>>> parse_tokens(['(', '+', '1', '1', ')'], 0)
(Pair('+', Pair(1, Pair(1, nil))), 5)
>>> parse_tokens(['(', '*', '(', '-', '8', '4', ')', '4', ')'], 0)
(Pair('*', Pair(Pair('-', Pair(8, Pair(4, nil))), Pair(4, nil))), 9)
"""
# Write your code here
Submit
If you attend the lab, you don't have to submit anything.
If you don't attend the lab, you will have to submit working code. Submit the lab16.py file on Canvas to Gradescope in the window on the assignment page.