Discussion 5: Trees, Data Abstraction, Sequences
Files: disc05.pdf
This is an online worksheet that you can work on during discussions. Your work is not graded and you do not need to submit anything.
Data Abstraction
Data abstraction is a powerful concept in computer science that allows programmers to treat code as objects. For example, using code to represent cars, chairs, people, and so on. That way, programmers don’t have to worry about how code is implemented; they just have to know what it does (You may have experienced this while reading code in the hog project.)
Data abstraction mimics how we think about the world. If you want to drive a car, you don’t need to know how the engine was built or what kind of material the tires are made of to do so. You just have to know how to use the car for driving itself, such as how to turn the wheel or press the gas pedal.
A data abstraction consists of two types of functions:
- Constructors: functions that build the abstract data type. In the example of a chair, this function is equivalent to actually building an individual chair.
- Selectors: functions that retrieve information from the data type. In the chair example this could be what it's made of, how big it is, how much it costs, etc.
Programmers design data abstractions to hide how information is stored and calculated such that the end user does not need to know how constructors and selectors are implemented. The nature of abstraction allows whoever uses them to assume that the functions have been written correctly and work as described. That way, the user does not need to know how every line of code works, just how to use the written functions.
Trees
One example of data abstraction is with trees.
In computer science, trees are recursive data structures (this means that one tree will contain one or more other trees within it.) They are widely used in various settings and can be implemented in many ways. The diagram below is an example of a tree.
Generally in computer science, you may see trees drawn “upside-down” like so. We say the root is the node where the tree begins to branch out at the top, and the leaves are the nodes where the tree ends at the bottom. Notice how each node can be considered a tree in and of itself.
Some terminology regarding trees:
- Parent Node: A node that has at least one branch.
- Child Node: A node that has a parent. A child node can only have one parent.
- Root: The top node of the tree. In our example, this is the
1
node. - Label: The value stored at a node. In our example, every node’s label is an integer.
- Leaf: A node that has no branches. In our example, the
4
,5
,6
,2
nodes are leaves. - Branch: A subtree of the root. Trees have branches, which are trees themselves: this is why trees are recursive data structures.
- Depth: How far away a node is from the root. We define this as
the number of edges between the root to the node. As there are no
edges between the root and itself, the root has depth 0. In our
example, the
3
node has depth 1 and the4
node has depth 2. - Height: The depth of the lowest (furthest from the root) leaf.
In our example, the
4
,5
, and6
nodes are all the lowest leaves with depth 2. Thus, the entire tree has height 2.
In computer science, there are many different types of trees, used for different purposes. Some vary in the number of branches each node has; others vary in the structure of the tree.
Tree Data Abstraction
A tree has a root value and a list of branches, where each branch is itself a tree.
The data abstraction specifies that calling branches
on a tree t
will give us a list of branches. Treating the return value of
branches(t)
as a list is then part of how we define trees.
How the entire tree t
is implemented is under the abstraction barrier.
Rather than assuming a pre-programmed implementation of label
and branches
, we
will want to use those selector functions directly to code our own implementation.
For example, we could choose to implement the tree data abstraction with
a dictionary with separate entries for the label
and the branches
or as a list with the first element being label
and the rest being
branches
.
- The
tree
constructor takes in a valuelabel
for the root, and an optional list of branchesbranches
. Ifbranches
isn’t given, the constructor uses the empty list[]
as the default, meaning the tree starts with no branches. - The
label
selector returns the value of the root, while thebranches
selector returns the list of branches of the tree. They are called bylabel(t)
andbranches(t)
respectively wheret
is the tree. Remember that each branch inbranches
is its own tree.
With this in mind, we can create the tree from earlier using our constructor:
t = tree(1,
[tree(3,
[tree(4),
tree(5),
tree(6)]),
tree(2)])
The code above may appear a bit intimidating at first, but remember
that trees are recursive. That means we need to create a new tree
with the tree
constructor for every tree in the list branches
.
Those trees will each have their own list of branches and so on.
If no list branches
is specified, then we have reached a leaf
node in the tree.
Try to go through the above code one line at a time and draw out a tree diagram similar to the one given above.
Questions
Due to the recursive nature of trees, it might be helpful to use recursion in your solutions for these first few questions
Q1: Tree Abstraction Barrier
Consider a tree t
constructed by calling
tree(1, [tree(2), tree(4)])
. For each of the following expressions,
answer these two questions (hint: draw out the tree!):
- What does the expression evaluate to?
- Does the expression violate any abstraction barriers (i.e. are we accessing
the data directly rather than using selector functions like
label
andbranches
?) If so, write an equivalent expression that does not violate abstraction barriers.
label(t)
t[0]
label(branches(t)[0])
is_leaf(t[1:][1])
(is_leaf
returns true if the list of branches for the given tree is empty)[label(b) for b in branches(t)]
- Challenge:
branches(tree(5, [t, tree(3)]))[0][0]
Q2: Height
Write a function that returns the height of a tree. Recall that the height of a tree is the length of the longest path from the root to a leaf (starting at 0).
def height(t):
"""Return the height of a tree.
>>> t = tree(3, [tree(5, [tree(1)]), tree(2)])
>>> height(t)
2
>>> t = tree(3, [tree(1), tree(2, [tree(5, [tree(6)]), tree(1)])])
>>> height(t)
3
"""
"*** YOUR CODE HERE ***"
Q3: Maximum Path Sum
Write a function that takes in a tree and returns the maximum sum of the values along any path in the tree. Recall that a path is from the tree’s root to any leaf picking only one branch at each step along the way.
def max_path_sum(t):
"""Return the maximum path sum of the tree.
>>> t = tree(1, [tree(5, [tree(1), tree(3)]), tree(10)])
>>> max_path_sum(t)
11
"""
"*** YOUR CODE HERE ***"
Q4: Find Path
Write a function that takes in a tree and a value x
and returns a list
containing the nodes along the path required to get from the root of the
tree to a node containing x
.
If x
is not present in the tree, return None
. Assume that the
entries of the tree are unique, so there will only be one path to
node x
.
For the following tree, find_path(t, 5)
should return [2, 7, 6, 5]
Hint A framework has been provided for you, but feel free to disregard it if you find another way to solve the problem.
def find_path(t, x):
"""
>>> t = tree(2, [tree(7, [tree(3), tree(6, [tree(5), tree(11)])] ), tree(15)])
>>> find_path(t, 5)
[2, 7, 6, 5]
>>> find_path(t, 10) # returns None
"""
if _____________________________:
return _____________________________
_____________________________:
path = ______________________
if _____________________________:
return _____________________________
Sequences
Sequences are ordered collections of values that support element-selection and have a length. We’ve worked with lists, but other Python types are also sequences, including strings.
Q5: Map, Filter, Reduce
Many languages provide map
, filter
, reduce
functions for
sequences. Python also provides these functions (and we’ll formally
introduce them later on in the course), but to help you better
understand how they work, you’ll be implementing these functions in the
following problems.
In Python, the
map
andfilter
built-ins have slightly different behavior than themy_map
andmy_filter
functions we are defining here.
my_map
takes in a one argument function fn
and a sequence seq
and
returns a list containing fn
applied to each element in seq
.
def my_map(fn, seq):
"""Applies fn onto each element in seq and returns a list.
>>> my_map(lambda x: x*x, [1, 2, 3])
[1, 4, 9]
"""
"*** YOUR CODE HERE ***"
my_filter
takes in a one-argument function cond
and a sequence seq
and returns a list containing all elements in seq
for which cond
returns True.
def my_filter(cond, seq):
"""Keeps elements in seq only if they satisfy cond.
>>> my_filter(lambda x: x % 2 == 0, [1, 2, 3, 4]) # new list has only even-valued elements
[2, 4]
"""
"*** YOUR CODE HERE ***"
my_reduce
takes in a two argument function combiner
and a non-empty
sequence seq
and combines the elements in seq
into one value using
combiner
. combiner
will be called len(seq)-1
times taking the
result as the first input of the next function call from left to right
until only a single value remains.
def my_reduce(combiner, seq):
"""Combines elements in seq using combiner.
seq will have at least one element.
>>> my_reduce(lambda x, y: x + y, [1, 2, 3, 4]) # 1 + 2 + 3 + 4
10
>>> my_reduce(lambda x, y: x * y, [1, 2, 3, 4]) # 1 * 2 * 3 * 4
24
>>> my_reduce(lambda x, y: x * y, [4])
4
>>> my_reduce(lambda x, y: x + 2 * y, [1, 2, 3]) # (1 + 2 * 2) + 2 * 3
11
"""
"*** YOUR CODE HERE ***"
Q6: Count palindromes
Write a function that counts the number of palindromes (any word that
reads the same forwards as it does when read backwards ignoring case)
in a list of words using only lambda
, string operations,
conditional expressions (if
,elif
, and else
), and the functions
we defined above (my_filter
, my_map
, my_reduce
).
Specifically, do not use recursion or any kind of loop.
def count_palindromes(L):
"""The number of palindromic words in the sequence of strings
L (ignoring case).
>>> count_palindromes(("Acme", "Madam", "Pivot", "Pip"))
2
"""
return ______
Hint: The easiest way to get the reversed version of a string s
is
to use the Python slicing notation trick s[::-1]
. Also, the function
lower
, when called on strings, converts all of the characters in the
string to lowercase. For instance, if the variable s
contains the
string “PyThoN”, the expression s.lower()
evaluates to “python”.
Additional Practice
Q7: Perfectly Balanced
Part A: Implement sum_tree
, which returns the sum of all the
labels in tree t
.
Part B: Implement balanced
, which returns whether every branch of
t
has the same total sum and that the branches themselves are also
balanced.
Challenge: Solve both of these parts with just 1 line of code each.
def sum_tree(t):
"""
Add all elements in a tree.
>>> t = tree(4, [tree(2, [tree(3)]), tree(6)])
>>> sum_tree(t)
15
"""
"*** YOUR CODE HERE ***"
def balanced(t):
"""
Checks if each branch has same sum of all elements and
if each branch is balanced.
>>> t = tree(1, [tree(3), tree(1, [tree(2)]), tree(1, [tree(1), tree(1)])])
>>> balanced(t)
True
>>> t = tree(1, [t, tree(1)])
>>> balanced(t)
False
>>> t = tree(1, [tree(4), tree(1, [tree(2), tree(1)]), tree(1, [tree(3)])])
>>> balanced(t)
False
"""
"*** YOUR CODE HERE ***"