Trees & Tree Mutation

Trees / Tree Mutation

In computer science, trees are recursive data structures that are widely used in various settings and can be implemented in many ways. The diagram below is an example of a tree.

Generally in computer science, you may see trees drawn "upside-down" like so. We say the root is the node where the tree begins to branch out at the top, and the leaves are the nodes where the tree ends at the bottom.

Some terminology regarding trees:

  • Parent Node: A node that has at least one branch.
  • Child Node: A node that has a parent. A child node can only have one parent.
  • Root: The top node of the tree. In our example, this is the 1 node.
  • Label: The value at a node. In our example, every node's label is an integer.
  • Leaf: A node that has no branches. In our example, the 4562 nodes are leaves.
  • Branch: A subtree of the root. Trees have branches, which are trees themselves: this is why trees are recursive data structures.
  • Depth: How far away a node is from the root. We define this as the number of edges between the root to the node. As there are no edges between the root and itself, the root has depth 0. In our example, the 3 node has depth 1 and the 4 node has depth 2.
  • Height: The depth of the lowest (furthest from the root) leaf. In our example, the 45, and 6 nodes are all the lowest leaves with depth 2. Thus, the entire tree has height 2.

In computer science, there are many different types of trees, used for different purposes. Some vary in the number of branches each node has; others vary in the structure of the tree.

A tree has a root value and a list of branches, where each branch is itself a tree.

  • The Tree constructor takes in a value label for the root, and an optional list of branches branches. If branches isn't given, the constructor uses the empty list [] as the default.
  • To get the label of a tree t, we access the instance attribute t.label.
  • Accessing the instance attribute t.branches will give us a list of branches.

With this in mind, we can create the tree from earlier using our constructor:

t = Tree(1,
      [Tree(3,
          [Tree(4),
           Tree(5),
           Tree(6)]),
      Tree(2)])

Implementing trees as a class gives us another advantage: we can specify how we want them to be output by the interpreter by implementing the __repr__ and __str__ methods.

Here is the __repr__ method:

def __repr__(self):
    if self.branches:
        branch_str = ', ' + repr(self.branches)
    else:
        branch_str = ''
    return 'Tree({0}{1})'.format(self.label, branch_str)

With this implementation of __repr__, a Tree instance is displayed as the exact constructor call that created it:

>>> t = Tree(4, [Tree(3), Tree(5, [Tree(6)]), Tree(7)])
>>> t
Tree(4, [Tree(3), Tree(5, [Tree(6)]), Tree(7)])
>>> t.branches
[Tree(3), Tree(5, [Tree(6)]), Tree(7)]
>>> t.branches[0]
Tree(3)
>>> t.branches[1]
Tree(5, [Tree(6)])

Here is the __str__ method. You do not need to understand how this function is implemented.

def __str__(self):
    def print_tree(t, indent=0):
        tree_str = '  ' * indent + str(t.label) + "\n"
        for b in t.branches:
            tree_str += print_tree(b, indent + 1)
        return tree_str
    return print_tree(self).rstrip()

With this implementation of __str__, we can pretty-print a Tree to see both its contents and structure:

>>> t = Tree(4, [Tree(3), Tree(5, [Tree(6)]), Tree(7)])
>>> print(t)
4
  3
  5
    6
  7
>>> print(t.branches[0])
3
>>> print(t.branches[1])
5
  6