Path: blob/master/python/algorithms/tree.ipynb
1480 views
Tree Data Structure
Tree Traversal
Summary
There are two main ways to traverse through a tree. By traversing, we essentially mean we try to walk through the tree without repeating ourselves.
For depth first search (DFS), we start down a path and we won't stop until we get to the end. In other words, we traverse through one branch of the tree until we get to a leaf and only then will we work back up to the trunk of the tree.
In breadth first search (BFS), we would search through the nodes from one level to the next, and traverse through all the children of a node before moving on to visit the grandchildren nodes
Implementation
Resources:
From an implementation standpoint, the difference between the two is that the breadth first search uses a queue, while depth first search uses a stack.
Breadth First Search (BFS)
We will use a graph that has 7 nodes and 7 edges as our example. The edges are undirected and unweighted. Distance between two nodes will be measured based on the number of edges separating two vertices.
When using BFS to traverse through all the nodes in the graph, the algorithm follows the following steps:
Check the starting node and add its neighbours to the queue.
Mark the starting node as explored.
Get the first node from the queue / remove it from the queue.
Check if node has already been visited.
If not, go through the neighbours of the node.
Add the neighbour nodes to the queue.
Mark the node as explored.
Loop through steps 3 to 7 until the queue is empty.
For depth first search, the list of actions to perform upon each visit to a node:
Mark the current vertex as being visited.
Explore each adjacent vertex that is not included in the visited set.
Binary Seach Tree
Summary
In order for a binary tree to be searchable, all of the nodes to the left of root node must be smaller than the value of the root node. On the other hand, all the value to the right of the root node must be bigger than the root node's value. This property also applies to every subtree.
One of the main use case for binary search tree is to search for the presence of a number in O(log(n))
time. The rationale behind this is that if the value is below root, we can say for sure that the value is not in the right subtree then we only need to search in the left subtree and if the value is above root, we can say for sure that the value is not in the left subtree and we need to only search in the right subtree.
Heap
Summary
The heap is a binary tree with two additional rules
The shape: An entire level of nodes must all have children added to them before any of the child node can have grandchildren nodes
The ordering: the parent nodes must be either greater or equal to the value of its child nodes or less or equal to the value of its child node. Both formats are both acceptable.
We can classify heap into two categories based on the ordering, a minheap or a maxheap.
From an implementation standpoint, heaps are often times represented as arrays. We start off by representing the root node of the heap as the first element of the array. Then if a parent node’s index is represented by index i in an array, then it’s left child will always be located at 2i + 1. Similarly, if a parent node’s index is represented by index i in an array, then it’s right child will always be located at 2i + 2.