This article brings you a detailed introduction (code example) about binary search trees in python. It has certain reference value. Friends in need can refer to it. I hope it will be helpful to you.
1. Binary search tree
**Binary search tree (BST binary search tree)** is a special binary tree. The value of any node is greater than the value of the left child and less than or equal to the value of the right child. BST (Binary) is traversed in order. Search Tree) can obtain a sorted set of elements, and the time consumption of insertion and deletion is also relatively reasonable, but one disadvantage is that the memory overhead is a bit large.
Properties of binary search trees
1. For any node x, the key in its left subtree is not greater than x.key, and the key in its right subtree is not less than x.key.
2, different binary search trees can represent the same set of values.
3. The basic operations of a binary search tree are proportional to the height of the tree, so if it is a complete binary tree, the worst running time is Θ(lgn), but if it is a linear tree connected by n nodes, then The worst running time is Θ(n).
4, the root node is the only node whose parent pointer points to the NIL node.
5. Each node includes at least four attributes: key, left, right and parent. When building a binary search tree, there must be a comparison algorithm for key.
2. Implementation of binary search tree
tree:
class TreeNode: def __init__(self,key,val,left=None,right=None,parent=None): self.key = key self.payload = val self.leftChild = left self.rightChild = right self.parent = parent class BinarySearchTree: def __init__(self): self.root = None self.size = 0
Now that we have the BinarySearchTree shell and TreeNode, it is time to write the put method, which will allow us to build a binary search tree. The put method is a method of BinarySearchTree class. This method will check if the tree already has a root. If there is no root, then put will create a new TreeNode and make it the root of the tree. If the root node is already in place, put calls the private recursive helper function _put to search the tree according to the following algorithm:
Starting from the root of the tree, the binary tree is searched, matching the new key with the current node keys to compare. If the new key is smaller than the current node, the left subtree is searched. If the new key is greater than the current node, the right subtree is searched.
When there are no left (or right) children to search for, we find the position in the tree where the new node should be established.
To add a node to the tree, create a new TreeNode object and insert the object into the node discovered in the previous step.
When inserting, it is always inserted into the tree as a leaf node
Given a sequence to construct a binary search tree in sequence, the binary search tree is unique; if all keys of this sequence are used Words are used to construct possible binary trees (arranged in any order), which are generally not unique.
def put(self,key,val): if self.root: self._put(key,val,self.root) else: self.root = TreeNode(key,val) self.size = self.size + 1 def _put(self,key,val,currentNode): if key < currentNode.key: if currentNode.hasLeftChild(): self._put(key,val,currentNode.leftChild) else: currentNode.leftChild = TreeNode(key,val,parent=currentNode) else: if currentNode.hasRightChild(): self._put(key,val,currentNode.rightChild) else: currentNode.rightChild = TreeNode(key,val,parent=currentNode)
The above shows the Python code to insert a new node in the tree. The _put function is written recursively following the above steps. Note that when a new child node is inserted into the tree, currentNode is passed to the new tree node as the parent node.
An important problem in our implementation of insertion is that duplicate keys cannot be handled correctly. When our tree is implemented, a duplicate key will create a new node with the same key value in the right subtree of the node with the original key. The result of this is that the node with the new key will never be found during the search. A better way to handle inserting a duplicate key is to replace the old value with the value associated with the new key.
Once the tree is constructed, the next task is to implement the retrieval of the value for a given key. The get method is easier than the put method because it just searches the tree recursively until it reaches a non-matching leaf node or finds a matching key. When a matching key is found, the value stored in the node's payload is returned.
When the binary search tree is not empty:
First compare the given value with the keyword of the root node. If they are equal, the search is successful
If it is less than the key value of the root node, recursively search the left subtree
If it is greater than the key value of the root node, recursively search the right subtree Subtree
If the subtree is empty, the search is unsuccessful
def get(self,key): if self.root: res = self._get(key,self.root) if res: return res.payload else: return None else: return None def _get(self,key,currentNode): if not currentNode: return None elif currentNode.key == key: return currentNode elif key < currentNode.key: return self._get(key,currentNode.leftChild) else: return self._get(key,currentNode.rightChild) def __getitem__(self,key): return self.get(key)
The first task is through Search the tree to find the node to delete. If the tree has multiple nodes, we search using the _get method to find the TreeNode that needs to be deleted. If the tree has only one node, this means we delete the root of the tree, but we still have to check to make sure the root's key matches the key we want to delete. In either case, the del operator will raise an error if the key is not found.
def delete(self,key): if self.size > 1: nodeToRemove = self._get(key,self.root) if nodeToRemove: self.remove(nodeToRemove) self.size = self.size-1 else: raise KeyError('Error, key not in tree') elif self.size == 1 and self.root.key == key: self.root = None self.size = self.size - 1 else: raise KeyError('Error, key not in tree') def __delitem__(self,key): self.delete(key)
Once we find the node of the key we want to delete, we must consider three situations:
The node to be deleted has no child nodes.
The node to be deleted has only one child node.
The node to be deleted has two child nodes.
第一种情况:
如果当前节点没有子节点,我们需要做的是删除节点并删除对父节点中该节点的引用。
第二种情况:
如果一个节点只有一个孩子,那么我们可以简单地促进孩子取代其父。此案例的代码展示在下一个列表中。当你看这个代码,你会看到有六种情况要考虑。由于这些情况相对于左孩子或右孩子对称,我们将仅讨论当前节点具有左孩子的情况。决策如下:
如果当前节点是左子节点,则我们只需要更新左子节点的父引用以指向当前节点的父节点,然后更新父节点的左子节点引用以指向当前节点的左子节点。
如果当前节点是右子节点,则我们只需要更新左子节点的父引用以指向当前节点的父节点,然后更新父节点的右子节点引用以指向当前节点的左子节点。
如果当前节点没有父级,则它是根。在这种情况下,我们将通过在根上调用replaceNodeData 方法来替换 key,payload,leftChild 和 rightChild 数据。
第三种情况:
最难处理的情况。 如果一个节点有两个孩子,那么我们不太可能简单地提升其中一个节点来占据节点的位置。 然而,我们可以在树中搜索可用于替换被调度删除的节点的节点。 我们需要的是一个节点,它将保留现有的左和右子树的二叉搜索树关系。 执行此操作的节点是树中具有次最大键的节点。 我们将这个节点称为后继节点,我们将看一种方法来很快找到后继节点。 继承节点保证没有多于一个孩子,所以我们知道使用已经实现的两种情况删除它。 一旦删除了后继,我们只需将它放在树中,代替要删除的节点。
找到后继的代码如下所示,是 TreeNode 类的一个方法。此代码利用二叉搜索树的相同属性,采用中序遍历从最小到最大打印树中的节点。在寻找接班人时,有三种情况需要考虑:
如果节点有右子节点,则后继节点是右子树中的最小的键。
如果节点没有右子节点并且是父节点的左子节点,则父节点是后继节点。
如果节点是其父节点的右子节点,并且它本身没有右子节点,则此节点的后继节点是其父节点的后继节点,不包括此节点。
def findSuccessor(self): succ = None if self.hasRightChild(): succ = self.rightChild.findMin() else: if self.parent: if self.isLeftChild(): succ = self.parent else: self.parent.rightChild = None succ = self.parent.findSuccessor() self.parent.rightChild = self return succ def findMin(self): current = self while current.hasLeftChild(): current = current.leftChild return current def spliceOut(self): if self.isLeaf(): if self.isLeftChild(): self.parent.leftChild = None else: self.parent.rightChild = None elif self.hasAnyChildren(): if self.hasLeftChild(): if self.isLeftChild(): self.parent.leftChild = self.leftChild else: self.parent.rightChild = self.leftChild self.leftChild.parent = self.parent else: if self.isLeftChild(): self.parent.leftChild = self.rightChild else: self.parent.rightChild = self.rightChild self.rightChild.parent = self.parent
二叉搜索树的深度越小,那么搜索所需要的运算时间越小。一个深度为log(n)的二叉搜索树,搜索算法的时间复杂度也是log(n)。然而,我们在二叉搜索树中已经实现的插入和删除操作并不能让保持log(n)的深度。如果我们按照8,7,6,5,4,3,2,1的顺序插入节点,那么就是一个深度为n的二叉树。那么,搜索算法的时间复杂度为n。
在上一节中我们讨论了建立一个二叉搜索树。我们知道,当树变得不平衡时get和put操作会使二叉搜索树的性能降低到O(n)。
如何减少树的深度呢?
一种想法是先填满一层,再去填充下一层,这样就是一个完全二叉树(complete binary tree)。这样的二叉树实现插入算法会比较复杂。另一种比较容易实现的树状数据结构——AVL树,其搜索算法复杂度为log(n)。
在这一节中我们将看到一种特殊的二叉搜索树,它可以自动进行调整,以确保树随时都保持平衡。这种树被称为AVL树,命名源于其发明者:G.M. Adelson-Velskii 和 E.M. Landis。
AVL Tree Implementation The Map abstract data type is just like a regular binary search tree, the only difference is how the tree is executed. In order to implement our AVL tree, we need to keep track of the balance factor for each node in the tree. We do this by looking at the heights of each node's left and right subtrees. More formally, we define the balancing factor of a node as the difference between the height of the left subtree and the height of the right subtree.
##balanceFacto#r##=##height(leftSubTree)##−height(rightSubTree) Using the definition of balance factor given above, we say that a subtree is left-heavy if the balance factor is greater than zero. If the balance factor is less than zero, the subtree is right-heavy. If the balance factor is zero, then the tree is perfectly balanced. In order to implement an AVL tree, and get the benefits of having a balanced tree, we will define tree balancing if the balancing factor is -1,0 or 1. Once the balance factor of a node in the tree is outside this range, we will need a procedure to bring the tree back into balance. The number of nodes (Nh) of a tree with height h is: Nh=1 Nh−1 Nh−2 This formula is very similar to the Fibonacci sequence. We can use this formula to deduce the height of an AVL tree from the number of nodes in the tree. As the numbers of the Fibonacci sequence get larger and larger, Fi / Fi−1 gets closer and closer to the golden ratio Φ At any time, the height of our AVL tree is equal to a constant that is the logarithm of the number of nodes in the tree (1.44) times. This is good news for searching our AVL tree because it limits the search to O(logN).
When a new node is added, it will destroy the balance of the binary tree, so it needs to be corrected by rotation.
Make the left node Becomes the root of the subtree.
Note: Since the new root is the left node of the old root, the left node of the old root after movement must be empty. At this time, you can directly add the left node to the moved old root.
Similarly, to perform left rotation we need to do the following:
Make the right node (B) the root of the subtree.
Move the old root node (A) to the left node of the new root.
If the new root (B) originally has a left node, then let the left node of the original B become the right node of the new root left node (A).
As shown in the figure, if the child node is 4 instead of 2, try single rotation and you will find that the problem cannot be solved.
To solve this problem, we must use the following rules:
If the subtree needs to be rotated left to make it balanced, first check the balance factor of the right node. If the right node is left-heavy, the right node is rotated right, and then the original node is rotated left.
If the subtree needs to be rotated right to balance it, first check the balance factor of the left node. If the left node is right-heavy, the left node is rotated left, and then the original node is rotated right.
The solution at this time is double rotation.
Double rotation is actually two single rotations: the subtree with root node 4 first performs a single rotation to the left, and then the subtree with root node 5 performs a rightward rotation of single rotation. This restores the ACL properties of the tree.
For AVL trees, it can be proved that when a new node is added, the properties of the AVL tree can always be restored through one rotation.
When we insert a new node, where does it rotate? Should we use single rotation or double rotation?
We follow the following basic steps:
Add nodes according to the binary search tree method, and the new node name is a leaf node.
Start from the new node and trace back to the first unbalanced node (5).
(If you trace back to the root node and there is no unbalanced node, it means that the tree already conforms to the AVL property.)
(The depth of the two subtrees cannot be equal, and the subtree with a larger depth contains new nodes.)
The red-black tree is a very interesting balanced search tree. Its statistical performance is better than the balanced binary tree (AVL-tree), so the red-black tree is used in many places. In C STL, many parts (currently including
set, multiset, map, multimap) apply variations of the red-black tree (the red-black tree in SGI STL has some changes, These modifications provide better performance, as well as support for set operations). The main rules of red-black trees:
The main rules of red-black trees are as follows:
The nodes inserted in the red-black tree are all red. This is not accidental, because inserting a red node is less likely to violate the red-black rule than inserting a black node. The reason is: inserting a black node will always change the black height (violating rule 4), but inserting a red node has only half a chance of violating rule 3. Additionally, a violation of Rule 3 is easier to correct than a violation of Rule 4.
When inserting a new node, this balance may be destroyed. The red-black tree mainly corrects the balance in three ways, changing the node color, left-hand rotation and right-hand rotation. (Specific rules omitted)
Because each red-black tree is also a specialized binary search tree, the read-only operations on the red-black tree are the same as the read-only operations on the ordinary binary search tree. However, inserting and deleting operations on a red-black tree will no longer comply with the properties of a red-black tree. Restoring the properties of a red-black tree requires a small (O(logn)) color change (actually very fast) and no more than three tree rotations (two for insertion operations). Although insertion and deletion are complex, the operation time can still be kept as O(logn) times.
The height of a red-black tree with n internal nodes is at most 2lg(n 1).
The above is the detailed content of Detailed introduction to binary search trees in python (code example). For more information, please follow other related articles on the PHP Chinese website!