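1. Definition
A binary search tree (BST) is a binary tree in which every node holds a Comparable key (and an associated value), and each node's key is greater than every key in its left subtree and smaller than every key in its right subtree.
Data structure definition of a BST: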
public class BST<Key extends Comparable<Key>, Value> {
private Node root; // root of BST
private class Node {
private Key key; // sorted by key
private Value val; // associated data
private Node left, right; // left and right subtrees
private int size; // number of nodes in subtree
public Node(Key key, Value val, int size) {
this.key = key;
this.val = val;
this.size = size;
}
}
public BST() {
}
}
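Properties of a BST:
If all the keys of a binary search tree are projected onto a line so that the keys in a node's left subtree appear to its left and the keys in its right subtree appear to its right, the result is always a sorted sequence of keys.
Traversing a BST:
Three traversal orders are defined by when the root is visited relative to its subtrees:
- Preorder: root -> left subtree -> right subtree
- Inorder: left subtree -> root -> right subtree
- Postorder: left subtree -> right subtree -> root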
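The following is a minimal sketch of the three traversal orders on a small tree. It also illustrates the projection property: the inorder traversal visits the keys in sorted order. The class `TraversalDemo`, its int-keyed `Node`, and the helper `build` are hypothetical names used only for this illustration; they are not part of the BST class in this article.

```java
import java.util.ArrayList;
import java.util.List;

public class TraversalDemo {
    // Minimal int-keyed node, for illustration only.
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Standard BST insertion; duplicate keys are ignored.
    static Node insert(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = insert(x.left, key);
        else if (key > x.key) x.right = insert(x.right, key);
        return x;
    }

    // Builds a tree by inserting the keys in the given order.
    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    static List<Integer> preorder(Node x) {
        List<Integer> out = new ArrayList<>();
        preorder(x, out);
        return out;
    }

    static List<Integer> inorder(Node x) {
        List<Integer> out = new ArrayList<>();
        inorder(x, out);
        return out;
    }

    static List<Integer> postorder(Node x) {
        List<Integer> out = new ArrayList<>();
        postorder(x, out);
        return out;
    }

    // root -> left subtree -> right subtree
    private static void preorder(Node x, List<Integer> out) {
        if (x == null) return;
        out.add(x.key);
        preorder(x.left, out);
        preorder(x.right, out);
    }

    // left subtree -> root -> right subtree
    private static void inorder(Node x, List<Integer> out) {
        if (x == null) return;
        inorder(x.left, out);
        out.add(x.key);
        inorder(x.right, out);
    }

    // left subtree -> right subtree -> root
    private static void postorder(Node x, List<Integer> out) {
        if (x == null) return;
        postorder(x.left, out);
        postorder(x.right, out);
        out.add(x.key);
    }

    public static void main(String[] args) {
        Node root = build(5, 3, 8, 1, 4, 9);
        System.out.println(preorder(root));  // [5, 3, 1, 4, 8, 9]
        System.out.println(inorder(root));   // [1, 3, 4, 5, 8, 9]
        System.out.println(postorder(root)); // [1, 4, 3, 9, 8, 5]
    }
}
```

Only the inorder result is sorted; the other two orders depend on the shape of the tree.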
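2. Implementation
2.1 Search
Searching for a key in a binary search tree proceeds as follows:
- If the tree is empty, the search is a miss;
- If the search key equals the key at the root, the search is a hit. Otherwise, continue recursively in the appropriate subtree:
① if the search key is smaller than the key at the current root, recurse into the left subtree;
② if the search key is larger than the key at the current root, recurse into the right subtree.
Search, source code: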
/**
* Returns the value associated with the given key.
*/
public Value get(Key key) {
return get(root, key);
}
/**
* Searches for key in the subtree rooted at x
*/
private Value get(Node x, Key key) {
if (x == null) return null;
int cmp = key.compareTo(x.key);
if (cmp < 0) return get(x.left, key);
else if (cmp > 0) return get(x.right, key);
else return x.val;
}
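Because the recursive search only ever descends into one subtree, it can also be written as a loop. The sketch below is one possible non-recursive variant, not the article's implementation; `IterativeGet` and its standalone generic `Node` are hypothetical names introduced so the example compiles on its own.

```java
public class IterativeGet {
    // Standalone node type for this sketch (the article's Node is an inner class of BST).
    static final class Node<K, V> {
        final K key;
        V val;
        Node<K, V> left, right;
        Node(K key, V val) { this.key = key; this.val = val; }
    }

    // Walks down from x, going left or right by comparison,
    // until the key is found (hit) or a null link is reached (miss).
    static <K extends Comparable<K>, V> V get(Node<K, V> x, K key) {
        while (x != null) {
            int cmp = key.compareTo(x.key);
            if (cmp < 0) x = x.left;
            else if (cmp > 0) x = x.right;
            else return x.val;
        }
        return null;
    }

    public static void main(String[] args) {
        Node<String, Integer> root = new Node<>("M", 1);
        root.left = new Node<>("C", 2);
        root.right = new Node<>("S", 3);
        System.out.println(get(root, "S")); // 3
        System.out.println(get(root, "X")); // null
    }
}
```

The loop performs the same comparisons as the recursive version but uses no call stack, which can matter for very deep (unbalanced) trees.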
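2.2 Insertion
Inserting a node follows almost the same steps as searching. When a search for a key that is not in the tree ends at a null link (which marks the position where the key belongs), all we need to do is replace that link with a new node containing the key.
The steps are:
- If the tree is empty, return a new node containing the key-value pair;
- If the key equals the key at the root, the search is a hit and the node's value is updated. Otherwise, continue recursively in the appropriate subtree:
① if the key is smaller than the key at the current root, recurse into the left subtree, then update the node count;
② if the key is larger than the key at the current root, recurse into the right subtree, then update the node count.
Insertion, source code: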
public void put(Key key, Value val) {
if (key == null)
throw new IllegalArgumentException("calls put() with a null key");
root = put(root, key, val);
}
/**
* Inserts a node into the binary search tree rooted at x and updates the node counts
* @return the root of the new tree
*/
private Node put(Node x, Key key, Value val) {
if (x == null)
return new Node(key, val, 1);
int cmp = key.compareTo(x.key);
if (cmp < 0)
x.left = put(x.left, key, val);
else if (cmp > 0)
x.right = put(x.right, key, val);
else
x.val = val;
x.size = 1 + size(x.left) + size(x.right);
return x;
}
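To make the two outcomes of an insertion concrete (a new node versus an updated value), here is a small self-contained sketch. It mirrors the recursive put logic above on a simplified node with String keys and int values; `PutDemo` and its members are hypothetical names for this illustration only.

```java
public class PutDemo {
    static final class Node {
        final String key;
        int val;
        Node left, right;
        int size = 1; // number of nodes in the subtree rooted here
        Node(String key, int val) { this.key = key; this.val = val; }
    }

    static int size(Node x) { return x == null ? 0 : x.size; }

    // Same shape as the recursive put: insert or overwrite, then fix the count.
    static Node put(Node x, String key, int val) {
        if (x == null) return new Node(key, val);
        int cmp = key.compareTo(x.key);
        if (cmp < 0) x.left = put(x.left, key, val);
        else if (cmp > 0) x.right = put(x.right, key, val);
        else x.val = val; // key already present: overwrite the value
        x.size = 1 + size(x.left) + size(x.right);
        return x;
    }

    public static void main(String[] args) {
        Node root = null;
        String[] keys = {"S", "E", "X", "A", "R"};
        for (int i = 0; i < keys.length; i++) root = put(root, keys[i], i);
        System.out.println(size(root));      // 5
        System.out.println(size(root.left)); // 3 (the subtree E, A, R)

        root = put(root, "E", 99);           // existing key: no new node
        System.out.println(size(root));      // 5
        System.out.println(root.left.val);   // 99
    }
}
```

Recomputing the size on the way back up the recursion keeps every count on the search path correct, whether or not a node was actually added.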
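2.3 Deleting the Minimum Node
The minimum node of a binary search tree is either in the left subtree or is the root itself.
So if the left subtree is not empty, keep searching for and deleting the minimum in the left subtree; if the left subtree is empty, the current root is the minimum, and it is removed by returning its right child so that the link that pointed to the root now points to the right child.
Delete minimum, source code: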
/**
* Deletes the minimum node of the binary search tree.
*/
public void deleteMin() {
if (isEmpty())
throw new NoSuchElementException("Symbol table underflow");
root = deleteMin(root);
}
/**
* Deletes the minimum node of the tree rooted at x
* @return the root of the new tree
*/
private Node deleteMin(Node x) {
if (x.left == null)
return x.right;
x.left = deleteMin(x.left);
x.size = size(x.left) + size(x.right) + 1;
return x;
}
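The sketch below traces deleteMin on a small tree, including the case where the minimum node has a right child that takes its place. It is an illustration under simplifying assumptions (int keys, no values); `DeleteMinDemo`, `put`, and `keys` are hypothetical helpers and not the article's API.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteMinDemo {
    static final class Node {
        final int key;
        Node left, right;
        int size = 1;
        Node(int key) { this.key = key; }
    }

    static int size(Node x) { return x == null ? 0 : x.size; }

    static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        x.size = 1 + size(x.left) + size(x.right);
        return x;
    }

    // Follows left links; the node without a left child is the minimum
    // and is replaced by its right child (which may be null).
    static Node deleteMin(Node x) {
        if (x.left == null) return x.right;
        x.left = deleteMin(x.left);
        x.size = 1 + size(x.left) + size(x.right);
        return x;
    }

    // Keys in sorted (inorder) order, used to inspect the tree.
    static List<Integer> keys(Node x) {
        List<Integer> out = new ArrayList<>();
        collect(x, out);
        return out;
    }

    private static void collect(Node x, List<Integer> out) {
        if (x == null) return;
        collect(x.left, out);
        out.add(x.key);
        collect(x.right, out);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {5, 3, 8, 1, 4}) root = put(root, k);
        System.out.println(keys(root)); // [1, 3, 4, 5, 8]

        root = deleteMin(root);         // removes 1, a leaf
        System.out.println(keys(root)); // [3, 4, 5, 8]

        root = deleteMin(root);         // removes 3; its right child 4 moves up
        System.out.println(keys(root)); // [4, 5, 8]
    }
}
```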
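2.4 Deleting an Arbitrary Node
When the node to delete has only a left subtree or only a right subtree, it can be handled much like deleting the minimum. When it has both subtrees, however, the node must be replaced by its successor.
The successor is the minimum node in the right subtree of the node being deleted.
The steps are:
- Save the link to the node being deleted in t;
- Point x at the minimum node of t's right subtree, min(t.right);
- Set x's right link to deleteMin(t.right);
- Set x's left link (previously null) to t.left;
- Return the root of the new tree (x).
Delete, source code: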
/**
* Deletes the node with the given key
*/
public void delete(Key key) {
if (key == null)
throw new IllegalArgumentException("calls delete() with a null key");
root = delete(root, key);
}
/**
* Deletes the node with the given key from the tree rooted at x
* @return the root of the new tree
*/
private Node delete(Node x, Key key) {
if (x == null)
return null;
int cmp = key.compareTo(x.key);
if (cmp < 0)
x.left = delete(x.left, key);
else if (cmp > 0)
x.right = delete(x.right, key);
else {
if (x.right == null)
return x.left;
if (x.left == null)
return x.right;
Node t = x;
x = min(t.right);
x.right = deleteMin(t.right);
x.left = t.left;
}
x.size = size(x.left) + size(x.right) + 1;
return x;
}
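As a worked example of the successor replacement described above, the following sketch deletes a root that has two children and then a leaf. It uses int keys and omits values for brevity; `DeleteDemo` and its helpers are hypothetical names, and the code is a simplified rendering of the same steps rather than the article's class.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteDemo {
    static final class Node {
        final int key;
        Node left, right;
        int size = 1;
        Node(int key) { this.key = key; }
    }

    static int size(Node x) { return x == null ? 0 : x.size; }

    static Node put(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = put(x.left, key);
        else if (key > x.key) x.right = put(x.right, key);
        x.size = 1 + size(x.left) + size(x.right);
        return x;
    }

    static Node min(Node x) { return x.left == null ? x : min(x.left); }

    static Node deleteMin(Node x) {
        if (x.left == null) return x.right;
        x.left = deleteMin(x.left);
        x.size = 1 + size(x.left) + size(x.right);
        return x;
    }

    static Node delete(Node x, int key) {
        if (x == null) return null;             // key not in the tree
        if (key < x.key) x.left = delete(x.left, key);
        else if (key > x.key) x.right = delete(x.right, key);
        else {
            if (x.right == null) return x.left; // zero or one child
            if (x.left == null) return x.right;
            Node t = x;                         // two children:
            x = min(t.right);                   // successor takes t's place
            x.right = deleteMin(t.right);       // right subtree minus the successor
            x.left = t.left;
        }
        x.size = 1 + size(x.left) + size(x.right);
        return x;
    }

    static List<Integer> keys(Node x) {
        List<Integer> out = new ArrayList<>();
        collect(x, out);
        return out;
    }

    private static void collect(Node x, List<Integer> out) {
        if (x == null) return;
        collect(x.left, out);
        out.add(x.key);
        collect(x.right, out);
    }

    public static void main(String[] args) {
        Node root = null;
        for (int k : new int[] {5, 3, 8, 7, 9, 6}) root = put(root, k);

        root = delete(root, 5);         // root has two children; successor is 6
        System.out.println(root.key);   // 6
        System.out.println(keys(root)); // [3, 6, 7, 8, 9]

        root = delete(root, 3);         // leaf
        System.out.println(keys(root)); // [6, 7, 8, 9]
    }
}
```

Note that the order of the two assignments matters: x.right must be set from deleteMin(t.right) before x.left is set, because deleteMin relies on the successor still having a null left link.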
2.5 Complete Implementation
[Figure: trace of the test client]
Complete source code:
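import java.util.NoSuchElementException;
// Queue, StdIn and StdOut are assumed to come from the algs4 library (edu.princeton.cs.algs4).
public class BST<Key extends Comparable<Key>, Value> {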
private Node root; // root of BST
private class Node {
private Key key; // sorted by key
private Value val; // associated data
private Node left, right; // left and right subtrees
private int size; // number of nodes in subtree
public Node(Key key, Value val, int size) {
this.key = key;
this.val = val;
this.size = size;
}
}
/**
* Initializes an empty symbol table.
*/
public BST() {
}
/**
* Returns true if this symbol table is empty.
*
* @return {@code true} if this symbol table is empty; {@code false}
* otherwise
*/
public boolean isEmpty() {
return size() == 0;
}
/**
* Returns the number of key-value pairs in this symbol table.
*
* @return the number of key-value pairs in this symbol table
*/
public int size() {
return size(root);
}
// return number of key-value pairs in BST rooted at x
private int size(Node x) {
if (x == null)
return 0;
else
return x.size;
}
/**
* Does this symbol table contain the given key?
*
* @param key
* the key
* @return {@code true} if this symbol table contains {@code key} and
* {@code false} otherwise
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public boolean contains(Key key) {
if (key == null)
throw new IllegalArgumentException("argument to contains() is null");
return get(key) != null;
}
/**
* Returns the value associated with the given key.
*
* @param key
* the key
* @return the value associated with the given key if the key is in the
* symbol table and {@code null} if the key is not in the symbol
* table
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public Value get(Key key) {
if (key == null)
throw new IllegalArgumentException("calls get() with a null key");
return get(root, key);
}
private Value get(Node x, Key key) {
if (x == null)
return null;
int cmp = key.compareTo(x.key);
if (cmp < 0)
return get(x.left, key);
else if (cmp > 0)
return get(x.right, key);
else
return x.val;
}
/**
* Inserts the specified key-value pair into the symbol table, overwriting
* the old value with the new value if the symbol table already contains the
* specified key. Deletes the specified key (and its associated value) from
* this symbol table if the specified value is {@code null}.
*
* @param key
* the key
* @param val
* the value
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public void put(Key key, Value val) {
if (key == null)
throw new IllegalArgumentException("calls put() with a null key");
if (val == null) {
delete(key);
return;
}
root = put(root, key, val);
assert check();
}
private Node put(Node x, Key key, Value val) {
if (x == null)
return new Node(key, val, 1);
int cmp = key.compareTo(x.key);
if (cmp < 0)
x.left = put(x.left, key, val);
else if (cmp > 0)
x.right = put(x.right, key, val);
else
x.val = val;
x.size = 1 + size(x.left) + size(x.right);
return x;
}
/**
* Removes the smallest key and associated value from the symbol table.
*
* @throws NoSuchElementException
* if the symbol table is empty
*/
public void deleteMin() {
if (isEmpty())
throw new NoSuchElementException("Symbol table underflow");
root = deleteMin(root);
assert check();
}
private Node deleteMin(Node x) {
if (x.left == null)
return x.right;
x.left = deleteMin(x.left);
x.size = size(x.left) + size(x.right) + 1;
return x;
}
/**
* Removes the largest key and associated value from the symbol table.
*
* @throws NoSuchElementException
* if the symbol table is empty
*/
public void deleteMax() {
if (isEmpty())
throw new NoSuchElementException("Symbol table underflow");
root = deleteMax(root);
assert check();
}
private Node deleteMax(Node x) {
if (x.right == null)
return x.left;
x.right = deleteMax(x.right);
x.size = size(x.left) + size(x.right) + 1;
return x;
}
/**
* Removes the specified key and its associated value from this symbol table
* (if the key is in this symbol table).
*
* @param key
* the key
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public void delete(Key key) {
if (key == null)
throw new IllegalArgumentException("calls delete() with a null key");
root = delete(root, key);
assert check();
}
private Node delete(Node x, Key key) {
if (x == null)
return null;
int cmp = key.compareTo(x.key);
if (cmp < 0)
x.left = delete(x.left, key);
else if (cmp > 0)
x.right = delete(x.right, key);
else {
if (x.right == null)
return x.left;
if (x.left == null)
return x.right;
Node t = x;
x = min(t.right);
x.right = deleteMin(t.right);
x.left = t.left;
}
x.size = size(x.left) + size(x.right) + 1;
return x;
}
/**
* Returns the smallest key in the symbol table.
*
* @return the smallest key in the symbol table
* @throws NoSuchElementException
* if the symbol table is empty
*/
public Key min() {
if (isEmpty())
throw new NoSuchElementException("calls min() with empty symbol table");
return min(root).key;
}
private Node min(Node x) {
if (x.left == null)
return x;
else
return min(x.left);
}
/**
* Returns the largest key in the symbol table.
*
* @return the largest key in the symbol table
* @throws NoSuchElementException
* if the symbol table is empty
*/
public Key max() {
if (isEmpty())
throw new NoSuchElementException("calls max() with empty symbol table");
return max(root).key;
}
private Node max(Node x) {
if (x.right == null)
return x;
else
return max(x.right);
}
/**
* Returns the largest key in the symbol table less than or equal to
* {@code key}.
*
* @param key
* the key
* @return the largest key in the symbol table less than or equal to
* {@code key}
* @throws NoSuchElementException
* if there is no such key
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public Key floor(Key key) {
if (key == null)
throw new IllegalArgumentException("argument to floor() is null");
if (isEmpty())
throw new NoSuchElementException("calls floor() with empty symbol table");
Node x = floor(root, key);
if (x == null)
return null;
else
return x.key;
}
private Node floor(Node x, Key key) {
if (x == null)
return null;
int cmp = key.compareTo(x.key);
if (cmp == 0)
return x;
if (cmp < 0)
return floor(x.left, key);
Node t = floor(x.right, key);
if (t != null)
return t;
else
return x;
}
public Key floor2(Key key) {
return floor2(root, key, null);
}
private Key floor2(Node x, Key key, Key best) {
if (x == null)
return best;
int cmp = key.compareTo(x.key);
if (cmp < 0)
return floor2(x.left, key, best);
else if (cmp > 0)
return floor2(x.right, key, x.key);
else
return x.key;
}
/**
* Returns the smallest key in the symbol table greater than or equal to
* {@code key}.
*
* @param key
* the key
* @return the smallest key in the symbol table greater than or equal to
* {@code key}
* @throws NoSuchElementException
* if there is no such key
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public Key ceiling(Key key) {
if (key == null)
throw new IllegalArgumentException("argument to ceiling() is null");
if (isEmpty())
throw new NoSuchElementException("calls ceiling() with empty symbol table");
Node x = ceiling(root, key);
if (x == null)
return null;
else
return x.key;
}
private Node ceiling(Node x, Key key) {
if (x == null)
return null;
int cmp = key.compareTo(x.key);
if (cmp == 0)
return x;
if (cmp < 0) {
Node t = ceiling(x.left, key);
if (t != null)
return t;
else
return x;
}
return ceiling(x.right, key);
}
/**
* Return the key in the symbol table whose rank is {@code k}. This is the
* (k+1)st smallest key in the symbol table.
*
* @param k
* the order statistic
* @return the key in the symbol table of rank {@code k}
* @throws IllegalArgumentException
* unless {@code k} is between 0 and <em>n</em>-1
*/
public Key select(int k) {
if (k < 0 || k >= size()) {
throw new IllegalArgumentException("argument to select() is invalid: " + k);
}
Node x = select(root, k);
return x.key;
}
// Return key of rank k.
private Node select(Node x, int k) {
if (x == null)
return null;
int t = size(x.left);
if (t > k)
return select(x.left, k);
else if (t < k)
return select(x.right, k - t - 1);
else
return x;
}
/**
* Return the number of keys in the symbol table strictly less than
* {@code key}.
*
* @param key
* the key
* @return the number of keys in the symbol table strictly less than
* {@code key}
* @throws IllegalArgumentException
* if {@code key} is {@code null}
*/
public int rank(Key key) {
if (key == null)
throw new IllegalArgumentException("argument to rank() is null");
return rank(key, root);
}
// Number of keys in the subtree less than key.
private int rank(Key key, Node x) {
if (x == null)
return 0;
int cmp = key.compareTo(x.key);
if (cmp < 0)
return rank(key, x.left);
else if (cmp > 0)
return 1 + size(x.left) + rank(key, x.right);
else
return size(x.left);
}
/**
* Returns all keys in the symbol table as an {@code Iterable}. To iterate
* over all of the keys in the symbol table named {@code st}, use the
* foreach notation: {@code for (Key key : st.keys())}.
*
* @return all keys in the symbol table
*/
public Iterable<Key> keys() {
if (isEmpty())
return new Queue<Key>();
return keys(min(), max());
}
/**
* Returns all keys in the symbol table in the given range, as an
* {@code Iterable}.
*
* @param lo
* minimum endpoint
* @param hi
* maximum endpoint
* @return all keys in the symbol table between {@code lo} (inclusive) and
* {@code hi} (inclusive)
* @throws IllegalArgumentException
* if either {@code lo} or {@code hi} is {@code null}
*/
public Iterable<Key> keys(Key lo, Key hi) {
if (lo == null)
throw new IllegalArgumentException("first argument to keys() is null");
if (hi == null)
throw new IllegalArgumentException("second argument to keys() is null");
Queue<Key> queue = new Queue<Key>();
keys(root, queue, lo, hi);
return queue;
}
private void keys(Node x, Queue<Key> queue, Key lo, Key hi) {
if (x == null)
return;
int cmplo = lo.compareTo(x.key);
int cmphi = hi.compareTo(x.key);
if (cmplo < 0)
keys(x.left, queue, lo, hi);
if (cmplo <= 0 && cmphi >= 0)
queue.enqueue(x.key);
if (cmphi > 0)
keys(x.right, queue, lo, hi);
}
/**
* Returns the number of keys in the symbol table in the given range.
*
* @param lo
* minimum endpoint
* @param hi
* maximum endpoint
* @return the number of keys in the symbol table between {@code lo}
* (inclusive) and {@code hi} (inclusive)
* @throws IllegalArgumentException
* if either {@code lo} or {@code hi} is {@code null}
*/
public int size(Key lo, Key hi) {
if (lo == null)
throw new IllegalArgumentException("first argument to size() is null");
if (hi == null)
throw new IllegalArgumentException("second argument to size() is null");
if (lo.compareTo(hi) > 0)
return 0;
if (contains(hi))
return rank(hi) - rank(lo) + 1;
else
return rank(hi) - rank(lo);
}
/**
* Returns the height of the BST (for debugging).
*
* @return the height of the BST (a 1-node tree has height 0)
*/
public int height() {
return height(root);
}
private int height(Node x) {
if (x == null)
return -1;
return 1 + Math.max(height(x.left), height(x.right));
}
/**
* Returns the keys in the BST in level order (for debugging).
*
* @return the keys in the BST in level order traversal
*/
public Iterable<Key> levelOrder() {
Queue<Key> keys = new Queue<Key>();
Queue<Node> queue = new Queue<Node>();
queue.enqueue(root);
while (!queue.isEmpty()) {
Node x = queue.dequeue();
if (x == null)
continue;
keys.enqueue(x.key);
queue.enqueue(x.left);
queue.enqueue(x.right);
}
return keys;
}
/*************************************************************************
* Check integrity of BST data structure.
***************************************************************************/
private boolean check() {
if (!isBST())
StdOut.println("Not in symmetric order");
if (!isSizeConsistent())
StdOut.println("Subtree counts not consistent");
if (!isRankConsistent())
StdOut.println("Ranks not consistent");
return isBST() && isSizeConsistent() && isRankConsistent();
}
// does this binary tree satisfy symmetric order?
// Note: this test also ensures that data structure is a binary tree since
// order is strict
private boolean isBST() {
return isBST(root, null, null);
}
// is the tree rooted at x a BST with all keys strictly between min and max
// (if min or max is null, treat as empty constraint)
// Credit: Bob Dondero's elegant solution
private boolean isBST(Node x, Key min, Key max) {
if (x == null)
return true;
if (min != null && x.key.compareTo(min) <= 0)
return false;
if (max != null && x.key.compareTo(max) >= 0)
return false;
return isBST(x.left, min, x.key) && isBST(x.right, x.key, max);
}
// are the size fields correct?
private boolean isSizeConsistent() {
return isSizeConsistent(root);
}
private boolean isSizeConsistent(Node x) {
if (x == null)
return true;
if (x.size != size(x.left) + size(x.right) + 1)
return false;
return isSizeConsistent(x.left) && isSizeConsistent(x.right);
}
// check that ranks are consistent
private boolean isRankConsistent() {
for (int i = 0; i < size(); i++)
if (i != rank(select(i)))
return false;
for (Key key : keys())
if (key.compareTo(select(rank(key))) != 0)
return false;
return true;
}
/**
* Unit tests the {@code BST} data type.
*
* @param args
* the command-line arguments
*/
public static void main(String[] args) {
BST<String, Integer> st = new BST<String, Integer>();
for (int i = 0; !StdIn.isEmpty(); i++) {
String key = StdIn.readString();
st.put(key, i);
}
for (String s : st.levelOrder())
StdOut.println(s + " " + st.get(s));
StdOut.println();
for (String s : st.keys())
StdOut.println(s + " " + st.get(s));
}
}
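3. Performance Analysis
The performance of a binary search tree depends on the shape of the tree, which in turn depends on the order in which the keys are inserted.
In other words, for a binary search tree to perform well, the insertion order of its keys must be random enough to avoid long paths.
- Best case
The tree with N nodes is perfectly balanced (a complete binary tree), and every null link is at distance ~lgN from the root.
Insertion/search time complexity: O(lgN)
- Worst case
The tree with N nodes has all N nodes on a single search path, as in the figure, so it degenerates into a purely linear structure. This happens, for example, when the keys are inserted in sorted order.
Insertion/search time complexity: O(N)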
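The effect of insertion order on tree shape can be checked with a small experiment. The sketch below inserts the same 1023 keys in two orders and compares the resulting heights (using the convention that a one-node tree has height 0). `HeightDemo`, `buildSorted`, and `insertMiddleFirst` are hypothetical helpers written for this illustration.

```java
public class HeightDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node x, int key) {
        if (x == null) return new Node(key);
        if (key < x.key) x.left = insert(x.left, key);
        else if (key > x.key) x.right = insert(x.right, key);
        return x;
    }

    // Height of the tree rooted at x; an empty tree has height -1.
    static int height(Node x) {
        return x == null ? -1 : 1 + Math.max(height(x.left), height(x.right));
    }

    // Worst case: keys 1..n inserted in increasing order form a right-leaning chain.
    static Node buildSorted(int n) {
        Node root = null;
        for (int k = 1; k <= n; k++) root = insert(root, k);
        return root;
    }

    // Best case: always insert the middle key of a range before the keys on either side.
    static Node insertMiddleFirst(Node root, int lo, int hi) {
        if (lo > hi) return root;
        int mid = lo + (hi - lo) / 2;
        root = insert(root, mid);
        root = insertMiddleFirst(root, lo, mid - 1);
        return insertMiddleFirst(root, mid + 1, hi);
    }

    public static void main(String[] args) {
        int n = 1023; // 2^10 - 1 keys
        System.out.println(height(buildSorted(n)));                // 1022
        System.out.println(height(insertMiddleFirst(null, 1, n))); // 9
    }
}
```

With sorted input every search path grows by one node per key, which is the O(N) case; with the middle-first order the tree is perfectly balanced and the height is about lgN.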