[TOC]
direct address
适用于数量小且没有重复的key的情况
都是O(1)时间
hash table
with direct address, key k store in slot k; while with hashing, we have a hash function h(k)
collision
collision: two keys hash to the same slot.
solution: chaining; open addressing
chaining
load factor = n/m, n is total elements, m is the number of slots.
Search element
worst-case: all elements in the same slots
average case: O(1)
choose a simple uniform hashing (各个元素均匀地分布在各个slots)
Open addressing
每个slot只有一个element,但我们在search的时候,可能要search多个slots
hash function h 不仅需要input k, 还需要input一个i
当删除一个key的时候,不能找到这个key之后直接标记为null,而应当标记为deleted,不然会影响后面的search
Double hashing效果不错:h(i, k) = (h1(k) + i * h2(k)) mod m
linear probing
如果slot已经被占据了,就看下一个slot
会使得average search time 增加
quadratic probing
类似linear probing,但不是看下一个slot,而是通过加上一个次方的数再mode,寻找下一个探测是否能够插入的slot
double hashing
计算h的时候使用另外两个hash function
hash functions
a good hash function should be simple uniform hashing.
division method
h(k) = k mod m
get the remainder as h(k) ---- 取一个质数,并且这个质数和keys的分布没有关联
a prime not too close to an exact power of 2 is often a good choice of m