[Data Structures] Hash Table Storage: HashMap Source Code Analysis

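What Is a Hash Table

In linear lists and trees, the position of a record within the structure is essentially arbitrary and bears no fixed relationship to the record's key, so finding a record requires a series of comparisons against keys. In a linked list each comparison has two outcomes, = and !=; in a binary search tree it has three, =, <, and >. Lookup efficiency therefore depends on the number of comparisons performed. Ideally we would like to retrieve the desired record in a single access, without any comparisons. That requires a fixed correspondence f (a mapping) between a record's storage location and its key, so that each key corresponds to one unique location in the structure; to find the record with key K we only need to compute its image f(K). This correspondence f is called a hash function, and a table built on this idea is called a hash table.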
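As a minimal sketch of this idea, the snippet below maps a key straight to an array slot through a function f. The class name DirectSlotDemo and the function f are hypothetical, and the sketch deliberately ignores collisions (two keys landing in the same slot), which a real hash table has to handle.

```java
class DirectSlotDemo {
    // Hypothetical hash function f: maps a key to a slot index in [0, tableSize).
    static int f(String key, int tableSize) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        String[] table = new String[16];
        String key = "apple";

        // Store: one computation of f(key), no key comparisons.
        table[f(key, table.length)] = "red";

        // Look up: the same computation leads to the same slot.
        System.out.println(table[f(key, table.length)]);
    }
}
```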

HashMap Definition

public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable

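A hash-table-based implementation of the Map interface. It provides all of the optional map operations and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) The class makes no guarantee about the order of its entries, and it offers constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.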
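The difference from Hashtable regarding nulls can be checked with a short example. The class and method names here are made up for illustration; only the java.util classes are real.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Returns true if Hashtable refuses a null key by throwing NullPointerException.
    static boolean hashtableRejectsNullKey() {
        Map<String, String> table = new Hashtable<>();
        try {
            table.put(null, "x");
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Stores a value under the null key in a HashMap and reads it back.
    static String hashMapNullKeyRoundTrip(String value) {
        Map<String, String> map = new HashMap<>();
        map.put(null, value);
        return map.get(null);
    }

    public static void main(String[] args) {
        System.out.println(hashMapNullKeyRoundTrip("value for null key"));
        System.out.println(hashtableRejectsNullKey());
    }
}
```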

HashMap Fields

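    // Default initial capacity: 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    // Maximum capacity; must be a power of two
    static final int MAXIMUM_CAPACITY = 1 << 30;
    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // Bucket size at which a linked list is converted to a red-black tree
    static final int TREEIFY_THRESHOLD = 8;
    // Bucket size at which a red-black tree is converted back to a linked list (during a resize)
    static final int UNTREEIFY_THRESHOLD = 6;
    // Minimum table capacity for treeification; below it, the table is resized instead
    static final int MIN_TREEIFY_CAPACITY = 64;
    // The bucket array; its length is always a power of two
    transient Node<K,V>[] table;
    // Number of key-value mappings
    transient int size;
    // Number of structural modifications
    transient int modCount;
    // Resize threshold (capacity * load factor): the table is resized when size > threshold
    int threshold;
    // Load factor
    final float loadFactor;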
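To make the relationship between capacity, load factor, and threshold concrete, here is a small arithmetic sketch. The helper threshold() is hypothetical and simply computes capacity * loadFactor; it ignores the clamping the real class applies near MAXIMUM_CAPACITY.

```java
class ThresholdDemo {
    // Hypothetical helper: the resize threshold for a given capacity and load factor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the default load factor, a table of 16 buckets is resized
        // once it holds more than 12 entries, a table of 32 after 24, and so on.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println(capacity + " -> " + threshold(capacity, 0.75f));
        }
    }
}
```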

HashMap Data Structures

    static class Node<K,V> implements Map.Entry<K,V> {
        // Spread hash of the key; the bucket index is derived from it as (n - 1) & hash
        final int hash;
        final K key;
        V value;
        // Next node in the same bucket
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        // ... remaining methods omitted
    }

    // Red-black tree node
    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // parent node
        TreeNode<K,V> left;    // left child
        TreeNode<K,V> right;   // right child
        TreeNode<K,V> prev;    // previous node
        boolean red;           // whether this node is red
        // ... remaining members omitted
    }

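Before JDK 1.8, HashMap was implemented as an array plus linked lists. Since JDK 1.8, a bucket's linked list is converted to a red-black tree once its length exceeds 8 (provided the table capacity is at least MIN_TREEIFY_CAPACITY, i.e. 64; otherwise the table is resized instead), as shown in the figure.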

[Figure: HashMap linked-list structure and red-black tree structure]

HashMap Constructors

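    public HashMap(int initialCapacity, float loadFactor) {
        // ... argument checks omitted
        this.loadFactor = loadFactor; // load factor supplied by the caller
        this.threshold = tableSizeFor(initialCapacity); // round up to a power of two; held in threshold until the table is allocated
    }

    // Capacity supplied by the caller; it need not be a power of two, because tableSizeFor rounds it up
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    // No capacity specified, so the default capacity of 16 is used when the table is first allocated
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
    // Copy the mappings of another Map
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false); // put each entry of m into this map
    }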
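The constructors above rely on tableSizeFor to round the requested capacity up to a power of two. The sketch below follows the bit-smearing approach used in the JDK 8 source as far as I know it; other JDK versions may implement the method differently, so treat this as an illustration rather than the definitive code. The class name TableSizeDemo is made up.

```java
class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= cap (capped at MAXIMUM_CAPACITY).
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so an exact power of two maps to itself
        n |= n >>> 1;      // copy the highest set bit into every lower position
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] requested = {0, 1, 16, 17, 1000};
        for (int cap : requested) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

So `new HashMap<>(17)` would end up with a table of 32 buckets once the table is allocated, under the assumption that the rounding works as sketched here.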

How HashMap Works

The put Process

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);     // note the first argument: hash(key)
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;                         // table not allocated yet: initialize it via resize()
        if ((p = tab[i = (n - 1) & hash]) == null)
            // compute the bucket index from the hash and assign the bucket head to p; if the bucket is empty, just create the node there
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&                                 // the head node has the same key (the value may differ)
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)                       // the bucket is a red-black tree: insert the tree way
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {                                                 // same bucket, but the key differs from the head: walk the list
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {               // reached the tail: append a new node
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st    // check whether this bucket's list should be treeified
                            treeifyBin(tab, hash);                       // converts the list to a red-black tree (or resizes if the table is still small)
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key                  // overwrite the value
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;         // record the structural modification
        if (++size > threshold)
            resize();             // size exceeds the threshold: resize
        afterNodeInsertion(evict);
        return null;
    }

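Summary of put

Compute the hash value of the key.
If the table is empty, initialize it.
Check for a collision: if the bucket is empty, create the node and insert it; if the bucket already holds the same key, overwrite its value; otherwise append the node to the end of the list (or insert it into the tree).
After appending to a list, check whether the list should be converted to a red-black tree.
Check whether the table needs to be resized.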
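The steps above can be sketched as a much smaller chained hash map. SimpleChainedMap is a hypothetical class written for illustration; it is not the JDK implementation. It assumes a fixed load factor of 0.75, inserts new nodes at the head of the bucket (JDK 8 appends to the tail), and leaves out treeification entirely.

```java
import java.util.Objects;

// Simplified chained hash map: an illustrative sketch, not the JDK code.
class SimpleChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] table;   // allocated lazily on the first put
    private int size;
    private int threshold;

    // Same spreading step as HashMap.hash(): XOR the high 16 bits into the low 16.
    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public V put(K key, V value) {
        int h = hash(key);                          // step 1: compute the hash
        if (table == null) resize();                // step 2: initialize if empty
        int i = (table.length - 1) & h;             // step 3: locate the bucket
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                V old = e.value;                    // same key: overwrite the value
                e.value = value;
                return old;
            }
        }
        // New key: link it into the bucket (head insertion, for brevity).
        // Step 4, treeification, is omitted in this sketch.
        table[i] = new Node<>(h, key, value, table[i]);
        if (++size > threshold) resize();           // step 5: grow if needed
        return null;
    }

    public V get(Object key) {
        if (table == null) return null;
        int h = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & h]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) return e.value;
        }
        return null;
    }

    public int size() {
        return size;
    }

    // Allocates 16 buckets on first use, otherwise doubles and redistributes.
    @SuppressWarnings("unchecked")
    private void resize() {
        int newCap = (table == null) ? 16 : table.length * 2;
        Node<K, V>[] newTab = (Node<K, V>[]) new Node[newCap];
        if (table != null) {
            for (Node<K, V> head : table) {
                for (Node<K, V> e = head; e != null; ) {
                    Node<K, V> next = e.next;
                    int i = (newCap - 1) & e.hash;
                    e.next = newTab[i];
                    newTab[i] = e;
                    e = next;
                }
            }
        }
        table = newTab;
        threshold = (int) (newCap * 0.75f);
    }

    public static void main(String[] args) {
        SimpleChainedMap<String, Integer> m = new SimpleChainedMap<>();
        m.put("a", 1);
        System.out.println(m.put("a", 2)); // the previous value for "a"
        System.out.println(m.get("a"));
    }
}
```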

HashMap's get(key)

    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;         // compute the hash first
    }

    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        // locate the bucket; give up if the table or the bucket is empty
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                // the bucket is a red-black tree: search the tree
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                // otherwise walk the linked list
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

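Summary of get

First compute the hash value of the key.
Then use the hash value to locate the head node of the bucket at the key's index in the table.
Finally traverse that linked list (or search the tree) to find the node for the key.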
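One consequence of get returning null both when no node is found and when the node's value is null is that the two cases cannot be told apart by get alone. The following sketch shows the distinction using containsKey; the class name GetNullDemo and the helper describe() are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class GetNullDemo {
    // Classifies a key as absent, mapped to null, or mapped to a real value.
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "mapped to " + value;
        }
        // get() returned null: either there is no mapping or the mapping holds null.
        return map.containsKey(key) ? "mapped to null" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("b", null);
        System.out.println(describe(map, "a"));
        System.out.println(describe(map, "b"));
        System.out.println(describe(map, "c"));
    }
}
```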

Computing the Hash Value

As we have seen, both put and get start by applying the hash function to the key to obtain its hash value.

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

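The code is simple: it calls hashCode() on the key and XORs the result with itself shifted right (unsigned) by 16 bits, which mixes the high 16 bits into the low 16 bits, as shown in the figure.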

[Figure: hash calculation]
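Why the extra XOR helps can be seen with two hash codes that differ only in their high bits. In the sketch below, spread() repeats the XOR step from hash() and index() repeats the (n - 1) & hash indexing from putVal; the class name and the two sample hash codes are chosen purely for illustration.

```java
class HashSpreadDemo {
    // The mixing step from HashMap.hash(), applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int a = 0x10000;   // differs from b only above bit 15
        int b = 0x20000;
        int n = 16;

        // Without spreading, only the low 4 bits matter and both land in bucket 0.
        System.out.println(index(a, n) + " " + index(b, n));

        // With spreading, the high bits influence the index: buckets 1 and 2.
        System.out.println(index(spread(a), n) + " " + index(spread(b), n));
    }
}
```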

Resizing HashMap: resize()

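How resizing works, quoting the Javadoc of resize():
Initializes or doubles table size. If null, allocates in accord with initial capacity target held in field threshold. Otherwise, because we are using power-of-two expansion, the elements from each bin must either stay at same index, or move with a power of two offset in the new table.
In other words, if the table has not been allocated yet, it is created with the initial capacity held in the threshold field. Otherwise its size is doubled, and every element of a bucket either stays at its old index or moves to the old index plus the old capacity.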
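The "stay or move by a power of two" rule can be checked with a few lines of code. This is a sketch under the assumption that indices are computed as (n - 1) & hash, as in putVal; newIndex() is a hypothetical helper, not a method of HashMap.

```java
class ResizeSplitDemo {
    // Index of an element after the table grows from oldCap to 2 * oldCap,
    // computed from the old index and a single bit of the hash.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = (oldCap - 1) & hash;
        // The bit selected by oldCap decides: 0 means stay, 1 means move by oldCap.
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share bucket 5 in a table of 16.
        int[] hashes = {5, 21, 37, 53};
        for (int h : hashes) {
            int oldIndex = (oldCap - 1) & h;
            int direct = (2 * oldCap - 1) & h;   // index recomputed from scratch
            System.out.println(h + ": " + oldIndex + " -> " + newIndex(h, oldCap)
                    + " (recomputed: " + direct + ")");
        }
    }
}
```

Because the outcome depends on one bit only, a bucket can be split into a "stay" list and a "move" list without recomputing any hash.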


To see what this means, let's walk through a resize from 16 to 32.

[Figure: resizing from 16 to 32]

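When an element's index is recomputed after the resize, n has doubled, so the binary form of n - 1 has one more 1 bit at the high end (shown in red), and the new index changes as follows: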