HashMap---jdk8

发布时间 2023-11-08 10:29:38作者: anpeiyong

概述  

    

Hash table based implementation of the Map interface.
This implementation provides all of the optional map operations, and permits <tt>null</tt> values and the <tt>null</tt> key.(The <tt>HashMap</tt> class is roughly equivalent to <tt>Hashtable</tt>, except that it is unsynchronized and permits nulls.) 

This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

  基于Map的hash表实现;

  hashmap支持Map的所有操作,允许k=null, v=null

  hashmap不保证排序

 

This implementation provides constant-time performance for the basic operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function disperses the elements properly among the buckets.
Iteration over collection views requires time proportional to the "capacity" of the <tt>HashMap</tt> instance (the number of buckets) plus its size (the number of key-value mappings).
Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

  假设hashmap的hash将元素分散到buckets中,hashmap的get,put 提供O(1)复杂度;

  hashmap迭代 和它的capacity和size有关系;

  如果需要迭代的时候,不要把capacity设置的太高;

 

An instance of <tt>HashMap</tt> has two parameters that affect its performance: <i>initial capacity</i> and <i>load factor</i>.
The <i>capacity</i> is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created.
The <i>load factor</i> is a measure of how full the hash table is allowed to get before its capacity is automatically increased.
When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is <i>rehashed</i> (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

  hashmap有2个参数影响它的性能:initial capacity、load factor

  capacity:hash表的buckets数量,initial capacity是hash表创建时确定的;

  load factor:衡量hash表在容量增加之前允许的负载;

  当hash表的 entry数量 超过 load factor*capacity,hash表将会扩容;

 

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs.
Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the <tt>HashMap</tt> class, including <tt>get</tt> and <tt>put</tt>).

  作为一般规则,默认的load factor=0.75 在时间和空间提供了良好的权衡

  loadfactor较高,会减少空间开销,但是增加查找时间;

The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations.
If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

  在 initial capacity 时,要考虑Map的entry数量 & loadfactor,以便减少rehash的次数

  如果 initial capacity远大于 entry的最大量/loadfactor,将不会发生rehash;

 

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table.  

  如果hashmap要存储大量的映射,创建时,要建一个大的容量以便存储,而不是让它自己rehash;

Note that using many keys with the same {@code hashCode()} is a sure way to slow down performance of any hash table.
To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.

  注意:使用相同的hashcode的key 会降低hash表的性能

  为了改善影响,当key是Comparable时,会使用key的comparison order打破hashcode的影响;

 

Note that this implementation is not synchronized.
multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it <i>must</i> be synchronized externally.
(A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)
This is typically accomplished by synchronizing on some object that naturally encapsulates the map.

  hashmap是非线程同步的;

  当多个线程并发访问hashmap时,需要在外部做同步;

  

If no such object exists, the map should be "wrapped" using the {@link Collections#synchronizedMap Collections.synchronizedMap} method.
This is best done at creation time, to prevent accidental unsynchronized access to the map: Map m = Collections.synchronizedMap(new HashMap(...));

  如果没有线程安全的hashMap对象,可以使用Collections.synchronizedMap;

  

The iterators returned by all of this class's "collection view methods" are <i>fail-fast</i>: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own <tt>remove</tt> method, the iterator will throw a {@link ConcurrentModificationException}.  

Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

  hashmap的iterator是fail-fast:如果在iterator时修改结构,将会抛出ConcurrentModificationException;

  在并发修改时,会进行fail-fast;

 

Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.
Fail-fast iterators throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
Therefore, it would be wrong to write a program that depended on this exception for its correctness: <i>the fail-fast behavior of iterators should be used only to detect bugs.</i>

  注意:在存在不同步的并发修改情况下,无法保证iterator的fail-fast;

  编写iterator的remove程序是错误的;

 

  

public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {

        static class Node<K,V> implements Map.Entry<K,V> {
            final int hash;
            final K key;
            V value;
            Node<K,V> next;

            public final int hashCode() {
                return Objects.hashCode(key) ^ Objects.hashCode(value);
            }
        }

        final class EntrySet extends AbstractSet<Entry<K,V>> {

        }

        abstract class HashIterator {

        }

        final class EntryIterator extends HashIterator implements Iterator<Map.Entry<K,V>> {
            public final Map.Entry<K,V> next() { return nextNode(); }
        }

        /**
         * The table, initialized on first use, and resized as necessary.   第一次使用时初始化,必要时resize
         * When allocated, length is always a power of two.   长度分配时,一般是2的幂次
         */
        transient Node<K,V>[] table;

        /**
         * The default initial capacity - MUST be a power of two.   默认初始容量(必须是2的幂次)
         */
        static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

        /**
         * The maximum capacity, used if a higher value is implicitly specified by either of the constructors with arguments.
         * MUST be a power of two <= 1<<30.
         */
        static final int MAXIMUM_CAPACITY = 1 << 30;

        /**
         * The load factor used when none specified in constructor.
         */
        static final float DEFAULT_LOAD_FACTOR = 0.75f;

        static final int TREEIFY_THRESHOLD = 8;

        static final int UNTREEIFY_THRESHOLD = 6;

        static final int MIN_TREEIFY_CAPACITY = 64;

        // The next size value at which to resize (capacity * load factor).  resize的下个阈值(capacity * load factor)
        int threshold;

        // The number of key-value mappings contained in this map. map中k-v数量
        transient int size;

        /**
         * Holds cached entrySet(). Note that AbstractMap fields are used for keySet() and values().
         */
        transient Set<Map.Entry<K,V>> entrySet;


    }

  

 

链路 

map.entrySet() + map.remove

Map<Integer, Integer> map = new HashMap<Integer, Integer>();
        map.put(null, null);
        map.put(1, 2);
        map.put(2, 3);


/**
         * 40 invokeinterface #8 <java/util/Map.entrySet> count 1
         * 45 invokeinterface #9 <java/util/Set.iterator> count 1
         * 50 astore_2
         * 51 aload_2
         * 52 invokeinterface #10 <java/util/Iterator.hasNext> count 1
         * 57 ifeq 86 (+29)
         * 60 aload_2
         * 61 invokeinterface #11 <java/util/Iterator.next> count 1
         * 66 checkcast #12 <java/util/Map$Entry>
         * 69 astore_3
         * 70 aload_1
         * 71 aload_3
         * 72 invokeinterface #13 <java/util/Map$Entry.getKey> count 1
         * 77 invokeinterface #14 <java/util/Map.remove> count 2
         */
for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
            map.remove(entry.getKey());
        }



// java.util.HashMap.entrySet
    public Set<Map.Entry<K,V>> entrySet() {
        Set<Map.Entry<K,V>> es;
        return (es = entrySet) == null ? (entrySet = new EntrySet()) : es;
    }

    // java.util.HashMap.EntrySet.iterator
    public final Iterator<Map.Entry<K,V>> iterator() {
        return new EntryIterator();
    }

    // java.util.HashMap.HashIterator.hasNext
    public final boolean hasNext() {
        return next != null;
    }

    // java.util.HashMap.EntryIterator.next
    public final Map.Entry<K,V> next() { return nextNode(); }

    // java.util.HashMap.HashIterator.nextNode
    final Node<K,V> nextNode() {
        Node<K,V>[] t;
        Node<K,V> e = next;
        if (modCount != expectedModCount)                                       // remove操作后,modCount+1了,但是expectedModCount没变,导致异常
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        if ((next = (current = e).next) == null && (t = table) != null) {
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        return e;
    }

  

 

get 

// java.util.HashMap.get
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    // java.util.HashMap.hash
    // 将key的hashcode值 与 hashcode值右位移16位 做异或位运算,确定数组的索引
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // java.util.HashMap.getNode
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) {       // 数组不为空 && 指定索引位不为null
            if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))           // 数组位元素的hash值相等 && key相等 -> return数组位元素value
                return first;
            if ((e = first.next) != null) {                                                                 // 数组位元素值 与 参数不相等
                if (first instanceof TreeNode)                                                              // 如果 数组位元素 为TreeNode -> 从树中查找
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))           // 遍历链表查找元素
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }