Java Hash Storage
Hash-based storage in Java mainly refers to data structures such as HashSet, HashMap, LinkedHashSet, LinkedHashMap and Hashtable. To understand how hash storage works in Java, we must first understand two methods: equals() and hashCode(). The equals() method, and how it differs from the "==" relational operator, is explained in another article. As for hashCode(), it is a method defined in the Object class:
public native int hashCode();
This is a native method that returns an int value: its implementation is not written in Java inside the Object class but is supplied by the JVM in native code. The method is mainly used by data structures that rely on hashing, and it is what lets hash-based collections work correctly. For example, when inserting an object into a container (assume it is a HashMap), how do we determine whether the object already exists in the container, and where it is stored? Since there may be thousands of elements in the container, comparing them one by one with equals() is very inefficient.
The value of hashing is speed: it stores the key somewhere it can be found quickly. The fastest structure for reaching an element by position is an array, so an array is used to store information about the key (note: information about the key, not the key itself). But an array cannot change its capacity, which raises a problem: we want to store an unknown number of values in the Map, so what should we do if the number of keys is limited by the capacity of the array?
The answer is: the array does not store the key itself. Instead, a number is generated from the key object and used as the array index. This number is the hash code, produced by the hashCode() method, which is defined in Object and may be overridden by your class. To get around the fixed array capacity, different keys are allowed to produce the same index, a situation called a collision. Therefore, the process of looking up a value in the container is: first compute the hash code of the key with hashCode(), then use the hash code to locate a slot in the array. Collisions are often handled by external chaining (also called separate chaining): each array slot holds not a single value but a list of entries, and a linear search with equals() is performed on that list. This part of the lookup is naturally slower. However, if the hash function is good enough, each slot holds only a few entries. The hashing mechanism can therefore jump straight to one location in the array and compare only a few elements, which is why HashMap is so fast. We can see this in the HashMap.put() method (the version shown here is the JDK 7 implementation):
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
The main idea is: when the key is not null, a hash value is computed from the key object, and the array index i is then derived from that hash value. The list starting at table[i] is traversed, and for each entry the stored hash is compared first; only if the hashes match is the key checked by reference or with equals(). If the key exists, the old value is replaced with the new value and the old value is returned; otherwise, the new key-value pair is added to the HashMap and null is returned. This shows that hashCode() exists to reduce the number of calls to equals(), thereby improving efficiency.
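The same flow can be sketched in a much smaller class. This is only an illustration under stated assumptions, not the JDK code: the class name SimpleChainedMap and its Node type are invented for this example, the table never resizes, and the index is computed with Math.floorMod instead of the JDK's hash() and indexFor() helpers.

```java
import java.util.Objects;

// Hypothetical, deliberately tiny map using separate chaining.
// Assumptions: fixed capacity (> 0), no resizing, not thread-safe.
class SimpleChainedMap<K, V> {
    // One entry in a bucket's singly linked list.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] table;
    private int size;

    @SuppressWarnings("unchecked")
    SimpleChainedMap(int capacity) {
        table = (Node<K, V>[]) new Node[capacity];
    }

    // Maps a hash code to a bucket; floorMod keeps the result non-negative.
    private int indexFor(int hash) {
        return Math.floorMod(hash, table.length);
    }

    // Returns the previous value for the key, or null if the key was absent.
    V put(K key, V value) {
        int hash = Objects.hashCode(key); // 0 for a null key
        int i = indexFor(hash);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            // Cheap int comparison first; equals() only when hashes match.
            if (e.hash == hash && Objects.equals(e.key, key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // Key not found: prepend a new node to this bucket's list.
        table[i] = new Node<>(hash, key, value, table[i]);
        size++;
        return null;
    }

    // Returns the value for the key, or null if the key is absent.
    V get(Object key) {
        int hash = Objects.hashCode(key);
        for (Node<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

For instance, the strings "Aa" and "BB" have the same hash code, so they always land in the same bucket here, yet both can be stored and retrieved because equals() tells them apart within the list.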
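One point to note: hashCode() does not need to return a unique value for every object, since two different objects may share a hash code, but equals() must strictly determine whether two objects are the same. Conversely, two objects that are equal according to equals() must return the same hash code; otherwise hash-based containers may fail to find them.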
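A small example may make this relationship concrete. The classes GridPoint, BrokenGridPoint and HashContractDemo below are hypothetical names made up for illustration; the sketch assumes a simple immutable key with two int fields.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key class that overrides equals() and hashCode() together.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        // Equal points produce equal hash codes.
        return Objects.hash(x, y);
    }
}

// Same fields, but only equals() is overridden; hashCode() is inherited
// from Object, so two equal instances usually get different hash codes.
final class BrokenGridPoint {
    private final int x;
    private final int y;

    BrokenGridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BrokenGridPoint)) return false;
        BrokenGridPoint p = (BrokenGridPoint) o;
        return x == p.x && y == p.y;
    }
}

class HashContractDemo {
    public static void main(String[] args) {
        Set<GridPoint> points = new HashSet<>();
        points.add(new GridPoint(1, 2));
        // An equal object is found because it hashes to the same bucket.
        System.out.println(points.contains(new GridPoint(1, 2))); // true

        Set<BrokenGridPoint> broken = new HashSet<>();
        broken.add(new BrokenGridPoint(1, 2));
        // Usually false: the lookup normally starts in a different bucket,
        // so equals() is never reached.
        System.out.println(broken.contains(new BrokenGridPoint(1, 2)));

        // Hash codes need not be unique: these two strings collide,
        // and equals() still distinguishes them.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false
    }
}
```

The design point is that hashCode() only narrows the search to a bucket, while equals() makes the final decision, so the two methods should always be overridden together.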