Hash is to establish a certain correspondence relationship f between the storage location of the record and its keywords, so that each keyword key corresponds to a storage location f (key), and establishes the mutual relationship between the keyword and the storage location. Correspondence relationship, this relationship f is called hash function (hash function). The editor of this article mainly talks about the conflict handling problem of hash function.
During the search process, the number of key code comparisons depends on the number of conflicts. The fewer conflicts, the higher the search efficiency. , there will be many conflicts, and the search efficiency will be low. Therefore, factors that affect the number of conflicts are factors that affect search efficiency. There are the following three factors that affect the number of conflicts:
1. Whether the hash function is uniform;
2. The method of handling conflicts;
3. The filling factor of the hash table .
The filling factor of the hash table is defined as: α = the number of elements filled in the table / the length of the hash table
α is a sign factor of the degree of filling of the hash table. Since the table length is a fixed value, α is proportional to the "number of elements filled in the table". Therefore, the larger α is, the more elements are filled in the table, and the greater the possibility of conflict; the smaller α, The fewer elements that populate the table, the less likely there will be conflicts.
In fact, the average search length of the hash table is a function of the filling factor α, but different methods of handling conflicts have different functions.
The methods to resolve hash conflicts generally include:
NO.1 Open addressing method
The so-called open addressing method means that once a conflict occurs, look for the next empty address. As long as the hash table is large enough, the empty hash address can always be found and the record will be stored.
Formula: f(key)=(f(key) di)%m(di=1,2,3….m-1)
For example, the keyword set is { 12, 67, 56, 16, 25, 37, 22, 29, 15, 47, 48, 34}, the table length is 12. Hash function f(key) = key mod 12.
When calculating the first five numbers {12, 67, 56, 16, 25}, they are all hash addresses without conflict and are stored directly; when calculating key = 37, it is found that f(37) = 1. At this time, it conflicts with the location of 25. So apply the above formula f(37) = (f(37) 1) mod 12 =2,. So 37 is stored at the location with index 2. The next 22, 29, 15, and 47 have no conflicts and are deposited normally. At 48, we calculate f(48) = 0, which conflicts with the 0 position of 12. It doesn't matter, we f(48) = (f(48) 1) mod 12 = 1, which conflicts with the position of 25. . So f(48) = (f(48) 2) mod 12 = 2, there is still a conflict... There will be no vacancies until f(48) = (f(48) 6) mod 12 = 6. As shown in the table below.
Serial number | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Keywords | 12 | 25 | 16 | 56 |
The above is the detailed content of Classic conflict handling of hash tables (hash tables) in data structures. For more information, please follow other related articles on the PHP Chinese website!