Hash algorithm in PHP
The hash table is the core of PHP, and that is no exaggeration.
PHP's arrays, associative arrays, object properties, function tables, symbol tables and so on all use HashTable as their container.
PHP's HashTable resolves collisions by separate chaining (the "zipper" method, in which colliding entries are linked into a list). That needs little explanation, so my main focus today is PHP's hash algorithm and some of the ideas the algorithm itself reveals.
PHP's hash uses the most common variant, DJBX33A (Daniel J. Bernstein, Times 33 with Addition). This algorithm is widely used in many software projects, such as Apache, Perl and Berkeley DB. For strings it is one of the best hashing algorithms known, because it is very fast to compute and distributes keys very well (few collisions, even distribution).
The core idea of the algorithm is:
    hash(i) = hash(i-1) * 33 + str[i]
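As a minimal sketch, the recurrence can be written as a plain loop in C. The function name `djbx33a_simple` and the `seed` parameter (the value of the hash before the first character) are assumptions made for illustration and are not taken from PHP's source. Bytes are read as `unsigned char` here, which is a choice of this sketch.

```c
#include <stddef.h>

/*
 * Straightforward Times 33 with Addition:
 *   hash(i) = hash(i-1) * 33 + str[i]
 * `seed` is the hash value before the first character is mixed in.
 * Arithmetic on unsigned long wraps around on overflow, which is
 * well defined in C.
 */
unsigned long djbx33a_simple(const char *key, size_t len, unsigned long seed)
{
    unsigned long hash = seed;
    for (size_t i = 0; i < len; i++) {
        hash = hash * 33 + (unsigned char)key[i];
    }
    return hash;
}
```

With a seed of 5381, the key "a" hashes to 5381 * 33 + 97 = 177670, and an empty key simply returns the seed.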
In zend_hash.h, we can find this algorithm in PHP:
    static inline ulong zend_inline_hash_func(char *arKey, uint nKeyLength)
    {
        register ulong hash = 5381;

        /* variant with the hash unrolled eight times */
        for (; nKeyLength >= 8; nKeyLength -= 8) {
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
            hash = ((hash << 5) + hash) + *arKey++;
        }
        /* mix in the remaining 0 to 7 characters */
        switch (nKeyLength) {
            case 7: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 6: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 5: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 4: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 3: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 2: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
            case 1: hash = ((hash << 5) + hash) + *arKey++; break;
            case 0: break;
            EMPTY_SWITCH_DEFAULT_CASE()
        }
        return hash;
    }
Compare this with the classic Times 33 algorithm as used directly in Apache and Perl:
    # hashing function used in Perl 5.005:
    # Return the hashed value of a string: $hash = perlhash("key")
    # (Defined by the PERL_HASH macro in hv.h)
    sub perlhash
    {
        $hash = 0;
        foreach (split //, shift) {
            $hash = $hash*33 + ord($_);
        }
        return $hash;
    }
In PHP's hash algorithm we can see some subtle differences.
First, the most obvious one is that PHP does not multiply by 33 directly, but uses:
    (hash << 5) + hash
A shift plus an addition is cheaper than a general multiplication, at least when the compiler does not make that substitution itself.
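The equivalence between the shift form and the multiplication can be checked with a small sketch. The function names `times33_shift` and `times33_mul` are hypothetical helpers for this illustration only.

```c
/* Multiply by 33 with one shift and one addition: h * 32 + h. */
unsigned long times33_shift(unsigned long h)
{
    return (h << 5) + h;
}

/* The same step as a plain multiplication, for comparison. */
unsigned long times33_mul(unsigned long h)
{
    return h * 33UL;
}

/*
 * The parentheses matter. In C, + binds tighter than <<, so the
 * unparenthesized `h << 5 + h` parses as `h << (5 + h)`.
 * For h = 1 that is 1 << 6 = 64, not 33.
 */
```

Because unsigned arithmetic wraps modulo a power of two, the two functions agree for every input, including values large enough to overflow.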
The next, and most important, point is the loop unrolling. A few days ago I read an article about Discuz's caching mechanism. It mentioned that Discuz adopts different caching strategies depending on how popular a post is and, following user habits, caches only the first page of a post (because few users page further through a thread).
In a similar spirit, PHP favors string keys shorter than 8 characters: the loop is unrolled in units of 8, so such keys skip the loop entirely and are handled by the switch alone. This is a careful and meticulous detail.
In addition, there are the inline and register keywords... Clearly the PHP developers took great pains to optimize the hash.
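To make the unrolling idea concrete, here is a reduced sketch that unrolls four times instead of eight and sits next to a one-character-per-iteration reference. The names `hash_loop` and `hash_unrolled4` and the factor of four are assumptions chosen to keep the example short; PHP's function, quoted earlier, unrolls eight times.

```c
#include <stddef.h>

/* Reference version: one character per loop iteration. */
unsigned long hash_loop(const char *key, size_t len)
{
    unsigned long hash = 5381;
    while (len-- > 0) {
        hash = ((hash << 5) + hash) + (unsigned char)*key++;
    }
    return hash;
}

/* Same computation, unrolled four times (PHP itself uses eight). */
unsigned long hash_unrolled4(const char *key, size_t len)
{
    unsigned long hash = 5381;

    /* Full groups of four characters: one loop test per group. */
    for (; len >= 4; len -= 4) {
        hash = ((hash << 5) + hash) + (unsigned char)*key++;
        hash = ((hash << 5) + hash) + (unsigned char)*key++;
        hash = ((hash << 5) + hash) + (unsigned char)*key++;
        hash = ((hash << 5) + hash) + (unsigned char)*key++;
    }
    /* Remaining 0 to 3 characters: jump in and fall through. */
    switch (len) {
        case 3: hash = ((hash << 5) + hash) + (unsigned char)*key++; /* fallthrough */
        case 2: hash = ((hash << 5) + hash) + (unsigned char)*key++; /* fallthrough */
        case 1: hash = ((hash << 5) + hash) + (unsigned char)*key++; break;
        default: break;
    }
    return hash;
}
```

Both functions return the same value for any key. The unrolled one performs fewer loop tests, and a key shorter than the unroll factor never enters the loop at all.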
Finally, the initial value of the hash is set to 5381. The Times 33 algorithm in Apache and the hash algorithm in Perl both start from 0, so why choose 5381? I don't know the specific reason, but I did discover some properties of 5381:
    Magic Constant 5381:
      1. odd number
      2. prime number
      3. deficient number
      4. 001/010/100/000/101 (its binary representation, in groups of three bits)
Having seen this, I have reason to believe that this initial value was chosen to give a better distribution.
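The listed properties of 5381 can be verified with a few lines of C. This is only a sketch: the helper names are hypothetical, and the checks confirm the arithmetic facts, not that they improve the distribution. Since each group of three bits is one octal digit, the bit pattern 001/010/100/000/101 is the octal literal 012405.

```c
#include <stdbool.h>

/* An odd number has its lowest bit set. */
bool is_odd(unsigned long n)
{
    return (n & 1UL) == 1UL;
}

/* Trial division up to the square root of n. */
bool is_prime(unsigned long n)
{
    if (n < 2) {
        return false;
    }
    for (unsigned long d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            return false;
        }
    }
    return true;
}

/* Sum of the proper divisors of n (all divisors smaller than n). */
unsigned long proper_divisor_sum(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long d = 1; d < n; d++) {
        if (n % d == 0) {
            sum += d;
        }
    }
    return sum;
}

/* A deficient number is larger than the sum of its proper divisors. */
bool is_deficient(unsigned long n)
{
    return n > 0 && proper_divisor_sum(n) < n;
}

/* 001 010 100 000 101 in binary, written as octal digits 1 2 4 0 5. */
static const unsigned long magic_from_bits = 012405UL;
```

Every prime is deficient, because its only proper divisor is 1, so the second and third properties are not independent.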
As for why it is Times 33 rather than times some other number, the comments on PHP's hash algorithm offer some explanation. I hope they are useful to interested readers:
    DJBX33A (Daniel J. Bernstein, Times 33 with Addition)

    This is Daniel J. Bernstein's popular `times 33' hash function as
    posted by him years ago on comp.lang.c. It basically uses a function
    like ``hash(i) = hash(i-1) * 33 + str[i]''. This is one of the best
    known hash functions for strings. Because it is both computed very
    fast and distributes very well.

    The magic of number 33, i.e. why it works better than many other
    constants, prime or not, has never been adequately explained by
    anyone. So I try an explanation: if one experimentally tests all
    multipliers between 1 and 256 (as RSE did now) one detects that even
    numbers are not useable at all. The remaining 128 odd numbers
    (except for the number 1) work more or less all equally well. They
    all distribute in an acceptable way and this way fill a hash table
    with an average percent of approx. 86%.

    If one compares the Chi^2 values of the variants, the number 33 not
    even has the best value. But the number 33 and a few other equally
    good numbers like 17, 31, 63, 127 and 129 have nevertheless a great
    advantage to the remaining numbers in the large set of possible
    multipliers: their multiply operation can be replaced by a faster
    operation based on just one shift plus either a single addition
    or subtraction operation. And because a hash function has to both
    distribute well _and_ has to be very fast to compute, those few
    numbers should be preferred and seems to be the reason why Daniel J.
    Bernstein also preferred it.

    -- Ralf S. Engelschall
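The remark about "one shift plus either a single addition or subtraction" can be spelled out for each multiplier the comment names. The function names below are hypothetical and the block is only an illustration of the arithmetic: each multiplier is a power of two plus or minus one.

```c
/* 17 = 16 + 1 */
unsigned long times17(unsigned long h)  { return (h << 4) + h; }

/* 31 = 32 - 1 */
unsigned long times31(unsigned long h)  { return (h << 5) - h; }

/* 33 = 32 + 1 */
unsigned long times33(unsigned long h)  { return (h << 5) + h; }

/* 63 = 64 - 1 */
unsigned long times63(unsigned long h)  { return (h << 6) - h; }

/* 127 = 128 - 1 */
unsigned long times127(unsigned long h) { return (h << 7) - h; }

/* 129 = 128 + 1 */
unsigned long times129(unsigned long h) { return (h << 7) + h; }
```

Each function matches the corresponding multiplication for every unsigned input, since unsigned shifts, additions and subtractions all wrap modulo the same power of two.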
• Author: Laruence
• This article’s address: http://www.laruence.com/2009/07/23/994.html
