PHP and Perl hash algorithm implementation (the times33 hash algorithm)
The code is as follows:
APR_DECLARE_NONSTD(unsigned int) apr_hashfunc_default(const char *char_key,
                                                      apr_ssize_t *klen)
{
    unsigned int hash = 0;
    const unsigned char *key = (const unsigned char *)char_key;
    const unsigned char *p;
    apr_ssize_t i;

    /*
     * This is the popular `times 33' hash algorithm which is used by
     * perl and also appears in Berkeley DB. This is one of the best
     * known hash functions for strings because it is both computed
     * very fast and distributes very well.
     *
     * The originator may be Dan Bernstein but the code in Berkeley DB
     * cites Chris Torek as the source. The best citation I have found
     * is "Chris Torek, Hash function for text in C, Usenet message
     * in comp.lang.c , October, 1990." in Rich
     * Salz's USENIX 1992 paper about INN which can be found at
     * .
     *
     * The magic of number 33, i.e. why it works better than many other
     * constants, prime or not, has never been adequately explained by
     * anyone. So I try an explanation: if one experimentally tests all
     * multipliers between 1 and 256 (as I did while writing a low-level
     * data structure library some time ago) one detects that even
     * numbers are not useable at all. The remaining 128 odd numbers
     * (except for the number 1) work more or less all equally well.
     * They all distribute in an acceptable way and this way fill a hash
     * table with an average percent of approx. 86%.
     *
     * If one compares the chi^2 values of the variants (see
     * Bob Jenkins ``Hashing Frequently Asked Questions'' at
     * http://burtleburtle.net/bob/hash/hashfaq.html for a description
     * of chi^2), the number 33 not even has the best value. But the
     * number 33 and a few other equally good numbers like 17, 31, 63,
     * 127 and 129 have nevertheless a great advantage to the remaining
     * numbers in the large set of possible multipliers: their multiply
     * operation can be replaced by a faster operation based on just one
     * shift plus either a single addition or subtraction operation. And
     * because a hash function has to both distribute good _and_ has to
     * be very fast to compute, those few numbers should be preferred.
     *
     * -- Ralf S. Engelschall
     */

    if (*klen == APR_HASH_KEY_STRING) {
        /* NUL-terminated key: hash up to the terminator, then report the length. */
        for (p = key; *p; p++) {
            hash = hash * 33 + *p;
        }
        *klen = p - key;
    }
    else {
        /* Explicit length: hash exactly *klen bytes. */
        for (p = key, i = *klen; i; i--, p++) {
            hash = hash * 33 + *p;
        }
    }

    return hash;
}
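The function above depends on APR's own types and macros. As a minimal sketch for trying the same loop outside APR, the version below assumes `ptrdiff_t` in place of `apr_ssize_t` and a hypothetical `KEY_IS_STRING` sentinel in place of `APR_HASH_KEY_STRING`; the name `times33` is also made up for illustration and is not part of any library.

```c
#include <stddef.h>

/* Hypothetical sentinel for this sketch: "the key is NUL-terminated, measure it". */
#define KEY_IS_STRING ((ptrdiff_t)-1)

/*
 * times33 hash of a key.
 * If *klen equals KEY_IS_STRING, the key is read up to its terminating NUL
 * and *klen is updated to the measured length. Otherwise exactly *klen bytes
 * are hashed, embedded NUL bytes included.
 */
unsigned int times33(const char *char_key, ptrdiff_t *klen)
{
    const unsigned char *key = (const unsigned char *)char_key;
    const unsigned char *p = key;
    unsigned int hash = 0;
    ptrdiff_t i;

    if (*klen == KEY_IS_STRING) {
        for (; *p; p++) {
            hash = hash * 33 + *p;   /* unsigned arithmetic wraps on overflow */
        }
        *klen = p - key;
    }
    else {
        for (i = *klen; i > 0; i--, p++) {
            hash = hash * 33 + *p;
        }
    }
    return hash;
}
```

With this sketch, hashing "abc" in string mode gives (97 * 33 + 98) * 33 + 99 = 108966 and sets the length to 3, while hashing only its first two bytes gives 97 * 33 + 98 = 3299.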
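Translation of the function's comment:

This is the well-known times33 hash algorithm, which is used by Perl and also appears in Berkeley DB. It is one of the best known hash functions for strings because it is both very fast to compute and distributes very well.

The originator may be Dan Bernstein, but the code in Berkeley DB cites Chris Torek as the source. The best citation I have found is "Chris Torek, Hash function for text in C, Usenet message in comp.lang.c, October 1990," given in Rich Salz's USENIX 1992 paper about INN, which can be found at [URL missing].

The magic of the number 33, that is, why it works better than many other constants, prime or not, has never been adequately explained by anyone. So here is an attempt at an explanation. If one experimentally tests all multipliers between 1 and 256 (as I did while writing a low-level data structure library some time ago), one finds that even numbers are not usable at all. The remaining 128 odd numbers (except for 1) all work about equally well. They all distribute acceptably and fill a hash table to about 86% on average.

If one compares the chi^2 values of the variants (see Bob Jenkins' "Hashing Frequently Asked Questions" at http://burtleburtle.net/bob/hash/hashfaq.html for a description of chi^2), the number 33 does not even have the best value. (gibbon: chi^2 is a statistical measure of how far the observed bucket counts deviate from the counts expected under a uniform distribution. The comment does not give its formula, so I cannot tell from the comment alone whether "best" means a large or a small value; for a test of evenness, a smaller deviation normally indicates a more even spread.) But 33 and a few other equally good numbers such as 17, 31, 63, 127 and 129 nevertheless have a great advantage over the remaining numbers in the large set of possible multipliers: their multiplication can be replaced by a faster operation based on just one shift plus either a single addition or a single subtraction. And because a hash function has to both distribute well and be very fast to compute, those few numbers should be preferred.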
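Two claims in the comment can be checked with a few lines of C. The sketch below is illustrative only: it assumes 32-bit hashes (`uint32_t`), and the helper names `mul33_shift`, `mul31_shift`, `mul17_shift`, `mul129_shift` and `hash_mul` are hypothetical. It shows how the multiplication becomes one shift plus one addition or subtraction, and one way an even multiplier can go wrong: with multiplier 32, each step is a pure 5-bit shift, so after seven further bytes the first byte has been shifted out of a 32-bit hash entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* 33 = 32 + 1, so h * 33 is one shift plus one addition. */
static uint32_t mul33_shift(uint32_t h)  { return (h << 5) + h; }

/* 31 = 32 - 1, so h * 31 is one shift plus one subtraction. */
static uint32_t mul31_shift(uint32_t h)  { return (h << 5) - h; }

/* 17 = 16 + 1 and 129 = 128 + 1 follow the same pattern. */
static uint32_t mul17_shift(uint32_t h)  { return (h << 4) + h; }
static uint32_t mul129_shift(uint32_t h) { return (h << 7) + h; }

/* Same loop as times33, but with the multiplier as a parameter. */
static uint32_t hash_mul(const char *key, size_t len, uint32_t multiplier)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t hash = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        hash = hash * multiplier + p[i];   /* wraps modulo 2^32 */
    }
    return hash;
}
```

Because unsigned arithmetic wraps modulo 2^32, the shift forms agree with the plain multiplications for every input. With `hash_mul`, the 8-byte keys "Axxxxxxx" and "Bxxxxxxx" collide under multiplier 32 (the first byte is scaled by 32^7 = 2^35, which is 0 modulo 2^32) but get different hashes under multiplier 33. Modern compilers usually perform this strength reduction themselves, so writing `hash * 33` in source is normally fine.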
