最近看书,里面提到了一些Hash算法。比较有印象的是Times33,当时理解不是很透测,今天写了段程序来验证了一下。 先上代码: ?php /** *CRC32Hashfunction *@param$str *@returnint */ function hash32 ( $str ) { return crc32 ( $str ) 16 0x7FFFFFFF ; }
最近看书,里面提到了一些Hash算法。比较有印象的是Times33,当时理解不是很透测,今天写了段程序来验证了一下。
先上代码:<span style="color: #000000;">
<span style="color: #0000BB;"><?php <br />
<br></span><span style="color: #FF8000;">/**
<br> * CRC32 Hash function
<br> * @param $str
<br> * @return int
<br> */
<br></span><span style="color: #007700;">function </span><span style="color: #0000BB;">hash32</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">)
<br>{
<br> return </span><span style="color: #0000BB;">crc32</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">) >> </span><span style="color: #0000BB;">16 </span><span style="color: #007700;">& </span><span style="color: #0000BB;">0x7FFFFFFF</span><span style="color: #007700;">;
<br>}
<br>
<br></span><span style="color: #FF8000;">/**
<br> * Times33 Hash function
<br> * @param $str
<br> * @return int
<br> */
<br></span><span style="color: #007700;">function </span><span style="color: #0000BB;">hash33</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">)
<br>{
<br> </span><span style="color: #0000BB;">$hash </span><span style="color: #007700;">= </span><span style="color: #0000BB;">0</span><span style="color: #007700;">;
<br> for(</span><span style="color: #0000BB;">$i</span><span style="color: #007700;">=</span><span style="color: #0000BB;">0</span><span style="color: #007700;">; </span><span style="color: #0000BB;">$i</span><span style="color: #007700;"><span style="color: #0000BB;">strlen</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">); </span><span style="color: #0000BB;">$i</span><span style="color: #007700;">++) {
<br> </span><span style="color: #0000BB;">$hash </span><span style="color: #007700;">+= </span><span style="color: #0000BB;">33 </span><span style="color: #007700;">* </span><span style="color: #0000BB;">$hash </span><span style="color: #007700;">+ </span><span style="color: #0000BB;">ord</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">{</span><span style="color: #0000BB;">$i</span><span style="color: #007700;">});
<br> }
<br> return </span><span style="color: #0000BB;">$hash </span><span style="color: #007700;">& </span><span style="color: #0000BB;">0x7FFFFFFF</span><span style="color: #007700;">;
<br>}
<br>
<br>
<br></span><span style="color: #0000BB;">$n </span><span style="color: #007700;">= </span><span style="color: #0000BB;">10</span><span style="color: #007700;">;
<br>
<br></span><span style="color: #FF8000;">// Test Case 1
<br></span><span style="color: #0000BB;">$stat </span><span style="color: #007700;">= array();
<br>for(</span><span style="color: #0000BB;">$i</span><span style="color: #007700;">=</span><span style="color: #0000BB;">0</span><span style="color: #007700;">; </span><span style="color: #0000BB;">$i</span><span style="color: #007700;"><span style="color: #0000BB;">10000</span><span style="color: #007700;">; </span><span style="color: #0000BB;">$i</span><span style="color: #007700;">++){
<br> </span><span style="color: #0000BB;">$str </span><span style="color: #007700;">= </span><span style="color: #0000BB;">substr</span><span style="color: #007700;">(</span><span style="color: #0000BB;">md5</span><span style="color: #007700;">(</span><span style="color: #0000BB;">microtime</span><span style="color: #007700;">(</span><span style="color: #0000BB;">true</span><span style="color: #007700;">)), </span><span style="color: #0000BB;">0</span><span style="color: #007700;">, </span><span style="color: #0000BB;">8</span><span style="color: #007700;">);
<br> </span><span style="color: #0000BB;">$p </span><span style="color: #007700;">= </span><span style="color: #0000BB;">hash32</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">) % </span><span style="color: #0000BB;">$n</span><span style="color: #007700;">;
<br> if(isset(</span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">[</span><span style="color: #0000BB;">$p</span><span style="color: #007700;">])){
<br> </span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">[</span><span style="color: #0000BB;">$p</span><span style="color: #007700;">]++;
<br> }else{
<br> </span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">[</span><span style="color: #0000BB;">$p</span><span style="color: #007700;">] = </span><span style="color: #0000BB;">1</span><span style="color: #007700;">;
<br> }
<br>}
<br></span><span style="color: #0000BB;">print_r</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">);
<br>
<br></span><span style="color: #FF8000;">// Test Case 2
<br></span><span style="color: #0000BB;">$stat </span><span style="color: #007700;">= array();
<br>for(</span><span style="color: #0000BB;">$i</span><span style="color: #007700;">=</span><span style="color: #0000BB;">0</span><span style="color: #007700;">; </span><span style="color: #0000BB;">$i</span><span style="color: #007700;"><span style="color: #0000BB;">10000</span><span style="color: #007700;">; </span><span style="color: #0000BB;">$i</span><span style="color: #007700;">++){
<br> </span><span style="color: #0000BB;">$str </span><span style="color: #007700;">= </span><span style="color: #0000BB;">substr</span><span style="color: #007700;">(</span><span style="color: #0000BB;">md5</span><span style="color: #007700;">(</span><span style="color: #0000BB;">microtime</span><span style="color: #007700;">(</span><span style="color: #0000BB;">true</span><span style="color: #007700;">)), </span><span style="color: #0000BB;">0</span><span style="color: #007700;">, </span><span style="color: #0000BB;">8</span><span style="color: #007700;">);
<br> </span><span style="color: #0000BB;">$p </span><span style="color: #007700;">= </span><span style="color: #0000BB;">hash33</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$str</span><span style="color: #007700;">) % </span><span style="color: #0000BB;">$n</span><span style="color: #007700;">;
<br> if(isset(</span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">[</span><span style="color: #0000BB;">$p</span><span style="color: #007700;">])){
<br> </span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">[</span><span style="color: #0000BB;">$p</span><span style="color: #007700;">]++;
<br> }else{
<br> </span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">[</span><span style="color: #0000BB;">$p</span><span style="color: #007700;">] = </span><span style="color: #0000BB;">1</span><span style="color: #007700;">;
<br> }
<br>}
<br></span><span style="color: #0000BB;">print_r</span><span style="color: #007700;">(</span><span style="color: #0000BB;">$stat</span><span style="color: #007700;">);</span>
</span>
</span></span></span>
以上有两个测试用例。第一个,用CRC32的方法;第二个是Times33的算法实现。
效果:
结果分布,两种算法不相上下(估计是数据源的问题,md5只有0-f)。也有文章说CRC32的分布更均匀(参考链接:)
但耗费时间,CRC32比Times33快将近一倍。
为什么是33?
即是素数(质数),也是奇数。除了33,还有131, 1313, 5381等。PHP内置的Hash函数用的是5381,在“鸟哥”的一篇博文中也有提到:
原文地址:PHP的Hash算法:Times33, 感谢原作者分享。