Murmur3 hash compatibility between Go and Python

王林
Release: 2024-02-09 13:10:19

Murmur3 is a fast, non-cryptographic hash function commonly used in hash-based data structures. Go and Python libraries can serialize the same 128-bit Murmur3 result in different byte orders, so the encoded output may differ even when the underlying hash is identical. This article walks through one such mismatch between Python's mmh3 and Go's github.com/spaolacci/murmur3 and shows how to make the two agree when passing data between the languages.

Question content

We have two different libraries, one in Python and one in Go, that need to calculate Murmur3 hashes in the same way. Unfortunately, no matter how hard we tried, we couldn't get the libraries to produce the same results. Judging from this question about Java and Python, compatibility is not necessarily straightforward.

Right now we are using the Python mmh3 library and the Go github.com/spaolacci/murmur3 library.

In Go:

hash := murmur3.New128()
hash.Write([]byte("chocolate-covered-espresso-beans"))
fmt.Println(base64.RawURLEncoding.EncodeToString(hash.Sum(nil)))
// Output: cLHSo2nCBxyOezviLM5gwg

In Python:

import base64
import mmh3

name = "chocolate-covered-espresso-beans"
hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='big', signed=False)
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: jns74izOYMJwsdKjacIHHA (big byteorder)

hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='little', signed=False)
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: HAfCaaPSsXDCYM4s4jt7jg (little byteorder)

hash = mmh3.hash_bytes(name.encode('utf-8'))
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: HAfCaaPSsXDCYM4s4jt7jg

In Go, murmur3 returns uint64 values, so we assumed signed=False in Python. We also tried signed=True, but that did not produce a matching hash either.

We are open to using different libraries, but would like to know whether there is a problem with our Go or Python approach to computing a base64-encoded hash from a string. Any help is appreciated.

Solution

The first Python result is almost correct.

>>> import base64, binascii
>>> binascii.hexlify(base64.b64decode('jns74izOYMJwsdKjacIHHA=='))
b'8e7b3be22cce60c270b1d2a369c2071c'

In Go:

    x, y := murmur3.Sum128([]byte("chocolate-covered-espresso-beans"))
    fmt.Printf("%x %x\n", x, y)

result:

70b1d2a369c2071c 8e7b3be22cce60c2
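
The same comparison can be made without running either library, by decoding the base64 strings quoted in the question. The following is a minimal sketch using only Python's standard library; the helper name b64_to_hex is hypothetical and the two input strings are the Go output and the first (big-endian) Python output from above.

```python
import base64


def b64_to_hex(text: str) -> str:
    """Decode unpadded URL-safe base64 and return the bytes as a hex string."""
    padded = text + "=" * (-len(text) % 4)  # restore the stripped '=' padding
    return base64.urlsafe_b64decode(padded).hex()


go_hex = b64_to_hex("cLHSo2nCBxyOezviLM5gwg")  # Go: hash.Sum(nil)
py_hex = b64_to_hex("jns74izOYMJwsdKjacIHHA")  # Python: hash128, byteorder='big'

# Print each digest as two 64-bit words (16 hex digits each).
print(go_hex[:16], go_hex[16:])
print(py_hex[:16], py_hex[16:])
```

Both digests contain the same two 8-byte words, only in a different order.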

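So the order of the two 64-bit words is reversed: Go writes the low word (x) first, while the big-endian Python bytes start with the high word. To get the same result in Python, swap the two 8-byte halves:

import base64
import mmh3

name = "chocolate-covered-espresso-beans"
hash = mmh3.hash128(name.encode('utf-8'), signed=False).to_bytes(16, byteorder='big', signed=False)
hash = hash[8:] + hash[:8]  # move the low 8 bytes to the front, as Go does
print(base64.urlsafe_b64encode(hash).decode('utf-8').strip("="))
# Output: cLHSo2nCBxyOezviLM5gwg (matches the Go result)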
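
The swap can also be expressed on the 128-bit integer itself rather than on the byte string. The sketch below is one possible way to wrap it, not part of either library: the helper names go_style_digest and go_style_b64 are hypothetical, and the sample integer is the value implied by the hex dump above. In real use it would come from mmh3.hash128(...). Masking with 64-bit masks means a negative (signed) input is handled as well, assuming it is the two's-complement view of the same 128 bits.

```python
import base64

MASK64 = (1 << 64) - 1  # selects the low 64 bits of an integer


def go_style_digest(h128: int) -> bytes:
    """Lay out a 128-bit hash as low word then high word, each big-endian.

    This is the byte order seen in the Go output above (an observation
    from this example, not a documented guarantee).
    """
    low = h128 & MASK64
    high = (h128 >> 64) & MASK64
    return low.to_bytes(8, "big") + high.to_bytes(8, "big")


def go_style_b64(h128: int) -> str:
    """Unpadded URL-safe base64 of the Go-style digest."""
    return base64.urlsafe_b64encode(go_style_digest(h128)).decode("ascii").rstrip("=")


# Integer implied by the hex dump 8e7b3be22cce60c270b1d2a369c2071c.
value = 0x8E7B3BE22CCE60C270B1D2A369C2071C
print(go_style_b64(value))
```

Working on the integer avoids slicing a byte string whose byte order was chosen two lines earlier, which makes the intent (low word first) explicit.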

source:stackoverflow.com