Storing SHA1 Hash Values in MySQL: VARCHAR vs. BINARY
When storing the result of a SHA1 hash in a MySQL database, the appropriate field type and length must be carefully considered. Contrary to common practice, using a VARCHAR field for this purpose is inefficient.
VARCHAR vs. BINARY
A SHA1 hash is always 160 bits long. VARCHAR fields are designed for variable-length data, but they introduce an overhead of one byte for the length of the field. Therefore, using VARCHAR for a fixed-length value, such as a SHA1 hash, is unnecessary.
Binary fields, on the other hand, store binary data without any overhead. They are ideal for fixed-length values like SHA1 hashes.
Character Representation
SHA1 hashes are typically represented as hexadecimal strings, which use 4 bits per character. To store the hexadecimal representation of a SHA1 hash in a VARCHAR field, it would require 160/4 = 40 characters. However, binary fields use 8 bits per character, allowing the same hash to be stored in a 160/8 = 20-character field.
Recommended Approach
It is recommended to use BINARY(20) for storing SHA1 hash values in MySQL. This field type eliminates the overhead associated with VARCHAR and allows the hash to be stored efficiently. Additionally, the UNHEX function can be used to convert the hexadecimal representation of the hash to binary before storing it in the database.
Storage Comparison
A comparison of storage requirements reveals that BINARY(20) is more efficient than CHAR(40) for storing SHA1 hash values. For example, with millions of records, BINARY(20) requires approximately 44.56M of storage, while CHAR(40) requires 64.57M. This difference becomes even more pronounced with larger datasets.
The above is the detailed content of To Store a SHA1 Hash in MySQL: VARCHAR or BINARY?. For more information, please follow other related articles on the PHP Chinese website!