MySQL VARCHAR Lengths: Navigating the Confusion of Bytes vs. Characters
Within MySQL, the interpretation of VARCHAR(n) column lengths varies depending on the database version. In versions prior to 4.1, column lengths were defined in bytes. However, in MySQL 5 and later, lengths are interpreted in character units.
Byte Counts: MySQL Version 4
In MySQL Version 4, a VARCHAR(32) column could accommodate 32 bytes of data. Hence, storing multi-byte characters within this field would depend upon the specific character encoding, potentially requiring more bytes per character.
Character Counts: MySQL Version 5 and Later
Post MySQL Version 4, column lengths are determined by character count. A VARCHAR(32) column in UTF-8 allows for 32 characters. Multi-byte UTF-8 characters will still occupy multiple bytes in this case, but the column length calculation considers the number of characters, not bytes.
MySQL Documentation Confirmation
To clarify further, the official MySQL 5 documentation states: "MySQL interprets length specifications in character column definitions in character units ... This applies to CHAR, VARCHAR, and the TEXT types."
UTF-8 Impact on VARCHAR Maximum Length
Interestingly, the maximum length of a VARCHAR column is also affected by the character set. For instance, in MySQL 5.0.3 and later, utf8 characters can require up to three bytes per character. Therefore, a VARCHAR column using UTF-8 can have a maximum declared length of 21,844 characters (65,535 bytes / 3 bytes per character).
The above is the detailed content of How Do VARCHAR Lengths Change Between MySQL 4 and 5 ?. For more information, please follow other related articles on the PHP Chinese website!