Many database designs use utf8_general_ci rather than utf8_unicode_ci for Chinese text. The difference between the two is small: utf8_unicode_ci compares and sorts more accurately, while utf8_general_ci is faster. The difference shows up mainly in languages such as German and French. For Chinese, the accuracy of utf8_general_ci is sufficient, so it is the usual choice.
The detailed description is as follows:
The most important feature of utf8_unicode_ci is that it supports expansions, that is, cases where one character compares equal to a combination of other characters. For example, 'ß' is equal to 'ss' in German and some other languages.
utf8_general_ci is a legacy collation that does not support expansions. It can only compare character by character. As a result, comparisons under utf8_general_ci are faster, but less accurate, than comparisons under utf8_unicode_ci.
The difference between the two collation rules is that for utf8_general_ci the following equation holds:
ß = s
However, for utf8_unicode_ci the following equation holds:
ß = ss
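The two equations can be checked directly in a MySQL client. The following is a minimal sketch, assuming a server that still accepts the utf8 character set name (an alias for utf8mb3 in recent versions, where these names may raise deprecation warnings). The hex literal X'C39F' is the UTF-8 encoding of 'ß'; it is used here so the result does not depend on the client's connection character set. The column aliases are made up for illustration.

```sql
-- Compare 'ß' (UTF-8 bytes C3 9F) with 's' and 'ss' under each collation.
-- An explicit COLLATE clause on one operand decides the collation of the comparison.
SELECT
    _utf8 X'C39F' = _utf8's'  COLLATE utf8_general_ci AS general_ci_s,   -- 1: ß is treated as a single s
    _utf8 X'C39F' = _utf8'ss' COLLATE utf8_general_ci AS general_ci_ss,  -- 0: no expansion, one character vs two
    _utf8 X'C39F' = _utf8's'  COLLATE utf8_unicode_ci AS unicode_ci_s,   -- 0: ß expands to two characters
    _utf8 X'C39F' = _utf8'ss' COLLATE utf8_unicode_ci AS unicode_ci_ss;  -- 1: ß expands to ss
```

The same difference affects WHERE clauses, UNIQUE indexes, and ORDER BY on columns that use these collations, not only literal comparisons.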
So utf8_unicode_ci is more accurate for German and French, but that extra accuracy is not needed for Chinese.
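In practice the choice is made when a table or column is declared. The following is a rough sketch only: the table and column names are hypothetical, and it assumes the utf8 character set discussed in this article.

```sql
-- Hypothetical table holding Chinese text: the faster utf8_general_ci is used.
CREATE TABLE articles_zh (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200) NOT NULL
) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Hypothetical table holding German names: utf8_unicode_ci is used so that
-- expansions such as ß = ss are honored in comparisons and sorting.
CREATE TABLE customers_de (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    last_name VARCHAR(100) NOT NULL
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```

A collation can also be set on a single column, or overridden for one query with a COLLATE clause, so the two collations can coexist in one database.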