1 Introduction to MySQL Character Sets
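A MySQL server can support multiple character sets. Different character sets can be specified on the same server, in the same database, and even for different columns of the same table. Compared with other database management systems such as Oracle, where a single database can use only one character set, MySQL is clearly more flexible.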
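As a minimal sketch of this flexibility, the statements below create one table whose columns use different character sets. The database name charset_demo, the table name t_mixed, and the column names are hypothetical and chosen only for illustration. The sketch assumes the gbk, latin1, and utf8mb4 character sets are available on the server.

```sql
-- Hypothetical names throughout; adjust to your own environment.
CREATE DATABASE IF NOT EXISTS charset_demo DEFAULT CHARACTER SET utf8mb4;

CREATE TABLE charset_demo.t_mixed (
    id       INT PRIMARY KEY,
    name_gbk VARCHAR(50) CHARACTER SET gbk,     -- column-level override: GBK
    name_l1  VARCHAR(50) CHARACTER SET latin1,  -- column-level override: latin1
    name_def VARCHAR(50)                        -- no override: inherits the table default
) DEFAULT CHARACTER SET utf8mb4;

-- The Collation column of the output reveals each column's character set.
SHOW FULL COLUMNS FROM charset_demo.t_mixed;
```

A column without its own CHARACTER SET clause falls back to the table default, and a table without one falls back to the database default.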
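Character set support in MySQL involves two concepts: the character set (CHARACTER SET) and the collation (COLLATION). A character set defines how MySQL stores strings, while a collation defines how strings are compared, which settles questions of sorting and grouping. Character sets and collations have a one-to-many relationship: every character set has at least one collation. The MySQL version used in this article supports 39 character sets with nearly 200 collations.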
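The one-to-many relationship, and the effect a collation has on comparison, can be sketched as follows. The example assumes the utf8mb4 character set with its utf8mb4_general_ci and utf8mb4_bin collations is installed; any character set with a case-insensitive and a binary collation would illustrate the same point.

```sql
-- List every collation that belongs to a single character set.
SHOW COLLATION WHERE Charset = 'utf8mb4';

-- Compare the same two strings under two collations of utf8mb4.
-- The _utf8mb4 introducer fixes the character set of each literal, so the
-- COLLATE clause is valid whatever the connection character set is.
SELECT _utf8mb4'a' = _utf8mb4'A' COLLATE utf8mb4_general_ci AS ci_equal,   -- 1: case-insensitive
       _utf8mb4'a' = _utf8mb4'A' COLLATE utf8mb4_bin        AS bin_equal;  -- 0: compares code values
```

The stored bytes are identical in both comparisons; only the comparison rule changes.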
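In MySQL, the terms character set and encoding scheme are treated as synonyms: a character set is the combination of a conversion table and an encoding scheme.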
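Unicode (Universal Code) is a character encoding standard used on computers. It was created to overcome the limitations of traditional character encoding schemes: it assigns every character of every language a single, unique code, so that text can be converted and processed across languages and platforms. Unicode has several encoding schemes, including UTF-8, UTF-16, and UTF-32, where UTF stands for Unicode Transformation Format.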
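To make the difference between the encoding schemes concrete, the following sketch measures how many bytes one character occupies under each of them. It assumes the utf8mb4, utf16, and utf32 character sets are available. The character U+4E2D (中) is written as a hex literal so the result does not depend on the client's own encoding.

```sql
-- X'E4B8AD' is the UTF-8 byte sequence of U+4E2D.
SELECT CHAR_LENGTH(_utf8mb4 X'E4B8AD')                 AS chars,        -- 1 character
       LENGTH(_utf8mb4 X'E4B8AD')                      AS utf8_bytes,   -- 3 bytes
       LENGTH(CONVERT(_utf8mb4 X'E4B8AD' USING utf16)) AS utf16_bytes,  -- 2 bytes
       LENGTH(CONVERT(_utf8mb4 X'E4B8AD' USING utf32)) AS utf32_bytes;  -- 4 bytes
```

CHAR_LENGTH counts characters, while LENGTH counts bytes, which is why one character yields three different byte lengths.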
2 Viewing Character Sets and Collations
2.1 Viewing Character Sets
mysql> show character set;
mysql> select * from information_schema.character_sets;
mysql> select character_set_name, default_collate_name, description, maxlen from information_schema.character_sets;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| utf8mb4  | UTF-8 Unicode               | utf8mb4_general_ci  |      4 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| utf16    | UTF-16 Unicode              | utf16_general_ci    |      4 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| utf32    | UTF-32 Unicode              | utf32_general_ci    |      4 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+
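The Maxlen column gives the maximum number of bytes a single character can occupy in that character set. As a small sketch, the same information_schema table can be filtered to list only the multi-byte character sets; the WHERE and ORDER BY clauses are an addition here, not part of the original commands.

```sql
-- Show only character sets whose characters may need more than one byte.
SELECT character_set_name, default_collate_name, maxlen
FROM information_schema.character_sets
WHERE maxlen > 1
ORDER BY maxlen DESC, character_set_name;
```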
When reposting, please credit the source: http://blog.csdn.net/jesseyoung/article/details/36427677