HTML Basic Tutorial: Computer Encoding (Character Set)
Computer encoding (character set): overview
Why do character sets exist? Because computers can only process binary data. For a computer to handle the characters of human language (0-9, a-z, A-Z, special symbols), each character has to be "encoded". "Encoding" means that each character is represented by its own distinct binary number.
For example, suppose A is represented by the binary number 1000 and B by the binary number 1001.
ASCII encoding: stores each character in 1 byte (8 bits), which allows at most 2^8 = 256 values. Standard ASCII defines only the first 128 of them (7 bits).
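
As a minimal sketch of the idea above, the following Python snippet prints the real ASCII codes of a few characters as 8-bit binary numbers. Python is used here only for illustration, and the helper name `to_bits` is hypothetical, not something from this tutorial.

```python
def to_bits(ch: str) -> str:
    """Return the ASCII code of one character as an 8-bit binary string."""
    # encode() raises UnicodeEncodeError if the character is not in ASCII
    code = ch.encode("ascii")[0]
    return format(code, "08b")


for ch in "AB1":
    # ord() gives the numeric code, to_bits() the same value in binary
    print(ch, ord(ch), to_bits(ch))
```

Calling `to_bits` on a character outside ASCII, such as a Chinese character, raises an error, which is the limitation the encodings below were created to remove.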
ANSI encoding: other countries extended ASCII so that their own languages could be displayed. As a result, "ANSI" refers to a different encoding depending on the language of the operating system:
ANSI on a Simplified Chinese operating system means GB2312
ANSI on a Traditional Chinese operating system means Big5
ANSI on a Japanese operating system means JIS (Shift JIS)
......
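
To make the point concrete, here is a small sketch that encodes the same character with three of the encodings listed above. It assumes Python's standard codec names `gb2312`, `big5`, and `shift_jis`; the variable names are illustrative only.

```python
text = "中"  # a character that exists in all three character sets

results = {}
for codec in ("gb2312", "big5", "shift_jis"):
    data = text.encode(codec)
    results[codec] = data
    # Same character, same length, but different byte values per encoding
    print(codec, data.hex(), len(data), "bytes")
```

Because the byte values differ, a file saved as "ANSI" on one system can show garbled text when opened on a system that assumes a different encoding.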
GB2312 encoding: uses 2 bytes (16 bits) to represent each Chinese character, so in theory at most 2^16 = 65536 characters can be represented.
- GB2312 contains a total of 6763 Chinese characters.
GBK encoding: an extension of GB2312 that adds uncommon characters, rare characters, classical Chinese characters, and so on. It contains about 21,000 Chinese characters in total.
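
The difference in coverage can be checked with a short sketch. It assumes Python's built-in `gb2312` and `gbk` codecs, and the helper name `can_encode` is hypothetical. The traditional character "國" is used as a sample of a character that GB2312 lacks.

```python
def can_encode(ch: str, codec: str) -> bool:
    """Report whether a character exists in the given encoding."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for ch in ("国", "國"):
    # "国" is simplified; "國" is its traditional form
    print(ch, "gb2312:", can_encode(ch, "gb2312"), "gbk:", can_encode(ch, "gbk"))
```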
Unicode encoding: aims to encode every character in the world under one unified scheme. In its fixed-width 32-bit form (UTF-32), each character takes 4 bytes.
- For example: suppose the value 1 is stored in this 4-byte form. It becomes 00000000 00000000 00000000 00000001, where the first three bytes carry no information.
UTF-8: Unicode Transformation Format, 8-bit (a multi-language encoding)
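UTF-8 picks a suitable number of bytes for each character instead of using a fixed width.
- For example: the character 1 takes a single byte (8 bits), identical to its ASCII encoding.
- A Chinese character such as "国" ("country") takes 3 bytes in UTF-8 (it takes 2 bytes in GB2312 or GBK).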
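
The contrast between a fixed-width Unicode encoding and UTF-8 can be sketched as follows. This assumes Python's `utf-8` and `utf-32-be` codecs (big-endian UTF-32 without a byte-order mark); the helper name `byte_lengths` is hypothetical. In Python, "国" comes out as 3 bytes in UTF-8.

```python
def byte_lengths(ch: str) -> tuple[int, int]:
    """Return (UTF-8 length, UTF-32 length) in bytes for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))


for ch in ("1", "国"):
    utf8_len, utf32_len = byte_lengths(ch)
    # UTF-8 length varies per character; UTF-32 is always 4 bytes
    print(repr(ch), "UTF-8:", ch.encode("utf-8").hex(), utf8_len, "bytes,",
          "UTF-32:", ch.encode("utf-32-be").hex(), utf32_len, "bytes")
```

The single-byte result for "1" is also why plain ASCII text is already valid UTF-8.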