A space counts as one character. Characters include letters, digits, arithmetic symbols, punctuation marks, and other symbols, as well as some functional (control) symbols; when characters are stored in a computer, each one must be assigned a binary code that represents it.
Operating environment for this article: Windows 7, Dell G3 computer.
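1. A space occupies one character;
2. A Chinese character counts as one character in Unicode text, although byte-oriented tools that assume GB 2312 or GBK count it as 2 because it occupies 2 bytes;
3. A letter occupies one character;
4. In GB 2312 and GBK, one Chinese character occupies 2 bytes;
5. In UTF-8, one character occupies 1 to 4 bytes, and most Chinese characters occupy 3 bytes;
6. In Unicode's UTF-32 form, every character occupies 4 bytes, while in UTF-16 a character occupies 2 or 4 bytes;
7. The same character occupies a different number of bytes under different encodings, and within one encoding the byte count can depend on the code range the character falls in.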
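The distinction between counting characters and counting bytes in the list above can be checked with a minimal Python sketch. The helper name `sizes` and the sample characters are illustrative assumptions, not part of the original article.

```python
def sizes(ch: str) -> tuple[int, int, int]:
    """Return (character count, GBK byte count, UTF-8 byte count)."""
    return (
        len(ch),                  # number of Unicode code points
        len(ch.encode("gbk")),    # bytes when stored as GBK
        len(ch.encode("utf-8")),  # bytes when stored as UTF-8
    )

# A space, a Latin letter, and a common Chinese character.
for sample in (" ", "A", "中"):
    print(repr(sample), sizes(sample))
```

Each sample is a single character; only the byte count changes with the encoding.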
Related introduction:
Characters include letters, digits, arithmetic symbols, punctuation marks, and other symbols, as well as some functional (control) symbols. When characters are stored in a computer, each one must be assigned a binary code that represents it. The choice of codes should be consistent with the specifications of the relevant peripheral devices, such as the keyboard and console for input and output, and the printer for output. On input, characters are automatically converted into binary codes and stored in the machine; on output, the binary codes in the computer are automatically converted back into characters. The peripheral devices carry out the conversion in both directions.
A character is the smallest unit of data access in a data structure. A character is usually represented by 8 binary bits (one byte), although a few computer systems use 6-bit character representations. The size of the character set in a system is determined entirely by the system itself. Single-byte character sets generally provide 128 to 256 characters (excluding Chinese characters), and in such systems each character is converted into an 8-bit binary number when it enters the computer. Different computer systems and different languages have different character ranges.
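The conversion between a character and its 8-bit binary code can be illustrated with a short Python sketch. The function names `to_bits` and `from_bits` are hypothetical, and the sketch assumes single-byte (ASCII-range) characters; it shows the idea only, not how a peripheral device actually implements it.

```python
def to_bits(ch: str) -> str:
    """Return the code of a single character as an 8-bit binary string.

    Assumes the character's code fits in one byte (0 to 255).
    """
    code = ord(ch)
    if code > 255:
        raise ValueError("character does not fit in 8 bits")
    return format(code, "08b")

def from_bits(bits: str) -> str:
    """Convert a binary string back into the character it encodes."""
    return chr(int(bits, 2))

print(to_bits(" "))           # a space has code 32
print(to_bits("A"))           # the letter A has code 65
print(from_bits("01000001"))  # round trip back to a character
```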
In ASCII, one English letter requires 1 byte of storage. In GB 2312 or GBK, one Chinese character requires 2 bytes. In UTF-8, an English letter requires 1 byte and a Chinese character requires 3 to 4 bytes. In UTF-16, an English letter or a Chinese character requires 2 bytes (some Chinese characters in the Unicode extension areas require 4 bytes). In UTF-32, every character requires 4 bytes.
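These byte counts can be reproduced with a small Python sketch. The helper `byte_len` and the sample characters are illustrative assumptions; the `-le` codec variants are used so that Python does not prepend a byte order mark, which would otherwise inflate the UTF-16 and UTF-32 counts.

```python
def byte_len(ch: str, encoding: str) -> int:
    """Return how many bytes one character occupies in the given encoding."""
    return len(ch.encode(encoding))

letter = "A"
hanzi = "中"                # a common Chinese character (U+4E2D)
ext_hanzi = "\U00020000"   # a character from CJK Extension B

for name, ch in (("letter", letter), ("hanzi", hanzi), ("ext_hanzi", ext_hanzi)):
    counts = {enc: byte_len(ch, enc) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
    print(name, counts)

# Legacy encodings cover only part of Unicode, so they are checked separately.
print("letter in ascii:", byte_len(letter, "ascii"))
print("hanzi in gbk:", byte_len(hanzi, "gbk"))
```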