
How many bytes does the internal code of a Chinese character require?

Dec 14, 2020 pm 05:45 PM
Chinese character Internal code

The internal code of a Chinese character requires 2 bytes to store. In the Chinese character systems widely used in China, each character's internal code occupies 2 bytes. Because a Chinese character processing system must remain compatible with Western text, ambiguity would arise if ASCII codes and the national standard codes of Chinese characters coexisted unchanged in the same system. To avoid this, the internal code is obtained by appropriately processing and transforming the national standard code (typically by setting the highest bit of each byte to 1).


Operating environment for this article: Windows 10, ThinkPad T480.

How many bytes are needed to store the internal code of a Chinese character?

The internal code of a Chinese character requires 2 bytes to store.
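The two-byte claim can be checked with a minimal sketch. It assumes that the bytes produced by Python's built-in `gb2312` codec correspond to the internal code discussed here; the helper name `internal_code_length` is hypothetical and introduced only for illustration.

```python
def internal_code_length(ch: str) -> int:
    """Return how many bytes one character occupies when encoded with
    Python's gb2312 codec (assumed here to match the internal code)."""
    return len(ch.encode("gb2312"))


if __name__ == "__main__":
    # A Chinese character takes two bytes; an ASCII letter takes one.
    for ch in ("中", "A"):
        print(ch, internal_code_length(ch))
```

Because ASCII characters stay at one byte while Chinese characters take two, both can appear in the same byte stream.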

China's National Bureau of Standards issued the "Chinese Coded Character Set for Information Exchange: Basic Set", designated GB2312-80, which took effect in May 1981. It encodes a total of 6763 Chinese characters and 682 graphic characters, and its encoding principle is that each Chinese character is represented by two bytes.

In principle, two bytes can represent 256×256=65536 different symbols, which is more than enough to serve as the basis for encoding Chinese characters. However, to keep Chinese character encoding consistent with other internationally used encodings, such as the ASCII encoding of Western characters, China's National Bureau of Standards adopted a modified two-byte scheme that uses only the lower 7 bits of each of the two bytes.
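Since the national standard code uses only the lower 7 bits of each byte, its byte values overlap with ASCII. The usual way to resolve this is to set the highest bit of each byte when forming the internal code. The sketch below illustrates that conversion; the function names `gb_to_internal` and `internal_to_gb` are hypothetical, and the comparison with Python's `gb2312` codec assumes that the codec's output matches the internal code.

```python
def gb_to_internal(gb_code: int) -> bytes:
    """Turn a 16-bit national standard code into a two-byte internal code
    by setting the highest bit of each byte (adding 0x80 to each byte)."""
    high, low = gb_code >> 8, gb_code & 0xFF
    return bytes([high | 0x80, low | 0x80])


def internal_to_gb(internal: bytes) -> int:
    """Reverse the conversion by clearing the highest bit of each byte."""
    return ((internal[0] & 0x7F) << 8) | (internal[1] & 0x7F)


if __name__ == "__main__":
    # "啊" has national standard code 0x3021.
    internal = gb_to_internal(0x3021)
    print(internal.hex())                     # b0a1
    print(internal == "啊".encode("gb2312"))  # True
    print(hex(internal_to_gb(internal)))      # 0x3021
```

With the high bit set, every byte of a Chinese character is 0x80 or greater, so it cannot be mistaken for a 7-bit ASCII byte.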

This scheme could accommodate 128×128=16384 different Chinese characters. However, to remain compatible with standard ASCII, the 32 control codes, the space (code value 32), and the delete code (code value 127) cannot be used in either byte. That leaves only 94 usable codes per byte, so the number of characters that two 7-bit bytes can actually represent is 94×94=8836.
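The arithmetic above can be reproduced with a short sketch. The variable names are illustrative only; the excluded values (control codes 0 to 31, the space at 32, and the delete code at 127) follow the description in the text.

```python
# All 7-bit values, minus the control codes (0-31), the space (32),
# and the delete code (127).
usable = [v for v in range(128) if v > 32 and v != 127]

codes_per_byte = len(usable)      # 94
total_codes = codes_per_byte ** 2  # 94 * 94

if __name__ == "__main__":
    print(usable[0], usable[-1])  # 33 126
    print(codes_per_byte)         # 94
    print(total_codes)            # 8836
```

The 8836 available code points comfortably cover the 6763 Chinese characters and 682 graphic characters (7445 in total) defined by GB2312-80.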



