How many bytes does one ASCII character occupy?
One ASCII character occupies 1 byte. In a computer, ASCII characters are encoded as 7-bit or 8-bit binary values and stored in a single byte, so one ASCII code occupies one byte. ASCII is divided into standard ASCII and extended ASCII. Standard ASCII, also called basic ASCII, uses 7 binary digits (the remaining bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English.
The operating environment of this tutorial: Windows 7 system, Dell G3 computer.
ASCII (American Standard Code for Information Interchange) is a character encoding based on the Latin alphabet, mainly used to represent modern English and other Western European languages.
ASCII uses 7-bit or 8-bit binary combinations to represent 128 (2^7) or 256 (2^8) possible characters.
In a computer, each ASCII character is encoded as a 7-bit or 8-bit binary value and stored in one byte, so one ASCII code occupies one byte.
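The one-byte claim can be checked directly. The following is a minimal Python sketch, not part of the original tutorial; the helper name `ascii_byte_length` is hypothetical and chosen only for illustration.

```python
def ascii_byte_length(text: str) -> int:
    """Return the number of bytes needed to store text as ASCII."""
    # str.encode("ascii") raises UnicodeEncodeError if any character
    # falls outside the standard 0..127 range.
    return len(text.encode("ascii"))


print(ascii_byte_length("A"))      # 1
print(ascii_byte_length("Hello"))  # 5
print(2 ** 7, 2 ** 8)              # 128 256
```

Each character contributes exactly one byte, so the byte length of an ASCII string equals its character count.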
ASCII code can be divided into standard ASCII code and extended ASCII code.
Standard ASCII, also called basic ASCII, uses 7 binary digits (the remaining bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English. Among them:
- 0~31 and 127 (33 in total) are control characters or special communication characters (the rest are displayable characters).
- Examples of control characters: LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), BEL (bell), etc.
- Examples of special communication characters: SOH (start of heading), EOT (end of transmission), ACK (acknowledge), etc.
- ASCII values 8, 9, 10 and 13 correspond to the backspace, tab, line feed and carriage return characters respectively. They have no graphic representation of their own, but affect how text is displayed in ways that depend on the application.
- 32~126 (95 in total) are printable characters (32 is the space), of which 48~57 are the ten Arabic numerals 0 to 9.
- 65~90 are the 26 uppercase English letters, 97~122 are the 26 lowercase English letters, and the rest are punctuation marks, arithmetic symbols, etc.
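The ranges listed above can be expressed as a small classifier. This is an illustrative Python sketch; the function name `classify` and the category labels are assumptions made for the example, not terms from the ASCII standard.

```python
def classify(code: int) -> str:
    """Classify a standard ASCII code (0..127) using the ranges above."""
    if not 0 <= code <= 127:
        raise ValueError("standard ASCII codes are 0..127")
    if code <= 31 or code == 127:
        return "control"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "symbol"  # punctuation, arithmetic symbols, etc.


# Count how many codes fall into the control and printable groups.
controls = sum(1 for c in range(128) if classify(c) == "control")
printable = 128 - controls
print(controls, printable)  # 33 95
```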
Also note that in standard ASCII the highest bit (b7) can be used as a parity bit. Parity checking is a method of detecting errors that occur during code transmission, and comes in two forms: odd parity and even parity. Odd parity: the number of 1s in a correctly transmitted byte must be odd; if it is not, the highest bit b7 is set to 1. Even parity: the number of 1s in a correctly transmitted byte must be even; if it is not, the highest bit b7 is set to 1.
The upper 128 codes (128~255) are called extended ASCII.
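The parity rule can be sketched in a few lines of Python. The function name `add_parity` and its `odd` flag are hypothetical, introduced only for this example; the logic follows the odd/even rules described above.

```python
def add_parity(code: int, odd: bool = False) -> int:
    """Return the 8-bit value of a 7-bit ASCII code with parity bit b7 set as needed."""
    if not 0 <= code <= 0x7F:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")
    if odd:
        # Odd parity: the total number of 1s must be odd.
        set_b7 = ones % 2 == 0
    else:
        # Even parity: the total number of 1s must be even.
        set_b7 = ones % 2 == 1
    return code | 0x80 if set_b7 else code


# "A" is 0100 0001 (two 1s); "C" is 0100 0011 (three 1s).
print(format(add_parity(ord("A")), "08b"))            # 01000001
print(format(add_parity(ord("A"), odd=True), "08b"))  # 11000001
print(format(add_parity(ord("C")), "08b"))            # 11000011
```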
Many x86-based systems support the use of extended (or "high") ASCII. Extended ASCII allows the 8th bit of each character to be used to determine an additional 128 special symbol characters, foreign letters, and graphic symbols.
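Because the meaning of codes 128~255 depends on which extended character set is in use, the same byte can decode to different characters. The Python sketch below illustrates this with two codecs from the standard library, `cp437` and `latin-1`; these two code pages are picked here only as examples and are not named in the original text.

```python
def decode_byte(value: int, encoding: str) -> str:
    """Decode a single byte value (0..255) using the given encoding."""
    return bytes([value]).decode(encoding)


high = 0xE9  # 233, above the standard ASCII range

# Standard ASCII rejects any byte with the 8th bit set.
try:
    decode_byte(high, "ascii")
except UnicodeDecodeError:
    print("0xE9 is not standard ASCII")

# Two extended sets assign different characters to the same byte.
print(decode_byte(high, "latin-1"))
print(decode_byte(high, "cp437"))

# Codes 0..127 decode identically in all three.
print(decode_byte(0x41, "ascii"), decode_byte(0x41, "latin-1"), decode_byte(0x41, "cp437"))
```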
The standard ASCII table is as follows:

| Bin (binary) | Oct (octal) | Dec (decimal) | Hex (hexadecimal) | Abbreviation/Character | Explanation |
|---|---|---|---|---|---|
| 0000 0000 | 00 | 0 | 0x00 | NUL (null) | Null character |
| 0000 0001 | 01 | 1 | 0x01 | SOH (start of heading) | Start of heading |
| 0000 0010 | 02 | 2 | 0x02 | STX (start of text) | Start of text |
| 0000 0011 | 03 | 3 | 0x03 | ETX (end of text) | End of text |
| 0000 0100 | 04 | 4 | 0x04 | EOT (end of transmission) | End of transmission |
| 0000 0101 | 05 | 5 | 0x05 | ENQ (enquiry) | Enquiry |
| 0000 0110 | 06 | 6 | 0x06 | ACK (acknowledge) | Acknowledge |
| 0000 0111 | 07 | 7 | 0x07 | BEL (bell) | Bell |
| 0000 1000 | 010 | 8 | 0x08 | BS (backspace) | Backspace |
| 0000 1001 | 011 | 9 | 0x09 | HT (horizontal tab) | Horizontal tab |
| 0000 1010 | 012 | 10 | 0x0A | LF (NL line feed, new line) | Line feed |
| 0000 1011 | 013 | 11 | 0x0B | VT (vertical tab) | Vertical tab |
| 0000 1100 | 014 | 12 | 0x0C | FF (NP form feed, new page) | Form feed |
| 0000 1101 | 015 | 13 | 0x0D | CR (carriage return) | Carriage return |
| 0000 1110 | 016 | 14 | 0x0E | SO (shift out) | Shift out |
| 0000 1111 | 017 | 15 | 0x0F | SI (shift in) | Shift in |
| 0001 0000 | 020 | 16 | 0x10 | DLE (data link escape) | Data link escape |
| 0001 0001 | 021 | 17 | 0x11 | DC1 (device control 1) | Device control 1 |
| 0001 0010 | 022 | 18 | 0x12 | DC2 (device control 2) | Device control 2 |
| 0001 0011 | 023 | 19 | 0x13 | DC3 (device control 3) | Device control 3 |
| 0001 0100 | 024 | 20 | 0x14 | DC4 (device control 4) | Device control 4 |
| 0001 0101 | 025 | 21 | 0x15 | NAK (negative acknowledge) | Negative acknowledge |
| 0001 0110 | 026 | 22 | 0x16 | SYN (synchronous idle) | Synchronous idle |
| 0001 0111 | 027 | 23 | 0x17 | ETB (end of trans. block) | End of transmission block |
| 0001 1000 | 030 | 24 | 0x18 | CAN (cancel) | Cancel |
| 0001 1001 | 031 | 25 | 0x19 | EM (end of medium) | End of medium |
| 0001 1010 | 032 | 26 | 0x1A | SUB (substitute) | Substitute |
| 0001 1011 | 033 | 27 | 0x1B | ESC (escape) | Escape |
| 0001 1100 | 034 | 28 | 0x1C | FS (file separator) | File separator |
| 0001 1101 | 035 | 29 | 0x1D | GS (group separator) | Group separator |
| 0001 1110 | 036 | 30 | 0x1E | RS (record separator) | Record separator |
| 0001 1111 | 037 | 31 | 0x1F | US (unit separator) | Unit separator |
| 0010 0000 | 040 | 32 | 0x20 | (space) | Space |
| 0010 0001 | 041 | 33 | 0x21 | ! | Exclamation mark |
| 0010 0010 | 042 | 34 | 0x22 | " | Double quote |
| 0010 0011 | 043 | 35 | 0x23 | # | Number sign |
| 0010 0100 | 044 | 36 | 0x24 | $ | Dollar sign |
| 0010 0101 | 045 | 37 | 0x25 | % | Percent sign |
| 0010 0110 | 046 | 38 | 0x26 | & | Ampersand |
| 0010 0111 | 047 | 39 | 0x27 | ' | Apostrophe (single quote) |
| 0010 1000 | 050 | 40 | 0x28 | ( | Opening parenthesis |
| 0010 1001 | 051 | 41 | 0x29 | ) | Closing parenthesis |
| 0010 1010 | 052 | 42 | 0x2A | * | Asterisk |
| 0010 1011 | 053 | 43 | 0x2B | + | Plus sign |
| 0010 1100 | 054 | 44 | 0x2C | , | Comma |
| 0010 1101 | 055 | 45 | 0x2D | - | Minus sign/dash |
| 0010 1110 | 056 | 46 | 0x2E | . | Period |
| 0010 1111 | 057 | 47 | 0x2F | / | Slash |
| 0011 0000 | 060 | 48 | 0x30 | 0 | Character 0 |
| 0011 0001 | 061 | 49 | 0x31 | 1 | Character 1 |
| 0011 0010 | 062 | 50 | 0x32 | 2 | Character 2 |
| 0011 0011 | 063 | 51 | 0x33 | 3 | Character 3 |
| 0011 0100 | 064 | 52 | 0x34 | 4 | Character 4 |
| 0011 0101 | 065 | 53 | 0x35 | 5 | Character 5 |
| 0011 0110 | 066 | 54 | 0x36 | 6 | Character 6 |
| 0011 0111 | 067 | 55 | 0x37 | 7 | Character 7 |
| 0011 1000 | 070 | 56 | 0x38 | 8 | Character 8 |
| 0011 1001 | 071 | 57 | 0x39 | 9 | Character 9 |
| 0011 1010 | 072 | 58 | 0x3A | : | Colon |
| 0011 1011 | 073 | 59 | 0x3B | ; | Semicolon |
| 0011 1100 | 074 | 60 | 0x3C | < | Less-than sign |
| 0011 1101 | 075 | 61 | 0x3D | = | Equals sign |
| 0011 1110 | 076 | 62 | 0x3E | > | Greater-than sign |
| 0011 1111 | 077 | 63 | 0x3F | ? | Question mark |
| 0100 0000 | 0100 | 64 | 0x40 | @ | At sign (email symbol) |
| 0100 0001 | 0101 | 65 | 0x41 | A | Uppercase letter A |
| 0100 0010 | 0102 | 66 | 0x42 | B | Uppercase letter B |
| 0100 0011 | 0103 | 67 | 0x43 | C | Uppercase letter C |
| 0100 0100 | 0104 | 68 | 0x44 | D | Uppercase letter D |
| 0100 0101 | 0105 | 69 | 0x45 | E | Uppercase letter E |
| 0100 0110 | 0106 | 70 | 0x46 | F | Uppercase letter F |
| 0100 0111 | 0107 | 71 | 0x47 | G | Uppercase letter G |
| 0100 1000 | 0110 | 72 | 0x48 | H | Uppercase letter H |
| 0100 1001 | 0111 | 73 | 0x49 | I | Uppercase letter I |
| 0100 1010 | 0112 | 74 | 0x4A | J | Uppercase letter J |
| 0100 1011 | 0113 | 75 | 0x4B | K | Uppercase letter K |
| 0100 1100 | 0114 | 76 | 0x4C | L | Uppercase letter L |
| 0100 1101 | 0115 | 77 | 0x4D | M | Uppercase letter M |
| 0100 1110 | 0116 | 78 | 0x4E | N | Uppercase letter N |
| 0100 1111 | 0117 | 79 | 0x4F | O | Uppercase letter O |
| 0101 0000 | 0120 | 80 | 0x50 | P | Uppercase letter P |
| 0101 0001 | 0121 | 81 | 0x51 | Q | Uppercase letter Q |
| 0101 0010 | 0122 | 82 | 0x52 | R | Uppercase letter R |
| 0101 0011 | 0123 | 83 | 0x53 | S | Uppercase letter S |
| 0101 0100 | 0124 | 84 | 0x54 | T | Uppercase letter T |
| 0101 0101 | 0125 | 85 | 0x55 | U | Uppercase letter U |
| 0101 0110 | 0126 | 86 | 0x56 | V | Uppercase letter V |
| 0101 0111 | 0127 | 87 | 0x57 | W | Uppercase letter W |
| 0101 1000 | 0130 | 88 | 0x58 | X | Uppercase letter X |
| 0101 1001 | 0131 | 89 | 0x59 | Y | Uppercase letter Y |
| 0101 1010 | 0132 | 90 | 0x5A | Z | Uppercase letter Z |
| 0101 1011 | 0133 | 91 | 0x5B | [ | Opening square bracket |
| 0101 1100 | 0134 | 92 | 0x5C | \ | Backslash |
| 0101 1101 | 0135 | 93 | 0x5D | ] | Closing square bracket |
| 0101 1110 | 0136 | 94 | 0x5E | ^ | Caret |
| 0101 1111 | 0137 | 95 | 0x5F | _ | Underscore |
| 0110 0000 | 0140 | 96 | 0x60 | ` | Opening single quote (grave accent) |
| 0110 0001 | 0141 | 97 | 0x61 | a | Lowercase letter a |
| 0110 0010 | 0142 | 98 | 0x62 | b | Lowercase letter b |
| 0110 0011 | 0143 | 99 | 0x63 | c | Lowercase letter c |
| 0110 0100 | 0144 | 100 | 0x64 | d | Lowercase letter d |
| 0110 0101 | 0145 | 101 | 0x65 | e | Lowercase letter e |
| 0110 0110 | 0146 | 102 | 0x66 | f | Lowercase letter f |
| 0110 0111 | 0147 | 103 | 0x67 | g | Lowercase letter g |
| 0110 1000 | 0150 | 104 | 0x68 | h | Lowercase letter h |
| 0110 1001 | 0151 | 105 | 0x69 | i | Lowercase letter i |
| 0110 1010 | 0152 | 106 | 0x6A | j | Lowercase letter j |
| 0110 1011 | 0153 | 107 | 0x6B | k | Lowercase letter k |
| 0110 1100 | 0154 | 108 | 0x6C | l | Lowercase letter l |
| 0110 1101 | 0155 | 109 | 0x6D | m | Lowercase letter m |
| 0110 1110 | 0156 | 110 | 0x6E | n | Lowercase letter n |
| 0110 1111 | 0157 | 111 | 0x6F | o | Lowercase letter o |
| 0111 0000 | 0160 | 112 | 0x70 | p | Lowercase letter p |
| 0111 0001 | 0161 | 113 | 0x71 | q | Lowercase letter q |
| 0111 0010 | 0162 | 114 | 0x72 | r | Lowercase letter r |
| 0111 0011 | 0163 | 115 | 0x73 | s | Lowercase letter s |
| 0111 0100 | 0164 | 116 | 0x74 | t | Lowercase letter t |
| 0111 0101 | 0165 | 117 | 0x75 | u | Lowercase letter u |
| 0111 0110 | 0166 | 118 | 0x76 | v | Lowercase letter v |
| 0111 0111 | 0167 | 119 | 0x77 | w | Lowercase letter w |
| 0111 1000 | 0170 | 120 | 0x78 | x | Lowercase letter x |
| 0111 1001 | 0171 | 121 | 0x79 | y | Lowercase letter y |
| 0111 1010 | 0172 | 122 | 0x7A | z | Lowercase letter z |
| 0111 1011 | 0173 | 123 | 0x7B | { | Opening curly brace |
| 0111 1100 | 0174 | 124 | 0x7C | \| | Vertical bar |
| 0111 1101 | 0175 | 125 | 0x7D | } | Closing curly brace |
| 0111 1110 | 0176 | 126 | 0x7E | ~ | Tilde |
| 0111 1111 | 0177 | 127 | 0x7F | DEL (delete) | Delete |
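The four numeric columns of the table are just the same code written in different bases, so any row can be regenerated rather than memorized. A minimal Python sketch, assuming the table's conventions (a leading `0` for octal and `0x` for hexadecimal); the helper name `table_row` is hypothetical.

```python
def table_row(code: int) -> str:
    """Format one ASCII code as 'Bin | Oct | Dec | Hex' in the table's style."""
    if not 0 <= code <= 127:
        raise ValueError("standard ASCII codes are 0..127")
    bits = format(code, "08b")
    binary = bits[:4] + " " + bits[4:]    # group the byte into two nibbles
    octal = "0" + format(code, "o")       # leading 0 marks octal
    hexa = "0x" + format(code, "02X")     # two uppercase hex digits
    return f"{binary} | {octal} | {code} | {hexa}"


print(table_row(ord("A")))  # 0100 0001 | 0101 | 65 | 0x41
print(table_row(13))        # 0000 1101 | 015 | 13 | 0x0D
```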
Size rules
- The letter A is smaller than the letter Z, and the codes increase in order from A to Z. For example, "A" < "Z".
- The uppercase form of a letter is 32 smaller than its lowercase form. For example, "A" < "a".
- The ASCII codes of several common characters: "A" is 65; "a" is 97; "0" is 48.
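These ordering rules are easy to verify with Python's built-in `ord` and `chr`. The sketch below also shows how the fixed offset of 32 gives a simple case conversion; the helper name `to_lowercase` is hypothetical and handles only the 26 ASCII uppercase letters.

```python
def to_lowercase(ch: str) -> str:
    """Convert one ASCII uppercase letter to lowercase by adding 32."""
    if len(ch) == 1 and "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch  # anything else is returned unchanged


print(ord("A"), ord("a"), ord("0"))  # 65 97 48
print(ord("A") < ord("Z"))           # True
print(ord("a") - ord("A"))           # 32
print(to_lowercase("Q"))             # q
```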