Each element of a string is called a "character". You obtain characters either by traversing the string or by accessing an individual element.
Go has two character types:
One is the uint8 type, also called byte, which represents a single ASCII character.
The other is the rune type, which represents a Unicode code point (stored as UTF-8 inside strings). When you need to process Chinese, Japanese, or other multi-byte characters, you need to use the rune type. The rune type is an alias for int32.
The byte type is an alias for uint8, which is perfectly adequate for traditional ASCII characters that occupy only 1 byte, for example var ch byte = 'A'. Character literals are enclosed in single quotes.
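The difference between the two types shows up as soon as a string contains non-ASCII text. The following is a minimal sketch, not taken from the original article; the helper name byteAndRuneCount and the sample string are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper that returns the length of s
// in bytes and in runes (Unicode code points).
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go语言"

	// Indexing a string yields a single byte (uint8).
	var b byte = s[0]
	fmt.Printf("%c %d\n", b, b) // G 71

	// Ranging over a string yields runes (int32) decoded from UTF-8,
	// together with the byte offset where each rune starts.
	for i, r := range s {
		fmt.Printf("byte offset %d: %c (%U)\n", i, r, r)
	}

	nBytes, nRunes := byteAndRuneCount(s)
	fmt.Println(nBytes, nRunes) // 8 4
}
```

Each of the two Chinese characters occupies 3 bytes in UTF-8, which is why the byte count and the rune count differ.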
In the ASCII table, the value of A is 65 in decimal and 41 in hexadecimal, so the following forms are equivalent:
var ch byte = 65
or
var ch byte = '\x41' // \x is always followed by exactly 2 hexadecimal digits
Another possible notation is \ followed by a 3-digit octal number, for example \377.
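As a quick check of these notations, the sketch below compares the decimal, character, hexadecimal, and octal spellings of the same byte. The function name sameByteLiterals is an assumption introduced only for this example.

```go
package main

import "fmt"

// sameByteLiterals is a hypothetical helper that reports whether the four
// spellings of the letter A denote the same byte value.
func sameByteLiterals() bool {
	var a byte = 65
	var b byte = 'A'
	var c byte = '\x41' // \x followed by exactly 2 hexadecimal digits
	var d byte = '\101' // \ followed by exactly 3 octal digits (octal 101 is 65)
	return a == b && b == c && c == d
}

func main() {
	fmt.Println(sameByteLiterals()) // true

	// Octal 377 is 255, the largest value a byte can hold.
	var max byte = '\377'
	fmt.Println(max) // 255
}
```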
Go also supports Unicode (UTF-8), so characters are also called Unicode code points or runes, and they are stored in memory as integers. In documentation, the format U+hhhh is generally used, where h represents a hexadecimal digit.
When writing a Unicode character in a literal, add the prefix \u or \U before the hexadecimal number. Because a code point may not fit in a single byte, use the rune (int32) or int type to hold it. The \u prefix is always followed by exactly 4 hexadecimal digits, and the \U prefix is always followed by exactly 8 hexadecimal digits.
package main

import "fmt"

func main() {
	var ch int = '\u0041'
	var ch2 int = '\u03B2'
	var ch3 int = '\U00101234'
	fmt.Printf("%d - %d - %d\n", ch, ch2, ch3) // integer value
	fmt.Printf("%c - %c - %c\n", ch, ch2, ch3) // character
	fmt.Printf("%X - %X - %X\n", ch, ch2, ch3) // code point in hexadecimal
	fmt.Printf("%U - %U - %U", ch, ch2, ch3)   // code point in U+hhhh format
}
Output:
65 - 946 - 1053236
A - β - ? (U+101234 is a private-use code point, so the glyph shown depends on the font)
41 - 3B2 - 101234
U+0041 - U+03B2 - U+101234
The format specifier %c prints the character itself. When applied to a character, %v or %d prints the integer that represents it, and %X prints that integer in hexadecimal. %U prints a string in the format U+hhhh.
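Note that %X applied to a rune prints the code point, not its UTF-8 encoding. The sketch below contrasts the two; the helper names describe and utf8Hex are assumptions made for this illustration.

```go
package main

import "fmt"

// describe is a hypothetical helper that formats r with the verbs
// %c, %d, %X, and %U.
func describe(r rune) string {
	return fmt.Sprintf("%c %d %X %U", r, r, r, r)
}

// utf8Hex is a hypothetical helper that returns the UTF-8 encoding of r
// as space-separated hexadecimal bytes.
func utf8Hex(r rune) string {
	return fmt.Sprintf("% X", []byte(string(r)))
}

func main() {
	fmt.Println(describe('β')) // β 946 3B2 U+03B2
	fmt.Println(utf8Hex('β'))  // CE B2
}
```

The code point of β is 0x3B2, while its UTF-8 encoding is the two bytes CE B2.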
The unicode package has built-in functions for testing characters. Each of these functions returns a Boolean value, as shown below (where ch is a rune):
Is it a letter: unicode.IsLetter(ch)
Is it a digit: unicode.IsDigit(ch)
Is it a whitespace character: unicode.IsSpace(ch)
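These functions can be combined to classify each rune in a string. The following is a minimal sketch; the classify function and its labels are assumptions for illustration, not part of the unicode package.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify is a hypothetical helper that returns a short label for r
// based on the unicode package's test functions.
func classify(r rune) string {
	switch {
	case unicode.IsLetter(r):
		return "letter"
	case unicode.IsDigit(r):
		return "digit"
	case unicode.IsSpace(r):
		return "space"
	default:
		return "other"
	}
}

func main() {
	// Ranging over the string yields runes, so multi-byte characters
	// such as 语 are classified as a whole.
	for _, r := range "a1 语!" {
		fmt.Printf("%q: %s\n", r, classify(r))
	}
}
```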