There are two character types in the Go language: 1. the byte type, an alias of uint8, which holds a single byte and is enough for an ASCII character; 2. the rune type, which holds a Unicode code point (encoded as UTF-8 inside strings). When you need to process Chinese, Japanese or other multi-byte characters, you need to use the rune type. The rune type is an alias of the int32 type.
The operating environment of this tutorial: Windows 10 system, Go 1.11.2, Dell G3 computer.
Each element of a string is called a "character", and characters are what you get when you iterate over a string or read a single element from it.
There are two character types in the Go language:
One is the uint8 type, also called the byte type, which holds a single byte and can represent one ASCII character.
The other is the rune type, which holds a Unicode code point (encoded as UTF-8 inside strings). When you need to process Chinese, Japanese or other multi-byte characters, you need to use the rune type. The rune type is an alias of the int32 type.
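The difference between the two types shows up as soon as a string contains non-ASCII text. The following is a minimal sketch, not taken from the original tutorial: the helper byteAndRuneCounts and the sample string "Go语言" are hypothetical names chosen for illustration. It shows that indexing a string yields a byte, while ranging over it or converting it to []rune yields runes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (hypothetical helper) returns the number of bytes and
// the number of runes (Unicode code points) in s.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go语言" // sample text: 2 ASCII letters followed by 2 Chinese characters

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 8 4

	// Indexing a string yields a single byte (uint8), not a whole character.
	var b byte = s[2]
	fmt.Printf("%T %d\n", b, b) // uint8 232

	// Ranging over a string yields runes (int32) and their starting byte offsets.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()

	// Converting to []rune makes each character individually addressable.
	rs := []rune(s)
	fmt.Printf("%c\n", rs[2]) // 语
}
```

Each Chinese character here takes 3 bytes in UTF-8, which is why the byte count (8) and the rune count (4) differ.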
The byte type is an alias of uint8. It works without any problem for traditional ASCII characters, which occupy only 1 byte. For example, var ch byte = 'A'; a character literal is enclosed in single quotes.
In the ASCII code table, the value of A is 65, or 41 in hexadecimal notation, so the following forms are equivalent:
var ch byte = 65 or var ch byte = '\x41' // \x is always followed by exactly 2 hexadecimal digits
Another possible form is \ followed by an octal number of exactly 3 digits, for example \377.
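As a quick check of the equivalences described above, the sketch below compares the decimal, hexadecimal and octal spellings of the same byte. The function name sameByteLiterals is a hypothetical helper introduced only for this illustration.

```go
package main

import "fmt"

// sameByteLiterals (hypothetical helper) reports whether four different
// spellings of the character 'A' all denote the same byte value.
func sameByteLiterals() bool {
	var a byte = 'A'    // character literal
	var b byte = 65     // decimal value
	var c byte = '\x41' // \x followed by exactly 2 hexadecimal digits
	var d byte = '\101' // \ followed by exactly 3 octal digits (octal 101 = 65)
	return a == b && b == c && c == d
}

func main() {
	fmt.Println(sameByteLiterals()) // true

	// \377 is the largest octal escape: octal 377 = 255, the maximum byte value.
	var max byte = '\377'
	fmt.Println(max) // 255
}
```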
The Go language also supports Unicode (UTF-8), so characters are also called Unicode code points or runes, and are represented in memory as an int32 (the rune type). In documentation, the format U+hhhh is generally used, where h represents a hexadecimal digit.
When writing a Unicode character in source code, you need to add the prefix \u or \U before the hexadecimal digits. The \u prefix is always followed by exactly 4 hexadecimal digits, and the \U prefix by exactly 8. Because code points above U+FFFF do not fit in 16 bits, store them in a rune (int32) or an int, not an int16.
var ch int = '\u0041'
var ch2 int = '\u03B2'
var ch3 int = '\U00101234'
fmt.Printf("%d - %d - %d\n", ch, ch2, ch3) // integer value
fmt.Printf("%c - %c - %c\n", ch, ch2, ch3) // character
fmt.Printf("%X - %X - %X\n", ch, ch2, ch3) // code point in hexadecimal
fmt.Printf("%U - %U - %U", ch, ch2, ch3)   // code point in U+hhhh notation
Output:
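65 - 946 - 1053236
A - β - r
41 - 3B2 - 101234
U+0041 - U+03B2 - U+101234
(U+101234 lies in a private-use area, so the third character on the second line, shown here as r, depends on the font and terminal in use.)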
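Note that %X applied to an integer prints the code point in hexadecimal, not the bytes of its UTF-8 encoding. To see the encoded bytes, one option is to convert the rune to a string first. The following is a sketch under that assumption; utf8Bytes is a hypothetical helper name, not something from the original tutorial.

```go
package main

import "fmt"

// utf8Bytes (hypothetical helper) returns the UTF-8 encoding of r
// as a byte slice.
func utf8Bytes(r rune) []byte {
	return []byte(string(r))
}

func main() {
	for _, r := range []rune{'\u0041', '\u03B2', '\U00101234'} {
		// %U prints the code point; % X prints each encoded byte in hex,
		// separated by spaces.
		fmt.Printf("%U -> % X\n", r, utf8Bytes(r))
	}
	// U+0041 -> 41
	// U+03B2 -> CE B2
	// U+101234 -> F4 81 88 B4
}
```

The three code points take 1, 2 and 4 bytes respectively, which illustrates why a byte is not enough to hold an arbitrary character.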
The format specifier %c prints the character itself. When used with a character value, %v or %d prints the integer that represents the character, and %U prints a string in the format U+hhhh.
The unicode package has some built-in functions for testing characters. Each of these functions returns a Boolean value, as shown below (where ch represents the character):
Test whether it is a letter: unicode.IsLetter(ch)
Test whether it is a digit: unicode.IsDigit(ch)
Test whether it is a whitespace character: unicode.IsSpace(ch)
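The three tests above can be combined into a small classifier. This is only an illustrative sketch: the classify function and the sample string are hypothetical, while unicode.IsLetter, unicode.IsDigit and unicode.IsSpace are the standard library functions named in the text.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify (hypothetical helper) returns a short label for r based on
// the tests provided by the unicode package.
func classify(r rune) string {
	switch {
	case unicode.IsLetter(r):
		return "letter"
	case unicode.IsDigit(r):
		return "digit"
	case unicode.IsSpace(r):
		return "space"
	default:
		return "other"
	}
}

func main() {
	// Ranging over the string yields runes, so non-ASCII letters such as β
	// are classified as whole characters rather than as separate bytes.
	for _, r := range "Go 1β!" {
		fmt.Printf("%q: %s\n", r, classify(r))
	}
}
```

Because these functions take a rune, they work for non-ASCII letters as well as ASCII ones.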