When writing Golang programs, you may need to embed non-ASCII characters such as Chinese text or emoji. Sometimes the embedded characters appear garbled, which hurts both the program's output and its readability. This article covers common causes of garbled embedded characters in Golang and how to fix them.
1. Golang string encoding
A string in Golang is a read-only sequence of bytes, not of fixed-width characters. By convention those bytes hold UTF-8 encoded text, so a single Unicode character (code point) takes between 1 and 4 bytes. Unicode defines three encoding forms, and they relate to Golang as follows:
UTF-8 is a variable-length Unicode encoding: each character occupies 1 to 4 bytes, depending on its code point. Golang source files are UTF-8, so string literals are UTF-8 encoded by default. For example:
s := "你好"
The above string s uses UTF-8 encoding.
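To make the byte-versus-character distinction concrete, here is a minimal sketch that counts both for the string above. The helper name byteAndRuneCount is made up for this illustration; len and utf8.RuneCountInString are standard Go.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount (hypothetical helper) returns the number of bytes and
// the number of Unicode code points (runes) in s.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "你好"
	b, r := byteAndRuneCount(s)
	fmt.Println(b, r)      // 6 bytes, 2 runes
	fmt.Printf("% x\n", s) // the raw UTF-8 bytes in hex

	// Ranging over a string decodes one rune per iteration;
	// i is the byte offset at which the rune starts.
	for i, c := range s {
		fmt.Printf("offset %d: %c (%U)\n", i, c, c)
	}
}
```

Indexing with s[i] yields a single byte, not a character, which is a frequent source of garbled output when a string is cut in the middle of a multi-byte character.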
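UTF-16 is a variable-length Unicode encoding: characters in the Basic Multilingual Plane take one 2-byte code unit, and all other characters (including most emoji) take two code units, called a surrogate pair. Golang has no built-in UTF-16 character type. The rune type holds a Unicode code point, and the unicode/utf16 package in the standard library converts between code points and UTF-16 code units. For example:
var r rune = '好'
The above code stores the code point U+597D of the character '好' in r. Because '好' lies in the Basic Multilingual Plane, its single UTF-16 code unit has the same value, 0x597D.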
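If the actual UTF-16 code units of a character are needed, the standard unicode/utf16 package can produce them. The sketch below is only an illustration; the helper name utf16Units is an assumption, not something from the original text.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16Units (hypothetical helper) returns the UTF-16 code units of s.
func utf16Units(s string) []uint16 {
	return utf16.Encode([]rune(s))
}

func main() {
	// A character in the Basic Multilingual Plane needs one code unit.
	fmt.Printf("%04x\n", utf16Units("好"))

	// A character above U+FFFF needs two code units (a surrogate pair).
	fmt.Printf("%04x\n", utf16Units("\U0001F60A"))
}
```

The first call yields one code unit and the second yields two, which shows that UTF-16 is not a fixed 2 bytes per character.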
UTF-32 is a fixed-length Unicode encoding that takes up 4 bytes per character, and each 4-byte value is simply the character's code point. In Golang, rune is an alias for int32, so a rune (or int32) value is effectively one UTF-32 code unit. For example:
var c int32 = '😊'
The above code stores the code point U+1F60A of the emoji 😊 in c, which is a signed 32-bit integer (int32), not uint32.
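The following sketch shows that rune and int32 are the same type and that a character constant is just a code point. The helper describe is a made-up name for this example; utf8.RuneLen and the %U verb are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (hypothetical helper) returns the code point of r in U+XXXX
// notation and the number of bytes its UTF-8 encoding needs.
func describe(r rune) (string, int) {
	return fmt.Sprintf("%U", r), utf8.RuneLen(r)
}

func main() {
	var r rune = '好'
	var c int32 = '\U0001F60A'

	// rune is an alias for int32, so this assignment needs no conversion.
	var same rune = c

	for _, v := range []rune{r, same} {
		cp, n := describe(v)
		fmt.Printf("%c %s uses %d UTF-8 bytes\n", v, cp, n)
	}
}
```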
2. Methods of embedding non-ASCII characters
In Golang, there are four ways to embed non-ASCII characters:
Write the characters directly in the string literal. For example:
s := "你好😊"
The above code contains both Chinese characters and an emoji.
Use a Unicode escape sequence for characters that are hard to type or display. For example:
s := "你好\U0001F60A"
In the above code, \U is followed by exactly eight hexadecimal digits giving the character's code point (the same value as its UTF-32 encoding), here the emoji 😊.
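Use Unicode escape sequences for every non-ASCII character. For example:
s := "\u4f60\u597d\U0001F60A"
In the above code, \u is followed by exactly four hexadecimal digits and \U by exactly eight; both give the character's Unicode code point. \u therefore only reaches characters up to U+FFFF, and anything above that, such as most emoji, needs \U.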
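As a quick check that the direct and escaped spellings describe the same text, the sketch below compares them. It assumes the emoji is U+1F60A and that the escapes are written with a leading backslash; the function name variants is invented for this example.

```go
package main

import "fmt"

// variants (hypothetical helper) returns the same text written three ways.
func variants() []string {
	return []string{
		"你好😊",                     // characters typed directly
		"你好\U0001F60A",            // emoji as an 8-digit \U escape
		"\u4f60\u597d\U0001F60A", // every character escaped
	}
}

func main() {
	v := variants()
	// All three literals compile to identical UTF-8 bytes.
	fmt.Println(v[0] == v[1], v[1] == v[2])
	fmt.Printf("% x\n", v[2])
}
```

Because the compiler resolves the escapes, the choice between these forms affects only how the source file reads, not the resulting string.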
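Base64 encode the UTF-8 bytes of the non-ASCII characters, embed the encoded string in the program, and decode it at run time. For example:
s := "5L2g5aW98J+YqA=="
The above string is the base64 encoding of the UTF-8 bytes of "你好😨" ("你好" followed by the emoji U+1F628). Note that s itself holds only the ASCII base64 text until it is decoded.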
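A base64 literal is plain ASCII text until it is decoded. The sketch below shows one way to recover the original text with the standard encoding/base64 package; decodeText is a hypothetical helper name.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeText (hypothetical helper) decodes standard base64 and interprets
// the resulting bytes as UTF-8 text.
func decodeText(enc string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(enc)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	s, err := decodeText("5L2g5aW98J+YqA==")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(s)
	fmt.Printf("%+q\n", s) // same text with non-ASCII shown as escapes

	// Encoding goes the other way, starting from the UTF-8 bytes.
	fmt.Println(base64.StdEncoding.EncodeToString([]byte("你好")))
}
```

This approach keeps the source file pure ASCII, at the cost of readability and a decoding step at run time.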
3. Causes and solutions for Golang embedded garbled characters
If a character is escaped with the wrong kind of value, the string will contain garbled characters or the program will not compile. A common mistake is to copy a UTF-16 surrogate pair from another language (such as Java or JavaScript) into \u escapes. Golang's \u and \U escapes take Unicode code points, not UTF-16 code units, and surrogate values (U+D800 to U+DFFF) are not valid code points. For example, the following is rejected by the compiler:
s := "\u4f60\u597d\ud83d\ude0a"
In the above code, \ud83d\ude0a is the UTF-16 surrogate pair for the emoji 😊, which is valid in Java or JavaScript but not in Golang. The correct Golang form is "\u4f60\u597d\U0001F60A", which uses the emoji's code point U+1F60A.
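When text really does arrive as UTF-16 code units, for instance the surrogate pair d83d de0a copied from another system, one way to turn it into a Go string is utf16.Decode from the standard library. This is a minimal sketch, and fromUTF16 is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// fromUTF16 (hypothetical helper) converts UTF-16 code units, including
// surrogate pairs, into a UTF-8 encoded Go string.
func fromUTF16(units []uint16) string {
	return string(utf16.Decode(units))
}

func main() {
	// Two BMP characters followed by one surrogate pair.
	units := []uint16{0x4f60, 0x597d, 0xd83d, 0xde0a}
	s := fromUTF16(units)
	fmt.Println(s)
	fmt.Printf("%+q\n", s) // shows the code points as \u and \U escapes

	// An unpaired surrogate cannot be decoded; utf16.Decode substitutes
	// the replacement character U+FFFD for it.
	fmt.Printf("%+q\n", fromUTF16([]uint16{0xd83d}))
}
```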
Some editors change a file's encoding when saving, for example converting UTF-8 to a legacy "ANSI" code page such as GBK or Windows-1252. Golang source files must be UTF-8, so after such a conversion the non-ASCII string literals become garbled or the file no longer compiles. Save source files as UTF-8, and check the editor's encoding setting if characters that looked right before suddenly appear garbled.
In some cases the problem lies in the environment rather than in the program: the string is valid UTF-8, but the terminal or console displaying it expects another encoding. Check that the locale environment variables (such as LANG or LC_ALL on Unix-like systems) or the console code page (on Windows) are set to UTF-8.
In short, when embedding non-ASCII characters, save the source file as UTF-8, escape characters by their Unicode code point rather than by UTF-16 code units, and make sure the environment that displays the output also expects UTF-8.
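One rough way to spot text that was saved in a legacy encoding is to test whether its bytes are valid UTF-8. The sketch below uses utf8.Valid from the standard library. The GBK byte sequence is a hand-written sample standing in for file contents, and looksLikeUTF8 is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// looksLikeUTF8 (hypothetical helper) reports whether data consists
// entirely of valid UTF-8 sequences.
func looksLikeUTF8(data []byte) bool {
	return utf8.Valid(data)
}

func main() {
	utf8Bytes := []byte("你好")                   // UTF-8: e4 bd a0 e5 a5 bd
	gbkBytes := []byte{0xc4, 0xe3, 0xba, 0xc3} // the same text in GBK (sample)

	fmt.Println(looksLikeUTF8(utf8Bytes))
	fmt.Println(looksLikeUTF8(gbkBytes))

	// Converting invalid bytes to a string does not fix them; printing
	// with %q makes the damage visible as \x escapes.
	fmt.Printf("%q\n", string(gbkBytes))
}
```

A positive result is only a heuristic, since pure ASCII and some legacy byte sequences are also valid UTF-8, but a negative result reliably means the data is not UTF-8.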