Byte Accessing and Conversion in Go Strings
In Go, a string is a read-only sequence of bytes that conventionally holds UTF-8 encoded text; it is not a sequence of characters or runes. Indexing a string with str[i] returns a single byte (type byte, an alias for uint8); it does not decode a rune.
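A for ... range loop over a string decodes runes rather than bytes: each iteration yields the byte index at which a rune starts, plus the rune itself. Using only the index (for i := range str) still advances one rune at a time, so i skips the continuation bytes of multibyte characters. To visit every byte, index the string in a classic loop (for i := 0; i < len(str); i++) or range over []byte(str).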
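To make the difference between indexing and ranging concrete, here is a minimal sketch. The helper names bytesOf and runeStarts are hypothetical and exist only for illustration; the sample string is an assumption chosen because it contains one two-byte character.

```go
package main

import "fmt"

// bytesOf (hypothetical helper) reads s one index at a time.
// Each s[i] has type byte, so no rune decoding takes place.
func bytesOf(s string) []byte {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, s[i])
	}
	return out
}

// runeStarts (hypothetical helper) collects the index produced by
// ranging over s. The index jumps from one rune start to the next.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "héllo" // 'é' is encoded as the two bytes 0xC3 0xA9

	fmt.Println(len(s))        // 6 (bytes, not characters)
	fmt.Println(bytesOf(s))    // [104 195 169 108 108 111]
	fmt.Println(runeStarts(s)) // [0 1 3 4 5]
}
```

Note that index 2 never appears in the range output: it is the second byte of 'é', which the range loop consumes together with index 1.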
Performance Considerations
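In general, converting a string to a byte slice with []byte(str) allocates a new slice and copies the bytes, because strings are immutable and slices are not. The standard Go compiler can elide that copy in certain cases, notably when the conversion appears directly in a range clause (for _, b := range []byte(str)). When the copy is elided, the two methods below perform about the same; when the slice is first stored in a variable, as in the second method, the copy may still take place, so benchmark if it matters:

    str := "large text"
    for i := 0; i < len(str); i++ {
        _ = str[i] // use str[i], a byte
    }

    str := "large text"
    str2 := []byte(str)
    for _, s := range str2 {
        _ = s // use s, a byte
    }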
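As a rough sketch of the two byte-oriented approaches side by side, the program below counts occurrences of a byte both ways and then shows that a converted slice behaves as an independent copy of the string. The function names countIndexed and countConverted are hypothetical, and nothing here measures performance; use the testing package's benchmarks for that.

```go
package main

import "fmt"

// countIndexed (hypothetical helper) scans s by index, with no conversion.
func countIndexed(s string, c byte) int {
	n := 0
	for i := 0; i < len(s); i++ {
		if s[i] == c {
			n++
		}
	}
	return n
}

// countConverted (hypothetical helper) ranges over a byte slice built
// from s. The conversion sits directly in the range clause.
func countConverted(s string, c byte) int {
	n := 0
	for _, b := range []byte(s) {
		if b == c {
			n++
		}
	}
	return n
}

func main() {
	str := "large text"

	fmt.Println(countIndexed(str, 'e'))   // 2
	fmt.Println(countConverted(str, 'e')) // 2

	// Writing to the slice never changes the original string,
	// because the slice does not share the string's storage.
	b := []byte(str)
	b[0] = 'L'
	fmt.Println(str)       // large text
	fmt.Println(string(b)) // Large text
}
```

Both functions return the same result for any input; the choice between them is mostly about readability unless profiling says otherwise.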
Whichever you choose, prefer the form that matches the intent of the code: index the string (or range over a byte slice) when you need raw bytes, and range over the string itself when you need characters.
Character Iteration
When iterating over the characters (runes) of a string, keep in mind that every non-ASCII character occupies two to four bytes in the underlying UTF-8 encoding. The for ... range str syntax handles this automatically, returning the byte index at which the rune starts and the rune value on each iteration.