Understanding Byte Access in Go Strings
Indexing a Go string with the index expression str[i] yields a value of type byte, not a rune. This raises the question of whether Go converts a rune to a byte during this operation.
Byte Access in Go
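A Go string holds a read-only sequence of bytes, which by convention is the UTF-8 encoding of the text, rather than a sequence of characters or runes. Indexing a string with str[i] therefore retrieves the byte at offset i directly, and no rune-to-byte conversion is performed. One consequence is that a non-ASCII character occupies several bytes, so str[i] may return only one byte of that character's encoding.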
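To make the byte-level view concrete, here is a minimal sketch. The helper byteAt and the sample string are illustrative assumptions, not part of the original discussion.

```go
package main

import "fmt"

// byteAt returns the byte stored at offset i of s.
// Hypothetical helper, used only to illustrate string indexing.
func byteAt(s string, i int) byte {
	return s[i] // plain byte read; no rune decoding happens here
}

func main() {
	// "\u00e9" is 'é', which UTF-8 encodes as the two bytes 0xC3 0xA9.
	s := "h\u00e9llo"

	fmt.Println(len(s))      // len counts bytes: 6, not 5
	fmt.Printf("%T\n", s[1]) // uint8 (byte is an alias for uint8)
	fmt.Printf("%#x %#x\n", byteAt(s, 1), byteAt(s, 2)) // 0xc3 0xa9
}
```

Note that s[1] alone is not a valid character here; it is only the first half of the encoding of 'é'.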
Rune Iteration Using for ... range
When a for ... range loop iterates over a string, it yields runes (Unicode code points) rather than bytes. The loop decodes the UTF-8 sequence one rune at a time: the first value is the byte index at which the rune starts, and the second value is the rune itself, of type rune (an alias for int32). Because a rune can span several bytes, the index may advance by more than one between iterations. The loop works on the string directly, so no conversion to a []rune or []byte slice is needed.
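The following sketch shows how the index jumps over multi-byte runes during a range loop. The helper runeStarts and the sample string are assumptions made for illustration.

```go
package main

import "fmt"

// runeStarts collects the byte offset at which each rune of s begins.
// Hypothetical helper, used only to illustrate range over a string.
func runeStarts(s string) []int {
	var starts []int
	for i := range s { // i is a byte offset, not a character count
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "h\u00e9llo" // 'é' takes two bytes in UTF-8

	for i, r := range s {
		// r has type rune (int32); i skips from 1 to 3 after 'é'.
		fmt.Printf("%d: %c (%T)\n", i, r, r)
	}

	fmt.Println(runeStarts(s)) // [0 1 3 4 5]
}
```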
Converting to []byte for Byte Iteration
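Alternatively, you can iterate over the bytes by converting the string with the conversion []byte(str) and ranging over the result. In general this conversion allocates a new slice and copies the string's bytes, because strings are immutable while slices are not. When the conversion appears directly in the range clause, as in for i, b := range []byte(str), the standard Go compiler can avoid the copy. A plain index loop over str[i] from 0 to len(str)-1 visits the same bytes with no conversion at all. Byte iteration and rune iteration are not interchangeable: the former yields raw bytes and the latter yields decoded characters, so the choice depends on which of the two you need rather than on speed.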
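As a rough illustration of byte iteration, the sketch below ranges over []byte(s) and also shows that a converted slice kept in a variable is an independent copy. The helpers countByte and withFirstByte are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// countByte counts how many bytes of s equal target by ranging over
// []byte(s). Hypothetical helper for illustration.
func countByte(s string, target byte) int {
	n := 0
	for _, b := range []byte(s) { // b has type byte
		if b == target {
			n++
		}
	}
	return n
}

// withFirstByte returns a copy of s whose first byte is replaced by c.
// The original string is left untouched because []byte(s) stored in a
// variable is a separate, mutable copy. Hypothetical helper.
func withFirstByte(s string, c byte) string {
	if len(s) == 0 {
		return s
	}
	b := []byte(s)
	b[0] = c
	return string(b)
}

func main() {
	s := "h\u00e9llo"

	// Index loop: visits every byte without any conversion.
	for i := 0; i < len(s); i++ {
		fmt.Printf("%d: %#x\n", i, s[i])
	}

	fmt.Println(countByte(s, 'l'))     // 2
	fmt.Println(withFirstByte(s, 'H')) // Héllo
	fmt.Println(s)                     // héllo (unchanged)
}
```

Replacing a single byte is only safe here because the first character is ASCII; editing one byte inside a multi-byte rune would produce invalid UTF-8.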
Conclusion
In summary, Go strings store bytes (conventionally UTF-8 encoded text), and str[i] retrieves the byte at index i without any conversion. To iterate over characters, use a for ... range loop directly on the string, which decodes each rune and reports its starting byte index. To iterate over bytes, use an index loop or range over []byte(str).