In Go, ranging over a string and ranging over a rune slice can look identical at first glance, because both yield Unicode code points as values of type rune. The difference lies in the index each loop reports and in the cost of the conversion, and it becomes apparent as soon as the string contains multibyte characters.
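When you range over a string directly, as in the following code:
<code class="go">for i, r := range str { fmt.Printf("byte offset: %d, type: %T, value: %v, string: %s\n", i, r, r, string(r)) }</code>
you are iterating over Unicode code points, not individual bytes. A Go string is a read-only sequence of bytes, conventionally UTF-8 encoded, and the range loop decodes one code point per iteration. The loop value has type rune (an alias for int32), and the index is the byte offset at which that code point starts. For strings containing only ASCII characters, the byte offset and the character position coincide. For strings containing multibyte characters, the index skips ahead by the encoded width of each character, so consecutive indexes are not always consecutive integers. Bytes that are not valid UTF-8 yield the replacement character U+FFFD, and the loop advances by one byte.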
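As a minimal sketch of what a range loop over a string reports, the program below prints the index, type, and value for each iteration. The helper name `runeStarts` and the sample string are assumptions made for illustration; the `é` is written as the escape `\u00e9` so that it is unambiguously a single code point encoded as two bytes.

```go
package main

import "fmt"

// runeStarts returns the byte offset at which each code point of s begins,
// exactly as reported by the index variable of a range loop over the string.
func runeStarts(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "h\u00e9llo" // "héllo": 'é' is encoded as two bytes in UTF-8

	for i, r := range s {
		// r has type rune (an alias for int32); i is a byte offset.
		fmt.Printf("byte offset %d: %T %U %q\n", i, r, r, r)
	}

	fmt.Println(len(s), runeStarts(s)) // 6 [0 1 3 4 5]
}
```

The offset jumps from 1 to 3 because the code point at offset 1 occupies two bytes.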
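In contrast, ranging over a rune slice, created by explicitly converting the string:
<code class="go">for i, r := range []rune(str) { fmt.Printf("rune index: %d, type: %T, value: %v, string: %s\n", i, r, r, string(r)) }</code>
yields the same code points, but the index is the position of each code point within the slice, counting from zero in steps of one. The conversion decodes the whole string up front and copies it into a newly allocated slice of int32 values, so it costs extra time and memory. In exchange, a rune slice supports constant-time access to the n-th code point, which makes it convenient when you need to operate on text by character position.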
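The following sketch shows the rune-slice side under the same assumptions (the sample string and the helper name `runeAt` are illustrative, not part of the original question). It also uses `utf8.RuneCountInString` from the standard library, which counts code points without allocating a slice.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAt returns the n-th code point of s (zero-based) by converting s to a
// rune slice. The conversion copies the whole string, so it costs O(len(s)).
func runeAt(s string, n int) (rune, bool) {
	runes := []rune(s)
	if n < 0 || n >= len(runes) {
		return 0, false
	}
	return runes[n], true
}

func main() {
	s := "h\u00e9llo" // "héllo"

	for i, r := range []rune(s) {
		// i is the position of the code point, not a byte offset.
		fmt.Printf("rune index %d: %T %U %q\n", i, r, r, r)
	}

	// Byte length, rune-slice length, and rune count without allocation.
	fmt.Println(len(s), len([]rune(s)), utf8.RuneCountInString(s)) // 6 5 5

	r, ok := runeAt(s, 1)
	fmt.Printf("%q %v\n", r, ok) // 'é' true
}
```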
The choice between the two matters most when the index is used. In a range loop over a string, the index is the byte offset of the current code point, while in a range loop over a rune slice it is the position of the code point within the sequence. Direct indexing differs as well: str[i] returns a single byte (uint8), whereas runes[i] returns a whole code point.
For example, in the string "héllo" the character é occupies two bytes in UTF-8. Ranging over the string reports the indexes 0, 1, 3, 4, 5, whereas ranging over []rune("héllo") reports 0, 1, 2, 3, 4, so every character after é has a byte offset that differs from its rune index.
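To make the indexing difference concrete, here is a small sketch that indexes the same position in a string and in its rune slice, and maps a rune index back to a byte offset. The helper `byteOffsetOfRune` is a hypothetical name introduced for this example.

```go
package main

import "fmt"

// byteOffsetOfRune returns the byte offset at which the n-th code point
// (zero-based) of s starts, or -1 if s has fewer than n+1 code points.
func byteOffsetOfRune(s string, n int) int {
	count := 0
	for i := range s {
		if count == n {
			return i
		}
		count++
	}
	return -1
}

func main() {
	s := "h\u00e9llo" // "héllo"
	runes := []rune(s)

	// Indexing the string yields one byte: the first byte of 'é'.
	fmt.Printf("%T %#x\n", s[1], s[1]) // uint8 0xc3

	// Indexing the rune slice yields the whole code point.
	fmt.Printf("%T %q\n", runes[1], runes[1]) // int32 'é'

	// The third code point ('l') starts at byte offset 3, not 2.
	fmt.Println(byteOffsetOfRune(s, 2)) // 3
}
```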
In short, both loops visit the same code points, but they serve different purposes. Ranging over a string reports byte offsets and needs no allocation; ranging over a rune slice reports character positions and allows random access, at the cost of a copy. To iterate over raw bytes, use an index loop over the string or range over []byte(str). The decision depends on whether you need byte offsets or character positions. When you only need to visit each character in order, ranging over the string directly is sufficient; convert to a rune slice when you need to index or modify characters by position.
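If byte-level access is what you actually need, a range loop over the string is not the tool for it, since it decodes code points. A minimal sketch of the two usual alternatives follows; the helper name `rawBytes` and the sample string are assumptions for illustration.

```go
package main

import "fmt"

// rawBytes collects the bytes of s with a classic index loop. Indexing a
// string with s[i] always yields a single byte, never a decoded code point.
func rawBytes(s string) []byte {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		out = append(out, s[i])
	}
	return out
}

func main() {
	s := "h\u00e9llo" // "héllo"

	// Ranging over a byte slice yields one byte per iteration.
	for i, b := range []byte(s) {
		fmt.Printf("byte %d: %T %#x\n", i, b, b)
	}

	fmt.Printf("% x\n", rawBytes(s)) // 68 c3 a9 6c 6c 6f
}
```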