Ignore accented characters when comparing strings in C#
Handling string comparisons with accented characters can be tricky in C#. Consider the following example:
<code class="language-csharp">string s1 = "hello";
string s2 = "héllo";

s1.Equals(s2, StringComparison.InvariantCultureIgnoreCase); // false
s1.Equals(s2, StringComparison.OrdinalIgnoreCase);          // false</code>
We want the two strings to be treated as equal, but both calls return false. Both comparison modes ignore case only, not accents, so "é" and "e" are still treated as different characters. To solve this, we can strip the diacritics (accent marks) from the strings before comparing them.
Remove diacritics
Here's a way to remove diacritics from a string:
<code class="language-csharp">using System.Globalization;
using System.Text;

static string RemoveDiacritics(string text)
{
    // Normalize the string to Unicode Normalization Form D (canonical decomposition)
    string formD = text.Normalize(NormalizationForm.FormD);

    // Create a StringBuilder to hold the resulting string
    StringBuilder sb = new StringBuilder();

    // Iterate over the characters of the normalized string
    foreach (char ch in formD)
    {
        // Keep the character only if it is not a non-spacing mark
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            // Append the character to the StringBuilder
            sb.Append(ch);
        }
    }

    // Convert the StringBuilder to a string and normalize it to Unicode Normalization Form C
    return sb.ToString().Normalize(NormalizationForm.FormC);
}</code>
This method first normalizes the string to Unicode Normalization Form D, which decomposes each accented character into a base character followed by one or more combining marks. It then iterates over the resulting characters and drops every non-spacing mark (i.e. the accents). Finally, it normalizes what is left back to Form C and returns the unaccented string.
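To make the decomposition step concrete, the following minimal sketch lists the characters that Form D produces together with their Unicode categories. The class and method names (`NormalizationDemo`, `Describe`) are hypothetical and exist only for this illustration; the accented letter is written as the escape `\u00E9` so the example does not depend on the source file's encoding.

```csharp
using System;
using System.Globalization;
using System.Text;

static class NormalizationDemo
{
    // Returns one line per char of the Form D version of the input,
    // showing its code point and its Unicode category.
    // (Hypothetical helper, for illustration only.)
    public static string Describe(string text)
    {
        var sb = new StringBuilder();
        foreach (char ch in text.Normalize(NormalizationForm.FormD))
        {
            sb.Append($"U+{(int)ch:X4} {CharUnicodeInfo.GetUnicodeCategory(ch)}\n");
        }
        return sb.ToString();
    }
}
```

Calling `NormalizationDemo.Describe("\u00E9")` should yield two lines, `U+0065 LowercaseLetter` and `U+0301 NonSpacingMark`: the base letter "e" followed by the combining acute accent that `RemoveDiacritics` filters out.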
To use this method, apply it to both strings before comparing them. For example:
<code class="language-csharp">string s1 = "hello";
string s2 = "héllo";

RemoveDiacritics(s1).Equals(RemoveDiacritics(s2), StringComparison.InvariantCultureIgnoreCase); // true</code>
This will correctly evaluate to true, considering the accented "e" in "héllo" to be equivalent to the unaccented "e" in "hello".
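If this comparison is needed in several places, it may be convenient to wrap it in a small helper that strips diacritics from both operands. The sketch below is one possible way to do that, not a standard API: the names `AccentInsensitive` and `EqualsIgnoringAccents` are assumptions made for this example, and the null handling (two nulls are equal, one null is not) is a design choice rather than something the original method prescribes.

```csharp
using System;
using System.Globalization;
using System.Text;

static class AccentInsensitive
{
    // Same approach as RemoveDiacritics above: decompose, drop non-spacing marks, recompose.
    public static string RemoveDiacritics(string text)
    {
        string formD = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(formD.Length);
        foreach (char ch in formD)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(ch);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: compares two strings ignoring both case and accents.
    public static bool EqualsIgnoringAccents(string a, string b)
    {
        // Assumption: two nulls are equal, a null and a non-null string are not.
        if (a is null || b is null)
        {
            return a is null && b is null;
        }
        return string.Equals(
            RemoveDiacritics(a),
            RemoveDiacritics(b),
            StringComparison.InvariantCultureIgnoreCase);
    }
}
```

With this in place, `AccentInsensitive.EqualsIgnoringAccents("hello", "h\u00E9llo")` returns true, while strings that differ in their base letters, such as "hello" and "help", still compare as different.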