Ignore accented letters in C# string comparisons
By default, C# treats strings that are spelled the same but differ in accents as different strings. This is a problem when an equality comparison needs to ignore accents.
To solve this problem, the RemoveDiacritics function normalizes the input string to NormalizationForm.FormD, which decomposes each accented character into a base character followed by its combining marks. It then drops every non-spacing mark, which strips the accents from the characters. Finally, the result is normalized to NormalizationForm.FormC so that any remaining decomposed sequences are recomposed. Neither step changes letter case.
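<code class="language-csharp">using System.Globalization;
using System.Text;

static string RemoveDiacritics(string text)
{
    // Decompose accented characters into base character + combining marks.
    string formD = text.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    foreach (char ch in formD)
    {
        // Keep every character except the combining (non-spacing) marks.
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(ch);
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(ch);
        }
    }
    // Recompose whatever is left into canonical composed form.
    return sb.ToString().Normalize(NormalizationForm.FormC);
}</code>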
This function converts an accented character (e.g. "é") to its unaccented equivalent (e.g. "e"). Calling Normalize with NormalizationForm.FormD exposes the accents as separate marks that can be removed, and NormalizationForm.FormC recomposes the result. Because neither step changes letter case, the comparison stays case-sensitive.
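To make the decomposition step concrete, here is a minimal sketch that prints what FormD does to a single accented character. The class name DecompositionDemo is hypothetical and exists only for illustration; the calls themselves are the same ones RemoveDiacritics uses.

```csharp
using System;
using System.Globalization;
using System.Text;

class DecompositionDemo
{
    static void Main()
    {
        // "é" written as a single precomposed code point (U+00E9).
        string composed = "\u00E9";
        string decomposed = composed.Normalize(NormalizationForm.FormD);

        Console.WriteLine(composed.Length);   // 1
        Console.WriteLine(decomposed.Length); // 2

        // Prints "U+0065 LowercaseLetter" then "U+0301 NonSpacingMark".
        foreach (char ch in decomposed)
        {
            Console.WriteLine($"U+{(int)ch:X4} {CharUnicodeInfo.GetUnicodeCategory(ch)}");
        }
    }
}
```

The second character, U+0301 (combining acute accent), is the one the NonSpacingMark check filters out.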
For example, the following code demonstrates how to use the RemoveDiacritics function to compare strings while ignoring accents:
<code class="language-csharp">string s1 = "hello";
string s2 = "héllo";

string s1NoDiacritics = RemoveDiacritics(s1);
string s2NoDiacritics = RemoveDiacritics(s2);

Console.WriteLine(s1NoDiacritics == s2NoDiacritics); // Output: True</code>
In this example, a direct comparison of s1 and s2 returns False because the accent is taken into account. Once the accent marks are removed, s1NoDiacritics and s2NoDiacritics are identical, so the comparison returns True.
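If only the comparison result is needed and not the stripped strings, an alternative that the text above does not cover is to let the framework ignore combining marks during the comparison itself. The sketch below uses string.Compare with CompareOptions.IgnoreNonSpace. The helper name EqualsIgnoringAccents is hypothetical, and the choice of CultureInfo.InvariantCulture is an assumption; a specific culture may be more appropriate for your data.

```csharp
using System;
using System.Globalization;

class IgnoreNonSpaceDemo
{
    // Hypothetical helper: true when the strings differ at most in combining marks.
    internal static bool EqualsIgnoringAccents(string a, string b)
    {
        return string.Compare(a, b, CultureInfo.InvariantCulture,
                              CompareOptions.IgnoreNonSpace) == 0;
    }

    static void Main()
    {
        Console.WriteLine(EqualsIgnoringAccents("hello", "héllo"));
        Console.WriteLine(EqualsIgnoringAccents("hello", "Héllo"));
    }
}
```

Like RemoveDiacritics, this comparison is still case-sensitive; adding CompareOptions.IgnoreCase to the options is one way to relax that.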