Removing accents (also known as diacritics) from strings is a common need in text processing. Matching accented characters directly with regular expressions is error-prone, particularly in older browsers such as IE6.
Since ES2015 (ES6), the String.prototype.normalize() method can be used for this task. Normalizing a string to Unicode Normalization Form D (NFD, canonical decomposition) splits each precomposed accented character into its base character followed by one or more combining diacritical marks, which a regular expression can then remove.
const str = "Crème Brûlée";
const stripped = str.normalize("NFD").replace(/[\u0300-\u036f]/g, ""); // "Creme Brulee"
The range [\u0300-\u036f] matches the Unicode Combining Diacritical Marks block. Alternatively, the regular expression /\p{Diacritic}/gu uses a Unicode property escape (available since ES2018) to match any character that has the Unicode Diacritic property.
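As a minimal sketch of the two approaches described above, the snippet below wraps each one in a small helper. The function names stripCombiningMarks and stripDiacriticProperty are hypothetical, chosen only for illustration, and the second variant assumes a runtime that supports Unicode property escapes.

```javascript
// Hypothetical helper names; neither function is part of any library.

// Variant 1: decompose, then strip the Combining Diacritical Marks
// block (U+0300 to U+036F).
function stripCombiningMarks(input) {
  return input.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Variant 2: decompose, then strip every character that has the
// Unicode Diacritic property (requires the "u" flag, ES2018).
function stripDiacriticProperty(input) {
  return input.normalize("NFD").replace(/\p{Diacritic}/gu, "");
}

console.log(stripCombiningMarks("Crème Brûlée"));    // "Creme Brulee"
console.log(stripDiacriticProperty("Crème Brûlée")); // "Creme Brulee"

// Input that is already decomposed ("e" followed by U+0301) works too.
console.log(stripCombiningMarks("cafe\u0301"));      // "cafe"

// Letters with no canonical decomposition, such as "Ł", stay as they are.
console.log(stripCombiningMarks("Łódź"));            // "Łodz"
```

The two variants are not fully interchangeable: the Diacritic property also covers some standalone characters such as "^" and "`", so the second variant can remove more than combining accents. Which one fits depends on the input being cleaned.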
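Sorting strings that contain accents can give inconsistent results with the default Array.prototype.sort(), which compares UTF-16 code units and therefore places accented letters after "z". Intl.Collator provides locale-aware comparison that sorts such strings more accurately.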
const c = new Intl.Collator();
["creme brulee", "crème brûlée", ...].sort(c.compare); // Sorts correctly based on collation rules
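To make the difference concrete, here is a small sketch that sorts the same list with the default comparison and with a collator. The word list and the explicit "en" locale are assumptions made for the example, and the last part shows one possible way to compare strings while ignoring accents, using the sensitivity option, without modifying them.

```javascript
// Sample data chosen for illustration only.
const words = ["zebra", "éclair", "eclair", "apple"];

// Default sort compares UTF-16 code units, so "é" (U+00E9) lands after "z".
const byCodeUnit = [...words].sort();
console.log(byCodeUnit);  // ["apple", "eclair", "zebra", "éclair"]

// A collator sorts by the locale's collation rules instead.
const collator = new Intl.Collator("en");
const byCollation = [...words].sort(collator.compare);
console.log(byCollation); // ["apple", "eclair", "éclair", "zebra"]

// With sensitivity "base", accents and case are ignored when comparing,
// so the strings are treated as equal without being altered.
const baseCollator = new Intl.Collator("en", { sensitivity: "base" });
console.log(baseCollator.compare("creme brulee", "crème brûlée")); // 0
```

The copies made with the spread operator keep the original array untouched, since sort() sorts in place.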
String.prototype.normalize(), combined with a regular expression, offers an effective way to remove accents/diacritics from strings in JavaScript. When the goal is sorting rather than removal, Intl.Collator handles accented strings more consistently than traditional methods.