How to Match Accented Characters in JavaScript Regular Expressions?-JS Tutorial-php.cn

How to Match Accented Characters in JavaScript Regular Expressions?

Patricia Arquette

Release： 2024-11-08 01:54:01


Matching Accented Characters in JavaScript Regular Expressions

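When matching strings that contain accented characters (diacritics), JavaScript's ASCII-oriented shorthands fall short: classes such as [a-zA-Z] and \w cover only unaccented letters. The patterns below validate a string of the form "word, word" and show three ways to address this: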

Explicit Listing of Accented Characters

This method is cumbersome and inflexible, as it requires manually listing all supported accented characters:

var accentedCharacters = "àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ";
// Use the RegExp constructor, and double the backslash so the pattern receives \s
var regex = new RegExp("^[a-zA-Z" + accentedCharacters + "]+,\\s[a-zA-Z" + accentedCharacters + "]+$");
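
As a minimal sketch of how the explicit list behaves in practice, the snippet below builds the pattern with the RegExp constructor and tests a few inputs. The helper name `isListedName` and the sample names are assumptions made for illustration; only the character list comes from the article.

```javascript
// Character list taken from the article.
const accentedCharacters =
  "àèìòùÀÈÌÒÙáéíóúýÁÉÍÓÚÝâêîôûÂÊÎÔÛãñõÃÑÕäëïöüÿÄËÏÖÜŸçÇßØøÅåÆæœ";

// Inside a string literal the backslash must be doubled,
// otherwise "\s" collapses to a plain "s".
const letters = "[a-zA-Z" + accentedCharacters + "]+";
const explicitListRegex = new RegExp("^" + letters + ",\\s" + letters + "$");

// Hypothetical helper: true when the value looks like "word, word".
function isListedName(value) {
  return explicitListRegex.test(value);
}

console.log(isListedName("Müller, Jürgen")); // true
console.log(isListedName("Šimić, Ana"));     // false: Š and ć are not in the list
console.log(isListedName("Muller Jurgen"));  // false: the comma is missing
```

The second call shows the inflexibility described above: every letter that was not typed into the list is rejected.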


Using the Dot Character Class

This approach is too permissive: the dot (.) matches any character except line terminators, so digits, punctuation, and symbols are accepted along with letters:

var regex = /^.+,\s.+$/;
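
To illustrate how much the dot lets through, here is a short sketch that runs the same pattern against a few inputs. The variable name `dotRegex` and the sample strings are assumptions chosen for the demonstration.

```javascript
// Pattern from the article: anything, a comma, one whitespace, anything.
const dotRegex = /^.+,\s.+$/;

console.log(dotRegex.test("Müller, Jürgen")); // true
console.log(dotRegex.test("123, !!!"));       // true: digits and punctuation pass
console.log(dotRegex.test("a, b, c, d"));     // true: extra commas pass as well
console.log(dotRegex.test("Müller,Jürgen"));  // false: no whitespace after the comma
```

Only the comma-plus-whitespace structure is actually validated; the content on either side is not.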


Unicode Range

This method uses a Unicode code point range, U+00C0 to U+017F, which spans the letter portion of the Latin-1 Supplement block and all of Latin Extended-A:

/^[a-zA-Z\u00C0-\u017F]+,\s[a-zA-Z\u00C0-\u017F]+$/
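
The sketch below probes the edges of this range. The helper `matchesRange` and the sample strings are hypothetical; the NFC normalization step is an addition that the article does not mention, included because the range only contains precomposed letters.

```javascript
// Pattern from the article: ASCII letters plus U+00C0 to U+017F.
const rangeRegex = /^[a-zA-Z\u00C0-\u017F]+,\s[a-zA-Z\u00C0-\u017F]+$/;

// Hypothetical helper: compose "letter + combining mark" pairs before testing.
function matchesRange(value) {
  return rangeRegex.test(value.normalize("NFC"));
}

console.log(rangeRegex.test("\u0160imi\u0107, Ana"));       // true: Š (U+0160) and ć (U+0107) are in range
console.log(rangeRegex.test("Nguy\u1EC5n, Lan"));           // false: ễ (U+1EC5) lies outside the range
console.log(rangeRegex.test("\u00D7, \u00F7"));             // true: × and ÷ sit inside the range
console.log(rangeRegex.test("Mu\u0308ller, Ju\u0308rgen")); // false: decomposed u + U+0308
console.log(matchesRange("Mu\u0308ller, Ju\u0308rgen"));    // true after NFC normalization
```

If input may arrive in decomposed form (for example from some keyboards or file systems), normalizing first avoids false rejections.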


Comparison and Recommendation

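The third approach, the Unicode range, is recommended: it matches the accented Latin letters relevant to this use case without the maintenance burden of an explicit list or the over-matching of the dot. Note that the range also contains the multiplication sign × (U+00D7) and the division sign ÷ (U+00F7), and that it does not cover letters outside U+00C0 to U+017F.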

A Simpler Solution for Unicode Accents

For the accented letters of the Latin-1 Supplement block (not all Unicode accents), consider one of these shorter character classes:

[A-zÀ-ú] // accepts lowercase and uppercase characters (A-z also includes [ \ ] ^ _ `; À-ú also includes × ÷)
[A-zÀ-ÿ] // as above, extended to ÿ, which adds û ü ý þ ÿ (still includes [ \ ] ^ _ ` × ÷)
[A-Za-zÀ-ÿ] // as above, but not including [ \ ] ^ _ `
[A-Za-zÀ-ÖØ-öø-ÿ] // as above, but not including × ÷
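
The differences between these four classes are easy to miss, so here is a small sketch that checks each one against a handful of probe characters. The names `candidateClasses`, `probes`, and `acceptedProbes` are hypothetical, and the classes are written with \u escapes (equivalent to the literal forms shown in the labels) to keep the code independent of file encoding.

```javascript
// Each entry pairs the literal form from the article with an escaped equivalent.
const candidateClasses = [
  { label: "[A-zÀ-ú]", regex: /^[A-z\u00C0-\u00FA]$/ },
  { label: "[A-zÀ-ÿ]", regex: /^[A-z\u00C0-\u00FF]$/ },
  { label: "[A-Za-zÀ-ÿ]", regex: /^[A-Za-z\u00C0-\u00FF]$/ },
  { label: "[A-Za-zÀ-ÖØ-öø-ÿ]", regex: /^[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]$/ },
];

// Probe characters: e, é, ü, underscore, ×, ÷
const probes = ["e", "\u00E9", "\u00FC", "_", "\u00D7", "\u00F7"];

// Returns the probes that a class accepts, joined by spaces.
function acceptedProbes(regex) {
  return probes.filter((ch) => regex.test(ch)).join(" ");
}

for (const { label, regex } of candidateClasses) {
  console.log(label, "accepts:", acceptedProbes(regex));
}
// [A-zÀ-ú] accepts: e é _ × ÷
// [A-zÀ-ÿ] accepts: e é ü _ × ÷
// [A-Za-zÀ-ÿ] accepts: e é ü × ÷
// [A-Za-zÀ-ÖØ-öø-ÿ] accepts: e é ü
```

Only the last class is limited to letters, which makes it the safest of the four when stray symbols must be rejected.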

