Does PostgreSQL support "accent-insensitive" collations?
Microsoft SQL Server provides an "accent-insensitive" collation that allows queries such as SELECT * FROM users WHERE name LIKE 'João' to be executed to find rows containing the name "Joao". PostgreSQL does not natively support such collations.
Solution using unaccent module
PostgreSQL provides the unaccent module, which removes diacritics from strings. To use it, install the extension (CREATE EXTENSION unaccent). You can then perform a case-insensitive search like this:
<code class="language-sql">SELECT * FROM users WHERE unaccent(name) = unaccent('João');</code>
Expression Index
To improve query speed, use the f_unaccent function to create an expression index. This allows the planner to leverage the index for queries involving unaccenting.
<code class="language-sql">CREATE INDEX users_unaccent_name_idx ON users(f_unaccent(name));</code>
PostgreSQL 12 and its ICU collation
Newer PostgreSQL versions (12) support ICU (International Component of Unicode) collations, which provide accent-insensitive grouping and sorting. However, these collations may impact performance. If you prioritize performance, consider using the unaccent solution.
<code class="language-sql">CREATE COLLATION ignore_accent (provider = icu, locale = 'und-u-ks-level1-kc-true', deterministic = false); CREATE INDEX users_name_ignore_accent_idx ON users(name COLLATE ignore_accent); SELECT * FROM users WHERE name = 'João' COLLATE ignore_accent;</code>
The above is the detailed content of Does PostgreSQL Offer Accent-Insensitive Collations for Efficient String Matching?. For more information, please follow other related articles on the PHP Chinese website!