How to Perform Fuzzy Matching of Email Addresses and Telephone Numbers Using Elasticsearch?

Linda Hamilton
Release: 2024-11-01 05:33:27

Fuzzy Matching Email or Telephone Using Elasticsearch

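Elasticsearch can match email addresses and telephone numbers by pattern using its term-level regexp, wildcard, and prefix queries. "Fuzzy matching" here means partial or pattern matching, not the edit-distance matching performed by the fuzzy query.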

Email Matching

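To match email addresses ending with a specific domain (e.g., @gmail.com), use a regexp query. The pattern must match the entire indexed term, and the @ and . characters are escaped because both are reserved in Elasticsearch's regular-expression syntax:

<code class="json">{
    "query": {
        "regexp": {
            "email": ".*\\@gmail\\.com"
        }
    }
}</code>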
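
Term-level queries (term, regexp, wildcard, prefix) are compared against whole indexed terms, so they behave as expected only when each address or number is stored as a single term. A minimal mapping sketch, assuming the fields are named email and tel as in the queries in this article and that no other fields are needed:

```json
{
  "mappings": {
    "properties": {
      "email": { "type": "keyword" },
      "tel":   { "type": "keyword" }
    }
  }
}
```

This body would be sent when the index is created. On a text field with the default analyzer, an address is instead split into several tokens, and a pattern written against the whole address would not match.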

Or, to match emails containing a specific string (here, addresses that begin with sales@), use a wildcard query:

<code class="json">{
    "query": {
        "wildcard": {
            "email": {
                "value": "sales@*"
            }
        }
    }
}</code>
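
For a substring anywhere in the address, one option is a wildcard query with * on both sides. This is a sketch that assumes email is indexed as a single unanalyzed term (for example a keyword field); the search string sales is only an illustration:

```json
{
    "query": {
        "wildcard": {
            "email": {
                "value": "*sales*"
            }
        }
    }
}
```

A pattern that starts with * cannot use the sorted term dictionary to narrow the search, so it tends to be slow on large indices.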

Telephone Matching

To match telephone numbers by their leading digits, use a prefix query. The prefix value is taken literally, so it contains no wildcard character:

<code class="json">{
    "query": {
        "prefix": {
            "tel": "136"
        }
    }
}</code>

This will match all phone numbers starting with "136".
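
To search on either field in one request, the clauses can be wrapped in a bool query. A sketch, assuming the same email and tel fields, each indexed as a single term; the domain and prefix values are the illustrative ones used in this article:

```json
{
    "query": {
        "bool": {
            "should": [
                { "wildcard": { "email": { "value": "*@gmail.com" } } },
                { "prefix": { "tel": { "value": "136" } } }
            ],
            "minimum_should_match": 1
        }
    }
}
```

With minimum_should_match set to 1, a document is returned if its email matches the domain or its telephone number starts with the prefix.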

Performance Optimization

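Regexp, wildcard, and prefix queries have to scan the terms of a field at search time, which becomes slow on large indices, especially for patterns with a leading wildcard. To improve performance, consider custom analyzers that use n-gram or edge n-gram tokenization. These break each value into short substrings at index time, so a partial match becomes an ordinary term lookup, at the cost of a larger index.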
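
To see what edge n-gram tokenization produces, the _analyze API accepts an inline tokenizer definition. The request body below is a sketch; the small min_gram and max_gram values and the sample number are chosen only to keep the output short:

```json
{
  "tokenizer": {
    "type": "edge_ngram",
    "min_gram": 3,
    "max_gram": 5,
    "token_chars": ["digit"]
  },
  "text": "13612345678"
}
```

Sent to the _analyze endpoint, this should return the tokens 136, 1361, and 13612, which is why a search for "136" can later be answered by a direct term lookup.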

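Email Analyzer Configuration:

The index.max_ngram_diff setting is raised because the difference between max_gram and min_gram (17) exceeds the default limit of 1 in Elasticsearch 7 and later. Note that the standard tokenizer splits an address at the @ sign; the uax_url_email tokenizer keeps it as a single token if n-grams should span the whole address.

<code class="json">{
  "settings": {
    "index": {
      "max_ngram_diff": 17
    },
    "analysis": {
      "analyzer": {
        "email_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "name_ngram_filter",
            "trim"
          ]
        }
      },
      "filter": {
        "name_ngram_filter": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 20
        }
      }
    }
  }
}</code>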
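
An analyzer has an effect only once a field uses it. A sketch of a mappings section that could accompany the settings above, assuming the field is named email; using the built-in standard analyzer at search time is an assumption, made so that the query text itself is not split into n-grams:

```json
{
  "mappings": {
    "properties": {
      "email": {
        "type": "text",
        "analyzer": "email_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}
```

With this mapping, a match query on email for "gmail" looks up the single term gmail among the indexed n-grams. Search terms shorter than min_gram (3 characters) or longer than max_gram (20) have no indexed counterpart and will not match.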

Telephone Analyzer Configuration:

<code class="json">{
  "settings": {
    "analysis": {
      "analyzer": {
        "phone_analyzer": {
          "type": "custom",
          "char_filter": [
            "digit_only"
          ],
          "tokenizer": "digit_edge_ngram_tokenizer",
          "filter": [
            "trim"
          ]
        }
      },
      "char_filter": {
        "digit_only": {
          "type": "pattern_replace",
          "pattern": "\\D+",
          "replacement": ""
        }
      },
      "tokenizer": {
        "digit_edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 3,
          "max_gram": 15,
          "token_chars": [
            "digit"
          ]
        }
      }
    }
  }
}</code>
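
As with the email analyzer, the telephone analyzer needs to be attached to a field. The fragment below is a sketch meant to be merged into the configuration above, not used on its own: it relies on the digit_only character filter and phone_analyzer defined there, and phone_search_analyzer is a hypothetical name introduced here so that query text is stripped to digits but not split into n-grams:

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "phone_search_analyzer": {
          "type": "custom",
          "char_filter": ["digit_only"],
          "tokenizer": "keyword"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "tel": {
        "type": "text",
        "analyzer": "phone_analyzer",
        "search_analyzer": "phone_search_analyzer"
      }
    }
  }
}
```

A match query on tel for "136" then becomes a single term lookup against the indexed prefixes, which range from 3 to 15 digits.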


source:php.cn