A fuzzy query, such as finding users whose names contain "xiao", is usually written as like '%xiao%'. In MySQL this forces a full table scan. On a small table that is fine, because the scan is still fast, but it gets slower as the data grows, and bringing in ES (Elasticsearch) just for this is a heavyweight solution. This article introduces a solution to slow like fuzzy-matching queries: the MySQL full-text index.
Requirement
Fuzzy-match the rows that contain a given word. The query can be written in several ways:
select * from t_chinese_phrase where LOCATE('长', phrase) > 0;
select * from t_chinese_phrase where instr(phrase, '长') > 0;
select * from t_chinese_phrase where phrase like '%长%';
Run EXPLAIN and look at the execution plan.
The EXPLAIN output shows that although an index was created on phrase, the query does not use it.
Reason:
MySQL indexes are B-tree structures. In InnoDB, a fuzzy query with a leading wildcard such as '%xx' cannot use the index (I will not go into details here).
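To see the difference in the execution plan, the following sketch compares a prefix match with a leading-wildcard match. It assumes t_chinese_phrase has an ordinary index on phrase, as described above; the exact EXPLAIN output depends on the MySQL version and on the data.

```sql
-- Assumption: t_chinese_phrase.phrase has an ordinary (B-tree) index.

-- Prefix match: the index is ordered by the leading characters,
-- so the optimizer can consider a range scan on it.
EXPLAIN SELECT * FROM t_chinese_phrase WHERE phrase LIKE '长%';

-- Leading wildcard: there is no fixed prefix to seek to,
-- so the plan typically shows type = ALL (full table scan).
EXPLAIN SELECT * FROM t_chinese_phrase WHERE phrase LIKE '%长%';
```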
Query time: 90 ms.
With the current data volume of 93,230 rows (about 93,000), the query already takes 90 ms. That is not acceptable, and the time will keep growing as the data volume increases.
Solution:
When the data volume is not large, use MySQL's full-text index. When the data volume is relatively large, or the MySQL full-text index does not meet expectations, consider using ES.
The rest of this article covers the MySQL full-text index.
Introduction to full-text indexes
1. Development history
In older versions of MySQL, full-text indexes could only be used on char, varchar and text columns of tables using the MyISAM storage engine.
Starting with MySQL 5.6.4, the InnoDB engine also supports full-text indexes.
2. Full-text index
Full-text search is a technique for finding information in any part of a whole book or article stored in a database. It can retrieve chapters, sections, paragraphs, words and so on from the full text as needed, and can also perform various statistics and analysis.
3. Create a full-text index
If you need a full-text index on a large amount of data, it is recommended to load the data first and create the index afterwards.
create fulltext index index_name on table_name (column_name);
Example:
create table t_word
(
    id        int unsigned auto_increment comment 'auto-increment id' primary key,
    uid       char(32)     not null comment '32-character unique id',
    word      varchar(256) null comment 'English word',
    translate varchar(256) null
);
create fulltext index full_idx_translate
    on t_word (translate);
create fulltext index full_idx_word
    on t_word (word);
INSERT INTO t_word (id, uid, word, translate) VALUES (1, '9d592499c65648b0a9519206688ef3f9', 'lion', '狮子');
INSERT INTO t_word (id, uid, word, translate) VALUES (2, 'ce26ac4239514bc6af481bcb1d9b67df', 'panda', '熊猫');
INSERT INTO t_word (id, uid, word, translate) VALUES (3, 'a7d6042853c44904b68275daafb44702', 'tiger', '老虎');
INSERT INTO t_word (id, uid, word, translate) VALUES (4, 'f13bd0a8ecea44fc9ade1625eeb4cc3c', 'goat', '山羊');
INSERT INTO t_word (id, uid, word, translate) VALUES (5, '27d5cbfc93a046388d712085e567474f', 'sheep', '绵羊');
INSERT INTO t_word (id, uid, word, translate) VALUES (6, 'ed35df138cf348aa937781be8ee21cbf', 'lamb', '羊羔');
INSERT INTO t_word (id, uid, word, translate) VALUES (7, 'fba5861d9527440990276e999f47ef8f', 'buffalo', '水牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (8, '3a72e76f210841b1939fff0d3d721375', 'bull', '公牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (9, '272e0b28ea7a48248a86f17533bf9943', 'cow', '母牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (10, '47127adface54e418e4c1b9980af6d16', 'calf', '小牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (11, '10592499c65648b0a9519206688ef3f9', 'little lion', '小狮子');
INSERT INTO t_word (id, uid, word, translate) VALUES (12, '1bf095110b634a01bee5b31c5ee7ee0c', 'little cow', '母牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (13, '4813e588cde54c30bd65bfdbb243ad1f', 'little calf', '小小牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (14, '5e377e281ad344048b6938a638b78ccb', 'little bull', '小公牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (15, '2855ad0da2964c7682c178eb8271f13d', 'little buffalo', '小水牛');
INSERT INTO t_word (id, uid, word, translate) VALUES (16, '72f24c9a77644d57a36f3bdf2b8116b0', 'little lamb', '小羊羔');
INSERT INTO t_word (id, uid, word, translate) VALUES (17, '2d592499c65648b0a9519206688ef3f9', 'I''m a big lion', '我是一只大狮子');
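The example above creates the indexes before inserting the rows, which is fine for 17 rows. For a large load, the recommendation to add the data first and index afterwards could look like the following sketch. The table name t_word_bulk is hypothetical and used only for illustration.

```sql
-- Hypothetical table with the same columns as t_word, created without full-text indexes.
CREATE TABLE t_word_bulk
(
    id        int unsigned auto_increment primary key,
    uid       char(32)     not null,
    word      varchar(256) null,
    translate varchar(256) null
);

-- 1. Load the data first (copied from t_word here as a stand-in for a bulk import).
INSERT INTO t_word_bulk (uid, word, translate)
SELECT uid, word, translate FROM t_word;

-- 2. Build the full-text indexes afterwards, one statement per index.
ALTER TABLE t_word_bulk ADD FULLTEXT INDEX full_idx_word (word);
ALTER TABLE t_word_bulk ADD FULLTEXT INDEX full_idx_translate (translate);
```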
3. Delete the full-text index
alter table table_name drop index index_name;
4. Using the full-text index
Syntax
MATCH (col1,col2,...) AGAINST (expr [search_modifier])
search_modifier:
  {
       IN NATURAL LANGUAGE MODE
     | IN NATURAL LANGUAGE MODE WITH QUERY EXPANSION
     | IN BOOLEAN MODE
     | WITH QUERY EXPANSION
  }
4.1 IN NATURAL LANGUAGE MODE
Natural language mode is MySQL's default full-text search mode. It does not support operators, so it cannot express complex queries such as requiring that a keyword must or must not appear.
-- in natural language mode is the default
select * from t_word where match(word) against ('lion');
-- or written explicitly
select * from t_word where match(word) against ('lion' in natural language mode);
The results are as follows:
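MATCH ... AGAINST can also appear in the select list, where it returns a relevance value for each row. A sketch against the t_word table above; the alias score is only an illustrative name, and the actual values depend on the data.

```sql
-- Show the relevance value next to each matching row, best match first.
SELECT id, word,
       MATCH(word) AGAINST ('lion' IN NATURAL LANGUAGE MODE) AS score
FROM t_word
WHERE MATCH(word) AGAINST ('lion' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;
```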
4.2 IN BOOLEAN MODE
Boolean mode supports operators, so it can express complex queries such as requiring that a keyword must or must not appear, or raising or lowering the weight of a keyword. Boolean mode is the recommended choice.
Operator   Description
(none)     Default; the word is optional, but rows that contain it rank higher.
+          Include; the word must be present.
-          Exclude; the word must not be present.
>          Include, and increase the ranking value; matching rows appear earlier in the results.
<          Include, and decrease the ranking value; matching rows appear later in the results.
()         Group words into subexpressions (so they can be included, excluded, ranked and so on as a group).
~          Negate the word's ranking value, so the word lowers a row's rank instead of raising it.
*          Wildcard; appended to the end of a word.
""         Define a phrase (as opposed to a list of individual words; the entire phrase is matched for inclusion or exclusion).
Example:
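-- Exclude rows containing lion; match rows containing cow or little; raise the rank of rows containing calf; lower the rank of rows containing cow; match words that start with go
select * from t_word where match(word) against ('-lion cow little >calf <cow go*' in boolean mode);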
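The combined query is easier to follow once the operators have been tried one at a time. A sketch against the same t_word table; the search terms are chosen only for illustration, and the alias score is a hypothetical name.

```sql
-- + and -: the row must contain little and must not contain lion.
SELECT * FROM t_word WHERE MATCH(word) AGAINST ('+little -lion' IN BOOLEAN MODE);

-- "": the exact phrase little lamb.
SELECT * FROM t_word WHERE MATCH(word) AGAINST ('"little lamb"' IN BOOLEAN MODE);

-- *: words that start with buff.
SELECT * FROM t_word WHERE MATCH(word) AGAINST ('buff*' IN BOOLEAN MODE);

-- () with > and <: little is required, and so is calf or cow;
-- rows with calf rank higher than rows with cow.
SELECT id, word,
       MATCH(word) AGAINST ('+little +(>calf <cow)' IN BOOLEAN MODE) AS score
FROM t_word
WHERE MATCH(word) AGAINST ('+little +(>calf <cow)' IN BOOLEAN MODE)
ORDER BY score DESC;
```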
It looks as if every problem is solved, but the problems are only beginning.
Back to the original requirement: I want a fuzzy search.
select * from t_word where match(word) against('lio' in boolean mode);
Expected: every row containing lion is returned.
Actual result: nothing at all.
A query on the complete value does return results:
select * from t_word where match(translate) against('小水牛' in boolean mode);
Querying only part of the value returns nothing. For example, searching for just "小水" or "水牛" returns no data:
select * from t_word where match(translate) against('小水' in boolean mode);
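These misses come from tokenization. The default full-text parser splits text into words on whitespace and punctuation, so 'lio' is not an indexed token, and Chinese text without spaces is not split into words at all. The following is a sketch of two common ways out, under stated assumptions: MySQL 5.7.6 or later (where the built-in ngram parser is available), and full_idx_phrase as a hypothetical index name on t_chinese_phrase. The value ngram_token_size=1 matches the setting this article relies on.

```sql
-- English prefix search: the * operator matches words that start with lio.
SELECT * FROM t_word WHERE MATCH(word) AGAINST ('lio*' IN BOOLEAN MODE);

-- Chinese: build the index with the ngram parser instead of the default one.
-- ngram_token_size is read-only at runtime; set it at server startup,
-- for example in my.cnf:
--   [mysqld]
--   ngram_token_size=1
SHOW VARIABLES LIKE 'ngram_token_size';

-- full_idx_phrase is a hypothetical index name.
CREATE FULLTEXT INDEX full_idx_phrase
    ON t_chinese_phrase (phrase) WITH PARSER ngram;
```

With a token size of 1, every single character becomes a token, which makes one-character searches possible at the cost of a larger index.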
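# test: database name; t_chinese_phrase: table name
SET GLOBAL innodb_ft_aux_table="test/t_chinese_phrase";
# View the tokens still held in the full-text index cache (newly inserted rows)
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE;
# View the tokens already stored in the full-text index
select * from information_schema.innodb_ft_index_table;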
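The query results are as follows:
Because the ngram token size (ngram_token_size) was set to 1, the text is split into single-character tokens.
Field descriptions:
FIRST_DOC_ID: ID of the first document in which the word appears
LAST_DOC_ID: ID of the last document in which the word appears
DOC_COUNT: number of documents that contain the word
DOC_ID: ID of the current document
POSITION: position of the word within the current document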
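Selecting only a few of these columns makes the tokenization easier to read. A sketch, assuming innodb_ft_aux_table has already been pointed at test/t_chinese_phrase as shown above:

```sql
-- Tokens of recently inserted rows, in document order.
SELECT WORD, DOC_ID, POSITION
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY DOC_ID, POSITION;

-- Tokens that occur in the most documents, from the stored index.
SELECT DISTINCT WORD, DOC_COUNT
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE
ORDER BY DOC_COUNT DESC
LIMIT 10;
```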
Query
1. Query in natural language mode (NATURAL LANGUAGE MODE)
In natural language mode, the search text is converted into the union of its n-gram token searches.
SELECT * FROM t_chinese_phrase WHERE MATCH (phrase) AGAINST ('繁荣昌盛' IN NATURAL LANGUAGE MODE);
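According to the MySQL manual's description of the ngram parser, the two modes treat the search text differently: natural language mode searches for the union of its n-gram tokens, while boolean mode converts the text into an n-gram phrase search. A sketch that puts the two side by side, assuming an ngram full-text index on phrase and ngram_token_size=1 as elsewhere in this article; the alias score is an illustrative name.

```sql
-- Natural language mode: rows containing any of the tokens 繁, 荣, 昌, 盛,
-- ordered by relevance.
SELECT id, phrase,
       MATCH (phrase) AGAINST ('繁荣昌盛' IN NATURAL LANGUAGE MODE) AS score
FROM t_chinese_phrase
WHERE MATCH (phrase) AGAINST ('繁荣昌盛' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;

-- Boolean mode: treated as a phrase search, so the tokens must appear
-- next to each other in this order.
SELECT id, phrase
FROM t_chinese_phrase
WHERE MATCH (phrase) AGAINST ('繁荣昌盛' IN BOOLEAN MODE);
```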
Practical use
Back to the query requirement from the beginning, let's look at the actual effect.
Query the rows that contain "昌":
SELECT * FROM t_chinese_phrase WHERE MATCH (phrase) AGAINST ('昌' IN boolean MODE) ;
SELECT * FROM t_chinese_phrase WHERE MATCH (phrase) AGAINST ('昌' ) order by id asc;
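To check that these queries really use the full-text index rather than scanning the table, EXPLAIN can be run on them. When the index is used, the type column of the plan reads fulltext. A sketch:

```sql
-- Expect type = fulltext and the full-text index name in the key column.
EXPLAIN SELECT * FROM t_chinese_phrase
WHERE MATCH (phrase) AGAINST ('昌' IN BOOLEAN MODE);
```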