mysql ft refers to FullText, that is, full-text index; full-text index is to solve queries that need to be based on similarity, rather than precise numerical comparison; full-text index can be N times faster than like in the face of a large amount of data. Speed is not an order of magnitude.
MySQL full-text index (FullText)
1. Introduction
Basic concepts
Full-text index is to solve the needs Queries based on similarity rather than exact numerical comparisons.
Although fuzzy matching can also be achieved using like %, it is unimaginable for retrieval of large amounts of text data. In the face of a large amount of data, full-text indexing can be N times faster than like, but the speed is not an order of magnitude.
Version support
MySQL 5.6 In previous versions, only MyISAM storage engine supports full-text index
MySQL 5.6 and later versions, MyISAM and InnoDB storage engines support full-text index
MySQL 5.7.6 provides a built-in full-text ngram parser that supports Chinese, Japanese, and Korean (CJK), as well as an installable MeCab for Japanese Full-text parser plug-in
Full-text index can only be used for InnoDB or MyISAM tables, and can only be used for CHAR, VARCHAR, TEXTColumn creation
For large data sets, Loading data into a table without a full-text index and then creating an index is much faster than loading data into a table with an existing full-text index
RDS MySQL 5.6 Although it also supports Chinese full-text search, there are BUG
Restrictions and shortcomings
Causes a lot of disk resources usage. Full-text indexing itself is a method of using disk space for performance. The reason why the full-text index is large is that word segmentation based on a certain language
The creation of the full-text index is slow, and the modification of various data with the full-text index is also slow
Using full-text indexing is not transparent to the application. If you want to use the full-text index, you must modify the query statement. It is impossible to use the full-text index for the original query statement and needs to be changed to the syntax specified by the full-text index
Case-insensitive
##Partition The table does not support full-text search
The index for full-text search composed of multiple columns must use the same character set and sorting rules
Full-text index may have accuracy issues, that is, the data found by the full-text index may be inconsistent with the like columns in the MATCH() function Must be exactly the same as the columns defined in the FULLTEXT index, unless full-text search using IN BOOLEAN MODE mode is used in the MyISAM table (search can be performed on columns that are not indexed, but the speed is very slow)
When creating full-text indexes for single columns separately, multi-column fuzzy queries will not take effect
Full-text indexes of different tables cannot be queried together. You can add OR## to the two statements.
#2. Operation of full-text index
2.1 Configure the minimum search length
We can view the currently configured minimum search length (word segmentation length) through SQL commands:
SHOW VARIABLES LIKE 'ft%';
Copy after login
Variable_nameValue
##ft_boolean_syntax
-> ;<()~*:""&|
##ft_max_word_len
84
##ft_min_word_len
1
ft_query_expansion_limit
20
ft_stopword_file
(built-in)
全文索引的相关参数都无法进行动态修改,必须通过修改 MySQL 的配置文件来完成。修改最小搜索长度的值为 1,首先打开 MySQL 的配置文件 /etc/my.cnf,在 [mysqld] 的下面追加以下内容:
CREATE FULLTEXT INDEX billno_fulltext ON salebill(billno) WITH PARSER ngram;
CREATE FULLTEXT INDEX remarks_fulltext ON salebill(remarks) WITH PARSER ngram;
CREATE FULLTEXT INDEX remarks_fulltext ON salebilldetail(remarks) WITH PARSER ngram;
CREATE FULLTEXT INDEX goodsremarks_fulltext ON salebilldetail(goodsremarks) WITH PARSER ngram;
CREATE FULLTEXT INDEX remarks_goodsremarks_fulltext ON salebilldetail(remarks, goodsremarks) WITH PARSER ngram;
CREATE FULLTEXT INDEX custname_fulltext ON customer(custname) WITH PARSER ngram;
CREATE FULLTEXT INDEX goodsname_fulltext ON goods(goodsname) WITH PARSER ngram;
CREATE FULLTEXT INDEX goodscode_fulltext ON goods(goodscode) WITH PARSER ngram;
Copy after login
测试结果,总的来说很魔幻。 为什么魔幻,看下面几个语句:
test_1
-- 测试1,原始 like 查询方式,用时 0.765s
select 1 from salebilldetail d where d.tid=260434 and ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%')));
Copy after login
test_2
-- 测试2,使用全文索引 remarks_fulltext、goodsremarks_fulltext, 用时 0.834s
select 1 from salebilldetail d where d.tid=260434 and ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)));
Copy after login
test_3
-- 测试3,使用全文索引 remarks_goodsremarks_fulltext, 用时 0.242s
select 1 from salebilldetail d where d.tid=260434 and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)));
Copy after login
test_4
-- 测试4,原始 like 查询方式,不过滤 tid ,用时 22.654s
select t from salebilldetail d where ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%')));
Copy after login
test_5
-- 测试5,使用全文索引 remarks_fulltext、goodsremarks_fulltext, 不过滤 tid ,用时 24.855s
select 1 from salebilldetail d where ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)));
Copy after login
test_6
-- 测试6,使用全文索引 remarks_goodsremarks_fulltext, 不过滤 tid ,用时 0.213s
select 1 from salebilldetail d where ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)));
Copy after login
test_7
-- 测试7,使用全文索引 remarks_goodsremarks_fulltext, 用时 0.22s
select count(1) from salebilldetail d where d.tid=260434 and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)));
Copy after login
test_8
-- 测试8,使用全文索引 remarks_goodsremarks_fulltext, 不过滤 tid ,用时 0.007s
select count(1) from salebilldetail d where ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode)));
Copy after login
从上面的测试语句可以看出,数据量越多,查询越简单,全文索引的效果越好。
再来看看我们的业务测试SQL:
test_9
-- 测试9
select
i.billid
,if(0,0,i.qty) as qty
,if(0,0,i.goodstotal) as total
,if(0,0,i.chktotal) as selfchktotal
,if(0,0,i.distotal) as distotal
,if(0,0,i.otherpay) as feetotal
,if(0,0,ifnull(d.costtotal,0)) as costtotal
,if(0,0,ifnull(d.maoli,0)) as maoli
,i.billno
,from_unixtime(i.billdate,'%Y-%m-%d') as billdate /*单据日期*/
,from_unixtime(i.createdate,'%Y-%m-%d %H:%i:%s') as createdate /*制单日期*/
,if(i.sdate=0,'',from_unixtime(i.sdate,'%Y-%m-%d %H:%i:%s')) as sdate /*过账日期*/
,from_unixtime(i.udate,'%Y-%m-%d %H:%i:%s') as udate /*最后修改时间*/
,i.custid ,c.custname
,i.storeid ,k.storename
,i.empid ,e.empname
,i.userid ,u.username
,i.remarks /*单据备注*/
,i.effect,i.settle,i.redold,i.rednew /*单据状态*/
,i.printtimes /* 打印次数 */
,(case when i.rednew=1 then 1 when i.redold=1 then 2 when i.settle=1 then 3 when i.effect=1 then 4 else 9 end) as state /*单据状态*/
,(case when i.rednew=1 then '红冲单' when i.redold=1 then '已红冲' when i.settle=1 then '已结算' when i.effect=1 then '已过账' else '草稿' end) as statetext
,'' as susername /* 操作人 */
,'' as accname /* 科目 */
from salebill i
left join coursecentersale d on d.tid=i.tid and d.billid=i.billid
left join customer c on c.tid=i.tid and c.custid=i.custid
left join store k on k.tid=i.tid and k.storeid=i.storeid
left join employee e on e.tid=i.tid and e.empid=i.empid
left join user u on u.tid=i.tid and u.userid=i.userid
where i.tid=260434 and (i.billtype = 5 or i.effect = 1)
and ('_billdate_f_'!='')
and ('_billdate_t_'!='')
and ('_sdate_f_'!='')
and ('_sdate_t_'!='')
and ('_udate_f_'!='')
and ('_udate_t_'!='')
and ('_cdate_f_'!='')
and ('_cdate_t_'!='')
and ('_billid_'!='') /*单据id*/
and ('_custid_'!='') /*客户ID*/
and ('_storeid_'!='') /*店仓ID*/
and ('_empid_'!='') /*业务员ID*/
and ('_custstop_'!='') /*客户是否停用*/
and (
(i.billno like concat('%','葡萄','%'))
or (i.remarks like concat('%','葡萄','%'))
or exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((d.remarks like concat('%','葡萄','%')) or (d.goodsremarks like concat('%','葡萄','%'))))
or exists(select 1 from customer c where c.tid=260434 and c.custid=i.custid and (c.custname like concat('%','葡萄','%')))
or exists(select 1 from goods g join salebilldetail d on d.tid=g.tid and d.goodsid=g.goodsid where d.tid=260434 and d.billid=i.billid and ((g.goodsname like concat('%','葡萄','%')) or (g.goodscode like concat('%','葡萄','%'))))
)
and i.rednew=0 /*单据列表不含红冲单*/
and i.billid not in (select billid from coursecenter_del t where t.tid=260434)
and ((i.settle=1 and i.effect=1 and i.redold=0 and i.rednew=0)) /*已结算*/
order by udate desc,billno desc
limit 0,100;
Copy after login
执行时间约 1.6 秒,使用的是 like 方式。
改成使用全文索引方式:
test_10
-- 测试10
select
i.billid
,if(0,0,i.qty) as qty
,if(0,0,i.goodstotal) as total
,if(0,0,i.chktotal) as selfchktotal
,if(0,0,i.distotal) as distotal
,if(0,0,i.otherpay) as feetotal
,if(0,0,ifnull(d.costtotal,0)) as costtotal
,if(0,0,ifnull(d.maoli,0)) as maoli
,i.billno
,from_unixtime(i.billdate,'%Y-%m-%d') as billdate /*单据日期*/
,from_unixtime(i.createdate,'%Y-%m-%d %H:%i:%s') as createdate /*制单日期*/
,if(i.sdate=0,'',from_unixtime(i.sdate,'%Y-%m-%d %H:%i:%s')) as sdate /*过账日期*/
,from_unixtime(i.udate,'%Y-%m-%d %H:%i:%s') as udate /*最后修改时间*/
,i.custid ,c.custname
,i.storeid ,k.storename
,i.empid ,e.empname
,i.userid ,u.username
,i.remarks /*单据备注*/
,i.effect,i.settle,i.redold,i.rednew /*单据状态*/
,i.printtimes /* 打印次数 */
,(case when i.rednew=1 then 1 when i.redold=1 then 2 when i.settle=1 then 3 when i.effect=1 then 4 else 9 end) as state /*单据状态*/
,(case when i.rednew=1 then '红冲单' when i.redold=1 then '已红冲' when i.settle=1 then '已结算' when i.effect=1 then '已过账' else '草稿' end) as statetext
,'' as susername /* 操作人 */
,'' as accname /* 科目 */
from salebill i
left join coursecentersale d on d.tid=i.tid and d.billid=i.billid
left join customer c on c.tid=i.tid and c.custid=i.custid
left join store k on k.tid=i.tid and k.storeid=i.storeid
left join employee e on e.tid=i.tid and e.empid=i.empid
left join user u on u.tid=i.tid and u.userid=i.userid
where i.tid=260434 and (i.billtype = 5 or i.effect = 1)
and ('_billdate_f_'!='')
and ('_billdate_t_'!='')
and ('_sdate_f_'!='')
and ('_sdate_t_'!='')
and ('_udate_f_'!='')
and ('_udate_t_'!='')
and ('_cdate_f_'!='')
and ('_cdate_t_'!='')
and ('_billid_'!='') /*单据id*/
and ('_custid_'!='') /*客户ID*/
and ('_storeid_'!='') /*店仓ID*/
and ('_empid_'!='') /*业务员ID*/
and ('_custstop_'!='') /*客户是否停用*/
and (
(match(i.billno) against(concat('"','葡萄','"') in boolean mode))
or (match(i.remarks) against(concat('"','葡萄','"') in boolean mode))
or exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))))
or exists(select 1 from customer c where c.tid=260434 and c.custid=i.custid and (match(c.custname) Against(concat('"','葡萄','"') in boolean mode)))
or exists(select 1 from goods g join salebilldetail d on d.tid=g.tid and d.goodsid=g.goodsid where d.tid=260434 and d.billid=i.billid
and ((match(g.goodsname) Against(concat('"','葡萄','"') in boolean mode))
or (match(g.goodscode) Against(concat('"','葡萄','"') in boolean mode))))
)
and i.rednew=0 /*单据列表不含红冲单*/
and i.billid not in (select billid from coursecenter_del t where t.tid=260434)
and ((i.settle=1 and i.effect=1 and i.redold=0 and i.rednew=0)) /*已结算*/
order by udate desc,billno desc
limit 0,100;
exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks) Against(concat('"','葡萄','"') in boolean mode)) or (match(d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))))
Copy after login
test_11
改成使用全文索引 remarks_goodsremarks_fulltext
-- 测试11
exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))))
Copy after login
执行时间无限长(跑了半天没成功)? 经分析,在 where 子句中,一个条件子句中包含一个以上 match 时会出现这样的情况。即:
-- and 中只有一个全文检索时正常, 用时0.2秒
select xxx from xxx
...
and (
exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))))
)
...
-- 下面这样就异常了,会慢成百上千倍,用时 160 秒, 如果有更多的 match ,会更夸张的慢下去
select xxx from xxx
...
and (
exists(select 1 from salebilldetail d where d.tid=260434 and d.billid=i.billid and ((match(d.remarks,d.goodsremarks) Against(concat('"','葡萄','"') in boolean mode))))
or match(i.billno) against(concat('"','葡萄','"') in boolean mode)
)
...
Copy after login
测试结果汇总:
查询
用时(秒)
备注
test 1
0.765
原始like查询
test 2
0.834
全文索引 remarks_fulltext、goodsremarks_fulltext
test 3
0.242
全文索引 remarks_goodsremarks_fulltext
---
test 4
22.654
原始like查询,不过滤 tid
test 5
24.855
全文索引 remarks_fulltext、goodsremarks_fulltext, 不过滤 tid
-- 修改加密规则
ALTER USER 'root'@'localhost' IDENTIFIED BY 'password' PASSWORD EXPIRE NEVER;
-- 更新用户密码
ALTER USER '账号'@'%' IDENTIFIED WITH mysql_native_password BY '密码';
-- 刷新权限
FLUSH PRIVILEGES;
-- 原来的流程:
mysql> grant all on *.* to 'admin'@'%' identified by 'admin';
-- 新的正确流程:
mysql> create user 'admin'@'%' identified by 'admin';
mysql> grant all on *.* to 'admin'@'%' ;
mysql> flush privileges;
show variables like '%time_zone%';
set global time_zone='+8:00';
Copy after login
如果时区问题还不能解决:
JDBC 连接串修改如下(首先需要驱动使用8.0对应连接的驱动):
MySQL 5.7 原生支持JSON类型,并引入了众多JSON函数
MySQL 8.0 JSON字段的部分更新(JSON Partial Updates)
MySQL 8.0 默认字符集由latin1修改为utf8mb4
MySQL 8.0 正则表达式的增强,新增了4个相关函数,REGEXP_INSTR(),REGEXP_LIKE(),REGEXP_REPLACE(),REGEXP_SUBSTR()
MySQL 8.0 GROUP BY语句不再隐式排序 (忽略在Group By中的排序命令,如 desc, asc)
The above is the detailed content of What does mysql ft refer to?. For more information, please follow other related articles on the PHP Chinese website!
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn