
Chinese Fuzzy Search in MySQL

May 30, 2016 pm 05:10 PM

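What is fuzzy search, and why use it? I assume everyone already knows, so I will skip that. Today I will only cover how to do it.

Option 1: LIKE. The well-known LIKE clause is easy to use, portable, and easy to maintain, but it is very slow on large tables, because a pattern with a leading wildcard such as '%keyword%' cannot use an ordinary index and forces a full scan. Everyone knows how to use it, so I will not say more.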

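Option 2: MySQL's native full-text index (FULLTEXT index).

How it works: first add an index of type FULLTEXT to the target column. Then, at query time, put MATCH(...) AGAINST('keyword') in the WHERE clause of the SQL statement to specify the keyword.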

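Many articles online get this wrong. They claim that FULLTEXT does not support Chinese, or that full-text indexes do not support Chinese on Linux, and then say the fix is to convert the Chinese to pinyin. That is not the real story. The key issue is word segmentation: Chinese has no natural word boundaries, unlike English, where words are separated by spaces. Once text is converted to pinyin, the pinyin for each character is separated by spaces just like words, which is where the claim "FULLTEXT does not support Chinese, so convert it to pinyin" comes from.

In fact, any of these works: converting to pinyin, splitting the text into words with a segmentation rule and separating them with spaces, or even crudely putting a space between every character. Store the space-separated text in a dedicated column in the database. In other words, each piece of information is stored in two columns: the original text and the segmented text. Note that the FULLTEXT index must be created on the segmented column.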
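The segmentation step can be sketched in Python. This is a minimal illustration of the crudest strategy mentioned above (a space between every Chinese character), not the author's actual code. The function names `segment_for_fulltext` and `make_row`, and the rule that runs of ASCII letters and digits stay together while punctuation is dropped, are assumptions made for the example.

```python
import re
from typing import Tuple

# One CJK ideograph (basic block U+4E00..U+9FFF), or a run of ASCII letters/digits.
# Assumption: everything else (punctuation, whitespace) only separates tokens.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def segment_for_fulltext(text: str) -> str:
    """Return the text with every Chinese character separated by a space."""
    return " ".join(_TOKEN.findall(text))


def make_row(original: str) -> Tuple[str, str]:
    """Build the two values to store: the original text and its segmented copy."""
    return original, segment_for_fulltext(original)


if __name__ == "__main__":
    # prints ('MySQL中文模糊搜索', 'MySQL 中 文 模 糊 搜 索')
    print(make_row("MySQL中文模糊搜索"))
```

Search keywords would need to go through the same function before being passed to AGAINST(), so that they match the form stored in the indexed column.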

 

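Pros: more efficient than the LIKE clause, and natively supported by MySQL.

Cons: you have to maintain an extra column and do the word segmentation yourself. It is also complicated to use. How complicated? See the key points about FULLTEXT indexes below.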

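Option 3: Use a third-party component such as Sphinx (Coreseek), Xunsearch, and so on.

How it works: import the columns that need to be searched, together with the row Id, into the third-party component. To search, call the API the component provides, get back the matching Ids, and then query the database by those Ids.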
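The two-step lookup can be sketched as follows. This is an illustration only: the `songs` table, the `search_ids` callable, and the fake search function are hypothetical stand-ins, and the standard-library `sqlite3` module is used in place of MySQL so that the example runs on its own. A real setup would call the API of Sphinx or Xunsearch in step 1.

```python
import sqlite3
from typing import Callable, List, Sequence, Tuple


def fetch_rows_by_ids(conn: sqlite3.Connection, ids: Sequence[int]) -> List[Tuple[int, str]]:
    """Load rows for the given ids, keeping the order returned by the search component."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM songs WHERE id IN ({placeholders})", list(ids)
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # Ids missing from the database (stale index entries) are skipped.
    return [by_id[i] for i in ids if i in by_id]


def search(conn: sqlite3.Connection, search_ids: Callable[[str], List[int]], keyword: str):
    """Step 1: ask the search component for ids. Step 2: load the rows by id."""
    return fetch_rows_by_ids(conn, search_ids(keyword))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?)",
        [(1, "月亮代表我的心"), (2, "月光"), (3, "海阔天空")],
    )

    # Hypothetical stand-in for the third-party index: returns ids, best match first.
    def fake_search_ids(keyword: str) -> List[int]:
        return [2, 1] if keyword == "月" else []

    print(search(conn, fake_search_ids, "月"))
```

Skipping ids that no longer exist in the database is one way to tolerate an index that lags behind the data between synchronization runs.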

 

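Pros: more efficient than both options above, and no need to do word segmentation yourself.

Cons: you have to maintain the third-party component as well, and every database update has to be propagated to it.

I chose option 3, and I leave the maintenance and synchronization to scheduled jobs.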

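-- Key points about FULLTEXT indexes

1. The table's storage engine must be MyISAM. Since MySQL 5.6, InnoDB supports FULLTEXT indexes as well;

2. Column types: CHAR, VARCHAR, and TEXT;

3. MySQL enforces a minimum keyword length for full-text queries;

=> ft_min_word_len defaults to 4. I recommend changing it to 1; otherwise the keyword passed to AGAINST() must be at least 4 characters long, and you cannot search for a single character or a single Chinese character.

=> Add the following to the my.ini configuration file:

[mysqld]

ft_min_word_len = 1

=> After setting ft_min_word_len, restart the MySQL service, then run SHOW VARIABLES LIKE 'ft_min_word_len' to check whether the setting took effect;

=> After changing the configuration, existing FULLTEXT indexes must be rebuilt, otherwise you may get errors.

When I ran UPDATE on some records, I got this error: Incorrect key file for table './webm/temp.MYI'; try to repair it.

Afterwards I ran mysql> REPAIR TABLE table_name; and the problem went away;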

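4. MATCH() takes column names, not an index name;

=> The column list in MATCH(title, content) must be exactly the same as the column list in FULLTEXT(title, content).

5. MATCH(singername, songname) searches for the keyword in several columns at once, provided one FULLTEXT index covers exactly those columns;

6. In the default natural-language mode, a keyword that appears in 50% or more of the rows is treated as a stopword and ignored. Use AGAINST('keyword' IN BOOLEAN MODE) to bypass this threshold;

7. To search for several words, separate them with spaces or commas, as follows:

=> SELECT * FROM `temp` WHERE MATCH(`char`) AGAINST ('a x');

=> SELECT * FROM `temp` WHERE MATCH(`char`) AGAINST ('a,x');

=> AGAINST('keyword1 keyword2'): keywords separated by commas or spaces are combined with OR.

8. Every write to the table also has to update the index, so a FULLTEXT index slows down INSERT and UPDATE;

9. Search syntax rules (these operators apply IN BOOLEAN MODE);

=> + the word must be present (rows that do not contain it are ignored).

=> - the word must not be present (rows that contain it are ignored).

=> " " wrapping a phrase in double quotes means it must match exactly and cannot be split into separate words.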
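The three operators above can be combined into one search expression. The following Python sketch shows one way to assemble it; the helper name `build_boolean_query` and its parameters are hypothetical, and the `%s` placeholder assumes the parameter style used by common Python MySQL drivers. The helper only strips double quotes inside phrases and does not escape other operator characters in the input.

```python
from typing import Iterable


def build_boolean_query(
    must: Iterable[str] = (),
    must_not: Iterable[str] = (),
    phrases: Iterable[str] = (),
) -> str:
    """Compose the string passed to AGAINST(... IN BOOLEAN MODE)."""
    parts = [f"+{word}" for word in must]        # + : word must be present
    parts += [f"-{word}" for word in must_not]   # - : word must not be present
    # A double quote inside a phrase would end it early, so drop it.
    parts += ['"{}"'.format(p.replace('"', "")) for p in phrases]
    return " ".join(parts)


# Pass the expression as a bound parameter rather than pasting it into the SQL text.
# The table and column names are the ones used in the article's own examples.
SQL = "SELECT * FROM `temp` WHERE MATCH(`char`) AGAINST (%s IN BOOLEAN MODE)"

if __name__ == "__main__":
    # prints: +a -x "b c"
    print(build_boolean_query(must=["a"], must_not=["x"], phrases=["b c"]))
```

With a driver cursor this would be run as something like `cursor.execute(SQL, (expression,))`, where `expression` is the string returned by the helper.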
