Table of Contents
Current approaches to Chinese search queries
Developing a Chinese full-text search plugin for MySQL
How to develop the plugin

zg Handbook, MySQL Development (1): Developing a Chinese Full-Text Search Plugin

Jun 01, 2016, 01:11 PM

Current approaches to Chinese search queries

  1. Fuzzy matching inside the database (a string search at query time, so queries are fairly slow)

  2. A dedicated full-text search engine (Sphinx, Lucene, etc.)
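
To make the cost of the first approach concrete, here is a minimal C sketch of what a runtime fuzzy match amounts to conceptually: every row is scanned with a substring search and no index is consulted. The `rows` array and the `count_matches` helper are hypothetical illustrations, not MySQL internals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the rows that contain `keyword`.
   Every row is scanned in full, so the cost grows linearly with the
   total amount of text. */
static size_t count_matches(const char *const *rows, size_t nrows,
                            const char *keyword)
{
  size_t i, hits = 0;

  assert(rows != NULL && keyword != NULL);
  for (i = 0; i < nrows; i++)
  {
    /* strstr compares bytes, so it also works on UTF-8 text */
    if (strstr(rows[i], keyword) != NULL)
      hits++;
  }
  return hits;
}
```

A full-text index avoids this per-row scan by looking the keyword up in a prebuilt word list instead.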


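I once worked on a project with data on the order of millions of records that did not need advanced full-text search (no complex matching requirements and no complex filter conditions). It only needed to retrieve data by keyword, and a MySQL full-text search plugin was used to meet that requirement.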


Developing a Chinese full-text search plugin for MySQL

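  1. MySQL's MyISAM engine supports third-party full-text search plugins, which can replace the default full-text parser.

  2. The plugin supplies a Chinese word-segmentation algorithm that tells MyISAM how to split text into words, and MyISAM builds the index from those words.

  3. At query time, the plugin segments the query string, and the index is searched to locate the matching records quickly.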
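
As a rough illustration of the segmentation step, the sketch below uses forward maximum matching over a tiny hard-coded dictionary. This is a swapped-in technique chosen for brevity, not the author's segmentation library; `toy_dict`, `utf8_char_len`, and `segment` are hypothetical names, and the source file is assumed to be saved as UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy dictionary (UTF-8). A real plugin would load a large
   dictionary into a faster structure such as a trie. */
static const char *const toy_dict[] = { "全文", "检索", "全文检索", "插件" };
#define TOY_DICT_SIZE (sizeof(toy_dict) / sizeof(toy_dict[0]))

/* Byte length of the UTF-8 character that starts with lead byte c
   (invalid lead bytes are treated as one byte). */
static size_t utf8_char_len(unsigned char c)
{
  if (c < 0x80) return 1;
  if ((c & 0xE0) == 0xC0) return 2;
  if ((c & 0xF0) == 0xE0) return 3;
  if ((c & 0xF8) == 0xF0) return 4;
  return 1;
}

/* Forward maximum matching: at each position take the longest dictionary
   word that matches; if none matches, emit a single character.
   Token byte lengths are written to lens[]; returns the number of tokens
   stored (at most max_tokens). */
static size_t segment(const char *doc, size_t length,
                      size_t *lens, size_t max_tokens)
{
  const char *p = doc;
  const char *end = doc + length;
  size_t count = 0;

  assert(doc != NULL && lens != NULL);
  while (p < end && count < max_tokens)
  {
    size_t remaining = (size_t)(end - p);
    size_t best = 0, i;

    for (i = 0; i < TOY_DICT_SIZE; i++)
    {
      size_t wlen = strlen(toy_dict[i]);
      if (wlen > best && wlen <= remaining &&
          memcmp(p, toy_dict[i], wlen) == 0)
        best = wlen;
    }
    if (best == 0)
    {
      best = utf8_char_len((unsigned char)*p);
      if (best > remaining)
        best = remaining;   /* truncated trailing character */
    }
    lens[count++] = best;
    p += best;
  }
  return count;
}
```

In a real parser plugin, each token would be handed to the server's add-word callback instead of being stored in an array.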


How to develop the plugin

The development method is explained mainly through the code comments. Create the file tft.c with the following code:

    #include <stdlib.h>
    #include <ctype.h>
    // Header that every MySQL plugin must include
    #include <mysql/plugin.h>
    // A word-segmentation library I wrote myself. It is not optimized and can be
    // replaced with another open-source implementation.
    #include <st_darts.h>
    #include <st_utils.h>

    #if !defined(__attribute__) && (defined(__cplusplus) \
        || !defined(__GNUC__) || __GNUC__ == 2 && __GNUC_MINOR__ ...

    ...

      if (param->mode == MYSQL_FTPARSER_FULL_BOOLEAN_INFO)
      {
        bool_info.yesno = 1;
      }
      // Pass the word to MySQL, either to build the index or to run a query.
      param->mysql_add_word(param, word, len, &bool_info);
    }

    /*
      Simple handling of English text: split on whitespace.

      param    plugin environment

      Description:
        Parses an English document or query string and passes the words to
        MySQL's indexing engine, either to build the index or to run a query.
    */
    static int tft_parse_en(MYSQL_FTPARSER_PARAM *param)
    {
      char *end, *start, *docend= param->doc + param->length;

      number_of_calls++;

      for (end= start= param->doc;; end++)
      {
        if (end == docend)
        {
          if (end > start)
            add_word(param, start, end - start);
          break;
        }
        // Cast to unsigned char: passing a negative char value (for example
        // a byte of a multibyte character) to isspace() is undefined behavior.
        else if (isspace((unsigned char)*end))
        {
          if (end > start)
            add_word(param, start, end - start);
          start= end + 1;
        }
      }
      return 0;
    }

    /*
      Segmentation function: splits a document or query string into words.
      For an all-English document it calls the English tokenizer.
    */
    #define c_uWordsCount 1024
    static int tft_parse(MYSQL_FTPARSER_PARAM *param)
    {
      if (NULL == param->doc || 0 == param->length){
        return 0;
      }
      // Count the number of calls
      number_of_calls++;
      st_timer stTimerType = ST_TIMER_MICRO_SEC;
      char* start = param->doc;
      char* docend = param->doc + param->length;
      // Initialize the segmentation handler
      struct st_wordInfo wordInfo[c_uWordsCount] = { { 0, 0, 0 } };
      st_darts_state dState;
      stDartsStateInit(g_s_pDarts, &dState, start, docend);
      uint32_t uWordsCount = 0;
      long long queryBeginTime = stTimer(stTimerType);
      // Loop to fetch the Chinese word segments
      while(uWordsCount < ...
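
To try the whitespace-splitting logic of `tft_parse_en` outside the server, a minimal sketch can swap `MYSQL_FTPARSER_PARAM` for a small stand-in struct. The names `fake_param`, `record_word`, and `split_on_space` are hypothetical and are not part of the MySQL plugin API; the loop itself follows the same shape as the listing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 16

/* Hypothetical stand-in for the fields of MYSQL_FTPARSER_PARAM used here,
   plus storage for the words that were found. */
struct fake_param
{
  const char *doc;               /* text to tokenize */
  size_t length;                 /* length of doc in bytes */
  const char *words[MAX_WORDS];  /* start of each recorded word */
  size_t lens[MAX_WORDS];        /* byte length of each recorded word */
  size_t count;                  /* number of recorded words */
};

/* Plays the role of param->mysql_add_word: it only records the word.
   Words beyond MAX_WORDS are silently dropped. */
static void record_word(struct fake_param *param, const char *word, size_t len)
{
  if (param->count < MAX_WORDS)
  {
    param->words[param->count] = word;
    param->lens[param->count] = len;
    param->count++;
  }
}

/* Same loop shape as tft_parse_en: split the document on whitespace. */
static int split_on_space(struct fake_param *param)
{
  const char *start, *end, *docend;

  assert(param != NULL && param->doc != NULL);
  docend = param->doc + param->length;
  for (end = start = param->doc;; end++)
  {
    if (end == docend)
    {
      if (end > start)
        record_word(param, start, (size_t)(end - start));
      break;
    }
    else if (isspace((unsigned char)*end))
    {
      if (end > start)
        record_word(param, start, (size_t)(end - start));
      start = end + 1;
    }
  }
  return 0;
}
```

Zero-initialize a `struct fake_param`, set `doc` and `length`, and call `split_on_space`; the recorded words point into the original buffer rather than being copied, just as the words passed to `mysql_add_word` do in the listing.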