分享mysql中文全文搜索：中文分词简单函数 -php手册-php.cn

Home

php教程

php手册

分享mysql中文全文搜索：中文分词简单函数

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Jun 07, 2016 am 11:40 AM

分享mysql中文全文搜索：中文分词简单函数
原文地址：http://www.jb100.net/html/content-22-400-1.html
前段时间研究中文全文搜索，结果发现mysql不支持中文的全文搜索。但是有一些解决办法，就是手动把中文单词用空格分开，然后搜索的时候加上 in boolean mode。但是这就带来一个问题，就是中文分词。这个是个很大的难题，貌似中科院有个小组就是专门做中文分词技术的。我们用 php来分词的话，要实现真正语义上的分词是非常困难的，就算实现了效率也不高。一般情况下，我们采用的是如下方法分词：

比如我们有一句话：你好我是刘春龙
那么我们可以这样来分词：你好好我我是是刘刘春春龙

这样虽然看起来有点傻，但是实际应用起来确实可行，因为我们搜索时候输入的关键词也是按照这个方法分词。

下面有个我自己写的函数，可以实现这种分词。传入三个参数，分别是：

1.需要分词的字符串，必须，英文，标点，数字，汉字，日语等都可以。编码为UTF-8
2.是否返回字符串，可选，默认是。如果传入false，那么将返回一个数组。
3.是否base64_encode中文，可选，默认是。Mysql的全文搜索有个配置是 ft_min_word_len 这个值一般是4，而我们分成的中文词语是两个字，就不会被mysql认为是一个词。而base64_encode过后，词语的长度为8，就不存在最小长度问题了。 base64_encode过后数据量会增大 50%。

注意，这里输入和输出的字符串编码都是UTF-8 function string2words($s,$return_string = true,$encode64 = true) { $re = ''; //匹配汉字 if (preg_match_all("/([x{4e00}-x{9fff}]{2,})/u",$s,$ms)) { foreach($ms[0] as $w) { //关键部分：分词 $l = strlen($w)/3; for($i=0;$i { $wi = substr($w,$i*3,6); if (strlen($wi) > 3) { $re .= ($encode64)?' '.str_replace(',','@',base64_encode($wi)):' '.$wi; } } } } //匹配数字 if (preg_match_all("/(d+[.]?d+)/",$s,$ms)) { foreach($ms[0] as $wi) { if(strlen($wi) >= 2) { $re .= ($encode64)?' '.str_replace(',','@',base64_encode($wi)):' '.$wi; } } $s = preg_replace("/(d+[.]?d+)/",' ',$s); } //去掉所有双字节字符 $s = preg_replace("/([^x{00}-x{ff}]+)/u",' ',$s); $re = $s.' '.$re; if (!$return_string) { $re = preg_replace("/([^d])([,.-?n])([^d])/",'$1 $3',$re); $re = trim(preg_replace("/[s]{2,}/",' ',$re)); $arr = explode(' ',$re); $re = array(); foreach($arr as $a) { if (strlen($a) >= 2) $re[] = $a; } return $re; } else { $re = trim(preg_replace("/[s,.]{2,}/",' ',$re)); return $re; } } 原文地址：http://www.jb100.net/html/content-22-400-1.html

AD：真正免费，域名+虚机+企业邮箱=0元

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

R.E.P.O. Best Graphic Settings

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks ago By DDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks ago By DDD

Will R.E.P.O. Have Crossplay?

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7549

CakePHP Tutorial

1382

What is the format of the account name of steam

win11 activation key permanent

nyt connections hints and answers

Related knowledge

Learn about introductory code examples for Python programming Jan 04, 2024 am 10:50 AM

Learn about Python programming with introductory code examples Python is an easy-to-learn, yet powerful programming language. For beginners, it is very important to understand the introductory code examples of Python programming. This article will provide you with some concrete code examples to help you get started quickly. Print HelloWorldprint("HelloWorld") This is the simplest code example in Python. The print() function is used to output the specified content

PHP variables in action: 10 real-life examples of use Feb 19, 2024 pm 03:00 PM

PHP variables store values during program runtime and are crucial for building dynamic and interactive WEB applications. This article takes an in-depth look at PHP variables and shows them in action with 10 real-life examples. 1. Store user input $username=$_POST["username"];$passWord=$_POST["password"]; This example extracts the username and password from the form submission and stores them in variables for further processing. 2. Set the configuration value $database_host="localhost";$database_username="username";$database_pa

Go language programming examples: code examples in web development Mar 04, 2024 pm 04:54 PM

"Go Language Programming Examples: Code Examples in Web Development" With the rapid development of the Internet, Web development has become an indispensable part of various industries. As a programming language with powerful functions and superior performance, Go language is increasingly favored by developers in web development. This article will introduce how to use Go language for Web development through specific code examples, so that readers can better understand and use Go language to build their own Web applications. 1. Simple HTTP Server First, let’s start with a

Java implements simple bubble sort code Jan 30, 2024 am 09:34 AM

The simplest code example of Java bubble sort Bubble sort is a common sorting algorithm. Its basic idea is to gradually adjust the sequence to be sorted into an ordered sequence through the comparison and exchange of adjacent elements. Here is a simple Java code example that demonstrates how to implement bubble sort: publicclassBubbleSort{publicstaticvoidbubbleSort(int[]arr){int

From beginner to proficient: Code implementation of commonly used data structures in Go language Mar 04, 2024 pm 03:09 PM

Title: From Beginner to Mastery: Code Implementation of Commonly Used Data Structures in Go Language Data structures play a vital role in programming and are the basis of programming. In the Go language, there are many commonly used data structures, and mastering the implementation of these data structures is crucial to becoming a good programmer. This article will introduce the commonly used data structures in the Go language and give corresponding code examples to help readers from getting started to becoming proficient in these data structures. 1. Array Array is a basic data structure, a group of the same type

Huawei Cloud Edge Computing Interconnection Guide: Java code examples to quickly implement interfaces Jul 05, 2023 pm 09:57 PM

Huawei Cloud Edge Computing Interconnection Guide: Java Code Samples to Quickly Implement Interfaces With the rapid development of IoT technology and the rise of edge computing, more and more enterprises are beginning to pay attention to the application of edge computing. Huawei Cloud provides edge computing services, providing enterprises with highly reliable computing resources and a convenient development environment, making edge computing applications easier to implement. This article will introduce how to quickly implement the Huawei Cloud edge computing interface through Java code. First, we need to prepare the development environment. Make sure you have the Java Development Kit installed (

How to use PHP to write inventory management function code in the inventory management system Aug 06, 2023 pm 04:49 PM

How to use PHP to write the inventory management function code in the inventory management system. Inventory management is an indispensable part of many enterprises. For companies with multiple warehouses, the inventory management function is particularly important. By properly managing and tracking inventory, companies can allocate inventory between different warehouses, optimize operating costs, and improve collaboration efficiency. This article will introduce how to use PHP to write code for inventory warehouse management functions, and provide you with relevant code examples. 1. Establish the database before starting to write the code for the inventory warehouse management function.

Guidance and Examples: Learn to implement the selection sort algorithm in Java Feb 18, 2024 am 10:52 AM

Java Selection Sorting Method Code Writing Guide and Examples Selection sorting is a simple and intuitive sorting algorithm. The idea is to select the smallest (or largest) element from the unsorted elements each time and exchange it until all elements are sorted. This article will provide a code writing guide for selection sorting, and attach specific Java sample code. Algorithm Principle The basic principle of selection sort is to divide the array to be sorted into two parts, sorted and unsorted. Each time, the smallest (or largest) element is selected from the unsorted part and placed at the end of the sorted part. Repeat the above

See all articles