Sphinx Full-Text Search: A PHP Usage Tutorial

Jun 13, 2016 pm 12:37 PM
We use the email table from the previous article as the example.

Table structure:

CREATE TABLE email (
  emailid mediumint(8) unsigned NOT NULL auto_increment COMMENT 'email ID',
  fromid int(10) unsigned NOT NULL default '0' COMMENT 'sender ID',
  toid int(10) unsigned NOT NULL default '0' COMMENT 'recipient ID',
  content text NOT NULL COMMENT 'email body',
  subject varchar(100) NOT NULL COMMENT 'email subject',
  sendtime int(10) NOT NULL COMMENT 'send time',
  attachment varchar(100) NOT NULL COMMENT 'attachment IDs, comma-separated',
  PRIMARY KEY (emailid)
) ENGINE=MyISAM;


Open a console and start searchd. PHP can connect to Sphinx only while searchd is running (make sure the index has already been built):

d:\coreseek\bin\searchd -c d:\coreseek\bin\sphinx.conf



The coreseek/api directory provides the PHP interface file sphinxapi.php, which contains the SphinxClient class.

Include this file in your PHP script and create an instance:

require 'sphinxapi.php'; // PHP API file from coreseek/api

$sphinx = new SphinxClient();

// Sphinx host name and port
$sphinx->SetServer('localhost', 9312);

// Return the matches as a plain PHP array
$sphinx->SetArrayResult(true);

// Result window; the arguments are: offset, number of results to return, maximum number of matches
$sphinx->SetLimits(0, 20, 1000);

// Maximum query time, in milliseconds
$sphinx->SetMaxQueryTime(10);

// Run a simple search. It looks in all fields; searching specific fields is covered below.
// The index name comes from an index section in the configuration file.
// Separate several indexes with commas ('email,diary'), or use '*' for all indexes.
$index = 'email';

$result = $sphinx->Query('search keyword', $index);

print_r($result);

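$result is an array, in which:

total is the number of matches available in the result set (capped by the maximum set in SetLimits); total_found is the number of matching documents in the index

matches holds the matched documents, each with id, weight, and attrs

words is the search keyword broken into its tokens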



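You may wonder why the email content is missing. Sphinx does not return row data the way MySQL does, because it never stores the complete records, only the tokenized index data.

For the details, look at the matches array. The id in each match is the first column of the SELECT statement in sql_query in the configuration file. Ours looks like this:

sql_query = SELECT emailid,fromid,toid,subject,content,sendtime,attachment FROM email

So the id in matches is the emailid.

weight is the match weight. In general, the higher the weight, the earlier the match is returned. See the official documentation for more on match weights.

attrs holds the attributes declared with sql_attr_* in the configuration file. How to use them is covered below.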
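The result fields described above can be read with a couple of small helpers. This is a minimal sketch, not part of the original article: runSearch and summarizeMatches are hypothetical names, and the sketch assumes SetArrayResult(true) was called so that each match carries its own id key.

```php
<?php
// Hypothetical helper: run a query and fail loudly instead of returning false.
// $client is expected to be a SphinxClient (or anything with the same methods).
function runSearch($client, string $keyword, string $index): array
{
    $result = $client->Query($keyword, $index);
    if ($result === false) {
        // Query() returns false on failure; GetLastError() holds the reason.
        throw new RuntimeException('Sphinx query failed: ' . $client->GetLastError());
    }
    return $result;
}

// Hypothetical helper: reduce each match to id, weight, and attrs.
// Assumes SetArrayResult(true), so every match has an 'id' key.
function summarizeMatches(array $result): array
{
    $summary = [];
    // The 'matches' key is absent when nothing matched.
    foreach ($result['matches'] ?? [] as $match) {
        $summary[] = [
            'emailid' => (int) $match['id'],
            'weight'  => (int) $match['weight'],
            'attrs'   => $match['attrs'] ?? [],
        ];
    }
    return $summary;
}
```

With a live client this would be called as summarizeMatches(runSearch($sphinx, 'search keyword', 'email')).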


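All this means that a search hit is still not the email data we actually want. Sphinx does not store the original rows, so to get the real email data you have to query the MySQL email table using the IDs in matches. Even with this round trip, the search is still far faster than MySQL's LIKE, provided the table holds several hundred thousand rows or more. On smaller tables, using Sphinx will only be slower.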
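The lookup step might be sketched as follows. This is an illustration rather than the author's code: buildEmailLookupSql is a hypothetical name, and it assumes the email table and emailid column shown earlier. It only builds the SQL string; running it through your MySQL client is left out.

```php
<?php
// Hypothetical helper: turn the match IDs into a MySQL query
// that keeps the order in which Sphinx ranked the matches.
function buildEmailLookupSql(array $result): ?string
{
    $ids = [];
    foreach ($result['matches'] ?? [] as $match) {
        $ids[] = (int) $match['id']; // cast so only integers reach the SQL string
    }
    if ($ids === []) {
        return null; // nothing matched, so there is nothing to fetch
    }
    $list = implode(',', $ids);
    // FIELD() makes MySQL return the rows in the same order as the ID list.
    return "SELECT * FROM email WHERE emailid IN ($list) ORDER BY FIELD(emailid, $list)";
}
```

Without the ORDER BY FIELD clause MySQL may return the rows in any order, which would lose the relevance ranking.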



Next, some Sphinx features that work like MySQL conditions:

// Restrict the emailid range
$sphinx->SetIdRange($min, $max);

// Attribute filters. Only attributes declared with sql_attr_* in the configuration file can be filtered.
// We declared these earlier:
//   sql_attr_uint      = fromid
//   sql_attr_uint      = toid
//   sql_attr_timestamp = sendtime
// If you change these attributes, rebuild the index for the change to take effect.

// Match specific values
$sphinx->SetFilter('fromid', array(1, 2));          // fromid must be 1 or 2

// Invert the condition by passing true as the third argument ($exclude)
$sphinx->SetFilter('fromid', array(1, 2), true);    // fromid must not be 1 or 2

// Match a range of values
$sphinx->SetFilterRange('toid', 5, 200);            // toid is between 5 and 200

// Invert the condition by passing true as the fourth argument ($exclude)
$sphinx->SetFilterRange('toid', 5, 200, true);      // toid is outside 5-200

// Run the search
$result = $sphinx->Query('keyword', '*');
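Filters set on a client accumulate and are combined with AND, so the calls above would cancel each other out if they were all applied to a single query. A minimal sketch of applying one coherent set follows. applyEmailFilters is a hypothetical name; ResetFilters, SetFilter, and SetFilterRange are the SphinxClient methods.

```php
<?php
// Hypothetical helper: apply one coherent set of filters before a query.
// $client is expected to be a SphinxClient. Filters set on it accumulate,
// so the old ones are cleared first.
function applyEmailFilters($client, array $senderIds, int $toMin, int $toMax): void
{
    $client->ResetFilters();                          // drop filters left from earlier queries
    $client->SetFilter('fromid', $senderIds);         // fromid is one of $senderIds
    $client->SetFilterRange('toid', $toMin, $toMax);  // $toMin <= toid <= $toMax
}
```

A call such as applyEmailFilters($sphinx, array(1, 2), 5, 200) would then be followed by $sphinx->Query(...).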


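Sort modes
The following modes are available for sorting search results:

SPH_SORT_RELEVANCE: sort by relevance in descending order (best matches first)

SPH_SORT_ATTR_DESC: sort by an attribute in descending order (larger values first)

SPH_SORT_ATTR_ASC: sort by an attribute in ascending order (smaller values first)

SPH_SORT_TIME_SEGMENTS: sort by time segment (last hour/day/week/month) in descending order, then by relevance in descending order

SPH_SORT_EXTENDED: sort by an SQL-like combination of columns, ascending or descending

SPH_SORT_EXPR: sort by an arithmetic expression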


// Sort by an attribute
// Sort by fromid in descending order. Note that each call to SetSortMode replaces the previous sort.
$sphinx->SetSortMode(SPH_SORT_ATTR_DESC, 'fromid');

// To sort by several columns, use SPH_SORT_EXTENDED.
// @id is a built-in Sphinx keyword. Here it refers to emailid, because emailid is the first column in sql_query.
$sphinx->SetSortMode(SPH_SORT_EXTENDED, 'fromid ASC, toid DESC, @id DESC');

// Run the search
$result = $sphinx->Query('keyword', '*');

// See the sort modes section of the official documentation for more
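When the sort columns come from application code, the clause string for SPH_SORT_EXTENDED can be assembled from an array. This is a small sketch under that assumption; buildSortClause is a hypothetical helper, not part of the Sphinx API.

```php
<?php
// Hypothetical helper: build the clause string for SPH_SORT_EXTENDED
// from column => direction pairs.
function buildSortClause(array $columns): string
{
    $parts = [];
    foreach ($columns as $name => $direction) {
        $direction = strtoupper($direction);
        if ($direction !== 'ASC' && $direction !== 'DESC') {
            throw new InvalidArgumentException("Bad sort direction for $name: $direction");
        }
        $parts[] = "$name $direction";
    }
    return implode(', ', $parts);
}

$clause = buildSortClause(['fromid' => 'asc', 'toid' => 'desc', '@id' => 'desc']);
// $clause is 'fromid ASC, toid DESC, @id DESC'
```

The result would be passed as the second argument: $sphinx->SetSortMode(SPH_SORT_EXTENDED, $clause).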

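Match modes
The following match modes are available:

SPH_MATCH_ALL: match all query words (the default mode).

SPH_MATCH_ANY: match any of the query words.

SPH_MATCH_PHRASE: treat the whole query as a phrase that must match completely and in order.

SPH_MATCH_BOOLEAN: treat the query as a boolean expression.

SPH_MATCH_EXTENDED: treat the query as an expression in the CoreSeek/Sphinx internal query language. Starting with Coreseek 3/Sphinx 0.9.9, this option is superseded by SPH_MATCH_EXTENDED2, which offers more features and better performance. The old option is kept for compatibility with legacy code, so that existing applications keep working when Sphinx and its components, including the API, are upgraded.

SPH_MATCH_EXTENDED2: match the query using the second version of the extended match mode.

SPH_MATCH_FULLSCAN: force the query to be matched in full scan mode. Note that in this mode all query words are ignored. Filters, filter ranges, and grouping still apply, but no text matching takes place.

The mode we mainly care about is SPH_MATCH_EXTENDED2, the extended match mode, which allows conditions similar to those in MySQL: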

// Select the extended match mode
$sphinx->SetMatchMode(SPH_MATCH_EXTENDED2);

// Use conditions inside the query; field names start with @.
// Emails whose content contains "test" and whose toid equals 1:
$result = $sphinx->Query('@content (test) & @toid =1', '*');

// Combine parentheses with & (AND), | (OR), and - (NOT, i.e. !=) to build more complex conditions
$result = $sphinx->Query('(@content (test) & @subject =uh) | (@fromid -(100))', '*');

// See the match modes section of the official documentation for more syntax
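A field-scoped query string can also be assembled from an array. The sketch below uses only the @field operator and the implicit AND between space-separated conditions. buildFieldQuery is a hypothetical helper, and it assumes the terms are plain words; user input should be escaped first, for which SphinxClient provides EscapeString().

```php
<?php
// Hypothetical helper: build an extended-mode query from field => term pairs.
// Terms are assumed to be plain words with no query-syntax characters.
function buildFieldQuery(array $terms): string
{
    $parts = [];
    foreach ($terms as $field => $term) {
        $parts[] = "@$field $term";
    }
    // Conditions separated by spaces are ANDed implicitly.
    return implode(' ', $parts);
}

$query = buildFieldQuery(['content' => 'test', 'subject' => 'hello']);
// $query is '@content test @subject hello'
```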

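One thing worth noting in extended match mode is which fields can be searched. If a column is declared as an attribute, an extended query does not search it as a field by default; you can only constrain it with SetFilter(), SetFilterRange(), and the like.

We declared fromid, toid, and sendtime as attributes earlier. What if we also want to use them as conditions in an extended query?

Just select the column a second time in sql_query:

sql_query = SELECT emailid,fromid,fromid,toid,toid,subject,content,sendtime,sendtime,attachment FROM email

// Rebuild the index after changing the configuration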

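More condition tricks
These are only tricks and are not recommended for production. For the reason, see the end of the article.



<, <=, >, >=
Sphinx has no such comparison operators by default.

What if I want the emails sent after a certain date? Emulate the comparison with SetFilterRange():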

// Use one of the following filters at a time. SetFilterRange bounds are inclusive.

// Greater than or equal to a timestamp $time
$sphinx->SetFilterRange('sendtime', $time, 4294967295);     // 4294967295 is the largest 32-bit unsigned value

// Greater than $time
$sphinx->SetFilterRange('sendtime', $time + 1, 4294967295);

// Less than or equal to $time
$sphinx->SetFilterRange('sendtime', 0, $time);              // the smallest timestamp is 0

// Less than $time
$sphinx->SetFilterRange('sendtime', 0, $time - 1);
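The four cases above follow one pattern, which can be captured in a small helper. This is a sketch, not a Sphinx feature: comparisonToRange, ATTR_MIN, and ATTR_MAX are hypothetical names, and the bounds assume sendtime is a 32-bit unsigned attribute.

```php
<?php
// Assumption: sendtime is a 32-bit unsigned attribute, so these are its bounds.
const ATTR_MIN = 0;
const ATTR_MAX = 4294967295;

// Hypothetical helper: map a comparison operator to inclusive [min, max]
// bounds that can be passed to SetFilterRange().
function comparisonToRange(string $operator, int $value): array
{
    switch ($operator) {
        case '>=': return [$value, ATTR_MAX];
        case '>':  return [$value + 1, ATTR_MAX];
        case '<=': return [ATTR_MIN, $value];
        case '<':  return [ATTR_MIN, $value - 1];
        default:
            throw new InvalidArgumentException("Unsupported operator: $operator");
    }
}

[$min, $max] = comparisonToRange('>', 1465800000);
// $min is 1465800001 and $max is 4294967295
```

The bounds would then be applied with $sphinx->SetFilterRange('sendtime', $min, $max).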

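IS NOT NULL
How do you search on whether a field is empty, for example to find emails with no attachment? You might think @attachment ('') would do it, but that actually searches for two single-quote characters. Strings in a Sphinx query are not quoted.

Sphinx does not currently offer this feature, but you can work around it in the MySQL statement:

sql_query = SELECT emailid,fromid,toid,subject,content,sendtime,attachment != '' AS attachisnotnull FROM email
// This returns a new column, attachisnotnull. When attachisnotnull is 1, the attachment is not empty.

// Rebuild the index after changing the configuration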



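FIND_IN_SET()
To find emails that contain a given attachment, MySQL users would reach for FIND_IN_SET and be done in one line. In Sphinx you must declare a multi-valued attribute (MVA) with sql_attr_multi in the configuration:

sql_attr_multi = uint attachment from field #attachment may hold attachment IDs separated by commas, spaces, semicolons, and so on; Sphinx recognizes them all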

// Rebuild the index after changing the configuration

// Then use SetFilter() in PHP

// Find emails that contain attachment ID 1 or 2.
// In MySQL this would be FIND_IN_SET('1', attachment) OR FIND_IN_SET('2', attachment)
$sphinx->SetFilter('attachment', array(1, 2));

// SetFilterRange also works: find emails that contain an attachment ID between 50 and 100
$sphinx->SetFilterRange('attachment', 50, 100);
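A filter on a multi-valued attribute matches a document when any one of its values satisfies the filter. The sketch below imitates that rule in plain PHP on a comma-separated attachment string, purely as an illustration of the semantics. mvaMatchesAny is a hypothetical function and is not how Sphinx implements the filter.

```php
<?php
// Hypothetical illustration of a filter on a multi-valued attribute:
// a document matches when ANY of its values is in the wanted set.
function mvaMatchesAny(string $attachmentCsv, array $wantedIds): bool
{
    // Split a stored "1,5,9" style string into integers, skipping empty pieces.
    $ids = array_map('intval', array_filter(explode(',', $attachmentCsv), 'strlen'));
    return array_intersect($ids, $wantedIds) !== [];
}
```

For example, mvaMatchesAny('1,5,9', array(1, 2)) is true because ID 1 is present, while an email with no attachments never matches.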