Configuring and Using coreseek and Sphinx
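Preface
For installing Sphinx, see Sphinx Installation Notes.
For installing coreseek, see coreseek Installation Notes.
Once Sphinx and coreseek are installed, searches return satisfactory results, but one problem remains: newly added data is not searchable until the index is rebuilt in Sphinx.
Because the existing data set is large, rebuilding the index takes a long time. If the data does not need to be synchronized in real time, a scheduled rebuild every night is enough.
If near-real-time synchronization is needed, for example when changes must become searchable within a few minutes, an incremental index is required.
The incremental index is then merged into the main index at night, during off-peak hours.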
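Configuration
Sphinx needs two data sources and two indexes: a main index and an incremental index, with the incremental index inheriting from the main index.
Because the indexes are merged at a scheduled time, until the next merge the incremental index only has to cover the rows that were changed or added since the previous merge.
So we need a helper table that records the time of the last merge, for the incremental index to use.
The helper table is simple: apart from its primary key it has a single field, the time of the last merge, and it always holds exactly one row.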
    CREATE TABLE t_blog_time_sphinx (
        c_id   INTEGER  PRIMARY KEY NOT NULL,
        c_time DATETIME NOT NULL
    );
The Sphinx configuration is as follows:
    # Main data source
    source main_source
    {
        type               = mysql
        sql_host           = 127.0.0.1
        sql_user           = test
        sql_pass           = test
        sql_db             = test
        sql_port           = 3306
        sql_query_pre      = SET NAMES utf8
        sql_query          = select c_id,c_title,c_content,c_year,c_month,c_day,c_modifytime,c_createtime FROM t_blog_sphinx;
        sql_attr_uint      = c_year
        sql_attr_uint      = c_month
        sql_attr_uint      = c_day
        sql_attr_timestamp = c_modifytime
        sql_attr_timestamp = c_createtime
        sql_field_string   = c_title
        sql_field_string   = c_content
    }

    # Incremental data source: only rows modified since the last merge
    source main_inc_source : main_source
    {
        sql_query_pre = SET NAMES utf8
        sql_query     = select c_id,c_title,c_content,c_year,c_month,c_day,c_modifytime,c_createtime FROM t_blog_sphinx where c_modifytime > ( SELECT c_time FROM t_blog_time_sphinx limit 1 );
    }

    # Main index
    index main_index
    {
        source           = main_source
        path             = /usr/local/coreseek4/var/data/main_index
        docinfo          = extern
        charset_type     = zh_cn.utf-8
        charset_dictpath = /usr/local/mmseg3/etc/
        ngram_len        = 0
    }

    # Incremental index, inherits from the main index
    index main_inc_index : main_index
    {
        source = main_inc_source
        path   = /usr/local/coreseek4/var/data/main_inc_index
    }

    # Indexer
    indexer
    {
        mem_limit = 32M
    }

    # Search daemon
    searchd
    {
        listen            = 9312
        listen            = 9306:mysql41
        log               = /usr/local/coreseek4/var/log/searchd.log
        query_log         = /usr/local/coreseek4/var/log/query.lo
        client_timeout    = 300
        read_timeout      = 5
        max_children      = 30
        pid_file          = /usr/local/coreseek4/var/log/searchd.pid
        max_matches       = 1000
        seamless_rotate   = 1
        preopen_indexes   = 1
        unlink_old        = 1
        mva_updates_pool  = 1M
        max_packet_size   = 8M
        max_filters       = 256
        max_filter_values = 4096
        max_batch_queries = 32
        workers           = threads # for RT to work
    }
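Starting Sphinx
Step one: insert a time into the helper table. The primary key is given explicitly because c_id has no default value.
    INSERT INTO t_blog_time_sphinx (c_id, c_time) VALUES (1, NOW());
Step two: build the main index and the incremental index.
    /usr/local/coreseek4/bin/indexer main_index
    /usr/local/coreseek4/bin/indexer main_inc_index
Step three: start the daemon.
    /usr/local/coreseek4/bin/searchd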
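Once searchd is running, a quick way to confirm that both indexes answer queries is the MySQL-protocol listener on port 9306 declared in the configuration. The following is a minimal sketch, assuming the mysql command-line client is installed; the helper name build_sphinxql and the sample keyword are illustrative and not part of Sphinx or coreseek.

```shell
#!/bin/sh
# Build a SphinxQL statement that searches the main and incremental indexes.
# build_sphinxql is a hypothetical helper name used only for this sketch.
# The keyword is pasted into the statement as-is, so it must not contain
# single quotes.
build_sphinxql() {
    keyword=$1
    limit=${2:-5}
    printf "SELECT * FROM main_index, main_inc_index WHERE MATCH('%s') LIMIT %d;\n" \
        "$keyword" "$limit"
}

# Usage (needs a running searchd and the mysql client):
#   build_sphinxql "sphinx" | mysql -h 127.0.0.1 -P 9306
```

Use 127.0.0.1 rather than localhost so that the client connects over TCP to port 9306 instead of trying the local MySQL socket.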
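Scheduled tasks
The scheduled tasks have to do the following:
- Rebuild the incremental index frequently, so that the day's changes become searchable
- Merge the incremental index into the main index at night
- Update the time in the helper table to the current time (usually minus a few minutes, so that the data overlaps by a few minutes and nothing is missed)
    # Rebuild the incremental index
    /usr/local/coreseek4/bin/indexer main_inc_index --rotate
    # Merge the incremental index into the main index
    /usr/local/coreseek4/bin/indexer --merge main_index main_inc_index --rotate
    # Update the last merge time in the helper table (SQL, run in MySQL)
    UPDATE t_blog_time_sphinx SET c_time = NOW() - INTERVAL 10 MINUTE;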
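The three tasks can be wrapped in one small script that cron calls with different arguments. This is only a sketch under stated assumptions: it uses the index names and MySQL credentials from the configuration above, while the function names, the DRY_RUN switch and the delta/merge arguments are made up for illustration. By default it only prints the commands, so it can be inspected safely before being enabled.

```shell
#!/bin/sh
# Sketch of a cron wrapper for the scheduled tasks.
# Function names, DRY_RUN and the delta/merge arguments are assumptions.
INDEXER=/usr/local/coreseek4/bin/indexer
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rebuild the incremental index; meant to run every few minutes.
rebuild_delta() {
    run "$INDEXER" main_inc_index --rotate
}

# Merge the incremental index into the main index and, only if that
# succeeds, move the helper-table time forward with a 10-minute overlap.
nightly_merge() {
    run "$INDEXER" --merge main_index main_inc_index --rotate &&
    run mysql -h 127.0.0.1 -u test -ptest test -e \
        "UPDATE t_blog_time_sphinx SET c_time = NOW() - INTERVAL 10 MINUTE"
}

case "${1:-}" in
    delta) rebuild_delta ;;
    merge) nightly_merge ;;
esac
```

With the script saved under a path of your choice, a crontab could run it with the argument delta every few minutes and with merge once a night, both with DRY_RUN=0 set. The delta and merge runs should not be scheduled to overlap.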
PHP test program
The sphinxapi.php file can be found in the coreseek test directory; copy it to the appropriate place in your PHP source tree.
For the format used to assemble the full-text query string, see the official documentation.
    // Load the Sphinx API
    include('api/coreseek_sphinxapi.php');

    // Initialize the Sphinx client; $ip and $port must point at searchd
    $sphinx = new SphinxClient();
    $sphinx->SetServer($ip, $port);

    // Filter on an attribute field
    if (isset($_GET["year"]) && strlen($_GET["year"]) > 0) {
        $sphinx->SetFilter("c_year", array((int)$_GET["year"]));
    }

    // Build the full-text query
    $query = "";
    if (isset($_GET["title"]) && strlen($_GET["title"]) > 0) {
        $query .= "|" . trim($_GET["title"]);
    }
    if (isset($_GET["content"]) && strlen($_GET["content"]) > 0) {
        $query .= "|" . trim($_GET["content"]);
    }
    $query = trim($query);

    // Run the search; it must cover both the main and the incremental index
    $res = $sphinx->Query($query, 'main_inc_index,main_index');
    echo "<p>query = $query </p>";

    // Print the result; GetLastError and GetLastWarning are for debugging
    echo "<pre>";
    print_r($sphinx->GetLastError());
    print_r($sphinx->GetLastWarning());
    print_r($res);
    echo "</pre>";
This article comes from http://tiankonguse.github.io; original URL: http://tiankonguse.github.io/blog/2014/11/06/sphinx-config-and-use/. Thanks to the original author for sharing.
