In this article, I will show you how to implement full-text search using Ruby on Rails and Elasticsearch. Nowadays, everyone is accustomed to typing a search term and getting suggestions and highlighted results for the search term. Autocorrect is also a nice feature if what you're trying to search for is misspelled, as we've seen on sites like Google or Facebook.
Achieving all of this functionality using just a relational database like MySQL or Postgres is not straightforward. So we use Elasticsearch, which you can think of as a database built and optimized specifically for search. It is open source and built on Apache Lucene.
One of the best features of Elasticsearch is that it exposes its functionality through a REST API, so there are libraries that wrap this functionality for most programming languages.
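Because everything goes over HTTP, a search is ultimately just a JSON body posted to an index's _search endpoint. Here is a minimal sketch of what a client library does for you, using only the Ruby standard library. The helper name build_search_request, the index name "articles", and the "title" field are assumptions for illustration, and the request is only built, never sent.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) the HTTP request for a simple match query.
# The index name "articles" and the field "title" are assumed for illustration.
def build_search_request(term, host: 'http://localhost:9200', index: 'articles')
  uri = URI.join(host, "/#{index}/_search")
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate({ query: { match: { title: term } } })
  request
end

request = build_search_request('ruby')
puts "#{request.method} #{request.path}"
puts request.body
```

The gems used later in this article build and send requests like this one, so you rarely have to write them by hand.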
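Earlier, I mentioned that Elasticsearch is like a database for search, so it helps to be familiar with some of its terminology. Roughly speaking, an index plays the role of a database, a document that of a row, and a field that of a column.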
One thing to note here is that when you write a document to an index in Elasticsearch, the document's fields are analyzed, that is, broken down into normalized terms, so that searching is easy and fast. Elasticsearch also supports geolocation, so you can search for documents located within a certain distance of a given location. This is exactly how Foursquare implements search.
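To make the idea of analysis more concrete, here is a toy sketch in plain Ruby of an analyzer and an inverted index, the structure that maps each term to the documents containing it. It is a strong simplification of what Lucene does, and the names analyze, build_inverted_index, and search, as well as the sample documents, are made up for this example.

```ruby
require 'set'

# Very rough stand-in for an analyzer: lowercase the text and split it into word tokens.
def analyze(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Build an inverted index: token => set of ids of the documents containing it.
def build_inverted_index(documents)
  index = Hash.new { |hash, token| hash[token] = Set.new }
  documents.each do |id, text|
    analyze(text).each { |token| index[token] << id }
  end
  index
end

# Return the sorted ids of the documents that contain every token of the query.
def search(index, query)
  tokens = analyze(query)
  return [] if tokens.empty?

  tokens.map { |token| index.fetch(token, Set.new) }.reduce(:&).to_a.sort
end

# Hypothetical sample documents.
documents = {
  1 => 'Ruby on Rails is a web framework',
  2 => 'Elasticsearch is built on Apache Lucene',
  3 => 'Full-text search in Rails with Elasticsearch'
}

index = build_inverted_index(documents)
p search(index, 'rails')
p search(index, 'Elasticsearch RAILS')
```

Because the query goes through the same analyze step as the documents, a search for "RAILS" finds documents that contain "Rails". A lookup is a hash access per term instead of a scan over every document.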
Elasticsearch is also built with high scalability in mind, so it is easy to build clusters with multiple servers that remain highly available even if some servers fail. I won't go into detail on how to plan and deploy different types of clusters in this article.
If you are using Linux, you may be able to install Elasticsearch from one of the repositories. It is available for both APT and YUM.
If you are using a Mac, you can install it with Homebrew by running brew install elasticsearch. After installing Elasticsearch, you will see a list of related folders in the terminal:
To verify that the installation is working properly, start it by typing elasticsearch in the terminal. Then run curl localhost:9200 in a second terminal window, and you should see something similar to:
Elastic HQ is a monitoring plugin that we can use to manage Elasticsearch from the browser, similar to phpMyAdmin for MySQL. To install it, run the following in the terminal:
/usr/local/Cellar/elasticsearch/2.2.0_1/libexec/bin/plugin install royrusso/elasticsearch-HQ
After the installation is complete, navigate to http://localhost:9200/_plugin/hq in your browser:
Click Connect and you will see a screen showing the cluster status:
At this point, as you might expect, no indexes or documents have been created yet, but we do have a local instance of Elasticsearch installed and running.
I'm going to create a very simple Rails application where you add articles to a database so that we can perform a full text search on them using Elasticsearch. Start by creating a new Rails application:
rails new elasticsearch-rails
Next we use scaffolding to generate a new article resource:
rails generate scaffold article title:string text:text
Now we need to add a new root route so that we can see the article list by default. Edit config/routes.rb:
Rails.application.routes.draw do
  root to: 'articles#index'
  resources :articles
end
Create the database by running the command rake db:migrate. Then start rails server, open a browser, navigate to localhost:3000, and add some articles to the database. Alternatively, just download the file db/seeds.rb with the dummy data I created so you don't have to spend a lot of time filling in the form.
Now that we have our little Rails application containing the articles in the database, we're ready to add search functionality. We'll start by adding references to two official Elasticsearch gems to the Gemfile:
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
On many websites, it is common to have a text box for search in the top menu of all pages. So I'm going to create a form partial at app/views/search/_form.html.erb. As you can see, the form is submitted using GET, so the URL for a specific search is easy to copy and paste.
<%= form_for :term, url: search_path, method: :get do |form| %>
  <p>
    <%= text_field_tag :term, params[:term] %>
    <%= submit_tag "Search", name: nil %>
  </p>
<% end %>
Add a reference to the form in the main site layout. Edit app/views/layouts/application.html.erb.
<body>
  <%= render 'search/form' %>
  <%= yield %>
</body>
Now we also need a controller to perform the actual search and display the results, so we run the command rails g controller Search to generate it and add a search action:
class SearchController < ApplicationController
  def search
    if params[:term].nil?
      @articles = []
    else
      @articles = Article.search params[:term]
    end
  end
end
As you can see, I'm calling the search method on the Article model. We haven't defined it yet, so if we try to perform a search at this point, we'll get an error. We also haven't added the SearchController route to the config/routes.rb file, so let's do that:
Rails.application.routes.draw do
  root to: 'articles#index'
  resources :articles
  get "search", to: "search#search"
end
According to the documentation for the 'elasticsearch-rails' gem, we need to include two modules in the model we want indexed in Elasticsearch, in our case article.rb.
require 'elasticsearch/model'

class Article < ActiveRecord::Base
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end
The first module injects the search method we used in the controller above. The second module integrates with ActiveRecord callbacks to index each article we save to the database, and it also updates the index when we modify or delete an article. So it's all transparent to us.
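To illustrate the idea behind the callbacks, here is a toy sketch in plain Ruby that keeps a fake index in step with save and destroy. This is not how the gem is implemented. FakeIndex, IndexSync, and ToyArticle are hypothetical names, and the "database" calls only pretend to succeed.

```ruby
# In-memory stand-in for an Elasticsearch index (hypothetical, illustration only).
class FakeIndex
  attr_reader :documents

  def initialize
    @documents = {}
  end

  def index(id, body)
    @documents[id] = body
  end

  def delete(id)
    @documents.delete(id)
  end
end

# Hypothetical mixin that keeps the index in step with save and destroy.
module IndexSync
  def save
    saved = super
    self.class.search_index.index(id, to_document) if saved
    saved
  end

  def destroy
    destroyed = super
    self.class.search_index.delete(id) if destroyed
    destroyed
  end
end

# Toy model; save and destroy only pretend to touch a database.
class ToyArticle
  prepend IndexSync # prepend so the hooks above wrap save and destroy

  attr_reader :id
  attr_accessor :title

  def self.search_index
    @search_index ||= FakeIndex.new
  end

  def initialize(id, title)
    @id = id
    @title = title
  end

  def save
    true
  end

  def destroy
    true
  end

  def to_document
    { title: title }
  end
end

article = ToyArticle.new(1, 'Full-text search')
article.save
p ToyArticle.search_index.documents
article.title = 'Full-text search in Rails'
article.save
p ToyArticle.search_index.documents
article.destroy
p ToyArticle.search_index.documents
```

The point of the sketch is that the calling code only ever talks to the model; keeping the index current is a side effect of persisting the record.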
If you previously imported data into the database, those articles are not yet in the Elasticsearch index; only new ones are indexed automatically. We therefore have to index them manually, which is easy from the Rails console: launch rails console and run Article.import at the irb(main) > prompt.
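Article.import does not send one request per record; it groups the existing records into batches and sends each batch through the bulk API. A rough sketch of that batching idea in plain Ruby follows. It is not the gem's actual code, and the helper name bulk_batches and the sample rows are hypothetical.

```ruby
# Group records into bulk payloads: one "index" action per record,
# at most batch_size records per request.
def bulk_batches(records, batch_size: 1000)
  records.each_slice(batch_size).map do |batch|
    batch.map do |record|
      data = record.reject { |key, _value| key == :id }
      { index: { _id: record[:id], data: data } }
    end
  end
end

# Hypothetical rows standing in for Article records.
articles = (1..5).map { |id| { id: id, title: "Article #{id}" } }

bulk_batches(articles, batch_size: 2).each_with_index do |batch, number|
  puts "request #{number + 1}: #{batch.length} documents"
end
```

With the real gem, the gem's README documents options such as Article.import batch_size: 500 to tune the batch size and Article.import force: true to recreate the index before importing.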
Now we are ready to try out the search feature. If I type "ruby" and click search, the results are:
On many websites, you can see how the terms you searched for are highlighted on the search results page. This is easy to do using Elasticsearch.
Edit app/models/article.rb and modify the default search method:
def self.search(query)
  __elasticsearch__.search(
    {
      query: {
        multi_match: {
          query: query,
          fields: ['title', 'text']
        }
      },
      highlight: {
        pre_tags: ['<em>'],
        post_tags: ['</em>'],
        fields: {
          title: {},
          text: {}
        }
      }
    }
  )
end
By default, the search method is defined by the 'elasticsearch-model' gem, which also provides the __elasticsearch__ proxy object to access the wrapper class around the Elasticsearch API. Therefore, we can replace the default query with any of the standard JSON options described in the Elasticsearch documentation.
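Since the query body is an ordinary Ruby hash, it can be built by a small helper instead of being written out by hand each time. Here is a minimal sketch, assuming you want to vary the searched fields and the highlight tag. The helper name highlighted_query and its parameters are my own invention, not part of the gem.

```ruby
require 'json'

# Build a multi_match query with highlighting for an arbitrary list of fields.
def highlighted_query(term, fields: %w[title text], tag: 'em')
  {
    query: { multi_match: { query: term, fields: fields } },
    highlight: {
      pre_tags: ["<#{tag}>"],
      post_tags: ["</#{tag}>"],
      # Highlight every field that is searched, with default settings.
      fields: fields.each_with_object({}) { |field, memo| memo[field.to_sym] = {} }
    }
  }
end

# Print the JSON that would be sent to Elasticsearch.
puts JSON.pretty_generate(highlighted_query('ruby'))
```

Inside the model, the result of such a helper could be passed straight to __elasticsearch__.search.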
The search method now wraps the parts of each result that match the query in the specified HTML tag. For the highlighting to show up, the search results page has to render those tags as HTML, so edit app/views/search/search.html.erb:
<h1>Search Results</h1>
<% if @articles.empty? %>
  <p>Your search did not match any documents.</p>
<% else %>
  <ul class="search_results">
    <% @articles.each do |article| %>
      <li>
        <h3>
          <%= link_to article.try(:highlight).try(:title) ? article.highlight.title[0].html_safe : article.title,
                controller: "articles", action: "show", id: article._id %>
        </h3>
        <% if article.try(:highlight).try(:text) %>
          <% article.highlight.text.each do |snippet| %>
            <p><%= snippet.html_safe %>...</p>
          <% end %>
        <% end %>
      </li>
    <% end %>
  </ul>
<% end %>
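One caveat about html_safe: the snippets contain whatever users typed into the articles, so marking them as safe also renders any other markup stored in the text. A possible mitigation, sketched here in plain Ruby, is to escape the whole snippet first and then restore only the highlight tags. The helper name safe_highlight is hypothetical; in a Rails view you would still call html_safe on its result.

```ruby
require 'erb'

# Escape everything in the snippet, then restore only the highlight tags.
def safe_highlight(snippet, tag: 'em')
  escaped = ERB::Util.html_escape(snippet).to_s
  escaped
    .gsub("&lt;#{tag}&gt;", "<#{tag}>")
    .gsub("&lt;/#{tag}&gt;", "</#{tag}>")
end

puts safe_highlight('Learn <em>Ruby</em> <script>alert(1)</script>')
```

A literal em tag typed into an article would still be rendered, which is harmless, while script tags and other markup end up escaped.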
Add CSS styles to app/assets/stylesheets/search.scss for highlighted tags:
.search_results em {
  background-color: yellow;
  font-style: normal;
  font-weight: bold;
}
Try searching for "ruby" again:
As you can see, highlighting search terms is easy, but the approach is not ideal: we have to write the JSON query by hand, exactly as the Elasticsearch documentation specifies it, without any kind of abstraction.
The Searchkick gem is provided by Instacart and is an abstraction on top of the official Elasticsearch gems. I'm going to refactor the highlighting functionality with it, so we first add gem 'searchkick' to the Gemfile. The first class we need to change is the Article model in article.rb:
class Article < ActiveRecord::Base
  searchkick
end
As you can see, it's much simpler. We need to reindex the articles by running the command rake searchkick:reindex CLASS=Article. In order to highlight search terms, we also need to pass additional parameters to the search method from search_controller.rb.
class SearchController < ApplicationController
  def search
    if params[:term].nil?
      @articles = []
    else
      term = params[:term]
      @articles = Article.search term, fields: [:text], highlight: true
    end
  end
end
The last file we need to modify is app/views/search/search.html.erb, because Searchkick returns results in a different format:
<h2>Search Results for: <i><%= params[:term] %></i></h2>
<% if @articles.empty? %>
  <p>Your search did not match any documents.</p>
<% else %>
  <ul class="search_results">
    <% @articles.with_details.each do |article, details| %>
      <li>
        <h3>
          <%= link_to article.title, controller: "articles", action: "show", id: article.id %>
        </h3>
        <p><%= details[:highlight][:text].html_safe %>...</p>
      </li>
    <% end %>
  </ul>
<% end %>
Now it’s time to run the application again and test the search functionality:
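Note that I typed the search term "dato". I did this on purpose to show you that, by default, Searchkick is configured to analyze the indexed text and to be more tolerant of misspellings.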
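Tolerance for misspellings is usually based on edit distance: the number of single-character changes needed to turn one word into another. Here is a minimal sketch in plain Ruby, assuming a tolerance of one edit, which I believe matches Searchkick's default. Elasticsearch's fuzzy matching also counts a transposition as one edit, which this plain Levenshtein version does not. The method names are made up for the example.

```ruby
# Classic dynamic-programming Levenshtein distance between two strings.
def edit_distance(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # delete a character from a
        current[j - 1] + 1,    # insert a character into a
        previous[j - 1] + cost # substitute (or keep) a character
      ].min
    end
    previous = current
  end
  previous.last
end

# A query matches a term if it is within max_distance edits of it.
def fuzzy_match?(query, term, max_distance: 1)
  edit_distance(query.downcase, term.downcase) <= max_distance
end

puts edit_distance('dato', 'data')
puts fuzzy_match?('dato', 'data')
puts fuzzy_match?('dato', 'ruby')
```

"dato" is one substitution away from "data", which is why the misspelled search still finds the articles.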
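Autosuggest, or typeahead, predicts what the user is going to type, which makes the search experience faster and easier. Keep in mind that unless you have thousands of records, it may be better to filter on the client side.
Let's start by adding the typeahead plugin, which is available through gem 'bootstrap-typeahead-rails'; add it to your Gemfile. Next, we need to add some JavaScript to app/assets/javascripts/application.js so that suggestions appear as you start typing in the search box.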
//= require jquery
//= require jquery_ujs
//= require turbolinks
//= require bootstrap-typeahead-rails
//= require_tree .

var ready = function() {
  var engine = new Bloodhound({
    datumTokenizer: function(d) {
      console.log(d);
      return Bloodhound.tokenizers.whitespace(d.title);
    },
    queryTokenizer: Bloodhound.tokenizers.whitespace,
    remote: {
      url: '../search/typeahead/%QUERY'
    }
  });

  var promise = engine.initialize();

  promise
    .done(function() { console.log('success'); })
    .fail(function() { console.log('error') });

  $("#term").typeahead(null, {
    name: "article",
    displayKey: "title",
    source: engine.ttAdapter()
  })
};

$(document).ready(ready);
$(document).on('page:load', ready);
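A few comments on the previous snippet. The last two lines are how I hook up the code I want to run on page load, because I haven't disabled Turbolinks. In the first part of the script, you can see that I'm using Bloodhound, the typeahead.js suggestion engine, and setting the JSON endpoint that receives the AJAX requests for suggestions. After that, I call initialize() on the engine and set up typeahead on the search text field using its id, "term".
Now we need the backend implementation for the suggestions. Let's start by adding the route; edit config/routes.rb.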
Rails.application.routes.draw do
  root to: 'articles#index'
  resources :articles
  get "search", to: "search#search"
  get 'search/typeahead/:term' => 'search#typeahead'
end
Next, I'll add the implementation to app/controllers/search_controller.rb.
def typeahead
  results = Article.search(params[:term],
    fields: ["title"],
    limit: 10,
    load: false,
    misspellings: {below: 5})

  # Use braces rather than do...end so the block binds to map, not to render
  render json: results.map { |article| { title: article.title, value: article.id } }
end
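This method returns the search results for the typed term as JSON. I'm only searching by title, but I could also include the body of the article. I'm also limiting the number of results to at most 10.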
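The shape of that JSON response can be sketched without Elasticsearch. The following plain Ruby version filters a small in-memory list by word prefix, caps the number of results, and maps each match to the title and value keys the typeahead widget reads. The method name suggestions and the sample articles are hypothetical, and the matching is far simpler than what Searchkick does.

```ruby
require 'json'

# Return up to `limit` suggestions whose title contains a word starting with the term.
def suggestions(articles, term, limit: 10)
  prefix = term.to_s.downcase
  return [] if prefix.empty?

  articles
    .select { |article| article[:title].downcase.split.any? { |word| word.start_with?(prefix) } }
    .first(limit)
    .map { |article| { title: article[:title], value: article[:id] } }
end

# Hypothetical sample data standing in for the articles table.
articles = [
  { id: 1, title: 'Ruby on Rails basics' },
  { id: 2, title: 'Elasticsearch for Rubyists' },
  { id: 3, title: 'Deploying with Docker' }
]

puts JSON.generate(suggestions(articles, 'rub'))
```

For a small, static list, a filter like this running in the browser may be all you need; the server-side endpoint pays off once the data set grows or you want typo tolerance.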
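Now we are ready to try out the typeahead implementation:
As you can see, using Elasticsearch with Rails makes searching your data very easy and fast. Here I've shown you how to use the low-level gems provided by Elasticsearch, as well as the Searchkick gem, an abstraction that hides some of the details of how Elasticsearch works.
Depending on your specific needs, you may be happy using Searchkick to implement full-text search quickly and easily. On the other hand, if you have more complex queries involving filters or groups, you may need to learn more about the Elasticsearch query language and end up using the lower-level gems 'elasticsearch-model' and 'elasticsearch-rails'.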