How to perform full-text retrieval and search in Java

Table of Contents

1. Introduce the Lucene library

2. Create an index

3. Perform search

4. Usage example

WBOY

Oct 08, 2023 am 09:31 AM

Full-text retrieval (full-text search) is a technique for finding keywords or phrases in large volumes of text. It is a core feature of applications that handle a lot of text, such as search engines, email systems, and document management systems.

Java has a rich ecosystem of libraries and tools for implementing full-text search. This article shows how to use the Apache Lucene library to build an index and search it, with concrete code examples.

1. Introduce the Lucene library

First, add Lucene to the project. In a Maven project, declare the Lucene modules that the code uses as dependencies:

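<dependencies>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-core</artifactId>
        <version>8.10.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-analyzers-common</artifactId>
        <version>8.10.1</version>
    </dependency>
    <!-- QueryParser, used by the search code in section 3, lives in its own module -->
    <dependency>
        <groupId>org.apache.lucene</groupId>
        <artifactId>lucene-queryparser</artifactId>
        <version>8.10.1</version>
    </dependency>
</dependencies>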

2. Create an index

Before we can search, we need to build an index. The index stores the text to be searched in a form that makes later lookups fast. The following is a simple example of creating an index:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.IOException;
import java.nio.file.Paths;

public class Indexer {
    private final IndexWriter indexWriter;

    // Opens (or creates) an on-disk index in the given directory.
    public Indexer(String indexDir) throws IOException {
        Directory dir = FSDirectory.open(Paths.get(indexDir));
        // StandardAnalyzer splits text into tokens and lowercases them
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        indexWriter = new IndexWriter(dir, config);
    }

    // Commits pending changes and releases the index write lock.
    public void close() throws IOException {
        indexWriter.close();
    }

    // Indexes one piece of text as a document with a single "content" field.
    public void addDocument(String content) throws IOException {
        Document doc = new Document();
        // TextField is analyzed (tokenized); Field.Store.YES also keeps the original text
        doc.add(new TextField("content", content, Field.Store.YES));
        indexWriter.addDocument(doc);
    }
}

In this example, IndexWriter builds the index and TextField defines an indexed field: its text is run through the analyzer and tokenized, and Field.Store.YES additionally stores the original string so that it can be returned with search results. To index a piece of text, we create a Document object, add fields to it, and pass it to the addDocument method. Calling close() when finished commits the changes to the index.
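To make it more concrete what an index stores, here is a minimal sketch of a toy inverted index written with only the Java standard library. It is an illustration of the general idea (map each term to the documents that contain it), not Lucene's actual index format or analysis chain. The class name InvertedIndexSketch, its methods, and the simple split-and-lowercase tokenizer are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy inverted index: maps each lowercased term to the IDs of the documents containing it.
// Hypothetical illustration only; Lucene's real index is far more elaborate.
class InvertedIndexSketch {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Split on anything that is not a letter or digit, and lowercase each token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Store the document and record its ID under every term it contains.
    int addDocument(String content) {
        int docId = documents.size();
        documents.add(content);
        for (String term : tokenize(content)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Look up a single term and return the stored text of each matching document.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        SortedSet<Integer> docIds =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        for (int docId : docIds) {
            hits.add(documents.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addDocument("Hello, world!");
        index.addDocument("Java is a programming language.");
        index.addDocument("Lucene is a full-text search engine.");
        System.out.println(index.search("java")); // [Java is a programming language.]
    }
}
```

Because the lookup goes through the term map rather than scanning every document, the cost of a search depends on the number of matching documents, not on the total amount of text. This is the property that makes building an index up front worthwhile.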

3. Perform search

Once the index exists, we can search it. The following is a simple search example:

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.io.IOException;
import java.nio.file.Paths;

public class Searcher {
    private final IndexSearcher indexSearcher;
    private final QueryParser queryParser;

    // Opens an existing index for reading; the index must already have been created.
    public Searcher(String indexDir) throws IOException {
        Directory dir = FSDirectory.open(Paths.get(indexDir));
        // Use the same analyzer as at indexing time so query terms match indexed terms
        Analyzer analyzer = new StandardAnalyzer();
        IndexReader indexReader = DirectoryReader.open(dir);
        indexSearcher = new IndexSearcher(indexReader);
        // "content" is the default field for terms that do not name a field
        queryParser = new QueryParser("content", analyzer);
    }

    // Parses the query string and returns up to numResults hits, best match first.
    public ScoreDoc[] search(String queryString, int numResults) throws IOException, ParseException {
        Query query = queryParser.parse(queryString);
        TopDocs topDocs = indexSearcher.search(query, numResults);
        return topDocs.scoreDocs;
    }

    // Loads the stored fields of a hit by its document ID.
    public Document getDocument(int docID) throws IOException {
        return indexSearcher.doc(docID);
    }
}

In this example, QueryParser parses the query string into a Query object, using "content" as the default field and the same StandardAnalyzer that was used at indexing time. The search method of IndexSearcher then runs the query and returns a TopDocs object whose scoreDocs array holds at most numResults hits, ordered by descending relevance score. Each ScoreDoc carries only a document ID and a score, so getDocument is used to load the stored fields of a hit.
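The "score every match, sort by score, keep the top N" shape of a ranked search can be sketched with the standard library alone. The scoring below is a deliberately naive assumption (the number of times the term occurs in the document); Lucene's default similarity, BM25, is considerably more sophisticated. The names TopHitsSketch and Hit are hypothetical, with Hit standing in for Lucene's ScoreDoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Toy ranked search: score = occurrences of the query term in the document.
// Hypothetical illustration only; this is not how Lucene computes scores.
class TopHitsSketch {
    // Stand-in for Lucene's ScoreDoc: a document ID plus its score.
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Returns at most numResults hits for a single term, highest score first.
    static List<Hit> search(List<String> documents, String term, int numResults) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Hit> hits = new ArrayList<>();
        for (int docId = 0; docId < documents.size(); docId++) {
            int count = 0;
            String text = documents.get(docId).toLowerCase(Locale.ROOT);
            for (String token : text.split("[^\\p{L}\\p{N}]+")) {
                if (token.equals(needle)) {
                    count++;
                }
            }
            if (count > 0) {
                hits.add(new Hit(docId, count));
            }
        }
        // Stable sort: documents with equal scores stay in ascending ID order
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits.subList(0, Math.min(numResults, hits.size()));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "Java is a programming language.",
                "Java tools are written in Java.",
                "Lucene is a full-text search engine.");
        for (Hit hit : search(docs, "java", 10)) {
            System.out.println(hit.doc + " " + hit.score);
        }
    }
}
```

As with ScoreDoc, a Hit holds only an ID and a score; the caller looks up the document text separately, which keeps the ranking step cheap even when documents are large.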

4. Usage example

The following example puts the Indexer and Searcher classes together:

import org.apache.lucene.document.Document;
import org.apache.lucene.search.ScoreDoc;

public class Main {
    public static void main(String[] args) {
        // Placeholder path: replace with a writable directory for the index
        String indexDir = "/path/to/index/dir";

        try {
            // Build the index. By default an existing index is appended to,
            // so running this twice adds the same documents again.
            Indexer indexer = new Indexer(indexDir);
            indexer.addDocument("Hello, world!");
            indexer.addDocument("Java is a programming language.");
            indexer.addDocument("Lucene is a full-text search engine.");
            indexer.close();

            // Search the index and print the stored text of each hit
            Searcher searcher = new Searcher(indexDir);
            ScoreDoc[] results = searcher.search("Java", 10);
            for (ScoreDoc result : results) {
                Document doc = searcher.getDocument(result.doc);
                System.out.println(doc.get("content"));
            }
        } catch (Exception e) {
            // Covers I/O failures as well as query parsing errors
            e.printStackTrace();
        }
    }
}

In this example, we first create an Indexer, add three short texts to the index, and close it. We then create a Searcher, run the query "Java", and print the stored text of each hit. Of the three texts, only "Java is a programming language." contains the query term.

As these examples show, Lucene makes it straightforward to add full-text retrieval and search to a Java application. Because queries are answered from a prebuilt index rather than by scanning the raw text, keyword and phrase lookups remain efficient even over large collections of documents.
