How to use Java to implement string matching algorithm
Introduction:
String matching algorithm is a common problem in the computer field, used to match a main string Find occurrences of a specific pattern string. In actual development, it is often necessary to match strings, such as the search function in text editors, keyword matching in search engines, etc. This article will introduce several common string matching algorithms and provide corresponding Java code examples.
1. Brute force matching algorithm
The brute force matching algorithm, also known as the naive matching algorithm, is the most basic string matching algorithm. Its principle is very simple, that is, starting from every position in the main string, it compares character by character with the pattern string until a non-matching character appears or a match is successfully found.
The specific implementation code is as follows:
public class BruteForceMatcher { public int match(String text, String pattern) { int n = text.length(); int m = pattern.length(); for (int i = 0; i <= n - m; i++) { int j; for (j = 0; j < m; j++) { if (text.charAt(i + j) != pattern.charAt(j)) { break; } } if (j == m) { return i; } } return -1; } }
2. KMP algorithm
The KMP algorithm is an efficient string matching algorithm. It uses part of the matching information of the pattern string to avoid inconsistencies. Necessary character comparison. The core of the KMP algorithm is to construct a next array to record the length of the longest common suffix at each position in the pattern string.
The specific implementation code is as follows:
public class KMPMatcher { public int match(String text, String pattern) { int n = text.length(); int m = pattern.length(); int[] next = getNext(pattern); int i = 0, j = 0; while (i < n && j < m) { if (j == -1 || text.charAt(i) == pattern.charAt(j)) { i++; j++; } else { j = next[j]; } } if (j == m) { return i - j; } else { return -1; } } private int[] getNext(String pattern) { int m = pattern.length(); int[] next = new int[m]; next[0] = -1; int i = 0, j = -1; while (i < m - 1) { if (j == -1 || pattern.charAt(i) == pattern.charAt(j)) { i++; j++; if (pattern.charAt(i) != pattern.charAt(j)) { next[i] = j; } else { next[i] = next[j]; } } else { j = next[j]; } } return next; } }
3. Boyer-Moore algorithm
Boyer-Moore algorithm is an efficient string matching algorithm that uses the character distribution information in the pattern string and move-back rules.
The specific implementation code is as follows:
public class BMMatcher { public int match(String text, String pattern) { int n = text.length(); int m = pattern.length(); int[] bmBc = preBmBc(pattern); int[] bmGs = preBmGs(pattern); int j = 0; while (j <= n - m) { int i; for (i = m - 1; i >= 0 && pattern.charAt(i) == text.charAt(i + j); i--); if (i < 0) { return j; } else { j += Math.max(bmGs[i], bmBc[text.charAt(i + j)] - m + 1 + i); } } return -1; } private int[] preBmBc(String pattern) { int[] bmBc = new int[256]; int m = pattern.length(); for (int i = 0; i < 256; i++) { bmBc[i] = m; } for (int i = 0; i < m - 1; i++) { bmBc[pattern.charAt(i)] = m - 1 - i; } return bmBc; } private int[] preBmGs(String pattern) { int m = pattern.length(); int[] bmGs = new int[m]; int i = m - 1, j = m; bmGs[m - 1] = m; while (i >= 0) { if (pattern.charAt(i) == pattern.charAt(j)) { bmGs[--i] = --j; } else { j = m - 1 - i; while (j < m && pattern.charAt(m - 1 - j) == pattern.charAt(j)) { j++; } bmGs[i] = j; } } return bmGs; } }
Conclusion:
The above are code examples of three common string matching algorithms, which are brute force matching algorithm, KMP algorithm and Boyer-Moore algorithm. In practical applications, appropriate algorithms can be selected according to specific needs to improve matching efficiency.
The above is the detailed content of How to implement string matching algorithm using java. For more information, please follow other related articles on the PHP Chinese website!