Understanding the KMP Algorithm the Easy Way

Apr 30, 2019 pm 02:25 PM
js kmp algorithm

The KMP (Knuth-Morris-Pratt) algorithm is a string-matching algorithm that finds a given substring within a string. It is not easy to understand and master, though. The key to understanding KMP is understanding its partial match table.

This discussion avoids the obscure logic behind the algorithm and focuses on understanding it through how it is applied.

String search

For example, suppose we want to find the substring abcdg in the string abcdef.

A simple solution works like this:

  • Compare the first character of each string; if they are the same, compare the second characters, and so on.
  • If they differ, advance the index by one, so that matching starts from the second character of the text, and repeat step one.

The drawback of this simple solution is that after every failed match the index advances by only one position, which leads to many redundant comparisons and is inefficient.

In the first round, with the index at 0, the first four characters abcd match, but the expected g does not match the actual e, so the match at index 0 fails and we move on to index 1. The first round already told us what the first four characters of the text are, yet the simple solution still compares them again one by one.
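The simple solution described above can be sketched as follows. This is only an illustration; the function name naiveSearch is hypothetical and does not come from any library.

```javascript
// Naive substring search: try every start index and compare character by character.
// Returns the index of the first occurrence of `word` in `text`, or -1 if absent.
function naiveSearch(text, word) {
  for (let start = 0; start + word.length <= text.length; start += 1) {
    let matched = 0;
    while (matched < word.length && text[start + matched] === word[matched]) {
      matched += 1;
    }
    if (matched === word.length) {
      return start;
    }
    // On a mismatch we only advance by one position and discard what we
    // learned about the `matched` characters that were already compared.
  }
  return -1;
}

console.log(naiveSearch('abcdef', 'abcdg')); // -1
console.log(naiveSearch('abcdef', 'cde')); // 2
```

For abcdg in abcdef, the first round compares five characters before failing on g versus e, and none of that work is reused in the next round.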

Partial Match Table

Take the string abababca, which has length 8, as an example. Its partial match table is:

char:  | a | b | a | b | a | b | c | a |
index: | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
value: | 0 | 0 | 1 | 2 | 3 | 4 | 0 | 1 |

The value row holds the partial match values.

Subset

For the example string above, if we look at the position where index is 2, we get the substring aba; if we look at the position where index is 7, we obviously get the entire string. Looking at a different position means looking at a different subset of the string, because the substring changes.

Prefix & Suffix

For a given string, removing one or more characters from the end leaves what is called a proper prefix of the string, hereinafter simply a prefix. "Proper" is used here in the same sense as in the "proper subset" of a set in mathematics: the whole string itself is excluded. For example, the prefixes of banana are:

  • b
  • ba
  • ban
  • bana
  • banan

Similarly, removing one or more characters from the beginning leaves a proper suffix of the string. Taking banana again, its suffixes are:

  • anana
  • nana
  • ana
  • na
  • a
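
The two lists above can be reproduced with a short sketch. The helper names properPrefixes and properSuffixes are hypothetical, chosen only for this illustration, and the empty string is left out to match the lists in the article.

```javascript
// All proper prefixes of `str`: drop one or more characters from the end.
function properPrefixes(str) {
  const result = [];
  for (let length = 1; length < str.length; length += 1) {
    result.push(str.slice(0, length));
  }
  return result;
}

// All proper suffixes of `str`: drop one or more characters from the start.
function properSuffixes(str) {
  const result = [];
  for (let start = 1; start < str.length; start += 1) {
    result.push(str.slice(start));
  }
  return result;
}

console.log(properPrefixes('banana')); // [ 'b', 'ba', 'ban', 'bana', 'banan' ]
console.log(properSuffixes('banana')); // [ 'anana', 'nana', 'ana', 'na', 'a' ]
```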

Partial matching value

As you can see, a string has as many prefixes as suffixes. We can therefore look for a prefix that also appears among the suffixes. Let's not worry about the meaning of this match for now. Take the initial text abababca as an example.

If we look at the position where index is 2, the substring is aba, and its prefixes and suffixes are:

  • Prefixes: a, ab
  • Suffixes: ba, a

Try each prefix in turn against the suffixes. The only prefix that also appears in the suffix list is a. Its length is 1, so we record 1 in the table, which agrees with the partial match table above.

For another example, look at the position where index is 3. The substring is abab, and its prefixes and suffixes are:

  • Prefixes: a, ab, aba
  • Suffixes: bab, ab, b

This time the matching item is ab, with length 2, which is also consistent with the value in the partial match table above.

For another example, look at the position where index is 5. The substring is ababab, and its prefixes and suffixes are:

  • Prefixes: a, ab, aba, abab, ababa
  • Suffixes: babab, abab, bab, ab, b

Matching each prefix against the suffixes finds two matches:

  • ab
  • abab

We take the longer one, abab, whose length is 4.
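
The observations above can be checked mechanically. The sketch below is a deliberately brute-force way to compute each partial match value straight from the definition; it is not how KMP normally builds the table. The names partialMatchValue and bruteForceTable are hypothetical.

```javascript
// Length of the longest string that is both a proper prefix and a proper
// suffix of `str` (0 when there is none). Longer candidates are tried first.
function partialMatchValue(str) {
  for (let length = str.length - 1; length > 0; length -= 1) {
    if (str.slice(0, length) === str.slice(str.length - length)) {
      return length;
    }
  }
  return 0;
}

// One value per index, computed from the substring that ends at that index.
function bruteForceTable(word) {
  const table = [];
  for (let index = 0; index < word.length; index += 1) {
    table.push(partialMatchValue(word.slice(0, index + 1)));
  }
  return table;
}

console.log(bruteForceTable('abababca')); // [ 0, 0, 1, 2, 3, 4, 0, 1 ]
```

The output agrees with the value row of the table for abababca, including the entries at index 2, 3 and 5 worked out by hand above.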

Looking at the partial match table again, we now understand both where its values come from and what they mean: each value is the length of the longest prefix that is also a suffix.

When we continue to index 6, the substring is abababc. As expected, no prefix matches any suffix, because no prefix contains c while every suffix does. So the partial match value is 0 here.

Continuing, we reach the end of the string, that is, the entire string abababca. Since every prefix starts with a and every suffix ends with a, the partial match value is at least 1. Looking further, every suffix longer than one character ends with ca, and the only prefix that contains c is abababc, whose length is 7 and which does not equal the suffix of the same length, bababca. So the result is 1 and there is no longer match.

Use of partial matching table

Using the partial match values above, when a match fails during a string search we no longer have to move just one position; we can move several positions at once and skip some redundant comparisons. The rule is as follows:

If a partial match of length partial_match_length is found and table[partial_match_length] > 1, we may skip ahead partial_match_length - table[partial_match_length - 1] characters.

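In other words, suppose the first partial_match_length characters of the pattern have matched when a mismatch occurs. Subtract one from that length to get an index into the partial match table, look up the value table[partial_match_length - 1], and the number of positions we can move forward is partial_match_length - table[partial_match_length - 1].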
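
As a minimal sketch, the formula can be written as a small function. The name shiftAfterPartialMatch is hypothetical, and treating a partial match of length 0 as a shift of one position is an assumption made here to cover the case the formula does not address.

```javascript
// Partial match table for "abababca", copied from the article.
const abababcaTable = [0, 0, 1, 2, 3, 4, 0, 1];

// How far the pattern may move forward after `partialMatchLength`
// characters matched and the next character did not.
function shiftAfterPartialMatch(partialMatchTable, partialMatchLength) {
  if (partialMatchLength === 0) {
    return 1; // assumption: nothing matched, so just try the next position
  }
  return partialMatchLength - partialMatchTable[partialMatchLength - 1];
}

console.log(shiftAfterPartialMatch(abababcaTable, 1)); // 1
console.log(shiftAfterPartialMatch(abababcaTable, 5)); // 2
console.log(shiftAfterPartialMatch(abababcaTable, 3)); // 2
```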

Here is the partial match table from the beginning of this article again:

char:  | a | b | a | b | a | b | c | a |
index: | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
value: | 0 | 0 | 1 | 2 | 3 | 4 | 0 | 1 |

Suppose we need to find abababca in bacbababaabcbab. Let's walk through it using the formula above.

The first match occurs at the second character of the text:

bacbababaabcbab
 |
 abababca

The matched length here is 1. The partial match value at index 1-1=0 of the table is 0, so we can move forward by 1-0=1. This is effectively no jump at all: it is the ordinary case of a failed match followed by advancing the index by one. Nothing is saved here.

Continue until a match occurs again. The situation is now:

bacbababaabcbab
    |||||
    abababca

The matched length is now 5. The partial match value at index 5-1=4 is 3, so we can move forward by 5-3=2. This time we move two positions at once. In the diagram below, each x marks a position skipped between the previous position and the newly reached one:

bacbababaabcbab
    xx|||
      abababca

The matched length is now 3. Looking up table[partial_match_length-1], the value at index 2 is 1, so we can move forward by 3-1=2.

bacbababaabcbab
      xx|
        abababca

The string we are looking for is now longer than the remaining text available for matching, so we can stop and conclude that no result was found.
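
The walkthrough above can be replayed with a small tracing sketch. It slides the pattern using the shift formula and records each partial match. For clarity it restarts the comparison at every new position, so it illustrates the shifts rather than the fully linear algorithm. The names traceSearch and walkthroughTable are hypothetical.

```javascript
// Partial match table for the pattern "abababca", copied from the article.
const walkthroughTable = [0, 0, 1, 2, 3, 4, 0, 1];

// Slides `word` along `text` using the shift formula and records every
// partial match. Returns { index, steps }, where index is -1 if not found.
function traceSearch(text, word, table) {
  const steps = [];
  let offset = 0;
  while (offset + word.length <= text.length) {
    let matched = 0;
    while (matched < word.length && text[offset + matched] === word[matched]) {
      matched += 1;
    }
    if (matched === word.length) {
      return { index: offset, steps };
    }
    // Nothing matched: move one position. Otherwise apply the formula.
    const shift = matched === 0 ? 1 : matched - table[matched - 1];
    if (matched > 0) {
      steps.push({ offset, matched, shift });
    }
    offset += shift;
  }
  // The pattern no longer fits in the remaining text.
  return { index: -1, steps };
}

const walkthrough = traceSearch('bacbababaabcbab', 'abababca', walkthroughTable);
console.log(walkthrough.index); // -1
console.log(walkthrough.steps);
// [ { offset: 1, matched: 1, shift: 1 },
//   { offset: 4, matched: 5, shift: 2 },
//   { offset: 6, matched: 3, shift: 2 } ]
```

The three recorded steps correspond to the three diagrams: matched lengths 1, 5 and 3, with shifts of 1, 2 and 2.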

Implementation in JavaScript

The following is the JavaScript implementation of the KMP algorithm from trekhleb/javascript-algorithms:

/**
 * Builds the partial match table for the given word.
 *
 * @see https://www.youtube.com/watch?v=GTJr8OvyEVQ
 * @param {string} word
 * @return {number[]}
 */
function buildPatternTable(word) {
  const patternTable = [0];
  let prefixIndex = 0;
  let suffixIndex = 1;

  while (suffixIndex < word.length) {
    if (word[prefixIndex] === word[suffixIndex]) {
      // The current prefix can be extended by one character.
      patternTable[suffixIndex] = prefixIndex + 1;
      suffixIndex += 1;
      prefixIndex += 1;
    } else if (prefixIndex === 0) {
      // No prefix matches a suffix ending at this position.
      patternTable[suffixIndex] = 0;
      suffixIndex += 1;
    } else {
      // Fall back to the next shorter candidate prefix.
      prefixIndex = patternTable[prefixIndex - 1];
    }
  }

  return patternTable;
}

/**
 * Returns the index of the first occurrence of word in text, or -1.
 *
 * @param {string} text
 * @param {string} word
 * @return {number}
 */
export default function knuthMorrisPratt(text, word) {
  if (word.length === 0) {
    return 0;
  }

  let textIndex = 0;
  let wordIndex = 0;

  const patternTable = buildPatternTable(word);

  while (textIndex < text.length) {
    if (text[textIndex] === word[wordIndex]) {
      // We've found a match.
      if (wordIndex === word.length - 1) {
        return (textIndex - word.length) + 1;
      }
      wordIndex += 1;
      textIndex += 1;
    } else if (wordIndex > 0) {
      // Reuse the partial match instead of restarting from scratch.
      wordIndex = patternTable[wordIndex - 1];
    } else {
      wordIndex = 0;
      textIndex += 1;
    }
  }

  return -1;
}

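Time complexity

Because the algorithm makes a linear pass over each of the two strings, its time complexity is the sum of their lengths. If the keyword being searched for has length k and the text has length m, the time complexity is O(k+m).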
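
To make the difference concrete, the sketch below counts character comparisons for the naive search and for a KMP search loop on an input chosen to be hard for the naive approach. The names buildTable, countNaive and countKmp, and the test input itself, are assumptions made for this illustration.

```javascript
// Builds the partial match table with the usual two-index approach.
function buildTable(word) {
  const table = [0];
  let prefixIndex = 0;
  let suffixIndex = 1;
  while (suffixIndex < word.length) {
    if (word[prefixIndex] === word[suffixIndex]) {
      table[suffixIndex] = prefixIndex + 1;
      suffixIndex += 1;
      prefixIndex += 1;
    } else if (prefixIndex === 0) {
      table[suffixIndex] = 0;
      suffixIndex += 1;
    } else {
      prefixIndex = table[prefixIndex - 1];
    }
  }
  return table;
}

// Counts text/word character comparisons made by the naive search.
function countNaive(text, word) {
  let comparisons = 0;
  for (let start = 0; start + word.length <= text.length; start += 1) {
    let matched = 0;
    while (matched < word.length) {
      comparisons += 1;
      if (text[start + matched] !== word[matched]) break;
      matched += 1;
    }
    if (matched === word.length) break; // first occurrence found
  }
  return comparisons;
}

// Counts text/word character comparisons made by the KMP search loop.
function countKmp(text, word) {
  const table = buildTable(word);
  let comparisons = 0;
  let textIndex = 0;
  let wordIndex = 0;
  while (textIndex < text.length) {
    comparisons += 1;
    if (text[textIndex] === word[wordIndex]) {
      if (wordIndex === word.length - 1) break; // first occurrence found
      wordIndex += 1;
      textIndex += 1;
    } else if (wordIndex > 0) {
      wordIndex = table[wordIndex - 1];
    } else {
      textIndex += 1;
    }
  }
  return comparisons;
}

const longText = 'a'.repeat(1000) + 'b';
const keyword = 'a'.repeat(10) + 'b';
console.log(countNaive(longText, keyword), countKmp(longText, keyword));
```

On this input the naive count grows roughly with the product of the two lengths, while the KMP search loop stays within about twice the text length.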
