Home Java javaTutorial Detailed explanation of edit distance examples of strings

Detailed explanation of edit distance examples of strings

Jun 30, 2017 am 09:34 AM
dynamic programming string distance

Dynamic programming algorithm questions are often frequent questions in written examinations of major companies. In many algorithmic WeChat public accounts, articles about "dynamic programming" are common. They all try to describe and explain dynamic programming in the most understandable words, and some even use comics to explain it. Read every public article carefully. In fact, all the articles pushed by the number can be read and understood, and everyone can have a general understanding of dynamic programming.

 What is dynamic programming? In layman’s terms, the solution to a problem can be known at a glance (exhaustive list), but you can’t count them one by one. You have to find the optimal solution. In other words, there will be questions like “the most” and “the most” in the question. "At least", "how many kinds are there in total" and other formulations, these problems can theoretically be solved using the idea of ​​dynamic programming. Dynamic programming is similar to the divide-and-conquer method. It solves the original problem by combining the solutions of sub-problems, but it only solves each sub-problem once and saves it in a table without recalculation. It is usually used to solve the most optimal problem. Optimization Problem——"Introduction to Algorithms".

Edit distance (Edit Distance), in this article refers to Levenshtein distance, which is the string S1The minimum number of times that can be converted into a string S2 through the three operations of inserting , modifying, and deleting. For example: S1 = abc, S2 = abf, edit distance d = 1 (only Need to change c to f). In this article, the algorithm idea of ​​dynamic programming will be used to solve the edit distance of strings.

Definition: S1, S2 represents two strings , S1(i) represents S The first character of 1, d[i, j] represents the th character of S1 #i prefix to the jth prefix of S2 (for example :S1 = ” abc”,S2 = ”def”,Solve the edit distance from S1 to S2 as d[ 3, 3]).

  1. If

    S1 = “abc”, S2 = “dec”, then their edit distance is d[3, 3] = 2 , observe that the last character of the two strings is the same, that is to say, S1(3) = S2(3) does not require any transformation, so S1 = "abc", S2 = "dec" <= > S1' = ”ab”, S2' = ”de”, that is, when S1[i] = S[j], , d [i, j] = d[i-1,j -1]. Get the formula: d[i, j] = d[i - 1, j - 1] (S1[i] = S2[j])

  2. The above article yields the calculation formula when S1[i] = S2[j]. Obviously there is another situation where S1[i] ≠ S2[j], if S1 = “abc”, S2 = “def”. The process of converting S1 to S2 can be "modified", But you can also delete # by "INSERT"" ##” causes S1 to be transformed into S2.

  1) Insert the

character “f”# at the end of the S1 string ##, at this time S1 = ”abcf”, S2 = ”def”,At this time that is, S1[i] = S2[j] In the case of , the edit distance when S1 is transformed into S2 is d[4, 3] = d[ 3, 2]. So we getd[i, j]=d[i, j - 1] + 1. (+1 is because S1 has added ”f”)  2 )Insert the character

“c” at the end of the S2 string, then S1 = “abc” , S2 = ”defc”, this is the situation where S1[i] = S[j],# The edit distance transformed from ##S1 to S2 is d[3, 4] = d[2, 3] . So we getd[i, j]=d[i - 1, j] + 1. In fact, this is done to S1 delete. (+1 is because S2 has added ”c”) 3 )Modify S1

at the end of the string to ”f”, at this time S1 = ”abf”S2 = ”def”, at this time S1[i] = S[j] In the case of , the edit distance to transform S1 into S2 is d[3, 3] = d[2, 2]. So we getd[i, j] = d[i – 1, j - 1] + 1. (+1 is because S1 modified “c”) In summary, we get the recursion formula:

##=>

## You might as well use a table to express the dynamic programming The solution process of

S1=

”abc”, S2=”def”.

It can be seen that the red square is the final edit distance required, and the entire solution process is to fill up this table

— -

Two-dimensional array. The following are the dynamic programming solutions to the string edit distance of Java and Python respectively.

 

Java

  1 package com.algorithm.dynamicprogramming;  2   3 
  /**  4  * 动态规划——字符串的编辑距离  5  * s1 = "abc", s2 = "def"  6  
  * 计算公式:  7  *          | 0                                          
   i = 0, j = 0  8  *          | j                                          
    i = 0, j > 0  9  * d[i,j] = | i                                          
     i > 0, j = 0 10  *          | min(d[i,j-1]+1, d[i-1,j]+1, d[i-1,j-1])    s1(i) = s2(j) 11  
     *          | min(d[i,j-1]+1, d[i-1,j]+1, d[i-1,j-1]+1)  s1(i) ≠ s2(j) 12  * 定义二维数组[4][4]: 13 
      *      d e f            d e f 14  *   |x|x|x|x|        |0|1|2|3| 15  
      * a |x|x|x|x|  =>  a |1|1|2|3|  => 编辑距离d = [3][3] = 3 16  * b |x|x|x|x|      
      b |2|2|2|3| 17  * c |x|x|x|x|      c |3|3|3|3| 18  * 19  * Created by yulinfeng on 6/29/17. 20  
      */ 21 public class Levenshtein { 22  23     public static void main(String[] args) { 24         
      String s1 = "abc"; 25         String s2 = "def"; 26         int editDistance = levenshtein(s1, s2); 27         
      System.out.println("s1=" + s1 + "与s2=" + s2 + "的编辑距离为:" + editDistance); 28     } 29  30     /** 31      
      * 编辑距离求解 32      * @param s1 字符串s1 33      * @param s2 字符串s2 34      * @return 编辑距离 35      
      */ 36     private static int levenshtein(String s1, String s2) { 37         int i = 0;  //s1字符串中的字符下标 38      
         int j = 0;  //s2字符串中的字符下标 39         char s1i = 0;   //s1字符串第i个字符 40         
         char s2j = 0;   //s2字符串第j个字符 41         int m = s1.length();    //s1字符串长度 42         
         int n = s2.length();    //s2字符串长度 43         if (m == 0) {   
         //s1字符串长度为0,此时的编辑距离就是s2字符串长度 44             return n; 45         } 46         
         if (n == 0) { 47             return m;   //s2字符串长度为0,此时的编辑距离就是s1字符串长度 48         }          
         int[][] solutionMatrix = new int[m + 1][n + 1];     //求解矩阵 50         
         /** 51          *      
         d e f 52         *   |0|x|x|x| 53         
          * a |1|x|x|x| 54         
           * b |2|x|x|x| 55          
         * c |3|x|x|x| 56    */ 57         for (i = 0; i < m + 1; i++) { 58             solutionMatrix[i][0] = i; 59         } 60         
         /** 61          *      d e f 62         
          *   |0|1|2|3| 63          
          * a |x|x|x|x| 64          
          * b |x|x|x|x| 65          
         * c |x|x|x|x| 66          */ 67         for (j = 0; j < n + 1; j++) { 68             solutionMatrix[0][j] = j; 69         } 70         
         /** 71          * 上面两个操作后,求解矩阵变为 72         
          *      d e f 73         
           *   |0|1|2|3| 74          
         * a |1|x|x|x| 75         
          * b |2|x|x|x| 76         
          * c |3|x|x|x| 77          
          * 接下来就是填充剩余表格 78          
         */ 79         for (i = 1; i < m + 1; i++) {   //i = 1,j = 1, 2, 3,以行开始填充 80             s1i = s1.charAt(i - 1); 81             
         for (j = 1; j < n + 1; j++) { 82                 s2j = s2.charAt(j - 1); 83                 int flag = (s1i == s2j) ? 0 : 1;    
         //根据公式,如果s1[i] = s2[j],则d[i,j]=d[i-1,j-1],如果s1[i] ≠ s2[j],则其中一个公式为d[i,j]=d[i-1,j-1]+1 84                 
         solutionMatrix[i][j] = min(solutionMatrix[i][j-1] + 1, solutionMatrix[i-1][j] + 1, solutionMatrix[i-1][j-1] + flag); 85             
         } 86         } 87         return solutionMatrix[m][n]; 88     } 89  90     /** 91      * 根据公式求解编辑距离 92      
         * @param insert s1插入操作 93      * @param delete s1删除操作 94      * @param edit s1修改操作 95      * @return 编辑距离 96      
         */ 97     private static int min(int insert, int delete, int edit) { 98         int tmp = insert < delete ? insert : delete; 99         
         return tmp < edit ? tmp : edit;100     }101 }
Copy after login

## Python3

 1 &#39;&#39;&#39; 2     动态规划——字符串的编辑距离 3     s1 = "abc", s2 = "def" 4     
 计算公式: 5              | 0                                          
  i = 0, j = 0 6              | j                                           
  i = 0, j > 0 7     d[i,j] = | i                                          
   i > 0, j = 0 8              | min(d[i,j-1]+1, d[i-1,j]+1, d[i-1,j-1])    s1(i) = s2(j) 9              
   | min(d[i,j-1]+1, d[i-1,j]+1, d[i-1,j-1]+1)  s1(i) ≠ s2(j)10     
   定义二维数组[4][4]:11         d e f            d e f12     |x|x|x|x|        
   |0|1|2|3|13     a |x|x|x|x|  =>  a |1|1|2|3|  => 编辑距离d = [4][4] = 314     b |x|x|x|x|      
   b |2|2|2|3|15     c |x|x|x|x|      c |3|3|3|3|16 '''17 def levenshtein(s1, s2):18     i = 0  
    #s1字符串中的字符下标19     j = 0   #s2字符串中的字符下标20     s1i = ""    #s1字符串第i个字符21     
    s2j = ""    #s2字符串第j个字符22     m = len(s1) #s1字符串长度23     n = len(s2) #s2字符串长度24    
     if m == 0:25         return n    #s1字符串长度为0,此时的编辑距离就是s2字符串长度26     if n == 0:27        
      return m    #s2字符串长度为0,此时的编辑距离就是s1字符串长度28     
      solutionMatrix = [[0 for col in range(n + 1)] for row in range(m + 1)]  #长为m+1,宽为n+1的矩阵29     '''30       
      d e f31           |0|x|x|x|32         a |1|x|x|x|33         b |2|x|x|x|34        
       c |3|x|x|x|35     '''36     for i in range(m + 1):37         solutionMatrix[i][0] = i38     '''39       
      d e f40           |0|1|2|3|41         a |x|x|x|x|42         b |x|x|x|x|43         
       c |x|x|x|x|44         45     '''46     for j in range(n + 1):47         solutionMatrix[0][j] = j48     '''49         
       上面两个操作后,求解矩阵变为50              d e f51           |0|1|2|3|52         a |1|x|x|x|53        
        b |2|x|x|x|54         c |3|x|x|x|55         接下来就是填充剩余表格56     '''57     for x in range(1, m + 1):58         
        s1i = s1[x - 1]59         for y in range(1, n + 1):60             s2j = s2[y - 1]61             flag = 0 if s1i == s2j  else 162             
        solutionMatrix[x][y] = min(solutionMatrix[x][y-1] + 1, solutionMatrix[x-1][y] + 1, solutionMatrix[x-1][y-1] + flag)63 64     
        return solutionMatrix[m][n]65 66 def min(insert, delete, edit):67     tmp = insert if insert < delete else delete68     
        return tmp if tmp < edit else edit69 70 s1 = "abc"71 s2 = "def"72 distance = levenshtein(s1, s2)73 print(distance)
Copy after login

The above is the detailed content of Detailed explanation of edit distance examples of strings. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Chat Commands and How to Use Them
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Detailed explanation of the method of converting int type to string in PHP Detailed explanation of the method of converting int type to string in PHP Mar 26, 2024 am 11:45 AM

Detailed explanation of the method of converting int type to string in PHP In PHP development, we often encounter the need to convert int type to string type. This conversion can be achieved in a variety of ways. This article will introduce several common methods in detail, with specific code examples to help readers better understand. 1. Use PHP’s built-in function strval(). PHP provides a built-in function strval() that can convert variables of different types into string types. When we need to convert int type to string type,

How to determine whether a Golang string ends with a specified character How to determine whether a Golang string ends with a specified character Mar 12, 2024 pm 04:48 PM

Title: How to determine whether a string ends with a specific character in Golang. In the Go language, sometimes we need to determine whether a string ends with a specific character. This is very common when processing strings. This article will introduce how to use the Go language to implement this function, and provide code examples for your reference. First, let's take a look at how to determine whether a string ends with a specified character in Golang. The characters in a string in Golang can be obtained through indexing, and the length of the string can be

How to repeat a string in python_python repeating string tutorial How to repeat a string in python_python repeating string tutorial Apr 02, 2024 pm 03:58 PM

1. First open pycharm and enter the pycharm homepage. 2. Then create a new python script, right-click - click new - click pythonfile. 3. Enter a string, code: s="-". 4. Then you need to repeat the symbols in the string 20 times, code: s1=s*20. 5. Enter the print output code, code: print(s1). 6. Finally run the script and you will see our return value at the bottom: - repeated 20 times.

How to check if a string starts with a specific character in Golang? How to check if a string starts with a specific character in Golang? Mar 12, 2024 pm 09:42 PM

How to check if a string starts with a specific character in Golang? When programming in Golang, you often encounter situations where you need to check whether a string begins with a specific character. To meet this requirement, we can use the functions provided by the strings package in Golang to achieve this. Next, we will introduce in detail how to use Golang to check whether a string starts with a specific character, with specific code examples. In Golang, we can use HasPrefix from the strings package

How to intercept a string in Go language How to intercept a string in Go language Mar 13, 2024 am 08:33 AM

Go language is a powerful and flexible programming language that provides rich string processing functions, including string interception. In the Go language, we can use slices to intercept strings. Next, we will introduce in detail how to intercept strings in Go language, with specific code examples. 1. Use slicing to intercept a string. In the Go language, you can use slicing expressions to intercept a part of a string. The syntax of slice expression is as follows: slice:=str[start:end]where, s

How to solve the problem of Chinese garbled characters when converting hexadecimal to string in PHP How to solve the problem of Chinese garbled characters when converting hexadecimal to string in PHP Mar 04, 2024 am 09:36 AM

Methods to solve Chinese garbled characters when converting hexadecimal strings in PHP. In PHP programming, sometimes we encounter situations where we need to convert strings represented by hexadecimal into normal Chinese characters. However, in the process of this conversion, sometimes you will encounter the problem of Chinese garbled characters. This article will provide you with a method to solve the problem of Chinese garbled characters when converting hexadecimal to string in PHP, and give specific code examples. Use the hex2bin() function for hexadecimal conversion. PHP’s built-in hex2bin() function can convert 1

PHP String Matching Tips: Avoid Ambiguous Included Expressions PHP String Matching Tips: Avoid Ambiguous Included Expressions Feb 29, 2024 am 08:06 AM

PHP String Matching Tips: Avoid Ambiguous Included Expressions In PHP development, string matching is a common task, usually used to find specific text content or to verify the format of input. However, sometimes we need to avoid using ambiguous inclusion expressions to ensure match accuracy. This article will introduce some techniques to avoid ambiguous inclusion expressions when doing string matching in PHP, and provide specific code examples. Use preg_match() function for exact matching In PHP, you can use preg_mat

PHP string manipulation: a practical way to effectively remove spaces PHP string manipulation: a practical way to effectively remove spaces Mar 24, 2024 am 11:45 AM

PHP String Operation: A Practical Method to Effectively Remove Spaces In PHP development, you often encounter situations where you need to remove spaces from a string. Removing spaces can make the string cleaner and facilitate subsequent data processing and display. This article will introduce several effective and practical methods for removing spaces, and attach specific code examples. Method 1: Use the PHP built-in function trim(). The PHP built-in function trim() can remove spaces at both ends of the string (including spaces, tabs, newlines, etc.). It is very convenient and easy to use.

See all articles