Problem:
You have two strings in MySQL and want to express how similar they are as a percentage. For example, given @a = 'Welcome to Stack Overflow' and @b = 'Hello to stack overflow', you want a single number that measures how closely they match.
Solution:
Create the Levenshtein Distance Function:
Use the following function to calculate the Levenshtein distance between two strings:
CREATE FUNCTION `levenshtein`(s1 text, s2 text) RETURNS int(11) DETERMINISTIC BEGIN ... END
The above function is adapted from the one provided at http://www.artfulsoftware.com/infotree/queries.php#552.
Create the Levenshtein Similarity Ratio Function:
To convert the Levenshtein distance into a similarity ratio, use this function:
CREATE FUNCTION `levenshtein_ratio`( s1 text, s2 text ) RETURNS int(11) DETERMINISTIC BEGIN ... END
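Since the body above is elided, here is a minimal sketch of what such a ratio function might look like. It is not the original function: the name levenshtein_ratio_sketch is hypothetical, it assumes the levenshtein function from the previous step already exists, and it assumes the distance is normalized by the character length of the longer string.

```sql
-- DELIMITER is a mysql command-line client directive, not SQL;
-- other clients may need a different way to send the whole body at once.
DELIMITER $$

-- Hypothetical name, chosen so it does not clash with levenshtein_ratio.
CREATE FUNCTION levenshtein_ratio_sketch(s1 TEXT, s2 TEXT)
RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE max_len INT;

    -- Assumption: normalize by the longer string, counted in characters.
    SET max_len = GREATEST(CHAR_LENGTH(s1), CHAR_LENGTH(s2));

    -- Two empty strings are treated as identical; this also avoids
    -- dividing by zero.
    IF max_len = 0 THEN
        RETURN 100;
    END IF;

    -- Relies on the levenshtein function defined in the previous step.
    RETURN ROUND((1 - levenshtein(s1, s2) / max_len) * 100);
END$$

DELIMITER ;
```

If either argument is NULL, GREATEST returns NULL and so does the sketch, which matches how most MySQL string functions treat NULL input.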
Usage:
The similarity percentage is derived from the Levenshtein distance with the following formula, where MAX_LENGTH is taken to be the length of the longer of the two strings:
similarity_percentage = ((1 - LEVENSHTEIN(s1, s2) / MAX_LENGTH) * 100)
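To make the arithmetic concrete, here is the formula evaluated by hand for the textbook pair 'kitten' and 'sitting', which is not from the example in this article. Their Levenshtein distance is 3 (two substitutions and one insertion), and the longer string has 7 characters; the rounding step is an assumption about how the result is reduced to a whole number.

```sql
-- Distance 3 and length 7 are written as literals so this runs
-- without any stored function being installed.
SELECT ROUND((1 - 3 / 7) * 100) AS similarity_percentage;  -- 57
```

MySQL's / operator performs decimal division even on integer operands, so 3 / 7 is not truncated to 0 here.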
Example:
SELECT levenshtein_ratio('Welcome to Stack Overflow', 'Hello to stack overflow') AS similarity;
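This query returns the similarity between the two strings as a whole-number percentage. Working through the formula above with MAX_LENGTH as the length of the longer string (25 characters): the edit distance is 4 when letter case is ignored, giving 84, or 6 when comparison is case-sensitive, giving 76. Neither matches the 66% often quoted for this example, so the exact value depends on the collation in effect and on how the function is implemented.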
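The same call can be written with the session variables from the problem statement, and the function can also be used as a filter. The sketch below assumes levenshtein_ratio has been created as described; the table articles, its column title, and the threshold of 80 are hypothetical and only for illustration.

```sql
-- The two strings from the problem statement, held in session variables.
SET @a = 'Welcome to Stack Overflow';
SET @b = 'Hello to stack overflow';

SELECT levenshtein_ratio(@a, @b) AS similarity;

-- Hypothetical table and column: list titles at least 80% similar to @a.
-- The function runs once per row, so this scans the whole table.
SELECT title,
       levenshtein_ratio(title, @a) AS similarity
FROM articles
WHERE levenshtein_ratio(title, @a) >= 80
ORDER BY similarity DESC;
```

Because no index can help with a per-row stored function call, a filter like this is best kept to small tables or to a candidate set already narrowed by another condition.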