Regular expressions are not unique to Python; they are an independent pattern syntax supported by many programming languages. The syntax differs in its details from one language to another, but it is broadly similar. This article focuses on how greedy mode and non-greedy mode are used and how they differ. By default, quantifiers in a regular expression are greedy, that is, they match as much content as possible. For example:
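A minimal sketch of the greedy default, using the pattern \bb.+\b. The sample sentence and the variable names are assumptions made for illustration, not taken from the original article:

```python
import re

# Hypothetical sample text, chosen for illustration only.
text = "alpha beta gamma bravo delta."

# \b  word boundary
# b   the literal letter b
# .+  one or more of any character except a newline (greedy)
# \b  word boundary
greedy_matches = re.findall(r"\bb.+\b", text)
print(greedy_matches)  # ['beta gamma bravo delta']
```

The match starts at the first word beginning with b and runs to the end of the last word in the text, not to the end of "beta".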
In the pattern \bb.+\b, the first \b matches at the beginning of a word, and the letter b that follows means the match must begin with a word starting with b. The dot matches any character (including spaces, but not a newline unless the re.DOTALL flag is set), the plus sign + means that the preceding element appears one or more times, and the last \b matches at the end of a word. So the question is, what counts as the end of a word? \b matches wherever a word character (a letter, digit, or underscore) meets a non-word character such as whitespace or punctuation, or the start or end of the string. Because regular expressions use greedy mode by default and match as much content as possible, .+ runs on to the last word ending in the text, so the match extends from the first word starting with b to the end of the last word instead of stopping at the first word ending.
So how can we match only the individual words that start with the letter b, rather than the long run above? Non-greedy mode can be used, and it is written with the question mark "?". When a question mark follows an ordinary character or a sub-pattern, it means that the character or sub-pattern is optional: it may or may not appear. When a question mark follows a quantifier such as +, * or {m,n}, it switches that quantifier to non-greedy mode, that is, matching as little content as possible. Taking the problem above as an example, change it to non-greedy mode, for example:
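The two roles of the question mark can be sketched as follows. The sample strings and variable names are assumptions chosen for illustration:

```python
import re

# Hypothetical sample text, chosen for illustration only.
text = "alpha beta gamma bravo delta."

# "?" after an ordinary character: that character is optional.
optional_matches = re.findall(r"colou?r", "color colour")
print(optional_matches)  # ['color', 'colour']

# "?" after the quantifier "+": non-greedy, match as little as possible.
lazy_matches = re.findall(r"\bb.+?\b", text)
print(lazy_matches)  # ['beta', 'bravo']
```

With .+? the engine stops at the first word boundary it can reach, so each word starting with b is returned separately.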
The following code further demonstrates the difference between greedy mode and non-greedy mode:
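A short sketch comparing both modes on the + and {m,n} quantifiers. The HTML-like string and the digit string are made-up inputs for illustration:

```python
import re

# Hypothetical inputs, chosen for illustration only.
markup = "<b>bold</b> and <i>italic</i>"
digits = "1234567"

# Greedy: .+ extends to the last ">" in the string.
greedy_tags = re.findall(r"<.+>", markup)
print(greedy_tags)  # ['<b>bold</b> and <i>italic</i>']

# Non-greedy: .+? stops at the first ">" it can reach.
lazy_tags = re.findall(r"<.+?>", markup)
print(lazy_tags)  # ['<b>', '</b>', '<i>', '</i>']

# Greedy {2,4} takes up to four digits at a time.
greedy_digits = re.findall(r"\d{2,4}", digits)
print(greedy_digits)  # ['1234', '567']

# Non-greedy {2,4}? takes only the minimum of two digits.
lazy_digits = re.findall(r"\d{2,4}?", digits)
print(lazy_digits)  # ['12', '34', '56']
```

In the last case the trailing 7 is left unmatched, because a single digit cannot satisfy the minimum of two.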
Of course, returning to the original question of this article: if you just want to match words starting with the letter b, you don't have to go to so much trouble. Using \w is enough, because \w matches only letters, digits, and underscores, never spaces. For example:
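A minimal sketch of the \w approach, again with an assumed sample sentence:

```python
import re

# Hypothetical sample text, chosen for illustration only.
text = "alpha beta gamma bravo delta."

# \w+ cannot cross a space, so each match stays inside one word
# even though the quantifier is still greedy.
word_matches = re.findall(r"\bb\w+\b", text)
print(word_matches)  # ['beta', 'bravo']
```

Note that \w+ requires at least one character after the b; using \w* instead would also match a one-letter word "b".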