Regular expressions are knowledge that is independent of any programming language, yet in practice they also depend on one. Almost every programming language we use provides regular expressions, but of course the implementations differ: some support more features and some support fewer.
Because regular expressions are above all a practical tool, I think trying to learn them without a concrete language does not work well.
Regular expression main API relationship diagram
I drew this diagram myself, and I think it basically clarifies how the functions relate to each other. Their purposes are:
match: matches the regular expression at the beginning of the text and returns a match object, or None if there is no match.
search: matches the regular expression anywhere in the text and returns the first match object, or None if there is no match.
sub: uses a regular expression to replace text (regular expressions serve two purposes: searching and replacing).
findall: matches the regular expression across the entire text and returns all matches as a list.
finditer: matches the regular expression across the entire text and returns all matches as an iterator.
split: uses a regular expression to split text.
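As a rough sketch of the six functions listed above, the snippet below applies each of them to the same string. The sample text and the variable names are made up for illustration; only the `re` functions themselves come from the standard library.

```python
import re

text = 'tel 123, fax 456'  # made-up sample text

# match only succeeds at the very start of the text
at_start = re.match(r'\d+', text)       # None, because the text starts with 'tel'
first_word = re.match(r'\w+', text)     # matches 'tel'

# search scans the whole text and stops at the first hit
first_number = re.search(r'\d+', text)  # matches '123'

# sub replaces every match
masked = re.sub(r'\d+', '#', text)

# findall returns all matches as a list of strings
numbers = re.findall(r'\d+', text)

# finditer yields match objects one at a time
numbers_iter = [m.group() for m in re.finditer(r'\d+', text)]

# split cuts the text wherever the pattern matches
parts = re.split(r',\s*', text)

print(at_start)              # None
print(first_word.group())    # tel
print(first_number.group())  # 123
print(masked)                # tel #, fax #
print(numbers)               # ['123', '456']
print(numbers_iter)          # ['123', '456']
print(parts)                 # ['tel 123', 'fax 456']
```

Note that match and search return a single match object (or None), while findall returns plain strings.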
As you can see here, the re module has many functions that can be used directly, and below re.compile there are many methods with the same names. The functions directly under the re module are officially provided shortcuts for convenience; the most orthodox way to use them is through re.compile. So, for the rest of this article, I basically use re.compile and the methods below it.
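To make the relationship concrete, here is a minimal sketch comparing a module-level call with the equivalent call on a compiled pattern. The variable names are illustrative only.

```python
import re

s = 'runoob 123 google 456'

# Module-level call: the pattern string is passed on every call
m1 = re.search(r'\d+', s)

# Compiled form: build the Pattern object once, then call its method
pattern = re.compile(r'\d+')
m2 = pattern.search(s)

print(m1.group(), m2.group())  # 123 123

# Pattern methods also accept optional pos/endpos bounds,
# which the module-level functions do not take
m3 = pattern.search(s, 10)     # start scanning at index 10
print(m3.group())              # 456
```

Both forms find the same match; the compiled form is mainly useful when the same pattern is applied many times, or when the search should be limited to part of the string.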
compile
The compile function compiles a regular expression and produces a regular expression (Pattern) object, for use by match(), search() and other functions.
Syntax:
re.compile(pattern[, flags])
pattern: a regular expression in the form of a string
flags: optional, the matching mode, such as ignoring case or multi-line mode. The specific values are:
re.I ignores case
re.L makes \w, \W, \b and \B depend on the current locale
re.M multi-line mode
re.S makes '.' match any character, including newline (by default '.' does not match newline)
re.U makes the special character sets \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character properties database
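The effect of the most common flags can be seen in a short sketch. It assumes the standard flags re.I (ignore case), re.M (multi-line mode) and re.S ('.' matches newline); the sample text is made up for illustration.

```python
import re

text = 'Hello\nhello world'  # made-up sample text

# re.I: ignore case
ignore_case = re.compile(r'hello', re.I).findall(text)
print(ignore_case)           # ['Hello', 'hello']

# re.M: '^' also matches at the start of every line
multi_line = re.compile(r'^hello', re.M).findall(text)
print(multi_line)            # ['hello']

# re.S: '.' also matches the newline character
without_s = re.compile(r'Hello.hello').search(text)
with_s = re.compile(r'Hello.hello', re.S).search(text)
print(without_s)             # None
print(repr(with_s.group()))  # 'Hello\nhello'

# Flags can be combined with |
combined = re.compile(r'^hello', re.I | re.M).findall(text)
print(combined)              # ['Hello', 'hello']
```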
Learning Template
Here is a sample template that will be used all the time. This template is the most important thing in this blog, and subsequent content will be expanded based on it. So, please understand it well.
import re

s = 'runoob 123 google 456'
result1 = re.findall(r'\d+', s)

pattern = re.compile(r'\d+')  # find runs of digits
result2 = pattern.findall(s)
result3 = pattern.findall(s, 0, 20)  # only search s[0:20]

print(result1)
print(result2)
print(result3)
"""
output:
['123', '456']
['123', '456']
['123', '45']
"""
Note: the regular expression string is written with the r prefix (a raw string). The purpose of this is to avoid using a lot of escape characters in the regular expression, which would hurt its overall readability.
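A small sketch of what the r prefix saves; the sample strings are made up for illustration.

```python
import re

# Without the r prefix, every backslash meant for the regex must be doubled
plain = '\\d+\\.\\d+'
# With the r prefix, each backslash is written once
raw = r'\d+\.\d+'

same = (plain == raw)
print(same)         # True
decimals = re.findall(raw, 'pi is 3.14')
print(decimals)     # ['3.14']

# In a normal string '\b' is a backspace character; in a raw string
# r'\b' reaches the regex engine as a word boundary
no_prefix = re.findall('\bcat\b', 'a cat sat')
with_prefix = re.findall(r'\bcat\b', 'a cat sat')
print(no_prefix)    # []
print(with_prefix)  # ['cat']
```

The second half shows why the prefix is more than a matter of style: without it, some escapes silently turn into different characters.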
Python's regular expressions include many very convenient methods, but I won't introduce too many of them here. We will always use the pattern above, because those convenient methods are just a kind of wrapper around it, and once you learn this basic approach the others follow naturally.
A match object holds information about the match. Its most important methods and attributes are:
Method/Attribute | Purpose |
group() | Return the string matched by the regular expression |
start() | Return the starting position of the match |
end() | Return the end position of the match |
span() | Return a tuple containing the (start, end) positions of the match |
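The four accessors described above correspond to group(), start(), end() and span() on Python's match object. A minimal sketch, reusing the sample string from the template (the variable names are illustrative):

```python
import re

pattern = re.compile(r'\d+')
m = pattern.search('runoob 123 google 456')

print(m.group())  # 123
print(m.start())  # 7
print(m.end())    # 10
print(m.span())   # (7, 10)

# search returns None when nothing matches, so check before calling methods
missing = pattern.search('no digits here')
if missing is None:
    print('no match')
```

Positions follow Python slicing conventions: start is inclusive and end is exclusive, so the matched text is the slice from m.start() to m.end().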