In Python, working with regular expressions means learning the re module. This article demonstrates some advanced techniques that are worth mastering.
Compiling a regular expression object
The re.compile function builds a regular expression object from a pattern string and optional flags. The object provides methods for matching and substitution, and its usage differs slightly from the module-level functions. For example, a string can be matched by calling a module-level function such as re.match directly.
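A minimal sketch of the direct form. The pattern and the sample string are made up for illustration and are not taken from the original article:

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
text = "2024-05-17 release notes"

# Module-level call: the pattern string is passed on every call.
m = re.match(r"\d{4}-\d{2}-\d{2}", text)
print(m.group() if m else None)  # 2024-05-17
```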
With compile, the pattern is compiled once and the matching method is called on the resulting object instead.
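The compiled form might look like the sketch below. The name date_re and the sample string are assumptions made for the example:

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
date_re = re.compile(r"\d{4}-\d{2}-\d{2}")
text = "2024-05-17 release notes"

# The compiled object exposes the matching operations as methods.
m = date_re.match(text)
print(m.group() if m else None)  # 2024-05-17

# The same object can be reused for other operations, such as substitution.
replaced = date_re.sub("<date>", text)
print(replaced)  # <date> release notes
```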
Why use it this way? The point is to speed up matching and to reuse the regular expression object. Let's compare the efficiency of the two methods:
In this comparison the second method comes out faster. In practice, the more often a pattern is reused, the more a compiled regular expression object pays off.
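One way to run such a comparison is with the standard timeit module. This is only a sketch: the pattern, the input, and the repetition count are assumptions, and the absolute numbers depend on the machine and the Python version, so no timings are shown here.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
pattern = r"\d{4}-\d{2}-\d{2}"
text = "2024-05-17 release notes"
compiled = re.compile(pattern)

def direct():
    # Module-level function: the pattern string is looked up on every call.
    return re.match(pattern, text)

def precompiled():
    # Method on an already compiled object.
    return compiled.match(text)

n = 100_000
t_direct = timeit.timeit(direct, number=n)
t_compiled = timeit.timeit(precompiled, number=n)
print(f"re.match:       {t_direct:.4f} s for {n} calls")
print(f"compiled.match: {t_compiled:.4f} s for {n} calls")
```

Note that the re module keeps an internal cache of recently compiled patterns, so the module-level call does not recompile the pattern each time. The gap measured here comes mostly from the per-call lookup, and its size will vary.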
Grouping
You may have seen grouping used to extract parts of a match.
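A short sketch of capturing groups, using a made-up key=value string as input:

```python
import re

# Hypothetical input: the parentheses capture the key and the value separately.
m = re.match(r"(\w+)=(\d+)", "timeout=30")
print(m.group(0))  # timeout=30 (the whole match)
print(m.group(1))  # timeout
print(m.group(2))  # 30
print(m.groups())  # ('timeout', '30')
```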
Wrapping part of the pattern in parentheses captures exactly that part of the match. Groups can also be nested.
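Nested groups can be sketched as follows, with a made-up date string. Groups are numbered by the position of their opening parenthesis, counting from the left:

```python
import re

# Group 1 is the outer group; groups 2 and 3 are nested inside it.
m = re.match(r"((\d{4})-(\d{2}))-(\d{2})", "2024-05-17")
print(m.group(1))  # 2024-05
print(m.group(2))  # 2024
print(m.group(3))  # 05
print(m.group(4))  # 17
```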
Grouping does the job, but readability can suffer. In that case the groups can be named:
With named groups the pattern is much easier to read.
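A sketch of named groups using the (?P<name>...) syntax. The group names year, month, and day are chosen for this example:

```python
import re

# Each group is given a name instead of relying on its position.
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date_re.match("2024-05-17")
print(m.group("year"))  # 2024
print(m.group("day"))   # 17
print(m.groupdict())    # {'year': '2024', 'month': '05', 'day': '17'}
```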
String replacement
Readers who have used sed may have seen backreferences such as \1 used in a replacement.
Here \1 stands for the text captured by the first group of the preceding match. The sed command in question wraps the matched text in square brackets.
The re module supports the same usage.
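A sketch of the same idea with re.sub, wrapping every run of digits in square brackets. The input string is made up for the example:

```python
import re

# \1 in the replacement refers to the text captured by the first group.
result = re.sub(r"(\d+)", r"[\1]", "order 66 shipped in 3 days")
print(result)  # order [66] shipped in [3] days
```

The replacement is written as a raw string so that \1 reaches the re module unchanged.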
Named groups can be used in the replacement as well.
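With a named group, the replacement refers to it as \g<name>. The group name num below is an assumption made for the example:

```python
import re

# \g<num> refers to the group named "num" in the pattern.
result = re.sub(r"(?P<num>\d+)", r"[\g<num>]", "order 66 shipped in 3 days")
print(result)  # order [66] shipped in [3] days
```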
Lookaround matching (Look around)
The re module also supports lookaround assertions (lookahead and lookbehind), which are easiest to understand through an example.
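The sketch below shows the four lookaround forms on a made-up string. In each case the surrounding text is checked but is not included in the match:

```python
import re

# Hypothetical input, chosen only for illustration.
text = "price: $42, discount: 15%, total: $36"

# Positive lookbehind: digits preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", text)
print(after_dollar)  # ['42', '36']

# Positive lookahead: digits followed by "%".
before_percent = re.findall(r"\d+(?=%)", text)
print(before_percent)  # ['15']

# Negative lookahead: digit runs not followed by "%".
# The \d inside the assertion stops the engine from matching a shorter prefix.
not_percent = re.findall(r"\d+(?![\d%])", text)
print(not_percent)  # ['42', '36']

# Negative lookbehind: whole numbers not preceded by "$".
not_dollar = re.findall(r"(?<!\$)\b\d+", text)
print(not_dollar)  # ['15']
```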
Using a function as the replacement
Most of the examples so far replace a match with a fixed expression, but real requirements are often more complex, especially when it comes to substitutions.
For example, chat records can be fetched through Slack's API, and a message may contain user mentions written as IDs in angle brackets.
In such a message, <@U1EAT8MG9> and <@U0K1MF23Z> stand for two real users, but Slack encodes them as IDs, so the mapping from ID to user name has to be fetched through another interface.
The result of that lookup maps each ID to a user name, here xiaoming and laolin.
After resolving the mapping, the angle brackets should be removed as well. The result after replacement is "@xiaoming, @laolin Yes, it is indeed like this"
How can this be done with regular expressions?
The replacement argument (repl) of re.sub can itself be a function: it is called with each match object and returns the string to substitute.
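A minimal sketch of this approach follows. The message text is reconstructed from the expected result quoted above, and the users dictionary stands in for the ID-to-name mapping that would come from a separate lookup. Which ID belongs to which name is assumed from the order of the mentions, and the names users, mention_re, and replace_mention are made up for the example.

```python
import re

# Hypothetical ID-to-name mapping, standing in for the result of a user lookup.
users = {"U1EAT8MG9": "xiaoming", "U0K1MF23Z": "laolin"}

# Message reconstructed from the expected output (an assumption).
message = "<@U1EAT8MG9>, <@U0K1MF23Z> Yes, it is indeed like this"

# Capture the ID between "<@" and ">".
mention_re = re.compile(r"<@(\w+)>")

def replace_mention(match):
    """Return '@name' for a known user ID, or the original text otherwise."""
    name = users.get(match.group(1))
    return f"@{name}" if name else match.group(0)

# The function is called once per match and its return value is substituted.
result = mention_re.sub(replace_mention, message)
print(result)  # @xiaoming, @laolin Yes, it is indeed like this
```

Leaving unknown IDs untouched is a design choice of this sketch; raising an error or substituting a placeholder would work just as well.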