This article introduces Python regular expressions and the re library, with code examples.
A regular expression is a sequence of characters that defines a search pattern. Typically this pattern is used by string search algorithms for "find" or "find and replace" operations on strings, or for input validation.
1. Regular expression syntax
. Matches any single character
[] Character set: gives the range of values allowed for a single character
[^] Negated character set: gives the range of values excluded for a single character
* Repeats the preceding character 0 or more times
+ Repeats the preceding character 1 or more times
? Repeats the preceding character 0 or 1 times
| Matches either the expression on its left or the one on its right
{m} Repeats the preceding character exactly m times
{m,n} Repeats the preceding character m to n times (inclusive)
^ Matches the beginning of the string
$ Matches the end of the string
() Grouping mark; a | inside the parentheses applies only within the group
\d Digit, equivalent to [0-9] for ASCII text
\w Word character, equivalent to [A-Za-z0-9_] for ASCII text
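The operators above can be tried out directly with re.findall(). The following is a minimal sketch; the sample strings are made up for illustration and are not from the original article.

```python
import re

# . matches any single character (except a newline by default)
dot = re.findall(r'p.t', 'pat pet pit')

# [] and [^] match one character inside or outside the set
in_set = re.findall(r'[aeiou]', 'python')
not_in_set = re.findall(r'[^aeiou]', 'python')

# *, + and ? control how often the preceding character repeats
star = re.findall(r'ab*', 'a ab abb')
plus = re.findall(r'ab+', 'a ab abb')
optional = re.findall(r'ab?', 'a ab abb')

# {m} and {m,n} give exact or bounded repetition counts
exact = re.findall(r'\d{3}', '12 345 6789')
bounded = re.findall(r'\d{2,3}', '1 22 333 4444')

# ^ and $ anchor the match to the start or end of the string
at_start = re.findall(r'^\w+', 'hello world')
at_end = re.findall(r'\w+$', 'hello world')

# () groups the alternatives joined by |
grouped = [m.group(0) for m in re.finditer(r'gr(a|e)y', 'gray grey groy')]

print(dot)        # ['pat', 'pet', 'pit']
print(plus)       # ['ab', 'abb']
print(bounded)    # ['22', '333', '444']
print(grouped)    # ['gray', 'grey']
```

re.finditer() is used for the grouped pattern because re.findall() returns only the group contents when the pattern contains a capturing group.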
2. Using the re library in Python
The re library is part of Python's standard library and is mainly used for string matching. To use it: import re
The re library usually represents regular expressions with the raw string type, written as
r'text'
A raw string is a string in which the backslash is not treated as an escape character. In an ordinary string, backslash sequences are escaped, but in a raw string they are kept as written. Because backslashes appear frequently in regular expressions, raw strings save us from doubling every backslash.
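The difference between an ordinary string and a raw string can be checked in a few lines. This is a small sketch; the variable names are chosen only for illustration.

```python
import re

plain = '\\d+'   # ordinary string: the backslash has to be doubled
raw = r'\d+'     # raw string: the backslash is kept as written

same_pattern = (plain == raw)

# In an ordinary string '\n' is one newline character;
# in a raw string it is two characters: a backslash and the letter n.
plain_len = len('\n')
raw_len = len(r'\n')

digits = re.findall(raw, 'BIT 100081')

print(same_pattern)        # True
print(plain_len, raw_len)  # 1 2
print(digits)              # ['100081']
```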
re.search() Searches a string for the first position where the regular expression matches and returns a match object
re.match() Matches the regular expression from the beginning of a string and returns a match object
re.findall() Searches a string and returns all matching substrings as a list
re.split() Splits a string at each match of the regular expression and returns a list
re.finditer() Searches a string and returns an iterator over the matches; each element is a match object
re.sub() Replaces every substring that matches the regular expression and returns the resulting string
re.search(pattern, string, flags=0)
Searches a string for the first position where the regular expression matches and returns a match object, or None if there is no match.
    import re

    match = re.search(r'[1-9]\d{5}', 'BIT 100081')
    if match:
        print(match.group(0))
The result is 100081.
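re.match(pattern, string, flags=0)
Matches the regular expression from the beginning of a string and returns a match object. The parameters are the same as for re.search().
Example:
    import re

    match = re.match(r'[1-9]\d{5}', 'BIT 100081')
    print(match.group(0))  # AttributeError: match is None
This raises an error because match is None: re.match() only matches from the start of the string, and nothing matches at the start of 'BIT 100081'.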
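Since re.match() returns None when nothing matches at the start of the string, it is safer to check the result before calling group(). The sketch below shows one way to do that; the helper name leading_zipcode is hypothetical and not part of the re library.

```python
import re

def leading_zipcode(text):
    """Return the six-digit code at the start of text, or None if there is none."""
    match = re.match(r'[1-9]\d{5}', text)
    # Only call group() when a match object was actually returned
    return match.group(0) if match else None

print(leading_zipcode('BIT 100081'))   # None
print(leading_zipcode('100081 BIT'))   # 100081
```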
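re.findall(pattern, string, flags=0)
Searches a string and returns all matching substrings as a list. The parameters are the same as for re.search().
Example:
    import re

    ls = re.findall(r'[1-9]\d{5}', 'BIT100081 TSU100084')
    print(ls)
The result is ['100081', '100084'].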
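re.split(pattern, string, maxsplit=0, flags=0)
Splits a string at each match of the regular expression and returns a list. maxsplit limits the number of splits; the rest of the string is returned as the last element.
    import re

    re.split(r'[1-9]\d{5}', 'BIT100081 TSU100084')
The result is ['BIT', ' TSU', ''].
    re.split(r'[1-9]\d{5}', 'BIT100081 TSU100084', maxsplit=1)
The result is ['BIT', ' TSU100084'].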
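re.finditer(pattern, string, flags=0)
Searches a string and returns an iterator over the matches; each element is a match object. The parameters are the same as for re.search().
Example:
    import re

    for m in re.finditer(r'[1-9]\d{5}', 'BIT100081 TSU100084'):
        if m:
            print(m.group(0))
The result is:
100081
100084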
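re.sub(pattern, repl, string, count=0, flags=0)
Replaces every substring that matches the regular expression with repl and returns the resulting string.
    import re

    re.sub(r'[1-9]\d{5}', ':zipcode', 'BIT100081 TSU100084')
The result is 'BIT:zipcode TSU:zipcode'.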
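re.sub() also accepts an optional count argument that limits how many matches are replaced, much as maxsplit limits re.split(). A brief sketch using the same sample string as above:

```python
import re

text = 'BIT100081 TSU100084'

# Default count=0 replaces every match
replace_all = re.sub(r'[1-9]\d{5}', ':zipcode', text)

# count=1 replaces only the first match and leaves the rest untouched
replace_first = re.sub(r'[1-9]\d{5}', ':zipcode', text, count=1)

print(replace_all)    # BIT:zipcode TSU:zipcode
print(replace_first)  # BIT:zipcode TSU100084
```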
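The re library can be used in two equivalent ways.
Functional usage, for a one-off operation:
    rst = re.search(r'[1-9]\d{5}', 'BIT 100081')
Object-oriented usage, compiling once and then reusing the pattern for repeated operations:
    pat = re.compile(r'[1-9]\d{5}')
    rst = pat.search('BIT 100081')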
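The pattern object returned by re.compile() also provides the six functions above as methods: search(), match(), findall(), split(), finditer() and sub().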
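As a rough illustration, the six operations can all be called on one compiled pattern object; the pattern argument is then omitted because the object already holds it. The variable names below are chosen only for this sketch.

```python
import re

pat = re.compile(r'[1-9]\d{5}')   # compile once, reuse many times
text = 'BIT100081 TSU100084'

first = pat.search(text).group(0)                # first match anywhere in the text
at_start = pat.match(text)                       # None: the text starts with 'BIT'
all_codes = pat.findall(text)                    # every match as a list of strings
pieces = pat.split(text)                         # text between the matches
spans = [m.span() for m in pat.finditer(text)]   # position of each match
masked = pat.sub(':zipcode', text)               # every match replaced

print(first)      # 100081
print(at_start)   # None
print(all_codes)  # ['100081', '100084']
print(pieces)     # ['BIT', ' TSU', '']
print(spans)      # [(3, 9), (13, 19)]
print(masked)     # BIT:zipcode TSU:zipcode
```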
The following are attributes of the Match object
.pos The position in the text where the search started
.endpos The position in the text where the search ended
The following are methods of the Match object
.group(0) Returns the matched string
.start() Returns the start position of the match in the original string
.end() Returns the end position of the match in the original string
.span() Returns (.start(), .end())
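The attributes and methods listed above can be inspected on a single match object. A short sketch, reusing the sample string from the earlier examples:

```python
import re

text = 'BIT100081 TSU100084'
m = re.search(r'[1-9]\d{5}', text)

print(m.pos)       # 0   (the search started at the beginning of the text)
print(m.endpos)    # 19  (the search could run to the end of the text)
print(m.group(0))  # 100081
print(m.start())   # 3
print(m.end())     # 9
print(m.span())    # (3, 9)

# The span can be used to slice the matched text out of the original string
print(text[m.start():m.end()])  # 100081
```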
When a regular expression can match substrings of different lengths at the same position, which one is returned? The re library uses greedy matching by default: each repetition operator matches as much as it can, so the longest matching substring is returned.
Minimal matching
*? Repeats the preceding character 0 or more times, taking the shortest match
+? Repeats the preceding character 1 or more times, taking the shortest match
Whenever the length of the match can vary, adding ? after the repetition operator switches it to minimal matching.
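The difference between greedy and minimal matching is easiest to see side by side. The sample strings below are made up for illustration.

```python
import re

text = 'PYANBNCNDN'

# Greedy: .* takes as many characters as possible before the final N
greedy = re.search(r'PY.*N', text).group(0)

# Minimal: .*? stops at the first N that lets the pattern match
minimal = re.search(r'PY.*?N', text).group(0)

# The same applies to + and +?
greedy_digits = re.search(r'\d+', '100081').group(0)
minimal_digits = re.search(r'\d+?', '100081').group(0)

print(greedy)          # PYANBNCNDN
print(minimal)         # PYAN
print(greedy_digits)   # 100081
print(minimal_digits)  # 1
```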