A brief introduction to regular expressions in python (with code)

不言

Sep 14, 2018, 05:05 PM

python

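This article gives a brief introduction to regular expressions in Python, with code. It is meant as a reference, and I hope it helps anyone who needs it.

A regular expression is a logical formula for operating on strings: predefined special characters, and combinations of them, form a "rule string" that expresses a filtering logic for strings.

In Python, regular expressions are provided by the re module; import re to use them.

The re module has many functions for working with regular expressions. We start with the findall function and use it to introduce the meaning of the basic characters.

The metacharacters are: . \ * + ? ^ $ | {} [] ()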

The findall function

Scans the whole string and returns every non-overlapping match as a list.

. matches any character except the newline "\n"

import re

temp = re.findall("a.c", "abcdefagch")
print(temp)  # ['abc', 'agc']
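To make the newline exception concrete, here is a minimal sketch; the sample string is made up for illustration, and re.DOTALL is the standard flag that lets "." match "\n" as well.

```python
import re

sample = "abc a\nc"

# By default "." does not match "\n", so "a\nc" is skipped.
default = re.findall(r"a.c", sample)
# With re.DOTALL, "." matches every character, including "\n".
dotall = re.findall(r"a.c", sample, re.DOTALL)

print(default)  # ['abc']
print(dotall)   # ['abc', 'a\nc']
```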

* matches the preceding character zero or more times

temp = re.findall("a*b", "abcaaaaabcdefb")
print(temp)  # ['ab', 'aaaaab', 'b']

+ matches the preceding character one or more times

temp = re.findall("a+b", "abcaaaaabcdefb")
print(temp)  # ['ab', 'aaaaab']

? matches the preceding character zero or one time

temp = re.findall("a?b", "abcaaaaabcdefb")
print(temp)  # ['ab', 'ab', 'b']

^ matches the start of the string. In multiline mode it also matches the start of every line

temp = re.findall("^ab", "abcaaaaabcdefb")
print(temp)  # ['ab']

$ matches the end of the string. In multiline mode it also matches the end of every line

temp = re.findall("ab$", "abcaaaaabcdefab")
print(temp)  # ['ab']
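The multiline behaviour mentioned for ^ and $ can be sketched as follows. The three-line sample text is an assumption made for illustration; re.MULTILINE is the standard flag that turns multiline mode on.

```python
import re

text = "ab first\nab second\nlast ab"

# Without re.MULTILINE, ^ and $ anchor only at the start and end of the whole string.
starts_default = re.findall(r"^ab", text)
ends_default = re.findall(r"\w+$", text)

# With re.MULTILINE, ^ and $ also anchor at the start and end of every line.
starts_multiline = re.findall(r"^ab", text, re.MULTILINE)
ends_multiline = re.findall(r"\w+$", text, re.MULTILINE)

print(starts_default)    # ['ab']
print(ends_default)      # ['ab']
print(starts_multiline)  # ['ab', 'ab']
print(ends_multiline)    # ['first', 'second', 'ab']
```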

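| means "or". It matches either the expression on its left or the one on its right, trying them from left to right. If the | is not enclosed in (), its scope is the whole regular expression

temp = re.findall("abc|def", "abcdef")
print(temp)  # ['abc', 'def']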

{} {m} matches the preceding character exactly m times; {m,n} matches it m to n times; if n is omitted, it matches m or more times

temp = re.findall("a{3}", "aabaaacaaaad")
print(temp)  # ['aaa', 'aaa']
temp = re.findall("a{3,5}", "aaabaaaabaaaaabaaaaaa")
print(temp)  # ['aaa', 'aaaa', 'aaaaa', 'aaaaa']
# Matching is greedy: after three a's, if the next character is also an a,
# the match keeps growing (up to five) instead of stopping at 'aaa'
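The case where n is omitted has no example above, so here is a small sketch of it next to the fixed-count form. The sample string is made up for illustration.

```python
import re

text = "a aa aaa aaaa"

# {2,} means "two or more": the upper bound is omitted.
at_least_two = re.findall(r"a{2,}", text)
# {2} means "exactly two", so longer runs are cut into pieces of two.
exactly_two = re.findall(r"a{2}", text)

print(at_least_two)  # ['aa', 'aaa', 'aaaa']
print(exactly_two)   # ['aa', 'aa', 'aa', 'aa']
```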

[] is a character set. The corresponding position can be any one character from the set. The characters can be listed one by one or given as a range, such as [abc] or [a-c]. [^abc] negates the set, meaning any character other than a, b, or c. Most metacharacters lose their special meaning inside a set; to include a literal ], - or ^, escape it with a backslash.

temp = re.findall("a[bcd]e", "abcdefagch")
print(temp)  # []  ([bcd] stands for a single b, c, or d, so nothing matches here)
temp = re.findall("a[a-z]c", "abcdefagch")
print(temp)  # ['abc', 'agc']
temp = re.findall("[^a]", "aaaaabcdefagch")
print(temp)  # ['b', 'c', 'd', 'e', 'f', 'g', 'c', 'h']
temp = re.findall("[^ab]", "aaaaabcdefagch")
print(temp)  # ['c', 'd', 'e', 'f', 'g', 'c', 'h']  (neither a nor b is matched)
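A short sketch of how special characters behave inside a set; the sample strings are assumptions chosen for illustration.

```python
import re

# Inside [], "." is a literal dot, while \d still means "digit".
numbers = re.findall(r"[\d.]+", "3.50 and 2")
# "\-" is an escaped, literal minus sign rather than a range operator.
signed = re.findall(r"[\-\d.]+", "3.50 and -2")
# "^" that is not the first character of the set is a literal caret.
carets = re.findall(r"[a^b]+", "a^b c")

print(numbers)  # ['3.50', '2']
print(signed)   # ['3.50', '-2']
print(carets)   # ['a^b']
```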

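() A parenthesized expression forms a group. Groups are numbered starting from 1, counting each left parenthesis "(" from the left of the expression. A group is treated as a single unit and can be followed by a quantifier. A | inside a group applies only within that group.

temp = re.findall("(abc){2}a(123|456)c", "abcabca456c")
print(temp)  # [('abc', '456')]
temp = re.findall("(abc){2}a(123|456)c", "abcabca456cbbabcabca456c")
print(temp)  # [('abc', '456'), ('abc', '456')]
# When the pattern contains several groups, findall puts the strings captured
# by each group into a tuple instead of returning the full match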
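As a sketch of the numbering rule and of | being limited to its group, the following uses made-up patterns and sample strings.

```python
import re

# One group: findall returns only what that group captured.
single = re.findall(r"a(b|c)d", "abd acd")

# Nested groups are numbered by the position of their left parenthesis:
# group 1 = ((a)(b|c)), group 2 = (a), group 3 = (b|c)
nested = re.findall(r"((a)(b|c))d", "abd acd")

print(single)  # ['b', 'c']
print(nested)  # [('ab', 'a', 'b'), ('ac', 'a', 'c')]
```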

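To see the full text that was matched, we can use another function, search.

The search function

Looks for the pattern anywhere in the string and returns as soon as the first match is found. If the string contains no match, it returns None.

temp = re.search("(abc){2}a(123|456)c", "abcabca456c")
print(temp)  # <re.Match object; span=(0, 11), match='abcabca456c'>
print(temp.group())  # abcabca456c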
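Because search returns None when nothing matches, calling .group() directly can raise an AttributeError. A minimal sketch of guarding against that follows; the helper name first_match is hypothetical, not part of the re module.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None when nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m is not None else None

found = first_match(r"(abc){2}", "xxabcabcxx")
missing = first_match(r"(abc){2}", "abc")

# group(0) is the full match; group(1), group(2), ... are the numbered groups.
m = re.search(r"(abc){2}a(123|456)c", "abcabca456c")
groups = (m.group(0), m.group(1), m.group(2))

print(found)     # abcabc
print(missing)   # None
print(groups)    # ('abcabca456c', 'abc', '456')
print(m.span())  # (0, 11)
```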

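\ is the escape character. It changes the meaning of the character that follows it.

A backslash followed by a metacharacter removes its special function (that is, it turns the special character into an ordinary one).

temp = re.search(r"a\$", "abcabca456ca$")
print(temp)  # <re.Match object; span=(11, 13), match='a$'>
print(temp.group())  # a$

A backslash followed by a number refers to the string matched by the group with that number.

In the example below, \2 stands for the content of the second pair of parentheses. The number says which group, counting from 1.

a = re.search(r'(abc)(def)gh\2', 'abcdefghabc abcdefghdef').group()
print(a)  # abcdefghdef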
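One common use of a numbered backreference is spotting an accidentally doubled word. This is only a sketch; the sample sentence is made up for illustration.

```python
import re

text = "this is is a test test sentence"

# (\w+) captures a word; \1 requires the very same text again after a space.
doubled = re.findall(r"\b(\w+) \1\b", text)
first = re.search(r"\b(\w+) \1\b", text)

print(doubled)        # ['is', 'test']
print(first.group())  # is is
```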

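A backslash followed by an ordinary character gives it a special function (these are the predefined characters).

The predefined characters are: \d \D \s \S \w \W \A \Z \b \B

Predefined character classes such as \d, \s, and \w still work inside a character set.

\d a digit: [0-9]

temp = re.search(r"a\d+b", "aaa234bbb")
print(temp.group())  # a234b

\D a non-digit: [^\d]

\s any whitespace character: [<space>\t\r\n\f\v]

temp = re.search(r"a\s+b", "aaa   bbb")
print(temp.group())  # a   b

\S a non-whitespace character: [^\s]

\w any word character, including the underscore: [A-Za-z0-9_] (for str patterns in Python 3 it also matches Unicode letters and digits unless re.ASCII is set)

\W any non-word character, that is, special characters: [^\w]

temp = re.search(r"\W", "$")
print(temp.group())  # $

\A matches only at the start of the string; the same as ^ outside multiline mode

\Z matches only at the very end of the string; similar to $, except that $ also matches just before a trailing newline

\b matches the boundary between \w and \W, including the start or end of the string next to a word character

temp = re.search(r"\bas\b", "a as$d")
print(temp.group())  # as

\B matches a position that is not a word boundary, the opposite of \b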
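The anchors above are easy to confuse, so here is a small sketch of how \Z differs from $ and how \B differs from \b. The sample strings are assumptions chosen for illustration.

```python
import re

# $ also matches just before a trailing newline; \Z only at the very end.
dollar = re.search(r"abc$", "abc\n")
strict = re.search(r"abc\Z", "abc\n")

# \b requires a word boundary at that position, \B requires that there is none.
whole_words = re.findall(r"\bas\b", "as was ash as")
word_endings = re.findall(r"\Bas\b", "as was ash as")

print(dollar is not None)  # True
print(strict is None)      # True
print(whole_words)         # ['as', 'as']
print(word_endings)        # ['as']  (the "as" at the end of "was")
```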

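Next, some other commonly used re functions.

The compile function

Compiles a regular expression pattern and returns a pattern object.

rule = re.compile(r"abc\d+\w")
text = "aaaabc6def"  # renamed from str to avoid shadowing the built-in
temp = rule.findall(text)
print(temp)  # ['abc6d']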
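A compiled pattern is mainly useful when the same expression is applied many times. A minimal sketch, assuming a made-up list of input lines:

```python
import re

# Compile once, then reuse the pattern object for every line.
rule = re.compile(r"abc\d+\w")

lines = ["aaaabc6def", "no match here", "abc42x and abc7y"]
results = [rule.findall(line) for line in lines]

print(results)  # [['abc6d'], [], ['abc42x', 'abc7y']]
```

The pattern object offers the same operations as the module-level functions (findall, search, match, sub, and so on) as methods.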

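The match function

Matches only at the very beginning of the string, which has the same effect as starting the pattern with ^. It returns None if the beginning does not match.

temp = re.match("asd", "asdfasd")
print(temp.group())  # asd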
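The difference between match and search shows up as soon as the pattern does not sit at position 0. A short sketch with made-up sample strings:

```python
import re

at_start = re.match(r"asd", "asdfasd")
not_at_start = re.match(r"asd", "fasd")   # "asd" is present, but not at position 0
anywhere = re.search(r"asd", "fasd")      # search scans the whole string

print(at_start.group())  # asd
print(not_at_start)      # None
print(anywhere.span())   # (1, 4)
```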

The finditer function

Returns an iterator that yields a match object for every match, in order.

temp = re.finditer(r"\d+", "as11d22f33a44sd")
print(temp)  # <callable_iterator object at 0x00000242EEEE9E48> (the address varies)
for i in temp:
    print(i.group())
# 11
# 22
# 33
# 44
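Since finditer yields match objects rather than plain strings, each result also carries its position. A minimal sketch using the same sample string:

```python
import re

# Collect the matched text together with its start and end index.
matches = [(m.group(), m.start(), m.end())
           for m in re.finditer(r"\d+", "as11d22f33a44sd")]

for text, start, end in matches:
    print(text, start, end)
# 11 2 4
# 22 5 7
# 33 8 10
# 44 11 13
```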

The split function

Splits a string wherever the pattern matches and returns the pieces in a list.

If a split happens at the very start or end of the string, the result contains an empty string there.

temp = re.split(r"\d+", "as11d22f33a44sd55")
print(temp)  # ['as', 'd', 'f', 'a', 'sd', '']

Splitting with a character set

Below, the string is split at every a or b. A delimiter at the start of the string, or two delimiters next to each other, leaves an empty string between them, which is why three empty strings appear.

temp = re.split("[ab]", "ab123b456ba789b0")
print(temp)  # ['', '', '123', '456', '', '789', '0']
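Two follow-ups that often come with split are dropping the empty strings and limiting the number of splits. This is a sketch using the same sample strings; maxsplit is the standard optional parameter of re.split.

```python
import re

parts = re.split(r"[ab]", "ab123b456ba789b0")
# Keep only the non-empty pieces.
non_empty = [p for p in parts if p]

# maxsplit=2 stops after two splits; the rest of the string stays in one piece.
limited = re.split(r"\d+", "as11d22f33a44sd55", maxsplit=2)

print(non_empty)  # ['123', '456', '789', '0']
print(limited)    # ['as', 'd', 'f33a44sd55']
```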

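The sub function

Replaces the parts matched by the pattern and returns the new string.

temp = re.sub(r"\d+", "_", "ab123b456ba789b0")
print(temp)  # ab_b_ba_b_

An additional count argument sets the maximum number of replacements. The default, 0, replaces every match.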
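A small sketch of the count argument, together with a group reference in the replacement string. The replacement format "<...>" is made up for illustration.

```python
import re

text = "ab123b456ba789b0"

# count=2 replaces only the first two matches.
first_two = re.sub(r"\d+", "_", text, count=2)
# \1 in the replacement inserts what group 1 captured.
wrapped = re.sub(r"(\d+)", r"<\1>", text)

print(first_two)  # ab_b_ba789b0
print(wrapped)    # ab<123>b<456>ba<789>b<0>
```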

The subn function

Replaces the parts matched by the pattern and returns a tuple containing the new string and the number of replacements made.

temp = re.subn(r"\d+", "_", "ab123b456ba789b0")
print(temp)  # ('ab_b_ba_b_', 4)

Next, named groups.

temp = re.search(r"(?P<number>\d+)(?P<letter>[a-zA-Z])", "ab123b456ba789b0")
print(temp.group("number"))  # 123
print(temp.group("letter"))  # b

A group is given a name with the form (?P<name>...).
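Named groups can also be read all at once, and they keep their ordinary numbers. A minimal sketch with the same pattern:

```python
import re

m = re.search(r"(?P<number>\d+)(?P<letter>[a-zA-Z])", "ab123b456ba789b0")

# groupdict() maps every group name to the text it captured.
named = m.groupdict()
# A named group is still reachable by its position.
by_index = m.group(1)

print(named)     # {'number': '123', 'letter': 'b'}
print(by_index)  # 123
```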

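Finally, a word on lazy matching and greedy matching.

temp = re.search(r"\d+", "123456")
print(temp.group())  # 123456

This is greedy matching: the quantifier consumes as much as it can.

temp = re.search(r"\d+?", "123456")
print(temp.group())  # 1

Adding a ? after the quantifier makes it lazy: it matches as few characters as possible, so here it stops after a single digit.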
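The difference matters most when a pattern has something after the quantifier. A sketch with a made-up tag-like string:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)
# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", html)
# The same ? suffix works on {m,n}: take three a's even though five would fit.
lazy_range = re.search(r"a{3,5}?", "aaaaaa").group()

print(greedy)      # ['<b>bold</b> and <i>italic</i>']
print(lazy)        # ['<b>', '</b>', '<i>', '</i>']
print(lazy_range)  # aaa
```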
