
How do regular expression patterns match strings?

Nov 30, 2017, 09:14 AM

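A regular expression (regex) is a concept from computer science. Regular expressions are typically used to search for, and replace, text that fits a given pattern (rule), and many programming languages support them for string manipulation. This article covers the basics of matching strings with regular expression patterns.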

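In a real project, one feature required parsing strings that follow specific patterns. The parts of the existing code base that already did this all worked by checking for specific characters one at a time. That approach has several drawbacks:

The logic is error-prone

Boundary conditions are easily missed

The code is complex and hard to understand and maintain

Performance is poor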

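One .cpp file in the code base runs to more than two thousand lines, and a single method in it spends over 400 lines just parsing strings, comparing one character after another. It is painful to read. Many of the comments are out of date and the coding style varies from place to place, so the file has clearly passed through many hands. Continuing down that path was not really an option, and regular expressions were the natural alternative.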

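This article is a summary of the basics of writing regular expressions for string matching. It has two main parts, followed by a worked example:

Basic rules for matching strings

Matching, searching and replacing with regular expressions

The regular expression grammar described here is ECMAScript, and the programming language is C++. Other grammars and languages are not covered. The code fragments are written as if <regex> and <iostream> are included and using namespace std; is in effect.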

Basic rules for matching strings

1. Match a fixed string

regex e("abc");

2. Match a fixed string, ignoring case

regex e("abc", regex_constants::icase);

3. Match a fixed string followed by one more character, ignoring case

regex e("abc.", regex_constants::icase);  // .  Any single character except a newline
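As a minimal sketch of rules 1 to 3, the helpers below wrap each pattern in a whole-string check with std::regex_match. The function names are hypothetical and exist only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper names, chosen for this sketch only.

// True if the whole of s is exactly "abc" (case-sensitive).
bool is_abc(const std::string& s) {
    static const std::regex e("abc");
    return std::regex_match(s, e);
}

// True if the whole of s is "abc" in any letter case.
bool is_abc_icase(const std::string& s) {
    static const std::regex e("abc", std::regex_constants::icase);
    return std::regex_match(s, e);
}

// True if s is "abc" (any case) followed by exactly one more character.
bool is_abc_plus_one(const std::string& s) {
    static const std::regex e("abc.", std::regex_constants::icase);
    return std::regex_match(s, e);
}
```

With these, is_abc("ABC") is false while is_abc_icase("ABC") is true, and is_abc_plus_one("abc") is false because the dot requires one additional character.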

4. Match zero or one of the preceding character

regex e("abc?");    // ?  Zero or one of the preceding character (here, c)

5. Match zero or more of the preceding character

regex e("abc*");    // *  Zero or more of the preceding character (here, c)

6. Match one or more of the preceding character

regex e("abc+");    // +  One or more of the preceding character (here, c)
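The three quantifiers differ only in how many copies of the preceding character they accept. The sketch below uses a hypothetical helper, full_match, to test a whole string against a pattern so the differences can be checked side by side.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of s matches the given pattern.
bool full_match(const std::string& s, const std::string& pattern) {
    const std::regex e(pattern);
    return std::regex_match(s, e);
}
```

For example, full_match("ab", "abc?") and full_match("ab", "abc*") both hold, but full_match("ab", "abc+") does not, because + needs at least one c. Note that the quantifier applies only to c, so "abcabc" does not match "abc+".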

7. Match characters from a given set

regex e("ab[cd]*");    // [...]  Any one character listed inside the square brackets

8. Match characters outside a given set

regex e("ab[^cd]*");    // [^...]  Any one character not listed inside the square brackets
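A short sketch of rules 7 and 8, with hypothetical helper names: the first accepts "ab" followed only by c or d characters, the second accepts "ab" followed only by characters other than c and d.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers for this sketch.

// "ab" followed by any number (including zero) of 'c' or 'd'.
bool ab_then_only_cd(const std::string& s) {
    static const std::regex e("ab[cd]*");
    return std::regex_match(s, e);
}

// "ab" followed by any number (including zero) of characters other than 'c' and 'd'.
bool ab_then_no_cd(const std::string& s) {
    static const std::regex e("ab[^cd]*");
    return std::regex_match(s, e);
}
```

Both helpers accept the bare string "ab", because * also allows zero repetitions.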

9. Match characters from a set, a fixed number of times

regex e("ab[cd]{3}"); // {n}  Exactly n (here 3) repetitions of the preceding element

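10. Match characters from a set, a number of times within a range

regex e("ab[cd]{3,}");  // {n,}  n or more (here 3 or more) repetitions of the preceding element
regex e("ab[cd]{3,5}");  // {n,m}  Between n and m (here 3 to 5, inclusive) repetitions of the preceding element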
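The counted forms can be compared with a small sketch. The helper name repeats_ok is hypothetical; it simply checks a whole string against one of the three patterns above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of s matches the given pattern.
bool repeats_ok(const std::string& s, const std::string& pattern) {
    const std::regex e(pattern);
    return std::regex_match(s, e);
}
```

With "ab[cd]{3}", the string "abcdc" matches but "abcd" and "abcdcd" do not. With "ab[cd]{3,5}", five trailing characters are accepted and six are rejected.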


11. Match any one of several alternatives

regex e("abc|de[fg]");    // |  Matches the expression on either side of the |
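The | operator has the lowest precedence, so the pattern above reads as "abc" or "de[fg]", not as "ab", then c or d, then "e[fg]". A minimal sketch with a hypothetical helper name:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of s is "abc", "def" or "deg".
bool is_abc_or_def_deg(const std::string& s) {
    static const std::regex e("abc|de[fg]");
    return std::regex_match(s, e);
}
```

Here "abc", "def" and "deg" are accepted, while "abdef" is rejected, which confirms how the alternation is grouped.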

12. Grouping

regex e("(abc)de+"); // ()  Parentheses define a sub-group (capture group)

13. Refer back to a sub-group

regex e("(abc)de+\\1");  // \1  Matches, at this position, the text captured by group 1
regex e("(abc)c(de+)\\2\\1");  // \2  Matches, at this position, the text captured by group 2
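A back-reference matches whatever text the group actually captured, not the group's pattern again. The sketch below, with hypothetical helper names, checks both patterns from rule 13.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers for this sketch.

// "abc", then "d" and one or more "e", then the captured "abc" again.
bool matches_first_backref(const std::string& s) {
    static const std::regex e("(abc)de+\\1");
    return std::regex_match(s, e);
}

// "abc", "c", a "de+" run, the same "de+" run again, then "abc" again.
bool matches_both_backrefs(const std::string& s) {
    static const std::regex e("(abc)c(de+)\\2\\1");
    return std::regex_match(s, e);
}
```

For the second pattern, "abccdeedeeabc" matches because group 2 captures "dee" and \2 then requires exactly "dee" again, whereas "abccdeedeabc" does not match.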


14. Match at the start of the string

regex e("^abc.");
// ^  Start of the string: finds abc plus one more character at the very beginning


15. Match at the end of the string

regex e("abc.$");
// $  End of the string: finds abc plus one more character at the very end
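Anchors matter most when searching inside a longer string, so this sketch uses std::regex_search rather than a whole-string match. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers for this sketch.

// True if s begins with "abc" followed by one more character.
bool begins_with_abc_plus_one(const std::string& s) {
    static const std::regex e("^abc.");
    return std::regex_search(s, e);
}

// True if s ends with "abc" followed by one more character.
bool ends_with_abc_plus_one(const std::string& s) {
    static const std::regex e("abc.$");
    return std::regex_search(s, e);
}
```

For instance, "abcd efg" satisfies the first helper and "xxabcd" satisfies the second, while "xabcd" fails the first and "abcdx" fails the second.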


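Those are the most basic patterns. To match a character that has a special meaning in a pattern, escape it with a backslash: to match a literal ".", write \. in the pattern. In a C++ string literal the backslash itself must be doubled, giving "\\.". If these basic rules do not cover your needs, consult a fuller reference for the ECMAScript grammar. Once the pattern is written, the next step is to use it to match, search or replace.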
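To make the escaping rule concrete, the sketch below contrasts an unescaped dot with an escaped one, and also shows a C++11 raw string literal, which avoids doubling the backslash. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers for this sketch.

// Unescaped dot: 'a', any one character, 'b'.
bool a_any_b(const std::string& s) {
    static const std::regex e("a.b");
    return std::regex_match(s, e);
}

// Escaped dot: only the literal text "a.b".
bool a_dot_b(const std::string& s) {
    static const std::regex e("a\\.b");
    return std::regex_match(s, e);
}

// Same pattern as a_dot_b, written as a raw string literal.
bool a_dot_b_raw(const std::string& s) {
    static const std::regex e(R"(a\.b)");
    return std::regex_match(s, e);
}
```

Here a_any_b accepts both "a.b" and "axb", while the two escaped variants accept only "a.b".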

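Matching, searching and replacing with regular expressions

With a pattern in hand, the string under test has to be checked against it. There are three ways to do this: match (regex_match), search (regex_search) and replace (regex_replace).

Matching is simple: pass the string and the pattern to regex_match, which returns a bool saying whether the string satisfies the pattern. The entire string str has to match.

bool match = regex_match(str, e);
// true only if the whole of str matches e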



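Searching looks for a substring anywhere in the string that satisfies the pattern. regex_search returns true as long as str contains at least one such substring.

bool match = regex_search(str, e);
// true if some substring of str matches e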
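The difference between the two calls shows up when the pattern matches only part of the string. A minimal sketch, with hypothetical wrapper names:

```cpp
#include <regex>
#include <string>

// Hypothetical wrappers for this sketch.

// True only if the entire string matches the pattern.
bool matches_whole(const std::string& str, const std::regex& e) {
    return std::regex_match(str, e);
}

// True if any substring of the string matches the pattern.
bool contains_match(const std::string& str, const std::regex& e) {
    return std::regex_search(str, e);
}
```

With the pattern "[0-9]+", the string "order 42" fails matches_whole but passes contains_match, while "42" passes both.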


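In many cases a bool is not enough and we need the matched text itself. To get it, divide the pattern into groups (see item 12 under "Basic rules for matching strings") and pass an smatch to regex_search; the smatch then holds the substring matched by each sub-group.

smatch m;
bool found = regex_search(str, m, e);
// m[0] is the whole match; m[1], m[2], ... are the sub-groups
for (size_t n = 0; n < m.size(); ++n)
{
  cout << "m[" << n << "].str()=" << m[n].str() << endl;
}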
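The same idea can be packaged as a function that returns the captured text instead of printing it. This is a sketch under stated assumptions: the name capture_groups and the choice to return a std::vector are illustrative, not part of the original code.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole match followed by each sub-group,
// or an empty vector if nothing in str matches e.
std::vector<std::string> capture_groups(const std::string& str, const std::regex& e) {
    std::vector<std::string> out;
    std::smatch m;
    if (std::regex_search(str, m, e)) {
        // m[0] is the whole match; m[1]..m[m.size()-1] are the sub-groups.
        for (std::size_t n = 0; n < m.size(); ++n) {
            out.push_back(m[n].str());
        }
    }
    return out;
}
```

Called with the string "x: key=value;" and the pattern "([a-z]+)=([a-z]+)", it returns "key=value", "key" and "value" in that order.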


Replacement also works in terms of the groups in the pattern.

cout << regex_replace(str, e, "$1 is on $2");
// $1 and $2 stand for the text captured by groups 1 and 2


Each match is replaced by the text of group 1, then " is on ", then the text of group 2.
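A concrete sketch of this replacement follows. The pattern (two lower-case words separated by one space) and the function name are assumptions made for illustration, since the article does not give the pattern it used.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every "word word" pair as "word is on word".
// Text that does not match the pattern is copied through unchanged.
std::string rewrite_pairs(const std::string& str) {
    static const std::regex e("([a-z]+) ([a-z]+)");
    return std::regex_replace(str, e, "$1 is on $2");
}
```

For example, rewrite_pairs("cat mat") yields "cat is on mat", and every further matching pair in the same string is rewritten in the same way.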

All three functions have many overloads to suit different needs.

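A worked example

Requirement: find strings of the form sectionA("sectionB") or sectionA ("sectionB"), and extract sectionA and sectionB. Neither section contains digits, letters may be upper or lower case, and each section has at least one character.

Analysis: the string falls into two parts, sectionA and sectionB, so groups are needed.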

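Step 1: write a pattern that matches one section

[a-zA-Z]+

Step 2: a space may appear between sectionA and sectionB. Assume for now that there is at most one

\\s?

Combining the two gives a pattern that covers the requirement. But how should it be arranged so that it yields two groups?

[a-zA-Z]+\\s?[a-zA-Z]+

That is clearly not right. Under the grouping rule, each group must be wrapped in ():

regex e("([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");

Here, the \\( and \" after \\s?, together with the \" and \\) at the end, escape the parentheses and quotation marks that surround sectionB.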

With the pattern complete, first test the string with regex_match; if it matches, go on to call regex_search to retrieve the groups.

if (regex_match(str, e))
{
  smatch m;
  auto found = regex_search(str, m, e);
  for (size_t n = 0; n < m.size(); ++n)
  {
    cout << "m[" << n << "].str()=" << m[n].str() << endl;
  }
}
else
{
  cout << "Not matched" << endl;
}

The first element of m is the whole matched substring; the substrings for group 1 and group 2 follow it.
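The worked example can also be written as one self-contained function. This sketch departs slightly from the code above: it uses the std::regex_match overload that fills an smatch directly, so a second call to regex_search is not needed. The function name parse_sections and the vector return type are assumptions for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns {sectionA, sectionB} if str has the form
// sectionA("sectionB") or sectionA ("sectionB"), otherwise an empty vector.
std::vector<std::string> parse_sections(const std::string& str) {
    static const std::regex e("([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");
    std::smatch m;
    if (!std::regex_match(str, m, e)) {
        return {};
    }
    // m[0] is the whole string; m[1] and m[2] are the two groups.
    return {m[1].str(), m[2].str()};
}
```

Both foo("bar") and foo ("bar") yield "foo" and "bar". Inputs with a digit in a section, with two spaces before the parenthesis, or without the quotation marks are rejected.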

That covers the basics of matching strings with regular expression patterns. I hope it helps.
