Detailed explanation of regular expressions
The regular expression language consists of two basic character types: literal (normal) text characters and metacharacters.
Metacharacters are characters that the regular expression engine treats specially instead of matching them literally. A metacharacter expression can stand for a single character placed in [ ] (for example, [a] matches the single lowercase character a), for a set or range of characters (for example, [a-d] matches any one of a, b, c, d), or for a predefined class (for example, \w matches any English letter, digit, or underscore). Common metacharacters are as follows:
Common metacharacters
Characters | Description | Special instructions |
---|---|---|
. | Matches any character except the newline character (\n) | ~ |
[abcde] | Matches any one of the characters a, b, c, d, e | The characters inside the brackets are in an "or" relationship |
[a-h] | Matches any one character between a and h | ~ |
[^fgh] | Matches any one character that is not f, g, or h | Adding ^ before the first character inside the square brackets [ ] indicates negation: it matches any character that does not appear inside the brackets |
\w | Matches any one uppercase or lowercase English letter, digit 0 to 9, or underscore; equivalent to [a-zA-Z0-9_] | ~ |
\W | The opposite of \w; equivalent to [^a-zA-Z0-9_] | ~ |
\s | Matches any one whitespace character; equivalent to [ \f\n\r\t\v] | ~ |
\S | The opposite of \s; equivalent to [^\s] | ~ |
\d | Matches any single digit between 0 and 9; equivalent to [0-9] | ~ |
\D | The opposite of \d; equivalent to [^0-9] | ~ |
[\u4e00-\u9fa5] | Matches any single Chinese character | The Chinese characters are expressed here by their Unicode encoding |
\b | Matches the beginning or end of a word | ~ |
^ | Matches the beginning of the string | When placed before the first character inside square brackets, it means negation instead |
$ | Matches the end of the string | ~ |
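The entries in the table can be tried out directly. The following is a minimal sketch using Python's standard `re` module (the choice of Python and all sample strings are assumptions made for illustration; the article does not prescribe a language). The range `[\u4e00-\u9fa5]` is the commonly used range for Chinese characters and is likewise assumed here.

```python
import re

# "." matches any character except a newline.
dot_hits = re.findall(r".", "a\nb")                      # ['a', 'b']

# [a-h] matches one character in the range; [^fgh] negates the set.
range_hits = re.findall(r"[a-h]", "fish")                # ['f', 'h']
negated_hits = re.findall(r"[^fgh]", "fish")             # ['i', 's']

# \d is a digit, \D is anything that is not a digit.
digit_hits = re.findall(r"\d", "a1b22")                  # ['1', '2', '2']
non_digit_hits = re.findall(r"\D", "a1b22")              # ['a', 'b']

# In Python 3, \w also matches non-ASCII letters by default; re.ASCII
# restricts it to [a-zA-Z0-9_], which is what the table describes.
word_hits = re.findall(r"\w", "a_1-é", flags=re.ASCII)   # ['a', '_', '1']

# A Unicode range picks out Chinese characters.
chinese_hits = re.findall(r"[\u4e00-\u9fa5]", "abc 北京 123")  # ['北', '京']

# \b, ^ and $ match positions rather than characters.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")  # ['cat', 'cat']
starts_with_cat = re.search(r"^cat", "cat concat") is not None  # True
ends_with_cat = re.search(r"cat$", "cat concat") is not None    # True
```

Note that engines differ on details such as whether \w and \d are limited to ASCII, so the equivalences in the table are best read as the ASCII behavior.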
The metacharacters above each match a single character. If you want to match multiple characters at the same time, you need to use qualifiers (quantifiers). The following are some common qualifiers (n and m in the table below are integers). A qualifier applies to the unit preceding the symbol:
- If the symbol is preceded by a single character, then that character is the unit.
- If the symbol is preceded by parentheses enclosing a longer string, then the entire parenthesized expression is the unit.
Qualifier | Description | Special instructions |
---|---|---|
* | Matches the preceding unit zero or more times; equivalent to {0,} | ~ |
? | Matches the preceding unit zero or one time; equivalent to {0,1} | ~ |
+ | Matches the preceding unit at least once; equivalent to {1,} | ~ |
{n} | Matches the preceding unit exactly n times | ~ |
{n,} | Matches the preceding unit at least n times | ~ |
{n,m} | Matches the preceding unit n to m times | ~ |
\b | Matches a word boundary | ~ |
^ | The string must start with the specified characters | ~ |
$ | The string must end with the specified characters | ~ |
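To see how each qualifier behaves, and what counts as the "unit" it repeats, here is a small sketch in Python (the helper `matches` and the sample strings are hypothetical and exist only for this illustration).

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# * allows zero repetitions, + requires at least one, ? allows at most one.
star_ok = matches(r"ab*", "a")               # True
plus_ok = matches(r"ab+", "a")               # False
optional_ok = matches(r"colou?r", "color")   # True

# {n}, {n,} and {n,m} bound the repetition count.
exactly_three = matches(r"\d{3}", "123")     # True
at_least_two = matches(r"\d{2,}", "1")       # False
two_to_four = matches(r"\d{2,4}", "12345")   # False

# The unit is the single preceding character unless parentheses group more.
char_unit = matches(r"ab{2}", "abb")         # True: only b is repeated
group_unit = matches(r"(ab){2}", "abab")     # True: the whole group is repeated
```

`re.fullmatch` is used so that a result of False really means the count was wrong, not that a shorter substring happened to match.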
Advanced 1 - Multi-selection structure
The vertical bar | separates alternatives. Some examples and their meanings:
- Windows98|Windows2000|WindowsXP matches Windows98, Windows2000, or WindowsXP.
- ^Windows98|Windows2000|WindowsXP$ matches a string that starts with Windows98, or contains Windows2000, or ends with WindowsXP. Note that ^ and $ each fall inside the scope of a single alternative, because the only boundaries of | are the beginning of the expression, the end of the expression, and parentheses.
- Windows(98|2000|XP) matches Windows followed by 98, 2000, or XP.
Summary: the multi-selection structure can include many characters, but its scope cannot extend beyond the enclosing parentheses.
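The scope of | is easy to get wrong, so a quick check helps. The sketch below, in Python, compares the anchored pattern from the examples with a parenthesized variant; the fully anchored pattern `^Windows(98|2000|XP)$` and the sample sentences are assumptions added for contrast, not patterns taken from the text.

```python
import re

# ^ applies only to the first alternative and $ only to the last one.
loose = re.compile(r"^Windows98|Windows2000|WindowsXP$")

# Parentheses limit the scope of |, so both anchors apply to every alternative.
strict = re.compile(r"^Windows(98|2000|XP)$")

contains_2000 = "I use Windows2000 daily"

loose_hit = loose.search(contains_2000) is not None       # True: middle alternative is unanchored
strict_hit = strict.search(contains_2000) is not None     # False: the whole string must match
strict_exact = strict.search("WindowsXP") is not None     # True
loose_xp_middle = loose.search("WindowsXP rocks") is not None  # False: XP must be at the end
```

If the intent is "the whole string is one of these names", the parenthesized form is usually the one that is wanted.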
Advanced 2 - Grouping and Backreferences
Grouping
- We already know how to repeat a single character.
- But what should you do if you want to repeat a whole string? You can use parentheses to specify a subexpression (also called a group).
- (\d{1,3}\.){3}\d{1,3} is a simple expression for matching an IP address.
- But it also matches impossible IP addresses such as 256.300.888.999. Can you write a more accurate regex? The following expression restricts each segment to the range 0 to 255:
((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)
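The difference between the two IP expressions can be checked with a short Python sketch. The names `SIMPLE_IP`, `OCTET`, and `STRICT_IP` are hypothetical, and `fullmatch` is used on the assumption that the whole input should be an address.

```python
import re

# Simple version: any one to three digits per segment.
SIMPLE_IP = re.compile(r"(\d{1,3}\.){3}\d{1,3}")

# Stricter version: each segment must lie in the range 0 to 255.
OCTET = r"(2[0-4]\d|25[0-5]|[01]?\d\d?)"
STRICT_IP = re.compile(r"(" + OCTET + r"\.){3}" + OCTET)

bogus = "256.300.888.999"
valid = "192.168.0.1"

simple_accepts_bogus = SIMPLE_IP.fullmatch(bogus) is not None   # True
strict_accepts_bogus = STRICT_IP.fullmatch(bogus) is not None   # False
strict_accepts_valid = STRICT_IP.fullmatch(valid) is not None   # True
```

Building the strict pattern from one `OCTET` string keeps the repeated segment rule in a single place.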
Backreference
- After a subexpression has been specified with parentheses (grouped), the text matched by that subexpression is captured and can be processed further, either within the expression or by other programs.
- By default, each group automatically receives a group number. The rule is: using the left parenthesis of each group as the marker, count from left to right; the first group is number 1, the second is number 2, and so on.
Example:
- \b(\w+)\b\s+\1\b can be used to match repeated words. It matches the repeated words in text such as: where where go, tom tom happy.
- In the regular expression, the parentheses at the front capture (group) some text, and the captured content is then referenced later in the expression using \1, \2, and so on (the first pair of parentheses is \1, and so forth).
- If parentheses are nested inside other parentheses, as in (\w+(.?)), remember to number the groups by their left parenthesis (, counting from left to right.
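A brief Python sketch of the repeated-word expression and of group numbering follows. The date-like pattern used for the nesting demonstration is a hypothetical example chosen because its groups are easy to tell apart.

```python
import re

# \1 refers back to whatever the first group captured.
REPEATED = re.compile(r"\b(\w+)\b\s+\1\b")

first = REPEATED.search("where where go")
first_match = first.group(0)                        # 'where where'
first_word = first.group(1)                         # 'where'

no_repeat = REPEATED.search("no repeats here")      # None

# The captured text can also be used by the program, here to drop the duplicate.
deduplicated = REPEATED.sub(r"\1", "tom tom happy")  # 'tom happy'

# Nested groups are numbered by the position of their opening parenthesis.
nested = re.match(r"((\d{4})-(\d{2}))", "2024-05")
outer = nested.group(1)                             # '2024-05'
year = nested.group(2)                              # '2024'
month = nested.group(3)                             # '05'
```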
Advanced 3 - Look Around (Zero Width Assertion)
- Lookaround matches specific positions in the text, similar to \b, ^, and $. Lookaround does not consume characters.
- Lookaround comes in two kinds: sequential order (lookahead) and reverse order (lookbehind).
Sequential order (lookahead):
- (?=exp): the text following the position can match exp. For example, (?=\d) means the right side of the current position is a digit.
- (?!exp): the text following the position cannot match exp. For example, (?!\d) means the right side of the current position is not a digit.
Reverse order (lookbehind):
- (?<=exp): the text in front of the position can match exp. For example, (?<=\d) means the left side of the current position is a digit.
- (?<!exp): the text in front of the position cannot match exp. For example, (?<!\d) means the left side of the current position is not a digit.
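The four assertions can be compared side by side on one string. This is a minimal Python sketch; the sample strings are made up for the illustration.

```python
import re

sample = "a1b2c"

# Lookahead: inspect what follows without consuming it.
before_digit = re.findall(r"\w(?=\d)", sample)       # ['a', 'b']
before_non_digit = re.findall(r"\w(?!\d)", sample)   # ['1', '2', 'c']

# Lookbehind: inspect what precedes without consuming it.
after_digit = re.findall(r"(?<=\d)\w", sample)       # ['b', 'c']
after_non_digit = re.findall(r"(?<!\d)\w", sample)   # ['a', '1', '2']

# Because the assertion is zero-width, the "$" is required but not returned.
prices = re.findall(r"(?<=\$)\d+", "cost $42, id 7")  # ['42']
```

In each result only the `\w` or `\d+` part appears, which shows that the text inside the assertion is checked but never included in the match.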
Advanced 4 - Greedy and Lazy Matching
- When a regular expression contains a quantifier that can repeat (a code that specifies a count, such as *, {3,12}, etc.), the usual behavior is to match as many characters as possible.
- Consider the regular expression a.*b. It matches the longest string that starts with a and ends with b. If you use it to search aabab, it matches the entire string aabab. This is called greedy matching.
- Sometimes we need lazy matching instead, that is, matching as few characters as possible. All of the quantifiers given above can be converted into lazy matching patterns: just add a question mark ? after the quantifier.
- In this way, .*? means matching any number of repetitions, but using the fewest repetitions that still allow the entire match to succeed.
- a.*?b matches the shortest string that starts with a and ends with b. If applied to aabab, it matches aab and ab.
Summary:
The difference between greedy and lazy mode is that lazy mode has one more question mark ? after the quantifier.
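The aabab example can be reproduced directly. The sketch below uses Python; the tag-like string in the second half is an additional, assumed example of the same effect.

```python
import re

# Greedy: .* takes as much as possible, so the match runs to the last b.
greedy = re.findall(r"a.*b", "aabab")          # ['aabab']

# Lazy: .*? takes as little as possible, so each match stops at the first b.
lazy = re.findall(r"a.*?b", "aabab")           # ['aab', 'ab']

# The same contrast on tag-like text.
greedy_tags = re.findall(r"<.+>", "<b>x</b>")   # ['<b>x</b>']
lazy_tags = re.findall(r"<.+?>", "<b>x</b>")    # ['<b>', '</b>']
```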
Advanced 5 - Priority of pattern matching
When using regular expressions, you need to pay attention to the order of matching. Operators with the same priority are usually evaluated from left to right, and operators with different priorities are evaluated from higher to lower. The matching priority of the various operators, from high to low, is shown in the following table.
Order | Metacharacters | Description |
---|---|---|
1 | \ | Escape character |
2 | (), (?:), (?=), [] | Pattern units (groups) and character sets |
3 | *, +, ?, {n}, {n,}, {n,m} | Repeated matching |
4 | ^, $, \b, \B, \A, \Z | Boundary restrictions |
5 | \| (the vertical bar) | Pattern selection |
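The effect of these priorities shows up when the same characters are grouped differently. The following Python sketch illustrates three of the levels; the helper `matches` and all patterns are hypothetical examples rather than part of the table.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# Escaping is applied first: \+ is a literal plus sign, not a quantifier.
literal_plus = matches(r"a\+", "a+")        # True
not_a_repeat = matches(r"a\+", "aa")        # False

# A quantifier binds to the unit right before it, so ab+ repeats only b.
only_b_repeats = matches(r"ab+", "abab")    # False
group_repeats = matches(r"(ab)+", "abab")   # True

# Alternation has the lowest priority: ^a|b$ reads as (^a) or (b$).
loose = re.search(r"^a|b$", "axx") is not None      # True
strict = re.search(r"^(a|b)$", "axx") is not None   # False
```

When the default grouping is not what is wanted, parentheses are the way to override it, as the second and third pairs show.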
1 Question: How should the expression that matches the \$ in a string such as 333333\$33333 be written?
2 Question: If the preg_match function in PHP is given the expression in single quotes and in double quotes, how should the expression for matching the above \$ be written in each case?
Answer:
- The rule required for the expression is \\\$.
- Using single quotes, the string is '/\\\\\\$/'. (For the convenience of viewing, we split it into '/\\ \\ \\ $/'.)
- Using double quotes, the string is "/\\\\\\\$/". (For the convenience of viewing, we split it into "/\\ \\ \\ \$/".)
Why?
Answer:
- Single-quoted strings in PHP do not process escape sequences other than those for \ itself (and the single quote), so we need 6 \ to generate the 3 \ of the expression.
- In addition to escaping \, double-quoted strings need one more \ to escape $, so 7 \ are required.
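The same two layers of escaping (first the string literal, then the regular expression engine) exist in most languages. The sketch below is a Python analogue, not PHP: Python's raw strings play the role of "no string-level escaping", ordinary literals need the backslashes doubled, and there is no counterpart to PHP's extra \ for $ because Python strings do not interpolate variables. The subject string is an assumed example.

```python
import re

# The subject text is 333333\$33333; in an ordinary literal, \\ is one backslash.
subject = "333333\\$33333"

# Raw string: the regex engine receives exactly \\\$ (three backslashes and $).
pattern_raw = r"\\\$"

# Ordinary literal: six backslashes collapse to three before the engine sees them.
pattern_plain = "\\\\\\$"

same_pattern = pattern_raw == pattern_plain    # True
backslash_count = pattern_raw.count("\\")      # 3

found = re.search(pattern_raw, subject)
matched_text = found.group(0)                  # a backslash followed by $
```

Counting the backslashes that survive each layer, as done here, is a practical way to verify a quoted expression in any language.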