Introduction to PHP regular expressions

Today I read a tutorial that covered some uses of regular expressions. It mainly discussed splitting, matching, searching, and replacing text with regular expressions, along with some introductory background and common examples, so I compiled them to share with you.

1. Introduction and function of regular expressions.

01. What is a regular expression?

A regular expression (regex or regexp, abbreviated RE) is, in computer science, a single string that describes or matches a set of strings according to certain syntax rules. Many text editors and other tools use regular expressions to search for and/or replace text that matches a pattern, and many programming languages support string manipulation with regular expressions.


02. Main functions: segmentation, matching, search, and replacement.

Expression	Matches
/^\s*$/	A blank line.
/\d{2}-\d{5}/	An ID number consisting of two digits, a hyphen, and five more digits.
/<\s*(\S+)(\s[^>]*)?>[\s\S]*<\s*\/\1\s*>/	An HTML tag.
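The four main uses named above (splitting, matching, searching, replacing) map onto PHP's preg_ functions roughly as follows. This is a minimal sketch; the sample strings are made up for illustration.

```php
<?php
// Hypothetical sample data, used only for illustration.
$csv  = "apple, banana;cherry";
$text = "Order 12-34567 shipped, order 98-76543 pending.";

// Splitting: break the string on a comma or semicolon plus surrounding whitespace.
$parts = preg_split('/\s*[,;]\s*/', $csv);

// Matching: does the text contain an ID of the form dd-ddddd? Returns 1 or 0.
$found = preg_match('/\d{2}-\d{5}/', $text, $first);

// Searching: collect every such ID; returns the number of matches.
$count = preg_match_all('/\d{2}-\d{5}/', $text, $all);

// Replacing: mask the five trailing digits of each ID.
$masked = preg_replace('/(\d{2})-\d{5}/', '$1-*****', $text);

print_r($parts);
echo $first[0], "\n", $count, "\n", $masked, "\n";
```

preg_match stops at the first hit, while preg_match_all keeps going, which is why the two calls are shown side by side.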

2. Two commonly used regular functions in PHP.

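preg_match: a Perl-compatible regular expression function. (It is generally the faster of the two, and the pattern must be wrapped in a start and end delimiter of your choice.)

ereg: a regular expression function based on POSIX (Unix, scripts). The ereg family was deprecated in PHP 5.3.0 and removed in PHP 7.0.0, so current code has to use the preg_ functions.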
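The delimiter requirement for preg_match can be illustrated with a short sketch. The subject string and the example.com address are placeholders chosen for this example only.

```php
<?php
// Placeholder subject string for illustration.
$subject = "see http://example.com/path for details";

// With "/" as the delimiter, every literal slash in the pattern must be escaped.
$withSlash = preg_match('/http:\/\/[a-z.]+\/path/', $subject);

// With "#" as the delimiter, the slashes can stay as they are.
$withHash = preg_match('#http://[a-z.]+/path#', $subject);

var_dump($withSlash === $withHash);
```

Picking a delimiter that does not occur in the pattern keeps the expression easier to read.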

3. Elements included in regular expressions.

01. Atoms (ordinary characters: a-z A-Z 0-9, atom tables, escape characters).
02. Metacharacters (characters with special functions).
03. Pattern modifiers (built-in switches that change how a pattern is applied, somewhat like functions).

4. "Atoms" in regular expressions.

01. a-z A-Z _ 0-9 // The most common characters.

02. (abc) (skd) // A unit enclosed in parentheses.

03. [abcs] [^abd] // An atom table enclosed in square brackets matches any one of the listed characters; a ^ at the start of the table means exclusion, that is, any character not listed.

04. Escape characters (case sensitive)

\d matches any digit == [0-9].
\D matches any non-digit == [^0-9].

\w matches any English letter, digit, or underscore == [a-zA-Z_0-9].
\W matches anything that is not a letter, digit, or underscore, so it is used to match special symbols == [^a-zA-Z_0-9].
\s matches whitespace such as space, tab, carriage return, line feed, and form feed; it includes [ \t\f\n\r].
\S matches any character that is not whitespace.
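A quick way to see what each escape picks up (written with its leading backslash, as in \d) is to run them over one string. This is a small sketch with a made-up sample string.

```php
<?php
// Hypothetical sample string: the "\t" inside double quotes is a real tab.
$s = "id_42 ok\t!";

preg_match_all('/\d/', $s, $digits);    // each digit
preg_match_all('/\w+/', $s, $words);    // runs of letters, digits, underscores
preg_match_all('/\W+/', $s, $symbols);  // runs of everything else
preg_match_all('/\s/', $s, $spaces);    // each whitespace character

print_r($digits[0]);
print_r($words[0]);
echo count($spaces[0]), " whitespace characters\n";
```

The uppercase forms are simply the complements of the lowercase ones, so \w+ and \W+ between them cover the whole string.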

Metacharacters

* Matches the preceding item zero or more times
. Matches any single character except a line break

+ Matches the preceding item one or more times

? Matches the preceding item zero or one time

| Alternation, similar to | (or) in PHP. It has the lowest precedence of the operators, so everything on each side of it is treated as one whole alternative


^ Matches the start of the string
$ Matches the end of the string
\b Matches a word boundary, such as the position next to a space or a special character
\B Matches any position that is not a word boundary
{m} Matches the preceding item exactly m times
{m,} Matches the preceding item m or more times
{m,n} Matches the preceding item between m and n times
( ) Groups a sub-pattern as one unit and stores what it matched; the stored values can be retrieved in order as \1, \2, ...

Example:

<?php
$mode = "#test#"; // Any of the atoms above can be used in the pattern.
$str = "sdfsstestdf";

// $mode is the pattern, $str is the subject, $end receives the result as an array.
if (preg_match($mode, $str, $end)) {
    echo "Match succeeded: " . $end[0];
} else {
    echo "Match failed";
}
?>
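The metacharacters listed above can be checked one at a time. The following is a rough sketch with made-up test strings; each row pairs a pattern with a subject and the result one would expect from the descriptions.

```php
<?php
// Hypothetical test rows: [pattern, subject, expected result].
$checks = [
    ['/^ab*c$/',      'ac',     true],   // * allows zero b's
    ['/^ab+c$/',      'ac',     false],  // + needs at least one b
    ['/^ab?c$/',      'abbc',   false],  // ? allows at most one b
    ['/^a.c$/',       'a-c',    true],   // . is any single character
    ['/^(cat|dog)$/', 'dog',    true],   // | picks one alternative
    ['/\bcat\b/',     'concat', false],  // \b needs a word boundary before "cat"
    ['/^\d{2,3}$/',   '1234',   false],  // {m,n} bounds the repeat count
];

$results = [];
foreach ($checks as [$pattern, $subject, $expected]) {
    $ok = preg_match($pattern, $subject) === 1;
    $results[] = ($ok === $expected);
    echo $pattern, ' vs ', $subject, ': ', $ok ? 'match' : 'no match', "\n";
}

// ( ) stores what it matched: $m[1] and $m[2] hold the two groups,
// and \1 inside a pattern refers back to the first group.
preg_match('/^(\w+)-(\d+)$/', 'item-42', $m);
$repeated = preg_match('/^(\w)\1$/', 'aa');
```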

Commonly used regular expressions

* 1. ^[a-zA-Z]+$ cannot be empty, cannot contain spaces, English letters only
* 2. \S{6,} cannot be empty, at least six characters
* 3. ^\d+$ digits only, no spaces or non-digits
* 4. (.*)(\.jpg|\.bmp)$ only jpg and bmp formats
* 5. ^\d{4}-\d{1,2}-\d{1,2}$ only dates in the 2004-10-22 format
* 6. ^0$ select at least one item
* 7. ^0{2,}$ select at least two items
* 8. ^[\s\S]{20,}$ cannot be empty, at least 20 characters
* 9. ^\+?[a-z0-9](([-+.]|[_]+)?[a-z0-9]+)*@([a-z0-9]+(\.|\-))+[a-z]{2,6}$ email
* 10. \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*([,;]\s*\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*)* multiple email addresses, separated by commas or semicolons
* 11. ^(\([0-9]+\))?[0-9]{7,8}$ phone number of 7 or 8 digits, optionally preceded by an area code in parentheses, such as (022)87341628
* 12. ^[a-zA-Z0-9_]+@[a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)+(,[a-zA-Z0-9_]+@[a-zA-Z0-9_]+(\.[a-zA-Z0-9_]+)+)*$ email: letters, digits, and underscores only; each address must contain @ and . in a well-formed position; several addresses may be separated by commas
* 13. ^\w+@\w+(\.\w+)+(,\w+@\w+(\.\w+)+)*$ the same as 12, written more concisely
* 14. ^\w+((-\w+)|(\.\w+))*@\w+((\.|-)\w+)*\.\w+$ email
Regular expression to match Chinese characters: [\u4e00-\u9fa5] (JavaScript notation; with PHP's preg_ functions write [\x{4e00}-\x{9fa5}] and add the u modifier)
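As a rough illustration of how a few entries from this list could be used from PHP, the sketch below wraps preg_match in a small helper. The helper name and the test values are made up for this example, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: true when $value matches $pattern.
function matches(string $pattern, string $value): bool
{
    return preg_match($pattern, $value) === 1;
}

$image = '/(.*)(\.jpg|\.bmp)$/';         // item 4
$date  = '/^\d{4}-\d{1,2}-\d{1,2}$/';    // item 5
$phone = '/^(\([0-9]+\))?[0-9]{7,8}$/';  // item 11
$han   = '/^[\x{4e00}-\x{9fa5}]+$/u';    // Chinese characters, u modifier for UTF-8

var_dump(matches($image, 'photo.jpg'));
var_dump(matches($date, '2004-10-22'));
var_dump(matches($phone, '(022)87341628'));
var_dump(matches($han, '正则'));
```

Note that the date pattern only checks the shape of the input; a value such as 2004-99-99 would still pass, so a real date check needs an extra step.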

Match specific numbers:

^[1-9]\d*$ // Match positive integers
^-[1-9]\d*$ // Match negative integers
^-?[1-9]\d*$ // Match nonzero integers
^([1-9]\d*|0)$ // Match non-negative integers (positive integers + 0)
^(-[1-9]\d*|0)$ // Match non-positive integers (negative integers + 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // Match positive floating point numbers
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // Match negative floating point numbers
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // Match floating point numbers
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // Match non-negative floating point numbers (positive floating point numbers + 0)
^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$ // Match non-positive floating point numbers (negative floating point numbers + 0)
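Because | has the lowest precedence, ^ and $ attach only to the alternative they touch unless the alternation is wrapped in parentheses. The sketch below, with made-up inputs, shows the difference for the non-negative integer pattern and then tries the grouped floating point pattern on a few values.

```php
<?php
// Hypothetical inputs for illustration.
$ungrouped = '/^[1-9]\d*|0$/';    // ^ covers the left branch only, $ the right branch only
$grouped   = '/^([1-9]\d*|0)$/';  // both anchors cover the whole alternation

var_dump(preg_match($ungrouped, '12abc')); // the left branch accepts the "12" prefix
var_dump(preg_match($grouped, '12abc'));   // rejected, the string is not a whole number

// Grouped form of the general floating point pattern.
$float = '/^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$/';
$isFloat = [];
foreach (['3.14', '-0.5', '0.0', 'abc'] as $candidate) {
    $isFloat[$candidate] = preg_match($float, $candidate) === 1;
}
print_r($isFloat);
```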

The username consists of letters a-z (case insensitive), digits 0-9, minus signs, or underscores. It can only start and end with a digit or letter, and it must be 4 to 18 characters long.

^[a-zA-Z0-9][a-zA-Z0-9_-]{2,16}[a-zA-Z0-9]$

The username consists of uppercase letters, lowercase letters, digits, or underscores, starts with a letter, and is 6 to 20 characters long.

^[a-zA-Z]\w{5,19}$
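The two username rules above could be applied from PHP along these lines. This is a sketch under a few assumptions: the patterns are written with explicit a-zA-Z ranges, a {2,16} quantifier, and a closing $ so that the stated length limits are enforced, and the constant and function names are invented for the example.

```php
<?php
// Rule 1: 4 to 18 characters; letters, digits, "-" or "_";
// the first and last character must be a letter or digit.
const USERNAME_4_18 = '/^[a-zA-Z0-9][a-zA-Z0-9_-]{2,16}[a-zA-Z0-9]$/';

// Rule 2: 6 to 20 characters; starts with a letter,
// followed by letters, digits, or underscores.
const USERNAME_6_20 = '/^[a-zA-Z]\w{5,19}$/';

// Hypothetical helper name.
function is_valid_username(string $name, string $rule): bool
{
    return preg_match($rule, $name) === 1;
}

var_dump(is_valid_username('ab-c', USERNAME_4_18));
var_dump(is_valid_username('abc_', USERNAME_4_18));
var_dump(is_valid_username('alice1', USERNAME_6_20));
var_dump(is_valid_username('1alice', USERNAME_6_20));
```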


Username: lowercase English letters, Chinese characters, digits, and underscores. It cannot be all digits, and it cannot end with an underscore.

/^[a-z0-9_\u4e00-\u9fa5]+[^_]$/g

This form uses JavaScript notation: PHP's preg_ functions accept neither the \u escapes nor the g modifier.


Under UTF-8:

preg_match("/^[a-z0-9_\x80-\xff]+[^_]$/", $a);

Under GBK:

preg_match("/^[a-z0-9_" . chr(0xa1) . "-" . chr(0xff) . "]+[^_]$/", $a);
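The patterns above cover the allowed characters and the trailing underscore, but not the "cannot be all digits" part of the rule. One possible way to cover all three conditions is sketched below; it assumes UTF-8 input, uses the u modifier with \x{4e00}-\x{9fa5} for Chinese characters, and the function name is made up for this example.

```php
<?php
// Hypothetical validator; assumes $name is UTF-8 encoded.
function is_valid_cn_username(string $name): bool
{
    // Allowed characters only, and the last character must not be an underscore.
    $shape = '/^[a-z0-9_\x{4e00}-\x{9fa5}]*[a-z0-9\x{4e00}-\x{9fa5}]$/u';
    if (preg_match($shape, $name) !== 1) {
        return false;
    }
    // Reject names made up of digits only.
    return preg_match('/^\d+$/', $name) !== 1;
}

var_dump(is_valid_cn_username('张三_01'));
var_dump(is_valid_cn_username('abc_'));
var_dump(is_valid_cn_username('12345'));
```

Splitting the rule into two checks keeps each pattern short; a single pattern with a lookahead would also work.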

<?php
function is_email($email)
{
    // Longer than 6 characters and shaped like name@host.tld.
    return strlen($email) > 6 && preg_match("/^[\w.-]+@[\w-]+(\.\w+)+$/", $email);
}
?>
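A short usage sketch for the email check follows. It repeats the function so the block runs on its own, writes the hyphen last inside each character class so that it is read as a literal, and uses made-up sample addresses.

```php
<?php
function is_email($email)
{
    // Longer than 6 characters and shaped like name@host.tld.
    return strlen($email) > 6
        && preg_match('/^[\w.-]+@[\w-]+(\.\w+)+$/', $email) === 1;
}

// Hypothetical sample inputs.
var_dump(is_email('user@example.com'));
var_dump(is_email('a@b.c'));           // too short for the length check
var_dump(is_email('not an address'));
```

A pattern like this only checks the general shape of an address; it does not confirm that the mailbox exists.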

URL address


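<?php
// eregi_replace() was removed in PHP 7.0.0, so preg_replace() with the i modifier is used instead.
// The link markup in the replacement strings did not survive in the source; a plain <a href> is assumed.
function autolink($foo)
{
    $chars = '[-a-zA-Z0-9@:%_/+.~#?&=]+';
    // Link anything that starts with http:// or ftp://.
    $foo = preg_replace('!((f|ht)tp://' . $chars . ')!i', '<a href="$1">$1</a>', $foo);
    if (strpos($foo, "http") === false) {
        // No http links present: link every bare www. address.
        $foo = preg_replace('!(www\.' . $chars . ')!i', '<a href="http://$1">$1</a>', $foo);
    } else {
        // Only link www. addresses that follow whitespace or a bracket, so the
        // www. inside an already linked http:// URL is left alone.
        $foo = preg_replace('!([\s()\[{}])(www\.' . $chars . ')!i', '$1<a href="http://$2">$2</a>', $foo);
    }
    return $foo;
}
?>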

http://www.bkjia.com/PHPjc/632636.html
