Match the string "abc 123" and extract abc and 123.
<?php
$str = "abc 123";
$preg = "/^(.*?)\s+(.*?)$/";
$preg1 = "/^(.*?)\s*(.*?)$/";
preg_match($preg, $str, $tmp);
preg_match($preg1, $str, $tmp1);
echo '<pre>';
print_r($tmp);
print_r($tmp1);
echo '</pre>';
// $tmp
Array
(
    [0] => abc 123
    [1] => abc
    [2] => 123
)
// $tmp1
Array
(
    [0] => abc 123
    [1] =>
    [2] => abc 123
)
Why are the matching results different? Is there anything I need to pay attention to?
Neither of the first two replies answers the question directly, so let me try.
The real secret lies in the rules of lazy (also called non-greedy) matching: an asterisk or plus sign followed by a question mark indicates lazy matching, which means matching as few characters as possible.
In /(.*?)\s+/, the plus sign means that the preceding item (the whitespace \s) appears one or more times. This part therefore means: match as few characters as possible, followed by at least one whitespace character. Seen this way, the first group has to grow until it reaches the space, so it matches abc.
In /(.*?)\s*/, the asterisk means that the preceding item (the whitespace \s) appears zero or more times. This part therefore means: match as few characters as possible, with possibly nothing after it (\s* can match zero characters). The first group ends up as an empty string, matching nothing, and the second (.*?) is then forced by the trailing $ to take the whole of abc 123.
Note that \s+ and \s* can give quite different results.
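To see what "as little as possible" means in practice, here is a small sketch that runs the same string through three patterns. The greedy variant and the variable names are my own additions for comparison, not part of the original question.

```php
<?php
$str = "abc 123";

// Lazy first group + optional whitespace: group 1 can stop at zero
// characters because \s* is allowed to match nothing.
preg_match('/^(.*?)\s*(.*?)$/', $str, $lazyOptional);

// Lazy first group + required whitespace: group 1 must keep growing
// until a whitespace character can follow it.
preg_match('/^(.*?)\s+(.*?)$/', $str, $lazyRequired);

// Greedy first group + optional whitespace: group 1 takes everything
// and leaves nothing for group 2.
preg_match('/^(.*)\s*(.*)$/', $str, $greedyOptional);

var_dump($lazyOptional[1], $lazyOptional[2]);     // "" and "abc 123"
var_dump($lazyRequired[1], $lazyRequired[2]);     // "abc" and "123"
var_dump($greedyOptional[1], $greedyOptional[2]); // "abc 123" and ""
```

Only the pattern with \s+ gives the first group a reason to grow, which is why it is the one that splits the string as intended.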
"*"
Matches the preceding subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}.
"+"
Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
/^(.*?)\s+(.*?)$/
The "/" at the front and the "/" at the back are delimiters that mark the beginning and end of the pattern; they have no meaning in the match itself.
The first "^" matches the beginning of the text.
() groups a subexpression and captures it; groups are numbered from left to right. "." matches any character (except a newline by default), and "*" means zero or more times.
"?" placed after "*" makes the match non-greedy. On its own, "?" means matching the preceding character or subexpression zero or one time.
\s is any whitespace character.
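The definitions of the individual pieces can be checked directly. This is only a quick sketch using the "zo" example from the replies; the array and variable names are mine.

```php
<?php
// "*" allows zero or more of the preceding "o"; "+" requires at least one.
$subjects = ["z", "zo", "zoo"];
$star = [];
$plus = [];
foreach ($subjects as $s) {
    $star[$s] = preg_match('/^zo*$/', $s); // 1 on match, 0 otherwise
    $plus[$s] = preg_match('/^zo+$/', $s);
}
print_r($star); // z => 1, zo => 1, zoo => 1
print_r($plus); // z => 0, zo => 1, zoo => 1

// \s matches any whitespace character, for example a space or a tab.
$whitespaceOnly = preg_match('/^\s+$/', " \t");
var_dump($whitespaceOnly); // int(1)
```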
Here is a tutorial worth reading: 30-minute introduction to regular expressions
The answer above is correct: this is caused by lazy matching, which is also one of the harder parts of regular expressions.