How is PHP regular expression matching actually done in practice? A real match rarely involves just a single letter or digit, so how do we handle whole words or groups of digits?
In practice, PHP regular expression matching relies on the built-in general-purpose character classes. What are they?
PHP's built-in general-purpose character classes and their meanings:
<ol class="dp-c"> <li class="alt"><span><span>[[:alpha:]] </span><span class="comment">//any letter </span><span> </span></span></li> <li> <span>[[:digit:]] </span><span class="comment">//any digit </span><span> </span> </li> <li class="alt"> <span>[[:alnum:]] </span><span class="comment">//any letter or digit </span><span> </span> </li> <li> <span>[[:space:]] </span><span class="comment">//any whitespace character </span><span> </span> </li> <li class="alt"> <span>[[:upper:]] </span><span class="comment">//any uppercase letter </span><span> </span> </li> <li> <span>[[:lower:]] </span><span class="comment">//any lowercase letter </span><span> </span> </li> <li class="alt"> <span>[[:punct:]] </span><span class="comment">//any punctuation character </span><span> </span> </li> <li> <span>[[:xdigit:]] </span><span class="comment">//any hexadecimal digit, equivalent to [0-9a-fA-F] </span><span> </span> </li> </ol>
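As a quick check, the classes above can be tried from PHP. The sketch below assumes the PCRE function `preg_match`, which accepts POSIX classes inside a bracket expression; the helper name `matchingClasses` and the `$classes` table are hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical lookup table: class name => anchored pattern for one character.
$classes = [];
foreach (['alpha', 'digit', 'alnum', 'space', 'upper', 'lower', 'punct', 'xdigit'] as $name) {
    $classes[$name] = '/^[[:' . $name . ':]]$/';
}

// Return the names of all classes that match the single character $char.
function matchingClasses(string $char, array $classes): array
{
    $hits = [];
    foreach ($classes as $name => $pattern) {
        if (preg_match($pattern, $char) === 1) {
            $hits[] = $name;
        }
    }
    return $hits;
}

echo implode(', ', matchingClasses('a', $classes)), "\n"; // alpha, alnum, lower, xdigit
echo implode(', ', matchingClasses('7', $classes)), "\n"; // digit, alnum, xdigit
echo implode(', ', matchingClasses('!', $classes)), "\n"; // punct
```

Note that a single character can belong to several classes at once: `a` is a letter, an alphanumeric character, a lowercase letter, and a hexadecimal digit.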
Analysis of PHP regular expression matching:
By now you know how to match a single letter or digit, but more often you will want to match a word or a group of digits. A word consists of several letters, and a number consists of several single digits. Curly braces ({}) following a character or character class specify how many times the preceding item is repeated.
Example patterns and their meanings:
<ol class="dp-c"> <li class="alt"><span><span>^[a-zA-Z_]$ </span><span class="comment">//any single letter or underscore </span><span> </span></span></li> <li> <span>^[[:alpha:]]{3}$ </span><span class="comment">//any three-letter word </span><span> </span> </li> <li class="alt"> <span>^a$ </span><span class="comment">//the letter a </span><span> </span> </li> <li> <span>^a{4}$ </span><span class="comment">//aaaa </span><span> </span> </li> <li class="alt"> <span>^a{2,4}$ </span><span class="comment">//aa, aaa, or aaaa </span><span> </span> </li> <li> <span>^a{1,3}$ </span><span class="comment">//a, aa, or aaa </span><span> </span> </li> <li class="alt"> <span>^a{2,}$ </span><span class="comment">//a string of two or more a's </span><span> </span> </li> <li> <span>^a{2,} </span><span class="comment">//e.g. aardvark and aaab, but not apple </span><span> </span> </li> <li class="alt"> <span>a{2,} </span><span class="comment">//e.g. baad and aaa, but not Nantucket </span><span> </span> </li> <li> <span>\t{2} </span><span class="comment">//two tab characters </span><span> </span> </li> <li class="alt"> <span>.{2} </span><span class="comment">//any two characters </span><span> </span> </li> </ol>
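Several of the listed patterns can be checked directly. The following is a minimal sketch assuming `preg_match` with `/` as the delimiter; `isMatch` is a hypothetical helper, and the two-tab pattern is written with an escaped `\t`.

```php
<?php
// Hypothetical helper: true when $pattern matches somewhere in $subject.
function isMatch(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$cases = [
    ['/^[[:alpha:]]{3}$/', 'cat'],       // match: exactly three letters
    ['/^[[:alpha:]]{3}$/', 'cats'],      // no match: four letters
    ['/^a{2,4}$/',         'aaa'],       // match
    ['/^a{2,4}$/',         'aaaaa'],     // no match: five a's
    ['/^a{2,}/',           'aardvark'],  // match: starts with aa
    ['/^a{2,}/',           'apple'],     // no match: only one leading a
    ['/a{2,}/',            'baad'],      // match: aa anywhere in the string
    ['/a{2,}/',            'Nantucket'], // no match: never two a's in a row
    ['/\t{2}/',            "x\t\ty"],    // match: two consecutive tabs
    ['/.{2}/',             'x'],         // no match: only one character
];

foreach ($cases as [$pattern, $subject]) {
    echo $pattern, ' vs ', json_encode($subject), ': ',
        isMatch($pattern, $subject) ? 'match' : 'no match', "\n";
}
```

The anchors matter as much as the braces: without `^` and `$` the pattern only has to match somewhere inside the string.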
These examples show three different uses of curly braces. A single number, {x}, means "the preceding character or character class appears exactly x times"; a number followed by a comma, {x,}, means "the preceding item appears x or more times"; two comma-separated numbers, {x,y}, mean "the preceding item appears at least x times but no more than y times". We can extend these patterns to more words or numbers:
<ol class="dp-c"> <li class="alt"><span><span>^[a-zA-Z0-9_]{1,}$ </span><span class="comment">//any string of one or more letters, digits, or underscores </span><span> </span></span></li> <li> <span>^[0-9]{1,}$ </span><span class="comment">//any non-negative integer (one or more digits) </span><span> </span> </li> <li class="alt"> <span>^-{0,1}[0-9]{1,}$ </span><span class="comment">//any integer </span><span> </span> </li> <li> <span>^-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$ </span><span class="comment">//any decimal number </span><span> </span> </li> </ol>
The last example is not easy to understand, is it? Read it this way: the string starts (^) with an optional minus sign (-{0,1}), followed by zero or more digits ([0-9]{0,}), an optional decimal point (\.{0,1}, where the backslash is needed because an unescaped . matches any character), zero or more further digits ([0-9]{0,}), and nothing else ($). Simpler notations for these repetitions are introduced next.
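To see why the decimal point has to be escaped, the number patterns can be wrapped in small helpers. This is a sketch assuming `preg_match`; the function names `isInteger`, `isDecimal`, and `isDecimalUnescaped` are hypothetical and exist only for this illustration.

```php
<?php
function isInteger(string $s): bool
{
    return preg_match('/^-{0,1}[0-9]{1,}$/', $s) === 1;
}

function isDecimal(string $s): bool
{
    // \. matches only a literal decimal point.
    return preg_match('/^-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$/', $s) === 1;
}

function isDecimalUnescaped(string $s): bool
{
    // Same pattern with a bare dot, which matches any single character.
    return preg_match('/^-{0,1}[0-9]{0,}.{0,1}[0-9]{0,}$/', $s) === 1;
}

var_dump(isInteger('-42'));            // bool(true)
var_dump(isInteger('4.2'));            // bool(false)
var_dump(isDecimal('-3.14'));          // bool(true)
var_dump(isDecimal('12x34'));          // bool(false)
var_dump(isDecimalUnescaped('12x34')); // bool(true)
```

With the bare dot, `12x34` is accepted because `x` is consumed as "any character"; the escaped form rejects it.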
The special character "?" is equivalent to {0,1}; both mean "zero or one of the preceding item", that is, "the preceding item is optional". So the previous example can be simplified to:
<ol class="dp-c"><li class="alt"><span><span>^-?[0-9]{0,}\.?[0-9]{0,}$ </span></span></li></ol>
The special character "*" is equivalent to {0,}; both mean "zero or more of the preceding item". Finally, the character "+" is equivalent to {1,}, meaning "one or more of the preceding item", so the four examples above can be written as:
<ol class="dp-c"> <li class="alt"><span><span>^[a-zA-Z0-9_]+$ </span></span></li> <li> <span class="comment">//any string of one or more letters, digits, or underscores </span><span> </span> </li> <li class="alt"> <span>^[0-9]+$ </span><span class="comment">//any non-negative integer (one or more digits) </span><span> </span> </li> <li> <span>^-?[0-9]+$ </span><span class="comment">//any integer </span><span> </span> </li> <li class="alt"> <span>^-?[0-9]*\.?[0-9]*$ </span><span class="comment">//any decimal number </span><span> </span> </li> </ol>
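The equivalence of the two spellings can be checked mechanically. The sketch below assumes `preg_match` and compares each brace form with its shorthand on a handful of sample strings; `agreeOnAll`, `$pairs`, and `$samples` are hypothetical names chosen for this example.

```php
<?php
// Brace form => shorthand form of the same pattern.
$pairs = [
    '/^[a-zA-Z0-9_]{1,}$/'                => '/^[a-zA-Z0-9_]+$/',
    '/^[0-9]{1,}$/'                       => '/^[0-9]+$/',
    '/^-{0,1}[0-9]{1,}$/'                 => '/^-?[0-9]+$/',
    '/^-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$/' => '/^-?[0-9]*\.?[0-9]*$/',
];

$samples = ['abc_123', '42', '-42', '3.14', '-.5', 'a b', '', '-', '.'];

// Hypothetical check: do both spellings agree on every sample?
function agreeOnAll(array $pairs, array $samples): bool
{
    foreach ($pairs as $braces => $shorthand) {
        foreach ($samples as $s) {
            if (preg_match($braces, $s) !== preg_match($shorthand, $s)) {
                return false;
            }
        }
    }
    return true;
}

var_dump(agreeOnAll($pairs, $samples)); // bool(true)

// Because every part of the decimal pattern is optional, it also accepts
// strings that contain no digits at all.
var_dump(preg_match('/^-?[0-9]*\.?[0-9]*$/', ''));  // int(1)
var_dump(preg_match('/^-?[0-9]*\.?[0-9]*$/', '-')); // int(1)
```

If empty input or a lone minus sign should be rejected, the decimal pattern would need to require at least one digit.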
Of course, this does not technically reduce the complexity of the regular expressions, but it does make them easier to read.
That concludes this introduction to PHP regular expression matching in practice. I hope it helps you understand and learn how the matching works.