What is a character cluster?
In Internet applications, regular expressions are commonly used to validate user input. When a user submits a form, matching against ordinary literal characters is not enough to determine whether the entered phone number, address, email address, credit card number, and so on are valid.
So we need a more flexible way to describe the pattern we want: character clusters. To create a cluster representing all vowel characters, place all the vowels in square brackets:
[AaEeIiOoUu]
This pattern matches any vowel, but it represents only a single character. Use a hyphen to represent a range of characters, such as:
[a-z] //Match any lowercase letter
[A-Z] //Match any uppercase letter
[a-zA-Z] //Match any letter
[0-9] //Match any digit
[0-9\.\-] //Match any digit, period or minus sign
[ \f\r\t\n] //Match any whitespace character
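The clusters above can be tried out with a short sketch. Python is used here only for illustration; the dictionary labels and the helper name matches_one are hypothetical and not part of the original article. re.fullmatch requires the whole string to match, which makes it easy to see that a cluster stands for exactly one character.

```python
import re

# Clusters taken from the list above; the keys are illustrative labels only.
CLUSTERS = {
    "vowel": r"[AaEeIiOoUu]",
    "lower": r"[a-z]",
    "upper": r"[A-Z]",
    "letter": r"[a-zA-Z]",
    "digit": r"[0-9]",
    "number_char": r"[0-9\.\-]",
    "whitespace": r"[ \f\r\t\n]",
}


def matches_one(label: str, text: str) -> bool:
    """Return True if the whole of `text` is matched by the cluster `label`."""
    return re.fullmatch(CLUSTERS[label], text) is not None


print(matches_one("vowel", "e"))        # True
print(matches_one("vowel", "ea"))       # False: a cluster matches one character only
print(matches_one("number_char", "-"))  # True
print(matches_one("whitespace", "\t"))  # True
```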
Again, each of these represents only one character, which is very important. If you want to match a string consisting of a lowercase letter and a digit, such as "z2", "t6" or "g7", but not "ab2", "r2d3" or "b52", use this pattern:
^[a-z][0-9]$
Although [a-z] represents a range of 26 letters, here it matches only a single character: the first character of the string must be a lowercase letter.
It was mentioned earlier that ^ represents the beginning of a string, but it also has another meaning. When ^ is the first character inside a pair of square brackets, it means "not" or "exclude" and is often used to rule out certain characters. Using the previous example, we require that the first character cannot be a digit:
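As a quick check of the pattern ^[a-z][0-9]$, the following Python sketch runs it against the strings mentioned above. The helper name is_letter_digit is an assumption made for this example.

```python
import re

# Exactly one lowercase letter followed by exactly one digit.
PATTERN = re.compile(r"^[a-z][0-9]$")


def is_letter_digit(s: str) -> bool:
    """Return True if `s` is one lowercase letter followed by one digit."""
    return PATTERN.match(s) is not None


for s in ["z2", "t6", "g7", "ab2", "r2d3", "b52"]:
    print(s, is_letter_digit(s))
```

Only the first three strings are accepted; the others contain more than one letter or more than one digit, so the anchors ^ and $ reject them.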
^[^0-9][0-9]$
This pattern matches "&5", "g7" and "-2", but not "12" or "66". Here are a few examples of excluding specific characters:
[^a-z] //All characters except lowercase letters
[^\\\/\^] //All characters except backslash (\), slash (/) and caret (^)
[^\"\'] //All characters except double quotes (") and single quotes (')
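The exclusion examples can be sketched in Python as follows. The function names accepted and first_non_lowercase are hypothetical and chosen only for this illustration.

```python
import re

# First character must NOT be a digit; second character must be a digit.
NOT_DIGIT_THEN_DIGIT = re.compile(r"^[^0-9][0-9]$")


def accepted(s: str) -> bool:
    """Return True if `s` is a non-digit followed by a digit."""
    return NOT_DIGIT_THEN_DIGIT.match(s) is not None


def first_non_lowercase(s: str):
    """Return the first character that is not a lowercase letter, or None."""
    m = re.search(r"[^a-z]", s)
    return m.group(0) if m else None


for s in ["&5", "g7", "-2", "12", "66"]:
    print(s, accepted(s))

print(first_non_lowercase("abcXdef"))
print(first_non_lowercase("abc"))
```

Note that an excluding cluster still consumes exactly one character; it only changes which characters are allowed in that position.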
The special character "." (dot, period) is used in regular expressions to represent any character except a newline. So the pattern "^.5$" matches any two-character string that ends with the digit 5 and begins with any non-newline character. Without anchors, the pattern "." matches any string that contains at least one non-newline character; it fails only on the empty string and on strings consisting solely of newlines.
PHP's regular expressions provide some built-in character clusters for common cases. The list is as follows:
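The behaviour of the dot can be checked with a small Python sketch. The helper names dot_then_five and contains_any_char are assumptions of this example, and the default mode is used, in which the dot does not match a newline.

```python
import re


def dot_then_five(s: str) -> bool:
    """Return True if `s` is one non-newline character followed by '5'."""
    return re.match(r"^.5$", s) is not None


def contains_any_char(s: str) -> bool:
    """Return True if the unanchored pattern '.' finds a match somewhere in `s`."""
    return re.search(r".", s) is not None


print(dot_then_five("a5"))        # True
print(dot_then_five("\n5"))       # False: the dot does not match a newline
print(contains_any_char(""))      # False: nothing to match
print(contains_any_char("\n"))    # False: only a newline
print(contains_any_char("a\n"))   # True
```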
Character cluster    Meaning
[[:alpha:]]          Any letter
[[:digit:]]          Any digit
[[:alnum:]]          Any letter or digit
[[:space:]]          Any whitespace character
[[:upper:]]          Any uppercase letter
[[:lower:]]          Any lowercase letter
[[:punct:]]          Any punctuation mark
[[:xdigit:]]         Any hexadecimal digit, equivalent to [0-9a-fA-F]
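For readers who want to experiment outside PHP: Python's re module does not support the [[:name:]] syntax, so the sketch below spells out explicit stand-ins. These are ASCII-only approximations, which is an assumption of this example, and the names POSIX_EQUIVALENTS and in_class are hypothetical.

```python
import re
import string

# ASCII-only stand-ins for the clusters in the table above (an assumption of
# this sketch; locale-aware engines may accept more characters).
POSIX_EQUIVALENTS = {
    "alpha": r"[a-zA-Z]",
    "digit": r"[0-9]",
    "alnum": r"[a-zA-Z0-9]",
    "space": r"[ \t\n\r\f\v]",
    "upper": r"[A-Z]",
    "lower": r"[a-z]",
    # string.punctuation holds the 32 ASCII punctuation characters;
    # re.escape makes them safe to place inside square brackets.
    "punct": "[" + re.escape(string.punctuation) + "]",
    "xdigit": r"[0-9a-fA-F]",
}


def in_class(name: str, ch: str) -> bool:
    """Return True if the single character `ch` belongs to the named cluster."""
    return re.fullmatch(POSIX_EQUIVALENTS[name], ch) is not None


print(in_class("xdigit", "F"))  # True
print(in_class("xdigit", "g"))  # False
print(in_class("punct", "!"))   # True
print(in_class("alnum", "_"))   # False: underscore is punctuation here
```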