Basic concepts
A regular expression is a text pattern that includes both ordinary characters (for example, the letters between a and z) and special characters (called "metacharacters"). A pattern describes one or more strings to match when searching for text.
First of all, here are several recommended regular expression editors:
Debuggex: https://www.debuggex.com/
PyRegex: http://www.pyregex.com/
Regexper: http://www.regexper.com/
Regular expressions are a tool for searching and replacing strings, and they are widely used in text editors. For example, regular expressions are used to:
Check whether the text contains a specified keyword
Find the position of matching keywords in the text
Extract information from text, such as a substring of a string
Modify text
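The four uses listed above can be sketched in JavaScript roughly as follows. The sample text and the order-ID pattern are made up for illustration; only the standard `RegExp` and `String` methods are assumed.

```javascript
// Sample text and pattern are hypothetical, chosen only to illustrate the four tasks.
const text = "Order A-1024 shipped; order B-2048 pending.";
const orderId = /[A-Z]-\d+/;

// 1. Check whether the text contains a match.
const hasOrder = orderId.test(text);

// 2. Find the position of the first match (-1 when there is none).
const firstIndex = text.search(orderId);

// 3. Extract the matching substrings (the g flag returns all of them).
const allIds = text.match(/[A-Z]-\d+/g);

// 4. Modify the text by replacing every match.
const masked = text.replace(/[A-Z]-\d+/g, "<id>");

console.log(hasOrder);   // true
console.log(firstIndex); // 6
console.log(allIds);     // [ 'A-1024', 'B-2048' ]
console.log(masked);     // Order <id> shipped; order <id> pending.
```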
Description: Regular expressions are usually used for two tasks: 1. verification, 2. search/replace. When used for verification, it is usually necessary to add ^ at the start and $ at the end so that the pattern must match the entire string being verified; whether to add these anchors when searching/replacing depends on the search requirements. In some cases it may be more appropriate to add \b before and after the pattern instead of ^ and $. Apart from a few exceptions, the commonly used regular expressions listed in this article have no such anchors before or after them; add them yourself according to your needs.
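As a rough illustration of the difference between verification and searching, the sketch below wraps one pattern in three ways. The six-digit pattern and the variable names are assumptions made for this example.

```javascript
// Hypothetical base pattern: exactly six digits.
const digits = "\\d{6}";

const anywhere = new RegExp(digits);                  // search: may match inside longer text
const wholeString = new RegExp("^" + digits + "$");   // verification: the whole string must match
const wholeWord = new RegExp("\\b" + digits + "\\b"); // search limited by word boundaries

console.log(anywhere.test("id 1234567"));         // true  (finds 123456 inside the longer number)
console.log(wholeString.test("1234567"));         // false (seven digits)
console.log(wholeString.test("123456"));          // true
console.log(wholeWord.test("code 123456 ok"));    // true
console.log(wholeWord.test("code 1234567 ok"));   // false (no boundary inside the digit run)
```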
Priority order
After a regular expression is constructed, it is evaluated much like an arithmetic expression: from left to right, following an order of precedence. The following table lists the regular expression operators from highest to lowest precedence:
Operator | Description |
---|---|
\ | Escape character |
(), (?:), (?=), [] | Parentheses and brackets |
*, +, ?, {n}, {n,}, {n,m} | Quantifiers |
^, $, \anymetacharacter | Anchors and sequences |
\| | Alternation ("or") |
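Precedence has practical consequences. Quantifiers bind more tightly than sequences, and alternation (|) binds more loosely than anchors, so an ungrouped alternation splits the whole pattern in two. The following is a minimal sketch with patterns chosen for illustration:

```javascript
// Alternation has the lowest precedence: this reads as (^[1-9]\d*) OR (0$).
const loose = /^[1-9]\d*|0$/;
// Grouping makes both anchors apply to every branch.
const strict = /^([1-9]\d*|0)$/;

console.log(loose.test("12abc"));  // true  (first branch matches "12" at the start)
console.log(strict.test("12abc")); // false
console.log(strict.test("120"));   // true
console.log(strict.test("0"));     // true

// A quantifier applies to the item just before it: ab+ means a(b+), not (ab)+.
console.log(/^ab+$/.test("abb"));    // true
console.log(/^ab+$/.test("abab"));   // false
console.log(/^(ab)+$/.test("abab")); // true
```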
Creating a regular expression
Constructing a regular expression is much like constructing an arithmetic expression: a variety of metacharacters and operators are used to combine small expressions into larger ones.
A regular expression can be constructed by placing the various components of the expression pattern between a pair of delimiters.
For JScript, the delimiter is a pair of forward slash (/) characters. For example:
/expression/
For VBScript, a pair of quotation marks ("") is used to determine the boundaries of the regular expression. For example:
"expression"
Let's look at a JavaScript example:
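// Account name: a letter followed by 5 to 19 letters, digits, or underscores.
var re = new RegExp("^[a-zA-Z][a-zA-Z0-9_]{5,19}$");
var input = "user_01"; // the string to validate
if (re.test(input)) {
    alert("Correct format");
} else {
    alert("Format error");
}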
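In JavaScript the same pattern can be written either with the slash delimiters or as a string passed to the `RegExp` constructor. The sketch below compares the two forms; the digit pattern is an extra example added here to show that backslashes must be doubled inside a string.

```javascript
// The account-name pattern written two ways.
const fromConstructor = new RegExp("^[a-zA-Z][a-zA-Z0-9_]{5,19}$");
const fromLiteral = /^[a-zA-Z][a-zA-Z0-9_]{5,19}$/;

// Inside a string, each backslash of the pattern has to be doubled.
const digitsFromString = new RegExp("^\\d+$");
const digitsFromLiteral = /^\d+$/;

console.log(fromConstructor.source === fromLiteral.source);        // true
console.log(digitsFromString.source === digitsFromLiteral.source); // true
console.log(digitsFromString.test("2024"));                        // true
```

The literal form is usually easier to read; the constructor is useful when the pattern is assembled at run time.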
The components of a regular expression can be a single character, a collection of characters, a range of characters, a selection between characters, or any combination of all these components.
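Each kind of component mentioned above can be shown on its own and then combined. The patterns and test strings below are invented for illustration.

```javascript
const single = /a/;       // a single character
const set = /[aeiou]/;    // a collection of characters
const range = /[0-9]/;    // a range of characters
const choice = /cat|dog/; // a selection between alternatives

// A combination: "cat" or "dog", two digits, then an optional vowel.
const combined = /^(cat|dog)[0-9]{2}[aeiou]?$/;

console.log(single.test("banana"));   // true
console.log(set.test("rhythm"));      // false
console.log(range.test("room 7"));    // true
console.log(choice.test("hotdog"));   // true
console.log(combined.test("cat42"));  // true
console.log(combined.test("dog07e")); // true
console.log(combined.test("cow42"));  // false
console.log(combined.test("cat4"));   // false
```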
Commonly used regular expressions
Regular expression matching Chinese characters: [\u4e00-\u9fa5]
Comment: Matching Chinese text is a real headache. This expression makes it easier.
Match double-byte characters (including Chinese characters): [^\x00-\xff]
Comment: Can be used to calculate the length of a string (a double-byte character counts as 2, and an ASCII character counts as 1)
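One possible way to do this length calculation is to replace every character outside the range \x00-\xff with two ASCII placeholders and then measure the result. The function name `displayLength` is hypothetical.

```javascript
// Counts characters outside \x00-\xff as 2 and everything else as 1.
function displayLength(str) {
  // Each match is swapped for two placeholder characters before measuring.
  return str.replace(/[^\x00-\xff]/g, "aa").length;
}

console.log(displayLength("abc"));     // 3
console.log(displayLength("中文abc")); // 7
console.log(/[\u4e00-\u9fa5]/.test("中文abc")); // true (contains Chinese characters)
```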
Regular expression matching blank lines: \n\s*\r
Comment: Can be used to delete blank lines
Regular expression matching HTML tags: <(\S*?)[^>]*>.*?</\1>|<.*? />
Comment: The versions circulating on the Internet are poor. The one above matches only some cases and still cannot handle complex nested tags.
Regular expression matching leading and trailing whitespace characters: ^\s*|\s*$
Comment: Can be used to delete whitespace characters (including spaces, tabs, form feeds, etc.) at the beginning and end of a line. A very useful expression.
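A minimal sketch of trimming with this expression follows. The helper name `trimBoth` is made up; the g flag is needed so that both the leading and the trailing branch are replaced.

```javascript
// Removes whitespace at both ends using the pattern ^\s*|\s*$.
function trimBoth(str) {
  return str.replace(/^\s*|\s*$/g, "");
}

console.log(JSON.stringify(trimBoth("  \thello world \n"))); // "hello world"
```

Modern JavaScript also provides the built-in `String.prototype.trim`, which gives the same result for this input.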
Regular expression matching email addresses: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Comment: Very useful for form validation
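For form validation the expression needs ^ and $ around it so that the whole field is checked. The sketch below assumes a hypothetical helper named `isEmailLike`; it checks only the shape of the address, not whether the mailbox exists.

```javascript
// The email pattern anchored at both ends for whole-field validation.
const emailPattern = /^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$/;

function isEmailLike(value) {
  return emailPattern.test(value);
}

console.log(isEmailLike("first.last+tag@example.co.uk")); // true
console.log(isEmailLike("no-at-sign.example.com"));       // false
console.log(isEmailLike("a@b"));                          // false (no dot after the @ part)
```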
Regular expression matching URLs: [a-zA-Z]+://[^\s]*
Comment: The versions circulating on the Internet have very limited functionality. The one above basically meets the need.
Match a valid account name (starts with a letter, 5 to 16 characters long, letters, digits, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$
Comment: Very useful for form validation
Match domestic (Chinese) phone numbers: \d{3}-\d{8}|\d{4}-\d{7}
Comment: Matches formats such as 0511-4405222 or 021-87888822
Match Tencent QQ numbers: [1-9][0-9]{4,}
Comment: Tencent QQ numbers start from 10000
Match Chinese postal codes: [1-9]\d{5}(?!\d)
Comment: China's postal code is a 6-digit number
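The negative lookahead (?!\d) in the postal code pattern rejects a six-digit run that is followed by another digit. A small sketch, with a hypothetical helper `findPostalCode`, shows what this does and does not prevent:

```javascript
const postalCode = /[1-9]\d{5}(?!\d)/;

// Returns the first postal-code-like match, or null when there is none.
function findPostalCode(text) {
  const m = postalCode.exec(text);
  return m ? m[0] : null;
}

console.log(findPostalCode("Beijing 100084, China")); // 100084
console.log(findPostalCode("1000845"));               // null
console.log(findPostalCode("1234567"));               // 234567
```

As the last call shows, the lookahead only guards the right-hand side: the tail of a longer number can still match, so surrounding the pattern with ^ and $ or \b is advisable for validation.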
Match ID card numbers: \d{15}|\d{18}
Comment: A Chinese ID card number has 15 or 18 digits
Match IP addresses: \d+\.\d+\.\d+\.\d+
Comment: Useful when extracting IP addresses
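The IP pattern accepts any four dot-separated digit runs, so an extraction step may want a numeric range check afterwards. The following sketch assumes a hypothetical function `extractIPv4` and adds that check in plain code rather than in the expression.

```javascript
const ipLike = /\d+\.\d+\.\d+\.\d+/g;

// Finds IP-shaped substrings, then keeps only those whose parts are all 0-255.
function extractIPv4(text) {
  const candidates = text.match(ipLike) || [];
  return candidates.filter((candidate) =>
    candidate.split(".").every((part) => Number(part) <= 255)
  );
}

console.log(extractIPv4("from 192.168.0.1 to 10.0.0.256, via 8.8.8.8"));
// [ '192.168.0.1', '8.8.8.8' ]
```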
Match specific numbers
^[1-9]\d*$ // Match positive integers
^-[1-9]\d*$ // Match negative integers
^-?[1-9]\d*$ // Match integers (excluding 0)
^([1-9]\d*|0)$ // Match non-negative integers (positive integers + 0)
^(-[1-9]\d*|0)$ // Match non-positive integers (negative integers + 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // Match positive floating point numbers
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // Match negative floating point numbers
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // Match floating point numbers
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // Match non-negative floating point numbers (positive floating point numbers + 0)
^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$ // Match non-positive floating point numbers (negative floating point numbers + 0)
Comment: Useful when processing large amounts of data; adjust them as needed for your specific application. Alternatives are wrapped in a group so that ^ and $ apply to every branch.
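A quick way to see how these number patterns behave is to run a few strings through several of them. The object `numberPatterns` and the function `classify` are hypothetical names for this sketch, and the alternations are written with an enclosing group so that the anchors cover every branch.

```javascript
const numberPatterns = {
  positiveInteger: /^[1-9]\d*$/,
  nonNegativeInteger: /^([1-9]\d*|0)$/,
  float: /^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$/,
};

// Returns the names of all patterns that accept the given string.
function classify(value) {
  return Object.keys(numberPatterns).filter((name) =>
    numberPatterns[name].test(value)
  );
}

console.log(classify("42"));    // [ 'positiveInteger', 'nonNegativeInteger' ]
console.log(classify("0"));     // [ 'nonNegativeInteger', 'float' ]
console.log(classify("-3.14")); // [ 'float' ]
console.log(classify("12abc")); // []
```

Note that the floating point pattern requires a decimal point for every value except 0, which is why "42" is not reported as a float.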
Match specific string
^[A-Za-z]+$ // Match a string consisting of the 26 English letters
^[A-Z]+$ // Match a string consisting of the 26 uppercase English letters
^[a-z]+$ // Match a string consisting of the 26 lowercase English letters
^[A-Za-z0-9]+$ // Match a string consisting of digits and the 26 English letters
^\w+$ // Match a string consisting of digits, the 26 English letters, or underscores
Comment: Some of the most basic and commonly used expressions