*? Here * means 0 or more, and the ? that follows makes the quantifier lazy (it matches as few characters as possible), so *? still allows zero characters.
(\S*?) The tag name must be longer than 0 characters, so *? cannot be used here.
, so they cannot be forced to close.
abc') // ["
abc", "div"]
This expression is still imperfect: the second test statement, for example, matches only because the expression was written to extract tags that contain text content. For a strict match, it can be modified again:
var rtag = /^<([a-z]+)\s*\/?>(?:<\/\1>)?$/i // Remove the middle .*
This regex is only suitable for matching and extracting simple tags; it cannot match nested tags.
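As a quick check of the strict form, the sketch below assumes the pattern reads /^<([a-z]+)\s*\/?>(?:<\/\1>)?$/i (reconstructed here, since the original listing is damaged) and runs it against a few hypothetical inputs.

```javascript
// Assumed reconstruction of the strict single-tag pattern.
// \1 refers back to the tag name captured by ([a-z]+).
var rtag = /^<([a-z]+)\s*\/?>(?:<\/\1>)?$/i;

function tagName( html ) {
    // Returns the captured tag name, or null when the input is not a single tag.
    var match = rtag.exec( html );
    return match ? match[ 1 ] : null;
}

console.log( tagName( '<div></div>' ) );    // "div"
console.log( tagName( '<br />' ) );         // "br"
console.log( tagName( '<div>abc</div>' ) ); // null, text content is no longer allowed
console.log( tagName( '<div></span>' ) );   // null, the closing tag must repeat the name
```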
Regular expression matching leading and trailing whitespace
Let's start with the version circulating on the Internet:
^\s*|\s*$
It deletes the whitespace at the beginning and end of a string, for example:
' \t \n\r abc \t \n\r '.replace( /^\s*|\s*$/g, '' ) // abc
But because \s* also matches the empty string, it cannot tell whether the string actually has whitespace at the beginning or end, for example:
/^\s*|\s*$/.test( 'abc' ) // true
Amended as follows:
^\s+|\s+$
' \t \n\r abc \t \n\r '.replace( /^\s+|\s+$/g, '' ) // abc
/^\s+|\s+$/.test( 'abc' ) // false
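The amended pattern can be wrapped in a small helper. This is a minimal sketch; the names rhasEdgeSpace, rtrimBoth and trimBoth are made up for illustration. Two regex objects are used because a pattern with the g flag keeps state in lastIndex between test() calls.

```javascript
// Non-global pattern for testing: a /g regex remembers lastIndex between
// test() calls, which makes repeated tests on different strings unreliable.
var rhasEdgeSpace = /^\s+|\s+$/;
// Global pattern for replacing: both ends are removed in a single pass.
var rtrimBoth = /^\s+|\s+$/g;

// Hypothetical helper: returns the text without leading or trailing whitespace.
function trimBoth( text ) {
    return String( text ).replace( rtrimBoth, '' );
}

console.log( JSON.stringify( trimBoth( ' \t \n\r abc \t \n\r ' ) ) ); // "abc"
console.log( rhasEdgeSpace.test( 'abc' ) );  // false
console.log( rhasEdgeSpace.test( ' abc' ) ); // true
```

In environments that provide String.prototype.trim, the built-in method does the same job; the regex form is mainly useful when the edge whitespace has to be detected rather than removed.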
Regular expression matching email address
First, the rules for an email address: local-part@domain
The maximum length of the local-part is 64, the maximum length of the domain is 253, and the maximum total length is 256
The local-part can use any of these ASCII characters:
Uppercase and lowercase English letters a-z, A-Z
Digits 0-9
The characters !#$%&'*+-/=?^_`{|}~
The character . (dot), which cannot be the first or last character and cannot appear twice in a row
However, some mail servers reject email addresses that contain special characters
The domain is limited to the 26 English letters, the 10 digits, and the hyphen -
The hyphen - cannot be the first character
The top-level domain (com, cn, etc.) is 2 to 6 characters long
Let’s first talk about the version circulating on the Internet:
\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
() Inexplicable capturing groups; to group without capturing, use (?:)
@\w+ The domain cannot contain an underscore _
\w+([-.]\w+)* The top-level domain does not comply with the rules
It is corrected as follows:
var remail = /^([\w-_]+(?:\.[\w-_]+)*)@((?:[a-z0-9]+(?:-[a-zA-Z0-9]+)*)+\.[a-z]{2,6})$/i
remail.exec( 'nuysoft@gmail.com' ) // ["nuysoft@gmail.com", "nuysoft", "gmail.com"]
remail.exec( 'nuysoft@gmail.comcomcom' ) // null
remail.exec( 'nuysoft@_gmail.com' ) // null
The revised regex has the following limitations:
It does not support Chinese mailbox names or Chinese domain names. This is a personal preference: I dislike such flashy features.
It does not support special symbols, to avoid addresses that mail servers would reject; they can be added if needed.
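The regex alone does not enforce the length limits listed in the rules (64 for the local-part, 253 for the domain, 256 overall). The sketch below adds those checks in a hypothetical helper named parseEmail. It also assumes a slightly simplified domain pattern that drops the outer (?:...)+ wrapper: the wrapper accepts the same strings, but a repeated group around [a-z0-9]+ can backtrack heavily on long inputs that do not match.

```javascript
// Assumed simplification of the corrected pattern: same accepted strings,
// but without a quantified group wrapped around another quantifier.
var remailSafe = /^([\w-]+(?:\.[\w-]+)*)@([a-z0-9]+(?:-[a-z0-9]+)*\.[a-z]{2,6})$/i;

// Hypothetical helper: returns { local, domain } or null.
function parseEmail( address ) {
    // Overall length limit taken from the rules above.
    if ( typeof address !== 'string' || address.length > 256 ) return null;
    var match = remailSafe.exec( address );
    if ( !match ) return null;
    // Per-part length limits taken from the rules above.
    if ( match[ 1 ].length > 64 || match[ 2 ].length > 253 ) return null;
    return { local: match[ 1 ], domain: match[ 2 ] };
}

console.log( parseEmail( 'nuysoft@gmail.com' ) );       // { local: 'nuysoft', domain: 'gmail.com' }
console.log( parseEmail( 'nuysoft@gmail.comcomcom' ) ); // null
console.log( parseEmail( 'nuysoft@_gmail.com' ) );      // null
```

Like the pattern it is based on, this sketch accepts only one label before the top-level domain, so an address such as user@mail.example.com is rejected.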
Reference article:
http://en.wikipedia.org/wiki/Email_address
http://baike.baidu.com/view/119298.htm
Regular expression matching URL
Let’s first talk about the version circulating on the Internet:
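[a-zA-z]+://[^\s]*
Rough: the individual parts of the URL are not grouped, and the range A-z also matches the punctuation characters that sit between Z and a in ASCII
The correction is as follows (another version circulating on the Internet):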
var _url = "^((https|http|ftp|rtsp|mms)?://)?"
    + "(([0-9a-z_!~*'().&=+$%-]+: )?[0-9a-z_!~*'().&=+$%-]+@)?" // ftp user@
    + "(([0-9]{1,3}\.){3}[0-9]{1,3}" // URL in IP form - 199.194.52.184
    + "|" // Allow either IP or DOMAIN (domain name)
    + "([0-9a-z_!~*'()-]+\.)*" // Domain name - www.
    + "([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\." // Second-level domain name
    + "[a-z]{2,6})" // first level domain - .com or .museum
    + "(:[0-9]{1,4})?" // port - :80
    + "((/?)|" // a slash isn't required if there is no file name
    + "(/[0-9a-z_!~*'().;?:@&=+$,%#-]+)+/?)$";
var rurl = new RegExp( _url, 'i' );
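Test:
rurl.exec( 'baidu.com' ) // ["baidu.com", undefined, undefined, undefined, undefined, "baidu.com", undefined, "baid", undefined, undefined, "", "", undefined]
rurl.exec( 'http://baidu.com' ) // ["http://baidu.com", "http://", "http", undefined, undefined, "baidu.com", undefined, "baid", undefined, undefined, "", "", undefined]
rurl.exec( 'http://www.baidu.com' ) // ["http://www.baidu.com", "http://", "http", undefined, undefined, "www.baidu.com", undefined, "baid", undefined, undefined, "", "", undefined]
rurl.test( 'baidu' ) // true
It looks usable, but it is not really: the last test accepts 'baidu', which is not a URL. The pattern is assembled from string literals, where \. collapses to a plain . that matches any character, so the dots would have to be written as \\. to be matched literally. Further study is needed (TODO).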
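One likely reason rurl.test( 'baidu' ) returns true is that the dots in the string-built pattern reach the RegExp constructor unescaped. The sketch below assumes the intent was to match literal dots and doubles the backslashes accordingly; the names urlSource and rurlFixed are made up for illustration, and the rest of the pattern is left as it circulated.

```javascript
// Same pieces as the circulating version, but every literal dot is written
// as "\\." so that the RegExp constructor receives an escaped dot.
var urlSource = "^((https|http|ftp|rtsp|mms)?://)?"
    + "(([0-9a-z_!~*'().&=+$%-]+: )?[0-9a-z_!~*'().&=+$%-]+@)?" // ftp user@
    + "(([0-9]{1,3}\\.){3}[0-9]{1,3}"          // IP form, e.g. 199.194.52.184
    + "|"
    + "([0-9a-z_!~*'()-]+\\.)*"                // subdomains, e.g. www.
    + "([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\\."  // second-level domain
    + "[a-z]{2,6})"                            // top-level domain
    + "(:[0-9]{1,4})?"                         // port, e.g. :80
    + "((/?)|"
    + "(/[0-9a-z_!~*'().;?:@&=+$,%#-]+)+/?)$"; // path
var rurlFixed = new RegExp( urlSource, 'i' );

console.log( rurlFixed.test( 'baidu' ) );                // false
console.log( rurlFixed.test( 'baidu.com' ) );            // true
console.log( rurlFixed.test( 'http://www.baidu.com' ) ); // true
console.log( rurlFixed.exec( 'baidu.com' )[ 8 ] );       // "baid"
```

With the dots escaped, "baid" lands in the second-level-domain group (index 8) instead of the subdomain group (index 7). Other quirks of the circulating pattern, such as the port being limited to four digits, are not addressed here.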
Check whether an account name is valid
Let's start with the version circulating on the Internet:
^[a-zA-Z][a-zA-Z0-9_]{4,15}$
(Starts with a letter, 5-16 characters long, letters, digits and underscores allowed)
Requiring the name to start with a letter no longer seems appropriate; consider, for example, the QQ login platform.
Forbidding a leading underscore is unnecessary; Baidu, for example, allows it. So the rule can be simplified.
The correction is as follows:
var ruser = /^\w{4,16}$/
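A pattern without ^ and $ only has to match somewhere inside the string, so it cannot validate a whole account name. The sketch below contrasts the two forms; rloose and ruserStrict are illustrative names, and the sample inputs are made up.

```javascript
var rloose = /\w{4,16}/;        // matches any string that contains 4 word characters in a row
var ruserStrict = /^\w{4,16}$/; // the whole string must be 4 to 16 word characters

console.log( rloose.test( 'ab cdef!' ) );               // true, "cdef" is enough
console.log( ruserStrict.test( 'ab cdef!' ) );          // false
console.log( ruserStrict.test( '_nuysoft' ) );          // true, leading underscore allowed
console.log( ruserStrict.test( 'abc' ) );               // false, too short
console.log( rloose.test( 'abcdefghijklmnopq' ) );      // true, although the name has 17 characters
console.log( ruserStrict.test( 'abcdefghijklmnopq' ) ); // false, too long
```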
Match domestic phone numbers
The version circulating on the Internet is quite usable:
\d{3}-\d{8}|\d{4}-\d{7}
Comment: it matches formats such as 0511-4405222 or 021-87888822
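When the pattern is used to validate a whole input rather than to search text, it needs anchors, and the alternation has to be grouped so that ^ and $ apply to both branches. A minimal sketch, with rphoneExact as an illustrative name:

```javascript
// Without the (?: ) group, ^ would bind only to the first branch
// and $ only to the second.
var rphoneExact = /^(?:\d{3}-\d{8}|\d{4}-\d{7})$/;
var rphoneFind = /\d{3}-\d{8}|\d{4}-\d{7}/; // circulating version, finds a number inside text

console.log( rphoneExact.test( '0511-4405222' ) );  // true
console.log( rphoneExact.test( '021-87888822' ) );  // true
console.log( rphoneExact.test( '021-8788882' ) );   // false, 3-digit area code needs 8 digits
console.log( rphoneFind.test( '10511-44052223' ) ); // true, matches "511-44052223" inside
console.log( rphoneExact.test( '10511-44052223' ) );// false
```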
Match Tencent QQ account
The version circulating on the Internet is quite usable:
[1-9][0-9]{4,}
Comment: Tencent QQ numbers start from 10000
Match Chinese postal code
The version circulating on the Internet is quite usable:
[1-9]\d{5}(?!\d)
Comment: China's postal code is a 6-digit number
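The negative lookahead (?!\d) only guards the end of the match: it stops the pattern from taking the first six digits of a longer number, but nothing guards the start. The sketch below shows the effect and an anchored variant for validation; rpostalExact is an assumed addition, not part of the circulating version.

```javascript
var rpostalFind = /[1-9]\d{5}(?!\d)/; // circulating version, for searching inside text
var rpostalExact = /^[1-9]\d{5}$/;    // assumed anchored variant, for validating a whole input

console.log( rpostalFind.exec( 'Postal code: 100080' )[ 0 ] ); // "100080"
// "123456" is followed by a digit, so the match slides one position to the right.
console.log( rpostalFind.exec( '1234567' )[ 0 ] );             // "234567"
console.log( rpostalExact.test( '100080' ) );                  // true
console.log( rpostalExact.test( '1234567' ) );                 // false
console.log( rpostalExact.test( '012345' ) );                  // false, cannot start with 0
```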
Match ID card
Let's start with the version circulating on the Internet:
\d{15}|\d{18}
\d{15} and \d{18} can check the length, but this is rather rough, and \d{18} rejects numbers whose last character is the check character X
The address, birthday, gender and so on can be parsed from the ID card number, so here is a dedicated explanation:
ID card rules
China's ID card number has 15 digits (first generation) or 18 digits (second generation). The only difference is that the second-generation number inserts 19 before the seventh digit of the first-generation number and appends a check character at the end
Upgrade the 15 digits to 18 digits, then parse the components of the 18-digit number (address, birthday, gender)
The code is as follows:
// Assumed pattern: rid is used below but is not defined in this text.
// Groups: 1 area code, 2 year, 3 month, 4 day, 5 the 17th digit (odd = male).
var rid = /^(\d{6})(\d{4})(\d{2})(\d{2})\d{2}(\d)[\dX]$/i;

function parseID( ID ) {
    if ( ID.length == 15 ) {
        // Upgrade to 18 digits: insert "19" before the seventh digit
        ID = ID.substr( 0, 6 ) + "19" + ID.substr( 6 );
        // The weight for each of the first 17 digits
        var rank = [ 7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2 ];
        // The check character, indexed by the weighted sum of the first 17 digits modulo 11
        var last = [ "1", "0", "X", "9", "8", "7", "6", "5", "4", "3", "2" ];
        // Weighted sum
        for ( var i = 0, sum = 0, len = ID.length; i < len; i++ )
            sum += ID[ i ] * rank[ i ];
        // Append the check character
        ID += last[ sum % 11 ];
    }
    if ( ID.length != 18 ) return null;
    var match = rid.exec( ID );
    return match ? {
        ID : ID,
        area : match[ 1 ],
        y : match[ 2 ],
        m : match[ 3 ],
        d : match[ 4 ],
        sex : match[ 5 ] % 2
    } : null;
}
Restrictions:
Only the address code is extracted here. To convert the code into an actual address, look it up on Baidu.
The sex field of the returned object is 1 (male) or 0 (female), and no conversion is done. If needed for page display, it can be converted like this: sex ? "Male" : "Female"
Test:
console.info( parseID( "142327840821047" ) );
console.info( parseID("142327198408210470" ) );
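The weights and check characters described above can also be used to verify an 18-character number that is supplied directly, which the parsing code does not do. A minimal sketch, with hasValidCheckChar as a hypothetical helper name:

```javascript
// Hypothetical helper: verifies the check character of an 18-character ID number.
function hasValidCheckChar( id ) {
    // 17 digits followed by a digit or X (either case).
    if ( !/^\d{17}[\dX]$/i.test( id ) ) return false;
    var weights = [ 7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2 ];
    // Check character indexed by (weighted sum of the first 17 digits) % 11.
    var checkChars = "10X98765432";
    var sum = 0;
    for ( var i = 0; i < 17; i++ )
        sum += Number( id.charAt( i ) ) * weights[ i ];
    return checkChars.charAt( sum % 11 ) === id.charAt( 17 ).toUpperCase();
}

console.log( hasValidCheckChar( "142327198408210470" ) ); // true (weighted sum 320, 320 % 11 = 1, check character "0")
console.log( hasValidCheckChar( "142327198408210471" ) ); // false
```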
Reference:
http://baike.baidu.com/view/118340.htm#1
Match IP address
Let's start with the version circulating on the Internet:
\d+\.\d+\.\d+\.\d+
\d+ places no limit on the number, neither on its length nor on its value
The correction is as follows:
var rip = /^(?:(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])\.){3}(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])$/;
rip.test( "192.168.1.1" ) // true
rip.test( "0.0.0.0" ) // true
rip.test( "255.255.255.255" ) // true
rip.test( "256.255.255.255" ) // false
Going further, capture each of the four parts in its own group:
var rip2 = /^([01]?\d{1,2}|2[0-4]\d|25[0-5])\.([01]?\d{1,2}|2[0-4]\d|25[0-5])\.([01]?\d{1,2}|2[0-4]\d|25[0-5])\.([01]?\d{1,2}|2[0-4]\d|25[0-5])$/;
rip2.exec( "192.168.1.1" ) // ["192.168.1.1", "192", "168", "1", "1"]
rip2.exec( "0.0.0.0" ) // ["0.0.0.0", "0", "0", "0", "0"]
rip2.exec( "255.255.255.255" ) // ["255.255.255.255", "255", "255", "255", "255"]
rip2.exec( "256.255.255.255" ) // null
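One use for the captured groups is converting a dotted address into a single number. The sketch below assumes the four-group pattern written with escaped dots; ripGroups and ipToNumber are illustrative names. It uses multiplication rather than bit shifts because JavaScript shifts work on signed 32-bit values.

```javascript
// Four capturing groups, one per part of the address.
var ripGroups = /^([01]?\d{1,2}|2[0-4]\d|25[0-5])\.([01]?\d{1,2}|2[0-4]\d|25[0-5])\.([01]?\d{1,2}|2[0-4]\d|25[0-5])\.([01]?\d{1,2}|2[0-4]\d|25[0-5])$/;

// Hypothetical helper: returns the address as an unsigned 32-bit number, or null.
function ipToNumber( ip ) {
    var match = ripGroups.exec( ip );
    if ( !match ) return null;
    var value = 0;
    for ( var i = 1; i <= 4; i++ )
        value = value * 256 + parseInt( match[ i ], 10 );
    return value;
}

console.log( ipToNumber( "192.168.1.1" ) );     // 3232235777
console.log( ipToNumber( "255.255.255.255" ) ); // 4294967295
console.log( ipToNumber( "256.255.255.255" ) ); // null
```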