Rather than calling this a summary of regular expressions, I prefer to regard it as a manual.

Three major methods of RegExp

The RegExp objects in this article use literal syntax: /pattern/attributes. This article covers three attributes (flags): i, m and g. m (multi-line matching) is rarely needed and is left out here, so a pattern can be written as follows:

var pattern = /hello/ig;
i (ignore case) makes the match case-insensitive; it is simple and is not covered in the examples below. g (global) makes the search global, that is, the search continues after the first match is found. It is more involved, so the methods below give it special attention.

Since these are the three major methods of RegExp, they are all called in the form pattern.test / pattern.exec / pattern.compile.

  • test

Main function: detect whether the specified string contains a certain substring (or matching pattern) and return true or false.

The example is as follows:

var s = 'you love me and I love you';
var pattern = /you/;
var ans = pattern.test(s);
console.log(ans); // true
If the pattern uses the g attribute, repeated calls continue the search from where the previous match ended, which involves the lastIndex property (see the discussion of g under exec).
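As a minimal sketch of that behaviour (the variable names results and positions are only illustrative), repeated calls to test on a global pattern move lastIndex forward until no match remains:

```javascript
// A global pattern remembers where the previous match ended.
var s = 'you love me and I love you';
var pattern = /you/g;

var results = [];
var positions = [];
for (var i = 0; i < 3; i++) {
  results.push(pattern.test(s));       // the search starts at pattern.lastIndex
  positions.push(pattern.lastIndex);   // updated after every call
}

console.log(results);   // [true, true, false]
console.log(positions); // [3, 26, 0]
```

The third call finds nothing, so it returns false and lastIndex goes back to 0.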

  • exec

Main function: extract the required substring (or matching pattern) from the specified string and return an array holding the match results; if there is no match, null is returned. (You can also write your own loop to extract all matches, or the match at a given index.)

exec can be said to be an upgraded version of test, because it can not only detect, but also directly extract the results after detection.

The example is as follows:

var s = 'you love me and I love you';
var pattern = /you/;
var ans = pattern.exec(s);
console.log(ans); // ["you", index: 0, input: "you love me and I love you"]
console.log(ans.index); // 0
console.log(ans.input); // you love me and I love you
The output is very interesting. The 0th element of this array is the text that matches the regular expression, the 1st element is the text that matches the 1st subexpression of RegExpObject (if any), the 2nd element is the text that matches the 2nd subexpression of RegExpObject (if any), and so on.

What is "text that matches a subexpression"? Look at the following example:

var s = 'you love me and I love you';
var pattern = /y(o?)u/;
var ans = pattern.exec(s);
console.log(ans);   // ["you", "o", index: 0, input: "you love me and I love you"]
console.log(ans.length) // 2
A subexpression is the part of the pattern inside () (see the introduction to subexpressions below). Note that the array length in the example above is 2: index and input are only properties of the array, not elements (the Chrome output above may be misleading).

In addition to the array elements and the length property, the array returned by exec() carries two more properties. The index property gives the position of the first character of the matched text, and the input property holds the string that was searched. When exec() is called on a non-global RegExp object, the array it returns is the same as the one returned by String.match().
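The equivalence between exec and match for a non-global pattern can be checked with a small sketch like the following (the pattern /l(o)ve/ is just an illustrative choice):

```javascript
var s = 'you love me and I love you';
var pattern = /l(o)ve/; // non-global, one subexpression

var fromExec = pattern.exec(s);
var fromMatch = s.match(pattern);

// Same elements and the same extra properties.
console.log(fromExec[0] === fromMatch[0]);       // true, both are "love"
console.log(fromExec[1] === fromMatch[1]);       // true, both are "o"
console.log(fromExec.index === fromMatch.index); // true, both are 4
console.log(fromExec.input === fromMatch.input); // true
```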

If you use the g attribute, exec() works as follows (using the same example; test behaves similarly with g):

  1. Find the first "you" and store the position just after it.

  2. If exec() is run again, start searching from the stored position (lastIndex), find the next "you" and store the position just after it.

The behavior of exec() is a little more complicated when the RegExp object is global. It starts searching the string at the character position given by the pattern's lastIndex property. When exec() finds matching text, it sets lastIndex to the position immediately after the last character of the match. This means we can iterate over all matches in a string by calling exec() repeatedly. When exec() finds no more matches, it returns null and resets lastIndex to 0. The lastIndex property introduced here only takes effect when g is combined with test or exec. It is a property of the pattern, an integer giving the character position at which the next match starts.

The example is as follows:

var s = 'you love me and I love you';
var pattern = /you/g;
var ans;
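do {
  ans = pattern.exec(s);
  console.log(ans);
  console.log(pattern.lastIndex);
} while (ans !== null);
The result is as follows:

// ["you", index: 0, input: "you love me and I love you"]
// 3
// ["you", index: 23, input: "you love me and I love you"]
// 26
// null
// 0
It should be easy to understand. On the third iteration no further "you" can be found, so null is returned and lastIndex is reset to 0.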

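If you want to start searching a new string (still using the old pattern) after completing a pattern match in one string, you must manually reset the lastIndex property to 0.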
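One way to avoid forgetting that reset is to wrap the exec loop in a helper. The sketch below is only an illustration: findAll is a hypothetical name, not a built-in, and the shape of the returned objects is an assumption.

```javascript
// Hypothetical helper: collect every match of a global pattern in a string.
function findAll(pattern, s) {
  if (!pattern.global) {
    throw new Error('findAll expects a pattern with the g attribute');
  }
  pattern.lastIndex = 0; // start from the beginning, whatever happened before
  var found = [];
  var ans;
  while ((ans = pattern.exec(s)) !== null) {
    found.push({ text: ans[0], index: ans.index });
    // An empty match does not advance lastIndex, so step forward manually.
    if (ans[0] === '') {
      pattern.lastIndex++;
    }
  }
  return found;
}

var pattern = /you/g;
pattern.test('thank you');       // leaves lastIndex at 9
console.log(pattern.lastIndex);  // 9
console.log(findAll(pattern, 'you love me and I love you'));
// [{ text: "you", index: 0 }, { text: "you", index: 23 }]
```

Without the reset inside findAll, the search of the second string would start at position 9 and miss the first "you".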

  • compile


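This method is used to change the matching pattern. It is of little use (and is now deprecated), so it is skipped here. For details, see the JavaScript compile() method reference.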

The four guardians of String

  • search



var s = 'you love me and I love you';
var pattern = /you/;
var ans = s.search(pattern);
console.log(ans);  // 0
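A short sketch of the remaining behaviour of search, using illustrative patterns: it returns -1 when nothing matches, and it ignores both the g attribute and lastIndex, so it always reports the first match.

```javascript
var s = 'you love me and I love you';

var hit = s.search(/love/);   // position of the first match
var miss = s.search(/hate/);  // no match

// The g attribute and lastIndex are ignored by search.
var pattern = /you/g;
pattern.lastIndex = 5;
var first = s.search(pattern);
var second = s.search(pattern);

console.log(hit, miss);      // 4 -1
console.log(first, second);  // 0 0
```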
  • match




var s = 'you love me and I love you';
console.log(s.match(/you/));    // ["you", index: 0, input: "you love me and I love you"]
console.log(s.match(/you/g));   // ["you", "you"]
  • replace

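Main function: replace a substring (or matching pattern) in the specified string with another substring and return the new string. The form is str.replace('search pattern', 'replacement'). If a pattern with g is used, every occurrence is replaced; otherwise only the first one is replaced.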


var s = 'you love me and I love you';
console.log(s.replace('you', 'zichi')); // zichi love me and I love you
console.log(s.replace(/you/, 'zichi')); // zichi love me and I love you
console.log(s.replace(/you/g, 'zichi'));    // zichi love me and I love zichi
var s = 'I love you';
var pattern = /love/;
var ans = s.replace(pattern, '$`' + '$&' + "$'");
console.log(ans); // I I love you you
That's right: '$`' + '$&' + "$'" is simply the original string!
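To see what each special replacement sequence stands for on its own, here is a small sketch (the square brackets are only there to make the inserted text visible, and the pattern is an illustrative one):

```javascript
var s = 'I love you';
var pattern = /l(ov)e/; // one subexpression, so $1 is available

var matched = s.replace(pattern, '[$&]'); // $& is the matched text
var before  = s.replace(pattern, '[$`]'); // $` is the text before the match
var after   = s.replace(pattern, "[$']"); // $' is the text after the match
var group1  = s.replace(pattern, '[$1]'); // $1 is the first subexpression
var dollar  = s.replace(pattern, '[$$]'); // $$ is a literal dollar sign

console.log(matched); // I [love] you
console.log(before);  // I [I ] you
console.log(after);   // I [ you] you
console.log(group1);  // I [ov] you
console.log(dollar);  // I [$] you
```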



var s = 'I love you';
var pattern = /love/;
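var ans = s.replace(pattern, function(a) {
  // With a single parameter, a is the matched text (further parameters, if any,
  // are the subexpressions in order, followed by two other arguments).
  return a.toUpperCase();
});
console.log(ans); // I LOVE you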
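The full argument list of the callback can be inspected with a sketch like this one. The parameter names match, p1, p2, offset and input are chosen for illustration; the callback receives the matched text, then one argument per subexpression, then the position of the match and the whole string.

```javascript
var s = 'I love you';
var pattern = /l(o)v(e)/; // two subexpressions
var seen = [];

var ans = s.replace(pattern, function(match, p1, p2, offset, input) {
  seen.push(match, p1, p2, offset, input);
  return match.toUpperCase();
});

console.log(seen); // ["love", "o", "e", 2, "I love you"]
console.log(ans);  // I LOVE you
```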
  • split



var s = 'you love me and I love you';
var pattern = 'and';
var ans = s.split(pattern);
console.log(ans);   // ["you love me ", " I love you"]
var s = 'you love me and I love you';
var pattern = /and/;
var ans = s.split(pattern, 1);
console.log(ans);   // ["you love me "]
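RegExp characters

  • \s matches any whitespace character; \S is the opposite. Whitespace characters include: the space character, the tab character, the carriage return character, the new line character, the vertical tab character and the form feed character.

  • \b is a special code defined by regular expressions that stands for the beginning or end of a word, that is, a word boundary. Although English words are usually separated by spaces, punctuation or line breaks, \b does not match any of these separator characters; it matches only a position (similar to ^, $ and zero-width assertions).

  • \w matches a letter, digit or underscore; [a-z0-9A-Z_] is exactly equivalent to \w.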
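The three character classes above can be tried out with a short sketch; the sample strings are made up for illustration.

```javascript
var s = 'hello_world 42\tfoo-bar';

// \w+ grabs runs of letters, digits and underscores.
var words = s.match(/\w+/g);

// \s matches the space and the tab; \S+ is everything in between.
var pieces = s.split(/\s/);
var chunks = s.match(/\S+/g);

// \b matches a position, not a character, so nothing is consumed.
var swapped = 'cat concat cat'.replace(/\bcat\b/g, 'dog');
var marked = 'a b'.replace(/\b/g, '|');

console.log(words);   // ["hello_world", "42", "foo", "bar"]
console.log(pieces);  // ["hello_world", "42", "foo-bar"]
console.log(chunks);  // ["hello_world", "42", "foo-bar"]
console.log(swapped); // dog concat dog
console.log(marked);  // |a| |b|
```

The last line shows the four word boundaries in 'a b': replacing a zero-width position inserts text without removing any character.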



var s = 'hello world welcome to my world';
var pattern = /hello.*world/;
var ans = pattern.exec(s);
console.log(ans)  // ["hello world welcome to my world", index: 0, input: "hello world welcome to my world"]
The above example does not stop at the first "hello world"; it keeps matching greedily to the end.


var s = 'hello world welcome to my world';
var pattern = /hello.*?world/;
var ans = pattern.exec(s);
console.log(ans)  // ["hello world", index: 0, input: "hello world welcome to my world"]



  • Notation


var s = 'hello world';
var pattern = /(hello)/;
var ans = pattern.exec(s);
  • Where subexpressions appear


var s = 'hello world';
var pattern = /(h(e)llo)/;
var ans = pattern.exec(s);
console.log(ans); // ["hello", "hello", "e", index: 0, input: "hello world"]


var s = 'hello world';
var pattern = /(h\w*o)\s*(w\w*d)/;
var ans = s.replace(pattern, '$2 $1')
console.log(ans); // world hello

Backreferences & zero-width assertions

  • Numbering of subexpressions






var s = 'hellohellochinaworldworld';
var pattern = /(\w+)\1/g;
var a = s.match(pattern);
console.log(a); // ["hellohello", "worldworld"]
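Here \1 stands for the same content as the first subexpression (group) in the pattern, \2 for the same content as the second subexpression (if there is one), and so on for \3, \4 and beyond. (You can also name groups yourself; see the references for details.)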
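Groups can also be given names. As a rough sketch, assuming an engine that supports ES2018 named capture groups (the group name word is made up for this example), the same pattern could be written with (?<word>...) and referenced with \k<word>:

```javascript
var s = 'hellohellochinaworldworld';
var pattern = /(?<word>\w+)\k<word>/g; // \k<word> repeats whatever "word" matched

var words = [];
var ans;
while ((ans = pattern.exec(s)) !== null) {
  words.push(ans.groups.word); // named groups are exposed on ans.groups
}

console.log(words); // ["hello", "world"]
```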



var s = 'hellohellochinaworldworld';
var pattern = /(\w+)\1/g;
var ans;
do {
  ans = pattern.exec(s);
  console.log(ans);
} while (ans !== null);

// result
// ["hellohello", "hello", index: 0, input: "hellohellochinaworldworld"]
// ["worldworld", "world", index: 15, input: "hellohellochinaworldworld"]
// null



var s = 'hellohellochinaworldworld';
var pattern = /(\w+)\1/g;
var ans = [];
s.replace(pattern, function(a, b) {
  ans.push(b); // b is the text matched by the first subexpression
});
console.log(ans);   // ["hello", "world"]







  • (?=exp)


// Get the first part of each word that ends in "ing"
var s = 'I love dancing but he likes singing';
var pattern = /\b\w+(?=ing\b)/g;
var ans = s.match(pattern);
console.log(ans); // ["danc", "sing"]
  • (?!exp)


// Get the first four characters of each word whose fifth character is not i
var s = 'I love dancing but he likes singing';
var pattern = /\b\w{4}(?!i)/g;
var ans = s.match(pattern);
console.log(ans); // ["love", "like"]




  • Character escaping

Because some characters are already used by regular expression syntax, such as . * ( ) / \ [ ], they must be escaped with \ when you need them as literal characters.

var s = 'http://www.cnblogs.com/zichi/';
var pattern = /http:\/\/www\.cnblogs\.com\/zichi\//;
var ans = pattern.exec(s);
console.log(ans); // ["http://www.cnblogs.com/zichi/", index: 0, input: "http://www.cnblogs.com/zichi/"]
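When the text to match comes from a variable, escaping by hand is not practical. One possible approach is a small helper; escapeRegExp below is a hypothetical name, not a built-in, and the set of characters it escapes is an assumption that covers the usual special characters.

```javascript
// Hypothetical helper: escape every character that has a special meaning
// in a regular expression so that the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

var url = 'http://www.cnblogs.com/zichi/';
var pattern = new RegExp('^' + escapeRegExp(url) + '$');

var exact = pattern.test(url);
var loose = pattern.test('http://wwwXcnblogs.com/zichi/');

console.log(escapeRegExp('a.b*c')); // a\.b\*c
console.log(exact);                 // true
console.log(loose);                 // false, because "." is now literal
```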
  • Alternation

var s = "I don't like you but I love you";
var pattern = /I.*(like|love).*you/g;
var ans = s.match(pattern);
console.log(ans); // ["I don't like you but I love you"]


var s = "I don't like you but I love you";
var pattern = /I.*?(like|love).*?you/g;
var ans = s.match(pattern);
console.log(ans); // ["I don't like you", "I love you"]


  • Trim leading and trailing whitespace from a string (replace)

String.prototype.trim = function() {
  return this.replace(/(^\s*)|(\s*$)/g, "");
};
var s = '    hello  world     ';
var ans = s.trim();
console.log(ans.length);    // 12
  • Add thousands separators to a string (zero-width assertion)

String.prototype.getAns = function() {
  var pattern = /(?=((?!\b)\d{3})+$)/g;
  return this.replace(pattern, ',');
};

var s = '123456789';
console.log(s.getAns());  // 123,456,789
  • Find the most frequent character in a string (backreference)

String.prototype.getMost = function() {
  var chars = this.split('');
  chars.sort();                     // group identical characters together
  var s = chars.join('');
  var pattern = /(\w)\1*/g;         // one run of a repeated character
  var runs = s.match(pattern);
  runs.sort(function(a, b) {
    return b.length - a.length;     // longest run first
  });
  var letter = runs[0][0];
  var num = runs[0].length;
  return letter + ': ' + num;
};

var s = 'aaabbbcccaaabbbcccccc';
console.log(s.getMost()); // c: 9


  1. Chinese characters only: /^[\u4e00-\u9fa5]{0,}$/
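A quick way to check what this pattern accepts is to run it through test. The sample strings below are assumptions chosen for illustration, written as escapes so the snippet does not depend on file encoding.

```javascript
var pattern = /^[\u4e00-\u9fa5]{0,}$/;

var chinese = pattern.test('\u6b63\u5219');    // two Chinese characters
var mixed = pattern.test('\u6b63\u5219 abc');  // contains a space and Latin letters
var empty = pattern.test('');                  // {0,} also allows the empty string

// Using + instead of {0,} requires at least one character.
var nonEmpty = /^[\u4e00-\u9fa5]+$/;
var emptyStrict = nonEmpty.test('');

console.log(chinese);     // true
console.log(mixed);       // false
console.log(empty);       // true
console.log(emptyStrict); // false
```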


  1. test: checks whether the specified string contains a certain substring (or matching pattern) and returns true or false; with g, successive calls continue the search through the string.

  2. exec: checks whether the specified string contains a certain substring (or matching pattern). If so, it returns an array (rich in information, see the introduction above); if not, it returns null. With g, repeated calls can find the information for every matching substring. That information includes the strings matched by the subexpressions in the pattern.

  3. compile: modifies the pattern of a regular expression.

  4. search: checks whether the specified string contains a certain substring (or matching pattern). If so, it returns the starting position of the match in the original string; if not, it returns -1. A global search is not possible.

  5. match: checks whether the specified string contains a certain substring (or matching pattern). In non-global mode the returned information is the same as that of exec; in global mode it returns an array of the matched strings directly. (If you do not need more information about each match, match is recommended over exec.)

  6. replace: finds a certain substring (or matching pattern) in the specified string and replaces it with another substring (which may refer to the original string or to the matched text). With g every occurrence is replaced; otherwise only the first. The replacement can reference the values matched by subexpressions.

  7. split: splits the string on a specific pattern and returns an array of strings; it is the opposite of Array's join method.

  8. Subexpression: a part of the pattern enclosed in parentheses. It can be referenced with a backreference, and its actual matched value can be obtained with exec or replace.

  9. Backreference: refers back to the text captured by a subexpression (group).

  10. Zero-width assertion: a positional concept, similar to \b, ^ and $.


