Use shortest (non-greedy) matching when you want a pattern to match the shortest possible piece of text rather than the longest. This article explains how the shortest-match pattern works in regular expressions.
Preface
Recently I tried to use regular expressions to grab some content from a web page. The content was not complicated, but I ran into quite a few problems. Here is what I found.
Suppose we want a regular expression that matches the opening and closing tags of an element, for example the two h1 tags in <h1>hello world</h1>.
Many people would write it like this:
/<.*h1>/g
But is this really okay?
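The * quantifier matches zero or more of the preceding element, and it is greedy, so .* consumes as much text as it can. Run against <h1>hello world</h1>, the pattern therefore returns a single match covering the whole string, from the first < to the final h1>.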
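The greedy behaviour can be checked with a short script. This is a minimal sketch that assumes a JavaScript runtime such as Node.js; the variable names are illustrative and not from the original article.

```javascript
// Sample input taken from the article.
const html = "<h1>hello world</h1>";

// Greedy: .* first runs to the end of the string, then backtracks
// only as far as needed to find "h1>", which is the LAST "h1>".
const greedyMatches = html.match(/<.*h1>/g);

console.log(greedyMatches.length); // 1
console.log(greedyMatches[0]);     // <h1>hello world</h1>
```

The whole element comes back as one match, not the two tags.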
Obviously this is not what we want. So how do we change greedy matching into shortest (lazy) matching? Add a ? directly after the *:
/<.*?h1>/g
That change is enough: the pattern now matches <h1> and </h1> as two separate results.
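The same check for the lazy pattern, again as a minimal sketch assuming a JavaScript runtime such as Node.js. The second input string is a made-up example added to show a limitation; it does not appear in the original article.

```javascript
// Sample input taken from the article.
const source = "<h1>hello world</h1>";

// Lazy: .*? takes as few characters as possible before "h1>".
const lazyMatches = source.match(/<.*?h1>/g);
console.log(lazyMatches); // [ '<h1>', '</h1>' ]

// Caveat (hypothetical input): a match still starts at the leftmost "<",
// so a tag in front of the h1 gets pulled into the first match.
const mixed = "<p>a</p><h1>b</h1>";
const mixedMatches = mixed.match(/<.*?h1>/g);
console.log(mixedMatches); // [ '<p>a</p><h1>', '</h1>' ]
```

Lazy matching shortens where a match ends, not where it begins. When the input may contain other tags, a stricter pattern such as /<\/?h1>/g is a safer choice.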
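The principle is simple. On its own, ? is a quantifier meaning zero or one, but placed directly after another quantifier such as * it switches that quantifier from greedy to lazy (non-greedy). So .*? takes as few characters as possible and only expands when the rest of the pattern cannot match yet. The match therefore ends at the first h1> it reaches instead of running on to the last one.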
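The ? character plays two different roles in a pattern, and a small comparison keeps them apart: alone it is a quantifier (zero or one), while directly after another quantifier such as * it makes that quantifier lazy. The snippet below is an illustrative sketch with made-up inputs, assuming a JavaScript runtime such as Node.js.

```javascript
const text = "aaa";

// "?" alone is a quantifier: zero or one "a" (it takes one if it can).
const optional = text.match(/a?/)[0];
console.log(JSON.stringify(optional));   // "a"

// "*" is greedy: as many "a" as possible.
const greedyStar = text.match(/a*/)[0];
console.log(JSON.stringify(greedyStar)); // "aaa"

// "*?" is lazy: as few "a" as possible, which here is none at all.
const lazyStar = text.match(/a*?/)[0];
console.log(JSON.stringify(lazyStar));   // ""

// A lazy quantifier still expands when the rest of the pattern requires it.
const lazyThenB = "aaab".match(/a*?b/)[0];
console.log(JSON.stringify(lazyThenB));  // "aaab"
```

The last case shows that lazy does not mean "always empty": the quantifier grows one character at a time until the remainder of the pattern can match.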