Find and replace using regular expressions
First of all, I should say that I am not particularly skilled with regular expressions. I was simply left with no choice at work and came to understand them step by step. The more I learn, the more I find that regular expressions are a very powerful tool that can often get twice the result with half the effort.

Many ready-made patterns circulate on the Internet, such as patterns for finding phone numbers or email addresses. I believe many people, like me, started learning regular expressions from these popular patterns. Once you have seen what regular expressions can do and the popular patterns no longer fit your needs, you have the motivation to learn them properly. The basic rules are simple and it is easy to get started, but how well you use them afterwards varies from person to person.

PHP currently supports two types of regular expressions: POSIX extended regular expressions and Perl-compatible regular expressions. Many PHP textbooks use POSIX extended regular expressions, but I prefer the Perl-compatible kind. First, it has better compatibility, and second, I think Perl-compatible regular expressions look clearer.

Let us first assume a working environment: there is a file containing user information, 10,000 lines in total. Each line records the information of one user in the following format:

    username,010-12345678,firstname.lastname,05/21/2007

In the rest of this article, I will use this hypothetical working environment to introduce searching and replacing with Perl-compatible regular expressions in PHP.

Find

The most commonly used search function is preg_match(). Its description is as follows:

    int preg_match(string pattern, string subject [, array matches [, int flags]])

I won't say much about the syntax of regular expressions.
I assume that everyone reading this article already has some foundation in regular expressions.

In fact, regular expression search is needed less often than you might expect. Searches that are not too complex can be implemented with the strstr() function, which is also more efficient. Regular expressions are usually reserved for more complex searches that strstr() cannot handle. For example, to find the records where the area code of the phone number is 010 and the last name is bill, I can write:

    preg_match('/^[^,]*,010[^.]*\.bill,.*$/i', $line);

Here $line is one line of data from the file. If the area code in $line is 010 and the last name happens to be bill, the statement returns 1; if there is no match, it returns 0. Usually we don't care about the value itself, only about whether there is a match. If we want to find the records of users whose last name is bill and whose date falls in 2007, we can use the following statement:

    preg_match('#^[^,]*,[^,]*,[^.]*\.bill,[^/]*/[^/]*/2007#i', $line);

Because this pattern contains slashes, # is used as the delimiter instead of /. Also note that a literal dot must be written as \. in the pattern, since an unescaped dot matches any character.

Regular expression search is usually used when two or more keywords need to be matched and the keywords are not adjacent. Ordinary search functions cannot do this, so regular expressions are used.

Replacement

Compared with search, I think replacement is where regular expressions are most powerful and useful. Suppose we now need to change the date format of the 10,000 records in the file to yyyy/mm/dd. What would you do? You will find that an ordinary search and replace cannot achieve this. You might say that you could split each record apart, analyze it, and then reassemble it. I admit that this is a solution, but it is not the best one. Let us see how to meet the requirement with a regular expression:

    $line = preg_replace('#([^,]*,[^,]*,[^,]*,)([0-9]*)/([0-9]*)/([0-9]*)#', '${1}${4}/${2}/${3}', $line);

Here we use the "sub-pattern" feature of regular expressions.
If you look carefully, you will find four pairs of '()' in the first parameter of the preg_replace() function. The content of each pair of '()' is a "sub-pattern". In the second parameter, these sub-patterns can be combined at will using the forms ${1}, ${2}, and so on.

If I want to delete the phone number from the data, I can write it like this:

    $line = preg_replace('#([^,]*,)([^,]*,)([^,]*,)([0-9]*)/([0-9]*)/([0-9]*)#', '${1}${3}${4}/${5}/${6}', $line);

Here a more complicated form is used in order to demonstrate sub-patterns. In fact, there is a simpler way:

    $line = preg_replace('/,\d+-\d+/', '', $line);

There is a saying, and I forget whether I heard it from someone else or came up with it myself, but it has been in my mind for a long time: regular expressions are not a technology but a skill :) Getting started with regular expressions is actually quite simple. You only need to use them once or twice to master the basic syntax. How much power this sword delivers depends on personal practice.

One thing needs to be pointed out: this sword is double-edged. Regular expressions execute more slowly than functions such as strstr() and strpos(). Therefore, when your search is very simple, there is no need to use regular expressions.
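As a rough sketch, the searches and replacements discussed in this article can be tried on a single record. The sample values (the name john.bill in particular) and all variable names are made up for illustration. The patterns use # as the delimiter wherever the pattern itself contains a slash, and single-quoted replacement strings so that PHP does not try to interpolate ${1} as a variable.

```php
<?php
// A made-up record in the article's hypothetical format:
// username,phone,firstname.lastname,mm/dd/yyyy
$line = 'username,010-12345678,john.bill,05/21/2007';

// Search: area code 010 and last name "bill" (case-insensitive).
$isBill010 = preg_match('/^[^,]*,010[^.]*\.bill,/i', $line);

// Search: last name "bill" and a date in 2007.
$isBill2007 = preg_match('#^[^,]*,[^,]*,[^.]*\.bill,[^/]*/[^/]*/2007$#i', $line);

// Replace: reorder the date from mm/dd/yyyy to yyyy/mm/dd.
// ${1} is everything before the date; ${2}, ${3}, ${4} are month, day, year.
$reordered = preg_replace(
    '#([^,]*,[^,]*,[^,]*,)([0-9]+)/([0-9]+)/([0-9]+)#',
    '${1}${4}/${2}/${3}',
    $line
);

// Replace: delete the phone number field, including its leading comma.
$noPhone = preg_replace('/,\d+-\d+/', '', $line);

// A very simple search does not need a regular expression at all.
$hasBill = strpos($line, '.bill,') !== false;

echo $reordered, "\n"; // username,010-12345678,john.bill,2007/05/21
echo $noPhone, "\n";   // username,john.bill,05/21/2007
```

To process the whole file, the same statements could be applied to each line in a loop, for example over the array returned by file().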