Regular expression functions in PHP
PHP provides two families of regular expression functions. The first is built on the PCRE (Perl Compatible Regular Expressions) library, which implements pattern matching with the same syntax rules as Perl; its functions are named with the "preg_" prefix. The second comes from the POSIX (Portable Operating System Interface) extension. POSIX extended regular expressions are defined by POSIX 1003.2, and the corresponding functions are named with the "ereg" prefix, such as ereg() and ereg_replace(). Note that the POSIX functions were deprecated in PHP 5.3 and removed in PHP 7.0, so code written for current PHP versions must use the PCRE functions.
The two families offer similar functionality but differ slightly in speed. In general, the PCRE functions are somewhat faster for the same task. Their use is described in detail below.
6.3.1 Regular expression matching
1. preg_match()
Function prototype: int preg_match (string $pattern, string $content [, array $matches])
The preg_match() function searches the $content string for text that matches the regular expression given by $pattern. If $matches is provided, the results are stored in it: $matches[0] contains the text that matched the entire pattern, $matches[1] contains the text captured by the first parenthesized subpattern, and so on. The function stops after the first match, so it returns 1 if the pattern matched and 0 if it did not. Code 6.1 shows an example of the preg_match() function.
Code 6.1 Date and time matching
//The string to be matched. The date() function returns the current time
$content = "Current date and time is ".date("Y-m-d h:i a").", we are learning PHP together.";
//Match the time with an explicit pattern
if (preg_match("/\d{4}-\d{2}-\d{2} \d{2}:\d{2} [ap]m/", $content, $m)) {
    echo "The matching time is: ".$m[0]."\n";
}
//Since the time format is regular, a looser pattern with two capture groups also works
if (preg_match("/([\d-]{10}) ([\d:]{5} [ap]m)/", $content, $m)) {
    echo "The current date is: ".$m[1]."\n";
    echo "The current time is: ".$m[2]."\n";
}
This is a simple example of matching a dynamically generated string. Assuming that the current system time is 13:25 on August 17, 2006, the output is as follows.
The matching time is: 2006-08-17 01:25 pm
The current date is: 2006-08-17
The current time is: 01:25 pm
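To see what preg_match() returns and how $matches is filled without depending on the system clock, here is a minimal sketch that runs the second pattern from Code 6.1 against a fixed string. The sample text and the variable names are assumptions chosen for illustration only.

```php
<?php
// Fixed sample text (an assumption for illustration) so the result is reproducible
$content = "Current date and time is 2006-08-17 01:25 pm, we are learning PHP together.";

// Two capture groups: the date part and the time part
$found = preg_match("/([\d-]{10}) ([\d:]{5} [ap]m)/", $content, $m);

// preg_match() stops at the first match, so $found is 1 on success and 0 otherwise
echo "Return value: ".$found."\n";
echo "Whole match:  ".$m[0]."\n";
echo "Group 1:      ".$m[1]."\n";
echo "Group 2:      ".$m[2]."\n";

// A subject without a match yields 0 and an empty $matches array
$none = preg_match("/\d{4}-\d{2}-\d{2}/", "no date here", $m2);
echo "No match:     ".$none."\n";
```

Comparing the return value with `=== 1` rather than testing it loosely also distinguishes "no match" (0) from an error (false).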
2. ereg() and eregi()
ereg() is the regular expression matching function of the POSIX extension library, and eregi() is its case-insensitive version. Both work much like preg_match(), but they return false when there is no match and a non-zero integer when there is one, so the result can be tested directly in a condition. Note that the first parameter of the POSIX functions takes the bare regular expression string; no delimiters are required. For example, Code 6.2 checks a file name for safety.
Code 6.2 Security check of file name
$username = $_SERVER['REMOTE_USER'];
$filename = $_GET['file'];
//Filter the file name to ensure system security
if (!ereg('^[^./][^/]*$', $filename)) {
    die('This is an illegal file name!');
}
//Filter the username
if (!ereg('^[^./][^/]*$', $username)) {
    die('This is an invalid username!');
}
//Build the file path from the filtered values
$thefile = "/home/$username/$filename";
Typically, the Perl-compatible function preg_match() is faster than ereg() or eregi(). If you only need to find out whether a string contains a certain substring, strstr() or strpos() is recommended instead.
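On PHP 7 and later, where ereg() no longer exists, the check from Code 6.2 can be ported to preg_match() by wrapping the pattern in delimiters. The sketch below is one possible port, not the only one; the helper name is_safe_name() is hypothetical, and the last lines show strpos() for a plain substring test.

```php
<?php
// Hypothetical helper: accepts names that do not start with "." or "/"
// and contain no "/" anywhere (the same pattern as in Code 6.2)
function is_safe_name($name)
{
    // "#" is used as the delimiter so that "/" needs no escaping
    return preg_match('#^[^./][^/]*$#', $name) === 1;
}

var_dump(is_safe_name("report.txt"));      // bool(true)
var_dump(is_safe_name("../etc/passwd"));   // bool(false)
var_dump(is_safe_name(".htaccess"));       // bool(false)

// For a plain substring test no regular expression is needed.
// strpos() can return 0 (match at the start), so compare with !== false
$hasPhp = strpos("learning PHP together", "PHP") !== false;
var_dump($hasPhp);                         // bool(true)
```

For a case-insensitive match, the i modifier after the closing delimiter takes the place of eregi().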
3. preg_grep()
Function prototype: array preg_grep (string $pattern, array $input)
The preg_grep() function returns an array containing the elements of the $input array that match the given $pattern. Like preg_match(), preg_grep() performs only one match attempt per element of $input. Code 6.3 illustrates the use of the preg_grep() function.
Code 6.3 Array query matching
$subjects = array(
    "Mechanical Engineering", "Medicine",
    "Social Science", "Agriculture",
    "Commercial Science", "Politics"
);
//Select the entries that consist of a single word
$alonewords = preg_grep("/^[a-z]*$/i", $subjects);
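To make the result of Code 6.3 concrete, the sketch below runs the same pattern and prints the matches. It shows that preg_grep() keeps the original array keys. The PREG_GREP_INVERT flag and the array_values() call are additions that the text above does not cover.

```php
<?php
$subjects = array(
    "Mechanical Engineering", "Medicine",
    "Social Science", "Agriculture",
    "Commercial Science", "Politics"
);

// Keep only entries made of letters alone (no spaces), i.e. single words
$alonewords = preg_grep("/^[a-z]*$/i", $subjects);
print_r($alonewords);   // keys 1, 3 and 5 are preserved

// PREG_GREP_INVERT returns the elements that do NOT match instead
$phrases = preg_grep("/^[a-z]*$/i", $subjects, PREG_GREP_INVERT);
print_r($phrases);      // keys 0, 2 and 4

// array_values() renumbers the result when the original keys are not needed
$list = array_values($alonewords);
print_r($list);
```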
6.3.2 Global regular expression matching
Code 6.4 Converting link addresses in text to HTML
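function url2html($text)
{
    //Find every URL in the text; the full matches are collected in $links[0]
    preg_match_all("/http:\/\/?[^\s]+/i", $text, $links);
    //Maximum length of the visible link text
    $max_size = 40;
    foreach ($links[0] as $link_url) {
        $len = strlen($link_url);
        if ($len > $max_size) {
            $link_text = substr($link_url, 0, $max_size)."...";
        } else {
            $link_text = $link_url;
        }
        //Wrap the URL in an anchor tag, showing the shortened text
        $text = str_replace($link_url, "<a href='$link_url'>$link_text</a>", $text);
    }
    return $text;
}
$str = "This is multi-line text containing several URL link addresses. Welcome to visit";
print url2html($str);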
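As a rough illustration of what preg_match_all() collects, the following sketch applies the pattern from Code 6.4 to a sample string. The example.com and example.org addresses are placeholders invented for the demonstration, and the PREG_SET_ORDER call is included only to contrast the two result layouts.

```php
<?php
// Placeholder text; the URLs are made up for this sketch
$text = "Docs: http://example.com/manual and news: http://example.org/news today.";

// Default layout (PREG_PATTERN_ORDER): $links[0] lists every full match
$count = preg_match_all("/http:\/\/?[^\s]+/i", $text, $links);
echo $count."\n";          // number of matches found
print_r($links[0]);

// PREG_SET_ORDER groups the results per match instead of per capture group
preg_match_all("/http:\/\/([^\s\/]+)/i", $text, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    // $set[0] is the full match, $set[1] the host name captured by the group
    echo $set[1]."\n";
}
```

Unlike preg_match(), which stops at the first match, preg_match_all() keeps scanning to the end of the subject and returns the total number of matches.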
Code 6.5 Multi-line matching of file contents
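$rows = file('php.ini'); //Read the php.ini file into an array
foreach ($rows as $line) {
    //Group 1 is the option name, group 2 is the value after the "=" sign
    if (eregi("^([a-z0-9_.]*) *=(.*)", $line, $matches)) {
        $options[$matches[1]] = trim($matches[2]);
    }
}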
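The same parsing can be written with preg_match() and the i modifier, which takes the place of eregi(). The sketch below works on an in-memory array of sample lines instead of a real php.ini so that it runs anywhere. The sample settings are illustrative, and the name group uses + instead of * (a small deviation from Code 6.5) so that a line starting with "=" cannot produce an empty key.

```php
<?php
// Sample lines standing in for file('php.ini'); values are illustrative
$rows = array(
    "; a comment line\n",
    "memory_limit = 128M\n",
    "display_errors=Off\n",
    "[Session]\n",
    "session.save_handler = files\n",
);

$options = array();
foreach ($rows as $line) {
    // Group 1: option name; group 2: everything after the "=" sign
    if (preg_match("/^([a-z0-9_.]+) *=(.*)/i", $line, $matches)) {
        $options[$matches[1]] = trim($matches[2]);
    }
}
print_r($options);
```

For real configuration files, PHP's built-in parse_ini_file() is usually the more robust choice; the loop above is mainly a demonstration of capture groups.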
6.3.3 Regular expression replacement
Function prototype: string ereg_replace (string $pattern, string $replacement, string $string)
string eregi_replace (string $pattern, string $replacement, string $string)
Code 6.6 Cleaning up source code
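$lines = file('source.php'); //Read the file into an array
for ($i = 0; $i < count($lines); $i++) {
    //Remove comments that run from "//" or "#" to the end of the line
    $lines[$i] = eregi_replace("(\/\/|#).*$", "", $lines[$i]);
    //Replace trailing whitespace with a single line break
    $lines[$i] = eregi_replace("[ \n\r\t\v\f]*$", "\r\n", $lines[$i]);
}
echo htmlspecialchars(join("",$lines));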
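A possible preg_replace() version of the cleanup in Code 6.6 is sketched below. It runs on a few in-memory sample lines (invented for this example) rather than on source.php, and it passes a limit of 1 to the second call so that only the first match of the trailing-whitespace pattern is replaced.

```php
<?php
// Sample source lines standing in for file('source.php'); content is illustrative
$lines = array(
    "\$a = 1;   // set a\n",
    "\$b = 2; # set b\n",
    "echo \$a + \$b;   \n",
);

foreach ($lines as $i => $line) {
    // Remove everything from "//" or "#" to the end of the line
    $line = preg_replace("/(\/\/|#).*$/", "", $line);
    // Replace trailing whitespace (including the old line ending) with CRLF;
    // the limit of 1 ensures that only the first match is replaced
    $line = preg_replace("/\s*$/", "\r\n", $line, 1);
    $lines[$i] = $line;
}
echo htmlspecialchars(join("", $lines));
```

Like the original, this approach is deliberately naive: a "//" or "#" inside a string literal, such as a URL, would also be treated as the start of a comment.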
Function prototype: mixed preg_replace (mixed $pattern, mixed $replacement, mixed $subject [, int $limit])
Code 6.7 Array replacement
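$string = "Name: {Name}<br>\nEmail: {Email}<br>\nAddress: {Address}";
$patterns = array(
    "/\{Address\}/",
    "/\{Name\}/"
);
$replacements = array(
    "No.5, Wilson St., New York, U.S.A",
    "Thomas Ching"
);
//{Email} has no matching pattern, so it is left unchanged
print preg_replace($patterns, $replacements, $string);
Displayed in a browser, the output is as follows.
Name: Thomas Ching
Email: {Email}
Address: No.5, Wilson St., New York, U.S.A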
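When preg_replace() receives arrays, each pattern is paired with the replacement at the same position, and the patterns are applied one after another. The sketch below builds both arrays from a single placeholder-to-value map so the pairs cannot drift apart. The template, the map, and its values are assumptions made up for the example.

```php
<?php
$template = "Name: {Name}, City: {City}";

// Placeholder => value pairs; the data is invented for this sketch
$values = array(
    "Name" => "Thomas Ching",
    "City" => "New York",
);

$patterns = array();
$replacements = array();
foreach ($values as $key => $value) {
    // preg_quote() escapes regex metacharacters; "/" is passed as the delimiter
    $patterns[] = "/".preg_quote("{".$key."}", "/")."/";
    $replacements[] = $value;
}

// Patterns and replacements are paired by position in the two arrays
$result = preg_replace($patterns, $replacements, $template);
echo $result."\n";

// The optional $limit caps the number of replacements per pattern
$limited = preg_replace("/o/", "0", "foo boo", 1);
echo $limited."\n";
```

When the placeholders are fixed strings like these, str_replace() or strtr() can do the same job without regular expressions; preg_replace() pays off once the placeholders themselves need pattern matching.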
$html_body = “<HTML>