
PHP Regular Expressions: How to match all links in HTML

王林
Release: 2023-06-22 14:08:01

In web development, we often need to work with the links in an HTML page. How can we use PHP regular expressions to match all of them? Let's find out.

Links in HTML pages are generally implemented with the <a> tag, so we can match links based on this tag. First, we need to get the source code of the HTML page with PHP's file_get_contents() function, for example:

$html = file_get_contents('http://www.example.com');
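
file_get_contents() returns false when the page cannot be read, so it can be worth checking the result before matching. The following is a minimal sketch; the helper name fetchHtml() is hypothetical and not part of PHP or of the original article:

```php
<?php
// Hypothetical helper: returns the page source, or null when the read fails.
function fetchHtml(string $source): ?string
{
    // @ suppresses the warning; failure is reported through the return value.
    $html = @file_get_contents($source);

    return $html === false ? null : $html;
}
```

A caller could then skip the matching step when fetchHtml() returns null instead of passing false to the regular expression functions.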

Next, we can use regular expressions to match all links. The following is a simple regular expression that matches links:

$pattern = '/<a href="(.*?)">(.*?)<\/a>/';

In the regular expression, <a href=" matches a link tag that starts with <a followed directly by the href attribute. "(.*?)" matches the link address; the parentheses make it a capturing group, which means that we can read the matched value from the $matches variable later. >(.*?)<\/a> matches the link text, which is also a capturing group. The lazy quantifier .*? matches as few characters as possible, so each group stops at the first closing quote or closing tag.
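
As a quick check of the pattern described above, the sketch below applies it to a single hard-coded link with preg_match(). The sample markup and the variable names $sample and $m are assumptions made for illustration:

```php
<?php
// Sample markup invented for illustration; not taken from a real page.
$sample = '<p>Visit <a href="http://www.example.com">Example</a> today.</p>';

// Group 1 captures the href value, group 2 captures the link text.
$pattern = '/<a href="(.*?)">(.*?)<\/a>/';

if (preg_match($pattern, $sample, $m) === 1) {
    echo 'Address: ' . $m[1] . "\n"; // Address: http://www.example.com
    echo 'Text: ' . $m[2] . "\n";    // Text: Example
}
```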

Next, we can use the preg_match_all() function to apply the regular expression to the HTML page source code to match all links:

preg_match_all($pattern, $html, $matches);

The function fills the array $matches, which is passed as the third argument, and returns the number of full matches. $matches[0] contains the complete string of each matched link, $matches[1] corresponds to capture group 1, which is the link address, and $matches[2] corresponds to capture group 2, which is the link text.
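
To make the layout of $matches concrete, here is a small sketch with two made-up links. It also shows the PREG_SET_ORDER flag, a standard preg_match_all() option that groups the results per link instead of per capture group, which can be more convenient when the address and the text are needed together:

```php
<?php
// Two sample links, invented for illustration.
$html = '<a href="http://a.example/">First</a> <a href="http://b.example/">Second</a>';
$pattern = '/<a href="(.*?)">(.*?)<\/a>/';

// Default order (PREG_PATTERN_ORDER): one sub-array per capture group.
$count = preg_match_all($pattern, $html, $matches);
// $count is 2
// $matches[1] is ['http://a.example/', 'http://b.example/']
// $matches[2] is ['First', 'Second']

// PREG_SET_ORDER: one sub-array per matched link.
preg_match_all($pattern, $html, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    echo $set[2] . ' -> ' . $set[1] . "\n";
}
```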

Finally, we can loop through the $matches[1] array, which is the link address array, to get the addresses of all links:

foreach ($matches[1] as $link) {
    echo $link . "\n";
}

The complete code is as follows:

// Fetch the page source.
$html = file_get_contents('http://www.example.com');

// Group 1 is the link address, group 2 is the link text.
$pattern = '/<a href="(.*?)">(.*?)<\/a>/';
preg_match_all($pattern, $html, $matches);

// Print every link address on its own line.
foreach ($matches[1] as $link) {
    echo $link . "\n";
}

Note that this regular expression can only match the basic link format, for example:

<a href="http://www.example.com">Example</a>

If the link contains other attributes, or the tag is not written in exactly this form, it will not be matched. In practical applications, the regular expression can be modified as needed to handle other link formats.
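
As one possible adaptation, the sketch below loosens the pattern so that it also accepts extra attributes, single or double quotes, upper-case tag names, and link text that spans several lines. The pattern and the sample markup are assumptions for illustration, not the article's original code. It is still only an approximation: it does not handle unquoted attribute values, and it can be misled by attributes whose names end in href, such as data-href.

```php
<?php
// Sample markup invented for illustration.
$html = '<A class="nav" HREF=\'/about\' target="_blank">About <b>us</b></A>'
      . "\n"
      . '<a href="http://www.example.com">Example</a>';

// [^>]*?      skips attributes that appear before href
// (["\'])     group 1: the opening quote, matched again by \1
// (.*?)       group 2: the link address
// (.*?)<\/a>  group 3: the link text
// i = case-insensitive, s = let . match newlines inside the link text
$pattern = '/<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1[^>]*>(.*?)<\/a>/is';

preg_match_all($pattern, $html, $matches);

foreach ($matches[2] as $i => $address) {
    echo $address . ' (' . $matches[3][$i] . ")\n";
}
```

Note that the address moves from group 1 to group 2 here, because group 1 is now used to remember which quote character opened the attribute value.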

In summary, to match the links in an HTML page with PHP regular expressions, use the file_get_contents() function to obtain the page source code, apply a suitable regular expression with the preg_match_all() function, and finally loop over the matching results.
