PHP Regular Expression: How to match all meta tags in HTML

Jun 22, 2023 pm 10:21 PM

In web development, the meta tag is an important element. It supplies metadata about a page, such as its character encoding, description, and keywords. When processing HTML pages, you sometimes need a regular expression to pick the meta tags out of the markup. This article shows how to use PHP regular expressions to match all meta tags in an HTML page.

First, we need to know how meta tags are usually written in an HTML page. A typical document head looks like this:

<meta charset="UTF-8">
<meta name="description" content="This is the page description">
<meta name="keywords" content="These are the page keywords">
<title>This is the page title</title>

Given this format, a regular expression can match the tags. First fetch the source of the HTML page, then use PHP's preg_match_all() function to match the meta tags in it, as shown below:

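// Fetch the page source (fetching a URL requires allow_url_fopen to be enabled).
$html = file_get_contents("http://www.example.com");
// Collect every <meta ...> tag, ignoring case.
preg_match_all('/<meta.*?>/i', $html, $matches);
print_r($matches);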

The code first calls file_get_contents() to fetch the page source, then calls preg_match_all() to find every meta tag in it and stores the results in $matches. The regular expression is /<meta.*?>/i: <meta matches the start of the tag, .*? lazily (non-greedily) matches any characters other than a newline, and > matches the end of the tag, so each match stops at the first > it reaches. The i modifier makes the match case-insensitive.

For a page whose head contains the tags shown earlier, the output might look like this:

Array
(
    [0] => Array
        (
            [0] => <meta charset="UTF-8">
            [1] => <meta name="description" content="This is the page description">
            [2] => <meta name="keywords" content="These are the page keywords">
        )

)

As the output shows, preg_match_all() matched every meta tag in the page and saved the matches in the $matches array.
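The simple pattern has two gaps worth knowing about: . does not match a newline by default, so a tag whose attributes wrap across lines is skipped, and <meta is also the prefix of longer tag names. The following is a minimal sketch of a stricter variant, not a definitive solution. The sample markup, including the <metadata> element, is made up for illustration, and a > inside a quoted attribute value would still cut a match short.

```php
<?php
// Hypothetical sample markup: one tag wrapped across lines, one look-alike tag.
$html = <<<HTML
<meta charset="UTF-8">
<meta name="description"
      content="Page description">
<metadata id="not-a-meta-tag"></metadata>
HTML;

// \b stops "<meta" from matching longer names such as <metadata>;
// [^>]* consumes everything up to the closing ">", newlines included.
preg_match_all('/<meta\b[^>]*>/i', $html, $matches);

print_r($matches[0]);
```

With this sample, $matches[0] holds the two meta tags, including the one that spans two lines, and the <metadata> element is left out.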

If we also need specific attribute values from a meta tag, such as charset, name, or content, we can add a capture group for them to the regular expression, as shown below:

$html = file_get_contents("http://www.example.com");
preg_match_all('/<meta\s+.*?charset="(\S+)".*?>/i', $html, $matches);
print_r($matches);

In the above code, \s+ matches the whitespace between the tag name and its attributes, and charset="(\S+)" matches the charset attribute and captures its value. \S matches any character that is not whitespace, and + requires one or more of them. After running the above code, the output may look like the following:

Array
(
    [0] => Array
        (
            [0] => <meta charset="UTF-8">
        )

    [1] => Array
        (
            [0] => UTF-8
        )

)

The results show that the charset attribute was matched: $matches[0] holds the full tag, and $matches[1] holds the captured attribute value.
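The same capture-group idea extends to the name and content attributes. The sketch below builds a name-to-content map. It assumes that name comes directly before content and that both values use double quotes, which real pages do not guarantee. The sample markup and the variable names $sets and $meta are illustrative only.

```php
<?php
// Hypothetical sample markup; attribute order and double quotes are assumed.
$html = <<<HTML
<meta charset="UTF-8">
<meta name="description" content="Page description">
<meta name="keywords" content="php, regex, meta">
HTML;

// PREG_SET_ORDER groups the captures per tag: [full match, name, content].
preg_match_all(
    '/<meta\s+name="([^"]*)"\s+content="([^"]*)"[^>]*>/i',
    $html,
    $sets,
    PREG_SET_ORDER
);

// Build an associative array keyed by the name attribute.
$meta = [];
foreach ($sets as $set) {
    $meta[$set[1]] = $set[2];
}

print_r($meta);
```

Using [^"]* rather than \S+ for the values lets a captured value contain spaces while still stopping at the closing quote.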

In short, PHP's regular expressions let us flexibly match elements in an HTML page, including meta tags. Regular expressions are convenient but limited: they cannot reliably handle nested tags or irregular markup, so use them with care when processing HTML.
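Where those limitations matter, an HTML parser is a common alternative. The following is a rough sketch using PHP's DOMDocument class, assuming the DOM extension is available. It is one possible approach, not part of the regular expression technique described above. The sample markup and the $meta variable are illustrative.

```php
<?php
// Hypothetical sample markup with a tag wrapped across lines.
$html = <<<HTML
<html><head>
<meta charset="UTF-8">
<meta name="description"
      content="Page description">
<title>Page title</title>
</head><body></body></html>
HTML;

$doc = new DOMDocument();
// Real-world HTML is often malformed; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$meta = [];
foreach ($doc->getElementsByTagName('meta') as $tag) {
    // getAttribute() returns an empty string when the attribute is absent.
    $name = $tag->getAttribute('name');
    if ($name !== '') {
        $meta[$name] = $tag->getAttribute('content');
    }
}

print_r($meta);
```

The parser handles attribute order, quoting style, and line breaks itself, so the code only has to ask each meta element for the attributes it needs.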
