PHP Regular Expressions: How to match all headings and paragraphs in HTML

王林
Release: 2023-06-22 19:22:02
<p>Modern websites are built from HTML tags, which mark up elements such as headings and paragraphs. As a PHP developer, you may need to extract all heading and paragraph tags from an HTML file for further processing. Regular expressions are one way to do this. This article shows how to use a PHP regular expression to match all headings and paragraphs in HTML.</p>
<p>First, we need to know which tags represent headings and paragraphs. HTML defines six heading tags, h1 through h6, and uses the p tag for paragraphs. This article focuses only on these tags.</p>
<p>Now let's see how to match them. The following simple PHP script reads an HTML file and uses a regular expression to match all headings and paragraphs in it:</p>
<div class="code" style="position:relative; padding:0px; margin:0px;"><pre class='brush:php;toolbar:false;'><?php
// Read the HTML file
$html = file_get_contents('example.html');

// Regular expression that matches all headings and paragraphs
$pattern = '/<(h\d|p)\b[^>]*>(.*?)<\/\1>/si';
preg_match_all($pattern, $html, $matches);

// Display the full matches
print_r($matches[0]);
?></pre></div>
<p>The regular expression in this snippet, <code>/<(h\d|p)\b[^>]*>(.*?)<\/\1>/si</code>, works as follows:</p>
<ul>
<li><code><</code> and <code>></code> are literal characters that match the start and end of an HTML tag.</li>
<li><code>h\d|p</code> matches the tag name: a heading (h1, h2, h3, etc.) or a paragraph (p). The surrounding parentheses capture the name as group 1.</li>
<li><code>\b</code> is a word boundary. It requires the tag name to end there, so that <code>p</code> does not also match tags such as <code><pre></code>.</li>
<li><code>[^>]*</code> matches any attributes, that is, all characters inside the opening tag up to but not including <code>></code>, which closes the opening tag.</li>
<li><code>(.*?)</code> uses non-greedy matching to capture the text between the opening and closing tags.</li>
<li><code><\/\1></code> matches the closing tag that corresponds to the opening tag. <code>\1</code> is a backreference to the tag name captured by group 1 (i.e. <code>h\d|p</code>), and the slash is escaped because <code>/</code> is the pattern delimiter.</li>
</ul>
<p>The pattern uses two modifiers, <code>s</code> and <code>i</code>. The <code>s</code> modifier turns on "dot-all" mode, so that the <code>.</code> metacharacter matches every character, including newlines. The <code>i</code> modifier turns on case-insensitive mode, so that the case of the tag name does not affect the result.</p>
<p>When the script finishes running, it prints all matching heading and paragraph tags. The result will look something like this:</p>
<div class="code" style="position:relative; padding:0px; margin:0px;"><pre class='brush:php;toolbar:false;'>Array
(
    [0] => <h1>PHP Regular Expressions</h1>
    [1] => <p>HTML tags are often used in modern websites, and these tags contain various elements, such as headings and paragraphs.</p>
    [2] => <h2>Heading 2</h2>
    [3] => <p>Paragraph 2</p>
    [4] => <h3>Heading 3</h3>
    [5] => <p>Paragraph 3</p>
)</pre></div>
<p>This result shows that the regular expression matched all heading and paragraph tags in the HTML. The same approach applies to other scenarios, such as matching links, images and tables in HTML. Hopefully this article has helped you better understand how to use PHP regular expressions to match elements in HTML.</p>
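<p>As a rough, self-contained sketch of the technique described above, the script below runs a slightly stricter variant of the pattern against an inline sample instead of a file. The sample markup and the <code>$found</code> array are hypothetical and exist only for illustration. The variant uses <code>h[1-6]</code> so that only real heading levels match, adds <code>\b</code> so that <code>p</code> cannot match <code><pre></code>, and passes <code>PREG_SET_ORDER</code> so that each match carries its own tag name and inner text.</p>

```php
<?php
// Hypothetical inline sample; the article itself reads example.html.
$html = <<<HTML
<h1 class="title">PHP Regular Expressions</h1>
<p>First
paragraph.</p>
<H2>Heading 2</H2>
<pre>not a paragraph</pre>
HTML;

// h[1-6] limits headings to h1..h6, \b stops "p" from matching <pre>,
// and \1 requires the closing tag to repeat the captured tag name.
$pattern = '/<(h[1-6]|p)\b[^>]*>(.*?)<\/\1>/si';

// PREG_SET_ORDER groups the captures per match instead of per group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$found = [];
foreach ($matches as $m) {
    // $m[1] is the tag name, $m[2] is the text between the tags.
    $found[] = [strtolower($m[1]), trim($m[2])];
}

foreach ($found as [$tag, $text]) {
    echo $tag, ': ', $text, PHP_EOL;
}
```

<p>On this sample the loop reports three matches (the h1, the multi-line p, and the uppercase H2) and skips the <code><pre></code> block. A pattern like this assumes reasonably regular markup; it does not account for a tag nested inside another tag of the same name.</p>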

source:php.cn