Introduction:
Selecting elements by class with PHP's DOM extension is less direct than selecting them by tag name or ID. This article shows how to extract the text of elements that carry a given class and collect it into arrays.
Scenario:
Consider the following HTML content:
<p class="Heading1-P">
    <span class="Heading1-H">Chapter 1</span>
</p>
<p class="Normal-P">
    <span class="Normal-H">This is chapter 1</span>
</p>
The goal is to collect the text of every element with the "Heading1-H" class into the $heading array and the text of every element with the "Normal-H" class into the $content array. Assuming the document repeats the pattern above for three chapters, the result is:
$heading = ['Chapter 1', 'Chapter 2', 'Chapter 3'];
$content = ['This is chapter 1', 'This is chapter 2', 'This is chapter 3'];
Solution Using DOMDocument and DOMXPath:
DOMDocument parses the markup, and DOMXPath selects the elements whose class attribute matches the one we want.
// Load the HTML into a DOMDocument ($html holds the markup shown above)
$dom = new DOMDocument();
$dom->loadHTML($html);

// Create a DOMXPath object for querying the document
$xpath = new DOMXPath($dom);

// Return the trimmed text of every element whose class attribute equals $class
function textByClass(DOMXPath $xpath, string $class): array
{
    $texts = [];
    foreach ($xpath->query("//*[@class='$class']") as $element) {
        $texts[] = trim($element->textContent);
    }
    return $texts;
}

// Run the query once per class and store each result in its own array
$heading = textByClass($xpath, 'Heading1-H');
$content = textByClass($xpath, 'Normal-H');

var_dump($heading, $content);
This solution effectively parses the HTML and returns the desired arrays.
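The query above compares the entire class attribute, so it would miss an element written as class="Normal-H extra". The following is a minimal, self-contained sketch of one way to handle that case. The helper name extractTextByClass, the three-chapter sample input, and the extra class name are assumptions made for illustration and do not come from the original solution. The sketch also assumes the class name contains no quote characters, since it is inserted directly into the XPath string.

```php
<?php
// Hypothetical sample input: the article's markup repeated for three chapters.
// The "extra" class is added only to demonstrate multi-class matching.
$html = '';
foreach ([1, 2, 3] as $n) {
    $html .= "<p class=\"Heading1-P\"><span class=\"Heading1-H\">Chapter $n</span></p>"
           . "<p class=\"Normal-P\"><span class=\"Normal-H extra\">This is chapter $n</span></p>";
}

/**
 * Hypothetical helper: return the trimmed text of every element that
 * carries $class as one of its space-separated class names.
 * Assumes $class contains no quote characters.
 */
function extractTextByClass(string $html, string $class): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Pad the normalized class attribute with spaces so that whole class
    // names are compared, e.g. "Normal-H" does not match "Normal-H2".
    $query = sprintf(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $element) {
        $texts[] = trim($element->textContent);
    }
    return $texts;
}

$heading = extractTextByClass($html, 'Heading1-H');
$content = extractTextByClass($html, 'Normal-H');

print_r($heading);
print_r($content);
```

Wrapping the logic in a function keeps the parsing and querying in one place, so extracting another class only requires one more call.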
Note:
jQuery is not suited to this task: it is a client-side JavaScript library, whereas PHP's DOM extension parses and queries HTML on the server in a structured, programmatic way.