Problem:
You have HTML in which headings and body text sit in elements marked with specific classes ("Heading1-H" for headings, "Normal-H" for text). The goal is to extract the text of those elements into two separate arrays: $heading and $content.
Solution:
Using PHP DOM and XPath
PHP's DOM (Document Object Model) extension and XPath (XML Path Language) handle this task well: load the HTML into a DOMDocument, then query it for the elements whose class attribute matches. Here's the implementation:
<?php
$test = <<<HTML
<p class="Heading1-P">
    <span class="Heading1-H">Chapter 1</span>
</p>
<p class="Normal-P">
    <span class="Normal-H">This is chapter 1</span>
</p>
<p class="Heading1-P">
    <span class="Heading1-H">Chapter 2</span>
</p>
<p class="Normal-P">
    <span class="Normal-H">This is chapter 2</span>
</p>
<p class="Heading1-P">
    <span class="Heading1-H">Chapter 3</span>
</p>
<p class="Normal-P">
    <span class="Normal-H">This is chapter 3</span>
</p>
HTML;

// Parse the markup and prepare an XPath evaluator for it.
$dom = new DOMDocument();
$dom->loadHTML($test);
$xpath = new DOMXPath($dom);

$heading = parseToArray($xpath, 'Heading1-H');
$content = parseToArray($xpath, 'Normal-H');

var_dump($heading);
echo "<br/>";
var_dump($content);
echo "<br/>";

// Collects the value of every child node of each element whose class
// attribute is exactly $class.
function parseToArray(DOMXPath $xpath, string $class): array
{
    $xpathquery = "//*[@class='$class']";
    $elements = $xpath->query($xpathquery);

    $resultarray = [];
    foreach ($elements as $element) {
        $nodes = $element->childNodes;
        foreach ($nodes as $node) {
            $resultarray[] = $node->nodeValue;
        }
    }

    return $resultarray;
}
Output:
array(3) {
  [0] => string(9) "Chapter 1"
  [1] => string(9) "Chapter 2"
  [2] => string(9) "Chapter 3"
}
<br/>
array(3) {
  [0] => string(17) "This is chapter 1"
  [1] => string(17) "This is chapter 2"
  [2] => string(17) "This is chapter 3"
}
<br/>
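The query `//*[@class='$class']` matches only elements whose class attribute is exactly the given name, and looping over childNodes produces one array entry per child node. If the real markup may carry several classes on one element or nested tags inside the span, a variant along the following lines might be more forgiving. This is a minimal sketch, not part of the original answer: the helper name extractTextByClass and the sample markup are hypothetical, and it assumes the class name contains no quotes or spaces.

```php
<?php
// Hypothetical helper (not from the original answer): matches a class token
// even when the element has several classes, and returns one string per element.
function extractTextByClass(DOMXPath $xpath, string $class): array
{
    // Pad the normalized class list with spaces so the token is matched whole.
    // Assumes $class contains no quotes or whitespace.
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";

    $result = [];
    foreach ($xpath->query($query) as $element) {
        // textContent concatenates the text of all descendant nodes.
        $result[] = trim($element->textContent);
    }

    return $result;
}

// Illustrative markup: an extra class on the heading and a nested <b> tag.
$html = <<<HTML
<p><span class="Heading1-H first">Chapter <b>1</b></span></p>
<p><span class="Normal-H">This is chapter 1</span></p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$heading = extractTextByClass($xpath, 'Heading1-H');
$content = extractTextByClass($xpath, 'Normal-H');

print_r($heading);
print_r($content);
```

With this sample input, each matched element contributes a single entry, so the heading span with the nested tag yields one string rather than one entry per child node.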