How to Ignore HTML Tags in preg_replace
In your code snippet, you attempt to use preg_replace to highlight searched keywords within HTML text. However, this approach can break the HTML structure when the keyword matches text inside the tags themselves, such as a tag name or an attribute value.
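To make the failure concrete, here is a minimal sketch. The input string and the `<b>` wrapper are made up for illustration and are not taken from the original question; the keyword is chosen so that it also appears as a tag name.

```php
<?php
// Hypothetical input: the keyword "span" is both visible text and a tag name.
$html   = '<p>This text has a <span>span</span> in it.</p>';
$search = 'span';

// Naive highlighting: every occurrence is wrapped, including those inside tags.
$broken = preg_replace(
    '/' . preg_quote($search, '/') . '/',
    '<b>$0</b>',
    $html
);

echo $broken, "\n";
// prints: <p>This text has a <<b>span</b>><b>span</b></<b>span</b>> in it.</p>
```

The opening and closing span tags are rewritten along with the visible text, so the result is no longer valid markup.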
Instead of regular expressions, use DOMDocument and XPath for this task. They operate on the parsed document, so only text nodes are searched and modified, while tag names and attributes are left alone. Consider the following approach:
Code Example:
    $str    = '...'; // HTML string
    $search = 'text that span';

    $doc = new DOMDocument;
    $doc->loadXML($str); // the input must be well-formed XML
    $xp = new DOMXPath($doc);

    $anchor = $doc->getElementsByTagName('body')->item(0);
    if (!$anchor) {
        throw new Exception('Anchor element not found.');
    }

    // Elements whose text contains the search string and that have
    // at least one child element which does not contain it.
    $r = $xp->query('//*[contains(., "'.$search.'")]/*[FALSE = contains(., "'.$search.'")]/..', $anchor);
    if (!$r) {
        throw new Exception('XPath failed.');
    }

    foreach ($r as $i => $node) {
        $textNodes = $xp->query('.//child::text()', $node);

        // TextRange is a helper class that is not defined in this snippet.
        // It must be convertible to a string for strpos() to work.
        $range  = new TextRange($textNodes);
        $ranges = []; // reset for each element so earlier matches are not wrapped twice

        while (FALSE !== $start = strpos($range, $search)) {
            $base     = $range->split($start);
            $range    = $base->split(strlen($search));
            $ranges[] = $base;
        }

        // Wrap every text node that belongs to a match in a highlighting span.
        foreach ($ranges as $range) {
            foreach ($range->getNodes() as $textNode) {
                $span = $doc->createElement('span');
                $span->setAttribute('class', 'search_hightlight');
                $textNode->parentNode->replaceChild($span, $textNode);
                $span->appendChild($textNode);
            }
        }
    }

    echo $doc->saveXML();
This approach allows you to effectively highlight search terms while disregarding HTML tags, preserving the structural integrity of your HTML content.
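The code above depends on a TextRange helper that is not shown. For the simpler case where each match lies entirely within one text node, the same idea can be sketched with built-in DOM calls only. The function name `highlightInTextNodes` and the `<root>` wrapper are assumptions made for this illustration. The sketch also assumes that the input is a well-formed XML fragment and that the mbstring extension is available. Unlike the TextRange approach, it does not find matches that straddle element boundaries.

```php
<?php
// Hypothetical helper (not part of the original answer): wraps each occurrence
// of $search that lies within a single text node in <span class="$class">.
function highlightInTextNodes(string $html, string $search, string $class = 'search_hightlight'): string
{
    if ($search === '') {
        return $html; // nothing to highlight
    }

    $doc = new DOMDocument();
    // Wrap the fragment in a made-up root element so it parses as one document.
    if (!$doc->loadXML('<root>' . $html . '</root>')) {
        throw new Exception('Input is not well-formed XML.');
    }
    $xp = new DOMXPath($doc);

    // Snapshot the text nodes first, because the loop below modifies the tree.
    $textNodes = iterator_to_array($xp->query('//text()'));
    $length    = mb_strlen($search, 'UTF-8');

    foreach ($textNodes as $node) {
        // splitText() counts characters, hence mb_strpos() rather than strpos().
        while (($pos = mb_strpos($node->nodeValue, $search, 0, 'UTF-8')) !== false) {
            $match = $node->splitText($pos);     // $match starts at the keyword
            $rest  = $match->splitText($length); // $rest is the text after it

            $span = $doc->createElement('span');
            $span->setAttribute('class', $class);
            $match->parentNode->replaceChild($span, $match);
            $span->appendChild($match);

            $node = $rest; // continue searching in the remainder
        }
    }

    // Serialize the children of the wrapper, leaving out the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}

$result = highlightInTextNodes(
    '<p>Some <b>text</b> with text in a <a href="text.html">link</a>.</p>',
    'text'
);
echo $result, "\n";
```

In this usage example the `href` attribute also contains the keyword, but it is left untouched because only text nodes are visited.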