Problem: Converting plain-text URLs into hyperlinks is a common task, but it gets harder when the same HTML also carries URLs inside tags, such as image source attributes or existing anchors. Here, the user wants to wrap text URLs in anchor tags without touching URLs embedded in image source attributes.
Solution:
The key to addressing this issue lies in using an XPath expression to select only those text nodes that contain URLs but are not descendants of anchor elements.
Here's a refined version of the XPath expression:
$xPath = new DOMXPath($dom);
$texts = $xPath->query(
    '/html/body//text()[
        not(ancestor::a) and (
        contains(.,"http://") or
        contains(.,"https://") or
        contains(.,"ftp://") )]'
);
This expression effectively excludes text nodes that are contained within anchor tags, ensuring that only plain text URLs are targeted for conversion.
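To make the selection concrete, here is a minimal sketch that runs the same query against a small document and prints the text nodes it picks up. The sample markup and the `$selected` variable are assumptions made for illustration only; they are not part of the original question.

```php
<?php
// Hypothetical sample markup: one plain-text URL, one URL that is
// already linked, and one URL inside an image src attribute.
$html = '<html><body><p>Visit http://example.com or '
      . '<a href="http://example.com/a">http://example.com/a</a> '
      . '<img src="http://example.com/foo.png"></p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xPath = new DOMXPath($dom);
$texts = $xPath->query(
    '/html/body//text()[
        not(ancestor::a) and (
        contains(.,"http://") or
        contains(.,"https://") or
        contains(.,"ftp://") )]'
);

// Collect the trimmed content of every selected text node.
$selected = [];
foreach ($texts as $text) {
    $selected[] = trim($text->data);
}

print_r($selected);
```

Only the text node holding the plain-text URL should be selected. The link text sits under an anchor and is filtered out by `not(ancestor::a)`, while the `src` value is an attribute rather than a text node, so `text()` never matches it.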
Replacing Text URLs without Affecting Image URLs:
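Image URLs need no special handling: the value of a src attribute is an attribute node, not a text node, so the query above never selects it. What remains is to rewrite the selected text nodes. Instead of splitting each text node apart, the code builds a document fragment with appendXML (a PHP-specific, non-standard DOM method) and replaces the entire text node with it.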
Here's the code that performs this task:
foreach ($texts as $text) {
    $fragment = $dom->createDocumentFragment();
    // Wrap every URL in an anchor; $1 is the captured URL.
    $fragment->appendXML(
        preg_replace(
            "~((?:http|https|ftp)://(?:\S*?\.\S*?))(?=\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)~i",
            '<a href="$1">$1</a>',
            $text->data
        )
    );
    // Swap the original text node for the fragment.
    $text->parentNode->replaceChild($fragment, $text);
}
In this code, the preg_replace function is used to search for URLs in the text node and replace them with their corresponding anchor tag versions.
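One caveat is that appendXML parses its argument as XML, so a text node containing a bare `&` (common in query strings) or `<` would be rejected unless it is escaped first. As a possible alternative, the sketch below builds the same kind of fragment with regular DOM methods, so no manual escaping is needed. The helper name `linkifyToFragment` is hypothetical, and the approach is a variation on the code above, not the original author's method; it reuses the same regular expression.

```php
<?php
// Hypothetical helper: turns one text string into a fragment of text
// nodes and <a> elements without going through appendXML().
function linkifyToFragment(DOMDocument $dom, string $text): DOMDocumentFragment
{
    $pattern = "~((?:http|https|ftp)://(?:\S*?\.\S*?))(?=\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)~i";

    // With PREG_SPLIT_DELIM_CAPTURE, odd indexes hold the captured URLs
    // and even indexes hold the text between them.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        $parts = [$text]; // regex error: fall back to the unchanged text
    }

    $fragment = $dom->createDocumentFragment();
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // A URL: wrap it in an anchor element.
            $a = $dom->createElement('a');
            $a->setAttribute('href', $part);
            $a->appendChild($dom->createTextNode($part));
            $fragment->appendChild($a);
        } elseif ($part !== '') {
            // Plain text between URLs.
            $fragment->appendChild($dom->createTextNode($part));
        }
    }
    return $fragment;
}

// Usage: a URL with an ampersand in its query string.
$dom = new DOMDocument();
$p = $dom->createElement('p');
$dom->appendChild($p);
$p->appendChild(linkifyToFragment($dom, 'See http://example.com/?a=1&b=2 now'));

echo $dom->saveXML($p), "\n";
```

Inside the loop from the article, the call would become `$text->parentNode->replaceChild(linkifyToFragment($dom, $text->data), $text);`. The trade-off is a few more lines in exchange for not depending on the text being well-formed XML.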
Example:
Consider the following HTML:
<code class="html"><html>
<body>
<p>
    This is a text with a <a href="http://example.com/1">link</a>
    and another <a href="http://example.com/2">http://example.com/2</a>
    and also another http://example.com with the latter being the
    only one that should be replaced. There is also images in this
    text, like <img src="http://example.com/foo"/> but these should
    not be replaced either. In fact, only URLs in text that is no
    a descendant of an anchor element should be converted to a link.
</p>
</body>
</html></code>
Applying the above solution will convert the text URLs to anchor tags while leaving the image URL untouched, producing the following output:
<code class="html"><html><body>
<p>
    This is a text with a <a href="http://example.com/1">link</a>
    and another <a href="http://example.com/2">http://example.com/2</a>
    and also another <a href="http://example.com">http://example.com</a> with the latter being the
    only one that should be replaced. There is also images in this
    text, like <img src="http://example.com/foo"/> but these should
    not be replaced either. In fact, only URLs in text that is no
    a descendant of an anchor element should be converted to a link.
</p>
</body></html></code>
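For readers who want to reproduce the result, the following sketch puts the pieces together: it loads the markup, runs the query, replaces the matching text nodes, and serializes the document. The shortened sample markup and the use of loadHTML and saveHTML are assumptions, since the article does not show how the document is loaded or written out.

```php
<?php
// Shortened, hypothetical version of the example markup.
$html = '<html><body><p>A <a href="http://example.com/1">link</a>, '
      . 'plain text http://example.com and an image '
      . '<img src="http://example.com/foo"/></p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Select text nodes that contain a URL and are not inside an anchor.
$xPath = new DOMXPath($dom);
$texts = $xPath->query(
    '/html/body//text()[
        not(ancestor::a) and (
        contains(.,"http://") or
        contains(.,"https://") or
        contains(.,"ftp://") )]'
);

foreach ($texts as $text) {
    $fragment = $dom->createDocumentFragment();
    // Wrap each URL in an anchor; $1 is the captured URL.
    $fragment->appendXML(
        preg_replace(
            "~((?:http|https|ftp)://(?:\S*?\.\S*?))(?=\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)~i",
            '<a href="$1">$1</a>',
            $text->data
        )
    );
    $text->parentNode->replaceChild($fragment, $text);
}

echo $dom->saveHTML();
```

After the loop the document should contain two anchors, the original one and a new one around `http://example.com`, while the image `src` attribute is unchanged.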