Use regular expressions to extract the href attribute value of the anchor link
To extract the href attribute value from an HTML anchor link, you can use a regular expression that places the value in a capturing group. The patterns below cover each case raised in the question.
The regex pattern "@(<a.>?>.?)" you provided identifies anchor links, but it does not capture the href value. To achieve this you need a more specific pattern:
<code><a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1</code>
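This pattern breaks down as follows:
<code><a</code> matches the start of the anchor tag.
<code>\s+(?:[^>]*?\s+)?</code> matches the whitespace after the tag name and, in a non-capturing group, any optional attributes that come before href.
<code>href=</code> matches the href attribute name and the equals sign.
<code>(["'])(.*?)\1</code> captures the opening quote (group 1) and the href value (group 2). The backreference <code>\1</code> requires the closing quote to be the same character as the opening one.
Filter valid URLs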
To keep only valid URLs and discard the rest (URLs with neither "?" nor "=" characters), you can use the following regular expression:
<code>page\.php\?id\=.*</code>
This pattern matches any string that contains page.php?id= followed by zero or more characters, so URLs without that query string are rejected.
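As a minimal sketch of how such a filter might be applied, the snippet below runs the pattern over an array of URL strings. The helper name filterValidUrls and the sample URLs are assumptions made for illustration, and the redundant backslash before "=" is dropped, which does not change what the pattern matches.

```javascript
// Pattern from the text, written without the unnecessary escape before "=".
const queryPattern = /page\.php\?id=.*/;

// Hypothetical helper: keep only the URLs that contain "page.php?id=".
function filterValidUrls(urls) {
  // test() returns true when the pattern occurs anywhere in the string.
  return urls.filter((url) => queryPattern.test(url));
}

// Sample data invented for illustration.
const sampleUrls = [
  "www.example.com/page.php?id=xxxx&name=yyyy",
  "www.example.com/about.html",
  "www.example.com/page.php",
];

// Only the first sample URL passes the filter.
console.log(filterValidUrls(sampleUrls));
```

The pattern is not anchored, so it accepts the query string wherever it appears in the URL. Add ^ or $ if the position matters.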
Extract href value from a list of links
You have stated that you no longer need to parse anchor tags, and that you now have a list of links in the format href="abcdef". To extract the href value from this list you can use:
<code>href=(['"])(.*?)\1</code>
This pattern captures the href value whether it is enclosed in double or single quotes.
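A short sketch of this simpler pattern in use follows. It assumes the pattern ends with the backreference \1 so that the closing quote must match the opening one. The function name extractHref and the input strings are hypothetical.

```javascript
// Group 1 is the quote character, group 2 is the href value.
const hrefPattern = /href=(['"])(.*?)\1/;

// Hypothetical helper: return the href value, or null when nothing matches.
function extractHref(text) {
  const match = hrefPattern.exec(text);
  return match ? match[2] : null;
}

console.log(extractHref('href="abcdef"')); // abcdef
console.log(extractHref("href='abcdef'")); // abcdef
console.log(extractHref("no link here")); // null
```

Because the closing quote is tied to the opening one, a value such as href='it"s' is captured whole instead of being cut off at the inner double quote.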
JavaScript code snippet
To demonstrate how to use these regular expression patterns in JavaScript, here is a code snippet:
<code class="language-javascript">// Group 1 captures the quote character; group 2 captures the href value.
const pattern = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;
const linkText = '<a href="www.example.com/page.php?id=xxxx&name=yyyy"></a>';
const match = pattern.exec(linkText);

if (match) {
  console.log(match[2]); // Output: www.example.com/page.php?id=xxxx&name=yyyy
}</code>
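The snippet above reads a single link. When the input contains several anchors, one possible approach is to add the global flag and iterate with String.prototype.matchAll, as in the sketch below. The function name extractAllHrefs and the sample markup are assumptions for illustration, and the pattern is assumed to end with the backreference \1.

```javascript
// Same anchor pattern, with the g flag that matchAll requires.
const anchorPattern = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/g;

// Hypothetical helper: collect every href value found in an HTML string.
function extractAllHrefs(html) {
  // matchAll yields one match array per anchor; index 2 holds the href value.
  return Array.from(html.matchAll(anchorPattern), (m) => m[2]);
}

// Sample markup invented for illustration.
const sampleHtml =
  '<a href="www.example.com/page.php?id=1">One</a> ' +
  "<a class='nav' href='www.example.com/page.php?id=2'>Two</a>";

// Logs both href values in document order.
console.log(extractAllHrefs(sampleHtml));
```

A regular expression like this handles simple, well-formed markup. For arbitrary HTML, a real parser such as the browser's DOMParser is usually the more robust choice.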