How to Extract href Attribute Values from Anchor Links Using Regular Expressions?

Barbara Streisand
Release: 2025-01-10 10:39:41

Using a regular expression to extract the href value of an anchor link

To extract the href attribute value from an HTML anchor link, you can use a regular expression that captures the quoted value following href=.

The pattern from the question, "@(<a.>?>.?)", matches anchor links as a whole but does not capture the href value. To capture it, you need a more specific pattern:

<code><a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1</code>

The pattern breaks down as follows:

  • <a matches the start of the anchor tag.
  • \s+(?:[^>]*?\s+)? matches the whitespace after the tag name and, optionally, any attributes that precede href (a non-capturing group).
  • href= matches the attribute name and the equals sign.
  • (["'])(.*?)\1 captures the opening quote in group 1 and the href value in group 2; the backreference \1 requires the closing quote to be the same character as the opening one.

Filter valid URLs

To discard invalid URLs (here, those containing neither "?" nor "="), you can test each extracted value against the following regular expression and keep only the matches:

<code>page\.php\?id\=.*</code>

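This pattern matches strings that contain page.php?id= followed by any characters, so URLs without that query string are rejected. The backslash before = is redundant, because = has no special meaning on its own in a regular expression.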
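
As a rough sketch of that filtering step, the pattern can be applied to a list of already extracted URLs. The helper name keepQueryUrls and the sample list are assumptions for illustration; the pattern is written without the redundant backslash before =.

```javascript
// Sketch only: keepQueryUrls and candidates are hypothetical names for illustration.
const QUERY_URL = /page\.php\?id=.*/;

function keepQueryUrls(urls) {
  // test() is true when the pattern occurs anywhere in the string.
  return urls.filter((url) => QUERY_URL.test(url));
}

const candidates = [
  "www.example.com/page.php?id=xxxx&name=yyyy",
  "www.example.com/about.html",
  "www.example.com/page.php",
];

console.log(keepQueryUrls(candidates)); // only the first entry passes the filter
```

The regular expression deliberately has no g flag: a global regex keeps state in lastIndex between test() calls, which can produce surprising results inside filter.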

Extract href values from a list of links

If you no longer need to parse whole anchor tags and instead have a list of strings in the format href="abcdef", you can extract the href value from each item with:

<code>href=(['"])(.*?)\1</code>

This pattern captures the href value whether it is enclosed in double or single quotes.
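
One way to apply this shorter pattern to such a list is sketched below. The helper name hrefValues and the sample items are assumptions for illustration; items that do not contain a quoted href are simply dropped.

```javascript
// Sketch only: hrefValues and items are hypothetical names for illustration.
const HREF_ONLY = /href=(["'])(.*?)\1/;

function hrefValues(list) {
  return list
    .map((item) => HREF_ONLY.exec(item)) // null when the item has no quoted href
    .filter((m) => m !== null)
    .map((m) => m[2]); // group 2 holds the value between the quotes
}

const items = ['href="abcdef"', "href='page.php?id=3'", "not a link"];

console.log(hrefValues(items)); // logs the two quoted values, skipping the third item
```

Since \1 must repeat the opening quote, a value such as href="it's" is still captured whole: the single quote inside it does not end the match.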

JavaScript code snippet

To demonstrate how to use these regular expression patterns in JavaScript, here is a code snippet:

<code class="language-javascript">// Group 1 is the quote character; group 2 is the href value.
// The trailing \1 makes the lazy (.*?) run up to the matching closing quote.
const pattern = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;
const linkText = '<a href="www.example.com/page.php?id=xxxx&name=yyyy"></a>';
const match = pattern.exec(linkText);
if (match) {
  console.log(match[2]); // Output: www.example.com/page.php?id=xxxx&name=yyyy
}</code>


source:php.cn