Golang html.Parse rewrites href query string to contain &amp;

王林
Release: 2024-02-09 23:42:08

When an HTML document is parsed with html.Parse and written back out with html.Render, a literal & in the query string of an href attribute comes out as &amp;. The question below asks why this happens and whether it can be avoided; the answer explains that the escaped form is what valid HTML requires and that browsers still resolve the link to the original URL.

Question content

I have the following code:

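package main

import (
    "log"
    "os"
    "strings"

    "golang.org/x/net/html"
)

func main() {
    myHTMLDocument := `<!doctype html>
<html>
<head>
</head>
<body>
    <a href="http://www.example.com/input?foo=bar&baz=quux">WTF</a>
</body>
</html>`

    // Parse the document into a node tree.
    doc, err := html.Parse(strings.NewReader(myHTMLDocument))
    if err != nil {
        log.Fatal(err)
    }

    // Serialize the tree back to HTML on standard output.
    if err := html.Render(os.Stdout, doc); err != nil {
        log.Fatal(err)
    }
}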

The html.Render function produces the following output:

<!DOCTYPE html><html><head>

</head>
<body>
    <a href="http://www.example.com/input?foo=bar&amp;baz=quux">WTF</a>

</body></html>

Why does it rewrite the query string and convert & to &amp; (between bar and baz)?

Is there a way to avoid this behavior?

I'm trying to do a template conversion, but I don't want it to break my URLs.

Solution

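The html package wants to generate valid HTML. html.Parse decodes the attribute value to a literal &, and html.Render escapes it as &amp; on output, because the HTML specification states that an ampersand in an href attribute must be encoded.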

https://www.w3.org/TR/xhtml1/guidelines.html#C_12

In both SGML and XML, the ampersand character ("&") declares the beginning of an entity reference (for example, &reg; for the registered trademark symbol "®"). Unfortunately, many HTML user agents have silently ignored incorrect usage of the ampersand character in HTML documents, treating ampersands that do not look like entity references as literal ampersands. XML-based user agents will not tolerate this incorrect usage, and any document that uses an ampersand incorrectly will not be "valid" and consequently will not conform to this specification. To ensure that documents are compatible with both historical HTML user agents and XML-based user agents, ampersands used in a document that are to be treated as literal characters must themselves be expressed as an entity reference (such as "&amp;"). For example, when the href attribute of the a element refers to a CGI script that takes parameters, it must be expressed as http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user rather than as http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user.

In this case, Go is actually making your HTML better and valid.
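
The escaping rule can be illustrated with a minimal sketch. It assumes the standard library's html.EscapeString as a stand-in for the escaping that an HTML serializer applies to attribute values; it does not call golang.org/x/net/html, and escapeAttr is a hypothetical helper name introduced only for this example.

```go
package main

import (
	"fmt"
	"html"
)

// escapeAttr is a hypothetical helper that mimics what an HTML serializer
// does to an attribute value before writing it out (assumption: the
// standard library's escaping is used here purely for illustration).
func escapeAttr(value string) string {
	return html.EscapeString(value)
}

func main() {
	href := "http://www.example.com/input?foo=bar&baz=quux"

	// The literal & in the query string is written as &amp; in the markup.
	fmt.Printf("<a href=\"%s\">WTF</a>\n", escapeAttr(href))
}
```

The URL held in memory is unchanged; only its serialized form inside the HTML attribute carries the &amp; reference.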

That said, the browser will unescape it, so if you click the link, the resulting URL is still correct (no &amp;, just &):

console.log(document.querySelector('a').href)
<a href="http://www.example.com/input?foo=bar&amp;baz=quux">WTF</a>
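
The same round trip can be sketched in Go. This assumes the standard library's html.UnescapeString and net/url as a rough analogue of what a browser does when it reads the attribute; queryFromAttr is a hypothetical helper name, not part of any library.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
)

// queryFromAttr is a hypothetical helper: it decodes the character
// references in a raw attribute value as it appears in the markup,
// then parses the decoded string as a URL and returns its query values.
func queryFromAttr(rawAttr string) (url.Values, error) {
	decoded := html.UnescapeString(rawAttr)
	u, err := url.Parse(decoded)
	if err != nil {
		return nil, err
	}
	return u.Query(), nil
}

func main() {
	// The attribute value exactly as written in the rendered HTML.
	q, err := queryFromAttr("http://www.example.com/input?foo=bar&amp;baz=quux")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}

	// Both parameters are recovered; there is no "amp;baz" key.
	fmt.Println(q.Get("foo"), q.Get("baz"))
}
```

Because the reference is decoded before the URL is interpreted, the escaped markup and the original URL name the same resource, so the template conversion does not break the links.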

source:stackoverflow.com