How to Efficiently Convert HTML Escape Sequences in Go?

Susan Sarandon
Release: 2024-12-17 15:22:16

Converting Escape Characters in HTML Tags

In Go, converting text that contains escaped HTML characters back to its original form is less direct than producing it. json.Marshal() escapes characters such as "<" and ">" by default, so the string "<html>" is encoded as "\u003chtml\u003e". There is no equally direct call for turning a bare fragment such as \u003chtml\u003e back into <html>: json.Unmarshal() expects a complete JSON value rather than an unquoted fragment.
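
As a minimal sketch of the escaping side, the program below marshals a string with encoding/json and prints the result. The helper name marshalString is illustrative only and not part of any library.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// marshalString is a hypothetical helper that returns the JSON encoding of s.
func marshalString(s string) (string, error) {
    // By default json.Marshal escapes <, > and & as \u003c, \u003e and \u0026.
    b, err := json.Marshal(s)
    if err != nil {
        return "", err
    }
    return string(b), nil
}

func main() {
    out, err := marshalString("<html>")
    if err != nil {
        panic(err)
    }
    // The surrounding double quotes are part of the JSON string value.
    fmt.Println(out) // "\u003chtml\u003e"
}
```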

Using strconv.Unquote()

The strconv.Unquote() function can perform the conversion, because it interprets the same \uXXXX escapes that Go string literals use. It only accepts input that is itself a quoted Go literal, however, so the enclosing quotation marks must be added manually before the call.

package main

import (
    "fmt"
    "strconv"
)

func main() {
    // Use a raw string literal (backticks) here. In an interpreted
    // literal (double quotes) the compiler would decode \u003c itself.
    s := `\u003chtml\u003e`
    fmt.Println(s)

    // Unquote expects a quoted literal, so wrap s in double quotes.
    s2, err := strconv.Unquote(`"` + s + `"`)
    if err != nil {
        panic(err)
    }
    fmt.Println(s2)
}

Output:

\u003chtml\u003e
<html>
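
The quoting step can be wrapped in a small function. The sketch below assumes a hypothetical helper named unescapeUnicode, which is not part of the standard library. It also shows one limitation of this approach: input that already contains an unescaped double quote is rejected with an error instead of being decoded.

```go
package main

import (
    "fmt"
    "strconv"
)

// unescapeUnicode is a hypothetical helper: it wraps s in double quotes and
// lets strconv.Unquote decode Go-style escapes such as \u003c.
// Input with an unescaped double quote, or an escape that Go does not
// recognize, is reported as an error rather than guessed at.
func unescapeUnicode(s string) (string, error) {
    return strconv.Unquote(`"` + s + `"`)
}

func main() {
    inputs := []string{
        `\u003chtml\u003e`, // decodes to <html>
        `say "hi"`,         // unescaped quotes: returns an error
    }
    for _, in := range inputs {
        out, err := unescapeUnicode(in)
        if err != nil {
            fmt.Printf("%q: %v\n", in, err)
            continue
        }
        fmt.Printf("%q -> %q\n", in, out)
    }
}
```

If the input may contain literal double quotes, it would need to be escaped first or decoded by other means; that case is outside what this sketch covers.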

Note:

The html package is also available for escaping and unescaping HTML text. However, it does not decode Unicode escape sequences of the form \uxxxx; it decodes only HTML entities, including the numeric forms &#decimal; and &#xHH;.

package main

import (
    "fmt"
    "html"
)

func main() {
    fmt.Println(html.UnescapeString(`\u003chtml\u003e`)) // wrong: \uXXXX is left untouched
    fmt.Println(html.UnescapeString(`&#60;html&#62;`))   // good: decimal entities
    fmt.Println(html.UnescapeString(`&#x3c;html&#x3e;`)) // good: hexadecimal entities
}

Output:

\u003chtml\u003e
<html>
<html>
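
For comparison, the html package round-trips its own entity form. The following is a minimal sketch in which roundTrip is an illustrative helper rather than a library function.

```go
package main

import (
    "fmt"
    "html"
)

// roundTrip is a hypothetical helper that escapes s with html.EscapeString
// and then decodes the result again with html.UnescapeString.
func roundTrip(s string) (escaped, unescaped string) {
    escaped = html.EscapeString(s)
    unescaped = html.UnescapeString(escaped)
    return escaped, unescaped
}

func main() {
    escaped, unescaped := roundTrip("<html>")
    fmt.Println(escaped)   // &lt;html&gt;
    fmt.Println(unescaped) // <html>
}
```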

Note 2:

Remember that string literals written with double quotes (") are interpreted string literals: the compiler decodes their escape sequences. To keep the backslash sequences exactly as written, use backticks to create a raw string literal.

s := "\u003chtml\u003e" // Interpreted string literal (unquoted by the compiler!)
fmt.Println(s)

s2 := `\u003chtml\u003e` // Raw string literal (no unquoting will take place)
fmt.Println(s2)

s3 := "\"\\u003chtml\\u003e\"" // Interpreted string literal with escaped quotes and
                               // backslashes (the compiler unquotes one level only)
fmt.Println(s3)

Output:

<html>
\u003chtml\u003e
"\u003chtml\u003e"
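
One way to see the difference between the two literal forms is to compare their lengths. This is a small sketch; the function name literalLengths is made up for the example.

```go
package main

import "fmt"

// literalLengths is a hypothetical helper that returns the byte lengths of
// the same escape sequence written as an interpreted and as a raw literal.
func literalLengths() (interpreted, raw int) {
    a := "\u003c" // decoded by the compiler to the single character <
    b := `\u003c` // kept as six literal characters: \ u 0 0 3 c
    return len(a), len(b)
}

func main() {
    interpreted, raw := literalLengths()
    fmt.Println(interpreted, raw) // 1 6
}
```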

source:php.cn