How to Handle Non-UTF-8 Encoded XML in Go?

Mary-Kate Olsen
Release: 2024-12-26 03:28:15

Handling Non-UTF-8 XML Input in Go

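Go's encoding/xml package only decodes UTF-8 on its own. If an XML document declares a different encoding in its prolog (for example encoding="ISO-8859-1"), xml.Unmarshal returns an error instead of parsing it. To read such input, create an xml.Decoder and set its CharsetReader field to a function that converts the declared encoding to UTF-8.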
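To make the failure concrete, here is a minimal sketch of what happens without a CharsetReader. The greeting type and the tryUnmarshal helper are hypothetical names made up for this illustration; only xml.Unmarshal comes from the standard library.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// greeting is a hypothetical target type used only for this illustration.
type greeting struct {
	Text string `xml:"text"`
}

// tryUnmarshal parses data with plain xml.Unmarshal, which offers no way
// to supply a CharsetReader.
func tryUnmarshal(data []byte) error {
	var g greeting
	return xml.Unmarshal(data, &g)
}

func main() {
	// The prolog declares ISO-8859-1, so decoding stops with an error
	// even though this particular payload contains only ASCII bytes.
	latin1 := []byte(`<?xml version="1.0" encoding="ISO-8859-1"?><greeting><text>hi</text></greeting>`)
	fmt.Println("ISO-8859-1:", tryUnmarshal(latin1))

	// The same document declared as UTF-8 parses without error.
	utf8Doc := []byte(`<?xml version="1.0" encoding="UTF-8"?><greeting><text>hi</text></greeting>`)
	fmt.Println("UTF-8:", tryUnmarshal(utf8Doc))
}
```

The first call should report a non-nil error explaining that an encoding was declared but no CharsetReader is available; the second should report nil.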

Where to Find a CharsetReader

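Fortunately, the golang.org/x/net/html/charset package provides one in the form of charset.NewReaderLabel. Note that this package belongs to the supplementary golang.org/x/net module, not to the standard library, so it has to be added as a dependency. NewReaderLabel takes an encoding label and an io.Reader and returns a reader that yields the same content converted to UTF-8, which is exactly the signature the CharsetReader field expects.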

Updated Solution for 2015 and Beyond

Previously, a custom CharsetReader had to be implemented by hand. The golang.org/x/net/html/charset package now offers a simpler solution: assign charset.NewReaderLabel directly to the decoder. Here's an updated code snippet:

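import (
    "bytes"
    "encoding/xml"

    "golang.org/x/net/html/charset"
)

// ...
// theXml is the raw XML as a []byte; parsed is the value to fill in.
reader := bytes.NewReader(theXml)
decoder := xml.NewDecoder(reader)
// Convert whatever encoding the XML prolog declares to UTF-8.
decoder.CharsetReader = charset.NewReaderLabel
if err := decoder.Decode(&parsed); err != nil {
    // handle the error
}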

With charset.NewReaderLabel set as the CharsetReader, decoder.Decode handles non-UTF-8 encoded XML input without manual conversion or a custom implementation. The decoder is needed because xml.Unmarshal takes no options and therefore offers no way to supply a CharsetReader.
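For comparison, a hand-written CharsetReader of the kind mentioned earlier can be sketched with the standard library alone. This is a minimal illustration, not a replacement for charset.NewReaderLabel: it supports only ISO-8859-1 and reads the rest of the input into memory. The names latin1CharsetReader, note and decodeNote are hypothetical and chosen just for this example.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// latin1CharsetReader is a hypothetical, deliberately small CharsetReader.
// It accepts only ISO-8859-1 labels and rejects everything else.
func latin1CharsetReader(label string, input io.Reader) (io.Reader, error) {
	switch strings.ToLower(strings.TrimSpace(label)) {
	case "iso-8859-1", "latin1", "latin-1":
	default:
		return nil, fmt.Errorf("unsupported charset %q", label)
	}
	// Simplification: buffer the remaining input instead of streaming it.
	raw, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	// Each ISO-8859-1 byte equals the Unicode code point of the same value,
	// so widening bytes to runes and re-encoding produces valid UTF-8.
	runes := make([]rune, len(raw))
	for i, b := range raw {
		runes[i] = rune(b)
	}
	return strings.NewReader(string(runes)), nil
}

// note is a hypothetical target type for the example document.
type note struct {
	Body string `xml:"body"`
}

// decodeNote wires the custom CharsetReader into an xml.Decoder.
func decodeNote(data []byte) (note, error) {
	var n note
	dec := xml.NewDecoder(bytes.NewReader(data))
	dec.CharsetReader = latin1CharsetReader
	err := dec.Decode(&n)
	return n, err
}

func main() {
	// 0xE9 is "é" in ISO-8859-1; on its own it is not valid UTF-8.
	data := []byte("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><note><body>caf\xe9</body></note>")
	n, err := decodeNote(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(n.Body) // prints "café"
}
```

In practice charset.NewReaderLabel is the better choice, since it recognizes many more encoding labels than a hand-rolled function like this one.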
