Golang web application security: Should you check if the input is valid utf-8?

WBOY
Release: 2024-02-10 08:10:08


This article looks at one aspect of Go web application security: whether you need to check that input is valid UTF-8. Input validation matters in web development because users may submit malformed or malicious data. The question and answer below cover what Gin's JSON binding already guarantees about decoded strings, and whether an explicit UTF-8 check adds anything on top of that.

Question content

According to several best-practice documents, it is a good idea to check that input data is valid UTF-8.

In my project, I use Gin together with go-playground/validator for validation. There is an "ascii" validator but no "utf-8" validator.

I found https://pkg.go.dev/unicode/utf8#ValidString and I was wondering whether using it to check the input would help, or whether valid UTF-8 is already a given because Go itself uses Unicode internally.

This is an example:

package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
)

type User struct {
    Name string `json:"name" binding:"required,alphanum"`
}

func main() {
    r := gin.Default()
    r.POST("/user", createUserHandler)
    r.Run()
}

func createUserHandler(c *gin.Context) {
    var newUser User
    err := c.ShouldBindJSON(&newUser)

    if err != nil {
        c.AbortWithError(http.StatusBadRequest, err)
        return
    }

    c.Status(http.StatusCreated)
}

After calling c.ShouldBindJSON, is it guaranteed that Name in newUser is valid UTF-8? Is there any benefit to checking Name with utf8.ValidString?

Solution

By default, Gin uses the standard encoding/json package to unmarshal JSON documents. The package documentation states:

When unmarshaling quoted strings, invalid UTF-8 or invalid UTF-16 surrogate pairs are not treated as an error. Instead, they are replaced by the Unicode replacement character U+FFFD.

The decoded string values are therefore guaranteed to be valid UTF-8, so there is no benefit to checking them with utf8.ValidString.

Depending on application requirements, you may want to check for and handle the Unicode replacement character "�". Aside: as the � in this answer shows, Stack Overflow treats the Unicode replacement character like any other character.
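One way such a check could look, as a sketch only. The helper name hasReplacementChar is hypothetical, and whether to reject, strip, or accept the character depends on the application. Note that after decoding, the check cannot tell whether the client sent invalid bytes or a literal U+FFFD.

```go
package main

import (
	"fmt"
	"strings"
)

// hasReplacementChar reports whether s contains the encoded
// Unicode replacement character U+FFFD.
func hasReplacementChar(s string) bool {
	return strings.Contains(s, "\uFFFD")
}

func main() {
	fmt.Println(hasReplacementChar("alice"))     // false
	fmt.Println(hasReplacementChar("a\uFFFDb")) // true
}
```

In a handler like createUserHandler, a check of this kind would run on newUser.Name after binding succeeds.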

Does Go itself use Unicode internally?

Some language features assume UTF-8 encoding (range over a string, and conversions between []rune and string), but these features do not restrict the bytes that can be stored in a string. A string can contain any byte sequence, including invalid UTF-8.

source:stackoverflow.com