unicode/utf8 - Go Chinese Development Manual


import "unicode/utf8"

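Overview

Package utf8 implements functions and constants to support text encoded in UTF-8. It includes functions to translate between runes and UTF-8 byte sequences.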

Index

Constants

func DecodeLastRune(p []byte) (r rune, size int)

func DecodeLastRuneInString(s string) (r rune, size int)

func DecodeRune(p []byte) (r rune, size int)

func DecodeRuneInString(s string) (r rune, size int)

func EncodeRune(p []byte, r rune) int

func FullRune(p []byte) bool

func FullRuneInString(s string) bool

func RuneCount(p []byte) int

func RuneCountInString(s string) (n int)

func RuneLen(r rune) int

func RuneStart(b byte) bool

func Valid(p []byte) bool

func ValidRune(r rune) bool

func ValidString(s string) bool

Examples

DecodeLastRune DecodeLastRuneInString DecodeRune DecodeRuneInString EncodeRune FullRune FullRuneInString RuneCount RuneCountInString RuneLen RuneStart Valid ValidRune ValidString

Package files

utf8.go

Constants

Numbers fundamental to the encoding.

const (
        RuneError = '\uFFFD'     // the "error" Rune or "Unicode replacement character"
        RuneSelf  = 0x80         // characters below RuneSelf are represented as themselves in a single byte.
        MaxRune   = '\U0010FFFF' // Maximum valid Unicode code point.
        UTFMax    = 4            // maximum number of bytes of a UTF-8 encoded Unicode character.
)
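As a rough illustration of how these constants tend to be used, the sketch below detects pure-ASCII input by comparing each byte against RuneSelf and sizes a scratch buffer with UTFMax. The helper name isASCII is hypothetical and not part of the package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isASCII (hypothetical helper) reports whether every byte of s is below
// utf8.RuneSelf, i.e. whether each byte stands for itself as one character.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= utf8.RuneSelf {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCII("Hello"))
	fmt.Println(isASCII("Hello, 世界"))

	// A buffer of UTFMax bytes is large enough for the encoding of any rune,
	// including the largest one, MaxRune.
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, utf8.MaxRune)
	fmt.Println(n) // 4
}
```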

func DecodeLastRune

func DecodeLastRune(p []byte) (r rune, size int)

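DecodeLastRune unpacks the last UTF-8 encoding in p and returns the rune and its width in bytes. If p is empty, it returns (RuneError, 0). Otherwise, if the encoding is invalid, it returns (RuneError, 1). Both are impossible results for correct, non-empty UTF-8.

An encoding is invalid if it is incorrect UTF-8, encodes a rune that is out of range, or is not the shortest possible UTF-8 encoding for the value. No other validation is performed.

Example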

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	b := []byte("Hello, 世界")

	for len(b) > 0 {
		r, size := utf8.DecodeLastRune(b)
		fmt.Printf("%c %v\n", r, size)

		b = b[:len(b)-size]
	}
}

func DecodeLastRuneInString

func DecodeLastRuneInString(s string) (r rune, size int)

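DecodeLastRuneInString is like DecodeLastRune but its input is a string. If s is empty, it returns (RuneError, 0). Otherwise, if the encoding is invalid, it returns (RuneError, 1). Both are impossible results for correct, non-empty UTF-8.

An encoding is invalid if it is incorrect UTF-8, encodes a rune that is out of range, or is not the shortest possible UTF-8 encoding for the value. No other validation is performed.

Example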

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str := "Hello, 世界"

	for len(str) > 0 {
		r, size := utf8.DecodeLastRuneInString(str)
		fmt.Printf("%c %v\n", r, size)

		str = str[:len(str)-size]
	}
}
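One common use of decoding from the end is removing the final character of a string without cutting a multi-byte encoding in half. The following is a minimal sketch; dropLastRune is a hypothetical helper name, not a function of this package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// dropLastRune (hypothetical helper) returns s without its final rune.
// For an empty string the reported size is 0, so s comes back unchanged.
// An invalid trailing byte has size 1 and is removed on its own.
func dropLastRune(s string) string {
	_, size := utf8.DecodeLastRuneInString(s)
	return s[:len(s)-size]
}

func main() {
	fmt.Println(dropLastRune("Hello, 世界")) // Hello, 世
	fmt.Println(dropLastRune("Hello"))     // Hell
}
```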

func DecodeRune

func DecodeRune(p []byte) (r rune, size int)

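DecodeRune unpacks the first UTF-8 encoding in p and returns the rune and its width in bytes. If p is empty, it returns (RuneError, 0). Otherwise, if the encoding is invalid, it returns (RuneError, 1). Both are impossible results for correct, non-empty UTF-8.

An encoding is invalid if it is incorrect UTF-8, encodes a rune that is out of range, or is not the shortest possible UTF-8 encoding for the value. No other validation is performed.

Example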

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	b := []byte("Hello, 世界")

	for len(b) > 0 {
		r, size := utf8.DecodeRune(b)
		fmt.Printf("%c %v\n", r, size)

		b = b[size:]
	}
}

func DecodeRuneInString

func DecodeRuneInString(s string) (r rune, size int)

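DecodeRuneInString is like DecodeRune but its input is a string. If s is empty, it returns (RuneError, 0). Otherwise, if the encoding is invalid, it returns (RuneError, 1). Both are impossible results for correct, non-empty UTF-8.

An encoding is invalid if it is incorrect UTF-8, encodes a rune that is out of range, or is not the shortest possible UTF-8 encoding for the value. No other validation is performed.

Example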

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str := "Hello, 世界"

	for len(str) > 0 {
		r, size := utf8.DecodeRuneInString(str)
		fmt.Printf("%c %v\n", r, size)

		str = str[size:]
	}
}

func EncodeRune

func EncodeRune(p []byte, r rune) int

EncodeRune writes into p (which must be large enough) the UTF-8 encoding of the rune. It returns the number of bytes written.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	r := '世'
	buf := make([]byte, 3)

	n := utf8.EncodeRune(buf, r)

	fmt.Println(buf)
	fmt.Println(n)
}
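When the rune is not known in advance, a scratch buffer of UTFMax bytes satisfies the "large enough" requirement for any input. The sketch below encodes a slice of runes this way; encodeAll is a hypothetical helper. It also relies on the standard implementation writing the encoding of RuneError for runes that cannot be encoded, which the description above does not spell out.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodeAll (hypothetical helper) encodes each rune into a UTFMax-sized
// scratch buffer and appends only the n bytes that were actually written.
func encodeAll(rs []rune) []byte {
	var out []byte
	var buf [utf8.UTFMax]byte
	for _, r := range rs {
		n := utf8.EncodeRune(buf[:], r)
		out = append(out, buf[:n]...)
	}
	return out
}

func main() {
	fmt.Printf("%s\n", encodeAll([]rune{'世', '界'})) // 世界

	// 0xD800 is a surrogate half and has no UTF-8 encoding of its own;
	// the standard implementation writes U+FFFD (EF BF BD) in its place.
	fmt.Printf("% x\n", encodeAll([]rune{0xD800}))
}
```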

func FullRune

func FullRune(p []byte) bool

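FullRune reports whether the bytes in p begin with a full UTF-8 encoding of a rune. An invalid encoding is considered a full rune since it will convert as a width-1 error rune.

Example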

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	buf := []byte{228, 184, 150} // 世
	fmt.Println(utf8.FullRune(buf))
	fmt.Println(utf8.FullRune(buf[:2]))
}
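FullRune is mostly useful when input arrives in chunks and a chunk boundary may fall inside a multi-byte encoding. The following sketch separates the bytes that can be decoded now from a trailing partial encoding that should wait for more data; splitComplete is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// splitComplete (hypothetical helper) returns the prefix of p made of
// complete runes and the remaining bytes, which form an unfinished
// encoding. Invalid bytes count as complete width-1 runes, matching the
// FullRune description above.
func splitComplete(p []byte) (complete, rest []byte) {
	i := 0
	for i < len(p) && utf8.FullRune(p[i:]) {
		_, size := utf8.DecodeRune(p[i:])
		i += size
	}
	return p[:i], p[i:]
}

func main() {
	chunk := []byte("a世")[:3] // 'a' plus the first two bytes of 世
	complete, rest := splitComplete(chunk)
	fmt.Printf("%q %d\n", complete, len(rest)) // "a" 2
}
```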

func FullRuneInString

func FullRuneInString(s string) bool

FullRuneInString is like FullRune but its input is a string.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str := "世"
	fmt.Println(utf8.FullRuneInString(str))
	fmt.Println(utf8.FullRuneInString(str[:2]))
}

func RuneCount

func RuneCount(p []byte) int

RuneCount returns the number of runes in p. Erroneous and short encodings are treated as single runes of width 1 byte.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	buf := []byte("Hello, 世界")
	fmt.Println("bytes =", len(buf))
	fmt.Println("runes =", utf8.RuneCount(buf))
}

func RuneCountInString

func RuneCountInString(s string) (n int)

RuneCountInString is like RuneCount but its input is a string.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str := "Hello, 世界"
	fmt.Println("bytes =", len(str))
	fmt.Println("runes =", utf8.RuneCountInString(str))
}
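The difference between the byte length and the rune count matters whenever text is limited to a number of characters. A minimal sketch, assuming a non-negative limit; truncateRunes is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes (hypothetical helper) returns at most max runes of s.
// It assumes max >= 0. Ranging over a string yields the byte offset at
// which each rune starts, so the cut never lands inside an encoding.
func truncateRunes(s string, max int) string {
	if utf8.RuneCountInString(s) <= max {
		return s
	}
	n := 0
	for i := range s {
		if n == max {
			return s[:i]
		}
		n++
	}
	return s
}

func main() {
	fmt.Println(truncateRunes("Hello, 世界", 8)) // Hello, 世
}
```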

func RuneLen

func RuneLen(r rune) int

RuneLen returns the number of bytes required to encode the rune. It returns -1 if the rune is not a valid value to encode in UTF-8.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	fmt.Println(utf8.RuneLen('a'))
	fmt.Println(utf8.RuneLen('界'))
}
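RuneLen can be used to compute an exact buffer size before encoding, with the -1 result acting as an early validity check. The sketch below sums the lengths for a slice of runes; encodedLen is a hypothetical helper.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen (hypothetical helper) returns the number of bytes needed to
// encode rs, or ok == false if any rune is not valid for UTF-8.
func encodedLen(rs []rune) (total int, ok bool) {
	for _, r := range rs {
		n := utf8.RuneLen(r)
		if n < 0 {
			return 0, false
		}
		total += n
	}
	return total, true
}

func main() {
	// One rune from each length class: 1, 2, 3, and 4 bytes.
	fmt.Println(encodedLen([]rune{'a', '\u00e9', '世', '\U0001F600'})) // 10 true

	// A surrogate half has no valid encoding, so RuneLen reports -1.
	fmt.Println(encodedLen([]rune{'a', 0xD800})) // 0 false
}
```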

func RuneStart

func RuneStart(b byte) bool

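RuneStart reports whether the byte could be the first byte of an encoded, possibly invalid rune. Second and subsequent bytes always have the top two bits set to 10.

Example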

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	buf := []byte("a界")
	fmt.Println(utf8.RuneStart(buf[0]))
	fmt.Println(utf8.RuneStart(buf[1]))
	fmt.Println(utf8.RuneStart(buf[2]))
}
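Since continuation bytes are recognizable on their own, RuneStart allows stepping back from an arbitrary byte offset to the beginning of the rune that contains it. A minimal sketch, assuming p holds valid UTF-8 and 0 <= i < len(p); runeStartIndex is a hypothetical helper.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStartIndex (hypothetical helper) walks backwards from offset i
// until it reaches a byte that can start a rune. It assumes p is valid
// UTF-8 and that i is a valid index into p.
func runeStartIndex(p []byte, i int) int {
	for i > 0 && !utf8.RuneStart(p[i]) {
		i--
	}
	return i
}

func main() {
	buf := []byte("a界") // 'a' at offset 0, 界 occupies offsets 1 to 3
	fmt.Println(runeStartIndex(buf, 3)) // 1
	fmt.Println(runeStartIndex(buf, 0)) // 0
}
```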

func Valid

func Valid(p []byte) bool

Valid reports whether p consists entirely of valid UTF-8-encoded runes.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	valid := []byte("Hello, 世界")
	invalid := []byte{0xff, 0xfe, 0xfd}

	fmt.Println(utf8.Valid(valid))
	fmt.Println(utf8.Valid(invalid))
}

func ValidRune

func ValidRune(r rune) bool

ValidRune reports whether r can be legally encoded as UTF-8. Code points that are out of range or a surrogate half are illegal.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	valid := 'a'
	invalid := rune(0xfffffff)

	fmt.Println(utf8.ValidRune(valid))
	fmt.Println(utf8.ValidRune(invalid))
}

func ValidString

func ValidString(s string) bool

ValidString reports whether s consists entirely of valid UTF-8-encoded runes.

Example

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	valid := "Hello, 世界"
	invalid := string([]byte{0xff, 0xfe, 0xfd})

	fmt.Println(utf8.ValidString(valid))
	fmt.Println(utf8.ValidString(invalid))
}
