First of all, what is the relationship between these two? Also, what is the relationship between an encoding and its implementation? I don't understand these concepts.
Someone asked this question again, so I had to post the link.
https://segmentfault.com/q/1010000004240543/a-1020000004241029
http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html
Thank you Ruan Yifeng for your blog post http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html
To summarize, the difference is roughly this: Unicode is just a character set. It assigns each character a number (its code point) but does not specify how that number should be stored as bytes. UTF-8, UTF-16, and so on are the names of storage schemes for that character set. One is the character set and the other is the storage method. That is the difference.
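The split between "character set" and "storage method" can be seen directly in Python. This is only a small sketch: the sample character 严 is the one used in Ruan Yifeng's post, and the variable names are made up for illustration.

```python
# One character, one Unicode code point, several possible byte layouts.
ch = "严"

# The code point is the number Unicode assigns to the character.
code_point = ord(ch)                  # 0x4E25, written U+4E25

# The encodings decide how that number is laid out in bytes.
utf8_bytes = ch.encode("utf-8")       # 3 bytes: e4 b8 a5
utf16_bytes = ch.encode("utf-16-be")  # 2 bytes: 4e 25
utf32_bytes = ch.encode("utf-32-be")  # 4 bytes: 00 00 4e 25

print(f"U+{code_point:04X}")
print(utf8_bytes.hex(" "), utf16_bytes.hex(" "), utf32_bytes.hex(" "))
```

The code point stays the same in all three cases; only the bytes on disk or on the wire differ.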
"ANSI" (the name Windows uses for the system's legacy code page) and Unicode are two different systems for representing characters.
ISO-8859-1 and GBK are both encodings of the kind usually grouped under ANSI. Each such encoding was defined for one language or group of languages, and in general the only thing they have in common is compatibility with ASCII.
UTF-8 and UTF-16 are encodings of the Unicode standard, which is designed to cover all the languages and characters in the world, so that text displays correctly, without garbled characters, on computers in any language environment. Because Unicode covers so many characters, a single character can take more space than in a legacy encoding: a Chinese character, for example, takes 3 bytes in UTF-8 but 2 bytes in GBK.
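To make the coverage and size trade-off concrete, here is a rough sketch that encodes a few sample characters with a legacy encoding and with UTF-8. The sample characters and the helper name `try_encode` are my own choices for illustration.

```python
def try_encode(text: str, encoding: str):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# ASCII characters take the same single byte in all of these encodings.
ascii_same = try_encode("A", "latin-1") == try_encode("A", "gbk") == try_encode("A", "utf-8")

# A legacy encoding only covers the characters it was designed for.
latin1_chinese = try_encode("严", "latin-1")   # None: ISO-8859-1 has no Chinese
gbk_chinese = try_encode("严", "gbk")          # 2 bytes
utf8_chinese = try_encode("严", "utf-8")       # 3 bytes

# UTF-8 is variable length: 1 to 4 bytes per character.
utf8_sizes = [len(c.encode("utf-8")) for c in ["A", "é", "严", "😀"]]

print(ascii_same, latin1_chinese, len(gbk_chinese), len(utf8_chinese), utf8_sizes)
```

So the "larger space" only applies to non-ASCII text; plain ASCII text is the same size in UTF-8 as in the legacy encodings.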
In short, the UTF-8 we usually deal with is, in essence, a wrapper around Unicode: a way of writing Unicode code points out as bytes. For that reason, converting between two encodings means first decoding to Unicode and then encoding to the target.
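The "go through Unicode first" step looks like this in Python, where a `str` is a sequence of Unicode code points. This is a minimal sketch; the function name `transcode` is hypothetical, not a standard library call.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes from one encoding to another by way of Unicode."""
    text = data.decode(src)   # bytes in the source encoding -> Unicode text
    return text.encode(dst)   # Unicode text -> bytes in the target encoding

# Start from GBK bytes, as if read from a file saved on a Chinese Windows system.
gbk_bytes = "严".encode("gbk")

utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
round_trip = transcode(utf8_bytes, "utf-8", "gbk")

print(utf8_bytes.hex(" "))
```

Garbled characters typically appear when the `src` encoding passed to the decode step is not the one the bytes were actually written in.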