Why does mb_strlen return 3 with the gbk encoding?

WBOY
Release: 2016-08-22 11:45:37

php > $s="你好";
php > echo mb_strlen($s,"utf8");
2
With utf8 it returns 2, which I understand.
php > echo mb_strlen($s,"gb2312");
4
This returns 4, which I also understand.
php > echo mb_strlen($s,"gbk");
3
This is the one I don't understand. Why 3?

Reply content:

Because $s is UTF-8 encoded, and you are measuring its length as GBK without first converting it to GBK.

The UTF-8 bytes of 你好 read as 浣犲ソ when decoded as GBK, so the length is 3.
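The misreading described above can be reproduced with a short sketch. It assumes the mbstring extension is loaded, and it writes the UTF-8 bytes of 你好 as escape sequences so the result does not depend on how the script file itself is saved. The variable names are illustrative only.

```php
<?php
// UTF-8 bytes of 你好, written explicitly: E4 BD A0 (你) and E5 A5 BD (好).
$s = "\xE4\xBD\xA0\xE5\xA5\xBD";

echo strlen($s), "\n";             // 6 (bytes)
echo bin2hex($s), "\n";            // e4bda0e5a5bd
echo mb_strlen($s, 'UTF-8'), "\n"; // 2 (characters)

// Treat the same six bytes as GBK and re-encode them as UTF-8 for display.
// GBK pairs the bytes as E4BD, A0E5, A5BD, which is where the garbled
// text reported in the answer comes from.
$misread = mb_convert_encoding($s, 'UTF-8', 'GBK');
echo $misread, "\n";
echo mb_strlen($misread, 'UTF-8'), "\n";
```

The last two lines print whatever GBK makes of the bytes. The bytes never change, only the rule used to group them into characters.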

This is what you should do:

<code>$a = mb_strlen(iconv( 'utf-8','gbk', $s), 'gbk');
$b = mb_strlen(iconv( 'utf-8','gb2312', $s), 'gb2312');
</code>
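A quick way to check the conversion approach is to look at the converted bytes. This is a minimal sketch that uses mb_convert_encoding in place of iconv (either should work here) and assumes mbstring is available; the variable names are illustrative.

```php
<?php
// 你好 as UTF-8 bytes, written explicitly so the file's own encoding does not matter.
$utf8 = "\xE4\xBD\xA0\xE5\xA5\xBD";

// Convert to GBK first, then count using the encoding the bytes are really in.
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo bin2hex($gbk), "\n";          // c4e3bac3 (two bytes per character)
echo strlen($gbk), "\n";           // 4 (bytes)
echo mb_strlen($gbk, 'GBK'), "\n"; // 2 (characters)
```

Once the encoding passed to mb_strlen matches the actual encoding of the bytes, the character count is 2 regardless of which encoding is used.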

In other words, the GB2312 result of 4 is wrong too.

mb_strlen returns the number of characters, so only 2 is correct. How did you make sense of the 4 and the 3?

When you write $s = "你好", $s holds a UTF-8 encoded byte string (it takes the encoding of your source file). Decoding those bytes as GBK or GB2312 produces garbled text, so 4 and 3 are just the lengths of that garbled text.

source:php.cn