PHP mb_strlen returns different results depending on the encoding specified: solution

WBOY
Release: 2016-06-13 11:19:41
Original


Last edited by lylgxy2007wht on 2013-04-02 11:37:02

mb_strlen returns different results depending on the encoding I pass to it. Could someone who knows this well explain why?

Page encoding: UTF-8
$text = "啊啊啊啊";
echo mb_strlen($text, 'utf8') . "<br/>";
echo mb_strlen($text, 'gbk') . "<br/>";
echo mb_strlen($text, 'gb2312') . "<br/>";
echo strlen($text);

Output: 4 6 8 12

Page encoding: GB2312
$text = "啊啊啊啊";
echo mb_strlen($text, 'utf8') . "<br/>";
echo mb_strlen($text, 'gbk') . "<br/>";
echo mb_strlen($text, 'gb2312') . "<br/>";
echo strlen($text);

Output: 4 4 4 8
------Solution--------------------
This is the list of supported character encodings from the official PHP manual:
http://www.php.net/manual/en/mbstring.supported-encodings.php


mb_internal_encoding("UTF-8");
echo mb_internal_encoding();
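
To illustrate what the two lines above do, here is a minimal sketch, assuming the mbstring extension is loaded. When mb_strlen is called without its second argument, it falls back to the encoding set through mb_internal_encoding. The variable names and the hex-escaped bytes are my own choices, used so the result does not depend on how the script file is saved.

```php
<?php
// "啊啊啊啊" written as raw UTF-8 bytes (E5 95 8A per character),
// so the file's own encoding cannot change what is being measured.
$utf8Text = str_repeat("\xE5\x95\x8A", 4);

// Set the default encoding used by mb_* functions when none is passed.
mb_internal_encoding('UTF-8');

$current = mb_internal_encoding(); // reads the current setting back
$length  = mb_strlen($utf8Text);   // no second argument: uses the internal encoding

echo $current, "\n";
echo $length, "\n";
```

Passing the encoding explicitly on every call, as the original question does, works too; setting it once mainly saves repetition.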


------Solution--------------------
The bytes of "啊啊啊啊" in hexadecimal are:
UTF-8: E5 95 8A E5 95 8A E5 95 8A E5 95 8A --- 12 bytes
GB2312: B0 A1 B0 A1 B0 A1 B0 A1 --- 8 bytes
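
If you want to check those byte sequences yourself, something like the following should do it. This is only a sketch, assuming mbstring is available and accepts the name 'GB2312' (the question's own code passes 'gb2312' to mb_strlen). The variable names are mine.

```php
<?php
// Start from the UTF-8 bytes so the file's own encoding does not matter.
$utf8   = str_repeat("\xE5\x95\x8A", 4);
$gb2312 = mb_convert_encoding($utf8, 'GB2312', 'UTF-8');

// bin2hex gives lowercase hex; split into byte pairs for readability.
$utf8Hex = strtoupper(implode(' ', str_split(bin2hex($utf8), 2)));
$gbHex   = strtoupper(implode(' ', str_split(bin2hex($gb2312), 2)));

echo "UTF-8:  ", $utf8Hex, " (", strlen($utf8), " bytes)\n";
echo "GB2312: ", $gbHex, " (", strlen($gb2312), " bytes)\n";
```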

When the page is UTF-8:
utf-8 [E5 95 8A] [E5 95 8A] [E5 95 8A] [E5 95 8A] --- 4
gbk [E5 95]鍟 [8A E5]婂 [95 8A]晩 [E5 95]鍟 [8A E5]婂 [95 8A]晩 --- 6
gb2312 [E5 95]鍟 [8A] [E5 95]鍟 [8A] [E5 95]鍟 [8A] [E5 95]鍟 [8A] --- 8
Note: no GB2312 character starts with the byte 8A (lead bytes start at A1), so each 8A is counted on its own.
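
The grouping above can be imitated with a few lines of plain PHP. To be clear, this is a sketch of the reasoning in this reply, not mbstring's actual code: countByLeadByte is a hypothetical helper, and the lead-byte ranges (81 to FE for GBK, A1 to FE for GB2312) are assumptions taken from the encodings' definitions and the note above.

```php
<?php
/**
 * Count "characters" by lead byte only: a byte inside [$leadMin, $leadMax]
 * is taken as the first byte of a two-byte character and swallows the next
 * byte; any other byte counts as one character by itself.
 * Hypothetical helper, written only to mimic the grouping in the post.
 */
function countByLeadByte(string $bytes, int $leadMin, int $leadMax): int
{
    $count = 0;
    $i = 0;
    $n = strlen($bytes);
    while ($i < $n) {
        $b = ord($bytes[$i]);
        $i += ($b >= $leadMin && $b <= $leadMax) ? 2 : 1;
        $count++;
    }
    return $count;
}

// The UTF-8 bytes of "啊啊啊啊".
$utf8 = str_repeat("\xE5\x95\x8A", 4);

$asGbk    = countByLeadByte($utf8, 0x81, 0xFE); // GBK-style lead bytes
$asGb2312 = countByLeadByte($utf8, 0xA1, 0xFE); // GB2312-style lead bytes

echo "gbk: $asGbk, gb2312: $asGb2312\n";
```

With these assumed ranges the helper gives 6 and 8, the same counts as in the breakdown above.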

When the page is GB2312:
utf-8 (not sure): no UTF-8 character starts with the byte B0, so my guess is that mb "smartly" counts in two-byte units --- 4
gbk/gb2312 [B0 A1] [B0 A1] [B0 A1] [B0 A1] --- 4
------Solution--------------------
Measured results (PHP 5.4.12):
with a UTF-8 page: 4 6 8 12
with a GB2312 page: 8 4 4 8
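
To repeat the measurement without saving the script in two different encodings, the bytes can be written out explicitly. A rough sketch, assuming mbstring is loaded; the array layout and labels are mine, and the numbers for the mismatched encodings may vary between PHP versions, so they are printed rather than asserted.

```php
<?php
// The same text as raw bytes in each encoding.
$samples = [
    'utf-8 bytes'  => str_repeat("\xE5\x95\x8A", 4),
    'gb2312 bytes' => str_repeat("\xB0\xA1", 4),
];

$results = [];
foreach ($samples as $label => $bytes) {
    $results[$label] = [
        'utf-8'  => mb_strlen($bytes, 'UTF-8'),
        'gbk'    => mb_strlen($bytes, 'GBK'),
        'gb2312' => mb_strlen($bytes, 'GB2312'),
        'strlen' => strlen($bytes),
    ];
}

// One line per sample, in the same column order as the post.
foreach ($results as $label => $row) {
    echo $label, ': ', implode(' ', $row), "\n";
}
```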

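No explanation is really needed: you only get the correct result when you specify the character set the string is actually encoded in.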
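
One way to act on that advice is to verify the bytes before counting. A minimal sketch, assuming mbstring is loaded; lengthIfValid is a hypothetical helper of mine that wraps the real mb_check_encoding and mb_strlen functions.

```php
<?php
/**
 * Return the character count only when $bytes is valid in $encoding,
 * otherwise null. Hypothetical helper, for illustration only.
 */
function lengthIfValid(string $bytes, string $encoding): ?int
{
    if (!mb_check_encoding($bytes, $encoding)) {
        return null; // wrong charset: a count would be meaningless
    }
    return mb_strlen($bytes, $encoding);
}

$utf8   = str_repeat("\xE5\x95\x8A", 4); // "啊啊啊啊" as UTF-8
$gb2312 = str_repeat("\xB0\xA1", 4);     // "啊啊啊啊" as GB2312

$utf8AsUtf8 = lengthIfValid($utf8, 'UTF-8');
$gbAsUtf8   = lengthIfValid($gb2312, 'UTF-8');
$gbAsGb     = lengthIfValid($gb2312, 'GB2312');

var_dump($utf8AsUtf8, $gbAsUtf8, $gbAsGb);
```

Returning null instead of a number makes a charset mismatch visible to the caller, where a bare mb_strlen call would quietly return a misleading count.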
source:php.cn