Encoding detection and transcoding problems with PHP's mb_detect_encoding and mb_convert_encoding functions
高洛峰
高洛峰 2017-05-16 13:14:00
  1. mb_detect_encoding returns CP936 for my file. Does this correspond to GBK?

  2. After transcoding with mb_convert_encoding, the text displays correctly, but running mb_detect_encoding on the converted text still reports CP936 instead of UTF-8. Why is this?

The code is as follows:

$file_contents = fread($file,$fileSize);

$typeofData = mb_detect_encoding($file_contents,array("GBK","GB2312","UTF-8","ASCII","BIG5"));

if ($typeofData != "UTF-8"){
//    $file_contents = iconv("GBK","UTF-8",$file_contents);
    $file_contents = mb_convert_encoding($file_contents,"UTF-8","GBK");
}

echo  mb_detect_encoding($file_contents,array("GBK","GB2312","UTF-8","ASCII","BIG5"))."<br/>";
echo $file_contents;

All replies (1)
伊谢尔伦

CP936 is the code page for GBK, so a result of CP936 does mean GBK.
I tested this with PHP 5 and PHP 7 on Ubuntu. After converting to UTF-8, the text is detected as UTF-8:

<?php
$str = file_get_contents('/path/to/gbk.txt'); // GBK-encoded text file
$order = array('GB2312', 'GBK', 'GB18030', 'UTF-8', 'ASCII', 'BIG5');
$encode = mb_detect_encoding($str, $order, true); // reports CP936 (i.e. GBK)
$str = mb_convert_encoding($str, 'UTF-8', $encode); // convert to UTF-8
echo mb_detect_encoding($str, $order, true); // outputs UTF-8
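Note that the code above passes true as the third (strict) argument, which the code in the question does not. One plausible reason the question's code still reports CP936 is that many valid UTF-8 byte sequences also look like acceptable GBK double-byte sequences, so a detector that tries GBK first may keep answering CP936. If the only real question is "is this already UTF-8?", a more predictable approach is to validate with mb_check_encoding instead of guessing. The following is a minimal sketch under that assumption; the helper name to_utf8_assuming_gbk is hypothetical, it requires PHP 7 or later, and it assumes any input that is not valid UTF-8 is GBK/CP936.

```php
<?php
// Hypothetical helper (not from the original thread).
// Assumption: any input that is not valid UTF-8 is GBK/CP936 text.
function to_utf8_assuming_gbk(string $bytes): string
{
    // mb_check_encoding validates the bytes against one named encoding,
    // so the result does not depend on a detection order.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8 (plain ASCII also passes)
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'CP936');
}

$gbk  = "\xD6\xD0\xCE\xC4";           // the two characters U+4E2D U+6587 encoded as GBK
$utf8 = to_utf8_assuming_gbk($gbk);

var_dump($utf8 === "\u{4E2D}\u{6587}");              // converted to the expected UTF-8 string
var_dump(to_utf8_assuming_gbk($utf8) === $utf8);     // a second call leaves UTF-8 input unchanged
```

Because the check is a validation rather than a guess, running the helper twice on the same data does not convert it a second time.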