A detailed explanation of automatic character set detection and transcoding in PHP

WBOY

Released: 2016-07-21 15:02:52

The principle is simple. In GB2312/GBK a Chinese character occupies two bytes, and each of those bytes falls within a known value range; in UTF-8 a Chinese character occupies three bytes, each again within a known range. In either encoding, English (ASCII) characters have values below 128 and occupy a single byte (full-width forms excepted).
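As a rough illustration of those value ranges, the sketch below encodes the character "中" both ways as raw bytes and tests each against simplified range checks. The helper names `isUtf8ThreeByte` and `isGb2312Pair` are made up for this example, and the ranges are the commonly cited ones (a UTF-8 three-byte lead of 0xE0 to 0xEF followed by continuation bytes of 0x80 to 0xBF; both GB2312 bytes in 0xA1 to 0xFE), not a full validator.

```php
<?php
// "中" written as raw bytes, so the example does not depend on the
// encoding of this source file.
$utf8 = "\xE4\xB8\xAD"; // UTF-8: three bytes
$gbk  = "\xD6\xD0";     // GB2312/GBK: two bytes

// Hypothetical helper: does a plausible UTF-8 three-byte sequence start at $i?
function isUtf8ThreeByte(string $s, int $i): bool
{
    if (strlen($s) < $i + 3) {
        return false;
    }
    $b0 = ord($s[$i]);
    $b1 = ord($s[$i + 1]);
    $b2 = ord($s[$i + 2]);
    return $b0 >= 0xE0 && $b0 <= 0xEF
        && $b1 >= 0x80 && $b1 <= 0xBF
        && $b2 >= 0x80 && $b2 <= 0xBF;
}

// Hypothetical helper: does a plausible GB2312 two-byte pair start at $i?
function isGb2312Pair(string $s, int $i): bool
{
    if (strlen($s) < $i + 2) {
        return false;
    }
    $b0 = ord($s[$i]);
    $b1 = ord($s[$i + 1]);
    return $b0 >= 0xA1 && $b0 <= 0xFE
        && $b1 >= 0xA1 && $b1 <= 0xFE;
}

var_dump(strlen($utf8), strlen($gbk)); // int(3) int(2)
var_dump(isUtf8ThreeByte($utf8, 0));   // bool(true)
var_dump(isUtf8ThreeByte($gbk, 0));    // bool(false)
var_dump(isGb2312Pair($gbk, 0));       // bool(true)
var_dump(isGb2312Pair($utf8, 0));      // bool(true): the ranges overlap
```

Note that the ranges overlap: the first two bytes of the UTF-8 form also pass the two-byte check, which is why a heuristic of this kind needs to test for UTF-8 before falling back to GB2312/GBK.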
When the input is a file, you can also check directly for the UTF-8 BOM. Without further ado, here is the function, which checks a string's encoding and transcodes it.

The code is as follows:

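<?php
// Guess whether $string is UTF-8 or GB2312 by looking at the high bits of
// its non-ASCII bytes, then convert it to $outEncoding if necessary.
function safeEncoding($string, $outEncoding = 'UTF-8')
{
    $encoding = "UTF-8";
    $length = strlen($string);
    for ($i = 0; $i < $length; $i++) {
        // Bytes below 128 are ASCII in both encodings and decide nothing.
        if (ord($string[$i]) < 128)
            continue;

        if ((ord($string[$i]) & 224) == 224) {
            // The first byte is passed (111xxxxx, a three-byte lead).
            // The second and third bytes must both have the high bit set.
            if ($i + 2 < $length
                && (ord($string[$i + 1]) & 128) == 128
                && (ord($string[$i + 2]) & 128) == 128) {
                $encoding = "UTF-8";
                break;
            }
        }

        if ((ord($string[$i]) & 192) == 192) {
            // The first byte is passed (11xxxxxx, a two-byte lead).
            if ($i + 1 < $length && (ord($string[$i + 1]) & 128) == 128) {
                // The second byte is passed
                $encoding = "GB2312";
                break;
            }
        }
    }

    if (strtoupper($encoding) == strtoupper($outEncoding))
        return $string;
    else
        return iconv($encoding, $outEncoding, $string);
}
?>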
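For the file case, the BOM check mentioned earlier could be sketched as follows. This is a minimal sketch, not part of the original function: `hasUtf8Bom`, `stripUtf8Bom`, and `fileHasUtf8Bom` are hypothetical helper names, and the only fact relied on is that the UTF-8 BOM is the three bytes EF BB BF at the very start of the data.

```php
<?php
// The UTF-8 byte order mark: EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if the byte string starts with the UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: remove a leading UTF-8 BOM, if there is one.
function stripUtf8Bom(string $bytes): string
{
    return hasUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}

// Hypothetical helper: read only the first three bytes of a file and test them.
function fileHasUtf8Bom(string $path): bool
{
    $head = file_get_contents($path, false, null, 0, 3);
    return $head !== false && hasUtf8Bom($head);
}

var_dump(hasUtf8Bom("\xEF\xBB\xBFhello")); // bool(true)
var_dump(hasUtf8Bom("hello"));             // bool(false)
```

A BOM is strong evidence that a file is UTF-8, but many UTF-8 files are saved without one, so a missing BOM still calls for a byte-level check like the one in the function above.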







