In "Learning PHP & MYSQL - Character Encoding (Part 1)", the conversion relationship between Unicode and UTF-8 was introduced and a UTF-8 encoding rule was summarized. Based on that rule, we can write a UTF-8 parsing program that truncates a string by characters instead of by bytes. The PHP implementation follows:
/*
Purpose: $str is a UTF-8 encoded string that mixes Chinese and English.
The program walks the string according to the UTF-8 encoding rules,
so it is always cut on a character boundary and displays correctly.
*/
$str = 'Today is very happy, so we decided to go to KFC to eat Coke chicken wings!!!';
/*
$str is the string to truncate
$len is the number of characters to keep
*/
function utf8sub($str,$len) {
if($len <= 0){
return '' ;
}
$offset = 0; // Byte offset of the next character's lead byte
$chars = 0; // Number of characters taken so far
$res = ''; // Result string
while($chars < $len){
if ($offset >= strlen($str)) { // Reached the end of the string, so stop
break;
}
// Take the lead byte of the next character;
// ord() returns its value as an integer (0-255)
$high = ord($str[$offset]);
// echo '$high=' . $high . '<br>';
// The number of leading 1 bits in the lead byte gives the length of the sequence.
// Each test only looks at the leading 1 bits, so the longest prefix must be tested first.
// The 5- and 6-byte forms come from the original UTF-8 definition; RFC 3629 limits UTF-8 to 4 bytes.
if (($high >> 2) === 0x3F) { // Shift right by 2 bits; top 6 bits are 111111: take 6 bytes
$count = 6;
} else if (($high >> 3) === 0x1F) { // Shift right by 3 bits; top 5 bits are 11111: take 5 bytes
$count = 5;
} else if (($high >> 4) === 0xF) { // Shift right by 4 bits; top 4 bits are 1111: take 4 bytes
$count = 4;
} else if (($high >> 5) === 0x7) { // Shift right by 5 bits; top 3 bits are 111: take 3 bytes
$count = 3;
} else if (($high >> 6) === 0x3) { // Shift right by 6 bits; top 2 bits are 11: take 2 bytes
$count = 2;
} else { // 0xxxxxxx (ASCII): take 1 byte. A stray continuation byte (10xxxxxx) is also copied as 1 byte so that $count is always set
$count = 1;
}
// echo '$count=' . $count . '<br>';
$res .= substr($str, $offset, $count); // Append the whole character to $res
$chars += 1; // One more character taken
$offset += $count; // Advance the offset by $count bytes to the next lead byte
}
return $res;
}
echo utf8sub($str,100);
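
To make the lead-byte rule easier to check in isolation, here is a minimal sketch of the same idea as a standalone helper. The function names utf8SeqLength and utf8CharLengths are hypothetical and do not appear in the original article. This variant also differs from the listing above in two ways: it tests the exact bit prefix of each lead byte (including the terminating 0 bit), so the order of the tests does not matter, and it assumes the 4-byte limit of RFC 3629 rather than the older 6-byte forms.

```php
<?php
// Hypothetical helper (not from the original article): returns the byte
// length of the UTF-8 sequence that starts with lead byte $lead, or 0 if
// $lead is a continuation byte or is not a valid lead byte under RFC 3629.
function utf8SeqLength(int $lead): int
{
    if (($lead >> 7) === 0x0) {  // 0xxxxxxx: single-byte (ASCII)
        return 1;
    }
    if (($lead >> 5) === 0x6) {  // 110xxxxx: 2-byte sequence
        return 2;
    }
    if (($lead >> 4) === 0xE) {  // 1110xxxx: 3-byte sequence
        return 3;
    }
    if (($lead >> 3) === 0x1E) { // 11110xxx: 4-byte sequence
        return 4;
    }
    return 0;                    // 10xxxxxx or 11111xxx
}

// Hypothetical helper: walks a string and lists the byte length of each
// character. An invalid lead byte is counted as one byte so the loop
// always advances.
function utf8CharLengths(string $s): array
{
    $lengths = [];
    $offset = 0;
    $total = strlen($s);
    while ($offset < $total) {
        $n = utf8SeqLength(ord($s[$offset]));
        if ($n === 0) {
            $n = 1;
        }
        $lengths[] = $n;
        $offset += $n;
    }
    return $lengths;
}

// Two Chinese characters (U+4ECA, U+5929) written as raw UTF-8 bytes,
// followed by three ASCII letters.
$sample = "\xE4\xBB\x8A\xE5\xA4\xA9KFC";
echo implode(',', utf8CharLengths($sample)), "\n"; // prints 3,3,1,1,1
```

The sample string is 9 bytes long but contains 5 characters, which is why a byte-based substr() call can cut a Chinese character in half while a character-based walk does not.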