PHP sample code for detecting Chinese characters and encodings

怪我咯
Release: 2023-03-12 21:36:02

In PHP, detecting Chinese text depends on the encoding: a Chinese character takes two bytes in GBK and three bytes in UTF-8. In either case, Chinese can be detected by testing against the byte or code point range it occupies.
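The byte widths mentioned above can be checked directly with strlen(), which counts bytes rather than characters. This is a minimal sketch: the character 中 is written as explicit byte escapes so the result does not depend on how the source file is saved, and the variable names are illustrative only.

```php
<?php
// "中" (U+4E2D) written as raw bytes in each encoding.
$utf8 = "\xE4\xB8\xAD"; // UTF-8: three bytes
$gbk  = "\xD6\xD0";     // GBK: two bytes

// strlen() counts bytes, not characters.
echo strlen($utf8), "\n"; // 3
echo strlen($gbk), "\n";  // 2
echo strlen("A"), "\n";   // 1 (ASCII is one byte in both encodings)
```

The same character therefore has a different byte length in each encoding, which is why the GBK and UTF-8 functions in this article walk the string differently.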

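1. Encoding ranges

GBK (GB2312/GB18030):
\x00-\xff range of a single byte
\x20-\x7f printable ASCII
\xa1-\xfe bytes of a GB2312 double-byte (Chinese) character
\x81-\xfe lead byte of a GBK double-byte (Chinese) character
\x80-\xff any non-ASCII byte; this is the range the sample code below tests

UTF-8 (Unicode):
\u4e00-\u9fa5 Chinese (CJK Unified Ideographs)
\u3130-\u318f Korean (Hangul Compatibility Jamo)
\uac00-\ud7a3 Korean (Hangul syllables)
\u3040-\u30ff Japanese (Hiragana and Katakana); the often-quoted range \u0800-\u4e00 is far broader and also covers many other scripts

Note: Hangul syllables (\uac00-\ud7a3) have code points above \u9fa5.

Regular expression examples:
preg_replace('/[\x80-\xff]/', '', $str); // GBK: remove every non-ASCII byte
preg_replace('/[\x{4e00}-\x{9fa5}]/u', '', $str); // UTF-8: remove Chinese characters (the /u modifier is required)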
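The code point ranges listed above can be tested in PHP with PCRE's /u modifier. The sketch below assumes the input is UTF-8 and PHP 7 or later (for the "\u{...}" string escapes); the helper names contains_chinese_utf8 and contains_hangul_utf8 are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper: true if $s (assumed UTF-8) contains a character
// in the \u4e00-\u9fa5 range.
function contains_chinese_utf8(string $s): bool {
    // With /u, PCRE treats pattern and subject as UTF-8,
    // so \x{4e00} is a code point rather than a byte.
    return preg_match('/[\x{4e00}-\x{9fa5}]/u', $s) === 1;
}

// Hypothetical helper: true if $s (assumed UTF-8) contains Hangul
// from either of the two Korean ranges listed above.
function contains_hangul_utf8(string $s): bool {
    return preg_match('/[\x{3130}-\x{318f}\x{ac00}-\x{d7a3}]/u', $s) === 1;
}

var_dump(contains_chinese_utf8("abc\u{4e2d}\u{6587}")); // bool(true)
var_dump(contains_chinese_utf8("abc"));                // bool(false)
var_dump(contains_hangul_utf8("\u{d55c}\u{ae00}"));     // bool(true)
```

Because preg_match() returns false for invalid UTF-8 under /u, comparing against 1 makes both helpers report false for malformed input instead of raising an error.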
2. Code example

The code is as follows:

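// Check whether the content contains Chinese characters - GBK (PHP)
function check_is_chinese($s){
    // A byte in \x80-\xff followed by another byte indicates a double-byte character.
    // Returns 1 if one is found, 0 otherwise.
    return preg_match('/[\x80-\xff]./', $s);
}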
// Get the length of a string in characters - GBK (PHP)
function gb_strlen($str){
    $count = 0;
    $bytes = strlen($str);
    for($i = 0; $i < $bytes; $i++){
        // A high byte starts a double-byte character, so skip its second byte
        if (ord($str[$i]) >= 0x80) ++$i;
        ++$count;
    }
    return $count;
}
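// Take the first $len characters of a string - GBK (PHP)
function gb_substr($str, $len){
    $count = 0;
    $bytes = strlen($str);
    for($i = 0; $i < $bytes; $i++){
        if($count == $len) break;
        // A high byte starts a double-byte character, so skip its second byte
        if(ord($str[$i]) >= 0x80) ++$i;
        ++$count;
    }
    // $i is now the byte offset just past the last character kept
    return substr($str, 0, $i);
}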
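// Count the length of a string - UTF-8 (PHP)
// Note: each ASCII character counts as 1 and each multi-byte character as 2
// (a display-width style count); remove the inner $count++ to count every character as 1.
function utf8_strlen($str) {
    $count = 0;
    $bytes = strlen($str);
    for($i = 0; $i < $bytes; $i++){
        $value = ord($str[$i]);
        if($value > 127) {
            $count++;
            // Skip the continuation bytes of a 2-, 3- or 4-byte sequence
            if($value >= 192 && $value <= 223) $i++;
            elseif($value >= 224 && $value <= 239) $i = $i + 2;
            elseif($value >= 240 && $value <= 247) $i = $i + 3;
            else die('Not a UTF-8 compatible string');
        }
        $count++;
    }
    return $count;
}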
// Extract a substring - UTF-8 (PHP)
// $position and $length are measured in the units kept in $count:
// 1 per ASCII character, 2 per multi-byte character.
function utf8_substr($str, $position, $length){
    $bytes = strlen($str);
    $start_position = $bytes;
    $start_byte = 0;
    $end_position = $bytes;
    $count = 0;
    for($i = 0; $i < $bytes; $i++){
        // Record the byte offset where the requested start is first reached
        if($count >= $position && $start_position > $i){
            $start_position = $i;
            $start_byte = $count;
        }
        // Stop once the requested length has been covered
        if(($count - $start_byte) >= $length) {
            $end_position = $i;
            break;
        }
        $value = ord($str[$i]);
        if($value > 127){
            $count++;
            // Skip the continuation bytes of a 2-, 3- or 4-byte sequence
            if($value >= 192 && $value <= 223) $i++;
            elseif($value >= 224 && $value <= 239) $i = $i + 2;
            elseif($value >= 240 && $value <= 247) $i = $i + 3;
            else die('Not a UTF-8 compatible string');
        }
        $count++;
    }
    return substr($str, $start_position, $end_position - $start_position);
}
// Check whether the string contains Korean characters - Unicode code points (JavaScript)
function checkKoreaChar(str) {
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        // Hangul Compatibility Jamo or Hangul syllables
        if ((code > 0x3130 && code < 0x318F) || (code >= 0xAC00 && code <= 0xD7A3)) {
            return true;
        }
    }
    return false;
}
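// Check whether the string contains Chinese (double-byte) characters - GBK (JavaScript)
function check_chinese_char(s){
    // Every character above \xff is replaced by two characters, so the length changes
    return (s.length != s.replace(/[^\x00-\xff]/g,"**").length);
}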
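As an alternative to walking the bytes by hand, PCRE's /u mode can split a UTF-8 string into characters. The sketch below is a different technique from the functions above, not the article's method: the names utf8_chars, utf8_char_count and utf8_slice are hypothetical, every character counts as 1 regardless of its byte width, and invalid UTF-8 yields null instead of calling die(). When the mbstring extension is loaded, mb_strlen($s, 'UTF-8') and mb_substr($s, $start, $length, 'UTF-8') cover the same ground.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
// Returns null when $s is not valid UTF-8.
function utf8_chars(string $s): ?array {
    // An empty pattern with /u matches between code points;
    // PREG_SPLIT_NO_EMPTY drops the empty leading and trailing pieces.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    return $chars === false ? null : $chars;
}

// Number of characters (code points), or null for invalid UTF-8.
function utf8_char_count(string $s): ?int {
    $chars = utf8_chars($s);
    return $chars === null ? null : count($chars);
}

// $length characters starting at character offset $start, or null for invalid UTF-8.
function utf8_slice(string $s, int $start, int $length): ?string {
    $chars = utf8_chars($s);
    return $chars === null ? null : implode('', array_slice($chars, $start, $length));
}

$s = "PHP\u{4e2d}\u{6587}";        // "PHP中文"
var_dump(utf8_char_count($s));     // int(5)
var_dump(utf8_slice($s, 2, 2));    // string(4) "P中"
var_dump(utf8_char_count("\xff")); // NULL
```

Building an array of characters costs more memory than the byte-walking loops, so this approach suits short strings such as titles better than large documents.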
source:php.cn