This article presents PHP code for determining whether a string is encoded as UTF-8.
In the past we used mb_detect_encoding() to detect the character encoding.
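The code is as follows
// Determine which encoding the string uses by converting it
// UTF-8 -> GB2312 -> UTF-8 and comparing the result with the original.
if ($tag === mb_convert_encoding(mb_convert_encoding($tag, "GB2312", "UTF-8"), "UTF-8", "GB2312")) {
    // The round trip left the string unchanged: it is already UTF-8.
}
else { // Otherwise treat it as GB2312 and convert it to UTF-8.
    $tag = mb_convert_encoding($tag, 'UTF-8', 'GB2312');
}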
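The round-trip check above can be wrapped in a function so it is easier to reuse and test. This is only a minimal sketch: the function name gb2312_to_utf8_if_needed is hypothetical and not part of the original code, and the sample byte strings are the GB2312 and UTF-8 encodings of 中文.

```php
<?php
// Hypothetical helper (name not from the article) wrapping the round-trip check.
function gb2312_to_utf8_if_needed($tag)
{
    // UTF-8 -> GB2312 -> UTF-8: text that is already UTF-8 and representable
    // in GB2312 comes back byte-for-byte identical.
    $roundTrip = mb_convert_encoding(
        mb_convert_encoding($tag, 'GB2312', 'UTF-8'),
        'UTF-8',
        'GB2312'
    );
    if ($tag === $roundTrip) {
        return $tag; // already UTF-8
    }
    // Otherwise assume the bytes are GB2312 and convert them to UTF-8.
    return mb_convert_encoding($tag, 'UTF-8', 'GB2312');
}

$fromGb   = gb2312_to_utf8_if_needed("\xD6\xD0\xCE\xC4");         // GB2312 bytes of 中文
$fromUtf8 = gb2312_to_utf8_if_needed("\xE4\xB8\xAD\xE6\x96\x87"); // UTF-8 bytes of 中文

echo bin2hex($fromGb), "\n";
echo bin2hex($fromUtf8), "\n";
```

One limitation of this approach is that UTF-8 text containing characters outside the GB2312 repertoire does not survive the round trip, so it would be treated as GB2312 and converted incorrectly.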
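Take $keytitle = "%D0%BE%C6%AC"; for this value mb_detect_encoding() reports UTF-8. Once URL-decoded, the bytes D0 BE C6 AC, intended as GB2312 text, also happen to form two well-formed two-byte UTF-8 sequences, so the input is genuinely ambiguous. This is therefore not really a bug, but it shows that you should not rely too heavily on mb_detect_encoding() when writing programs. When the string is short, the detection result is likely to be wrong.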
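To see why such a short string is ambiguous, the sample value can be decoded and checked for UTF-8 well-formedness. This is a small illustration, assuming the value is meant to be URL-decoded before detection; mb_check_encoding() is used here only to show that the bytes pass as UTF-8, it is not the article's method.

```php
<?php
// The sample value, URL-decoded into the raw bytes D0 BE C6 AC.
$keytitle = urldecode("%D0%BE%C6%AC");

// D0 BE and C6 AC are each a well-formed two-byte UTF-8 sequence,
// so a validity check alone cannot rule UTF-8 out.
$validAsUtf8 = mb_check_encoding($keytitle, 'UTF-8');

echo bin2hex($keytitle), "\n"; // d0bec6ac
var_dump($validAsUtf8);        // bool(true)
```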
How to solve it? My solution is:
The code is as follows
$encode = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
The parameters are: the input value to be examined, the detection order of the candidate encodings (once one matches, the remaining ones are skipped), and the strict-mode flag.
Put the most likely encoding first in the detection order to reduce the chance of an incorrect conversion.
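The three parameters described above can be tried out as follows. This is a sketch only: the helper name detect_encoding_or_default and the fallback value are assumptions added for illustration, and the candidate order shown is just one possible choice.

```php
<?php
// Hypothetical helper (not from the article): detect with an explicit
// candidate order and strict mode, falling back to a default when
// mb_detect_encoding() finds no match and returns false.
function detect_encoding_or_default($text, array $order, $default = 'UTF-8')
{
    $encoding = mb_detect_encoding($text, $order, true); // third argument: strict mode
    return $encoding === false ? $default : $encoding;
}

// Candidate encodings, with the ones expected to be most likely listed first.
$order = array('ASCII', 'UTF-8', 'GB2312', 'GBK');

$detected = detect_encoding_or_default('hello', $order);
echo $detected, "\n";
```

Passing the order explicitly keeps the result independent of the server's mbstring configuration, which is why it is spelled out here.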
When the method above still gives the wrong result, the following function is another option.
Example 1
The code is as follows
// Returns true if $word contains three-byte UTF-8 sequences whose lead byte is
// in the range 228-233 (0xE4-0xE9), which covers most common Chinese characters,
// and false otherwise. This is a heuristic, not a full UTF-8 validity check.
function is_utf8($word)
{
    // One sequence: a lead byte 0xE4-0xE9 followed by two continuation bytes 0x80-0xBF.
    $seq = "[" . chr(228) . "-" . chr(233) . "][" . chr(128) . "-" . chr(191) . "]{2}";

    // Match one sequence at the start, one at the end, or two or more in a row anywhere.
    return preg_match("/^(" . $seq . ")/", $word) === 1
        || preg_match("/(" . $seq . ")$/", $word) === 1
        || preg_match("/(" . $seq . "){2,}/", $word) === 1;
} // function is_utf8
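A short usage example shows how the function above behaves on the problem case from earlier in the article. The function is restated here in compact form only so that the snippet runs on its own; the sample inputs are assumptions chosen for illustration.

```php
<?php
// Compact restatement of the article's is_utf8() so this example is self-contained.
function is_utf8($word)
{
    // Lead byte 0xE4-0xE9 followed by two continuation bytes 0x80-0xBF.
    $seq = "[" . chr(228) . "-" . chr(233) . "][" . chr(128) . "-" . chr(191) . "]{2}";
    return preg_match("/^(" . $seq . ")/", $word) === 1
        || preg_match("/(" . $seq . ")$/", $word) === 1
        || preg_match("/(" . $seq . "){2,}/", $word) === 1;
}

$utf8Chinese = "\xE4\xB8\xAD\xE6\x96\x87";     // UTF-8 bytes of 中文
$ambiguous   = urldecode("%D0%BE%C6%AC");      // the sample value from the article
$asciiOnly   = "hello";

var_dump(is_utf8($utf8Chinese)); // bool(true)
var_dump(is_utf8($ambiguous));   // bool(false)
var_dump(is_utf8($asciiOnly));   // bool(false)
```

Note that plain ASCII text also returns false, because the function only looks for three-byte sequences, so callers may want to handle ASCII-only input separately.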