Detecting character encoding is a common task. In particular, I often need to determine the encoding of text that a user has entered or submitted so that it can be processed correctly. Below I introduce PHP's functions for detecting string encoding.
mb_detect_encoding($str);
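// Determine the encoding of the string by round-tripping it through GB2312.
// If the result is unchanged, $tag is already UTF-8 and nothing needs to be done.
if ($tag !== mb_convert_encoding(mb_convert_encoding($tag, 'GB2312', 'UTF-8'), 'UTF-8', 'GB2312')) {
    // Otherwise treat it as GB2312 and convert it to UTF-8
    $tag = mb_convert_encoding($tag, 'UTF-8', 'GB2312');
}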
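mb_detect_encoding() can also be given an explicit list of candidate encodings instead of relying on the round trip. The following is only a minimal sketch, assuming the input is either UTF-8 or GB2312 and that mbstring is enabled; the helper name to_utf8 is hypothetical. Detection from raw bytes is a heuristic, so short or ambiguous input may still be misclassified.

```php
<?php
// Hypothetical helper, not part of mbstring.
// Assumption: the input is either UTF-8 or GB2312.
function to_utf8(string $str): string
{
    // The third argument enables strict mode, which rejects a candidate
    // when the string contains byte sequences invalid in that encoding.
    $encoding = mb_detect_encoding($str, ['UTF-8', 'GB2312'], true);

    if ($encoding === false || $encoding === 'UTF-8') {
        // Undetectable or already UTF-8: return the string unchanged.
        return $str;
    }

    // Pass the detected encoding name straight back to mbstring.
    return mb_convert_encoding($str, 'UTF-8', $encoding);
}
```

Listing UTF-8 first means plain ASCII input is reported as UTF-8 and returned untouched.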
These functions can detect the encoding, but to use them you must enable PHP's mbstring extension (extension=php_mbstring.dll in php.ini). If you are on hosting where you do not have permission to modify php.ini, is there another way to check the string encoding? Yes, there is.
Determine whether the string is UTF-8 encoded
/**
 * Check whether the string is UTF-8 encoded
 *
 * @param string $string The string to check
 * @return bool
 */
function is_utf8($string)
{
    // preg_match() returns 1 on a match; compare so the result is a boolean
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $string) === 1;
}
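A shorter alternative that also avoids mbstring is to let PCRE validate the string itself: with the u modifier, preg_match() treats the subject as UTF-8 and fails on invalid byte sequences. This is a minimal sketch rather than a drop-in replacement; the function name is_valid_utf8 is hypothetical.

```php
<?php
// Hypothetical helper: validate UTF-8 using PCRE's own check.
function is_valid_utf8(string $string): bool
{
    // An empty pattern matches any subject. With the "u" modifier,
    // preg_match() returns false instead of 1 when the subject
    // is not well-formed UTF-8.
    return preg_match('//u', $string) === 1;
}
```

Unlike is_utf8(), this accepts any well-formed UTF-8, including ASCII control characters.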
Check whether a string is GB2312 or UTF-8
function is_gb2312($str)
{
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $v = ord($str[$i]);
        if ($v > 127) {
            // 228-233 is the lead-byte range of common 3-byte UTF-8 sequences
            if (($v >= 228) && ($v <= 233)) {
                // Not enough bytes left for a 3-byte UTF-8 sequence
                if (($i + 2) >= $len) return true;
                $v1 = ord($str[$i + 1]);
                $v2 = ord($str[$i + 2]);
                // Two continuation bytes follow: UTF-8 encoding
                if (($v1 >= 128) && ($v1 <= 191) && ($v2 >= 128) && ($v2 <= 191))
                    return false;
                else
                    return true;
            }
        }
    }
    return true;
}
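The byte ranges tested in is_gb2312() come from the shape of a three-byte UTF-8 sequence: a lead byte of 228 to 233 (0xE4 to 0xE9) followed by two bytes in 128 to 191 (0x80 to 0xBF) encodes a code point between U+4000 and U+9FFF, which covers the common CJK ideographs. The sketch below simply prints the bytes of one character in both encodings to make that visible; the helper name byte_values is hypothetical.

```php
<?php
// "中" (U+4E2D) written out byte by byte in each encoding.
$utf8   = "\xE4\xB8\xAD"; // UTF-8: three bytes
$gb2312 = "\xD6\xD0";     // GB2312: two bytes

// Hypothetical helper for illustration: list byte values in decimal.
function byte_values(string $s): string
{
    return implode(' ', array_map('ord', str_split($s)));
}

echo byte_values($utf8), "\n";   // 228 184 173
echo byte_values($gb2312), "\n"; // 214 208
```

Because GB2312 lead and trail bytes can also fall inside these ranges, the check remains a heuristic: some GB2312 byte sequences have the same shape as a UTF-8 character and will be reported as UTF-8.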
Some readers say the mb_check_encoding function can also be used for this check. I have not tested it myself, so you may want to try it on your own.
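For readers who want to try it, here is a minimal sketch, assuming mbstring is enabled. mb_check_encoding() returns true when the bytes are valid in the named encoding; it confirms validity rather than identifying which encoding was intended.

```php
<?php
$utf8   = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" in UTF-8
$gb2312 = "\xD6\xD0\xCE\xC4";         // "中文" in GB2312

var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($gb2312, 'UTF-8')); // bool(false)
```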