Home > php教程 > php手册 > body text

PHP检测字符串是否为UTF8编码4种方法

WBOY
Release: 2016-05-26 08:20:03
Original
1405 people have browsed it

检测字符串编码可以有很多种方法,如利用ord获得字符的进制然后进入判断,或利用mb_detect_encoding函数来处理,下面整理了几种方法.

例子1,代码如下:

/** 
* 检测字符串是否为UTF8编码 
* @param string $str 被检测的字符串 
* @return boolean 
*/ 
function is_utf8($str){ 
    $len = strlen($str); 
    for($i = 0; $i < $len; $i++){ 
        $c = ord($str[$i]); 
        if ($c > 128) { 
            if (($c > 247)) return false; 
            elseif ($c > 239) $bytes = 4; 
            elseif ($c > 223) $bytes = 3; 
            elseif ($c > 191) $bytes = 2; 
            else return false; 
            if (($i + $bytes) > $len) return false; 
            while ($bytes > 1) { 
                $i++; 
                $b = ord($str[$i]); 
                if ($b < 128 || $b > 191) return false; 
                $bytes--; 
            } 
        } 
    } 
    return true; 
}
Copy after login

例子2,代码如下:

function is_utf8($string) {  
    return preg_match(&#39;%^(?:  
            [\x09\x0A\x0D\x20-\x7E]                 # ASCII  
        | [\xC2-\xDF][\x80-\xBF]                 # non-overlong 2-byte  
        |     \xE0[\xA0-\xBF][\x80-\xBF]             # excluding overlongs  
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}     # straight 3-byte  
        |     \xED[\x80-\x9F][\x80-\xBF]             # excluding surrogates  
        |     \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3  
        | [\xF1-\xF3][\x80-\xBF]{3}             # planes 4-15  
        |     \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16  
    )*$%xs&#39;, $string);
Copy after login

准确率基本和mb_detect_encoding()一样,要对一起对,要错一起错,编码检测不可能100%准确,这个东西已经可以基本满足要求了.

例子3,代码如下:

function mb_is_utf8($string)    
{    
    return mb_detect_encoding($string, &#39;UTF-8&#39;) === &#39;UTF-8&#39;;//新发现    
}
Copy after login

例子4,代码如下:

// Returns true if $string is valid UTF-8 and false otherwise.    
function is_utf8($word)    
{    
    if (preg_match("/^([".chr(228)."-".chr(233)."]{1}[".chr(128)."-".chr(191)."]{1}[".chr(128)."-".chr(191)."]{1}){1}/",$word) == true || preg_match("/([".chr(228)."-".chr(233)."]{1}[".chr(128)."-".chr(191)."]{1}[".chr(128)."-".chr(191)."]{1}){1}$/",$word) == true || preg_match("/([".chr(228)."-".chr(233)."]{1}[".chr(128)."-".chr(191)."]{1}[".chr(128)."-".chr(191)."]{1}){2,}/",$word) == true)    
    {    
        return true;    
    }    
    else    
    {    
        return false;    
    }    
} // function is_utf8
Copy after login


教程链接:

随意转载~但请保留教程地址★

Related labels:
source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Recommendations
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template