PHP string truncation functions compatible with GBK, UTF-8, and other encodings

PHP中文网
Release: 2023-02-28 18:34:01

PHP's substr function works on bytes, so it only truncates plain ASCII (English) text cleanly. If the string contains Chinese characters, a multibyte character can be cut in the middle, which produces garbled output. Below are two string truncation functions that avoid this: the first handles several encodings (UTF-8, GB2312, GBK, Big5), and the second handles UTF-8.
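To see the problem concretely, the following minimal sketch compares byte-based substr with character-aware mb_substr on a UTF-8 string. It assumes the mbstring extension is loaded and that the file is saved as UTF-8; the sample string is made up for illustration.

```php
<?php
// Assumes the mbstring extension is available and this file is saved as UTF-8.
$text = "PHP中文网"; // 3 ASCII bytes + 3 Chinese characters (3 bytes each in UTF-8)

// substr() counts bytes: 4 bytes = "PHP" plus only the first byte of "中".
$byBytes = substr($text, 0, 4);

// mb_substr() counts characters: 4 characters = "PHP中".
$byChars = mb_substr($text, 0, 4, "UTF-8");

var_dump(strlen($byBytes));                     // int(4)
var_dump(mb_check_encoding($byBytes, "UTF-8")); // bool(false): ends in a partial character
var_dump($byChars === "PHP中");                 // bool(true)
```

The byte-based result ends with an incomplete character, which is what shows up as garbled text in a browser.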

Example 1

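function CsubStrPro($str, $start, $length, $charset = "utf-8", $suffix = false)
{
    if (function_exists("mb_substr")) {
        // Preferred path: let the mbstring extension slice by characters.
        $slice = mb_substr($str, $start, $length, $charset);
    } else {
        // Fallback: split the string into characters with a per-encoding pattern.
        $re['utf-8'] = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
        $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
        $re['gbk'] = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
        $re['big5'] = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
        // The pattern keys are lowercase, so normalize the charset name first.
        preg_match_all($re[strtolower($charset)], $str, $match);
        $slice = join("", array_slice($match[0], $start, $length));
    }
    // Append the ellipsis when requested, whichever branch produced the slice.
    if ($suffix)
        return $slice . "…";
    return $slice;
}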
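When mbstring is unavailable, the fallback in Example 1 splits the string into characters with a per-encoding pattern and then slices the resulting array. The sketch below shows that idea for GBK alone; the helper name gbk_slice and the sample bytes are illustrative assumptions, not part of the original function.

```php
<?php
// Hypothetical helper: slice a GBK-encoded string by characters using a regex.
function gbk_slice($str, $start, $length)
{
    // One ASCII byte, or a GBK lead byte (0x81-0xFE) followed by a trail byte (0x40-0xFE).
    $pattern = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
    preg_match_all($pattern, $str, $match);
    return join('', array_slice($match[0], $start, $length));
}

// "中文" in GBK is D6 D0 CE C4; written as raw bytes so the file encoding does not matter.
$gbk = "ab\xD6\xD0\xCE\xC4";

// Skip one character, take two: "b" plus the first Chinese character.
var_dump(bin2hex(gbk_slice($gbk, 1, 2))); // string(6) "62d6d0"
```

Because the pattern runs without the u modifier, it matches raw bytes, which is what makes it usable for non-UTF-8 encodings.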

Example 2

function subString_UTF8($str, $start, $lenth)
{
    $len = strlen($str);
    $r = array();
    $n = 0; // characters skipped so far
    $m = 0; // characters collected so far
    for ($i = 0; $i < $len; $i++) {
        // Work out how many bytes this character occupies from its lead byte.
        $byte = ord($str[$i]);
        if ($byte < 0x80) {
            $size = 1; // 0xxxxxxx: ASCII
        } elseif (($byte & 0xE0) == 0xC0) {
            $size = 2; // 110xxxxx
        } elseif (($byte & 0xF0) == 0xE0) {
            $size = 3; // 1110xxxx
        } elseif (($byte & 0xF8) == 0xF0) {
            $size = 4; // 11110xxx
        } else {
            $size = 0; // stray continuation byte or invalid lead byte
        }
        if ($n < $start) {
            // Still before the start offset: count the character and move on.
            $n++;
        } else {
            $r[] = $size > 0 ? substr($str, $i, $size) : '';
            if (++$m >= $lenth) {
                break;
            }
        }
        // Jump over the remaining bytes of a multibyte character.
        if ($size > 1) {
            $i += $size - 1;
        }
    }
    return $r;
} // End subString_UTF8

Because the function in Example 2 returns an array, combine it with join to get a string:

join('', subString_UTF8($str, $start, $lenth));

When displaying the result on a page, you can also append "..." after this statement.
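One way to add the trailing "..." only when text was actually cut off is sketched below. The helper name truncate_with_ellipsis is hypothetical, and the sketch assumes the mbstring extension and UTF-8 input rather than the array-returning function from Example 2.

```php
<?php
// Hypothetical helper; assumes mbstring is loaded and $str is UTF-8.
function truncate_with_ellipsis($str, $length, $suffix = "...")
{
    // Leave short strings untouched so they do not gain a misleading suffix.
    if (mb_strlen($str, "UTF-8") <= $length) {
        return $str;
    }
    return mb_substr($str, 0, $length, "UTF-8") . $suffix;
}

var_dump(truncate_with_ellipsis("PHP中文网", 4) === "PHP中..."); // bool(true)
var_dump(truncate_with_ellipsis("PHP", 4) === "PHP");            // bool(true)
```

Checking the length first avoids showing "..." after a string that was short enough to display in full.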

source:php.cn