When you cut Chinese strings with PHP's built-in string functions, the result sometimes contains question marks or garbled characters. Below are a few examples that truncate Chinese strings accurately.
Handling such strings correctly in PHP comes down to two steps:
1. Determine whether the string is encoded in GBK or in Unicode (typically UTF-8).
2. Use the truncation method that matches that encoding.
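The first step can be sketched with the mbstring extension. This is a minimal heuristic, not a definitive detector: it assumes the input can only be UTF-8 or GBK and that mbstring is installed, and `guessEncoding` is a hypothetical name chosen for illustration. Short GBK byte sequences can occasionally also be valid UTF-8, so the result is a guess.

```php
<?php
// Hypothetical helper: assumes the only candidates are UTF-8 and GBK.
function guessEncoding(string $str): string
{
    // Anything that is not valid UTF-8 is treated as GBK.
    return mb_check_encoding($str, 'UTF-8') ? 'UTF-8' : 'GBK';
}

$utf8 = "\xE4\xB8\xAD\xE6\x96\x87"; // the two characters U+4E2D U+6587 as UTF-8 bytes
$gbk  = "\xD6\xD0\xCE\xC4";         // the same two characters as GBK bytes

echo guessEncoding($utf8), "\n";
echo guessEncoding($gbk), "\n";
```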
When we use substr to cut a Chinese string, we often get garbled characters. substr counts bytes, not characters, and a Chinese character is multi-byte: two bytes in GBK and GB2312, usually three bytes in UTF-8. If the cut falls in the middle of a character, the leftover bytes no longer form a valid character and are displayed as garbage.
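The byte-level effect can be shown with a short sketch. It assumes UTF-8 input and the mbstring extension; the string is written as byte escapes so the example does not depend on the encoding of the source file.

```php
<?php
// Four Chinese characters (U+4E2D U+6587 U+5B57 U+7B26) as UTF-8 bytes.
$str = "\xE4\xB8\xAD\xE6\x96\x87\xE5\xAD\x97\xE7\xAC\xA6";

// strlen counts bytes: 4 characters * 3 bytes each.
var_dump(strlen($str));

// Cutting 4 bytes keeps the first character plus one stray byte of the second.
$cut = substr($str, 0, 4);
var_dump(bin2hex($cut));

// The stray byte makes the result invalid UTF-8, which is what shows up as "?".
var_dump(mb_check_encoding($cut, 'UTF-8'));
```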
The solution is actually simple. Look at the truncation function below:
function curtStr($str,$len=30){
null means there is no value at all, whereas chr(0) is a character whose byte value is 0: 0x00 in hexadecimal, 00000000 in binary.
Although chr(0) does not display anything, it is still a character.
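The difference between null and chr(0) can be checked directly. The variable name `$nul` is only illustrative.

```php
<?php
$nul = chr(0);

var_dump($nul === null);  // false: it is a string, not null
var_dump($nul === '');    // false: it is not the empty string either
var_dump(strlen($nul));   // 1: it occupies one byte
var_dump(ord($nul));      // 0: the value of that byte
var_dump(bin2hex($nul));  // "00": the same byte in hexadecimal
```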
The code is as follows:
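<?php
// Extract a substring from a UTF-8 string.
// $from is the number of characters to skip, $len the number of characters to keep.
function utf8Substr($str, $from, $len)
{
    // A character is either one ASCII byte or a lead byte followed by continuation bytes.
    return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$from.'}'.
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$len.'}).*#s',
        '$1', $str);
}
?>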
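The same idea can also be sketched with PCRE's `u` modifier, which makes `.` match one whole UTF-8 character instead of one byte. This is an alternative under stated assumptions, not the function above: `utf8Cut` is a hypothetical name, and the input is assumed to be valid UTF-8 (for invalid input `preg_match` fails and the sketch returns an empty string).

```php
<?php
// Hypothetical helper: skip $from characters, then keep up to $len characters.
function utf8Cut(string $str, int $from, int $len): string
{
    $from = max(0, $from);
    $len  = max(0, $len);
    // "u" treats the subject as UTF-8, "s" lets "." match newlines too.
    $pattern = '/^.{0,' . $from . '}(.{0,' . $len . '})/us';
    if (preg_match($pattern, $str, $m) === 1) {
        return $m[1];
    }
    return ''; // no match, e.g. the input was not valid UTF-8
}

// "abcd" followed by U+4E2D U+6587 U+5B57 U+7B26, written as UTF-8 bytes.
$s = "abcd\xE4\xB8\xAD\xE6\x96\x87\xE5\xAD\x97\xE7\xAC\xA6";
echo utf8Cut($s, 4, 2), "\n"; // the first two Chinese characters
echo utf8Cut($s, 0, 5), "\n"; // "abcd" plus the first Chinese character
```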
A Chinese substring function that supports both UTF-8 and GB2312
The code is as follows:
<?php
/*
Chinese substring function that supports both UTF-8 and GB2312
cut_str(string, cut length, start position, encoding);
The encoding defaults to UTF-8
The start position defaults to 0
*/
function cut_str($string, $sublen, $start = 0, $code = 'UTF-8')
{
    if ($code == 'UTF-8') {
        // Split the string into individual UTF-8 characters (1 to 4 bytes each).
        $pa = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|\xe0[\xa0-\xbf][\x80-\xbf]|[\xe1-\xef][\x80-\xbf][\x80-\xbf]|\xf0[\x90-\xbf][\x80-\xbf][\x80-\xbf]|[\xf1-\xf7][\x80-\xbf][\x80-\xbf][\x80-\xbf]/";
        preg_match_all($pa, $string, $t_string);
        // Append "..." only when characters remain after the cut.
        if (count($t_string[0]) - $start > $sublen) {
            return join('', array_slice($t_string[0], $start, $sublen))."...";
        }
        return join('', array_slice($t_string[0], $start, $sublen));
    } else {
        // GB2312: lengths are doubled because each Chinese character takes two bytes.
        $start = $start * 2;
        $sublen = $sublen * 2;
        $strlen = strlen($string);
        $tmpstr = '';
        for ($i = 0; $i < $strlen; $i++) {
            if ($i >= $start && $i < ($start + $sublen)) {
                if (ord(substr($string, $i, 1)) > 129) {
                    // Lead byte of a double-byte character: copy both bytes.
                    $tmpstr .= substr($string, $i, 2);
                } else {
                    $tmpstr .= substr($string, $i, 1);
                }
            }
            // Skip the second byte of a double-byte character.
            if (ord(substr($string, $i, 1)) > 129) $i++;
        }
        if (strlen($tmpstr) < $strlen) $tmpstr .= "...";
        return $tmpstr;
    }
}

$str = "abcd string that needs to be intercepted";
echo cut_str($str, 8, 0, 'gb2312');
?>
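For comparison, here is a minimal sketch of the same interface built on the mbstring extension, which accepts the encoding as a parameter. It assumes mbstring is installed; `mbCutStr` is a hypothetical name, and unlike the GB2312 branch above it counts characters rather than byte pairs in every encoding.

```php
<?php
// Hypothetical helper with the same parameter order as cut_str.
function mbCutStr(string $string, int $sublen, int $start = 0, string $code = 'UTF-8'): string
{
    $cut = mb_substr($string, $start, $sublen, $code);
    // Append an ellipsis only if characters remain after the cut.
    if (mb_strlen($string, $code) - $start > $sublen) {
        $cut .= '...';
    }
    return $cut;
}

// "abcd" followed by U+4E2D U+6587 U+5B57 U+7B26, written as UTF-8 bytes.
$utf8 = "abcd\xE4\xB8\xAD\xE6\x96\x87\xE5\xAD\x97\xE7\xAC\xA6";
echo mbCutStr($utf8, 6), "\n"; // "abcd", two Chinese characters, then "..."

// The same text converted to GBK, cut with the matching encoding name.
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');
echo bin2hex(mbCutStr($gbk, 5, 0, 'GBK')), "\n";
```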