PHP's substr() function extracts part of a string, but it counts in bytes, so text that contains Chinese characters often comes out garbled because a multi-byte character gets cut in the middle. In that case, use mb_substr() or mb_strcut() instead. Both are used much like substr(), with one extra parameter at the end that sets the encoding of the string. However, many servers do not have php_mbstring.dll enabled by default, so you may need to enable it in php.ini first.
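To make the problem concrete, here is a minimal sketch of how a byte-based cut differs from a character-based one. The sample string '中文abc' is made up for illustration, and the snippet assumes the file is saved as UTF-8.

```php
<?php
// Sketch only: '中文abc' is a made-up sample; the file is assumed to be UTF-8.
if (!extension_loaded('mbstring')) {
    exit("mbstring is not enabled; enable it in php.ini first\n");
}

$text = '中文abc'; // two 3-byte characters followed by three ASCII bytes

$broken = substr($text, 0, 4);             // 4 bytes: '中' plus the first byte of '文'
$safe   = mb_substr($text, 0, 2, 'UTF-8'); // 2 characters: '中文'

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false): ends in a partial character
var_dump(mb_check_encoding($safe, 'UTF-8'));   // bool(true)
echo $safe, "\n";                              // 中文
```

The extension_loaded() check is a quick way to confirm that mbstring is available before relying on the mb_* functions.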
For example:
<?php echo mb_substr('这样一来我的字符串就不会有乱码^_^', 0, 7, 'utf-8'); ?>
Output: 这样一来我的字
<?php echo mb_strcut('这样一来我的字符串就不会有乱码^_^', 0, 7, 'utf-8'); ?>
Output: 这样
As the example above shows, mb_substr counts in characters, while mb_strcut counts in bytes but never returns half a character. Each of these Chinese characters occupies 3 bytes in UTF-8, so a 7-byte limit yields only the two complete characters that fit into 6 bytes.
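The byte arithmetic can be checked directly. The following sketch reuses the string from the example and compares character and byte counts of both results; the variable names are illustrative, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8; variable names are illustrative.
$str = '这样一来我的字符串就不会有乱码^_^';

$byChars = mb_substr($str, 0, 7, 'UTF-8'); // 7 characters
$byBytes = mb_strcut($str, 0, 7, 'UTF-8'); // at most 7 bytes, whole characters only

// strlen() counts bytes, mb_strlen() counts characters.
printf("%s: %d chars, %d bytes\n", $byChars, mb_strlen($byChars, 'UTF-8'), strlen($byChars));
printf("%s: %d chars, %d bytes\n", $byBytes, mb_strlen($byBytes, 'UTF-8'), strlen($byBytes));
// 这样一来我的字: 7 chars, 21 bytes
// 这样: 2 chars, 6 bytes
```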
mbstring function description:
PHP's mbstring extension provides multi-byte string handling. Its most common use is splitting multi-byte Chinese text without producing half characters. Because it is a compiled PHP extension, it generally performs better than multi-byte splitting functions written in PHP itself.
The mbstring extension provides two functions with similar purposes, mb_substr and mb_strcut. The manual describes them as follows.
mb_substr
mb_substr() returns the portion of str specified by the start and length parameters.
mb_substr() performs a multi-byte safe substr() operation based on the number of characters. Position is counted from the beginning of str. The first character's position is 0, the second character's position is 1, and so on.
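A short sketch of the position counting described above. The sample string is made up, and passing null as the length (to take everything up to the end of the string) is an addition not mentioned in the text.

```php
<?php
// Assumes this file is saved as UTF-8; $s is a made-up sample string.
$s = '中文abc';

$first  = mb_substr($s, 0, 1, 'UTF-8');    // position 0
$second = mb_substr($s, 1, 1, 'UTF-8');    // position 1
$middle = mb_substr($s, 1, 3, 'UTF-8');    // 3 characters starting at position 1
$rest   = mb_substr($s, 2, null, 'UTF-8'); // null length: everything from position 2 on

echo implode(' | ', [$first, $second, $middle, $rest]), "\n"; // 中 | 文 | 文ab | abc
```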
mb_strcut
mb_strcut() returns the portion of str specified by the start and length parameters.
mb_strcut() performs an operation equivalent to mb_substr(), but counts start and length in bytes rather than characters. If the start position falls on the second or a later byte of a multi-byte character, extraction begins at the first byte of that character.
It extracts a substring of str that is no longer than length bytes and that does not end in the middle of a multi-byte character or shift sequence.
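The start-position rule can be illustrated with a small sketch that compares mb_strcut() against plain substr() on the same byte offsets. The sample string is made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumes this file is saved as UTF-8; $s is a made-up sample string.
$s = '中文abc'; // '中' occupies bytes 0-2, '文' occupies bytes 3-5

// Byte offset 1 is the second byte of '中'.
$raw = substr($s, 1, 3);             // cuts blindly: two bytes of '中' plus one byte of '文'
$cut = mb_strcut($s, 1, 3, 'UTF-8'); // start moves back to byte 0, result keeps whole characters

var_dump(mb_check_encoding($raw, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($cut, 'UTF-8')); // bool(true)
echo $cut, "\n";                            // 中
```

This behaviour makes mb_strcut() a reasonable fit when the limit is a byte budget, such as a fixed-size storage field, while mb_substr() fits limits expressed in visible characters.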
As another example, the same piece of text is cut with mb_substr and mb_strcut respectively:
<?php
$str = '我是一串比较长的中文-www.webjx.com';
// Take the first 6 characters
echo "mb_substr:" . mb_substr($str, 0, 6, 'utf-8');
echo "<br>";
// Take at most the first 6 bytes, keeping whole characters only
echo "mb_strcut:" . mb_strcut($str, 0, 6, 'utf-8');
?>
The output is as follows:
mb_substr:我是一串比较
mb_strcut:我是