A brief discussion of the differences between and usage of PHP substr(), mb_substr() and mb_strcut(), and the truncate modifier in Smarty templates
PHP's substr() function can extract part of a string, but it often causes problems when the text contains Chinese characters, because it counts bytes and can cut a multi-byte character in half. In this case, you can use mb_substr() or mb_strcut() instead. They are used much like substr(), except that one more parameter can be added at the end to set the encoding of the string. However, many servers do not have the mbstring extension enabled. On Windows, you need to enable php_mbstring.dll in php.ini.
For example:
<?php
// The string means "This way my string will not be garbled^_^"
echo mb_substr('这样一来我的字符串就不会有乱码^_^', 0, 7, 'utf-8');
?>
Output: 这样一来我的字
<?php
echo mb_strcut('这样一来我的字符串就不会有乱码^_^', 0, 7, 'utf-8');
?>
Output: 这样
As the example above shows, mb_substr counts length in characters, while mb_strcut counts length in bytes, but neither of them produces half a character.
About the mbstring functions:
PHP's mbstring extension provides multi-byte character processing. Its most common use is truncating multi-byte Chinese text, which avoids half characters. Because it is a PHP extension, it also tends to perform better than custom multi-byte truncation functions written in PHP.
The mbstring extension provides two functions with similar purposes, mb_substr and mb_strcut. Their descriptions in the manual are as follows.
mb_substr
mb_substr() returns the portion of str specified by the start and length parameters.
mb_substr() performs multi-byte safe substr() operation based on number of characters. Position is counted from the beginning of str. First character's position is 0. Second character position is 1, and so on.
mb_strcut
mb_strcut() returns the portion of str specified by the start and length parameters.
mb_strcut() performs an operation equivalent to mb_substr(), but counts in bytes rather than characters. If the start position falls on the second or a later byte of a multi-byte character, it starts from the first byte of that character.
It extracts from str a string that is no longer than length bytes and that does not end in the middle of a multi-byte character or shift sequence.
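The start-position rule described above can be observed with a short test. This is a sketch under the assumption that mbstring is loaded and the file is saved as UTF-8; the sample string is chosen only for illustration.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$text = '我是中文';   // 4 characters, 12 bytes in UTF-8

// Byte offset 4 is the second byte of '是'.
// mb_strcut() moves the start back to byte 3, the first byte of that character.
$cut = mb_strcut($text, 4, 3, 'UTF-8');
echo $cut, "\n";   // 是

// substr() with the same arguments returns bytes 4 to 6 unchanged,
// which is not a valid UTF-8 sequence.
$raw = substr($text, 4, 3);
var_dump(mb_check_encoding($raw, 'UTF-8'));   // bool(false)
```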
As another example, here is a piece of text truncated with mb_substr and mb_strcut respectively:
<?php
// The string means "I am a relatively long string of Chinese-www.webjx.com"
$str = '我是一串比较长的中文-www.webjx.com';
echo "mb_substr:" . mb_substr($str, 0, 6, 'utf-8');
echo "\n";
echo "mb_strcut:" . mb_strcut($str, 0, 6, 'utf-8');
?>
The output is as follows:
mb_substr:我是一串比较 // truncated by characters; an English letter and a Chinese character each count as one character
mb_strcut:我是 // truncated by bytes; one Chinese character is 3 bytes in UTF-8
Note: In Smarty templates, you may use the truncate modifier to shorten strings. When the string contains Chinese characters, the result may be garbled if the modifier cuts by bytes, because in a variable-length encoding the length of a string is usually measured in characters rather than bytes. In that case, you can define your own variable modifier that truncates the string by number of characters. Method: 1. Use PHP's ord() function to read the value of each character's first byte, which in UTF-8 indicates how many bytes the character occupies. 2. Use substr() to cut the string at the resulting character boundary, so that the output is not garbled even when it contains Chinese characters.
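The two steps in the note above could look roughly like the following. This is only a sketch: the function name utf8_truncate_chars and its parameters are hypothetical, it is not part of Smarty, and it assumes the input is valid UTF-8. It uses only ord(), strlen() and substr(), so it does not depend on mbstring.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not a Smarty API).
// Truncates $string to at most $length characters, assuming valid UTF-8 input.
function utf8_truncate_chars($string, $length, $etc = '...')
{
    $bytes = strlen($string);
    $pos = 0;     // current byte offset
    $chars = 0;   // characters counted so far

    while ($pos < $bytes && $chars < $length) {
        $lead = ord($string[$pos]);   // value of the character's first byte
        if ($lead < 0x80) {
            $width = 1;               // ASCII
        } elseif ($lead >= 0xF0) {
            $width = 4;
        } elseif ($lead >= 0xE0) {
            $width = 3;               // most Chinese characters
        } elseif ($lead >= 0xC0) {
            $width = 2;
        } else {
            $width = 1;               // stray continuation byte in invalid input
        }
        $pos += $width;
        $chars++;
    }

    if ($pos >= $bytes) {
        return $string;               // nothing was cut off
    }
    // $pos is always a character boundary here, so substr() is safe.
    return substr($string, 0, $pos) . $etc;
}

echo utf8_truncate_chars('我是一串比较长的中文-www.webjx.com', 6), "\n";   // 我是一串比较...
```

To use such a function from a template, Smarty's plugin convention is, as far as I know, a function named smarty_modifier_<name> placed in a file modifier.<name>.php in the plugins directory; the wrapper itself is left out here.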