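<?php
// A handy truncation function for double-byte (GBK/GB2312) Chinese text.
// Note: $length is used as an end byte offset, not as a count. Bytes are
// copied from offset $start until the offset reaches $length, and a
// double-byte character is always copied whole.
function substring($str, $start, $length) {
    $len = $length;
    if ($length < 0) {
        // Negative length: work on the reversed string, then reverse back.
        $str = strrev($str);
        $len = -$length;
    }
    // Never read past the end of the string.
    $len = ($len < strlen($str)) ? $len : strlen($str);
    $tmpstr = "";
    for ($i = $start; $i < $len; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a double-byte character: copy both bytes together.
            $tmpstr .= substr($str, $i, 2);
            $i++;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    if ($length < 0) $tmpstr = strrev($tmpstr);
    return $tmpstr;
}
?>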
Usage example:
-
-
$str1 = 'I am a relatively long string of Chinese without English'; - $str2 = 'I am a relatively long string of Chinese with yingwen';
$len = strlen($str1);
- echo '
'.$len; //return 28
$len = strlen ($str2);
- echo '
'.$len; //return 29
echo ' ';
- echo substring($str1, 0 , 11);
- echo '
';
- echo substring($str2, 0, 11);
- echo '
';
- echo substring($str1, 16, 28);
- echo '
';
- echo substring($str2, 16, 29);
- ?>
-
-
Copy the code
The result shows:
28
29
I am a string of comparisons
I am a string of comparisons
Chinese without English
Chinese with yingwen
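The lengths 28 and 29 above are byte counts, not character counts: strlen() counts bytes, and GBK stores each Chinese character in two bytes. A minimal sketch of the difference, using hex escapes so the result does not depend on how the file is saved (the variable names are illustrative and not part of the original code):

```php
<?php
// "中文" written as raw bytes in two encodings. Hex escapes are used so
// the encoding of this source file does not matter.
$gbk  = "\xd6\xd0\xce\xc4";          // GBK/GB2312: 2 bytes per character
$utf8 = "\xe4\xb8\xad\xe6\x96\x87";  // UTF-8: 3 bytes per character

echo strlen($gbk), "\n";   // 4
echo strlen($utf8), "\n";  // 6
```

This is why the same text needs different offsets depending on the encoding it is stored in.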
This function is handy for shortening a long file name, for example. If you want to keep both ends of the name and put "..." in the middle, you can do it like this:
<?php
function formatName($str, $size) {
    $len = strlen($str);
    if ($len > $size) {
        // Whole bytes only, so no fractional offset is passed on to substr().
        $half = (int) ($size / 2);
        $part1 = substring($str, 0, $half);           // head of the name
        $part2 = substring($str, $len - $half, $len); // tail of the name
        return $part1 . "..." . $part2;
    } else {
        return $str;
    }
}
?>
I also came across a very simple way to truncate Chinese text on the Internet. In my tests it worked well:
<?php
echo substr($str1, 0, 10) . chr(0);
?>
How it works:
chr(0) is not null.
null means "nothing", whereas chr(0) is a character whose value is 0: 0x00 in hexadecimal, 00000000 in binary.
Although chr(0) displays nothing, it is still a character.
When a Chinese character is cut in half, the decoder follows the encoding rules and pulls in whatever byte comes next, interpreting the pair as one Chinese character. That is why garbled characters appear. A byte in the range 0x81 to 0xff followed by 0x00, however, is always displayed as "empty".
So appending chr(0) to the result of substr prevents the garbled characters.
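To make the problem concrete, the sketch below cuts a GBK string in the middle of a character and compares two outcomes: padding with chr(0) as described above, and trimming the dangling byte instead. The helper name trimDanglingByte is hypothetical, and it assumes, as substring() does, that every byte above 0xa0 belongs to a two-byte character.

```php
<?php
// "中文" in GBK as raw bytes: 中 = D6 D0, 文 = CE C4.
$gbk = "\xd6\xd0\xce\xc4";

// Cutting at 3 bytes splits the second character.
$cut = substr($gbk, 0, 3);
echo bin2hex($cut), "\n";            // d6d0ce

// Hypothetical helper: walk the bytes and drop a lead byte that has
// lost its partner at the end of the string.
function trimDanglingByte($bytes) {
    $len = strlen($bytes);
    $i = 0;
    while ($i < $len) {
        // Assumption: a byte above 0xa0 starts a two-byte character.
        $width = (ord($bytes[$i]) > 0xa0) ? 2 : 1;
        if ($i + $width > $len) {
            return substr($bytes, 0, $i); // last character is incomplete
        }
        $i += $width;
    }
    return $bytes;
}

echo bin2hex($cut . chr(0)), "\n";           // d6d0ce00 (padded)
echo bin2hex(trimDanglingByte($cut)), "\n";  // d6d0 (trimmed)
```

Trimming leaves only complete characters in the output, so the result does not depend on how a viewer renders a lead byte followed by 0x00.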
20120705 update:
The method above works well, but garbled characters still turn up occasionally, and I have not yet tracked down the cause. The following method can be used instead; it has been tried and tested on UTF-8 text.
Note: this method counts a Chinese character as 1 unit of length and an English letter as 1 unit as well, so choose the truncation length accordingly.
How to calculate length:
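<?php
// Count the characters in a UTF-8 string. Each character counts as 1,
// whether it takes one, two, or three bytes.
function strlen_UTF8($str)
{
    $len = strlen($str);
    $n = 0;
    for ($i = 0; $i < $len; $i++) {
        $x = substr($str, $i, 1);
        // Byte value as an 8-digit binary string, e.g. "11100100".
        $a = str_pad(decbin(ord($x)), 8, '0', STR_PAD_LEFT);
        if (substr($a, 0, 1) === '0') {
            // 0xxxxxxx: single-byte (ASCII) character, nothing to skip.
        } elseif (substr($a, 0, 3) === '110') {
            $i += 1; // 110xxxxx: two-byte character, skip 1 continuation byte
        } elseif (substr($a, 0, 4) === '1110') {
            $i += 2; // 1110xxxx: three-byte character, skip 2 continuation bytes
        }
        $n++;
    }
    return $n;
} // End strlen_UTF8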
// String truncation function: collects up to $lenth characters of a
// UTF-8 string, starting at character index $start.
function subString_UTF8($str, $start, $lenth)
{
    $len = strlen($str);
    $r = array();
    $n = 0; // characters skipped so far
    $m = 0; // characters collected so far
    for ($i = 0; $i < $len; $i++) {
        $x = substr($str, $i, 1);
        $a = str_pad(decbin(ord($x)), 8, '0', STR_PAD_LEFT);
        if ($n < $start) {
            // Still before $start: skip this character.
            if (substr($a, 0, 1) === '0') {
            } elseif (substr($a, 0, 3) === '110') {
                $i += 1;
            } elseif (substr($a, 0, 4) === '1110') {
                $i += 2;
            }
            $n++;
        } else {
            // At or past $start: collect the whole character.
            if (substr($a, 0, 1) === '0') {
                $r[] = substr($str, $i, 1);
            } elseif (substr($a, 0, 3) === '110') {
                $r[] = substr($str, $i, 2);
                $i += 1;
            } elseif (substr($a, 0, 4) === '1110') {
                $r[] = substr($str, $i, 3);
                $i += 2;
            } else {
                $r[] = ''; // unexpected byte: contributes nothing
            }
            if (++$m >= $lenth) {
                break;
            }
        }
    }
    return join($r);
} // End subString_UTF8
// Usage is the same as before. For example, formatName can be implemented
// as follows (with a small adjustment for the length of Chinese characters):
function formatName($str, $size) {
    $len = strlen_UTF8($str); // length in characters
    $one_len = strlen($str);  // length in bytes
    if ($one_len == 0) {
        return $str;          // empty string: avoid dividing by zero below
    }
    // Scale the limit: it shrinks as the share of multi-byte characters grows.
    $size = $size * 1.5 * $len / $one_len;
    if ($len > $size) {
        $part1 = subString_UTF8($str, 0, $size / 2);
        $part2 = subString_UTF8($str, $len - ($size / 2), $len);
        return $part1 . "..." . $part2;
    } else {
        return $str;
    }
}
?>