
PHP function for truncating Chinese strings: utf_substr

WBOY
Release: 2016-07-25 08:58:44
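    // Replace every Chinese character (U+4E00 to U+9FA5) with '<@>'.
    // The /u modifier makes PCRE treat the pattern and the subject as UTF-8.
    $tmp = preg_replace('/[一-龥]/u', '<@>', 'Hello who am I? 123abc');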
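The character class in the snippet above spells out a code point range by hand. As a rough sketch of an alternative, PCRE in UTF-8 mode also accepts Unicode script properties, so `\p{Han}` can stand in for the range (it covers somewhat more than U+4E00 to U+9FA5). The helper name `count_han_chars` and the sample string are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper: count Han (Chinese) characters in a UTF-8 string.
// Assumes valid UTF-8 input; preg_match_all() returns false on invalid input.
function count_han_chars($text) {
    // \p{Han} matches any character of the Han script; /u enables UTF-8 mode.
    $count = preg_match_all('/\p{Han}/u', $text, $matches);
    return $count === false ? 0 : $count;
}

// Same replacement idea as the snippet above, using the script property.
$masked = preg_replace('/\p{Han}/u', '<@>', '你好 abc');
echo $masked, "\n";                     // <@><@> abc
echo count_han_chars('你好 abc'), "\n"; // 2
```

Whether the explicit range or the script property fits better depends on which characters should count as Chinese for the application.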

Code 1: truncating a UTF-8 string without cutting a character in half

    <?php
    /**
     * Truncate a UTF-8 string without cutting a multibyte character in half. utf_substr
     * ASCII letters and digits (half-width) take 1 byte and count as 1 unit of $len;
     * Chinese (full-width) characters take 3 bytes and count as 2 units.
     *
     * @param string $str Source string
     * @param int    $len Length of the left-hand substring, in half-width units
     * @return string The truncated string; the entire string when $len <= 0
     * @edit bbs.it-home.org
     */
    function utf_substr($str, $len) {
        if ($len <= 0) {
            return $str; // documented behavior: non-positive length returns everything
        }
        $new_str = array(); // initialize so join() always receives an array
        for ($i = 0; $i < $len && $str != ''; $i++) {
            $temp_str = substr($str, 0, 1);
            if (ord($temp_str) > 127) {
                // Non-ASCII lead byte: assumed to start a 3-byte character (2 units wide)
                $i++;
                if ($i < $len) {
                    $new_str[] = substr($str, 0, 3);
                    $str = substr($str, 3);
                }
            } else {
                $new_str[] = substr($str, 0, 1);
                $str = substr($str, 1);
            }
        }
        return join($new_str);
    }

    // Calling example
    $str = utf_substr('Hello', 4);
    echo $str; // prints: Hell

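Code 1 assumes that every non-ASCII character occupies exactly 3 bytes, which holds for common Chinese characters but not for 2-byte or 4-byte characters. The following is a minimal sketch of the same width rule (ASCII counts 1, anything else counts 2) that lets PCRE split the string into whole characters instead. The function name `utf_substr_by_width` is hypothetical, and the sketch assumes the input is valid UTF-8.

```php
<?php
// Hypothetical variant of utf_substr(): same width rule, but characters are
// split by PCRE in UTF-8 mode, so 2-byte and 4-byte characters stay whole.
function utf_substr_by_width($str, $len) {
    if ($len <= 0) {
        return $str; // mirror the documented behavior of utf_substr()
    }
    // Split into whole characters; preg_split() returns false on invalid UTF-8.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return '';
    }
    $out = '';
    $width = 0;
    foreach ($chars as $ch) {
        $w = (strlen($ch) === 1) ? 1 : 2; // a single byte means ASCII
        if ($width + $w > $len) {
            break; // the next character would exceed the requested width
        }
        $out .= $ch;
        $width += $w;
    }
    return $out;
}

echo utf_substr_by_width('你好abc', 5), "\n"; // 你好a
```

Where the mbstring extension is available, its built-in functions are another option; this sketch relies only on PCRE, which ships with PHP.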

Code 2: a function for truncating UTF-8 strings by display width

    <?php
    /**
     * Truncate a UTF-8 string by display width
     * edit bbs.it-home.org
     */
    function cut_str($sourcestr, $cutlength) {
        $returnstr = '';
        $i = 0; // byte offset into $sourcestr
        $n = 0; // accumulated display width
        $str_length = strlen($sourcestr); // number of bytes in the string
        while (($n < $cutlength) and ($i < $str_length)) {
            $temp_str = substr($sourcestr, $i, 1);
            $ascnum = ord($temp_str); // value of the byte at offset $i
            if ($ascnum >= 224) { // lead byte >= 224 (0xE0): treated as a 3-byte character (4-byte characters are not handled)
                $returnstr = $returnstr . substr($sourcestr, $i, 3); // UTF-8 stores this character in 3 consecutive bytes
                $i = $i + 3; // advance 3 bytes
                $n++; // width counts as 1
            } elseif ($ascnum >= 192) { // lead byte >= 192 (0xC0): 2-byte character
                $returnstr = $returnstr . substr($sourcestr, $i, 2); // UTF-8 stores this character in 2 consecutive bytes
                $i = $i + 2; // advance 2 bytes
                $n++; // width counts as 1
            } elseif ($ascnum >= 65 && $ascnum <= 90) { // uppercase letter
                $returnstr = $returnstr . substr($sourcestr, $i, 1);
                $i = $i + 1; // still 1 byte
                $n++; // but for visual balance an uppercase letter counts as one full-width character
            } else { // everything else, including lowercase letters and half-width punctuation
                $returnstr = $returnstr . substr($sourcestr, $i, 1);
                $i = $i + 1; // advance 1 byte
                $n = $n + 0.5; // counts as half the width of a full-width character
            }
        }
        if ($i < $str_length) {
            $returnstr = $returnstr . "..."; // append an ellipsis only when the string was actually cut
        }
        return $returnstr;
    }

    // Calling example
    $str = "Hello! I'm good";
    $str = cut_str($str, 3);
    echo $str; // prints: Hello...

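The branches in Code 2 depend on the UTF-8 rule that the first byte of a character determines how many bytes it occupies. A small sketch of that mapping follows, including the 4-byte case that `cut_str` does not distinguish. The helper name `utf8_char_bytes` is hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper: number of bytes in the UTF-8 character that starts
// with the first byte of $lead. Returns 0 for a continuation byte
// (0x80-0xBF) or a byte that can never start a valid character.
function utf8_char_bytes($lead) {
    $b = ord($lead); // ord() looks only at the first byte of the string
    if ($b < 0x80) {
        return 1; // 0xxxxxxx: ASCII
    }
    if ($b >= 0xC2 && $b <= 0xDF) {
        return 2; // 110xxxxx
    }
    if ($b >= 0xE0 && $b <= 0xEF) {
        return 3; // 1110xxxx: includes most Chinese characters
    }
    if ($b >= 0xF0 && $b <= 0xF4) {
        return 4; // 11110xxx: supplementary planes
    }
    return 0;
}

echo utf8_char_bytes('A'), "\n";                // 1
echo utf8_char_bytes('中'), "\n";               // 3
echo utf8_char_bytes("\xF0\x9F\x98\x80"), "\n"; // 4
```

A truncation loop could advance its byte offset by this value instead of hard-coding 2 or 3, and treat a result of 0 as malformed input.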


source:php.cn