A function that cuts strings according to the actual rules of UTF-8 encoding (a UTF-8 version of substr) - PHP Tips

May 17, 2016 am 09:08 AM
The code is as follows:

/*
 * Purpose: works like substr(), except that it does not produce garbled text,
 *          because it never leaves half of a multi-byte UTF-8 character behind.
 * Params:  $str    the UTF-8 encoded string
 *          $start  byte offset; may be negative, as in substr()
 *          $length byte length; may be negative or null, as in substr()
 * Returns: the substring, widened at either end to whole characters
 */
function utf8_substr( $str , $start , $length = null ){
    $strlen = strlen( $str );
    // Normalize $start and $length to absolute byte offsets [$start, $end),
    // following the same rules as substr().
    if ( $start < 0 ){
        $start = max( 0 , $strlen + $start );
    }
    if ( $start >= $strlen ){
        return '';
    }
    if ( $length === null ){
        $end = $strlen;
    }
    elseif ( $length < 0 ){
        $end = $strlen + $length;
    }
    else{
        $end = min( $strlen , $start + $length );
    }
    if ( $end <= $start ){
        return '';
    }
    // First, cut the string the ordinary way.
    $res = substr( $str , $start , $end - $start );
    /* Then check whether the characters at the head and the tail are complete */
    // Take up to 6 bytes after the cut ...
    $next_segm = (string) substr( $str , $end , 6 );
    // ... and up to 6 bytes before it.
    $prev_start = $start - 6 > 0 ? $start - 6 : 0;
    $prev_segm = (string) substr( $str , $prev_start , $start - $prev_start );
    // Tail: if continuation bytes (0x80-0xBF) follow the cut, the last
    // character was split, so append the rest of it.
    if ( preg_match( '@^[\x80-\xBF]{1,5}@' , $next_segm , $bytes ) ){
        $res .= $bytes[0];
    }
    // Head: if the result starts with a continuation byte, the first
    // character was split, so prepend its lead byte and any continuation
    // bytes that precede the cut.
    $ord0 = ord( $res[0] );
    if ( 128 <= $ord0 && 191 >= $ord0 ){
        if ( preg_match( '@[\xC0-\xFD][\x80-\xBF]{0,5}$@' , $prev_segm , $bytes ) ){
            $res = $bytes[0] . $res;
        }
    }
    return $res;
}
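The two regular expressions above depend on how UTF-8 marks its bytes: a byte in 0x80-0xBF is a continuation byte, and a byte in 0xC0-0xFD starts a multi-byte character under the older 6-byte scheme the function assumes (current UTF-8, per RFC 3629, stops at 4 bytes). As a minimal sketch of that classification, the helper below labels each byte of a short string. The name `utf8_byte_kind` is hypothetical and not part of the original function.

```php
<?php
// Hypothetical helper for illustration: classify a single byte using the
// same ranges as the regular expressions in utf8_substr().
function utf8_byte_kind(string $byte): string {
    $ord = ord($byte);
    if ($ord < 0x80) {
        return 'ascii';        // single-byte character
    }
    if ($ord <= 0xBF) {
        return 'continuation'; // 0x80-0xBF: never starts a character
    }
    if ($ord <= 0xFD) {
        return 'lead';         // 0xC0-0xFD: first byte of a multi-byte character
    }
    return 'invalid';          // 0xFE and 0xFF never occur in UTF-8
}

// "a" is one byte; "测" is the three bytes E6 B5 8B.
foreach (str_split("a\xE6\xB5\x8B") as $byte) {
    printf("%02X %s\n", ord($byte), utf8_byte_kind($byte));
}
// 61 ascii
// E6 lead
// B5 continuation
// 8B continuation
```

A cut that lands just before a continuation byte has split a character, which is the situation the head and tail checks look for.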

Test data:

$str = 'dfjdjf测13f试65&2数据fdj(1就mfe&……就';
var_dump( utf8_substr( $str , 22 , 12 ) ); echo "\n";
var_dump( utf8_substr( $str , 22 , -6 ) ); echo "\n";
var_dump( utf8_substr( $str , 9 , 12 ) ); echo "\n";
var_dump( utf8_substr( $str , 19 , 12 ) ); echo "\n";
var_dump( utf8_substr( $str , 28 , -6 ) ); echo "\n";

Output (no garbled characters after cutting; you are welcome to test it and report bugs):
string(12) "据fdj"
string(26) "据fdj(1就mfe&…"
string(13) "13f试65&2数"
string(12) "数据fd"
string(20) "dj(1就mfe&…"
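When testing further, one way to check that a result is not garbled, rather than judging by eye, is to ask PCRE whether the bytes are valid UTF-8. The sketch below is an assumption-laden illustration and not part of the original article: the helper name `is_valid_utf8` is hypothetical, and it relies on `preg_match()` returning `false` for an invalid subject when the `/u` modifier is set.

```php
<?php
// Hypothetical helper for illustration: true if $s is well-formed UTF-8.
function is_valid_utf8(string $s): bool {
    // With the /u modifier, preg_match() validates the subject as UTF-8
    // and returns false (not 0) when it is malformed.
    return preg_match('//u', $s) === 1;
}

$str = "dfjdjf\xE6\xB5\x8B13f";    // "dfjdjf测13f"; 测 occupies bytes 6-8

$broken = substr($str, 0, 7);      // stops after the first byte of 测
$whole  = substr($str, 0, 9);      // stops after the last byte of 测

var_dump(is_valid_utf8($broken));  // bool(false)
var_dump(is_valid_utf8($whole));   // bool(true)
```

Running such a check over many `$start` and `$length` combinations is a simple way to look for cases where a cutting function still returns a split character.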