
How to Get the Correct Character Offset in UTF-8 Strings After a preg_match() with PREG_OFFSET_CAPTURE?

Linda Hamilton
Release: 2024-12-03 01:01:09


Get Multibyte Character Count Before Match with preg_match()

Problem:

When preg_match() is called with the PREG_OFFSET_CAPTURE flag, the offsets it returns are measured in bytes, not characters, even when the pattern uses the /u modifier. In a UTF-8 string, any multibyte character that precedes the match makes the byte offset larger than the character offset.

For example, the following code matches "H" in the UTF-8 string "¡Hola!". It prints 2, even though "H" is at character index 1, because "¡" is encoded as the two bytes C2 A1:

$str = "\xC2\xA1Hola!"; // "¡Hola!" in UTF-8; "¡" takes two bytes
preg_match('/H/u', $str, $a_matches, PREG_OFFSET_CAPTURE);
echo $a_matches[0][1]; // prints 2 (byte offset), not 1 (character offset)
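
To see where the 2 comes from, it can help to compare byte-based and character-based functions on the same string. This is a minimal sketch, assuming the mbstring extension is loaded; the variable names are illustrative only:

```php
<?php
$str = "\xC2\xA1Hola!"; // "¡Hola!" encoded as UTF-8

// "¡" (U+00A1) occupies two bytes in UTF-8: C2 A1.
$byteLength = strlen($str);             // counts bytes
$charLength = mb_strlen($str, 'UTF-8'); // counts characters (code points)

// strpos() reports byte offsets, as PREG_OFFSET_CAPTURE does;
// mb_strpos() reports character offsets.
$byteOffset = strpos($str, 'H');
$charOffset = mb_strpos($str, 'H', 0, 'UTF-8');

echo "$byteLength bytes, $charLength characters\n";
echo "H: byte offset $byteOffset, character offset $charOffset\n";
```

The two offsets differ by exactly the number of extra bytes used by the multibyte characters that come before the match.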

Resolution:

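To obtain the character offset, cut the string at the byte offset with substr() and count the characters in that prefix with mb_strlen(). Passing the encoding explicitly keeps the result independent of the configured internal encoding:

$str = "\xC2\xA1Hola!";
if (preg_match('/H/u', $str, $a_matches, PREG_OFFSET_CAPTURE)) {
    // substr() works on bytes, mb_strlen() counts characters in the prefix
    echo mb_strlen(substr($str, 0, $a_matches[0][1]), 'UTF-8');
}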

This prints 1, the offset of "H" counted in UTF-8 characters rather than bytes.
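
When a pattern has capture groups, the same conversion can be applied to every entry of the match array. The function below is a hypothetical helper, not part of PHP or of the original answer; it is a sketch that assumes the subject is valid UTF-8 and that the mbstring extension is available. Groups that did not participate in the match are reported by preg_match() with offset -1, and the sketch passes that value through unchanged:

```php
<?php
/**
 * Hypothetical helper: runs preg_match() with PREG_OFFSET_CAPTURE and
 * converts each byte offset into a UTF-8 character offset.
 * Returns null when there is no match or a PCRE error occurs.
 */
function preg_match_char_offsets(string $pattern, string $subject): ?array
{
    if (preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }

    $result = [];
    foreach ($matches as $key => [$text, $byteOffset]) {
        // Unmatched groups carry offset -1; keep it instead of converting.
        $charOffset = $byteOffset < 0
            ? -1
            : mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');
        $result[$key] = [$text, $charOffset];
    }

    return $result;
}

$m = preg_match_char_offsets('/H(ola)/u', "\xC2\xA1Hola!");
echo $m[0][1], "\n"; // character offset of the full match "Hola"
echo $m[1][1], "\n"; // character offset of the group "ola"
```

Each conversion re-scans the prefix of the subject, so for very long strings with many groups it may be worth caching or converting offsets in ascending order.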
