
How to Determine the Actual Length of UTF-8 Encoded Strings in C++?

Susan Sarandon
Release: 2024-10-28 17:15:02


Determining the Actual Length of UTF-8 Encoded Strings in C++

UTF-8 is a variable-width character encoding: each character (code point) occupies one to four bytes, so the length of a string in bytes does not necessarily equal the number of characters it contains. This matters when working with UTF-8 strings in C++, because std::string::length() returns the number of bytes in the string, not the number of characters.
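To make the mismatch concrete, here is a minimal sketch. The helper name byte_length is hypothetical and only wraps std::string::size(), which is equivalent to length().

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration: returns what std::string reports
// as its length, which is the number of char elements (bytes).
std::size_t byte_length(const std::string& s) {
    return s.size();  // identical to s.length()
}
```

For the text "héllo" written as the bytes "h" "\xc3\xa9" "llo", byte_length reports 6 even though a reader sees 5 characters, because é is encoded as the two bytes 0xC3 0xA9.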

To accurately determine the length of a UTF-8 encoded string in C++, you can use the following approach:

Count the number of first-bytes in the string. In UTF-8, every continuation byte has the bit pattern 10xxxxxx. Any byte that does not match this pattern is either a single-byte (ASCII) character or the first byte of a multi-byte sequence, so each such byte marks exactly one character.
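The bit test behind this rule can be sketched as two small predicates. The names is_continuation_byte and is_first_byte are assumptions chosen for illustration, not part of any library.

```cpp
// Returns true for UTF-8 continuation bytes, i.e. bytes of the form 10xxxxxx.
// Masking with 0xC0 keeps the top two bits; 0x80 is the pattern 10000000.
bool is_continuation_byte(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Returns true for bytes that begin a character: either an ASCII byte
// (0xxxxxxx) or the lead byte of a multi-byte sequence (11xxxxxx).
bool is_first_byte(unsigned char b) {
    return !is_continuation_byte(b);
}
```

Taking unsigned char avoids sign-extension surprises on platforms where plain char is signed.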

Here is an example implementation:

<code class="cpp">// s points to a NUL-terminated UTF-8 string (e.g. const char *s)
int len = 0;
while (*s) len += (*s++ & 0xc0) != 0x80;</code>

In this code, s is a pointer that walks the NUL-terminated string one byte at a time. The & 0xc0 operation keeps only the top two bits of each byte. If those bits are 0b10 (that is, the masked value equals 0x80), the byte is a continuation byte and the count is not incremented; otherwise the comparison yields true, which adds 1 to len. The pointer advances to the next byte in either case. When the terminating null byte is reached, len holds the number of characters (Unicode code points) in the string, assuming the input is valid UTF-8.
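The same counting idea can be wrapped in a function that takes a std::string. This is a sketch rather than a definitive implementation: the name utf8_length is hypothetical, and the function assumes the input is valid UTF-8 and performs no validation.

```cpp
#include <cstddef>
#include <string>

// Counts UTF-8 code points by counting every byte that is NOT a
// continuation byte (10xxxxxx). Assumes s holds valid UTF-8.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {   // unsigned avoids sign-extension issues
        if ((byte & 0xC0) != 0x80) {
            ++count;                 // ASCII byte or lead byte: one character
        }
    }
    return count;
}
```

Iterating over the std::string rather than stopping at a null byte means embedded '\0' bytes are counted as well. Note that the result is a count of code points, so a letter followed by a combining accent (for example "e" followed by "\xcc\x81") counts as 2 even though it is displayed as a single glyph.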


source:php.cn