How does Java internally represent Strings: UTF-8 or UTF-16?

Patricia Arquette
Release: 2024-11-10 07:12:02

What is Java's Internal Representation for String: Modified UTF-8 or UTF-16?

Java uses UTF-16 for its internal text representation, as stated in the Oracle documentation. This applies to the classes that hold character sequences within the Java platform, such as String and StringBuilder. A char is a 16-bit unsigned value that holds one UTF-16 code unit. A single char can represent any code point in the Basic Multilingual Plane (U+0000 to U+FFFF); a code point above U+FFFF (a supplementary character) is represented by two chars, known as a surrogate pair.
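The difference between code units and code points can be checked with a small sketch. The class name CodeUnitDemo and its helper methods are made up for illustration; String.length, String.codePointCount, Character.toChars and Character.isHighSurrogate are standard library calls.

```java
final class CodeUnitDemo {
    // Number of UTF-16 code units (char values) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "A"; // U+0041, inside the Basic Multilingual Plane
        String supplementary = new String(Character.toChars(0x1F600)); // U+1F600, above U+FFFF

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));                     // 1 1
        System.out.println(codeUnits(supplementary) + " " + codePoints(supplementary)); // 2 1

        // The first char of a supplementary character is a high surrogate.
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0)));         // true
    }
}
```

Note that String.length() counts char values, so it reports 2 for a single supplementary character.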

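However, Java also uses a non-standard variant of UTF-8, known as modified UTF-8, for string serialization, for example in DataOutputStream.writeUTF and in Java object serialization. Serialized strings are therefore not always valid standard UTF-8. Modified UTF-8 encodes the null character U+0000 as the two bytes 0xC0 0x80 instead of a single zero byte, and it encodes a supplementary character as two three-byte sequences (one per surrogate, six bytes in total) instead of the four bytes used by standard UTF-8.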
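As a rough illustration, the following sketch compares the byte length of a string in standard UTF-8 with its length in the modified UTF-8 written by DataOutputStream.writeUTF. The class ModifiedUtf8Demo and its method names are hypothetical; writeUTF prefixes the data with a 2-byte length, which the sketch subtracts.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ModifiedUtf8Demo {
    // Byte length of the string in standard UTF-8.
    static int standardUtf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Byte length of the string in modified UTF-8, as written by writeUTF.
    static int modifiedUtf8Length(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size() - 2; // drop the 2-byte length prefix
    }

    public static void main(String[] args) {
        String nul = String.valueOf((char) 0);                 // U+0000
        String emoji = new String(Character.toChars(0x1F600)); // U+1F600

        System.out.println(standardUtf8Length("A") + " " + modifiedUtf8Length("A"));     // 1 1
        System.out.println(standardUtf8Length(nul) + " " + modifiedUtf8Length(nul));     // 1 2
        System.out.println(standardUtf8Length(emoji) + " " + modifiedUtf8Length(emoji)); // 4 6
    }
}
```

For plain ASCII text other than U+0000 the two encodings produce identical bytes, which is why the difference often goes unnoticed.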

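In memory, a char occupies 2 bytes. Because a code point takes one or two chars, it needs 2 or 4 bytes respectively. Note that since JDK 9 the HotSpot JVM may store a String whose characters all fit in Latin-1 using one byte per character (compact strings); this is an internal optimization, and the String API still exposes UTF-16 code units.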
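The 2-or-4-byte arithmetic can be expressed with Character.charCount and Character.BYTES (available since Java 8). This is only a sketch of the UTF-16 size of one code point; the class Utf16StorageDemo and the method utf16Bytes are made-up names, and the result does not account for object headers or JVM-internal optimizations.

```java
final class Utf16StorageDemo {
    // Bytes needed to hold one code point as UTF-16 chars:
    // (number of chars for the code point) * (bytes per char).
    static int utf16Bytes(int codePoint) {
        return Character.charCount(codePoint) * Character.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(utf16Bytes(0x41));    // 2 (U+0041, one char)
        System.out.println(utf16Bytes(0xFFFF));  // 2 (last code point that fits in one char)
        System.out.println(utf16Bytes(0x10000)); // 4 (first supplementary code point)
        System.out.println(utf16Bytes(0x1F600)); // 4 (surrogate pair)
    }
}
```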
