How to solve garbled characters in java-JavaBase-php.cn

Home

Java

JavaBase

How to solve garbled characters in java

angryTom

Nov 12, 2019 am 11:36 AM

java

java怎么解决乱码

java在字符串中统一用Unicode表示。

对于任意一个字符串：String string = “测试字符串”;

如果源文件是GBK编码，操作系统默认环境编码也为GBK，那么编译的时候，JVM将按照GBK编码将字节数组解析为字符，然后将字符转换为Unicode格式的字节数组，作为内部存储(字节数组→字符→Unicode字节数组)（推荐教程：java教程）

当打印这个字符串时，JVM根据操作系统本地的语言环境，将Unicode转换为GBK，然后操作系统将GBK格式的内容显示出来。

当源码文件是UTF-8, 我们需要通知编译器源码的格式，javac -encoding utf-8 … , 编译时，JVM按照utf-8 解析成字符，然后转换为unicode格式的字节数组，那么不论源码文件是什么格式，同样的字符串，最后得到的unicode字节数组是完全一致的，显示的时候，也是转成GBK来显示（跟OS环境有关）

乱码是如何产生的？

本质上都是由于字符串原本的编码格式与读取时解析用的编码格式不一致导致的。

乱码指的是程序显示出来的字符文本无法用任何语言去解读。一般情况下会包含大量的?。乱码问题是所有计算机用户或多或少会遇到的问题。造成乱码的原因就是因为使用了错误的字符编码去解码字节流，因此当我们在思考任何跟文本显示有关的问题时，请时刻保持清醒：当前使用的字符编码是什么。只有这样，我们才能正确分析和处理乱码问题。

例如最常见的网页乱码问题。如果你是网站技术人员，遇到这样的问题，需要检查以下原因：

● 服务器返回的响应头Content-Type没有指明字符编码

● 网页内是否使用META HTTP-EQUIV标签指定了字符编码

● 网页文件本身存储时使用的字符编码和网页声明的字符编码是否一致

java代码中的乱码问题如何解决呢？

例如：String s = “测试字符串”;

System.out.println( new String(s.getBytes(),"UTF-8")); 
//错误，因为getBytes()默认使用GBK编码， 而解析时使用UTF-8编码，肯定出错。

Copy after login

其中getBytes()是将Unicode转换为操作系统默认格式的字节数组，即“测试字符串”的GBK格式，new String (bytes, Charset) 中的charset 是指定读取byte的方式，这里指定为UTF-8，即把bytes的内容当做UTF-8来读取。

如下两种方式得到的结果都是正确的，因为它们的源内容编码和解析用的编码是一致的。

System.out.println( new String(s.getBytes(),"GBK"));
System.out.println( new String(s.getBytes("UTF-8"),"UTF-8"));

Copy after login

那么，如何利用getBytes 和 new String() 来进行编码转换呢？

网上流传着一种错误的方法:

GBK--> UTF-8: new String( s.getBytes("GBK") , "UTF-8);

Copy after login

这种方式是完全错误的，因为getBytes 的编码与 UTF-8 不一致，肯定是乱码。

但是为什么在tomcat 下，使用 new String(s.getBytes(“iso-8859-1”) ,”GBK”) 却可以用呢？

答案是：

tomcat 默认使用iso-8859-1编码，也就是说，如果原本字符串是GBK的，tomcat传输过程中，将GBK转成iso-8859-1了，默认情况下，使用iso-8859-1读取中文肯定是有问题的，那么我们需要将iso-8859-1 再转成GBK，而iso-8859-1 是单字节编码的，即他认为一个字节是一个字符，那么这种转换不会对原来的字节数组做任何改变，因为字节数组本来就是由单个字节组成的，如果之前用GBK编码，那么转成iso-8859-1后编码内容完全没变，则 s.getBytes(“iso-8859-1”) 实际上还是原来GBK的编码内容则 new String(s.getBytes(“iso-8859-1”) ,”GBK”) 就可以正确解码了。所以说这是一种巧合。

如何正确的将GBK转UTF-8 ? （实际上是unicode转UTF-8)

//利用getBytes将unicode字符串转成UTF-8格式的字节数组，然后用utf-8 对这个字节数组解码成新的字符串
new String( s.getBytes("utf-8") , "utf-8");

Copy after login

UTF-8 转GBK原理也是一样

new String( s.getBytes("GBK") , "GBK");

Copy after login

其实核心工作都由getBytes(charset)做了。getBytes的JDK描述：Encoding this String into a sequence of bytes using the named charset,storing the result into a new byte array.

OutputStreamWriter w1 = new OutputStreamWriter(new FileOutputStream("D:\\file1.txt"),"UTF-8");
InputStreamReader( stream, charset)

Copy after login

可以帮助我们轻松的按照指定编码读写文件。

The above is the detailed content of How to solve garbled characters in java. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)

2 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hello Kitty Island Adventure: How To Get Giant Seeds

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

How Long Does It Take To Beat Split Fiction?

4 weeks ago By DDD

R.E.P.O. Save File Location: Where Is It & How to Protect It?

4 weeks ago By DDD

Two Point Museum: All Exhibits And Where To Find Them

1 months ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Where is the login entrance for gmail email?

7378

Java Tutorial

1628

CakePHP Tutorial

1357

Laravel Tutorial

1267

PHP Tutorial

1216

Related knowledge

Square Root in Java Aug 30, 2024 pm 04:26 PM

Guide to Square Root in Java. Here we discuss how Square Root works in Java with example and its code implementation respectively.

Perfect Number in Java Aug 30, 2024 pm 04:28 PM

Guide to Perfect Number in Java. Here we discuss the Definition, How to check Perfect number in Java?, examples with code implementation.

Random Number Generator in Java Aug 30, 2024 pm 04:27 PM

Guide to Random Number Generator in Java. Here we discuss Functions in Java with examples and two different Generators with ther examples.

Armstrong Number in Java Aug 30, 2024 pm 04:26 PM

Guide to the Armstrong Number in Java. Here we discuss an introduction to Armstrong's number in java along with some of the code.

Weka in Java Aug 30, 2024 pm 04:28 PM

Guide to Weka in Java. Here we discuss the Introduction, how to use weka java, the type of platform, and advantages with examples.

Smith Number in Java Aug 30, 2024 pm 04:28 PM

Guide to Smith Number in Java. Here we discuss the Definition, How to check smith number in Java? example with code implementation.

Java Spring Interview Questions Aug 30, 2024 pm 04:29 PM

In this article, we have kept the most asked Java Spring Interview Questions with their detailed answers. So that you can crack the interview.

Break or return from Java 8 stream forEach? Feb 07, 2025 pm 12:09 PM

Java 8 introduces the Stream API, providing a powerful and expressive way to process data collections. However, a common question when using Stream is: How to break or return from a forEach operation? Traditional loops allow for early interruption or return, but Stream's forEach method does not directly support this method. This article will explain the reasons and explore alternative methods for implementing premature termination in Stream processing systems. Further reading: Java Stream API improvements Understand Stream forEach The forEach method is a terminal operation that performs one operation on each element in the Stream. Its design intention is

See all articles