What to do if BufferedInputStream output is garbled?

Mar 22, 2023 11:22 AM
Garbled characters

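Garbled output from BufferedInputStream arises because it reads raw bytes. When the data is long and is not read in one pass, a multi-byte character can be split across two reads and decoded incorrectly. The fix is to read the text with a BufferedReader, for example "BufferedReader reader = new BufferedReader (...)".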

Environment for this tutorial: Windows 10, Java 8, Dell G3 computer.

Using BufferedInputStream and BufferedOutputStream, and resolving garbled characters

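Last night I wrote a program that converts all Simplified Chinese characters to Traditional Chinese and looks up their pinyin, and I ran into garbled Chinese text in the I/O stream handling.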

Here is the program I wrote:

package com.java.utils.charactor;
 
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
 
/**
 * Simplified/Traditional Chinese conversion
 *
 * @author pengjianbo <pengjianbosoft@gmail.com>
 * $Id$
 */
public class SimTradConvert {
 
    public SimTradConvert() throws Exception {
 
        File simplFile = new File(
                "D:\\android\\JavaUtils\\src\\com\\java\\utils\\charactor\\simplified.txt");
        FileInputStream simplFis = new FileInputStream(simplFile);
        BufferedInputStream simplBis = new BufferedInputStream(simplFis);
        BufferedReader simplBr = new BufferedReader(new InputStreamReader(simplBis));
        StringBuffer simplsb = new StringBuffer();
 
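        byte[] simplb = new byte[1024];
        // Known problem (discussed after the listing): each 1024-byte chunk is decoded
        // on its own, so a multi-byte character that straddles two chunks is garbled.
        // The return value of read() is also ignored, so new String(simplb) decodes
        // the whole array even after a short read.
        while ((simplBis.read(simplb)) != -1) {
            simplsb.append(new String(simplb));
        }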
        
        simplFis.close();
        simplBis.close();
        
        
        File tradFile = new File(
                "D:\\android\\JavaUtils\\src\\com\\java\\utils\\charactor\\traditional.txt");
        FileInputStream tradFis = new FileInputStream(tradFile);
        BufferedInputStream tradBis = new BufferedInputStream(tradFis);
        StringBuffer tradsb = new StringBuffer();
 
        byte[] tradb = new byte[1024];
        // Same chunk-by-chunk decoding as for the simplified file, with the same problem
        while ((tradBis.read(tradb)) != -1) {
            tradsb.append(new String(tradb));
        }
        
        tradBis.close();
        tradFis.close();
        
        System.out.println(simplsb.toString());
        /*CnGetPinyin pinyin = new CnGetPinyin();
        // Connect to SQLite via JDBC
        Class.forName("org.sqlite.JDBC");
        Connection conn = DriverManager.getConnection("jdbc:sqlite:pai.db");
        Statement stat = conn.createStatement();
        for(int i = 0; i < simplsb.length() -1; i++ ) {
            
            stat.executeUpdate( "insert into CNLang(pinyin,simp,trad) values('" + pinyin.getPinyin(simplsb.substring(i, i + 1)) + "','"
                                + simplsb.substring(i, i + 1) + "','" + tradsb.substring(i, i + 1) + "')");
            System.out.println("Adding: " + simplsb.substring(i, i + 1) + "-->"  + tradsb.substring(i, i + 1));
            if( i > simplsb.length() -1 ) {
                stat.close();
                conn.close();
            }
        }*/
        
    }
 
    public static void main(String[] args) throws Exception {
        new SimTradConvert();
    }
 
}

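In this program I used BufferedInputStream with read(byte[]), and part of the Chinese text that came back was garbled. I suspected the buffer size set by byte[] tradb = new byte[1024];, so I tried changing the size of the byte[], and the garbled characters moved to different places. That shows the problem occurs at the end of the buffer: the bytes left at the end cannot hold a whole Chinese character, so that character is garbled. I assumed that reading with read() would avoid the problem (I have not tried it). For reading a large amount of Chinese text like this, I would rather use read(), and at worst run it in a separate thread.

Note that a single-byte read() only helps if the bytes are collected first and decoded together; decoding each byte separately would break every multi-byte character. The loops in the program also ignore the return value of read(byte[]), so new String(simplb) decodes all 1024 bytes even when fewer were read, which appends stale bytes after a short read.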
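The boundary effect described above can be reproduced in memory without any files. The following is a minimal sketch, not the author's code: the class name ChunkDecodeDemo and both helper methods are hypothetical, UTF-8 is assumed as the encoding, and a 4-byte buffer stands in for the 1024-byte one so the split happens on a short string. The string literal is written with Unicode escapes for "1我是中文" so the result does not depend on the source file encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkDecodeDemo {

    // Decodes every chunk on its own, like the loops in the program above.
    // Only the n bytes actually read are decoded, to isolate the boundary problem.
    static String decodePerChunk(InputStream in, int chunkSize) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[chunkSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            sb.append(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Collects all bytes first and decodes once, so no character is split.
    static String decodeWhole(InputStream in, int chunkSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[chunkSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String original = "1\u6211\u662f\u4e2d\u6587";
        byte[] data = original.getBytes(StandardCharsets.UTF_8); // 1 + 4 * 3 = 13 bytes

        String perChunk = decodePerChunk(new ByteArrayInputStream(data), 4);
        String whole = decodeWhole(new ByteArrayInputStream(data), 4);

        System.out.println(perChunk.equals(original)); // false: characters were split
        System.out.println(whole.equals(original));    // true
    }
}
```

Changing the chunk size moves the damage to other characters, which matches the observation that the garbled spots shifted when the buffer size was changed.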

Afterwards I looked for material online and found the following blog post by someone else, reproduced here:

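BufferedInputStream and BufferedOutputStream are filter streams. They must be constructed on top of an existing node stream, that is, an InputStream or OutputStream has to exist first. Compared with reading and writing directly, these two streams add buffering, which improves read and write performance.

BufferedInputStream reads bytes. A Chinese character occupies two bytes in an encoding such as GBK (three in UTF-8), and in mixed Chinese and English text some characters take one byte while others take several. If you read bytes directly and the data is too long to be read in one pass, a read may well end on the first byte of a Chinese character, and that character becomes garbled. The data after it can also be garbled, because the bytes are no longer aligned.

So we need to read with BufferedReader instead. It reads characters, so it never returns half a character and the garbling does not occur, provided the underlying InputStreamReader decodes with the same charset the data was written in.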

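package com.pocketdigi;
 
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
 
public class Main {
 
    public static void main(String[] args) throws IOException {
        File f = new File("d:/a.txt");
        FileOutputStream fos = new FileOutputStream(f);
        // Build the FileOutputStream; the file is created automatically if it does not exist
        BufferedOutputStream bos = new BufferedOutputStream(fos);
        // Encode with an explicit charset rather than the platform default
        bos.write("1我是中文".getBytes(StandardCharsets.UTF_8));
        bos.close();
        // Closing the output stream flushes the data; to keep writing afterwards, call flush() instead.
        // Because the BufferedOutputStream wraps the FileOutputStream, closing the outermost stream is enough,
        // so the FileOutputStream does not need to be closed separately.
        FileInputStream fis = new FileInputStream(f);
        BufferedInputStream bis = new BufferedInputStream(fis);
        // Decode with the same charset that was used for writing
        BufferedReader reader = new BufferedReader(new InputStreamReader(bis, StandardCharsets.UTF_8));
        // BufferedReader is used instead of reading from BufferedInputStream directly because
        // BufferedInputStream is an indirect subclass of InputStream, whose read method returns bytes.
        // A Chinese character takes more than one byte, so a read can end on part of a character and garble it.
        // BufferedReader extends Reader, whose read method returns a char, so a character is never split.
        StringBuffer result = new StringBuffer();
        int c;
        // read() returns -1 at end of stream; ready() only reports whether a read would block
        while ((c = reader.read()) != -1) {
            result.append((char) c);
        }
        System.out.println(result.toString());
        reader.close();
    }
 
}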
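The same idea can be applied to the file-reading part of the first program. The following is a hedged sketch rather than the original author's solution: the class name ReadTextFile and the method readAll are hypothetical, and UTF-8 is an assumption that must be replaced by whatever encoding the input files actually use (for example GBK). It reads with a char[] buffer, appends only the characters actually read, and uses try-with-resources so the stream is closed even if an exception is thrown. The demo writes to a temporary file instead of the paths used in the article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadTextFile {

    // Reads the whole file as text; charset must match the encoding the file was written in.
    static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), charset))) {
            char[] buf = new char[1024];
            int n;
            // The reader decodes bytes into chars and carries partial characters
            // over between reads, so no character is split here.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n); // append only the chars actually read
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("simplified", ".txt");
        try {
            // Unicode escapes for "1我是中文", independent of source file encoding
            String original = "1\u6211\u662f\u4e2d\u6587";
            Files.write(tmp, original.getBytes(StandardCharsets.UTF_8));
            String text = readAll(tmp, StandardCharsets.UTF_8);
            System.out.println(text.equals(original)); // true
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Passing the charset explicitly matters as much as switching to a Reader: an InputStreamReader created without one falls back to the platform default, which may differ from the file's encoding and produce garbled text even though no character is split.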