When Java reads a text file (a CSV or TXT file, for example) that contains Chinese characters, the text can come out garbled. The reading code is as follows:
List<String> lines = new ArrayList<String>();
// FileReader decodes the file with the platform default charset
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = null;
while ((line = br.readLine()) != null) {
    lines.add(line);
}
br.close();
Java's I/O classes process the data as shown in the figure:
Reader is the parent class of Java's character-reading I/O classes, and InputStream is the parent class of the byte-reading ones. InputStreamReader is the bridge from bytes to characters: it converts the bytes it reads into characters, and the actual decoding is done by StreamDecoder. StreamDecoder needs a Charset to decode with. If none is specified, the default character set of the local environment is used; in a Chinese-locale environment that is typically GBK. The FileReader call above falls into this case because it passes no charset, so a UTF-8 file is decoded as GBK and the Chinese characters come out garbled. (Since JDK 18 the default charset is UTF-8 unless configured otherwise, so the problem mainly affects older JDKs.)
Summary: when reading a data stream in Java, specify the stream's encoding explicitly; otherwise the default character set of the local environment is used.
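To make the mismatch concrete, here is a minimal sketch that encodes a short Chinese string as UTF-8 and then decodes the same bytes with two different charsets. The class name CharsetMismatchDemo and the helper decode are hypothetical names chosen for illustration. The fallback to ISO-8859-1 is only an assumption to keep the sketch runnable on a runtime without GBK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that the decoding charset must match the encoding.
class CharsetMismatchDemo {

    // Decode raw bytes into a String using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character string for "Chinese"; escapes are
        // used so the source file's own encoding does not affect the result.
        String original = "\u4e2d\u6587";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Assumption: use GBK when the runtime provides it, otherwise fall back
        // to ISO-8859-1 just to demonstrate a mismatched decoder.
        Charset wrong = Charset.isSupported("GBK")
                ? Charset.forName("GBK")
                : StandardCharsets.ISO_8859_1;

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8 round trip intact: "
                + original.equals(decode(utf8Bytes, StandardCharsets.UTF_8)));
        System.out.println(wrong + " round trip intact: "
                + original.equals(decode(utf8Bytes, wrong)));
    }
}
```

Decoding with the charset the bytes were written in restores the text; decoding with any other charset does not, which is exactly what happens when the platform default differs from the file's encoding.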
After the above analysis, the modified code is as follows:
List<String> lines = new ArrayList<String>();
// Name the file's encoding explicitly instead of relying on the platform default
BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(fileName), "UTF-8"));
String line = null;
while ((line = br.readLine()) != null) {
    lines.add(line);
}
br.close();
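The same idea can be written more defensively. The sketch below is one possible variant, not the article's own code: it uses try-with-resources so the reader is closed even if readLine throws, and passes a Charset constant instead of a charset name string. The class name Utf8LineReader, the method readLines, and the temporary file in main are hypothetical and exist only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: reads all lines of a file with an explicit charset.
class Utf8LineReader {

    static List<String> readLines(Path file, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader on both normal and exceptional exit
        try (BufferedReader br = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file to a temporary location, then read it back.
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        try {
            Files.write(tmp, "\u4e2d\u6587,csv\nabc\n".getBytes(StandardCharsets.UTF_8));
            List<String> lines = readLines(tmp, StandardCharsets.UTF_8);
            System.out.println("lines read: " + lines.size());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

One behavioral difference worth knowing: Files.newBufferedReader reports malformed input as an IOException, whereas InputStreamReader silently substitutes a replacement character. If the file's encoding is unknown, that stricter behavior surfaces a wrong charset guess early instead of producing garbled text.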