Java implementation method of specifying encoding when creating a file-javaTutorial-php.cn

Table of Contents

1. Problem analysis

2. Character encoding

3. Problem Solving


Java implementation method of specifying encoding when creating a file

WBOY

Aug 24, 2022, 09:09 AM


This article explains how a character encoding is specified when working with files in Java, and why it cannot be specified at the moment a file is created. It walks through the question with sample code, and I hope it is a useful reference for your study or work.


Foreword: I recently studied Java IO streams and wanted to practice and consolidate what I had learned by reading and writing files. While using the File class to create a file, I suddenly wondered: how do I specify the encoding the file uses? And then: how do I check the encoding of an existing file?

1. Problem analysis

I first searched the Internet for an answer. The result was as follows:

// Characters written through osw are encoded to bytes as UTF-8.
FileOutputStream fos = new FileOutputStream("xxxx.txt");
OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8");


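The code above means that characters are encoded as UTF-8 when they are written to the file, which is different from what I expected. I wanted to specify the encoding when creating the file, like this:

// What I had hoped for. Note: File(String parent, String child) does exist, but it
// treats the second argument as a child path name, not as an encoding.
File myfile = new File("test.txt", "UTF-8");
if (!myfile.exists()) myfile.createNewFile();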


So I checked the official Java 8 API documentation. File does not provide any constructor that accepts a character encoding.


Nor does it provide any getter or setter for a character encoding, which indicates that character encoding is not an inherent property of a file. By contrast, the creation time, the modification time, and whether the file is readable, writable, or executable are inherent attributes of a file (its metadata); they are part of the file.
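To make the point concrete, here is a minimal sketch that creates an empty file and prints the metadata that File does expose. The class name `FileMetadataDemo` and the temp-file prefix are made up for illustration. A freshly created file contains zero bytes, so there is nothing yet that could have an encoding.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

class FileMetadataDemo {

    // Creates an empty temporary file (hypothetical name prefix) and returns it.
    static File createEmptyFile() {
        try {
            File f = File.createTempFile("encoding-demo", ".txt");
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File f = createEmptyFile();
        // Metadata that File exposes; none of it is a character encoding.
        System.out.println("readable:      " + f.canRead());
        System.out.println("writable:      " + f.canWrite());
        System.out.println("executable:    " + f.canExecute());
        System.out.println("last modified: " + f.lastModified());
        System.out.println("size in bytes: " + f.length());
    }
}
```

The size is 0 because creating a file only creates the entry in the file system; encoding only comes into play once characters are turned into bytes.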


2. Character encoding

We know that any information stored in a computer is a string of 0s and 1s, and text is no exception.

Processing characters involves two steps: encoding and decoding.

Encoding: "map" characters to a bit string
Decoding: "map" a bit string back to characters

Different character encodings, such as GBK and UTF-8, use different rules for encoding and decoding.

Take the same text string, "China" written in Chinese characters. Saved with UTF-8, each Chinese character generally occupies three bytes when the underlying bit string is viewed in hexadecimal.

Saved with GBK, each Chinese character occupies two bytes.
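The byte counts can be checked with a short sketch. It assumes the text is the two-character string 中国 (written with Unicode escapes so that the source file's own encoding does not matter) and that the running JDK provides the GBK charset, which standard JDK builds normally do. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingBytesDemo {

    // Encodes text with the given charset and renders the bytes as hex, e.g. "E4 B8 AD".
    static String toHex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u56fd"; // assumed sample: the two characters for "China"
        Charset gbk = Charset.forName("GBK"); // assumes GBK is available in this JDK
        System.out.println("UTF-8: " + toHex(text, StandardCharsets.UTF_8)); // E4 B8 AD E5 9B BD
        System.out.println("GBK:   " + toHex(text, gbk));                    // D6 D0 B9 FA
    }
}
```

The same two characters become six bytes under UTF-8 and four bytes under GBK, which is exactly the difference in rules described above.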


When we write text in a text editor and save it, the editor "maps" the text to a bit string according to the character encoding you have selected.

The encoding you select is only the conversion rule the editor uses to encode text into a bit string; it is not an attribute of the text.

When the editor opens a text file, it displays text rather than the underlying bit string, because it decodes the bit string into characters using some character encoding. If the encoding used for decoding is the same as, or compatible with, the one used for encoding, the text displays correctly. If it is inconsistent or incompatible, the characters are garbled.

For example, I have a text file encoded in GBK whose content is "When will the bright moon come out".

I open the file with VS Code (a very easy-to-use text editor from Microsoft); in technical terms, the editor decodes the file. VS Code's default text encoding is UTF-8, so it decodes with UTF-8. But the underlying bit string of my text is GBK-encoded (two bytes per character), so decoding it as UTF-8 inevitably produces garbled characters because the encoding and decoding rules do not match. Once you manually select the matching GBK encoding, the file is decoded without garbling.
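The same mismatch can be reproduced in Java without an editor. This is a minimal sketch, assuming the file content was the five-character line 明月几时有 (written with Unicode escapes) and that the GBK charset is available; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Encodes text with one charset, then decodes the resulting bytes with another.
    static String transcode(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String text = "\u660e\u6708\u51e0\u65f6\u6709"; // assumed sample line
        Charset gbk = Charset.forName("GBK"); // assumes GBK is available in this JDK

        // Decoding GBK bytes as UTF-8: the rules do not match, so the text is garbled.
        System.out.println(transcode(text, gbk, StandardCharsets.UTF_8));

        // Decoding GBK bytes as GBK: the original text comes back.
        System.out.println(transcode(text, gbk, gbk));
    }
}
```

What the console shows also depends on the console's own encoding, but the comparison inside the program is unaffected: only the matching decode returns a string equal to the original.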

Garbled characters also show, indirectly, that character encoding is not an inherent attribute of a file.

I have said all this to make one point:

Character encoding is the rule used when encoding and decoding, not an inherent attribute of the file.

I can't help but wonder: why isn't character encoding made part of a file's attributes?

Suppose it could be set, and were set to GBK. The operating system would then have to enforce it, just as it refuses a write from a program to a file that is not writable. Every byte written would have to satisfy the GBK encoding rules, so the operating system would have to check the validity of the bytes on every write. That would be a very large performance overhead, and it may even be impossible to implement, because some byte sequences are valid in both GBK and UTF-8, which is ambiguous.

And what would be the point? So that an editor can pick the correct encoding from the attribute when it opens the file? There is no need. A smart editor can often infer the encoding from the first few bytes of the content, and you can also manually set the character encoding used for decoding.
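One simple way an editor can infer the encoding from the first few bytes is to look for a byte order mark (BOM). The sketch below is only an illustration of that idea, not how any particular editor works; the class and method names are made up. Many files, including most GBK files, have no BOM at all, in which case editors fall back on heuristics or on a manual choice.

```java
class BomSniffer {

    // Returns the encoding named by a leading byte order mark, or null if there is none.
    // Simplification: UTF-32 marks are not handled.
    static String detectBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null; // no BOM: the encoding has to be guessed or chosen manually
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        byte[] withoutBom = {'h', 'i'};
        System.out.println(detectBom(withBom));
        System.out.println(detectBom(withoutBom));
    }
}
```

Note that the BOM is itself just content at the start of the file, written by whichever program saved it. It is still not an attribute maintained by the file system.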

3. Problem Solving

The encoding of a file cannot be specified when the file is created. When text is written to a file (for example, pressing Ctrl+S in a text editor, which is essentially a write operation), you choose the encoding rule that converts the text to a bit string.

For a Java program, the code is as follows; it is the code mentioned at the beginning of the article:

// Characters written through osw are encoded to bytes as UTF-8.
FileOutputStream fos = new FileOutputStream("xxxx.txt");
OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8");
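A fuller, runnable version might look like the sketch below. It uses the same FileOutputStream plus OutputStreamWriter combination, wrapped in try-with-resources so the writer is flushed and closed. The class name, helper names, temp-file prefix, and the sample text 中国 are assumptions made for illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WriteWithEncodingDemo {

    // Creates an empty temporary file (hypothetical name prefix).
    static File tempFile() {
        try {
            File f = File.createTempFile("write-demo", ".txt");
            f.deleteOnExit();
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Overwrites the file with text encoded in the given charset; returns the size in bytes.
    static long write(File file, String text, Charset charset) {
        try (Writer w = new OutputStreamWriter(new FileOutputStream(file), charset)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file.length();
    }

    public static void main(String[] args) {
        File f = tempFile();
        String text = "\u4e2d\u56fd"; // assumed sample: the two characters for "China"
        System.out.println("UTF-8 bytes: " + write(f, text, StandardCharsets.UTF_8));
        System.out.println("GBK bytes:   " + write(f, text, Charset.forName("GBK")));
    }
}
```

The same file ends up holding different bytes depending on the charset passed at write time, which is where the encoding is actually chosen. Passing a Charset object instead of the string "UTF-8" also avoids the checked UnsupportedEncodingException.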
