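I have a text file larger than 10 GB and need to replace some text in it (the text to replace appears on every line). How can I read it efficiently, do the replacement, and write the result to a new file?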
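1. Split the file into multiple smaller files first.
2. Process the files with multiple threads, one file per thread, so that no two threads operate on the same file.
3. Read each file line by line and write the new file line by line.
4. Merge all the output files.
For steps 1 and 4, just use Linux commands (for example, split and cat).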
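Steps 2 and 3 could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name ParallelChunkReplacer, the methods rewriteChunk and rewriteAll, the ".out" suffix, and the plain substring replacement are all made up for the example, and the chunk files are assumed to have been produced already (step 1) and to be merged afterwards (step 4) outside of Java.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: names and the ".out" suffix are assumptions for illustration.
final class ParallelChunkReplacer {

    // Step 3: rewrite one chunk line by line into "<chunk name>.out".
    // Note: readLine() strips the line terminator, so output lines end with '\n'.
    static Path rewriteChunk(Path chunk, String target, String replacement) throws IOException {
        Path out = chunk.resolveSibling(chunk.getFileName() + ".out");
        try (BufferedReader reader = Files.newBufferedReader(chunk, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(target, replacement));
                writer.write('\n');
            }
        }
        return out;
    }

    // Step 2: one task per chunk, so no two threads ever touch the same file.
    static List<Path> rewriteAll(List<Path> chunks, String target, String replacement, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> futures = new ArrayList<>();
            for (Path chunk : chunks) {
                futures.add(pool.submit(() -> rewriteChunk(chunk, target, replacement)));
            }
            List<Path> outputs = new ArrayList<>();
            for (Future<Path> future : futures) {
                // Results are collected in chunk order; a failed task rethrows here.
                outputs.add(future.get());
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }
}
```

The returned paths come back in the same order as the input chunks, so they can be concatenated in that order to rebuild the full file. Whether the extra threads help depends on whether the disk or the replacement logic is the bottleneck.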
File file = new File(filepath);
BufferedInputStream fis = new BufferedInputStream(new FileInputStream(file));
// Read through a 5 MB buffer
BufferedReader reader = new BufferedReader(new InputStreamReader(fis, "utf-8"), 5 * 1024 * 1024);
String line = "";
while ((line = reader.readLine()) != null) {
    // Perform the replacement and any other business logic here
}
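The loop above only reads the file. A minimal sketch of the full read, replace, and write cycle might look like the following. The class name StreamingReplacer, the method replaceInFile, its parameters, and the 5 MB buffer size are assumptions for illustration, and a plain substring replacement stands in for whatever the real rule is.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: streams the input once and never holds more than one line in memory.
final class StreamingReplacer {

    // Assumed buffer size (5 MB) for both reading and writing.
    private static final int BUFFER_SIZE = 5 * 1024 * 1024;

    // Copies inputPath to outputPath, replacing every occurrence of target in each line.
    // Returns the number of lines written.
    static long replaceInFile(String inputPath, String outputPath,
                              String target, String replacement) throws IOException {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(
                     new InputStreamReader(new FileInputStream(inputPath), StandardCharsets.UTF_8),
                     BUFFER_SIZE);
             BufferedWriter writer = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(outputPath), StandardCharsets.UTF_8),
                     BUFFER_SIZE)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(target, replacement));
                // readLine() drops the terminator, so write one back explicitly.
                writer.write('\n');
                lines++;
            }
        }
        return lines;
    }
}
```

A call such as StreamingReplacer.replaceInFile("in.txt", "out.txt", "old", "new") would write the rewritten copy to out.txt and leave in.txt untouched. The try-with-resources block closes both streams even if the replacement throws.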
To improve performance further, you may need memory-mapped I/O. For details, see:
Why use Memory Mapped File or MappedByteBuffer in Java
Java large file read and write operations, Java NIO's MappedByteBuffer, efficient file/memory mapping
A simple comparison of the performance of java.io and java.nio
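As a rough illustration of mapped I/O, the sketch below maps the input file one window at a time and copies it to a new file while swapping one byte value for another. Everything named here (MappedByteReplacer, replaceByte, windowSize) is an assumption made for the example. It is deliberately limited to single-byte replacement, because a multi-byte pattern could straddle two windows and would need extra handling. A single mapping cannot exceed Integer.MAX_VALUE bytes, which is why a file over 10 GB has to be mapped in pieces.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: reads through memory-mapped windows, writes through a normal channel.
final class MappedByteReplacer {

    // Copies input to output, replacing every byte equal to 'from' with 'to'.
    // windowSize is the number of bytes mapped at a time (must be positive).
    static void replaceByte(Path input, Path output, byte from, byte to, int windowSize)
            throws IOException {
        try (FileChannel in = FileChannel.open(input, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(output, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            ByteBuffer scratch = ByteBuffer.allocate(windowSize);
            for (long pos = 0; pos < size; pos += windowSize) {
                int len = (int) Math.min(windowSize, size - pos);
                // Map only the current window of the file, read-only.
                MappedByteBuffer window = in.map(FileChannel.MapMode.READ_ONLY, pos, len);
                scratch.clear();
                for (int i = 0; i < len; i++) {
                    byte b = window.get(i);
                    scratch.put(b == from ? to : b);
                }
                scratch.flip();
                // A channel write may be partial, so loop until the buffer is drained.
                while (scratch.hasRemaining()) {
                    out.write(scratch);
                }
            }
        }
    }
}
```

For UTF-8 text this is safe when both bytes are ASCII (for example replacing ',' with ';'), since ASCII byte values never occur inside a multi-byte UTF-8 sequence. Whether mapping actually beats buffered streams for a single sequential pass is something to measure on the target machine.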
For a simple text replacement, just use the Linux sed command.
For a more complex text replacement, see the links below:
http://stackoverflow.com/ques...
http://www.baeldung.com/java-...
Analyze it with Spark:
lines = sc.textFile("your_file")
filterlines = lines.filter(your_filter_function)
filterlines.xxx()