The memory here is like a pipe. Reading line by line just passes the 1 GB file through memory, and the 10 MB is the thickness of the pipe. So with line-by-line reading the whole 1 GB file does get loaded through memory, just never all at once.
try (BufferedReader in = new BufferedReader(new FileReader(file))) {
    String line;
    while ((line = in.readLine()) != null) {
        // parse line
    }
}
No matter how big the file is, as long as the length of each line is bounded, reading the entire file may take a long time, but it will not use much memory.
Read the file in chunks, produce one result set per chunk, and aggregate the result sets at the end. If you are processing text, it helps to know the number of lines up front.
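A minimal sketch of that chunk-and-aggregate idea in Java, assuming line-oriented text and using per-chunk word counts as the result set; the chunk size and the ChunkedAggregate/countWords names are just illustrative:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChunkedAggregate {
    static final int CHUNK_LINES = 100_000;   // lines per chunk, sized to fit comfortably in memory

    public static void main(String[] args) throws IOException {
        Map<String, Long> total = new HashMap<>();   // aggregated result set
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            List<String> chunk = new ArrayList<>(CHUNK_LINES);
            String line;
            while ((line = in.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == CHUNK_LINES) {
                    mergeInto(total, countWords(chunk));   // one result set per chunk
                    chunk.clear();
                }
            }
            if (!chunk.isEmpty()) mergeInto(total, countWords(chunk));
        }
        System.out.println("distinct words: " + total.size());
    }

    // per-chunk result set: word -> occurrences within this chunk
    static Map<String, Long> countWords(List<String> lines) {
        Map<String, Long> counts = new HashMap<>();
        for (String l : lines)
            for (String w : l.split("\\s+"))
                if (!w.isEmpty()) counts.merge(w, 1L, Long::sum);
        return counts;
    }

    // aggregation step: fold a chunk's result set into the running total
    static void mergeInto(Map<String, Long> total, Map<String, Long> part) {
        part.forEach((w, c) -> total.merge(w, c, Long::sum));
    }
}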
Linux has a command called split that can quickly cut a large text file into small files, which you can then process concurrently and conveniently. This split-then-merge approach is the idea behind external sorting.
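A minimal Java sketch of that split-then-merge idea (external merge sort), assuming line-oriented text sorted lexicographically; the chunk size and class names are illustrative, and the temporary chunk files play the role of split's output:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalSort {

    // One cursor per sorted chunk file: remembers only the current line.
    static class Cursor implements Comparable<Cursor> {
        String line;
        BufferedReader reader;
        Cursor(BufferedReader reader) throws IOException { this.reader = reader; this.line = reader.readLine(); }
        boolean advance() throws IOException { line = reader.readLine(); return line != null; }
        public int compareTo(Cursor other) { return line.compareTo(other.line); }
    }

    // Phase 1: read the big file chunk by chunk, sort each chunk in memory,
    // and write it to its own temporary file.
    static List<Path> sortedChunks(Path input, int linesPerChunk) throws IOException {
        List<Path> chunks = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input)) {
            List<String> buf = new ArrayList<>(linesPerChunk);
            String line;
            while ((line = in.readLine()) != null) {
                buf.add(line);
                if (buf.size() == linesPerChunk) chunks.add(flush(buf));
            }
            if (!buf.isEmpty()) chunks.add(flush(buf));
        }
        return chunks;
    }

    static Path flush(List<String> buf) throws IOException {
        Collections.sort(buf);
        Path tmp = Files.createTempFile("chunk-", ".txt");
        Files.write(tmp, buf);
        buf.clear();
        return tmp;
    }

    // Phase 2: k-way merge; only one line per chunk is held in memory at any time.
    static void merge(List<Path> chunks, Path output) throws IOException {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (Path chunk : chunks) {
            Cursor c = new Cursor(Files.newBufferedReader(chunk));
            if (c.line != null) heap.add(c); else c.reader.close();
        }
        try (BufferedWriter out = Files.newBufferedWriter(output)) {
            while (!heap.isEmpty()) {
                Cursor c = heap.poll();
                out.write(c.line);
                out.newLine();
                if (c.advance()) heap.add(c); else c.reader.close();
            }
        }
    }
}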
Memory is like scratch paper: once you have filled a page, you turn it over, and data that has already been used and is no longer needed is simply thrown away.
A simple example: create a buffer variable buff with a fixed size, open the file stream, and fill the buffer. Once it is full, scan it for the content you want; every match is counted in another variable. Then clear buff and keep loading from the position where the previous read stopped... until the whole file has been read and the statistics are complete.
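A sketch of that approach in Java, assuming we are counting occurrences of a single byte (here '\n') in a file far bigger than memory; the 8 KB buffer size is an arbitrary choice:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedScan {
    public static void main(String[] args) throws IOException {
        byte[] buff = new byte[8 * 1024];   // fixed-size scratch buffer, reused for every read
        long count = 0;
        try (InputStream in = new FileInputStream(args[0])) {
            int n;
            // each read() refills buff starting from where the previous read stopped
            while ((n = in.read(buff)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buff[i] == '\n') count++;   // the "content you want", tallied in another variable
                }
            }
        }
        System.out.println("matches: " + count);
    }
}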
Different operating systems provide an API for operating on files larger than memory by treating the file as if it were memory: memory mapping, i.e. mmap on Linux and CreateFileMapping on Windows.
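In Java the same idea is exposed through FileChannel.map, which returns a MappedByteBuffer paged in and out by the OS. A minimal sketch that maps a large file one window at a time and counts newline bytes; the 64 MB window size and the counting are just illustrative, and a single mapping cannot exceed 2 GB:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedCount {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args[0]);
        long window = 64L * 1024 * 1024;   // map 64 MB at a time (arbitrary window size)
        long lines = 0;
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += window) {
                long len = Math.min(window, size - pos);
                // the OS pages the file in on demand; the heap never holds the whole file
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    if (buf.get() == '\n') lines++;
                }
            }
        }
        System.out.println("lines: " + lines);
    }
}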