The following tutorial column introduces how to use Pandas to process large files in chunks. I hope it will be helpful to friends in need!

Use Pandas to process large files in chunks

Problem: Today, while processing Kuaishou user data, I ran into a txt file of almost 600 MB. Sublime crashed when I tried to open it. With pandas, read_table() took nearly 2 minutes to load it, and the result turned out to be almost 30 million rows of data. If merely opening the file is this slow, actually processing it would be even harder.
Solution: read the file with an iterator. The principle is that the file data is not read into memory all at once, but in several passes.

1. Specify chunksize to read the file in chunks
Both read_csv and read_table accept a chunksize parameter that specifies the chunk size (how many rows to read at a time); instead of a DataFrame, they then return an iterable TextFileReader object.
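The original code listing was lost in extraction, so here is a minimal sketch of chunked reading. The file name kuaishou_users.txt, the tab separator, and the chunk size of 1,000,000 rows are assumptions for illustration; adjust them to your own data.

```python
import pandas as pd

# Assumed file name and separator; chunksize controls how many rows
# are read per chunk, so the whole file never sits in memory at once.
reader = pd.read_table('kuaishou_users.txt', sep='\t', chunksize=1_000_000)

chunks = []
for chunk in reader:
    # Process each chunk independently here (filter, aggregate, etc.),
    # then keep only what you need.
    chunks.append(chunk)

# Optionally combine the processed chunks into a single DataFrame.
df = pd.concat(chunks, ignore_index=True)
print(df.shape)
```

If you only need an aggregate (a count, a sum, a filtered subset), compute it inside the loop and skip the final concat, which would otherwise pull everything back into memory.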
2. Specify iterator=True
Passing iterator=True also returns a TextFileReader object; you then pull rows from it explicitly, a fixed number at a time.
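Again, the original code block was lost, so here is a minimal sketch assuming the same tab-separated file as above. With iterator=True you call get_chunk(n) on the TextFileReader to read the next n rows; it raises StopIteration once the file is exhausted.

```python
import pandas as pd

# Assumed file name and separator; iterator=True returns a TextFileReader.
reader = pd.read_table('kuaishou_users.txt', sep='\t', iterator=True)

loop = True
chunks = []
while loop:
    try:
        # Read the next 1,000,000 rows (chunk size is an assumption).
        chunk = reader.get_chunk(1_000_000)
        chunks.append(chunk)
    except StopIteration:
        loop = False  # no more data to read

df = pd.concat(chunks, ignore_index=True)
print(df.shape)
```

The difference from chunksize is that you control when and how many rows are read on each call, which is convenient when chunk sizes need to vary or when you want to stop early.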
The above is the detailed content of how to use Pandas to efficiently process large files in chunks. For more information, please follow other related articles on the PHP Chinese website!