
How to efficiently process large files in golang

藏色散人
Release: 2021-05-12 11:52:57

This tutorial covers efficient processing of large files: using Pandas to process large files in chunks. I hope it will be helpful to those who need it.

Use Pandas to process large files in chunks

Problem: While processing Kuaishou user data today, I ran into a txt file of almost 600 MB. Sublime crashed when I tried to open it, and reading it with pandas read_table() took nearly 2 minutes. Once it was loaded, it turned out to contain almost 30 million rows. That was just opening the file; I had no idea how hard actually processing it would be.

Solution: I looked through the documentation. The functions that read this kind of file have two relevant parameters: chunksize and iterator.

The principle is that the file data is not read into memory all at once, but in several passes.

1. Specify chunksize to read files in chunks

read_csv and read_table have a chunksize parameter that sets the chunk size (how many rows to read each time). When it is given, they return an iterable TextFileReader object instead of a single DataFrame.

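import pandas as pd

# path is the directory prefix of the data file; sep='\t' reads tab-separated text
table = pd.read_table(path + 'kuaishou.txt', sep='\t', chunksize=1000000)
for df in table:
    # Process each chunk (a DataFrame of up to 1,000,000 rows) here, for example:
    # df.drop(columns=['page', 'video_id'], inplace=True)
    print(type(df), df.shape)  # print some information about the chunk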
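Because each chunk is dropped once the loop moves on, any result that spans the whole file has to be accumulated across chunks. The following is only a minimal sketch of that idea: the function name count_values_in_chunks and the column name user_id in the usage note are assumptions made for illustration, not part of the original data.

```python
from collections import Counter

import pandas as pd


def count_values_in_chunks(path, column, chunksize):
    """Count how often each value of `column` occurs, reading `path` in chunks."""
    totals = Counter()
    # With chunksize set, read_table yields one DataFrame per chunk.
    for chunk in pd.read_table(path, sep="\t", chunksize=chunksize):
        # Count within this chunk only, then merge into the running totals.
        totals.update(chunk[column].value_counts().to_dict())
    return dict(totals)
```

A call such as count_values_in_chunks('kuaishou.txt', 'user_id', 1000000) would then keep at most one chunk plus the running totals in memory, assuming the file really has a user_id column.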

Here I went one step further and split the file into several sub-files to process separately (and yes, to_csv also has a chunksize parameter, which sets how many rows are written at a time).
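The original post does not show how the split was done, so the following is a sketch of one possible way to do it, not the author's actual code. The function name split_into_files and the part_0000.txt naming scheme are assumptions chosen for the example.

```python
import os

import pandas as pd


def split_into_files(path, out_dir, chunksize):
    """Write each chunk of a tab-separated file to its own sub-file."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    chunks = pd.read_table(path, sep="\t", chunksize=chunksize)
    for i, chunk in enumerate(chunks):
        # Hypothetical naming scheme: part_0000.txt, part_0001.txt, ...
        out_path = os.path.join(out_dir, f"part_{i:04d}.txt")
        # index=False keeps the DataFrame row index out of the sub-file.
        chunk.to_csv(out_path, sep="\t", index=False)
        written.append(out_path)
    return written
```

Each sub-file keeps the header row, so it can later be read on its own with read_table.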

2. Specify iterator=True

iterator=True also returns a TextFileReader object

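import pandas as pd

reader = pd.read_table('tmp.sv', sep='\t', iterator=True)
df = reader.get_chunk(10000)
# get_chunk(size) returns the next block of up to size rows as a DataFrame
# df can then be processed in the same way as before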
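A single get_chunk call only returns the first block, so reading the whole file this way needs a loop. The sketch below assumes that get_chunk raises StopIteration once the file is exhausted, which is the behaviour in the pandas versions I am aware of. The empty-chunk check is an extra guard in case that assumption does not hold, and the function name read_with_get_chunk is hypothetical.

```python
import pandas as pd


def read_with_get_chunk(path, size):
    """Return the row count of every block obtained via get_chunk(size)."""
    reader = pd.read_table(path, sep="\t", iterator=True)
    sizes = []
    while True:
        try:
            chunk = reader.get_chunk(size)
        except StopIteration:
            # Assumed end-of-file signal from the reader.
            break
        if chunk.empty:
            # Defensive guard: stop if an empty block comes back instead.
            break
        # Real processing of the chunk would go here; this only records its size.
        sizes.append(len(chunk))
    reader.close()
    return sizes
```

Unlike a fixed chunksize, this form lets the caller ask for a different number of rows on each call if that is useful.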

For more detail, see the pandas documentation on reading files in chunks.

source:csdn.net