Python Tutorial: How to split and merge large files using Python?

WBOY
Release: 2023-04-22 11:43:08

Sometimes we need to send someone a large file, but the transmission channel gets in the way: email attachments have a size limit, or the network connection is unreliable. In that case we can split the large file into smaller files, send them separately, and have the receiving end merge them back together. Today I will share how to split and merge large files using Python.

Ideas and Implementation

A text file can be split by number of lines. Any file, whether text or binary, can be split by a specified size in bytes.
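Splitting by line count can be sketched as follows. This is a minimal illustration, not the only way to do it: the function name `split_by_lines`, the `part{n}.txt` naming scheme, and the sample data are all assumptions made for the example.

```python
import os
import tempfile


def split_by_lines(src_path, out_dir, lines_per_part):
    """Split a file into parts of at most lines_per_part lines each.

    Returns the part paths in order. The naming scheme is hypothetical.
    """
    os.makedirs(out_dir, exist_ok=True)
    part_paths = []
    writer = None
    # Binary mode leaves line endings and the text encoding untouched.
    with open(src_path, "rb") as reader:
        for index, line in enumerate(reader):
            if index % lines_per_part == 0:
                # Start a new part every lines_per_part lines.
                if writer:
                    writer.close()
                part_path = os.path.join(out_dir, f"part{len(part_paths) + 1}.txt")
                part_paths.append(part_path)
                writer = open(part_path, "wb")
            writer.write(line)
    if writer:
        writer.close()
    return part_paths


# Demo on a throwaway file: 25 lines, at most 10 lines per part.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.txt")
    with open(src, "wb") as f:
        f.writelines(f"line {n}\n".encode() for n in range(25))
    parts = split_by_lines(src, os.path.join(tmp, "parts"), 10)
    line_counts = []
    for p in parts:
        with open(p, "rb") as f:
            line_counts.append(len(f.readlines()))
    print(line_counts)  # [10, 10, 5]
```

Because the file is read line by line, only one line is held in memory at a time, so the approach also works for files larger than available RAM.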

Python's file reading and writing functions are enough to split and merge files. To split, choose a size for each part, then repeatedly read that many bytes from the source and write them to a new file. To merge, the receiving end reads the small files in sequence and writes their bytes, in order, into a single file.

Split

size = 1024 * 1000 * 10  # roughly 10 MB per part
with open("bigfile", "rb") as reader:
    part = 1
    while True:
        part_content = reader.read(size)
        if not part_content:
            print("split done.")
            break
        with open(f"bigfile_part{part}", "wb") as writer:
            writer.write(part_content)
        part += 1  # next part number; without this every chunk overwrites part 1
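The merge step has to know how many parts exist. One way to work that out in advance is ceiling division of the file size by the part size. The sketch below is only an illustration: `expected_parts` is a hypothetical helper and the 45,000,000-byte file size is made up; in practice the size could come from `os.path.getsize`.

```python
def expected_parts(file_size, part_size):
    """Number of parts a file of file_size bytes is split into (ceiling division)."""
    return (file_size + part_size - 1) // part_size


part_size = 1024 * 1000 * 10   # same part size as the split code: 10,240,000 bytes
example_size = 45_000_000      # hypothetical file size in bytes
count = expected_parts(example_size, part_size)
print(count)  # 5
```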

Merge

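total_parts = 5  # number of parts produced by the split step
with open("bigfile", "wb") as writer:
    # Part numbers start at 1, matching the split code
    for i in range(1, total_parts + 1):
        with open(f"bigfile_part{i}", "rb") as reader:
            writer.write(reader.read())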
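If the receiving end does not know the number of parts, it could discover them from the file names instead. The following is a sketch under assumptions: `merge_parts` is a hypothetical helper, and it assumes parts are named with a common prefix followed by a number, as in the split code above. It also streams each part with `shutil.copyfileobj` rather than reading a whole part into memory.

```python
import os
import re
import shutil
import tempfile


def merge_parts(parts_dir, prefix, out_path):
    """Concatenate files named <prefix><number> in numeric order into out_path.

    Returns how many parts were merged.
    """
    pattern = re.compile(re.escape(prefix) + r"(\d+)$")
    numbered = []
    for name in os.listdir(parts_dir):
        match = pattern.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    # Sort by the numeric suffix so part10 comes after part9, not after part1.
    numbered.sort()
    with open(out_path, "wb") as writer:
        for _, name in numbered:
            with open(os.path.join(parts_dir, name), "rb") as reader:
                # copyfileobj copies in chunks instead of loading the whole part.
                shutil.copyfileobj(reader, writer)
    return len(numbered)


# Demo: 12 tiny parts, enough to show that numeric ordering matters.
with tempfile.TemporaryDirectory() as tmp:
    for n in range(1, 13):
        with open(os.path.join(tmp, f"bigfile_part{n}"), "wb") as f:
            f.write(f"<{n}>".encode())
    merged_path = os.path.join(tmp, "merged.bin")
    merged_count = merge_parts(tmp, "bigfile_part", merged_path)
    with open(merged_path, "rb") as f:
        merged = f.read()
    merged_ok = merged == b"".join(f"<{n}>".encode() for n in range(1, 13))
    print(merged_count, merged_ok)  # 12 True
```

Sorting on the integer suffix avoids the plain string ordering in which `bigfile_part10` would sort before `bigfile_part2`.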

Use a third-party library

You can write this yourself, but someone else already has, so why not save some time and use their library? Install it with pip:

pip install filesplit

Split

from filesplit.split import Split
split = Split("./data.rar", "./output")
split.bysize(size=1024 * 1000 * 10)  # at most roughly 10 MB per file

After execution, the split files appear in the output folder.

You can also split according to the number of file lines:

split.bylinecount(linecount=10000)  # at most 10000 lines per file

Merge

Merging requires all the small files to be in one folder, and the tool also requires a manifest file in that folder. Its format is as follows:

filename,filesize,header
data_1.rar,10000000,False
data_2.rar,10000000,False
data_3.rar,10000000,False
data_4.rar,10000000,False
data_5.rar,1304145,False
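Since the manifest records the size of each part, the receiving end could use it to check that every part arrived intact before merging. The sketch below is not part of filesplit: `check_manifest` is a hypothetical helper, the manifest path is passed in explicitly rather than assumed, and the demo files are tiny stand-ins. It relies only on the CSV columns shown above.

```python
import csv
import os
import tempfile


def check_manifest(parts_dir, manifest_path):
    """Compare each part's size on disk with the size recorded in the manifest.

    Returns a list of problems; an empty list means every part matches.
    """
    problems = []
    with open(manifest_path, newline="") as f:
        for row in csv.DictReader(f):
            path = os.path.join(parts_dir, row["filename"])
            if not os.path.isfile(path):
                problems.append(f"missing: {row['filename']}")
            elif os.path.getsize(path) != int(row["filesize"]):
                problems.append(f"size mismatch: {row['filename']}")
    return problems


# Demo: one good part, one truncated part, one part that never arrived.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "data_1.rar"), "wb") as f:
        f.write(b"abcd")  # 4 bytes, as the manifest says
    with open(os.path.join(tmp, "data_2.rar"), "wb") as f:
        f.write(b"ab")    # 2 bytes, but the manifest says 3
    manifest_path = os.path.join(tmp, "manifest")
    with open(manifest_path, "w", newline="") as f:
        f.write("filename,filesize,header\n")
        f.write("data_1.rar,4,False\n")
        f.write("data_2.rar,3,False\n")
        f.write("data_3.rar,1,False\n")
    problems = check_manifest(tmp, manifest_path)
    print(problems)  # ['size mismatch: data_2.rar', 'missing: data_3.rar']
```

A size check catches missing or truncated parts but not corrupted bytes of the right length; a checksum would be needed for that.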

The code to merge the files only needs to specify the directory to be merged, the target directory, and the merged file name. The code is as follows:

from filesplit.merge import Merge
merge = Merge(inputdir = "./output", outputdir="./merge", outputfilename = "merged.rar")
merge.merge()

After execution, the merged file appears in the merge directory.

source:51cto.com