
Python downloads large files, which method is faster?

王林
Release: 2023-04-14 21:19:01


We usually download files with the requests library, which is very convenient to use.

Method 1

With the following streaming code, Python's memory usage stays flat no matter how large the downloaded file is:

import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # Note the stream=True parameter
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=8192):
                f.write(chunk)
    return local_filename

If the server uses chunked transfer encoding, pass chunk_size=None so that each chunk is yielded as it arrives, and add an if check to skip empty keep-alive chunks:

import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # Note the stream=True parameter
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'w', encoding='utf-8') as f:
            # chunk_size=None yields data chunks as they arrive
            for chunk in r.iter_content(chunk_size=None):
                if chunk:  # skip empty keep-alive chunks
                    f.write(chunk.decode('utf-8'))
    return local_filename

The iter_content[1] function can also decode for you: just pass decode_unicode=True.

Note that the chunks returned by iter_content are not guaranteed to be exactly chunk_size bytes; the size can differ on each iteration and, when decoding is involved, is often larger.

Method 2

Use Response.raw[2] together with shutil.copyfileobj[3]:

import requests
import shutil

def download_file(url):
    local_filename = url.split('/')[-1]
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            shutil.copyfileobj(r.raw, f)
    return local_filename

This streams the file to disk without using too much memory, and the code is simpler.
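To see what shutil.copyfileobj does under the hood, here is a minimal, network-free sketch using in-memory file objects (BytesIO stands in for r.raw and the destination file; the payload is made up for illustration):

```python
import io
import shutil

# Two in-memory file objects stand in for r.raw (source) and the
# destination file; copyfileobj streams between them in fixed-size buffers.
src = io.BytesIO(b"x" * 1_000_000)  # pretend this is a 1 MB response body
dst = io.BytesIO()

# length is the buffer size; the default is 64 KB on most platforms
shutil.copyfileobj(src, dst, length=64 * 1024)

print(len(dst.getvalue()))  # 1000000
```

Only one buffer of data is held in memory at a time, which is why memory usage stays flat even for very large files.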

Note: according to the documentation, Response.raw does not decode compressed content by default, so if needed you can manually wrap the r.raw.read method:

import functools

response.raw.read = functools.partial(response.raw.read, decode_content=True)
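functools.partial simply pre-binds a keyword argument to a callable. A minimal network-free sketch of the same pattern, where the hypothetical read function stands in for response.raw.read:

```python
import functools

# Stand-in for response.raw.read: a reader that optionally decompresses.
def read(size=-1, decode_content=False):
    return b"plain" if decode_content else b"compressed"

# Pre-bind decode_content=True, exactly as done with response.raw.read above;
# every later call now decodes automatically without passing the flag.
read = functools.partial(read, decode_content=True)

print(read())  # b'plain'
```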

Speed

Method two is faster: where method one manages 2-3 MB/s, method two can reach nearly 40 MB/s.
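One reason for the gap: method one runs a Python-level loop once per small chunk (with chunk-splitting work inside requests), while copyfileobj moves data in larger buffers with far fewer iterations. A rough, network-free illustration with an in-memory 8 MB payload (the numbers are for illustration only):

```python
import io
import shutil

payload = b"x" * (8 * 1024 * 1024)  # 8 MB stand-in for a downloaded file

# Method 1 style: Python-level loop over 8 KB chunks
src = io.BytesIO(payload)
dst1 = io.BytesIO()
iterations = 0
while True:
    chunk = src.read(8192)
    if not chunk:
        break
    dst1.write(chunk)
    iterations += 1
print(iterations)  # 1024 loop passes for 8 MB at 8 KB per chunk

# Method 2 style: copyfileobj with a 64 KB buffer -> 8x fewer passes,
# and with r.raw requests also skips its per-chunk processing entirely
src = io.BytesIO(payload)
dst2 = io.BytesIO()
shutil.copyfileobj(src, dst2, length=64 * 1024)

assert dst1.getvalue() == dst2.getvalue() == payload
```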

References

[1]iter_content: https://requests.readthedocs.io/en/latest/api/#requests.Response.iter_content

[2]Response.raw: https://requests.readthedocs.io/en/latest/api/#requests.Response.raw

[3]shutil.copyfileobj: https://docs.python.org/3/library/shutil.html#shutil.copyfileobj


source:51cto.com