Crawler | Batch download of HD wallpapers (source code + tools included)

Release: 2023-08-10 15:46:01


Unsplash is a website of free, high-quality photos. All of them are real photographs, and the resolution is very high. They are good material for designers, useful for illustrating copywriting, and they also work well as wallpapers. The corresponding code has been packaged into an exe tool, which I hope you will find helpful. How to get the code and the tool is described at the end of the article.


1. Web page analysis


Let's first look at the manual download process. Note that you should not right-click the image and choose Save as: an image saved that way is compressed by some ratio and is much less sharp. Taking Nature as an example, click Download free and choose the download path. The image size is 1.43 MB.

Next, analyze the web page itself. First, we observe that there is no page-number selector at the bottom of the page. Scrolling down shows that the pictures are loaded dynamically: as we scroll, further pictures are displayed one after another.

After scrolling a few times, I found that each time the page is scrolled down it issues a new request. Clicking one of these requests shows the total number of pictures (10000) and the total number of pages (500).

Looking at a few of the request URLs, we see that they differ only in the page parameter, which increases by one each time. That is convenient: we can simply iterate over the page numbers when sending requests.
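As a minimal sketch of that traversal, the snippet below builds the URL for each page. The query string is copied from the URL used in section 2.2; the helper names page_count and page_url are my own and only illustrative.

```python
import math
from urllib.parse import urlencode

# Endpoint and xp value copied from the request URL in section 2.2.
BASE_URL = 'https://unsplash.com/napi/search/photos'
XP = 'feedback-loop-v2:experiment'


def page_count(total, per_page):
    """Number of pages needed to cover `total` pictures."""
    return math.ceil(total / per_page)


def page_url(page, query='nature', per_page=20):
    """Build the request URL for one page; only `page` changes between requests."""
    params = {'query': query, 'per_page': per_page, 'page': page, 'xp': XP}
    return BASE_URL + '?' + urlencode(params)


if __name__ == '__main__':
    print(page_count(10000, 20))
    for page in range(1, 4):
        print(page_url(page))
```

With 10000 pictures and 20 per page, page_count gives the 500 pages seen in the request.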

The page-number problem is solved. Next, analyze the link of each picture.

The results list in the response has exactly 20 entries, the same as the per_page value in the request, so the links to the images we are looking for must be here.
Analyzing a web page is often time-consuming, but overall it went smoothly. Now we can start crawling the images.
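To make the structure concrete, here is a small sketch that pulls the links out of a response. The sample body is hypothetical and heavily trimmed: the keys results, urls and full are the ones used by the code in section 2.2, while total and total_pages are my assumption about how the totals are named, and the example.com links are placeholders.

```python
import json

# Hypothetical, heavily trimmed response body; a real response has many more fields.
SAMPLE_RESPONSE = '''
{
  "total": 10000,
  "total_pages": 500,
  "results": [
    {"urls": {"full": "https://example.com/photo-1-full.jpg"}},
    {"urls": {"full": "https://example.com/photo-2-full.jpg"}}
  ]
}
'''


def extract_full_urls(payload):
    """Collect the 'full' link of every entry in the 'results' list."""
    return [item['urls']['full'] for item in payload['results']]


payload = json.loads(SAMPLE_RESPONSE)
links = extract_full_urls(payload)
print(len(links), links[0])
```

On a real page the list should have as many entries as per_page, which is a cheap sanity check before downloading.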


2. Crawl images

2.1 Import module
import time
import random
import json
import requests
from fake_useragent import UserAgent

  • time: pause between requests
  • random: generate random numbers
  • json: process JSON data
  • requests: send web requests
  • fake_useragent: generate User-Agent strings

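2.2 Get pictures
Spoof the User-Agent so that the request looks like it comes from a browser; otherwise the server may identify it as a crawler and refuse to respond: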
ua = UserAgent(verify_ssl=False)
headers = {'User-Agent': ua.random}
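If fake_useragent is not available, a similar effect can be approximated with a small hand-written list. This is a swapped-in alternative, not what the article's code does: the User-Agent strings below are illustrative examples that should be replaced with current browser values, and random_headers is a hypothetical helper name.

```python
import random

# Illustrative User-Agent strings; replace them with up-to-date browser values.
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 '
    '(KHTML, like Gecko) Version/16.5 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0',
]


def random_headers():
    """Return request headers with a randomly chosen User-Agent."""
    return {'User-Agent': random.choice(USER_AGENTS)}


print(random_headers())
```

The returned dictionary can be passed as headers in the same way as the one built from ua.random.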
Collect all image links from the response:
def getpicurls(i, headers):
    """Return the full-size image links found on result page i."""
    picurls = []
    url = 'https://unsplash.com/napi/search/photos?query=nature&per_page=20&page={}&xp=feedback-loop-v2%3Aexperiment'.format(i)
    r = requests.get(url, headers=headers, timeout=5)
    # Pause a few seconds between requests so the server is not hammered.
    time.sleep(random.uniform(3.1, 4.5))
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    allinfo = json.loads(r.text)
    # Each entry in 'results' describes one picture; urls -> full is its full-size link.
    results = allinfo['results']
    for result in results:
        href = result['urls']['full']
        picurls.append(href)
    return picurls
2.3 Save pictures

Save each picture to a file:
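def getpic(count, url):
    # Download one picture, using the module-level headers defined in 2.2.
    r = requests.get(url, headers=headers, timeout=5)
    r.raise_for_status()
    # The pictures/ directory must already exist.
    with open('pictures/{}.jpg'.format(count), 'wb') as f:
        f.write(r.content)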
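The two functions can be tied together with a small driver loop. The sketch below is one possible way to do it, not necessarily the loop used in the exe tool: crawl is a hypothetical helper that takes the link-fetching and saving functions as arguments, and it creates the output directory up front so that the open() call in the save function does not fail.

```python
import os


def crawl(pages, get_urls, save, out_dir='pictures'):
    """Download every picture on pages 1..pages and return how many were saved.

    get_urls(page) must return the image links of one page;
    save(count, url) must store one picture under a running number.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for page in range(1, pages + 1):
        for url in get_urls(page):
            count += 1
            save(count, url)
    return count
```

With the functions from 2.2 and 2.3 this could be called as crawl(3, lambda i: getpicurls(i, headers), getpic) to fetch the first three pages.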
Result: [screenshot]


3. Crawling with the exe tool

Result of running the exe tool: [screenshot]

Note:
  • Avoid crawling too frequently, so that you do not put unnecessary load on the site.
  • The pictures are high-resolution and hosted on an overseas site, so the crawl speed depends on your network and is generally not very fast.
  • You can build a proxy pool to crawl faster.
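As a rough sketch of the proxy pool idea, each request can pick a random proxy and pass it through the proxies argument of requests.get. The addresses below are hypothetical placeholders, and pick_proxies is an illustrative helper name.

```python
import random

# Hypothetical proxy addresses; replace them with proxies you are allowed to use.
PROXY_POOL = [
    'http://10.0.0.1:8080',
    'http://10.0.0.2:8080',
    'http://10.0.0.3:8080',
]


def pick_proxies(pool=PROXY_POOL):
    """Return a requests-style proxies mapping that uses one random proxy."""
    proxy = random.choice(pool)
    return {'http': proxy, 'https': proxy}


print(pick_proxies())
```

A request would then look like requests.get(url, headers=headers, proxies=pick_proxies(), timeout=5). Spreading requests over several proxies does not remove the need for the delay between requests.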

Source: Python当打之年