This article introduces a Python crawler and shows how to crawl Baidu images by keyword. It can serve as a practical reference.
Tools used: Python 2.7
Scrapy framework
Sublime Text 3
One. Set up Python (Windows version)
1. Install Python 2.7, then enter python in cmd. If the interactive interpreter starts and shows its version banner, the installation is successful.
2. Install the Scrapy framework: enter pip install Scrapy on the command line.
If the installation succeeds, pip reports that Scrapy was successfully installed. The installation can fail in many ways; the error messages can be searched on Baidu.
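Both steps can also be checked from inside the interpreter. The following is a minimal sketch, not part of the original article: the helper name is_installed is hypothetical, and it only confirms that a module can be imported, not that it works correctly.

```python
import importlib
import sys


def is_installed(module_name):
    """Return True if the named module can be imported (hypothetical helper)."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Report the running interpreter version and whether Scrapy is importable.
print("Python %d.%d" % sys.version_info[:2])
print("Scrapy installed: %s" % is_installed("scrapy"))
```

If the second line reports False, the pip install Scrapy step did not complete for this interpreter.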
Two. Start programming
1. Crawl static websites without anti-crawler measures, such as Baidu Tieba and Douban Reading.
For example, take a post in the "Desktop Bar" forum: tieba.baidu.com/p/2460150866?red_tag=3569129009
Code comments: the Python code imports two modules, urllib and re, and defines two functions. The first function fetches the data of the entire target page. The second function finds the target images in that page, iterates over them, and saves them under sequential numbers starting from 0.
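The two functions described above could look roughly like the sketch below. It is an illustration under stated assumptions rather than the article's original listing: the names get_html, get_img and IMG_PATTERN are hypothetical, and the regular expression assumes that the post's image tags carry a src attribute ending in .jpg followed by a pic_ext attribute. The import fallback lets the same code run on Python 2.7 and Python 3.

```python
import re

try:
    from urllib.request import urlopen, urlretrieve  # Python 3
except ImportError:
    from urllib import urlopen, urlretrieve  # Python 2.7

# Assumed markup: <img ... src="....jpg" pic_ext="..."> (not confirmed by the article).
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" pic_ext')


def get_html(url):
    """First function: fetch the whole target page and return it as text."""
    page = urlopen(url)
    try:
        return page.read().decode("utf-8", "ignore")
    finally:
        page.close()


def get_img(html):
    """Second function: download every matched image as 0.jpg, 1.jpg, ..."""
    saved = []
    for index, img_url in enumerate(IMG_PATTERN.findall(html)):
        filename = "%d.jpg" % index
        # A relative name, so the file lands in the current working directory.
        urlretrieve(img_url, filename)
        saved.append(filename)
    return saved
```

Calling get_img(get_html(url)) with the post's full address would then download the matched pictures. Whether the pattern still matches depends on the page's current markup.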
Note: the second function relies on regular expressions from the re module. The downloaded pictures are saved by default in the same directory as the created .py file.
2. Crawl websites with anti-crawler measures, such as Baidu Images.
For example, search for the keyword "emoticon package": https://image.baidu.com/search/index?tn=baiduimage&ct=201326592&lm=-1&cl=2&ie=gbk&word=%B1%ED%C7%E9%B0%FC&fr=ala&ori_query=%E8%A1%A8%E6%83%85%E5%8C%85&ala=0&alatpl=sp&pos=0&hs=2&xthttps=111111
The pictures load as the page scrolls, so crawl the first 30 pictures first.
Code comments: the code imports four modules; the os module is used to specify the save path. The first two functions are the same as above. The third function uses an if statement and try/except exception handling.
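A rough sketch of the three functions follows. It is not the article's original listing, and several details are assumptions: the names get_html, get_img_urls, download, HEADERS and OBJ_URL_PATTERN are hypothetical, a browser-like User-Agent header is assumed to be enough to receive the normal result page, and each result is assumed to embed its image address in the page as "objURL":"...". The sketch uses only os, re and the standard URL library, with an import fallback so it runs on Python 2.7 and Python 3.

```python
import os
import re

try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2.7

# Assumption: a browser-like User-Agent avoids the simplest anti-crawler check.
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
# Assumption: the result page embeds image addresses as "objURL":"...".
OBJ_URL_PATTERN = re.compile(r'"objURL":"(.*?)"')


def get_html(url):
    """Fetch a page with browser-like headers and return it as text."""
    response = urlopen(Request(url, headers=HEADERS), timeout=10)
    try:
        return response.read().decode("utf-8", "ignore")
    finally:
        response.close()


def get_img_urls(html, limit=30):
    """Return at most `limit` image addresses found in the page."""
    return OBJ_URL_PATTERN.findall(html)[:limit]


def download(urls, save_dir):
    """Save each image into save_dir as 0.jpg, 1.jpg, ...; skip failures."""
    if not os.path.isdir(save_dir):
        os.makedirs(save_dir)
    saved = []
    for index, url in enumerate(urls):
        try:
            response = urlopen(Request(url, headers=HEADERS), timeout=10)
            try:
                data = response.read()
            finally:
                response.close()
        except Exception as error:  # network errors, HTTP 4xx/5xx, malformed URLs
            print("Skipped %s: %s" % (url, error))
            continue
        path = os.path.join(save_dir, "%d.jpg" % index)
        with open(path, "wb") as image_file:
            image_file.write(data)
        saved.append(path)
    return saved
```

With the search address above stored in a variable search_url, download(get_img_urls(get_html(search_url)), "images") would attempt to save the first 30 results into an images folder. Catching the exception per image means one dead link does not stop the remaining downloads.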
Note: when writing Python code, pay attention to indentation, and do not mix tabs and spaces, which easily causes errors.
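One way to spot the problem before running a script is sketched below. The function name mixed_indent_lines is hypothetical, and the check is deliberately narrow: it only flags lines whose own leading whitespace contains both a tab and a space, not files that use tabs on some lines and spaces on others.

```python
def mixed_indent_lines(source):
    """Return 1-based numbers of lines whose indentation mixes tabs and spaces."""
    bad = []
    for number, line in enumerate(source.splitlines(), 1):
        # The indentation is whatever precedes the first non-blank character.
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            bad.append(number)
    return bad
```

For a script read into a string with open(path).read(), an empty result means no single line mixes the two kinds of indentation.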