This article shows how to use Python to crawl mobile phone images from Jingdong (JD.com). The script fetches a product list page with urllib.request, narrows the HTML to the product list region with a first regular expression, extracts the image URLs with a second one, and saves each image to a local folder.

    # Crawl JD mobile phone images
    import re
    import urllib.error
    import urllib.request

    def craw(url, page):
        # Fetch the list page as raw bytes.
        html1 = urllib.request.urlopen(url).read()
        # str() on bytes yields the repr text (b'...'), in which newlines appear as the
        # two characters "\n", so "." in the patterns below can span original line breaks.
        html1 = str(html1)
        # First pass: isolate the product list region of the page.
        pat1 = '''<p id="plist".+? <p class="page clearfix">'''
        result1 = re.compile(pat1).findall(html1)
        if not result1:
            print("No product list found on page " + str(page))
            return
        result1 = result1[0]
        # Second pass: capture each image URL; ".+?" lazily matches any characters except a newline.
        pat2 = r'''<img width="220" height="220" data-img="1".+?"//(.+?\.jpg)">'''
        imagelist = re.compile(pat2).findall(result1)
        x = 1
        for imageurl in imagelist:
            # The target directory must already exist; the file name reads "page N, image X".
            imagename = "D:/Python35/myweb/part6/img1/" + "第" + str(page) + "页图" + str(x) + ".jpg"
            imageurl = "http://" + imageurl
            try:
                urllib.request.urlretrieve(imageurl, filename=imagename)
            except urllib.error.URLError as e:
                # Report the failure and move on to the next image.
                print("Failed to download " + imageurl + ": " + str(e))
            x += 1

    # range(1, 2) crawls page 1 only; widen the range to crawl more pages.
    for i in range(1, 2):
        url = "http://list.jd.com/list.html?cat=9987,653,655&page=" + str(i)
        craw(url, i)
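The two-pass regular-expression filtering in craw can be tried without any network access. The following is a minimal sketch under stated assumptions: the HTML fragment, the `img.example.com` host, and the helper name `extract_image_urls` are all hypothetical and made up for illustration, not taken from JD's real pages. It reuses the same two patterns, with `re.S` added so that `.` also matches newlines in ordinary decoded text.

```python
import re

# Hypothetical HTML fragment standing in for a product list page (not real JD markup).
SAMPLE_HTML = (
    '<p id="header"><img width="220" height="220" data-img="1" '
    'src="//img.example.com/banner.jpg"></p>'
    '<p id="plist">'
    '<img width="220" height="220" data-img="1" '
    'src="//img.example.com/a/phone1.jpg">'
    '<img width="220" height="220" data-img="1" '
    'src="//img.example.com/b/phone2.jpg">'
    '</p> <p class="page clearfix">1 2 3</p>'
)

# First pass: isolate the product list region, up to the pagination block.
LIST_PATTERN = re.compile(r'<p id="plist".+? <p class="page clearfix">', re.S)

# Second pass: capture the protocol-relative .jpg URL of each product image.
IMAGE_PATTERN = re.compile(
    r'<img width="220" height="220" data-img="1".+?"//(.+?\.jpg)">', re.S
)


def extract_image_urls(html):
    """Return absolute image URLs found inside the product list region."""
    region = LIST_PATTERN.search(html)
    if region is None:
        # No product list on this page: nothing to download.
        return []
    return ["http://" + path for path in IMAGE_PATTERN.findall(region.group(0))]


if __name__ == "__main__":
    for image_url in extract_image_urls(SAMPLE_HTML):
        print(image_url)
```

Narrowing to the list region first keeps the second pattern from picking up images elsewhere on the page, such as the banner in the sample fragment. Returning an empty list when the region is missing avoids the IndexError that indexing an empty findall result would raise.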