Community

Learn

Tools Library

AI Tools

Leisure

English

Home > Backend Development > Python Tutorial > 用Python写的图片蜘蛛人代码

用Python写的图片蜘蛛人代码

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB

Release： 2016-06-16 08:46:59

Original

1441 people have browsed it

复制代码代码如下:

#coding=utf-8

import os
import sys
import re
import urllib

URL_REG = re.compile(r'(http://[^///]+)', re.I)
IMG_REG = re.compile(r' 用Python写的图片蜘蛛人代码

用Python写的图片蜘蛛人代码

]*?src=([/'"])([^/1]*?)/1', re.I)

def download(dir, url):
'''下载网页中的图片

@dir 保存到本地的路径
@url 网页url
'''
global URL_REG, IMG_REG

m = URL_REG.match(url)
if not m:
print '[Error]Invalid URL: ', url
return
host = m.group(1)

if not os.path.isdir(dir):
os.mkdir(dir)

# 获取html,提取图片url
html = urllib.urlopen(url).read()
imgs = [item[1].lower() for item in IMG_REG.findall(html)]
f = lambda path: path if path.startswith('http://') else /
host + path if path.startswith('/') else url + '/' + path
imgs = list(set(map(f, imgs)))
print '[Info]Find %d images.' % len(imgs)

# 下载图片
for idx, img in enumerate(imgs):
name = img.split('/')[-1]
path = os.path.join(dir, name)
try:
print '[Info]Download(%d): %s'% (idx + 1, img)
urllib.urlretrieve(img, path)
except:
print "[Error]Cant't download(%d): %s" % (idx + 1, img)

def main():
if len(sys.argv) != 3:
print 'Invalid argument count.'
return
dir, url = sys.argv[1:]
download(dir, url)

if __name__ == '__main__':
# download('D://Imgs', 'http://www.163.com')
main()

Related labels：

图片蜘蛛人

Previous article：python 实现堆排序算法代码 Next article：Python多线程学习资料

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Latest Articles by Author

How LLMs Work: Pre-Training to Post-Training, Neural Networks, Hallucinations, and Inference

2025-02-26 03:58:14
I Combined the Blockchain and AI to Generate Art. Here’s What Happened Next.

2025-02-26 03:38:10
Advanced Prompt Engineering: Chain of Thought (CoT)

2025-02-26 03:17:10
Retrieval Augmented Generation in SQLite

2025-02-26 02:49:09
How to Use an LLM-Powered Boilerplate for Building Your Own Node.js API

2025-02-26 01:08:13
LLMs for Coding in 2024: Price, Performance, and the Battle for the Best

2025-02-26 00:46:10
Prompting Vision Language Models

2025-02-25 23:42:08
How to Measure the Reliability of a Large Language Model's Response

2025-02-25 22:50:13
An Illusion of Life

2025-02-25 21:54:11
Scientists Go Serious About Large Language Models Mirroring Human Thinking

2025-02-25 20:45:11

Latest Issues

Team collaboration - What should I do if someone needs the feature I wrote as a dependency in git flow?

From 1970-01-01 08:00:00

0

0

0

Objective-c - Constraints for iOS a warning issue

From 1970-01-01 08:00:00

0

0

0

Confusion about using gitlab's fork&pull request mode within the team

From 1970-01-01 08:00:00

0

0

0

Objective-c - In iOS development, Instagram cannot be authorized after logging in. Instagram does not jump back to the application. How to get the callback address?

From 1970-01-01 08:00:00

0

0

0

Version Control - About the use of SVN and GIT in company projects?

From 1970-01-01 08:00:00

0

0

0

Related Topics

More>

Popular Recommendations

Popular Tutorials

More>

Related Tutorials

Popular Recommendations

Latest courses

Latest Downloads

More>

Web Effects

Website Source Code

Website Materials

Front End Template