Python beginner who wants to build a simple crawler; looking for a tutorial
PHP中文网
PHP中文网 2017-04-17 14:27:26

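I'm a Python beginner and I want to build a simple crawler. Can anyone recommend a tutorial? P.S. Which languages do companies usually use for crawling and data collection?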


All replies (21)
PHPzhong

Scrapy is a good choice because it is relatively simple; its introductory tutorial is a good place to start.

Peter_Zhu

You can start by implementing your business logic on top of a crawler framework such as Scrapy, then gradually replace parts of the framework as your own needs dictate. In the end you will find that you have written a crawler framework of your own.
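
To make "a crawler framework" concrete, here is a minimal sketch of the loop that sits at the core of one: a queue of URLs to visit, a set of URLs already seen, and pluggable functions for downloading and link extraction. The names `crawl`, `fetch`, and `extract_links` are hypothetical and chosen for illustration; this is not Scrapy's API, and a real framework adds scheduling, retries, politeness delays, and storage on top.

```python
from collections import deque
from typing import Callable, Dict, Iterable


def crawl(seeds: Iterable[str],
          fetch: Callable[[str], str],
          extract_links: Callable[[str, str], Iterable[str]],
          max_pages: int = 100) -> Dict[str, str]:
    """Breadth-first crawl. Returns {url: body} for every page fetched."""
    frontier = deque(seeds)   # URLs waiting to be fetched
    seen = set(frontier)      # URLs already queued, so none is fetched twice
    pages = {}

    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        try:
            body = fetch(url)              # the download step
        except Exception:
            continue                       # a real framework would log and retry
        pages[url] = body
        for link in extract_links(url, body):   # the parsing step
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages
```

Because `fetch` and `extract_links` are passed in, each one can be swapped out independently, which is roughly how the gradual replacement described above tends to go.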

大家讲道理

Python's Scrapy is great for writing crawlers. Here is a very simple crawler I wrote for fun:

https://github.com/ZhangBohan/fun_crawler

小葫芦

You can use urllib, urllib2, or requests to fetch content; requests is recommended.
You can parse the content with BeautifulSoup, with regular expressions, or with brute-force string handling.
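
The fetch-then-parse flow can be sketched as follows. To keep it runnable without installing anything, this sketch swaps in the standard library: `urllib.request` (the Python 3 counterpart of urllib2) instead of requests, and `html.parser` instead of BeautifulSoup. The function names, the User-Agent string, and the `<h2>` pattern are assumptions made for illustration.

```python
import re
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it as text (UTF-8 if no charset is declared)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-crawler/0.1"})  # hypothetical UA
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def links_with_parser(html: str, base_url: str) -> list:
    """Structured parsing: tolerant of attribute order and extra whitespace."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def titles_with_regex(html: str) -> list:
    """Regex extraction: quick for one fixed page layout, brittle in general."""
    return re.findall(r"<h2[^>]*>(.*?)</h2>", html, flags=re.S | re.I)
```

A call such as `links_with_parser(fetch(url), url)` chains the two steps. With requests and BeautifulSoup installed, the same two steps get shorter, but the shape of the program stays the same.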

左手右手慢动作

http://cuiqingcai.com/1052.html

I've been learning to write Python crawlers recently. I find it very interesting, and it really does make life a lot easier. Along the way I wrote up some study notes and recorded a few small crawlers that I actually built. I'm sharing them here in the hope that they help anyone interested in Python crawlers. If the opportunity arises, I look forward to exchanging ideas with you.

1. Getting Started with Python

  1. Getting Started with Python Crawlers 1: Overview

  2. Getting Started with Python Crawlers 2: Basic understanding of crawlers

  3. Getting Started with Python Crawlers 3: Basic use of the urllib library

  4. Getting Started with Python Crawlers 4: Advanced use of the urllib library

  5. Getting Started with Python Crawlers 5: URLError exception handling

  6. Getting Started with Python Crawlers 6: Using cookies

  7. Getting Started with Python Crawlers 7: Regular expressions

2. Python in Practice

  1. Python Crawlers in Practice 1: Crawling jokes from Qiushibaike (糗事百科)

  2. Python Crawlers in Practice 2: Crawling*

  3. Python Crawlers in Practice 3: Calculating this semester's university grade points

  4. Python Crawlers in Practice 4: Scraping Taobao MM photos

  5. Python Crawlers in Practice 5: Simulating a Taobao login and fetching all orders

3. Advanced Python

  1. Advanced Python Crawlers 1: Installing and configuring the Scrapy crawler framework

These are the articles so far. More will be added as my study progresses, so stay tuned~

I hope this helps everyone. Thank you!

Please credit the source when reposting: Jingmi » Python Crawler Learning Tutorial Series
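
Two of the topics in the outline above, URLError handling and cookies, can be sketched with the standard library alone. This is not taken from the linked series; the helper names `make_opener` and `fetch_or_none`, the retry count, and the rule of not retrying client errors are assumptions made for illustration.

```python
import http.cookiejar
import urllib.error
import urllib.request


def make_opener():
    """Build an opener that keeps cookies between requests (e.g. after a login)."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_or_none(url, open_url, retries=2):
    """Return the response body as bytes, or None if every attempt fails."""
    for _ in range(retries + 1):
        try:
            with open_url(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # HTTPError is a subclass of URLError, so it must be caught first.
            if err.code < 500:
                return None   # e.g. 404: retrying is unlikely to help
        except urllib.error.URLError:
            pass              # DNS failure, refused connection, and the like
    return None
```

Typical use would be `opener, jar = make_opener()` followed by `fetch_or_none(url, opener.open)`; any cookies the server sets are stored in `jar` and sent back on later requests made through the same opener.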

小葫芦

If you just want a spider that works:
http://segmentfault.com/blog/eric/1190000002543828

黄舟

https://github.com/binux/pyspider
Powerful WebUI with script editor, task monitor, project manager and result viewer

小葫芦

A crawler for anime pictures on Konachan. I wrote it when I was first learning to write crawlers; it is rough, but serviceable as a beginner example.

小葫芦

For simple tasks, you can fetch web pages with urllib2 and parse them with BeautifulSoup or regular expressions.
To go deeper, look at open-source frameworks such as Python's Scrapy.
You can also watch video tutorials, such as those from Geek Academy.
In a word: practice more.

Peter_Zhu

Here is a ready-made example you can refer to:
How to crawl merchant information from Dianping.com (with an example and code)
