Python beginner wants to build a simple crawler, looking for a tutorial

Question

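I'm a Python beginner and want to build a simple crawler. Can anyone recommend a tutorial? P.S. Which languages do companies usually use for crawling and data collection?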

PHPz · Answer

Scrapy is a good choice, and it is relatively simple. Here is an introductory tutorial

天蓬老师 · Answer

You can start by implementing your business logic on top of a crawler framework such as Scrapy, then gradually replace parts of the framework as your own needs dictate. In the end, you will find that you have written a crawler framework yourself.
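To make the idea of gradually replacing a framework concrete, here is a minimal sketch of the loop at the core of most crawlers: a queue of URLs to visit, a set of URLs already seen, and a link extractor. It uses only the Python 3 standard library. The names `LinkParser`, `extract_links`, and `crawl`, and the `max_pages` limit, are illustrative assumptions and not Scrapy's API; `fetch` is a hypothetical callable that you supply.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (with #fragments removed) for the links in html."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl.

    `fetch` is any callable that maps a URL to HTML text, or to None
    when the page cannot be retrieved (hypothetical interface).
    """
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that could not be downloaded
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing the downloader in as `fetch` keeps the loop testable without network access, for example by handing it the `get` method of a dict of canned pages. A real crawler would also need request delays, robots.txt handling, and error handling, all of which this sketch leaves out and a framework normally provides.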

大家讲道理 · Answer

Python's Scrapy is great for writing crawlers. Attached is a very simple crawler I wrote:

https://github.com/ZhangBohan/fun_crawler

高洛峰 · Answer

You can use urllib, urllib2, or Requests to fetch content. Requests is recommended.
You can use BeautifulSoup to parse the content, or fall back on regular expressions or brute-force string parsing.
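As a rough illustration of the fetch-then-parse split described above, the sketch below downloads a page with `urllib.request` (the Python 3 home of what used to be urllib2) and pulls out the page title twice: once with a regular expression and once with plain string searching. It deliberately sticks to the standard library so it runs without installing anything; Requests and BeautifulSoup would make both halves shorter. The function names and the User-Agent value are assumptions made for this example.

```python
import re
import urllib.request


def fetch(url, timeout=10):
    """Download a page and return its body decoded as text."""
    # Some sites reject requests without a browser-like User-Agent (assumed value).
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def title_by_regex(html):
    """Extract the <title> text with a regular expression."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def title_by_string_search(html):
    """Extract the <title> text with plain, case-sensitive string searching."""
    start = html.find("<title>")
    end = html.find("</title>")
    if start == -1 or end == -1 or end < start:
        return None
    return html[start + len("<title>"):end].strip()
```

Calling `title_by_regex(fetch(url))` on a page you are allowed to crawl returns its title, or `None` if the page has no title tag. The string-search version breaks as soon as the tag carries attributes or different capitalization, which is the usual reason to prefer a real HTML parser for anything beyond a quick script.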

ringa_lee · Answer

http://cuiqingcai.com/1052.html

I've been learning to write Python crawlers recently. I find it very interesting, and it really does make life a lot easier. Along the way I put together some study notes and wrote up a few small crawlers that I actually built, and I am sharing them here. I hope they are helpful to anyone interested in Python crawlers, and I look forward to exchanging ideas with you if the opportunity arises.

1. Introduction to Python

Getting Started with Python Crawlers 1: Overview
Getting Started with Python Crawlers 2: Basic Understanding of Crawlers
Getting Started with Python Crawlers 3: Basic Use of the urllib Library
Getting Started with Python Crawlers 4: Advanced Use of the urllib Library
Getting Started with Python Crawlers 5: URLError Exception Handling
Getting Started with Python Crawlers 6: Using Cookies
Getting Started with Python Crawlers 7: Regular Expressions

2. Python in Practice

Python Crawlers in Practice 1: Scraping Jokes from Qiushibaike
Python Crawlers in Practice 2: Crawling*
Python Crawlers in Practice 3: Calculating This Semester's University Grade Point Average
Python Crawlers in Practice 4: Scraping Taobao MM Photos
Python Crawlers in Practice 5: Simulating a Taobao Login and Retrieving All Orders

3. Advanced Python

Advanced Python Crawlers 1: Installing and Configuring the Scrapy Framework

That is all the articles for now. More will be added as my study progresses, so stay tuned.

I hope this helps everyone. Thank you!

Please credit the source when reposting: Jingmi » Python crawler learning tutorial series

高洛峰 · Answer

If you just want a spider that works:
http://segmentfault.com/blog/eric/1190000002543828

黄舟 · Answer

https://github.com/binux/pyspider
Powerful WebUI with script editor, task monitor, project manager and result viewer

高洛峰 · Answer

A crawler for anime pictures on Konachan. I wrote it when I was first learning to write crawlers. It is good enough to work from once you know the basics.

高洛峰 · Answer

For simple needs, you can fetch and parse web pages with urllib2, BeautifulSoup, and regular expressions.
For more in-depth work, look at open-source frameworks such as Python's Scrapy.
You can also watch video tutorials, such as those from Geek Academy.
In a word: practice more.

天蓬老师 · Answer

Here is an existing example you can refer to:
How to crawl business information on Dianping.com (with examples and code attached)