Python - How to quickly check the status codes of 200 million+ URLs?
世界只因有你 2017-05-18 10:56:14

I wrote a multi-threaded checker with requests, but it feels slow. Are there any faster approaches?
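For reference, a bounded thread pool over the standard library's `http.client` avoids per-request overhead from heavier wrappers. This is a minimal sketch, not the asker's actual code; the function names `check_status` and `check_all` are mine:

```python
import http.client
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit

def check_status(url, timeout=5):
    """Issue a HEAD request and return the status code (None on error)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    try:
        conn = conn_cls(parts.netloc, timeout=timeout)
        conn.request("HEAD", parts.path or "/")
        status = conn.getresponse().status
        conn.close()
        return status
    except (OSError, http.client.HTTPException):
        return None

def check_all(urls, workers=100):
    """Map each URL to its status code using a bounded thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(urls, pool.map(check_status, urls)))
```

With 200 million URLs the thread count, not the library, is usually the bottleneck; tune `workers` against your bandwidth and file-descriptor limits.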


All replies (5)
PHPzhong

Use Tornado's curl client and close the connection as soon as you have read the response headers. (I haven't tried this myself. If the HTTP client it provides can't close a connection midway, you can do what I did: open a raw TCP connection and parse the response with http-parser.)

Actually, you can just add a small extension to fetchtitle to get the status code (remember to install pycurl).

巴扎黑

Python itself is slow. If you want speed, send the TCP request directly, read the reply, and close the socket as soon as you have read the status line.
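The raw-socket trick this answer describes can be sketched with the standard library alone: send a minimal request, read only up to the status line, then close. The helper names `parse_status` and `raw_status` are mine, not from the answer:

```python
import socket
from urllib.parse import urlsplit

def parse_status(first_line):
    """Extract the code from an HTTP status line like b'HTTP/1.1 200 OK'."""
    parts = first_line.split(b" ", 2)
    if len(parts) >= 2 and parts[0].startswith(b"HTTP/"):
        try:
            return int(parts[1])
        except ValueError:
            return None
    return None

def raw_status(url, timeout=5):
    """Send a minimal HEAD request, read just the status line, then close."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 80
    request = ("HEAD %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n"
               % (parts.path or "/", host)).encode()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            first_line = b""
            # Read only until the end of the status line, then drop the socket.
            while not first_line.endswith(b"\r\n") and len(first_line) < 256:
                chunk = sock.recv(64)
                if not chunk:
                    break
                first_line += chunk
            return parse_status(first_line)
    except OSError:
        return None
```

This skips TLS, redirects, and chunked bodies entirely, which is exactly why it is fast; for https:// URLs you would need to wrap the socket with `ssl` first.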

左手右手慢动作

Use grequests, which wraps requests with gevent so the requests run concurrently.

https://github.com/kennethrei...

迷茫

For this kind of job, consider gevent, Tornado, scrapy-redis, or asyncio!
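Of these, asyncio ships with Python 3.4+ and needs no extra install. A minimal event-loop sketch using only the standard library (no aiohttp), with a semaphore to bound concurrency; the names `fetch_status` and `check_many` are mine:

```python
import asyncio
from urllib.parse import urlsplit

async def fetch_status(url, timeout=5):
    """Open a connection, send HEAD, read the status line, then close."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 80
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout)
        writer.write(("HEAD %s HTTP/1.1\r\nHost: %s\r\n"
                      "Connection: close\r\n\r\n"
                      % (parts.path or "/", host)).encode())
        await writer.drain()
        line = await asyncio.wait_for(reader.readline(), timeout)
        writer.close()
        if line.startswith(b"HTTP/"):
            return int(line.split(b" ", 2)[1])
        return None
    except (OSError, asyncio.TimeoutError, ValueError, IndexError):
        return None

async def check_many(urls, concurrency=1000):
    """Run checks with bounded concurrency via a semaphore."""
    sem = asyncio.Semaphore(concurrency)
    async def guarded(u):
        async with sem:
            return await fetch_status(u)
    return await asyncio.gather(*(guarded(u) for u in urls))
```

A single process can hold thousands of sockets open this way; for 200M URLs you would still shard the list across processes or machines.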

大家讲道理

Would using HEAD requests be faster? A HEAD response carries headers only, no body.
