python - I wrote a crawler with Scrapy, but the server now returns 202 for every request. What should I do?
黄舟 2017-06-28 09:25:09

I'm crawling the Chinese Judgment Documents Network. It worked fine before: the server returned 200 and I parsed the data from the response body.

But about a week ago every request suddenly started coming back with 202 and an empty response body, so no data can be extracted at all. In the callback I tried blocking with while (response.status == 202) and even sleeping, but the status never changes.
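(Note: blocking or sleeping inside a Scrapy callback stalls the whole reactor, and the response object you already received will never change; a retry has to go out as a new request. A minimal sketch, assuming the default downloader middlewares are enabled, is to tell the built-in RetryMiddleware to treat 202 as retryable in settings.py; the retry counts and delay below are illustrative values, not from the original post:

```python
# settings.py -- minimal sketch: let RetryMiddleware re-schedule 202 responses
RETRY_ENABLED = True
RETRY_HTTP_CODES = [202, 500, 502, 503, 504, 408, 429]  # add 202 to the defaults
RETRY_TIMES = 5          # give up after 5 extra attempts per request
DOWNLOAD_DELAY = 2       # slow down a little in case the 202 is rate limiting
```
)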

What can I do about this?

I use Crawlera's IP proxy service. It also returned 202 for a while once before, but that cleared up after a day; this time it has lasted a week, which is strange.
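(For reference, a sketch of the usual scrapy-crawlera configuration, assuming that plugin is installed; the API key is a placeholder:

```python
# settings.py -- route requests through Crawlera via the scrapy-crawlera plugin
DOWNLOADER_MIDDLEWARES = {
    'scrapy_crawlera.CrawleraMiddleware': 610,
}
CRAWLERA_ENABLED = True
CRAWLERA_APIKEY = '<your-api-key>'  # placeholder
```
)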

I suspect the target site is heavily loaded, so I want to send requests asynchronously, but how do I receive the responses correctly in Scrapy?
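(Scrapy is already asynchronous: you yield Requests and handle the Responses in callbacks instead of waiting in place. A minimal sketch of re-queuing a request when a 202 arrives, instead of blocking; the spider name, URL, and extracted field here are placeholders, not from the original post:

```python
import scrapy

class WenshuSpider(scrapy.Spider):
    # Hypothetical spider: name and start_urls are placeholders
    name = "wenshu"
    start_urls = ["http://wenshu.court.gov.cn/"]

    def parse(self, response):
        if response.status == 202:
            # Re-schedule the same request rather than sleeping in the callback;
            # dont_filter=True lets it past the duplicate filter
            yield response.request.replace(dont_filter=True)
            return
        # Normal 200 path: extract whatever you need from the body
        yield {"body_length": len(response.body)}
```
)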


Replies (2)
学霸

This usually means the server has put anti-crawling restrictions in place because it considers the crawling abusive. If you are scraping legitimately, contact the site's content team to check whether you were blocked by mistake. If the scraping is not authorized, it's best not to do it; in serious cases there can be a risk of prosecution.

过去多啦不再A梦

If you are being blocked from scraping, try switching IP addresses or look for gaps in the site's anti-scraping measures; a rotating-proxy sketch is shown below.
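(A minimal sketch of per-request proxy rotation with a custom downloader middleware; the proxy list and middleware name are hypothetical, and the class still needs to be enabled in DOWNLOADER_MIDDLEWARES:

```python
# middlewares.py -- assign a random proxy to each outgoing request
import random

PROXY_POOL = [  # placeholder proxies
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]

class RandomProxyMiddleware:
    def process_request(self, request, spider):
        # Scrapy's HttpProxyMiddleware honours request.meta["proxy"]
        request.meta["proxy"] = random.choice(PROXY_POOL)
```
)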
