Loop over the paginated URLs and request each page in turn:
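# Inside a spider method; Request is scrapy.Request
for i in range(3):
    # One request per page index, each handled by parse_page
    yield Request("http:xx/page/%s" % i, callback=self.parse_page)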
Every request succeeds, but the response content is the same each time: it is always the content of the first request. Requesting the same paginated URLs one by one in Postman does not show this problem. Has the crawler been banned? It never behaved like this before.
The next step is to compare the request headers sent by Postman or a browser with the request headers sent by Scrapy.
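One way to make that comparison concrete is to diff the two header sets. The snippet below is only a sketch: the function name diff_headers and all header values are made up for illustration, and in practice the two dicts would be copied from Postman (or the browser's network panel) and from the Scrapy request.

```python
def diff_headers(reference, candidate):
    """Return {header_name: (reference_value, candidate_value)} for every
    header that is missing on one side or has a different value.
    Header names are compared case-insensitively."""
    ref = {name.lower(): value for name, value in reference.items()}
    cand = {name.lower(): value for name, value in candidate.items()}
    differences = {}
    for name in sorted(set(ref) | set(cand)):
        if ref.get(name) != cand.get(name):
            differences[name] = (ref.get(name), cand.get(name))
    return differences


# Hypothetical header sets, for illustration only.
postman_headers = {
    "User-Agent": "PostmanRuntime/7.0",
    "Accept": "*/*",
    "Cookie": "session=abc",
}
scrapy_headers = {
    "user-agent": "Scrapy/1.4",
    "Accept": "*/*",
}

for name, (in_postman, in_scrapy) in diff_headers(postman_headers, scrapy_headers).items():
    print("%s: postman=%r scrapy=%r" % (name, in_postman, in_scrapy))
```

Inside a Scrapy callback, response.request.headers holds the headers that were sent with the request, and the headers argument of Request can be used to send different ones.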
The spider may have been detected by the site's anti-crawling measures.
Check the log printed to the console to see whether the next page was actually crawled:
2017-06-29 09:26:13 [scrapy] DEBUG: Scraped from <200 http:xx/page/x>,
Pay attention to whether the final x in http:xx/page/x changes from one log line to the next.
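For a long log, this check can be scripted. The sketch below assumes the log lines follow the "Scraped from <status url>" shape shown above; the helper name scraped_urls and the sample lines (page numbers 0, 0, 1) are hypothetical and only illustrate the idea.

```python
import re

# Matches the "Scraped from <200 http:xx/page/x>" part of a log line.
SCRAPED_RE = re.compile(r"Scraped from <(\d+) ([^>\s]+)>")


def scraped_urls(log_lines):
    """Collect the URL from every 'Scraped from' log line, in order."""
    urls = []
    for line in log_lines:
        match = SCRAPED_RE.search(line)
        if match:
            urls.append(match.group(2))
    return urls


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "2017-06-29 09:26:13 [scrapy] DEBUG: Scraped from <200 http:xx/page/0>",
    "2017-06-29 09:26:14 [scrapy] DEBUG: Scraped from <200 http:xx/page/0>",
    "2017-06-29 09:26:15 [scrapy] DEBUG: Scraped from <200 http:xx/page/1>",
]

urls = scraped_urls(sample_log)
distinct = sorted(set(urls))
print("scraped lines:", len(urls))
print("distinct pages:", distinct)
```

If the number of distinct pages is smaller than the number of pages requested, some page URLs never appeared in the log and the pagination itself is worth checking before the headers.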