import sys
import time
import requests
import json
reload(sys)
sys.setdefaultencoding('utf-8')
time=int(time.time())
session=requests.session()
user_agent='Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.87 Safari/537.36'
headers={'User-Agent':user_agent,'Host':'xygs.gsaic.gov.cn','Connection':'keep-alive','Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'}
params={'pripid':'62030200052016012700011'}
cookies={'JSESSIONID':'2B33BC6D34DF44BE8D76C2AE20701D95'}
Url='http://xygs.gsaic.gov.cn/gsxygs/smallEnt!view.do?pripid=62030200052016012700011'
captcha=session.get(Url,headers=headers,params=(params),cookies=cookies).text
print captcha
I can't get the information in the table. Can anyone explain why?
https://segmentfault.com/q/1010000005117988
I answered your previous question, but I don't know whether it solved your problem, since you never responded. If it did, remember to accept the answer. The code for this question is as follows:
The page loads the table with Ajax. You can use the Network panel in Chrome's developer tools to find the request that actually supplies the table data.
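If the table does come from a separate Ajax request, one option is to call that request directly instead of the page URL. The following is only a rough sketch in Python 3 with requests: `AJAX_URL` is a made-up placeholder (the real address has to be copied from the Network panel), and the `X-Requested-With` and `Referer` headers are assumptions about what the server might expect, not something confirmed for this site.

```python
import requests

# Hypothetical placeholder: replace with the request URL shown in the Network panel.
AJAX_URL = "http://example.com/ajax-endpoint"


def build_ajax_request(url, pripid, referer):
    """Prepare (but do not send) a GET request that mimics a browser XHR call."""
    headers = {
        "User-Agent": "Mozilla/5.0",
        # Many Ajax libraries send this header; whether this site checks it is an assumption.
        "X-Requested-With": "XMLHttpRequest",
        "Referer": referer,
    }
    request = requests.Request("GET", url, headers=headers, params={"pripid": pripid})
    return request.prepare()


def fetch_table(session, prepared):
    """Send the prepared request and return the raw response body."""
    response = session.send(prepared, timeout=10)
    response.raise_for_status()
    return response.text
```

To try it, build the request with the real URL and the page address as the referer, then pass it to `fetch_table(requests.Session(), prepared)`. Printing `prepared.url` and `prepared.headers` first makes it easy to compare against what Chrome shows.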
I just checked: the request is missing the Accept-Language header. Also, writing a crawler takes more than knowing Python. It is worth learning some web development basics, especially JavaScript and the HTTP protocol. Sorry, I did not read the question carefully at first because I answered from my phone.
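A minimal sketch of that fix, written for Python 3 with requests (the question's code is Python 2). The header set is copied from the question with `Accept-Language` added. The value `zh-CN,zh;q=0.8` is an assumption modelled on what a Chinese-locale Chrome typically sends, and `build_headers` and `fetch_page` are names made up for this example.

```python
import requests

URL = "http://xygs.gsaic.gov.cn/gsxygs/smallEnt!view.do"


def build_headers():
    """Return the question's headers plus the missing Accept-Language."""
    return {
        "User-Agent": (
            "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/50.0.2661.87 Safari/537.36"
        ),
        "Host": "xygs.gsaic.gov.cn",
        "Connection": "keep-alive",
        "Accept": (
            "text/html,application/xhtml+xml,application/xml;q=0.9,"
            "image/webp,*/*;q=0.8"
        ),
        # Assumed value; copy the exact one from the browser's request if it differs.
        "Accept-Language": "zh-CN,zh;q=0.8",
    }


def fetch_page(pripid):
    """Request the page, passing pripid once via params rather than also in the URL."""
    session = requests.Session()
    response = session.get(
        URL, headers=build_headers(), params={"pripid": pripid}, timeout=10
    )
    response.raise_for_status()
    return response.text
```

Calling `print(fetch_page("62030200052016012700011"))` would then print the returned HTML. The hard-coded `JSESSIONID` cookie from the question is left out here on the assumption that the session will pick up its own cookie from the server.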