I just learned how to fetch JSON content, but the website I tried to crawl today does not return JSON, and a random number is appended to the request link on every request.
I don't know whether that affects the content I want to crawl.
The content I need is the part in the middle of the picture below.
Website link: http://www.szse.cn/main/discl...
Code I tried myself:
import requests

dir = '/Users/S1Lence/Desktop/new_html/szse/许可类重组问询函'
headers = {'Host': 'www.szse.cn',
           'Referer': 'http://www.szse.cn/main/disclosure/jgxxgk/wxhj/',
           'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36'}
payload = {'ACTIONID': '7',
           'CATALOGID': 'main_wxhj',
           'TABKEY': 'tab1',
           'selecthjlb': '许可类重组问询函',
           'tab1PAGENO': '1',
           'tab1PAGECOUNT': '7',
           'tab1RECORDCOUNT': '63',
           'REPORT_ACTION': 'navigate'}
res = requests.post('http://www.szse.cn/szseWeb/FrontControllere', data=payload)
The output is not what I want. How should I crawl this?
Copy the request's header information and send it with your own request.
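A rough sketch of that suggestion: the code in the question defines headers but never passes it to requests.post, so the server never sees it. The helper names build_payload and fetch_page below are hypothetical, the form fields are copied from the question as-is, and url is left as a parameter because the correct POST address is not confirmed here. The session argument is assumed to be a requests.Session.

```python
# Sketch only: build_payload and fetch_page are hypothetical helper names.
# Header values are copied from the question, not verified against the site.
HEADERS = {
    'Host': 'www.szse.cn',
    'Referer': 'http://www.szse.cn/main/disclosure/jgxxgk/wxhj/',
    'User-Agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/59.0.3071.109 Safari/537.36'),
}


def build_payload(page_no, category='许可类重组问询函'):
    """Return the form fields from the question, varying only the page number."""
    return {
        'ACTIONID': '7',
        'CATALOGID': 'main_wxhj',
        'TABKEY': 'tab1',
        'selecthjlb': category,
        'tab1PAGENO': str(page_no),
        'tab1PAGECOUNT': '7',
        'tab1RECORDCOUNT': '63',
        'REPORT_ACTION': 'navigate',
    }


def fetch_page(session, url, page_no):
    """POST the form with the copied headers and return the response body.

    `session` is assumed to be a requests.Session (anything with a
    compatible .post method works); `url` must be the correct POST address.
    """
    res = session.post(url, headers=HEADERS,
                       data=build_payload(page_no), timeout=10)
    res.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return res.text
```

After import requests, this could be called as fetch_page(requests.Session(), url, 1), with url set to the right POST address for the site.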
The URL you are posting to is wrong, it should be