Here is a page that shows a similar problem: http://detail.zol.com.cn/inde...
Test code:
import urllib2

url = 'http://detail.zol.com.cn/inde...'
response = None
try:
    response = urllib2.urlopen(url, timeout=5)
    html = response.read()
    print html
    print "hehe"
except urllib2.URLError as e:
    # HTTPError carries a status code; a plain URLError only has a reason
    if hasattr(e, 'code'):
        print 'Error code:', e.code
    elif hasattr(e, 'reason'):
        print 'Reason:', e.reason
finally:
    # close the connection only if it was actually opened
    if response:
        response.close()
Output:
C:\Python27\python.exe C:/Users/Administrator/PycharmProjects/untitled/data02
hehe
Process finished with exit code 0
This code also comes back empty when I run it:
import urllib2
from bs4 import BeautifulSoup

# url is the same address as in the test code above
page = urllib2.Request(url)
page.add_header('Referer', url)
page.add_header('User-Agent', "Mozilla/5.0 (Windows NT 6.2; rv:16.0) Gecko/20100101 Firefox/16.0")
r = urllib2.urlopen(page, timeout=5.0)
html = r.read()
soup = BeautifulSoup(html, 'lxml')
Take the brute-force route and send cookies with the request.
I cleared my browser cookies and visited the page, and it came back empty in the browser as well. Looking into it, I found that the page sets its cookies from JavaScript and that they are encrypted. If you are comfortable with JavaScript, you can try working out how the cookies are generated. If that doesn't pan out, drive Chrome with selenium instead.
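As a rough sketch of the selenium route, the snippet below lets Chrome run the page's JavaScript, collects the cookies it sets, and attaches them to an ordinary request. The helper names (`cookies_to_header`, `get_browser_cookies`, `build_request`) are made up for illustration. It also assumes selenium and a matching chromedriver are installed, and that the site accepts the browser's cookies when they are replayed from a script, which I have not verified.

```python
try:
    import urllib2  # Python 2, as used in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback

USER_AGENT = "Mozilla/5.0 (Windows NT 6.2; rv:16.0) Gecko/20100101 Firefox/16.0"


def cookies_to_header(cookies):
    """Join selenium-style cookie dicts into one Cookie header value."""
    return "; ".join("%s=%s" % (c["name"], c["value"]) for c in cookies)


def get_browser_cookies(url):
    """Load url in Chrome so the page's JavaScript can set its cookies."""
    # Imported here so the other helpers work without selenium installed.
    from selenium import webdriver
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Each cookie is a dict with at least 'name' and 'value' keys.
        return driver.get_cookies()
    finally:
        driver.quit()


def build_request(url, cookies):
    """Build a request carrying the same headers as before plus the cookies."""
    request = urllib2.Request(url)
    request.add_header("Referer", url)
    request.add_header("User-Agent", USER_AGENT)
    request.add_header("Cookie", cookies_to_header(cookies))
    return request
```

You would then call `urllib2.urlopen(build_request(url, get_browser_cookies(url)), timeout=5)` and read the response as before. If the body is still empty, it is probably simpler to skip urllib2 altogether and parse `driver.page_source` straight from the browser session.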