Here is a page that shows a similar problem: http://detail.zol.com.cn/inde...
Test code:
import urllib2

url = 'http://detail.zol.com.cn/inde...'
response = None
try:
    response = urllib2.urlopen(url, timeout=5)
    html = response.read()
    print html
    print "hehe"
except urllib2.URLError as e:
    # HTTPError carries a status code; a plain URLError only has a reason.
    if hasattr(e, 'code'):
        print 'Error code:', e.code
    elif hasattr(e, 'reason'):
        print 'Reason:', e.reason
finally:
    # Close the connection only if it was actually opened.
    if response:
        response.close()
Output:
C:\Python27\python.exe C:/Users/Administrator/PycharmProjects/untitled/data02
hehe
Process finished with exit code 0
This code, which adds Referer and User-Agent headers, also comes back empty:
import urllib2
from bs4 import BeautifulSoup

page = urllib2.Request(url)  # url as defined in the test code above
page.add_header('Referer', url)
page.add_header('User-Agent', "Mozilla/5.0 (Windows NT 6.2; rv:16.0) Gecko/20100101 Firefox/16.0")
r = urllib2.urlopen(page, timeout=5.0)
html = r.read()
soup = BeautifulSoup(html, 'lxml')
Try the brute-force approach: send the cookie along with the request.
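A rough sketch of what sending the cookie could look like. The helper names build_request and fetch are made up for illustration, and the cookie string is assumed to be copied by hand from the browser's developer tools after the page has loaded there. The import fallback is only there so the same snippet also runs on Python 3, where urllib2 became urllib.request.

```python
# Works on Python 2 (urllib2) and Python 3 (urllib.request).
try:
    import urllib2 as urlrequest
except ImportError:
    import urllib.request as urlrequest

USER_AGENT = "Mozilla/5.0 (Windows NT 6.2; rv:16.0) Gecko/20100101 Firefox/16.0"


def build_request(url, cookie):
    """Build a request carrying a Cookie header (hypothetical helper)."""
    req = urlrequest.Request(url)
    req.add_header('Referer', url)
    req.add_header('User-Agent', USER_AGENT)
    # Assumption: `cookie` is the raw "name=value; name2=value2" string
    # copied from a browser session that displayed the page correctly.
    req.add_header('Cookie', cookie)
    return req


def fetch(url, cookie, timeout=5):
    """Send the request and return the raw response body."""
    response = urlrequest.urlopen(build_request(url, cookie), timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

Usage would be something like html = fetch(url, cookie_copied_from_browser). A cookie copied this way may expire, so this is more of a quick check than a lasting fix.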
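I cleared the browser's cookies and visited the page again, and it came back empty there too. After looking into it: the cookie is encrypted and is set by JavaScript. If you are good with JS you can try to work out how it is generated; if that gets you nowhere, just drive Chrome with selenium.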
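For the selenium route, a minimal sketch might look like the following. It assumes the selenium package is installed and that Chrome plus a matching chromedriver are available; fetch_rendered_html and looks_empty are hypothetical helper names, not anything from the site or from selenium itself. I have not verified it against this particular page, and the page may need a short wait or a reload before the cookie set by the script takes effect.

```python
def looks_empty(html):
    """Return True when the body is missing or contains only whitespace."""
    return html is None or not html.strip()


def fetch_rendered_html(url):
    """Load the page in a real Chrome so the site's JavaScript can run."""
    # Imported here so looks_empty() stays usable without selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Chrome and chromedriver are set up
    try:
        driver.get(url)
        # page_source is the DOM as the browser currently sees it.
        return driver.page_source
    finally:
        # Always shut the browser down, even if loading failed.
        driver.quit()
```

A call such as html = fetch_rendered_html(url) followed by looks_empty(html) gives a quick way to tell whether the browser got real content where urllib2 got nothing.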