This article shows how to detect a web page's character encoding in Python, with example code. Readers who need it can use it as a reference.
Detecting a web page's encoding with Python: implementation code
In Python development, automatically detecting a web page's encoding relies on the chardet library, which performs character set detection. The library is not included with Python 2.7 and has to be downloaded from the official site. Here I downloaded the chardet-2.3.0.tar.gz archive; you only need to extract it and place the resulting chardet folder under python27/lib/site-packages/ in the Python installation directory.
Then import chardet in your script.
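Before relying on the import, it can help to confirm that the manually copied package is actually visible to the interpreter. The following is a minimal sketch in Python 3 using only the standard library; the helper name is_installed is hypothetical and not part of chardet.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when no module of that name is importable
    return importlib.util.find_spec(module_name) is not None


if is_installed("chardet"):
    print("chardet is importable")
else:
    print("chardet was not found; check the site-packages directory")
```

If the check reports that chardet was not found, the chardet folder is most likely not in the site-packages directory of the interpreter that is running the script.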
The automatic detection function below fetches the page at a given URL and returns the encoding detected for that page.
    import chardet  # character set detection
    import urllib

    url = "http://www.jd.com"

    def automatic_detect(url):
        # Download the raw bytes of the page
        content = urllib.urlopen(url).read()
        # chardet.detect returns a dictionary describing its best guess
        result = chardet.detect(content)
        encoding = result['encoding']
        return encoding

    urls = ['http://www.baidu.com', 'http://www.163.com', 'http://dangdang.com']
    for url in urls:
        print url, automatic_detect(url)
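The code above targets Python 2.7: urllib.urlopen and the print statement do not exist in Python 3. As a rough sketch only, a Python 3 version might look like the following. It assumes chardet is installed for the Python 3 interpreter; the helper detect_bytes and the timeout parameter are additions made here for illustration and are not in the original code.

```python
import urllib.request

import chardet  # third-party: character set detection


def detect_bytes(content):
    """Return chardet's best guess for the encoding of a bytes object."""
    return chardet.detect(content)["encoding"]


def automatic_detect(url, timeout=10):
    """Download the page at `url` and return its guessed encoding."""
    # The timeout is an assumption added here; the original sets none
    with urllib.request.urlopen(url, timeout=timeout) as response:
        content = response.read()
    return detect_bytes(content)
```

Calling automatic_detect on each of the three URLs from the original list and printing the result reproduces the original loop. Splitting out detect_bytes keeps the detection step usable on bytes that were obtained some other way, for example from a file.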
The code above calls the detect function of the chardet module, which returns a dictionary; the name of the detected encoding is then read from that dictionary's 'encoding' key.
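Besides 'encoding', the dictionary returned by detect also carries a 'confidence' value between 0 and 1, and 'encoding' can be None when chardet cannot make a guess. The Python 3 sketch below shows one possible way to use both keys when decoding downloaded bytes. The function name decode_with_detection, the 0.5 threshold, and the UTF-8 fallback are illustrative assumptions, not recommendations from chardet itself.

```python
import chardet  # third-party: character set detection


def decode_with_detection(content, min_confidence=0.5, fallback="utf-8"):
    """Decode bytes using chardet's guess, or `fallback` if the guess is weak."""
    result = chardet.detect(content)
    encoding = result["encoding"]
    confidence = result["confidence"]
    # Fall back when chardet has no guess or is not confident enough
    if encoding is None or confidence < min_confidence:
        encoding = fallback
    # errors="replace" avoids an exception if some bytes do not fit the guess
    return content.decode(encoding, errors="replace"), encoding
```

The function returns both the decoded text and the encoding that was actually used, so the caller can log or inspect cases where the fallback was taken.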
Thanks for reading. I hope this helps, and thank you all for your support of this site!