html - python decode('utf-8') error: invalid start byte?
阿神 2017-04-17 17:27:16

I'm writing a Python crawler, and while building the downloader I found that some pages (others work fine) can't be decoded with decode('utf-8'). Looking at the page source, it does contain <meta charset=UTF-8>, which says the page is UTF-8 encoded, so why does decoding fail?

Error output for a page that fails to decode:

Traceback (most recent call last):
  File "E:/python爬虫/test.py", line 13, in <module>
    print(data.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Here is the code I use to fetch the HTML data and decode it:

import urllib.request

url = 'http://wiki.52poke.com/wiki/%E8%B7%AF%E5%8D%A1%E5%88%A9%E6%AC%A7'
req = urllib.request.Request(url)
res = urllib.request.urlopen(req)
data = res.read()
print(data)
print(data.decode('utf-8'))

Output (the data from a page that fails to decode; only part of it is shown, since it's long):

b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xed\xbdys\x1bG\x96/\xfa\xf7\xe8S\xa0\xe1\xcbi{\xc6\xd8wJB\x07 Q\xe3~\xaf\xdd\xa3\xb1=3vx\xfa9@\xa2D\xa2\x05\x02\xb8\x00\xa8\xc5=\xfd\x02\x94Lq\'\xb5P\xd4Bj\xa1,J\xd4FR\x12-q\x15#\xde\xfd&\x16\xaa\x00\xc4\xbd\x11\xfe\n\xef\x9c\xcc\xaaBU\xa1\xb0\x14\tR\x90\x94\x9e\x1e\xb1\x90U\x95u2\xf3\xe4\xd9\xf2\xe4/\x0f\xfd\xee\xe8\xbf\x1e\xf9\xe6\xbb\xe3\x1d\xa6\x9elo<x\xe0\x10\xfe1\xc5#\x89\xee\xc3?\xf6\x98\xa2\xb1\xf4\xe1x6m\xea\x8aG2\x99\xc3]\xf1\x18\x97\xc8Z\x12\xc9\xbff\xf0A.\x12\x85?\xbd\\6b\xea\xea\x89\xa43\\\xf6\xf0\xbf\x7fs\xcc\xe2\x87\xc2l,

Output (from a page that decodes successfully):

b'<!DOCTYPE html>\n<html lang=zh dir=ltr class=client-nojs>\n<head>\n<meta charset=UTF-8>\n<title>\xe6\x80\xaa\xe6\xb2\xb3\xe9\xa9\xac - \xe7\xa5\x9e\xe5\xa5\x87\xe5\xae\x9d\xe8\xb4\x9d\xe7\x99\xbe\xe7\xa7\x91</title>\n<script>document.documentElement.className = document.documentElement.className.replace( /(^|\\s)client-nojs(\\s|$)/, "$1client-js$2" );</script>\n<script>window.RLQ = window.RLQ || []; window.RLQ.push( function () {\nmw.config.set({"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"\xe6\x80\xaa\xe6\xb2\xb3\xe9\xa9\xac","wgTitle":"\xe6\x80\xaa\xe6\xb2\xb3\xe9\xa9\xac","wgCurRevisionId":651454,"wgRevisionId":651454,"wgArticleId":602,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["\xe6\x8b\xa5\xe6\x9c\x89\xe6\xb2\x99\xe6\xb5\x81\xe7\x89\xb9\xe6\x80\xa7\xe7\x9a\x84\xe7\xa5\x9e\xe5\xa5\x87\xe5\xae\x9d\xe8\xb4\x9d","\xe7\xa5\x9e\xe5\xa5\xa5\xe5\x9c\xb0\xe6\x96\xb9\xe5\xae\x9d\xe5\x8f\xaf\xe6\xa2\xa6","\xe5\x8d\xa1\xe6\xb4\x9b\xe6\x96\xaf\xe5\x9c\xb0\xe6\x96\xb9\xe5\xae\x9d\xe5\x8f\xaf\xe6\xa2\xa6",

I don't understand why some pages come back as readable bytes like <!DOCTYPE html>\n<html lang=zh dir=ltr class=client-nojs>\n<head>\n<meta charset=UTF-8>, while others come back entirely as \xkk-style bytes.
I've been stuck on this all morning. My guess is that something about the byte count makes some pages undecodable? Hoping an expert can clear this up.


All replies (2)
洪涛

The part that fails doesn't look like any text encoding; maybe it's binary data?

迷茫

The returned data is gzip-compressed; you need to decompress it before decoding. Also, since you already know how to use BeautifulSoup for crawlers, why not use requests instead of urllib? requests decompresses gzipped responses for you.
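A minimal sketch of that fix, assuming you stay with urllib as in the question (the helper name decode_body is mine, not from the original code). The clue is in the failed output itself: the first two bytes, \x1f\x8b, are the gzip magic number.

```python
import gzip

def decode_body(data: bytes) -> str:
    """Decompress a gzip-compressed HTTP body if needed, then decode as UTF-8."""
    # gzip streams begin with the magic bytes 0x1f 0x8b -- exactly the
    # \x1f\x8b at the start of the "failed" output above.
    if data[:2] == b'\x1f\x8b':
        data = gzip.decompress(data)
    return data.decode('utf-8')
```

With the question's code, you would call print(decode_body(res.read())) instead of data.decode('utf-8'). The more robust check is the response's Content-Encoding header rather than sniffing magic bytes, but the sniff matches what the pasted output shows.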
