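When parsing HTML with Beautiful Soup 3, HTML entities such as &pound; often survive in the extracted strings and need to be decoded. On Python 3.4 and later, the html.unescape() function does this; older versions can use the unescape() method of an HTMLParser instance.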
Use html.unescape():
import html
print(html.unescape('&pound;682m'))  # prints: £682m
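As a small illustration of what html.unescape() accepts, the sketch below decodes the same character written as a named, a decimal, and a hexadecimal reference. The sample strings and the `samples` and `decoded` names are made up for this example and assume Python 3.4 or later.

```python
import html

# Three spellings of the pound sign: named, decimal, and hexadecimal references.
samples = {
    "named": "&pound;682m",
    "decimal": "&#163;682m",
    "hex": "&#xA3;682m",
}

# html.unescape() replaces every character reference it finds in the string.
decoded = {kind: html.unescape(text) for kind, text in samples.items()}

for kind, text in decoded.items():
    print(kind, "->", text)

# Decoding is a single pass: an escaped ampersand yields a literal "&pound;".
once = html.unescape("&amp;pound;682m")
print(once)
```

All three spellings decode to the same string, so the caller does not need to know which form the page used.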
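On Python 2.6-2.7 (the HTMLParser module) or Python 3 before 3.9 (the html.parser module), call unescape() on an HTMLParser instance. This method was deprecated in Python 3.4 and removed in Python 3.9: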
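from html.parser import HTMLParser  # on Python 2: from HTMLParser import HTMLParser
parser = HTMLParser()
print(parser.unescape('&pound;682m'))  # unescape() is unavailable on Python 3.9+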
Alternatively, with the six compatibility library:
from six.moves.html_parser import HTMLParser
parser = HTMLParser()
print(parser.unescape('&pound;682m'))
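If the same code has to run on several Python versions without depending on six, one possible approach is to pick the decoder at import time. The following is a minimal sketch under that assumption; `decode_entities` is a hypothetical helper name, not part of any library.

```python
# Prefer html.unescape (Python 3.4+); fall back to HTMLParser.unescape on older versions.
try:
    from html import unescape
except ImportError:
    try:
        from html.parser import HTMLParser  # Python 3.0-3.3
    except ImportError:
        from HTMLParser import HTMLParser  # Python 2.6-2.7
    unescape = HTMLParser().unescape


def decode_entities(text):
    """Hypothetical wrapper: decode HTML character references in text."""
    return unescape(text)


print(decode_entities("&pound;682m"))
```

Calling code then only uses decode_entities() and never needs to check the interpreter version itself.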