Parsing HTML with Python
Question:
How can I access and manipulate HTML elements using a Python parser? I need a module that allows me to get tags and their content in a structured format, similar to the nested structure displayed in Firefox's "Inspect element" feature.
Answer:
BeautifulSoup
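BeautifulSoup (installed as the beautifulsoup4 package and imported from bs4) is a widely used Python library for parsing HTML. It converts a document into a tree of nested tag objects, so you can reach an element by walking down from its parent or by searching for it by tag name and attributes.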
Example:
To parse the HTML document you provided:
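from bs4 import BeautifulSoup

# The markup is cut off in the source; "..." marks the missing remainder.
html = "<html><head>Heading</head><body attr1='val1'><div class='container'><div>..."
# Naming the parser explicitly avoids BeautifulSoup's "no parser specified" warning.
parsed_html = BeautifulSoup(html, 'html.parser')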
To get the text content of the "container" div within the "body" tag:
print(parsed_html.body.find('div', attrs={'class':'container'}).text)
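Beyond .text, the parsed tree also exposes attributes and child tags, which is what gives the "Inspect element" style view asked about in the question. The following is a minimal sketch rather than a definitive recipe: the markup is a small hypothetical snippet, and the outline helper is an illustrative name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; it is not the document from the question.
html = """
<html><head><title>Heading</title></head>
<body attr1="val1">
  <div class="container">
    <div id="first">Something here</div>
    <div>Something else</div>
  </div>
</body></html>
"""

# "html.parser" selects Python's built-in parser as the backend.
soup = BeautifulSoup(html, "html.parser")

# Attributes are read from a tag with dictionary-style access.
body_attr = soup.body["attr1"]

# Find the first div whose class is "container" inside <body>.
container = soup.body.find("div", attrs={"class": "container"})

# recursive=False restricts the search to direct children of the container.
inner_texts = [
    div.get_text(strip=True)
    for div in container.find_all("div", recursive=False)
]


def outline(tag, depth=0):
    """Return the tag names under `tag`, indented by nesting depth (hypothetical helper)."""
    lines = ["  " * depth + tag.name]
    # find_all(True) matches every tag; text nodes are skipped.
    for child in tag.find_all(True, recursive=False):
        lines.extend(outline(child, depth + 1))
    return lines


print(body_attr)
print(inner_texts)
print("\n".join(outline(soup.html)))
```

The outline function only records tag names; extending it to include attributes or text would follow the same recursive pattern.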
Other Options:
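One option that needs no extra installation is the standard library's html.parser module. It is event-based: instead of building a tree, it calls handler methods as it meets start tags, end tags, and text, so any nesting has to be tracked by hand. The sketch below is an assumption about how that could look, not something taken from the original answer; OutlineParser and its events list are hypothetical names.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Records each start tag and text node together with its nesting depth.

    Assumes well-formed input: every start tag has a matching end tag.
    Void elements written without a closing slash (e.g. <br>) would
    throw the depth counter off.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.events.append((self.depth, tag, dict(attrs)))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only text between tags
            self.events.append((self.depth, "#text", text))


parser = OutlineParser()
parser.feed("<div class='container'><div>Something here</div></div>")
parser.close()

for depth, name, detail in parser.events:
    print("  " * depth + name, detail)
```

This approach suits simple extraction tasks; for searching and navigating a document as a tree, a tree-building library such as BeautifulSoup is usually more convenient.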