Parsing specific node content in XML with Python
XML is a widely used format for storing and transmitting data. It describes the structure of data with tags and attributes and is a self-describing file format. Python offers several libraries for parsing XML files and extracting the content of specific nodes.
This article shows how to use Python to parse an XML file and extract the content of specific nodes. We will use ElementTree (the xml.etree.ElementTree module in Python's standard library), which provides a simple and intuitive API for parsing XML.
ElementTree is part of the Python standard library, so there is nothing to install with pip; importing the module is enough. Suppose we have the following XML file (named example.xml):
<?xml version="1.0" encoding="UTF-8"?>
<students>
    <student>
        <name>张三</name>
        <age>18</age>
        <gender>男</gender>
    </student>
    <student>
        <name>李四</name>
        <age>20</age>
        <gender>女</gender>
    </student>
</students>
Our goal is to extract the name, age, and gender of each student node.
First, we import the ElementTree module and load the XML file with the parse() function:
import xml.etree.ElementTree as ET

tree = ET.parse('example.xml')
root = tree.getroot()
Calling the parse() function with the path of the XML file loads the document into the tree object. The getroot() method then returns the root node of the document, which here is the students element.
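As a quick check of what the root node looks like, the following minimal sketch parses the same document from an in-memory string with ET.fromstring(), which returns the root element directly. The XML_TEXT constant is an assumption introduced here so that the snippet runs without example.xml on disk.

```python
import xml.etree.ElementTree as ET

# Same content as example.xml, inlined so the snippet is self-contained.
XML_TEXT = """<students>
  <student><name>张三</name><age>18</age><gender>男</gender></student>
  <student><name>李四</name><age>20</age><gender>女</gender></student>
</students>"""

# fromstring() parses a string and returns the root Element directly,
# whereas parse() returns an ElementTree whose root comes from getroot().
root = ET.fromstring(XML_TEXT)

print(root.tag)   # tag name of the root element
print(len(root))  # number of direct child elements
```

An element behaves like a sequence of its direct children, which is why len(root) counts the student nodes.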
Next, we can use the findall() method to find specific nodes by name. findall() takes a tag name or a path expression; ElementTree supports only a limited subset of XPath. A bare tag name matches the direct children of the node on which it is called. In our example we need all student nodes under the root, so we can use the following code:
students = root.findall('student')
The findall() method returns a list containing all nodes that match the expression, in document order. In our example, the students list contains two student nodes.
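The sketch below illustrates a few path expressions that ElementTree's limited XPath support accepts. It inlines the sample document as XML_TEXT (an assumption, so the snippet does not need example.xml) and is meant as an illustration rather than a complete reference.

```python
import xml.etree.ElementTree as ET

# Same content as example.xml, inlined so the snippet is self-contained.
XML_TEXT = """<students>
  <student><name>张三</name><age>18</age><gender>男</gender></student>
  <student><name>李四</name><age>20</age><gender>女</gender></student>
</students>"""

root = ET.fromstring(XML_TEXT)

# A bare tag name matches direct children of the element only.
direct = root.findall("student")
no_match = root.findall("name")  # <name> is not a direct child of the root

# ".//name" matches <name> elements at any depth below the root.
names = [el.text for el in root.findall(".//name")]

# A [tag='text'] predicate keeps students whose <age> child has the text "20".
aged_20 = root.findall("student[age='20']")

print(len(direct), len(no_match), names, aged_20[0].find("name").text)
```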
Next, we can iterate through the students list and extract the name, age, and gender from each student node. For each student node, we call the find() method with the child node's name to get the corresponding node, and then read its text content from the text attribute.
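for student in students:
    # find() returns the first matching child node; .text holds its text content
    name = student.find('name').text
    age = student.find('age').text
    gender = student.find('gender').text
    # Labels: 姓名 = name, 年龄 = age, 性别 = gender
    print(f'姓名:{name}')
    print(f'年龄:{age}')
    print(f'性别:{gender}')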
With the above code, we can print out the name, age and gender of each student node.
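One caveat: find() returns None when a child node is missing, so reading .text would raise AttributeError. The following sketch shows one way to guard against that with findtext() and a default value. The input with a missing gender node and the rows list are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the second student has no <gender> element.
XML_TEXT = """<students>
  <student><name>张三</name><age>18</age><gender>男</gender></student>
  <student><name>李四</name><age>20</age></student>
</students>"""

root = ET.fromstring(XML_TEXT)

rows = []
for student in root.findall("student"):
    # findtext() returns the supplied default when the child is missing,
    # instead of failing the way find(...).text would.
    name = student.findtext("name", default="")
    gender = student.findtext("gender", default="unknown")
    # Node text is always a string; convert explicitly when a number is needed.
    age = int(student.findtext("age", default="0"))
    rows.append((name, age, gender))

print(rows)
```

Whether a default value or an explicit error is the better response to a missing node depends on how strict the data format is supposed to be.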
The complete code is as follows:
import xml.etree.ElementTree as ET

# Load the XML file and get its root node
tree = ET.parse('example.xml')
root = tree.getroot()

# Find all student nodes directly under the root
students = root.findall('student')

for student in students:
    name = student.find('name').text
    age = student.find('age').text
    gender = student.find('gender').text
    # Labels: 姓名 = name, 年龄 = age, 性别 = gender
    print(f'姓名:{name}')
    print(f'年龄:{age}')
    print(f'性别:{gender}')
Running the above code produces the following output:
姓名:张三
年龄:18
性别:男
姓名:李四
年龄:20
性别:女
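If the extracted values are needed for further processing rather than printing, they can be collected into a list of dictionaries. The sketch below is one possible way to do that; the load_students() helper is a hypothetical name, and an in-memory BytesIO object stands in for example.xml so that the snippet runs on its own.

```python
import io
import xml.etree.ElementTree as ET


def load_students(source):
    """Return one dict per <student> node; source is a path or a file object."""
    root = ET.parse(source).getroot()
    return [
        # Iterating over a node yields its direct child nodes.
        {child.tag: child.text for child in student}
        for student in root.findall("student")
    ]


# Stand-in for example.xml (assumption: same content as the sample file).
data = io.BytesIO("""<students>
  <student><name>张三</name><age>18</age><gender>男</gender></student>
  <student><name>李四</name><age>20</age><gender>女</gender></student>
</students>""".encode("utf-8"))

records = load_students(data)
print(records)
```

With the real file, the call would be load_students('example.xml').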
As the example shows, parsing XML in Python and extracting the content of specific nodes is straightforward. With the ElementTree library, we can load an XML file and then find and extract the node content we need, which is useful for reading and analyzing data stored in XML files.
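Note that parse() builds the whole tree in memory, which may matter for very large files. ElementTree also provides iterparse(), which reports nodes as the input is read. The following is a minimal sketch of that approach, using an in-memory BytesIO object as a stand-in for a large file (an assumption made so the snippet is self-contained).

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a large XML file on disk.
data = io.BytesIO("""<students>
  <student><name>张三</name><age>18</age><gender>男</gender></student>
  <student><name>李四</name><age>20</age><gender>女</gender></student>
</students>""".encode("utf-8"))

names = []
# iterparse() yields (event, element) pairs while reading the input;
# an "end" event means the element and its children are fully parsed.
for event, elem in ET.iterparse(data, events=("end",)):
    if elem.tag == "student":
        names.append(elem.findtext("name"))
        # Drop the children of a student that has already been processed.
        elem.clear()

print(names)
```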
To summarize, this article showed how to use Python to parse an XML file and extract the content of specific nodes. The example demonstrated how to load an XML file with the ElementTree library and how to use the findall() and find() methods to locate the required nodes and read their content. I hope this article is helpful to beginners. For more in-depth study, refer to the official Python documentation.