This article introduces how to extract data from XML and JSON files found on the Internet. Both are widely used data formats, so knowing how to pull useful information out of them is a practical skill.
1. XML data extraction method
XML (Extensible Markup Language) is a markup language used to store and transmit data. XML data consists of tags, attributes, text and comments. The following describes how to extract data from XML files through Python.
Python's standard library includes the xml.etree.ElementTree module for processing XML data. It parses an XML file into an ElementTree object and provides methods for working with the elements it contains. Here is a simple example:
import xml.etree.ElementTree as ET

tree = ET.parse('data.xml')
root = tree.getroot()
for child in root:
    print(child.tag, child.attrib)
This code prints the tag and attributes of each direct child of the root element. Iterating over an element does not descend into nested elements.
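The following is a minimal sketch of the same idea that runs without a data.xml file on disk. The catalog document and its element names are hypothetical, chosen only for illustration. It also contrasts plain iteration, which visits direct children, with Element.iter(), which walks the whole tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
xml_text = """<catalog>
    <book id="b1" lang="en">
        <title>First Title</title>
        <author>Alice</author>
    </book>
    <book id="b2" lang="fr">
        <title>Second Title</title>
        <author>Bob</author>
    </book>
</catalog>"""

# fromstring() parses a string and returns the root element directly.
root = ET.fromstring(xml_text)

# Iterating over an element yields its direct children only.
children = [(child.tag, child.attrib) for child in root]
for tag, attrib in children:
    print(tag, attrib)

# iter() walks the whole tree in document order, including nested elements.
all_tags = [elem.tag for elem in root.iter()]
print(all_tags)
```

When the XML arrives as text (for example, the body of an HTTP response), ET.fromstring() is usually the more direct choice, while ET.parse() suits files on disk.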
XPath is a language for selecting XML nodes, and it offers a convenient way to locate and extract XML data. ElementTree supports a limited subset of XPath through the Element.findall() and Element.find() methods. Here is an example:
import xml.etree.ElementTree as ET

tree = ET.parse('data.xml')
root = tree.getroot()
# Get all book elements
books = root.findall('.//book')
# Get the text of the author child of the first book element
author = root.find(".//book[1]/author").text
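This code collects every book element in the XML file into a list, then reads the text of the author child of the first book element. Note that find() returns None when nothing matches, in which case accessing .text raises an AttributeError.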
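As a hedged sketch of a few other path expressions that ElementTree accepts, the example below uses a hypothetical catalog document (the id attribute and element names are assumptions, not part of the original data.xml). It shows findtext(), an attribute predicate, and a guard for the case where find() returns None.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
xml_text = """<catalog>
    <book id="b1"><title>First Title</title><author>Alice</author></book>
    <book id="b2"><title>Second Title</title><author>Bob</author></book>
</catalog>"""
root = ET.fromstring(xml_text)

# findall() returns a list of every matching element;
# findtext() returns the text of the first matching child.
titles = [book.findtext("title") for book in root.findall(".//book")]

# An attribute predicate selects elements by attribute value.
second = root.find(".//book[@id='b2']")
second_author = second.findtext("author") if second is not None else None

# find() returns None when nothing matches, so check before reading .text.
missing = root.find(".//book[@id='b9']/author")
missing_author = missing.text if missing is not None else "unknown"

print(titles, second_author, missing_author)
```

Checking for None before reading .text keeps the extraction from failing on documents that lack an expected element.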
2. JSON data extraction method
JSON (JavaScript Object Notation) is a lightweight data exchange format whose structure closely resembles Python dictionaries and lists. Here's how to use Python to extract data from JSON.
The json module in Python converts JSON text into Python objects: a JSON object becomes a dictionary and a JSON array becomes a list. A JSON string can be converted using the json.loads() function, as shown below:
import json

json_str = '{"name": "Alice", "age": 25, "city": "New York"}'
data = json.loads(json_str)
print(data["name"])
This code will output "Alice".
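The sketch below illustrates two related points under stated assumptions: how JSON value types map to Python types, and how json.load() (without the "s") reads from an open file rather than a string. The extra keys and the person.json file name are hypothetical, and a temporary directory is used so the example is self-contained.

```python
import json
import os
import tempfile

# Hypothetical payload with several JSON value types.
json_str = '{"name": "Alice", "age": 25, "langs": ["en", "fr"], "active": true, "nickname": null}'

# json.loads() parses a string; each JSON type maps to a Python type.
data = json.loads(json_str)
types = {key: type(value).__name__ for key, value in data.items()}
print(types)

# json.load() parses from an open file object instead of a string.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "person.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(json_str)
    with open(path, encoding="utf-8") as f:
        from_file = json.load(f)

print(from_file == data)
```

Both functions produce the same Python objects; the choice depends only on whether the JSON is already in memory or stored in a file.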
After a JSON string is converted into Python objects, those objects can be manipulated just like any other dictionaries and lists. For example, you can index a dictionary by key to get the corresponding value. Here is an example:
import json

json_str = '{"name": "Alice", "age": 25, "city": "New York"}'
data = json.loads(json_str)
print(data["name"])
This code will output "Alice".
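Real-world JSON is often nested. As a minimal sketch, assuming a hypothetical payload with an address object and an orders array, the example below chains key and index lookups, iterates over a list, and uses dict.get() to supply a default for a key that may be absent.

```python
import json

# Hypothetical nested payload, used only for illustration.
json_str = """
{
    "name": "Alice",
    "address": {"city": "New York", "zip": "10001"},
    "orders": [
        {"id": 1, "total": 19.5},
        {"id": 2, "total": 5.25}
    ]
}
"""
data = json.loads(json_str)

# Chain lookups to reach nested values: dict key, then dict key.
city = data["address"]["city"]

# Lists are indexed by position, dictionaries by key.
first_order_id = data["orders"][0]["id"]

# Iterate over a list of dictionaries like any other Python list.
order_total = sum(order["total"] for order in data["orders"])

# get() returns a default instead of raising KeyError for a missing key.
email = data.get("email", "not provided")

print(city, first_order_id, order_total, email)
```

Using get() with a default is a reasonable choice when the data comes from an external source and some fields are optional.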
Summary
This article introduces methods for extracting data from XML and JSON files on the web. Using the ElementTree module and XPath in Python makes it easy to extract data from XML files, while using the json module you can convert JSON strings into Python objects and then manipulate them like dictionaries and lists. Mastering these skills can help us process data more efficiently.