Loading and Parsing a JSON File Containing Multiple JSON Objects
Unlike a standard JSON file, which wraps all data in a single object or array, some files store one JSON object per line, a layout often called JSON Lines or newline-delimited JSON (NDJSON). Such a file is not a single valid JSON document, so Python's standard loader cannot parse it in one call.
Addressing the ValueError
When you call Python's json.load() function on a file that contains multiple objects, it raises "ValueError: Extra data". In Python 3.5 and later the exception is json.JSONDecodeError, which is a subclass of ValueError. The error means the parser found more data after the first complete JSON value.
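The error is easy to reproduce without a file. The snippet below is a minimal sketch that feeds a two-line string (an invented sample, not taken from any real data set) to json.loads() and captures the resulting message:

```python
import json

# Two JSON objects on separate lines: fine as JSON Lines,
# but not a single valid JSON document.
text = '{"a": 1}\n{"b": 2}\n'

try:
    json.loads(text)
    error_message = None
except json.JSONDecodeError as exc:  # JSONDecodeError subclasses ValueError
    error_message = str(exc)

# The message begins with "Extra data" and reports where the second object starts.
print(error_message)
```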
Solution: Line-by-Line Parsing
To handle this issue, you need to treat each line in the file as an independent JSON object. Replace your current code with the following:
import json

data = []
with open('file') as f:
    for line in f:
        data.append(json.loads(line))
This code iterates over each line in the file, parses it as a JSON object, and appends it to a list.
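Real files often contain blank lines or the occasional malformed record, and json.loads() raises an error on both. The following is a sketch of a slightly more defensive loader. The function name load_json_lines and the UTF-8 encoding are assumptions made for illustration, not part of the original solution:

```python
import json


def load_json_lines(path):
    """Return a list with one parsed object per non-blank line of the file."""
    records = []
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines instead of failing on them
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Re-raise with the file name and line number for easier debugging.
                raise ValueError(f"{path}, line {line_number}: {exc}") from exc
    return records
```

Calling load_json_lines('file') returns the same list as the loop above, but a bad record is reported together with the line it occurred on.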
Consideration for Large Files
If the JSON file is very large, collecting every object in a single list can use a lot of memory. In that case, process each object as soon as it is parsed instead of storing it. Rather than accumulating results:
data = []  # list of all objects, grows with the file
handle each object and discard it before reading the next line:
import json

with open('file') as f:
    for line in f:
        process_object(json.loads(line))  # process_object is your own handler
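One way to package this streaming pattern is a generator, so the calling code decides what to do with each object. This is a sketch under assumptions: iter_json_lines and count_records are hypothetical names, and the file is assumed to be UTF-8 encoded.

```python
import json


def iter_json_lines(path):
    """Yield one parsed object per non-blank line, without building a list."""
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def count_records(path):
    """Example consumer: counts records while holding only one at a time."""
    return sum(1 for _ in iter_json_lines(path))
```

Because the generator yields objects lazily, memory use depends on the size of a single record rather than on the size of the whole file.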
Handling Delimited JSON Objects
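If the objects are not neatly one per line, for example when they are separated by commas or span several lines, json.load() cannot read them one at a time from a file or buffer: it always consumes the entire input and raises the same "Extra data" error. Instead, read the text into a string and decode one object at a time with json.JSONDecoder.raw_decode(), which returns the parsed object together with the index where it ended:
import json

# Read the delimited JSON objects from the file into one string
with open('file') as f:
    json_string = f.read()

decoder = json.JSONDecoder()
pos = 0
while pos < len(json_string):
    # Skip whitespace and comma delimiters between objects
    if json_string[pos].isspace() or json_string[pos] == ',':
        pos += 1
        continue
    # Decode one JSON value starting at pos; pos moves just past it
    json_object, pos = decoder.raw_decode(json_string, pos)
    # Process json_object here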
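Line-by-line parsing also breaks down when a single object is pretty-printed across several lines. A reusable generator built on json.JSONDecoder.raw_decode() handles that case as well. This is a sketch: iter_json_objects is a hypothetical helper name, and the sample string is invented for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value in text, skipping whitespace and commas between them."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos].isspace() or text[pos] == ",":
            pos += 1  # raw_decode does not skip leading whitespace itself
            continue
        # raw_decode returns the parsed value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Comma-separated, newline-separated, and multi-line objects in one string
sample = '{"a": 1}, {"b": 2}\n{\n  "c": [3, 4]\n}'
objects = list(iter_json_objects(sample))
print(objects)
```

Note that this approach still reads the whole text into memory first, so for very large files with exactly one object per line, the line-by-line loop remains the lighter option.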