Splitting Columns of Nested Dictionaries in Pandas with json_normalize
When working with Pandas DataFrames in Python, you may encounter a column (in the example below, the last one) whose values are dictionaries. Extracting those values into separate columns can be awkward when the dictionaries do not all contain the same keys and therefore differ in length.
This article presents a solution using the json_normalize() function. Here's an example:
import pandas as pd

# Sample DataFrame with a column of nested dictionaries
df = pd.DataFrame({
    'Station ID': ['8809', '8810', '8811', '8812', '8813'],
    'Pollutant Levels': [
        {'a': '46', 'b': '3', 'c': '12'},
        {'a': '36', 'b': '5', 'c': '8'},
        {'b': '2', 'c': '7'},
        {'c': '11'},
        {'a': '82', 'c': '15'},
    ]
})

# Extract columns using json_normalize
df2 = pd.json_normalize(df['Pollutant Levels'])

# Concatenate with original DataFrame
df = pd.concat([df, df2], axis=1)

# Drop the original 'Pollutant Levels' column
df = df.drop(columns=['Pollutant Levels'])

print(df)
Output:
  Station ID    a    b   c
0       8809   46    3  12
1       8810   36    5   8
2       8811  NaN    2   7
3       8812  NaN  NaN  11
4       8813   82  NaN  15
This approach extracts the dictionary values into separate columns and copes with dictionaries of varying length: any key missing from a row's dictionary appears as NaN in that row. The json_normalize() function converts the dictionaries into a tabular format in a single call, so there is no need for a row-by-row apply.
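One caveat: pd.concat with axis=1 aligns rows by index label, and json_normalize() returns a result with a fresh 0..n-1 index. The example above works because df uses the default integer index. The following is a minimal sketch of how the same steps might be wrapped for a DataFrame with any index. The helper name expand_dict_column and the sample index values are hypothetical, chosen only for illustration, and the sketch assumes every cell in the column holds a flat dictionary.

```python
import pandas as pd


def expand_dict_column(df, column):
    """Replace a column of flat dicts with one column per key (illustrative helper)."""
    # json_normalize returns a new 0..n-1 index, so copy the original index
    # before joining; otherwise concat would align rows by label.
    expanded = pd.json_normalize(df[column].tolist())
    expanded.index = df.index
    return pd.concat([df.drop(columns=[column]), expanded], axis=1)


# Hypothetical data with a non-default index
df = pd.DataFrame(
    {
        'Station ID': ['8809', '8811', '8812'],
        'Pollutant Levels': [
            {'a': '46', 'b': '3', 'c': '12'},
            {'b': '2', 'c': '7'},
            {'c': '11'},
        ],
    },
    index=[10, 20, 30],
)

result = expand_dict_column(df, 'Pollutant Levels')

# Optional: the extracted values are still strings; convert them to numbers.
pollutants = ['a', 'b', 'c']
result[pollutants] = result[pollutants].apply(pd.to_numeric)

print(result)
```

The numeric conversion at the end is optional. It is included because the dictionary values in the example are strings, and they remain strings after json_normalize() unless converted.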