Constructing a pandas DataFrame from a Nested Dictionary with Hierarchical Index
This article shows how to convert a nested dictionary into a pandas DataFrame with a hierarchical index (a MultiIndex). The dictionary is keyed by UserId at the first level and by Category at the second, with attribute names and values at the third. The goal is a DataFrame whose rows are indexed by (UserId, Category) and whose columns are the attributes.
The first solution reshapes the nested dictionary so that each key is a tuple holding the values of the multi-index, here (UserId, Category). Passing the reshaped dictionary to pd.DataFrame.from_dict with orient='index' then creates the DataFrame:
import pandas as pd

user_dict = {
    12: {'Category 1': {'att_1': 1, 'att_2': 'whatever'},
         'Category 2': {'att_1': 23, 'att_2': 'another'}},
    15: {'Category 1': {'att_1': 10, 'att_2': 'foo'},
         'Category 2': {'att_1': 30, 'att_2': 'bar'}},
}

# Flatten to {(user_id, category): attributes}; each tuple becomes one row label
pd.DataFrame.from_dict(
    {(i, j): user_dict[i][j] for i in user_dict.keys() for j in user_dict[i].keys()},
    orient='index')
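Once the tuple-keyed dictionary is loaded, the row labels form a two-level MultiIndex. The sketch below repeats the construction and then shows one way to name the levels and select rows. The level names "UserId" and "Category" and the variable names flat and df are illustrative choices, not part of the original snippet.

```python
import pandas as pd

user_dict = {
    12: {"Category 1": {"att_1": 1, "att_2": "whatever"},
         "Category 2": {"att_1": 23, "att_2": "another"}},
    15: {"Category 1": {"att_1": 10, "att_2": "foo"},
         "Category 2": {"att_1": 30, "att_2": "bar"}},
}

# Flatten the two outer levels into tuple keys: (user_id, category) -> attributes
flat = {
    (user_id, category): attrs
    for user_id, categories in user_dict.items()
    for category, attrs in categories.items()
}
df = pd.DataFrame.from_dict(flat, orient="index")

# Assumed level names, chosen here for readability
df.index.names = ["UserId", "Category"]

# Select a single cell by its full (UserId, Category) label
one_value = df.loc[(12, "Category 1"), "att_1"]

# Select every user's row for one category via a cross-section
category_2 = df.xs("Category 2", level="Category")
```

Naming the levels is optional, but it lets later selections such as xs refer to a level by name instead of by position.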
An alternative approach builds one component DataFrame per UserId and concatenates them. Passing the UserIds to pd.concat through the keys argument makes them the outer level of the index:
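user_ids = []
frames = []
# dict.items() replaces iteritems(), which exists only in Python 2
for user_id, d in user_dict.items():
    user_ids.append(user_id)
    frames.append(pd.DataFrame.from_dict(d, orient='index'))
# keys= adds the user ids as the outer index level
pd.concat(frames, keys=user_ids)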
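The same concatenation can be written more compactly, because pd.concat also accepts a mapping and then uses its keys as the outer index level. This is a minimal sketch of that variant, not the original author's code; the level names passed through names and the variable names frames_by_user and df2 are assumptions made for illustration.

```python
import pandas as pd

user_dict = {
    12: {"Category 1": {"att_1": 1, "att_2": "whatever"},
         "Category 2": {"att_1": 23, "att_2": "another"}},
    15: {"Category 1": {"att_1": 10, "att_2": "foo"},
         "Category 2": {"att_1": 30, "att_2": "bar"}},
}

# One frame per user, indexed by category; the dict keys are the user ids
frames_by_user = {
    user_id: pd.DataFrame.from_dict(categories, orient="index")
    for user_id, categories in user_dict.items()
}

# Passing the mapping makes its keys the outer level of the resulting index;
# names= labels both levels (assumed names)
df2 = pd.concat(frames_by_user, names=["UserId", "Category"])
```

The concatenation approach can be convenient when the per-user frames need individual cleanup before they are combined, while the tuple-key approach avoids creating intermediate frames.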
By implementing one of these methods, a pandas DataFrame with a hierarchical index can be constructed from a nested dictionary, simplifying the organization and analysis of the data.