Consider the following DataFrame:
import pandas as pd

df1 = pd.DataFrame({
    'City': ['Seattle', 'Seattle', 'Portland', 'Seattle', 'Seattle', 'Portland'],
    'Name': ['Alice', 'Bob', 'Mallory', 'Mallory', 'Bob', 'Mallory'],
})
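Grouping with groupby() and then aggregating returns a DataFrame with a hierarchical index (MultiIndex), as shown below. The output comes from an older pandas release that counted the grouping columns themselves; recent releases exclude them, so count() on this two-column frame returns no data columns.

g1 = df1.groupby(['Name', 'City']).count()
print(g1)

                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1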
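To confirm that the grouped result really carries a hierarchical index, it can be inspected directly. The following is a minimal sketch rather than part of the original example: it uses size() instead of count() so that it behaves the same on recent pandas versions, and the variable name grouped is an assumption chosen for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'City': ['Seattle', 'Seattle', 'Portland', 'Seattle', 'Seattle', 'Portland'],
    'Name': ['Alice', 'Bob', 'Mallory', 'Mallory', 'Bob', 'Mallory'],
})

# size() counts the rows per group and returns a Series keyed by (Name, City).
grouped = df1.groupby(['Name', 'City']).size()

print(type(grouped.index).__name__)          # MultiIndex
print(list(grouped.index.names))             # ['Name', 'City']
print(grouped.index.nlevels)                 # 2

# A single group is addressed with a tuple, one entry per index level.
print(grouped.loc[('Mallory', 'Portland')])  # 2
```

The tuple lookup in the last line is the main practical consequence of the MultiIndex: Name and City are index levels, not ordinary columns, until the index is reset.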
There are several ways to turn this result back into a flat DataFrame in which Name and City are ordinary columns.
The first approach adds a suffix to the column names and resets the hierarchical index:

print(g1.add_suffix('_Count').reset_index())

      Name      City  City_Count  Name_Count
0    Alice   Seattle           1           1
1      Bob   Seattle           2           2
2  Mallory  Portland           2           2
3  Mallory   Seattle           1           1
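On recent pandas versions, count() does not report the grouping columns, so the suffix approach needs at least one other column to count. The sketch below assumes a hypothetical helper column, Row, added purely for illustration; the names counts and flat are likewise assumptions and do not appear in the original example.

```python
import pandas as pd

df1 = pd.DataFrame({
    'City': ['Seattle', 'Seattle', 'Portland', 'Seattle', 'Seattle', 'Portland'],
    'Name': ['Alice', 'Bob', 'Mallory', 'Mallory', 'Bob', 'Mallory'],
})

# 'Row' is a hypothetical helper column: it gives count() something to count
# once the grouping keys have been moved into the index.
counts = df1.assign(Row=1).groupby(['Name', 'City']).count()

# add_suffix() renames the data columns only; reset_index() then turns the
# Name and City index levels back into ordinary columns.
flat = counts.add_suffix('_Count').reset_index()

print(list(flat.columns))          # ['Name', 'City', 'Row_Count']
print(flat['Row_Count'].tolist())  # [1, 2, 2, 1]
```

Because add_suffix() runs before reset_index(), only the count column is renamed and the key columns keep their original names.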
An alternative is to call size(), which counts the rows in each group and returns a Series. Passing name='Count' to reset_index() then produces a flat DataFrame with a named count column:

print(df1.groupby(['Name', 'City']).size().reset_index(name='Count'))

      Name      City  Count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1
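The label of the count column depends on how the Series returned by size() is converted. As a minimal sketch (the variable names sizes, unnamed, named and via_frame are assumptions for illustration), the following compares a bare reset_index() with two ways of naming the column:

```python
import pandas as pd

df1 = pd.DataFrame({
    'City': ['Seattle', 'Seattle', 'Portland', 'Seattle', 'Seattle', 'Portland'],
    'Name': ['Alice', 'Bob', 'Mallory', 'Mallory', 'Bob', 'Mallory'],
})

sizes = df1.groupby(['Name', 'City']).size()

# Without a name, the count column of the resulting frame is labelled 0.
unnamed = sizes.reset_index()
print(list(unnamed.columns))    # ['Name', 'City', 0]

# Option 1: name the column while resetting the index.
named = sizes.reset_index(name='Count')

# Option 2: convert to a one-column DataFrame first, then reset the index.
via_frame = sizes.to_frame('Count').reset_index()

print(list(named.columns))      # ['Name', 'City', 'Count']
print(named.equals(via_frame))  # True
```

Both options give the same flat frame; reset_index(name=...) is shorter, while to_frame() can be convenient when further columns are to be added before the index is reset.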