When you group a DataFrame by multiple columns with GroupBy, the result is indexed by a MultiIndex built from the group keys: .size() returns a MultiIndex Series, while .count() returns a DataFrame with a MultiIndex on its rows. In many scenarios you need the group keys back as ordinary columns of a flat DataFrame. This article demonstrates how to convert the MultiIndex output of GroupBy back into such a DataFrame.
Consider the following sample DataFrame:
       City     Name
0   Seattle    Alice
1   Seattle      Bob
2  Portland  Mallory
3   Seattle  Mallory
4   Seattle      Bob
5  Portland  Mallory
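For readers who want to follow along, the sample table can be rebuilt as below. This is a minimal sketch: the `import pandas as pd` alias is an assumption (the article does not show its imports), and `df1` is the name the later snippets use.

```python
import pandas as pd

# Rebuild the sample table; row order matches the listing above.
df1 = pd.DataFrame(
    {
        "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
        "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    }
)
print(df1)
```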
Using GroupBy with multiple columns, we can count the occurrences:
g1 = df1.groupby(["Name", "City"]).count()
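However, g1 is not a flat table. The group keys Name and City form a two-level MultiIndex on the rows, and the remaining columns hold the counts. The output shown here comes from an older pandas release; recent releases omit the grouping columns from the count() result: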
                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1
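One way to confirm where the group keys end up is to inspect the index of the grouped result. The sketch below uses .size() rather than .count(), since .size() behaves the same way across pandas versions; the variable name `sizes` is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
        "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    }
)

# One value per (Name, City) pair; the pair itself lives in the index.
sizes = df1.groupby(["Name", "City"]).size()

print(type(sizes).__name__)                 # the container type of the result
print(isinstance(sizes.index, pd.MultiIndex))
print(list(sizes.index.names))              # names of the index levels
print(sizes.loc[("Bob", "Seattle")])        # look up one group by its key pair
```

Because Name and City are index levels and not columns, expressions such as `sizes["Name"]` do not work until the index is reset.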
To convert this back to a flat DataFrame, with Name and City as ordinary columns, you can use one of two approaches:
Method 1: Adding Suffix and Resetting Index
Add a suffix to the column names and reset the index:
g1.add_suffix('_Count').reset_index()
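This creates a DataFrame with four columns: Name, City, and the two count columns City_Count and Name_Count. The suffix keeps the count columns from colliding with the Name and City columns that reset_index() moves out of the index.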
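In recent pandas releases, count() leaves the grouping columns out of its result, so the two-column sample frame has nothing left to count. The sketch below therefore adds a hypothetical third column, Visit, which is not part of the original example, purely so that Method 1 has a value column to suffix.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
        "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
        # Hypothetical extra column, added only so count() has something to count.
        "Visit": [1, 2, 3, 4, 5, 6],
    }
)

# Count non-null values per (Name, City) group; the keys become a MultiIndex.
g1 = df1.groupby(["Name", "City"]).count()

# Rename the count columns, then move the index levels back into columns.
flat = g1.add_suffix("_Count").reset_index()
print(flat)
```

Under these assumptions, `flat` has the columns Name, City, and Visit_Count, with one row per group.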
Method 2: Using DataFrame Constructor
Alternatively, call .size() on the GroupBy object, which returns a MultiIndex Series with one count per group, wrap that Series in the DataFrame constructor under a column name of your choice, and reset the index:
pd.DataFrame({'count': df1.groupby(["Name", "City"]).size()}).reset_index()
This approach creates a DataFrame with three columns: Name, City, and count, where count holds the number of rows in each group.
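A runnable sketch of Method 2 follows, together with a shorter variant that is not part of the original answer: Series.reset_index accepts a name argument, which names the value column without going through the constructor. The variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
        "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    }
)

# MultiIndex Series: number of rows in each (Name, City) group.
counts = df1.groupby(["Name", "City"]).size()

# Method 2 as described: wrap the Series in a DataFrame, then flatten the index.
via_constructor = pd.DataFrame({"count": counts}).reset_index()

# Variant: let reset_index name the value column directly.
via_reset = counts.reset_index(name="count")

print(via_constructor)
print(via_constructor.equals(via_reset))
```

Both forms should yield the same three-column frame; the second simply skips the intermediate constructor call.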