Grouping and Summing Data in Pandas
In data analysis, it is often necessary to aggregate data by specific criteria to derive meaningful insights. Pandas, a powerful Python library for data manipulation, provides the groupby() method to group data based on one or more columns. This method can be combined with aggregation functions, such as sum(), to compute aggregate values for each group.
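As a minimal sketch of the idea, the snippet below groups a tiny DataFrame by a single column and sums a numeric column within each group. The data is hypothetical and chosen only for illustration; it is not the dataset used later in this article.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the pattern.
df = pd.DataFrame({
    "Fruit": ["Apples", "Apples", "Oranges"],
    "Number": [3, 4, 5],
})

# Group by one column, then sum the 'Number' values inside each group.
totals = df.groupby("Fruit")["Number"].sum()
print(totals)
```

The result is a Series whose index holds the group keys, so totals["Apples"] returns the sum for that group.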
Calculating the Sum of Values by Group
Suppose we have a DataFrame containing information about fruit purchases by individuals. Each row represents a fruit purchase, including the fruit type, purchase date, customer name, and number of fruits purchased.
To calculate the total number of fruits purchased for each combination of fruit type and customer name, we can use the following steps:
Step 1: Group the Data
First, we group the DataFrame by both the 'Fruit' and 'Name' columns using the groupby() method:
df_grouped = df.groupby(['Fruit', 'Name'])
This returns a DataFrameGroupBy object, which represents the grouped data. No totals are computed until an aggregation function is applied to it.
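To see what the grouping step produces, the following sketch inspects the grouped object on a few rows taken from the example table in this article. Calling groupby() on a DataFrame yields a DataFrameGroupBy, and selecting a single column from it yields a SeriesGroupBy. The use of type(...).__name__ and ngroups here is just one convenient way to look at the object.

```python
import pandas as pd

# A few rows from the article's example table (Date column omitted for brevity).
df = pd.DataFrame({
    "Fruit": ["Apples", "Apples", "Apples", "Oranges"],
    "Name": ["Bob", "Bob", "Mike", "Bob"],
    "Number": [7, 8, 9, 2],
})

df_grouped = df.groupby(["Fruit", "Name"])

# Grouping a DataFrame gives a DataFrameGroupBy object.
print(type(df_grouped).__name__)

# Selecting one column from it gives a SeriesGroupBy object.
print(type(df_grouped["Number"]).__name__)

# Number of distinct (Fruit, Name) combinations found in the data.
print(df_grouped.ngroups)
```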
Step 2: Apply the Sum Function
To calculate the total number of fruits purchased in each group, we select the 'Number' column from the grouped object, which yields a SeriesGroupBy, and call sum() on it:
df_grouped_sum = df_grouped['Number'].sum()
The resulting Series, df_grouped_sum, is indexed by a MultiIndex with the levels 'Fruit' and 'Name'. It contains the total number of fruits purchased for each unique combination of fruit type and customer name.
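Because the summed Series is keyed by (Fruit, Name) pairs, it helps to know how to read values back out of it. The sketch below, again using a few rows from the article's example table, looks up one group with .loc and then uses reset_index() to turn the index levels back into ordinary columns. The variable names bob_apples and flat are illustrative only.

```python
import pandas as pd

# A few rows from the article's example table (Date column omitted for brevity).
df = pd.DataFrame({
    "Fruit": ["Apples", "Apples", "Apples", "Oranges"],
    "Name": ["Bob", "Bob", "Mike", "Bob"],
    "Number": [7, 8, 9, 2],
})

df_grouped_sum = df.groupby(["Fruit", "Name"])["Number"].sum()

# Look up a single group by its (Fruit, Name) pair.
bob_apples = df_grouped_sum.loc[("Apples", "Bob")]
print(bob_apples)

# Convert the MultiIndex levels back into regular columns.
flat = df_grouped_sum.reset_index()
print(flat)
```

A flat DataFrame of this kind is often easier to merge, filter, or export than a Series with a MultiIndex.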
Example
Consider the following DataFrame:
Fruit    Date       Name   Number
Apples   10/6/2016  Bob    7
Apples   10/6/2016  Bob    8
Apples   10/6/2016  Mike   9
Apples   10/7/2016  Steve  10
Apples   10/7/2016  Bob    1
Oranges  10/7/2016  Bob    2
Oranges  10/6/2016  Tom    15
Oranges  10/6/2016  Mike   57
Oranges  10/6/2016  Bob    65
Oranges  10/7/2016  Tony   1
Grapes   10/7/2016  Bob    1
Grapes   10/7/2016  Tom    87
Grapes   10/7/2016  Bob    22
Grapes   10/7/2016  Bob    12
Grapes   10/7/2016  Tony   15
Applying the groupby() and sum() operations to this DataFrame, we get the following result:
Fruit    Name   Number
Apples   Bob    16
         Mike   9
         Steve  10
Grapes   Bob    35
         Tom    87
         Tony   15
Oranges  Bob    67
         Mike   57
         Tom    15
         Tony   1
This output shows the total number of fruits purchased by each individual, broken down by fruit type.
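For reference, here is a self-contained sketch that rebuilds the example DataFrame from the rows listed above and applies both steps in one expression. Building the frame from a list of tuples is simply one convenient option; the dates are kept as plain strings because they are not used in the grouping.

```python
import pandas as pd

# Rows copied from the example table in this article.
rows = [
    ("Apples", "10/6/2016", "Bob", 7),
    ("Apples", "10/6/2016", "Bob", 8),
    ("Apples", "10/6/2016", "Mike", 9),
    ("Apples", "10/7/2016", "Steve", 10),
    ("Apples", "10/7/2016", "Bob", 1),
    ("Oranges", "10/7/2016", "Bob", 2),
    ("Oranges", "10/6/2016", "Tom", 15),
    ("Oranges", "10/6/2016", "Mike", 57),
    ("Oranges", "10/6/2016", "Bob", 65),
    ("Oranges", "10/7/2016", "Tony", 1),
    ("Grapes", "10/7/2016", "Bob", 1),
    ("Grapes", "10/7/2016", "Tom", 87),
    ("Grapes", "10/7/2016", "Bob", 22),
    ("Grapes", "10/7/2016", "Bob", 12),
    ("Grapes", "10/7/2016", "Tony", 15),
]
df = pd.DataFrame(rows, columns=["Fruit", "Date", "Name", "Number"])

# Group by fruit type and customer name, then total the 'Number' column.
result = df.groupby(["Fruit", "Name"])["Number"].sum()
print(result)
```

By default groupby() sorts the group keys, which is why the fruits appear in alphabetical order rather than in the order of the input rows. If a table with a 'Number' column header is preferred over a Series, result.to_frame() converts it.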