Percentage of Total with Groupby in Pandas
This article focuses on a common task in data analysis: calculating the percentage of a total for each group within a DataFrame. Here's the question and a solution using Pandas.
Question:
I have a CSV file with columns State, Office ID, and Sales. I want to calculate the percentage of sales per office in a given state, where the total of all percentages in each state is 100%.
Solution:
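To achieve this, group twice: once by state and office to total each office's sales, and once by state alone to get the total that each office is measured against.
First, sum the sales for each combination of the 'state' and 'office_id' columns (the lowercase column names are assumed here; adjust them to match your CSV header):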
import pandas as pd
state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})
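To make the intermediate result concrete, here is a minimal sketch run on a small made-up dataset. The values and the lowercase column names are assumptions for illustration, not data from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV described in the question.
df = pd.DataFrame({
    'state': ['CA', 'CA', 'CA', 'WA', 'WA'],
    'office_id': [1, 1, 2, 1, 2],
    'sales': [100, 300, 600, 250, 750],
})

# One row per (state, office_id) pair, holding that office's total sales.
state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})
print(state_office)
```

The result is indexed by a two-level MultiIndex (state first, then office_id), which is what allows the next step to group on index level 0.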
Then, group that result by state (index level 0), divide each office's sales by its state's total, and multiply by 100:
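# group_keys=False keeps the (state, office_id) index unchanged; recent pandas versions otherwise prepend the group key again
state_pcts = state_office.groupby(level=0, group_keys=False).apply(lambda x: 100 * x / x.sum())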
Because the division happens inside each state group, every office's sales are divided by the total for its own state, so the percentages within each state add up to 100%.
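As a cross-check, the same percentages can be computed with transform('sum'), which returns each state's total aligned to every office row. This is an alternative to the apply call above rather than the original answer's method, and the sample data is invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame({
    'state': ['CA', 'CA', 'WA', 'WA', 'WA'],
    'office_id': [1, 2, 1, 2, 3],
    'sales': [400, 600, 100, 300, 600],
})

state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})

# transform('sum') broadcasts each state's total back onto every office row,
# so the division is a plain element-wise operation.
state_totals = state_office.groupby(level=0).transform('sum')
state_pcts = 100 * state_office / state_totals

# Sum the percentages per state to confirm each state adds up to 100.
check = state_pcts.groupby(level=0)['sales'].sum()
print(state_pcts)
print(check)
```

Since transform keeps the original index, no extra index level is added and the result lines up row for row with state_office.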