Pandas Percentage of Total with Groupby
Pandas' groupby can calculate each office's percentage of the sales in its state, but it takes two steps: one groupby to total sales per office, and a second, state-level groupby to turn those totals into percentages.
Suppose we have data with columns for state, office ID, and sales, such as might be loaded from a CSV file. For illustration, we can import Pandas and NumPy and build an equivalent DataFrame with random sales figures:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
    'office_id': list(range(1, 7)) * 2,
    'sales': [np.random.randint(100000, 999999) for _ in range(12)],
})
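If the data really does live in a CSV file, it could be loaded with pd.read_csv instead. The sketch below is only an illustration: the CSV contents and the name df_csv are made up, and an in-memory string stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV contents; with a real file you would pass its path
# to pd.read_csv instead of wrapping a string in io.StringIO.
csv_text = """state,office_id,sales
CA,1,100
CA,3,300
WA,2,250
WA,4,750
"""

df_csv = pd.read_csv(io.StringIO(csv_text))

# The header row supplies the column names used in the groupby steps.
print(df_csv.columns.tolist())
print(df_csv)
```

The resulting DataFrame has the same three columns as the randomly generated one, so the groupby steps apply to it unchanged.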
To calculate the total sales for each office and state, we can group by those columns:
state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})
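Because the sales figures above are random, it can help to see the same aggregation on a small fixed dataset. The following is a minimal sketch with made-up numbers (the names small and small_totals are not from the original), showing that rows sharing a state and office are summed and that the result is indexed by both columns.

```python
import pandas as pd

# Hypothetical fixed figures so the result is reproducible.
small = pd.DataFrame({
    'state': ['CA', 'CA', 'CA', 'WA', 'WA'],
    'office_id': [1, 1, 3, 2, 4],
    'sales': [100, 50, 350, 250, 750],
})

# Same aggregation as above: one row per (state, office_id) pair.
small_totals = small.groupby(['state', 'office_id']).agg({'sales': 'sum'})

print(small_totals)
# The grouping columns become the two levels of a MultiIndex.
print(list(small_totals.index.names))
```

The two CA rows for office 1 collapse into a single row with sales of 150, and state becomes level 0 of the index, which is what the next step groups on.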
To calculate the percentage of sales per office in a given state, we can group by the state and apply a function that divides each office's sales by the total state sales:
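# group_keys=False keeps the original (state, office_id) index on recent pandas versions
state_pcts = state_office.groupby(level=0, group_keys=False).apply(lambda x: 100 * x / x.sum())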
This results in a DataFrame with each office's percentage of its state's sales (the exact values differ from run to run because the sales figures are random):
print(state_pcts)

                     sales
state office_id
AZ    2          16.981365
      4          19.250033
      6          63.768601
CA    1          19.331879
      3          33.858747
      5          46.809373
CO    1          36.851857
      3          19.874290
      5          43.273852
WA    2          34.707233
      4          35.511259
      6          29.781508
This method effectively calculates the percentage of sales per office in a given state by "reaching up" to the state level of the groupby to total up the sales for the entire state.
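The same "reach up to the state level" idea can also be written with transform, which broadcasts each state's total back onto every office row. This is an alternative sketch, not the method used above; the data is made up and the names totals, state_totals, and pcts are hypothetical.

```python
import pandas as pd

# Hypothetical per-office totals, indexed by (state, office_id).
totals = pd.DataFrame(
    {'sales': [150, 350, 250, 750]},
    index=pd.MultiIndex.from_tuples(
        [('CA', 1), ('CA', 3), ('WA', 2), ('WA', 4)],
        names=['state', 'office_id'],
    ),
)

# transform('sum') returns a Series aligned with the original rows,
# where every office carries its state's total sales.
state_totals = totals.groupby(level='state')['sales'].transform('sum')

# Element-wise division gives each office's share of its state.
pcts = 100 * totals['sales'] / state_totals
print(pcts)

# Sanity check: the percentages within each state should add up to 100.
print(pcts.groupby(level='state').sum())
```

With these figures, the CA offices come out at 30% and 70% and the WA offices at 25% and 75%. Summing the percentages per state is a quick way to confirm the calculation on real data as well.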