Applying Multiple Aggregations to the Same Column in Pandas
In pandas, GroupBy.agg() provides a convenient way to apply several functions to grouped data. Applying different functions to the same column, however, can seem challenging at first.
The intuitive approach would be to repeat the column name as a key in the dictionary passed to agg(), but this does not work. Python accepts a dictionary literal with duplicate keys, yet the later value silently overwrites the earlier one, so only the last function would be applied.
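The duplicate-key problem can be seen in plain Python, without pandas. The following is a minimal sketch; the variable name spec is only illustrative.

```python
# A dict literal that repeats a key is valid Python syntax,
# but only the last value for that key is kept.
spec = {"returns": "mean", "returns": "sum"}

print(spec)       # {'returns': 'sum'}
print(len(spec))  # 1
```

Passing such a dictionary to agg() would therefore request only one aggregation, not two.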
To address this, pandas offers several options:
Option 1: Named Aggregation
As of 2022-06-20, the preferred method is named aggregation, available since pandas 0.25. Each keyword argument passed to agg() names an output column, and its value is a (column, function) tuple that specifies the aggregation to perform on that column.
df.groupby('dummy').agg(Mean=('returns', 'mean'), Sum=('returns', 'sum'))
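As a self-contained sketch of named aggregation, the example below builds a small DataFrame whose values are made up for illustration; only the column names dummy and returns come from the snippet above. It uses the string names 'mean' and 'sum', which pandas resolves to its built-in groupby implementations.

```python
import pandas as pd

# Hypothetical sample data, chosen so the results are easy to check by hand.
df = pd.DataFrame({
    "dummy": [1, 1, 2, 2],
    "returns": [1.0, 3.0, 2.0, 6.0],
})

# Each keyword names an output column; the tuple is (input column, function).
result = df.groupby("dummy").agg(
    Mean=("returns", "mean"),
    Sum=("returns", "sum"),
)

print(result)
```

The result is indexed by dummy and has two flat columns, Mean and Sum, so no renaming step is needed afterwards.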
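Option 2: Nested Dictionary (Deprecated)
Older versions of pandas also accepted a nested dictionary, where the outer key is the column and the inner dictionary maps output names to the functions to be applied. This form was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError, so it is shown here only for reference and should not be used in new code.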
df.groupby('dummy').agg({'returns': {'Mean': np.mean, 'Sum': np.sum}})
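Because recent pandas releases reject the nested form, one possible replacement that keeps the custom output names is to select the column first and pass keyword arguments to agg(). The sketch below assumes pandas 0.25 or later and uses made-up sample data.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "dummy": [1, 1, 2, 2],
    "returns": [1.0, 3.0, 2.0, 6.0],
})

# On a single selected column, each keyword maps an output name to a function.
renamed = df.groupby("dummy")["returns"].agg(Mean="mean", Sum="sum")

print(renamed)
```

This yields the same Mean and Sum columns that the nested dictionary was meant to produce.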
Option 3: List of Functions
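Another option, which works in both older and current versions of pandas, is to pass the functions as a list within the dictionary argument of agg(). The resulting columns form a MultiIndex of (column, function) pairs rather than custom names.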
df.groupby('dummy').agg({'returns': ['mean', 'sum']})
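The list form returns two-level column labels, which can be awkward to work with downstream. The sketch below, using made-up sample data, shows the resulting labels and one common way to flatten them; the underscore-joined naming scheme is an arbitrary choice, not a pandas convention.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "dummy": [1, 1, 2, 2],
    "returns": [1.0, 3.0, 2.0, 6.0],
})

result = df.groupby("dummy").agg({"returns": ["mean", "sum"]})

# The columns are a MultiIndex: ('returns', 'mean') and ('returns', 'sum').
print(list(result.columns))

# Flatten the two levels into single strings such as 'returns_mean'.
flat = result.copy()
flat.columns = ["_".join(col) for col in result.columns]

print(flat)
```

If custom output names are wanted from the start, named aggregation avoids this flattening step.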
With named aggregation or a list of functions, you can perform multiple aggregations on the same column without auxiliary functions or repeated calls to agg().