How to Add a New Column with Grouped Summation in Pandas Using `transform()`?

Mary-Kate Olsen
Release: 2024-12-24 10:46:14

Creating a New Column Based on Grouped Summation in Pandas

Problem Statement

Assigning the result of a pandas groupby() sum directly to a new column can leave NaN in every row. The goal is to add a column that shows, on each row, the total of a given value across all rows sharing that row's date, however many rows that date has.
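The article does not show the failing code, so the following is only a sketch of one likely cause, assuming the attempt assigned a grouped sum() directly to the new column. The small dataframe here is a trimmed, illustrative version of the example data used later.

```python
import pandas as pd

# Illustrative subset of the example data (two dates, two rows each).
df = pd.DataFrame({
    "Date": ["2015-05-08", "2015-05-07", "2015-05-08", "2015-05-07"],
    "Data3": [5, 8, 50, 100],
})

# An aggregation returns one row per date, indexed by the date strings.
per_date = df["Data3"].groupby(df["Date"]).sum()

# Column assignment aligns on index labels. df is indexed 0..3 and
# per_date is indexed by date, so no label matches and every row is NaN.
df["Data4"] = per_date
print(df)
```

The sums themselves are correct; they are simply attached to the wrong index for assignment back into the original dataframe.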

Solution

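To achieve this, use transform(). An aggregation such as sum() returns one value per group, indexed by the group key, so its index does not match that of the original dataframe. transform() also computes per group, but it broadcasts each group's result back to that group's rows and returns a series with the same index as the original dataframe.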

df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')

Here's a step-by-step breakdown:

  • df['Data3'].groupby(df['Date']): Groups the values of the 'Data3' column by the matching 'Date' values.
  • transform('sum'): Computes the sum of 'Data3' within each date group and repeats that sum on every row belonging to the group.
  • The result is a series with the same index as the original dataframe, so it can be assigned directly as a new column named 'Data4'.
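The steps above can be sketched with the intermediate objects given names. The variable names `grouped` and `totals` and the four-row dataframe are illustrative and do not come from the original article.

```python
import pandas as pd

# Illustrative subset of the example data (two dates, two rows each).
df = pd.DataFrame({
    "Date": ["2015-05-08", "2015-05-07", "2015-05-08", "2015-05-07"],
    "Data3": [5, 8, 50, 100],
})

# Step 1: group the 'Data3' values by date.
grouped = df["Data3"].groupby(df["Date"])

# Step 2: sum within each group and broadcast the sum to every row.
totals = grouped.transform("sum")

# Step 3: the result shares df's index, so assignment lines up row by row.
print(totals.index.equals(df.index))
df["Data4"] = totals
print(df)
```

Because `totals` has one entry per original row rather than one per date, no index alignment problem arises on assignment.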

Example Usage

Consider the following dataframe:

         Date   Sym  Data2  Data3
0  2015-05-08  aapl     11      5
1  2015-05-07  aapl      8      8
2  2015-05-06  aapl     10      6
3  2015-05-05  aapl     15      1
4  2015-05-08  aaww    110     50
5  2015-05-07  aaww     60    100
6  2015-05-06  aaww    100     60
7  2015-05-05  aaww     40    120

Applying the transform() function:

df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')

Results in:

         Date   Sym  Data2  Data3  Data4
0  2015-05-08  aapl     11      5     55
1  2015-05-07  aapl      8      8    108
2  2015-05-06  aapl     10      6     66
3  2015-05-05  aapl     15      1    121
4  2015-05-08  aaww    110     50     55
5  2015-05-07  aaww     60    100    108
6  2015-05-06  aaww    100     60     66
7  2015-05-05  aaww     40    120    121

As the output shows, each row of 'Data4' now holds the sum of 'Data3' over all rows sharing that row's 'Date'. For example, both rows dated 2015-05-08 show 55 (5 + 50).
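For readers who want to reproduce the example end to end, here is a self-contained sketch that builds the dataframe from the table above and checks the result. It also shows the dataframe-level spelling `df.groupby('Date')['Data3']`, which is an equivalent alternative and not the form used in the article.

```python
import pandas as pd

# Rebuild the example dataframe from the table above.
df = pd.DataFrame({
    "Date": ["2015-05-08", "2015-05-07", "2015-05-06", "2015-05-05"] * 2,
    "Sym": ["aapl"] * 4 + ["aaww"] * 4,
    "Data2": [11, 8, 10, 15, 110, 60, 100, 40],
    "Data3": [5, 8, 6, 1, 50, 100, 60, 120],
})

# Form used in the article: group the series by another series.
df["Data4"] = df["Data3"].groupby(df["Date"]).transform("sum")

# Equivalent spelling: group the dataframe, then select the column.
alt = df.groupby("Date")["Data3"].transform("sum")

print(df)
print(df["Data4"].equals(alt))
```

Both spellings produce the same values; choosing between them is mostly a matter of style.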


source:php.cn