Why Doesn't Transform Work for All Groupby Operations?
The following code works:
df.groupby('A').apply(lambda x: (x['C'] - x['D']).mean())
but the following does not:
df.groupby('A').transform(lambda x: (x['C'] - x['D']).mean())
The reason is that apply and transform pass different objects to the function.
apply
- The apply() method calls a function once for each group in a DataFrame.
- The function receives the whole group as a DataFrame in its first argument, so it can combine several columns; any extra positional or keyword arguments given to apply() are forwarded to the function.
- The function can return a single value, a Series, or a DataFrame.
- If the function returns a single value, the result is a Series indexed by the group keys.
- If the function returns a Series or DataFrame, pandas combines the per-group results, usually into a DataFrame.
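The behaviour of apply() described above can be sketched as follows. The sample frame and its values are made up for illustration; only the column names A, C, and D come from the example. The columns C and D are selected explicitly so the function never sees the grouping column.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the example.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y"],
    "C": [10.0, 20.0, 30.0, 50.0],
    "D": [1.0, 2.0, 3.0, 5.0],
})

# The function receives each group as a DataFrame, so it can use C and D together.
# It returns one number per group, so the result is a Series indexed by A.
per_group = df.groupby("A")[["C", "D"]].apply(lambda g: (g["C"] - g["D"]).mean())

# Returning a Series per group (here: max minus min of every column)
# produces a DataFrame with one row per group.
spread = df.groupby("A")[["C", "D"]].apply(lambda g: g.max() - g.min())

print(per_group)
print(spread)
```

In both cases the output has one entry per group, not one per original row.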
transform
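- The transform() method calls a function on each column of each group separately, not on each row.
- The function receives that one column as a Series in its first argument, so it cannot look up other columns by name; any extra arguments given to transform() are forwarded to the function.
- The function must return either a single value, which is broadcast to every row of the group, or a result with the same length as the group.
- The result has the same index as the original object: a DataFrame when several columns are transformed, a Series when one column is transformed.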
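A minimal sketch of how transform() behaves, using a made-up frame (the values are assumptions for illustration). It shows a per-column function whose scalar result is broadcast back to every row, and then the failing call from the question, which cannot reach a second column.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the example.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y"],
    "C": [10.0, 20.0, 30.0, 50.0],
    "D": [1.0, 2.0, 3.0, 5.0],
})

# The function works on one column at a time; the scalar it returns is
# broadcast to every row of the group, so the output keeps df's index.
means = df.groupby("A").transform(lambda s: s.mean())
print(means)

# The lambda from the question tries to read two columns from a single
# column, which fails (typically with a KeyError).
raised = False
try:
    df.groupby("A").transform(lambda x: (x["C"] - x["D"]).mean())
except Exception as exc:
    raised = True
    print(type(exc).__name__, exc)
```

Note that the output of transform() has as many rows as the input, which is what distinguishes it from an aggregation.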
In the example code, the apply() method calculates the mean of the difference between the C and D columns for each group, because the lambda receives both columns at once.
- The transform() call fails because the lambda receives one column at a time as a Series, so x['C'] and x['D'] are not available (pandas typically raises a KeyError).
To calculate the mean of the difference between the C and D columns for each group using the transform() method, compute the difference first so that transform() only has to work on a single Series.
- Subtract the columns on the full DataFrame, group the resulting Series by A, and take the mean of each group.
- The following code shows how to do this:
(df['C'] - df['D']).groupby(df['A']).transform('mean')
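One way to get the per-group mean of C - D on every row is to build the difference as a single Series and transform that. The sketch below uses a made-up frame; the column name diff_mean is a hypothetical choice for illustration. It also shows an alternative route that aggregates with apply() and maps the result back onto the rows.

```python
import pandas as pd

# Hypothetical sample data; only the column names A, C, D come from the example.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y"],
    "C": [10.0, 20.0, 30.0, 50.0],
    "D": [1.0, 2.0, 3.0, 5.0],
})

# Compute the difference on the whole frame, then group that one Series by A.
diff = df["C"] - df["D"]
df["diff_mean"] = diff.groupby(df["A"]).transform("mean")

# Alternative: aggregate once per group with apply(), then map the
# per-group values back onto the rows through the key column.
per_group = df.groupby("A")[["C", "D"]].apply(lambda g: (g["C"] - g["D"]).mean())
mapped = df["A"].map(per_group)

print(df)
print(mapped)
```

Both routes give the same values; the first avoids a Python-level function call per group, which tends to matter on large frames.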