This article describes several pandas methods for getting the row with the maximum value in each groupby group. It may be a useful reference. Let's take a look.
Getting the row with the maximum value in each pandas groupby group
For example, in the following DataFrame, group by Mt and extract the row with the largest Count in each group:
import pandas as pd
df = pd.DataFrame({'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
                   'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
                   'Value': [1, 2, 3, 4, 5, 6],
                   'Count': [3, 2, 5, 10, 10, 6]})
df
  | Count | Mt | Sp | Value |
---|---|---|---|---|
0 | 3 | s1 | a | 1 |
1 | 2 | s1 | b | 2 |
2 | 5 | s2 | c | 3 |
3 | 10 | s2 | d | 4 |
4 | 10 | s2 | e | 5 |
5 | 6 | s3 | f | 6 |
Method 1: Use apply to keep, within each group, the rows whose Count equals the group maximum
df.groupby('Mt').apply(lambda t: t[t.Count == t.Count.max()])
Mt |   | Count | Mt | Sp | Value |
---|---|---|---|---|---|
s1 | 0 | 3 | s1 | a | 1 |
s2 | 3 | 10 | s2 | d | 4 |
  | 4 | 10 | s2 | e | 5 |
s3 | 5 | 6 | s3 | f | 6 |
Method 2: Use transform to broadcast each group's maximum onto the index of the original DataFrame, then filter out the required rows
print(df.groupby(['Mt'])['Count'].agg('max'))
idx = df.groupby(['Mt'])['Count'].transform('max')
print(idx)
idx1 = idx == df['Count']
print(idx1)
df[idx1]
Mt
s1 3
s2 10
s3 6
Name: Count, dtype: int64
0 3
1 3
2 10
3 10
4 10
5 6
dtype: int64
0 True
1 False
2 False
3 True
4 True
5 True
dtype: bool
  | Count | Mt | Sp | Value |
---|---|---|---|---|
0 | 3 | s1 | a | 1 |
3 | 10 | s2 | d | 4 |
4 | 10 | s2 | e | 5 |
5 | 6 | s3 | f | 6 |
The methods above have a limitation: rows 3 and 4 both hold the maximum Count for group s2, so both are returned. What if only one row per group is wanted?
Method 3: idxmax (argmax in older versions of pandas)
idx = df.groupby('Mt')['Count'].idxmax()
print(idx)
df.loc[idx]
Mt
s1 0
s2 3
s3 5
Name: Count, dtype: int64
  | Count | Mt | Sp | Value |
---|---|---|---|---|
0 | 3 | s1 | a | 1 |
3 | 10 | s2 | d | 4 |
5 | 6 | s3 | f | 6 |
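Because idxmax returns index labels rather than positions, label-based selection with loc is the safer pairing when the DataFrame does not have the default 0..n-1 index. The following is a minimal sketch under that assumption; the index labels 10 to 15 are hypothetical and chosen only so that labels and positions differ.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Sp": ["a", "b", "c", "d", "e", "f"],
        "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
        "Value": [1, 2, 3, 4, 5, 6],
        "Count": [3, 2, 5, 10, 10, 6],
    },
    index=[10, 11, 12, 13, 14, 15],  # hypothetical non-default labels
)

# idxmax returns the index *label* of the first maximum in each group
idx = df.groupby("Mt")["Count"].idxmax()

# loc selects by label, so it works regardless of how the index is labelled
result = df.loc[idx]
print(result)
```

With these labels, positional selection (iloc) using the same idx would be out of bounds, since the frame has only six rows.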
def using_apply(df):
    return (df.groupby('Mt').apply(lambda subf: subf['Value'][subf['Count'].idxmax()]))

def using_idxmax_loc(df):
    idx = df.groupby('Mt')['Count'].idxmax()
    return df.loc[idx, ['Mt', 'Value']]

print(using_apply(df))
using_idxmax_loc(df)
Mt
s1 1
s2 4
s3 6
dtype: int64
  | Mt | Value |
---|---|---|
0 | s1 | 1 |
3 | s2 | 4 |
5 | s3 | 6 |
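The approaches so far differ mainly in how they treat ties, such as rows 3 and 4 in group s2. As a rough side-by-side sketch (the variable names here are illustrative, not from the original): comparing against the transformed group maximum keeps every tied row, while idxmax keeps only the first occurrence in each group.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Sp": ["a", "b", "c", "d", "e", "f"],
        "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
        "Value": [1, 2, 3, 4, 5, 6],
        "Count": [3, 2, 5, 10, 10, 6],
    }
)

# Every row whose Count equals its group's maximum (ties are all kept)
mask = df["Count"] == df.groupby("Mt")["Count"].transform("max")
all_max_rows = df[mask]

# Exactly one row per group: the first row that reaches the maximum
one_per_group = df.loc[df.groupby("Mt")["Count"].idxmax()]

print(all_max_rows.index.tolist())   # [0, 3, 4, 5]
print(one_per_group.index.tolist())  # [0, 3, 5]
```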
Method 4: Sort by Count in descending order first, then take the first row from each group
df.sort_values('Count', ascending=False).groupby('Mt', as_index=False).first()
  | Mt | Count | Sp | Value |
---|---|---|---|---|
0 | s1 | 3 | a | 1 |
1 | s2 | 10 | d | 4 |
2 | s3 | 6 | f | 6 |
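A self-contained sketch of the sort-then-take-first idea in current pandas is shown below. It uses sort_values, and the kind="mergesort" argument is a choice made here (not part of the original) so that tied rows keep their original relative order and the result is deterministic.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Sp": ["a", "b", "c", "d", "e", "f"],
        "Mt": ["s1", "s1", "s2", "s2", "s2", "s3"],
        "Value": [1, 2, 3, 4, 5, 6],
        "Count": [3, 2, 5, 10, 10, 6],
    }
)

# Sort by Count, largest first; mergesort is stable, so the tied rows
# 'd' and 'e' stay in their original order
ordered = df.sort_values("Count", ascending=False, kind="mergesort")

# first() takes the first non-null value of each column within every group
top = ordered.groupby("Mt", as_index=False).first()
print(top)
```

Note that first() works column by column and skips missing values, so on data containing NaN it may combine values from different rows; groupby('Mt').head(1) on the sorted frame is one alternative that returns whole rows.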
source:php.cn