How to Sort Data Within Groups in Pandas DataFrames?

Susan Sarandon
Release: 2024-10-20 17:27:02

Sorting Within Groups in pandas

When working with pandas DataFrames, you often need to group data by one or more columns and then operate within each group. A common requirement is to sort the rows of each group by a given column.

To achieve this, the groupby method can be chained with the sort_values method. As an example, consider a dataframe df with the columns count, job, and source.

In [167]: df

Out[167]:
   count     job source
0      2   sales      A
1      4   sales      B
2      6   sales      C
3      3   sales      D
4      7   sales      E
5      5  market      A
6      3  market      B
7      2  market      C
8      4  market      D
9      1  market      E
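
To follow along, you can rebuild this dataframe yourself. The snippet below is a minimal sketch: the values are copied from the Out[167] listing, and the construction itself (a dict of columns) is just one convenient way to do it.

```python
import pandas as pd

# Values copied from the Out[167] listing; the default RangeIndex gives 0..9.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

print(df)
```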

If you want to group the data by job and source and sum count within each group, you can do the following:

In [168]: df.groupby(['job','source']).agg({'count':'sum'})

This creates a new dataframe that contains the summed count for each (job, source) group, ordered by the group keys rather than by count. In this example every (job, source) pair occurs only once, so the sums equal the original values and you can sort the original dataframe directly with sort_values:

In [34]: df.sort_values(['job','count'],ascending=False)

This sorts the dataframe by job and then by count, both in descending order, so all sales rows come before the market rows. The resulting dataframe looks like this:

Out[34]: 
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
3      3   sales      D
0      2   sales      A
5      5  market      A
8      4  market      D
6      3  market      B
7      2  market      C
9      1  market      E
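
If a (job, source) pair can occur more than once, the summed values differ from the raw rows and the sort has to run on the aggregated result instead. The following is a sketch of that variant, assuming the same example data; the names totals and totals_sorted are illustrative only.

```python
import pandas as pd

# Same example data as in the article.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# Sum count per (job, source), then move the group keys back into columns
# so that they can be used as ordinary sort keys.
totals = df.groupby(["job", "source"]).agg({"count": "sum"}).reset_index()

# Sort by job, then by the aggregated count, both descending.
totals_sorted = totals.sort_values(["job", "count"], ascending=False)

print(totals_sorted)
```

To sort the jobs alphabetically while keeping the counts descending, pass a list such as ascending=[True, False] instead.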

To take the top three rows of each group, you can chain groupby with the head method:

In [35]: df.sort_values(['job','count'],ascending=False).groupby('job').head(3)

Because groupby keeps the rows of each group in their existing order, head(3) returns the three rows with the largest count for each job, still sorted by count in descending order.

Out[35]: 
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
5      5  market      A
8      4  market      D
6      3  market      B
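
If you only need the largest count values per group rather than the full rows, nlargest on the grouped column is an alternative worth knowing. This is a sketch of that different technique, not the approach used above: it returns a Series indexed by job and the original row label, and it orders the groups by key (market before sales).

```python
import pandas as pd

# Same example data as in the article.
df = pd.DataFrame({
    "count": [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    "job": ["sales"] * 5 + ["market"] * 5,
    "source": list("ABCDE") * 2,
})

# Three largest count values within each job.
top3 = df.groupby("job")["count"].nlargest(3)

# The second index level holds the original row labels, so the full rows
# can be recovered from df when needed.
top3_rows = df.loc[top3.index.get_level_values(1)]

print(top3)
print(top3_rows)
```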
