Suppose you have a DataFrame with several columns and want to group its rows by one or more of them. After grouping, you often need to order the rows within each group, for example by a count column in descending order, and keep only the top few rows per group.
To do this, combine sort_values() with groupby(). The example below uses a DataFrame with three columns: count, job, and source.
<code class="python">import pandas as pd

df = pd.DataFrame({
    'count': [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    'job': ['sales', 'sales', 'sales', 'sales', 'sales',
            'market', 'market', 'market', 'market', 'market'],
    'source': ['A', 'B', 'C', 'D', 'E', 'A', 'B', 'C', 'D', 'E'],
})</code>
To get the total count for each job and source combination, group by both columns and sum:
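<code class="python"># Pass the aggregation as the string 'sum'; recent pandas versions warn when given the built-in sum
df.groupby(['job', 'source']).agg({'count': 'sum'})</code>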
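The aggregated result comes back ordered by the group keys, not by count. As a minimal sketch of ordering the totals within each job, the snippet below keeps job and source as ordinary columns and then sorts; the names totals and totals_sorted are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'count': [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    'job': ['sales'] * 5 + ['market'] * 5,
    'source': ['A', 'B', 'C', 'D', 'E'] * 2,
})

# Sum counts per (job, source); as_index=False keeps the keys as columns.
totals = df.groupby(['job', 'source'], as_index=False)['count'].sum()

# Order by job ascending, then by count descending inside each job.
totals_sorted = totals.sort_values(['job', 'count'], ascending=[True, False])
print(totals_sorted)
```

Passing a list to ascending lets each sort key have its own direction, so the job groups stay in alphabetical order while the counts inside each group run from largest to smallest.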
Next, to sort by the count column in descending order within each job group and keep only the top three rows per group, sort first and then take the head of each group:
<code class="python">result = df.sort_values(['job','count'],ascending=False).groupby('job').head(3)</code>
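This returns the three rows with the highest count for each job, in descending order of count. Because ascending=False applies to both sort keys, the job groups also appear in reverse alphabetical order, so sales comes before market. The original row index is preserved. For the sample data, the result is: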
<code class="python">print(result)
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
5      5  market      A
8      4  market      D
6      3  market      B</code>
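An alternative that skips the explicit sort is to call nlargest() on the grouped count column. This is a sketch of a different technique, not the approach used above; the names top3 and top3_rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'count': [2, 4, 6, 3, 7, 5, 3, 2, 4, 1],
    'job': ['sales'] * 5 + ['market'] * 5,
    'source': ['A', 'B', 'C', 'D', 'E'] * 2,
})

# Three largest counts per job; the result is a Series indexed by
# (job, original row label), with groups ordered by job name.
top3 = df.groupby('job')['count'].nlargest(3)

# Use the original row labels (second index level) to recover full rows.
top3_rows = df.loc[top3.index.get_level_values(1)]
print(top3_rows)
```

Here the groups come out in alphabetical order (market before sales), which differs from the sort_values() version where ascending=False reverses the job order as well.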