Use window function to get the first N rows of each group
This article describes how to use SQL window functions to efficiently extract the first N rows from grouped data. This is very practical in many data analysis scenarios.
MySQL 8 and above provide three commonly used window functions: ROW_NUMBER
, RANK
and DENSE_RANK
. These functions assign a numerical value to each row, allowing us to select the first few rows in each grouping.
Sample data:
Suppose we have the following data set:
pkid | catid | value |
---|---|---|
1 | p01 | 100 |
2 | p01 | 90 |
3 | p01 | 90 |
4 | p01 | 80 |
5 | p01 | 80 |
6 | p01 | 80 |
7 | p01 | 70 |
8 | p01 | 60 |
9 | p01 | 50 |
10 | p01 | 40 |
Comparison of the output results of the three window functions:
pkid | catid | value | row_number | rank | dense_rank |
---|---|---|---|---|---|
1 | p01 | 100 | 1 | 1 | 1 |
2 | p01 | 90 | 2 | 2 | 2 |
3 | p01 | 90 | 3 | 2 | 2 |
4 | p01 | 80 | 4 | 4 | 3 |
5 | p01 | 80 | 5 | 4 | 3 |
6 | p01 | 80 | 6 | 4 | 3 |
7 | p01 | 70 | 7 | 7 | 4 |
8 | p01 | 60 | 8 | 8 | 5 |
9 | p01 | 50 | 9 | 9 | 6 |
10 | p01 | 40 | 10 | 10 | 7 |
Function description:
ROW_NUMBER
: Assigns a unique sequential number to each row in each group, starting with 1. For example, catid
is a p01
whose first 5 rows have row_number
values from 1 to 5.
RANK
: Assigns a rank to each row in each grouping, with rows of the same value having the same rank. For example, the first 5 rows where catid
is p01
have rank
values from 1 to 5 (even if some values are repeated).
DENSE_RANK
: Similar to RANK
, but rows with the same value are ranked consecutively without gaps. For example, the first 5 rows where catid
is p01
have dense_rank
values from 1 to 5 (even if some values are repeated).
Choose the appropriate window function:
Which window function you choose depends on your specific needs and your definition of "first N rows". If you need each row to have a unique ranking, use ROW_NUMBER
; if you allow rows with the same value to have the same rank, use RANK
; if you need consecutive rankings, use DENSE_RANK
.
The above is the detailed content of How Can I Retrieve the Top N Rows for Each Group in SQL Using Window Functions?. For more information, please follow other related articles on the PHP Chinese website!