Efficiently obtain the largest row in the partition and avoid redundant table access
In the field of data query, efficiency is crucial, especially when dealing with large tables. A common optimization challenge is to find the row with the largest value in a specific column from each partition of a table.
Suppose we need to get the score obtained by each ID in the latest round (ROUND) from the SCORES table:
ID | ROUND | SCORE |
---|---|---|
1 | 1 | 3 |
1 | 2 | 6 |
1 | 3 | 2 |
2 | 1 | 10 |
2 | 2 | 12 |
3 | 1 | 6 |
Initial method:
One way is to retrieve all rows and then filter out those that do not represent the maximum ROUND per ID:
<code class="language-sql">SELECT * FROM (SELECT id, round, CASE WHEN (MAX(round) OVER (PARTITION BY id)) = round THEN score ELSE NULL END score FROM SCORES where id in (1,2,3) ) scorevals WHERE scorevals.round is not null;</code>
Although this method works, it is inefficient due to redundant table scans.
Optimization plan:
Another more efficient way is to use window functions and the DISTINCT clause:
<code class="language-sql">SELECT DISTINCT id ,max(round) OVER (PARTITION BY id) AS round ,first_value(score) OVER (PARTITION BY id ORDER BY round DESC) AS score FROM SCORES WHERE id IN (1,2,3) ORDER BY id;</code>
In this method, the window function max(round) OVER (PARTITION BY id)
calculates the maximum ROUND for each ID. Then apply a DISTINCT clause after the window function, ensuring only the highest ROUND row for each ID is returned. Finally, the first_value(score) OVER (PARTITION BY id ORDER BY round DESC)
window function retrieves the first SCORE associated with the largest ROUND for each ID.
This optimized scheme achieves the desired results without requiring multiple table scans, significantly improving performance.
The above is the detailed content of How to Efficiently Select the Max Row from Partitions Without Multiple Table Scans?. For more information, please follow other related articles on the PHP Chinese website!