Select the first row in PostgreSQL GROUP BY group
In PostgreSQL, selecting the first row of each group in a GROUP BY operation is a common requirement. For this we can use the powerful DISTINCT ON
clause.
<code class="language-sql">SELECT DISTINCT ON (customer) id, customer, total FROM purchases ORDER BY customer, total DESC</code>
DISTINCT ON
clause allows us to define which expressions are considered duplicates. In this example, we specified the customer
column, which means that only rows with the same customer
value will be considered duplicates.
ORDER BY
clause defines the order in which rows are processed. The first row in each customer
group will be the row with the highest total value. If there are multiple rows with the same maximum total value, the deciding factor will be the order in which the id
columns are defined.
Key points
DISTINCT ON
is an extension to PostgreSQL that allows us to specify which expressions determine duplicates. DISTINCT ON
expression must match the leading expression in the ORDER BY
clause. ORDER BY
clause can be used to prioritize rows within each group. SELECT
list that are not in a DISTINCT ON
or ORDER BY
clause. customer
, total
, and id
columns, in the same order as the query, will provide the best performance. Additional Notes
For data with customer
high column cardinality (a small number of rows per customer), the DISTINCT ON
method is very efficient. However, for customer
data with low column cardinality (large number of rows per customer), other query techniques may be more efficient.
The above is the detailed content of How to Select the First Row of Each Group in a PostgreSQL GROUP BY Query?. For more information, please follow other related articles on the PHP Chinese website!