SQL select only rows with maximum value on column
P粉937382230
P粉937382230 2023-08-23 11:08:57
0
2
462
<p>I have this document form (here is a simplified version): </p> <table class="s-table"> <thead> <tr> <th>id</th> <th>Transfer</th> <th>Content</th> </tr> </thead> <tbody> <tr> <td>1</td> <td>1</td> <td>...</td> </tr> <tr> <td>2</td> <td>1</td> <td>...</td> </tr> <tr> <td>1</td> <td>2</td> <td>...</td> </tr> <tr> <td>1</td> <td>3</td> <td>...</td> </tr> </tbody> </table> <p>How to select one row per ID and only the largest RPM? </p><p> Using the above data, the result should contain two rows: <code>[1, 3, ...]</code> and <code>[2, ​​1, ..]</code>. I'm using<em><strong>MySQL</strong></em>. </p> <p>Currently, I'm using a check in a <code>while</code> loop to detect and overwrite old revs in the result set. But is this the only way to achieve this result? Don't have a <strong>SQL</strong> solution? </p>
P粉937382230
P粉937382230

reply all(2)
P粉638343995

My preference is to use as little code as possible...

You can do this using IN Try this:

SELECT * 
FROM t1 WHERE (id,rev) IN 
( SELECT id, MAX(rev)
  FROM t1
  GROUP BY id
)

In my opinion, it's less complex... easier to read and maintain.

P粉517475670

At first glance...

All you need is a GROUP BY clause with a MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

Things are never that simple, are they?

I just noticed that you also need the content column.

This is a very common problem in SQL: find all the data for a row with some maximum value in each group identifier column. I hear this statement often in my career. In fact, this was one of the questions I answered in a technical interview at my current job.

In fact, the Stack Overflow community created a tag to address questions like this: .

Basically, you have two ways to solve this problem:

Use simple group-identifier, max-value-in-groupSubquery join

In this approach you first find the group-identifier, max-value-in-group (solved above) in the subquery. Then, join the table to the subquery and make it equal on group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
    SELECT id, MAX(rev) rev
    FROM YourTable
    GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev

Left join with self, adjust connection conditions and filters

In this approach you left join the table to itself. Equality exists in group-identifier. Then, 2 smart moves:

  1. The second connection condition is that the value on the left is less than the value on the right
  2. When you do step 1, the row that actually has the largest value will show NULL on the right (that's a LEFT JOIN, remember?). We then filter the join results to only show rows with NULL on the right side.

So you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
    ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;

in conclusion

Both methods will bring exactly the same results.

If you have two rows group-identifier with max-value-in-group, both rows will appear in the results of both methods.

Both methods are SQL ANSI compatible, so they will work with your favorite RDBMS regardless of their "flavor".

Both approaches are also performance-friendly, but your situation may differ (RDBMS, database structure, indexes, etc.). So baselines when you choose one method over another. And make sure you choose the one that makes the most sense to you.

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template