Preserving One Copy of Duplicate Rows in Postgres
When dealing with duplicate data, it's often necessary to remove the redundancies while retaining at least one instance of each duplicate set. In Postgres, this can be achieved through the following approach:
To delete duplicate rows while retaining a single copy, we can use the following query:
DELETE FROM foo WHERE id NOT IN ( SELECT MIN(id) -- or MAX(id) for the latest copy FROM foo GROUP BY hash );
In this query:
By using MIN(id) or MAX(id) inside the subquery, we ensure that only the oldest or newest copy of each duplicate set is retained.
This approach ensures that at least one copy of each duplicate row is preserved in the table while removing the unnecessary duplicates.
The above is the detailed content of How to Keep Only One Copy of Duplicate Rows in PostgreSQL?. For more information, please follow other related articles on the PHP Chinese website!