MySQL: Deleting Duplicate Records While Retaining the Most Recent
Question: How can I eliminate duplicate email addresses in a MySQL table while preserving the most recent one based on the unique ID field?
Solution:
To achieve this, follow these steps:
Identify Duplicate Emails:
Retrieve a list of all duplicate email addresses using the query:
SELECT email FROM test GROUP BY email HAVING COUNT(*) > 1;
Find the Latest ID for Each Duplicate:
Determine the latest ID associated with each duplicate email using the query:
SELECT MAX(id) AS lastID, email FROM test WHERE email IN ( SELECT email FROM test GROUP BY email HAVING COUNT(*) > 1 ) GROUP BY email;
Delete Oldest Duplicates:
Perform a DELETE operation to remove duplicate emails with an ID less than the latest one:
DELETE test FROM test INNER JOIN ( SELECT MAX(id) AS lastID, email FROM test WHERE email IN ( SELECT email FROM test GROUP BY email HAVING COUNT(*) > 1 ) GROUP BY email ) duplic ON duplic.email = test.email WHERE test.id < duplic.lastID;
Optimized Version:
The following optimized version provides the same result while potentially improving performance:
DELETE test FROM test INNER JOIN ( SELECT MAX(id) AS lastID, email FROM test GROUP BY email HAVING COUNT(*) > 1) duplic ON duplic.email = test.email WHERE test.id < duplic.lastID;
This revised DELETE statement focuses on eliminating the oldest duplicates.
Additional Option:
Alternatively, you can utilize this query provided by Rene Limon:
DELETE FROM test WHERE id NOT IN ( SELECT MAX(id) FROM test GROUP BY email);
This approach retains the most recent duplicate email address based on the maximum ID.
The above is the detailed content of How to Remove Duplicate Email Addresses in MySQL While Keeping the Most Recent Record?. For more information, please follow other related articles on the PHP Chinese website!