Home > Database > Mysql Tutorial > How to Efficiently Delete Duplicate Rows from a Table in SQL?

How to Efficiently Delete Duplicate Rows from a Table in SQL?

Mary-Kate Olsen
Release: 2025-01-19 02:47:08
Original
115 people have browsed it

How to Efficiently Delete Duplicate Rows from a Table in SQL?

SQL: Eliminating Duplicate Rows in Tables

Challenge:

You're faced with a table lacking primary keys or constraints, containing multiple rows with identical data. The goal is to remove duplicate rows, preserving only a single instance of each unique row. We'll use a column named "key" to identify duplicates, ensuring only one entry exists for each distinct "key" value.

Solution:

This process involves two key steps:

  1. Identifying the First Occurrence: Locate the initial instance of each duplicate row.
  2. Removing Subsequent Duplicates: Delete all rows that are not the first occurrence identified in step 1.

The following SQL command accomplishes this:

<code class="language-sql">DELETE FROM dups a USING (
    SELECT MIN(ctid) as ctid, key
    FROM dups 
    GROUP BY key HAVING COUNT(*) > 1
) b
WHERE a.key = b.key 
AND a.ctid <> b.ctid;</code>
Copy after login

Important Note: This method doesn't offer control over which specific row is retained.

Illustrative Example:

Consider this sample table:

<code class="language-sql">CREATE TABLE people (
    name    varchar(50) NOT NULL,
    surname varchar(50) NOT NULL,
    age     integer NOT NULL
);

INSERT INTO people (name, surname, age) VALUES 
    ('A.', 'Tom', 30),
    ('A.', 'Tom', 10),
    ('B.', 'Tom', 20),
    ('B', 'Chris', 20);</code>
Copy after login

To find the first occurrence of duplicates:

<code class="language-sql">SELECT MIN(ctid) as ctid, name, surname
FROM people 
GROUP BY (name, surname) HAVING COUNT(*) > 1;</code>
Copy after login

Output:

ctid name surname
(0,1) A. Tom

Deleting the non-first duplicates:

<code class="language-sql">DELETE FROM people a USING (
    SELECT MIN(ctid) as ctid, name, surname
    FROM people 
    GROUP BY (name, surname) HAVING COUNT(*) > 1
) b
WHERE a.name = b.name
AND a.surname = b.surname
AND a.ctid <> b.ctid;</code>
Copy after login

The remaining rows:

<code class="language-sql">SELECT * FROM people;</code>
Copy after login

Output:

name surname age
A. Tom 30
B. Tom 20
B Chris 20

The above is the detailed content of How to Efficiently Delete Duplicate Rows from a Table in SQL?. For more information, please follow other related articles on the PHP Chinese website!

source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Latest Articles by Author
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template