
How to Efficiently Remove Duplicate Rows from a Table Without Unique Identifiers?

Susan Sarandon
Release: 2025-01-23 01:41:08

Efficiently remove duplicate rows without unique identifiers

Removing duplicates is harder when a table has no unique row identifier, because there is no key that tells one copy of a row from another. This article shows how to delete the extra copies while keeping exactly one row from each group of duplicates.

Let’s look at a table with duplicate rows:

col1 col2 col3 col4 col5 col6 col7
john 1 1 1 1 1 1
john 1 1 1 1 1 1
sally 2 2 2 2 2 2
sally 2 2 2 2 2 2
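
To follow along, the sample table can be recreated with a short script. This is only a sketch: the column types are assumptions (the article does not state them), and the name dbo.Table1 is taken from the query used later in the article. The final SELECT lists each distinct row that occurs more than once, together with its number of copies.

```sql
-- Hypothetical setup: the column types below are assumptions.
CREATE TABLE dbo.Table1 (
    col1 varchar(50),
    col2 int, col3 int, col4 int, col5 int, col6 int, col7 int
);

INSERT INTO dbo.Table1 (col1, col2, col3, col4, col5, col6, col7)
VALUES ('john',  1, 1, 1, 1, 1, 1),
       ('john',  1, 1, 1, 1, 1, 1),
       ('sally', 2, 2, 2, 2, 2, 2),
       ('sally', 2, 2, 2, 2, 2, 2);

-- Show every distinct row that appears more than once and how often.
SELECT col1, col2, col3, col4, col5, col6, col7, COUNT(*) AS copies
FROM dbo.Table1
GROUP BY col1, col2, col3, col4, col5, col6, col7
HAVING COUNT(*) > 1;
```

With the four sample rows, the last query returns one line for john and one for sally, each with copies = 2.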

The desired result after removing duplicate rows is:

col1 col2 col3 col4 col5 col6 col7
john 1 1 1 1 1 1
sally 2 2 2 2 2 2

Solution using CTE and ROW_NUMBER

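This method uses a common table expression (CTE) together with the ROW_NUMBER() function. The CTE assigns each row a sequence number (RN) within its group of duplicates, so every row with RN greater than 1 can be identified and deleted. The syntax shown here (square-bracket identifiers, the dbo schema, deleting through a CTE) is SQL Server (T-SQL).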
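
Before deleting anything, it can help to look at the numbering on its own. The following read-only sketch runs the same ROW_NUMBER() expression as a plain SELECT against dbo.Table1, so the rows that would later be deleted are visible as the ones with RN greater than 1.

```sql
-- Read-only preview: nothing is modified.
SELECT col1, col2, col3, col4, col5, col6, col7,
       ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1) AS RN
FROM dbo.Table1
ORDER BY col1, RN;
```

For the sample data this lists the two john rows with RN 1 and 2, followed by the two sally rows with RN 1 and 2.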

Here is the SQL query with step-by-step instructions:

<code class="language-sql">WITH CTE AS (
   SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
       RN = ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1) -- number the rows within each col1 group
   FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1; -- delete rows with RN greater than 1 (the duplicates)</code>

Instructions:

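  • CTE creation: The WITH clause defines a CTE named CTE that selects the table's columns and assigns an RN value to each row using ROW_NUMBER(). The PARTITION BY clause groups the rows by col1, and ORDER BY sets the numbering order within each group. Because the query partitions by col1 alone, any two rows that share a col1 value count as duplicates; if rows must match on every column to count as duplicates, list all of those columns in PARTITION BY. Since ORDER BY col1 is constant within a partition, which copy receives RN = 1 is arbitrary, which is harmless when the copies are identical.
  • ROW_NUMBER() function: ROW_NUMBER() numbers the rows of each partition consecutively, restarting at 1 for every partition defined by the PARTITION BY clause.
  • Delete operation: The DELETE statement removes the rows whose RN is greater than 1. In SQL Server, deleting through a CTE that reads from a single table deletes the underlying rows in dbo.Table1, so one row per group remains.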
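
The query in the article partitions by col1 only. A variant that treats rows as duplicates only when all seven columns match, and that runs inside a transaction so the result can be checked first, might look like the sketch below. The transaction wrapper and ORDER BY (SELECT NULL) are additions for illustration, not part of the original query; the latter simply states that no particular order is needed among identical rows.

```sql
BEGIN TRANSACTION;

WITH CTE AS (
    SELECT RN = ROW_NUMBER() OVER (
               PARTITION BY col1, col2, col3, col4, col5, col6, col7
               ORDER BY (SELECT NULL))  -- copies are identical, so any order works
    FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1;

-- Inspect what remains while the transaction is still open.
SELECT * FROM dbo.Table1;

-- Keep the result; use ROLLBACK TRANSACTION instead to undo the delete.
COMMIT TRANSACTION;
```

Partitioning by every column is the safer default when the table has no key, since rows that merely share one value are left alone.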

Output:

After executing the query, the updated table will contain:

col1 col2 col3 col4 col5 col6 col7
john 1 1 1 1 1 1
sally 2 2 2 2 2 2
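
One way to confirm that no duplicates remain is to compare the total row count with the number of distinct rows. This is a sketch against the same dbo.Table1; the aliases total_rows, distinct_rows, and d are illustrative names.

```sql
-- The two counts are equal when every row in the table is unique.
SELECT
    (SELECT COUNT(*) FROM dbo.Table1) AS total_rows,
    (SELECT COUNT(*)
     FROM (SELECT DISTINCT col1, col2, col3, col4, col5, col6, col7
           FROM dbo.Table1) AS d) AS distinct_rows;
```

For the two-row result shown above, both counts are 2.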


source:php.cn