Keep unique rows: handle duplicate removal
In the field of database operations, it is often necessary to delete duplicate rows from tables. However, this task can become challenging when rows lack unique identifiers. This question discusses this situation, seeking a solution to eliminate duplicate rows while retaining the first occurrence of each unique combination.
Query:
The originally provided query attempts to delete rows based on the presence of duplicate id values. However, this method fails because no such unique identifier exists in the table. Instead, a more robust solution is needed to handle duplicate detection and removal without relying on explicit row identifiers.
Use CTE and ROW_NUMBER:
An effective way to achieve this is to use a common table expression (CTE) in conjunction with the ROW_NUMBER() function. The CTE technique creates a temporary table CTE that contains the original column and an additional column RN that represents the row number for each combination of col1 values.
Partition and number:
ROW_NUMBER() function allows to partition rows based on col1 column and assign row numbers in ascending order within each partition. Therefore, duplicate rows within the same partition will have unique RN values greater than 1.
Deletion process:
By leveraging CTE, we can isolate and remove any rows with an RN value greater than 1, effectively removing duplicates while retaining the first instance of each unique combination.
Result:
After applying the modified query, the expected results are achieved:
<code>COL1 COL2 COL3 COL4 COL5 COL6 COL7 john 1 1 1 1 1 1 sally 2 2 2 2 2 2</code>
Extended functions:
Queries can be further customized to handle duplicate detection and removal across multiple columns by simply adding these columns to the PARTITION BY clause. For example, to consider col1 and col2 for duplicate identification, the PARTITION BY clause would become:
<code>ROW_NUMBER()OVER(PARTITION BY Col1, Col2 ORDER BY OrderColumn)</code>
This method provides a reliable and efficient way to eliminate duplicate rows based on selected columns, providing flexibility in adapting to different data needs.
The above is the detailed content of How Can I Delete Duplicate Rows in a Table While Preserving the First Occurrence of Each Unique Combination?. For more information, please follow other related articles on the PHP Chinese website!