Deleting Duplicate Records Without a Primary Key in SQL
Duplicate records can pose challenges in data integrity, leading to inaccurate results and data inconsistencies. In situations where a table lacks a primary key, identifying and removing duplicate records can be more complex.
In the absence of a primary key, one approach to eliminate duplicate records is to identify those with matching values in specific fields, such as an employee ID and Social Security number. To achieve this, a subquery can be employed to partition the table based on the desired criteria and assign a sequential number to each record.
The DELETE statement can then be utilized to delete records with duplicate field values by filtering on those with a row number greater than 1. This ensures that only the first occurrence of each unique set of values is retained.
Consider the following SQL query:
DELETE SUB FROM (SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt FROM Employee) SUB WHERE SUB.cnt > 1
In this query, the subquery calculates the row number for each record, partitioned by the EmpId, EmpName, and EmpSSN fields. The ORDER BY clause ensures that the row number is assigned sequentially. The DELETE statement then removes any records with a row number greater than 1, effectively deleting duplicates.
The above is the detailed content of How to Delete Duplicate Records in SQL Tables Without a Primary Key?. For more information, please follow other related articles on the PHP Chinese website!