In this article, we will discuss how to delete duplicate rows from a table in SQL server
This is a very common SQL interview question. Here is the table with duplicate data, notice that mark record is duplicated 3 times Marry record 2 times and Ben record 3 times.
Now the delete query that we are going to write should delete all the rows except one for the employee.
It means after we execute the delete query the employee table should leave with just 3 records as you can see here.
How to delete duplicate rows from a table using Row Number and Partition?
Let us see how to achieve this using SQL server management studio.
I have already created the employee table and populated it with some test data. Here is the SQL script that can do it.
Now to write a delete query we are going to make use of common table expression called CTE.
Now we are going to select all the columns from the employee table and along with all the columns we are going to make use of row number function.
Then over we want data to be portioned by ID column and also we want the order by this ID column that generates the row number.
Now select the results from CTE and look at the results.
Now the data is partitioned by ID and the row numbers are unique within that partition.
Here when the new partition stats look at the row number, it starts with a new number and we have partitioned the data basically by the ID.
Within the partition, row numbers are unique when a new partition stats with 1.
Now using this row number column we can very easily delete all the rows except 1.
In the delete query, we can use row numbers as a filter.
Instead of selecting all the rows from the employee table what we can do is delete from employees CTE where row number greater than 1.
What is it going to do is, it is going to delete anything that has got greater than 1. Meaning it is going to leave just one record per employee.
|ID||First Name||Last Name||Gender||Salary||Rownumber|
That’s it we got the results as expected.
How to Retrieve Only Unique Records from the Employee Table?
Let us write a simple query for this one.
Select ID, First Name, Last Name, Gender, Salary, Count(*) as Unique from Employee Group By ID, First Name, Last Name, Gender, Salary Having Count(*)=1;
Result for this query will be
|ID||First Name||Last Name||Gender||Salary||Unique|
I hope you enjoyed reading this article How to delete duplicate rows from a table in SQL server. If you still have any queries please leave a comment in the comment section.
For more information on how to delete duplicate rows from a table in SQL Server, you check this article Different Ways to SQL delete duplicate rows from a SQL table.
I hope this article is useful to you. Thank You. Have a good day.
Do you have something to share about how to eliminate duplicate records from a table? Or Do you have any queries on deleting the duplicate records from a table?
Please leave your comments in the comment section below.