Hi all,
Kindly suggest or share a query with me: how can we delete the rows having duplicate records?
Condition: the table has only one column, from which we have to delete the duplicate records.
Thanks
February 2, 2017 at 4:05 am
Write a CTE that numbers the rows, partitioned on the values of the single column, then delete from that CTE where the row number is greater than 1.
John
February 2, 2017 at 6:35 am
Bharat21 - Thursday, February 2, 2017 3:57 AM
Hi all,
Kindly suggest or share a query with me: how can we delete the rows having duplicate records?
Condition: the table has only one column, from which we have to delete the duplicate records.
Thanks
What percent of the table contains duplicates?
--Jeff Moden
Change is inevitable... Change for the better is not.
February 2, 2017 at 6:58 am
John Mitchell-245523 - Thursday, February 2, 2017 4:05 AM
Write a CTE that numbers the rows, partitioned on the values of the single column, then delete from that CTE where the row number is greater than 1.
John
If the table in question only has one column, how would the DELETE know to remove only one of the records matching the criteria? I think this might be a bit more complicated than that, involving some type of DELETE TOP 1... kind of expression; not sure how to make that a set-based solution though. :ermm:
February 2, 2017 at 7:02 am
How many rows are in the table? It might be easier to copy the data to a new table with a SELECT DISTINCT ... INTO ... and then rename the tables afterward.
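A minimal sketch of that copy-and-rename approach, assuming a hypothetical single-column table dbo.SingleCol (column Val) and that you have DDL permissions; all object names here are illustrative:
-- Copy only the distinct values into a new table.
SELECT DISTINCT Val
INTO dbo.SingleCol_Deduped
FROM dbo.SingleCol;
-- Once the copy is verified, swap the tables.
EXEC sp_rename 'dbo.SingleCol', 'SingleCol_Old';
EXEC sp_rename 'dbo.SingleCol_Deduped', 'SingleCol';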
February 2, 2017 at 7:09 am
Chris Harshman - Thursday, February 2, 2017 6:58 AM
John Mitchell-245523 - Thursday, February 2, 2017 4:05 AM
Write a CTE that numbers the rows, partitioned on the values of the single column, then delete from that CTE where the row number is greater than 1.
John
If the table in question only has one column, how would the DELETE know to remove only one of the records matching the criteria? I think this might be a bit more complicated than that, involving some type of DELETE TOP 1... kind of expression; not sure how to make that a set-based solution though. :ermm:
Upon re-reading, I notice the requirement isn't totally clear, but I read it as meaning delete duplicates so that there's only one of each value left. If I'm right, my solution will work. If we're only removing one, then delete where row number is 1. If the requirement is to remove all duplicated values, use COUNT instead of ROW_NUMBER, and delete where the count is greater than 1. But as you and (I think) Jeff have suggested, it may be more efficient to move and rename (although that won't be possible if there are no DDL permissions in the database).
John
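A minimal sketch of the COUNT variant described above (removing every value that appears more than once, keeping none of them), using a hypothetical single-column temp table; names and sample data are illustrative:
IF OBJECT_ID('tempdb..#Vals') IS NOT NULL DROP TABLE #Vals;
CREATE TABLE #Vals (Val VARCHAR(30));
INSERT INTO #Vals (Val) VALUES ('A'),('A'),('B'),('C'),('C'),('C'),('D');
-- Count occurrences of each value, then delete every row whose value is duplicated.
WITH Duplicates AS
(
    SELECT Val, cnt = COUNT(*) OVER (PARTITION BY Val)
    FROM #Vals
)
DELETE FROM Duplicates WHERE cnt > 1;
SELECT Val FROM #Vals;  -- only 'B' and 'D' remain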
Chris Harshman - Thursday, February 2, 2017 6:58 AM
John Mitchell-245523 - Thursday, February 2, 2017 4:05 AM
Write a CTE that numbers the rows, partitioned on the values of the single column, then delete from that CTE where the row number is greater than 1.
John
If the table in question only has one column, how would the DELETE know to remove only one of the records matching the criteria? I think this might be a bit more complicated than that, involving some type of DELETE TOP 1... kind of expression; not sure how to make that a set-based solution though. :ermm:
You mean this?
IF OBJECT_ID('tempdb..#Clients') IS NOT NULL DROP TABLE #Clients;
CREATE TABLE #Clients (Surname VARCHAR(30));

INSERT INTO #Clients (Surname) VALUES
('Smith'),('Smith'),('Smith'),('Smith'),
('Jones'),('Jones'),('Jones'),('Jones'),('Jones'),
('sahathevarajan'),('sahathevarajan');

-- Show the row numbers that will drive the delete.
SELECT Surname, rn = ROW_NUMBER() OVER(PARTITION BY Surname ORDER BY (SELECT NULL)) FROM #Clients;

-- Delete every row after the first within each Surname group.
WITH Deleter AS (SELECT Surname, rn = ROW_NUMBER() OVER(PARTITION BY Surname ORDER BY (SELECT NULL)) FROM #Clients)
DELETE FROM Deleter WHERE rn > 1;

SELECT Surname FROM #Clients;
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
February 2, 2017 at 7:21 am
Chris Harshman - Thursday, February 2, 2017 6:58 AM
John Mitchell-245523 - Thursday, February 2, 2017 4:05 AM
Write a CTE that numbers the rows, partitioned on the values of the single column, then delete from that CTE where the row number is greater than 1.
John
If the table in question only has one column, how would the DELETE know to remove only one of the records matching the criteria? I think this might be a bit more complicated than that, involving some type of DELETE TOP 1... kind of expression; not sure how to make that a set-based solution though. :ermm:
It's not more complicated. It works perfectly thanks to the internal magic of CTEs and the hidden row identification (RID) that's associated with the rows returned alongside the generated ROW_NUMBER() OVER().
--Jeff Moden
Change is inevitable... Change for the better is not.
February 2, 2017 at 8:01 am
" It works perfectly thanks to the internal magic of CTEs and hidden row identification using RID that's associated with the return of the generated ROW_NUMBER() OVER.."
Good to know, thanks!
February 9, 2017 at 3:54 am
Thanks to all
February 16, 2017 at 8:45 pm
This was removed by the editor as SPAM
February 16, 2017 at 9:50 pm
JasonClark - Thursday, February 16, 2017 8:45 PM
PS: Please perform delete operation within a transaction.
I have to ask "why"? One of the advantages of doing it using the good method you described is that the new de-duped data exists in a separate table. Once completed, there's no real need to protect the original table. Another advantage of your method is that you can take advantage of minimal logging. As of 2008 and provided that you have the same clustered index keys as the original table, the INSERT/SELECT will be minimally logged and won't require the duplication of reserved space like a SELECT INTO followed by a CI build would.
--Jeff Moden
Change is inevitable... Change for the better is not.
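A minimal sketch of the minimally logged copy described above, assuming SQL Server 2008+, a SIMPLE or BULK_LOGGED recovery model, an empty pre-created target with the same clustered index keys as the source, and illustrative object names:
-- Pre-create the target with the same clustered index keys as the original table.
CREATE TABLE dbo.SingleCol_Deduped (Val VARCHAR(30) NOT NULL);
CREATE CLUSTERED INDEX CI_SingleCol_Deduped ON dbo.SingleCol_Deduped (Val);
-- TABLOCK on the empty, pre-indexed target allows the insert to be minimally logged.
INSERT INTO dbo.SingleCol_Deduped WITH (TABLOCK) (Val)
SELECT DISTINCT Val
FROM dbo.SingleCol
ORDER BY Val;  -- present rows in clustered key order
-- Swap the tables once the copy is verified, as in the earlier sketch.
EXEC sp_rename 'dbo.SingleCol', 'SingleCol_Old';
EXEC sp_rename 'dbo.SingleCol_Deduped', 'SingleCol';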
February 17, 2017 at 6:19 am
Don't you just love it when DB designers leave out primary key or unique key constraints? If it wasn't for poor database design we wouldn't get to have so much fun with T-SQL.
June 19, 2020 at 2:00 pm
Yes, I had been wanting this.
Thanks, it worked for me.