February 1, 2016 at 5:44 am
andy_111 (2/1/2016)
Adrian_1 (2/1/2016)
The OP is intending that only 1 row will be deleted (because SET ROWCOUNT = 1) and therefore on the 2nd pass will not delete because there is now only 1 left...
see previous post
according to MSDN, "Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax. For more information, see TOP (Transact-SQL)."
https://msdn.microsoft.com/en-us/library/ms188774.aspx%5B/quote%5D
Based on the nested comment if SET ROWCOUNT = 1 and you have a 10k dups. All that I will say is good luck!:-P.
--------------------------------------------------------------------------------------------------------------------------------------------------------
To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
http://www.sql-sa.co.za
February 1, 2016 at 6:14 am
I agree with the general forum sentiment - rowcount being an older, deprecated method, should obviously be avoided. I agree the windowing methods are more appropriate. I am glad it was posted though, from a general trivia perspective. Without noticing its role, the delete query would lead one to assume an incorrect number of records would be removed. :w00t:
February 1, 2016 at 6:22 am
It is a weird solution but I am glad I have seen a new solution to remove duplication after 10 years(after introduction of row_number in SQL 2005) 😀
February 1, 2016 at 7:19 am
Even the method posted on Microsoft Support is better than this ugly RBAR option with a deprecated feature.
February 1, 2016 at 7:37 am
Nice post! I have a problem looking for duplicates in a similar table with "text" fields. What would that SQL look like?
Thank you,
Bob
February 1, 2016 at 7:40 am
Wouldn't this be a simpler solution and be a bit less confusing?
delete Customers
where CustID in (
select max(CustID) -- or change to min() if wanting to keep latest
from Customers
group by CustName
having count(*) > 1
)
February 1, 2016 at 7:46 am
vopipari (2/1/2016)
Wouldn't this be a simpler solution and be a bit less confusing?
delete Customers
where CustID in (
select max(CustID) -- or change to min() if wanting to keep latest
from Customers
group by CustName
having count(*) > 1
)
This will still remove ALL customer records that have a duplicate record
--------------------------------------------------------------------------------------------------------------------------------------------------------
To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
http://www.sql-sa.co.za
February 1, 2016 at 7:48 am
It only picks the min or max CustID for the pair of duplicates.
February 1, 2016 at 7:53 am
Microsoft said "Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax."
February 1, 2016 at 8:10 am
vopipari (2/1/2016)
It only picks the min or max CustID for the pair of duplicates.
😛
February 1, 2016 at 8:23 am
but it won't tidy up triplicates....
February 1, 2016 at 8:32 am
Adrian_1 (2/1/2016)
but it won't tidy up triplicates....
True, I was only thinking of duplicates. This fixes that though:
delete Customers
where CustName in (
select CustName
from Customers
group by CustName
having count(*) > 1
)
and CustID not in (
select max(CustID) -- keeps latest, change to min() if wanting to keep the first
from Customers
group by CustName
having count(*) > 1
);
February 1, 2016 at 9:45 am
bob_chang (2/1/2016)
Nice post! I have a problem looking for duplicates in a similar table with "text" fields. What would that SQL look like?Thank you,
Bob
The same... The answers given here are all generic. It doesn't matter what data types you are dealing with.
February 1, 2016 at 10:07 am
Personally I prefer
WITH cte AS (
SELECT a.*,
ROW_NUMBER() OVER(PARTITION BY field1,field2 ORDER BY field1,field2) rrn
FROM dbo.file_with_duplicates a)
DELETE FROM cte WHERE rrn > 1
February 1, 2016 at 10:59 am
JustASQLGuy (2/1/2016)
Wouldn't this be a simpler solution and be a bit less confusing?
delete Customers
where CustID in (
select max(CustID) -- or change to min() if wanting to keep latest
from Customers
group by CustName
having count(*) > 1
)
Once I read the post the issue of deleting ALL records hit me because I have been in this kind of situation before. Take care to backup the table as an exported copy before trying out on production. The above query by JustASQLGuy comes closer to a solution and if run multiple time in a WHILE loop could clear all entries occuring multiple times. I haven't tested this theory though.
Br. Kenneth Igiri
https://kennethigiri.com
All nations come to my light, all kings to the brightness of my rising
Viewing 15 posts - 16 through 30 (of 76 total)
You must be logged in to reply to this topic. Login to reply