I recently got the following email:
“I just keyed in two rows into a table with all identical column values. How do I get rid of just one of them?”
Now even way back when in the dark ages when I started as a database developer I ran into this problem occasionally. Every now and again I would create a table and forget to put a primary key on it. (I’m fairly sure I’m the only one who has ever done this.) As a result I inevitably ended up with some rows that were exact duplicates and I needed to get rid of just one of them. So the first thing I tried was this:
DELETE TOP 1 FROM MyHeap WHERE WhatShouldHaveBeenMyKeyValue = 'WhatWasIThinking'
And it failed. Looking in BOL (written on stone tablets at the time) I found out that TOP wasn’t available in DELETE, or UPDATE for that matter. Remember this was back in SQL 2000. I honestly don’t remember if this could be done in FoxPro (my first database language). I developed a couple of tricks to manage the problem, and they worked fairly well, but I had always wished for DELETE TOP #. Well my wish finally came true, as of SQL 2005 TOP is available for use with DELETE, UPDATE and even INSERT. Unfortunately I only learned this recently while studying for my 2008 Development MCTS. Here is the definition in 2005-2008 R2. I haven’t looked at 2012 yet but I imagine that it is the same or similar.
TOP (expression) [PERCENT] [ WITH TIES ]
So back to my developer’s problem, I could now give him the simple solution.
DELETE TOP 1 FROM MyHeap WHERE WhatShouldHaveBeenMyKeyValue = 'WhatWasIThinking'
and it worked perfectly. In case you were wondering the original solutions I came up with was this:
SELECT DISTINCT * INTO #temp FROM MyHeap WHERE WhatShouldHaveBeenMyKeyValue = 'WhatWasIThinking' DELETE FROM MyHeap WHERE WhatShouldHaveBeenMyKeyValue = 'WhatWasIThinking' INSERT INTO MyHeap SELECT * FROM #temp
With complications for identity columns of course. So you can see how much easier the TOP keyword makes things.
FYI You can also use the TOP keyword as a way to limit a transaction size. I’m using 5000 rows as a transactions size for the following examples.
DELETE TOP (5000) SET ColumnA = 'NewValue' WHERE ColumnA = 'OldValue' WHILE @@ROWCOUNT = 5000 BEGIN WAITFOR DELAY ’00:00:02’ – Wait 2 seconds between transactions DELETE TOP (5000) SET ColumnA = 'NewValue' WHERE ColumnA = 'OldValue' END
Or
UPDATE TOP (5000) SET ColumnA = 'NewValue' WHERE ColumnA = 'OldValue' WHILE @@ROWCOUNT = 5000 BEGIN WAITFOR DELAY '00:00:02' – Wait 2 seconds between transactions UPDATE TOP (5000) SET ColumnA = 'NewValue' WHERE ColumnA = 'OldValue' END
Note: Scripts corrected based on comments below.
I haven’t had a chance to do any extensive testing on this yet but the theory is sound and I didn’t run into any problems the couple of times I’ve tried it.