November 6, 2009 at 7:48 am
I had a table (tblStop) containing a list of serial numbers, which were a subset of records in another table (tblCard) and was asked to set a flag and date on records in tblCard where the serial number existed in tblStop. There were 150 records in tblStop, and 65,000 records in tblCard
I created this update query:
UPDATE tblCard
SET Stopped = 1, StoppedDate = Getdate()
WHERE tblCard.SerialNumber IN (Select SerialNumber from tblStop)
When I ran the query, it reported 65,000 records updated!
I then realised that the column in tblStop was called CardSerialNumber, not SerialNumber
I would have expected an error to be generated, as it would if I just ran Select SerialNumber FROM tblStop, but no error, and every single record in the target table was updated! The first words out of my mouth were Oh 5h1t! Luckily I was able to set the records back to their previous status using a backup, but I was shocked when it happened.
Can anyone explain why it happened?
November 6, 2009 at 8:02 am
This is a gotcha with correlated subqueries.
The tblCard table is implicitly included in your outer select (tblStop). Because SQL Server can't find SerialNumber in tblStop, it has a look in the inner table and finds SerialNumber in tblCard.
Change your query to this, and it will generate the error you were expecting
UPDATE tblCard
SET Stopped = 1, StoppedDate = Getdate()
WHERE tblCard.SerialNumber IN (Select tblStop.SerialNumber from tblStop)
Somewhere (can't find it at the moment), I've seen an article explaining this a lot better than I have.
November 6, 2009 at 8:04 am
I tried it on one of our databases and it did not work. I got an error. This was on a SQL 2008 server.
The only thing I would think off, if this happened to me, would be that maybe I selected the first 2 lines and left out the WHERE clause.
November 6, 2009 at 8:23 am
A practice that comes in handy:
When doing any sort of manual update in a production database, wrap it in a transaction, and don't commit till you know the results will be correct.
If you can't afford to have an uncommitted transaction (because of locks), then roll it back when you're validating it.
The Output clause (introduce in SQL 2005) can make it very, very easy to verify exactly what will be done by the command before committing it.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 6, 2009 at 8:25 am
sometimes using the UPDATE FROM syntax can keep you from fatfingering the correlated subquery like that; i know i did that same error to an invoices table a long time ago, marking everything as "shipped"
ouchy.
UPDATE tblCard
SET Stopped = 1, StoppedDate = Getdate()
FROM tblStop
WHERE tblCard.SerialNumber = tblStop.SerialNumber --this will error out and you'd change it to CardSerialNumber
Lowell
November 6, 2009 at 8:29 am
This wasn't the article I was thinking of, but is Microsoft's explanation:-
http://msdn.microsoft.com/en-us/library/ms178050(SQL.90).aspx
November 6, 2009 at 9:22 am
Hi
A very common error (I learned, too). UPDATE is the most dangerous statement in SQL in my opinion (something I tell people whenever I conduct a class). It's a psychological reason. Words like DELETE, TRUNCATE or even DROP tell you "I can corrupt your data", UPDATE sounds not really dangerous but it can corrupt a complete table, too.
I use a little trick to reduce the possibility of those errors. (BTW: I always try to avoid sub-queries if possible). Using the joined-UPDATE syntax makes it quiet simple to test statements.
--UPDATE c SET Stopped = 1, StoppedDate = Getdate()
SELECT *
FROM tblCard c
JOIN tblStop s ON c.SerialNumber = s.SerialNumber
So I can develop without any dangerous. When everything looks fine, just comment the "SELECT *" and uncomment the "UPDATE".
Greets
Flo
November 6, 2009 at 12:02 pm
As everyone else has already stated - this is a known situation. Sub-queries have access to all of the columns from the tables in the outer query where the sub-query is referenced. If this was not possible, we would not have the ability to perform correlated sub-queries.
Two practices that will help you avoid these kinds of issues in the first place are:
1) When performing ad-hoc updates or deletes, wrap the code in a transaction.
2) Alias every single table being referenced - always...
For the first item - example would be:
BEGIN TRANSACTION;
-- perform update or delete here
ROLLBACK TRANSACTION; -- after testing to make sure everything works
-- COMMIT TRANSACTION; -- un-comment this line and comment out previous line
For the second item - example would be:
UPDATE MyTable
SET somecolumn = 'some value'
WHERE othercolumn IN (SELECT othercolumn FROM othertable ot);
Now, just because the othertable is aliased now - this will not cause the query to fail. The update would still succeed because the othercolumn can still be referenced from MyTable. Modify it to:
UPDATE MyTable
SET somecolumn = 'some value'
WHERE othercolumn IN (SELECT ot.othercolumn FROM othertable ot);
And this one will immediately fail because othertable does not have othercolumn.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
November 9, 2009 at 2:13 am
Many thanks to all concerned for your input
November 9, 2009 at 5:05 am
While we're all chipping in with tips and tricks...
If you only expect to update one row, say so: UPDATE TOP (1)
Yes, wrapping things in a transaction and rolling back as a test is good...but identity columns can trip you up.
I run with IMPLICIT_TRANSACTIONS ON, in case I forget. No lectures please: it works for me.
I also prefer Flo's approach of SELECTing the (TOP n) records which will be affected before adding the UPDATE/DELETE/INSERT.
I would like to see the syntax that allows the original error deprecated, in favour of requiring aliases, just like with table variables.
Paul
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply