September 15, 2016 at 3:46 pm
Hello. I had this weird thing happen on SQL 2012 SP2 and SP3.
CREATE TABLE dbo.EmployeeHistory (StartDate SMALLDATETIME, EndDate SMALLDATETIME, NativeWorkerID VARCHAR(255), EmployeeID VARCHAR(20))
CREATE TABLE dbo.EH_NativeWorker_Dups(NativeWorkedID VARCHAR(255))
SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory
WHERE
'9/5/16' BETWEEN StartDate AND EndDate
AND NativeWorkerID NOT IN
(SELECT NativeWorkerID FROM dbo.EH_NativeWorker_Dups)
This should return an error. For me it simply returns no rows. If you look closely I fat fingered the creation of dbo.EH_NativeWorker_Dups. Maybe the compiler/parser got confused because NativeWorkerID exists on EmployeeHistory. When ran on a broader scope this snippet caused us to lose a day's worth of production reporting data. :crying:
Thanks,
Ken
September 15, 2016 at 6:51 pm
It didn't return an error for me either. Even with data in both tables.
Have you reported it to Microsoft?
You say you fat-fingered it. Did you run this in production without testing it first?
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
September 15, 2016 at 6:56 pm
Although this is too little too late, you might reconsider that IN clause for the future. Use a non-existence test instead. You'll get the error you expect. I have no idea why the subquery for your IN clause lets the typo slip by.
SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory EH
WHERE
'9/5/16' BETWEEN StartDate AND EndDate
AND NOT EXISTS (SELECT 1 FROM dbo.EH_NativeWorker_Dups D where D.NativeWorkerID = EH.NativeWorkerID)
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
September 15, 2016 at 7:01 pm
Curiouser and curiouser, your code actually seems to run correctly on my limited testing.
When there is a date range match and a dup match, a row doesn't display.
When there is a date range match and no dup match, a row does display.
I even changed the name of the column in the EH_Native_Worker_Dups_Table to [X].
It is apparently disregarding the column name entirely and does so EVEN IF THERE IS MORE THAN one column.
By the way, I'm running on 2016.
CREATE TABLE dbo.EmployeeHistory (StartDate SMALLDATETIME, EndDate SMALLDATETIME, NativeWorkerID VARCHAR(255), EmployeeID VARCHAR(20))
insert into dbo.EmployeeHistory
select dateadd(YEAR,-2,getdate()), getdate(), 2, 2
CREATE TABLE dbo.EH_NativeWorker_Dups(X VARCHAR(255), Y Int)
insert into dbo.EH_NativeWorker_Dups
select 2, null
--SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
--FROM dbo.EmployeeHistory EH
--WHERE
--'9/15/16' BETWEEN StartDate AND EndDate
--AND NOT EXISTS (SELECT 1 FROM dbo.EH_NativeWorker_Dups D where D.NativeWorkerID = EH.NativeWorkerID)
SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory
WHERE
'9/5/16' BETWEEN StartDate AND EndDate
AND NativeWorkerID NOT IN
(SELECT NativeWorkerID FROM dbo.EH_NativeWorker_Dups)
TRUNCATE TABLE dbo.EH_NativeWorker_Dups
SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory
WHERE
'9/5/16' BETWEEN StartDate AND EndDate
AND NativeWorkerID NOT IN
(SELECT NativeWorkerID FROM dbo.EH_NativeWorker_Dups)
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
September 15, 2016 at 8:11 pm
This was removed by the editor as SPAM
September 15, 2016 at 8:23 pm
ken.trock (9/15/2016)
Hello. I had this weird thing happen on SQL 2012 SP2 and SP3.
CREATE TABLE dbo.EmployeeHistory (StartDate SMALLDATETIME, EndDate SMALLDATETIME, NativeWorkerID VARCHAR(255), EmployeeID VARCHAR(20))
CREATE TABLE dbo.EH_NativeWorker_Dups(NativeWorkedID VARCHAR(255))
SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory
WHERE
'9/5/16' BETWEEN StartDate AND EndDate
AND NativeWorkerID NOT IN
(SELECT NativeWorkerID FROM dbo.EH_NativeWorker_Dups)
This should return an error. For me it simply returns no rows. If you look closely I fat fingered the creation of dbo.EH_NativeWorker_Dups. Maybe the compiler/parser got confused because NativeWorkerID exists on EmployeeHistory. When ran on a broader scope this snippet caused us to lose a day's worth of production reporting data. :crying:
Thanks,
Ken
Actually it's working as expected... This is a know behavior and why it important to ALWAYS alias your tables & columns. Without aliases, the optimizer is free to reference "NativeWorkerID" from any table, from the outer query or sub-query, that has that column... Since EmployeeHistory is the only table with that column it chose it.
By using aliases, you're telling the optimizer what table each column should come from.
This is the precise reason we require all queries to be aliased, even single table queries (because single tables tend to not remain single table queries forever), before allowing them into production.
Aliased version of the query...
SELECT DISTINCT eh.EmployeeID, eh.NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory eh
WHERE
'2016-09-05' BETWEEN eh.StartDate AND eh.EndDate
AND eh.NativeWorkerID NOT IN
(SELECT nwd.NativeWorkerID FROM dbo.EH_NativeWorker_Dups nwd)
Result...
Msg 207, Level 16, State 1, Line 14
Invalid column name 'NativeWorkerID'.
And again, using aliases to illustrate the point...
SELECT DISTINCT eh.EmployeeID, eh.NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory eh
WHERE
'2016-09-05' BETWEEN eh.StartDate AND eh.EndDate
AND eh.NativeWorkerID NOT IN
(SELECT eh.NativeWorkerID FROM dbo.EH_NativeWorker_Dups nwd)
Result...
EmployeeID CIMWorkerID
-------------------- ------------------
(0 row(s) affected)
HTH,
Jason
September 15, 2016 at 8:47 pm
Thank you, gentlemen. That was driving me bonkers.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
September 16, 2016 at 2:01 am
ken.trock (9/15/2016)
This should return an error. For me it simply returns no rows. If you look closely I fat fingered the creation of dbo.EH_NativeWorker_Dups. Maybe the compiler/parser got confused because NativeWorkerID exists on EmployeeHistory.
Not error and not a confused parser (well, not an error in SQL Server)
The binding order for columns within a subquery are:
1) To the tables inside the subquery
2) Then, if the column isn't found, to tables in the outer query.
Hence your query is the perfectly valid
SELECT DISTINCT EmployeeID, NativeWorkerID AS 'CIMWorkerID'
FROM dbo.EmployeeHistory
WHERE
'9/5/16' BETWEEN StartDate AND EndDate
AND NativeWorkerID NOT IN
(SELECT EmployeeHistory.NativeWorkerID FROM dbo.EH_NativeWorker_Dups)
If the binding check didn't allow for columns mentioned in the subquery to bind to tables in the outer query, then we wouldn't be able to write correlated subqueries, like the non-existance check that was suggested
AND NOT EXISTS (SELECT 1 FROM dbo.EH_NativeWorker_Dups D where D.NativeWorkerID = EH.NativeWorkerID)
Always, always, always qualify your column names, especially when there are subqueries in use.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 16, 2016 at 7:47 am
Yes, thank you everyone who responded. Definitely good info.
Lessons learned:
-Alias column names even in single table queries lest they get moved into larger queries un-aliased.
-Do thorough testing on even small changes before deploying something to production!
Ken
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply