April 13, 2016 at 9:40 am
Find orderdetails for categoryid=1. (categoryid is in the product table)
The following works successfully
select od.*
from sales.orderdetails od
where productid in (select productid from production.products where categoryid=1)
However, let's say I make a mistake without noticing. The only word changed is in capitals
select od.*
from sales.orderdetails od
where productid in (select ORDERID from production.products where categoryid=1)
OREDERID comes from the outer table. (Execution plan says 'nested loops- left semi join')
If I run the entire query it returns no rows (obviously), BUT if I run (select ORDERID from production.products where categoryid=1) ON IT'S OWN - it errors with invalid column name!!!!!
So, how come, on it's own it errors but run as a subselect in the whole query, it doesn't?????
Thanks
April 13, 2016 at 9:47 am
I ran the below on adventureworks2012 running on a sqlserver2012 instance version 11.03156 have reproduced the issue.
Had to change the code to get it to run on adventureworks.
The subquery throws an error. --Edited(I changed the column in the subquery to salesorderdetailID as per adventureworks and runs without issue)
select od.*
from sales.salesorderdetail od
where productid in (select productid from production.product where [ProductSubcategoryID] =1)
go
select od.*
from sales.salesorderdetail od
where productid in (select SalesOrderID from production.product where [ProductSubcategoryID] =1)
GO
April 13, 2016 at 9:51 am
lotusnotes (4/13/2016)
OREDERID comes from the outer table. (Execution plan says 'nested loops- left semi join')
You answered the question yourself. If the column isn't found in the inner query, the outer query is searched. This is a good lesson on why you should use aliases - it avoids confusion.
John
April 13, 2016 at 9:56 am
This is not a bug. It's a feature.
Allowing to reference columns from the outer query allows you to create correlated subqueries. This is specially important when using EXISTS or APPLY.
Due to this feature, it's important that you always qualify your columns (using 2 part names) to know where they come from.
Reference: https://technet.microsoft.com/en-us/library/ms178050(v=sql.105).aspx
April 14, 2016 at 2:06 am
Ok. Thanks guys.
If you use an alias od.orderid it still lets you do it & the overall code still works & returns no rows.
You would hope to spot that is the wrong alias in the inner query.
What is really scary is if you change the code to 'where productid NOT in (select ordered.....'
the whole query returns ALL the rows from the outer table. i.e if you were using the resulting rowset in a merge statement, you would be in big trouble.
April 14, 2016 at 6:56 am
lotusnotes (4/14/2016)
Ok. Thanks guys.If you use an alias od.orderid it still lets you do it & the overall code still works & returns no rows.
You would hope to spot that is the wrong alias in the inner query.
What is really scary is if you change the code to 'where productid NOT in (select ordered.....'
the whole query returns ALL the rows from the outer table. i.e if you were using the resulting rowset in a merge statement, you would be in big trouble.
When you use
select od.*
from sales.orderdetails od
where productid in (select ORDERID from production.products where categoryid=1)
You're telling SQL Server to return from sales.orderdetails where the product is equal to the orderid, but only if there are rows in the table products where categoryid = 1.
Here's a quick example to try to make it clear.
CREATE TABLE Product(
ProductId int,
ProductSubcategoryID int
);
CREATE TABLE SalesOrderDetail(
SalesOrderID int,
ProductId int
);
INSERT INTO Product
VALUES
(1,1),
(2,1),
(3,2),
(4,2),
(5,2),
(6,3);
INSERT INTO SalesOrderDetail
VALUES
(1,1), --Will show up because ProductId = SalesOrderID
(1,2), --Won't show up because ProductId <> SalesOrderID
(2,1), --Won't show up because ProductId <> SalesOrderID
(2,4), --Won't show up because ProductId <> SalesOrderID
(2,3), --Won't show up because ProductId <> SalesOrderID
(3,3); --Will show up because ProductId = SalesOrderID
--Returns 2 rows because we have rows in Product where ProductSubcategoryID = 1
SELECT od.*
FROM SalesOrderDetail od
WHERE ProductId in (SELECT SalesOrderID FROM Product WHERE ProductSubcategoryID = 1);
--Returns 0 rows because we DON'T have rows in Product where ProductSubcategoryID = 6
SELECT od.*
FROM SalesOrderDetail od
WHERE ProductId in (SELECT SalesOrderID FROM Product WHERE ProductSubcategoryID = 6);
GO
DROP TABLE Product;
DROP TABLE SalesOrderDetail;
April 14, 2016 at 9:39 am
lotusnotes (4/14/2016)
Ok. Thanks guys.If you use an alias od.orderid it still lets you do it & the overall code still works & returns no rows.
You would hope to spot that is the wrong alias in the inner query.
What is really scary is if you change the code to 'where productid NOT in (select ordered.....'
the whole query returns ALL the rows from the outer table. i.e if you were using the resulting rowset in a merge statement, you would be in big trouble.
There is very often a need to reference data from the outer tables in a subquery, so if you explicitly type od.orderid you apparently explicitly want to reference that column. SQL Server does not judge your queries, it simply executes them.
If you had wanted to use a column from the subquery itself, you would have typed pr.orderid and then you would have gotten the expected error.
In queries that have more than a single table reference, b est practise is to *ALWAYS* table-qualify *EVERY* column reference throughout the entire query. Either with the table name or with the table alias (for readability, I prefer the latter).
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply