September 29, 2008 at 8:29 am
Is there anyway to "short-circuit" a query? The following query generates the same "actual execution plan" regardless of the @brand_id value (null or non-null value):
DECLARE @brand_id INT;
SET @brand_id = null;
SELECT pk_product_id
FROM tbl_products
WHERE @brand_id IS NULL OR fk_brand_id IN (SELECT pk_brand_id
FROMtbl_brands AS b WITH (NOLOCK)
WHEREb.fk_brand_map_id = @brand_id
ORb.pk_brand_id = @brand_id)
Ideally, if @brand_id IS NULL, I don't want to evaulate the inner query (SELECT ... FROM tbl_brands)
September 29, 2008 at 8:58 am
This should do the trick:
[font="Courier New"]DECLARE @brand_id INT
IF @brand_id IS NOT NULL
BEGIN
SELECT p.pk_product_id
FROM tbl_brands b (NOLOCK)
INNER JOIN tbl_products p ON p.fk_brand_id = b.pk_brand_id
WHERE b.fk_brand_map_id = @brand_id OR b.pk_brand_id = @brand_id)
END
ELSE
BEGIN
SELECT pk_product_id
FROM tbl_products
WHERE 0 = 1
END
[/font]
Please note that the code is untested.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 29, 2008 at 9:01 am
jlp3630 (9/29/2008)
Is there anyway to "short-circuit" a query? The following query generates the same "actual execution plan" regardless of the @brand_id value (null or non-null value):DECLARE @brand_id INT;
SET @brand_id = null;
SELECT pk_product_id
FROM tbl_products
WHERE @brand_id IS NULL OR fk_brand_id IN (SELECT pk_brand_id
FROMtbl_brands AS b WITH (NOLOCK)
WHEREb.fk_brand_map_id = @brand_id
ORb.pk_brand_id = @brand_id)
Ideally, if @brand_id IS NULL, I don't want to evaulate the inner query (SELECT ... FROM tbl_brands)
The better approach might be to put an IF clause in so that you validate the variable prior to running the query. Then you don't have to use an OR statement either. Plus, you can do a JOIN rather than the IN statement, which is going to run very poorly in most situations. Something like this (untested):
IF NOT (@brand_id IS NULL)
BEGIN
SELECT p.pk_product_id
FROM tbl_products AS p
JOIN tbl_brands AS b
ON p.fk_brand_id = b.pk_brand_id
WHERE b.fk_brand_map_id = @brand_id
OR b.pk_brand_id = @brand_id
END
But that second OR is still going to cause you some difficulty.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 29, 2008 at 9:07 am
Spooky, Grant. Very spooky 😎
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
September 29, 2008 at 9:11 am
Funny that we both fixed the JOIN as well as putting in the IF.
Well, normally I'd say to stay out of my head, but since you posted first, I'll try to stay out of yours in the future.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 30, 2008 at 8:25 am
Are there any situations (resultset or performance) where using IN is preferred over JOIN?
September 30, 2008 at 8:32 am
It depends...
IN basically is an OR statement. It's one of those hidden RBAR (row-by-agonizing-row, TM: Jeff Moden) operations because it's going to check each row of the IN clause against all the rows of the SELECT statement. If both sets of data are fairly small, three rows to three rows for example, then it's not necessarily a problem and it might even perform faster than using a JOIN (maybe). Circumstances like that are fairly rare. Much more common is to see hundreds or thousands of rows being compared to hundreds or thousands of rows... one... at... a.... time... Which, I'm sure you can imagine, isn't a good thing.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 30, 2008 at 8:55 am
Grant Fritchey (9/30/2008)
It depends...IN basically is an OR statement. It's one of those hidden RBAR (row-by-agonizing-row, TM: Jeff Moden) operations because it's going to check each row of the IN clause against all the rows of the SELECT statement.
Not necessarily. Most of the time the optimiser will convert a subquery (even a correlated one) into a join. In this case, the subquery isn't correlated (no reference to the outer query) so there's no reason it has to be run once per row of the outer.
Consider these two (adventureworks)
select * from Sales.SalesOrderDetail where ProductID in (select ProductID from Production.Product where ListPrice > 100)
select sod.* from Sales.SalesOrderDetail sod inner join Production.Product p on sod.ProductID = p.ProductID where ListPrice > 100
They have an identical output and an identical execution plan. If you run via SQL 2008 management studio (so you can see the number of executes that occur per operator) they both show 1 execution for both the sales order detail and the products tables.
If they're run with stats IO on, I get the following logical reads for both.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 1240, physical reads 0.
Table 'Product'. Scan count 1, logical reads 15, physical reads 0.
I spent the entire weekend trying to get an example of a correlated subquery where the subquery was run once per row of the outer query, and I failed. Every query I tried, the optimiser managed to convert the subquery into a join that was run only once.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 30, 2008 at 9:49 am
I must be regurgitating bad stuff again. Sorry. It's when it's a delimited list that it behaves like that, right?
However, I did a little looking around IN clauses don't work as well as JOINs. Here's an example (cleaned up from here: http://tgrignon.wordpress.com/2008/01/21/sql-server-performance-and-the-in-clause/ ):
SELECT DISTINCT
c.LastName
,c.FirstName
FROM Person.Person c
JOIN HumanResources.Employee e
ON e.BusinessEntityId = c.BusinessEntityID
WHERE e.BusinessEntityID IN (
SELECT SalesPersonID
FROM Sales.SalesOrderHeader
WHERE SalesOrderID IN (
SELECT SalesOrderID
FROM Sales.SalesOrderDetail
WHERE ProductID IN (SELECT ProductID
FROM Production.Product p
WHERE ProductNumber LIKE 'FW%'))) ;
SELECT DISTINCT
c.LastName
,c.FirstName
FROM Person.Person c
JOIN HumanResources.Employee e
ON e.BusinessEntityId = c.BusinessEntityId
JOIN Sales.SalesOrderHeader AS soh
ON e.BusinessEntityID = soh.SalesPersonID
JOIN Sales.SalesOrderDetail AS sod
ON soh.SalesOrderID = sod.SalesOrderID
JOIN Production.Product AS p
ON sod.ProductID = p.ProductID
WHERE p.ProductNumber LIKE 'FW%'
Now you get quite different plans and quite different performance.
I also this bit on issues around the lazy spool using NOT IN: http://sql-server-performance.com/Community/forums/p/2422/13885.aspx#13885
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 30, 2008 at 10:07 am
Grant Fritchey (9/30/2008)
I must be regurgitating bad stuff again. Sorry. It's when it's a delimited list that it behaves like that, right?
You mean something like this?
select ProductID from Sales.SalesOrderDetail where ProductID in (717,718,719,722,723,725,726,727,729,730,732,733,736,738,739,741,742,743,744,745)
However, I did a little looking around IN clauses don't work as well as JOINs. Here's an example (cleaned up from here: http://tgrignon.wordpress.com/2008/01/21/sql-server-performance-and-the-in-clause/ ):
<snip>
Now you get quite different plans and quite different performance.
Oh, there are cases where that happens, correlated and noncorrelated subqueries. It's just not a given for every case that involves a subquery of any form.
And even in that example, the subquery is not executed row by row. In the exec plan I get, the seek on product only executes once. The seek on Sales order detail runs 16 times, but that's because of the nested loop join to product (Sales Order Detail is the inner table in the loop) and the 16 rows returned from product.
If the subqueries were run once per row of the outer query, we would expect 3806 reads of SalesOrderDetail, because that's the number of rows returned by salesorderheader, and a really large number of reads of product.
I also this bit on issues around the lazy spool using NOT IN: http://sql-server-performance.com/Community/forums/p/2422/13885.aspx#13885
Interesting, thanks.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 30, 2008 at 10:13 am
Are you kidding, thank you. I hate posting bad information.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
September 30, 2008 at 10:18 am
Don't we all.
Oh, and on SQL 2008, I get the following results for those two queries:
Subqueries: (35% cost)
Table 'Person'. Scan count 0, logical reads 45
Table 'Worktable'. Scan count 0, logical reads 0
Table 'SalesOrderDetail'. Scan count 16, logical reads 34
Table 'Product'. Scan count 1, logical reads 2
Table 'SalesOrderHeader'. Scan count 290, logical reads 641
Table 'Employee'. Scan count 1, logical reads 2
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 195 ms.
Joins: (65% cost)
Table 'Person'. Scan count 0, logical reads 2615
Table 'Worktable'. Scan count 0, logical reads 0
Table 'SalesOrderHeader'. Scan count 1, logical reads 57
Table 'SalesOrderDetail'. Scan count 16, logical reads 34
Table 'Product'. Scan count 1, logical reads 2
Table 'Employee'. Scan count 1, logical reads 2
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 344 ms.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 1, 2008 at 10:16 am
Grant Fritchey (9/29/2008)
IF NOT (@brand_id IS NULL)
BEGIN
SELECT p.pk_product_id
FROM tbl_products AS p
JOIN tbl_brands AS b
ON p.fk_brand_id = b.pk_brand_id
WHERE b.fk_brand_map_id = @brand_id
OR b.pk_brand_id = @brand_id
END
But that second OR is still going to cause you some difficulty.
Assuming I have indices on b.fk_brand_map_id and b.pk_brand_id, would using the UNION operator get around the problem of the second OR.
SELECT p.pk_product_id
FROM tbl_products AS p
JOIN tbl_brands AS b
ON p.fk_brand_id = b.pk_brand_id
WHERE b.fk_brand_map_id = @brand_id
UNION
SELECT p.pk_product_id
FROM tbl_products AS p
JOIN tbl_brands AS b
ON p.fk_brand_id = b.pk_brand_id
WHERE b.pk_brand_id = @brand_id
October 1, 2008 at 10:32 am
Maybe. It's worth testing, however in 2005 and higher, the optimiser has the same options for optimising an 'or' as it has optimising a 'union', and there's a good chance that the two will have a very similar plan.
The conversion or an or to a union was a common trick in SQL 2000, as the optimiser had several ways to do a 'union' and much fewer for an 'or'
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply