Re: Short-circuiting a query

Question

Post reply

Re: Short-circuiting a query

jlp3630

Hall of Fame

Points: 3243
More actions
September 29, 2008 at 8:29 am

#198664

Is there anyway to "short-circuit" a query? The following query generates the same "actual execution plan" regardless of the @brand_id value (null or non-null value):
DECLARE @brand_id INT;
SET @brand_id = null;
SELECT pk_product_id
FROM tbl_products
WHERE @brand_id IS NULL OR fk_brand_id IN (SELECT pk_brand_id
FROMtbl_brands AS b WITH (NOLOCK)
WHEREb.fk_brand_map_id = @brand_id
ORb.pk_brand_id = @brand_id)
Ideally, if @brand_id IS NULL, I don't want to evaulate the inner query (SELECT ... FROM tbl_brands)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 1

This should do the trick:

[font="Courier New"]DECLARE @brand_id INT

IF @brand_id IS NOT NULL

BEGIN

SELECT p.pk_product_id

FROM tbl_brands b (NOLOCK)

INNER JOIN tbl_products p ON p.fk_brand_id = b.pk_brand_id

WHERE b.fk_brand_map_id = @brand_id OR b.pk_brand_id = @brand_id)

END

ELSE

BEGIN

SELECT pk_product_id

FROM tbl_products

WHERE 0 = 1

END

[/font]

Please note that the code is untested.

Cheers

ChrisM

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 2

jlp3630 (9/29/2008)
Is there anyway to "short-circuit" a query? The following query generates the same "actual execution plan" regardless of the @brand_id value (null or non-null value):
DECLARE @brand_id INT;
SET @brand_id = null;
SELECT pk_product_id
FROM tbl_products
WHERE @brand_id IS NULL OR fk_brand_id IN (SELECT pk_brand_id
FROMtbl_brands AS b WITH (NOLOCK)
WHEREb.fk_brand_map_id = @brand_id
ORb.pk_brand_id = @brand_id)
Ideally, if @brand_id IS NULL, I don't want to evaulate the inner query (SELECT ... FROM tbl_brands)

The better approach might be to put an IF clause in so that you validate the variable prior to running the query. Then you don't have to use an OR statement either. Plus, you can do a JOIN rather than the IN statement, which is going to run very poorly in most situations. Something like this (untested):

IF NOT (@brand_id IS NULL)

BEGIN

SELECT p.pk_product_id

FROM tbl_products AS p

JOIN tbl_brands AS b

ON p.fk_brand_id = b.pk_brand_id

WHERE b.fk_brand_map_id = @brand_id

OR b.pk_brand_id = @brand_id

END

But that second OR is still going to cause you some difficulty.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

Spooky, Grant. Very spooky 😎

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 4

Funny that we both fixed the JOIN as well as putting in the IF.

Well, normally I'd say to stay out of my head, but since you posted first, I'll try to stay out of yours in the future.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

jlp3630 Hall of Fame Points: 3243 More actions · Answer 5

Are there any situations (resultset or performance) where using IN is preferred over JOIN?

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 6

It depends...

IN basically is an OR statement. It's one of those hidden RBAR (row-by-agonizing-row, TM: Jeff Moden) operations because it's going to check each row of the IN clause against all the rows of the SELECT statement. If both sets of data are fairly small, three rows to three rows for example, then it's not necessarily a problem and it might even perform faster than using a JOIN (maybe). Circumstances like that are fairly rare. Much more common is to see hundreds or thousands of rows being compared to hundreds or thousands of rows... one... at... a.... time... Which, I'm sure you can imagine, isn't a good thing.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

Grant Fritchey (9/30/2008)
It depends...
IN basically is an OR statement. It's one of those hidden RBAR (row-by-agonizing-row, TM: Jeff Moden) operations because it's going to check each row of the IN clause against all the rows of the SELECT statement.

Not necessarily. Most of the time the optimiser will convert a subquery (even a correlated one) into a join. In this case, the subquery isn't correlated (no reference to the outer query) so there's no reason it has to be run once per row of the outer.

Consider these two (adventureworks)

select * from Sales.SalesOrderDetail where ProductID in (select ProductID from Production.Product where ListPrice > 100)

select sod.* from Sales.SalesOrderDetail sod inner join Production.Product p on sod.ProductID = p.ProductID where ListPrice > 100

They have an identical output and an identical execution plan. If you run via SQL 2008 management studio (so you can see the number of executes that occur per operator) they both show 1 execution for both the sales order detail and the products tables.

If they're run with stats IO on, I get the following logical reads for both.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0.
Table 'SalesOrderDetail'. Scan count 1, logical reads 1240, physical reads 0.
Table 'Product'. Scan count 1, logical reads 15, physical reads 0.

I spent the entire weekend trying to get an example of a correlated subquery where the subquery was run once per row of the outer query, and I failed. Every query I tried, the optimiser managed to convert the subquery into a join that was run only once.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 8

I must be regurgitating bad stuff again. Sorry. It's when it's a delimited list that it behaves like that, right?

However, I did a little looking around IN clauses don't work as well as JOINs. Here's an example (cleaned up from here: http://tgrignon.wordpress.com/2008/01/21/sql-server-performance-and-the-in-clause/ ):

SELECT DISTINCT

c.LastName

,c.FirstName

FROM Person.Person c

JOIN HumanResources.Employee e

ON e.BusinessEntityId = c.BusinessEntityID

WHERE e.BusinessEntityID IN (

SELECT SalesPersonID

FROM Sales.SalesOrderHeader

WHERE SalesOrderID IN (

SELECT SalesOrderID

FROM Sales.SalesOrderDetail

WHERE ProductID IN (SELECT ProductID

FROM Production.Product p

WHERE ProductNumber LIKE 'FW%'))) ;

SELECT DISTINCT

c.LastName

,c.FirstName

FROM Person.Person c

JOIN HumanResources.Employee e

ON e.BusinessEntityId = c.BusinessEntityId

JOIN Sales.SalesOrderHeader AS soh

ON e.BusinessEntityID = soh.SalesPersonID

JOIN Sales.SalesOrderDetail AS sod

ON soh.SalesOrderID = sod.SalesOrderID

JOIN Production.Product AS p

ON sod.ProductID = p.ProductID

WHERE p.ProductNumber LIKE 'FW%'

Now you get quite different plans and quite different performance.

I also this bit on issues around the lazy spool using NOT IN: http://sql-server-performance.com/Community/forums/p/2422/13885.aspx#13885

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Grant Fritchey (9/30/2008)
I must be regurgitating bad stuff again. Sorry. It's when it's a delimited list that it behaves like that, right?

You mean something like this?

select ProductID from Sales.SalesOrderDetail where ProductID in (717,718,719,722,723,725,726,727,729,730,732,733,736,738,739,741,742,743,744,745)

However, I did a little looking around IN clauses don't work as well as JOINs. Here's an example (cleaned up from here: http://tgrignon.wordpress.com/2008/01/21/sql-server-performance-and-the-in-clause/ ):
<snip>
Now you get quite different plans and quite different performance.

Oh, there are cases where that happens, correlated and noncorrelated subqueries. It's just not a given for every case that involves a subquery of any form.

And even in that example, the subquery is not executed row by row. In the exec plan I get, the seek on product only executes once. The seek on Sales order detail runs 16 times, but that's because of the nested loop join to product (Sales Order Detail is the inner table in the loop) and the 16 rows returned from product.

If the subqueries were run once per row of the outer query, we would expect 3806 reads of SalesOrderDetail, because that's the number of rows returned by salesorderheader, and a really large number of reads of product.

I also this bit on issues around the lazy spool using NOT IN: http://sql-server-performance.com/Community/forums/p/2422/13885.aspx#13885

Interesting, thanks.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Grant Fritchey SSC Guru Points: 398882 More actions · Answer 10

Are you kidding, thank you. I hate posting bad information.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 11

Don't we all.

Oh, and on SQL 2008, I get the following results for those two queries:

Subqueries: (35% cost)

Table 'Person'. Scan count 0, logical reads 45

Table 'Worktable'. Scan count 0, logical reads 0

Table 'SalesOrderDetail'. Scan count 16, logical reads 34

Table 'Product'. Scan count 1, logical reads 2

Table 'SalesOrderHeader'. Scan count 290, logical reads 641

Table 'Employee'. Scan count 1, logical reads 2

SQL Server Execution Times:

CPU time = 15 ms, elapsed time = 195 ms.

Joins: (65% cost)

Table 'Person'. Scan count 0, logical reads 2615

Table 'Worktable'. Scan count 0, logical reads 0

Table 'SalesOrderHeader'. Scan count 1, logical reads 57

Table 'SalesOrderDetail'. Scan count 16, logical reads 34

Table 'Product'. Scan count 1, logical reads 2

Table 'Employee'. Scan count 1, logical reads 2

SQL Server Execution Times:

CPU time = 16 ms, elapsed time = 344 ms.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

jlp3630 Hall of Fame Points: 3243 More actions · Answer 12

Grant Fritchey (9/29/2008)
IF NOT (@brand_id IS NULL)
BEGIN
SELECT p.pk_product_id
FROM tbl_products AS p
JOIN tbl_brands AS b
ON p.fk_brand_id = b.pk_brand_id
WHERE b.fk_brand_map_id = @brand_id
OR b.pk_brand_id = @brand_id
END
But that second OR is still going to cause you some difficulty.

Assuming I have indices on b.fk_brand_map_id and b.pk_brand_id, would using the UNION operator get around the problem of the second OR.

SELECT p.pk_product_id

FROM tbl_products AS p

JOIN tbl_brands AS b

ON p.fk_brand_id = b.pk_brand_id

WHERE b.fk_brand_map_id = @brand_id

UNION

SELECT p.pk_product_id

FROM tbl_products AS p

JOIN tbl_brands AS b

ON p.fk_brand_id = b.pk_brand_id

WHERE b.pk_brand_id = @brand_id

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 13

Maybe. It's worth testing, however in 2005 and higher, the optimiser has the same options for optimising an 'or' as it has optimising a 'union', and there's a good chance that the two will have a very similar plan.

The conversion or an or to a union was a common trick in SQL 2000, as the optimiser had several ways to do a 'union' and much fewer for an 'or'

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass