February 26, 2012 at 9:10 am
Jeff Moden (2/26/2012)
BSavoie (2/25/2012)
Actually I have many more search criteria. All are strings. Can I still take the simple approach or does that change things? Thanks as always!
If that's the case, I definitely recommend the dynamic SQL methods that Gail has in her fine blog post. Done correctly, as she demonstrated in her blog, it will be bullet-proof against SQL Injection and fast.
Ah... let me retract that and say "It Depends". You have to look at a bunch of different aspects. I'm working on an example that demonstrates one very important aspect that Gail brought up that I flat forgot about even though it has caused me great pain in the past.
I'll be back...
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2012 at 9:28 am
Honestly, if there are multiple optional parameters I would recommend either dynamic SQL or adding OPTION(RECOMPILE) if you're on SQL 2008 SP2 or above.
While there are various T-SQL methods to allow those kind of queries to seek, the root of the matter is that if the parameters are optional the query has no single optimal execution plan because depending on the parameters passed, some predicates will limit rows more than others, some are better done as seeks and others as filters, etc, etc.
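As a rough sketch only (the Orders table, columns, and procedure name here are made up purely for illustration, not anything from this thread), the OPTION(RECOMPILE) form of a catch-all query might look like this:
CREATE PROCEDURE dbo.SearchOrdersRecompile
    @CustomerID INT      = NULL,
    @OrderDate  DATETIME = NULL
AS
    -- With OPTION(RECOMPILE) on SQL 2008 SP2 or above, the statement is
    -- compiled for the actual parameter values on every execution and the
    -- plan is never cached, so each NULL / non-NULL combination gets a plan
    -- built just for it.
    SELECT OrderID, CustomerID, OrderDate
    FROM dbo.Orders
    WHERE (@CustomerID IS NULL OR CustomerID = @CustomerID)
      AND (@OrderDate  IS NULL OR OrderDate  = @OrderDate)
    OPTION (RECOMPILE);
GO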
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 26, 2012 at 9:44 am
GilaMonster (2/26/2012)
Honestly, if there are multiple optional parameters I would recommend either dynamic SQL or adding OPTION(RECOMPILE) if you're on SQL 2008 SP2 or above. While there are various T-SQL methods to allow those kind of queries to seek, the root of the matter is that if the parameters are optional the query has no single optimal execution plan because depending on the parameters passed, some predicates will limit rows more than others, some are better done as seeks and others as filters, etc, etc.
I'm getting ready to prove why using dynamic SQL isn't always the best option. I'll be back. 🙂
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2012 at 12:10 pm
Gail has hit the nail I wanted to talk about squarely on the head in the following.
GilaMonster (2/26/2012)
While there are various T-SQL methods to allow those kind of queries to seek, the root of the matter is that if the parameters are optional the query has no single optimal execution plan because depending on the parameters passed, some predicates will limit rows more than others, some are better done as seeks and others as filters, etc, etc.
So, to summarize, no matter what you do with optional parameters in "catch all" queries, you could be screwing with future performance.
Let's first cover what Gail said about the method I used which, for simplicity's sake during this discussion, I'll simply call the "MAX/MIN Limit" method, where we conditionally substitute (for example) a value for a NULL date parameter. In the case of the simple requirements that BSavoie gave (summarized as follows)...
1. If the FromDate parameter is NULL and the ThroughDate (think "EndDate" here) is NOT NULL, then we should select everything up to and including the ThroughDate.
2. If the ThroughDate parameter is NULL and the FromDate is NOT NULL, then we should select everything from the FromDate forward, including the FromDate.
3. If both parameters are NULL, select everything.
4. If both parameters have a datetime value, select everything between those two dates including the FromDate and the ThroughDate.
The code I posted for this is as follows; it is SARGable (allows a proper SEEK to occur) and is SQL Injection proof. Here's the code again as a convenient reference, sans the setup of the test table.
DECLARE @FromDate DATETIME,
@ThroughDate DATETIME
;
SELECT @FromDate = '20050101',
@ThroughDate = '20050102';
SELECT SomeDateTime,
SomeID
FROM dbo.JBMTest
WHERE SomeDateTime >= ISNULL(@FromDate, '17530101')
AND SomeDateTime <= ISNULL(@ThroughDate, '99991231')
;
As Gail pointed out, the optimizer has a bit of a tough time estimating the row count and, when the code was executed, it produced EstimateRows = "90000", ActualRows = "128" just like Gail said.
Is that a "bad" thing? The answer is "it could be" and for all the reasons Gail stated. It did, at least, over-estimate the number of rows it was going to have to work with instead of the other way around. If the actual execution plan had produced EstimateRows = "128", ActualRows = "90000", I wouldn't have even posted the code because, IMHO, it's usually when the optimizer grossly under-estimates rows that you really run into problems.
Shifting gears a bit, while it is absolutely true that the use of dynamic SQL can solve this problem, it will only solve the problem if it causes a recompile. THAT type of SQL is normally executed using only EXEC. That method has the additional problem of requiring concatenated code, which opens you up to SQL Injection unless you're very, very careful about how you screen your string-based input parameters. Of course, concatenating string dates derived from a DATETIME parameter is safe because the datatype simply won't allow for any type of non-date strings.
The only truly safe way to use dynamic SQL with optional string-based parameters is to use sp_ExecuteSQL with proper parameterization. That's the rub, though. That type of dynamic SQL does NOT automatically guarantee that the correct execution plan will be used because the other purpose of sp_ExecuteSQL is to reuse existing execution plans if one is available. To wit, sp_ExecuteSQL is one of the major causes of the infamous "Parameter Sniffing" problem.
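As a quick aside before demonstrating that, here's roughly what the difference between the two approaches looks like in code (the @pSomeName variable is made up just for this illustration; SomeName and SomeID come from the test table below):
--===== Hypothetical string parameter, just for this illustration
DECLARE @pSomeName VARCHAR(20);
SELECT  @pSomeName = 'AB';
--===== Concatenation + EXEC: the parameter value becomes part of the
--      statement text, so a crafted value could inject extra SQL.
DECLARE @UnsafeSQL NVARCHAR(MAX);
SELECT @UnsafeSQL = N'SELECT SomeID FROM dbo.JBMTest WHERE SomeName = '''
                  + @pSomeName + N'''';
EXEC (@UnsafeSQL);
--===== sp_ExecuteSQL with proper parameterization: the value is passed as
--      data, never as statement text, so it can't inject anything.
DECLARE @SafeSQL NVARCHAR(MAX);
SELECT @SafeSQL = N'SELECT SomeID FROM dbo.JBMTest WHERE SomeName = @pSomeName';
EXEC sp_ExecuteSQL @SafeSQL, N'@pSomeName VARCHAR(20)', @pSomeName;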
Let's see what I mean. Before I ran any of the following, I restarted the SQL instance on my desktop box (it only takes a couple of seconds there. I DON'T recommend you do it on a box that other people are using :-P) to guarantee that nothing was in cache. Then, I ran the test data generator as we did before. Here's the code again so you don't have to go back in this thread to look for it. And, yes, it contains the very same indexes as those we used before.
/**********************************************************************************************************************
Purpose:
Create a voluminous test table with various types of highly randomized data.
--Jeff Moden
**********************************************************************************************************************/
--===== Do this test in a nice, safe place that everyone has.
USE tempdb
;
--===== Conditionally drop the test table to make reruns easier
IF OBJECT_ID('dbo.JBMTest','U') IS NOT NULL
DROP TABLE dbo.JBMTest
;
--===== Create and populate a 1,000,000 row test table.
-- "SomeID" has a range of 1 to 1,000,000 unique numbers
-- "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
-- "SomeMoney has a range of 0.00 to 100.00 non-unique numbers
-- "SomeDateTime" has a range of >=01/01/2000 and <01/01/2020 non-unique date/times
-- "SomeDate" has a range of >=01/01/2000 and <01/01/2020 non-unique "whole dates"
-- "SomeName" contains random characters at random lengths from 2 to 20 characters
SELECT TOP 1000000
SomeID = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID())) % 50000 + 1,
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())) % (ASCII('Z')-ASCII('A')+1) +ASCII('A'))
+ CHAR(ABS(CHECKSUM(NEWID())) % (ASCII('Z')-ASCII('A')+1) +ASCII('A')),
SomeMoney = CAST(RAND(CHECKSUM(NEWID())) * 100 AS DECIMAL(9,2)), --Note rounding
SomeDateTime = RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2000','2020') + CAST('2000' AS DATETIME),
SomeDate = ABS (CHECKSUM(NEWID())) % DATEDIFF(dd,'2000','2020') + CAST('2000' AS DATETIME),
SomeName = RIGHT(NEWID(),ABS(CHECKSUM(NEWID())) % 19 + 2)
INTO dbo.JBMTest
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
;
ALTER TABLE dbo.JBMTest
ADD CONSTRAINT PK_JBMTest PRIMARY KEY CLUSTERED (SomeID) WITH FILLFACTOR = 90
;
CREATE INDEX IX_SomeDateTime
ON dbo.JBMTest (SomeDateTime)
;
Next, I made a stored procedure to use Injection Proof dynamic SQL following the methods in Gail's wonderful article on the subject of "Catch-all queries". For everyone's convenience, here's the link to that article again. As a side bar, this article should be required reading for anyone who goes near SQL Server. Yeah... it's that good.
http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/
As another sidebar, I didn't have to use as much complexity in this proc as Gail did in her examples because we're going to use the two parameters whether they're NULL or not. That's one of the big differences between the "MAX/MIN Limit" code I wrote and the example Gail uses in her article. The parameters themselves aren't optional. We just need to handle NULL values for them.
Here's the code for the proc I created.
--===== Again, do this in a nice, safe place everyone has.
USE tempdb
;
GO
CREATE PROCEDURE dbo.TestLookup
--===== Declare the I/O parameters
@pFromDate DATETIME,
@pThroughDate DATETIME
AS
--===== Declare the Dynamic SQL variables
DECLARE @sql NVARCHAR(MAX)
;
--===== Create the Dynamic SQL.
-- Since we use both parameters even if they're null,
-- no need to calculate the WHERE clause for this
-- particular demonstration.
SELECT @sql = N'
SELECT SomeDateTime,
SomeID
FROM dbo.JBMTest
WHERE SomeDateTime >= @pFromDate
AND SomeDateTime <= @pThroughDate
';
--===== Replace NULL parameters with the limits of their datatype
SELECT @pFromDate = ISNULL(@pFromDate ,'17530101'),
@pThroughDate = ISNULL(@pThroughDate,'99991231')
;
--===== Execute the Dynamic SQL
EXEC sp_ExecuteSQL @sql,
N'@pFromDate DATETIME, @pThroughDate DATETIME',
@pFromDate, @pThroughDate
;
GO
If you're following along, we now have a table filled with a million rows of data and a new stored procedure on a box we had just previously bounced to ensure there's no chance of cached execution plans.
Let's run some code now and see what happens. Don't forget to turn on the "Actual Execution Plan" if you're doing this with me.
--===== Execute the proc using a small date range first
-- and then follow with a large date range.
-- We'll also throw in the method I used.
-- Then we'll compare execution plans for actual
-- and estimated rows. Don't forget to turn the
-- actual execution plan on!!!
DECLARE @FromDate DATETIME,
@ThroughDate DATETIME
;
--===== Protected Dynamic #1
SELECT @FromDate = '20050101', @ThroughDate = '20050102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
--===== Protected Dynamic #2
SELECT @FromDate = '20080601', @ThroughDate = '20080731';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
--===== MaxMin Limit Protected #1 using max date ranges to replace NULLs
SELECT @FromDate = '20050101', @ThroughDate = '20050102';
SELECT SomeDateTime,
SomeID
FROM dbo.JBMTest
WHERE SomeDateTime >= ISNULL(@FromDate, '17530101')
AND SomeDateTime <= ISNULL(@ThroughDate, '99991231')
;
--===== MaxMin Limit Protected #2 using max date ranges to replace NULLs
SELECT @FromDate = '20080601', @ThroughDate = '20080731';
SELECT SomeDateTime,
SomeID
FROM dbo.JBMTest
WHERE SomeDateTime >= ISNULL(@FromDate, '17530101')
AND SomeDateTime <= ISNULL(@ThroughDate, '99991231')
;
GO
As indicated in the comments in the code above, the first thing we ran with the proc was a narrow date range of one day, followed by a wider run of two months. Then we did the same thing with the "Max/Min Limit" method. Obviously, the code produced 4 execution plans that look identical. Let's do a deeper dive on those.
Looking at the first query which used the dynamic SQL in the proc, we see that the estimated and actual number of rows comes very close to matching and the code operates as expected. For later discussion, please also note that the "% of Batch" has been evaluated as "1%".
The second query also used the dynamic SQL in the proc. It doesn't match the estimated and actual number of rows so nicely. In fact, we can see that some nasty "Parameter Sniffing" is involved because it "estimated" exactly the same low number of rows even though a much larger number of rows was involved. IMHO, this is the "bad" type of Parameter Sniffing because it grossly under-estimated what was required. I'd much rather see it over-estimate, and experience (YMMV) has taught me that code can run much quicker if the number of rows is over-estimated rather than under-estimated. As Gail said, it doesn't matter much in this simple example but could very much matter for larger and more complicated code.
Again, notice the "% of Batch" is "1%".
Now, let's look at the first "Max/Min Limit" example. This is the same example I used previously in the thread that sparked the rowcount controversy. It does, in fact, grossly over-estimate the number of rows it needed to use. Again, I'd rather have that than a gross under-estimation.
Notice the "% of Batch" is "49%" here.
Looking at the second "Max/Min Limit" example, we see the same gross over-estimation. Again, I just don't have a problem with over-estimation. I've found that under-estimation can be a real problem, though.
Once again, the "% of Batch" is "49%" here. I WILL eventually get to this. 🙂
Last, and certainly not least, there's another thing to consider for performance. While it may be OK to use the "Max/Min Limit" method if the correct indexing is in place, there can be an advantage to using the proper dynamic SQL method (see Gail's article for how to do that) to eliminate column lookups in the criteria. The advantage there is that you may not need indexes quite as wide if you have a lot of optional criteria, and skinny indexes are faster than fat ones.
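Just as a hypothetical sketch of that last point (these indexes are NOT part of the test above, so don't create them if you're following along and comparing plans):
--===== A "fat" index wide enough to cover every optional criterion, which
--      you need when every column always appears in the WHERE clause.
CREATE INDEX IX_JBMTest_Wide
    ON dbo.JBMTest (SomeDateTime, SomeInt, SomeLetters2);
--===== "Skinny" single-column indexes, which properly built dynamic SQL can
--      take advantage of because predicates for NULL parameters never make
--      it into the generated WHERE clause.
CREATE INDEX IX_JBMTest_SomeInt      ON dbo.JBMTest (SomeInt);
CREATE INDEX IX_JBMTest_SomeLetters2 ON dbo.JBMTest (SomeLetters2);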
To summarize the points I'm trying to make...
1. I'd rather have the optimizer over-estimate than under-estimate in most cases.
2. The use of proper dynamic SQL won't rebuild an execution plan. In fact, it's designed to reuse execution plans, if available, and that can sometimes lead to performance problems due to Parameter Sniffing just as the missed estimation of rowcounts in the Max/Min Limit method can. The Max/Min Limit method may have a slight advantage because it tends to over-estimate rows rather than reuse plans that may carry a gross under-estimation.
3. The "Actual Execution Plan" can be misleading a bit if you're looking at costs. I did do multiple timed runs of all the SQL and both methods (despite the Parameter Sniffing of the 2nd dynamic SQL run) took turns winning for duration and CPU and IO were virtually the same on this simple example (THAT doesn't mean that Parameter Sniffing isn't a problem. It just means it wasn't a problem on this example). When troubleshooting, you shouldn't necessarily use "% of Batch" as an indication of where a performance problem may exist because the optimizer only evaluated how long it would take to build the dynamic SQL, not execute it.
4. With all the "mays" and "cans" and "coulds" and "shoulds" this thread just pointed out, we really have only proven one thing. "It Depends" continues to prevail, so you might want to try both methods. 😀
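For anyone who wants to repeat the timed runs mentioned in point 3 above, here's a minimal sketch of how to capture the numbers (nothing fancy):
--===== Measure elapsed time, CPU, and reads directly instead of trusting
--      the "% of Batch" figure in the actual execution plan.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT  @FromDate = '20050101', @ThroughDate = '20050102';
--===== Dynamic SQL method
EXEC dbo.TestLookup @FromDate, @ThroughDate;
--===== Max/Min Limit method
SELECT SomeDateTime, SomeID
  FROM dbo.JBMTest
 WHERE SomeDateTime >= ISNULL(@FromDate, '17530101')
   AND SomeDateTime <= ISNULL(@ThroughDate, '99991231');
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;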
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2012 at 12:53 pm
The point of the dynamic SQL approach is to have a number of plans for the specific combinations of parameters that are passed. Having the dynamic SQL contain all of the parameters and be assigned defaults (min and max of the range) if the parameters aren't passed is defeating the whole point of using dynamic SQL at all.
The key is that the two date parameters are optional (or passed as null if they don't apply). Given that, if the parameter is NULL, there's no reason to have it in the query at all. I would much rather do this:
CREATE PROCEDURE dbo.TestLookup
--===== Declare the I/O parameters
@pFromDate DATETIME,
@pThroughDate DATETIME
AS
--===== Declare the Dynamic SQL variables
DECLARE @sql NVARCHAR(MAX), @WhereClause NVARCHAR(MAX) = '';
SELECT @sql = N'
SELECT SomeDateTime,
SomeID
FROM dbo.JBMTest';
IF @pFromDate IS NOT NULL -- we are limiting the resultset based on this parameter
SET @WhereClause = @WhereClause + ' SomeDateTime >= @pFromDate AND';
IF @pThroughDate IS NOT NULL -- we are limiting the resultset based on this parameter
SET @WhereClause = @WhereClause + ' SomeDateTime <= @pThroughDate AND';
IF @WhereClause <> '' -- guard the case where both parameters are NULL
BEGIN
SET @WhereClause = LEFT(@WhereClause, LEN(@WhereClause)-3);
SET @sql = @sql + ' WHERE ' + @WhereClause;
END;
--===== Execute the Dynamic SQL
EXEC sp_ExecuteSQL @sql,
N'@pFromDate DATETIME, @pThroughDate DATETIME',
@pFromDate, @pThroughDate;
GO
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = '20050101', @ThroughDate = '20050102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = NULL, @ThroughDate = '20050102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = '20050101', @ThroughDate = NULL;
EXEC dbo.TestLookup @FromDate, @ThroughDate;
The performance of the dynamic SQL and the min-max is identical in this case. I say this case, because this is a special case of a single column that's been seeked on with one or two predicates. The execution plan is, in fact, a trivial one to generate; there's only one real option (well, two: it could scan the table or the index, but the plan is a Trivial Plan). To test the differences properly needs a scenario where there are multiple columns and multiple optional parameters that result in the resultset being filtered on various different combinations of those columns.
When there are multiple optional parameters across multiple columns, the dynamic SQL (when the existence of parameter values is used to build up the dynamic SQL) allows for the existence of multiple execution plans, each specific to the parameters that existed for the query. In this very simple case, I would get up to 4 execution plans in cache for the query: one where neither parameter was passed, one plan for each of the cases where only one was passed, and one for the case where both were passed. This reduces the chance of parameter sniffing problems (reduces, not eliminates) because the plan that was created for the case where both parameters were passed won't be used for the query where only one was passed, and it doesn't need any recompiles to do this (half the purpose of dynamic SQL over OPTION RECOMPILE is to reduce the need for recompilation to get a sensible plan).
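If you want to see those separate plans for yourself, a quick look at the plan cache (standard DMVs, shown here only as a convenience) lists one cached statement per parameter combination, along with how often each has been reused:
--===== One row per distinct statement text that references the test table,
--      with how many times each cached plan has been reused.
SELECT  st.text,
        cp.usecounts,
        cp.objtype
FROM    sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE   st.text LIKE N'%JBMTest%'
  AND   st.text NOT LIKE N'%dm_exec_cached_plans%' -- don't list this query itself
ORDER BY cp.usecounts DESC;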
I do not have the time right now to work up a more comprehensive test with multiple columns, multiple indexes, and multiple optional parameters; I need to return to the chapter on De Jong and Function Optimisation.
I usually use the AdventureWorks OrderDetails table for these kinds of tests; lots of columns, lots of different data distributions.
p.s. If you check the execution plans for my revised dynamic SQL, you'll see the row estimations are a lot closer to correct if you compare a case where both parameters are passed and where only one is passed. In the version that builds a single piece of dynamic SQL, if a small range is queried first and then as a second execution only one parameter is passed, the row estimation on the 2nd one is horrid (estimated 128, actual 250000).
For the case where first a small range is specified and later a large range, well, that's a classic parameter sniffing problem, and any time a plan is reused (ad-hoc SQL, dynamic SQL, stored procedure) you can get that. It's a different problem with different resolutions (option optimise for, option recompile, variables instead of parameters, etc).
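Purely as illustrative sketches against the thread's test table (the procedure names are just for the sketch, and I'm leaving the variables-instead-of-parameters variation out for brevity), the first two of those resolutions look like this:
--===== OPTIMIZE FOR UNKNOWN: compile from the general density statistics
--      rather than from the sniffed parameter values.
CREATE PROCEDURE dbo.DateRange_OptimizeFor
    @pFromDate DATETIME, @pThroughDate DATETIME
AS
    SELECT SomeDateTime, SomeID
    FROM   dbo.JBMTest
    WHERE  SomeDateTime >= @pFromDate
      AND  SomeDateTime <= @pThroughDate
    OPTION (OPTIMIZE FOR (@pFromDate UNKNOWN, @pThroughDate UNKNOWN));
GO
--===== OPTION (RECOMPILE): compile a fresh plan for the actual values on
--      every execution; the plan is never cached, so a "wrong" plan is
--      never reused.
CREATE PROCEDURE dbo.DateRange_Recompile
    @pFromDate DATETIME, @pThroughDate DATETIME
AS
    SELECT SomeDateTime, SomeID
    FROM   dbo.JBMTest
    WHERE  SomeDateTime >= @pFromDate
      AND  SomeDateTime <= @pThroughDate
    OPTION (RECOMPILE);
GO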
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 26, 2012 at 1:01 pm
p.s. I'd rather have over-estimation than under-estimation, but over-estimation means a fair-to-good chance that the optimiser will use operators that are suited to huge row counts, ones that have high startup costs and low per-row costs, like hash joins and parallelism. That means resources go to waste. For a single query, not a concern. For a busy server with lots of it happening, a much bigger problem.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 26, 2012 at 1:57 pm
GilaMonster (2/26/2012)
The point of the dynamic SQL approach is to have a number of plans for the specific combinations of parameters that are passed.
Your example is a good one for the first run only. Subsequent runs, even on so simple a problem, show what I'm talking about (you and I actually agree on this... optional parameters will always be a problem).
Using your revised stored procedure, here are your 3 queries followed by 3 new ones that reuse the cached plans. The comments on the 3 new ones describe the problems.
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = '20050101', @ThroughDate = '20050102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = NULL, @ThroughDate = '20050102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = '20050101', @ThroughDate = NULL;
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
--===== Once the plan was cached for the smaller range,
-- it was used again for the larger range.
-- Actual number of rows = 250,331
-- Estimated number of rows = 138.627
-- Could get a pretty poor bit of performance from this.
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = '20020101', @ThroughDate = '20070102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
--===== Having a cached execution plan didn't help much here.
-- Actual number of rows = 137
-- Estimated number of rows = 250,316
-- Could get the very same high startup costs that you mentioned.
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = NULL, @ThroughDate = '20000102';
EXEC dbo.TestLookup @FromDate, @ThroughDate;
GO
--===== Having a cached execution plan didn't help much here.
-- Actual number of rows = 4,198
-- Estimated number of rows = 749,821
-- Could get the very same high startup costs that you mentioned.
DECLARE @FromDate DATETIME, @ThroughDate DATETIME;
SELECT @FromDate = '20191201', @ThroughDate = NULL;
EXEC dbo.TestLookup @FromDate, @ThroughDate;
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2012 at 2:15 pm
Jeff Moden (2/26/2012)
Subsequent runs, even on so simple a problem, show what I'm talking about (you and I actually agree on this... optional parameters will always be a problem).
That has nothing whatsoever to do with optional parameters. It's a stock-standard, classic case of a parameter sniffing problem, not at all related to the optional nature of the parameters, related solely to the fact that different parameter values result in different cardinalities. It happens any time you have a cached plan, parameterised ad-hoc SQL, parameterised dynamic SQL, stored procedures. It is a completely different problem to the optional parameter problem and mixing them together is going to just confuse people.
To prove that it has nothing whatsoever to do with the optional parameter problem:
CREATE PROCEDURE SimpleDateRange (@StartDate DATETIME, @EndDate DATETIME)
AS
SELECT * FROM dbo.JBMTest WHERE SomeDateTime BETWEEN @StartDate AND @EndDate
GO
EXEC SimpleDateRange '20050101', '20050102'
GO
EXEC SimpleDateRange '20000101', '20101231'
Run that first with a small range and later with a large and it will show exactly the same effect, with no sign of optional parameters anywhere.
The problem with optional parameters is that depending which ones are passed (and the case that's been tested here is far too simple to show the problem) the optimal execution plan (as in which index to seek on, not just whether to seek or table scan) is radically different and reuse of an inappropriate plan is horridly bad because the estimates based on one set of parameters aren't appropriate for a different set.
Consider a table like an Orders table where a single procedure accepts the following as optional parameters: OrderStatus, CustomerID, OrderDate (2 parameters for a range), ShipDate (2 parameters for a range)
So you could call that procedure
EXEC SearchOrders @OrderStatus = 'Open'
Or
EXEC SearchOrders @CustomerID = 42, @OrderDateStart = '2012/01/01'
Or
EXEC SearchOrders @OrderDateStart = '2011/12/01', @OrderDateEnd = '2012/01/01'
Or
EXEC SearchOrders @OrderStatus = 'Returned', @ShipDateStart = '2011/12/01', @ShipDateEnd = '2012/01/01'
etc, etc
To meaningfully discuss options around optional parameters, there need to be multiple optional parameters for different columns, not a single column with one or two limits to its values. Just about anything will work for a case that simple; there aren't enough options when generating the plan to show meaningful effects.
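For the shape of it only, here's a rough sketch of what such a SearchOrders might look like using the build-up-the-WHERE-clause approach (the Orders table and its columns are invented for the illustration):
CREATE PROCEDURE dbo.SearchOrders
    @OrderStatus    VARCHAR(20) = NULL,
    @CustomerID     INT         = NULL,
    @OrderDateStart DATETIME    = NULL,
    @OrderDateEnd   DATETIME    = NULL,
    @ShipDateStart  DATETIME    = NULL,
    @ShipDateEnd    DATETIME    = NULL
AS
    DECLARE @sql   NVARCHAR(MAX),
            @Where NVARCHAR(MAX);
    SELECT  @sql   = N'SELECT OrderID, OrderStatus, CustomerID, OrderDate, ShipDate FROM dbo.Orders',
            @Where = N'';
    --===== Only predicates for parameters that were actually supplied
    --      make it into the WHERE clause...
    IF @OrderStatus    IS NOT NULL SET @Where = @Where + N' AND OrderStatus = @OrderStatus';
    IF @CustomerID     IS NOT NULL SET @Where = @Where + N' AND CustomerID = @CustomerID';
    IF @OrderDateStart IS NOT NULL SET @Where = @Where + N' AND OrderDate >= @OrderDateStart';
    IF @OrderDateEnd   IS NOT NULL SET @Where = @Where + N' AND OrderDate <= @OrderDateEnd';
    IF @ShipDateStart  IS NOT NULL SET @Where = @Where + N' AND ShipDate >= @ShipDateStart';
    IF @ShipDateEnd    IS NOT NULL SET @Where = @Where + N' AND ShipDate <= @ShipDateEnd';
    IF @Where <> N''
        SET @sql = @sql + N' WHERE' + STUFF(@Where, 1, 4, N''); -- strip the leading " AND"
    --===== ...but every parameter is still passed to sp_ExecuteSQL, which is
    --      perfectly legal even when a parameter doesn't appear in the text.
    EXEC sp_ExecuteSQL @sql,
         N'@OrderStatus VARCHAR(20), @CustomerID INT,
           @OrderDateStart DATETIME, @OrderDateEnd DATETIME,
           @ShipDateStart DATETIME, @ShipDateEnd DATETIME',
         @OrderStatus, @CustomerID, @OrderDateStart, @OrderDateEnd,
         @ShipDateStart, @ShipDateEnd;
GO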
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 26, 2012 at 3:17 pm
GilaMonster (2/26/2012)
Jeff Moden (2/26/2012)
Subsequent runs, even on so simple a problem, show what I'm talking about (you and I actually agree on this... optional parameters will always be a problem).That has nothing whatsoever to do with optional parameters.
Heh... gosh... even when I agree with you, you find something else to disagree with. 😛
The reason why I said "optional parameters" is because that's why we were using dynamic SQL in the first place. But, call it what you will, using proper dynamic SQL will always be a possible problem because of "Parameter Sniffing".
So far as using a more complicated query goes, if you can get such a wide disparity on the estimated vs actual row counts, I don't believe we need a more complicated example to prove that Parameter Sniffing will be a problem. I do agree, and have stated several times, that the elimination of "not used" parameters will make it easier to hit a narrower index and, yes, that could definitely make for a better execution plan... provided that Parameter Sniffing doesn't ball it all up.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 26, 2012 at 11:13 pm
Jeff, I'm a bit confused. Are you saying that parameter sniffing is more likely with dynamic sql? I don't understand that to be the case (and I would love to be corrected if I am mistaken in this), and so I don't understand how you are relating the sniffing to the dynamic sql. It would seem the sniffing here is an inherent result of the cardinality disparity of the queries being run, regardless of the origin of the statement, dynamic or static.
Please take this as the honest question it is. The topic of plan generation specifically in regards to dynamic statements is something I'm very much interested in at the moment.
February 27, 2012 at 2:44 am
bteraberry (2/26/2012)
Are you saying that parameter sniffing is more likely with dynamic sql?
Parameter sniffing is the ability of the optimiser to sniff values of parameters and to compile queries based on the row counts estimated for those parameter values. It's usually a very good thing.
Parameter sniffing problems occur any time a plan is compiled (for explicitly parameterised ad-hoc queries, for auto-parameterised ad-hoc queries, for explicitly parameterised dynamic SQL, for auto-parameterised dynamic SQL, for stored procedures/functions/triggers) and that plan is later reused with different parameter values that result in different cardinalities.
There is no difference in the way SQL compiles a piece of parameterised ad-hoc or dynamic SQL over how it compiles a piece of SQL in a stored procedure. The only real difference is in reuse: a stored procedure is matched on its object_id, while a piece of dynamic or ad-hoc SQL is matched on a hash of the string.
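One way to see the hash-of-the-string matching for yourself (illustration only; any trivial statement will do):
--===== These two calls do identical work, but because ad-hoc and dynamic SQL
--      plans are matched on a hash of the exact statement text, the extra
--      space before WHERE in the second call produces a second, separate
--      cached plan instead of reusing the first one.
EXEC sp_ExecuteSQL N'SELECT SomeID FROM dbo.JBMTest WHERE SomeID = @ID',
                   N'@ID INT', 1;
EXEC sp_ExecuteSQL N'SELECT SomeID FROM dbo.JBMTest  WHERE SomeID = @ID',
                   N'@ID INT', 1;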
As I said earlier, the estimated/actual row count disparity is not an artifact of dynamic SQL, it's just as likely and possible with a procedure like this
CREATE PROCEDURE SimpleDateRange (@StartDate DATETIME, @EndDate DATETIME)
AS
SELECT * FROM SomeTable WHERE SomeDate BETWEEN @StartDate AND @EndDate
GO
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 27, 2012 at 5:21 am
Jeff Moden (2/25/2012)
HowardW (2/24/2012)
Artoo22 (2/24/2012)
SELECT *
FROM #t
WHERE DateCreated >= ISNULL(@FromDate, DateCreated)
AND DateCreated <= ISNULL(@ThroughDate, DateCreated)
It might look neater, but the use of scalar functions (ISNULL) guarantees that indexes can't be used to fulfil the query.
The word "guarantees" may be a bit too strong in this case. Please see the following QOTD...
Hehe. Yes, I wasn't saying that using an ISNULL on a column caused other predicates in the query to be non SARGable 😉 That seems to be the gist of that QOTD, and is of course correct (if a little silly :-P)
It does indeed guarantee it for the query I quoted in my reply.
And obviously this only applies to using scalar functions on column values rather than comparing column values to the results of a scalar function - that is obviously perfectly SARGable.
February 27, 2012 at 5:42 am
GilaMonster (2/27/2012)
Parameter sniffing problems occur any time a plan is compiled (for explicitly parameterised ad-hoc queries, for auto-parameterised ad-hoc queries, for explicitly parameterised dynamic SQL, for auto-parameterised dynamic SQL, for stored procedures/functions/triggers) and that plan is later reused with different parameter values that result in different cardinalities.There is no difference in the way SQL compiles a piece of parameterised ad-hoc or dynamic SQL over how it compiles a piece of SQL in a stored procedure. The only real difference is in reuse, a stored procedure matches on Object_id, a piece of dynamic or ad-hoc SQL on a hash of the string.
Presumably, the one exception to this is the OPTION(RECOMPILE) optimisation in 2k8 SP2 onwards in that it forces it not to parameterise? E.g. you're always getting the best plan it can generate for those particular input parameters, with the downside that it will have to compile a plan for every single execution.
February 27, 2012 at 6:00 am
HowardW (2/27/2012)
GilaMonster (2/27/2012)
Parameter sniffing problems occur any time a plan is compiled (for explicitly parameterised ad-hoc queries, for auto-parameterised ad-hoc queries, for explicitly parameterised dynamic SQL, for auto-parameterised dynamic SQL, for stored procedures/functions/triggers) and that plan is later reused with different parameter values that result in different cardinalities. There is no difference in the way SQL compiles a piece of parameterised ad-hoc or dynamic SQL over how it compiles a piece of SQL in a stored procedure. The only real difference is in reuse, a stored procedure matches on Object_id, a piece of dynamic or ad-hoc SQL on a hash of the string.
Presumably, the one exception to this is the OPTION(RECOMPILE) optimisation in 2k8 SP2 onwards in that it forces it not to parameterise?
That's still parameterised (well, it depends on whether the query you specify that on is a parameterised one or not); it still uses parameter sniffing (it looks at the values of the parameters and variables to generate a plan). Because such a plan is never cached, it can't be reused and you can't have the problem where a cached plan is used inappropriately.
Option recompile actually allows more parameter sniffing because the optimiser can sniff variables as well as parameters.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
February 27, 2012 at 7:05 am
bteraberry (2/26/2012)
Jeff, I'm a bit confused. Are you saying that parameter sniffing is more likely with dynamic sql?
No. Absolutely not. One of the statements made was that dynamic SQL is a solution for parameter sniffing. My point there was that if you do dynamic SQL properly with sp_ExecuteSQL so that it's SQL Injection proof, it's subject to parameter sniffing just like anything else.
--Jeff Moden
Change is inevitable... Change for the better is not.