January 4, 2008 at 7:54 am
Simple query on a table with > 10 million rows. Non-clustered index with TRANS_DATE (smalldatetime) field as the first column.
This query uses the index and is lightning fast:
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL
where TRANS_DATE between '2007-12-01 00:00:00' and '2007-12-01 23:59:59'
This query uses a table scan and takes > 60 seconds:
declare @ddate varchar(20)
declare @dstartdate varchar(20)
declare @denddate varchar(20)
set @dDate = '2007-12-01'
set @dStartDate = @dDate + ' 00:00:00'
set @dEndDate = @dDate + ' 23:59:59'
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL
where trans_date between @dStartDate and @dEndDate
@dDate would actually be a string passed into the sp, above is just for illustrative purposes. Why is query optimizer not using the index when I add the time portion onto the date? I've tried CONVERT and CAST with the same result. Any suggestions are appreciated!
Steve
January 4, 2008 at 8:12 am
It's not the concat that's the problem, it's the variable usage.
If a query uses either parameters or hard coded values, the optimiser knows, at the point that it's compiling and optimising the query what the values are. It can use those values, along with the column statistics to get an estimate of the number of rows that the query will affect. Based on the number of rows, if can pick a good execution plan. If only a very small percentage of the table will be returned, it picks a index seek.
If the query uses variables, the optimiser can't see the value of those variables at compile and optimise time. Since it can't see the values, it has no idea if the between is going to affect 1 row, or the entire table. The optimiser makes a guess that 1/3 of the table will be returned. Based on that guess it picks a plan. Since it doesn't know that only a few rows will be returned, it picks a plan optimal for around 3 million rows - table scan.
The best solution would be to pass to the stored proc, the two dates that you're using in the query. If you can't change the proc, try doing the date manipulation in the where clause.
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL
where trans_date >= @dDate and trans_date < DATEADD(dd,1,@dDate)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 4, 2008 at 9:34 am
Thanks for the input. Unfortunately, that didn't help either. Seems that manipulating the date at all is enough for SQL to ignore the index. I can't pass the actual dates to the sp because this is running in a scheduled DTS package with getdate as the reference. However, I finally found a solution by using an index hint. This works perfectly:
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL (INDEX=IX_TRAN_FULL)
where TRANS_DATE >= @dStartDate and TRANS_DATE <= @dEndDate
January 4, 2008 at 9:16 pm
If memory serves correctly, I believe that form of index hint is "deprecated". Try this, instead...
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL WITH (INDEX(IX_TRAN_FULL))
where TRANS_DATE >= @dStartDate and TRANS_DATE <= @dEndDate
Also, if your dates have a time other than midnight, then the best way to do this is...
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL WITH (INDEX(IX_TRAN_FULL))
where TRANS_DATE >= @dStartDate and TRANS_DATE < @dEndDate + 1
Better to do it that way most of the time because requirements on dates can easily change...
--Jeff Moden
Change is inevitable... Change for the better is not.
January 5, 2008 at 8:23 am
Steve Ervolino (1/4/2008)
Thanks for the input. Unfortunately, that didn't help either. Seems that manipulating the date at all is enough for SQL to ignore the index.
Hmmm. Sometimes I love parameter sniffing and sometimes I hate it.
Ok, here's another possibility. Create a wrapper proc that the DTS calls. The wrapper proc recalculates the dates and calls a second proc that takes 2 parameters. something like
CREATE PROCEDURE OuterProc @dt
as
declare @dstartdate datetime
declare @denddate datetime
set @dStartDate = dateadd(dd, datediff(dd,0, @dt),0)
set @dEndDate = dateadd(dd, datediff(dd,0, @TheDate) + 1,0)
EXEC InnerProc @dStartDate, @dEndDate
GO
CREATE PROCEDURE InnerProc @dStartDate, @dEndDate
AS
select ACCT_NUM, OPERATOR_NUM, BRANCH_NUM
from TRANS_J_DATA_FULL
where trans_date >= @dStartDate and trans_date < @dEndDate
GO
The DTS package calls OuterProc. That should get you an optimal plan, without handicapping the query optimiser.
The problem with index hints are twofold. Firstly, if that index is ever renamed or dropped, your procedure will break. Second, while you can force SQL to us a particular index, you cannot force it to seek on that index. It may decide to do an index scan. Also, if things change so that that index is no longer the optimal one to use, the query will still use the non-optimal index.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 5, 2008 at 8:05 pm
Steve,
index on a column used in "range" selection ("BETWEEN", ">" or/and "<", TRANS_DATE in your case) must be clustered.
Fix it and the SP will be always fast.
_____________
Code for TallyGenerator
January 5, 2008 at 8:29 pm
I see another potential problem here...what is the datatype of your column in the table? You are declaring your variables as type varchar, but if your columns are a datetime, then the system has to convert behind the scenes. If your column datatypes are datetime, then try changing the datatypes of your variables to match. Perhaps the reason the first one works with a "varchar literal" is that behind the scenes, it makes them dates.
Let me know if I'm off base.
Hope this helps.
January 6, 2008 at 10:52 pm
Sergiy (1/5/2008)
Steve,index on a column used in "range" selection ("BETWEEN", ">" or/and "<", TRANS_DATE in your case) must be clustered.
Fix it and the SP will be always fast.
I don't agree with you there. A covering noncluster works as well (or better) than a cluster for a range selection. A non-covering nonclustered may or may not be used for a range, depending on how many rows qualify. Lookups to the cluster/heap are expensive
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 11, 2009 at 5:18 am
GilaMonster (1/6/2008)
Sergiy (1/5/2008)
Steve,index on a column used in "range" selection ("BETWEEN", ">" or/and "<", TRANS_DATE in your case) must be clustered.
Fix it and the SP will be always fast.
I don't agree with you there. A covering noncluster works as well (or better) than a cluster for a range selection. A non-covering nonclustered may or may not be used for a range, depending on how many rows qualify. Lookups to the cluster/heap are expensive
Apologies for dragging up an old thread, but I've just come across this myself.
A simple query on a Date field and a Bit field didn't use the covering non-clustered index
SELECT * FROM MyTable Where MyDateField < '2009-06-01' and MyBitField = 0 -- DOES A TABLE SCAN
It did use the non-clustered index if I change the MyDateField condition to an "="
SELECT * FROM MyTable Where MyDateField < '2009-06-01' and MyBitField = 0 -- USES THE INDEX
Turns out the table didn't have any clustered indexes - would this affect whether the index was used or not ?
June 11, 2009 at 5:28 am
Joseph Fallon what order are the fields in your in covering index, this will make a difference.
The other thing to look out for is that I'm pretty sure, but happy to be told otherwise, that because you have a select * , the optimizer might think it's cheaper to scan the table than to use the index and then have to do a Look-up afterwords...
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]
June 11, 2009 at 5:52 am
Christopher Stobbs (6/11/2009)
Joseph Fallon what order are the fields in your in covering index, this will make a difference.
It will, because one of the columns is an inequality.
http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/
p.s. Please in future start a new thread for a new problem. Thanks
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 11, 2009 at 5:56 am
Gail am I correcting in saying and I think I might have learn't or if I am wrong missunderstood, but ify ou have two columns in your where clause and 1 is an inequality the inequality should be on the right of the index list.
So for example:
WHERE DateField > '2009-01-01' AND BitField = 0
Index should be: (BitField, DateField).
I tested this earlier and it would do a seek with both columns in the predicate but it did depend on what my output was.
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]
June 11, 2009 at 6:00 am
Christopher Stobbs (6/11/2009)
Gail am I correcting in saying and I think I might have learn't or if I am wrong missunderstood, but ify ou have two columns in your where clause and 1 is an inequality the inequality should be on the right of the index list.
Yup.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
June 11, 2009 at 6:54 am
Right ...
When I don't use "SELECT * FROM " the query always uses the covering index. Thanks for the tip Mr Stobbs.
When I DO use "SELECT * FROM " then the query will do either a table scan or an index scan depending on whether I use ">" or " '2009-06-01'
and prealloc = 1[/i]
Returns 50 records, uses index
select * from [investing fund transactions] where allocation_date < '2009-06-01'
and prealloc = 1
Returns 3625 records, does not use index
select fund_no from [investing fund transactions] where allocation_date < '2009-06-01'
and prealloc = 1
Always uses the index
June 11, 2009 at 7:12 am
Joseph Fallon (6/11/2009)
When I don't use "SELECT * FROM " the query always uses the covering index. Thanks for the tip Mr Stobbs.
Is the index really covering for SELECT *? Does it have every single column in the table somewhere in the index (key or include)?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 31 total)
You must be logged in to reply to this topic. Login to reply