September 19, 2008 at 12:41 pm
HI all, I am doing a simple select query, returns about 4000 records from 1570321 records.
select * from sales where (sales_id = 1123013 AND sales_in_no >= 199701 AND sales_in_no <= 199701)
The query takes about 1min to run.
Index sales_idx is created for sales_id and sales_in_no. But I looked at the sql profiler, doesnt look like sql server is using this index. Is there any reason why the index is not been used??
Thanks for your suggestions.
September 19, 2008 at 12:56 pm
Just outta curiosity, what is this for?
salesin_no >= 199701 AND sales_in_no <= 199701
It looks like that would select everything in that column.
Depending on the order of the columns in your index the optimizer may find it more efficient to just scan the entire table.
What results are you getting from the execution plan when you run it?
-Luke.
September 19, 2008 at 12:57 pm
- Is sales_id the first column of your index ?
- is the other column the first one of the index ?
- from your total column distribution, which percentage is of sales_id = 1123013 ?
- from your total column distribution, which percentage is of the other column matches your search value ?
- is your index well organized ?
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 19, 2008 at 1:18 pm
Sorry, that was a typo.
select * from sales where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)
This is the output I get from the execution plan:
Physical Operation: Clustered Index scan
Logical Operation: Clustered Index scan
Object
[test].[dbo].[sales].[sales.pk]
Hall of Fame,
Yes, sales_id is the first column of my index.
About 70% of the total column distribution is of sales_id = 1123013.
About 60% of the total column distribution matches the search value of the other column.
This is the index:
USE [test]
GO
CREATE NONCLUSTERED INDEX [SALES_IDX] ON [dbo].[SALES]
(
[SALES_ID] ASC,
[SALES_IN_NO] ASC
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
Will it help if I use clustered index? Thanks.
September 19, 2008 at 1:45 pm
since so much of the table would be included in your index it's possible that the optimiser wants to do the scan because it thinks it's more efficient than do all of the bookmark lookups for the rest of the columns. I'd be interested if it would use the index if you didn't use Select *
but actually used
SELECT sales_id, sales_in_no
FROM sales
where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)
September 19, 2008 at 1:49 pm
I tried that, no luck. It didnt use the index when I selected just 2 columns..
September 19, 2008 at 2:30 pm
Check whats the cost involved(from execution plan) when you specify the required index by using index hint. Turn on statistics (set statistics io on) and check logical and physical reads.
Do the same with normal query execution(using clustered index) and compare the figures. Make sure you empty the cache before executing the query for the second time otherwise records will be thre in cache in advance.
Is statistics updated for the indexes involved?
MJ
September 19, 2008 at 3:18 pm
What's the fragmentation on that index and the clustered index like?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 21, 2008 at 12:32 pm
- Can you rebuild your indexes (clustered and non clustered) ?
- If this behavior is still the same afterward, you can force it to use the index, but then I strongly advise to document why you used the hint.
(Because you'll have to re-evaluate its need with every sqlserver hotfix/sp/upgrade you apply)
- Don't use hints unless you really have to and unless you tested they really pay off !
/* dd 20080921 used index hint because the index was not used and the gain is considerable */
Select ...
from [dbo].[SALES] with (index(SALES_IDX))
where sales_id = 1123013
AND sales_in_no >= 199701
AND sales_in_no <= 199701
Johan
Learn to play, play to learn !
Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:
- How to post Performance Problems
- How to post data/code to get the best help[/url]
- How to prevent a sore throat after hours of presenting ppt
press F1 for solution, press shift+F1 for urgent solution 😀
Need a bit of Powershell? How about this
Who am I ? Sometimes this is me but most of the time this is me
September 21, 2008 at 5:54 pm
psangeetha (9/19/2008)
Sorry, that was a typo.select * from sales where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)
This is the output I get from the execution plan:
Physical Operation: Clustered Index scan
Logical Operation: Clustered Index scan
Object
[test].[dbo].[sales].[sales.pk]
Hall of Fame,
Yes, sales_id is the first column of my index.
About 70% of the total column distribution is of sales_id = 1123013.
About 60% of the total column distribution matches the search value of the other column.
This is the index:
USE [test]
GO
CREATE NONCLUSTERED INDEX [SALES_IDX] ON [dbo].[SALES]
(
[SALES_ID] ASC,
[SALES_IN_NO] ASC
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
Will it help if I use clustered index? Thanks.
Heh... it's still a typo... 2nd value is greater than the first. Since nothing is found, it will take longer.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 22, 2008 at 2:10 am
psangeetha (9/19/2008)
Sorry, that was a typo.select * from sales where (sales_id = 1123013 AND sales_in_no >= 199901 AND sales_in_no <= 199701)
This is the output I get from the execution plan:
Physical Operation: Clustered Index scan
Logical Operation: Clustered Index scan
Object
[test].[dbo].[sales].[sales.pk]
Hall of Fame,
Yes, sales_id is the first column of my index.
About 70% of the total column distribution is of sales_id = 1123013.
About 60% of the total column distribution matches the search value of the other column.
This is the index:
USE [test]
GO
CREATE NONCLUSTERED INDEX [SALES_IDX] ON [dbo].[SALES]
(
[SALES_ID] ASC,
[SALES_IN_NO] ASC
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
Will it help if I use clustered index? Thanks.
It's not the query that takes time, but fetching all data. 60% is a lot, sqlserver will use index if the resultset is a small part of table, 15% or less, depending on table size and other things.
Clustered index scan is the same as table scan and is more cost effective in your case than index seek.
Also, try to use between, as query can be better optimized and typos like this:
select * from sales where (sales_id = 1123013 AND sales_in_no between 199901 AND 199701)
become more obvious.
September 22, 2008 at 2:23 am
Robert (9/22/2008)
It's not the query that takes time, but fetching all data. 60% is a lot, sqlserver will use index if the resultset is a small part of table, 15% or less, depending on table size and other things.
That's if bookmark lookups are required, and often the tipping point is closer to 1%. If the index is covering then SQL will normally use the NC index, even if 100% of the table is required.
I did a test of that a while back here. Will find the link.
Edit: http://www.sqlservercentral.com/Forums/FindPost564821.aspx
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 22, 2008 at 3:18 am
GilaMonster (9/22/2008)
That's if bookmark lookups are required, and often the tipping point is closer to 1%. If the index is covering then SQL will normally use the NC index, even if 100% of the table is required.
You're right. But queries with covering index are relatively rare. Tipping point depends on lot of things if query becomes a complex join. In real world it can be over 10% even with huge resultsets or it can happen to choose full table scan even if query is on primary key (which I consider a bug in the engine).
September 22, 2008 at 7:58 am
Can't you use Between instead of = ???
🙂
September 22, 2008 at 11:45 am
Thanks for all your comments. I've changed the code to use between and the execution time has reduced. THe application team is currently happy with it now. If they come back again, I am planning on executing the query by forcing the index.
Thanks again.
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply