May 18, 2009 at 11:58 am
I have a table with 38 Mil records.
A sproc selects records from that table only (no joins) based solely on an input "StatusId" (which can be one of six possible StatusId values) in the where clause. There is a non-clustered index ("IX_StatusId") on that column of the table.
The sproc can run an hour and not finish. If I add an index hint to the select statement within the sproc ["WITH (INDEX(IX_StatusID))"], then the sproc finishes almost immediately.
Why do I have to include the hint? What is keeping SQL from knowing that the index helps that much? Even rebuilding the index made no difference.
May 18, 2009 at 12:16 pm
here's my guess;
as a general rule, an index which uniquely identifies a row is more valueable than an index which shows a value releated accross millions of rows.
assuming your status index has about 10 or so values,
I would guess the optimizer ignores it in favor of something involving the PK; it's looking at the statistics and deciding that because that value repeats so often, that index is not as valuable as other indexes.
the resulting table scan as it whips through the leaves of the index makes it slow. forcing it to use the index makes it build a much better execution plan, and it's way quicker.
Lowell
May 18, 2009 at 12:24 pm
As you have said, the index field can only have 6 possible values. Since the table has some million rows, the selectivity of the index is very low.... so a table scan is chosen in place of index seek/scan... are you sure you really need to return 6 million records each time ?? May be you just need to add another condition for selecting the appropriate records or just page them to return just the few that will be managed by the application
May 18, 2009 at 12:27 pm
I agree with Lowell. From the description, the index is woefully inadequate for the data and the optimizer assumes a scan will work better. But there must be more to the query than a simple SELECT... WHERE StatusID = 1. Are you performing aggregates, sorts or other operations? Is it pulling the data through a view or through a UDF? I'm not saying it would choose that index without these other operations, because the selectivity on that index must be near zero. But something is changing when you put that index hint in... a loop gets changed to a merge or something within the execution plan. Can you post the execution plans with & without the hint?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
May 18, 2009 at 12:42 pm
The sproc was designed to return a relative handful of new records. I knew SQL prefered indexes that tended toward unique, but I was surprised that it would throw out one that exactly matched the "where." I guess we live with the hint, though MS advises against it. Thank you all for your help.
Edit: Oh, there are no aggregates, views, etc., either.
May 18, 2009 at 1:23 pm
Hmm. Now, there IS a function in the where clause ("where statusID = dbo.GetIdfromEnum(1)", which hits a table with 6 rows). Hard-coding the GUID I'm looking for makes the query without the hint return just as fast as the lookup with the function call.
Using a variable instead of the function (to supply the GUID for the where clause) does not help. Only hard-coding the GUID to match on speeds the query without the hint. Since that Guid would be different on different installations, I may be stuck with the hint.
Thanks for compelling me to look at functions, because I had discounted the one in the where clause.
May 18, 2009 at 1:40 pm
You got it. Hard coding provides optimizer a specific value to work with and can look at index value spread and determine a much more accurate estimate of returned row estimate, vs having a variable in where clause, which would lead it to be able to determine a less accurate estimate (can't be sure, but I think it would be like million or so rows, devided by 6).
If you are ready to ignore the ugliness of coding - you could code six hard coded SQL statements inside the stored procedure, encapsulated in a 'IF..ELSE..' clause.
May 18, 2009 at 1:59 pm
Could you provide the code (stored proc and function), table DDL (target table for the stored proc and the UDF)?
There may be a better way, but we'd need to see the code.
May 18, 2009 at 2:33 pm
A couple of things you could look at instead of using a specific index hint would be to use a query hint or plan guide.
Try using the OPTIMIZE FOR statement to tell SQL Server what value to use for your variable during query optimization. It might help - don't know for sure but is something you can try.
If that doesn't work, look at setting up a plan guide. Again, not sure if it will help.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply