August 22, 2003 at 9:47 am
I was wondering if anyone had enough insight to solve my delima. I have two tables.
The primary table (detail) has 4+ million rows in it. It is linked to a code table (Disposition) that has approximately 6 rows in it.
The code table has some configuration columns that I'm using to retrieve rows from the primary table.
** Example Structure**
Primary table:
CREATE TABLE [Detail] (
[ID_Detail] [bigint] IDENTITY (1, 1) NOT NULL ,
[ID_Disposition] [tinyint] NOT NULL ,
[Quantity] [smallint] NOT NULL
)
CREATE TABLE [Diposition](
[ID_Disposition] [tinyint] IDENTITY (1, 1) NOT NULL ,
[Description] [varchar(50)] NOT NULL ,
[Flag_Hazard] [bit] NOT NULL,
[Flag_Fail] [bit] NOT NULL,
)
The query is a simple query looking for all of the detail records with a "Hazard" disposition. So my initial query looked as follows:
SELECT [Detail].* FROM [Detail] INNER JOIN [Disposition] ON [Detail].ID_Disposition = [Disposition].ID_Disposition WHERE [Disposition].[Flag_Hazard] = 1
Upon reviewing the execution plan, it shows that it is performing a Hash Match / Inner Join between the tables, a Clustered Index Scan on the Primary Table (all 4 million records) and an index scan on the code table (returning only 1 records). It states that it built a hash table of the values from my code table and is scanning the records in the detail table to probe into the hash table for matches.
As you can guess, this is not performing very well. So my initial take on it was to add an index on the ID_Disposition column on the Detail table. The ID_Disposition column is the primary key (clustered) on the code table so no index was needed there. To my surprise, this did absolutely nothing!
I played with the query a bit to see if i could make it use my newly created index to speed things up, so this became the next iteration:
SELECT [Detail].* FROM [Detail] WHERE [Detail].[ID_Disposition] IN (SELECT [ID_Disposition] FROM [Disposition] WHERE [Flag_Hazard] = 1)
This did not behave any differently that the first inner join statement. I did a little more research and found that there was only 1 record that is flagged as a hazard so I grabbed that id and did a hard-coded query like this:
SELECT [Detail].* FROM [Detail] WHERE [Detail].[ID_Disposition] = 3
And whoala! 1 sec response time! It utilized my index! So, I thought maybe it was because multiple rows could be causing it so I tried this...
SELECT [Detail].* FROM [Detail] WHERE [Detail].[ID_Disposition] IN (1,3)
and whoala! 1 sec response time again! It also utilized my index! Now I'm just confused. Why won't either the inner join or the sub-select statement use my index?
Finally, I place the index hint into the query and it comes back blazing fast.
My question is this, why didn't sql server choose to use the index on it's own? I'd rather not place an index hint within the statement so if there is something else that i can do (multi-column statistics, indexes, etc.) then I'd rather do that.
Thanks in advance,
August 22, 2003 at 10:22 am
My quess would be that you had all the data before creating the index. Try UPDATE STATISTICS [Detail] and see if it makes any difference.
Guarddata-
August 22, 2003 at 10:28 am
That was it. I thought I remember reading in BOL that if alot data was inserted, new indexes created, that statistics would auto update (if the db option is set, which it is). But apparently not. Thanks for your help.
August 22, 2003 at 10:42 am
Yeah - but BOL doesn't admit that the update happens *eventually* after a series of queries have been run.
Glad it made a difference.
Guarddata-
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply