July 9, 2006 at 1:16 am
This is the root cause of the topic "SQL Server Job performs very very slow..."
SELECT top X itemID, name
FROM dbo.Items -- with (index(IX_Items_catID))
WHERE catID = 289
The Items table has 8M rows and I have the primary key on itemID. I also have an index on catID and a more complex index having catID as the first column.
There are about 200K rows with catID = 289
Depending on the value of X in the TOP clause, a different index is picked:
-> if X <= 63, IX_Items_catID is picked
-> if X > 63, PK_Items (having itemID column) is picked
For X = 63, the duration of the query is about 170.
For X = 64, the duration of the query is about 12000.
If I force the use of IX_Items_catID index with an index hint, the 64 version also performs in about 170.
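For reference, the forced version is just the query above with the hint uncommented:
SELECT TOP 64 itemID, name
FROM dbo.Items WITH (INDEX(IX_Items_catID))
WHERE catID = 289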
All the statistics are up to date. I have just refreshed them. Anyway, the data in this table does not change.
All this happens in QA (Query Analyzer).
---------------
If I put the same story in a job, things change:
The 63 value remains the same, but different indexes are picked:
-> if X <= 63, IX_Items_catID1 is picked (the complex index containing catID as the first column)
-> if X > 63, PK_Items (having itemID column) is picked
Why do queries behave differently in QA than when run in a job by SQL Server Agent?
Why does it use that threshold? Is it because it would perform better only with IX_Items_catID?
Thanks!
July 9, 2006 at 2:54 pm
I'm wondering if it has something to do with caching.
QA will be running this query from scratch. Perhaps the SQLSERVERAGENT has an execution plan cached in some way.
If you run DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS and then run the job, what happens?
Try updating the statistics on dbo.items or even rebuilding the index and trying again.
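Something like this, substituting your own table and index names:
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
-- refresh the statistics with a full scan
UPDATE STATISTICS dbo.Items WITH FULLSCAN
GO
-- or rebuild the index outright (index name taken from your post)
DBCC DBREINDEX ('dbo.Items', IX_Items_catID)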
July 10, 2006 at 2:55 am
I have recomputed the statistics and rebuilt the indexes. No success.
DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS didn't work either.
July 10, 2006 at 9:17 am
To be blunt, the optimiser is probably picking the correct index - you probably don't provide the most efficient index for your query. Have you changed any settings in QA away from the defaults? As mentioned in your first post, these can make a difference. You can check the contents of the procedure cache to see what is happening.
Try putting the query in a proc and see what happens - there's a sketch below. I note you don't have an ORDER BY with your TOP, which is bad practice.
You don't mention a clustered index; a covered index on (ItemID, name, CatID) should give the best performance. Making it clustered might make it faster - but not necessarily.
I assume your tests run this query in isolation and the SARGs are not parameterised or replaced with variables?
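A rough sketch of what I mean by a proc (the proc name is just an example, and I've added the ORDER BY I mentioned):
CREATE PROC dbo.usp_ItemsByCat
    @catID int
AS
SELECT TOP 64 itemID, name
FROM dbo.Items
WHERE catID = @catID
ORDER BY itemID
GO

-- then, for example:
EXEC dbo.usp_ItemsByCat @catID = 289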
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
July 11, 2006 at 5:55 am
Colin's right - a covered index would give the best performance here (although I think it should be on [CatID, ItemID, Name], since CatID is the column in the Search Argument). I wouldn't cluster it - there's no need, and a covered index is even faster as the leaf pages are smaller than the data pages of a clustered index would be, so you can retrieve more rows with a single IO request.
I think part of the problem is that the likely number of matching rows (200k) is approx 2.5% of the total table, which is right on the cusp of the percentage at which the optimizer decides that a NC index becomes inefficient. There's no simple way of predicting this (at least, none that I've ever seen) and many people believe it's as high as 10% before a switch to a tablescan/clustered index scan occurs, but in reality you'll find it is much lower. It does depend on a number of different factors.
You'll probably find that if you don't do a TOP at all, then the NC index will be used each time. Look for a value of CatID that returns 250k rows and I wouldn't be quite so sure. At 400k rows I'd put good money on a clustered index scan.
By adding the TOP clause, you're adding a requirement to do yet more work, and this could just be the final little nudge that the optimizer needs to cause it to flip the plan to an index scan once the number of rows to be sorted and retrieved exceeds 63.
By putting a covering index on (and by including an ORDER BY), all the records will be in the correct sequence, efficiently packaged into index leaf pages, and there should be minimal overhead from using it.
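To be concrete, something along these lines (the index name is just an example):
CREATE INDEX IX_Items_catID_covering
ON dbo.Items (catID, itemID, name)
GO

SELECT TOP 64 itemID, name
FROM dbo.Items
WHERE catID = 289
ORDER BY itemID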
July 11, 2006 at 6:01 am
Actually, secondary indexes have to be even more selective - typically 95% or better - before the optimiser will use them. You can improve matters with a covered index (we'll agree to disagree about column order!!). It's all to do with the number of pages, I/Os and locks that are required, at which point the optimiser makes its choice. I've found that an inefficient (in terms of selectivity) clustered index is more likely to be used than a non-clustered. It's a case of test and test and test again.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
July 11, 2006 at 6:07 am
Colin - I'm intrigued about your comment on the subject of column order. Since the only SARG column is CatID, if this wasn't the leading column in the index then the optimizer would have to scan all 8 million index rows and filter out the ones that matched before it could access the Name and ItemID columns.
By putting CatID first, it can do an index seek on the first occurrence of CatID, then scan the subsequent 200K rows, reading the Name and ItemID values as it does so, until the CatID changes to a new value at which point it knows to stop.
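(If you want to see the difference for yourself, SET SHOWPLAN_TEXT in QA shows which index each variant picks without actually running the query - for example:
SET SHOWPLAN_TEXT ON
GO
SELECT TOP 64 itemID, name
FROM dbo.Items
WHERE catID = 289
GO
SET SHOWPLAN_TEXT OFF
GO
)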
July 11, 2006 at 6:22 am
Well, originally covered indexes could be fussy about the column order in the index matching the column order in the select. It shouldn't make any difference, and to date I've not found any difference whether the search column is first or last. I still tend to make covered indexes match the select order. If you're in the UK, Kimberley Tripp is giving an advanced index course in Reading next month.
I usually test the various index permutations, and my final column choice and order depends upon how many other queries the index may have to satisfy.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
July 11, 2006 at 7:23 am
Can't argue with the "test, test, test" doctrine. I've put a request in to my line manager to attend the Reading seminar, too!
July 11, 2006 at 7:35 am
Cool! If I get to attend - which is unlikely, as I think I'll be on a new contract - I'll do my best to find you! I attended the one-day Kimberley Tripp event and the Kalen Delaney event, which covered lots of index stuff.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
July 11, 2006 at 8:36 am
At the risk of flogging a point to death here (but I do think it's of interest), I've adopted the "test" approach and come up with some results. See you in Reading!
Table (14,305,700 rows):
======
CREATE TABLE [dbo].[live_traffic_incident](
[row_id] [int] IDENTITY(1,1) NOT NULL,
[incident_number] [bigint] NOT NULL,
[version_number] [smallint] NOT NULL,
[inserted_date_time] [datetime] NOT NULL,
[incident_recorded] [datetime] NOT NULL,
[incident_start] [datetime] NOT NULL,
[incident_end] [datetime] NOT NULL,
[priority] [tinyint] NOT NULL,
[on_restricted_fvd] [bit] NOT NULL,
[event_code] [smallint] NOT NULL,
[ltn10_primary] [int] NOT NULL,
[ltn10_secondary] [int] NOT NULL,
[ltn10_direction] [smallint] NOT NULL,
[ltn10_extent] [tinyint] NOT NULL,
[multigroup_type] [float] NOT NULL,
[multigroup_data] [smallint] NOT NULL,
[source] [tinyint] NOT NULL,
[server] [varchar](25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
[extent_miles] [float] NULL,
[delay_type] [int] NULL,
[delay] [int] NULL,
[p_link_index] [int] NULL ,
[s_link_index] [int] NULL ,
CONSTRAINT [PK_live_traffic_incident] PRIMARY KEY CLUSTERED
(
[row_id] ASC
)WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
Index:
======
ix_event_code non-clustered on [event_code]
Data:
=====
EventCode    % of Total Rows
=========    ===============
212          0.100134911259148
211          0.114968159544797
1031         0.148094815353321
646          0.24144921255164
24           0.30578021348134
343          0.308772027932922
214          0.313867898809565
505          0.37359234431031
405          0.582012764142964
519          0.590988207497711
520          0.689620221310387
240          0.847207756348868
201          0.869247922156903
1804         0.912083994491706
515          1.36530194258233
70           2.25808593777306
74           2.2833695659772
401          2.92720384182529
514          2.94621025185765
73           3.25405957066065
517          3.41469484191616
72           3.97124223211727
71           5.60260595426998
735          6.34579922688159
518          7.37394884556505
500          22.4500653585634
708          28.3434854638361
Query:
======
select top 100 incident_number, version_number
from live_traffic_incident
where event_code = 405 /* 0.58% of total rows match */
order by incident_number
The above query scans the clustered index, even though only 0.58% of the total rows match - the overhead of the resulting bookmark lookups from the NC index outweighs the cost of simply scanning the entire clustered index.
Modifying the SARG slightly to a more selective value, we can finally get the optimizer to switch:
select top 100 incident_number, version_number
from live_traffic_incident
where event_code = 505 /* 0.37% of total rows match */
order by incident_number
This now uses an index SEEK on ix_event_code.
If I create a COVERING index on (incident_number, version_number, event_code) then there is NO value of event_code that uses an index seek, even the most selective in the table. It uses a clustered index scan every time.
Adding a further covering index on (event_code, incident_number, version_number) has the result that it now ALWAYS uses an index seek via that index, even with the LEAST selective value of event_code (28.3%)
This proves (in this case at least) that the SARG column needs to be the leading column in the index before the optimizer can use it.
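For anyone who wants to repeat the test, the two extra indexes were created roughly like this (the index names are mine):
CREATE INDEX ix_cover_num_first
ON dbo.live_traffic_incident (incident_number, version_number, event_code)
GO

CREATE INDEX ix_cover_code_first
ON dbo.live_traffic_incident (event_code, incident_number, version_number)
GO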
July 12, 2006 at 4:50 am
Yeah, sorry - I forgot the first column of my covered index was very selective. If the first column isn't selective enough then you're absolutely correct, the optimiser will ignore it. You can improve this by making the covered index clustered; the selectivity requirement is lower for a clustered index.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/