Index Challenge

Question

Post reply

Index Challenge

Abhijit More

SSCertifiable

Points: 6897
More actions
January 28, 2011 at 3:23 am

#233567

Hi Follks!
I have one simple index related challage for all...
CREATE TABLE TableX(
X_ridint identity(1, 1) primary key,
x_value1datetime,
x_value2varchar(10),
x_value3int,
)
CREATE TABLE TableY(
Y_ridint identity(1, 1) primary key,
X_ridint constraint fk_001 foreign key references TableX(X_rid),
y_value1float,
y_value2int,
y_value3datetime,
y_value4datetime
)
create nonclustered index idx_001 ON TableY (X_rid)
include (y_value3,y_value4,Y_rid,y_value1,y_value2)
select x.x_value2, y.Y_rid, y.y_value1, y.y_value2, y.y_value3, y.y_value4
from TableX x
join TableY y on x.X_rid = y.X_rid
where y.X_rid = 1
and y.y_value3 > GETDATE()
and y.y_value4 < GETDATE()
This is the sample structure and Query. The Query written above generates the attached execution plan.
The challange is to convert the Index Seek to Clustered Index Seeks.
Rules:
1. You can not change the Table Design (:-P).
2. You can change the query but cannot change the query output.
3. You can add indexes whatever you want.
Let's do some brainstorming..:hehe:
Abhijit - http://abhijitmore.wordpress.com

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Abhijit More (1/28/2011)
The challange is to convert the Index Seek to Clustered Index Seeks.

Why? The index seek should be more efficient.

btw, any form of exec plan investigation is meaningless on 0 rows, as is the requirement to keep the same output (0 rows is easy). If it's to be a challenge, add some sample data.

Here you go

select x.x_value2, y.Y_rid, y.y_value1, y.y_value2, y.y_value3, y.y_value4

from TableX x

join TableY y WITH (INDEX = 1, FORCESEEK) on x.X_rid = y.X_rid

where y.X_rid = 1

and y.y_value3 > GETDATE()

and y.y_value4 < GETDATE()

AND y.Y_Rid > 0 -- totally meaningless because the identity will always be > 0, but it is a SARGable predicate on the clustered index key

The additional predicate is meaningless, it's essentially a case of AND TRUE, and on any large table will kill performance because the nonclustered index seek has essentially turned into a table scan. It appears as a seek, but, if the table had any rows, would result in all of them being read.

p.s. exactly the same result can be obtained by omitting the hints and dropping the nonclustered index

DROP INDEX idx_001 ON TableY

select x.x_value2, y.Y_rid, y.y_value1, y.y_value2, y.y_value3, y.y_value4

from TableX x

join TableY y on x.X_rid = y.X_rid

where y.X_rid = 1

and y.y_value3 > GETDATE()

and y.y_value4 < GETDATE()

AND y.Y_Rid > 0 -- totally meaningless because the identity will always be > 0, but it is a SARGable predicate on the clustered index key

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

I added some data (a couple hundred thousand rows) and compared the original and the challenge requirement.

INSERT INTO TableX (x_value1, x_value2, x_value3)

SELECT TOP (250000) GETDATE(), 'abc', a.object_id + b.column_id

FROM master.sys.columns a CROSS JOIN master.sys.columns b

INSERT INTO TableY (X_Rid, y_value1, y_value2, y_value3, y_value4)

SELECT TOP (75) PERCENT X_rid,

(RAND(CHECKSUM(NEWID()))-0.5)*200, 0,

DATEADD(dd,Floor((RAND(CHECKSUM(NEWID()))-0.5)*20),GETDATE()),

DATEADD(dd,Floor((RAND(CHECKSUM(NEWID()))-0.5)*20),GETDATE())

FROM TableX

Original:

Table 'TableY'. Scan count 1, logical reads 3, physical reads 0

Table 'TableX'. Scan count 0, logical reads 3, physical reads 0

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

Required query (clustered index seek on both tables)

Table 'TableY'. Scan count 1, logical reads 1053, physical reads 0.

Table 'TableX'. Scan count 0, logical reads 3, physical reads 0.

SQL Server Execution Times:

CPU time = 406 ms, elapsed time = 422 ms.

I think I'll stick to the original if you don't mind...

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Abhijit More SSCertifiable Points: 6897 More actions · Answer 3

Indeed I agree with you Gail, there is no meaning is testing the query w/o data. Thats my bad.

and Thanks for adding sample data, I'll appreciate it.

Okay! now my question is by specifying the FORCESEEK we force the query optimizer to use the seek operation, does it will caz anytime a performance issues by specifying ther index as well.

Looking at the statistics even I would prefer to stick with the original query. Can you brief more on this..

Abhijit - http://abhijitmore.wordpress.com

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

Abhijit More (1/28/2011)
Okay! now my question is by specifying the FORCESEEK we force the query optimizer to use the seek operation, does it will caz anytime a performance issues by specifying ther index as well.

As I said at the end of the first post, the hints aren't required. Add the predicate and drop the nonclustered index.

Looking at the statistics even I would prefer to stick with the original query. Can you brief more on this..

Err, what more do you want me to say?

Original query: 6 logical reads, 0 ms

New query (with required 2 clustered index seeks): 1056 reads, 400ms

Not hard to tell that the original was far, far, far more efficient.

I'm curious as to the point of this, are you looking for ways to make queries run slower and less efficiently?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Abhijit More SSCertifiable Points: 6897 More actions · Answer 5

nope I am finding out the ways we can beat the Operators.

Abhijit - http://abhijitmore.wordpress.com

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

Bear in mind that silly tricks (like adding a predicate that qualifies all rows) will not make queries run faster. Tuning queries by trying to force specific operators is also a silly thing, unless you really, really, really know why the desired operator is absolutely the best and you know that it's the best in all cases.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Grant Fritchey SSC Guru Points: 398909 More actions · Answer 7

Abhijit More (1/28/2011)
nope I am finding out the ways we can beat the Operators.

You don't want to try to "beat the Operators." You want to try to work with the optimizer to help it do it's job. Not make the, probably mistaken assumption, that you really know better how to tune queries than people like Dr. DeWitt who builds these tools.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning