January 28, 2011 at 3:23 am
Hi Follks!
I have one simple index related challage for all...
CREATE TABLE TableX(
X_ridint identity(1, 1) primary key,
x_value1datetime,
x_value2varchar(10),
x_value3int,
)
CREATE TABLE TableY(
Y_ridint identity(1, 1) primary key,
X_ridint constraint fk_001 foreign key references TableX(X_rid),
y_value1float,
y_value2int,
y_value3datetime,
y_value4datetime
)
create nonclustered index idx_001 ON TableY (X_rid)
include (y_value3,y_value4,Y_rid,y_value1,y_value2)
select x.x_value2, y.Y_rid, y.y_value1, y.y_value2, y.y_value3, y.y_value4
from TableX x
join TableY y on x.X_rid = y.X_rid
where y.X_rid = 1
and y.y_value3 > GETDATE()
and y.y_value4 < GETDATE()
This is the sample structure and Query. The Query written above generates the attached execution plan.
The challange is to convert the Index Seek to Clustered Index Seeks.
Rules:
1. You can not change the Table Design (:-P).
2. You can change the query but cannot change the query output.
3. You can add indexes whatever you want.
Let's do some brainstorming..:hehe:
Abhijit - http://abhijitmore.wordpress.com
January 28, 2011 at 4:21 am
Abhijit More (1/28/2011)
The challange is to convert the Index Seek to Clustered Index Seeks.
Why? The index seek should be more efficient.
btw, any form of exec plan investigation is meaningless on 0 rows, as is the requirement to keep the same output (0 rows is easy). If it's to be a challenge, add some sample data.
Here you go
select x.x_value2, y.Y_rid, y.y_value1, y.y_value2, y.y_value3, y.y_value4
from TableX x
join TableY y WITH (INDEX = 1, FORCESEEK) on x.X_rid = y.X_rid
where y.X_rid = 1
and y.y_value3 > GETDATE()
and y.y_value4 < GETDATE()
AND y.Y_Rid > 0 -- totally meaningless because the identity will always be > 0, but it is a SARGable predicate on the clustered index key
The additional predicate is meaningless, it's essentially a case of AND TRUE, and on any large table will kill performance because the nonclustered index seek has essentially turned into a table scan. It appears as a seek, but, if the table had any rows, would result in all of them being read.
p.s. exactly the same result can be obtained by omitting the hints and dropping the nonclustered index
DROP INDEX idx_001 ON TableY
select x.x_value2, y.Y_rid, y.y_value1, y.y_value2, y.y_value3, y.y_value4
from TableX x
join TableY y on x.X_rid = y.X_rid
where y.X_rid = 1
and y.y_value3 > GETDATE()
and y.y_value4 < GETDATE()
AND y.Y_Rid > 0 -- totally meaningless because the identity will always be > 0, but it is a SARGable predicate on the clustered index key
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2011 at 4:30 am
I added some data (a couple hundred thousand rows) and compared the original and the challenge requirement.
INSERT INTO TableX (x_value1, x_value2, x_value3)
SELECT TOP (250000) GETDATE(), 'abc', a.object_id + b.column_id
FROM master.sys.columns a CROSS JOIN master.sys.columns b
INSERT INTO TableY (X_Rid, y_value1, y_value2, y_value3, y_value4)
SELECT TOP (75) PERCENT X_rid,
(RAND(CHECKSUM(NEWID()))-0.5)*200, 0,
DATEADD(dd,Floor((RAND(CHECKSUM(NEWID()))-0.5)*20),GETDATE()),
DATEADD(dd,Floor((RAND(CHECKSUM(NEWID()))-0.5)*20),GETDATE())
FROM TableX
Original:
Table 'TableY'. Scan count 1, logical reads 3, physical reads 0
Table 'TableX'. Scan count 0, logical reads 3, physical reads 0
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Required query (clustered index seek on both tables)
Table 'TableY'. Scan count 1, logical reads 1053, physical reads 0.
Table 'TableX'. Scan count 0, logical reads 3, physical reads 0.
SQL Server Execution Times:
CPU time = 406 ms, elapsed time = 422 ms.
I think I'll stick to the original if you don't mind...
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2011 at 5:18 am
Indeed I agree with you Gail, there is no meaning is testing the query w/o data. Thats my bad.
and Thanks for adding sample data, I'll appreciate it.
Okay! now my question is by specifying the FORCESEEK we force the query optimizer to use the seek operation, does it will caz anytime a performance issues by specifying ther index as well.
Looking at the statistics even I would prefer to stick with the original query. Can you brief more on this..
Abhijit - http://abhijitmore.wordpress.com
January 28, 2011 at 5:54 am
Abhijit More (1/28/2011)
Okay! now my question is by specifying the FORCESEEK we force the query optimizer to use the seek operation, does it will caz anytime a performance issues by specifying ther index as well.
As I said at the end of the first post, the hints aren't required. Add the predicate and drop the nonclustered index.
Looking at the statistics even I would prefer to stick with the original query. Can you brief more on this..
Err, what more do you want me to say?
Original query: 6 logical reads, 0 ms
New query (with required 2 clustered index seeks): 1056 reads, 400ms
Not hard to tell that the original was far, far, far more efficient.
I'm curious as to the point of this, are you looking for ways to make queries run slower and less efficiently?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2011 at 7:00 am
nope I am finding out the ways we can beat the Operators.
Abhijit - http://abhijitmore.wordpress.com
January 28, 2011 at 7:13 am
Bear in mind that silly tricks (like adding a predicate that qualifies all rows) will not make queries run faster. Tuning queries by trying to force specific operators is also a silly thing, unless you really, really, really know why the desired operator is absolutely the best and you know that it's the best in all cases.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 28, 2011 at 11:25 am
Abhijit More (1/28/2011)
nope I am finding out the ways we can beat the Operators.
You don't want to try to "beat the Operators." You want to try to work with the optimizer to help it do it's job. Not make the, probably mistaken assumption, that you really know better how to tune queries than people like Dr. DeWitt who builds these tools.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply