simple SELECT query takes ten seconds to return 5000 rows - can I optimize?

Question

Post reply

simple SELECT query takes ten seconds to return 5000 rows - can I optimize?

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Adam.denby@focus-solutions.co.uk Mr or Mrs. 500 Points: 541 More actions · Answer 1

/* setup */

CREATE TABLE X (

HistoryID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,

ObjID INT NOT NULL,

Date1 DATETIME,

Date2 DATETIME,

Info1 CHAR(400),

Info2 CHAR(400),

Int1 INT,

Int2 INT,

Int3 INT

);

/* create test data */

DECLARE @i INT; SET @i = 0;

DECLARE @objid INT; SET @objid = 0;

WHILE (@i < 4000000)

BEGIN

INSERT INTO X VALUES(@objid, GETDATE(), CURRENT_TIMESTAMP, 'Info1','Info2',1,2,3);

SET @i = @i + 1;

SET @objid = @objid + 1

IF (@objid > 5000) SET @objid = 0;

END;

CREATE INDEX X_IX1 ON X (ObjID);

/* end setup */

/* timed query */

DBCC DROPCLEANBUFFERS;

WITH maxed AS

(

SELECT

MAX(HistoryID) AS MaxID

FROM X

GROUP BY ObjID

)

SELECT

X.*

FROM X

INNER JOIN maxed X2

ON X.HistoryID = X2.MaxID;

/* end timed query */

The code from and including DBCC onwards runs in two seconds on a laptop class machine, sql 2005 x64 dev edition sp3. In certain situations I've found cte's helpful. Without it your original query does take a longer time.

hth

Jesse Reich Hall of Fame Points: 3020 More actions · Answer 2

This question has turned into an excellent learning experience for me. I took the opportunity to run the new queries against Paul White's optimized schema on both my laptop and the server using the sample data. The times were calculated, as before, using SET STATISTICS TIME ON.

SELECT

X1.*

FROM X X1

WHERE HistoryID = (

SELECT

MAX(HistoryID) AS MaxID

FROM X X2

WHERE X2.ObjID = X1.ObjID

GROUP BY ObjID

)

Laptop:

Table 'Y'. Scan count 1, logical reads 32780, physical reads 6, read-ahead reads 10779, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 1607 ms, elapsed time = 24411 ms.

Server:

Table 'Y'. Scan count 1, logical reads 33065, physical reads 5, read-ahead reads 10779, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 1235 ms, elapsed time = 8044 ms.

SELECT

X1.*

FROM X X1

WHERE HistoryID = (

SELECT

MAX(HistoryID) AS MaxID

FROM X X2

WHERE X2.ObjID = X1.ObjID

GROUP BY ObjID

)

OPTION (FAST 100)

Laptop:

Table 'Y'. Scan count 1, logical reads 160692, physical reads 399, read-ahead reads 160335, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 3089 ms, elapsed time = 32309 ms.

Server:

Table 'Y'. Scan count 1, logical reads 160880, physical reads 618, read-ahead reads 160335, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 2703 ms, elapsed time = 12883 ms.

;with cte as( select distinct objid

from dbo.X

)

select ca.*

from cte

cross apply (select top 1 *

from dbo.X

where objid = cte.objid

order by HistoryID desc) ca

Laptop:

Table 'Y'. Scan count 5001, logical reads 42886, physical reads 16, read-ahead reads 10779, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 1560 ms, elapsed time = 22924 ms.

Server:

Table 'Y'. Scan count 5001, logical reads 43061, physical reads 8, read-ahead reads 10779, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 968 ms, elapsed time = 7574 ms.

The CROSS APPLY method seems to be the top performer and that seems to follow a point in Paul's blog post...

"Nevertheless, the APPLY plan might be the optimal choice if a list of distinct bins and shelves is available from another table (eliminating the DISTINCT) and if there are relatively few shelf and bin combinations (so limiting the number of index seeks)."

In this case we only have 5000 unique objIDs and perhaps that is why it performs so well.

The downside is that the original clustered index is there for a reason. The majority of searching is done on HistoryID while this query is more infrequent use case. That said, it makes me wonder if having an indexed view clustered on ObjID and HistoryID might be the solution to the problem.

Thanks again for everyone's assistance.

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 3

If you can afford the disk space and maintenance load, specialized covering indexes are always on the path to optimal query performance.

Thanks for posting your results 8kb, and good luck to you.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Paul White SSC Guru Points: 150467 More actions · Answer 4

The Dixie Flatline (9/17/2010)
Paul: I wasn't aware this was a Speed Phreak challenge. 😀

:laugh: No, but you beat me to my favourite solution (APPLY) so I thought I'd have a go with the alternatives.

There are a number of very interesting features of these 'top n per group' problems, probably too much for a forum post. I did intend to return to this topic on my blog (after the Segment Top entry) and this thread has prompted me to do just that. I'll post a link as soon as it is up, in case anyone is interested.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 5

Please do, Paul. I hear your blog is well written and enjoys a HUGE following in Hyderabad. 😉

I am behind on my reading but the articles you've posted recently are bookmarked for when I have time to catch up.

Using TOP 1/ORDER BY instead of MAX() is the key here. Last year in his Friday post-Summit presentation Itzik Ben-Gan demonstrated how the optimizer walks the index logically row by row for the number of times specified by TOP. It follows that,with the proper supporting index in place, TOP 1 / ORDER BY can supply a MIN or MAX value with only one logical read per "group", in this case the objid. That's why the scan count is always the number of objIDs, plus 1 for the DISTINCT scan.

The DISTINCT CTE was an efficient way to get the list of groups (objIDS) because of the clustered index structure you set up. But I was just testing a simple nonclustered index on (objID, historyID descending) and it runs even faster. The nonclustered index is tighter, and the clustered index seeks are the same no matter where the list of ObjIDs comes from.

By the way, I've rebuilt the 4 million row X table with 50,000 and 500,000 distinct objIDs and the cross apply technique is still holding fairly well. At a half million differtent ObjIDs it is running in about 3 seconds on our server.

8kb: You might try building such a non-clustered index on objid, historyID over your existing table schema. It should work just fine with your historyID clustered index and the CROSS APPLY.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Paul White SSC Guru Points: 150467 More actions · Answer 6

The Dixie Flatline (9/17/2010)
Please do, Paul. I hear your blog is well written and enjoys a HUGE following in Hyderabad.

:laugh: Well as long as they just read it 😉

I am behind on my reading but the articles you've posted recently are bookmarked for when I have time to catch up.

The latest four-part series is just copied from my blog.

Using TOP 1/ORDER BY instead of MAX() is the key here. Last year in his Friday post-Summit presentation Itzik Ben-Gan demonstrated how the optimizer walks the index logically row by row for the number of times specified by TOP. It follows that,with the proper supporting index in place, TOP 1 / ORDER BY can supply a MIN or MAX value with only one logical read per "group", in this case the objid. That's why the scan count is always the number of objIDs, plus 1 for the DISTINCT scan.

Yeah, I explained that to Jeff on a very interesting thread back in May: http://www.sqlservercentral.com/Forums/FindPost917491.aspx

The DISTINCT CTE was an efficient way to get the list of groups (objIDS) because of the clustered index structure you set up. But I was just testing a simple nonclustered index on (objID, historyID descending) and it runs even faster. The nonclustered index is tighter, and the clustered index seeks are the same no matter where the list of ObjIDs comes from.

Yup. Just to be clear on this, the APPLY solution is the one I normally post - but do try changing the outer SELECT * to SELECT objID, HistoryID instead; with a non-clustered index on (objID DESC, HistoryID DESC), and see how APPLY compares to Segment Top...:-P

By the way, I've rebuilt the 4 million row X table with 50,000 and 500,000 distinct objIDs and the cross apply technique is still holding fairly well. At a half million differtent ObjIDs it is running in about 3 seconds on our server.

For sure. The worst case for APPLY is when there are a large number of groups (objID), very few rows per group (HistoryID) and none of the data pages are in memory yet. Be sure to run CHECKPOINT (to flush dirty pages to disk) and DBCC DROPCLEANBUFFERS (to drop the clean pages) before every test run.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Paul White SSC Guru Points: 150467 More actions · Answer 7

8kb (9/17/2010)
The CROSS APPLY method seems to be the top performer and that seems to follow a point in Paul's blog post...
"Nevertheless, the APPLY plan might be the optimal choice if a list of distinct bins and shelves is available from another table (eliminating the DISTINCT) and if there are relatively few shelf and bin combinations (so limiting the number of index seeks)."
In this case we only have 5000 unique objIDs and perhaps that is why it performs so well.

Absolutely, yes! The other thing is the outer SELECT star. The Segment Top really flies when there's a covering non-clustered index for the final output - I had to play games with the optimizer (the XML plan) to get it to use the clustered index. Even so, it's not ideal, for all sorts of good reasons.

Changing the outer SELECT to just the ObjID and History columns (with an n/c index on (ObjID, HistoryID)) makes all the difference: no hints, no XML, just a super-fast single scan. Guaranteed to beat APPLY 🙂

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 8

Paul White NZ (9/18/2010)
Yeah, I explained that to Jeff on a very interesting thread back in May: http://www.sqlservercentral.com/Forums/FindPost917491.aspx

Actually, the triangular join thread that leads to is even more interesting. I'm still not done playing with that one.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Paul White SSC Guru Points: 150467 More actions · Answer 9

Jeff Moden (9/18/2010)
Paul White NZ (9/18/2010)
Yeah, I explained that to Jeff on a very interesting thread back in May: http://www.sqlservercentral.com/Forums/FindPost917491.aspx
Actually, the triangular join thread that leads to is even more interesting. I'm still not done playing with that one.

I remember! In fact there were two interesting 'triangle' threads:

http://www.sqlservercentral.com/Forums/Topic896583-338-1.aspx

http://www.sqlservercentral.com/Forums/Topic911849-392-1.aspx

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi