Help us tune a query

Question

Post reply

Help us tune a query

Viewing 15 posts - 31 through 45 (of 103 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Excellent. That's what I was hoping for. How does that perform?

Add these lines before the query and post the details from the messages tab

SET STATISTICS IO ON

SET STATISTICS TIME ON

GO

Now that the select seems to be under control, can you test out the two options for the insert (one with the PointsCategory, one without) and with the added index on the UserPoints table.

CREATE INDEX idx_UserPoints_PointsCategoryUserID ON UserPoints (PointsCategory, UserID) INCLUDE (Date, PointsScored)

-- When the category is specified

INSERT INTO #TopScores (

UserID,

RecentPoints,

AllPoints,

RowNumber

)

SELECT [UserPoints].UserID,

SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) AS RecentPoints,

SUM([PointsScored]) AS AllPoints,

ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) DESC,

SUM([PointsScored]) DESC) as RowNumber

FROM [UserPoints]

WHERE PointsCategory = @PointsCategory

AND UserID 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query

GROUP BY [UserPoints].UserID

and

-- When the category is not

INSERT INTO #TopScores (

UserID,

RecentPoints,

AllPoints,

RowNumber

)

SELECT [UserPoints].UserID,

SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) AS RecentPoints,

SUM([PointsScored]) AS AllPoints,

ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) DESC,

SUM([PointsScored]) DESC) as RowNumber

FROM [UserPoints]

WHERE UserID 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query

GROUP BY [UserPoints].UserID

What's that cutoffdate and what values can it have?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

RBarryYoung SSC Guru Points: 143329 More actions · Answer 2

That sure looks a lot better. 🙂

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

WayneS SSC Guru Points: 95461 More actions · Answer 3

GSquared (7/13/2009)

Table variables are always estimated as if they were going to only have 1 row in them. That will give you a very false low on the cost, and can end up with all kinds of other problems in the execution plan.
They're generally okay if you can be certain they will not ever exceed about 100 rows. That's a rule-of-thumb and kind of arbitrary, but it generally works.

Just as an FYI:

Recently I was using a table variable, one date column, 36 rows. Clustered PK on the column. Date values from first day of current month to first day of CurrentMonth-35.

This table was being joined to another table with < 10,000 rows.

Execution time: 3+ minutes.

converted table variable to temporary table (the only changes were to change @table to #table and changed DECLARE @table TABLE to CREATE TABLE #table). Execution time < 30ms.

So, with only 1/3 of GSquared's rule-of-thumb, table variables can still be massive performance problems. IMO, just avoid them unless you need some specific feature about them that temporary tables don't have.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Christopher Stobbs SSC-Insane Points: 21098 More actions · Answer 4

would it be worth trying to add a non-clustered index to the temp table to try and get ride of the key lookup?

----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley

:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]

SQL-4-Life

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Christopher Stobbs (7/14/2009)
would it be worth trying to add a non-clustered index to the temp table to try and get ride of the key lookup?

Could widen the nonclustered to include all the columns, but it's only 6 rows and it can't be more than that..

It's odd, SQL should be doing that lookup before it does the merge for the OR. If it did that, it would only be 1 row in the key lookup. The other 5 are retrieved from the cluster.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Mel Harbour SSC Eights! Points: 857 More actions · Answer 6

Ok, lots of information to post after various different tests. Attached are the execution plans of the insert statement. There are separate runs for it with and without points category, and with and without the addition of the index suggested by Gail.

The outputs of the messages window are also in the zip file, with similar names to the execution plans.

The select statement at the end seems to be performing nicely now. CPU time required is way down on what it used to be according to the statistics generated in the messages window. We haven't put it live yet - holding off til we think we've got the first part running nicely.

Mel HarbourProject ManagerRed Gate Software

Mel Harbour SSC Eights! Points: 857 More actions · Answer 7

Oh, and I forgot to mention what @cutoffdate is doing. That's actually only ever set to one value, namely 30 days ago. It happens to calculate this value in the some of the C# classes and then pass it to the stored procedure. Again, if it would improve the query to have this calculation done in the SP, then that's fair game.

Mel HarbourProject ManagerRed Gate Software

Jeff Moden SSC Guru Points: 1004504 More actions · Answer 8

Mel,

I haven't had the time to look at all the fine suggestions folks may have given, but if I look at your original query, I see the potential for a major slow down in the form of an ORDER BY on a SUM. I believe a little "pre-aggregation" (as Perter Larson calls it) along with a little "Divide'N'Conquer" via a temp table would go a very long way in speeding up such a query.

I did see where you provided a table structure... I'll try to make a more substantial suggestion in the form of code tonight (if someone hasn't already beat me to it).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Jeff Moden (7/14/2009)
I haven't had the time to look at all the fine suggestions folks may have given, but if I look at your original query, I see the potential for a major slow down in the form of an ORDER BY on a SUM. I believe a little "pre-aggregation" (as Perter Larson calls it) along with a little "Divide'N'Conquer" via a temp table would go a very long way in speeding up such a query.

That's already been done in Mel's initial post. The original one with the order on the sum was replaced with one with a table variable, since changed to a temp table.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Mel Harbour SSC Eights! Points: 857 More actions · Answer 10

Just tried Jeff's suggestion, restructured the query to do a pre-aggregation with a CTE, and that slowed the performance down (more CPU time).

Mel HarbourProject ManagerRed Gate Software

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 11

Mel Harbour (7/14/2009)
Just tried Jeff's suggestion, restructured the query to do a pre-aggregation with a CTE, and that slowed the performance down (more CPU time).

It will. A CTE, despite it's name, is just a named subquery. Jeff was suggesting temp table, which is exactly what the revised query you initially posted used (well, after we changed table var to temp table)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jack Corbett SSC Guru Points: 184394 More actions · Answer 12

Was Jeff referring to the Order By in the ROW_NUMBER() function?

A side note, I think a summary of the steps and results of this thread would make a great article on How To Performance Tune a query.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Christopher Stobbs SSC-Insane Points: 21098 More actions · Answer 13

I agree with Jack.

Also Mel could you post the latest copy of the solution , I've lost track of where you all got to.

----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley

:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]

SQL-4-Life

Mel Harbour SSC Eights! Points: 857 More actions · Answer 14

Sure!

The current query is:

CREATE TABLE #TopScores

(

UserID INT,

RecentPoints INT,

AllPoints INT,

RowNumber INT

)

CREATE CLUSTERED INDEX idx_TopScores_RowNumber ON #TopScores (RowNumber)

CREATE INDEX idx_TopScores_UserID ON #TopScores (UserID)

INSERT INTO #TopScores (

UserID,

RecentPoints,

AllPoints,

RowNumber

)

SELECT UserPoints.UserID,

SUM(CASE WHEN [Date] >= '2009-06-13' THEN [PointsScored] ELSE 0 END) AS RecentPoints,

SUM([PointsScored]) AS AllPoints,

ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= '2009-06-13' THEN [PointsScored] ELSE 0 END) DESC, SUM([PointsScored]) DESC) as RowNumber

FROM UserPoints

WHERE PointsCategory = ISNULL(NULL, PointsCategory)

AND UserID 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query

GROUP BY [UserPoints].UserID

SELECT t.RowNumber, u.UserID, u.DisplayName, t.RecentPoints, t.AllPoints

FROM #TopScores t

JOIN Users u ON u.[UserID] = t.UserID

WHERE t.RowNumber <= 5 OR 769521 = t.UserID

ORDER BY t.RowNumber

DROP TABLE #TopScores

Performance-wise, the problems are all restricted to the INSERT-SELECT statement.

Mel HarbourProject ManagerRed Gate Software

Christopher Stobbs SSC-Insane Points: 21098 More actions · Answer 15

This might be sound stupid, I'm not sure if it makes a difference at all I might need to do some testing.

but creating the clustered Index on #TopScores after the actual insert?

Would this help at all?

I was under the assumption that adding to a clustered index had slightly more over head than add a clustered index, this could be rubbish so please let me know.

----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley

:w00t:
Posting Best Practices[/url]
Numbers / Tally Tables[/url]

SQL-4-Life