July 13, 2009 at 11:01 am
Excellent. That's what I was hoping for. How does that perform?
Add these lines before the query and post the details from the messages tab:
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO
Now that the select seems to be under control, can you test the two options for the insert (one with the PointsCategory, one without), both with the added index on the UserPoints table?
CREATE INDEX idx_UserPoints_PointsCategoryUserID ON UserPoints (PointsCategory, UserID) INCLUDE (Date, PointsScored)
-- When the category is specified
INSERT INTO #TopScores (
UserID,
RecentPoints,
AllPoints,
RowNumber
)
SELECT [UserPoints].UserID,
SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) AS RecentPoints,
SUM([PointsScored]) AS AllPoints,
ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) DESC,
SUM([PointsScored]) DESC) as RowNumber
FROM [UserPoints]
WHERE PointsCategory = @PointsCategory
AND UserID <> 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query
GROUP BY [UserPoints].UserID
and
-- When the category is not specified
INSERT INTO #TopScores (
UserID,
RecentPoints,
AllPoints,
RowNumber
)
SELECT [UserPoints].UserID,
SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) AS RecentPoints,
SUM([PointsScored]) AS AllPoints,
ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= @CutoffDate THEN [PointsScored] ELSE 0 END) DESC,
SUM([PointsScored]) DESC) as RowNumber
FROM [UserPoints]
WHERE UserID <> 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query
GROUP BY [UserPoints].UserID
What's that @CutoffDate, and what values can it have?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 13, 2009 at 11:43 am
That sure looks a lot better. 🙂
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
July 13, 2009 at 5:59 pm
GSquared (7/13/2009)
Table variables are always estimated as if they were going to only have 1 row in them. That will give you a very false low on the cost, and can end up with all kinds of other problems in the execution plan.
They're generally okay if you can be certain they will not ever exceed about 100 rows. That's a rule-of-thumb and kind of arbitrary, but it generally works.
Just as an FYI:
Recently I was using a table variable, one date column, 36 rows. Clustered PK on the column. Date values from first day of current month to first day of CurrentMonth-35.
This table was being joined to another table with < 10,000 rows.
Execution time: 3+ minutes.
Converted the table variable to a temporary table (the only changes were turning @table into #table and DECLARE @table TABLE into CREATE TABLE #table). Execution time: < 30ms.
So, at only about a third of GSquared's rule-of-thumb, table variables can still be massive performance problems. IMO, just avoid them unless you need some specific feature of theirs that temporary tables don't have.
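To make the swap concrete, here's a minimal sketch of the change (hypothetical table and column names; the real table had 36 date rows):
-- Before: table variable - no statistics, the optimiser assumes 1 row
DECLARE @MonthStarts TABLE (MonthStart DATETIME PRIMARY KEY CLUSTERED)
-- After: temp table - identical definition, but statistics get created,
-- so the join to the ~10,000-row table gets a sensible plan
CREATE TABLE #MonthStarts (MonthStart DATETIME PRIMARY KEY CLUSTERED)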
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
July 14, 2009 at 2:03 am
Would it be worth trying to add a non-clustered index to the temp table to try and get rid of the key lookup?
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
Posting Best Practices
Numbers / Tally Tables
July 14, 2009 at 2:19 am
Christopher Stobbs (7/14/2009)
Would it be worth trying to add a non-clustered index to the temp table to try and get rid of the key lookup?
Could widen the nonclustered index to include all the columns, but it's only 6 rows and it can't ever be more than that.
It's odd; SQL should be doing that key lookup before it does the merge for the OR. If it did, there would only be 1 row in the key lookup. The other 5 are retrieved from the clustered index.
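For anyone who does want to try the covering version, it would look something like this (a sketch only; the index name is hypothetical, and RowNumber travels in the nonclustered index anyway since it's the clustering key):
-- Covering index: the points columns come from the index leaf,
-- so the key lookup disappears
CREATE INDEX idx_TopScores_UserID_covering ON #TopScores (UserID) INCLUDE (RecentPoints, AllPoints)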
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 14, 2009 at 2:38 am
OK, lots of information to post after various tests. Attached are the execution plans for the insert statement. There are separate runs with and without the points category, and with and without the index that Gail suggested.
The outputs of the messages window are also in the zip file, with similar names to the execution plans.
The select statement at the end seems to be performing nicely now. CPU time is way down on what it used to be, according to the statistics in the messages window. We haven't put it live yet - holding off until we think we've got the first part running nicely.
Mel Harbour
Project Manager
Red Gate Software
July 14, 2009 at 2:50 am
Oh, and I forgot to mention what @CutoffDate is doing. It's actually only ever set to one value, namely 30 days ago. The value is currently calculated in one of the C# classes and then passed to the stored procedure. Again, if it would improve the query to do that calculation in the SP, that's fair game.
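If the calculation does move into the procedure, it's only a couple of lines (a sketch; assumes the parameter stays DATETIME and that the C# code currently passes a date with no time portion):
-- Hypothetical in-procedure equivalent of the value passed from C#
DECLARE @CutoffDate DATETIME
-- 30 days ago, rounded down to midnight to match a date-only value
SET @CutoffDate = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 30, 0)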
Mel Harbour
Project Manager
Red Gate Software
July 14, 2009 at 5:48 am
Mel,
I haven't had the time to look at all the fine suggestions folks may have given, but if I look at your original query, I see the potential for a major slowdown in the form of an ORDER BY on a SUM. I believe a little "pre-aggregation" (as Peter Larson calls it) along with a little "Divide'N'Conquer" via a temp table would go a very long way in speeding up such a query.
I did see where you provided a table structure... I'll try to make a more substantial suggestion in the form of code tonight (if someone hasn't already beaten me to it).
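The shape of the idea, as a sketch only (Mel's revised query in this thread already follows the same pattern):
-- Step 1: aggregate into a temp table, no ORDER BY yet
SELECT UserID, SUM(PointsScored) AS AllPoints
INTO #Agg
FROM UserPoints
GROUP BY UserID
-- Step 2: sort the much smaller aggregated set
SELECT UserID, AllPoints
FROM #Agg
ORDER BY AllPoints DESC
DROP TABLE #Agg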
--Jeff Moden
Change is inevitable... Change for the better is not.
July 14, 2009 at 6:34 am
Jeff Moden (7/14/2009)
I haven't had the time to look at all the fine suggestions folks may have given, but if I look at your original query, I see the potential for a major slowdown in the form of an ORDER BY on a SUM. I believe a little "pre-aggregation" (as Peter Larson calls it) along with a little "Divide'N'Conquer" via a temp table would go a very long way in speeding up such a query.
That's already been done in Mel's initial post. The original query with the ORDER BY on the SUM was replaced with one using a table variable, since changed to a temp table.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 14, 2009 at 8:06 am
Just tried Jeff's suggestion, restructured the query to do a pre-aggregation with a CTE, and that slowed the performance down (more CPU time).
Mel Harbour
Project Manager
Red Gate Software
July 14, 2009 at 8:44 am
Mel Harbour (7/14/2009)
Just tried Jeff's suggestion, restructured the query to do a pre-aggregation with a CTE, and that slowed the performance down (more CPU time).
It will. A CTE, despite its name, is just a named subquery. Jeff was suggesting a temp table, which is exactly what the revised query you initially posted used (well, after we changed the table var to a temp table).
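To illustrate the point: a pre-aggregation written as a CTE (presumably something like the following was tried) is simply expanded inline when the outer statement runs, so nothing is materialised:
-- The CTE body is inlined into this statement's plan; only SELECT ... INTO
-- a temp table would actually store the aggregated rows
;WITH Agg AS (
SELECT UserID, SUM(PointsScored) AS AllPoints
FROM UserPoints
GROUP BY UserID
)
SELECT UserID, AllPoints
FROM Agg
ORDER BY AllPoints DESC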
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 14, 2009 at 9:38 am
Was Jeff referring to the Order By in the ROW_NUMBER() function?
On a side note, I think a summary of the steps and results of this thread would make a great article on how to performance-tune a query.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
July 14, 2009 at 10:07 am
I agree with Jack.
Also, Mel, could you post the latest copy of the solution? I've lost track of where you all got to.
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
Posting Best Practices
Numbers / Tally Tables
July 14, 2009 at 10:09 am
Sure!
The current query is:
CREATE TABLE #TopScores
(
UserID INT,
RecentPoints INT,
AllPoints INT,
RowNumber INT
)
CREATE CLUSTERED INDEX idx_TopScores_RowNumber ON #TopScores (RowNumber)
CREATE INDEX idx_TopScores_UserID ON #TopScores (UserID)
INSERT INTO #TopScores (
UserID,
RecentPoints,
AllPoints,
RowNumber
)
SELECT UserPoints.UserID,
SUM(CASE WHEN [Date] >= '2009-06-13' THEN [PointsScored] ELSE 0 END) AS RecentPoints,
SUM([PointsScored]) AS AllPoints,
ROW_NUMBER() OVER(ORDER BY SUM(CASE WHEN [Date] >= '2009-06-13' THEN [PointsScored] ELSE 0 END) DESC, SUM([PointsScored]) DESC) as RowNumber
FROM UserPoints
WHERE PointsCategory = ISNULL(NULL, PointsCategory)
AND UserID <> 57832 -- Exclude 'Site Owners', but do it explicitly by UserID to optimise query
GROUP BY [UserPoints].UserID
SELECT t.RowNumber, u.UserID, u.DisplayName, t.RecentPoints, t.AllPoints
FROM #TopScores t
JOIN Users u ON u.[UserID] = t.UserID
WHERE t.RowNumber <= 5 OR 769521 = t.UserID
ORDER BY t.RowNumber
DROP TABLE #TopScores
Performance-wise, the problems are all restricted to the INSERT-SELECT statement.
Mel Harbour
Project Manager
Red Gate Software
July 14, 2009 at 10:19 am
This might sound stupid, and I'm not sure if it makes a difference at all; I might need to do some testing.
But what about creating the clustered index on #TopScores after the actual insert?
Would that help at all?
I was under the impression that inserting into a table with a clustered index has slightly more overhead than building the clustered index after the load. This could be rubbish, so please let me know.
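For reference, the reordering being suggested would look something like this (a sketch based on Mel's latest query; whether it helps with only 6 final rows is another question):
-- Create the table with no indexes and load it first...
CREATE TABLE #TopScores
(
UserID INT,
RecentPoints INT,
AllPoints INT,
RowNumber INT
)
-- (the existing INSERT...SELECT from Mel's post goes here, unchanged)
-- ...then build the indexes on the populated table
CREATE CLUSTERED INDEX idx_TopScores_RowNumber ON #TopScores (RowNumber)
CREATE INDEX idx_TopScores_UserID ON #TopScores (UserID)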
----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley
Posting Best Practices
Numbers / Tally Tables