January 15, 2009 at 9:48 pm
I'm getting a major slowdown on a Top N sort.
Here's the part that so slow:
UPDATE #events SET userID =
(SELECT TOP 1 userID FROM #events e2
WHERE e2.coCRN = e.coCRN AND e2.deptCRN = e.deptCRN AND e2.county = e.county AND e2.orderNo = e.orderNo
ORDER BY e2.logID)
FROM #events e
I've been searching and searching looking for a different methoed to avoid this slow sort. I can't figure out a way to go within the subquery. This temp table has about 433k records, and I've tried all kinds of indexes...but still slow.
Any ideas?
January 15, 2009 at 10:45 pm
Can you show us the execution plan.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
January 16, 2009 at 5:55 am
You are using correlated sub-queries, which is another form of RBAR..., to know more about RBAR and its performance implications see http://www.sqlservercentral.com/links/276514/84234.
Based on the query, what I understood is that you want to update the userID based on some criteria on the same table and taking only the first matching record.
Here is what I came up with....
UPDATEE1
SETE1.userID = E2.userID
FROM#events E1
INNER JOIN
(
SELECTROW_NUMBER() OVER( PARTITION BY coCRN, deptCRN, county, orderNo ORDER BY logID ) AS RowNumber,
coCRN, deptCRN, county, orderNo, userID
FROM#events
) E2 ON E1.coCRN = E2.coCRN AND E1.deptCRN = E2.deptCRN AND E1.county = E2.county AND E1.orderNo = E2.orderNo
WHEREE2.RowNumber = 1
Further, you can add index on the columns specified in where clause and specify the userID column in the INCLUDE clause.
--Ramesh
January 16, 2009 at 6:17 am
Ramesh (1/16/2009)
You are using correlated sub-queries, which is another form of RBAR...,
The optimiser can handle most correlated subqueries just fine. If the comparison with the outer query is an equality, the subquery won't be processed row by row.
If it's an inequality comparison, it's a whole 'nother story.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 16, 2009 at 6:17 am
This may be doable with a preaggregated derived table instead of a correlated subquery:
-- current query
SELECT e.coCRN, e.deptCRN, e.county, e.orderNo, MAXuserID = (SELECT TOP 1 userID FROM #events e2
WHERE e2.coCRN = e.coCRN AND e2.deptCRN = e.deptCRN AND e2.county = e.county AND e2.orderNo = e.orderNo
ORDER BY e2.logID)
FROM #events e
-- alternative query
SELECT e.coCRN, e.deptCRN, e.county, e.orderNo, MAXuserID
FROM #events e
INNER JOIN (SELECT MAX(userID) AS MAXuserID, coCRN, deptCRN, county, orderNo
FROM #events
GROUP BY coCRN, deptCRN, county, orderNo
) e2 ON e2.coCRN = e.coCRN
AND e2.deptCRN = e.deptCRN
AND e2.county = e.county
AND e2.orderNo = e.orderNo
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 16, 2009 at 6:23 am
I wish it was available online, but if you can track down the final copy of the SQL Standard, I wrote an article comparing TOP, MAX and ROW_NUMBER as a mechanism for getting versioned data. Depending on the data sets involved, TOP works best on larger data sets and ROW_NUMBER works best on smaller data sets. But you need to test in your environment to be sure.
Generally, something along the lines of the following works best for larger data sets:
SELECT ...
FROM X
WHERE X.Val = (SELECT TOP(1) x2.Val
FROM X AS x2
WHERE X.id = x2.Id
ORDER BY x.Val DESC)
You have to have the right indexes in place of course because table/index scans will kill performance. You can also express it like a JOIN:
SELECT...
FROM X AS x
JOIN X as x2
ON x.ID = x2.ID
AND x2.Val = (SELECT TOP(1).... -- you get the picture
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 16, 2009 at 6:27 am
GilaMonster (1/16/2009)
Ramesh (1/16/2009)
You are using correlated sub-queries, which is another form of RBAR...,The optimiser can handle most correlated subqueries just fine. If the comparison with the outer query is an equality, the subquery won't be processed row by row.
If it's an inequality comparison, it's a whole 'nother story.
Thanks gail, for letting me know about the optimizer and correcting me. One thing, is it do the same in all environments? Or is it dependent on other factors?
--Ramesh
January 16, 2009 at 6:48 am
Ramesh (1/16/2009)
Thanks gail, for letting me know about the optimizer and correcting me. One thing, is it do the same in all environments? Or is it dependent on other factors?
What environments or factors are you thinking of?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 16, 2009 at 6:58 am
I meant to ask you that does the optimizer always process the two tables/queries separately (i.e. not a row-by-row process as you've mentioned) and do a equality join, similar to an INNER JOIN, irrespective of the amount of data, sql 2005 edition, hardware etc.?
--Ramesh
January 16, 2009 at 7:17 am
They're not processed separately, that's the point. SQL can and does 'de-correlate' the subquery and create an exec plan that processes the inner and outer query together.
Take this example I did in a presentation a couple months back
USE AdventureWorks
GO
SET NOCOUNT ON
DECLARE @StartTime DATETIME
SET @StartTime = GETDATE()
SELECT productId, productnumber,
(SELECT SUM(LineTotal) SumTotal FROM Sales.SalesOrderDetail sd WHERE sd.productID = p.productid )
FROM Production.Product p
WHERE ProductNumber like 'bk%'
print 'Duration with correlated subqueries: ' + CAST(DATEDIFF(ms, @StartTime, getdate()) as varchar(10)) + ' ms'
SET @StartTime = GETDATE()
SELECT p.ProductID, ProductNumber, SumTotal
FROM Production.Product p
INNER JOIN (
SELECT productid, SUM(LineTotal) SumTotal
FROM Sales.SalesOrderDetail
GROUP BY ProductID
) SalesTotals on p.ProductID = SalesTotals.ProductID
WHERE ProductNumber like 'bk%'
print 'Duration with derived table: ' + CAST(DATEDIFF(ms, @StartTime, getdate()) as varchar(10)) + ' ms'
Near-identical execution plans and durations
(run it twice and take the second time, the first time the data and plan caches are cold)
As far as I know, it doesn't depend on edition, data, day of the year, etc
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 16, 2009 at 7:26 am
Hey Gail, firstly thank you very much for taking time in explaining me the concept and providing me with a working example.
Secondly, you were right, it does provide the same plans and complete in approximately same durations.
--Ramesh
January 16, 2009 at 7:28 am
THANK YOU ALL SOOOOO MUCH!!
I really appreciate and am in awe of your skills.
You all may have saved my job!!!
Good karma to you all, and if any of you are ever in Sunny West Palm Beach - DRINKS ARE ON ME!!!
January 16, 2009 at 7:31 am
Hi Krypto
Which solution did you decide to use? Would it be possible for you to post it, for the potential benefit of others who may experience the same or a similar problem? Thanks!
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
January 16, 2009 at 8:34 am
GilaMonster (1/16/2009)
If the comparison with the outer query is an equality, the subquery won't be processed row by row.
I pretty sure it will with an ORDER BY...
--Jeff Moden
Change is inevitable... Change for the better is not.
January 16, 2009 at 8:50 am
Jeff Moden (1/16/2009)
GilaMonster (1/16/2009)
If the comparison with the outer query is an equality, the subquery won't be processed row by row.I pretty sure it will with an ORDER BY...
Order by where? In the subquery? In the outer query?
I'll test, I have code set up. I'm just not sure what to add where.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 15 posts - 1 through 15 (of 32 total)
You must be logged in to reply to this topic. Login to reply