Query performance help (ready-to-use sample data included)

Question

Post reply

Query performance help (ready-to-use sample data included)

ColdCoffee

SSC-Dedicated

Points: 39972
More actions
March 18, 2016 at 3:12 pm

#317082

Hello folks.. i have the following requirement (the table names are modified) :
1. For every row in OrderDetails table with ActionCode not equal to 'V' , check the last inserted row in the OrderDetailsHistory to see if it "V"
2. If it is "V" , then do not pick the row
3. Pick up all rows in OrderDetails with ActionCode = 'V' (irrespective of a whether a row exists in OrderDetailsHistory or not)
4. Pick up all rows that are exisitng only in OrderDetails (irrespective of a whether a row exists in OrderDetailsHistory or not)
-- Expected result is attached in "My tries"
Sample data
DECLARE @OrderDetails TABLE ( OrderID INT, ActionCode CHAR(1))
DECLARE @OrderDetailsHistory TABLE (TSID INT IDENTITY(1,1), OrderID INT, ActionCode CHAR(1))
INSERT INTO @OrderDetails VALUES
( 1, 'N') , (2,'A') , (3,'B') ,(4,'C') ,(5,'D') ,(6,'V'),(7,'V'),(8,'N') ,(9,'N') ,(10,'V')
INSERT INTO @OrderDetailsHistory VALUES
(1,'N'),(1,'N'),(1,'V'),(2,'N'),(2,'N'),(3,'N'),(3,'V'),(4,'N'),(5,'N'),(5,'N'),(5,'V'),(6,'N')
,(6,'V'),(7,'N'),(7,'N'),(7,'N'),(5,'N'),(5,'N')
My tries:
SELECT *
FROM @OrderDetails OD
OUTER APPLY (SELECT TOP 1 TSActionCode = ODH.ActionCode
FROM @OrderDetailsHistory ODH
WHERE ODH.OrderID = OD.OrderID
ORDER BY ODH.TSID DESC
) OtrApp
WHERE NOT (OD.ActionCode != 'V' AND ISNULL(OtrApp.TSActionCode,OD.ActionCode) = 'V')
SELECT *
FROM @OrderDetails PT
WHERE NOT EXISTS
(
SELECT 1
WHERE PT.ActionCode != 'V' AND
(
SELECT TOP 1 TSActionCode = ODH.ActionCode
FROM @OrderDetailsHistory ODH
WHERE ODH.OrderID = PT.OrderID
ORDER BY ODH.TSID DESC
) = 'V'
)
Now both of the above code gives me results that i expect; but is there a better version? i tried all combination using INTERSECT, EXCEPT, UNION, but nothing is as performant as the above. Both the table have at least 300 million rows while also being heavily active. Any help is appreciated.
PS: I cannot change the schema ; can only add indexes.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

ScottPletcher SSC Guru Points: 101098 More actions · Answer 1

Don't see anything better at a relatively quick look, although I would get rid of the double negative logic:

SELECT *

FROM #OrderDetails OD

WHERE OD.ActionCode = 'V' OR ISNULL(

(SELECT TOP (1) TSActionCode = ODH.ActionCode

FROM #OrderDetailsHistory ODH

WHERE ODH.OrderID = OD.OrderID

ORDER BY ODH.TSID DESC

), '-') <> 'V'

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Eirikur Eiriksson SSC Guru Points: 182911 More actions · Answer 2

Quick suggestions: don't use table variables for any kind of serious testing and include covering indices to eliminate sorts and key lookups.

😎

IF OBJECT_ID(N'dbo.OrderDetails') IS NOT NULL DROP TABLE dbo.OrderDetails;

CREATE TABLE dbo.OrderDetails ( OrderID INT NOT NULL PRIMARY KEY CLUSTERED, ActionCode CHAR(1))

IF OBJECT_ID(N'dbo.OrderDetailsHistory') IS NOT NULL DROP TABLE dbo.OrderDetailsHistory;

CREATE TABLE dbo.OrderDetailsHistory (TSID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED, OrderID INT, ActionCode CHAR(1))

INSERT INTO dbo.OrderDetails VALUES

( 1, 'N') , (2,'A') , (3,'B') ,(4,'C') ,(5,'D') ,(6,'V'),(7,'V'),(8,'N') ,(9,'N') ,(10,'V')

INSERT INTO dbo.OrderDetailsHistory VALUES

(1,'N'),(1,'N'),(1,'V'),(2,'N'),(2,'N'),(3,'N'),(3,'V'),(4,'N'),(5,'N'),(5,'N'),(5,'V'),(6,'N')

,(6,'V'),(7,'N'),(7,'N'),(7,'N'),(5,'N'),(5,'N')

CREATE NONCLUSTERED INDEX NCLIDX_DBO_ORDERDETAILSHISTORY_TSID_INCL_ACTIONCODE ON dbo.OrderDetailsHistory (TSID ASC) INCLUDE (ActionCode);

SELECT *

FROM dbo.OrderDetails OD

OUTER APPLY (SELECT TOP 1 TSActionCode = ODH.ActionCode

FROM dbo.OrderDetailsHistory ODH

WHERE ODH.OrderID = OD.OrderID

ORDER BY ODH.TSID DESC

) OtrApp

WHERE NOT (OD.ActionCode != 'V' AND ISNULL(OtrApp.TSActionCode,OD.ActionCode) = 'V')

SELECT *

FROM dbo.OrderDetails PT

WHERE NOT EXISTS

(

SELECT 1

WHERE PT.ActionCode != 'V' AND

(

SELECT TOP 1 TSActionCode = ODH.ActionCode

FROM dbo.OrderDetailsHistory ODH

WHERE ODH.OrderID = PT.OrderID

ORDER BY ODH.TSID DESC

) = 'V'

)

Sachin Nandanwar SSCertifiable Points: 7699 More actions · Answer 3

Only scope of improvement I can see in the given state is to add an NC index on #OrderDetailsHistory for columns(TSID DESC,ActionCode,OrderID).

CREATE INDEX NCI_OrderDetailsHistory

ON #OrderDetailsHistory(TSID DESC,ActionCode,OrderID)

The APPLY clause needs to combined with an order by to return the correct resultset and adding the above index will completely eliminate the sort operation of the APPLY clause as the optimizer would just have to scan the NC index to match the records returned in the correct order without the need for an explicit sort .Make sure you mark all the columns as part of the index that are to be returned by the select clause else the optimizer would have to resort to a lookup operation.Scan operations with lookups are always expensive.

APPLY clause is notorious for sticking only to nested loop joins even for very large inputs which cane be detrimental unless it intuitively assumes that the APPLY clause is logically equivalent to a inner/left join and would only then consider applying hash or merge joins.

My solution:

SELECT *

FROM (SELECT OD.orderid,

OD.actioncode,

(SELECT TOP 1 ODH.actioncode

FROM orderdetailshistory ODH

WHERE ODH.orderid = OD.orderid

ORDER BY ODH.tsid DESC) ODH

FROM orderdetails OD)T

WHERE odh != 'V' OR Isnull(actioncode, odh) != 'V'

Almost same as the one you posted with APPLY clause.Would be interesting to know, what physical join the optimizer resorts to for a larger input.

--------------------------------------------------------------------------------------------------
I am just an another naive wannabe DBA trying to learn SQL Server

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 4

Thanks for the advices and tips guys. Regarding indexes, i have tried all sorts of indexes (filtered/covered/clustered/non-clustered), and have optimized the query. But I was more interested to know if there are any special ways to work this, like quirky update or something but cant figure out.

Sergiy SSC Guru Points: 110208 More actions · Answer 5

ColdCoffee (3/18/2016)
Hello folks.. i have the following requirement (the table names are modified) :
1. For every row in OrderDetails table with ActionCode not equal to 'V' , check the last inserted row in the OrderDetailsHistory to see if it "V"
2. If it is "V" , then do not pick the row
3. Pick up all rows in OrderDetails with ActionCode = 'V' (irrespective of a whether a row exists in OrderDetailsHistory or not)
4. Pick up all rows that are exisitng only in OrderDetails (irrespective of a whether a row exists in OrderDetailsHistory or not)

You have not described one scenario:

1a. Row in OrderDetails table with ActionCode not equal to 'V' , with the last inserted row in the OrderDetailsHistory is "V" - what to do in this case?

_____________
Code for TallyGenerator