October 31, 2011 at 7:53 am
We've got a database dump from a case management system. In some DW I'm doing in my first few days of using T-SQL I'm writing a report which is pulling information from about 15 different tables, so at the moment I'm using a lot of left joins. One of the pieces of information I need to pull out is a last action date from the history of each matter. There are nearly a million history entries over all matters, so trying to pull this data out isn't very efficient and also for some reason not that accurate (sometimes pulling out three rows for one matter, incorporating three history dates.
In the absence of a "bottom" code to counter the "TOP" I've used the following:
LEFT JOIN Logical.sys_HS ON Logical.sys_mt.MT05 = Logical.sys_hs.hs04
AND Logical.sys_hs.hs01 =
(
SELECT TOP 1 hs01
FROM Logical.sys_hs AS InnerSelect2
WHERE logical.sys_mt.mt05 = InnerSelect2.hs04
ORDER BY hs02 desc
)
Can anyone give me a pointer on how I can A) ensure only one row per matter and b) make the query faster (there should be just under 6000 rows and this takes almost 5 mins to run)
Thanks all.
October 31, 2011 at 9:51 am
There are several ways of handling these kinds of problems, and it's hard to provide specifics without more information on your table structures and so forth. However, here's a quick example with some fake "Case" and "History" data that demonstrates one way of solving this by using a derived table. The CTE (Common Table Expression) isn't strictly needed, but it helps keep the main query clean and legible. Again, this is just a quick example, you'll have to do some work to adapt the concept to your own data and schema.
-- Create two table variables to contain Case and History data
DECLARE @Case AS TABLE
(
CaseID INT,
CaseDesc VARCHAR(100),
PRIMARY KEY (CaseID)
)
DECLARE @History AS TABLE
(
CaseID INT,
HistID INT,
HistDate DATETIME,
HistDesc VARCHAR(100),
PRIMARY KEY (HistID)
)
-- Populate Case and History tables with some sample data
INSERT INTO @Case (CaseID, CaseDesc)
SELECT 1, 'Test Case' UNION
SELECT 2, 'Test Case 2' UNION
SELECT 3, 'Another Case' UNION
SELECT 4, 'Case #4';
INSERT INTO @History (CaseID, HistID, HistDate, HistDesc)
SELECT 1, 1, '1/1/2011', 'Case 1 history 1' UNION
SELECT 1, 2, '2/1/2011', 'Case 1 history 2' UNION
SELECT 1, 3, '3/3/2011', 'Case 1 history 3' UNION
SELECT 2, 4, '2/2/2011', 'Case 2 history 1' UNION
SELECT 2, 5, '4/1/2011', 'Case 2 history 2' UNION
SELECT 3, 6, '3/3/2011', 'Case 3 history 1' UNION
SELECT 4, 7, '1/1/2011', 'Case 4 history 1' UNION
SELECT 4, 8, '2/1/2011', 'Case 4 history 2';
-- Declare a CTE that contains the latest History row for each Case
-- Note the use of the derived table
WITH MaxHist
AS
(
SELECT H1.CaseID, H1.HistID, H1.HistDate, H1.HistDesc
FROM @History AS H1 INNER JOIN
(
SELECT CaseID, MAX(HistDate) AS HistDate
FROM @History
GROUP BY CaseID
) AS H2 ON H1.CaseID = H2.CaseID AND
H1.HistDate = H2.HistDate
)
-- Join the Case table to the MaxHist CTE to get most recent history details for each case
SELECT C.CaseID, C.CaseDesc, H.HistID, H.HistDate, H.HistDesc
FROM @Case AS C INNER JOIN
MaxHist AS H ON C.CaseID = H.CaseID
October 31, 2011 at 9:58 am
JonFox (10/31/2011)
There are several ways of handling these kinds of problems, and it's hard to provide specifics without more information on your table structures and so forth. However, here's a quick example with some fake "Case" and "History" data that demonstrates one way of solving this by using a derived table. The CTE (Common Table Expression) isn't strictly needed, but it helps keep the main query clean and legible. Again, this is just a quick example, you'll have to do some work to adapt the concept to your own data and schema.
-- Create two table variables to contain Case and History data
DECLARE @Case AS TABLE
(
CaseID INT,
CaseDesc VARCHAR(100),
PRIMARY KEY (CaseID)
)
DECLARE @History AS TABLE
(
CaseID INT,
HistID INT,
HistDate DATETIME,
HistDesc VARCHAR(100),
PRIMARY KEY (HistID)
)
-- Populate Case and History tables with some sample data
INSERT INTO @Case (CaseID, CaseDesc)
SELECT 1, 'Test Case' UNION
SELECT 2, 'Test Case 2' UNION
SELECT 3, 'Another Case' UNION
SELECT 4, 'Case #4';
INSERT INTO @History (CaseID, HistID, HistDate, HistDesc)
SELECT 1, 1, '1/1/2011', 'Case 1 history 1' UNION
SELECT 1, 2, '2/1/2011', 'Case 1 history 2' UNION
SELECT 1, 3, '3/3/2011', 'Case 1 history 3' UNION
SELECT 2, 4, '2/2/2011', 'Case 2 history 1' UNION
SELECT 2, 5, '4/1/2011', 'Case 2 history 2' UNION
SELECT 3, 6, '3/3/2011', 'Case 3 history 1' UNION
SELECT 4, 7, '1/1/2011', 'Case 4 history 1' UNION
SELECT 4, 8, '2/1/2011', 'Case 4 history 2';
-- Declare a CTE that contains the latest History row for each Case
-- Note the use of the derived table
WITH MaxHist
AS
(
SELECT H1.CaseID, H1.HistID, H1.HistDate, H1.HistDesc
FROM @History AS H1 INNER JOIN
(
SELECT CaseID, MAX(HistDate) AS HistDate
FROM @History
GROUP BY CaseID
) AS H2 ON H1.CaseID = H2.CaseID AND
H1.HistDate = H2.HistDate
)
-- Join the Case table to the MaxHist CTE to get most recent history details for each case
SELECT C.CaseID, C.CaseDesc, H.HistID, H.HistDate, H.HistDesc
FROM @Case AS C INNER JOIN
MaxHist AS H ON C.CaseID = H.CaseID
You can also use ROW_NUMBER here
WITH CTE AS (
SELECT C.CaseID, C.CaseDesc, H.HistID, H.HistDate, H.HistDesc,
ROW_NUMBER() OVER(PARTITION BY C.CaseID ORDER BY H.HistDate DESC) AS rn
FROM @Case AS C
INNER JOIN @History AS H ON H.CaseID=C.CaseID)
SELECT CaseID, CaseDesc, HistID, HistDate, HistDesc
FROM CTE
WHERE rn=1;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537November 1, 2011 at 5:02 am
Thanks guys - will give this a go.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply