July 22, 2014 at 6:22 pm
Thank you for reading. I have a table (represented by #Events) that holds modifications made to another table. I do have some control over the table structure and indexing. I want to pull all of the change records that were made between two dates.
The tricky part is to include the previous version of each record, which will usually be found prior to the start date in question.
The code that I have provided below works. So you can use it to easily see what should be returned. But it's very slow in production.
Can you think of a better method to pull this data together?
Thank you in advance.
-- Production version of this table has 4.5 million rows (roughly 1,000 rows per day)
-- Primary key is on L4Ident (clustered)
-- nonclustered index on ProcessDate, LinkRL4
DROP TABLE dbo.#Events;
DROP TABLE dbo.#Results;
CREATE TABLE dbo.#Events (
L4Ident int IDENTITY(1,1) NOT NULL,
LinkRL4 int NOT NULL,
ProcessDate date NOT NULL,
UserName char(10) NOT NULL,
SequenceNumber int NOT NULL,
JobDescription char(10) NOT NULL,
EntryType char(7) NOT NULL,
ChangingField char(10) NOT NULL);
CREATE TABLE dbo.#Results (
ResultsID int IDENTITY(1,1) NOT NULL,
L4Ident int NOT NULL,
LinkRL4 int NOT NULL,
ProcessDate date NOT NULL,
UserName char(10) NOT NULL,
SequenceNumber int NOT NULL,
JobDescription char(10) NOT NULL,
EntryType char(7) NOT NULL,
ChangingField char(10) NOT NULL,
RowNumber smallint NOT NULL);
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (111, '2014-06-01', 'Michael', 1, 'Daily', 'INSERT', 'GM'),
(111, '2014-06-09', 'Michael', 12, 'Daily', 'UPDATE', 'GMC'),
(111, '2014-06-15', 'Rachel', 19, 'Daily', 'UPDATE', 'GMotors'),
(111, '2014-07-01', 'Rachel', 24, 'Daily', 'UPDATE', 'Motors'),
(111, '2014-07-05', 'Rachel', 39, 'Daily', 'UPDATE', 'GenMotors');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (112, '2014-06-04', 'Thomas', 6, 'Daily', 'INSERT', 'MB'),
(112, '2014-06-07', 'Thomas', 10, 'Daily', 'UPDATE', 'Mercedes'),
(112, '2014-06-15', 'Thomas', 20, 'Daily', 'UPDATE', 'MBenz'),
(112, '2014-07-03', 'Michael', 27, 'Daily', 'UPDATE', 'Benz'),
(112, '2014-07-05', 'Rachel', 40, 'Daily', 'UPDATE', 'Mer Benz');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (113, '2014-07-06', 'Michael', 55, 'Daily', 'INSERT', 'BMW');
TRUNCATE TABLE dbo.#Results;
DECLARE @StartDate date = '2014-06-12',
@EndDate date = '2014-07-06';
DECLARE @L4Ident int;
DECLARE @CurrentLinkRL4 int,
@MaxLinkRL4 int;
SELECT @CurrentLinkRL4 = MIN(LinkRL4),
@MaxLinkRL4 = MAX(LinkRL4)
FROM dbo.#Events
WHERE ProcessDate BETWEEN @StartDate AND @EndDate;
WHILE ((@CurrentLinkRL4 <= @MaxLinkRL4) AND (@CurrentLinkRL4 IS NOT NULL))
BEGIN
-- insert all records that match within the date range
INSERT dbo.#Results (
L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber)
SELECT L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber = RANK() OVER (PARTITION BY LinkRL4 ORDER BY L4Ident)
FROM dbo.#Events
WHERE LinkRL4 = @CurrentLinkRL4
AND ProcessDate BETWEEN @StartDate AND @EndDate;
-- find the record immediately prior to the date range
SELECT @L4Ident = MIN(L4Ident)
FROM dbo.#Results
WHERE LinkRL4 = @CurrentLinkRL4;
SELECT @L4Ident = MAX(L4Ident)
FROM dbo.#Events
WHERE LinkRL4 = @CurrentLinkRL4
AND L4Ident < @L4Ident;
-- add one record immediately prior to the date range
INSERT dbo.#Results (
L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber)
SELECT L4Ident,
LinkRL4,
ProcessDate,
UserName,
SequenceNumber,
JobDescription,
EntryType,
ChangingField,
RowNumber = 0
FROM dbo.#Events
WHERE L4Ident = @L4Ident;
-- increment loop counter
SELECT @CurrentLinkRL4 = MIN(LinkRL4)
FROM dbo.#Events
WHERE ProcessDate BETWEEN @StartDate AND @EndDate
AND LinkRL4 > @CurrentLinkRL4;
END
SELECT *
FROM dbo.#Results
ORDER BY LinkRL4, ProcessDate, RowNumber
July 23, 2014 at 4:48 am
You could use a single query like below. I have not tested it against a 4.5 million row input set, but it's probably faster then the loop and all the separate queries you are using right now. I recommend to include the column LinkRL4 to the non-clustered index on ProcessDate. If the zero-offset row numbering in case of a previous row outside the data range is not important, remove the CASE expression.
SELECT
evt.*,
RowNumber = ROW_NUMBER() OVER (PARTITION BY evt.LinkRL4 ORDER BY evt.L4Ident) -
CASE WHEN MIN(evt.ProcessDate) OVER (PARTITION BY evt.LinkRL4) < @StartDate THEN 1 ELSE 0 END
FROM
dbo.#Events evt
WHERE
evt.ProcessDate <= @EndDate
AND NOT EXISTS
(
SELECT
*
FROM
dbo.#Events evt2
WHERE
evt2.LinkRL4 = evt.LinkRL4
AND evt2.ProcessDate > evt.ProcessDate
AND evt2.ProcessDate < @StartDate
)
ORDER BY
evt.LinkRL4, evt.ProcessDate, evt.L4Ident
July 31, 2014 at 12:09 pm
Peter, I think you're on right track, but there is a flaw. The initial sample data does not show it, but when used with date range of 6/12 - 7/01 this data does:
INSERTdbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (111, '2014-06-01', 'Michael', 1, 'Daily', 'INSERT', 'GM'),
(111, '2014-06-09', 'Michael', 12, 'Daily', 'UPDATE', 'GMC'),
(111, '2014-06-15', 'Rachel', 19, 'Daily', 'UPDATE', 'GMotors'),
(111, '2014-07-01', 'Rachel', 24, 'Daily', 'UPDATE', 'Motors'),
(111, '2014-07-05', 'Rachel', 39, 'Daily', 'UPDATE', 'GenMotors');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (112, '2014-06-04', 'Thomas', 6, 'Daily', 'INSERT', 'MB'),
(112, '2014-06-07', 'Thomas', 10, 'Daily', 'UPDATE', 'Mercedes'),
(112, '2014-06-15', 'Thomas', 20, 'Daily', 'UPDATE', 'MBenz'),
(112, '2014-07-03', 'Michael', 27, 'Daily', 'UPDATE', 'Benz'),
(112, '2014-07-05', 'Rachel', 40, 'Daily', 'UPDATE', 'Mer Benz');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (113, '2014-07-06', 'Michael', 55, 'Daily', 'INSERT', 'BMW');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (114, '2014-06-01', 'Steve', 4, 'Daily', 'INSERT', 'VW'),
(114, '2014-06-02', 'Steve', 60, 'Daily', 'UPDATE', 'Volks');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (115, '2014-06-01', 'Stan', 2, 'Daily', 'INSERT', 'BMW'),
(115, '2014-06-02', 'Stan', 3, 'Daily', 'UPDATE', 'Beamer'),
(115, '2014-08-02', 'Stan', 80, 'Daily', 'UPDATE', 'BMW');
INSERT dbo.#Events (LinkRL4, ProcessDate, UserName, SequenceNumber, JobDescription, EntryType, ChangingField)
VALUES (116, '2014-06-25', 'Lee', 32, 'Daily', 'INSERT', 'BMW'),
(117, '2014-08-02', 'Lee', 81, 'Daily', 'UPDATE', 'BMW');
GO
July 31, 2014 at 12:10 pm
I think this works:
--first part gets events within the date range
SELECTevt.*
FROM dbo.#Events evt
WHERE evt.ProcessDate BETWEEN @StartDate AND @EndDate
--second part gets related events that occurred before
UNION ALL
SELECTPrior.*
FROM (SELECT LinkRL4, MIN(ProcessDate) AS ProcessDate FROM dbo.#Events evt WHERE ProcessDate BETWEEN @StartDate AND @EndDate GROUP BY LinkRL4) AS MinDate
CROSS APPLY (
SELECT TOP 1 evt.*
FROM dbo.#Events evt
WHERE evt.LinkRL4 = MinDate.LinkRL4
AND evt.ProcessDate < MinDate.ProcessDate
ORDER BY evt.ProcessDate DESC) as Prior
ORDER BY LinkRL4, ProcessDate, L4Ident
August 1, 2014 at 11:25 am
I'd also create the best indexes while re-working the code.
#Events indexes:
1) Unique Clustered on ( ProcessDate, L4Ident ) --in that order
2) Nonclustered on ( L4Ident )
3) NNonclustered on ( LinkRL4, L4Ident ) Include ( ProcessDate ) --yes, processdate would be auto-included anyway, but I like to explicitly specify it anyway, in case the clus index changes later.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply