Populate Start and End Time from Previous End Time

Question

Populate Start and End Time from Previous End Time

MattieNH

SSCertifiable

Points: 7463
More actions
December 15, 2011 at 8:49 am

#264869

--I need to compute a series of starting and ending datetime values,
--based on the value of the previous row in a table.
--(For those of you with a procedural programming language
--background, a control break.) If you run the following
--code, you'll see that what I get is a start and end time for the first
--occurrence on the control break fields:
--EmployeeID, Department, and LeaveDate. What I can't figure out
--how to compute is the subsequent start and end times.
--I've attached a spreadsheet that shows the results I want. The
--items I don't know how to populated are shown in red.
--I've also attached the code below as a txt file, which may be easier
--to read.
--Thanks for your help,
Mattie
DECLARE @StartHourAS TINYINT= 8
DECLARE @StartTimeAS TIME
SET@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
CREATE TABLE #Leave(
EmployeeIdINT ,
DepartmentCHAR(4),
TotalHoursInMinutesINT,
LeaveDateDATE,
StartDateDATETIME,
EndDateDATETIME)
INSERT INTO #Leave
( EmployeeId ,
Department ,
TotalHoursInMinutes,
LeaveDate )
VALUES(1, '1200', 450, '6/1/2011'),
(1, '1200', 60, '6/2/2011'),
(1, '1200', 30, '6/2/2011'),
(1, '1200', 90, '6/2/2011'),
(1, '2000', 60, '6/2/2011'),
(2, '1200', 30, '6/1/2011'),
(2, '1200', 90, '6/1/2011')
UPDATE#Leave
SETStartDate = CAST(CONVERT(DATETIME, l.LeaveDate, 101) + @StartTime AS DATETIME),
EndDate = DATEADD(mi, l.TotalHoursInMinutes, CAST(CONVERT(DATETIME, l.LeaveDate, 101) + @StartTime AS DATETIME))
FROM#Leavel
INNER JOIN (
SELECTEmployeeId ,
LeaveDate ,
Department,
Max(TotalHoursInMinutes)AS MaxTotalHoursInMinutes
FROM#Leave
WHEREStartDate IS NULL
GROUP BY
EmployeeId ,
LeaveDate,
Department)t
ONl.EmployeeId = t.EmployeeID
ANDl.LeaveDate = t.LeaveDate
ANDl.Department = t.Department
ANDl.TotalHoursInMinutes = t.MaxTotalHoursInMinutes
SELECT*
FROM#Leave l
ORDER BY
EmployeeId,
Department,
LeaveDate,
StartDate DESC
DROP TABLE #Leave

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

drew.allen SSC Guru Points: 77014 More actions · Answer 1

This is essentially a running total. Jeff has a very good article on calculating running totals: Solving the Running Total and Ordinal Rank Problems (Rewritten)[/url]. Be sure that you read the discussion for that particular article.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

MattieNH SSCertifiable Points: 7463 More actions · Answer 2

Hi Drew,

I had sort of realized that, and tried to implement the 'Quirky Update' running total solution referenced in that article (19 pages printed), but probably attributable to my lack of understanding, I couldn't get it to work.

Here's the code I tried for that. One I get the MinutesRunningCount value updated, I can compute the Start and End dates from that. Can you see what I'm doing wrong here, or recommend another approach from that article?

Thanks,

Mattie

DECLARE @StartHourTINYINT= 8

DECLARE @StartTimeTIME

DECLARE @PrevEmployeeIDINT= -1

DECLARE @PrevLeaveDateDATE= '1/1/1900'

DECLARE @PrevDepartmentVARCHAR(4)= 'x'

DECLARE @MinutesRunningCountINT= 0

SET@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'

CREATE TABLE #Leave(

EmployeeIdINT ,

DepartmentCHAR(4),

TotalHoursInMinutesINT,

MinutesRunningCountINT,

LeaveDateDATE,

StartDateDATETIME,

EndDateDATETIME)

INSERT INTO #Leave

( EmployeeId ,

Department ,

TotalHoursInMinutes,

LeaveDate,

MinutesRunningCount )

VALUES(1, '1200', 450, '6/1/2011', 0),

(1, '1200', 60, '6/2/2011', 0),

(1, '1200', 30, '6/2/2011', 0),

(1, '1200', 90, '6/2/2011', 0),

(1, '2000', 60, '6/2/2011', 0),

(2, '1200', 30, '6/1/2011', 0),

(2, '1200', 90, '6/1/2011', 0)

UPDATE #Leave

SET @MinutesRunningCount = MinutesRunningCount = CASE

WHEN EmployeeID = @PrevEmployeeID

AND LeaveDate = @PrevLeaveDate

AND Department = @PrevDepartment

THEN @MinutesRunningCount + MinutesRunningCount

ELSE MinutesRunningCount

END,

@PrevEmployeeID = EmployeeID,

@PrevLeaveDate = LeaveDate,

@PrevDepartment = Department

FROM #Leave l

SELECT*

FROM#Leave l

ORDER BY

EmployeeId,

Department,

LeaveDate,

StartDate DESC

DROP TABLE #Leave

drew.allen SSC Guru Points: 77014 More actions · Answer 3

You're using the wrong field for your current value to add to the running total. You're using MinutesRunningCount when you should be using TotalHoursInMinutes. (BTW why isn't this just TotalMinutes?)

Just change the field in two lines of your code.

THEN @MinutesRunningCount + TotalHoursInMinutes

ELSE TotalHoursInMinutes

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

MattieNH SSCertifiable Points: 7463 More actions · Answer 4

Argh. I was so convinced it was something totally outside my grasp (page splits, clustered indexes) that I never thought to double check the actual fields ("The simplest explanation for some phenomenon is more likely to be accurate than more complicated explanations." ).

Thanks so much. The reason I named it TotalHoursInMinutes is because somewhere along the line I had a column called TotalHours, and that name seemed like a better reminder about its origin.

Thanks again,

Mattie

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 5

MattieNH (12/15/2011)
Hi Drew,
I had sort of realized that, and tried to implement the 'Quirky Update' running total solution referenced in that article (19 pages printed), but probably attributable to my lack of understanding, I couldn't get it to work.
Here's the code I tried for that. One I get the MinutesRunningCount value updated, I can compute the Start and End dates from that. Can you see what I'm doing wrong here, or recommend another approach from that article?
Thanks,
Mattie
DECLARE @StartHourTINYINT= 8
DECLARE @StartTimeTIME
DECLARE @PrevEmployeeIDINT= -1
DECLARE @PrevLeaveDateDATE= '1/1/1900'
DECLARE @PrevDepartmentVARCHAR(4)= 'x'
DECLARE @MinutesRunningCountINT= 0
SET@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
CREATE TABLE #Leave(
EmployeeIdINT ,
DepartmentCHAR(4),
TotalHoursInMinutesINT,
MinutesRunningCountINT,
LeaveDateDATE,
StartDateDATETIME,
EndDateDATETIME)
INSERT INTO #Leave
( EmployeeId ,
Department ,
TotalHoursInMinutes,
LeaveDate,
MinutesRunningCount )
VALUES(1, '1200', 450, '6/1/2011', 0),
(1, '1200', 60, '6/2/2011', 0),
(1, '1200', 30, '6/2/2011', 0),
(1, '1200', 90, '6/2/2011', 0),
(1, '2000', 60, '6/2/2011', 0),
(2, '1200', 30, '6/1/2011', 0),
(2, '1200', 90, '6/1/2011', 0)
UPDATE #Leave
SET @MinutesRunningCount = MinutesRunningCount = CASE
WHEN EmployeeID = @PrevEmployeeID
AND LeaveDate = @PrevLeaveDate
AND Department = @PrevDepartment
THEN @MinutesRunningCount + MinutesRunningCount
ELSE MinutesRunningCount
END,
@PrevEmployeeID = EmployeeID,
@PrevLeaveDate = LeaveDate,
@PrevDepartment = Department
FROM #Leave l
SELECT*
FROM#Leave l
ORDER BY
EmployeeId,
Department,
LeaveDate,
StartDate DESC
DROP TABLE #Leave

Please don't use the "Quirky Update" unless you follow all of the rules. You don't have the required Clustered Index, TABLOCKX, or MAXDOP. Go back and reread the rules. Yeah, I know... your code works as is... for now. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

MattieNH SSCertifiable Points: 7463 More actions · Answer 6

Hi Jeff,

Thanks. So is this what I want? It seems to return the correct data (although so did the other code). Changes are in bolded italics. The clustered index needs to be in the (control break) order I want the evaluation made, correct?

DECLARE @StartHour TINYINT = 8

DECLARE @StartTime TIME

DECLARE @PrevEmployeeID INT = -1

DECLARE @PrevLeaveDate DATE = '1/1/1900'

DECLARE @PrevDepartment VARCHAR(4) = 'x'

DECLARE @MinutesRunningCount INT = 0

SET @StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'

CREATE TABLE #Leave(

EmployeeId INT ,

Department CHAR(4),

TotalHoursInMinutes INT,

MinutesRunningCount INT,

LeaveDate DATE,

StartDate DATETIME,

EndDate DATETIME )

CREATE CLUSTERED INDEX IX_LeaveClusteredIndex

ON #Leave (EmployeeId, LeaveDate, Department, TotalHours)

INSERT INTO #Leave

(EmployeeId ,

Department ,

TotalHoursInMinutes,

LeaveDate,

MinutesRunningCount )

VALUES (1, '1200', 450, '6/1/2011', 0),

(1, '1200', 60, '6/2/2011', 0),

(1, '1200', 30, '6/2/2011', 0),

(1, '1200', 90, '6/2/2011', 0),

(1, '2000', 60, '6/2/2011', 0),

(2, '1200', 30, '6/1/2011', 0),

(2, '1200', 90, '6/1/2011', 0)

UPDATE #Leave

SET @MinutesRunningCount = MinutesRunningCount = CASE

WHEN EmployeeID = @PrevEmployeeID

AND LeaveDate = @PrevLeaveDate

AND Department = @PrevDepartment

THEN @MinutesRunningCount + TotalHoursInMinutes

ELSE TotalHoursInMinutes

END,

@PrevEmployeeID = EmployeeID,

@PrevLeaveDate = LeaveDate,

@PrevDepartment = Department

FROM #Leave with (tablockx)

Option (MAXDOP 1)

SELECT *

ORDER BY

EmployeeId,

Department,

LeaveDate,

StartDate DESC

DROP TABLE #Leave

Mattie

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 7

Just a couple of notes:

1. The clustered index contained columns that didn't matter.

2. You had two orphaned variables that did nothing in the code. I removed them.

3. The final SELECT didn't have the same order as the clustered index for manual verification.

4. You don't need to include starting values for most of the variables because NULL is never equal to anything.

5. You don't need to include starting values in the #Leave table for the MinutesRunningCount column. They're just going to get overwritten.

6. I added the new "safety check feature" that Paul White and Tom Thompson came up with even though it's not really needed on Temp Tables just in case you ever want to use this on a permanent table.

7. I optimized the original DECLAREs by changing them to the "old style". Also, I'm working from my 2k5 machine today and your 2k8 compatible declarations and inserts just weren't doing it for me. Something to remember when you post a question for 2k8... not everyone that can help you is sitting in front of a 2k8 machine. Keep your posts for help at a 2k5 level unless absolutely required. You'll get more people to help that way. I also had to change DATE datatypes to DATETIME to be 2k5 compatible.

8. If you tell me what you want done with the currently unused StartDate and EndDate columns, I can help there or you can have the fun of doing it (I certainly don't mind).

9. You had a datatype mismatch for department between the variable and the table column.

10. Finally, if you tell me the name of the table that you'll actually be populating the #Leave table from, we can optimize this for even more speed actually using less code in the process.

As a bit of a side bar... yes, I know the code worked before and it's the nature of the beast to work just fine on HEAPs (tables without a Clustered Index) without any of the hints you added... most of the time. However, if parallelism were to occur without you knowing it when the table gets bigger or something disturbed your "insert order", then you'd get a whole lot of incorrect answers. That's why I said that you absolutely must follow the rules from the article. They don't call it a "Quirky" UPDATE just to be cute. 🙂

Here're the modifications to your code that I've made, so far. Please let me know if you have any additional questions on this important subject.

DECLARE @PrevEmployeeID INT,

@PrevLeaveDate DATETIME,

@PrevDepartment CHAR(4),

@MinutesRunningCount INT,

@RowCounter INT

;

SELECT @MinutesRunningCount = 0,

@RowCounter = 1

;

CREATE TABLE #Leave

(

EmployeeId INT ,

Department CHAR(4),

TotalHoursInMinutes INT,

MinutesRunningCount INT,

LeaveDate DATETIME,

StartDate DATETIME,

EndDate DATETIME

)

;

CREATE CLUSTERED INDEX IX_Leave#ClusteredIndex

ON #Leave (EmployeeId, LeaveDate, Department)

;

INSERT INTO #Leave

(EmployeeId, Department, TotalHoursInMinutes, LeaveDate)

SELECT 1, '1200', 450, '6/1/2011' UNION ALL

SELECT 1, '1200', 60, '6/2/2011' UNION ALL

SELECT 1, '1200', 30, '6/2/2011' UNION ALL

SELECT 1, '1200', 90, '6/2/2011' UNION ALL

SELECT 1, '2000', 60, '6/2/2011' UNION ALL

SELECT 2, '1200', 30, '6/1/2011' UNION ALL

SELECT 2, '1200', 90, '6/1/2011'

;

WITH

cteBuildSafety AS

(

SELECT RowCounter = ROW_NUMBER() OVER(ORDER BY EmployeeId, LeaveDate, Department),

EmployeeId, LeaveDate, Department, TotalHoursInMinutes, MinutesRunningCount

FROM #Leave

)

UPDATE tgt

SET @MinutesRunningCount = MinutesRunningCount

= CASE

WHEN RowCounter = @RowCounter --Safety Counter/Check

THEN

CASE

WHEN EmployeeID = @PrevEmployeeID

AND LeaveDate = @PrevLeaveDate

AND Department = @PrevDepartment

THEN @MinutesRunningCount + TotalHoursInMinutes

ELSE TotalHoursInMinutes

END

ELSE 1/0 --Force failure if Safety Counter is out of sync

END,

@PrevEmployeeID = EmployeeID,

@PrevLeaveDate = LeaveDate,

@PrevDepartment = Department,

@RowCounter = @RowCounter + 1

FROM cteBuildSafety tgt WITH (TABLOCKX)

OPTION (MAXDOP 1)

;

SELECT *

FROM #Leave

ORDER BY EmployeeId, LeaveDate, Department

;

DROP TABLE #Leave

;

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

MattieNH SSCertifiable Points: 7463 More actions · Answer 8

Jeff,

Thank you so much for critiquing my code, it's been incredibly helpful, not just for this project.

I've added the code that does the update of StartDate and EndDate, which is what those two orphaned variables you found were for.

I'm not terribly worried about additional optimization, because this is a one-shot data conversion project. We're moving leave data into a new table structure that supports a beginning and ending time, and since our standard day starts at 8:00 am, I want the first occurrence per person per department per day to have their leave start at 8:00 am, and then add on from there. Of the 450,000 rows to convert, I only have about 4500 that even have more than one leave type per day. But I've included the DDL for the table (Leave) that this code will update.

I do have one minor question. Is the point of @RowCounter to make sure this code doesn't go into a loop?

Again, thanks so much. This has been a terrific learning tool.

Mattie

DECLARE @PrevEmployeeIDINT,

@PrevLeaveDateDATETIME,

@PrevDepartmentCHAR(4),

@MinutesRunningCountINT,

@RowCounterINT,

@StartHourTINYINT,

@StartTimeTIME

;

SELECT @MinutesRunningCount = 0,

@RowCounter = 1,

@StartHour = 8,

@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'

;

CREATE TABLE #Leave

(

EmployeeId INT ,

Department CHAR(4),

TotalHoursInMinutes INT,

MinutesRunningCount INT,

LeaveDate DATETIME,

StartDate DATETIME,

EndDate DATETIME

)

;

CREATE CLUSTERED INDEX IX_Leave#ClusteredIndex

ON #Leave (EmployeeId, LeaveDate, Department)

;

INSERT INTO #Leave

(EmployeeId, Department, TotalHoursInMinutes, LeaveDate)

SELECT 1, '1200', 450, '6/1/2011' UNION ALL

SELECT 1, '1200', 60, '6/2/2011' UNION ALL

SELECT 1, '1200', 30, '6/2/2011' UNION ALL

SELECT 1, '1200', 90, '6/2/2011' UNION ALL

SELECT 1, '2000', 60, '6/2/2011' UNION ALL

SELECT 2, '1200', 30, '6/1/2011' UNION ALL

SELECT 2, '1200', 90, '6/1/2011'

;

WITH

cteBuildSafety AS

(

SELECT RowCounter = ROW_NUMBER() OVER(ORDER BY EmployeeId, LeaveDate, Department),

EmployeeId, LeaveDate, Department, TotalHoursInMinutes, MinutesRunningCount

FROM #Leave

)

UPDATE tgt

SET @MinutesRunningCount = MinutesRunningCount

= CASE

WHEN RowCounter = @RowCounter --Safety Counter/Check

THEN

CASE

WHEN EmployeeID = @PrevEmployeeID

AND LeaveDate = @PrevLeaveDate

AND Department = @PrevDepartment

THEN @MinutesRunningCount + TotalHoursInMinutes

ELSE TotalHoursInMinutes

END

ELSE 1/0 --Force failure if Safety Counter is out of sync

END,

@PrevEmployeeID = EmployeeID,

@PrevLeaveDate = LeaveDate,

@PrevDepartment = Department,

@RowCounter = @RowCounter + 1

FROM cteBuildSafety tgt WITH (TABLOCKX)

OPTION (MAXDOP 1)

;

UPDATE #Leave

SETStartDate = DATEADD(mi,

MinutesRunningCount - TotalHoursInMinutes,

CAST(CONVERT(DATETIME, LeaveDate, 101) + @StartTime AS DATETIME)),

EndDate = DATEADD(mi,

MinutesRunningCount,

CAST(CONVERT(DATETIME, LeaveDate, 101) + @StartTime AS DATETIME));

SELECT*

FROM#Leave

ORDER BY

EmployeeId,

LeaveDate,

Department

;

DROP TABLE #Leave

;

CREATE TABLE [dbo].[Leave](

[Id] [int] IDENTITY(1,1) NOT NULL,

[StartDate] [datetime] NOT NULL,

[EndDate] [datetime] NOT NULL,

[TotalHours] [decimal](7, 3) NOT NULL,

[CreatedBy] [varchar](32) NOT NULL,

[CreatedOn] [datetime] NOT NULL,

[ModifiedBy] [varchar](32) NULL,

[ModifiedOn] [datetime] NULL,

[CurrentStatus_Id] [int] NULL,

[Employee_EmployeeId] [int] NOT NULL,

[Position_PositionId] [int] NULL,

[Activity_Id] [int] NULL,

[Location_Id] [int] NULL,

[LeaveReason_Id] [int] NOT NULL,

[LeaveType_Id] [int] NOT NULL,

PRIMARY KEY CLUSTERED

(

[Id] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 9

Hi Mattie,

Thank you for the very nice feedback and for posting your code.

Nope. @RowCounter doesn't keep things from "going into a loop". @RowCounter makes sure that the order of the update occurs in the proper order according to the "RowCounter" in the cte. Because you have absolute control over a Temp Table, such a safety check isn't really necessary when the target of the update is, in fact, a Temp Table. However, I'm never one to turn down a safety feature especially since it costs so very little, as in this case.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)