December 15, 2011 at 8:49 am
--I need to compute a series of starting and ending datetime values,
--based on the value of the previous row in a table.
--(For those of you with a procedural programming language
--background, a control break.) If you run the following
--code, you'll see that what I get is a start and end time for the first
--occurrence on the control break fields:
--EmployeeID, Department, and LeaveDate. What I can't figure out
--how to compute is the subsequent start and end times.
--I've attached a spreadsheet that shows the results I want. The
--items I don't know how to populated are shown in red.
--I've also attached the code below as a txt file, which may be easier
--to read.
--Thanks for your help,
Mattie
DECLARE @StartHourAS TINYINT= 8
DECLARE @StartTimeAS TIME
SET@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
CREATE TABLE #Leave(
EmployeeIdINT ,
DepartmentCHAR(4),
TotalHoursInMinutesINT,
LeaveDateDATE,
StartDateDATETIME,
EndDateDATETIME)
INSERT INTO #Leave
( EmployeeId ,
Department ,
TotalHoursInMinutes,
LeaveDate )
VALUES(1, '1200', 450, '6/1/2011'),
(1, '1200', 60, '6/2/2011'),
(1, '1200', 30, '6/2/2011'),
(1, '1200', 90, '6/2/2011'),
(1, '2000', 60, '6/2/2011'),
(2, '1200', 30, '6/1/2011'),
(2, '1200', 90, '6/1/2011')
UPDATE#Leave
SETStartDate = CAST(CONVERT(DATETIME, l.LeaveDate, 101) + @StartTime AS DATETIME),
EndDate = DATEADD(mi, l.TotalHoursInMinutes, CAST(CONVERT(DATETIME, l.LeaveDate, 101) + @StartTime AS DATETIME))
FROM#Leavel
INNER JOIN (
SELECTEmployeeId ,
LeaveDate ,
Department,
Max(TotalHoursInMinutes)AS MaxTotalHoursInMinutes
FROM#Leave
WHEREStartDate IS NULL
GROUP BY
EmployeeId ,
LeaveDate,
Department)t
ONl.EmployeeId = t.EmployeeID
ANDl.LeaveDate = t.LeaveDate
ANDl.Department = t.Department
ANDl.TotalHoursInMinutes = t.MaxTotalHoursInMinutes
SELECT*
FROM#Leave l
ORDER BY
EmployeeId,
Department,
LeaveDate,
StartDate DESC
DROP TABLE #Leave
December 15, 2011 at 9:05 am
December 15, 2011 at 10:10 am
Hi Drew,
I had sort of realized that, and tried to implement the 'Quirky Update' running total solution referenced in that article (19 pages printed), but probably attributable to my lack of understanding, I couldn't get it to work.
Here's the code I tried for that. One I get the MinutesRunningCount value updated, I can compute the Start and End dates from that. Can you see what I'm doing wrong here, or recommend another approach from that article?
Thanks,
Mattie
DECLARE @StartHourTINYINT= 8
DECLARE @StartTimeTIME
DECLARE @PrevEmployeeIDINT= -1
DECLARE @PrevLeaveDateDATE= '1/1/1900'
DECLARE @PrevDepartmentVARCHAR(4)= 'x'
DECLARE @MinutesRunningCountINT= 0
SET@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
CREATE TABLE #Leave(
EmployeeIdINT ,
DepartmentCHAR(4),
TotalHoursInMinutesINT,
MinutesRunningCountINT,
LeaveDateDATE,
StartDateDATETIME,
EndDateDATETIME)
INSERT INTO #Leave
( EmployeeId ,
Department ,
TotalHoursInMinutes,
LeaveDate,
MinutesRunningCount )
VALUES(1, '1200', 450, '6/1/2011', 0),
(1, '1200', 60, '6/2/2011', 0),
(1, '1200', 30, '6/2/2011', 0),
(1, '1200', 90, '6/2/2011', 0),
(1, '2000', 60, '6/2/2011', 0),
(2, '1200', 30, '6/1/2011', 0),
(2, '1200', 90, '6/1/2011', 0)
UPDATE #Leave
SET @MinutesRunningCount = MinutesRunningCount = CASE
WHEN EmployeeID = @PrevEmployeeID
AND LeaveDate = @PrevLeaveDate
AND Department = @PrevDepartment
THEN @MinutesRunningCount + MinutesRunningCount
ELSE MinutesRunningCount
END,
@PrevEmployeeID = EmployeeID,
@PrevLeaveDate = LeaveDate,
@PrevDepartment = Department
FROM #Leave l
SELECT*
FROM#Leave l
ORDER BY
EmployeeId,
Department,
LeaveDate,
StartDate DESC
DROP TABLE #Leave
December 15, 2011 at 11:50 am
You're using the wrong field for your current value to add to the running total. You're using MinutesRunningCount when you should be using TotalHoursInMinutes. (BTW why isn't this just TotalMinutes?)
Just change the field in two lines of your code.
THEN @MinutesRunningCount + TotalHoursInMinutes
ELSE TotalHoursInMinutes
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
December 15, 2011 at 12:23 pm
Argh. I was so convinced it was something totally outside my grasp (page splits, clustered indexes) that I never thought to double check the actual fields ("The simplest explanation for some phenomenon is more likely to be accurate than more complicated explanations." ).
Thanks so much. The reason I named it TotalHoursInMinutes is because somewhere along the line I had a column called TotalHours, and that name seemed like a better reminder about its origin.
Thanks again,
Mattie
December 16, 2011 at 10:57 pm
MattieNH (12/15/2011)
Hi Drew,I had sort of realized that, and tried to implement the 'Quirky Update' running total solution referenced in that article (19 pages printed), but probably attributable to my lack of understanding, I couldn't get it to work.
Here's the code I tried for that. One I get the MinutesRunningCount value updated, I can compute the Start and End dates from that. Can you see what I'm doing wrong here, or recommend another approach from that article?
Thanks,
Mattie
DECLARE @StartHourTINYINT= 8
DECLARE @StartTimeTIME
DECLARE @PrevEmployeeIDINT= -1
DECLARE @PrevLeaveDateDATE= '1/1/1900'
DECLARE @PrevDepartmentVARCHAR(4)= 'x'
DECLARE @MinutesRunningCountINT= 0
SET@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
CREATE TABLE #Leave(
EmployeeIdINT ,
DepartmentCHAR(4),
TotalHoursInMinutesINT,
MinutesRunningCountINT,
LeaveDateDATE,
StartDateDATETIME,
EndDateDATETIME)
INSERT INTO #Leave
( EmployeeId ,
Department ,
TotalHoursInMinutes,
LeaveDate,
MinutesRunningCount )
VALUES(1, '1200', 450, '6/1/2011', 0),
(1, '1200', 60, '6/2/2011', 0),
(1, '1200', 30, '6/2/2011', 0),
(1, '1200', 90, '6/2/2011', 0),
(1, '2000', 60, '6/2/2011', 0),
(2, '1200', 30, '6/1/2011', 0),
(2, '1200', 90, '6/1/2011', 0)
UPDATE #Leave
SET @MinutesRunningCount = MinutesRunningCount = CASE
WHEN EmployeeID = @PrevEmployeeID
AND LeaveDate = @PrevLeaveDate
AND Department = @PrevDepartment
THEN @MinutesRunningCount + MinutesRunningCount
ELSE MinutesRunningCount
END,
@PrevEmployeeID = EmployeeID,
@PrevLeaveDate = LeaveDate,
@PrevDepartment = Department
FROM #Leave l
SELECT*
FROM#Leave l
ORDER BY
EmployeeId,
Department,
LeaveDate,
StartDate DESC
DROP TABLE #Leave
Please don't use the "Quirky Update" unless you follow all of the rules. You don't have the required Clustered Index, TABLOCKX, or MAXDOP. Go back and reread the rules. Yeah, I know... your code works as is... for now. 😉
--Jeff Moden
Change is inevitable... Change for the better is not.
December 19, 2011 at 11:53 am
Hi Jeff,
Thanks. So is this what I want? It seems to return the correct data (although so did the other code). Changes are in bolded italics. The clustered index needs to be in the (control break) order I want the evaluation made, correct?
DECLARE @StartHour TINYINT = 8
DECLARE @StartTime TIME
DECLARE @PrevEmployeeID INT = -1
DECLARE @PrevLeaveDate DATE = '1/1/1900'
DECLARE @PrevDepartment VARCHAR(4) = 'x'
DECLARE @MinutesRunningCount INT = 0
SET @StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
CREATE TABLE #Leave(
EmployeeId INT ,
Department CHAR(4),
TotalHoursInMinutes INT,
MinutesRunningCount INT,
LeaveDate DATE,
StartDate DATETIME,
EndDate DATETIME )
CREATE CLUSTERED INDEX IX_LeaveClusteredIndex
ON #Leave (EmployeeId, LeaveDate, Department, TotalHours)
INSERT INTO #Leave
(EmployeeId ,
Department ,
TotalHoursInMinutes,
LeaveDate,
MinutesRunningCount )
VALUES (1, '1200', 450, '6/1/2011', 0),
(1, '1200', 60, '6/2/2011', 0),
(1, '1200', 30, '6/2/2011', 0),
(1, '1200', 90, '6/2/2011', 0),
(1, '2000', 60, '6/2/2011', 0),
(2, '1200', 30, '6/1/2011', 0),
(2, '1200', 90, '6/1/2011', 0)
UPDATE #Leave
SET @MinutesRunningCount = MinutesRunningCount = CASE
WHEN EmployeeID = @PrevEmployeeID
AND LeaveDate = @PrevLeaveDate
AND Department = @PrevDepartment
THEN @MinutesRunningCount + TotalHoursInMinutes
ELSE TotalHoursInMinutes
END,
@PrevEmployeeID = EmployeeID,
@PrevLeaveDate = LeaveDate,
@PrevDepartment = Department
FROM #Leave with (tablockx)
Option (MAXDOP 1)
SELECT *
ORDER BY
EmployeeId,
Department,
LeaveDate,
StartDate DESC
DROP TABLE #Leave
Mattie
December 19, 2011 at 7:44 pm
Just a couple of notes:
1. The clustered index contained columns that didn't matter.
2. You had two orphaned variables that did nothing in the code. I removed them.
3. The final SELECT didn't have the same order as the clustered index for manual verification.
4. You don't need to include starting values for most of the variables because NULL is never equal to anything.
5. You don't need to include starting values in the #Leave table for the MinutesRunningCount column. They're just going to get overwritten.
6. I added the new "safety check feature" that Paul White and Tom Thompson came up with even though it's not really needed on Temp Tables just in case you ever want to use this on a permanent table.
7. I optimized the original DECLAREs by changing them to the "old style". Also, I'm working from my 2k5 machine today and your 2k8 compatible declarations and inserts just weren't doing it for me. Something to remember when you post a question for 2k8... not everyone that can help you is sitting in front of a 2k8 machine. Keep your posts for help at a 2k5 level unless absolutely required. You'll get more people to help that way. I also had to change DATE datatypes to DATETIME to be 2k5 compatible.
8. If you tell me what you want done with the currently unused StartDate and EndDate columns, I can help there or you can have the fun of doing it (I certainly don't mind).
9. You had a datatype mismatch for department between the variable and the table column.
10. Finally, if you tell me the name of the table that you'll actually be populating the #Leave table from, we can optimize this for even more speed actually using less code in the process.
As a bit of a side bar... yes, I know the code worked before and it's the nature of the beast to work just fine on HEAPs (tables without a Clustered Index) without any of the hints you added... most of the time. However, if parallelism were to occur without you knowing it when the table gets bigger or something disturbed your "insert order", then you'd get a whole lot of incorrect answers. That's why I said that you absolutely must follow the rules from the article. They don't call it a "Quirky" UPDATE just to be cute. 🙂
Here're the modifications to your code that I've made, so far. Please let me know if you have any additional questions on this important subject.
DECLARE @PrevEmployeeID INT,
@PrevLeaveDate DATETIME,
@PrevDepartment CHAR(4),
@MinutesRunningCount INT,
@RowCounter INT
;
SELECT @MinutesRunningCount = 0,
@RowCounter = 1
;
CREATE TABLE #Leave
(
EmployeeId INT ,
Department CHAR(4),
TotalHoursInMinutes INT,
MinutesRunningCount INT,
LeaveDate DATETIME,
StartDate DATETIME,
EndDate DATETIME
)
;
CREATE CLUSTERED INDEX IX_Leave#ClusteredIndex
ON #Leave (EmployeeId, LeaveDate, Department)
;
INSERT INTO #Leave
(EmployeeId, Department, TotalHoursInMinutes, LeaveDate)
SELECT 1, '1200', 450, '6/1/2011' UNION ALL
SELECT 1, '1200', 60, '6/2/2011' UNION ALL
SELECT 1, '1200', 30, '6/2/2011' UNION ALL
SELECT 1, '1200', 90, '6/2/2011' UNION ALL
SELECT 1, '2000', 60, '6/2/2011' UNION ALL
SELECT 2, '1200', 30, '6/1/2011' UNION ALL
SELECT 2, '1200', 90, '6/1/2011'
;
WITH
cteBuildSafety AS
(
SELECT RowCounter = ROW_NUMBER() OVER(ORDER BY EmployeeId, LeaveDate, Department),
EmployeeId, LeaveDate, Department, TotalHoursInMinutes, MinutesRunningCount
FROM #Leave
)
UPDATE tgt
SET @MinutesRunningCount = MinutesRunningCount
= CASE
WHEN RowCounter = @RowCounter --Safety Counter/Check
THEN
CASE
WHEN EmployeeID = @PrevEmployeeID
AND LeaveDate = @PrevLeaveDate
AND Department = @PrevDepartment
THEN @MinutesRunningCount + TotalHoursInMinutes
ELSE TotalHoursInMinutes
END
ELSE 1/0 --Force failure if Safety Counter is out of sync
END,
@PrevEmployeeID = EmployeeID,
@PrevLeaveDate = LeaveDate,
@PrevDepartment = Department,
@RowCounter = @RowCounter + 1
FROM cteBuildSafety tgt WITH (TABLOCKX)
OPTION (MAXDOP 1)
;
SELECT *
FROM #Leave
ORDER BY EmployeeId, LeaveDate, Department
;
DROP TABLE #Leave
;
--Jeff Moden
Change is inevitable... Change for the better is not.
December 20, 2011 at 7:08 am
Jeff,
Thank you so much for critiquing my code, it's been incredibly helpful, not just for this project.
I've added the code that does the update of StartDate and EndDate, which is what those two orphaned variables you found were for.
I'm not terribly worried about additional optimization, because this is a one-shot data conversion project. We're moving leave data into a new table structure that supports a beginning and ending time, and since our standard day starts at 8:00 am, I want the first occurrence per person per department per day to have their leave start at 8:00 am, and then add on from there. Of the 450,000 rows to convert, I only have about 4500 that even have more than one leave type per day. But I've included the DDL for the table (Leave) that this code will update.
I do have one minor question. Is the point of @RowCounter to make sure this code doesn't go into a loop?
Again, thanks so much. This has been a terrific learning tool.
Mattie
DECLARE @PrevEmployeeIDINT,
@PrevLeaveDateDATETIME,
@PrevDepartmentCHAR(4),
@MinutesRunningCountINT,
@RowCounterINT,
@StartHourTINYINT,
@StartTimeTIME
;
SELECT @MinutesRunningCount = 0,
@RowCounter = 1,
@StartHour = 8,
@StartTime = CAST(@StartHour AS VARCHAR(2)) + ':00:00'
;
CREATE TABLE #Leave
(
EmployeeId INT ,
Department CHAR(4),
TotalHoursInMinutes INT,
MinutesRunningCount INT,
LeaveDate DATETIME,
StartDate DATETIME,
EndDate DATETIME
)
;
CREATE CLUSTERED INDEX IX_Leave#ClusteredIndex
ON #Leave (EmployeeId, LeaveDate, Department)
;
INSERT INTO #Leave
(EmployeeId, Department, TotalHoursInMinutes, LeaveDate)
SELECT 1, '1200', 450, '6/1/2011' UNION ALL
SELECT 1, '1200', 60, '6/2/2011' UNION ALL
SELECT 1, '1200', 30, '6/2/2011' UNION ALL
SELECT 1, '1200', 90, '6/2/2011' UNION ALL
SELECT 1, '2000', 60, '6/2/2011' UNION ALL
SELECT 2, '1200', 30, '6/1/2011' UNION ALL
SELECT 2, '1200', 90, '6/1/2011'
;
WITH
cteBuildSafety AS
(
SELECT RowCounter = ROW_NUMBER() OVER(ORDER BY EmployeeId, LeaveDate, Department),
EmployeeId, LeaveDate, Department, TotalHoursInMinutes, MinutesRunningCount
FROM #Leave
)
UPDATE tgt
SET @MinutesRunningCount = MinutesRunningCount
= CASE
WHEN RowCounter = @RowCounter --Safety Counter/Check
THEN
CASE
WHEN EmployeeID = @PrevEmployeeID
AND LeaveDate = @PrevLeaveDate
AND Department = @PrevDepartment
THEN @MinutesRunningCount + TotalHoursInMinutes
ELSE TotalHoursInMinutes
END
ELSE 1/0 --Force failure if Safety Counter is out of sync
END,
@PrevEmployeeID = EmployeeID,
@PrevLeaveDate = LeaveDate,
@PrevDepartment = Department,
@RowCounter = @RowCounter + 1
FROM cteBuildSafety tgt WITH (TABLOCKX)
OPTION (MAXDOP 1)
;
UPDATE #Leave
SETStartDate = DATEADD(mi,
MinutesRunningCount - TotalHoursInMinutes,
CAST(CONVERT(DATETIME, LeaveDate, 101) + @StartTime AS DATETIME)),
EndDate = DATEADD(mi,
MinutesRunningCount,
CAST(CONVERT(DATETIME, LeaveDate, 101) + @StartTime AS DATETIME));
SELECT*
FROM#Leave
ORDER BY
EmployeeId,
LeaveDate,
Department
;
DROP TABLE #Leave
;
CREATE TABLE [dbo].[Leave](
[Id] [int] IDENTITY(1,1) NOT NULL,
[StartDate] [datetime] NOT NULL,
[EndDate] [datetime] NOT NULL,
[TotalHours] [decimal](7, 3) NOT NULL,
[CreatedBy] [varchar](32) NOT NULL,
[CreatedOn] [datetime] NOT NULL,
[ModifiedBy] [varchar](32) NULL,
[ModifiedOn] [datetime] NULL,
[CurrentStatus_Id] [int] NULL,
[Employee_EmployeeId] [int] NOT NULL,
[Position_PositionId] [int] NULL,
[Activity_Id] [int] NULL,
[Location_Id] [int] NULL,
[LeaveReason_Id] [int] NOT NULL,
[LeaveType_Id] [int] NOT NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
December 20, 2011 at 7:14 pm
Hi Mattie,
Thank you for the very nice feedback and for posting your code.
Nope. @RowCounter doesn't keep things from "going into a loop". @RowCounter makes sure that the order of the update occurs in the proper order according to the "RowCounter" in the cte. Because you have absolute control over a Temp Table, such a safety check isn't really necessary when the target of the update is, in fact, a Temp Table. However, I'm never one to turn down a safety feature especially since it costs so very little, as in this case.
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply