August 21, 2008 at 2:59 pm
I have a stored procedure that is taking employee payroll clock-ins / clock-outs and populating a table with [employeeid], [date], [hour], [portion_of_hour_worked]. For example:
input row:
Employee 123 for 8/20/2008 PunchIn: 8:30am PunchOut:12:00pm
would result in these rows added to my HoursWorked table:
123,8/20/08,8,0.5
123,8/20/08,9,1.0
123,8/20/08,10,1.0
123,8/20/08,11,1.0
I'll post the sproc and udf below. But basically its a cursor calling a udf for each in/out punch. It takes over an hour to process 20,000 input punches. I just wondered if someone has a better way. I was looking at the APPLY feature using the UDF in a correlated subquery. I could probably relocate this job from a SQL 2000 DTS to SQL 2005 SSIS.
Here is the code:
CREATE PROCEDURE dbo.GenerateHoursWorked -- '6/18/2008' , '7/1/2008 '
@StartDate datetime,
@EndDate datetime
AS
SET NOCOUNT ON
-- add end of day time 23:59:59 to date
SELECT @EndDate = DATEADD(second, -1, (DATEADD(day, 1,@EndDate)))
-- create a temp table to hold time detail recs for tellers
-- and other hourly emps who have transactions for Date Range
-- PunchInTime = Null will exclude Paid Absence time, etc
DECLARE @TD Table(EmployeeID varchar(10),
IN_Punch smalldatetime,
OUT_Punch smalldatetime,
Job varchar(50),
Punch_RecID int,
RecID int not null IDENTITY(1,1) PRIMARY KEY
)
INSERT @TD
SELECT
td.EmpID
, CAST((td.PunchDate + td.PunchInTime)as datetime) [IN_Punch]
, CAST((td.PunchDate+td.PunchOutTime) as datetime) [OUT_Punch]
, td.Job
, td.RecID [Punch_RecID]
FROM dbo.ADPTimeDetail td
WHERE td.PunchInTime IS NOT NULL
AND td.PunchDate BETWEEN @StartDate and @EndDate
AND td.EmpID IN (
SELECT ADP_ID
FROM dbo.EmployeeXref xr
INNER JOIN dbo.TransactionDetail tr
ON xr.Bwy_ID = tr.UserID
AND TransactionTime BETWEEN @StartDate and @EndDate
)
ORDER BY 1,2,3
-- Loop through temp table and populate the HoursWorked table based on the punches
DECLARE @LastRecID int, @CurRecID int
SELECT @CurRecID = 1, @LastRecID = MAX(RecID) FROM @TD
DECLARE @CurEmpID varchar(10), @CurIN smalldatetime, @CurOUT smalldatetime
, @Job varchar(50), @Punch_RecID int
WHILE @CurRecID <= @LastRecID
BEGIN
-- Select the current time punch record
SELECT @CurEmpID = EmployeeID, @CurIN = IN_Punch, @CurOUT= OUT_Punch,
@Job = Job, @Punch_RecID = Punch_RecID
FROM @TD
WHERE RecID = @CurRecID
-- Create the Hours Worked break out for the current time punch
INSERT dbo.HoursWorked (DateHourWorked, ADP_ID, AmountofHour, Job, Punch_RecID)
SELECT DateHourWorked
, EmployeeID
, Amount
, @Job
, @Punch_RecID FROM dbo.udf_tblHoursWorked(@CurEmpID, @CurIN, @CurOUT)
-- Increment for the next time punch record
SET @CurRecID = @CurRecID + 1
END
RETURN
CREATE FUNCTION [dbo].[udf_tblHoursWorked]
(
@EmployeeID varchar(10),
@PunchIN datetime,
@PunchOUT datetime
)
RETURNS @hw TABLE(
DateHourWorked datetime,
EmployeeID varchar(10),
Amount decimal(18,2)
)
AS
BEGIN
DECLARE @HrBegin smalldatetime,
@HrEnd smalldatetime,
@AmtWorked decimal(18,2)
-- init with the PunchIn time
SELECT @HrBegin = @PunchIN
-- rollback to the beginning of the PunchIn hour and add 1 hour for the hour end
SELECT @HrEnd = DATEADD(hh,1,DATEADD(mi,(DATEPART(mi,@HrBegin)*-1),@HrBegin))
-- the hour end cannot exceed the PunchOut time
SELECT @HrEnd = CASE WHEN @HrEnd < @PunchOUT THEN @HrEnd ELSE @PunchOUT END
-- Loop for each hour in the PunchIn/out range
-- and insert a record for the hour
WHILE @HrBegin < @PunchOUT
BEGIN
SELECT @AmtWorked = (DATEDIFF(mi,@HrBegin,@HrEnd)/60.00)
INSERT @hw
SELECT DATEADD(mi,(DATEPART(mi,@HrBegin)*-1),@HrBegin),
@EmployeeID,
@AmtWorked
-- increment the hour and loop for next
SELECT @HrBegin = @HrEnd
SELECT @HrEnd = DATEADD(hh,1,DATEADD(mi,(DATEPART(mi,@HrBegin)*-1),@HrBegin))
SELECT @HrEnd = CASE WHEN @HrEnd < @PunchOUT THEN @HrEnd ELSE @PunchOUT END
END
RETURN
END
August 22, 2008 at 7:24 am
I think this code could/should put you an a better path:
[font="Courier New"]DECLARE @punches TABLE (id INT, in_date VARCHAR(10), in_time VARCHAR(5), out_time VARCHAR(5))
DECLARE @hours TABLE(hour_start DATETIME, hour_end DATETIME)
DECLARE @nums TABLE (n INT)
DECLARE @i INT
DECLARE @start_date DATETIME
SET @start_date = '8/20/08'
SET @i = 0
/* build table with hours of the day you may want to
build a permanent table with indexes for this*/
WHILE @i < 23
BEGIN
INSERT INTO @hours
SELECT
DATEADD(hour, @i, @start_date),
DATEADD(hour, @i + 1, @start_date)
/* build numbers table, if you already have one use it
seach SSC for "TALLY" to learn about its many uses*/
INSERT INTO @nums
SELECT
@i
SET @i = @i + 1
END
/* create some test data. You can skip this */
INSERT INTO @punches
SELECT
123,
'8/20/2008',
'08:30' ,
'12:00'
UNION ALL
SELECT
124,
'8/20/2008',
'08:00' ,
'11:45'
-- return the data
SELECT
A.id,
-- calculate portion of hour worked
CASE
WHEN A.in_datetime > H.hour_start THEN DATEDIFF(minute, A.in_datetime, H.hour_end)/60.00
WHEN A.out_datetime < H.hour_end THEN DATEDIFF(minute, H.hour_start, A.out_datetime)/60.00
ELSE 1
END AS part_hour_worked,
hour_of_day,
A.in_datetime,
A.out_datetime,
H.hour_start,
H.hour_end
FROM
-- derived table returns each hour of the day in which a person worked
(SELECT
id,
CONVERT(DATETIME, in_date + ' ' + in_time) AS in_datetime,
CONVERT(DATETIME, in_date + ' ' + out_time) AS out_datetime,
DATEADD(hour, n, CONVERT(DATETIME, in_date + ' ' + in_time)) AS hour_of_day
FROM
@punches A CROSS JOIN
@nums N
WHERE
-- need <= in order to get the record where out_time is not a full hour
N.n <= DATEDIFF(hour, CONVERT(DATETIME, in_date + ' ' + in_time), CONVERT(DATETIME, in_date + ' ' + out_time))) AS A JOIN
-- join on the hours table
@hours H ON
A.hour_of_day >= H.hour_start AND A.hour_of_day < H.hour_end
WHERE
-- need this to eliminate 0's
CASE
WHEN A.in_datetime > H.hour_start THEN DATEDIFF(minute, A.in_datetime, H.hour_end)/60.00
WHEN A.out_datetime < H.hour_end THEN DATEDIFF(minute, H.hour_start, A.out_datetime)/60.00
ELSE 1
END > 0
ORDER BY
id,
hour_start[/font]
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 22, 2008 at 7:38 am
Please read Jeff Moden's article 'The "Numbers" or "Tally" Table: What it is and how it replaces a loop' at http://www.sqlservercentral.com/articles/TSQL/62867/
First, construct a set with the 24 hours of the day:
selectTally.N - 1as StartHour
,Tally.N as EndHour
,DATEADD(HH, Tally.N - 1, @StartTs) as StartTs
,DATEADD(HH, Tally.N - 1, @StartTs) + (1.0 / 24) as EndTs
frommaster.dbo.Tally
whereTally.N between 1 and 24
Then join the DayHours table to the PunchCard table
[/code]
create table #EmpPunch
(EmpIDintegernot null
, PunchInTs datetime not null
, PunchOutTs datetime not null
)
insert into #EmpPunch
(EmpID, PunchInTs , PunchOutTs )
VALUES ( 1 ,'2008-08-20 08:30:00', '2008-08-20 12:00:00')
insert into #EmpPunch
(EmpID, PunchInTs , PunchOutTs )
VALUES ( 2 ,'2008-08-20 08:30:00', '2008-08-20 08:50:00')
insert into #EmpPunch
(EmpID, PunchInTs , PunchOutTs )
VALUES ( 3 ,'2008-08-20 08:30:00', '2008-08-20 12:45:00')
[/code]
The SQL:
-- "Ts" is a timestamp
DECLARE@StartTsdatetime
,@EndTsdatetime
SET@StartTs = '2008-08-20'
SELECT@EndTs = DATEADD(second, -1, (DATEADD(day, 1,@StartTs)))
select #EmpPunch.EmpID
,@StartTs
,DayHours.StartHour
,DayHours.StartTs
,DayHours.EndTs
,#EmpPunch.PunchInTs
,#EmpPunch.PunchOutTs
,CASE
-- Worked the full hour
WHEN #EmpPunch.PunchInTs = DayHours.EndTs
THEN 1
-- Work started during the hour and ended after
WHEN #EmpPunch.PunchInTs > DayHours.StartTs AND #EmpPunch.PunchOutTs >= DayHours.EndTs
THEN ( DATEPART(MI, #EmpPunch.PunchInTs) / 60.0 )
-- Work started during the hour and ended during the hour
WHEN #EmpPunch.PunchInTs > DayHours.StartTs AND #EmpPunch.PunchOutTs < DayHours.EndTs
THEN (DATEPART(MI, #EmpPunch.PunchOutTs) - DATEPART(MI, #EmpPunch.PunchInTs)
) / 60.0
-- Work started before the hour and ended during the hour
WHEN #EmpPunch.PunchInTs <= DayHours.StartTs AND #EmpPunch.PunchOutTs < DayHours.EndTs
THEN DATEPART(MI, #EmpPunch.PunchOutTs) / 60.0
ELSE NULL
END AS WorkedHours
FROM#EmpPunch
JOIN
(
selectTally.N - 1as StartHour
,Tally.N as EndHour
,DATEADD(HH, Tally.N - 1, @StartTs) as StartTs
,DATEADD(HH, Tally.N - 1, @StartTs) + (1.0 / 24) as EndTs
frommaster.dbo.Tally
whereTally.N between 1 and 24
) DayHours
ON#EmpPunch.PunchInTs <= DayHours.EndTs
AND #EmpPunch.PunchOutTs > DayHours.StartTs
WHERE #EmpPunch.PunchInTs between @StartTs and @EndTs
SQL = Scarcely Qualifies as a Language
August 22, 2008 at 7:44 am
Excellent Jack. Thank you!
After seeing so many RBAR posts from Jeff I had begun to research the tally table method. But I just could not get my head around how it would be used in this situation. I've had an epiphany here.
I haven't had time yet to implement your solution. I'm anxious to see the performance improvement.
Thanks again.
Ed
ps:
Thank you also Carl. I will consider each alternative here to see which best fits my circumstance.
August 22, 2008 at 7:51 am
After a quick glance it looks like Carl's solution is very similar in idea to mine.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 22, 2008 at 8:56 am
Whatever did we do without tally tables?
[font="Courier New"]DROP TABLE #ADPTimeDetail
CREATE TABLE #ADPTimeDetail (EmpID INT, [IN_Punch] DATETIME, [OUT_Punch] DATETIME, Job VARCHAR(3), [Punch_RecID] INT)
INSERT INTO #ADPTimeDetail (EmpID, [IN_Punch], [OUT_Punch], Job, [Punch_RecID])
SELECT 1, GETDATE(), DATEADD(mi, 175,GETDATE()), 'AAA', 1 UNION ALL
SELECT 1, DATEADD(dd, 1, GETDATE()), DATEADD(mi, 200,DATEADD(dd, 1, GETDATE())), 'BBB', 2 UNION ALL
SELECT 2, GETDATE(), DATEADD(mi, 175,GETDATE()), 'CCC', 3 UNION ALL
SELECT 2, DATEADD(dd, 1, GETDATE()), DATEADD(mi, 175,DATEADD(dd, 1, GETDATE())), 'DDD', 4
--SELECT * FROM #ADPTimeDetail -- sanity check
-- Yep it's that pesky numbers table again
SELECT EmpID, [IN_Punch] AS DateWorked,
CASE WHEN number < 13 THEN number ELSE number-12 END AS [hour],
CASE WHEN number = DATEPART(hh, [IN_Punch]) THEN CAST(4*DATEPART(mi, [IN_Punch])/60 AS INT)/4.00
WHEN number = DATEPART(hh, [OUT_Punch]) THEN CAST(4*DATEPART(mi, [OUT_Punch])/60 AS INT)/4.00
ELSE 1.00 END AS [portion_of_hour_worked]
FROM Numbers n, #ADPTimeDetail
WHERE n.number BETWEEN DATEPART(hh, [IN_Punch]) AND DATEPART(hh, [OUT_Punch])[/font]
Results:
EmpID DateWorked hour portion_of_hour_worked
----------- ----------------------- ----------- ----------------------
1 2008-08-22 15:53:02.940 3 .750000
1 2008-08-22 15:53:02.940 4 1.000000
1 2008-08-22 15:53:02.940 5 1.000000
1 2008-08-22 15:53:02.940 6 .750000
1 2008-08-23 15:53:02.940 3 .750000
1 2008-08-23 15:53:02.940 4 1.000000
1 2008-08-23 15:53:02.940 5 1.000000
1 2008-08-23 15:53:02.940 6 1.000000
1 2008-08-23 15:53:02.940 7 .000000
2 2008-08-22 15:53:02.940 3 .750000
2 2008-08-22 15:53:02.940 4 1.000000
2 2008-08-22 15:53:02.940 5 1.000000
2 2008-08-22 15:53:02.940 6 .750000
2 2008-08-23 15:53:02.940 3 .750000
2 2008-08-23 15:53:02.940 4 1.000000
2 2008-08-23 15:53:02.940 5 1.000000
2 2008-08-23 15:53:02.940 6 .750000
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 22, 2008 at 9:12 am
Chris,
Can you explain the why and how of this:
CAST(4*DATEPART(mi, [IN_Punch])/60 AS INT)/4.00
I would bet that performance-wise your solution will be the best.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
August 22, 2008 at 9:22 am
Sure Jack!
Multiply the decimal fraction by 4,
0.26 becomes 1.04
0.24 becomes 0.96
then Cast as INT to truncate
0.26 becomes 1
0.24 becomes 0
then Divide by 4.00
0.26 becomes 0.25
0.24 becomes 0.00
Test with:
[font="Courier New"]
SELECT CAST(4*number/60 AS INT)/4.00
FROM Numbers
WHERE number < 60[/font]
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
August 22, 2008 at 9:53 am
I see... sort of a FLOOR function for quarter hour rounding.
In my case I'll just record the ugly unrounded decimal because
we'll be aggregating the labor-hour values in later analysis.
I don't want to compound rounding errors when we do this.
You all have certainly given me food for thought here.
Thanks again.
August 22, 2008 at 2:57 pm
My final code ended up as something of a hybrid of the code all of you supplied. The execution time went from an hour to under 10 seconds!
Another testament to the power of the TALLY.
August 22, 2008 at 3:11 pm
Awesome. Not just the power of TALLY, but the waste of loops!
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply