TVF with Cursor - is there a better way?

  • I have a stored procedure that is taking employee payroll clock-ins / clock-outs and populating a table with [employeeid], [date], [hour], [portion_of_hour_worked]. For example:

    input row:

    Employee 123 for 8/20/2008 PunchIn: 8:30am PunchOut:12:00pm

    would result in these rows added to my HoursWorked table:

    123,8/20/08,8,0.5

    123,8/20/08,9,1.0

    123,8/20/08,10,1.0

    123,8/20/08,11,1.0

    I'll post the sproc and udf below. But basically its a cursor calling a udf for each in/out punch. It takes over an hour to process 20,000 input punches. I just wondered if someone has a better way. I was looking at the APPLY feature using the UDF in a correlated subquery. I could probably relocate this job from a SQL 2000 DTS to SQL 2005 SSIS.

    Here is the code:

    CREATE PROCEDURE dbo.GenerateHoursWorked -- '6/18/2008' , '7/1/2008 '

    @StartDate datetime,

    @EndDate datetime

    AS

    SET NOCOUNT ON

    -- add end of day time 23:59:59 to date

    SELECT @EndDate = DATEADD(second, -1, (DATEADD(day, 1,@EndDate)))

    -- create a temp table to hold time detail recs for tellers

    -- and other hourly emps who have transactions for Date Range

    -- PunchInTime = Null will exclude Paid Absence time, etc

    DECLARE @TD Table(EmployeeID varchar(10),

    IN_Punch smalldatetime,

    OUT_Punch smalldatetime,

    Job varchar(50),

    Punch_RecID int,

    RecID int not null IDENTITY(1,1) PRIMARY KEY

    )

    INSERT @TD

    SELECT

    td.EmpID

    , CAST((td.PunchDate + td.PunchInTime)as datetime) [IN_Punch]

    , CAST((td.PunchDate+td.PunchOutTime) as datetime) [OUT_Punch]

    , td.Job

    , td.RecID [Punch_RecID]

    FROM dbo.ADPTimeDetail td

    WHERE td.PunchInTime IS NOT NULL

    AND td.PunchDate BETWEEN @StartDate and @EndDate

    AND td.EmpID IN (

    SELECT ADP_ID

    FROM dbo.EmployeeXref xr

    INNER JOIN dbo.TransactionDetail tr

    ON xr.Bwy_ID = tr.UserID

    AND TransactionTime BETWEEN @StartDate and @EndDate

    )

    ORDER BY 1,2,3

    -- Loop through temp table and populate the HoursWorked table based on the punches

    DECLARE @LastRecID int, @CurRecID int

    SELECT @CurRecID = 1, @LastRecID = MAX(RecID) FROM @TD

    DECLARE @CurEmpID varchar(10), @CurIN smalldatetime, @CurOUT smalldatetime

    , @Job varchar(50), @Punch_RecID int

    WHILE @CurRecID <= @LastRecID

    BEGIN

    -- Select the current time punch record

    SELECT @CurEmpID = EmployeeID, @CurIN = IN_Punch, @CurOUT= OUT_Punch,

    @Job = Job, @Punch_RecID = Punch_RecID

    FROM @TD

    WHERE RecID = @CurRecID

    -- Create the Hours Worked break out for the current time punch

    INSERT dbo.HoursWorked (DateHourWorked, ADP_ID, AmountofHour, Job, Punch_RecID)

    SELECT DateHourWorked

    , EmployeeID

    , Amount

    , @Job

    , @Punch_RecID FROM dbo.udf_tblHoursWorked(@CurEmpID, @CurIN, @CurOUT)

    -- Increment for the next time punch record

    SET @CurRecID = @CurRecID + 1

    END

    RETURN

    CREATE FUNCTION [dbo].[udf_tblHoursWorked]

    (

    @EmployeeID varchar(10),

    @PunchIN datetime,

    @PunchOUT datetime

    )

    RETURNS @hw TABLE(

    DateHourWorked datetime,

    EmployeeID varchar(10),

    Amount decimal(18,2)

    )

    AS

    BEGIN

    DECLARE @HrBegin smalldatetime,

    @HrEnd smalldatetime,

    @AmtWorked decimal(18,2)

    -- init with the PunchIn time

    SELECT @HrBegin = @PunchIN

    -- rollback to the beginning of the PunchIn hour and add 1 hour for the hour end

    SELECT @HrEnd = DATEADD(hh,1,DATEADD(mi,(DATEPART(mi,@HrBegin)*-1),@HrBegin))

    -- the hour end cannot exceed the PunchOut time

    SELECT @HrEnd = CASE WHEN @HrEnd < @PunchOUT THEN @HrEnd ELSE @PunchOUT END

    -- Loop for each hour in the PunchIn/out range

    -- and insert a record for the hour

    WHILE @HrBegin < @PunchOUT

    BEGIN

    SELECT @AmtWorked = (DATEDIFF(mi,@HrBegin,@HrEnd)/60.00)

    INSERT @hw

    SELECT DATEADD(mi,(DATEPART(mi,@HrBegin)*-1),@HrBegin),

    @EmployeeID,

    @AmtWorked

    -- increment the hour and loop for next

    SELECT @HrBegin = @HrEnd

    SELECT @HrEnd = DATEADD(hh,1,DATEADD(mi,(DATEPART(mi,@HrBegin)*-1),@HrBegin))

    SELECT @HrEnd = CASE WHEN @HrEnd < @PunchOUT THEN @HrEnd ELSE @PunchOUT END

    END

    RETURN

    END

  • I think this code could/should put you an a better path:

    [font="Courier New"]DECLARE @punches TABLE (id INT, in_date VARCHAR(10), in_time VARCHAR(5), out_time VARCHAR(5))

    DECLARE @hours TABLE(hour_start DATETIME, hour_end DATETIME)

    DECLARE @nums TABLE (n INT)

    DECLARE @i INT

    DECLARE @start_date DATETIME

    SET @start_date = '8/20/08'

    SET @i = 0

    /* build table with hours of the day you may want to

    build a permanent table with indexes for this*/

    WHILE @i < 23

       BEGIN

           INSERT INTO @hours

               SELECT

                   DATEADD(hour, @i, @start_date),

                   DATEADD(hour, @i + 1, @start_date)

              

               /* build numbers table, if you already have one use it

               seach SSC for "TALLY" to learn about its many uses*/

               INSERT INTO @nums

                   SELECT

                       @i

               SET @i = @i + 1

       END

    /* create some test data.  You can skip this */

    INSERT INTO @punches

       SELECT

           123,

           '8/20/2008',

           '08:30' ,

           '12:00'

       UNION ALL

       SELECT

           124,

           '8/20/2008',

           '08:00' ,

           '11:45'

      

    -- return the data

    SELECT

       A.id,

       -- calculate portion of hour worked

       CASE

           WHEN A.in_datetime > H.hour_start THEN DATEDIFF(minute, A.in_datetime, H.hour_end)/60.00

           WHEN  A.out_datetime < H.hour_end THEN DATEDIFF(minute, H.hour_start, A.out_datetime)/60.00

           ELSE 1

       END AS part_hour_worked,

       hour_of_day,

       A.in_datetime,

       A.out_datetime,

       H.hour_start,

       H.hour_end

    FROM

       -- derived table returns each hour of the day in which a person worked

       (SELECT

           id,

           CONVERT(DATETIME, in_date + ' ' + in_time)  AS in_datetime,

           CONVERT(DATETIME, in_date + ' ' + out_time) AS out_datetime,

           DATEADD(hour, n, CONVERT(DATETIME, in_date + ' ' + in_time))  AS hour_of_day

       FROM  

           @punches A CROSS JOIN

           @nums N

       WHERE

           -- need <= in order to get the record where out_time is not a full hour

           N.n <= DATEDIFF(hour, CONVERT(DATETIME, in_date + ' ' + in_time), CONVERT(DATETIME, in_date + ' ' + out_time))) AS A JOIN

       -- join on the hours table

       @hours H ON

           A.hour_of_day >= H.hour_start AND A.hour_of_day < H.hour_end

    WHERE

           -- need this to eliminate 0's

           CASE

               WHEN A.in_datetime > H.hour_start THEN DATEDIFF(minute, A.in_datetime, H.hour_end)/60.00

               WHEN  A.out_datetime < H.hour_end THEN DATEDIFF(minute, H.hour_start, A.out_datetime)/60.00

               ELSE 1

           END > 0

       ORDER BY

           id,

           hour_start[/font]

  • Please read Jeff Moden's article 'The "Numbers" or "Tally" Table: What it is and how it replaces a loop' at http://www.sqlservercentral.com/articles/TSQL/62867/

    First, construct a set with the 24 hours of the day:

    selectTally.N - 1as StartHour

    ,Tally.N as EndHour

    ,DATEADD(HH, Tally.N - 1, @StartTs) as StartTs

    ,DATEADD(HH, Tally.N - 1, @StartTs) + (1.0 / 24) as EndTs

    frommaster.dbo.Tally

    whereTally.N between 1 and 24

    Then join the DayHours table to the PunchCard table

    [/code]

    create table #EmpPunch

    (EmpIDintegernot null

    , PunchInTs datetime not null

    , PunchOutTs datetime not null

    )

    insert into #EmpPunch

    (EmpID, PunchInTs , PunchOutTs )

    VALUES ( 1 ,'2008-08-20 08:30:00', '2008-08-20 12:00:00')

    insert into #EmpPunch

    (EmpID, PunchInTs , PunchOutTs )

    VALUES ( 2 ,'2008-08-20 08:30:00', '2008-08-20 08:50:00')

    insert into #EmpPunch

    (EmpID, PunchInTs , PunchOutTs )

    VALUES ( 3 ,'2008-08-20 08:30:00', '2008-08-20 12:45:00')

    [/code]

    The SQL:

    -- "Ts" is a timestamp

    DECLARE@StartTsdatetime

    ,@EndTsdatetime

    SET@StartTs = '2008-08-20'

    SELECT@EndTs = DATEADD(second, -1, (DATEADD(day, 1,@StartTs)))

    select #EmpPunch.EmpID

    ,@StartTs

    ,DayHours.StartHour

    ,DayHours.StartTs

    ,DayHours.EndTs

    ,#EmpPunch.PunchInTs

    ,#EmpPunch.PunchOutTs

    ,CASE

    -- Worked the full hour

    WHEN #EmpPunch.PunchInTs = DayHours.EndTs

    THEN 1

    -- Work started during the hour and ended after

    WHEN #EmpPunch.PunchInTs > DayHours.StartTs AND #EmpPunch.PunchOutTs >= DayHours.EndTs

    THEN ( DATEPART(MI, #EmpPunch.PunchInTs) / 60.0 )

    -- Work started during the hour and ended during the hour

    WHEN #EmpPunch.PunchInTs > DayHours.StartTs AND #EmpPunch.PunchOutTs < DayHours.EndTs

    THEN (DATEPART(MI, #EmpPunch.PunchOutTs) - DATEPART(MI, #EmpPunch.PunchInTs)

    ) / 60.0

    -- Work started before the hour and ended during the hour

    WHEN #EmpPunch.PunchInTs <= DayHours.StartTs AND #EmpPunch.PunchOutTs < DayHours.EndTs

    THEN DATEPART(MI, #EmpPunch.PunchOutTs) / 60.0

    ELSE NULL

    END AS WorkedHours

    FROM#EmpPunch

    JOIN

    (

    selectTally.N - 1as StartHour

    ,Tally.N as EndHour

    ,DATEADD(HH, Tally.N - 1, @StartTs) as StartTs

    ,DATEADD(HH, Tally.N - 1, @StartTs) + (1.0 / 24) as EndTs

    frommaster.dbo.Tally

    whereTally.N between 1 and 24

    ) DayHours

    ON#EmpPunch.PunchInTs <= DayHours.EndTs

    AND #EmpPunch.PunchOutTs > DayHours.StartTs

    WHERE #EmpPunch.PunchInTs between @StartTs and @EndTs

    SQL = Scarcely Qualifies as a Language

  • Excellent Jack. Thank you!

    After seeing so many RBAR posts from Jeff I had begun to research the tally table method. But I just could not get my head around how it would be used in this situation. I've had an epiphany here.

    I haven't had time yet to implement your solution. I'm anxious to see the performance improvement.

    Thanks again.

    Ed

    ps:

    Thank you also Carl. I will consider each alternative here to see which best fits my circumstance.

  • After a quick glance it looks like Carl's solution is very similar in idea to mine.

  • Whatever did we do without tally tables?

    [font="Courier New"]DROP TABLE #ADPTimeDetail

    CREATE TABLE #ADPTimeDetail (EmpID INT, [IN_Punch] DATETIME, [OUT_Punch] DATETIME, Job VARCHAR(3), [Punch_RecID] INT)

    INSERT INTO #ADPTimeDetail (EmpID, [IN_Punch], [OUT_Punch], Job, [Punch_RecID])

    SELECT 1, GETDATE(), DATEADD(mi, 175,GETDATE()), 'AAA', 1 UNION ALL

    SELECT 1, DATEADD(dd, 1, GETDATE()), DATEADD(mi, 200,DATEADD(dd, 1, GETDATE())), 'BBB', 2 UNION ALL

    SELECT 2, GETDATE(), DATEADD(mi, 175,GETDATE()), 'CCC', 3 UNION ALL

    SELECT 2, DATEADD(dd, 1, GETDATE()), DATEADD(mi, 175,DATEADD(dd, 1, GETDATE())), 'DDD', 4

    --SELECT * FROM #ADPTimeDetail -- sanity check

    -- Yep it's that pesky numbers table again

    SELECT EmpID, [IN_Punch] AS DateWorked,

       CASE WHEN number < 13 THEN number ELSE number-12 END AS [hour],

       CASE WHEN number = DATEPART(hh, [IN_Punch]) THEN CAST(4*DATEPART(mi, [IN_Punch])/60 AS INT)/4.00  

           WHEN number = DATEPART(hh, [OUT_Punch]) THEN CAST(4*DATEPART(mi, [OUT_Punch])/60 AS INT)/4.00

           ELSE 1.00 END AS [portion_of_hour_worked]

    FROM Numbers n, #ADPTimeDetail

    WHERE n.number BETWEEN DATEPART(hh, [IN_Punch]) AND DATEPART(hh, [OUT_Punch])[/font]

    Results:

    EmpID DateWorked hour portion_of_hour_worked

    ----------- ----------------------- ----------- ----------------------

    1 2008-08-22 15:53:02.940 3 .750000

    1 2008-08-22 15:53:02.940 4 1.000000

    1 2008-08-22 15:53:02.940 5 1.000000

    1 2008-08-22 15:53:02.940 6 .750000

    1 2008-08-23 15:53:02.940 3 .750000

    1 2008-08-23 15:53:02.940 4 1.000000

    1 2008-08-23 15:53:02.940 5 1.000000

    1 2008-08-23 15:53:02.940 6 1.000000

    1 2008-08-23 15:53:02.940 7 .000000

    2 2008-08-22 15:53:02.940 3 .750000

    2 2008-08-22 15:53:02.940 4 1.000000

    2 2008-08-22 15:53:02.940 5 1.000000

    2 2008-08-22 15:53:02.940 6 .750000

    2 2008-08-23 15:53:02.940 3 .750000

    2 2008-08-23 15:53:02.940 4 1.000000

    2 2008-08-23 15:53:02.940 5 1.000000

    2 2008-08-23 15:53:02.940 6 .750000

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris,

    Can you explain the why and how of this:

    CAST(4*DATEPART(mi, [IN_Punch])/60 AS INT)/4.00

    I would bet that performance-wise your solution will be the best.

  • Sure Jack!

    Multiply the decimal fraction by 4,

    0.26 becomes 1.04

    0.24 becomes 0.96

    then Cast as INT to truncate

    0.26 becomes 1

    0.24 becomes 0

    then Divide by 4.00

    0.26 becomes 0.25

    0.24 becomes 0.00

    Test with:

    [font="Courier New"]

    SELECT CAST(4*number/60 AS INT)/4.00  

    FROM Numbers

    WHERE number < 60[/font]

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I see... sort of a FLOOR function for quarter hour rounding.

    In my case I'll just record the ugly unrounded decimal because

    we'll be aggregating the labor-hour values in later analysis.

    I don't want to compound rounding errors when we do this.

    You all have certainly given me food for thought here.

    Thanks again.

  • My final code ended up as something of a hybrid of the code all of you supplied. The execution time went from an hour to under 10 seconds!

    Another testament to the power of the TALLY.

  • Awesome. Not just the power of TALLY, but the waste of loops!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply