Converting sequential time records into IN and OUT times

  • Hi,

    We have a Time & Attendance application that stores the time punched by each employee in a sequential format. We have the indication for that. if the employee punches in and out on the same day no problem on that. if the employee punch in at 8 PM and punch out the next day at 5 AM the problem will come.  for that time we don't have to punch out because the date was changed by the machine considering the employee's punch out as the next day punch in.

    Example:

    AttendanceDate EmployeeId           InTime                               OutTime

    2023-05-26         4119              2023-05-26 20:56:33     2023-05-26 20:56:33

    2023-05-27         4119              2023-05-27 05:04:51      2023-05-27 20:56:49

    2023-05-28        4119              2023-05-28 05:54:43      2023-05-28 20:56:19

    2023-05-29        4119              2023-05-29 20:56:46      2023-05-29 20:56:46

    2023-05-30        4119              2023-05-30 04:15:15       2023-05-30 20:52:35

    2023-05-31         4119              2023-05-31 05:30:38      2023-05-31 20:53:58

    Expecting Result:

    AttendanceDate EmployeeId           InTime                               OutTime

    2023-05-26         4119              2023-05-26 20:56:33      2023-05-27 05:04:51

    2023-05-27         4119              2023-05-27 20:56:49      2023-05-28 05:54:43

    I want to write a query to export records with the in and out times for each employee per day. How can this be done?

  • Your expected results ignore rows with InTime from 28-31 May, presumably because these rows are considered 'good'?

    But if row 3 (28th) is good, why do you want to use its InTime for row 2 (27th)? Would that not imply that the employee attended from 20:56 on 27th to 20:56 on 28th, ie 24 hours?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Your expected results ignore rows with InTime from 28-31 May, presumably because these rows are considered 'good'? - No, missed punch out for the 28th. I just described two sample records.

    Actually, that employee working on the night shift. for example 26th Intime - 20:56 and Outime - 27th morning 05:04

     

  • Mathanagopal Parameswaran wrote:

    Your expected results ignore rows with InTime from 28-31 May, presumably because these rows are considered 'good'? - No, missed punch out for the 28th. I just described two sample records.

    Actually, that employee working on the night shift. for example 26th Intime - 20:56 and Outime - 27th morning 05:04

    I cannot understand this, sorry. Your sample data and expected results (based on that sample data) need to be accurate. Also, please describe the logic for determining the following:

    a) Whether a row is good or not

    b) If a row is considered bad, how should its revised InTime or OutTime be derived?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • With:

    /* *** Test Data - which, if you want answers, you will provide in future */
    SET ANSI_NULLS, QUOTED_IDENTIFIER, ANSI_PADDING ON;
    GO
    CREATE TABLE #t
    (
    EmployeeId int NOT NULL
    ,PunchTime datetime NOT NULL
    ,PRIMARY KEY (EmployeeId, PunchTime)
    );
    INSERT INTO #t
    VALUES (4119, '2023-05-26 05:56:33')
    ,(4119, '2023-05-26 20:56:33')
    ,(4119, '2023-05-27 05:04:51')
    ,(4119, '2023-05-27 20:56:49')
    ,(4119, '2023-05-28 05:54:43')
    ,(4119, '2023-05-28 20:56:19')
    ,(4119, '2023-05-29 20:56:46')
    ,(4119, '2023-05-30 04:15:15')
    ,(4119, '2023-05-30 20:52:35')
    ,(4119, '2023-05-31 05:30:38')
    ,(4119, '2023-05-31 20:53:58');
    GO
    /* *** End Test Data *** */

    This is a Gaps and Islands problem. One approach, assuming the maximum shift length is 10 hours:

    WITH PrevTimes
    AS
    (
    SELECT EmployeeId, PunchTime
    ,LAG(PunchTime) OVER (PARTITION BY EmployeeId ORDER BY PunchTime) AS PrevPunchTime
    FROM #t
    )
    ,Boundaries
    AS
    (
    SELECT EmployeeId, PunchTime, PrevPunchTime
    ,CASE
    WHEN DATEDIFF(minute, PrevPunchTime, PunchTime) <= 10 * 60
    THEN 0
    ELSE 1
    END AS NewGrp
    FROM PrevTimes
    )
    ,Grps
    AS
    (
    SELECT EmployeeId, PunchTime, PrevPunchTime
    ,SUM(NewGrp) OVER (PARTITION BY EmployeeId ORDER BY PunchTime) AS Grp
    FROM Boundaries
    )
    ,Aggregates
    AS
    (
    SELECT EmployeeId
    ,MIN(PrevPunchTime) AS PrevPunchTime
    ,MIN(PunchTime) AS InTime
    ,MAX(PunchTime) AS OutTime
    FROM Grps
    GROUP BY EmployeeId, Grp
    )
    SELECT EmployeeId
    ,CASE
    WHEN PrevPunchTime IS NULL
    THEN NULL
    ELSE InTime
    END AS InTime
    ,CASE
    WHEN PrevPunchTime IS NULL
    THEN OutTime
    WHEN InTime = OutTime
    THEN NULL
    ELSE OutTime
    END AS OutTime
    FROM Aggregates;

    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
  • Hi Ken,

    Thanks for your script. I'm getting the expected result But we have InTime and OutTime in the same row and single table.  How can this be done?

     

    Attachments:
    You must be logged in to view attached files.
  • Hi Phil,

    a) Whether a row is good or not - NOT

    b) If a row is considered bad, how should its revised InTime or OutTime be derived? - I Have Attached the Sample and Expecting Data. Kindly refer to that.

    Attachments:
    You must be logged in to view attached files.
  • >>/* *** Test Data - which, if you want answers, you will provide in future */

  • Hi Ken,

    Kindly find the below test data.

    /* *** Test Data*/

    SET ANSI_NULLS,  QUOTED_IDENTIFIER, ANSI_PADDING ON;
    GO
    CREATE TABLE #t1
    (
        EmployeeId int NOT NULL
    ,AttendanceDate datetime NOT NULL
        ,InTime datetime NOT NULL
    ,OutTime datetime NOT NULL
        ,PRIMARY KEY (EmployeeId,AttendanceDate)
    );
    INSERT INTO #t1
    VALUES 
    (4119,'2023-05-29 00:00:00', '2023-05-29 20:56:46', '2023-05-29 20:56:46'),
    (4119,'2023-05-30 00:00:00', '2023-05-30 04:15:15', '2023-05-30 20:52:35'),
    (4119,'2023-05-31 00:00:00', '2023-05-31 05:30:38', '2023-05-31 20:53:58'),
    (4119,'2023-06-01 00:00:00', '2023-06-01 04:38:38', '2023-06-01 20:43:50'),
    (4119,'2023-06-02 00:00:00', '2023-06-02 05:38:42', '2023-06-02 20:38:38'),
    (4119,'2023-06-03 00:00:00', '2023-06-03 05:30:39', '2023-06-03 20:35:58')
    GO

    /* *** End Test Data *** */
  • Mathanagopal Parameswaran wrote:

    Hi Phil,

    a) Whether a row is good or not - NOT

    b) If a row is considered bad, how should its revised InTime or OutTime be derived? - I Have Attached the Sample and Expecting Data. Kindly refer to that.

    This is not a description of logic. You are expecting me to work out the logic based on your sample data. Please describe the logic using words. For example:

    "A bad row is one where the InTime and OutTime have the same value"

    "A good row is one where the InTime and OutTime occur on the same day. Good rows may be ignored."

    I do not think that either of the above statements is correct, however, perhaps now you can understand what I am requesting from you?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Are you sure this is a table and not a view? A table like this is poor design.

    With the given data, just double the number of rows, with CROSS APPLY, and then proceed like the last answer:

    WITH DateList
    AS
    (
    SELECT DISTINCT EmployeeId
    ,CASE
    WHEN X.RowType = 'In'
    THEN T.InTime
    ELSE T.OutTime
    END AS PunchTime
    FROM #t1 T
    CROSS APPLY (VALUES ('In'),('Out')) X (RowType)
    )
    ,PrevTimes
    AS
    (
    SELECT EmployeeId, PunchTime
    ,LAG(PunchTime) OVER (PARTITION BY EmployeeId ORDER BY PunchTime) AS PrevPunchTime
    FROM DateList
    ...
  • This is a very basic attempt to solve your problem.  It does not match your expected data, but, as others have pointed out, your expected data may be missing rows.  You can update the ROW_NUMBER() to check for other factors that might reset the count by including those resets in the PARTITION BY clause.

    WITH Clocktimes AS
    (
    SELECT t.EmployeeId
    , t.AttendanceDate
    , t.InTime
    , t.OutTime
    , ct.ClockTime
    , ROW_NUMBER() OVER(PARTITION BY t.EmployeeId ORDER BY ct.ClockTime ) - 1 AS rn
    FROM #t1 AS t
    CROSS APPLY
    (
    SELECT t.InTime AS ClockTime
    UNION
    SELECT t.OutTime
    ) ct
    )
    SELECT t.EmployeeId
    , MIN(t.AttendanceDate) AS AttendanceDate
    , MIN(t.ClockTime) AS InTime
    , MAX(t.ClockTime) AS OutTime
    FROM Clocktimes AS t
    GROUP BY t.EmployeeId, rn / 2

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi Ken,

    Thanks for your script it's working as expected. you're correct the above test data is not correct. Apologies for that. we have lots of columns that's why I didn't mention those columns. We have two shift time like 8 hours and 12 hours.

    SET ANSI_NULLS,  QUOTED_IDENTIFIER, ANSI_PADDING ON;
    GO
    CREATE TABLE #t1
    (
    EmployeeId int NOT NULL
    ,AttendanceDate datetime NOT NULL
    ,InTime datetime NOT NULL
    ,OutTime datetime NOT NULL
    ,PRIMARY KEY (EmployeeId,AttendanceDate)
    );
    INSERT INTO #t1
    VALUES
    (4119,'2023-05-14 00:00:00','2023-05-14 10:08:14','2023-05-14 23:24:45'),
    (4119,'2023-05-15 00:00:00','2023-05-15 09:58:18','2023-05-15 22:27:44'),
    (4119,'2023-05-16 00:00:00','2023-05-16 08:51:18','2023-05-16 21:07:42'),
    (4119,'2023-05-17 00:00:00','2023-05-17 08:55:37','2023-05-17 21:06:09'),
    (4119,'2023-05-18 00:00:00','2023-05-18 08:55:02','2023-05-18 21:04:18'),
    (4119,'2023-05-19 00:00:00','2023-05-19 08:51:05','2023-05-19 21:07:33'),
    (4119,'2023-05-20 00:00:00','2023-05-20 08:56:23','2023-05-20 21:03:49'),
    (4119,'2023-05-21 00:00:00','2023-05-21 08:56:56','2023-05-21 21:03:01'),
    (4119,'2023-05-22 00:00:00','2023-05-22 08:53:14','2023-05-22 21:01:30'),
    (4119,'2023-05-23 00:00:00','2023-05-23 08:53:50','2023-05-23 21:05:19'),
    (4119,'2023-05-24 00:00:00','2023-05-24 08:55:22','2023-05-24 21:04:01'),
    (4119,'2023-05-25 00:00:00','2023-05-25 08:50:01','2023-05-25 21:05:59'),
    (4119,'2023-05-29 00:00:00','2023-05-29 20:56:46','2023-05-29 20:56:46'),
    (4119,'2023-05-30 00:00:00','2023-05-30 04:15:15','2023-05-30 20:52:35'),
    (4119,'2023-05-31 00:00:00','2023-05-31 05:30:38','2023-05-31 20:53:58'),
    (4119,'2023-06-01 00:00:00','2023-06-01 04:38:38','2023-06-01 20:43:50'),
    (4119,'2023-06-02 00:00:00','2023-06-02 05:38:42','2023-06-02 20:38:38'),
    (4119,'2023-06-03 00:00:00','2023-06-03 05:30:39','2023-06-03 20:35:58')
    GO
  • drew.allen wrote:

    This is a very basic attempt to solve your problem.  It does not match your expected data, but, as others have pointed out, your expected data may be missing rows.  You can update the ROW_NUMBER() to check for other factors that might reset the count by including those resets in the PARTITION BY clause.

    WITH Clocktimes AS
    (
    SELECT t.EmployeeId
    , t.AttendanceDate
    , t.InTime
    , t.OutTime
    , ct.ClockTime
    , ROW_NUMBER() OVER(PARTITION BY t.EmployeeId ORDER BY ct.ClockTime ) - 1 AS rn
    FROM #t1 AS t
    CROSS APPLY
    (
    SELECT t.InTime AS ClockTime
    UNION
    SELECT t.OutTime
    ) ct
    )
    SELECT t.EmployeeId
    , MIN(t.AttendanceDate) AS AttendanceDate
    , MIN(t.ClockTime) AS InTime
    , MAX(t.ClockTime) AS OutTime
    FROM Clocktimes AS t
    GROUP BY t.EmployeeId, rn / 2

     

    Hi Allen,

    Thanks for your script. I have checked it ignored some dates. may I know why it's ignored? FYI I'll attach the result also.

    Result1

  • As your latest data seems to have no missing times, try something like:

    WITH DateList
    AS
    (
    SELECT T.EmployeeId, X.PunchTime
    ,(ROW_NUMBER() OVER (PARTITION BY T.EmployeeId ORDER BY X.PunchTime) -1 )/2 AS Grp
    FROM #t1 T
    CROSS APPLY (VALUES (T.InTime),(T.OutTime)) X (PunchTime)
    GROUP BY T.EmployeeId, X.PunchTime
    )
    SELECT EmployeeId
    ,MIN(PunchTime) AS InTime
    ,MAX(PunchTime) AS OutTime
    FROM DateList
    GROUP BY EmployeeId, Grp;

    Whatever your data is like, you should now have enough information to solve the problem yourself.

    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.
    • This reply was modified 1 year, 3 months ago by  Ken McKelvey.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply