Converting sequential time records into IN and OUT times

Question

Converting sequential time records into IN and OUT times

Mathanagopal Parameswaran

SSC Enthusiast

Points: 190
More actions
August 8, 2023 at 6:24 am

#4255552

Hi,
We have a Time & Attendance application that stores the time punched by each employee in a sequential format. We have the indication for that. if the employee punches in and out on the same day no problem on that. if the employee punch in at 8 PM and punch out the next day at 5 AM the problem will come. for that time we don't have to punch out because the date was changed by the machine considering the employee's punch out as the next day punch in.
Example:
AttendanceDate EmployeeId InTime OutTime
2023-05-26 4119 2023-05-26 20:56:33 2023-05-26 20:56:33
2023-05-27 4119 2023-05-27 05:04:51 2023-05-27 20:56:49
2023-05-28 4119 2023-05-28 05:54:43 2023-05-28 20:56:19
2023-05-29 4119 2023-05-29 20:56:46 2023-05-29 20:56:46
2023-05-30 4119 2023-05-30 04:15:15 2023-05-30 20:52:35
2023-05-31 4119 2023-05-31 05:30:38 2023-05-31 20:53:58
Expecting Result:
AttendanceDate EmployeeId InTime OutTime
2023-05-26 4119 2023-05-26 20:56:33 2023-05-27 05:04:51
2023-05-27 4119 2023-05-27 20:56:49 2023-05-28 05:54:43
I want to write a query to export records with the in and out times for each employee per day. How can this be done?

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 246962 More actions · Answer 1

Your expected results ignore rows with InTime from 28-31 May, presumably because these rows are considered 'good'?

But if row 3 (28th) is good, why do you want to use its InTime for row 2 (27th)? Would that not imply that the employee attended from 20:56 on 27th to 20:56 on 28th, ie 24 hours?

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Mathanagopal Parameswaran SSC Enthusiast Points: 190 More actions · Answer 2

Your expected results ignore rows with InTime from 28-31 May, presumably because these rows are considered 'good'? - No, missed punch out for the 28th. I just described two sample records.

Actually, that employee working on the night shift. for example 26th Intime - 20:56 and Outime - 27th morning 05:04

Phil Parkin SSC Guru Points: 246962 More actions · Answer 3

Mathanagopal Parameswaran wrote:

Your expected results ignore rows with InTime from 28-31 May, presumably because these rows are considered 'good'? - No, missed punch out for the 28th. I just described two sample records.
Actually, that employee working on the night shift. for example 26th Intime - 20:56 and Outime - 27th morning 05:04

I cannot understand this, sorry. Your sample data and expected results (based on that sample data) need to be accurate. Also, please describe the logic for determining the following:

a) Whether a row is good or not

b) If a row is considered bad, how should its revised InTime or OutTime be derived?

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Ken McKelvey SSCoach Points: 18843 More actions · Answer 4

With:

/* *** Test Data - which, if you want answers, you will provide in future */
SET ANSI_NULLS,  QUOTED_IDENTIFIER, ANSI_PADDING ON;
GO
CREATE TABLE #t
(
    EmployeeId int NOT NULL
    ,PunchTime datetime NOT NULL
    ,PRIMARY KEY (EmployeeId, PunchTime)
);
INSERT INTO #t
VALUES (4119, '2023-05-26 05:56:33')
    ,(4119, '2023-05-26 20:56:33')
    ,(4119, '2023-05-27 05:04:51')
    ,(4119, '2023-05-27 20:56:49')
    ,(4119, '2023-05-28 05:54:43')
    ,(4119, '2023-05-28 20:56:19')
    ,(4119, '2023-05-29 20:56:46')
    ,(4119, '2023-05-30 04:15:15')
    ,(4119, '2023-05-30 20:52:35')
    ,(4119, '2023-05-31 05:30:38')
    ,(4119, '2023-05-31 20:53:58');
GO
/* *** End Test Data *** */

This is a Gaps and Islands problem. One approach, assuming the maximum shift length is 10 hours:

WITH PrevTimes
AS
(
    SELECT EmployeeId, PunchTime
        ,LAG(PunchTime) OVER (PARTITION BY EmployeeId ORDER BY PunchTime) AS PrevPunchTime
    FROM #t
)
,Boundaries
AS
(
    SELECT EmployeeId, PunchTime, PrevPunchTime
        ,CASE
            WHEN DATEDIFF(minute, PrevPunchTime, PunchTime) <= 10 * 60
            THEN 0
            ELSE 1
        END AS NewGrp
    FROM PrevTimes
)
,Grps
AS
(
    SELECT EmployeeId, PunchTime, PrevPunchTime
        ,SUM(NewGrp) OVER (PARTITION BY EmployeeId ORDER BY PunchTime) AS Grp
    FROM Boundaries
)
,Aggregates
AS
(
    SELECT EmployeeId
        ,MIN(PrevPunchTime) AS PrevPunchTime
        ,MIN(PunchTime) AS InTime
        ,MAX(PunchTime) AS OutTime
    FROM Grps
    GROUP BY EmployeeId, Grp
)
SELECT EmployeeId
    ,CASE
        WHEN PrevPunchTime IS NULL
        THEN NULL
        ELSE InTime
    END AS InTime
    ,CASE
        WHEN PrevPunchTime IS NULL
        THEN OutTime
        WHEN InTime = OutTime
        THEN NULL
        ELSE OutTime
    END AS OutTime
FROM Aggregates;

This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.

Mathanagopal Parameswaran SSC Enthusiast Points: 190 More actions · Answer 5

Hi Ken,

Thanks for your script. I'm getting the expected result But we have InTime and OutTime in the same row and single table. How can this be done?

Attachments:

You must be logged in to view attached files.

Mathanagopal Parameswaran SSC Enthusiast Points: 190 More actions · Answer 6

Hi Phil,

a) Whether a row is good or not - NOT

b) If a row is considered bad, how should its revised InTime or OutTime be derived? - I Have Attached the Sample and Expecting Data. Kindly refer to that.

Attachments:

You must be logged in to view attached files.

Ken McKelvey SSCoach Points: 18843 More actions · Answer 7

>>/* *** Test Data - which, if you want answers, you will provide in future */

Mathanagopal Parameswaran SSC Enthusiast Points: 190 More actions · Answer 8

Hi Ken,

Kindly find the below test data.

/* *** Test Data*/

SET ANSI_NULLS,  QUOTED_IDENTIFIER, ANSI_PADDING ON;
GO
CREATE TABLE #t1
(
    EmployeeId int NOT NULL
,AttendanceDate datetime NOT NULL
    ,InTime datetime NOT NULL
,OutTime datetime NOT NULL
    ,PRIMARY KEY (EmployeeId,AttendanceDate)
);
INSERT INTO #t1
VALUES 
(4119,'2023-05-29 00:00:00', '2023-05-29 20:56:46', '2023-05-29 20:56:46'),
(4119,'2023-05-30 00:00:00', '2023-05-30 04:15:15', '2023-05-30 20:52:35'),
(4119,'2023-05-31 00:00:00', '2023-05-31 05:30:38', '2023-05-31 20:53:58'),
(4119,'2023-06-01 00:00:00', '2023-06-01 04:38:38', '2023-06-01 20:43:50'),
(4119,'2023-06-02 00:00:00', '2023-06-02 05:38:42', '2023-06-02 20:38:38'),
(4119,'2023-06-03 00:00:00', '2023-06-03 05:30:39', '2023-06-03 20:35:58')
GO

/* *** End Test Data *** */

This reply was modified 1 year, 3 months ago by Mathanagopal Parameswaran.

Phil Parkin SSC Guru Points: 246962 More actions · Answer 9

Mathanagopal Parameswaran wrote:

Hi Phil,
a) Whether a row is good or not - NOT
b) If a row is considered bad, how should its revised InTime or OutTime be derived? - I Have Attached the Sample and Expecting Data. Kindly refer to that.

This is not a description of logic. You are expecting me to work out the logic based on your sample data. Please describe the logic using words. For example:

"A bad row is one where the InTime and OutTime have the same value"

"A good row is one where the InTime and OutTime occur on the same day. Good rows may be ignored."

I do not think that either of the above statements is correct, however, perhaps now you can understand what I am requesting from you?

The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin

Ken McKelvey SSCoach Points: 18843 More actions · Answer 10

Are you sure this is a table and not a view? A table like this is poor design.

With the given data, just double the number of rows, with CROSS APPLY, and then proceed like the last answer:

WITH DateList
AS
(
    SELECT DISTINCT EmployeeId
        ,CASE
            WHEN X.RowType = 'In'
            THEN T.InTime
            ELSE T.OutTime
        END AS PunchTime
    FROM #t1 T
    CROSS APPLY (VALUES ('In'),('Out')) X (RowType)
)
,PrevTimes
AS
(
    SELECT EmployeeId, PunchTime
        ,LAG(PunchTime) OVER (PARTITION BY EmployeeId ORDER BY PunchTime) AS PrevPunchTime
    FROM DateList
...

drew.allen SSC Guru Points: 76987 More actions · Answer 11

This is a very basic attempt to solve your problem. It does not match your expected data, but, as others have pointed out, your expected data may be missing rows. You can update the ROW_NUMBER() to check for other factors that might reset the count by including those resets in the PARTITION BY clause.

WITH Clocktimes AS
(
    SELECT t.EmployeeId
     , t.AttendanceDate
     , t.InTime
     , t.OutTime
     , ct.ClockTime
         , ROW_NUMBER() OVER(PARTITION BY t.EmployeeId ORDER BY ct.ClockTime ) - 1 AS rn
    FROM #t1 AS t
    CROSS APPLY
    (
        SELECT t.InTime AS ClockTime
        UNION
        SELECT t.OutTime
    ) ct
)
SELECT t.EmployeeId
 , MIN(t.AttendanceDate) AS AttendanceDate
     , MIN(t.ClockTime) AS InTime
     , MAX(t.ClockTime) AS OutTime
FROM Clocktimes AS t
GROUP BY t.EmployeeId, rn / 2

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Mathanagopal Parameswaran SSC Enthusiast Points: 190 More actions · Answer 12

Hi Ken,

Thanks for your script it's working as expected. you're correct the above test data is not correct. Apologies for that. we have lots of columns that's why I didn't mention those columns. We have two shift time like 8 hours and 12 hours.

SET ANSI_NULLS,  QUOTED_IDENTIFIER, ANSI_PADDING ON;
GO
CREATE TABLE #t1
(
    EmployeeId int NOT NULL
,AttendanceDate datetime NOT NULL
    ,InTime datetime NOT NULL
,OutTime datetime NOT NULL
    ,PRIMARY KEY (EmployeeId,AttendanceDate)
);
INSERT INTO #t1
VALUES 
(4119,'2023-05-14 00:00:00','2023-05-14 10:08:14','2023-05-14 23:24:45'),
(4119,'2023-05-15 00:00:00','2023-05-15 09:58:18','2023-05-15 22:27:44'),
(4119,'2023-05-16 00:00:00','2023-05-16 08:51:18','2023-05-16 21:07:42'),
(4119,'2023-05-17 00:00:00','2023-05-17 08:55:37','2023-05-17 21:06:09'),
(4119,'2023-05-18 00:00:00','2023-05-18 08:55:02','2023-05-18 21:04:18'),
(4119,'2023-05-19 00:00:00','2023-05-19 08:51:05','2023-05-19 21:07:33'),
(4119,'2023-05-20 00:00:00','2023-05-20 08:56:23','2023-05-20 21:03:49'),
(4119,'2023-05-21 00:00:00','2023-05-21 08:56:56','2023-05-21 21:03:01'),
(4119,'2023-05-22 00:00:00','2023-05-22 08:53:14','2023-05-22 21:01:30'),
(4119,'2023-05-23 00:00:00','2023-05-23 08:53:50','2023-05-23 21:05:19'),
(4119,'2023-05-24 00:00:00','2023-05-24 08:55:22','2023-05-24 21:04:01'),
(4119,'2023-05-25 00:00:00','2023-05-25 08:50:01','2023-05-25 21:05:59'),
(4119,'2023-05-29 00:00:00','2023-05-29 20:56:46','2023-05-29 20:56:46'),
(4119,'2023-05-30 00:00:00','2023-05-30 04:15:15','2023-05-30 20:52:35'),
(4119,'2023-05-31 00:00:00','2023-05-31 05:30:38','2023-05-31 20:53:58'),
(4119,'2023-06-01 00:00:00','2023-06-01 04:38:38','2023-06-01 20:43:50'),
(4119,'2023-06-02 00:00:00','2023-06-02 05:38:42','2023-06-02 20:38:38'),
(4119,'2023-06-03 00:00:00','2023-06-03 05:30:39','2023-06-03 20:35:58')
GO

Mathanagopal Parameswaran SSC Enthusiast Points: 190 More actions · Answer 13

drew.allen wrote:

This is a very basic attempt to solve your problem. It does not match your expected data, but, as others have pointed out, your expected data may be missing rows. You can update the ROW_NUMBER() to check for other factors that might reset the count by including those resets in the PARTITION BY clause.

WITH Clocktimes AS
(
    SELECT t.EmployeeId
     , t.AttendanceDate
     , t.InTime
     , t.OutTime
     , ct.ClockTime
         , ROW_NUMBER() OVER(PARTITION BY t.EmployeeId ORDER BY ct.ClockTime ) - 1 AS rn
    FROM #t1 AS t
    CROSS APPLY
    (
        SELECT t.InTime AS ClockTime
        UNION
        SELECT t.OutTime
    ) ct
)
SELECT t.EmployeeId
 , MIN(t.AttendanceDate) AS AttendanceDate
     , MIN(t.ClockTime) AS InTime
     , MAX(t.ClockTime) AS OutTime
FROM Clocktimes AS t
GROUP BY t.EmployeeId, rn / 2

Hi Allen,

Thanks for your script. I have checked it ignored some dates. may I know why it's ignored? FYI I'll attach the result also.

Result1

Ken McKelvey SSCoach Points: 18843 More actions · Answer 14

As your latest data seems to have no missing times, try something like:

WITH DateList
AS
(
    SELECT T.EmployeeId, X.PunchTime
        ,(ROW_NUMBER() OVER (PARTITION BY T.EmployeeId ORDER BY X.PunchTime) -1 )/2 AS Grp
    FROM #t1 T
    CROSS APPLY (VALUES (T.InTime),(T.OutTime)) X (PunchTime)
    GROUP BY T.EmployeeId, X.PunchTime
)
SELECT EmployeeId
    ,MIN(PunchTime) AS InTime
    ,MAX(PunchTime) AS OutTime
FROM DateList
GROUP BY EmployeeId, Grp;

Whatever your data is like, you should now have enough information to solve the problem yourself.

This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.
This reply was modified 1 year, 3 months ago by Ken McKelvey.