Subtracting SQL Transaction Times Row to Row

Question

Subtracting SQL Transaction Times Row to Row

brubin 51691

Valued Member

Points: 60
More actions
April 20, 2012 at 6:33 pm

#256765

I have a table called Transactions, which has 4 fields (for the sake of this example):
[TimePunchID] int PK auto#
[DateAdded] datetime
[UserID] varchar(100)
[Activity] varchar(100)
Example output:
IDdateaddeduserid activity DifferenceInMinutes
12012-04-20 08:15:00 AM 200Break
22012-04-20 08:30:00 AM 200Project A
32012-04-20 10:30:00 AM 200Project B
42012-04-20 08:15:00 AM 240Break
52012-04-20 09:30:00 AM 240Project C
62012-04-20 11:30:00 AM 240Project A
I want to find an efficient way to programmatically subtract the date difference between the rows so that I can group and display the hours that each person spent on each activity. The first entry for each user won't have a difference as there would be nothing from which to subtract; the following entries would. The idea is to populate the DifferenceInMinutes field with the minute difference between the dates.
Thanks so much!!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

CapnHector SSCoach Points: 16143 More actions · Answer 1

what is the end time you are using to figure out the difference in min. and can you post the DDL and Sample data in a more consumer friendly format?? please see my signature for help on how we like to see it here.

For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]

brubin 51691 Valued Member Points: 60 More actions · Answer 2

Thanks for the reply. Here is a quick example of some test data.

--Create a table to hold the data.

create table #tbl_Data (TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))

--Populate some data in the table.

insert into #tbl_Data (dateadded, userid, activity)

SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL

SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL

SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL

SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL

SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL

SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL

SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL

SELECT '2012-04-20 09:58:00 AM','23456','Project C'

--View the results.

select * from #tbl_Data

--Release the table.

drop table #tbl_data

Ideally, I would like to see the difference between each row (subtracting the dates in minutes). For example, if I summed the total amount of work that UserID 12345 performed on Project A, I would get X minutes.

Please let me know if I can clarify anything else.

brubin 51691 Valued Member Points: 60 More actions · Answer 3

I can actually get close to where I think that I need to be. The problem with these results is that the first entry gets no time associated with it (because it's the first entry), even though technically 15 minutes were spent on it (8:15 to 8:30). Therefore, when the data is summed up by activity, the times are off by each previous amount.

--Create a table to hold the data.

create table #tbl_Data (TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))

--Populate some data in the table.

insert into #tbl_Data (dateadded, userid, activity)

SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL

SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL

SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL

SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL

SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL

SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL

SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL

SELECT '2012-04-20 09:58:00 AM','23456','Project C'

--View the results.

select * from #tbl_Data

Select userid, Activity,

Cast(dateadded AS DateTime) as [fulldate],

DiffFromPrevious = Coalesce(

DateDiff(

mi,

(

Select Top 1 Cast(dateadded AS DateTime) AS [fulldate]

From #tbl_Data

Where Cast(dateadded AS DateTime) < Cast(t1.dateadded AS DateTime)

and convert(varchar(100),dateadded,101)

= convert(datetime,'4/20/2012',101)

and userid = '12345'

Order By [fulldate] Desc

),

Cast(dateadded AS DateTime)

),

0)

From

#tbl_Data t1

where convert(varchar(100),t1.dateadded,101)

= convert(datetime,'4/20/2012',101)

and userid = '12345'

Order By

[fulldate] Asc

drop table #tbl_data

brubin 51691 Valued Member Points: 60 More actions · Answer 4

To concretely answer your question about the End Time - each entry "ends" the time spent on the previous entry; for example, 8:15am - 8:30am, (8:30am ends the 8:15am entry and starts the new entry, yielding 15 minutes for the activity associated with the 8:15am entry).

Mark Fitzgerald-331224 SSCrazy Eights Points: 8489 More actions · Answer 5

brubin 51691 (4/20/2012)
To concretely answer your question about the End Time - each entry "ends" the time spent on the previous entry; for example, 8:15am - 8:30am, (8:30am ends the 8:15am entry and starts the new entry, yielding 15 minutes for the activity associated with the 8:15am entry).

Given the above statement, you cannot arrive at an answer as the last task has not ended yielding infinite. Given the demo data in your code the second user has started Project C at 9:58 and is still working.

This is a business decision you will need to make. Do you want the sum f completed tasks?

Fitz

brubin 51691 Valued Member Points: 60 More actions · Answer 6

Thanks for the reply. Yes, the sum of completed tasks is needed; the last entry for each user will be open.

Dwain Camps SSC Guru Points: 86948 More actions · Answer 7

Brubin,

I believe the following will get you what you want although I played a bit fast and loose with your problem description.

--Create a table to hold the data.

DECLARE @tbl_Data TABLE

(TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))

--Populate some data in the table.

insert into @tbl_Data (dateadded, userid, activity)

SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL

SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL

SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL

SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL

SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL

SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL

SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL

SELECT '2012-04-20 09:58:00 AM','23456','Project C'

;WITH MyData AS (

SELECT dateadded, userid, activity

,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk

FROM @tbl_data)

,ByActivity AS (

SELECT dateadded, userid, activity

,DATEDIFF(minute

,dateadded

,(SELECT dateadded

FROM MyData t2

WHERE t1.userid = t2.userid and t1.rk + 1 = t2.rk)

) As DifferenceInMinute

FROM MyData t1)

--SELECT dateadded, userid, activity, DifferenceInMinute

SELECT userid, activity, SUM(DifferenceInMinute) As TotalByActivity

FROM ByActivity

GROUP BY userid, activity

You can see what I mean by "fast and loose" if you uncomment the SELECT, comment the SELECT with the SUM and comment the GROUP BY lines.

Basically, I've put the DifferenceInMinute on the task start record (not the task close record). This allows the GROUPing to SUM correctly by activity.

My results set:

useridactivityTotalByActivity

12345Break120

23456Break90

12345Project A20

23456Project B15

23456Project CNULL

12345Project D75

This is actually a knock off of the cumulative sum problem that Jeff Moden's "quirky update" would probably solve faster, but this solution should work.

It can also be done with a JOIN of course.

;WITH MyData AS (

SELECT dateadded, userid, activity

,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk

FROM @tbl_data)

SELECT t1.userid, t1.activity, SUM(DATEDIFF(minute, t1.dateadded, t2.dateadded)) As TotalByActivity

FROM MyData t1

LEFT JOIN MyData t2 ON t1.userid = t2.userid and t1.rk + 1 = t2.rk

GROUP BY t1.userid, t1.activity

ORDER BY t1.userid, t1.activity

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

CapnHector SSCoach Points: 16143 More actions · Answer 8

dwain.c (4/22/2012)
Brubin,
I believe the following will get you what you want although I played a bit fast and loose with your problem description.
--Create a table to hold the data.
DECLARE @tbl_Data TABLE
(TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))
--Populate some data in the table.
insert into @tbl_Data (dateadded, userid, activity)
SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL
SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL
SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL
SELECT '2012-04-20 09:58:00 AM','23456','Project C'
;WITH MyData AS (
SELECT dateadded, userid, activity
,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk
FROM @tbl_data)
,ByActivity AS (
SELECT dateadded, userid, activity
,DATEDIFF(minute
,dateadded
,(SELECT dateadded
FROM MyData t2
WHERE t1.userid = t2.userid and t1.rk + 1 = t2.rk)
) As DifferenceInMinute
FROM MyData t1)
--SELECT dateadded, userid, activity, DifferenceInMinute
SELECT userid, activity, SUM(DifferenceInMinute) As TotalByActivity
FROM ByActivity
GROUP BY userid, activity
You can see what I mean by "fast and loose" if you uncomment the SELECT, comment the SELECT with the SUM and comment the GROUP BY lines.
Basically, I've put the DifferenceInMinute on the task start record (not the task close record). This allows the GROUPing to SUM correctly by activity.
My results set:
useridactivityTotalByActivity
12345Break120
23456Break90
12345Project A20
23456Project B15
23456Project CNULL
12345Project D75
This is actually a knock off of the cumulative sum problem that Jeff Moden's "quirky update" would probably solve faster, but this solution should work.
It can also be done with a JOIN of course.
;WITH MyData AS (
SELECT dateadded, userid, activity
,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk
FROM @tbl_data)
SELECT t1.userid, t1.activity, SUM(DATEDIFF(minute, t1.dateadded, t2.dateadded)) As TotalByActivity
FROM MyData t1
LEFT JOIN MyData t2 ON t1.userid = t2.userid and t1.rk + 1 = t2.rk
GROUP BY t1.userid, t1.activity
ORDER BY t1.userid, t1.activity

i did not have time over the weekend to post back but went with dwain's self join solution for my approach. had the exact same code to. (well almost but same enough not to post)

For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]

brubin 51691 Valued Member Points: 60 More actions · Answer 9

Thanks for the replies! Both of the last two posts produce solid output, and I really appreciate the effort. The last part of the equation is to take the minutes summed and display it as hh.mm (ie, 8.50 would = 8 hours 30 minutes) to 2 decimal places to ensure that the actual times being summed account for seconds.

Dwain Camps SSC Guru Points: 86948 More actions · Answer 10

Normally I try to avoid self joins because they can be slow performance-wise with many rows in the table and there's usually another approach that is faster.

In this case though, the query plan cost for both solutions I posted came up identical, even though the plans themselves were slightly different, so I posted the self join option because some may think it's a little easier on the eyes.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86948 More actions · Answer 11

Like I said, self joins aren't particularly efficient. This solution produces a lower query plan cost so is likely to run faster than the self join or the first suggestion I made.

;WITH MyData AS (

SELECT dateadded, userid, activity

,(SELECT TOP 1 dateadded

FROM @tbl_data t2

WHERE t1.userid = t2.userid and t1.dateadded < t2.dateadded

ORDER BY dateadded) As dateadded2

FROM @tbl_data t1)

SELECT userid, activity, SUM(DATEDIFF(minute, dateadded, dateadded2)) As TotalByActivity

FROM MyData t1

GROUP BY userid, activity

ORDER BY userid, activity

-- With formatting to MM:SS

;WITH MyData AS (

SELECT dateadded, userid, activity

,(SELECT TOP 1 dateadded

FROM @tbl_data t2

WHERE t1.userid = t2.userid and t1.dateadded < t2.dateadded

ORDER BY dateadded) As dateadded2

FROM @tbl_data t1)

SELECT userid, activity

,CAST(TotalByActivity/60 AS VARCHAR) + ':' +

RIGHT('00'+CAST(TotalByActivity%60 AS VARCHAR),2) As [TotalByActivity MM:SS]

FROM (

SELECT userid, activity, SUM(DATEDIFF(second, dateadded, dateadded2)) As TotalByActivity

FROM MyData t1

GROUP BY userid, activity) x

ORDER BY userid, activity

The second version even formats to MM:SS as you want.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St