April 20, 2012 at 6:33 pm
I have a table called Transactions, which has 4 fields (for the sake of this example):
[TimePunchID] int PK auto#
[DateAdded] datetime
[UserID] varchar(100)
[Activity] varchar(100)
Example output:
IDdateaddeduserid activity DifferenceInMinutes
12012-04-20 08:15:00 AM 200Break
22012-04-20 08:30:00 AM 200Project A
32012-04-20 10:30:00 AM 200Project B
42012-04-20 08:15:00 AM 240Break
52012-04-20 09:30:00 AM 240Project C
62012-04-20 11:30:00 AM 240Project A
I want to find an efficient way to programmatically subtract the date difference between the rows so that I can group and display the hours that each person spent on each activity. The first entry for each user won't have a difference as there would be nothing from which to subtract; the following entries would. The idea is to populate the DifferenceInMinutes field with the minute difference between the dates.
Thanks so much!!
April 20, 2012 at 7:00 pm
what is the end time you are using to figure out the difference in min. and can you post the DDL and Sample data in a more consumer friendly format?? please see my signature for help on how we like to see it here.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 20, 2012 at 7:16 pm
Thanks for the reply. Here is a quick example of some test data.
--Create a table to hold the data.
create table #tbl_Data (TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))
--Populate some data in the table.
insert into #tbl_Data (dateadded, userid, activity)
SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL
SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL
SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL
SELECT '2012-04-20 09:58:00 AM','23456','Project C'
--View the results.
select * from #tbl_Data
--Release the table.
drop table #tbl_data
Ideally, I would like to see the difference between each row (subtracting the dates in minutes). For example, if I summed the total amount of work that UserID 12345 performed on Project A, I would get X minutes.
Please let me know if I can clarify anything else.
April 20, 2012 at 7:24 pm
I can actually get close to where I think that I need to be. The problem with these results is that the first entry gets no time associated with it (because it's the first entry), even though technically 15 minutes were spent on it (8:15 to 8:30). Therefore, when the data is summed up by activity, the times are off by each previous amount.
--Create a table to hold the data.
create table #tbl_Data (TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))
--Populate some data in the table.
insert into #tbl_Data (dateadded, userid, activity)
SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL
SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL
SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL
SELECT '2012-04-20 09:58:00 AM','23456','Project C'
--View the results.
select * from #tbl_Data
Select userid, Activity,
Cast(dateadded AS DateTime) as [fulldate],
DiffFromPrevious = Coalesce(
DateDiff(
mi,
(
Select Top 1 Cast(dateadded AS DateTime) AS [fulldate]
From #tbl_Data
Where Cast(dateadded AS DateTime) < Cast(t1.dateadded AS DateTime)
and convert(varchar(100),dateadded,101)
= convert(datetime,'4/20/2012',101)
and userid = '12345'
Order By [fulldate] Desc
),
Cast(dateadded AS DateTime)
),
0)
From
#tbl_Data t1
where convert(varchar(100),t1.dateadded,101)
= convert(datetime,'4/20/2012',101)
and userid = '12345'
Order By
[fulldate] Asc
drop table #tbl_data
April 20, 2012 at 7:29 pm
To concretely answer your question about the End Time - each entry "ends" the time spent on the previous entry; for example, 8:15am - 8:30am, (8:30am ends the 8:15am entry and starts the new entry, yielding 15 minutes for the activity associated with the 8:15am entry).
April 20, 2012 at 10:49 pm
brubin 51691 (4/20/2012)
To concretely answer your question about the End Time - each entry "ends" the time spent on the previous entry; for example, 8:15am - 8:30am, (8:30am ends the 8:15am entry and starts the new entry, yielding 15 minutes for the activity associated with the 8:15am entry).
Given the above statement, you cannot arrive at an answer as the last task has not ended yielding infinite. Given the demo data in your code the second user has started Project C at 9:58 and is still working.
This is a business decision you will need to make. Do you want the sum f completed tasks?
Fitz
April 20, 2012 at 10:55 pm
Thanks for the reply. Yes, the sum of completed tasks is needed; the last entry for each user will be open.
April 22, 2012 at 8:16 pm
Brubin,
I believe the following will get you what you want although I played a bit fast and loose with your problem description.
--Create a table to hold the data.
DECLARE @tbl_Data TABLE
(TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))
--Populate some data in the table.
insert into @tbl_Data (dateadded, userid, activity)
SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL
SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL
SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL
SELECT '2012-04-20 09:58:00 AM','23456','Project C'
;WITH MyData AS (
SELECT dateadded, userid, activity
,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk
FROM @tbl_data)
,ByActivity AS (
SELECT dateadded, userid, activity
,DATEDIFF(minute
,dateadded
,(SELECT dateadded
FROM MyData t2
WHERE t1.userid = t2.userid and t1.rk + 1 = t2.rk)
) As DifferenceInMinute
FROM MyData t1)
--SELECT dateadded, userid, activity, DifferenceInMinute
SELECT userid, activity, SUM(DifferenceInMinute) As TotalByActivity
FROM ByActivity
GROUP BY userid, activity
You can see what I mean by "fast and loose" if you uncomment the SELECT, comment the SELECT with the SUM and comment the GROUP BY lines.
Basically, I've put the DifferenceInMinute on the task start record (not the task close record). This allows the GROUPing to SUM correctly by activity.
My results set:
useridactivityTotalByActivity
12345Break120
23456Break90
12345Project A20
23456Project B15
23456Project CNULL
12345Project D75
This is actually a knock off of the cumulative sum problem that Jeff Moden's "quirky update" would probably solve faster, but this solution should work.
It can also be done with a JOIN of course.
;WITH MyData AS (
SELECT dateadded, userid, activity
,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk
FROM @tbl_data)
SELECT t1.userid, t1.activity, SUM(DATEDIFF(minute, t1.dateadded, t2.dateadded)) As TotalByActivity
FROM MyData t1
LEFT JOIN MyData t2 ON t1.userid = t2.userid and t1.rk + 1 = t2.rk
GROUP BY t1.userid, t1.activity
ORDER BY t1.userid, t1.activity
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 23, 2012 at 8:44 am
dwain.c (4/22/2012)
Brubin,I believe the following will get you what you want although I played a bit fast and loose with your problem description.
--Create a table to hold the data.
DECLARE @tbl_Data TABLE
(TimePunchID int identity(1,1), dateadded datetime, userid varchar(100), activity varchar(100))
--Populate some data in the table.
insert into @tbl_Data (dateadded, userid, activity)
SELECT '2012-04-20 08:15:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 08:30:00 AM','12345','Project D' UNION ALL
SELECT '2012-04-20 09:45:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 11:45:00 AM','12345','Project A' UNION ALL
SELECT '2012-04-20 11:50:00 AM','12345','Break' UNION ALL
SELECT '2012-04-20 08:13:00 AM','23456','Project B' UNION ALL
SELECT '2012-04-20 08:28:00 AM','23456','Break' UNION ALL
SELECT '2012-04-20 09:58:00 AM','23456','Project C'
;WITH MyData AS (
SELECT dateadded, userid, activity
,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk
FROM @tbl_data)
,ByActivity AS (
SELECT dateadded, userid, activity
,DATEDIFF(minute
,dateadded
,(SELECT dateadded
FROM MyData t2
WHERE t1.userid = t2.userid and t1.rk + 1 = t2.rk)
) As DifferenceInMinute
FROM MyData t1)
--SELECT dateadded, userid, activity, DifferenceInMinute
SELECT userid, activity, SUM(DifferenceInMinute) As TotalByActivity
FROM ByActivity
GROUP BY userid, activity
You can see what I mean by "fast and loose" if you uncomment the SELECT, comment the SELECT with the SUM and comment the GROUP BY lines.
Basically, I've put the DifferenceInMinute on the task start record (not the task close record). This allows the GROUPing to SUM correctly by activity.
My results set:
useridactivityTotalByActivity
12345Break120
23456Break90
12345Project A20
23456Project B15
23456Project CNULL
12345Project D75
This is actually a knock off of the cumulative sum problem that Jeff Moden's "quirky update" would probably solve faster, but this solution should work.
It can also be done with a JOIN of course.
;WITH MyData AS (
SELECT dateadded, userid, activity
,ROW_NUMBER() OVER (PARTITION BY userid ORDER BY userid, dateadded) As rk
FROM @tbl_data)
SELECT t1.userid, t1.activity, SUM(DATEDIFF(minute, t1.dateadded, t2.dateadded)) As TotalByActivity
FROM MyData t1
LEFT JOIN MyData t2 ON t1.userid = t2.userid and t1.rk + 1 = t2.rk
GROUP BY t1.userid, t1.activity
ORDER BY t1.userid, t1.activity
i did not have time over the weekend to post back but went with dwain's self join solution for my approach. had the exact same code to. (well almost but same enough not to post)
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 23, 2012 at 8:58 am
Thanks for the replies! Both of the last two posts produce solid output, and I really appreciate the effort. The last part of the equation is to take the minutes summed and display it as hh.mm (ie, 8.50 would = 8 hours 30 minutes) to 2 decimal places to ensure that the actual times being summed account for seconds.
April 23, 2012 at 6:32 pm
Normally I try to avoid self joins because they can be slow performance-wise with many rows in the table and there's usually another approach that is faster.
In this case though, the query plan cost for both solutions I posted came up identical, even though the plans themselves were slightly different, so I posted the self join option because some may think it's a little easier on the eyes.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 23, 2012 at 8:40 pm
Like I said, self joins aren't particularly efficient. This solution produces a lower query plan cost so is likely to run faster than the self join or the first suggestion I made.
;WITH MyData AS (
SELECT dateadded, userid, activity
,(SELECT TOP 1 dateadded
FROM @tbl_data t2
WHERE t1.userid = t2.userid and t1.dateadded < t2.dateadded
ORDER BY dateadded) As dateadded2
FROM @tbl_data t1)
SELECT userid, activity, SUM(DATEDIFF(minute, dateadded, dateadded2)) As TotalByActivity
FROM MyData t1
GROUP BY userid, activity
ORDER BY userid, activity
-- With formatting to MM:SS
;WITH MyData AS (
SELECT dateadded, userid, activity
,(SELECT TOP 1 dateadded
FROM @tbl_data t2
WHERE t1.userid = t2.userid and t1.dateadded < t2.dateadded
ORDER BY dateadded) As dateadded2
FROM @tbl_data t1)
SELECT userid, activity
,CAST(TotalByActivity/60 AS VARCHAR) + ':' +
RIGHT('00'+CAST(TotalByActivity%60 AS VARCHAR),2) As [TotalByActivity MM:SS]
FROM (
SELECT userid, activity, SUM(DATEDIFF(second, dateadded, dateadded2)) As TotalByActivity
FROM MyData t1
GROUP BY userid, activity) x
ORDER BY userid, activity
The second version even formats to MM:SS as you want.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply