September 16, 2009 at 11:10 am
I am trying to write an SSRS report to replace an existing vendor-created report for custom software. I can see how I would figure out the report if I was creating it manually, but I'm having a heck of a time figuring out how to do it in SQL / SSRS. I'm hoping that in describing the problem I might help myself to figure it out... otherwise if anyone has any pointers or suggestions I'd be grateful!
There are two tables, FlightDay and DutyDay. DutyDay gives a start and end time for when the aircraft began 'working' -- the pilot is sitting in the plane, ready to go. When they actually take off, there is an entry in FlightDay. However, sometimes they just sit on the ground and are considered "idle". Where it's getting confusing is there can be multiple FlightDay entries within the span of time of one DutyDay.
And I'm being called into a meeting - I'll finish this post in a bit.
September 16, 2009 at 11:48 am
Okay, I'm back.
So what I need to do is create a report that shows the individual start and stop times from the FlightDay table, and give a total number of hours per activity. That's really simple - just report on FlightDay, group by activityId, and use a subtotal line on the hours.
What makes this into a beast is then I *also* need to calculate the total number of idle hours for that activity group, which is calculated by taking the total hours from DutyDay for that activity/time range and subtracting the total FlightDay hours. If there was one flightday record per dutyday record, this would be easy, but there can be multiple flights in one day.
Here's the code to set it up...
CREATE TABLE [dbo].[FlightDay](
[flightDayId] [bigint] IDENTITY(1,1) NOT NULL,
[resourceId] [bigint] NOT NULL,
[activityId] [bigint] NOT NULL,
[startDay] [datetime] NOT NULL,
[stopDay] [datetime] NULL,
[hours] [float] NULL,
PRIMARY KEY CLUSTERED
(
[flightDayId] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
go
CREATE TABLE [dbo].[DutyDay](
[dutyDayId] [bigint] IDENTITY(1,1) NOT NULL,
[resourceId] [bigint] NOT NULL,
[activityId] [bigint] NULL,
[startDay] [datetime] NOT NULL,
[stopDay] [datetime] NULL,
[hours] [float] NULL,
PRIMARY KEY CLUSTERED
(
[dutyDayId] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
insert dutyday (resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-02 10:21', '2009-09-02 10:29', 0.1)
insert dutyday (resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-04 09:19', '2009-09-04 15:30', 6.2)
insert dutyday (resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-05 08:59', '2009-09-05 10:57', 2)
insert flightday(resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-02 10:21', '2009-09-02 10:29', 0.1)
insert flightday(resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-04 09:19', '2009-09-04 12:14', 2.9)
insert flightday(resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-04 12:52', '2009-09-04 15:26', 2.6)
insert flightday(resourceid, activityid, startday,stopday,hours) values (1,1,'2009-09-05 08:59', '2009-09-05 10:54', 1.9)
go
select * from DutyDay compute sum(hours)
select * from FlightDay compute sum(hours)
This is sort of close but it's not right because it returns only three line items (one per day) and really I want four line items (one per flight) but I only want to count the idle hours once. I don't actually need the individual idle hours as line items, just the total at the end.
select dd.resourceid, dd.activityid, dd.startday, max(dd.hours) TotalHours, sum(fd.hours) WorkHours, max(dd.hours)-sum(fd.hours) as IdleHours
from dutyday dd, flightday fd
where fd.startday between dd.startday and dd.stopday and dd.resourceid = fd.resourceid and dd.activityid = fd.activityid
group by dd.resourceid, dd.activityid, dd.startday
I'm wondering if I'll have to do it as two separate queries in SSRS somehow... the main query to get the info from FlightDay and completely ignore DutyDay, and then the subtotal query that would do something like "select sum({hours from query1}) - sum(hours) from DutyDay where starttime = {min start time from query1} and stoptime = {max stop time from query1}"...
??????
Pointers greatly appreciated. I don't care if I end up writing the solution as a stored proc, or mangle it within SSRS... I just don't know how to get two queries in an SSRS report at the same time.
I can post a screen shot of the report if that's helpful.... although I'm not sure how to do that on here. Upload to photobucket or something first?
September 16, 2009 at 11:58 am
First of all: you did an excellent job in providing sample data!!
Not seen too often...
One thing I'd like you to add: what is the output you're expecting?
Best thing would be if you'd add the result set based on your sample data.
It would work just fine if you'd post it as plain text, comma separated, with column headers.
September 16, 2009 at 12:05 pm
What I would like to see is something like this:
ActivityId, Hours, StartTime, EndTime
1, 1.9, Sep 5 08:59, Sep 5 10:54
1, 2.6, Sep 4 12:52, Sep 4 15:26
1, 2.9, Sep 4 09:19, Sep 4 12:14
1, 0.1, Sep 2 10:21, Sep 2 10:29
Working Hours: 7.5 Idle Hours: 0.8
2, 3.4, Sep 1 13:52, Sep 1 17:14
Working Hours: 3.4 Idle Hours: 0.0
1, 1.7 Sep 1 09:46, Sep 1 11:28
(etc)
If I could just ignore the idle hours requirement it would be all good 🙂 I only included sample data for the first section, but in reality they can go back and forth between several activities. The report is sorted in reverse chronological order, and they want a subtotal every time the activity changes.
September 16, 2009 at 12:26 pm
I have the beginnings of a thought forming. Tell me if this makes sense....
Create a view for both DutyDay and FlightDay that ignores the time portion of the startTime field, and group by activityId and resourceId to get the total hours for each, and include the min startTime
Then do a left join from the original tables back to the view so that there is only one view line item returned per DAY from FlightDay rather than per ROW.
Hmm. I'm going to try that and see if I can get somewhere with it....
September 16, 2009 at 12:43 pm
Success! This works with my four rows of sample data....
create view viewDutyDay as
select
resourceId
, activityId
, DATEADD(dd,CAST(DATEDIFF(DAY,'17530101',startDay) AS INT),'17530101') as startDayNoTime
, sum(hours) as TotalDutyHours
from dutyday
group by resourceId, activityId, DATEADD(dd,CAST(DATEDIFF(DAY,'17530101',startDay) AS INT),'17530101')
go
create view viewFlightDay as
select
resourceId
, activityId
, DATEADD(dd,CAST(DATEDIFF(DAY,'17530101',startDay) AS INT),'17530101') as startDayNoTime
, min(startDay) as FirstFlightDay
, sum(hours) as TotalFlightHours
from Flightday
group by resourceId, activityId, DATEADD(dd,CAST(DATEDIFF(DAY,'17530101',startDay) AS INT),'17530101')
go
select
fd.resourceId
, fd.activityId
, fd.hours as FlightHours
, fd.startDay
, fd.stopDay
, vdd.TotalDutyHours
, vfd.TotalFlightHours
, vdd.TotalDutyHours - vfd.TotalFlightHours as TotalIdleHours
from flightday fd left join viewflightday vfd on fd.resourceid = vfd.resourceid and fd.activityid = vfd.activityid and fd.startday = vfd.firstflightday
left join viewdutyday vdd on vfd.resourceid = vdd.resourceid and vfd.activityid = vdd.activityid and vfd.startdaynotime = vdd.startdaynotime
Results:
resourceId,activityId,FlightHours,startDay,stopDay,TotalDutyHours, TotalFlightHours,TotalIdleHours
1,1,0.1,2009-09-02 10:21:00.000,2009-09-02 10:29:00.000,0.1,0.1,0
1,1,2.9,2009-09-04 09:19:00.000,2009-09-04 12:14:00.000,6.2,5.5,0.7
1,1,2.6,2009-09-04 12:52:00.000,2009-09-04 15:26:00.000,NULL,NULL,NULL
1,1,1.9,2009-09-05 08:59:00.000,2009-09-05 10:54:00.000,2,1.9,0.1
Now... it's a *way* to do it, but is this a GOOD way?
September 16, 2009 at 1:12 pm
Good job, again!
It's great to see that there are still some people out there that aren't relying on a forum answer only. This makes it a lot more fun to stay around trying to help 🙂
Here's another proposal to do it.
It might be a little faster since there aren't that many joins and date conversions in it.
The trick is the ROW_NUMBER function. It is used to assign the daily work hours to the first flight and set the following to zero. You can SUM over the last three columns to get the total per day, same as with your solution.
There might be even easier ways to do it but I can't figure one right now. Maybe one of the guru's will jump in and have a look at it...
If there's anything else we can help you with or if you have any questions regarding the solution below, let us know!
;WITH cte AS
(
SELECT
activityid AS id,
row_number () OVER (PARTITION BY activityid,dateadd(dd,0,datediff(dd,0,startday)) ORDER BY startday ) AS row,
startday AS dstart ,
stopday AS dstop,
hours AS work
FROM flightday
)
SELECT id,
dstart,
dstop,
work,
CASE WHEN row = 1 THEN d.hours ELSE 0 END AS TotalHrs ,
CASE WHEN row = 1 THEN d.hours ELSE 0 END - work AS idle
FROM dutyday d
INNER JOIN cte
ON d.activityid = cte.id
AND cte.dstart>= d.startday
AND cte.dstop<= d.stopday
/* result set
iddstartdstopworkTotalHrsidle
12009-09-02 10:21:00.0002009-09-02 10:29:00.0000,10,10
12009-09-04 09:19:00.0002009-09-04 12:14:00.0002,96,23,3
12009-09-04 12:52:00.0002009-09-04 15:26:00.0002,60-2,6
12009-09-05 08:59:00.0002009-09-05 10:54:00.0001,920,1
*/
September 16, 2009 at 3:39 pm
That's an interesting way to do it... I don't fully understand it! I'm going to have to read it a few more times and look up PARTITION. 🙂
However, using your method against the real data it takes 16 seconds to filter down to just the 4 rows I included in the sample, but mine takes 9 seconds.
I'm putting your code modified to use live data below so I don't misplace it 🙂
;WITH cte AS
(
SELECT
activityid,
resourceId,
row_number () OVER (
PARTITION BY activityid,dateadd(dd,0,datediff(dd,0,startdatetime))
ORDER BY startdatetime ) AS row,
startdatetime AS dstart,
stopdatetime AS dstop,
hours AS FlightHours
FROM prt.view_flightday
)
SELECT cte.activityId,
cte.resourceId,
dstart,
dstop,
FlightHours,
CASE WHEN row = 1 THEN d.hours ELSE 0 END AS TotalHrs ,
CASE WHEN row = 1 THEN d.hours ELSE 0 END - flighthours AS IdleHours
FROM prt.view_dutyday d
INNER JOIN cte
ON d.activityid = cte.activityid and d.resourceid = cte.resourceid
AND cte.dstart>= d.startdatetime
AND cte.dstop<= d.stopdatetime
where d.resourceid = 7685 and dstart between '2009-09-02' and '2009-09-06'
September 16, 2009 at 4:10 pm
The ROW_NUMBER () OVER (PARTITION BY ... ORDER BY) will simply number the rows for each activityid and day starting from 1 ordered by startdatetime.
By looking at your real life sample I think it should be modified to
PARTITION BY resourceId,activityid,dateadd(dd,0,datediff(dd,0,startdatetime))
To see hoe it works just run the select statement within the WITH claus separately and you'll see how it works.
regarding the performance issue:
The cte basically is a subselect, just easier to read... (at least from my point of view...).
What probably happens here is the subselect will be performed against all data (meaning each and every resourceid) and the result will be joined against the table prt.view_dutyday. After that the filter "d.resourceid = 7685 and dstart between '2009-09-02' and '2009-09-06'"
will be applied. But I'm not really sure about that since I'm not an expert on how the query optimizer works...
What I'd recommend is to move the WHERE clause inside the CTE:
...FROM prt.view_flightday
where resourceid = 7685 and startdatetime between '2009-09-02' and '2009-09-06'
Another question would be how your tables are indexed:
Do you have an index on resourceid, startdatetime, stopdatetime and activityid on both tables view_flightday and view_dutyday? If not, what are the indexes on those tables and how many rows are in each one?
Edit:
Based on the table names: is it possible that you're running the query against two views? If so, indexing the views might get a "little bit complicated", especially if there are aggregation functions involved or anything else that would prevent indexed views...
Maybe it's possible to find a solution based on the source tables of both views. Would you please post both view definitions?
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply