Cumulative SUM by HOUR for specific DATE range

Question

Cumulative SUM by HOUR for specific DATE range

lilywhites

SSCrazy

Points: 2732
More actions
February 11, 2013 at 5:51 am

#270614

Hello,
I have searched a number of forums and tried a few examples but cannot get anything to work. I hope i include enough information to be helpful.
Please can someone advise how i can accomplish the following:
==========
existing table with format
RUNID; TIMEFRAME_START; QUEUE_NAME; ATSTART_MAILS; REPLIED_MAILS
1; 2012-02-10 00:00:00.000; QUEUE1; 25; 15
1; 2012-02-10 01:00:00.000; QUEUE1; 23; 14
1; 2012-02-10 02:00:00.000; QUEUE1; 21; 11
>>>>>
1; 2012-02-10 22:00:00.000; QUEUE1; 14; 9
1; 2012-02-10 23:00:00.000; QUEUE1; 19; 17
2; 2012-02-11 00:00:00.000; QUEUE1; 22; 13
2; 2012-02-11 01:00:00.000; QUEUE1; 21; 11
etc, etc, etc
i need to run some SQL daily to output results for the last day in the following format:
TIMEFRAME_START; QUEUE_NAME; ATSTART_MAILS; REPLIED_MAILS
2012-02-10 00:00:00.000; QUEUE1; 25; 15
2012-02-10 01:00:00.000; QUEUE1; 48; 29
2012-02-10 02:00:00.000; QUEUE1; 69; 40
etc, etc, etc
i have tried using CTE with a <= inner join but keep getting huge results (i narrowed it down to 1 queue for easy analysis of results....and will need to add more columns once the base code is sorted)
with data as
(
select TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS from EGTS_OUT_EMAIL_ACTIVITY_HOURLY
WHERE RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)
)
SELECT
t.TIMEFRAME_START, t.queue_name,
(SELECT SUM(atstart_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START),
(SELECT sum(replied_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START)
FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY t
where RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)
and t.QUEUE_NAME='5 UK GOODS'
group by t.timeframe_start, t.QUEUE_NAME
order by TIMEFRAME_START asc
i know the subqueries are bad :p
i think its something to do with the <= join not using the RUNID clause somehow but not sure as i added it on both sides....
many thanks
samuel

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 1

Could you please provide DDL and insert sample data script? Tips how to do so are in the link at the bottom of my signature.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

lilywhites SSCrazy Points: 2732 More actions · Answer 2

ignore me....the code works!!

im an idiot and forgot to reference properly!!

thanks anyway :p

with data as

(

select TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS from EGTS_OUT_EMAIL_ACTIVITY_HOURLY

WHERE RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)

and QUEUE_NAME='5 UK GOODS'

)

SELECT

t.TIMEFRAME_START, t.queue_name,

(SELECT SUM(atstart_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START),

(SELECT sum(replied_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START)

FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY t

where t.RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)

and t.QUEUE_NAME='5 UK GOODS'

group by t.timeframe_start, t.QUEUE_NAME

order by t.TIMEFRAME_START asc

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 3

You might wnat to revisit the code, I would suggest reading this article http://www.sqlservercentral.com/articles/T-SQL/61539/ and then look at a turning this into a Quirky update for the Running total, something like this http://www.sqlservercentral.com/articles/T-SQL/68467/

_________________________________________________________________________
SSC Guide to Posting and Best Practices

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

Jason-299789 (2/11/2013)
You might wnat to revisit the code, I would suggest reading this article http://www.sqlservercentral.com/articles/T-SQL/61539/ and then look at a turning this into a Quirky update for the Running total, something like this http://www.sqlservercentral.com/articles/T-SQL/68467/

The OP's query uses triangular joins, usually a no-no because of the potential for huge working sets. However, if partitions are small, it can perform reasonably well. The easiest way to calculate running totals is with a recursive CTE as follows:

;WITH SampleData (RUNID, TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS)AS (

SELECT 1, '2012-02-10 00:00:00.000', 'QUEUE1', 25, 15 UNION ALL

SELECT 1, '2012-02-10 01:00:00.000', 'QUEUE1', 23, 14 UNION ALL

SELECT 1, '2012-02-10 02:00:00.000', 'QUEUE1', 21, 11 UNION ALL

SELECT 1, '2012-02-10 22:00:00.000', 'QUEUE1', 14, 9 UNION ALL

SELECT 1, '2012-02-10 23:00:00.000', 'QUEUE1', 19, 17 UNION ALL

SELECT 2, '2012-02-11 00:00:00.000', 'QUEUE1', 22, 13 UNION ALL

SELECT 2, '2012-02-11 01:00:00.000', 'QUEUE1', 21, 11 )

, SequencedData AS (

SELECT

Seq = ROW_NUMBER() OVER(ORDER BY RUNID, TIMEFRAME_START),

*

FROM SampleData

),

Calculator AS (

SELECT Seq, RUNID, TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS

FROM SequencedData

WHERE Seq = 1

UNION ALL

SELECT nr.Seq, nr.RUNID, nr.TIMEFRAME_START, nr.QUEUE_NAME,

ATSTART_MAILS = CASE WHEN nr.RUNID = lr.RUNID THEN nr.ATSTART_MAILS+lr.ATSTART_MAILS ELSE nr.ATSTART_MAILS END,

REPLIED_MAILS = CASE WHEN nr.RUNID = lr.RUNID THEN nr.REPLIED_MAILS+lr.REPLIED_MAILS ELSE nr.REPLIED_MAILS END

FROM Calculator lr -- last row

INNER JOIN SequencedData nr -- new row

ON nr.Seq = lr.Seq+1

)

SELECT *

FROM Calculator

The fastest way to calculate running totals is the QU method referenced by Jason.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

lilywhites SSCrazy Points: 2732 More actions · Answer 5

ive got over 500 queues and 15 extra data columns to compute also....would those links scale up??

ive got this running in under 20 seconds....ill look into those other methods

many thanks

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 6

Chris,

I've never seen the recursive CTE method before, and looks like a viable option to the quirky update to a point.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

lilywhites (2/11/2013)
ive got over 500 queues and 15 extra data columns to compute also....would those links scale up??
ive got this running in under 20 seconds....ill look into those other methods
many thanks

The QU method scales well and will take about 5 seconds to UPDATE a million rows (on what? Most folks' testing rig). The rCTE method also scales well and takes about 25 seconds to SELECT a million rows (plus network time). There are a number of other methods, all of which are slower. The triangular join method you are using usually scales particularly badly, because as the set size increases, so does the number of rows to be aggregated. In your case the data is partitioned which helps.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

Jason-299789 (2/11/2013)
Chris,
I've never seen the recursive CTE method before, and looks like a viable option to the quirky update to a point.

Hi Jason, it's been subjected to a significant amount of intermittent testing here on ssc, have a Goooogle.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

lilywhites SSCrazy Points: 2732 More actions · Answer 9

ill check out the rCTE method for the final phase as there may well be expansion in the data set in the future....as a proof of concept this will have to do for now 🙁

thanks for your help