October 31, 2011 at 12:07 pm
I'm having trouble with this one, and haven't found ideas on the web. Given a day's worth of nearly continuous (1-second) flow rate data, how would you calculate the maximum flow for a given length of time, say an hour? Conceptually you would go 1 second at a time and get the avg flow rate for the next hour, then simply take the highest avg and convert it to flow. Any ideas on set-based SQL to accomplish this?
October 31, 2011 at 12:44 pm
Can you give a little more detail as to the process you'd like the SQL to perform?
In my head, I'm thinking you'd just grab MAX(flowrate) group by (Hour Dividers), but that seems too easy for what you're looking for, so I'm probably just not awake yet.
October 31, 2011 at 12:54 pm
Group By Hour would work if I were after the maximum avg flowrate for each hour, with hour being between 1:00 and 2:00, 2:00 and 3:00, etc. But my 'hour' is any hour of time - it could be from 1:00:00 to 2:00:00, 1:00:01 to 2:00:01, 1:00:02 to 2:00:02, etc.
October 31, 2011 at 1:00 pm
Have you any sample data and DDL scripts...along with expected results?
It would help us to help you in providing some tested answers.
October 31, 2011 at 1:05 pm
Brian Kier (10/31/2011)
Group By Hour would work if I were after the maximum avg flowrate for each hour, with hour being between 1:00 and 2:00, 2:00 and 3:00, etc. But my 'hour' is any hour of time - it could be from 1:00:00 to 2:00:00, 1:00:01 to 2:00:01, 1:00:02 to 2:00:02, etc.
Not as difficult as you believe, actually, though it's a bit of voodoo if you've never seen it before. If all you're looking for is to set your 'starting moment' via parameter and have it count off hours from there, not a problem. I just want to make sure I understand the requirements, since there's no sample data/result example.
So, to confirm, the requirement is a maximumFlowPerHour with a variable/parameterized starting point?
October 31, 2011 at 1:15 pm
That doesn't sound quite right, though I'm not sure. There doesn't really need to be any parameters. If I assume a table is filled with 86,400 records (1 record per second for 1 day) - each record is time stamped hh:mm:ss (ignore day) - I need the hour of time with the highest average. Recall that when I say "hour of time" I mean any 60-minute period. So the answer might be "The hour between 2:02:14 and 2:03:14 had the highest avg with a value of 6.5 cfs".
October 31, 2011 at 1:18 pm
Brian Kier (10/31/2011)
That doesn't sound quite right, though I'm not sure. There doesn't really need to be any parameters. If I assume a table is filled with 86,400 records (1 record per second for 1 day) - each record is time stamped hh:mm:ss (ignore day) - I need the hour of time with the highest average. Recall that when I say "hour of time" I mean any 60-minute period. So the answer might be "The hour between 2:02:14 and 2:03:14 had the highest avg with a value of 6.5 cfs".
AH! You need to review a rolling window. Is it safe to limit this to a single day for performance sanity?
Edit: Psssst, that's a minute, not an hour, you listed above, btw. 😉
October 31, 2011 at 1:21 pm
Yes, Rolling Window describes it well - I'd never heard of that term. Yes, we can limit it to a day, and yes, I should have put 2:02:14 to 3:02:14.
October 31, 2011 at 2:23 pm
The execution plan for this is some serious ugly. I'm getting ~1 minute runtimes on my PC, but I want to make sure the logic is what you're looking for before I try to nail the performance down and bring in anything significant that'll obfuscate the purpose of the process.
There are a few small comments in the code. Please review and let me know if you need assistance with any of this.
Sample Data build (million row test)
IF OBJECT_ID('tempdb..#tester') IS NOT NULL
DROP TABLE #tester
CREATE TABLE #tester
( TimeOfEntry DATETIME NOT NULL,
FlowRate BIGINT NOT NULL
)
CREATE CLUSTERED INDEX idx_c_tester ON #tester
(TimeOfEntry)
-- Seconds in a day: 60* 60 * 24 = 86400
-- My tally table is a million rows, which will give us ~ 11 days of data.
-- TRUNCATE TABLE #tester
INSERT INTO #tester
SELECT
DATEADD( ss, n, '20110101') AS TimeOfEntry,
ABS( CHECKSUM( NEWID())) AS FlowRate
FROM
tempdb..Tally AS t
--select top 100 * from #tester
--select min(TimeOfEntry), Max(timeofEntry) FROM #tester
--select min(FlowRate), Max(FlowRate) FROM #tester
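-- If you want the random rates in a realistic range rather than the raw
-- CHECKSUM values (which run into the billions), one possible tweak to the
-- SELECT above is this - the 0.0 to 99.9 range is just a guess at plausible
-- cfs values, and FlowRate would need to be DECIMAL rather than BIGINT:
-- ABS( CHECKSUM( NEWID())) % 1000 / 10.0 AS FlowRate -- bounds rates to 0.0 - 99.9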
Query Logic:
DECLARE @DateOfInterest DATETIME
SET @DateOfInterest = '20110103'
;WITH cte AS
(SELECT
distinctList.SecondsOnly,
ca.HourlyAverageFlowRate
FROM
(SELECT DISTINCT
-- Used to remove any millisecond inclusions
-- @DateOfInterest is used because seconds since 1/1/1900 causes overflows.
DATEADD( ss, DATEDIFF( ss, @DateOfInterest, TimeOfEntry) , @DateOfInterest) AS SecondsOnly
FROM
#tester
WHERE
TimeOfEntry >= @DateOfInterest
AND TimeOfEntry < DATEADD( dd, 1, @DateOfInterest)
) AS distinctList
CROSS APPLY
( SELECT
AVG( FlowRate) AS HourlyAverageFlowRate
FROM
#tester
WHERE
TimeOfEntry >= distinctList.SecondsOnly
AND TimeOfEntry < DATEADD( hh, 1, SecondsOnly)
) AS ca
)
-- To see the results from the CTE
-- SELECT * FROM cte ORDER BY SecondsOnly
SELECT
SecondsOnly AS BeginningTime,
DATEADD( hh, 1, SecondsOnly) AS HourEndsAt,
MaxRate AS HighestRate
FROM
(SELECT
MAX( HourlyAverageFlowRate) AS MaxRate
FROM
cte
) AS a
JOIN
cte
ON a.MaxRate = cte.HourlyAverageFlowRate
October 31, 2011 at 9:07 pm
Brian Kier (10/31/2011)
That doesn't sound quite right, though I'm not sure. There doesn't really need to be any parameters. If I assume a table is filled with 86,400 records (1 record per second for 1 day) - each record is time stamped hh:mm:ss (ignore day) - I need the hour of time with the highest average. Recall that when I say "hour of time" I mean any 60-minute period. So the answer might be "The hour between 2:02:14 and 2:03:14 had the highest avg with a value of 6.5 cfs".
Post the CREATE TABLE statement for the table, 10 rows of data, and tell me what the expected min/max flowrates are (I need limits for the test data I'm going to build) as well as the total number of rows you currently have in your table, 'cuz I have a lightning fast idea on how to do this. ;-)
November 1, 2011 at 3:10 am
Heh, I wonder if we're thinking along the same lines, Jeff. I'm currently puzzling out a rolling window mechanic that should only need one (maybe two) passes on the source data via mathematics, based on packaging the rows into multiple simultaneous containers and a CASE'd AVG statement.
Just can't seem to nail it down in my head though.
November 1, 2011 at 3:29 am
Here's a two-pass solution.
Add a new column to your source data - HourlyRate.
Pass 1: Update HourlyRate to be the flow rate for the hour ending with the 'current' row, where HourlyRate is NULL.
Pass 2: Select Max(HourlyRate) within the time period of interest.
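A minimal sketch of those two passes against the #tester table from the earlier post (the HourlyRate column name, the DECIMAL type, and the "hour ending at the current row" window are my assumptions, not tested code):
ALTER TABLE #tester ADD HourlyRate DECIMAL(18,4) NULL
-- Pass 1: average flow for the hour ending at each row
UPDATE t
SET HourlyRate = ca.HourAvg
FROM #tester AS t
CROSS APPLY
( SELECT AVG( CAST( FlowRate AS DECIMAL(18,4))) AS HourAvg
FROM #tester
WHERE TimeOfEntry > DATEADD( hh, -1, t.TimeOfEntry)
AND TimeOfEntry <= t.TimeOfEntry
) AS ca
WHERE t.HourlyRate IS NULL
-- Pass 2: highest hourly rate in the time period of interest
SELECT MAX( HourlyRate) AS MaxHourlyRate
FROM #tester
WHERE TimeOfEntry >= '20110103'
AND TimeOfEntry < '20110104'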
November 1, 2011 at 7:39 am
Evil Kraig F (11/1/2011)
Heh, I wonder if we're thinking along the same lines, Jeff. I'm currently puzzling out a rolling window mechanic that should only need one (maybe two) passes on the source data via mathematics, based on packaging the rows into multiple simultaneous containers and a CASE'd AVG statement. Just can't seem to nail it down in my head though.
I'd have to look for it, but someone wrote a wonderful article on such rolling averages. Instead of doing an average over each trailing hour using a bit of a Triangular Join, you calculate the first SUM and COUNT used to make an average and then do the following for each row using a single update...
Do a single lookup for the "trail off" row and subtract it from the SUM.
You're already at the "leading" row during an UPDATE... add it to the SUM.
Calculate the average using SUM/COUNT.
This can be done in one of two ways... a correlated subquery or Cross Apply (which should be very fast because it's an equality), or an offset self join (which I think might be a bit slower).
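For what it's worth, a rough sketch of the correlated subquery flavor is below. It assumes one row per second, the clustered index on TimeOfEntry from the earlier post, a new HourlySum column, and the usual "quirky update" safety nets (TABLOCKX, MAXDOP 1) - since the update order isn't guaranteed by documentation, treat this as an illustration rather than tested code:
ALTER TABLE #tester ADD HourlySum BIGINT NULL
DECLARE @RunningSum BIGINT
SET @RunningSum = 0
UPDATE t
SET @RunningSum = HourlySum = @RunningSum
+ t.FlowRate -- the "leading" row we're already sitting on
- ISNULL( ( SELECT t2.FlowRate -- the "trail off" row, one hour back
FROM #tester AS t2
WHERE t2.TimeOfEntry = DATEADD( hh, -1, t.TimeOfEntry)), 0)
FROM #tester AS t WITH (TABLOCKX)
OPTION (MAXDOP 1)
-- SUM/COUNT: with one row per second the COUNT is a constant 3600,
-- so the sums only mean anything once a full hour has accumulated.
SELECT TOP 1
DATEADD( hh, -1, TimeOfEntry) AS WindowStart,
TimeOfEntry AS WindowEnd,
HourlySum / 3600.0 AS MaxHourlyAvg
FROM #tester
WHERE TimeOfEntry >= DATEADD( hh, 1, (SELECT MIN( TimeOfEntry) FROM #tester))
ORDER BY HourlySum DESC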
November 1, 2011 at 12:35 pm
Jeff Moden (11/1/2011)
Do a single lookup for the "trail off" row and subtract it from the SUM. You're already at the "leading" row during an UPDATE... add it to the SUM.
Calculate the average using SUM/COUNT.
My personal test bed caused an overflow when I thought about that pattern, but you're right, that'll depend heavily on the base data.
EDIT: Oh, yeah, and I didn't want to fight with ranges inside each of the increments, a simple FIFO push stack wasn't going to easily handle ranges.
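(If the overflow is in the running SUM itself, widening the accumulator might buy enough headroom - just an untested thought:)
DECLARE @RunningSum DECIMAL(38,0) -- instead of BIGINT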
November 1, 2011 at 4:05 pm
Brian Kier (10/31/2011)
yes, Rolling Window describes it well - never heard of that. Yes, we can limit it to a day, and yes, I should have put 2:02:14 to 3:02:14.
Let me ask again... what do you expect for a "normal" min and max instantaneous flow rate? I'd like to make the test data reasonably close to the real thing.