Maximum flow from continuous flow rate

  • I'm having trouble with this one, and haven't found ideas on the web. Given a day's worth of nearly continuous (1-second) flow rate data, how would you calculate the maximum flow for a given length of time, say an hour? Conceptually, you would step forward 1 second at a time, get the average flow rate for the next hour, then simply take the highest average and convert it to flow. Any ideas on set-based SQL to accomplish this?

  • Can you give a little more detail as to the process you'd like the SQL to perform?

    In my head, I'm thinking you'd just grab MAX(flowrate) grouped by hour dividers, but that seems too easy for what you're looking for, so I'm probably just not awake yet.
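    For reference, that fixed-bucket version would look something like this. It is a sketch only, with the table and column names (FlowData, TimeOfEntry, FlowRate) assumed for illustration:

    ```sql
    -- One average per clock hour (1:00-2:00, 2:00-3:00, ...), highest first.
    -- This only answers the fixed-bucket reading of the question.
    SELECT
        DATEPART(hh, TimeOfEntry) AS ClockHour,
        AVG(1.0 * FlowRate)       AS AvgFlowRate
    FROM FlowData
    GROUP BY DATEPART(hh, TimeOfEntry)
    ORDER BY AvgFlowRate DESC;
    ```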


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA

  • Group By Hour would work if I were after the maximum avg flow rate for each hour, with hour being between 1:00 and 2:00, 2:00 and 3:00, etc. But my 'hour' is any hour of time - it could be from 1:00:00 to 2:00:00, 1:00:01 to 2:00:01, 1:00:02 to 2:00:02, etc.

  • Do you have any sample data and DDL scripts... along with expected results?

    It would help us to help you in providing some tested answers.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Brian Kier (10/31/2011)


    Group By Hour would work if I were after the maximum avg flow rate for each hour, with hour being between 1:00 and 2:00, 2:00 and 3:00, etc. But my 'hour' is any hour of time - it could be from 1:00:00 to 2:00:00, 1:00:01 to 2:00:01, 1:00:02 to 2:00:02, etc.

    Not as difficult as you believe, actually, though it's a bit of voodoo if you've never seen it before. If all you're looking for is to set your 'starting moment' via parameter and have it count off hours from there, not a problem. I just want to make sure I understand the requirements, since there's no sample data/result example.

    So, to confirm, the requirement is a maximumFlowPerHour with a variable/parameterized starting point?


    - Craig Farrell


  • that doesn't sound quite right, though I'm not sure. There doesn't really need to be any parameters. If I assume a table is filled with 86,400 records (1 record per second for 1 day) - each record is timestamped hh:mm:ss (ignore day). I need the hour of time with the highest average. Recall that when I say "hour of time" I mean any 60-minute period. So the answer might be "The hour between 2:02:14 and 2:03:14 had the highest avg with a value of 6.5 cfs".

  • Brian Kier (10/31/2011)


    that doesn't sound quite right, though I'm not sure. There doesn't really need to be any parameters. If I assume a table is filled with 86,400 records (1 record per second for 1 day) - each record is timestamped hh:mm:ss (ignore day). I need the hour of time with the highest average. Recall that when I say "hour of time" I mean any 60-minute period. So the answer might be "The hour between 2:02:14 and 2:03:14 had the highest avg with a value of 6.5 cfs".

    AH! You need to review a rolling window. Is it safe to limit this to a single day for performance sanity?

    Edit: Psssst, that's a minute, not an hour, you listed above, btw. 😉


    - Craig Farrell


  • yes, Rolling Window describes it well - never heard of that. Yes, we can limit it to a day, and yes, I should have put 2:02:14 to 3:02:14.

  • The execution plan for this is some serious ugly. I'm getting ~ 1 minute runtimes on my PC, but I want to make sure the logic is what you're looking for before I try to nail this down and bring in anything significant that'll obfuscate the purpose of the process.

    There are a few small comments in the code. Please review and let me know if you need assistance with any of this.

    Sample Data build (million row test)

    IF OBJECT_ID('tempdb..#tester') IS NOT NULL
        DROP TABLE #tester;

    CREATE TABLE #tester
        (
        TimeOfEntry DATETIME NOT NULL,
        FlowRate    BIGINT   NOT NULL
        );

    CREATE CLUSTERED INDEX idx_c_tester ON #tester (TimeOfEntry);

    -- Seconds in a day: 60 * 60 * 24 = 86400
    -- My tally table is a million rows, which will give us ~ 11 days of data.
    -- TRUNCATE TABLE #tester
    INSERT INTO #tester
    SELECT
        DATEADD(ss, n, '20110101') AS TimeOfEntry,
        ABS(CHECKSUM(NEWID()))     AS FlowRate
    FROM
        tempdb..Tally AS t;

    -- SELECT TOP 100 * FROM #tester
    -- SELECT MIN(TimeOfEntry), MAX(TimeOfEntry) FROM #tester
    -- SELECT MIN(FlowRate), MAX(FlowRate) FROM #tester

    Query Logic:

    DECLARE @DateOfInterest DATETIME;
    SET @DateOfInterest = '20110103';

    ;WITH cte AS
    (
        SELECT
            distinctList.SecondsOnly,
            ca.HourlyAverageFlowRate
        FROM
            (SELECT DISTINCT
                -- Used to remove any millisecond inclusions.
                -- @DateOfInterest is used because seconds since 1/1/1900 cause overflows.
                DATEADD(ss, DATEDIFF(ss, @DateOfInterest, TimeOfEntry), @DateOfInterest) AS SecondsOnly
            FROM
                #tester
            WHERE
                TimeOfEntry >= @DateOfInterest
                AND TimeOfEntry < DATEADD(dd, 1, @DateOfInterest)
            ) AS distinctList
        CROSS APPLY
            (SELECT
                AVG(FlowRate) AS HourlyAverageFlowRate
            FROM
                #tester
            WHERE
                TimeOfEntry >= distinctList.SecondsOnly
                AND TimeOfEntry < DATEADD(hh, 1, distinctList.SecondsOnly)
            ) AS ca
    )
    -- To see the results from the CTE:
    -- SELECT * FROM cte ORDER BY SecondsOnly
    SELECT
        cte.SecondsOnly                 AS BeginningTime,
        DATEADD(hh, 1, cte.SecondsOnly) AS HourEndsAt,
        a.MaxRate                       AS HighestRate
    FROM
        (SELECT MAX(HourlyAverageFlowRate) AS MaxRate FROM cte) AS a
    JOIN
        cte
        ON a.MaxRate = cte.HourlyAverageFlowRate;
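    For readers on SQL Server 2012 or later (which postdates this thread), the same rolling hour can be computed in a single pass with a windowed AVG instead of the CROSS APPLY triangular join. A minimal sketch against the same #tester table, assuming exactly one reading per second:

    ```sql
    -- Each window is the current row plus the 3,599 preceding rows: with one
    -- reading per second, that is exactly one hour ending at TimeOfEntry.
    -- RowsInWindow filters out the partial windows at the start of the data.
    WITH rolling AS
    (
        SELECT
            TimeOfEntry,
            AVG(1.0 * FlowRate) OVER (ORDER BY TimeOfEntry
                                      ROWS BETWEEN 3599 PRECEDING AND CURRENT ROW) AS HourlyAvg,
            COUNT(*)            OVER (ORDER BY TimeOfEntry
                                      ROWS BETWEEN 3599 PRECEDING AND CURRENT ROW) AS RowsInWindow
        FROM #tester
    )
    SELECT TOP 1
        DATEADD(ss, -3599, TimeOfEntry) AS BeginningTime,
        DATEADD(ss, 1, TimeOfEntry)     AS HourEndsAt,
        HourlyAvg                       AS HighestRate
    FROM rolling
    WHERE RowsInWindow = 3600
    ORDER BY HourlyAvg DESC;
    ```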


    - Craig Farrell


  • Brian Kier (10/31/2011)


    that doesn't sound quite right, though I'm not sure. There doesn't really need to be any parameters. If I assume a table is filled with 86,400 records (1 record per second for 1 day) - each record is time stamped hh:mm:ss (ignore day). I need the hour of time where the highest average. Recall that when I say "hour of time" I mean any 60 minute period. So the answer might be "The hour between 2:02:14 and 2:03:14 had the highest avg with a value of 6.5 cfs".

    Post the CREATE TABLE statement for the table and 10 rows of data, and tell me what the expected min/max flow rates are (I need limits for the test data I'm going to build), as well as the total number of rows you currently have in your table, 'cuz I have a lightning-fast idea on how to do this. ;-)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh, I wonder if we're thinking along the same lines, Jeff. I'm currently puzzling out a rolling window mechanic that should only need one (maybe two) passes on the source data via mathematics, based on packaging the rows into multiple simultaneous containers and a CASE'd AVG statement.

    Just can't seem to nail it down in my head, though.


    - Craig Farrell


  • Here's a two-pass solution.

    Add a new column to your source data - HourlyRate.

    Pass 1: Update HourlyRate to the average flow rate for the hour ending with the 'current' row, where HourlyRate is NULL.

    Pass 2: Select Max(HourlyRate) within the time period of interest.
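    A minimal sketch of those two passes, assuming the #tester table built earlier in the thread. Note the UPDATE still hides a triangular join, so this trades speed for simplicity, and the ALTER must run in its own batch:

    ```sql
    -- Pass 0: add the column (GO separates batches so the new column is visible).
    ALTER TABLE #tester ADD HourlyRate FLOAT NULL;
    GO

    -- Pass 1: for each row, average the hour of readings ending at that row.
    UPDATE t
    SET HourlyRate =
        (SELECT AVG(1.0 * t2.FlowRate)
         FROM #tester AS t2
         WHERE t2.TimeOfEntry >  DATEADD(hh, -1, t.TimeOfEntry)
           AND t2.TimeOfEntry <= t.TimeOfEntry)
    FROM #tester AS t
    WHERE t.HourlyRate IS NULL;

    -- Pass 2: the highest hourly average within the period of interest.
    SELECT MAX(HourlyRate) AS MaxHourlyRate
    FROM #tester;
    ```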

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Evil Kraig F (11/1/2011)


    Heh, I wonder if we're thinking along the same lines Jeff. I'm currently puzzling out a rolling window mechanic that should only need one (maybe two) passes on the source data via mathmatics based on packaging the rows into multiple simultaneous containers and a CASE'd AVG statement.

    Just can't seem to nail it down in my head though.

    I'd have to look for it, but someone wrote a wonderful article on such rolling averages. Instead of doing an average over each trailing hour using a bit of a Triangular Join, you calculate the first SUM and COUNT used to make an average, and then do the following for each row using a single update...

    Do a single lookup for the "trail off" row and subtract it from the SUM.

    You're already at the "leading" row during an UPDATE... add it to the SUM.

    Calculate the average using SUM/COUNT.

    This can be done in one of two ways... a correlated subquery or CROSS APPLY (which should be very fast because it's an equality), or an offset self-join (which I think might be a bit slower).
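    The "trail off" lookup is what makes that pattern cheap: with one reading per second, the row leaving the window is exactly one hour older than the current row, so it can be fetched with an equality seek on the clustered index rather than a range scan. A sketch of just that lookup against the #tester table from earlier (the running SUM/COUNT bookkeeping itself is omitted):

    ```sql
    -- For each row, pair the "leading" reading with its "trail off" reading
    -- (the one falling out of the hour-long window). The equality predicate
    -- means a single index seek per row, not a triangular join.
    SELECT
        t.TimeOfEntry,
        t.FlowRate        AS LeadingFlow,
        trailOff.FlowRate AS TrailOffFlow
    FROM #tester AS t
    CROSS APPLY
        (SELECT t2.FlowRate
         FROM #tester AS t2
         WHERE t2.TimeOfEntry = DATEADD(hh, -1, t.TimeOfEntry)
        ) AS trailOff;
    ```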

    --Jeff Moden



  • Jeff Moden (11/1/2011)


    Do a single lookup for the "trail off" row and subtract it from the SUM.

    You're already at the "leading" row during an UPDATE... add it to the SUM.

    Calculate the average using SUM/COUNT.

    My personal test bed caused an overflow when I thought about that pattern, but you're right, that'll depend heavily on the base data.

    EDIT: Oh, yeah, and I didn't want to fight with ranges inside each of the increments; a simple FIFO push stack wasn't going to handle ranges easily.


    - Craig Farrell


  • Brian Kier (10/31/2011)


    yes, Rolling Window describes it well - never heard of that. Yes, we can limit it to a day, and yes, I should have put 2:02:14 to 3:02:14.

    Let me ask again... what do you expect for a "normal" min and max instantaneous flow rate? I'd like to make the test data reasonably close to the real thing.

    --Jeff Moden


