Is there a more efficient alternative to WHILE LOOPS and CURSORS

  • Hi members,

    I have a statement that loops through the records of a table to get two date values from which the number of days is derived. If the number of days is 30, then 30 records from a second table with a a Date value, are inserted into a third table. Below is the statement:

    SET @intRowNo = (SELECT MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)

    WHILE @intRowNo <= (SELECT MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)

    BEGIN

    INSERT INTO [NADECORP\SNjenga].Calendarized_Data(REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH,

    [DATE], ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD)

    SELECT(SELECT BROKER_NB FROM StagingVolume WHERE BillDate_ID = @intRowNo), (SELECT CONTRACT_NB

    FROM StagingVolume WHERE BillDate_ID = @intRowNo), WEATHER_STATION, (SELECT FIN_YEAR FROM StagingVolume

    WHERE BillDate_ID = @intRowNo), (SELECT FIN_MONTH FROM StagingVolume

    WHERE BillDate_ID = @intRowNo), [DATE], (SELECT ACCT_NO FROM StagingVolume

    WHERE BillDate_ID = @intRowNo), (SELECT BILL_YEAR FROM StagingVolume

    WHERE BillDate_ID = @intRowNo), (SELECT BILL_MONTH FROM StagingVolume

    WHERE BillDate_ID = @intRowNo), (SELECT FROM_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT TO_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo), HDD, (SELECT SUM(HDD)

    FROM ACTUAL_HDD_DAILY WHERE weather_station =

    (SELECT WEATHER_STATION FROM EXISTING_CUST_ALPHA_BETA WHERE ACCOUNT_NUMBER =

    (SELECT ACCT_NO FROM StagingVolume WHERE BillDate_ID = @intRowNo))

    and [DATE] between (SELECT FROM_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo)

    and (SELECT TO_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo) and Market_ID = 3)

    TOTHDD

    FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock

    WHERE WEATHER_STATION =

    (SELECT WEATHER_STATION FROM StagingVolume WHERE BillDate_ID = @intRowNo)

    AND Market_ID = 3

    SET @intRowNo = @intRowNo + 1

    END

    Is there a more efficient way of accomplishing this for 75 million records to be inserted into the third table?

    Regards,

    Sam

  • Hmmm , that is quite a convuluted mess.

    Start by doing a proper join rather than sub-selects

    so

    FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock

    WHERE WEATHER_STATION =

    (SELECT WEATHER_STATION FROM StagingVolume WHERE BillDate_ID = @intRowNo)

    AND Market_ID = 3

    Will become

    FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY

    join StagingVolume

    on StagingVolume.Weather_station = ACTUAL_HDD_DAILY.Weather_station

    where StagingVolume.BillDate_ID = @intRowNo

    and ACTUAL_HDD_DAILY.Market_ID =3

    Now all the subselects can be removed are the data taken directly from the instance of StagingVolume on the join.

    Then is should be much easier to achieve your task



    Clear Sky SQL
    My Blog[/url]

  • Ow, my head hurts wrapping around this logic. I'll have to come back to it, but for anyone looking for an easier read, here's the query formatted:

    SET @intRowNo = (SELECT MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)

    WHILE @intRowNo <= (SELECT MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)

    BEGIN

    INSERT INTO [NADECORP\SNjenga].Calendarized_Data

    (REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH, [DATE], ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD)

    SELECT

    (SELECT BROKER_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT CONTRACT_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    WEATHER_STATION,

    (SELECT FIN_YEARFROM StagingVolumeWHERE BillDate_ID = @intRowNo),

    (SELECT FIN_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    [DATE],

    (SELECT ACCT_NOFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT BILL_YEARFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT BILL_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT FROM_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT TO_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    HDD,

    (SELECT SUM(HDD) FROM ACTUAL_HDD_DAILY WHERE weather_station = (SELECT WEATHER_STATION

    FROMEXISTING_CUST_ALPHA_BETA

    WHEREACCOUNT_NUMBER = (SELECT ACCT_NO

    FROM StagingVolume

    WHERE BillDate_ID = @intRowNo

    )

    )

    and [DATE] between (SELECTFROM_DATE

    FROMStagingVolume

    WHEREBillDate_ID = @intRowNo)

    and (SELECT TO_DATE

    FROMStagingVolume

    WHEREBillDate_ID = @intRowNo)

    and Market_ID = 3

    ) TOTHDD

    FROM

    [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock

    WHERE

    WEATHER_STATION = (SELECTWEATHER_STATION

    FROMStagingVolume

    WHEREBillDate_ID = @intRowNo)

    AND Market_ID = 3

    SET @intRowNo = @intRowNo + 1

    END

    I'm going to have to come back to this, but the mass correllated subqueries can't be helping.

    I don't see any significant '30 day' limiters on this, btw. The only thing that controls this is the data in StagingVolume and BillDate_ID being your iterator. There's definately a way to de-loop this, since your rows aren't dependent on prior computations. It'll just take me a bit to dig down through the logic.

    Can you provide a sample schema/dataset for ACtual_HDD_Daily, StagingVolume, and Existing_CUST_ALPHA_BETA so I can test my rebuild vs. your original code?


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I won't say this is correct simply b/c we don't have any data to test and I am not sure I 100% understand what you are trying to do. If you have a DBA at your organization, I would recommend you take it to him and don't be surprised if he is horrified at the current query.

    DECLARE @min-2 INT

    , @max-2 INT

    CREATE TABLE #StagingVolume

    ( WEATHER_STATION INT

    , SUM_HDD INT

    )

    SELECT @min-2 = MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock

    SELECT @max-2 = MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock

    INSERT INTO #StagingVolume

    SELECT d.Weather_station, SUM(HDD)

    FROM ACTUAL_HDD_DAILY AS D

    INNER JOIN (

    SELECT WEATHER_STATION

    FROM EXISTING_CUST_ALPHA_BETA AS B

    INNER JOIN StagingVolume as SV on B.Acct_no = SV.Acct_no

    INNER JOIN (

    SELECT Acct_No, from_date, to_date

    FROM StagingVolume as sv1

    WHERE billdate_id between @min-2 and @max-2

    ) as tmp on sv.acct_no = tmp.acct_no

    WHERE BillDate_ID between @min-2 and @max-2

    and sv.[DATE] between tmp.from_date and tmp.to_date

    ) as tmp1 on d.Weather_station = tmp1.Weather_Station

    WHERE Market_ID = 3

    GROUP BY d.weather_station

    INSERT INTO [NADECORP\SNjenga].Calendarized_Data

    (REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH, [DATE]

    , ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD

    )

    SELECT SV.BROKER_NB

    , SV.CONTRACT_NBFROM

    , D.WEATHER_STATION

    , SV.FIN_YEAR

    , SV.FIN_MONTH

    , D.[DATE]

    , SV.ACCT_NO

    , SV.BILL_YEAR

    , SV.BILL_MONTH

    , SV.FROM_DATE

    , SV.TO_DATE

    , HDD

    , ISNULL(TSV.SUM_HDD,0) AS TOTHDD

    FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock AS D

    INNER JOIN StagingValues AS SV ON D.Weather_Station = SV.WeatherStation

    LEFT OUTER JOIN #StagingValues as TSV ON D.Weather_Station = TSV.Weather_Station

    WHERE D.Market_ID = 3

    AND BillDate_ID between @min-2 and @max-2

  • Evil Kraig F (1/18/2012)


    Ow, my head hurts wrapping around this logic. I'll have to come back to it, but for anyone looking for an easier read, here's the query formatted:

    SET @intRowNo = (SELECT MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)

    WHILE @intRowNo <= (SELECT MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)

    BEGIN

    INSERT INTO [NADECORP\SNjenga].Calendarized_Data

    (REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH, [DATE], ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD)

    SELECT

    (SELECT BROKER_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT CONTRACT_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    WEATHER_STATION,

    (SELECT FIN_YEARFROM StagingVolumeWHERE BillDate_ID = @intRowNo),

    (SELECT FIN_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    [DATE],

    (SELECT ACCT_NOFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT BILL_YEARFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT BILL_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT FROM_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    (SELECT TO_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),

    HDD,

    (SELECT SUM(HDD) FROM ACTUAL_HDD_DAILY WHERE weather_station = (SELECT WEATHER_STATION

    FROMEXISTING_CUST_ALPHA_BETA

    WHEREACCOUNT_NUMBER = (SELECT ACCT_NO

    FROM StagingVolume

    WHERE BillDate_ID = @intRowNo

    )

    )

    and [DATE] between (SELECTFROM_DATE

    FROMStagingVolume

    WHEREBillDate_ID = @intRowNo)

    and (SELECT TO_DATE

    FROMStagingVolume

    WHEREBillDate_ID = @intRowNo)

    and Market_ID = 3

    ) TOTHDD

    FROM

    [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock

    WHERE

    WEATHER_STATION = (SELECTWEATHER_STATION

    FROMStagingVolume

    WHEREBillDate_ID = @intRowNo)

    AND Market_ID = 3

    SET @intRowNo = @intRowNo + 1

    END

    I'm going to have to come back to this, but the mass correllated subqueries can't be helping.

    I don't see any significant '30 day' limiters on this, btw. The only thing that controls this is the data in StagingVolume and BillDate_ID being your iterator. There's definately a way to de-loop this, since your rows aren't dependent on prior computations. It'll just take me a bit to dig down through the logic.

    Can you provide a sample schema/dataset for ACtual_HDD_Daily, StagingVolume, and Existing_CUST_ALPHA_BETA so I can test my rebuild vs. your original code?

    Here's the query rapidly converted to joins:

    SELECT

    sv.BROKER_NB,

    sv.CONTRACT_NB,

    a.WEATHER_STATION,

    sv.FIN_YEAR,

    sv.FIN_MONTH,

    a.[DATE],

    sv.ACCT_NO,

    sv.BILL_YEAR,

    sv.BILL_MONTH,

    sv.FROM_DATE,

    sv.TO_DATE,

    a.HDD,

    TOTHDD = (

    SELECT SUM(a.HDD)

    FROM ACTUAL_HDD_DAILY a

    INNER JOIN EXISTING_CUST_ALPHA_BETA e ON e.WEATHER_STATION = a.weather_station

    INNER JOIN StagingVolume s ON s.ACCT_NO = e.ACCOUNT_NUMBER AND s.BillDate_ID = @intRowNo

    INNER JOIN StagingVolume f ON a.[DATE] BETWEEN f.FROM_DATE AND f.TO_DATE AND f.BillDate_ID = @intRowNo

    WHERE a.Market_ID = 3

    )

    FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY a

    INNER JOIN StagingVolume sv ON sv.WEATHER_STATION = a.WEATHER_STATION AND sv.BillDate_ID = @intRowNo

    WHERE a.Market_ID = 3

    Edit: there's a slim chance it might work simply by removing references to the counter from the query?


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Roughly sure this is what you need:

    SELECT

    sv.Broker_NB,

    sv.Contract_NB,

    ahd.WEATHER_STATION,

    sv.FIN_YEAR,

    sv.FIN_MONTH,

    ahd.[Date],

    sv.ACCT_NO,

    sv.BILL_YEAR,

    sv.FROM_DATE,

    sv.TO_DATE,

    ahd.HDD,

    ca.HDDSUM AS TOTHDD

    FROM

    [NADECORP\SNjenga].StagingVolume AS sv

    JOIN

    [NADECORP\SNjenga].ACTUAL_HDD_DAILY AS ahd

    ONsv.Weather_station = ahd.Weather_station

    CROSS APPLY

    (SELECT SUM(HDD) AS HDDSUM

    FROM

    ACTUAL_HDD_DAILY AS ahd2

    JOIN

    StagingVolume AS sv2

    ONsv.BILLDATE_ID = sv2.BILLDATE_ID

    JOIN

    EXISTING_CUST_ALPHA_BETA AS ecab

    ONahd2.Weather_station = ecab.weather_station

    AND ecab.account_number = sv2.acct_no

    WHERE

    ahd2.[DATE] between sv2.FROM_DATE AND sv2.TO_DATE

    and Market_ID = 3

    ) AS ca

    WHERE

    ahd.Market_ID = 3


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (1/18/2012)


    Roughly sure this is what you need:

    You're missing an AND from the BETWEEN in the cross apply 😉


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (1/18/2012)


    Evil Kraig F (1/18/2012)


    Roughly sure this is what you need:

    You're missing an AND from the BETWEEN in the cross apply 😉

    D'oh, *facepalm* Fixed shortly...

    See what happens when I don't have sample stuff to work from? 😛

    EDIT: On second review I could clean up that Cross Apply Further, but I'd like some feedback from the OP that it's functional first.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Thanks, Craig I'll try out this statement and let you know how it goes.

  • I'm quite sure the sub-selects are slowing execution. I'll replace them with the joins and let you know how it goes.

  • Thanks, Kraig,

    I'll test the query and get back with you soon.

    Sam :discuss:

  • Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.

  • s.njenga (1/18/2012)


    Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.

    If that's in reference to Chris' re-write, he was merely re-writing the inner query, not completely overhauling the logic, trying to make life easier.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (1/19/2012)


    s.njenga (1/18/2012)


    Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.

    If that's in reference to Chris' re-write, he was merely re-writing the inner query, not completely overhauling the logic, trying to make life easier.

    That would be my guess too Craig.

    I reckon, in your query, you might need a correlation between the cross apply and the outer query?


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (1/19/2012)


    Evil Kraig F (1/19/2012)


    s.njenga (1/18/2012)


    Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.

    If that's in reference to Chris' re-write, he was merely re-writing the inner query, not completely overhauling the logic, trying to make life easier.

    That would be my guess too Craig.

    I reckon, in your query, you might need a correlation between the cross apply and the outer query?

    It's there, just well hidden. Part of what I was thinking of 'cleaning up'.

    JOINStagingVolume AS sv2ONsv.BILLDATE_ID = sv2.BILLDATE_ID


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply