January 18, 2012 at 2:22 pm
Hi members,
I have a statement that loops through the records of a table to get two date values from which the number of days is derived. If the number of days is 30, then 30 records from a second table with a a Date value, are inserted into a third table. Below is the statement:
SET @intRowNo = (SELECT MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)
WHILE @intRowNo <= (SELECT MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)
BEGIN
INSERT INTO [NADECORP\SNjenga].Calendarized_Data(REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH,
[DATE], ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD)
SELECT(SELECT BROKER_NB FROM StagingVolume WHERE BillDate_ID = @intRowNo), (SELECT CONTRACT_NB
FROM StagingVolume WHERE BillDate_ID = @intRowNo), WEATHER_STATION, (SELECT FIN_YEAR FROM StagingVolume
WHERE BillDate_ID = @intRowNo), (SELECT FIN_MONTH FROM StagingVolume
WHERE BillDate_ID = @intRowNo), [DATE], (SELECT ACCT_NO FROM StagingVolume
WHERE BillDate_ID = @intRowNo), (SELECT BILL_YEAR FROM StagingVolume
WHERE BillDate_ID = @intRowNo), (SELECT BILL_MONTH FROM StagingVolume
WHERE BillDate_ID = @intRowNo), (SELECT FROM_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT TO_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo), HDD, (SELECT SUM(HDD)
FROM ACTUAL_HDD_DAILY WHERE weather_station =
(SELECT WEATHER_STATION FROM EXISTING_CUST_ALPHA_BETA WHERE ACCOUNT_NUMBER =
(SELECT ACCT_NO FROM StagingVolume WHERE BillDate_ID = @intRowNo))
and [DATE] between (SELECT FROM_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo)
and (SELECT TO_DATE FROM StagingVolume WHERE BillDate_ID = @intRowNo) and Market_ID = 3)
TOTHDD
FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock
WHERE WEATHER_STATION =
(SELECT WEATHER_STATION FROM StagingVolume WHERE BillDate_ID = @intRowNo)
AND Market_ID = 3
SET @intRowNo = @intRowNo + 1
END
Is there a more efficient way of accomplishing this for 75 million records to be inserted into the third table?
Regards,
Sam
January 18, 2012 at 2:44 pm
Hmmm , that is quite a convuluted mess.
Start by doing a proper join rather than sub-selects
so
FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock
WHERE WEATHER_STATION =
(SELECT WEATHER_STATION FROM StagingVolume WHERE BillDate_ID = @intRowNo)
AND Market_ID = 3
Will become
FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY
join StagingVolume
on StagingVolume.Weather_station = ACTUAL_HDD_DAILY.Weather_station
where StagingVolume.BillDate_ID = @intRowNo
and ACTUAL_HDD_DAILY.Market_ID =3
Now all the subselects can be removed are the data taken directly from the instance of StagingVolume on the join.
Then is should be much easier to achieve your task
January 18, 2012 at 2:47 pm
Ow, my head hurts wrapping around this logic. I'll have to come back to it, but for anyone looking for an easier read, here's the query formatted:
SET @intRowNo = (SELECT MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)
WHILE @intRowNo <= (SELECT MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)
BEGIN
INSERT INTO [NADECORP\SNjenga].Calendarized_Data
(REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH, [DATE], ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD)
SELECT
(SELECT BROKER_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT CONTRACT_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),
WEATHER_STATION,
(SELECT FIN_YEARFROM StagingVolumeWHERE BillDate_ID = @intRowNo),
(SELECT FIN_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),
[DATE],
(SELECT ACCT_NOFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT BILL_YEARFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT BILL_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT FROM_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT TO_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),
HDD,
(SELECT SUM(HDD) FROM ACTUAL_HDD_DAILY WHERE weather_station = (SELECT WEATHER_STATION
FROMEXISTING_CUST_ALPHA_BETA
WHEREACCOUNT_NUMBER = (SELECT ACCT_NO
FROM StagingVolume
WHERE BillDate_ID = @intRowNo
)
)
and [DATE] between (SELECTFROM_DATE
FROMStagingVolume
WHEREBillDate_ID = @intRowNo)
and (SELECT TO_DATE
FROMStagingVolume
WHEREBillDate_ID = @intRowNo)
and Market_ID = 3
) TOTHDD
FROM
[NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock
WHERE
WEATHER_STATION = (SELECTWEATHER_STATION
FROMStagingVolume
WHEREBillDate_ID = @intRowNo)
AND Market_ID = 3
SET @intRowNo = @intRowNo + 1
END
I'm going to have to come back to this, but the mass correllated subqueries can't be helping.
I don't see any significant '30 day' limiters on this, btw. The only thing that controls this is the data in StagingVolume and BillDate_ID being your iterator. There's definately a way to de-loop this, since your rows aren't dependent on prior computations. It'll just take me a bit to dig down through the logic.
Can you provide a sample schema/dataset for ACtual_HDD_Daily, StagingVolume, and Existing_CUST_ALPHA_BETA so I can test my rebuild vs. your original code?
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 18, 2012 at 2:56 pm
I won't say this is correct simply b/c we don't have any data to test and I am not sure I 100% understand what you are trying to do. If you have a DBA at your organization, I would recommend you take it to him and don't be surprised if he is horrified at the current query.
DECLARE @min-2 INT
, @max-2 INT
CREATE TABLE #StagingVolume
( WEATHER_STATION INT
, SUM_HDD INT
)
SELECT @min-2 = MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock
SELECT @max-2 = MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock
INSERT INTO #StagingVolume
SELECT d.Weather_station, SUM(HDD)
FROM ACTUAL_HDD_DAILY AS D
INNER JOIN (
SELECT WEATHER_STATION
FROM EXISTING_CUST_ALPHA_BETA AS B
INNER JOIN StagingVolume as SV on B.Acct_no = SV.Acct_no
INNER JOIN (
SELECT Acct_No, from_date, to_date
FROM StagingVolume as sv1
WHERE billdate_id between @min-2 and @max-2
) as tmp on sv.acct_no = tmp.acct_no
WHERE BillDate_ID between @min-2 and @max-2
and sv.[DATE] between tmp.from_date and tmp.to_date
) as tmp1 on d.Weather_station = tmp1.Weather_Station
WHERE Market_ID = 3
GROUP BY d.weather_station
INSERT INTO [NADECORP\SNjenga].Calendarized_Data
(REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH, [DATE]
, ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD
)
SELECT SV.BROKER_NB
, SV.CONTRACT_NBFROM
, D.WEATHER_STATION
, SV.FIN_YEAR
, SV.FIN_MONTH
, D.[DATE]
, SV.ACCT_NO
, SV.BILL_YEAR
, SV.BILL_MONTH
, SV.FROM_DATE
, SV.TO_DATE
, HDD
, ISNULL(TSV.SUM_HDD,0) AS TOTHDD
FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock AS D
INNER JOIN StagingValues AS SV ON D.Weather_Station = SV.WeatherStation
LEFT OUTER JOIN #StagingValues as TSV ON D.Weather_Station = TSV.Weather_Station
WHERE D.Market_ID = 3
January 18, 2012 at 3:05 pm
Evil Kraig F (1/18/2012)
Ow, my head hurts wrapping around this logic. I'll have to come back to it, but for anyone looking for an easier read, here's the query formatted:
SET @intRowNo = (SELECT MIN(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)
WHILE @intRowNo <= (SELECT MAX(BillDate_ID) FROM [NADECORP\SNjenga].StagingVolume nolock)
BEGIN
INSERT INTO [NADECORP\SNjenga].Calendarized_Data
(REGION, CONTRACT_NB, WEATHER_STATION, FIN_YEAR, FIN_MONTH, [DATE], ACCT_NO, BILL_YEAR, BILL_MONTH, FROM_DATE, TO_DATE, HDD, BILL_PERIOD_HDD)
SELECT
(SELECT BROKER_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT CONTRACT_NBFROM StagingVolume WHERE BillDate_ID = @intRowNo),
WEATHER_STATION,
(SELECT FIN_YEARFROM StagingVolumeWHERE BillDate_ID = @intRowNo),
(SELECT FIN_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),
[DATE],
(SELECT ACCT_NOFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT BILL_YEARFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT BILL_MONTHFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT FROM_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),
(SELECT TO_DATEFROM StagingVolume WHERE BillDate_ID = @intRowNo),
HDD,
(SELECT SUM(HDD) FROM ACTUAL_HDD_DAILY WHERE weather_station = (SELECT WEATHER_STATION
FROMEXISTING_CUST_ALPHA_BETA
WHEREACCOUNT_NUMBER = (SELECT ACCT_NO
FROM StagingVolume
WHERE BillDate_ID = @intRowNo
)
)
and [DATE] between (SELECTFROM_DATE
FROMStagingVolume
WHEREBillDate_ID = @intRowNo)
and (SELECT TO_DATE
FROMStagingVolume
WHEREBillDate_ID = @intRowNo)
and Market_ID = 3
) TOTHDD
FROM
[NADECORP\SNjenga].ACTUAL_HDD_DAILY nolock
WHERE
WEATHER_STATION = (SELECTWEATHER_STATION
FROMStagingVolume
WHEREBillDate_ID = @intRowNo)
AND Market_ID = 3
SET @intRowNo = @intRowNo + 1
END
I'm going to have to come back to this, but the mass correllated subqueries can't be helping.
I don't see any significant '30 day' limiters on this, btw. The only thing that controls this is the data in StagingVolume and BillDate_ID being your iterator. There's definately a way to de-loop this, since your rows aren't dependent on prior computations. It'll just take me a bit to dig down through the logic.
Can you provide a sample schema/dataset for ACtual_HDD_Daily, StagingVolume, and Existing_CUST_ALPHA_BETA so I can test my rebuild vs. your original code?
Here's the query rapidly converted to joins:
SELECT
sv.BROKER_NB,
sv.CONTRACT_NB,
a.WEATHER_STATION,
sv.FIN_YEAR,
sv.FIN_MONTH,
a.[DATE],
sv.ACCT_NO,
sv.BILL_YEAR,
sv.BILL_MONTH,
sv.FROM_DATE,
sv.TO_DATE,
a.HDD,
TOTHDD = (
SELECT SUM(a.HDD)
FROM ACTUAL_HDD_DAILY a
INNER JOIN EXISTING_CUST_ALPHA_BETA e ON e.WEATHER_STATION = a.weather_station
INNER JOIN StagingVolume s ON s.ACCT_NO = e.ACCOUNT_NUMBER AND s.BillDate_ID = @intRowNo
INNER JOIN StagingVolume f ON a.[DATE] BETWEEN f.FROM_DATE AND f.TO_DATE AND f.BillDate_ID = @intRowNo
WHERE a.Market_ID = 3
)
FROM [NADECORP\SNjenga].ACTUAL_HDD_DAILY a
INNER JOIN StagingVolume sv ON sv.WEATHER_STATION = a.WEATHER_STATION AND sv.BillDate_ID = @intRowNo
WHERE a.Market_ID = 3
Edit: there's a slim chance it might work simply by removing references to the counter from the query?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 18, 2012 at 3:09 pm
Roughly sure this is what you need:
SELECT
sv.Broker_NB,
sv.Contract_NB,
ahd.WEATHER_STATION,
sv.FIN_YEAR,
sv.FIN_MONTH,
ahd.[Date],
sv.ACCT_NO,
sv.BILL_YEAR,
sv.FROM_DATE,
sv.TO_DATE,
ahd.HDD,
ca.HDDSUM AS TOTHDD
FROM
[NADECORP\SNjenga].StagingVolume AS sv
JOIN
[NADECORP\SNjenga].ACTUAL_HDD_DAILY AS ahd
ONsv.Weather_station = ahd.Weather_station
CROSS APPLY
(SELECT SUM(HDD) AS HDDSUM
FROM
ACTUAL_HDD_DAILY AS ahd2
JOIN
StagingVolume AS sv2
ONsv.BILLDATE_ID = sv2.BILLDATE_ID
JOIN
EXISTING_CUST_ALPHA_BETA AS ecab
ONahd2.Weather_station = ecab.weather_station
AND ecab.account_number = sv2.acct_no
WHERE
ahd2.[DATE] between sv2.FROM_DATE AND sv2.TO_DATE
and Market_ID = 3
) AS ca
WHERE
ahd.Market_ID = 3
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 18, 2012 at 3:12 pm
Evil Kraig F (1/18/2012)
Roughly sure this is what you need:
You're missing an AND from the BETWEEN in the cross apply 😉
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 18, 2012 at 3:13 pm
ChrisM@home (1/18/2012)
Evil Kraig F (1/18/2012)
Roughly sure this is what you need:You're missing an AND from the BETWEEN in the cross apply 😉
D'oh, *facepalm* Fixed shortly...
See what happens when I don't have sample stuff to work from? 😛
EDIT: On second review I could clean up that Cross Apply Further, but I'd like some feedback from the OP that it's functional first.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 18, 2012 at 7:12 pm
Thanks, Craig I'll try out this statement and let you know how it goes.
January 18, 2012 at 7:14 pm
I'm quite sure the sub-selects are slowing execution. I'll replace them with the joins and let you know how it goes.
January 18, 2012 at 7:51 pm
Thanks, Kraig,
I'll test the query and get back with you soon.
Sam :discuss:
January 18, 2012 at 7:55 pm
Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.
January 19, 2012 at 11:35 am
s.njenga (1/18/2012)
Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.
If that's in reference to Chris' re-write, he was merely re-writing the inner query, not completely overhauling the logic, trying to make life easier.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
January 19, 2012 at 11:42 am
Evil Kraig F (1/19/2012)
s.njenga (1/18/2012)
Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.If that's in reference to Chris' re-write, he was merely re-writing the inner query, not completely overhauling the logic, trying to make life easier.
That would be my guess too Craig.
I reckon, in your query, you might need a correlation between the cross apply and the outer query?
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
January 19, 2012 at 11:43 am
ChrisM@home (1/19/2012)
Evil Kraig F (1/19/2012)
s.njenga (1/18/2012)
Your query is only referencing @introwno ones instead of referencing all rows in the staging table without having to loop through each row.If that's in reference to Chris' re-write, he was merely re-writing the inner query, not completely overhauling the logic, trying to make life easier.
That would be my guess too Craig.
I reckon, in your query, you might need a correlation between the cross apply and the outer query?
It's there, just well hidden. Part of what I was thinking of 'cleaning up'.
JOINStagingVolume AS sv2ONsv.BILLDATE_ID = sv2.BILLDATE_ID
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply