November 26, 2007 at 6:14 am
Hi pals,
I have a datawarehouse database in which two tables are there.
One is "Src" table and other "trg" serves as target table.
select * from src
ssn distcodefee dt
10110NULL
10110NULL
101152007-11-26 00:00:00.000
1011102007-11-26 00:00:00.000
I loaded the ssn and repo
select * from trg
ssn distcodetot_fee
1011NULL
Now i need to update the total_fee field based on below conditions
inital calculation:
-------------------
sum of fee where "dt" is null group by ssn,distcode.
Note: is the above sum is zero then consider below secondary calculation
Secondary calculation:
------------------------
sum of "fee" where "dt" = max of dt in the "Src" table group by ssn,distcode
Note: is the sum is zero store zero itself.
for the above requirement i have used the below update statement. This is working fine.
If u observe the below query the "src" table is loading 3 times.
can we optimize this query? Can we avoid these multiple loading of the same table into memory?
---
update trg set total_fee =
(
select case sum(b.fee) when 0 then
--------- secondary condition
(select case sum(c.fee) when 0.0 then
0.0
else
sum(c.fee)
end
from src c
where c.dt = (select max(dt) from src)
and c.ssn=a.ssn
and c.distcode=a.distcode)
--------
else
sum(b.fee)
end
from src b
where b.dt is null
and b.ssn=a.ssn
and b.distcode=a.distcode
) from trg a
Any suggestions are greatly appreciated
Thanks and Regards,
November 26, 2007 at 6:41 am
Check the the below queries...
Solution 1
UPDATEt
SETt.total_fee = ( CASE WHEN COALESCE( n1.Fee, 0 ) = 0 THEN COALESCE( n2.Fee, 0 ) ELSE COALESCE( n1.Fee, 0 ) END )
FROMtrg t
LEFT JOIN
(
SELECTssn, distcode, SUM( Fee ) AS Fee
FROMSrc
WHEREdt IS NULL
GROUP BY ssn, distcode
) n1 ON t.ssn = n1.ssn AND t.distcode = n1.distcode
LEFT JOIN
(
SELECTROW_NUMBER() OVER( PARTITION BY ssn, distcode ORDER BY dt DESC ) AS RowNum,
ssn, distcode, SUM( Fee ) AS Fee
FROMSrc
WHEREdt IS NOT NULL
GROUP BY ssn, distcode
) n2 ON t.ssn = n2.ssn AND t.distcode = n2.distcode AND n2.RowNum = 1
Solution 2 (in form of two updates)
UPDATEt
SETt.total_fee = COALESCE( n1.Fee, 0 )
FROMtrg t
INNER JOIN
(
SELECTssn, distcode, SUM( Fee ) AS Fee
FROMSrc
WHEREdt IS NULL
GROUP BY ssn, distcode
) n1 ON t.ssn = n1.ssn AND t.distcode = n1.distcode
UPDATEt
SETt.total_fee = COALESCE( n2.Fee, 0 )
INNER JOIN
(
SELECTROW_NUMBER() OVER( PARTITION BY ssn, distcode ORDER BY dt DESC ) AS RowNum,
ssn, distcode, SUM( Fee ) AS Fee
FROMSrc
WHEREdt IS NOT NULL
GROUP BY ssn, distcode
) n2 ON t.ssn = n2.ssn AND t.distcode = n2.distcode AND n2.RowNum = 1
WHEREt.total_fee = 0
--Ramesh
November 26, 2007 at 6:57 am
I have gone thru your problem and found that it can be done in the following way where we are traversing the source table twice.
Will think about another solution which could even do it in one traverse.
DECLARE @MaximumDate smalldatetime
SET @MaximumDate = getdate();
WITH Temp AS
(
SELECT
ssn, distcode,
CASE
when dt IS NULL THEN @MaximumDate
ELSE dt
END dt,
sum(fee) totalfee
FROM src
Group By
ssn, distcode, dt
Having sum(fee)<> 0
)
Update trg
SET tot_fee = A.totalfee
FROM
Temp A
INNER Join (
SELECT
Max(dt) MaxDate,
ssn,
distcode
FROM
Temp B
Group BY ssn, distcode) C
ON
A.ssn = C.ssn and
A.distcode = C.distcode and
CASE
WHEN A.dt IS NULL THEN @MaximumDate
ELSE A.dt
END = C.MaxDate
INNER JOIN trg Target ON Target.ssn = A.ssn and Target.distcode = A.distcode
-Rohit
November 26, 2007 at 9:02 am
What you have here is a ranking problem, with 2 levels of ranking.
If dt is NULL and there's a fee, that ranks first, otherwise you apply a secondary ranking based on the max() date.
Here's the SELECT statement. You'll need to patch this into your UPDATE:
Select *
From
(
Select ssn, distcode, dt, FeeSum,
-- Secondary ranking, based on the date (descending)
Rank() Over(Partition By ssn, distcode
Order By DateFeeRank, dt desc) As Ranking
From
(
Select ssn,
distcode,
dt,
Sum(Fee) As FeeSum,
-- Primary ranking - rank #1 if NULL date and there is a fee
Case When dt Is Null And Sum(Fee) > 0.0
Then 1
Else 2
End As DateFeeRank
From #ssn
Group By ssn, distcode, dt
) dt
) dtRank
-- Select #1 ranked group from with each ssn/distcode
Where Ranking = 1
November 26, 2007 at 9:11 am
Thanks for the help.
November 26, 2007 at 9:16 am
Ramesh , i just want to ask one question. We are again using Joins and functions and temporary tables. Don't it effect the query performance and resource utilization?
Regards,
November 27, 2007 at 4:30 am
mahesh.vsp (11/26/2007)
We are again using Joins and functions and temporary tables. Don't it effect the query performance and resource utilization?
The queries which I and PW has posted will surely improve the performance, which you can compare it by your version of the query!!!!!
--Ramesh
November 28, 2007 at 6:53 am
Ok Ramesh. Thanks.
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply