Update stmt tuning

  • Hi pals,

    I have a datawarehouse database in which two tables are there.

    One is "Src" table and other "trg" serves as target table.

    select * from src

    ssn distcodefee dt

    10110NULL

    10110NULL

    101152007-11-26 00:00:00.000

    1011102007-11-26 00:00:00.000

    I loaded the ssn and repo

    select * from trg

    ssn distcodetot_fee

    1011NULL

    Now i need to update the total_fee field based on below conditions

    inital calculation:

    -------------------

    sum of fee where "dt" is null group by ssn,distcode.

    Note: is the above sum is zero then consider below secondary calculation

    Secondary calculation:

    ------------------------

    sum of "fee" where "dt" = max of dt in the "Src" table group by ssn,distcode

    Note: is the sum is zero store zero itself.

    for the above requirement i have used the below update statement. This is working fine.

    If u observe the below query the "src" table is loading 3 times.

    can we optimize this query? Can we avoid these multiple loading of the same table into memory?

    ---

    update trg set total_fee =

    (

    select case sum(b.fee) when 0 then

    --------- secondary condition

    (select case sum(c.fee) when 0.0 then

    0.0

    else

    sum(c.fee)

    end

    from src c

    where c.dt = (select max(dt) from src)

    and c.ssn=a.ssn

    and c.distcode=a.distcode)

    --------

    else

    sum(b.fee)

    end

    from src b

    where b.dt is null

    and b.ssn=a.ssn

    and b.distcode=a.distcode

    ) from trg a

    Any suggestions are greatly appreciated

    Thanks and Regards,

  • Check the the below queries...

    Solution 1

    UPDATEt

    SETt.total_fee = ( CASE WHEN COALESCE( n1.Fee, 0 ) = 0 THEN COALESCE( n2.Fee, 0 ) ELSE COALESCE( n1.Fee, 0 ) END )

    FROMtrg t

    LEFT JOIN

    (

    SELECTssn, distcode, SUM( Fee ) AS Fee

    FROMSrc

    WHEREdt IS NULL

    GROUP BY ssn, distcode

    ) n1 ON t.ssn = n1.ssn AND t.distcode = n1.distcode

    LEFT JOIN

    (

    SELECTROW_NUMBER() OVER( PARTITION BY ssn, distcode ORDER BY dt DESC ) AS RowNum,

    ssn, distcode, SUM( Fee ) AS Fee

    FROMSrc

    WHEREdt IS NOT NULL

    GROUP BY ssn, distcode

    ) n2 ON t.ssn = n2.ssn AND t.distcode = n2.distcode AND n2.RowNum = 1

    Solution 2 (in form of two updates)

    UPDATEt

    SETt.total_fee = COALESCE( n1.Fee, 0 )

    FROMtrg t

    INNER JOIN

    (

    SELECTssn, distcode, SUM( Fee ) AS Fee

    FROMSrc

    WHEREdt IS NULL

    GROUP BY ssn, distcode

    ) n1 ON t.ssn = n1.ssn AND t.distcode = n1.distcode

    UPDATEt

    SETt.total_fee = COALESCE( n2.Fee, 0 )

    INNER JOIN

    (

    SELECTROW_NUMBER() OVER( PARTITION BY ssn, distcode ORDER BY dt DESC ) AS RowNum,

    ssn, distcode, SUM( Fee ) AS Fee

    FROMSrc

    WHEREdt IS NOT NULL

    GROUP BY ssn, distcode

    ) n2 ON t.ssn = n2.ssn AND t.distcode = n2.distcode AND n2.RowNum = 1

    WHEREt.total_fee = 0

    --Ramesh


  • I have gone thru your problem and found that it can be done in the following way where we are traversing the source table twice.

    Will think about another solution which could even do it in one traverse.

    DECLARE @MaximumDate smalldatetime

    SET @MaximumDate = getdate();

    WITH Temp AS

    (

    SELECT

    ssn, distcode,

    CASE

    when dt IS NULL THEN @MaximumDate

    ELSE dt

    END dt,

    sum(fee) totalfee

    FROM src

    Group By

    ssn, distcode, dt

    Having sum(fee)<> 0

    )

    Update trg

    SET tot_fee = A.totalfee

    FROM

    Temp A

    INNER Join (

    SELECT

    Max(dt) MaxDate,

    ssn,

    distcode

    FROM

    Temp B

    Group BY ssn, distcode) C

    ON

    A.ssn = C.ssn and

    A.distcode = C.distcode and

    CASE

    WHEN A.dt IS NULL THEN @MaximumDate

    ELSE A.dt

    END = C.MaxDate

    INNER JOIN trg Target ON Target.ssn = A.ssn and Target.distcode = A.distcode

    -Rohit

  • What you have here is a ranking problem, with 2 levels of ranking.

    If dt is NULL and there's a fee, that ranks first, otherwise you apply a secondary ranking based on the max() date.

    Here's the SELECT statement. You'll need to patch this into your UPDATE:

    Select *

    From

    (

    Select ssn, distcode, dt, FeeSum,

    -- Secondary ranking, based on the date (descending)

    Rank() Over(Partition By ssn, distcode

    Order By DateFeeRank, dt desc) As Ranking

    From

    (

    Select ssn,

    distcode,

    dt,

    Sum(Fee) As FeeSum,

    -- Primary ranking - rank #1 if NULL date and there is a fee

    Case When dt Is Null And Sum(Fee) > 0.0

    Then 1

    Else 2

    End As DateFeeRank

    From #ssn

    Group By ssn, distcode, dt

    ) dt

    ) dtRank

    -- Select #1 ranked group from with each ssn/distcode

    Where Ranking = 1

  • Thanks for the help.

  • Ramesh , i just want to ask one question. We are again using Joins and functions and temporary tables. Don't it effect the query performance and resource utilization?

    Regards,

  • mahesh.vsp (11/26/2007)


    We are again using Joins and functions and temporary tables. Don't it effect the query performance and resource utilization?

    The queries which I and PW has posted will surely improve the performance, which you can compare it by your version of the query!!!!!

    --Ramesh


  • Ok Ramesh. Thanks.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply