Update current period data based on previous period data

  • Hi to all gurus,

    I have a problem here and below is my story:

    My client prepared their yearly cost data and I loaded it into my fact table. Bear in mind that some of the months might not have any cost incurred. When I pull the cost data in SQL Server Management Studio, here is roughly how it looks:

    PRODUCT ID | ENTITY ID | MONTH  | COST
    ---------------------------------------
    CMP1000    | PM1       | Jan.09 | 100
    CMP1000    | PM1       | Feb.09 | 80
    CMP1000    | PM1       | Mar.09 |
    CMP1000    | PM1       | Apr.09 | 110
    CMP1000    | PM1       | May.09 |
    CMP1000    | PM1       | Jun.09 |
    CMP1000    | PM1       | Jul.09 |

    Now the problem is: how can we automatically update the COST column so that, if there is no cost incurred in Mar.09, it automatically refers to the previous month's cost? The same goes for the cost in May.09, Jun.09 and Jul.09; those periods should take the 110 figure.

    The reason for this is to calculate product profitability. We take the SALES figure minus the COST figure. E.g., if there is no cost incurred in Jul.09, the product profitability will be calculated as SALES in Jul.09 minus COST in Jul.09, which is 110.
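    For context, the profitability calculation I have in mind is roughly like the sketch below (the sales table and column names here are just placeholders for illustration, not my actual schema):

    -- Hypothetical profitability query: sales minus the carried-forward cost,
    -- per product, entity and month
    SELECT  s.product_id,
            s.entity_id,
            s.period,
            s.sales - c.cost AS profitability
    FROM    dbo.FactSales s
    JOIN    dbo.FactCost  c
        ON  c.product_id = s.product_id
        AND c.entity_id  = s.entity_id
        AND c.period     = s.period;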

    I'm thinking of writing a stored procedure to populate another column with the carried-forward values wherever the cost is NULL, but I'm not sure how to do it.

    If any SSC members here have come across this situation before or know of a more feasible solution, please share it with me.

    Many thanks!

  • What you have in hand is a classic "running total" problem.

    I suggest you go through the following excellent article by Jeff Moden to get the gist of, and solutions to, your problem:

    http://www.sqlservercentral.com/articles/T-SQL/68467/

  • As ColdCoffee said, read the article and implement it. I have already implemented the 'Quirky Update' to resolve a running total problem 🙂

    It is working perfectly...

    karthik

  • :ermm: Not a running total problem.

    DECLARE @Sample TABLE
    (
        product_id CHAR(7)    NOT NULL,
        entity_id  CHAR(3)    NOT NULL,
        period     INTEGER    NOT NULL,
        cost       SMALLMONEY NULL
    );

    INSERT @Sample (product_id, entity_id, period, cost)
    SELECT 'CMP1000', 'PM1', 20090101, 100  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090201, 80   UNION ALL
    SELECT 'CMP1000', 'PM1', 20090301, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090401, 110  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090501, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090601, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090701, NULL;

    SELECT  S.product_id,
            S.entity_id,
            S.period,
            CA.cost
    FROM    @Sample S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   @Sample S2
        WHERE  S2.product_id = S.product_id
        AND    S2.entity_id  = S.entity_id
        AND    S2.cost IS NOT NULL
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) CA;

    Paul

  • Paul White NZ (4/5/2010)


    :ermm: Not a running total problem.


    Genius Paul! This is what I want...Million thanks!

    YOU ARE MY SAVIOUR 😀

  • As a bit of a sidebar, I don't know if your data is actually like you posted it but, just in case, I'd recommend you NOT store formatted dates like 'Jan.09'.

    --Jeff Moden

  • Jeff Moden (4/5/2010)


    As a bit of a sidebar, I don't know if your data is actually like you posted it but, just in case, I'd recommend you NOT store formatted dates like 'Jan.09'.

    Absolutely! I meant to mention that when I posted my script - but forgot.

    My advice? Always store date and time data using one of the 'date or time' data types - never as anything else. And I don't make absolute statements like that very often...:-)
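    As a rough sketch of what I mean (the table and column names here are hypothetical, not taken from your actual schema), store the period as a real date value and leave the 'Jan.09' formatting to the presentation layer:

    -- Hypothetical cost table: the period is a genuine date/time value,
    -- e.g. the first day of each month, never formatted text like 'Jan.09'
    CREATE TABLE dbo.FactCost
    (
        product_id CHAR(7)    NOT NULL,
        entity_id  CHAR(3)    NOT NULL,
        period     DATETIME   NOT NULL,
        cost       SMALLMONEY NULL
    );

    -- Insert using an unambiguous date literal; any 'Mon.YY' display format
    -- belongs in the report layer, not in the table
    INSERT dbo.FactCost (product_id, entity_id, period, cost)
    VALUES ('CMP1000', 'PM1', '20090101', 100);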

  • Hmmmm.... be careful, folks. I thought I recognized the pattern in the CROSS APPLY. It contains a triangular join, and when I tested it on just 10K rows it generated over 50 million internal rows and a quarter of a million reads, with or without an index.

    For more information on Triangular joins and how they can cripple a server, please see the article at the following URL:

    http://www.sqlservercentral.com/articles/T-SQL/61539/
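    If you want to see the reads for yourself, here's a minimal sketch: append the following to Paul's @Sample script above (it has to run in the same batch as the DECLARE, and you'd need a much larger test table than seven rows to reproduce the numbers I quoted):

    -- Show logical reads and scan counts for the CROSS APPLY query
    SET STATISTICS IO ON;

    SELECT  S.product_id, S.entity_id, S.period, CA.cost
    FROM    @Sample S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   @Sample S2
        WHERE  S2.product_id = S.product_id
        AND    S2.entity_id  = S.entity_id
        AND    S2.cost IS NOT NULL
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) CA;

    SET STATISTICS IO OFF;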

    --Jeff Moden

  • Hi Paul and Jeff,

    The period column that displays as Jan.09 is actually a datetime data type which stores "1/1/2009 12:00:00 AM". It only shows as Jan.09 in the Web Intelligence report because I applied a Month.Year object format to it in Universe Designer.

    This post is actually for my BusinessObjects project.

    Again...thanks for all the inputs.

  • Paul White NZ (4/5/2010)


    :ermm: Not a running total problem.

    CREATE TABLE Sample1
    (
        product_id CHAR(7)    NOT NULL,
        entity_id  CHAR(3)    NOT NULL,
        period     INTEGER    NOT NULL,
        cost       SMALLMONEY NULL
    );

    INSERT Sample1 (product_id, entity_id, period, cost)
    SELECT 'CMP1000', 'PM1', 20090101, 100  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090201, 80   UNION ALL
    SELECT 'CMP1000', 'PM1', 20090301, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090401, 110  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090501, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090601, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090701, NULL;

    SELECT  S.product_id,
            S.entity_id,
            S.period,
            CA.cost
    FROM    Sample1 S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   Sample1 S2
        WHERE  S2.product_id = S.product_id
        AND    S2.entity_id  = S.entity_id
        AND    S2.cost IS NOT NULL
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) CA;

    Paul

    Hi Paul,

    I executed the code above (after making some minor amendments) and the query result is displayed below:

    PRODUCT ID | ENTITY ID | MONTH  | COST
    ---------------------------------------
    CMP1000    | PM1       | Jan.09 | 100
    CMP1000    | PM1       | Feb.09 | 80
    CMP1000    | PM1       | Mar.09 | 80
    CMP1000    | PM1       | Apr.09 | 110
    CMP1000    | PM1       | May.09 | 110
    CMP1000    | PM1       | Jun.09 | 110
    CMP1000    | PM1       | Jul.09 | 110

    But when I open the table, the result is still the same as before. I suspect that after the CROSS APPLY runs, the result is not written back to my Cost table. How can we save the result back to the Cost table?

    Thanks.

  • yingchai (4/5/2010)


    But when I open the table, the result is still the same as before. I suspect that after the CROSS APPLY runs, the result is not written back to my Cost table. How can we save the result back to the Cost table?

    Heh... you've heard of the UPDATE statement? The conversion should be easy on your part.

    --Jeff Moden

  • I'm also thinking that you didn't read the article on "Triangular Joins" whose link I posted. I strongly recommend you do so before you implement the code... you stand a serious chance of driving your I/O system nuts.

    --Jeff Moden

  • This is not a triangular join in the traditional sense (assuming the correct supporting index exists) since the APPLY results in a single-row seek per input row, not an aggregate over all preceding values, as would be the case with a running-total problem.

    The query presented, properly indexed, is a correlated join - the speciality of APPLY.
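    As an illustration only (the index name is made up, and the real fact table may need different key columns), a supporting index for the Sample1 version above might look like this:

    -- With this nonclustered index, the APPLY performs a single seek per outer row,
    -- landing directly on the latest non-NULL cost at or before each period
    CREATE NONCLUSTERED INDEX IX_Sample1_Product_Entity_Period
    ON Sample1 (product_id, entity_id, period DESC)
    INCLUDE (cost);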

  • Hi Jeff,

    I have sorted out the update portion. The code looks like this:

    UPDATE S
    SET    cost = C.cost
    FROM   factcosts2 S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   factcosts2 S2
        WHERE  S2.productid  = S.productid
        AND    S2.entityid   = S.entityid
        AND    S2.planitemid = S.planitemid
        AND    S2.cost <> 0
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) C;

    I'm not so concerned about the I/O performance, as I predict the Cost fact table will only grow to approximately 1 million records in 10 years. Currently, the database server is running with 16 GB of RAM. That should be no problem, right?

  • yingchai (4/6/2010)


    I'm not so concerned about the I/O performance, as I predict the Cost fact table will only grow to approximately 1 million records in 10 years. Currently, the database server is running with 16 GB of RAM. That should be no problem, right?

    Given correct indexing, it will be fine.
