Update current period data based on previous period data

  • Hi to all gurus,

    I have a problem here and below is my story:

    My client prepared their yearly cost data and I loaded it into my fact table. Bear in mind that some of the months might not have any cost incurred. When I pull the cost data in SQL Server Management Studio, here is roughly how it looks:

    PRODUCT ID | ENTITY ID | MONTH  | COST
    ---------------------------------------
    CMP1000    | PM1       | Jan.09 | 100
    CMP1000    | PM1       | Feb.09 | 80
    CMP1000    | PM1       | Mar.09 |
    CMP1000    | PM1       | Apr.09 | 110
    CMP1000    | PM1       | May.09 |
    CMP1000    | PM1       | Jun.09 |
    CMP1000    | PM1       | Jul.09 |

    Now the problem is: how can we automatically update the COST column so that, if there is no cost incurred in Mar.09, it automatically refers to the previous month's cost? The same goes for the cost in May.09, Jun.09 and Jul.09; those periods should take the 110 figure.

    The reason for this is to calculate product profitability. We take the SALES figure minus the COST figure. E.g., if there is no cost incurred in Jul.09, the product profitability will be calculated as SALES in Jul.09 minus COST in Jul.09, which is 110.
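    For context, the profitability calculation I have in mind is roughly like the sketch below (the sales table and column names here are just placeholders for illustration, not my actual schema):

    -- Hypothetical profitability query: sales minus the carried-forward cost,
    -- per product, entity and month
    SELECT  s.product_id,
            s.entity_id,
            s.period,
            s.sales - c.cost AS profitability
    FROM    dbo.FactSales s
    JOIN    dbo.FactCost  c
        ON  c.product_id = s.product_id
        AND c.entity_id  = s.entity_id
        AND c.period     = s.period;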

    I'm thinking of writing a stored procedure to populate another column with the carried-forward values wherever the cost is NULL, but I'm not sure how to do it.

    If any SSC members here have come across this situation before or know of a more feasible solution, please share it with me.

    Many thanks!

  • What you have in hand is a classic "running total" problem.

    I suggest you go through the following excellent article by Jeff Moden to get the gist of, and solutions to, your problem:

    http://www.sqlservercentral.com/articles/T-SQL/68467/

  • As ColdCoffee said, read the article and implement it. I have already implemented the 'Quirky Update' to resolve a running total problem 🙂

    It is working perfectly...

    karthik

  • :ermm: Not a running total problem.

    DECLARE @Sample TABLE
    (
        product_id CHAR(7)    NOT NULL,
        entity_id  CHAR(3)    NOT NULL,
        period     INTEGER    NOT NULL,
        cost       SMALLMONEY NULL
    );

    INSERT @Sample (product_id, entity_id, period, cost)
    SELECT 'CMP1000', 'PM1', 20090101, 100  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090201, 80   UNION ALL
    SELECT 'CMP1000', 'PM1', 20090301, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090401, 110  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090501, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090601, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090701, NULL;

    SELECT  S.product_id,
            S.entity_id,
            S.period,
            CA.cost
    FROM    @Sample S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   @Sample S2
        WHERE  S2.product_id = S.product_id
        AND    S2.entity_id  = S.entity_id
        AND    S2.cost IS NOT NULL
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) CA;

    Paul

  • Paul White NZ (4/5/2010)


    :ermm: Not a running total problem.


    Genius Paul! This is what I want...Million thanks!

    YOU ARE MY SAVIOUR 😀

  • As a bit of a sidebar, I don't know if your data is actually like you posted it but, just in case, I'd recommend you NOT store formatted dates like 'Jan.09'.

    --Jeff Moden

  • Jeff Moden (4/5/2010)


    As a bit of a sidebar, I don't know if your data is actually like you posted it but, just in case, I'd recommend you NOT store formatted dates like 'Jan.09'.

    Absolutely! I meant to mention that when I posted my script - but forgot.

    My advice? Always store date and time data using one of the 'date or time' data types - never as anything else. And I don't make absolute statements like that very often...:-)
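    As a rough sketch of what I mean (the table and column names here are hypothetical, not taken from your actual schema), store the period as a real date value and leave the 'Jan.09' formatting to the presentation layer:

    -- Hypothetical cost table: the period is a genuine date/time value,
    -- e.g. the first day of each month, never formatted text like 'Jan.09'
    CREATE TABLE dbo.FactCost
    (
        product_id CHAR(7)    NOT NULL,
        entity_id  CHAR(3)    NOT NULL,
        period     DATETIME   NOT NULL,
        cost       SMALLMONEY NULL
    );

    -- Insert using an unambiguous date literal; any 'Mon.YY' display format
    -- belongs in the report layer, not in the table
    INSERT dbo.FactCost (product_id, entity_id, period, cost)
    VALUES ('CMP1000', 'PM1', '20090101', 100);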

  • Hmmmm.... be careful, folks. I thought I recognized the pattern in the CROSS APPLY. It contains a triangular join, and when I tested it on just 10K rows it generated over 50 million internal rows and a quarter of a million reads, with or without an index.

    For more information on Triangular joins and how they can cripple a server, please see the article at the following URL:

    http://www.sqlservercentral.com/articles/T-SQL/61539/
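    If you want to see the reads for yourself, here's a minimal sketch: append the following to Paul's @Sample script above (it has to run in the same batch as the DECLARE, and you'd need a much larger test table than seven rows to reproduce the numbers I quoted):

    -- Show logical reads and scan counts for the CROSS APPLY query
    SET STATISTICS IO ON;

    SELECT  S.product_id, S.entity_id, S.period, CA.cost
    FROM    @Sample S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   @Sample S2
        WHERE  S2.product_id = S.product_id
        AND    S2.entity_id  = S.entity_id
        AND    S2.cost IS NOT NULL
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) CA;

    SET STATISTICS IO OFF;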

    --Jeff Moden

  • Hi Paul and Jeff,

    The period column that displays as Jan.09 is actually a datetime data type which stores "1/1/2009 12:00:00 AM". It only shows as Jan.09 in the Web Intelligence report because I applied a Month.Year object format to it in Universe Designer.

    This post is actually for my BusinessObjects project.

    Again...thanks for all the inputs.

  • Paul White NZ (4/5/2010)


    :ermm: Not a running total problem.

    CREATE TABLE Sample1
    (
        product_id CHAR(7)    NOT NULL,
        entity_id  CHAR(3)    NOT NULL,
        period     INTEGER    NOT NULL,
        cost       SMALLMONEY NULL
    );

    INSERT Sample1 (product_id, entity_id, period, cost)
    SELECT 'CMP1000', 'PM1', 20090101, 100  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090201, 80   UNION ALL
    SELECT 'CMP1000', 'PM1', 20090301, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090401, 110  UNION ALL
    SELECT 'CMP1000', 'PM1', 20090501, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090601, NULL UNION ALL
    SELECT 'CMP1000', 'PM1', 20090701, NULL;

    SELECT  S.product_id,
            S.entity_id,
            S.period,
            CA.cost
    FROM    Sample1 S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   Sample1 S2
        WHERE  S2.product_id = S.product_id
        AND    S2.entity_id  = S.entity_id
        AND    S2.cost IS NOT NULL
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) CA;

    Paul

    Hi Paul,

    I executed the code above (after making some minor amendments) and the query result is displayed below:

    PRODUCT ID | ENTITY ID | MONTH  | COST
    ---------------------------------------
    CMP1000    | PM1       | Jan.09 | 100
    CMP1000    | PM1       | Feb.09 | 80
    CMP1000    | PM1       | Mar.09 | 80
    CMP1000    | PM1       | Apr.09 | 110
    CMP1000    | PM1       | May.09 | 110
    CMP1000    | PM1       | Jun.09 | 110
    CMP1000    | PM1       | Jul.09 | 110

    But when I open the table, the result is still the same as before. I suspect that after the CROSS APPLY runs, the result is not written back to my Cost table. How can we save the result back to the Cost table?

    Thanks.

  • yingchai (4/5/2010)


    But when I open the table, the result is still the same as before. I suspect that after the CROSS APPLY runs, the result is not written back to my Cost table. How can we save the result back to the Cost table?

    Heh... you've heard of the UPDATE statement? The conversion should be easy on your part.

    --Jeff Moden

  • I'm also thinking that you didn't read the article on "Triangular Joins" whose link I posted. I strongly recommend you do so before you implement the code... you stand a serious chance of driving your I/O system nuts.

    --Jeff Moden

  • This is not a triangular join in the traditional sense (assuming the correct supporting index exists) since the APPLY results in a single-row seek per input row, not an aggregate over all preceding values, as would be the case with a running-total problem.

    The query presented, properly indexed, is a correlated join - the speciality of APPLY.
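    As an illustration only (the index name is made up, and the real fact table may need different key columns), a supporting index for the Sample1 version above might look like this:

    -- With this nonclustered index, the APPLY performs a single seek per outer row,
    -- landing directly on the latest non-NULL cost at or before each period
    CREATE NONCLUSTERED INDEX IX_Sample1_Product_Entity_Period
    ON Sample1 (product_id, entity_id, period DESC)
    INCLUDE (cost);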

  • Hi Jeff,

    I have sorted out the update portion. The code looks like this:

    UPDATE S
    SET    cost = C.cost
    FROM   factcosts2 S
    CROSS APPLY
    (
        SELECT TOP (1) *
        FROM   factcosts2 S2
        WHERE  S2.productid  = S.productid
        AND    S2.entityid   = S.entityid
        AND    S2.planitemid = S.planitemid
        AND    S2.cost <> 0
        AND    S2.period    <= S.period
        ORDER BY S2.period DESC
    ) C;

    I'm not so concerned about the I/O performance, as I predict the Cost fact table will only grow to approximately 1 million records in 10 years. Currently, the database server is running with 16 GB of RAM. That should be no problem, right?

  • yingchai (4/6/2010)


    I'm not so concerned about the I/O performance, as I predict the Cost fact table will only grow to approximately 1 million records in 10 years. Currently, the database server is running with 16 GB of RAM. That should be no problem, right?

    Given correct indexing, it will be fine.
