Slowly changing fact?

Question

Post reply

Slowly changing fact?

james-700399

SSC Enthusiast

Points: 100
More actions
May 23, 2008 at 12:48 am

#186980

The company I work for track refinery capacities around the world and what to
put this data into a data warehouse. At first glance it looks simple enough,
every quarter they take a snapshot of each refineries capacity and that is
the fact table and the dimensions would be owners, location etc.
The problem I have though is that the capacity snapshot is not a fact, it is
the best information at the time, and is subject to change, even going back
years, when better data is available. The business has a need to see best
current data as well as being able to know what the best current data was at
a certain time.
Would the best approach be to make the fact table slowly changing and then
have validity data as dimension, or is there a better approach?

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Neelakandan V-453229 Ten Centuries Points: 1109 More actions · Answer 1

Hi,

I would more information before suggesting something..

1. What is the approx db size (in terms of no. of fact rows) in each quarter?

2. How much historical data is required? (how many snapshots to be kept in the dw for analysis)

Thanks,

Vijay.

Warm Regards,
Neel aka Vijay.

james-700399 SSC Enthusiast Points: 100 More actions · Answer 2

Thanks Vijay,

What it would look like is an initial load of 150k rows for the historical (30 years), then there would be a couple of rows a day of the historical data being edited, then 1,200 rows added once a new quarters snapshots are loaded.

Regards,

James

Jeff Moden SSC Guru Points: 1004340 More actions · Answer 3

Without knowing the structure of the data at hand, I'd say just load it, edit it when necessary, and add to it whenever you need to. If necessary, you can split the information into more normalized tables. I wouldn't even worry about DW on it right now. You may never need it...

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RBarryYoung SSC Guru Points: 143329 More actions · Answer 4

If on the other hand, you are already committed to a DW approach, then I would say that your original idea is on the right track. I would simply have two (2) date dimensions: one as the date for which the estimate is being made, and then a second that that represents the date on which the estimate was made.

This makes the accumulation of new facts much easier and more efficient and has the added benefit of providing information for new kinds of analysis (that is, "how have the estimates changed over time?").

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]