November 6, 2015 at 3:34 am
Hi,
I Have a table which I've de-normalized for my DimDestinationLocations. see Below
How ever due to the amount of potential repeating rows I'm tempted to
load out the distinct address's into another table and have a look up.
For every booking there would be a Collection and Destination address, this table is going to get very large.
Is there any room to do this in Kimball methodology or is it just bad practise.
November 6, 2015 at 3:36 am
November 6, 2015 at 3:42 am
SimonH (11/6/2015)
Hi,I Have a table which I've de-normalized for my DimDestinationLocations. see Below
How ever due to the amount of potential repeating rows I'm tempted to
load out the distinct address's into another table and have a look up.
For every booking there would be a Collection and Destination address, this table is going to get very large.
Is there any room to do this in Kimball methodology or is it just bad practise.
Could you have a dim.Location table and then, in fact.Booking, add CollectionKey and DestinationKey, both FKs to dim.Location?
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
November 6, 2015 at 3:55 am
Thanks Phil.
That's the plan,
However
If a company has a warehouse say in Southampton, and they were sending
packages from Manchester say every day/every hour.
The Manchester and Southampton addresses are going to be repeated 1000's of times in the Dimlocations table, each with a unique Natural Key (Booking Key)
And a surrogate key
to the Bookings Fact table.
My Question is could I have just the distinct addresses in DimLoocations,
and have a bridging table (containing the keys) to the Fact table?
Or Should I just leave repeating addresses in Dimension table?
Thanks,
November 6, 2015 at 5:16 am
SimonH (11/6/2015)
Thanks Phil.That's the plan,
However
If a company has a warehouse say in Southampton, and they were sending
packages from Manchester say every day/every hour.
The Manchester and Southampton addresses are going to be repeated 1000's of times in the Dimlocations table, each with a unique Natural Key (Booking Key)
And a surrogate key
to the Bookings Fact table.
My Question is could I have just the distinct addresses in DimLoocations,
and have a bridging table (containing the keys) to the Fact table?
Or Should I just leave repeating addresses in Dimension table?
Thanks,
Assuming that the repetition of addresses is merely a shortcoming of the front-end system and not something that you will ever want to analyse in the DW, I would get rid of the dupes on their way in to the DW.
As I suggested, if the booking fact table contains FKs as necessary to the de-duped addresses dimension, you have just made things a whole lot simpler without losing anything useful.
Of course, your ETL process will have to implement a lookup on the locations dim (and have a way of creating new ones before the booking fact population).
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
November 6, 2015 at 9:06 am
SimonH (11/6/2015)
Thanks Phil.That's the plan,
However
If a company has a warehouse say in Southampton, and they were sending
packages from Manchester say every day/every hour.
The Manchester and Southampton addresses are going to be repeated 1000's of times in the Dimlocations table, each with a unique Natural Key (Booking Key)
And a surrogate key
to the Bookings Fact table.
My Question is could I have just the distinct addresses in DimLoocations,
and have a bridging table (containing the keys) to the Fact table?
Or Should I just leave repeating addresses in Dimension table?
Thanks,
That really depends on your requirements and how that data is stored/represented on the front end. For example if you normalize the address information and link it with a fact table if that address changes it will change all the historical data, that may or may not be desired in a datawarehouse....
November 6, 2015 at 9:41 am
That really depends on your requirements and how that data is stored/represented on the front end. For example if you normalize the address information and link it with a fact table if that address changes it will change all the historical data, that may or may not be desired in a datawarehouse
The Address Is linked to a booking so it's added everytime a booking is made.
It just so happens some client make multiple bookings to the same destination. So that wouldn't be an issue.
Thanks
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply