November 17, 2016 at 11:57 am
In terms of the Kimball methodology of star-schema modeling, I believe this would be an example of a "role playing dimension".
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
November 17, 2016 at 1:42 pm
Eric M Russell (11/17/2016)
In terms of the Kimball methodology of star-schema modeling, I believe this would be an example of a "role playing dimension".
From what I understand I dont think it is here. OP mentions that fields in question are mutually exclusive. That indicates to me that the record will originate from one of two sources. That is, CounterpartyAccountKey and GenevaAccountKey are intended to have the same meaning and designed to hold either NULL or one other value only.
Hence for example >> if a fact record was pulled >
from sourceA --> CounterpartyAccountKey gets value 23 and GenevaAccountKey is NULL
from sourceB -->CounterpartyAccountKey is NULL and GenevaAccountKey gets value 5
A role playing dimension has the same dimension apply to different attributes of the same fact record
(orderDate, paymentDate, shippedDate, arrivedOnDate, ...)
----------------------------------------------------
November 21, 2016 at 10:45 am
Hi,
I have updated my design to add a Source table. I have added the SourceKey to the fact table and removed the source reference from the other dimension tables.
Is this now a better design?
Thanks!
November 21, 2016 at 11:22 am
Yes. That appears like a straight forward star schema. One note though is that I would be consistent with the namings. If a column is named sourceKey in the fact table then good to name it the same way in the dimension, for all keys.
----------------------------------------------------
November 21, 2016 at 11:28 am
Upon further inspection, do you still need the following in the fact table: CounterpartyAccountKey and GenevaAccountKey? I would think these are being replaced with the sourceKey field.
----------------------------------------------------
November 21, 2016 at 11:28 am
I still see 2 foreign keys from the fact to Account table, and 2 foreign keys from the fact to Broker table. In a post on 11/16 you said they were mutually exclusive, so it would make sense that you would only need one BrokerKey column and one AccountKey column in the fact table. The model will be much easier for your end users to work with once these relationships are simplified.
Edit: D'oh, MMartin just beat me to this observation 😉
November 22, 2016 at 1:36 am
Chris Harshman (11/21/2016)
I still see 2 foreign keys from the fact to Account table, and 2 foreign keys from the fact to Broker table. In a post on 11/16 you said they were mutually exclusive, so it would make sense that you would only need one BrokerKey column and one AccountKey column in the fact table. The model will be much easier for your end users to work with once these relationships are simplified.Edit: D'oh, MMartin just beat me to this observation 😉
Thanks! Updated design as below...
November 22, 2016 at 7:02 am
OK, the fact table looks better, but I'd need to understand the business a little better to know why you still have multiple balances in the fact. Will these ever be aggregated together in any way? If they are the same measure then they should be the same column, and your Source information would be used in a report or pivot table to filter out or group the various entities such as Geneva or CounterParty. If they are truly different measures then keep them separate.
I also have a question in your dimension tables Account and Broker. How are they managed from the original system(s)? If you only have an ID and Name column, either you're using the ID from your OLTP system(s) or assuming that you'll always be able to match by Name, which seems unlikely.
November 24, 2016 at 6:01 am
Chris Harshman (11/22/2016)
OK, the fact table looks better, but I'd need to understand the business a little better to know why you still have multiple balances in the fact. Will these ever be aggregated together in any way? If they are the same measure then they should be the same column, and your Source information would be used in a report or pivot table to filter out or group the various entities such as Geneva or CounterParty. If they are truly different measures then keep them separate.
Apologies. I should only have had one balance column. Design updated as below.
I also have a question in your dimension tables Account and Broker. How are they managed from the original system(s)? If you only have an ID and Name column, either you're using the ID from your OLTP system(s) or assuming that you'll always be able to match by Name, which seems unlikely.
The Account and Broker name values should be identical to the source system. So account name of 'EEERT0450' is unique and will be on source and DW. So when I populate the DW, the process would check if that account already exists in the Dimension tbl. Do I need a better process around this?
Thanks!
November 25, 2016 at 12:11 pm
. So account name of 'EEERT0450' is unique and will be on source and DW. So when I populate the DW, the process would check if that account already exists in the Dimension tbl. Do I need a better process around this?
I would be hesitant to check using name as misspelling that occurred before can then be corrected in the source system and your process would think you have a new account here. So I would stick with using the ID fields for this (for as the name implies), that is what they are for.
I am sure you've heard of Kimball and the book "The data warehouse toolkit?" It is worth the read, even just the first few chapters to get you started.
----------------------------------------------------
December 7, 2016 at 3:47 am
Thanks! Currently reading it.
Should I always have an identifier key in each dimension that points back to the source/staging table?
December 9, 2016 at 10:52 am
I think it is a good idea to keep the business ID along with your surrogate key definitely. As far as the source, that sounds to me more for auditing suspect data correct? I would keep that in your Operational Data Store (if you have one). Otherwise as mentioned you can have that in a Source dimension (similar to a Supplier dimension).
----------------------------------------------------
December 9, 2016 at 10:53 am
points back to the source/staging table
Just does not come across to me as a good idea. That should be defined and referred to in your ETL process.
----------------------------------------------------
December 9, 2016 at 11:21 am
I think it is a good idea to keep the business ID along with your surrogate key definitely.
Strongly agree with it in the fact table. Unsure what you meant about it pointing back to the staging table. Normally that is truncated on each load, unless you mean as an immediate indicator if something goes wrong.
December 13, 2016 at 5:39 am
MMartin1 (12/9/2016)
I think it is a good idea to keep the business ID along with your surrogate key definitely. As far as the source, that sounds to me more for auditing suspect data correct? I would keep that in your Operational Data Store (if you have one). Otherwise as mentioned you can have that in a Source dimension (similar to a Supplier dimension).
OK. So I have added a business id to the Broker, Account, and Currency dimensions. This business id represents the OLTP data. I have also cleaned up the fact table.
Is there anything else I need to add?
Thanks for all the help! I'm learning a lot!
Viewing 15 posts - 16 through 30 (of 38 total)
You must be logged in to reply to this topic. Login to reply