December 13, 2016 at 8:37 am
Quick questions:
Any multiple currency subjects?
Currency exchange rates?
Exchange rate history?
Aggregation across currencies?
😎
December 13, 2016 at 11:06 am
Thanks for the feedback Kevin. That looks like a star schema coming together now. What about the account type? That can be an attribute in the account dimension. As far as other measures, you have now a scalable fact table where any other measures you want cab be added, like YTD interest paid and number of account holders, etc... That will depend more on what your users need.
----------------------------------------------------
December 14, 2016 at 2:23 am
Hi,
Just on the business keys. The data is actually sourced from 3rd party systems. The data is loaded from these systems through to the staging tables by the ETL process. So I don't have a business key for this data - unless I generate a surrogate key or use the PK in the staging table (but the staging table will get cleared each day)?
What would be the best practice here? Would a business key not be required then?
Thanks!
December 14, 2016 at 10:41 am
Quick answers,
a) Creating a surrogate key should be a normal routine
b) If there is no business ID (primary key usually) in your source data , try communicating to the appropriate personnel about this issue. Having this identifier is crucial to match the rows of a data import to any records already existing for them. You should not have to work around this issue and should be part of the agreement between your business and your sources of data. The identifyer in a staging table is useless.
----------------------------------------------------
December 19, 2016 at 10:29 am
MMartin1 (12/14/2016)
Quick answers,a) Creating a surrogate key should be a normal routine
b) If there is no business ID (primary key usually) in your source data , try communicating to the appropriate personnel about this issue. Having this identifier is crucial to match the rows of a data import to any records already existing for them. You should not have to work around this issue and should be part of the agreement between your business and your sources of data. The identifyer in a staging table is useless.
Agreed. However, one issue is that one source is a csv file. Any best practice for using CSV as a source for a DW?
Thank you!
December 19, 2016 at 11:53 am
kevin.obrien 66193 (12/19/2016)
MMartin1 (12/14/2016)
Quick answers,a) Creating a surrogate key should be a normal routine
b) If there is no business ID (primary key usually) in your source data , try communicating to the appropriate personnel about this issue. Having this identifier is crucial to match the rows of a data import to any records already existing for them. You should not have to work around this issue and should be part of the agreement between your business and your sources of data. The identifyer in a staging table is useless.
Agreed. However, one issue is that one source is a csv file. Any best practice for using CSV as a source for a DW?
Thank you!
I do not think this would require any difference in the conceptual design of a DW. Your CSV should have the business ID's in them regardless. You treat the source the same as you would when pulling from other machines at a high level. At a lower level you may want to watch out for commas that are part of the the string value (in which case you would need quoted text identifyers , for example "this is my string value, thank you") so that one field doesn't get confused as two due to the existence of a comma. And I always like having the headers in the CSV so that I know what I am looking at to be sure I am throwing that field in its rightful column at the destination table.
----------------------------------------------------
December 29, 2016 at 4:46 am
Hi,
The Data Warehouse is working successfully. Thanks for all the help!
One new request now is to build a report to reconcile the data from both sources. So basically I need to match the data from SourceKey 1 (Broker) to the data in SourceKey 2 (Geneva). The key to match on is AccountKey.
There is a 'mapping' table that stores the relationship between the AccountKey on both sources.
What is the best way to build the SQL query on the DW to match using this mapping tbl? I have tried using a self join as below but not successful...
Fact Tbl Data...
Thanks.
January 3, 2017 at 11:12 am
Wouldn't you just join on brokerKey (fact table)= brokerAccountKey (mapping table)? Try keeping the columns names consistent on multiple tables when representing the same column.
----------------------------------------------------
December 20, 2017 at 8:49 am
Hi again.
Thanks for all the help previously. I now have an Investment tbl and an InvestmentRelation tbl added to the DW as shown in the first image below.
The InvestmentKey in the FACT table relates to the InvestmentKey in the Investment Dim table.
The InvestmentRelation table is used to store a predefined set of InvestmentCodes that match from the various sources.
We have 2 sources in the Source tbl - Geneva (SourceID=1) & Counterparty (SourceID=2).
If the InvestmentCode (InvestmentID) from Geneva is a match to the InvestmentCode from Counterparty then that match would be recorded in the InvestmentRelation table.
Screenshot A:
What is the best way to flag these as matched? I am thinking of adding a new column in the FACT tbl, which would relate to InvestmentRelationKey column in the InvestmentRelation tbl.
So in effect, if InvestmentKeys 10 and 4 in the FACT table (Screenshot B) are matched in the InvestmentRelation tbl (i.e. in as 1 row) (Screenshot C), then the new FACT column would contain this one InvestmentRelationKey in both rows (Screenshot D).
Is this the correct approach or is there a better/simpler way of doing this? Ultimately the data that matches will need to be presented as one row - so in this case a SQL query would return InvestmentKeys 10 and 4 as one row. This would look like Screenshot E below.
I imagine a ROLLUP or GROUPING query would be required for this?
Thanks!
Screenshot B:
Screenshot C:
Screenshot D:
Screenshot E:
Viewing 9 posts - 31 through 38 (of 38 total)
You must be logged in to reply to this topic. Login to reply