April 26, 2016 at 2:41 pm
HI Team
This would be a common problem and I tried to search the net but could not find a good answer. The problem is
Say the model of the fact table is an insert only model , where the records are always getting inserted by the ETL process. A separate process would search if the inserted record already exist and if it does, will mark the end_date of that record . The new record will have an end_date as null.
I have achieved it through a self join on the table. We use Informatica , and a target lookup would have also helped, but caching the full table is expensive. Is there any other way to do it ?
Thanks
April 27, 2016 at 2:31 am
Output the new facts to a working table. Then use T-SQL to insert the new records and update the old ones.
Jez
April 27, 2016 at 5:17 am
While you're at it, I recommend making the new end-date = '9999' (equals 9999-01-01 when inserted into a DATETIME column) so that you're queries aren't plagued by the use of ISNULL or IS NULL and makes the possibility of future dating not only possible but identical in nature to normal date searches.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 27, 2016 at 12:21 pm
Yes Jez we have that implementation too . Thanks for suggesting that .
I like Jeff's idea.Will do it that ways .
December 12, 2017 at 1:26 pm
Bringing an old topic alive , this is what I am thinking .
We have a fact table that is , say 1000 columns wide and then lets say we have an additional column , an isactive column which , if 1 means that the fact record is active and if 0 means the record is inactive.
NOw if the same record comes in with the latest ETL , the older version of the record needs to be set to isactive = 0 and the latest one retains as isactive = 1 .
Lets say , we take the PK , the Natural Key and the isactive column to a separate table and instead of updating the 1000 column wide table on one flag column , we update this new table to switch the flags.
Downside is , the select queries have to join the 2 tables to get the active records.
Is there any obvious performance difference between updating 1 column in a 4 column wide table vs 1 columns in a 1000 column wide table ?
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply