January 17, 2012 at 5:33 am
I'm looking for a best-practice solution to update a bulk of records daily from a Transaction Processing System (text files) into a SQL Server database. We receive a large amount of data, i.e. customers' static information and their associated trades, in 2 text files. These 2 text files are placed on an FTP server on a daily basis. We have to read these 2 text files and import them into the SQL Server database for reporting and other MIS. One way out is to delete the previously imported data from SQL Server every time and re-import the 2 files. I want to avoid this method as it doesn't seem workable for a large amount of data.
- I want to insert only the new records which are present in the text file
- I want to update only the modified records
- and delete only the records which are not present in the text file.
These text files can range from 500,000 (5 lakh) to 1.5 million records.
January 17, 2012 at 6:29 am
safzalhussain (1/17/2012)
I'm looking for a best-practice solution to update a bulk of records daily from a Transaction Processing System (text files) into a SQL Server database. We receive a large amount of data, i.e. customers' static information and their associated trades, in 2 text files. These 2 text files are placed on an FTP server on a daily basis. We have to read these 2 text files and import them into the SQL Server database for reporting and other MIS. One way out is to delete the previously imported data from SQL Server every time and re-import the 2 files. I want to avoid this method as it doesn't seem workable for a large amount of data.
- I want to insert only the new records which are present in the text file
- I want to update only the modified records
- and delete only the records which are not present in the text file.
These text files can range from 500,000 (5 lakh) to 1.5 million records.
1.5 million rows is a fairly small volume - what's the maximum volume you expect?
Two comments.
1- If you are deleting the whole table please use truncate rather than delete. Truncate will do it instantly and generate no t-log.
2- An alternative to your full refresh strategy would be a traditional ETL process where you load the daily data into staging tables, then "merge" the staging data with your core tables. For small tables this may take longer than a full refresh. (A quick TRUNCATE vs DELETE sketch follows.)
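To illustrate the first comment, here is a minimal sketch of the difference, using a hypothetical dbo.CustomerImport table (any table you fully re-populate each day would do):

```sql
-- Option A: DELETE removes rows one at a time and fully logs each row;
-- on a large table this can be slow and grow the transaction log considerably.
DELETE FROM dbo.CustomerImport;

-- Option B: TRUNCATE TABLE deallocates the table's pages/extents instead,
-- which is far faster and logs much less (only the deallocations).
TRUNCATE TABLE dbo.CustomerImport;
```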
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge, but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
January 17, 2012 at 7:24 am
PaulB-TheOneAndOnly (1/17/2012)
1- If you are deleting the whole table please use truncate rather than delete. Truncate will do it instantly and generate no t-log.
Minimal transaction log activity, not none. Truncate is a logged operation like everything in SQL Server.
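If you want to see that for yourself, here is a rough sketch using the undocumented sys.fn_dblog function (the table name is made up, and fn_dblog is unsupported, so treat this as a demonstration only, not a production technique):

```sql
-- Demonstration only: TRUNCATE still writes log records (extent/page deallocations),
-- just far fewer than a fully logged row-by-row DELETE would.
CREATE TABLE dbo.TruncateDemo (Id INT IDENTITY PRIMARY KEY, Filler CHAR(100) NOT NULL);

INSERT INTO dbo.TruncateDemo (Filler)
SELECT TOP (1000) 'x' FROM sys.all_objects;   -- seed some rows

CHECKPOINT;   -- in SIMPLE recovery, lets prior log records be reused

SELECT COUNT(*) AS LogRecordsBefore FROM sys.fn_dblog(NULL, NULL);

TRUNCATE TABLE dbo.TruncateDemo;

SELECT COUNT(*) AS LogRecordsAfter FROM sys.fn_dblog(NULL, NULL);
-- LogRecordsAfter is higher: the truncate generated log records, just not many.
```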
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 17, 2012 at 7:31 am
GilaMonster (1/17/2012)
PaulB-TheOneAndOnly (1/17/2012)
1- If you are deleting the whole table please use truncate rather than delete. Truncate will do it instantly and generate no t-log.
Minimal transaction log activity, not none. Truncate is a logged operation like everything in SQL Server.
That's correct, Gail. Thank you for keeping me honest.
My Oracle background tricked me on this one - in the Oracle world truncate is a true DDL non-logged operation.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge, but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
January 17, 2012 at 7:53 am
Dear Experts
Thanks for giving your expert comments on my query.
The data size may vary from 1.5 million to 5 million records in a year. So if a FULL refresh is suggested at this moment, will it also be feasible for the forecasted data?
Please suggest a sample/example for such a situation using merging or staging.
January 17, 2012 at 8:28 am
safzalhussain (1/17/2012)
The data size may vary from 1.5 million to 5 million records in a year. So if a FULL refresh is suggested at this moment, will it also be feasible for the forecasted data? Please suggest a sample/example for such a situation using merging or staging.
Five million rows still looks like something easy to manage, especially if a Full Refresh is what is needed.
For future reference, a traditional merge based ETL process works like this...
1- Load source data (your text file) into a Staging table that gets truncated before each run.
2- Run a MERGE statement from the staging table against the target table - the MERGE statement can do inserts, updates, and deletes as needed. See this link for details: http://msdn.microsoft.com/en-us/library/bb510625.aspx
A traditional merge-based ETL process works great when just a fraction of the target table gets inserted/updated/deleted on each run. A rough sketch of both steps follows.
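To make that concrete, here is a minimal sketch of the load-and-merge pattern. The table names, columns, business key, and file path are all invented for illustration - adjust them to your own feed layout:

```sql
-- Step 1: reload the staging table from the daily text file.
TRUNCATE TABLE dbo.Customer_Staging;

BULK INSERT dbo.Customer_Staging
FROM '\\ftpserver\feeds\customers.txt'        -- hypothetical path to the daily file
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2, TABLOCK);

-- Step 2: merge staging into the target table.
-- New rows are inserted, matching rows are updated when they differ,
-- and rows missing from the staging table are deleted from the target.
MERGE dbo.Customer AS tgt
USING dbo.Customer_Staging AS src
    ON tgt.CustomerID = src.CustomerID        -- hypothetical business key
WHEN MATCHED AND (tgt.CustomerName <> src.CustomerName
               OR tgt.Address      <> src.Address) THEN
    UPDATE SET tgt.CustomerName = src.CustomerName,
               tgt.Address      = src.Address
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, CustomerName, Address)
    VALUES (src.CustomerID, src.CustomerName, src.Address)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

Note that MERGE requires SQL Server 2008 or later; on 2005 the same result takes separate INSERT, UPDATE, and DELETE statements against the staging table.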
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge, but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a test, non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.