September 26, 2013 at 12:22 pm
I want to make a package to load some text files into a database table. Any suggestions on how I can check if a file has been loaded already or if a duplicate file exists, then don't load the duplicate ?
Thanks.
September 26, 2013 at 1:03 pm
etl.laptop (9/26/2013)
I want to make a package to load some text files into a database table. Any suggestions on how I can check if a file has been loaded already or if a duplicate file exists, then don't load the duplicate ?Thanks.
After loading a file, move it to an archive folder (and rename it to include the load date in the file name). Then you can't load it twice.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
September 28, 2013 at 2:26 am
You can use either of the approach to solve this:
1. Use a For each Loop container to load the text files and then move them to archive or delete them.
2. If you have a business key in the table and text file, you can compare the business key in text file with table to validate that file has been previously loaded or not.
3. Duplicate file will not have same name but they may have same data. Use approach #2.
Vikash Kumar Singh || www.singhvikash.in
September 30, 2013 at 12:33 am
For the #2 approach which you have mentioned, the design must have a dedicated staging area tables to perform an OUTER JOIN with destinations to load only new data.
Raunak J
September 30, 2013 at 12:53 am
..must have a dedicated staging area tables ...
Please justify this comment. Because I can think of a way of doing this without staging tables.
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
September 30, 2013 at 12:57 am
MERGE JOIN transformation?
Raunak J
September 30, 2013 at 1:09 am
i believe the best solution will be of using the an archived folder where u can move the processed files. (use file system task 🙂 )
September 30, 2013 at 1:15 am
Mr. Kapsicum (9/30/2013)
i believe the best solution will be of using the an archived folder where u can move the processed files. (use file system task 🙂 )
+1
It's a system hard to break + you have the loaded files waiting for you in a directory for error handling or auditing.
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply