August 21, 2008 at 9:07 am
Hi,
Please help me solve the following scenario:
1) I have 10 flat files with 10 records each.
2) All the files have to be pushed into one table.
3) This must be done using SSIS.
The problem is: duplicate records might exist across the flat files, and these duplicates must not be inserted into the table.
For example:
Flat File 1
Name   Id
Radhey 1
Jon    2
Flat File 2
Name   Id
Jon    2
Jourge 3
The final output should be:
Name   Id
Radhey 1
Jon    2
Jourge 3
One scenario: I could put all the records into one temp table, say #temp1, then use SELECT DISTINCT ... INTO #temp2, and use #temp2 as my input source.
But is there any other option, or a tool that will detect duplicate ids?
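A minimal T-SQL sketch of the temp-table idea above (the table and column names are illustrative, not from an actual system):

```sql
-- Stage every row from the 10 flat files
-- (loaded e.g. via BULK INSERT or an SSIS data flow)
CREATE TABLE #temp1 (Name varchar(50), Id int);

-- ... bulk-load each flat file into #temp1 here ...

-- Keep only one copy of each (Name, Id) record
SELECT DISTINCT Name, Id
INTO #temp2
FROM #temp1;

-- #temp2 is now the de-duplicated input source
```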
It is urgent.
Thanks!
Naveen
August 21, 2008 at 9:11 am
The way I would approach this would be to load all the data, duplicates included, into a SQL table, making sure you include a unique id.
Then use SELECT DISTINCT, or another method, to remove the duplicates and insert the results into your final table.
There are ways to do it in SSIS, but they would be less efficient and more trouble.
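A sketch of this approach, assuming hypothetical table names dbo.Staging and dbo.Final. ROW_NUMBER() is one of the "other methods": unlike SELECT DISTINCT, it keeps exactly one row per Id even when the other columns differ between duplicates:

```sql
-- Staging table with a surrogate key so individual rows can be told apart
CREATE TABLE dbo.Staging (
    RowId int IDENTITY(1,1) PRIMARY KEY,
    Name  varchar(50),
    Id    int
);

-- After loading all 10 files into dbo.Staging,
-- insert only the first occurrence of each Id:
INSERT INTO dbo.Final (Name, Id)
SELECT Name, Id
FROM (
    SELECT Name, Id,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY RowId) AS rn
    FROM dbo.Staging
) AS s
WHERE rn = 1;
```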
August 21, 2008 at 9:18 am
Thanks steveb.
I had the same approach in mind, but my lead is asking whether there is any approach better than this...
Just one more thing:
1) This package should be automated, and if it fails I want to know which task failed so I can debug it...
Does every task have a task id, and can we capture it in the error handling?
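For reference, if the package is configured with the SSIS "SQL Server" log provider and the OnError event is selected for logging, failures are written to dbo.sysssislog (the table name assumed here is the SQL Server 2008 default), and the failing task can be identified after the fact:

```sql
-- Find which task raised an error in recent runs;
-- dbo.sysssislog is created by the SSIS SQL Server log provider
SELECT source      AS TaskName,
       sourceid    AS TaskGuid,
       executionid AS RunGuid,
       starttime,
       message
FROM dbo.sysssislog
WHERE event = 'OnError'
ORDER BY starttime DESC;
```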
Thanks!
Naveen
August 21, 2008 at 9:31 am
If you load them all in a single data flow, SSIS can remove the duplicates before loading them into the table.
Add one source component per flat file to your data flow and connect each one to its text file.
Use Sort components to sort the output of each source.
Combine the sorted outputs using Merge components. Note that a Merge is essentially a UNION ALL that preserves sort order, so it does not strip duplicates by itself; to remove them, check the "Remove rows with duplicate sort values" option on a Sort component after the merge.
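The set-based equivalent of this data-flow pattern is a plain UNION, which, unlike UNION ALL, removes duplicate rows across its inputs (the table names below are hypothetical, standing in for the parsed flat-file contents):

```sql
-- UNION removes the duplicate (Jon, 2) row that appears in both files;
-- UNION ALL would keep it twice
SELECT Name, Id FROM dbo.FlatFile1Rows
UNION
SELECT Name, Id FROM dbo.FlatFile2Rows
ORDER BY Id;
```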