import data from one flat source file to two tables

Question

import data from one flat source file to two tables

sqlfriend

SSC Guru

Points: 52781
More actions
November 19, 2011 at 7:15 pm

#248136

I need to import some flat files data to SQl database tables.
I created data flow task to do this.
But one of our flat source file have some columns of two tables in my sql database, so I want to split the flat file data into two, I need to select distinc from 3 columns into table A, select distinct the other columns into table B. Currently I implement this doing a data flow task first from all columns in a source file to a staging table, then I add another sql task using two insert into....select distinct ... statements into two different tables from the staging table,
is this the efficient way to implement this, or is there better way to do it?
Thanks

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

sam.dahl Hall of Fame Points: 3112 More actions · Answer 1

Just a thought, but you could just load the data from the file, multicast it to two streams, sort each stream on the required keys removing duplicates, and send to the two destinations. I don't quite recall if there's a problem with sorting two streams of a multicast in different ways.

In theory the above idea saves you writing and reading twice, but sorts can be costly. In fact it might be better just to read the flat file twice only getting the required columns each time.

YMMV, test if you like.

sqlfriend SSC Guru Points: 52781 More actions · Answer 2

sam.dahl (11/20/2011)
Just a thought, but you could just load the data from the file, multicast it to two streams, sort each stream on the required keys removing duplicates, and send to the two destinations. I don't quite recall if there's a problem with sorting two streams of a multicast in different ways.
In theory the above idea saves you writing and reading twice, but sorts can be costly. In fact it might be better just to read the flat file twice only getting the required columns each time.
YMMV, test if you like.

Thanks, I learned from you to do two streams from multicast, although the sorting is expensive.

The second method you mentioned is to read the flat file twice, but if using that way, some columns have duplicated records.

Is there any problem the way I do it from my previous post?

Thanks

sam.dahl Hall of Fame Points: 3112 More actions · Answer 3

Hmm, I assumed a select distinct query would be possible on a flat file source in the 'read twice' scenario.

Is there any problem the way I do it from my previous post?

Not at all, if you are getting the required performance for your dataset then staging is fine.

sqlfriend SSC Guru Points: 52781 More actions · Answer 4

Thank you!

Also I cannot find a way to use select distict in flat file source, is there a way to do that?

Thanks

sam.dahl Hall of Fame Points: 3112 More actions · Answer 5

No, after checking, I don't believe there is a way to do it. I was getting my etl tools confused. It would still require sorting to remove duplicates.

sqlfriend SSC Guru Points: 52781 More actions · Answer 6

sqlfriend

SSC Guru

Points: 52781

November 21, 2011 at 9:43 am

#1410860

Thanks a lot

RonKyle SSC-Dedicated Points: 31557 More actions · Answer 7

I think your original solution solves the problem well.