File System Task vs. DFT for flat file copy

Question

File System Task vs. DFT for flat file copy

BI_Developer

SSC Eights!

Points: 908
More actions
August 23, 2017 at 9:56 am

#328049

I'm designing a new package where the very first step I need to copy a flat file from a share and put it in an official share folder where our packages pick up other flat files for other purposes. This flat file gets overwritten during every copy. It has to run once daily. My question is; for run time and performance optimization, which would be the recommended task? A file copy task to copy/overwrite the flat file, or a data flow task to import flat file; get row count inserted into a variable for auditing purposes, and last, loading the data into a flat file on the SSIS bound flat files.
I'm leaning file copy task, but I'd like to hear from the experts.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 246987 More actions · Answer 1

BI_Dev - Wednesday, August 23, 2017 9:56 AM
I'm designing a new package where the very first step I need to copy a flat file from a share and put it in an official share folder where our packages pick up other flat files for other purposes. This flat file gets overwritten during every copy. It has to run once daily. My question is; for run time and performance optimization, which would be the recommended task? A file copy task to copy/overwrite the flat file, or a data flow task to import flat file; get row count inserted into a variable for auditing purposes, and last, loading the data into a flat file on the SSIS bound flat files.
I'm leaning file copy task, but I'd like to hear from the experts.

File Copy, using a File System Task or Script Task, gets my vote.
Your alternative suggestion is not something I've ever encountered 'in the wild' before and it would puzzle me if I ever did ("Why did they do it that way ...?").

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

BI_Developer SSC Eights! Points: 908 More actions · Answer 2

Phil Parkin - Wednesday, August 23, 2017 10:16 AM
BI_Dev - Wednesday, August 23, 2017 9:56 AM
I'm designing a new package where the very first step I need to copy a flat file from a share and put it in an official share folder where our packages pick up other flat files for other purposes. This flat file gets overwritten during every copy. It has to run once daily. My question is; for run time and performance optimization, which would be the recommended task? A file copy task to copy/overwrite the flat file, or a data flow task to import flat file; get row count inserted into a variable for auditing purposes, and last, loading the data into a flat file on the SSIS bound flat files.
I'm leaning file copy task, but I'd like to hear from the experts.
File Copy, using a File System Task or Script Task, gets my vote.
Your alternative suggestion is not something I've ever encountered 'in the wild' before and it would puzzle me if I ever did ("Why did they do it that way ...?").

I agree. I think the reason a DFT was used because the total row count had to be passed to a variable which in turn, gets inserted into a sql audit table. How would I be able to do this for the file copy task?

Phil Parkin SSC Guru Points: 246987 More actions · Answer 3

BI_Dev - Wednesday, August 23, 2017 11:31 AM
I agree. I think the reason a DFT was used because the total row count had to be passed to a variable which in turn, gets inserted into a sql audit table. How would I be able to do this for the file copy task?

The usual process would be something like this
1) Move the file
2) Process the file, logging obtain row count & other information of interest
3) Archive the file

You cannot get a row count from the File System Task, AFAIK.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

BI_Developer SSC Eights! Points: 908 More actions · Answer 4

Phil Parkin - Wednesday, August 23, 2017 12:03 PM
BI_Dev - Wednesday, August 23, 2017 11:31 AM
I agree. I think the reason a DFT was used because the total row count had to be passed to a variable which in turn, gets inserted into a sql audit table. How would I be able to do this for the file copy task?
The usual process would be something like this
1) Move the file
2) Process the file, logging obtain row count & other information of interest
3) Archive the file
You cannot get a row count from the File System Task, AFAIK.

My preference would be to get row count/total number of records once the source file is copied to the destination share.

Phil Parkin SSC Guru Points: 246987 More actions · Answer 5

BI_Dev - Thursday, August 24, 2017 1:02 PM
My preference would be to get row count/total number of records once the source file is copied to the destination share.

Just as part of a copy and not as part of a data load?
Can you please itemise the process which you would like to implement, please? I don't understand why you would need this.

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel

BI_Developer SSC Eights! Points: 908 More actions · Answer 6

BI_Dev - Thursday, August 24, 2017 1:02 PM
Phil Parkin - Wednesday, August 23, 2017 12:03 PM
BI_Dev - Wednesday, August 23, 2017 11:31 AM
I agree. I think the reason a DFT was used because the total row count had to be passed to a variable which in turn, gets inserted into a sql audit table. How would I be able to do this for the file copy task?
The usual process would be something like this
1) Move the file
2) Process the file, logging obtain row count & other information of interest
3) Archive the file
You cannot get a row count from the File System Task, AFAIK.
My preference would be to get row count/total number of records once the source file is copied to the destination share.

I moved the file both ways to see what the run time would be.
File copy task it took 13:30 minutes
DFT took 10:00 minutes, and, in the middle I inserted a rowcount task which passed to total rows to a variable, which in turn gets inserted into an audit table. This way I'm able to know how many rows are on the file that gets moved.

Phil Parkin SSC Guru Points: 246987 More actions · Answer 7

BI_Dev - Thursday, August 24, 2017 2:02 PM
I moved the file both ways to see what the run time would be.
File copy task it took 13:30 minutes
DFT took 10:00 minutes, and, in the middle I inserted a rowcount task which passed to total rows to a variable, which in turn gets inserted into an audit table. This way I'm able to know how many rows are on the file that gets moved.

I'd be seriously scratching my head to understand why a file copy would take longer than a data flow.
If you are moving a file from A to B without processing it, why do you need the row count?
Why would audit care that the file you moved from A to B has 17,451 rows? Why not count the rows when you process them? Or simply use the file's size?

The absence of evidence is not evidence of absence.
Martin Rees

You can lead a horse to water, but a pencil must be lead.
Stan Laurel