August 23, 2017 at 9:56 am
I'm designing a new package whose very first step is to copy a flat file from a share into an official share folder where our packages pick up other flat files for other purposes. This flat file gets overwritten during every copy, and the job has to run once daily. My question is: for run time and performance, which is the recommended task? A file copy task to copy/overwrite the flat file, or a data flow task that imports the flat file, captures the row count into a variable for auditing purposes, and then writes the data back out to a flat file in the SSIS-bound flat files folder?
I'm leaning file copy task, but I'd like to hear from the experts.
August 23, 2017 at 10:16 am
BI_Dev - Wednesday, August 23, 2017 9:56 AM
File Copy, using a File System Task or Script Task, gets my vote.
Your alternative suggestion is not something I've ever encountered 'in the wild' before and it would puzzle me if I ever did ("Why did they do it that way ...?").
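If you go the Script Task route, a minimal sketch of the copy step might look like this. The variable names User::SourcePath and User::DestinationPath are hypothetical; substitute whatever your package actually uses, and remember to list them in the task's ReadOnlyVariables.

```csharp
// Minimal SSIS Script Task sketch: copy/overwrite a flat file.
// User::SourcePath and User::DestinationPath are hypothetical variable
// names; list them in the task's ReadOnlyVariables collection.
using System.IO;

public void Main()
{
    string source = Dts.Variables["User::SourcePath"].Value.ToString();
    string destination = Dts.Variables["User::DestinationPath"].Value.ToString();

    // Overwrite the destination on every run, per the daily-refresh requirement.
    File.Copy(source, destination, overwrite: true);

    Dts.TaskResult = (int)ScriptResults.Success;
}
```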
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
August 23, 2017 at 11:31 am
Phil Parkin - Wednesday, August 23, 2017 10:16 AM
I agree. I think the reason a DFT was used is that the total row count had to be passed to a variable which, in turn, gets inserted into a SQL audit table. How would I be able to do this with a file copy task?
August 23, 2017 at 12:03 pm
BI_Dev - Wednesday, August 23, 2017 11:31 AM
The usual process would be something like this
1) Move the file
2) Process the file, logging row count & other information of interest
3) Archive the file
You cannot get a row count from the File System Task, AFAIK.
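That said, if you really do need the count at copy time, a Script Task can count the lines in the same step, without a data flow. A minimal sketch, assuming one header row and the same hypothetical variable names as above (User::RowCount would need to be in ReadWriteVariables):

```csharp
// Sketch: count the data rows in the copied flat file and store the
// total in a package variable for a later audit insert.
// Assumes exactly one header row; adjust if your file has none.
using System.IO;
using System.Linq;

public void Main()
{
    string destination = Dts.Variables["User::DestinationPath"].Value.ToString();

    // File.ReadLines streams the file, so large files are not held in memory.
    int rowCount = File.ReadLines(destination).Count() - 1; // subtract the header

    Dts.Variables["User::RowCount"].Value = rowCount;
    Dts.TaskResult = (int)ScriptResults.Success;
}
```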
August 24, 2017 at 1:02 pm
Phil Parkin - Wednesday, August 23, 2017 12:03 PM
My preference would be to get the row count/total number of records once the source file is copied to the destination share.
August 24, 2017 at 1:41 pm
BI_Dev - Thursday, August 24, 2017 1:02 PM
Just as part of a copy and not as part of a data load?
Can you itemise the process which you would like to implement, please? I don't understand why you would need this.
August 24, 2017 at 2:02 pm
BI_Dev - Thursday, August 24, 2017 1:02 PM
I moved the file both ways to see what the run time would be.
The file copy task took 13:30 minutes.
The DFT took 10:00 minutes, and in the middle I inserted a Row Count transformation which passed the total rows to a variable, which in turn gets inserted into an audit table. This way I'm able to know how many rows are in the file that gets moved.
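For what it's worth, the audit insert at the end of that flow could equally be done from a Script Task once the variable is populated. A hedged sketch; the dbo.FileLoadAudit table, its columns, and the AuditDB connection manager name are all assumptions, not taken from this thread:

```csharp
// Hypothetical sketch: write the captured row count to a SQL audit table.
// Table name, column names and connection manager name are assumptions.
using System.Data.SqlClient;

public void Main()
{
    int rowCount = (int)Dts.Variables["User::RowCount"].Value;
    string connectionString = Dts.Connections["AuditDB"].ConnectionString;

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
        "INSERT INTO dbo.FileLoadAudit (FileName, [RowCount], LoadDate) " +
        "VALUES (@file, @rows, GETDATE());", connection))
    {
        command.Parameters.AddWithValue("@file",
            Dts.Variables["User::DestinationPath"].Value.ToString());
        command.Parameters.AddWithValue("@rows", rowCount);
        connection.Open();
        command.ExecuteNonQuery();
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}
```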
August 24, 2017 at 2:10 pm
BI_Dev - Thursday, August 24, 2017 2:02 PM
I'd be seriously scratching my head to understand why a file copy would take longer than a data flow.
If you are moving a file from A to B without processing it, why do you need the row count?
Why would audit care that the file you moved from A to B has 17,451 rows? Why not count the rows when you process them? Or simply use the file's size?
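If file size is an acceptable audit metric, it's a one-liner in a Script Task (variable names again hypothetical):

```csharp
// Hypothetical sketch: capture the copied file's size in bytes as the
// audit metric instead of a row count.
using System.IO;

public void Main()
{
    string destination = Dts.Variables["User::DestinationPath"].Value.ToString();
    Dts.Variables["User::FileSizeBytes"].Value = new FileInfo(destination).Length;
    Dts.TaskResult = (int)ScriptResults.Success;
}
```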