I have 14 .csv files from which I need to extract 1) a date found only in the file name itself e.g. "ConvoEVQ2021-03-22" (i.e. I need to harvest the date part of the file name and place it in all the date fields of each SQL record associated with that .csv), 2) an individual's full name from inside the file (to be stored as one field), 3) a constant literal stored in a field called Domain, and 4) a time in hours to be stored as a string in SQL. So the records written to SQL would look like:
DATE NAME DOMAIN HOURS
2021-03-22 Fred Brown EVQ 04:14:19
2021-03-22 Flisa Robins EVQ 06:27:53
2021-03-22 Pablo Ortiz EVQ 03:45:10
I plan to use a "Foreach Loop Container" to process the files in SSIS but I need some guidance about how to create the individual "Execute SQL Task"s and the "Data Flow Task" itself into SQL Server. I have no idea how to extract the date from the .csv file name and write it to the SQL table, write a constant into the SQL table, and pull out the name and hours.
I have created Variables to hold all four fields and have created four "Execute SQL Tasks" (each referencing a variable) inside the "Foreach Loop Container" but I need help configuring everything.
Any help on this would be greatly appreciated for this mid-level SSIS developer!
April 15, 2021 at 3:33 pm
Why do you need an Execute T-SQL Task here? A (simple) Data Flow Task with just a Flat File Source and OLEDB/SQL Server Destination should be more than enough here. Why are you using variables to hold the values (for each row?) when you should just be using a Dataflow Task that very much looks like this:
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
April 15, 2021 at 3:42 pm
Well your for each loop should have a variable with the current file it's processing. You'll need to do script foo on that to get the date out of it. Then it's just a data flow with the file to your table mapping the relevant fields from the file with the date you extracted from the file name and the constant in the other fields.
April 15, 2021 at 3:44 pm
Just a guess as to my strategy - I'm using variables to hold the values of each field as the process iterates through each row of the .csv file. Can I get by with just a single Data Flow Task and still harvest the date from the .csv file name, assign the constant "EVQ", pull out the Name and Hours for each row and iterate through the file and write it to SQL Server table then go onto the next of the 14 .csv files?
April 15, 2021 at 3:46 pm
You'll need to do script foo on that to get the date out of it.
A script is probably not necessary. A calculated variable, using the variable holding the filename, will probably do this.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 15, 2021 at 3:50 pm
Just a guess as to my strategy - I'm using variables to hold the values of each field as the process iterates through each row of the .csv file. Can I get by with just a single Data Flow Task and still harvest the date from the .csv file name, assign the constant "EVQ", pull out the Name and Hours for each row and iterate through the file and write it to SQL Server table then go onto the next of the 14 .csv files?
Why iterate through the rows in the individual files at all though? Why not insert all the data in one go? If you need to file path in the data, as a column, you can use a Derived Column Transformation, and just have the expression for said column be the variable the path is in. The only looping you should need is the ForEach Loop on the files; nothing more.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
April 15, 2021 at 4:01 pm
DaveBriCam wrote:Just a guess as to my strategy - I'm using variables to hold the values of each field as the process iterates through each row of the .csv file. Can I get by with just a single Data Flow Task and still harvest the date from the .csv file name, assign the constant "EVQ", pull out the Name and Hours for each row and iterate through the file and write it to SQL Server table then go onto the next of the 14 .csv files?
Why iterate through the rows in the individual files at all though? Why not insert all the data in one go? If you need to file path in the data, as a column, you can use a Derived Column Transformation, and just have the expression for said column be the variable the path is in. The only looping you should need is the ForEach Loop on the files; nothing more.
That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 15, 2021 at 4:03 pm
That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂
How are you getting that value into the table though?
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
April 15, 2021 at 4:03 pm
To get the date would I Take the file name and a variable like DATE and do something like a:
SELECT RIGHT ([FILENAME], 10) AS [DATE] to get the date out of the file name: "ConvoEVQ2021-03-22" ?
Do I need a variable for the FileName and the Date itself that will be going into the Date field in SQL Server? i.e. 2 variables
April 15, 2021 at 4:07 pm
I can't find the Derived Column Transformation in my Toolbox
April 15, 2021 at 4:08 pm
Phil Parkin wrote:That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂
How are you getting that value into the table though?
Derived column using the calculated variable.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 15, 2021 at 4:08 pm
I can't find the Derived Column Transformation in my Toolbox
It is available only in the Data Flow.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
April 15, 2021 at 4:10 pm
Thom A wrote:Phil Parkin wrote:That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂
How are you getting that value into the table though?
Derived column using the calculated variable.
But that's what I said? I'm confused. How is your version different to mine?
I can't find the Derived Column Transformation in my Toolbox
It'll be there; it's been there for as long as I can remember (and memories of BIDS are not fond memories). In recent versions it's listed under "common" rather than "Other Transformations"
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
April 15, 2021 at 4:14 pm
It's simply not there
April 15, 2021 at 4:15 pm
Sorry Thom, you are right. My brain transformed this
you can use a Derived Column Transformation, and just have the expression for said column be the variable the path is in
into this
you can use a Derived Column Transformation with a really complicated expression which re-evaluates the required string, for every row in the source file
!
Can't blame a lack of coffee this time, need to think of another excuse.
The absence of evidence is not evidence of absence
- Martin Rees
The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
- Phil Parkin
Viewing 15 posts - 1 through 15 (of 55 total)
You must be logged in to reply to this topic. Login to reply