SSIS Package to load contents of CSVs to SQL Table

  • I have 14 .csv files from which I need to extract 1) a date found only in the file name itself e.g. "ConvoEVQ2021-03-22" (i.e. I need to harvest the date part of the file name and place it in all the date fields of each SQL record associated with that .csv), 2) an individual's full name from inside the file (to be stored as one field), 3) a constant literal stored in a field called Domain, and 4) a time in hours to be stored as a string in SQL.  So the records written to SQL would look like:

    DATE                 NAME             DOMAIN      HOURS

    2021-03-22     Fred Brown          EVQ          04:14:19

    2021-03-22     Flisa Robins         EVQ          06:27:53

    2021-03-22     Pablo Ortiz           EVQ          03:45:10

    I plan to use a "Foreach Loop Container" to process the files in SSIS but I need some guidance about how to create the individual "Execute SQL Task"s and the "Data Flow Task" itself into SQL Server. I have no idea how to extract the date from the .csv file name and write it to the SQL table, write a constant into the SQL table, and pull out the name and hours.

    I have created Variables to hold all four fields and have created four "Execute SQL Tasks" (each referencing a variable) inside the "Foreach Loop Container" but I need help configuring everything.

    Any help on this would be greatly appreciated for this mid-level SSIS developer!

    • This topic was modified 3 years, 7 months ago by  DaveBriCam.
  • Why do you need an Execute T-SQL Task here? A (simple) Data Flow Task with just a Flat File Source and OLEDB/SQL Server Destination should be more than enough here. Why are you using variables to hold the values (for each row?) when you should just be using a Dataflow Task that very much looks like this:

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Well your for each loop should have a variable with the current file it's processing.  You'll need to do script foo on that to get the date out of it.  Then it's just a data flow with the file to your table mapping the relevant fields from the file with the date you extracted from the file name and the constant in the other fields.

  • Just a guess as to my strategy - I'm using variables to hold the values of each field as the process iterates through each row of the .csv file. Can I get by with just a single Data Flow Task and still harvest the date from the .csv file name, assign the constant "EVQ", pull out the Name and Hours for each row and iterate through the file and write it to SQL Server table then go onto the next of the 14 .csv files?

  • ZZartin wrote:

    You'll need to do script foo on that to get the date out of it.

    A script is probably not necessary. A calculated variable, using the variable holding the filename, will probably do this.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • DaveBriCam wrote:

    Just a guess as to my strategy - I'm using variables to hold the values of each field as the process iterates through each row of the .csv file. Can I get by with just a single Data Flow Task and still harvest the date from the .csv file name, assign the constant "EVQ", pull out the Name and Hours for each row and iterate through the file and write it to SQL Server table then go onto the next of the 14 .csv files?

    Why iterate through the rows in the individual files at all though? Why not insert all the data in one go? If you need to file path in the data, as a column, you can use a Derived Column Transformation, and just have the expression for said column be the variable the path is in. The only looping you should need is the ForEach Loop on the files; nothing more.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thom A wrote:

    DaveBriCam wrote:

    Just a guess as to my strategy - I'm using variables to hold the values of each field as the process iterates through each row of the .csv file. Can I get by with just a single Data Flow Task and still harvest the date from the .csv file name, assign the constant "EVQ", pull out the Name and Hours for each row and iterate through the file and write it to SQL Server table then go onto the next of the 14 .csv files?

    Why iterate through the rows in the individual files at all though? Why not insert all the data in one go? If you need to file path in the data, as a column, you can use a Derived Column Transformation, and just have the expression for said column be the variable the path is in. The only looping you should need is the ForEach Loop on the files; nothing more.

    That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin wrote:

    That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂

    How are you getting that value into the table though?

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • To get the date would I Take the file name and a variable like DATE and do something like a:

    SELECT RIGHT ([FILENAME], 10) AS [DATE] to get the date out of the file name: "ConvoEVQ2021-03-22" ?

    Do I need a variable for the FileName and the Date itself that will be going into the Date field in SQL Server? i.e. 2 variables

  • I can't find the Derived Column Transformation in my Toolbox

  • Thom A wrote:

    Phil Parkin wrote:

    That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂

    How are you getting that value into the table though?

    Derived column using the calculated variable.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • DaveBriCam wrote:

    I can't find the Derived Column Transformation in my Toolbox

    It is available only in the Data Flow.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin wrote:

    Thom A wrote:

    Phil Parkin wrote:

    That means that the derived column needs to be recalculated for every row in the file ... potentially not very efficient, given the unchanging file name. I prefer my idea 🙂

    How are you getting that value into the table though?

    Derived column using the calculated variable.

    But that's what I said? I'm confused. How is your version different to mine?

    DaveBriCam wrote:

    I can't find the Derived Column Transformation in my Toolbox

    It'll be there; it's been there for as long as I can remember (and memories of BIDS are not fond memories). In recent versions it's listed under "common" rather than "Other Transformations"

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • It's simply not there

  • Sorry Thom, you are right. My brain transformed this

    you can use a Derived Column Transformation, and just have the expression for said column be the variable the path is in

    into this

    you can use a Derived Column Transformation with a really complicated expression which re-evaluates the required string, for every row in the source file

    !

    Can't blame a lack of coffee this time, need to think of another excuse.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 15 posts - 1 through 15 (of 55 total)

You must be logged in to reply to this topic. Login to reply