Data migration approach

  • Hello friends, I am just started working in SQL Server. Currently I have a requirement where data has to be migrated from a CSV file to SQL table. It's simple for one time. However I have to load the table as and when new file comes with truncate and load. There is no scheduled frequency and data has to be loaded as and when needed. One more thing, I have to have the control of the process in production environment. If it's a job then we have to go to DBAs to run the job Everytime we need data load.

    So what do you think should be good approach to implement it ?

  • Probably the easiest way is to set up a scheduled check for the existence of the file. If it's there, fire off the load process. If it's not there, wait some period of time (determined by you, is checking very 10 minutes adequate? Adjust up or down as needed.) before checking again. That's a pretty standard method. You should be able to automate the entire process pretty easily using a combination of SQL Agent and either SSIS or Powershell.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks Grant for giving me direction. I'll check it out. One more thing. The text values in my file are enclosed within " for example "sports". And the numeric values are mentioned simply. Is there a way to handle two different kind of separators ?

  • Not an SSIS expert by any means, but...

    I assume you have a delimiter between the fields that translates to columns? If so, you should be able to map those out individually by location within the set of fields.

    This is from distant memory of my work in SSIS, so if I'm off in terms, apologies.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Not sure if this answers your question.

    The Text values should be within quotes anyways, compared to integers or numeric fields for SQL. Your delimiter though should be distinct, you can not have comma values in the data ("Sports,Activity") or it will not work. The other thing you can try looking into is the ERRORFILE property as part of your insert for validation. This will create an error log file for handling bad data.

    Create an SSIS package similar to what is described in this article. This will look for files in a specified network location.

    http://www.sqlservercentral.com/articles/Integration+Services+%28SSIS%29/90571/

  • +1 for using SSIS - it comes as part of the implementation for all versions of SQL other than Express.

    Loading data from a CSV file is a pretty trivial process for SSIS, however if you don't have control over the source of the CSV and you cannot verify that it won't contain weird values (like invalid dates) then error checking can sometimes be a bit painful.  Also make sure that you include some audit details in the SSIS package, typically output the start and end times of the execute, how many lines were read from the source and how many were written to the target(s). Also make sure that any values that could change (e.g. file location, server name, user credentials etc..) are stored in some sort of config [there are numerous options] rather than being hard coded in the package.

    also don't forget that SSIS packages are code just like anything else and they need to be tested, documented, placed under source code control and follow a release process just like application code.

    SSIS can be called from a command line using windows services, SQL agent jobs, Powershell or from inside a .net application if you need the loader to be user controlled.

  • sqlenthu 89358 - Thursday, January 5, 2017 5:01 AM

    Thanks Grant for giving me direction. I'll check it out. One more thing. The text values in my file are enclosed within " for example "sports". And the numeric values are mentioned simply. Is there a way to handle two different kind of separators ?

    Actually, it's pretty easy to do with a "BCP Format File" .  Although not in the documentation, end of field delimiters with quotes can be multi-character and the quotes are escaped using a backslash in the delimiter field of the format file.  And, it runs nasty fast with Bulk Insert.

    And, no... you don't actually need SSIS for any of this unless you want to build process flow control in it instead of relatively simple T-SQL.  For sure, you don't need PowerShell for any of this.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply