Best way to import data...

  • Hello everyone,

    My question is: What is the best way to design the SSIS package when importing data? And what I mean by that is we have X number of files coming in and they are associated with a table in the database...When designing the package, would it be best to have each load as a separate package and then have one big package that runs each data load package OR just have one huge package where it loads each file into each table? They are pretty big files and will take some time to load so I am not sure if breaking each one down to load them or just do it one package?

    My apologies if that sounds confusing and I can try to clarify any confusion...

    Thanks in advance for any response?

  • Ok, so the question really boils down to big monolithic package or singles with a process over the top to call them?

    I hate to do this, but, it depends, at least in my mind.

    How many files and tables are we talking about here?

    I'll give a real-world example. A few years ago, I had to do a truncate/load of about 23 tables from one SQL server to another. This included about 17 discrete tables (mostly under 100,000 records) and 3 pairs of related tables with a lot of records, 500K-4 million. I built it all in a single package but with 5 data-flows. The idea was that I could get the smaller items done while the bigger items ran asynchronously.

    I have attached a screen print of a mockup of what I did. Even if I had the actual code I wouldn't be able to screen print it since I don't own it.. Sorry.

    CEWII

  • Thank you for the quick response...

    It it like 9 flat files that will get loaded into their respective table...But the thing is these files are supposedly huge like over a gig...that is why i wasnt sure if i needed to do each one separate or make one big package with all the files and tables...

  • My personal preference is to make a package for each file.

    That way everything is neatly modulare and seperated (especially the logging).

    But as Elliot said: It depends 😀

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • While I am in general agreement with Koen, the tables I was dealing with tended to require being loaded together, always.

    If ALL of the tables are large, and really, 1GB isn't terribly large, then with only 9, I'd probably do one package with some parallelization. You can see this in this article:

    http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/70346/[/url]

    With all that said, what do you think and why?

    CEWII

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply