Is SSIS really that bad?

  • My recommendation would be to try it out.  You are licensed for it; give it a shot.

    Build up a proof of concept data migration and compare it to the C# thing that your developers came up with and include things like:

    1- maintenance <-- which code is easier to maintain

    2 - code reuse <-- which code is easier to reuse when you want to move more data

    3 - readability <-- which code is easier for you and your team to read and understand what is happening

    4 - performance <-- which code performs better

    5 - Schedule <-- are you doing a 1-time data migration OR are you doing a repeated on a schedule data migration

    Trial and error goes a long way, and once you determine which method you prefer to migrate the data, you may just prefer one method over another.

    We use SSIS pretty heavily for our data loads and have hit some snags with it, but overall we like it.  One thing to watch out for is SSIS operates in its own memory space (similar to what a C# app would do) and not inside the SQL Server memory space.  The reason I say to watch out for that is we had a few out of memory exceptions the first time we set it up as SQL was configured to (and was) use most of the memory and we didn't have enough for SSIS.

    In the end, it is a matter of picking the right tool for the job and the tool that works for you and that you can support.  When the CEO of your company comes by your desk and says "we need this data migrated as soon as you possibly can", you want to make sure you are using a method you trust and can support.  If you hit go and it fails to migrate the data, you don't want the CEO standing over your shoulder while you google and post on forums asking why your package failed OR your C# code failed.  Sometimes you will have to (we all run into obscure errors at times), but you want to be able to self-investigate as well as search online for the problem.

    If in your situation, they need the data moved as soon as possible, I would be more inclined to use the C# method for now because you have a team of people who are skilled in that who can help with the data migration and in the future, you could work/test more on the SSIS side when you are not under pressure.  That being said, if you have time right now to play around with both, it might not hurt to try things out and see which method you and your team prefer and can support long term.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Rod at work wrote:

    Chris Harshman wrote:

    Over the years as both a Database Developer and DBA, I've found SSIS to be a good tool.  As others have said, not necessarily the best tool, but it exposes many capabilities that would be difficult to reproduce in .Net or PowerShell or some other programming language.  I agree with Jack's comment about building small SSIS packages that each perform one task, and I also typically focus on the extract and load portions of ETL within SSIS, and do most of my actual transformations with plain old T-SQL.

    I find it funny that .Net developers think SSIS was designed for DBA's.  In my perspective, so much of it is in the developer's terminology and mindset.  If it was written for DBA's, it probably wouldn't have all of the strange datatypes you have to convert and map to, and would probably be more script based than drag-and-drop, point and click property based.

    I do apologize for my colleagues mindset concerning DBAs and SSIS.

    Heh... it's OK, Rod.  I actually thought it was written for GUI Developers that might not have a good handle on databases. 😀  The ol' "Frame 52 Split" exists even outside of submarines. 😀  Please extend my apologies to your colleagues. 😀 😀 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  •                 //-------------------------------------------------------------
    // ItemChange
    bool ItemChange;
    if(Commission_ID != LastItem){
    ItemChange = true;
    }else{
    ItemChange = false;
    }



    //-------------------------------------------------------------
    // LastItem
    LastItem = Commission_ID;
    /*
    * orig:
    $LastItem := Commission_ID;
    return Commission_ID;
    * */


    //-------------------------------------------------------------
    // LastChangePerDay
    bool LastChangePerDay;
    if(ItemChange){
    LastChangePerDay = true;
    }else{
    LastChangePerDay = (Effective_Date != LastDate);
    }


    //-------------------------------------------------------------
    // LastDate
    LastDate = Effective_Date;


    //-------------------------------------------------------------
    // EndDate
    DateTime EndDate;
    if(ItemChange){
    LastEndDate = Effective_Date;
    EndDate = DateTime.Now;
    }else{
    NewEndDate = LastEndDate;
    LastEndDate = Effective_Date;
    EndDate = NewEndDate;
    }


    //-------------------------------------------------------------
    // StorageTime
    int StorageTime;
    if(
    ((LifecycleStatus_ID == 2 || LifecycleStatus_ID == 3 ||LifecycleStatus_ID == 6 ||LifecycleStatus_ID == 103) ||
    (LifecycleStatus_ID == 102 && Value == 1) ||
    (LifecycleStatus_ID == 107 && Value == -1)) && Ordering_Dealer_BTG_ID != 4){
    TimeSpan duration = EndDate - Effective_Date;
    StorageTime = duration.Days;
    }else{
    StorageTime = -1;
    }



    //-------------------------------------------------------------
    // Corr_Retailing_Dealer_ID
    int Corr_Retailing_Dealer_ID;
    if(BusinessTypeGroup_ID == 2){
    Corr_Retailing_Dealer_ID = Ordering_Dealer_ID;
    }else{
    Corr_Retailing_Dealer_ID = Retailing_Dealer_ID;
    }


    //-------------------------------------------------------------
    // Corr_Retailing_Dealer_BTG_ID
    int Corr_Retailing_Dealer_BTG_ID;
    if(BusinessTypeGroup_ID == 2){
    Corr_Retailing_Dealer_BTG_ID = Ordering_Dealer_BTG_ID;
    }else{
    Corr_Retailing_Dealer_BTG_ID = Retailing_Dealer_BTG_ID;
    }


    //-------------------------------------------------------------
    // PK_Sales
    PK = PK + 1;

     

    Now some would argue you could pack this in CASE Statements which sounds to an extend doable and fine but if you just need for a tiny little piece of the whole Data Import Logic C# writing everything in C# seems to me like: Why the hell would you do that? No I mean seriously, what I see happen is C# with EntityFramework or some other bulls*hit being used because it makes development soooo much quicker and easier and kicks any attempt at performance right out of the window forever - until you rewrite a proper SQL Statement and include it in C# again ...

    I do ask for enlightenment how such a scenario is better off in C# rather than just throwing the part actually needed in C# into a Script Task. It's not any easier to maintain, you have to know C# for any of it. Just wait until you have to discover the first time that your custom dll needs TRUSTWORTHY=ON and compliance asks you "well, why is this dll not trustworthy?" and ofcourse not to forget your SQLCLR wait stats shooting through the roof.

    I don't say C# code can't perform well, it's rather "well Assembler can perform well, too." and one of the fastest random ID generators I've seen wasn't written in C# but Assembler so please enlighten me.

    I like how you can easily organize in which order which SQL Script is executed, especially the visualization makes it easy to get a rather fast view where at what point which script is being run and why. Asides that you don't need to do anything in SSIS by GUI, you can use the BIML instead which is aswell rather neat if you need to do the same thing for 20 tables.

    Plus to all the "TSQL Only" experts here:

    Care to know out of your head how to do something like this(see attachment)? Because I don't and it's a quick way to get going.

    Attachments:
    You must be logged in to view attached files.
  • Jeff Moden wrote:

    I thought they got rid of everything that was 32 bit (tongue in cheek on that comment).

    Unfortunately, the problem is more related to the fact that people don't check what drivers they are installing when they install ACE (as ACE and Office are still available in 32 and 64bit editions). If you install the 32bit version of the drivers, and you have a 64 bit version of SQL Server (which on modern versions is the only one you can have) then the driver won't be found; as SQL Server is looking for the 64bit version.

    As a result, if the "DBA" does install the 32bit driver, they need to tell SSIS to run in 32bit mode, to use the 32bit driver. That's all, and logically it does make sense. it's not so much SSIS's fault, and more the fault of whomever installed the (wrong) driver, in my view.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • The main confusion I've experienced with the Access Database Engine redistributable is based on trying to install it on a developers computer.  Since Visual Studio is still a 32 bit program, it runs things differently than they will run on a server which is going to be 64 bit.  Also just trying to get Access Database Engine installed on a computer that already has Microsoft Office on it can be a challenge itself.  It was much easier to work with back in the MDAC days, where one package installed all the data drivers you needed in a simple way.

     

  • Visual Studiom and SSMS, only being 32bit really confuses me. it honestly makes no sense. Especially when their "younger brother" VSCode and ADS are only available as 64bit applications.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Mr. Brian Gale wrote:

    My recommendation would be to try it out.  You are licensed for it; give it a shot.

    Build up a proof of concept data migration and compare it to the C# thing that your developers came up with and include things like:

    1- maintenance <-- which code is easier to maintain

    2 - code reuse <-- which code is easier to reuse when you want to move more data

    3 - readability <-- which code is easier for you and your team to read and understand what is happening

    4 - performance <-- which code performs better

    5 - Schedule <-- are you doing a 1-time data migration OR are you doing a repeated on a schedule data migration

    Trial and error goes a long way, and once you determine which method you prefer to migrate the data, you may just prefer one method over another.

    We use SSIS pretty heavily for our data loads and have hit some snags with it, but overall we like it.  One thing to watch out for is SSIS operates in its own memory space (similar to what a C# app would do) and not inside the SQL Server memory space.  The reason I say to watch out for that is we had a few out of memory exceptions the first time we set it up as SQL was configured to (and was) use most of the memory and we didn't have enough for SSIS.

    In the end, it is a matter of picking the right tool for the job and the tool that works for you and that you can support.  When the CEO of your company comes by your desk and says "we need this data migrated as soon as you possibly can", you want to make sure you are using a method you trust and can support.  If you hit go and it fails to migrate the data, you don't want the CEO standing over your shoulder while you google and post on forums asking why your package failed OR your C# code failed.  Sometimes you will have to (we all run into obscure errors at times), but you want to be able to self-investigate as well as search online for the problem.

    If in your situation, they need the data moved as soon as possible, I would be more inclined to use the C# method for now because you have a team of people who are skilled in that who can help with the data migration and in the future, you could work/test more on the SSIS side when you are not under pressure.  That being said, if you have time right now to play around with both, it might not hurt to try things out and see which method you and your team prefer and can support long term.

    I like your suggestions here and will try to do it. I am not hopeful as I think it most likely our supervisor will just demand going with a C# solution rather than SSIS. He, too, is influenced by the idea that SSIS isn't worth the time to learn how to use, due to perceived problems.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • DinoRS wrote:

    Plus to all the "TSQL Only" experts here:

    Care to know out of your head how to do something like this(see attachment)? Because I don't and it's a quick way to get going.

    I don't know but that's only because I've not had to work with Hadoop yet. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My opinion, overall SSIS is nice and easy to work with.  Has a bit of a learning curve and I think as long as you aren't trying to do anything overly complex, it is easy to pick up, set up and test things out.

    Once you get into the scripting, it can get a bit more messy OR if you are using a lot of transforms on the data it can get ugly.  But, if you follow the keep it simple idea and treat each SSIS package like you would a C# function, maintenance gets a lot easier.  It is nice when you open up an SSIS package and you have 10-15 well-named steps total in it.  Easy to figure out what is going on; easy to follow the flow of the package; easy to maintain.  When you open one up with 1000's of steps and you can't even tell what anything is doing because they used poorly named steps and have so many in the package that you need to zoom in and scroll around to figure out what is happening, you will want to beat that developer with your keyboard.  NOTE - I have been that developer and I do want to beat my past self with my keyboard at times.

    I think a lot of times when someone has "perceived problems", it is more they read a blog or forum post once that said the technology was bad and nobody should use it.  SOMETIMES those posts are accurate, or accurate at the time, but sometimes things change and the technology improves.  My opinion, SSIS is the right tool for this job.  It is what it is made for.

    It may be worth asking your supervisor why he is against SSIS and then determine if his concerns have any merit.  Then again, depending on your supervisor, it may be easier to say "Yes Sir/Ma'am" and just do it in C#.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Mr. Brian Gale wrote:

    It is nice when you open up an SSIS package and you have 10-15 well-named steps total in it.  Easy to figure out what is going on; easy to follow the flow of the package;

    That is one really good thing about SSIS... packages inherently contain a functional flow chart.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have hardly ever used SSIS, but have seen it used extensively by others.  Like any tool, if you get to know it then you know what can be done quickly and what is best done using a different technique.  It is reliable and when written well it runs fast.

    The main drawback of SSIS is that not many people know how to use it, which breeds a reluctance to get to know it.  Most people like to learn things that will be of use when they move on in their career, and SSIS is a niche.  Some organisations really want SSIS specialists, but most could not really care.

    If you want to learn how to use it then do so.

    Consider setting up a template that will help in writing your data moves.  The template could have pre-coded all your environments (Dev, Test, Prod) and their respective connection strings (SQL, Postgres, File, Azure Blob, AWS S3, etc).  Also include whatever reporting you need, and the main steps you need for your data move and allow it to do auto mapping where column names and definitions match.  A useful add-on would be Data Quality Services to parse and auto-correct data according to your MDM definitions.  Also, the whole thing could be contained within your favourite source control system.

    When you create a new package based on the template you would remove connections that were not needed, and add any coding required for items that did not auto match. At run time you supply the parameters for environment, input source, output destination, and it is all done.  A new data transformation from requirements through to testing and ready for production in less than a day.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • SSIS is not as bad as people make out, purely because people don't know how to use it. I think DTS was better (but im an old guy) -

    MVDBA

  • I have used TSQL and SSIS and my thoughts are:

    • There is a learning curve with SSIS and you need to use it in the appropriate situations. It's not worth creating an SSIS package for ad-hoc imports where you only import a few tables a few times.
    • As someone else said, there is the advantage that you have a self-documenting flow diagram. SSIS 2012 and later has excellent debugging tools, data flow watchers, etc.
    • To keep things simple avoid using the script task to call VB.NET or C#.NET. Also if you can, don't connect to Access or Excel as these need the ACE OLEDB drivers. (Just to be clear, these things are all possible and are supported, but they can cause obscure problems).
    • One powerful advantage is the SSIS package can be transactional. You can have a complex flow, import data, run SPs, do lots of updates and at the end you COMMIT or ROLLBACK the entire flow. This did not exist in the DTS world and made it very difficult to trouble shoot a failed DTS package.
    • It's probably only worth creating an SSIS package to automate a "data task" if you're going to save say 5 - 10 days work. In practice it can take a while to get an SSIS package up and running, so if you only save a day or two of work it's not worth it.

    These are purely my own thoughts and others may have different experience.

    • This reply was modified 4 years, 10 months ago by  William Rayer.
    • This reply was modified 4 years, 10 months ago by  William Rayer.
  • I had to teach a bunch of R developers how to get data from CSV (actually ~ delimeted)  how to use the import/export wizard.. and saving to a SSIS package was a godsend.  then just put an sql agent job on that truncates the table and re-imports - show them how to script up the job... all done

    yes this was SSIS for beginners,  but it was worth it to bring in new clients and start to get some nontechnical people thinking about delimiters and file formats

     

     

    MVDBA

  • If I can ever get SSIS to work, I'm hoping to start using it, I used to love bcp, other forms of bulk loads etc, but our systems got reorganized so I don't have access to the windows job scheduler or xp_cmdshell anymore which use to be my goto tools, so hoping to expand my horizons and see what other tools folks might be happier with at work here.

    Unfortunately, made the studio 2019 mistake and can't get SSIS to work right now.

     

Viewing 15 posts - 16 through 30 (of 45 total)

You must be logged in to reply to this topic. Login to reply