SSIS package required to replace Stored Proc?

  • Hi

    I have a SP that does the following:

    1. TRUNCATE TableA

    2. INSERT INTO TableA (field1,field2,field3...)

    SELECT fielda,fieldb,fieldc ...

    FROM TableB

    WHERE fielda <>0

    AND Fieldb in (SELECT fieldx FROM TableC)

    3. UPDATE TableA SET Field4

    TableB has 10 million records, but fielda <>0 will filter out only 500,000 records.

    Would anyone please advise if this is better kept as a Stored Proc, or if an SSIS package could do this faster? The current SP is taking about 6 hours to run, so I am looking for a significant performance improvement.

    If a SSIS package is more efficient, then what are the components I need to achieve this?

    Appreciate any help you may have.

    regards

    K

  • Assuming SQL server is the source and destination..

    You could do this pretty easily with an SSIS and I would expect it to be faster.

    But I'm having a little trouble with the WHY.

    It seems that tables A, B, & C are all on the same server and the same database.

    Also, the IN statement might be replaced by using the EXISTS clause instead.

    Is fielda, fieldb or fieldx indexed?

    Is there a reason why you don't want to do this as a view..

    CEWII

  • To answer your question which components you need:

    1. Execute SQL Task in the Control Flow

    1.a Add a data flow task after the previous task with a precedence constraint (green arrow)

    2. OLE DB Source in the data flow. You can use the select statement here:

    SELECT fielda,fieldb,fieldc ...

    FROM TableB

    WHERE fielda <>0

    AND Fieldb in (SELECT fieldx FROM TableC)

    Maybe re-write it for performance as Elliot W suggested.

    2.a OLE DB Destination. Map the columns of the data flow to the destination columns

    3. Another SQL Task in the Control Flow. This one comes after the data flow task with a precedence constraint.

    As Elliot W mentioned, indexes are very useful to speed up performance. So if you haven't created them yet, please do so.

    Regards.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thanks for the responses.

    I was getting a little lost in the tools that SSIS has to offer, and was looking for an answer that was more complicated than it needed to be.

    Regards

    K

  • Using SSIS in place of SP is not going to resolve the performance issue. SP is always fater.

    This could be issue of indexing. make sure your select and update query is tuned properly by all means

  • vidya_pande (3/23/2010)


    -- SP is always fater.

    I presume you meant faster.

    I'd like to see the proof. 'Always' is a big call.

    SSIS is optimised for data flows pretty much by default. An SP can be optimised, but I would say, when dealing with very large data sets, that there is a good chance that your rookie SSIS package would trounce your rookie SP.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Before doing anything please modify your select query to

    SELECT fielda,fieldb,fieldc ...

    FROM TableB b

    JOIN TableC c

    ON b.fieldb = c.fieldx

    AND fielda <>0

    Using IN clause in SELECT would causing a full table scan. So, always try to replace IN with EXISTS and if possible with a JOIN clause. Also, check for the indexing on columns fieldb, fieldx and fielda.

    I guess it is definitely going to enhance the performance of the query.

  • Thanks for all your responses.

    Comments are acknowledged, and I have picked up a few great tips. This has been very useful.

    regards

    K

  • I'm still questioning whether this is the right path. Can you explain to us why you are wanting to copy up to 9.5M rows instead of using a view?

    You might not have to do what you are looking at doing at all so I think it deserves a review.

    CEWII

  • Hi Elliot

    Thanks for your concern. I had simplified things a little in my original post. The data is actually sitting in different databases, and the source table is part of a system that is being de-commissioned.

    This is all part of a transition/testing process.

    regards

    K

  • So this is a one-time thing. I would definitely use SSIS to do this.

    CEWII

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply