Query performance changes when run in SSIS

  • Hello,

    We recently went through a round of performance tuning our SSIS packages. We identified some data flows that could be streamlined and fixed some queries that had been running slowly in production. Sometimes there would be queries that would take 20+ minutes to run when they should have taken at most a minute. After testing on the TEST environment everything looked good and performance had been noticeably improved. Our problem queries were not giving us any more problems. On our very first run in production the server goes to 100% CPU and just stays there. The package is stuck on a query that when run in SQL Management Studio takes 3 seconds on production. When we look to see what is causing the slow down we see the query waiting with a wait type of CXPACKET. This is the same wait type that we had seen earlier but now it is happening for a different set of queries.

    My question is why is it that when the query is run in SQL Management Studio we get fantastic performance but put that same query into SSIS and the server is brought to its knees?

    Thanks.

  • It could be due to parameter sniffing. Check if the plans generated are different when running from your package and SSMS.

    When running from your package, are you using different datatype for parameters than what is specified on the table?

    -Roy

  • The queries don't have parameters.

    One of the queries did have two CTEs. When we rewrote the query using derived tables instead of the CTEs we didn't get the CXPACKET waits. The execution plan for both versions of the query though were identical.

    While not exactly the same query, we do use the CTE "version" of the problem query in several other data flows in the package and they all work run the same in SSIS and SSMS.

  • If you haven't already, double check that your connections in the SSIS package, and your configurations, are pointing where you think they are. May just be a typo.

    Next step to determining the issue would be to trace the server in question and pull the full XML plan when the SSIS package runs that process and see where the difference lies.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I am not sure why when running on SSMS it does not show any trace of parallelism. Do you have hyper threading enabled?

    You could try to band-aid of using a MAXDOP hint of 1 to get rid of parallelism.

    Just out of curiosity, what is the size of the DB and how much RAM do you have? And your disks, are the RAID config or SAN? If RAID, what is the RAID config?

    -Roy

  • Are these SQL tasks or dataflows?

    if the latter, are the queries in the DF source or have You said "use table / view" option? If that, then your problem might lie there.

    When using a table or view as a source (without saying Select col1, col2 etc) SSIS issues a SET ROWCOUNT 1 to get the definition back.

    On occasion, SQL will cache this query plan and try use it when you run the real query.

    it's bad practice to use a table or a view as a source.

    I've had it bite badly in the past.

    if you not using DFs then.... :hehe:

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

  • We are using data flows where the source is a SQL command. The queries have been gone over with several times to make sure that they using the right indexes and are as tight as possible. Despite all of this when SSIS gets a hold of the queries all that effort goes out the window.

    A new wrinkle is that when we run the package through the package execution utility the CPU goes to 100% and the server locks up. Running the same package as a job and everything works fine.

  • is there any "dynamic" sql in the query? Can you change any parameters to hardcoded values as a test

    Assuming you've confirmed the queries are the same through profiler...

    Run DBCC FREEPROCCACHE (understand it if you going to do it on a prod server!!!)

    How many rows are you expecting the query to return? if SQL is rendering the data but SSIS is struggling to consume it into buffers, could this be a problem? is there a memory contention? Are we talking 100s of rows, 100k's or millions?

    if on SQL you run your query into a temp table, how long does it take to complete?

    Are you able to run the package on a different machine which would divorce SSIS / SQL and stop any memory issues.

    what's the difference in spec between the test server and prod? it's not unheard of for a lower spec machine to be quicker than a higher spec one as SQL can come up with a better plan on the lower spec and decide to brute-force it on the higher spec.

    You say the plans are identical? Does that mean they both bad or both good? Any large hash joins thrashing the disks?

    thinking out loud... :w00t:

    final thought - what's your data flow like? Lots of ASYNC components?

    Cheers,CrispinI can't die, there are too many people who still have to meet me!It's not a bug, SQL just misunderstood me!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply