performance problem on Production and test servers but not on local development machines

  • I have several queries that take about 10 seconds to run on any of our local developer workstations, when moved to the production or test server in our colo, the same query takes 5 minutes to run.

    I have taken a backup of the production database and restored it to the local workstation so the database is exactly the same.

    I have compared all of the sql server configuration settings and don't see any differences.

    There is an AV client on the server, but it has sql directories and process excluded and shows no cpu activity when the query is running.

    Production server seems to be CPU bound, pegs CPU utilization at 100% for the entire time the query is running , very little or no disk IO.

    more specifically the slow down seems to occur when joining a temp table into several other tables.

    Most queries seem to run fine in production, or at least performance is not degraded enough to notice.

    Any suggestions on other things to look at?

  • Quote: more specifically the slow down seems to occur when joining a temp table into several other tables.

    In your production environment:

    Check the configuration of TEMPDB database: make sure to move it to adequate storage and pre-size it to avoid auto-growth; create 1 data file per CPU; isolate log from data at the physical disk level.

    Check physical file fragmentation and logical fragmentation: either run DBCC SHOWCONTIG, or use the information in sys.dm_db_index_physical_stats to determine the extent of logical / extent fragmentation. If there is fragmentation, run index defragmentation.

    Hope this helps.

  • A major difference between your development and production environments could be CPU resources, specifically the number of cores. It's possible that in production the Maximum Degree of Parallelism has not been set appropriately, and a parallel plan is killing the server.

    I recommend confirming whether you have a parallel query plan, and if so, consider dropping the Max Degree of Parallelism (set via sp_configure) to a lower value - say 1 or 2. If this then solves the problem, you then need to decide whether to keep it as a server-wide setting, or to include this as hints in your problematic queries.

  • Jim McLeod (7/8/2010)


    A major difference between your development and production environments could be CPU resources, specifically the number of cores. It's possible that in production the Maximum Degree of Parallelism has not been set appropriately, and a parallel plan is killing the server.

    I recommend confirming whether you have a parallel query plan, and if so, consider dropping the Max Degree of Parallelism (set via sp_configure) to a lower value - say 1 or 2. If this then solves the problem, you then need to decide whether to keep it as a server-wide setting, or to include this as hints in your problematic queries.

    Excellent advice. In general, I'd just check the execution plans between the environments to see if they're different. If different, why. I think Jim's answer is pretty likely, but it could be something else.

    Also, it never hurts, check the OS & SQL to be sure they're all at the same service pack & CU level. You never know. Also, so that you're comparing apples to apples, be sure about the speed of their CPU's. I'll never forget the time I was troubleshooting a query and I couldn't figure out why it was running faster in Dev than in Prod when I finally realized that Dev was a newer machine and had a faster CPU than Prod.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks for the ideas guys,

    I neglected one important piece of info. The development workstations are running sql 2008, database is in 2005 mode

    production servers(I've tested 3 different ones are various versions of 2005 sp2 or sp3)

    I did look into tempdb issues suspecting that at first, but don't see any problem there, I tried adding additional files to tempdb and watched the disk IO while running the query(pretty low transfer rates and disk queues never really moving about 1 or 2)

    originally the query used a table variable and it was even slower, but I refactored the query more significantly than just that temp table so not a straight across comparison.

    Its a complicated query but a pretty small dataset, servers don't seem to have any memory pressure, whole database can fit in RAM

    Hardware is definitely different across all machines, but hard to imagine that would result in 10x performance difference, one of the production servers is roughly comparable to my workstation.

    Just tested with MDOP = 1, didn't see any difference

    query plans are a little hard to compare because they are so big, I'll look at that more and see what I can find

    thanks again

  • I'd also suggest looking at parameter sniffing and whether statistics are updated on the production servers as well...both of those things can lead to bad performance.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I have run full Maintenance plans rebuilding indexes and stats, have even restarted the sql instance

    have also tried running query with RECOMPILE so don't think its a cached plan issue

  • Tom Robertson (7/8/2010)


    Thanks for the ideas guys,

    I neglected one important piece of info. The development workstations are running sql 2008, database is in 2005 mode

    production servers(I've tested 3 different ones are various versions of 2005 sp2 or sp3)

    I did look into tempdb issues suspecting that at first, but don't see any problem there, I tried adding additional files to tempdb and watched the disk IO while running the query(pretty low transfer rates and disk queues never really moving about 1 or 2)

    originally the query used a table variable and it was even slower, but I refactored the query more significantly than just that temp table so not a straight across comparison.

    Its a complicated query but a pretty small dataset, servers don't seem to have any memory pressure, whole database can fit in RAM

    Hardware is definitely different across all machines, but hard to imagine that would result in 10x performance difference, one of the production servers is roughly comparable to my workstation.

    Just tested with MDOP = 1, didn't see any difference

    query plans are a little hard to compare because they are so big, I'll look at that more and see what I can find

    thanks again

    Between differences in hardware and different versions of SQL Server, it's really going to be hard to nail down the issue. As much as possible you have to have apples to apples comparisons to understand performance issues. It's OK if there are differences in the number or type of CPU as long as you can identify that as the issue. It's OK if there are differences in the amount of memory, again, as long as you can isolate that. But right now you've got multiple factors at play on a complicated query. There's no way to say for sure what it is because it could be almost anything. There are differences (not huge, but they're there) between the optmizer in 2005 and 2008. Add in the fact of running on 2008 in 2005 compatibility mode... it's hard to say what might be happenning.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I hear what you are saying with not having a great comparison, but I really think the hardware is not the issue.

    the reason I think that is because

    A) the performance difference is so great(10x or more)

    B) the performance is within 25% across wildly different hardware(4 core 4G RAM vs 1 core 1G RAM) in the 3 production servers I've tested

    C) the performance is within 25% on 2 different workstations I've tested that have very hardware(4 core 4G Ram vs 2 core 2G Ram)

    I've looked closer at the query plans and they look essential identical, slight differences in the weight %, but pretty close. No Parallelism in plan.

  • Tom Robertson (7/8/2010)


    I hear what you are saying with not having a great comparison, but I really think the hardware is not the issue.

    the reason I think that is because

    A) the performance difference is so great(10x or more)

    B) the performance is within 25% across wildly different hardware(4 core 4G RAM vs 1 core 1G RAM) in the 3 production servers I've tested

    C) the performance is within 25% on 2 different workstations I've tested that have very hardware(4 core 4G Ram vs 2 core 2G Ram)

    I've looked closer at the query plans and they look essential identical, slight differences in the weight %, but pretty close. No Parallelism in plan.

    Nah, I think the 2005/2008 mismatch is the biggest issue. It's just so hard to know what might be going on differently between the two, especially since Microsoft doesn't release details on what was updated & what wasn't. For example, it's my understanding that statistics are generated and stored slightly differently between the two versions (internal details, not the stuff we can see). That right there could lead to different execution plans, even if the optimizers were identical.

    That's why it's so hard to say "Ah, here's your problem" when we're dealing with unknowns.

    If the plans are showing up with different costs on the same database with the same data & parameters... you have different plans.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • You'd definitely want to compare like-for-like. I recommend installing 2005 on a dev workstation and seeing how the database performs there.

    As an example, in SQL Server 2005, UDFs were changed so that the plan cache was not looked up every time the function ran as in 2000 - instead, it was once per query. If you ran a function 1 million times in one query, that's a lot of plan cache lookups. It could be a similar sort of thing between 2005 and 2008, but it will be hard to pinpoint. If you can reproduce in a development 2005 environment, you should be on a good path to resolve it.

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply