Query performance analysis

  • All,

    I'm looking at a particular issue and I'm not sure if my understanding or my approach are correct. Performance analysis is an area I'm trying to learn more about. My reason for posting is as much to improve my knowledge of how to deal with these type of issues as solving this particular issue.

    I'm not asking for people to write out all the details of how something works when it's documented online. I'm happy for people to advise me where I'm wrong and suggest that I go and read about a certain topic.

    The issue is as follows:

    I have two servers, which I call A and B, which have almost identical tables. There are slight differences in row counts in some tables.

    The server specifications and SQL CPU configuration are the same as far as I can find.

    I am looking at two queries. One is the problem query and one is one I looked at for reference.

    On both servers the reference query has an execution plan cost of around 45000 (based on the subtree cost of the first node in the plan) and runs in about 30 minutes (It's a data warehousing type processing query not something a user is waiting for)

    The problem query has a cost of around 35 on both servers. On server A it runs in 8 seconds. On Server B it runs for over an hour before I stop it. I'm trying to understand the vast difference in runtime.

    As the execution costs are almost the same can I rule out a difference in the execution plans being the reason for the significant difference in runtime? I don't think the content of the tables is the cause of the issue but maybe I'm wrong.

    On server B I did some analysis with the following query:

    USE Db_Admin
    SELECT er.session_id,es.host_name,es.program_name, login_name, er.status,command, er.blocking_session_id,er.wait_type,er.wait_time,
    DB_NAME(er.database_id) AS db_name,
    text, query_plan
    FROM sys.dm_exec_sessions es
    INNER join sys.dm_exec_requests er ON es.session_id=er.session_id
    outer APPLY sys.dm_exec_sql_text(er.sql_handle)
    outer APPLY sys.dm_exec_query_plan(er.plan_handle)
    WHERE er.session_id<>@@SPID
    AND host_name IS NOT null

    When I started the problem query the above shows it staying in 'runnable' status for 8 minutes. I don't see that on any other queries on our system. During those 8 minutes several other queries started and almost immediately went into running status and finished in reasonable times.

    The only reason a query would stay in 'runnable' status is if something is stopping it from acquiring the CPU threads it needs?

    For a lot of the 8 minutes the analysis query only showed the problem query so there should be CPU resource available.

    How do I try and diagnose why it sits in runnable status for that length of time?

    Once the query went into running status. It stayed in running status and there was no blocking_session_id. That means it is not being blocked and is using CPU resource? If it was waiting for IO or network resource it would be in suspended status and would show a wait type?

    Thanks

     

    • This topic was modified 8 months, 1 week ago by  as_1234. Reason: Rephrased one section for clarity
    • This topic was modified 8 months, 1 week ago by  as_1234.
  • Are both servers running the same SQL Server version?

    Are both databases using the same compatibility level?

    Are both databases using the same transaction isolation level?

    Are both databases/tables using the same collation?

    Do both servers have relatively the same traffic?

    You might want to track blocking & waits on both servers when the queries are executing.

    Are you looking at estimated execution plans or actual execution plans generated when the queries executed?

    • This reply was modified 8 months, 1 week ago by  ratbak.
  • Duplicate created when original appeared to time out.

    Deleted.

  • Hello,

    Thank you for your help.

    ratbak wrote:

    Are both servers running the same SQL Server version?

    Are both databases using the same compatibility level?

    Are both databases using the same transaction isolation level?

    Are both databases/tables using the same collation?

    These are all the same.

    ratbak wrote:

    Do both servers have relatively the same traffic?

    You might want to track blocking & waits on both servers when the queries are executing.

    Are you looking at estimated execution plans or actual execution plans generated when the queries executed?

    The traffic levels are similar. If anything it's probably lighter on server B.

    I'll have a look at blocking and waits.

    In terms of plans. I've compared the estimated plans and I've compared the plans provided by this query:

    USE Db_Admin
    SELECT er.session_id,es.host_name,es.program_name, login_name, er.status,command, er.blocking_session_id,er.wait_type,er.wait_time,
    DB_NAME(er.database_id) AS db_name,
    text, query_plan
    FROM sys.dm_exec_sessions es
    INNER join sys.dm_exec_requests er ON es.session_id=er.session_id
    outer APPLY sys.dm_exec_sql_text(er.sql_handle)
    outer APPLY sys.dm_exec_query_plan(er.plan_handle)
    WHERE er.session_id<>@@SPID
    AND host_name IS NOT null
  • Please provide the table structures with indexes assigned.

    This could be tip of the iceberg.  Proper table joins, database design, and normalization could

    provide good performance.   I would look at your hardware, CPU, memory, Disk I/O, and

    networking.

     

     

    DBASupport

  • The cost doesn't really tell you how long a query is going to take - or should take.  When reviewing an execution plan - you want to look at each operator and see how many rows it thinks will need to be read/processed (estimated) - and the number of rows that were actually read/processed.

    I would look at statistics on the relevant tables - and make sure those are up to date.  If the stats are not up to date that could cause the type of issue you are seeing.

    Is the query exactly the same on both instances?  Or - are you using different parameters for each query?

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • All,

    Thank you for your replies and suggestions. Sorry I haven't been back to this topic, I've had some other unplanned things to sort out. When I get back to it, hopefully this week, I'll update the topic if I solve the issue or find anything worth adding.

    Jeffrey Williams wrote:

    Is the query exactly the same on both instances?  Or - are you using different parameters for each query?

    It is exactly the same. I copied and pasted it from one to the other.

    • This reply was modified 8 months ago by  as_1234.
  • as_1234 wrote:

    All,

    Thank you for your replies and suggestions. Sorry I haven't been back to this topic, I've had some other unplanned things to sort out. When I get back to it, hopefully this week, I'll update the topic if I solve the issue or find anything worth adding.

    Jeffrey Williams wrote:

    Is the query exactly the same on both instances?  Or - are you using different parameters for each query?

    It is exactly the same. I copied and pasted it from one to the other.

    My first step would be to rebuild the statistics on the tables being used.  Then I'd add OPTION (RECOMPILE) to the query and try running it again.

    If the problem persists after that, my recommendation is to read and follow the article at the second link in my signature line below to provide us with enough information to help with this problem.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • All,

    Thanks for your help.

    I updated the statistics today and it seems to have fixed the issue. I will retry the code over the next few days to see if it reoccurs.

  • Thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've discovered that the table is updated overnight and the issue re-occurs overnight.

    If I run the code in the morning then it takes too long to complete. If I then update the stats and then run it then it uses a different query plan and completes OK.

    I've used stats_date to verify that the statistics are not updating after the overnight update.

    Auto_update_statistics is switched on.

    I've been doing some research on how the stats update process works.

    Am I correct in understanding that SQL server only looks at the number of rows changed to determine whether the stats need updating? Or does it take other things into consideration as well?

    Thanks

    • This reply was modified 7 months, 2 weeks ago by  as_1234.
  • as_1234 wrote:

    I've discovered that the table is updated overnight and the issue re-occurs overnight.

    If I run the code in the morning then it takes too long to complete. If I then update the stats and then run it then it uses a different query plan and completes OK.

    I've used stats_date to verify that the statistics are not updating after the overnight update.

    Auto_update_statistics is switched on.

    I've been doing some research on how the stats update process works.

    Am I correct in understanding that SQL server only looks at the number of rows changed to determine whether the stats need updating? Or does it take other things into consideration as well?

    Thanks

    Please see the following link for the answer to that question in both Pre 2016 environments and after.

    https://learn.microsoft.com/en-us/sql/relational-databases/statistics/statistics#auto_update_statistics-option

    I'll also state that if your indexes have "ever-increasing" keys, you may have to update stats a whole lot more often than the auto status updates will.

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks for your reply.

    All: Thank you for your help. I put in some code to update the stats and it seems to be solve the issue consistently and now I have a better idea what to look for if I get a similar issue again.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply