SQL SERVER 2014 BOTTLENECK WHEN WE CHANGE NODES --FAILOVER CLUSTER--

  • Hello, SQL Server newbie here.

    We have SQL Server 2014 with failover clustering. Every time we change nodes, we experience a system slowdown in SQL Server for the first 2 hours. We keep changing nodes to get the performance back, and most of the time changing nodes seems to be okay, but it is a big hassle once the system starts slowing down again.

    The image above shows that our SQL Server is very slow.

    I simply don't know where to look for the cause of this issue.
    I need help with this.

  • On the surface, it looks like you are having a lot of physical disk reads.  This is not too surprising immediately after a restart of the instance.  How large is the database, and how much memory does the node have available to it?  How much of that memory is dedicated to SQL Server?

    Once the two hours pass, does SQL Server performance return to acceptable levels?  Is the transition from unacceptable to acceptable sudden or gradual (over 15 - 20 minutes)?  If it is a sudden transition, you could have something configured to run at startup.  Check SQL Agent for jobs that run at startup, and check for startup stored procedures.
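
    If it is helpful, a quick way to check both (a sketch using standard system views) is:

    -- Stored procedures marked to run automatically at instance startup
    SELECT name
    FROM master.sys.procedures
    WHERE OBJECTPROPERTY(object_id, 'ExecIsStartup') = 1;

    -- SQL Agent jobs scheduled to start when the Agent service starts (freq_type = 64)
    SELECT j.name AS job_name, s.name AS schedule_name
    FROM msdb.dbo.sysjobs AS j
    JOIN msdb.dbo.sysjobschedules AS js ON js.job_id = j.job_id
    JOIN msdb.dbo.sysschedules AS s ON s.schedule_id = js.schedule_id
    WHERE s.freq_type = 64;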

  • When you switch nodes, the server is fast for about 2 hours, and then slows down?  Is that what you are saying?
    When you switch nodes, you are re-starting the server.  The cache is being cleared, you are starting over.  

    Instead of failing the cluster over with the next slowdown, run this and see if things get faster:
    DBCC FREEPROCCACHE

    If that works, then you have a bigger problem. 
    I would start with the configuration.
    Run
    sp_configure 'Show advanced options', 1
    RECONFIGURE
    Then run sp_configure, and publish the results.
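
    Also, if clearing the entire plan cache is a concern, DBCC FREEPROCCACHE will accept a single plan_handle, so one suspect plan can be evicted at a time.  A rough sketch to find a handle (the 'SuspectProcName' filter is only a placeholder):

    SELECT TOP (10) cp.plan_handle, cp.usecounts, st.text
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE st.text LIKE N'%SuspectProcName%'
    ORDER BY cp.usecounts DESC;
    -- then: DBCC FREEPROCCACHE (<plan_handle from the result above>);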

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John - Monday, October 16, 2017 2:19 PM

    When you switch nodes, the server is fast for about 2 hours, and then slows down?  Is that what you are saying?
    - Yes, all of our applications run smoothly for a while, but when the counters look like the image above, every application becomes very slow.
    It is an intermittent problem during the first 2 hours after the SQL Server restart.

    Here are the results of sp_configure.

    For DBCC FREEPROCCACHE -- I have to check whether it is safe to run, and how long it would take.

    Thank you for the reply, I really appreciate your time.

  • crow1969 - Monday, October 16, 2017 1:48 PM

    The size of the DB is now more than 1 TB.  Memory allocation is 512 GB; SQL Server is using 60 - 70% of it.
    --Once the two hours pass, does SQL Server performance return to acceptable levels?--
    Yes, it returns to normal levels, but we are experiencing intermittent slowdowns, which is very frustrating.
    I will have to check on the startup stored procedures.
    No jobs run during the SQL Server restart.

    Once again, thank you for the reply; I really appreciate your time.

    First thing I would do is cap the 'max server memory (MB)' parameter in sp_configure.  Right now SQL Server will take up to a few megabytes less than the physical memory on the box, leaving the OS and any other applications (SSIS, virus scan, etc.) to get paged out into the page file, or to bump SQL Server's pages into the page file.  The recommendation I normally see is that 75% of the physical memory goes to SQL Server and the rest to the OS.  This will set the maximum to no more than 384 GB:

    exec sp_configure 'max server memory (MB)', 393216
    go
    reconfigure with override

    Do not do this when there is a lot of activity on the server, as a lot of dirty pages may have to get written to the data files.  This change would rule out other processes on the machine trying to edge out SQL Server on memory.  You should be able to check in the ring buffers for requests to free up memory.
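
    One way to look at those notifications (a sketch; the ring buffer timestamps are milliseconds of instance uptime, hence the conversion):

    SELECT TOP (20)
           DATEADD(SECOND, (rb.timestamp - si.ms_ticks) / 1000, GETDATE()) AS event_time,
           CAST(rb.record AS XML) AS record_xml
    FROM sys.dm_os_ring_buffers AS rb
    CROSS JOIN sys.dm_os_sys_info AS si
    WHERE rb.ring_buffer_type = N'RING_BUFFER_RESOURCE_MONITOR'
    ORDER BY rb.timestamp DESC;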

    If the problem persists, you may have a rogue process of some sort (a bad procedure plan, or an ad hoc user, could be a lot of things).
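
    For the rogue-plan angle, the cached plan statistics can point at the heaviest readers.  A minimal sketch (this returns the whole batch text rather than the individual statement, which is usually enough to identify the culprit):

    SELECT TOP (10)
           qs.total_physical_reads,
           qs.total_logical_reads,
           qs.execution_count,
           st.text AS batch_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_physical_reads DESC;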

  • crow1969 - Wednesday, October 18, 2017 11:27 AM

    I have narrowed my search to a bad procedure plan and an ad hoc user.
    Recently I pulled the performance I/O logs and found 10 queries that might be causing the bottleneck.
    Also, we are on SQL 2014, but the compatibility level is set to SQL 2008, which uses the older cardinality estimator.  I have no idea whether setting it to SQL 2014 would improve performance.
    I want to find the exact culprit behind these issues.
    Just a question: in your practice, how often do you restart SQL Server?  Do you restart your SQL Server if tempdb is more than 20 GB?

  • lapz_anry - Wednesday, October 18, 2017 12:53 PM

    Rarely do we re-start SQL Servers.  We do quarterly patching, that's it.  Re-starting a server is not a solution for a performance problem. 

    Where did you come up with the 20 GB figure for tempdb?  
    File growth is normal.  If tempdb is growing out of control, then you have an issue. Otherwise, figure out the size you may need it to be, and size the database files appropriately.
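
    If it helps, the pre-sizing can be done in place.  A sketch (tempdev is the default logical name of the primary tempdb data file, and the sizes here are examples only -- check sys.master_files for your actual file names):

    ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 20480MB, FILEGROWTH = 1024MB);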

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John - Wednesday, October 18, 2017 1:14 PM

    Thank you!

    What I meant is that we restart our SQL Server if tempdb reaches 20 GB or more, and that is how it is done at the moment.  I'm sure it is not best practice to restart SQL Server just to resize tempdb.

    Anyhow, I think I need to set the cardinality estimator to SQL 2014, because at the moment it is configured at the SQL 2008 level.
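
    For reference, the compatibility level can be checked and changed per database.  A minimal sketch (the database name is only a placeholder, and it is worth testing first, since the new cardinality estimator can make some plans worse as well as better):

    SELECT name, compatibility_level
    FROM sys.databases
    WHERE name = N'YourDatabase';

    ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 120;  -- 120 = SQL Server 2014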

  • lapz_anry - Wednesday, October 18, 2017 1:45 PM

    What I meant is why would you do that?  What made you decide that this was a good idea??? And how did you come up with 20 GB?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Michael L John - Wednesday, October 18, 2017 2:01 PM

    Well, for me it was not a good idea; I was just following the previous DB admin's practice.  The size of tempdb grows automatically.

  • Let it grow.  I have no real figures to back this up with, but a 20 GB tempdb with a 1 TB user database seems very low to me.  I would expect a tempdb of 50 - 100 GB depending on the workload (higher for OLAP, lower for OLTP).  Once the growth levels off, you will have found how much space the system needs for its daily use.    Cap the max size at say 50% over that, so the tempdb files do not fill the entire disk.

    There are cases where an application can cause unlimited growth in tempdb.  I forget the exact circumstances, but do you have any applications using non-default isolation levels?
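
    A quick way to check both angles (a sketch; file sizes are reported in 8 KB pages, hence the * 8 / 1024):

    -- Databases using row-versioning isolation (these keep a version store in tempdb)
    SELECT name, snapshot_isolation_state_desc, is_read_committed_snapshot_on
    FROM sys.databases;

    -- Current tempdb file sizes and growth settings
    SELECT name, type_desc,
           size * 8 / 1024 AS size_mb,
           growth, is_percent_growth, max_size
    FROM tempdb.sys.database_files;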
