Horrible performance using sys.dm_os_performance_counters

  • Hi,
    We are in process of buying a third party application (NewRelic) that does a query the SQL 2016 server with the following.

    SELECT
    DB_NAME(database_id) AS db_name,
    COUNT_BIG(*) * (8*1024) AS buffer_pool_size
    FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
    WHERE database_id <> 32767 -- ResourceDB
    GROUP BY database_id

    Problem is that the sys.dm_os_performance_counters has over 11 millions rows (128Gb server) and the execution takes around 1 minute using about that much CPU (the grouping is obviously very demanding).  Since this is executed every about 2 minutes, it's basically taking half a cpu all the time.

    Anyone has ever worked with sys.dm_os_buffer_descriptors?  What kind of performance did you get?  How many rows?  Any ideas to improve the performance?

  • I did get less execution time cost using COUNT() rather than COUNT_BIG():

    SELECT
    DB_NAME(database_id) AS db_name,
    COUNT(*) / 128.0 AS buffer_pool_size_mb
    FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
    WHERE database_id <> 32767 -- ResourceDB
    GROUP BY database_id

    How many dbs are on the instance?
    Did the query plan show SQL using a hash match or did it do a sort?

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Just 38 user DBs and 60% of the memory used by 2 DBs. 
    The Plan shows a hash match (30% for the function, 57% parralelism, 2% filter and then 11% for the hash match).
    I tried the Count(*) in place of the Count_Big() and CPU time went down from 70sec to 51sec.

  • Wow, that seems way long for only 38 dbs.

    Force SQL to not parallelize the query, and see how that runs:

    SELECT
    DB_NAME(database_id) AS db_name,
    COUNT(*) / 128.0 AS buffer_pool_size_mb
    FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
    WHERE database_id <> 32767 -- ResourceDB
    GROUP BY database_id
    OPTION ( MAXDOP 1 )

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Got 78sec CPU time with COUNT and MAXDOP 1.
    The view returns around 11 millions rows.  Is this normal on a 128GB system?
    How long does it takes on one of your server (if you have one with a few millions rows in the view)?
    My guess is that it's only a matter of the grouping performance which I can't do much about.

  • PatLap - Wednesday, March 27, 2019 1:39 PM

    Got 78sec CPU time with COUNT and MAXDOP 1.
    The view returns around 11 millions rows.  Is this normal on a 128GB system?
    How long does it takes on one of your server (if you have one with a few millions rows in the view)?
    My guess is that it's only a matter of the grouping performance which I can't do much about.

    I have one with 6.6M rows (for ~72GB), I did test it, took 6 secs.
    The server has 256GB, but it's divided among 4 instances, the largest of which has ~72GB.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Are your CPUs really busy on that server for some reason?

    In my case, I have lots of free CPU, it's I/O that's restraining my system perf.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • PatLap - Wednesday, March 27, 2019 7:32 AM

    Hi,
    We are in process of buying a third party application (NewRelic) that does a query the SQL 2016 server with the following.

    SELECT
    DB_NAME(database_id) AS db_name,
    COUNT_BIG(*) * (8*1024) AS buffer_pool_size
    FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
    WHERE database_id <> 32767 -- ResourceDB
    GROUP BY database_id

    Problem is that the sys.dm_os_performance_counters has over 11 millions rows (128Gb server) and the execution takes around 1 minute using about that much CPU (the grouping is obviously very demanding).  Since this is executed every about 2 minutes, it's basically taking half a cpu all the time.

    Anyone has ever worked with sys.dm_os_buffer_descriptors?  What kind of performance did you get?  How many rows?  Any ideas to improve the performance?

    I get real nervous about 3rd party software when the developers insist on doing 8*1024 to convert pages to bytes.  It would be better if they used 8192.  Actually, it would be better if they used /128.0 to convert to MB.  I also get real nervous when such 3rd party software is hitting a system view ever 2 minutes.

    As for their being 11 million rows (1 per page), that's only about 85GB... that's not bad on a 128GB server.

    As far as monitoring software goes, I find that such things typical spend a shedload of time reporting on symptoms without actually identifying the cause and so a pretty useless, IMHO.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @ScottPletcher
    The server is not busy at all (24 cores).  Avg around 8% for SQL with max of 48 over the last 4 hours according to sys.dm_os_ring_buffers
    @Jeff Moden
    Completely agree with you, but buying this 3rd party software is not my call.  NewRelic . c o m   

    I made a test. 
    I inserted the results of the view into a temp table (took 12min but this is not what I wan't to test) : select * into #Pat FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
    Then I applied the same query as before
    SELECT
    DB_NAME(database_id) AS db_name,
    COUNT_BIG(*) * (8*1024) AS buffer_pool_size
    From #Pat
    WHERE database_id <> 32767 -- ResourceDB
    GROUP BY database_id

    Result returned with 8sec of CPU time on a 1sec duration (table scan with Has Match).
    Why?  This means that grouping is not the problem?

  • PatLap - Wednesday, March 27, 2019 3:22 PM

    @ScottPletcher
    The server is not busy at all (24 cores).  Avg around 8% for SQL with max of 48 over the last 4 hours according to sys.dm_os_ring_buffers
    @Jeff Moden
    Completely agree with you, but buying this 3rd party software is not my call.  NewRelic . c o m   

    I made a test. 
    I inserted the results of the view into a temp table (took 12min but this is not what I wan't to test) : select * into #Pat FROM sys.dm_os_buffer_descriptors WITH (NOLOCK)
    Then I applied the same query as before
    SELECT
    DB_NAME(database_id) AS db_name,
    COUNT_BIG(*) * (8*1024) AS buffer_pool_size
    From #Pat
    WHERE database_id <> 32767 -- ResourceDB
    GROUP BY database_id

    Result returned with 8sec of CPU time on a 1sec duration (table scan with Has Match).
    Why?  This means that grouping is not the problem?

    The grouping should not be a problem, with a HASH match (and no sort required), since there a very limited number of groups (~40) (there should be very few, if any, hash collisions).

    And you have a lot of CPU capacity, so something specific must have been delaying the query before.  Have you reduced the 'max mem' setting?

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • The VM has 128GB and the SQl max mem is at 112GB.
    Looks like it's the read time of the view that causes the problem.  I mean 12 minutes to insert 12 millions rows directly on a temp table, It took only 4 minutes to insert 12 millions rows from one regular table (with multiple columns) into a temp table.

  • At this point, I'm going to suggest that you ask the 3rd party vendor that uses the query.

    Also, change the WHERE clause to NOT use "<>".  Instead, use just "<" (without the quotes).  According to the execution plan, it should be about 50% faster.  Disclaimer:  I've NOT tested it on my production box.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you guys for your help.  I did the following 3 tests and here are the results.

    1) Original Call               -- CPU time = 67452 ms, elapsed time = 69808 ms.
    2) using < 32767 with COUNT_BIG(*) * (8*1024)          -- CPU time = 52922 ms, elapsed time = 57332 ms.
    3) using <32767 and COUNT(*) / 128           -- CPU time = 55891 ms, elapsed time = 60424 ms.

    At then end, I can't change the call since it's 3rd party but I will see if at least there's a config that can reduce the number of calls it does.

    Thanks again

  • PatLap - Thursday, March 28, 2019 6:44 AM

    Thank you guys for your help.  I did the following 3 tests and here are the results.

    1) Original Call               -- CPU time = 67452 ms, elapsed time = 69808 ms.
    2) using < 32767 with COUNT_BIG(*) * (8*1024)          -- CPU time = 52922 ms, elapsed time = 57332 ms.
    3) using <32767 and COUNT(*) / 128           -- CPU time = 55891 ms, elapsed time = 60424 ms.

    At then end, I can't change the call since it's 3rd party but I will see if at least there's a config that can reduce the number of calls it does.

    Thanks again

    Interesting timings.  Thanks for posting them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • As a bit of a sidebar, since there are so many rows that need to be loaded into memory for the query, I wonder how many rows are actually showing up because of this query and how many "other" rows are being flushed out of cache.  As can be very true, "If you measure it, you change it".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply