It is well known and understood that SQL Server requires a substantial amount of memory. SQL Server will also try to consume as much memory as possible from the available system memory – if you let it. Sometimes, there will be some contention / pressure with the memory.
When contention occurs, the users will probably start screaming because performance has tanked and deadlines are about to be missed. There are many different ways (e.g. here or here) to try and observe the memory conditions and even troubleshoot memory contention. Extended Events (XE) gives one more avenue to try and troubleshoot problems with memory.
Using XE to observe memory conditions is a method that is both geeky/fun and an advanced technique at the same time. If nothing else, it will certainly serve as a divergence from the mundane and give you an opportunity to dive down a rabbit hole while exploring some SQL Server internals.
Diving Straight In
I have a handful of events that I have picked for an event session to track when I might be running into some memory problems. Or I can run the session when I suspect there are memory problems to try and provide me with a “second opinion.” Here are the pre-picked events.
SELECT xo.name AS EventName, xo.capabilities_desc, xo.description FROM sys.dm_xe_objects xo WHERE xo.name IN ('large_cache_memory_pressure','buffer_manager_page_life_expectancy' ,'buffer_node_page_life_expectancy' ,'buffer_pool_eviction_thresholds_recalculated','server_memory_change');
Investigating those specific events a little further, I can determine if the payload is close to what I need.
SELECT oc.OBJECT_NAME AS EventName ,oc.name AS column_name, oc.type_name ,oc.column_type AS column_type ,oc.column_value AS column_value ,oc.description AS column_description ,ca.map_value AS SearchKeyword ,ch.Channel --added because large_cache_memory_pressure has no category/search keyword FROM sys.dm_xe_object_columns oc OUTER APPLY (SELECT TOP 1 mv.map_value FROM sys.dm_xe_object_columns occ INNER JOIN sys.dm_xe_map_values mv ON occ.type_name = mv.name AND occ.column_value = mv.map_key AND oc.object_package_guid = mv.object_package_guid WHERE occ.name = 'KEYWORD' AND occ.object_name = oc.object_name) ca INNER JOIN (SELECT c.object_name AS EventName,c.object_package_guid AS PkgGuid, v.map_value AS Channel FROM sys.dm_xe_object_columns c INNER JOIN sys.dm_xe_map_values v ON c.type_name = v.name AND c.column_value = CAST(v.map_key AS NVARCHAR) WHERE c.name = 'channel') ch ON ch.EventName = oc.object_name AND ch.PkgGuid = oc.object_package_guid WHERE oc.object_name IN ('large_cache_memory_pressure','buffer_manager_page_life_expectancy' ,'buffer_node_page_life_expectancy','buffer_pool_eviction_thresholds_recalculated' ,'server_memory_change') AND oc.column_type <> 'readonly' ORDER BY EventName,oc.column_id;
That is a small snippet of the payload for all of the pre-picked events. Notice that the large_cache_memory_pressure event has no “SearchKeyword” / category defined for it. There are a few other events that also do not have a category assigned which makes it a little harder to figure out related events. That said, from the results, I know that I have some “server” and some “memory” tagged events, so I can at least look at those categories for related events.
/* Keyword search */DECLARE @Keyword VARCHAR(64) = 'memory' --memory --server /* note there are a few more interesting events buffer_node_database_pages buffer_manager_database_pages buffer_manager_target_pages memory category query_memory_grant_usage memory_manager_stolen_server_memory allocation_failure bad_memory_detected */SELECT oc.OBJECT_NAME AS EventName ,oc.name AS column_name, oc.type_name ,oc.column_type AS column_type ,oc.column_value AS column_value ,oc.description AS column_description ,REVERSE(LEFT(REVERSE(olm.name),CHARINDEX('',REVERSE(olm.name))-1)) AS DLLName ,olm.file_version ,xp.name AS PackageName ,mv.map_value AS SearchKeyword ,ch.Channel FROM sys.dm_xe_object_columns oc INNER JOIN sys.dm_xe_map_values mv ON oc.type_name = mv.name AND oc.column_value = mv.map_key AND oc.object_package_guid = mv.object_package_guid AND oc.name = 'KEYWORD' INNER JOIN sys.dm_xe_packages xp ON oc.object_package_guid = xp.guid INNER JOIN sys.dm_os_loaded_modules olm ON xp.module_address = olm.base_address INNER JOIN (SELECT c.object_name AS EventName,c.object_package_guid AS PkgGuid, v.map_value AS Channel FROM sys.dm_xe_object_columns c INNER JOIN sys.dm_xe_map_values v ON c.type_name = v.name AND c.column_value = CAST(v.map_key AS NVARCHAR) WHERE c.name = 'channel') ch ON ch.EventName = oc.object_name AND ch.PkgGuid = oc.object_package_guid WHERE mv.map_value = @Keyword ORDER BY oc.object_name;
This query will yield results similar to the following.
If you look closely at the script, I included a note about some additional interesting events that are related to both categories “server” and “memory.”
After all of the digging and researching, now it’s time to pull it together and create a session that may possibly help to identify various memory issues as they arise or to at least help confirm your sneaking suspicion that a memory issue is already present.
USE master; GO -- Create the Event Session IF EXISTS ( SELECT * FROM sys.server_event_sessions WHERE name = 'BuffMemPressure' ) DROP EVENT SESSION BuffMemPressure ON SERVER; GO EXECUTE xp_create_subdir 'C:DatabaseXE'; GO CREATE EVENT SESSION [BuffMemPressure] ON SERVER ADD EVENT sqlos.large_cache_memory_pressure ( ACTION ( sqlserver.database_id, sqlserver.nt_username, sqlserver.sql_text, sqlserver.username,sqlserver.session_nt_username, sqlserver.client_app_name, sqlserver.session_id, sqlserver.client_hostname) ), ADD EVENT sqlserver.buffer_manager_page_life_expectancy ( ACTION ( sqlserver.database_id, sqlserver.nt_username, sqlserver.sql_text, sqlserver.username,sqlserver.session_nt_username, sqlserver.client_app_name, sqlserver.session_id, sqlserver.client_hostname) ), ADD EVENT sqlserver.buffer_node_page_life_expectancy ( ACTION ( sqlserver.database_id, sqlserver.nt_username, sqlserver.sql_text, sqlserver.username,sqlserver.session_nt_username, sqlserver.client_app_name, sqlserver.session_id, sqlserver.client_hostname) ), ADD EVENT sqlserver.buffer_pool_eviction_thresholds_recalculated ( ACTION ( sqlserver.database_id, sqlserver.nt_username, sqlserver.sql_text, sqlserver.username,sqlserver.session_nt_username, sqlserver.client_app_name, sqlserver.session_id, sqlserver.client_hostname) ), ADD EVENT sqlserver.server_memory_change ( ACTION ( sqlserver.database_id, sqlserver.nt_username, sqlserver.sql_text, sqlserver.username,sqlserver.session_nt_username, sqlserver.client_app_name, sqlserver.session_id, sqlserver.client_hostname) ) ADD TARGET package0.event_file ( SET filename = N'C:DatabaseXEBuffMemPressure.xel' , max_rollover_files = ( 5 ) ) WITH ( MAX_MEMORY = 4096 KB , EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS , MAX_DISPATCH_LATENCY = 5 SECONDS , MAX_EVENT_SIZE = 0 KB , MEMORY_PARTITION_MODE = NONE , TRACK_CAUSALITY = ON , STARTUP_STATE = ON ); GO ALTER EVENT SESSION BuffMemPressure ON SERVER STATE = START;
When running this session for a while, you will receive a flood of events as they continue to trigger and record data to your trace file. You will want to keep a steady eye on the trace files and possibly only run the session for short periods.
Here is an example of my session with events grouped by event name. Notice anything of interest between the groups?
If the data in the session does not seem to be helpful enough, I recommend looking at adding the additional events I noted previously.
Here is another view on a system that has been monitoring these events for a while longer and does experience memory pressure.
Here we can see some of the direct results of index operations on memory as well as the effects on memory for some really bad code. Really cool is that we can easily find what query(ies) may be causing the memory pressure issues and then directly tune the offending query(ies).
The Wrap
Diving in to the internals of SQL Server can be useful in troubleshooting memory issues. Extended Events provides a means to look at many memory related events that can be integral to solving or understanding some of your memory issues. Using Extended Events to dive into the memory related events is a powerful tool to add to the memory troubleshooting toolbelt.
Try it out on one or more of your servers and let me know how it goes.
For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.
Interested in seeing the power of XE over Profiler? Check this one out!
This has been the ninth article in the 2018 “12 Days of Christmas” series. For a full listing of the articles, visit this page.