High TempDB usage - Where is the culprit?

Question

Post reply

High TempDB usage - Where is the culprit?

Maddave

SSCertifiable

Points: 6828
More actions
March 5, 2014 at 8:30 am

#289112

SQL 2012 SP1 Standard on a VM with 4 Vcores and 40 GB allocated RAM.
I have noticed that TEMPDB usage is very high lately and of the two 15GB files (30GB total TempDB size), there is only 2.6 GB free. I am trying to find out why this could be. I have executed a query on sys.dm_db_file_space_usage and got the following results (Query1):
Free Pages: 343608
Free Space in MB: 2684
Version store Pages used: 1440
Version store space MB: 11.25
Internal Object Pages used: 6144
Internal Object space MB: 48
User Object Pages used: 3646000
User Object space in MB: 28484.375
So if I am reading this correctly, the user objects are taking up the space in TEMPDB. I then ran a query (Query2) on sys.dm_exec_request joined with dm_exec_sql_text to find the sessions consuming the space, but the numbers don't add up.
Looking at the sum of sys.dm_db_task_space_usage.user_objects_alloc_page_count for the sessions, the total usage count for all sessions doesn't equal the usage I am seeing in TEMPDB, notable only 2.2 GB of usage. Where as the user object space is reporting a usage of 28GB plus!
Can anyone give any advice on whether I am reading this correctly and the best way to find the consumption of TempDB? Thanks.
The queries used are:
Query 1:
SELECT
SUM(unallocated_extent_page_count) AS [free pages], (SUM(unallocated_extent_page_count)*1.0/128) AS [free space in MB],SUM(version_store_reserved_page_count) AS [version store pages used],
(SUM(version_store_reserved_page_count)*1.0/128) AS [version store space in MB],SUM(internal_object_reserved_page_count) AS [internal object pages used],
(SUM(internal_object_reserved_page_count)*1.0/128) AS [internal object space in [MB],SUM(user_object_reserved_page_count) AS ,
(SUM(user_object_reserved_page_count)*1.0/128) AS
FROM sys.dm_db_file_space_usage;
Query 2:
SELECT R1.session_id, R1.request_id, R1.Task_request_internal_objects_alloc_page_count, R1.Task_request_internal_objects_dealloc_page_count,
R1.Task_request_user_objects_alloc_page_count,R1.Task_request_user_objects_dealloc_page_count,R3.Session_request_internal_objects_alloc_page_count ,
R3.Session_request_internal_objects_dealloc_page_count,R3.Session_request_user_objects_alloc_page_count,R3.Session_request_user_objects_dealloc_page_count,
R2.sql_handle, RL2.text as SQLText, R2.statement_start_offset, R2.statement_end_offset, R2.plan_handle FROM (SELECT session_id, request_id,
SUM(internal_objects_alloc_page_count) AS Task_request_internal_objects_alloc_page_count, SUM(internal_objects_dealloc_page_count)AS
Task_request_internal_objects_dealloc_page_count,SUM(user_objects_alloc_page_count) AS Task_request_user_objects_alloc_page_count,
SUM(user_objects_dealloc_page_count)AS Task_request_user_objects_dealloc_page_count FROM sys.dm_db_task_space_usage
GROUP BY session_id, request_id) R1 INNER JOIN (SELECT session_id, SUM(internal_objects_alloc_page_count) AS Session_request_internal_objects_alloc_page_count,
SUM(internal_objects_dealloc_page_count)AS Session_request_internal_objects_dealloc_page_count,SUM(user_objects_alloc_page_count) AS Session_request_user_objects_alloc_page_count,
SUM(user_objects_dealloc_page_count)AS Session_request_user_objects_dealloc_page_count FROM sys.dm_db_Session_space_usage
GROUP BY session_id) R3 on R1.session_id = R3.session_id
left outer JOIN sys.dm_exec_requests R2 ON R1.session_id = R2.session_id and R1.request_id = R2.request_id
OUTER APPLY sys.dm_exec_sql_text(R2.sql_handle) AS RL2
Where
Task_request_internal_objects_alloc_page_count >0 or
Task_request_internal_objects_dealloc_page_count>0 or
Task_request_user_objects_alloc_page_count >0 or
Task_request_user_objects_dealloc_page_count >0 or
Session_request_internal_objects_alloc_page_count >0 or
Session_request_internal_objects_dealloc_page_count >0 or
Session_request_user_objects_alloc_page_count >0 or
Session_request_user_objects_dealloc_page_count >0

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Matt Crowley SSCarpal Tunnel Points: 4163 More actions · Answer 1

Do you have any databases that use Snapshot Isolation?

select name, snapshot_isolation_state

from sys.databases

where snapshot_isolation_state > 0

we have an application that did not free up it's snapshots on a timely basis, so we had to resort to restarting the application once a month.

Maddave SSCertifiable Points: 6828 More actions · Answer 2

Thanks for the reply. Only one database (bar master and msdb) are using snapshot isolation. The vesioning would be kept in the version store of TempDB though, wouldn't they, so this doesn't look to be the problem.

One thing I have noticed which may or may not be related is the SQL server performance bombing out at times. I have found that when a user views a view definition, either by looking at the design option of a view or asking for a create/alter script to be generated, then SSMS hangs, SQL server performance drops and PLE drops to zero! I notice millions of reads in the IO and the same for writes. Once I kill off the session trying to do the view definition query, then all starts working as normal. The wait type when this happens is PAGEIOLATCH_SH:tempdb:3(*) (taken from whoisactive) which makes me think this could be to do with tempdb and free space.

This may just be a coincidence.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 3

Maddave (3/5/2014)
Thanks for the reply. Only one database (bar master and msdb) are using snapshot isolation. The vesioning would be kept in the version store of TempDB though, wouldn't they, so this doesn't look to be the problem.
One thing I have noticed which may or may not be related is the SQL server performance bombing out at times. I have found that when a user views a view definition, either by looking at the design option of a view or asking for a create/alter script to be generated, then SSMS hangs, SQL server performance drops and PLE drops to zero! I notice millions of reads in the IO and the same for writes. Once I kill off the session trying to do the view definition query, then all starts working as normal. The wait type when this happens is PAGEIOLATCH_SH:tempdb:3(*) (taken from whoisactive) which makes me think this could be to do with tempdb and free space.
This may just be a coincidence.

sp_whoisactive can also show you the actual tempdb consumption along with the queries running, query plans, and lots of other details. That should easily allow you to tie tempdb usage to queries causing high usage.

PLE to zero is often caused by a bug somewhere that is causing a flush of sql memory. I have seen MANY of these over the years, from hardware drivers to windows to SQL Server itself. There are some very nasty memory (and NUMA) bugs in SQL 2012 that you need to be patched up to get around. What is your build level?

I also note you are on a VM. So now another thing comes into play and that is the VM causing some hoky-poky stuff to happen. Have your VM admin monitor for a variety of things too. There are SOOO many ways you can mis-configure SQL Server and having VM in the mix add to that list significantly.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Maddave SSCertifiable Points: 6828 More actions · Answer 4

Many thanks for the reply. I am on build 11.0.3128. I've only applied the fix to the ms installer running bug as I experienced this a while back. I was only told to wait for service packs rather than apply CU's, but I might look to roll these out now.

I have ended up resolving the issue by bouncing the instance last night. After this, PLE has now grown back to the expected 15 hours rather jumping up and down like it has been over the last few days. Tempdb is also cleared down (as expected after a restart), but running the two queries in my original post both the value for the tempdb in use by users from the first query and the sum of the user allocation of tempdb in the second query, now tally, where as before they were way out. Viewing a view definition now works quickly and doesn't cause a drop in PLE.

So something was definitely wrong and causing stability issues. I feel I will not know the exact cause of this, but will be better prepared if I see things like this again.

Thanks again for your help.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 5

Maddave (3/6/2014)
Many thanks for the reply. I am on build 11.0.3128. I've only applied the fix to the ms installer running bug as I experienced this a while back. I was only told to wait for service packs rather than apply CU's, but I might look to roll these out now.
I have ended up resolving the issue by bouncing the instance last night. After this, PLE has now grown back to the expected 15 hours rather jumping up and down like it has been over the last few days. Tempdb is also cleared down (as expected after a restart), but running the two queries in my original post both the value for the tempdb in use by users from the first query and the sum of the user allocation of tempdb in the second query, now tally, where as before they were way out. Viewing a view definition now works quickly and doesn't cause a drop in PLE.
So something was definitely wrong and causing stability issues. I feel I will not know the exact cause of this, but will be better prepared if I see things like this again.
Thanks again for your help.

The days of waiting for a service pack for SQL Server are long gone I am afraid. Those that do that will come up against issues FAR more often than those who wait for the service pack, IMNSHO.

As long as you can bounce the server when this issue crops up, it seems like you have found an acceptable workaround. :hehe:

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service