November 10, 2011 at 4:02 pm
I'm running SQL Server 2005 Enterprise Edition SP2 on Windows 2003 R2 2X8 core with 256GB of RAM
Usually, PageLatch_SH and PageLatch_EX waittimes are in the 20-30 millisecond range. Twice now, all of a sudden, PageLatch_SH has gone from 30000 ms to 252000 ms over 20 minutes and PageLatch_EX 25000 to 82000 and then they both dropped rapidly to their normal levels and the issue went away.
During that time, there was a large amount of "rolling" blocking occurring in tempdb. That means that everytime I ran sp_who2'active', blocking was occurring, but the same spid was not blocking anymore. Another spid was.
Obviously, during this incident my waiting tasks ( in Activity Monitor) increased from 20-30 to up over 300. Users were impacted.
I'm trying to figure out what type of activity would cause PageLatch_SH to increase drastically and then go away. We don't run bulk insert queries, so I'm a little confused on this one. The definiton of the wait type "Caused when a task is waiting on a latch for a buffer that is not in an I/O request" What else could cause this issue outside of I/O?
November 11, 2011 at 2:24 am
PageLatch on what resource? Could be TempDB allocation contention, but I want to see the resource first to be sure.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2011 at 8:49 am
Sometimes, we do have tempdb congestion in the allocation pages, but yesterday it seemed like the problem was in the data pages .
PageLatch_SH 2:7:102824
PageLatch_SH 2:6:257047
etc....
November 11, 2011 at 8:53 am
Interesting. How many TempDB files do you have? What traceflags do you have on and how many cores do you have?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2011 at 2:20 pm
We just migrated over to a new server and EqualLogic disk array with SSD and 10K SAS drives. So, I was hoping to get rid of any tempdb issues that I had on the old system.
Right now, I have 8 tempdb files(all the same size) on a 2 *8 or 16 core box.
Question? Does increasing the number of files help solve blocking issues in tempdb even if the blocking doesn't appear to be in the allocation pages? Like I said before(I think), we are seeing blocking in tempdb on the data pages when this issue starts to occur. When that waittype, PAGELatch_SH, is at it's normal levels, the blocking is minimal to non existent. When that wait type increases, the blocking starts.
We have a heavy transactional system, 2500 - 3200 Batches/Per Second, so I wonder if that plays a role.
Also, we are discussing turning on Trace Flag 1118, but we don't have it on currently. Do you know of any problems that turning that trace flag on has caused?
November 11, 2011 at 2:27 pm
Lilsamp (11/11/2011)
We just migrated over to a new server and EqualLogic disk array with SSD and 10K SAS drives. So, I was hoping to get rid of any tempdb issues that I had on the old system.
PageLatch waits aren't IO related, so improving that's not going to make the PageLatch waits go away
Question? Does increasing the number of files help solve blocking issues in tempdb even if the blocking doesn't appear to be in the allocation pages?
Maybe, maybe not. Hard to say, it shouldn't cause any harm though. Try doubling the files if you can.
This reminds me of something, I'll try and remember what...
Also, we are discussing turning on Trace Flag 1118, but we don't have it on currently. Do you know of any problems that turning that trace flag on has caused?
I don't, but that's specifically for allocation contention on the SGAM.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2011 at 3:08 pm
Thanks for the feedback. I was thinking about adding more files this weekend during our maintenace window just to see if it helps. I wanted to go down that path before trying Trace Flag 1118.
What bothers me the most is the randomness of the issue. It comes out of no where and then leaves and everything is back to normal. Do you have any hints of types of queries that I should look for through my traces. For example, if the wait type was IO related, then I could look for heavy IO queries. What do you look for when trying to track down PageLatch_SH issues? Heavy tempdb allocations?
November 11, 2011 at 3:38 pm
Lilsamp (11/11/2011)
For example, if the wait type was IO related, then I could look for heavy IO queries. What do you look for when trying to track down PageLatch_SH issues? Heavy tempdb allocations?
Not necessarily, PageLatch contention can appear in any DB, but since yours seems to be DBID 2, then yes. There's some scripts for monitoring TempDB around somewhere...
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 11, 2011 at 3:45 pm
Sorry, can't find and it's late, but start with this DMV, it's at the core of tempDB monitoring
sys.dm_db_task_space_usage
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply