November 16, 2016 at 10:10 am
I was reviewing a SQL Server 2008 data warehouse instance and noticed high latency on the TempDb files. The latency was around 500ms per file. There are four data files, each of equal size. The files were on the same drive with user databases. I asked the SAN Admin if we could get another drive to put TempDb on. I moved the files to another logical drive he created. The SAN Admin said there is nothing else using the part of the SAN that I moved TempDB to. Instead of latency going down, it has actually doubled! Now avg write stall ms is at 1000ms for all the data files. The SAN Admin tells me that the logical drive he created is backed by SSD just like the drive TempDb came from. I have asked him if the new drive is on a different SAN, thinking maybe there is a difference in network latency between where the new drive is versus where TempDb came from.
The SQL instance is 2008 SP4, 10.0.6535 on OS Server 2008 SP2.
What else should I be thinking about or investigating?
November 16, 2016 at 11:29 am
Set up perfmon on that volume sampling @5sec, try to gather all read/write type counters. Ask SAN admin to pull performance report for that LUN ( he should be able to pull that, pretty much all the SAN monitoring has some kind of performance reporting might not very granular but something to compare). Compare perfmon with SAN reports, if you notice high latencies in perfmon and in SAN report then your SAN is the bottle neck. If you see low latencies on SAN and high on perfmon then look at the n/w fabric , may be the ports are saturated.
From what i have experienced so far it is most likely the SAN, sounds like you have tiered storage . The problem with tiered storage is it totally depends on vendors algorithm to move the data from SSD to spinning disk.
Let us what you find.
November 17, 2016 at 11:17 am
We have SQL Diagnostic Manager monitoring this instance. When I see a spike in disk reads and disk writes per second then avg disk queue length spikes as well. SAN Admin tells me the array is so old he can't see any historical performance data. Based on what I see from SQL Server and Diagmostic Manager and previous experience, it seems like the SAN is the issue. However I don't think I am going to solve this if I can't get performance info about the SAN.
Looks like we may be living with this until we migrate to a new server that is connected to the newer storage.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply