December 1, 2009 at 1:08 pm
I have worked in the field of SQL Performance tuning for years, and this one is just perplexing.
New Server - Windows 2008 R2, SQL 2008 sp1, both x64 editions.
24 GB RAM, 2 x Quad Core Nehalem processors (shows 16 procs in Task Mgr).
Set MaxDop server wide to 4
Max Server Mem is at 20GB
Main prod DB is only 8GB.
Stats Updates
No significant fragmentation
updated usage for DB
Disk response times are great (avg disk read/sec & write/sec) < .010
Page Life Exp > 70,000
Memory: Avail MB 2.5 GB
Proc Util is a bit odd - Proc #2 is above 80% most of the time, almost 100% of that is Kernel Time. All other Procs are below 10% most of the time.
Here's the problem:
Doing even the most simple query, like "select * from tablename" takes forever. On the old machine (2 processors, 4 GB RAM) the example we are testing would take 4 seconds. On the new machine, we are at 20-25 seconds for the same query. This type of comparison holds true across the board for all queries.
Interestingly enough, when I do a filtered trace (my own spid) on the "select * from tablename", the results come back:
7000 reads
1 second CPU time
22 seconds duration
Yet, there is no blocking showing up when I check.
Whether I do the query on a workstation pointed at the server, or directly on SSMS on the server itself, there is a huge delay in displaying the results.
Like I said, same query on the old machine, same DB, same qty of records, the times were 10X faster.
I am a bit stumped. Any ideas of what to try (other than putting on CU5, which I am going to do).
Mindy Curnutt
Sr. SQL Server DBA / Hardware Infrastructure Architect
TMW Systems, Inc.
twitter: @sqlgirl
December 1, 2009 at 1:19 pm
What are you getting with regards to IO stats? Specifically, does the new server end up with more physical reads than the old server, and less logical reads, on the same queries?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 1, 2009 at 3:00 pm
Old Box (SQL 2000, Windows 2003)
Table 'foo', Scan count 1, logical reads, 6346, physical reads 0, read-ahead reads 0.
New Box (SQL 2008, Windows 2008 R2)
Table 'foo', Scan count 1, logical reads 3940, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
So the new box does something close to 1/2 the qty of reads, yet it takes 10X as long to display the results, even locally.
There is nothing else on this server, no AV, no IIS, no services, firewall is off. Nothing showing up in Task Mgr as being out of line. I'm just stumped. There isn't even much running at all.
Mindy Curnutt
Sr. SQL Server DBA / Hardware Infrastructure Architect
TMW Systems, Inc.
twitter: @sqlgirl
December 1, 2009 at 3:05 pm
If the reads are that different, then either the table is different, or the data is different, or it's a different query. I'm not sure that's related to the query performance difference, but it's something to look into.
My first guess would be that there's something horribly misconfigured in the hardware with that kind of difference. Maybe a messed up I/O channel, maybe something in the SAN, not sure. With the hardware spec improvement you're talking about, I'd normally expect very good performance.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
December 1, 2009 at 3:27 pm
How does your disk configuration compare to the old box (i.e. what is the allocation unit size? Was the disk aligned on both servers?, raid configs? disk speed?)?
What about the maxdop setting on the old server? Are all of the SQL configuration settings the same on both servers (except obvious things like memory increases for the new box)?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
December 1, 2009 at 4:28 pm
Since there is 24 GB of memory, and the DB is only 8 GB -- and, I am seeing really fast response time on the disk, I think it cannot be the disk, even if it wasn't configured optimally, after the 2nd execution of the same query, most of the IO should be coming from memory cache, not disk.
I tried setting the maxdop down to 4, and changing it to 0, it is still slow on the delivery of the data either way.
I am starting to think this has something to do with the memory, that either it is terribly slow, that something is incompatible, or that something is wrong with it. The kernel being so hard hitting on the one processor, and the hardware interrupts being high on that proc make me think it may in fact be a hardware issue.
I'll be having them put CU5 on tonight and see how it goes from there.
I am also going to get a trace to see if everything appears to have slow delivery, or if it just things over a certain amount of reads, or if just looking at the trace results I get a new idea about what is going on.
Mindy Curnutt
Sr. SQL Server DBA / Hardware Infrastructure Architect
TMW Systems, Inc.
twitter: @sqlgirl
December 2, 2009 at 2:26 am
We have more or less the same setup with slower queries than old system.
HP BL685c Blades
4 Quad-Core AMD opteron Processors 2.8Ghz
32 GB Ram
HP EVO 6400 San with 8 Enclosures 60 X 146 GB x 15k hdd 3 filegroups
24 GB mem given to sql.
max dop 8
main prod db 10 gb
Windows 2008 R2, SQL 2008 sp1
fail over clustering
db is restored from sql 2005.
app servers and db servers are in the same blade enclosure.
using remoteapp on 4 appservers all connecting to same db.
I dont know where to look anymore.
December 3, 2009 at 8:26 am
1) I would turn off hyperthreading. it is more often suboptimal than helpful for sql server workloads
2) your disk metrics are not as good as you think. 10 ms is suboptimal for log writes. Also, those numbers are aggregate. what you need to do is get sql server file IO stall analysis done WHILE the query is running. See sys.dm_io_virtual_file_stats and do a differential query. Oh, does the result come back instantaneously if you run the query twice in a row (i.e. hit the data after it is in cache)?
3) didn't notice, but have you checked for table fragmentation? os file fragmentation?
4) I would have said networking is a problem until you stated you got same metrics running the query directly on the server.
5) is the new machine virtualized by any chance?
6) do a waitstats analysis just like you do the IO stall analysis while the query is running.
7) obvious question, but are the tables EXACTLY the same between the old and new system?
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 11, 2010 at 3:47 am
jandus (12/2/2009)
We have more or less the same setup with slower queries than old system.HP BL685c Blades
4 Quad-Core AMD opteron Processors 2.8Ghz
32 GB Ram
HP EVO 6400 San with 8 Enclosures 60 X 146 GB x 15k hdd 3 filegroups
24 GB mem given to sql.
max dop 8
main prod db 10 gb
Windows 2008 R2, SQL 2008 sp1
fail over clustering
db is restored from sql 2005.
app servers and db servers are in the same blade enclosure.
using remoteapp on 4 appservers all connecting to same db.
I dont know where to look anymore.
We solved our problem by rolling back to windows 2008 R1.
January 11, 2010 at 11:39 am
Thanks, that is good to know.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
July 23, 2010 at 4:12 am
Hi,
Did we get a resolution on this? I am getting a similar issue on Windows 2008R2 and SQL Server 2008 sp1.
Thanks
August 1, 2010 at 5:22 pm
I have not been able to locate any solution other than roll back to windows 2008 r1. Not a good solution as far as I am concerned. I would like to know why performance is so bad on windows 2008 r2 and sql 2008 r2. I have two machines that are like in hardware but one is windows 2008 sp1 and sql 2008 sp1 and the other is windows 2008 r2 and sql 2008 r2. The r2 box is around 10 times slower on all queries. The plans seems to get swapped our of memory quick also. Overall the box is extremely slow on queries. I have tried maxdop, thresholds and adhoc settings to no avail.
August 2, 2010 at 3:20 am
Thanks for the reply.
I have since been working with Microsoft to find a resolution.
Our issue was resolved by enabling the "Lock Pages in Memory" windows security policy on Windows 2008 R2. This has resolved all query latency and SQL is now performing as we would expect.
August 4, 2010 at 8:43 am
Microsoft have confirmed this was a problem with the Windows Server R2 kernel which is fixed in this KB article and hotfix.
August 4, 2010 at 9:03 am
Well I am using AMD so this won't help me. I am still looking. One thing we did do was to disable VMM in the BIOS for the CPU's. Performance did increase but it is still not where it should be.
thanks
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply