January 19, 2012 at 8:17 am
I am looking for some help troubleshooting performance issues as I am faily new to performance tuning and i am learning as I go here. I have a good idea of what I need to look at,...but I just need help making sense of it all and where the red flags are since I don't know what's considered 'NORMAL' for SQL Server just yet...
System Details:
SQL Server 2005 (9.0.4273) Enterprise Edition (64-bit)
OS Win NT 5.2 (3790)
16 Processors
Server Memory = 32 GB
Network Teamed 1GB adapters = 2gbps
SQL Server Memory Settings:
-Use AWE is Enabled
-Min 4,000MB
-Max 24000 MB
-Index Creation is Dynamic
-Min mem per query 1024 KB
The issue:
Within the application I can click a report and get results in seconds then exit and click it again and it will take 20mins to load the result set. It's not always fast on first click. It's random and can be the other way around slow then fast. Very strange. Much of the data is imported from external feeds and overwritten / appended on a daily basis.
I have been looking into things like wait_stats, stalls, perfmon, memory, and physical file fragmentation.
The BIG RED FLAG: 'Memory'
According to SQL Server: Memory Manager: Total Server Memory.... SQL Server is currently using 23.44 GB of the 24 GB that have been allocated to it. The servers taskmgr shows the PF Usage at 30.3 out of 32 GB!! I don't know why the memory usage is so high. Below are some of the wait stats etc that I have collected thus far.
As of 1/19/12:
wait_type--------wait_time_s--------pct--------running_pct
CXPACKET--------7192156.2--------40.6--------40.6
ASYNC_NETWORK_IO--------4618332.66--------26.07--------66.68
PAGEIOLATCH_SH--------1454544.22--------8.21--------74.89
ASYNC_IO_COMPLETION--------958365.69--------5.41--------80.3
BACKUPBUFFER--------915561.59--------5.17--------85.47
PAGEIOLATCH_EX--------577656.63--------3.26--------88.73
OLEDB--------529225.67--------2.99--------91.72
WRITELOG--------296641.7--------1.67--------93.39
SOS_SCHEDULER_YIELD--------236954.55--------1.34--------94.73
SLEEP_BPOOL_FLUSH--------236410.42--------1.33--------96.07
STALLS:
DB_name | io_stall_read_ms | num_of_reads | avg_read_stall_ms | io_stall_write_ms |num_of_writes | avg_write_stall_ms | io_stalls total_io | avg_io_stall_ms
DB1-148989940694-376295481-396-30874114517-192552726-160-179864055211-568848207-316
DB2-1387059583-5813234-239-42210505-1208472-35-1429270088-7021706-204
DB3-624661058-392417733-2-16675797206-92930519-179-17300458264-485348252-36
DB4-27525054-944660-29-4457617-624597-7-31982671-1569257-20
DB5-2544190-422601-6-3238964-379673-9-5783154-802274-7
DB6-5253080-1014851-5-829520-106800-8-6082600-1121651-5
BUFFER POOL STUFF: select convert(varchar,cast(bpool_committed*8 as money),1)as ,bpool_committed,bpool_commit_target
from sys.dm_os_sys_info
size bpool_committedbpool_commit_target
24,576,000.003072000 3072000
DBCC MEMORYSTATUS RESULTS
Memory Manager, KB
VM Reserved,33974640
VM Committed,399676
AWE Allocated,24974592
Reserved Memory,1024
Reserved Memory In Use,0
Optimization Queue,Value
Overall Memory,20171980800
Target Memory,14829641728
Last Notification,1
Timeout,6
Early Termination Factor,5
Memory node Id = 0, KB
VM Reserved,38272
VM Committed,38240
AWE Allocated,5879040
MultiPage Allocator,5240
SinglePage Allocator,4407000
(5 row(s) affected)
Memory node Id = 1, KB
VM Reserved,33848944
VM Committed,274408
AWE Allocated,6242304
MultiPage Allocator,111592
SinglePage Allocator,4407000
(5 row(s) affected)
Memory node Id = 2, KB
VM Reserved,39616
VM Committed,39584
AWE Allocated,6610944
MultiPage Allocator,5712
SinglePage Allocator,4407000
(5 row(s) affected)
Memory node Id = 3, KB
VM Reserved,42048
VM Committed,41760
AWE Allocated,6242304
MultiPage Allocator,8656
SinglePage Allocator,4407000
(5 row(s) affected)
MEMORYCLERK_SQLBUFFERPOOL (node 0), KB
VM Reserved,33603584
VM Committed,32768
AWE Allocated,6242304
SM Reserved ,0
SM Commited ,0
SinglePage Allocator ,0
MultiPage Allocator ,1296
(7 row(s) affected)
MEMORYCLERK_SQLBUFFERPOOL (node 1), KB
VM Reserved,0
VM Committed,0
AWE Allocated,5879040
SM Reserved ,0
SM Commited ,0
SinglePage Allocator ,0
MultiPage Allocator ,0
(7 row(s) affected)
MEMORYCLERK_SQLBUFFERPOOL (node 2), KB
VM Reserved,0
VM Committed,0
AWE Allocated,6610944
SM Reserved ,0
SM Commited ,0
SinglePage Allocator ,0
MultiPage Allocator ,0
(7 row(s) affected)
MEMORYCLERK_SQLBUFFERPOOL (node 3), KB
VM Reserved,0
VM Committed,0
AWE Allocated,6242304
SM Reserved ,0
SM Commited ,0
SinglePage Allocator ,0
MultiPage Allocator ,0
(7 row(s) affected)
MEMORYCLERK_SQLBUFFERPOOL (Total), KB
VM Reserved,33603584
VM Committed,32768
AWE Allocated,24974592
SM Reserved ,0
SM Commited ,0
SinglePage Allocator ,0
MultiPage Allocator ,1296
Buffer Distribution,Buffers
Stolen,6302
Free,11456
Cached,544573
Database (clean),2476405
Database (dirty),33264
I/O,0
Latched,0
(7 row(s) affected)
Buffer Counts,Buffers
Committed,3072000
Target,3072000
Hashed,2509669
Stolen Potential,2367525
External Reservation,0
Min Free,512
Visible,3072000
Available Paging File,1143428
(8 row(s) affected)
Procedure Cache,Value
TotalProcs,53189
TotalPages,524077
InUsePages,3040
(3 row(s) affected)
Global Memory Objects,Buffers
Resource,722
Locks,17729
XDES,99
SETLS,16
SE Dataset Allocators,32
SubpDesc Allocators,16
SE SchemaManager,3868
SQLCache,6743
Replication,2
ServerGlobal,52
XP Global,2
SortTables,1524
(12 row(s) affected)
Optimization Queue,Value
Overall Memory,20171980800
Target Memory,14829641728
Last Notification,1
Timeout,6
Early Termination Factor,5
cached_pages_countobjectnameindexnameindexid
36 sysobjvaluesclst1
4 sysschobjsclst1
4 sysidxstatsclst1
3 syscolparsclst1
3 sysallocunitsclust1
3 sysdbfilesclst1
2 sysmultiobjrefsclst1
2 spt_valuesspt_valuesclust1
2 sysschobjsnc23
2 sysprivsclust1
2 syshobtcolumnsclust1
1 sysrowsetsclust1
1 syssingleobjrefsclst1
1 sysownersnc23
1 spt_monitorNULL0
1 sysnsobjsnc2
1 sysrtsnc23
1 sysscalartypesclst1
1 sysdbregnc12
1 sysownersclst1
1 sysschobjsnc34
1 sysxlgnsnc23
1 sysxlgnscl1
1 sysguidrefscl1
1 syshobtsclust1
1 sysdbregclst1
1 sysxlgnsnc12
1 sysserefsclust1
January 19, 2012 at 8:55 am
Lets be clear... Does the very same query, with the very same predicate, with the very same filtering/access parameters, against the very same database/set-of-tables sometimes returns in seconds and sometimes returns after 20 minutes if you execute it twice?
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.January 19, 2012 at 8:58 am
And I strongly recommend you have a look at this: http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/
Re memory: If you tell SQL that it can use 24 GB, it will use 24 GB. If that's too much (considering what else is running on the server), then you need to tell SQL to use less.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 19, 2012 at 9:30 am
PaulB-TheOneAndOnly (1/19/2012)
Lets be clear... Does the very same query, with the very same predicate, with the very same filtering/access parameters, against the very same database/set-of-tables sometimes returns in seconds and sometimes returns after 20 minutes if you execute it twice?
Yes, exactly. The application let's them create reports that they can save and run over and over. The very same query (the users call them highlights) can be clicked one time with great results and the very next time a few seconds later you are waiting around for hours...
January 19, 2012 at 9:31 am
Have you checked for blocking?
Have you checked what wait types those queries have (not the aggregated, the wait types for those queries)?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 19, 2012 at 9:31 am
GilaMonster (1/19/2012)
http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-1/And I strongly recommend you have a look at this: http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/
Re memory: If you tell SQL that it can use 24 GB, it will use 24 GB. If that's too much (considering what else is running on the server), then you need to tell SQL to use less.
Thanks Gail! I will review those links..
January 19, 2012 at 9:34 am
The reports and application, are they custom reports for the application, are they SSRS reports read by the application, is it a different product entirely?
What you describe leads me to think of how SSRS can cache reports and this kind of behavior can be seen (depending on timing of the cache flush).
In either case, I would look to get an execution plan and tune the queries.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 19, 2012 at 9:47 am
GilaMonster (1/19/2012)
Have you checked for blocking?Have you checked what wait types those queries have (not the aggregated, the wait types for those queries)?
Each time they call or email the first thing i do is check for blocking and all results show 0 in the blocked column.
As for the wait types of specific queries...no i have not done that yet since i'm still trying to find out how to identify the 'problem queries'. I have put in a query id?? into a function called show sql text or something like that and was able to see the actual querey statement on another server one time. I'll try and do that here once I identify the problem queries.
January 19, 2012 at 9:56 am
SQLRNNR (1/19/2012)
The reports and application, are they custom reports for the application, are they SSRS reports read by the application, is it a different product entirely?What you describe leads me to think of how SSRS can cache reports and this kind of behavior can be seen (depending on timing of the cache flush).
In either case, I would look to get an execution plan and tune the queries.
They are not using SSRS. It's some web driven applciation that allows the users to build custom reports/highlights and then saves the reports. At one point I was going through the stall results and what i was reading basically indicated that the query that the application issued to SQL Server had such a large result set that SQL Server was waiting on the application to process the large result set. But I can't prove this just yet. Then I read somethign about cache or buffer flushing but have not had time to dive into that yet but from what i've read it sounds like a buffer flush could be whats happening. The user request a large result set that already exsists in the buffer and gets results quickly then the buffer gets flushed and they run the same query again and SQL Server no longer has the result in buffer and has to retrieve from disk. The SQL error logs also have numerous errors showing some msg about I/O requests taking longer than 15 seconds to complete on tempdb and DB4.mdf.
January 19, 2012 at 10:00 am
This may sound a bit pre-historic, but, to get hold of the problematic query(s) in a batch you can run it in batch by batch. This way you can see what part is taking all the time for the batch.
January 19, 2012 at 4:23 pm
Vertigo44 (1/19/2012)
PaulB-TheOneAndOnly (1/19/2012)
Lets be clear... Does the very same query, with the very same predicate, with the very same filtering/access parameters, against the very same database/set-of-tables sometimes returns in seconds and sometimes returns after 20 minutes if you execute it twice?Yes, exactly. The application let's them create reports that they can save and run over and over. The very same query (the users call them highlights) can be clicked one time with great results and the very next time a few seconds later you are waiting around for hours...
There is small chance that it may happen because the query is waiting for resources, for example memory.
You can try to monitor Resource Pool Stats : Pending memory grants, just in case.
January 25, 2012 at 6:43 am
I had a chance to speak with our vendor in regards to the server memory issues. I was forwarded an email that pertains to the following application error:
"Viewer Err: Display_feed.asp: Out of memory"
Description: This error occurs in the largest highlight, which may or may not have a graph.
Solution: Modify the highlight so it does not use as much memory:
1. Active caching on the highlight (I believe this is already done)
2. PreQualify the highlight. (I believe here lies the issue as the support personel told me that their application has a tendancy to request a very large amount of data if the users don't qualify their highlight when building them. And the application does not release the memory properly)
3. Review the drill level of the highlight so the top level is not pulling a lot of data. (Same as #2 it sounds like)
4. Plan a full reboot of the server. (Really?!?)
Well unfortunately, we had to do just that (a server reboot, unplanned) when the server ran out of memory and threw the following error:
"An application error has occured in display_highlight.asp: Automation error, not enough storage is available to process this command."
My thoughts... would it be possible to build views and repoint the application to serve up the views instead?
Thank you.
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply