August 3, 2011 at 11:17 am
coronaride (8/2/2011)
We're seeing a ton of PAGEIOLATCH waits with large Buffer I/O wait times, upwards of 30000ms. Before starting to analyze the query, does anyone have any immediate suggestions?
Have you looked at the Microsoft white paper on wait types and queues? It helped me a lot in the past. It just might help here too.
In my experience, when I see a lot of these latch waits, the usual cause is a high transactions/sec rate combined with bulk operations: excessive waiting on page allocation. The three allocation page types are GAM, PFS, and SGAM. (Note that the query below checks PAGELATCH_UP, the in-memory allocation latch, rather than PAGEIOLATCH.) To find exactly what kind of page is being waited on:
----------------------------------------------
--//FIND WHAT PAGE ALLOCATION CONTENTION YOU HAVE
----------------------------------------------
Select session_id,
wait_type,
wait_duration_ms,
blocking_session_id,
resource_description,
ResourceType = Case
When Cast(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)) As Int) = 1
Or Cast(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)) As Int) % 8088 = 0 Then 'Is PFS Page'
When Cast(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)) As Int) = 2
Or Cast(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)) As Int) % 511232 = 0 Then 'Is GAM Page'
When Cast(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)) As Int) = 3
Or (Cast(Right(resource_description, Len(resource_description) - Charindex(':', resource_description, 3)) As Int) - 1) % 511232 = 0 Then 'Is SGAM Page'
Else 'Is Not PFS, GAM, or SGAM page'
End
From sys.dm_os_waiting_tasks
Where wait_type Like 'PAGELATCH[_]%' --escape the underscore: it is a single-character wildcard in LIKE
And resource_description Like '5:%' --change 5 to your database_id (tempdb is 2)
Also, here is the query to find the wait types you currently have, along with each waiting task's statement and plan:
-----------------------------------------
--//FIND WAITING TASK WITH WAIT_TYPE AND PLAN
-----------------------------------------
--snapshot cpu/reads for active user sessions, then wait one second and diff
select r.cpu_time
, r.logical_reads
, r.session_id
into #temp
from sys.dm_exec_sessions as s
inner join sys.dm_exec_requests as r
on s.session_id = r.session_id --and s.last_request_start_time = r.start_time
where s.is_user_process = 1;

waitfor delay '00:00:01';
select substring(h.text, (r.statement_start_offset/2)+1 , ((case r.statement_end_offset when -1
then datalength(h.text) else r.statement_end_offset end - r.statement_start_offset)/2) + 1) as text
, r.cpu_time-t.cpu_time as CPUDiff
, r.logical_reads-t.logical_reads as ReadDiff
, p.query_plan
, r.wait_type
, r.wait_time
, r.last_wait_type
, r.wait_resource
, r.command
, r.database_id
, r.blocking_session_id
, r.granted_query_memory
, r.session_id
, r.reads
, r.writes
, r.row_count
, s.[host_name]
, s.program_name
, s.login_name
from sys.dm_exec_sessions as s
inner join sys.dm_exec_requests as r
on s.session_id =r.session_id and s.last_request_start_time=r.start_time
full outer join #temp as t on t.session_id=s.session_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) h
cross apply sys.dm_exec_query_plan(r.plan_handle) p
order by ReadDiff desc --3rd column; the ordinal "order by 3" breaks if columns are reordered
drop table #temp
Also check trace flag 1118 (covered on sqlskills.com), which makes SQL Server allocate full extents instead of mixed extents. This trace flag reduced my page latch waits a lot, though only for a few days before the issue came back.
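As a sketch of how you would try it: trace flag 1118 can be turned on at runtime with DBCC TRACEON (the -1 makes it global rather than session-scoped), and you can confirm it with DBCC TRACESTATUS.

```sql
-- Enable trace flag 1118 server-wide (uniform extent allocations).
DBCC TRACEON (1118, -1);

-- Verify the flag is active globally.
DBCC TRACESTATUS (1118, -1);
```

To make it survive restarts, add -T1118 as a startup parameter via SQL Server Configuration Manager.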
So analyze all of this rough information to find what solves your issue. These steps might not solve it directly, but they should help you find a way to reduce the latch waits.
August 3, 2011 at 4:30 pm
What I haven't seen yet is the memory and CPU specs for the server. That will also have an effect on performance here.
For a database that large, I would even consider drives less than 10K - and would push for 15K. With 24 drives available, how have you carved that up for the LUNs? Not really clear how you configured this - which could definitely be a factor.
Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”
― Charles R. Swindoll
How to post questions to get better answers faster
Managing Transaction Logs
August 4, 2011 at 7:44 am
Even with a HUGE amount of RAM on the box you have absolutely no hope of good IO from your cheap IO subsystem. I have a 2 year old LAPTOP that will smoke your production server. You are trying to process enterprise-scale data volumes on a penny-ante budget. Just cannot be done. The best you can do on this hardware is to have good indexing and make your code 'batchable' where you process 10000 rows at a time in an index-seek set-based manner.
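To illustrate the "batchable" pattern (table and column names here are hypothetical), the idea is to walk the index key in 10,000-row chunks so each iteration is an index seek rather than one giant scan:

```sql
-- Hypothetical example: process dbo.BigTable 10,000 rows at a time,
-- seeking on the clustered index key (Id) each pass.
DECLARE @LastId INT = 0, @Rows INT = 1;

WHILE @Rows > 0
BEGIN
    ;WITH Batch AS
    (
        SELECT TOP (10000) Id, SomeColumn
        FROM dbo.BigTable
        WHERE Id > @LastId          -- index seek, not a scan
        ORDER BY Id
    )
    SELECT @LastId = MAX(Id),       -- remember where this batch ended
           @Rows   = COUNT(*)       -- 0 rows means we are done
    FROM Batch;
    -- ...do the per-batch set-based work (joins, aggregates, inserts) here...
END
```

Each pass touches a bounded number of pages, so the buffer pool and I/O subsystem never have to absorb the whole table at once.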
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 4, 2011 at 9:49 am
Yes, yes, understood, the hardware configuration is not great. As previously mentioned, when I made the hardware request, we were still very much in the proof of concept phase. I knew we were going to end up with a fair amount of data, so I had to make a choice between speed and space. I went with the latter, since getting enough 15k drives to provide data redundancy to support up to 8TB was going to triple the price.
I think that, after we can at least finish this phase and "wow" the big boss, we'll go back and request faster I/O.
August 4, 2011 at 9:55 am
... a SAN might not be overkill here. Kevin would have better pointers than that to offer.
August 4, 2011 at 2:59 pm
coronaride (8/2/2011)
It is a SELECT only, with multiple joins on PK-FK relationships to child tables. There is no inserting, updating, or deleting. As far as the number of concurrent processes hitting the database, after much troubleshooting on my own, I had a feeling that this might be the bottleneck. At this point, I'm thinking that, due to all of the waits we're experiencing, we're giving up any gains from multi-threading and should try reducing the number of processes.
I know I'm coming in on this a bit late, but I have to ask. Do you specifically index the FK on the child table side? The reason I ask this is that I've seen some developers assume that declaring the FK relationship automatically creates an index on the FK side, which is not the case. The FK has to be indexed or you get table scans of the child table on the join from parent to child.
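To illustrate the point (with hypothetical table and column names): the FK constraint enforces the relationship but creates no index, so the index on the child side has to be added explicitly.

```sql
-- Declaring the FK enforces integrity but creates NO index on Child.ParentId.
ALTER TABLE dbo.Child
    ADD CONSTRAINT FK_Child_Parent
    FOREIGN KEY (ParentId) REFERENCES dbo.Parent (ParentId);

-- Index the FK column yourself so parent-to-child joins can seek
-- instead of scanning the child table.
CREATE NONCLUSTERED INDEX IX_Child_ParentId
    ON dbo.Child (ParentId);
```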
Other than the above, it does look like your hardware lacks a bit of oomph.
Todd Fifield
August 4, 2011 at 3:13 pm
Yes, we have the field indexed. Thanks, though. 🙂
August 5, 2011 at 9:16 am
Ninja's_RGR'us (8/4/2011)
... a SAN might not be overkill here. Kevin would have better pointers than that to offer.
For most small shops (and even lots of medium/big ones) direct-attached storage is often best for SQL Server stuff. WAY simpler to setup, configure, manage and you NEVER have to worry about what the SAN admin(s) will do with your LUNs.
To the OP: consider getting a tuning professional to give your database/app/process a review. In 15 years of doing said work I have NEVER been to a client that didn't have LOTS of low-hanging performance fruit.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
August 5, 2011 at 9:39 am
Thanks