January 11, 2013 at 4:17 am
Hi All
Recently I have been observing the following types of errors in the SQL error log
SQL Server has encountered 100 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [i]datafile[/i] in database [i]db_name[/i] (5).
These errors seem to be popping up more and more.
Things I've checked:
>> Checked wait stats on the server and noticed IO related waits at the top
>> Using sys.dm_db_index_operational_stats, I've tracked down the top tables with the highest page_io_latch-wait_counts.
>> The indexes on these tables are heavily fragmented, 70% +
>> I've also observed what I think is a low buffer cache hit ratio, between 89%-92%
>> Drives containing datafiles are used ONLY for datafiles.
Could the fragmentation be the cause of the IO delays?
Is there anything else I can check to track this down?
Thanks
January 11, 2013 at 5:27 am
It could be fragmentation, but more likely is either slow disks or disk contention. Check your wait statistics to see what is causing the system to slow down. See if you're dealing primarily with i/o issues. Then collect metrics on queries to figure out which ones are using the most i/o. Tune 'em.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 12, 2013 at 12:09 am
Thanks
I've checked the highest averaging IO queries in cache and there are definately some serious scans happening against the top tables
Also, do you think that the low Buffer cache hit ratio could be contributing? From what I understand, the low hit ratio means that more queries are forced to retrieve pages from disk. Along with this, Page life expectancy is between 700-900 secs. The server has 20GB of ram and SQL has been allocated 4GB.
Thanks
January 12, 2013 at 3:25 am
Buffer cache hit ratio is a poor indicator. The page life expectancy is pretty low though. I suspect you are hitting the disk a lot. Fragmentation may help at the margins, but it sounds like you either are moving massive amounts of data, or you need to tune some queries.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 12, 2013 at 5:22 am
Thanks
Thanks, I know that there are definately some queries hitting the disk quite hard, and that could definately do with some tuning.
In your opinion, what's a better approach: Changing the T-SQL Code or adding indexes to accomodate the code ?
I'm gonna increase SQL's memory allocation as well and observe from there. SQL is only allocated 4GB out of a total of 20GB available.
January 12, 2013 at 6:01 am
SQLSACT (1/12/2013)
In your opinion, what's a better approach: Changing the T-SQL Code or adding indexes to accomodate the code ?
Yes.
Adding indexes is useless if the problem is in the code. Changing the code is useless if the problem is a lack of indexes.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 12, 2013 at 8:27 am
GilaMonster (1/12/2013)
SQLSACT (1/12/2013)
In your opinion, what's a better approach: Changing the T-SQL Code or adding indexes to accomodate the code ?Yes.
Adding indexes is useless if the problem is in the code. Changing the code is useless if the problem is a lack of indexes.
Thanks
The Queries that seems to be racking up the most IO are not queries that run often, they're run at most 5 times a day. This is an OLTP system.
Adding indexes is useless if the problem is in the code. Changing the code is useless if the problem is a lack of indexes
Where do you draw the line between this, when does code become "bad code". The queries are not queries like "select * from some_table".
Thanks
January 12, 2013 at 8:46 am
SQLSACT (1/12/2013)
Where do you draw the line between this, when does code become "bad code".
When testing shows that it performs badly.
Performance tuning is based on testing, not guesswork. Find the queries with the worst IO load (not the ones that do the most reads per execution). Run them. Identify why they're not performing well. Fix whatever problem was identified. Run them. Do they perform acceptably? If not, identify why and fix. Repeat until performance is acceptable.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 12, 2013 at 8:58 am
SQLSACT (1/12/2013)
ThanksI've checked the highest averaging IO queries in cache and there are definately some serious scans happening against the top tables
Also, do you think that the low Buffer cache hit ratio could be contributing? From what I understand, the low hit ratio means that more queries are forced to retrieve pages from disk. Along with this, Page life expectancy is between 700-900 secs. The server has 20GB of ram and SQL has been allocated 4GB.
why such a low memory allocation to SQL out of the total? Increasing the memory available to SQL will improve your page life expectancy and reduce the need to go to disk.
---------------------------------------------------------------------
January 12, 2013 at 9:01 am
The application that uses this SQL Server is installed on the server as well and is very resource intensive
January 12, 2013 at 9:04 am
SQLSACT (1/12/2013)
The application that uses this SQL Server is installed on the server as well and is very resource intensive
My first step would be to split these two servers.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 12, 2013 at 9:05 am
Find the queries with the worst IO load (not the ones that do the most reads per execution).
Thanks
I've been using the following script to identify the highest IO intensive queries
SELECT top 20
SUBSTRING(t.text, ( s.statement_start_offset / 2 ) + 1,
( ( CASE statement_end_offset
WHEN -1 THEN DATALENGTH(t.text)
ELSE s.statement_end_offset
END - s.statement_start_offset ) / 2 ) + 1)
AS statement_text,
text,
objtype,
cacheobjtype,
usecounts,
last_execution_time,
total_physical_reads,
total_physical_reads / execution_count AS [Avg Physical Reads],
total_logical_reads,
total_logical_reads / execution_count AS [Avg Logical Reads],
total_physical_reads+total_logical_reads/execution_count AS [Total AVG IO],
total_worker_time,
total_worker_time / execution_count AS [Avg CPU Time],
execution_count ,
qp.query_plan
FROM sys.dm_exec_query_stats AS s
inner join sys.dm_exec_cached_plans cp
on s.plan_handle = cp.plan_handle
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) qp
order by [Total AVG IO] desc
Have I been going about my investigation incorrectly?
Thanks
January 12, 2013 at 9:09 am
Change your order by to Avg Physical Reads. The Total Avg in that query has physical reads plus logical reads.
Starting with IO is a good start. Identifying those high IO queries is what Gail suggested be done. From there, tune the code appropriately and add indexes as needed.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
January 12, 2013 at 9:56 am
SQLRNNR (1/12/2013)
Change your order by to Avg Physical Reads. The Total Avg in that query has physical reads plus logical reads.Starting with IO is a good start. Identifying those high IO queries is what Gail suggested be done. From there, tune the code appropriately and add indexes as needed.
Thanks
Viewing 14 posts - 1 through 13 (of 13 total)
You must be logged in to reply to this topic. Login to reply