IO Errors

  • Hi All

    Recently I have been observing the following types of errors in the SQL error log

    SQL Server has encountered 100 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [i]datafile[/i] in database [i]db_name[/i] (5).

    These errors seem to be popping up more and more.

    Things I've checked:

    >> Checked wait stats on the server and noticed IO related waits at the top

    >> Using sys.dm_db_index_operational_stats, I've tracked down the top tables with the highest page_io_latch-wait_counts.

    >> The indexes on these tables are heavily fragmented, 70% +

    >> I've also observed what I think is a low buffer cache hit ratio, between 89%-92%

    >> Drives containing datafiles are used ONLY for datafiles.

    Could the fragmentation be the cause of the IO delays?

    Is there anything else I can check to track this down?

    Thanks

  • It could be fragmentation, but more likely is either slow disks or disk contention. Check your wait statistics to see what is causing the system to slow down. See if you're dealing primarily with i/o issues. Then collect metrics on queries to figure out which ones are using the most i/o. Tune 'em.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks

    I've checked the highest averaging IO queries in cache and there are definately some serious scans happening against the top tables

    Also, do you think that the low Buffer cache hit ratio could be contributing? From what I understand, the low hit ratio means that more queries are forced to retrieve pages from disk. Along with this, Page life expectancy is between 700-900 secs. The server has 20GB of ram and SQL has been allocated 4GB.

    Thanks

  • Buffer cache hit ratio is a poor indicator. The page life expectancy is pretty low though. I suspect you are hitting the disk a lot. Fragmentation may help at the margins, but it sounds like you either are moving massive amounts of data, or you need to tune some queries.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks

    Thanks, I know that there are definately some queries hitting the disk quite hard, and that could definately do with some tuning.

    In your opinion, what's a better approach: Changing the T-SQL Code or adding indexes to accomodate the code ?

    I'm gonna increase SQL's memory allocation as well and observe from there. SQL is only allocated 4GB out of a total of 20GB available.

  • SQLSACT (1/12/2013)


    In your opinion, what's a better approach: Changing the T-SQL Code or adding indexes to accomodate the code ?

    Yes.

    Adding indexes is useless if the problem is in the code. Changing the code is useless if the problem is a lack of indexes.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (1/12/2013)


    SQLSACT (1/12/2013)


    In your opinion, what's a better approach: Changing the T-SQL Code or adding indexes to accomodate the code ?

    Yes.

    Adding indexes is useless if the problem is in the code. Changing the code is useless if the problem is a lack of indexes.

    Thanks

    The Queries that seems to be racking up the most IO are not queries that run often, they're run at most 5 times a day. This is an OLTP system.

    Adding indexes is useless if the problem is in the code. Changing the code is useless if the problem is a lack of indexes

    Where do you draw the line between this, when does code become "bad code". The queries are not queries like "select * from some_table".

    Thanks

  • SQLSACT (1/12/2013)


    Where do you draw the line between this, when does code become "bad code".

    When testing shows that it performs badly.

    Performance tuning is based on testing, not guesswork. Find the queries with the worst IO load (not the ones that do the most reads per execution). Run them. Identify why they're not performing well. Fix whatever problem was identified. Run them. Do they perform acceptably? If not, identify why and fix. Repeat until performance is acceptable.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • SQLSACT (1/12/2013)


    Thanks

    I've checked the highest averaging IO queries in cache and there are definately some serious scans happening against the top tables

    Also, do you think that the low Buffer cache hit ratio could be contributing? From what I understand, the low hit ratio means that more queries are forced to retrieve pages from disk. Along with this, Page life expectancy is between 700-900 secs. The server has 20GB of ram and SQL has been allocated 4GB.

    why such a low memory allocation to SQL out of the total? Increasing the memory available to SQL will improve your page life expectancy and reduce the need to go to disk.

    ---------------------------------------------------------------------

  • The application that uses this SQL Server is installed on the server as well and is very resource intensive

  • SQLSACT (1/12/2013)


    The application that uses this SQL Server is installed on the server as well and is very resource intensive

    My first step would be to split these two servers.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Find the queries with the worst IO load (not the ones that do the most reads per execution).

    Thanks

    I've been using the following script to identify the highest IO intensive queries

    SELECT top 20

    SUBSTRING(t.text, ( s.statement_start_offset / 2 ) + 1,

    ( ( CASE statement_end_offset

    WHEN -1 THEN DATALENGTH(t.text)

    ELSE s.statement_end_offset

    END - s.statement_start_offset ) / 2 ) + 1)

    AS statement_text,

    text,

    objtype,

    cacheobjtype,

    usecounts,

    last_execution_time,

    total_physical_reads,

    total_physical_reads / execution_count AS [Avg Physical Reads],

    total_logical_reads,

    total_logical_reads / execution_count AS [Avg Logical Reads],

    total_physical_reads+total_logical_reads/execution_count AS [Total AVG IO],

    total_worker_time,

    total_worker_time / execution_count AS [Avg CPU Time],

    execution_count ,

    qp.query_plan

    FROM sys.dm_exec_query_stats AS s

    inner join sys.dm_exec_cached_plans cp

    on s.plan_handle = cp.plan_handle

    CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) AS t

    CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) qp

    order by [Total AVG IO] desc

    Have I been going about my investigation incorrectly?

    Thanks

  • Change your order by to Avg Physical Reads. The Total Avg in that query has physical reads plus logical reads.

    Starting with IO is a good start. Identifying those high IO queries is what Gail suggested be done. From there, tune the code appropriately and add indexes as needed.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • SQLRNNR (1/12/2013)


    Change your order by to Avg Physical Reads. The Total Avg in that query has physical reads plus logical reads.

    Starting with IO is a good start. Identifying those high IO queries is what Gail suggested be done. From there, tune the code appropriately and add indexes as needed.

    Thanks

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply