Intermittent database catalog failure on iSCSI NetApp

  • SQL Server 2005 Enterprise Edition SP2 on a 64-bit Intel IBM server with 16 GB RAM, deployed on NetApp storage over iSCSI: 1 volume aggregate, 2 LUNs, comprised of 9 RAID-DP 300 GB disks. Intermittently, SQL Server loses track of some, but not all, catalogs, reporting Error 21 in the log: "O/S is reporting a media error...". Simply taking the DB catalog offline and then back online fixes the issue. We then run CHECKDB and other maintenance, backups, etc. These catalogs have little to no activity, but 24 hours later the catalog is "missing" again. We repeat the same procedure and the catalog can then be queried, used by applications, etc., until the media fails again. Some catalogs on the NetApp NEVER fail; others take turns failing intermittently.

  • Looks like it's squarely an issue with your I/O subsystem. I'd get onto NetApp and have them help you diagnose the problem.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog: www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Good call. We finally got NetApp on the line and they identified millions of queue overflow errors on the interface hardware. They are providing additional troubleshooting tools and becoming actively involved. Thanks!
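
For readers hitting the same symptom, the offline/online workaround described in the first post might look roughly like the T-SQL below. This is only a sketch under assumptions: `MyCatalog` is a hypothetical database name, and `WITH ROLLBACK IMMEDIATE` is an added option (not mentioned in the thread) that disconnects open sessions so the database can go offline. As the thread concludes, this only restores access temporarily; the underlying I/O problem still has to be diagnosed on the storage side.

```sql
-- Sketch only: MyCatalog is a hypothetical name; substitute the affected catalog.
-- ROLLBACK IMMEDIATE is an assumption; it rolls back open transactions
-- and disconnects other sessions so the database can be taken offline.
ALTER DATABASE [MyCatalog] SET OFFLINE WITH ROLLBACK IMMEDIATE;

-- Bring the catalog back online; SQL Server reopens its files on the LUN.
ALTER DATABASE [MyCatalog] SET ONLINE;

-- Check consistency afterwards, reporting all errors and
-- suppressing informational messages.
DBCC CHECKDB (N'MyCatalog') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```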
