Attach database failing at DR - SAN replication

  • Hello,

    We have SAN-to-SAN replication between our production data center and our DR data center using RecoverPoint. From time to time we test DR: the network team stops SAN replication, the DBA team attaches the databases, and the app teams repoint their applications and test.

    So far so good.

    This was the first time we added a SharePoint server to the mix. During the current DR test, all other servers worked fine. On the SharePoint server, 5 of the 7 content databases attached fine.

    The other two fail to attach with the following error:

    Server DRDC-SQL005\SHAV01

    Job Name DBA: Attach DBs for Disaster Recovery

    Step Name Step01 Attach DBs

    Duration 00:00:25

    Sql Severity 21

    Sql Message ID 3313

    Operator Emailed

    Operator Net sent

    Operator Paged

    Retries Attempted 0

    Message

    Executed as user: domain\username. Could not open new database 'WSS_Content_Online'. CREATE DATABASE is aborted. [SQLSTATE 42000] (Error 1813)

    Could not redo log record (15443:9603:9), for transaction ID (0:5529955), on page (1:113478), database 'WSS_Content_Online' (database ID 14). Page: LSN = (14474:125:3), type = 1. Log: OpCode = 2, context 2, PrevPageLSN: (15135:5416:4). Restore from a backup of the database, or repair the database. [SQLSTATE HY000] (Error 3456)

    During redoing of a logged operation in database 'WSS_Content_Online', an error occurred at log record ID (15443:9603:9). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database. [SQLSTATE HY000] (Error 3313). The step failed.

    For several years we have been pretty confident about our DR strategy, and now this has raised a question. We completely resynced the SAN and redid the test. One of the two databases attached and the other did not. So at this point the behavior is inconsistent, and only for these two databases. But if it is a redo log error, it could very well happen on other servers and other databases as well.

    Has anyone experienced this before? Is there any way to avoid this potentially corrupt state when breaking SAN replication and attaching databases?

    Thanks,

  • Yup, I've seen similar.

    Most likely cause: your SAN replication is not following the rules that SQL Server needs from an IO subsystem (mainly write-order preservation). Check that the replication is certified to work with SQL Server. If it's not, and it's not following the write-order rules, then it's a gamble every time you try to bring a replicated DB up.

    Basically, make sure that what you have is certified by Microsoft and the SAN vendor to work correctly with SQL Server databases.
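
    To make the write-order point concrete, here is a minimal Python sketch of the kind of LSN check that appears to be failing in the error message from the first post. It is a toy model under stated assumptions, not SQL Server's actual recovery code: the function names `parse_lsn` and `redo_check` are made up for illustration, and the rule that the page LSN on disk should equal the log record's PrevPageLSN before redo is a simplification of how redo is generally described.

```python
import re

# Matches an LSN written as (file:block:slot), e.g. (15443:9603:9).
LSN_PATTERN = re.compile(r"\((\d+):(\d+):(\d+)\)")


def parse_lsn(text):
    """Turn '(15443:9603:9)' into a comparable tuple (15443, 9603, 9)."""
    match = LSN_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no LSN found in {text!r}")
    return tuple(int(part) for part in match.groups())


def redo_check(page_lsn, prev_page_lsn, record_lsn):
    """Simplified redo decision for one log record against one page.

    Assumption: tuples compare in LSN order, and a page is only safe
    to redo against when it is in exactly the state the log record
    was written against (page LSN == PrevPageLSN).
    """
    if page_lsn >= record_lsn:
        return "skip"          # change is already on the page
    if page_lsn == prev_page_lsn:
        return "redo"          # page is in the expected prior state
    return "inconsistent"      # page and log disagree about history


# Values copied from the Error 3456 text in the first post.
page = parse_lsn("Page: LSN = (14474:125:3)")
prev = parse_lsn("PrevPageLSN: (15135:5416:4)")
record = parse_lsn("log record (15443:9603:9)")

print(redo_check(page, prev, record))  # prints "inconsistent"
```

    Read this way, the data page on the DR side is older than the state the log record expects, which is the sort of mismatch you would see if log writes arrived at the replica but some earlier data-file writes did not.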

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail.

    I will look into write-order preservation for EMC RecoverPoint replication and make sure it follows the correct process so that it does not cause integrity issues.

  • It is not just that. Check and make sure that it is certified for SQL Server, that it explicitly supports SQL Server. EMC should have that stated somewhere in their docs. If it's not, then check with their tech people.

    Gail Shaw
  • I am on it.

    Thanks a lot for your guidance.
