October 24, 2012 at 1:09 pm
Hello,
We have SAN-to-SAN replication running between our production data center and our DR data center using RecoverPoint. From time to time we test DR: the network team stops SAN replication, the DBA team attaches the databases, and the app teams point their applications at the DR servers and test.
So far so good.
This was the first time we added a SharePoint server to the mix. During the current DR test, all the other servers worked fine. On the SharePoint server, 5 of the 7 content databases attached fine.
The other two fail to attach with the following error:
Server DRDC-SQL005\SHAV01
Job Name DBA: Attach DBs for Disaster Recovery
Step Name Step01 Attach DBs
Duration 00:00:25
Sql Severity 21
Sql Message ID 3313
Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0
Message
Executed as user: domain\username. Could not open new database 'WSS_Content_Online'. CREATE DATABASE is aborted. [SQLSTATE 42000] (Error 1813) Could not redo log record (15443:9603:9), for transaction ID (0:5529955), on page (1:113478), database 'WSS_Content_Online' (database ID 14). Page: LSN = (14474:125:3), type = 1. Log: OpCode = 2, context 2, PrevPageLSN: (15135:5416:4). Restore from a backup of the database, or repair the database. [SQLSTATE HY000] (Error 3456) During redoing of a logged operation in database 'WSS_Content_Online', an error occurred at log record ID (15443:9603:9). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database. [SQLSTATE HY000] (Error 3313). The step failed.
For several years we have been pretty confident in our DR strategy, and now this has raised a question. We completely resynced the SAN and redid the test. This time one of the two databases attached and the other did not. So at this point the behavior is inconsistent, and only for these two databases. But if it is a redo log error, it could just as well happen on other servers and other databases.
Has anyone experienced this before? Is there any way to avoid this potentially corrupt state when breaking SAN replication and attaching the databases?
Thanks,
October 24, 2012 at 1:17 pm
Yup, I've seen similar.
The most likely cause is that your SAN replication is not following the rules that SQL Server needs from an IO subsystem (mainly write-order preservation). Check that the replication is certified to work with SQL Server. If it's not, and it's not following the write-order rules, then it's a gamble every time you try to bring a replicated DB up.
Basically, make sure that what you have is certified by Microsoft and the SAN vendor to work correctly with SQL Server databases.
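The LSNs in the error message from the first post hint at why write ordering matters. As a rough illustration only (a simplified model, not SQL Server's actual redo code), the Python sketch below parses the three LSNs from that message and compares them. The function names `parse_lsn` and `classify_redo` are made up for this example, and the rule it encodes (the page's LSN should match the log record's PrevPageLSN before the record is redone) is an assumption about how the redo check behaves.

```python
import re

# An LSN is printed as (VLF sequence : log block offset : slot).
LSN_PATTERN = re.compile(r"\((\d+):(\d+):(\d+)\)")


def parse_lsn(text):
    """Turn a string such as '(15443:9603:9)' into a comparable tuple of ints."""
    match = LSN_PATTERN.search(text)
    if match is None:
        raise ValueError("no LSN found in: " + repr(text))
    return tuple(int(part) for part in match.groups())


def classify_redo(page_lsn, prev_page_lsn, record_lsn):
    """Simplified model of the redo decision for one log record and one page."""
    if page_lsn >= record_lsn:
        return "skip"        # page already contains this change
    if page_lsn == prev_page_lsn:
        return "redo"        # page is in the state the log record expects
    if page_lsn < prev_page_lsn:
        return "stale page"  # an earlier write to the page never reached disk
    return "mismatch"        # page is newer than expected but lacks this change


# Values copied from the error message in the first post.
record_lsn = parse_lsn("(15443:9603:9)")     # log record being redone
page_lsn = parse_lsn("(14474:125:3)")        # "Page: LSN"
prev_page_lsn = parse_lsn("(15135:5416:4)")  # "PrevPageLSN"

print(classify_redo(page_lsn, prev_page_lsn, record_lsn))  # prints "stale page"
```

Under this model, the page on the DR copy is older than the state the log expects, which is consistent with a later log write having reached the replica while an earlier data page write had not.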
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 24, 2012 at 1:27 pm
Thanks Gail.
I will look into write-order preservation for EMC RecoverPoint replication and make sure it follows the correct process so that it does not cause integrity issues.
October 24, 2012 at 2:24 pm
It is not just that. Check and ensure that it is certified for SQL Server and that it supports SQL Server. EMC should have that explicitly stated somewhere in their docs; if it's not, check with their tech people.
Gail Shaw
October 24, 2012 at 2:59 pm
I am on it.
Thanks a lot for your guidance.