January 9, 2015 at 12:44 pm
We have a process that loads a large file (almost 10 million records). It used to take a few hours, anywhere between 4 - 8. Recently the file load takes multiple days to complete. But not always. Sometimes it loads in about 8-10 hours.
About 6 months ago, we started mirroring the destination database to another server for DR purposes. When we reached out to the vendor, they told us that the mirroring was the likely cause of the long delay. Neither I nor my boss like that answer, but in the spirit of dotting Is and crossing Ts, I'm putting the question out here.
Can mirroring a database truly cause data imports (on the principal) to perform so horribly?
January 9, 2015 at 2:23 pm
It can. Synchronous mirroring requires a 2-phase commit to take place which could introduce delays on the principal. A transaction comes into the principal, is sent to the mirror across the LAN/WAN, commits on the mirror, sends the acknowledgement back to the principal, and commits on the principal.
Depending on the distance between (network latency) for the principal and mirror as well as the hardware involved, the mirror server can cause delays to the principal.
Some examples:
Large network latency
Less CPU, memory, IO subsystem resources on the mirror server.
Using the same NIC for user transactions and mirroring activity for a busy system.
Trace flag 1485 can be used on the mirror to allow for more restore threads but if the IO system is the bottleneck, it won't help. From this article (page 84):
Database mirroring uses parallel threads to restore the transaction log records that were shipped to the mirror instance. By default it uses one restore thread for every 4 CPU threads. In our large SAP ERP encryption example, this still amounted to 16 threads encrypting on the principal, but only 2 threads were restoring on the mirror side.
If your system encounters an imbalance like the one we encountered, youshould consider using trace flag 1485 on the mirror instance.
Finally, a couple more articles related to database mirroring performance:
January 12, 2015 at 5:25 am
Ah, I forgot about that little part of it. And of course the person who set it up did set it up as synchronous instead of asynchronous.
This is going to be a fun conversation with our corporate IT and our business users.
January 12, 2015 at 5:38 am
I can find all sorts of references about switching the safety mode in the GUI, but I only found one reference to using t-sql and it refers to the PARTNER statement. Since we only have mirroring in Production, I can't test the below t-sql. Would this be the correct code for swapping from FULL to OFF and OFF to FULL?
ALTER DATABASE database_name
SET SAFETY OFF;
GO
ALTER DATABASE database_name
SET SAFETY FULL;
GO
January 12, 2015 at 6:06 am
No.
http://msdn.microsoft.com/en-us/library/ms189061.aspx
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 12, 2015 at 6:08 am
GilaMonster (1/12/2015)
No.
Thank you, Gail.
January 12, 2015 at 6:37 am
Brandie, just to confirm, you have enterprise edition to enable the use of asynchronous? If you can tolerate some data loss, would log-shipping be an alternative?
to provide evidence mirroring is your root cause, Have you determined database mirroring is your top wait type and the latency mirroring is introducing?
http://msdn.microsoft.com/en-us/library/ms190030(v=sql.105).aspx
---------------------------------------------------------------------
January 12, 2015 at 6:42 am
george sibbald (1/12/2015)
Brandie, just to confirm, you have enterprise edition to enable the use of asynchronous? If you can tolerate some data loss, would log-shipping be an alternative?
No. We cannot tolerate data loss on this application (or only the minimum amount possible). Log shipping is not an alternative.
to provide evidence mirroring is your root cause, Have you determined database mirroring is your top wait type and the latency mirroring is introducing?
Actually, I am going to test the mirroring with the next file (which we got this weekend), by turning the safety off during the load, then turning it back to FULL when the load is done. If the file loads faster (it has a few thousand more records then the file I just loaded), then I know for sure it is the mirroring causing the issue.
if it doesn't load faster, then I can go back to the vendor and say "See? I tried. Now fix it."
January 12, 2015 at 6:45 am
There could also be some effects from the write latency if the log file on the mirror is slow.
When you have synchronous mirroring, the minimum duration for a transaction is roughly 2*latency between principal & mirror + write duration to mirror's log drive.
One not-immediately-obvious consequence of this is that it's often way better to wrap multiple inserts in a transaction than to do lots of small transactions. This is even true on a standalone, but the effects on a mirrored server can be far more significant.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 12, 2015 at 6:52 am
GilaMonster (1/12/2015)
There could also be some effects from the write latency if the log file on the mirror is slow.When you have synchronous mirroring, the minimum duration for a transaction is roughly 2*latency between principal & mirror + write duration to mirror's log drive.
One not-immediately-obvious consequence of this is that it's often way better to wrap multiple inserts in a transaction than to do lots of small transactions. This is even true on a standalone, but the effects on a mirrored server can be far more significant.
If it were only 2*latency, we wouldn't be concerned, though. Whatever issue is causing the problem, we've ballooned up to 15* the normal load time.
Something is seriously wrong. I just hope that we can find an easy work-around or an easy fix. (Preferably both).
January 12, 2015 at 7:01 am
2*network latency *per transaction*, which results in locks being held longer, which can cause blocking problems, etc, etc, etc.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 12, 2015 at 9:07 am
And the verdict is in. It was the mirroring after all. Specifically, it was synchronous mode. When I set it to asnychronous, it only took 2 and 1/2 hours to complete.
So we need to update the job to switch SAFETY OFF before the process kicks off, then switch it back to FULL when we're done.
YAY! EASY FIX!
January 12, 2015 at 9:20 am
thanks, worth knowing.
---------------------------------------------------------------------
January 12, 2015 at 9:48 am
Brandie, just curious, when you are importing the file how is it inserted into the table(s)? Does the data go in sequentially on the clustered index? What I am wondering about is page splits and fragmentation.
January 12, 2015 at 10:48 am
Lynn Pettis (1/12/2015)
Brandie, just curious, when you are importing the file how is it inserted into the table(s)? Does the data go in sequentially on the clustered index? What I am wondering about is page splits and fragmentation.
I would love to tell you. Except this process is a vendor process and everything works through SQL jobs and .bat files that call encrypted stored procedures. All I know is the file is pulled into a staging table and then checked against existing records with at two different verifications being run.
Beyond that, I'm clueless. The vendor would not let me get a look at the stuff under the hood, which frustrated my boss and I to no end because it meant we couldn't check for optimizations.
On the other hand, I doubt it's an index issue as we reorged and rebuilt (depending on fragmentation levels) the indexes for other purposes between files and this did not make a difference. Also, table stats are updated every morning based on a vendor script.
Viewing 15 posts - 1 through 15 (of 18 total)
You must be logged in to reply to this topic. Login to reply