September 28, 2008 at 5:09 pm
Has anyone here ever worked with replication from an Oracle publisher?
I have an oracle publication that's exhibiting terrible performance in places. 36 hours to create the publication of 300 articles, but only 45 min to snapshot all 300 articles, with 100 GB of data. The log reader's timing out every time it tries to connect.
I took some of the queries that the publication creation script was running and tried to run them direct on oracle, and they ran the Oracle box's CPU to 50% and hadn't finished 3 min later. That's for a query that fetches the columns in the pk for a single table. However data queries run lightning fast, even when run using SQL plus from the distributor.
Hence the problem looks very much like an Oracle one. Now I have to convince an Oracle DBA of that.
Does anyone have any suggestions on where I could start looking, or has anyone had similar experiences with Oracle?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 29, 2008 at 5:31 am
Ask your Oracle contact to run the offending query in a session altered as follows...
alter session set events '10046 trace name context forever, level 12'
... then tkprof the trace file and send it to you.
Look at the Wait Events on the tkprof(ed) trace, there is the whole truth about why it takes that long.
By the way, a query like the one you describe shouldn't take long providing you do not have a zillion tables and a tera-zillion columns in your Oracle instance.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.September 29, 2008 at 2:09 pm
Thanks. I'll pass that on.
There are 500 odd tables. None of the tables have hundreds of columns. Interesting, if the DBA added (to the query in sqlplus) the equivalent of the fastfirstrow hint, the query returned instantly. Of course that doesn't really7 help seeing as I have no plans to go and rewrite the replication wizard.
The Oracle guru is 'stumped' and is logging a case with Oracle directly.
We're still trying to dig out what query the log reader is attempting to run. Whatever it is is taking more than 30 min. The only clue I have is that the log reader finally fails with an error
"Heterogeneous Logreader encountered an error in call to LoadReplCmds when processing state 'SYNC_INIT'" (from memory, so not 100% exact)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 30, 2008 at 1:40 am
I see.
Not sure how critical is the issue for you but better to get the Oracle Service Request upgraded to either sev=1 or sev=2 - in my experience an approaching drop dead deadline for production deployment is enough to get a sev=2; also talk to the shift supervisor and get it assigned to an engineer located on your own timezone.
Last but not least, there is an obscure reference to an issue close to the one your are looking at; I'm pretty sure you have already bumped into it but just in case here is the url... http://support.microsoft.com/kb/922767
Good luck
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.September 30, 2008 at 2:54 am
PaulB (9/30/2008)
I see.Not sure how critical is the issue for you but better to get the Oracle Service Request upgraded to either sev=1 or sev=2 - in my experience an approaching drop dead deadline for production deployment is enough to get a sev=2; also talk to the shift supervisor and get it assigned to an engineer located on your own timezone.
For me, it's not critical at all, for my client I think it is. I'll ask him to get on to his DBA (outsourced) and suggest this.
Last but not least, there is an obscure reference to an issue close to the one your are looking at; I'm pretty sure you have already bumped into it but just in case here is the url... http://support.microsoft.com/kb/922767
I saw that and checked, it's not the case here. I did try changing the mssqlversion value, to see if it was, changing it caused an immediate failure, so it's not that. I wish it was, that would be easy.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 1, 2008 at 1:56 pm
Apparently the issue was related to bug no. 6646512, whatever that is. 😉
The creation of the publication now runs in a reasonable time, but the log reader's still timing out. I'm hoping the recreation of the publication will fix matters.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 2, 2008 at 9:03 am
the plot thickens!... there is not such a thing as Bug #6646512 documented on Oracle's Metalink, interesting.
Glad it working now.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.October 2, 2008 at 9:15 am
PaulB (10/2/2008)
the plot thickens!... there is not such a thing as Bug #6646512 documented on Oracle's Metalink, interesting.
<shrug> I got that info 3rd hand. I wouldn't be surprised if there was a miscommunication along the way
Glad it working now.
It's not fully. The slow creation of the publication was an irritation. Unless I can get the log reader running, I can't get the snapshot validated and the replication won't run.
Still 50% fixed is better than it was yesterday. Hope this problem's as easy to resolve.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply