May 19, 2008 at 12:40 pm
Sorry I am not a DBA but I am a fairly knowledgeable developer, so I might not get the exact terms correct.
I am working at a small company that is using transaction replication to make a copy of a transactional database for reporting purposes. Once a day, data from a Planning DB is loaded into the Transaction DB. This is done by calling one stored procedure in the Planning DB. Both DBs are on the same server. The stored procedure figures out what data from several tables it needs to copy into the Transaction DB. The table (let's call it table A) I am concerned with can have 50,000 to 100,000 rows inserted during this process. The new rows replicate to the Report DB in an acceptable amount of time and everyone is currently happy.
Now I was asked to put a trigger on Table A in the Report DB (they do not want triggers slowing down the Transaction DB) that will be used to update a new table (let's call it Table B) in a ReportSupport DB. Table B is sort of like a summary table in that changes in thousands of rows in Table A will cause me to change an amount in one row of Table B. I wrote the trigger to sum up data from multiple rows in the trigger’s insert and delete tables before update Table B. After we put our staging environment we found a problem. I was wrongly under the impression that Multi-row insert in Table A in the Transaction DB would be replication as a Multi-row insert in Table A in the Report DB. While researching this, I found out that the default replication process is to split it up into multiple single-row inserts. This behavior is a problem for us. Instead of summing up let’s say 2,000 rows that were inserted into Table A and inserting one row into Table B with the amount 2,000, the trigger inserts one row into table B with the amount 1 and then makes 1,999 updates to this row until the amount is 2,000. This causes the transaction log to fill up rather quickly. Our 10G Database had its transaction log grow to 30G before we stopped the replication.
I confirmed that our multi-row insert was be split up by placing a simple trigger on table A that just counted the rows in both the insert and delete tables and wrote the information into Count Table I created for debugging. While this debug trigger was in place, I found out that our replication process was replicating Multi-row updates in Table A in the Transaction DB as Multi-row updates in Table A in the Report DB. Someone (nobody here seams to know who) stopped replication from splitting up Multi-row updates.
Now for the last couple of days I have been trying to find out how this was done. The information available online about replication is very simplistic. I have found hints that are many ways to stop the Multi-row I/U/D statements from getting split up, but none of the authors of the basic replication articles take it the next step. The only method I found that could account for what was happening in our databases, was to make a Stored Procedure execution replication. I don’t believe that this is what is happening for my Multi-row updates. The only “Articles” that exist for our “Publication” are table based. Also the “article” properties for our Table A look the same as the “article” properties for a debug table I setup in replication using the replication wizard’s defaults.
Can anyone tell me what to look for in our replication's “Publication” for why the updates are working different than the inserts for our Table A? Does anyone know a good book or web link with the information I need to do this?
Thanks in advance,
Dave
May 20, 2008 at 1:45 am
Yes... your code for the trigger... let's see it...
--Jeff Moden
Change is inevitable... Change for the better is not.
May 20, 2008 at 11:23 am
What do you mean by "Yes", and why do you want to see the trigger. I am on the replication forum asking for information about how to replicate multi-row inserts/updates/deletes without splitting them into single-row inserts/updates/deletes. It is being done for updates on one table setup by people no longer here. So I believe it can be done but we don't know how. A new table I setup in replication myself splits up multi-row updates but I can't see any difference in the replication setup of either table.
Here is the trigger I have put on the replicated table that I am using to test if replication has split up multi-row Insert/Update/Delete statements:
create Trigger [dbo].[trgTable_A]
ON [dbo].[Table_A]
AFTER DELETE, INSERT, UPDATE
AS
begin --trg
declare @When datetime
, @insertedCount bigint
, @deletedCount bigint
, @min-2 int
, @max-2 int
, @Min_Date Datetime
, @Max_Date Datetime
set @When = getDate()
select @insertedCount = count(*)
, @min-2 = Min(Product_Id)
, @max-2 = Max(Product_Id)
, @Min_Date = Min(Last_Mod_Date)
, @Max_Date = Max(Last_Mod_Date)
from inserted
select @deletedCount = count(*)
, @min-2 = isnull(@Min,Min(Product_Id))
, @max-2 = isnull(@Max,Max(Product_Id))
, @Min_Date = isnull(@Min_Date,Min(Last_Mod_Date))
, @Max_Date = isnull(@Max_Date,Max(Last_Mod_Date))
from deleted
if ((@insertedCount = @deletedCount) and (@insertedCount < 100)) return
insert
into Table_A_counts
( [When]
, InsertCount
, DeleteCount
, Min_Product_Id
, Max_Product_Id
, Min_Date
, Max_Date)
values( @When
, @insertedCount
, @deletedCount
, @min-2
, @max-2
, @Min_Date
, @Max_Date)
end --trg
May 20, 2008 at 10:10 pm
By "Yes", I meant yes, I know a good reference... it starts with your code... I wanted to see your trigger to see if I could help.
The trigger you posted is only doing checks (or so it seems). Are they using triggers to do the replication? If so, is it possible to post that? If they're using RBAR, it should be easy to fix to keep the "splits" you're talking about from happening...
--Jeff Moden
Change is inevitable... Change for the better is not.
May 21, 2008 at 6:40 am
We are not using triggers for replication. Forget about the triggers. We are using replication functionality provided by SQL Server 2005. Creating a "Publication" with "Articles" on a "Publisher" which feeds data to a "Subscriber" through a "Distributer".
May 21, 2008 at 8:38 am
While researching this, I found out that the default replication process is to split it up into multiple single-row inserts. This behavior is a problem for us. Instead of summing up let’s say 2,000 rows that were inserted into Table A and inserting one row into Table B with the amount 2,000, the trigger inserts one row into table B with the amount 1 and then makes 1,999 updates to this row until the amount is 2,000. This causes the transaction log to fill up rather quickly.
Ok... but I didn't bring up the triggers first... and I didn't know that "replication" actually made triggers (sorry). If it does and they are RBAR like you say, I don't know how to change that behavior other than rewritting the triggers.
When you talk about the "replication process", you are talking about SQL Server Replication, aren't you?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 21, 2008 at 11:15 am
The triggers I talk about are not part of replication. These triggers are put on a replicated table in the subscriber database to do some business logic after replication is done.
Again, multi-row updates are being replicated into my subscriber table as multi-row updates. Multi-row Inserts are being replicated into my subscriber table as multiple single-row inserts. I can't find out why. Does anyone know where I should look?
May 22, 2008 at 2:45 am
Hi there
Currently I'm also exploring the Replication options within SQL-Server 2005.
Although I don't have the solution to your question, perhaps what I know about the replication process can help you figure it out.
The replication on the Subscriber side actually uses a Stored Procedure that is executed on every record that comes across from the Distributer (in a Push configuration). There are basically 3 stored procedures one for Insert, Update and Delete. (I found out because I actually changed the SP to add additional functionalities.) Unfortunately it doesn't solve my requirement.
My guess it's not possible what you are trying to achieve, because the replication SP's work on single records and not multi recordset.
Anyway hope it helps
Cheers
Dino
May 23, 2008 at 11:30 am
Have you considered creating an indexed view that does the summarization for you? You can replicate the indexed view to your Reporting Server as a table. If a thousand rows of I/U/D activity results in only a single row changing in your indexed view then subsequently only one transaction gets replicated.
Kendal Van Dyke
May 27, 2008 at 7:01 am
No a view won't help. I gave a simplified version of what the trigger is doing as background. Please concentrate on the replication question.
May 27, 2008 at 10:07 am
I believe that this is an option of the published Articles. You will have to change it in each Article's definition, or change the Article defaults and then drop and re-add all of the Articles.
The relevant section in Book OnLine is "Specifying How Changes Are Propagated for Transactional Articles". It is in the SQL Server Replication book, in the following section chain: Implementing Replication > Publishing Data and Database Objects > Article Options for Transactional Replication >
Here is a copy at Technet:http://technet.microsoft.com/en-us/library/ms152489.aspx
and another at MSDN:http://msdn.microsoft.com/en-us/library/ms152489.aspx
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
June 21, 2008 at 5:38 pm
the only you have to do is
put in your trigger NOT FOR REPLICATION what means that what trigger do. not going to replicate.
create Trigger [dbo].[trgTable_A]
ON [dbo].[Table_A] NOT FOR REPLICATION
AFTER DELETE, INSERT, UPDATE
AS
begin --trg
declare @When datetime
, @insertedCount bigint
, @deletedCount bigint
, @min-2 int
, @max-2 int
, @Min_Date Datetime
, @Max_Date Datetime
set @When = getDate()
select @insertedCount = count(*)
, @min-2 = Min(Product_Id)
, @max-2 = Max(Product_Id)
, @Min_Date = Min(Last_Mod_Date)
, @Max_Date = Max(Last_Mod_Date)
from inserted
select @deletedCount = count(*)
, @min-2 = isnull(@Min,Min(Product_Id))
, @max-2 = isnull(@Max,Max(Product_Id))
, @Min_Date = isnull(@Min_Date,Min(Last_Mod_Date))
, @Max_Date = isnull(@Max_Date,Max(Last_Mod_Date))
from deleted
if ((@insertedCount = @deletedCount) and (@insertedCount < 100)) return
insert
into Table_A_counts
( [When]
, InsertCount
, DeleteCount
, Min_Product_Id
, Max_Product_Id
, Min_Date
, Max_Date)
values( @When
, @insertedCount
, @deletedCount
, @min-2
, @max-2
, @Min_Date
, @Max_Date)
end --trg
June 23, 2008 at 7:19 am
In case anyone is trying to do the same thing I have asked about, apparently it can’t be done in SQL server. I haven’t looked at this problem for over two months. From what I can remember about some article I read back then, SQL server will split up multi-row updates when there is a change in primary key. There is some internal technical reason for this. However, I believe that inserts, by definition, have their primary key changed. Therefore, Multi-row inserts are always split up into multiple Single-row inserts.
The last entry posted 6/23/08 is less than useless. Did you even read my post? I WANT the trigger to fire during replication. Just not one row at a time. The entry on 5/27/08 is just pointing to a Microsoft resource I already read. This article came up many times during my internet searches. It is information on how to get replication to use I/U/D sql, auto generated stored procedures, or custom stored procedures to replicate the data when a row is inserted, updated, or deleted in the published database. It says you can even skip replicating the data when a row is inserted, updated, or deleted is the published database. I saw nothing about keeping Multi-row inserts/updates/deletes together during replication.
June 23, 2008 at 9:13 am
dbirchok (5/21/2008)
The triggers I talk about are not part of replication. These triggers are put on a replicated table in the subscriber database to do some business logic after replication is done.Again, multi-row updates are being replicated into my subscriber table as multi-row updates. Multi-row Inserts are being replicated into my subscriber table as multiple single-row inserts. I can't find out why. Does anyone know where I should look?
Let's go step-by-step:
Multi-row updates/inserts/deletes are replicated as "MULTIPLE" single-row U/I/D respectively, by DESIGN.
-Why? --> Because it checks each row on the replica!! If it sends an update it verifies that the row exists, if it deletes it also check that the row existed if it inserts it will verify that the row is not there.
At the point it detects the "data consistency" failure it Stops and you could check exactly what row is not valid.
You can bypass this behaviour replicating stored procedure execution! There are some trade-off with this option but It will do what you want.
* Noel
June 24, 2008 at 7:03 am
I did a lot of research about this and your answers make no sense. Let me go “step-by-step” for you.
1.Your reason for doing one row at a time would be equally valid when updating a table in the original publishing database. We all know you can write multi-row I/U/D statements against SQL server tables.
2.When any one of the many single-row inserts fails, the entire multi-row insert fails replication. Any successful single-row inserts before the failure are rolled back. You can’t “verify that the row is not there” because none of the rows are there.
3.And let’s just say that we could keep the successful rows, you still would have trouble identifying the problem row. For example, 5 rows where updated, 2 made it through replication. You don’t know the order the updates happened so you don’t know which of the last 3 caused the problem. Maybe, in this case, you could look at each of three rows manually. However, multiply these counts by 1000, and that option flies out the window.
4.Multi-row Updates can be replicated as Multi-row updates when the primary key isn’t changed. It is currently happening in our replication process (As I stated in my earlier posts).
5.And before you say it, a primary key change does not guarantee that a row you want to update is still in the replicated table.
6.When using stored procedures (auto generated or custom) to do replication, they have a few fixed set of input parameters to choose from. All are based on processing one row at a time. It seams to me, the only chance to get multi-row replication behavior is to use I/U/D statements for replication. (I could be wrong on this one)
Saying "By DESIGN" is a cop-out. I am sure replication is working as Microsoft has it currently designed. And its design is good enough for most replication. I am just pushing the edge of the envelope. Microsoft has changed designs when enough of a need is there.
I sorry that my posts may be a bit terse. I apologize to the many experts on this site that have helped many people solve their problems. I posted this to see if any of you have the in depth knowledge about my topic. Not to get people who are trying to inflate there stats by regurgitating pat answers from Microsoft’s anemic documentation.
Viewing 15 posts - 1 through 15 (of 17 total)
You must be logged in to reply to this topic. Login to reply