September 24, 2007 at 9:07 pm
I have a Table A. There are several triggers on this table. One of them is a trigger which copies the record to Table B on insert and updates the matching record in Table B on update.
I am wondering if it will speed up inserts and updates on table A if I move all the rest of the triggers on Table A to table B.
My gut feeling is that it won't make a difference at all, because the transaction won't end until all the triggers on all the tables have run. Is that right?
September 24, 2007 at 9:33 pm
I can't even imagine a scenario where that would be remotely possible. It's like asking if 3/4 = 4/3. It seems close, but it's not. I'm not trying to be rude or anything, it's just that... well, it's not possible :).
If you post the trigger code, then maybe we can figure out a way to speed them up for you. Also if you tell us why you have so many triggers, then maybe we can figure out a better way to make this process run.
September 24, 2007 at 11:00 pm
Here is the reason why I have that trigger.
In Table A there are a couple of million records. A legacy application needs them there.
There are apps which need to frequently query the latest 50,000 or so records.
One of the things that was done was to add a trigger to Table A to "replicate" the record to Table B. Table B is trimmed every night to get rid of older records. The applications that very frequently query data hit Table B instead of Table A and therefore run much faster. Plus they don't throw locks against Table A.
Why this scheme instead of a materialized view? Because in SQL Server 2000, with our database and our data, insert and update performance on Table A becomes dismal if you create a materialized view from it. Triggers are faster!!
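To illustrate the shape of it (I can't post the real code, so every table and column name here is invented), the "replication" trigger is roughly like this:

-- Rough shape only -- real table and column names are made up.
CREATE TRIGGER trg_TableA_SyncToB ON dbo.TableA
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Update rows that already exist in Table B
    UPDATE b
       SET b.SaleDate = i.SaleDate,
           b.Amount   = i.Amount
      FROM dbo.TableB AS b
      JOIN inserted   AS i ON i.SaleID = b.SaleID;

    -- Copy brand-new rows over to Table B
    INSERT INTO dbo.TableB (SaleID, SaleDate, Amount)
    SELECT i.SaleID, i.SaleDate, i.Amount
      FROM inserted AS i
     WHERE NOT EXISTS (SELECT 1 FROM dbo.TableB AS b WHERE b.SaleID = i.SaleID);
END;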
So in summary.
Table A big. Table B smaller subset. Triggers on Table A constantly insert and update Table B to keep them in sync.
Other triggers on Table A do other processing that is needed (don't want to go into it here but it's a common enough scenario).
My question is will moving all the triggers from A to B (except the "replication" triggers) speed up the application at all?
I don't understand why you say this is not possible. When the record in Table B gets inserted or updated, its triggers will be fired, right?
September 25, 2007 at 12:12 am
I'm thinking that Remi thought you were going to move the triggers that did the insert/update on table B to table B... and that's not what you intend...
Of course, I could be wrong, but if the other triggers that you want to move perform a join of some sort on table A, then what would be the performance benefit of moving them to table B? If they join the Inserted/Deleted tables with some table other than table A, then there will be no performance gain, either.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 25, 2007 at 5:03 am
That makes more sense... let me sleep on this one...
September 25, 2007 at 6:51 pm
Jeff Moden (9/25/2007)
I'm thinking that Remi thought you were going to move the triggers that did the insert/update on table B to table B... and that's not what you intend...Of course, I could be wrong, but if the other triggers that you want to move perform a join of some sort on table A, then what would be the performance benefit of moving them to table B? If they join the Inserted/Deleted tables with some table other than table A, then there will be no performance gain, either.
They contain the same data but Table B is a subset of Table A.
The triggers are only concerned with what happens when a new transaction occurs or an existing transaction gets updated. All of these updates only occur in the subset of the data.
For example let's pretend Table A was a sales table. It contains all sales for the last 90 days. That's a big table.
Every time there is a sale a trigger is run which does some things with the new sale. Along with that the sale is inserted into Table B (the "recent sales" table).
So my question goes like this.
Right now the inserts into the sales table are slowed by all that processing. If I move the processing to the "current sales" table, will it speed up inserts into the sales table?
More generally the question is "when does a transaction end"? When the triggers on Table A are done processing or when all triggers on all tables touched by the insert are done processing.
If the answer is the latter then of course no gain in performance. If the answer is the former then big gain in performance.
September 25, 2007 at 7:17 pm
I see now.
A simple way to do this is something like so: keep the trigger setup as it is right now.
However, create a new table with only one column, the id of the sale. In the trigger, only insert the ids of the inserted rows into that new table. Then move the code that synchronises the tables into its own procedure. All that's left to do is to run a job every x minutes to finish the data synchronisation. Since that work is now done outside the main transaction, the base insert will be somewhat, or even a lot, faster (the trigger still has to scan the inserted table, but that should be much faster than moving all that data around).
The speed difference will depend on how slow the 2nd insert is on that trigger.
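Just to sketch the idea out (table and column names invented, since I don't know your schema):

-- 1. A narrow "queue" table that only holds the ids the trigger touches.
CREATE TABLE dbo.TableB_SyncQueue (SaleID INT NOT NULL);
GO
-- 2. The trigger on Table A now only logs ids -- no heavy data movement.
CREATE TRIGGER trg_TableA_QueueSync ON dbo.TableA
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.TableB_SyncQueue (SaleID)
    SELECT SaleID FROM inserted;
END;
GO
-- 3. A proc, fired from a job every x minutes, does the real synchronisation.
CREATE PROCEDURE dbo.usp_SyncTableB
AS
BEGIN
    SET NOCOUNT ON;

    -- Grab the ids queued so far into a batch, so rows queued while we run
    -- aren't lost when we clean up afterwards.
    SELECT DISTINCT SaleID INTO #Batch FROM dbo.TableB_SyncQueue;

    -- Update rows that already exist in Table B...
    UPDATE b
       SET b.SaleDate = a.SaleDate,
           b.Amount   = a.Amount
      FROM dbo.TableB AS b
      JOIN dbo.TableA AS a ON a.SaleID = b.SaleID
      JOIN #Batch     AS q ON q.SaleID = a.SaleID;

    -- ...and copy over the ones that are new.
    INSERT INTO dbo.TableB (SaleID, SaleDate, Amount)
    SELECT a.SaleID, a.SaleDate, a.Amount
      FROM dbo.TableA AS a
      JOIN #Batch     AS q ON q.SaleID = a.SaleID
     WHERE NOT EXISTS (SELECT 1 FROM dbo.TableB AS b WHERE b.SaleID = a.SaleID);

    -- Only remove the ids we actually processed in this batch.
    DELETE q
      FROM dbo.TableB_SyncQueue AS q
      JOIN #Batch AS b ON b.SaleID = q.SaleID;
END;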
September 25, 2007 at 8:21 pm
More generally the question is "when does a transaction end"? When the triggers on Table A are done processing or when all triggers on all tables touched by the insert are done processing.
Triggers make their own transaction, so-to-speak... that transaction, and therefore, the insert, will end when all of the triggers on Table A are done firing. If those triggers cause other tables' triggers to fire as well, then they are part of the transaction. I'm thinking no performance gain.
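If you want to prove it to yourself, a throwaway test like this (names made up) shows the trigger already running inside the statement's own transaction:

CREATE TABLE dbo.TestA (ID INT);
GO
CREATE TRIGGER trg_TestA ON dbo.TestA
AFTER INSERT
AS
    -- Inside the trigger, the insert's transaction is already open...
    PRINT 'Inside trigger, @@TRANCOUNT = ' + CAST(@@TRANCOUNT AS VARCHAR(10));
GO
INSERT INTO dbo.TestA (ID) VALUES (1);
-- Prints: Inside trigger, @@TRANCOUNT = 1
-- ...and a ROLLBACK issued in the trigger would undo the insert itself,
-- which is another way to see that they share one transaction.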
Remi might be on to something with the separate proc, but lots of folks can't wait 5 minutes.
The key to doing all of this is to make all of the trigger code as fast and scalable as you can... I know... "well, DUH!" The problem is that lots of well-meaning folks fall short in that area.
That, and people tend to over index tables for reporting purposes. We had a multi-million row table that had 20 bloody indexes on it! Talk about slow inserts! We sat down and did some analysis and combined the 20 indexes (many were single column, some were multi-column) into 4 (rather complex) multi-column indexes. Most of the reports increased performance and so did the Inserts.
One of the most important things to remember is... what is the CLUSTERED index based on? If it's something that causes a reordering of data, then it shouldn't be the clustered index on a table that big.
If you want, post some of your trigger code, the schema of the table, and the index creation code... maybe we can help.
--Jeff Moden
Change is inevitable... Change for the better is not.
September 25, 2007 at 8:26 pm
I heard a few good comments at PASS about clustered indexes and reordering. Those don't apply here in this case, but let's just say we have a few beliefs to change about CI ;).
September 25, 2007 at 9:20 pm
Like what?
And, Yes, I know that inserting into the "middle" of a clustered index may not cause data movement right then, but it does take extra time to do the "rechaining"...
--Jeff Moden
Change is inevitable... Change for the better is not.
September 25, 2007 at 9:27 pm
I'm gonna dig to find that slide, but Microsoft ran a test comparing the performance of tables with a CI against the same set of columns stored as heaps.
On every operation (insert, select, delete, update), the CI always won. The smallest margin was on inserts, at only 3%; the largest was on selects, at 23% (or maybe 30%). It turns out that's because the heap version has to do two writes (heap + index) versus only one for the CI. I don't remember how they tested this, but they talked about that reorg myth and apparently it wasn't all that well founded after all (because, at worst, the CI performs the same as heap + index). I can dig out the slide, but they didn't provide the test code for this one, so we may have some work to do to replicate it.
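Since they didn't hand out the test code, a quick-and-dirty harness like this (all names made up, insert case only; the same pattern works for selects, updates and deletes) is the sort of thing we'd have to build to replicate it ourselves:

-- Same columns, two storage choices: heap + nonclustered index vs clustered index.
CREATE TABLE dbo.HeapTest (ID INT NOT NULL, Filler CHAR(200) NOT NULL DEFAULT '');
CREATE NONCLUSTERED INDEX IX_HeapTest_ID ON dbo.HeapTest (ID);

CREATE TABLE dbo.CITest (ID INT NOT NULL, Filler CHAR(200) NOT NULL DEFAULT '');
CREATE CLUSTERED INDEX CIX_CITest_ID ON dbo.CITest (ID);
GO
DECLARE @i INT, @start DATETIME;

-- Time N singleton inserts into the heap...
SET @i = 1;
SET @start = GETDATE();
WHILE @i <= 100000
BEGIN
    INSERT INTO dbo.HeapTest (ID) VALUES (@i);
    SET @i = @i + 1;
END;
PRINT 'Heap + NCI:      ' + CAST(DATEDIFF(ms, @start, GETDATE()) AS VARCHAR(20)) + ' ms';

-- ...and the same N inserts into the clustered table.
SET @i = 1;
SET @start = GETDATE();
WHILE @i <= 100000
BEGIN
    INSERT INTO dbo.CITest (ID) VALUES (@i);
    SET @i = @i + 1;
END;
PRINT 'Clustered index: ' + CAST(DATEDIFF(ms, @start, GETDATE()) AS VARCHAR(20)) + ' ms';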
September 26, 2007 at 4:14 am
Jeff Moden (9/25/2007)
Triggers make their own transaction, so-to-speak... that transaction, and therefore, the insert, will end when all of the triggers on Table A are done firing. If those triggers cause other tables' triggers to fire as well, then they are part of the transaction. I'm thinking no performance gain.
Thanks. That answers my question. It's not the answer I wanted to hear though 🙂
Remi might be on to something with the separate proc, but lots of folks can't wait 5 minutes.
I can't, that's for sure. Furthermore, it would be a heavier hit on the database to do the scan.
That, and people tend to over index tables for reporting purposes. We had a multi-million row table that had 20 bloody indexes on it! Talk about slow inserts! We sat down and did some analysis and combined the 20 indexes (many were single column, some were multi-column) into 4 (rather complex) multi-column indexes. Most of the reports increased performance and so did the Inserts.
That's odd. Everything I have read says it's better not to have compound indexes because the optimizer can deal with separate indexes more efficiently. Do you have an article or something I can read about this strategy?
This table does have a lot of indexes on it but it would take a year to try and track down every application that uses it and try to determine what they all need. It's an old application with years of cruft and this table is the center of the universe for the application.
One of the most important things to remember is... what is the CLUSTERED index based on? If it's something that causes a reordering of data, then it shouldn't be the clustered index on a table that big.
I have clustered the table a couple of ways but nothing really worked all that well. I settled for the one that worked the best.
Clustering is voodoo as far as I am concerned. I have never read so many conflicting articles in my life.
If you want, post some of your trigger code, the schema of the table, and the index creation code... maybe we can help.
Sorry, can't do that. That all answers my question anyway. From the sound of it there isn't a whole lot that can be done. I did some tests with SQL Server 2005 and its materialized view performance is a little better (but highly erratic for some reason).
I guess we throw more hardware at it or do the unthinkable and think about switching to some other database.
September 26, 2007 at 9:30 am
The way I would do it is with partitioning and a view. Though I think you discounted this when you mentioned "materialised views", I was wondering why this is a poor option?
Table "A" contains legacy data
Table "B" contains 'highly queried data'
View "C" is a view on the union of these two tables.
An overnight process moves data from B to A as necessary.
(Naming the view to the name of the original single table to maintain legacy functionality, and naming the two partitioned tables something like _active.)
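To make the idea concrete (all object names below are invented; the real schema isn't posted), and noting that with CHECK constraints on the date column in each table the optimiser can skip the archive table entirely for "recent" queries:

-- Rename the original table and put a view with the old name in its place,
-- so the legacy code keeps working untouched.
EXEC sp_rename 'dbo.Sales', 'Sales_archive';   -- Table "A": legacy data
GO
-- dbo.Sales_active already holds the hot, recent rows (Table "B").
CREATE VIEW dbo.Sales
AS
SELECT SaleID, SaleDate, Amount FROM dbo.Sales_archive
UNION ALL
SELECT SaleID, SaleDate, Amount FROM dbo.Sales_active;
GO
-- Overnight job: move anything older than the "active" window from B to A.
INSERT INTO dbo.Sales_archive (SaleID, SaleDate, Amount)
SELECT SaleID, SaleDate, Amount
  FROM dbo.Sales_active
 WHERE SaleDate < DATEADD(day, -7, GETDATE());

DELETE FROM dbo.Sales_active
 WHERE SaleDate < DATEADD(day, -7, GETDATE());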
Cheers,
Mat.
September 26, 2007 at 9:41 am
Mal Content (9/26/2007)
That's odd. Everything I have read says it's better not to have compound indexes because the optimizer can deal with separate indexes more efficiently. Do you have an article or something I can read about this strategy?
It very much depends on the type of queries that you have. Let's say for example that you have a table with three columns, a, b and c.
If all your queries filter on a or b, then it's good to have separate indexes. A single index on a and b is no good for queries that filter only on b. Imagine, if you will, a phone book with an index on surname first, with firstname as a second column. The index is no good for finding all the people with a first name of 'Matthew' (no use is a bit harsh, but it's not seekable. The best you could get would be an index scan)
If, on the other hand, your queries are of the type a=x and b=y, then you would rather have a single index on a and b. That way, a single index lookup will find the required records. Again, imagine the phone book, but this time there is one index on surname and another on first name. How much work would it be to find all the people named Matthew Black?
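In index terms, the phone book example looks something like this (made-up table, just to illustrate):

CREATE TABLE dbo.PhoneBook (
    Surname   VARCHAR(50),
    FirstName VARCHAR(50),
    Phone     VARCHAR(20)
);

-- Separate single-column indexes: fine when queries filter on surname OR on
-- first name, but finding 'Matthew Black' means two separate lookups.
CREATE INDEX IX_PhoneBook_Surname   ON dbo.PhoneBook (Surname);
CREATE INDEX IX_PhoneBook_FirstName ON dbo.PhoneBook (FirstName);

-- Composite index: one seek answers surname AND first name together, and it
-- still helps queries on surname alone...
CREATE INDEX IX_PhoneBook_Name ON dbo.PhoneBook (Surname, FirstName);

-- Seekable on the composite index:
SELECT Phone FROM dbo.PhoneBook WHERE Surname = 'Black' AND FirstName = 'Matthew';

-- ...but not seekable when the filter is on first name only (leading column missing):
SELECT Phone FROM dbo.PhoneBook WHERE FirstName = 'Matthew';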
Regarding posting code, you don't have to post the exact code. Simplify, change names, that kinda thing. I know that that kinda thing is sensitive, but we're just interested in the structure, not the business sensitive information.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 26, 2007 at 7:42 pm
Matthew Bailie (9/26/2007)
The way I would do it is with partitioning and a view. Though I think you discounted this when you mentioned "materialised views", I was wondering why this is a poor option?
Table "A" contains legacy data
Table "B" contains 'highly queried data'
View "C" is a view on the union of these two tables.
An overnight process moves data from B to A as necessary.
(Naming the view to the name of the original single table to maintain legacy functionality, and naming the two partitioned tables something like _active.)
Cheers,
Mat.
Materialized views were tried but they really slow down the inserts. More than having triggers!
I can't really split the table like you said because it's a legacy application and it would take too much effort to change all the code which is scattered across too many applications.
I guess I am doing almost the same thing by keeping an active transactions table.