December 16, 2011 at 6:27 am
First of all, I am terribly sorry that I am not allowed to paste any code from the project I'm working on here, but I'll ensure that I make my case clearly understood.
We have a central table, lets call this Hub, that has more than a billion records. Around this table, there are some ten Spoke tables each having a foreign key to the primary key in the Hub table. These Spoke tables can have anything from a few hundred to billions of records. The row size of the records in each table is approximately 100 bytes, with a composite primary key of eight or twelve bytes.
Now, what I am implementing is a routine which copies data from these tables to other tables for archival. Most commonly this is followed by a delete for the archived data. Deleting from all the Spoke tables is performing just fine, but deleting from the Hub table takes ages when we delete a substantial amount of records. I've identified that the delete is causing clustered index scan on the Spoke tables to ensure that referential integrity is maintained. I'm currently creating indexes on the columns used for foreign key to the Hub table to speed up deletion, but is there anything else I can do?
Unfortunately, statistics cannot be updated within a transaction, otherwise I would have tried to update the statistics on the Spoke tables to see if that improved the performance.
December 16, 2011 at 7:08 am
I think you're already doing the right thing, adding indexes to support the lookups required for referential integrity. The only other thing you could do is drop the RI while you delete and then add it back after you're done, but... I suspect that would be extremely painful because you're going to want to add it back with a CHECK operation and that's going to be costly.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 16, 2011 at 7:17 am
Thanks a lot. Dropping RI would be, as you said, a pain when I had to implement it back, and most likely be guaranteed to include the same clustered index scans that I try to get rid of. Feels kind of strange to have more performance issues with deletes than inserts.
December 16, 2011 at 7:23 am
When I'm teaching reading execution plans, my favorite moment is when I show a delete query which then hits 15 tables doing RI lookups and I ask people, "So, still think your database does more data modification than it does data reads?"
Indexes in support of these lookup operations are your best bet. But, you are possibly going to be dealing with stats issues too as the data gets deleted. If you're doing this in batches, you might want to plan for a stats update in between each batch. Experiment with it though. I'm not sure it'll be needed.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply