October 30, 2020 at 1:16 am
As the title says, what should we pay attention to when optimizing INSERT/DELETE/UPDATE SQL statements? Thanks.
The following points are what I know:
October 30, 2020 at 9:21 am
Sounds right. Largely depends on the goal of insert/update/delete
Don't use big transactions -> big transactions lead to longer locking times and a greater chance of filling up disk space
Perform INSERT/DELETE and UPDATE statements in batches, and keep the batch size from getting too big (see the sketch after this list)
-> Right-size the batch. The benefit is a lower chance of locking (only the batch is affected) and the required disk space is minimized. Too small, and you get too much overhead from selecting the data.
If there is a WHERE clause, make sure the correct index exists on the columns in the WHERE clause
-> When the statement can use an index (sargable predicates), SQL Server can seek to the required data much faster and reduce the chance of locking (compared to a full table scan). Of course, you have to decide when the index is worth it; each index adds overhead.
Sometimes it's better to break a batch into a series of small queries/transactions, using temp tables (with or without indexes) for intermediate results.
Pay attention to the order of queries to avoid deadlocks.
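As an illustration of the batching point above, here is a minimal sketch of a batched DELETE loop. The table and column names (dbo.OrderArchive, OrderDate) are hypothetical, and the batch size and delay are only placeholders to tune for your own workload.

-- Delete old rows in small batches so each transaction stays short
-- and other sessions get a chance to run between batches.
DECLARE @BatchSize int = 5000;     -- tune for your system
DECLARE @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.OrderArchive                                -- hypothetical table
    WHERE OrderDate < DATEADD(YEAR, -2, GETDATE());      -- benefits from an index on OrderDate

    SET @RowsDeleted = @@ROWCOUNT;

    WAITFOR DELAY '00:00:02';                            -- brief pause so other work can run
END;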
October 30, 2020 at 11:31 am
Also, look for hidden issues like triggers.
You can look at execution plans for INSERT/UPDATE/DELETE queries, same as for a SELECT query. Common issues affecting performance can include stuff like foreign keys without indexes slowing things down since those reads are always done when modifying data.
Other than that, you're spotting the common stuff.
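To illustrate the foreign-key point: if a child table references a parent on a column that has no index, DELETEs and key UPDATEs on the parent have to scan the child table to validate the constraint. A minimal sketch, with hypothetical table and column names:

-- dbo.SalesOrder references dbo.Customer(CustomerID) via a foreign key.
-- Without this index, every DELETE on dbo.Customer scans dbo.SalesOrder
-- to check whether any child rows still reference the deleted key.
CREATE INDEX IX_SalesOrder_CustomerID
    ON dbo.SalesOrder (CustomerID);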
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 30, 2020 at 12:08 pm
One thing to look out for is UPDATEs that significantly expand the size of varchar columns because that can cause page splits which can be a real performance issue, both when they occur and for later processing. For example, say you have several varchar(30) to varchar(100) columns that are initially inserted as '' but are later updated to fairly long strings (20+ bytes).
You'll want to take steps to address that or it will very likely cause you some performance issues until it's resolved.
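One possible mitigation (a sketch only, with hypothetical index and table names, and a fill factor you'd have to tune to how much the rows actually grow): rebuild the affected index with a lower fill factor so the pages keep free space for rows to expand in place instead of splitting.

-- Leave roughly 20% free space per page so later UPDATEs that lengthen
-- the varchar columns can expand rows in place rather than splitting pages.
ALTER INDEX PK_CustomerNote
    ON dbo.CustomerNote
    REBUILD WITH (FILLFACTOR = 80);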
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
October 30, 2020 at 1:36 pm
One thing to look out for is UPDATEs that significantly expand the size of varchar columns because that can cause page splits which can be a real performance issue, both when they occur and for later processing. For example, say you have several varchar(30) to varchar(100) columns that are initially inserted as '' but are later updated to fairly long strings (20+ bytes).
You'll want to take steps to address that or it will very likely cause you some performance issues until it's resolved.
Amen to that! I know you know this but for the others that may be reading this...
It doesn't even have to be such significant expansion, although significant expansion can cause a single page to split more than once. One byte of expansion on a narrow index can lead to significant page splitting, because narrow indexes are even worse at page splits than wider indexes simply because they have more rows per page. IIRC, each and every row that has to be moved in a page split has the potential to cause one log file entry, even in the SIMPLE recovery model. All of that is comparatively slow and it's all done within a fully blocking system transaction.
An example of such a horror might be changing (for example) the status of rows in a multi-byte status/reason column. Consider what a change (again, for example) from "P" for "Pending" to "S1" for "Step 1" in some process that affects many rows will do. On the narrower non-clustered indexes that contain that column as a key (and there could be many indexes that contain such a status column), it can easily have to move more than 400 rows to other pages to maintain the correct page order according to the keys of each and every one of those indexes. That won't only cause a huge number of page splits; it will also be accompanied by a huge number of log file entries, all done in a fully blocking system transaction.
Inserts can easily suffer the same problem, so the Insert/Delete method may (and probably will) compound the problem: it increases log file activity, leaves many of the pages the data was deleted from less than half full, and does nothing to prevent page splits on other pages.
A complete fix for such a column actually doesn't exist, although making it fixed-width will help prevent page splits on the much larger Clustered Index, which doesn't have the status column as a key column. For example, even if you make it "non-expansive" by changing the column from a {gasp!} VARCHAR(2) to a CHAR(2), the rows on non-unique non-clustered indexes (and maybe some unique ones... "It Depends") that contain the column as a key will still have to move to other pages to maintain the sort order that SQL Server is compelled to maintain.
And remember... each page of a non-clustered index contains a whole lot more rows than a page of the clustered index because the rows are narrower. That means that page splits on non-clustered indexes can cause many, many more log file entries than splits on a clustered index.
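If you want to see whether an expansive UPDATE is actually splitting pages, one rough way (a sketch, not a complete monitoring approach; the table name is hypothetical) is to compare leaf-level page allocations per index before and after the update, since new leaf-page allocations on an existing index largely correspond to page splits.

-- Leaf-level page allocations per index on the table; on an established
-- index these are mostly page splits. Capture before and after the UPDATE.
SELECT  i.name AS index_name,
        os.leaf_allocation_count,
        os.leaf_update_count
FROM    sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('dbo.OrderStatus'), NULL, NULL) AS os
JOIN    sys.indexes AS i
        ON  i.object_id = os.object_id
        AND i.index_id  = os.index_id;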
--Jeff Moden
Change is inevitable... Change for the better is not.
October 30, 2020 at 1:56 pm
As the title says, what should we pay attention to when optimizing INSERT/DELETE/UPDATE SQL statements? Thanks.
The following points are what I know:
- Don't use big transactions
- Perform INSERT/DELETE and UPDATE statements in batches, and keep the batch size from getting too big
- If there is a WHERE clause, make sure the correct index exists on the columns in the WHERE clause
The "only" (in quotes because it's still of high importance) thing that not using "big transactions" and doing things in smaller batches will do is allow more concurrency but only if you do it right. You may have to wait for a couple of seconds between batches for other things to run. Converting UPDATEs to Insert/Delete is likely not going to help the issue and may compound problems in the log file as I said in the post above.
The key to large numbers of Inserts is to make sure that things like FK columns are correctly indexed on both the PK and the FK tables. You even have to be careful there because the use of indexes on the target table has no benefit to Inserts other than possibly helping Clustered Indexes (if they're so keyed) avoid page splits if they are "ever increasing" according to the data that's being inserted to help avoid "bad" page splits.
The same holds true for Deletes except you don't have to worry about the order of the rows or page splits. You only need to be concerned with all the indexes that are updated (they will all be affected on the table, with the possible exception of some filtered indexes). The only logging that will be done is for the actual rows deleted. This is why it can be useful to do temporal partitioning if the purpose of the Deletes is to purge old data. You can SWITCH out months or years of data in a heartbeat and then simply drop the switched table (although I've found some utility in TRUNCATEing it first... again, "It Depends".
I've also found utility in collecting just the PK information for a table that I want to delete from and then delete based on that. That works especially well for deleting and updating in batches but can also work quite well for single batch operations because you're controlling what would otherwise get spooled internally. Again, "It Depends". Each table is different and "panacea" recommendations my not even come close to working as advertised. "Must look eye". 😀
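A rough sketch of that "collect the keys first, then delete in batches" pattern, with hypothetical table, key, and filter:

-- 1) Collect just the keys of the rows to be removed.
SELECT  OrderID
INTO    #ToDelete
FROM    dbo.OrderArchive                                 -- hypothetical table
WHERE   OrderDate < DATEADD(YEAR, -2, GETDATE());

CREATE CLUSTERED INDEX cix_ToDelete ON #ToDelete (OrderID);

-- 2) Delete in batches, joining back on the collected keys.
DECLARE @Rows int = 1;
WHILE @Rows > 0
BEGIN
    DELETE TOP (5000) oa
    FROM    dbo.OrderArchive AS oa
    JOIN    #ToDelete        AS td
            ON td.OrderID = oa.OrderID;

    SET @Rows = @@ROWCOUNT;
END;

DROP TABLE #ToDelete;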
--Jeff Moden
Change is inevitable... Change for the better is not.
October 30, 2020 at 2:35 pm
As a bit of a sidebar, if you're updating the proverbial "millions of rows", it may be better to disable most of the non-clustered indexes (not the ones that are unique, because they may be driving FKs for another table), do your inserts or deletes, and then rebuild the non-clustered indexes, which you'd probably have to do later on anyway. Of course, that means the table might not be available while each index is being rebuilt. "It Depends", but it is another option to look at.
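A minimal sketch of that disable/rebuild approach, with hypothetical index and table names:

-- Disable a non-clustered index before the large modification...
ALTER INDEX IX_OrderArchive_OrderDate ON dbo.OrderArchive DISABLE;

-- ...run the big INSERT/UPDATE/DELETE here...

-- ...then rebuild afterward (REBUILD also re-enables the index).
ALTER INDEX IX_OrderArchive_OrderDate ON dbo.OrderArchive REBUILD;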
--Jeff Moden
Change is inevitable... Change for the better is not.
October 31, 2020 at 1:51 pm
Thanks, everyone, for your kind and good advice! Thanks so much!