July 18, 2011 at 10:08 pm
SERVER CONFIG: SQL 2008 Enterprise, 64-bit, SP2, latest updates, 4 CPU, 16GB RAM, RAID 1+0
I've got many queries like the snippet below and they all seem to run fine for the most part, but this one in particular results in 50-60 CXPACKET waits (it is deleting 2-3 million rows of data) and has been running for over 5 hours...
I know query hints are frowned upon because of how they can significantly impact the query/execution plan, but I can't figure out how to speed up the delete process. Before beginning the process I dropped all the non-clustered indexes in an effort to eliminate the overhead of maintaining them during the deletes...so I guess my question is: can MAXDOP be used in a DELETE statement?
Is the T-SQL really that poorly written? I mean, it works quite well for other tables that I'm running it against (with a similar number of records in each table)...but this one goes out to lunch?
I don't know what else to do.
Process Info:
spid | DB | State | Command | Application | Wait Time | Type | Resource | Blocked By
63 | F1Settings | SUSPENDED | DELETE | SQLAgent - TSQL JobStep (Job 0xC5D2FE96AF12B24B8FF0E9FC3DAADE2A : Step 1) | 2372804 | CXPACKET | exchangeEvent id=Pipe183024380 WaitType=e_waitPipeGetRow nodeId=563 |
Snippet of code causing the wait type:
--DECLARE VARIABLES
BEGIN
DECLARE @ArchiveDate varchar(10)
DECLARE @BatchSize int
DECLARE @RowCount int
DECLARE @LoopCount int
DECLARE @Error int
DECLARE @Err int
DECLARE @ErrMsg varchar(250)
DECLARE @Env varchar(65)
DECLARE @EmailBody nvarchar(max)
DECLARE @EmailSubject nvarchar(140)
END
--SET VARIABLES
BEGIN
SELECT @RowCount = 0, @LoopCount = 0, @BatchSize = 50000,
@ArchiveDate = CONVERT(varchar(50), DATEADD(mm, -6, GETDATE()) , 101), @Env = @@SERVERNAME
END
/* ######################################### START MAIN PROCEDURE HERE ########################################## */
BEGIN
BEGIN TRY
/* Archive the data */
INSERT INTO MyDB_ARCHIVE.dbo.DoDetailTxns
SELECT *, GETDATE() FROM MyDB.dbo.DoDetailTxns
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader)
WHILE @BatchSize <> 0
BEGIN
/* Delete the data */
DELETE TOP (@BatchSize) FROM MyDB.dbo.DoDetailTxns
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader)
SELECT @Error = @@ERROR, @BatchSize = @@ROWCOUNT
SELECT @RowCount = @RowCount + @BatchSize, @LoopCount = @LoopCount + 1
IF @LoopCount = 2
BEGIN
CHECKPOINT
PRINT('CHECKPOINT REACHED (' + CAST(@RowCount as varchar(25)) + ' rows deleted)')
SET @LoopCount = 0
END
PRINT ('Records Deleted: ' + CAST(@BatchSize as varchar(25)) + ', Total Count: ' + CAST(@RowCount as varchar(25)))
END
/* Audit Records Pruned for the day's run */
IF @RowCount > 0
BEGIN
INSERT INTO F1Settings.dbo.CIPruningHistory
SELECT GETDATE(), 'DoDetailTxns', @RowCount
END
END TRY
BEGIN CATCH
SELECT @Err = ERROR_NUMBER(), @ErrMsg = ERROR_MESSAGE()
BEGIN
SET @EmailSubject = '' + @Env + ' :: DAILY DATA PRUNING FAILURE (CI) - dbo.DoDetailTxns : '+ CONVERT(VARCHAR(19),GETDATE(),121)
SET @EmailBody =
'********************************************************************** ' + CHAR(13)
+ 'ERROR - ' + @ErrMsg + CHAR(13)
+ '**********************************************************************' + CHAR(13)
/* Send Email */
EXECUTE msdb.dbo.sp_send_dbmail @recipients = 'dba_alerts@mycompany.com'
,@subject=@EmailSubject
,@body = @EmailBody
,@body_format = 'TEXT'
,@sensitivity = 'Confidential'
,@importance='High'
END
END CATCH
END
/* ########################################## END MAIN PROCEDURE HERE ########################################### */
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience
July 19, 2011 at 5:16 pm
Hi
DBA hat on
The parallel packets, to me, just cause waits 🙂 . Have a look at sys.dm_os_wait_stats ordered by wait_time_ms desc.
If CXPACKET is in your top 10 waits I'd stop using parallelism altogether: sp_configure 'max degree', 0, reconfigure.
That would be because your system is spending more time trying to break the work down into smaller parallel packets than doing the work, and you end up waiting for that.
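A quick way to see that (minimal sketch; the counters accumulate since the last restart or since the wait stats were cleared):
SELECT TOP (10) wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC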
After 5 hours you are most likely going to get a call soon saying that the transaction log is full; then the server falls over if there is no more space on the drive, then you have to add more files and wait for a rollback... just a dirty business.
Dev hat on.
What I have done before with troublesome audit tables for webmethod calls which had plenty of data.
Don't handle your own transactions. This can be done if you know the workload of the system like the back of your hand, or do this after hours.
Be smart about it: calculate the row size and work out how many rows will fit in the log at a decent size, say 1-5GB or a percentage of your daily IO.
Then use that size to determine your batch sizes. If you delete sequential rows you can also play with the number of rows per page and use that as a batch size, and you can test the PAGLOCK hint when deleting; I've used this before.
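For example, something along these lines gives a rough starting point; the 1GB log target is just an illustrative assumption, and delete logging includes more than the row image, so treat the estimate as a ceiling:
-- Rough batch-size estimate from the clustered index's average row size (sketch only)
DECLARE @TargetLogBytes bigint
SET @TargetLogBytes = 1073741824 -- ~1GB of log per batch (assumed target)
SELECT avg_record_size_in_bytes,
       @TargetLogBytes / NULLIF(avg_record_size_in_bytes, 0) AS EstimatedRowsPerBatch
FROM sys.dm_db_index_physical_stats(DB_ID('MyDB'), OBJECT_ID('MyDB.dbo.DoDetailTxns'), 1, NULL, 'SAMPLED')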
Then if there is a clustered index you are going to pay a bit of a price on those deletes, and a bigger price if there are nonclustered indexes, because those indexes need to be kept up to date; you can spot the cost in the query plan.
Also make sure that the column you are deleting by is indexed.
If you want to rewrite the process, look at options like a swap-out table, or partitioning if you want to be more complex.
If you are keeping fewer rows than you are deleting, port the rows you keep into a new table and swap it out with sp_rename (see the sketch below); this is not a great way for a 24/7 system or OLTP where there might be FK/PK constraints.
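A minimal sketch of that pattern, assuming the archive header identifies the rows to drop; the _Keep/_Old names are made up, indexes, constraints and permissions have to be recreated on the new table, and it needs an outage window:
USE MyDB
GO
-- Copy only the rows to keep (minimally logged under SIMPLE/BULK_LOGGED recovery)
SELECT *
INTO dbo.DoDetailTxns_Keep
FROM dbo.DoDetailTxns
WHERE StatementID NOT IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader)
GO
-- Swap the tables; drop the old one once the new one checks out
EXEC sp_rename 'dbo.DoDetailTxns', 'DoDetailTxns_Old'
EXEC sp_rename 'dbo.DoDetailTxns_Keep', 'DoDetailTxns'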
Test how long it takes you to disable the indexes and rebuild them afterwards versus the time spent waiting for the delete.
In many cases it's faster to disable and delete for big tables.
A rebuild can use IO on tempdb when sorting the index there, you can use MAXDOP efficiently for rebuilds as well, and you can even do it online if you have Enterprise Edition.
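Roughly like this, with a hypothetical index name (ONLINE = ON needs Enterprise Edition, which you have):
-- Disable a nonclustered index before the purge (hypothetical index name)
ALTER INDEX IX_DoDetailTxns_StatementID ON MyDB.dbo.DoDetailTxns DISABLE
-- ...run the batched delete...
-- Rebuild it afterwards, sorting in tempdb and capping parallelism
ALTER INDEX IX_DoDetailTxns_StatementID ON MyDB.dbo.DoDetailTxns
REBUILD WITH (SORT_IN_TEMPDB = ON, ONLINE = ON, MAXDOP = 4)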
There are many ways to skin this cat,
However, start by thinking about what you are doing in a methodical way, with a nice cup of coffee, a big screen and Ctrl+L.
Cheers
Jannie
July 19, 2011 at 5:30 pm
So, the questions: Yes, you can apply MAXDOP to any DML statement.
I disagree with the above. Don't adjust MAXDOP server-wide; adjust the cost threshold to initiate parallelism (5 is way too low in my opinion) and selectively turn it down (or set it to 1) for certain procedures.
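Something like this, where 50 is only an illustrative value, not a recommendation for this particular server:
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'cost threshold for parallelism', 50
RECONFIGURE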
CXPACKET waits are not a problem, really, unless they're just spinning out of control. When you go parallel you will almost always end up with CXPACKET waits. It's the nature of the beast; nothing splits perfectly. However, unless they're unreasonable, I'd not be as concerned about the symptom, but look to the cause.
It sounds like you've got a concurrency or pressure issue. 5000 rows is where row/page locks want to escalate to a table lock. Now, it'll still keep going with row/page locks if it can't attain the table lock, but it'll try.
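If you want to see whether the running delete has actually escalated, a quick look at the lock DMV for that spid will tell you (63 is the spid from the Process Info above):
SELECT resource_type, request_mode, request_status, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 63
GROUP BY resource_type, request_mode, request_status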
I would personally look for what's NOT getting a CXPACKET wait, which is probably what's delaying the rest, and look into that blockage. My guess is you're going to see PAGEIOLATCH or something equivalently high. That's your real blockage.
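For example (sketch only, again using spid 63 from above), check what the individual threads of that request are waiting on once you filter out the CXPACKET noise:
SELECT session_id, exec_context_id, wait_type, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id = 63
  AND wait_type <> 'CXPACKET'
ORDER BY wait_duration_ms DESC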
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
July 19, 2011 at 5:32 pm
Never EVER put MAXDOP 0 or 1 at the server level unless you're damn sure that's the best option for ALL normal queries. Since we're working on 1 query here I really don't see the need for this. Moreover, if this is a massive query, using more CPUs will speed it up, not slow it down.
CXPACKET is also a side effect of bad stats, which can be quite likely after a big load.
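If that's the case here, refreshing the stats on the two tables involved before re-running is a cheap thing to try (FULLSCAN is the heavy-handed option; a sampled update may be enough):
USE MyDB
UPDATE STATISTICS dbo.DoDetailTxns WITH FULLSCAN
USE MyDB_ARCHIVE
UPDATE STATISTICS dbo.StatementHeader WITH FULLSCAN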
I don't have time to debug this with you at the moment, but I'll be hanging around in case something else comes up.
July 19, 2011 at 5:35 pm
Craig Farrell (7/19/2011)
So, the questions: Yes, you can apply MAXDOP to any DML statement. I disagree with the above. Don't adjust MAXDOP server-wide; adjust the cost threshold to initiate parallelism (5 is way too low in my opinion) and selectively turn it down (or set it to 1) for certain procedures.
Too early for that too. You don't even know if this is an OLTP or OLAP server or mixed. I agree with this as a "general best-ish practice" but this is definitely not going to solve this particular issue.
If it's been running for 5+ hours then the cost is way the heck over 5 or 35 for that matter. So parallelism is actually required for this query, not something to avoid.
July 19, 2011 at 5:38 pm
Ai
Sorry guys, typo there: sp_configure 'max degree', 1.
The default is 0, which is not always the best.
July 19, 2011 at 6:07 pm
Ninja's_RGR'us (7/19/2011)
Craig Farrell (7/19/2011)
So, the questions: Yes, you can apply MAXDOP to any DML statement. I disagree with the above. Don't adjust MAXDOP server-wide; adjust the cost threshold to initiate parallelism (5 is way too low in my opinion) and selectively turn it down (or set it to 1) for certain procedures.
Too early for that too. You don't even know if this is an OLTP or OLAP server or mixed. I agree with this as a "general best-ish practice" but this is definitely not going to solve this particular issue.
If it's been running for 5+ hours then the cost is way the heck over 5 or 35 for that matter. So parallelism is actually required for this query, not something to avoid.
Heh, sorry Remi. You're right, but I phrased it poorly. I was more trying to point out a better practice, not recommending it for this case. If you'll note the rest of my post, I recommend he look for the true bottleneck, not reduce the cxpacket waits via maxdop.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
July 19, 2011 at 6:30 pm
Jannie-186227 (7/19/2011)
Ai, sorry guys, typo there: sp_configure 'max degree', 1.
The default is 0, which is not always the best.
Maybe not, but it certainly is 98% of the time, so I won't go down that route for 1 rogue query.
July 19, 2011 at 6:31 pm
Craig Farrell (7/19/2011)
Ninja's_RGR'us (7/19/2011)
Craig Farrell (7/19/2011)
So, the questions: Yes, you can apply MAXDOP to any DML statement. I disagree with the above. Don't adjust MAXDOP server-wide; adjust the cost threshold to initiate parallelism (5 is way too low in my opinion) and selectively turn it down (or set it to 1) for certain procedures.
Too early for that too. You don't even know if this is an OLTP or OLAP server or mixed. I agree with this as a "general best-ish practice" but this is definitely not going to solve this particular issue.
If it's been running for 5+ hours then the cost is way the heck over 5 or 35 for that matter. So parallelism is actually required for this query, not something to avoid.
Heh, sorry Remi. You're right, but I phrased it poorly. I was more trying to point out a better practice, not recommending it for this case. If you'll note the rest of my post, I recommend he look for the true bottleneck, not reduce the cxpacket waits via maxdop.
Maybe I wasn't clear either. I completely agreed with the rest of your ideas, just not that one for this particular case 😉 :hehe:
July 19, 2011 at 7:38 pm
I'd be interested to see the query plan for the DELETE; without knowing what the clustered index on that table is, I have no way to know whether it is on StatementID or not.
More generally, the best way to archive data is to use a partitioning scheme, whether that is Enterprise Edition partitioning or local partitioned views on other editions. I would encourage you to read the Data Loading Performance Guide. Metadata-only and minimally-logged operations are your friends when doing this type of work.
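For illustration only, since this table isn't partitioned today: with a partition scheme in place, a whole partition of old rows can be switched out to an identically-structured staging table as a metadata-only operation (the partition number and staging table name here are made up):
-- Metadata-only: moves the partition's rows out of the live table instantly
ALTER TABLE MyDB.dbo.DoDetailTxns
SWITCH PARTITION 3 TO MyDB.dbo.DoDetailTxns_Staging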
Forget about parallelism - almost no-one understands what CXPACKET really means. Concentrate on achieving an efficient plan, and finding a way to use the techniques shown in the White Paper referenced above.
July 19, 2011 at 8:44 pm
SQLkiwi (7/19/2011)
Forget about parallelism - almost no-one understands what CXPACKET really means. Concentrate on achieving an efficient plan, and finding a way to use the techniques shown in the White Paper referenced above.
I'm sorry what white paper are you referring to?
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
July 19, 2011 at 9:20 pm
Sorry, I didn't clarify a few things that obviously were important, so let me clarify those four things:
1) I dropped all non-clustered indexes but kept the clustered index (which I just realized is not on StatementID, go figure) because I thought the sub-query depended on it. I also wanted to point out that the clustered index only has 48,348 pages and is about 377MB with 8% fragmentation (nothing extreme there). 2) I didn't mean using MAXDOP server-wide; I was only asking about using it for the actual DELETE statement within the stored procedure itself. For instance:
DELETE TOP (@BatchSize) FROM MyDB.dbo.DoDetailTxns
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader) OPTION(MAXDOP 1)
And 3) before running these queries I performed a full backup, stopped any applications/processes connecting to the database, took a transaction log backup, then switched the DB into simple recovery mode (so I could avoid the log file growth issues expected with such DELETE operations).
And 4) this is a mixed server of OLAP and OLTP...
Some great comments from everyone (much appreciated too!)
I will definitely evaluate the query plan and try the PAGLOCK hint on the DELETE statement (that one slipped my mind)...thanks for this suggestion!
I guess, going with what Craig suggested, a good thing to try would be to limit the batch size to 5000...certainly smaller batches "should" mean less overhead and less impact on tempdb and on the slam to the clustered index.
I REALLY appreciate everyone's input on this - looking forward to nixing this issue and moving forward with the rest of the workload my boss has placed on my plate!
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience
July 19, 2011 at 9:28 pm
Wow.
Such a simplistic approach, and the never-ending procedure completed in 7 minutes and 40 seconds. I added the MAXDOP and PAGLOCK hints and reduced the batch size to 5000...
SELECT *, GETDATE() FROM MyDB.dbo.DoDetailTxns WITH(READUNCOMMITTED)
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader WITH(READUNCOMMITTED))
OPTION (MAXDOP 2)
DELETE TOP (@BatchSize) FROM MyDB.dbo.DoDetailTxns WITH(PAGLOCK)
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader WITH(READUNCOMMITTED))
OPTION (MAXDOP 2)
______________________________________________________________________________
Never argue with an idiot; they'll drag you down to their level and beat you with experience
July 19, 2011 at 9:44 pm
Welsh Corgi (7/19/2011)
I'm sorry what white paper are you referring to?
The Data Loading Performance Guide - it's an in-line link in my previous post.
January 11, 2013 at 4:08 am
For faster deletes, have you tried repeating the FROM piece?
For example, in your original post, change this:
>>>>>>>>>>>>>>>>>>>
/* Delete the data */
DELETE TOP (@BatchSize) FROM MyDB.dbo.DoDetailTxns
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader)
>>>>>>>>>>>>>>>>>>>
to this:
>>>>>>>>>>>>>>>>>>>
/* Delete the data */
DELETE TOP (@BatchSize) FROM MyDB.dbo.DoDetailTxns FROM MyDB.dbo.DoDetailTxns
WHERE StatementID IN (SELECT StatementID FROM MyDB_ARCHIVE.dbo.StatementHeader)
>>>>>>>>>>>>>>>>>>>
It works wonders on some of my deletes.
(By the way, when I have to delete millions of rows from a table I usually do it in batches of 10,000 or 100,000.)
Regards, Dave.