June 28, 2012 at 7:44 am
Hi folks,
I am deleting records from a table that has around 2 million records. I am deleting in batches of 2,000 records at a time.
The server is SQL Server Standard Edition with 14 GB of RAM and 4 Intel 2.67 GHz processors.
Deleting the records is taking more than 6 hours.
I think that is too long. How can I reduce the time it takes?
Here is the query I am using to delete the records.
DECLARE @delcnt INT
DECLARE @VARROWCOUNT INT
SET @VARROWCOUNT = 0

-- count how many rows in BaseData2 match the incremental table
SELECT @VARROWCOUNT = COUNT(*)
FROM BaseData2 a
JOIN dbo.IncrementalData b
    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
OPTION (MAXDOP 1)

SET @delcnt = 0

-- delete the matched rows 1000 at a time until the counted total is covered
WHILE @VARROWCOUNT > 0
BEGIN
    DELETE TOP (1000) a
    FROM BaseData2 a
    JOIN dbo.IncrementalData b
        ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
    OPTION (MAXDOP 1)

    SET @delcnt = @delcnt + @@ROWCOUNT
    SET @VARROWCOUNT = @VARROWCOUNT - 1000
END

-- record the total number of rows deleted
INSERT dbo.ETLAudit (Inserted_RecordCount, InsertedDate, Deleted_RecordCount, Error_RecordCount, Select_Record_Count)
VALUES ('', CONVERT(VARCHAR(8), GETDATE(), 1), @delcnt, '', '')
This is called from inside a package.
Your suggestions would be highly appreciated!
June 28, 2012 at 9:52 am
The first thing I can think of is that you are joining the huge table over and over. Why?
Jared
CE - Microsoft
June 28, 2012 at 10:14 am
I don't expect this to be a major improvement but you never know. Instead of checking the count you can just check to see if something exists:
WHILE EXISTS (select 1
from BaseData2 a
join dbo.IncrementalData b
on a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY)
BEGIN
Are BaseData2.DW_INSTALL_BASE_KEY and IncrementalData.DW_INSTALL_BASE_KEY indexed? Increasing the MAXDOP might also help, but I'm assuming the hint is there so the delete doesn't affect the performance of everything else running on the system. Is the package running on the server where the DB is hosted? If not, you may want to consider creating a stored procedure and having the package call that instead. I've seen testing where that gave a performance increase because there was less communication between the server and the client.
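If those key columns aren't indexed, something along these lines might help (just a sketch; the index names here are made up, and you'd want to check what already exists first):

CREATE NONCLUSTERED INDEX IX_BaseData2_DW_INSTALL_BASE_KEY
    ON dbo.BaseData2 (DW_INSTALL_BASE_KEY)

CREATE NONCLUSTERED INDEX IX_IncrementalData_DW_INSTALL_BASE_KEY
    ON dbo.IncrementalData (DW_INSTALL_BASE_KEY)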
This isn't a performance thing, but you may want to consider committing each batch of deletes in its own transaction. That allows the transaction log to clear out while the process runs, and if there is an issue you don't lose all the work that has already been done. Right now, if it fails, the whole thing rolls back. That may be the way you want it to work, but if it isn't, I would use explicit transactions per batch.
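A rough, untested sketch of what that could look like (batch size of 1000 taken from your loop; it assumes either SIMPLE recovery or regular log backups so the log can actually truncate between batches):

DECLARE @batchRows INT
SET @batchRows = 1

WHILE @batchRows > 0
BEGIN
    BEGIN TRANSACTION

    -- delete one batch of matching rows
    DELETE TOP (1000) a
    FROM BaseData2 a
    JOIN dbo.IncrementalData b
        ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
    OPTION (MAXDOP 1)

    SET @batchRows = @@ROWCOUNT

    -- commit each batch so the log can clear and completed work is kept
    COMMIT TRANSACTION
END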
June 28, 2012 at 12:22 pm
Did you try increasing the batch size, say to 5,000?
You mentioned that there are around 2 million rows in one table; how many are in the other table? And after the delete, how many rows remain in BaseData2?
It looks like you are deleting a lot of data. If so, you could use a LEFT JOIN from the base table to the incremental table to find the records you want to retain and put them into a temp table.
Then truncate the base table (if there is referential integrity it will not allow the truncate, and the identity will be reset if you have an identity column). Then insert the retained rows back into the base table in batches; I would use a batch of 5,000 or more, based on your log size.
If the database is not in the FULL recovery model (I am sure it will be FULL, but in case it is not), the insert back can be minimally logged with a single INSERT ... SELECT statement.
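A rough, untested sketch of that truncate-and-reload approach (the #KeepRows temp table name is just for illustration, and it assumes the only criterion for removal is the DW_INSTALL_BASE_KEY match):

-- 1. Save the rows that should survive
SELECT a.*
INTO #KeepRows
FROM BaseData2 a
LEFT JOIN dbo.IncrementalData b
    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
WHERE b.DW_INSTALL_BASE_KEY IS NULL

-- 2. Empty the base table (fails if foreign keys reference it; any identity reseeds)
TRUNCATE TABLE BaseData2

-- 3. Reload the surviving rows (batch this if log size is a concern; column lists omitted for brevity)
INSERT INTO BaseData2
SELECT *
FROM #KeepRows

DROP TABLE #KeepRows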
The whitepaper below has a lot of information on how to delete, update, or insert bulk data. It should give you plenty of useful detail.
http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
GulliMeel
Finding top n Worst Performing queries: http://gullimeelsqlsybase.wordpress.com/2012/06/28/finding-top-n-worst-performaing-queries/
Improve the performance of Merge Join (special case): http://gullimeelsqlsybase.wordpress.com/2012/06/26/improved-merge-join-algorithm-in-sql-server-2008/
How to Post Performance Problem - Gail Shaw
June 28, 2012 at 12:26 pm
Your batch size is really small. I would try at least 10k.
June 28, 2012 at 2:19 pm
ghanshyam.kundu (6/28/2012)
How many indexes and foreign key references do you have on the table being deleted from? Are there any indexed views that relate to it? What is the clustered index on the table being deleted from?
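If it's quicker than clicking through SSMS, something like this (just a sketch) would list them:

-- indexes on the table being deleted from
SELECT i.name, i.type_desc, i.is_unique
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.BaseData2')

-- foreign keys that reference it (each one has to be checked on every delete)
SELECT fk.name AS fk_name,
       OBJECT_NAME(fk.parent_object_id) AS referencing_table
FROM sys.foreign_keys fk
WHERE fk.referenced_object_id = OBJECT_ID('dbo.BaseData2')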
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2012 at 11:43 pm
Thanks, guys, for your replies.
Here are the details:
We delete based on the join because the incremental records need to be removed from the base table, so we join on the base table and delete the matched records.
The DB and the package are both hosted on the same server.
There is only one index on the table being deleted from, "ndx_DW_INSTALL_BASE_KEY" (non-unique, non-clustered), and it is on the same key we use in the join condition for the delete.
Similarly, there are 7 non-clustered indexes on the BaseData table as well.
There is no clustered index on either table; all the indexes are non-clustered.
I will increase the batch size, check the performance, and let you know.
My doubt is: is there anything we need to do with the server configuration,
or can we achieve this just by tuning the query?
Thanks,
June 29, 2012 at 12:10 am
Are you deleting from the BaseData2 table (which has 7 non-clustered indexes) or from IncrementalData (which has just one non-clustered index)? Again, how many rows are being deleted from the table?
Also, the table is a heap, so deleting data does not reclaim the space; you might have just 2 million rows in the table, but it could actually be much larger on disk. I assume this delete and insert happens every day. Try to find the size of the table using the sys.dm_db_index_physical_stats DMV (a sketch is below), and if that turns out to be the issue you might want to consider adding a clustered index as well.
Also, if you are deleting from the base table, the 7 non-clustered indexes will also be slowing the delete down.
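Something along these lines (just a sketch) would show how big the heap really is, including forwarded records:

SELECT index_type_desc, page_count, record_count, forwarded_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.BaseData2'), NULL, NULL, 'DETAILED')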
GulliMeel
Finding top n Worst Performing queries: http://gullimeelsqlsybase.wordpress.com/2012/06/28/finding-top-n-worst-performaing-queries/
Improve the performance of Merge Join (special case): http://gullimeelsqlsybase.wordpress.com/2012/06/26/improved-merge-join-algorithm-in-sql-server-2008/
How to Post Performance Problem - Gail Shaw
June 29, 2012 at 12:45 am
This is a different way to chunk your deletes. It populates a temp table of keys, picks the first N keys and deletes the rows associated with them, then grabs the next N keys, and so on.
(completely untested, use at your own risk, ask if you don't understand, etc.)
DECLARE @delcnt INT
DECLARE @VARROWCOUNT INT
-- number of DW_INSTALL_BASE_KEY values to attempt to delete per batch
-- (the delete itself is still done in 1000-row chunks)
DECLARE @DW_INSTALL_BASE_KEY_INCREMENT INT
DECLARE @tmpRowCount BIGINT

SET @tmpRowCount = -1
SET @DW_INSTALL_BASE_KEY_INCREMENT = 10 -- 10 is arbitrary

SET @VARROWCOUNT = 0
SELECT @VARROWCOUNT = COUNT(*) -- informational only; not used by the loops below
FROM BaseData2 a
JOIN dbo.IncrementalData b
    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
OPTION ( MAXDOP 1 )

SET @delcnt = 0

-- temp table to hold the keys still to be processed
-- (the FROM clause was missing in the original; the keys are assumed to come from the incremental table)
SELECT DISTINCT
       b.DW_INSTALL_BASE_KEY
INTO   #KeysToDelete
FROM   dbo.IncrementalData b

CREATE CLUSTERED INDEX IX_tmpKeysToDelete_BaseKey
ON #KeysToDelete (DW_INSTALL_BASE_KEY)

--WHILE @VARROWCOUNT > 0
--    BEGIN
WHILE EXISTS ( SELECT *
               FROM #KeysToDelete AS ktd )
BEGIN
    WHILE @tmpRowCount <> 0
    BEGIN
        ;
        WITH batch
        AS ( SELECT TOP ( @DW_INSTALL_BASE_KEY_INCREMENT )
                    DW_INSTALL_BASE_KEY
             FROM   #KeysToDelete AS ktd
             ORDER BY DW_INSTALL_BASE_KEY
           )
        DELETE TOP ( 1000 ) a
        FROM batch b
        JOIN BaseData2 a
            ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
        OPTION ( MAXDOP 1, RECOMPILE, LOOP JOIN, FORCE ORDER )
        -- the last two hints may not be needed, but the RECOMPILE definitely is

        SET @tmpRowCount = @@ROWCOUNT
        SET @delcnt = @delcnt + @tmpRowCount
    END

    SET @tmpRowCount = -1

    -- now tidy up the temp table by removing the keys already processed
    DELETE FROM #KeysToDelete
    WHERE DW_INSTALL_BASE_KEY IN (
          SELECT TOP ( @DW_INSTALL_BASE_KEY_INCREMENT )
                 DW_INSTALL_BASE_KEY
          FROM   #KeysToDelete AS ktd
          ORDER BY DW_INSTALL_BASE_KEY )
END

INSERT dbo.ETLAudit
       ( Inserted_RecordCount ,
         InsertedDate ,
         Deleted_RecordCount ,
         Error_RecordCount ,
         Select_Record_Count
       )
VALUES ( '' ,
         CONVERT(VARCHAR(8), GETDATE(), 1) ,
         @delcnt ,
         '' ,
         ''
       )
good luck!