June 28, 2012 at 7:44 am
Hi folks,
I am deleting records from a table that has around 2 million records. I am deleting in batches of 2,000 records at a time.
The server is SQL Server Standard Edition with 14 GB of RAM and 4 Intel 2.67 GHz processors.
Deleting the records is taking more than 6 hours.
I think that is too long. How can I reduce the time it takes?
Here is the query I am using to delete the records.
DECLARE @delcnt INT
DECLARE @VARROWCOUNT INT
SET @VARROWCOUNT = 0

-- count how many rows in BaseData2 match the incremental table
SELECT @VARROWCOUNT = COUNT(*)
FROM BaseData2 a
JOIN dbo.IncrementalData b
    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
OPTION (MAXDOP 1)

SET @delcnt = 0

-- delete the matched rows 1000 at a time until the counted total is covered
WHILE @VARROWCOUNT > 0
BEGIN
    DELETE TOP (1000) a
    FROM BaseData2 a
    JOIN dbo.IncrementalData b
        ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
    OPTION (MAXDOP 1)

    SET @delcnt = @delcnt + @@ROWCOUNT
    SET @VARROWCOUNT = @VARROWCOUNT - 1000
END

-- record the total number of rows deleted
INSERT dbo.ETLAudit (Inserted_RecordCount, InsertedDate, Deleted_RecordCount, Error_RecordCount, Select_Record_Count)
VALUES ('', CONVERT(VARCHAR(8), GETDATE(), 1), @delcnt, '', '')
This is called from inside a package.
Your suggestions would be highly appreciated!
June 28, 2012 at 9:52 am
The first thing I can think of is that you are joining the huge table over and over. Why?
Jared
CE - Microsoft
June 28, 2012 at 10:14 am
I don't expect this to be a major improvement but you never know. Instead of checking the count you can just check to see if something exists:
WHILE EXISTS (select 1
from BaseData2 a
join dbo.IncrementalData b
on a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY)
BEGIN
Are BaseData2.DW_INSTALL_BASE_KEY and IncrementalData.DW_INSTALL_BASE_KEY indexed? Increasing the MAXDOP might also help, but I'm assuming the hint is there so the delete doesn't affect the performance of everything else running on the system. Is the package running on the server where the DB is hosted? If not, you may want to consider creating a stored procedure and having the package call that instead. I've seen testing where that gave a performance increase because there was less communication between the server and the client.
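If those key columns aren't indexed, something along these lines might help (just a sketch; the index names here are made up, and you'd want to check what already exists first):

CREATE NONCLUSTERED INDEX IX_BaseData2_DW_INSTALL_BASE_KEY
    ON dbo.BaseData2 (DW_INSTALL_BASE_KEY)

CREATE NONCLUSTERED INDEX IX_IncrementalData_DW_INSTALL_BASE_KEY
    ON dbo.IncrementalData (DW_INSTALL_BASE_KEY)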
This isn't a performance thing, but you may want to consider committing each batch of deletes in its own transaction. That allows the transaction log to clear out while the process runs, and if there is an issue you don't lose all the work that has already been done. Right now, if it fails, the whole thing rolls back. That may be the way you want it to work, but if it isn't, I would use explicit transactions per batch.
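A rough, untested sketch of what that could look like (batch size of 1000 taken from your loop; it assumes either SIMPLE recovery or regular log backups so the log can actually truncate between batches):

DECLARE @batchRows INT
SET @batchRows = 1

WHILE @batchRows > 0
BEGIN
    BEGIN TRANSACTION

    -- delete one batch of matching rows
    DELETE TOP (1000) a
    FROM BaseData2 a
    JOIN dbo.IncrementalData b
        ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
    OPTION (MAXDOP 1)

    SET @batchRows = @@ROWCOUNT

    -- commit each batch so the log can clear and completed work is kept
    COMMIT TRANSACTION
END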
June 28, 2012 at 12:22 pm
Did you try increasing the batch size, say to 5,000?
You mentioned that there are around 2 million rows in one table; how many are in the other table? And after the delete, how many rows remain in BaseData2?
It looks like you are deleting a lot of data. If so, you could use a LEFT JOIN from the base table to the incremental table to find the records you want to retain and put them into a temp table.
Then truncate the base table (if there is referential integrity it will not allow the truncate, and the identity will be reset if you have an identity column). Then insert the retained rows back into the base table in batches; I would use a batch of 5,000 or more, based on your log size.
If the database is not in the FULL recovery model (I am sure it will be FULL, but in case it is not), the insert back can be minimally logged with a single INSERT ... SELECT statement.
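A rough, untested sketch of that truncate-and-reload approach (the #KeepRows temp table name is just for illustration, and it assumes the only criterion for removal is the DW_INSTALL_BASE_KEY match):

-- 1. Save the rows that should survive
SELECT a.*
INTO #KeepRows
FROM BaseData2 a
LEFT JOIN dbo.IncrementalData b
    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
WHERE b.DW_INSTALL_BASE_KEY IS NULL

-- 2. Empty the base table (fails if foreign keys reference it; any identity reseeds)
TRUNCATE TABLE BaseData2

-- 3. Reload the surviving rows (batch this if log size is a concern; column lists omitted for brevity)
INSERT INTO BaseData2
SELECT *
FROM #KeepRows

DROP TABLE #KeepRows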
The whitepaper below has a lot of information on how to delete, update, or insert bulk data. It should give you plenty of useful detail.
http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx
GulliMeel
Finding top n Worst Performing queries: http://gullimeelsqlsybase.wordpress.com/2012/06/28/finding-top-n-worst-performaing-queries/
Improve the performance of Merge Join (special case): http://gullimeelsqlsybase.wordpress.com/2012/06/26/improved-merge-join-algorithm-in-sql-server-2008/
How to Post Performance Problem - Gail Shaw
June 28, 2012 at 12:26 pm
Your batch size is really small. I would try at least 10k.
June 28, 2012 at 2:19 pm
ghanshyam.kundu (6/28/2012)
How many indexes and foreign key references do you have on the table being deleted from? Are there any indexed views that relate to it? What is the clustered index on the table being deleted from?
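If it's quicker than clicking through SSMS, something like this (just a sketch) would list them:

-- indexes on the table being deleted from
SELECT i.name, i.type_desc, i.is_unique
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.BaseData2')

-- foreign keys that reference it (each one has to be checked on every delete)
SELECT fk.name AS fk_name,
       OBJECT_NAME(fk.parent_object_id) AS referencing_table
FROM sys.foreign_keys fk
WHERE fk.referenced_object_id = OBJECT_ID('dbo.BaseData2')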
--Jeff Moden
Change is inevitable... Change for the better is not.
June 28, 2012 at 11:43 pm
Thanks, guys, for your replies.
Here are the details:
We delete based on the join because the incremental records need to be removed from the base table, so we join on the base table and delete the matched records.
The DB and the package are both hosted on the same server.
There is only one index on the table being deleted from, "ndx_DW_INSTALL_BASE_KEY" (non-unique, non-clustered), and it is on the same key we use in the join condition for the delete.
Similarly, there are 7 non-clustered indexes on the BaseData table as well.
There is no clustered index on either table; all the indexes are non-clustered.
I will increase the batch size, check the performance, and let you know.
My doubt is: is there anything we need to do with the server configuration,
or can we achieve this just by tuning the query?
Thanks,
June 29, 2012 at 12:10 am
Are you deleting from the BaseData2 table (which has 7 non-clustered indexes) or from IncrementalData (which has just one non-clustered index)? Again, how many rows are being deleted from the table?
Also, the table is a heap, so deleting data does not reclaim the space; you might have just 2 million rows in the table, but it could actually be much larger on disk. I assume this delete and insert happens every day. Try to find the size of the table using the sys.dm_db_index_physical_stats DMV (a sketch is below), and if that turns out to be the issue you might want to consider adding a clustered index as well.
Also, if you are deleting from the base table, the 7 non-clustered indexes will also be slowing the delete down.
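Something along these lines (just a sketch) would show how big the heap really is, including forwarded records:

SELECT index_type_desc, page_count, record_count, forwarded_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.BaseData2'), NULL, NULL, 'DETAILED')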
GulliMeel
Finding top n Worst Performing queries: http://gullimeelsqlsybase.wordpress.com/2012/06/28/finding-top-n-worst-performaing-queries/
Improve the performance of Merge Join (special case): http://gullimeelsqlsybase.wordpress.com/2012/06/26/improved-merge-join-algorithm-in-sql-server-2008/
How to Post Performance Problem - Gail Shaw
June 29, 2012 at 12:45 am
This is a different way to chunk your deletes. It populates a temp table of keys, picks the first N keys and deletes the rows associated with them, then grabs the next N keys, and so on.
(completely untested, use at your own risk, ask if you don't understand, etc.)
DECLARE @delcnt INT
DECLARE @VARROWCOUNT INT
-- number of DW_INSTALL_BASE_KEY values to attempt to delete per batch
-- (the delete itself is still done in 1000-row chunks)
DECLARE @DW_INSTALL_BASE_KEY_INCREMENT INT
DECLARE @tmpRowCount BIGINT

SET @tmpRowCount = -1
SET @DW_INSTALL_BASE_KEY_INCREMENT = 10 -- 10 is arbitrary

SET @VARROWCOUNT = 0
SELECT @VARROWCOUNT = COUNT(*) -- informational only; not used by the loops below
FROM BaseData2 a
JOIN dbo.IncrementalData b
    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
OPTION ( MAXDOP 1 )

SET @delcnt = 0

-- temp table to hold the keys still to be processed
-- (the FROM clause was missing in the original; the keys are assumed to come from the incremental table)
SELECT DISTINCT
       b.DW_INSTALL_BASE_KEY
INTO   #KeysToDelete
FROM   dbo.IncrementalData b

CREATE CLUSTERED INDEX IX_tmpKeysToDelete_BaseKey
ON #KeysToDelete (DW_INSTALL_BASE_KEY)

--WHILE @VARROWCOUNT > 0
--    BEGIN
WHILE EXISTS ( SELECT *
               FROM #KeysToDelete AS ktd )
BEGIN
    WHILE @tmpRowCount <> 0
    BEGIN
        ;
        WITH batch
        AS ( SELECT TOP ( @DW_INSTALL_BASE_KEY_INCREMENT )
                    DW_INSTALL_BASE_KEY
             FROM   #KeysToDelete AS ktd
             ORDER BY DW_INSTALL_BASE_KEY
           )
        DELETE TOP ( 1000 ) a
        FROM batch b
        JOIN BaseData2 a
            ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
        OPTION ( MAXDOP 1, RECOMPILE, LOOP JOIN, FORCE ORDER )
        -- the last two hints may not be needed, but the RECOMPILE definitely is

        SET @tmpRowCount = @@ROWCOUNT
        SET @delcnt = @delcnt + @tmpRowCount
    END

    SET @tmpRowCount = -1

    -- now tidy up the temp table by removing the keys already processed
    DELETE FROM #KeysToDelete
    WHERE DW_INSTALL_BASE_KEY IN (
          SELECT TOP ( @DW_INSTALL_BASE_KEY_INCREMENT )
                 DW_INSTALL_BASE_KEY
          FROM   #KeysToDelete AS ktd
          ORDER BY DW_INSTALL_BASE_KEY )
END

INSERT dbo.ETLAudit
       ( Inserted_RecordCount ,
         InsertedDate ,
         Deleted_RecordCount ,
         Error_RecordCount ,
         Select_Record_Count
       )
VALUES ( '' ,
         CONVERT(VARCHAR(8), GETDATE(), 1) ,
         @delcnt ,
         '' ,
         ''
       )
good luck!