Deleting Records taking too much time!!!

  • Hi folks,

    I am deleting records from a table that has around 2 million records. I am deleting in batches of 2,000 records at a time.

    The server runs SQL Server Standard Edition, with 14 GB of RAM and four 2.67 GHz Intel processors.

    Deleting the records is taking more than 6 hours, which I think is too long. How can I reduce the time taken?

    Here is the query I am using to delete the records:

    DECLARE @delcnt INT
    DECLARE @VARROWCOUNT INT

    SET @VARROWCOUNT = 0
    SELECT @VARROWCOUNT = COUNT(*)
    FROM BaseData2 a
    JOIN dbo.IncrementalData b
      ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
    OPTION (MAXDOP 1)

    SET @delcnt = 0

    WHILE @VARROWCOUNT > 0
    BEGIN
        DELETE TOP (1000) a
        FROM BaseData2 a
        JOIN dbo.IncrementalData b
          ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
        OPTION (MAXDOP 1)

        SET @delcnt = @delcnt + @@ROWCOUNT
        SET @VARROWCOUNT = @VARROWCOUNT - 1000
    END

    INSERT dbo.ETLAudit (Inserted_RecordCount, InsertedDate, Deleted_RecordCount, Error_RecordCount, Select_Record_Count)
    VALUES ('', CONVERT(VARCHAR(8), GETDATE(), 1), @delcnt, '', '')

    This is called from inside a package.

    Your suggestions will be highly appreciated!

  • The first thing I can think of is that you are joining the huge table over and over. Why?

    Jared
    CE - Microsoft

  • I don't expect this to be a major improvement, but you never know. Instead of checking the count, you can just check whether anything exists:

    WHILE EXISTS (SELECT 1
                  FROM BaseData2 a
                  JOIN dbo.IncrementalData b
                    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY)
    BEGIN

    Are BaseData2.DW_INSTALL_BASE_KEY and IncrementalData.DW_INSTALL_BASE_KEY indexed? Increasing the MAXDOP would also help, but I'm assuming that hint is there so the delete doesn't affect the performance of other things happening on the system. Is the package running on the server where the DB is hosted? If not, you may want to consider creating a stored proc and having the package call that instead; I've seen testing where that produced a performance increase due to less communication between the server and the client.
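
    If they are not indexed, here is a minimal sketch of supporting indexes (the index names are hypothetical; verify against your actual schema and test before creating anything):

    CREATE NONCLUSTERED INDEX IX_BaseData2_DW_INSTALL_BASE_KEY
        ON dbo.BaseData2 (DW_INSTALL_BASE_KEY)

    CREATE NONCLUSTERED INDEX IX_IncrementalData_DW_INSTALL_BASE_KEY
        ON dbo.IncrementalData (DW_INSTALL_BASE_KEY)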

    This isn't a performance thing, but you may want to consider committing each batch of the delete in its own transaction. That allows the transaction log to clear out while the delete is running, and if there is an issue you don't lose all the work that's been done. Right now, if it fails, the whole thing rolls back. That may be the way you want it to work, but if it isn't, I would use a transaction. A rough sketch is below.
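
    Here is a rough sketch of the per-batch transaction idea with basic error handling (untested; the table and column names are taken from your post, and the error handling is only illustrative):

    DECLARE @delcnt INT
    SET @delcnt = 0

    WHILE EXISTS (SELECT 1
                  FROM BaseData2 a
                  JOIN dbo.IncrementalData b
                    ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY)
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION

            DELETE TOP (1000) a
            FROM BaseData2 a
            JOIN dbo.IncrementalData b
              ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY

            SET @delcnt = @delcnt + @@ROWCOUNT

            COMMIT TRANSACTION --only the current batch is at risk on failure
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION --undo just this batch
            RAISERROR('Batch delete failed', 16, 1)
            BREAK --batches committed so far are kept
        END CATCH
    END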

  • Did you try increasing the batch size, say to 5,000?

    You mentioned that there are around 2 million rows in one table; what about the other table? After the delete, how many rows remain in BaseData2?

    It looks like you are deleting a lot of data. If so, you could use a left join from the base table to the incremental table to find the records you want to retain and put them into a temp table.

    Then truncate the base table (if you have referential integrity it will not allow you to truncate, and any identity column will be reset), and insert the retained rows back into the base table in batches; I would use batches of 5,000 or more, based on your log size. A rough sketch is below.

    If the database is not in the full recovery model (I am sure it will be, but in case it is not), you could do the reload directly as a single, minimally logged INSERT...SELECT statement.
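
    A rough sketch of that approach (untested; it assumes no foreign keys reference BaseData2 and that resetting any identity values is acceptable; the reload is shown as a single INSERT...SELECT for simplicity, so split it into batches if your log size requires it):

    --keep only the rows that have no match in the incremental table
    SELECT a.*
    INTO #RowsToKeep
    FROM BaseData2 a
    LEFT JOIN dbo.IncrementalData b
      ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
    WHERE b.DW_INSTALL_BASE_KEY IS NULL

    TRUNCATE TABLE BaseData2 --minimally logged, much faster than DELETE

    --reload the retained rows; TABLOCK helps qualify for minimal logging
    INSERT INTO BaseData2 WITH (TABLOCK)
    SELECT * FROM #RowsToKeep

    DROP TABLE #RowsToKeep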

    The whitepaper below has a lot of information on how to delete, update, or insert bulk data; you should find it useful:

    http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

    GulliMeel

    Finding top n Worst Performing queries: http://gullimeelsqlsybase.wordpress.com/2012/06/28/finding-top-n-worst-performaing-queries/
    Improved Merge Join algorithm in SQL Server 2008: http://gullimeelsqlsybase.wordpress.com/2012/06/26/improved-merge-join-algorithm-in-sql-server-2008/
    How to Post Performance Problem - Gail Shaw

  • Your batch size is really small. I would try at least 10k.

  • ghanshyam.kundu (6/28/2012): [original post quoted in full above]

    How many indexes and foreign key references are associated with the table being deleted from? Are there any indexed views that reference it? What is the clustered index on that table?
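
    A quick sketch of how to check, using the standard catalog views (the table name is taken from the thread; adjust the schema and name as needed):

    --indexes on the table being deleted from
    SELECT i.name, i.type_desc, i.is_unique
    FROM sys.indexes i
    WHERE i.object_id = OBJECT_ID('dbo.BaseData2')

    --foreign keys that reference the table
    SELECT fk.name AS fk_name,
           OBJECT_NAME(fk.parent_object_id) AS referencing_table
    FROM sys.foreign_keys fk
    WHERE fk.referenced_object_id = OBJECT_ID('dbo.BaseData2')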

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, guys, for your replies.

    Here are the details:

    We delete based on a join because the incremental records need to be deleted from the base table, so we join on the base table's key and the matched records get deleted.

    The DB and the package are both hosted on the same server.

    There is only one index on the table being deleted from, ndx_DW_INSTALL_BASE_KEY (non-unique, non-clustered), on the same key we use in the join condition when deleting.

    Likewise, there are 7 non-clustered indexes on the BaseData table.

    Neither table has a clustered index; all the indexes are non-clustered.

    I will increase the batch size, check the performance, and let you know.

    My question: is there anything we need to do with the server configuration, or can we achieve this by tuning the query?

    Thanks,

  • Are you deleting from the BaseData2 table (which has 7 non-clustered indexes) or from IncrementalData (which has just one)? Again, how many rows are being deleted from the table?

    Also, your tables are heaps, so deleting data does not reclaim the space; the table might hold just 2 million rows yet still be physically very large. I assume this delete-and-insert happens every day. Try to find the size of the table using the sys.dm_db_index_physical_stats DMV (a sketch is below), and if that is the issue you might want to consider adding a clustered index as well.

    Also, if you are deleting from the base table, the 7 indexes will make the delete slower, since each one has to be maintained.
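
    For example (a sketch; the database and table names from the thread are assumed):

    SELECT index_type_desc,
           page_count,
           page_count * 8 / 1024 AS size_mb, --8 KB pages
           avg_page_space_used_in_percent
    FROM sys.dm_db_index_physical_stats(
             DB_ID(), OBJECT_ID('dbo.BaseData2'), NULL, NULL, 'SAMPLED')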

    GulliMeel

    Finding top n Worst Performing queries: http://gullimeelsqlsybase.wordpress.com/2012/06/28/finding-top-n-worst-performaing-queries/
    Improved Merge Join algorithm in SQL Server 2008: http://gullimeelsqlsybase.wordpress.com/2012/06/26/improved-merge-join-algorithm-in-sql-server-2008/
    How to Post Performance Problem - Gail Shaw

  • This is a different way to chunk your deletes. It populates a temp table of keys, picks the first N keys, deletes the rows associated with them, grabs the next N keys, and so on.

    (completely untested, use at your own risk, ask if you don't understand, etc.)

    DECLARE @delcnt INT
    DECLARE @tmpRowCount BIGINT

    --number of DW_INSTALL_BASE_KEYs to attempt to delete in a batch
    --(each key batch is still deleted in 1000-row chunks)
    DECLARE @DW_INSTALL_BASE_KEY_INCREMENT INT

    SET @DW_INSTALL_BASE_KEY_INCREMENT = 10 --10 is arbitrary
    SET @delcnt = 0
    SET @tmpRowCount = -1 --sentinel so the inner loop runs at least once

    --new temp table to hold the keys still to be processed
    SELECT DISTINCT b.DW_INSTALL_BASE_KEY
    INTO #KeysToDelete
    FROM dbo.IncrementalData b

    CREATE CLUSTERED INDEX IX_tmpKeysToDelete_BaseKey
        ON #KeysToDelete (DW_INSTALL_BASE_KEY)

    WHILE EXISTS (SELECT * FROM #KeysToDelete)
    BEGIN
        WHILE @tmpRowCount <> 0
        BEGIN
            ;WITH batch AS (
                SELECT TOP (@DW_INSTALL_BASE_KEY_INCREMENT) DW_INSTALL_BASE_KEY
                FROM #KeysToDelete
                ORDER BY DW_INSTALL_BASE_KEY
            )
            DELETE TOP (1000) a
            FROM batch b
            JOIN BaseData2 a
              ON a.DW_INSTALL_BASE_KEY = b.DW_INSTALL_BASE_KEY
            OPTION (MAXDOP 1, RECOMPILE, LOOP JOIN, FORCE ORDER)
            --the last two hints may not be needed, but the RECOMPILE definitely is

            SET @tmpRowCount = @@ROWCOUNT
            SET @delcnt = @delcnt + @tmpRowCount
        END
        SET @tmpRowCount = -1

        --now tidy up the temp table to remove the keys already processed
        DELETE FROM #KeysToDelete
        WHERE DW_INSTALL_BASE_KEY IN (
            SELECT TOP (@DW_INSTALL_BASE_KEY_INCREMENT) DW_INSTALL_BASE_KEY
            FROM #KeysToDelete
            ORDER BY DW_INSTALL_BASE_KEY)
    END

    INSERT dbo.ETLAudit (Inserted_RecordCount, InsertedDate, Deleted_RecordCount, Error_RecordCount, Select_Record_Count)
    VALUES ('', CONVERT(VARCHAR(8), GETDATE(), 1), @delcnt, '', '')

    good luck!
