Improve performance

  • This is my current stored procedure; it is taking too long to finish the operation.

    Any suggestions to improve it?

    There are around 25 _TESTDATA tables, each with around 8 million records, and I want to delete the records that are older than 30 days.

    CREATE PROCEDURE sp_delete_by_date

    @pdate AS int,

    @nrp bit = 1

    AS

    SET DEADLOCK_PRIORITY LOW

    SET IMPLICIT_TRANSACTIONS ON

    DECLARE @i tinyint

    DECLARE @Table_Name varchar(50)

    CREATE TABLE #MC (

    ID int IDENTITY(1,1) NOT NULL,

    Table_Name varchar(50) NOT NULL)

    INSERT INTO #MC

    SELECT object_name(fkeyid) FROM sysforeignkeys

    WHERE ( object_name(fkeyid) LIKE '%[_]TESTDATA' )  -- [_] escapes the underscore, which is otherwise a single-character wildcard in LIKE

    AND object_name(fkeyid) NOT LIKE 'TMP_%'

    SET @i = 1

    WHILE @i <=(SELECT MAX(ID) FROM #MC)

    BEGIN

    SELECT @Table_Name=Table_Name FROM #MC WHERE ID=@i

    EXEC ('DELETE FROM ' + @Table_Name + ' WHERE RUN_START_TIME <= ' + CAST(@pdate AS varchar(12)))  -- @pdate is int, so it must be cast before string concatenation

    COMMIT TRAN

    SET @i=@i +1

    END

    IF @nrp <> 1

    BEGIN

    WAITFOR DELAY '00:00:00.800'

    END

    DELETE TESTDATA WHERE RUN_START_TIME <=@pdate

    COMMIT TRAN

    IF @nrp <> 1

    BEGIN

    WAITFOR DELAY '00:00:00.200'

    END

    DELETE TR WHERE RUN_END_TIME<=@pdate

    COMMIT TRAN

    DROP TABLE #MC

    COMMIT TRAN

    Do we really need a COMMIT after every DELETE statement, or can we have one common statement at the end,

    something like:

    IF @@ERROR != 0

    BEGIN

    ROLLBACK TRANSACTION

    RETURN

    END

    ELSE

    BEGIN

    COMMIT TRANSACTION

    END

  • If you commit after every row, it will run slower, but it also lets you manage your log size. If it's one big commit, it's one big transaction that must be maintained in the log.

    In terms of making this faster, I'd look at deleting batches of rows, perhaps in the 10,000-100,000 size range, to speed things up. That often gives you a good balance between one large transaction and 8 million small ones.
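    A minimal sketch of that batching idea for one table, using the TESTDATA table and RUN_START_TIME column from the original procedure (SQL 2005+ syntax; the 50,000 batch size is an assumed starting point to tune, and on SQL 2000 a SET ROWCOUNT before the DELETE plays the same role as TOP):

    ```sql
    -- Delete in batches so each iteration is a small transaction.
    DECLARE @pdate datetime
    SET @pdate = GETDATE() - 30        -- 30-day cutoff from the original post

    DECLARE @BatchSize int, @Rows int
    SET @BatchSize = 50000             -- assumed starting size; tune per system
    SET @Rows = @BatchSize

    WHILE @Rows = @BatchSize           -- a short batch means nothing is left
    BEGIN
        DELETE TOP (@BatchSize) FROM TESTDATA
        WHERE RUN_START_TIME <= @pdate

        SET @Rows = @@ROWCOUNT         -- in autocommit mode each batch commits on its own
    END
    ```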

  • So Steve, you are suggesting something like:

    EXEC ('DELETE TOP (10000) FROM ' + @Table_Name + ' WHERE RUN_START_TIME <= ' + CAST(@pdate AS varchar(12)))

    COMMIT TRAN

    DELETE TOP (10000) FROM TESTDATA WHERE RUN_START_TIME <= @pdate

    COMMIT TRAN

    DELETE TOP (10000) FROM TR WHERE RUN_END_TIME <= @pdate

    COMMIT TRAN

    In case of a lower version of SQL Server, say SQL Server 2000, how can we manage it?

  • yuvipoy,

    we have similar requirements for our real-time systems. We need to delete millions of rows and not cause locking that would stop transactions. I wrote a delete loop that works for us.

    --Start by getting the records to delete

    SELECT ID

    INTO #DeleteKeys

    FROM TESTDATA WITH (NOLOCK)

    WHERE RUN_START_TIME <=@pdate

    --loop thru small batches to avoid excessive locking

    create table #DeleteBatch (ID int)

    declare @RowCount int

    SET @RowCount = 1000

    WHILE @RowCount = 1000

    BEGIN

    DELETE TOP (1000) #DeleteKeys

    OUTPUT DELETED.ID

    INTO #DeleteBatch

    FROM #DeleteKeys

    SET @RowCount = @@ROWCOUNT

    DELETE

    FROM TESTDATA

    FROM TESTDATA TD

    INNER JOIN #DeleteBatch DB

    ON TD.ID = DB.ID

    TRUNCATE TABLE #DeleteBatch

    END

  • Hi EricEyster,

    After batching the operation, is the delete faster?

    How much time did you gain after rewriting your query?

    In case of a lower version of SQL Server, say SQL Server 2000, how can we manage it?

  • yuvipoy (3/5/2014)


    Hi EricEyster,

    After batching the operation, is the delete faster?

    How much time did you gain after rewriting your query?

    In case of a lower version of SQL Server, say SQL Server 2000, how can we manage it?

    You should see at least a 10x performance improvement versus single record delete. It will vary depending on how big you make the batch size for the delete. We have to decrease the size on some systems and that slows the delete. Others, we can go as high as 100,000 and get great performance.

    If you are using SQL2000, then you do not have the output operator. I updated the code to work without it. It does require an extra step, but it should not drastically change the performance on a reasonably sized delete:

    declare @pdate datetime

    set @pdate = getdate()-7

    --Start by getting the records to delete

    SELECT ID

    INTO #DeleteKeys

    FROM TESTDATA WITH (NOLOCK)

    WHERE RUN_START_TIME <=@pdate

    --loop thru small batches to avoid excessive locking

    create table #DeleteBatch (ID int)

    declare @RowCount int

    SET @RowCount = 1000

    WHILE @RowCount = 1000

    BEGIN

    INSERT INTO #DeleteBatch

    SELECT TOP (1000) ID

    FROM #DeleteKeys

    SET @RowCount = @@ROWCOUNT

    DELETE #DeleteKeys

    FROM #DeleteKeys DK

    JOIN #DeleteBatch DB

    ON DK.ID = DB.ID

    DELETE

    FROM TESTDATA

    FROM TESTDATA TD

    INNER JOIN #DeleteBatch DB

    ON TD.ID = DB.ID

    TRUNCATE TABLE #DeleteBatch

    END

  • Sorry for the delay. I typically do what Eric has written here, even back in SQL 2000/7, but I might run 3-4 batches manually at different sizes and time them to understand what the best size might be for my system.
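    One rough way to compare candidate batch sizes (a sketch using the thread's table names; the ROLLBACK restores the rows so each size is timed against the same data, though rolling back a large batch has a cost of its own):

    ```sql
    -- Time a single batch at one candidate size, then undo it.
    DECLARE @pdate datetime, @t0 datetime
    SET @pdate = GETDATE() - 30

    BEGIN TRANSACTION
    SET @t0 = GETDATE()

    DELETE TOP (10000) FROM TESTDATA
    WHERE RUN_START_TIME <= @pdate

    SELECT DATEDIFF(ms, @t0, GETDATE()) AS elapsed_ms   -- duration of this batch size

    ROLLBACK TRANSACTION               -- restore the rows for the next test run
    ```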

  • declare @pdate datetime

    set @pdate = getdate()-7

    --Start by getting the records to delete

    SELECT ID

    INTO #DeleteKeys

    FROM TESTDATA WITH (NOLOCK)

    WHERE RUN_START_TIME <=@pdate

    If I have to refer to another table for the delete, would this query be the better choice, or

    New Code :1

    declare @pdate datetime

    set @pdate = getdate()-7

    --Start by getting the records to delete

    SELECT a.ID

    INTO #DeleteKeys

    FROM TESTDATA as a WITH (NOLOCK)

    WHERE a.id not in(Select b.id from another_table as b )

    and a.RUN_START_TIME <=@pdate

    this query?

    New Code :2

    declare @pdate datetime

    set @pdate = getdate()-7

    --Start by getting the records to delete

    SELECT a.ID

    INTO #DeleteKeys

    FROM TESTDATA as a WITH (NOLOCK)

    WHERE not exists(Select b.id from another_table as b where a.id =b.id )

    and a.RUN_START_TIME <=@pdate

    NOT EXISTS vs. NOT IN:

    the ID field in both tables is a NOT NULL column.

  • yuvipoy (3/17/2014)


    NOT EXISTS vs. NOT IN:

    the ID field in both tables is a NOT NULL column.

    No difference.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Table A has 100k records.

    Table B has 8,000k records.

    When I delete from Table B, checking whether the record exists in Table A, the query takes a long time; that's why I came across NOT EXISTS vs. NOT IN.

    How about a LEFT OUTER JOIN?

    Thanks!

  • yuvipoy (3/17/2014)


    When I delete from Table B, checking whether the record exists in Table A, the query takes a long time; that's why I came across NOT EXISTS vs. NOT IN.

    How about a LEFT OUTER JOIN?

    Very slightly slower than NOT IN / NOT EXISTS in my tests.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
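    For comparison, the LEFT JOIN form of the same anti-join (logically equivalent to the NOT IN / NOT EXISTS versions above when b.id is NOT NULL; table and column names follow the earlier posts):

    ```sql
    -- Anti-join via LEFT JOIN: keep only TESTDATA rows with no match in another_table.
    DECLARE @pdate datetime
    SET @pdate = GETDATE() - 7

    SELECT a.ID
    INTO #DeleteKeys
    FROM TESTDATA AS a WITH (NOLOCK)
    LEFT JOIN another_table AS b
           ON a.id = b.id
    WHERE b.id IS NULL                 -- no matching row in another_table
      AND a.RUN_START_TIME <= @pdate
    ```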
  • Is there a better way to delete the records while checking against another table?

  • You should see at least a 10x performance improvement versus single record delete. It will vary depending on how big you make the batch size for the delete. We have to decrease the size on some systems and that slows the delete. Others, we can go as high as 100,000 and get great performance.

    If you are using SQL2000, then you do not have the output operator. I updated the code to work without it. It does require an extra step, but it should not drastically change the performance on a reasonably sized delete:

    declare @pdate datetime

    set @pdate = getdate()-7

    --Start by getting the records to delete

    SELECT ID

    INTO #DeleteKeys

    FROM TESTDATA WITH (NOLOCK)

    WHERE RUN_START_TIME <=@pdate

    --loop thru small batches to avoid excessive locking

    create table #DeleteBatch (ID int)

    declare @RowCount int

    SET @RowCount = 1000

    WHILE @RowCount = 1000

    BEGIN

    INSERT INTO #DeleteBatch

    SELECT TOP (1000) ID

    FROM #DeleteKeys

    SET @RowCount = @@ROWCOUNT

    DELETE #DeleteKeys

    FROM #DeleteKeys DK

    JOIN #DeleteBatch DB

    ON DK.ID = DB.ID

    DELETE

    FROM TESTDATA

    FROM TESTDATA TD

    INNER JOIN #DeleteBatch DB

    ON TD.ID = DB.ID

    TRUNCATE TABLE #DeleteBatch

    END

    I tried your code, but it is taking more time than the normal delete I am doing at present.

    My column is a uniqueidentifier, not an INT as in the #Delete tables you created.

    Joining on the uniqueidentifier to delete some 50k records out of 5 million is taking more than 15 minutes and locking the main table.

    The uniqueidentifier column is a nonclustered primary key,

    and the WHERE-condition column is the clustered one.

    When I run sp_lock and look at the SPID of the delete session, it holds many locks: EXT, TAB, and RID, some 300 entries for the transaction.

  • The delete operation takes more or less the same time; I don't see any improvement in the query with the batch delete, for deleting some 1 million records.

    Actually, the approach is taking more time than the current one.
