Deleting 400K Rows on Production Server

  • No, no

    🙂

    I mean duration of SELECT execution.

    I guess that your trouble is very slow INNER JOIN. So I try to found another way to delete data

    Something like this:

    DELETE a

    FROM OQ a

    WHERE ( (a.dd < GETDATE() - 0.1) AND (a.st IN (4,7)) ) OR

    ( (a.dd < GETDATE() - 0.1) AND (a.st BETWEEN 250 AND 255) ) OR

    ( (a.dd < GETDATE() - 0.25) AND (a.st BETWEEN 4 AND 29) ) OR

    ( (a.dd < GETDATE() - 0.25) AND (a.st BETWEEN 34 AND 61) ) OR

    ( (a.dd < GETDATE() - 2) AND (a.st in (3))

  • Thanks Mwolf

    I am taking off all the records on a temp table, so i think that there is no problem with the select statement. its only when it is deleting. and i cant use the condition to delete, i have to use mid as the condition, since the st field changes from time to time.

    Cheers

  • Sorry if I am missing something, but when you say you are deleting 400K rows from table A, is that ALL the rows from table A or just a subset?

    If you don't need the data in table A at all, then wouldn't a 'Truncate Table' lose the content so much quicker?

  • Its only a subset, if its a whole table that would be so nice.:(

    Cheers

  • CrazyMan (5/28/2008)


    Hi

    I dont have a execution plan , but the table index structure is as below

    IX_OQa_Binonclustered, stats no recompute located on PRIMARYBi

    IX_OQa_Blo_Binonclustered, stats no recompute located on PRIMARYBlo, Bi

    IX_OQa_ddnonclustered, stats no recompute located on PRIMARYdd

    IX_OQa_iMidnonclustered, stats no recompute located on PRIMARYiMid

    IX_OQa_Phnonclustered, stats no recompute located on PRIMARYPh

    IX_OQa_Rtnonclustered, stats no recompute located on PRIMARYRt

    IX_OQa_Stnonclustered, stats no recompute located on PRIMARYSt

    IX_OQa_Stanonclustered, stats no recompute located on PRIMARYSta

    IX_OQa_StaRtnonclustered, stats no recompute located on PRIMARYSta, Rt

    PK_OQaclustered, unique, primary key, stats no recompute located on PRIMARYMid

    Cheers

    With that many indexes, it may take a while. Are folks going to be using the table while you're deleting?

    If people are using the table while you're deleting, I believe I'd make a "delete crawler" based on the order of the clustered index (can't tell what column(s) it's on from your listing above) and delete a chunk at a time. Here's and example... it doesn't match your table or columns, but it will give you and idea... it contains it's own test data, as well...

    --===== If the test table exists, drop it

    IF OBJECT_ID('dbo.JBMTestDetail','U') IS NOT NULL

    DROP TABLE dbo.JBMTestDetail

    GO

    --===== Create and populate a 1,000,000 row test table.

    -- Column "ConnID" has a range of 1 to 100,000 non-unique numbers stored as VARCHAR(30)

    -- Column "Key1" has a range of 1 to 100,000 non-unique numbers stored as VARCHAR(30)

    -- Column "SomeMoney has a range of 0.0000 to 99.9999 non-unique numbers

    -- Column "Time_Stamp" has a range of >=01/01/2005 and <01/01/2015 non-unique date/times

    -- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'

    -- for all rows.

    -- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)

    SELECT TOP 1000000

    ConnID = ISNULL(CAST(ABS(CHECKSUM(NEWID()))%100000+1 AS VARCHAR(30)),''), --(10 rows per connection)

    Key1 = ISNULL(CAST(ABS(CHECKSUM(NEWID()))%100000+1 AS VARCHAR(20)),''), --just to test index with

    SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)

    + CHAR(ABS(CHECKSUM(NEWID()))%26+65),

    SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),

    SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),

    Time_Stamp = ISNULL(CAST(RAND(CHECKSUM(NEWID()))*3652.0+38351.0 AS DATETIME),0),

    SomeHex12 = RIGHT(NEWID(),12)

    INTO dbo.JBMTestDetail

    FROM Master.dbo.SysColumns t1,

    Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== Create indexes similar to Troy's

    CREATE CLUSTERED INDEX IXC_JBMTestDetail_Time_Stamp_ConnID ON dbo.JBMTestDetail (Time_Stamp,ConnID)

    CREATE NONCLUSTERED INDEX IX_JBMTestDetail_ConnID_Key1 ON dbo.JBMTestDetail (ConnID,Key1)

    GO

    --===== Setup to measure performance...

    SET STATISTICS TIME ON

    --===== Define the cutoff date with a time of "midnight" or, if you will,

    -- define the cutoff date with no time so we only delete whole days.

    DECLARE @CutoffDate DATETIME

    SELECT @CutoffDate = DATEADD(dd,DATEDIFF(dd,0,GETDATE()),-94)

    --===== Limit all further queries, including deletes, to 25,000 rows

    -- (about 1 second worth of deletes, like I said before)

    SET ROWCOUNT 25000

    --===== See if any rows qualify for deletion. If even just one exists,

    -- then there's work to do and @@ROWCOUNT will be > 0.

    -- DO NOT PUT ANY CODE BETWEEN THIS SELECT AND THE WHILE LOOP OR

    -- YOU'LL MESS UP THE ROWCOUNT FOR THE WHILE LOOP

    SELECT TOP 1 1 FROM dbo.JBMTestDetail WHERE Time_Stamp < @CutoffDate

    --===== If the rowcount from the above is greater than 0,

    -- then delete 25,000 rows at a time until there's nothing

    -- left to delete

    WHILE @@ROWCOUNT > 0

    BEGIN

    --===== Just a "marker" to separate the loop in the output

    PRINT REPLICATE('=',78)

    --===== This delay gives other processes breathing room

    WAITFOR DELAY '00:00:10'

    --===== Do the delete. Will be limited by the SET ROWCOUNT above.

    -- DO NOT PUT ANY CODE BETWEEN THIS DELETE AND THE "END" OR

    -- YOU'LL MESS UP THE ROWCOUNT FOR THE WHILE LOOP.

    DELETE dbo.JBMTestDetail WITH (TABLOCKX)

    WHERE Time_Stamp < @CutoffDate

    END

    --===== Restore the ability to process more than 25,000 rows

    SET ROWCOUNT 0

    SELECT 1000000 - COUNT(*) FROM jbmtestdetail

    If people aren't using the table at the same time, you might just try doing the delete by capturing the Primary Key information in a temp table and then doing the delete using a join on the temp table. Even in the presence of indexes, deletes usually move right along.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • CrazyMan (5/28/2008)


    Thanks Mwolf

    I am taking off all the records on a temp table, so i think that there is no problem with the select statement. its only when it is deleting. and i cant use the condition to delete, i have to use mid as the condition, since the st field changes from time to time.

    Cheers

    Do you have index on "Mid" field?

  • However, I want to see your execution plan

    Try to execute it and show result XML, please

    It seems that index on Mid is missed or not used

    SELECT a.Mid, a.Ph, a.msg, a.dd, a.Rt, a.st,

    a.MDa, a.BI, isNull(a.BO, 0) as BO, a.NMs, a.r, a.sr, a.kd, a.Re, a.Or, a.Ims, Yy = datepart(Yy,a.dd)-2000, Mm = datepart(Mm,a.dd),

    Dd = datepart(Dd,a.dd), Hh = datepart(Hh,a.dd), KeyID into #Temp

    FROM OQa (nolock)

    WHERE ( (a.dd < GETDATE() - 0.1) AND (a.st IN (4,7)) ) OR

    ( (a.dd < GETDATE() - 0.1) AND (a.st BETWEEN 250 AND 255) ) OR

    ( (a.dd < GETDATE() - 0.25) AND (a.st BETWEEN 4 AND 29) ) OR

    ( (a.dd < GETDATE() - 0.25) AND (a.st BETWEEN 34 AND 61) ) OR

    ( (a.dd < GETDATE() - 2) AND (a.st in (3))

    INSERT INTO OQa2

    (Mid, Ph, msg, dd, Rt, st,

    MDa, BI, BO, NMs, r, sr, kd, Re, Or, Ims, Yy, Mm, Dd, Hh, KeyID)

    select * from #Temp

    alter table #Temp add constraint

    PK_Mid Primary key clustered (Mid ) on [Primary]

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    GO

    SET SHOWPLAN_XML ON

    go

    DELETE a

    FROM OQa

    INNER JOIN #Temp b

    ON a.Mid = b.Mid

    GO

    SET SHOWPLAN_XML OFF

    go

    Drop table #Temp

    GO

  • Thanks Jeff

    I will try using the deletion Crawler, since applicaitons are using this tables at the time of deletion.

    Let u guys know once completed

    Cheers

Viewing 8 posts - 16 through 22 (of 22 total)

You must be logged in to reply to this topic. Login to reply