large table cleanup using Delete

  • Looking for the most effective way to cleanup a large table(Can't use Truncate  -FK). I would like to cleanup in Batches, and maybe delete 50,000 or make that a parm, for the amount to remove. The DB is in Simple mode, accessed very heavily doing Inserts and Updates.

    The Attributes I'm trying to cleanup have a NULL attribute value.

    Thanks.

     

    Insert Into Quality_Attribute
    Values('14964824','QM_ID','3744908','2017-07-16 21:28:02.720')
    go

    Insert Into Quality_Attribute
    Values('14964825','Audit','1','2017-07-16 21:28:02.707')
    go

    Insert Into Quality_Attribute
    Values('14964825','Comment','Null','2017-07-16 21:28:02.707')
    go

    Insert Into Quality_Attribute
    Values('14964825','Reason','Null','2017-07-16 21:28:02.707')
    go
    CREATE TABLE [dbo].[Quality_Attribute](
    [Quality_ID] [int] NOT NULL,
    [Attribute] [nvarchar](100) NOT NULL,
    [Attribute_Value] [nvarchar](100) NULL,
    [Attribute_DateTime] [datetime] NULL,
    CONSTRAINT [PK_Quality_Attribute] PRIMARY KEY CLUSTERED
    (
    [Quality_ID] ASC,
    [Attribute] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO
  • 'Null' is a string. Do you really have strings containing the word 'Null'?

    NULL is the absence of data.

    As you don't want to remove all rows, your TRUNCATE comment is not really necessary.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Sorry did my Insert wrong should not contain the NULL string should be blank..

  • use TOP() or LIMIT and do the delete in batches until @@ROWCOUNT = 0?

  • Depending upon how much data you are saving, it may be faster/easier to select what you want to keep into a new table. Drop the PK, FK and indexes on the original table, rename the original table, and rename the new table to the old table's name, and then re-apply the constraints

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Table has 400Million+ records, and I need to keep it online as other processes are updating-inserting...

    Any good examples?

     

    Thanks.

  • This is a basic template for a batch delete/update/insert process:

    Declare @rowsAffected int = 1
    , @batchSize int = 50000; --increase/decrease as needed to avoid long running blocks

    While @rowsAffected > 0
    Begin

    --==== Perform the delete
    Delete Top (@batchSize)
    From yourTable yt
    Where yt.somecolumn = somevalue;

    --==== Get the number of rows affected
    Set @rowsAffected = @@rowcount;

    --==== Perform a checkpoint and delay to allow other processes
    Checkpoint;
    Waitfor Delay '00:00:01';
    End

    Before attempting - make sure you test a select statement and verify that your select statement is correctly returning the rows that you want deleted.  If your criteria is simply where a column is null then an index on the column probably won't help - in which case you may need to expand on this idea.  For example, you could add another loop and work through the data by date and attribute value is null.

    I would definitely test this on a copy of production and validate the code before even considering running it in production.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • The date in the table doesn't help much, but the attribute column itself would help in the selection..

  • Bruin wrote:

    Sorry did my Insert wrong should not contain the NULL string should be blank..

    No... not blank... it should contain the word NULL without quotes.  So go back and fix the code you posted instead of making us fix it for you.  It would also be handy to folks it you listed the CREATE TABLE first before the data.

    Once you've done that, do you have indexes on the FK columns in the tables that are looking at this table?  If not, you should go add them for a pretty serious performance gain during the DELETEs.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Scratch my previous.  The table you provided the code for has a NOT NULL for the attribute name but it does allow nulls for the attribute value.  Why are you trying to remove what is allowed?  And, if it's not supposed to be allowed, what are you doing to ensure that it won't be allowed again in the future?

    I'm thinking that this whole thing may be a mistake.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have already put code in place to block the NULL values for some of the Attributes, not sure why is was ever code to allow those into the Table. I think it may have been for Future attribute values for those attributes, but it was never developed.

    Insert Into Quality_Attribute
    Values('14964824','QM_ID','3744908','2017-07-16 21:28:02.720')
    go

    Insert Into Quality_Attribute
    Values('14964825','Audit','1','2017-07-16 21:28:02.707')
    go

    Insert Into Quality_Attribute
    Values('14964825','Comment',Null,'2017-07-16 21:28:02.707')
    go

    Insert Into Quality_Attribute
    Values('14964825','Reason',Null,'2017-07-16 21:28:02.707')
    go

    .

  • Once you've done that, do you have indexes on the FK columns in the tables that are looking at this table?

    There are Indexes created.

  • Ok... the next step is for us to make a bit of an assessment.   Please run the code (change the table name if it's not correct).

     SELECT  RowCnt  = COUNT(Quality_ID)
    ,NullCnt = COUNT(Attribute_Value)
    FROM dbo.Quality_Attribute
    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • RowCnt              NullCnt

    949718305     391584820

  • Bruin wrote:

    I have already put code in place to block the NULL values for some of the Attributes, not sure why is was ever code to allow those into the Table. I think it may have been for Future attribute values for those attributes, but it was never developed.

    So some of the attributes can still have NULL values?  Or - is this a cleanup effort to remove all of the NULL values and then set the column to NOT NULL?

    It appears that you will be deleting a bit less than half the table.  As someone else pointed out - it might be better to create a new table and move the data over to that new table instead of working through a batch delete.  But if the only method you can use is a batch delete - I provided some code that will work, once you have validated the where clause to be used.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply