Deleting 150+ million rcds - Tips ?

  • We have a huge 210 million rcd archive table, and are thinking of archiving most of it to speed up future imports. There are 8 indexes (all non-clustered), so I imagine our inserts of 50,000 - 100,000 rcds could run faster with smaller indexes to update.

    So, if we decide to delete 3/4 of the records (maybe actually insert them into a "save" table), what steps should I take afterwards ?

    CREATE TABLE [dbo].[CallLogArchive](

    [id] [bigint] IDENTITY(1,1) NOT NULL,

    [Activity_Datetime] [datetime] NULL,

    [Time_ID] [int] NULL,

    [XID] [char](8) NULL,

    [First] [varchar](25) NULL,

    [Last] [varchar](25) NULL,

    [Supervisor] [char](6) NULL,

    [Shift] [char](1) NULL,

    [Room] [char](2) NULL,

    [Center] [char](1) NULL,

    [List_ID] [int] NULL,

    [Disp_ID] [int] NULL,

    [Campaign_ID] [uniqueidentifier] NULL,

    [RWA_ID] [char](10) NULL,

    [Start] [datetime] NULL,

    [VoiceStart] [datetime] NULL,

    [VoiceEnd] [datetime] NULL,

    [Released] [datetime] NULL,

    [DialerCode] [char](5) NULL,

    [Disposition] [uniqueidentifier] NULL,

    [CreditTo] [char](38) NULL,

    [WaitTime] [datetime] NULL,

    [CustomData] [nvarchar](255) NULL,

    [Code] [char](10) NULL,

    [CallLog_ID] [uniqueidentifier] NULL,

    [RecordingLength] [varchar](8) NULL

    ) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [CallLog_ID] ON [dbo].[CallLogArchive]

    ([CallLog_ID] ASC) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [Campaign_ID] ON [dbo].[CallLogArchive]

    ([Campaign_ID] ASC) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [CLA_ID] ON [dbo].[CallLogArchive]

    ([id] ASC) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [Code] ON [dbo].[CallLogArchive]

    ([Code] ASC) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [CustomData] ON [dbo].[CallLogArchive]

    ([CustomData] ASC) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [List_ID] ON [dbo].[CallLogArchive]

    ([List_ID] ASC) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [RWA_ID] ON [dbo].[CallLogArchive]

    ( [RWA_ID] ASC) ON [INDEXES]

    CREATE NONCLUSTERED INDEX [Time_ID] ON [dbo].[CallLogArchive]

    ([Time_ID] ASC) ON [PRIMARY]/code]

  • two things to consider....

    if you are going to delete the records after archiving them off to another table, consider doing it in batches....if you do too many rows at a time, you end up getting an exclusive table lock, and that can prevent others from reading from the table while your delete is going on.

    here's a snippet i saved as an example:

    SET ROWCOUNT 50000

    WHILE 1=1

    BEGIN

    DELETE FROM dbo.myTable

    WHERE MyField = My Condition

    IF @@ROWCOUNT = 0

    BREAK

    END

    SET ROWCOUNT 0

    after that many deletes, there would be lots of fragmentation in the indexes form the gaps we just deleted, and statistics would be way off, so I'd rebuild the indexes first to remove all the gaps in the pages, and then update statistics

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • i agree with lowell, also watch your logs.

  • If this takes up lot of time and you need to delete appx 70% of your records, i might do it the other way.

    change recovery model from FULL to Bulk Logged

    Move all records to a temp table(bulk insert).

    truncate original table.

    move the records you want to keep from temp table to original table(bulk insert)

    change recovery model back to FULL

    Retain the temp table for sometime (just to be sure things worked fine and delete it afterwards upon confirmation)

    You may remove indexes before moving the data and recreate indexes. This will speed up things further.



    Pradeep Singh

  • OR, create a table, mytable_new. Copy over just the records you wish to KEEP. Rename the current table (mytable) to mytable_old, then rename mytable_new to mytable.

    Saves you moving all the records to one table, then moving back only those that you wish to keep.

    If you aren't saving the old records, you could just drop the table instead of renaming it.

  • yes, you can also do that if no one is using the table. otherwise, data integrity issue...

  • Lynn Pettis (1/20/2009)


    OR, create a table, mytable_new. Copy over just the records you wish to KEEP. Rename the current table (mytable) to mytable_old, then rename mytable_new to mytable.

    Saves you moving all the records to one table, then moving back only those that you wish to keep.

    If you aren't saving the old records, you could just drop the table instead of renaming it.

    Yeah, this will save good amount of time 🙂 in this case keep the indexes on original table in place which will fetch the required records faster to be moved to the new table and remove them from source table once u're done moving, and recreate on the new table....



    Pradeep Singh

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply