May 28, 2009 at 1:22 pm
Hi all, sorry for this long description.
i am creating a purge process for the old records and i have a question regarding clustered index for the table below.
it seems that
CREATE TABLE [dbo].[OldRecords](
[ID] [int] NOT NULL,
[OrderID] [char](12) NOT NULL,
[IsDeleted] [tinyint] NOT NULL CONSTRAINT [DF_OldRecords_IsDeleted] DEFAULT ((0)),
[CreateDate] [datetime] NOT NULL CONSTRAINT [DF_OldRecords_CreateDate] DEFAULT (getdate()),
[DeletedDate] [datetime] NULL,
CONSTRAINT [PK_OldRecords] PRIMARY KEY CLUSTERED
(
[ID] ASC
))
Also, there are 2 non-clustered indexes defined on this table which are used:
One is defined based on IsDeleted column and another on DeletedDate.
The way the purge works is basically every day it grabs old id and orderid from the original table based on the delivery date for an order and inserts these records into OldRecords table.
So, yesterday for example it can insert records like:
IDOrderIDIsDeletedCreateDateDeletedDate
1DUF002009-05-27 15:26:58.280NULL
4LDYX002009-05-27 15:26:58.280NULL
and today it will insert this data:
2LDYWF02009-05-28 15:26:58.280NULL
3DYUV02009-05-28 15:26:58.280NULL
5LE2UX02009-05-28 15:26:58.280NULL
and the final result will be like:
IDOrderIDIsDeletedCreateDateDeletedDate
1DUF002009-05-27 15:26:58.280NULL
2LDYWF02009-05-28 15:26:58.280NULL
3DYUV02009-05-28 15:26:58.280NULL
4LDYX002009-05-27 15:26:58.280NULL
5LE2UX02009-05-28 15:26:58.280NULL
Here is where i am not sure if the Clustered index defined on OldRecords table is the correct way to do so since records (can be many of them) will be inserted randomly and will cause page splits.
The next step after the records inserted will be:
DECLARE @rowcount INT
@count INT
SET @rowcount = 5000
DECLARE @DelOldRecords Table (ID INT PRIMARY KEY,
OrderID CHAR(12))
WHILE 1 = 1
BEGIN
INSERT INTO @DelOldRecords
(ID,
OrderID)
SELECT TOP (@rowcount) TMP.ID, TMP.OrderID
FROM dbo.OldRecords TMP
WHERE TMP.IsDeleted = 0
ORDER BY TMP.ID -- I am doing this in order to get the oldest records first
-- delete records based on the ID column
DELETE O FROM dbo.Order O INNER JOIN @DelOldRecords TV ON O.ID = TV.ID
-- or delete records based on the
DELETE O FROM Authorize O INNER JOIN @DelOldRecords TV ON O.OrderID = TV.OrderID -- how bad this join will perform since there is no index for OrderID in table variable?
UPDATE TMP
SET TMP.IsDeleted = 1,
TMP.DeletedDate = GETDATE()
FROM dbo.OldRecords TMP
INNER JOIN @DelOldRecords TV
ON TMP.ID = TV.ID
AND TMP.IsDeleted = 0
SELECT @count = @@ROWCOUNT
DELETE FROM @DelOldRecords
IF @count < @rowcount BREAK
END
The OldRecords table also will be purged based on DeletedDate < getdate - 2 month
Can you please recommend the best way to handle indexes in this case?
Thanks
May 28, 2009 at 1:37 pm
Do you ever do any queries on that table?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 28, 2009 at 3:26 pm
The only query on that table will be:
SELECT TOP (@rowcount) TMP.ID, TMP.OrderID
FROM dbo.OldRecords TMP
WHERE TMP.IsDeleted = 0
ORDER BY TMP.ID -- I am doing this in order to get the oldest records first
when we get records in a batch for a purge and then this table will be updated.
Viewing 3 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply