November 15, 2024 at 12:24 pm
I have a very large table and I'm looking to do a cleanup based upon a non-keyed field. This table many concurrent insert transactions running against it, and I can't afford downtime to create an Index this field. Looking for some suggestions using the PK(ID) to help speed up the deletes and minimize Locking on the table.
Right now driver for cleanup is something like this:
delete from Image_Classification_Master
where
convert(date,SpoolStartDt) = '02/22/2022'
Thanks.
CREATE TABLE [dbo].[Image_Classification_Master](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ImageName] [nvarchar](500) NULL,
[ImageType] [nvarchar](200) NULL,
[ImageSource] [image] NULL,
[ReceivedDateTime] [datetime] NULL,
[ImagePath] [nvarchar](500) NULL,
[Site] [varchar](3) NULL,
[MachineNbr] [int] NULL,
[LineNbr] [int] NULL,
[TakeUpNbr] [int] NULL,
[SpoolNbr] [int] NULL,
[ImageIndex] [int] NULL,
[CameraNbr] [varchar](2) NULL,
[SpoolStartDt] [datetime] NULL,
[SpoolStartTime] [time](7) NULL,
[DefectDate] [datetime] NULL,
[DefectTime] [time](7) NULL,
[DefectNbr] [int] NULL,
[DefectClass] [varchar](100) NULL,
[Reviewer] [nvarchar](50) NULL,
[UserDefectInput] [nvarchar](10) NULL,
CONSTRAINT [PK_Image_Classification_Master] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[Image_Classification_Master] ADD CONSTRAINT [DF_Image_Classification_Master_ReceivedDateTime] DEFAULT (getdate()) FOR [ReceivedDateTime]
GO
November 15, 2024 at 3:31 pm
Oof. Even if you put an index on the SpoolSrtDt column, because you have a function to convert the value you're going to get scans anyway. Why do that? It's a datetime column. Compare it to a datetime value. Then, an index could help. Otherwise, you're just looking at scans and no way around that. Because the clustered index is ID, that's what must be used to delete values. No getting around it. So, they're either found through a scan of the clustered index, or, you build an index on the appropriate column (which will absolutely have some affect on the system, unless you're running Enterprise, then you can do an online index creation) and pay that cost so it doesn't have to do the table scan. No magic way around this really.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply