Delete is so slow, help

Question

Delete is so slow, help

nmederich

Old Hand

Points: 347
More actions
September 7, 2005 at 3:32 pm

#102778

I'm tearing what's left of my hair out with this slower than slow delete statement.
It's a simple delete statement to remove records from 1 table. But it can take up to 1 minute to execute, way to slow for our app. I did a Ctrl-L to show the Query Plan and there's a SORT operation that has an estimated cost of 201,296 and the estimated row count is 1,071,290,600 and supposedly is 98% of the query cost. The SORT is on a table not listed in the DELETE statement so I'm guessing it's maybe a cascading delete on a table that references the vts_tbQuestion table?
This is the query causing me problems:
declare @QuestionID int
set @questionid = 5178
DELETE FROM vts_tbQuestion WHERE ParentQuestionID = @questionid
You can view a jpeg of the Queryplan at http://www.savesomonauk.com/img/queryplan.jpg
The vts_tbQuestion table has 3633 rows. The vts_tbAnswer table has 15,802 rows. The vts_tbVoterAnswer table has 565,037 rows.
Below is a simplified structure of the tables. By the way this is on SQL 2000 SP4.
CREATE TABLE [dbo].[vts_tbAnswer] (
[AnswerId] [int] IDENTITY (1, 1) NOT NULL ,
[QuestionId] [int] NOT NULL ,
[AnswerTypeId] [smallint] NULL ,
[AnswerText] [nvarchar] (4000) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[vts_tbQuestion] (
[QuestionId] [int] IDENTITY (1, 1) NOT NULL ,
[ParentQuestionID] [int] NULL ,
[SurveyID] [int] NULL ,
[QuestionText] [nvarchar] (4000) NULL
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[vts_tbVoterAnswers] (
[VoterID] [int] NOT NULL ,
[AnswerID] [int] NOT NULL ,
[AnswerText] [ntext] NULL ,
[SectionNumber] [int] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[vts_tbVoterAnswers] WITH NOCHECK ADD
CONSTRAINT [PK_vts_tbVoterAnswers] PRIMARY KEY CLUSTERED
(
[VoterID],
[AnswerID],
[SectionNumber]
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [idxiQuestionID] ON [dbo].[vts_tbAnswer]([QuestionId]) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [idxiPollID] ON [dbo].[vts_tbQuestion]([SurveyID], [ParentQuestionID], [QuestionId]) ON [PRIMARY]
GO
CREATE INDEX [nci_va] ON [dbo].[vts_tbVoterAnswers]([AnswerID]) ON [PRIMARY]
GO
ALTER TABLE [dbo].[vts_tbAnswer] ADD
CONSTRAINT [PK_vts_tbAnswer] PRIMARY KEY NONCLUSTERED
([AnswerId]) ON [PRIMARY]
GO
ALTER TABLE [dbo].[vts_tbQuestion] ADD
CONSTRAINT [PK_vts_tbQuestion] PRIMARY KEY NONCLUSTERED
([QuestionId]) ON [PRIMARY]
GO
ALTER TABLE [dbo].[vts_tbAnswer] ADD
CONSTRAINT [FK_vts_tbAnswer_vts_tbQuestion] FOREIGN KEY
([QuestionId]) REFERENCES [dbo].[vts_tbQuestion] (
[QuestionId]) ON DELETE CASCADE ON UPDATE CASCADE
GO
ALTER TABLE [dbo].[vts_tbVoterAnswers] ADD
CONSTRAINT [FK_VoterAnswers_vts_tbAnswer] FOREIGN KEY
([AnswerID]) REFERENCES [dbo].[vts_tbAnswer] (
[AnswerId]
) ON DELETE CASCADE ON UPDATE CASCADE

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

noeld SSC Guru Points: 96590 More actions · Answer 1

Don't have much time now but a glance at your indexes logic indicate that your :

CREATE CLUSTERED INDEX [idxiPollID] ON [dbo].[vts_tbQuestion]([SurveyID], [ParentQuestionID], [QuestionId]) ON [PRIMARY]

GO

Should probably better be:

CREATE UNIQUE CLUSTERED INDEX [idxiPollID] ON [dbo].[vts_tbQuestion]( [ParentQuestionID], [QuestionId],[SurveyID]) ON [PRIMARY]

GO

The order MATTERS

Good Luck!

I'll be back tomorrow if you need more help

* Noel

Sergiy SSC Guru Points: 110208 More actions · Answer 2

There is no index on QuestionId in table vts_tbAnswer.

That's why SQL Server creates hash table with millions rows to get all possible joins with table vts_tbQuestion. Because therwe is no index it takes all pairs, including those where values are not equal.

Than it sorts it and deletes those which are not to be there.

And only than it filters it by QuestionId value in vts_tbQuestion.

Add the index and you'll feel the difference.

_____________
Code for TallyGenerator

noeld SSC Guru Points: 96590 More actions · Answer 3

There is no index on QuestionId in table vts_tbAnswer.

Sorry but :

CREATE CLUSTERED INDEX [idxiQuestionID] ON [dbo].[vts_tbAnswer]([QuestionId]) ON [PRIMARY]

GO

Proof your assumption wrong There is an index in that table and in that column.

The statement :

DELETE FROM vts_tbQuestion WHERE ParentQuestionID = @questionid

Couldn't care less about table vts_tbAnswer ..

That is why I said that the CLUSTERED index that was created on vts_tbQuestion needed to be arranged differently

Cheers!

* Noel

PW-201837 SSC-Insane Points: 20805 More actions · Answer 4

Look carefully at the constraints.

ON DELETE CASCADE.

A *parent* question ID is being deleted. Which may delete more than 1 record from vts_tbQuestion. Which will cascade and delete multiple records from vts_tbAnswer. But wait, now there's a ON DELETE CASCADE from vts_tbAnswer to vts_tbVoterAnswers.

So the original delete is actually potentially deleting a large number of records from 3 tables.

Except, for the table where most records will be deleted from, there is a non-clustered index that is probably not selective enough since it is on AnswerID which will have many repeating records of the same value, so SQL Server will ignore that index:

CREATE INDEX [nci_va] ON [dbo].[vts_tbVoterAnswers]([AnswerID]) ON [PRIMARY]

I'd say the solution starts with ensuring the index on vts_tbVoterAnswers gets used in the cascading delete, either by making it clustered, or adding columns to make it more selective.

noeld SSC Guru Points: 96590 More actions · Answer 5

Ah!! Missed the CASCADE

PW you are correct!

apparently the primary key was not good enough (which is what was used in the SORT) I would also try REORDERING that one as well by setting answerid as the FIRST column!

Good catch!

* Noel

nmederich Old Hand Points: 347 More actions · Answer 6

Well I've been trying a few changes with little success.

A performance gain happened when I deleted the index:

CREATE INDEX [nci_va] ON [dbo].[vts_tbVoterAnswers]([AnswerID]) ON [PRIMARY]

Removing the index improved the performance by about 20%. Good but still not good enough. I need to get this delete down to 10 seconds or less if I can.

Unfortunately, removing this index really hurt performance of other Select queries in the application. So it moved the performance problem from one spot to another. I had to add the index back.

Has anyone experienced performance problems after implementing cascading constraints?

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

Yup. I've got a massive cascade structure in my db. A parent table cascades to 8 child tables, each of which cascades to two more.

I had no joy with indexes, but got massive speed increases by doing the cascade manually.

If there is only one place where you ever delete from the main table, I can suggest that you remove the offending cascade (the one that the massive sort is done on), make it a normal foreign key and do the delete from the child table first.

It is more work, and you have to be very careful to find all the places the delete is done, but it might help your speed problems.

HTH.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

nmederich Old Hand Points: 347 More actions · Answer 8

Thankyou all for your assistance with this vexing cascading.

I agree that have cascading references setup adds a lot of overhead that can be avoided if you're willing to do the work.

But I think I found a workable solution.

Removing the following index helped performance some:

CREATE INDEX [nci_va] ON [dbo].[vts_tbVoterAnswers]([AnswerID]) ON [PRIMARY]

This hurt performance on some Select statements however,so to combat that I rearranged the columns (and removed column "SectionNumber") on the clustered index of vts_tbVoterAnswers like so:

ALTER TABLE [dbo].[vts_tbVoterAnswers] WITH NOCHECK ADD

CONSTRAINT [PK_vts_tbVoterAnswers] PRIMARY KEY CLUSTERED

( [AnswerID], [VoterID])

I found that there is fewer distinct Answerid values than VoterID values and thought making Answerid the first column in the clustered index would help.

This helped the performance of the Select statements and I think it may have helped with this Delete as well.

Now the Delete runs in less than 5 seconds.

This is good enough.

Thanks guys.