Need ideas on query performance

Question

Post reply

Need ideas on query performance

mayurkb

Ten Centuries

Points: 1155
More actions
March 13, 2012 at 9:47 am

#252898

I have 2 tables (suppose A and B) with around 1 million rows each. I want to find and delete the records from table A that does not exist in table B.
The Primary key and clustered on both tables is a uniqueidentifier column. And PK of A goes into B as a foreign key.
This query is part of a stored proc. and is taking anywhere from 90 to 120 SECONDS. It always uses clustered index scan on the table A which is consuming most of the time.
I initially had a query which used NOT EXISTS and then I used all other options like NOT IN, LEFT JOIN, TOP. But nothing is helping out.
I need ideas of how can I achieve this in more optimized way.
Here are the schemas
CREATE TABLE A(
[PersonId] [uniqueidentifier] NOT NULL,
[FirstName] [nvarchar](50) NULL,
[LastName] [nvarchar](50) NULL,
[MiddleName] [nvarchar](50) NULL,
[Prefix] [nvarchar](20) NULL,
[Suffix] [nvarchar](20) NULL,
[ProfessionalSuffix] [nvarchar](20) NULL,
[SystemEditFlag] [tinyint] NOT NULL,
[EntityVersion] [timestamp] NOT NULL,
CONSTRAINT [PK_A] PRIMARY KEY CLUSTERED
(
[PersonId] ASC
)
)
CREATE TABLE B(
[PersonId] [uniqueidentifier] NOT NULL
--...other columns
CONSTRAINT [PK_B] PRIMARY KEY CLUSTERED
(
[PersonId] ASC
)
)
ALTER TABLE B WITH CHECK ADD CONSTRAINT [FK_B_A] FOREIGN KEY([PersonId])
REFERENCES A ([PersonId])
Thanks in advance.
Regards
Mayur

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 1

Try:

1. Use SELECT INTO to insert records you want to stay into new table using INNER JOIN, IN or EXIST

2. Drop the old table

3. Rename new one to old name

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

mayurkb Ten Centuries Points: 1155 More actions · Answer 2

Thanks for the reply.

But I cannot do that because A is a physical table in the database and all I'm doing is a clean up stuff. I cannot drop and recreate the whole table while there are other mainline transactions taking place.

Jared SSC Guru Points: 61793 More actions · Answer 3

Is there a non-clustered index on Table B for personid?

Jared
CE - Microsoft

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 4

mayur birari (3/13/2012)
Thanks for the reply.
But I cannot do that because A is a physical table in the database and all I'm doing is a clean up stuff. I cannot drop and recreate the whole table while there are other mainline transactions taking place.

If you do it in a table which is in use and most likely participates in relationships, the way you do it looks right and time taken to process looks fine to me.

Is you clean up one-off thing or you need to run it regularly? If the second, why not to avoid inserting not-needed records at the first place instead of doing regular clean-up (which more appropriate for log-kind tables)

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 5

SQLKnowItAll (3/13/2012)
Is there a non-clustered index on Table B for personid?

It's defined as PRIMARY KEY CLUSTERED in the OP posted DDL...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

Try this is a test environment.

delete from TABLEA

where exists(select Personid from TABLEA except select Personid from TABLEB);

Jared SSC Guru Points: 61793 More actions · Answer 7

Eugene Elutin (3/13/2012)
SQLKnowItAll (3/13/2012)
Is there a non-clustered index on Table B for personid?
It's defined as PRIMARY KEY CLUSTERED in the OP posted DDL...

I see that the OP did that, but it does not make sense normally (although in cases it does) he referenced it as a foreign key to table A. My assumption is that the OP made an error in his/her DDL.

Jared
CE - Microsoft

mayurkb Ten Centuries Points: 1155 More actions · Answer 8

@sqlknowitall

Yes. it does have a foreign key in table B. I agree it may not be needed since it is also a clustered PK on the same table.

@Lynn

I did try Except clause, it has same execution plan and takes same time.

@Eugene

It runs from configurable clean up utility. It runs either every night, every 15 days or once a month. Different for each customer.

And I cannot avoid that because the same foreign key goes into 2 more tables and over the time it can have orphaned records.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 9

mayur birari (3/13/2012)
Yes.

Yes what?

Have you tested the code I posted?

Jared SSC Guru Points: 61793 More actions · Answer 10

mayur birari (3/13/2012)
@SQLKnowItall
Yes. it does have a foreign key in table B. I agree it may not be needed since it is also a clustered PK on the same table.
@Lynn
I did try Except clause, it has same execution plan and takes same time.
@Eugene
It runs from configurable clean up utility. It runs either every night, every 15 days or once a month. Different for each customer.
And I cannot avoid that because the same foreign key goes into 2 more tables and over the time it can have orphaned records.

No, that's still needed as a constraint if it fits your design. What I would like for you to do is run your delete query(s) without the DELETE. Make them selects. My guess is that it is not the LEFT JOIN or NOT IN, etc. causing the execution time to climb. It is the delete itself. When we purge tables in this fashion, we typically run it in batches of 100 or 1000 for 15 minutes. Of course, this depends on how thr traffic to your databases fluctuates.

Jared
CE - Microsoft

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 11

SQLKnowItAll (3/13/2012)
mayur birari (3/13/2012)
@SQLKnowItall
Yes. it does have a foreign key in table B. I agree it may not be needed since it is also a clustered PK on the same table.
@Lynn
I did try Except clause, it has same execution plan and takes same time.
@Eugene
It runs from configurable clean up utility. It runs either every night, every 15 days or once a month. Different for each customer.
And I cannot avoid that because the same foreign key goes into 2 more tables and over the time it can have orphaned records.
No, that's still needed as a constraint if it fits your design. What I would like for you to do is run your delete query(s) without the DELETE. Make them selects. My guess is that it is not the LEFT JOIN or NOT IN, etc. causing the execution time to climb. It is the delete itself. When we purge tables in this fashion, we typically run it in batches of 100 or 1000 for 15 minutes. Of course, this depends on how thr traffic to your databases fluctuates.

Based on the above, you could do something like the following:

while (select count(*) from TABLEA except select Personid from TABLEB) > 0

begin

delete top(5000) from TABLEA -- delete 5,000 records at a time (set as needed for your appliction)

where exists(select Personid from TABLEA except select Personid from TABLEB);

waitfor delay '00:00:05'; -- 5 second delay

end

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 12

Rereading your original post I noticed this:

It always uses clustered index scan on the table A which is consuming most of the time.

With an ordered list of Personid's, this is efficient. You are scanning the the index, which is basically the table since the clustered index is declared on Personid.

Jared SSC Guru Points: 61793 More actions · Answer 13

Lynn Pettis (3/13/2012)
SQLKnowItAll (3/13/2012)
mayur birari (3/13/2012)
@SQLKnowItall
Yes. it does have a foreign key in table B. I agree it may not be needed since it is also a clustered PK on the same table.
@Lynn
I did try Except clause, it has same execution plan and takes same time.
@Eugene
It runs from configurable clean up utility. It runs either every night, every 15 days or once a month. Different for each customer.
And I cannot avoid that because the same foreign key goes into 2 more tables and over the time it can have orphaned records.
No, that's still needed as a constraint if it fits your design. What I would like for you to do is run your delete query(s) without the DELETE. Make them selects. My guess is that it is not the LEFT JOIN or NOT IN, etc. causing the execution time to climb. It is the delete itself. When we purge tables in this fashion, we typically run it in batches of 100 or 1000 for 15 minutes. Of course, this depends on how thr traffic to your databases fluctuates.
Based on the above, you could do something like the following:
while (select count(*) from TABLEA except select Personid from TABLEB) > 0
begin
delete top(5000) from TABLEA -- delete 5,000 records at a time (set as needed for your appliction)
where exists(select Personid from TABLEA except select Personid from TABLEB);
waitfor delay '00:00:05'; -- 5 second delay
end

That's what I'm talking about! 🙂 However, if the SELECT itself is slow, we will know that there may be something else going on outside of the delete. Heck, if this thing runs daily and here are only 2 rows to delete daily, then the issue resides outside of the DELETE. I would like to isolate that piece first.

Jared
CE - Microsoft

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 14

Curious, what type of plan does this produce:

select

Personid

from

TABLEA

except

select

Personid

from

TABLEB

)

delete top(5000) from ta

from

TABLEA ta

inner join IdsToDelete itd

on (ta.Personid = itd.Personid)