February 6, 2011 at 9:40 pm
Hi,
I've got two databases MyDB1 and MyDB2 with some of the tables of the same structure and some are different. I need to clean up Company table in MyDB2, leaving only records that exist in MyDB1. I can't truncate and populate the table from MyDB1 due to the existance of indexed views. I can't drop and re-create indexed views, because they may change in the next version, but the script should not have to be changed.
I use the below code to do the job. "Not In" structure is a bit slow. Is there a better and faster way? Like using some sort of join?
Thanks.
SET ROWCOUNT 2000000
select top 1 * from MyDB1.[dbo].[Company] (nolock) WHERE [CompanyID] not in (
SELECT [CompanyID] FROM MyDB2.[dbo].[Company])
while @@rowcount > 0
begin
delete MyDB1.[dbo].[Company] WHERE [CompanyID] not in (
SELECT [CompanyID] FROM MyDB2.[dbo].[Company])
end
SET ROWCOUNT 0
February 6, 2011 at 11:02 pm
I'd probably try to delete fewer rows in every bite but that's basically it. If you have indexed views, they'll be a part of the reason why it goes so slowly... they're update during the deletes, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
February 6, 2011 at 11:12 pm
I am disabling the indexes on indexed views so this is not a problem. The problem is with "NOT IN"
I am trying something like this:
delete
-- select top 10 * from
from MyDB1.[dbo].[Company] c1
left join MyDB2.[dbo].[Company] c2
on c1.CompanyID = c2.CompanyID
WHERE c2.[CompanyID] is NULL
But it gives me a syntax error.
February 7, 2011 at 3:46 am
Extra FROM
select top 10 * from
from
February 7, 2011 at 4:15 pm
That extra "from" is related to commented out select statement.
I worked this out:
delete MyDB1.[dbo].[Company]
-- select top 10 * from
from MyDB1.[dbo].[Company] c1
left join MyDB2.[dbo].[Company] c2
on c1.CompanyID = c2.CompanyID
WHERE c2.[CompanyID] is NULL
The only thing, it does not run any faster compared to "not in" approach. I actually took 50% longer for some reason, perhaps different load. I was expecting the two largest tables to drop to 1 hour from 4 hours, but instead they went up to 6 hours. Strange...
February 7, 2011 at 4:44 pm
I wonder if EXCEPT or INTERSECT can be used here.
Just an idea...
--Vadim R.
February 8, 2011 at 1:58 am
DECLARE@CurrID INT = (SELECT MIN(PrimaryKeyColumnNameHere) FROM MyDB1.[dbo].[Company]),
@MaxID INT = (SELECT MAX(PrimaryKeyColumnNameHere) FROM MyDB1.[dbo].[Company]),
@Interval INT = 100000
WHILE @CurrID <= @MaxID
BEGIN
DELETEtgt
FROMMyDB1.dbo.Company AS tgt
LEFT JOINMyDB2.dbo.Company AS src ON src.CompanyID = tgt.CompanyID
WHEREtgt.PrimaryKeyColumnNameHere >= @CurrID
AND tgt.PrimaryKeyColumnNameHere < @CurrID + @Interval
AND src.PrimaryKeyColumnNameHere IS NULL
SET@CurrID = @CurrID + @Interval
END
N 56°04'39.16"
E 12°55'05.25"
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply