Compare tables with more than 4.9 million records

Question

Compare tables with more than 4.9 million records

ms-techie

SSChasing Mays

Points: 627
More actions
March 18, 2014 at 10:30 am

#283494

Hi,
I want to compare ONLY 1 Column values from 2 tables having more than 4.9 million records. There is a difference of 4000 rows between the 2 tables.
SELECT ID From TABLE1 where ID not in (SELECT DISTINCT ID From TABLE2)
My above query took nearly 4.5 hours to run and I had to cancel it. Is there a better way to write the query . I just want to compare the ID - column values which are missing in TABLE2

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Luis Cazares SSC Guru Points: 183706 More actions · Answer 1

Have you tried the EXCEPT clause? Do you have indexes in your tables?

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

pietlinden SSC Guru Points: 63503 More actions · Answer 2

or maybe (From a database I had handy)

SELECT p.ProtocolNo

FROM Protocol p

WHERE NOT EXISTS (SELECT 1 FROM Enroll e WHERE e.ProtocolNo = p.ProtocolNo);

ProtocolNo is the primary key of Protocol, so that's indexed. e.ProtocolNo is a foreign key, so I indexed that too.

Are your PK and FK's indexed?

ms-techie SSChasing Mays Points: 627 More actions · Answer 3

Is EXCEPT Clause faster

or

is NOT EXISTS Clause faster ?

Luis Cazares SSC Guru Points: 183706 More actions · Answer 4

ms-techie (3/18/2014)
Is EXCEPT Clause faster
or
is NOT EXISTS Clause faster ?

Have you tested them? That's the best indicator.

Someone has made some tests for you. You could thank Gail, but you should test as well in your environment http://sqlinthewild.co.za/index.php/2010/02/18/not-exists-vs-not-in/

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

ms-techie SSChasing Mays Points: 627 More actions · Answer 5

EXCEPT Clause worked like a charm. I got output in just 9 mins from 4.5 hours. Thank you

Now when I do a simple select , it is taking more than 30 mins to execute . There are 4.9 million rows in the table.

SELECT ID FROM TABLE where ID = '1234'

--Is there a better than this SELECT QUERY.

pietlinden SSC Guru Points: 63503 More actions · Answer 6

Is there an index on the column "ID"? If not, there's your problem. Check the execution plan for the query - that will show you what's going on and where to start looking for causes. (If I were smart enough to understand it, I would go get a copy of Grant Fritchey's book on execution plans.) Actually, dig around for some of Grant's posts on reading execution plans - he might even have some one YouTube (look up ScaryDBA, and you'll find him).

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 7

food for thought.......note index

SELECT TOP 5000000

TranID = IDENTITY(INT, 1, 1)

INTO Table1

FROM sys.all_columns ac1

CROSS JOIN sys.all_columns ac2

CROSS JOIN sys.all_columns ac3

ALTER TABLE [dbo].[Table1] ADD CONSTRAINT [t1_ID] PRIMARY KEY CLUSTERED ([TranID] ASC)

GO

SELECT TOP 5000000

TranID = IDENTITY(INT, 1, 1)

INTO Table2

FROM sys.all_columns ac1

CROSS JOIN sys.all_columns ac2

CROSS JOIN sys.all_columns ac3

ALTER TABLE [dbo].[Table2] ADD CONSTRAINT [t2_ID] PRIMARY KEY CLUSTERED ([TranID] ASC)

GO

DELETE FROM Table2

WHERE (TranID % 1250 = 0)

SELECT Table1.TranID

, Table2.TranID AS Expr1

FROM

Table1 LEFT OUTER JOIN Table2 ON Table1.TranID = Table2.TranID

WHERE (Table2.TranID IS NULL);

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day