SELECT DISTINCT INTO or in-place dedupe: which is faster?

Question

Ghanta

Hall of Fame

Points: 3999
August 20, 2009 at 4:23 am

#140979

Hey guys,
I have a table with 4 billion records that I need to dedupe (there aren't many duplicates, though).
Which is more efficient:
SELECT DISTINCT * INTO new_tbl FROM old_table;
DROP TABLE old_table;
EXEC sp_rename 'new_tbl', 'old_table';
or
deleting the duplicate records from the table in place?
Thanks!
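
For comparison, the in-place option could look roughly like the sketch below. It assumes SQL Server 2005 or later (for ROW_NUMBER and deleting through a CTE), and the names dbo.old_table, col_a, col_b, and col_c are placeholders, not the real schema. The PARTITION BY would need to list every column that defines a duplicate.

```sql
-- Sketch only: hypothetical table and column names.
-- Number the rows within each group of identical values; rn = 1 is the row to keep.
WITH numbered AS
(
    SELECT ROW_NUMBER() OVER (
               PARTITION BY col_a, col_b, col_c   -- every column that defines a duplicate
               ORDER BY (SELECT NULL)             -- copies are identical, so any order works
           ) AS rn
    FROM dbo.old_table
)
DELETE FROM numbered
WHERE rn > 1;   -- removes the extra copies, keeps one row per group
```

Only the extra copies are deleted, but finding them still means reading the whole table and sorting it unless an index already provides that order, and every existing index has to be maintained for each deleted row.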

Jack Corbett SSC Guru Points: 184397 · Answer 1

I'm not sure which would be faster. Both have advantages and disadvantages. You'll need to try both in dev.

Thinking about it, I'd guess the SELECT INTO would be faster because the new table has no indexes to maintain during the load. You would then have to create all the indexes on the new table, which takes time and resources, but the original table stays available to your application while that happens. How much space do you have?
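
A rough sketch of that copy-and-swap sequence, assuming SQL Server and placeholder names throughout (dbo.old_table, dbo.new_tbl, col_a, and the index name are all hypothetical). sp_spaceused reports how much space the original occupies, which is approximately what the copy will need on top of it.

```sql
-- Sketch only: object, column, and index names are placeholders.
-- 1. Check how much space the original uses; the copy needs roughly that much again.
EXEC sp_spaceused N'dbo.old_table';

-- 2. Copy the distinct rows into a new heap (no indexes to maintain during the load).
SELECT DISTINCT *
INTO dbo.new_tbl
FROM dbo.old_table;

-- 3. Recreate whatever indexes the original had; one hypothetical example.
CREATE CLUSTERED INDEX cix_new_tbl ON dbo.new_tbl (col_a);

-- 4. Swap the names in one transaction, keeping the original until the copy is verified.
BEGIN TRANSACTION;
    EXEC sp_rename N'dbo.old_table', N'old_table_backup';
    EXEC sp_rename N'dbo.new_tbl',   N'old_table';
COMMIT TRANSACTION;

-- 5. Drop the backup only after checking the new table.
-- DROP TABLE dbo.old_table_backup;
```

SELECT INTO does not carry over indexes, constraints, triggers, or permissions, so those would need to be recreated on the new table before the swap.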

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Jeff Moden SSC Guru Points: 1004704 · Answer 2

Ghanta (8/20/2009)
(not many duplicates though)..

It's important... how many?
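
One way to answer that before picking an approach is to count the extra copies directly. A sketch, again with placeholder names (dbo.old_table, col_a, col_b, col_c); COUNT_BIG is used because a count of 4 billion rows does not fit in an int.

```sql
-- Sketch only: col_a, col_b, col_c stand in for the columns that define a duplicate.
SELECT COUNT_BIG(*)        AS duplicated_groups,  -- distinct values that occur more than once
       SUM(dup.copies - 1) AS extra_rows          -- rows a dedupe would remove
FROM
(
    SELECT COUNT_BIG(*) AS copies
    FROM dbo.old_table
    GROUP BY col_a, col_b, col_c
    HAVING COUNT_BIG(*) > 1
) AS dup;
```

If there are no duplicates at all, duplicated_groups comes back as 0 and extra_rows as NULL.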

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)