Copying a billion rows

  • Does anybody have experience copying a billion rows of text from one table to another, especially in a production environment?

    Thanks

     

    Shas3

  • Not that big. I've done a few million, but that's it.

  • I've done about 8 million rows of text/integer data using BCP, which was fairly fast and painless. I used the Import Data Wizard for a 3 million row table, which was a bad idea: it took a long time and the transaction log nearly took the log disk out. (And shrinking it took a long time too.)

    I've never tried putting a large table into a separate filegroup, backing that up, then restoring it to a different database/filegroup, but I'm keen to try... I wonder if it would be a good approach?

    Experiment, then tell us .....

     


    The system's fine with no users logged in. Can we keep it that way?

  • I think bcp is your best bet.

    bcp out using native format. Remove all indexes before you bcp in (unless the data is sorted in your clustered index order, in which case you can leave the clustered index in place). Rebuild the indexes after the load has completed.

    Still going to take a long time and a lot of disk space.
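
    Something along these lines, as a rough sketch -- the server name, database names, file path, table name, and the id column are placeholders, and the batch size is just a starting point:

        rem export in native format over a trusted connection
        bcp SourceDB.dbo.BigTable out E:\staging\BigTable.dat -n -S MYSERVER -T

        rem load in batches under a table lock; the ORDER hint tells bcp the file
        rem is already sorted on the clustered key, so that index can stay in place
        bcp TargetDB.dbo.BigTable in E:\staging\BigTable.dat -n -S MYSERVER -T -b 50000 -h "TABLOCK, ORDER(id ASC)"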

    Andy

  • I've not hit the magic billion, but I have done hundreds of millions.

    I used to take the production server down for a morning when I did it.

    My approach was as follows

    • Calculate the size of the table to be copied. See BOL: in the index, type "calculating table size"; the top three entries are the relevant ones. (A rough sketch of this and the sizing/log steps appears after this list.)
    • Expand the target database by that size.  You don't want the database autoresizing when this baby runs!
    • If you use truncate log on checkpoint, switch it off.
    • Expand the transaction log, again you don't want it auto-resizing.
    • Drop all indices/constraints/primary keys/triggers etc. on the target table.
    • Use DTS Wizard to build and run the insert.
    • Rebuild the indices/constraints/primary keys/triggers etc.
    • You may want to truncate the transaction log and DBCC SHRINKFILE it.
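
    A rough T-SQL sketch of the sizing and log steps (database, logical file, and table names are placeholders, and the sizes are made up -- plug in the figures from your own size calculation):

        -- step 1: get a rough size for the source table
        EXEC sp_spaceused 'dbo.BigTable';

        -- steps 2 and 4: pre-grow the target data and log files so they
        -- don't autoresize during the load
        ALTER DATABASE TargetDB MODIFY FILE (NAME = TargetDB_Data, SIZE = 200GB);
        ALTER DATABASE TargetDB MODIFY FILE (NAME = TargetDB_Log,  SIZE = 50GB);

        -- step 3: switch off truncate-log-on-checkpoint (this is the SQL 7/2000
        -- option name; on later versions it corresponds to the SIMPLE recovery model)
        EXEC sp_dboption 'TargetDB', 'trunc. log on chkpt.', 'false';

        -- step 8: afterwards, truncate the log and shrink it back down (target size in MB)
        BACKUP LOG TargetDB WITH TRUNCATE_ONLY;
        DBCC SHRINKFILE (TargetDB_Log, 1024);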

    I'm a bit rusty on the process because it is something I did in my last job, over 4 years ago, but I seem to remember I had to expand TEMPDB for the reindexing part of the process. Again, you don't want autoresizing to take place while the job runs.
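
    Something along these lines for the TEMPDB part (tempdev is the default logical name of tempdb's data file; the size is a placeholder):

        -- pre-grow tempdb's data file before the index rebuilds so it
        -- doesn't autoresize mid-job
        ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 50GB);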

    I also had to kick everyone off the database server when I did the process.

    The choice was stay off the server and have everything back by lunchtime, or stay on the server and this urgent data might be ready next week.

  • I agree with David's steps, but there are two additional things you might consider if you're exporting the data out to a flat file first...

    1. If the table you're inserting into has a clustered index, you may want to presort the data on that key. In that case you also wouldn't want to drop the clustered index on the 'to' table before you insert the data.
    2. You may want to try the BULK INSERT command instead of BCP - it is slightly more efficient. Be sure to include the TABLOCK and ORDER clauses for minimal logging, and make sure the database is in either simple or bulk-logged recovery mode (see the sketch below).
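
    Roughly what I have in mind -- the table, file path, and id column are placeholders, and the ORDER hint should match the 'to' table's clustered key:

        -- minimally logged load: table lock, data file already sorted on the
        -- clustered key, database in BULK_LOGGED or SIMPLE recovery
        BULK INSERT dbo.BigTable
        FROM 'E:\staging\BigTable.dat'
        WITH (
            DATAFILETYPE = 'native',
            TABLOCK,
            ORDER (id ASC),
            BATCHSIZE = 50000
        );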

    David is correct too. TEMPDB will expand when you rebuild indexes on tables that large, so be generous when you pre-size it: it's easier to shrink it later than to let it auto-grow during the CREATE INDEX statements.
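
    For the rebuilds themselves, something like the following sketch (index, table, and column names are placeholders); SORT_IN_TEMPDB is what pushes the sort work into TEMPDB:

        -- recreate a nonclustered index after the load, doing the sort work in tempdb
        CREATE INDEX IX_BigTable_SomeCol
            ON dbo.BigTable (SomeCol)
            WITH SORT_IN_TEMPDB;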

    Good Luck
