drive space issues trying to index 2.4TB table

  • OK, this landed on my desk this morning: a user created a 2.4TB table without a clustered key. He now wants a clustered key on the table but is unable to do so because he's running out of disk space. We're unable to allocate him more disk space and have cleaned up about all we can clean up.

    Here's what I'm working with:

    SQL Server 2008 R2

    Data Drive: 5.3TB (200GB free space)

    DB unallocated space: .5TB

    Tbl Size: 2.4TB

    Rows: 1,268,636,579

    TempDb Drive: 115GB

    My initial thought was to create a nonclustered index and then recommend he consult with someone before doing something like this in the future. The more likely scenario would be to create a new table with a clustered key and then migrate the data over in small batches.
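
    A rough sketch of that batch-move approach, assuming a hypothetical heap dbo.BigTable with an ever-increasing bigint ID column (table, columns, and batch size are invented here and would need to match the real schema):

    -- Replacement table, created with the clustered key already in place.
    CREATE TABLE dbo.BigTable_New
    (
        ID       bigint       NOT NULL,
        LoadDate datetime     NOT NULL,
        Payload  varchar(200) NULL,
        CONSTRAINT PK_BigTable_New PRIMARY KEY CLUSTERED (ID)
    );
    GO

    -- Move rows in small ID ranges; delete each range from the heap as it lands
    -- so the freed pages can be reused by the new table.
    DECLARE @BatchSize bigint = 100000,
            @MinID bigint, @MaxID bigint;

    SELECT @MinID = MIN(ID), @MaxID = MAX(ID) FROM dbo.BigTable;

    WHILE @MinID <= @MaxID
    BEGIN
        BEGIN TRANSACTION;

        INSERT INTO dbo.BigTable_New (ID, LoadDate, Payload)
        SELECT ID, LoadDate, Payload
        FROM dbo.BigTable
        WHERE ID >= @MinID AND ID < @MinID + @BatchSize;

        -- TABLOCK lets the heap deallocate its emptied pages.
        DELETE FROM dbo.BigTable WITH (TABLOCK)
        WHERE ID >= @MinID AND ID < @MinID + @BatchSize;

        COMMIT TRANSACTION;

        SET @MinID = @MinID + @BatchSize;
    END;

    Small transactions keep each log hit manageable, but if the DB stays in full recovery the log still needs regular backups while this runs.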

    Any thoughts/ideas/approaches on an alternative would be appreciated.

    _____________________________________________________________________
    - Nate

    @nate_hughes
  • No way for the user to go through the original table population procedure again?

    Changing to Bulk Logged recovery mode might help with the disk space required, although not sure if that helps with going Heap -> Clustered table.
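
    For what it's worth, building an index (including a clustered index on a heap) is one of the minimally logged operations under bulk-logged recovery, so it does cut the log space needed -- it just doesn't reduce the data space for the new index itself. A sketch, with made-up database/table/column names and backup path:

    -- Minimise logging for the index build, then go back to full recovery.
    ALTER DATABASE BigDb SET RECOVERY BULK_LOGGED;
    GO

    CREATE CLUSTERED INDEX CIX_BigTable_ID
        ON dbo.BigTable (ID)
        WITH (SORT_IN_TEMPDB = OFF);   -- the tempdb drive here is only 115GB
    GO

    ALTER DATABASE BigDb SET RECOVERY FULL;
    GO
    -- Take a log backup afterwards; point-in-time restore isn't possible across the bulk-logged window.
    BACKUP LOG BigDb TO DISK = 'X:\Backups\BigDb_log.trn';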

  • To create a clustered index on a 2.4 TB table, you need 2.4 TB for the old structure (the heap), the same amount of space for the new index, and around 20% of that for sort space, either in the user database or in tempDB. Then you need space to log the operation; if the DB is in full recovery, that's the full size of the index plus around 10%.

    If you need the cluster (and all tables should generally have a clustered index), I'd recommend the 'new table and move' option, but that could still require an extra 2.4 TB or so.
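
    Plugging the numbers from the first post into that breakdown (rounded, and assuming the new clustered index comes out roughly the same size as the heap it replaces):

        old heap, kept until the build commits ........ ~2.4 TB
        new clustered index ........................... ~2.4 TB
        sort space, ~20% of the index ................. ~0.5 TB (the tempdb drive is only 115GB)
        log if the DB is in full recovery, index + 10% . ~2.6 TB

    Against that, there's 0.5 TB unallocated in the database plus 200GB free on the drive, so an in-place CREATE CLUSTERED INDEX is out by a wide margin.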

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gazareth and Gail for the quick response!

    Gazareth (5/11/2012)


    No way for the user to go through the original table population procedure again?

    Nope. The raw data was deleted off this server once he manipulated and loaded it into the table. We could go and collect all the data again for him but we're talking weeks of processing time which he doesn't have.

    GilaMonster (5/11/2012)


    To create a clustered index on a 2.4 TB table, you need 2.4 TB for the old structure (the heap), the same amount of space for the new index, and around 20% of that for sort space, either in the user database or in tempDB. Then you need space to log the operation; if the DB is in full recovery, that's the full size of the index plus around 10%.

    If you need the cluster (and all tables should generally have a clustered index), I'd recommend the 'new table and move' option, but that could still require an extra 2.4 TB or so.

    Thanks for the detailed breakdown; I kind of figured I would need at least the same amount of free space as the existing table.

    Going to recommend the nonclustered index for the short term, and if performance is an issue we'll try migrating the data over to a new table.

    _____________________________________________________________________
    - Nate

    @nate_hughes
  • If you have the storage available for the raw data on the server, then you should be able to BCP out the data from the table, truncate the table, add the clustered index, and BCP the data back in.
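
    A sketch of that sequence, assuming native-format export files can go somewhere with roughly 2.4 TB of room (server, database, table, file path, and key column are all invented here):

    -- 1) Export the heap (run from a command prompt):
    --      bcp BigDb.dbo.BigTable out D:\Staging\BigTable.dat -n -S MYSERVER -T

    -- 2) Empty the table; truncate deallocates the 2.4 TB almost instantly and is minimally logged.
    TRUNCATE TABLE dbo.BigTable;

    -- 3) Build the clustered index while the table is empty -- no sort space or second copy needed.
    CREATE CLUSTERED INDEX CIX_BigTable_ID ON dbo.BigTable (ID);

    -- 4) Reload; -b commits in batches to keep the log in check, TABLOCK cuts locking overhead:
    --      bcp BigDb.dbo.BigTable in D:\Staging\BigTable.dat -n -S MYSERVER -T -b 100000 -h "TABLOCK"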

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • RP_DBA (5/11/2012)


    OK, this landed on my desk this morning: a user created a 2.4TB table without a clustered key. He now wants a clustered key on the table but is unable to do so because he's running out of disk space. We're unable to allocate him more disk space and have cleaned up about all we can clean up.

    Here's what I'm working with:

    SQL Server 2008 R2

    Data Drive: 5.3TB (200GB free space)

    DB unallocated space: .5TB

    Tbl Size: 2.4TB

    Rows: 1,268,636,579

    TempDb Drive: 115GB

    My initial thought was to create a nonclustered index and then recommend he consult with someone before doing something like this in the future. The more likely scenario would be to create a new table with a clustered key and then migrate the data over in small batches.

    Any thoughts/ideas/approaches on an alternative would be appreciated.

    If you have the Enterprise Edition, you might be able to use Partitioned Tables to make the index management a whole lot easier. If you have the Standard Edition, you might be able to pull off a similar miracle with a partitioned view. Both will take a while to build because you'll need to split off "sub tables", which means moving smaller amounts of duplicated data and then deleting that data from the big table, but it can work.
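
    A minimal sketch of the partitioned-table flavour of that, for Enterprise Edition (partition function/scheme, column names, and boundary dates are all invented for illustration):

    -- Range-partition the replacement table by a date column so each "sub table"
    -- (partition) can be loaded as a smaller, separate chunk.
    CREATE PARTITION FUNCTION pf_LoadDate (datetime)
    AS RANGE RIGHT FOR VALUES ('2011-01-01', '2011-07-01', '2012-01-01');
    GO

    CREATE PARTITION SCHEME ps_LoadDate
    AS PARTITION pf_LoadDate ALL TO ([PRIMARY]);   -- or map to separate filegroups if more disks ever show up
    GO

    CREATE TABLE dbo.BigTable_Part
    (
        ID       bigint       NOT NULL,
        LoadDate datetime     NOT NULL,
        Payload  varchar(200) NULL,
        CONSTRAINT PK_BigTable_Part
            PRIMARY KEY CLUSTERED (ID, LoadDate)   -- a unique clustered key on a partition scheme must include the partitioning column
    ) ON ps_LoadDate (LoadDate);
    GO

    -- Then move one date range at a time from the heap into dbo.BigTable_Part
    -- and delete that range from the heap, same as the batch-move idea earlier in the thread.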

    As a side bar, if it's absolutely necessary for this user to have a 2.4 TB database, some additions to infrastructure in the form of additional hard disk space would definitely go a long way. The tricks with partitioning will serve you well for only so long.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
