January 2, 2007 at 11:17 am
Hi folks,
I have a table with 60 millions rows, the table is not well created by another group member, no primary key/index is put there, I don't want to create the table from scratch again, so if I want to add, say, primary key ID, for the table, considering the size of the table, what's the best approach to do this?
Thanks.
January 2, 2007 at 12:19 pm
Creating PK with clustered index is definitely it is offline operation.
You may need to test of dev server how much time it takes to finish...
You can do it creating the new table and create the pk and swap the tables...this is online operation but not 100% online but it will have less impact on users AND requires addtional disk space.
1. Create Table TblNew
2. Insert data into TblNew select * from tbl
3. Create PK on TblNew
4. Rename tbl to tbl_Old
5. Rename tblNew to Tbl
6. Grant neccessary permission to TblNew...
Note: If the Tbl has identity you have make sure ENABLE IDENTITY INSERT....
MohammedU
Microsoft SQL Server MVP
January 2, 2007 at 12:46 pm
I do not believe that you need to have a clustered index. Without a clustered index, the rows are likely physical ordered in the order they were added. (Perhaps, if there were deletes or page splits due to updates, there might have been physical gaps. And if gaps existed when inserting new rows, I do not know if SQL Server would have filled the gaps, or just always added new rows at the end.)
If you are just looking to add a Primary Key, you can do so without physically reordering the table by adding the PRIMARY KEY constraint without clustered index. This will build an index, but not physically change your existing table. (Example is from BOL, modified only to change argument to NONCLUSTERED)
USE AdventureWorks; GO ALTER TABLE Production.TransactionHistoryArchive WITH CHECK ADD CONSTRAINT PK_TransactionHistoryArchive_TransactionID PRIMARY KEY NONCLUSTERED (TransactionID) WITH (FILLFACTOR = 75, ONLINE = ON, PAD_INDEX = ON) GO
You should be able to do this online, even with 60 million rows, since the table itself is not changed. You probably should check to make sure your primary key columns are good before you run this, make sure that column(s) have no null values and are are unique.
Hope this helps
Mark
January 2, 2007 at 1:54 pm
Thank you guys for replying.
I just found that there ARE kind of redundant records for the columns I wanted to create index on, so I guess the only way to do this is to add an extra ID field and make it primary key with clustered/non-clustered index. But again I am worrying about if this might takes long time to do and if there is an efficient way to do it.
Thanks.
January 3, 2007 at 1:42 am
That would be my suggestion too, but I would swap steps 2 and 3
January 3, 2007 at 3:27 am
If you add a clustered index it will require a data sort, a secondary index will not impact the base table. If you take the create another table approach make sure you have the two tables on different disk arrays for best performance.
Note that placing the database into simple recovery mode ( or even bulk logged ) will be better than leaving it in full recovery. It's difficult to make any predictions on performance as it depends upon your server hardware performance - on a powerful server the whole process could be quite quick, also depends upon width of table etc. etc. You might want to consider partitioning.
IF you want a clustered index and take the two table approach then I suggest you batch your transfers for optimum performance. You indicate that you will need a surrogate PK so you'll have to add an identity ( I guess ) to your existing table anyway - then it's just a case of if you want a clustered index.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
January 3, 2007 at 4:52 am
Jurriaan, you should definitely NOT create the index before populating the data. This will simply lead to a very fragmented index right off the bat.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 3, 2007 at 8:26 am
Kevin,
How to know that the index is actually 'fragmented'? and if in fact it is 'fragmented', is there any remedy?
Thanks.
January 3, 2007 at 8:37 am
Since you are a newbie to SQL Server (and appear to have been thrown into the deep end of the pool I will give you two suggestions:
1) Get familiar with SQL Server's Books Online. They are an extraordinary resource. Great documentation and a very good search functionality.
2) Start spending every spare moment you can studying one or more of the many SQL Server administration books out there. There is no substitute for intense study for learning about new software.
Now, to answer your question, use DBCC SHOWCONTIG.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply