GUID as Primary Key - How to lower I/O

Question

GUID as Primary Key - How to lower I/O

Frank Rizzo-261382

SSC Enthusiast

Points: 186
More actions
January 20, 2007 at 1:58 pm

#115903

I posted this in the general section yesterday, but I think it belongs here instead.
I believe that I am having a lot of I/O bottlenecks due to how the data is laid out on the disk when using a GUID as a PK. I am trying to delete data from a number of bloated tables and it takes forever to delete records even when I have the server to myself. The tables range in size from 1M to 40M rows. The smaller (row-wise) tables take less time and the large ones are super s..l..o..w...I don't disable constraints when deleting, although in testing this makes the deletion go much faster. One bottleneck I do know of is that the Data, Log and Indexes are all on the same physical RAID-1 drive. That will get fixed ina few weeks, but I need to focus on trying to tune the DB for less I/O.
Is there a way to reorder data by adding a new column INT IDENTITY and making a clustered index on that column?
I would think that this would order the data sequentially on the drive, making it easier to read and write data. My goal is to only have the last 6 months of data in the production DB and archive the old data to DB on another server.
Thanks,
--Frank

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

dclark SSC Eights! Points: 843 More actions · Answer 1

A likely cause of the problem is that is that you don't have indexes on foreign key columns.

When deleting from a table that is referenced by a foreign key, SQL Server must first verify the the table with the foreign key does not contain values that match the rows being deleted. If there is no index on the foreign key, it forces SQL Server to scan the table.

Frank Rizzo-261382 SSC Enthusiast Points: 186 More actions · Answer 2

Frank Rizzo-261382

SSC Enthusiast

Points: 186

January 21, 2007 at 7:40 am

#684564

All FK have indexes.

--Frank

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 3

A guid makes a lousy clustered index in sql 2000, in sql2005 you have the option of a sequential guid. To improve performance make your guid non clustered and add a sequential numeric clustered key - this will at least stop poor performance due to the randomness of the guid value.

I rate the addition of the guid to sql server and it's adoption as a key value by many many third party apps as one of the worst new features and one of the quickest ways to degrade performance - try an upgrade to 2005.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

jeffrey yao SSCarpal Tunnel Points: 4365 More actions · Answer 4

You can always count on the performance counters to see where and what the bottle neck is.

But one thing in your case is to make sure that your db log is big enough to avoid the expansion if you delete hundreds of thousands of records within one transaction. It may be quicker if you delete them chunk by chunk in this secnario.

Just my 2 cents.

HTH

J.Y

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 5

agreed batched small deletes work so much better. make sure the delete criteria is indexed, you should index for deletes to avoid scans. I know no-one makes much reference to tuning deletes but it is important if it's something you do often.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

Frank Rizzo-261382 SSC Enthusiast Points: 186 More actions · Answer 6

Thanks for the replies. I created a thread a month or so ago on a mass delete I was undertaking. I am still working on it as my maintence windows has always been small. I can only do a little bit at a time.

As for adding of a sequential clustered index, I am glad there is some agreement on this. It also helps doing some defragmentation as it has to rewrite the data based on the clustered index column.

My next question:

I am altering a table with 6M rows to add a INT IDENTITY column. In my dev environment on a SATA RAID-1 array it has been running for 73+ hours. Using PerfMon I have observed the physical drives ping-ponging up and down since it started for both the drive th data is on and the temp db. My goal was to add the column, alter the table and dropping the GUID PK and then add this new sequential columns as the PK. So far this does not seem to promising for a badly fragmented table. I tried this on some other GUID PK tables with 200-300K rows and it worked fast.

What are the best PerMon counters to observer when trying to pinpoint I/O bottlenecks?

Thanks,

--Frank

Frank Rizzo-261382 SSC Enthusiast Points: 186 More actions · Answer 7

FWIW...

Here is an article on the Microsoft site called SQL Server 2005 Waits and Queues http://download.microsoft.com/download/4/7/a/47a548b9-249e-484c-abd7-29f31282b04d/Performance_Tuning_Waits_Queues.doc

I will read it and see if it can help me on my SQL2k issues or at least point me in the right direction.

--Frank

colin.Leversuch-Roberts SSC Guru Points: 52643 More actions · Answer 8

disk io completion time is the main one I use, if the counter is over 10ms ( I guess for a sata drive ) then it is a possible bottleneck. On non sans you can look at disk queue.

I'm not sure the waits will do much for you, I use this extensively but for busy production systems.

I should get a kickback for endlessly recommending this book but ... SQL server 2000 performance tuning reference guide, mspress is a good place to start.

[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/

Frank Rizzo-261382 SSC Enthusiast Points: 186 More actions · Answer 9

Colin,

Thanks for the book recommendation....less that $20 for a like-new harcover copy from a used book vendor on amazon.

--Frank