Eliminate GUIDs as Primary Key / Clustered Indexes

  • Need Help!!!

    I currently have a SQL Server 2000 DB, which we are planing to upgrade to SQL Server 2005. The main issue is that all the tables within this database have Uniqueidentifiers (GUIDs) as the Primary Key and on top of that they are Clustered on GUIDs. Whats the best way out of this NIGHTMARE!!

    Can someone provide me with design solutions??

    Also I have one single table with 600 million rows, also clustered on GUID, I need to move this table from Test server database to Production server db. What is the best way to move this huge table, about 50GB?

    Any suggestions/ solutions are welcome......

    :crazy:

  • I am not sure why you think this is a nightmare. In SQL 2000, a GUID as a primary key was fine, it usually is bad for a clustered index because they are not generated in any sort of order so they fragment the index. In SQL 2005, you can change the default to NewSequentialID() and it will generate them in ascending order so they do not fragement the indexes.

    Other than a GUID being large and difficult to remember, they make pretty good primary and foreign keys. They are unique, unrepeatable, and meaningless - just like a good record key should be. I typically like to use integer identity columns in most cases, but GUID's are far from a major disaster.

    As far as moving the "large" table, I would use SSIS on the destination server and use a SQL Destination component in the data flow. It will be the fastest bulk operator you have to work with.

  • And if you post again, the large font you have used is pretty annoying.

  • Hi SSCrazy,

    Thank you for your response and sorry for my delayed response. Well i agree with you that GUID's are useful in their own way, developers love it, but DBA's like me who need performance and better management of their data, like identity columns, which use less space, easier to remember and manage. But the nightmare comes into picture when you have guids on every table in the DB and cluster them on top of that and then have several other non-clustered indexes which will in turn use the clustered index key to locate the rows in the table. And specially when you have tables that can grown at the rate of more than million rows a month, you can't even imagine the fragmentation that exists and the amount of I/O.

    And yes SQL 2005 should solve most of the problems, but i still think developers should really consider the business need before implementing GUIDs and not use it, unless it absolutely necessary.

  • Sure, I agree with much of that argument, and I tend to not use GUID's unles necessary, but in your case, that is not really the point.

    You have a system with GUID columns already. The major problem they have caused you is solved in SQL 2005. You are already in the process of upgrading. I would tend to leave it. Why do a major redesign without a really pressing need?

    It's one thing to redesign something because it does not meet a business need. It is something completely different to redesign because you don't particularly like a data type on a column that the users never see.

  • That is very true, a complete redesign is unnecessary at this point. Like you mentioned most of the problems will be eliminated or minimized with SQL 2005 upgrade.

    I was planing on doing a side-by-side upgrade, any thoughts on that!

  • The side-by-side upgrade is the way to go. Are you going to use diffeent hardware, or just a second instance?

    Obviously, new hardware is best, but either plan will work. Backup the SQL 2000 databases and restore them on the 2005 server. Then you have to update statistics with FULLSCAN (don't forget to do this). Remember that the 2005 features will not work in a 2000 compatibility mode, so use the upgrade advisor and move yourself to the 2005 compatibility mode during the upgrade if at all possible.

    You will have to do some testing. There are things in 2005 that can end up slower than they were in 2000.

    One other note - don't upgrade to 2005, upgrade to 2008. There is no point in going to 2005. The upgrade process is the same and there is really no additional work required.

  • Once again thank you for your reply.

    I would have loved to upgrade directly to Sql 2008, but my company has already purchased 2005 licenses, hence we are going to stick with it for now.

    To answer your question, I will be using a different but identical hardware. This is a Test Server, which is being rebuilt with Win Server 2003 SP2 and 3 sets or mirrored drives.

    The current production environment is SQL Server 2000 Standard and will be upgraded to SQL 2005 Standard. I will be doing a full back of production DB's and restoring it on the new test server.

    1) Do i first need to install SQL 2000 on the test server or directly install SQL 2005?

    2) At what point should i run the upgrade advisor?

    3) At present we have 4 huge databases in production, and the developers have designed scripts to break down the sql 2000 databases into smaller SQL 2005 databases (around 350 smaller databases), during the upgrade process.

    4) About 300 of the newly created SQL 2005 databases will be used for Read only purposes only, so i plan on putting them on separate Read-only filegroups.

    5) Remaining databases will be on separate read-write filegroups.

    6) since i have 3 sets of mirrored drives, i was planning on putting the OS and SQL on one drive, datafiles on the next set and log files on the third set.

    So thats the plan so far, please feel free to provide any suggestions or provide some useful links!!

    Thanks.

  • vp0401 (11/21/2008)


    Once again thank you for your reply.

    I would have loved to upgrade directly to Sql 2008, but my company has already purchased 2005 licenses, hence we are going to stick with it for now.

    To answer your question, I will be using a different but identical hardware. This is a Test Server, which is being rebuilt with Win Server 2003 SP2 and 3 sets or mirrored drives.

    The current production environment is SQL Server 2000 Standard and will be upgraded to SQL 2005 Standard. I will be doing a full back of production DB's and restoring it on the new test server.

    1) Do i first need to install SQL 2000 on the test server or directly install SQL 2005?

    2) At what point should i run the upgrade advisor?

    3) At present we have 4 huge databases in production, and the developers have designed scripts to break down the sql 2000 databases into smaller SQL 2005 databases (around 350 smaller databases), during the upgrade process.

    4) About 300 of the newly created SQL 2005 databases will be used for Read only purposes only, so i plan on putting them on separate Read-only filegroups.

    5) Remaining databases will be on separate read-write filegroups.

    6) since i have 3 sets of mirrored drives, i was planning on putting the OS and SQL on one drive, datafiles on the next set and log files on the third set.

    So thats the plan so far, please feel free to provide any suggestions or provide some useful links!!

    Thanks.

    I tend to prefer to set up a brand new instance directly in SQL 2005, and simply attach/restore the database objects to this new instance (which will upgrade them).

    Keep in mind that while the "major issue" (which I am assuming is the ridiculous amount of fragmentation this has caused) might be mitigated with NEWSequentialID, your "minor issue" will not be, and isn't so minor (since you're bloating your indexes by quite a bit). I'm not sure I'd be so confident about not tackling it. On databases as big as you're talking about - that will start to really add up.

    Also - don't forget this little tidbit about newSequentialID():

    Creates a GUID that is greater than any GUID previously generated by this function on a specified computer.

    It is based on the network card's MAC address. The problem can arise if you happen to (by having all of the random GUID's in place) already have something that's "high up" in the sequence it wants to use. You may find it runs out of of Guids in that range rather quickly. I quite honestly am not sure what happens when it runs out of those numbers in that designated sequence, so it may be worth testing out.

    Personally - I'd leave the GUID generation alone, but would use something else as the clustered index (identity field for example, to mitigate the fragmentation). And - once things get a little better - start keeping an eye on tuning the fill factor so as to keep splits down.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Ok, so from what i understand, I directly need to install SQL 2005 on the newly rebuilt Test Server and use detach/attach to copy the production databases to SQL 2005 and then change the db compatibility to 90, right!

    Yes i have planned on removing the clustered indexes on the GUIDs and moving it to some other column like date and so on. I might use non-clustered on the GUIDs.

    As i previously mentioned, the 4 large SQL 2000 databases will be split into about 350 smaller databases.

    1) So do i create a new instance in SQL 2005 and run the script there for the new db creation?

    My previous plan was to install SQL 2000 first, restore the production db there. then install sql 2005 and run the script in SQL 2005 to break down the databases.

    What do you think?

  • Hi ,

    Yes, I think it should be in fine. In SQL 2005 when you attach files from 2000, the server automatcally updates your database.

  • Thanks for your reply!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply