December 17, 2014 at 4:48 am
Cause the pk when it's created will already point to the cidx and wont need rebuild/recalculate. Less work on the io system and cpu.
December 17, 2014 at 7:33 am
PiMané (12/17/2014)
Cause the pk when it's created will already point to the cidx and wont need rebuild/recalculate. Less work on the io system and cpu.
I thought all the NC indexes were rebuilt as part of the CIDX rebuild thus ensuring that they are not fragmented, seems that BOL seems to contradict this view, suggesting the NC's be applied AFTER the CIDX.
_________________________________________________________________________
SSC Guide to Posting and Best Practices
December 17, 2014 at 8:27 am
Jason-299789 (12/17/2014)
PiMané (12/17/2014)
Cause the pk when it's created will already point to the cidx and wont need rebuild/recalculate. Less work on the io system and cpu.I thought all the NC indexes were rebuilt as part of the CIDX rebuild thus ensuring that they are not fragmented, seems that BOL seems to contradict this view, suggesting the NC's be applied AFTER the CIDX.
No, it will rebuild them. But that's why he's trying to get the cluster in place, then put the non-clustered pk in, otherwise, that index gets built twice, once when he builds it on what is now a heap and then again when it becomes a cluster.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
December 17, 2014 at 8:29 am
Grant Fritchey (12/17/2014)
No, it will rebuild them. But that's why he's trying to get the cluster in place, then put the non-clustered pk in, otherwise, that index gets built twice, once when he builds it on what is now a heap and then again when it becomes a cluster.
Yep. If you check the IO subsystem on both cases and see the page split you'll know the one to pick 🙂
December 17, 2014 at 2:50 pm
Hey PiMané I would recommend that you take your SW vendor out into the woods for a good spanking for making a clustered index on UNIQUEIDENTIFIER column. You are stuck with your table structure and refactoring the indexes is just about the only plan.
<Rant> Please under stand that the terms "Clustered Index" and "Primary Key" are not the same thing, have never been the same thing, and hopefully never will be the same thing. Just because that's what happens in SSMS when you click on the little key does not mean that it is required or even right </Rant>
So far you have the essence of a plan. The plan is for each affected table one at a time.
Step 0: Backup! - Stop all access to the database and do a backup just for the purpose of these steps.
Step 1: Script out the creation of all the FKs that point to this table.
Step 2: Drop the FKs
Step 3: Drop the clustered index. This renders the table as a heap but that is OK. The data will be just fine.
Step 4: Create a non-clustered index. Keep the name on the index the same as the original. It must have the unique attribute. To keep your software from freaking out nominate this as the primary key.
Step 5: Put back the FKs.
Step 6: TEST! TEST! TEST!
======================
It's bunch of work to be sure. The fine point here are: an FK must point to a unique index on the primary table. It does not have to be the PK or a clustered index. Just unique. I would keep the name the same even if it violates every convention in any book. Some software checks for the presence of indexes by name. Same thing with the PK designation.
For step 4 SSMS can be helpful. You can script out the current clustered index and change the clustered keyword to non-clustered.
Dropping the clustered index will force a rebuild of all the other indexes on the same table. They will be somewhat smaller by the virtue of the fact that the pointer will be the 8 byte RID rather than the 16 byte UNIQUEIDENTIFIER It saves 8 bytes per index entry and with 10K new rows per day that can add up.
The bad news is that even this wont end your fragmentation woes. Your SW generates a new UNIQUEIDENTIFIER for each row and you are stuck with that. Those inserts are going to be all over the place. You could set the fill factor to be stupid low but that would make the index crazy big.
ATBCharles Kincaid
December 17, 2014 at 3:47 pm
You can do the full db backup ahead of time. You can do a differential just before you do the index changes.
Script out all existing non-clustered indexes ahead of time also. Drop all n-c indexes before dropping the PK. Then create the clustered index. Then recreate all n-c indexes, including the PK.
The index sequence is always: drop n-c first, then clus; create clus first, then n-c.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 17, 2014 at 11:15 pm
Scott you are correct. One point though there will be no clustered index at all. They will all be non-clustered.
From http://msdn.microsoft.com/en-us/library/ms189049.aspx we find this:
A foreign key constraint does not have to be linked only to a primary key constraint in another table; it can also be defined to reference the columns of a UNIQUE constraint in another table.
(Emphasis mine.)
I have been through this exercise during the database design for Scout. We needed a design that would support inserts from synchronization of mobile device data. UNIQUEIDENTIFIER initially looked like the simplest solution to collision avoidance. I had to do all the research into the indexing of UNIQUEIDENTIFIER primary keys. We ran into the issues raised by the OP here. We scrapped UI altogether and wrote our own key authority.
ATBCharles Kincaid
December 18, 2014 at 12:36 am
There is no problem having an uniqueidentifier as the pk since it ain't the cidx.
You can have the pk on the uniqueid and have a fillfactor of 70 or less.
The index will still fragmentate but not the cidx that is the most important index since all the others point to it.
And it is better tho have an int identity than heap... Less space on the nc indexes.
December 18, 2014 at 9:48 am
PiMané (12/18/2014)
There is no problem having an uniqueidentifier as the pk since it ain't the cidx.You can have the pk on the uniqueid and have a fillfactor of 70 or less.
The index will still fragmentate but not the cidx that is the most important index since all the others point to it.
And it is better tho have an int identity than heap... Less space on the nc indexes.
Agreed on GUID as PK, that's no problem at all. To reduce page splits, since GUID inserts are always all over the place, start with a low fillfactor, such as 65-75%.
You could cluster on the identity, although most often there's a vastly better column(s) to cluster on for best performance. Look at your current missing index and index usage stats to determine that. Never default a clustered key, always choose it carefully! The clus index is the most critical factor overall for performance.
But if you do decide to cluster on identity, the fillfactor should almost certainly be much higher than 70%, more like 95-99%, unless you do a very large number of UPDATEs that significantly increase the length of varchar columns later.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 19, 2014 at 7:26 am
CELKO (12/19/2014)
Remember that first class on SQL? The fundamental concepts the first week? A key is a subset of attributes of an entity which uniquely identifies it!This is in effect a weird way to get a pointer chain and not a key. This is why we never use exposed locators in RDBMS. But it is easy for lazy non-RDBMS programmers to use to mimic the old object id or a allocate a new node in a pointer chain.
That's why I considered the heap, but the "link" from the NC indexes would be greater than having a int identity...
The PK on this table is huge... It has 4 fields all nvarchar, the max size can be 96 bytes!! This is for sure not a good CIDX specially since it fragments like a charm...
We are doing this to fix the fragmentation problem that leads to performance gain over 60%... We can't touch the SW, just improve the database...
December 19, 2014 at 9:04 am
PiMané (12/19/2014)
CELKO (12/19/2014)
Remember that first class on SQL? The fundamental concepts the first week? A key is a subset of attributes of an entity which uniquely identifies it!This is in effect a weird way to get a pointer chain and not a key. This is why we never use exposed locators in RDBMS. But it is easy for lazy non-RDBMS programmers to use to mimic the old object id or a allocate a new node in a pointer chain.
That's why I considered the heap, but the "link" from the NC indexes would be greater than having a int identity...
The PK on this table is huge... It has 4 fields all nvarchar, the max size can be 96 bytes!! This is for sure not a good CIDX specially since it fragments like a charm...
We are doing this to fix the fragmentation problem that leads to performance gain over 60%... We can't touch the SW, just improve the database...
Again, the PK and the clustered index are not necessarily the same. The perfect PK may be a terrible clustering key and vice versa.
You must look at the missing index and index usage stats -- at a minimum -- for that table to determine the best clustered index. I can provide a query to list those if you need it. Since those stats are kept internally in SQL, they get reset every time SQL is restarted. Because of that, the longer SQL has been continuously up the better and more detailed those stats are.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
December 19, 2014 at 9:16 am
yep. DMV queries... already use it... Thanks 🙂
the problem is that all the possible indexes generate lots of fragmentation and right now the main issue is to reduce page split since the io system is at it's limit...
December 19, 2014 at 9:18 am
PiMané (12/19/2014)
yep. DMV queries... already use it... Thanks 🙂the problem is that all the possible indexes generate lots of fragmentation and right now the main issue is to reduce page split since the io system is at it's limit...
OK, you've obviously got the info you think you need. Good luck!
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 13 posts - 16 through 27 (of 27 total)
You must be logged in to reply to this topic. Login to reply