Index not rebuilding

  • So it's better to leave them like this, non-clustered, than put a clustered index on pk?

    I repeat, they are mainly used for joins, so a clustered index on other columns is not necessary.

    Though I understand two different things : from Grant I understand that is better to put a clustered index, this way non-clustered indexes have smth to rely on, and from Gila that is better to keep it this way...

    As I said before , all my tables have guids as Pk and the join use mainly this columns.

    I guess I will leave them this way....to maintain the logic...

    Unfortunetly the database is prety slow , and I hoped that rebuilding the indexes an removing this heaps will be the best move.

    Thanks again for your help

  • I don't think Gail and I are in disagreement here. You need a clustered index on these tables, however, a primary key of GUID is not the best choice, and is even a bad choice. So, either you use the ordered GUID, which is a little better, but not great, or you find another candidate for the cluster. Regardless, you really should get clustered indexes on the tables. SQL Server functionality in data retrieval is largely designed around the clustered index. Without them, you're hurting your system.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • shnex, you clearly want a solution that involves you in making minimal changes, which makes clustering the GUID an obvious choice for you.

    Gail, Grant, can I ask you why you think a GUID is automatically a bad choice. I'm in complete agreement with you if the fill factor is set to 100%.

    However if the fill factor is 90%, you should be able to increase the number of rows very effectively because the GUID is truly random (I assume) and therefore the inserts should spread across the pages in a very regular manner.

    So if you increase the number of rows by 10% you should now expect each page to be about 99% full with little splitting because of the distribution.

    (This all assumes that the row size allows a reasonable number of rows to be held in each page of course.)

    .

  • I have several issues with GUID's as clustered indexes. First, they're very wide. This means fewer per page and more pages per index, making them somewhat less efficient. That's a minor issue. I've dealt with wider indexes before. The real issue is because the GUID is completely randomly distributed across the index, even with a lower fill factor, it leads to page splitting. That's an expensive operation in itself and, of course, it causes the index to become fragmented much quicker.

    Now, if you use SEQUENTIALID() instead of NEWID(), it offsets alot of the issues, and reduces the random nature of the GUID, but the width is still something of an issue.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Ok man...message received.I will search for other candidates on clustered index, and if I don't find I will put them on guids.That's the situation.Thank you...things are more clear now:-)

  • Grant Fritchey (3/19/2009)


    I have several issues with GUID's as clustered indexes. First, they're very wide. This means fewer per page and more pages per index, making them somewhat less efficient. That's a minor issue. I've dealt with wider indexes before.

    I can't argue with any of that.

    Grant Fritchey (3/19/2009)


    The real issue is because the GUID is completely randomly distributed across the index, even with a lower fill factor, it leads to page splitting. That's an expensive operation in itself and, of course, it causes the index to become fragmented much quicker.

    This doesn't stack up theoretically for me. A truly random insertion would cause less splitting not more providing a sensible fill factor is used. It would also reduce page contention for locks on insert compared to any monotonically increasing key. The only way this would not be true is if NEWID() does not generate a truly random GUID.

    Have you seen the fragmentation effect you describe in practice?

    .

  • Oh yeah. We did a lot of tests to see if we could put GUID to work. It made developers lives easier, but it really seriously impacted the database. I don't have numbers any more, the tests were done a long time ago. I do know that SEQUENTIALID made a huge positive difference.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Now I'm choosing my indexes, so I have another question: I have a bigint column with 90% different values, the duplicate values are rare, but the column is not very used(about 10% of the queries and joins). The fact that the values are most of the time different could be important in choosing the clustered index column? Should I combine the pk column that's a guid with this column and create a clustered index?

  • shnex (3/19/2009)


    Now I'm choosing my indexes, so I have another question: I have a bigint column with 90% different values, the duplicate values are rare, but the column is not very used(about 10% of the queries and joins). The fact that the values are most of the time different could be important in choosing the clustered index column? Should I combine the pk column that's a guid with this column and create a clustered index?

    I'm afraid it's very hard to answer specific questions without knowing what your schema is like.

    I prefer a clustered primary key (which is by definition unique) or a UNIQUE clustered index. Usually you shouldn't choose a value which changes as this will physically move the row when you update it.

    Although a clustered index improves performance when you are scanning based on the value in it (e.g. orders created between 11:00am and 11:30am) on many occasions they are just used to organise the data, so in a retail system Order Code might make a good one.

    There is a hit if they are not declared as UNIQUE in terms of space required and performance as well as making the defragmentation processes take longer and do more logging.

    So you are right to look for columns with lots of different values.

    .

  • Grant Fritchey (3/19/2009)


    Oh yeah. We did a lot of tests to see if we could put GUID to work. It made developers lives easier, but it really seriously impacted the database. I don't have numbers any more, the tests were done a long time ago. I do know that SEQUENTIALID made a huge positive difference.

    OK. I still think my theory is correct. Sounds like NEWID() isn't as random as it might be.

    .

  • So....

    The table contains person information and it looks like this

    column1 uniqueidentifier,

    column 2 bigint,

    column 3 varchar(30),

    ...

    column 12 decimal(18,2)

    from column 3 to 12 there are columns that contain data, int,decimal or char

    column2 is a registration code, that could be the same for persons that are related(same family usualy)

    column1 is reference for some fk and is used in joins.

    there are some sp that use column2 as a filter, or in a join, but they are few.

    So, my question was: puting the clustered index on column2(bigint) is beter than on column1(guid)?

  • Tim Walker (3/19/2009)


    Gail, Grant, can I ask you why you think a GUID is automatically a bad choice.

    Experience.

    I've had a table with a guid cluster that could go from 0% fragmentation to 99.95% fragmentation in under 10 hours, and that was with a fill factor of 75%. I wasn't willing to drop the fill factor any lower because, as it was, the table took over 300 GB of space. (150 million rows). Performance degraded noticeably (on some rather important queries) once the fragmentation reached about 50%.

    My criteria for clustered indexes - narrow, unchanging, unique, ever-increasing. Guids fail miserably on the last point. I don't mind them on slow-changing tables, but on tables with lots of inserts, they tend to be a disaster.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • shnex (3/19/2009)


    So, my question was: puting the clustered index on column2(bigint) is beter than on column1(guid)?

    Maybe. Can you give us the table's actual schema and some of the more common queries on it.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/19/2009)


    My criteria for clustered indexes - narrow, unchanging, unique, ever-increasing.

    That's a good one Gail. Explains everything just in 4 words.:-D

  • Krishna (3/19/2009)


    GilaMonster (3/19/2009)


    My criteria for clustered indexes - narrow, unchanging, unique, ever-increasing.

    That's a good one Gail. Explains everything just in 4 words.:-D

    I can't take credit. Kimberly Tripp[/url]'s been preaching that for years

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 46 through 60 (of 63 total)

You must be logged in to reply to this topic. Login to reply