Clustered Indexes during Updates

  • I am reviewing a critical DB application with extensive application logic in SPs, functions, and views. 

    Three very large tables in the DB (the largest--tens of thousands of records) have clustered indices for PKs that don't follow the natural order in which the records are laid down.  E.g., one table has a PK of (SubOrganizationID, RecordType, RecordSubType, Date).  Every year about 2,500 records are added to this table (and similar for the other two large tables) covering every combination of the RecordType, etc., for each SubOrganization for every date in the coming year. 

    On a daily basis thousands of records are updated.  PKs never change but attribute data is updated--PKs are used in the WHERE clause of the update statements.

    When an update takes place on a clustered index table such as this does the delete/insert sequence cause a reshuffle of the table or does the insert go right into the space vacated by the prior version with no appreciable page rebuilding?

    A secondary question is does anyone see any advantage to using a clustered index in this situation anyway?  My practice has been to use clustered indices (usually) when the records are laid down in the order the index will be in anyway--Identity Attribute PK, datetime PK, etc.

  • Good question which will open a veritable can of worms!!! ( and probably get the most absurd suggestions ) OK , tens of thousands does not make a large table , hundreds or thousands of millions make long tables - the width has a major part to play when looking at table size. ( but one person's large table is another's small - it's a matter of scale )

    If you update the values in a column which is part of a clustered PK then the update will always be deferred, e.g. a new entry will be created and the old row deleted. A deferred update also takes place if the updated row is larger than the existing row and there isn't sufficient free space in the page. However, that said " It Depends " if your row is greater than 50% of a page you'll only have one row per page anyway - the number of rows in pages makes a big difference to how all this works, and you can pad in an attempt to lessen the effect - subject to the PK effect. Oh and an update trigger will always force a deferred update.

    You can use a program like lumigent's log explorer to examine how amny deferred updates you get. It's a very useful exercise - but only if you're concerned about things.

    You mention you're reviewing things - so important point do you have problems? if not then leave things well alone or migrate to a stress environment for testing - don't make changes because you think you need to to justify your review - I've seen these types of changes have very bad results.

    Finally clustered indexes - again it just depends - technically a clustered index is in place to give the best performance for a particular range or individual query. Of course i've met many a "DBA" (? ) who do not realise that PK's don't have to be clustered - these are normally what I call "GUI cowboys" - no particular offence intended - I'd say over 75% of job applicants I've interviewed have failed this question - when you look at their work history it sometimes can be worrying to imagine what sort of databases are in place where they worked !!

    I've leave the question for others - my advice get on a training course by Kimberley Tripp or Kalen Delaney - then you'll start to understand how complex this seemingly simple question is < grin >

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • The clustered PK you mention (SubOrganizationID, RecordType, RecordSubType, Date) provides an excellent boost to get, say, all records for a particular SubOrganization. For getting any other large collection of records, say by date range, it is a hindrance.

    So it depends on the nature of your query mix - if nearly every query is for a particular specified SubOrganization - great! Otherwise, maybe not so great.

    Clustered indexing often is a fretfully complicated compromise. Some Oracle programmers believe the very concept is idiotic...

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply