Online Index rebuild behaviour

  • Hi all,

    I had an issue over the weekend when our online reindex job took forever to rebuild an index online. It caused a long locking chain.

    Here is the definition of the index being rebuilt

    CREATE UNIQUE NONCLUSTERED INDEX [index1] ON [dbo].[table1]

    (

    [Rent Agreement No_] ASC,

    [Rent Schedule Type] ASC,

    [Rent Schedule No_] ASC,

    [Line No_] ASC

    )

    Earlier that day, I had noticed the following query doing an awful lot of logical reads and taking longer than normal.

    SELECT *

    FROM "dbo"."table1"

    WITH

    (

    READUNCOMMITTED

    )

    WHERE (("Rent Agreement No_"=@P1))

    AND "Rent Agreement No_"=@P2

    AND "Rent Schedule Type"=@P3

    AND "Rent Schedule No_"<@P4

    ORDER BY "Rent Agreement No_" DESC,

    "Rent Schedule Type" DESC,

    "Rent Schedule No_" DESC,

    "Line No_" DESC

    So in my infinite wisdom I created another index, with the same definition of index1, but with the order descending rather than ascending. see below for definition.

    CREATE UNIQUE NONCLUSTERED INDEX [index2] ON [dbo].[table1]

    (

    [Rent Agreement No_] DESC,

    [Rent Schedule Type] DESC,

    [Rent Schedule No_] DESC,

    [Line No_] DESC

    )

    This sped up the query and reduced the reads, so I was happy. Until the next day. In the morning when I checked, I found that the rebuild job had spent the whole night trying to rebuild index1, and locking a large number of other things as the night went on. When I dropped index2, index1 rebuilt in seconds.

    [Rent schedule no_] and [Line no_] also make up the PK for this table, in ascending order.

    The columns in index1 and index2 are the same, but the order is different. Can anyone explain this long running rebuild behaviour?

    The table is relatively small, 258mb, 537k rows, with the normal rebuild of index1 online taking just a couple of seconds.

    Needless to say I have dropped index2, but I need to understand what went wrong! Any help would be much appreciated.

  • Hard to say for sure without being able to investigate...

    Online index rebuilds aren't online. They're mostly online. They need to take schema locks at two points, right at the beginning and right at the end. If there's heavy concurrent activity, the rebuilds will have to wait to acquire those locks, and anything wanting other locks on the table (including schema-stability) will have to wait behind them. One long running query and lots of concurrent activity and you could easily have a blocking chain with the 'mostly online' index rebuild as the first waiting task.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks GilaMonster, That's what I though, that something was blocking the online rebuild, but on investigation this was not the case. The index rebuild was at the top of the locking chain. We use confio ignite monitoring software, and this allows me to view locking chains.

    The top of the chain is the index rebuild job, and the relevant step. You can go into the query details, and this is the query being ran,

    INSERT [dbo].[table1]

    SELECT *

    FROM [dbo].[table1]

    I use ola.hallengren.com's index rebuild script, and from the logs I can see the statement the job ran was

    ALTER INDEX [index1] ON [dbo].[table1] REBUILD WITH (SORT_IN_TEMPDB = OFF, ONLINE = ON, FILLFACTOR = 85)

    I could even replicate it by running this statement the following day. The only way I could get the rebuild to succeed was by dropping the new index, index2. the definition of which can be seen in the first post.

  • Then no idea. Would have to repo and diagnose live to be able to say anything useful.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Well I have managed to reproduce in my test database, and can see the rebuild of index1 is waiting on LCK_M_IX. The details of the wait resource is a pagelock, with a different page each time I cancel and rerun. If I drop the index these page locks don't occur. So the rebuild of index1 is waiting for a lock on a page in index2.

    I am using DBCC IND ('DBNAME',TABLE1,-1) to interrogate the database for the pages it contains.

    Interestingly when I do DBCC PAGE on the page number that the lock is waiting on i get the following error.

    --

    A DBCC PAGE error has occurred: Invalid page metadata - dump style 3 not possible.

    --

    Looks a bit dodgy, though DBCC CHECKDB does not show any corruption or errors.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply