Indexed View / Creating unique clustered index: Slow, consuming massive disk space

  • Background: Using SQL Server 2005, SP2. Problem is occuring on a development database with no user traffic or concurrency related issues (including replication).

    I have successfully established an indexed view, each row of which will have a uniqueidentifier from a base table which will serve as the unique clustered index.

    Once the view is created, and in order to establish other indices, I am first trying to create the required unique clustered index.

    The problem is that this query is taking well over an hour (we end up cancelling the query) and it eventually consumes the larger part of a 60 GB drive. There are only 140,000 rows in the base table upon which the index is created ( no cross, full or outer joins present ). The database is reasonably small to begin with.

    Here is the DDL which we use to create the view successfully:

    ================================================

    --Set the options to support indexed views.

    SET NUMERIC_ROUNDABORT OFF

    GO

    SET ANSI_PADDING,ANSI_WARNINGS,CONCAT_NULL_YIELDS_NULL,ARITHABORT,QUOTED_IDENTIFIER,ANSI_NULLS ON

    GO

    if exists ( select * from information_schema.Views where table_name='ivTreatmentProcedure' )

    drop view ivTreatmentProcedure

    go

    create view [dbo].ivTreatmentProcedure

    with schemabinding

    as

    selecttvp.txVisitProcID,

    tvp.txVisitProcDirMin,

    tvp.procCodeID,

    tv.txVisitID,

    t.txID,

    t.txDt,

    t.txLockInd,

    ds.facilityID,

    ds.discStayID,

    ds.disciplineMLID,

    'disciplineShortDescTxt' = dml.masterLookupShortDescTxt,

    'disciplineLongDescTxt' = dml.masterLookupDescTxt,

    'disciplineSeqNum' = dml.masterLookupSeqNum,

    ds.discStayAdmitDt,

    ds.discStayDischargeDt

    fromdbo.dwTreatment tvp join dbo.TreatmentVisit tv

    ontv.txVisitID = tv.txVisitID join dbo.Treatment t

    ont.txID = tv.txID join dbo.DisciplineStay ds

    ont.discStayID = ds.discStayID join dbo.MasterLookup dml

    ondml.masterLookupID = ds.disciplineMLID

    go

    -- Notes: xxxID columns are generally uniqueidentifiers in the above base tables. The from clause

    -- queries only base tables itself.

    -- This CREATE VIEW statement works very quickly. Here is the query which takes a long time and

    -- becomes a disk hog:

    create unique clustered index PK_ivTreatmentProcedure on ivTreatmentProcedure (txVisitProcID)

    go

    -- Note: txVisitProcID is a uniqueidentifier and is the primary key of the base table which should have the same cardinality as the view (140,000 rows)

    ================================================

    Any help would be appreciated. What could be causing the slow performance and database bloat?

    This CREATE UNIQUE CLUSTERED INDEX runs for over an hour and I don't see why it should be so slow and consume so much space. We are really in an isolated DBA-access only development database here.

  • "Must look eye"...

    See anything wrong with the following?

    from dbo.dwTreatment tvp join dbo.TreatmentVisit tv

    on tv.txVisitID = tv.txVisitID

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/16/2008)


    See anything wrong with the following?

    from dbo.dwTreatment tvp join dbo.TreatmentVisit tv

    on tv.txVisitID = tv.txVisitID

    Well there's your problem!

    Hmm, I think that that would have a cardinality of at least 20,000,000,000 (probably more). Times 16 bytes = 320GB, plus index overhead, etc., well you get the idea.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I have successfully established an indexed view, each row of which will have a uniqueidentifier from a base table which will serve as the unique clustered index.

    Note that you have only created a view so far (which is why it was quick) ... it only becomes an indexed view once you have added the clustered index (which materialises the data, hence takes some time to build).

  • DOH!

    That was an expensive typo...

    Thanks for the extra pair of eyes folks.

  • Jeff Moden (4/16/2008)


    "Must look eye"...

    See anything wrong with the following?

    from dbo.dwTreatment tvp join dbo.TreatmentVisit tv

    on tv.txVisitID = tv.txVisitID

    hehe....I keep getting that visual of Jeff dressed up as Pat Morita....:D

    If you start doing Crane kicks...I'm outta here:hehe:

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (4/17/2008)


    If you start doing Crane kicks...I'm outta here:hehe:

    Ya gotta be "stumped" to practice those...:hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff that was a nice catch :w00t:

    "Keep Trying"

  • Thank you for the compliment, Chirag... but you're giving me too much credit. The key phrase was "consumes the larger part of a 60 GB drive. There are only 140,000 rows "... that means at least a partial Cartesian join had to be present. The rest was just proof reading the joins. The "self join" I found simply meant that another table was missing a join and created a full Cartesian join as a result.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply