Why have a heap?

  • In the area of SQL indexes, a table with no idexes is called a heap. I liken this to a literal heap of library books. If they are not arranged in any order and you wanted to find how many books by a particular author there were, you'd have to go through the entire heap and look at each book. If the books were on the shelf sorted by author (a clustered index), or if you had a card catalog (non-clustered index) your search would be much faster.

    So reading about heaps I've wondered why one would even have a heap? I suppose if a table would never have more than a dozen records or so, then full table scans would not create a performance issue and you could get away with having no indexes. Is there any other circumstance where one should deliberately design a heap?

  • A heap is actually a table without a clustered index (i.e. order of the data itself). You can have non-clustered indexes on a heap. Sometimes it is more hassle to manage a clustered index on a table that does not need it. I have heard people using heaps for things like logs. Not much more I can offer you here. I would think that basically any table that does not have a logical order to the data could function well as a heap. i.e. employee table where empid is not sequential, but randomly generated and no createddate or modifieddate columns.

    Jared
    CE - Microsoft

  • Except for temp tables or tables that are truncated and used for staging tables, I can think of no other reason why a table should be a heap.

    Without a nice narrow UNIQUE clustered index, any non-clustered index you have on the table will have an internal row number added to the table. That row number is comparatively quite large compared to a nice narrow int. IIRC, there are also some additional bytes that will be added to each non-clustered index row if you don't have a clustered UNIQUE index.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Yup. Without a cluster, all nonclustered indexes get the RID (8 bytes). With a non-unique clustered index, the duplicate rows in the clustered index get a 4-byte uniquifier (yes, that is what it's called) and hence so do any nonclustered indexes

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Out of curiosity, are inserts "cheaper" in terms of time for a heap versus a clustered index?

    To extend the OP's metaphor, it's real easy to toss another book on top of the heap, but if I have to put in order I have to find where it fits on the shelf and it will take me longer because I have to do a search as well.

    Sorry to hijack 😀

  • Jim_K (7/11/2012)


    Out of curiosity, are inserts "cheaper" in terms of time for a heap versus a clustered index?

    Maybe a little, but I doubt it. SQL is optimised for tables with clustered indexes.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Jim_K (7/11/2012)


    Out of curiosity, are inserts "cheaper" in terms of time for a heap versus a clustered index?

    To extend the OP's metaphor, it's real easy to toss another book on top of the heap, but if I have to put in order I have to find where it fits on the shelf and it will take me longer because I have to do a search as well.

    Sorry to hijack 😀

    Heaps have one row in sys.partitions, with index_id = 0 for each partition used by the heap. By default, a heap has a single partition. When a heap has multiple partitions, each partition has a heap structure that contains the data for that specific partition. For example, if a heap has four partitions, there are four heap structures; one in each partition.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply