Indexing for count?

  • I am trying to find the best way to speed up certain counts from a table. Is there a good way to index a table to assist with that?

    I am looking both for at queries like :

    select col1, count(*)

    from tablename

    group by col1

    and even a straight:

    select count(*)

    from tablename

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • For a clustered index, the row count should be stored in a system table, so that should be quick.

    For the first, if you had an index on the GROUP BY column, that should speed things up.

  • Steve Jones - Editor (12/13/2008)


    For a clustered index, the row count should be stored in a system table, so that should be quick.

    For the first, if you had an index on the GROUP BY column, that should speed things up.

    Thanks, Steve.

    Putting a clustered index on it was my first thought, and of course was done after the initial ETL to populate was finished. But on the table in question (Roughly 14 million rows), a simple "select count(*) from TheTable" tables takes roughly 90 seconds.

    I also indexed the group by column before initially posting, and it definitely helps, but I was wondering if there was something that could specifically aid with counts, since we call up the counts very frequently.

    Is there any index type or options that would be specifically applicable to aggegates?

    One option I have considered is using indexed views, but this would require developers to change their code and adds certain other complications, especially since this is using Standard instead of Enerprise edition, so I am trying to explore other options.

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • Why dont use DTA tool and apply the recommendations from there? I am just wondering.....

  • timothyawiseman (12/13/2008)


    I also indexed the group by column before initially posting, and it definitely helps, but I was wondering if there was something that could specifically aid with counts, since we call up the counts very frequently.

    Not counts specifically. A count (*) (which is the fastest count there is) will be calculated by scanning the smallest index on the table (ie the one with the lowest page count). So if you have any nonclustered index on the table, the count's probably running pretty close to the best it can be.

    A clustered index is not good for a count, because scanning the cluster is the same as a table scan

    Is there any index type or options that would be specifically applicable to aggegates?

    Typically for a aggregate, you want an index with the group by columns in the index key (in any order) and the columns referenced in the aggregates as the include columns

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (12/14/2008)


    timothyawiseman (12/13/2008)


    I also indexed the group by column before initially posting, and it definitely helps, but I was wondering if there was something that could specifically aid with counts, since we call up the counts very frequently.

    Not counts specifically. A count (*) (which is the fastest count there is) will be calculated by scanning the smallest index on the table (ie the one with the lowest page count). So if you have any nonclustered index on the table, the count's probably running pretty close to the best it can be.

    A clustered index is not good for a count, because scanning the cluster is the same as a table scan

    That helps a lot. There is currently only a clustered index on the primary key since the primary key will be used for virtually all searches on this table. I will try adding a nonclustered index as well. Thank you.

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • If you can, create a NC index to support the other searches on the table. If that's not possible, or you want a NC just for the count, then put it on the smallest column in the table (smallint, char(1), tinyint, bit, ...) That's assuming you're always doing count(*) and not count with a column specified.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Krishna_DBA (12/13/2008)


    Why dont use DTA tool and apply the recommendations from there? I am just wondering.....

    Primarily because it's unreliable. Your time is better spent learning how to do indexing yourself because, while the DTA can get some of the simple stuff, overall, it's a pretty poor performer.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • If your table has a clustered index and reasonably updated statistics you can query the system table directly:

    SELECT rows

    FROM sysindexes

    WHERE id = OBJECT_ID('MyTable') AND indid < 2

    I wouldn't put that query in production as the returned value is approximate and won't necessarily match COUNT(*). But I've found it to be accurate enough for debugging/analysis purposes as it returns almost immediately.

  • timothyawiseman (12/13/2008)


    I am trying to find the best way to speed up certain counts from a table. Is there a good way to index a table to assist with that?

    I am looking both for at queries like :

    select col1, count(*)

    from tablename

    group by col1

    and even a straight:

    select count(*)

    from tablename

    You can also look into indexed views for the "GROUP BY" aggregation.


    * Noel

  • Thanks everyone. I prefer the indexed view option as that returns the best results I've seen yet, but it causes some issues for the developpers.

    The nonclustered index Gail suggested sped things up quite a bit, and seems to have resolved the issue for now.

    And thanks for pointing out the rows in sysindexes.

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply