Count on table with million data too slow

  • The clustered index cannot be used for a seek because of the the column [sys_service_id] (which is not included in your SQL statement).

    To get a seek, you can either reverse the order of the clustered index as ID, GPSTime, [sys_service_id] (which can have a serious impact on other queries!), or you can create a new non-clustered index on ID and GPSTime.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Lowell (3/18/2015)


    GilaMonster (3/18/2015)


    Lowell (3/18/2015)


    also, why not count(*)? it would be faster, i would think.

    In this case it's unlikely to be faster as the column in the count is already in the where clause. Since null values can't be returned by the where clause predicate, the optimiser will likely be treating it as a count(*)

    that's where i'm a little weak Gail; i of course saw that the column being counted is the column being filtered int he WHERE, so i'd think nulls would be excluded, but i wasn't sure if the optimizer would shortcut the logic or not.

    I know you said *likely*, but is it really a given that it would do that , assuming an index on that column?

    I'd have to look at the execution plan to be sure, but even if it did add a NOT NULL check into the where clause it would be a redundant one.

    Count(column) is slower when column isn't already been filtered on and SQL has to add a column IS NOT NULL predicate into the seek/scan which may require an index which doesn't exits or a key lookup, etc

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Koen Verbeeck (3/18/2015)


    you can create a new non-clustered index on ID and GPSTime.

    This, assuming there aren't any other indexes which we haven't been shown

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • appdev13 (3/18/2015)


    I am having a table with over 120 million rows. I am executing a count query and it is taking about 4 mins to execute. Every 1 second 100 rows are being inserted into the table also with current time value in gpstime field.

    indexes are created on id and gpstime fields.

    select count(id)

    from tablename

    where gpstime between A and B

    and id=123;

    The query is a little strange since with a unique index on id, you will either get one row returned with a value of 1 for COUNT(id), or no rows returned.

    You don't have a nonclustered index containing gpstime, hence the clustered index scan.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (3/18/2015)


    appdev13 (3/18/2015)


    I am having a table with over 120 million rows. I am executing a count query and it is taking about 4 mins to execute. Every 1 second 100 rows are being inserted into the table also with current time value in gpstime field.

    indexes are created on id and gpstime fields.

    select count(id)

    from tablename

    where gpstime between A and B

    and id=123;

    The query is a little strange since with a unique index on id, you will either get one row returned with a value of 1 for COUNT(id), or no rows returned.

    You don't have a nonclustered index containing gpstime, hence the clustered index scan.

    Indeed. If ID is unique, why isn't there an index seek?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (3/18/2015)


    ChrisM@Work (3/18/2015)


    appdev13 (3/18/2015)


    I am having a table with over 120 million rows. I am executing a count query and it is taking about 4 mins to execute. Every 1 second 100 rows are being inserted into the table also with current time value in gpstime field.

    indexes are created on id and gpstime fields.

    select count(id)

    from tablename

    where gpstime between A and B

    and id=123;

    The query is a little strange since with a unique index on id, you will either get one row returned with a value of 1 for COUNT(id), or no rows returned.

    You don't have a nonclustered index containing gpstime, hence the clustered index scan.

    Indeed. If ID is unique, why isn't there an index seek?

    Because the query doesn't filter on ID. It filters on sys_service_id and gps_date, neither of which are leading keys (look at the query the execution plan shows, it's different from the one initially posted)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/18/2015)


    (look at the query the execution plan shows, it's different from the one initially posted)

    *sigh*

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • try this

    select count(id)

    from tablename WITH(nolock)

    where gpstime between A and B

    and id=123;

  • GilaMonster (3/18/2015)


    Koen Verbeeck (3/18/2015)


    you can create a new non-clustered index on ID and GPSTime.

    This, assuming there aren't any other indexes which we haven't been shown

    No, better to just add GPSTime to the existing non-clustered index on ID. It can still be a UNIQUE index as teh current one is, adding a key column makes no difference to that.

    edit: But as it apparently isn't the ID column after all, I haven't a clue

    Tom

  • My question would be, why does one need to know the count of rows of this table to begin with? Since someone is adding a hundred rows per second to the table to begin with, it's not like it's going to be accurate for very long.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply