Count on table with million data too slow

Question

Count on table with million data too slow

Viewing 10 posts - 16 through 24 (of 24 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259135 More actions · Answer 1

The clustered index cannot be used for a seek because of the the column [sys_service_id] (which is not included in your SQL statement).

To get a seek, you can either reverse the order of the clustered index as ID, GPSTime, [sys_service_id] (which can have a serious impact on other queries!), or you can create a new non-clustered index on ID and GPSTime.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

Lowell (3/18/2015)
GilaMonster (3/18/2015)
Lowell (3/18/2015)
also, why not count(*)? it would be faster, i would think.
In this case it's unlikely to be faster as the column in the count is already in the where clause. Since null values can't be returned by the where clause predicate, the optimiser will likely be treating it as a count(*)
that's where i'm a little weak Gail; i of course saw that the column being counted is the column being filtered int he WHERE, so i'd think nulls would be excluded, but i wasn't sure if the optimizer would shortcut the logic or not.
I know you said *likely*, but is it really a given that it would do that , assuming an index on that column?

I'd have to look at the execution plan to be sure, but even if it did add a NOT NULL check into the where clause it would be a redundant one.

Count(column) is slower when column isn't already been filtered on and SQL has to add a column IS NOT NULL predicate into the seek/scan which may require an index which doesn't exits or a key lookup, etc

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Koen Verbeeck (3/18/2015)
you can create a new non-clustered index on ID and GPSTime.

This, assuming there aren't any other indexes which we haven't been shown

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

appdev13 (3/18/2015)
I am having a table with over 120 million rows. I am executing a count query and it is taking about 4 mins to execute. Every 1 second 100 rows are being inserted into the table also with current time value in gpstime field.
indexes are created on id and gpstime fields.
select count(id)
from tablename
where gpstime between A and B
and id=123;

The query is a little strange since with a unique index on id, you will either get one row returned with a value of 1 for COUNT(id), or no rows returned.

You don't have a nonclustered index containing gpstime, hence the clustered index scan.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Koen Verbeeck SSC Guru Points: 259135 More actions · Answer 5

ChrisM@Work (3/18/2015)
appdev13 (3/18/2015)
I am having a table with over 120 million rows. I am executing a count query and it is taking about 4 mins to execute. Every 1 second 100 rows are being inserted into the table also with current time value in gpstime field.
indexes are created on id and gpstime fields.
select count(id)
from tablename
where gpstime between A and B
and id=123;
The query is a little strange since with a unique index on id, you will either get one row returned with a value of 1 for COUNT(id), or no rows returned.
You don't have a nonclustered index containing gpstime, hence the clustered index scan.

Indeed. If ID is unique, why isn't there an index seek?

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

Koen Verbeeck (3/18/2015)
ChrisM@Work (3/18/2015)
appdev13 (3/18/2015)
I am having a table with over 120 million rows. I am executing a count query and it is taking about 4 mins to execute. Every 1 second 100 rows are being inserted into the table also with current time value in gpstime field.
indexes are created on id and gpstime fields.
select count(id)
from tablename
where gpstime between A and B
and id=123;
The query is a little strange since with a unique index on id, you will either get one row returned with a value of 1 for COUNT(id), or no rows returned.
You don't have a nonclustered index containing gpstime, hence the clustered index scan.
Indeed. If ID is unique, why isn't there an index seek?

Because the query doesn't filter on ID. It filters on sys_service_id and gps_date, neither of which are leading keys (look at the query the execution plan shows, it's different from the one initially posted)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Koen Verbeeck SSC Guru Points: 259135 More actions · Answer 7

GilaMonster (3/18/2015)
(look at the query the execution plan shows, it's different from the one initially posted)

*sigh*

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

shreya.puneeth Valued Member Points: 64 More actions · Answer 8

try this

select count(id)

from tablename WITH(nolock)

where gpstime between A and B

and id=123;

TomThomson SSC Guru Points: 104773 More actions · Answer 9

GilaMonster (3/18/2015)
Koen Verbeeck (3/18/2015)
you can create a new non-clustered index on ID and GPSTime.
This, assuming there aren't any other indexes which we haven't been shown

No, better to just add GPSTime to the existing non-clustered index on ID. It can still be a UNIQUE index as teh current one is, adding a key column makes no difference to that.

edit: But as it apparently isn't the ID column after all, I haven't a clue

Tom

Jeff Moden SSC Guru Points: 1004441 More actions · Answer 10

My question would be, why does one need to know the count of rows of this table to begin with? Since someone is adding a hundred rows per second to the table to begin with, it's not like it's going to be accurate for very long.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)