count(*) giving wrong results

Question

count(*) giving wrong results

auaiomrn

Ten Centuries

Points: 1300
More actions
February 11, 2008 at 3:15 pm

#119968

Hello, I had a strange problem and am wondering if anyone can tell me what could have caused it. Running these queries on a certain table in SQL Server 2000 gave the following results:
select count(*) from Table1
-> about 19 million
select count(*) from Table1 where DateColumn < 'Jan 31, 2008'
-> about 10.5 million
select count(*) from Table1 where DateColumn >= 'Jan 31, 2008'
-> about 10.5 million
(there are actually only 2 different dates in DateColumn, one from the beginning of January and one from the beginning February)
note: DateColumn is part of the primary key
After reindexing the table the first query returned the correct result of about 21 million. Does anyone know what the problem could have been and how it could have happened?
Thanks

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

thecosmictrickster@gmail.com SSChampion Points: 10386 More actions · Answer 1

Your statistics were probably out of date. Running the UPDATE STATISTICS command would've been a better command to use (first) rather than rebuilding the index.

You should regularly update statistics on your indexes. The optimizer relies on statistics to determine whether or not to use an index when building a query plan. If the statistics are out of date, it can lead to the optimizer ignoring the index when it should be using it.

Scott Duncan

MARCUS. Why dost thou laugh? It fits not with this hour.
TITUS. Why, I have not another tear to shed;
--Titus Andronicus, William Shakespeare

SQLBill SSC Guru Points: 51440 More actions · Answer 2

This COULD be the issue:

From the BOL:

COUNT(*) returns the number of rows in a specified table without eliminating duplicates. It counts each row separately, including rows that contain null values.

That means if you have one DATETIME that is NULL, it will be counted in both counts.

Try:

SELECT COUNT(*) FROM Table1

WHERE DATECOLUMN IS NULL

If the count returns more than 0, that is your issue.

-SQLBill

auaiomrn Ten Centuries Points: 1300 More actions · Answer 3

Thanks for the responses. I forgot to mention but I did try updating the stats before reindexing, but it didn't help. Also I just checked and there are no nulls.

I did just finish running dbcc checkdb on a copy of the database from before the "fix", and I got a whole bunch of "Missing or invalid key in index" errors for two of the indexes on that table. I guess somehow the indexes got corrupted, now I can try and find out why.

Bob Fazio SSChampion Points: 10794 More actions · Answer 4

SQLBill (2/11/2008)
That means if you have one DATETIME that is NULL, it will be counted in both counts.
-SQLBill

The OP stated that it was part of the PK (Nulls not allowed).

Actually I would expect that the index was in some way corrupted giving bad results.

I have also seen strange results with smalldates

Bob Fazio SSChampion Points: 10794 More actions · Answer 5

auaiomrn (2/12/2008)
I did just finish running dbcc checkdb on a copy of the database from before the "fix", and I got a whole bunch of "Missing or invalid key in index" errors for two of the indexes on that table. I guess somehow the indexes got corrupted, now I can try and find out why.

Didn't catch this before I replied, but I can guess that confirms my suspicion

dejava SSC Veteran Points: 249 More actions · Answer 6

If there are nulls, the first query should give result more than the other 2 queries combine because nulls should not be counted in the 2 queries.

Try use DATEDIFF:

select count(*) from Table1 where DATEDIFF(dd,DateColumn,'Jan 31, 2008')<0

select count(*) from Table1 where DATEDIFF(dd,DateColumn,'Jan 31, 2008')>=0

I am afraid that your SQL sees them as strings, not real date.

The other possibility might be some of dates have values older than the oldest date posible. I am not sure the number, but it might be only see them as 2 digit years.

Bob Fazio SSChampion Points: 10794 More actions · Answer 7

nugroho (2/13/2008)
If there are nulls, the first query should give result more than the other 2 queries combine because nulls should not be counted in the 2 queries.

It was part of the PK. PK's cannot be NULL. So that couldn't have been the issue.

Try use DATEDIFF:
select count(*) from Table1 where DATEDIFF(dd,DateColumn,'Jan 31, 2008')<0
select count(*) from Table1 where DATEDIFF(dd,DateColumn,'Jan 31, 2008')>=0
I am afraid that your SQL sees them as strings, not real date.

This is a performance nightmare. You just forced a full scan of the table that has the perfect index.

I agree that with your idea. Just doing 'Jan 31, 2008' will do an implicit conversion to a "SMALLDATETIME" since there is no time in the string. This could cause issues, but since a rebuild fixed the count, that wouldn't have been the issue either.

select count(*) from Table1 where DateColumn < CAST('Jan 31, 2008' as DATETIME)

or even better

select count(*) from Table1 where DateColumn < CONVERT(DATETIME,'Jan 31, 2008',107)

FYI, If it can be avoided, NEVER wrap an indexed column with a function