July 1, 2004 at 10:57 am
I've heard that Count(*) is a bad practice...
Can anyone tell me why?
Thanks
July 1, 2004 at 11:13 am
You should use SELECT COUNT(some_column) instead. If you use the *-sign you get much more I/O.
July 1, 2004 at 11:51 am
SQL Server has to count each and every row when you use COUNT(*). But the same data is kept elsewhere.
Test this:
SELECT rowcnt
FROM sysindexes
WHERE id = (SELECT id
FROM sysobjects
WHERE name = 'yourtablename')
Obviously, replace yourtablename with the name of the table you want the row count from. Run it in the database that has the table you want to check and run DBCC UPDATEUSAGE ('yourdbname', 'yourtablename') first - that will update sysindexes.
-SQLBill
July 1, 2004 at 12:05 pm
Oh, forgot about that one. That's a better way of doing it if you just want to count all the rows i the table. But if you need a WHERE clause as well...
July 1, 2004 at 12:06 pm
OOOPPPS ... I left out something important.
It should have been:
SELECT rowcnt
FROM sysindexes
WHERE id = (SELECT id
FROM sysobjects
WHERE name = 'yourtablename')
AND indid < 2
I ran the above script against one of my huge tables and it returned: 87274758 in less than one second. Running SELECT COUNT(*) against the same table gave the same result but it took longer (1 minute 1 second).
Argument against COUNT(*)....takes a while.
Argument for COUNT(*)....it will give the count of the rows at the moment it is run.
Argument for the rowcnt version...very quick
Argument against the rowcnt version...it is not up-to-the-second accurate. If your database is constantly getting new rows, this is only as accurate as the last DBCC UPDATEUSAGE.
Bottom line: What do you need to know?
The amount of rows at this second?
-use COUNT(*)
The amount of rows and there's not a lot of rows being inserted?
-use DBCC UPDATEUSAGE followed by the rowcnt script.
-SQLBill
July 1, 2004 at 12:25 pm
Why does SELECT COUNT(*) take more I/O than SELECT COUNT(some_column) ? One would think it still has to step through each row to count.
July 1, 2004 at 12:28 pm
No idea. I just know because I asked the same question about a year ago and someone smarter told me that.
Damn, I never get the chance to shine
July 2, 2004 at 2:04 am
Also, depending on ANSI settings, a COUNT(*) can cause NULL entries to be excluded from the COUNT.
--------------------
Colt 45 - the original point and click interface
July 2, 2004 at 3:54 am
COUNT(*) counts all rows in the table for both base tables and views. The sysindexes query works for base tables and not views.
Also, COUNT(Some_Column) may return a lower number than COUNT(*). BOL states that COUNT(expression) evaluates the expression for each row in a group and returns the number of nonnull values. So be careful which column you choose to count - make sure you use a required column or better still the primary key for the table.
Keith
July 2, 2004 at 6:17 am
Count(some_column) does not work with certain datatypes eg. uniqueidentifier
July 2, 2004 at 6:20 am
sp_spaceused tablename will give the no of rows - much faster
July 2, 2004 at 6:41 am
sp_spaceused can also give the "possible innaccurate" value from sysindexes,
unless the paramter @updateusage is set true.
The belief that count(*) is bad practice and that you should use count(0) or count(column_name) instead is a myth.
Maybe on some other dbsystems or earlier versions of sql server this was true, but on sql server 2000 it simply is not true.
If you use COUNT(column_name) on a column that allows NULL than it will not count the NULL values, and it will be slower than COUNT(*) because sql server has to weed out the NULL values.
Compare execution plans if you are not convinced.
/rockmoose
You must unlearn what You have learnt
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply