September 2, 2009 at 12:15 am
Comments posted to this topic are about the item How To Get Table Row Counts Quickly And Painlessly
September 2, 2009 at 12:23 am
In order for the rowcounts found in sysIndexes in SQL Server 2000 to be anywhere close to accurate, you need to use DBCC UPDATEUSAGE on the table you're trying to get the rowcount on. To get the rowcounts (and a lot of other information) for all tables in a database, you need to do something like this ...
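For a single table, the usage counters can be refreshed first with something like this (the database and table names here are hypothetical, just to show the form):

```sql
-- Refresh the row count and page-usage counters for one table before
-- reading sysIndexes (names are hypothetical)
DBCC UPDATEUSAGE ('MyDatabase', 'dbo.MyTable') WITH COUNT_ROWS, NO_INFOMSGS
```

The full script below does the same thing database-wide with DBCC UPDATEUSAGE(0).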
--_______________________________________________________________________________________________________________________
/**********************************************************************************************************************
Purpose:
Returns a single result set similar to sp_SpaceUsed for all user tables at once.
Notes:
1. May be used as a view, stored procedure, or table-valued function.
2. Must comment out 1 "Schema" in the SELECT list below prior to use. See the adjacent comments for more info.
Revision History:
Rev 00 - 22 Jan 2007 - Jeff Moden
- Initial creation for SQL Server 2000
Rev 01 - 11 Mar 2007 - Jeff Moden
- Add automatic page size determination for future compliance
Rev 02 - 05 Jan 2008 - Jeff Moden
- Change "Owner" to "Schema" in output. Add optional code per Note 2 to find correct schema name
**********************************************************************************************************************/
--===== Ensure that all row counts, etc., are up to snuff
-- Obviously, this will not work in a view or UDF and should be removed if in a view or UDF. External code should
-- execute the command below prior to retrieving from the view or UDF.
DBCC UPDATEUSAGE(0) WITH COUNT_ROWS, NO_INFOMSGS
--===== Return the single result set similar to what sp_SpaceUsed returns for a table, but more
SELECT DBName = DB_NAME(),
--SchemaName = SCHEMA_NAME(so.UID), --Comment out if for SQL Server 2000
SchemaName = USER_NAME(so.UID), --Comment out if for SQL Server 2005
TableName = so.Name,
TableID = so.ID,
MinRowSize = MIN(si.MinLen),
MaxRowSize = MAX(si.XMaxLen),
ReservedKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Reserved ELSE 0 END) * pkb.PageKB,
DataKB = SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.DPages ELSE 0 END) * pkb.PageKB
+ SUM(CASE WHEN si.IndID IN ( 255) THEN ISNULL(si.Used,0) ELSE 0 END) * pkb.PageKB,
IndexKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Used ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.DPages ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN ( 255) THEN ISNULL(si.Used,0) ELSE 0 END) * pkb.PageKB,
UnusedKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Reserved ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Used ELSE 0 END) * pkb.PageKB,
Rows = SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.Rows ELSE 0 END),
RowModCtr = MIN(si.RowModCtr),
HasTextImage = MAX(CASE WHEN si.IndID IN ( 255) THEN 1 ELSE 0 END),
HasClustered = MAX(CASE WHEN si.IndID IN ( 1 ) THEN 1 ELSE 0 END)
FROM dbo.SysObjects so,
dbo.SysIndexes si,
(--Derived table finds page size in KB according to system type
SELECT Low/1024 AS PageKB --1024 is a binary Kilo-byte
FROM Master.dbo.spt_Values
WHERE Number = 1 --Identifies the primary row for the given type
AND Type = 'E' --Identifies row for system type
) pkb
WHERE si.ID = so.ID
AND si.IndID IN (0, --Table w/o Text or Image Data
1, --Table with clustered index
255) --Table w/ Text or Image Data
AND so.XType = 'U' --User Tables
AND PERMISSIONS(so.ID) > 0
GROUP BY so.Name,
so.UID,
so.ID,
pkb.PageKB
ORDER BY ReservedKB DESC
Notice that there are no cursors or While loops... sp_MSForEachTable is nothing but the world's nastiest cursor behind the scenes. The code above also has the advantage of returning all the data in a single result set.
So far as running SELECT COUNT(*) against a million row table goes.... try it out... I believe you'll be surprised at how quick it actually is and it's accurate at the time of the run...
Here's a million row test table...
--===== Create and populate a 1,000,000 row test table.
-- Column "SomeID" has a range of 1 to 1,000,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
-- Column "SomeMoney" has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
-- Jeff Moden
SELECT TOP 1000000
SomeID = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTest
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (SomeID)
And here's the test to run against it...
SELECT COUNT(*) FROM dbo.JBMTest
--Jeff Moden
Change is inevitable... Change for the better is not.
September 2, 2009 at 7:05 am
"Quickly and Painlessly" doesn't seem to fit the method, when SELECT COUNT(*) FROM myTable is just four little words (ok, a function ain't a word, whatever) long, and the alternative is joining three tables together and using a WHERE clause. Point taken that the system objects hold this data, but not sure that's easier.
Well-written article though, clear, concise and to the point. Good job!
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
stewsterl 80804 (10/16/2009): "I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
September 2, 2009 at 7:21 am
I find this works well
USE DatabaseName
GO
CREATE TABLE #temp (
table_name sysname ,
row_count INT,
reserved_size VARCHAR(50),
data_size VARCHAR(50),
index_size VARCHAR(50),
unused_size VARCHAR(50))
SET NOCOUNT ON
INSERT #temp
EXEC sp_msforeachtable 'sp_spaceused ''?'''
SELECT a.table_name,
a.row_count,
COUNT(*) AS col_count,
a.data_size
FROM #temp a
INNER JOIN information_schema.columns b
ON a.table_name collate database_default
= b.table_name collate database_default
GROUP BY a.table_name, a.row_count, a.data_size
ORDER BY CAST(REPLACE(a.data_size, ' KB', '') AS integer) DESC
DROP TABLE #temp
September 2, 2009 at 7:24 am
It's useful if you want to know which tables contain data, or contain a lot of rows. I've used the method in the past after a customer had reported unexpected growth in their database. I quickly determined that a logging table existed with lots of rows in it and that it wasn't being cleared down.
David
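For that kind of "which tables are big" check on SQL Server 2000, a minimal sketch (not from the article, and assuming DBCC UPDATEUSAGE has been run first so the counters are trustworthy) is to read the heap/clustered-index row counts straight from sysIndexes:

```sql
-- Quick, approximate row counts per user table, biggest first (SQL Server 2000)
SELECT so.Name AS TableName,
       si.Rows AS ApproxRows
  FROM dbo.SysObjects so
  JOIN dbo.SysIndexes si
    ON si.ID = so.ID
 WHERE so.XType = 'U'    -- user tables only
   AND si.IndID < 2      -- IndID 0 = heap, IndID 1 = clustered index
 ORDER BY si.Rows DESC
```

A runaway logging table like the one described would show up at or near the top of this list immediately.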
September 2, 2009 at 7:42 am
The undocumented stored procedure is sp_MSforeachtable, not sp_foreachtable.
So you could do something like the following to return the results quickly:
EXEC sp_MSforeachtable 'sp_spaceused ''?'''
There is also the undocumented stored procedure sp_MSforeachdb.
----------------------------------------------------------------------------------------
Dan English - http://denglishbi.wordpress.com
September 2, 2009 at 8:23 am
Good article. It's good to see sp_spaceused in an article. It provides a lot of great information for one little proc.
I still can't believe that you can get a count from SQL Server faster in FoxPro than you can in the native product. It is dumb that we have to worry about slow record counts and use JET to see how many rows there are in order to do it fast. I think the development team should have a hash like JET uses that is integral to SQL Server.
September 2, 2009 at 8:29 am
The post is useful for getting rowcounts, but for a partitioned table it does not give the correct rowcount.
For a partitioned table, it gives the count for each individual partition rather than the count for the whole table.
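On SQL Server 2005 and later, one way around that (a sketch, not part of the original article) is to sum the per-partition counts from sys.partitions so each table gets a single total:

```sql
-- Whole-table row counts for partitioned and non-partitioned tables alike (SQL 2005+)
SELECT OBJECT_NAME(p.object_id) AS TableName,
       SUM(p.rows)              AS TotalRows
  FROM sys.partitions p
 WHERE p.index_id IN (0, 1)     -- heap (0) or clustered index (1) only
   AND OBJECTPROPERTY(p.object_id, 'IsUserTable') = 1
 GROUP BY p.object_id
 ORDER BY TotalRows DESC
```

The GROUP BY on object_id collapses all partitions of a table into one row, which is exactly what the per-partition counts fail to do on their own.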
September 2, 2009 at 8:34 am
I find the report of DBCC CHECKDB useful.
Why doesn't "SELECT COUNT(*) FROM table" use an index? Apparently it doesn't(?)
September 2, 2009 at 8:53 am
I ran count(*) and sp_spaceused on a table: count(*) returned 9997796 and sp_spaceused returned 9997764, even with @updateusage = 'True'.
My first thought is 'corrupt table'. Comments?
September 2, 2009 at 8:58 am
rja.carnegie (9/2/2009)
Why doesn't "SELECT COUNT(*) FROM table" use an index? Apparently it doesn't(?)
It should pick the smallest index (by page reads).
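One way to verify which index the optimizer actually picks for the count (table name hypothetical) is to turn on the I/O and plan statistics around the query:

```sql
-- Show logical reads and the chosen access path for the count
-- (table name is hypothetical)
SET STATISTICS IO ON
SET STATISTICS PROFILE ON

SELECT COUNT(*) FROM dbo.MyTable

SET STATISTICS PROFILE OFF
SET STATISTICS IO OFF
```

If a narrow nonclustered index exists, the plan output should show an index scan on it rather than on the clustered index, with correspondingly fewer logical reads.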
September 2, 2009 at 9:05 am
I ran count(*) and got a count of 1048912, but when I ran sp_spaceused I got 1048613, so the counts don't match. Any comments?
September 2, 2009 at 9:53 am
jvrakesh (9/2/2009)
I ran count(*) and got a count of 1048912, but when I ran sp_spaceused I got 1048613, so the counts don't match. Any comments?
Yes... see my previous post (2nd post in this thread).
--Jeff Moden
Change is inevitable... Change for the better is not.
September 2, 2009 at 10:05 am
I had the same problem, different counts from sp_spaceused versus select count(*)
So I tried sp_spaceused with @updateusage = 'True', and I also ran DBCC UPDATEUSAGE for the table.
And the numbers never varied. My only thought is possibly a corrupt table.
Luckily I have little need for record counts from large tables :^)...
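If corruption really is the suspicion, DBCC CHECKTABLE can confirm or rule it out before blaming the usage counters (table name hypothetical):

```sql
-- Check the table's allocation and structural integrity
-- (table name is hypothetical)
DBCC CHECKTABLE ('dbo.MyTable') WITH NO_INFOMSGS
```

No output with NO_INFOMSGS means no errors were found, which would point back at stale counters rather than a damaged table.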
September 2, 2009 at 11:19 am
I find it baffling that the query planner can't provide a way to do count(*) quickly in the common case. It does so many other clever things; this one seems fairly straightforward to implement by comparison, and there's obviously a need for it, as the existence of this article shows.