September 2, 2009 at 12:15 am
Comments posted to this topic are about the item How To Get Table Row Counts Quickly And Painlessly
September 2, 2009 at 12:23 am
In order for the rowcounts found in sysIndexes in SQL Server 2000 to be anywhere close to accurate, you need to use DBCC UPDATEUSAGE on the table you're trying to get the rowcount on. To get the rowcounts (and a lot of other information) for all tables in a database, you need to do something like this ...
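For a single table, the usage counters can be refreshed first with something like this (the database and table names here are hypothetical, just to show the form):

```sql
-- Refresh the row count and page-usage counters for one table before
-- reading sysIndexes (names are hypothetical)
DBCC UPDATEUSAGE ('MyDatabase', 'dbo.MyTable') WITH COUNT_ROWS, NO_INFOMSGS
```

The full script below does the same thing database-wide with DBCC UPDATEUSAGE(0).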
--_______________________________________________________________________________________________________________________
/**********************************************************************************************************************
Purpose:
Returns a single result set similar to sp_SpaceUsed for all user tables at once.
Notes:
1. May be used as a view, stored procedure, or table-valued function.
2. Must comment out 1 "Schema" in the SELECT list below prior to use. See the adjacent comments for more info.
Revision History:
Rev 00 - 22 Jan 2007 - Jeff Moden
- Initial creation for SQL Server 2000
Rev 01 - 11 Mar 2007 - Jeff Moden
- Add automatic page size determination for future compliance
Rev 02 - 05 Jan 2008 - Jeff Moden
- Change "Owner" to "Schema" in output. Add optional code per Note 2 to find correct schema name
**********************************************************************************************************************/
--===== Ensure that all row counts, etc., are up to snuff
-- Obviously, this will not work in a view or UDF and should be removed if in a view or UDF. External code should
-- execute the command below prior to retrieving from the view or UDF.
DBCC UPDATEUSAGE(0) WITH COUNT_ROWS, NO_INFOMSGS
--===== Return the single result set similar to what sp_SpaceUsed returns for a table, but more
SELECT DBName = DB_NAME(),
--SchemaName = SCHEMA_NAME(so.UID), --Comment out if for SQL Server 2000
SchemaName = USER_NAME(so.UID), --Comment out if for SQL Server 2005
TableName = so.Name,
TableID = so.ID,
MinRowSize = MIN(si.MinLen),
MaxRowSize = MAX(si.XMaxLen),
ReservedKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Reserved ELSE 0 END) * pkb.PageKB,
DataKB = SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.DPages ELSE 0 END) * pkb.PageKB
+ SUM(CASE WHEN si.IndID IN ( 255) THEN ISNULL(si.Used,0) ELSE 0 END) * pkb.PageKB,
IndexKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Used ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.DPages ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN ( 255) THEN ISNULL(si.Used,0) ELSE 0 END) * pkb.PageKB,
UnusedKB = SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Reserved ELSE 0 END) * pkb.PageKB
- SUM(CASE WHEN si.IndID IN (0,1,255) THEN si.Used ELSE 0 END) * pkb.PageKB,
Rows = SUM(CASE WHEN si.IndID IN (0,1 ) THEN si.Rows ELSE 0 END),
RowModCtr = MIN(si.RowModCtr),
HasTextImage = MAX(CASE WHEN si.IndID IN ( 255) THEN 1 ELSE 0 END),
HasClustered = MAX(CASE WHEN si.IndID IN ( 1 ) THEN 1 ELSE 0 END)
FROM dbo.SysObjects so,
dbo.SysIndexes si,
(--Derived table finds page size in KB according to system type
SELECT Low/1024 AS PageKB --1024 is a binary Kilo-byte
FROM Master.dbo.spt_Values
WHERE Number = 1 --Identifies the primary row for the given type
AND Type = 'E' --Identifies row for system type
) pkb
WHERE si.ID = so.ID
AND si.IndID IN (0, --Table w/o Text or Image Data
1, --Table with clustered index
255) --Table w/ Text or Image Data
AND so.XType = 'U' --User Tables
AND PERMISSIONS(so.ID) > 0
GROUP BY so.Name,
so.UID,
so.ID,
pkb.PageKB
ORDER BY ReservedKB DESC
Notice that there are no cursors or While loops... sp_MSForEachTable is nothing but the world's nastiest cursor behind the scenes. The code above also has the advantage of returning all the data in a single result set.
So far as running SELECT COUNT(*) against a million row table goes.... try it out... I believe you'll be surprised at how quick it actually is and it's accurate at the time of the run...
Here's a million row test table...
--===== Create and populate a 1,000,000 row test table.
-- Column "SomeID" has a range of 1 to 1,000,000 unique numbers
-- Column "SomeInt" has a range of 1 to 50,000 non-unique numbers
-- Column "SomeLetters2" has a range of "AA" to "ZZ" non-unique 2 character strings
-- Column "SomeMoney" has a range of 0.0000 to 99.9999 non-unique numbers
-- Column "SomeDate" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Column "SomeCSV" contains 'Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10'
-- for all rows.
-- Column "SomeHex12" contains 12 random hex characters (ie, 0-9,A-F)
-- Jeff Moden
SELECT TOP 1000000
SomeID = IDENTITY(INT,1,1),
SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),
SomeCSV = CAST('Part01,Part02,Part03,Part04,Part05,Part06,Part07,Part08,Part09,Part10' AS VARCHAR(80)),
SomeMoney = CAST(ABS(CHECKSUM(NEWID()))%10000 /100.0 AS MONEY),
SomeDate = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
SomeHex12 = RIGHT(NEWID(),12)
INTO dbo.JBMTest
FROM Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN
--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE dbo.JBMTest
ADD PRIMARY KEY CLUSTERED (SomeID)
And here's the test to run against it...
SELECT COUNT(*) FROM dbo.JBMTest
--Jeff Moden
Change is inevitable... Change for the better is not.
September 2, 2009 at 7:05 am
"Quickly and Painlessly" doesn't seem to fit the method, when SELECT COUNT(*) FROM myTable is just four little words (ok, a function ain't a word, whatever) long, and the alternative is joining three tables together and using a WHERE clause. Point taken that the system objects hold this data, but not sure that's easier.
Well-written article though, clear, concise and to the point. Good job!
---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table: What it is and how it replaces a loop
stewsterl 80804 (10/16/2009): "I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
September 2, 2009 at 7:21 am
I find this works well
USE DatabaseName
GO
CREATE TABLE #temp (
table_name sysname ,
row_count INT,
reserved_size VARCHAR(50),
data_size VARCHAR(50),
index_size VARCHAR(50),
unused_size VARCHAR(50))
SET NOCOUNT ON
INSERT #temp
EXEC sp_msforeachtable 'sp_spaceused ''?'''
SELECT a.table_name,
a.row_count,
COUNT(*) AS col_count,
a.data_size
FROM #temp a
INNER JOIN information_schema.columns b
ON a.table_name collate database_default
= b.table_name collate database_default
GROUP BY a.table_name, a.row_count, a.data_size
ORDER BY CAST(REPLACE(a.data_size, ' KB', '') AS integer) DESC
DROP TABLE #temp
September 2, 2009 at 7:24 am
It's useful if you want to know which tables contain data, or contain a lot of rows. I've used the method in the past after a customer had reported unexpected growth in their database. I quickly determined that a logging table existed with lots of rows in it and that it wasn't being cleared down.
David
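For that kind of "which tables are big" check on SQL Server 2000, a minimal sketch (not from the article, and assuming DBCC UPDATEUSAGE has been run first so the counters are trustworthy) is to read the heap/clustered-index row counts straight from sysIndexes:

```sql
-- Quick, approximate row counts per user table, biggest first (SQL Server 2000)
SELECT so.Name AS TableName,
       si.Rows AS ApproxRows
  FROM dbo.SysObjects so
  JOIN dbo.SysIndexes si
    ON si.ID = so.ID
 WHERE so.XType = 'U'    -- user tables only
   AND si.IndID < 2      -- IndID 0 = heap, IndID 1 = clustered index
 ORDER BY si.Rows DESC
```

A runaway logging table like the one described would show up at or near the top of this list immediately.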
September 2, 2009 at 7:42 am
The undocumented stored procedure is sp_MSforeachtable, not sp_foreachtable.
So you could do something like the following to return the results quickly:
EXEC sp_MSforeachtable 'sp_spaceused ''?'''
There is also the undocumented stored procedure sp_MSforeachdb.
----------------------------------------------------------------------------------------
Dan English - http://denglishbi.wordpress.com
September 2, 2009 at 8:23 am
Good article. It's good to see sp_spaceused in an article. It provides a lot of great information for one little proc.
I still can't believe that you can get a count from SQL Server faster in FoxPro than you can in the native product. It is dumb that we have to worry about slow record counts and use JET to see how many rows there are in order to do it fast. I think the development team should have a hash like JET uses that is integral to SQL Server.
September 2, 2009 at 8:29 am
The post is useful for getting rowcounts, but for a partitioned table it does not give the correct rowcount.
For a partitioned table, it gives the count for each individual partition rather than the count for the whole table.
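On SQL Server 2005 and later, one way around that (a sketch, not part of the original article) is to sum the per-partition counts from sys.partitions so each table gets a single total:

```sql
-- Whole-table row counts for partitioned and non-partitioned tables alike (SQL 2005+)
SELECT OBJECT_NAME(p.object_id) AS TableName,
       SUM(p.rows)              AS TotalRows
  FROM sys.partitions p
 WHERE p.index_id IN (0, 1)     -- heap (0) or clustered index (1) only
   AND OBJECTPROPERTY(p.object_id, 'IsUserTable') = 1
 GROUP BY p.object_id
 ORDER BY TotalRows DESC
```

The GROUP BY on object_id collapses all partitions of a table into one row, which is exactly what the per-partition counts fail to do on their own.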
September 2, 2009 at 8:34 am
I find the report of DBCC CHECKDB useful.
Why doesn't "SELECT COUNT(*) FROM table" use an index? Apparently it doesn't(?)
September 2, 2009 at 8:53 am
I ran count(*) and sp_spaceused on a table: count(*) returned 9997796 and sp_spaceused returned 9997764, even with @updateusage = 'True'.
My first thought is 'corrupt table'. Comments?
September 2, 2009 at 8:58 am
rja.carnegie (9/2/2009)
Why doesn't "SELECT COUNT(*) FROM table" use an index? Apparently it doesn't(?)
It should pick the smallest index (by page reads).
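One way to verify which index the optimizer actually picks for the count (table name hypothetical) is to turn on the I/O and plan statistics around the query:

```sql
-- Show logical reads and the chosen access path for the count
-- (table name is hypothetical)
SET STATISTICS IO ON
SET STATISTICS PROFILE ON

SELECT COUNT(*) FROM dbo.MyTable

SET STATISTICS PROFILE OFF
SET STATISTICS IO OFF
```

If a narrow nonclustered index exists, the plan output should show an index scan on it rather than on the clustered index, with correspondingly fewer logical reads.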
September 2, 2009 at 9:05 am
I ran count(*) and got a count of 1048912, but when I ran sp_spaceused I got 1048613, so the counts don't match. Any comments?
September 2, 2009 at 9:53 am
jvrakesh (9/2/2009)
I ran count(*) and got a count of 1048912, but when I ran sp_spaceused I got 1048613, so the counts don't match. Any comments?
Yes... see my previous post (2nd post in this thread).
--Jeff Moden
Change is inevitable... Change for the better is not.
September 2, 2009 at 10:05 am
I had the same problem, different counts from sp_spaceused versus select count(*)
So I tried sp_spaceused with @updateusage = 'True', and I also ran DBCC UPDATEUSAGE for the table.
And the numbers never varied. My only thought is possibly a corrupt table.
Luckily I have little need for record counts from large tables :^)...
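If corruption really is the suspicion, DBCC CHECKTABLE can confirm or rule it out before blaming the usage counters (table name hypothetical):

```sql
-- Check the table's allocation and structural integrity
-- (table name is hypothetical)
DBCC CHECKTABLE ('dbo.MyTable') WITH NO_INFOMSGS
```

No output with NO_INFOMSGS means no errors were found, which would point back at stale counters rather than a damaged table.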
September 2, 2009 at 11:19 am
I find it baffling that the query planner can't provide a way to do count(*) quickly in the common case. It does so many other clever things; this one seems fairly straightforward to implement by comparison, and there's obviously a need for it, as the existence of this article shows.