April 10, 2015 at 1:54 pm
Is this an accurate way to get row count?
SELECT
t.name table_name,
s.name schema_name,
p.rows AS total_rows
FROM sys.tables t
join sys.schemas s on (t.schema_id = s.schema_id)
join sys.partitions p on (t.object_id = p.object_id)
WHERE p.index_id in (0,1) AND t.name='MyTable' AND p.rows=0
vs the old and simple, SELECT COUNT(*) ? ...
I'm about to query hundreds of databases to get row count for a particular table. The 2nd one looks more resource intensive to me, but I want to be sure the DMV provides accurate row information.
Thanks in advance.
April 10, 2015 at 2:04 pm
sql-lover (4/10/2015)
Is this an accurate way to get row count?
SELECT
t.name table_name,
s.name schema_name,
p.rows AS total_rows
FROM sys.tables t
join sys.schemas s on (t.schema_id = s.schema_id)
join sys.partitions p on (t.object_id = p.object_id)
WHERE p.index_id in (0,1) AND t.name='MyTable' AND p.rows=0
vs the old and simple, SELECT COUNT(*) ? ...
I'm about to query hundreds of databases to get row count for a particular table. The 2nd one looks more resource intensive to me, but I want to be sure the DMV provides accurate row information.
Thanks in advance.
If ALL of your statistics are refreshed first this should return an accurate count...or really close in nearly every circumstance. Meaning that if the stats are fresh the row count is not likely to be off by more than a small handfull on a couple of tables. If something happens between the index refresh and your query they may be off by a few rows. If 100% accuracy is not critical this is fine. If you need pinpoint accuracy you will need to count the rows.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 10, 2015 at 2:04 pm
I use a slightly different query. I like to get the rows from the sys.dm_db_partition_stats
SELECT
so.name as TableName
,ps.row_count as TotalRows
FROM sys.dm_db_partition_stats ps
LEFT JOIN sys.objects so
ON ps.object_id=so.object_id
WHERE
so.name='TableName'
and ps.index_id = 1;
James Phillips
Sr. Consultant
Pragmatic Works
April 10, 2015 at 2:11 pm
Thanks for the responses so far.
I need 100% certainty. The DMV/count is needed to identify the table(s) with an empty set. A massive DDL change will be deployed against those tables with zero rows and it will be done live. But! ... If there are rows, means an specific application module is active for that client and the DDL should not touch that table so changes should be perform on that client during the maintenance window only.
But I had the feeling that statistics increases the accuracy of the DMV. It seems that I was right.
So going back to my original question. If I want to be 100% sure that the table has zero rows, should I use COUNT(*) instead?
April 10, 2015 at 2:13 pm
Quick suggestion, simplify the query, the difference between the first and the second is 50%
😎
SELECT
t.name table_name,
s.name schema_name,
p.rows AS total_rows
FROM sys.tables t
join sys.schemas s on (t.schema_id = s.schema_id)
join sys.partitions p on (t.object_id = p.object_id)
WHERE p.index_id in (0,1) AND p.rows=0;
SELECT
OBJECT_NAME(SP.object_id) AS TABLE_NAME
,OBJECT_SCHEMA_NAME(ST.object_id) AS SCHEMA_NAME
,SP.rows AS TOTAL_ROWS
FROM sys.partitions SP
INNER JOIN sys.tables ST
ON SP.object_id = ST.object_id
WHERE SP.rows = 0;
April 10, 2015 at 2:27 pm
sql-lover (4/10/2015)
So going back to my original question. If I want to be 100% sure that the table has zero rows, should I use COUNT(*) instead?
Yes. If you want 100% accuracy the guaranteed way is to look at the table. It may be slow on large tables. Of course if you are just trying to determine if there are rows you should EXISTS instead of COUNT.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
April 10, 2015 at 2:35 pm
Agree much less costly to just do a if exists (select top 1 from tablename)
James Phillips
Sr. Consultant
Pragmatic Works
April 10, 2015 at 2:42 pm
sql-lover (4/10/2015)
Thanks for the responses so far.I need 100% certainty. The DMV/count is needed to identify the table(s) with an empty set. A massive DDL change will be deployed against those tables with zero rows and it will be done live. But! ... If there are rows, means an specific application module is active for that client and the DDL should not touch that table so changes should be perform on that client during the maintenance window only.
But I had the feeling that statistics increases the accuracy of the DMV. It seems that I was right.
So going back to my original question. If I want to be 100% sure that the table has zero rows, should I use COUNT(*) instead?
The sys.partitions count is accurate or like Gail Shaw once put it, if it isn't then report a bug with Microsoft. The difference between these approaches is that using the sys.partitions is a single table query, using exist or count requires query on every single table, not likely to be more efficient!
😎
April 10, 2015 at 3:36 pm
Sean - I am interested in why you say that COUNT(*) is the only way to get a row count. I'm not disagreeing with you - just curious looking for a more detailed explanation of what's wrong with sys.partitions.
Eirikur Eiriksson (4/10/2015)
The difference between these approaches is that using the sys.partitions is a single table query, using exist or count requires query on every single table, not likely to be more efficient!😎
If you are saying that the query posted by the OP would not be faster than SELECT COUNT(*) FROM <MyTable> for a row count then I'm going to have to respectfully disagree.
(temp1 is my laptop test db)
USE temp1
GO
IF OBJECT_ID('temp1.dbo.X') IS NOT NULL DROP TABLE dbo.X
GO
CREATE TABLE dbo.X (xID int not null, xTxt varchar(36) not null);
WITH
E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(c)),
E3(N) AS (SELECT 1 FROM E1 a, E1 b, E1 c)
INSERT dbo.X
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT ($))), newid()
FROM E3 a, E3 b;
SET NOCOUNT ON;
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
SET STATISTICS TIME ON
SELECT COUNT(*) FROM dbo.X;
SELECT p.rows AS total_rows
FROM sys.tables t
join sys.schemas s on (t.schema_id = s.schema_id)
join sys.partitions p on (t.object_id = p.object_id)
WHERE p.index_id in (0,1) AND t.name='x';
SET STATISTICS TIME OFF
GO
Results:
SQL Server Execution Times:
CPU time = 641 ms, elapsed time = 345 ms.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 23 ms.
Note that the SELECT COUNT(*) method is using a parallel plan to complete the task in 345 ms.
Update/edit: I missed the updated query that you posted which completes in 0 ms on my laptop, 9ms for 4,000,000 rows.
-- Itzik Ben-Gan 2001
April 10, 2015 at 4:26 pm
Ok, quick update... I did some research on this - it appears that sys.partitions does not guarantee an accurate row count. According to Microsoft, the rows column in sys.partitions "Indicates the approximate number of rows in this partition."
-- Itzik Ben-Gan 2001
April 10, 2015 at 4:43 pm
Alan.B (4/10/2015)
Ok, quick update... I did some research on this - it appears that sys.partitions does not guarantee an accurate row count. According to Microsoft, the rows column in sys.partitions "Indicates the approximate number of rows in this partition."
AFAIK, sys.partitions and sys.dm_db_partition_stats are accurate but might not give exact information due to problems with transactions. Statistics don't affect them the way sysindexes was affected.
These are other possible queries.
SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('TableName')
AND (index_id=0 or index_id=1);
EXEC sp_spaceused 'TableName'
April 11, 2015 at 12:01 am
Quick thought, let's not forget that tables can have more than one partition
😎
SELECT
OBJECT_NAME(SP.object_id) AS TABLE_NAME
,OBJECT_SCHEMA_NAME(SP.object_id) AS SCHEMA_NAME
,SUM(SP.rows) AS TOTAL_ROWS
FROM sys.partitions SP
WHERE SP.index_id IN (0,1)
GROUP BY SP.object_id
HAVING SUM(SP.rows) = 0;
April 11, 2015 at 4:43 am
As far as I know, if you have to have a guaranteed, 100% accurate, count of the rows in the table, the way to do it is COUNT. Everything I know about all the other DMVs say that they'll get you close, if not spot on, most of the time. But it's that most bit that's an issue if you have to have a perfect set of counts. The thing is, why do you need a perfect set of row counts?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
April 11, 2015 at 7:13 am
Grant Fritchey (4/11/2015)
As far as I know, if you have to have a guaranteed, 100% accurate, count of the rows in the table, the way to do it is COUNT. Everything I know about all the other DMVs say that they'll get you close, if not spot on, most of the time. But it's that most bit that's an issue if you have to have a perfect set of counts. The thing is, why do you need a perfect set of row counts?
So the script will run on empty sets only. Those are the only databases that can be done on the fly.
April 11, 2015 at 12:32 pm
sql-lover (4/11/2015)
Grant Fritchey (4/11/2015)
As far as I know, if you have to have a guaranteed, 100% accurate, count of the rows in the table, the way to do it is COUNT. Everything I know about all the other DMVs say that they'll get you close, if not spot on, most of the time. But it's that most bit that's an issue if you have to have a perfect set of counts. The thing is, why do you need a perfect set of row counts?So the script will run on empty sets only. Those are the only databases that can be done on the fly.
Belt and braces then?
😎
SELECT COUNT(*)
FROM dbo.TABLE_NAME WITH (TABLOCKX);
Viewing 15 posts - 1 through 15 (of 38 total)
You must be logged in to reply to this topic. Login to reply