How To Get Table Row Counts Quickly And Painlessly

  • matt stockham (9/2/2009)


    rja.carnegie (9/2/2009)


    Why doesn't "SELECT COUNT(*) FROM table" look at an index, which apparently it doesn't(?)

    It should pick the smallest index (by page reads).

    The ESTIMATED execution plan says it's doing a table scan, is that the wrong version of execution plan to look at (for me)? Or does it say that when it's really using the index?

    Will it surprise you that whenever I try to apply what I THINK I know about SQL Server indexing and so forth, it doesn't work like I expected?

  • Paul White (9/2/2009)


    wbrianwhite (9/2/2009)


    I had always heard that you should do a "select count(1) from table".

    http://www.techonthenet.com/sql/count.php

    "TIP: Performance Tuning

    Since the COUNT function will return the same results regardless of what NOT NULL field(s) you include as the COUNT function parameters (ie: within the brackets), you can change the syntax of the COUNT function to COUNT(1) to get better performance as the database engine will not have to fetch back the data fields."

    This is a myth.

    SELECT COUNT(*) produces an identical plan, with identical performance (and results) to SELECT COUNT(1) or SELECT COUNT(-4.678923) or whatever in all circumstances. Same goes for EXISTS (SELECT 1 FROM versus EXISTS (SELECT * FROM. I personally prefer to use the star, but it makes no difference.

    I heard a rumour once that COUNT(*) might perform differently from COUNT(1) on some early versions of Oracle, but I may be mis-remembering.

    Paul

    I think I heard it as an old Oracle trick - that the equivalent of query optimiser was dumb enough to interpret EXISTS (SELECT *) and/or COUNT(*) as demanding a column list.

    Of course COUNT(expressionThatIsSometimesNull) is different from COUNT(1). Then again, if you want to confuse enemies you can write COUNT(0).

  • rja.carnegie (9/3/2009)


    I think I heard it as an old Oracle trick - that the equivalent of query optimiser was dumb enough to interpret EXISTS (SELECT *) and/or COUNT(*) as demanding a column list.

    Of course COUNT(expressionThatIsSometimesNull) is different from COUNT(1). Then again, if you want to confuse enemies you can write COUNT(0).

    Personally I prefer COUNT(SQRT(PI())) :laugh:

    Proponents of COUNT(1) will be disappointed to see that the actual execution plan XML contains:

  • Jeff Moden (9/2/2009)


    In order for the rowcounts found in sysIndexes in SQL Server 2000 to be anywhere close to accurate, you need to use DBCC UPDATEUSAGE on the table you're trying to get the rowcount on.

    Hey Jeff,

    Just for fun, here's a small addition to your fine script:

    --===== A table is not properly formed unless a Primary Key has been assigned

    -- Takes about 1 second to execute.

    ALTER TABLE dbo.JBMTest

    ADD PRIMARY KEY CLUSTERED (SomeID);

    GO

    -- That PK creation will have loaded the data pages into cache

    -- Clear the caches now to test cold-start performance

    DBCC FREEPROCCACHE;

    DBCC DROPCLEANBUFFERS;

    GO

    -- DMV method

    SELECT row_count = SUM(row_count)

    FROM sys.dm_db_partition_stats DPS

    WHERE DPS.[object_id] = OBJECT_ID(N'dbo.JBMTest', N'U')

    AND DPS.index_id IN (0, 1);

    GO

    -- The usual method

    SELECT COUNT(*) FROM dbo.JBMTest;

    GO

    -- Performance comparisons

    SELECT query = LEFT(T.text, 30),

    S.creation_time, S.last_execution_time, S.execution_count,

    S.total_worker_time, S.total_physical_reads, S.total_logical_reads,

    S.total_elapsed_time

    FROM sys.dm_exec_query_stats S

    CROSS

    APPLY sys.dm_exec_sql_text(S.[sql_handle]) T

    WHERE T.text NOT LIKE '%sys.dm_exec_sql_text%';

    GO

    Results:

  • Paul,

    Heh... oh no... I agree... sys.dm_db_partition_stats in 2k5 is the berries especially on mega-row tables. And, you beat me to it on the OBJECT_ID thingy to help eliminate a join. I also agree with Kevin that COUNT(*) with no filters on large tables can drive some things out of cache which will cause things to run slow until recached.

    I just didn't want people that were going to write a script in 2k for this to even think about using a (ugh!) cursor or While Loop to get reasonably accurate rowcounts for all tables in a DB. 🙂

    I also took a look at the article about COUNT(1). I wish they allowed for direct responses to their articles so I could chew on them a bit about perpetuating myth's without some actual million row testing. 😉

    Last but not least, it's a rare thing to actually need to do a COUNT on anything other than maybe a temp table. If I want an order of magnitude estimate row count for a table in SQL Server 2000, I just double click on the table in EM and there it is... it's from sysIndexes and it's usually a bit off, but I don't need a precise row count in most cases... just an OM.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Kendal,

    Drat... I didn't say it in my first response (not enough coffee). Kudos on the article. It was well written and clear. And, you managed to get some folks to enjoin in a nice discussion about other methods and a caveat here and there. Well done!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff - great stuff, as usual. (Kevin, eh? Hmmm...)

    Let me add my congratulations to the author: this tends to be an emotional topic, so well done on presenting a basically sound approach in a clear and engaging fashion.

    Paul

  • >>I just didn't want people that were going to write a script in 2k for this to even think about using a (ugh!) cursor or While Loop to get reasonably accurate rowcounts for all tables in a DB.

    Absolutely no reason that someone comfortable with writing a cursor or while loop shouldn't use that if they wish to do a count(*) on all or some tables in a database. The overhead of looping sysobjects to do this (or using sp_msforeachtable) is negligible compared to the cost of actually doing the counts even if they are all indexed on a tinyint.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • markjvernon (9/3/2009)


    What if you would like to query using a where clause?

    For example...

    SELECT COUNT(*) FROM books WHERE title LIKE '%Harry Potter%'

    Or...

    SELECT COUNT(*) FROM invoices WHERE deleted = 0

    I could be wrong but I believe that querying the sysobjects is a great way for a DBA to find out table sizes but may not be as useful to a developer, who will probably want more complex results?

    You've got a point there. I think it's most likely that you'd want total row count only from a work-in-progress or temporary table that your own process just loaded data into, in which case you can use @@ROWCOUNT.

    In a trigger you may be interested in [inserted] and [deleted], but they each have the same number of rows or zero: I wonder what happens then? (But not enough right now to go find out.)

    And does it matter whether the trigger runs "AFTER" or "INSTEAD OF" actually operating on the table - matter to what @@ROWCOUNT may or may not do, I mean?

  • rja.carnegie (9/3/2009)


    You've got a point there. I think it's most likely that you'd want total row count only from a work-in-progress or temporary table that your own process just loaded data into, in which case you can use @@ROWCOUNT.

    In a trigger you may be interested in [inserted] and [deleted], but they each have the same number of rows or zero: I wonder what happens then? (But not enough right now to go find out.)

    And does it matter whether the trigger runs "AFTER" or "INSTEAD OF" actually operating on the table - matter to what @@ROWCOUNT may or may not do, I mean?

    The whole thing about the WHERE clause is a bit of a red herring: no one is suggesting that DMV-based row counts should replace the SELECT COUNT(*) construction for anything other than a count of all the rows in a table. Sure, adding a WHERE clause changes the problem, much in the same way adding a JOIN would 😉

    You seem to be wondering about @@ROWCOUNT in a trigger. Regardless of whether it is an AFTER or INSTEAD OF trigger, @@ROWCOUNT represents the total number of rows inserted, updated, or deleted. Note the words 'total' and 'or' there. In 2008, a MERGE statement that INSERTs 2 rows, UPDATEs no rows, and DELETEs 1 row would cause all three potential triggers (INSERT, UPDATE, and DELETE) to fire - with a @@ROWCOUNT of 3!

    Paul

  • TheSQLGuru (9/3/2009)


    >>I just didn't want people that were going to write a script in 2k for this to even think about using a (ugh!) cursor or While Loop to get reasonably accurate rowcounts for all tables in a DB.

    Absolutely no reason that someone comfortable with writing a cursor or while loop shouldn't use that if they wish to do a count(*) on all or some tables in a database. The overhead of looping sysobjects to do this (or using sp_msforeachtable) is negligible compared to the cost of actually doing the counts even if they are all indexed on a tinyint.

    I respectfully disagree... practicing poking yourself in the eye is never a good thing even with the low overhead you mention. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I expect there to be lots of 'messes' out there related to MERGE and triggers. So incredibly difficult to follow some of the logic trees when both are involved!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Guys,

    How about the below queries which use only 1 table or view to get the required info.

    Also the stress on the system will be reduced by using less number of tables (in this case only 1 table or view)

    /*This Query works for SQL 2000, SQL 2005 and SQL 2008*/

    Select object_name(id) AS [Table Name],rowcnt as [RowCount] from sysindexes

    WHERE indid < 2

    AND OBJECTPROPERTY(id, 'IsMSShipped') = 0

    ORDER BY object_name(id)

    /*This Query works for SQL 2005 and SQL 2008 as it uses DMV*/

    Select Distinct Object_Name(object_id) AS [Table Name], row_count AS [RowCount] from sys.dm_db_partition_stats

    where OBJECTPROPERTY(object_id, 'IsMSShipped') = 0 order by Object_Name(object_id)

    Regards,
    Sandesh Segu
    http://www.SansSQL.com

  • TheSQLGuru (9/3/2009)


    I expect there to be lots of 'messes' out there related to MERGE and triggers. So incredibly difficult to follow some of the logic trees when both are involved!

    Certainly - just imagine a MERGE that does an INSERT, UPDATE, and DELETE, has an OUTPUT clause that feeds an outer INSERT, and all of the target tables have triggers! There are so many catches with MERGE - my current favourite is the one concerning INSTEAD OF triggers. Nightmare.

  • Guys,

    How about the below queries which use only 1 table or view to get the required info.

    Also the stress on the system will be reduced by using less number of tables (in this case only 1 table or view)

    /*This Query works for SQL 2000, SQL 2005 and SQL 2008*/

    Select object_name(id) AS [Table Name],rowcnt as [RowCount] from sysindexes

    WHERE indid < 2

    AND OBJECTPROPERTY(id, 'IsMSShipped') = 0

    ORDER BY object_name(id)

    /*This Query works for SQL 2005 and SQL 2008 as it uses DMV*/

    Select Distinct Object_Name(object_id) AS [Table Name], row_count AS [RowCount] from sys.dm_db_partition_stats

    where OBJECTPROPERTY(object_id, 'IsMSShipped') = 0 order by Object_Name(object_id)

    Did you actually read any of this thread before posting this message? :w00t:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 15 posts - 31 through 45 (of 108 total)

You must be logged in to reply to this topic. Login to reply