SELECT COUNT(*) vs DMVs to get row count

  • Sean Lange (4/13/2015)


    GilaMonster (4/13/2015)


    Not fixed. Large portions of the storage engine were rewritten between SQL 2000 and 2005.

    I would say that rewriting the storage engine is what "fixed" the problem.

    😀

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The script ran beautifully! 😉 Thanks all for your input.

  • sql-lover (4/10/2015)


    Thanks for the responses so far.

    I need 100% certainty. The DMV/count is needed to identify the table(s) with an empty set. A massive DDL change will be deployed against those tables with zero rows and it will be done live. But! ... If there are rows, means an specific application module is active for that client and the DDL should not touch that table so changes should be perform on that client during the maintenance window only.

    But I had the feeling that statistics increases the accuracy of the DMV. It seems that I was right.

    So going back to my original question. If I want to be 100% sure that the table has zero rows, should I use COUNT(*) instead?

    Then I would use SELECT count(*), GetDate() from <table name> to be certain.

    You most likely will want to start a Transaction and Grab a Table Lock before doing this.

    It would be very sad if a customer connected after the 0 rows was returned by any query but before you are able to execute the DDL.

    In other words. For the scenario you are describing you true issue is not finding the tables that have no rows. It is making sure they remain like that until your update is finished.

  • PHYData DBA (4/14/2015)


    sql-lover (4/10/2015)


    Thanks for the responses so far.

    I need 100% certainty. The DMV/count is needed to identify the table(s) with an empty set. A massive DDL change will be deployed against those tables with zero rows and it will be done live. But! ... If there are rows, means an specific application module is active for that client and the DDL should not touch that table so changes should be perform on that client during the maintenance window only.

    But I had the feeling that statistics increases the accuracy of the DMV. It seems that I was right.

    So going back to my original question. If I want to be 100% sure that the table has zero rows, should I use COUNT(*) instead?

    Then I would use SELECT count(*), GetDate() from <table name> to be certain.

    You most likely will want to start a Transaction and Grab a Table Lock before doing this.

    It would be very sad if a customer connected after the 0 rows was returned by any query but before you are able to execute the DDL.

    In other words. For the scenario you are describing you true issue is not finding the tables that have no rows. It is making sure they remain like that until your update is finished.

    No. My issue was doing the changes on tables with data on it. That means an application module is active and in use, so users are connected and using that table.

    Anyway, please see my previous post. The script ran fine and had no issues.

  • You should look into RCSI as well. That may solve your issue even more completely and with less effort during coding :-).

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • GilaMonster (4/13/2015)


    Sean Lange (4/13/2015)


    GilaMonster (4/13/2015)


    Not fixed. Large portions of the storage engine were rewritten between SQL 2000 and 2005.

    I would say that rewriting the storage engine is what "fixed" the problem.

    😀

    +1 to that!

  • PHYData DBA (4/14/2015)


    Then I would use SELECT count(*), GetDate() from <table name> to be certain.

    Counting every row in the table just to see whether the table has a row or not is a massive waste of resources if the table isn't empty. Finding out whether there is a row or not is exactly what EXISTS is there for and it won't read the entire table, if it finds a qualifying row it exits immediately.

    You can do EXISTS within a transaction and with the appropriate isolation level if necessary, but there's no reason to pull an entire table through memory, counting all the rows, just to see if the result is > 0

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (4/15/2015)


    PHYData DBA (4/14/2015)


    Then I would use SELECT count(*), GetDate() from <table name> to be certain.

    Counting every row in the table just to see whether the table has a row or not is a massive waste of resources if the table isn't empty. Finding out whether there is a row or not is exactly what EXISTS is there for and it won't read the entire table, if it finds a qualifying row it exits immediately.

    You can do EXISTS within a transaction and with the appropriate isolation level if necessary, but there's no reason to pull an entire table through memory, counting all the rows, just to see if the result is > 0

    I was not aware he was looking for an empty table.

    Exists is absolutely the better way.

    I have been spending the last two years trying to find and replace "if (SELECT TOP 1 ...) IS NOT NULL" with "IF Exists" in over 20 DB's and 6 applications with adhoc SQL.

    It is amazing to me how hard it is to convince people one is better than the other and does the same thing.

  • GilaMonster (4/13/2015)


    Alan.B (4/10/2015)


    Ok, quick update... I did some research on this - it appears that sys.partitions does not guarantee an accurate row count. According to Microsoft, the rows column in sys.partitions "Indicates the approximate number of rows in this partition."

    Because that piece of documentation was copied almost verbatim from sysindexes in SQL 2000, which was prone to errors and inaccuracies. It was trivial in SQL 2000 to find cases where sysindexes's row count did not match Count(*) from table, in SQL 2005+, that should not happen with sys.partitions, if it does, it is a bug (from one of the dev team)

    Thanks for your comments and sorry for the late reply - I'm buried and behind on a project....

    I did notice that that documentation has not changed through 2014 in BOL.

    So I'm clear, you're saying that if COUNT(*) from a table and rows sys.partitions don't match it would be because of a bug? This per the someone on the MS SQL Server Dev Team?

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 9 posts - 31 through 38 (of 38 total)

You must be logged in to reply to this topic. Login to reply