March 17, 2008 at 8:46 pm
I have about 2 terabytes of data, and I need to know how many rows are in it. I ran a SELECT COUNT(*) against the table, but even after 20 minutes it was still executing. I need the row count so I can move on to some other development work I was asked to do. Has anyone worked with a database this gigantic before?
[font="Verdana"]Imagination is more important than knowledge-Albert Einstein[/font]
March 17, 2008 at 9:03 pm
Check out this query/view: http://www.sqlservercentral.com/scripts/tables/62545/
I'd recommend showing the query plan before execution.
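For example, a minimal sketch of one way to do that (MyBigTable is a placeholder name, not from this thread):
SET SHOWPLAN_TEXT ON;
GO
SELECT COUNT(*) FROM MyBigTable; -- returns the estimated plan instead of executing the query
GO
SET SHOWPLAN_TEXT OFF;
GO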
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
March 17, 2008 at 9:30 pm
You can query the sysobjects and sysindexes system tables:
SELECT o.name AS [Table Name], i.rowcnt AS [Row Count]
FROM sysobjects o
JOIN sysindexes i ON i.id = o.id
WHERE i.indid IN (0, 1) -- 0 = heap, 1 = clustered index
AND o.xtype = 'U' -- user tables only
AND o.name <> 'sysdiagrams'
AND o.name = 'MyTable' --MyTableName
March 17, 2008 at 9:31 pm
Got the above from this author. Have to give credit where it is due 🙂
http://www.sqlservercentral.com/scripts/Administration/61766/
March 17, 2008 at 9:53 pm
Okay, after that, I have to delete data that is more than 12 months old; whatever is older than a year from today has to be deleted. How do I do this fast and efficiently? I have an identity column as the first column, just FYI...
[font="Verdana"]Imagination is more important than knowledge-Albert Einstein[/font]
March 17, 2008 at 10:16 pm
What percentage of the total number of records is older than 12 months? Knowing that will help determine the best way to delete the data.
You have plenty of options here. You can copy the data you plan to keep to another table and truncate the source table. Truncating a table is significantly faster than deleting data because the truncation is minimally logged (only the page deallocations are recorded), so it does not inflate your t-log. Note that this option is more beneficial if you plan to delete a significant portion of the table, like 60%.
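Roughly like this (just a sketch: MyTable, MyColumn, and the identity column ID are placeholder names, not your actual schema):
SELECT *
INTO MyTable_Keep -- work table holding the rows you want to keep
FROM MyTable
WHERE MyColumn >= DATEADD(mm, -12, GETDATE());

TRUNCATE TABLE MyTable; -- fast, minimally logged deallocation

SET IDENTITY_INSERT MyTable ON; -- needed to reload the original identity values

INSERT INTO MyTable (ID, MyColumn /* , ...remaining columns... */)
SELECT ID, MyColumn /* , ...remaining columns... */
FROM MyTable_Keep;

SET IDENTITY_INSERT MyTable OFF;

DROP TABLE MyTable_Keep;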
Another option is to delete records in batches. It is often faster to delete a few thousand records at a time than 100,000 at once, since each batch holds locks and log space only briefly. You could do something like this:
DECLARE @Batch INT;
SET @Batch = 1000;

WHILE EXISTS (SELECT 1
              FROM MyTable
              -- compare datetimes directly; converting to VARCHAR style 101
              -- compares mm/dd/yyyy strings, which sorts incorrectly and also
              -- prevents any index on MyColumn from being used
              WHERE MyColumn < DATEADD(mm, -12, GETDATE()))
BEGIN
    DELETE TOP (@Batch)
    FROM MyTable
    WHERE MyColumn < DATEADD(mm, -12, GETDATE());
END
March 17, 2008 at 10:20 pm
How much data does that 12-month period represent? If the data you're keeping is small compared to the 2TB, it will be faster to copy out the stuff you want to keep to a new table, then DROP the old table and rename the table you just created. Of course, you end up having to recreate constraints, indexes, etc., but then again, it can still be a lot faster.
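Something along these lines (a sketch only; MyTable and mydate are placeholder names):
SELECT *
INTO MyTable_New
FROM MyTable
WHERE mydate >= DATEADD(mm, -12, GETDATE()); -- keep only the last 12 months

DROP TABLE MyTable;

EXEC sp_rename 'MyTable_New', 'MyTable';

-- SELECT...INTO copies the data only, so recreate constraints, indexes,
-- triggers, and permissions on the renamed table afterwards.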
If that's not feasible, then perhaps try "walking the clustered index" (which hopefully is that identity column you just mentioned). It looks something like this:
declare @batchsize int;
set @batchsize = 50000; -- how many records to kill in one operation

declare @startid int;
set @startid = 0;

declare @endid int;
select @endid = max(id) from MyTableToClean; -- SELECT, not SET, to assign from a query

declare @datetokill datetime;
set @datetokill = dateadd(mm, -12, dateadd(dd, datediff(dd, 0, getdate()), 0)); -- midnight 12 months ago: kill in "full days"

while @startid < @endid
begin
    delete
    from MyTableToClean
    where id between @startid and @startid + @batchsize
      and mydate < @datetokill;

    set @startid = @startid + @batchsize + 1;
    checkpoint; -- in SIMPLE recovery this lets the log clear between batches
end
If that really is your clustered index, then it should take care of your deletions pretty well. The nice part is you should be able to stop and restart this loop if it's making your server miserable.
Either way - don't forget to rebuild your indexes (and update your stats) when you're done.
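For example (same placeholder table name as above; ALTER INDEX needs SQL 2005 or later, on 2000 use DBCC DBREINDEX):
ALTER INDEX ALL ON MyTableToClean REBUILD; -- rebuild every index on the table
UPDATE STATISTICS MyTableToClean WITH FULLSCAN; -- refresh statistics after the mass delete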
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
March 19, 2008 at 3:22 pm
Here is a function we wrote based on info posted here and elsewhere:
/*
This function will return the number of rows in a table, the same result
as executing a "Select Count(*) from tableName" except that this is much
more efficient in that it doesn't have to scan the entire table.
Note: the value returned could be off by a few records, especially if there
has been some recent insert and/or delete activity against the table. If
absolute accuracy is important, be sure to execute the following
command first:
DBCC Updateusage( '', @TableName ) [WITH NO_INFOMSGS]
For large tables, this will still be MUCH faster than "Select Count(*)"
*/
Create function [dbo].[GetRowCount](
@TableName sysname
)
returns int
as begin
declare @Result int;
Select @Result = rows
from sysindexes
where id = object_id( @TableName, 'U' )
and indid < 2; -- 0 = heap, 1 = clustered index; either row holds the table's row count
return IsNull( @Result, 0 );
end
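Example call (the table name is a placeholder):
Select dbo.GetRowCount( 'MyTable' ) as [Approximate Row Count];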
Tomm Carr
--
Version Normal Form -- http://groups.google.com/group/vrdbms
March 19, 2008 at 4:23 pm
Thank you everybody above. Thanks a bunch guys for the quick help!
[font="Verdana"]Imagination is more important than knowledge-Albert Einstein[/font]
March 20, 2008 at 2:56 pm
How did it work out?
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]