September 17, 2013 at 1:55 pm
Hey everyone..I would like to start a discussion to understand what are the different quickest and most efficient ways to handle the following scenarios:
1) How to move trillion records
a) From one SQL Server table to another SQL Server table
b) From SQL Server table to Excel or CSV file
c) From Excel or csv file into SQL server table
2) How to count trillion records in SQL server
Please provide any details if you can..Thank you!!
September 17, 2013 at 2:16 pm
rockstar283 (9/17/2013)
Hey everyone..I would like to start a discussion to understand what are the different quickest and most efficient ways to handle the following scenarios:1) How to move trillion records
a) From one SQL Server table to another SQL Server table
I would question why you need to move a trillion rows of data. It is already in a table so there better be a REALLY good reason to move it.
b) From SQL Server table to Excel or CSV file
You don't. Excel can't come close to handling a trillion rows.
c) From Excel or csv file into SQL server table
You can't...see above.
2) How to count trillion records in SQL server
Unless you have a different number of fingers and toes I would suggest COUNT(*)...but you might be prepared to wait for a bit.
There is my 2ยข.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 โ Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 17, 2013 at 2:29 pm
Thanks for the guidance mate..I had an interview today and the interviewer asked me the same questions..I answered it the best I could and suggested everything from count(*),@@rowcount,ranking functions to bulk copy, select * into etc..but he was not looking completely satisfied by my answers..so it got me thinking..
September 17, 2013 at 2:44 pm
rockstar283 (9/17/2013)
Thanks for the guidance mate..I had an interview today and the interviewer asked me the same questions..I answered it the best I could and suggested everything from count(*),@@rowcount,ranking functions to bulk copy, select * into etc..but he was not looking completely satisfied by my answers..so it got me thinking..
I kind of wondered if these were interview questions. Seems a strange thing to ask since not too many places have a trillion rows of data. That is a massive amount of data. Reason I made my comment about not moving without a good reason is because even in batches that would take a crazy amount of time and need a crazy amount of disc space. I mean a trillion rows of just a single int would be 4,000,000,000,000 bytes. That is not small amount of disc space. I have never worked with anything anywhere near that large but if I did I would only move that data as an absolute last resort. It would likely be easier to remove the rows no longer needed and rename the table. Questions like that are a bit bizarre to me.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 โ Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 17, 2013 at 2:46 pm
If I was asked that in an interview, my immediate reaction would be to ask 'for what purpose?' Depending on why those rows need to be moved, I could probably suggest alternatives that did not involve moving that many rows around, because it's not something you'd ever want to do.
A trillion rows, at minimum, is just under a TB of data (if each row is a single byte, it probably isn't). Where's that going to be stored? How's it going to be moved? How fast is the network? What's the point?
p.s. Count_Big()
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 17, 2013 at 2:47 pm
I completely agree with you..even I will never build anything like..from the moment when you realize that the table is going to be that big..one should immediately think about partitioning it..its stupid to keep that much data in one table and then move it..the most insane thing in the interview was that after this question he asked me about 300 Terabytes of data ๐ ๐ ๐ ๐ :hehe:
September 17, 2013 at 2:48 pm
rockstar283 (9/17/2013)
from the moment when you realize that the table is going to be that big..one should immediately think about partitioning it.
No.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 17, 2013 at 2:50 pm
rockstar283 (9/17/2013)
the most insane thing in the interview was that after this question he asked me about 300 Terabytes of data ๐ ๐ ๐ ๐ :hehe:
Not at all insane if they have that volume of data. That is certainly a lot of data but VLDBs are different than trillion row tables.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 โ Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
September 18, 2013 at 2:58 am
Answers I have found so far:
Quickest way to move the data:
1) SSIS
2) DELETE FROM dbo.Table1 OUTPUT deleted.* INTO dbo.Table2
3) SELECT * INTO dbo.Table2 from dbo.Table2 where 1=0
Quickest way to count the number of records:
1) SP_Spaceused
Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database.
USE AdventureWorks2012;
GO
EXEC sp_spaceused N'Purchasing.Vendor';
GO
2) Count(*)
3) select SCHEMA_NAME(ST.schema_id)+'.'+ST.name TableName,
SP.rows RowCnt
from
sys.tables ST
inner join sys.partitions SP
on (ST.object_id = SP.object_id
and SP.index_id in (0,1))
order by
RowCnt desc
4) SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
where (index_id=0 or index_id=1)
and object_name(object_id) = 'YourTableName'
5) select Count_Big(*) from HumanResources.Employee
Thanks to all who helped me to formulate above answers
if anyone else has any other thoughts..please pitch in:-)
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply