August 18, 2009 at 3:57 pm
Something I always wondered about.
I ran into this article
but it's dated 2007.
My guess today would be YouTube, or Google in general. At the rate people are uploading, it must be growing exponentially.
I'm surprised services like Facebook, MySpace, and even World of Warcraft, with its 10 million subscribers, didn't make the list.
August 19, 2009 at 5:31 am
There are probably (and we're unlikely to ever know) signal processing databases at the NSA that make WoW, YouTube & Google look like rinky-dink Access dbs. Just the hints we've seen in press reports about the types of processing and the amount of data collected suggest these guys are dealing with some seriously huge data sets.
You might also consider some of the financial systems that track trades around the globe in various markets. Any individual transaction is minuscule, but the cumulative volume is quite large.
I'd put my bets on either of those before the YouTubes of the world.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 19, 2009 at 7:40 am
I know that with large databases you can do things like partition tables or create multiple filegroups to improve speed and storage and to add flexibility with backups.
However, is it possible that these are really several databases combined rather than one?
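For anyone unfamiliar with the partitioning idea, here is a minimal sketch in Python of how range partitioning decides which partition a row lands in, based on a date column. The boundary dates and the name `partition_for` are made up for illustration. It only mimics the routing logic and is not any real SQL Server API.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical boundary dates: each one marks the start of a new partition.
BOUNDARIES = [date(2008, 1, 1), date(2009, 1, 1), date(2010, 1, 1)]


def partition_for(day: date) -> int:
    """Return the 1-based partition number that would hold the given date.

    A date equal to a boundary goes into the partition that starts there;
    anything before the first boundary goes into partition 1.
    """
    return bisect_right(BOUNDARIES, day) + 1


if __name__ == "__main__":
    for d in (date(2007, 6, 30), date(2009, 1, 1), date(2009, 8, 19)):
        print(d, "-> partition", partition_for(d))
```

Three boundaries give four partitions. The appeal is that old partitions can sit on their own filegroups and be backed up or archived separately from the current data.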
August 19, 2009 at 7:42 am
I don't have a clue. It'd be fun to have that problem though.
August 19, 2009 at 7:51 am
I think having a large db would be a headache. On call, no sleep, family interrupted. It would be a challenge, but I'm not sure I'd think it was fun.
I don't think YouTube has a super large db. Most of the video is probably in the Google File System, with only metadata (name, reads, votes, etc.) in a db. I think MySpace, Facebook, etc. all use multiple databases, so they don't have everything together.
I'm not sure Google even has one true database with everything in it. However, maybe we do need to change our idea of what a database is as we go to more distributed systems.
I know in the SQL space there is a Winter Corp survey that had Verizon as one of the larger dbs, at 100TB+.
August 19, 2009 at 7:54 am
No idea how accurate any of these are...
http://www.usatoday.com/money/industries/technology/maney/2006-05-16-nsa-privacy_x.htm
I'd love to play with even the baby of the listed ones. 312 TB is one hell of a challenge.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
August 19, 2009 at 8:00 am
I know that MySpace (which runs on SQL Server) is not one database. Last time I heard they had 100 or so database servers with the data distributed between them.
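To illustrate what "data distributed between them" can look like, here is a rough sketch in Python of routing a user to one of many servers by hashing the user id. This is not MySpace's actual scheme, which I don't know in detail. The server names and the function `server_for` are hypothetical; only the figure of roughly 100 servers comes from the post.

```python
import zlib

# Hypothetical server names; the count of 100 is the rough figure quoted above.
SERVERS = [f"db{n:03d}" for n in range(100)]


def server_for(user_id: int) -> str:
    """Pick the server that would hold this user's rows.

    crc32 is a stable checksum, so the same id always maps to the same server.
    """
    bucket = zlib.crc32(str(user_id).encode("utf-8")) % len(SERVERS)
    return SERVERS[bucket]


if __name__ == "__main__":
    for uid in (1, 42, 1000000):
        print(uid, "->", server_for(uid))
```

Plain modulo routing is the simplest option, but adding a server remaps most users. Real systems often use a lookup table or consistent hashing to avoid that.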
Gail Shaw
August 19, 2009 at 8:08 am
Wow. Thanks for the good article to read this morning.
I can't even imagine a 312TB database, let alone petabytes.
I wonder what it takes for these organizations to maintain these databases. How many DBAs and developers would it take, and how would they structure their staff to work on these systems? My guess is they would have several different groups of DBAs and developers, each assigned to a part of the database system... I'm curious.
August 27, 2009 at 10:30 pm
You have sparked my interest.
To be able to work on some of these databases could be fun in a morbid way. It would be a fantastic challenge.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Learn Extended Events
August 28, 2009 at 7:08 am
I remember my ex (an Oracle DBA) drooling at the thought of the enormity of the set-up at CERN
15 petabytes a year of data
11 DBA teams
PDF Link to "Databases for the CERN LHC: Techniques and Lessons learned"
August 29, 2009 at 10:42 am
I stumbled across a blog a few days ago which directed me to a slide show by the chief data architect from MySpace.
Take a look at the slide show - an interesting insight into how SQL Server is used.
http://www.slideshare.net/markginnebaugh/myspace-data-architecture-june-2009
August 29, 2009 at 1:36 pm
There are a few good presentations on MySpace. I bet their DBAs have an interesting job, but not much sleep.
August 31, 2009 at 12:56 pm
Clive Strong (8/29/2009)
I stumbled across a blog a few days ago which directed me to a slide show by the chief data architect from MySpace. Take a look at the slide show - an interesting insight into how SQL Server is used.
http://www.slideshare.net/markginnebaugh/myspace-data-architecture-june-2009
Awesome slideshow. Thanks for sharing.
September 1, 2009 at 1:57 am
No problem. It's good to see slideshows like that and I agree...very little sleep...Reminds me of working for Virgin Media!