SELECT database WHERE size = 'biggest in the world'

  • Something I always wondered about.

    I ran into this article

    Top ten largest databases

    but it's dated 2007.

    My guess today would be YouTube, or Google in general. At the rate people are uploading, they must be growing exponentially.

    I'm surprised that services like Facebook, MySpace, and even World of Warcraft, with its 10 million subscribers, didn't make the list.

  • There are probably (and we're unlikely to ever know) signal processing databases in the NSA that make WoW, YouTube & Google look like rinky-dink Access db's. Just the indications of some of the types of processing and the amount of data collected that we've seen in press reports suggest these guys are dealing with some seriously huge data sets.

    You might also consider some of the financial systems that are tracking trades around the globe in various markets. Any individual transaction is minuscule, but the cumulative effect is quite large.

    I'd put my bets on either of those before the YouTubes of the world.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I know that with large databases you can do things like partitioning tables or creating many filegroups to improve speed and storage and to gain flexibility with backups.

    However, is it possible that these databases are really more than one database combined?

  • I don't have a clue. It'd be fun to have that problem though.

  • I think having a large db would be a headache. On call, no sleep, family interrupted. It would be a challenge, but I'm not sure I'd think it was fun.

    I don't think YouTube has a super large db. Most of the video is probably in the Google File System, with only metadata (name, reads, votes, etc.) in a db. I think MySpace, Facebook, etc. all use multiple databases, so they don't have everything together.

    I'm not sure Google even has one true database with everything in it. However, maybe we do need to change our idea of what a database is as we move to more distributed systems.

    I know in the SQL space there is a Winter Corp survey that had Verizon as one of the larger dbs, at 100TB+.
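
    The split guessed at above (bulky video in a file system, only metadata in a database) can be sketched roughly as follows. This is a minimal illustration, not how YouTube actually stores anything: the `video_meta` table, its columns, and the `blob_path` value are all hypothetical, and SQLite stands in for whatever database is really used.

```python
import sqlite3

# In-memory database holding only small metadata rows (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE video_meta (
        video_id  INTEGER PRIMARY KEY,
        name      TEXT    NOT NULL,
        views     INTEGER NOT NULL DEFAULT 0,
        blob_path TEXT    NOT NULL  -- pointer to the file stored elsewhere
    )
    """
)

# The video bytes themselves would live in a file system or object store;
# the database row only records where to find them.
conn.execute(
    "INSERT INTO video_meta (name, blob_path) VALUES (?, ?)",
    ("cat video", "/videos/0001.bin"),  # hypothetical path
)

# Counting a view touches a few bytes of metadata, never the video file.
conn.execute("UPDATE video_meta SET views = views + 1 WHERE video_id = 1")

row = conn.execute(
    "SELECT name, views, blob_path FROM video_meta WHERE video_id = 1"
).fetchone()
```

    Under this layout the database stays small relative to the stored media, which would fit the guess that such a site need not have a record-breaking db.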

  • No idea how accurate any of these are...

    http://www.usatoday.com/money/industries/technology/maney/2006-05-16-nsa-privacy_x.htm

    I'd love to play with even the baby of the listed ones. 312 TB is one hell of a challenge.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • I know that MySpace (which runs on SQL Server) is not one database. Last time I heard they had 100 or so database servers with the data distributed between them.

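
    Spreading one logical data set across many servers, as described for MySpace above, is often done by routing each key to a shard. The sketch below assumes simple hash-based routing on a user id; the thread does not say which scheme MySpace used, and the server names and the `shard_for` / `server_for` helpers are hypothetical.

```python
import hashlib

def shard_for(user_id: int, shard_count: int) -> int:
    """Map a user id to a shard index using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce to the shard range.
    return int.from_bytes(digest[:8], "big") % shard_count

# Hypothetical pool of 100 database servers.
SHARDS = [f"sqlserver-{i:03d}" for i in range(100)]

def server_for(user_id: int) -> str:
    """Return the server that would hold all rows for this user."""
    return SHARDS[shard_for(user_id, len(SHARDS))]
```

    The same user id always maps to the same server, so no single database has to hold everything. A real system would also need a plan for adding servers, since changing the shard count with plain modulo routing moves most keys.
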
  • Wow. Thanks for the good article to read this morning.

    I can't even imagine a 312TB database, let alone petabytes.

    I wonder what it takes for these organizations to maintain these databases. How many DBAs and developers would it take, and how would they structure their staff to work on these systems? My guess is that they would have several different groups of DBAs and developers, each assigned to one part of the database system... I'm curious.

  • You have sparked my interest.

    To be able to work on some of these databases could be fun in a morbid way. It would be a fantastic challenge.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • I remember my ex (an Oracle DBA) drooling at the thought of the enormity of the set-up at CERN:

    15 petabytes a year of data

    11 DBA teams

    PDF Link to "Databases for the CERN LHC: Techniques and Lessons learned"

    ------------------------------------------------------------------------
    Bite-sized fiction (with added teeth)
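
    To put the 15 petabytes a year quoted for CERN into perspective, it can be converted to an average sustained rate. This is back-of-envelope arithmetic only, assuming decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes), a 365-day year, and a perfectly even flow of data, none of which is stated in the thread.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds

def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Convert a yearly volume in decimal petabytes to average MB per second."""
    bytes_per_year = petabytes_per_year * 10**15
    return bytes_per_year / SECONDS_PER_YEAR / 10**6

rate = average_rate_mb_per_s(15)
print(f"{rate:.0f} MB/s on average")
```

    That works out to a little under half a gigabyte every second, around the clock. Real collection would be bursty rather than even, so peak rates would presumably be higher.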

  • I stumbled across a blog a few days ago which directed me to a slide show by the chief data architect from MySpace.

    Take a look at the slide show - an interesting insight into how SQL Server is used.

    http://www.slideshare.net/markginnebaugh/myspace-data-architecture-june-2009

  • There are a few good presentations on MySpace. I bet their DBAs have an interesting job, but not much sleep.

  • Clive Strong (8/29/2009) wrote:

    I stumbled across a blog a few days ago which directed me to a slide show by the chief data architect from MySpace.

    Take a look at the slide show - an interesting insight into how SQL Server is used.

    http://www.slideshare.net/markginnebaugh/myspace-data-architecture-june-2009

    Awesome slideshow. Thanks for sharing.

  • No problem. It's good to see slideshows like that, and I agree: very little sleep. Reminds me of working for Virgin Media!
