How many rows is it reasonable to put in SQL Server? (large data question)

  • First, I'm aware of the BOL answer that we are only limited by disk space, and I have reviewed the similar question posted in 2003 (on this site in the sql7/2000 Forum).

    Someone has asked for my help migrating mainframe data to SQL Server (or Oracle or MySQL - but I know SQL Server best).

    There are currently 178 million records in the mainframe db.

    This is a couple of orders of magnitude larger than what I'm used to dealing with.

    Are any of you aware of any 'better practices' when it comes to large data?

    Here's an example of what I'm looking for: a friend of mine once commented that she had 25,000 PDF documents to deal with, and it was a pain to determine how to break them into three directories on the server.

    Me: Can't one directory hold more than that?

    Her: Yeah, Windows will handle n files in a directory, but you start to get lots of delays when you put more than 10,000 files in a single directory.

    I've heard statements in the past like 'you can put millions of rows in SQL Server, but if you're going to join tables with more than a million rows you really need Oracle on a VAX'. Even if that statement from an Oracle DBA in 2005 was true, I'm not sure it holds true now.

    Any insights? Thoughts?

    I'm interested in statements from the "don't do what I did" category

    and any "I already have an OLTP db with x rows and I do joins and it works fine, don't worry."

    Thanks, and sorry for the length - but I'm trying to avoid the RTFM answers and get some people with real-world experience to share their stories (or horror stories).

    -Chris C.

  • Largest number of rows I've heard of in a single unpartitioned table - 100 billion rows.

    Largest number I've personally worked with - 150 million rows

    Make sure there's enough hardware (memory and IO bandwidth)

    Make sure the disks are designed for throughput, not size (chapter 2: http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/).

    Make sure you index sufficiently, but not too much. Index based on the queries that will run on the table (a rough sketch follows at the end of this list).

    Make sure your queries are written optimally.

    Watch your maintenance times; large databases take longer to back up, restore, run integrity checks, rebuild indexes, etc.

    I wouldn't sweat 200 million rows these days.
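
    As a rough sketch of what "index for the queries" means here (the table, column, and index names below are invented for illustration, assuming a query that filters on CustomerID and an OrderDate range):

    -- Hypothetical query the table needs to serve:
    -- SELECT OrderID, OrderDate, Amount
    -- FROM dbo.Orders
    -- WHERE CustomerID = @CustomerID AND OrderDate >= @StartDate;

    -- One nonclustered index that covers that query, and nothing more:
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
        ON dbo.Orders (CustomerID, OrderDate)
        INCLUDE (Amount);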

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My largest system is currently running with 3 billion rows in a single unpartitioned heap. It still works just fine as the backend to an extremely busy website.

    In terms of database size, it's growing by around 60-80 GB a week and is currently at around 6 TB.

    A few hundred million rows these days isn't that much.

  • 178M rows shouldn't be a problem per se, if you index properly.

    THE biggest factor is getting the correct clustered index.

    Get that right, and you should have no issues with that many rows on decent hardware.
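
    As a sketch of what that can look like (the table and columns are invented for illustration; a narrow, unique, ever-increasing key is a common starting point, not a universal rule):

    -- Sketch only: dbo.Orders and its columns are placeholders.
    -- A narrow, ever-increasing clustered key means new rows land at the
    -- end of the table instead of causing page splits in the middle.
    CREATE TABLE dbo.Orders
    (
        OrderID    bigint IDENTITY(1,1) NOT NULL,
        CustomerID int           NOT NULL,
        OrderDate  datetime2(0)  NOT NULL,
        Amount     decimal(19,4) NOT NULL,
        CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID)
    );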

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • The largest table I have had was 1.5 billion rows. That was fine on a server with only 4 GB of memory.

    Indexing is very important; for instance, I found it useful to separate the nonclustered indexes out onto a separate disk (a rough sketch of that is at the end of this post).

    The table was 300 GB, and that's more important than the number of rows: how much disk space the table and its indexes use.

    All of this does not mean you shouldn't minimise table space where possible, by eliminating fragmentation, archiving, or partitioning if appropriate.

    The biggest problem with it was probably maintenance (reindexing/integrity checks).
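
    If it helps, this is roughly what putting the nonclustered indexes on their own disk looks like in T-SQL (the database name, filegroup, file path, sizes, and index names are all placeholders):

    -- Add a filegroup whose file sits on a different physical disk:
    ALTER DATABASE BigDB ADD FILEGROUP NCIndexes;

    ALTER DATABASE BigDB ADD FILE
    (
        NAME = 'BigDB_NCIndexes_1',
        FILENAME = 'E:\SQLData\BigDB_NCIndexes_1.ndf',
        SIZE = 50GB, FILEGROWTH = 5GB
    ) TO FILEGROUP NCIndexes;

    -- Then create (or rebuild) the nonclustered indexes on that filegroup:
    CREATE NONCLUSTERED INDEX IX_BigTable_SomeColumn
        ON dbo.BigTable (SomeColumn)
        ON NCIndexes;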

    ---------------------------------------------------------------------

  • I'm looking at a table with close to 300 million records with proper indexes and have no problems. The key, as people have stated, is the proper clustered and nonclustered indexes for the queries run on the table.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

    Need to Split some strings? Jeff Moden's DelimitedSplit8K
    Jeff Moden's Cross tab and Pivots Part 1
    Jeff Moden's Cross tab and Pivots Part 2
