Terabyte datacenter maintenance

  • Hi,

    I just inherited a 5.4 terabyte SQL Server 2000 database last week which is the backbone for the business I joined.

    Such a large database is something completely new to me. I am wondering how the other companies do maintenance on these databases?

    Before making a case to the management I needed to get more info on the maintenance strategies for such large databases and what options are available.

    The current hardware is: a UNISYS ES 7000 server with 12 Itanium processors, 64 GB memory and 5 TB database on a SAN. There is not enough space left on the machine to take a backup locally. So, Currently the database is dumped on the series of LTO2 tapes directly and no disk backup available.

    Has anyone used any backup compression tools like sql litespeed or sql backup with this kind of set up? How do dbas run the maintenance(reindexing, dbcc checkdb) on this size database without having downtime in days?

    Any good article, case study, white paper or the suggestions from the knowledgeable friends at sql server central is appreciated.

    Thanks,

    N

  • Greetings,

    Welcome to the giant databases world.

    I can't give you precise and detailled info, but I can help you to get started.

    You will have 3 foes to tacle:

    1 - Access Performance tuneup (i.e.: partitioning, indexes)

    2 - Backups/Recovery

    3 - Maintenance (i.e.:defragment, retiring old data if possible)

    Indexes will be your best friends and foes. Analyze the current DB and make sure the indexing is at its best. I have been there many times where an unused index occupied GBs of space and I needed the time it took to reorganize that index for other things. Also, missing an index on a 100 GBs table and having a full scan is, to put it mildly, quite bad.

    2 - Plan, work the plan, test the plan and plan again your backup/restore strategy. I never really cared too much about third parties softwares because the tape array was already compressing on tapes. Here is a good/bad advice, depending of your situation: try to keep the latest full backup and days logs files (if you backup the logs) on a drive somewhere if your tapes are too slow or busy for a fast restore. But like I said, it depends of your situation.

    3 - Identify only the tables/indexes needing to be reorg and do not do anything that you do not beed to do. If you can, split the reorg during many days (for example Monday you do Table1 and tuesday table2). If you can also, the offline reorg is faster and more efficient than the online reorg. If you can do this one, it will be better.

    Bottom line, you'll have to plan more carefully than with a small database. Good luck.

    Thank you

    MBA

    MCSE, MCDBA, MCSD, MCITP, IBM DB2 Expert, I-Net+, CIW

    Proud member of the NRA

    -Anti-gun laws prevent law abiding citizens to buy guns and defend themselves against bad guys who do not care about the law and get their gun illegally.

    - Democracy is 2 wolves and one sheep talking about their next dinner. Freedom is 2 wolves and one armed sheep with a .357 magnum talking about their next dinner.

  • Thanks Grasshoper

    Backups and reindexing are the main concern for me. This is a backend database for SAP so there is nothing much we can do as far as design is concerned ( tables/index etc ). there are several tables with over 300 GB of data. So, reindexing them is a nightmare. Plus almost 12 gb of data is added almost everyday. So, SAN is running out of space too.

    I was analyzing the backup /restore strategy. All we have are tape backups. That really worries me. For me, so far, I have to have a handy verified disk backup for the fastest recovery. However, I am learning that terabyte databases are a totally different ball game alltogether. But I was still curious to see the industry practices about the backups etc from the dbas who have been dealing with this for a long time. I am still curious If someone has tried like sql backup or litespeed on this size databases and whether it works or not.

  • I meant to say thanks to kestak

  • i don't envy you on the backup retore side of your job, but at the same time i do envy the extra large db features;

    I'd love the opportunity to learn really big database management, but without another 5 gig SAN, how can you test/verify backups and restores are really working; I shudder to think of what happens on the day something goes really wrong. I've only heard good things about SQL litespeed, and hopefully

    Steve Jones might have some insight; i would not be surprised if SQLServerCentral's data is maga-terabyte sized, so maybe he''l have some suggestions.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • For Such large setups either you have a "brutal" SAN that can do split mirror or you use some sort of log shipping. If you think you can restore such large DB from tape ... think Again 🙁


    * Noel

  • You might want to have a look at this article:

    http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/sap-top-10.mspx

    for a few pointers.

  • I threw together a quick list on my old blog at http://blogs.msdn.com/sqlserverstorageengine/archive/2007/04/30/very-quick-list-of-vldb-maintenance-best-practices.aspx.

    Primary in my mind would be making sure you have the ability to recover in a disaster - both restoring the data to a point in time and being able to quickly check where the damage is.

    Unfortunately this is really a topic for a book (which Kimberly and I keep promising ourselves we'll get around to writing).

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • In addition to all posted Reply...

    Looks SAP/VLDB Database is very important to the Organisation. Hence you need to chalkout DR (Disster Recovery) Plan on the priority.

    Log shipping would be ideal option as you can put your database in Read-Only. You can run DBCC CHECKDB OR CHECKALLOCK as a database maintenance on the readonly database.This will reduce load of your production server and same time come to know the inconsistancy of the database.

    About the Lightspeed backup; it is certified by Microsoft and Microsoft itself is using to backup their SAP Databases which are in TB.

    ---------------------------------------------------
    "Thare are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • Probably the first question/answer I would seek is the expected time to recover should the shit hit the fan? As you've pointed out it would take "forever" to restore from tape should your SAN bite the big one (yes, it does happen), the question for the business then becomes "do you understand that it will take X to recover from Y?, is that acceptable to you? If not, we need to talk".

  • i can think of two important issues right now, aside from all the others raised.... how could you do a partial recovery? supposed you need to restore a backup from x days ago, and join on the accounting table or something for differences.

    also, if there's just the one SAN, is there any development going on? If there is, that would imply direct updates to your production database. I'd buy one of those 1 or 2 terabyte NAS units at a minimum so you could at least have something backup/develop against. $1,100 bucks might save your butt some day.

    http://www.tigerdirect.com/applications/category/category_slc.asp?CatId=207&

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • To me it looks like Nikki has more problems than a newcomer to terabyte-sized databases and a newcomer to the business can solve.

    The best advise I can give is to identify as many problems as possible, and get the management to put a ballpark cost on the business impact of some of the failure scenarios. Then look for a consultancy organisation with experience and credibility in managing terabyte SQL Server databases to come and offer solutions. Nikki can then learn from the consultants and the look at the problem and solutions.

    If the business impact of downtime is near $zero, there will be no money for the consultants. If it is even a few $10k per day, the consultants will appear.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • I like your order and sorry for not getting in on this earlier. Saw Pauls' name in the post list and figured he'd covered things 🙂

    Definitely figure out the backup restore first, then I'd do indexes, at least a temporary solution.

    And I agree with Ed. I'm not a VLDB person, so I'd be getting some consulting help from someone who's live the pain. Doesn't make sense to do anything else.

  • All I can say is that it is REALLY scary that a company has an untrained/inexperienced DBA managing critical infrastructure!! I have been doing this for 15 years now and I am STILL regularly shocked at how frequently companies ignore the importance of the database server(s). Get a professional to mentor you, QUICK! Take some courses and start reading books too.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Nipa also sent me email - I gave her some more links:

    - A video recording of my Secrets of Fast Detection and Recovery from Database Corruptions talk from TechEd this year

    - Kimberly's TechNet webcast series that has a bunch of HA/reliability stuff

    - Consider coming to SQL Connections in Vegas in November to see our two pre-con workshops on maintenance and HA

    - A recently published book on HA by Allan Hirt

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply