Large database question in regards to performance.

  • We currently have a database that contains 12 tables containing related information. The tables contain at least 2 years worth of data (our business rule) and range from 50k to 85 million rows. One of these tables is quite large with 900 columns.

    Each month we receive 12 data files that are imported into said tables from an AS400 system. This process may become daily with the same amount of data if not more. Currently this process takes some time to import. It used to take 15 hours to process, since rewrite takes around 6.

    My question is would partitioning improve performance? Should this data be cubed?

    The data is currently based upon a month. Currently each month's worth of data is housed in the corresponding tables. My biggest table with 900 columns takes a great deal of time to populate my thought was that if data was partitioned may improve performance.

    This process is currently a DTS package.

  • I'd really suggest looking for the areas that are slow. Then look for an answer.

    The more you are prepared, the less you need it.

  • Partitioning is built for this type of issue, but will it help? Can you separate out partitions to separate disks? It might reduce blocking on loads to load a new partition and switch it in, but queries might not work better unless you query certain partitions only.

    Cubes? Hard to say. Will pre-built aggregates help along dimensions (time/geography/etc)? Do you have SSAS skills or a separate box for this? Can you query cubes easily or will you have to rewrite applications?

  • Do you really need 2 years of info online? Is there no way to archive all but the last year's worth of info?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Actually, I can see why they may need two years of data online. Previous employer required that we maintain that much history in OLTP system (COBOL/ISAM databases). We needed it for reporting purposes for both customers and vendors.

    We actually had more than that when you also look at the AIS database I had been building during my time with them.

  • Well, let's start with the basics: is your Import process slow because it is CPU-bound or because it is IO-bound? If it is CPU-bound, then all the partitioning in the world isn't going to help.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply