Synonyms versus Partitioning for daily load?

  • Our BI team needs to perform a daily load of a customer-facing fact table from our warehouse.

    We've pretty much decided to go with a swapping technique, where yesterday's data remains available until today's populate is complete.

    The table will be 100% loaded every day, and 95% of the data will be unchanged from the day before (lack of PKs on the source system, etc, necessitates this).

    So, competing strategies to weigh:

    1) Have two tables, myTable_A, and myTable_B, and use two synonyms to refer to them. myTable_Load, and myTable_Read. They would swap every successful run, so that the SSIS package always writes to the _Load table, and the customers always use the _Read.

    2) Put a new column on the table, and partition on it. The load won't effect the partition that's already active, and then swap them out.

    Synonyms seems like a more natural approach, but as am over-building DBA, I am leaning toward partitioning!

    What are your experiences? And am I missing something obvious?

    Thanks in advance!

  • Method 1 seems more easy to set-up than method 2.

    Is it a big data load?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • There is another possibility.

    Have two tables, t1 and t2, with identical structures.

    t1 is your 'main' table. Assume that it is up to date with data.

    On the next day, load table t2. Then use sp_rename to swap t1 and t2. And repeat ...

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • I agree with Koen and Phil, if this is a complete "wipe and load" then I wouldn't bother with partitioning since it just adds complexity. The Load and rename would be the fastest way I think.

  • Thanks for the replies, all...

    Question: Is there no benefit of using synonyms over renaming the base tables themselves?

  • Personally, I like the synonym approach since it's a layer of abstraction. If I needed to move to 3 tables, or implement partitioning, or anything else, I like having a place to point people and potentially still keeping the tables separate. However, I'm not sure it's much more than personal preference.

    I also think partitioning is overkill.

  • @SixStringSQL (8/14/2014)


    Thanks for the replies, all...

    Question: Is there no benefit of using synonyms over renaming the base tables themselves?

    I like Synonyms for this type of thing for several reasons.

    Let's consider this...

    The table will be 100% loaded every day, and 95% of the data will be unchanged from the day before (lack of PKs on the source system, etc, necessitates this).

    If the data is completely rebuilt, what's happening to the log file? Chances are that your DW (or whatever) database is in the FULL recovery mode and you have Point-in-Time backups being done. That could be a shedload of data that you're backing up every day... data that can easily be reconstructed and doesn't actually need for log file backups on the construction of said tables. With that in mind, putting the data on a different database that uses simple recovery would be the way to go. You wouldn't even have to backup that new database.

    Then, there's the subject of Disaster Recovery. Making a gross understatement, the less data you have in a database, the faster you can recover it using a restore. For such a "swapping" system, one would be tempted to drop the "old" table when the "new" table is ready to help in this area but then you lose the ability to compare the "old" with the "new" for reporting as some would have you do.

    To solve all of those problems, I actually have 3 tables... "old", "new", and "empty". "Old" and "new" are in a separate database with the SIMPLE recovery model. None of that data needs to actually live in the main database so I save on backup time and restore time. The "empty" table lives in the main database and would be used during a disaster recovery if all hell broke loose. I could be recovering the main database and rebuilding the data in the other database at the same time. The main database would restore much more quickly than if all of this data were in just the main database. If the main database comes online before I manage to rebuild the data in the "new" table, then I just point the Synonym to the "empty" table in the main database and I'm back online for any functionality that doesn't use the "new" table. As soon as the rebuild process is done on the other database, it'll flop the Synonym and Bob's your uncle. You can't do that with renaming a table.

    Then there's the idea of expansion vs cost. SAN disks are relatively expensive creatures and having two copies of the same large tables (which can easily be rebuilt, require no backups, etc, etc) is a waste of valuable and sometimes expensive hardware. We made that realization and moved a lot of such data to a much less iSCSI system. People thought I was a little crazy when I insisted the reference to the "old" and "new" tables also be done using synonyms but they didn't think so when we did the move. There was zero downtime and zero code changes because I simply repointed those synonyms to the newly named tables (while the old tables were still in service and working) on the new drives, did the data rebuild like we normally would, and we were done!

    You just can't do that with simple table renames... not without changing some code. How often do things like that happen? Maybe never. But, if it ever does happen, it's bloody simple and it didn't take any extra time to write the initial code for.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm honored by everyone's responses...Thanks, superstars!

    Jeff, some very good points...

    I will weigh everyone's suggestions.

    Thanks for your help, everyone!

  • One update: Consider that MSSQL won't allow a "truncate table <synonym>"...You'll need a work-around stored proc.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply