Archiving to history without creating duplicate rows

  • Well, I agree that there are no X table locks being put on system tables in any case of BEGIN TRAN ; SELECT ... INTO ... since I did actually explicitly write that the locks were key locks...

    I did neither write that you cannot create tables while you are S/I in a transaction, I said retrieve something... I even gave the (blocked) example of SELECT * FROM INFORMATION_SCHEMA.TABLES. But if you were to say that no one ever wants to retrieve this information or do similar operations that would block on KEY locks in a system table, then I would agree. 

    Saying 'never' I still think is best. At least it leads people to think what they are doing (which, regarding DBs, seems to be one thing developers do not do) and which implications their code can have to others work.

    P.S Thanx!

  • Hi All:

    I seemed to start an interesting thread with all kinds of great information. The problem was solved with the use of several inputs. This is what seemed to work. For the move to history without creating duplicates:

    insert daily_transactions_hist

    select * from daily_transactions

    where daily_transactions.transactiondate < dateadd(d, -360, getdate())

    and not exists (Select 1 from daily_transactions_hist where daily_transactions.TransactionID = daily_transactions_hist.TransactionID)

    Then just a another step in the job that deletes the rows:

    use Immedia_Billing

    delete daily_transactions where transactiondate < dateadd(d, -360, getdate())

    go

    Thank you all for the help and assistance

    William

  • Hello!

    Just a quick question. Have you considered partitioning? If your data amout is huge maybe partitioning can offset the extra cost of a complicated query plan? And you could 'drop' your partition from which you moved the data to the history table (if you partitioning is 'daily'). Maybe even most your queries are limited by the time dimension? Then partitioning might even be beneficial for Selects...

    Regards,

    Hans

  • So you were just kidding when you wrote the following?

    quoteDuring this time you have eXclusive locks on system tables.

    ... here's another...

    quoteP.S Even when creating temporary tables you lock the system tables.

    Yep, I realize you corrected yourself later.

    Ok... (trying to lighten this up a bit ), strangly enough, I absolutely agree with the following statement you made...

    quoteAt least it leads people to think what they are doing (which, regarding DBs, seems to be one thing developers do not do) and which implications their code can have to others work.

    Had you said something like that to begin with (instead of the "Never" thing and system table locks: things), I'd have taken a bit of a different tact and added the following to boot...

    1. As a code reviewer, trying to generate an estimated execution plan in the presence of SELECT/INTO is a real inconvenience because you have to materialize all referenced tables in the code to gen the plan.  Sure, you can do a WHERE 1=0 or TOP 0 (thanks for that idea ), but the reviewer shouldn't have to search for and modify such code especially in lengthy procs for batch jobs (lengthy procs are another sore spot, but later... ) .  Explicit declarations of ALL temp tables and "working" tables at the beginning of the code allow the reviewer to simply select that portion of the code and execute it to materialize the tables.  As a Developer that needs to troubleshoot code, SELECT/INTO makes life a real pain in the patooti for similar reasons.
    2. SELECT/INTO is a combination of both DDL and DML.  Books Online states that all DDL should be accomplished at the beginning of the proc and should not be mixed with DML or multiple recompiles of the same proc may occur thereby degrading performance.  Tnis would even effect batch processes which normally have 1 recompile at the beginning, anyway, because the data has changed significantly or the stored execution plan has simply expired.
    3. Although it is possible to create other tables when SELECT/INTO is running, certain features, like many of the information schema views, will simply not return anything until the SELECT/INTO completes.  Be very careful when running long winded SELECT/INTO's as you may also interfere with code that drops and recreates objects based on their presence in the views (also, shouldn't use those views for simple object detection... use IF OBJECT_ID('objectname') IS NOT NULL, instead).  It'll also tick off the DBA if he/she uses those views alot for daily monitoring (or whatever) and has to wait while your code runs.  You and your DBA CAN regain some utility in this area by selecting from the Information Schema views using the WITH (NOLOCK) table hint with the understanding that you've just thrown ACID properties out the window and your code can withstand that shortcoming.
    4. SELECT/INTO really shouldn't be used for much of anything except for the occasional ad-hoc copy of a small table unless you can get the full benefit of not logging the data by setting the recovery mode to Simple or Bulk Load.  In most cases, this is simply not allowed as it is a database wide setting and has some other implications similar to the old "Truncate Log On Checkpoint" setting.
    5. In most cases, the use of SELECT/INTO can have some far reaching effects and should only be used when it's absolutely essential that the data transferred to the new table must not be logged.  Those "essential" times should be limited to things like the initial load of huge archive files (saves on the first backup by not logging the action) or when you need to bulk load a table into a "play area" (usually, a different database) and you want to do it as quick as possible and, again, without logging.  In both these cases, the target database should be set to either the Simple or Bulk Logged recovery mode or there's simply no reason to use SELECT/INTO because no utility is gained by using it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply