How Effective Is Your Data Load Monitoring in SQL and Python?

  • Comments posted to this topic are about the item How Effective Is Your Data Load Monitoring in SQL and Python?

  • This was removed by the editor as SPAM

  • The nice part about this article is that it is generic and the principles could be applied to any database platform. Same with the execution, with Python just executing T-SQL and inserting parameters. It's a solid technique. We did something similar but not as extensive, just having a list of tables and setting additional parameters for data load frequency, since monthly data loads destroyed the standard deviation. Very good approach and well written.

  • Agreed, nice approach, and I like that it is generic.  I'm going to offer an alternative that has worked for me a lot in the past -- the illustration is from an old SSIS package I happen to have with me, but it can be applied in any environment.  It does take a little more discipline to say "I will handle all imports using this basic pattern", but once you do it you have really flexible monitoring output capabiities.

    Integration reporting pattern

  • [accidental duplicate post, please delete]

     

    • This reply was modified 1 year, 3 months ago by  Lisa Slater Nicholls. Reason: [accidental duplicate post, please delete]
  • First, nice article.  It appears well written, is in a good "order of revelation", and includes the code in an easy to read format.  Kudos for all that and thank you for sharing.

    I do have to ask, though...

    Why not just put a nice, set-based trigger on each"ETL"  table that captures the row count and date along with the trigger action and write it to either a separate log table for each "ETL" table or to a common table for all tables?  If you elect to use one log table to each "ETL" table, the trigger could easily be built to create that table with almost no overhead so that someone wouldn't have to remember to do it.  The reporting would be super simple, as well.  You could even capture what did the action using something like the ORIGINAL_LOGIN() function.

    Most of the code being executed in Python for this is T-SQL anyway and has code that is proprietary to T-SQL so it's not going to be an instant "drop it in and run it" solution, anyway.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply