Oops, I deleted that data

  • Same here. Our developers have a template script to use as a starting point, so some of the details are provided for them. We also have a scratch database on every server in every environment so they can make "backups" of the data they are altering and have a simple recovery path.

    Scripts can be run as post-deploy steps in our SSDT deploy environment. Every database deploy pipeline includes a formal backup step at the beginning of the deploy. Granted, restoring a backup to a live database after a deploy with issues is risky at best and virtually impossible in most cases. Typically we just move forward because we can't risk the data loss or spare the time to figure it out. But in a catastrophe we know we can restore to the point before the deploy initiated.

    Granted, we are assuming the backup can be restored and we are aware of that.
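
    For illustration, that kind of scratch-database "backup" can be as simple as copying the rows a script is about to touch into the scratch database first - something along these lines (the database, table, and column names here are made up):

    SELECT *
    INTO   Scratch.dbo.Orders_pre_change_backup   -- scratch database on the same server
    FROM   dbo.Orders
    WHERE  OrderDate >= '20200101';                -- only the rows the script is about to alter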

    ------------
    Buy the ticket, take the ride. -- Hunter S. Thompson

  • Bryant McClellan wrote:

    Granted, restoring a backup to a live database after a deploy with issues is risky at best and virtually impossible in most cases. Typically we just move forward because we can't risk the data loss or spare the time to figure it out. But in a catastrophe we know we can restore to the point before the deploy initiated.

    Granted, we are assuming the backup can be restored and we are aware of that.

    The best thing we ever invested in is our DR test server - regular automated restore tests, and it's also reusable as a data playpen. I'm hoping to get our restore tests to an interval of 3 days (so that our playpen is a little more usable).

    You are quite right, though - for a huge database it might involve significant downtime, and it's really hard to sell that to the business. I know someone on here had a signature that mentioned "log backup every minute" - trust me, our warehouse and production systems would be offline for more than a day if we had to restore a full database backup and more than 1,000 log backups.
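
    To spell out why: a point-in-time restore of that shape means replaying the entire chain, roughly like this (database name, paths, and file numbers invented for illustration):

    RESTORE DATABASE Warehouse
        FROM DISK = N'D:\Backups\Warehouse_full.bak'
        WITH NORECOVERY, REPLACE;

    RESTORE LOG Warehouse
        FROM DISK = N'D:\Backups\Warehouse_log_0001.trn'
        WITH NORECOVERY;

    -- ...repeated for every log backup taken since the full backup...

    RESTORE LOG Warehouse
        FROM DISK = N'D:\Backups\Warehouse_log_1440.trn'
        WITH STOPAT = '2020-01-01T02:00:00', RECOVERY;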


    MVDBA

  • MVDBA (Mike Vessey) wrote:

    Bryant McClellan wrote:

    Granted, restoring a backup to a live database after a deploy with issues is risky at best and virtually impossible in most cases. Typically we just move forward because we can't risk the data loss or spare the time to figure it out. But in a catastrophe we know we can restore to the point before the deploy initiated.

    Granted, we are assuming the backup can be restored and we are aware of that.

    The best thing we ever invested in is our DR test server - regular automated restore tests, and it's also reusable as a data playpen. I'm hoping to get our restore tests to an interval of 3 days (so that our playpen is a little more usable).

    You are quite right, though - for a huge database it might involve significant downtime, and it's really hard to sell that to the business. I know someone on here had a signature that mentioned "log backup every minute" - trust me, our warehouse and production systems would be offline for more than a day if we had to restore a full database backup and more than 1,000 log backups.

    Having a DR server to run automated restore testing is certainly ideal, but you cannot practically do that in the midst of a production pipeline deploy. You could certainly test the backups after the fact, but during the deploy itself you have to assume the pipeline backup will suffice.

    ------------
    Buy the ticket, take the ride. -- Hunter S. Thompson

  • alex.sqldba wrote:

    Oh yah. Had this. Only to be told by the Lead Developer that it's a 'best practice' and that 'I wouldn't understand'.

    So, I made my case and then let it blow up. That was fun. And immature.

    I've gotta disagree with you on this one... it wasn't you being immature... it was the damned Lead Developer being immature.  Anyone that claims "Best Practice" and follows that up with a "wouldn't understand" comment is a completely and totally arrogant idiot.  If you can't explain something to someone that doesn't know it, then you don't know it well enough yourself.  What a jerk.

    You did absolutely the correct thing... if common sense cannot be made to prevail, then you have to let "da code an' da 'puter 'splain it" like it is, because a "man forced against his will is of the same opinion still".  Not to be confused with setting someone up to fail (which I don't do), sometimes you just have to "Give them the opportunity to fail" to make your point.  People like that Lead Developer always rise to that opportunity. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Developers could certainly have read access to production data. But if a production data change needs to occur for whatever reason, the change should be made by an ops DBA, who would document the request, the change approval, the testing of the change script, the outcome of the testing, the user approval of the test results, the production execution, and the outcome of the production execution. This is the *most basic* form of segregation of duties, even if your "ops DBA" is just a technician who isn't a developer. If your company is publicly held, or owned by a publicly held corporation in the US, this isn't even an option; it's a requirement.

    If your developers have write access to your production databases, then your production database is effectively just another development database, and you're operating far outside best practices.

  • We maintain "TableName_deleted" tables for our data tables, each of which has the same structure as the data table it reflects (except for no key fields), plus one field for the user and another for the time stamp -- these record who and when.  Triggers are set for deletes and updates so that every change or delete results in the original record(s) being backed up to the "TableName_deleted" tables, along with precisely when (using GetDate()) and by whom (using sUser_SName()).  Thus we can see every single change/delete to a record, when it was made and by whom, and go back to any version of a record we need to.

    This works great for us, but our database is not huge like the ones discussed in this article, and we don't have hundreds of millions of changes, so maybe this isn't a great method for everyone.

    I'm pretty new -- maybe temporal tables are the better way to do this same thing, but I haven't had time yet to learn just what those are.
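
    For anyone who wants to picture the mechanism, a minimal sketch of that kind of audit trigger might look like this (the table and column names are invented for illustration; the real tables obviously differ):

    CREATE TABLE dbo.Customer_deleted
    (
        CustomerID   int          NOT NULL,  -- source key value carried over as plain data, no unique constraint here
        CustomerName varchar(100) NULL,
        ChangedBy    sysname      NOT NULL,  -- who, via sUser_SName()
        ChangedWhen  datetime     NOT NULL   -- when, via GetDate()
    );
    GO

    CREATE TRIGGER dbo.trg_Customer_Audit
    ON dbo.Customer
    AFTER UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        -- "deleted" holds the pre-change version of the rows for both updates and deletes
        INSERT INTO dbo.Customer_deleted (CustomerID, CustomerName, ChangedBy, ChangedWhen)
        SELECT d.CustomerID, d.CustomerName, sUser_SName(), GetDate()
        FROM deleted AS d;
    END;

    The built-in alternative mentioned above, system-versioned temporal tables (SQL Server 2016 and later), does essentially the same bookkeeping automatically, although it records a validity period rather than the login name.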

  • Mike wrote:

    We maintain "TableName_deleted" tables for our data tables, each of which has the same structure as the data table it reflects (except for no key fields), plus one field for the user and another for the time stamp -- these record who and when.  Triggers are set for deletes and updates so that every change or delete results in the original record(s) being backed up to the "TableName_deleted" tables, along with precisely when (using GetDate()) and by whom (using sUser_SName()).  Thus we can see every single change/delete to a record, when it was made and by whom, and go back to any version of a record we need to.

    This works great for us, but our database is not huge like the ones discussed in this article, and we don't have hundreds of millions of changes, so maybe this isn't a great method for everyone.

    I'm pretty new -- maybe temporal tables are the better way to do this same thing, but I haven't had time yet to learn just what those are.

    I attempted something like that about 10 years ago.

    I created an audit table something like:

    TableName        varchar(255)
    FieldName        varchar(255)
    oldvalue         varchar(max)
    newvalue         varchar(max)
    datetimemodified datetime

    Using the PIVOT function I could get a version of any table at any given point - I don't have the code for it any more.

    It worked nicely on data sets that were not written to very often, but these days I wouldn't dream of doing it.
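
    Roughly, the point-in-time reconstruction worked along these lines - a sketch only, with invented table and column names, not the original code:

    ;WITH latest AS
    (
        SELECT  FieldName,
                newvalue,
                ROW_NUMBER() OVER (PARTITION BY FieldName
                                   ORDER BY datetimemodified DESC) AS rn
        FROM    dbo.AuditTrail                 -- the generic audit table described above
        WHERE   TableName = 'Customer'         -- hypothetical audited table
        AND     datetimemodified <= '20200101' -- the point in time we want
    )
    SELECT  p.CustomerName, p.City, p.Status   -- hypothetical columns of that table
    FROM   (SELECT FieldName, newvalue FROM latest WHERE rn = 1) AS src
    PIVOT  (MAX(newvalue) FOR FieldName IN (CustomerName, City, Status)) AS p;

    In reality the audit table also carried the row's primary key (which I left off the listing above), and the PARTITION BY would include it.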


    MVDBA

  • Table triggers will kill your performance and are generally frowned upon. If your database is small and your users are few it won't matter, but this isn't a scalable solution.

    I'm curious why you're not storing the primary key. How would you rejoin rows to related records in other tables with foreign key data that would also need to be restored? The primary key is vitally important.

  • jermitts 57867 wrote:

    Table triggers will kill your performance and are generally frowned upon. If your database is small and your users are few it won't matter, but this isn't a scalable solution.

    I'm curious why you're not storing the primary key. How would you rejoin rows to related records in other tables with foreign key data that would also need to be restored? The primary key is vitally important.

    Not sure if that was a reply to me - but I just forgot to put the PK in the post (I had to type it on the fly before heading home).

    MVDBA

  • That's interesting, @mvdba. Being a developer, I keep myself away from firefighting in the production environment. It is a nightmare for me to update values in production directly without wrapping the change in transaction statements...
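
    By that I mean the usual safety net of running the change inside an explicit transaction and only committing once the row count looks right - a trivial sketch (the table and values are just for illustration):

    BEGIN TRANSACTION;

    UPDATE dbo.Account
    SET    active_yn = 0
    WHERE  account_code = '4c';

    -- Check @@ROWCOUNT (or re-query the rows) before deciding which of these to run:
    -- COMMIT TRANSACTION;
    -- ROLLBACK TRANSACTION;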

    Aveek has been working as an Analytics Engineer for the past couple of years. His main areas of interest are SQL Server, SSIS, and SSAS. In his leisure time, he likes to travel and is also an amateur photographer.
    https://twitter.com/avikoleum
    https://www.linkedin.com/in/aveekd22/

  • Mike wrote:

    Oops, I guess I wasn't clear so I'll try again.

    The key field(s) are included in the _deleted tables, but the _deleted table does not use them as a key.  For example, if [Table1].[RecordID] is the key, then [Table1_deleted].[RecordID] does indeed exist, but it is not designated as a unique key because the same value needs to exist in multiple rows.

  • Oh, that makes much more sense! I implemented a similar strategy once when there was a complaint that an unknown person was deleting and modifying rows from outside the ERP software. They were found. 😉

  • Jeff Moden wrote:

    thisisfutile wrote:

    One takeaway I had from your post is that you use tables in another DB and reference them in the main DB via synonym.  We have a web server that references our software DB in this manner and I always wondered if speed would be impacted by such a model (I've never tested it mainly because I'm not the developer) but it seems to be performing well all of these years.  After reading your post I find myself thinking, "It can't be too bad if Jeff Moden uses it!"  Any caveats you can think of for this model?  Keep in mind, we have a 5GB database and a very low amount of web traffic (we're a small member-based company, not a Fortune 500).

    We use synonyms a lot between databases so we can keep the general functionality separate and give us lines of demarcation for all sorts of things including the isolation of large tables and similar things.  One caveat that we've run into in the past is similar to what we run into with views.  We don't generally use "Hungarian Notation" to identify different objects and so it sometimes takes people a second or two to figure out if something is a table, view, function, or synonym.  But it takes virtually no time to do such a thing.

    Another more important caveat is that you can't use DRI (FKs) between tables that live in different databases.  This hasn't been a problem for me because I've only split off the largest of tables and all of those are some form of audit or history or transaction table that has no FKs.  If I did have such a need, I'd simply make a copy of the usually very small PK table.  If you updated one of the copies, you'd have to update them all.  Extended properties on the tables are a good thing to remind folks and we do use them.  If that doesn't work for you, you could always make a "do nothing" function that has schema_binding on the table with a note in the flower box to remind people.

    An example of why we don't use any form of "Hungarian Notation" is that today's table could become tomorrow's view or synonym and vice versa.  If you think not using "Hungarian Notation" is confusing, just wait until you run into a view with a "tbl" prefix or a synonym with a "tbl" prefix.

    There is another suggestion that I have that has also helped us a lot.  If you move a table to another database, always leave an empty copy of the table in the original database (of course, it will be differently named than the original because the synonym will use the original name to keep from having to change code).  In a pinch, that allows the database to be copied without having to copy the database with the large table in it.  This works particularly well for audit tables, which usually turn out to be the largest tables in most databases and are usually not essential if you need to "get back in business" after a massive crash or when you need to make a copy of the database for Development or Test purposes.  All you need to do is rebuild the synonym(s) to point at the local table(s).

    Also, I've not done specific performance testing for the differences between hitting a local table in the same database or a table in a different database on the same instance, but I've never found a difference during performance tests of related stored procedures affected by such a change.

    Fantastic!  Thank you for the feedback.  I never considered this: "If you think not using "Hungarian Notation" is confusing, just wait until you run into a view with a "tbl" prefix or a synonym with a "tbl" prefix."  I know my brain and when I see a little "v", I immediately know it's a view and if it was indeed a synonym, I might spend way too many hours not realizing that's where it was.

    Thanks a ton for sharing.  I think 25% of all that I know I've gleaned from your posts.  It's much appreciated!
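
    For anyone following along, the "repoint the synonym" trick Jeff describes boils down to something like this (all of these object and database names are invented):

    -- Normal operation: the big audit table physically lives in a separate database,
    -- and the main database reaches it through a synonym that keeps the original name.
    CREATE SYNONYM dbo.AuditLog FOR BigTablesDB.dbo.AuditLog;

    -- The empty stand-in copy kept in the main database under a different name:
    CREATE TABLE dbo.AuditLog_Local
    (
        AuditID    int          NOT NULL,
        LoggedWhen datetime     NOT NULL,
        Detail     varchar(500) NULL
    );

    -- In a pinch (say you restored or copied only the main database), repoint the
    -- synonym at the local copy and the code keeps working, just without the history:
    DROP SYNONYM dbo.AuditLog;
    CREATE SYNONYM dbo.AuditLog FOR dbo.AuditLog_Local;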

  • I once told a friend, I aim to please... I sometimes miss but I'm always aiming.  I'm humbled by your comment.  Thank you for the very kind feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've started using MERGE for updating records because a bare UPDATE just freaks me out. It takes more thought up front, but it gets faster with practice, and it's good to get familiar with MERGE anyway.

    Something like the example below. I can highlight the inner SELECT to verify the number of records, and maybe wrap the whole thing in a BEGIN TRAN ... COMMIT as a belt-and-braces approach. Our tables all have triggers for insert, update, and delete to copy rows to corresponding history tables, but the performance impact on large databases might be unpalatable (our tables are all under a million rows).

    MERGE account AS TARGET
    USING (SELECT id,
                  COUNT(*) OVER () AS item_count,
                  0                AS active_yn,
                  account_code
           FROM   account
           WHERE  account_code = '4c') AS SOURCE (id, item_count, active_yn, account_code)
       ON TARGET.id = SOURCE.id
      AND TARGET.account_code = SOURCE.account_code
    WHEN MATCHED THEN
        UPDATE SET TARGET.active_yn = SOURCE.active_yn;
