July 26, 2012 at 5:53 am
So, I have been asked for ideas on how to restore data for only 1 customer in a multi-customer database. The business case is this: Customer does some stuff through a web application throughout the day and finds out that what they were doing was wrong or maybe a disgruntled employee does a bunch of bad things through the application. They call and want a day's data restored or they have a button or something that restores their data to a previous point in time. This does not have to be a "restore," but maybe even a way to go back to some auditing tables to restore data or a snapshot or something. Again, we do not want to restore a whole database to a point in time, but just the data for a customer within this database. Almost like a filtered point in time restore.
I am simply looking for ideas or thoughts from people who have done something similar. Once current proposal is to take the SQL data at its previous state before it was modified/deleted and store it in 1 field in a bug JSON string in our "up and coming" MongoDB (not a fan being a SQL guy). Any thoughts?
Jared
CE - Microsoft
July 26, 2012 at 6:06 am
I am simply looking for ideas or thoughts from people who have done something similar. Once current proposal is to take the SQL data at its previous state before it was modified/deleted and store it in 1 field in a bug JSON string in our "up and coming" MongoDB (not a fan being a SQL guy). Any thoughts?
Urgh. One way would be looking at creating a stored procedure which would systematically update every table that a customer could touch (through the app layer) back to a known safe value. This assumes of course that you have some audit capability on all the tables, and could cause constraint issues too. If you've only got 3 or 4 tables this could be simple, but if you have a complex schema or set of schemas this could be a nightmare.
---
Note to developers:Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.
July 26, 2012 at 7:43 am
Yeah, my first thought is to set up some auditing triggers and put some kind of identifier to allow them to restore along several different measures; i.e. a specific item or group(s) of items, or to a point in time. I think then the tricky part will be what to do with the current data. How do I know what to overwrite or keep. That will take some more architecture thought.
The other option is to not ever delete the data, but to have a deleted column with a datetime and populate it with GETUTCDATE() or something...
Anyone else?
Jared
CE - Microsoft
July 26, 2012 at 8:10 am
As far as I understand it MongoDB is a non-sql, documents oriented database engine.
In terms of restoring a subset of the database I would start by backing up subsets of it that later can be used as the data source for the restore.
Following this line of thinking, I would research "mongodump" - specifically the option of dumpin a single "collection" filtering data with a query. This would allow for a restore strategy based on deleting whatever data has to be replaced then load the particular filtered collection.
Oh well... I would love a MongoDB guy to comment on this 🙂
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.July 26, 2012 at 8:27 am
PaulB-TheOneAndOnly (7/26/2012)
As far as I understand it MongoDB is a non-sql, documents oriented database engine.In terms of restoring a subset of the database I would start by backing up subsets of it that later can be used as the data source for the restore.
Following this line of thinking, I would research "mongodump" - specifically the option of dumpin a single "collection" filtering data with a query. This would allow for a restore strategy based on deleting whatever data has to be replaced then load the particular filtered collection.
Oh well... I would love a MongoDB guy to comment on this 🙂
Yeah, the goal is to NOT use MongoDB. We have a competitiveness here that is supposed to spark ideas. So I need a solution in SQL that is better than the proposed MongoDB. So, SQL suggestions are what I need 🙂
Jared
CE - Microsoft
July 26, 2012 at 8:40 am
SQLKnowItAll (7/26/2012)
PaulB-TheOneAndOnly (7/26/2012)
As far as I understand it MongoDB is a non-sql, documents oriented database engine.In terms of restoring a subset of the database I would start by backing up subsets of it that later can be used as the data source for the restore.
Following this line of thinking, I would research "mongodump" - specifically the option of dumpin a single "collection" filtering data with a query. This would allow for a restore strategy based on deleting whatever data has to be replaced then load the particular filtered collection.
Oh well... I would love a MongoDB guy to comment on this 🙂
Yeah, the goal is to NOT use MongoDB. We have a competitiveness here that is supposed to spark ideas. So I need a solution in SQL that is better than the proposed MongoDB. So, SQL suggestions are what I need 🙂
Okay, do the same in SQL Server 😀
1- Partition your tables by "customer" and be sure there is no RI outside of the partitions of any particular "customer".
2- Export data by partition.
3- When a restore is needed then truncate the target partition and import the last knows "good backup" e.g. last partition export.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.July 26, 2012 at 8:40 am
I'm not sure this would work, but it sounds like it might. Set up the DB so that each customer in the DB is on their own filegroup, with the tables partitioned off based on a customer identifier. Then if the need to "roll back" a customer arose, do a restore of the filegroup only.
Problem is, I think you wouldn't be able to do point-in-time restores of the customer data, only restores up to the last filegroup backup.
I'm sure some of the more experienced folks here will either tell me I'm wrong, or that I'm on the right track...
Jason
And it looks like I'm not crazy, as this is what Paul B posted while I was typing...
:hehe:
July 26, 2012 at 8:46 am
PaulB-TheOneAndOnly (7/26/2012)
SQLKnowItAll (7/26/2012)
PaulB-TheOneAndOnly (7/26/2012)
As far as I understand it MongoDB is a non-sql, documents oriented database engine.In terms of restoring a subset of the database I would start by backing up subsets of it that later can be used as the data source for the restore.
Following this line of thinking, I would research "mongodump" - specifically the option of dumpin a single "collection" filtering data with a query. This would allow for a restore strategy based on deleting whatever data has to be replaced then load the particular filtered collection.
Oh well... I would love a MongoDB guy to comment on this 🙂
Yeah, the goal is to NOT use MongoDB. We have a competitiveness here that is supposed to spark ideas. So I need a solution in SQL that is better than the proposed MongoDB. So, SQL suggestions are what I need 🙂
Okay, do the same in SQL Server 😀
1- Partition your tables by "customer" and be sure there is no RI outside of the partitions of any particular "customer".
2- Export data by partition.
3- When a restore is needed then truncate the target partition and import the last knows "good backup" e.g. last partition export.
I thought of this, but there are some issues.
1. Std. Ed.
2. Partitioning on customer means that the customer id has to be in each table, and of course, not every table needs that id; i.e. customer table linked to an app table linked to app attributes...
Those are just the main issues with a partitioning approach, but the thoughts are appreciated. Brainstorming is what I am looking for here 🙂
Jared
CE - Microsoft
July 26, 2012 at 8:52 am
SQLKnowItAll (7/26/2012)
1. Std. Ed.
Do poor's-man partitioning, a table by customer exposing the bunch as a view joining all of them.
SQLKnowItAll (7/26/2012)
2. Partitioning on customer means that the customer id has to be in each table, and of course, not every table needs that id; i.e. customer table linked to an app table linked to app attributes...
Oh well... if you can't identify to which customer each tuple belongs then my feeling is that we are all wasting our time - customer based backup/restore requires (surprise!) to identify the "customer" and by identify I mean "where the heck is stored the data of this particular guy".
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.July 26, 2012 at 8:56 am
PaulB-TheOneAndOnly (7/26/2012)
SQLKnowItAll (7/26/2012)
1. Std. Ed.Do poor's-man partitioning, a table by customer exposing the bunch as a view joining all of them.
SQLKnowItAll (7/26/2012)
2. Partitioning on customer means that the customer id has to be in each table, and of course, not every table needs that id; i.e. customer table linked to an app table linked to app attributes...Oh well... if you can't identify to which customer each tuple belongs then my feeling is that we are all wasting our time - customer based backup/restore requires (surprise!) to identify the "customer" and by identify I mean "where the heck is stored the data of this particular guy".
Ugh... I hate poor man's partitioning lol I can of course identify the relationships to the customer, but that does not mean partitioning will work as such. However... I could maybe put each customer into their own schema? and build those schemas on their own filegroups? Hmm... Just thinking out loud.
Jared
CE - Microsoft
July 26, 2012 at 9:43 am
SQLKnowItAll (7/26/2012)
However... I could maybe put each customer into their own schema? and build those schemas on their own filegroups? Hmm... Just thinking out loud.
Why don't you push your "own schema" idea a bit further and put each customer on each own database? This, by design, will solve the customer based backup/restore strategy.
If the choice is to have a zillion things rather than one I would prefer to have a zillion databases - one per customer - as opposed to having a zillion filegroups - one per customer.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.July 26, 2012 at 9:53 am
PaulB-TheOneAndOnly (7/26/2012)
SQLKnowItAll (7/26/2012)
However... I could maybe put each customer into their own schema? and build those schemas on their own filegroups? Hmm... Just thinking out loud.Why don't you push your "own schema" idea a bit further and put each customer on each own database? This, by design, will solve the customer based backup/restore strategy.
If the choice is to have a zillion things rather than one I would prefer to have a zillion databases - one per customer - as opposed to having a zillion filegroups - one per customer.
Thanks Paul. I guess I just worry about changing the architecture of the current application too much. To go to multiple databases will require significant code changes for the existing framework. The proposed MongoDB solution does not require code changes, so I would lose this argument. I think I may have to look at DML triggers and auditing to do this effectively.
Jared
CE - Microsoft
July 26, 2012 at 10:52 am
at my shop we have nightly backups of our database restored to the same server under a different name. if something happens from the time of the last backup we can just do a cross database insert or update and recover the data. (One of our clients had a disgruntled employee that tried to mess up their account, since we had this we just did the mentioned cross database INSERT INTO ... SELECT and got their data back.) If you have the space this would give you some pad for this sort of thing.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
July 26, 2012 at 11:59 am
Looks like you could also use a database snapshot as the source updating the underlying database as well. Note, I'm not saying to revert to a database snapshot, but to select the specific data that needs to be used to update the underlying database.
Just a thought based on capn's idea of having a copy of the database restored under another name.
July 26, 2012 at 2:29 pm
Is this a one-off thing or a "feature" you're being asked to build? I'm not sure I'd design a database schema around the occasional need to rollback a single customer, and separate tables or filegroups for each customer can be unweildy if the business has hundreds or thousands of customers.
Are you confident you know which tables the affected records are in, and do they have modified datetimestamps? If so I'd try to restore a backup and write a script to delete the affected records and re-insert them from the backup. If this is something that needs to be done on a regular basis I'd create a snapshot and the end of the each day and run the script against that. I'd probably put it in a stored procudure with parameters for the customerid and the datetime value after which records should be replaced.
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply