November 12, 2010 at 6:30 am
Hi,
I have to design a module in an existing system wherein I need to create versioning in the database for a workflow.
I have thought of two options:
1. When the user wishes to modify the data which is in 'Approved' state after a workflow is complete, then create a copy of ALL the data with a new Version Id and status as 'Work in Progress'.
Advantage:
1. historic data will be present in teh same table.
2. not much code change in the UI.
3. Addition of just a filtering will be required in existing reports procedures (apprx50 reports).
Concern Area:
1. Copying the data would be costly on the click of a button because there are approx 70 tables containing the complete workflow details.
2. Create another table to contain the historic data and the current table(s) have the current 'Approved' record only.
Advantage:
1. A new table will contain historic information so querying in the current tables will not be costly performance wise as only current active records will be saved.
Concern Area:
1. Not too sure how I would create historic tables for approx 50 tables. 🙁
Does anyone have a better idea?
Also, what is the most efficient way of copying the data from more than 100 tables so that it doesnt take much time.
Thanks
November 12, 2010 at 7:17 am
Read about David Ziffer's recent approach to this problem here -
http://www.sqlservercentral.com/articles/software+development/71275/
Probably one of the more clever ways to do it, in my opinion. However, I wouldn't be surprised if this approach faced some scalability problems as the data grew, namely in the amount of writes that a single action could potentially do... but sounds like your alternative approach would be just as write-intensive.
November 12, 2010 at 7:25 am
This sounds like a process change rather than create mountains of duplicate data. How are you going to even know which copy belongs to the original? This just sounds like a nightmare to maintain down the road. If you have expanded all other options and are deadset on creating a copy of records from 50-100 tables you are in for a bit of coding. You can do all this fairly easily in a sproc. Maintaining this sproc could prove challenging as time goes on because you will need to modify it whenever you change any of the structures in any of your 100 or so tables. Another challenge is for "Project A" which "set" of records is correct??? What happens if "Project A" gets approved, then edited, then approved...
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
November 12, 2010 at 9:34 am
Thanks for the response.
Sean:
In the current scenario - there is one set of records being populated in multiple related tables. There is only one set of records for a workflow (after Create- Review- Approve flow) and when this undergoes the Modification process, then the records in the current Approved workflow are updated. There is no history/version. The current one is overwritten by the modified record. We need to version this.
I plan to -
1. create a copy of the records (in all related tables) of the Approved workflow and create a new version. I would definitely write a spoc for this.
2. The user would then work on this copy. We could mark the new version as 'Wok in progress'.
3. In the main table I would add two more columns - Valid From and Valid To (datetime).
4. Once the record 'Work in Progress' record get 'Approved', then I would close the previous 'Approved' record by filling in the 'Valid To' column with getdate() and I would leave the 'Valid To' column for the new record as NULL. This would imply that the record having Valid To = NULL would be the latest Approved.
Problem is that when I create a copy of the data...then that would consume a lot of time. The end user using the screen would have to wait for quite sometime. I am looking for a method wherein I can cut that time short by writing efficient queries/or if there is any other faster way to copy the data into tables.
Please let me know if you need further explanation of the problem.
November 12, 2010 at 4:10 pm
We've had to build a system that does something like this. What we did was add a table that maintains versions. Then we had a master table, think of it as a book management system with versions of books. Every table in the system maintained two columns as part of the primary key, the version number and the book number. Any set of changes that came in were given a new number from the Version table, and the updates to the table or tables were stored with the new version. Then, when you query, you're either going for a specific version or the latest. If you're going for the latest, we get the latest version for that book, then only select the max version from each table where it is equal to or less than the version we're looking for. With the keys all clustered on these values the system is extremely fast and has scaled to millions & millions of rows with no issues. Writing the code is a bit of a pain, but CTEs can help.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
November 12, 2010 at 10:59 pm
nidhis (11/12/2010)
Hi,I have to design a module in an existing system wherein I need to create versioning in the database for a workflow.
I have thought of two options:
1. When the user wishes to modify the data which is in 'Approved' state after a workflow is complete, then create a copy of ALL the data with a new Version Id and status as 'Work in Progress'.
Advantage:
1. historic data will be present in teh same table.
2. not much code change in the UI.
3. Addition of just a filtering will be required in existing reports procedures (apprx50 reports).
Concern Area:
1. Copying the data would be costly on the click of a button because there are approx 70 tables containing the complete workflow details.
2. Create another table to contain the historic data and the current table(s) have the current 'Approved' record only.
Advantage:
1. A new table will contain historic information so querying in the current tables will not be costly performance wise as only current active records will be saved.
Concern Area:
1. Not too sure how I would create historic tables for approx 50 tables. 🙁
Does anyone have a better idea?
Also, what is the most efficient way of copying the data from more than 100 tables so that it doesnt take much time.
Thanks
If you are using the Enterprise Edition, my recommendation would be to use partitioned tables and "Type II Slowly Changing Dimension" methods. The advantages there are 1) No need to create separate tables and 2) has all the advantages of having separate tables without the cumbersome recombination of data if you need to lookup the total history of rows.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 30, 2011 at 9:37 am
Grant,
I like your idea on the book management system with versions of books. But the question I have is; if part of the key contains a version representation what happens when you have a new version, but there is no changes to a given row, and you want the most current version for the entire book?
March 30, 2011 at 9:41 am
hlpearl (3/30/2011)
Grant,I like your idea on the book management system with versions of books. But the question I have is; if part of the key contains a version representation what happens when you have a new version, but there is no changes to a given row, and you want the most current version for the entire book?
If you look at the queries in the example, you only ever get the TOP value that is less than or equal to the version you're looking for. So it doesn't matter if a particular table has nothing updated since version 6 and you're on version 9. the top value that is equal to or less than 9 will be 6.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
March 30, 2011 at 11:42 am
Grant,
I agree! I think that the children would be added in descending order.
I didn't see the queries. I possible could I have the link to them.
The issue I'm dealing with is: I might have editing going on for changes 6, 7, 8 and 9. Then an implementation of change 6 that includes some from 6, 7, 8 and 9. But using the TOP value or a switch that indicates the "current" change value, it's that number and below.
March 30, 2011 at 11:55 am
Sorry, I don't get what you're getting at. The queries and the code are all available with the article[/url].
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply