October 5, 2018 at 4:03 am
Hello,
I have a requirement wherein there are 2 tables (Staging & Target) in the same database.
Everytime data is first loaded in the Staging table. Now in second run data will be again first Loaded to the Staging table. Now I want to flip the tables using SQL query in such a way that after data is loaded into the Staging table make this changes
Staging becomes(flip) Target
Target becomes(flip) Staging
So ideally we will see both tables. But in actual at a time only 1 table has latest data.
Before opting for flip tables approach I have tried the sp_rename but that results in deadlock if someone tried to query Target table while it is being dropped and getting renamed.
Example,
IF OBJECT_ID('[dbo].[Target]','U') IS NOT NULL DROP TABLE [dbo].[Target] ;
EXEC sp_rename '[dbo].[Staging]','Target';
If we use the flip approach then there will be minimal chances of a lock. I tried to understand this flip tables concept and one approach I see is, it could be done using some kind of flag setting in SQL but not sure how. Any help on this would be really appreciated.
October 5, 2018 at 7:57 am
vikasjagadale8 - Friday, October 5, 2018 4:03 AMHello,I have a requirement wherein there are 2 tables (Staging & Target) in the same database.
Everytime data is first loaded in the Staging table. Now in second run data will be again first Loaded to the Staging table. Now I want to flip the tables using SQL query in such a way that after data is loaded into the Staging table make this changesStaging becomes(flip) Target
Target becomes(flip) StagingSo ideally we will see both tables. But in actual at a time only 1 table has latest data.
Before opting for flip tables approach I have tried the sp_rename but that results in deadlock if someone tried to query Target table while it is being dropped and getting renamed.Example,
IF OBJECT_ID('[dbo].[Target]','U') IS NOT NULL DROP TABLE [dbo].[Target] ;
EXEC sp_rename '[dbo].[Staging]','Target';If we use the flip approach then there will be minimal chances of a lock. I tried to understand this flip tables concept and one approach I see is, it could be done using some kind of flag setting in SQL but not sure how. Any help on this would be really appreciated.
This is actually pretty simple. First, name the tables as something the front end and other code isn't going to use. For example, if the name of the real table should be "Target", then name the two tables "dbo.Target1' and "dbo.Target2".
Next, create a synonym named the same as if it were the real table. In this case, the synonym would be named "dbo.Target". Assuming that dbo.Target1 is the currently active table, point the synonym at it.
After that, it's easy.
1. Truncate and load dbo.Target2 with data.
2. Do you validations and whatnot to ensure the data is valid and that the table contains what's needed.
3. If step 2 passed, drop and recreate the dbo.Target synonym but point it to dbo.Target2.
4. If you need the space, you could truncate dbo.Target1 at this point but it's ok to leave it as a "way to get back to where you were" if the proverbial poo hits the fan.
5. The next time you need to run, just reverse the tables listed above. Of course, you'll need to know what the currently active table is but reading the definition of the synonym should be good enough for that. You SHOULD have the ability to do a manual override for this, as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 5, 2018 at 9:14 am
+1 one for the synonym. You could use a view as well for client access, but this is more of what a synonym is for.
October 7, 2018 at 12:34 am
Jeff Moden - Friday, October 5, 2018 7:57 AMThis is actually pretty simple. First, name the tables as something the front end and other code isn't going to use. For example, if the name of the real table should be "Target", then name the two tables "dbo.Target1' and "dbo.Target2".
Next, create a synonym named the same as if it were the real table. In this case, the synonym would be named "dbo.Target". Assuming that dbo.Target1 is the currently active table, point the synonym at it.
After that, it's easy.
1. Truncate and load dbo.Target2 with data.
2. Do you validations and whatnot to ensure the data is valid and that the table contains what's needed.
3. If step 2 passed, drop and recreate the dbo.Target synonym but point it to dbo.Target2.
4. If you need the space, you could truncate dbo.Target1 at this point but it's ok to leave it as a "way to get back to where you were" if the proverbial poo hits the fan.
5. The next time you need to run, just reverse the tables listed above. Of course, you'll need to know what the currently active table is but reading the definition of the synonym should be good enough for that. You SHOULD have the ability to do a manual override for this, as well.
Thanks for detail explanation. But maybe my requirement was not clear enough
Basically what I want to do is there will be 2 tables. data will be first loaded to Staging table, now I want the data of Staging to be in Target table.
Reason Staging table will every time have new columns/ data which won't match with Target table. And Target table is what everyone will use for querying (SELECT) .
So while process of loading data from Staging to Target is in progress there should not be any lock for users who are querying on Target table because currently I have to drop Target table and rename staging to Target.
This is what I want to achieve data of staging should be in Target without dropping Target.
So any approach in SQL server 2014 that could help me achieve my goal of not dropping the target table and getting the latest data of Staging into it will be helpful.
SYNONYM approach which you suggested seems to be a good one, But I am not sure how it will fit in my requirement.
October 8, 2018 at 7:20 am
OK, you weren't clear in the explanation.
If you are loading data from staging to target, not updating,but new data, there shouldn't be many locks. You might have minor locks in the most current pages, but they should disappear quickly. For updates, you'll get some locking/blocking, but that is what happens. If you batch updates to smaller sets of data, this will go quicker.
You can look at partitioning., but this is EE in 2014. That allows you to load a new partition and switch it into the table.
October 8, 2018 at 7:57 am
vikasjagadale8 - Sunday, October 7, 2018 12:34 AMJeff Moden - Friday, October 5, 2018 7:57 AMThis is actually pretty simple. First, name the tables as something the front end and other code isn't going to use. For example, if the name of the real table should be "Target", then name the two tables "dbo.Target1' and "dbo.Target2".
Next, create a synonym named the same as if it were the real table. In this case, the synonym would be named "dbo.Target". Assuming that dbo.Target1 is the currently active table, point the synonym at it.
After that, it's easy.
1. Truncate and load dbo.Target2 with data.
2. Do you validations and whatnot to ensure the data is valid and that the table contains what's needed.
3. If step 2 passed, drop and recreate the dbo.Target synonym but point it to dbo.Target2.
4. If you need the space, you could truncate dbo.Target1 at this point but it's ok to leave it as a "way to get back to where you were" if the proverbial poo hits the fan.
5. The next time you need to run, just reverse the tables listed above. Of course, you'll need to know what the currently active table is but reading the definition of the synonym should be good enough for that. You SHOULD have the ability to do a manual override for this, as well.Thanks for detail explanation. But maybe my requirement was not clear enough
Basically what I want to do is there will be 2 tables. data will be first loaded to Staging table, now I want the data of Staging to be in Target table.
Reason Staging table will every time have new columns/ data which won't match with Target table. And Target table is what everyone will use for querying (SELECT) .
So while process of loading data from Staging to Target is in progress there should not be any lock for users who are querying on Target table because currently I have to drop Target table and rename staging to Target.
This is what I want to achieve data of staging should be in Target without dropping Target.
So any approach in SQL server 2014 that could help me achieve my goal of not dropping the target table and getting the latest data of Staging into it will be helpful.SYNONYM approach which you suggested seems to be a good one, But I am not sure how it will fit in my requirement.
Is anything other than the loading process inserting, updating, or deleting rows in the target table?
Also, I'd seriously question having a process that requires new columns in an existing table. It'll be the source of many nightmares not only with the scheduled loading of data but with reporting as well.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 10, 2018 at 8:33 am
Steve Jones - SSC Editor - Monday, October 8, 2018 7:20 AMOK, you weren't clear in the explanation.If you are loading data from staging to target, not updating,but new data, there shouldn't be many locks. You might have minor locks in the most current pages, but they should disappear quickly. For updates, you'll get some locking/blocking, but that is what happens. If you batch updates to smaller sets of data, this will go quicker.
You can look at partitioning., but this is EE in 2014. That allows you to load a new partition and switch it into the table.
You can use Partition Switching in any edition of SQL Server from 2005 upwards, you only need EE if you want multiple partitions in a table (this is a common misconception). Personally I prefer it over the SYNONYM approach, because it guarantees the tables have to have the same structure, indexes etc. The principle is similar to the schema example provided earlier, in that you need an intermediate table to assist in the process (since you can only switch with an empty partition)
Something like:
begin tran
alter table active switch to swapbuffer
alter table staging switch to active
alter table swapbuffer switch to staging
commit
October 10, 2018 at 8:53 am
vikasjagadale8 - Sunday, October 7, 2018 12:34 AMThanks for detail explanation. But maybe my requirement was not clear enoughBasically what I want to do is there will be 2 tables. data will be first loaded to Staging table, now I want the data of Staging to be in Target table.
Reason Staging table will every time have new columns/ data which won't match with Target table. And Target table is what everyone will use for querying (SELECT) .
I would still have the Staging and Target tables have the same structure - it's going to be a requirement for pretty much every approach that boils down to a metadata switch, whether sp_rename, partition switching, schema switching or synonyms. If you want to hide those additional columns from users then have them go through a view.
October 16, 2018 at 12:13 pm
I'd still go with synonyms.
Something like:
Have table dbo.TargetActive with the current structure and data.
Create synonym dbo.Target that points to dbo.TargetActive
Then to perform load of data:
1. Check if table dbo.TargetInactive exists, if so, delete.
2. Create new dbo.TargetInactive table with new structure and Insert data
4. Drop synonym dbo.Target
5. Rename dbo.TargetActive to dbo.TargetSwap
6. Rename dbo.TargetInactive to dbo.TargetActive
7. Create synonym dbo.Target to point to dbo.TargetActive
8. Rename dbo.Target_swap to dbo.TargetInactive
October 16, 2018 at 2:36 pm
andycadley - Wednesday, October 10, 2018 8:53 AMvikasjagadale8 - Sunday, October 7, 2018 12:34 AMThanks for detail explanation. But maybe my requirement was not clear enoughBasically what I want to do is there will be 2 tables. data will be first loaded to Staging table, now I want the data of Staging to be in Target table.
Reason Staging table will every time have new columns/ data which won't match with Target table. And Target table is what everyone will use for querying (SELECT) .
I would still have the Staging and Target tables have the same structure - it's going to be a requirement for pretty much every approach that boils down to a metadata switch, whether sp_rename, partition switching, schema switching or synonyms. If you want to hide those additional columns from users then have them go through a view.
Be very aware that sp_rename has some of it's own issues as the warning it issues implies. The use of Synonyms avoid such problems.
--Jeff Moden
Change is inevitable... Change for the better is not.
October 16, 2018 at 5:17 pm
Jeff Moden - Tuesday, October 16, 2018 2:36 PMBe very aware that sp_rename has some of it's own issues as the warning it issues implies. The use of Synonyms avoid such problems.
That's why I prefer Partition Switching. Unlike Synonyms it's pretty much guaranteed to fail cleanly if anything happens that would cause code to break, such as a change in table structure, and (at least it the more recent version of SQL Server) generates very good error messages about exactly why it failed. YMMV.
October 16, 2018 at 6:36 pm
andycadley - Tuesday, October 16, 2018 5:17 PMJeff Moden - Tuesday, October 16, 2018 2:36 PMBe very aware that sp_rename has some of it's own issues as the warning it issues implies. The use of Synonyms avoid such problems.
That's why I prefer Partition Switching. Unlike Synonyms it's pretty much guaranteed to fail cleanly if anything happens that would cause code to break, such as a change in table structure, and (at least it the more recent version of SQL Server) generates very good error messages about exactly why it failed. YMMV.
Why would there be a failure if you change table structure with Synonyms? Ostensibly, it would be a planned change (as it should also be with Partition Switching). You still have two tables to worry about with Partition Switching
--Jeff Moden
Change is inevitable... Change for the better is not.
October 17, 2018 at 3:38 pm
Jeff Moden - Tuesday, October 16, 2018 6:36 PMWhy would there be a failure if you change table structure with Synonyms? Ostensibly, it would be a planned change (as it should also be with Partition Switching). You still have two tables to worry about with Partition Switching
More a belt-and-braces thing than anything else. For example an index getting added to a table to resolve a performance issue, but getting missed on the secondary table. Or a column being dropped during an import process but the consequences not being fully considered elsewhere - though that's probably less likely in the synonym case since any import probably also needs to be using synonyms for the "non active" table, or explicitly aware of the issue.
Naturally a rigorous change process and code review should highlight such problems well before production anyway, but that's true of so many things. We do this kind of thing a lot for shuffling data between servers, so the process is even more heavily automated and there's a higher chance of something getting overlooked. In those cases having the data transfer fail due to the checks inherent in a Partition Switch is preferable to an unexpected side effects of a method that allows the structure between two tables to vary. It also has the upside that developers can't accidentally bypass the mechanism and write queries against one of the underlying tables (though again code review should pick that up).
October 17, 2018 at 11:20 pm
andycadley - Wednesday, October 17, 2018 3:38 PMJeff Moden - Tuesday, October 16, 2018 6:36 PMWhy would there be a failure if you change table structure with Synonyms? Ostensibly, it would be a planned change (as it should also be with Partition Switching). You still have two tables to worry about with Partition SwitchingMore a belt-and-braces thing than anything else. For example an index getting added to a table to resolve a performance issue, but getting missed on the secondary table. Or a column being dropped during an import process but the consequences not being fully considered elsewhere - though that's probably less likely in the synonym case since any import probably also needs to be using synonyms for the "non active" table, or explicitly aware of the issue.
Naturally a rigorous change process and code review should highlight such problems well before production anyway, but that's true of so many things. We do this kind of thing a lot for shuffling data between servers, so the process is even more heavily automated and there's a higher chance of something getting overlooked. In those cases having the data transfer fail due to the checks inherent in a Partition Switch is preferable to an unexpected side effects of a method that allows the structure between two tables to vary. It also has the upside that developers can't accidentally bypass the mechanism and write queries against one of the underlying tables (though again code review should pick that up).
I guess I'd rather have a soft failure because someone forgot to add an index to one of the two tables than to have the nightly process call me at 3AM to tell me that the critical job failed because someone (that's going to get the butt kicked at 3:30AM) forgot to add an index to one of the two tables. 😛
I do, very much, have a grand appreciation for code reviews or, in the case of many companies, the serious lack of them. I can see how having a hard failure, as you mention, might serve as a good reminder to management if I were to automatically forward all notifications to their cell phones as well as my own. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
October 18, 2018 at 1:24 am
The good thing about hard failures is that they happen straight away in dev or at worst test environments, so production problems are rarer than soft failures that might sneak through and cause intermittent issues.
But agree 100% on code review being vital.
Viewing 15 posts - 1 through 14 (of 14 total)
You must be logged in to reply to this topic. Login to reply