How to a use a temp table as a destination for a dataflow in SSIS

  • I need to use data from one server to update a table on another server.  For reasons of GDPR compliance there cannot be a direct connection between these two servers so I can't use a linked server to do this in a single tsql query.  Instead I plan on using SSIS to create a temp table on the destination server, populate it with the pertinent data from the source and then use an execute sql task to update the destination table based on the temp table.

    My problem is when I come to set up the temp table as a destination.  Because the table doesn't exist at design time it's not available to use on the column mappings tab of the destination wizard.  I can find plenty of examples on line of using a temp table as a source but I can't find any that use one as a destination.

    From those examples it occurred to me that I could perhaps use a global temp table (##) but it's still not available as a destination table in the destination wizards drop down so I still can't use the mappings tab.  I'd also rather not give it global scope.  This process is intended to run as a single instance so collisions shouldn't be a problem but I'd rather not take the risk if I can avoid it.

    Similarly, I know I could use "real" table but I'd rather avoid the clutter and, again, I'd rather avoid the global scope.

    So can I use a temp table in this way?  If not I guess I'll fall back on real tables.

  • FunkyDexter - Thursday, August 16, 2018 8:08 AM

    I need to use data from one server to update a table on another server.  For reasons of GDPR compliance there cannot be a direct connection between these two servers so I can't use a linked server to do this in a single tsql query.  Instead I plan on using SSIS to create a temp table on the destination server, populate it with the pertinent data from the source and then use an execute sql task to update the destination table based on the temp table.

    My problem is when I come to set up the temp table as a destination.  Because the table doesn't exist at design time it's not available to use on the column mappings tab of the destination wizard.  I can find plenty of examples on line of using a temp table as a source but I can't find any that use one as a destination.

    From those examples it occurred to me that I could perhaps use a global temp table (##) but it's still not available as a destination table in the destination wizards drop down so I still can't use the mappings tab.  I'd also rather not give it global scope.  This process is intended to run as a single instance so collisions shouldn't be a problem but I'd rather not take the risk if I can avoid it.

    Similarly, I know I could use "real" table but I'd rather avoid the clutter and, again, I'd rather avoid the global scope.

    So can I use a temp table in this way?  If not I guess I'll fall back on real tables.

    This sounds to me like a legitimate need for a new permanent table. One way to help avoid the cluttered feeling is to put the new table into its own schema ... wrk, load, stg or whatever, to clearly mark it as a table containing only transient data.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • This sounds to me like a legitimate need for a new permanent table

      I was afraid you'd say that 🙂

    I guess permanent tables it is then.  If I really want to guard against collision I can always attach a GUID to the tablename.

    Thanks for the help

  • FunkyDexter - Thursday, August 16, 2018 8:26 AM

    This sounds to me like a legitimate need for a new permanent table

      I was afraid you'd say that 🙂

    I guess permanent tables it is then.  If I really want to guard against collision I can always attach a GUID to the tablename.

    Thanks for the help

    No problem. I'd also suggest putting the SQL which loads from the table to the target table into a proc and then calling that proc from your SSIS package. Keeping all but the simplest T-SQL out of packages is good practice.

    What sort of 'collision' are you concerned about?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • FunkyDexter - Thursday, August 16, 2018 8:08 AM

    So can I use a temp table in this way? 

    You can if you do it all in a stored procedure. 😉  It's not THAT hard to do.

    Also, don't the necessary connections you've created for SSIS violate the very rules of not having permanent connections that rule out such things as Linked Servers??? :blink:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You could, if you wanted, create the table in tempdb. Although it's not a true "temporary table", it will still only persist for as long as the server is running. As soon as the instance restarts it'll be lost.

    Personally though, like Phil suggested, I'd recommend a permanent table. I have a database I specifically use for staging tables in ETL tasks. Some of them I do generate "on the fly", but no one else apart from a couple of service accounts and DBAs have access to the database, so I know no one could be "peaking". Plus, because they arre staging tables, it doesn't matter what happens to them too much, as if the data or definition goes wonky, you can just rebuild it; that's the point of them.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thom A - Thursday, August 16, 2018 9:21 AM

    You could, if you wanted, create the table in tempdb. Although it's not a true "temporary table", it will still only persist for as long as the server is running. As soon as the instance restarts it'll be lost.

    Personally though, like Phil suggested, I'd recommend a permanent table. I have a database I specifically use for staging tables in ETL tasks. Some of them I do generate "on the fly", but no one else apart from a couple of service accounts and DBAs have access to the database, so I know no one could be "peaking". Plus, because they arre staging tables, it doesn't matter what happens to them too much, as if the data or definition goes wonky, you can just rebuild it; that's the point of them.

    I'll also add that if you're going to use real but semi-permanent tables, that you create a "Scratch" database create them in rather than using TempDB.  That "Scratch" database should also be set to the SIMPLE recovery model (to take advantage of minimal logging which virtually doubles performance of inserts) and totally excluded from any backups and index/stats maintenance. 

    And that works whether you use SSIS or Stored Procedures or a hybrid of both.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thom A - Thursday, August 16, 2018 9:21 AM

    You could, if you wanted, create the table in tempdb. Although it's not a true "temporary table", it will still only persist for as long as the server is running. As soon as the instance restarts it'll be lost.

    Personally though, like Phil suggested, I'd recommend a permanent table. I have a database I specifically use for staging tables in ETL tasks. Some of them I do generate "on the fly", but no one else apart from a couple of service accounts and DBAs have access to the database, so I know no one could be "peaking". Plus, because they arre staging tables, it doesn't matter what happens to them too much, as if the data or definition goes wonky, you can just rebuild it; that's the point of them.

    Does that mean that you are doing lots of cross-database merges?

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Jeff Moden - Thursday, August 16, 2018 8:53 AM

    FunkyDexter - Thursday, August 16, 2018 8:08 AM

    So can I use a temp table in this way? 

    You can if you do it all in a stored procedure. 😉  It's not THAT hard to do.

    Also, don't the necessary connections you've created for SSIS violate the very rules of not having permanent connections that rule out such things as Linked Servers??? :blink:

    The connections wouldn't be permanent. They would only exist while the package is running. They'd also be done by a different server working as a bridge between both instances.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Phil Parkin - Thursday, August 16, 2018 9:30 AM

    Thom A - Thursday, August 16, 2018 9:21 AM

    You could, if you wanted, create the table in tempdb. Although it's not a true "temporary table", it will still only persist for as long as the server is running. As soon as the instance restarts it'll be lost.

    Personally though, like Phil suggested, I'd recommend a permanent table. I have a database I specifically use for staging tables in ETL tasks. Some of them I do generate "on the fly", but no one else apart from a couple of service accounts and DBAs have access to the database, so I know no one could be "peaking". Plus, because they arre staging tables, it doesn't matter what happens to them too much, as if the data or definition goes wonky, you can just rebuild it; that's the point of them.

    Does that mean that you are doing lots of cross-database merges?

    No. But that's due to the nature of our vendor's application (which is running on an OS created at Cambridge University in 1978!!! TriPOS). Thus if I were to merge into the application database, it wouldn't be reflected in the application, and would simply be overwritten at the end of the week. Hence why a staging database works really well, as its not going anywhere else afterwards apart from into an xml file.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • Thom A - Thursday, August 16, 2018 9:41 AM

    Phil Parkin - Thursday, August 16, 2018 9:30 AM

    Does that mean that you are doing lots of cross-database merges?

    No. But that's due to the nature of our vendor's application (which is running on an OS created at Cambridge University in 1978!!! TriPOS). Thus if I were to merge into the application database, it wouldn't be reflected in the application, and would simply be overwritten at the end of the week. Hence why a staging database works really well, as its not going anywhere else afterwards apart from into an xml file.

    I see. Gotta love XML for data transfer.

    Apologies to the OP for slightly hijacking this thread, though I think the original question is fully answered now.

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Blimey, loads of stuff to respond to.  Thanks for all the input folks.

    A bit of background that might help (I didn't include it before because it doesn't really inform the question at hand but might help answer some of the questions you guys have asked).  GDPR means we have to "forget" a customer if they ask us to.  Ie remove all personal information from the system.  But we're also subject to legislation that means we have to retain data for a certain length of time (2 years if no contract was formed, 5 years if it went to contract) for use in legal disputes, fraud investigation etc.  So we've got a conflict there.

    The solution is that, when a customer asks to be forgotten, we copy their personal data into a "secret" (not really the right word but it'll do) repository that 's locked down and isn't readily accessible by anyone, even the DBAs.  For anyone to access it they need to have gone through a process that audits the legitimate need etc.  All access is audited and, if someone could prove we accessed it without a legitimate need we would be in breach of GDPR.  Their data is then redacted in the main system.  These two system need to sit on separate servers.

    All that's working fine but when there is a legitimate need to access the "forgotten" data, we need a process to recover the data from the repository into the main system while the dispute/fraud etc is investigated and resolved.  That's what I'm working on right now.

    What sort of 'collision' are you concerned about?

    Just that if more than one person runs the process at the same time one might drop and replace the table while the others using it.  That sort of thing.  If I chuck some sort of unique identifier in the table name I can work around that though.  Also, this recovery process will probably need to be run once in a blue moon so I'm probably worrying over nothing.

    You can if you do it all in a stored procedure.

    The logic's easy enough, it's the access that's the problem.  I can't think of away I can do it in pure sql without a linked server.  Am I missing something there?

    don't the necessary connections you've created for SSIS violate the very rules of not having permanent connections that rule out such things as Linked Servers???

    Well I'm told not.  I don't proclaim to be an expert on the legislation, though, and am working on the advice from our security expert.

    you create a "Scratch" database create them in rather than using TempDB[/quote]Yeah, we've got a "DBA workhorse" database I can use.  It's got no access at all outside of the DB team so should be isolated enough.

    The connections wouldn't be permanent. They would only exist while the package is running. They'd also be done by a different server working as a bridge between both instances.[/quote]I believe that sums up the legal situation, yes.  Like I said, though, I'm just doing what our legal champ says on this though.

    As an aside, I'm getting some really odd problems posting in the last couple of days.  The first letter of each paragraph keeps getting shoved along, and right now it looks like the colours are spontaneously changing for no obvious reason (I've got blues, blacks and now a weird orange:unsure:).  Is there a bug reporting mechanism on this forum?  I couldn't see one.

    Edit> and apparently the quote tags aint working for me either:doze:

  • Well how big are the data sets you're talking about here?  Are they small enough you can do the comparisons in flight in SSIS without needing to load a staging table?

  • FunkyDexter - Thursday, August 16, 2018 8:08 AM

    ...
    My problem is when I come to set up the temp table as a destination.  Because the table doesn't exist at design time it's not available to use on the column mappings tab of the destination wizard.  I can find plenty of examples on line of using a temp table as a source but I can't find any that use one as a destination.
    ...

    Sounds like you already have your answer, but I wanted to add my 2 cents since I've actually done this...

    This actually can be done, but it is tricky to set up and a real pain if you have to make changes in the future.  You basically have to set up your package either using a real table and switching it afterwards while avoiding re-validation or by creating a temp table in the same session long enough to set it up, then configuring the package to delay validation at runtime on all components referencing the temp table until the package has been able to recreate it.  

    Once set up it should work fine.  The problem is that every time you need to make a change, you'll have to go through the same process.  You could set up your package to create and then drop a 'regular' table to make things a bit easier.  You'll still have the same hassles but it would be easier to create the table on the fly when you need to do maintenance. 

    Personally I say it isn't worth the hassle and go with a permanent staging table instead unless you have a really compelling reason to not have a persistent table.

  • Well how big are the data sets you're talking about here? Are they small enough you can do the comparisons in flight in SSIS without needing to load a staging table?

    Actually, yes, that might be an option.  The individual datasets aren't large (maybe 20 rows, certainly never more than 50).

    The problem with that is that I'll be using the data from the repository to update the main system (the records remain in place in the main system, the sensitive data is simply overwritten).  SSIS isn't great for updates in my opinion.  I guess I could use ForEach containers wrapping Execute SQL tasks containing Update statements to iterate the recordsets but the number of fields affected would require ALOT of variables.  Across maybe 20 recordsets there's got to be a couple of hundred fields that would need addressing.  But if you can suggest a better way of handling it on the fly then I'm very open to suggestions.  I think I'll have a play with that suggestion tomorrow anyway.  It's dfinitely worth a punt.

    Overall, though, I'm pretty happy copying into permanent tables and then updating from there.  It's a compromise on where I wanted to be but it's a pragmatic one.  I can live with that.

    You basically have to set up your package either using a real table...

    I was trying that earlier today but couldn't get it to work right.  I think I might take another punt at it tomorrow because, if I can get it right, it'll get me to where I really want to be.  This might be a case of getting it to work with permy tables then revisiting to see if I can get this approach working

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply