Using parameters in Execute SQL Task

  • Hmmm...how to best summarize this...

    I'm writing an SSDT package that:
    1) Clones the schema of a source table query to a "temporary" table in the target environment (see details below)
    2) Copies data from a linked server into that table (a heap).  I will be using the Balanced Data Distributor task for improved performance.
    3) Creates indexes on the table
    4) Drops the existing target table
    5) Renames the "temporary" table to the target table name.

    "Temporary" is in quotes because it's not a #temp table, but the source table name prefixed with "zzz_", i.e. "zzz_MyTable".

    I have these project level parameters:
    SrcServer
    SrcDatabase
    SrcSchema
    TgtServer
    TgtDatabase
    TgtSchema

    I've named the package to have the same name as the source/target table name.  My hope is to just clone a working package, name it to the desired table, and it "just works".

    I have an external file that contains the query, which is something like this:

    SELECT [facility_identifier]
      ,CONVERT(VARCHAR(40),CAST(HASHBYTES('SHA1',[visit_identifier]) AS BINARY(20)),2) AS [visit_identifier_e]
      ,[procedure_type]
      ,[procedure_code]
      ,[block_num]
      ,[snap_from_date]
      ,[replica_valid_from_date]
      ,[replica_valid_to_date]
      ,[replica_current_record]
    FROM [SrcServer,12345].[SrcDatabase].[SrcSchema].[SrcTable]

    I have a stored procedure which creates the "temporary" table.  It's called like this:

    EXEC uspCloneTableWithQuery
       @SrcTableName = 'SrcServer,12345.SrcDatabase.SrcSchema.SrcTable'
    ,  @TgtTableName = 'TgtSchema.TgtTableName'
    ,  @Query = '<same query from above>'
    ,  @DropTgtTable = 1

    The stored procedure just does:

    SELECT *
    INTO TgtTable
    FROM (
       Query
    ) x
    WHERE 0=1
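
    Fleshed out, the proc is roughly this (a from-memory sketch rather than the exact definition; DROP TABLE IF EXISTS assumes SQL Server 2016+):

    CREATE PROCEDURE uspCloneTableWithQuery
       @SrcTableName VARCHAR(200)
    ,  @TgtTableName VARCHAR(200)
    ,  @Query        VARCHAR(MAX)
    ,  @DropTgtTable BIT = 0
    AS
    BEGIN
       -- @SrcTableName is in the call signature but isn't needed for the clone itself
       IF @DropTgtTable = 1
          EXEC('DROP TABLE IF EXISTS ' + @TgtTableName);   -- SQL Server 2016+
       -- WHERE 0=1 copies the column definitions but no rows
       EXEC('SELECT * INTO ' + @TgtTableName + ' FROM (' + @Query + ') x WHERE 0=1');
    END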

    What I can't work out in SSIS is:

    1) How do I reference my parameters in the code for the Execute SQL Task?  I think question marks should be used here?  So I think (???) it would be something like:

    uspCloneTableWithQuery '?.?.?.?', '?.?', '?'

    2)  The last question mark is the query, which is defined in the external file.  I'm not sure how to get that parameter populated from the external file?

    Should I just use a Script Task to 1) read the external file into memory, 2) wrap that query in the above wrapper code, 3) return the result as a variable, and 4) use that variable in a subsequent Execute SQL Task?

    Thanks...

  • No, you can't do it like that.  You need one "?" per parameter and you need to lose the single quotes from the code in (1).  If you want to pass the server, database, schema and table names separately, you need to have them as separate parameters in your stored procedure.  Otherwise, build the query using a variable and have the Execute SQL Task take that as its input.
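
    For example, if the procedure takes the full source name, target name and query as separate parameters, your SQLStatement would be something like this (untested), with one variable per "?" mapped by ordinal (0, 1, 2) on the Parameter Mapping tab:

    EXEC uspCloneTableWithQuery ?, ?, ?

    You can't stitch several "?"s together inside one quoted string - build the dotted names in an SSIS expression (or inside the procedure) instead.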

    As an aside, please take some time to read about SQL injection.  Unless you've taken precautions that aren't evident in your post, your solution is very vulnerable.

    John

  • Hi John,

    Thanks for your reply.  Re: SQL injection, I'm trying to isolate the post to the problem at hand, and not bog down the forum with code not germane to my problem.  Rest assured, I'll consider SQL injection in the final solution.

    I've got this test code in an Execute SQL Task, no result set:

    declare @SrcTableName varchar(100)
    declare @TgtTableName varchar(100)
    declare @sql varchar(max)

    select @SrcTableName = ?
    select @TgtTableName = ?

    begin try
    SELECT @sql = BulkColumn
    FROM OPENROWSET(BULK 'C:\Temp\test.sql', SINGLE_CLOB) x;  -- SINGLE_CLOB reads the file as varchar; SINGLE_BLOB would return varbinary
    end try
    begin catch
    -- note: an empty catch swallows any error and leaves @sql NULL
    end catch

    select @sql = REPLACE(@sql, '{SrcTableName}', @SrcTableName)
    select @sql = REPLACE(@sql, '{TgtTableName}', @TgtTableName)

    exec uspCloneTableWithQuery @SrcTableName, @TgtTableName, @sql

    I've defined the parameters SrcTableName and TgtTableName.  In Parameter Mapping I've mapped them to 0 and 1.

    I've tried setting a breakpoint on OnPostExecute, and changed the SP to just print out the parameters.  I need to read up on debugging techniques in SSDT.

    I feel like I'm close, but it's still failing...

  • Failing in what way - error messages, wrong results, just doesn't do anything?  Please post an example of the @sql that you get from test.sql and the full definition of uspCloneTableWithQuery.  Have you tried replacing the "?"s with literals and running the code you posted in SSMS?
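
    For example (using the names from your earlier posts):

    select @SrcTableName = 'SrcServer,12345.SrcDatabase.SrcSchema.SrcTable'
    select @TgtTableName = 'TgtSchema.TgtTableName'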

    John

  • I've made a bit of progress on this today, but am still having problems.

    To recap, what I need to do is:
    1) I have a particular query that A) contains a call to HASHBYTES (AFAIK that can't be duplicated in SSDT), and B) runs against a source table on a remote linked server.  See file attachment Copy_ED_PROCEDURE.sql.
    2) I need to clone the structure of that query to a table named [content].[zzz_ED_PROCEDURE] on another server.  See file attachments Util_Clone_Table.sql and uspCloneTableWithQuery.sql.
    3) Once the empty table is created, use the Data Flow task to run that query, using the Balanced Data Distributor task for better performance, to populate the target table.
    4) Create indexes on that table.
    5) If no errors to this point, drop the original table, and rename this table to the original table (rename without the "zzz_" prefix).

    Currently, I've got the query defined in an external file following a naming convention (i.e. Copy_ED_PROCEDURE.sql, where ED_PROCEDURE is the package name).  I'm using a Script Task to read that file into a variable.  This is working.  See file attachment ReadQueryIntoVariable.cs.
    The query has (or may have) an embedded token {SrcTableName}.  I need to resolve that token.  I could have resolved that in the script task, but preferred to keep the script task simple, and use T-SQL to resolve that token.  See file attachment Util_Resolve_Tokens.sql.

    The Resolve Embedded Tokens step is failing with errors.  See file attachment errors.txt.  I've configured the Execute SQL Task with a single-row result set, and mapped the results SrcTableName, TgtTableName, and Query on the Result Set tab to corresponding variables.  If I remove the mapping for Query, the error goes away.  Perhaps the CRLF in the variable confuses SSDT???

    I had the SQL Profiler on when I ran this code.  See file attachment SQLProfile_Resolve_Text.txt (edited).  This code runs fine and as expected if I paste it in SSMS, so there must be a misconfiguration or issue in SSDT (???).

    I'm happy to entertain other approaches to this problem (and have tried several of them).  

    What I don't want is to have to maintain that query twice, once to create the empty table and again to load the data.  From my Googling, there doesn't appear to be a "clone table structure" task in SSDT.  (Is there a separate "clone table structure" add-on for SSDT?)

    Lastly, I've attached a screenshot of my current package.

    Thanks...

  • You don't appear to have attached errors.txt.  In SQLProfile_Resolve_Text.txt, are the last few lines (starting with the line that begins with SELECT) the contents of the @Query variable?  Have you tried using varchar(2000), say, instead of varchar(max) for that variable?

    John

  • Hi John, thanks for your perseverance, much appreciated.  I've attached the error.txt from yesterday.  Sorry about that.

    However, I think I need to step back a bit, and ask some more basic questions...

    What I want to do is this:

    1) Delete an existing table if it exists.  I have a stored procedure (SP) uspDropTable for this.  So, I can do Control Flow --> Execute SQL Task --> Direct Input --> EXEC uspDropTable 'content.zzz_<TargetTable>' for this.  So far, so good.

    2) Load data from one server to another using a query.  The query is like:

    SELECT [key_column]
      ,CONVERT(VARCHAR(40),CAST(HASHBYTES('SHA1',[hashed_column]) AS BINARY(20)),2) AS [hashed_column_e]
      ,[other_columns]
    FROM [server].[database].[dbo].[<SourceTable>]

    Currently the code (a Powershell script) runs on my "local" server, attaching to a linked server via the four-part name.

    The HASHBYTES column is the problem column giving me fits.  (Well, fits because I need to learn more about SSIS).

    It occurred to me that, with SSIS, it shouldn't matter if the servers are linked or not, as long as I can connect to source and target.  Once I learn SSIS parameterization/environments, this should make it easier to move between DEV and PROD environments.

    So, for Step 2, I'm using a Data Flow task:  OLE DB Source to remote server --> SQL command --> Embedded query like above --> OLE DB Destination.

    However, since the target table does not exist, I need to click New.

    Questions:

    1) Can I have a Destination table that does not yet exist?  Can't SSIS create a table "under the covers" from the query?  Analogous to:

    SELECT * INTO <TargetTable> FROM (SELECT <columns> FROM <SourceTable> ) x WHERE 0=1

    2) Our current process is to create a "temp" table named "schema.zzz_<TargetTableName>", load the temp table, create indexes, drop TargetTableName, and sp_rename 'schema.zzz_<TargetTableName>' to '<TargetTableName>'.  Would a better/best-practice approach be to truncate the target, drop its indexes, load the target, and rebuild the indexes?  The reason we're doing our current approach is that, if the (very long running - 18-24 hours) load fails, the original tables are still in place for end users to work with.
    3) Why doesn't the OLE DB Source Editor (and therefore the Create New table window) preserve the column order from the query in the OLE DB Source?  If I go to the Columns tab, the columns are out of order, with no way to reorder them.  If I click Preview, they are in the correct order.
    4) Is there a way to change the font in the Create Table, SQL Statement, etc. windows to a monospaced font?  And Ctrl-A doesn't work in these windows.  Hopefully Microsoft will give these clunky-looking windows some attention in a future release.

    Back to my numbered steps (steps 3 and 4 are sketched below):  3) Post-processing.  Alter columns to NOT NULL (the source data schema isn't the best) and create indexes.

    Is this another Execute SQL Task?

    4) Drop target table, rename temp table to target table.

    Is this another Execute SQL Task?
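
    Here's how I'm picturing steps 3 and 4 as a single Execute SQL Task - just a sketch, with placeholder column and index names, using the zzz_ convention from earlier:

    ALTER TABLE content.zzz_TARGET_TABLE ALTER COLUMN key_column INT NOT NULL;  -- placeholder column
    CREATE INDEX IX_TARGET_TABLE_key ON content.zzz_TARGET_TABLE (key_column);  -- placeholder index
    EXEC uspDropTable 'content.TARGET_TABLE';
    EXEC sp_rename 'content.zzz_TARGET_TABLE', 'TARGET_TABLE';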

    Ideally, I will get one package working, then clone it for the 35 other tables we have to process.  Even better would be to just name the package to match the source table and have it "just work".  To that end, once I get this working, I may look to keep the Copy query and Post-processing in external files, use a Script Task to read them into variables, and use those variables to make the code more dynamic.

    Sorry for all the basic questions.  I have invested many hours Googling, YouTubing, etc on SSIS/SSDT, and continue to do so.  

    To that end, if there are good books or websites for learning SSIS and SSDT, please let me know.

  • It occurred to me that I may be making this too hard...

    Since the final target exists at run time, and it has the same schema as the query, I can simply:

    1) Drop the temp table:  Execute SQL Task:  
    EXEC uspDropTable 'content.zzz_TARGET_TABLE'

    2) Create the empty temp table from the current target table:  Execute SQL Task:
    SELECT *
    INTO content.zzz_TARGET_TABLE
    FROM content.TARGET_TABLE
    WHERE 0=1
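
    (As I understand it, SELECT INTO like this copies just the column definitions - no indexes or constraints - which suits me, since the indexes get created in post-processing anyway.)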

    While this isn't exactly in sync with the remote source table query, that query and the resultant schema rarely change.  When/if they do, I can sync things up outside SSIS.

    3) Now that I have the empty temp table, I have a Data Flow task:
    A) OLE DB Source:  Remote Server:  SQL Command:  Embedded SQL Query per my last post
    B) OLE DB Destination:  Local Server:  Table or view - fast load:  content.zzz_TARGET_TABLE

    This is working ok.  Now I just have to wrap my head around post-processing, then try to parameterize this so I can just clone the package, name it the same as TARGET_TABLE, and it just works.

    One more question:  is there a way to run all the packages in a solution?  I'm also happy to write a Powershell or CMD file using DTEXEC to do this.

  • Scott In Sydney - Thursday, August 10, 2017 1:48 AM

    One more question:  is there a way to run all the packages in a solution?  I'm also happy to write a Powershell or CMD file using DTEXEC to do this.

    Do they need to run in a specific order, concurrently, or does it not matter at all (so each time the packages might execute in a different order)?

    Thom~


    You're getting the error because you're trying to force a result set into a scalar data type.  Result sets can only be fed into a variable with data type Object.  But I don't think you need a result set at all, since you are already assigning each value in the result set to a variable.  All you need to do is map each of those to an output parameter in the Execute SQL task, and you're done.
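
    For example, if you wrap the token resolution in a stored procedure (uspResolveTokens here is just a hypothetical name) with @Query declared as an OUTPUT parameter, the SQLStatement becomes:

    EXEC uspResolveTokens ?, ?, ? OUTPUT

    with the third parameter's Direction set to Output on the Parameter Mapping tab.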

    With all that said, do you really need to do this with SSIS?  Since you have a linked server, you don't need a Data Flow task, and so you should be able to do it all with T-SQL.  Reading from the text file into a variable may be a little tricky, but you should be able to get round that using bcp or a couple of lines of PowerShell.

    John

  • Thom A - Thursday, August 10, 2017 2:13 AM

    Scott In Sydney - Thursday, August 10, 2017 1:48 AM

    One more question:  is there a way to run all the packages in a solution?  I'm also happy to write a Powershell or CMD file using DTEXEC to do this.

    Do they need to run in a specific order, concurrently, or does it not matter at all (so each time the packages might execute in a different order)?

    For this scenario, all jobs are independent.  Concurrent processing, in any order, is fine.  However, when I ran all 35 jobs concurrently via Powershell and the .NET SqlBulkCopy class, it filled the transaction log.  I need to learn how SSIS processing affects the transaction log.

    In the future, if I can convince my management to use SSIS for our ETL needs, then I'll need to be able to run the jobs in a particular order.
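
    In the meantime, it looks like deployed packages can also be kicked off from T-SQL via the SSIS catalog - something like this per package (a sketch assuming project deployment to SSISDB; the folder and project names are placeholders):

    DECLARE @execution_id BIGINT;
    EXEC SSISDB.catalog.create_execution
         @folder_name  = N'MyFolder'          -- placeholder
    ,    @project_name = N'MyProject'         -- placeholder
    ,    @package_name = N'ED_PROCEDURE.dtsx'
    ,    @use32bitruntime = 0
    ,    @execution_id = @execution_id OUTPUT;
    EXEC SSISDB.catalog.start_execution @execution_id;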

  • John Mitchell-245523 - Thursday, August 10, 2017 2:18 AM

    You're getting the error because you're trying to force a result set into a scalar data type.  Result sets can only be fed into a variable with data type Object.  But I don't think you need a result set at all, since you are already assigning each value in the result set to a variable.  All you need to do is map each of those to an output parameter in the Execute SQL task, and you're done.

    With all that said, do you really need to do this with SSIS?  Since you have a linked server, you don't need a Data Flow task, and so you should be able to do it all with T-SQL.  Reading from the text file into a variable may be a little tricky, but you should be able to get round that using bcp or a couple of lines of PowerShell.

    John

    The way I'm doing it now is via a Powershell script I wrote using the .NET SqlBulkCopy class.  The main SQL to run (Copy and Post-processing) is in external files with replaceable parameters ({0}, {1}, etc).  It's all wrapped in PS code with multithreading capability.

    Even though I wrote it, I admit it's complex and hard to maintain.

    A SQL Server consultant came in (for another team), but showed me SSDT and recommended it for this work.  Initial investigation shows it could be much faster than the current approach, plus easier to maintain.  If I can skill up on SSDT/SSIS, it may be a good solution.  Although lack of SSIS skills among other team members (as well as my own burgeoning skills!) is a risk.

    Moving forward, I have a large ETL project this year.  I can do the work in SAS (which is my core strength), but the SAS code is primarily just "wrapper code" around SQL (referred to as explicit pass-through - it's just passing the code to SQL Server via ODBC, a bit like submitting SQL via Powershell; SAS is just another client).  I think SSIS could be a good fit for this work, supplemented with SAS when needed.

