trying to change data before importing into table

  • Hi all,

    Having some fun with an excel import.

    I have a excel sheet that has data that needs to be UpSert'ed into 2 different tables. In addition, I need to use a value in the spreadsheet to determine the PK from a reference data table, for one of the UpSert oprations.

    That is all working now.

    The thing I'm struggling with is something I am sure is quite simple - but I'm not seeing a solution from attempts, googling or BOL.

    2 of the columns I receive have either nothing, or X in them. The columns they go into are defined as BIT, NOT NULL.

    So, in SQL it would be something relatively simple like:

    CASE

    When IsAvailable = 'X' then 1

    When IsAvailable is null then 1

    ELSE 0

    end

    But I don't know how to do this to data that was in a spreadsheet, and now is a resultset being handed from a task to another task.

    to outline my current solution:

    ---- table 1 = this all works -------------

    Excel Source --> MultiCast (For Table 1)-->Data conversion for table1-->:Sort for Table1--> Merge Join for table 1 (left Outer join) as 'left' leg

    Table1 Source --> Sort Table1 --> Merge Join for table 1 (left Outer join) as 'right' leg

    Merge Join for table 1 --> Conditional Lplit for table1

    Conditional Split for table1 (table1 source PK is null) -->Insert Into Table1 Destination

    Conditional Split for table1 (table1 source PK is not null) -->Update Table1 OLE DB Command

    ---- table 2 = this needs to be able to convert X/NULL to BIT -------------

    MultiCast (For Table 2)-->Copy Column for Table2 -->Data Conversion for Table 2-->table3 lookup to get FK-->Sort for Table2 merge-->Merge Join for table 2 (left outer join) as 'left' leg

    Table2 Source --> Sort Table2 --> Merge Join for table 2 (left outer join) as 'right' leg

    Merge Join for table 2 --> Conditional split for table 2

    Conditional split for table 2(table2 source PK in null) -->insert into table 2

    Conditional split for table 2(table2 source PK in not null) -->update table 2 ole db command

    -----------------------------------------

    Now, if I correct the spreadsheet to have 0's and 1's in the two column, then the process above works. But I cannot (yet) force business to do that.

    If tried to use SQL Command for the excel source, but there is limited functionality on the command - I cannot do SQL coalese, isnull or case statements, which would allow me to resolve that data at source.

    I've tried to use derived columns to alter the columns. I think that the REPLACE (IsAvailable, VariableContainingX,VariableContaining1) might work to change X's to 1, but that doesn't resolve the NULL issue.

    I've tried to use a script component to handle the conversion - which REALLY feels like a bad way to do this - the .Net script is wrote was:

    -------------.net script code-------------

    Imports System

    Imports System.Data

    Imports System.Math

    Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

    Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

    Imports Microsoft.SqlServer.Dts

    Public Class ScriptMain

    Inherits UserComponent

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    If Not (Row.EndOfRowset) Then

    If (Row.IsDotComVanEnabled_IsNull) Or (Row.IsDotComVanEnabled.Equals("X")) Then

    Row.IsDotComVanEnabled = "1"

    End If

    If (Row.IsStoreCollectionEnabled_IsNull) Or (Row.IsStoreCollectionEnabled.Contains("X")) Then

    Row.IsStoreCollectionEnabled = "1"

    End If

    Row.NextRow()

    End If

    End Sub

    End Class

    ------------------------------------------

    I also tried it like this:

    ----------------1st draft .Net script code--------

    Imports System

    Imports System.Data

    Imports System.Math

    Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

    Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

    Imports Microsoft.SqlServer.Dts

    Public Class ScriptMain

    Inherits UserComponent

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    If (Row.IsDotComVanEnabled_IsNull) Or (Row.IsDotComVanEnabled.Equals("X")) Then

    Row.IsDotComVanEnabled = "1"

    End If

    If (Row.IsStoreCollectionEnabled_IsNull) Or (Row.IsStoreCollectionEnabled.Contains("X")) Then

    Row.IsStoreCollectionEnabled = "1"

    End If

    End Sub

    End Class

    -------------------------------------------------

    Those both threw the following error;

    -------------error information-----------------

    Script Component has encountered an exception in user code:

    PROJECT NAME: ScriptComponent_a333140d269b413bb1bddef390da7e16

    Object reference not set to an instance of an object.

    at ScriptComponent_a333140d269b413bb1bddef390da7e16.ScriptMain.Input0_ProcessInputRow(Input0Buffer Row)

    at ScriptComponent_a333140d269b413bb1bddef390da7e16.UserComponent.Input0_ProcessInput(Input0Buffer Buffer)

    at xScriptComponent_a333140d269b413bb1bddef390da7e16.UserComponent.ProcessInput(Int32 InputID, PipelineBuffer Buffer)

    at Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost.ProcessInput(Int32 inputID, PipelineBuffer buffer)

    -----------------------------------------------

    Any ideas?

  • Have also been having fun importing (165 column) workbook (date columns!  aargh!).

    Anyways, to your problem: much simplified once you have the following:

    1. The use of the Conditional "?" i.e. IF... Then...

    2. The knowledge that the statement used in the "Condition" of a Conditional Split or the "Expression" of a derived column is processed in left-to-right order (or in order of parentheses as normal) - and this enables you to get round your NULL issue:

    So, the equivalent of the SQL

    CASE

    When IsAvailable = 'X' then 1

    When IsAvailable is null then 1

    ELSE 0

    would be (something like) ...

    ISNULL([IsAvailable]) ? TRUE : [IsAvailable]=="X"

    ... in the Expression box of a Derived Column.  I think I've gotten the syntax right - note the use of '==' and double-quotes (it's all a bit .NET...).

    The logic works so that it goes...

    'Is the field NULL?  If yes, then return TRUE, if it's not NULL I can now evaluate the actual content of the field and I'm not going to throw a wobbly because the result comes out NULL.  This last bit is particularly an issue if using it for the Conditional Split as the result must be True/False.

    Hope this helps.  If I've got the syntax wrong, keep at it as this will definitely yield you a result - and it is "Data Stream friendly" - i.e. it's quick and optimised for SSIS.  And it's one line!

  • I was having the same problem and found that you need to check like this *before* you try to do something to the column:

    If Not Row.Column_IsNull Then

    'Do Something here

    End If

    This will avoid the Object not set error....

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply