Checking other 5 other columns for change and if they have set field null

  • Hi ,

    Columns in source

    Address1

    Address2

    Address3

    Address4

    Address5

    Columns in destination

    Address_1

    Address_2

    Address_3

    Address_4

    Address_5

    City

    I'm using a conditional split to checking if the data has changed between the source file and the destination table for scd2 history.

    But I also need to check if any of the address columns 1-5 have changed for the city column and if they have set the city field to null, I'm hoping to avoid script compent since I don't know c#, is this possible with a derived column?

    Thanks for the help.

  • Can you write some pseudo code that illustrates what you are trying to accomplish?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Sample SQL script below

    Before checks -

    Columns from Source

    Address1 = 67 office

    Address2 = NULL

    Address3 = Reading road

    Address4 = NULL

    Address5 = NULL

    Columns from destination

    Address_1 = 67 office

    Address_2 = NULL

    Address_3 = Reading road

    Address_4 = NULL

    Address_5 = NULL

    City_d = London

    I used conditional split in SSIS to check for changes between the source and destination column for any checks between columns using the expression below in the conditional split –

    ((IsNull([Address1]) ? "?^$@" : [Address1]) != (IsNull([Address_1]) ? "?^$@" : [Address_1]))|| ((IsNull([Address2]) ? "?^$@" : [Address2]) != (IsNull([Address_2]) ? "?^$@" : [Address_2]))||

    ((IsNull([Address3]) ? "?^$@" : [Address3]) != (IsNull([Address_3]) ? "?^$@" : [Address_3]))||

    ((IsNull([Address4]) ? "?^$@" : [Address4]) != (IsNull([Address_4]) ? "?^$@" : [Address_4]))||

    ((IsNull([Address5]) ? "?^$@" : [Address5]) != (IsNull([Address_5]) ? "?^$@" : [Address_5]))

    Which compares if any column values have changes since the last load –

    If after using the conditional spilt and any of the address columns 1-5 have changed then the City column should be NULL.

    For example-

    After checks -

    Columns from Source

    Address1 = 68 office

    Address2 = NULL

    Address3 = Reading road

    Address4 = NULL

    Address5 = NULL

    Columns from destination

    Address_1 = 67 office

    Address_2 = NULL

    Address_3 = Reading road

    Address_4 = NULL

    Address_5 = NULL

    City_d = London

    When copied into the destination table the result would look like below –

    Address_1 = 68 office

    Address_2 = NULL

    Address_3 = Reading road

    Address_4 = NULL

    Address_5 = NULL

    City_d = NULL

    The City column should be NULL because address1 does not match address_1.

    Hope that helps.

    Thanks

    create table dbo.AddressSource

    (Address1 varchar(50),

    Address2 varchar(50),

    Address3 varchar(50),

    Address4 varchar(50),

    Address5 varchar(50))

    GO

    Insert into dbo.AddressSource (

    Address1,

    Address2,

    Address3,

    Address4,

    Address5

    )

    VALUES (

    '67 office',

    NULL,

    'Reading road',

    NULL,

    NULL

    )

    Go

    create table dbo.AddressDest

    (Address_1 varchar(50),

    Address_2 varchar(50),

    Address_3 varchar(50),

    Address_4 varchar(50),

    Address_5 varchar(50),

    City varchar(50),

    WANTED_RESULTS varchar(200)) /* IF ANY OF THE ADDRESS1-5 LINE CHANGE or not change

    and how it affects the CITY column*/

    GO

    --the result should only bring back one row- there two to show wanted results -

    INSERT INTO AddressDest (Address_1,Address_2,Address_3,Address_4,Address_5,City,WANTED_RESULTS)

    VALUES ('68 office',NULL,'Reading road',NULL,NULL,NULL,'one of the address column has changed the City is now NULL');

    INSERT INTO AddressDest (Address_1,Address_2,Address_3,Address_4,Address_5,City,WANTED_RESULTS)

    VALUES ('67 office',NULL,'Reading road',NULL,NULL,'London','nothing has changed in address1-5 the previous value will be used so London'

    )

  • No one got any ideas?

  • Using your conditional split, you have one output for when the has changed. Connect this to a derived column and set the city column to NULL.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • You could also use a 3rd party custom component such as the Checksum component from Konesans to calculate and store a checksum for the five address columns you are interested in and then compare the checksum of the source columns to the checksum of the destination columns; if the checksum values are different then at least one of your source columns has changed.

    Disclaimer: I am not connected to Konesans in any way.

    Regards

    Lempster

  • Lempster (4/4/2014)


    You could also use a 3rd party custom component such as the Checksum component from Konesans to calculate and store a checksum for the five address columns you are interested in and then compare the checksum of the source columns to the checksum of the destination columns; if the checksum values are different then at least one of your source columns has changed.

    Disclaimer: I am not connected to Konesans in any way.

    Regards

    Lempster

    You can also implement a hash yourself using a script component:

    Using hash values in SSIS to determine when to insert or update rows[/url]

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Lempster (4/4/2014)


    You could also use a 3rd party custom component such as the Checksum component from Konesans to calculate and store a checksum for the five address columns you are interested in and then compare the checksum of the source columns to the checksum of the destination columns; if the checksum values are different then at least one of your source columns has changed.

    Disclaimer: I am not connected to Konesans in any way.

    Regards

    Lempster

    This is a good solution. For good performance, you should consider persisting the checksum as a new column in the target table(s).

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Koen Verbeeck (4/4/2014)


    You can also implement a hash yourself using a script component:

    Using hash values in SSIS to determine when to insert or update rows[/url]

    Yes and that would be my preferred way of doing it because you can use reduce the chance of hash collisions by using a stronger hash such as SHA1....but...the OP specifically stated that they didn't want to go down the scripting route. 😉

    Thanks for the link btw - I thought I had that saved to my favourites, but obviously not.

  • Lempster (4/4/2014)


    ....but...the OP specifically stated that they didn't want to go down the scripting route. 😉

    Yeah, but c'mon, all the c# code is already in the article 🙂

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Phil Parkin (4/4/2014)


    Lempster (4/4/2014)


    You could also use a 3rd party custom component such as the Checksum component from Konesans to calculate and store a checksum for the five address columns you are interested in and then compare the checksum of the source columns to the checksum of the destination columns; if the checksum values are different then at least one of your source columns has changed.

    Disclaimer: I am not connected to Konesans in any way.

    Regards

    Lempster

    This is a good solution. For good performance, you should consider persisting the checksum as a new column in the target table(s).

    Absolutely Phil, hence I stated '..calculate and store...';-)

    Lempster

  • Lempster (4/4/2014)


    Phil Parkin (4/4/2014)


    Lempster (4/4/2014)


    You could also use a 3rd party custom component such as the Checksum component from Konesans to calculate and store a checksum for the five address columns you are interested in and then compare the checksum of the source columns to the checksum of the destination columns; if the checksum values are different then at least one of your source columns has changed.

    Disclaimer: I am not connected to Konesans in any way.

    Regards

    Lempster

    This is a good solution. For good performance, you should consider persisting the checksum as a new column in the target table(s).

    Absolutely Phil, hence I stated '..calculate and store...';-)

    Lempster

    I should have read your post more carefully! The lure of the Friday-afternoon beer is already proving distracting :exclamation:

    The absence of evidence is not evidence of absence.
    Martin Rees

    You can lead a horse to water, but a pencil must be lead.
    Stan Laurel

  • Koen Verbeeck (4/4/2014)


    Lempster (4/4/2014)


    ....but...the OP specifically stated that they didn't want to go down the scripting route. 😉

    Yeah, but c'mon, all the c# code is already in the article 🙂

    😀 I know, but then again there is something to be said for 'if you don't know how something works, don't use it (in Production)' or words to that effect.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply