Update flag on table 1 based on values in table 2

  • Good morning out there,

    I'm not very experienced with BIDS and SSIS yet, so I'm hoping that someone can shed some light on where to even begin to resolve my issue.

    I have 2 seperate tables as follows:

    Table 1: stats

    Location | Date | a_temp |b_temp | c_temp | flag

    Ontario 20090701 22.345 26.52 10.2 V

    Table 2: conditionvariables

    Field | min | max

    a_temp 20 25

    b_temp 25 26.8

    c_temp 10 12.342

    Basically the second table holds the acceptable range of values for items (a_temp, b_temp, c_temp). If all of the values in Table 1 are in the valid range, the flag will be V for valid. If any of the values are outside of the acceptable range, the flag will be I (for Invalid).

    I'm slowly trying to use a script component but it hasn't worked out so well. I'm not too sure which transformations I should be using in my data flow.

  • Are these tables in the same database?

  • Hi Jack,

    Sorry I should've mentioned that. Yes, both tables are in the same database.

  • Okay, then if you have to use SSIS I'd use an Execute SQL Task with code like this:

    -- create test table variables

    DECLARE @stats TABLE (location VARCHAR(25), DATE DATETIME, a_temp FLOAT, b_temp FLOAT, c_temp FLOAT, flag CHAR(1))

    DECLARE @conditionvariables TABLE(field VARCHAR(10), min FLOAT, max FLOAT)

    -- load a few test rows

    INSERT INTO @stats (

    location,

    DATE,

    a_temp,

    b_temp,

    c_temp,

    flag

    )

    -- pass

    Select

    'Ontario',

    '20090701',

    22.345,

    26.52,

    10.2,

    ''

    UNION ALL

    -- fail for a_temp

    Select

    'Quebec',

    '20090701',

    19,

    26.52,

    10.2,

    ''

    UNION ALL

    -- fail for b_temp

    Select

    'Nova Scotia',

    '20090701',

    22,

    26.92,

    10.2,

    ''

    UNION ALL

    -- fail for c_temp

    Select

    'New Brunswick',

    '20090701',

    22,

    26.52,

    9.2,

    ''

    INSERT INTO @conditionvariables (

    field,

    [min],

    [max]

    )

    SELECT

    'a_temp',

    20,

    25

    UNION ALL

    SELECT

    'b_temp',

    25,

    26.8

    UNION ALL

    SELECT

    'c_temp',

    10,

    12.342

    -- THis section is what I'd put in the execute sql task

    UPDATE @stats

    SET flag = 'V'

    WHERE

    EXISTS (SELECT 1 FROM @conditionvariables C WHERE a_temp BETWEEN C.[min] AND C.[max] AND C.field = 'a_temp') And

    EXISTS (SELECT 1 FROM @conditionvariables C WHERE b_temp BETWEEN C.[min] AND C.[max] AND C.field = 'b_temp') AND

    EXISTS (SELECT 1 FROM @conditionvariables C WHERE c_temp BETWEEN C.[min] AND C.[max] AND C.field = 'c_temp')

    UPDATE @stats

    SET flag = 'I'

    WHERE

    NOT EXISTS (SELECT 1 FROM @conditionvariables C WHERE a_temp BETWEEN C.[min] AND C.[max] AND C.field = 'a_temp') OR

    NOT EXISTS (SELECT 1 FROM @conditionvariables C WHERE b_temp BETWEEN C.[min] AND C.[max] AND C.field = 'b_temp') OR

    NOT EXISTS (SELECT 1 FROM @conditionvariables C WHERE c_temp BETWEEN C.[min] AND C.[max] AND C.field = 'c_temp')

    -- end section in execute sql task

    SELECT

    *

    FROM

    @stats

    There may be a way to do it that will perform better but this shjould do the job. You'll need to test is in DEV of course.

  • here's how i did it:

    i created a userdefined function that evaluates the three temperature types, and used a Calculated column fo rthe "flag" field.

    Note I changed a few table and column names so they were not reserved words, but this seems to work fine:

    CREATE TABLE conditionvariables (

    FIELDNAME VARCHAR(30),

    minval decimal(9,3),

    maxval decimal(9,3) )

    GO

    INSERT INTO CONDITIONVARIABLES

    SELECT 'a_temp', 20, 25 UNION ALL

    SELECT 'b_temp', 25, 26.8 UNION ALL

    SELECT 'c_temp', 10, 12.342

    GO

    Create Function fn_ValidateConditionVariables(@Atemp decimal(9,3),@Btemp decimal(9,3),@Ctemp decimal(9,3))

    RETURNS CHAR(1)

    AS

    BEGIN

    DECLARE @return CHAR(1)

    SELECT @return = CASE

    WHEN SUM(ATEMP) > 0

    THEN 'I'

    ELSE 'V'

    END

    FROM (

    SELECT

    CASE

    WHEN @Atemp BETWEEN minval and maxval

    THEN 0

    ELSE 1

    END AS ATEMP FROM CONDITIONVARIABLES WHERE FIELDNAME = 'a_temp'

    UNION ALL

    SELECT

    CASE

    WHEN @Btemp BETWEEN minval and maxval

    THEN 0

    ELSE 1

    END AS ATEMP FROM CONDITIONVARIABLES WHERE FIELDNAME = 'b_temp'

    UNION ALL

    SELECT

    CASE

    WHEN @Ctemp BETWEEN minval and maxval

    THEN 0

    ELSE 1

    END AS ATEMP FROM CONDITIONVARIABLES WHERE FIELDNAME = 'c_temp'

    ) X

    return @return

    END --PROC

    GO

    CREATE TABLE TEMPSTATS (

    Location varchar(30),

    StatDate datetime,

    a_temp decimal(9,3),

    b_temp decimal(9,3),

    c_temp decimal(9,3),

    flag AS dbo.fn_ValidateConditionVariables(a_temp,b_temp,c_temp) )

    INSERT INTO TEMPSTATS

    (Location,StatDate,a_temp,b_temp,c_temp)

    SELECT 'Ontario','20090701', 22.345, 26.52, 10.2 UNION ALL

    SELECT 'Miami','20090702', 32.345, 19.52, 9.2 UNION ALL

    SELECT 'New York','20090701', 30.345, 25.52, 10.3

    SELECT * FROM TEMPSTATS

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • doh i did not know using SSIS was a requirement....sorry...

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Lowell (7/2/2009)


    doh i did not know using SSIS was a requirement....sorry...

    Nice solution though Lowell, and probably better than doing it using SSIS. I really like the concept of doing the derived table union all with the sum.

  • Thank you Jack and Lowell so much for taking the time to respond. I really appreciate it.

    I'm going to try incorporating Jack's solution into my package. I'll report back on my results soon. Cross your fingers for me. 😀

  • Hi Jack and Lowell,

    I'm not bound to using SSIS. Would I get better performance using Lowell's solution? I have about 10,000 records in my table 1 that need to be validated against table 2 on a weekly and ad hoc basis.

  • Well, it depends.

    Can you modify the table structure to include the calculated column?

    Are the values in conditionvariables changing?

    The beauty of Lowell's solution is that, if the values in conditionvariables do not change or if they have start and end date columns (you'd need to modify the function for this), the flag value is automatically calculated whenever data is inserted or updated.

    You could still adapt Lowell's method to set the flag value through an update statement running in SSIS or a SQLAgent job. You'd have to test both my update and the update using Lowell's code to determine which one is faster. I'd actually guess that Lowell's will perform better because my code does 2 updates and his would run once. That would look something like this:

    SELECT OBJECT_ID(N'dbo.fn_ValidateConditionVariables', N'FN')

    IF OBJECT_ID(N'dbo.fn_ValidateConditionVariables', N'FN') IS NOT NULL

    BEGIN

    DROP FUNCTION fn_ValidateConditionVariables

    END

    IF OBJECT_ID(N'dbo.conditionvariables', N'U') IS NOT NULL

    BEGIN

    DROP TABLE dbo.conditionvariables

    END

    Go

    Create Function fn_ValidateConditionVariables

    (

    @Atemp decimal(9, 3),

    @Btemp decimal(9, 3),

    @Ctemp decimal(9, 3)

    )

    RETURNS CHAR(1)

    AS BEGIN

    DECLARE @return CHAR(1)

    SELECT

    @return = CASE WHEN SUM(ATEMP) > 0 THEN 'I'

    ELSE 'V'

    END

    FROM

    (SELECT

    CASE WHEN @Atemp BETWEEN min and max THEN 0

    ELSE 1

    END AS ATEMP

    FROM

    CONDITIONVARIABLES

    WHERE

    FIELD = 'a_temp'

    UNION ALL

    SELECT

    CASE WHEN @Btemp BETWEEN min and max THEN 0

    ELSE 1

    END AS ATEMP

    FROM

    CONDITIONVARIABLES

    WHERE

    FIELD = 'b_temp'

    UNION ALL

    SELECT

    CASE WHEN @Ctemp BETWEEN min and max THEN 0

    ELSE 1

    END AS ATEMP

    FROM

    CONDITIONVARIABLES

    WHERE

    FIELD = 'c_temp') X

    return @return

    END --PROC

    GO

    -- create test table variables

    DECLARE @stats TABLE (location VARCHAR(25), DATE DATETIME, a_temp FLOAT, b_temp FLOAT, c_temp FLOAT, flag CHAR(1))

    CREATE TABLE conditionvariables (field VARCHAR(10), min FLOAT, max FLOAT)

    -- load a few test rows

    INSERT INTO @stats (

    location,

    DATE,

    a_temp,

    b_temp,

    c_temp,

    flag

    )

    -- pass

    Select

    'Ontario',

    '20090701',

    22.345,

    26.52,

    10.2,

    ''

    UNION ALL

    -- fail for a_temp

    Select

    'Quebec',

    '20090701',

    19,

    26.52,

    10.2,

    ''

    UNION ALL

    -- fail for b_temp

    Select

    'Nova Scotia',

    '20090701',

    22,

    26.92,

    10.2,

    ''

    UNION ALL

    -- fail for c_temp

    Select

    'New Brunswick',

    '20090701',

    22,

    26.52,

    9.2,

    ''

    INSERT INTO conditionvariables (

    field,

    [min],

    [max]

    )

    SELECT

    'a_temp',

    20,

    25

    UNION ALL

    SELECT

    'b_temp',

    25,

    26.8

    UNION ALL

    SELECT

    'c_temp',

    10,

    12.342

    -- lowell's code updating using his function

    UPDATE @stats

    SET flag = dbo.fn_ValidateConditionVariables(a_temp,b_temp,c_temp)

    /* lowell's code modified to do an explicit update without a function

    UPDATE @stats

    SET flag = CASE WHEN X.atemp = 0 THEN 'V' ELSE 'I' END

    FROM

    @stats S JOIN

    (

    SELECT

    X.location,

    X.date,

    SUM(ATEMP) atemp

    FROM

    (

    SELECT

    S.location,

    S.date,

    CASE

    WHEN S.a_temp BETWEEN min and max

    THEN 0

    ELSE 1

    END AS ATEMP

    FROM

    @stats S,

    CONDITIONVARIABLES C

    WHERE C.FIELD = 'a_temp'

    UNION ALL

    SELECT

    S.location,

    S.date,

    CASE

    WHEN S.B_temp BETWEEN min and max

    THEN 0

    ELSE 1

    END AS ATEMP

    FROM

    @stats S,

    CONDITIONVARIABLES C

    WHERE C.FIELD = 'b_temp'

    UNION ALL

    SELECT

    S.location,

    S.date,

    CASE

    WHEN S.c_temp BETWEEN min and max

    THEN 0

    ELSE 1

    END AS ATEMP

    FROM

    @stats S,

    CONDITIONVARIABLES C

    WHERE C.FIELD = 'c_temp'

    ) X

    GROUP BY

    X.location,

    X.date

    ) AS X ON S.location = X.location AND S.date = X.date

    */

    /* My Code

    -- THis section is what I'd put in the execute sql task

    UPDATE @stats

    SET flag = 'V'

    WHERE

    EXISTS (SELECT 1 FROM CONDITIONVARIABLES C WHERE a_temp BETWEEN C.[min] AND C.[max] AND C.field = 'a_temp') And

    EXISTS (SELECT 1 FROM CONDITIONVARIABLES C WHERE b_temp BETWEEN C.[min] AND C.[max] AND C.field = 'b_temp') AND

    EXISTS (SELECT 1 FROM CONDITIONVARIABLES C WHERE c_temp BETWEEN C.[min] AND C.[max] AND C.field = 'c_temp')

    UPDATE @stats

    SET flag = 'I'

    WHERE

    NOT EXISTS (SELECT 1 FROM CONDITIONVARIABLES C WHERE a_temp BETWEEN C.[min] AND C.[max] AND C.field = 'a_temp') OR

    NOT EXISTS (SELECT 1 FROM CONDITIONVARIABLES C WHERE b_temp BETWEEN C.[min] AND C.[max] AND C.field = 'b_temp') OR

    NOT EXISTS (SELECT 1 FROM CONDITIONVARIABLES C WHERE c_temp BETWEEN C.[min] AND C.[max] AND C.field = 'c_temp')

    -- end section in execute sql task

    */

    SELECT

    *

    FROM

    @stats

  • Jack Corbett (7/2/2009)


    Well, it depends.

    Can you modify the table structure to include the calculated column?

    Are the values in conditionvariables changing?

    Yes and Yes. I am able to modify the table structure to include the calculated column to include the calculated column. The conditions can fluctuate quite often based on standard test results in our organization.

    I'm still testing out the solutions, but will definately report back with my results. Thank you so much.

  • So after testing and reviewing my requirements, implementing the solution via SSIS was easier for me. There wasn't a big preformance difference when I processed 10,000 records, but the deciding factor was more the changing requirements variableconditions table.

    Thank you once again for your responses.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply