JOIN too slow; looking for a better approach

  • Sql 2005: I want to compare two records in two tables that share a common "key". I want to update

    a flag in the first record if any of the corresponding fields (other than the key)

    in the records do not agree in value. Here's the idea:

    update TableA

    set myflag = 1

    from TableA left join TableB

    on

    (TableA.key = TableB.key)

    and

    (

    (TableA.c1 <> TableB.c1) or

    (TableA.c2 <> TableB.c2) or

    (TableA.c3 <> TableB.c3) --- etc

    )

    where

    (TableB.key is not null)

    and

    (

    (TableB.c1 is not null) or

    (TableB.c2 is not null) or

    (TableB.c3 is not null)

    )

    The problem I'm finding is that, in my case, Table A and TableB each have around 500,000 records, with 15 fields, and the JOINs are way too slow to run.

    Any idea how I can improve the situation?

    TIA,

    Barkingdog

  • check out the new sql2005 keyword Except !

    could be something like this...

    update Ta

    set colx = Tb.colx

    ...

    from ( select * from table1

    except

    select T2.* from table2 T2

    -- only needed if rows might not occur in Table1

    inner join table1 T1

    on T1.keycol = T2.keycol

    ) Ta

    inner join table2 Tb

    on Ta.keycol = Tb.keycol

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • SELECT * FROM TableA

    EXCEPT SELECT * FROM TableB;

    The result will be rows from TableA that do not have a corresponding row in TableB (exact match on all columns). If you need to limit the columns, specify the columns you want to match.

    I would put the results into a temp table - then reference the temp table in an update statement. Or, you could put the above into a CTE and use it that way - example:

    ;WITH

    cte AS (SELECT * FROM TableA

    EXCEPT SELECT * FROM TableB)

    ,cte2 AS (SELECT * FROM TableB

    EXCEPT SELECT * FROM TableA)

    UPDATE table

    SET flag = 1

    WHERE ID IN (SELECT ID FROM cte)

    OR ID IN (SELECT ID FROM cte2);

    You are going to have to replace the '*' with the actual columns you want to compare.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thank You,

    I like that idea.

    Barkingdog

  • Barkingdog (2/8/2009)


    The problem I'm finding is that, in my case, Table A and TableB each have around 500,000 records, with 15 fields, and the JOINs are way too slow to run.

    Are the join columns indexed?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • update TableA

    ...

    from TableA left join TableB

    on

    (TableA.key = TableB.key)

    ...

    where

    (TableB.key is not null)

    ...

    I think that these things cancel each other out. Try this instead:

    update TableA

    set myflag = 1

    from TableA join TableB

    on

    (TableA.key = TableB.key)

    and

    (

    (TableA.c1 <> TableB.c1) or

    (TableA.c2 <> TableB.c2) or

    (TableA.c3 <> TableB.c3) --- etc

    )

    where

    (

    (TableB.c1 is not null) or

    (TableB.c2 is not null) or

    (TableB.c3 is not null)

    )

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • All interesting ideas. Regarding the last one

    >>>

    update TableA

    set myflag = 1

    from TableA join TableB

    on

    (TableA.key = TableB.key)

    and

    (

    (TableA.c1 <> TableB.c1) or

    (TableA.c2 <> TableB.c2) or

    (TableA.c3 <> TableB.c3) --- etc

    )

    where

    (

    (TableB.c1 is not null) or

    (TableB.c2 is not null) or

    (TableB.c3 is not null)

    )

    >>>

    I have to think of the purpose of the where-clause contents now that the left join has changed to an inner join.

    Thanks to all

    Barkingdog

  • All it does is to insure that at least one non-key column is not null.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (2/8/2009)


    All it does is to insure that at least one non-key column is not null.

    .. to ensure the left join has a match....

    i.e. only handle inner join stuff ...

    Still you need to check of the row column data is unequal from A to B.

    This would mean you would have to add a full comparisson column list !

    If you would only provide the ...

    update T1

    set col1 = case when T1.col1 = T2.col1 then T1.col1 else T2.col1 end

    , col1 = case when T1.colx = T2.colx then T1.colx else T2.colx end

    ....

    the join clause

    it would update all joined rows of T1 ! (even with its original data if there is no column data change detected)

    When using the except clause, you would only get rows which actually have to be modified.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply