Update with Inner Join

  • Hi All,

    I have an update statement i want to execute but when i do the select statement the number of records changed is different.

    select *

    FROM MOHG_Processing_TestData.dbo.Stage2test INNER JOIN MOHG_Processing_TestData.dbo.Country_Transform ON MOHG_Processing_TestData.dbo.Stage2test.Address1_City = MOHG_Processing_TestData.dbo.Country_Transform.DQCorrection COLLATE Latin1_General_CI_AI

    UPDATE [MOHG_Processing_TestData].[dbo].[Stage2test]

    SET Address1_City = NULL

    FROM MOHG_Processing_TestData.dbo.Stage2test INNER JOIN MOHG_Processing_TestData.dbo.Country_Transform ON MOHG_Processing_TestData.dbo.Stage2test.Address1_City = MOHG_Processing_TestData.dbo.Country_Transform.DQCorrection COLLATE Latin1_General_CI_AI

    The count for the select is 98797 and the count for update is 25784

    Please could you tell me what to change to get my update to update what i am selecting; hope this makes sense 🙂

    Thanks in advance

  • Looks like you have multiple matching rows in the country_transform table. Test with this:

    SELECT *

    FROM MOHG_Processing_TestData.dbo.Stage2test t

    INNER JOIN (SELECT DISTINCT DQCorrection FROM MOHG_Processing_TestData.dbo.Country_Transform) s

    ON t.Address1_City = s.DQCorrection COLLATE Latin1_General_CI_AI

    and with this:

    SELECT t.Address1_City, COUNT(*)

    FROM MOHG_Processing_TestData.dbo.Stage2test t

    INNER JOIN MOHG_Processing_TestData.dbo.Country_Transform s

    ON t.Address1_City = s.DQCorrection COLLATE Latin1_General_CI_AI

    GROUP BY t.Address1_City

    HAVING COUNT(*) > 1

    ORDER BY COUNT(*) DESC

    SQL Server will only update a row once regardless of how many rows match in the source.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • thanks for this, but both of the statement run fine but i am gaining different results when i am expecting the same number of records

  • martin.kerr 34088 (12/5/2011)


    thanks for this, but both of the statement run fine but i am gaining different results when i am expecting the same number of records

    Did you run the two queries I posted? Are the result counts different to your SELECT query?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • yes they are im afraid

  • -- This query will show you which of the cities (which have a match in the staging table)

    -- are duped in the source table Country_Transform

    -- Query uses WHERE EXISTS to conserve cardinality

    SELECT DQCorrection, COUNT(*)

    FROM MOHG_Processing_TestData.dbo.Country_Transform s

    WHERE EXISTS (SELECT 1 FROM MOHG_Processing_TestData.dbo.Stage2test t WHERE t.Address1_City = s.DQCorrection)

    GROUP BY DQCorrection

    Multiple rows in Country_Transform having the same value of DQCorrection isn't necessarily a problem, it depends what the table is for. It's not a problem in the UPDATE statement either, because as mentioned earlier, SQL Server won't complain and will only update a row once for multiple rows matching in the FROM table source. However, most folks consider it good practice to eliminate dupes from the table source (it's not standard ANSI SQL IIRC) and newer operators e.g. MERGE will fail if multiple rows are detected in the source.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ok i have dupes in the staging table but surely this should not matter, when it sees Singapore in the staging table and in the update table then it should just update it

  • It doesn't matter if the staging table (the update target) has multiple rows which match the search predicate of the update.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • so are you saying my update statement will never update what i want to (i.e. the select statement)

  • No, he's saying that the update statement is updating what you want, but since SQL won't update rows 2 or 3 or more times, if there are duplicate rows that appear in the select statement, the update will show a lower row count.

    See this simple example:

    CREATE TABLE t1(

    id INT,

    somecolumn char(50)

    )

    CREATE TABLE t2(

    id INT,

    foreignid INT,

    somecolumn char(50)

    )

    INSERT INTO t1 VALUES (1, NULL)

    INSERT INTO t2 VALUES (1,1,'One'), (2,1, 'Two'), (3,1,'Three')

    If I do a select from the 2, there will be 3 rows returned, because the one row from t1 matches to all 3 rows in t2.

    SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.foreignid

    If I update t1 from t2, there will be one row affected only, because there's only one row in t2. The fact that it matches to 3 rows in t2 won't make SQL update 3 rows, there aren't 3 rows to update.

    UPDATE t1 SET somecolumn = t2.somecolumn

    FROM t1 INNER JOIN t2 ON t1.id = t2.foreignid

    p.s. However it's worth noting that it's not defined which value of t2.somecolumn is the one that t1.somecolumn gets...

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • martin.kerr 34088 (12/5/2011)


    so are you saying my update statement will never update what i want to (i.e. the select statement)

    It will update just fine, but it's important for you to know a) why your counts are different between the SELECT and the UPDATE; b) what the consequences might be for other SQL Server operators such as MERGE or other SQL databases; and c) how this issue is commonly dealt with.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • oh ok, so if if de-dupe my connecting table then it should update fine?

  • It's already updating fine, but if you de-dupe the table you'll see the counts that you're wanting.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ok i will select distinct from that table and create a new table and then use that one.

    Thanks for all your help

  • but even with the joining table de-duped it still produces the same count. Does it matter if the original table i am trying to update to NULL has dupes? i.e. 10000 cities which are Hong Kong

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply