October 11, 2012 at 3:49 am
[font="Verdana"]
Hi All,
While working on one of the stored procedure, I observed one situation. Inside the stored procedure there is one Update statement like:
Update <Table Name>
Set Column1 = <New Value>
Where Column1 = <Old Value> And column2 In (Select IDs From <Table Variable>)
I observed the values in <Table Variable> are duplicating. Even though Update statement is updating values with unique values for Column2.
Means,
<Table Variable> has values like (1, 2, 2, 3, 4, 5, 5, 5, 6, 6, 7, 8, 8, 8, 8, 8) however update statement is returning (8 row(s) affected) message though <Table Variable> has record count of 16.
To verify on this situation I created one Update Trigger on the said table and gathered the newly updating information from the affected table to audit table and found surprising results. Its showing / contains only 6 updated records.
So does it means IN clause reads the distinct values from the given bunch of values?
Please let me know on this.
Thanks in advance,
-- Mahesh
[/font]
MH-09-AM-8694
October 11, 2012 at 5:05 am
It's not the update, it's the IN. IN just looks for matching values, it doesn't matter at all how many times the value is in the subquery/value list, just whether it is there or not. It's not a full join, just a check for matching values.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 11, 2012 at 5:17 am
Yes, "IN" internally does "distinct" of the values.
I would recommend to use other syntax for update, one which is similar to SELECT.
First, write the SELECT that returns the rows.
Then, replace part before "FROM" with "UPDATE tablealias SET xy=...".
For example:
--SELECT t1.*, t2.*
UPDATE t2 SET t2.Name=t1.name+'xy', t2.Address=...
FROM table1 t1
JOIN table2 t2 ON ...
WHERE ...
October 11, 2012 at 5:18 am
To add to Gail's reply, the optimizer parses the literal values in the IN predicate and extracts unique values.
Here's an example using your code:
DECLARE @T TABLE (
Column1 int,
Column2 int
)
INSERT INTO @T VALUES(1,2),(2,3),(3,4),(4,5),(5,6),(6,7),(7,8),(8,9)
SELECT *
FROM @T
WHERE Column2 IN (1, 2, 2, 3, 4, 5, 5, 5, 6, 6, 7, 8, 8, 8, 8, 8)
The scan predicate found in the execution plan is the following:
Predicate: [Column2]=(1) OR [Column2]=(2) OR [Column2]=(3) OR [Column2]=(4) OR [Column2]=(5) OR [Column2]=(6) OR [Column2]=(7) OR [Column2]=(8)
However, this is just an implementation detail you shouldn't care about.
EDIT: typo
-- Gianluca Sartori
October 11, 2012 at 5:19 am
It doesn't necessarily actually do a distinct.
I personally would not recommend changing the update to use the update from syntax. Since we know that the join will produce duplicate rows, it's not guaranteed what value the update will pick. An update from should never be used when there are duplicate rows produced by the join, as there will be in this situation.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 11, 2012 at 6:42 am
Mahesh example is very simple update, without any join, just a subquery.
So, "FROM" syntax is very similar as a syntax without "FROM":
--SELECT *
UPDATE t SET t.Column1 = <New Value>
FROM <Table Name> t
Where Column1 = <Old Value> And column2 In (Select IDs From <Table Variable>)
It's just matter of syntax, execution plan is the same.
I personally prefer that syntax because it is clearer to read for me and it is straightforward to convert it to SELECT and check what data will be updated prior to actually do it (have you ever run UPDATE and forgot a WHERE clause? This form saved my day a few times 🙂 ).
October 11, 2012 at 6:54 am
[font="Verdana"]Thanks Gail Shaw, Vedran for your valuable inputs.
@Vedran, ofcourse not. I keep my eyes widely open while writting Insert / Update / Delete :w00t:[/font]
MH-09-AM-8694
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply