Is there a clever way to do an UPDATE/MERGE and insert rows when there are duplicate matches?

Question

Post reply

Is there a clever way to do an UPDATE/MERGE and insert rows when there are duplicate matches?

hominamad

SSC Veteran

Points: 285
More actions
July 3, 2014 at 6:53 am

#287564

Hi - I'm trying to do something like below and wondering if there's a way I can do it with the MERGE function or some other clever type of update.
I have two tables similar to below:
TargetTable
-----------
Row Key SecurityID
1 KKL43431
2 KKLFK992
3 KKL11011
SourceTable
-----------
SecurityIDKey
401KKL43431
987KKL43431
533KKLFK992
123KKL11011
I want to update the target table with SecurityID, joining on the key value. As you can see there is more than one possible match for a key in the source table. The end result I need is this:
TargetTable
-----------
RowKeySecurityID
1KKL43431401
1KKL43431987
2KKLFK992533
3KKL11011123
So, I would like to insert rows for the cases where there are duplicates. If I do UPDATE FROM, I will have the same number of rows, and SQL Server will pick one for me. I was playing with MERGE a bit but couldn't figure out how to achieve this behavior.
The only thing I can think of doing is to join the two tables together and insert into a new table. Are there any other cleve ways to accomplish this?
Thanks,
H
Edited to add: Apologies for the bad formatting. My tabs and spaces didn't show properly and I didn't have a chance to figure out how to get that working yet. I think it's pretty clear though.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

gbritton1 SSCertifiable Points: 6522 More actions · Answer 1

This worked for me:

declare @TargetTable table ([Key] char(8), SecurityID int)

declare @SourceTable table ([Key] char(8), SecurityID int)

insert into @TargetTable([Key]) values

('KKL43431'),

('KKLFK992'),

('KKL11011')

insert into @SourceTable([Key], SecurityID) values

('KKL43431',401),

('KKL43431',987),

('KKLFK992',533),

('KKL11011',123)

update t

set t.SecurityID = s.SecurityID

from @TargetTable t

join @SourceTable s

on t.[Key] = s.[Key]

insert into @TargetTable ([Key], SecurityID)

select [Key], SecurityID

from @SourceTable

except

select [Key], SecurityID

from @TargetTable

select * from @TargetTable

Results:

Key SecurityID

KKL43431 401

KKLFK992533

KKL11011123

KKL43431987

BTW, try not to use reserved words as column names. "Key" is such a word in SQL Server.

hominamad SSC Veteran Points: 285 More actions · Answer 2

Thanks for the quick reply. So basically this is a two-step update, first an update then inserting the duplicates. I will experiment with that and see if it works for me. One thing I forgot to mention is that the target table may have 10 million+ records - and the source table is a few hundred thousand. So I would also need something that is somewhat performant. There is nothing waiting on this query (GUI, etc) but I would rather have something that takes 1 or 2 minutes vs. something that could take 10 min to complete.

So I'm wondering if joining and inserting to a new table is going to be faster than doing an update and then an insert. I will have to experiment.

P.S. The column and table names (and schema) is not my actual tables - I just simplified it for the example but should have chosen different column names to avoid confusion.

Thanks,

H

gbritton1 SSCertifiable Points: 6522 More actions · Answer 3

be sure your source and target tables are indexed on the key column

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

hominamad (7/3/2014)
Thanks for the quick reply. So basically this is a two-step update, first an update then inserting the duplicates. I will experiment with that and see if it works for me. One thing I forgot to mention is that the target table may have 10 million+ records - and the source table is a few hundred thousand. So I would also need something that is somewhat performant. There is nothing waiting on this query (GUI, etc) but I would rather have something that takes 1 or 2 minutes vs. something that could take 10 min to complete.
So I'm wondering if joining and inserting to a new table is going to be faster than doing an update and then an insert. I will have to experiment.
P.S. The column and table names (and schema) is not my actual tables - I just simplified it for the example but should have chosen different column names to avoid confusion.
Thanks,
H

Here's the MERGE you asked for - and I'd guess that gbritton's solution would be several times faster.

;

WITH TargetTable AS (SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY [Key] ORDER BY SecurityID) FROM #TargetTable)

MERGE TargetTable AS target

USING (

SELECT t.[row], t.[Key], s.SecurityID,

rn = ROW_NUMBER() OVER(PARTITION BY t.[Key] ORDER BY s.SecurityID)

FROM #TargetTable t

INNER JOIN #SourceTable s

ON s.[Key] = t.[Key]

) AS source

ON target.[Key] = source.[Key] AND target.rn = source.rn

WHEN MATCHED THEN

UPDATE SET SecurityID = source.SecurityID

WHEN NOT MATCHED THEN

INSERT ([row], SecurityID, [Key])

VALUES (source.[row], source.SecurityID, source.[Key]);

SELECT * FROM #TargetTable ORDER BY [row]

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

hominamad SSC Veteran Points: 285 More actions · Answer 5

ChrisM - that MERGE is brilliant. I will try out all these solutions and probably go with whatever is fastest. MERGE statement seems very powerful and flexible as it let's you do things like this even though it's probably not exactly the type of thing it was designed for.