July 3, 2014 at 6:53 am
Hi - I'm trying to do something like below and wondering if there's a way I can do it with the MERGE function or some other clever type of update.
I have two tables similar to below:
TargetTable
-----------
Row Key SecurityID
1 KKL43431
2 KKLFK992
3 KKL11011
SourceTable
-----------
SecurityIDKey
401KKL43431
987KKL43431
533KKLFK992
123KKL11011
I want to update the target table with SecurityID, joining on the key value. As you can see there is more than one possible match for a key in the source table. The end result I need is this:
TargetTable
-----------
RowKeySecurityID
1KKL43431401
1KKL43431987
2KKLFK992533
3KKL11011123
So, I would like to insert rows for the cases where there are duplicates. If I do UPDATE FROM, I will have the same number of rows, and SQL Server will pick one for me. I was playing with MERGE a bit but couldn't figure out how to achieve this behavior.
The only thing I can think of doing is to join the two tables together and insert into a new table. Are there any other cleve ways to accomplish this?
Thanks,
H
Edited to add: Apologies for the bad formatting. My tabs and spaces didn't show properly and I didn't have a chance to figure out how to get that working yet. I think it's pretty clear though.
July 3, 2014 at 7:14 am
This worked for me:
declare @TargetTable table ([Key] char(8), SecurityID int)
declare @SourceTable table ([Key] char(8), SecurityID int)
insert into @TargetTable([Key]) values
('KKL43431'),
('KKLFK992'),
('KKL11011')
insert into @SourceTable([Key], SecurityID) values
('KKL43431',401),
('KKL43431',987),
('KKLFK992',533),
('KKL11011',123)
update t
set t.SecurityID = s.SecurityID
from @TargetTable t
join @SourceTable s
on t.[Key] = s.[Key]
insert into @TargetTable ([Key], SecurityID)
select [Key], SecurityID
from @SourceTable
except
select [Key], SecurityID
from @TargetTable
select * from @TargetTable
Results:
Key SecurityID
KKL43431 401
KKLFK992533
KKL11011123
KKL43431987
BTW, try not to use reserved words as column names. "Key" is such a word in SQL Server.
July 3, 2014 at 7:25 am
Thanks for the quick reply. So basically this is a two-step update, first an update then inserting the duplicates. I will experiment with that and see if it works for me. One thing I forgot to mention is that the target table may have 10 million+ records - and the source table is a few hundred thousand. So I would also need something that is somewhat performant. There is nothing waiting on this query (GUI, etc) but I would rather have something that takes 1 or 2 minutes vs. something that could take 10 min to complete.
So I'm wondering if joining and inserting to a new table is going to be faster than doing an update and then an insert. I will have to experiment.
P.S. The column and table names (and schema) is not my actual tables - I just simplified it for the example but should have chosen different column names to avoid confusion.
Thanks,
H
July 3, 2014 at 7:34 am
be sure your source and target tables are indexed on the key column
July 3, 2014 at 7:34 am
hominamad (7/3/2014)
Thanks for the quick reply. So basically this is a two-step update, first an update then inserting the duplicates. I will experiment with that and see if it works for me. One thing I forgot to mention is that the target table may have 10 million+ records - and the source table is a few hundred thousand. So I would also need something that is somewhat performant. There is nothing waiting on this query (GUI, etc) but I would rather have something that takes 1 or 2 minutes vs. something that could take 10 min to complete.So I'm wondering if joining and inserting to a new table is going to be faster than doing an update and then an insert. I will have to experiment.
P.S. The column and table names (and schema) is not my actual tables - I just simplified it for the example but should have chosen different column names to avoid confusion.
Thanks,
H
Here's the MERGE you asked for - and I'd guess that gbritton's solution would be several times faster.
;
WITH TargetTable AS (SELECT *, rn = ROW_NUMBER() OVER(PARTITION BY [Key] ORDER BY SecurityID) FROM #TargetTable)
MERGE TargetTable AS target
USING (
SELECT t.[row], t.[Key], s.SecurityID,
rn = ROW_NUMBER() OVER(PARTITION BY t.[Key] ORDER BY s.SecurityID)
FROM #TargetTable t
INNER JOIN #SourceTable s
ON s.[Key] = t.[Key]
) AS source
ON target.[Key] = source.[Key] AND target.rn = source.rn
WHEN MATCHED THEN
UPDATE SET SecurityID = source.SecurityID
WHEN NOT MATCHED THEN
INSERT ([row], SecurityID, [Key])
VALUES (source.[row], source.SecurityID, source.[Key]);
SELECT * FROM #TargetTable ORDER BY [row]
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
July 3, 2014 at 7:42 am
ChrisM - that MERGE is brilliant. I will try out all these solutions and probably go with whatever is fastest. MERGE statement seems very powerful and flexible as it let's you do things like this even though it's probably not exactly the type of thing it was designed for.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply