April 17, 2015 at 10:57 am
We push data out to an SQL ODS. Some of the ODS tables keep a history of changes of records from the source DB. There may be times when a source DB issue would require a full refresh of the SQL table, which would lose the history.
I have tested copying the SQL table, doing a full refresh and then UNIONing the two tables back together and it works, but it's not as clean as I would have liked. The reason is one of the fields in the SQL table is the export date/time that the source record was pushed into the ODS. When I select all records from both tables and UNION them I get doubles where all the data is the same except the export date. Ideally I would want only one record in the union when all the data is the same except the export date, but I am having a hard time figuring out how this would be done. Any ideas?
Here is the simple union I was doing:
select * into originaltable
from newtablecopy
union
select *
from originaltablecopy;
April 17, 2015 at 11:40 am
I wanted to add to this question an important piece of the puzzle. That is that the export date is part of the key. So what I am really trying to do in the UNION is to not include rows in which all of the data matches except for the export date part of the key.
April 17, 2015 at 12:01 pm
I'm assuming you want your query results to include the Export date? and if so which record to you want keep (e.g. the one with the earliest Export date? oldest?)
-- Itzik Ben-Gan 2001
April 17, 2015 at 12:09 pm
Yes the export date would be included in the query results, in fact all the data would be included. The oldest date would be preferred.
April 17, 2015 at 12:16 pm
With respect to records that are duplicated, let's say you only wanted the one with the most recent
To get the records with the oldest Export date you could do this (using table variables for brevity):
DECLARE @newtablecopy TABLE (col1 varchar(10), col2 varchar(10), ExportDate date);
DECLARE @originaltablecopy TABLE (col1 varchar(10), col2 varchar(10), ExportDate date);
INSERT @newtablecopy VALUES ('x','y','10/1/2010'),('x2','y','10/10/2010'),('x2','y2','10/11/2010');
INSERT @originaltablecopy VALUES ('x','y','10/2/2010'),('x2','y','10/10/2010'),('x2','y2','10/12/2010');
WITH bothTables AS
(
SELECT col1, col2, ExportDate
FROM @newtablecopy
UNION ALL
SELECT col1, col2, ExportDate
FROM @originaltablecopy
),
filtered AS
(
SELECT *, MostRecent = ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY ExportDate DESC)
FROM bothTables
--ORDER BY col1, col2, ExportDate --for debugging
)
SELECT col1, col2, ExportDate
FROM filtered
WHERE MostRecent = 1;
Note that I avoided UNION and used UNION ALL because UNION requires a sort in the query plan to filter out duplicates. I am using a more efficient way to filter out duplicates using ROW_NUMBER().
Update/Edit: Missed your reply to my question. I updated my solution and explanation accordingly.
-- Itzik Ben-Gan 2001
April 17, 2015 at 1:52 pm
Wow thanks Alan! If I explain how I am interpreting this can you confirm if I am correct?
First, a temp result set is created with all the rows of both tables. Then, from that result set you PARTITION the rows into groups where in each group both col1 and col2 are the same, and you give each row in each group a number based on the order of the export date. Finally, from the second temp result set you select those rows with the oldest date based on it being labeled the first row in the partition.
April 17, 2015 at 2:04 pm
dmodersk (4/17/2015)
Wow thanks Alan! If I explain how I am interpreting this can you confirm if I am correct?First, a temp result set is created with all the rows of both tables. Then, from that result set you PARTITION the rows into groups where in each group both col1 and col2 are the same, and you give each row in each group a number based on the order of the export date. Finally, from the second temp result set you select those rows with the oldest date based on it being labeled the first row in the partition.
No problem at all and yes, exactly.
-- Itzik Ben-Gan 2001
Viewing 7 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply