November 30, 2018 at 12:06 am
Hello everyone,
Need some best solution for my below requirement.
I need to Insert/update/delete some data into my target based as per my staging (meta) data.
For example -
EmpID | FirstName | MiddleName | LastName | Designation | UpdatedDate |
1 | FN1 | MN1 | LN1 | D1 | 2018-01-01 |
2 | FN2 | MN2 | LN2 | D2 | 2018-01-01 |
3 | FN3 | MN3 | LN3 | D3 | 2018-01-01 |
4 | FN4 | MN4 | LN4 | D4 | 2018-01-01 |
5 | FN5 | MN5 | LN5 | D5 | 2018-01-01 |
6 | FN6 | MN6 | LN6 | D6 | 2018-01-01 |
EmpID | ColumnName | ChangeType | Value | ChangeDate |
1 | FirstName | Update | FN1.1 | 2018-01-02 |
2 | FirstName | Update | FN2.1 | 2018-01-02 |
2 | MiddleName | Delete | - | 2018-01-02 |
3 | LastName | Update | LN3.1 | 2018-01-02 |
3 | Designation | Update | D3.1 | 2018-01-02 |
After applying the staging changeto my target table, my target data should be like below.
EmpID | FirstName | MiddleName | LastName | Designation | UpdatedDate |
1 | FN1.1 | MN1 | LN1 | D1 | 2018-01-02 |
2 | FN2.1 | - | LN2 | D2 | 2018-01-02 |
3 | FN3 | MN3 | LN3.1 | D3.1 | 2018-01-02 |
4 | FN4 | MN4 | LN4 | D4 | 2018-01-01 |
5 | FN5 | MN5 | LN5 | D5 | 2018-01-01 |
6 | FN6 | MN6 | LN6 | D6 | 2018-01-01 |
I thought generating required dynamic SQL DML commands as per the staging(meta) data.
Could someone please share your thoughts if there is any best approach to achieve this?
November 30, 2018 at 3:32 am
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
November 30, 2018 at 5:35 am
ChrisM's solution looks good.
November 30, 2018 at 11:10 am
You could also look at the MERGE statement, which will do inserts, updates, and deletes from a source data set to a target data set in a single statement. There are clauses for what to do when matching records are found (UPDATE), what to do when the source has records and the target does not (INSERT), and what to do when the source does not have records but the target does (DELETE). These can be combined so any combination of those can be used.
Documentation is found at Microsoft's website: https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017
November 30, 2018 at 11:27 am
fahey.jonathan - Friday, November 30, 2018 11:10 AMYou could also look at the MERGE statement, which will do inserts, updates, and deletes from a source data set to a target data set in a single statement. There are clauses for what to do when matching records are found (UPDATE), what to do when the source has records and the target does not (INSERT), and what to do when the source does not have records but the target does (DELETE). These can be combined so any combination of those can be used.Documentation is found at Microsoft's website: https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-2017
On second look, your source data is not normalized, so it may be difficult to get your data into a usable format for MERGE. You could do one column at a time, like this:WITH BaseData
AS (
SELECT EmpID,
Value AS FirstName,
ROW_NUMBER() OVER (PARTITION BY EmpID ORDER BY ChangeDate DESC) AS RowNum
FROM #staging_Employees
WHERE ColumnName = 'FirstName'
)
MERGE #dbo_Employees u
USING (
SELECT EmpID,
FirstName,
RowNum
FROM BaseData
WHERE RowNum = 1
) x
ON x.EmpID = u.EmpID
WHEN MATCHED THEN UPDATE
SET FirstName = x.FirstName
WHEN NOT MATCHED BY TARGET THEN INSERT
(EmpID, FirstName)
VALUES
(x.EmpID, x.FirstName);
You could create a more complex query to get your staging data into a normalized format. I think that would be more difficult, because you have to account of multiple changes to the same item and select only the latest.
November 30, 2018 at 12:32 pm
fahey.jonathan, ChrisM@Work - Thanks for your inputs. I will consider these both option and choose one on their performance as we are having huge data in staging and target.
December 1, 2018 at 11:41 am
I'm not sure ChrisM's solution works in the general case - the various MAX's operate independently and so unless you can guarantee the values all ascend in time (which seems unlikely) you won't necessarily get the right end values. Try adding the line:
(1, 'FirstName', 'Update','A2','2018-02-02')
to the list and you'll see what I mean. You would need to rank the rows by date and take the most recent for any given field first to ensure you're getting the row values you actually require.
December 1, 2018 at 11:11 pm
@rockys - Is it possible that the same EmpID could have multiple updates for the same column, at the same time? If so, how would you determine which of the multiple values gets updated to the dbo.Employee table?
Looking at the Staging table, it gives the impression that it's wiped and reloaded on daily basis. If the UpdatedDate really does lack a time component, it could be difficult (more likely impossible) to determine which is the most resent.
December 28, 2018 at 1:27 am
This was removed by the editor as SPAM
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply