Create a delta from downloaded info

Question

Create a delta from downloaded info

Bruin

SSCrazy

Points: 2107
More actions
August 3, 2021 at 4:01 pm
Go to Answer
#3916586

I'm looking for some suggestions on how to create a delta file that will some me differences from yesterdays downloaded information
versus today's. The table has about 25 fields, but no mod_date in the data, and any one of 10-15 could have changed. We are currently doing a full replace from Staging table to the Live table, now they are wanting to only see changes.
Thanks.

Viewing 15 posts - 1 through 15 (of 45 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 247227 More actions · Answer 1

Bruin wrote:

I'm looking for some suggestions on how to create a delta file that will some me differences from yesterdays downloaded information versus today's.

Write a query to identify the differences between the data in the tables
Output the results of the query to a file

Bruin SSCrazy Points: 2107 More actions · Answer 2

Can that be driven from information schema tables so I don't have to hard code field names do the compares, how

about the decimal and float values?

any examples?

Thanks.

ScottPletcher SSC Guru Points: 101238 More actions · Answer 3

Yeah, EXCEPT should to that just fine:

SELECT * /*...list_of_cols...*/

FROM dbo.today

EXCEPT

SELECT * /*...list_of_cols...*/

FROM dbo.yesterday

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Phil Parkin SSC Guru Points: 247227 More actions · Answer 4

What does your desired output look like? Is a list of all the source rows which are different good enough? Or do you want to see differences at the data-item level (eg, row id <x>, column <n> old value: <old val>, new value: <new val>)?

It may be obvious, but Scott's SELECT * EXCEPT idea works only as long as the columns in both tables match exactly.

This reply was modified 4 years, 7 months ago by Phil Parkin.

Bruin SSCrazy Points: 2107 More actions · Answer 5

I would like to see diffs at the data-item level, this would help in the validation process.

Thanks.

Bruin SSCrazy Points: 2107 More actions · Answer 6

compare table values and create a file with diff's.

Bruin SSCrazy Points: 2107 More actions · Answer 7

any examples of a compare script and output diff's?

Thanks.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 8

Bruin wrote:

any examples of a compare script and output diff's?
Thanks.

There are many ways to display the "Diffs". Please provide and explicit example of what you'd like to see for your "Diff" report. And, when you do, remember that there really needs to be some form of key that guarantees uniqueness of each row or this is going to be mostly impossible to accomplish. Remember that EXCEPT only show you where there are no matches and so does not show you a Before'n'After.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Bruin SSCrazy Points: 2107 More actions · Answer 9

My table as a PK(unique) of:

[InvoiceNbr] ASC,

[Line] ASC,

[SubLine] ASC,

[InvoiceDate] ASC,

[ShipDate] ASC

The process that creates the file to load to SQL does a FULL refresh of the data, meaning it does a MTD pull each time it executes. The diff file needs to display the PK along with any values on the record that differ from yesterday's pull of the information. It needs to be dynamic in case fields are added or deleted from the table.

Thanks.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 10

Bruin wrote:

My table as a PK(unique) of:
[InvoiceNbr] ASC, [Line] ASC, [SubLine] ASC, [InvoiceDate] ASC, [ShipDate] ASC
The process that creates the file to load to SQL does a FULL refresh of the data, meaning it does a MTD pull each time it executes. The diff file needs to display the PK along with any values on the record that differ from yesterday's pull of the information. It needs to be dynamic in case fields are added or deleted from the table.
Thanks.

It seems like you're telling me that an invoice date can change? That's not going to make for a Martha Stuart moment. I seems to me that the InvoiceNbr, Line, and Subline would be the only things need to make a unique row.

Heh... and you're description of what is needed to find the differences has always been understood but you really need to describe how you want them displayed. A graphic would be good for that. It would really be nice if you also provided some readily consumable data even if it's just for the PK and a couple of other columns.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 11

Bruin wrote:

My table as a PK(unique) of:
[InvoiceNbr] ASC, [Line] ASC, [SubLine] ASC, [InvoiceDate] ASC, [ShipDate] ASC
The process that creates the file to load to SQL does a FULL refresh of the data, meaning it does a MTD pull each time it executes. The diff file needs to display the PK along with any values on the record that differ from yesterday's pull of the information. It needs to be dynamic in case fields are added or deleted from the table.
Thanks.

I would move away from the idea that this should be dynamic - if a new column is added in the source system and you want that data to be considered in the destination, then the column must be created in the destination and populated *exactly* the same way it was populated in the source system. That is a required change and as part of that change request you can and should update any downstream processes (like this one).

That means you can easily setup and use EXCEPT to identify changes from the source to the target - specifying each column to be compared.

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs

Bruin SSCrazy Points: 2107 More actions · Answer 12

small sample of data, and the output desired.

Row 2 is written as a delta to a new table since acctgrp and channel changed.

Output:

Old:display yesterday's record

New:display today's record

Just so I have an audit of data changes..

Invoicenbr,line,subline,invoicedate,shipdate,acctgrp,channel,qtyshipped
78123,1,0,8-13-2021,8-10-2021,1002,rrb,800     -- Yesterday's data
78123,1,0,8-13-2021,8-10-2021,1004,xzb,800     -- Today's data

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 13

Bruin wrote:

small sample of data, and the output desired.
Row 2 is written as a delta to a new table since acctgrp and channel changed.
Output: Old:display yesterday's record New:display today's record
Just so I have an audit of data changes..
Invoicenbr,line,subline,invoicedate,shipdate,acctgrp,channel,qtyshipped
78123,1,0,8-13-2021,8-10-2021,1002,rrb,800     -- Yesterday's data
78123,1,0,8-13-2021,8-10-2021,1004,xzb,800     -- Today's data

Heh... ok. That's just "whole row" auditing and it's pretty simple to do. It would be nice if you had done the rest to make this easy. Please see the first link (again... I've asked you before) in my signature line below for what I'm talking and how it would really save you and the people that are trying to help you a shedload of time... especially YOU. 😉

So post some readily consumable data to help us help you so we can write some tested code instead of playing multiple "tag you're it" posts. I recommend you post a couple of lines that haven't changed in both, some that have, and some that have only been added to one, and some that have only been deleted from one.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeffrey Williams SSC Guru Points: 90351 More actions · Answer 14

This is confusing - it really isn't clear what you want to accomplish. I am thinking you might want a temporal table - but since you haven't provided the requested data it really isn't clear.

With a temporal table - you would extract the full data from the source system, then use EXCEPT to identify rows that have changed or rows that are new. Using that set of data you can then update the existing rows with the new data and insert the new rows. The temporal table will create a history row when you update the existing data and you can then query for old/new values and when they changed.

If that is what you are trying to accomplish - please provide the requested data. This will need some existing data - some new rows to be inserted and some rows to be updated, in the form of create and insert statements. A final table and a staging table...

Jeffrey Williams
“We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

― Charles R. Swindoll

How to post questions to get better answers faster
Managing Transaction Logs