INSERT if doesn't exist, UPDATE if changed

Question

INSERT if doesn't exist, UPDATE if changed

lk-681111

SSCrazy

Points: 2915
More actions
November 26, 2008 at 11:02 am

#126166

Hello, all:
Please forgive me if this is an idiotic question--I'm still learning!
I'm trying to create a process that pulls data from an outside (non-SQL Server) database, then combines that data with data from one of my tables and inserts the results into another table.
Demographic data, including CustomerType, Location, Date, AgeBracket, Zip Code, Email, and DBXID (a unique identifier assigned to each record in the outside database), is pulled from the outside database into a table called DBX. Using an insert statement, I combine this data with my ZipCodeAssoc table (which lists the City, County, State, and Country associated with each zip code) and insert the results into my Demographics table. Currently I'm using the following query:
INSERT
INTO Demographics
SELECT DBX.CustomerType, DBX.Location, DBX.[Date], DBX.AgeBracket,
ZipCodeAssoc.City, ZipCodeAssoc.County, ZipCodeAssoc.State, DBX.[Zip Code],
ZipCodeAssoc.Country, DBX.Email, DBX.DBXID
FROM DBX LEFT JOIN ZipCodeAssoc
ON DBX.[Zip Code] = ZipCodeAssoc.[Zip Code]
WHERE DBX.DBXID NOT IN
(SELECT Demographics.DBXID
FROM Demographics
WHERE DBX.[Zip Code] = ZipCodeAssoc.[Zip Code])
This works fine to insert new records into Demographics, but what I'm struggling with is how to update existing records in Demographics if they're changed in the outside database. (Part of this process is that DBX is truncated and rewritten to every night, so any changes or additions will be current in the DBX table.) Is there a way to identify which records in Demographics have been changed, and to overwrite those with the new data in DBX? The only solution I can come up with is to truncate and rewrite the Demographics table every time this process runs...is there a better alternative?
Thanks in advance!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Chris Harshman SSC-Forever Points: 42192 More actions · Answer 1

lk (11/26/2008)
...Is there a way to identify which records in Demographics have been changed, and to overwrite those with the new data in DBX? The only solution I can come up with is to truncate and rewrite the Demographics table every time this process runs.

the only way to tell if a record in Demographics has changed is either have some sort of [last_updated] datetime column in the table from the outside source, and check that value, or do the painfull operation of comparing the Demographics table values with the new values in your DBX table. If you don't have any indication of when the Demographics record was updated, then depending on the number of records expected in the Demographics table as a whole, you could do an UPDATE statement with a join in it before the INSERT statement you mentioned (which would update all the records that have a match) or do the TRUNCATE and completely reload as you suggested.

Ron McCullough SSC Guru Points: 63877 More actions · Answer 2

Might I ask as to why you appear to be wanting to create table structure which is not normalized? For example with your tables DBX and ZipCodeAssoc

you could extract all that you appear to want with some simple T-SQL for example:

SELECT DBX.CustomerType, DBX.Location, DBX.DBXID, ZipCodeAssoc.City, ZipCodeAssoc.County, ZipCodeAssoc.State, DBX.ZipCode

FROM DBX INNER JOIN

ZipCodeAssoc ON DBX.ZipCode = ZipCodeAssoc.[Zip Code]

WHERE (ZipCodeAssoc.State = 'input state name or abbreviation')

If everything seems to be going well, you have obviously overlooked something.

Ron

Please help us, help you -before posting a question please read[/url]
Before posting a performance problem please read[/url]

lk-681111 SSCrazy Points: 2915 More actions · Answer 3

Thanks to you both for your responses! It turns out that the records with which we'll be working will not change once they're created, so it renders my question moot. I'm sure this issue will come up again, though, so thanks for your help! And bitbucket, the reason that I want to pull all this data into a new table is that we have some sort of weird, pre-existing reporting software that needs to use the table as a direct data source. Hopefully we'll soon be moving away from that and toward SQL Reporting Services, but for now, we just have to make do... :rolleyes:

Thanks again!

vaseem-773819 Ten Centuries Points: 1015 More actions · Answer 4

Hi

I was facing the same problem last week where I have to update my database table from a flat file changed everyday having some new records and some updated records.

Truncating the table and inserting all data again is one of the easiast method in such a case, but during this process data is not accessible or in some cases if after truncation there is some problem in importing data then data loss case occurs.

To avoid this I have created two identical tables, in your case you can create DBX and DBX_intermediate with same structure and do the following steps

step 1 ) truncate DBX_Intermediate table

step 2 ) import all data into DBX_Intermediat ( including DBXID ) from external source

step 3) Update DBX fields by making inner join between DBX and DBX_Intermediate on DBXID

step 4) Insert record into DBX from DBX_Intermediate for DBXID not in DBX table

So first time all records will be inserted into DBX and further existing will be updated with updated values and new one will be inserted.

Hope it will help u.

Thanks

Vaseem

INSERT if doesn't exist, UPDATE if changed

Cookies on SQLServerCentral