November 26, 2008 at 11:02 am
Hello, all:
Please forgive me if this is an idiotic question--I'm still learning!
I'm trying to create a process that pulls data from an outside (non-SQL Server) database, then combines that data with data from one of my tables and inserts the results into another table.
Demographic data, including CustomerType, Location, Date, AgeBracket, Zip Code, Email, and DBXID (a unique identifier assigned to each record in the outside database), is pulled from the outside database into a table called DBX. Using an insert statement, I combine this data with my ZipCodeAssoc table (which lists the City, County, State, and Country associated with each zip code) and insert the results into my Demographics table. Currently I'm using the following query:
INSERT
INTO Demographics
SELECT DBX.CustomerType, DBX.Location, DBX.[Date], DBX.AgeBracket,
ZipCodeAssoc.City, ZipCodeAssoc.County, ZipCodeAssoc.State, DBX.[Zip Code],
ZipCodeAssoc.Country, DBX.Email, DBX.DBXID
FROM DBX LEFT JOIN ZipCodeAssoc
ON DBX.[Zip Code] = ZipCodeAssoc.[Zip Code]
WHERE DBX.DBXID NOT IN
(SELECT Demographics.DBXID
FROM Demographics
WHERE DBX.[Zip Code] = ZipCodeAssoc.[Zip Code])
This works fine to insert new records into Demographics, but what I'm struggling with is how to update existing records in Demographics if they're changed in the outside database. (Part of this process is that DBX is truncated and rewritten to every night, so any changes or additions will be current in the DBX table.) Is there a way to identify which records in Demographics have been changed, and to overwrite those with the new data in DBX? The only solution I can come up with is to truncate and rewrite the Demographics table every time this process runs...is there a better alternative?
Thanks in advance!
November 26, 2008 at 12:03 pm
lk (11/26/2008)
...Is there a way to identify which records in Demographics have been changed, and to overwrite those with the new data in DBX? The only solution I can come up with is to truncate and rewrite the Demographics table every time this process runs.
the only way to tell if a record in Demographics has changed is either have some sort of [last_updated] datetime column in the table from the outside source, and check that value, or do the painfull operation of comparing the Demographics table values with the new values in your DBX table. If you don't have any indication of when the Demographics record was updated, then depending on the number of records expected in the Demographics table as a whole, you could do an UPDATE statement with a join in it before the INSERT statement you mentioned (which would update all the records that have a match) or do the TRUNCATE and completely reload as you suggested.
November 26, 2008 at 10:28 pm
Might I ask as to why you appear to be wanting to create table structure which is not normalized? For example with your tables DBX and ZipCodeAssoc
you could extract all that you appear to want with some simple T-SQL for example:
SELECT DBX.CustomerType, DBX.Location, DBX.DBXID, ZipCodeAssoc.City, ZipCodeAssoc.County, ZipCodeAssoc.State, DBX.ZipCode
FROM DBX INNER JOIN
ZipCodeAssoc ON DBX.ZipCode = ZipCodeAssoc.[Zip Code]
WHERE (ZipCodeAssoc.State = 'input state name or abbreviation')
December 2, 2008 at 2:15 pm
Thanks to you both for your responses! It turns out that the records with which we'll be working will not change once they're created, so it renders my question moot. I'm sure this issue will come up again, though, so thanks for your help! And bitbucket, the reason that I want to pull all this data into a new table is that we have some sort of weird, pre-existing reporting software that needs to use the table as a direct data source. Hopefully we'll soon be moving away from that and toward SQL Reporting Services, but for now, we just have to make do... :rolleyes:
Thanks again!
December 3, 2008 at 5:59 am
Hi
I was facing the same problem last week where I have to update my database table from a flat file changed everyday having some new records and some updated records.
Truncating the table and inserting all data again is one of the easiast method in such a case, but during this process data is not accessible or in some cases if after truncation there is some problem in importing data then data loss case occurs.
To avoid this I have created two identical tables, in your case you can create DBX and DBX_intermediate with same structure and do the following steps
step 1 ) truncate DBX_Intermediate table
step 2 ) import all data into DBX_Intermediat ( including DBXID ) from external source
step 3) Update DBX fields by making inner join between DBX and DBX_Intermediate on DBXID
step 4) Insert record into DBX from DBX_Intermediate for DBXID not in DBX table
So first time all records will be inserted into DBX and further existing will be updated with updated values and new one will be inserted.
Hope it will help u.
Thanks
Vaseem
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply