Update or Insert from one table to another

  • Hello all. I'm having some trouble coming up with a solution that works the way I want it to.

    I'm importing a load of data from a text file into a temporary table (called dts_SWInventoryTemp).

    The primary key on this table is ComputerName.

    The final destination of this data is tbl_SWInventory - it is identical to the temporary table in structure.

    Now that I've got my data in the temporary table, this is what I would like to do:

Query the temporary table against the final table, to see if records already exist for any of the ComputerNames.

    If the record does exist, then update the record from the temp table.

    If the record does not exist, then INSERT the record from the temp table.

    The table is very unwieldy -- it has just over 100 columns.

    So, first, could anyone recommend a way to do this? It seems like it must be one of the most common operations in the business world, so I'm sure it's been done a hundred different ways, but I can't seem to get it to work with what I've been trying.

    Secondly, is there a way to run the UPDATE command against the table, without having to actually type out all 100+ column names specifically?

I've read and searched the site & forums for quite a while now, and have been registered for a little while, but haven't posted until now. There seem to be a lot of talented people here, so it's the first place I asked.

    Any help is much appreciated - thanks in advance. 🙂

  • By the way, this is the closest I've gotten on my latest fumbling attempt at this:

    SELECT dts_SWInventoryTemp.ComputerName FROM dts_SWInventoryTemp

    WHILE EXISTS (SELECT dts_SWInventoryTemp.ComputerName FROM dts_SWInventoryTemp)

    BEGIN

    IF EXISTS

    (

    SELECT tbl_SWInventory.ComputerName, dts_SWInventoryTemp.ComputerName from tbl_SWInventory, dts_SWInventoryTemp

    WHERE tbl_SWInventory.ComputerName = dts_SWInventoryTemp.ComputerName

    )

    DELETE FROM tbl_SWInventory WHERE ComputerName = dts_SWInventoryTemp.ComputerName

    END

    INSERT INTO tbl_SWInventory

    SELECT * FROM dts_SWInventoryTemp

    When I run that, I get the "column prefix doesn't match with a table name or alias name used in the query" error, and I'm not sure why.

From the looks of your code, you are trying to do this procedurally, maybe even with a cursor... correct? This whole operation can be done with two set-based queries. Here's an example.

    DECLARE @temp_Table TABLE (ComputerName varchar(50) PRIMARY KEY, Col1 int, Col2 varchar(20))

    DECLARE @TargetTable TABLE (ComputerName varchar(50) PRIMARY KEY, Col1 int, Col2 varchar(20))

    SET NOCOUNT ON

    INSERT INTO @TargetTable

    SELECT 'Computer 1', 0, 'orig row' UNION ALL

    SELECT 'Computer 2', 0, 'orig row' UNION ALL

    SELECT 'Computer 3', 0, 'orig row' UNION ALL

    SELECT 'Computer 4', 0, 'orig row' UNION ALL

    SELECT 'Computer 5', 0, 'orig row' UNION ALL

    SELECT 'Computer 6', 0, 'orig row'

    INSERT INTO @temp_Table

    SELECT 'Computer 2', 1, 'updated row' UNION ALL

    SELECT 'Computer 4', 1, 'updated row' UNION ALL

    SELECT 'Computer 6', 1, 'updated row' UNION ALL

    SELECT 'Computer 8', 0, 'new row' UNION ALL

    SELECT 'Computer 10', 0, 'new row'

    SELECT * FROM @TargetTable
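
-- Step 1: UPDATE rows that already exist in the target with the values from the temp table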

    UPDATE target

    SET Col1 = temp.Col1,

    Col2 = temp.Col2

    FROM @TargetTable target

    INNER JOIN @temp_Table temp

    ON target.ComputerName = temp.ComputerName
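
-- Step 2: INSERT rows from the temp table that do not yet exist in the target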

    INSERT INTO @TargetTable

    SELECT temp.*

    FROM @temp_Table temp

    LEFT JOIN @TargetTable target

    ON target.ComputerName = temp.ComputerName

    WHERE target.ComputerName IS NULL
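
-- Final state: existing rows updated, missing rows added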

    SELECT * FROM @TargetTable

    John Rowan


  • aaharris (3/27/2008)


    Secondly, is there a way to run the UPDATE command against the table, without having to actually type out all 100+ column names specifically?

    Write the INSERT like so:

    [font="Courier New"]INSERT into tbl_SWInventory

    Select * From dts_SWInventoryTemp D

    WHERE Not EXISTS (Select * from tbl_SWInventory T2

    Where T2.ComputerName = D.ComputerName)

    [/font]

For the UPDATE, try this: in QA (Query Analyzer), open a window to edit your stored procedure with the UPDATE in it. Write the UPDATE like so:

    [font="Courier New"]UPDATE tbl_SWInventory

    SET

    FROM tbl_SWInventory T

    JOIN dts_SWInventoryTemp D ON D.ComputerName = T.ComputerName

    [/font]

    Now make a new window in QA, and set your output mode to text. Then execute this query:

    Select ' ' + Column_name+ ' = D.' + Column_name + ','

    From INFORMATION_SCHEMA.COLUMNS

    Where Table_Schema = 'dbo'

    And Table_Name = 'dts_SWInventoryTemp'

    Now copy all of the text data rows (not the row headers or footers) in the output window, switch back to your stored procedure editing window. Put your cursor immediately after the "SET " in your update statement and paste the text in. Find the last comma and delete it.
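
For illustration, the assembled UPDATE might end up looking roughly like this (only a handful of the 100+ columns are shown, using column names mentioned elsewhere in this thread as examples; the real SET list is whatever the query above generates):

UPDATE tbl_SWInventory
SET
 Uptime = D.Uptime,
 OS = D.OS,
 SW_1 = D.SW_1,
 SW_2 = D.SW_2
 -- ...and so on for the remaining generated columns, with the final comma removed
FROM tbl_SWInventory T
JOIN dts_SWInventoryTemp D ON D.ComputerName = T.ComputerName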

NOTE: ALWAYS do the UPDATE first, then the INSERT.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • John Rowan (3/27/2008)


    From the looks of your code, you are trying to do this procedurally, maybe even with a cursor...correct?? This whole operation can be done with 2 set based queries. Here's an example.

    The tables in question have around 100 columns and 900 or so rows. It looks like your query would require writing a line for every column and every row - is that correct, or am I missing something?

    Of course, it's not a huge problem to write it that way, but it seems like it would be fairly inefficient.

  • Thanks for the reply rbarryyoung, giving it a shot now.

  • The tables in question have around 100 columns and 900 or so rows. It looks like your query would require writing a line for every column and every row - is that correct, or am I missing something?

    Yes, you are missing something. Rbarry and I are actually talking about doing the same thing. Rbarry has just provided you with a slick way to script out the SET portion of your update statement without having to manually type the column names in.

Take the output that Rbarry has given you and put the column names into the UPDATE statement that I gave you, and voilà! Run the two statements in the order that I gave them to you: first UPDATE, then INSERT. The UPDATE will take care of all existing rows and the INSERT will grab all missing rows. There is no need to do things one row at a time. That is the whole purpose of set-based processing.

    John Rowan


  • Ah, I see what you're saying.

    Thanks for all your help guys, I've got this working now, and added it as a step in my DTS package that is handling all this.

    I'm a network admin, not a dba, so some of these concepts are still new to me. I have to say that I'm thoroughly enjoying this though. I'm building an inventory database that will let me track information about all the Windows nodes on my network (since I can't get the company to fork out for a decent product that does it 😉 ).

The table in question is the result of a batch script that writes to a text file, which is then parsed by Perl into a db-friendly format. It contains all the information on my machines, from hardware to uptime, to every piece of software that's installed on them.

    Some of them have a lot of software installed, hence the large number of columns in my table (SW_1, SW_2....SW_96).

    There is probably a more elegant way to store this information that would make it easier to access and search.

    What I would like to be able to do is answer the question "How many machines have this particular piece of licensed software installed on them?"

It looks like I am going to have to full-text index every one of these software columns and develop a CONTAINS query that searches every field for the particular piece of software, because sometimes Adobe Photoshop will be listed under SW_3 and sometimes it will be listed under SW_7. I can't see any easy way to get around this.
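
For example, without full-text indexing, searching for one product across the flattened columns ends up looking roughly like this (only three of the 96 SW columns shown):

SELECT COUNT(*)
FROM tbl_SWInventory
WHERE SW_1 LIKE '%Adobe Photoshop%'
   OR SW_2 LIKE '%Adobe Photoshop%'
   OR SW_3 LIKE '%Adobe Photoshop%'
   -- ...repeated through SW_96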

    Is it going to completely kill my server to index and query against so many columns?

    Thanks again for your help fellas - you saved me a lot of time. :up:

It sounds like a spreadsheet might work just as well for you with your current structure?

    In any case, rather than having columns such as

    software1, software2, software3, etc

    and worrying about whether photoshop is in column 1 or 7, you should have a table called "Software" which lists the possible software packages with columns such as

    softwareID, softwareName, manufacturer, version, etc

    And then another table called ComputerSoftware with columns such as

    computerID, softwareID, installedDate, etc

    This way you can search for software and computer easily without worrying about software1 vs software7. What if you only had 10 software columns and a computer had 11 applications installed? What if you decided to uninstall photoshop from several of the machines? You'd have to remove it from software 1, 2, 3, 4, etc depending on each computer and then move the data within each row to shuffle the columns down by one.... Lots of work 🙂
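
As a rough sketch (data types and constraints here are only illustrative), those two tables and the licensing question might look something like this:

CREATE TABLE Software (
    softwareID   int IDENTITY(1,1) PRIMARY KEY,
    softwareName varchar(100) NOT NULL,
    manufacturer varchar(100) NULL,
    version      varchar(50)  NULL
)

CREATE TABLE ComputerSoftware (
    computerID    varchar(50) NOT NULL,  -- e.g. ComputerName or asset tag
    softwareID    int NOT NULL REFERENCES Software (softwareID),
    installedDate datetime NULL,
    PRIMARY KEY (computerID, softwareID)
)

-- "How many machines have this piece of licensed software installed?"
SELECT COUNT(DISTINCT cs.computerID)
FROM ComputerSoftware cs
JOIN Software s ON s.softwareID = cs.softwareID
WHERE s.softwareName LIKE '%Photoshop%'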

    Look up some articles on here about normalization - it's not at all hard and will make querying and management of your data much more straightforward. Please post more questions here if you have them! 🙂

  • Thanks for the advice, Ian.

    I did actually think about doing it that way when I planned out the db structure, but decided against it mainly for two reasons:

    1) It would be much harder to get the data into a format that would work that way. The data I am importing is basically one huge text file with a section for each computer and a list of uptime, OS version, cpu, memory, a few other things, and then a list of every piece of software installed on the machine. This is the way the (free) application I use outputs its results when I run it as a batch. I wrote a Perl script to parse it out into a tab-separated file so that it's basically

    ComputerName*TAB*Uptime*TAB*OS*TAB*SW_1*TAB*SW_2*TAB*etc...

    A SQL Job kicks off the Perl script, then runs the DTS package that imports the data. I could re-write this to give a different format, but wasn't sure it would be worth it, although I've started to re-consider a couple times while working on it.

    2) Some computers have 70 or 80 different pieces of software on them, so I would have to discard all the other pieces of software that I didn't include in my Software table. Then, any time I wanted to include a new piece of software that I hadn't before, I would have to add it to the software table, then go back and change my code to include it in the output file.

    Also, this data is generated and will be imported on a daily basis, so I'll never have to manually update the tables.

    Still, I see your point and it may be worth looking into.

    I know that just from my single post it seems like I could accomplish what I'm trying for with just a spreadsheet, but this is just a small part of the database I'm putting together. I'm also importing Computer and User account information from Active Directory into separate tables in the database, importing some files that are generated by login scripts to it with other information, and a couple of other things. They are all keyed together either by the ComputerName, or by our inventory asset tag numbers on each machine (except servers).

    So far, I only have one .asp page that queries the database for information from the LoginHistory table (login script info), but I've already been able to use quick SQL queries against this to report a lot of information that used to take hours to put together. Eventually I plan to have a set of .asp pages on our intranet/helpdesk server and make a lot of these things available to our whole department, so they don't have to rely on me to manually put this stuff together from different sources every time it's needed.

    All in all, this is probably going to save our department a dozen man hours every week, so I'm kinda proud of what I've been able to accomplish so far. 🙂

    Again though, thanks for the advice (I'm certainly not discarding your suggestion - I may still change things yet), and I'll check into the normalization topics.

  • aaharris (3/27/2008)


    Thanks for the reply rbarryyoung, giving it a shot now.

    Let us know how it works.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • It works perfectly. It's now part of the DTS package I'm using to import the data.

    Thanks again, this would have taken me forever to figure out otherwise, and my solution undoubtedly wouldn't have been nearly as efficient. :thumbsup:

  • aaharris (3/27/2008)


    Thanks for the advice, Ian.

    I did actually think about doing it that way when I planned out the db structure, but decided against it mainly for two reasons:

    1) It would be much harder to get the data into a format that would work that way. The data I am importing is basically one huge text file with a section for each computer and a list of uptime, OS version, cpu, memory, a few other things, and then a list of every piece of software installed on the machine. This is the way the (free) application I use outputs its results when I run it as a batch. I wrote a Perl script to parse it out into a tab-separated file so that it's basically

    ComputerName*TAB*Uptime*TAB*OS*TAB*SW_1*TAB*SW_2*TAB*etc...

    A SQL Job kicks off the Perl script, then runs the DTS package that imports the data. I could re-write this to give a different format, but wasn't sure it would be worth it, although I've started to re-consider a couple times while working on it.

    2) Some computers have 70 or 80 different pieces of software on them, so I would have to discard all the other pieces of software that I didn't include in my Software table. Then, any time I wanted to include a new piece of software that I hadn't before, I would have to add it to the software table, then go back and change my code to include it in the output file.

    Also, this data is generated and will be imported on a daily basis, so I'll never have to manually update the tables.

    Still, I see your point and it may be worth looking into.

It's ultimately a matter of "pay now or pay later". Normalizing the data early on makes for time savings down the line. Having been in your shoes before, I can tell you the "savings" you're talking about in the initial load will end up costing you in the future, and the difficulties with playing fast and dirty up front end up costing a lot more overall. Build it "right" or this will come back and bite you, over and over and over again. Really.

Point #1: I guess there's a reason for it, but why parse the file, and then parse the parsed file? Why not import it directly and handle it within SQL Server? Also, if you ARE going to parse it, why not build something like XML instead of a delimited file (which would get around your formatting issues)?

As to your point #2, it's invalid, since you already have to do that. If you were to bring the data in in a normalized fashion, you wouldn't have to change the program each time (which you do today, in three places: the Perl program to include it as one or more new columns, the data table to have a place to put the data, and the import process to include it in the stuff being brought in).

Normalization allows for a lot more flexibility than you will ever achieve in a flat-file context.

Again, just trying to save you some pain. You're setting yourself up for a LOT of it as it now stands.


Edit: On Point #2, I think you may have misunderstood how I have it set up. No matter how many applications get loaded on machines, I'll never have to make any changes, because every application is just listed beside all the other data in the text file; all of that is imported directly into the table, and if there are more than 96 pieces of software installed, the extras just don't get imported. Which is a problem in itself that would probably come back to bite me later.

    I'm not extremely familiar with XML (I have worked with it a little, although it was parsing XML files instead of creating them), but if you'd like to show me what an SQL-friendly XML file should look like, and help me out with getting that into the database, I'd love to do it that way.

    🙂

One thing you could do to make this a little more friendly is change your Perl script to create two files. One would contain the data for the computer, like:

    Computer_ID

    Computer_Name

    Computer_OS

    Computer_Uptime

    and another that loops through the software

    Software_Computer_ID

    Software_Name

    Software_InstallDate

That way, if you want to know which computers have Photoshop, it's an easy query against the software table, whereas currently you would need to look at all 96 software fields to find them all. This also removes the 96-software-title limitation.
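
With that split, the licensing question becomes one small query. As a sketch, assuming the software file is loaded into a table called tbl_Software (an illustrative name):

SELECT COUNT(DISTINCT Software_Computer_ID)
FROM tbl_Software
WHERE Software_Name LIKE '%Photoshop%'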

