August 24, 2010 at 11:45 pm
Just curious on how you people would do this. We have an importer built that uses a web application, uploads an excel file, and loads that into the database into an intermediate table. I have to then slightly massage that data and build a couple extra records out of it, and then insert it into a table. I was thinking or using a stored proc, but if one of the records can't be attached to the correct primary keys, it will case the bulk insert to fail.
I was thinking I could test the records in this intermediate table (which has no foreign key constraints on it, just a black holding area) by LEFT OUTER JOIN it to each table that has a key found in the created record, and if any individual corresponding column contains a null, I could capture it that way, and exclude it from the bulk insert, leaving its record in the intermediate table and removing the records that pass the test from the intermediate table.
The other option would be to do this load one record at a time. I'm not entirely sure how to do this, and I would think this would be quite slow, granted my import is only generally a max of 100 records.
Is this where you would use a cursor? I've never used them, but heard a lot of bad things about them.
Just curious how other people do things like this. Our web application does a 1 by 1 insert on another import in the database, but at 10 000+ records its very slow, and it error checks each record, sending individual sql statements to the database server to execute.
any advice at all is greatly appreciated.
August 25, 2010 at 3:30 am
i know SSIS is built for this, but I always end up using what i know well...like you suggest, i do it the way you describe: bulk insert into a staging table, I usually try to use a view for the data massage/joining to foreign keys whenever possible, so the data in bulked in is available for review intact. left outer joining for updates from the staging table, then left outer joining to the destination table for inserts (so i don't update the data i just inserted)
I've played with the merge statement on a couple of projects since i learned how to use that, which basically does exactly what i described above, but in one syntactical statement(i think it still performs two operations...update then insert)
Lowell
August 25, 2010 at 10:24 am
Ya we are working with SQL Server 2005 Express edition, so there is no SSIS available. Besides, we are trying to create a univseral style of solution that we can replicate in various ways for other imports.
If I'm correct, this 'merge' statement you are speaking of only exists in sql 2008? I have read about it before some where, but I can essentially do the same thing with outer joins and an extra pass at the source data table, so for now thats no problem.
Does any one else suggest a different way? This is kinda a rushed project, and may require me to make a descision ASAP (i.e. today).
August 25, 2010 at 10:39 am
Lowell, are you saying that you would join your imported data to a view to get the appropriate values, and then place it into the staging table?
Ideally this is what I would like to do, then the user can work with and modify certain aspects of the data that isn't going to cause problem, and then call my stored proc to do the insert/updates. I would assume the best way to do this would be to get the imported data into a table variable or temp table, then left outer join to the required other tables to get the corresponding key columns, then dump the results of that into a staging table. Is this what your suggesting?
The imported data doesn't have a foreign key like my destination table requires, just a character 'code', so I have to join the imported data to a link table at some previous point and get the corresponding foreign key to this 'code'. I would assume it is better to do this before loading the data into the staging table, so all corresponding records are intact. Only problem is that I am not too sure if the frontend (web app) can import from a CSV all the data into a sort of temp table and hold it that way before joining to the remaining data and then inserting into the staging table. I would assume the front end (vb code and visual web developer, can't insert into a temp table or table variable, perhaps it would have to join one like at a time and insert one line at a time.
August 25, 2010 at 11:52 am
loki1049 (8/25/2010)
Lowell, are you saying that you would join your imported data to a view to get the appropriate values, and then place it into the staging table?
what i am saying is I create a view that does joins to my tables...for exmaple if i had a staging table that had CITY STATE ZIP in it, but my destiniation table uses CITYID and STATEID, i create a view that does the join to TBCity and TBState to get those id's...i don't add columns to the staging table...
if the Lastname ahs to be data massaged into propercase, i use the view to do that as well..
then my final insert comes from the view into teh destination table...it just makes it cleaner for me, as the view may have dozens of joins and cleanups.
your example where you might have to add a case statement to generate your character 'code' , or a status, or anything else based ont he data itself, is what i try to do in the view, instead of a whopper insert/update with lots more joins.
Lowell
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply