January 18, 2012 at 5:55 am
This is not my choice, I hasten to add, but I may need to do it.
Currently, I import a CSV file containing multiple lines using BULK INSERT (no SSIS yet, I'm afraid) into a temp table and remove duplicates in the usual way with ROW_NUMBER. I've discovered that searching for duplicates based on equal column values and then using another column as the discriminator may not be right: the physical order of the rows in the file should be used instead, keeping the last physical row when there are duplicates.
My question is a simple one: if I do a bulk insert, how can I guarantee that the rows in the target temp table end up in the same physical order as the file?
I would like to avoid inserting the lines one at a time from the client app and using cursors in the stored procedure (though I could do that, as the files are small).
Thanks
January 18, 2012 at 1:48 pm
You can bulk insert into a table with an identity column; the identity will preserve the insertion order.
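For example, something along these lines (just a sketch - the staging table, column names and duplicate key are placeholders for your own):

-- Staging table with an identity column to record arrival order
CREATE TABLE #Staging (
    RowId INT IDENTITY(1,1) NOT NULL,
    Col1  VARCHAR(50),
    Col2  VARCHAR(50),
    Col3  VARCHAR(50)
);

-- BULK INSERT the file into #Staging here. If the file has no field for
-- RowId you'd need a format file or a view so the identity column is skipped.

-- Remove duplicates, keeping the LAST physical row of each duplicate set:
-- the highest RowId is the latest row in the file.
WITH Ordered AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Col1, Col2   -- the duplicate key
                              ORDER BY RowId DESC) AS rn
    FROM #Staging
)
DELETE FROM Ordered
WHERE rn > 1;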
The probability of survival is inversely proportional to the angle of arrival.
January 18, 2012 at 1:51 pm
I think you also want to set MAXDOP = 1 during the bulk insert, as well as inserting into the identity table as sturner suggested.
Lowell
January 18, 2012 at 1:55 pm
My mistake, it's the TABLOCK option that prevents parallelism for BULK INSERT:
BULK INSERT BULKACT FROM 'c:\Export_o.txt'
WITH (
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',   -- newline assumed; adjust if the file uses a different terminator
    FIRSTROW = 1,
    TABLOCK                 -- TABLOCK prevents a parallel load
)
Lowell
January 18, 2012 at 2:27 pm
Thanks. My lack of knowledge: I didn't know you could bulk insert into a table with an identity column; I thought that all the columns had to match.
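For the record, if the target were a permanent table rather than a temp table, it looks like you could also get round the column mismatch by bulk inserting through a view that leaves out the identity column (a format file is the other route). A rough sketch with made-up names:

-- Permanent staging table (views can't be created on temp tables)
CREATE TABLE dbo.BulkAct_Staging (
    RowId INT IDENTITY(1,1) NOT NULL,
    Col1  VARCHAR(50),
    Col2  VARCHAR(50),
    Col3  VARCHAR(50)
);
GO

-- View exposing only the columns that exist in the CSV file
CREATE VIEW dbo.BulkAct_Load
AS
SELECT Col1, Col2, Col3
FROM dbo.BulkAct_Staging;
GO

-- Bulk insert through the view; SQL Server generates RowId automatically
BULK INSERT dbo.BulkAct_Load FROM 'c:\Export_o.txt'
WITH (
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK
);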