BULK INSERT with CSV files

  • I have a .csv data source that I am loading into a table with the BULK INSERT command. Occasionally, the .csv file has a cell whose field value contains a comma, for example "Smith, MD". That comma causes inaccuracies in the import. Any thoughts on how to get around this comma using BULK INSERT?

    Is there a different tool I should use? I liked BULK INSERT because I could write a stored procedure to write to a dynamic data source and a dynamic destination.

    Remember this when a developer tells you it will just be temporary. Temporary = Permanent.
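    For reference, one way around this without any post-processing: SQL Server 2017 and later let BULK INSERT parse quoted CSV natively. A minimal sketch, assuming SQL Server 2017+; the table name and file path here are illustrative only:

    ```sql
    -- Sketch only: dbo.StagingImport and the file path are placeholders.
    -- With FORMAT = 'CSV' (SQL Server 2017+), BULK INSERT honors RFC 4180-style
    -- quoting, so a field written as "Smith, MD" loads as one value instead of
    -- being split at the embedded comma.
    BULK INSERT dbo.StagingImport
    FROM 'C:\data\source.csv'
    WITH (
        FORMAT = 'CSV',         -- parse the file as quoted CSV
        FIELDQUOTE = '"',       -- character that encloses fields containing commas
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        FIRSTROW = 2            -- skip the header row, if the file has one
    );
    ```

    On versions before 2017, a format file or a staging-and-repair approach is needed instead.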

  • Though the question is old, it seemed interesting, so here is a possible way out.

    There is a way to detect which records contained unwanted commas.

    The last column of the destination table ends up holding exactly as many commas as were embedded in the actual data; that count can be calculated with the charIndex() function.

    BULK INSERT also leaves a marker: it prefixes the data of the column where the embedded commas began with a quote (").

    All you have to do is update each column, starting from the one whose data begins with (") and going up to the second-to-last column, with the data of the column the appropriate number of positions to its right.

    E.g.

    Suppose your table contains 10 columns; the 10th column will then contain the stray commas.

    Say there are 2 commas, and your 3rd column starts with (").

    You replace the 3rd column's data by concatenating the data of the 3rd, 4th, and 5th columns (5 - 3 = 2), and...

    Replace the 4th column's data with the 6th column's data (6 - 4 = 2), and so on, up to the 7th column (10 - (2 + 1) = 7).

    The 8th, 9th, and 10th columns' data comes out of the 10th column, using the substring() and charIndex() functions; the 10th (last) column holds the data of 3 columns (i.e. number of commas + 1).

    The 10th column's final value is derived from its own original value using the right() and charIndex() functions.
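    The repair described above can be sketched in T-SQL. A minimal sketch, assuming a hypothetical all-varchar staging table StagingImport with 5 columns (Col1..Col5) and exactly one embedded comma that began in Col2, so Col2 starts with a stray (") and Col5 ended up holding two fields:

    ```sql
    -- Rows whose last column still contains a comma were mis-split on import.
    -- In a single UPDATE, the right-hand sides all read the pre-update values,
    -- so the columns can be shifted back in place.
    UPDATE StagingImport
    SET Col2 = STUFF(Col2, 1, 1, '') + ',' + REPLACE(Col3, '"', ''),  -- rejoin the split field, dropping the stray quotes
        Col3 = Col4,                                                  -- shift the later columns back left
        Col4 = LEFT(Col5, CHARINDEX(',', Col5) - 1),                  -- first of the two fields packed into the last column
        Col5 = RIGHT(Col5, LEN(Col5) - CHARINDEX(',', Col5))          -- remaining field
    WHERE Col2 LIKE '"%'
      AND CHARINDEX(',', Col5) > 0;
    ```

    More embedded commas just mean a larger shift, following the arithmetic above.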

     

