Bulk Insert not working for dataset

  • Hello all,

    Maybe this is my own ignorance showing, but I think I've hit a brick wall. I can't seem to bulk insert a dataset. I create a temp table at the start of the query, and then try to bulk insert my .csv into it. Now, if I have a simple .csv to deal with, it works fine. However, we are trying to take a .csv that is being exported from a supplier's software and import it into the database. My problem is as is below.

    1                 2              3

    500,          200,       "1,000"

    "7,000",   300,        500

    50,           "5,243",   232

    Basically, as soon as they roll over 1,000, the software will export the number as "5,000", instead of 5000, . Any thoughts on how to cope with this? I would use a replace function of some kind, if I knew how.

  • It looks like the problem you are having is related to the fields being quoted.  You need to specify in the BULK INSERT statement that the file is a CSV file and the FIELDQUOTE = '"' (single-quote, double-quote, single-quote).

    Review the documentation here: https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-2016

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Thanks for posting, sir. I did happen to try that on a development sql box that is version 14.0.2002.14 (2017) and it doesn't seem to work though. At first, I thought I had lost my marbles, because it ought to work. But then I tried the query with the same tables, .csv file, and everything on a 2016 box. Didn't work and displayed an error on 2016. 2017, no error, but still no dice.

    I would think if it didn't display the error on my 2017 box, that the FIELDQUOTE parameter is actually available, just not working. Is it maybe in a later release? That is all I can think of upto this point.

  • Just tried CU 15 for 2017, still no dice.

  • logan.keister wrote:

    Thanks for posting, sir. I did happen to try that on a development sql box that is version 14.0.2002.14 (2017) and it doesn't seem to work though. At first, I thought I had lost my marbles, because it ought to work. But then I tried the query with the same tables, .csv file, and everything on a 2016 box. Didn't work and displayed an error on 2016. 2017, no error, but still no dice. I would think if it didn't display the error on my 2017 box, that the FIELDQUOTE parameter is actually available, just not working. Is it maybe in a later release? That is all I can think of upto this point.

    Interesting... can you post the code that didn't work for this?  Also, if you could post any errors that occurred, that would be helpful, as well.

    EDIT:  Scratch my last about the errors... I now see where you said no errors were produced.

     

    • This reply was modified 5 years, 5 months ago by  Jeff Moden. Reason: EDIT:  Scratch my last about the errors... I now see where you said no errors were produced

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • BULK INSERT tempTable
    FROM '\\server1\data.csv'
    WITH
    (
    FIRSTROW = 6,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIELDQUOTE = '"',
    TABLOCK
    )

    So, I have a confession to make... in the interest of testing this, my temp table has all of the columns set to nvarchar(50). With this in mind, the errors I'm getting below are because of the fact that the last field in my table can't hold all of the leftover data as the other columns fill up quickly with only partial sets of data. So, some columns look like this:

    1           2

    "7         469"

    When it's supposed to be like this:

    1

    7469

    That is what is pushing all of the strings out. With that explanation, here are the errors I'm getting.

    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 10, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 12, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 20, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 25, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 27, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 32, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 34, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 36, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 40, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 41, column 76 (TempF).
    Msg 4864, Level 16, State 1, Line 86
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 42, column 76 (TempF).
    Msg 4865, Level 16, State 1, Line 86
    Cannot bulk load because the maximum number of errors (10) was exceeded.
    Msg 7399, Level 16, State 1, Line 86
    The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
    Msg 7330, Level 16, State 2, Line 86
    Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

    Thanks.

  • It's time to calibrate the people that are sending you the data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Trust me, I would love to. I believe calibration by 2x4 would be the only way they would listen, though. They are unhelpful, and their unhelpful tendencies and some of our employees have made this into an issue that an auditor caught wind of... It's been a ride.

  • logan.keister wrote:

    Trust me, I would love to. I believe calibration by 2x4 would be the only way they would listen, though. They are unhelpful, and their unhelpful tendencies and some of our employees have made this into an issue that an auditor caught wind of... It's been a ride.

     

    Heh... have suffered from "OPS" (Other People's S**t) myself before and I include auditors in that, as well.  I commiserate with ya on this one for sure.

    If you want to be able to import most of the rows and sequester the hairballs they include, you could try using the "ERRORFILE" features built into BULK_INSERT and do separate automatic or manual evaluations on the hairballs.  You'll also need to change the number of allowable errors to something like 2 billion.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • FIELDQUOTE looks like it was added in 2017, as well as the FORMAT = 'CSV' parameter.  This can be resolved using SSIS instead - since that will recognize the quotes and treat them appropriately even when other fields are not quoted.

    As for the systems where you have 2017 and it didn't work - verify the database compatibility and also insure you are using the FORMAT = 'CSV' option as well as the FIELDQUOTE.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Inserting the FORMAT = 'CSV'  option gives me a new error, googling around on the error at the moment but the things I've tried to fix the error aren't quite helping. Error below.

    Msg 7301, Level 16, State 2, Line 86
    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
  • Jeffrey Williams wrote:

    FIELDQUOTE looks like it was added in 2017, as well as the FORMAT = 'CSV' parameter.  This can be resolved using SSIS instead - since that will recognize the quotes and treat them appropriately even when other fields are not quoted. As for the systems where you have 2017 and it didn't work - verify the database compatibility and also insure you are using the FORMAT = 'CSV' option as well as the FIELDQUOTE.

    Even SSIS won't fix the erroneous data.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • BULK INSERT #CUSTOMER

    FROM 'E:\Loyalty_DWND\A22679.csv'

    WITH

    (

    FIRSTROW=2,

    --DATAFILETYPE = 'char',

    FIELDTERMINATOR =',',

    ROWTERMINATOR = '0x0a', --

    CODEPAGE = '65001',

    TABLOCK,

    FIELDQUOTE='"',

    FORMAT='CSV'

    )

  • sudishnch wrote:

    BULK INSERT #CUSTOMER FROM 'E:\Loyalty_DWND\A22679.csv' WITH ( FIRSTROW=2, --DATAFILETYPE = 'char', FIELDTERMINATOR =',', ROWTERMINATOR = '0x0a', -- CODEPAGE = '65001', TABLOCK, FIELDQUOTE='"', FORMAT='CSV'

    )

    That would do the trick in 2017 and up.  But... this thread is on a 2016 forum.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Let try below 2017 version

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply