How to differentiate (Field containing comma) in csv file while using bulk insert

  • Hi All,

    I 'm Using Bulk insert Statement to load .CSV File into temp table.

    one of my field contains data as eg."23,446" so while inserting it treats it as two different field.

    BULK INSERT [#a]

    FROM 'd:\evn_428018705_20081205.csv'

    WITH ( FIRSTROW = 2,

    FIELDTERMINATOR = ',',

    ROWTERMINATOR = ''

    )

    So can any one share their Opinion on it.

  • yea, you don't need to use a comma as a field separator.

    However if you have a "csv" file I would use SSIS to form the inserts.

    If the data is massive, you may need to transform it and use a different separator like ~ or something.

    Also you can use ssis on the sql server itself to do bulk inserts, which has worked for me in the past.

  • I just can't recollect now, can you just cross check for the option "TEXT IDENTIFIER" in BULK INSERT?

    foxjazz (3/14/2009)


    yea, you don't need to use a comma as a field separator.

    However if you have a "csv" file I would use SSIS to form the inserts.

    If the data is massive, you may need to transform it and use a different separator like ~ or something.

    Also you can use ssis on the sql server itself to do bulk inserts, which has worked for me in the past.

    This is SQL 2000 forum, so SSIS is not available.

    --Ramesh


  • Easiest way is to use ODBC driver for test files.

    Look in BOL for topics about Linked servers, OPENQUERY, OPENROWSET for details and examples.

    _____________
    Code for TallyGenerator

  • megham10 (3/13/2009)


    Hi All,

    I 'm Using Bulk insert Statement to load .CSV File into temp table.

    one of my field contains data as eg."23,446" so while inserting it treats it as two different field.

    BULK INSERT [#a]

    FROM 'd:\evn_428018705_20081205.csv'

    WITH ( FIRSTROW = 2,

    FIELDTERMINATOR = ',',

    ROWTERMINATOR = ''

    )

    So can any one share their Opinion on it.

    Not sure if you mean the field is surrounded by quotes or not. If it is, then you're in luck. You just need to make a BCP format file to handle it. If it's without quotes, what does the field appear as for numbers less than 1000?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • my field is like "Field"(Surrounded by "").But how to make BCP Format .

    Pls guide me.

  • You can find an explanation here – http://msdn.microsoft.com/en-us/library/ms191516(SQL.90).aspx"> http://msdn.microsoft.com/en-us/library/ms191516(SQL.90).aspx

    Adi

    --------------------------------------------------------------
    To know how to ask questions and increase the chances of getting asnwers:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Unfortunately, the MS documentation on this subject doesn't tell you how to handle text qualified strings that use quotes with BCP or Bulk Insert. You just have to "know".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • megham10 (3/15/2009)


    my field is like "Field"(Surrounded by "").But how to make BCP Format .

    Pls guide me.

    Sergiy is correct... OPENROWSET or a text based linked server may do the trick easily. If you want the super high speed performance of Bulk Insert, you'll need to use a BCP Format file along with a couple of undocumented "tricks". If you attach the first 10 rows of one of the files and the CREATE TABLE statement for the target table, we might be able to help in that area. Just make sure that nothing in the file would invade anyone's privacy.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My table Sructure is as follows

    Create Table [##Table] (UHRZEIT Varchar(100),RUFNUMMER Varchar(100),DAUER

    Varchar(100),BETRAG varchar(100),Note Varchar(100),URSPRUNG Varchar(100),ZIELORT Varchar(100))

    Contents of my CSV File is as below:

    First line is Header.

    UHRZEIT,RUFNUMMER,DAUER,BETRAG,Note,URSPRUNG,ZIELORT

    8:09:15,9.14E+11,0:01:34,"23,333",Actual cost = PurchaseCost/10000,,Indien

    11:03:36,9.20E+11,0:04:43,"67,667",,,Indien mobil

    15:53:53,9.20E+11,0:04:18,"60,667",,,Indien mobil

    18:01:03,9.14E+11,0:03:13,"46,667",,,Indien

    5:17:27,9.14E+11,0:00:13,"14,000",,,Indien

    5:17:55,9.14E+11,0:00:10,"14,000",,,Indien

    5:18:43,9.20E+11,0:02:37,"37,333",,,Indien mobil

    5:22:15,9.20E+11,0:09:39,"135,333",,,Indien mobil

    10:35:58,9.14E+11,0:03:45,"53,667",,,Indien

    10:44:17,9.14E+11,0:04:07,"58,333",,,Indien

    5:48:43,9.14E+11,0:14:19,"200,667",,,Indien

    6:03:15,9.14E+11,0:04:24,"63,000",,,Indien

    11:36:36,9.14E+11,0:03:25,"49,000",,,Indien

    16:42:55,9.14E+11,0:04:09,"58,333",,,Indien

    16:48:38,9.14E+11,0:00:13,"14,000",,,Indien

    16:49:25,9.20E+11,0:07:27,"105,000",,,Indien mobil

    16:57:20,9.20E+11,0:00:08,"14,000",,,Indien mobil

    16:58:06,9.20E+11,0:00:32,"14,000",,,Indien mobil

    8:50:54,9.20E+11,0:00:00,"0,1700",,,

  • Assumption:

    1. Data under column BETRAG is always enclosed in double quotes "223,12"

    2. Data for column other than BETRAG aren't enclosed in double quotes & nor do they contain comma

    Refer to 'References' section if data format under csv is different than above.

    Step1. Save the following under c:\Format.txt

    8.0

    8

    1 SQLCHAR 0 0 "" 0 x Latin1_General_CI_AS

    2 SQLCHAR 0 0 "," 1 UHRZEIT Latin1_General_CI_AS

    3 SQLCHAR 0 0 "," 2 RUFNUMMER Latin1_General_CI_AS

    4 SQLCHAR 0 0 ",\"" 3 DAUER Latin1_General_CI_AS

    5 SQLCHAR 0 0 "\"," 4 BETRAG Latin1_General_CI_AS

    6 SQLCHAR 0 0 "," 5 Note Latin1_General_CI_AS

    7 SQLCHAR 0 0 "," 6 URSPRUNG Latin1_General_CI_AS

    8 SQLCHAR 0 0 "\r" 7 ZIELORT Latin1_General_CI_AS

    Step2. Save c:\Data.csv. Ensure csv file is not opened under Microsoft Excel.

    Step3. Execute the script below

    IF OBJECT_ID('tempdb..#TEMP') IS NOT NULL

    DROP TABLE #TEMP

    CREATE TABLE #TEMP(

    UHRZEIT Varchar(100),

    RUFNUMMER Varchar(100),

    DAUER Varchar(100),

    BETRAG varchar(100),

    Note Varchar(100),

    URSPRUNG Varchar(100),

    ZIELORT Varchar(100)

    )

    bulk insert #TEMP from 'C:\Data.csv' with (FIRSTROW = 2, formatfile = 'C:\Format.txt')

    SELECT * FROM #TEMP

    References

    1. Format File: http://www.nigelrivett.net/SQLTsql/BCP_quoted_CSV_Format_file.html

    - Rohit Madhok

  • Assumption:

    1. Data under column BETRAG is always enclosed in double quotes "223,12"

    2. Data for column other than BETRAG aren't enclosed in double quotes & nor do they contain comma

    Refer to 'References' section if data format under csv is different than above.

    Step1. Save the following under c:\Format.txt

    8.0

    8

    1 SQLCHAR 0 0 "" 0 x Latin1_General_CI_AS

    2 SQLCHAR 0 0 "," 1 UHRZEIT Latin1_General_CI_AS

    3 SQLCHAR 0 0 "," 2 RUFNUMMER Latin1_General_CI_AS

    4 SQLCHAR 0 0 ",\"" 3 DAUER Latin1_General_CI_AS

    5 SQLCHAR 0 0 "\"," 4 BETRAG Latin1_General_CI_AS

    6 SQLCHAR 0 0 "," 5 Note Latin1_General_CI_AS

    7 SQLCHAR 0 0 "," 6 URSPRUNG Latin1_General_CI_AS

    8 SQLCHAR 0 0 "\r" 7 ZIELORT Latin1_General_CI_AS

    Step2. Save c:\Data.csv. Ensure csv file is not opened under Microsoft Excel.

    Step3. Execute the script below

    IF OBJECT_ID('tempdb..#TEMP') IS NOT NULL

    DROP TABLE #TEMP

    CREATE TABLE #TEMP(

    UHRZEIT Varchar(100),

    RUFNUMMER Varchar(100),

    DAUER Varchar(100),

    BETRAG varchar(100),

    Note Varchar(100),

    URSPRUNG Varchar(100),

    ZIELORT Varchar(100)

    )

    bulk insert #TEMP from 'C:\Data.csv' with (FIRSTROW = 2, formatfile = 'C:\Format.txt')

    SELECT * FROM #TEMP

    References

    1. Format File: http://www.nigelrivett.net/SQLTsql/BCP_quoted_CSV_Format_file.html

    - Rohit Madhok

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply