Importing text data via BCP and removing double quotes.

  • Hi all,

    I've been struggling with a BCP IN problem here the text is contained within double quotes. I want to remove the leading and trailing double quotes in the data file via a BCP format file. However, I'm not getting the results I expect. Below is the CREATE TABLE, data file, format file and errors I am getting along with the BCP IN. I get one row (last row) copied in successfully, but errors on all other rows. Note, this is just a small sample of a much larger dataset I need to load. Any help or pointers to correcting the BCP format file would be greatly appreciated!

    -- table:

    CREATE TABLE [dbo].[HPKeywordsSmall](

    [KEYWORDS] [nvarchar] (4000) NULL,

    [VISIT_HIGH] [varchar] (20) NULL,

    [VISIT_LOW] [varchar] (20) NULL,

    [VISIT_NUMBER] [varchar] (10) NULL,

    [SEQNO] [varchar] (10) NULL,

    ) ON [PRIMARY]

    GO

    -- BCP IN

    bcp FASTSearchHP.dbo.HPKeywordsSmall in c:\SQLTest\internal_search_data_small.txt -fc:\SQLTest\HPKeywordsSmall.fmt -ec:\SQLTest\HPKeywordsSmallerr.txt -T -S JKANE2\SQL100R2 -F 2

    -- input text file:

    KEYWORDS~VISIT_HIGH~VISIT_LOW~VISIT_NUMBER~SEQNO

    "xw6000"~"100005250290113"~"45071774156593787"~20~1

    "xw6000"~"100005250290113"~"45071774156593787"~20~3

    "non%20system%20diisk"~"100005250290113"~"45071774156593787"~20~11

    "non%20system%20disk"~"100005250290113"~"45071774156593787"~20~12

    "hp compaq dc7600 sff"~"1000108583"~"2751861534"~1~2

    NOTE: The last row is imported, but the columns are not aligned properly, from SELECT after BCP IN:

    KEYWORDS VISIT_HIGH VISIT_LOW VISIT_NUMBER SEQNO

    --------------- -------------------- -------------------- ------------ ----------

    NULL hp compaq dc7600 sff 1000108583 2751861534 1~2

    -- format file:

    10.0

    5

    1 SQLCHAR 0 8000 "\"" 1 KEYWORDS SQL_Latin1_General_CP1_CI_AS

    2 SQLCHAR 0 20 "\"~\"" 2 VISIT_HIGH SQL_Latin1_General_CP1_CI_AS

    3 SQLCHAR 0 20 "\"~\"" 3 VISIT_LOW SQL_Latin1_General_CP1_CI_AS

    4 SQLCHAR 0 10 "\"~" 4 VISIT_NUMBER SQL_Latin1_General_CP1_CI_AS

    5 SQLCHAR 0 10 "\r" 5 SEQNO SQL_Latin1_General_CP1_CI_AS

    -- errors:

    #@ Row 1, Column 4: String data, right truncation @#

    <NULL>xw60001000052502901134507177415659378720~3

    #@ Row 2, Column 4: String data, right truncation @#

    <NULL>non%20system%20diisk1000052502901134507177415659378720~11

    #@ Row 3, Column 4: String data, right truncation @#

    <NULL>non%20system%20disk1000052502901134507177415659378720~12

    Any thoughts on modifying the BCP format file parameters or other ideas?

    Thanks,

    John


    John T. Kane

  • You need to define a dummy column to handle the first quote.

    for this input text file:

    KEYWORDS~VISIT_HIGH~VISIT_LOW~VISIT_NUMBER~SEQNO

    "xw6000"~"100005250290113"~"45071774156593787"~20~1

    "xw6000"~"100005250290113"~"45071774156593787"~20~3

    "non%20system%20diisk"~"100005250290113"~"45071774156593787"~20~11

    "non%20system%20disk"~"100005250290113"~"45071774156593787"~20~12

    "hp compaq dc7600 sff"~"1000108583"~"2751861534"~1~2

    Your format file should be something like thid (haven't tested it)

    10.0

    6

    1 SQLCHAR 0 0 "\"" 10 skip 0 skip SQL_Latin1_General_CP1_CI_AS

    2 SQLCHAR 0 8000 "\"~\"" 1 KEYWORDS SQL_Latin1_General_CP1_CI_AS

    3 SQLCHAR 0 20 "\"~\"" 2 VISIT_HIGH SQL_Latin1_General_CP1_CI_AS

    4 SQLCHAR 0 20 "\"~" 3 VISIT_LOW SQL_Latin1_General_CP1_CI_AS

    5 SQLCHAR 0 10 "~" 4 VISIT_NUMBER SQL_Latin1_General_CP1_CI_AS

    6 SQLCHAR 0 10 "\r" 5 SEQNO SQL_Latin1_General_CP1_CI_AS

    The probability of survival is inversely proportional to the angle of arrival.

  • sturner hit the nail on the head...we must have been working on this at the same time

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Excellent! It was the extra skip row and some changes to the escape characters. Below is the actual format file that was successful!

    10.0

    6

    1 SQLCHAR 0 0 "\"" 0 skip SQL_Latin1_General_CP1_CI_AS

    2 SQLCHAR 0 8000 "\"~\"" 1 KEYWORDS SQL_Latin1_General_CP1_CI_AS

    3 SQLCHAR 0 20 "\"~\"" 2 VISIT_HIGH SQL_Latin1_General_CP1_CI_AS

    4 SQLCHAR 0 20 "\"~" 3 VISIT_LOW SQL_Latin1_General_CP1_CI_AS

    5 SQLCHAR 0 10 "~" 4 VISIT_NUMBER SQL_Latin1_General_CP1_CI_AS

    6 SQLCHAR 0 10 "\r" 5 SEQNO SQL_Latin1_General_CP1_CI_AS

    Thanks,

    John


    John T. Kane

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply