Help with bcp - Do not want to delimit, just load each row into 1 column

  • I am tasked with a project of taking a text file that has a 1-row header of 4 columns, a 1-row footer of 2 columns, and the data in between of varying rows and columns. For the sake of this forum, let's say it is 10 columns. All of the fields are pipe delimited and the row terminator is LF (I think...).

    The idea is to use the header of the file to determine which table the data will eventually go to. The first column is a header identifier, "H", the second column is a company identifier, "XXXX", the third column is the table name, "YYYYYYYYY", and the final column is an issue or created date in string format, "YYYYMMDDHHMMSS". This is done so that if the company that produces the file makes an error in the file name itself, we can still determine which table the data is actually destined for, along with a verification of the creation date, simply from the header. The footer simply contains a footer identifier in the first column, "T", and then a count of the number of data rows. This acts as a verification that the data export to file was complete. These things I do not have control over; it is what it is.

    So, in order to load the header into its own table, along with its footer, I want to first bcp each row into 1 column without using the pipe delimiter. This way, I can identify the header and footer rows and move them to another table. I will then bcp out the rest of the data and bcp it into its corresponding staging table, since all of the data will then have the same number of columns and data types (presumably, now that the header and footer are gone). The sketch below shows the middle step I am aiming for.
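
    To make that concrete, here is a rough sketch of that middle step, assuming the single-column load has already worked. The control table name (dbo.file_control) and its column are made up purely for illustration, and it assumes no real data row starts with "H|" or "T|":

    -- Sketch: once every line of the file is in bcp_staging(dataImport),
    -- peel the header ("H|...") and footer ("T|...") rows off into a
    -- control table, leaving only the pipe-delimited data rows behind.
    -- dbo.file_control is a hypothetical table used only for illustration.
    INSERT INTO dbo.file_control (rawLine)
    SELECT dataImport
    FROM dbo.bcp_staging
    WHERE LEFT(dataImport, 2) IN ('H|', 'T|');

    DELETE FROM dbo.bcp_staging
    WHERE LEFT(dataImport, 2) IN ('H|', 'T|');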

    Here is what I have tried:

    DECLARE @cmd VARCHAR(8000);

    IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'bcp_staging')
    BEGIN
        DROP TABLE bcp_staging;
        CREATE TABLE bcp_staging (dataImport VARCHAR(8000));
    END
    ELSE
        CREATE TABLE bcp_staging (dataImport VARCHAR(8000));

    SET @cmd = 'bcp dbName.dbo.bcp_staging IN "C:\Users\me\Documents\fileName.txt" -T -t -r -n';

    EXEC master..xp_cmdshell @cmd;

    This fails with the following error: Error = [Microsoft][SQL Server Native Client 10.0]String data, right truncation

    I know that no single row comes anywhere close to 8000 characters, and I am "pretty sure" the encoding of the data is non-Unicode.

    I have also tried replacing -n with -w, -c, and -N... I am not too familiar with all of the bcp options.

    Your thoughts are appreciated.

    Jared
    CE - Microsoft

  • Can you get bcp to work manually? I know I have done this myself; I just can't remember off the top of my head how I handled it.

  • Nope, but BULK INSERT seems to be working now... Just can't get the row terminator right.

    Jared
    CE - Microsoft

  • I can tell you it took some playing around when I was doing it. Problem is, it was so long ago that it is hard to remember what we did to get the data imported into a single-column table.

  • Lynn Pettis (8/8/2012)


    I can tell you it took some playing around when I was doing it. Problem is, it was so long ago that it is hard to remember what we did to get the data imported into a single-column table.

    Ugh... This should be the easiest thing to do! Something must be different about my file because I can't get the row to terminate. The strange thing is that SSIS was able to do it with no problem when I skipped the header row. UGH!

    Jared
    CE - Microsoft

  • Maybe the files I am using are encoded differently than yours, or maybe I am missing something here, but any time I have had to load a whole line from a file into a single column and ignore any field delimiters, I use the following:

    create table #myTest
    (
        buffer varchar(8000)
    )

    bulk insert #myTest from 'c:\temp\whatever.txt'
    with
    (
        fieldterminator = '\n'
    )

    Edit: the code was not displaying correctly when first posted; fieldterminator should be '\n' (backslash-n, with no space).

  • roryp 96873 (8/8/2012)


    Maybe the files I am using are encoded differently than yours, or maybe I am missing something here, but any time I have had to load a whole line from a file into a single column and ignore any field delimiters, I use the following:

    create table #myTest
    (
        buffer varchar(8000)
    )

    bulk insert #myTest from 'c:\temp\whatever.txt'
    with
    (
        fieldterminator = '\n'
    )

    Edit: the code was not displaying correctly when first posted; fieldterminator should be '\n' (backslash-n, with no space).

    Yeah, I am stumped! I am using {LF} as the row terminator in SSIS Flat File Connection Manager, and it is working fine. I think I just may have to work with SSIS for this.

    Jared
    CE - Microsoft

  • SQLKnowItAll (8/8/2012)


    Lynn Pettis (8/8/2012)


    I can tell you it took some playing around when I was doing it. Problem is, it was so long ago that it is hard to remember what we did to get the data imported into a single-column table.

    Ugh... This should be the easiest thing to do! Something must be different about my file because I can't get the row to terminate. The strange thing is that SSIS was able to do it with no problem when I skipped the header row. UGH!

    Do you have an editor that will allow you to look at the hex values? I have used UltraEdit to do that so that I can see what the terminator(s) are at the end of a line of a text file.
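
    If a hex editor is not handy, one alternative (just a sketch; the path is the one from the original post) is to read the start of the file as binary from T-SQL and look at the bytes directly: 0x0D0A at the end of a line means CR/LF, while a lone 0x0A means LF only.

    -- Sketch: inspect the first 256 bytes of the file to see the actual
    -- line terminators (assumes the SQL Server service account can read the path).
    SELECT CAST(f.BulkColumn AS VARBINARY(256)) AS first_bytes
    FROM OPENROWSET(BULK 'C:\Users\me\Documents\fileName.txt', SINGLE_BLOB) AS f;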

  • Lynn Pettis (8/8/2012)


    SQLKnowItAll (8/8/2012)


    Lynn Pettis (8/8/2012)


    I can tell you it took some playing around when I was doing it. Problem is, it was so long ago that it is hard to remember what we did to get the data imported into a single-column table.

    Ugh... This should be the easiest thing to do! Something must be different about my file because I can't get the row to terminate. The strange thing is that SSIS was able to do it with no problem when I skipped the header row. UGH!

    Do you have an editor that will allow you to look at the hex values? I have used UltraEdit to do that so that I can see what the terminator(s) are at the end of a line of a text file.

    HA! Did that, and now I know that it is ASCII 10. So, here is the solution that I found in BOL for "Line Feed" on BULK INSERT:

    DECLARE @bulk_cmd varchar(1000);

    SET @bulk_cmd = 'BULK INSERT dbName.dbo.bcp_staging
    FROM ''C:\Users\me\Documents\fileName.txt''
    WITH (ROWTERMINATOR = '''+CHAR(10)+''')';

    EXEC(@bulk_cmd);
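
    The dynamic SQL is what makes this work: ROWTERMINATOR has to be a string literal, so CHAR(10) gets concatenated into the command text before execution. As a follow-up check (just a sketch; it assumes the header and footer rows are still sitting in bcp_staging and that the trailer looks like "T|<count>"), the trailer count can be compared against the number of data rows before anything is moved:

    -- Sketch: compare the count reported in the "T|..." trailer with the
    -- number of data rows actually loaded (assumes no data row starts
    -- with "H|" or "T|" and the trailer's second field is the row count).
    SELECT
        (SELECT COUNT(*)
           FROM dbo.bcp_staging
          WHERE LEFT(dataImport, 2) NOT IN ('H|', 'T|')) AS data_rows_loaded,
        (SELECT CAST(SUBSTRING(dataImport, 3, 20) AS INT)
           FROM dbo.bcp_staging
          WHERE LEFT(dataImport, 2) = 'T|')              AS trailer_row_count;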

    Jared
    CE - Microsoft

  • Glad to see you got it working.

  • I was going to add that when I run into import errors from files of that type, I generally load them up in Notepad++ and bring all the codes out into the open. You can usually see pretty quickly what the differences are in that case.

    Maybe next time.
