Bulk Insert (How do I find my rowterminator?)

  • So, my other option. If I just import the file into one columnCREATE TABLE BULKACT(RAWDATA VARCHAR (MAX))

    BULK INSERT BULKACT FROM 'C:\ftp\NAV830.curr' then I can only import the rows where substring(rawdata,1,3) = 'HDR' but how do I parse out the asterisks as columns?

    Or is this that "DelimitedSplit8K" that Ed Wagner was talking about?

  • DKY (3/11/2015)


    So, my other option. If I just import the file into one columnCREATE TABLE BULKACT(RAWDATA VARCHAR (MAX))

    BULK INSERT BULKACT FROM 'C:\ftp\NAV830.curr' then I can only import the rows where substring(rawdata,1,3) = 'HDR' but how do I parse out the asterisks as columns?

    Or is this that "DelimitedSplit8K" that Ed Wagner was talking about?

    Yes. As long as your row isn't over 8000 characters long, DelimitedSplit8K is about to be your friend. Here's an example:

    WITH cteData(Line) AS (

    SELECT 'HDR*20150309000001*00009*3588331C2AI547A&101428*X1*622553*0*20150308*8*119*2230*937-342-5042*002*3588331C2'

    )

    SELECT s.ItemNumber, s.Item

    FROM cteData d

    CROSS APPLY dbo.DelimitedSplit8K(d.Line, '*') s

    ORDER BY s.ItemNumber;

    You're going to need Jeff's function, which he explains at http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    The associated article on tally tables is at http://www.sqlservercentral.com/articles/T-SQL/62867/.

    If you aren't familiar with them, take the time to read the articles and understand how they work. It's very well worth it. I think they'll change the way you look at string and will change your expectations of performance.

  • Ed Wagner (3/12/2015)


    DKY (3/11/2015)


    So, my other option. If I just import the file into one columnCREATE TABLE BULKACT(RAWDATA VARCHAR (MAX))

    BULK INSERT BULKACT FROM 'C:\ftp\NAV830.curr' then I can only import the rows where substring(rawdata,1,3) = 'HDR' but how do I parse out the asterisks as columns?

    Or is this that "DelimitedSplit8K" that Ed Wagner was talking about?

    Yes. As long as your row isn't over 8000 characters long, DelimitedSplit8K is about to be your friend. Here's an example:

    WITH cteData(Line) AS (

    SELECT 'HDR*20150309000001*00009*3588331C2AI547A&101428*X1*622553*0*20150308*8*119*2230*937-342-5042*002*3588331C2'

    )

    SELECT s.ItemNumber, s.Item

    FROM cteData d

    CROSS APPLY dbo.DelimitedSplit8K(d.Line, '*') s

    ORDER BY s.ItemNumber;

    You're going to need Jeff's function, which he explains at http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    The associated article on tally tables is at http://www.sqlservercentral.com/articles/T-SQL/62867/.

    If you aren't familiar with them, take the time to read the articles and understand how they work. It's very well worth it. I think they'll change the way you look at string and will change your expectations of performance.

    +1 to that Ed. You beat me to it.

    Question: Is this an EDI data file for some kind of Shipping or Cargo data. The HDR row looks a lot like that.

  • @Ed Wagner

    your code example makes the data into rows, not columns. How do I use this to make columns?

  • I'm still trying to figure out how to make this data into columns. Is it not possible with "DelimitedSplit8K"? Should I use something else? Is there something else?

  • Resolving this issue is actually pretty easy but I need some data to work with. Any chance of you attaching a file that I can test with after work today?

    And, I'm assuming that there's no PII in this file. If there is, we'll have to figure something else out.

    I also need to know which version and edition of SQL Server that you're actually using for all of this.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Attached is a smaller version of the file that I'm using. I opened it in notepad++, edited it and saved it as a .txt file so I could upload it using the standards permitted so I don't know if saving it changes the format in anyway but here goes.

    To answer your other question I'm using SQL Server 2008 R2.

  • DKY (3/17/2015)


    Attached is a smaller version of the file that I'm using. I opened it in notepad++, edited it and saved it as a .txt file so I could upload it using the standards permitted so I don't know if saving it changes the format in anyway but here goes.

    To answer your other question I'm using SQL Server 2009 R2.

    Ummm... did I miss a release? You meant 2008 R2?

    This X12 message does look familiar to me. Kind of looks like a Bill of lading or Inventory release message.

    It looks like some kind of Inventory Release record with multiple line Items in the DTL rows for each OHD and HDR segment.

    Wrote more than one Script Tasks in SSIS to parse and stage data in this file format before.

    Then had a stored procedure the SSIS package executed to complete the insert of data into a billing database.

    <edit> Sorry, got no clue on a "good" way to do this only with Bulk Insert.

  • Jeff Moden (3/17/2015)


    Resolving this issue is actually pretty easy but I need some data to work with. Any chance of you attaching a file that I can test with after work today?

    And, I'm assuming that there's no PII in this file. If there is, we'll have to figure something else out.

    I also need to know which version and edition of SQL Server that you're actually using for all of this.

    Jeff, did you ever get a chance to look at this?

  • I think I may have figured out a work around.

    > I bulk insert the entire file into a one column table

    > I then bcp the data into three different text files where the substring(colname,1,3) = 'HDR' for example

    > I create three different tables with different column counts (all varchar(max))

    > Then I bulk insert the three files into their respective temp tables using something like WITH (FIELDTERMINATOR = '*', ROWTERMINATOR = '')

    It seems to work really well and do what I was looking to do.

  • DKY (3/21/2015)


    I think I may have figured out a work around.

    > I bulk insert the entire file into a one column table

    > I then bcp the data into three different text files where the substring(colname,1,3) = 'HDR' for example

    > I create three different tables with different column counts (all varchar(max))

    > Then I bulk insert the three files into their respective temp tables using something like WITH (FIELDTERMINATOR = '*', ROWTERMINATOR = '')

    It seems to work really well and do what I was looking to do.

    That's pretty much what I was going to do. The question is, are you able to match the OHD and HDR rows with the DTL rows? It looks like the second field may be the key to that but I just don't know enough about the data to tell if there's something common in each of the 3 row types and have a method to do so if that's not the key.

    Also, did you actually do the BCP outs with no row terminators? You might be in for a bit of a surprise if you did.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply