Bulk Insert Row Terminator Woes

  • Hi All,

    I'm trying to bulk insert a pipe-delimited file with <CR><LF> as a row terminator. The problem is some fields may contain <CR><LF>. So, I've asked the client to provide a file where a pipe is after the last field as well so I can use |<CR><LF> as the row terminator. For example:

    Original File:

    COL_A|COL_B|COL_C<CR><LF>

    1|Here is some<CR><LF>

    some long comment text.|2011-01-01<CR><LF>

    2|Short Text|2012-01-01<CR><LF>

    3|Meh|2012-12-01<CR><LF>

    What I requested:

    COL_A|COL_B|COL_C|<CR><LF>

    1|Here is some<CR><LF>

    some long comment text.|2011-01-01|<CR><LF>

    2|Short Text|2012-01-01|<CR><LF>

    3|Meh|2012-12-01|<CR><LF>

    Here is the SQL I was going to use:

    bulk insert dbo.[MyTable] from 'MyFile' with (fieldterminator = '|',firstrow = 2,rowterminator = '|')

    The client said she is using SQL Developer to create the flat files and cannot put the field delimiter after. Maybe I ask her to modify the query to return on more column that is always empty. I don't think she can control the row terminator either.

    I've done a lot of Googling but I feel like I'm missing something. I need to import over 100 files for different tables. I'd like to keep things simple for the client. Is there a trick with the bulk insert? Missing the obvious.

    Thanks in advance.

  • If you using flat-text-files for export-import of data, you should not allow data to contain the same control character (or combination of them) used as row or column separators.

    As it will be impossible for processor to determine if this control character is part of data or plays the role of separator. Let say some one put pipe "|" in some string value, when such row of data extracted using pipe as column separator, this row will naturally have more columns then others.

    Quite often, to avoid such problem when using flat-text-files as a transport, more complicated combination of control characters is used to reduce probability of its occurrence in the data itself (eg. ~^~ for example...)

    So if your source refuse to do it, you will need to transform the text file before you can bulk-load it, but that can be very complicated and slow...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Thanks. That's about what I figured.

    I know this isn't an Oracle forum but can anyone point me in the right direction for how to help the client generate a flat file with |<CR><LF>? I guess SQL Developer's Unload function isn't that flexible.

  • rwm972 (11/30/2012)


    Hi All,

    I'm trying to bulk insert a pipe-delimited file with <CR><LF> as a row terminator. The problem is some fields may contain <CR><LF>. So, I've asked the client to provide a file where a pipe is after the last field as well so I can use |<CR><LF> as the row terminator. For example:

    Original File:

    COL_A|COL_B|COL_C<CR><LF>

    1|Here is some<CR><LF>

    some long comment text.|2011-01-01<CR><LF>

    2|Short Text|2012-01-01<CR><LF>

    3|Meh|2012-12-01<CR><LF>

    What I requested:

    COL_A|COL_B|COL_C|<CR><LF>

    1|Here is some<CR><LF>

    some long comment text.|2011-01-01|<CR><LF>

    2|Short Text|2012-01-01|<CR><LF>

    3|Meh|2012-12-01|<CR><LF>

    Here is the SQL I was going to use:

    bulk insert dbo.[MyTable] from 'MyFile' with (fieldterminator = '|',firstrow = 2,rowterminator = '|')

    The client said she is using SQL Developer to create the flat files and cannot put the field delimiter after. Maybe I ask her to modify the query to return on more column that is always empty. I don't think she can control the row terminator either.

    I've done a lot of Googling but I feel like I'm missing something. I need to import over 100 files for different tables. I'd like to keep things simple for the client. Is there a trick with the bulk insert? Missing the obvious.

    Thanks in advance.

    Based on the rest of the discussion, you may not use the |CRLF delimiter, but if you do, I think the ROWTERMINATOR needs to be '|\r' or '|\r{backslash-n}' (actual \ n text does not show up) rather than just '|'.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • Whoops. Had that in my script at one point. Thanks.

    Since importing these jobs isn't going to be a scheduled thing, I went ahead and wrote some VBScript to just append the last |. Thought I would put it here in case it's useful to you. It isn't polished with error handling and all. Faster than I would have thought.

    Option Explicit

    Dim FIELD_DELIMITER

    Dim DST_FOLDER

    Dim SRC_FOLDER

    FIELD_DELIMITER = "|"

    SRC_FOLDER = "C:\TestFiles"

    DST_FOLDER = "C:\TestFiles\Modified"

    '======================================================

    Dim oFSO

    Dim oDstWriteTS

    Dim oSrcFolder

    Dim oSrcFileList

    Dim oSrcFile

    Dim oSrcReadTS

    Dim iDataCols

    Dim iHeaderCols

    Dim iLineNo

    Dim sDataLine

    Dim sHeaderLine

    Set oFSO = CreateObject("Scripting.FileSystemObject")

    Set oSrcFolder = oFSO.GetFolder(SRC_FOLDER)

    Set oSrcFileList = oSrcFolder.Files

    For Each oSrcFile In oSrcFileList

    iLineNo = 1

    sHeaderLine = ""

    sDataLine = ""

    Set oSrcReadTS = oFSO.OpenTextFile(oSrcFile.Path, 1)

    Set oDstWriteTS = oFSO.CreateTextFile(DST_FOLDER & "\" & oSrcFile.Name)

    sHeaderLine = oSrcReadTS.Readline

    iHeaderCols = UBound(Split(sHeaderLine, FIELD_DELIMITER, -1, 1))

    oDstWriteTS.WriteLine sHeaderLine

    Do While oSrcReadTS.AtEndOfStream <> True

    iLineNo = iLineNo + 1

    sDataLine = sDataLine & oSrcReadTS.ReadLine

    iDataCols = UBound(Split(sDataLine, FIELD_DELIMITER, -1, 1))

    If iDataCols > iHeaderCols Then

    MsgBox "Line overflow (File=" & oSrcFile.Name & "; LineNo=" & CStr(iLineNo) & "; HeaderCols=" & CStr(iHeaderCols) & "; DataCols=" & CStr(iDataCols) & ")"

    Exit Do

    End If

    If iDataCols = iHeaderCols Then

    oDstWriteTS.WriteLine sDataLine & FIELD_DELIMITER

    sDataLine = ""

    End If

    Loop

    oSrcReadTS.Close

    oDstWriteTS.Close

    Next

    MsgBox "Complete"

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply