November 30, 2012 at 8:17 am
Hi All,
I'm trying to bulk insert a pipe-delimited file with <CR><LF> as a row terminator. The problem is some fields may contain <CR><LF>. So, I've asked the client to provide a file where a pipe is after the last field as well so I can use |<CR><LF> as the row terminator. For example:
Original File:
COL_A|COL_B|COL_C<CR><LF>
1|Here is some<CR><LF>
some long comment text.|2011-01-01<CR><LF>
2|Short Text|2012-01-01<CR><LF>
3|Meh|2012-12-01<CR><LF>
What I requested:
COL_A|COL_B|COL_C|<CR><LF>
1|Here is some<CR><LF>
some long comment text.|2011-01-01|<CR><LF>
2|Short Text|2012-01-01|<CR><LF>
3|Meh|2012-12-01|<CR><LF>
Here is the SQL I was going to use:
bulk insert dbo.[MyTable] from 'MyFile' with (fieldterminator = '|',firstrow = 2,rowterminator = '|')
The client said she is using SQL Developer to create the flat files and cannot put the field delimiter after. Maybe I ask her to modify the query to return on more column that is always empty. I don't think she can control the row terminator either.
I've done a lot of Googling but I feel like I'm missing something. I need to import over 100 files for different tables. I'd like to keep things simple for the client. Is there a trick with the bulk insert? Missing the obvious.
Thanks in advance.
November 30, 2012 at 9:27 am
If you using flat-text-files for export-import of data, you should not allow data to contain the same control character (or combination of them) used as row or column separators.
As it will be impossible for processor to determine if this control character is part of data or plays the role of separator. Let say some one put pipe "|" in some string value, when such row of data extracted using pipe as column separator, this row will naturally have more columns then others.
Quite often, to avoid such problem when using flat-text-files as a transport, more complicated combination of control characters is used to reduce probability of its occurrence in the data itself (eg. ~^~ for example...)
So if your source refuse to do it, you will need to transform the text file before you can bulk-load it, but that can be very complicated and slow...
November 30, 2012 at 10:18 am
Thanks. That's about what I figured.
I know this isn't an Oracle forum but can anyone point me in the right direction for how to help the client generate a flat file with |<CR><LF>? I guess SQL Developer's Unload function isn't that flexible.
November 30, 2012 at 12:27 pm
rwm972 (11/30/2012)
Hi All,I'm trying to bulk insert a pipe-delimited file with <CR><LF> as a row terminator. The problem is some fields may contain <CR><LF>. So, I've asked the client to provide a file where a pipe is after the last field as well so I can use |<CR><LF> as the row terminator. For example:
Original File:
COL_A|COL_B|COL_C<CR><LF>
1|Here is some<CR><LF>
some long comment text.|2011-01-01<CR><LF>
2|Short Text|2012-01-01<CR><LF>
3|Meh|2012-12-01<CR><LF>
What I requested:
COL_A|COL_B|COL_C|<CR><LF>
1|Here is some<CR><LF>
some long comment text.|2011-01-01|<CR><LF>
2|Short Text|2012-01-01|<CR><LF>
3|Meh|2012-12-01|<CR><LF>
Here is the SQL I was going to use:
bulk insert dbo.[MyTable] from 'MyFile' with (fieldterminator = '|',firstrow = 2,rowterminator = '|')
The client said she is using SQL Developer to create the flat files and cannot put the field delimiter after. Maybe I ask her to modify the query to return on more column that is always empty. I don't think she can control the row terminator either.
I've done a lot of Googling but I feel like I'm missing something. I need to import over 100 files for different tables. I'd like to keep things simple for the client. Is there a trick with the bulk insert? Missing the obvious.
Thanks in advance.
Based on the rest of the discussion, you may not use the |CRLF delimiter, but if you do, I think the ROWTERMINATOR needs to be '|\r' or '|\r{backslash-n}' (actual \ n text does not show up) rather than just '|'.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
November 30, 2012 at 2:58 pm
Whoops. Had that in my script at one point. Thanks.
Since importing these jobs isn't going to be a scheduled thing, I went ahead and wrote some VBScript to just append the last |. Thought I would put it here in case it's useful to you. It isn't polished with error handling and all. Faster than I would have thought.
Option Explicit
Dim FIELD_DELIMITER
Dim DST_FOLDER
Dim SRC_FOLDER
FIELD_DELIMITER = "|"
SRC_FOLDER = "C:\TestFiles"
DST_FOLDER = "C:\TestFiles\Modified"
'======================================================
Dim oFSO
Dim oDstWriteTS
Dim oSrcFolder
Dim oSrcFileList
Dim oSrcFile
Dim oSrcReadTS
Dim iDataCols
Dim iHeaderCols
Dim iLineNo
Dim sDataLine
Dim sHeaderLine
Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oSrcFolder = oFSO.GetFolder(SRC_FOLDER)
Set oSrcFileList = oSrcFolder.Files
For Each oSrcFile In oSrcFileList
iLineNo = 1
sHeaderLine = ""
sDataLine = ""
Set oSrcReadTS = oFSO.OpenTextFile(oSrcFile.Path, 1)
Set oDstWriteTS = oFSO.CreateTextFile(DST_FOLDER & "\" & oSrcFile.Name)
sHeaderLine = oSrcReadTS.Readline
iHeaderCols = UBound(Split(sHeaderLine, FIELD_DELIMITER, -1, 1))
oDstWriteTS.WriteLine sHeaderLine
Do While oSrcReadTS.AtEndOfStream <> True
iLineNo = iLineNo + 1
sDataLine = sDataLine & oSrcReadTS.ReadLine
iDataCols = UBound(Split(sDataLine, FIELD_DELIMITER, -1, 1))
If iDataCols > iHeaderCols Then
MsgBox "Line overflow (File=" & oSrcFile.Name & "; LineNo=" & CStr(iLineNo) & "; HeaderCols=" & CStr(iHeaderCols) & "; DataCols=" & CStr(iDataCols) & ")"
Exit Do
End If
If iDataCols = iHeaderCols Then
oDstWriteTS.WriteLine sDataLine & FIELD_DELIMITER
sDataLine = ""
End If
Loop
oSrcReadTS.Close
oDstWriteTS.Close
Next
MsgBox "Complete"
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply