March 11, 2009 at 11:36 am
I do bulk insert from CURRENCY.TXT
It's a TAB delimited file with CR+LF row delimiter
---------------------------------------------
CURRENCY DIMENSION;|--RUN NUM: 00088;PROC_MONTH: 02;PROC_YEAR: 2009
11111111111111...
22222222222222...
to a table.
The first row is a HEADER and I need to skip it.
So I use this code:
BULK INSERT stg.SRC_CURRENCY FROM 'F:\MonthEnd\CURRENCY.TXT' WITH
(
FORMATFILE = 'F:\MonthEnd\CURRENCY.fmt',
FIRSTROW = 2)
Here is my format file:
9.0
19
1 SQLCHAR 0 255 "\t" 1 CCY_CD
2 SQLCHAR 0 255 "\t" 2 CCY_NAME
3 SQLCHAR 0 255 "\t" 3 CTRY_2_CD
4 SQLCHAR 0 255 "\t" 4 CTRY_NAME
5 SQLCHAR 0 255 "\t" 5 FOR_EXCH_RATE
6 SQLCHAR 0 255 "\t" 6 MTH_START_DT
7 SQLCHAR 0 255 "\t" 7 MTH_END_DT
8 SQLCHAR 0 255 "\t" 8 BOM_PRIME_RATE
9 SQLCHAR 0 255 "\t" 9 EOM_PRIME_RATE
10 SQLCHAR 0 255 "\t" 10 BOM_NY_BASE_RATE
11 SQLCHAR 0 255 "\t" 11 EOM_NY_BASE_RATE
12 SQLCHAR 0 255 "\t" 12 TARGET_ROE_RATE
13 SQLCHAR 0 255 "\t" 13 FTP_RATE
14 SQLCHAR 0 255 "\t" 14 AVG_DAILY_PRIME_RATE
15 SQLCHAR 0 255 "\t" 15 AVG_DAILY_NY_BASE_RATE
16 SQLCHAR 0 255 "\t" 16 GIC_NET_SPREAD_1_RATE
17 SQLCHAR 0 255 "\t" 17 GIC_NET_SPREAD_2_RATE
18 SQLCHAR 0 255 "\t" 18 GIC_NET_SPREAD_3_RATE
19 SQLCHAR 0 255 "\r" 19 AVG_3_MTH_TBILL_RATE
last row should be "rn" with "\" between.
It removes it when I post it.
THe results are wrong.
I'm supposed to get 6 rows but Row2 is skipped for some reason.
I only get 5 rows.
See my source file attached (CURR.TXT)
So CURR.TXT has 6 records (excluding HEADER):
CAD
EUR
MXN
OTH
USD
XXX
Bulk insert creates only 5 records:
EUR
MXN
OTH
USD
XXX
So I do a test to see if Row1 was mistakenly joined with Row1.
I run:
BULK INSERT stg.SRC_CURRENCY FROM 'F:\MonthEnd\CURRENCY.TXT' WITH
(
FORMATFILE = 'F:\MonthEnd\CURRENCY.fmt',
FIRSTROW = 1)
And what I suspected was true.
Row2 joins with Row1.
Here is an interesting thing.
Whenever the first row has less than 19 columns
the results get screwed up. I realize that my format file
expects 19 columns but because I want to completely skip the first row
I thought Bulk Insert would find {CR}{LF} after the first row and start
reading from the second row which has 19 columns.
But Bulk Insert does not properly separates first and second row.
Let's say I have only 18 columns in the first row.
It causes the 18th column of row1 to concatenate with the 1st column of row2
and the rest of row2 goes into Row1.Col19
----------------------------------------------------------------------------
Col18Col19
h 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
2 2
Why Bulk Insert does not want to see {CR}{LF} at the end of Row1
(even if it's not after Col19)
and treat the following data as Row2?
Maybe HEADER should also have 19 columns?
Regardless of the fact that we want to skip it?
By the way SSIS can parse it properly if you set Header rows to skip = 1.
If you leave it 0 then the same problem occurs.
Row1.Col1 concatenates with Row2.Col1
March 11, 2009 at 5:25 pm
Hi
Not yet an answer to your question...
Could you post your table definition DDL. Then we can investigate the problem.
Greets
Flo
March 11, 2009 at 6:15 pm
[font="Verdana"]So far as I can tell, bulk insert (and bcp) expect all of the records in the file to match the format, irrespective of whether they are actually loaded. So because your header record is a different format, the parser ends up joining the two lines to fill in the necessary fields for the first line.
I ran into this situation myself recently.
SSIS is a little more flexible.[/font]
March 12, 2009 at 6:56 am
So Bruce,
Does it mean if the header is a different format
then you can't use Bulk Insert (WITH FIRSTROW=2) ?
It won't parse the file correctly?
SSIS is the only solution in your opinion?
March 12, 2009 at 1:17 pm
[font="Verdana"]Well, I haven't been able to make it work. If you can, let me know! :D[/font]
March 13, 2009 at 8:44 am
Guess what Bruce?
Boss said we don't have time to troubleshoot this.
They decided to remove headers from all five files
and it worked. We're in a hurry. We might not meet the deadlines for this project.
Top PM is furious already.
I will come back to this problem in the future
because it's a fundamental thing.
It's whether you can completely apply Bulk Insert techniques for all kinds of formats
or you have always to pay attention to the files with headers.
I hope somebody will contribute helpful information to this thread.
Thanks anyway guys!
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply