OpenRowset Bulk Import Flat files with header & Footer

  • Hi Guys,

    I have a problem to import flat files in SQL Server table. Here is the statement I use on a flat files that contains 7 rows including header & footer but the script skip the 2nd row and import only 4 rows instead of 5. the data files are row delimited with '/n' and field delimited by the pipe '|'

    the format files was created with the BCP tool. This step is part of a long workflow so I cannot do it with SSIS which is doing the import without any error

    INSERT INTO database.dbo.MyTable

    SELECT *, 'My_Flat_FileName'

    FROM OPENROWSET(BULK 'd:\Folder\My_Flat_FileName.txt',

    FORMATFILE = 'd:\folder\MYFormatFile.fmt'

    ,lastrow=5,firstrow=2

    ) as T1;

    If I'm not wrong the lastrow should be equal to 6 but when setting it to 6 I have an error message

    Msg 4832, Level 16, State 1, Line 1

    Bulk load: An unexpected end of file was encountered in the data file.

    Msg 7399, Level 16, State 1, Line 1

    The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.

    Msg 7330, Level 16, State 2, Line 1

    Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)"

    Thank you beforehand for you support

    Drari

  • The header must have exactly the same delimiters as the 2nd row which must match all the other rows, or BCP will give you that error and the 2nd row will never be imported.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • it is the case, the header has exactly the same deimiters than the others, I checked it with Notepad++

  • Only one thing left to do, then. Attach the create script for the target table, the BCP format file, and, if there's no private information in it, the file that caused the problems in the original post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have no problem to attach the script but I prefer to send the data file separately and in private

    if possible

  • Um... ok... a bit out of the ordinary...

    Click on my name above my avatar to the left and send me an email with the file attached.

    Folks, if I find something, I'll be sure to post what the problem was without compromising the data the OP is sending.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sent

  • Yep... And I looked at that file... it's just like I said... the header record and the trailer record both have a different number of columns than the body of the file. Because of that, it will try to include the second row (1st body row) as part of the header and make the error you are getting... you'll never be able to import the 2nd row because the FirstRow option assumes that all the rows are the same and they are not.

    There're several ways around this... the way I usually do it is to import the whole file into a very wide single data column table. Then I delete the header and footer row, re-export to a nonpermanent "working" file, and reimport.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • what I do not understand is when I use an oledb drive to read the file in a datase

    he gives a totalrows = 7

    Maybe there is a way to import the file by removing the rows starting with HEADER and TRAILER using an ole db data provider. Do you know much more about this?

    Otherwise to use your method do I have to do an import unsing the single_clob that's what you mean?

    Which is the fastest solution as as said I have to populate more than 5000 files with at least 10000rows by file

  • drari (8/6/2008)


    what I do not understand is when I use an oledb drive to read the file in a datase

    he gives a totalrows = 7

    Maybe there is a way to import the file by removing the rows starting with HEADER and TRAILER using an ole db data provider. Do you know much more about this?

    Otherwise to use your method do I have to do an import unsing the single_clob that's what you mean?

    Which is the fastest solution as as said I have to populate more than 5000 files with at least 10000rows by file

    There are 7 rows, just like you say. The problem is that the header and footer have a different format that the 5 data rows, just like I said.

    The import into a VARCHAR(8000) (or VARCHAR(MAX) if you prefer), deletion of header and footer, exporting, and reimporting runs fairly fast. Of course, you could just import and split the data using a Tally table. With ROWNUMBER() OVER(), it's a fairly simple task...

    I'm off to work... email me one of your 10k row example files so I can check for performance. I'll give it a whirl tonight.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I just sent you a bigger file

    I know that there are actually5 data rows + 1 header and 1 footer thus 7 rows

    But what I say is that I'm using some macros to extract the 1st row to put it in a log table

    and this one is using an odbc driver to read the file and this one without any specific description read well the 7 rows and extract the 1st rows without any difficulty

    here is the connection string used

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\';Extended Properties="text;HDR=NO;FMT=Delimited";Persist Security Info=False

    and using a simple select statement to read the file SELECT * FROM EVT_000XXXX.txt

    Of course I don't care about the command to use if there is an other command to use I will be glad to test it but it should remain fast

    thanks again for your time

  • drari (8/6/2008)


    I just sent you a bigger file

    I know that there are actually5 data rows + 1 header and 1 footer thus 7 rows

    But what I say is that I'm using some macros to extract the 1st row to put it in a log table

    and this one is using an odbc driver to read the file and this one without any specific description read well the 7 rows and extract the 1st rows without any difficulty

    here is the connection string used

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\';Extended Properties="text;HDR=NO;FMT=Delimited";Persist Security Info=False

    and using a simple select statement to read the file SELECT * FROM EVT_000XXXX.txt

    Of course I don't care about the command to use if there is an other command to use I will be glad to test it but it should remain fast

    thanks again for your time

    Macros??? What on Earth are you talking about?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm using an Application named automate is executing different macros to execute mutiliple recurrent and repetitive tasks.

    That's why I'm talking about macros

  • Jeff,

    could you guide me througth the diffrent steps you mentioned ?

    1.Import flat file in a temp table with one colonne : ok

    bulk insert ANYDB.dbo.TmpFileTable

    from 'd:\MyOriginalDataFile.txt'

    with (datafiletype='char',rowterminator ='')

    2. Delete Header and Footer : Ok

    Luckily the header and footer start with their own title so here is the script

    delete from tmpfiletable where (left(datafile,6) in ('HEADER','TRAILE') or datafile is null)

    3. Export table to a flat file:OK (now)

    Finally I did it like that

    C:\Program Files\Microsoft SQL Server\90\Tools\Binn\bcp anydb.dbo.tmpFileTable

    out d:\TmpExportFile.txt -T -c -r

    4.import new flat file in my target table : Ok

    INSERT INTO ANYDB.dbo.MyTargetTable

    SELECT *,'FileName'

    FROM OPENROWSET(BULK 'd:\tmpexportfile.txt',FORMATFILE = 'd:\MyFormatFile.fmt'

    ) as T1;

    thank you for your help again

    Drari

  • Except for the occasional mispelling, that should do it...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply