Importing a fixed-format file

  • As part of my job I'm having to import fixed-format files more and more often.  For all the comma-delimited files I've been using bcp to get the data in without having to go through the import wizard.

    How do I import a fixed-format file without using the wizard?

  • You can still use bcp - either with a format file that describes the fixed-format layout, or you can bcp the file into a one-column staging table and, as a next step, substring each single-column row out into its respective columns.

    There is no automagic way, however. You're required to know the format of the file beforehand.
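
    For the staging table route, the load itself can be as simple as this (the database, path and server names are just placeholders):

    rem the target is a single varchar column wide enough for one whole record
    bcp MyDb.dbo.Staging_Import in C:\data\input.txt -c -r "\r\n" -T -S MyServer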

    /Kenneth

  • Cheers for that... can someone give me some advice on how to utilise a format file?  I read up on bcp on the internet but couldn't find anything that explained how to use format files with it.  Hence why I came here.

  • BOL (Books Online) is your #1 stop for that info.

    Though format files can be handy, they can also be a pita to get working, so keep the one-column staging table as an option as well. Some problems are simply easier to fix in Transact SQL than on the command line.
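
    BULK INSERT is the Transact SQL way of doing the same single-column load, for example (table name and path are placeholders):

    BULK INSERT Staging_Import
    FROM 'C:\data\input.txt'
    WITH (ROWTERMINATOR = '\r\n')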

    /Kenneth

  • Have you considered DTS? It does a nice job with CSV imports.

    Darrell

  • It is much more efficient to use a format file and bulk insert your data into individual columns as opposed to bulk inserting into a single column that is the same width as your file record.  Otherwise, you have to substring your way through the single column.  You can also bulk insert into a temp table that consists of only varchar columns, then distribute it out amongst your base tables.  If you post the format of your input file, I can give you a start on the format file.  Or, BOL has a lot of useful stuff.
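
    As a sketch of that all-varchar route (the table and column names here are made up for illustration):

    CREATE TABLE #stage (CustId varchar(10), OrderDate varchar(8), Amount varchar(12))

    BULK INSERT #stage FROM 'C:\data\orders.txt' WITH (FORMATFILE = 'C:\data\orders.fmt')

    -- distribute out to the base table with explicit conversions
    INSERT INTO dbo.Orders (CustId, OrderDate, Amount)
    SELECT CONVERT(int, CustId), CONVERT(datetime, OrderDate), CONVERT(money, Amount)
    FROM #stage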

    There is no "i" in team, but idiot has two.
  • Well, it depends.

    The difference isn't that great, and don't forget that the sole purpose of bcp is to get the contents of the file into a table with as little hassle as possible. Sometimes, again depending on the layout and contents of the files, it is more efficient to do the single-column -> substring split instead, simply because a single column is easier to load. Anything in the datafile that doesn't conform to the format file will cause bcp to error out, and then you need to spend time investigating why the file didn't load. In the long run that may be less efficient than just loading the data and, if need be, investigating the file in 'table form'.

    But, it depends.
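
    If you do go the format file route, bcp's -m and -e switches at least take some of the sting out of it - a sketch (the names are placeholders):

    rem accept up to 50 bad rows, and write the rejects to an error file for later inspection
    bcp MyDb.dbo.Resultset in C:\data\input.txt -f fixed.fmt -m 50 -e C:\data\rejects.err -T -S MyServer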

    /Kenneth

  • Hi,

    Here is the file format of one of the files that I have to import:

        #   FIELD      TYPE        WIDTH
        1   FORENAME   Character      20
        2   SURNAME    Character      20
        3   ADDR1      Character      30
        4   ADDR2      Character      30
        5   ADDR3      Character      30
        6   TOWN       Character      30
        7   COUNTY     Character      30
        8   POSTCODE   Character       8
        9   URN        Character      10
        -   CR/LF      Character       2
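
    For reference, a non-XML bcp format file for that layout would look something like this (a sketch - the server column names assume a target table with matching columns, and the collation is just an example):

    8.0
    9
    1   SQLCHAR   0   20   ""       1   FORENAME   SQL_Latin1_General_CP1_CI_AS
    2   SQLCHAR   0   20   ""       2   SURNAME    SQL_Latin1_General_CP1_CI_AS
    3   SQLCHAR   0   30   ""       3   ADDR1      SQL_Latin1_General_CP1_CI_AS
    4   SQLCHAR   0   30   ""       4   ADDR2      SQL_Latin1_General_CP1_CI_AS
    5   SQLCHAR   0   30   ""       5   ADDR3      SQL_Latin1_General_CP1_CI_AS
    6   SQLCHAR   0   30   ""       6   TOWN       SQL_Latin1_General_CP1_CI_AS
    7   SQLCHAR   0   30   ""       7   COUNTY     SQL_Latin1_General_CP1_CI_AS
    8   SQLCHAR   0    8   ""       8   POSTCODE   SQL_Latin1_General_CP1_CI_AS
    9   SQLCHAR   0   10   "\r\n"   9   URN        SQL_Latin1_General_CP1_CI_AS

    Every field has prefix length 0, a fixed data length and an empty terminator, except the last, which ends at the CR/LF. It is fed to bcp with the -f switch, e.g. bcp MyDb.dbo.Resultset in C:\data\input.txt -f fixed.fmt -T -S MyServer.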

  • Question,

    will the format file always contain all those spaces? If so, then what addict is saying is not only very true, but right up your solution's alley.

    My bad, misread the post.

  • Hope this helps:

    Create Table [Staging_Import] (
        iColumn varchar(300)   -- wide enough for one whole 208-character record
    )

    Create Table [Resultset] (
        FORENAME varchar(20),
        SURNAME varchar(20),
        ADDR1 varchar(30),
        ADDR2 varchar(30),
        ADDR3 varchar(30),
        TOWN varchar(30),
        COUNTY varchar(30),
        POSTCODE varchar(8),
        URN varchar(10)
    )

    -- two test rows standing in for bcp'd data
    INSERT INTO Staging_Import (iColumn)
    VALUES ('1231231231231231231112312312312312312311123123123123123123111231231231123123123123123123111231231231123123123123123123111231231231123123123123123123111231231231123123123123123123111231231231123456780123456789')

    INSERT INTO Staging_Import (iColumn)
    VALUES ('abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz0123456789abcdefghijklmnopqrstuvwxyz01')

    -- Insert Into Resultset
    Select
        iColumn as [SourceData],
        Len(iColumn) as [Datalength],
        SubString(iColumn, 1, 20) as FORENAME,
        SubString(iColumn, 21, 20) as SURNAME,
        SubString(iColumn, 41, 30) as ADDR1,
        SubString(iColumn, 71, 30) as ADDR2,
        SubString(iColumn, 101, 30) as ADDR3,
        SubString(iColumn, 131, 30) as TOWN,
        SubString(iColumn, 161, 30) as COUNTY,
        SubString(iColumn, 191, 8) as POSTCODE,
        SubString(iColumn, 199, 10) as URN   -- URN starts at 199 (191 + 8), not 200
    From Staging_Import
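
    One thing to watch when you switch the Insert Into Resultset on: fixed-width fields arrive padded with trailing spaces, so you will probably want to drop the two diagnostic columns and trim as you go - a sketch:

    Insert Into Resultset (FORENAME, SURNAME, ADDR1, ADDR2, ADDR3, TOWN, COUNTY, POSTCODE, URN)
    Select
        RTrim(SubString(iColumn, 1, 20)),
        RTrim(SubString(iColumn, 21, 20)),
        RTrim(SubString(iColumn, 41, 30)),
        RTrim(SubString(iColumn, 71, 30)),
        RTrim(SubString(iColumn, 101, 30)),
        RTrim(SubString(iColumn, 131, 30)),
        RTrim(SubString(iColumn, 161, 30)),
        RTrim(SubString(iColumn, 191, 8)),
        RTrim(SubString(iColumn, 199, 10))
    From Staging_Import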

  • Kenneth:  It does indeed depend.

    How big is the file?  How often is the bcp run?  For a few hundred rows at a time, once a day, bringing in the entire row and substringing your way through it to distribute the data to base tables is fine.  I am used to dealing with bigger files (damned mainframes) more often.  Maybe it's just me, but usually when bcp balks at a file, it is easier for me to debug what went wrong by looking at the staging table.

    Your mileage may vary.  Professional driver on a closed course.  Always wear your seatbelt.

    There is no "i" in team, but idiot has two.
