How to bulk insert rows from a text file into a WIDE TABLE that has 1,400 columns?

  • Hi All,

    I have an interesting question for you and need your help on this.

    As we all know, with the help of BCP we can easily load a file into db tables. However, my main concern here is the number of columns in the file. A text file, TEXT_1400.txt, has 1,400 columns. I am unable to load the data into my db table using the BCP or BULK INSERT commands, as a maximum of 1,024 columns is allowed per table in SQL Server 2008.

    We can still go ahead and create a 'Wide Table' (a special table that holds up to 30,000 columns; the maximum size of a wide table row is 8,019 bytes).
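
    Roughly, the wide table I created looks like the sketch below (names, types and widths are simplified placeholders); a table only becomes a wide table once a column set is defined over its SPARSE columns:

        -- Sketch only: most of the 1,400 columns are SPARSE, plus one column set
        CREATE TABLE dbo.WIDETABLE
        (
            RowId   INT NOT NULL PRIMARY KEY,
            Col0001 VARCHAR(50) SPARSE NULL,
            Col0002 VARCHAR(50) SPARSE NULL,
            -- ... repeated up to Col1400 ...
            AllSparseCols XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
        );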

    But when operating on a wide table, the BCP/BULK INSERT commands still fail. After a few hours of scratching my head over BCP and BULK INSERT, I observed that, while inserting, BCP/BULK INSERT are unable to identify the SPARSE columns and skip them, which disturbs the column mapping and results in data conversion and truncation errors.

    Is there a proper way to load this kind of file into the db table?

    I know this can be done easily using SSIS, but I am looking for a non-SSIS solution. Please help.

    Thanks,

    Sunil

    Regards,
    SMK

    --------------------------------------------------------------------------
    "If you’re not prepared to be wrong, you’ll never come up with anything original."

  • Have you tried using a format file with the BCP Utility?
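
    You can have bcp generate one against the target table and then edit it; something along these lines (the server, database and delimiter are just examples):

        bcp YourDb.dbo.WIDETABLE format nul -c -t"|" -f WIDETABLE.fmt -S YourServer -T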

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • You probably want to specify the maximum number of errors, an error file, the batch size, and the last row that you want to process, so that you are not scratching your head and waiting for hours before you get an indication of what might have gone wrong.
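
    For example, something like the following (the path, delimiter, limits and sizes are only illustrative) fails fast and writes the rejected rows to an error file:

        BULK INSERT dbo.WIDETABLE
        FROM 'C:\data\TEXT_1400.txt'
        WITH (
            FIELDTERMINATOR = '|',
            ROWTERMINATOR   = '\n',
            LASTROW         = 1000,                    -- only process the first 1,000 rows while testing
            BATCHSIZE       = 500,
            MAXERRORS       = 50,
            ERRORFILE       = 'C:\data\TEXT_1400.err'  -- rejected rows and their errors land here
        );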

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Can you post the table schema that you are loading into, and also upload the text file with sample data? That would be helpful; then someone can actually test it and be able to help you out.

  • If the file is in fixed format (not delimited between columns), you could load it into a table with a single column, and then create a view on that table to separate out the individual columns to load into your final table.
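
    A rough sketch of that idea (the table, view and column names, widths and file path are all made up):

        -- Stage each line of the file as a single value
        CREATE TABLE dbo.Staging_Raw (RawLine VARCHAR(MAX) NULL);
        GO
        BULK INSERT dbo.Staging_Raw
        FROM 'C:\data\TEXT_1400.txt'
        WITH (ROWTERMINATOR = '\n', FIELDTERMINATOR = '\0');  -- a field terminator that never occurs in the data
        GO
        -- Carve the fixed-width line into individual columns
        CREATE VIEW dbo.v_Staging_Split AS
        SELECT SUBSTRING(RawLine,  1, 10) AS Col0001,
               SUBSTRING(RawLine, 11, 10) AS Col0002
               -- ... one SUBSTRING per column ...
        FROM dbo.Staging_Raw;
        GO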

  • You can use SSIS to accomplish this.

    Use a Flat File Source (Delimited) and set the column size to, say, 5000.

    Then use a Derived Column task to create columns on a fixed-width basis from the output of your source file task. You could then easily map your columns between source and destination.

    Thanks..

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • I guess I overlooked your criterion of a non-SSIS-based solution :hehe:

    The_SQL_DBA
    MCTS

    "Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives."

  • Welsh Corgi (2/3/2010)


    Have you tried using a format file with the BCP Utility?

    I believe that Welsh's idea above could be one of the simplest and fastest ways. If the data is mostly well formed and consistent in its delimiters, you could "ignore" columns to load each of the tables you want to load.

    The Format File that works with BCP will also work quite nicely with BULK INSERT.
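
    A fragment of a non-XML format file that skips a field looks something like this (the widths, names and collation are only placeholders); setting the server column order to 0 tells bcp/BULK INSERT to throw that field away, and the same file plugs straight into BULK INSERT:

        10.0
        4
        1  SQLCHAR  0  50  "|"     1  Col0001  SQL_Latin1_General_CP1_CI_AS
        2  SQLCHAR  0  50  "|"     0  Skipped  SQL_Latin1_General_CP1_CI_AS
        3  SQLCHAR  0  50  "|"     2  Col0003  SQL_Latin1_General_CP1_CI_AS
        4  SQLCHAR  0  50  "\r\n"  3  Col0004  SQL_Latin1_General_CP1_CI_AS

        BULK INSERT dbo.TargetTable
        FROM 'C:\data\TEXT_1400.txt'
        WITH (FORMATFILE = 'C:\data\TargetTable.fmt');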

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thank you very much for all your replies. This morning I am glad to see so many replies to my post. Thanks Guys!

    @Welsh – I tried using a format file; the problem here is that the format file ignores all SPARSE columns in WIDETABLE (this is the final table with 1,400 columns). I also tried adding the ignored columns to the format file manually, but I still couldn't load the file.

    @The_SQL_DBA – I am thinking of switching back to SSIS to load this file into the database table (wide table). But one concern here is the row length: what if it exceeds 8 KB? SSIS fails to insert such rows.

    @michael-2 Valentine Jones – I do have a delimiter, and it is "|". I can still load each line into a single column, treating it as if there were no delimiter; however, extracting the data afterwards will slow down my process.

    @vstitte – I can't do it right away, because it's hard to generate test data.

    After consulting with colleagues here, we settled on splitting this text file and loading the data into multiple tables, linked by key columns (roughly as sketched below). Any suggestions?
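
    A rough sketch of the split (table and column names are placeholders):

        -- Plan sketch: spread the 1,400 columns across linked tables
        CREATE TABLE dbo.Part1
        (
            RowId   INT NOT NULL PRIMARY KEY,
            Col0001 VARCHAR(50) NULL
            -- ... roughly the first half of the columns ...
        );

        CREATE TABLE dbo.Part2
        (
            RowId   INT NOT NULL PRIMARY KEY
                    REFERENCES dbo.Part1 (RowId),
            Col0701 VARCHAR(50) NULL
            -- ... the remaining columns ...
        );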

    Thanks for your time guys!

    Sunil

    Regards,
    SMK

    --------------------------------------------------------------------------
    "If you’re not prepared to be wrong, you’ll never come up with anything original."

  • Below is a variation on the answer I posted in another forum, so that people who find this thread also have it:


    I believe I understand the problem. In order to have a wide table with more than 1,024 columns, you had to create a column set. Because of this, you'll notice that SELECT * does not return each of your 1,400 columns. Instead, it returns your non-sparse columns plus the column set column, which contains XML describing the names and values of the non-null sparse columns in that row.
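
    As a quick illustration, using made-up names:

        -- With a column set defined, SELECT * returns the column set instead of the sparse columns
        SELECT * FROM dbo.WIDETABLE;                  -- RowId, AllSparseCols (an XML fragment)
        SELECT RowId, Col0005 FROM dbo.WIDETABLE;     -- individual sparse columns must be named explicitly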

    When you use bcp or bulk operators, they seem to work over the same columns returned by SELECT *, meaning they expect only the non-sparse columns plus the column set XML column. If you output a format file with bcp's -f option, you'll see that this is the case. From my quick tests, it seems that even modifying the format file to manually specify individual sparse columns does not work; bcp doesn't recognize any sparse columns, only the column set. If your data file contains data for each individual sparse column, this explains why it is not working properly. Note that it's the column set that causes this; a table with sparse columns, fewer than 1024 total columns, and no column set should work fine.

    Off the top of my head, I see two workarounds. The first is to do what it seems you settled on: load the columns into two separate tables (i.e., vertical partitioning). You can then link them with foreign keys, or use an INSERT...SELECT to populate the wide table. The second workaround is to write a script that transforms your existing data file into one where the sparse columns are collected into an XML value for the column set column. When you load that, the column set should automatically shred it into the proper columns.
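
    For the second workaround, each transformed row would carry its sparse values as a single XML fragment in the column set's position. Purely as an illustration (names and values are made up), the column set expects something like the following, and SQL Server shreds it back into the individual sparse columns:

        INSERT INTO dbo.WIDETABLE (RowId, AllSparseCols)
        VALUES (1, '<Col0005>abc</Col0005><Col0937>42</Col0937>');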

    As a final note, this may very well be a bug, or at least a design choice that we can examine. I've filed an internal work item so that the group owning this can take a look at it. So, many thanks for the feedback!


    Pedro DeRose
    Program Manager, Microsoft SQL Server

    (Legalese: This posting is provided "AS IS" with no warranties, and confers no rights. Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm.)
