BULK INSERT - Overly large table

  • Hi All

    I have been tearing my hair out with this one - please help.

    We are bulk inserting data, and what we are seeing is that after the insert the table is about 10 times larger than it should be.

    We are using ADO 2.6 and create a .dat file for each case. The eventual size of our DB is approx. 200 GB.

    strSQL = "BULK INSERT TableName FROM '" & _
             strFileFullName & "'" & _
             "    WITH (" & _
             "       CODEPAGE = 'ACP'," & _
             "       DATAFILETYPE = 'char'," & _
             "       FIELDTERMINATOR = '|'," & _
             "       MAXERRORS = 0," & _
             "       ROWTERMINATOR = '\n'" & _
             "       )"

    Database.Execute strSQL, dbSQLPassThrough
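
    For what it's worth, with an example path like C:\data\case0001.dat, the statement that ends up being executed is along these lines:

        BULK INSERT TableName FROM 'C:\data\case0001.dat'
            WITH (
                CODEPAGE = 'ACP',
                DATAFILETYPE = 'char',
                FIELDTERMINATOR = '|',
                MAXERRORS = 0,
                ROWTERMINATOR = '\n'
            )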

    Any help appreciated.

  • Is the table 10 times larger, or is the database 10 times larger? 

    If you meant that the database is 10 times larger, I would then suspect that your database is configured to autogrow by a percentage rather than a fixed increment. As always, there is a tradeoff with any configuration setting.  You can either shrink the database in between runs, or reconfigure the database setting.
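
    If you want to check, sp_helpfile will show you how each file is set to grow, and you can switch to a fixed increment with ALTER DATABASE. Untested sketch, and MyDB / MyDB_Data are just placeholder names for your database and data file:

        USE MyDB
        GO
        -- Show current size and growth for each file ('10%' means percentage growth)
        EXEC sp_helpfile
        GO
        -- Switch the data file from percentage growth to a fixed 100 MB increment
        ALTER DATABASE MyDB
        MODIFY FILE (NAME = MyDB_Data, FILEGROWTH = 100MB)
        GO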

    If the table is truly 10 times larger than expected, check that your data is what you think it is.  You may have additional unintended rows of data.
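
    A quick way to sanity-check that is to compare the row count and the space the table is actually using against what your .dat files should have produced, something like:

        -- Rows, reserved, data, index_size and unused for the table in one call
        EXEC sp_spaceused 'TableName'
        -- Raw row count, to compare against the number of lines in the files
        SELECT COUNT(*) FROM TableName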

    Quick thoughts without sufficient caffeine this morning.  I'll shut up now.

    Have a great day!

    Wayne

  • Do you have a clustered index on your table? It could be that if your data isn't sorted in the same order as the clustered index key, a lot of page splits are occurring, with the free space not being used by subsequent row inserts. I wouldn't have thought this would lead to a 10x increase, but it could be a factor.

    What happens if you shrink the database after the insert? Or rebuild the clustered index and then shrink?
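
    One way to test that theory is to rebuild the clustered index and then shrink, which should reclaim any space the splits left behind. Rough sketch, with TableName and MyDB as placeholders:

        -- Rebuild all indexes on the table (empty string = every index), fill factor 90
        DBCC DBREINDEX ('TableName', '', 90)
        GO
        -- Then shrink, leaving roughly 10% free space in the database
        DBCC SHRINKDATABASE (MyDB, 10)
        GO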

  • No - I went through all that before and the table size is not affected greatly. The reason I know something is strange is that I can export the data from the table (the test db is only 3 GB and the table is about 90-95% of that), and when I import the data into a base database the size is reduced to 400 MB.

    From what I can see I am doing the same operation with the same dataset, one via ADO and the other via Enterprise Manager, and getting two completely different table sizes for the exact same schema.

  • Could you be clearer when you say table size - do you mean amount of storage space, or rows in the table?

    If storage space, I have found that even bulk inserts cause (temporary) 'free' space to grow - even when the database is set to simple mode and is in theory not logging anything.

    I'm assuming this is your problem, and about all you can do is shrink the DB in between loads. You can speed this up, if it's the log file that's growing, by detaching the database, deleting the log file, and reattaching the database, but you'll still need to shrink the .mdf file (using DBCC SHRINKFILE (1, 0)).
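
    For the shrink itself, something along these lines should do it (file id 1 is normally the primary .mdf and 2 the log, but check with sp_helpfile first; MyDB is a placeholder):

        USE MyDB
        GO
        -- Shrink the primary data file (file id 1) down as far as it will go
        DBCC SHRINKFILE (1, 0)
        GO
        -- If it's the log file that grew, shrink file id 2 instead
        DBCC SHRINKFILE (2, 0)
        GO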

    If rows - you have extra garbage and need to examine your input file for control characters etc. Get a good text editor and have a look (uedit is good for this, but there are lots out there that are free or shareware).

     

    Chuck


    Thanks, and don't forget to Chuckle

  • Chuck

    Sorry - yes, it is the table size, not the number of rows. One thing I will say is that when I try to shrink the database just using Enterprise Manager, the db doesn't shrink. If I take the data from my table and copy it into a temp table first, I don't get the padding that is happening.

    I have noticed that I can import the data without increasing the size of the table above what it should be. Any ideas on database settings to stop this happening?

    Cheers
