Performance problems with bulk insert command

  • We have performance problems with bulk insert. Where should I start troubleshooting this problem?

    The only thing that comes to mind is to check for page splits, but I could not find any trace event for them. So I checked fragmentation for a particular table, and it was relatively low. What else should I check?

    Thanks.

  • What recovery model is the database in? If it's full, set it to bulk-logged for the bulk insert operation. Are there any unused indexes on the table? Get rid of them. Page splits can be monitored via a PerfMon trace using the SQL Server counters (SQLServer:Access Methods\Page Splits/sec).
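    A minimal sketch of checking and switching the recovery model (YourDb and the backup path are placeholders; BULK_LOGGED keeps the log chain intact as long as log backups continue, but you lose point-in-time restore for any log backup interval that contains minimally logged operations):

      -- Check the current recovery model
      SELECT name, recovery_model_desc
      FROM sys.databases
      WHERE name = N'YourDb';

      -- Switch to bulk-logged for the load window, then back to full
      ALTER DATABASE YourDb SET RECOVERY BULK_LOGGED;
      -- ... run the bulk insert here ...
      ALTER DATABASE YourDb SET RECOVERY FULL;

      -- Take a log backup afterwards so point-in-time restore resumes from here
      BACKUP LOG YourDb TO DISK = N'C:\Backups\YourDb_log.trn';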

  • +1 to cunningham.

    Also, what exactly is the performance problem? Is it slower than it was before? By what metric? Or is it just that users are complaining?

  • cunningham (4/29/2015)

    What recovery model is the database in? If it's full, set it to bulk-logged for the bulk insert operation. Are there any unused indexes on the table? Get rid of them. Page splits can be monitored via a PerfMon trace using the SQL Server counters (SQLServer:Access Methods\Page Splits/sec).

    Thanks for reply.

    The recovery model is full, of course; this is production. But we have log backups every 15 minutes. I will try testing with BULK_LOGGED, probably on a test server first.

    This table has 189 columns, 10 indexes (including the clustered index), and 53 million records. It is partitioned. And according to sys.dm_db_index_usage_stats, it has 3 unused indexes. I will test dropping them, again on a test server for now (a sketch of the usage check follows this post).

    I tried to collect page split info by periodically querying sys.dm_os_performance_counters while running my inserts. Yes, there are lots of splits, but our system is large and runs many concurrent processes (we average 500 concurrent connections). How can I attribute these splits to my particular table? (See the Extended Events sketch below.)
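    For the unused-index check, something along these lines (a sketch; dbo.YourBigTable is a placeholder, and bear in mind the DMV resets on every instance restart, so judge "unused" over a representative uptime window):

      -- Indexes with no reads but ongoing write overhead are drop candidates
      SELECT i.name AS index_name,
             us.user_seeks, us.user_scans, us.user_lookups,
             us.user_updates            -- maintenance cost paid on every insert
      FROM sys.indexes AS i
      LEFT JOIN sys.dm_db_index_usage_stats AS us
             ON us.object_id = i.object_id
            AND us.index_id = i.index_id
            AND us.database_id = DB_ID()
      WHERE i.object_id = OBJECT_ID(N'dbo.YourBigTable')
        AND i.index_id > 1              -- skip the clustered index
      ORDER BY us.user_updates DESC;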
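    On attributing page splits to one table: the PerfMon counter is instance-wide, but on SQL Server 2012 and later an Extended Events session on the transaction log can catch the "bad" (mid-page) splits and bucket them per allocation unit, which maps back to an object. A sketch, assuming the session name is a placeholder and you substitute your own database_id:

      CREATE EVENT SESSION TrackPageSplits ON SERVER
      ADD EVENT sqlserver.transaction_log (
          WHERE operation = 11          -- LOP_DELETE_SPLIT = a mid-page split
            AND database_id = 7         -- replace with DB_ID('YourDb')
      )
      ADD TARGET package0.histogram (
          SET filtering_event_name = 'sqlserver.transaction_log',
              source_type = 0,          -- bucket on an event column
              source = 'alloc_unit_id'  -- split counts per allocation unit
      );

      ALTER EVENT SESSION TrackPageSplits ON SERVER STATE = START;

    Afterwards, join the captured alloc_unit_id values through sys.allocation_units and sys.partitions to get the object and index names.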

  • Steve Jones - SSC Editor (4/29/2015)

    +1 to cunningham.

    Also, what exactly is the performance problem? Is it slower than it was before? By what metric? Or is it just that users are complaining?

    The problem is that while it inserts new records, other processes, such as selects for various reports and user-facing displays, are blocked. Of course, we cannot completely eliminate blocking; it's normal behavior in a concurrent system. If the blocking is short, it's not a problem and is almost unnoticeable to the selecting sessions. But when the block duration is long, our select procedures hit their timeout (I am not sure of the value, maybe 1 or 2 minutes), and this is the real problem and the source of complaints.
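    To see exactly who is blocking whom while the load runs, a quick sketch against the DMVs (run it during the insert window):

      -- Requests that are currently blocked, and by which session
      SELECT r.session_id,
             r.blocking_session_id,
             r.wait_type,
             r.wait_time AS wait_time_ms,
             t.text      AS running_sql
      FROM sys.dm_exec_requests AS r
      CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
      WHERE r.blocking_session_id <> 0;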

  • Aren't you using a staging table to prevent problems?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (4/29/2015)

    Aren't you using a staging table to prevent problems?

    Actually, we are using a staging server that sits between our vendors and us, but under our control.

    But in general, how would a staging table prevent this blocking? We would need to insert into the staging table first and then select from there into our destination table anyway. Same records.

  • It depends on what you're inserting. Does all the information need to be available at the same time? You could do the inserts in batches to keep each blocking episode short.

    If the indexes work correctly, you shouldn't need to block the entire table, and other users would still be able to read from it. I can't guarantee this, because it depends on the type of queries you're running.

    You say that the table has 53 million rows. How many are you inserting?

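    A sketch of what batching from a staging table might look like (table and column names are placeholders; small batches keep each transaction, and therefore each blocking episode, short):

      DECLARE @batch int = 5000;

      WHILE 1 = 1
      BEGIN
          -- Move one batch: delete from staging and insert the same rows
          -- into the destination within a single, short statement
          DELETE TOP (@batch) FROM dbo.StagingTable
          OUTPUT deleted.Col1, deleted.Col2   -- list all columns here
          INTO dbo.DestinationTable (Col1, Col2);
          -- note: OUTPUT ... INTO requires the target table to have
          -- no enabled triggers and no foreign-key relationships

          IF @@ROWCOUNT < @batch BREAK;       -- staging table is drained
      END;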
  • Luis Cazares (4/29/2015)

    It depends on what you're inserting. Does all the information need to be available at the same time? You could do the inserts in batches to keep each blocking episode short.

    If the indexes work correctly, you shouldn't need to block the entire table, and other users would still be able to read from it. I can't guarantee this, because it depends on the type of queries you're running.

    You say that the table has 53 million rows. How many are you inserting?

    I assume the data needs to be available as soon as it arrives; at least, that is what the users insist on.

    The bulk insert runs from an application on the application server. It inserts data in batches, but unfortunately I cannot check the batch size.

    As I pointed out above, we have 3 unused indexes out of 10 total. I will test dropping them.

    The number of rows inserted per run varies widely, but it's usually within 50k rows, sometimes up to 100k.

  • Make sure the log file has enough unused space to handle the bulk-insert logging before the insert starts. Dynamically growing the log file is an extremely slow process. If you need to, pre-allocate a chunk of additional log space, in the biggest chunks you can without affecting response time for that db, then remove it when you're done.
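    A sketch of the check and the pre-grow (database and logical file names are placeholders; note that SIZE in MODIFY FILE is the new total size, not an increment):

      -- How big is the log, and how much of it is in use?
      SELECT total_log_size_in_bytes / 1048576 AS log_size_mb,
             used_log_space_in_bytes / 1048576 AS used_mb,
             used_log_space_in_percent
      FROM sys.dm_db_log_space_usage;   -- run this in the target database

      -- Pre-grow the log in one step instead of relying on autogrow
      ALTER DATABASE YourDb
      MODIFY FILE (NAME = YourDb_log, SIZE = 20480MB);

      -- When the load cycle is over, DBCC SHRINKFILE can return the space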

    SQL DBA, SQL Server MVP (07, 08, 09)
    "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." -- "Norm", on "Cheers"
    Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies."

