Massive Buffer I/O Wait Times

  • I'm running a SQL Server instance with a large-ish (2+ TB) database, consisting of about 6 tables, all with fresh and recommended (per Execution Plan) indexes. The tables and indexes are partitioned across three 2TB 7.2K mirrored drives. I have a user who is running a .NET application that reads (in 10k batches) one record, joins it to the other 6 tables then spits it back out. He has this application running on three different boxes, all hitting the same server, each with 2 parallel process (for a total number of 9 simultaneous connections). I'll submit a generic version of the query, if need be.

    We're seeing a ton of PAGEIOLATCH waits with large Buffer I/O wait times, upwards of 30000ms. I've asked the developer to script his batches so that each client is iterating through the PK range for the master table corresponding to the partition schema, but this has done little to alleviate the congestion.

    There's approximately 500MM records that need to be iterated through in the master table, and we're going at a rate of about 400 rec/s, so this is taking a very long time.

    Before starting to analyze the query, does anyone have any immediate suggestions?

  • What does the whole process do?

    Does it update or delete data in the master table once done?

    With only 3 spindles, the idea to have 9 processes hitting those drives at the same time just doesn't seem to make sense... it's worse if you also do any writes.

    I'd love to see the queries to see the whole process. Right now I'm thinking more along the lines of blocking rather than query design.

    How "fast" does the process go if you only select from the table and ignore the rest of the code in .net (in rows per sec)? That might tell us where to start looking.

    If you have time, post the actual execution plan, it couldn't hurt to take a look in case we spot some easy fix.

  • It is a SELECT only, with multiple joins on PK-FK relationships to child tables. There is no inserting, updating, or deleting.

    As far as the number of concurrent processes hitting the database, after much troubleshooting on my own, I had a feeling that this might be the bottleneck. At this point, I'm thinking that, due to all of the waits we're experiencing, we're giving up any gains from multi-threading and should try reducing the number of processes.

  • Maybe not. I'd consider doing a bcp out to dump the data into csv (usually 100K rows / sec easy). Dump that into 3-9 files and let your other servers process this to their hearts contents.

    I don't want to beat this to death but 3? spindles for 2+ TB at 7200 rpm?!?!? That just can't go fast.

    We have a very decent san over here. DB is spread over 48 15K/s scsi spindles (maybe not all of those spindles, but a good chunck of them available and used for the prod DB). Even with all that power the san rarely goes above 1-2 GB / minute. So we're still talking about 18+ hours of processing (just db's end of thing, B4 doing any joins).

    What's your <daily> window to run this?

  • and if going BCP out route... I'd go straight to network share (again to share the load with other spindles).

  • BCP is certainly something to think about. Yeah, I agree that we need faster HDD's. We've got a Dell MD1220 attached storage with 24 1TB drives spread out into several 2TB partitions. I've been considering upgrading to SSD's without RAID. Faster, more reliable..just make regular off-server backups.

  • Your disk configuration is well below par for dealing with TB+ databases (I'm confused how you have a 3 disk mirror in any case), but 7.2k hdd's without any striping is pretty poor for enterprise database storage.

    SSD would make it faster, but I wouldn't run any kind of production database without a redundant RAID setup (from an uptime perspective as much as anything), so I'd just sort the basics out (modern array controller, RAID 10, plenty of spindles, SAS/SCSI 10k or 15k drives)

  • It's not a 3 disk mirror. It's 3 mirrored volume across which the db partitions are spread. Plenty of reports out there that indicate that this is faster than pretty much any RAID configuration available. Yeah, I agree, the lack of redundancy is a problem. We're still in beta with this project, but the whole thing is serving as a proof of concept. I'm hoping that if we can just get an immediate performance increase, there won't be a balk at getting even more SSD's to act as redundant mirrors.

  • Ah, sorry, misunderstood. Well, it depends on how you're partitioning your data across the 3 seperate volumes I guess, but it does sound like your drives are underscaled for the volume of data.

    Would be interesting to get more details about the solution - how is the .Net application feeding the 10k rows into the DB? Table-Valued Parameters, or individual inserts? If it's the latter, then changing to Table Valued Parameters and inserting in batches (or bulk inserts/bcp as have already been mentioned) could be beneficial.

  • HowardW (8/3/2011)


    Ah, sorry, misunderstood. Well, it depends on how you're partitioning your data across the 3 seperate volumes I guess, but it does sound like your drives are underscaled for the volume of data.

    Would be interesting to get more details about the solution - how is the .Net application feeding the 10k rows into the DB? Table-Valued Parameters, or individual inserts? If it's the latter, then changing to Table Valued Parameters and inserting in batches (or bulk inserts/bcp as have already been mentioned) could be beneficial.

    The other way around. A couple app instances with a total of 9 threads reading 10K dumps (plus jonis) from the db and process offline. I don't know what else it does in .net but it doesn't write back to the db.

  • Can you expand a little bit more on what you are trying to do? Maybe another technology would be way faster... SSIS, SSRS, DW, CUBE...s

  • This is the .NET application Developer here...

    The application is rolling off the full database structure into something more stutable for retrieval by id at high volumes (i.e. 2400 req/s).

  • Not sure I understand. Myspace is hosted on sql server and it handles 1 million requests / sec.

    Care to expand on your idea?

  • Ninja's_RGR'us (8/3/2011)


    Not sure I understand. Myspace is hosted on sql server and it handles 1 million requests / sec.

    Care to expand on your idea?

    Its similar to any reporting structure. To avoid joining massive tables each time we want to retrieve a set of data points, we are rolling the data off into a separate structure.

  • That's what a datawarehouse does!

    Anyhow, have fun.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply