May 8, 2009 at 12:43 pm
We've identified one of our databases as something of a bottleneck. The processing load on this database creates performance problems for the other databases on the same server.
The database causing the bottleneck is a log for our IIS application.
I downloaded the PerfStats scripts to the server running SQL Server & ran a capture for 20 minutes. I then moved the capture file to a PC with SQL Nexus for some analysis. SQL Nexus broke the waits down into 4 significant categories: Network I/O, WriteLog, CPU & XPROC. Each of these categories has significant wait times & each represents 20% or more of the Active Thread Time. The remaining 3 categories combined represent < 17% of the Active Thread Time.
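For reference, the same wait picture can be pulled straight from the wait-stats DMV (SQL 2005 and later); a minimal sketch, filtering out a few of the usual idle waits:
[code="sql"]
-- Top waits since the last restart (or since DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR))
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM   sys.dm_os_wait_stats
WHERE  wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP',
                         'SQLTRACE_BUFFER_FLUSH', 'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC;
[/code]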
I think the problem is related to the volume of data being inserted into the database & the manner in which it is being inserted (via XPROC processing that issues the actual insert statements). I think this is a silly way to log data, but I haven't been able to persuade our developers otherwise.
Another concept our developers have adopted that strikes me as silly is sending the logging data in batches. I examined the inserted records for the time period in question: the largest record is 5,540 bytes long, and the average is 154 bytes. During the capture period 50,568 records were inserted into the database, which adds up to more than a trivial amount of data.
Our lead developer has a '[i]gut instinct that we are running out of sql connections to the server[/i]'. I contend the issue is how much data is being processed & how it is being processed. I believe Microsoft limits the number of concurrent XPROC instances to 100, which is probably part of the bottleneck - we should be using regular stored procedures. (By the way, the XPROC exits to a COM+ module that then performs the inserts - again, not my design.)
Neither of our IIS servers is suffering from ephemeral port exhaustion, so I don't believe we're running out of connections. I think the connections are just queuing up, waiting to be processed.
Is there a maximum number of connections a SQL Server instance supports?
Is there something else I need to examine to help determine the cause of the bottleneck?
May 13, 2009 at 12:56 pm
Data input can always be problematic. I run basic tests inserting 16 KB rows and can get more than 20k rows inserted per second.
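The kind of test I mean is nothing fancy; a sketch along these lines, with the table made up and the row size chosen to roughly match your 154-byte average record:
[code="sql"]
-- Crude single-connection insert test; each INSERT is its own autocommit
-- transaction, so this also exercises the log drive (WRITELOG waits)
CREATE TABLE dbo.InsertTest (id INT IDENTITY(1,1) NOT NULL, payload CHAR(154) NOT NULL);

DECLARE @start DATETIME, @i INT;
SELECT @start = GETDATE(), @i = 0;
WHILE @i < 50000
BEGIN
    INSERT INTO dbo.InsertTest (payload) VALUES (REPLICATE('x', 154));
    SET @i = @i + 1;
END;

SELECT COUNT(*) AS rows_inserted,
       DATEDIFF(ms, @start, GETDATE()) AS elapsed_ms
FROM   dbo.InsertTest;
[/code]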
But you must have a good disk subsystem for the mdf file(s), and your t-log drive also needs to be quick - waits don't always tell the whole truth.
One step you could take would be to isolate your db to its own data and t-log drives, making sure they are both RAID 10; something along the lines of the sketch below.
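Moving the files is an offline operation roughly like this (a sketch; the database, logical file names and paths are all placeholders):
[code="sql"]
-- Point the catalog at the new locations (names/paths are hypothetical)
ALTER DATABASE IISLogDB SET OFFLINE WITH ROLLBACK IMMEDIATE;

ALTER DATABASE IISLogDB
    MODIFY FILE (NAME = IISLogDB_Data, FILENAME = 'D:\SQLData\IISLogDB.mdf');
ALTER DATABASE IISLogDB
    MODIFY FILE (NAME = IISLogDB_Log,  FILENAME = 'L:\SQLLog\IISLogDB_log.ldf');

-- Copy the physical files to the new drives, then bring it back online
ALTER DATABASE IISLogDB SET ONLINE;
[/code]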
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
May 13, 2009 at 1:03 pm
Is this data queried in real time? Any reason not to move to a more structured procedure like SSIS? If they can load 1TB in less than an hour, doesn't that make more sense?
May 13, 2009 at 1:04 pm
Thanks.
Will give it a try.
May 13, 2009 at 1:05 pm
SSIS is a good suggestion. However, the developer in question lives, breathes & may die by VB 6.0 - hence the XPROCs.
May 13, 2009 at 1:08 pm
I can appreciate their dedication to VB 6, but that's part of why I suggested SSIS instead of going with BCP or something else. It's a programming environment, and sometimes you can get developers interested for that reason.
May 13, 2009 at 1:28 pm
Batching data is actually in your favor. How it is done is tricky, though, in terms of how you index such a table.
You should look into minimizing page splits and ensure, as posted above, that your I/O subsystem can handle it.
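A minimal sketch of what I mean, with hypothetical table and column names: clustering on an ever-increasing key keeps the inserts appending at the end of the index instead of splitting pages in the middle.
[code="sql"]
-- Cluster the log table on an ever-increasing key so inserts append
-- (table and column names are made up)
CREATE TABLE dbo.IISLog
(
    LogID    BIGINT IDENTITY(1,1) NOT NULL,
    LoggedAt DATETIME      NOT NULL CONSTRAINT DF_IISLog_LoggedAt DEFAULT GETUTCDATE(),
    LogText  VARCHAR(6000) NOT NULL,
    CONSTRAINT PK_IISLog PRIMARY KEY CLUSTERED (LogID)
);

-- Rough check on fragmentation/page splits over time (SQL 2005+)
SELECT index_id, avg_fragmentation_in_percent, page_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.IISLog'),
                                      NULL, NULL, 'LIMITED');
[/code]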
* Noel
May 13, 2009 at 2:45 pm
As there isn't a SQL 2008 performance forum, I'm not sure if you're using SQL 2008, which of course can accept batches of data as an input. If there are problems you must have a bottleneck - if it's affecting other dbs, I'd guess it's probably disk.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
May 13, 2009 at 3:35 pm
I'd say that a heavily used IIS website/server should log to text files, with those log files loaded on a scheduled basis. I used direct logging of IIS information at a previous employer, but the site was an internal site with little activity, and the SQL Server that handled the logging was not used for anything else, so it did not affect the processing on the main SQL Server system.
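The scheduled load side can be as simple as a BULK INSERT into a staging table (a sketch; the path, staging table and FIRSTROW value are assumptions, since W3C log layouts vary):
[code="sql"]
-- Load one IIS text log file into a staging table on a schedule
-- (file path, staging table and FIRSTROW value are hypothetical;
--  the staging table's columns must match the log's field list)
BULK INSERT dbo.IISLog_Staging
FROM 'D:\IISLogs\W3SVC1\ex090515.log'
WITH (FIELDTERMINATOR = ' ', ROWTERMINATOR = '\n', FIRSTROW = 5);
[/code]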
May 14, 2009 at 9:52 pm
How many connections do you normally see on this server? If connection pooling is working, the IIS server should require a relatively small number of connections. An overwhelming number of connections doesn't indicate a SQL Server problem; it indicates that they don't know how to close or pool connections properly. Are extra connections being created inside the xproc?
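If you want a quick picture of that before setting up a trace, something like this against the SQL 2005 DMVs will do (a sketch):
[code="sql"]
-- Current user connections grouped by source; compare the total against
-- @@MAX_CONNECTIONS, which reports the hard limit (32,767 by default)
SELECT host_name, program_name, COUNT(*) AS connection_count
FROM   sys.dm_exec_sessions
WHERE  is_user_process = 1
GROUP BY host_name, program_name
ORDER BY connection_count DESC;

SELECT @@MAX_CONNECTIONS AS max_connections;
[/code]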
The xproc has to go. The overhead of instantiating COM objects and marshalling all the parameters could only be justified if there were some extreme computational task involved that isn't suited to T-SQL, and even then you'd be better off using a CLR stored proc. The wait analysis you did showing high CPU and XPROC times really makes me think this is the culprit. I doubt the xproc has any real function other than letting the developers hoard all the code in their little VB6 world.
I agree with Lynn that a logging task like this should go to a text file and be periodically imported. The only reason to log directly to a database would be if you need to query the latest inserts in near-real time. Reading from the log table, especially large reporting queries that create table locks, while inserting some 2,500 records a minute is a good way to cause blocking. On the other hand, your analysis of waits doesn't indicate that blocking or disk I/O is the bottleneck.
You are obligated to check your server configuration to make sure it can handle this workload. Are the transaction logs and data files on separate drives? Can you partition their table to spread out the inserts? But I would also keep analyzing the wait types, and running traces to see how many connections per second their application is opening and closing. Write a stored proc to replace their xproc (see the sketch below) and see if it performs better. At some point you need to be able to convince the developers that you know a thing or two about database performance, and your learned opinion should carry more weight than their gut instincts where SQL Server is concerned.
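The replacement doesn't need to be elaborate if all the xproc ultimately does is insert a row; a sketch, with the table and parameters made up to match the sort of data you described:
[code="sql"]
-- Plain T-SQL insert proc to stand in for the xproc/COM+ round trip
-- (table and parameter names are hypothetical)
CREATE PROCEDURE dbo.usp_InsertIISLogEntry
    @LoggedAt DATETIME,
    @LogText  VARCHAR(6000)
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.IISLog (LoggedAt, LogText)
    VALUES (@LoggedAt, @LogText);
END;
[/code]
Time it side by side with the xproc under the same load and let the numbers do the arguing.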
May 15, 2009 at 7:02 am
Thanks for all of your suggestions.
Currently we are running SQL2k5.
I like the idea of periodically loading batches of data - this could easily be done in the wee hours without causing a major disruption. SSIS is the perfect tool for such a task, too.
The current database design has Simple recovery model, so I'm not certain we can gain much by putting the data & transaction logs on separate drives.
Also, I currently don't have that luxury either - my SQL servers have mirrored drives for the OS partition & RAID-5 drives for the data partition. I have plans to migrate to a SAN, but I'm still working towards that end. I will say our web site is outgrowing our infrastructure, which presents another budget battle to take up with senior management.
We've been able to eliminate the XPROCs as a source of concern for this database, but I've explained to my developers that the XPROCs still need to go; they present a security issue, & exiting SQL Server to run a COM+ object that then runs SQL queries is just plain poor design - they should be replaced with stored procedures.
May 15, 2009 at 7:11 am
BigSam (5/15/2009)
The current database design has Simple recovery model, so I'm not certain we can gain much by putting the data & transaction logs on separate drives.
Also, I currently don't have that luxury either - my SQL servers have mirrored drives for the OS partition & RAID-5 drives for the data partition. I have plans to migrate to a SAN, but I'm still working towards that end. I will say our web site is outgrowing our infrastructure, which presents another budget battle to take up with senior management.
Recovery model really has no impact on why you should put data and log files on separate drives. The main reason is that log writes are sequential while data writes and reads are random, so by placing the log file on its own drive you reduce disk head movement.
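A quick way to see where everything currently sits (SQL 2005 and up):
[code="sql"]
-- Data and log file locations for every database on the instance
SELECT DB_NAME(database_id) AS database_name,
       type_desc,
       physical_name
FROM   sys.master_files
ORDER BY database_name, type_desc;
[/code]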
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
May 15, 2009 at 7:30 am
BigSam (5/15/2009)
The current database design has Simple recovery model
Isn't that a risk? If anything fails, you'll be losing everything back to the last full/diff backup.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 15, 2009 at 8:51 am
I agree with Gail. Think about what happens if your database fails 10 minutes before the next full/diff backup. How much work is gone since the last full/diff?
That's a risk. I'm not saying you need log backups or need to change, but you should think about this and be aware of what would need to be done.
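If you did decide the data is worth point-in-time recovery, the change itself is small (a sketch; the database name and backup paths are placeholders) - the real commitment is scheduling the log backups:
[code="sql"]
-- Switch to FULL recovery and start the log chain
-- (database name and backup paths are hypothetical)
ALTER DATABASE IISLogDB SET RECOVERY FULL;

BACKUP DATABASE IISLogDB TO DISK = 'E:\Backups\IISLogDB_full.bak';

-- Then schedule this (e.g. every 15-30 minutes) via a SQL Agent job
BACKUP LOG IISLogDB TO DISK = 'E:\Backups\IISLogDB_log.trn';
[/code]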
May 15, 2009 at 8:58 am
The reason we logged all FTP data transfers directly to a SQL database was that we used insert triggers to start our import tasks upon a successful transfer from our LOB system. Since the nightly extract processing time varied from day to day (our inventory extract could run 3 hours one day and 5 hours the next), we did this to eliminate missed import windows. I dealt with too many calls saying "I entered this data yesterday, why isn't it in AIS today?" Again, though, the server was an internal server with limited access by users, so the logging didn't impact anyone.
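For what it's worth, the trigger was nothing exotic; a sketch of the idea, with the table, column and job names all made up:
[code="sql"]
-- Kick off the import job when a successful transfer is logged
-- (table, column and job names are hypothetical; the caller needs
--  permission to execute sp_start_job in msdb)
CREATE TRIGGER dbo.trg_FTPTransferLog_StartImport
ON dbo.FTPTransferLog
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    IF EXISTS (SELECT 1 FROM inserted WHERE TransferStatus = 'SUCCESS')
        EXEC msdb.dbo.sp_start_job @job_name = N'Nightly Inventory Import';
END;
[/code]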