Monitoring SQL Server Systems

  • Lynn,

    Does your co-worker read this board?

  • Great post Jeff,

    I'm jumping in a little late, but here's my 2 cents:

    "if you did have someone to fix it, it wouldn't have been written to cause a problem in the first place." 

    I wish that was true.  We have a third party product that's so "flexible" our junior staff can write stored procedures and ad hoc queries for end-users to run from the GUI.  Fortunately the vendor provided us with programming "assistance" early on in our implementation and supplied many nested RBAR (cursor loop) examples for junior staff to copy and build on.

    The database is not huge by today's standards, but you start doing RBAR on over-normalized property tables with 20 million plus rows and you get poor performance.

    Of course the vendor supplied code couldn't possibly be the issue, so it must be a server configuration problem.

    Finding myself on the defense, I wrote my own "closing the gate after the horse got out" performance logging tool that queries system tables every minute and records spid, login, cpu, i/o, timestamp, etc.  I created views to show hogs by hour of the day and sp's to zoom in on resource use per minute for a given login and spid.

    I put the tool on our development server so developers could see that the "server was slow" because their code (or someone elses!) had the CPU pegged at 100% for the last 10 minutes.  When satisfied it was working properly, I also put it on the production server to monitor resource usage.

    What does all this have to do with Lynn's original question?  I let the tool do performance monitoring and focus on other work.  When I get a report that "The server is slow" (or better yet, the server was slow yesterday) I can query the hog reports and identify the culprit.  It's still time consuming, because if it's an end-user on the production server, someone has to contact them and ask what they where doing from 8:36 AM to 8:48 AM yesterday.  But it never adds up to 4-6 hours a day.  Maybe 4-6 hours at most for a bad week, but I've gone weeks without having to look at performance data.

    This tool makes sense for me because I don't have to babysit the server, and the developers can benchmark their own T-SQL (yeah, right).  Also, I can use it to to identify resource contention problems that occurred in the past (I keep 2 days worth of data).  Without resource use data I'd have to somehow prove that I don't accidentally have the run-slow-today server configuration option set to TRUE. 

    It also has drawbacks:  Even though it gives near real time (1 minute old) performance data, it does not give real time performance alerts.  Also, it queries system tables and will likely have to be scrapped or re-written when we migrate to SQL 2005.  Finally, it might actually cause resource contention problems if there were enough concurrent logins.  (Right now it's barely even on the radar, only fractions of a percent cpu.)

    Lack of real time performance alerts is not a problem for me.  Management does not want me to drop everything I'm doing and hunt down a user who is tying up resources on the server.  When they want me to be more proactive, I will take the time to set up e-mail, or "crack berry" (That's too funny, Jeff) alerts.

    In an ideal world, resource intensive T-SQL should never make it into production. The challenge is to convince management that production rollout delays caused by peer reviews and benchmark testing is worthwhile.  It seems like they'd rather have end-users just accept that occasionally "the server is slow" because that's the way the system works.  If they get used to it, they won't complain and if there's no complaints, there isn't a problem.

  • Doubt it.  I don't think he knows about this site and (as selfish as it sounds) I don't intend to tell him about it.

     

  • Just curious, can I change my name on this site just in case?

  • Yes... {MyAccount}{MyProfile}  Change {FullName} under the contact information.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... funny how things come back to ya... "RBAR" is pronounced "ree-bar" and is a "Modenism" I posted quite a while back that means "Row By Agonizing Row" 

    And, it's a very small world... we had a DBA by the name of Andy that wrote similar "once-a-minute" procs to do the same thing you mentioned.

    >>It seems like they'd rather have end-users just accept that occasionally "the server is slow" because that's the way the system works.

    Heh... really got a good laugh on this one... how true, how true... shouldn't be a new thing to any of us, though... Microsoft gets away with it all the time

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I just noticed Matt's signature line... "A lack of planning on your part does not constitute an emergency on mine."

    DBA's should hang a shingle that says "A lack of planning on your part DOES constitute an emergency on mine... so PLAN BETTER, eh?"

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The acronym that has more meaning in my time as a DBA (previously DB2, now MSSQL in the last few years) is:

    BOHICA (pron: bow-hee-car)

    Bend Over - Here It Comes Again.

    Generally refers to the reeming that we get as DBAs when it comes to fixing performance issues caused by developers writing shite code and then pointing the finger at us saying that it's our systems that are the problem.

    A lack of planning on your part does not constitute an emergency on mine.

  • While I like Spotlight for the eye-candy, it tends to put too much of a drain on the monitored system.  Idera seems to be a better balance.

  • I use Idera so there is no need to sit there and watch; and I would say that once the alerts (on Idera) are configured to the expectations of the system being monitored, it really isn't necessary to check more than a couple times a day.  I monitor roughly a hundred systems.

     

    Keith

  • 4-6 hours a day?  Seems a tad much if the system is not having a particularly bad day, as in maybe 4 - 5.95 hours too much.

    Now, on bad days, sure.  But then you aren't just monitoring for that time, you are trying to find the actual problems.

    If they have to work that much on performance all day I think the money would be better spent on hiring someone to re-architect the solution than on a DBA to monitor it....

    That is more time than I spend at SSC every day!

  • I'd have to agree that 4-6 hours a day is a bit much for your everyday monitoring.  I don't have 4-6 hours per day to spend on any one single task.  If I had something that demanded that much of my time, I would quickly begin coming up with a way to automate it.  I don't think there is a question that this is too much time to be spent on an average day, but I would give your co-worker the benefit of the doubt and find out exactly what is being done during those 4-6 hours.  You can then work to assist them in finding a better/more efficient way to get that task done so that their time can be spent on more productive tasks.

     

     

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • That is easier said than done.  I won't go into details on that, as it would be inappropriate, and it is only my opinion at this time.  On a parallel, however, who is responsible for keeping one self relevent in our changing technological environment: us or our employers?  Of the two DBA currently employed, I am the only one learning and working with SQL Server 2005.  The other, every time SQL Server 2005 comes up, where is the training?

     

  • Lynn,

    From your response, I get the impression that you have little time for your colleague and maybe even consider them to not be pulling their weight?  No need to respond if you'd like this to remain rhetoric

    I've thought of asking for SQL 2005 training through work but I've talked to mates of mine who've done it and they reckon that it wasn't worth the money if you're a half-decent DBA and you'll pick it up pretty quick if you're fairly good with 2000.  I've found that there's been a steep learning curve when it comes to the Management Studio and doing things differently to before but between nutting them out and BOL, I've made it this far OK.  I might be whistling a different tune though when it comes to starting to convert DTS over to SSIS....

    If I've discovered anything when it comes to working with 2005, it's that you can go right into it if you're a true propellor head - and you can also fumble your way through and maintain the system if you're what I'll call an End-User-DBA.  The more I learn about 2005: the more I have to take my hat off to the people who developed it - it really is a remarkable piece of software (and this is coming from someone with a mainframe/DB2 background )

    A lack of planning on your part does not constitute an emergency on mine.

  • Unfortunately, I get to be the snooper at my company.  I have written a couple of small apps that I have developed over the years that help me.  Basically, I start profiler (on another machine of course) and have it watching things throught the day.  I only trap certain events and those SQL/SP's that run longer than 5 seconds on my 15+ SQL Servers.  Profiler generates trace files and at the end of each day, I import them into a custom reporting system that I can use for trending.  No real fancy reports or anything but it is automated and when a client/mgmt calls I have some data to refute/confirm their complaint of a slowness (most of the time refute).  I would like to integrate Reporting Services so that my management team can look at things (they can spend hours looking at PowerPoints!!!). 

    I also have some alerts setup for certain performance issues but those don't go off very often.  Usually by the time I get the text message, it is too late and I have someone standing outside my office or on the phone and the problem has already occurred and it may be difficult to find out what caused it.   

    However, I don't sit and watch it for hours on end....I spend maybe 20 minutes per week as I have way too much other stuff to do.  If you are watching it like a hawk, then you are being reactive instead of proactive.  I would rather automate the the profiling and work with developers and other DBAs to write faster, more efficient code and processes so that I don't have to watch it all day long.

    So to me, it makes sense to automate the profiling for a couple of grand and maybe get this person helping make the system proactive and earn their keep.   

    SJ 

     

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply