Best practices - How an app. should behaves when the DB server is gone

  • Everyone who posted is right when they say that the specific business requirements and architecture are essential to determining what should happen when the database goes down.

    I worked on an app in which the databae was not the primary data store or interface. The app interfaced with a control system and the database was mostly used to record state information for later analysis. This system ran 24-7 and when the database went down for maintenance, there were problems. I changed the program so that all database inserts were asynchronously performed from an in-memory queue. When the database went down, another thread polled it periodically, and then restarted inserts when it came back up.

    If SQL Server 2005 had been involved in this system (it wasn't), Service Broker would have saved me a lot of programming.

    Anyway, this is one example of what to do when the database goes down, which is, as you can see, very specific to the program's function.

  • Guys, I was just joking with my previous two posts and a status table and testing, hence the second sentence of my second post.

    The table in the database would really be a lockout table, not a status table. We use a lockout table in all of our projects. That way we can lock everyone out and then apply updates and maintence without having to worry about the users getting any weird messages or errors.


    Live to Throw
    Throw to Live
    Will Summers

  • Actually I have an ErrorLog table in my database.  Whenever any procedure fails, it will write an entry to the errorlog table and also send out an email to the developers.   So the developers will know which part of the procedure fails and they can correct the error.  I use the Errorlog table to help the developer and hope the system will be up sooner.

  • Where do you log the error when the connection fails?

  • I've tried a couple options - one was to write them locally and then have the app check for any stored errors the next time it started up and resubmit them, another was to use a server that had just housekeeping stuff on it, low load, and put the error logging there. Still a chance it could fail while the main server was up of course, but in that case we just didnt worry about it in one case, in another we failed over to the local event log.

    Centralizing error logging is absolutely essential in my opinion. You can see trends easily, and if needed, can do some nice alerting/real time monitoring from it. Especially for data access issues! The best implementation I saw logged the sql as well (either dynamic or the proc call with all params) so you could copy/paste it to immediately start working repro-ing the problem.

  • I'm curious, how would you check a status table in the database if the database was down in the first place?

  • It was a joke. Hence the at the end of the message.


    Live to Throw
    Throw to Live
    Will Summers

  • Will, it was a joke to you but not to me.  The Errorlog table I created was using to check the procedures not the database.  When the connection is down or the system is down.  I go to the server to look at the Errorlog of the SQL Server.  Also I go the Event Viewer of the server to look at anything happens to the connection or other things.

    There is not a rule of how to check the error.  Everytime is different.  Also my server has a linked server to other server to get to their databases.  If even our connection is good, their connection or their system may fail.

    Is it that is challenge what the DBA wants?   Even though I am not an official DBA.

  • An application crash is never ok. The user needs to have the impression that the application can handle anything thrown at it and that the user cannot break it. This way the user trusts the application and starts to depend on it and will recommend it to others.

    The application will get a great reputation and many many new users and the developers pockets will be overflowing with one dollar bills to give to all of those exotic dancers that are trying to work their way through community college.


    Live to Throw
    Throw to Live
    Will Summers

  • The application should allow user to save data locally which means, the initial data entry screen should be available for saving, copying etc, I mean, being up and clickable.

    Examples: Web app should not overwrite your input screen with the error message screen. While I am on the Reply to the Message page here on sqlservercentral.com, I can copy my reply out to Notepad. If I click Post Reply and get a timeout or error message, I will not be able to copy my reply to Notepad to post it later. So I suggest to post messages on separate screens. For the desktop application, a connection to the database should be an asynchronous process, so the text I typed  will be available to me and the form will be clickable.

    Regards,Yelena Varsha

  • If it's a web application, then you should be able to hit the Back button and go and the screen should be in the same state before the page was submitted.

    No application crashes are ever acceptable.


    Live to Throw
    Throw to Live
    Will Summers

  • Depends on the web technology and a lot of other factors.

    I used centralized routines to make connections, get recordsets etc. In those routines, I have error checking and I kick to a stub page if an error happens.

    Works 99% of the time. Once however, my error checking did not work. I got a dreaded ODBC error message from OLE DB provider which succintly translated to 'Microsoft fubar - please ignore.'

  • There's no one answer.  Depends on what you are doing, what the HA situation is, and how much support is available.

    In my shop, I try to insist that any daemons keep trying to restart until they connect again.   I also insist they use a DNS name I assign, so that the application or daemon has no actual knowledge of the physical server.  

    This allows me to use dynamic DNS update to move the connection retries to another server if need be (which is kept hot with something like XOSoft).

    In general, we want end user applications to simply fail.  This is far from ideal but it is simple and easy to support: "The database is unavailable, sorry, please try again in 10 minutes".  Again, using DNS so I can redirect it.   Clearly with enough time and money the program can be written to save its state and attempt to resume, but frankly this is no more satisfying ("database operation delayed, still trying...") than just stopping and letting them go on to some other task meantime.

    One caveat on the DNS thing - some daemons, such as Apache web servers, like to resolve the DNS into an IP address which they then store for themselves.  They do this for security, to avoid the chance of someone hacking DNS to access different servers, but it stinks out loud to me.   If I am bringing a dataserver back up, I don't want a list that this dept's report site needs to be restarted and that dept's workflow engine needs to be sent a restart signal.

     

    Roger L Reid

  • Ninja's RGR'us:

    If you are using Visual Studio (whatever language) and MSofts Best Patterns and Practices (Errologging block) you can configure the app to log to multiple error loggers or failover to nominated defaults, such as the Windows event log or XML files etc.

    You can still log the error, notify the user and retry the operation - or diverge to a new instance of Solitare whilst playing soothing music until they feel safe enough to give it another go......

    Depends whether youre developing for stock exchange identities or your Grandmother.

    DB


    The systems fine with no users loggged in. Can we keep it that way ?br>

  • What if it's your Grand mother's stock exchange monitoring program for her whole lifesavings?

     

    Thanx for the info.

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply