This editorial was originally published on Sept 25, 2006.
This quote is great: "Any large system is going to be operating in failure mode most of the time."
It's from Peter Coffee's essay on learning from paper-based systems and what works in their digital equivalents. Peter examines the conversion of a paper-based system for keeping track of Boy Scouts' work on merit badges as the process moves to an online system. He points out a few places where the design doesn't match the process.
More importantly, he points out that the system fails in a few ways, which is to be expected, but it doesn't fail gracefully. I think this is a good point, and one that most developers fail to take into account. Most systems will "fail" in some way and should be able to handle that failure, whether it's data entered incorrectly, a mishmash of keys hit, incomplete data, or something else. We seem to expect that our systems will move along perfectly and will "force" users to work in the approved method and process, thereby improving efficiency and ensuring things work well.
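To make that concrete, here is a minimal sketch in Python of handling bad input without falling over. It's my own illustration, not anything from Peter's essay: the `hours` field and the `parse_hours` and `load_records` names are hypothetical. The idea is simply that a bad value gets set aside with a reason attached instead of stopping the whole batch.

```python
import math


def parse_hours(raw):
    """Return (value, problem); exactly one of the two is None."""
    text = "" if raw is None else str(raw).strip()
    if not text:
        return None, "missing"
    try:
        value = float(text)
    except ValueError:
        # A mishmash of keys ends up here instead of crashing the load.
        return None, f"not a number: {text!r}"
    if not math.isfinite(value) or value < 0:
        return None, f"out of range: {text!r}"
    return value, None


def load_records(rows):
    """Keep the good rows and set the bad ones aside with a reason."""
    accepted, rejected = [], []
    for row in rows:
        value, problem = parse_hours(row.get("hours"))
        if problem is None:
            accepted.append({**row, "hours": value})
        else:
            rejected.append((row, problem))
    return accepted, rejected
```

The rejected rows are kept along with the reason, so someone can fix them later rather than having the entry silently lost.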
I've encountered this in many places, but nowhere more clearly than in the surveys we used to send out to customers. We'd require answers to many questions to ensure that we could properly fill the parent and child tables in the database and ensure proper reporting. The developers were happy to run along and build this and make it work as expected. My issue, however, was that the surveys were sometimes long and complicated, and I thought we should capture whatever information we could in the event the user stopped early or had connectivity issues. In other words, assume we will have issues and plan for them.
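Here's roughly what I had in mind, sketched in Python with SQLite. This is not the system we actually built. The `response` and `answer` tables and the function names are made up for illustration. Each answer is committed as it arrives, and the parent row is only flagged as completed at the end, so an abandoned survey still leaves usable data behind.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the hypothetical parent (response) and child (answer) tables."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS response (
            id        INTEGER PRIMARY KEY,
            customer  TEXT NOT NULL,
            completed INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS answer (
            response_id INTEGER NOT NULL REFERENCES response(id),
            question    TEXT NOT NULL,
            value       TEXT,
            PRIMARY KEY (response_id, question)
        );
    """)
    return conn


def start_response(conn, customer):
    # The parent row exists from the first page, not only at submit time.
    with conn:
        cur = conn.execute(
            "INSERT INTO response (customer) VALUES (?)", (customer,)
        )
    return cur.lastrowid


def save_answer(conn, response_id, question, value):
    # Commit every answer on its own; a dropped connection loses at most one.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO answer (response_id, question, value) "
            "VALUES (?, ?, ?)",
            (response_id, question, value),
        )


def finish_response(conn, response_id):
    with conn:
        conn.execute(
            "UPDATE response SET completed = 1 WHERE id = ?", (response_id,)
        )


def summary(conn, response_id):
    """Return (completed, {question: value}) for one response."""
    row = conn.execute(
        "SELECT completed FROM response WHERE id = ?", (response_id,)
    ).fetchone()
    answers = dict(conn.execute(
        "SELECT question, value FROM answer WHERE response_id = ?",
        (response_id,),
    ))
    return bool(row[0]), answers
```

Reports can then filter on `completed` when they need full responses, and still count the partial ones when that's useful.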
You've probably encountered this in other places as well. Every system, especially a web-based one, should be expected to fail at some point, and you should deal with those failures gracefully. We're all human and we make mistakes, and the systems we build will have bugs, problems, or even unintended uses by our clients.
So plan to fail and you'll be better off.