I'm a big believer in looking at systems when everything is okay. I'm not the only one, as Joe Richards, a Directory Services MVP, commented the same thing a while back. If you work with Active Directory and you don't know who Joe Richards is, you need to go ahead and revoke your domain and/or enterprise admin rights. Go ahead, I'll wait.
Back when I was a systems and security architect, I spent a lot of time looking at processes and looking at packet traces. I wanted to know what was usually running on a system and what kind of traffic it generated. Getting an idea of what is normal meant I was better able to spot what wasn't when a problem came up. For instance, if our server is having an issue and I'm seeing some process called abcprocess.exe running in Process Explorer that I don't ever remember running, that gives me a starting point. Lo and behold, after some investigation I find abcprocess.exe is homegrown, and it went in as part of a build day before yesterday. And the problems started... day before yesterday. When we stop abcprocess.exe from running and the problems go away, it looks like we've done some magical troubleshooting, kind of like pulling a rabbit out of the hat. Well we did, but we knew that the hat had a false bottom, so pulling the rabbit out was easy. After all, we knew abcprocess.exe didn't normally run on that server. And that's what started our wonderful magic trick.
This is something that should be applied to all systems. I was reminded of it recently when trying to write reports for a 3rd party application. It's a small company with a very specialized application and database. You can translate specialized as customized. With respect to a particular report, try as I might, I couldn't find a particular code that was supposed to be in a column in the database. I looked in all the logical places based on table names and found zilch. I was at a loss. The problem I had run into was I wasn't familiar with this database. I hadn't run traces to know what kind of queries hit the database and what tables and columns they were querying. So I didn't know how everything looked when it was okay. Eventually, after a lot of searching, I finally found the column. It's in a table that logically doesn't make sense based on naming. But it's the right column. I know this because I built a quick reconciliation report query, and let the end users verify. And so I was able to produce the report.
But what struck me as I was completing the report is I still don't know what the queries look like when everything is fine. I need to take time to do that. I need to set up traces and the like, just to get an understanding of how this application operates against this particular database. This is now especially true since I've determined that in some cases the customization that has performed is not logical. It's only a matter of time that they have a real issue. And if I haven't spent the time understanding how this system performs when everything is okay, then my troubleshooting is going to be longer and more difficult than it should be. It'll be like trying to catch the rabbit to stash him in the hat. I'd rather him already be in the hat.