A few months ago, I was participating in a threat hunting exercise on the security side. The gentleman leading the exercise was discussing some scans to run. However, before we did anything, he made sure to state that we should run them against a non-production environment first. Apparently, some of his clients needed to be reminded of this rule.
He had a good tale as to why he personally believed in this IT mantra. He was doing vulnerability scanning for a manufacturing firm at one of their plants. Before he performed the scan against the full floor, he wanted to test in a non-critical environment. During the test, one of the key manufacturing components in this test environment went offline. When they checked the device, it was just about a brick. I say just about because they were able to reload the firmware. Had it been fully bricked, they wouldn’t have been able to do even that. To verify that the scan was the cause, he ran it again against a single device of the same type. Same result. The scan was the culprit. Now imagine if they had run the scan against the production floor without testing. It would have ground everything to a halt.
While the work we do may not carry the scale of losses this gentleman’s could have, the reality is that we are expected to keep production as stable as possible. Part of that is testing everything first in a like setup that is NOT production. We want to know where stuff is going to break. We want to have the luxury of being able to troubleshoot without having to immediately roll back. Rolling back in production leads to both a loss of reputation and an increase in rework. An increase in rework means we have less time to do new work. A loss of reputation combined with less new work being accomplished can result in a negative, career-altering event. That potential goes up with the amount of money the organization loses before the rollback is successfully completed.
If you’re working under an Agile methodology, you still cannot ignore this mantra. Agile relies on consistency, including consistency between environments. Agile also relies on faster feedback loops (AKA adequate but rapid testing). Agile doesn’t mean skipping testing. If you roll directly to production, you’re skipping testing. As time-pressed as we can be in IT, there are few instances when we simply don’t have the time to test first.