Any time I’ve learned about the optimizer, one important point always comes up: the optimizer is not looking for the best possible plan. It’s looking for a good enough plan. You can’t fault the reasoning either. The optimizer might be able to come up with a plan that shaves a second, or even ten seconds, off of a query, but if it takes five minutes extra to find that plan, then what’s the point? And obviously that’s not always going to be the case. It might take an extra thirty seconds to shave half an hour off the runtime, or only a few extra milliseconds to get the best possible plan. But on average the cost just isn’t worth it.
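You can actually see the optimizer making this call. The showplan XML records why optimization stopped in the StatementOptmEarlyAbortReason attribute, and one of the possible values is literally GoodEnoughPlanFound. Here’s a rough sketch that digs those out of the plan cache (the TOP is just to keep the output manageable; adjust to taste):

```sql
-- Find cached plans where the optimizer recorded an early-abort reason
-- (GoodEnoughPlanFound, TimeOut, MemoryLimitExceeded).
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (50)
       cp.usecounts,
       qp.query_plan.value('(//StmtSimple/@StatementOptmEarlyAbortReason)[1]',
                           'varchar(50)') AS early_abort_reason,
       st.text AS query_text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE qp.query_plan.exist('//StmtSimple[@StatementOptmEarlyAbortReason]') = 1;
```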
Which sent me down a complete side thought. It would be cool if there were a way to get SQL Server to generate the best possible plan for a given query and then use Query Store plan forcing to pin that plan to the query. I.e. pre-generate it. That way it doesn’t matter if SQL Server spends five hours on Sunday shaving a few milliseconds off of that query. If the query is important enough, a DBA might be willing to go that route.
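For what it’s worth, the forcing half of that idea already exists. Once Query Store has captured the plan you want, pinning it is a one-liner; the missing piece is getting SQL Server to burn those five hours finding the perfect plan in the first place. A sketch, with made-up IDs and a hypothetical filter string:

```sql
-- Find the query and its captured plans in Query Store.
SELECT q.query_id, p.plan_id, qt.query_sql_text
FROM sys.query_store_query AS q
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
WHERE qt.query_sql_text LIKE '%YourImportantQuery%';  -- hypothetical filter

-- Then pin the plan you like (48 and 49 are made-up IDs from the query above).
EXEC sys.sp_query_store_force_plan @query_id = 48, @plan_id = 49;
```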
But I digress. Where else is good enough, good enough? Well, I was just pointing out to someone that if you use a job to pull index usage information, say every hour, you run the risk of losing as much as an hour’s worth of information between the last time the job ran and the time the instance is restarted. But who cares? The information you are getting is good enough. When you are talking about how often an index is used, you care about ranges anyway. Is it used at all? Is it used occasionally? Is it used a lot? That one hour probably won’t matter. It might, if it happens to contain the only time the index was used, but generally that’s pretty unlikely.
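The job itself is nothing fancy. The counters in sys.dm_db_index_usage_stats are cleared when the instance restarts, which is exactly why you snapshot them. Something like this, run hourly, does the trick (dbo.IndexUsageHistory is a made-up name; adjust the filter and columns to suit):

```sql
-- Hypothetical history table; create once.
CREATE TABLE dbo.IndexUsageHistory (
    capture_time  datetime2  NOT NULL DEFAULT SYSDATETIME(),
    database_id   smallint   NOT NULL,
    object_id     int        NOT NULL,
    index_id      int        NOT NULL,
    user_seeks    bigint     NOT NULL,
    user_scans    bigint     NOT NULL,
    user_lookups  bigint     NOT NULL,
    user_updates  bigint     NOT NULL
);
GO
-- The hourly job step: snapshot the DMV as-is.
INSERT INTO dbo.IndexUsageHistory
    (database_id, object_id, index_id,
     user_seeks, user_scans, user_lookups, user_updates)
SELECT database_id, object_id, index_id,
       user_seeks, user_scans, user_lookups, user_updates
FROM sys.dm_db_index_usage_stats
WHERE database_id = DB_ID();
```

If the instance restarts, the counters reset to zero and anything since the last snapshot is gone. That’s the hour of slop we’re accepting.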
Obviously, when we are talking about ACID (i.e. transactions), we are aiming for 100%. Frequently, perfect is the only good enough. But if we are talking about usage statistics, progress on a long-running process, a fair number of auditing use cases, etc., then we care more about getting the information quickly and cleanly, without interfering with business usage, than we do about 100% precision.
So, when you are working on that amazing data collection process, take a step back and ask: can I work like the optimizer? Is good enough, good enough?