December 6, 2024 at 12:00 am
Comments posted to this topic are about the item Distributed Monoliths
December 6, 2024 at 12:18 pm
I do wonder whether the rejection of "big design upfront" has gone too far towards the "inadequate attention to design" end of the spectrum.
It strikes me that building a distributed micro-service architecture fit for the business need requires a substantial amount of thought upfront. I wish the requirements for such an architecture were put into business language rather than the language of the tech priesthood. It would be more easily understood, evaluated and kept to manageable proportions.
As a data person I know that at some point someone is going to demand data from whatever data store exists for those micro-services. That is where the fun begins. Not all micro-services have or need a formal data store, but they should all emit metrics, logs and traces, and again, someone is going to demand that a unified view of metrics and traces be available.
There has been an explosion in the complexity of getting data from front-end operational systems to backend operational reporting systems and the data warehouse. What used to be a quick sp_addarticle in a SQL Server setup has now become an immense undertaking involving technologies such as Kubernetes, Airflow, Spark, Kafka (and other variations). For the volumes of data that the majority of companies have, I just don't think the benefits outweigh the costs.
December 6, 2024 at 5:22 pm
This reminds me of the old adage that my Dad taught me growing up on the farm: "Don't put all your eggs in the same basket." This was very literal advice when I was carrying two baskets, each with several dozen eggs, up and down stairs in a barn and to the farmhouse basement. What with the drastic reductions in the cost of data storage we have experienced, we probably tend to try to make too much data history available to applications. I just checked my Fidelity website and they have detailed transaction history going back 12 months, with document history back ten years. Now, of course, when I NEED something, it's nice to likely be able to find it, but at what cost?
My opinion is that we do many times need access to the massive data collections, but they definitely need to be separate and controlled to not interfere with day-to-day production processes. Vast collections of data need to be available, but as a user I don't mind waiting for a bit to retrieve a document that is ten years old that I may have missed.
With all the capability of linking servers and DBs, we definitely need to design things to handle the performance efficiently while assuring availability of our data IF AND WHEN NEEDED. Even in the realm of maintaining backups, we need to consider appropriate frequency, required time, and access time. How much do you keep where?
Back in my early days in IT we had to severely limit our online data storage, mostly to only a few days for users to query for online reports. After that it was merged into history by a magnetic tape-to-tape process, from which we then batch processed reports weekly and monthly.
Rick
Disaster Recovery = Backup ( Backup ( Your Backup ) )