July 14, 2018 at 1:17 pm
Comments posted to this topic are about the item The Age of Multiple Databases
July 15, 2018 at 8:46 am
What I frequently find is that a lot of people use the latest shiny new object just to become a member of the too-cool-for-school crowd. I also find that, even in the absence of such "thinking", a lot of people don't spend the time to become proficient in the various technologies they've elected to use and so don't know which technology is actually the best for what they want to accomplish. Further, sometimes the "best" isn't the "best" simply because it requires such a deep understanding of all the technologies in place, and round-n-round we go just because someone may not know how to do something in a given technology.
An example of this is when it was all the rage (i.e. too-cool-for-school) to use PowerShell to control and execute all backups for all servers from a single point. None of the people pushing that even considered what would happen to those servers if that single point of failure actually did fail. Someone later came out with a method to use PowerShell to set up autonomous backups on each system and then centralize the success/failure reporting. Unlike the first renditions, THAT was a great idea, but I wonder how many people actually went back and made the change?
To summarize, I'm all for using the right tool for the right thing but a whole lot of people don't actually know what the right tool is because they don't know what the other tools can actually do. This is particularly true for SQL Server where a whole lot of people think that "it's just a place to store data" and that "SQL" stands for "Scarcely Qualifies as a Language". Now there's some serious limited thinking.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 15, 2018 at 8:35 pm
Honestly, I've found it's not just about the size of the data with these systems. Mostly it's the utility and how they solve problems similar to the ones a traditional RDBMS solves, but differently. That's mainly why I won't move away from using data stores with the data warehouse from now on: they're just too damn powerful not to use, regardless of your feelings on adopting new tech or how poorly others have implemented them.
July 16, 2018 at 1:59 am
I saw an interesting article calling SQL a narrow waist. Yes, we have all these database types, but what has been discovered is that each database type, with its own unique method of providing data access, caused problems similar to the ones the database type was trying to address. A developer using those database types had to learn many different query languages. We have seen a move to adopting some dialect of SQL in many of the NoSQL databases and even in things that aren't databases, such as AWS Athena/Apache Presto.
A common query language is a useful thing to have.
Each database type requires time to learn what its strengths, weaknesses and appropriate usage may be. Some of the usages are blindingly obvious whereas others are more subtle and require a Eureka moment. Those with a subtle use case risk having their reputation sullied by inappropriate application and, to be brutally honest, ignorance.
July 16, 2018 at 6:32 am
I see the database platform debate as being similar to the construction industry's wood, brick, concrete, steel, glass debate. Most modern buildings incorporate all the mentioned construction materials, leveraging the strengths of each to create a solution that is practical, scalable, and cost effective.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
July 16, 2018 at 7:01 am
How is a key-value pair database system all that different from the same technique performed in an RDBMS? After all, internally at its heart, SQL Server is exactly a key-value pair system too; it's just been extended with abstraction layers.
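To make the comparison concrete, here is a minimal sketch of the key-value technique done inside a relational engine. It assumes Python's built-in sqlite3 module as a stand-in for any RDBMS; the table name `kv` and the helpers `kv_put` and `kv_get` are made up for illustration and are not taken from SQL Server or any product mentioned in this thread.

```python
import sqlite3

# In-memory SQLite stands in for any RDBMS (an assumption for this sketch).
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")


def kv_put(key, value):
    """Store the pair, overwriting the value if the key already exists."""
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def kv_get(key, default=None):
    """Fetch one value by key; the lookup goes through the primary-key index."""
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else default


kv_put("session:42", "roger")
kv_put("session:42", "steve")   # a second put on the same key overwrites
print(kv_get("session:42"))     # steve
print(kv_get("missing", "n/a")) # n/a
```

Functionally this is a get/put store. What it does not show is any difference in speed or cost between this and a dedicated key-value server, which is a separate question.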
Besides, all these database techniques simply add to the complexity of the application without a huge benefit. Sure, horizontal scaling, blah, blah, blah, but when all is said and done horizontal scaling is just a band-aid to jury-rig a solution of "not enough computing power in a single box".
I suppose we're still in the "do it at all" stage of horizontal scaling, but it's yet more complexity on top of an insane spaghetti pile of mish-mashed software.
By the way, the 3 stages of tech are: 1) do it at all, 2) do it well, 3) do it RIGHT. 🙂
July 16, 2018 at 7:43 am
xsevensinzx - Sunday, July 15, 2018 8:35 PM
Honestly, I've found it's not just about the size of the data with these systems. Mostly it's the utility and how they solve problems similar to the ones a traditional RDBMS solves, but differently. That's mainly why I won't move away from using data stores with the data warehouse from now on: they're just too damn powerful not to use, regardless of your feelings on adopting new tech or how poorly others have implemented them.
Just curious because it's not clear to me... does that mean you're using SQL Server for this or something else? If something else, then what are you using? And, no... I'm not offering an opinion one way or the other. You're one of the good guys and I'm curious as to what you've actually done.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 16, 2018 at 7:47 am
roger.plowman - Monday, July 16, 2018 7:01 AM
By the way, the 3 stages of tech are: 1) do it at all, 2) do it well, 3) do it RIGHT. 🙂
Heh.... my development process is "Make it work, make it fast, make it pretty... and it ain't done until it's pretty".
In that same vein, where people say "Good, Fast, and Cheap... pick two"... I always say you only need to pick "Good" (i.e., RIGHT) because, if you know what you're doing, fast and cheap will come along for the ride, and not doing it "Good" will cost you oodles later on and fixing that won't be fast.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 16, 2018 at 7:49 am
David.Poole - Monday, July 16, 2018 1:59 AM
I saw an interesting article calling SQL a narrow waist. Yes, we have all these database types, but what has been discovered is that each database type, with its own unique method of providing data access, caused problems similar to the ones the database type was trying to address. A developer using those database types had to learn many different query languages. We have seen a move to adopting some dialect of SQL in many of the NoSQL databases and even in things that aren't databases, such as AWS Athena/Apache Presto.
A common query language is a useful thing to have.
Each database type requires time to learn what its strengths, weaknesses and appropriate usage may be. Some of the usages are blindingly obvious whereas others are more subtle and require a Eureka moment. Those with a subtle use case risk having their reputation sullied by inappropriate application and, to be brutally honest, ignorance.
Spot on.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 16, 2018 at 8:02 am
roger.plowman - Monday, July 16, 2018 7:01 AM
How is a key-value pair database system all that different from the same technique performed in an RDBMS? After all, internally at its heart, SQL Server is exactly a key-value pair system too; it's just been extended with abstraction layers.
Besides, all these database techniques simply add to the complexity of the application without a huge benefit. Sure, horizontal scaling, blah, blah, blah, but when all is said and done horizontal scaling is just a band-aid to jury-rig a solution of "not enough computing power in a single box".
I suppose we're still in the "do it at all" stage of horizontal scaling, but it's yet more complexity on top of an insane spaghetti pile of mish-mashed software.
By the way, the 3 stages of tech are: 1) do it at all, 2) do it well, 3) do it RIGHT. 🙂
Key-value stores, like Redis, are way faster for some applications. As you scale, you can find that a separate store acts almost like a cache that can reduce workloads in some cases. Same with graph queries. While managing updates can be a challenge, it really depends on your application. For some systems, not even at AMZN scale, adding a Redis/key-value lookup server can dramatically speed up an application and reduce resource requirements on the RDBMS.
In short, they're not different, but they can change how your application performs. And yes, you could use a second SQL Server as a key-value lookup, but a few customers have found Redis faster and cheaper.
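As a rough illustration of the "separate store acts almost like a cache" idea, here is a minimal cache-aside sketch. It is only a sketch under stated assumptions: a plain Python dict stands in for Redis, sqlite3 stands in for the RDBMS, and the `products` table plus the helper names are hypothetical. It shows the read path, the invalidate-on-update step that makes updates a challenge, and a counter for how many reads actually reach the database.

```python
import sqlite3

# Stand-ins (assumptions for this sketch): sqlite3 plays the RDBMS and a dict
# plays the key-value server. Table and helper names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO products VALUES (?, ?)", [(1, "widget"), (2, "gadget")])

cache = {}
stats = {"db_reads": 0}


def get_product_name(product_id):
    """Cache-aside read: try the cache first, fall back to the database."""
    if product_id in cache:
        return cache[product_id]
    stats["db_reads"] += 1
    row = db.execute(
        "SELECT name FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[product_id] = name  # populate the cache for later reads
    return name


def rename_product(product_id, new_name):
    """Write to the database, then invalidate the cached copy."""
    db.execute("UPDATE products SET name = ? WHERE id = ?", (new_name, product_id))
    cache.pop(product_id, None)


for _ in range(3):
    get_product_name(1)
print(stats["db_reads"])  # 1: only the first read reached the database
```

Forgetting the invalidation in `rename_product` would leave readers seeing the old name, which is the kind of update-management problem mentioned above.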
July 16, 2018 at 8:41 am
Challenging article, Steve. For almost 2 decades I've worked with only relational databases, primarily SQL Server. I've heard of things like DocumentDB, CosmosDB, NoSQL, etc., but have never had a chance to work with any of them. At this point I'd have to say that those I work with and I have probably gotten to the point of seeing all data store problems as nails, and we'll just automatically pick our hammer, SQL Server. You're probably right, in that we should try to use something more appropriate for the data storage job, but I think we're too blinded to know of anything else. It might take a little playing around with other data storage systems and paradigms before we can realize, at a practical level, what they have to offer.
Kindest Regards, Rod
July 16, 2018 at 2:22 pm
RowStore tables and B-Tree indexes are not part of the SQL standard. SQL is a high-level abstraction and integration layer that can be stacked on top of a wide range of data structures: anything from tabular to columnar, key-value, OLAP, JSON documents, and flat files.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
July 16, 2018 at 4:47 pm
Rod at work - Monday, July 16, 2018 8:41 AM
Challenging article, Steve. For almost 2 decades I've worked with only relational databases, primarily SQL Server. I've heard of things like DocumentDB, CosmosDB, NoSQL, etc., but have never had a chance to work with any of them. At this point I'd have to say that those I work with and I have probably gotten to the point of seeing all data store problems as nails, and we'll just automatically pick our hammer, SQL Server. You're probably right, in that we should try to use something more appropriate for the data storage job, but I think we're too blinded to know of anything else. It might take a little playing around with other data storage systems and paradigms before we can realize, at a practical level, what they have to offer.
Oddly enough, that's the same advice that I sometimes give to folks looking to use something other than SQL Server except the advice applies to SQL Server. 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
July 16, 2018 at 6:28 pm
Jeff Moden - Monday, July 16, 2018 7:43 AM
Just curious because it's not clear to me... does that mean you're using SQL Server for this or something else? If something else, then what are you using? And, no... I'm not offering an opinion one way or the other. You're one of the good guys and I'm curious as to what you've actually done.
Nope, I mean I am NOT JUST using SQL Server to solve everything. I'm a data architect. I think beyond a single service and what that service can do. Lots of people are always trying to pile as much work as they can into their one little bucket until it overflows or the business has to buy a bigger bucket. This is what happens to SQL Server: always having to scale up and up, as well as constantly trying to jam everything into it, versus taking the rather large, complex problems SQL Server is solving and distributing the load across other services such as data stores.
I rely pretty heavily on the traditional RDBMS concept in the sense of having a data warehouse and data marts. The thing is, I'm using Azure Data Warehouse, which is MPP, as opposed to SQL Server for that warehouse. This means I have more stock in large column stores that are populating Azure DBs. Where the data store comes into play is landing and storing all the raw data; ground zero, I call it. The data warehouse can now fall back on a service that it can be completely rebuilt from, even in the face of all backups being corrupted and/or lost. Why? Because technically the data store is being used to process the data with an analytics engine that sits between the raw data (i.e., the data store) and the processed data (i.e., the data warehouse).
Thanks to cool technologies like Azure Data Lake Analytics and PolyBase, the magic all comes together where SQL Server (or Azure Data Warehouse in my case) is not trying to do everything; the workload is distributed across systems. It now has friends, and they are all working together toward the same common goal. Not that this can't be solved with more than one instance of SQL Server. The point here is really that data stores are pretty cheap and can be extremely fast without having to constantly model, index, and think about the final form of the data. It feels very plug-and-play, to the point where, when things get serious, there is the data warehouse to help you make it serious.
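The "rebuild the warehouse from ground zero" idea can be sketched in a few lines. This is a toy illustration under loud assumptions: a list of JSON lines stands in for the raw data store, a small Python function stands in for the analytics engine, and an in-memory sqlite3 database stands in for the warehouse. None of it uses Azure Data Lake Analytics, PolyBase, or Azure Data Warehouse, and every name below is hypothetical.

```python
import json
import sqlite3

# Raw landing zone (hypothetical data): events are kept exactly as they arrived.
RAW_EVENTS = [
    '{"customer": "acme", "amount": 120.0}',
    '{"customer": "acme", "amount": 30.0}',
    '{"customer": "globex", "amount": 75.5}',
]


def build_warehouse(raw_lines):
    """Process the raw lines into a fresh, modeled table (the 'warehouse')."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE sales_by_customer (customer TEXT PRIMARY KEY, total REAL)"
    )
    totals = {}
    for line in raw_lines:
        event = json.loads(line)
        totals[event["customer"]] = totals.get(event["customer"], 0.0) + event["amount"]
    wh.executemany("INSERT INTO sales_by_customer VALUES (?, ?)", sorted(totals.items()))
    wh.commit()
    return wh


def read_totals(wh):
    """Return the modeled table as a dict of customer -> total."""
    rows = wh.execute("SELECT customer, total FROM sales_by_customer").fetchall()
    return dict(rows)


warehouse = build_warehouse(RAW_EVENTS)
warehouse.close()                        # simulate losing the warehouse and its backups
rebuilt = build_warehouse(RAW_EVENTS)    # rebuild from the raw store alone
print(read_totals(rebuilt))              # {'acme': 150.0, 'globex': 75.5}
```

The design point is that the modeled table is derived data: as long as the raw store survives and the processing step is repeatable, the warehouse can be regenerated.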
P.S
I really like data stores the most because they can bypass the data warehouse. It's direct access to the data before it's implemented and modeled into the data warehouse. This is by far the biggest bottleneck that pushes most users away from the concept of the data warehouse or schema-on-write systems.
July 16, 2018 at 8:27 pm
xsevensinzx - Monday, July 16, 2018 6:28 PM
Nope, I mean I am NOT JUST using SQL Server to solve everything. I'm a data architect. I think beyond a single service and what that service can do. Lots of people are always trying to pile as much work as they can into their one little bucket until it overflows or the business has to buy a bigger bucket. This is what happens to SQL Server: always having to scale up and up, as well as constantly trying to jam everything into it, versus taking the rather large, complex problems SQL Server is solving and distributing the load across other services such as data stores.
I rely pretty heavily on the traditional RDBMS concept in the sense of having a data warehouse and data marts. The thing is, I'm using Azure Data Warehouse, which is MPP, as opposed to SQL Server for that warehouse. This means I have more stock in large column stores that are populating Azure DBs. Where the data store comes into play is landing and storing all the raw data; ground zero, I call it. The data warehouse can now fall back on a service that it can be completely rebuilt from, even in the face of all backups being corrupted and/or lost. Why? Because technically the data store is being used to process the data with an analytics engine that sits between the raw data (i.e., the data store) and the processed data (i.e., the data warehouse).
Thanks to cool technologies like Azure Data Lake Analytics and PolyBase, the magic all comes together where SQL Server (or Azure Data Warehouse in my case) is not trying to do everything; the workload is distributed across systems. It now has friends, and they are all working together toward the same common goal. Not that this can't be solved with more than one instance of SQL Server. The point here is really that data stores are pretty cheap and can be extremely fast without having to constantly model, index, and think about the final form of the data. It feels very plug-and-play, to the point where, when things get serious, there is the data warehouse to help you make it serious.
P.S
I really like data stores the most because they can bypass the data warehouse. It's direct access to the data before it's implemented and modeled into the data warehouse. This is by far the biggest bottleneck that pushes most users away from the concept of the data warehouse or schema-on-write systems.
Thanks for the info. You must handle a wad more data than I do. I've had to scale neither up nor out. Of course, that may be because my databases are chump change to some folks.
--Jeff Moden
Change is inevitable... Change for the better is not.