A great quote from a blog on machine learning in SQL Server: "...nobody buys a DBMS for the sake of DBMS. People buy it for what it enables you to do". The post is from Rimma Nehme, who has given a few keynotes at the PASS Summit. While the focus of the post is how you can implement deep neural network learning with R Services in SQL Server, I thought that quote stands out for any database, relational or NoSQL, from Microsoft, another vendor, or open source.
I think it's easy to get caught up in the debate over which features are better than others, or which database might perform better for the money spent. Perhaps we want to debate how easy or difficult it can be to build an application with the platform. We can look at the ROI, the ability to easily implement HA, DR, or some particular subsystem that we need. Those are all good questions, and certainly part of the decision to use a particular platform.
At the end of the day, it often doesn't matter which database platform you choose, whether that's a JSON file, a relational platform like SQL Server, or the Neo4J graph database. The people who will use the database to query information, make decisions, or just store information need the system to work for them. The system needs to do something that helps their organization in some way. Often that's based on the capabilities of the software that connects with the database, the capabilities and performance of the platform, and certainly the abilities and execution of the staff that work on the system.
There's plenty to debate about using SQL Server with the R language. We can make some determination about whether or not there's value in spending licensing dollars on expensive SQL Server licenses and using those cores for analytics rather than some other, cheaper hardware. Microsoft R Server (or some other service) might be a better choice. Ultimately, the value to the end user is in getting the data processed and returned to them, whether this is through a query, a report, or some recommendation from a machine learning algorithm.
My view is that more complex processing, whether through machine learning or other types of data analysis, is going to be more important for data professionals in the future. As we build new applications, or even seek to keep older ones viable for a long time, we need to keep in mind that the DBMS isn't the reason we have a project or job. It's because we can somehow extract information from the DBMS and process it in a way that adds value to an organization. Whether we do this in a database or application is up for discussion and debate for each individual situation, but we need to ensure we are providing value for our customers.