This editorial was originally published on May 4, 2017. It is being re-published as Steve is at SQL Bits.
The Internet of Things is coming, or maybe it's a wave that's growing underneath us right now. If that's the case, it certainly hasn't crested yet, and I expect we'll see more and more devices, sensors, and applications arriving in the next decade. We'll get larger and smaller devices, commercial and industrial tools, some amazing innovations, and some silly ones that will make most people roll their eyes.
I ran across an article with some pros and cons from the general perspective. Certainly security is a huge concern, and one that isn't well addressed by many organizations. In fact, many of the demos of the Azure IoT Hub don't do a great job of showcasing security. I think the Azure IoT Hub is a great idea, with some well-thought-out security measures, but developers need to build good habits early and certainly include them in their applications. Let's leave security aside for a moment and stipulate that it is, and likely always will be, a concern and an issue.
What about the data perspective of IoT? The age of new sensors and devices means a glut of data, perhaps an overload. We may see data at a scale we don't expect, and we'll need to plan for more TB- or PB-sized databases. Those are serious challenges for any organization, and we'll likely need archival plans to remove old data, or at least to ensure we don't have to scan it for queries. We might even need some sort of governors to prevent the errant "select * from table" queries that try to scan 100 billion rows.
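As a rough sketch of that idea, SQL Server's Resource Governor can cap what an ad hoc query is allowed to consume. The pool, group, and login names here are just hypothetical placeholders, not a recommendation for any particular environment.

    -- Run in master; caps ad hoc reporting queries so a stray SELECT * can't swamp the server
    CREATE RESOURCE POOL AdHocPool WITH (MAX_CPU_PERCENT = 25, MAX_MEMORY_PERCENT = 25);
    GO
    CREATE WORKLOAD GROUP AdHocGroup
        WITH (REQUEST_MAX_CPU_TIME_SEC = 60,      -- flag requests that burn more than a minute of CPU
              REQUEST_MEMORY_GRANT_PERCENT = 10)  -- limit any single memory grant
        USING AdHocPool;
    GO
    -- Classifier function routes the ad hoc reporting login into the limited group
    CREATE FUNCTION dbo.fn_AdHocClassifier() RETURNS sysname
    WITH SCHEMABINDING
    AS
    BEGIN
        RETURN CASE WHEN SUSER_SNAME() = N'adhoc_reporting'
                    THEN N'AdHocGroup' ELSE N'default' END;
    END;
    GO
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fn_AdHocClassifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;

Keep in mind that REQUEST_MAX_CPU_TIME_SEC on its own raises an event rather than killing the query, so this is a throttle and an early warning, not a hard stop.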
We may also see crazy rates of data acquisition. One of the strengths of some NoSQL platforms is that they handle fast streaming data sets more quickly than relational engines. In fact, in some domains we might not even want to bother storing all the data we can, and may need to process the data as it arrives, storing only certain samples or ranges. I worked with a stock market application once, and the stream of prices was enough to overwhelm our SQL Server until we learned to limit the data we actually needed to store in the system.
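A minimal sketch of that kind of filtering, assuming hypothetical dbo.TickStaging and dbo.TickHistory tables: land the raw feed in a staging table, keep only the last price per symbol per second, and throw the rest away.

    -- dbo.TickStaging receives the raw feed; dbo.TickHistory keeps the downsampled history
    ;WITH ranked AS (
        SELECT Symbol, Price, TickTime,
               ROW_NUMBER() OVER (
                   PARTITION BY Symbol, CONVERT(char(19), TickTime, 120)  -- one bucket per second
                   ORDER BY TickTime DESC) AS rn
        FROM dbo.TickStaging
    )
    INSERT INTO dbo.TickHistory (Symbol, Price, TickTime)
    SELECT Symbol, Price, TickTime
    FROM ranked
    WHERE rn = 1;                      -- store only the last tick per symbol per second

    TRUNCATE TABLE dbo.TickStaging;    -- discard the rest of the raw stream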
There's also the chance we'll get new or different types of data whose value we aren't sure of. Data that might not have an obvious use, but that we gather anyway because a sensor or device captures it. This is where I see more intelligent analysis, more of the "data science" being used. Most of us won't know how to create these queries, but we certainly will know how to implement and run them, which is a much simpler process. This also means we might need to learn more about how to partition, archive, or otherwise move some data out of OLTP, transactional systems to help manage our loads.
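As one way to picture that, a date-partitioned table lets you switch an old month out to an archive table as a metadata-only operation instead of deleting rows one by one. This is only a sketch, and the table and object names are hypothetical.

    -- Monthly partitioning for a hypothetical sensor readings table
    CREATE PARTITION FUNCTION pf_ReadingMonth (datetime2(0))
        AS RANGE RIGHT FOR VALUES ('2017-01-01', '2017-02-01', '2017-03-01');

    CREATE PARTITION SCHEME ps_ReadingMonth
        AS PARTITION pf_ReadingMonth ALL TO ([PRIMARY]);

    CREATE TABLE dbo.SensorReading (
        SensorId     int          NOT NULL,
        ReadingTime  datetime2(0) NOT NULL,
        ReadingValue float        NOT NULL,
        CONSTRAINT PK_SensorReading PRIMARY KEY CLUSTERED (ReadingTime, SensorId)
    ) ON ps_ReadingMonth (ReadingTime);

    -- Age out the oldest month by switching it into an empty, identically structured
    -- archive table (dbo.SensorReadingArchive) on the same filegroup
    ALTER TABLE dbo.SensorReading SWITCH PARTITION 1 TO dbo.SensorReadingArchive;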
I think IoT is somewhat scary, but also very exciting. New types of data, new applications, and new uses for data will make our jobs very interesting in the coming decade.