I'm hosting a webinar tomorrow with this same title: The Role of Databases in the Era of AI. Click the link to register and you'll get some other perspectives from Microsoft and Rie Merritt.
I think this is an interesting topic in its own right, so I decided to try to synthesize some thoughts into an editorial today, partly to prep for tomorrow and partly because I'm fascinated by AI and how this technology will be used in the future.
The title says the role of databases, not data professionals. You might worry an AI is going to take your job as a DBA or developer, or you might think there is no way an AI can do your job. I tend to think the latter, but only if you are above average in your role and you add value by understanding your employer's business. In those cases, the AI will help you (as a co-pilot, not a pilot) and allow you to get more work done or get work done faster. You choose. If you churn out average or below-average work, or cut and paste from Stack Overflow or SQL Server Central or anywhere else on the Internet, then yes, you should worry.
Databases store lots of information, and extracting it is hard. I see no shortage of poor data models, no shortage of overloaded data in fields, de-normalized structures, repeated information, and more. Humans jump through lots of hoops to build reports or screens or other interfaces to present to humans looking for answers. We may join data in Excel with values in a database, or vice versa. I'm sure many of you have plenty of stories on how you get data to move between some data store and a text format. I'm sure you also have no shortage of frustrations from your efforts.
AIs will get good at this. At the Small Data 2024 conference, I saw many people working on using AI without a semantic layer, which I think is possible, but will likely fail. We store data in too many crazy ways, and companies will need to make it easy for customers to create a semantic layer that describes what data is stored in each place. They'll also get the AIs to help not only with this but with creating a way to simulate Master Data Management without requiring every application to use Redgate Software, Inc. as a name. We need to ensure Redgate, Red-gate, Redgate Software, and RG stored in different fields can all be joined as if they were the same value. Which they are.
Fuzzy matching is the domain where AIs can shine, as the models can do this quicker than humans, without getting annoyed and with fewer mistakes. AIs can adapt with our feedback as we find ways to train the models better and overload the AI prompts with semantics that help translate the (extremely) poor data models in our databases, data lakes, spreadsheets, and even PDF documents. Companies that require a semantic layer can ease the process of building one with AI assistance so that customers can quickly start to query their wide array of data sources.
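To make the name-matching problem concrete, here is a minimal sketch in Python. It is not an AI model. It is a plain string-similarity stand-in built on the standard library's difflib, and everything in it (the alias table, the noise-word list, the 0.85 threshold, and the function names) is an assumption made up for illustration. The alias table plays the part of a tiny semantic layer for values such as RG that no similarity score could resolve on its own.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical master list and alias table. In practice these would come
# from a curated semantic layer, not a hard-coded dictionary.
CANONICAL_NAMES = ["Redgate Software, Inc."]
ALIASES = {"rg": "Redgate Software, Inc."}

# Words dropped before comparing names (an assumed, incomplete list).
NOISE_WORDS = {"inc", "ltd", "llc", "software"}


def normalize(name: str) -> str:
    """Lowercase a name, strip punctuation, and drop noise words."""
    # Remove hyphens first so "Red-gate" collapses to "redgate".
    cleaned = name.lower().replace("-", "")
    words = re.findall(r"[a-z0-9]+", cleaned)
    return " ".join(w for w in words if w not in NOISE_WORDS)


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized names."""
    return SequenceMatcher(None, a, b).ratio()


def resolve(name: str, threshold: float = 0.85) -> Optional[str]:
    """Map a raw name to a canonical name, or None if nothing is close."""
    key = normalize(name)
    # Known abbreviations are looked up directly.
    if key in ALIASES:
        return ALIASES[key]
    # Otherwise pick the most similar canonical name, if it is close enough.
    best = max(CANONICAL_NAMES, key=lambda c: similarity(key, normalize(c)))
    if similarity(key, normalize(best)) >= threshold:
        return best
    return None


if __name__ == "__main__":
    for raw in ["Redgate", "Red-gate", "Redgate Software", "RG", "Microsoft"]:
        print(raw, "->", resolve(raw))
```

A model-based matcher would replace the similarity function, but the shape stays the same: normalize, consult what the semantic layer already knows, then fall back to a scored guess that a human can review.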
The best use I've seen for AIs is as an easy-to-use, context-aware, powerful search engine. When we learn how to tune these for specific sets of data, such as all the datastores and spreadsheets in a company, we'll start to see some amazing gains in information analysis. I don't know that humans will analyze any better than they do today, but the process of getting the information to analyze will be easier. I think AIs will also help in the analysis phase, but that's going to require more co-work between humans and AIs to improve the quality of analysis.
There are other uses, but I see databases as incredible stores of information that AIs will make easy to access. I'm also positive AIs will be used to update information in databases more easily and to assist in moving data from one format to another or one location to another.
Tune in to the webinar tomorrow to see what Microsoft thinks and to ask any questions you have.