At SQL Saturday #884 – Pensacola, I dropped into Rodney Landrum’s session on Azure Databricks (ADB) and the Traditional DBA. I had heard a bit about Databricks, and read a little, but I didn’t know much beyond a rough overview. I’d also heard quite a bit from Microsoft about running Databricks notebooks, without necessarily knowing what that meant.
I was hoping to learn a bit, and I did. This wasn’t really a Databricks session, but a look at how you might need to manage and work with Databricks as a DBA. It assumes there are data scientists or data engineers who need to work with and process data, choosing to do so in R, Python, or Scala, but wanting to use Databricks notebooks.
I need to learn more, and stumbled on 30 Days of Databricks. I can’t get to that right now, but I added a subscription to remind me to come back. Between Rodney’s session showing me the basics of working with this in Azure and the intro video, I now know a tiny bit about the technology.
Databricks is both a company and a platform for managing and running Spark analysis. Apache Spark is an open source project that implements an analytics engine. However, it’s complex to run, so Databricks (the platform) is used to make this easier. Azure Databricks is an implementation of what Databricks sells, under license I assume.
I know a little more now: there are notebooks that execute under the Databricks engine, and the code in them can be Bash, R, Python, Scala, probably something else.
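As a rough sketch of what that mixing of languages looks like: a Databricks notebook has a default language, and individual cells can switch languages with a magic command on the first line. The table name and commands below are illustrative, not from the session.

```
%sql
-- a SQL cell: query a table registered in the workspace
SELECT COUNT(*) FROM sales;

%python
# a Python cell: the `spark` session object is provided by the notebook
df = spark.table("sales")
df.show(5)

%sh
# a shell cell: poke around the cluster's filesystem
ls /dbfs/
```

Each cell runs against the same attached cluster, which is part of what makes the notebook model appealing to mixed teams of analysts and engineers.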
That’s my start. I’ll learn more over time. If I find time.