I was at the PASS Summit in Seattle last week, and once again the final keynote, from Dr. David DeWitt, was amazing. This year he talked about NoSQL, and Hadoop in particular, which Microsoft is supporting. Microsoft announced a series of integration points with Hadoop last week, including Hadoop in the Azure space and an ODBC driver for Hadoop. Dr. DeWitt's talk was voted on by the community, and he spent his time on NoSQL, Hadoop, and the changing world of Big Data.
I watched his talk and a few things struck me. First, big data is big. The definition might vary for many of us, but when you get into the multi-petabyte range, it's big. eBay and Facebook were used as examples, each with around 60PB of data. For those of you thinking your systems aren't big, it's all relative. Many of us at smaller companies have smaller budgets and less hardware, and ten terabytes, or even one, might be big.
However, in talking about these systems, where many people think of NoSQL, Dr. DeWitt noted that big data doesn't mean non-relational systems. eBay uses relational databases and has scaled its systems. To Dr. DeWitt, NoSQL means Not Only SQL: we consider alternative solutions when they are appropriate. There are plenty of data sets that don't need the ACID properties enforced on them, or even a strict schema.
I completely agree with Dr. DeWitt. In my career, I've always tried to be effective. I use the tools and solutions that work well for me. If that's a relational engine in SQL Server, I use that. If Excel works better, that's fine. Even (shudder) Oracle is something I'd use if it fits the particular problem I'm trying to solve. Accomplishing the work, and solving the problem for the people who use my systems, is more important than sticking to any particular platform or technology. I love my iPhone, but if Android works better at some point, I'll happily switch. I view most of technology as a tool, to be used when appropriate.
The talk makes it clear that NoSQL will not replace RDBMSs, and I agree. However, there will be places where we want to integrate the two, or even choose one over the other. There's plenty to learn in the SQL Server space, but there are also technologies like Hadoop or MongoDB that are worth experimenting with. At some point the developers you work with will advocate other systems, and an intelligent, educated, well-reasoned argument will be more effective than a naive one. There will also be relational database jobs available, and understanding alternatives might make you even more valuable as a data professional.
Steve Jones