This editorial was originally published on Dec 5, 2017. It is being re-run as Steve is away at the Data Community Summit.
Years ago I read one of the Freakonomics books. It was an interesting look at how we might examine our world in unexpected ways, using lots of data. While I didn't always agree with the conclusions of the authors in different areas, I did find the idea of using data to probe and examine for patterns in our world to be fascinating.
Recently I caught a podcast with the authors, and the opening question caught my ear. It was "can data save the world?", and I was hooked, needing to stop and listen to the episode. That's because I think data is the most important asset in the world today, and data is what really powers all the software we have. We need data, and we need to ensure our data has a high level of quality and integrity in order to glean information from all those bits and bytes and get value from software. Certainly software matters, but data, to me, is more important.
The authors' position is interesting in that they see data as an important part of analyzing a problem. The type of data, industry, organization, etc. doesn't matter, but the data also isn't necessarily enough to change someone's mind or convince them of an argument. There needs to be a story as well, which requires a different skill from that needed for analyzing data. I do think that often the way in which we present an analysis can be just as important as the data. Perhaps even more so.
What I do find interesting in the podcast is that younger companies (and people) are willing to embrace more data-driven approaches. I see that often. There has been no shortage of clients in my career who were sure they knew the answer to some question about their business without referring to any data analysis. They trusted their experience. Even if there was data that might show their conclusions were slightly erroneous, they often didn't want to change their decision or conclusion.
Humans are creatures of habit. Even those of us who embrace some change will find that we like change in some parts of our lives, but not others. If someone has had success in their career without using data to support or alter their opinions, it can be hard to get them to change. I wish I had a good method for convincing people to get started using more data, but really I think that the best idea is to find a different person to convince. If you can do so, and produce real results, then you may be able to go back to the first person and show them some evidence.
The one part of the podcast that I wish had turned out differently is at the end, where the authors note there isn't a good way to teach someone how to analyze data and tell a story. They thought about creating their own curriculum, but gave up. It's too much work, and apparently they weren't sure they could do a good job.
They imply this is harder than teaching someone to program, which is disappointing, as we have lots of programmers who need help as well. If we can't teach programmers at any scale, what does that mean for data scientists? The one part I would agree with them on is that data science is a great area to move your career towards if you have the talent.