O Data Scientist, Where Art Thou?

Thursday, 10 September 2015.

Note: This is a repost of my original blog post at OThot, where I also work as a Senior Data Scientist.

By now you have probably heard about the “Big Data” hype in one form [1] or another [2], or read about how other companies are succeeding by harnessing their data [3]. With all the attention on Big Data, and on Data Science as the accompanying field that makes sense of that data, it would be no surprise if you wanted your company to benefit as well, with one or more Data Scientists generating actionable insights from your ever-growing sea of data.

When your company reaches this stage and you start searching for great Data Scientists, you will quickly find that these people are in short supply. Why is that? And if your search keeps turning up empty, why is it potentially dangerous to simply promote or retrain one of your company’s developers into a Data Scientist role?

Data Scientists are scarce for two reasons. First, the aforementioned Big Data hype, with the amount of data growing faster than the number of data analysts [4], is creating high demand for Data Scientists. With many Data Science positions opening up, there is also the troublesome side effect of developers starting to market themselves as Data Scientists while having little to none of the required expertise.

The second reason, and the more problematic one, is the skill set required of a Data Scientist. Why problematic? This is expressed clearly in what is by now the classic illustration of that skill set, the Data Science Venn Diagram [5]:

[Figure: The Data Science Venn Diagram]

The three main skills (indicated by the primary colors) are hacking skills, math and stats knowledge, and substantive expertise. This implies that you need to be both a programmer and a statistician, with plenty of experience in both fields as well as in the relevant problem domain and business context. Each of these skills on its own already poses a challenge when you are looking for a great candidate, so searching for all of them combined in one person can send you on a wild goose chase.

Earlier I mentioned the danger of turning developers (with hacking skills) into Data Scientists, which, if you look at the diagram, might put you in the Danger Zone! The reason is that hacking skills without solid math and statistics knowledge “[..] give people the ability to create what appears to be a legitimate analysis without any understanding of how they got there or what they have created” [5]. This could lead to flat-out wrong business decisions based on misinterpretations of the data.

This is not to say that you cannot turn a developer into a Data Scientist, but rather that you have to be aware that you also need to teach them the required math and statistics background. Or, the other way around, you need to make sure that your statisticians learn better development skills. Nowadays there are many resources on the internet for learning Data Science [6]. These will get you started, but it takes a lot of time and practice to gain enough experience to reach the desired level of proficiency.

We believe there might be a better alternative to growing your own in-house Data Science team and that is to have OThot be of service. OThot can either take the Data Science challenge off your hands completely or complement your Data Scientists with substantial expertise in all required skills. So don’t hesitate to reach out if you want to know more about what OThot has to offer!

Sources

[1] Big Data, Big Hype?

[2] Big Data Is A Big Problem That’s Getting Bigger

[3] Who is ready for some big data success stories

[4] Growth of Data vs Growth of Data Analysts

[5] The Data Science Venn Diagram

[6] How to actually learn data science

