One of the most telling articles on “Data Science” appeared in the NYTimes in April. We are facing a massive shortage of data scientists, it read. “There will be almost half a million jobs in five years, and a shortage of up to 190,000 qualified data scientists.”
Trouble is, the same article says, “Because data science is so new, universities are scrambling to define it and develop curriculums.”
So — we don’t know what they are, but we need 500,000 of them.
You can understand, I think, why anyone can claim to be a Data Scientist, or even claim to teach Data Science, and why we have so many divergent ideas about what you need to know, teach, learn or do. Obviously, this is a good time for charlatans and snake-oil salesmen to grab the money and run.
Not that the NYTimes writer, or those she cites, are experts in the field. We’re not even sure if it qualifies as a “field”. However, if we take that article’s story to heart, one thing becomes clear:
A “data scientist” knows much more than just how to manipulate tools and data, or how to get (or grant) a certification or degree. Sure, you can do all those things, but that doesn’t begin to elevate you to the role of “data scientist”.
So, how does one become a data scientist?
The data scientist has insight, understanding, knowledge and experience in a particular sphere of knowledge and activity. The data scientist is able to do both analysis and synthesis, to identify the significant facts and events that make up the trends bearing on a vision. The data scientist works with facts and data about both the past and the future.
I think that a sound preparation for being a data scientist is to study philosophy, and to get trained in thinking, challenging, defending, learning, refining and communicating ideas. Perhaps some training in music and art would be helpful as well, to recognize creativity in structure and form, to recognize models of things in the world and to understand design principles. And some mathematics which, along with the formal logic of philosophy, instills some intellectual rigor. Then a course in economics, especially macro-economics, so that your ideas and recommendations are grounded in practical reality. Some understanding of human psychology, both individually and in groups, would also be valuable in trying to predict the behavior of your customers, your market, your leaders or that one individual you are trying to influence. Follow that up with a decade or two of experience “on the ground” in the industry or company you hope to serve.
These, I think, would be the minimal qualifications to become a data scientist.
 Link to the NYTimes article