Data Science or Data Alchemy
I love the tweetness of the @howarddresner posts restated here regarding data science… and the dialog it has started… and I would like to add a twist to the conversation.
First a story…
I once consulted for a company that was building a marketing service for its clients. They targeted customers with products provided by those clients using data they had in-house. The company had a team of data scientists who built targeting models. The same team built the models and reports that evaluated the effectiveness of the targeting. Somehow their evaluations demonstrated that they were brilliant… producing results that were unprecedented and completely justifying the service.
But a close look at the targeting and the evaluation showed that the targeting was weak and that the results were grossly inflated.
Karl Popper tells us that science must be falsifiable. But being falsifiable is not enough… science also requires enough rigor that somebody actually attempts to falsify each claim.
“Data Science” is not science in this respect. It is alchemy… and the shortage of data scientists is twice as bad as you think… because there must be two data scientists for every claim of data discovery… one to discover it and one to test the discovery.
My father used to say: “Figures never lie… but liars figure.” I am not accusing all data scientists of being as unethical as those in my story… but I worry that under the algorithmic mumbo-jumbo there emerge new versions of the truth that are utterly untested and will often prove inaccurate.