Statistics vs data mining
I recently came across an article from DMReview about the differences between statistics and data mining. The article, by Kathy Lange, takes a business point of view (which is generally the point of view of the journal). After a short introduction comparing statistics and data mining, the author focuses on the use of predictive analytics for business and on so-called Data-Driven Decision-Making. One conclusion of the paper is that "From a business perspective, it doesn't really matter what you call it: statistics, data mining or predictive analytics." I guess it matters from the data mining point of view...
2 comments:
Probably because there are not so many statisticians out there ;)
I never found the seemingly arbitrary grouping of analytical techniques into categories like "data mining", "statistics", "chemometrics", "econometrics", etc. terribly compelling. From this application-agnostic perspective, a number is a number, whether it's a blood sugar level or gross domestic product.
I suggest that it is more interesting to examine patterns among analysts themselves.
As an example, there clearly exists a class of analysts who are overly enthusiastic about neural networks. These folks often have little or no experience with competing techniques, except perhaps linear regression, which they use as a straw man.
Likewise, there is also clearly a category of analyst whose toolbox is composed of techniques from the statistical state of the art, as of sometime in the 1970s.
Several authors have noted the important overlap among differently-named fields, notably Warren Sarle and Jerome Friedman. I believe that it is worth examining these ideas and prodding people to move beyond the strongly-labeled silos.