Data Mining Research - dataminingblog.com: variable relationship


Tuesday, March 13, 2007

A note on correlation

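Correlation is often used as a preliminary technique to discover relationships between variables. More precisely, correlation measures the strength of the linear relationship between two variables. For n paired observations (x_i, y_i), Pearson's correlation coefficient is defined as:

$$ r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

where \bar{x} and \bar{y} are the sample means. The coefficient ranges from -1 to 1, and it equals -1 or 1 only when the points lie exactly on a straight line.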

As noted above, the main drawback of correlation is that it only captures linear relationships. A correlation of zero between two variables does not mean they are unrelated: they may still be non-linearly related. As pointed out in Tan et al. (2006), x and x^2 have a correlation of zero when the values of x are symmetric about zero, yet one is a deterministic function of the other. Remember that non-linear does not mean polynomial. Consider for example x and cos(x): over a range symmetric about zero their correlation is close to zero, yet they are clearly related.
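The point is easy to check numerically. The following is a minimal sketch, not taken from the book: the helper `pearson` and the sample grid (601 evenly spaced points on [-3, 3]) are assumptions chosen for illustration. It computes the sample correlation of x with a linear function of x, with x^2, and with cos(x).

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (under the square roots in the denominator).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Assumed sample: evenly spaced values on [-3, 3], symmetric about zero.
xs = [i / 100 for i in range(-300, 301)]

r_linear = pearson(xs, [2 * x + 1 for x in xs])     # linear relationship
r_square = pearson(xs, [x ** 2 for x in xs])        # quadratic relationship
r_cosine = pearson(xs, [math.cos(x) for x in xs])   # non-polynomial relationship

print(f"x vs 2x+1  : {r_linear:.6f}")
print(f"x vs x^2   : {r_square:.6f}")
print(f"x vs cos(x): {r_cosine:.6f}")
```

On this symmetric grid the first coefficient is 1 and the other two are zero up to floating-point rounding, even though all three variables are exact functions of x. If the range is shifted so that it is no longer symmetric about zero (for example [0, 3]), the last two correlations are no longer zero, which is why the symmetry condition matters.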

P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Addison-Wesley, 2006.
