Stock Picking using Data Mining: Parameter Tuning

Friday, December 12, 2008

It is often said that in data mining projects, 80% of the time goes to data preprocessing and the remaining 20% to the data mining task itself. However, when data mining is integrated into a larger system (such as a stock picking system), tuning the parameters of the overall system becomes an important task of its own.

For example, in the stock picking system mentioned above, several parameters have to be fixed in order to obtain satisfactory results. Here is a list of these parameters:

Number of stocks to analyze (depends on the computational resources)
Number of stocks to select as the best ones (a fixed number, or all stocks passing a threshold on the validation accuracy and the minimum number of trades)
Short or long term prediction (predict increase/decrease of given stocks in X days)
Confusion matrix costs for the classifier (how to penalize each type of classifier error)
Size of the shifting window (i.e. size of the training/validation set)
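
To make the list more concrete, here is a minimal Python sketch of how these parameters could be grouped and used. It is only an illustration and not the code of the actual system: the class `TuningParams`, the helper functions, the default values and the cost values are all hypothetical. In particular, the cost matrix assumes that predicting an increase that turns out to be a decrease is twice as costly as the opposite error, which is just one possible choice.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class TuningParams:
    """Hypothetical container for the parameters listed above."""
    n_stocks: int = 100        # number of stocks to analyze
    n_selected: int = 5        # number of stocks kept as the best ones
    min_accuracy: float = 0.6  # threshold on the validation accuracy
    min_trades: int = 10       # minimum number of trades
    horizon_days: int = 5      # X: predict increase/decrease in X days
    train_size: int = 250      # training part of the shifting window
    valid_size: int = 50       # validation part of the shifting window
    # cost[actual][predicted], with classes 0 = decrease, 1 = increase
    # (assumed values: a false "increase" costs twice a false "decrease")
    cost: Tuple[Tuple[float, float], Tuple[float, float]] = (
        (0.0, 2.0),
        (1.0, 0.0),
    )


def make_labels(prices: Sequence[float], horizon: int) -> List[int]:
    """Label day t with 1 if the price is higher `horizon` days later, else 0."""
    return [int(prices[t + horizon] > prices[t])
            for t in range(len(prices) - horizon)]


def shifting_windows(n_samples: int, train_size: int,
                     valid_size: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, validation) index ranges that slide forward in time."""
    start = 0
    while start + train_size + valid_size <= n_samples:
        split = start + train_size
        yield range(start, split), range(split, split + valid_size)
        start += valid_size  # shift by one validation block


def total_cost(actual: Sequence[int], predicted: Sequence[int],
               cost: Sequence[Sequence[float]]) -> float:
    """Sum the penalty of every prediction according to the cost matrix."""
    return sum(cost[a][p] for a, p in zip(actual, predicted))


def select_best(stats: Dict[str, Tuple[float, int]],
                params: TuningParams) -> List[str]:
    """Keep stocks passing both thresholds, ranked by validation accuracy.

    `stats` maps a ticker to (validation accuracy, number of trades).
    """
    eligible = [ticker for ticker, (acc, trades) in stats.items()
                if acc >= params.min_accuracy and trades >= params.min_trades]
    eligible.sort(key=lambda ticker: stats[ticker][0], reverse=True)
    return eligible[:params.n_selected]
```

With this layout, tuning the system amounts to trying several `TuningParams` instances and comparing the validation cost obtained over all shifting windows.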

These parameters will vary according to each project. For example, you can have a look at the parameters mentioned in a post by Themos Kalafatis. Feel free to comment and give examples of parameters that you have to tune.

3 comments:

Themos Kalafatis said...: Sandro,

Nice post... the use of a confusion matrix (and thus a cost-sensitive classifier) in such a predictive application is a must, so it is good that you have pointed it out as one of the "must do" steps.; 4:45 PM
Sandro Saitta said...: Thanks for the comment. However, finding the best confusion matrix is not a straightforward task...; 4:39 PM
Ellena said...: It's interesting how tuning system parameters is crucial for achieving satisfactory results.; 11:32 PM