Stock Prediction using Decision Tree

Wednesday, September 24, 2008

This is the first post in a series on using Decision Tree for Stock Prediction. Here are the second, third, fourth and fifth posts.

I started applying data mining to finance a few months ago, and this post gives an overview of my main project on stock market prediction. When I joined my company, I saw several projects (so-called "screeners", which build stock-picking rules from technical indicators without any data mining). Most of them make two assumptions:

The rules based on technical indicators don't evolve in time
Stocks are selected (and sometimes processed) differently according to the sector they belong to (e.g. health and care, industry, etc.)

Since I am not comfortable with these two assumptions, I have started a new project based on the following idea:

Each technical indicator may work for a particular stock and at a certain moment in time
This means that i) rules based on indicators should evolve over time and ii) each stock should be processed independently. Note that the second point doesn't mean that there is no correlation between a particular stock and the sector it belongs to. It only means that stocks may behave differently and thus should be treated independently. Information from the sector can still be used in the forecasting process.

When seen as a black box, the system takes information about a specific stock (such as open, high, low, close, volume, etc.) as input and produces a class value as output. The class is defined as follows:

    1 if close[j+n] > (x% * close[j]) + close[j]
   -1 otherwise

where j is the current day, n is the number of days between the current day and the predicted day, and x is a value chosen to take transaction fees into account (a fixed amount could also be used instead of a percentage). The class predictions are thus made for each stock independently. One year of daily data is used for training and the following month for testing. The window is then shifted forward and the process repeated, so that the system adapts to the current market.
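To make the class definition and the shifting window more concrete, here is a minimal Python sketch. It is an illustration only and not the code of the actual system: the function names `label_days` and `sliding_windows` are made up for this example, and the window lengths (252 trading days for one year, 21 for one month) and the step of one test month are my assumptions.

```python
from typing import Iterator, List, Tuple


def label_days(close: List[float], n: int, x_pct: float) -> List[int]:
    """Return 1 where close[j+n] > close[j] + x% of close[j], else -1.

    The last n days have no close[j+n] yet, so they receive no label.
    """
    threshold = 1.0 + x_pct / 100.0
    return [
        1 if close[j + n] > close[j] * threshold else -1
        for j in range(len(close) - n)
    ]


def sliding_windows(
    n_days: int, train_len: int = 252, test_len: int = 21
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges over n_days of daily data.

    Assumption: about 252 trading days per year and 21 per month, and the
    window moves forward by one test period at each step.
    """
    start = 0
    while start + train_len + test_len <= n_days:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += test_len


if __name__ == "__main__":
    closes = [100.0, 102.0, 101.0, 105.0]
    print(label_days(closes, n=1, x_pct=1.0))
    for train, test in sliding_windows(300):
        print(train, test)
```

With the four closing prices above, n = 1 and x = 1%, the labels are [1, -1, 1]: only the first and third days are followed by a rise of more than 1%. For 300 days of data, the defaults produce two train/test windows.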

Here are the steps of the overall methodology, which uses decision trees for stock prediction:

1. Stock filtering
2. Data preprocessing
3. Classification tree
4. Risk management

In the following posts, I will explain each of these steps in detail.

10 comments:

Pedro said...: Hi!
Do you want to share some knowledge with me?
I'm thinking to focus my Master BI Degree in forecasting stocks... and I already have some ideias!
Regards!
Pedro
Good Blog!!!!; 11:44 PM
Sandro Saitta said...: Thanks for your comment Pedro. I can give you a list of books/articles about the subject if you're interested. Regarding my personal experience, you will have an excerpt with the following posts on DMR.; 1:53 PM
Anonymous said...: Hi Sandro,
Do you assign classes of -1 and 1 only and or a scale between?
Cheers, Shane; 2:03 AM
Sandro Saitta said...: Hi Shane,
I'm using -1/1 for the classes (i.e. I have only two classes) but I use a more complex function for calculating the accuracy of my decision trees: I take into account the difference between close[j+n] and close[j].; 9:13 AM
Themos Kalafatis said...: Hello,

I was wondering whether you think your models could achieve higher accuracy if you enhanced them with financial facts. What is your opinion on this?

Many Thanks!; 12:27 AM
Sandro Saitta said...: Hi Themos,

Very interesting question. First, I would like to state that I don't believe technical indicators are better or worse than fundamental indicators for stock picking. In this project, however, I have decided to work with technical indicators. I therefore make the assumption that all information about the market is contained in the price of stocks. This is related to the old traders' saying: "Buy the rumor and sell the news" (i.e. it's too late to look at the news because the market has already reacted to it).

This is why I don't use financial facts or news. However, a lot of work has been done on text mining of financial news to predict how stocks evolve, and I wouldn't be surprised if it worked. It is just another way of designing a system.

I hope my answer is clear enough.
Regards.; 4:16 PM
Themos Kalafatis said...: Hi Again Sandro,

Thanks for your reply, i am looking forward for your findings.

Best Regards,

Themos; 11:19 PM
Unknown said...: hi sandro,
great post actually..
i am doing my final project this semester, and the topic is 'apply data mining in retail business'...
regarding your post, you briefly describe how decision tree can assist in stock prediction... if you dont mind, can you explain to me in term of algorithm itself..
you state that there have 4 steps.. can you explain me each of steps.. :p if you dont mind..
i really need your help..
n i appreciate most your kindness..

best regard,
-amir-
amirloko@gmail.com; 10:16 PM
Sandro Saitta said...: Thanks for your comment Amir. In fact, the four steps correspond to my overall methodology. They are not specific to decision trees.

Regarding decision trees themselves, I would suggest the book by Tan et al., Introduction to Data Mining.

Hope it helps.; 11:57 AM
Ryan Duran said...: Greeat post thank you; 2:31 PM