I have historical data for about 100 assets, and I have to predict whether each stock will go up (1) or down (-1). The data is numerical. I tried random forest and logistic regression, but I don't know why I am not getting good accuracy. How should I analyse such data to get good accuracy and results?
Without a proper exploration of the dataset, it's pretty hard to say what would work and what wouldn't. I would suggest you do a thorough exploration of your dataset to understand it.
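As one possible first step in that exploration, it can help to check how balanced the up/down labels are, since the share of the majority class is the accuracy you would get by always guessing the same direction. The snippet below is only a minimal sketch: the `labels` list is made-up data, and `majority_baseline` is a hypothetical helper name, not something from the question.

```python
from collections import Counter


def majority_baseline(labels):
    """Return (majority_label, accuracy of always predicting that label)."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)


# Hypothetical labels for illustration only: 1 = up, -1 = down
labels = [1, -1, 1, 1, -1, 1, -1, 1, 1, -1]
label, acc = majority_baseline(labels)
print(f"always predicting {label} gives accuracy {acc:.2f}")
```

If your random forest or logistic regression does not beat this number on held-out data, the features probably carry little signal for the target.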
Off the top of my head, I think you should give linear regression and its variants a try. They might work better in a time series scenario.
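Whichever model you try, in a time series scenario it is usually worth evaluating it in time order, so that the model is never tested on data older than what it was trained on. Here is a rough sketch of a walk-forward split in plain Python. The function name and its parameters (`n_folds`, `min_train`) are my own assumptions for illustration, not part of any library.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_idx, test_idx) pairs where every test index comes after
    every train index. Assumes the samples are already sorted by time."""
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))


# Hypothetical sizes: 10 time-ordered rows, 3 folds, at least 4 training rows
for train_idx, test_idx in walk_forward_splits(10, 3, 4):
    print("train:", train_idx, "test:", test_idx)
```

A random shuffle before splitting would let future rows leak into training, which can make accuracy look better than it really is.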
I suggest you go through this blog
Getting an accurate model to predict whether stocks will go up or down is not that easy. If it could be predicted that accurately, everyone would make tons of money from it.
However, you need not stop. I can help you from the data side, as better data might help you improve your accuracy. Try https://www.quandl.com/. It has large financial databases that are easily accessible.
Good luck!
It depends on your dataset. But I wish to point out something that you may have overlooked.
When it comes to predicting stock prices, there are two classes of people.
- They use only basic information like stock price, volume, etc. From there, for example, they try to create moving averages and so on. All of this information is easily available. I usually call people who use such information “technical analysis”. I am sorry to say that technical analysis cannot predict stock prices. If it could, the CEOs of all listed companies would not need to sell any services and might as well sit at their desks and do price prediction.
- They use business ratios for EACH company and EACH industry. This means the dataset per company is huge: net profit ratio, liquidity ratio, etc. From there, you then analyse WHICH RATIOS have an impact on the stock price. I call these people “fundamental analysis”. This is realistic because salaried people like you and me contribute to businesses, which in turn generate the numbers behind the ratios reported to the stock exchange. I believe this is sustainable from a business-model standpoint.
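To make the first group concrete, this is roughly what a price-only feature looks like: a trailing moving average plus the up/down label for the next step. It is only a sketch with made-up prices; the function names are hypothetical, and treating an unchanged price as -1 is my own assumption, since the question only defines up (1) and down (-1).

```python
def simple_moving_average(prices, window):
    """Trailing simple moving average; None until `window` prices are available."""
    out = []
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= window:
            running -= prices[i - window]  # drop the price that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def next_step_labels(prices):
    """1 if the next price is higher than the current one, else -1.
    Assumption: an unchanged price counts as -1. The last price gets no label."""
    return [1 if nxt > cur else -1 for cur, nxt in zip(prices, prices[1:])]


# Hypothetical closing prices for illustration only
prices = [10.0, 11.0, 12.0, 11.0, 13.0]
print(simple_moving_average(prices, 3))
print(next_step_labels(prices))
```

Note that the label list is one element shorter than the price list, so the features have to be aligned with it before training.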
So tell us first: what kind of dataset do you have?