Machine Learning_Predict Stock Price with R_2nd May, 2016


Hello All,

I have studied one year of PNB stock price data from the Nifty index, looking at patterns and seasonal components, and have run a regression analysis to predict the PNB stock price with a 95% confidence interval.

The PNB stock price has fallen from Rs 160 to Rs 80 in one year. I have not dug into the data to find the reason for such a sharp fall. Instead, I performed a regression analysis to see the relationships within the data.
I have used two variables to predict the high price of the stock (PNB): each day’s open price and the previous day’s total traded quantity.
To select these two variables, I looked at the relative importance of each variable for the final outcome (the high price). Below is the bar graph of the relative importance of Total Traded Quantity and Open Price.

Note: This code was provided by Dr. Johnson in 2000 and was adapted from an SPSS program.

R script for Relative importance,



Variable                 Relative importance (%)
Open.Price               95.185287
Total.Traded.Quantity     4.814713
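
The relative importance script itself is not shown above, so here is a minimal sketch of how relative weights in the style of Johnson (2000) can be computed in base R. It is an illustration under assumptions, not the script used for the numbers above: the data are synthetic (not SP.csv), and `f_demo` and `relative_weights` are hypothetical names made up for this example.

```r
# Synthetic stand-in for the PNB data (assumption: NOT the real SP.csv values).
set.seed(1)
n <- 300
Open.Price <- runif(n, 80, 160)
Total.Traded.Quantity <- runif(n, 2e6, 2e7)
High.Price <- 1.03 * Open.Price + 2e-07 * Total.Traded.Quantity + rnorm(n, sd = 2)
f_demo <- data.frame(High.Price, Open.Price, Total.Traded.Quantity)

# Hypothetical helper: relative weights of each predictor for one outcome.
relative_weights <- function(data, outcome) {
  predictors <- setdiff(names(data), outcome)
  R <- cor(data[, c(outcome, predictors)])
  rxx <- R[predictors, predictors, drop = FALSE]  # predictor correlations
  rxy <- R[predictors, outcome]                   # predictor-outcome correlations
  ev <- eigen(rxx, symmetric = TRUE)
  # Symmetric square root of rxx: links predictors to orthogonal counterparts.
  lambda <- ev$vectors %*% diag(sqrt(ev$values), nrow = length(ev$values)) %*% t(ev$vectors)
  beta <- solve(lambda, rxy)                      # weights of the orthogonal variables
  raw <- as.vector(lambda^2 %*% beta^2)           # raw relative weights (sum to R-squared)
  data.frame(predictor = predictors,
             raw = raw,
             percent = 100 * raw / sum(beta^2))   # rescaled to sum to 100
}

rw <- relative_weights(f_demo, "High.Price")
print(rw)
```

The raw weights add up to the R-squared of the regression, and the percent column rescales them to 100, which is the same scale as the two values reported above.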

It is quite evident from the picture that the open price and the high price of the stock on a given day are strongly correlated. We also include the total traded quantity of the stock, since it influences the stock price through demand and supply (how liquid the stock is in the market). The higher the liquidity, the lower the stock price; low liquidity invites speculation, which pushes the stock price up.

If you compare the two graphs below, it is evident that whenever the liquidity of the stock was low, the high price of the stock shot up. Therefore, there is a negative relationship between the high price of the stock and the liquidity of the stock.

To get the graphs below, use the following R scripts.

Note: Please install the packages below,

a) install.packages("zoo")
b) install.packages("dygraphs")
c) install.packages("lubridate")

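R-script for Total Quantity Traded
library(dygraphs)  # dygraph(), dyRangeSelector() and the %>% pipe
x <- lubridate::mdy(f$Date)  # f is the data frame; Date is read as month-day-year
e <- zoo::zoo(f$Total.Traded.Quantity, order.by = x)  # assumption: e is not defined in the original script
dygraph(e, main = "Total Traded Quantity from Feb 2015 to Apr 2016") %>%
  dyRangeSelector(dateWindow = c("2015-02-19", "2016-04-25"))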

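R-script for High Prices (PNB)
library(dygraphs)  # dygraph(), dyRangeSelector() and the %>% pipe
x <- lubridate::mdy(f$Date)  # f is the data frame; Date is read as month-day-year
d <- zoo::zoo(f$High.Price, order.by = x)  # assumption: d is not defined in the original script
dygraph(d, main = "Punjab National Bank (High Price)") %>%
  dyRangeSelector(dateWindow = c("2015-02-19", "2016-04-25"))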

Regression Analysis of PNB Stock Price

R Script for the lm method,


Call:
lm(formula = High.Price ~ Open.Price + Total.Traded.Quantity, data = f)

Residuals:
    Min      1Q  Median      3Q     Max
-7.8894 -1.0822 -0.3018  0.8450 11.1223

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
(Intercept)           -2.911e+00  6.815e-01  -4.272 2.63e-05 ***
Open.Price             1.027e+00  4.440e-03 231.372  < 2e-16 ***
Total.Traded.Quantity  2.304e-07  3.102e-08   7.427 1.23e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.978 on 291 degrees of freedom
Multiple R-squared: 0.9954, Adjusted R-squared: 0.9953
F-statistic: 3.123e+04 on 2 and 291 DF, p-value: < 2.2e-16
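
For readers who want to reproduce this kind of summary, here is a rough sketch of the fitting step. It assumes a data frame with the same three column names as above; since SP.csv may not be available, the sketch builds a synthetic `f_demo` (a hypothetical name, with made-up values) of 294 rows, which gives the same 291 residual degrees of freedom as in the output above.

```r
# Synthetic stand-in for the data frame f (assumption: values are made up).
set.seed(42)
n <- 294
f_demo <- data.frame(Open.Price = runif(n, 80, 160),
                     Total.Traded.Quantity = runif(n, 2e6, 2e7))
f_demo$High.Price <- -2.9 + 1.03 * f_demo$Open.Price +
  2.3e-07 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 2)

# Fit the same formula as in the post and print the regression summary.
fit <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = f_demo)
print(summary(fit))

# Actual vs estimated high price, side by side, for a comparison plot or table.
comparison <- data.frame(actual = f_demo$High.Price, estimated = fitted(fit))
print(head(comparison))
```

With the real data, replacing `f_demo` by `f` should give a summary of the same shape as the one above.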

Actual High Price vs Estimated High Price


Prediction of the Stock Price(PNB)

R-Script for the prediction,

The predicted High Price for 2nd May, 2016 is Rs 88.60/-.

fit_High Price   Lower limit   Upper limit
88.60            88.17         89.01
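
The prediction script is not shown here, so the following is only a sketch of how a fit with lower and upper limits can be obtained from an lm model with predict(). The data and the inputs in `new_day` are hypothetical, made-up values. The limits reported above are narrow compared with the residual standard error of 1.978, which suggests (the post does not confirm it) that they are a confidence interval for the mean rather than a prediction interval for a single day; the sketch computes both so the difference is visible.

```r
# Synthetic stand-in for the data frame f (assumption: values are made up).
set.seed(7)
n <- 294
f_demo <- data.frame(Open.Price = runif(n, 80, 160),
                     Total.Traded.Quantity = runif(n, 2e6, 2e7))
f_demo$High.Price <- -2.9 + 1.03 * f_demo$Open.Price +
  2.3e-07 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 2)
fit <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = f_demo)

# Hypothetical inputs for the day to be predicted.
new_day <- data.frame(Open.Price = 88, Total.Traded.Quantity = 7e6)

# 95% confidence interval for the mean high price at these inputs.
ci_mean <- predict(fit, newdata = new_day, interval = "confidence", level = 0.95)
# 95% prediction interval for one observed high price (wider).
pi_obs <- predict(fit, newdata = new_day, interval = "prediction", level = 0.95)

print(ci_mean)  # columns: fit, lwr, upr
print(pi_obs)
```

The fit column corresponds to fit_High Price above, and lwr and upr correspond to the lower and upper limits.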

The estimated High Price for PNB stock on 29th April, 2016 is Rs 89.20/-, with a CI of UL = Rs 89.63/- and LL = Rs 88.78/-.

Note: The estimated price for PNB was Rs 89.20/- and the observed price was Rs 89.15/- on 29th April, 2016.

Machine Learning

Please install the caret package for this script,
library(caret)    # provides train()
library(ggplot2)  # provides ggplot()
mod.High.Price.lm <- train(High.Price ~ Open.Price + Total.Traded.Quantity, data = f, method = "lm")
coef.icept <- coef(mod.High.Price.lm$finalModel)[1]  # intercept
coef.slope <- coef(mod.High.Price.lm$finalModel)[2]  # Open.Price coefficient

# plot the data

ggplot(data = f, aes(x = Open.Price, y = High.Price)) + geom_point() + geom_abline(intercept = coef.icept, slope = coef.slope, color = "red")
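
What caret's train() adds over a plain lm() call is a resampled estimate of the prediction error (by default it uses bootstrap resampling). To show the idea without needing caret, here is a small sketch that uses 5-fold cross-validation in base R instead. This is a swapped-in technique for illustration, not what train() does by default, and `f_demo` is a hypothetical synthetic data frame, not SP.csv.

```r
# Synthetic stand-in for the data frame f (assumption: values are made up).
set.seed(123)
n <- 294
f_demo <- data.frame(Open.Price = runif(n, 80, 160),
                     Total.Traded.Quantity = runif(n, 2e6, 2e7))
f_demo$High.Price <- -2.9 + 1.03 * f_demo$Open.Price +
  2.3e-07 * f_demo$Total.Traded.Quantity + rnorm(n, sd = 2)

k <- 5
fold <- sample(rep(seq_len(k), length.out = n))  # random fold label per row
rmse <- numeric(k)

for (i in seq_len(k)) {
  train_rows <- f_demo[fold != i, ]  # fit on k - 1 folds
  test_rows <- f_demo[fold == i, ]   # evaluate on the held-out fold
  fit_i <- lm(High.Price ~ Open.Price + Total.Traded.Quantity, data = train_rows)
  err <- test_rows$High.Price - predict(fit_i, newdata = test_rows)
  rmse[i] <- sqrt(mean(err^2))
}

print(rmse)        # out-of-sample RMSE per fold
print(mean(rmse))  # average out-of-sample RMSE
```

An out-of-sample RMSE close to the residual standard error of the full fit would indicate that the model is not overfitting.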

End Note:

I have not included any seasonal component in the prediction of the stock price. Including one should reduce the residuals significantly and make the predicted price more accurate.
I request readers to comment and suggest a model that combines time series analysis with regression analysis.

Aritra Chatterjee
SP.csv (14.9 KB)


The predicted High Price for PNB on 27th April, 2016 was Rs 91.43/-.

The observed High Price for PNB on 27th April, 2016 was Rs 92.15/-.


Similarly, the prediction for 28th April, 2016 is Rs 91.23/- (upper limit = Rs 91.63, lower limit = Rs 90.837),
with an open price of Rs 90.10 and a traded quantity of 6923537 units for the day.
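
As a quick check, the 28th April figure can be recomputed by hand from the rounded coefficients in the regression summary (intercept -2.911, Open.Price 1.027, Total.Traded.Quantity 2.304e-07). The variable names below are made up for this example.

```r
# Coefficients as printed in the regression summary (rounded values).
intercept <- -2.911e+00
b_open <- 1.027e+00
b_qty <- 2.304e-07

# Inputs quoted for 28th April, 2016.
open_price <- 90.10
traded_qty <- 6923537

# High.Price = intercept + b_open * Open.Price + b_qty * Total.Traded.Quantity
high_hat <- intercept + b_open * open_price + b_qty * traded_qty
print(round(high_hat, 2))
```

This gives about Rs 91.22, within a paisa or two of the reported Rs 91.23; the small gap comes from using the rounded coefficients.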


The predicted High Price for PNB on 28th April, 2016 was Rs 91.23/-.
The observed High Price was Rs 91.15/- on 28th April, 2016.



@richie31 Nice study. What price do you suggest for this stock by December 2016?


Really good question. To predict the price for December 2016 with a 95% confidence interval, I am trying to develop a model in R that combines time series and regression concepts.


Good conclusions…

I am not able to download the SP.csv file.


I have uploaded the file, but I guess there is some issue with the blog. You need to check with the blog owner.


SP.csv (14.9 KB)