Difference between R-square and Adjusted R-Square?

@kunal, @aayushmnit Is it possible that R-squared improves significantly yet Adjusted R-squared decreases with the addition of a new predictor?

1 Like

@vajravi

Yes, it is possible. This happens when the newly added variable brings in more complexity than predictive power for the target variable.

Regards,
Kunal

2 Likes

@vajravi - yes, there can be a case where R-squared has improved significantly but Adjusted R-squared has decreased with the addition of a new predictor. This happens only when the newly added predictor is insignificant for the model.

Hi ,

Answer inline.

The easiest way to check the accuracy of a model is by looking at the R-squared value.
The summary provides two R-squared values, namely Multiple R-squared, and Adjusted R-squared.

The Multiple R-squared is calculated as follows:

Multiple R-squared = 1 – SSE/SST, where:
SSE is the sum of squared residuals. A residual is the difference between the actual value and the predicted value, and can be accessed by predictionModel$residuals.
SST is the total sum of squares. It is calculated by summing the squared differences between each actual value and the mean value.

For example,
let's say the actual values are 5, 6, 7, and 8, and a model predicts the outcomes as 4.5, 6.3, 7.2, and 7.9. Then,
SSE can be calculated as: SSE = (5 – 4.5) ^ 2 + (6 – 6.3) ^ 2 + (7 – 7.2) ^ 2 + (8 – 7.9) ^ 2 = 0.39;
and
SST can be calculated as: mean = (5 + 6 + 7 + 8) / 4 = 6.5; SST = (5 – 6.5) ^ 2 + (6 – 6.5) ^ 2 + (7 – 6.5) ^ 2 + (8 – 6.5) ^ 2 = 5.
So Multiple R-squared = 1 – 0.39 / 5 = 0.922.
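
The same arithmetic can be checked with a short Python sketch. It uses only the four actual values and four predictions from the example; the variable names are just illustrative.

```python
# Actual values and model predictions from the example above.
actual = [5, 6, 7, 8]
predicted = [4.5, 6.3, 7.2, 7.9]

mean_actual = sum(actual) / len(actual)

# SSE: sum of squared residuals (actual minus predicted).
sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))

# SST: total sum of squares around the mean of the actual values.
sst = sum((a - mean_actual) ** 2 for a in actual)

r_squared = 1 - sse / sst

print(f"SSE = {sse:.2f}, SST = {sst:.2f}, R-squared = {r_squared:.3f}")
# prints: SSE = 0.39, SST = 5.00, R-squared = 0.922
```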

The Adjusted R-squared value is similar to the Multiple R-squared value,
but it accounts for the number of variables. The Multiple R-squared never decreases
when a new variable is added to the prediction model, but if the variable is a non-significant one, the Adjusted R-squared value can decrease.
For more info, refer here.
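
As a rough illustration, here is a minimal Python sketch of the usual Adjusted R-squared formula, 1 – (1 – R²)(n – 1)/(n – p – 1), where n is the number of observations and p the number of predictors. The function name and the sample numbers (20 observations, R-squared moving from 0.800 to 0.802) are hypothetical, chosen only to show the effect.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof


# Hypothetical case: a third predictor nudges R-squared from 0.800 to 0.802.
before = adjusted_r_squared(0.800, n_obs=20, n_predictors=2)
after = adjusted_r_squared(0.802, n_obs=20, n_predictors=3)

print(f"before = {before:.4f}, after = {after:.4f}")
```

In this made-up case R-squared goes up slightly, yet Adjusted R-squared falls from about 0.776 to about 0.765, because the small gain does not make up for the extra predictor.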

An R-squared value of 1 means that it is a perfect prediction model.

R-squared (R2) measures the degree to which your input variables explain the variation of your output / predicted variable. So, if R-squared is 0.8, it means 80% of the variation in the output variable is explained by the input variables. In simple terms, the higher the R-squared, the more variation is explained by your input variables, and hence the better your model fits.

However, the problem with R-squared is that it will either stay the same or increase with the addition of more variables, even if they do not have any relationship with the output variable. This is where “Adjusted R-squared” comes to help. Adjusted R-squared penalizes you for adding variables which do not improve your existing model.

Hence, if you are building a linear regression on multiple variables, it is always suggested that you use Adjusted R-squared to judge the goodness of the model. With only one input variable, Adjusted R-squared is still slightly lower than R-squared (unless the fit is perfect), but the two are very close when the sample is reasonably large.

Typically, the more non-significant variables you add to the model, the wider the gap between R-squared and Adjusted R-squared becomes.
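
To make the widening gap concrete, here is a small Python sketch. It assumes, purely for illustration, that R-squared stays at 0.80 on 30 observations while useless predictors are added one at a time (in practice R-squared would creep up slightly). The numbers are hypothetical; the formula is the usual 1 – (1 – R²)(n – 1)/(n – p – 1).

```python
r_squared = 0.80  # hypothetical R-squared, held fixed for illustration
n_obs = 30        # hypothetical number of observations

gaps = []
for n_predictors in range(1, 6):
    # Usual Adjusted R-squared formula for n observations and p predictors.
    adjusted = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
    gap = r_squared - adjusted
    gaps.append(gap)
    print(f"predictors={n_predictors}  adjusted={adjusted:.4f}  gap={gap:.4f}")
```

Under these assumptions the gap grows with every extra predictor, and it is already non-zero with a single predictor.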

Regards,
Tony

10 Likes

@tillutony: You clubbed everything perfectly. cheers.

https://www.quora.com/What-is-the-difference-between-R-squared-and-Adjusted-R-squared

1 Like

If we add more variables to the model, R-squared will not decrease (it almost always increases), but Adjusted R-squared will increase only if the added variable is useful enough to the model.

@kunal @aayushmnit @tillutony
Can you please explain the difference between Predicted R-squared and these two terms (Multiple R-squared and Adjusted R-squared)?

http://blog.minitab.com/blog/adventures-in-statistics-2/multiple-regession-analysis-use-adjusted-r-squared-and-predicted-r-squared-to-include-the-correct-number-of-variables

I was looking for this answer.
Thank you Kunal sir for helping us out.

1 Like

Hi, you can find a more comprehensive explanation here: https://medium.com/analytics-vidhya/measuring-the-goodness-of-fit-r%C2%B2-versus-adjusted-r%C2%B2-1e8ed0b5784a

Hello,

What would be the ideal value for Adjusted R-squared?

Regards,
Ankit Prajapati

@akki3026,

The higher the value, the better the model fits. So the ideal value would be 1.

Thanks Tony. Really helpful

Thanks Kunal Sir. Analytics Vidhya is really helpful. Keep doing the good work.

@kunal @aayushmnit Is it possible that adding a non-significant predictor variable (one whose p-value is greater than 0.05) increases the Adjusted R-squared value in a multiple OLS regression model?

@kunal,

Could you please provide complete information, with an example like the one above, on:

  1. MAE, MSE, RMSE, R2, Adjusted R2
  2. Which mechanism fits in which place?
  3. Why are we calculating all these errors?

Hi @punyashloke,

  1. These evaluation metrics are simple to understand. Have you tried going through online blogs and resources? Please go through the following links -

MAE and MSE -> Different evaluation metrics for Regression Models

RMSE and RMSLE ->

R2 and Adjusted R2 ->

  2. The .fit function is used to train the model.

  3. After the training process, we evaluate the model performance to find out how well the model has trained. This is where we use the evaluation metrics (MAE, MSE, etc.).
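
As a small illustration of those metrics, here is a Python sketch that computes MAE, MSE and RMSE by hand. The four actual values and predictions are reused from the R-squared example earlier in the thread; the variable names are only illustrative, and no particular library is assumed.

```python
import math

# Actual values and predictions reused from the earlier R-squared example.
actual = [5, 6, 7, 8]
predicted = [4.5, 6.3, 7.2, 7.9]

errors = [a - p for a, p in zip(actual, predicted)]
n = len(errors)

mae = sum(abs(e) for e in errors) / n   # mean absolute error
mse = sum(e ** 2 for e in errors) / n   # mean squared error
rmse = math.sqrt(mse)                   # root mean squared error

print(f"MAE = {mae:.4f}, MSE = {mse:.4f}, RMSE = {rmse:.4f}")
```

MAE and RMSE are in the same units as the target, while MSE is in squared units; RMSE weights large errors more heavily than MAE does.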

Can the value of Adjusted R2 be greater than or less than R2?

Awesome sir. That's exactly why you are a champ :)

© Copyright 2013-2019 Analytics Vidhya