Continuous Dependent variable modelling

machine_learning
random_forest
#1

I have around 23 independent variables for predicting battery life with regression. Linear regression gave an adjusted R2 of around 0.79, and Random Forest gave an adjusted R2 of 0.85. The model is working fine, but my question is: can we do that? Is it correct to use Random Forest for a continuous dependent variable? If not, what else can I do?

#2

Hi @koyeli728,

You certainly can use Random Forest for regression problems (where the target variable is continuous). In scikit-learn, RandomForestClassifier is used for classification problems and RandomForestRegressor is used for regression problems.
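A minimal sketch of that comparison, assuming scikit-learn and NumPy are installed. The data here is synthetic, with 23 features standing in for the real battery dataset; the target formula, sample size and hyperparameters are illustrative assumptions, not the original setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the battery data (assumption): 23 features and a
# continuous target with one linear term, one nonlinear term and noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 23))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=600)

# Hold out a test set so the scores reflect unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # predict() returns continuous values, so R2 applies to both models.
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: held-out R2 = {scores[name]:.3f}")
```

Scoring on held-out rows rather than the training rows matters more for Random Forest than for linear regression, because a forest can fit its training data very closely.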

You can try other tree-based models like GBDT (gradient boosting decision trees) or LightGBM as well.
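As a rough sketch of the GBDT option, scikit-learn's GradientBoostingRegressor can be scored with cross-validation. LightGBM is a separate package and is not used here. The synthetic data and the hyperparameter values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption): 23 features, continuous target.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 23))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=400)

gbdt = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0
)

# 5-fold cross-validated R2: each fold is scored on rows the model did not see.
cv_r2 = cross_val_score(gbdt, X, y, cv=5, scoring="r2")
print("R2 per fold:", np.round(cv_r2, 3))
print("mean R2:", round(float(cv_r2.mean()), 3))
```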

#3

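Hi, you can try several types of regression equations: polynomial (2nd, 3rd, 4th, 5th degree), linear, logit, and so on. Look at R2 to select which equation best fits your data: the higher the R2, the better the fit. When the equations have different numbers of terms, compare adjusted R2 instead, since plain R2 never decreases as terms are added. Also look at the variation of the beta coefficients of the independent variables; less variation is better. Don’t forget to check for collinearity, since strongly correlated independent variables make the coefficient estimates unreliable. Sometimes the fit is better with fewer variables in the equation.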
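To make the point about fewer variables concrete, here is a small sketch of the standard adjusted R2 formula, 1 - (1 - R2)(n - 1)/(n - p - 1), where n is the number of rows and p the number of predictors. The function name and the numbers fed to it are hypothetical, chosen only to show the penalty.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical case: two equations reach the same plain R2 of 0.80 on
# 200 rows, one with 5 predictors and one with 23.
few = adjusted_r2(0.80, n_samples=200, n_predictors=5)
many = adjusted_r2(0.80, n_samples=200, n_predictors=23)

print(f"5 predictors:  adjusted R2 = {few:.4f}")
print(f"23 predictors: adjusted R2 = {many:.4f}")
```

With the same plain R2, the equation with fewer predictors gets the higher adjusted R2, which is one way the fit can look better with fewer variables.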