I am trying to understand the various ways in which boosting can be performed in R. I have tried stochastic gradient boosting and gradient boosting:
    # Stochastic gradient boosting (via caret)
    library(caret)

    modFit <- train(wage ~ ., method = "gbm", data = training, verbose = FALSE)
    print(modFit)
    summary(modFit)
    predictions <- predict(modFit, testing)

    # Calculate RMSE on the test set
    rmse.stochastic.gbm <- RMSE(predictions, testing$wage)
This gives an output:
And when I use only gradient boosting:
    # Gradient boosting (gbm called directly)
    library(gbm)

    boost.train.class <- gbm(wage ~ ., data = training, n.trees = 500, interaction.depth = 4)
    head(boost.train.class$valid.error)
    summary(boost.train.class)
    boost.pred <- predict(boost.train.class, newdata = testing, n.trees = 500)

    # Calculate RMSE on the test set (RMSE() comes from caret)
    rmse.boost <- RMSE(boost.pred, testing$wage)
These are the outputs:
For the RMSE of the two models:
I understand that stochastic gradient boosting fits each tree on a bootstrapped sample, whereas plain gradient boosting uses the whole training set. Is that right?
In the second case, is the RMSE lower because we are taking bootstrapped samples, so that the model generalizes better?
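To make the comparison I have in mind concrete, here is a minimal sketch. It assumes that the `bag.fraction` argument of `gbm()` is what controls the subsampling: as far as I can tell from the gbm documentation it defaults to 0.5, and setting it to 1 fits every tree on the full training set. The simulated data, the `rmse()` helper and the names `fit_full` and `fit_sub` are made up for illustration and are not my wage data.

```r
library(gbm)

set.seed(1)

# Hypothetical simulated data standing in for the wage data
n <- 500
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- 3 * dat$x1 - 2 * dat$x2 + rnorm(n, sd = 0.3)

train_idx <- sample(n, 350)
train_df <- dat[train_idx, ]
test_df <- dat[-train_idx, ]

# Small helper so the sketch does not depend on caret
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

# bag.fraction = 1: every tree is fit on the full training set
fit_full <- gbm(y ~ ., data = train_df, distribution = "gaussian",
                n.trees = 500, interaction.depth = 4, bag.fraction = 1)

# bag.fraction = 0.5: each tree is fit on a random half of the training set
fit_sub <- gbm(y ~ ., data = train_df, distribution = "gaussian",
               n.trees = 500, interaction.depth = 4, bag.fraction = 0.5)

rmse_full <- rmse(predict(fit_full, newdata = test_df, n.trees = 500), test_df$y)
rmse_sub <- rmse(predict(fit_sub, newdata = test_df, n.trees = 500), test_df$y)

print(c(full = rmse_full, subsampled = rmse_sub))
```

If that assumption holds, then my direct `gbm()` call above, which does not set `bag.fraction`, may already be subsampling, and the difference between my two models might come from something else, such as the tuning that `train()` performs.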
So when should each type of boosting be used, and why?
Can someone please help me understand this?