How do I combine the results of various models for ensemble learning in R?

ensemble_methods
r

#1

Hello,

In ensemble methods, how do I combine the predictions from multiple models (one based on linear regression, one on random forests)?
My code:

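library(foreach)  # provides foreach() and %do%

length_divisor <- 6
iterations <- 5000
# Each iteration fits lm on a random 1/length_divisor of the training rows;
# the test-set predictions from each iteration become one column
predictions <- foreach(m = 1:iterations, .combine = cbind) %do% {
  training_positions <- sample(nrow(training), size = floor(nrow(training) / length_divisor))
  train_pos <- 1:nrow(training) %in% training_positions
  lm_fit <- lm(y ~ x1 + x2 + x3, data = training[train_pos, ])
  predict(lm_fit, newdata = testing)
}
# Average the predictions across all iterations
lm_predictions <- rowMeans(predictions)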

library(randomForest)
rf_fit<-randomForest(y~x1+x2+x3,data=training,ntree=500)
rf_predictions<-predict(rf_fit,newdata=testing)

Now, if I want to predict based on these two models, how do I do it?
For example, I want to combine the predictions from linear regression and random forests into a better model and find its error rate. Then I want to compare this error rate with those of the linear regression model and the random forest model.
Can somebody please help me with this?


#2

Hi @data_hacks,

There are two ways to build the ensemble:

  • Take a simple mean of the outputs from the linear and random forest models and use that as the final prediction. You can also experiment with different weights to see what fits best, for example 0.4*lm_predictions + 0.6*rf_predictions. The one thing to make sure of with a simple mean is that both models have roughly equal predictive power (almost the same RMSE).

  • Make a table with the columns lm_predictions, rf_predictions and y. Now you can apply any prediction model, such as random forest or GAM, to regress y on lm_predictions and rf_predictions. This is a two-step ensemble: first you train the individual models, then you use their outputs to train one more model.
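
Both approaches can be sketched in base R roughly as follows. This is only a minimal illustration under assumptions: y, lm_pred and rf_pred are simulated stand-ins for testing$y, lm_predictions and rf_predictions, and rmse and blend are hypothetical helper names, not functions from any package. The second-step model here is a plain lm, chosen only to keep the example short.

```r
# Root mean squared error between observed and predicted values
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Simulated stand-ins (assumption): in the thread these would be
# testing$y, lm_predictions and rf_predictions
set.seed(42)
y       <- rnorm(200, mean = 10, sd = 3)
lm_pred <- y + rnorm(200, sd = 1)  # hypothetical linear-model predictions
rf_pred <- y + rnorm(200, sd = 1)  # hypothetical random-forest predictions

# Method 1: weighted average; w is the weight given to the first model
blend <- function(w, p1, p2) w * p1 + (1 - w) * p2
avg_pred <- blend(0.5, lm_pred, rf_pred)  # simple mean

# Try a grid of weights and keep the one with the lowest RMSE
weights <- seq(0, 1, by = 0.1)
errors  <- sapply(weights, function(w) rmse(y, blend(w, lm_pred, rf_pred)))
best_w  <- weights[which.min(errors)]

# Method 2: two-step ensemble; regress y on the two prediction columns
stack_data <- data.frame(y = y, lm_pred = lm_pred, rf_pred = rf_pred)
fit_rows   <- 1:100    # rows used to fit the second-step model
eval_rows  <- 101:200  # rows held back to evaluate it
meta_fit   <- lm(y ~ lm_pred + rf_pred, data = stack_data[fit_rows, ])
stack_pred <- predict(meta_fit, newdata = stack_data[eval_rows, ])

# Compare all four sets of predictions on the same held-back rows
results <- c(
  lm      = rmse(y[eval_rows], lm_pred[eval_rows]),
  rf      = rmse(y[eval_rows], rf_pred[eval_rows]),
  average = rmse(y[eval_rows], avg_pred[eval_rows]),
  stacked = rmse(y[eval_rows], stack_pred)
)
print(round(results, 3))
```

Ideally the weights and the second-step model are fitted on predictions for rows that the base models did not see during training (a separate validation set), and the final comparison is made on yet another set of rows. Otherwise the combined model's error is likely to look better than it really is.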

Hope this helps.

Regards,
Aayush