While trying out ensemble methods, I used a dataset that looks like this:
I am trying to predict y from x1, x2, and x3.
Linear regression and random forest each give an error of around 136, while SVM gives an error of 129.
I then tried combining the results in two ways: (1) SVM and random forest, and (2) SVM, random forest, and linear regression. This is what happened:
As can be seen, the error with SVM and random forest combined was actually lower than when I combined all three models.
Aren't ensemble methods supposed to improve accuracy? What is going wrong here?
Also, since this is a small dataset, combining some or all of the models and comparing the results was not an issue. When the data is large, though, this kind of trial and error could be time-consuming, so I want to know: how do we decide which models to combine?
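To make the trial and error concrete, here is a minimal sketch of the comparison I have in mind. It assumes that "combining" means a plain average of each model's predictions and that the error is RMSE on a hold-out set; both are assumptions for illustration. The function names (`rmse`, `average_predictions`, `rank_ensembles`) and all the numbers are made up, not my actual data. Each model is fitted and predicts only once, so scoring every subset afterwards is cheap even if training is slow.

```python
from itertools import combinations
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def average_predictions(pred_lists):
    """Element-wise mean of several prediction lists (simple averaging)."""
    return [sum(vals) / len(vals) for vals in zip(*pred_lists)]


def rank_ensembles(y_true, preds_by_model):
    """Score every non-empty subset of models; return (error, subset) pairs, best first."""
    names = sorted(preds_by_model)
    results = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            blended = average_predictions([preds_by_model[n] for n in subset])
            results.append((rmse(y_true, blended), subset))
    return sorted(results)


# Hypothetical hold-out targets and per-model predictions (illustrative only).
y_val = [10.0, 20.0, 30.0, 40.0]
preds = {
    "svm": [12.0, 18.0, 33.0, 37.0],
    "rf": [8.0, 23.0, 28.0, 44.0],
    "lm": [16.0, 26.0, 36.0, 46.0],
}

for error, subset in rank_ensembles(y_val, preds):
    print(f"{'+'.join(subset):10s} {error:.3f}")
```

With these made-up numbers the SVM and random forest errors partly cancel, while the linear model is off in the same direction on every row, so adding it pulls the average away from the target. The number of subsets doubles with every extra model, so this exhaustive check only works for a handful of models.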