How does K-fold cross-validation help improve the model?

crossvalidation

#1

I am currently solving a classification problem using CART in R. I built the model with rpart, and after building it I looked into methods that could improve its performance. In my search I found a method called k-fold cross-validation. I want to know how k-fold cross-validation helps improve the performance of the model.


#2

Hi,

K-fold cross-validation is used to produce a more reliable estimate of model performance by training on all but one slice (fold) of the data and holding that fold out for validation. If k = 10, for example, you hold out 1/10 of the training set as the validation (test) set each time.
What does this mean in practice? If you have a huge number of training observations and can afford a split of, say, 0.5, then cross-validation will not bring you much new information.
So how does it help? If you have a small number of observations, you will be tempted to use as many as possible to build your model, and then you face a validation problem: you will validate on very few observations, so your estimate of how the model will perform once deployed is little more than a guess.

Now suppose you split your observations into 10 slices and rotate one slice at a time into the validation role. You build 10 models with the same parameters, each time training on 9 slices and holding one out for validation (so every time you have a different training and validation sample). For each model you keep the validation metric, for example RMSE, and then average the 10 values. This average tells you more accurately how your model will behave, because you have now used all of your data both to build and to test the model.
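The rolling hold-out just described can be sketched in plain Python. The "model" here is a toy predict-the-training-mean regressor rather than a CART tree, purely to show the mechanics of the fold rotation and the metric averaging:

```python
# Minimal sketch of k-fold cross-validation mechanics.
# The toy model predicts the mean of the training fold; in the
# thread's R/rpart setting it would be a CART model instead.

def kfold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def rmse(y_true, y_pred):
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def cross_val_rmse(y, k=10):
    """Rotate each fold into the hold-out role, score it, average the metric."""
    folds = kfold_indices(len(y), k)
    scores = []
    for holdout in folds:
        held = set(holdout)
        train = [y[i] for i in range(len(y)) if i not in held]
        mean_pred = sum(train) / len(train)            # "fit" the toy model
        test = [y[i] for i in holdout]
        scores.append(rmse(test, [mean_pred] * len(test)))
    return sum(scores) / k                             # average over the k folds

y = [float(i) for i in range(20)]
print(round(cross_val_rmse(y, k=10), 3))
```

In R, the caret package automates exactly this rotation via `trainControl(method = "cv", number = 10)`, so you rarely need to write the loop by hand.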
Then you change one parameter in your model and redo the 10 models as before. The model that does best on the aggregated metric is the one you keep, and once deployed it should normally show a performance similar to what the validation indicated.
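This tune-then-compare step can be sketched the same way. Here `cv_error` is a hypothetical stand-in that returns a made-up average k-fold error per candidate value; in the rpart setting the tuned parameter might be `maxdepth` or `cp`, and in practice `cv_error` would run the full k-fold loop once per candidate:

```python
# Hypothetical sketch: choose the parameter whose average k-fold
# error is lowest. cv_error is a made-up stand-in, not a real model.

def cv_error(depth):
    # Toy curve: error falls, then rises again as deeper trees overfit.
    return (depth - 4) ** 2 + 1.0

candidate_depths = [2, 3, 4, 5, 6, 8]
avg_scores = {d: cv_error(d) for d in candidate_depths}
best_depth = min(avg_scores, key=avg_scores.get)
print(best_depth)  # → 4, the depth with the lowest average CV error
```

The key point is that each candidate is judged on the *average* hold-out metric across all k folds, not on a single lucky or unlucky split.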
In a few words, k-fold (or, at the extreme, leave-one-out) cross-validation validates your model while removing the variance you would get from building a single model on one split of, say, 0.8 between your train and test sets.
Hope this helps.
Alain