What is the standard way to validate a Machine Learning model?

predictive_model

#1

Can you really trust your model right after training? Not until you have validated it. So how would one approach this problem? Please also suggest some resources for building such a validation pipeline.


#2

I used the R "caret" package for cross-validation. There are plenty of articles on how the caret package can be used for this purpose.


#3

Could you share links to those resources? (Resources in Python would be preferable.)


#4

You can refer to these links:
http://topepo.github.io/caret/training.html
http://machinelearningmastery.com/how-to-estimate-model-accuracy-in-r-using-the-caret-package/

But these are for R. Perhaps a Python expert can share additional information on this.


#5

@sonny
The link http://machinelearningmastery.com/how-to-estimate-model-accuracy-in-r-using-the-caret-package/
lists five different validation methods. When developing a model, is there any logic to choosing a particular validation technique?


#6

Hello,
I typically start with a simple Data Split or the Bootstrap while building the first few features and models.
In the later stages, I use K-fold Cross-Validation most of the time.

Repeated K-fold and Leave-one-out can take very long to complete, depending on how much data you have.
If time is not a constraint, it is worth trying these as well to help guard against over-fitting.
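
If it helps, here is a rough Python sketch of these options using scikit-learn. The dataset and classifier are only placeholders, so treat it as a starting point rather than a recipe:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    KFold, LeaveOneOut, RepeatedKFold, cross_val_score, train_test_split,
)

# Placeholder data and model -- swap in your own.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 1. Simple data split: cheap, good enough for early experiments.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model.fit(X_train, y_train)
print("Hold-out accuracy:", model.score(X_test, y_test))

# 2. K-fold cross-validation: a more stable estimate at K times the cost.
scores = cross_val_score(
    model, X, y, cv=KFold(n_splits=5, shuffle=True, random_state=42)
)
print("5-fold CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))

# 3. Slower but more thorough: drop-in replacements for the cv argument.
# cv = RepeatedKFold(n_splits=5, n_repeats=10, random_state=42)
# cv = LeaveOneOut()

# (The bootstrap is not a built-in splitter in scikit-learn; you can draw
# bootstrap samples manually with sklearn.utils.resample.)
```

Since RepeatedKFold and LeaveOneOut are drop-in replacements for the cv argument, switching validation techniques is a one-line change.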