Why use training set in XGBoost parameter tuning?

This is an old post so I would like to ask a question here if people have some insight.

Why is tuning the parameter ‘gamma’ (section 3) based on the model's performance on the training set a good idea? Surely you want to evaluate based on minimizing the gap between training and test set accuracy? Tuning regularization parameters on the training set seems like a recipe for an overfitting disaster.

I would like to answer this question by asking you one: how do you evaluate any model, and how do you aim to improve it? By checking its accuracy on the training data, right? Here as well, tuning the parameters based on the training loss is correct to a certain extent, as long as we do not overfit. Tuning the parameters too aggressively might lead to overfitting, but generally speaking it is all about trial and error: exploring different ways in which the model can improve.

My understanding of the situation: the regularisation parameter that gives the best training set results is, by definition, the one that optimizes performance on the training set, which takes no account of test (or, ideally, dev) set performance. Regularisation should be used to control the bias-variance tradeoff, and in this case to ensure that the variance is not too large.

Consider the following: a gamma of 0 gives you training accuracy 0.9 and test accuracy 0.7.
A gamma of 10 gives you training accuracy 0.8 and test accuracy 0.8.

Hence you have not overfit with a gamma of 10! In the above tutorial, I believe that evaluating gamma on the training set will preferentially tell you to choose gamma 0 and get the best training accuracy with no regard for test set accuracy, and hence for the model's ability to generalise.
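To make the point concrete, here is a minimal sketch of tuning a regularization parameter on a held-out validation set rather than on the training data. It uses scikit-learn's `DecisionTreeClassifier` with `ccp_alpha` as a stand-in for XGBoost's `gamma` (both demand a larger gain before allowing a split); the dataset, candidate values, and split sizes are all illustrative assumptions, and with xgboost itself you would sweep `gamma` in `XGBClassifier` the same way.

```python
# Illustrative sketch: pick the regularization strength that maximizes
# accuracy on a HELD-OUT validation set, not on the training set.
# ccp_alpha here is a stand-in for XGBoost's gamma (assumption for
# illustration); the data and candidate grid are made up.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

best_alpha, best_val = None, -1.0
for alpha in [0.0, 0.001, 0.01, 0.1]:
    model = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0)
    model.fit(X_train, y_train)
    # Score on the validation split; scoring on X_train instead would
    # always favour the least-regularized model.
    val_acc = model.score(X_val, y_val)
    if val_acc > best_val:
        best_alpha, best_val = alpha, val_acc

print(best_alpha, best_val)
```

Selecting on `model.score(X_train, y_train)` instead would reproduce exactly the failure mode described above: the least-regularized setting wins every time, regardless of how it generalises.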

I hope there is a good reason for it and this is my misunderstanding. However, I am so used to finding tutorials that are rife with bad practice, and whose models therefore do not generalise well at all, that I am skeptical; it happens more often than not. But perhaps there is a good reason that I am just not seeing right now?

Hi Laurence, I totally agree with what you said above about generalization, but when modelling, do we always prefer 99% training accuracy over 90% training accuracy?
No, right? We go for the 90%. If the comparison is 85% versus 90%, we might go for the 90%; it all depends on your instinct in that scenario. The point here is finding better accuracy while still using your instinct about generalization.
I wish I could express myself better.
