I am using XGBoost for a machine learning task; my dataset is relatively small, with 3153 observations and 46 features. I followed the steps in https://www.analyticsvidhya.com/blog/2016/02/complete-guide-parameter-tuning-gradient-boosting-gbm-python/#comment-152592, but the model seems to perform worse after hyperparameter tuning.

My original parameters are:

xgb1 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=1000,
    max_depth=5,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=1,
    seed=27)

With this model I got an AUC of 0.911 on the training set and 0.949 on the testing set.
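For reference, I compute those AUC values roughly like this (a sketch, not my exact code; X_train, X_test, y_train, y_test are placeholder names for my train/test split):

from sklearn.metrics import roc_auc_score

# Fit the first model and score both splits on predicted probabilities
xgb1.fit(X_train, y_train)
train_auc = roc_auc_score(y_train, xgb1.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, xgb1.predict_proba(X_test)[:, 1])
print(train_auc, test_auc)  # roughly 0.911 and 0.949 in my case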

After I changed the parameters according to the results of GridSearchCV, the model looked like this:

xgb2 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=1000,
    max_depth=3,
    min_child_weight=5,
    gamma=0.2,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=1,
    seed=0)
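For context, the grid search that produced max_depth=3, min_child_weight=5 and gamma=0.2 was set up roughly along these lines (a sketch; the grids below are illustrative of the ranges suggested by the guide, not necessarily the exact values I searched):

from sklearn.model_selection import GridSearchCV

# Illustrative parameter grids following the guide's suggested ranges
param_grid = {
    'max_depth': [3, 5, 7, 9],
    'min_child_weight': [1, 3, 5],
    'gamma': [0, 0.1, 0.2, 0.3, 0.4]
}
gsearch = GridSearchCV(estimator=xgb1, param_grid=param_grid,
                       scoring='roc_auc', cv=5, n_jobs=4)
gsearch.fit(X_train, y_train)
print(gsearch.best_params_)  # reported max_depth=3, min_child_weight=5, gamma=0.2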

Then the AUC for the training set dropped to 0.892, and for the testing set it dropped to 0.917. I am really confused. Hoping for your help!