XGBoost: hyperparameter tuning makes AUC worse?!

I am using XGBoost for a machine learning task. My dataset is relatively small: 3153 observations and 46 features. I followed the steps in https://www.analyticsvidhya.com/blog/2016/02/complete-guide-parameter-tuning-gradient-boosting-gbm-python/#comment-152592 but the model seems to perform worse after hyperparameter tuning.
My original parameters are:

from xgboost import XGBClassifier

xgb1 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=1000,
    max_depth=5,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=1,
    seed=27)

This gave an AUC of 0.911 on the training set and 0.949 on the test set.
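The AUC figures were computed roughly like this (a minimal sketch, not my exact script; X_train, X_test, y_train, y_test are my train/test splits, not shown here):

from sklearn.metrics import roc_auc_score

# X_train / X_test / y_train / y_test are assumed train/test splits (not shown).
xgb1.fit(X_train, y_train)

# roc_auc_score needs positive-class probabilities, not hard class labels.
train_auc = roc_auc_score(y_train, xgb1.predict_proba(X_train)[:, 1])
test_auc = roc_auc_score(y_test, xgb1.predict_proba(X_test)[:, 1])
print(train_auc, test_auc)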
After I changed the parameters according to the results of GridSearchCV, the model looked like this:

xgb2 = XGBClassifier(
    learning_rate=0.01,
    n_estimators=1000,
    max_depth=3,          # was 5
    min_child_weight=5,   # was 1
    gamma=0.2,            # was 0
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=1,
    seed=0)               # was 27
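For reference, the first stage of the grid search (the value ranges are in my reply below) was set up roughly like this. It is a sketch rather than my exact script; cv=5 and scoring='roc_auc' are assumptions:

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Stage 1: search max_depth and min_child_weight with everything else fixed.
param_grid1 = {
    'max_depth': range(3, 10, 2),        # 3, 5, 7, 9
    'min_child_weight': range(1, 6, 2),  # 1, 3, 5
}
search1 = GridSearchCV(
    estimator=XGBClassifier(
        learning_rate=0.01, n_estimators=1000, gamma=0,
        subsample=0.8, colsample_bytree=0.8,
        objective='binary:logistic', nthread=4,
        scale_pos_weight=1, seed=27),
    param_grid=param_grid1,
    scoring='roc_auc',  # tune on AUC, since that is the metric of interest
    cv=5)               # assumed fold count
search1.fit(X_train, y_train)
print(search1.best_params_, search1.best_score_)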

Then the AUC on the training set dropped to 0.892, and on the test set it dropped to 0.917. I am really confused. Hoping for your help!

Hi @clare_che, what is the list of hyperparameters you gave to the grid search algorithm?

Thank you for the reply.
First I tested max_depth values from 3 to 10 in steps of 2 and min_child_weight values from 1 to 6 in steps of 2 (note that XGBClassifier has min_child_weight, not min_samples_split). It was weird, because the reported ideal values were 1 for max_depth and 5 for min_child_weight. I still took max_depth = 3 as the optimum, since max_depth is usually said to range from 3 to 10. I fitted the model again and found that the AUC decreased from 0.911 to 0.883 on the train set, and from 0.949 to 0.917 on the test set.
Then I tested gamma values from 0 to 5 in steps of 0.1 and took 0.2 as the optimum (sketched below). Finally I tuned subsample and colsample_bytree, but the accuracy and AUC just didn't increase any more.
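In code, the gamma stage looked roughly like this sketch (same assumed GridSearchCV setup as the stage-1 sketch above, with the previous stage's max_depth and min_child_weight fixed in):

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Stage 2: search gamma only; 0.0 to 5.0 in steps of 0.1 as described above.
param_grid2 = {'gamma': [i / 10.0 for i in range(0, 51)]}
search2 = GridSearchCV(
    estimator=XGBClassifier(
        learning_rate=0.01, n_estimators=1000, max_depth=3,
        min_child_weight=5, subsample=0.8, colsample_bytree=0.8,
        objective='binary:logistic', nthread=4,
        scale_pos_weight=1, seed=27),
    param_grid=param_grid2,
    scoring='roc_auc',
    cv=5)
search2.fit(X_train, y_train)
print(search2.best_params_)  # here: {'gamma': 0.2}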

Hi, I noticed you changed the 'seed' value. Please keep the seed value constant.

Thank you. I kept the seed value constant and tried again, and the result remained the same. So I guess something else may be causing the problem.
