XGBoostError label set cannot be empty



Hi experts,

I am getting an error while computing a CV score with the XGBoost Python package.
This is what I did:

import numpy as np
import pandas as pd
import xgboost as xgb
import matplotlib.pyplot as plt
%matplotlib inline

train = pd.read_csv("/…train_u6lujuX_CVtuZ9i.csv")


param = {'eta': 1,
         'eval_metric': 'auc',
         'max_depth': 2,
         'nthread': 4,
         'objective': 'binary:logistic',
         'silent': 1}


train_dmatrix = xgb.DMatrix(train["ApplicantIncome"])


xgb.cv(params=param, dtrain=train_dmatrix, seed=21)

I get the error below:

XGBoostError Traceback (most recent call last)
in ()
----> 1 c = xgb.cv(params=param, dtrain=train_dmatrix, seed=21)

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/training.py in cv(params, dtrain, num_boost_round, nfold, stratified, folds, metrics, obj, feval, maximize, early_stopping_rounds, fpreproc, as_pandas, verbose_eval, show_stdv, seed, callbacks)
398 evaluation_result_list=None))
399 for fold in cvfolds:
--> 400 fold.update(i, obj)
401 res = aggcv([f.eval(i, feval) for f in cvfolds])

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/training.py in update(self, iteration, fobj)
217 def update(self, iteration, fobj):
218 """Update the boosters for one iteration"""
--> 219 self.bst.update(self.dtrain, iteration, fobj)
221 def eval(self, iteration, feval):

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/core.py in update(self, dtrain, iteration, fobj)
805 if fobj is None:
--> 806 _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, iteration, dtrain.handle))
807 else:
808 pred = self.predict(dtrain)

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/core.py in _check_call(ret)
125 """
126 if ret != 0:
--> 127 raise XGBoostError(_LIB.XGBGetLastError())

XGBoostError: b'[20:20:23] src/objective/regression_obj.cc:89: Check failed: (info.labels.size()) != (0) label set cannot be empty'

Can anyone help me understand why I am getting this error?
Am I missing something?


Even when I tried adding a label to the DMatrix, as below:

train_dmatrix = xgb.DMatrix(train["ApplicantIncome"], label=train["Loan_Status"])

it didn't work.


If you need a script in R, I would be happy to help you out.
Thanks


Hello Rahul,

XGBoost needs all your features and the target variable to be numeric. I hope you are converting Loan_Status to numeric.
The error message is complaining that the label set, i.e. the target variable (train["Loan_Status"]), is empty.
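
Before building the DMatrix, it may help to confirm that the target is numeric and non-empty. The snippet below is only a sketch: the DataFrame is made up, and the "Y"/"N" coding of Loan_Status is an assumption, since the real file's values are not shown in this thread.

```python
import pandas as pd

# Made-up stand-in for the training data (assumption: Loan_Status is "Y"/"N").
train = pd.DataFrame({
    "ApplicantIncome": [1000, 2000, 3000, 4000, 5000, 6000],
    "Loan_Status": ["Y", "N", "Y", "Y", "N", "Y"],
})

# Map the string classes to 1/0; any value missing from the dict becomes NaN.
label = train["Loan_Status"].map({"Y": 1, "N": 0})

# Sanity checks on the target before handing it to xgboost.
print("rows in label:", len(label))
print("dtype of label:", label.dtype)
print("unmapped values:", label.isna().sum())
```

If the unmapped count is not zero, the column contains values other than the ones in the mapping, and those rows need to be handled first.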

Below is a simulation on a dummy DataFrame.

I hope this helps you troubleshoot the issue.