XGBoostError label set cannot be empty

xgboost
python

#1

Hi experts,

I am getting an error while computing the CV score with the Python implementation of xgboost.
This is what I did:
#import

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xgboost as xgb
%matplotlib inline

train = pd.read_csv("/…train_u6lujuX_CVtuZ9i.csv")

#param

param = {'eta': 1,
         'eval_metric': 'auc',
         'max_depth': 2,
         'nthread': 4,
         'objective': 'binary:logistic',
         'silent': 1}

#DMatrix

train_dmatrix = xgb.DMatrix(train["ApplicantIncome"])

#cv

xgb.cv(params=param, dtrain=train_dmatrix, seed=21)

#I get the error below:


XGBoostError Traceback (most recent call last)
in ()
----> 1 c = xgb.cv(params=param, dtrain=train_dmatrix, seed=21)

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/training.py in cv(params, dtrain, num_boost_round, nfold, stratified, folds, metrics, obj, feval, maximize, early_stopping_rounds, fpreproc, as_pandas, verbose_eval, show_stdv, seed, callbacks)
398 evaluation_result_list=None))
399 for fold in cvfolds:
--> 400 fold.update(i, obj)
401 res = aggcv([f.eval(i, feval) for f in cvfolds])
402

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/training.py in update(self, iteration, fobj)
217 def update(self, iteration, fobj):
218 """Update the boosters for one iteration"""
--> 219 self.bst.update(self.dtrain, iteration, fobj)
220
221 def eval(self, iteration, feval):

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/core.py in update(self, dtrain, iteration, fobj)
804
805 if fobj is None:
--> 806 _check_call(_LIB.XGBoosterUpdateOneIter(self.handle, iteration, dtrain.handle))
807 else:
808 pred = self.predict(dtrain)

/home/rahul/anaconda3/lib/python3.6/site-packages/xgboost/core.py in _check_call(ret)
125 """
126 if ret != 0:
--> 127 raise XGBoostError(_LIB.XGBGetLastError())
128
129

XGBoostError: b'[20:20:23] src/objective/regression_obj.cc:89: Check failed: (info.labels.size()) != (0) label set cannot be empty'

Can anyone help me understand why I am getting this error?
Am I missing something?


#2

Even when I tried adding a label to the DMatrix, as below:

train_dmatrix = xgb.DMatrix(train["ApplicantIncome"], label=train["Loan_Status"])

it didn’t work :frowning:


#3

If you need a script in R, I would love to help you out.
Thanks :slight_smile:
Cheers!!


#4

Hello Rahul,

XGBoost needs all of your features and the target variable to be numeric, so I hope you are converting Loan_Status to numeric values.
The error message is complaining about the size of the target variable (train["Loan_Status"]): the label set that reaches XGBoost is empty.
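As a rough sketch of that conversion: the snippet below assumes Loan_Status holds the strings "Y" and "N", which is only a guess about the real file (check train["Loan_Status"].unique() first). The four rows are made-up dummy values, not data from the actual CSV.

```python
import pandas as pd

# Dummy stand-in for the real training file (values are invented).
train = pd.DataFrame({
    "ApplicantIncome": [5849, 4583, 3000, 2583],
    "Loan_Status": ["Y", "N", "Y", "Y"],  # assumed encoding of the target
})

# Map the string target to 0/1 so it can be passed as a numeric label.
label = train["Loan_Status"].map({"Y": 1, "N": 0})

# Any value missing from the mapping becomes NaN, so check before training.
assert label.notna().all(), "unmapped values in Loan_Status"

print(label.tolist())
```

If the assertion fails on the real data, the column contains values other than the two assumed here, and the mapping needs to be extended.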

Below is a simulation on a dummy DataFrame.

Hope this will help you in troubleshooting the issue.