ValueError: Unknown label type: 'unknown' for LogisticRegression model


#1

Hi,

I am new to machine learning and just starting out with various algorithms to understand the loan prediction problem.

However, I am getting ValueError: Unknown label type: 'unknown' when using LogisticRegression() on my train data.

I tried the LinearRegression() model and that worked fine. Could you please help me understand why I am getting this error? I am stuck on it.

Below is my code:

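from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split, KFold, cross_val_score
from sklearn.linear_model import LinearRegression, LogisticRegression

# dataframe holds the loan data loaded earlier (loading code not shown)
colModify = ['Loan_ID', 'Gender', 'Married', 'Dependents', 'Education', 'Self_Employed', 'Property_Area', 'Loan_Status']
labelEncod = LabelEncoder()
for i in colModify:
    dataframe[i] = labelEncod.fit_transform(dataframe[i])
dataframe.dtypes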

# Extract the dataset as a NumPy array

dataset = dataframe.values

# Split data into X and Y and create validation data

X = dataset[:,1:12]
Y = dataset[:,12]
validation_size = 0.20
seed = 7
X_train, X_validation, Y_train, Y_validation = train_test_split(X, Y, test_size=validation_size, random_state=seed)

seed = 7
scoring = 'accuracy'

# Spot-check algorithms

models = []
models.append(('LM', LinearRegression()))
models.append(('LR', LogisticRegression()))

# Evaluate each model in turn

results = []
names = []
for name, model in models:
    # shuffle=True so that random_state takes effect (recent scikit-learn versions require this)
    kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
    cv_results = cross_val_score(model, X_train, Y_train, cv=kfold)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)

I am getting the error below at the line where the cross_val_score function is called:

raise ValueError("Unknown label type: %r" % y_type)

ValueError: Unknown label type: 'unknown'


#2

Regarding cross_val_score:

There is a mismatch between what the function accepts and what you are actually passing, for example an array vs. a DataFrame, or a 1D vs. a 2D list. Check what format X_train and Y_train are in and correct the mismatch.
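One way to run that check is to print the container type, shape and dtype of each input together. The sketch below is only an illustration: describe_array is a made-up helper, and the two small arrays are hypothetical stand-ins for the real X_train and Y_train.

```python
import numpy as np

def describe_array(name, arr):
    """Return a one-line summary: container type, shape and dtype."""
    as_np = np.asarray(arr)
    return f"{name}: type={type(arr).__name__}, shape={as_np.shape}, dtype={as_np.dtype}"

# Hypothetical stand-ins for the real training data.
X_train = np.array([[1, 0.5], [0, 1.5], [1, 2.5]])
Y_train = np.array([1, 0, 1], dtype=object)  # object dtype, as can happen with mixed columns

print(describe_array("X_train", X_train))
print(describe_array("Y_train", Y_train))
```

Looking at type() alone is not enough here, since an ndarray can still carry dtype=object; the dtype field is the part worth reading closely.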


#3

I checked, and both X_train and Y_train are NumPy arrays:
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>

Also, it seems cross_val_score only takes an ndarray.


#4

I think I worked through it.
There was an issue with how I was filling in the missing values in the string and float columns, which was giving me a dtype of object. Now Y has dtype float64 and I am able to see some results.
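For anyone hitting the same error, here is a minimal sketch of what was probably going on. It assumes that at least one column was still non-numeric when dataframe.values was called. The tiny frame and its column values are made up for illustration, and type_of_target is the scikit-learn utility that, as far as I can tell, classifiers use to validate their labels.

```python
import pandas as pd
from sklearn.utils.multiclass import type_of_target

# Hypothetical miniature of the loan data: one string column is left
# unencoded, so DataFrame.values falls back to dtype=object for every column.
frame = pd.DataFrame({
    "Income": [5000.0, 3000.0, 4000.0, 6000.0],
    "Property_Area": ["Urban", "Rural", "Urban", "Rural"],
    "Loan_Status": [1, 0, 1, 0],
})

dataset = frame.values
Y = dataset[:, 2]                    # still dtype=object after slicing
kind_before = type_of_target(Y)      # the label type the classifier would see

Y_fixed = Y.astype(float)            # explicit numeric cast of the labels
kind_after = type_of_target(Y_fixed)

print(dataset.dtype, kind_before)
print(Y_fixed.dtype, kind_after)
```

If the first line reports an object dtype with the label type 'unknown', that matches the error message above. LinearRegression does not appear to run this label-type check, which would explain why it worked on the same data.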

Thank you.