LabelEncoder throws type error in sort


#1

When I enter the following code from the tutorial:
from sklearn.preprocessing import LabelEncoder
var_mod = ['Gender','Married','Dependents','Education','Self_Employed','Property_Area','Loan_Status']
le = LabelEncoder()
for i in var_mod:
    df[i] = le.fit_transform(df[i])
df.dtypes

I get the error below. Can someone please advise? Thank you!

In [24]: for i in var_mod:
    ...:     df[i] = le.fit_transform(df[i])
    ...:
    ...:
Traceback (most recent call last):

  File "", line 2, in
    df[i] = le.fit_transform(df[i])

  File "C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 131, in fit_transform
    self.classes_, y = np.unique(y, return_inverse=True)

  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py", line 195, in unique
    perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')

TypeError: '>' not supported between instances of 'str' and 'float'


#2

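Please check for missing values in all the features that you are working with. pandas stores a missing value as NaN, which is a float, so a column that mixes strings and NaN cannot be sorted, and that is the comparison the TypeError is complaining about. If there are missing values, make sure you impute them with a value of the same datatype as the rest of the column (a string for these categorical columns).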
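
Something like the following could be used to check and impute before encoding. It is only a sketch: the small DataFrame is a made-up stand-in for the tutorial's loan data, and filling with the most frequent value (the mode) is one possible choice, not necessarily what the tutorial does.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the tutorial's DataFrame: string columns with gaps
df = pd.DataFrame({
    "Gender": ["Male", "Female", np.nan, "Male"],
    "Married": ["Yes", "No", "Yes", np.nan],
})
var_mod = ["Gender", "Married"]

# Count missing values per column; any non-zero count will break LabelEncoder
missing_counts = df[var_mod].isnull().sum()
print(missing_counts)

for col in var_mod:
    # Fill gaps with the most frequent value, which is a string like the rest
    df[col] = df[col].fillna(df[col].mode()[0])
    # With only strings left, the sort inside fit_transform succeeds
    df[col] = LabelEncoder().fit_transform(df[col])

print(df.dtypes)
```

If a missing entry carries meaning of its own, filling with a placeholder string such as "Unknown" instead of the mode keeps it as a separate category.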


#3

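Hi, you can use pd.factorize(df[i])[0] instead. pd.factorize returns a tuple of (codes, uniques), so take the first element to get the integer codes. It does not sort the values, and missing values are encoded as -1 rather than raising an error.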
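
A quick sketch of how that behaves, using a made-up column rather than the tutorial's data:

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
df = pd.DataFrame({"Gender": ["Male", "Female", np.nan, "Male"]})

# factorize numbers the categories in order of first appearance;
# NaN is not given a category and is marked with the sentinel -1
codes, uniques = pd.factorize(df["Gender"])
df["Gender"] = codes

print(codes)    # integer code for each row
print(uniques)  # the distinct non-missing labels
```

Note that the -1 rows are still missing values in disguise, so you may want to handle them before modelling.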
Hope it helps.