Getting "TypeError: '>' not supported between instances of 'str' and 'float'"

loan_prediction
python

#1

Getting "TypeError: '>' not supported between instances of 'str' and 'float'"
in the code snippet below:

from sklearn.preprocessing import LabelEncoder
var_mod = ['Gender', 'Married', 'Dependents', 'Education', 'Self_Employed', 'Property_Area', 'Loan_Status']
le = LabelEncoder()
for i in var_mod:
    df[i] = le.fit_transform(df[i])  # TypeError raised at this line
df.dtypes


#2

Hi @plarion,
There may be missing values in the dataset. Treat them before applying the label encoder.
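A small sketch of why missing values can lead to this exact message, assuming the missing cells are held as float NaN next to string labels and that the encoder has to order the labels by comparing them. The values here are made up for illustration; this is not the encoder's actual code.

```python
import math

# Assumption: a missing cell is stored as a float NaN next to string labels.
missing = float("nan")

try:
    # Ordering labels means comparing them pairwise; a str/float pair cannot be compared.
    "Male" > missing
except TypeError as exc:
    message = str(exc)

print(message)
print(type(missing).__name__, math.isnan(missing))
```

The first print shows the same wording as the error in the question, which suggests at least one non-string value (typically NaN) is still sitting in the column.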


#3

Thank you @jalFaizy, missing values were the issue.


#4

I am getting the following error for this particular code:

Code:

from sklearn.preprocessing import LabelEncoder
var_mod = ['Gender', 'Married', 'Dependents', 'Education', 'Self_Employed', 'Property_Area', 'Loan_Status']
le = LabelEncoder()
for i in var_mod:
    df[i] = le.fit_transform(df[i])
df.dtypes

Error:

Note: I did impute all the missing values.


#5

I would say, get the value counts of each column to see if any column still has NULL values,
e.g. df["Gender"].value_counts()
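One caveat with this check: by default, value_counts() leaves NaN out of its result, so a column can look clean while still holding missing values. A minimal sketch with a made-up Gender column (not the real dataset):

```python
import numpy as np
import pandas as pd

# Made-up column with one missing entry, for illustration only.
gender = pd.Series(["Male", "Female", np.nan, "Male"], name="Gender", dtype=object)

default_counts = gender.value_counts()            # NaN is excluded by default
full_counts = gender.value_counts(dropna=False)   # NaN gets its own row
n_missing = int(gender.isnull().sum())            # direct count of missing cells

print(default_counts)
print(full_counts)
print(n_missing)
```

If the default counts add up to fewer rows than len(df), the difference is the number of missing values in that column.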


#6

There are no NULL values in the data, but I am still facing the same issue.

df["Dependents"].value_counts()
0     345
1     102
2     101
3+     51
Name: Dependents, dtype: int64

df['Dependents'] = le.fit_transform(df['Dependents'])

TypeError: '>' not supported between instances of 'str' and 'float'

Any help will be appreciated.


#8

It is because of 3+.
Better to map it manually using map({'3+': 3})


#9

@T_Predict | anyone: more details on using the map function, please.

I tried doing this after getting the error:

TypeError: '>' not supported between instances of 'str' and 'float'

In [152]:

df['Dependents'].value_counts()
df['Dependents']=map(float('3+'),3)
#df['Dependents'].map({'3+',3})
df['Dependents'].value_counts()
#df['Dependents'] = le.fit_transform(df['Dependents'])
#df['Dependents']

ValueError Traceback (most recent call last)
in ()
      1 df['Dependents'].value_counts()
----> 2 df['Dependents']=map(float('3+'),3)
      3 #df['Dependents'].map({'3+',3})
      4 df['Dependents'].value_counts()
      5 #df['Dependents'] = le.fit_transform(df['Dependents'])

ValueError: could not convert string to float: '3+'


#10

Missing values are the issue.


#11

Hey, you should have used
df['Dependents'] = df['Dependents'].map({'3+': 3})


#12

For some reason, executing the above turns the remaining values into NaN. After running the above code, execute df['Dependents'].isnull().sum() to see this.

I am not really sure why, but the only workaround I found is to export the data to a CSV and read it back. After that, LabelEncoder works absolutely fine.

df.to_csv('F:/Datasets/Loan_Pred_traintest_cleaned.csv', index=False)
df = pd.read_csv('F:/Datasets/Loan_Pred_traintest_cleaned.csv')

If I find the reason, I will post it here.
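A likely explanation for the NaN values: Series.map with a dictionary turns every value that is not a key of the dictionary into NaN, whereas Series.replace only touches the listed values. A minimal sketch on a made-up Dependents column, assuming the entries are stored as strings:

```python
import pandas as pd

# Made-up values, assumed to be strings as they would be with a "3+" category.
dependents = pd.Series(["0", "1", "2", "3+", "0"], dtype=object)

# map: values missing from the dict ("0", "1", "2") become NaN.
mapped = dependents.map({"3+": 3})

# replace: only "3+" changes; everything else is kept as-is.
replaced = dependents.replace({"3+": "3"})

print(int(mapped.isnull().sum()))
print(replaced.tolist())
```

Replacing "3+" with the string "3" rather than the integer 3 keeps the column uniformly typed, which avoids mixing str and numeric values before encoding.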


#13

It is due to the missing values in the columns Gender, Married and Dependents.
Use df.count() to check which columns have missing values, fill them, and then apply label encoding; it will work fine.
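The steps above could look roughly like this. It is only a sketch on a tiny made-up frame; filling with the most frequent value (the mode) is an assumption here, since the thread does not say which imputation to use.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny made-up frame standing in for the loan data.
df = pd.DataFrame(
    {
        "Gender": ["Male", "Female", np.nan, "Male"],
        "Married": ["Yes", np.nan, "No", "Yes"],
    },
    dtype=object,
)

# count() reports non-missing cells per column; anything below len(df) has gaps.
print(df.count())

encoded = df.copy()
for col in ["Gender", "Married"]:
    # Assumption: fill gaps with the most frequent value of the column.
    filled = encoded[col].fillna(encoded[col].mode()[0])
    # Cast to str so the encoder only ever sees one type, then encode.
    encoded[col] = LabelEncoder().fit_transform(filled.astype(str))

print(encoded)
```

A fresh LabelEncoder per column is used here so that each column's classes stay separate; reusing a single instance as in the original snippet also works, but it only remembers the last column it was fitted on.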