Filling missing values


#1

I am trying to fill the missing values in the Dependents column using the code below. Sometimes I get

TypeError: ('cannot use label indexing with a null key')

and sometimes a KeyError.

The code is:

table = df.pivot_table(index=['Gender'], columns='Dependents', values='ApplicantIncome', aggfunc='mean')
print(table)

def combine(x):
    return table.loc[x['Gender'], x['Dependents']]

df['Dependents'].fillna(df[df['Dependents'].isnull()].apply(combine, axis=1), inplace=False)


#2

Hey @ASHISH_17,
The error occurs because combine looks up the table by the row's Gender and Dependents values, but it is only ever called on rows where Dependents is NaN.
So the return statement is effectively evaluated like this:

table.loc[x['Gender'], NaN]

Hence the TypeError: cannot use label indexing with a null key.
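
One way around this is to build the lookup from a column that is never null, so the key can never be NaN. The snippet below is only a minimal sketch, not the one right answer: it uses made-up toy data in place of your dataset, and it assumes you are happy to fill Dependents with the most frequent value within each Gender group (that choice of statistic is my assumption). It also assumes every Gender group has at least one known Dependents value. Note as well that fillna with inplace=False returns a new Series, so the result has to be assigned back.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the loan dataset (values are made up).
df = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
    "Dependents": ["0", "0", np.nan, "1", np.nan, "1"],
    "ApplicantIncome": [5000, 4200, 6100, 3900, 4500, 4100],
})

# Most frequent Dependents value per Gender. mode() ignores NaN,
# so the lookup is keyed only on Gender, which has no missing values here.
mode_by_gender = df.groupby("Gender")["Dependents"].agg(lambda s: s.mode().iloc[0])

# Map each row's Gender to its group mode, then fill only the gaps.
# fillna returns a new Series, so assign it back to the column.
df["Dependents"] = df["Dependents"].fillna(df["Gender"].map(mode_by_gender))

print(df)
```

If Gender itself has missing values in your data, those rows would still need handling first, since map returns NaN for them.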

Let me know if you did not quite get that,
Pavleen


#3

What if I fill all the missing values?
I have tried using different columns that have no null entries, but this time I get a KeyError when I use Self_Employed as the index:

KeyError: ('Self_Employed', u'occurred at index Loan_ID')

How should I proceed?


#4

Hey @ASHISH_17,
Impute the missing values along the same lines as discussed here: Imputing Missing Values
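
The code for this attempt is not shown, so this is only a guess: "occurred at index Loan_ID" is what you would typically see when apply runs column by column (the default axis=0) instead of row by row. In that case the function receives the Loan_ID column first, which has no 'Self_Employed' label, and the lookup raises a KeyError. The sketch below reproduces that on made-up toy data; the combine function here is a hypothetical stand-in for yours.

```python
import pandas as pd

# Toy frame standing in for the loan dataset (values are made up).
df = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003"],
    "Self_Employed": ["No", "Yes", "No"],
    "Education": ["Graduate", "Graduate", "Not Graduate"],
})

def combine(x):
    # Expects x to be a row, so that column names are valid labels.
    return x["Self_Employed"] + "/" + x["Education"]

# Default axis=0: each call receives a whole column (Loan_ID first),
# which is indexed by row position, not by column name.
try:
    df.apply(combine)
    column_wise_failed = False
except KeyError:
    column_wise_failed = True

# axis=1: each call receives one row, so x["Self_Employed"] works.
row_wise = df.apply(combine, axis=1)
print(row_wise)
```

If your call already passes axis=1, the cause is something else, and posting the exact code would help.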

Hope that this helped,
Pavleen