12 Useful Pandas Techniques in Python for Data Manipulation: #3 – Imputing missing files

# First we import a function to determine the mode
from scipy.stats import mode
mode(data['Gender'])

/anaconda3/lib/python3.7/site-packages/scipy/stats/stats.py:245: RuntimeWarning: The input array could not be properly checked for nan values. nan values will be ignored.
"values. nan values will be ignored.", RuntimeWarning)

TypeError Traceback (most recent call last)
in ()
1 #First we import a function to determine the mode
2 from scipy.stats import mode
----> 3 mode(data['Gender'])

/anaconda3/lib/python3.7/site-packages/scipy/stats/stats.py in mode(a, axis, nan_policy)
437 return mstats_basic.mode(a, axis)
--> 439 scores = np.unique(np.ravel(a)) # get ALL unique values
440 testshape = list(a.shape)
441 testshape[axis] = 1

/anaconda3/lib/python3.7/site-packages/numpy/lib/arraysetops.py in unique(ar, return_index, return_inverse, return_counts, axis)
231 ar = np.asanyarray(ar)
232 if axis is None:
--> 233 ret = _unique1d(ar, return_index, return_inverse, return_counts)
234 return _unpack_tuple(ret)

/anaconda3/lib/python3.7/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
279 aux = ar[perm]
280 else:
--> 281 ar.sort()
282 aux = ar
283 mask = np.empty(aux.shape, dtype=np.bool_)

TypeError: '<' not supported between instances of 'str' and 'float'

The Gender column has NA values, so you need to fill them with the mode using fillna.

Ex: data['Gender'].fillna(data['Gender'].mode()[0], inplace=True)
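To see why this helps, here is a minimal sketch on a made-up frame. The `data` frame and its values below are hypothetical stand-ins, not the tutorial's dataset. It assumes the error comes from NaN (a float) being ordered against strings, which is what the `'<' not supported` message suggests, and then fills the gaps using pandas' own `Series.mode()`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tutorial's `data` frame (values are made up).
data = pd.DataFrame({"Gender": ["Male", "Female", np.nan, "Male", np.nan]})

# NaN is a float, so ordering it against strings fails with a TypeError,
# the same kind of comparison error reported in the traceback.
try:
    sorted(data["Gender"].tolist())
    sort_failed = False
except TypeError:
    sort_failed = True

# Series.mode() skips NaN by default and returns a Series of modes;
# [0] picks the first one.
gender_mode = data["Gender"].mode()[0]

# Assigning the result back is an alternative to inplace=True.
data["Gender"] = data["Gender"].fillna(gender_mode)

print(sort_failed, gender_mode, data["Gender"].isna().sum())
```

Assigning the filled column back does the same job as `inplace=True` here; either form should leave no missing values in the column.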

This should work for this problem, but be careful with other problems: if your mode is NaN, then it'll just replace NaN with NaN.
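One way to guard against that case is sketched below. `fill_with_mode` is a hypothetical helper name, not part of pandas. It relies on `Series.mode()` dropping NaN by default (`dropna=True`), so as far as I know the case to watch in pandas is a column with no usable values at all, where `mode()` comes back empty; other mode functions, or `dropna=False`, may behave differently.

```python
import numpy as np
import pandas as pd


def fill_with_mode(series):
    """Fill missing values with the most frequent non-missing value.

    Hypothetical helper: returns the series unchanged when there is
    no usable mode (for example, every value is missing).
    """
    modes = series.mode()  # dropna=True by default, so NaN is not counted
    if modes.empty:
        return series
    # iloc[0] takes the first mode by position when there are ties.
    return series.fillna(modes.iloc[0])


# Made-up examples: one normal column, one with nothing to fill from.
filled = fill_with_mode(pd.Series(["a", None, "a", "b"]))
all_missing = fill_with_mode(pd.Series([np.nan, np.nan]))

print(filled.tolist())
print(all_missing.isna().all())
```

Checking `isna().sum()` on the column after the fill is a quick way to confirm that nothing was left missing.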

© Copyright 2013-2019 Analytics Vidhya