Missing Values Imputation by prediction

missing_values

#1

I have recently been working on datasets and came across a notebook in which the user had used machine learning itself to impute missing values. Until then I had normally used the mean, mode, or median for imputation. I want to know whether it is better to predict the missing values with an ML algorithm and, if so, which algorithm to use.


#2

@aditya1702: Using an ML algorithm to impute missing values is one of the effective methods for treating them, but it is time-consuming. It depends entirely on how much time you have for the problem and how much of it you want to invest in treating missing values.
It is very difficult to say whether an ML algorithm is better than the simpler methods (mean, median, mode); that depends on the problem and the kind of dataset you have. Having ML-based imputation in your toolkit is always good, but it is always a matter of trial and error.
If you choose an ML algorithm to treat missing values, the logic stays the same as for any prediction task: look at the type of the variable (continuous or categorical) whose missing values you are imputing. If the variable is continuous, you can use linear regression etc.; if it is categorical, you can use a classification model, e.g. logistic regression, SVM etc.
Hence there is no change in the usual ML approach: the variable with missing values is the dependent variable and the other variables are the independent variables. You train on the rows where the value is known and predict it for the rows where it is missing.
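
To make the idea concrete, here is a minimal sketch of regression imputation for a continuous variable. It is only an illustration under assumptions: the toy data, the column names `age` and `income`, and the helper names `fit_line` and `impute_by_regression` are all hypothetical, and it uses a hand-written one-predictor least-squares fit in plain Python so that it runs without any library. On a real dataset you would more likely fit a library model (for example scikit-learn's `LinearRegression`, or `LogisticRegression` for a categorical target) on several predictors.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: y ~ slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def impute_by_regression(rows, target, predictor):
    """Return a copy of rows where missing (None) target values are predicted.

    The model is trained only on rows where the target is known,
    then applied to the rows where it is missing.
    """
    known = [r for r in rows if r[target] is not None]
    slope, intercept = fit_line(
        [r[predictor] for r in known],
        [r[target] for r in known],
    )
    filled = []
    for r in rows:
        r = dict(r)  # copy so the original rows are left untouched
        if r[target] is None:
            r[target] = slope * r[predictor] + intercept
        filled.append(r)
    return filled


# Hypothetical toy data: income is missing for two people.
rows = [
    {"age": 25, "income": 30.0},
    {"age": 30, "income": 35.0},
    {"age": 35, "income": None},
    {"age": 40, "income": 45.0},
    {"age": 45, "income": 50.0},
    {"age": 50, "income": None},
]

filled = impute_by_regression(rows, target="income", predictor="age")
for r in filled:
    print(r)
```

In this toy data, mean imputation would give both missing rows the same value (the mean of the known incomes), while the regression fills each one according to that person's age. Whether that actually helps on a real dataset is something to check by trial and error, for example by hiding some known values and comparing how well each method recovers them.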

Hope I have answered your query.
Happy to help :slight_smile:


#3

Thank you, @saurabh090909.