Imputing missing values with -999?

missing_values

#1

Which is the best method for imputation?

Should I impute with the mean, median, or mode, or with -999?

What exactly does imputing with -999 mean?


#2

Imputation depends entirely on what your data looks like; in most cases we use the median. As for -999, I have never heard of imputing with it. I think there might be some misunderstanding.


#3

Hi @ashokasr143,

It depends on the column whose missing values you are imputing.

For categorical columns, you can fill the missing values with the mode.
For continuous variables, if there are a large number of outliers, it's better to use the median than the mean (the mean is sensitive to outliers).
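
To make the mode/median suggestion concrete, here is a minimal pandas sketch. The DataFrame and its column names (`city`, `income`) are made up for illustration and are not from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names and values are assumptions.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", None, "Delhi", "Delhi"],
    "income": [30.0, 32.0, np.nan, 35.0, 900.0],  # 900.0 is an outlier
})

# Categorical column: fill with the most frequent value (the mode).
city_mode = df["city"].mode()[0]
df["city"] = df["city"].fillna(city_mode)

# Continuous column: compare the mean and the median of the observed values.
income_mean = df["income"].mean()      # pulled up by the outlier
income_median = df["income"].median()  # barely affected by the outlier

# Fill with the median because the column contains an outlier.
df["income"] = df["income"].fillna(income_median)

print(income_mean, income_median)
print(df)
```

In this toy column the single outlier drags the mean far above the typical values, while the median stays close to them, which is why the median is the safer fill value here.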

Here is an article that covers missing value imputation; do have a look.


#4

Missing values can be imputed in different ways depending on the data.

If the variable is quantitative, go for the mean or median. If the quantitative variable has outliers, go for the median (outliers make the variable skewed, and a skewed variable does not follow a normal distribution, so the median represents it better than the mean).

If the variable is qualitative, go for the mode.

But before imputing missing values, check your model's accuracy on the data without making any changes.
Then check the accuracy again after removing the missing values.
Then check it once more after imputing the missing values.

Proceed with whichever method gives the best accuracy.

refer this