Does outlier treatment come first or missing value imputation?

outliers
cleaning
missing_values

#1

Is it preferable to do outlier treatment (say, using capping and flooring) first and then do missing value imputation, rather than the other way around?

I feel that doing outlier treatment first makes the distribution closer to normal, after which the measures of central tendency can be used for imputation.

Also, how do I decide what the cap and floor values should be for outlier treatment? Is there a standard norm for it?


#2

@vns1311 - I think you should perform missing value treatment twice: once before outlier treatment and once after it. In the first pass, fill all missing values with appropriate values. The outlier treatment then removes the outlier records, and if it leaves any blanks where the outliers were, a second pass of missing value treatment can fill them.
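
A rough sketch of that three-step order in pandas, just to make it concrete. The median fill, the 1.5 × IQR rule for flagging outliers, the helper names and the sample numbers are all my own assumptions for illustration; swap in whatever imputation and outlier rule suit your data.

```python
import numpy as np
import pandas as pd


def impute_median(s: pd.Series) -> pd.Series:
    """Fill missing values with the median of the observed values."""
    return s.fillna(s.median())


def blank_outliers_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Replace values outside the IQR fences with NaN (assumed rule, k = 1.5)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # where() keeps values inside the fences and sets the rest to NaN
    return s.where(s.between(lower, upper))


# Made-up sample: two missing values and one obvious outlier (500)
values = pd.Series([10.0, 14.0, np.nan, 12.0, 16.0, 500.0, 11.0, np.nan, 13.0, 15.0])

step1 = impute_median(values)      # first pass: fill the original gaps
step2 = blank_outliers_iqr(step1)  # outlier treatment leaves a blank
step3 = impute_median(step2)       # second pass: fill the blank left behind

print(step3.tolist())
```

Note that the first fill is computed while the outlier is still in the data. A median is fairly robust to that, a mean would not be.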

Hope this helps!

Regards,
Hinduja


#3

I think we should impute missing values first and then go for outlier treatment.


#4

Do outlier treatment first, and then missing data imputation.

The reason is that outliers distort the imputation itself: for example, a mean used to fill the gaps is pulled toward the extreme values.
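
A small sketch of the effect, assuming mean imputation and capping/flooring at the 1.5 × IQR fences (one common convention, not a universal standard; fixed percentiles such as the 1st and 99th are another). The function name and the sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd


def cap_and_floor_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values to the IQR fences; missing values stay missing."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)  # quantile() skips NaN
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Made-up sample: two missing values and one obvious outlier (500)
values = pd.Series([10.0, 14.0, np.nan, 12.0, 16.0, 500.0, 11.0, np.nan, 13.0, 15.0])

# Mean computed with the outlier still present: dragged far above typical values
naive_fill = values.mean()

# Cap and floor first, then compute the mean used for imputation
capped = cap_and_floor_iqr(values)
robust_fill = capped.mean()
imputed = capped.fillna(robust_fill)

print(f"fill value before outlier treatment: {naive_fill}")
print(f"fill value after outlier treatment:  {robust_fill}")
```

On this sample the fill value drops from about 74 to about 14 once the outlier is capped, which is much closer to the bulk of the data.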