Mice Package to impute missing values - Decode The Dalai Lama!




I am using the Mi package to impute missing values, but it is taking a very long time.

Could anyone tell me what the best way to impute missing values in the data would be?



There are various methods for dealing with missing data. Here are the most common ones:

  1. For continuous variables, impute with the mean, median, or mode of the variable. Which one to pick depends on the variable's distribution (for example, the median is more robust than the mean when the data are skewed).
  2. For categorical variables, you can pick the value with the highest frequency in the variable.
  3. Look for a relationship between the variable with missing values and the other variables. If there is one, impute the missing values based on that relationship. For example, in the Titanic survival challenge, you can impute missing values of age based on the salutation in the name (Mr, Master, Miss, Mrs, Col, Doc).
  4. You can build a separate model to impute the missing values.
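
To make methods 1 to 3 concrete, here is a minimal sketch in plain Python (standard library only). The toy records, the column names (`title`, `age`, `embarked`), and the helper function names are all made up for illustration; this is not how the Mi or Mice packages work internally.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical toy records; None marks a missing value.
rows = [
    {"title": "Mr", "age": 30.0, "embarked": "S"},
    {"title": "Mr", "age": None, "embarked": "S"},
    {"title": "Mr", "age": 40.0, "embarked": None},
    {"title": "Master", "age": 4.0, "embarked": "C"},
    {"title": "Master", "age": None, "embarked": "S"},
    {"title": "Master", "age": 6.0, "embarked": "C"},
]


def impute_median(rows, col):
    """Method 1: fill missing numeric values with the column median."""
    fill = median(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]


def impute_mode(rows, col):
    """Method 2: fill missing categories with the most frequent value."""
    counts = Counter(r[col] for r in rows if r[col] is not None)
    fill = counts.most_common(1)[0][0]
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]


def impute_group_median(rows, col, by):
    """Method 3: fill missing values with the median of the row's group."""
    groups = defaultdict(list)
    for r in rows:
        if r[col] is not None:
            groups[r[by]].append(r[col])
    # Fall back to the overall median for groups with no observed values.
    overall = median(v for vals in groups.values() for v in vals)
    fills = {key: median(vals) for key, vals in groups.items()}
    return [
        {**r, col: fills.get(r[by], overall) if r[col] is None else r[col]}
        for r in rows
    ]


by_overall = impute_median(rows, "age")
by_group = impute_group_median(rows, "age", by="title")
by_mode = impute_mode(rows, "embarked")
```

In this toy data the overall median age mixes adults and children, while the group-wise version fills the missing "Mr" age from the other "Mr" rows and the missing "Master" age from the other "Master" rows, which is the idea behind the Titanic example in point 3.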

For more detail on missing value treatment, I would suggest referring to this article.




Thank you very much for the reply!

I wanted to know specifically why the Mi package takes so much time to impute the values.

I went through the article as well, but in real-world scenarios (especially insurance pricing), we don't impute missing values at the data preparation stage. That requires a lot of business judgement, checks of the initial model parameters, and attention to the regulatory guidelines for the specific business geography.