How to handle missing values of categorical variables for unsupervised data?

I’m doing a project on a hotel review dataset. One of the columns is location, and it has more than 20% missing values.
Since more than 20% of the values are missing, I don’t think mode imputation will work.

Hi mukul,
Can you share the columns and a few rows of data from your project?

hotel_name 0
num_reviews_reviewer 0
review_body 0
review_date 0
review_title 0
reviewer_name 0
reviewer_rating 0
visit_date 26
entire_review 0
Customer_Name 0
customer_location 1560

There is no pin code or address here. We could delete the location column, but before that, try mode imputation, backward filling, and forward filling once, and compare the resulting models.
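To make the three fill strategies concrete, here is a minimal pandas sketch. The column names come from the list above, but the rows are made up for illustration, so treat the data as an assumption and not as the real dataset.

```python
import pandas as pd

# Toy stand-in for the review data; the values are invented for illustration.
df = pd.DataFrame({
    "reviewer_rating": [5, 4, 2, 5, 3, 4],
    "customer_location": ["India", None, "UK", None, "India", None],
})

col = "customer_location"

# Mode imputation: every gap gets the single most frequent location.
mode_filled = df[col].fillna(df[col].mode()[0])

# Forward fill: copy the last seen value downward.
ffill_filled = df[col].ffill()

# Backward fill: copy the next seen value upward.
# A gap in the last row has nothing after it, so it stays missing.
bfill_filled = df[col].bfill()

comparison = pd.DataFrame({
    "original": df[col],
    "mode": mode_filled,
    "ffill": ffill_filled,
    "bfill": bfill_filled,
})
print(comparison)
```

Keep in mind that forward and backward filling depend on row order, so they only make sense if neighbouring rows are related in some way. Mode imputation pushes every missing row into the largest group, which can distort per-location statistics when 20%+ of the rows are affected.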

I want the location for visualization purposes.
What I mean is that after prediction I want to make some graphs showing in which countries customers are more satisfied, and so on.

For the graphs, you can remove the rows with a NaN location and plot the percentage of satisfied customers per location. You will get approximate values for the presentation.
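A rough sketch of that idea in pandas is below. The data is invented, and the rule that a rating of 4 or more counts as "satisfied" is only an assumption for the example; use whatever definition fits your project.

```python
import pandas as pd

# Toy data; the values are invented for illustration.
df = pd.DataFrame({
    "reviewer_rating": [5, 4, 2, 5, 1, 4, 3],
    "customer_location": ["India", "India", "UK", None, "UK", None, "India"],
})

# Keep only the rows where the location is known.
known = df.dropna(subset=["customer_location"])

# Assumption: a rating of 4 or higher means the customer was satisfied.
SATISFIED_THRESHOLD = 4
satisfied = known["reviewer_rating"] >= SATISFIED_THRESHOLD

# Share of satisfied customers per location, as a percentage.
pct_satisfied = (
    satisfied.groupby(known["customer_location"]).mean().mul(100).round(1)
)
print(pct_satisfied)
```

The resulting Series can be plotted directly, for example with `pct_satisfied.plot(kind="bar")` if matplotlib is installed. The percentages are only approximate because the rows without a location are left out.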

I was thinking of predicting the missing values using KNN.

What rock bt said is a correct approach. You can also use KNN to fill in the missing values.
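One possible way to do the KNN idea is to treat the location as a label and train a classifier on the rows where it is known. The sketch below uses scikit-learn's `KNeighborsClassifier`. The choice of `reviewer_rating` and `num_reviews_reviewer` as features, the value `n_neighbors=3`, and the helper column `location_was_imputed` are all assumptions for illustration, and the rows are made up. On real data these two features may say very little about where a customer is from.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Toy data; the values are invented for illustration.
df = pd.DataFrame({
    "reviewer_rating": [5, 5, 4, 1, 2, 1, 5, 1],
    "num_reviews_reviewer": [10, 12, 11, 2, 1, 3, 11, 2],
    "customer_location": ["India", "India", "India", "UK", "UK", "UK", None, None],
})

# Assumption: these numeric columns are used as features.
features = ["reviewer_rating", "num_reviews_reviewer"]
missing = df["customer_location"].isna()

# Fit on the rows with a known location.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(df.loc[~missing, features], df.loc[~missing, "customer_location"])

# Predict a location for the rows where it is missing.
imputed = df.copy()
imputed.loc[missing, "customer_location"] = knn.predict(df.loc[missing, features])

# Flag the filled rows so they can be shown separately in graphs.
imputed["location_was_imputed"] = missing
print(imputed)
```

scikit-learn's `KNNImputer` works on numeric features, which is why a classifier is used here for the categorical column. On real data you would normally scale the features first, because KNN is distance based, and check the accuracy on a held-out part of the known rows before trusting the predicted locations.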

© Copyright 2013-2019 Analytics Vidhya