Is it good practice to remove observations with very low frequency from the data?

data_wrangling
data_science
histogram

#1

Hi,

Suppose while exploring some data, I see the histogram of a variable like this one

Is it good and helpful practice to reassign the observations with very low frequencies to the values with higher frequencies, or even to remove those observations? Does this help improve models?

Thanks


#2

@Aditya_Sharma

The answer depends on the case, specifically on how much information these low-frequency observations capture.

Even if the volume is low, you can have a pocket of very high signal, which can be a micro-segment in itself. So, while removing low-frequency observations can be a good option, it is definitely not always the right one.
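One way to check this before dropping or merging anything is to compare the target rate of each rare value with the overall rate. Below is a minimal sketch, assuming a discrete (or binned) variable and a binary 0/1 target. The function name `rare_level_report`, the `min_count` threshold, and the toy data are all hypothetical and only for illustration.

```python
from collections import defaultdict


def rare_level_report(values, targets, min_count=5):
    """Return {level: (count, target_rate)} for levels seen fewer than min_count times."""
    counts = defaultdict(int)
    positives = defaultdict(int)
    for value, target in zip(values, targets):
        counts[value] += 1
        positives[value] += target
    return {
        level: (n, positives[level] / n)
        for level, n in counts.items()
        if n < min_count
    }


# Hypothetical toy data: levels "C" and "D" are both rare.
values = ["A"] * 50 + ["B"] * 45 + ["C"] * 3 + ["D"] * 2
targets = [0, 1] * 25 + [0] * 40 + [1] * 5 + [1] * 3 + [0] * 2

overall_rate = sum(targets) / len(targets)
report = rare_level_report(values, targets)

print(f"overall target rate: {overall_rate:.2f}")
for level, (n, rate) in sorted(report.items()):
    print(f"level {level}: n={n}, target rate={rate:.2f}")
```

If a rare value's target rate is far from the overall rate, it may be one of those high-signal pockets and worth keeping as its own segment. With only a handful of rows the rate is noisy, though, so treat it as a prompt for a closer look rather than proof.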

Hope this helps.

Kunal