I am new to data science and am learning regression techniques using R and SAS. I am working on a data set to develop a predictive model of credit card spending. I have close to 130 variables, including log-transformed versions of some of the original variables. There are a number of missing values in the log variables (and in the originals), for two reasons:
- A missing value in the original variable
- A value of ‘0’ in the original variable, which makes sense from a business perspective. For example, for toll-free calling last month, some customers may not have made any toll-free calls, so a zero in the data set is legitimate. Since the log of zero is undefined, the log variable is missing for these records.
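To make the two cases concrete, here is a minimal sketch of how both end up as a missing log value. It is written in Python rather than R or SAS purely for illustration, and the variable name `toll_free_calls`, the helper `safe_log`, and the sample values are all made up, not taken from my actual data.

```python
import math

# Hypothetical values for one variable (toll-free calls last month).
# None stands for a missing value; 0.0 is a genuine "no calls" record.
toll_free_calls = [12.0, 0.0, None, 3.5, 0.0]


def safe_log(x):
    """Return log(x), or None when the log cannot be computed
    (input is missing or not strictly positive)."""
    if x is None or x <= 0:
        return None
    return math.log(x)


log_calls = [safe_log(x) for x in toll_free_calls]

# Record why each log value is missing, so the two cases stay distinguishable.
reason = [
    "missing_original" if x is None
    else "zero_original" if x == 0
    else "ok"
    for x in toll_free_calls
]

for x, lx, r in zip(toll_free_calls, log_calls, reason):
    print(x, lx, r)
```

Both the zero records and the truly missing record come out with no log value, even though only the latter is missing in the business sense.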
I understand that log transformation is an important part of regression techniques, but how should I deal with this situation when doing outlier and missing value treatment?