How to handle numeric values in a Y/N variable in R

missing_values

#1

Hi Team,
I have a data set with Y/N values and also a numeric value. How should I handle such unexpected values?
The data is as below.

OWN_OCCUPIED
Y
N
N
12
Y
Y

Y
Y

The column above has one missing value and one numeric value. I want to replace both of them with NA. How can this be done in R? I tried the code below but couldn’t get to a solution.

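e <- character(0)
i <- 1

while (i <= nrow(df2)) {
  # Column 3 is assumed to be OWN_OCCUPIED; as.character() avoids factor codes
  e[i] <- as.character(df2[i, 3])
  # isTRUE() treats an NA comparison as FALSE, so if() does not stop with an error
  if (isTRUE(e[i] == "Y")) {
    print("Accepted")
  } else if (isTRUE(e[i] == "N")) {
    print("ACCEPTED")
  } else {
    print("Reject")
  }
  i <- i + 1
}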


#2

Before you think about how to handle this scenario, it is necessary to spend time on the following:

  • Understand why the values are there, i.e. is it a source data error or an error in how the data is being read (this can happen with delimited files if they are not handled properly)? You can probably use Excel (or OpenOffice) for this … this is the exploratory part of your work
  • Check the distribution / quantity of the values you have. You can do this with unique(df2[,3]) or table(df2[,3])
  • I can’t stress enough the need to keep questioning why you’re seeing numeric values there. If they are indeed valid values, interpreting them correctly is a subject domain problem (i.e. related to the source system you are getting this data from, or to how the business you’re working for interprets these values)
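
As a rough illustration of the second point, the sketch below tabulates the column and lists the rows whose value is neither Y nor N. The data frame is a toy one rebuilt from the values in the question. Addressing the column by the name OWN_OCCUPIED instead of by position 3, and treating the blank entry as an empty string, are assumptions about your data.

```r
# Toy data rebuilt from the values shown in the question (assumption:
# the blank entry is an empty string; it may already be NA in your data).
df2 <- data.frame(
  OWN_OCCUPIED = c("Y", "N", "N", "12", "Y", "Y", "", "Y", "Y"),
  stringsAsFactors = FALSE
)

# Frequency of every distinct value, including NA if any are present.
value_counts <- table(df2$OWN_OCCUPIED, useNA = "ifany")
print(value_counts)

# Row positions whose value is not one of the expected codes.
# %in% returns FALSE for NA, so missing values are flagged as well.
unexpected_rows <- which(!(df2$OWN_OCCUPIED %in% c("Y", "N")))
print(df2[unexpected_rows, , drop = FALSE])
```

Looking at the flagged rows alongside the neighbouring columns is often enough to tell whether a value has shifted in from another field during import.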

You have a bit of work cut out for you. Unfortunately, unless the groundwork above is done, posting this question on forums will be of little help.
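
If the groundwork confirms that the stray entries really are invalid, the replacement asked about in the question can be done without a loop. This is only a minimal sketch under that assumption. The toy data frame, the column name OWN_OCCUPIED and the helper name valid_codes are all illustrative and not taken from your actual data.

```r
# Toy data rebuilt from the values shown in the question (assumption:
# the blank entry is an empty string).
df2 <- data.frame(
  OWN_OCCUPIED = c("Y", "N", "N", "12", "Y", "Y", "", "Y", "Y"),
  stringsAsFactors = FALSE
)

# Hypothetical list of the only codes considered valid.
valid_codes <- c("Y", "N")

# as.character() guards against the column having been read as a factor.
own <- as.character(df2$OWN_OCCUPIED)

# Anything that is not a valid code (numbers, blanks, existing NAs) becomes NA.
own[!(own %in% valid_codes)] <- NA_character_
df2$OWN_OCCUPIED <- own

print(df2$OWN_OCCUPIED)
```

Note that this stores a real missing value (NA) and not the text 'NA', so functions such as is.na() will recognise it.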