How to use na.roughfix to handle missing values in data

r
randomforest

#1

I am studying random forests and learned that a random forest cannot be fit to data that contains missing values. I came across the function na.roughfix for dealing with missing values, but I have not been able to use it in R. This is the structure of my data:

'data.frame': 891 obs. of 12 variables:
 $ PassengerId: int 1 2 3 4 5 6 7 8 9 10 ...
 $ Survived   : int 0 1 1 1 0 0 0 0 1 1 ...
 $ Pclass     : int 3 1 3 1 3 3 1 3 3 2 ...
 $ Name       : Factor w/ 891 levels "Abbing, Mr. Anthony",..: 109 191 358 277 16 559 520 629 417 581 ...
 $ Sex        : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
 $ Age        : num 22 38 26 35 35 NA 54 2 27 14 ...
 $ SibSp      : int 1 1 0 1 0 0 0 3 0 1 ...
 $ Parch      : int 0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket     : Factor w/ 681 levels "110152","110413",..: 524 597 670 50 473 276 86 396 345 133 ...
 $ Fare       : num 7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin      : Factor w/ 148 levels "","A10","A14",..: 1 83 1 57 1 1 131 1 1 1 ...
 $ Embarked   : Factor w/ 4 levels "","C","Q","S": 4 2 4 4 4 3 4 4 4 2 ...


#2

Hi @hinduja1234,

na.roughfix is a function in the randomForest package that imputes missing values rather than removing them. It works in two ways: if a variable is numeric, its NAs are replaced by the column median, and if it is categorical (a factor), they are replaced by the most frequently occurring level.
To apply it, pass na.action = na.roughfix to randomForest when you fit the model with a formula. You can also call na.roughfix on the data frame yourself to get an imputed copy.
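Here is a minimal sketch of both approaches. It assumes the randomForest package is installed and uses only the first ten rows shown in the question for Survived, Pclass, Sex and Age. The names train, fit and train_fixed are placeholders, and ntree = 50 is an arbitrary choice to keep the example quick.

```r
library(randomForest)

# First ten rows of four columns, copied from the str() output in the question.
train <- data.frame(
  Survived = factor(c(0, 1, 1, 1, 0, 0, 0, 0, 1, 1)),
  Pclass   = c(3, 1, 3, 1, 3, 3, 1, 3, 3, 2),
  Sex      = factor(c("male", "female", "female", "female", "male",
                      "male", "male", "male", "female", "female")),
  Age      = c(22, 38, 26, 35, 35, NA, 54, 2, 27, 14)
)

set.seed(1)  # random forests are randomized; fix the seed for repeatability

# Option 1: let randomForest impute while fitting (formula interface).
fit <- randomForest(Survived ~ Pclass + Sex + Age, data = train,
                    na.action = na.roughfix, ntree = 50)

# Option 2: impute first, then inspect or reuse the completed data frame.
train_fixed <- na.roughfix(train)
train_fixed$Age[6]  # the NA is replaced by the median of the observed ages
anyNA(train_fixed)  # no missing values remain
```

On your full data, keep in mind that na.roughfix only fills in real NA values. The empty-string level "" in Cabin and Embarked is an ordinary factor level to R, so it is left alone unless you convert it to NA first. Factors with hundreds of levels such as Name and Ticket are also likely to need dropping or recoding before randomForest will accept them as predictors.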
Hope this helps!!