Error in randomForest.default(m, y, ...) : NA/NaN/Inf in foreign function call (arg 1) in R

r
kaggle
random_forest

#1

Hello,

I was trying to fit a randomForest model on the Bike Sharing Demand problem from Kaggle, using some of the given variables and some engineered variables. One of the engineered variables, day, is built like this:

date <- substr(train$datetime, 1, 10)
day <- weekdays(as.Date(date))
train$day <- day

and the model:

library(randomForest)
set.seed(415)
fit <- randomForest(registered ~ humidity + weather + atemp + day_type + hour + day, data = train, importance = TRUE, ntree = 200)

Then I get the following error and warning:

Error in randomForest.default(m, y, ...) :
NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In data.matrix(x) : NAs introduced by coercion

The error does not occur when I leave out the day variable. Why does it occur, and how can I get rid of it?

Thanks


#2

It seems you may have missing values in the day variable. Do you? You can check using is.na().
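A minimal sketch of that check, assuming a toy data frame in place of the real train data (the values below are made up for illustration, including the deliberately missing datetime):

```r
# Toy stand-in for the real training data (hypothetical values).
train <- data.frame(
  datetime = c("2011-01-01 00:00:00", "2011-01-02 01:00:00", NA),
  stringsAsFactors = FALSE
)
train$day <- weekdays(as.Date(substr(train$datetime, 1, 10)))

# Count the missing values in the day column.
n_missing_day <- sum(is.na(train$day))
print(n_missing_day)

# Count the missing values in every column at once.
missing_per_column <- colSums(is.na(train))
print(missing_per_column)
```

If both counts are zero on the real data, missing values are probably not the cause and the column type is the next thing to look at.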


#3

This error generally occurs in randomForest for one of the following reasons:

  • A variable passed to the model is of type character
  • The data contains actual NaN or Inf values
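Both causes can be checked quickly before fitting. The following is only a sketch on a toy data frame (the column values are invented, and has_non_finite is just an illustrative name):

```r
# Toy data frame with one character column and one Inf value (hypothetical).
train <- data.frame(
  humidity = c(81, 80, Inf),
  day      = c("Saturday", "Sunday", "Monday"),
  stringsAsFactors = FALSE
)

# Class of each column: any "character" entry is a candidate culprit.
column_classes <- sapply(train, class)
print(column_classes)

# Flag numeric columns that contain NA, NaN, Inf, or -Inf.
has_non_finite <- sapply(train, function(col) {
  is.numeric(col) && any(!is.finite(col))
})
print(has_non_finite)
```

Here day shows up as character and humidity is flagged as non-finite, which covers the two cases listed above.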

I see you have passed the day variable, which was created using the weekdays() function. weekdays() returns a character vector:

test <- c("2015-02-01", "2015-06-01")
day <- weekdays(as.Date(test))
str(day)
#  chr [1:2] "Sunday" "Monday"

What you need to do is convert your day variable into a factor using:

train$day <- as.factor(train$day)

This should solve your problem.
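If several engineered variables are character, they can all be converted in one step. This is a sketch on a toy data frame; the season_label column is hypothetical and only there to show a second character column being handled:

```r
# Toy stand-in for the training data (hypothetical values and columns).
train <- data.frame(
  day          = weekdays(as.Date(c("2015-02-01", "2015-06-01"))),
  season_label = c("winter", "summer"),
  humidity     = c(81, 60),
  stringsAsFactors = FALSE
)

# Find the character columns and convert each of them to a factor.
char_cols <- sapply(train, is.character)
train[char_cols] <- lapply(train[char_cols], as.factor)

# Confirm the result: day and season_label are now factors.
str(train)
```

It is best to apply this only to columns with a small number of distinct values. A raw timestamp column such as datetime would turn into a factor with one level per row, which is not useful as a predictor.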