How to resolve error NA/NaN/Inf in foreign function call (arg 6) in KNN

knn

#1

I am getting this error when I try to run the knn algorithm. There are no missing values; I imputed them with -1. It still gives me this error. Can anyone help me with it?

library(caret)  # createDataPartition
library(class)  # knn

train <- read.csv("train.csv")  # 128 variables in total

train <- train[-1]  # drop the first column (Id)

# Create partition

ind <- createDataPartition(train$Response, p = 0.70, list = FALSE)

dat_train <- train[ind, -127]   # leave your target variable out
dat_test <- train[-ind, -127]   # leave your target variable out
cl <- train[ind, 127]           # your target variable for the train set
knn(dat_train, dat_test, cl, k = 5, prob = TRUE)

Error in knn(train = prc_train, test = prc_test, cl = prc_train_labels, :
NA/NaN/Inf in foreign function call (arg 6)
In addition: Warning messages:
1: In knn(train = prc_train, test = prc_test, cl = prc_train_labels, :
NAs introduced by coercion
2: In knn(train = prc_train, test = prc_test, cl = prc_train_labels, :
NAs introduced by coercion

@shuvayan @Lesaffrea @Aarshay Can you help me with this?


#2

Hello @Rohit_Nair,

I have a question:
Why is the data called dat_train in your knn call, but train = prc_train in the error message?


#3

Sorry, I posted the wrong code here.

prc_train <- train[1:59381, ]
prc_test <- test[1:19765, ]

prc_train_labels <- train[1:59381, 127]
prc_test_labels <- train[1:19765, 127]

prc_test_pred <- knn(train = prc_train, test = prc_test, cl = prc_train_labels, k = 11, prob = TRUE)

This was the code. By mistake I copied different code that I was trying to run.


#4

Error resolved :)

What I did was convert all integer columns and one factor variable to numeric. The data had a factor variable named product_Info_2 (A1, A2, A3, A4, B1, B2, B3, and so on). I converted it into numeric values (1, 2, 3, 4, 5, 6, 7, 8, and so on). That's it, the error was gone.

So basically, what I understood is that if you get this error, you should check these things:

1] There should be no missing values in the data. Use sum(is.na(x)) to check. If there are any missing values, impute or remove them.

2] Check whether there are any factor variables. If there are, convert them to numeric.
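
The two checks above can be sketched in base R roughly as follows. This is only an illustration: the toy data frame and the Ins_Age values are invented, and only the product_Info_2 name and the -1 imputation come from the thread.

```r
# Toy stand-in for the real training data (values are invented).
train <- data.frame(
  product_Info_2 = factor(c("A1", "B2", "A1", "C3")),
  Ins_Age        = c(0.64, NA, 0.15, 0.42),
  Response       = c(8L, 4L, 8L, 1L)
)

# 1] Count missing values per column, then impute (here with -1, as in the thread).
na_counts <- colSums(is.na(train))
train$Ins_Age[is.na(train$Ins_Age)] <- -1

# 2] Find non-numeric columns and replace each with its integer level codes.
non_numeric <- names(train)[!sapply(train, is.numeric)]
for (col in non_numeric) {
  train[[col]] <- as.integer(as.factor(train[[col]]))
}

str(train)  # every column should now be int or num
```

Integer codes impose an ordering on the levels (A1 < B2 < C3), which may or may not be what you want for a distance calculation.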


#5

Great.
I hope you now understand that whenever distance-based algorithms are involved, all factor variables need to be converted to numeric.
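
As a rough illustration of why a factor column leads to this particular error: as far as I can tell, knn converts its inputs to a matrix and then to doubles, and a data frame containing a factor becomes a character matrix, so the level labels turn into NA during numeric coercion. The sketch below uses invented data; the one-hot encoding with model.matrix is my own suggestion as an alternative to integer codes, not something from the thread.

```r
# Invented data: one factor column and one numeric column.
df <- data.frame(
  product_Info_2 = factor(c("A1", "B2", "A1", "C3")),
  x              = c(0.1, 0.5, 0.9, 0.3)
)

# A single factor column makes the whole matrix character ...
m <- as.matrix(df)
is.character(m)

# ... and coercing the labels to numeric yields NA, normally with the
# "NAs introduced by coercion" warning (suppressed here).
bad <- suppressWarnings(as.numeric(m[, "product_Info_2"]))

# One-hot encoding: one 0/1 column per level, with no artificial ordering.
dummies <- model.matrix(~ product_Info_2 - 1, data = df)
numeric_df <- cbind(as.data.frame(dummies), x = df$x)
str(numeric_df)
```

With one-hot columns, two different levels are always the same distance apart, whereas integer codes make A1 and C3 look further apart than A1 and B2.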


#6

@shuvayan Yes, I understood :)

Now I want to plot a graph of the result obtained. How can I do it? Can you help me with the R code for that?


#7

I tried running this code:
nng(prc_test_pred_df, dx = NULL, k = 11, mutual = TRUE, method = NULL)

It has been running for more than an hour and still hasn't given me the plot. Does it generally take this long?

Number of observations = 60K
Number of variables = 127

prc_test_pred is the predicted test data.


#8

Hi,

Can I use KNN for record linkage, which involves text data?
I have a very simple dataset consisting of just three columns: Name 1, Name 2 and Match Indicator.
If Name 1 and Name 2 match, the indicator is Y, otherwise N.

I tried running the algorithm, but I get the foreign function call (arg 6) error.
I am not sure whether it works with strings. If not, could you suggest another machine learning algorithm?

Regards,

Vilas
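
One possible way to approach this, offered as a sketch rather than a tested recipe: knn needs numeric inputs, so each name pair can first be turned into a numeric feature such as a normalised edit distance (adist from base R), and knn from the class package can then be run on that feature. The data, the column names name1/name2/match and the choice of k = 3 are all invented for illustration.

```r
library(class)  # knn

# Invented example pairs; match is the Y/N indicator.
pairs <- data.frame(
  name1 = c("Jon Smith", "Mary Jones", "Peter Pan", "Ann Lee", "Bob Stone", "Carl Young"),
  name2 = c("John Smith", "Marie Jones", "Wendy Darling", "Tom Hardy", "Bob Stone", "Zed Quill"),
  match = factor(c("Y", "Y", "N", "N", "Y", "N")),
  stringsAsFactors = FALSE
)

# Levenshtein distance between the two names of a pair,
# divided by the length of the longer name.
norm_dist <- function(a, b) {
  d <- mapply(function(x, y) adist(x, y)[1, 1], a, b, USE.NAMES = FALSE)
  d / pmax(nchar(a), nchar(b))
}

# knn expects matrices (or data frames), so keep the single feature as a column.
features <- matrix(norm_dist(pairs$name1, pairs$name2), ncol = 1)

# Classify a new, unseen pair.
new_pair <- matrix(norm_dist("Jon Smith", "Jon Smyth"), ncol = 1)
pred <- knn(train = features, test = new_pair, cl = pairs$match, k = 3)
pred
```

With only one distance feature, a simple threshold on norm_dist would probably do as well as knn; the sketch mainly shows that the strings have to be converted to numbers before knn can use them.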