How to resolve an R SVM text classification error

r

#1

R text classification sometimes gives the error below. What could be the reason?
Error in validObject(.Object) :
invalid class “matrix.csr” object: ra has too few, or too many elements
Calls: create_container … new -> initialize -> initialize -> .local -> validObject
Execution halted

I have noticed that this error occurs when none of the words in the test data match the training set. If at least one word matches, the error does not occur.


Sample code (just change the path to the training data):

library(RTextTools)

# Backslashes in Windows paths must be escaped inside R strings
directory <- "D:\\textminingORG\\nlp\\vocabulary\\"
dataText <- read.csv(paste(directory, "Vocabulary_Categorizationv3.txt", sep = ""), header = TRUE)
# Column names are case-sensitive; the CSV header is "Text"
dataMatrix <- create_matrix(dataText["Text"])
container <- create_container(dataMatrix, dataText$isred, trainSize = 1:9, virgin = FALSE)

model <- train_model(container, "SVM", kernel = "linear", cost = 1, gamma = 0.5)

predictionData <- list("dangerous be", "donated many", "generous mostly", "TV")

# Create a prediction document-term matrix.
# trace() opens create_matrix in an editor so it can be patched by hand (interactive).
trace("create_matrix", edit = TRUE)
predMatrix <- create_matrix(predictionData, originalMatrix = dataMatrix)

# Create the corresponding container
predSize <- length(predictionData)
predictionContainer <- create_container(predMatrix, labels = rep(0, predSize), testSize = 1:predSize, virgin = FALSE)

# Predict
results <- classify_model(predictionContainer, model)
results
plot(results)


Training data (Vocabulary_Categorizationv3.txt):

Text,income,isred
action,1000,0
animal_trade,2500,1
arms,500,1
arrest,200,1
attack,10,1
bank_fraud,2000,1
betting,50,1
black_money,300,1
blackmail,200,1
bogus,90,1


predictionData <- list("dangerous be", "donated many", "generous mostly", "TV")
gives the error, because none of these words occur in the training data.

predictionData <- list("attack", "dangerous be", "donated many", "generous mostly", "TV")
works, because "attack" occurs in the training data.
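
The observation above (the error appears only when no prediction word occurs in the training data) could be checked before the prediction matrix is built. The following is a minimal base-R sketch, not part of RTextTools: `has_known_term` is a hypothetical helper, and its tokenization (lowercasing and splitting on anything other than letters, digits and underscores) is an assumption that is simpler than whatever preprocessing `create_matrix` applies.

```r
# Hypothetical helper (not an RTextTools function): for each document,
# report whether at least one of its words appears in the vocabulary.
has_known_term <- function(docs, vocabulary) {
  vapply(docs, function(doc) {
    # Assumed tokenization: lowercase, split on non-word characters
    words <- unlist(strsplit(tolower(doc), "[^a-z0-9_]+"))
    any(words %in% vocabulary)
  }, logical(1), USE.NAMES = FALSE)
}

# A few terms taken from the training data in this post
vocabulary <- c("action", "animal_trade", "arms", "arrest", "attack")

failing <- has_known_term(
  c("dangerous be", "donated many", "generous mostly", "TV"), vocabulary)
working <- has_known_term(
  c("attack", "dangerous be", "donated many", "generous mostly", "TV"), vocabulary)

print(failing)  # FALSE FALSE FALSE FALSE
print(working)  # TRUE FALSE FALSE FALSE FALSE
```

If `any(failing)` is FALSE, the batch contains no known word and would presumably hit the error. With the script above, the vocabulary could probably be taken from `colnames(dataMatrix)` instead of being typed by hand, although the terms stored there depend on the preprocessing options used.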

thanks


#2

@sset - Could you please elaborate on the error and show the part of the code where it occurs?


#3

Hi,

predSize = length(predictionData)
predictionContainer <- create_container(predMatrix, labels=rep(0,predSize), testSize=1:predSize, virgin=FALSE)

gives the error if none of the test data matches the training data.
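
Until the cause is fixed, one way to keep a script from halting on this error might be to wrap the failing step in `tryCatch` and return a placeholder result. The sketch below uses only base R; `classify_or_fallback` and `demo_classifier` are hypothetical names, and `demo_classifier` is just a stand-in that fails under the same condition, not the real RTextTools pipeline.

```r
# Hypothetical wrapper: run classify_fn(docs); if it raises an error,
# report it and return one fallback value per document instead of halting.
classify_or_fallback <- function(classify_fn, docs, fallback = NA) {
  tryCatch(
    classify_fn(docs),
    error = function(e) {
      message("Classification skipped: ", conditionMessage(e))
      rep(fallback, length(docs))
    }
  )
}

# Stand-in classifier for demonstration only: it stops when no document
# contains the known word "attack", mimicking the reported failure.
demo_classifier <- function(docs) {
  hit <- grepl("attack", docs, fixed = TRUE)
  if (!any(hit)) {
    stop("ra has too few, or too many elements")
  }
  ifelse(hit, 1, 0)
}

no_match   <- classify_or_fallback(demo_classifier, c("dangerous be", "TV"))
with_match <- classify_or_fallback(demo_classifier, c("attack", "TV"))
```

In the real script, `classify_fn` would be a function that wraps the `create_matrix`, `create_container` and `classify_model` calls. This only hides the failure for batches with no matching words; it does not classify them.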


#4

Hi - is there any workaround for this issue? I need this working as well; kindly help.