Error: inherits(doc, "TextDocument") is not TRUE in R

r
wordcloud

#1

Hello,

I was trying the tutorial “Build a word cloud using text mining tools of R” and a line of code is causing an error.

Code->
dtm <- DocumentTermMatrix(docs)

error->
Error: inherits(doc, "TextDocument") is not TRUE

I am not able to figure out how to get rid of it. Please tell me what I am doing wrong.

Thank you.


#2

Hi @adityashrm21, we need context here. What kind of object is docs? Please paste the rest of the code.
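For anyone landing here with the same error, a quick way to answer that question is to inspect the corpus and one of its elements. This is only a sketch: the two sample strings are made up for illustration, and it assumes the tm package is installed.

```r
library(tm)

# Hypothetical sample corpus, standing in for the poster's docs object
docs <- VCorpus(VectorSource(c("Some sample text", "Another document")))

# Class of the corpus itself
print(class(docs))

# Class of a single element; DocumentTermMatrix() needs each element
# to inherit from "TextDocument"
print(class(docs[[1]]))

# The same check that appears in the error message
is_text_doc <- inherits(docs[[1]], "TextDocument")
print(is_text_doc)
```

If the last check comes back FALSE on your own corpus, some earlier step has replaced the documents with something else, such as plain character vectors.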


#4

cname <- file.path(".", "corpus", "target")
library(tm)
docs <- Corpus(DirSource(cname))
library(SnowballC)
for (j in seq(docs)) {
  docs[[j]] <- gsub("/", " ", docs[[j]])
  docs[[j]] <- gsub("@", " ", docs[[j]])
}

docs <- tm_map(docs, tolower)
docs <- tm_map(docs, removeWords, stopwords("english"))
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, stripWhitespace)
dtm <- DocumentTermMatrix(docs)


#5

It could be an issue with the version of tm you are using. If it is the latest or a very recent release, run the following command immediately after converting to lower case, before the remaining cleanup steps (removing stop words, numbers, punctuation, and extra whitespace).

docs <- tm_map(docs, PlainTextDocument)
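
An alternative that is often suggested for recent tm releases (0.6 and later) is to wrap plain string functions such as tolower and gsub in content_transformer(), so the corpus elements stay text documents throughout. The sketch below is an assumption about what would suit this thread, not a tested fix for the original data: the sample strings and the helper name to_space are made up for illustration.

```r
library(tm)

# Hypothetical sample corpus, standing in for the files under ./corpus/target
docs <- VCorpus(VectorSource(c("Hello World / R", "Text @ Mining 123")))

# Hypothetical helper: replace a pattern with a space while keeping
# each element a text document
to_space <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs <- tm_map(docs, to_space, "/")
docs <- tm_map(docs, to_space, "@")

# Wrap base R's tolower so it transforms the content, not the document object
docs <- tm_map(docs, content_transformer(tolower))

# tm's own transformations can be passed to tm_map directly
docs <- tm_map(docs, removeWords, stopwords("english"))
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, stripWhitespace)

# Each element still inherits from "TextDocument", so this step can proceed
dtm <- DocumentTermMatrix(docs)
print(dtm)
```

With this approach the for loop over docs[[j]] is replaced by the to_space calls, which avoids assigning raw character vectors back into the corpus.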

#6

Thanks Anon,

It was really helpful.


#7

Well, this worked for me too. A big thanks, Atul… :slight_smile: