RStudio and tm package - using the inspect() function

inspect
r
tm
rstudio

#1

I am using the tm package for text mining within RStudio. I used the readPDF() reader function to convert PDFs into text and loaded the result as our corpus.
I have installed the xpdf application so that readPDF() works.

docs <- Corpus(DirSource(cname), readerControl=list(reader=readPDF))

When I run the inspect() function on this corpus, I am left with a cleared console.
Has anyone encountered a similar problem? Any help is appreciated.


#2

I would first check whether the data import worked properly. I have not seen this error myself. However, I use this standard boilerplate code.

library(tm)

library(wordcloud)

txt2 <- "C:/Users/KUs/Desktop/new"

b <- Corpus(DirSource(txt2), readerControl = list(language = "eng"))

b <- tm_map(b, content_transformer(tolower)) #Changes case to lower case (content_transformer() keeps the documents valid in tm 0.6 and later)

b<- tm_map(b, stripWhitespace) #Strips White Space

b <- tm_map(b, removePunctuation) #Removes Punctuation

tdm <- TermDocumentMatrix(b)

m1 <- as.matrix(tdm)

v1<- sort(rowSums(m1),decreasing=TRUE)

d1<- data.frame(word = names(v1),freq=v1)

wordcloud(d1$word,d1$freq)
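To check whether the import worked before worrying about inspect(), something like the following could be tried. It is only a sketch: the temporary folder and the two sample files are made up so that the snippet runs on its own, and plain-text files stand in for the PDFs (so readPDF and xpdf are not exercised here). The tm part is skipped if the package is not installed.

```r
# Made-up sample folder standing in for the real `cname` directory.
cname <- file.path(tempdir(), "tm_import_check")
dir.create(cname, showWarnings = FALSE)
writeLines("First sample document.", file.path(cname, "a.txt"))
writeLines("Second sample document, a bit longer.", file.path(cname, "b.txt"))

# Step 1: does the folder contain the files you expect, and are they non-empty?
files <- list.files(cname, full.names = TRUE)
file_sizes <- file.size(files)
print(data.frame(file = basename(files), bytes = file_sizes))

# Step 2: did tm read one document per file, with actual text in each?
if (requireNamespace("tm", quietly = TRUE)) {
  docs <- tm::VCorpus(tm::DirSource(cname))
  print(length(docs))  # should equal length(files)
  texts <- sapply(docs, function(d) paste(as.character(d), collapse = " "))
  print(nchar(texts))  # a zero here means an empty document
}
```

If the document count or the character counts come back as zero for the real PDF folder, the problem is in the reading step rather than in inspect().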


#3

Sir, I have a related query. I am fetching tweets from Twitter, and I wrote this code:

tweets <- searchTwitter("#valentine", since = "2014-01-01", n = 1500, lang = "en")

tweets.text<-laply(tweets,function(t)t$getText())
list(tweets.text)

tweets.text <- gsub("(RT|via)((?:\\b\\W*@\\w+)+)", "", tweets.text)

tweets.text <- gsub("@\\w+", "", tweets.text)

tweets.text <- gsub("[[:punct:]]", "", tweets.text)

tweets.text <- gsub("[[:digit:]]", "", tweets.text)

tweets.text <- gsub("http\\w+", "", tweets.text)

tweets.text <- gsub("[ \t]{2,}", "", tweets.text)
tweets.text <- gsub("^\\s+|\\s+$", "", tweets.text)
tweetstext<-tolower(tweets.text)
tweetstext<-as.character(tweetstext)
Then I apply the emotion and polarity code. Sir, I know that removing @mentions, URLs, numbers and whitespace and converting upper case to lower case is normally done on a corpus, but I don't know how to convert plain tweets into a corpus. What is your suggestion? Are these functions sufficient for the cleaning, or do we have to transform the tweets into a corpus? If so, how do I do that? Really waiting for your answer, @ajay_ohri sir.


#4

For Twitter text mining with R, here is something a student of mine created:

Twitter analysis by Kaify Rais from Ajay Ohri
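On the question of turning cleaned tweets into a corpus, a minimal sketch is below. It is not taken from the slides. clean_tweets() is a hypothetical helper (not part of twitteR or tm) that wraps the same kind of gsub() steps, with two changes of my own: links are removed before punctuation, and repeated spaces are collapsed to a single space instead of being deleted. The sample tweets are made up. The corpus step uses VectorSource() from tm and is skipped if tm is not installed.

```r
# Hypothetical helper: the gsub() clean-up steps gathered in one function.
clean_tweets <- function(x) {
  x <- gsub("(RT|via)((?:\\b\\W*@\\w+)+)", "", x, perl = TRUE)  # RT / via markers
  x <- gsub("@\\w+", "", x, perl = TRUE)                        # remaining @mentions
  x <- gsub("http\\S+", "", x, perl = TRUE)                     # links
  x <- gsub("[[:punct:]]", "", x, perl = TRUE)                  # punctuation
  x <- gsub("[[:digit:]]", "", x, perl = TRUE)                  # numbers
  x <- gsub("[ \t]{2,}", " ", x, perl = TRUE)                   # runs of spaces/tabs
  x <- gsub("^\\s+|\\s+$", "", x, perl = TRUE)                  # leading/trailing space
  tolower(x)
}

# Made-up tweets standing in for the text returned by getText().
sample_tweets <- c("RT @user: I love #valentine 2014!! http://t.co/abc123",
                   "Happy day via @someone")
cleaned <- clean_tweets(sample_tweets)
print(cleaned)

# A character vector becomes a corpus through VectorSource():
# each element of the vector is treated as one document.
if (requireNamespace("tm", quietly = TRUE)) {
  tweet_corpus <- tm::VCorpus(tm::VectorSource(cleaned))
  tm::inspect(tweet_corpus)
}
```

Once the tweets are in a corpus, the usual tm_map() transformations and TermDocumentMatrix() can be applied to them in the same way as to documents read from files.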

#5

Hello Ajay sir,

It is awesome that within your busy schedule you find time to reply to us. Sir, I am unable to download the document you shared on Twitter analysis. Could you please give me the download link for it?

It is very urgent sir. Waiting for your reply. Thanks in advance again.

Regards

Sourav


#6

@santu_rcc014 the download link is on the site http://www.slideshare.net/ajayohri/twitter-analysis-by-kaify-rais


#7

@ajay_ohri
Dear Ajay Sir,
Thanks for your insightful comments and the resources you provided for text mining.