I am trying to replicate the code below for a single column of a pandas DataFrame in Python:
    # Create initial documents list:
    doc = []
    doc.append('It is a far, far better thing I do, than I have every done')
    doc.append('Call me Ishmael')
    doc.append('Is this a dagger I see before me?')
    doc.append('O happy dagger')
This is what I have so far:
    import pandas as pd

    review = pd.read_csv('/home/text/Downloads/reviews.csv')
    review_df = pd.DataFrame(review)

    # See the dimensions of the data frame:
    review_df.shape

    # Create tdm for each column after removing NA:
    trgt_col = review_df[pd.notnull(review_df['col1'])]
Ultimately, I would like to create a term-document matrix of the words in the specified column (here `col1`), with one row per term and one column per document.
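To make the goal concrete, here is a minimal sketch of the kind of result I mean, using only the standard library on a plain list of strings like `doc` above. The function name `build_tdm` and the tokenising regex are my own assumptions for illustration, not part of pandas or any other library. I assume the column could be fed in as a list with something like `trgt_col['col1'].tolist()`, but I have not verified that this is the right way to do it.

```python
import re
from collections import Counter


def build_tdm(documents):
    """Return (terms, matrix) where matrix[i][j] counts terms[i] in documents[j].

    Hypothetical helper for illustration only.
    """
    # Lowercase each document and split it into simple word tokens.
    counts = [Counter(re.findall(r"[a-z']+", text.lower())) for text in documents]
    # A sorted vocabulary over all documents gives a stable row order.
    terms = sorted(set().union(*counts))
    # One row per term, one column per document.
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix


doc = ['Call me Ishmael', 'Is this a dagger I see before me?', 'O happy dagger']
terms, matrix = build_tdm(doc)
for term, row in zip(terms, matrix):
    print(term, row)
```

Each printed line is a term followed by its count in every document, which is the layout I would like to end up with for the DataFrame column.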
Can someone please help me with this?