Comprehensive Guide to Text Summarization using Deep Learning in Python: change from CSV to TXT files

Hello,
I'm new to data science, and I have just read the great article by ARAVIND PAI.
I wonder how I can use a TXT file instead of a CSV file as the dataset.
Thank you for the help.

Hello @amirai14301

You can read txt files with pandas too:

import pandas as pd
data = pd.read_csv('data.txt')

Pass the appropriate sep argument to indicate how your columns are separated, for example sep='\t' for tab-separated columns.
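
To make the sep point concrete, here is a minimal sketch. It assumes a tab-separated file with a header row; the sample rows and the column names Text and Summary are made up for illustration and are not taken from your dataset.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: two tab-separated columns with a header row.
sample = (
    "Text\tSummary\n"
    "Great coffee, fast delivery\tGood coffee\n"
    "Too sweet for me\tToo sweet\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # sep="\t" tells pandas that columns are separated by tabs, not commas,
    # so the comma inside the first review does not split the row.
    data = pd.read_csv(path, sep="\t")

print(data.columns.tolist())
print(len(data))
```

If your file uses another delimiter (a semicolon or a pipe, say), pass that character as sep instead.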

Or, if you have a folder of individual text files, say:

data/
        file1.txt
        file2.txt
        file3.txt
        file4.txt
       .....

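You can read them as:

import os

text = []
for file in os.listdir("data"):
    # os.listdir returns bare file names, so join each one with the folder
    with open(os.path.join("data", file), "r") as f:
        text.append(f.read())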

Then you can create a pandas DataFrame from this list:

data = pd.DataFrame(text, columns=['text'])
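
Putting the steps together, a variant along these lines might be handy. It is only a sketch: load_text_files is a hypothetical helper name, the filename column is an extra I added, and UTF-8 is assumed as the file encoding. It also skips anything in the folder that is not a .txt file.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_text_files(directory):
    """Read every .txt file in `directory` into a DataFrame, one row per file."""
    rows = []
    # sorted() makes the row order predictable; glob("*.txt") skips other files.
    for path in sorted(Path(directory).glob("*.txt")):
        rows.append({"filename": path.name, "text": path.read_text(encoding="utf-8")})
    return pd.DataFrame(rows, columns=["filename", "text"])


# Small demonstration with made-up files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "file1.txt").write_text("first document", encoding="utf-8")
    Path(tmp, "file2.txt").write_text("second document", encoding="utf-8")
    Path(tmp, "notes.md").write_text("ignored", encoding="utf-8")
    data = load_text_files(tmp)

print(data)
```

Keeping the file name next to the text makes it easier to trace a bad row back to the file it came from.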

Thank you.
So after I build a pandas DataFrame from the TXT files,
should the code run with no problems, or are minor changes needed?

Hi,
When I try to read the files, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3131: invalid start byte

Another problem: on the data.drop_duplicates line, I get KeyError: 'Text'

Without seeing your actual code, it would be difficult to help you.

Maybe this could help you out -

For the KeyError: 'Text', the column 'Text' may not exist in your dataset, or its name may be misspelled. Column names are case-sensitive, so 'text' and 'Text' are different columns.
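
Without your code this is only a guess, but here is a sketch of how both errors could be approached. For the UnicodeDecodeError, it assumes the files might be in a Windows encoding such as cp1252 rather than UTF-8; the byte could equally come from a non-text file in the folder, so check that first. read_text is a hypothetical helper, not part of pandas. For the KeyError, it assumes the DataFrame has a lowercase text column while the later code expects Text.

```python
import os
import tempfile

import pandas as pd


def read_text(path, encodings=("utf-8", "cp1252")):
    """Try each encoding in turn; as a last resort, replace undecodable bytes."""
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError:
            continue
    # errors="replace" swaps bad bytes for U+FFFD instead of raising.
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return f.read()


# Made-up file containing byte 0x80, which is invalid as a UTF-8 start byte.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file1.txt")
    with open(path, "wb") as f:
        f.write(b"price: \x80 5")
    content = read_text(path)

# Column names are case-sensitive: check what the DataFrame really contains
# before calling drop_duplicates on a column that may not exist.
data = pd.DataFrame({"text": [content, content, "another document"]})
print(data.columns.tolist())

column = "Text"  # assumed name expected by the later code
if column not in data.columns and "text" in data.columns:
    data = data.rename(columns={"text": column})

data = data.drop_duplicates(subset=[column])
print(len(data))
```

Printing data.columns before the failing line is usually the quickest way to see which name the code should use.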

© Copyright 2013-2019 Analytics Vidhya