Error importing a .CSV file in Python

python

#1

Hi,

I am a beginner in Python. I am trying to import a .CSV file using the following code:

import pandas as pd
talent = pd.read_csv('Test_talent.csv')
print(talent.shape)

It gives me dimensions (2317, 82).

But if I import the same file in RStudio, its dimensions are 14388 observations and 82 columns.

How can I import this dataset correctly in Python?
I am really confused.

Thanks in advance


#2

Hi @premsheth,

I hope you have added the location of the csv file correctly. For example, I usually write it as

import pandas as pd
df=pd.read_csv("/home/aishwarya/Desktop/train.csv")  
df.shape

Check that and let me know if it works.


#3

@AishwaryaSingh

Thanks for the reply.
I tried both things:

  1. Opened Jupyter Notebook in the folder where this file is located
  2. Copied and pasted the correct path and tried to import

When I do the same thing in R it works fine, but I don't know what the problem is in Python.


#4

Hi @premsheth

Is it still showing an incorrect value, or are you getting an error? If you write the correct path and use .shape, it should work fine.

Try reading some other file in both python and R and see if you have the same problem.


#5

It's giving me incorrect dimensions.
The actual number of observations is 14388. R shows this correctly, but Python shows only 2317 observations.


#6

Please verify the same for some other file, in both Python and R.


#7

Thank you @AishwaryaSingh
OK, I will try again.


#8

Check whether the fields are separated by “,” or “;”.

talent = pd.read_csv('Test_talent.csv', sep=';')
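
If you are not sure which separator the file uses, a minimal sketch like the one below can guess it from the first few lines with the standard library's csv.Sniffer. The helper name guess_delimiter and the in-memory sample are assumptions for illustration, not your data; with the real file you would pass in the first few kilobytes read from Test_talent.csv.

```python
import csv


def guess_delimiter(sample_text, candidates=",;\t|"):
    """Return the delimiter that csv.Sniffer infers from a text sample."""
    dialect = csv.Sniffer().sniff(sample_text, delimiters=candidates)
    return dialect.delimiter


# Hypothetical sample standing in for the first lines of the real file.
sample = "id;name;score\n1;Ann;3.5\n2;Bob;4.0\n"
detected = guess_delimiter(sample)
print(repr(detected))
```

The detected character can then be passed as sep to pd.read_csv. The sniffer is a heuristic, so treat its answer as a hint rather than proof.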


#9

OK, thank you, sir, for the reply.

If I use sep = " ; " it gives me the following error:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 204, saw 2

If I use sep = " , " there is no change.


#10

@premsheth your steps are correct, but the problem could have the following causes:

  1. The separator (',')
  2. The encoding (Unicode)

Also try reading the file like this:

data = pd.read_csv('file1.csv', error_bad_lines=False)

(In pandas 1.3 and later, error_bad_lines is deprecated in favor of on_bad_lines='skip'.)

Please try a different encoding, or share a sample of the data set with us.
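
To narrow down the encoding, a rough sketch like the following tries a few candidates on the raw bytes and reports the first one that decodes without errors. The function name first_working_encoding and the candidate list are assumptions for illustration. Note that latin-1 never fails to decode, so it only serves as a last resort and does not prove the file really uses it.

```python
def first_working_encoding(raw_bytes, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes raw_bytes, else None."""
    for encoding in candidates:
        try:
            raw_bytes.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    return None


# Hypothetical bytes standing in for the contents of the real file.
sample_bytes = "name,city\nRené,Zürich\n".encode("cp1252")
print(first_working_encoding(sample_bytes))
```

With the real file, the bytes would come from open('Test_talent.csv', 'rb').read(), and the result could be passed as encoding= to pd.read_csv.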

#11

Are you sure the number of observations seen in R is correct?
R can sometimes add several NA-filled lines at the end if the file has been manipulated in certain ways in Excel. Make sure that all the observations seen in R are real and not inflated.
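
One way to check this without relying on either pandas or R is to count the rows directly. The sketch below uses only the standard csv module to compare physical lines with parsed records and to count records whose cells are all empty. The function name count_rows and the sample text are made up for illustration.

```python
import csv
import io


def count_rows(text, delimiter=","):
    """Return (physical_lines, parsed_records, blank_records) for CSV text."""
    physical_lines = len(text.splitlines())
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    records = list(reader)
    # A record is "blank" when every cell is empty or whitespace.
    blank_records = sum(
        1 for row in records if not any(cell.strip() for cell in row)
    )
    return physical_lines, len(records), blank_records


# Hypothetical sample: one quoted field spans two lines, two rows are empty.
sample = 'id,comment\n1,"line one\nline two"\n2,ok\n,\n,\n'
counts = count_rows(sample)
print(counts)
```

If the physical line count is higher than the record count, some quoted fields contain line breaks. If the blank record count is large, the file carries empty rows that one tool may count and another may not. For the real file, the text could be read with open('Test_talent.csv', newline='').read().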


#12

Can you please post the result of len(talent)?