New York Taxi Trip Duration

I am currently working on the New York Taxi Trip Duration EDA assignment.

I was trying to plot the provided data points on a map of New York City.

I am using the geopandas library. Here is the code:

geo_df = gpd.GeoDataFrame(df, crs={'init': 'EPSG:4326'},
    geometry=[Point(xy) for xy in (df['pickup_longitude'], df['pickup_latitude'])])

It is giving me the following error:

Please let me know how it can be resolved.

I can’t run it, but I think the whole problem is that you create the geometry the wrong way.

Use print() to see what you actually create:

for xy in (df['pickup_longitude'], df['pickup_latitude']):
    print(xy)

It probably creates two elements:

  • first: the full column df['pickup_longitude'],
  • second: the full column df['pickup_latitude']

but you have to create pairs, taking one value from each column at the same time.

To create pairs you need zip(list1, list2).

For example:

x = [0, 1, 2, 3, 4]
y = [5, 6, 7, 8, 9]

for xy in (x, y):
    print(xy)

# result: 2 lists
# [0, 1, 2, 3, 4]
# [5, 6, 7, 8, 9]
    
for xy in zip(x, y):
    print(xy)    

# result: 5 pairs
# (0, 5)
# (1, 6)
# (2, 7)
# (3, 8)
# (4, 9)

Check:

for xy in zip(df['pickup_longitude'], df['pickup_latitude']):
    print(xy)

So you need:

geometry = [Point(xy) for xy in zip(df['pickup_longitude'], df['pickup_latitude'])]
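
Putting the fix together, a minimal sketch of the whole construction might look like this. The three-row DataFrame is made-up sample data standing in for the taxi dataset, and crs="EPSG:4326" is used here instead of the {'init': ...} dictionary, which recent geopandas/pyproj versions flag as deprecated. gpd.points_from_xy is shown only as an alternative way to build the same points.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# Made-up sample rows; the real df comes from the taxi dataset.
df = pd.DataFrame({
    "pickup_longitude": [-73.98, -73.97, -74.00],
    "pickup_latitude": [40.75, 40.76, 40.72],
})

# zip() pairs each longitude with its latitude: one (x, y) tuple per row.
geometry = [Point(xy) for xy in zip(df["pickup_longitude"], df["pickup_latitude"])]
geo_df = gpd.GeoDataFrame(df, geometry=geometry, crs="EPSG:4326")

# Alternative: let geopandas build the points from the two columns.
alt_geometry = gpd.points_from_xy(df["pickup_longitude"], df["pickup_latitude"])

print(geo_df.head())
```

Either way, the geometry column should end up with one Point per row instead of one element per column.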

Or you can use the pandas iterrows() function to iterate over rows (here xy is a Series holding both values of one row):

for index, xy in df[['pickup_longitude', 'pickup_latitude']].iterrows():
    print(xy)
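
As a rough illustration of the row-wise approach, the sketch below uses made-up sample data. iterrows() yields an (index, Series) pair, so the Series is converted with tuple(); itertuples(index=False, name=None) is a variant that yields plain tuples directly.

```python
import pandas as pd

# Made-up sample rows standing in for the taxi dataset.
df = pd.DataFrame({
    "pickup_longitude": [-73.98, -73.97],
    "pickup_latitude": [40.75, 40.76],
})
cols = ["pickup_longitude", "pickup_latitude"]

# iterrows() yields (index, Series); tuple(row) turns the Series into (x, y).
pairs_from_iterrows = [tuple(row) for _, row in df[cols].iterrows()]

# itertuples(index=False, name=None) yields plain (x, y) tuples directly.
pairs_from_itertuples = list(df[cols].itertuples(index=False, name=None))

print(pairs_from_iterrows)
print(pairs_from_itertuples)
```

On large frames zip() over the two columns is usually the faster choice, since iterrows() builds a Series for every row.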

Thank you, furas.

print(xy) did indeed show only 2 elements.

After switching to zip, it worked perfectly fine.

© Copyright 2013-2019 Analytics Vidhya