Applying a for loop to a pandas DataFrame in Python

pandas
ipython
python

#1

Hello,
I have been analysing the bike sharing problem on Kaggle.
I tried to build a new column for time (with values from 0 to 23) by applying a for loop to the datetime column in the dataframe.
But some of the values in the resulting column were negative, which should not have been the case.
The code is as follows:

import numpy as np
import pandas as pd
from datetime import datetime
df1 = pd.DataFrame(np.random.randn(10866))
df1 = df1.rename(columns={0: 'time'})  # defining a new dataframe to store time
for i in range(0, 10866):
    df1.time[i] = datetime.strptime(df.datetime[i], '%d-%m-%Y %H:%M').strftime('%H')

What would be an alternative way to achieve the same result?

Thanks


#2

Hey,
You can use the apply function instead of a for loop to get the same result.
Define a function that does the same job as the body of the loop, then apply it to the required column of the dataframe using 'apply':

def time(x):
    # parse the datetime string and return the hour as a two-digit string
    return datetime.strptime(x, '%d-%m-%Y %H:%M').strftime('%H')
df4 = pd.DataFrame({'time': df['datetime'].apply(time)})

You can insert the new column into the original dataframe as follows:
df.insert(1, 'time', df4['time'])
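To try the apply-and-insert approach end to end, here is a minimal sketch on a tiny made-up dataframe. The sample rows, the 'count' column, and the function name extract_hour are assumptions for illustration only; the real Kaggle file will have different contents.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data in the same '%d-%m-%Y %H:%M' format as the question.
df = pd.DataFrame({
    "datetime": ["01-01-2011 00:00", "01-01-2011 05:00", "02-01-2011 23:00"],
    "count": [16, 1, 39],
})


def extract_hour(x):
    # Parse the string and return the hour as a zero-padded string, e.g. '05'.
    return datetime.strptime(x, "%d-%m-%Y %H:%M").strftime("%H")


# Apply the function to every value of the column; the result is a Series.
hours = df["datetime"].apply(extract_hour)

# Insert the Series as a new column named 'time' at position 1.
df.insert(1, "time", hours)

print(df)
```

Note that strftime returns strings, so the 'time' column here holds text such as '05' rather than numbers.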

With this approach you don't have to create a placeholder dataframe separately or rename its column.
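As a further alternative, pandas can parse the whole column at once and expose the hour directly, which avoids both the loop and the custom function. One plausible reason for the negative values in the question (an assumption, since the full data is not shown) is that df1 starts out filled with numbers from np.random.randn, so any row the loop's chained assignment fails to overwrite keeps its random, possibly negative, value. The sketch below never creates such a placeholder. The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in the same '%d-%m-%Y %H:%M' format as the question.
df = pd.DataFrame({
    "datetime": ["01-01-2011 00:00", "01-01-2011 05:00", "02-01-2011 23:00"],
})

# Parse the whole column in one call instead of looping row by row.
parsed = pd.to_datetime(df["datetime"], format="%d-%m-%Y %H:%M")

# The .dt accessor exposes the hour of each timestamp as an integer (0 to 23).
df["time"] = parsed.dt.hour

print(df)
```

Here the 'time' column holds integers, which is usually more convenient than strings if the hour is going to be used as a model feature.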

Hope this helps