I need to create a DataFrame in pandas, and I have the following options:
Option 1: Initialize a dummy DataFrame using:
df = pd.DataFrame(index=range(4),columns=['A','B','C','D'])
Then fill in each value using df.iloc[…,…]
Option 2: Create a dictionary containing the required columns and rows. Then create a DataFrame using:
df = pd.DataFrame(dict)
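To make the two options concrete, here is a minimal sketch of both for a fixed 4x4 frame. The helper `cell_value` and the names `df1`, `df2`, and `data` are hypothetical placeholders for wherever the real values come from; they are not part of pandas.

```python
import pandas as pd

columns = ["A", "B", "C", "D"]
n_rows = 4


def cell_value(row, col):
    # Hypothetical stand-in for however each value is actually computed.
    return row * 10 + col


# Option 1: pre-allocate an empty frame, then fill it cell by cell.
# The pre-allocated columns start out as object dtype holding NaN.
df1 = pd.DataFrame(index=range(n_rows), columns=columns)
for i in range(n_rows):
    for j in range(len(columns)):
        df1.iloc[i, j] = cell_value(i, j)

# Option 2: build a plain dict of column lists, then convert once.
data = {
    name: [cell_value(i, j) for i in range(n_rows)]
    for j, name in enumerate(columns)
}
df2 = pd.DataFrame(data)
```

Both frames should hold the same values, although the columns of `df1` stay object dtype unless they are converted afterwards (for example with `astype`).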
Option 1 needs no additional space for a dictionary, and it avoids the extra computation of converting a dictionary into a DataFrame.
In option 2, I think indexing a dictionary is faster, so adding values one by one should take less time.
Which option is computationally better in each of these cases:
(a) A fixed number of records (which allows the DataFrame index to be initialized up front)
(b) A dynamic number of records (a new row has to be added to the DataFrame each time)
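For scenario (b), the two approaches could look roughly like the sketch below. The generator `incoming_records` is a hypothetical stand-in for whatever produces the rows, and `df_grow`, `rows`, and `df_built` are illustrative names only.

```python
import pandas as pd

columns = ["A", "B", "C", "D"]


def incoming_records():
    # Hypothetical source of rows whose count is not known in advance.
    for i in range(5):
        yield {"A": i, "B": i * 2, "C": i * 3, "D": i * 4}


# Growing the DataFrame directly: each assignment enlarges the frame by one row.
df_grow = pd.DataFrame(columns=columns)
for rec in incoming_records():
    df_grow.loc[len(df_grow)] = [rec[c] for c in columns]

# Accumulating in plain Python first, then converting once at the end.
rows = []
for rec in incoming_records():
    rows.append(rec)
df_built = pd.DataFrame(rows, columns=columns)
```

As far as I understand, enlarging a frame one row at a time tends to copy data on each step, whereas appending to a list is cheap, so timing both with `timeit` on realistic sizes would be the way to confirm which one wins.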
I think option 1 suits scenario (a) and option 2 suits scenario (b), but I’m not sure my reasoning is justified. Please share your thoughts.