Why does a feature created in a temp dataframe also get created in the original dataframe?




I am using the following technique to add a new feature/field.

for temp_df in full_data_df:
    temp_df["Family"] = temp_df["Sibsp"]+temp_df["Parch"]+1

full_data_df already exists.

My questions:

  1. Why does the Family feature/field get added to full_data_df as well, when I am only adding it to temp_df?

  2. How can I check whether a dataframe is a reference to the original object?

  3. In the for loop, does the dataframe (full_data_df) pass its data to temp_df one series at a time?



Hi @mohitlearns,

temp_df is not a different dataframe; it is a variable. For example, if I write

   for i in range(0, 10):

this means that for every i in the range 0 to 10 it will perform the required operation. So i takes the values 0, 1, 2, 3, …, 9. You will have to change the code accordingly.
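
A minimal sketch of the same idea with dataframes, assuming full_data_df is a list holding two dataframes (train_df and test_df are hypothetical names, and the values are made up). The loop variable is only a name bound to each existing object in turn, so a column added through it shows up on the original. The `is` operator is one way to check whether two names refer to the same object.

```python
import pandas as pd

# Hypothetical stand-ins for the original data; the values are made up.
train_df = pd.DataFrame({"Sibsp": [1, 0], "Parch": [0, 2]})
test_df = pd.DataFrame({"Sibsp": [0, 3], "Parch": [1, 1]})
full_data_df = [train_df, test_df]  # assumption: a list of dataframes

same_object = []
for temp_df in full_data_df:
    # temp_df is just another name for a dataframe already in the list
    same_object.append(temp_df is train_df or temp_df is test_df)
    temp_df["Family"] = temp_df["Sibsp"] + temp_df["Parch"] + 1

print(same_object)                   # [True, True]
print("Family" in train_df.columns)  # True
```

No new dataframe is created anywhere in the loop, which is why the originals pick up the Family column.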

Could you make it clearer? What do you mean by checking whether the dataframe is a reference to the original object?

What have you defined as temp_df? Is it an empty dataframe?


Thanks for your reply!

Just a quick clarification required.




for dataset in full_data:

Now if I perform the following…


…the response I get from Python is:


Can you shed some light on how it's a variable and not a dataframe?

Secondly, I am still not clear how adding a feature to “dataset” creates a feature in “fulldata”.


Hi @mohitlearns,

This gives a list. I suppose you want full_data to be a dataframe. If so, use the line of code below:


Another question: why are you doing this? Do you want to create another dataset and add a column to it?

You have written for dataset in full_data. It takes every item in full_data, one at a time, and performs the operation you assigned. As in the example mentioned previously:

for i in range(0, 5):
    pass  # the assigned operation goes here

It will take i = 0, i = 1, i = 2 and so on; in simpler words, it replaces i with each value in the range. Here dataset takes each item in full_data in the same way, so the operation works on full_data itself.
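
What the loop variable receives depends on what is being looped over. A small sketch, using a made-up dataframe named df (hypothetical, not from the thread): a list of dataframes yields each dataframe, while a single dataframe yields its column labels, not rows or series.

```python
import pandas as pd

# Hypothetical data; the values are made up for illustration.
df = pd.DataFrame({"Sibsp": [1, 0], "Parch": [0, 2]})

# Looping over a list of dataframes yields each dataframe in turn.
types_in_list = [type(item).__name__ for item in [df, df]]

# Looping over a single dataframe yields its column labels.
labels = [item for item in df]

# Rows have to be requested explicitly, for example with iterrows().
rows = [row.to_dict() for _, row in df.iterrows()]

print(types_in_list)  # ['DataFrame', 'DataFrame']
print(labels)         # ['Sibsp', 'Parch']
print(len(rows))      # 2
```

So if the new column appears on the original dataframes, full_data is most likely a list (or similar container) of dataframes rather than one dataframe.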


You must use .copy() if you want temp_df to be a separate dataframe rather than a reference to the original.
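
A minimal sketch of that suggestion, assuming a made-up dataframe named original_df (a hypothetical name). DataFrame.copy() returns a new object, so a column added to the copy does not appear on the original.

```python
import pandas as pd

# Hypothetical data; the values are made up for illustration.
original_df = pd.DataFrame({"Sibsp": [1, 0], "Parch": [0, 2]})

# copy() returns a new object (a deep copy by default).
temp_df = original_df.copy()
temp_df["Family"] = temp_df["Sibsp"] + temp_df["Parch"] + 1

print(temp_df is original_df)           # False
print("Family" in temp_df.columns)      # True
print("Family" in original_df.columns)  # False
```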