While reading today’s article on dealing with categorical variables, I came across the dummy coding example given:
I think the sex_female and sex_male columns both carry the same information, so it wouldn’t make sense to use both in the model. Please correct me if I am wrong.
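To make my concern concrete, here is a minimal pure-Python sketch. The data and the helper dummy_code are my own illustration and are not taken from the article. It checks whether one dummy column is always just 1 minus the other:

```python
# Hypothetical data, made up for illustration (not from the article).
sex = ["female", "male", "male", "female"]


def dummy_code(values, level):
    """Return a 0/1 indicator column for one level of a categorical variable."""
    return [1 if v == level else 0 for v in values]


sex_female = dummy_code(sex, "female")
sex_male = dummy_code(sex, "male")

# With only two levels, every row has exactly one of the two indicators set,
# so each column is fully determined by the other.
redundant = all(f + m == 1 for f, m in zip(sex_female, sex_male))

print(sex_female)
print(sex_male)
print(redundant)
```

If that holds in general, keeping only one of the two columns should lose no information.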
I also took a look at one-hot encoding in Python, but I am not able to understand the code:
Here the levels are 0, 1, 2, 3 and the new levels are 2, 3, 4, right?
Which levels have been combined?
What does enc.feature_indices_ capture, and why are there 4 indices?
What is enc.transform doing?
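For reference, this is how I currently picture one-hot encoding, written as a pure-Python sketch rather than the scikit-learn code from the article. The data, the assumption that each column's levels are the integers 0 up to the column maximum, and the helper names fit_offsets and one_hot are all mine, so this may not match what the library actually does:

```python
def fit_offsets(rows):
    """Count levels per column and compute where each column's block starts.

    Assumption: the levels in each column are the integers 0..max(column).
    """
    n_values = [max(col) + 1 for col in zip(*rows)]
    offsets = [0]
    for n in n_values:
        # Each entry is the start of a column's block; the last is the total width.
        offsets.append(offsets[-1] + n)
    return n_values, offsets


def one_hot(row, offsets):
    """Encode one row as a flat 0/1 vector, one block of positions per column."""
    out = [0] * offsets[-1]
    for j, value in enumerate(row):
        out[offsets[j] + value] = 1
    return out


# Hypothetical training rows: three categorical columns.
rows = [[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]]

n_values, offsets = fit_offsets(rows)
encoded = one_hot([0, 1, 1], offsets)

print(n_values)
print(offsets)
print(encoded)
```

In this sketch, three columns give four boundary positions (three block starts plus the total width). I am not sure whether that is the reason for the 4 indices, so I would appreciate confirmation.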
I am sorry if these are very basic questions, but could someone kindly guide me on this?