In a recent blog on AV discussing One Hot Encoding with the loan_prediction data set, it is explained as follows:
Let's take a look at an example from the loan_prediction data set. The feature Dependents has 4 possible values: 0, 1, 2 and 3+, which are encoded without loss of generality to 0, 1, 2 and 3.
We then have a weight W assigned to this feature in a linear classifier, which makes a decision based on the constraint W*Dependents + K > 0, or equivalently W*Dependents > −K.
Let f(Dependents) = W*Dependents.
The possible values attained by f are 0, W, 2W and 3W. The problem with this setup is that a single weight W cannot assign the four levels to classes arbitrarily; a threshold on f can only separate them in the following ways:
- All levels lead to the same decision (all of them < −K, or all of them > −K)
- A 3:1 split of the levels (decision boundary between 2W and 3W, i.e. f > 2W, or symmetrically between 0 and W)
- A 2:2 split of the levels (decision boundary between W and 2W, i.e. f > W)
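To make the three cases above concrete, here is a minimal sketch (the function names and the choice W = ±1 are illustrative, not from the blog) that enumerates every subset of the levels {0, 1, 2, 3} a single threshold on W*x can classify as positive:

```python
# Sketch: with the ordinal encoding 0..3, a single weight W plus a
# threshold can only produce "contiguous" splits of the levels.
levels = [0, 1, 2, 3]

def achievable_splits(W):
    """Sets of levels classified positive (W*x > t) as the threshold t varies."""
    splits = set()
    values = sorted({W * x for x in levels})
    # Try a threshold below, between, and above the attainable values of W*x.
    thresholds = (
        [values[0] - 1]
        + [(a + b) / 2 for a, b in zip(values, values[1:])]
        + [values[-1] + 1]
    )
    for t in thresholds:
        splits.add(frozenset(x for x in levels if W * x > t))
    return splits

# Union over both signs of W (W = +1 and W = -1 suffice up to scaling).
splits = achievable_splits(W=1.0) | achievable_splits(W=-1.0)
for s in sorted(splits, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

Running this prints only 8 of the 16 possible subsets: the all-same cases, the 3:1 / 1:3 splits, and the 2:2 splits at a contiguous boundary. A split like {0, 2} vs {1, 3} never appears.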
Here we can see that we are losing many possible decisions, such as the case where 0 and 2W should be given the same label while W and 3W get the other; no single threshold on f can produce that split.
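This is exactly what one-hot encoding recovers. As a minimal sketch (the hand-picked weights below are hypothetical, chosen only to illustrate the point), each level gets its own weight, so the "odd" labelling where 0 and 2 are positive and 1 and 3 are negative becomes realizable by a linear classifier:

```python
# Sketch (hypothetical weights): with one-hot encoding, every level of
# Dependents has its own weight, so any labelling of the levels is
# achievable with the same linear rule w . x + K > 0.
levels = [0, 1, 2, 3]

def one_hot(x):
    # Level i maps to the i-th standard basis vector.
    return [1.0 if i == x else 0.0 for i in levels]

# Hand-picked weights: positive for levels 0 and 2, negative for 1 and 3.
w = [1.0, -1.0, 1.0, -1.0]
K = 0.0

def decide(x):
    score = sum(wi * xi for wi, xi in zip(w, one_hot(x))) + K
    return score > 0

predictions = {x: decide(x) for x in levels}
print(predictions)  # levels 0 and 2 positive, 1 and 3 negative
```

Under the ordinal encoding this split was impossible for any (W, K); under one-hot it is just one more choice of weights.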
It would be helpful if the above three ways of reaching a decision, and the decisions that are lost, could be explained in more detail. Thanks!