Dummy variables and accuracy

machine_learning
dummy_variable

#1

Dummy variables sometimes give greater accuracy than continuous or categorical variables. What is the mathematical explanation for this?

Thanks and regards with best wishes
Debanjan


#2

I would say that features created as dummy variables carry completely different information from their original categorical/continuous variables, so comparing them would not be correct.

Comparing the dummy-encoded features with their original features is the same as comparing two different features.


#3

If you have age as a predictor and create a dummy at 60 (for instance, 1 if age is 60 or above, 0 otherwise), and the target is a disease or condition that affects only people aged 60+, the model could be more accurate with the dummy.
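A rough sketch of why that can happen, assuming a plain linear model and made-up data (ages 20 to 99, condition present exactly when age >= 60; none of this comes from a real dataset). A linear model on raw age can only produce a straight line a + b*age, while the target here is a step function of age. The dummy puts that step directly into the set of functions the model can represent, so the least-squares error drops to essentially zero. The helper name `fit_sse` is only for illustration.

```python
import numpy as np

# Hypothetical data: ages 20..99, condition present exactly when age >= 60.
age = np.arange(20, 100, dtype=float)
y = (age >= 60).astype(float)

def fit_sse(feature, target):
    """Least-squares fit of target ~ intercept + slope * feature.

    Returns the residual sum of squares of the fitted model.
    """
    X = np.column_stack([np.ones_like(feature), feature])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    residuals = target - X @ coef
    return float(residuals @ residuals)

# Model 1: raw age. The fit is a straight line, which cannot follow the step.
sse_age = fit_sse(age, y)

# Model 2: dummy for age >= 60. The fit is a step at 60, matching the target.
dummy = (age >= 60).astype(float)
sse_dummy = fit_sse(dummy, y)

print(f"SSE with continuous age:    {sse_age:.3f}")
print(f"SSE with dummy (age >= 60): {sse_dummy:.3f}")
```

The gain depends on the cut point matching the true threshold. If the target really varied smoothly with age, the dummy would discard information and the raw variable would likely fit better.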


#4

Thanks a lot Faizy.

But I am more interested in the mathematical explanation.

Why do dummy variables produce better accuracy in some business cases, and what is the mathematical explanation?

Or is it purely business logic?


#5

Thanks a lot for your answer leoldv.

But I am interested in the mathematical explanation of dummy variables.

Why, mathematically, do dummy variables produce better accuracy in some cases?