Categorical variable in regression



How can I use categorical variables (not factors) in a regression in R, other than by using dummy variables?



There are several coding systems for categorical variables. Dummy coding is the best known, but it is not the only one. Which system you choose depends on the comparisons you need.
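
To make this concrete, here is a minimal sketch in R of switching between coding systems on the same variable. The data frame `df`, its columns `group` and `y`, and the level names are made up for illustration. Two caveats: `lm()` still needs the variable as a factor internally (a character column is converted automatically), and base R's `contr.helmert()` compares each level with the mean of the preceding levels, which is often called reverse Helmert coding. The forward difference matrix (each level compared with the next one) is built by hand here, since base R has no built-in for it.

```r
# Hypothetical toy data: 'group' starts out as a character vector, not a factor
set.seed(1)
df <- data.frame(
  group = rep(c("low", "mid", "high"), each = 4),
  y = c(rnorm(4, 10), rnorm(4, 12), rnorm(4, 15)),
  stringsAsFactors = FALSE
)

# Convert to a factor with an explicit level order
df$group <- factor(df$group, levels = c("low", "mid", "high"))

# Default coding: treatment (dummy) contrasts against the first level
fit_dummy <- lm(y ~ group, data = df)

# Base R Helmert contrasts: each level vs. the mean of the preceding levels
contrasts(df$group) <- contr.helmert(3)
fit_helmert <- lm(y ~ group, data = df)

# Forward difference coding built by hand for 3 levels:
# column 1 gives low - mid, column 2 gives mid - high
fwd <- matrix(c(2/3, -1/3, -1/3,
                1/3,  1/3, -2/3), ncol = 2)
contrasts(df$group) <- fwd
fit_fwd <- lm(y ~ group, data = df)

# Intercept = mean of the level means; slopes = adjacent-level differences
coef(fit_fwd)
```

All three models fit the data equally well; only the meaning of the coefficients changes with the coding.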

For example, if you want to compare each level to the next higher level, you would use “forward difference” coding. If you want to compare each level to the mean of the subsequent levels of the variable, you would use “Helmert” coding. Choosing a coding system lets you obtain the comparisons that are most meaningful for testing your hypotheses. Below is a table listing various coding systems and the comparisons they make.