# What is a Dummy Variable?

#1

Hi,

Could you please explain in which situations or scenarios we create dummy variables?

Regards,
Bharath

#2

First, understand what a dummy variable is:

Q.1. What is a dummy variable?

Ans: A dummy variable (or indicator variable) is an artificial variable created to represent an attribute with two or
more distinct categories/levels. Each dummy takes the value 1 if an observation belongs to a given level and 0 otherwise.

Q.2. Why is it used?

Ans: Regression analysis treats all independent (X) variables in the analysis as numerical. Numerical variables
are interval or ratio scale variables whose values are directly comparable, e.g. ‘10 is twice as much as 5’, or
‘3 minus 1 equals 2’. Often, however, you might want to include an attribute or nominal scale variable such
as ‘Product Brand’ or ‘Type of Defect’ in your study. Say you have three types of defects, numbered ‘1’, ‘2’
and ‘3’. In this case, ‘3 minus 1’ doesn’t mean anything… you can’t subtract defect 1 from defect 3. The
numbers here are used to indicate or identify the levels of ‘Defect Type’ and do not have intrinsic meaning of
their own. Dummy variables are created in this situation to ‘trick’ the regression algorithm into correctly
analyzing attribute variables.
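As a rough illustration of the defect-type example, here is a minimal R sketch. The data frame `defects` and its values are made up for illustration; `model.matrix()` is only one of several ways to build the dummies, and with R's default treatment coding it drops the first level (defect type 1) as the reference.

```r
# Hypothetical data: six items, each with one of three defect types
defects <- data.frame(defect_type = factor(c(1, 2, 3, 2, 1, 3)))

# Expand the factor into 0/1 indicator columns.
# With the default treatment contrasts, level "1" is the reference,
# so only the columns for levels "2" and "3" are created.
dummies <- model.matrix(~ defect_type, data = defects)

print(colnames(dummies))
print(dummies)
```

A row with 0 in both `defect_type2` and `defect_type3` is a type 1 defect, which is why three levels need only two dummy columns.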

#3

Hi Experts,

Please let me know on what basis these dummy variables are created.

Based on the above bivariate analysis, we can split Age into 3 - 1 = 2 dummy variables (three ranges, with one left out as the reference). We have done the same for the Amount and Duration variables.

```r
# Create dummy variables for Age, Amount and Duration.
# Each variable is split into three ranges; the highest range is the
# reference level and gets no dummy of its own (both dummies are 0).
GermanCredit$Age_c1 <- ifelse(GermanCredit$Age <= 26, 1, 0)
GermanCredit$Age_c2 <- ifelse(GermanCredit$Age > 26 & GermanCredit$Age <= 33, 1, 0)
GermanCredit$Amount_c1 <- ifelse(GermanCredit$Amount <= 1260, 1, 0)
GermanCredit$Amount_c2 <- ifelse(GermanCredit$Amount > 1260 & GermanCredit$Amount <= 4700, 1, 0)
GermanCredit$Duration_c1 <- ifelse(GermanCredit$Duration <= 15, 1, 0)
GermanCredit$Duration_c2 <- ifelse(GermanCredit$Duration > 15 & GermanCredit$Duration <= 30, 1, 0)
```
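For comparison, the same three-range split for Age can be sketched with `cut()`. This is only an illustration: the data frame `credit` and its values are made-up stand-ins for `GermanCredit`, and the bin labels are hypothetical names; the cut points 26 and 33 are the ones used in the code above.

```r
# Toy data standing in for GermanCredit (values are made up)
credit <- data.frame(Age = c(22, 26, 27, 33, 34, 50))

# Bin Age into three ranges: <= 26, (26, 33], > 33.
# cut() uses right-closed intervals by default, matching the <= tests.
credit$Age_bin <- cut(credit$Age,
                      breaks = c(-Inf, 26, 33, Inf),
                      labels = c("le26", "26to33", "gt33"))

# Manual dummies for the first two ranges; the > 33 range is the reference
credit$Age_c1 <- ifelse(credit$Age <= 26, 1, 0)
credit$Age_c2 <- ifelse(credit$Age > 26 & credit$Age <= 33, 1, 0)

# Rows where both dummies are 0 fall in the reference range
credit$is_reference <- credit$Age_c1 == 0 & credit$Age_c2 == 0

print(credit)
```

The binned factor and the two manual dummies carry the same information, so either form could be passed to a regression.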

thanks,
Tony

#4

Hi Jal faizy,