I was going through the documentation of
scipy.stats.chi2_contingency and found that the expected frequency for each cell of a cross table between two variables is obtained by multiplying the row total for that cell by the column total for that cell and then dividing by the total number of observations.
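To check that I am reading the docs correctly, here is a minimal sketch of that computation. The 2x3 table of counts is made up purely for illustration (it is not the contest data) and the variable names are my own. It only compares the hand-computed expected counts with the ones chi2_contingency returns.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Made-up 2x3 table of observed counts, for illustration only
# (NOT the contest data).
observed = np.array([[20, 30, 50],
                     [30, 20, 50]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
grand_total = observed.sum()

# Expected count per cell: row total * column total / grand total.
expected_manual = row_totals * col_totals / grand_total

# chi2_contingency returns the statistic, the p-value, the degrees
# of freedom and its own table of expected counts.
chi2, p_value, dof, expected_scipy = chi2_contingency(observed)

print(expected_manual)
print(np.allclose(expected_manual, expected_scipy))
```

If my understanding is right, the two tables of expected counts should match.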
What is the significance of computing the expected values this way?
What can we say about the correlation between the two variables by looking at the p-value of the chi-square test?
See this article for more explanation.
I am working on the megastar contest. The objective of this dataset is to predict the correct category of working professionals in India. Below is the data dictionary of the variables.
After applying the chi-square test to two categorical variables (UG_Education and Category) using the following code:
I get the following result:
I have the following questions about the two categorical variables (UG_Education and Category):
1. Are they correlated?
2. What is the null hypothesis in this case, if the cross table is:
3. If the two variables are not correlated, does that mean UG_Education has no effect on the outcome (Category is the outcome variable)?
Thanks in advance.