Choosing a hypothesis test



-I have been reading a lot about hypothesis testing, i.e. the t-test, z-test, chi-square test, ANOVA and F-test, but I am only getting more confused about which test to use in which situation.

-Online materials such as video tutorials only explain the concepts; it is still hard for me to understand how to apply them in practical scenarios and which test to choose.

-What I am looking for is a structured path for hypothesis testing, with everything in one place rather than bits and pieces that only make me more confused.

If anyone can help me with this, I would be really grateful.



I will try to keep it simple; once you know the basic differences, you can dive deeper into each concept.
1) Z and T test

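The t-test and the z-test do nearly the same job: both compare a sample mean against a hypothesized value (or two means against each other). Use the z-test when the population standard deviation is known and the sample is large enough (n > 30 is a common rule of thumb). Use the t-test when the population standard deviation is unknown and has to be estimated from the sample, which matters most when the sample is small (n < 30).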

There are mainly two types of t-test: the single-sample t-test and the two-sample t-test.

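Let's take an example. You are a manufacturer of potato chips, and you know from your historical data that the mean weight of a pack is 100 grams with a standard deviation of 5 grams. You want to check whether a newly produced lot of 100 packs has a mean weight of 100 grams or not.
Your null hypothesis is that the mean weight of the packs is equal to 100 grams.
This is a single-sample test: one sample is compared against a fixed reference value. Strictly speaking, because the standard deviation (5 grams) is known here and n = 100, the z-test applies; if you had to estimate the standard deviation from the lot itself, it would be a single-sample t-test.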
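The potato chips example can be sketched in a few lines of Python. Because the historical standard deviation is treated as known, the statistic computed here is a z statistic rather than a t statistic. The measured mean of the new lot (`sample_mean`) is not given in the example, so the value below is made up purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example
mu0 = 100.0    # hypothesized mean weight in grams (null hypothesis)
sigma = 5.0    # historical standard deviation in grams, treated as known
n = 100        # number of packs in the new lot

# ASSUMPTION: the measured mean of the lot is a made-up number
sample_mean = 101.2

# z statistic: how many standard errors the sample mean lies from mu0
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Two-sided p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p = {p_value:.4f}")
```

If the p-value is below your chosen significance level (0.05 is a common choice), you reject the null hypothesis that the lot averages 100 grams.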

Now, if you are a teacher and you want to check whether the performance of the girls and the boys in the class is the same, you can use a two-sample t-test, where your null hypothesis is that there is no difference in the average marks of boys and girls.
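A minimal sketch of the boys-versus-girls comparison, computing the classic pooled-variance (Student) t statistic by hand. The marks are hypothetical, and the sketch assumes both groups have roughly equal variance. In practice you would likely reach for a statistics library such as SciPy (`scipy.stats.ttest_ind`) to get the p-value directly.

```python
from math import sqrt
from statistics import mean, variance

# ASSUMPTION: hypothetical marks, not data from this thread
boys = [62, 70, 68, 75, 80]
girls = [65, 72, 78, 74, 81]

n1, n2 = len(boys), len(girls)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * variance(boys) + (n2 - 1) * variance(girls)) / (n1 + n2 - 2)

# t statistic for the difference between the two group means
t_stat = (mean(boys) - mean(girls)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
dof = n1 + n2 - 2

# Two-sided 5% critical value for 8 degrees of freedom, from a standard t table
T_CRITICAL = 2.306
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, dof = {dof}, reject null: {reject_null}")
```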

There is another type of two-sample t-test in which both samples are measured on the same objects; it is called the paired-sample t-test.
Let's say you want to check whether there is any significant difference in the prices of the same mobile phones before Diwali and after Diwali. This is a paired-sample t-test, because the two samples belong to the same objects.
Your null hypothesis is that there is no significant difference in the prices of the mobiles before and after Diwali.
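The paired test boils down to a single-sample t-test on the per-item differences. Here is a small sketch with hypothetical prices for five phone models (the numbers and the currency unit are assumptions made for illustration).

```python
from math import sqrt
from statistics import mean, stdev

# ASSUMPTION: hypothetical prices of the same five phone models
before = [200, 250, 180, 300, 220]
after = [190, 245, 170, 280, 215]

# Work on the difference for each model, not on the two lists separately
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

# t statistic: mean difference divided by its standard error
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
dof = n - 1

# Two-sided 5% critical value for 4 degrees of freedom, from a standard t table
T_CRITICAL = 2.776
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, dof = {dof}, reject null: {reject_null}")
```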

2) ANOVA (F test)

If you have more than two categories (groups) in the data, you have to use the F test, i.e. ANOVA (analysis of variance).
Let's say you want to check whether three political parties received the same number of votes on average (for example, per constituency) in a general election. Then you use ANOVA, where your null hypothesis is that the average votes received by each party are the same, while the alternative hypothesis is that at least one party has a different average.
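A rough sketch of one-way ANOVA for the three-party example, computing the F statistic by hand. The vote counts (in thousands, three constituencies per party) are hypothetical. The closed-form p-value at the end is a shortcut that only holds when there are exactly three groups (2 between-group degrees of freedom); for the general case a library routine such as `scipy.stats.f_oneway` would be the usual choice.

```python
from statistics import mean

# ASSUMPTION: hypothetical votes (in thousands) per constituency
votes = {
    "party_a": [10, 12, 14],
    "party_b": [11, 13, 15],
    "party_c": [20, 22, 24],
}

groups = list(votes.values())
all_values = [v for g in groups for v in g]
grand_mean = mean(all_values)

# Variation of the group means around the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Variation of the observations around their own group mean
ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F statistic: between-group mean square over within-group mean square
f_stat = (ss_between / df_between) / (ss_within / df_within)

# Closed-form upper-tail probability, valid ONLY when df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A large F means the group means differ by more than the scatter inside the groups would explain.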

3) Degrees of freedom
The formula for the degrees of freedom (DoF) is different for different tests, but I will try to explain the idea through a simple example.
Let's say you have 7 T-shirts of different colours and you want to wear a different colour on each day of the week. On the 1st day of the week you have 7 options to choose from, and so on; on the 6th day you are left with 2 options, and on the 7th day you have no choice left because only one shirt remains.
So here the DoF is 6, because there are 6 days on which you can choose between different colours, but you have no freedom to choose on the 7th day.
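The same idea shows up in a sample of numbers: once the sample mean is fixed, the deviations from it must sum to zero, so the last deviation is forced by the others. A tiny sketch with an arbitrary, made-up sample:

```python
from statistics import mean

# ASSUMPTION: an arbitrary sample chosen for illustration
sample = [4, 8, 6, 2]
m = mean(sample)

# Deviations from the sample mean always sum to zero
deviations = [x - m for x in sample]

# Knowing all but the last deviation pins down the last one
free_deviations = deviations[:-1]
forced_last = -sum(free_deviations)

# Only n - 1 deviations are free to vary
dof = len(sample) - 1

print(forced_last == deviations[-1], dof)
```

This is why the sample variance divides by n - 1 rather than n.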

4) Chi-square test

There are two types of chi-square test: the chi-square goodness-of-fit test and the chi-square test for independence.
I have explained only the 2nd type because of the time constraint.
Let's say you are a T-shirt seller and you think that whether a T-shirt sells does not depend on which of the 3 colours (white, red and blue) it is.
Let's say you always order 100 shirts of each type from your supplier; the sold and unsold T-shirts for a single period are given below.
Here your null hypothesis is that selling status (sold or unsold) is independent of shirt colour, i.e. all three colours sell at the same rate.
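The sold/unsold table itself is not available here, so the sketch below uses made-up counts (100 shirts per colour, as in the example) to show how the chi-square statistic for independence is computed. With a 2 x 3 table there are 2 degrees of freedom, and for exactly 2 degrees of freedom the chi-square upper-tail probability has the simple form exp(-x / 2); for other table sizes a library routine such as `scipy.stats.chi2_contingency` would be the usual choice.

```python
from math import exp

# ASSUMPTION: hypothetical counts; the original table is not available
colours = ["white", "red", "blue"]
sold = [70, 50, 60]
unsold = [30, 50, 40]

observed = [sold, unsold]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over every cell
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if selling status were independent of colour
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# (rows - 1) * (columns - 1)
dof = (len(observed) - 1) * (len(colours) - 1)

# Upper-tail probability, valid ONLY for dof == 2
p_value = exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
```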

I hope you gain some clarity from the above explanation. I have tried my best; I could explain more, but there is a time constraint, and these are very vast topics if you dig deeper. :slightly_smiling_face:



I am still confused between a single sample t-test and a two sample t-test. According to my understanding from the above explanation,
one sample t-test: comparing one pack to 100 packs
two sample t-test: comparing a sample of 100 packs with another sample of 100 packs.
Is that correct?