I’m analyzing binary data in which each record is either “Pass” or “Fail”. The purpose of the analysis is to find out why some factories have a relatively low fail rate while others have a high fail rate. There are two datasets of unequal size (dataset A contains 10,000 rows, while dataset B contains 5,000 rows).
Before analyzing the data in greater detail, I would like to verify that there is no statistically significant difference between the two datasets.
I would like to do this in Excel. Do you have any suggestions (e.g. a t-test)?
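To make the kind of check I mean concrete, here is a minimal Python sketch rather than an Excel formula. It uses a two-proportion z-test instead of a t-test, on the assumption that comparing fail rates is what matters for Pass/Fail data. The function name `two_proportion_z_test` and the fail counts in the usage lines are made up for illustration; only the row counts (10,000 and 5,000) come from my data.

```python
import math

def two_proportion_z_test(fails_a, n_a, fails_b, n_b):
    """Two-sided z-test for a difference between two fail rates.

    Hypothetical helper: takes the number of "Fail" rows and the total
    number of rows in each dataset, returns (z statistic, p-value).
    """
    p_a = fails_a / n_a                      # observed fail rate in dataset A
    p_b = fails_b / n_b                      # observed fail rate in dataset B
    # Pooled fail rate under the null hypothesis of equal rates
    p_pool = (fails_a + fails_b) / (n_a + n_b)
    # Standard error of the difference; unequal sample sizes are handled here
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled fail rate is 0 or 1; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up fail counts, for illustration only
z, p = two_proportion_z_test(fails_a=500, n_a=10_000, fails_b=300, n_b=5_000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value (for example below 0.05) would suggest the two fail rates differ; a large one would mean the test found no evidence of a difference, which is not quite the same as proving the datasets are identical.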
Thank you everyone.