\chi^2 is a test statistic for hypothesis testing.

motivation for chi-square

The motivation for chi-square is that neither the t-test (means: “is the value significantly different?”) nor the z-test (proportions: “is the incidence percentage significantly different?”) really covers categorical data samples, where the claim is “the categories are distributed in this way.” Take, for instance, testing the following null hypothesis at \alpha = 0.05:

Category   Expected   Actual
A          25         20
B          25         20
C          25         25
D          25         25

What do we use to test this?? (hint: we can’t, unless…) Enter chi-square.

chi-square test

The chi-square test is a hypothesis test for categorical data. It translates differences between an expected and an observed distribution into a p-value for significance.

Begin by confirming that your experiment meets the conditions for inference (chi-square test), then calculate chi-square. Once you have that, look it up in a chi-square table, using the appropriate degrees of freedom (number of categories minus 1 for a single categorical variable), to figure the p-value. Then, proceed with normal hypothesis testing.

Because it works on categorical counts, the chi-square test can also be used as a homogeneity test.

conditions for inference (chi-square test)

- random sampling
- the expected count in every category (cell) must be \geq 5
- the sample should be <10\% of the population (when sampling without replacement), or the observations should otherwise be independent

chi-square test for homogeneity

The chi-square test for homogeneity uses the chi-square statistic to test whether several groups share the same distribution over the categories of a variable. To do this, we take the proportion of each outcome in the pooled data (what we would see if the outcome were distributed equally across groups) and apply it to each group's sample size to get the counts to compare against. Take, for instance:

Subject      Right Hand   Left Hand   Total
STEM         30           10          40
Humanities   15           25          40
Equal        15           5           20
Total        60           40          100

We will then figure the expected outcomes. Overall, 60 of 100 people are right-handed, so among the 40 STEM students we expect 40 \times 0.6 = 24 right-handed and 40 \times 0.4 = 16 left-handed, and likewise for the other rows:

Subject      Right   Left
STEM         24      16
Humanities   24      16
Equal        12      8

Awesome! Now, calculate chi-square over every cell, comparing the measured outcome to its expected outcome. Calculate degrees of freedom by (num_row-1)*(num_col-1), not counting the total row and column; here that is (3-1)*(2-1) = 2.
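The handedness example above can be sketched in a few lines of Python. This is a minimal sketch, assuming the expected counts are derived as row total times column total divided by grand total (which reproduces the 24/16, 24/16, 12/8 figures). The function names are illustrative and not taken from any library, and the p-value shortcut relies on the closed form exp(-x/2), which holds only for 2 degrees of freedom.

```python
import math

# Observed counts from the handedness table
# (rows: STEM, Humanities, Equal; columns: right hand, left hand).
observed = [
    [30, 10],
    [15, 25],
    [15, 5],
]


def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed_table, expected_table):
    """Sum of (measured - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed_table, expected_table)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """(number of rows - 1) * (number of columns - 1), totals excluded."""
    return (len(table) - 1) * (len(table[0]) - 1)


expected = expected_counts(observed)      # [[24.0, 16.0], [24.0, 16.0], [12.0, 8.0]]
statistic = chi_square(observed, expected)  # 14.0625
dof = degrees_of_freedom(observed)          # 2

# Assumption: this shortcut is valid only because dof == 2 here;
# other degrees of freedom need a chi-square table or a stats library.
p_value = math.exp(-statistic / 2)          # roughly 0.00088

print(expected, statistic, dof, p_value)
```

With \alpha = 0.05, a p-value this small would lead to rejecting the null hypothesis that handedness is distributed the same way across the three groups.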
chi-square test for independence

The chi-square test for independence is a test designed to reject (or fail to reject) the null hypothesis of “no association between two variables.” Essentially, you leverage the fact that, for independent events, “AND” relationships are multiplicative probabilities. Therefore, the expected count in each cell is simply the multiplied fraction of sums: (row total / grand total) \times (column total / grand total) \times grand total, which simplifies to row total \times column total / grand total.

calculating chi-square

\begin{equation}
\chi^2 = \frac{(\hat{x}_0-x_0)^2}{x_0} + \frac{(\hat{x}_1-x_1)^2}{x_1} + \cdots + \frac{(\hat{x}_n-x_n)^2}{x_n}
\end{equation}

where \hat{x}_i is the measured value and x_i is the expected value.
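As a rough illustration of the formula, here is a small Python sketch applied to the A/B/C/D table from the motivation section. The function name `chi_square_statistic` is hypothetical, and the critical value 7.815 is the standard chi-square table entry for 3 degrees of freedom at \alpha = 0.05. Note that the measured counts in that table total 90 while the expected counts total 100, so treat this as a demonstration of the arithmetic rather than a complete test.

```python
def chi_square_statistic(measured, expected):
    """Sum of (measured - expected)^2 / expected over all categories."""
    if len(measured) != len(expected):
        raise ValueError("measured and expected must have the same length")
    return sum((m - e) ** 2 / e for m, e in zip(measured, expected))


# Categories A, B, C, D from the motivating table.
expected = [25, 25, 25, 25]
measured = [20, 20, 25, 25]

# 25/25 + 25/25 + 0/25 + 0/25 = 2.0
statistic = chi_square_statistic(measured, expected)

# One categorical variable: degrees of freedom = number of categories - 1.
dof = len(expected) - 1

# Assumption: value read from a standard chi-square table
# for 3 degrees of freedom at alpha = 0.05.
CRITICAL_VALUE_DOF3_ALPHA_05 = 7.815

# The statistic is below the critical value, so we fail to reject.
reject_null = statistic > CRITICAL_VALUE_DOF3_ALPHA_05

print(statistic, dof, reject_null)
```

Comparing against a critical value is equivalent to comparing the p-value against \alpha; the table lookup is used here only to keep the sketch free of external libraries.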