Chi-square, Fisher's exact, and McNemar's test
Chi-square test
A Chi-square test is a common test for nominal (categorical) data. One application is a test for independence, where the null hypothesis is that the distribution of outcomes is the same across the groups. For example, you have two user groups (e.g., male and female, or young and elderly), and you have nominal data for each group, such as whether they own a particular device or which OS they use. If the data for the two groups came from the same participants (i.e., the data are paired), you should use McNemar's test instead. Unpaired data look like this.
| | Own device A | Don't own device A |
|---|---|---|
| Male | 25 | 5 |
| Female | 15 | 15 |

| | Windows | Mac | Linux |
|---|---|---|---|
| Young | 16 | 11 | 3 |
| Old | 21 | 8 | 1 |
Now you are interested in whether the distribution of outcomes differs between the two groups. A Chi-square test assumes that the samples are taken independently (i.e., they are unpaired). If they are not, you need to use McNemar's test. And if your sample size is small, you should use Fisher's exact test.
Effect size
The effect size of a Chi-square test can be described by phi or Cramer's V. If your data table is 2 x 2, you calculate phi (k = 2 in the equation below); otherwise, you calculate Cramer's V (k > 2 in the equation below). The calculation is the same in both cases:

$$V = \sqrt{\frac{\chi^2}{N(k - 1)}}$$

where N is the total number of samples, and k is the number of rows or columns, whichever is smaller, in your data table. The chi-squared here is the value without any correction. The following values are considered small, medium, and large effect sizes.
| | Small size | Medium size | Large size |
|---|---|---|---|
| Phi or Cramer's V | 0.10 | 0.30 | 0.50 |
R code example
Let's use the examples above. First, prepare the data.
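One way to enter the two tables is as matrices, as sketched below. The object names `device` and `os` and the dimension labels are hypothetical choices for illustration, not names taken from the original example.

```r
# Rows are the user groups, columns are the outcomes.
# byrow = TRUE fills the matrix one row at a time, matching the tables above.
device <- matrix(c(25,  5,
                   15, 15),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Gender = c("Male", "Female"),
                                 Device = c("Own", "Don't own")))

os <- matrix(c(16, 11, 3,
               21,  8, 1),
             nrow = 2, byrow = TRUE,
             dimnames = list(Age = c("Young", "Old"),
                             OS  = c("Windows", "Mac", "Linux")))

print(device)
print(os)
```

Printing the matrices is a quick check that the counts landed in the intended cells.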
Then run a Chi-square test.
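A minimal sketch of the test itself, using base R's `chisq.test()`. The object names are hypothetical, and the data are re-entered so the snippet runs on its own.

```r
device <- matrix(c(25, 5, 15, 15), nrow = 2, byrow = TRUE)
os     <- matrix(c(16, 11, 3, 21, 8, 1), nrow = 2, byrow = TRUE)

# For a 2 x 2 table, chisq.test() applies Yates' continuity correction by default.
res_device <- chisq.test(device)

# The 2 x 3 table has small expected counts in the Linux column, so R warns
# that the approximation may be inaccurate; the warning is silenced here.
res_os <- suppressWarnings(chisq.test(os))

print(res_device)  # X-squared = 6.075, df = 1, p about 0.014
print(res_os)      # df = 2, p well above 0.05
```

The warning on the second table is one reason to consider Fisher's exact test for small counts.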
The first example shows a significant difference, which means that ownership of device A differs significantly between male and female users. The effect size of the first test can be calculated with the vcd package.
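The vcd package's `assocstats()` function reports phi and Cramer's V for a table. In case the package is not installed, the same quantity can be sketched in base R from the effect size formula. The helper name `cramers_v` is hypothetical.

```r
# Hypothetical helper: phi (2 x 2 tables) or Cramer's V (larger tables).
cramers_v <- function(tbl) {
  # The effect size uses the chi-squared value without continuity correction.
  chi2 <- unname(suppressWarnings(chisq.test(tbl, correct = FALSE))$statistic)
  n <- sum(tbl)        # total number of samples
  k <- min(dim(tbl))   # number of rows or columns, whichever is smaller
  sqrt(chi2 / (n * (k - 1)))
}

device <- matrix(c(25, 5, 15, 15), nrow = 2, byrow = TRUE)
os     <- matrix(c(16, 11, 3, 21, 8, 1), nrow = 2, byrow = TRUE)

phi_device <- cramers_v(device)  # sqrt(7.5 / 60), about 0.35
v_os       <- cramers_v(os)      # about 0.19

print(phi_device)
print(v_os)
```

By the thresholds given earlier, the first value falls between a medium and a large effect, and the second between a small and a medium one.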
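For a 2×2 table, you can also calculate the odds ratio. The odds ratio describes how strongly the odds of an outcome differ between the two groups: here, the odds of owning device A for male users divided by the odds for female users. With the cells labeled as in the table below, it is calculated as ad / bc. In this example, (25 × 15) / (5 × 15) = 5.0.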
| | Own device A | Don't own device A |
|---|---|---|
| Male | a = 25 | b = 5 |
| Female | c = 15 | d = 15 |
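The ad / bc calculation can be sketched in a few lines of R. The variable names are hypothetical; the cells are read from the matrix by position to match a, b, c, and d in the table.

```r
device <- matrix(c(25, 5, 15, 15), nrow = 2, byrow = TRUE)

odds_male   <- device[1, 1] / device[1, 2]  # a / b = 25 / 5
odds_female <- device[2, 1] / device[2, 2]  # c / d = 15 / 15

# Dividing the two odds is the same as computing ad / bc.
odds_ratio <- odds_male / odds_female
print(odds_ratio)
```

An odds ratio of 1 would mean the odds of owning the device are the same in both groups.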
How to report
You can report the results of a Chi-square test like this:
Our Chi-square test with Yates' continuity correction revealed that the percentage of the ownership of device A significantly differed by gender (χ²(1, N = 60) = 6.08, p < 0.05, φ = 0.35, odds ratio = 5.0).
Fisher's exact test
You can use Fisher's exact test instead if your sample size is small. It is hard to say exactly how small is small, but in general, it is better to use Fisher's exact test than a Chi-square test when any cell of your data table has a count smaller than 10 (as in the examples above).
R code example
Running Fisher's exact test in R is pretty similar to running a Chi-square test.
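A minimal sketch with base R's `fisher.test()`, reusing the two example tables. The object names are hypothetical.

```r
device <- matrix(c(25, 5, 15, 15), nrow = 2, byrow = TRUE)
os     <- matrix(c(16, 11, 3, 21, 8, 1), nrow = 2, byrow = TRUE)

# For a 2 x 2 table, the result also includes an odds ratio estimate and its
# confidence interval. The estimate is a conditional maximum likelihood
# estimate, so it can differ slightly from the simple ad / bc value.
fisher_device <- fisher.test(device)

# fisher.test() also accepts tables larger than 2 x 2 (p value only).
fisher_os <- fisher.test(os)

print(fisher_device)
print(fisher_os)
```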
How to report
Reporting the results of Fisher's exact test is pretty much the same as for the Chi-square test. Unlike the Chi-square test, there is no test statistic such as chi-squared, so you just need to report the p value. Some people also include the odds ratio with its confidence interval.
McNemar's test
McNemar's test is basically a paired version of the Chi-square test. Let's say you asked the participants whether they liked the device before and after the experiment.
| | After experiment: Yes | After experiment: No |
|---|---|---|
| Before experiment: Yes | 6 | 2 |
| Before experiment: No | 8 | 4 |
Here, what you want to test is whether the number of participants who liked the device changed significantly between before and after the experiment.
Effect size
The effect size of McNemar's test can be calculated in the same way as the one for the Chi-square test.
R code example
Running McNemar's test is pretty similar to running a Chi-square test.
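A minimal sketch with base R's `mcnemar.test()`, using the before/after table. The object name `liking` and the dimension labels are hypothetical.

```r
# Rows: response before the experiment; columns: response after it.
liking <- matrix(c(6, 2,
                   8, 4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Before = c("Yes", "No"),
                                 After  = c("Yes", "No")))

# Only the off-diagonal cells (2 and 8), where participants changed their
# answer, contribute to the test statistic.
res_mcnemar <- mcnemar.test(liking)
print(res_mcnemar)  # McNemar's chi-squared = 2.5, df = 1, p above 0.05
```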
Thus, we cannot reject the null hypothesis, which means that the number of participants who liked the device did not change significantly between before and after the experiment. By default, mcnemar.test() applies a continuity correction. You can disable it with the correct = FALSE option, and the results will then be the same as those from a function for Cochran's Q test.
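To see what the continuity correction changes, the statistic can be worked out by hand from the two off-diagonal counts and compared with `mcnemar.test(..., correct = FALSE)`. This is a sketch with hypothetical variable names.

```r
liking <- matrix(c(6, 2, 8, 4), nrow = 2, byrow = TRUE)

yes_to_no <- liking[1, 2]  # 2 participants changed from Yes to No
no_to_yes <- liking[2, 1]  # 8 participants changed from No to Yes

# With continuity correction: (|b - c| - 1)^2 / (b + c)
stat_corrected   <- (abs(yes_to_no - no_to_yes) - 1)^2 / (yes_to_no + no_to_yes)
# Without correction: (b - c)^2 / (b + c)
stat_uncorrected <- (yes_to_no - no_to_yes)^2 / (yes_to_no + no_to_yes)

# p values from the chi-squared distribution with 1 degree of freedom.
p_corrected   <- pchisq(stat_corrected, df = 1, lower.tail = FALSE)
p_uncorrected <- pchisq(stat_uncorrected, df = 1, lower.tail = FALSE)

res_uncorrected <- mcnemar.test(liking, correct = FALSE)

print(c(corrected = stat_corrected, uncorrected = stat_uncorrected))
print(c(corrected = p_corrected, uncorrected = p_uncorrected))
```

In this example the correction lowers the statistic from 3.6 to 2.5, and neither version is significant at the 0.05 level.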
McNemar's test and binomial test
In SPSS, the binomial distribution is used for McNemar's test, so the results look different from those you get in R. A binomial test is very similar to McNemar's test, but its null hypothesis is that the proportions of the two categories match an expected distribution. In most cases, a binomial test is used to test whether two categories are equally likely to occur.
| | Question 2 (post-treatment): Yes | Question 2 (post-treatment): No |
|---|---|---|
| Question 1 (pre-treatment): Yes | a | b |
| Question 1 (pre-treatment): No | c | d |
More precisely, you should use a binomial test rather than McNemar's test if b + c in the 2×2 table is small. However, in R, you can run McNemar's test with continuity correction, so this does not cause a big problem: the results of a binomial test and McNemar's test with continuity correction become similar.
If you want to do a binomial test as SPSS does, you need to use the binom.test() function. It needs two numbers: the total count of the cases where the participants flipped their responses (i.e., b + c; in the example we are using, 2 + 8 = 10), and the count of one of these two cases (i.e., 2 or 8).
| | After experiment: Yes | After experiment: No |
|---|---|---|
| Before experiment: Yes | 6 | 2 |
| Before experiment: No | 8 | 4 |
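A minimal sketch of the binomial version with base R's `binom.test()`, next to McNemar's test with continuity correction for comparison. The variable names are hypothetical.

```r
yes_to_no <- 2  # participants who changed from Yes to No
no_to_yes <- 8  # participants who changed from No to Yes

# Null hypothesis: a flip is equally likely in either direction (p = 0.5).
res_binom <- binom.test(yes_to_no, yes_to_no + no_to_yes, p = 0.5)

# McNemar's test on the full table, with the default continuity correction.
liking <- matrix(c(6, 2, 8, 4), nrow = 2, byrow = TRUE)
res_mcnemar <- mcnemar.test(liking)

print(res_binom$p.value)    # 0.109375
print(res_mcnemar$p.value)  # about 0.114
```

Passing 8 instead of 2 as the first argument gives the same two-sided p value, because the null proportion of 0.5 makes the distribution symmetric.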
In this case, the p values are pretty close whichever way you run McNemar's test.
How to report
Reporting the results of McNemar's test is pretty much the same as for the Chi-square test.