Chi-square, Fisher's exact, and McNemar's test

Chi-square test

A Chi-square test is a common test for nominal (categorical) data. One application of a Chi-square test is a test for independence. In this case, the null hypothesis is that the distribution of the outcomes is the same for the two groups. For example, you have two user groups (e.g., male and female, or young and elderly), and you have nominal data for each group, such as whether they use mobile devices or which OS they use. So, your data look like the tables below. If the data of the two groups came from the same participants (i.e., the data are paired), you should use McNemar's test instead.

         Own device A   Don't own device A
Male          25                 5
Female        15                15

         Windows   Mac   Linux
Young       16      11      3
Old         21       8      1

And now you are interested in figuring out whether the outcomes for the two groups differ. The assumption of a Chi-square test is that the samples are taken independently, i.e., are unpaired. If not, you need to use McNemar's test. And if you have only a small sample size, you should use Fisher's exact test.

Effect size

The effect size of a Chi-square test can be described by phi or Cramer's V. If your data table is 2 x 2, you will calculate phi (k=2 in the equation below); otherwise, Cramer's V (k>2 in the equation below). But the calculation is pretty much the same in both cases, and it is as follows:

$\Large \phi \ or \ V= \sqrt{\frac{\chi^{2}}{N(k-1)}}$,

where N is the total number of samples, and k is the number of rows or columns in your data table, whichever is smaller. The chi-squared here is the value without any correction (e.g., without Yates' continuity correction). Here are the values which are considered small, medium, and large effect sizes.

                    small size   medium size   large size
Phi or Cramer's V      0.10          0.30         0.50
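If you want to see how the formula above works, you can compute phi by hand from the uncorrected chi-squared value of the first example (a small sketch; the variable names here are ours, not part of any package):

```r
# Phi for the 2x2 example above, computed directly from the formula.
data <- matrix(c(25, 5, 15, 15), ncol=2, byrow=T)
chi2 <- chisq.test(data, correct=F)$statistic  # uncorrected chi-squared (7.5)
N <- sum(data)                                 # total sample size (60)
k <- min(dim(data))                            # smaller of rows and columns (2)
phi <- sqrt(chi2 / (N * (k - 1)))              # sqrt(7.5 / 60) = 0.354
```

This agrees with the Phi-Coefficient that assocstats() in the vcd package reports for the same data.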

R code example

Let's use the examples above. First, prepare the data.

> data <- matrix(c(25, 5, 15, 15), ncol=2, byrow=T)
> data2 <- matrix(c(16, 11, 3, 21, 8, 1), ncol=3, byrow=T)


And run a Chi-squared test.

> chisq.test(data)



Pearson's Chi-squared test with Yates' continuity correction

data:  data
X-squared = 6.075, df = 1, p-value = 0.01371



> chisq.test(data2)



Pearson's Chi-squared test

data:  data2
X-squared = 2.1494, df = 2, p-value = 0.3414


So, the first example has a significant difference, which means the ownership of device A significantly differs between male and female users. The effect size of the first test can be calculated with the vcd package:

> library(vcd)
> assocstats(data)



X^2 df  P(> X^2)
Likelihood Ratio 7.7592  1 0.0053440
Pearson          7.5000  1 0.0061699

Phi-Coefficient   : 0.354
Contingency Coeff.: 0.333
Cramer's V        : 0.354


For a 2x2 table, you can also calculate the odds ratio, which describes how strongly the outcome is associated with the group (here, how much larger the odds of owning device A are for male users than for female users). With the cells labeled as below, it can be calculated as ad / bc.

         Own device A   Don't own device A
Male        a = 25            b = 5
Female      c = 15            d = 15


> (25 * 15) / (5 * 15)



[1] 5


How to report

You can report the results of a Chi-square test like this: Our Chi-square test with Yates' continuity correction revealed that the percentage of the ownership of device A significantly differed by gender ($\small \chi^{2}$(1, N = 60) = 6.08, p < 0.01, $\small \phi$ = 0.35, odds ratio = 5.0).

Fisher's exact test

You can instead use Fisher's exact test if your sample size is small. It is hard to say exactly how small is small, but in general, it is better to use Fisher's exact test than a Chi-square test when you have fewer than 10 samples in any cell of your data table (like the examples above).

R code example

Running Fisher's exact test is pretty similar to running a Chi-square test.

> fisher.test(data)



Fisher's Exact Test for Count Data

data:  data
p-value = 0.0127
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
1.335859 20.757326
sample estimates:
odds ratio
4.859427


How to report

How to report the results of a Fisher's exact test is pretty much the same as how to report the results of a Chi-square test. Unlike a Chi-square test, you don't have a test statistic like chi-squared, so you just need to report the p value. Some people also include the odds ratio with its confidence interval.
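If you want to include the odds ratio and its confidence interval in your report, you can pull them directly out of the fisher.test() result (a small sketch; the object name res is ours):

```r
# Extracting the values to report from Fisher's exact test.
data <- matrix(c(25, 5, 15, 15), ncol=2, byrow=T)
res <- fisher.test(data)
res$p.value    # p value (0.0127)
res$estimate   # odds ratio estimate (4.86; a conditional MLE, not exactly ad/bc)
res$conf.int   # 95 percent confidence interval (1.34 to 20.76)
```

Note that the odds ratio reported by fisher.test() (4.86) differs slightly from the simple ad / bc calculation (5.0), because it is a conditional maximum likelihood estimate.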

McNemar's test

McNemar's test is basically a paired version of the Chi-square test. Let's say you asked the participants whether they liked the device before and after the experiment.

                            After experiment
                            Yes        No
Before experiment   Yes      6          2
                    No       8          4

Here, what you want to test is whether the number of participants who liked the device changed significantly between before and after the experiment.

Effect size

The effect size of McNemar's test can be calculated in the same way as for a Chi-square test.
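For instance, you can derive phi for the example above from the uncorrected McNemar chi-squared, using the same formula as before (a sketch; the variable names are ours):

```r
# Phi for the McNemar example, using the same formula as for a Chi-square test.
data <- matrix(c(6, 2, 8, 4), ncol=2, byrow=T)
chi2 <- mcnemar.test(data, correct=F)$statistic  # uncorrected chi-squared (3.6)
N <- sum(data)                                   # total sample size (20)
phi <- sqrt(chi2 / N)                            # k - 1 = 1 for a 2x2 table
```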

R code example

Running McNemar's test is pretty similar to running a Chi-square test.

> data <- matrix(c(6, 2, 8, 4), ncol=2, byrow=T)
> mcnemar.test(data)



McNemar's Chi-squared test with continuity correction

data:  data
McNemar's chi-squared = 2.5, df = 1, p-value = 0.1138


Thus, we cannot reject the null hypothesis, which means that the number of participants who liked the device did not change significantly between before and after the experiment. As you can see here, mcnemar.test() automatically makes a correction for continuity. You can disable it with the correct=F option, and the results then become the same as those of the function for Cochran's Q test.
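For example, disabling the continuity correction looks like this (the chi-squared shown in the comment is (b - c)^2 / (b + c) = 36 / 10):

```r
# McNemar's test without the continuity correction.
data <- matrix(c(6, 2, 8, 4), ncol=2, byrow=T)
mcnemar.test(data, correct=F)   # McNemar's chi-squared = 3.6, df = 1
```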

McNemar's test and binomial test

In SPSS, the binomial distribution is used for McNemar's test. Thus, the results look different from those you can get in R. A binomial test is very similar to McNemar's test, but its null hypothesis is that the ratio of the two categories is equal to an expected distribution. In most cases, a binomial test is used for testing whether two categories are equally likely to occur.

                       Question 2
                       Yes     No
Question 1    Yes       a       b
              No        c       d

More precisely, you need to use a binomial test rather than McNemar's test if b + c in the 2x2 table is small. However, in R, you can run McNemar's test with continuity correction, so this will not cause a big problem because the results of a binomial test and McNemar's test with continuity correction become similar.

If you want to do a binomial test like SPSS does, you need to use the binom.test() function. And you need two numbers: the total count of the cases where the participants flipped their responses (i.e., b + c; in the example we are using, 2 + 8 = 10), and the count of one of these two cases (i.e., 2 or 8).

                            After experiment
                            Yes        No
Before experiment   Yes      6          2
                    No       8          4


> binom.test(2, 10, 0.5)



Exact binomial test

data:  2 and 10
number of successes = 2, number of trials = 10, p-value = 0.1094
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.02521073 0.55609546
sample estimates:
probability of success
0.2


In this case, the p value is pretty close regardless of which way you do a McNemar's test.

How to report

How to report the results of a McNemar's test is pretty much the same as those of a Chi-square test. See here for more details.
Dear: I've searched the internet and books for at least one month looking for an effect size estimate for a G-test of independence. Any idea about it? Thanks.
— Anonymous (2012-07-03 13:03:05)
For the question about the effect size of a G-test: it seems to be similar to a Chi-square test, so I think (but am not 100% sure) that the same effect size we use for the Chi-square test is appropriate. With SAS, you can perform a G-test along with a Chi-square test, and it shows the results along with Cramer's V.
http://udel.edu/~mcdonald/statgtestind.html

Again, I am not 100% sure about this, and couldn't find any concrete reference. So please take this with a grain of salt.
— KojiYatani (2012-08-02 13:15:30)
For the G-test see

http://en.wikipedia.org/wiki/G-test
http://stats.stackexchange.com/questions/25209/difference-between-g-test-and-t-test-and-which-should-be-used-for-a-b-testing
— Anonymous (2012-11-14 16:55:17)
Thank you! This is very helpful!! Do you have any idea why my Cramer's V values are different than the ones I get in SPSS?
— Anonymous (2013-04-12 01:56:02)
Would be nice to see ci's for phi / cramers phi / Cohen's w :)
— Anonymous (2013-07-29 19:57:51)