As I discuss in the section on null hypothesis significance testing, one drawback of the p value is that it does not tell us how large the effect is. For example, a t test can tell us whether one technique is faster than the other, but not how much the technique contributes to the improvement in performance time (i.e., the size of the effect). Moreover, the p value depends on the sample size. So, if all you want is a significant difference, you can simply keep running your study until you find one (although this might require hundreds or thousands of participants). Thus, we want metrics that do not depend on the sample size and that show the size of the effect.
Effect size is a metric that indicates the magnitude of the effect caused by a factor. Because it is independent of the sample size, the effect size can complement some of the shortcomings of NHST and the p value.
There are a number of metrics that indicate effect size; common ones are Cohen's d, Pearson's r, and eta-squared (and partial eta-squared). They are easier to understand in the context of the statistical tests they accompany, so I explain each of them with the corresponding statistical method (e.g., the effect size for a t test).
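As a small preview, Cohen's d for two independent samples is the difference between the group means divided by the pooled standard deviation. Here is a minimal sketch in Python; the completion-time data are hypothetical, made up purely for illustration:

```python
import math

def cohens_d(a, b):
    """Cohen's d for two independent samples: the mean difference
    divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

# Hypothetical completion times (seconds) for two techniques.
technique_a = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3]
technique_b = [13.5, 14.1, 12.9, 13.8, 14.4, 13.2]
print(cohens_d(technique_a, technique_b))  # negative: technique_a is faster
```

Note that nothing in this formula involves the total sample size as a separate term: collecting more data gives a more precise estimate of d, but does not systematically make d larger.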
As with the mean (or average), there is a notion of a confidence interval (CI) for the effect size. The calculated effect size is merely an estimate, not the exact value of the effect. The estimate thus has an error range, which the confidence interval expresses. One difference from the confidence interval for the mean is that the confidence interval for an effect size can be asymmetric: the estimated effect size is not necessarily the average of the lower and upper bounds of the interval. For example, you may encounter a case where your estimated effect size is 0.5 and the confidence interval is [0.1, 0.7].
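To see how an asymmetric interval arises, here is a small sketch using Pearson's r. A standard 95% CI for r is built with the Fisher z-transformation: the interval is symmetric in z-space, but becomes asymmetric around r once transformed back. The values (r = 0.6, n = 30) are hypothetical:

```python
import math

def pearson_r_ci(r, n, z_crit=1.96):
    """95% CI for Pearson's r via the Fisher z-transformation."""
    z = math.atanh(r)               # transform r to z-space
    se = 1 / math.sqrt(n - 3)       # standard error in z-space
    lo = math.tanh(z - z_crit * se) # transform the bounds back to the r scale
    hi = math.tanh(z + z_crit * se)
    return lo, hi

lo, hi = pearson_r_ci(0.6, 30)
print(round(lo, 3), round(hi, 3))
# The midpoint of the interval is below 0.6, so the CI is asymmetric
# around the estimate.
```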
Reporting the confidence interval for the effect size is not yet common, but some research fields have started to require it in journal and conference papers. Again, the calculated effect size is not a definite value, so it is understandable that knowing its precision is important.
Another important piece of information we can take from the confidence interval is whether it includes zero. A zero effect size means that the factor has no effect. Combining the effect size with the range of its confidence interval, we can read off the following implications.
| CI includes zero | Absolute value of the effect size | Range of the CI | Implication |
|---|---|---|---|
| No | small | small | The effect apparently exists, but the effect is small. |
| No | small | large | This is unlikely to happen. |
| No | large | small | The effect apparently exists, and we are sure that the effect is large. |
| No | large | large | The effect apparently exists and may be large, but we are not sure about the real size of the effect. |
| Yes | small | small | We are sure that the effect is small, but not sure whether the effect really exists. |
| Yes | small | large | We are not sure whether the effect exists. We thus need more data. |
| Yes | large | small | This is unlikely to happen. |
| Yes | large | large | We are not sure whether the effect exists. We thus need more data. |
A slightly tricky point is that the confidence interval does depend on the sample size. By simply increasing the sample size, you can narrow the confidence interval, and at some point you will be able to claim that the effect exists (again, this might require recruiting hundreds or thousands of participants). However, unlike the p value, the effect size itself is independent of the sample size: increasing the sample size does not make the effect any larger. Thus, looking at both the effect size and its confidence interval is important for interpreting the result correctly.
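This behavior can be illustrated with a common normal-approximation formula for the standard error of Cohen's d (an approximation chosen here for simplicity; exact intervals use the noncentral t distribution, e.g., via the MBESS package in R). The effect size d = 0.3 is a hypothetical value held fixed while the per-group sample size grows; only the interval narrows:

```python
import math

def d_ci(d, n_per_group, z_crit=1.96):
    """Approximate 95% CI for Cohen's d with two equal groups,
    using the normal-approximation standard error."""
    n1 = n2 = n_per_group
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z_crit * se, d + z_crit * se

# Same effect size, growing samples: the CI shrinks and eventually
# excludes zero, while d itself never changes.
for n in (10, 50, 200):
    lo, hi = d_ci(0.3, n)
    print(n, round(lo, 2), round(hi, 2))
```

With n = 10 per group the interval includes zero; by n = 200 it no longer does, even though the estimated effect size stayed at 0.3 throughout.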
If you want to know more about effect sizes and their confidence intervals, you can check statistics books and the following paper by Ken Kelley, which also explains how to use the MBESS package: Confidence Intervals for Standardized Effect Sizes: Theory, Application, and Implementation.