Cohen’s d for the Paired Design: A Better Way to Find the Confidence Interval

I’ve just read a great series of papers by Denis Cousineau and Jean-Christophe Goulet-Pelletier. They propose a new way to calculate a CI on Cohen’s d in the paired design (e.g., a pre-post study). It’s an approximation, but they show that it’s excellent. Great! We’ll be using it in esci jamovi.

Background

Cohen’s d is the ratio of an effect size (often a mean, or difference between means) to a standard deviation. Typically both are estimates from the data, so it’s hardly surprising that the distribution of d is complicated. We need noncentral t distributions. As Sue Finch and I explained back in 2001 (Cumming & Finch, 2001), noncentral t distributions allow us to calculate accurate CIs on d. The method works for the single group case, and for two independent groups, assuming homogeneity of variance.

Back then we used a very early version of ESCI to illustrate how sliding two ever-changing noncentral t curves along the d axis (the pivot method) allowed us, for those two designs, to find the lower and upper bounds of the CI on the d calculated from sample data. The figure below uses the version of ESCI that goes with UTNS to illustrate the pivot method.
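For readers who like to see the pivot method as code, here is a minimal sketch for the single-group case, using only the Python standard library. The function names (norm_cdf, nct_cdf, ci_d_single_group), the crude midpoint-rule integration, and the bisection bracket are all illustrative assumptions, not what ESCI does internally. The idea is simply to find the two noncentrality values for which the observed t = d × √N sits at the 97.5th and 2.5th percentiles, then divide by √N.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def nct_cdf(t, df, ncp, steps=2000):
    """P(T <= t) for a noncentral t with df degrees of freedom.

    Writes T = (Z + ncp) / S with S = sqrt(chi2_df / df), so that
    P(T <= t) = E[Phi(t*S - ncp)], and integrates over S by the midpoint rule.
    """
    upper = 1.0 + 8.0 / math.sqrt(2.0 * df)  # S is concentrated near 1
    h = upper / steps
    half = df / 2.0
    log_norm = -half * math.log(2.0) - math.lgamma(half)
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        v = df * s * s
        # log density of S: chi-square density at v, times dv/ds = 2*df*s
        log_f = ((half - 1.0) * math.log(v) - v / 2.0 + log_norm
                 + math.log(2.0 * df * s))
        total += math.exp(log_f) * norm_cdf(t * s - ncp)
    return total * h

def ci_d_single_group(d, n, conf=0.95):
    """Pivot-method CI on d for a single group of size n (illustrative only)."""
    t_obs = d * math.sqrt(n)
    df = n - 1
    alpha = 1.0 - conf

    def solve(target):
        # Bisection on ncp; the bracket width is an assumption that is
        # generous for moderate values of d and n.
        lo = t_obs - 10.0 - abs(t_obs)
        hi = t_obs + 10.0 + abs(t_obs)
        for _ in range(50):
            mid = (lo + hi) / 2.0
            if nct_cdf(t_obs, df, mid) > target:
                lo = mid  # the CDF falls as ncp grows, so move right
            else:
                hi = mid
        return (lo + hi) / 2.0

    ncp_lower = solve(1.0 - alpha / 2.0)
    ncp_upper = solve(alpha / 2.0)
    return ncp_lower / math.sqrt(n), ncp_upper / math.sqrt(n)

lower, upper = ci_d_single_group(0.5, 20)
print(f"d = 0.50, N = 20: 95% CI [{lower:.3f}, {upper:.3f}]")
```

The interval is generally not symmetric around d, which is exactly what the two sliding noncentral t curves show. In real work a tested noncentral t routine from a statistics library would be a better choice than this hand-rolled integration.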

For the paired design we couldn’t, alas, find even a good approximate way to calculate a CI on d.

Paired design: The Algina & Keselman approximation

Happily, by the time I was writing UTNS, Algina & Keselman (2003) had proposed an approximate solution to the problem of the paired case. They reported simulations that showed their method was pretty good, for a limited range of situations. In UTNS, pp. 306–307, I described my efforts to use simulations to assess their method. I found I could broaden the range of cases for which the approximation did well. Even so, there were limits, as stated in ESCI. For example, N had to be at least 6, and d_unbiased could not be greater than 2. But at least ESCI could provide quite a good approximate CI on d, for the paired design, for pretty much all cases researchers are likely to encounter.

Debiasing d

The usual d = [(effect size)/SD] overestimates δ, the population value. A simple correction factor, which depends on the df of the SD, gives us d_unbiased, which is what we should routinely use. In UTNS, for the paired case, I followed Borenstein et al. (2009) and used df = (N – 1), even though this seemed a little strange, given that the SD is estimated from the standard deviations of both measures (e.g., the pre-scores and the post-scores).
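To make the debiasing step concrete, here is a small sketch. It assumes the standard gamma-function form of the correction factor and a d standardized by the average of the pre and post variances. The function names and the scores are made up for illustration, and the df passed in is (N – 1) only because that is the UTNS choice described above.

```python
import math
from statistics import mean, stdev

def debias_factor(df):
    """Correction factor J(df) = Gamma(df/2) / (sqrt(df/2) * Gamma((df-1)/2)).

    Requires df >= 2. A common approximation is 1 - 3/(4*df - 1).
    """
    log_ratio = math.lgamma(df / 2.0) - math.lgamma((df - 1.0) / 2.0)
    return math.exp(log_ratio) / math.sqrt(df / 2.0)

def paired_d(pre, post):
    """d for paired data: mean difference over the average-variance SD."""
    sd_av = math.sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2.0)
    return (mean(post) - mean(pre)) / sd_av

# Made-up scores, for illustration only
pre = [10, 12, 9, 14, 11, 13, 8, 12]
post = [12, 13, 11, 15, 14, 13, 10, 15]

n = len(pre)
d = paired_d(pre, post)
d_unbiased = debias_factor(n - 1) * d  # df = N - 1, as in UTNS
print(f"d = {d:.3f}, d_unbiased = {d_unbiased:.3f}")
```

The factor is always a little below 1 and approaches 1 as df grows, so the choice of df matters most for small N.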

The big leaps forward

df for debiasing, paired design

Goulet-Pelletier & Cousineau (2018; erratum 2020) report a wide-ranging review of d and its CI. Their simulations suggest that in the paired case debiasing should use df = 2(N – 1), not (N – 1) as I used in UTNS and ESCI. They refer to d_unbiased as g.

Then Fitts (2020) investigated this issue and found by simulation that the debiasing df needs to reflect ρ, the population correlation between the two measures. When ρ = 0, df = 2(N – 1), as for independent groups. If ρ = 1, then df = (N – 1). Intermediate values of ρ need intermediate values of df.

Cousineau (2020) took a major step forward by finding a good approximation to the distribution of d in the paired design, and a formula for the df that includes ρ.
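As a rough illustration of how the df might slide between those two endpoints, the sketch below uses df = 2(N – 1)/(1 + ρ²). That particular form is an assumption here, chosen because it gives 2(N – 1) at ρ = 0 and (N – 1) at ρ = 1; please check it against Cousineau (2020) before relying on it. The function name paired_df is hypothetical.

```python
def paired_df(n, rho):
    """Debiasing df for the paired design, assumed form 2(N - 1) / (1 + rho^2)."""
    return 2.0 * (n - 1) / (1.0 + rho ** 2)

# df for N = 20 at several values of the population correlation
for rho in (0.0, 0.5, 0.8, 1.0):
    print(f"rho = {rho:.1f}: df = {paired_df(20, rho):.1f}")
```

In practice ρ is unknown and has to be estimated, typically by the sample correlation r, which adds some further uncertainty.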

A CI on d, paired design

Now, hot off the press, Cousineau & Goulet-Pelletier (2021) report a massive set of simulations that assess eight (!!) ways to calculate an approximate CI on d, five of them being their new proposals. The Algina-Keselman method that I used in UTNS turns out to be reasonable, but isn’t the best. The best is the ‘Adjusted Λ′’ (“lambda-prime”) method, which is one of their new proposals. This gives CIs that have very close to 95% coverage, and some other desirable properties, for a wide range of values of N, d, and ρ.

See their paper for a description of the method, and p. 58 for the R code. It’s probably what we’ll use in esci jamovi.

This progress makes me very happy. Maybe you too?

Geoff

Algina, J., & Keselman, H. J. (2003). Approximate confidence intervals for effect sizes. Educational and Psychological Measurement, 63, 537–553. https://doi.org/10.1177/0013164403256358

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. New York, NY: John Wiley & Sons.

Cousineau, D. (2020). Approximating the distribution of Cohen’s d_p in within-subject designs. The Quantitative Methods for Psychology, 16(4), 418–421. https://doi.org/10.20982/tqmp.16.4.p418

Cumming, G., & Finch, S. (2001). A primer on the understanding, use and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61, 532–574. https://doi.org/10.1177/0013164401614002

Fitts, D. (2020). Commentary on “a review of effect sizes and their confidence intervals, part I: The Cohen’s d family”: The degrees of freedom for paired samples designs. The Quantitative Methods for Psychology, 16(4), 281–294. https://doi.org/10.20982/tqmp.16.4.p281

Goulet-Pelletier, J.-C., & Cousineau, D. (2018). A review of effect sizes and their confidence intervals, Part I: The Cohen’s d family. The Quantitative Methods for Psychology, 14(4), 242–265. https://doi.org/10.20982/tqmp.14.4.p242

Goulet-Pelletier, J.-C., & Cousineau, D. (2020). Erratum to Appendix C of “A review of effect sizes and their confidence intervals, Part I: The Cohen’s d family”. The Quantitative Methods for Psychology, 16(4), 422–423. https://doi.org/10.20982/tqmp.16.4.p422

Cousineau, D., & Goulet-Pelletier, J.-C. (2021). A study of confidence intervals for Cohen’s d_p in within-subject designs with new proposals. The Quantitative Methods for Psychology, 17(1), 51–75. https://doi.org/10.20982/tqmp.17.1.p051