3 Easy Ways to Obtain Cohen’s d and its CI

[Update 7/4/2020 – Added reference to preprint on Cohen’s d for paired designs and put code in an actual code block]
Lots of research questions boil down to estimating the difference between two means (Mdiff = Mgroup_of_interest – Mreference_group). This is the ‘raw score’ effect size–it reports the difference between groups on the same scale of measurement they were measured on. Usually, that’s all you need (and an estimate of uncertainty). Sometimes, though, it’s nice to also obtain a standardized effect size, one that does not depend on the scale of measurement. In these cases, Cohen’s d is the go-to measure:
Cohen’s d = Mdiff / SD (but which SD?)
Cohen’s d turns out to be freaking complicated. First, there are issues with how to standardize the mean difference (which sd do you use?). This bumps up against the thorny issue of whether it is reasonable to assume equal variance. Then there’s the fact that Cohen’s d from a sample is slightly upwardly biased, so it needs to be corrected for bias, which causes some people to relabel it as Hedges’ g. And in case that wasn’t confusing enough, there’s an additional issue of how best to estimate the confidence interval of d. There are lots of solutions (some good, some bad), and most stats tools aren’t very clear on which approach they are using. That’s a surprising amount of complexity for what one would have hoped would be an easy standardization of effect size.
In this blog post I am not going to wade through all these complexities. Instead, I will demonstrate three different ways you can easily obtain Cohen’s d and its CI. Each of these approaches is transparent about the all-important choice of the denominator (Lakens, 2013). Each uses the technique of Goulet-Pelletier & Cousineau (2018), which simulation studies suggest is generally the best approach (though perhaps not for paired designs–see the section on “approaches” at the end for details). In all cases, we are going to assume equality of variance between groups/measures–it turns out that without this assumption the CI on d becomes problematic.
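To make the denominator choice concrete, here is a minimal base-R sketch of Cohen’s d standardized by the pooled SD, followed by the usual correction for small-sample bias. The helper name cohens_d_pooled is made up for illustration (it is not an esci function), and I am treating m1 as the group of interest and m2 as the reference group; that ordering is an assumption, so check the sign convention of whatever tool you use.

```r
# Hypothetical helper (NOT part of esci): Cohen's d from summary data,
# standardized by the pooled SD, assuming equal variances.
# Assumption: m1 is the group of interest, m2 is the reference group.
cohens_d_pooled <- function(m1, m2, s1, s2, n1, n2) {
  df <- n1 + n2 - 2

  # Pooled SD: the two variances weighted by their degrees of freedom
  sd_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)

  # Uncorrected standardized mean difference
  d <- (m1 - m2) / sd_pooled

  # Exact small-sample correction factor; lgamma avoids overflow for large df
  correction <- exp(lgamma(df / 2) - lgamma((df - 1) / 2)) / sqrt(df / 2)

  list(
    sd_pooled = sd_pooled,
    d = d,
    correction = correction,
    d_unbiased = d * correction
  )
}

result <- cohens_d_pooled(m1 = 15, m2 = 10, s1 = 2, s2 = 2, n1 = 20, n2 = 20)
result
```

With equal SDs of 2 in both groups the pooled SD is also 2, so the uncorrected d here is simply 5 / 2 = 2.5, and the correction shrinks it slightly toward zero.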
All three of the approaches I’ll explain are based around the esci package for R that I (Bob) am currently working on (https://github.com/rcalinjageman/esci). As of 7/3/2020 this package is a rough draft–I’m now working through it to make the code beautiful (to the extent I can). You can use it as-is with some confidence–but be warned that I am tinkering and may yet make breaking changes to the package. I don’t have much documentation yet (does this page count?), but you can find a basic walkthrough of the package here: https://osf.io/d89xg/wiki/tools:%20esci%20for%20R/
Method 1 – Use esci in jamovi
Let’s start with the easiest option: using a GUI. jamovi is a delightful point-and-click program for statistical analysis (https://www.jamovi.org/). It’s free, it’s open source, it runs on any platform (even Chromebooks), and it’s extensible with modules. I’d call it an SPSS replacement, but it is so much better than that. jamovi is built on R, so you can obtain R syntax for everything you do in jamovi (just turn on “Syntax mode”). Seriously, jamovi is great.
The esci package I’ve developed for R runs as a module in jamovi. Just: 1) run jamovi, 2) click the modules button near the top-right corner, 3) access the jamovi library, and 4) scroll down to esci and click install. You’ll now have an esci menu in your jamovi program (and it will stay there until you remove it–you only need to install a module 1 time per machine). There are screen-by-screen instructions here: https://thenewstatistics.com/itns/esci/jesci/
Once done, you can obtain Cohen’s d for both independent and paired designs, and you can do so from raw data or from just summary data. The commands to use are:
- esci -> estimate independent mean difference (the estimation version of an independent t-test), or
- esci -> estimate paired mean difference (the estimation version of a paired t-test)
For example, here I’ve selected “estimate independent mean difference”. In the analysis page that appears I’ve selected the toggle-box for “summary data”. I then typed in the means, standard deviations, and sample sizes for my two groups. In an instant, I get output which includes Cohen’s d and its CI.

Here’s a close-up of the output for Cohen’s d:
d_unbiased = 0.91, 95% CI [0.30, 1.63]
Note that the standardized effect size is d_unbiased because the denominator used was SDpooled, which had a value of 2.15. The standardized effect size has been corrected for bias. The bias-corrected version of Cohen's d is sometimes also (confusingly) called Hedges' g.
esci explains to you what denominator was used and its value (important) and it clarifies that correction for bias has been applied. One thing missing (for now) is a reference for the approach to obtaining the CI, which really matters… I’ll fix that soon. Maybe there is additional information that would be useful? If so, let me know.
Method 2 – Obtain Cohen’s d in R from summary data with estimateStandardizedMeanDifference
Maybe you are an R power user and you just can’t even when it comes to using a GUI for data analysis. No problem. esci is a package in R. It’s not on CRAN (and probably won’t be for some time), but you can obtain it directly from GitHub using the devtools package. Then you can use the function estimateStandardizedMeanDifference. Here’s a detailed code example that includes everything you would need to download and install the package:
# Setup -------------------------------------------
# First, make sure required packages are installed.
if (!is.element("devtools", installed.packages()[,1])) {
  install.packages("devtools", dep = TRUE)
}
if (!is.element("esci", installed.packages()[,1])) {
  devtools::install_github(repo = "rcalinjageman/esci")
}
# Second, load the required libraries
library("esci")
# Third, get some cohen's d
# Get d directly from summary data for a two-group design
estimate <- estimateStandardizedMeanDifference(
  m1 = 10,
  m2 = 15,
  s1 = 2,
  s2 = 2,
  n1 = 20,
  n2 = 20,
  conf.level = .95
)
estimate
# Get d directly from summary data for a paired design
estimate <- estimateStandardizedMeanDifference(
  m1 = 10,
  m2 = 15,
  s1 = 2,
  s2 = 2,
  n1 = 20,
  n2 = 20,
  r = 0.80,        # correlation between the paired measures
  paired = TRUE,
  conf.level = .95
)
estimate
# Or, use raw data to estimate the raw mean difference with CI *and* d with CI
# This boring example uses mtcars
data <- mtcars
data$am <- as.factor(data$am)
levels(data$am) <- c("automatic", "manual")
estimate <- estimateMeanDifference(
  data, am, mpg,
  paired = FALSE,
  var.equal = TRUE,
  conf.level = .95,
  reference.group = 1
)
estimate
plotEstimatedDifference(estimate)
Note that for the paired data I passed a flag (paired = TRUE) and *also* an r value–that’s the correlation between the paired measures. If you don’t have it, you can often calculate it from summary data and the t-test results.
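If all you have is a published paired t-test, one way to back out r is to recover the SD of the difference scores from t and then use the identity s_diff^2 = s1^2 + s2^2 - 2 * r * s1 * s2. Here is a minimal sketch of that arithmetic; the function name r_from_paired_t is hypothetical (it is not part of esci), and it assumes the reported t really is a paired t computed on n pairs.

```r
# Hypothetical helper (NOT part of esci): recover the correlation between
# paired measures from summary statistics and a paired t value.
# Assumes: t_value = m_diff / (s_diff / sqrt(n)), with n = number of pairs.
r_from_paired_t <- function(m_diff, s1, s2, n, t_value) {
  # SD of the difference scores implied by the paired t-test
  s_diff <- abs(m_diff) * sqrt(n) / abs(t_value)

  # Solve s_diff^2 = s1^2 + s2^2 - 2 * r * s1 * s2 for r
  (s1^2 + s2^2 - s_diff^2) / (2 * s1 * s2)
}

# Round-trip check using the same summary values as the paired example above:
# start from r = 0.80, compute the t that would have been reported, recover r.
s1 <- 2; s2 <- 2; n <- 20; m_diff <- 15 - 10
s_diff_true <- sqrt(s1^2 + s2^2 - 2 * 0.80 * s1 * s2)
t_reported <- m_diff / (s_diff_true / sqrt(n))

r_recovered <- r_from_paired_t(m_diff, s1, s2, n, t_reported)
r_recovered
```

Rounding in published means, SDs, and t values will carry through to r, so treat a recovered value as approximate.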
I have omitted the output here because it follows the exact same format as for jamovi above (after all, it’s running the same code under the hood).
Method 3 – Obtain Cohen’s d and its CI from raw data with estimateMeanDifference
Finally, let’s obtain Cohen’s d from raw data–and in the process obtain the raw-score mean difference and a nice plot that emphasizes the raw data and the effect size and its uncertainty.
Here’s a very uninspired example using the mtcars dataset–sorry it’s not very exciting, but there aren’t a lot of fun built-in datasets in R. We’ll compare the miles per gallon (mpg) for automatic vs. manual cars. The type of transmission is in the column “am” and it is coded as a numeric 0 (automatic) or 1 (manual). In this example I will make it into a factor (esci requires that a grouping variable be a factor) and relabel it to make the output easier to understand.
Again I’ve made the code complete, including everything needed to ensure esci is installed.
# Setup -------------------------------------------
# First, make sure required packages are installed.
if (!is.element("devtools", installed.packages()[,1])) {
  install.packages("devtools", dep = TRUE)
}
if (!is.element("esci", installed.packages()[,1])) {
  devtools::install_github(repo = "rcalinjageman/esci")
}
# Second, load the required libraries
library("esci")
# Now make a copy of mtcars and convert am to a labelled factor
data <- mtcars
data$am <- as.factor(data$am)
levels(data$am) <- c("automatic", "manual")
# Prepare yourself for some Cohen's d (and a nice plot)
estimate <- estimateMeanDifference(
  data, am, mpg,
  paired = FALSE,
  var.equal = TRUE,
  conf.level = .95,
  reference.group = 1
)
estimate
plotEstimatedDifference(estimate)
As you can see above, we use this function by passing the dataframe (data), the grouping variable (am) and the outcome variable (mpg). The reference.group parameter is optional–it specifies which level of your grouping variable should serve as the reference group when calculating the effect size (Mdiff = Mgroup_of_interest – Mreference_group). If you leave this parameter out, esci will use the first level of your grouping variable.
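If you want to sanity-check what the reference group does to the sign of the raw difference, here is a small base-R sketch that does not use esci at all. It takes the first factor level as the reference group, mirroring the default described above, and uses t.test with var.equal = TRUE for the CI on the raw mean difference. The variable names are just for illustration.

```r
# Base-R check of the raw mean difference (no esci involved).
data <- mtcars
data$am <- factor(data$am, levels = c(0, 1), labels = c("automatic", "manual"))

# First level = reference group, second level = group of interest
reference  <- levels(data$am)[1]
comparison <- levels(data$am)[2]

group_means <- tapply(data$mpg, data$am, mean)

# Mdiff = M(group of interest) - M(reference group)
m_diff <- group_means[[comparison]] - group_means[[reference]]

# t.test(x, y) estimates mean(x) - mean(y), so pass the comparison group first
tt <- t.test(
  data$mpg[data$am == comparison],
  data$mpg[data$am == reference],
  var.equal = TRUE
)
ci_raw <- as.numeric(tt$conf.int)

m_diff
ci_raw
```

Swapping which level counts as the reference flips the sign of Mdiff and of both CI limits, but nothing else changes.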
Again, the output for Cohen’s d follows the same format as above, so I’m not going to go through it. But check out the cool plot:

Approaches
There are a number of different ways to estimate the CI on Cohen’s d. esci uses the method explained by Goulet-Pelletier & Cousineau (2018). I’ll expand this blog post at some point to explain it, but for now I strongly suggest reading the actual paper–it not only clearly explains the approach but it also compares it against many other options, including the ones used in popular R packages (see the appendix)… it turns out not all R packages emit good quality CIs for d!
I’m deeply indebted to these authors–I was able to adapt the code they provided into esci and they have repeatedly helped answer questions to improve the function.
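To give a flavor of what estimating a CI on d involves, here is a sketch of one widely used idea for two independent groups: convert d to a t value, find the noncentrality parameters that put that t at the upper and lower tail of a noncentral t distribution, and convert those back to the d scale. This is a generic illustration written with base R’s pt and uniroot. I am not claiming it reproduces esci’s code or the exact variant the paper recommends, and the function name ci_d_noncentral is made up for this example.

```r
# Hypothetical sketch (NOT esci's implementation): CI for Cohen's d from
# two independent groups by inverting the noncentral t distribution.
# Assumes equal variances and that d was standardized by the pooled SD.
ci_d_noncentral <- function(d, n1, n2, conf.level = 0.95) {
  df    <- n1 + n2 - 2
  scale <- sqrt(1 / n1 + 1 / n2)   # converts between d and t
  t_obs <- d / scale
  alpha <- 1 - conf.level

  # Find the noncentrality parameter at which P(T <= t_obs) equals p
  find_ncp <- function(p) {
    uniroot(
      function(ncp) pt(t_obs, df = df, ncp = ncp) - p,
      interval = c(t_obs - 8, t_obs + 8),
      extendInt = "yes",
      tol = 1e-8
    )$root
  }

  # pt() decreases as ncp increases, so the lower limit pairs with 1 - alpha/2
  ncp_lower <- find_ncp(1 - alpha / 2)
  ncp_upper <- find_ncp(alpha / 2)

  c(lower = ncp_lower * scale, upper = ncp_upper * scale)
}

ci <- ci_d_noncentral(d = 0.5, n1 = 20, n2 = 20)
ci
```

Notice that the interval is built around the uncorrected d and need not be symmetric, which is one reason different tools can report slightly different limits for the same data.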
One big issue, though: a recent preprint I found on ResearchGate suggests that all approaches to obtaining a CI for d will fail with paired designs (Fitts, 2020). I’m still digesting this, and waiting to see the peer-reviewed version. But it is probably worth some extra caution with CIs for a paired design–the preprint shows they can have poor capture rates when r is very strong.
To Read
- Goulet-Pelletier, J.-C., & Cousineau, D. (2018). A review of effect sizes and their confidence intervals, Part I: The Cohen’s d family. The Quantitative Methods for Psychology, 14(4), 242–265. doi: 10.20982/tqmp.14.4.p242
- Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. doi: 10.3389/fpsyg.2013.00863