# Fun with esci in R: The simple two-group design

esci is now available as a module in jamovi and as a package in R (JASP coming soon, hopefully). Let’s have some fun with esci in R!

We’ll start with a simple two-group design. Specifically, we’ll use data from Experiment 4 of Kardas & O’Brien (2018). In this study, participants watched a video explaining how to do a simple mirror-tracing task (Cusack, Vezenkova, Gottschalk, & Calin-Jageman, 2015). Participants were randomly assigned to watch the training video either 1 time or 20 times. They then predicted how they would perform on the task (0-100%) and then completed the task (0-100%). Kardas & O’Brien found that watching the training video repeatedly boosted confidence (predicted scores) but not performance.

**Opening the data – R**

If you haven’t installed esci yet, you can do so with:

`install.packages("esci")`

Once installed, we will load esci into memory and then store the Kardas & O’Brien data set bundled with esci, giving it the name **mydata**:

```
library(esci)
mydata <- esci::data_kardas_expt_4
```

**Analyze the data in R**

We are going to analyze the effect of video **Exposure** on **Prediction** scores. We can do this with the estimate_mdiff_two function. We’ll want to store the result, so we’ll tell R to save it in a new variable called **estimate**.

```
estimate <- esci::estimate_mdiff_two(
  data = mydata,
  outcome_variable = Prediction,
  grouping_variable = Exposure,
  conf_level = 0.95,
  assume_equal_variance = TRUE
)
```

Note that we’ve decided to assume equal variance, but it’s probably a better default **not** to do this, and it’s easy enough to change the command by setting assume_equal_variance to FALSE.
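If you’re curious what difference that assumption makes, base R’s t.test illustrates the same trade-off. This is a minimal sketch with made-up data (not the Kardas & O’Brien values): var.equal = TRUE pools the variances (Student’s t), while var.equal = FALSE (the Welch approach, R’s default) does not.

```r
# Hypothetical data (not from the study): two groups with unequal spread
set.seed(1)
g1 <- rnorm(30, mean = 50, sd = 25)
g2 <- rnorm(30, mean = 60, sd = 10)

pooled <- t.test(g2, g1, var.equal = TRUE)   # Student's t, pooled variance
welch  <- t.test(g2, g1, var.equal = FALSE)  # Welch's t, R's default

pooled$parameter  # df = 58 (n1 + n2 - 2) under equal variance
welch$parameter   # smaller, fractional df under Welch
```

When the variances really do differ, the Welch interval is typically a bit wider and its df a bit smaller, which is why not assuming equal variance is the safer default.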

**Inspect the result**

What we get back in R is a list, a complex object that contains other objects. You can inspect this object in lots of different ways, but let’s try listing the objects it contains:

```
names(estimate)
[1] "properties" "es_mean_difference_properties"
[3] "es_mean_difference" "es_median_difference"
[5] "es_median_difference_properties" "es_smd_properties"
[7] "es_smd" "es_mean_ratio"
[9] "es_mean_ratio_properties" "es_median_ratio"
[11] "es_median_ratio_properties" "overview"
[13] "raw_data"
```

We can see that our result has properties, and then a bunch of different objects that start with es — that’s short for effect size. We get a mean difference, a median difference, a standardized mean difference (smd), a mean ratio, and a median ratio. Many of these have their own properties as well. Finally, we also get an overview and the raw_data.

Let’s see the overview:

```
> estimate$overview
outcome_variable_name grouping_variable_name grouping_variable_level mean mean_LL mean_UL median
1 Prediction Exposure 1 56.37795 52.90820 59.84770 60
2 Prediction Exposure 20 67.76224 64.49236 71.03212 71
median_LL median_UL sd min max q1 q3 n missing df mean_SE median_SE
1 53.57284 66.42716 22.07273 0 100 40 71.5 127 0 268 1.762318 3.279225
2 66.12580 75.87420 17.66669 0 100 59 81.0 143 0 268 1.660803 2.486881
```

You can see that overview is a table — it lists each group found in the data (1x and 20x exposure) and provides basic descriptive statistics: mean with confidence interval, median with confidence interval, standard deviation, etc.

Let’s take a look at the es_mean_difference table:

```
> estimate$es_mean_difference
type outcome_variable_name grouping_variable_name effect effect_size LL
1 Comparison Prediction Exposure 20 67.76224 64.492358
2 Reference Prediction Exposure 1 56.37795 52.908204
3 Difference Prediction Exposure 20 ‒ 1 11.38429 6.616553
UL SE df ta_LL ta_UL
1 71.03212 1.660803 268 65.020985 70.50349
2 59.84770 1.762318 268 53.469143 59.28676
3 16.15202 2.421576 268 7.387331 15.38124
```

You can see that this table gives us the mean and confidence interval of the 20x group, of the 1x group, and **of the difference between them**, reporting (in row 3) the **contrast** between the 20x and 1x groups. The 1x group, in this case, is the **reference group**: we express the effect size *relative* to the 1x group. The mean difference in prediction scores is 11.38, 95% CI [6.62, 16.15]. We also get the standard error, the degrees of freedom, and (in the ta_LL and ta_UL columns) the confidence interval at **two alpha** (a 90% CI in this case). Clearly, watching the instructional video made a pretty big difference in predictions: it boosted confidence by over 10 points on a 0-100 scale in this sample! There is some uncertainty about the size of the effect, but overall, it seems clear that more video exposure leads to more confidence.
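As a sanity check (not part of esci), we can reproduce the Difference row by hand in base R from the summary statistics printed in the overview table, under the equal-variance assumption:

```r
# Summary statistics copied from the overview table above
m1 <- 56.37795; s1 <- 22.07273; n1 <- 127   # 1x group
m2 <- 67.76224; s2 <- 17.66669; n2 <- 143   # 20x group

mdiff <- m2 - m1                                                    # mean difference
sp    <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))  # pooled SD
se    <- sp * sqrt(1 / n1 + 1 / n2)                                 # SE of the difference
df    <- n1 + n2 - 2
moe   <- qt(0.975, df) * se                                         # 95% margin of error

round(c(mdiff, mdiff - moe, mdiff + moe), 2)
# 11.38 6.62 16.15, matching the Difference row
```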

Notice that we have some other ways of expressing the effect size. For one, we can examine **median** differences — probably a better idea in many cases in psychology, but not widely done.

```
> estimate$es_median_difference
type outcome_variable_name grouping_variable_name effect effect_size LL UL SE
1 Comparison Prediction Exposure 20 71 66.125803 75.87420 2.486881
2 Reference Prediction Exposure 1 60 53.572837 66.42716 3.279225
3 Difference Prediction Exposure 20 ‒ 1 11 2.933636 19.06636 4.115567
ta_LL ta_UL
1 66.909445 75.09055
2 54.606155 65.39385
3 4.230494 17.76951
```

This table is set up similarly to es_mean_difference: we again get each group and the **contrast** between them. There is more uncertainty here (a difference of 11 points with a 95% CI [2.93, 19.07]). The data are consistent with a large median difference but also with a fairly small one of just about 3 points (and values near the CI boundary are not very different in their compatibility with the data). So we’d still want to be cautious about concluding there is a meaningful median difference.
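One reason median CIs tend to be wider is that the sample median is a noisier estimator than the mean. For intuition, here’s a quick base-R percentile-bootstrap sketch of a median difference, using hypothetical stand-in data (not the study’s values; esci computes its median CIs with its own method, so this is purely illustrative):

```r
# Hypothetical stand-ins with roughly the study's means, SDs, and ns
set.seed(42)
g1 <- rnorm(127, mean = 56, sd = 22)   # like the 1x group
g2 <- rnorm(143, mean = 68, sd = 18)   # like the 20x group

# Resample each group and take the difference in medians, many times
boot_diffs <- replicate(2000, {
  median(sample(g2, replace = TRUE)) - median(sample(g1, replace = TRUE))
})

quantile(boot_diffs, c(0.025, 0.975))  # percentile CI for the median difference
```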

Want more ways to express this? Of course! We can also think about the **ratios** between the group means or medians. Here’s the **ratio of medians**:

```
> estimate$es_median_ratio
outcome_variable_name grouping_variable_name effect effect_size LL UL comparison_median
1 Prediction Exposure 20 / 1 1.183333 1.037629 1.349498 71
reference_median
1 60
```

The 20x group had a median 1.18 times that of the 1x group, but the CI is broad [1.04, 1.35], so anything from a very small to a quite large increase in the median is compatible with these data.

And, of course, psychologists remain a bit obsessed with Cohen’s d. So let’s look at the es_smd table:

```
> estimate$es_smd
outcome_variable_name grouping_variable_name effect effect_size LL UL numerator denominator
1 Prediction Exposure 20 ‒ 1 0.5716119 0.327274 0.8149238 11.38429 19.86031
SE df d_biased
1 0.1244027 268 0.5732178
```

This is a fairly large effect, d = 0.57, 95% CI [0.33, 0.81], and the confidence interval is fairly narrow; we could fairly easily plan a sensitive follow-up study to help confirm and better characterize this effect.

But wait, there are lots of approaches to Cohen’s d… what is the denominator that was used and what flavor of Cohen’s d have we produced? Take a look at es_smd_properties to find out.

```
> estimate$es_smd_properties
$message
This standardized mean difference is called d_s because the standardizer used was s_p. d_s has been corrected for bias. Correction for bias can be important when df < 50. See the rightmost column for the biased value.
```

Ah, so this is d_s, because it used the pooled standard deviation (s_p) as the standardizer. If we had set assume_equal_variance to FALSE we’d have obtained d_avg, which uses the average of the group standard deviations as the standardizer.
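We can even check esci’s arithmetic by hand: the es_smd table above prints the numerator (the raw mean difference) and denominator (s_p), and the usual small-sample bias correction is approximately 1 - 3/(4*df - 1):

```r
# Values printed in the es_smd table above
numerator   <- 11.38429   # raw mean difference
denominator <- 19.86031   # pooled standard deviation, s_p
df <- 268

d_biased <- numerator / denominator
d_s <- d_biased * (1 - 3 / (4 * df - 1))   # approximate bias correction

round(c(d_biased, d_s), 4)
# 0.5732 0.5716, matching d_biased and effect_size in the table
```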

**Visualizations**

We don’t just want a bunch of tables… let’s **see** this data.

This is easy in esci: we just pass our stored result (**estimate**) to an appropriate plot function. In this case, we’ll use plot_mdiff to visualize a mean or median difference:

```
esci::plot_mdiff(
  estimate,
  effect_size = "mean"
)
```

and we get this beautiful figure:

We can then customize it to our heart’s content (it’s a ggplot2 plot object).

Want to see the median difference instead? Here we go:

```
esci::plot_mdiff(
  estimate,
  effect_size = "median"
)
```

and we get:

**Evaluating a Hypothesis**

Although Kardas & O’Brien conducted several studies on video exposure, this was the first study they conducted using mirror tracing as the performance task. Therefore, they probably didn’t yet have a clear quantitative prediction to test; they weren’t really ready for hypothesis testing. Imagine, though, that you are going to conduct a replication study. Based on Kardas & O’Brien, you believe increased video exposure produces a **substantive** change in confidence, and you decide to define this as at least a 5-point difference in means. In other words, you’re specifying an **interval null**. The skeptic’s hypothesis (the null hypothesis) is that any difference in confidence will be negligible (less than a 5-point difference); your hypothesis is that it will be substantive (at least a 5-point difference).

We can visualize your prediction against the results by tweaking our call to plot_mdiff just a bit:

```
esci::plot_mdiff(
  estimate,
  effect_size = "mean",
  rope = c(-5, 5)
)
```

We’ve passed a two-element vector that defines the interval null. This is called a ROPE, or region of practical equivalence. We’ve defined the ROPE using R’s c() function, which combines its arguments into a vector: c(-5, 5) creates a vector with elements -5 and 5, which is passed to the function’s rope argument.

Here’s what we get:

You can see the ROPE shaded in red and pink, and you can visually compare the results with your prediction and the skeptic’s. The rules for declaring victory are simple: if the whole CI of the result is inside the ROPE, the skeptic wins; if the whole CI is outside, you win; and if there is overlap, it’s a draw. In this case, we can see that the CI on the difference is fully outside the ROPE (though not by much). If the ROPE had really been established *a priori* and a sensitive experiment designed to test the predictions, we’d now have a strong confirmatory indication that there is, indeed, a substantive effect of video exposure on confidence (well, strong statistical evidence; we’d still need to think about the internal and external validity of our study and the extent to which it supports our claim).
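The decision rule itself is simple enough to write out in a few lines of base R. Here’s a sketch, plugging in the CI from the es_mean_difference table above:

```r
ci   <- c(6.616553, 16.15202)   # 95% CI on the mean difference
rope <- c(-5, 5)                # region of practical equivalence

if (ci[1] > rope[2] || ci[2] < rope[1]) {
  verdict <- "CI fully outside the ROPE: conclude the effect is substantive"
} else if (ci[1] >= rope[1] && ci[2] <= rope[2]) {
  verdict <- "CI fully inside the ROPE: conclude the effect is negligible"
} else {
  verdict <- "CI overlaps the ROPE: no firm conclusion"
}
verdict
```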

Want to conduct the hypothesis test a bit more formally? esci can help with the test_mdiff function, which takes arguments very similar to what we passed to plot_mdiff:

```
esci::test_mdiff(
  estimate,
  effect_size = "mean",
  rope = c(-5, 5)
)
```

We get back a complex object, but one of its components is a table called interval_null which has this content:

```
$interval_null
test_type outcome_variable_name effect rope
1 Practical significance test Prediction 20 ‒ 1 (-5.00, 5.00)
confidence CI
1 95 95% CI [6.616553, 16.15202]\n90% CI [7.387331, 15.38124]
rope_compare p_result
1 95% CI fully outside H_0 p < 0.05
conclusion significant
1 At α = 0.05, conclude μ_diff is substantive TRUE
```

Voilà!

In this example, we conducted a hypothesis test on mean differences, but we could just as easily work with median differences by changing the effect_size argument to “median”. How cool is it to be able to conduct interval null tests of differences in medians!? Think how sophisticated you will feel!

**Conclusions**

We’ve taken a quick tour of analyzing a two group design in esci.

esci is still in development. I expect the visualization functions, like plot_mdiff, to change a bit. But the overall workflow should hopefully be stable and sensible: you generate an estimate with an estimate_ function, then visualize it (plot_ functions) and/or evaluate a hypothesis with it (test_ functions). The estimate_ functions produce complex lists with all the results you need: an overview table, various es_ tables reporting different effect sizes, and _properties lists with all the nitty-gritty details. And that’s that!

- Cusack, M., Vezenkova, N., Gottschalk, C., & Calin-Jageman, R. J. (2015). Direct and conceptual replications of Burgmer & Englich (2012): Power may have little to no effect on motor performance. PLoS ONE. doi: 10.1371/journal.pone.0140806
- Kardas, M., & O’Brien, E. (2018). Easier seen than done: Merely watching others perform can foster an illusion of skill acquisition. Psychological Science. doi: 10.1177/0956797617740646
