# Gordon’s ‘dances’: Vivid Simulations Bring Statistical Ideas Alive

Bob and I are delighted to welcome Gordon Moore, who joins us in working on the second edition of ITNS. Gordon, an independent tutor in computing, statistics and mathematics, is based in England, so our ITNS2 team of three now spans three continents.

We are now releasing Gordon’s dances in beta, and seek your feedback. Developed in JavaScript, dances opens in your browser via this link. ITNS2 will be accompanied by Bob’s data analysis software, esci, in R, and Gordon’s web-based simulations, all of which are based on, and go beyond, my Excel-based ESCI. The first and most important of Gordon’s simulations is dances, which replaces and goes beyond CIjumping in ESCI.

Below are four examples of dances bringing key statistical ideas alive. These are frozen images: It’s way more convincing watching the simulations dancing down the screen.

Getting started with dances:

• Open dances in a browser
• Click on the ‘?’ at top right in the control panel (left side of screen) to turn on popout tips, which give brief explanations when the mouse hovers over labels or controls.
• Use the three big buttons. Play as you wish. Click ‘Clear’ to start again.

## 1. Variability: Very often larger than we think. Dance of the means.

Take repeated samples of size N = 20 from the pictured normally distributed population. Watch the pattern of values (blue open circles) jump around from sample to sample. Watch the means (green dots) from successive samples dance down the screen: So much variation, even with samples of size 20! This is the dance of the means.
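For readers who would like to see the same idea outside the browser, here is a minimal Python sketch of the dance of the means. It is not Gordon's code: the function name `dance_of_the_means` and the population values (mu = 50, sigma = 20) are assumptions chosen purely for illustration, and only N = 20 comes from the text above.

```python
import random
import statistics

def dance_of_the_means(mu=50.0, sigma=20.0, n=20, n_samples=25, seed=1):
    """Return the means of repeated random samples from a normal population.

    mu and sigma are illustrative assumptions, not values taken from dances.
    """
    rng = random.Random(seed)  # seeded so a run can be repeated exactly
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = dance_of_the_means()
for m in means:
    # One text row per sample: the 'o' marks where that sample's mean falls.
    print(f"{m:6.1f} " + " " * max(0, round(m)) + "o")

print("means range from", round(min(means), 1), "to", round(max(means), 1))
```

Each printed row is one sample, so reading down the output gives a rough text version of the means dancing down the screen. Changing `seed` gives a different dance.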

## 2. Randomness is lumpy but, in the long run, totally predictable. Dance of the confidence intervals.

Place 95% CIs on each of the dancing means, again with samples of N = 20. CIs that don’t capture the population mean, mu (blue line), are red. In the short term, red CIs seem to come very haphazardly, sometimes rarely, sometimes in clumps. In the long term, however, very close to 95.0% of CIs will capture mu and 5.0% will be red.

This happens when CIs are all the same length, being based on the population SD, sigma, assumed known. Remarkably, it also happens when, as in the picture below, CIs vary in length because they are based on sample SDs, when sigma is assumed not known. Either way, we are seeing the dance of the CIs.
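The long-run capture rate can be checked with a short simulation. The sketch below is an illustration under stated assumptions, not the dances implementation: mu = 50 and sigma = 20 are made-up population values, and the critical value 2.093 is the two-sided 95% value of t with 19 degrees of freedom, so it is only right for N = 20.

```python
import math
import random
import statistics

Z_CRIT = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
T_CRIT_DF19 = 2.093  # t critical value for df = 19; valid only when n = 20

def coverage(mu=50.0, sigma=20.0, n=20, reps=10000, seed=1):
    """Proportion of 95% CIs that capture mu: (sigma known, sigma unknown)."""
    rng = random.Random(seed)
    z_hits = 0
    t_hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = statistics.fmean(sample)
        s = statistics.stdev(sample)
        z_moe = Z_CRIT * sigma / math.sqrt(n)   # same length for every sample
        t_moe = T_CRIT_DF19 * s / math.sqrt(n)  # length varies with sample SD
        z_hits += int(abs(m - mu) <= z_moe)
        t_hits += int(abs(m - mu) <= t_moe)
    return z_hits / reps, t_hits / reps

z_cov, t_cov = coverage()
print(f"sigma known (z-based CIs):   {z_cov:.3f}")
print(f"sigma unknown (t-based CIs): {t_cov:.3f}")
```

Both proportions should settle close to .95 as `reps` grows, even though the t-based intervals change length from sample to sample.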

The falling means pile up to form the mean heap; means in the heap keep their colour, red or green. In the long run, the mean heap shape will closely match the theoretically expected, normally distributed, sampling distribution curve.
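To see how closely a mean heap matches the theoretical curve, one option is to bin the simulated means and compare each bin with the area under a normal curve that has mean mu and SD sigma/sqrt(N). The sketch below does that, again assuming mu = 50 and sigma = 20 for illustration; the function name `heap_vs_curve` is hypothetical.

```python
import math
import random
import statistics

def heap_vs_curve(mu=50.0, sigma=20.0, n=20, reps=10000, seed=1):
    """Compare binned sample means with the theoretical sampling distribution."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)              # SD of the sampling distribution
    curve = statistics.NormalDist(mu, se)  # theoretical curve for the heap
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    rows = []
    # Six bins, each one SE wide, covering mu - 3 SE to mu + 3 SE.
    for k in range(-3, 3):
        lo, hi = mu + k * se, mu + (k + 1) * se
        observed = sum(lo <= m < hi for m in means) / reps
        expected = curve.cdf(hi) - curve.cdf(lo)
        rows.append((k, observed, expected))
    return rows

rows = heap_vs_curve()
for k, observed, expected in rows:
    print(f"{k:+d} to {k + 1:+d} SE: heap {observed:.3f}, curve {expected:.3f}")
```

With more replications the heap proportions should move closer to the curve proportions in every bin.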

## 3. The Central Limit Theorem: Surprisingly close, even with tiny samples.

The central limit theorem states that, almost whatever the shape of the population distribution (it needs only a finite variance), the sampling distribution of sample means will be approximately normal, provided the samples are large enough. Furthermore, the larger the samples, the closer the sampling distribution will be to normal.

In dances you can draw whatever weird shape of population distribution you choose, then take samples of some chosen size, N, and compare the mean heap with the normal curve.

The figure below shows that, even with my hand-drawn, highly skewed population, and samples as tiny as N = 3, the mean heap is much less skewed than the population, and surprisingly close in shape to the symmetric normal curve.
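The hand-drawn population can't be reproduced in a few lines, so the sketch below swaps in an exponential population as a stand-in; that choice, and the helper names `skewness` and `mean_heap`, are assumptions for illustration only. It measures how the skewness of the mean heap shrinks as N grows (for independent samples, the skewness of the mean is the population skewness divided by sqrt(N)).

```python
import random

def skewness(values):
    """Population-style skewness: third central moment / SD cubed."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    third = sum((v - m) ** 3 for v in values) / n
    return third / var ** 1.5

def mean_heap(n, reps=20000, seed=1):
    """Means of `reps` samples of size n from an exponential population.

    The exponential population (skewness 2) is a stand-in, not the
    hand-drawn population shown in the figure.
    """
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(reps)
    ]

for n in (1, 3, 10, 30):
    # n = 1 shows the skewness of the population itself.
    print(f"N = {n:2d}: skewness of mean heap = {skewness(mean_heap(n)):.2f}")
```

With this stand-in population the heap for N = 3 is still visibly skewed, but much less so than the population, and the skewness keeps falling as N increases.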

## 4. The p value varies so widely it can’t be trusted: Dance of the p values.

Run a replication, exactly the same as the original experiment but with a new sample, and find that the p value is likely to be very different. The sampling variability of the p value is surprisingly large: Alas, we simply shouldn’t trust any p value.

The figure below shows the dance of the CIs and the corresponding p values, which vary from <.001 to more than .8! Colour patches run from deep blue for p>.10 through to bright red for p<.001. This is the dance of the p values!

The population mean, mu, is 60, and the SD, sigma, is 20. The null hypothesis is H0: mu = mu0 = 50, so the effect size in the population is (60 − 50)/20, which is half of sigma, or Cohen’s delta = 0.50, conventionally considered to be a medium-sized effect. With N = 16 and a two-tailed test at alpha = .05, the power is about .50, which is typical for many research fields in psychology and some other disciplines.
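The effect size and power figures can be checked with a few lines of Python. This sketch assumes a two-tailed z test at alpha = .05 with sigma known; a t test based on the sample SD would give somewhat lower power, though still about .50. The function name `z_test_power` is hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(mu=60.0, mu0=50.0, sigma=20.0, n=16, alpha=0.05):
    """Power of a two-tailed z test of H0: mu = mu0 (sigma assumed known)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic: delta * sqrt(n).
    shift = (mu - mu0) / (sigma / math.sqrt(n))
    # Probability of landing in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

delta = (60 - 50) / 20
power = z_test_power()
print(f"Cohen's delta = {delta:.2f}, power = {power:.3f}")
```

With these values the standard error is 20/sqrt(16) = 5, so the true mean sits 2 standard errors above mu0, only just beyond the 1.96 cutoff. That is why about half of all replications reach p < .05.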

The running simulation is way more vivid than any picture, especially when sounds are turned on, ranging from a bright trumpet for p<.001 down to a deep trombone for p>.10.

Change N, or population effect size, and see generally lower or higher p values but, most surprisingly, in every case the values of p still jump around dramatically.
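A rough way to see this without the simulation running is to generate many replications and look at how spread out the p values are for several sample sizes. The sketch below uses the population described above (mu = 60, sigma = 20, mu0 = 50) and, as a simplifying assumption, a two-tailed z test with sigma known; it is not how dances itself computes p. The function name `dance_of_p` and the extra sample sizes 32 and 64 are illustrative choices.

```python
import math
import random
import statistics
from statistics import NormalDist

def dance_of_p(n, mu=60.0, mu0=50.0, sigma=20.0, reps=5000, seed=1):
    """Sorted two-tailed z-test p values from `reps` replications of size n."""
    rng = random.Random(seed)
    z = NormalDist()
    se = sigma / math.sqrt(n)
    p_values = []
    for _ in range(reps):
        m = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        p_values.append(2 * (1 - z.cdf(abs(m - mu0) / se)))
    return sorted(p_values)

for n in (16, 32, 64):
    ps = dance_of_p(n)
    # 10th percentile, median and 90th percentile of the simulated p values.
    low, mid, high = ps[len(ps) // 10], ps[len(ps) // 2], ps[9 * len(ps) // 10]
    print(f"N = {n:2d}: 10th pct p = {low:.5f}, median p = {mid:.5f}, "
          f"90th pct p = {high:.5f}")
```

Larger N pushes the p values lower overall, yet the gap between the 10th and 90th percentiles still spans orders of magnitude at each sample size.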

For videos of such dances, search YouTube for ‘dance of the p values’ and ‘significance roulette’.

Figures and dances like those shown here will come in Chapters 4, 5, and 6 in ITNS2.

Meanwhile, please have a play with Gordon’s wonderful dances and let us have your thoughts and suggestions. Thanks.

Geoff