Cochrane: Matthew Page Wins the Prize!

Years ago, Matthew Page was a student in the School of Psychological Science at La Trobe University (in Melbourne), working with Fiona Fidler and me. He somehow (!) became interested in research methods and practices, especially as related to meta-analysis. He moved to Cochrane Australia, which is based at Monash University, also in Melbourne.

After completing a PhD there he had a post-doc with Julian Higgins, of Cochrane and meta-analysis fame, in Bristol, U.K. Then he returned to Cochrane Australia, where he is now a research fellow.

He has been building a wonderful research record, working with some leading scholars, including Julian Higgins (of course) and the late Doug Altman.

It was wonderful to hear, a day or two ago, this announcement:
“Cochrane Australia Research Fellow and Co-Convenor of the Cochrane Bias Methods Group Matthew Page recently took out this year’s Bill Silverman Prize, which recognises and celebrates the role of constructive criticism of Cochrane and its work.”

Read more about the prize and Matt’s achievements here.

Congratulations Matt!


The pic below shows Prof David Henry (left) presenting Matt with this year’s Bill Silverman Prize at the Cochrane Colloquium in Edinburgh.

Draw Pictures to Improve Learning?

In ITNS we included a short section near the start describing good strategies for learning, based on empirical studies. Scattered through the book are reminders and encouragement to use the effective strategies. Now, just as we’re thinking about possible improvements in a second edition, comes this review article:

Fernandes, M. A., Wammes, J. D., & Meade, M. E. (2018). The surprisingly powerful influence of drawing on memory. Current Directions in Psychological Science, 27, 302–308. DOI:10.1177/0963721418755385

It’s behind a paywall, but here is the abstract:

The surprisingly powerful influence of drawing on memory: Abstract
The colloquialism “a picture is worth a thousand words” has reverberated through the decades, yet there is very little basic cognitive research assessing the merit of drawing as a mnemonic strategy. In our recent research, we explored whether drawing to-be-learned information enhanced memory and found it to be a reliable, replicable means of boosting performance. Specifically, we have shown this technique can be applied to enhance learning of individual words and pictures as well as textbook definitions. In delineating the mechanism of action, we have shown that gains are greater from drawing than other known mnemonic techniques, such as semantic elaboration, visualization, writing, and even tracing to-be-remembered information. We propose that drawing improves memory by promoting the integration of elaborative, pictorial, and motor codes, facilitating creation of a context-rich representation. Importantly, the simplicity of this strategy means it can be used by people with cognitive impairments to enhance memory, with preliminary findings suggesting measurable gains in performance in both normally aging individuals and patients with dementia.

For the original articles that report the drawing-for-learning studies, see the reference list in the review article, or search for publications in 2016 and after by any of the three authors.

A few thoughts
I haven’t read the original articles, and the review doesn’t give values for effect sizes, but the research program–largely published in the last couple of years–takes an impressively broad empirical approach. There are many comparisons of different approaches to encoding, elaboration, and testing of learning. Drawing holds up very well in the great majority of comparisons. There are interesting suggestions, some already tested empirically, as to why drawing is so effective as a learning strategy.

As usual, lots of questions spring to mind. How effective could drawing be for learning statistical concepts? How could it be used along with ESCI simulations? Would it help for ITNS to suggest good ways to draw particular concepts, or should students be encouraged to generate their own representations?

These and similar questions seem to me to align very well with our basic approach in ITNS of emphasising vivid pictorial representations whenever we can. The dances, the cat’s eye picture, the forest plot…

Perhaps we should include drawing as a powerful extra recommended learning strategy, with examples and suggestions included in ITNS at strategic moments?

As usual, your comments and advice are extremely welcome. Happy sketching!


ITNS–The Second Edition!

Routledge, our publisher, has started planning for a second edition. That’s very exciting news! The only problem is that Bob and I can’t think of anything that needs improving. Ha! But, seriously, we’d love to hear from you about things we should revise, update, or somehow improve. (Of course, we’d also love to hear about the good aspects.) We’d especially like to hear from:

Teachers who are using ITNS. What do you like? What’s missing? What are the irritations? What difficulties have you encountered?

Students who are using ITNS. Same questions! Also, how could the book be more appealing, accessible, effective, even fun?

Potential teachers. You have considered ITNS, perhaps examining an inspection copy, but you decided against adoption. Why? Was it mainly the book and ancillaries, or outside factors? How could we revise so that you would elect to adopt?

The Routledge marketing gurus tell us that one strong message back from the field is: “ITNS is really good, just what the world needs and should be using. But for me, right now, it’s too hard to change. I’ll wait until others are using it, maybe until I’m forced to change.” If that’s how you feel, please let us know.

Perhaps that position is understandable, but it seems to conflict with the enthusiasm with which some (many?) young researchers are embracing Open Science, and the major changes to research practices that Open Science requires. Consider, for example, the emergence of SIPS and, just recently, the Down-Under version.

That position (i.e., it’s too hard to change right now) also contrasts sharply with the strong and positive responses that Bob and I get whenever we give talks or workshops about the new statistics and Open Science.

So we’re puzzled why more teachers are not yet switching their teaching approach–we’ve tried hard to make ITNS and, especially, its ancillaries as helpful as we can for anyone wishing to make the switch.

Thinking about how we could improve ITNS, here are a few of the issues you may care to comment about:

Open Science Lots has happened since we finalised the text of ITNS. We would certainly revise the examples and update our report of how Open Science is progressing. However, the basics of Open Science, as discussed in Chapter 1 and several later chapters, endure. ITNS is the first introductory text to integrate Open Science ideas all through, so we had to figure out for ourselves how best to do that. How could we do it better?

ESCI ESCI is intended to make basic statistical ideas vivid, memorable, and easy to grasp. It also allows you to analyse your own data and picture the results, for a range of measures and simple designs. Many of the figures in the book are ESCI images. However, in ESCI you can’t, for example, easily load, save, and manage files. The online ancillaries include workbooks with guidance for using ITNS with SPSS, and with R. Should we consider replacing ESCI, noting that we want to retain the graphics and simulations to support learning? Should we retain ESCI, but include more support for Jamovi, JASP, or something else? Other strategies we should consider?

NHST and p values We present these in Chapter 6, after the basics of descriptives, sampling, and estimation in earlier chapters. You can elect to skip this chapter, or give it as little or as much emphasis as you wish. Is this the best chapter organisation?

Ancillaries We offer a wide range via the publisher’s companion website. What’s most useful? Least useful? How could we improve the ancillaries?

These are just a few thoughts. Tell us about anything you wish. You could even tell us it’s all wonderful, if you like!

In advance, many thanks,

P.S. Make a public comment below, and/or email either of us, off list:

Open Science DownUnder — Fiona Fidler reports

Last week, the 2018 Australasian Open Science Conference was held in Brisbane at the University of Queensland: The first conference in Oz on the themes of Open Science and how to improve how science is done. They expected 40 and 140 turned up! By all reports it was a rip-roaring success. So mark your diary for the second meeting, likely to be in Melbourne on 7-8 Nov 2019. That’s a great time of year to escape the misery of the Northern Hemisphere in winter and take in a bit of sun, sand, and surf–and good science.

Fiona Fidler kindly provided the following brief report of last week’s meeting:

A new Open Science and Meta-Research community in Australia

Our research group recently attended the Australasian Open Science (un)conference at the University of Queensland. The meeting was modelled on SIPS (Society for Improving Psychological Science), which means the focus was on doing things, not streams of long talks.

For the first meeting of its kind in Australia, it certainly pulled a crowd. Organisers Eric Vanman, Jason Tangen and Barbara Masser (Psychology, UQ) had initially expected attendance of around 40. In the end, 140 of us gathered in Brisbane. And more still engaged through twitter, during and after the conference. It has been wonderful to discover this Australian community, and great plans to stay connected are emerging, e.g., formalising a Melbourne Open Science community and working towards our own Australian and interdisciplinary SIPS-style society. If you’re reading this and you’d like to add your name to the list of people interested in these things, send me (Fiona) an email and I’ll make sure you receive the survey Mathew Ling (Deakin Uni) is currently setting up.

This first meeting included: hackathons to establish checklists for assessing the reliability of published research; brainstorming sessions about open science practices in applied research; discussions (unconferences) on the existence of QRPs outside of a hypothesis testing framework, and practical problems in computational reproducibility; R workshops and sessions on creating ‘open tools’. A Rapid Open Science Jam at the end of the first day resulted in new project ideas, including one to survey undergraduate intuitions about open science practice. View the full program for all the other good things I haven’t mentioned (there are many). And of course, there’s more on twitter: #uqopenscience.

We are all very grateful to the large and impressive group of student volunteers who contributed to the great success of #uqopenscience, including Raine Vickers-Jones who opened the conference with warmth and enthusiasm.

In 2019 the conference will move to Melbourne and take on a slightly more interdisciplinary flavour, as the Australasian Interdisciplinary Open Science Conference. We expect to see ecologists, biologists, medical researchers and others, in addition to the existing psychology base. Tentative dates 7-8 Nov 2019 at the University of Melbourne. We anticipate being able to offer a limited number of travel scholarships for students and ECRs.

For now, look for updates on or contact the organising committee Fiona Fidler (Uni Melb, @fidlerfm), Hannah Fraser (Uni Melb @HannahSFraser), Emily Kothe (Deakin, @emilyandthelime) and Jennifer Beaudry (Swinburne, @drjbeaudry).
Thanks Fiona for that report. Shortly after sending it, she sent another message–saying ‘Here’s a much better blog post’ and then giving the link to:
Eight take-aways from the first Australasian Open Science conference

Well, whether or not better, it certainly has tons of great stuff. I’d love to have been there!

Here are a couple of thoughts of mine:

So young! Like SIPS, it looks like the median age of participants is less than half my age! Which is fantastic. If ever we worried that the next generation of researchers would play it safe and just do what their professors told them to do, well we need not have worried. They are creating the new and better ways to do science, and are finding ways to get it out there and happening. All of which is great. (Hey, reach for ITNS when it can help, especially by helping beginners into OS ways.)

Not just psychology Note #6, ‘it isn’t just psychology’, and also Fiona’s comments about the range of disciplines likely to be involved in the second meeting, next November. Psychology has the research skills to do the meta-research and collect evidence about scientific practices, and to develop many of the policies, tools, and materials needed for OS. That can all be valuable for numerous other disciplines as they make their versions of the OS journey.


A Wonderful Panorama of Statistics

Bob and I have been off-air for a while, but we haven’t gone away. I’ve been meaning for ages to blog about a wonderful book. Here it is:

Sowey, E., & Petocz, P. (2017). A panorama of statistics: Perspectives, puzzles and paradoxes in statistics. Wiley.

And the flyer with succinct information about the book is here. (Enjoy the full spread of the fine artwork that wraps around the book’s cover. The original painting, by Jeffrey Smart, is an enormous and wonderful sight in the foyer of one of Melbourne’s prominent theatres. Worth visiting!)

Panorama has been my beside-the-bed book for a while now. You could read it straight through, but I’ve preferred to dip in haphazardly, just about always finding something intriguing. It’s a cornucopia of statistical ideas, examples, oddities, paradoxes, historical tales, and more.

The back story: From 2003 to 2015, the journal Teaching Statistics published the Statistical Diversions column by Peter Petocz and Eric Sowey. Peter and Eric are distinguished statisticians–and statistics teachers–based in Sydney.

Whenever a new issue of Teaching Statistics arrived I would first turn to their column to check out the new goodies, and the commentaries they gave on the questions they’d posed in the previous issue.

Eric and Peter now present the content of those columns, and more, assembled into coherent chapters as their Panorama book. It’s a great resource for any teacher looking for ways to engage or extend their students, or for anyone simply interested to explore–and be fascinated by–the discipline of statistics.

Here are a few tastes:

Over about 20 years I built ESCI and wrote two books (the second with Bob, of course). For all that time I played with simulations of randomness, notably ESCI’s dances–of CIs in particular. I concluded early on that randomness is endlessly surprising and fascinating. It’s amazingly lumpy in the short term, while in the very long term it fits exactly with what theory says we should expect. Even with this long experience, I found Chapters 11 (Some paradoxes of randomness) and 12 (Hidden risks for gamblers) especially interesting.

My brief version of Q11.5 (p. 89): Two people, Alice and Bert, toss a fair coin numerous times. Alice scores a point when a Head turns up, Bert a point for a Tail. How often is the lead likely to change? See pp. 243-244 for the authors’ discussion–which may help us avoid unwarranted conclusions about what the movements of stock prices mean.
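If you'd like to see the surprise for yourself, here's a quick simulation sketch (mine, purely illustrative, not from the book). It counts how often the lead switches hands in long runs of fair-coin tosses; the typical count grows only with the square root of the number of tosses, far fewer changes than most people expect.

```python
import random

def lead_changes(n_tosses, rng):
    """Count how often the lead switches between Alice (Heads) and Bert (Tails)."""
    lead = 0      # Alice's score minus Bert's score
    holder = 0    # +1 if Alice last held the lead, -1 if Bert, 0 if no one yet
    changes = 0
    for _ in range(n_tosses):
        lead += 1 if rng.random() < 0.5 else -1
        if lead > 0 and holder == -1:
            changes += 1
        elif lead < 0 and holder == 1:
            changes += 1
        if lead != 0:
            holder = 1 if lead > 0 else -1
    return changes

rng = random.Random(1)
runs = [lead_changes(10_000, rng) for _ in range(200)]
mean_changes = sum(runs) / len(runs)
# Intuition says the lead should change thousands of times in 10,000 tosses;
# theory (and the simulation) says the typical number is on the order of
# sqrt(n), i.e. mere dozens.
print(round(mean_changes, 1))
```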

Getting the answer you want
You are teaching about questionnaires and wish to explain how a sequence of slanted questions can steer respondents in any direction you choose. A short and sharp satirical example is from the classic British Yes Minister program. See pp. 64 and 228 in the book, and the video here. (There are numerous links in the book. A list of all those links, in clickable form, is here.)

Eponymy and Stigler’s Law
We’re all familiar with many statistical eponyms (the Fisher exact test…). What is the relevance of Stigler’s law? Is that law true or false? If it’s true, must it be misleadingly named? See pp. 178-181 for an intriguing discussion, and pp. 292-295 for discussion of the questions posed in the earlier pages.

For more about the book, see an interview with the authors here, and to see the first few dozen pages of the book go here and click ‘look inside’.


P.S. On a totally different topic, one of the reasons I’ve been off-air is that Lindy and I joined a two-week tour of Greenland. It was fascinating. For example we visited the Ilulissat Glacier, which drains about 7% of the huge Greenland icecap, and which may have been the source of the iceberg that sank the Titanic. The most scary statistics I’ve seen for some time describe how that giant glacier–and others in Greenland–have greatly increased their rate of retreat in the last 10-20 years. In the case of the Ilulissat Glacier the calving is now no longer from a vast floating ice tongue, but from the much-retreated glacier front sitting on solid rock. So now the massive new icebergs all contribute to rising sea levels. That’s accelerating climate change in action, which is truly scary.

Positive Controls for Psychology – My pitch for a SIPS project

Positive controls are one of the most useful tools for ensuring interpretable and fruitful research. Strangely, though, positive controls are rarely used in psychological research. That’s a shame, but also an opportunity–it would be an easy but substantial improvement for psychological researchers to start regularly using positive controls. I (Bob) am currently at SIPS 2018; I’ll be giving a lightning talk about positive controls and hopefully developing some resources to encourage their use. To kick things off, I’ve started this OSF page on positive controls:

What is a positive control? A positive control (aka an active control) is a research condition that has a known effect in the research domain; it’s a research condition that ought to work if the research is conducted properly. For example, a researcher might be studying how much a new drug affects alertness. She will administer either the new drug or a placebo and then measure alertness with an oddball task. A positive control would be adding a third group that receives caffeine (blinded, of course), a drug well known to produce a modest increase in alertness on this task.
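To make the logic concrete, here's a small simulated sketch of that three-group design (the numbers and effects are invented for illustration, not from any real study). If the caffeine interval sits clearly above zero while the new drug's interval spans zero, a null result for the drug becomes much more interpretable.

```python
import math
import random
import statistics

def mean_diff_ci(g1, g2, z=1.96):
    """Approximate 95% CI on the difference between two group means."""
    diff = statistics.mean(g1) - statistics.mean(g2)
    se = math.sqrt(statistics.variance(g1) / len(g1) +
                   statistics.variance(g2) / len(g2))
    return diff, diff - z * se, diff + z * se

rng = random.Random(4)
n = 200
placebo  = [rng.gauss(100, 10) for _ in range(n)]
new_drug = [rng.gauss(100, 10) for _ in range(n)]  # suppose the drug truly does nothing
caffeine = [rng.gauss(105, 10) for _ in range(n)]  # known modest effect: +5 points

drug = mean_diff_ci(new_drug, placebo)
control = mean_diff_ci(caffeine, placebo)
# The caffeine (positive control) interval should sit clearly above zero,
# reassuring us the study was sensitive enough to detect a drug effect
# of that size, had one existed.
print([round(x, 2) for x in drug], [round(x, 2) for x in control])
```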

Why use positive controls?  There are several potential benefits:

  • Positive controls help indicate the sensitivity and integrity of the experiment.  If the experiment is conducted properly and with sufficient data then the positive control ought to show the expected effect.  If the positive control does not, then the researcher will know that something may have gone wrong and will be able to investigate.  Positive controls are especially useful for interpreting “negative” results.  From the example above, if the researcher finds that the new drug does not influence alertness, she may wonder about the result: was enough data collected and was the procedure administered correctly?  Checking that the positive control came out as expected gives reassurance that the research was conducted properly and was sensitive to the desired range of effects.
  • Positive controls can be used as a training tool–new researchers can run positive controls to ensure procedural proficiency before collecting real data (and while collecting real data).
  • Screening for outliers and/or non-compliant responding – for some positive controls there is a clear range of valid responses even at the individual level.  In these cases, positive controls provide an additional way to screen for outliers and unusual responses.

How to select a positive control?  To aid interpretation, a positive control should be well-matched to the experimental question.  The ideal positive control:

  • Is from the same research domain
  • Has a well-characterized effect size that is similar to what is expected for the research question (or a set of positive controls can be used to test sensitivity to small, medium, and large effects)
  • Is sensitive to the factors that could ruin the effect of interest
  • Is easy/short to administer

Can positive controls really help in psychology? Yes.  I (Bob) have been using positive controls extensively in my replication research.  These have been essential in demonstrating the quality and sensitivity of the replication research.  For some examples, see:

So how do I get started?  I (Bob) have started an OSF page on positive controls.  I’m hoping to use some of my time at SIPS 2018 to populate the page and start some research to show they are worth using.   Here’s the page (still in development):



Precision for Planning: Great New Developments

–updated with a link from Ken Kelley to access the functions in the paper, 6/28/2018–

In a new-statistics world, the best way to choose N for a study is to use precision for planning (PfP), also known as accuracy in parameter estimation (AIPE). Both our new-statistics books explain PfP and why it is better than a power analysis–which is the way to choose N in an NHST world. ESCI allows you to use PfP, but only for the two-independent-groups and paired designs.

The idea of PfP, as you may know, is to choose a target MoE; in other words, choose a CI length that you do not wish to exceed. Then PfP tells you the N that will deliver that MoE (or shorter)–either on average, or with 99% assurance.
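For the simplest textbook case, a single mean with a known population SD and no assurance requirement, the calculation is just a rearrangement of the CI formula. Here's a minimal sketch (the classic approximation only, not ESCI's fuller implementation):

```python
import math

def n_for_target_moe(sigma, target_moe, z=1.96):
    """Smallest N for which MoE = z * sigma / sqrt(N) does not exceed target_moe."""
    return math.ceil((z * sigma / target_moe) ** 2)

# Example: IQ-like scores (population SD 15), target MoE of 5 points.
print(n_for_target_moe(15, 5))  # -> 35
```

Halving the target MoE roughly quadruples the required N, which is why choosing the target thoughtfully matters.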

PfP is a highly valuable approach, hampered to date by the lack of software to make using it easy for a full range of measures and designs. Indeed the PfP techniques required to build such software have been developed only comparatively recently; many have been developed by Ken Kelley and colleagues. Further developments are needed and now Ken and colleagues have published a new article with important advances:

Kelley, K., Darku, F. B., & Chattopadhyay, B. (2018). Accuracy in parameter estimation for a general class of effect sizes: A sequential approach. Psychological Methods, 23(2), 226-243.

The translational (simplified) abstract is below, at the bottom.

The article may be available here, or you may need to get it via your library.

Traditional PfP, as described in our two books and implemented in ESCI, has some severe limitations:
1. The population distribution is assumed known–usually a normal distribution.
2. A particular effect size measure is used, for example the mean or Cohen’s d.
3. A value needs to be assumed for one or more population parameters, even though these are usually unknown. For example, our books and ESCI support PfP when target MoE is expressed in units of population standard deviation, even though this is usually unknown.
4. Traditional PfP gives a single fixed value of N for the target MoE to be achieved (on average, or with 99% assurance).

Very remarkably, the Kelley et al. article improves on ALL 4 of these aspects of traditional PfP! Imho, this is a wonderful contribution to our understanding of PfP and to the range of situations in which PfP can be used. It will, I hope, contribute to the much wider use of PfP for sample-size planning.

Much of the article is necessarily quite technical, but here is my understanding of the approach, in relation to the 4 points above.
1. A non-parametric approach is taken, meaning that no particular form of the population distribution is assumed. Using the central limit theorem makes the analysis tractable.
2. A very general form of effect size measure is assumed (in fact, the ratio of two functions of the population parameters). A large number of familiar effect size measures, including the mean, mean difference, and Cohen’s d, are special cases of this general measure, so the PfP technique that Kelley et al. develop can be applied to any of these familiar measures, as well as many others.
3. The sequential approach they take–see 4 below–allows them to estimate the relevant population parameters, and to update that estimate as the process proceeds. No dubious assumption of parameter values is required.
4. Conventional approaches to statistical inference rely on N being specified in advance. Open Science has emphasised that data peeking invalidates p values and other conventional approaches to inference. (In data peeking, you run a study, analyse, then decide whether to run some more participants, for example until statistical significance is achieved.) Avoiding data peeking is one reason for preregistration–which includes stating N in advance, or at least the stopping rule, which must not depend on the results obtained.

However, sequential analysis was developed about 75 years ago in the NHST world. It is seldom used in the behavioral sciences, but allows you to analyse data collected to date and then decide whether to continue, or to stop and declare in favour of the null hypothesis, or the specified alternative hypothesis. The stopping rule is designed so the procedure gives the Type I and Type II error rates selected for the NHST. Yes, sequential analysis is more complex to use, and you don’t know in advance how many participants you will need, but it can on average lead to smaller N being required than for conventional fixed-N approaches.

Kelley et al. have very cleverly used the sequential approach to PfP and, at the same time, have solved 3 above. The idea is that you take a pilot sample of size N1, then use the results from that to estimate relevant parameters and to calculate the MoE on your effect size estimate. If that MoE is not sufficiently short to provide the precision you are seeking, you test a further N0 participants (N0 is generally small, and may be 1). Then again estimate the parameters and calculate MoE. If that MoE is not sufficiently small, test a further N0 participants, and so on, until you achieve the desired precision.

Then interpret the final effect size estimate and its CI. Yes, the method may be complex, but it is very general and should on average give a smaller N than conventional PfP would require.
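Here's my rough sketch of that loop for the simplest case, a single mean, using the familiar normal-approximation MoE. It only illustrates the sequential idea; it is not Kelley et al.'s actual procedure, which uses their general effect size measure and a carefully justified stopping rule:

```python
import math
import random
import statistics

def sequential_aipe_mean(draw, target_moe, n1=30, n0=1, z=1.96, max_n=100_000):
    """Sequential precision-for-planning sketch for a single mean.

    Take a pilot sample of n1 observations, estimate the SD, compute the
    current MoE (z * s / sqrt(n)); if it is too wide, add n0 observations,
    re-estimate, and repeat until the target MoE is achieved."""
    data = [draw() for _ in range(n1)]
    while True:
        s = statistics.stdev(data)              # SD re-estimated at every step
        moe = z * s / math.sqrt(len(data))
        if moe <= target_moe or len(data) >= max_n:
            return statistics.mean(data), moe, len(data)
        data.extend(draw() for _ in range(n0))

rng = random.Random(2)
est, moe, n = sequential_aipe_mean(lambda: rng.gauss(50, 10), target_moe=2.0)
# Stops once the estimated MoE reaches the 2.0 target; no parameter values
# had to be assumed in advance.
print(n, round(moe, 2))
```

Note how the unknown SD is estimated from the accumulating data itself, which is exactly the advance over traditional PfP described in point 3 above.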

I find the generality and potential of the method stunning, and I can’t wait to see it made available within full-function data analysis applications. That will give a great boost to the highly desirable shift from power analysis to PfP, and more generally from NHST to the new statistics. Hooray!


—UPDATE — Ken Kelley writes:

On my web site is a link with instructions and code for a few specific instances of the method. The link is here:

For each of the effect sizes, there are several functions that need to be run first. But, after getting those into one's workspace, the actual function is easy to use. The functions available at the above link are for the coefficient of variation, for a regression coefficient in simple regression, and for the standardized mean difference. 

My co-authors and I have plans to develop an R package for more general applications. In fact, we already have made progress on the package, which will focus on sequential methods.

Translational Abstract
Accurately estimating effect sizes is an important goal in many studies. A wide confidence interval at the specified level of confidence (e.g., .95) illustrates that the population value of the effect size of interest (i.e., the parameter) has not been accurately estimated. An approach to planning sample size in which the objective is to obtain a narrow confidence interval has been termed accuracy in parameter estimation. In our article, we first define a general class of effect size in which special cases are several commonly used effect sizes in practice. Using the general effect size we develop, we use a sequential estimation approach so that the width of the confidence interval will be sufficiently narrow. Sequential estimation is a well-recognized approach to inference in which the sample size for a study is not specified at the start of the study, and instead study outcomes are used to evaluate a predefined stopping rule, which evaluates if sampling should continue or stop. We introduce this method for study design in the context of the general effect size and call it “sequential accuracy in parameter estimation.” Sequential accuracy in parameter estimation avoids the difficult task of using supposed values (e.g., unknown parameter values) to plan sample size before the start of a study. We make these developments in a distribution-free environment, which means that our methods are not restricted to the situations of assumed distribution forms (e.g., we do not assume data follow a normal distribution). Additionally, we provide freely available software so that readers can immediately implement the methods.

P.S. I haven’t yet located the software mentioned in the final sentence above. Ken’s great software for PfP (and other things) is MBESS, so that may be where to look.

Effect Sizes for Open Science

For the last 20 years or so, many journals have emphasised the reporting of effect sizes. The new statistics also emphasised the reporting of CIs on those effect sizes. Now Open Science places effect sizes, CIs, and their interpretation centre stage.

Here’s a recent article with interesting discussion and much good advice about using effect sizes in an Open Science world:
Pek, J., & Flora, D. B. (2018). Reporting effect sizes in original psychological research: A discussion and tutorial. Psychological Methods, 23(2), 208-225.

The translational (simplified) abstract is down the bottom.

Unfortunately, Psychological Methods is behind the APA paywall, so you will need to find the article via a library. (Sometimes it’s worth searching for the title of an article, in case someone has secreted the pdf somewhere online. Not in this case at the moment, I think.)

A couple of the article’s main points align with what we say in ITNS:

Give a thoughtful interpretation of effect sizes, in context
Choose effect sizes that best answer the research questions and that make most sense in the particular situation. Often interpretation is best done in the original units of measurement, assuming these have meaning–especially to likely readers. Use judgment, compare with past values found in similar contexts, give practical interpretations where possible. Where relevant, consider theoretical and possibly other aspects or implications. (And, we add in ITNS, consider the full extent of the CI in discussing and interpreting any effect size.)

Use standardised effect sizes with great care
Standardised (or units-free) effect sizes, often Cohen’s d or Pearson’s r, can be invaluable for meta-analysis, but it’s often more difficult to give them practical meaning in context. Beware glib resort to Cohen’s (or anyone else’s) benchmarks. Be very conscious of the measurement unit–for d, the standardiser. If, as usual, that’s a standard deviation estimated from the data, it has sampling variability and will be different in a replication. (In my first book, UTNS, I introduced the idea of the rubber ruler. Imagine the measuring stick to be a rubber cord, with knots at regular intervals to mark the units. Every replication results in the cord being stretched to a different extent, so the knots are farther apart or closer together. The Cohen’s d value is measured in units of the varying distance between knots.)
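The rubber ruler is easy to demonstrate by simulation. In this sketch (mine, purely illustrative), every replication of the same two-group experiment yields a different pooled SD, so the identical true effect is measured with a differently stretched ruler each time:

```python
import math
import random
import statistics

def cohens_d(g1, g2):
    """Cohen's d using the pooled SD as standardiser; returns (d, pooled SD)."""
    n1, n2 = len(g1), len(g2)
    sp = math.sqrt(((n1 - 1) * statistics.variance(g1) +
                    (n2 - 1) * statistics.variance(g2)) / (n1 + n2 - 2))
    return (statistics.mean(g1) - statistics.mean(g2)) / sp, sp

rng = random.Random(3)
ds, rulers = [], []
for _ in range(1000):                       # 1000 replications of one experiment
    control = [rng.gauss(0, 1) for _ in range(32)]
    treatment = [rng.gauss(0.5, 1) for _ in range(32)]   # true d = 0.5
    d, sp = cohens_d(treatment, control)
    ds.append(d)
    rulers.append(sp)                       # the 'ruler' stretches from run to run
# The population SD is exactly 1, yet the estimated standardiser varies
# noticeably across replications, and d varies with it.
print(round(min(rulers), 2), round(max(rulers), 2))
```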

There’s also lots more of interest in this article, imho.

Translational Abstract
We present general principles of good research reporting, elucidate common misconceptions about standardized effect sizes, and provide recommendations for good research reporting. Effect sizes should directly answer their motivating research questions, be comprehensible to the average reader, and be based on meaningful metrics of their constituent variables. We illustrate our recommendations with four different empirical examples involving popular statistical methods such as ANOVA, categorical variable analysis, multiple linear regression, and simple mediation; these examples serve as a tutorial to enhance practice in the research reporting of effect sizes.

APS in San Fran 3: Workshop on Teaching the New Stats

Tamarah Smith and Bob presented a workshop on Teaching the New Stats to an almost sold-out crowd. I wasn’t there, but by all reports it went extremely well. Such a workshop seems to me a terrific way to help interested stats teachers introduce the new stats into their own teaching.

After taking that first step, it may all get easier because, in my experience, teaching the new stats brings its own reward: students understand better and therefore feel better. So we teachers will also feel the joy.

Tamarah and Bob’s slides are here. It strikes me as a wonderful collection, with numerous links to useful resources, and great advice about presenting an appealing and up-to-date course to beginning students. Also, indeed, to more advanced students. It’s well worth browsing those slides. Here are a few points that struck me as I browsed:

**You can download the workshop files, and follow along.

**You may know that jamovi and JASP are open source applications for statistical analysis designed to supersede commercial applications, notably SPSS. They are more user-friendly, as well as being extensible by anyone. Already, add-on modules written in R are beginning to appear. (These are exciting developments, worth trying out.)

**Bob is developing add-ons for jamovi for the new statistics. (Eventually, jamovi augmented by Bob’s modules may replace ESCI–with the great advantages of providing data file management and a gateway to the power and scope of a full data analysis application.)

**The workshop discussed three simple examples (comparison of two independent means, comparison of two independent proportions, and estimation of interactions).

**The first example (independent means) was discussed in most detail: a traditional NHST analysis was compared with new-stats analyses using ESCI and then jamovi (with Bob’s add-on), followed by a Bayesian credible-interval approach. Then came meta-analysis, to emphasise that estimation thinking and meta-analytic thinking are essential frameworks for the new stats.

**The GAISE guidelines were used to frame the discussion of the pedagogical approach. There is lots on encouraging students to think and judge in context–which should warm the heart of any insightful stats teacher.

**There are examples of students’ responses to illustrate the presenters’ conviction that conceptual understanding is better when using the new stats.

**There’s a highly useful discussion of a range of statistical software for new-stats learning and data analysis.
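The workshop’s first example (a CI on the difference between two independent means) is simple enough to sketch in a few lines. The sketch below uses made-up scores, not the workshop’s data, and for brevity uses a normal-approximation critical value from Python’s standard library; a t critical value is more accurate for samples this small.

```python
from statistics import NormalDist, mean, variance

# Hypothetical scores for two independent groups (illustration only).
group_a = [12.1, 9.8, 11.5, 10.2, 13.0, 9.5, 12.4, 10.8]
group_b = [8.9, 10.1, 7.6, 9.3, 8.2, 10.5, 7.9, 9.0]

diff = mean(group_a) - mean(group_b)
# Standard error of the difference between two independent means.
se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
z = NormalDist().inv_cdf(0.975)  # approx. 1.96; a t critical value would be
                                 # a little larger for n = 8 per group
lower, upper = diff - z * se, diff + z * se
print(f"Mean difference = {diff:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Reporting the point estimate with its full interval, rather than a p value, is the estimation thinking the workshop was advocating: the CI shows both the best estimate of the effect and the precision of that estimate.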

There’s lots more, but I’ll close with the neat summary below of the new-stats approach, which is now best ethical practice for conducting research and inferential data analysis.


APS in San Fran 2: Symposium on Teaching the New Stats

Our symposium was titled Open Science and Its Statistics: What We Need to Teach Now. The room wasn’t large, but it was packed, standing room only. I thought the energy was terrific. There were four presentations, as below.

Bob and Tamarah Smith have set up an OSF page on Getting started teaching the New Statistics. It holds all sorts of goodies, including the slides for our symposium.

At that site, expand OSF Storage, then Open Science and its Statistics–2018 APS Symposium slides and see 4 files for the 4 presentations:

Bob Calin-Jageman (Chair)
Open Science and Its Statistics: What We Need to Teach Now
Examples of students being stumped by traditional NHST analysis and presentation of a result, but readily (and happily) understanding the same result presented using the new stats. In addition, we should teach and advocate the new statistics to improve statistical understanding and communication across all of science that uses statistical inference.

Geoff Cumming
Open Science is Best Practice Science
Being ethical as a teacher or researcher requires that we use current best practice, and for statistical inference that is the new statistics. The forest plot is, in my experience, a highly effective picture for introducing students to estimation and meta-analysis. I gave a paper advocating its use in the intro stats course back in 2006. In my experience, the new stats is, in contrast to NHST, a joy to teach. Students saying ‘It just all makes sense…’ is one of the most heart-warming things any teacher can hear.

Susan Nolan & Kelly Goedert
Transitioning a Traditional Department: Roadblocks and Opportunities for Incorporating the New Statistics and Open Science into Teaching Materials
Roadblocks include NHST appearing everywhere, common software (SPSS) not making the new stats easy, colleagues who are steeped in the old ways, widely-used textbooks taking traditional approaches, … But there are new textbooks and open source software emerging, and there are strategies for spreading the word and bringing colleagues on board. (See the slides for numerous practical suggestions and links to many useful resources to support teaching and using the new statistics.)

Tamarah Smith
Feeling Good about the New Statistics: How the New Statistics Improves the Way Researchers and Students Feel about Statistics
Statistics anxiety is a problem for many students, and impedes their learning. The new statistics opens up great opportunities for teaching so that anxiety is much reduced and students’ attitudes are more positive. The new statistics helps teachers meet the GAISE guidelines for assessment and instruction in stats education. Students feel better and more engaged and their learning is grounded. As a result their teachers also feel better. Let’s do it.

Personally, I found it wonderful to hear so many examples and reasons all converging on the conclusion that teaching the new statistics (1) is what’s needed for ethical science, (2) helps students understand much better and feel good about their learning, and (3) is great for teachers also. A triple win!


P.S. The pic below is from Bob’s slides and is adapted from Kruschke and Liddell (2018). The crucial thing for Open Science is the shift to estimation and meta-analysis, and away from the damaging dichotomous decision making of NHST. The estimation and meta-analysis can be frequentist (conventional confidence intervals) or Bayesian (credible intervals)–either is fine. In other words, there is a Bayesian new statistics, alongside the new statistics of ITNS. Maybe the Bayesian version will come to be the more widely used? I believe the biggest hurdle to that happening is the current lack of good teaching materials that make Bayesian estimation easily accessible to beginning students, as well as to researchers steeped in NHST.

But the main message is that either of the cells in the bottom row is just what we need.

Kruschke, J. K. & Liddell, T. M. (2018). The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychonomic Bulletin & Review, 25, 178–206.