A few words about the latest from Pierre Dragicevic. He’s an HCI researcher in Paris who totally gets the need for the new statistics. I’ve written about his work before, here and here. Now, with colleague Lonni Besançon, he reports a study of how HCI researchers have reported statistical inference over the period 2010 – 2018. It’s a discouraging picture, but with glimmers of hope.
The study is:
Lonni Besançon, Pierre Dragicevic. The Continued Prevalence of Dichotomous Inferences at CHI. 2019. It’s here.
Lonni and Pierre took the 4234 articles in the CHI conference proceedings from 2010 to 2018–these proceedings are one of the top outlets for HCI research. They used clever software to scan the text of all the articles, searching for words and symbols indicating how statistical inference was carried out and reported.
I recall the many journal studies that I, with students, carried out, 10 to 20 years ago: We did it all ‘manually’: we scanned articles and filled in complicated coding sheets as we found signs of our target statistical or reporting practices. Analysing a couple of hundred articles was a huge task, as we trained coders, double-coded, and checked for coding consistency. Computer analysis of text has its limitations, but a sample size of 4,000+ articles is impressive!
Here’s a pic summarising how NHST appeared:
About 50% of papers reported p values and/or included language suggesting interpretation in terms of statistical significance. This percentage was pretty much constant over 2010 to 2018, with only some small tendency for increased reporting of exact, rather than relative, p values over time. Sadly, dichotomous decision making seems just as prevalent in HCI research now as a decade ago. 🙁
If you are wondering why only 50% of papers, note that in HCI many papers are user studies, with 1 or very few users providing data. Qualitative methods and descriptive statistics are common. The 50% is probably pretty much all papers that reported statistical inference.
What about CIs? Here’s the picture:
Comparatively few papers reported CIs, and, of those, a big majority also reported reported p values and/or used significance language. Only about 1% of papers (40 of 4234) reported CIs and not p or a mention of statistical significance. The encouraging finding, however, was that the proportion of papers reporting CIs increased from around 6% in 2010 to 15% in 2018. Yay! But still a long way to go.
For years, Pierre has been advocating change in statistical practices in HCI research–he can probably take credit for some of the improvement in CI reporting. But, somehow, deeply entrenched tradition and the seductive lure of saying ‘statistically significant’ and concluding (alas!) ‘big, important, publishable, next grant, maybe Nobel Prize!’ persists. In HCI as in many other disciplines.
Roll on, new statistics and Open Science!