Between those, I think I like “plausibility picture” more. I think it nicely conveys that these should be a heuristic eyeball tool.

Geoff

This needs to be fleshed out.

I have yet to meet a teacher, coach or instructor of any kind who would not agree that their expectations affect many students’ (or athletes’, etc.) performance. Likewise managers in many companies. If expectations do not affect performance, then this is something which should radically change teaching, coaching, instructing and management.

And if they do affect performance (even if not on IQ tests), then we need to know how.

I had updated the post, and used the “strikethrough” font to cross out the 4 and replace it with 8. I thought this would be good to show the original error, your comment noting it, and the correction. But it looks like the strikethrough wasn’t especially noticeable. I’ve updated again to try to make it more visible.

Sorry to bother you again, but now the note says both 4 and 8 ;) Feel free to delete this comment after you correct it!

Thanks!

2) No need for a new term to replace “null hypothesis” for non-null hypotheses: Just use Neyman’s term “tested hypothesis” or its abbreviation, “test hypothesis”.

1) No need for all that tortured, nonintuitive, normal/SD-dependent tradition to measure distance from the test hypothesis: Just measure the information against the test hypothesis supplied by its P-value p by converting it to the Shannon information s = -log(p), which has over 60 years of history as “surprisal”, “logworth” and other names, including S-value. Unlike the P-value, the S-value is additive across independent tests (as Fisher exploited), equal-interval scaled, and unbounded above, so it is hard to confuse with a posterior probability; and with base-2 logs it translates immediately into a coin-tossing experiment, e.g., p = 0.03 gives s = -log2(0.03) ≈ 5 bits of information against the hypothesis, which is the same amount of information as 5 heads in a row supplies against fairness of a coin-tossing set-up. The 1-sided 5-sigma physics criterion becomes about 22 bits, or 22 heads in a row. And so on.
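For anyone who wants to check those numbers, here is a minimal Python sketch of the conversion. The function names `s_value` and `one_sided_p_from_sigma` are made up for this illustration (they are not from any package), and the 5-sigma P-value assumes a standard normal test statistic.

```python
import math


def s_value(p: float) -> float:
    """Shannon information (in bits) against the test hypothesis: s = -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


def one_sided_p_from_sigma(z: float) -> float:
    """Upper-tail P-value for a z-score, assuming a standard normal statistic."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# p = 0.03 carries about 5 bits against the test hypothesis.
print(round(s_value(0.03), 2))

# The 1-sided 5-sigma criterion carries about 22 bits.
print(round(s_value(one_sided_p_from_sigma(5.0)), 1))

# Five heads in a row from a fair coin has probability 0.5**5, i.e. 5 bits.
print(s_value(0.5 ** 5))

# Additivity: multiplying P-values from independent tests adds their S-values.
p1, p2 = 0.03, 0.20
print(math.isclose(s_value(p1 * p2), s_value(p1) + s_value(p2)))
```

Rounding the first two results to whole bits gives the 5 and 22 quoted above, and the last line illustrates the additivity property on the log scale.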

Yes, what I am saying is that The New Statistics is already old and in need of an update – you should read my 2019 TAS-supplement paper and update your book accordingly:

Greenland, S. (2019). Some misleading criticisms of P-values and their resolution with S-values. The American Statistician, 73 suppl 1, 106-114, open access at

http://www.tandfonline.com/doi/pdf/10.1080/00031305.2018.1529625