Adventures in Replication – Reviewers don’t want to believe disappointing replication results
Trying to publish replication results is difficult. Even when the original evidence is very weak or uncertain, reviewers tend to look for reasons to explain away a smaller effect in the replication. If nothing comes to mind, reviewers may even make something up. Here’s an actual review I received:
Depending upon where your participants are from, it is quite possible that cultural differences could explain some of the differences between your results and those by B&E. If indeed they hail from the Dominican Republic, as you state on page 4, then a viable hypothesis might be that power primes have less of an effect (on performance, or perhaps in general) on individuals with more interdependent self-construals than on individuals with more independent self-construals.
What the manuscript actually stated was that the students were from Dominican University, my home university in River Forest, Illinois. That’s not in the Dominican Republic. The reviewer’s major concern about the replication rested on the assumption that we were in a different country, an assumption the reviewer did not bother to check.
When I pointed out the error to the editor, they assured me that the reviewer’s misunderstanding did not substantively influence the decision to reject the manuscript.
Of course, reviewers can make mistakes. But in the reviews I’ve collected, this type of thing seems surprisingly common, and the mistakes all seem to lean towards the reviewer finding reasons to discount the research. Coincidence, or an example of motivated reasoning?
Here are a couple of other examples:
I would like to see confidence intervals reported for all statistics (including manipulation checks), not only sometimes. Also, the authors do not report effect sizes.
I think replication is extremely important to our field but I also think it's important that we use our limited resources (and journal pages) for replication studies of greater importance than this.
The two weakest studies use a novel DV that has not been used in the literature (at least the authors provide no citations for this measure).