I tend to do a lot of peer reviewing, but I’m certain that I don’t do it well. Of all of the consequential stuff that I have to write as a professional academic, I get the least feedback on my reviewer reports, and there isn’t much incentive to reflect on my deficiencies as a reviewer. I know that I have many such deficiencies, but I don’t know exactly what they are. If you were to point to any bit of advice and say that it’s poor, I would deny it and have a justification for it ready at hand. But I also recognize that I’m no better than (and probably worse than) the average reviewer, and I know that the average reviewer has significant vices. So while I can’t point to any direct evidence of my vices, I know by other routes that I have them, and that they are significant.
I am in a similar position with respect to identifying the virtues and vices of my own reviewers, about which my judgments are also very unreliable. One difference is that I often accept a few points made by the more critical reviewers, which assures me that I’m not just being overly defensive or thin-skinned. But on the whole, these three or four criticisms (usually three) tend not to strike at the heart of my project and require only superficial changes, a coincidence that makes my judgment suspect.
For these reasons, my only real opportunities to identify specific instances of reviewer virtue and vice are those (rare) cases when the editor shares the reports of my fellow reviewers. In those cases, I don’t have the same investment in the paper as I do in my own, and the other reports usually give me too little information to have any strong predilections for or against the other reviewers. If some of the more pessimistic literature on cognitive biases is correct, situations like these may be our best hope for assessing the habits and practices of peer reviewers. With this in mind, I’d like to make some very rough generalizations about a widely shared vice of peer review, based solely on those “report swapping” experiences.
My hypothesis is this: much as we might think that our own papers were mistreated by reviewers, in fact, most reviewers (70-80%?) seem to make fair judgments about the overall quality of a book or article, or at least judgments that reach a realistic threshold of fairness (I can elaborate in comments if you like). However, only a minority of the fair reviews (maybe 30%?) are well justified. Most reviews, of course, result in rejections or R&R’s (R&R = “revise and resubmit”). It seems like these judgments are usually fair: philosophical work should be sophisticated, polished, and at least somewhat enlightening to warrant publication, and that’s a high bar. But many reviewers seem to be bad (and some spectacularly bad) at explaining why the paper is inadequate. Instead of saying that the paper isn’t sufficiently novel or rigorous, they list problems that might or might not need fixing, but the fixing of which won’t do much to make the paper publishable. So, for example, they point out that the author misreads some philosopher or another, when the point on which the author misreads her isn’t particularly central to the argument. They note that the primary subject of the paper isn’t what the title or introduction led them to expect. They offer sweeping, methodological objections to the very idea of undertaking the author’s project, even though they would have welcomed a more stimulating or insightful project that uses the same methodology. They point out that the author’s argument depends on some highly contested assumption that the author didn’t defend, forgetting that sometimes philosophy needs stipulation to make progress (or else we’d still be stuck on dream arguments and brains in vats).
So in short, more often than not, I agree with the decision of my fellow reviewers to publish or not to publish, but I’m rarely persuaded by the reasons they give, and I sometimes find them tragically misleading. I have more confidence in their final judgments than in their reasons. This is very similar to my experience grading essays with TAs, which we sometimes do in order to calibrate our respective grade-o-meters. For the most part, my TAs and I tend to rank papers similarly. Maybe my favorite is third or fourth on their lists, but it’s never happened that I rank last a paper any one of them loved. Yet my TAs and I usually come up with very different ways of describing what’s wrong with a paper, some of which are incompatible with one another. Considering us as a group, I trust our intuitions about the merits of a paper more than I do our justifications. The same is probably true of all of the regular instructors reading this blog: if given ten student papers, we’d rank them similarly, but when forced to explain what’s problematic about the bad ones, we’d be all over the map.
I have a theory about how this happens, but before getting too attached to it, I thought I would compare my experience with that of the other readers of this blog, just in case I’m too quick to conclude that the phenomenon is as prevalent as it seems.