Peer Review and Rationalization

I tend to do a lot of peer reviewing, but I’m certain that I don’t do it well. Of all of the consequential stuff that I have to write as a professional academic, I get the least feedback on my reviewer reports, and there isn’t much incentive to reflect on my deficiencies as a reviewer. I know that I have many such deficiencies, but I don’t know exactly what they are. If you were to point to any bit of advice and say that it’s poor, I would deny it and have a justification for it ready at hand. But I also recognize that I’m no better than (and probably worse than) the average reviewer, and I know that the average reviewer has significant vices. So while I can’t point to any direct evidence of my vices, I know by other routes that I have them, and that they are significant.

I am in a similar position with respect to identifying the virtues and vices of my own reviewers, about which my judgments are also very unreliable. One difference is that I often accept a few points made by the more critical reviewers, which assures me that I’m not just being overly defensive or thin-skinned. But on the whole, these three or four criticisms (usually three) tend not to strike at the heart of my project and require only superficial changes, a coincidence that makes my judgment suspect.

For these reasons, my only real opportunities to identify specific instances of reviewer virtue and vice are those (rare) cases when the editor shares the reports of my fellow reviewers. In those cases, I don’t have the same investment in the paper as I do in my own, and the other reports usually give me too little information to have any strong predilections for or against the other reviewers. If some of the more pessimistic literature on cognitive biases is correct, situations like these may be our best hope for assessing the habits and practices of peer reviewers. With this in mind, I’d like to make some very rough generalizations about a widely shared vice of peer review, based solely on those “report swapping” experiences.

My hypothesis is this: much as we might think that our own papers were mistreated by reviewers, in fact, most reviewers (70-80%?) seem to make fair judgments about the overall quality of a book or article, or at least judgments that reach a realistic threshold of fairness (I can elaborate in comments if you like). However, only a minority of the fair reviews (maybe 30%?) are well justified. Most reviews, of course, result in rejections or R&R’s (R&R = “revise and resubmit”). It seems like these judgments are usually fair: philosophical work should be sophisticated, polished and at least somewhat enlightening to warrant publication, and that’s a high bar. But many reviewers seem to be bad (and some spectacularly bad) at explaining why the paper is inadequate. Instead of saying that the paper isn’t sufficiently novel or rigorous, they list problems that might or might not need fixing, but the fixing of which won’t do much to make the paper publishable. So, for example, they point out that the author misreads some philosopher or another, when the point on which the author misreads her isn’t particularly central to the argument. They note that the primary subject of the paper isn’t what the title or introduction led them to expect. They offer sweeping, methodological objections to the very idea of undertaking the author’s project, even though they would have welcomed a more stimulating or insightful project that uses the same methodology. They point out that the author’s argument depends on some highly contested assumption that the author didn’t defend, forgetting that sometimes philosophy needs stipulation to make progress (or else we’d still be stuck on dream arguments and brains in vats).

So in short, more often than not, I agree with the decision of my fellow reviewers to publish or not to publish, but I’m rarely persuaded by the reasons they give, and sometimes find them tragically misleading. I have more confidence in their final judgments than their reasons. This is very similar to my experience grading essays with TA’s, which we sometimes do in order to calibrate our respective grade-o-meters. For the most part, my TA’s and I tend to rank papers similarly. Maybe my favorite is third or fourth on their lists, but it has never happened that I ranked last a paper that any one of them loved. But my TA’s and I usually come up with very different ways of describing what’s wrong with the paper, some of which are incompatible with one another. Considering us as a group, I trust our intuitions about the merits of a paper more than I do our justifications. The same is probably true of all of the regular instructors reading this blog–if given ten student papers, we’d rank them similarly, but when forced to explain what’s problematic about the bad ones, we’d be all over the map.

I have a theory about how this happens, but before getting too attached to my theory, I thought I would compare my experience with the other readers of the blog, just in case I am too quick to conclude that this phenomenon is so prevalent.

9 replies on “Peer Review and Rationalization”

  1. Really interesting, and I look forward to hearing your theory — and what you think we can do about it! As it turns out, the instances I can remember in which an editor shared with me other referees’ reports were all at journals that don’t specialize in Chinese philosophy, which contributes to another sort of difference of justification: often the referees have been consciously chosen with expertise in different things. So I was the person especially tasked to attend to the Chinese philosophy side of the argument, while someone else focused on the things that she or he knew well. Not that there was any explicit division of labor: each referee could comment on whatever he or she saw fit. But in practice we tended to focus on different things. In such a case, there might be more room for coming to different final assessments, though not in the particular instances with which I was involved.

    • Hi Steve,

      Good point. In all cases but one, I was the sole specialist in Chinese philosophy as well. There are multiple, sufficient justifications for rejecting a paper, so it wouldn’t be surprising if other referees availed themselves of the other justifications. In a paper devoted to comparing Kongzi with Aristotle on X, a very poor grasp of Kongzi, Aristotle or X might well be sufficient grounds for rejection.

      In my experience, though, most (but not all) of the other referees didn’t call attention to problems that we’d normally consider fundamental or insurmountable. For example, one pattern on the political theory side is for referees to suggest that the problem is primarily methodological, so that the author needs some sort of justification for her methodology or, even worse, needs to defend the very possibility of comparative political thought. In all cases, the papers didn’t stand outside the usual range of viable methodological approaches, and in any event I highly doubt that adding a defense of the methodology would save them. The real problem, usually, was that the author just didn’t have a path-breaking or philosophically stimulating argument or thesis, and I suspect that’s also what the other referee detected, as reflected in his/her scores for originality and the like.

      My theory is that a lot of papers tend to merit rejections or R&R’s because they just don’t stand to offer a great deal to the field, even if other problems that we characteristically focus on (bad arguments, etc.) were to be fixed. But it’s difficult to say why a paper offers too little of this, so we fall back on our usual habits as philosophical critics. It might help to think about how we characteristically respond to student papers that are philosophically uninteresting. Rather than explain why or in what respects they are uninteresting, we seek out more readily identifiable problems, which of course we will invariably find. So when a student goes after Descartes’ view that the pineal gland is the “principal seat of the soul,” I end up talking about how he misuses the term “dualism,” or I find a controversial assumption and suggest that he should have defended it, neither of which really addresses the problem. Something similar is at work in the other referee reports, I think.

  2. Thanks a lot for this very interesting post. I can only answer on the basis of my experience, i.e.: when I have to reject a paper I always try to “teach” something to the author instead of just telling her/him that she or he failed to understand what a scholarly article should look like. Thus, I focus on concrete instances where the article lacks background, or bibliography, or something has been translated wrongly. In fact, what I am thinking in those cases is often much more something like: “What you should change is so much, that there is no way you will ever be able to write something publishable”. But such a judgement would be depressing and useless, so I rather focus on what can be concretely ameliorated.

    • That’s an interesting way of putting it, Elisa. I agree that the full truth would be depressing, and maybe that’s reason enough to think that it’s useless, because we almost always convince ourselves to tune out the really devastating critiques. I guess I think the author should at least be alerted to the fact that the small fixes I’ve highlighted aren’t going to save the paper.

      And more generally, it seems like it would be good if we were more aware of the fact that referees’ judgments about the overall merits of the paper are better than their justifications for those judgments. As it is, there’s a certain predictable grief cycle that authors of rejected papers go through, one stage of which is to focus obsessively on some mistaken criticisms that the reviewer dashed off, as though the reviewer’s overall judgment actually hinged on those criticisms. It probably didn’t. So, for example, I recently got a just-barely-R&R (borderline rejection) decision on a paper in which the critic suggested different translations of a couple terms and passages. The suggested translations weren’t plausible, so of course I took this as evidence that the reviewer was clueless (or whatever). But the truth is, the paper needed dramatic revisions, and I’m glad I made them.

    • Yes, it makes sense. As authors, we should consider what you say and avoid focusing on small mistakes of the reviewer. As reviewers, we could try to avoid using the easiest “excuse” to prove that an article needs to be rejected and try to point to more fundamental issues. Being aware of the problem is certainly part of the solution.

  3. I should have added my conclusion: One understands that the paper should not be published and, due to one’s own desire to explain why, one gives motivations, which could however differ widely according to what one’s personal interests are. But this happens because one only picks up a few points in an article which is overall bad. Vice versa, if a project/article is by and large good, one will exemplify one’s judgement through the elements one is more likely to focus upon.

  4. Justin, the title of your post suggests this hypothesis:

    In table tennis one commonly has a reason for a move without having articulated that reason in words, in one’s head or elsewhere. Articulating the reasons would sometimes be hard. In moral philosophy we test our theories against intuitions on the assumption that we apprehend moral truth prior to articulating our reasons. Having good reasons is much easier than articulating them even to ourselves.

    I suppose the same is true of most of our well-founded beliefs.

    So maybe it’s the same in reading a philosophy paper. We can tell how good it is more easily than we can explain why, even to ourselves. So we may misidentify why. And there’s a limit to how much reviewers should care about correctly identifying why – and even narrower limits to how much reviewers are likely to care about correctly identifying why.

    • Exactly! If you construe “reasons” broadly to include things like heuristics, then I agree. As reason-givers, I think most philosopher-reviewers are somewhere in the vicinity of the typical yoga instructor (with respect to their specific domains of expertise). In principle, the philosophers could be a lot better, but in their haste and orneriness they tend not to be.

    • I always construe “reasons” broadly! 🙂

      Thanks; you could very fairly have replied, “Bill, as for the hypothesis you propose – it just makes a familiar broad point that I was taking as understood. My hypothesis is more specific (see my reply to Steve under #1 above).”

      And maybe the specific hypothesis is the right one, at least for most cases. But maybe sometimes a paper makes an argument that would be important if it were a good argument, and the reviewer fails to put her finger on exactly why it doesn’t seem very good to her.
