
Peer Review: Crude and Understudied, but Indispensable.

Kassirer, Jerome P. MD; Campion, Edward W. MD


Author Information
From the offices of the editor-in-chief (Dr Kassirer) and deputy editor (Dr Campion), New
England Journal of Medicine, Boston, Mass.
Presented at the Second International Congress on Peer Review in Biomedical Publication,
Chicago, Ill, September 9, 1993.
Reprint requests to New England Journal of Medicine, 10 Shattuck St, Boston, MA 02115
(Dr Kassirer).

Peer review is not perfect, and when it is done sloppily, journals publish research that is
flawed. Even when peer review is rigorous, flawed research sometimes gets into the
literature. Journals have long relied on peer review, yet concerns about its limitations have
often been expressed [1,2,3,4]. Critics point out that some reviewers are unqualified and
others, because of personal or professional rivalry, are biased. Editors may even select
reviewers on the basis of the reviewers' biases. Furthermore, two or more reviewers may
have widely discrepant opinions about a study. Critics also make the point that peer review
not only fails to prevent the publication of flawed research but also permits the publication
of research that is fraudulent. Some have described peer review as arbitrary, subjective, and
secretive. In addition, many critics (including some in the popular press) maintain that it is
simply unnecessary and slows the communication of information to the public.
Before we can set about discovering how to make peer review better, we need to clarify its
definition, making a distinction between the overall process by which editors manage
manuscripts--let us call this manuscript management--and the cognitive part of this process,
which we may call manuscript assessment. Studies of peer review and debate about it have
focused on everything but its most important aspect--the cognitive task of the reviewer
assessing a manuscript. Of the articles published from the first peer review congress, all but
one addressed manuscript management, not manuscript assessment [5]. The articles
examined the roles and responsibilities of authors and editors, the management of scientific
misconduct, the accuracy of published material, the history and philosophy of peer review,
and technical aspects of the peer review process.
We know surprisingly little about the cognitive aspects of the process--what a reviewer (or
an editor) does when he or she assesses a study submitted for publication. Consequently, we
have few ideas about how to improve the process, teach it, and defend it. In medical school,
house-staff training, and the courses given to research trainees, we teach statistics,
epidemiology, study design and interpretation, and, in some instances, critical appraisal of
the literature. But these courses are not designed to prepare physicians for the job of
consultant to the editor, which is the basic task of the manuscript reviewer. When it comes
to learning how to review a manuscript, we seem to fall back on an approach traditional in
clinical medicine: see one, do one, teach one.
To begin our investigation, we need, as in all scientific inquiry, a testable hypothesis. In
fact, it should be possible to study the cognitive content of peer review. Since the pioneering studies on human problem solving by Newell and Simon in the 1970s [6], methods have
evolved to identify aspects of the representation of knowledge and the strategies people use
to solve a variety of problems. Researchers have used these techniques to study the elements
of the diagnostic process, causal reasoning, and the complex trade-offs that physicians
confront when dealing with therapeutic uncertainty [7,8,9]. One technique involves
analyzing transcripts of tape recordings of people thinking aloud while solving a problem.
This approach can reveal the structure of a person's problem-solving processes. Why not
simply ask people what they are doing? Because the answers people give depend to some
extent on their own preconceived, private theories of how their minds work. These theories
may not represent accurately what they actually do. Studies such as those that employ
transcripts of people thinking aloud can yield only preliminary hypotheses about the
cognitive process, but if sufficient competing hypotheses are generated, they can then be
tested experimentally. To test them, one must examine a sufficiently large number of
examples of each type of reasoning to ensure reliability. Unfortunately, we are far from
doing that with manuscript reviewing.
At present, we can only speculate about the cognitive basis of manuscript review. The
speculations that follow are based on a literature review, a taped interview with a
distinguished scientist, our experience at the New England Journal of Medicine, and a
review of data from our files on rejected manuscripts. We offer here a tripartite hypothesis
about the cognitive tasks involved in manuscript assessment. The first element of the
hypothesis is that manuscript assessment is a special case of problem solving, and that the
fundamental task of a manuscript reviewer (and editor) is to detect and describe flaws.
Tables 1 through 4 list some common flaws and other reasons for rejection; this is only a
partial list, and there is overlap between categories. The major categories are flaws related
to design (Table 1), presentation (Table 2), and interpretation (Table 3), and questions about
the overall importance of the research (Table 4).

In these lists, some terms such as "inadequate," "unconvincing," "unsupported,"
"inappropriate," and "invalid" come up frequently. Bias is a recurring source of concern. In
a review of biases, dozens have been identified at various stages of a study [10]--in
specifying and selecting the study sample, executing the experimental maneuver, measuring
outcomes, analyzing the data, and interpreting the data. Unfortunately, there is no consensus
on how to evaluate or assess the relative importance of these many kinds of bias.
The second part of our hypothesis argues that there is a kind of rejection threshold involved
in the assessment of manuscripts--a point at which the cumulative weight of a manuscript's
faults tips the scales toward rejection. To take an extreme example, when a reviewer judges
a study's methods to be grossly invalid, the threshold is reached, and the reviewer
recommends rejection regardless of the other attributes of the study. But given all the
potential faults of a study and the differential importance of each, defining the rejection
threshold for a given manuscript would be complex and difficult. Yet we have never tried
to define the relative gravity of the various faults detected by peer review, and no one has
come to grips with how they should be weighed in the evaluation of manuscripts.
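By way of a minimal sketch of this hypothesized threshold model, one could imagine a reviewer's recommendation as a weighted tally of detected faults, with a fatal flaw such as grossly invalid methods forcing rejection outright. The flaw categories, weights, and cutoff below are invented purely for illustration and are not values proposed here:

    # A minimal illustrative sketch of the hypothesized rejection-threshold model.
    # All flaw categories, weights, and the cutoff are hypothetical, not proposed values.

    FLAW_WEIGHTS = {
        "grossly_invalid_methods": 10.0,
        "biased_sampling": 4.0,
        "unsupported_conclusions": 3.0,
        "poor_presentation": 1.0,
        "marginal_importance": 2.0,
    }
    FATAL_FLAWS = {"grossly_invalid_methods"}  # any one of these forces rejection
    REJECTION_THRESHOLD = 5.0                  # hypothetical cutoff

    def recommend(detected_flaws):
        """Recommend rejection if a fatal flaw is present or the cumulative
        weight of the detected flaws reaches the threshold."""
        if any(flaw in FATAL_FLAWS for flaw in detected_flaws):
            return "reject"
        score = sum(FLAW_WEIGHTS.get(flaw, 0.0) for flaw in detected_flaws)
        return "reject" if score >= REJECTION_THRESHOLD else "consider further"

    # Several lesser faults can tip the scale even when no single one is fatal.
    print(recommend(["poor_presentation", "biased_sampling", "marginal_importance"]))  # reject (weight 7.0)

The sketch only underscores the difficulty noted above: before any such threshold could be meaningful, the relative weights of the various faults would have to be defined and agreed upon.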
The final part of the hypothesis suggests an analogy between a reviewer's recommendation
and a diagnostic test. It argues that manuscript assessment, like even the most sophisticated diagnostic tests, has a certain sensitivity and specificity. If this is true, then the assessment
of manuscripts must yield a certain complement of false-positive and false-negative results.
As in any human endeavor, it seems likely that false-positive and false-negative results must
occur even in the hands of the most objective reviewers. According to this part of the
hypothesis, erroneous recommendations are an inevitable and unavoidable aspect of the
review process. They are the natural consequences of dealing with uncertainty and
employing an assessment strategy, and are not necessarily biased or arbitrary decisions
[11,12]. For example, disagreement among reviewers is common [3,13] and is probably
primarily a reflection of the complexity of the process of manuscript assessment rather than
being evidence that the peer review process is arbitrary or capricious. Although in some
circumstances respectable agreement among reviewers has been achieved [14], consistency
may give less information rather than more. Reviewers with different experiences, different
areas of expertise, and different views of the body of knowledge may produce quite different
assessments that are valuable to author and editor even though their recommendations are
divergent.
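To make the diagnostic-test analogy concrete, sensitivity and specificity could in principle be computed from a two-by-two table comparing reviewers' recommendations with some reference standard of manuscript quality. The counts below are invented for illustration and are not data from any study:

    # Hypothetical two-by-two table of reviewer recommendations versus a
    # reference standard of manuscript quality. All counts are invented.
    true_positives  = 70   # sound manuscripts correctly recommended for acceptance
    false_negatives = 30   # sound manuscripts recommended for rejection
    true_negatives  = 80   # flawed manuscripts correctly recommended for rejection
    false_positives = 20   # flawed manuscripts recommended for acceptance

    sensitivity = true_positives / (true_positives + false_negatives)   # 70/100 = 0.70
    specificity = true_negatives / (true_negatives + false_positives)   # 80/100 = 0.80
    print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")

In this framing, false-positive recommendations (flawed papers accepted) and false-negative recommendations (sound papers rejected) appear as expected properties of the test rather than as evidence of bias or caprice.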
There are several implications of the three-part hypothesis. First, further study of the task of
manuscript assessment may provide us with a more advanced theory of the cognitive basis
of manuscript review and a better appreciation of factors that influence reviewers'
recommendations. Second, if we have a framework that explains manuscript assessment
better, we might be able to teach it better than we do with the haphazard apprenticeship
approach now in widespread use. Third, better definition of the process should help allay
the fears of critics who believe that there are no rules governing peer review and that the
entire process lacks objectivity. Fourth, we should be able to design studies to learn more
about both manuscript assessment and the overall process of peer review.
Before we begin to apply the methods of cognitive science to the study of manuscript
assessment, we must ask ourselves a fundamental question. Even if we can study the
phenomenon, is it worth the effort? We know that peer review is not perfect. It does not
eliminate bias, on the part of either the reviewer or the editor. It does not weed out fraudulent
research or even all flawed research. It cannot guarantee the truthfulness or the validity of
the work. Although much has been written about the defects of peer review, its merits when
directed and used by a thoughtful editorial staff are substantial. As Bailar and Patterson [13]
put it some years ago, peer review at its best can screen out investigations that are poorly
conceived, poorly designed, poorly executed, trivial, marginal, or uninterpretable; it
improves the quality of individual manuscripts, steers research results to appropriate
journals, and helps people who are not experts to decide what to believe. The peer review
system is not totally unscientific, arbitrary, or subjective, as some have proposed.
These final observations are intended not to discourage research into peer review, but rather
to urge that it be done right. We cannot have one standard for scientific reports and another
for studies of peer review. An adequate study must specify precisely what part of the peer
review process is being studied and must meet the same demanding standards that we apply
to our best scientific studies. Studies of peer review should be published only if they can
pass a rigorous peer review process themselves. We may just have to admit that the process
we use to assess sophisticated scientific research is crude. Although our understanding of
peer review also remains crude, this fallible, poorly understood process has been
indispensable for the progress of biomedical science.

REFERENCES
1. Relman AS. Peer review in scientific journals--what good is it? West J Med. 1990;153:520-522.
2. Relman AS, Angell M. How good is peer review? N Engl J Med. 1989;321:827-829.
3. Ingelfinger FJ. Peer review in biomedical publication. Am J Med. 1974;56:686-692.
4. Altman LK. The myth of 'passing peer review.' In: Bailar JC III, Angell M, Boots S, et al, eds. Ethics and Policy in Scientific Publication. Bethesda, Md: Council of Biology Editors; 1990.
5. Guarding the guardians: research on editorial peer review: selected proceedings from the First International Congress on Peer Review in Biomedical Publication. JAMA. 1990;263:1317-1441.
6. Newell A, Simon HA. Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall; 1972.
7. Kuipers B, Kassirer JP. Causal reasoning in medicine: analysis of a protocol. Cogn Sci. 1984;8:363-385.
8. Kassirer JP, Gorry GA. Clinical problem solving: a behavioral analysis. Ann Intern Med. 1978;89:245-255.
9. Elstein AS, Shulman LS, Sprafka SA. Medical Problem Solving: An Analysis of Clinical Reasoning. Cambridge, Mass: Harvard University Press; 1978.
10. Sackett DL. Bias in analytic research. J Chronic Dis. 1979;32:51-63.
11. Kuipers BJ, Moskowitz A, Kassirer JP. Critical decisions under uncertainty: representation and structure. Cogn Sci. 1988;12:177-210.
12. Moskowitz AJ, Kuipers BJ, Kassirer JP. Dealing with uncertainty, risks and tradeoffs in clinical decisions: a cognitive science approach. Ann Intern Med. 1988;108:435-449.
13. Bailar JC, Patterson K. Journal peer review: the need for a research agenda. N Engl J Med. 1985;312:654-657.
14. Oxman AD, Guyatt GH, Singer J, et al. Agreement among reviewers of review articles. J Clin Epidemiol. 1991;44:91-98.
Key words: Data Interpretation, Statistical; Manuscripts, Medical; Peer Review, Research; Publishing; Quality Control; Research Design
