
Lang. Teach. (2015), 48.4, 531–544. © Cambridge University Press 2015
doi:10.1017/S0261444815000257

Replication Studies

Written corrective feedback in L2 writing: Connors & Lunsford (1988); Lunsford & Lunsford (2008); Lalande (1982)

Dana Ferris, University Writing Program, University of California, Davis, USA
drferris@ucdavis.edu

Written corrective feedback (CF) has been the most heavily researched topic in second
language (L2) writing over the past 20 years. As a recent research timeline article in this
journal (Ferris 2012; see also Bitchener & Ferris 2012) shows, studies of error correction in
student writing have crossed disciplines (composition and rhetoric, foreign language studies,
applied linguistics) and have utilized a range of research paradigms, including descriptive text
analysis, quasi-experimental designs, and quantitative and qualitative classroom research.
This article highlights two landmark studies on this topic, both from the 1980s, representing
two of these research traditions. It explains why replication of these two studies would further
advance our knowledge about written CF and makes specific suggestions about how the
replications should be completed.

1. Introduction
This paper presents an argument for replication of two key studies on the topic of written
corrective feedback (CF), or written error correction, in the field of second language (L2)
writing studies. The two studies (Connors & Lunsford 1988;1 Lalande 1982), both completed
in the 1980s, were published in major journals (College Composition and Communication and The
Modern Language Journal) and have been frequently cited and are quite influential in their
respective fields of composition studies and applied linguistics. They represent two different
research paradigms—descriptive text analysis and quasi-experimental classroom research—
and reported findings that have been useful in subsequent research and in the design of
teaching materials. They have thus been important and are worthy of replication. In both
cases, the research methods were clearly described, and there are ways in which the designs
could be modified and improved upon for approximate and/or conceptual replications (Porte
2012).
This paper begins with a brief overview of the history of research on the topic of written
CF as a means to situate these two landmark studies in the literature. It explains further why
these studies were chosen. The body of the paper then focuses on a description of each study
followed by specific suggestions for replication.
1 A replication of Connors & Lunsford (Lunsford & Lunsford 2008) is also briefly discussed in this paper.

2. Background

The study and practice of written CF has been a controversial topic in both composition
and L2 studies. Some scholars have argued that providing feedback on errors to student
writers is futile at best and harmful at worst (Williams 1981; Truscott 1996). Others have
countered that L2 writers need (and want) feedback and instruction on their language
miscues in order to remediate persistent error patterns, improve the accuracy of their texts,
and communicate most effectively with their readers (e.g., Shaughnessy 1977; Eskey 1983;
Leki 1991). Pedagogical approaches to error correction in L2 writing have ranged from
intensive correction of all errors, to marking errors very little, if at all, to targeted CF that
focuses on a few patterns of error at a time and provides metalinguistic explanation along
with the corrections so that learners can better understand the rules involved.
Because there have been widely disparate philosophical stances and approaches to the
related questions of IF and HOW to correct language errors in L2 writing, researchers in both
composition studies and L2 studies (including foreign language contexts) have investigated
the phenomenon of written CF in various ways. Some of these studies date back nearly a
century (see Santa 2006, for a detailed overview). Before suggesting targets for replication, it
is useful to understand the different approaches researchers have taken toward studying CF
in order to assess their impact on our knowledge about this topic.

2.1 Text analytic description of student error patterns

In composition studies, the most common research design related to CF has been to gather
a large corpus of student writing and to examine, categorize, and quantify the patterns of
written error found in those texts. This type of study dates back to the early 20th century,
and by 1930, one scholar presented a synthesis of 33 previous studies of this type (Harap
1930). The study by Connors & Lunsford (1988; also Lunsford & Lunsford 2008) to be
discussed later in this paper follows this tradition but with the added element of analyzing
not only the student errors but also the teachers’ error markings. Generally speaking, these
researchers have reported that, contrary to popular belief, students’ error FREQUENCIES have
not increased over time, but the TYPES of errors they make have changed.

2.2 Longitudinal classroom studies of the effects of error correction

In this second group of studies, researchers have examined teachers’ CF approaches and
their effects on student writers/texts over time (a term or a school year). Some of these efforts
have taken the form of qualitative case studies (e.g., Cohen & Robbins 1976; Hyland 2003;
Ferris et al. 2013; Storch & Wigglesworth 2010) in which students commented in think-alouds
and/or retrospective interviews about the CF they had received and what they had learned
from it. In other classroom studies, one or several intact classes were studied to see whether
the accuracy of their writing improved over time (e.g., Haswell 1983; Semke 1984; Robb, Ross
& Shortreed 1986; Kepner 1991; Ashwell 2000; Chandler 2003; Ferris 2006) and to evaluate
whether particular approaches to CF seemed to facilitate those outcomes. The second study
to be proposed for replication, Lalande (1982), falls into this group.

2.3 Controlled experimental studies of specific CF techniques and/or error types

This final group of studies represents the dominant paradigm for written CF research over the
past 15–20 years. Conducted by applied linguistics researchers, these studies have included
pretest—posttest—delayed posttest designs and have always included a control group that
received no CF at all (e.g., Sheen 2007; Ellis et al. 2008; Bitchener & Knoch 2010a, 2010b;
van Beuningen, de Jong & Kuiken 2012). Most (not all) of these studies have focused on only
a few target features at a time (most typically definite and indefinite articles). This recent
body of work has yielded important findings with applications for instruction and directions
for future research. This article does not make a specific replication proposal for a study from
this third group. Because there has been so much activity of this type, it has already become
a self-replicating line of research—either researchers are doing approximate replications of
their own previous work or other researchers are taking up similar projects (e.g., Ellis et al.
2008; Bitchener & Knoch 2010a, 2010b).
In contrast, studies from the first two groups have tended to be scattered, one-off attempts,
yet with questions and variables that could profit from further, focused investigation. Taken
together, the three groups of studies have provided a comprehensive picture of the types of
errors students may make, instructors’ various approaches to CF and their effects, individual
and contextual variables that may influence the success or failure of CF, and the types of
language errors that may be the most responsive to CF. Although we have learned a great
deal from this research base about this ubiquitous pedagogical practice, many more questions
remain. Rather than simply designing completely new research, replication of earlier studies
from two of the three dominant research paradigms will allow scholars to continue exploring
key questions.
The following section describes two landmark studies on written error and CF and suggests
profitable avenues for replication efforts. The studies were chosen for several reasons: (1) They
were published in major composition or applied linguistics journals; (2) They reported upon
the research design in terms descriptive enough to allow replication; (3) They have been
influential and frequently cited in subsequent research and pedagogical literature; and (4)
Their contributions to the discussion of written CF are immediately apparent, but it is also
easy to see how replications would improve upon and extend their designs and findings in
useful and productive ways.

3. The original studies and suggested approaches to replication

3.1 Connors & Lunsford (1988); also Lunsford & Lunsford (2008)

Connors & Lunsford situated their study against the historical backdrop of the many
descriptive quantitative studies of students’ written errors conducted in the early 20th century
and comprehensively summarized in Harap (1930). As the article’s title suggests, they were
interested in ‘the frequency of formal errors in current college writing’ (p. 395). They described
the purpose of the research on page 396:

We became interested in error-frequency research as a result of our historical studies, when we realized
that no major nationwide analysis of actual college essays had been conducted, to our knowledge, since
the late 1930s . . . we determined to collect a large number of college student essays from the 1980s,
analyze them, and determine what the major patterns of formal and mechanical error in current student
writing might be.

In short, as the authors also explained, they wanted a DATA-DRIVEN understanding of the
types of errors college student writers CURRENTLY made (in the U.S. in the 1980s) rather than
simply relying on their own intuitions or assuming that the frequency and nature of student
errors had not changed over time.
To address their goal, Connors & Lunsford solicited contributions of student papers from
college writing instructors across the U.S. They contacted 1,500 teachers by mail to request
their help and received more than 21,500 papers from 300 instructors. From this large corpus,
they created a stratified random sample of 3,000 student papers, already marked for errors by
the teachers, to use for error analysis. Rather than relying on error lists from earlier studies or
from handbooks, Connors & Lunsford developed their own ‘taxonomy’ of the most frequent
errors in their sample (see Figure 1). To do so, they randomly selected 300 papers from their
large sample and ‘set out inductively to note every formal error pattern we could discover in
the two piles of papers . . . we tried to ignore any elements of paper content or organization
except as they were necessary to identify errors’ (p. 399). The resulting taxonomy has become
known as the ‘top-20’ error list.
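For readers less familiar with this sampling step, the following minimal Python sketch illustrates how a stratified random draw and a taxonomy-building subsample might be produced in a replication; the strata, corpus size, and field names are hypothetical illustrations, not Connors & Lunsford's actual procedure.

```python
import random
from collections import defaultdict

# Hypothetical corpus records: each paper is tagged with the stratum it came from.
# The strata (region x institution type) are illustrative only; a replication would
# choose strata appropriate to its own research questions.
corpus = [
    {"paper_id": i, "stratum": random.choice(
        ["west/2yr", "west/4yr", "midwest/2yr", "midwest/4yr",
         "south/2yr", "south/4yr", "northeast/2yr", "northeast/4yr"])}
    for i in range(21500)
]

def stratified_sample(papers, total_n):
    """Draw a sample whose strata are represented proportionally to the corpus."""
    by_stratum = defaultdict(list)
    for p in papers:
        by_stratum[p["stratum"]].append(p)
    sample = []
    for group in by_stratum.values():
        # Proportional allocation; rounding means the total is approximately total_n.
        k = round(total_n * len(group) / len(papers))
        sample.extend(random.sample(group, min(k, len(group))))
    return sample

analysis_set = stratified_sample(corpus, 3000)   # papers to be marked by trained raters
taxonomy_set = random.sample(analysis_set, 300)  # subsample read inductively to build the error taxonomy
print(len(analysis_set), len(taxonomy_set))
```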
Having gathered and prepared their sample and created their analysis scheme, Connors
& Lunsford then gathered a group of 50 raters (faculty and graduate students from a large
U.S. university) to participate in a two-day training and reading session. They explained their
error categories, and the raters practiced and discussed the classification procedures with
sample papers. Then the raters read, marked, and classified errors in the 20 categories for the
3,000 papers. No inter-rater reliability calculations were performed, for reasons the authors
explained on p. 401.
In their analysis of their findings, Connors & Lunsford looked not only at the frequencies
and percentages of the errors found by the raters but also compared those independent
findings to the actual errors marked by the teachers who had contributed their students’
papers. Their analysis yielded the desired quantitative description of the most common
errors made by U.S. college writers at the time (Figure 1, first column), and the authors
further offered some generalizations about the teachers’ marking habits (see discussion, pp.
402–404):

• Teachers’ decisions about what errors to mark were highly variable, even idiosyncratic.
• Teachers did not choose to mark all of the errors present in student papers. On average,
about 43% of the errors in the top 20 categories found by the independent raters were
marked by the classroom teachers.
• Teachers seemed to have two distinct (and sometimes even contradictory) criteria for
which errors to mark: (1) errors that seemed especially serious within the context of the
paper; or (2) errors that seemed quicker or easier to explain and/or to mark.

Beyond the teachers’ approaches to marking, Connors & Lunsford further observed (pp. 405–
406) that student error patterns in their sample were different from those noted in the early
20th-century studies that had inspired their own research. In particular, college writers in
the 1980s tended to make more errors in spelling, particularly with homonyms such as its/it’s
or two/to/too. They speculated that late 20th-century students read less than did students
decades earlier, and that less exposure to ‘the visible aspects of written forms’ (p. 406) may
have led to the increased incidence of such errors. However, they were also careful to note (pp.
406–407) that even though students made different types of errors than they had in the past,
the overall error frequencies had not risen—in other words, current student writers were not
making more written errors or ‘getting worse.’
The second author of the study, Andrea Lunsford, together with Karen Lunsford, published
a replication of the Connors & Lunsford study 20 years later (2008). The purpose of the
replication, again, was largely historical—the authors wanted to document whether both
student error patterns and teachers’ marking strategies had changed in the years since the
original study. The second study’s procedures were virtually identical to those of the original,
but they worked with a much smaller sample (877 student papers rather than 3,000). Lunsford
& Lunsford took some care in their article to explain why the sample was smaller: Between
1988 and 2008, colleges and universities had begun enforcing much more stringent rules for
institutional research, and it was difficult and cumbersome to obtain the necessary permissions
to collect student papers from a range of different institutions.
Rather than using the same ‘top 20’ list from 1988, Lunsford & Lunsford again went
through the step of analyzing a subsample of their student paper corpus to identify the 20
most frequent error types. They did find differences from their original list (as to both types
and relative frequency); these can also be seen in the middle column of Figure 1. The
authors speculated that some of the differences could be explained by tasks and assignment
genres in their recent sample that were distinct from the papers in the 1988 sample. For
instance, many more papers were source- or research-based in 2008 than in 1988, leading to
the inclusion of ‘incomplete or missing documentation’ as the third most frequent error on the
new top-20 list. The authors were somewhat disappointed to discover that teachers’ marking
strategies had not changed much in 20 years. Between 1988 and 2008, teacher preparation,
especially principles for responding to student writing, had evolved, but the authors did not
find evidence of this progress in the teacher feedback in their sample.

3.1.1 Approach to replication

Connors & Lunsford were ahead of their time, especially in composition studies, in pursuing
quantitative descriptions of students’ written errors. When their study was first published in
1988, the affordances of modern corpus linguistics and computational linguistics were barely
known, let alone readily available to the average researcher. Nonetheless, their desire to
provide a large-scale description of both student error and of teachers’ marking practices
is a worthy one, as is their interest in updating the findings at regular intervals, given
changes in student populations and in pedagogical practices. There have been relatively
few attempts in either first language (L1) composition or L2 writing studies to provide data-
based quantitative descriptions of student errors or teachers’ CF. Such findings are important
for teacher preparation, for materials development, and for feedback to the student writers
themselves because they help teacher educators, textbook authors, and classroom instructors
target structures on which to focus.
Thus, replication studies should, generally speaking, follow the procedures used by Connors
& Lunsford (1988) and later by Lunsford & Lunsford (2008): A sizable corpus should be
gathered, a taxonomy of common errors should be generated from a subsample of texts
taken from the corpus, and trained raters should find and classify the errors in the texts.
However, to make such studies even more useful, several modifications should be considered
for approximate replications.
First, the taxonomy of errors to be included in a top-20 list should also include consideration
of common errors made by L2 writers. One recent study (Ferris 2006) of L2 writers at one
university in the U.S. yielded a similar ranked list, but as can be seen in the final column of
Figure 1, this list is quite different from those generated in the two studies already discussed.
In 1988 there were fewer L2 writers in U.S. colleges and universities than there are now, so
it is perhaps not surprising that the sample obtained by Connors & Lunsford did not present
typical ‘English as a second language’ (‘ESL’) errors such as problems with verb tense,
omitted articles before nouns, and so forth. However, the U.S. college/university population
was certainly much more diverse by 2008, when Lunsford and Lunsford published their
replication.
The lack of representation of errors more unique to L2 writers on the 1988 and 2008
top-20 lists may be traceable to the ways in which the sample papers were gathered and how
the taxonomy was created and the raters were subsequently trained. The sampling limitation
could be addressed by actively soliciting student samples from college writing instructors who
work with large L2 populations (or teach composition courses specifically designated for L2).
The rating issues could be resolved by having someone trained and experienced with reading
and responding to L2 writing participate in the process of creating the analysis taxonomy
and training and norming the raters.
Second, background information about the student writers should be obtained along
with the sample of student papers. Teachers could be asked to complete a summary sheet
describing their students—class status, age ranges, L2 backgrounds, gender, and so forth. It is
difficult to assess whether a large sample of student writing is ‘typical’ of U.S. college students
(or students elsewhere) if nothing is known about the student populations that the sample has
been gathered from.
Third, data collection should be triangulated by asking the teachers to fill out questionnaires
or surveys about their written CF philosophies or practices. On these surveys, teachers could
be invited to volunteer for telephone, email, or Skype interviews to investigate their responses
in more depth. A set of interviews with focal teachers, along with the survey responses from a
larger sample of instructors, would reduce the speculative nature of the discussion of how/why
teachers marked errors as they did (or did not). In other words, rather than inferring teachers’
motives and strategies from marks on student papers, researchers should actually ask at least
some of the teachers what they were thinking. (See also Reid 1994 and Goldstein 2001 for
thoughtful discussions of the limitations of text analysis alone as a vehicle for studying teacher
strategies in responding to student writing.)

Figure 1 Student error types in L1 & L2 composition studies (listed in order of frequency)

Connors & Lunsford (1988) (U.S. college students):
1. No comma after introductory element
2. Vague pronoun reference
3. No comma in compound sentence
4. Wrong word
5. No comma in non-restrictive element
6. Wrong/missing inflected endings
7. Wrong or missing preposition
8. Comma splice
9. Possessive apostrophe error
10. Tense shift
11. Unnecessary shift in person
12. Sentence fragment
13. Wrong tense or verb form
14. Subject-verb agreement
15. Lack of comma in series
16. Pronoun agreement error
17. Unnecessary comma with restrictive element
18. Run-on or fused sentence
19. Dangling or misplaced modifier
20. Its/it’s error
(Table 1, p. 403)

Lunsford & Lunsford (2008) (U.S. college students):
1. Wrong word
2. Missing comma after an introductory element
3. Incomplete or missing documentation
4. Vague pronoun reference
5. Spelling error (including homonyms)
6. Mechanical error with a quotation
7. Unnecessary comma
8. Unnecessary or missing capitalization
9. Missing word
10. Faulty sentence structure
11. Missing comma with a non-restrictive element
12. Unnecessary shift in verb tense
13. Missing comma in a compound sentence
14. Unnecessary or missing apostrophe (including its/it’s)
15. Fused (run-on) sentence
16. Comma splice
17. Lack of pronoun-antecedent agreement
18. Poorly integrated quotation
19. Unnecessary or missing hyphen
20. Sentence fragment
(Table 7, p. 795)

Ferris (2006) (ESL university students in California):
1. Sentence structure
2. Word choice
3. Verb tense
4. Noun endings (singular/plural)
5. Verb form
6. Punctuation
7. Articles/determiners
8. Word form
9. Spelling
10. Run-ons
11. Pronouns
12. Subject-verb agreement
13. Fragments
14. Idiom
15. Informal
(Appendix, p. 103; from Chaney 1999, p. 20)

Note: This figure has been taken from Bitchener & Ferris 2012, p. 97.
Finally, inter-rater reliabilities need to be calculated and reported for large-scale text
analysis. The explanation offered by Connors & Lunsford for why they did not do so, while
thoughtful, is not ultimately convincing. If numerous raters are employed to make judgments
and classifications, and conclusions and recommendations based upon those judgments are
offered, it is important that readers be confident that the raters were relatively consistent and
that the quantitative findings reported are robust and reliable. It is instructive to note that
the ‘top-20’ findings from the two studies were (and still are) used in a best-selling series of
writing handbooks authored by Andrea Lunsford. In other words, those findings have had
a practical impact on teacher training and especially on materials used with students. Thus,
it is important for such descriptions to be trustworthy—and inter-rater reliability scores are
one recognizable way of approaching this issue.
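As a concrete illustration of what such a reliability check could involve, the short Python sketch below computes simple percent agreement and Cohen's kappa for two hypothetical raters classifying the same ten errors; the category labels and ratings are invented for the example, and a full replication with many raters would likely report a multi-rater statistic (e.g., Fleiss' kappa) as well.

```python
from collections import Counter

# Hypothetical classifications of the same 10 errors by two trained raters,
# using labels in the style of a top-20 taxonomy (invented data for illustration).
rater_a = ["wrong word", "comma splice", "fragment", "wrong word", "tense shift",
           "fragment", "comma splice", "wrong word", "spelling", "tense shift"]
rater_b = ["wrong word", "comma splice", "fragment", "spelling", "tense shift",
           "fragment", "run-on", "wrong word", "spelling", "tense shift"]

def percent_agreement(a, b):
    """Proportion of items on which the two raters chose the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    observed = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

print(f"Percent agreement: {percent_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(rater_a, rater_b):.2f}")
```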

3.2 Lalande (1982)

As discussed in section 2 of this paper, written CF studies situated in classroom settings
represent only one of the three major approaches to research on this topic. Large-scale
quantitative text analysis (as in the Connors & Lunsford study) yields descriptive information
about the features/errors seen in student writing, sometimes with the added goal of examining
teacher markings on those errors. However, such studies, in order to be useful, must also be
large, meaning that an emphasis on what happens in individual classroom settings is not
practical. The third group of studies discussed above was controlled experimental studies,
and in such designs, researchers design writing tasks and provide carefully focused feedback
to examine students’ acquisition/control of a handful of carefully selected linguistic features.
Neither the writing tasks nor the feedback are provided by the classroom instructors; thus,
researchers can maintain maximum control over the procedures and ensure reliability of the
findings.
Both descriptive text analyses and controlled experiments provide information that is
useful for researchers and for classroom teachers. However, while removing student writing
and feedback from the classroom setting may enhance reliability, it also threatens validity:
How do we know that the carefully controlled writing and feedback processes used in the
experiments will transfer in any meaningful way to real classrooms and actual student writing?
This is an important question, and thus classroom studies, especially longitudinal ones, are a
critical piece of the research base on written CF.
Like Connors & Lunsford’s large-scale text analysis, Lalande’s (1982) classroom study,
published in the The Modern Language Journal, was ahead of its time. The concepts and
techniques used in his study have influenced later discussions, research, and pedagogical
suggestions regarding written CF, and his early work is frequently cited. In his introduction,
Lalande explained that his study built upon several key principles/theories regarding L2
writing:

1. To make progress in written accuracy, learners need correction or feedback.
2. For such feedback to be useful, teachers’ corrections must be systematic and not haphazard
or idiosyncratic.
3. Students should be asked to engage in a process of ‘guided learning and problem-solving’
(p. 140) in which they are asked to analyze and correct errors that have been called to their
attention. Making students aware, in an ongoing manner, of the written errors they make
also facilitates this guided learning process.

Lalande’s study focused on intermediate learners of German enrolled in four sections of the
same course at a large U.S. university. There were 15 students in each section, for a total of
60 subjects, and each section was taught by a different instructor. Lalande took great care to
ensure and to explain that the students in the four sections were at equal levels of proficiency
in German (including writing) and that the four teachers were equally capable of identifying
and systematically marking student errors in their compositions.
Two of the sections were designated as control groups, and they received ‘traditional’
written CF: the teachers provided corrections for all of the errors and then students were
required to rewrite their texts to incorporate the corrections. In the two experimental sections,
students received corrections that followed a specific Error Correction Code that included
12 grammatical and orthographic error categories. Each error was marked and coded with
the exception of lexical errors, which were directly corrected by the teacher. The student, at
the next class session, was asked to interpret the codes and correct the errors. To help solve
problems, students were allowed to consult their textbooks, their peers, or their instructors.
Finally, students in the experimental sections were able to monitor the frequency of their
errors by referring to an ongoing Error Awareness Sheet, a chart that tallied the errors in the
various categories across all writing assignments. These charts were also maintained by the
researcher for the control group students, but they were not shown to those students as the
experiment progressed.
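To make the bookkeeping concrete, here is a minimal Python sketch of how such a running tally could be maintained in a replication; the error codes and counts are invented for illustration, and Lalande's actual Error Awareness Sheet was of course a paper chart rather than a program.

```python
from collections import defaultdict

class ErrorAwarenessSheet:
    """A running tally of coded errors per category across writing assignments."""

    def __init__(self, categories):
        self.categories = categories
        self.tallies = []  # one dict of counts per assignment

    def record_assignment(self, coded_errors):
        """Store the category counts for one marked essay."""
        counts = defaultdict(int)
        for code in coded_errors:
            if code not in self.categories:
                raise ValueError(f"Unknown error code: {code}")
            counts[code] += 1
        self.tallies.append(counts)

    def report(self):
        """Show per-assignment and cumulative counts, as a student (or researcher) would see them."""
        for category in self.categories:
            per_assignment = [t[category] for t in self.tallies]
            print(f"{category:8s} {per_assignment} (total {sum(per_assignment)})")

# Invented example: three essays coded with a few hypothetical categories.
sheet = ErrorAwarenessSheet(["AGR", "TENSE", "WORDORD", "SPELL"])
sheet.record_assignment(["AGR", "AGR", "TENSE", "SPELL"])
sheet.record_assignment(["AGR", "TENSE"])
sheet.record_assignment(["TENSE"])
sheet.report()
```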
In the course of a semester, students wrote five in-class essays. The first and fifth were
retained by the researcher as a pretest and posttest. The second, third, and fourth essays
followed the correction and revision procedures just described. Lalande used the pretest and
posttest texts to measure student progress in reducing errors in the categories that had been
marked. He also obtained end-of-term questionnaire data from the students to assess their
reactions to the error correction procedures they had experienced.
Lalande found strong evidence that the CF procedures used with the experimental group
were more helpful than the traditional correction style employed for the control group.
Differences between the two groups in reducing errors were statistically significant in 11 of
the 12 targeted error categories. Notably, this improvement held as the term progressed,
even though the students attempted more complex structures (which might have led to more
writing errors) as their knowledge of German improved. Further, while students in both
groups felt that their German writing had improved and that receiving corrections had
helped them, only the experimental group saw value in the rewriting activities, and they
found the information in the Error Awareness Sheet to be useful to them.
Despite Lalande’s use of terms like ‘control’ and ‘experimental’ groups, this was a situated,
longitudinal classroom study. The essays were written as part of the normal curriculum and
were based upon German literature that had been read and discussed in class. The teachers
provided feedback to their own students, and the rewriting activities were completed during
class, with students able to ask questions if they needed to. In contrast, in the third group
of controlled studies described above, students typically produce contrived texts based upon
pictures, researchers rather than teachers give the CF, and rewrites are not part of the design.
In Lalande’s study, both instructors and students perceived the data collection activities as a
regular part of their classes; indeed, the students were never told that they were part of an
experiment.
This study is extremely valuable because it highlights several aspects of CF that were
later discussed and researched extensively. First, Lalande was one of the first researchers to
investigate systematic, focused feedback on targeted error categories rather than unfocused
corrections of anything that catches a teacher’s eye. Interestingly, Lalande reported success
using a large number (12) of error categories; later researchers have posited that students
can only cope with a few at a time. Perhaps because Lalande's procedures were repeated
several times over a semester and were accompanied by guided rewriting activities and error
awareness charts, the added practice and information enabled the students to cope with this
larger number of categories.
Second, Lalande’s study compared the effects of DIRECT and INDIRECT feedback, though
he did not use those terms himself (see Hendrickson 1980; Ferris 2011; Bitchener & Ferris
2012; Ferris et al. 2013). The control group in this study received direct feedback, meaning
that the teachers provided the correct forms and students simply transcribed them during
the rewriting activity. The experimental group received indirect feedback, which provides
information about errors but asks students to figure out the corrections for themselves. In
subsequent studies of reactions to CF, students have expressed a preference for indirect
feedback, perhaps sensing that even though direct feedback requires less effort from them,
they will learn more by engaging in the problem-solving process (Leki 1991; Ferris & Roberts
2001; Ferris 2006; Ferris et al. 2013).
Third, Lalande’s design required students to rewrite their texts after corrections, rather
than simply noting them for future reference. Later researchers have argued that revision after
feedback is an important step to ensuring both immediate uptake and long-term retention of
CF (see Ferris 2004, 2010). Finally, Lalande’s inclusion of the error awareness chart to give
students further information about their progress laid the groundwork for future studies of
the effects of error logs on students’ writing development (Roberts 1999; Ferris 2006).
Unfortunately, these important aspects of the written CF process have not been adequately
researched in the 30 years following the publication of Lalande’s study. There have been
several controlled studies of focused versus unfocused feedback (e.g., Sheen 2007; Ellis et al.
2008) and a few attempts to contrast direct and indirect feedback, but nearly all of these have
been in controlled experimental settings rather than natural classroom contexts. Further, there
has been little research on the effects of revision following CF (an exception is Chandler 2003;
see also the discussion in Ferris 2010) or of using error awareness charts to raise students’
consciousness about their recurring error patterns (one recent exception is Ferris et al.
2013). In short, while the innovations and techniques recommended and tested in Lalande’s
study seem useful, they should be further operationalized and tested with other groups of
learners.

3.2.1 Approach to replication

Lalande did many things well in designing this study. He included pretest and posttest
measures, he ensured that the student groups and the instructors were equivalent in their
abilities, and he triangulated the statistical measures by adding the student questionnaire
element. Replications of this study in other contexts (e.g., in secondary classrooms, in ESL or
composition settings, with other L2s besides German, looking at different target structures for
correction) would be valuable to assess whether Lalande’s highly structured approach to error
correction, revision, and error awareness works well outside of U.S. university foreign language
classrooms. Further, replications of this study could benefit from a couple of adjustments to
the design.
First, it would be useful to isolate the different elements of the design for further comparison.
Specifically, there were four contrasts between Lalande’s control and experimental groups:

• The control group received comprehensive corrections while the experimental group
received focused corrections on targeted error categories;
• The control group received direct correction, but the experimental group received
indirect correction with error codes attached;
• While both groups were required to rewrite their essays after receiving CF, the control
group students were merely asked to transcribe their teachers’ corrections, and the
experimental group students had to engage in problem-solving;
• The experimental group students were given information about their error patterns
and progress via the Error Awareness Sheet, but the control group students were not.

While it is clear that the experimental group benefited more from the ‘enlightened’ methods
used with them than did the control group students from the ‘traditional’ methods, it is hard
to tell which of the four elements of the contrasting treatments were the most beneficial or
necessary for the reported positive outcomes.
A replication (or a series of replications) might target the above contrasts more precisely. For
example, one group could receive focused, indirect CF but with no required rewriting, while
another group could receive the same CF but be required to rewrite. Or a group could receive
focused, indirect CF with revision and be given an error awareness chart, while another would
have feedback and revision held constant but not see the chart. Lalande’s design contrasts
one overall approach, comprising several elements, with a more traditional approach. Replications
that isolate the specific elements, alone or in combination, would help us to understand better
which aspects of the CF process are more important—focused CF? Indirect CF? Rewriting?
Error charts? All of these elements in combination? A replication which looks at the same
elements of the error treatment approach that Lalande did but divides the treatment groups
differently would help us to better interpret his findings and speculate as to their reliability
and generalizability to other contexts.
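As a rough illustration of this design space (not a design that Lalande or any subsequent researcher has actually run), the brief Python sketch below enumerates the conditions produced by crossing the four feedback-related factors discussed above; a single replication would normally hold most factors constant and contrast only one or two at a time.

```python
from itertools import product

# Factors a replication might cross; the labels are illustrative, not drawn from Lalande (1982).
factors = {
    "feedback scope": ["focused", "comprehensive"],
    "feedback type":  ["direct", "indirect (coded)"],
    "rewriting":      ["required", "not required"],
    "error chart":    ["shown to student", "withheld"],
}

# A full crossing yields 16 conditions; listing them makes clear how many contrasts
# are folded into a single 'enlightened' vs. 'traditional' comparison.
for i, combo in enumerate(product(*factors.values()), start=1):
    condition = dict(zip(factors.keys(), combo))
    print(f"Condition {i:2d}: {condition}")
```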
The second general replication suggestion is related to the first. If the purpose of the
research is to investigate the effects of the techniques discussed above, it becomes less necessary
to contrast an ‘enlightened’ approach with a ‘traditional’ one. In other words, the ‘controls’
used in Lalande’s study would be irrelevant and would be replaced by the presence or
absence of a particular technique (+/− revision, or +/− charts, for instance). What is
most interesting about Lalande’s design and findings is not that an ‘old-school’ approach
to error correction did not work as well as a modern one. Rather, the individual pieces of
the modern approach have potential benefits and are worth more careful study. However,
since few instructors these days would argue for the traditional approach implemented for the
controls in Lalande’s research, it does not seem beneficial to continue to use such an approach
as a straw man to prove that the newer methods in combination are superior. Further, as
already noted, the correction/revision/awareness variables operationalized by Lalande have
not been adequately investigated in subsequent written CF research.

4. Conclusion

Some readers might believe that, given the proliferation of studies on written CF in recent
years, the last thing the field needs is more of them in the form of the replications I
have proposed. I would argue that, on the contrary, the quantity of research already
completed makes it easier to see where there are gaps and where continued, carefully targeted
investigations would be useful. The Connors & Lunsford (1988; Lunsford & Lunsford 2008)
study should be replicated regularly for the same reasons that motivated the original study in the first
place: students and their writing change over time, and we cannot assume that conclusions
drawn about student errors in the past will hold true in the present and future. The Lalande
(1982) study should be pursued further for an entirely different reason. He did a nice job
of designing a compelling classroom study that highlighted a number of issues that could
profitably be investigated further in a range of contexts but to date have been rather neglected
by subsequent researchers. Longitudinal classroom studies are more difficult to design and
complete than are text analyses or short-term controlled experiments, but they are critically
important if classroom instructors are to take researchers’ findings seriously. In sum, both
types of research are necessary and valuable to make our knowledge current and applicable
to real-world contexts.

References

Ashwell, T. (2000). Patterns of teacher response to student writing in a multiple-draft composition
classroom: Is content feedback followed by form feedback the best method? Journal of Second Language
Writing 9.3, 227–258.
Bitchener, J. & D. R. Ferris (2012). Written corrective feedback in second language acquisition and writing. New
York: Routledge.
Bitchener, J. & U. Knoch (2010a). The contribution of written corrective feedback to language
development: A ten-month investigation. Applied Linguistics 31.2, 193–214.
Bitchener, J. & U. Knoch (2010b). Raising the linguistic accuracy level of advanced L2 writers with
written corrective feedback. Journal of Second Language Writing 19.4, 207–217.
Chandler, J. (2003). The efficacy of various kinds of error feedback for improvement in the accuracy
and fluency of L2 student writing. Journal of Second Language Writing 12.3, 267–296.
Chaney, S. J. (1999). The effect of error types on error correction and revision. California State University,
Sacramento, Department of English: M.A. thesis.
Cohen, A. D. & M. Robbins (1976). Toward assessing interlanguage performance: The relationship
between selected errors, learners’ characteristics, and learners’ expectations. Language Learning 26.1,
45–66.
Connors, R. & A. A. Lunsford. (1988). Frequency of formal errors in current college writing, or Ma
and Pa Kettle do research. College Composition and Communication 39.4, 395–409.
Ellis, R., Y. Sheen, M. Murakami & H. Takashima. (2008). The effects of focused and unfocused
written corrective feedback in an English as a foreign language context. System 36.3, 353–371.
Eskey, D. E. (1983). Meanwhile, back in the real world...Accuracy and fluency in second language
teaching. TESOL Quarterly 17, 315–323.
Ferris, D. R. (2004). The ‘grammar correction’ debate in L2 writing: Where are we, and where do we
go from here? (and what do we do in the meantime . . . ?). Journal of Second Language Writing 13, 49–62.
Ferris, D. R. (2006). Does error feedback help student writers? New evidence on the short- and long-
term effects of written error correction. In K. Hyland & F. Hyland (eds.), Feedback in second language
writing: Contexts & issues. Cambridge, UK: Cambridge University Press, 81–104.
Ferris, D. R. (2010). Second language writing research and written corrective feedback in SLA:
Intersections and practical applications. Studies in Second Language Acquisition 32, 181–201.
Ferris, D. R. (2011). Treatment of error in second language student writing (2nd edn). Ann Arbor, MI: University
of Michigan Press.
Ferris, D. R. (2012). Written corrective feedback in second language acquisition and writing studies
(Research timeline). Language Teaching 45.4, 446–459.
Ferris, D. R. & B. J. Roberts (2001). Error feedback in L2 writing classes: How explicit does it need to
be? Journal of Second Language Writing 10.2, 161–184.
Ferris, D. R., H. Liu, A. Sinha & M. Senna (2013). Written corrective feedback for individual L2
writers. Journal of Second Language Writing 22, 307–329.
Goldstein, L. (2001). For Kyla: What does the research say about responding to ESL writers? In T. Silva
& P. K. Matsuda (eds.), On second language writing. Mahwah, NJ: Lawrence Erlbaum Associates, 73–
90.
Harap, H. (1930). The most common grammatical errors. English Journal 19.6, 440–446.
Haswell, R. H. (1983). Minimal marking. College English 45, 600–604.
Hendrickson, J. M. (1980). The treatment of error in written work. The Modern Language Journal 64.2,
216–221.
Hyland, F. (2003). Focusing on form: Student engagement with teacher feedback. System 31, 217–230.
Kepner, C. G. (1991). An experiment in the relationship of types of written feedback to the development
of second-language writing skills. The Modern Language Journal 75.3, 305–313.
Lalande, J. F. II (1982). Reducing composition errors: An experiment. The Modern Language Journal 66.2,
140–149.
Leki, I. (1991). The preferences of ESL students for error correction in college level writing classes.
Foreign Language Annals 24.3, 203–218.
Lunsford, A. A. & K. J. Lunsford (2008). ‘Mistakes are a fact of life’: A national comparative study.
College Composition and Communication 59.4, 781–806.
Porte, G. (2012). Introduction. In G. Porte (ed.), Replication research in applied linguistics. Cambridge:
Cambridge University Press, 1–18.
Reid, J. M. (1994). Responding to ESL students’ texts: The myths of appropriation. TESOL Quarterly
28, 273–292.
Robb, T., S. Ross & I. Shortreed. (1986). Salience of feedback on error and its effect on EFL writing
quality. TESOL Quarterly 20.1, 83–93.
Roberts, B. J. (1999). Can error logs raise more than consciousness? The effects of error logs and grammar feedback
on ESL students’ final drafts. California State University, Sacramento, Department of English: M.A.
thesis.
Santa, T. (2006). Dead letters: Error in composition, 1873–2004. Cresskill, NJ: Hampton Press.
Semke, H. (1984). The effects of the red pen. Foreign Language Annals 17.3, 195–202.
Shaughnessy, M. P. (1977). Errors and expectations. New York: Oxford University Press.
Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL
learners’ acquisition of articles. TESOL Quarterly 41.2, 255–283.
Storch, N. & G. Wigglesworth (2010). Learners’ processing, uptake, and retention of corrective feedback
on writing. Studies in Second Language Acquisition 32, 303–334.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning 46,
327–369.
van Beuningen, C., N. H. de Jong & F. Kuiken. (2012). Evidence on the effectiveness of comprehensive
error correction in second language writing. Language Learning 62.1, 1–41.
Williams, J. M. (1981). The phenomenology of error. College Composition and Communication 32.2, 152–
168.

DANA FERRIS is Professor and Associate Director for Second Language Writing in the University Writing
Program at the University of California, Davis, USA. In her research projects, she has extensively
investigated teacher response to student writing, including (but not limited to) the specialized topic
of written CF. Her books include Written corrective feedback in second language acquisition and writing (with
John Bitchener, Routledge, 2012), Treatment of error in second language student writing (Michigan, 2011),
Response to student writing (Erlbaum, 2003), and Teaching L2 composition: Purpose, process, and practice (with
John Hedgcock, Routledge, 2014). Her work has appeared in journals such as TESOL Quarterly, Journal
of Second Language Writing, Language Teaching, Studies in Second Language Acquisition, and Research in the Teaching
of English.
