Grammar and the Lexicon: Developmental Ordering in Language Acquisition
Author(s): James A. Dixon and Virginia A. Marchman

Source: Child Development, Vol. 78, No. 1 (Jan. - Feb., 2007), pp. 190-212
Published by: Wiley on behalf of the Society for Research in Child Development
Stable URL: https://www.jstor.org/stable/4139220
Child Development, January/February 2007, Volume 78, Number 1, Pages 190-212

James A. Dixon, University of Connecticut
Virginia A. Marchman, Stanford University

Recent accounts of language acquisition propose that the knowledge structures that comprise language develop within a single, unified system that shares computational resources and representations. One implication of this approach is that developmental relations within the system become central to theorizing about language acquisition. Previous work suggested that lexical development preceded grammatical development, a developmental ordering with strong theoretical implications. One purpose of the current article is to test this developmental ordering hypothesis. Results showed that children (aged 16-30 months) develop lexicon and grammar synchronously. The second purpose is to demonstrate a recently developed method for testing developmental ordering, the nonlinear-mapping approach, and show how the method can be extended to capitalize on multiply determined developmental systems, such as language.

Portions of this work were supported by a Grant from the National Institutes of Health (HD 42235). The authors would like to thank the members of the MacArthur-Bates CDI Advisory Board for permission to use the norming data for the current project. We also thank Donna Thal, Jeff Elman, and two anonymous reviewers for helpful comments on a previous version of the manuscript.
Correspondence concerning this article should be addressed to James A. Dixon, Department of Psychology, 406 Babbidge Road, University of Connecticut, U-1020, Storrs, CT 06269-1020. Electronic mail may be sent to james.dixon@uconn.edu.
© 2007 by the Society for Research in Child Development, Inc. All rights reserved. 0009-3920/2007/7801-0011

Developmental ordering is a crucial link between developmental theory and data. The order in which skills, abilities, structures, and so forth develop is a fundamental source of evidence in developmental science. Developmental theories are constrained by the ontogenetic orderings that are empirically observed. Likewise, our theories and hypotheses are also often motivated by the order in which items arise. For example, children acquire basic-level concepts before either superordinate or subordinate ones, an ordering that constrains theories of semantic cognition (McClelland & Rogers, 2003). In memory development, Courage and Howe (2004) proposed that autobiographical memory and the concept of "self" develop synchronously, thus placing a lower limit on early autobiographical recollections. In the development of early verbal behavior, Ejiri and Masataka (2001) proposed that rhythmic motor action and nonverbal vocalizations are important precursors to canonical babbling. In general, developmental ordering allows researchers to address fundamental questions about relations within a developing system, for instance, whether development of different aspects of the system stems from a common source or whether the development of one aspect of the system is contingent upon the development of another.

In the domain of language, developmental ordering relations have played a particularly important role in recent theorizing. Across the first years of life, children's language abilities typically progress through several phases that appear to reflect an asynchronous sequence of events. Babbling and sound play characterize much of the progress within the first year, followed later by the onset of recognizable words around a child's first birthday. It is typically only after a relatively extended period of single words or short simple phrases (e.g., "mommy sock") that children's productions reflect the grammar of their native language, increasing in overall length as well as the inclusion of closed-class forms (e.g., inflections like plural -s or past tense -ed). While there are well-documented individual differences in the timing of this sequence (e.g., Bates, Dale, & Thal, 1995), the transitions from sounds to words to grammar have traditionally been viewed as demarcating three distinct and developmentally autonomous phases (e.g., Pinker, 1999).

Others have instead proposed that the knowledge structures that comprise these fundamental language domains are constructed by the child within a unified developing system using a common set of domain-general learning mechanisms and computational or representational resources (Bates & Goodman, 1999; Elman, 2004; MacWhinney, 2001, 2004; Marchman, 1997; Zevin & Seidenberg, 2004). According to this view, the wide array of knowledge structures that comprise language are emergent phenomena; they do not reside in a preformed state
within the organism, nor are they simply learned from the environment. Rather, they emerge through complex interactions between domain-general learning mechanisms and an intricately structured, multiply faceted world. As the system organizes in response to regularities at the different levels of language, various structures emerge at higher levels of organization. For example, Plaut and Kello (1999) showed that phonological representations may emerge from the confluence of semantic, acoustic, and articulatory factors. According to this view, phonological representations form through the repeated interactions with these other levels; such interactions can only occur if all the levels are cast in a single system. Similarly, it has been proposed that abstract grammatical abilities (e.g., productive rule-like use of inflectional morphemes, like "daddy goed") emerge over the course of building a lexical system (e.g., Bates & Goodman, 1999). Here, grammatical rules and principles are derived via the tracking of frequency-weighted regularities that occur within and across lexical forms.

For these theories of acquisition, then, it becomes paramount to understand the developmental ordering relations within that system. These orderings can reveal contingencies in how development takes place. For example, developmental change within one level of language (e.g., the lexicon) may be contingent upon prior developmental change within another level (e.g., phonology). Thus, developmental orderings place important constraints on theory and provide the central phenomena that are addressed by computational models. That is, computational models of language acquisition predict the order in which language structures emerge based on the principles of acquisition that are embedded in the models. The order of the emergence of language structures is one of the model's key benchmarks; the performance of the model is compared with what is known about the developmental sequences observed in children (Elman, 2001; Marchman, Plunkett, & Goodman, 1997; Munakata & McClelland, 2003; Plunkett & Marchman, 1993; Thomas & Karmiloff-Smith, 2003). Models are supported to the degree that they show the same developmental orderings that have been observed empirically.

A major goal of the current article is to test one such ordering hypothesis, specifically, that lexical development precedes the development of grammar. A considerable body of empirical evidence has been presented that supports this hypothesis (Bates & Goodman, 1999; Dromi, 1987). This developmental relation must be explained by theories of language acquisition and demonstrated by computational models of the process claiming to provide evidence for the psychological reality of the mechanisms they embody (Cohen & Chaput, 2002). For example, in the acquisition of the English past tense, children typically produce overregularization errors after the irregular verbs have been produced correctly, that is, U-shaped development (e.g., Cazden, 1968, but see Marcus et al., 1992, for an alternative view). Plunkett and Marchman (1993) presented a computational model of the development of the English past tense in which the model's behavior also followed a U-shaped course. While the model's behavior did not mimic children in all regards, the claims made by Plunkett and Marchman (1993) regarding the computational mechanisms underlying the transition from rote production of lexical items to productive language use would have been greatly undermined if the model did not follow the same developmental ordering.

A second major goal of the article is to demonstrate and extend previous work on a general method for testing developmental ordering hypotheses. The developmental ordering of items (e.g., structures, abilities, skills, behaviors) is a key theoretical prediction about any developing system. However, given the types of measures researchers typically have available, testing these hypotheses has been notoriously difficult. The current article provides a detailed example of how a developmental ordering hypothesis can be tested using an approach presented by Dixon (2005), which we call the nonlinear-mapping approach. It also demonstrates how this approach can be extended for developmental systems that contain multiple influences, such that converging evidence is brought to bear on the issue.

Developmental Ordering of the Lexicon and Grammar

A large body of work shows that development of the lexicon and grammar are strongly correlated, a cross-domain association that has been taken as evidence that learning in these domains is paced by the same computational or representational mechanisms. Nevertheless, the developmental asynchrony between lexical learning and grammar learning suggests that lexical development may initially outpace the development of grammar (see Bates & Goodman, 1999 for a review). One central piece of evidence for this latter hypothesis concerns the shape of the relation between measures of grammar and vocabulary size. When children's scores on a measure of grammar are plotted as a function of their scores on a measure of vocabulary size, the relation appears strikingly curvilinear. Figure 1 shows an
example of this relation using normative data from the vocabulary checklist and grammar complexity sections of the MacArthur-Bates Communicative Development Inventories (CDI; Fenson et al., 1993, in press). As can be seen in the figure, lexical development appears to initially occur more rapidly than grammatical development (i.e., there is more change in the size of the lexicon early on than in grammar). Note that this pattern of data cannot be explained by the logically required fact that grammar must have some minimum set of words on which to be used. Taken alone, this fact only specifies the location of the intercept; some minimum number of words is necessary for grammatical development to begin. Beyond that minimum number of words, the relation between lexicon and grammar could be negligible. Thus, this fact does not imply that grammar and the lexicon will be strongly related over their respective developmental courses (and is completely mute on what shape that relationship might take).

[Figure 1: grammatical complexity plotted against Number of Words Produced (Lexicon), 0 to 700.]

Figure 1. Grammatical complexity as a function of number of words produced. The curve shows the fitted line from the model in which the measure of lexicon and lexicon squared were used to predict grammatical complexity. The shape of this particular curve is nearly identical to that presented in the review by Bates and Goodman (1999).

Bates and Goodman (1999) noted that this nonlinear relationship between the lexicon and grammar has been demonstrated both cross-sectionally and longitudinally for English-speaking children (see also Bates et al., 1994; Dale, Dionne, Eley, & Plomin, 2000; Fenson et al., 1994; McGregor, Sheng, & Smith, 2005). Studies have also found similar relationships in Italian (Caselli, Casadio, & Bates, 1999), Hebrew (Maital, Dromi, Sagi, & Bornstein, 2000), Icelandic (Thordardottir, Weismer, & Evans, 2002), and Spanish (Jackson-Maldonado et al., 2003). Bates and Goodman (1999) examined several possible methodological artifacts that could account for these effects, and demonstrated that the relationship held even if words that are related to grammatical complexity (i.e., grammatical function words such as prepositions and conjunctions) were omitted from the vocabulary count. When the growth curves of individuals were examined, their patterns were similar to the average patterns. Furthermore, a recent study with children learning both English and Spanish at the same time (Marchman, Martinez-Sussmann, & Dale, 2004) indicated that similarly strong relations are seen within each language (i.e., English lexicon to grammar; Spanish lexicon to grammar), even though across-language relations (i.e., English lexicon to Spanish lexicon; English grammar to Spanish grammar) were weak. These same relations were found using measures of vocabulary and grammar based on naturalistic language performance, not just those relying on checklist reports from parents and/or other caregivers.

A considerable amount of theorizing has been carried out to explain this developmental ordering relationship. Perhaps the most prominent idea is that grammar emerges from the mechanisms involved in acquiring the lexicon itself. Under this hypothesis, as the lexicon increases in size, grammar becomes organized into increasingly complex forms. Put another way, grammatical forms are emergent properties of the unified language system and are most strongly tied to changes in the lexicon within that system. As the size of the lexicon grows, the quality of the emergent grammar changes. In this way, a "critical mass" of vocabulary is required for the development of a particular level of grammatical complexity (Marchman & Bates, 1994; Plunkett & Marchman, 1993). In another version of this view, learning words involves learning their grammatical as well as lexical-semantic properties, including requirements regarding the constructions in which those words can legally appear and which inflectional morphemes are required. Others have pointed out that early word combinations tend to be highly routinized and situation-specific, suggesting that learning grammar, like learning words, may be driven by processes that are item-specific and frequency dependent. It is only over the course of learning an increasingly sophisticated and interrelated lexicon that grammatical structures become encoded in terms of their abstract syntactic form (e.g., Akhtar, 1999; Braine, 1976; Lieven, Pine, & Baldwin, 1997; Tomasello, 2003). All of these proposals presume strong associations between the learning principles involved in lexical and grammatical acquisition, and are compatible with an increased integration of lexical-semantics and grammar in theories of linguistics (e.g., Bresnan, 1982; Goldberg, 1999). Should they bear out empirically, each places important constraints on the architecture of the language system and causal relations within that system.

Reconsidering Developmental Ordering and Functional Form

The shape of the relationship between two measures provides a potentially powerful indicator of the type of developmental relation between the underlying variables. However, the interpretation of that shape or functional form depends crucially on the mapping between the underlying variables and their respective measures. Consider, for example, the developmental relationship between the two items, A and B, shown in the upper panel of Figure 2. The line in the center of the figure, labeled "Development," shows that development proceeds from the bottom of the panel to the top. Each item is represented by a bar that increases in saturation as the item develops. The period during which the developmental changes of interest occur for each item is marked by a rectangle enclosing the bar. In the current example, items A and B develop synchronously; they start and complete development together and develop at the same rate.

Each underlying item is assessed by a measure, shown next to it (and labeled arbitrarily with numbers from 0 to 100). The lines connecting the underlying variables to the measures illustrate the mapping between them. For both items, the measured values increase as the underlying level of the variable increases. Furthermore, the measure of item A captures developmental change in A equally well across its developmental span. That is, the measure is equally responsive to early and later changes in A (i.e., forms an interval-level scale; Stevens, 1951). However, the situation for item B is different. The measure of item B is more responsive to later changes than it is to earlier ones. The mapping between the underlying variable B and the measure of B is ordinal, but nonlinear.

The dashed lines show hypothetical individuals at different points in the developmental period. For each measure, an individual's score can be found by following the dashed line to the underlying variable and then the solid line to the measure. The lower panel of Figure 2 shows the observed relationship between the measures of A and B. Note that, despite the fact that the underlying relationship is synchrony, the observed relationship shows a strong A before B, curvilinear pattern. The nonlinear mapping between the underlying variable B and its measure creates this illusory pattern. Although a nonlinear mapping, such as that exhibited by the measure of item B, may initially seem unlikely or even contrived, note that this situation only requires that the measure be more responsive at some levels of development than others. It seems likely that this is a very common, perhaps even modal, situation in developmental research. Certainly, all researchers hope that their measures capture developmental change across the span, but it is probably overly optimistic to expect that a measure captures the extent of developmental change equally well across the developmental range.

This analysis leads to the unsettling, general conclusion that the observed functional form seen in the data may not accurately reflect the actual form of the underlying relationship between the variables of interest. But is it possible to tell the difference between relations that reflect a true developmental ordering and those that derive from measurement nonlinearities? Some recent work on this issue has indicated that it is. For example, Dixon (1998) showed how some developmental ordering hypotheses were eliminated by particular data patterns. In more recent work, Dixon (2005) showed that nonlinear mappings, such as the one discussed above, must create specific relationships between error and the predicted value of the measure, given standard ordinary least-squares (OLS) regression assumptions (Cohen, Cohen, West, & Aiken, 2003). Furthermore, the evaluation of these error patterns provides the tools to test directly whether a nonlinear mapping between the measure and the underlying variable is masking the true underlying relation among the constructs. That is, by examining the pattern of residuals (i.e., errors) across the predicted values of the measure used as the dependent variable, it is possible to determine whether an observed ordering relation between two variables (i.e., A before B) is actually an artifact of a nonlinear mapping between measure and construct. Because this approach features heavily in the analyses that follow, we first provide an overview of the logic and then present the approach in some detail using the synchrony relationship as an example.

An overview of the nonlinear mapping approach. The logic of the nonlinear mapping approach flows from a fundamental and widespread assumption about the nature of developmental systems in combination with some easily demonstrable consequences of nonlinear functions. We assume that development results from the interactions of a complex system and that, therefore, even the strongest relationships are not deterministic. One seemingly unremarkable implication of this assumption is that there is always

[Figure 2, upper panel: "Underlying Relationship: Synchrony," showing Item A and Item B, each beside a measure scaled from 0 to 100. Lower panel: "Observed Data Pattern: A Prior to B," plotted against Measure of Item A (0 to 100).]

Figure 2. The synchrony relationship between items A and B is shown in the top panel. The measure of each item, with values that range from 0 to 100 (an arbitrarily chosen scale), is shown beside it. The lines connecting each underlying variable to its respective measure give a sense of the mapping between them. For example, the measure of item A is linearly related to the underlying level of A. The measure of item B is nonlinearly related to the underlying variable B; earlier developmental changes in underlying item B show smaller increases in the measure relative to later ones. The lower panel shows the idealized data pattern that results from the situation in the top panel.

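
The situation depicted in Figure 2 can be sketched numerically. The following is a minimal illustration, not the authors' procedure: two underlying items develop in synchrony (B is a linear function of A plus error), A is measured linearly, and B is measured through an accelerating function (squaring is used here as one convenient choice). The intercept, slope, error standard deviation, sample size, and the helper names `simulate_figure2` and `fit_summary` are all assumptions introduced for this example.

```python
import numpy as np


def simulate_figure2(n=2000, alpha=10.0, beta=1.0, error_sd=3.0, seed=0):
    """Simulate synchrony between A and B, then measure B nonlinearly.

    All numeric values are illustrative assumptions, not values from the article.
    alpha is chosen large relative to error_sd so that b_u is almost surely positive.
    """
    rng = np.random.default_rng(seed)
    a_u = rng.uniform(0.0, 100.0, size=n)      # underlying level of A
    e_u = rng.normal(0.0, error_sd, size=n)    # underlying error, unrelated to A
    b_u = alpha + beta * a_u + e_u             # synchrony: B_u is linear in A_u
    a_m = a_u.copy()                           # measure of A: linear mapping
    b_m = b_u ** 2                             # measure of B: accelerating mapping
    return a_u, b_u, a_m, b_m


def fit_summary(x, y):
    """Regress y on x (and on x, x^2) by least squares and summarize the fits."""
    lin = np.polyfit(x, y, 1)
    quad = np.polyfit(x, y, 2)
    resid_lin = y - np.polyval(lin, x)
    fitted = np.polyval(quad, x)
    resid_quad = y - fitted
    r2_lin = 1.0 - resid_lin.var() / y.var()
    r2_quad = 1.0 - resid_quad.var() / y.var()
    # Correlation between residual size and predicted value:
    # near zero when error is homoscedastic.
    spread_trend = np.corrcoef(np.abs(resid_quad), fitted)[0, 1]
    return {
        "curvature": float(quad[0]),
        "r2_gain": float(r2_quad - r2_lin),
        "spread_trend": float(spread_trend),
    }


a_u, b_u, a_m, b_m = simulate_figure2()
underlying = fit_summary(a_u, b_u)   # the view researchers never have
observed = fit_summary(a_m, b_m)     # the view at the level of the measures
print("underlying:", underlying)
print("observed:  ", observed)
```

Under these assumptions, the underlying relation shows essentially no gain from adding a squared term and no trend in residual spread, whereas the measured relation shows an "A before B" curvilinear pattern together with residuals whose size grows with the predicted value. The exact numbers depend on the arbitrary parameter choices.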
some unexplained variance in the relation between any two (or more) items. Note that this is an assumption about the underlying reality of developmental systems; it is presumed to be true of developmental relations regardless of whether anyone is measuring them. The nonlinear mapping approach capitalizes on this bit of unexplained variance to help diagnose the relation or mapping between underlying variables and our measures of them. If the mapping is nonlinear, then the unexplained variance must become systematically related to the measured variable. This follows from the fact that nonlinear functions create systematic relations among previously unrelated terms (the mapping from the underlying variable to its measure acts like a function). Therefore, by examining the relation between error and the measured variable, one can test whether an observed developmental relation was created through a nonlinear mapping. In the next section, we present the approach in more detail using the synchrony relation as an example. We use the familiar general linear model to represent developmental relations, thus facilitating the translation from hypothesized relations to well-known analytic methods, such as OLS regression.

Representing synchrony as a simple equation. The synchrony relationship can be represented as a simple equation: $B_u = \alpha + \beta A_u$, where the intercept is denoted as $\alpha$ and the slope as $\beta$. The equation specifies that changes in $B_u$ are linearly related to changes in $A_u$ and, therefore, is another way of stating that the underlying variables A and B develop synchronously. Unexplained variance or error, in the form of individual-level differences and unmeasured influences, enters the model at this level: $B_u = \alpha + \beta A_u + \varepsilon_u$. In standard fixed-effect approaches, such as OLS regression, we assume that error, $\varepsilon_u$, is unrelated to the levels of the predictors. We also assume that $\varepsilon_u$ is normally distributed, has a mean of 0, and an unknown variance.

This simple model describes the relationship between A and B at the underlying level, a level we cannot observe directly; measures are all we have in hand. Assume that, consistent with Figure 2, our measure of A, $A_m$, is a linear function of $A_u$. Similarly, assume that our measure of B, $B_m$, is a nonlinear function of $B_u$, in the manner depicted in Figure 2. The relationship between the underlying variable B and its measure can be represented as a nonlinear function such as being raised to a positive power greater than 1: $B_m = (B_u)^2$. Substituting the equation presented above for $B_u$, we find that $B_m = (\alpha + \beta A_u + \varepsilon_u)^2 + \varepsilon_m$. The additional error term, $\varepsilon_m$, represents error that occurs at the measurement level (e.g., lapses in the participant's attention, misunderstanding of a question, etc.).

The nonlinear mapping function above creates the curvilinear relationship between $A_m$ and $B_m$ from the linear relationship between $A_u$ and $B_u$. However, it also leaves a signature pattern in the data. Because the sum of $A_u$ and underlying error, $\varepsilon_u$, is nonlinearly transformed, these terms that were previously independent now become related. This mathematical fact is easily demonstrated in the current context by expanding the equation $(\alpha + \beta A_u + \varepsilon_u)^2$, which yields $\alpha^2 + (\beta A_u)^2 + \varepsilon_u^2 + 2\alpha(\beta A_u) + 2\alpha\varepsilon_u + 2(\beta A_u)\varepsilon_u$. For our current purposes, the interesting element is the last, multiplicative term, $2(\beta A_u)\varepsilon_u$, because it shows that the nonlinear mapping function creates a relation between $A_u$ and $\varepsilon_u$. (The remaining terms can be ignored for the moment.) Multiplicative terms, of course, imply that the effect of one variable will depend on the value of the other. All nonlinear mappings create relationships among the underlying terms; the powers are a convenient way to represent nonlinear mappings, but not essential to the approach. The major point here is that if there is a nonlinear mapping from the underlying variable to the observed measure, systematic relationships between the underlying variable and underlying error will be created. Of course, the precise form of this relationship will depend on the particular type of nonlinear mapping. For example, variable-error relations will look different if there is a nonlinear mapping from variable to measure in only one of the variables versus if variable-measure nonlinearities exist in both variables. Therefore, given an observed developmental ordering (e.g., A before B), it is possible to test whether this ordering is an artifact of a nonlinear mapping between the measure and the underlying construct, and hence, does not actually reflect the true underlying developmental relationship.

An Alternative Explanation for the Lexicon-Grammar Relationship

In the current context, we consider two possible situations in which we would see a curvilinear relationship between the lexicon and grammar. First, it is possible that the underlying developmental relationship is indeed as it appears at the level of the measures; lexical development precedes grammatical development. In this case, both measures of lexicon and grammar would capture developmental change equally well across the period, that is, there would be no nonlinearities in measurement that would change the way the underlying developmental ordering appears in the data. A second possibility is that the underlying relationship between lexicon and grammar is actually synchrony, but a nonlinear mapping between one or more of the underlying variables and its respective measure is driving the curvilinear relationship, masking the true situation that lexicon and grammar are developing synchronously. (For purposes of exposition, we postpone discussion of a third logical possibility, that grammatical development precedes lexical development, until later in the article.)

In panel (i) at the top of Figure 3, we plot the observed priority relation between lexicon and grammar. In the bottom panel (u), we plot the hypothesized underlying relation between lexicon and grammar, synchrony. Now, consider how this synchrony relationship might give rise to the observed curvilinear relationship in more detail. There are only three types of nonlinear mapping situations that can produce

[Figure 3: panel (i), Measure of Grammar plotted against Measure of Lexicon; panels (ii), (iii), and (iv), mappings from underlying Lexicon and Grammar to their measures, each scaled from 0 to 100; panel (u), Underlying Grammar plotted against Underlying Lexicon.]

Figure 3. The top panel (i) shows an idealized version of the observed relationship between lexicon and grammar. The bottom panel (u) shows the hypothesized underlying synchrony relationship between lexicon and grammar plotted analogously. The middle row shows the three different nonlinear mapping situations that can produce the observed relationship from synchrony. In panel (ii), the measure of lexicon is a nonlinear, decelerating function of underlying lexicon, but the measure of grammar is a linear function of underlying grammar. In panel (iii), the measure of lexicon is linear, but the measure of grammar is a nonlinear, accelerating function of underlying grammar. Finally, panel (iv) shows the situation in which both lexicon and grammar are nonlinearly related to their underlying variables as just described.

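
The three mapping situations in Figure 3 can be sketched in a few lines. This is a rough illustration under stated assumptions rather than a reproduction of the article's simulation: the intercept (5), slope (1.5), and error variance (20) are borrowed from the simulation described later in the article, while the range of the underlying lexicon, the sample size, the use of a square root as the decelerating mapping and squaring as the accelerating mapping, and the omission of measurement-level error are choices made here for brevity. The helper names are hypothetical.

```python
import numpy as np


def curvature(x, y):
    """Coefficient on x**2 when y is regressed on x and x**2 by least squares."""
    return float(np.polyfit(x, y, 2)[0])


def decelerating(v):
    """Assumed decelerating mapping: early change yields more change in the measure."""
    return np.sqrt(v)


def accelerating(v):
    """Assumed accelerating mapping: later change yields more change in the measure."""
    return v ** 2


rng = np.random.default_rng(1)
n = 3000                                          # assumed sample size
l_u = rng.uniform(10.0, 100.0, size=n)            # underlying lexicon (assumed range)
e_u = rng.normal(0.0, np.sqrt(20.0), size=n)      # underlying error, variance 20
g_u = 5.0 + 1.5 * l_u + e_u                       # synchrony: G_u = alpha + beta * L_u + e_u

# (measure of lexicon, measure of grammar) for each panel in the middle row.
situations = {
    "ii": (decelerating(l_u), g_u),                 # lexicon nonlinear, grammar linear
    "iii": (l_u, accelerating(g_u)),                # lexicon linear, grammar nonlinear
    "iv": (decelerating(l_u), accelerating(g_u)),   # both nonlinear
}

underlying_curvature = curvature(l_u, g_u)
observed_curvature = {
    name: curvature(l_m, g_m) for name, (l_m, g_m) in situations.items()
}

print("underlying:", underlying_curvature)
for name, value in observed_curvature.items():
    print(f"panel ({name}):", value)
```

In this sketch the underlying relation has essentially no curvature, while each of the three measured relations bends upward, so that lexicon appears to run ahead of grammar. Only the sign of the squared-term coefficient is comparable across situations, because the measures are on different scales.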
simulated
the curvilinear pattern in panel (i) from an under-relationship between lexicon and gram-
mar second
lying synchrony relationship (panel (u)). The when they are developing in synchrony. The
row of panels in Figure 3 illustrates thesemiddle
three and
sit-lower panels of Figure 4 are standard
uations graphically. In the left-most panelways
(ii),of presenting residuals. Deviations from the
the
mapping between the lexicon and its measure is nonlinear such that early development of the lexicon produces more change in the measure relative to later development. The mapping between grammar and its measure, however, is linear. In the center panel (iii), the mapping between the lexicon and its measure is linear, but the mapping between grammar and its measure is nonlinear. The grammar measure is more responsive to later developmental changes than it is to earlier ones. The right-most panel (iv) shows both variables with measures that are nonlinear in the ways just described. Thus, the only ways to obtain the observed nonlinear relationship when the lexicon and grammar are actually developing in synchrony are (1) for the measure of the lexicon to be a nonlinear, decelerating function of the underlying lexicon, or (2) for the measure of grammar to be a nonlinear, accelerating function of underlying grammar, or (3) for both those nonlinear mappings to occur. Dixon (2005) demonstrated that these types of nonlinear mappings between measures and their underlying constructs make predictions about the patterns that will be observed at the level of the residuals.

Nonlinear mappings predict specific residual patterns. We illustrate the predictions for each type of nonlinear mapping using a simple simulation, adapted from Dixon (2005). The simulation allows us to show how error that is unrelated to the underlying variables (i.e., homoscedastic) becomes systematically related to the measured variables (i.e., heteroscedastic). Furthermore, the simulation concretely demonstrates that each type of nonlinear mapping shown in Figure 3 creates a specific relationship between error and predicted values that can be both visually inspected and tested statistically. We present the general forms of these predicted residual patterns graphically.
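To see why a nonlinear measure turns homoscedastic error into heteroscedastic error, a short worked case may help. The specific forms here are illustrative assumptions rather than a statement of the authors' derivation: suppose the underlying relation is linear, $G_u = \alpha + \beta L_u + \varepsilon_u$, with constant error variance $\sigma^2$, and suppose the grammar measure is the square of the underlying value plus independent measurement error $\varepsilon_m$ with variance $\sigma_m^2$. Normality of $\varepsilon_u$ is assumed only so that $\operatorname{Var}(\varepsilon_u^2) = 2\sigma^4$ and the cross term vanishes.

```latex
\begin{align*}
G_m &= G_u^2 + \varepsilon_m
     = (\alpha + \beta L_u)^2 + 2(\alpha + \beta L_u)\,\varepsilon_u
       + \varepsilon_u^2 + \varepsilon_m, \\
E[G_m \mid L_u] &= (\alpha + \beta L_u)^2 + \sigma^2, \\
G_m - E[G_m \mid L_u] &= 2(\alpha + \beta L_u)\,\varepsilon_u
       + (\varepsilon_u^2 - \sigma^2) + \varepsilon_m, \\
\operatorname{Var}(G_m \mid L_u) &= 4(\alpha + \beta L_u)^2\,\sigma^2
       + 2\sigma^4 + \sigma_m^2 .
\end{align*}
```

Under these assumptions the first variance term grows with $(\alpha + \beta L_u)^2$, so for $\beta > 0$ the spread of the residuals widens as the predicted value increases, even though $\varepsilon_u$ itself has constant variance.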
First, we show the hypothesized underlying synchrony relationship and residuals. This under-the-hood view of the developmental relations is, of course, never available to researchers. We present it here to show the relation graphically and to provide a point of comparison for examining the residual plots below. The simulated data consist of two underlying variables, which we label here L (lexicon) and G (grammar), related via the following equation: $G_u = \alpha + \beta L_u + \varepsilon_u$. The values of $\alpha$ and $\beta$ were set at 5 and 1.5, respectively, and $\varepsilon_u$ had a mean of 0 and a variance of 20. The upper panel of Figure 4 plots the underlying relationship between lexicon and grammar. In the middle and lower panels, the residuals around the model predictions (the zero point on the y axis) are shown as a function of the predicted values. Note that the extent of scatter around the best-fitting line is constant in both plots, indicating that at the underlying level error is unrelated to the predicted values of grammar and lexicon, respectively (i.e., homoscedastic). However, if a nonlinear mapping obtains between the measure and the underlying construct, the appearance of these residual patterns will change dramatically in ways that reflect the particular type of nonlinearity imposed by the measure.

[Figure 4 axis labels: Underlying L; Predicted Value of Grammar; Predicted Value of Lexicon.]
Figure 4. The top panel shows the underlying relationship between lexicon and grammar specified in the simulation. The middle panel shows the residuals when grammar is predicted as a function of lexicon. The lower panel shows the residuals when lexicon is predicted as a function of grammar.

To illustrate, Figure 5 presents the patterns of residuals that are obtained when one or more of the constructs is nonlinearly related to its measure. In all cases, the observed relation between lexicon and grammar is priority, as plotted in the top-most panel (panel (i)). The left-most portion of the figure presents the situation in which the measure of the lexicon, $L_m$, was a decelerating function of $L_u$, illustrated graphically in panel (ii). To generate the residual plots, we created $L_m$ as: $L_m = (L_u)^{1/2} + \varepsilon_{ml}$. $G_m$ was a linear function of $G_u$: $G_m = G_u + \varepsilon_{mg}$. The error terms were drawn from a normal distribution with a mean of 0 and a variance of 20. Panel (viii) presents the pattern of residuals when $G_m$ and $G_m^2$ were used to predict $L_m$. In this case, the model fit was very good, but the pattern of residuals was strongly heteroscedastic (i.e., the distribution is not uniform across the values of the dependent measure). The absolute values of the residuals were negatively related to the magnitude of the predicted values of $L_m$, and the degree of scatter changes across the predicted values as well. Panel (v) plots residuals when $L_m$ and $L_m^2$ were used to predict $G_m$. Again, the pattern of residuals was heteroscedastic, although the relationship between the absolute value of the residuals and the predicted values was positive and somewhat weaker. These two plots (panels (viii) and (v)) illustrate that if the underlying relationship between the lexicon and grammar were synchronous but a nonlinear mapping between the underlying lexicon, $L_u$, and its measure, $L_m$, were driving the observed data pattern, the residuals would be related to the predicted values. This relationship is the strongest for predicted values of lexicon, because the measure of lexicon here is nonlinear. The nonlinear mapping has left a signature pattern in the residuals; a specific set of relationships between underlying error and the predictor variables has been induced by the nonlinearity of the measure.

Next consider what would happen if our measure of grammar, $G_m$, was an accelerating function of underlying grammar, $G_u$, illustrated graphically in panel (iii) of Figure 5. We created $G_m$ such that $G_m = (G_u)^2 + \varepsilon_{mg}$. $L_m$ was created as a linear function of $L_u$, and the error terms were created as described above. When $G_m$ and $G_m^2$ were used to predict $L_m$, the fit was again very good, but the pattern of residuals was quite different from that in the previous case. As can be seen in panel (ix) of Figure 5, the pattern of residuals was curvilinear with two bends (i.e., cubic), but does not show the negative relationship presented in panel (viii). When $L_m$ and $L_m^2$ were used to predict $G_m$, the pattern of residuals was strongly heteroscedastic, as can be seen in panel (vi). Here, the residuals were positively related to the predicted value of $G_m$.

Finally, consider what would occur if both measures, $L_m$ and $G_m$, were nonlinear functions as just described (i.e., $L_m$ was a decelerating function of $L_u$, $G_m$ was an accelerating function of $G_u$), depicted graphically in panel (iv) of Figure 5. When $G_m$ and $G_m^2$ were used to predict $L_m$, panel (x) shows that the pattern of residuals takes on a curvilinear shape and that the residuals were negatively related to the predicted values of $L_m$. Finally, when $L_m$ and $L_m^2$ were used to predict $G_m$, the residuals were again positively related to the predicted values of $G_m$ (panel (vii)). Thus, each of the three nonlinear mapping situations that are capable of creating the observed priority relationship from underlying synchrony can be identified by a specific signature pattern in the plots of the residuals.

Although this might initially appear complex, the implications can be summarized as follows: first, if a curvilinear relationship is observed at the level of the measures (e.g., as has been shown for lexicon and grammar) but the true relation between lexicon and grammar is synchrony, it must be the case that there is a nonlinear mapping between one or more of the measures and their underlying constructs. Specifically, (a) the measure of the item that appears to be developing more rapidly, the lexicon in this case, may be a decelerating function of the underlying variable, and/or (b) the measure of the item that appears to be developing more slowly, grammar in this case, may be an accelerating function of the underlying variable.

Second, each type of nonlinear mapping will necessarily induce a systematic relation between underlying error and the predictor(s), a relation that is observable in the residual patterns. Measures that are decelerating functions of the underlying variable create negative relationships between error and the predicted value of the measure. Measures that are accelerating functions create positive relationships between error and the predicted values.
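The accelerating-grammar case described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' simulation code: the intercept (5), slope (1.5), error variance (20), and squared grammar mapping come from the text, whereas the sample size, the uniform 0-100 range for the underlying lexicon, the seed, and the helper names (`ols`, `quad_fit`, `abs_resid_corr`, `score_stat`) are hypothetical choices made for the example. The score statistic is half the explained sum of squares from regressing scaled squared residuals on the fitted values, in the style of Breusch and Pagan (1979).

```python
import math
import random


def ols(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def quad_fit(x, y):
    """Regress y on mean-centered x and its square; return (fitted, residuals)."""
    mx = sum(x) / len(x)
    X = [[1.0, xi - mx, (xi - mx) ** 2] for xi in x]
    b = ols(X, y)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    return fitted, [yi - fi for yi, fi in zip(y, fitted)]


def abs_resid_corr(x, y):
    """Correlation between |residual| and fitted value (simple descriptive check)."""
    fitted, resid = quad_fit(x, y)
    return pearson([abs(r) for r in resid], fitted)


def score_stat(x, y):
    """Half the explained SS from regressing scaled squared residuals on fitted values."""
    fitted, resid = quad_fit(x, y)
    s2 = sum(r * r for r in resid) / len(resid)
    u = [r * r / s2 for r in resid]           # mean of u is 1 by construction
    tss = sum((ui - 1.0) ** 2 for ui in u)
    return pearson(u, fitted) ** 2 * tss / 2.0


rng = random.Random(2007)                     # arbitrary seed (assumption)
sd = math.sqrt(20.0)                          # error variance of 20, as in the text
n = 2000                                      # assumed sample size
L_u = [rng.uniform(0.0, 100.0) for _ in range(n)]        # assumed range
G_u = [5.0 + 1.5 * l + rng.gauss(0.0, sd) for l in L_u]  # underlying synchrony
L_m = [l + rng.gauss(0.0, sd) for l in L_u]              # linear lexicon measure
G_m = [g ** 2 + rng.gauss(0.0, sd) for g in G_u]         # accelerating grammar measure

r_underlying = abs_resid_corr(L_u, G_u)       # expected to be near zero
r_measured = abs_resid_corr(L_m, G_m)         # expected to be clearly positive
chi2_measured = score_stat(L_m, G_m)          # compare with chi-square, 1 df
print(r_underlying, r_measured, chi2_measured)
```

At the underlying level the absolute residuals should be essentially unrelated to the fitted values, whereas after the squared mapping they should rise with predicted grammar. Swapping in a decelerating lexicon measure would be the natural next variation to try.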
[Figure 5, panels (i)-(x). Axis labels: Measure of Lexicon; Predicted Value of Grammar; Predicted Value of Lexicon.]
Figure 5. The top panel (i) shows an idealized version of th

are shown using the synchrony relationship as an exampl
the nonlinear mappings presented in each column. To emp
arbitrary scale values (0-100). The patterns of residuals wh
(represented in the columns). The patterns of residuals w
Third, as we show below, one can use available statistical methods to assess whether these predicted relationships (between error and predicted values) are obtained in the data, providing useful information regarding the validity of claims on developmental ordering. In this case, we illustrate this technique by examining whether the curvilinear relation between lexicon and grammar reflects an underlying situation of priority or is actually synchrony. We applied this analytic approach to recent data from the MacArthur-Bates CDI. We present the results of this analysis, after reviewing the CDI measures and sample. Finally, we show how the approach can be extended to capitalize on the multiply determined nature of most developmental systems.

Method

Participants

Parental report data were compiled from typically developing children sampled cross-sectionally from 16 to 30 months (N = 1,461; 728 females) in the updated norming sample of the MacArthur-Bates CDI: Words & Sentences (CDI: W&S; Fenson et al., 2007). A minimum of 37 girls and 37 boys were represented at each age (range: n = 37-60). All children were learning English as a first language, and none of the children were reported to have experienced serious birth complications, diagnosed developmental disabilities, hearing loss, or other medical problems. The sample consists of participants from Seattle, WA, San Diego, CA, Madison, WI, Dallas, TX, New Haven, CT, New Orleans, LA, and Providence, RI. The ethnicity breakdown of the sample reflects a dominance of Caucasian (White) participants, with African American, Asian, and Hispanic participants each representing somewhat smaller proportions of the sample. The distribution is similar to that in the United States at large, except for a lower percentage of Hispanics due to the requirement that children are learning English as a native language.

Measures and Procedure

As described in Fenson et al. (in press), parents completed the CDI: W&S, as well as a Basic Information Form, which provided basic demographic and health information.

Size of vocabulary is assessed using the vocabulary checklist from the CDI: W&S. This section asks reporters (typically mothers) to indicate which words (out of 680 possible) their child "understands and says." These checklists capture the reporters' general knowledge about whether or not the child "knows" a particular word, rather than specific information about the contexts in which a word is used or the accuracy of its pronunciation. The items are listed alphabetically in categories (e.g., animal sounds, vehicles). Vocabulary production scores are the total number of words that are reported.

Progress in grammar is assessed using the Grammatical Complexity section of the CDI: W&S. In this section, reporters read a series of 37 pairs of phrases, and are asked to indicate which one of a pair of phrases "sounds most like the way your child is talking right now." The first is an example that lacks grammatical markers or is syntactically simple, whereas the second provides a more complex alternative (e.g., "Kitty sleep" vs. "Kitty sleeping"). The grammatical complexity score is the number of times the second sentence was selected (37 maximum).

Results

Table 1 shows the means and standard deviations for the reported vocabulary and grammar measures grouped by age in months. Both vocabulary production and grammatical complexity increased with age, rs(1460) = .64, .60, respectively. An α level of 0.05 was used for all significance tests, unless otherwise indicated.

Table 1
Mean (SD) Number of Words Produced and Grammatical Complexity

                 Number of words       Grammatical complexity
Age (months)     M          SD         M          SD
16                58.86      59.08      0.48       2.09
17                86.24      83.95      0.53       1.76
18               107.07     110.54      1.03       2.95
19               167.04     138.94      2.45       4.47
20               181.27     134.47      2.67       5.22
21               208.15     157.12      4.32       6.51
22               256.61     166.92      5.79       7.06
23               313.44     162.37      9.36       9.95
24               307.29     171.03      8.50       9.35
25               338.18     178.11      9.92       9.91
26               382.52     180.70     14.73      12.50
27               408.20     176.24     16.74      12.45
28               414.14     158.40     17.89      12.32
29               433.06     174.74     18.90      12.59
30               518.64     125.23     22.85      11.44
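As a rough illustration only, the age-group means in Table 1 can be used to look for curvature between vocabulary and grammar. This sketch is not an analysis reported in the article, which works with child-level data; it fits linear and quadratic models to the 15 group means, and the helper names (`ols`, `fit_r2`) are hypothetical.

```python
# (age in months, mean words produced, mean grammatical complexity), from Table 1
TABLE1 = [
    (16, 58.86, 0.48), (17, 86.24, 0.53), (18, 107.07, 1.03),
    (19, 167.04, 2.45), (20, 181.27, 2.67), (21, 208.15, 4.32),
    (22, 256.61, 5.79), (23, 313.44, 9.36), (24, 307.29, 8.50),
    (25, 338.18, 9.92), (26, 382.52, 14.73), (27, 408.20, 16.74),
    (28, 414.14, 17.89), (29, 433.06, 18.90), (30, 518.64, 22.85),
]


def ols(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fit_r2(X, y):
    """Return (coefficients, R squared) for a least-squares fit of y on X."""
    b = ols(X, y)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    my = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return b, 1.0 - ss_res / ss_tot


words = [w for _, w, _ in TABLE1]
grammar = [g for _, _, g in TABLE1]
mw = sum(words) / len(words)                  # center the predictor before squaring

b_lin, r2_lin = fit_r2([[1.0, w - mw] for w in words], grammar)
b_quad, r2_quad = fit_r2([[1.0, w - mw, (w - mw) ** 2] for w in words], grammar)

# A positive quadratic coefficient indicates a curve that opens upward (convex).
print("quadratic coefficient:", b_quad[2])
print("R^2 linear vs. quadratic:", r2_lin, r2_quad)
```

Because group means discard within-age variability, this check says nothing about residual patterns; it only shows the direction of curvature in the aggregated data.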
Predicting Grammar From the Lexicon

Consistent with the literature reviewed above, the relationship between vocabulary production and grammatical complexity was strong and curvilinear. When vocabulary production was used to predict grammar, the model explained 68% of the variance, F(1, 1459) = 3,080.43, B = 0.05, t(1459) = 55.50. (To remove nonessential multicollinearity, all predictors were grand-mean centered before computing the interaction and quadratic terms; Cohen et al., 2003.) Adding the quadratic term to the model (i.e., vocabulary squared) significantly increased the proportion of explained variance to 74%. The parameter estimates for the linear term, B = 0.04, and the quadratic term, B = 0.00008, were both significantly different from 0, ts(1458) = 51.78, 18.07, respectively. The model predictions are shown in the upper panel of Figure 1.

Following the logic outlined above, we examined the pattern of residuals across the predicted values of grammar. As can be seen in the middle panel of Figure 6, the pattern of residuals was systematically heteroscedastic. A simple initial approach to evaluating this type of pattern is to examine the absolute values of the residuals as a function of the predicted values of grammar, r(1460) = .53. This relationship is probably underestimated, because the scale is clearly bounded in the upper and lower regions. However, the pattern is very similar to that in panel (vi) of Figure 5.

A more sophisticated approach allows us to capture heteroscedastic patterns, such as this one, more completely. Because residual patterns can reflect both the variance of the errors and their mean values changing as a function of the predicted values, alternative methods are necessary to test these relationships appropriately. Therefore, we adopted a very flexible approach developed by Breusch and Pagan (1979) and Cook and Weisberg (1983) (see also Fox, 1991). This approach allows one to model residual patterns with standard regression techniques using a simple transformation of the residuals: The residual score for each individual is squared and standardized (i.e., divided by a term that closely approximates the sample variance). The regression procedure yields a test statistic that is asymptotically distributed as chi-square. (We present the details of computing these standardized scores, along with more information about obtaining the test statistic, in Appendix A.) When the predicted values of grammar were used to predict the standardized residuals, the positive relationship seen in the raw residuals was confirmed, $\chi^2(1) = 568.69$.

[Figure 6 axis labels: Number of Words Produced; Predicted Lexicon.]
Figure 6. The top panel shows the relationship between the number of words produced and grammatical complexity. The middle panel shows the residuals as a function of the predicted values of grammatical complexity when both number of words and its square are used as predictors. The lower panels show the residuals as a function of predicted values of lexicon when both grammatical complexity and its square are used as predictors.

Predicting Lexicon From Grammar

When grammar and the quadratic term (i.e., grammar squared) were used to predict vocabulary, the model results were similar to those described above; 73% of the variance was accounted for, F(1, 1458) = 1,927.97. Both parameter estimates were significantly different from 0, Bs = 19.72, -.40, for the linear and quadratic terms, respectively,
ts(1458) = 48.17, -15.81. The model predictions were nearly identical to those in the upper panel of Figure 1, so we do not plot them again.

The pattern of residuals is shown in the lower panel of Figure 6. As can be seen in the figure, the linear relationship between the absolute values of the residuals and the predicted values was very weak, r(1460) = -.073. Rather, the relationship appears to be strongly curvilinear. To describe the curvilinear nature of the relationship, we again used the familiar power polynomials. We used the predicted vocabulary values, and the square and cube of the predicted values, as predictors of the (signed) residual values. The parameter estimates for the linear, quadratic, and cubic terms were all significant, Bs = -0.287, -0.002, 0.00001, ts(1124) = -5.69, -9.97, 8.16. The pattern is curvilinear in much the same way as that in panel (ix) of Figure 5; it is cubic and there is very little linear relationship between the absolute values of the residuals and the predicted values of vocabulary.

This relationship can be tested more formally using the standardized squared residual scores described above. Squaring the raw residuals makes this cubic relationship become quadratic; the negative deviations are now positive. To test for this quadratic trend, we first entered the predicted values of vocabulary as a predictor of the squared residuals. Consistent with the descriptive analysis presented above, there was a weak, negative relationship between predicted vocabulary and the squared residuals, $\chi^2(1) = 4.65$. Adding the square of the predicted vocabulary scores dramatically increased the fit of the model, change in $\chi^2(1) = 34.08$. Also consistent with the descriptive analysis, there was a strong curvilinear relationship between the residuals and the predicted values.

In summary, these analyses of relations between lexicon and grammar in the CDI normative data reveal that the residuals are systematically related to the predicted values. The observed residual patterns are consistent with the possibility that lexicon and grammar develop synchronously, but that the measure of grammar is nonlinear such that early developmental changes are underrepresented relative to later ones. Put another way, the grammatical complexity measure appears to be systematically more responsive to the later developments in grammar. Therefore, the evidence considered thus far suggests that the lexicon and grammar may develop synchronously. These analyses indicate that the oft-cited curvilinear relationship between vocabulary size and grammar is actually a function of a nonlinear mapping between underlying grammar and its measure, and not a reflection of the actual underlying relationship at the level of the constructs. Next, we show how the nonlinear mapping approach can be extended to provide converging evidence on developmental ordering hypotheses.

Extending the Approach to Multiply Determined Systems

Most developmental theories propose that many variables impact the developing system and, therefore, contribute to the development of the item under investigation. For example, the development of children's theory of mind may depend on executive function, sibling relations, and the ability to relate multiple dimensions, as well as a number of other factors (e.g., Astington & Jenkins, 1999; Wellman, Cross, & Watson, 2001). In the current context, grammar has been hypothesized to emerge from the unified, developing language system. Lexical development may be playing a central role, but other codeveloping factors, such as working memory capacity, pragmatics, quality of social interactions, and so forth, would also be important. A common strategy in developmental work is to assume that these factors are largely, if imperfectly, age-related, and thus age can be used as a proxy for these additional codeveloping influences. For example, showing that one's theoretically important predictors explain variance in the dependent measure, above and beyond any variance explained by age, is standard (and reasonable) practice (see e.g., Kail, 2000; Marchman et al., 2004).

Here, we capitalize on the ability of age to stand in for other developmental influences, but within a somewhat different strategy. Recall that we showed above that error becomes associated with the variables at the underlying level as a result of the nonlinear mapping. For example, if $G_u = \alpha + \beta L_u + \varepsilon_u$, and $G_m$ is related to $G_u$ through a nonlinear mapping: $G_m = (G_u)^2 + \varepsilon_{mg}$, then the error term, $\varepsilon_u$, will become related to $L_u$, and ultimately to the predicted values of $G_m$. The underlying error term captures unexplained differences in grammar, including influences from factors not currently in the equation. If we have measures of these other factors, we can, of course, remove them from the error term by adding them to the equation. In the current context, we take age, $A_u$, as a proxy for unmeasured developmental influences that affect the child's underlying level of grammar; therefore, $\varepsilon_u = (\beta_a A_u + \varepsilon_u')$, and $G_u = \alpha + \beta L_u + \beta_a A_u + \varepsilon_u'$. All that has changed here, of course, is that we have unpacked the error term by including age effects in the model. This small change allows us to extend the approach
in a significant way. Just as error became associated with $L_u$ as a result of the nonlinear mapping, $A_u$ will become related to $L_u$; the nonlinear mapping creates the equivalent of a simple linear interaction term, $A_u L_u$. (This can be confirmed easily by expanding the squared equation.) A nonlinear mapping between the underlying level of grammar and its measure, therefore, predicts that age and vocabulary will interact in predicting grammar. When the nonlinear mapping is an accelerating function, as in the example above, the interaction term will be positive. When the nonlinear mapping is a decelerating function, the interaction term will be negative.

To test this prediction in the current dataset, we added age and its linear interaction with vocabulary (i.e., Age × Vocabulary) as predictors to the model for grammar (which already included both the linear and quadratic vocabulary terms). The results indicated that the interaction term was positive and contributed significantly to the model, B = 0.0032, t(1456) = 5.28; i.e., the effect of vocabulary increased with age. This effect is a straightforward consequence of a nonlinear construct-to-measure mapping. Terms that combine before the nonlinear mapping become positively related if the nonlinear mapping is an accelerating function.

It is also worth noting that the pattern of residuals remained systematically heteroscedastic across the predicted values of grammar when age and its interaction with vocabulary were included in the model. The correlation between the absolute value of the residuals and the predicted values was .40; the relationship between standardized residuals and the predicted values of grammar remained very strong, $\chi^2(1) = 222.81$. If a nonlinear mapping has occurred, error should remain associated with the predicted values, even when the set of true predictors is used in the model. While age is only a rough proxy for the set of true codeveloping variables, the fact that the heteroscedasticity is not ameliorated by its inclusion in the model further supports the interpretation that there is a nonlinear mapping between measure and underlying construct. Both the interaction between age and vocabulary, and the fact that including this term in the model did not eliminate the heteroscedasticity in the residuals, are consistent with a situation in which there is a nonlinear mapping between the measured and actual levels of grammar.

Synchrony Versus Opposite Priority

Thus far, we have only considered synchrony as a potential alternative hypothesis to priority, but another form of developmental ordering might also give rise to the observed curvilinear relationship between lexicon and grammar. Specifically, grammar may initially develop more rapidly than the lexicon. This alternative, which Dixon (2005) referred to as "opposite partial priority," posits that at the underlying level, the relationship between grammar and the lexicon is still curvilinear but opening downward, rather than upward, as seen at the observed level (i.e., concave vs. convex). In order for this opposite underlying relationship to give rise to the observed convex pattern, one or both of the measures must be dramatically nonlinear. Dixon (2005) showed that the same types of nonlinear mappings that produce the curvilinear relationship in a situation where the two underlying constructs were developing in synchrony would be necessary to produce a curvilinear relationship in the case of opposite priority. However, the nonlinearity would have to be even more extreme. For example, to go from even a fairly weak concave underlying relationship to a convex relationship, such as that observed between lexicon and grammar, would require a nonlinear mapping more extreme than the square.

As the nonlinearity between the underlying and measured levels increases (i.e., goes up the ladder of powers; Mosteller & Tukey, 1977), the relationships among the underlying terms change. Recall that when the nonlinear mapping approximates the square, it creates a product term involving lexicon and age, a simple linear interaction or moderator effect. However, if the nonlinear mapping approximates the cube, for example, then the interaction becomes quadratic. Moderators, like other predictors, may have simple linear effects or curvilinear effects (Baron & Kenny, 1986). The quadratic interaction prediction follows directly from the expansion of $(\alpha + \beta L_u + \beta_a A_u + \varepsilon_u')^3 + \varepsilon_{mg} = G_u^3 + \varepsilon_{mg} = G_m$. These equalities simply state that the measured values of grammar, $G_m$, are related to the underlying values of grammar, $G_u$, through a nonlinear mapping (i.e., the cube), and that lexicon, $L_u$, and age, $A_u$, are related to underlying grammar. (Because the expansion of the equation above is cumbersome, we present it in Appendix A, along with a description of a simple simulation of these relationships.) Testing for the presence of a quadratic interaction is straightforward; one creates a quadratic interaction term, which is then added to the model. In this case, the term is the product of lexicon and age squared: $L_m A_m^2$.

In summary, an opposite priority relationship at the underlying level could create the observed pattern, but only if a fairly extreme nonlinear mapping was at work. Such nonlinear mappings must create a
would still have had an excellent chance of reaching

set of specific relationships among the underlying
variables; a nonlinear mapping more extremethe same
thanconclusions. The approach does not require
the square will create a curvilinear, accelerating
the very large numbers necessary for norming an
interaction. Therefore, we can test whetherassessment
this typeinstrument; rather it can be used with
sample
of nonlinear mapping is driving the observed sizes more usually obtained in develop-
rela-
mental
tionship by testing for a positive, quadratic studies.
inter-
action.
To address this hypothesis we added A2 and the Discussion

quadratic interaction term to the model. (A2m must be
added so that the interaction term does not pick upPrevious work suggested that, across the first few
on the quadratic effect of Am itself.) As noted above,years of life, the strong developmental relationsh
if the observed relationship between lexicon andbetween the lexicon and grammar was curvilinear.
grammar was driven by a nonlinear mapping that isThis relationship was, at the very least, a phenom-
more extreme than the square, a significant, positiveenon that required explanation. Some of these ex
quadratic interaction is predicted. However, the pa-planations have significant theoretical implications
rameter estimate for the quadratic interaction term For example, the curvilinear relationship led re
was nonsignificant, B = 0.00008, t(1455) = 0.16. The searchers to hypothesize that lexical developmen
power of this test is estimated at .97. Therefore, themight drive grammatical development. According t
hypothesis that grammatical development actually one version of this hypothesis, the growing lexico
precedes lexical development is not supported. provides the foundation for grammar learning, an
contributes fundamentally to the organization of in
creasingly complex grammatical forms. Specificall
Power and Sample Size Issues
the developmental precedence of lexicon ove
The current data set is very large by most stan- grammar was consistent with proposals in whic
dards, which might raise concerns about the feasi- grammatical principles emerge in a system that h
bility of this approach with the smaller data sets thatbuilt up a sufficient lexical base to support the fu
researchers typically possess. Therefore, we con-ther abstraction of grammatical regularities, that is, a
ducted a power analysis (Cohen, 1988) to estimatecritical mass (Marchman & Bates, 1994). This pro
the sample sizes necessary to obtain reasonable posal derived from the converging evidence of th
power (.80) for the more central tests of the nonlinear observed developmental asynchrony between lexic
mapping explanation within the current data set. and grammatical development in naturalistic an
First, consider the relationship between the parent report studies of children, as well as analyse
standardized residuals and the predicted values ofof the computational principles captured in model
grammar. Given the magnitude of the observed ef-of language learning (Bates & Goodman, 199
fect, an N of 50 would be necessary to achieve aPlunkett & Marchman, 1993).
power of .80, given an a of 0.05. This is a modest However, the analyses presented here strongl
sample size by most standards. Next, consider thesuggest that the actual developmental relationsh
curvilinear relationship between the standardizedbetween reported vocabulary and grammatic
residuals and the predicted values of vocabulary. Ancomplexity may be qualitatively different from th
N of 270 would be needed to obtain a power of .80, seen at that level of the measures. We found patter
with a set at 0.05. This is a fairly large sample size,of residuals consistent with the proposal that d
but then detecting relationships represented byvelopments in lexicon and grammar actually occur
higher order polynomials often requires larger sam-synchronously. The observed relation between lex
ples. Finally, consider the effect of adding age and itscon and grammar appears to be curvilinear becaus
interaction with vocabulary to the model for gram-the mapping from the underlying grammar to its
mar, f2= .087, between a small and medium effect, measure is nonlinear. Thus, while the original pro
according to Cohen. With a set at 0.05, an N of 115 isposals required an explanation of the development
needed to obtain a power of .80, a sample size that ispriority of the lexicon over grammar, it appears th
within reach for many types of investigations. the curvilinear relationship is actually an artifact of
Of course, the effect sizes observed in the currentnonlinear mapping between underlying gramm
study may or may not be similar to those in otherand its measure.
research domains. The main point here is that had Furthermore evidence for the curvilinear lexicon-
this study been conducted with far fewer partici- grammar relation deriving from a nonlinear map-
pants (i.e., <20% of the current sample size), it ping at the level of the measures was obtained in
subsequent analyses that focused on the consequences of adding age to the model as a proxy for global developmental changes. According to the nonlinear mapping explanation, age and vocabulary should interact (positively and linearly) in predicting grammar. As predicted, the interaction between age and vocabulary was strong and accounted for significant variance. This interaction is a direct prediction from the mathematics of nonlinear mapping and, therefore, provides converging evidence for the proposal that the measure of grammar is an accelerating function of the underlying construct. The hypothesis that lexicon leads grammar cannot easily accommodate the interaction; there is nothing in this proposal that predicts that the effect of the lexicon should increase with age.

What are the consequences of these findings for theories of lexical-grammar relations? We must first note that several previous studies have proposed that lexical and grammatical growth do indeed proceed synchronously. For example, Anisfeld, Rosenberg, Hoberman, and Gasparini (1998) reported on timing relations between changes in the growth of vocabulary size and the onset of word combinations examined at weekly intervals over the course of 8-10 months in five children. They found that accelerations in lexical growth (i.e., the vocabulary spurt) were coincident with the onset of word combinations and that lexical growth tended to decelerate with continued growth in word combinations (see also Dromi, 1987; van Geert, 1991). These close-timing synchronies were interpreted as evidence that grammar learning is well underway before the vocabulary spurt and that grammatical analyses facilitate further lexical learning (i.e., a type of syntactic bootstrapping, Gleitman & Gleitman, 1992; Naigles, 1990). Thus, lexical growth does not occur presyntactically, but rather is "part and parcel of the child's transition to grammatical language" (Anisfeld et al., 1998, p. 166).

But, does evidence for developmental synchrony of lexicon and grammar undermine the notion that the developing lexicon provides a critical foundation for grammar learning? While it may seem at first blush to be the case, the notion of a critical mass is not in and of itself incompatible with the idea that systemic change is ongoing simultaneously in both lexical and grammatical domains. Even if a sufficient number of lexical forms may be required for the system to abstract and apply certain grammatical regularities, the computational principles guiding these abstractions are operating in the context of a highly interactive and multiply determined system. In such systems, changes are simultaneously occurring at multiple levels across the system, to a greater or lesser degree, and are not limited to particular time windows. It would be unwarranted to assume that the child is not learning anything about grammar until grammatical principles are able to manifest themselves in observable ways (e.g., the production of overregularized forms). It is well known that children demonstrate sensitivity to grammatical regularities in comprehension before the point when they can reliably use those grammatical forms in production (e.g., Shipley, Smith, & Gleitman, 1969). Thus, the fact that parents report that children show little grammatical sophistication during periods of early lexical growth is likely to be more a function of the form of the knowledge that is being reported (i.e., production of closed-class inflections or multiword phrases), rather than an index of a lack of grammatical knowledge per se.

In addition, the mechanisms underlying critical mass effects are most reasonably explained in terms of links between precise lexical accomplishments and the emergence of very particular abstract patterns that form the basis for grammatical regularities. For example, it seems reasonable to propose that children's productive use of past tense forms (i.e., overregularization errors) should be most dependent on learning a particular set of lexical items that provide the relevant cues from which to abstract the regular past tense pattern, rather than total vocabulary size. Indeed, rather than a general measure like total vocabulary size and grammatical complexity, the analyses presented in Marchman and Bates (1994) and the computational model of Plunkett and Marchman (1993) focused on relations between the number of regular verb forms and the productive use of past-tense inflections. In children, these more focused measures of lexical accomplishments (i.e., size of regular verb lexicon) are highly intercorrelated with "omnibus" measures of reported vocabulary. Yet, it would be naive to assume that these strong intercorrelations indicate that critical mass effects are actually operating in a monolithic fashion across all of lexical development and all of grammar. Thus, it would be productive for future work to go beyond global measures of lexical and grammatical progress and to begin to map in more precise ways which particular features of children's lexical knowledge do (and do not) serve as the foundational precursors for the child's abstraction of specific grammatical regularities.

Finally, recent work has identified crosslinguistic differences in the shape of lexical-grammar relations that depend on the particular measures used. In Caselli et al. (1999), lexical-grammar relations in both English and Italian exhibited the characteristic nonlinear
function when grammar was measured using the complexity scale from the CDI, as analyzed here. However, linear relations were seen when grammar was indexed by the child's reported use of function words, a finding that was interpreted to suggest that Italian-learning children might require a smaller lexical base of open class content words before they can begin to produce closed-class function words. Furthermore, in a recent follow-up study, Devescovi et al. (2005) reported that linear relations were again observed in Italian (but not English) when grammar was indexed by yet another measure, "mean of the three longest utterances" (M3L). Such crosslinguistic differences in the shape of lexical-grammatical relations remain compatible with a view in which lexical development drives the abstraction of grammatical regularities to the extent that these differences can be accounted for under the assumptions of a multiply determined and highly interactive system. Thus, linear lexical-grammar relations would be more likely to occur in Italian than English, given that the "relatively rich, regular and consistently marked grammatical system in Italian may provide an easier target, requiring few exemplars (and smaller vocabularies) to support extraction of strong generalizations" (Devescovi et al., 2005, p. 783). These sorts of predictions fall naturally out of a view of language in which lexical and grammatical development are tightly yoked within the context of a unified learning system. Future studies should continue to examine the degree to which these developmental relations are obtained empirically in different languages and across a wider range of measures.

Of course, the conclusion that lexicon and grammar develop synchronously across the developmental period cannot escape another set of theoretical implications. Like the proposal that lexical development drives grammatical development, the developmental synchrony of lexicon and grammar could be a consequence of them both being driven by a single underlying factor. For example, if lexical and grammar learning occurs within a unified system, then particular features of the input to the system (i.e., a common set of environmental influences) may control both the acquisition of new words and new grammatical forms. It is well known that the features of the talk that children hear (i.e., quality and quantity of speech to the child) can substantially impact vocabulary learning (e.g., Hart & Risley, 1995; Huttenlocher, Haight, Bryk, & Seltzer, 1991), and have long-term consequences for performance on assessments of language and cognitive skill more generally. Thus, it is feasible to propose that the specific features of the language-learning environment (e.g., amount or quality of talk) that enhance children's learning of words might also be those that enhance the mechanisms guiding the acquisition of grammar. Importantly, such an interpretation would not require that lexicon and grammar share key mechanisms or representations in any theoretically important way.

Alternatively, developmental change in the system may be governed by a single system-internal control parameter. For example, Thelen, Schöner, Scheier, and Smith (2001) showed that developmental change in the A-not-B error could be explained by changes in the "cooperativity" of the dynamic field, a single parameter that controls the field's resting level. Thelen et al. described differences in this parameter as analogous to changing the weight of internal processes relative to the weight of external input. The synchrony relationship is consistent with the idea that these two aspects of language are controlled by a single, system-wide parameter; "cooperativity" might be one candidate, but clearly others would also be worth investigating.

One such candidate could be general cognitive or information processing efficiency (Kail, 2000). Several recent studies have suggested that improvements in the efficiency of spoken language understanding are associated with growth in expressive vocabulary size (Fernald, Swingley, & Pinto, 2001; Fernald, Perfors, & Marchman, 2006; Zangl, Klarman, Thal, Fernald, & Bates, 2005). Indeed, it may be the case that children's early success in language learning is in fact facilitated initially by their limited processing capacity (e.g., Elman, 1993; Newport, 1990). If the subsequent development of language is linked to global characteristics of a child's developing information processing system, it is certainly reasonable to propose that processing factors could impact a child's progress in both learning words and learning grammar, in the absence of any more specific links between the two.

The possibility that lexical-grammatical links, whether they be synchronous or otherwise, are entirely indirect, driven by the mutual impact of mechanisms, or representational requirements that operate outside the lexical and grammatical systems per se, has been addressed in several recent studies. For example, using multivariate behavioral genetic techniques, Dale et al. (2000) argued that there is a substantial genetic influence on the relationship between vocabulary and grammar and that general abilities lacking a strong verbal component are not likely to be responsible for pacing the developments in both domains. Furthermore, in a recent follow-up study, Dionne, Dale, Boivin, and Plomin (2003) replicated the heritability findings of the 2 years sample reported in Dale et al. (2000), but also attempted to
address the directionality of the effects using a cross-lagged technique. They concluded that lexical knowledge was related to grammatical level, as well as grammatical level facilitating lexical learning (i.e., syntactic bootstrapping).

Finally, recent studies with children learning two languages have demonstrated that lexical-grammar links do not "spill over" from one language to the other, but rather are closely yoked to the accomplishments within each of the languages being learned (Conboy & Thal, 2006; Marchman et al., 2004). In Marchman et al.'s study of more than 110 children learning both English and Spanish, the results indicated that grammatical abilities were strongly tied to lexical level in each language (i.e., Spanish grammar scores were predicted by Spanish vocabulary; English grammar scores were predicted by English vocabulary). Furthermore, grammatical accomplishments in each language were uniquely related to vocabulary scores in each language, and hence, were not attributable to other factors that may have been guiding the child's general progress in language learning (e.g., age, mother's years of education, relative English-to-Spanish exposure, lexicon, and grammar in the other language). Similar patterns were seen using naturalistic language samples. These results suggest that lexical-grammar links are not an artifact of a global language-learning ability, but rather that these accomplishments are tied together in a very precise way in the context of solving a very precise problem (i.e., becoming a proficient language user of a particular language).

One other type of explanation may be advanced to explain the synchrony relationship: lexical and grammatical development may be mutually and reciprocally influential at a relatively fine time-scale. That is, small gains in one result in small gains in the other. According to this hypothesis, separate systems of lexicon and grammar are involved in a constant, cyclical interchange. These sorts of bidirectional and continuous effects might appear as synchrony relations (although they could also create other functional forms as well), but are actually the consequence of temporally fine-grained interactions across, rather than within, systems. To our knowledge, studies designed to test this kind of continuous relationship have yet to be undertaken.

While this is no exhaustive list of the hypotheses capable of explaining the strong lexicon and grammar associations without resorting to an emergentist view, such hypotheses will generally either contain a third factor that drives the relationship or some form of reciprocal causation. Clearly, ruling out all possible alternative explanations is a daunting task. However, the evidence for strong relations between lexicon and grammar is strikingly robust across studies that examine a variety of populations and adopt several different methodologies. Furthermore, recent studies have taken important steps toward ruling out several of the leading alternative hypotheses, that is, artifacts of common environmental influences or a general cognitive or language-learning skill. Future work should continue to explore a broad range of alternative explanations, and their implications for interpreting developmental ordering as evidence of shared system-internal processes.

An Example of Testing Developmental Ordering

A second major purpose of the current article was to provide a complete example of how the nonlinear-mapping approach (Dixon, 2005) could be used to test developmental ordering hypotheses. We showed that the curvilinear relationship between lexicon and grammar could result from synchrony at the underlying level, but only under a very limited set of nonlinear mapping conditions. Specifically, the measure of grammar must be an accelerating function of underlying grammar and/or the measure of lexicon must be a decelerating function of underlying lexicon. Because these nonlinear mappings between the measure and underlying construct will create specific relationships between error and the underlying variables, they make predictions about the patterns that will be observed in the residuals.

Although the model fits were quite good by most standards (e.g., 73-74% of the variance explained), the residuals were systematically heteroscedastic. When vocabulary was used to predict grammar, the pattern showed a pronounced wedge shape: the variance increased as the predicted values of grammar increased. To quantify this relationship descriptively, we correlated the absolute value of the residuals and the predicted values. This very familiar procedure captures the pattern to some extent, but a slightly more complex strategy is required to model formally the relationship between residuals and predicted values. Using a type of standardized residual and appropriate test statistic (which collectively handle the violations of distributional assumptions that are problematic for testing residual patterns), we showed a strong, positive relationship between the predicted values and residuals.

A very different pattern of residuals emerged when grammar was used to predict vocabulary. The pattern appears curvilinear with two bends (i.e., cubic). Both the familiar descriptive methods (i.e., OLS regression) and the more formal analysis of standardized residuals confirmed the visual impression in the plot. The pattern of residuals

was cubic and there was only a very weak negative relationship with the predicted values.

Both these residual patterns were consistent with those predicted by the hypothesis that the lexicon and grammar develop synchronously but that the measure of grammar was an accelerating function of the underlying level. The only other logical possibility (i.e., the measure of lexicon is a decelerating function of underlying lexicon) capable of creating the observed curvilinear data pattern predicted quite different residual patterns. Note that had we observed homoscedastic patterns or heteroscedastic patterns that conflicted with the predicted patterns (e.g., a negative relationship between the absolute values of the residuals and predicted values of grammar), the proposal that lexicon and grammar develop in synchrony would have been placed under considerable pressure. Because the nonlinear mapping must create the predicted residual patterns, a failure to observe them is good evidence against nonlinear mapping. Indeed, the only way to arrive at a homoscedastic pattern given a nonlinear mapping (and nontrivial error at the underlying level) is to have a true association between error and the variable of interest that runs in the opposite direction. Making inferences from correlational data requires the assumption that such compensatory relationships are not driving the results. In this way, failure to observe the predicted residual patterns would constitute correlational evidence against a nonlinear mapping between the measure and underlying construct.

Extending the nonlinear mapping approach to multiply determined systems. Most current approaches to developmental science assume that development is a complex, multiply determined (or multiply probabilistic) phenomenon. Researchers hope to identify many, perhaps even most, of the central factors that contribute to the development of a particular item (e.g., a structure, skill, ability), but rarely would one assert that they knew the full set of developmental influences. We often handle this multiplicity of developmental influences by using proxy variables, such as age, socioeconomic status, birth weight, and so forth; we assume that these global variables capture some of the variance attributable to unmeasured developmental dimensions. For example, age is often taken as a proxy for the host of dimensions that are believed to develop across time.

In the current context, we showed how the nonlinear mapping approach can be extended to capitalize on the multiply determined nature of the developmental system. We noted that the error term at the underlying level carries effects that are not currently in the model. Adding a proxy variable to the model essentially splits its effect from the underlying error term. Because the proxy variable will have its effects on the measure through the nonlinear mapping, it must become related to the other predictor (or predictors) in the model, just as the error term did. That is, the nonlinear mapping must create an interaction among the predictors. Therefore, a second level of evidence can be brought to bear on the proposal of a nonlinear mapping between the measure and construct: if a nonlinear mapping exists, variables that capture general or global effects (and are often of little theoretical interest) are predicted to interact with the other predictors in the model.

A corollary of this prediction is that the heteroscedastic nature of the residual patterns will remain after the proxy variable and the interaction term are added to the model. Because the heteroscedastic patterns are caused by the nonlinear mapping, adding additional terms to the model cannot remove the relationship between error and the predictors. Whatever error is left at the underlying level must still undergo the nonlinear mapping and, therefore, become related to the other terms in the model. This point is important to note, because an alternative explanation for heteroscedasticity is that a variable, which is correlated with the current predictors, is missing from the model. According to this alternative explanation, it should be possible to nearly eliminate the heteroscedasticity of the residuals. The nonlinear mapping account predicts that heteroscedasticity cannot be eliminated in this way.

Sources of Nonlinear Mapping

It is clearly disheartening to find out that a measure is not doing a consistent job of capturing change across the developmental period of interest (i.e., there is a nonlinear mapping between one's measure and the construct of interest), as we have demonstrated for grammatical complexity, in the current case. How and why does this situation arise? It is likely that the reasons for a nonlinear mapping will vary depending on the features of the particular measures and the characteristics of the particular constructs under consideration. For parent report, it is not surprising that the kinds of grammatical changes observed early in development are less likely to be those that parents can easily report on. For other measures, the situation might arise due to different developmental requirements of the responses that are required by the child (e.g., drawing,
pointing at pictures). As mentioned above, studies using methodologies with different task demands (e.g., looking preference) have demonstrated that children do know something about grammar earlier than they can begin to produce early word combinations similar to those listed on the CDI (e.g., Naigles, 2002). Thus, future studies might find it advantageous to utilize a combination of techniques to assess changes that occur both early and late in the period, and how those changes relate to lexical growth.

The current findings do not discredit the CDI as a useful instrument to measure language progress, nor do they bear exclusively or particularly to measures that are derived from parent report (see Fenson et al., in press for a discussion of the strengths and limitations of parent report). This particular instrument, and the technique of parent report more generally, remain worthwhile methods for developmentalists to use. As with any measure, the data derived from the CDI must be interpreted in light of potential nonlinearities in measurement. We have demonstrated here that nonlinear mapping in measurement can have significant theoretical consequences. Nonlinearities of the sort uncovered for the grammatical complexity measure from the CDI are almost certainly not unique to this instrument; indeed, they are likely to be more common than not in our field. The techniques described here illustrate that knowing more, rather than less, regarding the relations between the measures and underlying variables can have important consequences for hypotheses about the developmental ordering of core theoretical constructs.

Summary and Conclusion

We examined the developmental relationship between lexicon and grammatical complexity using data from the norming sample of the MacArthur-Bates CDI: W&S. Results suggested a significant reinterpretation of the nonlinear relation between lexicon and grammar, a data pattern that was previously thought to imply that lexical development precedes grammar. Furthermore, we showed how to extend the nonlinear mapping approach (Dixon, 2005) to multiply determined systems, such that a second level of evidence was brought to bear on the question of developmental ordering. The analyses converge on the conclusion that the lexicon and grammar are actually developing in synchrony across the first few years of life. This finding has important implications for theories of language acquisition that view lexical and grammatical development within a unified system that shares important representational and computational resources.

References

Akhtar, N. (1999). Acquiring basic word order: Evidence for data-driven learning of syntactic structure. Journal of Child Language, 26, 339-356.
Anisfeld, M., Rosenberg, E. S., Hoberman, M. J., & Gasparini, D. (1998). Lexical acceleration coincides with the onset of combinatorial speech. First Language, 18, 165-184.
Astington, J. W., & Jenkins, J. M. (1999). A longitudinal study of the relation between language and theory-of-mind development. Developmental Psychology, 35, 1311-1320.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.
Bates, E., Dale, P. S., & Thal, D. J. (1995). Individual differences and their implications for theories of language development. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language. Cambridge, MA: Blackwell.
Bates, E., & Goodman, J. C. (1997). On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. In G. Altmann (Ed.), (Special issue on the lexicon), Language and Cognitive Processes, 12, 507-586.
Bates, E., & Goodman, J. C. (1999). On the emergence of grammar from the lexicon. In B. MacWhinney (Ed.), The emergence of language (pp. 29-70). Mahwah, NJ: Erlbaum.
Bates, E., Marchman, V., Thal, D., Fenson, L., Dale, P. S., Reznick, J. S., et al. (1994). Developmental and stylistic variation in the composition of early vocabulary. Journal of Child Language, 21, 85-123.
Braine, M. D. S. (1976). Children's first word combinations. Monographs of the Society for Research in Child Development, 41(1, Serial No. 164).
Bresnan, J. (Ed.). (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press.
Breusch, T. S., & Pagan, A. R. (1979). A simple test for heteroscedasticity and random coefficient variation. Econometrica, 47, 1287-1294.
Caselli, M. C., Casadio, P., & Bates, E. (1999). A comparison of the transition from first words to grammar in English and Italian. Journal of Child Language, 26, 69-111.
Cazden, C. B. (1968). The acquisition of noun and verb inflections. Child Development, 39, 433-448.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Erlbaum.
Cohen, L. B., & Chaput, H. H. (2002). Connectionist models of infant perceptual and cognitive development: Comment. Developmental Science, 5, 173-175.
Conboy, B. (2002). Patterns of language processing and growth in early English-Spanish bilingualism. Unpublished doctoral dissertation, University of California, San Diego and San Diego State University.
Conboy, B. T., & Thal, D. J. (2006). Ties between Fox, theJ.lexicon

(1991). Regression diagnostics. Newbury Park, CA: Sage.
and grammar: Cross-sectional and longitudinal Gleitman,
studies L. R., & Gleitman, H. (1992). A picture is worth a
of bilingual toddlers. Child Development, 77, 712-735. thousand words, but that's the problem: The role of
Cook, R. D., & Weisberg, S. (1983). Diagnostics for syntax
heter-in vocabulary acquisition. Current directions in
oscedasticity in regression. Biometrika, 70, 1-10.Psychological Science, 1, 31-35.
Courage, M. L., & Howe, M. L. (2004). AdvancesGoldberg, in earlyA. E. (1999). The emergence of the semantics of
memory development research: Insights from the argument
dark structure constructions. In B. MacWhinney,
side of the moon. Developmental Review, 24, 6-32. (Ed.), The emergence of language (pp. 197-212). Mahwah,
Dale, P. S., Dionne, G., Eley, T. C., & Plomin, R. NJ:(2000).
Earlbaum.
Lexical and grammatical development: A behavioral Hart, B., ge-
& Risley, T. (1995). Meaningful differences in the
netic perspective. Journal of Child Language, 27, 619-642. everyday experience of young American children. Baltimore:
Devescovi, A., Caselli, M. C., Marchione, D., Pasqualetti, Paul H. Brookes
P., Publishing Co.
Reilly, J., & Bates, E. (2005). A crosslinguistic study Huttenlocher,
of the J., Haight, W., Bryk, A., & Seltzer, M. (1991).
relationship between grammar and lexical development. Early vocabulary growth: Relation to language input
Journal of Child Language, 32, 759-786. and gender. Developmental Psychology, 27, 236-248.
Dionne, G., Dale, P. S., Boivin, M., & Plomin, R. (2003). Jackson-Maldonado, D., Thal, D. J., Fenson, L., Marchman,
Genetic evidence for bidirectional effects of early lexical V., Newton, T., & Conboy, B. (2003). El Inventario del
and grammatical development. Child Development, 74, desarrollo de habilidades comunicativas: User's guide and
394-412. technical manual. Baltimore: Paul H. Brookes Publishing
Co. and
Dixon, J. A. (1998). Developmental ordering, scale types,
strong inference. Developmental Psychology, 34, 131Kail,-145.
R. (2000). Speed of information processing: Devel-
opmental change and links to intelligence. Journal of
Dixon, J. A. (2005). Strong tests of developmental ordering
hypotheses: Integrating evidence from the second mo-
School Psychology, 38, 51 -56.
ment. Child Development, 76, 1-23. Lieven, E. V. M., Pine, J. M., & Baldwin, G. (1997). Lexic-
Dromi, E. (1987). Early lexical development. Newally-based
York: learning and early grammatical develop-
Cambridge University Press. ment. Journal of Child Language, 24, 187-219.
Ejiri, K., & Masataka, N. (2001). Co-occurrence of preverbal MacWhinney, B. (2001). Emergence from what? Journal of
vocal behavior and motor action in early infancy. De-Language, 28, 726-732.
Child
velopmental Science, 4, 40-48. MacWhinney, B. (2004). A multiple process solution to the
Elman, J. L. (1993). Learning and development in neural logical problem of language acquisition. Journal of Child
networks: The importance of starting small. Cognition, Language, 31, 883-914.
48, 71-99.
Elman, J. L. (2001). Connectionism and language acquisition. In M. Tomasello & E. Bates (Eds.), Language development: The essential readings. Malden, MA: Blackwell.
Elman, J. L. (2004). An alternative view of the mental lexicon. Trends in Cognitive Sciences, 8, 301-306.
Fenson, L., Dale, P. S., Reznick, J. S., Bates, E., Thal, D. J., & Pethick, S. J. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development, 59(5, Serial No. 242).
Fenson, L., Dale, P. S., Reznick, J. S., Thal, D., Bates, E., Hartung, J. P., et al. (1993). The MacArthur communicative development inventories: User's guide and technical manual. San Diego: Singular Publishing Group.
Fenson, L., Marchman, V. A., Thal, D., Dale, P., Reznick, S., & Bates, E. (2007). MacArthur-Bates communicative development inventories: User's guide and technical manual (2nd ed.). Baltimore: Brookes Publishing Co.
Fernald, A., Perfors, A., & Marchman, V. (2006). Picking up speed in understanding: How increased efficiency in on-line speech processing relates to lexical and grammatical development in the second year. Developmental Psychology, 42, 98-116.
Fernald, A., Swingley, D., & Pinto, J. (2001). When half a word is enough: Infants can recognize spoken words using partial phonetic information. Child Development, 72, 1003-1015.
Maital, S. L., Dromi, E., Sagi, A., & Bornstein, M. H. (2000). The Hebrew Communicative Development Inventory: Language specific properties and cross-linguistic generalizations. Journal of Child Language, 27, 43-67.
Marchman, V. A. (1997). Models of language development: An "emergentist" perspective. Mental Retardation & Developmental Disabilities Research Reviews, 3, 293-299.
Marchman, V. A., & Bates, E. (1994). Continuity in lexical and morphological development: A test of the critical mass hypothesis. Journal of Child Language, 12, 339-366.
Marchman, V. A., Martinez-Sussmann, C., & Dale, P. S. (2004). The language-specific nature of grammatical development: Evidence from bilingual language learners. Developmental Science, 7, 212-224.
Marchman, V. A., Plunkett, K., & Goodman, J. (1997). Overregularization in English plural and past tense inflectional morphology: A response to Marcus. Journal of Child Language, 24, 767-779.
Marcus, G., Pinker, S., Ullman, M., Hollander, J., Rosen, T., & Xu, F. (1992). Overregularization in language acquisition. Monographs of the Society for Research in Child Development, 57(Serial No. 228).
McClelland, J. L., & Rogers, T. T. (2003). The parallel distributed processing approach to semantic cognition. Nature Reviews Neuroscience, 4, 1-14.
McGregor, K. K., Sheng, L., & Smith, B. (2005). The precocious two-year-old: Status of the lexicon and
links to the grammar. Journal of Child Language, 32, 563-585.
Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression: A second course in statistics. Reading, MA: Addison-Wesley.
Munakata, Y., & McClelland, J. L. (2003). Connectionist models of development. Developmental Science, 6, 413-429.
Naigles, L. G. (1990). Children use syntax to learn verb meanings. Journal of Child Language, 17, 357-374.
Naigles, L. R. (2002). Form is easy, meaning is hard: Resolving a paradox in early child language. Cognition, 86, 157-199.
Newport, E. L. (1990). Maturational constraints on language learning. Cognitive Science, 14, 11-28.
Pinker, S. (1999). Words and rules: The ingredients of language. New York: Basic Books.
Plaut, D. C., & Kello, C. T. (1999). The emergence of phonology from the interplay of speech comprehension and production: A distributed connectionist approach. In B. MacWhinney (Ed.), The emergence of language (pp. 381-415). Mahwah, NJ: Erlbaum.
Plunkett, K., & Marchman, V. (1993). From rote-learning to system building: The acquisition of morphology in children and connectionist nets. Cognition, 48, 21-69.
Shipley, E. F., Smith, C. S., & Gleitman, L. R. (1969). A study of the acquisition of language: Free response to commands. Language, 45, 322-342.
Stevens, S. S. (1951). Mathematics, measurement, and psychophysics. In S. S. Stevens (Ed.), Handbook of experimental psychology (pp. 1-49). New York: Wiley.
Thelen, E., Schöner, G., Scheier, C., & Smith, L. B. (2000). The dynamics of embodiment: A field theory of infant perseverative reaching. Behavioral and Brain Sciences, 24, 1-86.
Thomas, M. S. C., & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical phenotypes. Psychological Review, 110, 647-682.
Thordardottir, E. T., Weismer, S. E., & Evans, J. L. (2002). Continuity in lexical and morphological development in Icelandic and English-speaking 2-year-olds. First Language, 22, 3-28.
Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
van Geert, P. (1991). A dynamic systems model of cognitive and language growth. Psychological Review, 98, 3-53.
Wellman, H. M., Cross, D., & Watson, J. (2001). A meta-analysis of false belief reasoning: The truth about false belief. Child Development, 72, 655-684.
Zangl, R., Klarman, L., Thal, D. J., Fernald, A., & Bates, E. (2005). Dynamics of word comprehension in infancy: Development in timing, accuracy, and resistance to acoustic degradation. Journal of Cognition and Development, 6, 179-208.
Zevin, J. D., & Seidenberg, M. S. (2004). Age-of-acquisition effects in reading aloud: Tests of cumulative frequency and frequency trajectory. Memory and Cognition, 32, 31-38.

Appendix A

Modeling heteroscedasticity with standardized residuals. To compute the standardized squared residuals, one first creates a term closely related to the mean squared error from the current regression model. However, the sum of squared (SS) error is divided by N, rather than N - k - 1. This modified variance term is therefore: $\hat{\sigma}^2 = \sum (e_i)^2 / N$, where $e_i$ is the residual for individual i. This value is easily obtained because $\sum (e_i)^2$ is the SS error term from the regression that generated the residuals. Next, each residual term is squared and divided by $\hat{\sigma}^2$. These standardized squared residuals are then subjected to a regular OLS regression analysis. For example, to test whether the predicted values of grammar were linearly related to the residual values, we used the predicted values of grammar as the predictor variable and the standardized squared residuals as the dependent variable. The final step is to obtain the sums of squares for the model from the regression on the standardized squared residuals. Dividing this value by 2, SSmodel/2, yields a test statistic that is asymptotically distributed as chi-square, on degrees of freedom equal to the number of predictors in the model. See Fox (1991) for a very accessible review of this approach and a discussion of alternative methods.

An example of a more extreme nonlinear mapping: The cube. First, we consider the expansion of $(\beta_0 + \beta_l \cdot L_u + \beta_a \cdot A_u + E_u)^3$. To increase the clarity of the presentation, we use the following notation: $L = \beta_l \cdot L_u$, $A = \beta_a \cdot A_u$, $E = E_u$. Assuming that $\beta_0 = 0$, the expansion of the cubed equation yields the following terms: $L^3 + A^3 + E^3 + 3L^2A + 3LA^2 + 3L^2E + 3LE^2 + 3A^2E + 3AE^2 + 6LAE$. The expansion shows the quadratic interaction term, $3LA^2$.

To show the quadratic prediction more completely, we created a simple model of this nonlinear mapping situation, given an underlying relationship in which the lexicon developed more slowly than grammar (i.e., the opposite partial priority hypothesis). We created a data set in which the underlying relationship between grammar, lexicon, and age was: $G_u = \beta_0 + \beta_l \cdot (L_u^{.6}) + \beta_a \cdot A_u + E_u$, where $\beta_l = 20$, $\beta_a = 1$, and $E_u$ was drawn from a normal distribution with a mean of 0 and variance of 50.

This model creates a curvilinear relationship such that grammar develops slightly more quickly than the lexicon; when the underlying relationship is plotted as in Figure 1, the curve is concave (i.e., opening downward) rather than convex (i.e., opening upward). We explicitly chose to model a mild version of the opposite priority hypothesis, because it
requires a less extreme nonlinear mapping to create the observed, convex pattern. Less extreme nonlinear mappings will have smaller effects.

The nonlinear mapping between underlying grammar and the measure of grammar was represented as the cube: $G_m = (G_u)^3 + E_m$. The measure of lexicon, $L_m$, was a linear function of underlying lexicon, $L_u$. The error term, $E_m$, was drawn from a normal distribution with a mean of 0 and variance of 50. Finally, to simulate the developmental relationships between lexicon, grammar, and age more closely, the measure of lexicon was created such that it was correlated with age (as well as with grammar). The correlations among the three variables (grammar, lexicon, and age) closely approximated the zero-order correlations in the CDI data set (.85, .67, and .64, for grammar-lexicon, lexicon-age, and grammar-age, respectively).

When $L_m$ and $L_m^2$ were used to predict $G_m$, the model fit was, of course, very good, $R^2 = .88$. Adding Age and the linear interaction term, lexicon × age, contributed significantly to the model, increasing the fit by 7% and 4%, respectively. The crucial question for our current purposes, however, was whether adding the quadratic interaction term, lexicon × age², to the model would also improve the fit, as suggested by the expanded equation presented above. The quadratic interaction was significant, B = 11.06, t(993) = 12.88, when added to a model that included all the terms above and Age².

This simple model demonstrates that, given even a weak underlying opposite-priority relationship (i.e., a concave curve), the nonlinear mapping necessary to create the observed relationship (i.e., a convex curve) also creates a quadratic interaction between the major predictor (i.e., lexicon) and the secondary variable (i.e., age). Therefore, testing for this quadratic interaction provides one way to distinguish between an underlying opposite-priority relationship and an underlying synchrony relationship.
