Towards a ‘Converging Theories’ Model of Language Acquisition: Continuing Discontinuity
Joseph Galasso
English Department, California State University, Northridge
joseph.galasso@csun.edu
Incomplete Working Draft: Spring 2003
Abstract
The Dual Mechanism Model credits the Brain/Mind with having two
fundamentally different cognitive modes of language processing—this
dual mechanism has recently been reported as reflecting inherent
qualitative distinctions found between (i) regular verb inflectional
morphology (where rule-based stem+affix forms constitute a large
contingent), and (ii) irregular verb constructions (where full lexical
forms seem to be stored as associative chunks). In this paper, we examine
the Dual Mechanism Model and broaden its scope to cover the
overall grammatical development of Child First Language Acquisition.
Proposal
This paper proposes new accounts of old issues surrounding child
first language acquisition. The general framework of our proposal
is based upon hybrid theories—proposals stemming from recent investigations
in the areas of PDP-style connectionism, as well as from more
naturalistic studies, and sample-based corpora of Child Language
Acquisition. Much of what is sketched out here attempts to converge
the leading tenets of two major schools-of-thought—namely, Associative
Frequency learning and/vs. Symbolic Rule learning. Cast in this
new tenor, proponents of a Dual Mechanism Account have emerged,
advocating a dual cognitive mechanism to deal with the processing
differences found amongst regular and irregular verb inflectional
morphology (inter alia). The main task
of this paper is (i) to broaden and extend the dual mechanism
account—taking it from the current slate of morphology to the
larger syntactic level, and (ii) to spawn some theoretical discussion
of how such a dual treatment might have farther-reaching implications
for more general developmental aspects of language acquisition
(as a whole), namely (though not exclusively), the twin benchmarks
of syntactic development regarding Lexical vs. Functional (staged)
grammar, etc. Our central claim will be that whatever factors
lead to a deficient morphology, say, at a given stage-1 of development—factors
that may potentially lead to the postulation of a non-rule based
account—these same factors are likely to be carried over, becoming
a factor of deficiency in the overarching syntax. Thus, the tone
of the discussion is dualistic throughout. Our main goal is two-pronged:
first, to assert as the leading null hypothesis that language
acquisition is Discontinuous in nature from the adult target grammar,
and that this discontinuity is tethered to maturational factors
which lie deep-seated in the brain—factors which yield fundamental
differences in the actual processing of linguistic material
(a so-called ‘Fundamental Difference Hypothesis’),
and second, to show that this early multi-word non-target stage
can be attributed to the first leg of this dual-mechanism—i.e.,
that side of cognitive/language processing that governs (i) (quasi-)
formulaic structures along with (ii) non-parameterizations. We
attribute the generation of this two-stage development to maturational
scheduling—viz., a Non-Inflectional stage-1 and/vs. an Optional
Inflectional stage-2 (where formal grammatical relations are first
learned in a lexical bottom-up fashion and then later regroup
to generalize across the board in a word class top-down fashion).
It is our understanding that the two-staged development draws
on a relevant associative-style theory of learning
(cf. Skinner/Associative-style learning, for our former stage-1),
while preserving the best of what syntactic rule-driven theories
have to offer (cf. Chomsky/Rule style learning, for our latter
stage-2)—hence the term Converging in our title. By analyzing
much of what is in the literature today regarding child language
acquisition, as well as drawing from the rich body of work presently
being undertaken in connectionism, it is our hope that a new hybrid
converging theory of child language acquisition can be presented
in a way that captures what is inherently good from both schools—an
alternative theory that bears more flavor of truth than camp rhetoric.
Why—I don’t need any ‘rule’
to see this tree here in front of me. My eyes work just fine.
That is, insofar as there exists a single tree. But, how is it
that my ‘tree’ gets destroyed once I move my head ever so slightly
to the east and fall into view of a second tree? The mystery of
it all lies somewhere in the dismantling, between a single torn
branch of lifted foliage, that forces the rule—for how was I ever
to know that this second tree was indeed a tree after all?
(Poem based on Plato’s Forms).
“Humans use stories that
they tell themselves in order to get themselves to work on this
or that. These stories often deal with confrontation between areas
and ideas. From some point of view, it is almost always the case
that these high-level stories are relevant only as motivation
and not really relevant to what eventually happens in terms of
technical understanding”. (Allen Newell)
Sometimes, stories within
a certain school split—e.g., formalist debates on the amount of
functionalism Chomsky can and should afford to surrender (cf.
Pinker & Bloom). Sometimes differing stories converge—Neo-Behaviorists
seeking out an innately based architecture (cf. Elman). In any
event, differing schools-of-thought are prosaic at best, ripe
with unfortunate misunderstandings that lead to the fanfare of
debate. There is no clarion call behind dueling rationales; one
is left merely to one’s own devices, scrambling to gather fuel
for the worthy debate. All reduces to subtle arguments of fine detail—very
rarely is there really any substantial difference. The world as we
see it ultimately provides very little in the way of such dividend:
perhaps ontogeny recapitulates phylogeny in every respect.
0. Overview
Periodically, say every two or three generations, our vows on
science are renewed by a sweeping change of reasoning—cerebral
airs that deliver their own inextricable kind of ‘off-the-beaten-path’
hedonism. These solemn changes are few and far between and constitute
what the philosopher of science Thomas Kuhn called ‘Paradigm Shifts’
(a new way of thinking about an old-something). Unfortunately,
these generational spurts often provide very little in the way
of true original thinking, and much of what is behind the fanfare
quickly reduces to little more than the recasting of old ‘brews’
into new ‘spells’. Perhaps a glimmer of true original thought
(a ‘new-something’) comes our way every two hundred years or so.
We are in luck! One of the greatest breakthroughs in science was
born in the latter half of the last century and has made
its way onto the scene shrouded by questions surrounding how one
should go about rethinking the Human Brain/Mind—questions that
have led to eventualities in Computer Programming, Artificial
Intelligence (AI), Language/Grammar, Symbolic-Rule Programs and
Connectionism. Much of what sits here in front of me, at my desk,
can be attributed in one way or another to this ‘new-something’,
and whenever there is a new-something, whether it be steam locomotives,
transistors, or tampering with DNA, there’s bound to be an earful
of debate and controversy. And so remnants of this debate have
edged their way ever so slowly onto the platform—from the likes
of the psychologist Donald Hebb (1940s-50s) (and his revolutionary
notion of ‘nerve learning’ based on oscillatory frequency), to
the great debates between two great personalities in the AI field,
Marvin Minsky and Frank Rosenblatt (1950s-60s), to those in the
realm of language, Noam Chomsky (1960s-80s). More recently, the
debates have taken on a vibrant life of their own by the great
advances in computer technology. The most clearly articulated
of these recent debates has come to us by two leading figures
in the research group called Parallel Distributed Processing
(PDP)—namely, Jay McClelland and Dave Rumelhart (1980s). Most
recently, the debates have come to carry a portmanteau of claims—chief
among them is the claim that human brain function, and thus human
computation, is not analogous to (top-down) symbolic-based computers,
but rather, the brain and its functional computations should be
considered on a par with what we now know about (bottom-up) nerve
functions and brain cell activations. In other words, the paradigm
shift here occurs the moment one rejects the computer as an antiquated
model of the brain (and language) and instead props up a newer
model of language and thinking based on connections and connectionism
(as understood in neurological studies). In this vein, it is fair
to say that we should no longer view language as a mere gathering
and shaping of atomic particles or logical symbols—much like how
one might view the atomic nature of computer language as it is
composed of a serial string of 0’s and 1’s—rationing out sub-parts
of the structure in more-or-less equal portions in hopes of arriving
at a larger and more cohesive general frame of language. It could
be argued by connectionists that language is not only much more
fluid than what any strict rule-driven/symbolic function could
provide, but also that language requires a greater measure of
freedom and flexibility at the bottom end. Whereas rules originate
top-down, it may likely turn out that bottom-up processes better
reflect what is actually going on, at least in the initial learning
processes of language. (One nontrivial note to remember here is
that there is a fundamental and crucial difference between artificial
computer chips (AI) and living brain cells (neurons): the
latter must secure survival. There is no sense in the notion that
silicon chips need to secure survival, since there is no death
of a chip. Cells are living organisms that must somehow ensure
their survival, and this survival apparatus, certainly for the individual
cell, must be organized in a bottom-up fashion.) Along these lines,
much of what is coming out of West Coast schools-of-thought (connectionism)
affords the old school of Gestalt psychology a new lease on life.
Some connectionists find themselves talking-up the fact that language
can’t simply be a cohesion of atoms put together in very elegant
ways, but that some ‘higher-order’ of fluidness must exist. Human
cognition is more fluid, more context-driven. In a token manner
of speaking, Köhler might carry on here about mysterious magnetic
fields which suddenly arise in the brain and pull sub-particle
visual stimuli together—any notion of a gestalt brain, of course,
has long been disputed (I think). However, it should be noted
that Gestalt psychology continues to pave a way for a serious
return in the contexts of connectionism. (In addition, as a historical
footnote, let’s not forget that while Rosenblatt’s work originated
with visual perception, it is now viewed that his work, if carried
out in today’s climate, would have had potentially serious linguistic
implications.) And so let us turn to language. With specific
regard to grammar, the Word-Perception Model of Rumelhart and
McClelland (1981, 1986) has made a dramatic impact in the field.
Not only has it provided us with a new way of looking at potential
brain processing (a quantitative way of looking, with regard
to weights of connections, thresholds, memory storage, etc.),
it also made rather precise claims about what kinds of material
(qualitative) would be difficult to process in such a model:
(the need for hidden units regarding 2-degree complex structures
and paradigms, recursive complexity and back-propagation, etc.).
Clearly, when one can predict with a fair amount of certainty
where problems will be had, and then attempt to account for the
nature of the problem in terms of the model, then surely the criterion
of explanatory value is close to being met. For example, the now
conceded fact that ‘hidden units’ must be installed (p.c. Jeff
Elman, as part of the innate apparatus) in order for the full
complexity of language to be processed via any PDP network speaks
volumes, I believe, to where we stand today on explanatory value—in
fact, hidden units have now become the main rallying cry for those
who argue for rule-based accounts of language (not to mention
the nativists among us; see Marcus vs. Elman on this matter).
Finally, the typical intransigence that often will shape and define
opposing views has given way to a certain amount of movement leading
to an ultimate compromise between the two leading schools of thought—as
noted by the likes of Steven Pinker and Alan Prince. For instance,
Pinker & Prince’s somewhat tentative and partial acceptance
of a connectionist model regarding specific lexical processes,
if nothing else, has buttressed their own alliances in the pursuit
of upholding counter-claims against proponents for a pure ‘Single
Mechanism Model’ (strictly based on associative learning). And
so out of this twist of faiths, a renewed and rejuvenated interest
in rule-driven processes has been gathering momentum in an attempt
to seek more narrowly confined rule-based analogies for dealing
with specific aspects of language/grammar as a whole. Finally,
as suggested by Newell above, long-standing dichotomies often
provide a variety of clever means to think about a wide range
of topics. It goes without saying that as a pedagogical device
at least, students not only crave a good debate, but more importantly,
they often report that new material introduced in the form of
a debate procures a much higher level of understanding. Well,
this singular debate has been ongoing for centuries, merely masked
under several different labels: nature vs. nurture, innate
vs. learned, hard-wired vs. soft-wired abilities, instinct
vs. learning, genetic vs. environmental, top-down vs. bottom-up
strategies, and as presented herein, the Single vs. Dual
Mechanism Model.
Introduction
-
It is a fact that children do not produce ‘adult-like’ utterances
from the very beginning of their multi-word speech. And so much
of the debate ongoing in child first language acquisition has
been devoted to the nature and extent of ‘What gets missed
out where’. Theory-internal measures have been spawned every
which way in an effort to account for the lack of apparent adult-like
language in young children—Theories abound. Despite some evidence
that would seem to point to the contrary, more robust syntactic
theories from the outset continue to view the very young child
as maintaining an operative level of language closely bound
to abstract knowledge of grammatical categories (Pinker 1984,
Hyams 1986, Radford 1990, Atkinson 1992, Wexler 1996, Radford
& Galasso 1998). For instance, Pinker (1996) has described
early language production in terms of a first-order (general
nativist) cognitive account—suggesting a processing ‘bottleneck’
effect, attributed to limited high-scope memory, to account
for the child’s truncated syntax of Tense/Agr/Transitive errors
(e.g., Her want) and over-application Tense errors (e.g.,
Does it rolls?). Radford (1990) and Radford and Galasso
(1998), on the other hand, have maintained a second-order (special
nativist) maturational account affecting syntactic complexity
in order to explain the same lack of adult-like speech. It should
be noted that these two nativist positions share a common bond
in that they are reactions against much of what was bad coming on
the heels of work done in the 1970s—theories which sought to
account for such errors on a purely semantic level (e.g., Bloom
1975, Braine 1976, and to some extent Bowerman 1973). Steering
away from potentially non-nativist associative/semantic-based
accounts to proper syntactic-based accounts was viewed by most
to be a timely paradigm shift—acting as a safeguard against
what might be construed as bad-science Behaviorism (of the purely
semantic kind). This shift brought us toward a more accurate
‘Nativist’ stance swinging the Plato vs. Aristotle debate back
to Plato’s side, at least for the time being (as witnessed in
the title of Chomsky’s book ‘Cartesian Linguistics’)—a move
keeping in line with what was then coming down the pike in Chomskyan
linguistics. One thing that seems to have caught the imagination
of developmental linguists in recent years has been to question
again the actual infrastructure of the child-brain that produces
this sort of immature grammar—namely, a rejuvenated devotion
has reappeared in the literature, circumscribing new understandings
of age-old questions dealing with the Theory of the Brain/Mind.
-
For instance, proponents of Behavioral/Associationist Connectionism
today (cf. Jeff Elman, Kim Plunkett, Elizabeth Bates, among
others) are more than ready to relinquish the old Chomskyan
perspective over special nativism (‘special’ in that language
is viewed as coming from an autonomous region in the brain,
unconnected to general cognition or other motor skill development,
pace Piaget, and in contrast to general nativism), and have
rather shifted their focus to an innateness hypothesis based
not on natural language (per se) but rather on a type
of innateness based on the actual architecture itself that
generates language (architecture meaning brain/mind: viz., an
innate Architecture, and not an innate Universal Grammar).
-
For Chomsky, it was this autonomous language faculty (that he
refers to as a language organ) that allowed this innate language
knowledge to thrive and generate grammar. For the connectionist
movement, it is the very architecture itself that is of interest—the
input/output language result being a mere product of this perfected
apparatus. So in brief, the debate over innateness has taken
on a whole new meaning—today, perhaps best illustrated by this
more narrow debate over General vs. Special Nativism.
We shall forgo the meticulous details of specific theories at
hand and restrict ourselves to the rather prosaic observation
that the child’s first (G)rammar (G1) is not at all contemporary
with the adult (T)arget grammar (Gt). Notwithstanding myriad
accounts and explanations for this, for the main of this paper,
let’s just simply examine the idea that the two grammars (child
and adult)—and we do consider them as two autonomous and separate
grammars—must partake in some amount of Discontinuity (G1 ≠ Gt;
indeed, G1 < Gt), and that such a discontinuity must be stated
as the null hypothesis, tethered to maturational/biological
differences in the brain. Hence, G1 represents the (B)rain at
B1 (of the maturational sequence B1, B2, B3 … Bt), while Gt
represents the brain at Bt.
-
Discontinuity theories have at their disposal a very powerful
weapon in fighting off Continuity theories—whether it be language
based, or biological based (noting that for Chomsky, the study
of Language, for all intents and purposes, reduces to the study
of biology). This great weapon is the natural occurrence of
maturational factors in learning. In fact, on a biological level,
maturation is taken to be the null hypothesis—whether it be
the emergence and consequent loss of baby teeth, the learning
of how to walk and talk, or the onset of puberty. Whatever the
achievement, it can be attributed to the onset of some kind of
scheduled learning timetable—for language, it is an achievement
mirroring a process in which the nature and level of syntactic
sophistication and its allocation are governed in accordance
with how the brain, at the given stage, is able to handle the
input.
-
It is common knowledge that (abstract) grammatical relations
are frequently a problem for language acquisition systems.
Early reflection on this was made by Brown when he discovered
that one could not explain why some grammatical morphemes were
acquired later than others simply in terms of input. The question
was posed as follows: If all morphemes are equally presented
in the data-stream at roughly the same time—contrary to what
might be believed, parents’ speech toward their children is
seldom censored so as to bring about a reduced mode of grammatical
communication/comprehension—then, what might account for the
observed asymmetrical learning? Similarly, Pienemann (1985,
1988) has made claims for a grammatical sequencing of learning
second language based on complexity of morphology. This question
led to early notions of a linguistic maturational timetable
much like what Piaget would have talked about regarding the
child’s staged-cognitive development—maturation being the only
way to address such a staged development. Likewise, a Chomskyan
position would have it that there must be something intervening
in the child’s (inner) brain/mind (albeit not tied to cognition)
that brings about the asymmetrical learning since there’s no
change in the (outer) input. Well, one of the first observations
uncovered by Brown was that a child’s linguistic stage-1 (with
a mean length of utterance in words (MLUw) lower than 2) went
without formal functional grammar. In other words, Brown noted
that the telegraphic stage of learning was devoid of abstract
grammar such as Inflection,
Case and/or Agreement.
-
One consequence of this style of learning was that children were
considered to learn by rote-methods, associative means similar
to what Skinner had earlier advanced in (his ‘bad science’)
Behaviorism. It was somewhat tentatively suggested, regarding
a very early stage-1, that children didn’t start learning language
as a set of rules of logic (as Chomsky would have us believe
in his notion of generative grammar), but that children would
first grapple with the linguistic input by gathering and constructing
formulaic chunks. Children would only later on, say at a stage-2
of language acquisition, start to employ Chomskyan rules to
generate a target grammar (as a consequence, see ‘U-shape learning’
discussed below). For example, Bellugi (1967), Klima and Bellugi
(1966), Bellugi (1971), initially allowed for a certain amount
of formulaic misanalysis to enter into the accounting of non-adult-like
stage-1 structures. More recently, Rowland and Pine (2000)
have similarly suggested that, e.g., early Subject-Auxiliary
inversion errors such as *What he can ride in? (inter
alia) (alongside optional target structures showing
inversion, What can he ride in?) cannot be accounted for
by a rule-driven theory—viz., if the child has access to the
rule, the theory would then have to explain why the child sometimes
applies the rule, and sometimes fails to apply it. Rowland &
Pine rather suggest an alternative account by saying that as
a very early strategy for dealing with complex grammar (e.g.,
Aux. Inversion, Wh-fronting) children learn these semi-grammatical
slots as lexical chunks—a sort of lexicalized grammar—whereby
they establish formulaic word combinations: e.g., Wh-word
+ Auxiliary as opposed to Auxiliary + Wh-word combinations.
It was shown that aspects of error rate and optionality (as
opposed to rule-driven mechanisms) highly correlated to high
vs. low frequency rates of certain combinations in the child’s
input. This early non-rule-based strategy was then able to account
for the vast array of the child data—viz., where the number
of non-inverted Auxiliaries (vs. inverted Auxiliaries) was at
a significantly higher rate at the initial stage-1 of development.
As an example of a non-rule-based account here, they show that
when inversions did occur, they typically involved only a certain
select few Wh-words, and not the entire class. Hyams (1986,
p. 85) somewhat agrees with such a reduced structure when she
asserts the following: ‘By hypothesis, the modals (or
Aux. Verbs) are unanalyzable during this period’. Such
overall claims strongly support Stromswold’s (1990) statistical
data analysis which clearly demonstrated that children at a
very early stage-1 might not productively realize an utterance
string containing [don’t, can’t] in e.g., I/me [don’t]
want, You [can’t] play as the syntactic elements
[{Aux} + clitic{n’t}], but that such strings were more limitedly
realized as quasi-formulaic representations of a negative element.
In other words, the claim could be extended to mean that
for the child at this stage-1, the lexical items don’t/can’t
reduce to the one-to-one sound-meaning of not: e.g.,
Robin [don’t] [=no(t)] play with pens
(Adam28), where the verbal inflection {-s} goes missing since
it isn’t analyzed as an Aux Verb. Likewise, Brown came to similar
tentative conclusions by recognizing that (i) verbal inflection
seemed not to be generalized across all verbs in the initial
stages, and therefore, that (ii) children didn’t really start
with rules, but rather employed a strategy of ‘lexical-learning’.
Early stage-1 inflected verbs might then be learned as separate
verbs (chunks) thus explaining observable optionality: since,
as the story was then told, either you knew a rule (and so you
always applied it) or you didn’t. Optionality of verbal inflection
was then seen as a dual process of word acquisition in the brain:
both uninflected and inflected words were stored as two different
items in the lexicon. (See Bloom 1980 for comments). This notion
of a stage-1 learning via non-rule-based means implied that
the stage was a formulaic stage, and set-up in such a way as
to learn by associative processes buttressed by frequency learning.
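Rowland and Pine’s frequency logic lends itself to a toy simulation. The sketch below is a caricature under invented assumptions (the input counts, the threshold, and the function name are not taken from their study): a ‘child’ who stores whole Wh-word + Auxiliary bigrams can invert only where a sufficiently frequent stored chunk exists, yielding inversion for a select few high-frequency combinations and non-inversion elsewhere.

```python
from collections import Counter

# A caricature of chunk-based (non-rule) learning of Wh + Aux frames:
# the 'child' stores whole Wh-word + auxiliary bigrams and can invert
# only where a stored chunk exists. The counts below are invented for
# illustration, not drawn from any corpus.
caregiver_input = (
    ["what can"] * 40 + ["what do"] * 35 + ["where is"] * 30 +
    ["why can"] * 2 + ["how could"] * 1
)
chunks = Counter(caregiver_input)
THRESHOLD = 5   # hypothetical: chunks heard this often become frames

def child_inverts(wh_word: str, aux: str) -> bool:
    """Inversion succeeds only via a sufficiently frequent stored chunk."""
    return chunks[f"{wh_word} {aux}"] >= THRESHOLD

print(child_inverts("what", "can"))   # True  -> 'What can he ride in?'
print(child_inverts("why", "can"))    # False -> *'Why he can ride in?'
```

On this caricature, optionality falls out for free: inversion is restricted to the handful of Wh + Aux frames the input supports, rather than generalizing across the word class.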
The Dual Mechanism Model
-
It has recently been hypothesized that the language faculty consists
of a dualistic modular structure made up of two basic components:
(i) a Lexical component—which has to do with formulating
lexical entries (words), and a (ii) Computational component—which
is structured along the lines of algorithmic logic (in a Chomskyan
sense of being able to generate a rule-based grammar). It is
argued that these two very different modes of language processing
reflect the ‘low-scope’ (1st-order) vs. ‘high-scope’
(2nd-order) dichotomy that all natural languages
share. Low/high scope would be described here in terms of
how and where certain aspects of language get processed in the
brain (see also section # below on brain studies). In addition
to newly enhanced CT brain imaging devices, multidisciplinary
data (e.g. linguistic, psychological, biological) are starting
to trickle in providing evidence that a dual mechanism is at
work in processing language. Results of experiments indicate
that only a dual mechanism can account for distinct processing
differences found amongst the formulations of irregular inflected
words (e.g., go>went, foot>feet) and regular inflected
words (e.g., stop>stopped, hand>hands). The former (lexical)
process seems to generate its structure in terms of stored memory
and is taken from out of the mental lexicon itself in mere associative
means: these measures are roughly akin to earlier Behavioristic
ideas on frequency learning, etc., fashionable in the 1940s-1960s
and made notable by the experimental work of D. Hebb and B.F.
Skinner. The latter regular mode of generating structure is
tethered to a Chomskyan paradigm of (regular) rule-driven grammar—the
more creative, productive aspect of language/grammar generation.
Such regular rules can be expressed as [Stem]+[affix] representations,
where the stem constitutes any variable word <X> (old
or novel) that must fit within the proper (parts-of-speech)
categorization. For instance, using a simplified version of Aronoff’s
realization pair format (1994, as cited in Clahsen 2001, p.
11), the cited differences in parsing found between e.g., (i)
a regular [Stem + affix] (decomposed) construction vs. (ii)
an irregular copular ‘Be’ [Stem] (full-form) lexical item can
be notated as follows:
a. <[V, 3sg, pres, ind], X+s>
b. <[V, 3sg, pres, ind, BE], is>
The regular 3Person/Singular/Present rule in (a) spells out
the bracketed functional INFLectional features of Tense/Agreement
by adding the exponent ‘s’ to the base variable stem ‘X’. The
features in (b) likewise get spelled out; but rather than in the
form of an exponent, the features are built into the lexeme
‘BE’ by the constant form is. Once the more specific
irregular rule is activated, the default regular rule-based spell-out
is blocked, preventing the overgeneralization of *bes.
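The blocking relation between (a) and (b) amounts to a lookup-then-default procedure. A minimal sketch follows (the function name and the toy irregular list are ours for illustration): the stored full form is consulted first, and only if no such entry exists does the combinatorial X+s rule fire.

```python
# A minimal sketch of the Dual Mechanism blocking logic for 3sg present:
# a stored irregular full form pre-empts (blocks) the default X+s rule.
IRREGULAR_3SG = {"be": "is", "have": "has", "do": "does"}  # toy stored chunks

def third_singular(stem: str) -> str:
    # Route 1 (lexical/associative): retrieve a stored full form, if any.
    if stem in IRREGULAR_3SG:
        return IRREGULAR_3SG[stem]     # blocks the default rule: never *bes
    # Route 2 (computational/true rule): <[V, 3sg, pres, ind], X+s>
    return stem + "s"

print(third_singular("be"))      # 'is', not *'bes'
print(third_singular("stop"))    # 'stops' (default rule over variable X)
print(third_singular("wug"))     # 'wugs': novel stems get the rule too
```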
-
INFLection. Recent research conducted by Pinker (MIT)
and Clahsen et al. (Essex), among others, has shown that a dual
learning mechanism might be at work in the acquisition of a first
language. The research first focuses on terminology. It is said
that there are two kinds of rules for Inflection: an Inflection
based on lexical rules, and an Inflection based on combinatory
rules. In short, the types of rules are described as follows:
(i) Lexical Rules:
Lexical rules (or lexical redundancy rules) are embedded
in the lexical items themselves (‘Bottom-up’). Lexical
rules may be reduced to being simple sound rules somewhat
akin to statistical learning; for instance, associative
regularities are built up from out of the sequencing
of lexical items—e.g., the sing>sang>sung,
ring>rang>rung sequencing of an
infix (vowel-change) inflection (presented below)
(ii) True Rules:
Word inflection of the former type (i.e., lexical rules)
is cited as an inflection not based on rules, but rather
encoded in the very lexical item itself. True Rule (or
affixation), on the other hand, would be a combinatory
symbolic process based on variables—a creative endeavor
not bound by associative input (‘Top-down’). Whereas
lexical-based inflections are exclusively triggered
by frequency and associative learning methods—i.e.,
they are not prone to deliver the creative learning
of novel words with inflection—novel-word inflection
is generated (by default) once the true rule-based grammar
is in place. One simple example that Pinker and Clahsen
give in illustrating lexical/associative Inflection
is the irregular verb construction below:
-
Irregular Verb Constructions: The #ing>#ang>#ung paradigm
Table 1
a. sing > sang > sung
b. ring > rang > rung
c. *bring > *brang > *brung
The cause of this commonly made error in (c) is the fact
that the phonological patterning of the rhyme #ing>#ang>#ung—as
a quasi-past-tense infix (lexical-rule) form—is so strong that
it often over-rides and outstrips the default regular (true-rule)
form of V+{ed} inflection for past tense. (Spanish offers many
similar examples where the frequency of irregular verbs affects the
paradigm, such as the irregular (incorrect) *Roto (=Broke)
over-generalization generated from the regular inflection Romp-ido.)
(* marks ungrammatical structures.)
-
The erroneously over-generated patterns of *bring>brang>brung
(for English) and *Roto (for Spanish) are heavily based
on statistical frequency learning in the sense that the sound
sequences of other irregular patterns (e.g., ring>rang>rung)
contribute to the associative patterning. Recall that structured
lexical/associative learning merely generalizes, by analogy,
to those novel words that are similar to existing ones. Regular
grammatical rules (true rules), on the other hand, based on
affixation, may apply across the board to any given (variable)
syntactic category (such as Verb, Noun). In one sense, the ultimate
character of ‘true rules’ is that which breaks the iconic representation
of more primitive, associative-based processes, whether it be
a neuropsychological process or some other process.
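The associative route can be caricatured in the same fashion. In the minimal sketch below, the exemplar set, the crude three-letter rhyme check, and the function name are all invented for illustration: a stem that rhymes with the stored ablaut exemplars is inflected by analogy to the family, producing precisely the attested *brang-type error, while non-rhyming stems fall through to the V+{ed} default.

```python
# A caricature of rhyme-driven (associative) past-tense formation: stems
# sharing the '-ing' rhyme with stored irregular exemplars receive the
# ablaut pattern by analogy, overriding the V+{ed} default. The exemplar
# set and the crude rhyme check are invented for illustration.
ABLAUT_FAMILY = {"sing", "ring", "spring"}   # stored i>a>u exemplars

def past_by_analogy(stem: str) -> str:
    # Analogical route: rhyme match against a stored exemplar family.
    if any(stem[-3:] == exemplar[-3:] for exemplar in ABLAUT_FAMILY):
        return stem[:-3] + "ang"             # copy the family's i -> a change
    return stem + "ed"                       # default (true-rule) route

print(past_by_analogy("bring"))   # 'brang': the attested over-generalization
print(past_by_analogy("walk"))    # 'walked' (default rule)
```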
-
The point that the actual over-generalized strings (bring>brang>brung)
are not found in the input demonstrates that there is some
aspect of a rule evoked here—albeit, a rule based on rhyme association,
and thus not a ‘pure rule’ where true (non-associative) variables
would be at work. In other words, these lexical rules are to
be generalized as a form of associative pattern learning, and
not as a true rule, since they are associated with sound sequencing
only. One crucial implication of an Inflection generated by
a true-rule is that such inflection could be easily applied
to novel or unusual words: viz., words never before heard in
the input (contrary to the frequency learning of lexical rules discussed
above—cf. Brown (1957), Berko (1958)).
-
Expanding on previous studies which examined differences in priming
effects between Derivational and Inflectional morphology,
Clahsen concludes that difference in priming effects can only
be accounted for by a dual mechanism of learning—interpreting
the data to show that high priming effects were
connected with productive inflectional forms not listed in the
mental lexicon, whereas low priming effects were connected
to productive derivational forms associated with stem entries.
-
With regard to German forms of pluralization, Clahsen et al.
(p. 21) note that the same argument can be made for a dual mechanism
process—viz., the high priming of the regular (default) plural ‘-s’
(auto-s) contrasts with the low priming of the irregular
plural ‘-er’ (kind-er). The raw findings here suggest
that certain irregular inflections in German (e.g., participle
{-n}, plural {-er}) might be stored in the lexicon as undecomposed
form chunks and that these two processes of storage are activated
in very different places and manners in the brain—viz., the
findings that irregular inflections spawn reduced priming as
compared to regular inflection suggest that regular inflections
are built forms based on rules that contain variables which
make the basic unmarked stem/root available for priming.
It is clear from the table below that regular inflected word
forms such as {-t} participles and {-s} plurals produce full
priming and no word-form frequency effects. For irregular inflected
forms such as {-n} participles, {-er} plurals and (irregular)
{-n} plurals, the opposite pattern appears. The data suggest
that irregular forms are stored as undecomposed full forms—hence
the emergence of full-form frequency effects. Regular forms
are captured by the full rule process and are stored in a computational
manner that works off of variable+stem algorithms—hence, the
lack of full-form frequency effects. These differences in German
morphology seem to parallel what we find between English (i)
Inflectional morphology and (ii) Derivational morphology where
the former seeks out specific rule formulations—e.g., V + {ed}
= Past, or N + {s} = Plural, etc.—and where the latter seeks
out associative-style sound-to-meaning learning approaches (as
in irregular verbs/nouns, e.g., go>went, tooth>teeth, etc.).
Applying fMRI brain imaging techniques, a consensus has begun
to emerge suggesting that the lexical storing of derived stems
+ suffixes (e.g., teach+{er}) may actually be processed as one
single word chunk in the otherwise lexical (word/recognition)
temporal-lobe areas of the brain, and not, as intuition would
have us believe, as a dual segmented [stem + suffix] lexical
structure which has undergone a process much like a morpho-syntactic
string. This may be an apparent economical move, keeping in
line with the classic one-sound-one-meaning association. In
noting this, there seems to be a natural tendency in the diachronic
study of language to move from (i) rule-driven Inflectional
morphology—with more complex rule-driven infrastructures
[+Comp] (Comp=complex) to less complex [-Comp] structures—to
(ii) association-driven Derivational morphology. This
tendency can be easily captured by looking into the way words
have evolved over time—e.g., Break|fast /bre:kfaest/ has
evolved from a twin-morpheme structure [[Verb Break] + [Noun Fast]]
to Breakfast /bre:kfIst/ [Noun Breakfast], composed of a
single-morpheme chunk.
Table 2. Summary of experimental effects (taken from Clahsen et al. 2001: p. 26)

Representation | Full priming effect? | Full-form frequency effect? | Source
-t participles: ge[kauf]-t | yes | no | Sonnenstuhl et al. (1999), Clahsen et al. (1997)
-s plurals: [auto]-s | yes | no | Sonnenstuhl & Huth (2001), Clahsen et al. (1997)
-er plurals: [kinder] | no | yes | Sonnenstuhl & Huth (2001), Clahsen et al. (1997)
-n participles: [gelogen] | no | yes | Sonnenstuhl et al. (1999), Clahsen et al. (1997)
-n plurals I: [bauern] | no | yes | Sonnenstuhl & Huth (2001)
-ung nominalizations: [[stift]ung] | yes | yes | Clahsen et al. (2001)
diminutives: [[kind]chen] | yes | yes | Clahsen et al. (2001)
-n plurals II: [[tasche]n] | yes | yes | Sonnenstuhl & Huth (2001)
-
In sum, Pinker and Clahsen assume that the language faculty has
a dual architecture comprising (i) a combinatory rule-based
component (leading to the lack of full-form effects) and (ii)
a structured non-rule-based lexicon (leading to full-form effects).
Questions on specifics will surface in the following sections—namely:
How are these two methods represented in the brain?
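The frequency-effect diagnostic underlying Table 2 can also be put in sketch form (the latency formula and all frequency values below are invented for illustration and are not taken from Clahsen et al.): if a form is stored whole, recognition speed tracks its full-form frequency; if it is assembled by rule, access runs through the stem, and full-form frequency is inert.

```python
import math

# Sketch of the frequency-effect diagnostic behind Table 2: stored
# (undecomposed) forms are accessed by their own frequency; rule-built
# forms are accessed via their stem, so full-form frequency is inert.
# The latency formula and all frequencies are invented for illustration.
def latency_ms(form_freq: float, stem_freq: float, stored: bool) -> float:
    effective = form_freq if stored else stem_freq
    return 1000 - 100 * math.log(effective)   # higher frequency = faster

# Two rule-built plurals sharing a stem differ in full-form frequency
# yet show identical latencies (no full-form frequency effect)...
print(latency_ms(form_freq=5, stem_freq=50, stored=False))    # ~608.8
print(latency_ms(form_freq=500, stem_freq=50, stored=False))  # ~608.8
# ...while two stored irregulars do show the full-form effect:
print(latency_ms(form_freq=5, stem_freq=50, stored=True))     # ~839.1
print(latency_ms(form_freq=500, stem_freq=50, stored=True))   # ~378.5
```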
-
A Stage-1 Language Acquisition. There is a huge
and ever-growing body of data today being collected by developmental
linguists in the field which suggests that the brain of a child
matures in incremental ways which, among other things, reflects
the types of ‘staged’ language development produced by the child
for a given maturational stage. The collected data suggest that
children’s early multi-word speech demonstrates ‘Low-Scope’
lexical-specific knowledge, and not abstract true-rule
formulations attributed to grammar. This is somewhat akin to Piagetian
notions of language development (see general nativism below),
one difference being that it need not be tied here, exclusively,
to a cognitive apparatus. This staged, maturational theory of
language development accounts for the lack of specific linguistic
properties by suggesting that the brain is not yet ready to
conceptualize higher and more abstract (High-Scope) forms of
linguistic conceptualization.
-
The idea behind ‘What gets missed out where’ in child
speech production has given those linguists interested in morphology
and syntax a particularly good peek at how the inside of a child’s
brain might go about processing linguistic information—and other
information for that matter. As stated above, research initially
carried out by Brown and his team (1973), working under a Chomskyan
paradigm of linguistic theory, and consequent work by others
(cf. Radford) suggests that there is a stage-1 in language acquisition
that tightly constrains the child’s speech to simple one-to-two
word utterances with no productive forms of verb or noun inflection.
One child that appears in the early studies, Allison, provides
transcripts between 16-19 months showing no signs of the onset
of formal inflectional grammar—only later on, close to two years
of age (22-24 months), does inflectional grammar/syntax emerge,
and then only in what could be called a sporadic, optional
manner.
-
This stage-1 is considered to be a grammatical stage
with an MLUw (Mean Length of Utterance in words) of 2 words or less.
More specifically, in the sense of the apparent lack of formal
grammar, this shouldn’t be confused with the idea of an earlier
a-grammatical stage well before the onset of multi-word speech.
(Surely, there can be no grammar or syntax of which to speak
if there aren’t multi-word constructions). This grammatical
stage-1 therefore differs from the notion of a one-word stage
(MLU=1) where supposedly no grammar/syntax at all is at
work. The grammatical stage-1 is said to begin roughly with
the onset of multi-words at about the age of 18 months (+/-20%).
It is reasonable to suppose that such a stage would have target
semantic meaning—even though, say, the arbitrary ‘one-to-one
sound-to-meaning’ relationship is not of the target type (e.g.,
onomatopoeic forms /wuwu/=dog, /miau-miau/=cat, etc.).
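Since MLUw figures carry much of the staging argument, it may help to spell the measure out. A minimal sketch (the sample utterances are invented): MLUw is the mean number of words per utterance across a transcript sample.

```python
# Minimal MLUw (Mean Length of Utterance in words) computation over a
# transcript sample; the utterances are invented for illustration.
def mluw(utterances: list[str]) -> float:
    return sum(len(u.split()) for u in utterances) / len(utterances)

sample = ["want cookie", "daddy go", "no", "mommy sock"]
print(mluw(sample))   # 1.75: below 2, i.e., within Brown's stage-1 range
```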
-
The above notions raise the question: At what point do we have
evidence of grammatical categorization? For example, the traditional
distributional criterion that defines the Noun class as that
category which may follow Determiners (a/the/many/my/one) may
not be available to us if, say, Determiners have yet to emerge.
Hence, distributional evidence may be lacking in such cases.
One way around the dilemma has been to suggest that early stage-1
grammar is categorical in nature simply owing to a default assumption
that categorization is part of the innate ability to acquire
language (in Chomskyan terms, part of the richly endowed LAD
or Language Faculty) and that words are both inherently categorical
and semantic in nature. Pinker (1984) claims that the categorization
of early stage-1 words should be roughly pegged to their inferred
semantic properties. Radford (1990), in a slightly different
approach, prefers to consider such early multi-words at stage-1
as lexical in the sense that (i) they have built-in default
lexical categorization abilities (forming classes of Nouns,
Verbs, Adjectives, Adverb, and Prepositions), but, at the same
time, (ii) rely heavily on their semantic-thematic properties.
In any event, either description starkly contrasts with a connectionist
view—which claims that, e.g., the class ‘subject’ emerges through
rote-learning of particular framed constructions. Subject-hood
is learned as a category via rote associative learning of thematic
relations. Now, it remains unclear to me precisely how close
such thematic links to category-hood get to Radford’s 1990 interpretation.
I would only venture to say that both views share the belief
that semantics hold the central cognitive underpinnings upon
which syntax can later be built.
-
This account of stage-1 has been labeled the Lexical-Thematic
stage-1 in language acquisition (Radford 1990). It is unclear
how far Radford would like to go in accepting his stage-1 as
cognitively based: the labeling here of lexico-thematic
(the term thematic referring to argument structures pegged to
semantics) certainly permits some amount of semantics to leak
into the discussion. Nevertheless, Radford emphatically rejects
the notion that a stage-1 syntax could be exclusively based
on semantics. It is here that Radford gets full mileage out
of his two-pronged converging Lexical-Thematic stage-1 grammar:
a stage-1 that is both—
(i) ‘thematic’ in the sense that it
leans towards general nativism since simple utterance
types at the earliest MLU get directly mapped onto their
thematic argument structure; while,
(ii) ‘lexical’ in the sense that children seem to
be fully aware that they are dealing with words based
on lexical grammatical categories, and not semantics.
This is made apparent by how children know the morphological
range of category (e.g., Noun, Verb) selectiveness along
with inflection distribution.