Towards a 'Converging Theories' Model of Language
Acquisition:
Continuing Discontinuity
Joseph Galasso
California State University, Northridge
joseph.galasso@csun.edu
(2003b)
Introduction
There
was a time when the classical split between behaviorism and nativism was easily identifiable, each rationale breaking down along its
traditional fault lines. On one hand, you had the 'behaviorist folk' who
believed more or less that all forms of learning, language included, could be
somehow reduced and extracted from ambient input found in the environment. If
there were to be any talk of innate structure leading to such learning, it
would be relegated to innate structure underpinning more general cognitive mechanisms
for associative-style learning--perhaps something along the lines
of an innate memory capacity, or an associative linking component of the
brain/mind that allows semantics to link to syntax, thus solving any linking
problem (cf. Pinker), or perhaps an innate architecture
that paves the way for frequency learning (cf. Elman). On the other
hand, while 'nativist folk' agreed that there was something of interest to be
said about such accounts of learning (i.e., artificial intelligence and
connectionist strands of Computational Theories of the Mind (CTM)), the strong
nativists among them saw through the clever guise of CTM and never let
themselves be taken in by what appeared to be simply another attempt to reduce
true language (a syntactic structure) to being a simple by-product of mere
computation (cf. Fodor).
This
working paper, the broad second segment of 'Twin Working Papers',[1] attempts to review the literature surrounding the two
sides and to bring to light reasons why I believe we have made very
little progress in understanding / explaining how a 'rule-based' equation of
language actually arises as a computation in the brain. (The problem that
besets 'explanation' is well compounded: Darwin's theory of evolution even
fails on this test. So, I suppose we are in good company). Having said this,
there is good reason nevertheless to promote the Dual Mechanism Model (DMM) as
the best possible candidate to eventually bridge the gap between the two sides
of the traditional divide. A caveat here follows: As I hope to show, while the
DMM may do well in accounting for a number of phenomena, as it is presently
understood, it ultimately fails to provide us with any new, comprehensive model
towards an explanation of true language. On one side of the argument, the DMM
at best simply refashions the same problems the behaviorists were plagued with
more than a half century ago--namely, the overwhelming 'mystery' of how the
brain/mind creates rule-driven syntax (top down) from mere cognitive capacity
(bottom-up) (the 'bootstrapping' dilemma). To my mind, while the DMM succeeds
in descriptively carving out the data roughly into these two distinctive
processes (root-based vs. affix-based) (or frequency vs. rule driven), it does
little to explain the distinctions outright or to make any sense of how/why the
two processes converge (when they do converge) and/or why they don't (when they
don't). (Examples of such convergence have recently been reported by Clahsen
(2001), who suggests that derivational morphology, indeed a
morphological process, shows processing similarities akin to lexical
retrieval tasks, and that high-frequency regular rule-based inflectional
morphology does so as well--the two processes
may actually converge in becoming rote-learned incorporations of otherwise
decomposed morpho-phonetic structures).[2] In the ensuing pages, we examine the role the Dual
Mechanism Model has in language acquisition while keeping an eye on how it will
ultimately fail in offering any viable complete picture of linguistic
knowledge. However, having started on this rather pessimistic note, I proceed
in good faith to make clear that the DMM is at the moment our best and most
promising tool in sorting through the many complexities language has to offer.
The
Dual Mechanism Model credits the Brain/Mind with having two fundamentally
different cognitive modes of language processing--this dual mechanism has
recently been reported as reflecting inherent qualitative distinctions found
between (i) regular verb inflectional morphology (where rule-based stem+affixes
form a large contingency), and (ii) irregular verb construction (where full
lexical forms seem to be stored as associative chunks). In this paper, we
examine the DMM and broaden its scope as a means of covering the overall
grammatical development of Child First Language Acquisition.
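The two routes just described can be sketched in miniature. The following is a toy illustration, not the model itself: the tiny lexicon, the spelling adjustment, and the lookup-before-rule ordering are all simplifying assumptions made here for exposition.

```python
# Toy sketch of the Dual Mechanism Model's two routes for English past tense:
# irregular forms are retrieved whole from associative memory (full lexical
# chunks), while regulars fall through to the default symbolic rule (stem + affix).
# The lexicon and the "-e" spelling adjustment are illustrative assumptions.

IRREGULAR_MEMORY = {       # route (ii): stored associative chunks
    "go": "went",
    "sing": "sang",
    "bring": "brought",
}

def past_tense(verb: str) -> str:
    """Return a past-tense form: memory lookup first, default rule second."""
    if verb in IRREGULAR_MEMORY:   # associative retrieval wins when a form is stored
        return IRREGULAR_MEMORY[verb]
    if verb.endswith("e"):         # route (i): rule-based stem + "-ed" affixation
        return verb + "d"
    return verb + "ed"

print(past_tense("walk"))  # -> walked  (rule route)
print(past_tense("sing"))  # -> sang    (memory route)
```

The design point the sketch makes is simply that the two routes are qualitatively distinct: one is a table, the other a productive rule that applies to any stem, including novel ones.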
Converging
Theories and the Brain as Self-Referent
The one major theme behind much of what is expressed
within the notes comes to be centered on a driving notion called 'Converging
Theories'. The term 'converging', though used more-or-less as a device to merge
the two major theories in the field of language acquisition, equally serves a
second purpose having to do with a converging of brain processing. Perhaps the
leading motivation behind my compiling the notes for the 'Twin Working Papers'
sits with trying to understand the brain, its modular aspects, and how the brain comes to bootstrap itself and becomes a mind worthy of producing language.
Let's
start by saying that the brain is self-referent, meaning it takes in only that input (external to
itself) which has already been generated in the brain in the first place
(internal of itself). Contrary to this, one is often tempted into thinking that
the brain processes such information as if the input were truly novel to the
brain in some way or other, as if the input were truly objective, and that the
brain then takes this novel input and makes sense out of it (viz., to the
extent that there exists an anthropic principle behind man's capacity to reason). This doesn't seem to
be the case at all. The brain rather first creates, churns out, takes back in, reexamines,
and creates anew again and again. That which we are inclined to perceive and
thus understand in our environments is exactly that and only that which has
already been born to the brain. The brain is not only self-referent in its
processing of knowledge, but modular in its allocation of the processing. The
modular aspect of the brain, simply put, could be best summed up by cutting the
brain into two halves (frontal vs. temporal): (i) the temporal sensori-motor
brain (the 'animal brain'), and (ii)
the frontal abstract-brain (the 'human
brain'). Each half has its own processing tasks. Each half can only
understand/process that form of knowledge (externalized to the outside) which
it originally conceived (internalized from the inside). The sensori-motor brain
is instinctively 'knee-jerk-like' in nature in that it solely responds to a
kind of self-preserving behavior. The outward manifestation of this behavior
is first generated from the animal brain itself. The sensori-brain works in a
'bottom-up' cognitive manner; it easily runs with a neo-Darwinian story of
evolutionary adaptation and
accounts for much of what we know resides behind more concrete processing:
namely, the inputs-outputs of man's sensory world (visual/auditory, etc.). The
abstract-brain is a curiosity of sorts; it is rather non-self-preserving in
nature and works in a 'top-down' manner of exaptation in the sense that it caters to no known Darwinian adaptive reasoning. The
converging of these two modular aspects of the brain allows for the allocation
of specific types of knowledge to enter into specific domains. The dual modes
gather and identify only those select forms of the input which they first
produced--hence, therein lies a kind of circular loop
between (i) the subjective preconceived internalization of behavior/mental
processing, (ii) the objective release of the behavior/mental processing in the
form of output, returning to the (iii) internalization of the output.
There
exists a long linguistic tradition concerning such lines of reasoning. For
instance, the inquiry into how children might eventually 'notice' similarities
in the form of frequency-driven input (bottom-up) in both represented
utterances and encoded events could be reinterpreted into questioning how the
very young child is able to 'notice' such input in the first place. The
'noticing problem' has likewise spun off into other areas of linguistics having
to do with word learning and taxonomy, semantic boot-strapping analogies and
innate assumptions leading to morphology and syntax. Unfortunately, the
noticing problem often suffers either from circularity in one respect or
paradox in another: viz., if one means to say children notice in adult-like
terms from the outset of their speech, then surely one must advocate an
(adult-like) innate mechanism for such noticing in the first place (citing
Plato's problem in general along with the specific linguistic problem of
poverty of stimulus). However, contrary to the above position, noticing
hypotheses tend to rely on bottom-up sensori-brain methods for dealing with
such learning, not nativist top-down assertion of abstraction. For example,
stage-1 language development tends to be described as utterance-event pairings
iconic in representation, a Stimulus & Response one-to-one association as
opposed to a latter developed stage-2 which tends to be described by saying
that the child notices non-iconic abstract representations and similarities
having to do with imperfections of rule-based paradigms. Clearly, if the first
stage of noticing is correct, and, to a degree we believe it is, then surely
one must obtain some means of getting a hold on the knowledge (if not via a
priori epistemology, then perhaps at
least via some biological module of brain processing).
Proposal
This
paper proposes new accounts of old issues surrounding child first language
acquisition. The general framework of our proposal is based upon hybrid
theories--proposals stemming from recent investigations in the areas of
PDP-style connectionism, as well as from more naturalistic studies, and
sample-based corpora of Child Language Acquisition. Much of what is sketched
out here attempts to converge the leading tenets of two major
schools-of-thought--namely, Associative Frequency learning and/vs. Symbolic Rule
learning. Cast from this new tenor, proponents calling for a Dual Mechanism
Account have emerged advocating a
dual cognitive mechanism in dealing with processing differences found amongst
regular and irregular verb inflection morphology (inter alia). The main task of this paper is (i) to broaden and
extend the dual mechanism account--taking it from the current slate of
morphology to the larger syntactic level, and (ii) to spawn some theoretical
discussion of how such a dual treatment might have further reaching
implications behind more general developmental aspects of language acquisition
(as a whole), namely (though not exclusively), the twin benchmarks of syntactic
development regarding Lexical vs. Functional grammar. Our central claim will be
that whatever factors lead to a deficient morpho-phonology, say, at a given
stage-1 of development--factors that may potentially lead to the postulation of
a non-rule based account--these same factors are likely to be carried over,
becoming a factor of deficiency in the overarching syntax. Thus, the tone of
the discussion is dualistic throughout. Our main goal is two-pronged: first, to
assert as the null hypothesis that language acquisition is Discontinuous in nature from that of the adult target grammar, and
that this discontinuity is tethered to maturational factors which lay
deep-seated in the brain--factors which yield fundamental differences in the
actual processing of linguistic material, (a so called 'Fundamental Difference
Hypothesis'), and second, to show that this early multi-word non-target stage
can be attributed to the first leg of this dual-mechanism--i.e., that leg of
cognitive/language processing that governs (i) (quasi-) formulaic structures
along with (ii) non-parameterizations. We attribute the generation of this
two-stage development to maturational scheduling--viz., a Non-Inflectional
stage-1 and/vs. an Optional Inflectional stage-2 (where formal grammatical
relations are first learned in a lexical bottom-up fashion and then later
regroup to generalize across the board in a word class top-down fashion). It is
our understanding that the two-staged development involves and shares both a
relevant associative style theory of learning (Associative-style
Constructive Learning for our former
stage-1), while preserving the best of what syntactic rule-driven theories have
to offer (Rule-based Generative Acquisition for our latter stage-2)--hence, the entitled term Converging.
By analyzing much of what is in the
literature today regarding child language acquisition, as well as drawing from
the rich body of work presently being undertaken in connectionism, it is our
hope that a new hybrid converging theory of language acquisition can be
presented in a way that captures what is inherently good from both schools--an
alternative theory that bears more flavor of truth than camp rhetoric.
Why--I don't need any 'rule' to see this tree here
in front of me. My eyes work just fine. That is, insofar as there exists a
single tree. But, how is it that my 'tree' gets destroyed once I move my head
ever so slightly to the east and fall into view of a second tree? The mystery
of it all lies somewhere in the dismantling, between a single torn branch of
lifted foliage, that forces the rule--for how was I ever to know that this
second tree was indeed a tree after all?
(JG).
"Humans use stories that they tell themselves in
order to get themselves to work on this or that. These stories often deal with
confrontation between areas and ideas. From some point of view, it is almost
always the case that these high-level stories are relevant only as motivation
and not really relevant to what eventually happens in terms of technical
understanding". (Allen Newell)
Sometimes, stories within a certain school
split--e.g., formalist debates on the amount of functionalism Chomsky can and
should afford to surrender (cf. Pinker & Bloom). Sometimes differing
stories converge--Neo-Behaviorists seeking out an innately based architecture (Jeff Elman).
0. Overview
Periodically, say every two or three
generations, our vows on science are renewed by a sweeping change of
reasoning--cerebral airs that deliver their own inextricable kind of
'off-the-beaten-path' hedonism. These solemn changes are few and far between
and constitute what the philosopher of science Thomas Kuhn called 'Paradigm
Shifts' (a new way of thinking about an old something). Unfortunately, these
generational spurts often provide very little in the way of true original
thinking, and much of what is behind the fanfare quickly reduces to little more
than the recasting of old 'brews' into new 'spells'. Perhaps a glimmer of true
original thought (a 'new-something') comes our way every two hundred years or
so. We are in luck! One of the greatest breakthroughs in science has been born
in the latter half of the last century and has made its way onto the scene
shrouded by questions surrounding how one should go about rethinking the Human
Brain/Mind--questions that have led to eventualities in Computer Programming,
Artificial Intelligence (AI), Language/Grammar, Symbolic-Rule Programs and
Connectionism.
Much
of what sits here in front of me, at my desk, can be attributed in one way or
another to this 'new-something', and whenever there is a new-something, whether
it be steam locomotives, transistors, or tampering with DNA, there's bound to
be an earful of debate and controversy. And so remnants of this debate have
edged their way ever so slowly onto the platform--from the likes of the
psychiatrist Warren McCulloch and mathematician Walter Pitts and their
pioneering work on early 'neuron-like' networks (leading to connectionism), to
the psychologist Donald Hebb (1940s-50s) (and his revolutionary notion of
'nerve learning' based on oscillatory frequency), to the seminal debates
between two great personalities in the AI field, Marvin Minsky and Frank
Rosenblatt (1950s-60s), to those in the realm of language, Noam Chomsky
(1960s-80s). More recently, the debates have taken on a vibrant life of their
own by the advances in computer technology. The most clearly articulated of
these recent debates has come to us by two leading figures in the research
group called Parallel Distributed Processing (PDP)--namely, Jay McClelland and Dave Rumelhart
(1980s).
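Hebb's 'nerve learning' idea is often summarized as 'cells that fire together, wire together'. A minimal sketch of the weight update, with an invented learning rate and binary activations standing in for real neural dynamics, might look like this:

```python
# Toy illustration of Hebb's principle: the weight between two units grows
# whenever both are active at the same time. The learning rate (eta) and the
# binary 0/1 activations are simplifying assumptions for illustration only.

def hebbian_update(w, pre, post, eta=0.1):
    """One Hebbian step: new weight = w + eta * pre * post."""
    return w + eta * pre * post

w = 0.0
# Present correlated activity: both units fire on every trial,
# so the connection between them strengthens with each presentation.
for _ in range(10):
    w = hebbian_update(w, pre=1, post=1)
print(w)  # the connection has strengthened from its initial value of 0
```

Note that when either unit is silent (`pre` or `post` is 0), the weight is left unchanged, which is the frequency-sensitive, bottom-up character the text attributes to Hebb.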
Most
recently, the debates have come to carry a portmanteau of claims--chief among
them is the claim that human brain function, and thus human computation, is not
analogous to (top-down) symbolic-based computers (from Chomsky 1980), but
rather, the brain and its functional computations should be considered on a par
with what we now know about (bottom-up) nerve functions and brain cell
activations (to Hebb 1940)--as one can see, our timetable has been inverted. In
other words, the paradigm shift here occurs the moment one rejects the computer
as an antiquated model of the brain (and language), and instead props up a
newer model of language and thinking based on older models of connections and
connectionism (as presently understood in neurological studies). In this vein,
it is fair to say that we should no longer view language as a mere gathering
and shaping of atomic particles or logical symbols--much like how one might view
the atomic nature of computer language as it is composed of a serial string of
0's and 1's--rationing out sub-parts of the structure in more-or-less equal
portions in hopes of arriving at a larger and more cohesive general frame of
language. It could be argued by connectionists that language is not only much
more fluid than what any strict rule-driven/symbolic function could provide,
but also that language requires a greater measure of freedom and flexibility at
the bottom end. Whereas rules originate top-down, it may likely turn out that
bottom-up processes better reflect what is actually going-on, at least in the
initial learning processes of language. (One nontrivial note here to remember
is that there is a fundamental and crucial difference between artificial
computer chips (AI) and living brain cells (neurons): the latter must secure
survival. There is no sense in the notion that silicon chips need to secure
survival, since there is no death of a chip. Cells are living organisms that
must somehow ensure their survival, and this survival apparatus, certainly for the
individual cell, must be organized in a bottom-up fashion). Along these lines,
much of what is coming out of West Coast schools-of-thought (connectionism)
affords the old school of Gestalt psychology a new lease on life. Some
connectionists find themselves talking-up the fact that language can't simply
be a cohesion of atoms put together in very elegant ways, but that some
'higher-order' of fluidness must exist. Human cognition is more fluid, more
context driven. In a token manner of speaking, Kohler might carry-on here about
mysterious magnetic fields which suddenly arise in the brain which pull
sub-particle visual stimuli together--any notion of a gestalt brain, of course,
has long been disputed (I think, and notwithstanding notions of a 'quantum
gravity brain' as advocated by the great mathematician Roger Penrose). However,
it should be noted that Gestalt psychology continues to pave a way for a
serious return in the contexts of connectionism. (In addition, as a historical
footnote, let's not forget that while Rosenblatt's work originated with visual
perception, it is now viewed that his work, if carried-out in today's climate
of connectionism, would have had potentially serious linguistic implications.)
And so
let us turn to language. With specific regards to grammar, the Word-Perception
Model of Rumelhart and McClelland (1981, 1986) has made a dramatic impact in
the field. Not only has it provided us with a new way of looking at potential brain
processing (a quantitative way of
looking with regards to weights of connections, thresholds, memory storage,
etc.), it also has made rather precise claims about what kinds of material (qualitative) would be difficult to process in such a model (the need
for hidden units regarding second-degree complex structures and paradigms, recursive
complexity and back-propagation, etc.). Clearly, when one can predict with a
fair amount of certainty where problems will be had, and then attempt to
account for the nature of the problem in terms of the model, then surely the
criterion of explanatory value is close to being met. For example, the now
conceded fact that 'hidden units' must be pre-installed (p.c. Jeff Elman, as
part of the innate apparatus) in order for the full complexity of language to
be processed via any PDP, I believe, speaks volumes to where we stand today in
explanatory value--in fact, hidden units have now become the main rallying cry
for those who argue for rule-based accounts of language (not to mention the
nativists among us. See the contentious debates between Marcus vs. Elman on
this matter).
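The point about hidden units can be made concrete with the textbook XOR case: a single threshold unit cannot compute XOR, while one hidden layer can. The weights below are hand-set for illustration; a trained network would arrive at equivalent ones, e.g., via back-propagation.

```python
# Why 'hidden units' matter: no single threshold unit can compute XOR, because
# XOR is not linearly separable, but one hidden layer suffices. The weights and
# thresholds here are hand-picked for illustration, not learned.

def step(x):
    """Hard threshold activation: fire (1) if input exceeds 0, else stay silent (0)."""
    return 1 if x > 0 else 0

def xor(a, b):
    h1 = step(a + b - 0.5)      # hidden unit 1 computes OR
    h2 = step(a + b - 1.5)      # hidden unit 2 computes AND
    return step(h1 - h2 - 0.5)  # output: OR but not AND, i.e. XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))
```

Removing either hidden unit breaks the computation, which is the sense in which the 'full complexity' of such paradigms demands internal representation beyond direct input-output associations.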
Finally,
the typical intransigence that often shapes and defines opposing views has
given way to a certain amount of movement leading to a partial compromise
between the two leading schools of thought--as called upon by Steven Pinker and
Alan Prince. Specifically speaking, Pinker & Prince's somewhat tentative
and partial acceptance of a connectionist model regarding only certain types of
lexical processes, if nothing else, has in turn buttressed their own
allegiances in the pursuit of upholding counter-claims against proponents for a
pure 'Single Mechanism Model' (strictly based on associative learning). And so
out of this twist of fate, a renewed and rejuvenated interest in rule-driven
processes has been gathering momentum in attempting to seek more narrowly
confined rule-based analogies for dealing with specific aspects of
language/grammar as a whole.
As
suggested by Newell in the quote above, long-standing dichotomies often provide
a variety of clever means to think about a wide range of topics. It goes
without saying that as a pedagogical device at least, students not only crave a
good debate, but more importantly, they often report that new material introduced
in the form of a debate procures a much higher level of understanding. Well,
this singular debate has been ongoing for centuries, masked under several
different labels: nature vs. nurture, innate vs. learned, hard-wired vs.
soft-wired abilities, instinct vs.
learning, genetics vs. environment, top-down vs. bottom-up strategies, and, as presented herein, the Single vs.
Dual Mechanism Model.
[1]. It is a fact that
children do not produce 'adult-like' utterances from the very beginning of
their multi-word speech. And so much of the debate ongoing in child first
language acquisition has been devoted to the nature and extent of 'What gets
missed out where'. Theory internal
measures have been spawned every which way in an effort to account for the lack of
apparent adult-like language in young children--Theories abound. Despite some
evidence that would seem to point to the contrary, more robust syntactic
theories from the outset continue to view the very young child as maintaining
an operative level of language closely bound to abstract knowledge of
grammatical categories (Pinker 1984, Hyams 1986, Radford 1990, Wexler 1996).
For instance, Pinker (1996) has described early language production in terms of
a first-order (general nativist) cognitive account--suggesting a processing
'bottleneck' effect which is attributed to limited high-scope memory to account
for the child's truncated syntax of Tense/Agr/Transitive errors (e.g., Her
want), and over-application Tense
errors (e.g., Does it rolls?).
Radford (1990) on the other hand, has maintained a second order (special
nativist) maturational account affecting syntactic complexity in order to
explain the same lack of adult-like speech. It should be noted that these two
nativist positions share a common bond in that they are reactions to much of
what they saw as flawed in the work of the 1970s--theories which
sought to account for such errors on a purely semantic level e.g., Bloom
(1975), Braine (1976) and to some extent Bowerman (1973). Steering away from
potentially non-nativist associative/semantic-based accounts to proper
syntactic-based accounts was viewed by most to be a timely paradigm
shift--acting as a safeguard against what might be construed as bad-science
Behaviorism (of the purely semantic kind). This shift brought us toward a more
accurate 'Nativist' stance swinging the Plato vs. Aristotle debate back to
Plato's side, at least for the time being (as witnessed in Chomsky's book
entitled 'Cartesian Linguistics')--a
move keeping in line with what was then coming down the pike in Chomskyan
linguistics. One thing that seems to have caught the imagination of
developmental linguists in recent years has been to question again the actual
infrastructure of the child-brain that produces this sort of immature grammar--namely,
a rejuvenated devotion has reappeared in the literature circumscribing new
understandings of age-old questions dealing with Theory of the Brain/Mind.
[2]. For
instance, proponents of Behavioral/Associationist Connectionism today (cf. Jeff
Elman, Kim Plunkett, Elizabeth Bates, among others) are more than ready to
relinquish the old Chomskyan perspective over special nativism ('special' in
that language is viewed as coming from an autonomous region in the brain,
unconnected to general cognition or other motor skill development, pace Piaget and vs. general nativism), and have rather shifted their locus on an
innateness hypothesis based not on natural language (per se) but rather on a type of innateness based on the
actual architecture itself that generates language (architecture meaning
brain/mind: viz., an innate Architecture, and not an innate Universal Grammar).
[3]. For
Chomsky, it was this autonomous Language Faculty (that he refers to as a
language organ) that allowed this innate language knowledge to thrive and
generate grammar. For the connectionist movement, it is the very architecture
itself that is of interest--the input/output language result being a mere
product of this perfected apparatus. So in brief, the debate over innateness has
taken on a whole new meaning--today, perhaps best illustrated by this more
narrow debate over General vs. Special Nativism. We shall forgo the meticulous details of specific
theories at hand and restrict ourselves to the rather prosaic observation that
the child's first (G)rammar (G1) is not at all coextensive with the adult
(T)arget grammar (Gt). Notwithstanding myriad accounts and explanations for
this, for the main of this paper, let it suffice to simply examine the idea
that the two grammars (child and adult)--and we do consider them as two
autonomous and separate grammars--must partake in some amount of Discontinuity: (G1 is not equal to Gt; indeed, G1 < Gt) and that
such a discontinuity must be stated as the null hypothesis tethered to
maturational/biological differences in the brain. Hence, G1 represents the
(B)rain at B1 (B2, B3 ... Bt), while Gt represents the brain at Bt.
[4]. Discontinuity
theories have at their disposal a very powerful weapon in fighting off
Continuity theories--whether it be language based, or biological based (noting
that for Chomsky, the study of Language, for all intents and purposes, reduces
to the study of biology). This great weapon is the natural occurrence of
maturational factors in learning. In fact, on a biological level, maturation is
taken to be the null hypothesis--whether it be e.g., the emergence and
consequent loss of baby teeth, to learning how to walk and talk, to the onset of
puberty. In much the same way, the adult's linguistic achievement can be attributed
to the onset of some kind of scheduled-learning timetable--for language, it's an
achievement mirroring a process in which the nature and level of syntactic
sophistication and its allocation is governed in accordance to how the brain,
at the given stage, is able to handle the input.
[5]. It
is common knowledge that (abstract) grammatical relations are frequently a
problem for language acquisition systems. Early reflection on this was made by
Brown when he discovered that one could not explain why some grammatical
morphemes were acquired later than others simply in terms of input. The question
was posed as follows: If all morphemes are equally presented in the ambient
input at roughly the same time--contrary to what might be believed, parents'
speech toward their children is seldom censored so as to bring about a reduced
mode of grammatical communication/comprehension--then, what might account for
the observed asymmetrical learning? Similarly, Pienemann (1985, 1988, 1989) has
made claims for a grammatical sequencing of second-language learning based on
complexity of morphology. This question led to early notions of a linguistic
maturational timetable, much like what Piaget would have talked about regarding
the child's staged-cognitive development--maturation being the only way to
address such a staged development. Likewise, a Chomskyan position would have it
that there must be something intervening in the child's (inner) brain/mind
(albeit not tied to cognition) that brings about the asymmetrical learning
since there's no change in the (outer) input. Well, one of the first
observations uncovered by Brown was that a child's linguistic stage-1 (with
a mean length of utterance (MLU) lower than 2) went without formal functional
grammar. Brown noted that an initial telegraphic stage of learning ensued
absent of abstract grammatical markers such as Inflection, Case and/or
Agreement.
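Brown's stages are conventionally indexed by MLU, the mean length of utterance measured in morphemes. A minimal sketch follows, assuming utterances arrive already segmented into morphemes; real MLU counting follows Brown's (1973) detailed conventions, which this simplification sets aside.

```python
# Minimal MLU (mean length of utterance) calculator. Each utterance is assumed
# to be pre-segmented into morphemes, a simplification of Brown's actual
# counting conventions. The sample corpus is invented for illustration.

def mlu(utterances):
    """Mean number of morphemes per utterance across a sample."""
    total = sum(len(u) for u in utterances)
    return total / len(utterances)

sample = [
    ["want", "cookie"],        # telegraphic: no Inflection, Case, or Agreement
    ["daddy", "go"],
    ["me", "want", "that"],
]
print(round(mlu(sample), 2))   # -> 2.33, i.e. a child near the stage-1 boundary
```

On such a measure, a child whose MLU sits below 2 would fall squarely inside the telegraphic stage-1 Brown describes.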
[6]. Constructivism
vs. Generativism: A Brief Summary
Constructivists' accounts assume that children's
grammatical knowledge initially consists of constructions based on high
frequency forms in the input. Their models assume polysemy in representation
since lexemes are viewed as being stored in a distributional network in order
to encode different meanings: sound-to-meaning links are therefore made based
on similar phonological to semantic distributions. Furthermore, it is their
general claim that such a correlation is strictly associative, and that it
holds between the quantity and quality of the exemplars obtained of particular
constructions with the constructions of more general schemes that underlie
language use. The constructivist model assumes a 'bottom-up' cognitive
scaffolding of language learning (somewhat akin to what Piaget had earlier
claimed regarding a cognitive underpinning to language development).
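The constructivist picture of frequency-driven form-meaning linking can be caricatured as follows. The 'scenes' and the raw-frequency linking rule below are invented for illustration and stand in for far richer distributional learning over phonological and semantic similarity.

```python
# Toy sketch of bottom-up, frequency-driven form-meaning linking: across
# exemplar "scenes", a form becomes linked to the candidate meaning it
# co-occurs with most often. Scenes and meanings are invented for illustration.

from collections import Counter, defaultdict

def learn_links(exemplars):
    """exemplars: list of (form, set-of-candidate-meanings) pairs."""
    counts = defaultdict(Counter)
    for form, meanings in exemplars:
        for m in meanings:
            counts[form][m] += 1   # associative strength = raw co-occurrence frequency
    # Link each form to its most frequent co-occurring meaning.
    return {form: c.most_common(1)[0][0] for form, c in counts.items()}

scenes = [
    ("doggy", {"DOG", "BALL"}),
    ("doggy", {"DOG", "CUP"}),
    ("ball",  {"BALL", "DOG"}),
    ("ball",  {"BALL", "CUP"}),
]
print(learn_links(scenes))  # 'doggy' -> 'DOG', 'ball' -> 'BALL'
```

Note that nothing in this procedure is rule-like or symbolic: the correct links simply fall out of the distribution of exemplars, which is precisely the constructivist claim at issue.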
Generativists'
accounts, on the other hand, differ from constructivist models in one very
simple respect--their models credit children (very early on in their speech
development) with tacit syntactic knowledge, unrelated in any way to frequency-based,
data-driven constructivist claims which define language as being tethered in
some way to cognition. Generativists in this sense draw on parameter-setting
mechanisms (as opposed to data-driven mechanisms) to account for language
growth. Generativists maintain two versions of a general language development
model; both versions speak to a more innatist (top-down) account of language
acquisition. The first version is represented herein as Wexler's O(ptional)
I(nfinitive) model (ibid). The OI
model credits children from the very earliest stages of development with
abstract knowledge of morphological inflection. According to OI accounts,
children have access to inflection. The fact that inflections may only
optionally project (at stage-1) speaks to matters of the specific feature
spell-outs of the phrasal projections (i.e., all inflectional phrases project;
it is rather the features pertaining to those phrases that may go un(der)specified and thus not
project). The second model, associated with Radford (Radford & Galasso, ibid.),
claims that children may initially produce some early inflection, but that
there is evidence the child may not be processing such attested inflection in a
truly syntactic way (children at this early stage may in fact be treating
inflections in a non-syntactic/derivational manner). In addition, the general
idea here is that a very early grammatical stage indeed exists in which one
finds no true syntactic processing in the child's speech (i.e., there is a
'No-Inflection' stage-1). What is of interest to us here regarding Radford's
'No Functional stage' model (Radford 1990) is that it readily overlaps with
constructivist claims for their stage-1 as well. Specifically, it has become
customary for constructivists to say that although they believe there is no
syntax at their early stage-1, children's grammar is protracted, and that those
'abstract rules' which underwrite syntax proper do eventually emerge at a later
stage in the course of the child's language development. Hence, it would seem
that Radford's version and the constructivist version might converge and agree
regarding the earliest stage of development: both models predict similar stages
of development (viz., a stage-1 void of any inflection). Though this concord of
predictions appears to hold empirically, the theoretical concerns are real and
continue to weigh heavily on the minds of linguists, undercutting any feeble
attempt to reconcile the two positions.
Constructivism,
and beyond. One
consequence of this style of learning was that children were considered to
learn by rote methods--associative means similar to what Skinner had earlier
advocated in Behaviorism. The tentative implication for a very early stage-1
was that children don't start learning language as a set of abstract rules of
logic (as Chomsky would have us believe in his notion of generative grammar);
rather, children first grapple with the linguistic input by gathering
data-driven patterns and constructing broad-range syntactic templates based on
distributional analyses of those patterns (a kind of first-order frequency
learning). Only later, say at a stage-2 of language acquisition, would children
start to employ Chomskyan-style rules to generate a target grammar (as a
consequence, see 'U-shape learning' discussed in §60). Benchmarks
of development thus followed: (i) recognition of patterns comes first (no
attested phonological/morpho-syntactic over-regularizations); (ii) abstraction
of the patterns comes after (attested phonological/morpho-syntactic
over-regularizations). Data-driven analogies fit well with recently proposed
computational models of syntactic acquisition, in which children initially form
syntactic templates on the basis of distributional analyses of linguistic input
(Cartwright & Brent 1997). Data-driven models trace their antecedents back to
the 1960s. For example, Bellugi (1967), Klima and Bellugi (1966), and Braine
(1963) initially allowed for a certain amount of formulaic misanalysis to enter
into the accounting of non-adult-like stage-1 structures. In
a contemporary about-face from much of what had been advocated in the
Parameter-theory of the 1980s, Rowland and Pine (2000), among others, have
returned to these 1960s approaches by similarly calling on bottom-up,
data-driven procedures in securing potential syntactic paradigms. In such
constructivist terms, children do not have any general (rule-driven) knowledge
of syntactic categories, at least not until they have acquired enough similar
templates from which they can abstract a general pattern. This model would
readily explain why over-regularizations tend not to occur very early on in
children's speech: if the stage in question employs no rules, then, by
definition, no over-regularizations of rules can occur. (It is suggested in
this context that the onset of over-regularization, as attested in the data,
indicates the later rule-based stage-2 of development.) What is meant by 'until
they have acquired enough similar templates' is that there may be a
frequency-based storage threshold at work that converts an overburdened
data-driven analysis into rule-based abstraction: i.e., a kind of Critical Mass
Hypothesis, which holds that an eventual rule-driven grammar requires a
quantitative 'tipping point' to be reached between (i) a precise number of
patterns and (ii) general abstraction over those patterns. Without a
compilation of data, no abstraction can be achieved: children must acquire a
sufficient number of exemplars before abstracting general patterns from them
can be productive. (See §§26, 27, 'Less is More hypothesis'.)
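The 'critical mass' tipping point just described can be caricatured in a few lines of code. This is a purely illustrative sketch, not any author's model: the class name, the threshold value, and the stem+'ed' pattern detector are all assumptions introduced here for exposition.

```python
# Hypothetical sketch of a 'critical mass' learner: it stores exemplars
# by rote until enough of them share one pattern, then abstracts a rule.
# THRESHOLD and the stem+'ed' check are illustrative assumptions.

THRESHOLD = 5  # assumed 'tipping point' of exemplars sharing a pattern

class CriticalMassLearner:
    def __init__(self):
        self.exemplars = {}   # stem -> past form, stored by rote
        self.rule = None      # abstracted affixation rule, once it emerges

    def hear(self, stem, past):
        self.exemplars[stem] = past
        # Count exemplars consistent with a stem+'ed' pattern.
        regular = [s for s, p in self.exemplars.items() if p == s + "ed"]
        if self.rule is None and len(regular) >= THRESHOLD:
            self.rule = lambda s: s + "ed"  # abstraction: variable stem + affix

    def produce(self, stem):
        if stem in self.exemplars:          # stage-1: retrieval only
            return self.exemplars[stem]
        if self.rule:                       # stage-2: productive rule
            return self.rule(stem)
        return None                         # no stored form, no rule yet

learner = CriticalMassLearner()
for v in ["walk", "jump", "play", "talk", "want"]:
    learner.hear(v, v + "ed")
print(learner.produce("wug"))  # 'wuged' -- novel stem, rule now productive
```

Before the threshold is reached the learner can only retrieve stored forms (no over-regularizations, by definition); once past it, the abstracted rule applies even to never-heard stems.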
[7]. For instance, Rowland & Pine (op. cit.) suggest that early
Subject-Auxiliary inversion errors such as *What he can ride in? (alongside
optional target structures showing correct inversion, What can he ride in?)
cannot be accounted for by a rule-driven theory--viz., if the child has access
to the rule, the theory would then have to explain why the child sometimes
applies the rule and sometimes fails to apply it. Rowland & Pine instead
suggest an alternative account: as a very early strategy for dealing with
complex grammar (e.g., Aux-inversion, Wh-fronting), children learn these
semi-grammatical slots as lexical chunks--a sort of lexicalized grammar--whereby
they establish formulaic word combinations: e.g., Wh-word + Auxiliary as
opposed to Auxiliary + Wh-word combinations. It was shown that error rate and
optionality (versus rule-driven mechanisms) correlated highly with high vs. low
frequency rates of certain combinations in the child's input. This early
non-rule-based strategy was then able to account for a vast array of the child
data--viz., the number of non-inverted vs. inverted Auxiliaries was
significantly higher at the initial stage-1 of development. As an example of a
non-rule-based account, they show that when inversions did occur, they
typically involved only a select few Wh-words, and not the entire class. Hyams
(1986, p. 85) somewhat agrees with such reduced structure when she asserts:
'By hypothesis, the modals (or Aux verbs) are unanalyzable during this period.'
[8]. Moreover, such claims strongly support Stromswold's (1990) statistical
data analyses, which demonstrate that children at a very early stage-1 might
not productively realize an utterance string containing [don't, can't] in e.g.,
I/me [don't] want, You [can't] play as the syntactic elements [{Aux} + clitic
{n't}]; rather, such strings were more limitedly realized as quasi-formulaic
representations of a negative element. In other words, the claim could be
extended to mean that for the child at this stage-1, the lexical items
don't/can't reduce to the one-to-one sound-meaning of not: e.g., Robin [don't]
[=no(t)] play with pens (Adam28), where the verbal inflection {-s} goes missing
since don't isn't analyzed as an Aux verb. (Though see Schütze (2001) for some
arguments against this position.) Likewise, Brown came to similar tentative
conclusions by recognizing that (i) verbal inflection seemed not to be
generalized across all verbs in the initial stages, and therefore that (ii)
children didn't really start with rules, but rather employed a strategy of
'lexical learning'. Early stage-1 inflected verbs might then be learned as
separate verbs (chunks), thus explaining the observable optionality--since, as
the story was then told, 'either you know a rule, and so you always apply it,
or you don't'. Optionality of verbal inflection was thus seen as two separate
processes of word acquisition in the brain: uninflected and inflected words
were stored as two different items in the lexicon. (See Bloom 1980 for
comments.) This notion of stage-1 learning via non-rule-based means implied
that the stage was formulaic, set up in such a way as to learn by associative
processes buttressed by frequency learning.
[9]. Having spelled out some of the issues surrounding Constructivism vs.
Generativism, one major question seems to prevail throughout: How might it be
possible to bridge the gap between associative/semantic relations and
abstract/formal categories? One way to address the question might be to
stipulate that whatever mechanism generativists cling to in their account of
syntactic development, proponents of a Converging Theories Model (based on the
Dual Mechanism Model) invoke a similar generativist stance: in accepting a
strong maturational perspective, we are able to take the best of both positions
(i.e., no other explanation need be posited outside of what remains the
traditional generative stance). What the converging theories model offers is a
middle-of-the-road theory which suggests that a maturational stage-1 of
development is universally maintained, irrespective of whether one adheres to a
generative or a constructivist stance. Theory-internal measures aside, a
universal biological account of brain development spreads equally across both
models.
The
Dual Mechanism Model
[10]. It has recently been hypothesized that the language faculty consists of
a dualistic modular structure made up of two basic components: (i) a Lexical
component, which has to do with formulating lexical entries (words), and (ii)
a Computational component, which is structured along the lines of algorithmic
logic (in the Chomskyan sense of being able to generate a rule-based grammar).
It is argued that these two very different modes of language processing
reflect the 'low-scope' (1st-order) vs. 'high-scope' (2nd-order) dichotomy
that all natural languages share. Low/high scope is described here in terms of
how and where certain aspects of language get processed in the brain (see also
section [§64] on brain studies). In addition to newly enhanced CT
brain-imaging devices, multidisciplinary data (e.g., linguistic, psychological
and biological) are starting to trickle in, providing evidence that a dual
mechanism is at work in processing language. Results of experiments indicate
that only a dual mechanism can account for the distinct processing differences
found between irregular inflected words (e.g., go>went, foot>feet) and regular
inflected words (e.g., stop>stopped, hand>hands). The former (lexical) process
seems to generate its structure from stored memory, taken from out of the
mental lexicon itself by mere associative means: these measures are roughly
akin to earlier Behaviorist ideas on frequency learning. The latter, regular
mode of generating structure is tethered to a Chomskyan paradigm of (regular)
rule-driven grammar--the more creative, productive aspect of language/grammar
generation. Such regular rules can be expressed as [Stem]+[affix]
representations, where the stem constitutes any variable word <X> (old or
novel) that fits within the proper (parts-of-speech) categorization. For
instance, using a simplified version of Aronoff's realization-pair format
(1994, as cited in Clahsen 2001, p. 11), the cited differences in parsing
between, e.g., (i) a regular [Stem + affix] (decomposed) construction vs. (ii)
an irregular copular 'Be' [Stem] (full-form) lexical item can be notated as
follows:
a. <[V,
3sg, pres, ind], X+s>
b. <[V,
3sg, pres, ind, BE], is>
The regular 3rd-Person/Singular/Present rule in (a) spells out the bracketed
functional INFLectional features of Tense/Agreement by adding the exponent 's'
to the base variable stem 'X'. The features in (b) likewise get spelled out;
but rather than in the form of an exponent, the features are built into the
lexeme 'BE' by the constant form is. Once the more specific, irregular rule is
activated, the default regular rule-based spell-out is blocked, preventing the
overgeneralization of *bes.
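The blocking relation between the two routes can be sketched informally in code. This is a minimal illustration under stated assumptions, not the authors' implementation; the dictionary of irregulars and the function name are invented for the example.

```python
# A minimal sketch (not Pinker's or Clahsen's implementation) of the dual
# mechanism: a stored lexical route for irregulars that, when it fires,
# blocks the default [Stem]+[affix] rule. The entries are illustrative.

IRREGULAR_3SG = {"be": "is", "have": "has"}  # lexical route: full forms in memory

def third_singular(stem):
    # Lexical route first: a stored irregular blocks the default rule,
    # preventing overgeneralizations like *bes.
    if stem in IRREGULAR_3SG:
        return IRREGULAR_3SG[stem]
    # Computational route: the default rule <[V, 3sg, pres, ind], X+s>.
    return stem + "s"

print(third_singular("be"))    # 'is'   (lexical route; *bes blocked)
print(third_singular("walk"))  # 'walks' (default rule)
print(third_singular("wug"))   # 'wugs'  (rule applies even to novel stems)
```

The key property is the ordering: lookup precedes computation, so a stored full form always wins, while the variable-based rule remains free to apply to any stem it has never seen.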
[11]. INFLection. Recent research conducted by Pinker (MIT) and Clahsen et al.
(Essex), among others, has shown that a dual learning mechanism might be at
work in the acquisition of a first language. The research first focuses on
terminology. It is said that there are two kinds of rules for Inflection: an
Inflection based on lexical rules, and an Inflection based on combinatory
rules. In short, the two types of rules are described as follows:
(i) Lexical Rules: Lexical rules (or lexical redundancy rules) are embedded in
the lexical items themselves ('bottom-up'). Lexical rules may be reduced to
simple sound rules somewhat akin to statistical learning; for instance,
associative regularities are built up from the sequencing of lexical
items--e.g., the sing>sang>sung / ring>rang>rung sequencing of an infix
(vowel-change) inflection (presented below).
(ii) True Rules: Word inflection of the former type (i.e., lexical rules) is
cited as an inflection not based on rules, but rather encoded in the very
lexical item itself. A True Rule (or affixation), on the other hand, is a
combinatory symbolic process based on variables ('top-down')--a creative
endeavor not bound by associative input. Whereas lexical-based inflections are
exclusively triggered by frequency and associative learning methods--i.e., they
are not prone to deliver the creative inflection of novel words--novel-word
inflection is generated (by default) once the true rule-based grammar is in
place. One simple example that Pinker and Clahsen give in illustrating
lexical/associative Inflection is the irregular-verb construction below:
[12]. Irregular Verb Constructions: The #ing>#ang>#ung paradigm

Table 1
a)  sing   >  sang   >  sung
b)  ring   >  rang   >  rung
c) *bring  > *brang  > *brung
This commonly made error in (12c) arises because the phonological rhyme
pattern #ing>#ang>#ung--as a quasi-past-tense infix (lexical-rule) form--is so
strong that it often overrides and outstrips the default regular (true-rule)
V+{ed} inflection for past tense. (Spanish offers many similar examples where
frequency affects the paradigm: children overgeneralize the regular
inflection, producing the incorrect *Romp-ido in place of the correct
irregular Roto (=broken).) (* marks ungrammatical structures.)
[13]. The erroneously over-generated patterns *bring>brang>brung (for English)
and *Romp-ido (for Spanish) are heavily based on statistical frequency
learning, in the sense that the sound sequences of other patterns (e.g.,
ring>rang>rung, and the infinitive verb class V-{er}, respectively) contribute
to the associative patterning (a frequency effect forming a sound-pattern
irregular rule in the former example and a default regular rule in the
latter). Recall that structured lexical/associative learning merely
generalizes, by analogy, to those novel words that are similar to existing
ones. Regular grammatical rules (true rules), on the other hand, based on
affixation, may apply across the board to any given (variable) syntactic
category, similar or otherwise. In one sense, the ultimate character of 'true
rules' is that which breaks the iconic representation of more primitive,
associative-based processes, whether neuropsychological or otherwise.
[14]. The fact that the over-generalized strings (bring>brang>brung) are not
found in the input demonstrates that some aspect of a rule is evoked
here--albeit a rule based on rhyme association, and thus not a 'pure rule' in
which true (non-associative) variables would be at work. In other words, the
lexical rules attributed to irregular formations are to be understood as a
form of associative pattern learning, and not as a true rule, since they are
associated with sound sequencing only. One crucial implication of an
Inflection generated by a true rule is that such inflection can easily be
applied to novel or unusual words: viz., words never before heard in the input
(contrary to the frequency learning of lexical rules discussed above--cf.
Brown 1957; Berko 1958).
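The rhyme-driven character of lexical rules can likewise be caricatured in code: an associative route that reaches only forms similar to its stored exemplars. The exemplar list and the crude rhyme check below are illustrative assumptions, not a claim about the actual psycholinguistic mechanism.

```python
# Hedged sketch of the lexical/associative route: past tenses are produced
# by analogy to stored rhyming exemplars, so 'bring' is pulled into the
# ing>ang pattern (*brang), while dissimilar novel words get no output at
# all. Exemplar set and similarity test are assumptions for illustration.

EXEMPLARS = [("sing", "sang"), ("ring", "rang"), ("spring", "sprang")]

def past_by_analogy(verb):
    for stem, past in EXEMPLARS:
        if verb.endswith(stem[-3:]):        # crude rhyme check: shared '-ing'
            return verb[:-3] + past[-3:]    # map the vowel change onto the rhyme
    return None                             # analogy only reaches similar forms

print(past_by_analogy("bring"))  # 'brang' -- over-generalized irregular
print(past_by_analogy("blick"))  # None -- no rhyming exemplar, no output
```

Contrast this with the default-rule sketch above: the analogical route is productive only within its similarity neighborhood, which is exactly why it yields *brang yet fails on dissimilar novel verbs where the true rule succeeds.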
[15]. Expanding
on previous studies which examined differences in priming effects between Derivational and Inflectional morphology,
Clahsen concludes that the difference in priming effects can only be accounted
for by a dual mechanism of learning--interpreting the data to show that high
priming effects were connected with productive inflectional forms not
listed in the mental lexicon, whereas low priming effects were connected to productive derivational forms
associated with stem entries.
[16]. With regard to German forms of pluralization, Clahsen et al. (p. 21)
note that the same argument can be made for a dual-mechanism process--viz.,
the high-priming regular (default) plural '-s' (auto-s) contrasts with the
low-priming irregular plural '-er' (kind-er). The raw findings here suggest
that certain irregular inflections in German (e.g., participle {-n}, plural
{-er}) might be stored in the lexicon as undecomposed chunks, and that these
two processes of storage are activated in very different places and manners in
the brain--viz., the finding that irregular inflections spawn reduced priming
as compared to regular inflection suggests that regular inflections are built
forms, based on rules containing variables, which make the basic unmarked
stem/root available for priming. It is clear from the table below that regular
inflected word forms such as {-t} participles and {-s} plurals produce full
priming and no word-form frequency effects. For irregular inflected forms such
as {-n} participles, {-er} plurals and (irregular) {-n} plurals, the opposite
pattern appears. The data suggest that irregular forms are stored as
undecomposed full forms--hence the emergence of full-form frequency effects.
Regular forms are captured by the full rule process and are stored in a
computational manner that works off of variable+stem algorithms--hence the
lack of full-form frequency effects. These differences in German morphology
seem to parallel what we find between English (i) Inflectional morphology and
(ii) Derivational morphology, where the former seeks out specific rule
formulations--e.g., V + {ed} = Past, or N + {s} = Plural, etc.--and where the
latter seeks out associative-style sound-to-meaning learning (as in irregular
verbs/nouns, e.g., go>went, tooth>teeth, etc.). Applying fMRI brain-imaging
techniques, a consensus has begun to emerge suggesting that derived stems +
suffixes (e.g., teach+{er}) may actually be processed as one single word chunk
in the otherwise lexical (word-recognition) temporal-lobe areas of the brain,
and not, as intuition would have us believe, as a dual segmented [stem +
suffix] structure which has undergone a process much like a morpho-syntactic
string. This may be an economical move keeping in line with the classic
one-sound-one-meaning association. In noting this, there seems to be a natural
tendency in the diachronic study of language to move from (i) rule-driven
Inflectional morphology--with more complex rule-driven infrastructures [+Comp]
(Comp=complex) giving way to less complex
[-Comp] structures--to (ii) association-driven Derivational morphology. This
tendency can easily be captured by looking at the way words have evolved over
time--e.g., Breakfast has evolved from a twin-morpheme structure [[Verb Break]
+ [Noun Fast]] (roughly /breɪk fæst/) to a single-morpheme chunk [Noun
Breakfast] (roughly /brɛkfɪst/).
Table 2. Summary of experimental effects (taken from Clahsen et al. 2001, p. 26)

Representation                      | Full priming effect? | Full-form frequency effect? | Source
------------------------------------|----------------------|-----------------------------|-------------------------------------------------
-t participles: ge[kauf]-t          | yes                  | no                          | Sonnenstuhl et al. (1999), Clahsen et al. (1997)
-s plurals: [auto]-s                | yes                  | no                          | Sonnenstuhl & Huth (2001), Clahsen et al. (1997)
-er plurals: [kinder]               | no                   | yes                         | Sonnenstuhl & Huth (2001), Clahsen et al. (1997)
-n participles: [gelogen]           | no                   | yes                         | Sonnenstuhl et al. (1999), Clahsen et al. (1997)
-n plurals I: [bauern]              | no                   | yes                         | Sonnenstuhl & Huth (2001)
-ung nominalizations: [[stift]ung]  | yes                  | yes                         | Clahsen et al. (2001)
diminutives: [[kind]chen]           | yes                  | yes                         | Clahsen et al. (2001)
-n plurals II: [[tasche]n]          | yes                  | yes                         | Sonnenstuhl & Huth (2001)
|
[17]. In sum, Pinker and Clahsen assume that the language faculty has a dual
architecture comprising (i) a combinatory rule-based component (leading to the
lack of full-form effects) and (ii) a structured non-rule-based lexicon
(leading to full-form effects). Questions on specifics will surface in the
following sections--namely: How are these two methods represented in the brain?
[18]. A Stage-1 Language Acquisition. There is a huge and ever-growing body of
data being tallied by developmental linguists which suggests that the brain of
a child matures in incremental ways which, among other things, are reflected
in the types of 'staged' language development the child produces at a given
maturational stage. The collected data suggest that children's early
multi-word speech demonstrates 'Low-Scope', lexically specific knowledge, and
not the abstract true-rule formulations attributed to grammar. This is
somewhat akin to Piagetian notions of language development (see general
nativism [§31] below), one difference being that it need not be tied here,
exclusively, to a cognitive apparatus. This maturational theory of language
development accounts for the lack of specific linguistic properties by
suggesting that the brain is not yet ready to conceptualize higher, more
abstract (High-Scope) forms of linguistic conceptualization.
[19]. The idea behind 'what gets missed out where' in child speech production
has given those linguists interested in morphology and syntax a particularly
good peek at how the inside of a child's brain might go about processing
linguistic information--and other information, for that matter. As stated
above, research initially carried out by Brown and his team (1973), working
under a Chomskyan paradigm of linguistic theory, and subsequent work by others
(cf. Radford), suggests that there is a stage-1 in language acquisition that
tightly constrains the child's speech to simple one-to-two-word utterances
with no productive forms of verb or noun inflection. One child who appears in
the early studies, Allison, provides transcripts between 16 and 19 months
showing no signs of the onset of formal inflectional grammar--only later,
close to two years of age (22-24 months), does inflectional grammar/syntax
emerge, and then only in what could be called a sporadic, optional manner.
[20]. This stage-1 is considered to be a grammatical stage with an MLUw (Mean
Length of Utterance in words) of 2 words or less. More specifically, the
apparent lack of formal grammar here shouldn't be confused with the idea of an
earlier a-grammatical stage well before the onset of multi-word speech.
(Surely, there can be no grammar or syntax to speak of if there are no
multi-word constructions.) This grammatical stage-1 therefore differs from the
notion of a one-word stage (MLU=1), where supposedly absolutely no
grammar/syntax is at work. The grammatical stage-1 is said to begin roughly
with the onset of multi-words at about the age of 18 months (+/-20%). It is
reasonable to suppose that such a stage would carry target semantic
meaning--even if the arbitrary 'one-to-one sound-to-meaning' relationship is
not of the target type (e.g., onomatopoeic forms /wuwu/=dog, /miau-miau/=cat,
etc.).
[21]. The above notions beg the question: At what point do we have evidence of
grammatical categorization? For example, the traditional distributional
criterion that defines the Noun class as the category which may follow
Determiners (a/the/many/my/one) may not be available to us if, say,
Determiners have yet to emerge. Hence, distributional evidence may be lacking
in such cases. One way around the dilemma has been to suggest that early
stage-1 grammar is categorical in nature simply owing to a default assumption
that categorization is part of the innate ability to acquire language (in
Chomskyan terms, part of the richly endowed LAD or Language Faculty) and that
words are both inherently categorical and semantic in nature. Pinker (1984)
claims that the categorization of early stage-1 words should be roughly pegged
to their inferred semantic properties. Radford (1990), in a slightly different
approach, prefers to consider such early multi-words at stage-1 as lexical in
the sense that (i) they have built-in default lexical categorization abilities
(forming classes of Nouns, Verbs, Adjectives, Adverbs, and Prepositions), but,
at the same time, (ii) rely heavily on their semantic-thematic properties. In
any event, either description starkly contrasts with a connectionist view
which claims that, e.g., the class 'subject' emerges through rote learning of
particular framed constructions: subject-hood is learned as a category via
rote associative learning of thematic relations. Now, it remains unclear to me
precisely how close such thematic links to category-hood get to Radford's 1990
interpretation. I would only venture to say that both views share the belief
that semantics holds the central cognitive underpinnings upon which syntax can
later be built.
[22]. This account of stage-1 has been labeled the Lexical-Thematic stage-1 in
language acquisition (Radford 1990). It is unclear how far Radford would like
to go in accepting his stage-1 as cognitively based: the label lexico-thematic
(the term thematic referring to argument structures pegged to semantics)
certainly permits some amount of semantics to leak into the discussion.
Nevertheless, Radford emphatically rejects the notion that a stage-1 syntax
could be exclusively based on semantics. It is here that Radford gets full
mileage out of his two-pronged converging Lexical-Thematic stage-1 grammar: a
stage-1 that is both--
(i) 'thematic', in the sense that it leans towards general nativism, since
simple utterance types at the earliest MLU get directly mapped onto their
thematic argument structure; while being,
(ii) 'lexical', in the sense that the child seems fully aware of dealing with
words based on lexical grammatical categories, and not semantic ones. This is
made apparent by how children know the morphological selectiveness of each
category (e.g., Noun, Verb) along with its inflectional distribution.
[23]. One argument against a semantically based stage-1 was that, from the
very beginning, children's productive multi-word speech (MLU=2+) yielded
Inflectional plural {+s} and gerund {+ing} endings--the first two morphemes to
be acquired according to Brown's morpho-sequencing list. These endings were
only attached to syntactic categorial word-classes: e.g., {s} to nouns, {ing}
to verbs, etc. There seemed to be no attempt by the young child to generalize
such inflections onto purely semantic categories. In other words, if
children's word classes at this stage-1 were thematic, rather than syntactic,
in nature, we would expect specific inflections to be distributed along
semantico-thematic lines: e.g., plural {s} to agents, gerund {ing} to action
words, etc. (Radford 1990, p. 41). Such findings are not reported in the data.
It was this absence of semantically based grammars which led to discussions
about possible a priori innate grammatical categories--a grammar based on a
syntax (without meaning) rather than a syntax based on semantics (meaning)
(cf. general vs. special nativism). Although it is indeed correct to suggest
that there seem to be no purely semantically based Inflections at stage-1, one
argument against this conclusion, and seemingly in support of a semantically
based stage-1, would be to suggest that, in fact, most utterances at this
stage are instances of formulaic constructions. Only at a later stage-2 would
we find instances of real productive inflection--viz., even though on the
surface inflection appears to be utilized at stage-1, the surface structure
merely mimics input-driven phonological patterns.
[24]. This 'mixed bag' of a grammatical stage is indeed an argument against
too strong a claim for a syntax-based model of early grammar (assuming that a
syntactic version holds as a buttress for Continuity). We shall take some
comfort in it, however, since this strong claim will be short-lived and
relegated to the very earliest of grammatical stages (MLU below 2). There is a
caveat here. One argument against interpreting from an absence of
evidence--namely, the observation that no inflection shows up on
argument-themes--might be the following: if our stage-1 were in fact
formulaic, and not rule-based, then there would indeed be no utterance of an
improper formulaic inflection attached to a semantic category, simply because
this would not have been available in the phonological input. Formulaic
constructions come out of the input in a highly regular manner--based on high
frequency and saliency--and churn out as formulaic, unanalyzable chunks. (See
§42 for an account of apparently correct parameterized word order found at an
otherwise non-parameterized stage of acquisition.)
[25]. The argument could run as follows. The fact that children at stage-1
never produce, e.g., the action-inflection '-ing' on semantically classed
action-words like *up-ing/down-ing/over-ing/on-ing, etc. merely indicates that
such strings are not part of the available input (particularly noteworthy
given that our stage-1 is semi-formulaic in nature). It will be argued that
the very earliest of stages (stage-1), addressed herein, is indeed the very
earliest stage of developmental grammar--what may even have been termed
a-grammatical in previous theories (viz., the one-word stage; cf. Atkinson,
1992; Radford, 1990; among others). Let it be known that I am all too ready to
acknowledge and agree that language is indeed built upon pure syntax at our
stage-2 of development (and not on semantics): the classic evidence for a
syntax-based language at the earliest stages has been taken from the child's
inflectional system at work on the basis of grammatical categories.
Notwithstanding early attempts to cast syntactic analyses onto early stages of
language, there have been attempts in the child language acquisition
literature to construct a dual model for stage-1 based on (i)
semantico-thematic relations on the one hand, and (ii) categorial syntax on
the other. This hybrid model has been considered a lexical-thematic stage-1 of
child language acquisition, where mere semantic properties tie together those
lexical syntactic categories void of any functional material (as related to
the functional categories IP & CP). The most fully articulated version of this
hybrid theory can be found in Radford (1990).
[26]. The question is then put to us in the following form: Is there any
evidence at the earliest phases of stage-1 (say MLU<2) that the child actually
analyzes strings as syntactic structures--as opposed to formulaic
speech-utterances (i) which may be tethered to a variety of gradient meanings,
and (ii) which may reduce to mere surface-level syntactic phenomena? In other
words, what may appear on the surface as syntax proper may in actuality simply
be a result of surface formulae learned, with no real tacit syntactic
knowledge represented. There seems to be little that hinges on the possible
alternatives:
If, on the one hand, we consider such semi-formulae as syntax proper--making
our stage-1 (MLU<2) a syntactic stage--then so be it. We are then forced to
reconcile our syntactic stage-1 with the one-word stage as previously thought,
and nothing is lost.
If, on the other hand, a lexical-thematic stage-1 involves itself with
bridging this narrowing gap between formula and syntax--then so be it. The
benefit gained by adopting this measure is that it allows us a nice continuity
bridge onto the later phases of stage-1 (MLU 2+).
[27]. One
interesting by-product of such a lexical-thematic stage-1 is that it doesn't
specify Word Order: word order being traditionally tied to functional
parameterization (see Travis, 1984; Atkinson, 1992; Tsimpli, 1992; and Galasso,
1999/2003). Coming on the heels of such semantic-based models of language
acquisition, claims have been made suggesting that a semantic stage-1
is caused by memory deficits. As part of a maturational timetable, the child
starts off with a very limited memory/attention span--this maturationally based
memory deficit triggers the more 'robust & primitive'
semantic-lexical level of language (since the lexical component is more
salient) to kick-start productive communication (see Newport's 'Less-is-More
Hypothesis', S. Felix's non-UG/cognitive approach to L2 learning, as well as J.
Elman's work on connectionism; for evolutionary accounts, see
Bickerton's proto-language, 1990).
Less-is-More Hypothesis. According to
Newport's 'Less-is-More' Hypothesis, a Radfordian style maturational
time-table--dividing our stage-1 from stage-2--would be linked to 'working
memory' deficits: Stage-1 starts with early limited memory and thus can solely
rely on the more primitive and robust rote-learned and formulaic structures.
(One needn't say that all possible structures at stage-1 are rote or
formula--let it suffice to say that the flavor of the stage suggests little if
any evidence for 'true-rule' formations or parameterizations, citing stage-1
variant Word Orders and null INFLections). This handicap of low memory actually
works as an advantage for the child in that it serves to constrain the
perceived input to basic degree-0 SV(X) structures--the structures are
ready-made by the lower-level cognitive processes and made available to the
stage-1 child. Lower-level memory seeks out idiomatic lexical-based categories
or lexical-based morphemes as opposed to functional, syntax-based
morphemes/categories (termed 'l-morphemes' vs. 'f-morphemes' respectively by
Pesetsky (1995), as understood in Distributed Morphology; see [§54]). (N.B.
Felix (1981) as well as Krashen claim that it is precisely this over-production
of the cognitive apparatus/high memory that makes second language learning so
fraught with difficulty--having to 'learn' language overtly instead of
'acquiring' it in a natural setting.)
[28]. We can
better frame arguments that claim for a cognitive/memory dependence for
language acquisition by addressing the very nature of syntax. First, syntax
requires much more in the way of computational memory. (Or perhaps the question
is better framed conversely--viz., more memory forces the computation to
reorganize itself by way of syntax.) The emergence of syntax coincides with the
onset of higher (quantity) amounts of language material--i.e., a higher number
of memorized words/strings leading to longer and a richer complexity of
sentences, etc. For instance, Degree-zero structures (say, basic SV sentences,
order irrelevant) require relatively little memorization, while, vice
versa, Degree-1 structures (embedding, binding, recursion) come at a much
higher memorization cost. Why is that? Well, in one manner of
speaking, the reason is self-serving: simply due to the fact that in order to
have a degree-1 sentence, the empirical (maturational) data dictates that a
child must have, at some prior time, gone through a degree-0 stage, a process
that mirrors memorization capacity. But more to the point, the reason for this
mental/computational juggling has to do with how our brains go about making the
most out of our limited memory capacity. The very nature of these high amounts
of material forces a shift in how the brain can process (parse) the material.
It is believed in the neuro-linguistic community that the shift here--both in
the quantity and quality of language--allows the already overburdened process
of rote-learning and memorization to be lifted, shifting the burden onto
rule-based processes (variables, categories, etc.). Such
rule-based learning frees up space in the lexical component of the brain (say,
the list of words stored) and allows new routes to be mapped. In other words,
such a huge volume of material forces new ways of organizing the input (hence,
categorization). In sum, the two-prong development as sketched out above might
proceed as follows:
(i) At
the Micro-Development level (stage-1) the data-stream is reduced for the child
in terms of its cognitive saliency: (the data-output is not changed, rather
it's the intervening deficiency of the child's mental processing that overall
affects these data). The child, working with a primary memory 'tool-kit',
admits only a small subset of the language input; this in turn allows the child to
deal with less data, enabling rote-learning to take place. (N.B. It
is generally acknowledged that any memory deficit or trauma resulting in
language attrition would first affect the more abstract levels of
language/syntax).
(ii) At
the Macro-Development level (Stage-2) the data stream is affected by the
upsurge in memorization that in turn expands what becomes salient for the
child. Perhaps having to do with the triggering of hidden units at the end of
stage-1, the child is now in a position to take the data and apply
paradigmatic structures--all of which lead to formal (stage-2) grammar. Thus, Macro-development
makes available more memory, which in turn spawns new ways of
handling the material--the initial process of stage-1 rote association and
memory is no longer adequate and syntax proper emerges as a way of handling
both the quantity and quality of this newfound material.
[29]. What
syntax allows the brain to do is categorize and form analogies based on the
vast amount of input, rather than to memorize and store all input as meaningful
chunks (with an associative sound-to-meaning relationship imposed). This
results ultimately in a finite array of neuro-linguistic networks in the brain.
Hence, in a basic input-output model--similar to what we understand to be
happening in behaviorist stimulus and response associative models--quantity of
input equates to quality of brain processing. As is evident, the classic enigma
(chicken and the egg scenario) remains: Is it this newly wired brain which now
seeks out the formations of paradigms and variable rules that is responsible
for the quantum leap of quality of language, or is it this quality leap in
language that somehow drives the changes in the brain? This is tantamount to
the classic Nature vs. Nurture debate. My hunch here is that (i) the nature of
the raw Data as it is (ii) tied to
cognitive processing may be the driving force behind any structural changes
that occur in the brain--in other words, language changed the brain and not the
other way around. (It may ultimately be impossible to separate the one from the
other). But this is only a hunch, and again, it reduces to the same catch-22
scenario: if it is the data that is the driving force behind the change, how do
we account for a protracted, maturational development? And secondly, surely how
the brain handles and processes the data must be part of the equation for any
theory that attempts to account for developmental stages of language. In a
certain sense, Newport's 'less-is-more' hypothesis simply restates this same
paradox. Regarding architecture and the nature vs. nurture debate, clearly all
linguists now suppose that some connection must be made between genes and
environment. Thus, a two-staged development follows:
(i) Stage-1
comes with low-level memory with strong correlates to semantics and
rote-learning. As a consequence, a one-to-one sound-to-meaning correspondence
ensues, explained by more prosaic economic constraints placed on cognition.
(ii)
Stage-2 comes with increased memory that (for reasons having to do with
processes of parsing, etc.) triggers high level categorization and syntax.
One-to-many/many-to-one relations are evoked triggering a highly rich
paradigmatic grammar.
[30]. Radford
(2000) more recently has gone on the record as saying that the Language Faculty
specifies a universal set of features--namely, that a child acquiring language
has to learn which subset of these features are assembled into the lexical
items as +universal (all other features awaiting parameterization via a
maturational timetable). The problem for the child is assembling the features
into lexical items. To a certain degree, the child needs to build up lexical
items one feature at a time (see Clahsen's Lexical Learning Hypothesis). Thus,
the issue for Radford is that there are innate architectural principles--loosely
referred to as an Innate Grammar Construction Algorithm--which determine how
lexical items project into syntactic structures. This raises the following
question: How much of this initial learning deficit cited for our lexical
stage-1 is owed to the child's protracted language development being
exclusively tied to a maturational based low-scope cognitive template--a
potentially semantic based template upon which later formal abstract categories
(such as functional categories) can be mapped? It is clear at least that more
abstract functional categories come on-line later in the course of development.
[31]. General
vs. Special Nativism. This
is a nice place to pause and examine the role that our lower-scope cognitive
processes might play in deciphering between Stage-1 vs. stage-2 grammar. In
brief, there are two schools of thinking on this, both of which could maintain
general ties to a Chomskyan paradigm. One school takes an evolutionary stance
(Pinker & Bloom) and basically claims that lexical learning leading to
grammaticalization is heavily based on what are preexisting cognitive
constraints (much in the manner of former Piagetian models of language
development). Such linguists would disagree with the notion that a special
module in the brain must exist in order for language to manifest. Recall,
Chomsky in his strongest claims suggests that the Language Faculty (LF) is an
independent autonomous organ found somewhere in the mind/brain (similar to say
the liver or the stomach) and that this LF organ shares very little in the way
of general cognitive processes--a language module all to its own and without
common lineages to other regions or modules of the brain. This notion is
referred to in the language acquisition literature as a Double Dissociation
Hypothesis (dissociation between formal language and cognition) (see Smith
and Tsimpli for some discussion). The second anti-Neo-Darwinian position
suggests that a special module in the brain is required for language, and that
language learning can be accounted for by reduced/non-cognitive means.
[32]. Regarding
the debate over General vs. Special
Nativism, it is still unclear how the
debate should be viewed. Much of the argument quickly degenerates into the
classic aforementioned 'chicken-and-the-egg' dilemma of being circular in
nature: e.g., (i) The Special Nativist claims that the child first needs syntax
to uncover the underlying semantics (syntactic-bootstrapping), while (ii) the
General Nativist insists that in order to properly construct a syntactic category
in the first place, general properties of (inherent) cognitive-semantics must
be observed (semantic-bootstrapping). (Interestingly, Chomsky's most recent work
on Minimalism suggests that there may be economy constraints on language
processing (from out of Logical Form). While it is still unclear how to
interpret the wide range of claims on the minimalist table, and Chomsky himself
often remains agnostic at these levels of inquiry, such economic constraints
could be interpreted as indeed not pertaining to consideration of pure syntax,
and rather adhering to more cognitive levels of processing: e.g., Minimalist
notions of shortest move, minimal amount of rules, and to a certain degree, the
objective essence behind the (PF) phonological form of language versus the
(LF) logical form, etc.). On the one hand, however, it seems to me that a dualist
approach to acquisition (as presented herein) would initially favor a first
order semantic-bootstrapping view, given that semantics seem to play an
essential role in language acquisition early on before the onset of syntax.
(There is no conclusion drawn here, as nothing argued in this paper hinges on
that debate).
[33]. Why--I
don't need any 'rules' to see this tree. My eyes work just fine. That is,
insofar as there exists a single tree. How is it that my 'tree' gets destroyed
once I move my head ever so slightly to the east and fall into view of a second
tree? The mystery of it all lies somewhere in the dismantling, between a single
torn branch of lifted foliage, that forces the rule--for how was I ever to know
that this second tree was indeed a tree after all?
Well,
the above passage makes for a nice analogy, but it merits a closer look. When I
look at this cup of coffee in front of me, reach out for it, and drink its
contents, it certainly appears to me that I do little more than what my own
cognitive abilities let me achieve--I don't perform any 'abstract rule'
formulations or procedures as such: although I do agree that one could possibly
uncover all of the aforementioned procedural content coming together such as
e.g., Gestalt psychology, visual cortex processing, contextual/meta-linguistic
background of say [+liquid] => drink => mouth, along with muscle motor
coordination that allows me to see into space reaching and holding the cup
without breaking the glass (etc.). In face of all this possible 'theory'
nonetheless, it remains somewhat natural for me to maintain the idea that when
I 'see' a tree, I just 'see' a tree (period). But much has come out of Gestalt
theory in the past (being somewhat reframed here in the present context of
connectionism) that suggests there may be something to this very natural notion
of just seeing after all. Gestalt theory on perception states that there are
first-order perceptions in which, say, a child might see a line or a slope in a
strict iconic representation of the visual field. No rules apply--and there is a
strict Stimulus and Response (S&R) equation involved. Regarding language
acquisition, this first-order representation could be illustrated by the early
onset of vowel recognition (i.e., environmental sound)--and not sound as
filtered through assimilation processes, etc. (as seen in the u-shaped model
[§61] below). At a later stage of perception, second-order perceptions allow
the child to break iconic mappings and allow lines, slopes, etc. to begin to be
seen (with less vividness) as e.g., a chair--now, a larger, somewhat more
generic unit, which embodies the lower level visual stimuli. It seems to be the
case that the role of second-order perceptions is to pull and frame larger
aspects of Objects and Events--in linguistic terms, forming Nouns (out of the
former) and Verbs (out of the latter). So regarding language, we should be
clear that by the time a child reaches the very first stages of language
development--where a child is said to begin producing single word
utterances--s/he has already moved from the first-order perceptual field into a
second-order field. So, the idea that children may have some means to rules,
perhaps bootstrapped from Gestalt psychology (the General Nativist Position)
may not be totally implausible. However, and more to our point, Newport's 'Less-is-More'
hypothesis just as well could be interpreted to fit Gestalt findings: when
memory/cognitive capacity is low, children see in a fixed iconic manner, and
when memory/cognitive capacity increases, the child reorganizes the visual
field and must begin to classify according to class--e.g., the child sees a
chair (second-order) as opposed to a chaotic string of lines and slopes, etc.
So, roughly, the theme throughout holds--memory/cognitive capacity drives
computational order. One way though to save our nice analogy is by pinning it
down (to a narrow application) to issues surrounding Lexical S&R
behaviorism vs. Functional rule-based grammar. Surely, the spirit of the
analogy is well taken. Yes, iff (if and only if) I ever saw one tree, I could
adhere and maintain an exclusive iconic S&R process; it is when I look and
see another tree that I must compare notes and begin to re-organize both visual
trees into a class of 'Tree' (using Plato's terminology). Again, Newport's
theme above holds in that too much information, in this case the second tree,
forces an adjustment in the computation--corresponding to our data-driven axiom.
In other words, on one basic and primitive level (order-1), visual transmission
is nothing more than sensory input directly stimulating the sensory cortex.
However, at a more abstract and functional level (order-2), perception is not
fully determined by sensory input, but is dependent on intervening processes of
Gestalt psychology. Hence, a dual mechanism account likewise credits a purely cognitive
behavior such as vision as having two distinct modes of processing--(i)
Bottom-up sensory-driven Transmission,
and (ii) Top-down context-driven Perception. These two approaches could map nicely onto our
analogous dichotomy between Skinner and S&R style learning vs. Chomsky's
rule-based symbolic style learning. So, our emerging linguistic schism
separating Derivational morphological processes from Inflectional processes may
not be a schism relegated to language per se, but may actually be operative in separating other
lower-level cognitive procedures as well.
Discontinuity:
A Lexical-Thematic stage-1
[34]. It is now
widely reported in the literature that children generally go through a stage
during which they optionally project Functional Categories: e.g., Determiner
Phrases (DP), Finite Main Verb Phrases which mark Tense (TP), Infinitive and
Agreement markings such as infinitive-'to' (IP), and 3person/present/singular
{s} (AgrP) (respectively). Wexler (1994) refers to this stage as the Optional Infinitive
stage. In more general terms regarding Inflection, Wexler's 'Optional
Infinitive' stage has recently been more accurately characterized as an
'Optional Inflection' stage (see Radford & Galasso 1998). More
importantly, a picture seems to emerge in the investigation of early child
speech that shows an even earlier stage of development--a stage in which the
overall deficit well exceeds any notion of Infinitive/Inflectional Optionality.
Mainly speaking, there seems to exist a stage-1 in the course of child language
acquisition--briefly peaking at around two years of age with MLUw well below 2.5
and then quickly falling off--which indicates 'No Inflection' whatsoever. What
might have been too hastily claimed a stage-1 in Wexler's terms, must now be
relegated to a stage-2 in Radford & Galasso's terms. It so happens that
Wexler has been a leading proponent of maturation-based theories of language
acquisition, so supporting arguments for Continuity based on the optionality
data as presented by Wexler and his colleagues don't get fair play. Likewise,
continuity-based interpretations of our own work here would most certainly be
solicited--but only if it were the case
that Wexler's stage-1 indeed simply equated to our stage-1. As it turns out, it
doesn't. Our stage-1 is much more systematically void of functional material.
In other words, Wexler's OI-model doesn't offer us a solid, fool-proof
discontinuity model. By definition, 'optionality' suggests that the child has
some working tacit competence of the adult target grammar--it may only be that
the performance level or mastery of such competence is lacking. Certainly, this
is a far cry from any possible notion of a 'strong discontinuity' theory.
[35]. One
Continuity argument could run as follows. Since the child at the earliest of
conceivable syntactic stages is already marking Inflection (albeit optionally),
then there is no justifiable reason to assume (even as the null hypothesis)
that the child's grammar is Discontinuous with the adult target grammar. (In this
sense, the dual mechanism apparatus has established itself from the get-go, and
thus, no child-to-adult discontinuity has to be assumed). The differences found
between the child's grammar and adult target grammar would not be significantly
real, in developmental language terms, and could be readily accounted for by a
variety of superficial means--e.g., saliency conditions, morphological
feature spell-out conditions, parameter mis-settings, phonological complexity,
and general immature cognitive factors bringing about memory deficits for such
non-salient phonological features.
Having said this, a very different scenario emerges if indeed our stage-1 is a
stage that precedes optionality by showing 'no inflection' whatsoever. In such
a scenario, a discontinuity hypothesis now seems to emerge as the null
hypothesis, as previously cited above, yielding to highly universal biological
considerations. My own data (a syntactically-coded naturalistic corpus of well
over 10,000 analyzable utterances) presented in the following sections, taken
from Radford & Galasso (1998) & Galasso (1999/2003), demonstrate this
two-prong stage of acquisition, the consequence of which will buttress our
calls for Discontinuity. (N.B. It goes without saying that, given a
'dual mechanism model' for adult language systems, any working theory claiming
a stage in which a child starts off with a truncated 'single model' (for
stage-1) would be tantamount to discontinuity).
The
Inflection {'s'}
[36]. In
examining the 'portmanteau' morpheme {s}, the data provide (prima facie) evidence of some relation between the acquisition of
Possessive {'s} and the Third person singular {s}. At our stage-1, there is no
evidence of the inflectional marker across the board; it is only with the onset
of our so-labeled 'Stage-2' (from age 3;2 onward) that we begin to see Wexler's
notion of optionality kick in. Table 3 below shows the relative frequency
of use of Poss(essive) {'s} and 3PSing {s} in obligatory contexts before and
after age 3;2:
Table 3. Occurrence of Inflection {s} in Obligatory Contexts

Age        3SgPres {s}     Poss {'s}
2;3-3;1    0/69 (0%)       0/118 (0%)
3;2-3;6    72/168 (43%)    14/60 (23%)
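For concreteness, the relative frequencies in Table 3 can be recomputed directly from the raw counts. This is a minimal arithmetic sketch (the variable and function names are mine, not the paper's): a rate is simply hits over obligatory contexts, rounded to whole percentage points.

```python
# A quick arithmetic check of Table 3: percentage = hits / obligatory
# contexts, rounded to whole points (raw counts taken from the table).

table3 = {
    "2;3-3;1": {"3SgPres {s}": (0, 69),   "Poss {'s}": (0, 118)},
    "3;2-3;6": {"3SgPres {s}": (72, 168), "Poss {'s}": (14, 60)},
}

def pct(hits: int, contexts: int) -> int:
    """Percentage of obligatory contexts in which the morpheme is supplied."""
    return round(100 * hits / contexts)

rates = {age: {m: pct(*c) for m, c in row.items()} for age, row in table3.items()}
# rates["3;2-3;6"] -> {"3SgPres {s}": 43, "Poss {'s}": 23}
```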
Token sentence examples of the two-staged data are presented below (respectively):

a). That Mommy car (2;6). No Daddy plane (2;8). Where Daddy bike? Batman (2;11, in reply to 'Whose it is'). It Daddy bike. No Baby bike (3;0).

b). Daddy's turn (3;2). It's the man's paper (3;4). It's big boy Nicolas's. It's Tony's. What's the girl's name? Where is Zoe's bottle? (3;6).

c). Baby have bottle (2;8). No Daddy have Babar (2;9). The car go (2;11). The other one work (3;0). Here come baby (3;1).

d). Yes, this works. This car works. My leg hurts. It rains. He comes (3;1-3;2).
Interestingly,
the data above suggest a potential parallel between the acquisition of third
person singular {s} and possessive {'s} (see Radford & Galasso for
discussion).
[37]. But more
importantly, they also suggest that whatever discontinuity is at work in the
child's grammar, it seems to manifest across the board in a systematic way. In
other words, the lack of inflection here is not category-specific, but
rather is realized across categories, affecting both DP and IP alike. It seems
that Poss and 3PS {s} at our
stage-1 both reflect general catastrophic agreement failure. Certainly, any
notion of a real child-to-adult discontinuity would want to be expressed in
such absolute terms--as opposed to any optionality-based theory, which might be
cornered into spinning what on the surface would appear as mere
non-mastery and under-specification of Continuity into arguments for real
Discontinuity. As was expressed above, both deficits could be captured by a
lack of Agreement--a functional property of adult grammar. Consider then the
phrase structure discontinuity of the two stages below:
Agreement Structure

Stage-1: [IP Mummy [-agr 0] car]
Stage-2: [IP Daddy [+agr 's] turn]
[38]. It is
argued herein that both possessive {'s} and third person {s} are reflexes of an
agreement relation between an inflectional head and its specifier--and any
omission reflects an agreement failure. The specific issue at hand here is that
only an absolute omission stage--as seen with our stage-1--would provide support
for true discontinuity. Any optionality here, e.g., [+/- agr] would play
directly into the hands of Continuity theories with the mere additional
disclaimer that the adult target grammar has indeed been acquired, but simply
not mastered. (See Wexler & Schütze (1996) for treatments of
under-specification of Agr as would be encountered in our stage-2 data).
Possessors
[39]. In a similar vein, we find additional support for a
non-target grammar in the wake of data showing Case errors, e.g., with
possessors (inter alia). The
assumption that children's possessive structures may be initially (i)
non-specified, and then later (ii) (optionally) underspecified with respect to
agreement also accounts for the wide array of case errors where children (at
stage-1) use the default case of objective possessor (me) and only later come to acquire the target Case of
possessor (my), etc. The use of
objective possessors e.g., (me)
has been reported for Dutch by Hoekstra & Jordens (1994), but not for
English. If we look at the earliest first person singular possessor structures
produced in the data, we find that objective me possessors predominate at ages 2;6-2;8, and that
Genitive possessors (prenominal my and pronominal mine) are
initially infrequent (with no cases reported for the use of nominative I for possessor):
Table 4. Occurrence of First Person Singular Possessors

Age        Objective "Me"   Genitive "My/Mine"
2;6-2;8    53/55 (96%)      2/55 (4%)
2;9        11/25 (44%)      14/25 (56%)
2;10       4/14 (29%)       10/14 (71%)
2;11       5/24 (21%)       19/24 (79%)
3;0        4/54 (7%)        50/54 (93%)
3;1-3;6    6/231 (3%)       225/231 (97%)
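The Table 4 counts can likewise be recomputed, and they locate the crossover point at which genitive my/mine first overtakes objective me. Again, a minimal sketch with my own (hypothetical) variable names; the counts are taken directly from the table.

```python
# Recomputing the genitive share from the raw Table 4 counts and locating
# the sample at which genitive 'my/mine' first overtakes objective 'me'.

table4 = [  # (age, objective-'me' hits, genitive hits, total contexts)
    ("2;6-2;8", 53, 2, 55),
    ("2;9", 11, 14, 25),
    ("2;10", 4, 10, 14),
    ("2;11", 5, 19, 24),
    ("3;0", 4, 50, 54),
    ("3;1-3;6", 6, 225, 231),
]

genitive_share = {age: round(100 * gen / tot) for age, _, gen, tot in table4}
# Genitive share rises monotonically: 4, 56, 71, 79, 93, 97 (%)

crossover = next(age for age, me, gen, _ in table4 if gen > me)
# crossover == "2;9" -- the first sample in which the genitive predominates
```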
a). That me car. Have me shoe. Me and Daddy (= Mine and Daddy's). Where me car? I want me bottle. (2;6-2;8)

b). I want me duck. That me chair. Where me car? No me, Daddy (= It isn't mine, daddy). Me pasta, Mine pasta, My pasta. It my key. It my (= It's mine). No book my.

c). It is my t.v. Where is my book? Where is my ball? Don't touch my bike. I want my key. It's my money. (3;0)
[40]. In terms
of the analysis outlined above, the data seem to suggest that the possessive
structures produced early on (= stage-1) are predominately not specified for
possessor-agreement, with agreement gradually being specified more and more
frequently (until it exceeds 90% mastery at age 3;0). While it is true that we
can't argue here for absolute non-agreement of Case at stage-1 (since for the
earliest file, age 2;6, we get at least two examples of correct my), this contrast in acquisition--as compared to what
we observed earlier regarding the agreement of the morphological inflection
{s}--may be a residual effect of the two types of agreement involved: it seems
to be the case that true morphological inflection should be the benchmark of
agreement, and not lexical equivalents, e.g., prenominal/pronominal my/mine (respectively), due to the fact that it is always more difficult to tease
lexical form apart from functional underlying structure and determine whether a lexical item is being
properly projected as a functional category, or whether merely the lexical 'shell'
is simply phonologically produced, 'rote-learned'. (Also see [§54]ff regarding
such distinctions placed between the two features as understood in
Distributed Morphology). The above examples could be expressed by the same
type of phrase structure presented below:
Agreement Structure

1. Stage-1: [IP me [-agr 0] car]
2. Stage-2 [showing +/- optional Agr(eement)]:
(i) [IP me [-agr 0] pasta]
(ii) [IP me/my/mine [+/-agr] pasta]
(iii) [IP my [+agr] pasta]
[41]. It could
be argued that for our stage-1, 'adult target' agreement (acting as a
functional and formal feature of language) is set to the default via a [-agr]
setting and so renders the possessor case objective. The close to 100% omission
of adult-like agreement provides additional support for a discontinuity theory
between child and adult grammars.
Word
Order
[42]. There may be a Dual Mechanism Model for target word order. Children's
acquisition of word order/syntax may involve:

(i) Data-driven learning: a 'slower' process by which general patterns are
induced from specific examples.

(ii) Parameter-setting: a faster approach, entailing the simple triggering of
the correct word-order parameter.

The major difference between parameter setting and data-driven learning lies in
the quantity of data required (with parameters requiring the least amount of
data and thus presumably coming on-line/being set the earliest).
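The difference in data requirements can be made concrete with a purely illustrative toy (my own construction, not a model from the paper or the literature): a parameter-setting learner fixes word order for all verbs from a single unambiguous datum, while an item-based learner stores order one verb at a time and cannot extend it to a novel verb.

```python
# Purely illustrative toy (hypothetical names): contrasting how much data
# each mechanism consumes before it can assign word order to a NOVEL verb.

corpus = [("want", "SVO"), ("hit", "SVO"), ("eat", "SVO"), ("push", "SVO")]

def parameter_learner(corpus):
    """A single unambiguous datum triggers the order for ALL verbs."""
    _, order = corpus[0]
    return (lambda verb: order), 1        # generalizes immediately; 1 datum used

def item_based_learner(corpus):
    """Low-scope learning: order is stored verb by verb (no verb class)."""
    frames = dict(corpus)                  # one stored frame per attested verb
    return (lambda verb: frames.get(verb)), len(corpus)

param_order, n_param = parameter_learner(corpus)
item_order, n_item = item_based_learner(corpus)

param_order("gorp")   # 'SVO' -- extends to a novel verb it has never heard
item_order("gorp")    # None  -- no generalization beyond attested verbs
```

The novel-verb probe mirrors the nonsense-verb experiments discussed below: only the parameterized learner generalizes beyond the verbs actually attested in its input.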
[43]. One
initial assertion that can be made regarding the possible early insensitivity
of verbs towards their appropriate position within a sentence has come from
early MLU data. For instance, many naturalistic studies of early language
development suggest that rather than generating structure via abstract
grammatical generalizations, children may actually be tethering their grammars
to individual lexical items with respect to functional elements: auxiliaries
(Kuczaj & Maratsos, 1993; Pine & Lieven, 1997); determiners (Pine &
Lieven, 1997); and pronouns (Pine & Baldwin, 1997). Data on early
verb/argument structure (see Radford 1990: pp. 213-17 for stage-1 examples)
suggest that early MLU verb classes may not adhere to appropriate SVO argument
structure in the sense that target transitive verbs take obligatory object
arguments. Radford cites very early two-word structures of the 'I/(me) want', 'Her hit'
type, where the direct objects whose presence is required in adult speech go missing.
Such deficits might suggest that children's initial knowledge of verb-argument
structure develops around individual verbs (and not verb types). In
addition, semantic over-extensions of intransitive verbs of the 'Me sleeped
teddy' type (= sleeped/slept > 'put to bed') may likewise occur on an individual-verb basis
(Tomasello, 1992), or individual frame-basis (Braine, 1976), but show little
evidence that the extension carries over to the entire verb class. In view of
these data, it remains questionable whether or not children's very earliest MLU
staged grammar operates with abstract, rule-based representations at all--e.g.,
[+/-] Verb Transitivity. More specific to English SV(X) word order, some
questions regarding rule-based word order parameterization for early MLU speech
have been formulated. Atkinson (1992) (following the work of Susan
Goldin-Meadow with deaf children and 'Home Sign') suggests that there may be no
theoretical reason to stipulate for a correct target word order at, say, a
pre-parameterized stage of development. If children have an inherent abstract
understanding of predicate-argument structure (cf. Valian, 1991), they should
then be able to understand the differences between the subject and object of a
transitive verb and how to apply this to word order.
[44]. Although
traditional naturalistic studies have typically shown that correct SVO word
order usually appears early on in the data (Brown, Cazden & Bellugi-Klima,
1968; Bloom, 1970; Brown, 1973; Radford, 1990), there is mounting literature
to suggest otherwise (e.g., Braine, 1971; Bowerman, 1973; Tsimpli, 1992;
Galasso, 1999/2003). Mixed word order data to this effect suggest that there
may be a very small window in the chronological development of language that
doesn't reflect target word order--i.e. a pre-SV(X) stage for English. In
addition, the fact that early child English seems to provide us with correct
word order recognition may be accounted for by means other than linguistic
motivation--e.g., non-linguistic (and perhaps cognitive-based) sequencing strategies
based on formulaic aspects of the input, etc. (cf. Atkinson, 1992). Recall that
the 'U-shape' learning discussed herein shows how possible surface similarities
may actually have very different underpinning structural realizations--e.g., (i)
went (formulaic) => (ii) go-ed (rule-based) => (iii) went (rule insertion). While went in (i) and (iii) look identical on the surface, they
are actually products of two very different processes. Other various studies on
novel/non-sense verbs similarly reveal a small window in the duration of staged
speech development that yields word order errors (Olguin & Tomasello, 1993;
Akhtar & Tomasello, 1997). The child's inability to generalize correct
word order to novel verbs suggests that word order, at this early MLU stage,
may be learned on a 'low-scope' memorization level one verb at a time rather
than via a rule-based 'high-scope' parameterization process. Thus, it remains
unclear whether or not children's very early MLU speech should be credited with
having rule-based processes/parameterizations for determining word order. If
not, a special nativist position could still be maintained in the sense that
functional parameterization has not yet taken place (cf. Atkinson, op. cit.). In
light of a potential stage-1 non-parameterization account for free word order,
strong arguments could be devised suggesting that instances of free word order,
in fact, demonstrate the early onset of abstract rules (albeit via a non
setting)--if we take 'rules' here to mean the setting (or non-setting) of
parameters. Such arguments would counter the general claims being made (cf.
Tomasello, Rowland and Pine (ibid)) that stage-1 is more or less entirely
rote-learned. The fact that we do find word order errors may in fact call for
some level of formal rule abstraction (and not rote memorization)--much in the
counter intuitive manner of the U-shape learning model discussed herein. In
other words, if stage-1 word order is distributional, we would predict that
word order errors are few and far between at stage-1. However, a
pre-parameterized stage-1 would, by definition, want to show potential word
order errors. (See Data in [§46] below).
[45]. Keeping
to the spirit of Chomsky's Minimalist Program regarding Word Order, we would
like to maintain Richard Kayne's proposal that word order is indeed a universal
hierarchical property of a Spec>Head>Comp relation. One could perhaps go
as far as to make the very strong claim that SVO mirrors cognition, and thus a
universal order of Subject-Verb-Object is innately given. In any event, Kayne's
universal constraint is seen as keeping to the spirit of Chomsky's innateness
hypothesis, and so we'll take it as the null hypothesis here and see where we
go with it. However, we can only possibly adhere to it insofar as the empirical
data bear it out--and it is here that we instantly run into some difficulties.
Mainly speaking, if we want to maintain a universal SVO order, we therefore
must do so at that stage of development where the child in fact has access to Double Argument String structures (DAS). For instance, a prior Single Argument
String stage (=SAS) would have no way
of showing the appropriate Spec and Comp distributions. Well, when looking at a
good cross section of child acquisition data, it appears that there is no
strong evidence pointing to an exclusive SAS stage--(without some small amount
of DASs interceding). While this may be the case, a stage is evidenced in the
data where at least the majority of utterances are indeed not only SASs, but
that such SASs show variable word orders amongst the Subject/Object and
Verb--rendering SV, VS, OV, VO orders. It is at this juncture that we have to
weaken Kayne's strong universal claim for an SVO order as correlated to his Linear Correspondence Axiom (LCA), and say that such an axiom only holds for a
child at (DAS) stage-2 of development--again, a stage roughly corresponding with
the (albeit optional) emergence of abstract rule formulations and functional
categories, both of which lead to Parameterization. So in one full sweep, what we
have done is somewhat preserve Chomsky's original version of a word order based
on Functional Parameterization (pace
Kayne's strong stance for a non-parameterized word order based on his universal
LCA) and have added a further Kaynian stipulation by saying that LCA may only
work, rendering all structures as base generated SVO orders only after a
pre-cursor parameterization has taken place positioning the Object either
Leftward or Rightward of the Verb--now providing two basic universal orders: SVO
and (the mirror image) OVS. (Of course, the latter order is very rare as a
base-generated order, though some have claimed Japanese as an OVS base order
which then, via subject movement, derives an SOV order (fn).) In any event, Kayne
is explicit in stating that his Head Medial Principle (stipulating that a Head/Verb must remain in medial
position, one of the tenets of his axiom) would conceivably permit the four
word orders above to be accessed by a child at an SAS stage-1.
[46]. Looking
at the data (Galasso: 1999/2003), we indeed find a strong correlation between
SAS strings and mixed word order alongside DAS strings and fixed order.
Table 5: Word Order

Files 8-16 (Age 2;4-2;8)    SAS: SV    SAS: VS    DAS: SVX    DAS: Other
n =                            87         78        290           5
Some token examples include:
(a) SV: Daddy cooking. Him go.
(b) OV: Dog kick (= I kick dog). A egg cook (= I cook egg).
(c) VS: Open me (= I open). Work bike (= Bike works).
In
terms of structure, before the onset of DASs, a Proto XP could be assigned
to our SAS stage, providing the variable word orderings:
[47]. In
addition to general word order variability, Wh-word order patterns emerge in
our early files (age 2;4-3;0) showing semi-formulaic consistencies when
examined in light of the general acquisition of complex structure--as mentioned
above regarding SAS vs. DAS complexity. Our data evidence a pattern showing Non
CSV (Non Comp Subject Verb) ordering which could be interpreted as formulaic in
nature. This stage roughly overlaps with our SAS stage mentioned above. Like
Kayne on Word order, Cinque (1990) has formulated a strong universal position
claiming that all Wh-elements universally position within the Spec-of-CP.
Recall that CP is a functional category that should have a delayed onset time
under any maturational theory (cf. Radford: 1990). Here too we need to weaken
the strong position by adding the stipulation that in order for this Spec-CP
analysis to hold, the subject must simultaneously surface, forcing the
Wh-element to raise and be positioned in Spec-CP. Otherwise, very early (stage-1)
Wh-arguments (e.g., What, Who)
seemingly get misanalyzed as base-generated 3rd-Person
Pronoun/Quantifiers placed in superficial subject Spec-VP position. This
miscategorization often results in Agreement errors where the Wh-word, seen as
incorrectly taking the thematic-role of the subject, agrees (by default) with
the verb. Consider the two CP structures below:
Table 6: Wh-Word Order

                     Non CSV    Wh Spec-CP (CSV)
Files 1-21   n =        78             0
Files 22-25  n =       120            80
[48]. In
sum, arguments could be devised suggesting that early Wh-structures are prime
examples of semi-formulaic strings base-generated (VP in-situ). A later second stage (or even overlapping stage)
may thus be seen as converting formulaic processes into rule driven processes
whereby syntactic manifestations of Wh-movement occur with or without Auxiliary
inversions. (See Stromswold (§6) above for Non-Aux inversions). Regarding
formulaicy, Pine & Lieven (1997), Pine et al. (1998) claim that a non-rule based account is what
is behind the formation of early correct wh-questions (a U-shape learning take
on the data). While adopting a constructivist account in explaining the high
rate of correctly inverted Wh + Aux combinations, they go on to predict that
correctly inverted questions in the child's stage-1 data would be produced by
those wh + aux combinations that had occurred with high frequency in the
child's input. They go on to specify that there is evidence that the earliest
wh-questions produced with an Aux. can be explained with reference to three
formulaic patterns that begin with a limited range of wh-word + aux
combinations (e.g., "whydon't" you/she) (Rowland & Pine, 2000). Such findings on early formulaic
structures parallel what Tomasello (1992) and Newport (op. cit.)
suggest regarding an initial stage-1 that reflects a processing deficit tied to
functional grammar. In other words, child stage-1 processing which shows a bias
toward the modeled high frequency lexical input (vs. rule driven analogy) may arise due to constraints
imposed by the low-memory bottleneck of distributional learning (Braine 1987,
1988).
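The distributional account sketched above can be illustrated with a toy frequency learner. The corpus, the two-word frame size, and the frequency threshold below are all illustrative assumptions, not the authors' data or model:

```python
from collections import Counter

# Hypothetical caregiver input: wh-questions the child hears.
# This corpus and the threshold are illustrative assumptions.
input_questions = [
    "why don't you eat", "why don't you sit", "why don't she come",
    "what are you doing", "what are you making",
    "where is the ball",
]

# Tally wh-word + aux combinations (the first two words of each question).
frames = Counter(" ".join(q.split()[:2]) for q in input_questions)

# A purely distributional stage-1 learner reproduces only the frames it
# has registered with high frequency -- no rule of Aux-inversion at all.
THRESHOLD = 2
productive_frames = {f for f, n in frames.items() if n >= THRESHOLD}

print(sorted(productive_frames))  # -> ['what are', "why don't"]
```

On this sketch, correctly 'inverted' questions at stage-1 fall out of rote frequency alone, which is exactly why they carry no evidence for a rule of inversion.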
Lexical
Stage-1: A Recap
[49]. In light
of the above data, and the collections of data elsewhere, it could be argued
for our stage-1 that the child's utterances involve pure projections of
thematic argument relations. In Minimalist terms, the operation 'Merge' would
directly reflect thematic properties and that this operation is innately given
by the Language Faculty: Verbs directly theta-mark their arguments as in
predicate logic expressions:
Table 7: Argument/Predicate Structure

Token Utterance:    (d)addy work    (m)ommy see daddy
Predicate Logic:    work(d)         see(m,d)
The above Word Order/Syntax includes (SV) and (SVO)
patterns and is structured below:
[vP [N Dad] [[v0] [VP [V work]]]]   [vP [N Mom] [[v0] [VP [V see] [N dad]]]]
(vP = light-verb Phrase)
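As a toy illustration of 'pure merger' as direct theta-marking, the two token utterances might be rendered as predicate-argument pairings; the tuple representation below is an expository assumption, not a formal proposal:

```python
# A minimal sketch of stage-1 'pure Merge' as direct theta-marking:
# a verb is paired with its theta-marked arguments, with no Tense or
# Agreement features anywhere in the representation.

def merge(predicate, *arguments):
    """Pair a predicate with its theta-marked arguments."""
    return (predicate, arguments)

# Token utterances from Table 7, as predicate-logic expressions.
daddy_work = merge("work", "daddy")         # work(d)
mommy_see = merge("see", "mommy", "daddy")  # see(m, d)

def render(expr):
    """Spell out the expression in predicate-logic notation."""
    pred, args = expr
    return f"{pred}({', '.join(a[0] for a in args)})"

print(render(daddy_work))  # -> work(d)
print(render(mommy_see))   # -> see(m, d)
```

Nothing in the representation forces agreement or case: exactly the stage-1 picture in which only theta-marked lexical items enter the derivation.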
[50]. In both
examples above, the Nouns (Daddy
& Mommy) contain no formal
features (such as person or case) and so don't agree with the verb. The verb
likewise carries no Tense or Agreement features. In this sense, theta-marking
directly maps onto the semantics of lexical word classes--viz., 'pure merger'
involves only theta-marked lexical items. It is therefore claimed that there is
no Indirect theta-marking capacity at stage-1 such that oblique or
prepositional markers would enter into the syntax: for example, the PP 'to
work' in Daddy goes to work, would
be thematically reduced in the operation Merge as Daddy go work (work
= Noun, not infinitive verb). Such utterances are widespread at our stage-1
as was revealed in the section above. In addition to seemingly direct thematic
based syntax/grammar, numerous other studies have shown that, indeed, children
inappropriately overextend semantic (causative) alternations of verbs such as giggle
vs. tickle by indiscriminately giving them identical thematic
argument structures (Thematic role 'Patient') in their intransitive forms: e.g., don't
giggle me! vs. don't tickle me! (Bowerman, 1973). If we wish to make claims that such
overgeneralizations are a result of some innate linking rule, then clearly some
sort of default semantic-based linking rule must be up for discussion. In any
event, the lack of non-semantic [-Interpretable] formal features certainly
dispels the notion of syntax and leads us to look at such early stage-1 lexical
items as being stripped of their formal features, and projecting quasi-semantic
information on a class of their own--perhaps to the point that each lexical item
is learned and projected in isolation.
[51]. In
conjunction with an isolative lexicon, and much in the same spirit as Pine et
al. above, Morris et al. (ms 1999) have sketched out a theoretical proposal
(based on PDP-style connectionism) that relegates verb-argument structures in
children's stage-1 grammar to individual 'mini-grammars'--that is, each word is
learned ('bottom-up') in isolation in that there are no overarching
abstractions ('top-down') that link one verb's argument structure to another. In
other words, there are no argument rules, only isolated word meanings--each
argument structure is a separate grammar unto itself (p. 6). It is only at stage-2 that the child
is seen as carrying the semantics as well as the syntax over from one word
to another. For example, the verbs eat and drink, hit and kick,
etc. will merge at stage-2 in ways that will project this overarching abstract
structure regarding transitivity, thematic structure, etc. Hence, stage-2 is
defined as the benchmark for the emergence of true syntax and rule formation.
[52]. In sum,
what the above sketch has to offer us is the proposal that children start off
(stage-1) with rote-learned items and then strive to find commonalities--the
child then builds up this lexicon from brute memory and only later (stage-2)
does she slowly start to form levels of abstraction. The claim is that children
learn grammatical relations over time--the bottom-up processes mimic the
maturational processes behind language acquisition (viz., first a stage-1
'bottom-up' lexical learning followed by a stage-2 'top-down' rule formation).
[53]. The idea that formal features along with their
respective feature complexity drive the protracted maturation of child language
acquisition has recently been addressed by Radford (2000). The notion is that
children acquire language incrementally based on each feature's complexity. For
instance, we might hypothesize that the internal complexity of the Agreement
feature [PER(son)] might be more complex than the internal conceptual
complexity of the feature [DEF(initeness)] since DEF may contain some amount of
cognitive semantics. In sum, Radford makes use of the syntactic labeling [+/-
Interp(retable)] features as a mechanism to account for the dual stage
development in children's speech. The +Interp features co-exist alongside
lexical categories while the -Interp features co-exist alongside functional
categories (The twin benchmark of lexical stage-1 vs. a functional stage-2 of
child language development remains upheld, though now with an assigned new
twist having to do with the respective categories' feature complexity).
[54]. Distributed
Morphology. A second but
similar line of reasoning, likewise motivated by outcomes in Chomsky's
Minimalist Program (see Marantz 1995), calls for morphology to be the all-encompassing
aspect of grammar--doing away altogether with the lexicon as
maintained under so-called 'lexicalist hypotheses', as well as dispensing, to a
certain degree, with traditional notions of syntax that sought to derive a
syntactic model outside of the lexicon in a seemingly top-down manner. The
theory's basic core calls on a number of assumptions: viz., (i) that syntactic
hierarchical structures 'resonate all the way down to the word' (or perhaps
more accurately described 'as being essentially derived from the word'); (ii) that
the notion of 'word' is broken up into two properties-- the word shell of
phonology, (or as it is termed in DM, the Idiom), and the word's selectional
morphological features. The distinctions are articulated in terms of morphology
by the following labeling: the 'l'-morpheme--which pertains to the idiom aspect
of the sound-meaning relation--and the 'f'-morpheme--which correlates to the
abstract morphological features. These two labels may be seen as correlating to
Radford's usage of +/-Interpretable features where the [+Interp] feature
distinction pertains to lexical item's semantic properties (part of which would
be the Idiomatic aspect of the word as used in DM, along with its phonological
make-up (i.e., 'l'-morpheme), and where [-Interp] would correlate to the more
formal and abstract syntactic properties (i.e., the 'f'-morpheme). The
two-prong theory today is seen as part and parcel of a formal language system.
Traditional parts of speech such as 'Noun' are redefined as a bundle of
features that make up a single l-morpheme type (called Root). The Noun root or
'l'-morpheme is defined by how the root entertains certain local relations or
governing conditions which it imposes on its complement hosts--e.g., how the
Noun root might c-command or license its Determiner (in a local Specifier
position) or a Verb (in a local Complement position). A classic example here
would be how the same lexical item Destroy appears as a 'noun' Destruc(tion) when its nearest adjacent licenser is a Determiner (The
destruction), or how the item takes
on the role of a verb when its nearest adjacent licensers are Tense/Agreement
and Aspect (Destroy-(s), (is) destroy-ing, (have) destroy-ed) (marking Tense, Progressive and Perfect Participle respectively). This
model now places the burden of syntax not with exterior stipulations, but
rather with interior conditions that seem to flow upward from the lexical
item itself and into the relevant projecting phrase. In this new definition
(taken right out of MP, 'Bare Phrase Structure'), the 'phrase' is reorganized
as simply the sum of the total interacting 'f'-morpheme parts; the 'word' is
thus redefined as nothing more than a 'bundle-of-features' that projects out of
the phonological shell. This new analysis will hold a number of consequences
for how we come to understand language acquisition. For starters, much of what
is being spelled out here concerns a two-stage acquisition of language
development and that this dual stage can be accounted for by the dual mechanism
model as advanced in this paper. What I am on about here can be summarized as
follows regarding language acquisition:
(i) Syntax,
as understood in Chomsky's Pre-Minimalist's terms, may for all intents and
purposes reduce to specific bundle-of-features that are encoded in
'parts-of-speech' words, (rendering a seemingly bottom-up learning mechanism
where 'meaning' governs not only how words are learned, but how their syntactic
properties project).
(ii) Syntax
may no longer be considered as a top-down generator of sentence types, and so
words have the capacity to emerge in an early stage of language merely encoded
with 'l'-morphology or [+Interp] features. In this way, one may be able to
define an early stage-1 word as exhibiting more or less only the phonological
shell of the word void of its otherwise embedded syntax. If this is indeed the
case, a viable maturational story can likewise hold for the onsets of
'f'-morphology [-Interp] features for the given word. Much in the manner of
Roger Brown's observation leading to a sequence of morphological development (starting
with -ing and ending with the Aux.
Clitic etc.), a similar story could
likewise hold regarding how certain features mature and then merge in a word--a
maturation of features however which would not delay the onset of the word in
phonological terms (or 'l'-morpheme values), but would only delay the relevant
selectional properties (or 'f'-morpheme values, etc.) associated with its
functional grammar. (See Galasso 2003 Ch.5 for analyses of how early
DP-projections (without IP material) may take on a default +DEF status empty of
any other functional features.)
The
twin notions above would ultimately buttress any theory which would see
language development as a maturational interplay of features--as captured herein
with our discussion of a Converging Theories Model.
[55]. A
typical Chomskyan syntactic tree asserts that functional features (individual
features having to do with M(ood), T(ense) and Agr(eement)) are assumed to be
projected in a top-down way: these functional features are understood to be what
is behind the notion of movement--lexical items move up the tree in order to
acquire and check-off these features. The following question certainly could be
formulated in Chomskyan terms: 'why can't lexical items have such features
embedded in their sub-categorical entries, and if they can, what then would
motivate movement other than some ad hoc stipulation requiring features to be checked-off in an overall top-down
environment'? Consider the tree below (reduced, showing only M & T/Agr
features):
The
tree above positions the T/Agr features, along with their specific phrases, as
having a top-down representation. If such a tree is completely available
early-on in language acquisition--as the Continuity view would maintain--then
there should be no reason why a child would exhibit 100% omission of say a
top-down Agr feature in the way that would affect only certain words and not
others. (When only certain words show individual residual effects, e.g.,
regarding subcategorization, syntax etc., then a strong claim can be made that
the overarching phrase structure is not what is behind the phenomenon, but rather
specific lexical-parameterizations may be involved.) (See Janet Fodor 1997,
Baker 2002, Borer 2003 for a seemingly bottom-up treatment of lexical
parameterization). In other words, if the structure is in place (from top-down)
to deliver the feature of Agr (as with Case), then it would be hard to explain
away the fact, if observed in the data, that some words could maintain Case
while others (which should maintain Case in the target language) do not. Guasti
and Rizzi (2001) say: 'When a feature is not checked in the overt syntax, UG
makes it possible to leave its morphological realization fluctuating'. Fine.
But, this is seemingly a bottom-up problem. It seems that such optionality
would have nothing to do with the phrase (per se). What do we say when the feature itself (as
projected from the tree top-down) seems to select some words over others
regarding inflection? Surely, if this is a top-down venture, then the features
should project onto all verbs (for the appropriate phrase), and not just a
select few. But this is in fact what we find at our stage-2 of language
development--some words may (optionally) inflect/project the specific feature
while others completely by-pass it.
[56]. For
example, data taken from Radford and Galasso (above) show that while the
Genitive feature may project at stage-2 of development, it does not project
over the full class of Possessive words. In other words, features seem to come
on-line in increments as they are dependent on their lexical host (a sort of
bottom up lexical learning hypothesis). For instance, at the early part of
stage-2, lexical possessors such as His and My get acquired
before their inflectional possessor counterparts such as Daddy's and Man's.
If the feature attributed to both forms of Possessive structures were of a
common stock (top-down), then the disparity of development would be hard to explain.
[57]. This
gives us the flavor of specific words (and not word classes) taking on
functional features (bottom-up). The question here is how one maintains the
higher-order structure of functional grammar, originating from the
two upper layers of the tree, while the functional projection selects only
a handful of words. One way around the dilemma may be to suggest that
the lexical word itself has part of the (upper-branching) tree embedded in the
very lexical item itself (as in sub-categorization). In this way, a specific
word may reflect a specific functional feature or parameter while another word
may not (on a specific lexeme by lexeme basis)--in all actuality, what we are
talking about here is that (i) the initial process of the acquisition of
functional grammar involves one word at a time (in a bottom-up way), and that
(ii) only at a later more developed stage does such feature projection extend
to the overall class of words (which then extends to phrases). Following in the
spirit of Lexical Parameterization
(Borer), Janet Dean Fodor in a similar vein has tentatively suggested in some
recent work that parameterization may affect certain words (as in lexical
feature specificity) and not others (outside of the scope of its word class)
(talk presented at the University of Essex, 1997). One outcome of this would
assume that children establish parameter values (perhaps piece-meal) and not
grammars as wholes. An example of such bottom-up parameterization or say
feature specificity (only selecting [+/-Nom]
Case marking here) might then be diagrammed in the following manner:
[58].
Such an exclusively bottom-up parameterization method would, however, obscure
correlations often found in the data regarding Case and/or Agreement--such as a
seemingly top-down holistic correlation which seeks to link (i) [+Nom] Case if
in an agreement relation with a Verbal INFL, (ii) [+Gen] Case if in an
agreement relation with a nominal INFL, (iii) Default Case otherwise. It may be
that such correlations do come on-line after an initial 'non-phrase'
parameterization stage--hence, an initial and not fully-fledged parameterized
stage would merely work with individual words, delaying
class-parameterization to a slightly later stage.
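The contrast between word-by-word (bottom-up) feature specificity and class-level (top-down) parameterization might be sketched as follows; the lexical entries and the [+/-Nom] flag are hypothetical illustrations, not the proposals of Borer or Fodor:

```python
# Bottom-up sketch: each lexical entry carries its own case feature, so
# one pronoun may already be [+Nom] while another is not -- exactly the
# item-by-item disparity observed in the data. Entries are hypothetical.

lexicon_stage1 = {
    "he":  {"cat": "Pron", "nom": True},   # already specified [+Nom]
    "me":  {"cat": "Pron", "nom": False},  # still default (non-Nom) case
    "him": {"cat": "Pron", "nom": False},
}

def subject_form(entry_name, lexicon):
    """An entry serves as a Nominative subject only if its own [+Nom] is set."""
    return lexicon[entry_name]["nom"]

# Class-level (top-down) parameterization would instead set [+Nom] once
# for the whole pronoun class, predicting no item-by-item disparity:
def parameterize_class(lexicon, category):
    for entry in lexicon.values():
        if entry["cat"] == category:
            entry["nom"] = True
    return lexicon
```

On the bottom-up picture, `subject_form` varies entry by entry at stage-1; only after something like `parameterize_class` applies (our later class-parameterization) does the feature hold uniformly across the category.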
[59]. A
growing body of research recently undertaken by developmental linguists
suggests that children's (stage-1) multi-word speech may in fact reflect
low-scope lexical specific knowledge rather than abstract categorical-based
knowledge. As discussed above, this distinction clearly points to language
acquisition as possibly proceeding from a dual mechanism in
the brain. For example, regarding verb inflection, studies (Tomasello &
Olguin, Olguin & Tomasello, Pine & Rowland) have shown that the control
children have over morphological inflection very early in the multi-word stage
is largely individually rote learned--that is, there is no systematic
relationship between stem and inflection, nor is there any transfer from
'supposed' knowledge of an inflection to other stems. In other words, at the
very earliest stages of multi-word speech, there is little or no productivity
in transferring the knowledge of one verb to another. This may suggest a
stage-1 based not on complete paradigm formation, but rather on
(semi)-formulaicy.
[60]. Rowland
suggests that a distributional learning mechanism capable of learning and
reproducing the lexical-specific patterns that are modeled in the input may be
able to account for much of what we find in the early stage-1 data. Input of a
high frequency nature will then trigger rote learning associations and patterns
that will manifest in the speech production of young children. This notion of rote-learned vs. rule-based or non-systematic vs. systematic behavior (respectively) can be
further investigated by looking into what has become known as the U-shape
learning curve. For instance,
indications of systematic (rule-based) behaviors can be seen in
overgeneralization. In other words, if overgeneralizations appear with, say,
the morphological inflection {s} as in the portmanteau forms for either Verb or
Noun--e.g., I walk-s, feet-s
(respectively), then a sound argument could be made that rules have been
employed--albeit, rules which have erroneously over-generated. (In fact, if
children in the process of their early language acquisition are never seen to
over-generalize rule-like formations, this is very often a sign of potential
Specific Language Impairment (SLI), a result of some neo-cortical brain
malfunction which has disturbed the normal syntactic structuring of rules and
paradigms.) And so, we rightly extend the argument that if rules are being
applied at a given stage, then a rule-based grammar has been activated. Right,
you say. Well, as it turns out, there are some very interesting findings which
suggest that apparent 'look-alike' rules at stage-1 are in fact imposters and
don't really behave as 'true' rules.
[61]. U-Shaped
Learning. One of the most
striking features of language acquisition is the apparent so called U-shaped
Learning Curve found straddling the
two stages of language acquisition. In brief, the U-shaped curve is understood
in the following way:
(i)
Inflection. Children's
earliest Inflected/Derivational word types are, in fact, initially correct--that
is, it appears to be the case amongst very early MLU that children have correct
formulation of rules. (It goes without saying that typical early MLU utterances
indeed have no tense markings to speak of (cf. Wexler & Radford's Maturational
Theory). The point here is that whenever a small sampling of Tense does appear
in early MLU speech, it always appears correctly). An example of this is the
early emergence in the data of the past tense and participle affixes [ed] and
[en] e.g., talked/gone (respectively).
The initial Past Tense and Plural forms are correct, regardless of whether or
not these forms are regular (talked/books) or irregular (went/sheep-ø).
However, and what is at the heart of this striking development, it also appears
that this initially correct performance stage is then followed by a period of
attrition during which the children actually regress--that is, at this slightly
later stage in development, they not only lose some forms of affixation, but
in addition, produce incorrect over-generalizations in tense forms (go>goed>wented), and plural forms (sheep-s), as well as non-inflected tensed forms e.g., talk-ø/go-ø
(=past tense). To recap, the first
occurrence of inflectional overgeneralization roughly at age 2 years that
supports a rule-based grammar is preceded by a phase without any errors at all.
(ii)
Phonology. Similar
to what one observes regarding a u-shape grammatical/inflectional development,
children also appear to follow a u-shape learning curve with regards to phonology.
An example of this is the often-cited early productions of, e.g., (i) slept /slɛpt/, cooked /kʊkt/, played /pleɪd/ > (ii) sleeped /sliːpɪd/, cooked /kʊkɪd/, played /pleɪɪd/ > and back to (iii) slept /slɛpt/, cooked /kʊkt/, played /pleɪd/ (respectively), completing a U-shaped morpho-phonetic curve yielding /t/, /d/ > /ɪd/ > /t/, /d/.
What
appear to be good examples of 'rule-based' inflection and assimilation in (i)
and (ii) (above, respectively) are in all actuality nothing more than the product
of a 'parrot-like' imitation sequence--more akin to iconic pattern processing
derived from stimulus and response learning. The child can be said to engage in
segmental, phonetic-based rules only when s/he appears to process the rules
yielding an incorrect overgeneralization of past marker {ed} typically
pronounced as the default /Id/ which
forms the middle-dip portion of the u-shape curve. Recall, in terms of
phonology, the child has three allophonic variations to choose from:
a. {ed} => /t/   "walked" /wɔːkt/
b. {ed} => /d/   "played" /pleɪd/
c. {ed} => /ɪd/  "wanted" /wɒntɪd/
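The three allophonic choices above amount to a small decision procedure keyed to the stem's final sound. A minimal sketch, using rough ASCII stand-ins for the phonetic symbols and simplified segment inventories (both assumptions made for exposition):

```python
# The adult allomorphy rule for past-tense {ed}:
#   /Id/ after alveolar stops (t, d), /t/ after other voiceless sounds,
#   /d/ elsewhere (voiced sounds and vowels).
# Segment sets are simplified illustrative assumptions.

ALVEOLAR_STOPS = {"t", "d"}
VOICELESS = {"p", "t", "k", "f", "s", "sh", "ch", "th"}

def past_allomorph(final_sound):
    """Adult rule: pick the {ed} allomorph from the stem's final sound."""
    if final_sound in ALVEOLAR_STOPS:
        return "Id"
    if final_sound in VOICELESS:
        return "t"
    return "d"

def child_default(final_sound):
    """Overgeneralizing middle-dip stage: default /Id/ regardless of stem."""
    return "Id"

print(past_allomorph("k"))  # walked -> /t/
print(past_allomorph("e"))  # played -> /d/ (voiced, vowel-final)
print(past_allomorph("t"))  # wanted -> /Id/
```

The child's across-the-board `child_default` output (sleeped /sliːpɪd/, cooked /kʊkɪd/) is what signals that a segmental rule, albeit over-applied, is now in play.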
It
seems that a default setting with regards to phonology (place & manner of
articulation) is minus Comp(lex), where [-Comp] denotes a one-feature distinction
over two or more features (for instance, bilabials /b/ /m/ would have a
[-Comp] feature whereas labio-dentals and inter-dentals /f/ /θ/ (respectively) would have a [+Comp] feature since both lip and tooth are
involved). In addition, it seems that plus voicing [+V] typically wins out over
minus voicing [-V]. By using these default settings, we naturally get the voiced
plosives /b/ /d/ /g/ and nasals /m/ /n/ as our very first sequence of consonants,
along with [+V] vowels. By taking this default status, /ɪd/ should be the allophone of choice, and it often is.
In this manner of speaking, adherence to the default setting suggests at least
some formation of the rule: defaults work within rule-based paradigms and so
should be considered as a quasi-rule-based generation as opposed to a pure
imitation sequence.
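The three phases of the U-shaped curve (correct went, over-regularized goed, then went again once lexical retrieval blocks the rule) can be sketched as a dual-route procedure; the phase labels and the tiny irregular lexicon below are illustrative assumptions, not the authors' model:

```python
# A minimal dual-mechanism sketch: irregular pasts are retrieved whole
# from an associative lexicon; regulars fall through to the rule [V+{ed}].

IRREGULARS = {"go": "went", "sleep": "slept"}  # rote-stored chunks

def produce_past(stem, phase):
    if phase == 1:
        # Rote phase: retrieve a stored chunk if memorized, else the
        # bare stem surfaces (talk-0 for past reference).
        return IRREGULARS.get(stem, stem)
    if phase == 2:
        # Rule onset, blocking not yet mastered: the rule over-applies
        # even to stored irregulars (go > goed).
        return stem + "ed"
    # Phase 3: lexical retrieval now blocks the rule for stored irregulars.
    return IRREGULARS.get(stem, stem + "ed")

print([produce_past("go", p) for p in (1, 2, 3)])    # ['went', 'goed', 'went']
print([produce_past("talk", p) for p in (1, 2, 3)])  # ['talk', 'talked', 'talked']
```

The go-row traces the U-shape exactly (correct > regress > correct), while the talk-row shows why bare stems give way to rule-built forms only once the second mechanism comes online.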
[62]. The first two
stages of development that form this apparent U-shape curve have been
interpreted as manifesting the application of qualitatively different processes
in the brain--representing different modes or stages in the course of language
acquisition. This u-shaped curve arguably provides some support for our stage-1
to be defined in terms of a formulaic stage rather than as a syntactic and
true-rule learning stage. The second up-side of the U-shaped curve is found to
coincide with an independent syntactic development--the emergence of a
Finiteness marker, and that this finiteness marker only emerges at our
functional stage-2. In sum, the three stages could be described in the
following way:
(i) The first period of the first up-side of the curve (correct production) correlates with a style of rote-learning. This more primitive mode of learning suggests that the mental lexicon is bootstrapped by mere behaviorist-associative means of learning. In such a rote-learning stage, lexical items (either regular or irregular inflections) are stored in an independent mental lexicon heavily based on memorization of formulaic chunks and associations, and are processed in a different part of the brain. It is of no surprise that irregular verb past inflections (go>went) outnumber regular verb past inflections (talk>talk-ed): the former are stored in the lexicon as formulaic chunks, while the latter require the morphological rule formation [V+{ed}]. Hence, our dual converging theories model postulates a sharp contrast and dissociation between regular vs. irregular inflection. This seemingly early correct production is therefore due to a low-scope, phonological 'one-to-one & sound-to-meaning' relationship with no relevance to rules. Hence, our formulaic past tense inflection is not realized as [stem + affix] [talk-{ed}], but rather as one unanalyzable chunk [talked] (cf. Clahsen et al. 2003, fn.2).
(ii) The second stage then marks the onset of a rule process (albeit not necessarily the mastery of it). Here, the child is seen as letting go of the formulaic lexical representation in favor of rule formations: i.e., patterns of concatenated stems appear alongside inflectional affixes. Thus, irregular forms often get over-generalized with the application of the rule, resulting in e.g. goed/wented/sheeps. This overgeneralization stage maps onto a chronological functional-categorical stage of language acquisition where rule-based mechanisms are becoming operative. Thus, the over-generalized up-swing of the U-shaped curve is linked to children's syntactic development: over-generalization of inflection appears when the child ceases using bare stems (as in stage-1) to refer to past events.
(iii) The third and final stage marks the second up-side swing of the U-shaped curve
and represents the correct target grammar.
[63]. It is thus proposed that this tri-staged learning process--from correct to incorrect to correct again--can more properly be accounted for by a dual learning mechanism in the brain: (i) an initial mechanism that has no bearing on rules and is pinned to a type of process best suited for more associative-style learning, such as base lexical learning, irregular verb learning, lexical redundancy formations, etc.; and (ii) a later-emerging mechanism responsible for true rule-based computation, of the sort required for regular inflection [stem + affix].
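The tri-staged process can be sketched as a toy lookup-then-rule procedure. This is only an illustrative simplification of the dual-mechanism idea (the verb lists and stage flags below are hypothetical, not drawn from the studies cited):

```python
# Toy sketch of the tri-staged dual-mechanism account.
# Stage 1: rote retrieval of stored chunks; stage 2: the rule [V + {ed}]
# applies with no lexical blocking (overgeneralization); stage 3: stored
# irregulars block the rule (target grammar).

ROTE = {"go": "went", "walk": "walked"}  # unanalyzable chunks, e.g. [walked]
IRREGULARS = {"go": "went"}              # associative lexicon of irregulars


def past(verb: str, stage: int) -> str:
    if stage == 1:
        # rote retrieval: stored chunk if available, otherwise bare stem
        return ROTE.get(verb, verb)
    if stage == 2:
        # rule active, no blocking yet: over-regularizes to "goed"
        return verb + "ed"
    # stage 3: a stored irregular blocks the rule; otherwise apply [V + {ed}]
    return IRREGULARS.get(verb, verb + "ed")


print([past("go", s) for s in (1, 2, 3)])  # ['went', 'goed', 'went']
```

The U-shape falls out directly: for an irregular like "go", production runs correct (rote) > incorrect (rule without blocking) > correct again (rule plus lexical blocking).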
Brain-Related Studies
[64]. Much of the theory behind a dual model of language has become buttressed by recent developments in brain-related studies. There is now an ongoing stream of data coming in that tells us the brain does indeed process different linguistic input in strikingly different ways. Some of the first analyses using fMRI (functional Magnetic Resonance Imaging) and other brain-related measures show that irregular inflection processes (go>went) seem to be located and produced in the temporal lobe/motor strip area of the brain, a processing area strictly associated with basic word learning (referred to as the lexical component, or Lexicon). On the other hand, regular inflection processes, e.g. (stop>stopped), where the rule [stem]+[affix] is applied, point to areas of the brain which generate rule formations, i.e., the computational component. In other words, there seems to be a clear indication that the two types of linguistic processes are dissociated. This same dissociation seems to hold between how one processes derivational morphology--here equated with irregular and/or whole lexical word retrieval--and inflectional morphology.
[65]. Wakefield and Wilcox (=W&W) (1995: 643-654) have recently concluded that a discontinuity theory--along the lines proposed by Radford--may have an actual physiological reality based on a biological 'maturation' of brain development. Their work consists of two segments: the first being a theory of the relationship between certain aspects of brain maturation and certain transitions in grammatical representation during the course of language acquisition, the second being a preliminary investigation to assess the validity of the theory by testing some of the specific hypotheses that it generates. In their model, it is the left posterior aspect of the brain, at the
junction of the parietal, occipital, and temporal lobes (POT) that generates
semantically relevant, modality-free mental representations by allowing signals
from all neocortically-represented sensory modalities to converge in a single
processing region. In turn, the linguistically relevant contributions of Broca's area, located in the inferior portion of the left frontal lobe, impart abstract structure to those representations with which it interacts--including
(functional) grammatical components as well as the semantic components. The
idea here is that we can now tentatively spot functional abstract grammar
within the frontal lobe areas of the brain, and show how such grammatical
aspects relate to the more primitive, prosaic elements of lexical-semantics (as
spotted in the temporal lobe regions). The trick here is to see if the two
regions are initially talking to one another (as in neuro-connectivity), say at
our grammatical stage-1. Using PET/ERP-language studies, a sketchy two-prong
picture emerges suggesting that the neural mechanism(s) involved split along
lexical and functional grammatical stages of language development. It is clear that Broca's area is involved not only with the generation of abstract hierarchical structure, but also with the representation of lexical items belonging to functional categories. However, the studies reveal that in order for Broca's
area to work at this highly abstract level of representation, the frontal lobe
which houses Broca's area must also connect to the POT region of the brain--in
this sense, a real conversation must be carried out between the (first order)
semantic properties of language (POT) and their functional counterparts. This
relationship parallels the lexical-functional dichotomy found in all language.
[66]. The W&W study suggests that the maturational development of language follows from brain development, and can be summarized as follows:
a. The
lexical stage-1 of language acquisition naturally arises from a disconnect
between the more primitive POT (temporal-lobe/lexical-grammar region) and the
hierarchical Broca's area (frontal-lobe/functional grammar).
b. This
disconnect has to do with the biological development of myelination in the
bundle of axons that connect the two areas together. Myelination of axons is
then said to mature at roughly that chronological stage where we find a lexical
(staged) grammar merging with a functional (staged) grammar.
c. With respect to the brain/language relationships in
the child, it is important to recognize that during the period of time
typically associated with the initial stages of language acquisition, the brain
is still in a relatively immature state. Neural plasticity begins with the
sensory motor-strip temporal area (POT), and then proceeds to move to secondary
areas (Broca's area) related to the frontal lobe region.
Conclusion:
A Converging Theories Model
[67]. In the history of all pursuits of science, it has traditionally been the case that science proceeds and develops via different methods and theories. Converging approaches always strive to expose inherent weaknesses in their opposing theories. It goes without saying that convergence methods go far in peeling away biased assumptions which often lead to half-correct assertions. Taking what is good from one theory and throwing away what is not is just common-sense science. For example, on one 'converging' hand, Chomsky has asserted that
syntax is the result of the creative human brain set-up in such a way as to
manipulate true-rules. It creates, from nothing external to itself, the
structure of language. In restricting ourselves to the point at hand, Chomsky
has assimilated much of his arguments from the long line of rational philosophy
and has converged such reasoning into how he believes an autonomous language
structure (internal) might be construed. His belief that syntax is autonomous
directly paves a way for him to distinguish between species-specific
(human/hominid) language and other modes of cognitive-based primitive
communications (animal/pongid). His now famous debates--first with Skinner (Behaviorism) and later with Piaget (Constructivism)--can be readily reduced back to converging methodologies between philosophy and cognition which sought to return language to seventeenth-century nativist assumptions. Later,
he would go on to extend such arguments to fight off pure
pragmatic/socio-linguistic pursuits of linguistics--saving the study of language
from becoming strictly a 'humanities' field of study which emphasized social
phenomena with little if any analytical worth: (cf. Quine, Rorty pace Chomsky). Taking his notion of an autonomous syntax
further, the natural next step to take would be to say that all other aspects
of language (whatever they may be) that can't fall under this autonomous
rule-based syntactic realm might be conversely tethered to both behaviorism and
associationism as part of an underlying cognitive mechanism. Chomsky has
himself expressed the possibility that general mundane concepts--many of which
contain inherent sub-categorial features that are extremely convoluted and
abstract, yet from which we go on to readily attach labels (=words)--may be
preconceived and innate: however, he goes on to suggest that such conceptual
innateness may be tethered to cognition as a universal ability to get at
meaning (Chomsky 2000: p.61-62):
These
conceptual structures appear to yield semantic connections of a kind that will,
in particular, induce an analytic-synthetic distinction, as a matter of
empirical fact.
These
elements (he cites concepts such as
locational nature, goal, source of action, object moved, etc.) enter widely
into lexical structure… and are one
aspect of cognitive development.
[68]. On one hand, what Chomsky seems to be saying is that (i) Functional Grammar, or Syntax (par excellence), is autonomous and dissociated from all other aspects of the mind/brain--including meaning and/or cognition. Thus, syntax is created out of the mind's creative and independent eye (with all aforementioned nativist trappings). However, and to the point of this section, Chomsky doesn't hesitate to attribute those non-syntactic aspects of language, say word learning (based on frequency learning and associationism), to cognition. This, I believe, goes to the heart of the matter--namely, that a converging-theories approach has been invoked and could be summarized as follows:
Chomsky and Converging Theories
1. Syntax proper (labeled herein as Functional Grammar) is creatively formed by a true-rule process via an innately given Language Acquisition Device (LAD) (more recently called the Language Faculty)--comprising initial grammatical default settings which are called Universal Grammar. This is where the more abstract inflectional rules are housed: the functional features of number/person/case/agreement/tense, e.g. Plural [N+{s}], Past Tense [V+{ed}], etc. Of course, Berko's 'Wug Test' goes directly under this category: meaning is detached from syntax.
2.
Word learning (labeled herein as
Lexical Grammar) is formed via a one-to-one iconic association between sound
and meaning. This process of both word learning on (i) a phonological level,
and word learning on (ii) a semantic/conceptual level, is more akin to past
behavioristic notions of learning. Very young children (at our stage-1) may
exploit and over-extend such processes--this is apparently what we find
regarding formulaic type utterances, Irregular Verb/Noun lexical learning and
retrieval, as well as Derivational morphology.
[69]. Connectionism. In
view of Chomsky's assertion that Syntax is autonomous, there can be by
definition no primitive lower-level capacities at work in syntax--namely,
nothing that hinges on perception, sound, object movement, spatio-temporal,
etc. Although we share with our primates such low-scope abilities, more than
anything else, it is our ability to work with abstract rules which creates the
unsurpassable, and ever widening gap between human language and animal
communication--the former based on true-rules & syntax, the latter based on
more primitive behavioristic modes of learning. Regarding the higher-level
processes having to do with syntax/grammar, the bootstrapping problem as
discussed above does provide a way for lower-level processes associated with
connectionism to serve as a springboard for later rule-based grammar. For instance, it is now widely assumed (cf. Plunkett, Elman, among others) that something like a connectionist system must provide the neurological foundations
for the apparent symbolic mind. In other words, a symbol processing system
might sit on top of a connectionist implementation of the neurological system.
Such a hierarchically layered approach to language would be similar to stating that in order to talk about Darwinian Biology, one must first acknowledge the underlying universals of Physics. Likewise, I believe brute memorization also served as an evolutionary road to syntax: (I am becoming more and more convinced that syntax arose via a high memory capacity--namely, in order to handle the input of this newfound high memory, syntax had to emerge.) Clearly, there must be at least some causal connection here: chimps both start off and quickly max out at extremely low MLUs (in terms of signing), while two-year-old toddlers quickly surpass chimps in MLUw. In this sense, there may be a bottom-up story to syntax after all. However, having said this, and more to the
point of Chomsky's reference to autonomous syntax, a symbol processing system
would operate according to its own set of principles. Recently, the notion of hidden units/rules providing crucial feedback loops in connectionist processors has been interpreted (much to the chagrin and potential demise of the pure connectionist group) as a form of a quasi-innate symbolic device--cleverly hidden in the actual architecture itself. (See the ongoing debates between Marcus and Elman on this.)
Nonetheless, it is now becoming commonly accepted in connectionist circles that
a number of local architectural constraints are indeed necessary in order to
bring about a sufficiently qualitative approximation of computation worthy of
language: constraints such as the right number of units (hidden and overt),
layers, types of connections, etc. Notwithstanding the camp rhetoric and inevitable spin involved--again, arguments tantamount to the old nature vs. nurture debate--there may however be something to the notion that such hidden units serve as a bridge between the two systems (and for that matter, the two schools of thought): there is a certain degree of truth to the analogy stating that hidden unit tabulations spawn symbolic rule paradigms.
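The architectural point about hidden units can be made concrete with the textbook XOR example (a standard illustration, not an example from the literature under discussion): XOR is not linearly separable, so no single-layer perceptron computes it, yet one hidden layer suffices. The weights below are hand-set for illustration, not learned:

```python
# Why hidden units matter: a fixed-weight two-layer network computing XOR,
# which no single-layer perceptron can compute.

def step(x: float) -> int:
    """Threshold activation: fire (1) iff net input is positive."""
    return 1 if x > 0 else 0


def xor_net(a: int, b: int) -> int:
    h1 = step(a + b - 0.5)      # hidden unit 1 computes OR
    h2 = step(a + b - 1.5)      # hidden unit 2 computes AND
    return step(h1 - h2 - 0.5)  # output: OR and not AND, i.e. XOR


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

The hidden layer recodes the input into an intermediate representation (OR, AND) over which a simple linear decision becomes possible--the sense in which hidden units can be said to spawn rule-like paradigms the raw input does not wear on its sleeve.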
[70] References
Akhtar, N. (1999) Acquiring basic
word order: evidence for data-driven learning of syntactic structure. Journal
of Child Language 26. 339-356.
Akhtar, N. & Tomasello, M. (1997)
Young children's productivity with word order and verb morphology. Developmental
Psychology 33, 952-965.
Aronoff, M. (1994) Morphology
by itself: Stems and inflectional classes. MIT Press.
Atkinson, M. (1982) Explorations
in the study of child language development. CUP.
(1992)
Children's Syntax. Blackwell.
Baker, M. (1988) Incorporation:
A theory of grammatical function changing. Chicago: Chicago University Press.
(2001)
The Atoms of Language. New York:
Basic Books.
Bates, E; Bretherton, I; &
Snyder, L. (1988) From first words to grammar. CUP.
Bellugi, U. (1967) The development
of negation. Ph.D. Diss. Harvard
University.
Berko, J. (1958) The child's
learning of English morphology. Word,
14. 150-177.
Bickerton, D. (1990) Language
& Species. Chicago: University of
Chicago Press.
Bloom, L. (1970) Language development. MIT Press.
(1973)
One word at a time. The Hague:
Mouton.
Bloom, L; Lifter, K. &
Hafitz, J. (1980) Semantics of verbs and the development of verb inflection in
child language. Language 56 386-412.
Borer, H. & Wexler, K. (1987)
The maturation of syntax. In T. Roeper and E. Williams (Eds) Parameter
setting. Dordrecht: Reidel.
Borer, H. & Rohrbacher, B.
(2002) Minding the Absent: Arguments for the Full Competence Hypothesis. (To
appear in Language Acquisition,
Ms. University of Southern California).
Bowerman, M. (1973) Early
syntactic development: a cross-linguistic study with special reference to
Finnish. CUP.
(1974)
Learning the structure of causative verbs: a study in the relationship of cognitive,
semantic and syntactic development. Papers and Reports on Child Language
Development 8. 142-78.
Braine, M. (1963) On learning the
grammatical order of words. Psychological Review 70, 323-348.
(1976)
Children's first word combinations. Monographs of the Society for Research
in Child Development. 41. (n. 164).
(1987)
What is learned in acquiring word classes. In B. MacWhinney (Ed) Mechanisms
of Language Acquisition. 65-87.
Erlbaum.
Brown, R.(1957) Linguistic
determinism and the part of speech. Journal of Abnormal & Social
Psychology 55, 1-5.
(1958)
Words and things. New York: Free
Press.
(1973)
First Language: The early stages.
Cambridge, MA: Harvard University Press.
Bybee, J. (1995) Regular
morphology and the lexicon. Language and Cognitive Processes 10(5), 425-55.
Cartwright, T. & Brent, M.
(1997) Syntactic categorization in early language acquisition; Formalizing the
role of distributional analysis. Cognition 63, 121-170.
Cazden, C. (1968) The acquisition
of noun and verb inflections. Child Development 39, 433-448.
Chomsky, N. (1956) Three models
for the description of language. IRE Transactions on Information Theory. Vol. IT-2, p.3
(1959)
A Review of B.F. Skinner's "Verbal Behavior." Language 35. 26-58.
(1965)
Aspects of a Theory of Syntax. MIT
Press.
(1966)
Cartesian linguistics: A chapter in the history of rationalist thought. New York: Harper & Row.
(1986)
Knowledge of Language: Its nature, origin, and use. New York: Praeger.
(1995)
The Minimalist Program. MIT Press.
(2000)
New Horizons in the Study of Language and Mind. CUP.
Clahsen, H. (1999) Lexical
entries and rules of language: A multi-disciplinary study of German inflection.
Behavioral and Brain Sciences, 22.
991-1013.
Clahsen, H; Eisenbeiss, S; Penke,
M. (1994) Underspecification and Lexical Learning in Early Child Grammars. Essex
Research Reports in Linguistics 4.
1-36.
Clahsen, H; Sonnenstuhl, I; &
Blevins, J. (2001a) Derivational Morphology in the German Mental Lexicon: A
Dual Mechanism Account. (Ms. University of Essex).
Clahsen, H; Eisenbeiss, S;
Hadler, & M; Sonnenstuhl, I. (2001b) The mental representation of inflected
words: An experimental study of adjectives and verbs in German. Language 77. 510-543.
Clahsen, H; Aveledo, F; &
Roca, I. (2002) The development of regular and irregular verb inflections in
Spanish child language. Journal of Child Language 29. 591-622.
Clahsen, H; Hadler, M; &
Weyerts, H. (2003) Frequency Effects in Children and Adults' Production of
Inflected Words. (Ms. University of Essex)
Elman, J. (1993) Learning and
development in neural networks: The importance of starting small. Cognition 48. 71-99.
Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996) Rethinking innateness: a connectionist perspective on development. MIT Press.
Felix, S. (1987) Cognition and
Language Growth. Dordrecht: Foris.
(1992)
Language acquisition as a maturational hypothesis. In J. Weissenborn, H.
Goodluck, & T. Roeper (Eds). Theoretical Issues in Language Acquisition. Hillsdale, N.J: Erlbaum.
Fodor, Janet D. (1997) Talk
presented at Essex University on Parameters & Triggers.
(1998)
Unambiguous triggers. Linguistic Inquiry 29. 1-37.
Fodor, J. (1975) The Language
of Thought. Cambridge, Mass. Harvard
University Press.
(1983)
The Modularity of Mind. MIT Press.
(2000)
The Mind Doesn't Work That Way.
MIT Press.
Galasso, J. (1999) The Acquisition of Functional Categories: A Case
Study. Unpublished Ph.D. Dissertation. University of Essex, U.K.
(2003a)
Notes on a Research Statement for Child First Language Acquisition. paper no. 1. MS. California State University,
Northridge.
(2003c)
The Acquisition of Functional Categories. Indiana University Linguistics Club Publications.
Gardner, H. (1985) The Mind's
New Science. New York: Basic Books.
Gibson, E. (1992) On the adequacy
of the competition model. Language
68, p. 447-74.
Goldin-Meadow, S & Mylander,
C. (1990) Beyond the input given: The child's role in the acquisition of
language. Language 66: 323-55.
Gould, S.J. (1977) Ontogeny
and Phylogeny. CUP.
(1991)
Exaptation: A crucial tool for evolutionary psychology. Journal of Social Issues 47: 43-65.
Grodzinsky, Y. (1990) Theoretical
perspectives on language deficits.
MIT Press.
Guasti, T. & Rizzi, L. (2002) Agr and Tense as distinctive syntactic projections: Evidence from acquisition. In G. Cinque (Ed). The Cartography of Syntactic Structures. New York: Oxford University Press.
Halle, M. & Marantz, A. (1993) Distributed morphology and the pieces of inflection. In K. Hale and S.J. Keyser (Eds.) The View From Building 20. MIT Press.
Hebb, D. (1949) Organization
of Behavior. New York: Wiley.
Hoekstra, T. & Jordens, P.
(1994) From adjunct to head. In T. Hoekstra & Schwartz (Eds) Language
Acquisition Studies in Generative Grammar. Benjamins. pp. 119-149.
Hyams, N. (1986) Language
Acquisition and the Theory of Parameters.
Dordrecht: Reidel
Hyams, N. & Wexler, K (1993)
On the grammatical basis of null subjects in child language. Linguistic
Inquiry 24. 421-59.
Kayne, R. (1994) The
Antisymmetry of Syntax. Linguistic
Inquiry Monograph no. 25. MIT Press.
Klima, E. & Bellugi, U.
(1966) Syntactic regularities in the speech of children. In J. Lyons & R.J.
Wales (Eds.), Psycholinguistics papers (183-208). Edinburgh: University of Edinburgh Press.
Köhler, W. (1929) Gestalt
Psychology. New York: Liveright.
(1969)
The Task of Gestalt Psychology.
Princeton, N.J: Princeton University Press.
Kuczaj, S. & Maratsos, M.
(1983) Initial verbs of yes-no questions: A different kind of grammatical
category. Developmental Psychology 19,
440-444.
Kuhn, T. (1973) The Structure
of Scientific Revolutions. Chicago:
University of Chicago Press.
Lieven, E; Pine, J; &
Baldwin, G. (1997) Lexically-based learning and early grammatical development. Journal
of Child Language. 24. 187-219.
Marantz, A. (1995) The Minimalist
Program. In G. Webelhuth (Ed) Government and Binding Theory and The
Minimalist Program. Basil Blackwell.
Marcus, G. (2001) The
Algebraic Mind. MIT Press.
Marcus, G; Ullman, M; Pinker, S;
Hollander, M; Rosen, T; & Xu, F. (1992) Overregularization in language
acquisition. Monographs of the Society for Research in Child Development. 57(4) n. 228.
McClelland, J.L., &
Rumelhart, D.E. (1985) Distributed memory and the representation of general and
specific information. Journal of Experimental Psychology: General, 114. 159-188.
McClelland, J.L., &
Rumelhart, D.E. & the PDP Research Group (1986) Parallel distributed
processing: Explorations in the microstructure of cognition. Vol. 2. Psychological
and Biological Models. MIT Press.
Minsky, M. (1968) Semantic
Information Processing. MIT Press.
Minsky, M. & S. Papert.
(1969/1988) Perceptrons. MIT
Press.
Naigles, L. & Lehrer, N.
(2002) Language-general and language-specific influences on children's
acquisition of argument structure: a comparison of French and English. Journal
of Child Language 29. 545-566.
Newell, A. (1993) The Serial Imperative. In P. Baumgartner & S. Payr (Eds) Speaking Minds: Interviews with Twenty Eminent Cognitive Scientists. Princeton, N.J: Princeton University Press.
Newport, E. (1990) Maturational
constraints on language learning. Cognitive Science 14. 11-28.
Olguin, R. & Tomasello, M.
(1993) Twenty-five-month-old children do not have a grammatical category of
verb. Cognitive Development 8.
245-72
Ouhalla, J. (1991) Functional
categories and parametric variation.
London: Routledge.
Penrose, R. (1994) Shadows of
the Mind. Oxford University Press.
Pesetsky, D. (1995) Zero
Syntax. MIT Press.
Pienemann, M. (1989) Is language
teachable? Psycholinguistic experiments and hypotheses. Applied Linguistics 10. 52-57.
Pine, J. & Lieven E. (1997)
Slot and frame patterns and the development of the determiner category. Applied
Psycholinguistics 18, 123-138.
Pine, J., Lieven, E., &
Rowland, C. (1998) Comparing different models of the development of the English
verb category. Linguistics 36,
807-830.
Pinker, S. (1984) Language
learnability and language development.
Cambridge, MA: Harvard University Press.
(1987)
Learnability and cognition: The acquisition of argument structure. MIT Press.
(1989)
Learnability and Cognition: the acquisition of verb-argument structure. Harvard University Press.
(1997)
How the Mind Works. New York:
Norton.
(1999)
Words and Rules. New York: Basic
Books.
Pinker, S. & Bloom, P. (1990)
Natural language and natural selection. Behavioral and Brain Sciences 13, 707-784.
Pinker, S. & Prince, A.
(1988) On language and connectionism: Analysis of a parallel distributed
processing model of language acquisition. Cognition 28. 73-193
Pinker, S. & Prince, A.
(1994) Regular and irregular morphology and the psychological status of rules
of grammar. In S.D. Lima, R.L., Corrigan, & G.K. Iverson (Eds.), The
Reality of Linguistic Rules.
Philadelphia: Benjamins.
Plunkett, K. & Marchman, V.
(1991) U-shape learning and frequency effects in a multi-layered perceptron:
Implications for child language acquisition. Cognition 38. 43-102.
Plunkett, K. & Marchman, V.
(1993) From rote learning to system building: Acquiring verb morphology in
children and connectionist nets. Cognition 48, 21-69.
Quine, W. (1990) Pursuit of
Truth. Cambridge, MA. Harvard
University Press.
Radford, A. (1990) Syntactic
Theory and the Acquisition of English Syntax. Basil Blackwell.
(1996)
Toward a structure-building model of language acquisition. In H. Clahsen (ed) Generative
Perspectives on Language Acquisition.
Benjamins.
(1997)
Syntactic Theory and the Structure of English. CUP.
(1998)
Genitive subjects in child English. Lingua 106. 113-131
(2000)
Children in Search of Perfection: Towards a Minimalist Model of Acquisition. Essex
Research Reports in Linguistics, Vol.
34.
Radford, A. & Galasso, J.
(1998) Children's Possessive Structures: A case study. Essex Research
Reports in Linguistics 19. 37-45.
Rosenblatt, F. (1962) Principles of neurodynamics. New York: Spartan.
Rowland, C. & Pine, J. (2000)
Subject-auxiliary inversion errors and wh-question acquisition: 'What do
children know?' Journal of Child Language 27, 157-181.
(2003)
The development of inversion in wh-questions: a reply to Van Valin. Journal
of Child Language 30. 197-212.
Rumelhart, D. & McClelland,
J. (1986) On learning the past tense of English verbs. In J. McClelland, D.
Rumelhart, & the PDP Research Group (Eds). Parallel distributed
processing. vol. 2. MIT Press.
Schütze, C. (1997) INFL in
Child and adult language: Agreement, case and licensing. Ph.D. Diss. MIT.
Schütze, C. (2001) The status of He/She
don't and theories of root
infinitives. Ms. UCLA.
Schütze, C; & Wexler, K. (1996) Subject case licensing and English root infinitives. In A. Stringfellow, D. Cahana-Amitay, E. Hughes & A. Zukowski (Eds), Proceedings of the 20th Annual Boston University Conference on Language Development. Somerville, MA: Cascadilla Press.
Smith, N. & Tsimpli, I.M.
(1995) The Mind of a Savant.
Blackwell.
Speas, M. (1990) Phrase structure
in natural language. Dordrecht:
Kluwer.
Stromswold, K. (1990)
Learnability and the acquisition of auxiliaries. Ph.D. MIT.
Tomasello, M. (1992) First
verbs: a case study of early grammatical development. CUP.
(2000)
Do young children have adult syntactic competence? Cognition 74. 209-253.
Travis, L. (1984) Parameters
and effects of word order variation
Ph.D. Diss. MIT.
Tsimpli, I.-M. (1992) Functional
categories and maturation. Ph.D. Diss. UCL.
Valian, V. (1986) Syntactic
categories in the speech of young children. Developmental Psychology 22. 562-79.
(1991)
Syntactic subjects in the early speech of American and Italian children. Cognition
40, 21-81.
Wakefield, J. & Wilcox, J.
(1995) Brain Maturation and Language Acquisition: A Theoretical Model and
Preliminary Investigation. Proceedings of BUCLD 19, 643-654. Vol. 2. Somerville, Mass. Cascadilla
Press.
Wexler, K. (1994) Optional
infinitives, head movement and the economy of derivations. In D. Lightfoot
& N. Hornstein (Eds). Verb Movement. CUP.
(1996)
The development of inflection in a biologically based theory of language
acquisition. In M. Rice (Ed) Toward a genetics of language. Mahwah, N.J: Erlbaum.