teaching? More importantly, the precise study of fragments of adult
knowledge of language quickly underscored the existence of “poverty
of stimulus” situations: the adult knowledge of language is largely
underdetermined by the linguistic data normally available to the child,

On nature and language

which would be consistent with innumerable generalizations over and
above the ones that speakers unerringly converge on. Let us consider a
simple example to illustrate this point. Speakers of English intuitively
know that the pronoun “he” can be understood as referring to John in
(1), but not in (2):

(1) John said that he was happy
(2) * He said that John was happy

We say that “coreference” between the name and the pronoun is pos-
sible in (1), but not in (2) (the star in (2) signals the impossibility of
coreference between the underscored elements; the sentence is obvi-
ously possible with “he” referring to some other individual mentioned
in the previous discourse). It is not a simple matter of linear prece-
dence: there is an unlimited number of English sentences in which
the pronoun precedes the name, and still coreference is possible, a
property illustrated in the following sentences with subject, object
and possessive pronouns:

(3) When he plays with his children, John is happy
(4) The people who saw him playing with his children said that
John was happy
(5) His mother said that John was happy

The actual generalization involves a sophisticated structural computa-
tion. Let us say that the “domain” of an element A is the phrase which
immediately contains A (we also say that A c-commands the elements
in its domain: Reinhart (1976)). Let us now indicate the domain of the
pronoun by a pair of brackets in (1)–(5):

Editors' introduction

(6) John said that [he was happy]

(7) [He said that John was happy]
(8) When [he plays with his children], John is happy
(9) The people who saw [him playing with his children] said
that John was happy
(10) [His mother] said that John was happy

The formal property which singles out (7) is now clear: only in this
structure is the name contained in the domain of the pronoun. So,
coreference is excluded when the name is in the domain of the pronoun
(this is Lasnik's (1976) Principle of Non-coreference). Speakers of
English tacitly possess this principle, and apply it automatically to new
sentences to evaluate pronominal interpretation. But how do they come
to know that this principle holds? Clearly, the relevant information is
not explicitly given by the child's carers, who are totally unaware of
it. Why don't language learners make the simplest assumption, i.e.
that coreference is optional throughout? Or why don't they assume
that coreference is ruled by a simple linear principle, rather than by
the hierarchical one referring to the notion of domain? Why do all
speakers unerringly converge to postulate a structural principle rather
than a simpler linear principle, or even no principle at all?
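The structural computation behind the Principle of Non-coreference can be made concrete with a small sketch. It is only an illustration under stated assumptions: the tree shapes, the node labels (S, S', VP, NP) and the names `Node`, `domain`, `contains` and `coreference_allowed` are ours, chosen to mirror the bracketing in (6), (7) and (10), and are not part of the original analysis.

```python
class Node:
    """A node in a constituent tree: words are leaves, phrases have children."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def domain(node):
    """The phrase that immediately contains `node` (simplified definition)."""
    return node.parent


def contains(phrase, node):
    """True if `node` sits anywhere inside `phrase`."""
    while node is not None:
        if node is phrase:
            return True
        node = node.parent
    return False


def coreference_allowed(pronoun, name):
    """Non-coreference: excluded when the name is in the pronoun's domain."""
    dom = domain(pronoun)
    return dom is None or not contains(dom, name)


def embedded_clause(subject):
    """Builds [S' that [S subject [VP was happy]]] (assumed structure)."""
    return Node("S'", Node("that"),
                Node("S", subject, Node("VP", Node("was"), Node("happy"))))


def sentence_1():
    # (1) John said that [he was happy]
    john, he = Node("John"), Node("he")
    Node("S", john, Node("VP", Node("said"), embedded_clause(he)))
    return he, john


def sentence_2():
    # (2) [He said that John was happy]
    john, he = Node("John"), Node("he")
    Node("S", he, Node("VP", Node("said"), embedded_clause(john)))
    return he, john


def sentence_5():
    # (5) [His mother] said that John was happy
    john, his = Node("John"), Node("His")
    Node("S", Node("NP", his, Node("mother")),
         Node("VP", Node("said"), embedded_clause(john)))
    return his, john


for build in (sentence_1, sentence_2, sentence_5):
    pronoun, name = build()
    print(build.__name__, coreference_allowed(pronoun, name))
```

Under these assumed structures, coreference comes out as possible for (1) and (5) and excluded for (2). A purely linear rule would wrongly exclude (5) as well, since there too the pronoun precedes the name.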
This is one illustration of a pervasive situation in language ac-
quisition. As the experience is too impoverished to motivate the gram-
matical knowledge that adult speakers invariably possess, we are led
to assume that particular pieces of grammatical knowledge develop
because of some pressure internal to the cognitive system of the child.
A natural hypothesis is that children are born with a “language faculty”
(Saussure), an “instinctive tendency” for language (Darwin); this
cognitive capacity must involve, in the first place, receptive resources
to separate linguistic signals from the rest of the background noise,
and then to build, on the basis of other inner resources activated by a
limited and fragmentary linguistic experience, the rich system of lin-
guistic knowledge that every speaker possesses. In the case discussed,
an innate procedure determining the possibilities of coreference is
plausibly to be postulated, a procedure possibly to be deduced from a
general module determining the possibilities of referential dependen-
cies among expressions, as in Chomsky's (1981) Theory of Binding, or
from even more general principles applying at the interface between
syntax and pragmatics, as in the approach of Reinhart (1983). In fact,
no normative, pedagogic or (non-theory-based) descriptive grammar
ever reports such facts, which are automatically and unconsciously as-
sumed to hold not only in one's native language, but also in the adult
acquisition of a second language. So, the underlying principle, what-
ever its ultimate nature, appears to be part of the inner background of
every speaker.
We can now phrase the problem in the terminology used by the
modern study of language and mind. Language acquisition can be seen
as the transition from the state of the mind at birth, the initial cognitive
state, to the stable state that corresponds to the native knowledge of a
natural language. Poverty of stimulus considerations support the view
that the initial cognitive state, far from being the tabula rasa of empiri-
cist models, is already a richly structured system. The theory of the
initial cognitive state is called Universal Grammar; the theory of a
particular stable state is a particular grammar. Acquiring the tacit
knowledge of French, Italian, Chinese, etc., is then made possible
by the component of the mind–brain that is explicitly modeled by
Universal Grammar, in interaction with a specific course of linguis-
tic experience. In the terms of comparative linguistics, Universal
Grammar is a theory of linguistic invariance, as it expresses the univer-
sal properties of natural languages; in terms of the adopted cognitive
perspective, Universal Grammar expresses the biologically necessary
universals, the properties that are universal because they are deter-
mined by our in-born language faculty, a component of the biological
endowment of the species.
As soon as a grammatical property is ascribed to Universal
Grammar on the basis of poverty of stimulus considerations, a hy-
pothesis which can be legitimately formulated on the basis of the
study of a single language, a comparative verification is immediately
invited: we want to know if the property in question indeed holds
universally. In the case at issue, we expect no human language to allow
coreference in a configuration like (2) (modulo word order and other
language specific properties), a conclusion which, to the best of our
current knowledge, is correct (Lasnik (1989), Rizzi (1997a) and ref-
erences quoted there). So, in-depth research on individual languages
immediately leads to comparative research, through the logical prob-
lem of language acquisition and the notion of Universal Grammar.
This approach assumes that the biological endowment for language
is constant across the species: we are not specifically predisposed to
acquire the language of our biological parents, but to acquire whatever
human language is presented to us in childhood. Of course, this is not
an a priori truth, but an empirical hypothesis, one which is confirmed
by the explanatory success of modern comparative linguistics.

3 Descriptive adequacy and explanatory adequacy
It has been said that language acquisition constitutes “the funda-
mental empirical problem” of modern linguistic research. In order
to underscore the importance of the problem, Chomsky introduced,
in the 1960s, a technical notion of explanation keyed to acquisition
(see Chomsky (1964, 1965) for discussion). An analysis is said to meet
“descriptive adequacy” when it correctly describes the linguistic facts
that adult speakers tacitly know; it is said to meet the higher require-
ment of “explanatory adequacy” when it also accounts for how such
elements of knowledge are acquired. Descriptive adequacy can be
achieved by a fragment of a particular grammar which successfully
models a fragment of adult linguistic knowledge; explanatory
adequacy is achieved when a fragment of a particular grammar can be
shown to be derivable from two ingredients: Universal
Grammar with its internal structure, analytic principles, etc., and a
certain course of experience, the linguistic facts which are normally
available to the child learning the language during the acquisition pe-
riod. These are the so-called “primary linguistic data,” a limited and
individually variable set of utterances whose properties and structural
richness can be estimated via corpus studies. If it can be shown that the
correct grammar can be derived from UG and a sample of data which
can be reasonably assumed to be available to the child, the acquisition
process is explained. To go back to our concrete example on corefer-
ence, descriptive adequacy would be achieved by a hypothesis correctly
capturing the speaker's intuitive judgments on (1)–(5), say a hypothe-
sis referring to a hierarchical principle rather than a linear principle;
explanatory adequacy would be achieved by a hypothesis deriving the
correct description of facts from general inborn laws, say Chomsky's
binding principles, or Reinhart's principles on the syntax–pragmatics
interface.
A certain tension arose between the needs of descriptive and
explanatory adequacy in the 1960s and 1970s, as the two goals pushed
research in opposite directions. On the one hand, the needs of de-
scriptive adequacy seemed to require a constant enrichment of the
descriptive tools: with the progressive broadening of the empirical ba-
sis, the discovery of new phenomena in natural languages naturally
led researchers to postulate new analytic tools to provide adequate
descriptions. For instance, when the research program was extended
for the first time to the Romance languages, the attempts to analyze
certain verbal constructions led to the postulation of new formal rules
(causative formation transformations and more radically innovative
formal devices such as restructuring, reanalysis, clause union, etc.:
Kayne 1975, Rizzi 1976, Aissen and Perlmutter 1976), which seemed
to require a broadening of the rule inventory allowed by Universal
Grammar. Similarly, and more radically, the first attempts to analyze
languages with freer word order properties led to the postulation of
different principles of phrasal organization, as in much work on so-
called “non-configurational” languages by Ken Hale, his collaborators
and many other researchers (Hale 1978). On the other hand, the very
nature of explanatory adequacy, as it is technically defined, requires
a maximum of restrictiveness, and the postulation of a strong cross-
linguistic uniformity: only if Universal Grammar offers relatively few
analytic options for any given set of data is the task of learning a lan-
guage a feasible one in the empirical conditions of time and access
to the data available to the child. It was clear all along that only a
restrictive approach to Universal Grammar would make explanatory
adequacy concretely attainable (see chapter 4 and Chomsky (2001b) on
the status of explanatory adequacy within the Minimalist Program).

4 Principles and parameters of Universal Grammar
An approach able to resolve this tension emerged in the late 1970s. It
was based on the idea that Universal Grammar is a system of principles
and parameters. This approach was fully developed for the first time in
informal seminars that Chomsky gave at the Scuola Normale Superiore
of Pisa in the Spring semester of 1979, which gave rise to a series of
lectures presented immediately after the GLOW Conference in April
1979, the Pisa Lectures. The approach was refined in Chomsky's Fall
1979 course at MIT, and then presented in a comprehensive monograph
as Chomsky (1981).
Previous versions of generative grammar had adopted the view,
inherited from traditional grammatical descriptions, that particular
grammars are systems of language-specific rules. Within this ap-
proach, there are phrase structure rules and transformational rules
specific to each language (the phrase structure rule for the VP is dif-
ferent in Italian and Japanese, the transformational rule of causative
formation is different in English and French, etc.). Universal Gram-
mar was assumed to function as a kind of grammatical metatheory, by
defining the general format which specific rule systems are required
to adhere to, as well as general constraints on rule application. The
role of the language learner was to induce a specific rule system on
the basis of experience and within the limits and guidelines defined
by UG. How this induction process could actually function remained
largely mysterious, though.
The perspective changed radically some twenty years ago. In
the second half of the 1970s, some concrete questions of compara-
tive syntax had motivated the proposal that some UG principles could
be parametrized, hence function in slightly different ways in differ-
ent languages. The first concrete case studied in these terms was the
fact that certain island constraints appear to be slightly more liberal
in certain varieties than in others: for instance, extracting a relative
pronoun from an indirect question sounds quite acceptable in Italian
(Rizzi 1978), less so in other languages and varieties: it is excluded
in German, and marginal at variable degrees in different varieties of
English (see Grimshaw (1986) for discussion of the latter case; on
French see Sportiche (1981)):

(11) Ecco un incarico [S' che [S non so proprio [S' a chi
[S potremmo affidare ]]]]
Here is a task that I really don't know to whom we could entrust

(12) * Das ist eine Aufgabe, [S' die [S ich wirklich nicht weiss
[S' wem [S wir anvertrauen könnten]]]]
Here is a task that I really don't know to whom we could entrust
It is not the case that Italian allows extraction in an unconstrained
way: for instance, if extraction takes place from an indirect question
which is in turn embedded within an indirect question, the acceptabil-
ity strongly degrades:

(13) * Ecco un incarico [S' che [S non so proprio [S' a chi
[S si domandino [S' se [S potremmo affidare ]]]]]]
Here is a task that I really don't know to whom they
wonder if we could entrust
The suggestion was made that individual languages could differ
slightly in the choice of the clausal category counting as bounding
node, or barrier for movement. Assume that the relevant principle,
Subjacency, allows movement to cross one barrier at most; then, if
the language selects S' as clausal barrier, movement of this kind will
be possible, with only the lowest S' crossed; if the language selects
S, movement will cross two barriers, thus giving rise to a violation of
Subjacency. Even if the language selects S', movement from a double
Wh island will be barred, whence the contrast (11)–(13) (if a language
were to select both S and S' as bounding node, it was observed, then
even movement out of a declarative would be barred, as seems to be the
case in certain varieties of German and in Russian: see the discussion
in Freidin (1988)).
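The barrier-counting logic of this proposal can be sketched in a few lines. The sketch is a simplification under our own assumptions: each movement is represented only by the list of clausal nodes it crosses, read off the bracketing of (11) and (13); the declarative case assumes a movement step that starts from the edge of the lower clause and therefore crosses one S' and one S; and the names `crossed_barriers` and `obeys_subjacency` are ours.

```python
def crossed_barriers(crossed_nodes, bounding_nodes):
    """Count how many of the crossed clausal nodes are bounding nodes."""
    return sum(1 for node in crossed_nodes if node in bounding_nodes)


def obeys_subjacency(crossed_nodes, bounding_nodes):
    """Subjacency as stated in the text: at most one barrier may be crossed."""
    return crossed_barriers(crossed_nodes, bounding_nodes) <= 1


# Nodes crossed between the gap and the landing site (assumed encodings).
SINGLE_WH_ISLAND = ["S", "S'", "S"]             # as in (11)
DOUBLE_WH_ISLAND = ["S", "S'", "S", "S'", "S"]  # as in (13)
DECLARATIVE_STEP = ["S'", "S"]                  # assumption: step from the clause edge

CASES = {
    "single wh-island": SINGLE_WH_ISLAND,
    "double wh-island": DOUBLE_WH_ISLAND,
    "declarative": DECLARATIVE_STEP,
}

for bounding in ({"S'"}, {"S"}, {"S", "S'"}):
    for name, crossed in CASES.items():
        verdict = "ok" if obeys_subjacency(crossed, bounding) else "barred"
        print(sorted(bounding), name, verdict)
```

With S' as the only bounding node, the single wh-island is passable and the double one is not; with S, the single wh-island is already barred; with both S and S', even the declarative step is barred, matching the pattern described above.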
In retrospect, this first example was far from an ideal case of
parameter: the facts are subtle, complex and variable across varieties
and idiolects, etc. Nevertheless, the important thing is that it quickly
became apparent that the concept of parameter could be extended to
other more prominent cases of syntactic variation, and that in fact the
whole cross-linguistic variation in syntax could be addressed in these
terms, thus doing away entirely with the notion of a language-specific
rule system. Particular grammars could be conceived of as direct in-
stantiations of Universal Grammar, under particular sets of paramet-
ric values (see Chomsky (1981) and, among many other publications,
different papers collected in Kayne (1984, 2001), Rizzi (1982, 2000)).
Within the new approach, Universal Grammar is no longer just a
grammatical metatheory: it becomes an integral component of
particular grammars. Specifically, UG is a system of universal
principles, some of which contain parameters, choice points which
can be fixed in one of a limited number of ways. A particular grammar
then is immediately derived from UG by fixing the parameters in a
certain way: Italian, French, Chinese, etc. are direct expressions of UG
under particular, and distinct, sets of parametric values. No language-
specific rule system is postulated: structures are directly computed by
UG principles, under particular parametric choices. At the same time,
the notion of a construction-specific rule dissolves. Take for instance
the passive, in a sense the prototypical case of a construction-specific
rule. The passive construction is decomposed into more elementary
operations, each of which is also found elsewhere. On the one hand,
the passive morphology intercepts the assignment of the external
Thematic Role (Agent, in the example given below) to the subject
position and optionally diverts it to the by phrase, as in the underlying
representation (14a); by dethematizing the subject, this process
also prevents Case assignment to the object (via so-called Burzio's
generalization, see Burzio (1986)); then, the object left without a
Case moves to subject position, as in (14b) (on Case Theory and the
relevance of Case to trigger movement, see below):

(14) a. was washed the car (by Bill)
b. The car was washed (by Bill)

None of these processes is specific to the passive: the interception of
the external thematic role and optional diversion to a by phrase is also
found, for instance, in one of the causative constructions in Romance
(with Case assigned to the object by the complex predicate faire+V in
(15)); movement of the object to a non-thematic subject position is also
found with unaccusative verbs, verbs which do not assign a thematic
role to the subject as a lexical property and are morphologically marked
in some Romance and Germanic languages by the selection of auxiliary
be, as in (16) in French (Perlmutter 1978, Burzio 1986):

(15) Jean a fait laver la voiture (par Pierre)
Jean made wash the car (by Pierre)
(16) Jean est parti
Jean has left
So, the “passive construction” dissolves into more elementary consti-
tuents: a piece of morphology, an operation on thematic grids, move-
ment. The elementary constituents have a certain degree of modular
autonomy, and can recombine to give rise to different constructions
under language-specific parametric values.
A crucial contribution of parametric models is that they provided
an entirely new way of looking at language acquisition. Acquiring a
language amounts, in terms of such models, to fixing the parameters
of UG on the basis of experience. The child interprets the incoming
linguistic data through the analytic devices provided by Universal
Grammar, and fixes the parameters of the system on the basis of the
analyzed data, his linguistic experience. Acquiring a language thus
means selecting, among the options generated by the mind, those
which match experience, and discarding the other options. So, acquir-
ing an element of linguistic knowledge amounts to discarding the
other possibilities offered a priori by the mind; learning is then
achieved “by forgetting,” a maxim adopted by Mehler and Dupoux
(1992) in connection with the acquisition of phonological systems:
acquiring the phonetic distinctions used in one's language amounts
to forgetting the others, in the inventory available a priori to the child's
mind, so that at birth every child is sensitive to the distinction between
/l/ and /r/, or /t/ and /ṭ/ (dental vs. retroflex), but after a few months
the child learning Japanese will have “forgotten” the /l/ vs. /r/ distinc-
tion, and the child learning English will have “forgotten” the /t/ vs. /ṭ/
distinction, etc., because they will have kept the distinctions used by
the language they are exposed to and discarded the others. Under the
parametric view, “learning by forgetting” seems to be appropriate for
the acquisition of syntactic knowledge as well.
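The idea of acquisition as selection among pre-given options can be given a toy illustration, using the bounding-node parameter discussed earlier. Everything in the sketch is an assumption made for the sake of the example: the three candidate values, the encoding of each attested movement as the list of clausal nodes it crosses, and the function names. It is not meant as a model of how children actually fix parameters.

```python
# Candidate values of a hypothetical bounding-node parameter.
BOUNDING_OPTIONS = (
    frozenset({"S"}),
    frozenset({"S'"}),
    frozenset({"S", "S'"}),
)


def obeys_subjacency(crossed_nodes, bounding_nodes):
    """At most one bounding node may be crossed by a movement."""
    return sum(node in bounding_nodes for node in crossed_nodes) <= 1


def fix_parameter(options, experience):
    """Keep only the values compatible with every movement in the input.

    Each datum is a movement the learner has encountered, encoded as the
    list of clausal nodes it crosses; values that would bar an attested
    movement are discarded ("forgotten").
    """
    surviving = list(options)
    for movement in experience:
        surviving = [value for value in surviving
                     if obeys_subjacency(movement, value)]
    return surviving


# Toy experience: a movement crossing S' and S, then an extraction
# from a single indirect question, as in (11).
experience = [["S'", "S"], ["S", "S'", "S"]]
print(fix_parameter(BOUNDING_OPTIONS, experience))
```

On this toy input the first datum discards the value that takes both S and S' as bounding nodes, and the second discards S, so that only S' survives. With no input at all, every option remains available.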
The Principles and Parameters approach offered a new way of ad-
dressing the logical problem of language acquisition, in terms which
abstract away from the actual time course of the acquisition process
(see Lightfoot (1989) and references discussed there). But it also gen-
erated a burst of work on language development: how is parameter
fixation actually done by the child in a concrete time course? Can it
give rise to observable developmental patterns, e.g. with the resetting
of some parameters after exposure to a sizable experience, or under
the effect of maturation? Hyams's (1986) approach to subject drop in
child English opened a line of inquiry on the theory-conscious study
of language development which has fully flourished in the last decade

