Poverty of the Stimulus

In linguistics, the poverty of the stimulus (POS) is the assertion that natural language grammar is unlearnable given the relatively limited data available to children learning a language, and therefore that this knowledge is supplemented with some sort of innate linguistic capacity.

Nativists claim that humans are born with a specific representational adaptation for language that both enables and constrains their capacity to acquire particular types of natural languages over the course of their cognitive development and linguistic maturation. The argument is now generally used to support theories and hypotheses of generative grammar. The term “poverty of the stimulus” was coined by Noam Chomsky in his work Rules and Representations, although the thesis itself emerged from several of Chomsky’s writings on the issue of language acquisition. The argument forms the backbone of the theory of universal grammar and has long been controversial within the field of linguistics. Arguments in support of poverty of stimulus are not attempting to appeal to innate principles in exchange for learning appellates of universal grammar.

Summary

History

Although Chomsky coined the term “poverty of the stimulus” in 1980, the concept is directly linked to another Chomskyan idea, Plato’s Problem, a philosophical approach he outlined in the first chapter of Aspects of the Theory of Syntax in 1965. Chomsky asserts that there is a physiological component in the brain that develops in children, and that this is why they are able to acquire language universally. Plato’s Problem traces back to the Meno, a Socratic dialogue in which Socrates draws out mathematical knowledge from a servant who was never explicitly taught the relevant geometric concepts. Plato’s Problem directly parallels the idea of the innateness of language, universal grammar, and more specifically the poverty of the stimulus argument, because the dialogue is taken to show that people have an innate ability to understand concepts they were never exposed to. Chomsky likewise argues that children are not exposed to all the structures of a language, yet they attain the necessary linguistic knowledge at an early age.

Premises

Though Chomsky and his supporters have stated the argument in a variety of ways (indeed, Pullum and Scholz (2002) identify no fewer than 13 different “sub-arguments” that can optionally form part of a poverty-of-stimulus argument), one frequent structure of the argument can be summed up as follows:

Premises:

1. There are patterns in all natural languages that cannot be learned by children using positive evidence alone. Positive evidence is the set of grammatical sentences that the language learner has access to, as a result of observing the speech of others. Negative evidence, on the other hand, is the evidence available to the language learner about what is not grammatical. For instance, when a parent corrects a child’s speech, the child acquires negative evidence.
2. Children are presented only with positive evidence for these particular patterns. For example, they hear others speaking using only sentences that are “right”, not those that are “wrong”. Negative evidence is not available to children in a way that they could use to learn language.
3. Children do learn the correct grammars for their native languages.

Conclusion: Therefore, human beings must have some form of innate linguistic capacity that provides additional knowledge to language learners. Essentially, the stimulus alone does not adequately explain the process of learning. The poverty of stimulus argument attempts to explain how native speakers develop a capacity to identify possible and impossible interpretations through ordinary experience. Thus, “language acquisition is not merely a matter of acquiring a capacity to associate word strings with interpretations. Much less is it a mere process of acquiring a (weak generative) capacity to produce just the valid word strings of the language.”

Proposed evidence

For the argument

Those advocating the poverty of the stimulus argument believe that the linguistic input children receive is not rich enough for them to acquire language through exposure to linguistic stimuli alone, and they therefore argue that there must be some innate, biological mechanism that allows children to learn language. Several patterns in language have been claimed to be unlearnable from positive evidence alone. One example is the hierarchical nature of languages. The grammars of human languages produce hierarchical tree structures, and some linguists argue that human languages are also capable of infinite recursion (see context-free grammar). For any given set of sentences generated by a hierarchical grammar capable of infinite recursion, there is an indefinite number of grammars that could have produced the same data. This would make learning any such language impossible. Indeed, a proof by E. Mark Gold has been taken to show that any formal language with hierarchical structure capable of infinite recursion is unlearnable from positive evidence alone, in the sense that it is impossible to formulate a procedure that will discover with certainty the correct grammar given any arbitrary sequence of positive data in which each utterance occurs at least once. However, this does not preclude arriving at the correct grammar from typical input sequences rather than particularly malicious ones, or arriving at an almost perfect approximation to the correct grammar. Indeed, it has been proposed that under very mild assumptions (ergodicity and stationarity), the probability of producing a sequence that renders language learning impossible is in fact zero.
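
The flavor of Gold’s result can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not Gold’s proof: the learner, the toy language of strings a, aa, aaa, and so on, and all function names are invented here. The learner conjectures exactly the set of sentences it has seen so far. That strategy settles on the right answer for any finite language, but on an infinite language it has to revise its conjecture forever.

```python
from itertools import islice


def text_of_infinite_language():
    """Positive-only presentation of the toy language {a, aa, aaa, ...}."""
    n = 1
    while True:
        yield "a" * n
        n += 1


def conservative_learner(text):
    """After each sentence, conjecture the finite language seen so far."""
    seen = set()
    for sentence in text:
        seen.add(sentence)
        yield frozenset(seen)


def count_mind_changes(conjectures):
    """Count how often the learner's conjecture differs from its previous one."""
    changes = 0
    previous = None
    for guess in conjectures:
        if previous is not None and guess != previous:
            changes += 1
        previous = guess
    return changes


# A finite language {a, aa}: the learner changes its mind once, then stays put.
finite_text = ["a", "aa", "a", "aa", "aa", "a"]
finite_changes = count_mind_changes(conservative_learner(finite_text))

# The infinite language: every new sentence forces a new conjecture.
first_fifty = islice(conservative_learner(text_of_infinite_language()), 50)
infinite_changes = count_mind_changes(first_fifty)

print(finite_changes, infinite_changes)
```

Here the learner stabilizes after one revision on the finite language but makes 49 revisions in its first 50 conjectures on the infinite one. A different learner could of course handle this particular infinite language; Gold’s result concerns the impossibility of a single procedure that succeeds on every language in a class containing all finite languages and at least one infinite one.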

Another example of a language pattern claimed to be unlearnable from positive evidence alone is subject-auxiliary inversion in questions, for example:

You are happy.
Are you happy?

There are two hypotheses the language learner might postulate about how to form questions: (1) the first auxiliary verb in the sentence (here ‘are’) moves to the beginning of the sentence, or (2) the ‘main’ auxiliary verb in the sentence moves to the front. In the sentence above, both rules yield the same result since there is only one auxiliary verb. But the difference is apparent in this case:

Anyone who is interested can see me later.

* Is anyone who interested can see me later?
Can anyone who is interested see me later?

Of course, the result of rule (1) is ungrammatical while the result of rule (2) is grammatical. So rule (2) is (approximately) what English actually uses, not rule (1). The claim, then, is first that children do not encounter sentences as complicated as this one often enough to witness a case where the two hypotheses yield different results, and second that, on the positive evidence of simple sentences alone, children could not possibly decide between (1) and (2). Moreover, even sentences such as these are compatible with a number of incorrect rules (such as “front any auxiliary”). Thus, if rule (2) were not innately known to infants, we would expect roughly half of the adult population to use (1) and half to use (2). Since that does not occur, the argument concludes that rule (2) must be innately known. (See Pullum 1996 for the complete account and critique.)
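
The difference between the two hypotheses can be made concrete with a minimal Python sketch. The clause representation (a dictionary with hypothetical keys subject, aux and predicate), the small auxiliary list, and the function names are assumptions made for illustration; they are not drawn from any linguistic theory or library. Rule (1) looks only at the linear word string, while rule (2) needs to know which auxiliary belongs to the main clause.

```python
# Hypothetical, deliberately tiny list of auxiliaries for this illustration.
AUXILIARIES = {"is", "are", "can", "will"}


def flatten(clause):
    """Return the clause as a flat list of words (the declarative sentence)."""
    return clause["subject"] + [clause["aux"]] + clause["predicate"]


def linear_rule(clause):
    """Rule (1): front the first auxiliary in the word string, ignoring structure."""
    words = flatten(clause)
    i = next(k for k, word in enumerate(words) if word in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]


def structural_rule(clause):
    """Rule (2): front the auxiliary of the main clause."""
    return [clause["aux"]] + clause["subject"] + clause["predicate"]


def as_question(words):
    """Join words into a capitalized question string."""
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "?"


simple = {"subject": ["you"], "aux": "are", "predicate": ["happy"]}
embedded = {
    "subject": ["anyone", "who", "is", "interested"],  # contains its own auxiliary
    "aux": "can",
    "predicate": ["see", "me", "later"],
}

for clause in (simple, embedded):
    print(as_question(linear_rule(clause)))
    print(as_question(structural_rule(clause)))
```

On the simple sentence both rules produce “Are you happy?”, so that sentence cannot tell them apart. On the sentence with an embedded relative clause, the linear rule produces the ungrammatical string and the structural rule produces the grammatical one.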

In one study it was shown that even very young children understand the syntactic structure of the anaphor ‘one’. There is not enough information in the input for a child to conclude whether ‘one’ is anaphoric only to N⁰ (it stands in for the noun alone) or to N′ (it stands in for both the noun and the adjective). Despite this, children in the study showed that they understood utterances using ‘one’, as measured by their attention to one object over another.

The last premise, that children successfully learn language, is considered to be evident in human speech. Though people occasionally make mistakes, human beings rarely speak ungrammatical sentences, and generally recognize them as such when they do say them. Ungrammatical here is meant in the descriptive sense rather than the prescriptive.

To apply the idea of universal grammar (UG) and POS to a real-life situation, one can look to second-language acquisition. If an L2 learner acquires knowledge about a second language that they gained neither from the language input they have experienced nor from their first language, they must have gained it from UG. This would mean that L2 learners have an innate principle for learning, which supports the poverty of the stimulus argument. The main example of “such innate knowledge is the principle of structure-dependency”. Not all languages are structure-dependent; therefore this concept is a useful test for the issue of innate knowledge. If a person shows understanding of structure-dependency in the L2, was not directly taught it, and has a first language that is not structure-dependent, this supports the concept of innate knowledge and therefore the poverty of the stimulus argument. A study by Cook presented three types of sentences to L2 learners:

A: Joe is [the dog that is black].

B: Is Joe [the dog that is black]?

C: * Is Joe is [the dog that black]?

The subjects of the study were 35 native speakers of English and 140 L2 speakers of English whose first languages were Polish, Finnish, Dutch, Japanese, Chinese, and Arabic. Each subject read a list of 96 sentences and rated each as OK, not OK, or not sure. The subjects were not taught structure-dependency in relation to English prior to the study. For sentences structured like example C above, speakers of Polish, Finnish, Dutch and Japanese all scored 95% correct or above; Arabic speakers scored 87.1% correct; and Chinese speakers scored 86.8% correct. This is taken to show that L2 learners observe structure-dependency without having been taught it, as the poverty of the stimulus argument predicts.

Against the argument

Notable figures in the philosophical and empirical study of the mind have challenged various aspects of the poverty of the stimulus argument. Much of the criticism comes from researchers who study language acquisition and computational linguistics. Additionally, some connectionist researchers have rejected aspects of Chomsky’s model because its premises are at odds with connectionist views about the structure of cognition. Constructionists are theorists who reject the Chomskyan arguments and hold that language is learned through some kind of functional distributional analysis (Tomasello 1992). One problem in language acquisition is called the no negative evidence problem: children can apparently use only positive evidence to learn language. Constructionists appeal to statistical and social learning mechanisms, which they claim can overcome a lack of negative evidence, whereas nativists appeal to theories of linguistic constraints (Baker 1979, Jackendoff 1975).

One common critique is that positive evidence is actually enough to learn the various patterns that Chomskyan linguists claim are unlearnable from positive evidence alone. A common argument is that the brain’s mechanisms of statistical pattern recognition could solve many of the difficulties stated by the argument. For example, researchers using neural networks and other statistical methods have programmed computers to learn rules such as (2) cited above, and have claimed to have successfully extracted hierarchical structures, all using positive evidence alone. Indeed, Klein & Manning (2002) report constructing a computer program that is able to retrieve 80% of all correct syntactic analyses of text in the Wall Street Journal Corpus using a statistical learning mechanism (unsupervised grammar induction), demonstrating a clear move away from “toy” grammars. In another study, a probabilistic model with no programmed preconceptions about grammar was trained on many newspaper articles. Although the researchers had removed all articles containing the sentence “colorless green ideas sleep furiously”, the model, after “reading” thousands of articles, rated that sentence 10,000 times more probable than a scrambled ungrammatical version. This has been offered as evidence that statistical analysis without preconceptions can reveal general grammatical rules with human-like accuracy. Also supporting the idea of learning through statistical reasoning is the Bayesian model of language acquisition. This model provides a rational approach to language learning, suggesting that a person does not need to experience all the structures and concepts of a language in order to learn them.
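
The general idea of scoring an unseen sentence from positive data alone can be sketched with a toy bigram model in Python. This is not the model used in the study described above: the four-sentence corpus is invented, add-one smoothing is an assumption chosen for simplicity, and all names are hypothetical. The point is only that a sentence absent from the training data can still receive a higher probability than its scrambled counterpart.

```python
import math
from collections import Counter

# Invented toy corpus; the target sentence itself never appears in it.
corpus = [
    "colorless green paint dries slowly",
    "green ideas matter",
    "new ideas sleep in drawers",
    "tired children sleep furiously",
]


def bigrams(sentence):
    """Split a sentence into word pairs, with start and end markers."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return list(zip(tokens, tokens[1:]))


bigram_counts = Counter()
context_counts = Counter()
vocabulary = set()
for sentence in corpus:
    for first, second in bigrams(sentence):
        bigram_counts[(first, second)] += 1
        context_counts[first] += 1
        vocabulary.update((first, second))


def log_prob(sentence):
    """Log probability under a bigram model with add-one smoothing."""
    size = len(vocabulary)
    return sum(
        math.log((bigram_counts[pair] + 1) / (context_counts[pair[0]] + size))
        for pair in bigrams(sentence)
    )


target = "colorless green ideas sleep furiously"
scrambled = "furiously sleep ideas green colorless"
ratio = math.exp(log_prob(target) - log_prob(scrambled))
print(ratio)
```

In this toy setting the unseen target sentence comes out 64 times more probable than the scrambled version, one factor of 2 for each of its six bigrams. This says nothing about the 10,000-fold figure reported in the study, which was trained on newspaper text rather than four invented sentences.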

Another suggested flaw in the poverty of the stimulus argument is that preliminary research indicates that some languages, for instance the Pirahã language of the Amazon as described by Daniel Everett, seem to violate some of the specific precepts of Chomsky’s models of universal grammar. Creoles and pidgins were thought to support the universal grammar hypothesis, but research indicates that pidgin learners systematize the language based on the probability and frequency of forms, not by universal grammar. This criticizes universal grammar on the basis that languages are dynamic and not fixed. However, Chomsky has proposed a solution for such apparent outliers, claiming that the fact that a language does not display certain features of universal grammar does not mean that the fundamentals of these ideas do not exist in its speakers’ brains. The supposedly missing features may simply fail to present themselves because of extrinsic constraints. Chomsky claims that all humans possess the capacity to learn and use language based on universal grammar; if a group of speakers fails to display some feature, they are lacking not in ability but only in manifestation. The argument from Pirahã has itself become a topic of dispute and skepticism because several scholars have criticized Everett’s work and the validity of the data.

There is also criticism about whether negative evidence is really so rarely encountered by children. Pullum argues that learners probably do get certain kinds of negative evidence. In addition, if one allows for statistical learning, the available negative evidence may be enough. It has been proposed that if a language pattern seems to be intuitive but is never encountered, then the language learner might regard the absence of this pattern as negative evidence. Chomsky accepts that this kind of negative evidence plays a role in language acquisition, terming it “indirect negative evidence”, though he does not think that indirect negative evidence is sufficient for language acquisition to proceed without universal grammar. Contrary to this claim, however, Ramscar and Yarlett (2007) designed a learning model that successfully simulates the learning of irregular plurals based on negative evidence, and confirmed the predictions of this simulation in empirical tests of young children. Ramscar and Yarlett suggest that failures of expectation function as forms of implicit negative feedback that allow children to correct their errors.
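
How the absence of an expected pattern can act as evidence can be shown with a short Bayesian calculation in Python. This is a sketch under stated assumptions, not a model taken from the literature cited here: it supposes two hypothetical grammars, a permissive one that allows forms x and y with equal probability and a restrictive one that allows only x, and it supposes the learner hears form x in every observation.

```python
from fractions import Fraction


def posterior_restrictive(n_observations, prior_restrictive=Fraction(1, 2)):
    """Posterior probability of the restrictive grammar after hearing form x
    in all n observations and form y in none of them."""
    # Restrictive grammar: only x is possible, so each x has likelihood 1.
    likelihood_restrictive = Fraction(1) ** n_observations
    # Permissive grammar (assumed): x and y equally likely, so each x has likelihood 1/2.
    likelihood_permissive = Fraction(1, 2) ** n_observations
    prior_permissive = 1 - prior_restrictive
    evidence = (
        prior_restrictive * likelihood_restrictive
        + prior_permissive * likelihood_permissive
    )
    return prior_restrictive * likelihood_restrictive / evidence


posteriors = {n: posterior_restrictive(n) for n in (0, 1, 5, 10)}
for n, value in posteriors.items():
    print(n, value, float(value))
```

With equal priors, the posterior for the restrictive grammar rises from 1/2 with no data to 2/3 after one observation and 1024/1025 after ten. No one ever tells the learner that form y is ungrammatical; its continued absence is what shifts belief.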

As for the argument based on Gold’s proof, it is not clear that human languages are truly capable of infinite recursion. No speaker can in fact produce a sentence with an infinitely recursive structure, and in certain cases (for example, center embedding), people are unable to comprehend sentences with only a few levels of recursion. Chomsky and his supporters have long argued that such cases are best explained by restrictions on working memory, since this provides a principled explanation for limited recursion in language use. Some critics argue that this makes the premise unfalsifiable. It is also questionable whether Gold’s research has any bearing on natural language acquisition at all, since what Gold showed is that there are certain classes of formal languages for which some language in the class cannot be learned given positive evidence alone. Some have concluded that it is not clear that natural languages fall into such a class, and that even if they do, they may not be among the languages that cannot be learned.

Finally, it has been argued that people may not learn exactly the same grammars as each other. If this is the case, then only a weak version of the third premise is true, as there would be no fully “correct” grammar to be learned. However, in many cases, poverty of the stimulus arguments do not in fact depend on the assumption that there is only one correct grammar, but rather that there is only one correct class of grammars. For example, the poverty of the stimulus argument from question formation depends only on the assumption that everyone learns a structure-dependent grammar.
