Why is language ambiguous? (ambiguity, pt 1.)
Note: This post has been migrated from my old WordPress site (originally posted on October 8, 2018).
Human language is full of ambiguity. Most people are familiar with homophones (words that sound the same but have different meanings), such as bank (e.g. the bank of a river vs. a place to deposit your money). But ambiguity cuts across multiple levels of language, from inflectional morphemes (–s can mark a plural noun, a 3rd-person singular verb in the present tense, or a possessive) to syntactic structures (e.g. he saw the man with the telescope).
Intuitively, it seems like the prevalence of this ambiguity ought to make communication difficult. The more interpretations any given word (or syntactic unit, utterance, etc.) has, the higher the probability that a listener will misinterpret the intended meaning. Indeed, some (Chomsky, 2002) have cited ambiguity as possible evidence that language may not have evolved for communication at all.
Others, however, argue that ambiguity actually serves a communicative function––in other words, that it’s a feature, not a bug. Below, I’ll summarize some of their theoretical arguments, then describe a recent empirical study that sheds light on which parts of a lexicon are ambiguous, and why.
Ambiguity in Language: Theoretical Foundations
The best-known argument was initially put forth by Zipf (1949), and goes like this: languages evolve to meet the demands of their speakers, but speakers and listeners have competing demands. According to Zipf, both speakers and listeners want to “minimize their effort”. For speakers, he argued, an optimal language would consist of a single word, e.g. ba. To express any meaning, all a speaker need say is “ba”. For listeners, an optimal language would map every meaning to a distinct form, to avoid the need to infer which meaning the speaker intended. Together, these competing demands (which Zipf terms unification and diversification, respectively) produce a language with “some, but not total, ambiguity” (Piantadosi et al, 2012; pg. 3).
One obvious limitation to this argument, as pointed out in Wasow et al (2005) and Piantadosi et al (2012), is that a totally ambiguous language does not truly minimize a speaker’s effort––after all, if a listener misinterprets what the speaker meant, the speaker has to spend additional effort clarifying what they meant.
Thus, Piantadosi et al (2012) posit a slightly different set of trade-offs: rather than unification and diversification, language evolves to satisfy the competing demands of clarity (signals in which the intended meaning has a high probability of being correctly derived) and ease (signals which are easy to produce and process). “Easy” signals include those which are short, frequent, and phonotactically well-formed (i.e., more like other signals in that language). But since there is a limited number of these signals in any given language, ease is sometimes sacrificed in the pursuit of clarity. Here, Piantadosi et al (2012) cite the NATO phonetic alphabet (“alpha” for a, “bravo” for b, etc.) as an example: in order to avoid confusion, monosyllabic letter names are replaced with bisyllabic words. Similarly, clarity is sometimes sacrificed for ease, as in the case of referential pronouns, which are ambiguous, but usually short and easy to produce (e.g., “he”).
Piantadosi et al (2012) then argue that these competing pressures produce communication systems that are optimized for efficiency:
First, apparently ambiguous signals (e.g. those in which clarity is sacrificed) are almost always unambiguous in context. For example, in the case of “run”, which can be either a noun or verb, the meaning is often disambiguated by the preceding word, e.g. “a run” or “we run”. Thus, language permits ambiguity because the intended meaning is usually made clear by the surrounding context of use, so that there is no need to provide additional information in the signal itself.
Second, ambiguity means that particularly easy signals (e.g. those that are short, frequent, and well-formed) can be repurposed for multiple meanings. This benefits speakers and listeners alike, in that their lexicon will contain words that are easier to produce and process.
The theory outlined by Piantadosi et al (2012) does seem intuitive, and others (Levinson, 2000) have made similar arguments. But is it accurate? Fortunately, as all good theories must, it makes specific, testable predictions: if one benefit of ambiguity is the recycling of “easy” linguistic units, then linguistic units that are easier to produce and process should have more meanings associated with them. In other words, these expressions should act as “attractors”, acquiring multiple meanings because of their convenience and ease.
Based on psycholinguistic evidence, we know there are several variables that strongly affect ease of lexical processing and production: frequency (more frequent words are easier), length (shorter words are easier), and phonotactic well-formedness (how much a word conforms to the phonotactic rules of a language). These variables, then, are the predictors.
Piantadosi et al (2012) operationalized ambiguity as the number of meanings that a given word (or syllable) has. In the first analysis, this was measured as the number of homophones: the number of words with the same form but distinct, unrelated meanings. In the second analysis, this was measured as the number of senses of a word; this included homophones, but also words with polysemous relationships (e.g. run as in “the train runs between Boston and New York” vs. “John runs to the store”). Finally, in the third analysis, the authors asked whether certain syllables appeared in more words; following the same logic, “easier” syllables should be recycled more often, in more distinct words.
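To make this operationalization concrete, here is a minimal sketch in Python. It is not the authors' analysis: the lexicon is a hypothetical toy list I made up, length in letters stands in for the length measures used in the paper, and a simple average per length replaces the regression models. It only shows the shape of the question: count the meanings attached to each word form, then ask whether shorter forms carry more of them.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy lexicon: (word form, meaning label) pairs.
# These entries are illustrative only; they are not data from the paper.
TOY_LEXICON = [
    ("bank", "river edge"),
    ("bank", "financial institution"),
    ("run", "move quickly on foot"),
    ("run", "a scheduled route"),
    ("run", "operate a machine"),
    ("bat", "flying mammal"),
    ("bat", "club used in baseball"),
    ("telescope", "optical instrument"),
    ("ambiguity", "having multiple interpretations"),
    ("morpheme", "smallest meaningful unit"),
]


def meanings_per_form(lexicon):
    """Count the distinct meanings listed for each word form."""
    meanings = defaultdict(set)
    for form, meaning in lexicon:
        meanings[form].add(meaning)
    return {form: len(senses) for form, senses in meanings.items()}


def mean_meanings_by_length(counts):
    """Average number of meanings for word forms of each length (in letters)."""
    by_length = defaultdict(list)
    for form, n in counts.items():
        by_length[len(form)].append(n)
    return {length: mean(ns) for length, ns in sorted(by_length.items())}


counts = meanings_per_form(TOY_LEXICON)
by_length = mean_meanings_by_length(counts)
for length, avg in by_length.items():
    print(f"length {length}: {avg:.2f} meanings on average")
```

In this toy lexicon the three-letter forms average more meanings than the eight- and nine-letter forms, which is the direction the theory predicts. A real test would need a full lexical database and would have to control for frequency and phonotactic probability at the same time.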
Across English, Dutch, and German, various operationalizations of ease systematically and significantly predicted the ambiguity of a word (or syllable). That is, words which were shorter and/or more frequent tended to attract more meanings (phonotactic probability did not reliably predict homophony across languages).
This finding is consistent with the predictions of the theory outlined above. The languages surveyed all contain lexical ambiguity, but lexical ambiguity is most concentrated in regions of the lexicon that theoretically should be easier to produce and process.
At first glance, the prevalence of ambiguity in human language seems suboptimal. But Piantadosi et al (2012) suggest that ambiguity is actually a design feature of language: it makes language more “efficient” by recycling easier words for multiple meanings, rather than relying on longer, more difficult words. And because the meaning of any given word should be clear in context, this lexical ambiguity doesn’t impose undue costs on either speakers or listeners.
Of course, there are a few limitations to this theory and the corresponding analysis. First, the analysis is conducted only on Germanic languages (English, German, and Dutch). This convenience sample is understandable, given that these languages are very well-documented, but a clear avenue for future research would be to ask whether this finding replicates in non-Germanic languages.
Second, the analysis is primarily of lexical ambiguity. Even if the theory explains why lexicons contain ambiguity, it is unclear whether it extends to cases of syntactic or pragmatic ambiguity––both of which might exist for different reasons, and exhibit different patterns in language use.
Third, the theory assumes that disambiguation is not inordinately expensive for listeners. This assumption is not exclusive to Piantadosi et al (2012); as pointed out in the paper, Levinson (2000) argues that languages will typically minimize the effort involved in articulation, and rely more on listener inference, as inference is “cognitively cheap”. But even if it is generally true, it does suggest an avenue for further exploration: if certain kinds of ambiguity do require expensive inferences to resolve, one would expect a language to minimize those kinds of ambiguity, or to have well-developed mechanisms for repairing misinterpretations.
A related assumption is that ambiguous words will generally be clear in context. In order for disambiguation to be “cheap”, and for lexical ambiguity to not result in costly miscommunications, a language must have contextual cues (linguistic, situational, etc.) available to make the intended meaning clear. Piantadosi et al (2012) argue that if context is informative at all about meaning, then a word will necessarily be less ambiguous when coupled with a context than without it. A well-designed communication system will therefore allow context to provide meaning, and won’t build redundancies into the meanings of particular words. This last assumption is intuitive, and perhaps seems trivially true to some (of course words are less ambiguous in context), but again, it opens the door for interesting questions. The most interesting of these questions to me is: given that context is informative about meaning, which contextual cues are used for which kinds of ambiguity?
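The claim that context can only reduce ambiguity can be illustrated with a small calculation. The sketch below treats ambiguity as the entropy of the meaning distribution and compares it with the entropy of meaning given context; conditioning never increases entropy on average. The joint probabilities for “run” after “a” versus “we” are numbers I invented for illustration, not estimates from any corpus.

```python
from collections import defaultdict
from math import log2

# Hypothetical joint probabilities P(context, meaning) for the word "run".
# The numbers are invented for illustration and sum to 1.
JOINT = {
    ("a", "noun"): 0.35,
    ("a", "verb"): 0.05,
    ("we", "noun"): 0.05,
    ("we", "verb"): 0.55,
}


def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def meaning_entropy(joint):
    """H(M): uncertainty about the meaning when context is ignored."""
    marginal = defaultdict(float)
    for (_, meaning), p in joint.items():
        marginal[meaning] += p
    return entropy(marginal.values())


def conditional_entropy(joint):
    """H(M | C): average uncertainty about the meaning once context is known."""
    context_prob = defaultdict(float)
    for (context, _), p in joint.items():
        context_prob[context] += p
    # H(M | C) = -sum over (c, m) of P(c, m) * log2 P(m | c)
    return -sum(
        p * log2(p / context_prob[context])
        for (context, _), p in joint.items()
        if p > 0
    )


h_m = meaning_entropy(JOINT)
h_m_given_c = conditional_entropy(JOINT)
print(f"H(meaning)           = {h_m:.3f} bits")
print(f"H(meaning | context) = {h_m_given_c:.3f} bits")
```

With these made-up numbers, knowing the preceding word removes roughly half of the uncertainty about whether “run” is a noun or a verb. The two entropies would be equal only if context carried no information about meaning at all.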
Chomsky, N. (2002). An interview on minimalism. N. Chomsky, On Nature and Language, 92–161.
Levinson, S. (2000). Presumptive meanings: The theory of generalized conversational implicature. MIT Press.
Piantadosi, S. T., Tily, H., & Gibson, E. (2012). The communicative function of ambiguity in language. Cognition, 122(3), 280–291. https://doi.org/10.1016/j.cognition.2011.10.004
Wasow, T., Perfors, A., & Beaver, D. (2005). The Puzzle of Ambiguity. Morphology and the Web of Grammar: Essays in Memory of Steven G. Lapointe, 1–18.
Zipf, G. (1949). Human behavior and the principle of least effort. New York: Addison-Wesley.