# CONTEXT IN COMMUNICATION: A COGNITIVE VIEW

EDITED BY: Gabriella Airenti, Marco Cruciani and Alessio Plebe

PUBLISHED IN: Frontiers in Psychology


ISSN 1664-8714 ISBN 978-2-88945-142-5 DOI 10.3389/978-2-88945-142-5



Topic Editors: **Gabriella Airenti,** University of Torino, Italy; **Marco Cruciani,** University of Trento, Italy; **Alessio Plebe,** University of Messina, Italy

Ancient Messene mosaic floor, Greece. Photo © Borisb17 | Dreamstime.com

Context is what contributes to the interpretation of a communicative act beyond the spoken words. It provides information essential to clarify the intentions of a speaker, and thus to identify the actual meaning of an utterance. A large body of research in Pragmatics has shown how wide-ranging and multifaceted this concept can be. Context spans from the preceding words in a conversation to the general knowledge that the interlocutors supposedly share, from the perceived environment to features and traits that the participants in a dialogue attribute to each other. This last category is also very broad, since it includes mental and emotional states, together with culturally constructed knowledge, such as the reciprocal identification of social roles and positions. Adopting a cognitive point of view brings to the foreground a number of new questions regarding how information about the context is organized in the mind and how this kind of knowledge is used in specific communicative situations. A related, very important question concerns the role played in this process by theory of mind (ToM) abilities, both in typical and atypical populations.

In this Research Topic, we bring together articles that address different aspects of context analysis from theoretical and empirical perspectives, integrating knowledge and methods derived from Philosophy of language, Linguistics, Cognitive Science, Cognitive Neuroscience, Developmental and Clinical Psychology.

**Citation:** Airenti, G., Cruciani, M., Plebe, A., eds. (2017). Context in Communication: A Cognitive View. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-142-5


# Editorial: Context in Communication: A Cognitive View

#### Gabriella Airenti<sup>1</sup>\* and Alessio Plebe<sup>2</sup>

*<sup>1</sup> Department of Psychology, Center for Cognitive Science, University of Torino, Torino, Italy, <sup>2</sup> Department of Cognitive Science, University of Messina, Messina, Italy*

Keywords: context, pragmatics, communication, common ground, theory of mind (ToM)

### **Editorial on the Research Topic**

### **Context in Communication: A Cognitive View**

Context is a controversial concept. Research in philosophy of language, linguistics and cognitive science has shown that the communicative content of an utterance is not limited to the conventional content of what is said. The notion of context was introduced in semantics and assumed a central role in language studies with the pragmatic turn, which shifted the focus from meaning to speaker's meaning, a change of paradigm that can be traced back to Wittgenstein's conception of language use (Wittgenstein, 1953) and to the work of philosophers of language like Austin (1962), Grice (1975, 1978), and Searle (1969). In this framework pragmatics deals with the intentional aspects of language use. The notion of context is then no longer restricted to the interpretation of indexicals and demonstratives (Kaplan, 1989). More generally, it applies to what is presupposed as common ground among the participants in a conversation (Stalnaker, 2002, 2014).

From a cognitive perspective, communication is an inferential process based on mental states and shared knowledge (Clark, 1996). Whatever contributes to interpreting a communicative act beyond the spoken words may, broadly speaking, be included in context. Intuitively, context is the background for comprehension, what makes communication possible. This is a critical point. In fact, context is both an inescapable concept in the study of communication and one that eludes univocal definition. There is no one context but many.

In launching this Research Topic we did not expect to find a final definition or to have the last say. We were interested in singling out the present lines of research in this field. The papers we have collected attack the problem from different perspectives and using different research methodologies.

The paper by Faber and León-Araúz aims at a definition of context that is, if not final, comprehensive and detailed. They propose a taxonomy based on scope: local, typically spanning five items before or after the term occurrence; and global, such as a whole text or all that goes beyond the text, such as the communicative situation. They apply this distinction to syntax, semantics, and pragmatics even if, as they note, at this level the boundaries are fuzzy. The challenging enterprise of detailing what context is becomes mandatory in formalizing specialized knowledge resources, but the results also shed light on the structure of context in general language.

On the way to clarifying what context constitutively is, García-Carpintero addresses Stalnaker's notion of context as common ground, mentioned above, and points out certain weaknesses. The Stalnakerian view of common ground as sets of propositions proves unsatisfying in cases of expressions with rich illocutionary features. The most convincing cases are those of slurs and pejoratives, where attempts to flatten the content into declarative form deprive context of important dimensions of expressive meaning. Therefore context, in addition to sets of propositions, should be extended to include shared propositional commitments. Although the case of pejoratives and slurs is the most convincing, the requirement for shared commitments appears in other cases examined by García-Carpintero as well: directives, questions, predicates of taste, and pretense.

Edited and reviewed by:

*Manuel Carreiras, Basque Center on Cognition, Brain and Language, Spain*

> \*Correspondence: *Gabriella Airenti gabriella.airenti@unito.it*

#### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *18 December 2016* Accepted: *17 January 2017* Published: *06 February 2017*

#### Citation:

*Airenti G and Plebe A (2017) Editorial: Context in Communication: A Cognitive View. Front. Psychol. 8:115. doi: 10.3389/fpsyg.2017.00115*

Notably, the set of shared commitments proposed by García-Carpintero includes aspects of the emotional state of the speaker. A step further inside the personal and interpersonal spheres is taken by Marques, who investigates predicates of personal taste and of aesthetic or moral value. A well-known problem afflicting contextual explanations is disagreement. If two conflicting judgments can be explained by simply augmenting the original sentences with propositions about the context of the two speakers, disagreement should disappear. Marques argues for contextualism, suggesting that disagreement can be addressed by taking into account differences in non-doxastic attitudes, and is enhanced by the evolutionary reinforcement of certain personal dispositions in social coordination.

The main contender to the contextualist strategy defended by Marques is relativism, which is contrasted with expressivism in the paper by Frápolli and Villanueva. The idea is that there are two main ways to accommodate context dependence, which they call building-block and organic models. The former, which gives prominence to the principle of compositionality over the principle of context, is proper to relativism, while the latter, which privileges context over compositionality, belongs to expressivism.

While in the group of papers described so far the main perspective under which context is studied is semantic, enriched with insights on mental phenomena, in the next group the cognitive perspective prevails, asking questions about how context is structured and accessed in the mind. Mazzone builds upon one of the most developed theories in cognitive pragmatics, Relevance Theory (Sperber and Wilson, 1986), and discusses how far this theory succeeds in explaining the way relevant context is constructed during utterance understanding. He identifies a weakness in the theory's account of the mechanisms at work in selecting the context, which, Mazzone suggests, can instead be identified in the combination of a bottom-up activation of schemata, especially goal-directed schemata, with a top-down activation of contextual information. This sort of mechanism is supported by what is currently known about the hierarchical structure of the frontal cortex.

Relevance Theory is also the starting assumption for Attardo, in the search for a satisfactory context to explain utterances. He stresses how the exploration of relevance is largely abductive in nature, and remarks that the derivation of context requires additional mechanisms that counteract the expansive tendencies of relevance and abduction. Such bonding mechanisms, argues Attardo, can be construed under the principles of satisfaction and charity.

Paradigmatic in a cognitive perspective on context is the discussion about the so-called Theory of Mind (ToM), the set of skills that allow one to attribute beliefs, goals, and percepts to other people: how essential is this ability in constructing the context necessary to understand utterances? The contributions by Kissine and Cummings provide two contrasting answers. For Kissine there are grades of interpretative strategies for deriving the relevant implicatures of an utterance, and the lower levels, such as egocentric relevance, do not require any ToM. For Cummings utterance interpretation is highly dependent on attributing cognitive and affective mental states to the minds of language users, and she proposes that for the purpose of context derivation the best notion of ToM should encompass the rational, intentional, holistic character of interpretation. Both papers draw on studies with ASD (Autism Spectrum Disorder) subjects to support their arguments. Kissine reports on subjects with ASD who are able to correctly discriminate between "ironical" and "literal" interpretations. Cummings reports clinical cases where ASD subjects exhibit deficits covering the three cornerstones of ToM she identifies: rationality, intentionality, and holism.

Airenti investigates young children's ability to produce and understand different forms of humor. In particular she focuses on teasing, a form of humor already present in preverbal infants that is also considered a typical feature of irony. She proposes that the acquisition of specific communicative contexts enables children to engage in humorous interactions before they possess the capacity to analyze them in the terms afforded by a full-fledged ToM.

In addition to increasing our understanding, the cognitive perspective on context has important practical implications, as in the divergent interpretations of numeric quantities reported by Mandel. Subjects tend to interpret large numerical quantities not as exact values but as lower bounds ("at least") or upper bounds ("at most"), depending on the context.

Several papers fall within the domain of experimental pragmatics.

Filippi et al. explore the role of prosodic cues in word learning. In natural situations learners have to identify words within a sequence of sounds and to relate them to specific referents extracted from the visual scene. Developmental research has suggested that adults' use of exaggerated pitch might direct infants' attention to specific elements in the context and guide learning. In their study the authors show that adults exposed to an artificial language in different experimental conditions also exploit pitch enhancement as a pragmatic cue.

The role of intonation as an indicator of focus in pragmatic interpretation is treated in Cummins and Rohde. In Gricean pragmatics the interpretation of an utterance is based on the relation between what has been said and the potential utterances that would have been relevant to the current discourse purpose, had they been uttered. This set of relevant alternatives is captured by the notion of Question Under Discussion (QUD; Roberts, 1996/2012). The three experiments reported in this study show that hearers use intonation as an indication of which QUD is currently in play in the interpretation of scalar implicatures, presuppositions, and coreference.

Domaneschi et al. maintain that cognitive load plays an important role in the analysis of context. In fact, cognitive effort might have an effect on which presuppositions are activated. In their study they show this effect with presupposition selection in conditional sentences with a trigger in the consequent. The effect of cognitive effort in interpreting communicative utterances involving pragmatic enrichment is also the subject of Janssens and Schaeken's paper. However, their study showed no influence of working memory load on performance in the task of inferring the implicatures from *but*, *so*, and *nevertheless*. They also found that a major role in interpretation is played by the content of the arguments, suggesting that context and content are fundamental in the interpretation process.

In their paper Dupuy et al. discuss how the context affects the interpretation of scalar implicatures. In particular, they focus on the pragmatic interpretation of *some*. They test two factors: the existence in the context of factual information that facilitates the computation of pragmatic interpretations, i.e., the cardinality of the domain of quantification, and the fact that the context makes the difference between the semantic and the pragmatic interpretations relevant. Their results suggest that the main factor that enhances pragmatic interpretation is the relevance of the contrast, which in turn increases the salience of the cardinality.

Two papers use the event-related potential (ERP) technique to analyze the role of context in the comprehension of two important pragmatic phenomena, metaphor and referential ambiguity. Bambini et al. conducted two experiments in which EEG activity was recorded while participants were presented with metaphors in two different context situations, a minimal vs. a supportive context. Their results suggest the presence of two dissociable ERP signatures in the processing of metaphors. In fact, the N400 effect was visible only in minimal context, whereas the P600 was visible both in the absence and in the presence of contextual cues. From these data the authors argue that linguistic context reduces the effort in retrieving lexical aspects of metaphors but does not suppress later pragmatic interpretation efforts needed in order to derive the speaker's intended meaning. Jiang and Zhou investigate how a comprehender resolves referential ambiguity in a conversation by using information concerning the social status of communicators in the context, and how empathic sensitivity to the social status information modulates ambiguity perception and the underlying neural activity. Electrophysiologically, they show the existence of differential neurocognitive processes underlying ambiguity resolution with different contextual cues.

Two papers analyze communication in context as a diagnostic and clinical resource.

Arcara and Bambini propose a test (APACS) to evaluate pragmatic abilities in clinical populations with acquired communicative deficits, ranging from schizophrenia to neurodegenerative diseases. The test consists of six tasks devoted to assessing different pragmatic abilities in the domains of discourse and nonliteral communication. Their assumption is that while globally depending on context, different pragmatic aspects might involve different cognitive skills.

Stahl and Van Lancker Sidtis analyze the contribution of formulaic expressions in clinical rehabilitation from speech and language disorders after stroke. For these patients formulaic expressions frequently remain one of the few resources available for communication. Therapy may support them in including these expressions within language games, i.e., communicative exchanges based on turn-taking. In this way the conversational context allows patients to exploit their residual resources in order to reestablish social interactions.

Edwards deals with an extreme case of communication, reporting her fieldwork with a community of deafblind people in Seattle. Through the analysis of interactional sequences and subjects' metapragmatic commentary, Edwards shows how deafblind people use tactile-kinesthetic channels to overcome the difficulty of converging on objects of reference. She discusses two mechanisms that can account for this process: embedding in the social field and deictic integration. She argues that together they yield a deictic system set to retrieve a restricted range of values from the extra-linguistic context, thereby attenuating the cognitive demands of intention attribution.

In summary, this Research Topic is a sampling of innovative efforts to address challenging issues on context, involving complex questions spanning from brain processes to social interactions and pragmatics. This sampling bears witness to a growing, vibrant community of researchers attempting to integrate the knowledge, the methods, and the theory-building tools from philosophy of language, linguistics, cognitive science, and cognitive neuroscience.

## AUTHOR CONTRIBUTIONS

Both authors contributed to the editorial and approved it.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Airenti and Plebe. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Specialized Knowledge Representation and the Parameterization of Context

Pamela Faber\* and Pilar León-Araúz

Department of Translation and Interpreting, University of Granada, Granada, Spain

Though instrumental in numerous disciplines, context has no universally accepted definition. In specialized knowledge resources it is timely and necessary to parameterize context with a view to more effectively facilitating knowledge representation, understanding, and acquisition, the main aims of terminological knowledge bases. This entails distinguishing different types of context as well as how they interact with each other. This is not a simple objective to achieve despite the fact that specialized discourse does not have as many contextual variables as general language (e.g., figurative meaning, irony, etc.). Even in specialized text, context is an extremely complex concept. In fact, contextual information can be specified in terms of scope or according to the type of information conveyed. It can be a textual excerpt or a whole document; a pragmatic convention or a whole culture; a concrete situation or a prototypical scenario. Although these versions of context are useful for the users of terminological resources, such resources rarely support context modeling. In this paper, we propose a taxonomy of context primarily based on scope (local and global) and further divided into syntactic, semantic, and pragmatic facets. These facets cover the specification of different types of terminological information, such as predicate-argument structure, collocations, semantic relations, term variants, grammatical and lexical cohesion, communicative situations, subject fields, and cultures.

### Edited by:

Marco Cruciani, University of Trento, Italy

### Reviewed by:

Elisabetta Lalumera, Università degli Studi di Milano-Bicocca, Italy; Rita Temmerman, Vrije Universiteit Brussel, Belgium; Pius Ten Hacken, Universität Innsbruck, Austria

> \*Correspondence: Pamela Faber pfaber@ugr.es

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 31 August 2015 Accepted: 01 February 2016 Published: 23 February 2016

### Citation:

Faber P and León-Araúz P (2016) Specialized Knowledge Representation and the Parameterization of Context. Front. Psychol. 7:196. doi: 10.3389/fpsyg.2016.00196

Keywords: context parameters, specialized knowledge, terminology, terminological knowledge bases

## INTRODUCTION

According to Akman and Surav (1997) and Akman (2000), the denotation of context has become murkier as its uses have spread out in many directions to the extent that it has become a sort of 'conceptual garbage can.' For this reason, efforts are currently being made to parameterize and generally make sense of context and all that it implies. However, though instrumental in numerous disciplines, context has no universally accepted definition, because it can point to many different things. In the same way as the definition of any word, the definition of context can also vary depending on the field of application, such as Linguistics, Cognitive Science, or Computer Science (Bazire and Brézillon, 2005).

Specialized knowledge is related to all three of these areas in the sense that (1) it is shared and disseminated through linguistic communicative acts (journal articles, conferences, etc.); (2) it is processed and acquired in the mind; and (3) it may be subjected to formalization. Therefore, the parameterization of context for specialized knowledge representation should be approached from a multidisciplinary perspective.

Specialized knowledge can be represented in a variety of formats (e.g., ontologies, vocabularies, thesauri, controlled languages, databases, etc.) that may or may not support context, because knowledge resources are conceived for very different purposes (e.g., classification, reasoning, knowledge acquisition, standardization, harmonization, information retrieval, machine or human translation, etc.). More specifically, terminological knowledge bases (TKBs) generally describe the concepts and terms of specialized knowledge domains for users with linguistic and/or cognitive needs. TKB users are most often human (e.g., translators, experts, technical writers), but computer applications can also benefit from terminological resources when it comes to automatically interpreting and/or producing specialized texts. Even though TKBs usually provide conceptual representations based on some sort of knowledge modeling mechanism, they rarely support context modeling. In other words, very few provide controlled partial information concerning conceptual entities by presenting them from different viewpoints or in different situations. This can be a problem because the meaning, designation, collocates, and location of a concept within a knowledge configuration or linguistic structure often vary, depending on context.

Contextual information must thus be included in a TKB that aspires to be a knowledge representation resource. In this regard, it is timely and necessary to parameterize context in specialized knowledge domains with a view to more effectively facilitating knowledge representation, understanding, and acquisition. Nevertheless, matters are further complicated by the fact that context itself is a complex, multidimensional concept. Reasons for its conceptual fuzziness include the following: (i) there are various types of contexts; (ii) many types of data can be extracted from context analysis; (iii) contexts can also be used for a wide range of different purposes.

Contextual information can be specified in terms of scope (local vs. global) or according to the type of information conveyed (syntactic, semantic, and pragmatic variables). As reflected in corpus analysis, when context is mentioned in a text, it is metaphorically conceived as a container or a bounded space, since an utterance can be "in context" or "out of context." Context also frames or surrounds the utterance or object of analysis. In this sense, context bears a resemblance to Fauconnier's (1985, 1997) mental spaces since the location of an utterance in this bounded space or container is what makes it meaningful. As a relational construct in texts, context helps to anchor linguistic designations to objective reality by providing background information, situating objects and processes, and explicitly relating them to each other as well as to the agents that manipulate them and act on them. It is thus a constraining factor that drives understanding. In other words, as stated by Leech (1981), the specification of context (whether linguistic or nonlinguistic) has the effect of narrowing down the communicative possibilities of the message as it exists in abstraction from context.
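As a rough illustration of how such a two-way parameterization might be encoded in a terminological resource, the following Python sketch crosses scope with information type and adds a helper for extracting a local co-text window around a term occurrence. It is a minimal sketch under stated assumptions, not the data model of any existing TKB: the class and function names, the placement of the three example parameters in the grid, and the default window of five tokens are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    LOCAL = "local"
    GLOBAL = "global"


class Facet(Enum):
    SYNTACTIC = "syntactic"
    SEMANTIC = "semantic"
    PRAGMATIC = "pragmatic"


@dataclass(frozen=True)
class ContextParameter:
    """One kind of contextual information, located in the scope x facet grid."""
    name: str
    scope: Scope
    facet: Facet


# Hypothetical placements, chosen only to populate the example.
PARAMETERS = [
    ContextParameter("collocations", Scope.LOCAL, Facet.SYNTACTIC),
    ContextParameter("semantic relations", Scope.LOCAL, Facet.SEMANTIC),
    ContextParameter("communicative situations", Scope.GLOBAL, Facet.PRAGMATIC),
]


def by_cell(parameters, scope, facet):
    """Return the names of the parameters that fall in one cell of the grid."""
    return [p.name for p in parameters if p.scope is scope and p.facet is facet]


def local_window(tokens, index, span=5):
    """Return up to `span` tokens on each side of tokens[index].

    The default of five is an assumption of this sketch, not a fixed rule.
    """
    left = tokens[max(0, index - span):index]
    right = tokens[index + 1:index + 1 + span]
    return left, right


if __name__ == "__main__":
    sentence = "the erosion of the coast by waves and wind changes the beach profile"
    tokens = sentence.split()
    print(local_window(tokens, tokens.index("erosion")))
    print(by_cell(PARAMETERS, Scope.GLOBAL, Facet.PRAGMATIC))
```

Keeping scope and facet as two independent enumerations means that each parameter occupies exactly one cell of a two-by-three grid, which mirrors the idea that the same kind of information (for instance, pragmatic) can be specified at either a local or a global scope.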

The remainder of this paper is organized as follows. The Section "What is Context?" reviews the notion of context as found in the literature of different areas. In Section "Context and Terminology," context representation is described with regard to terminology and specialized knowledge. The Section "Context Parameters" proposes a taxonomy of context parameters from a local to a global scope further divided into syntactic, semantic, and pragmatic facets. These facets cover the specification of different types of contextually relevant terminological information, such as predicate-argument structure, collocations, semantic relations, term variants, grammatical and lexical cohesion, communicative situations, subject fields, and cultures. The examples given are drawn from the domain of environmental science based on the experience acquired while building EcoLexicon (ecolexicon.ugr.es), an environmental multilingual TKB. The Section "Conclusion and Future Work" provides the conclusions derived from the parameterization of context for specialized knowledge representation.

## WHAT IS CONTEXT?

Research communities envision context differently since they conceive it in relation to different entities. Thus, context may be the parts of discourse surrounding a word, sentence, or passage, also known as co-text (Textual Linguistics), the set of situational elements where the object being processed is included (Cognitive Psychology), or that which surrounds and gives meaning to something else (Computer Science).

In Linguistics, context has long been regarded as an essential factor in the interpretation of linguistic utterances. It plays an important role in different tasks, such as meaning construction, inference, variation, modulation, sense disambiguation, etc. Quite often co-textual elements are sufficient to resolve ambiguity, but sometimes other context types also come into play.

Apart from the co-text sense, context in Linguistics is also mentioned in relation to pragmatic and cognitive notions, such as speech acts (Austin, 1962; Searle, 1969), conventions (Gadamer, 1995), maxims (Grice, 1975), framing (Goffman, 1974), common ground (Clark, 1996), and mutual manifestness (Sperber and Wilson, 1986, 1990), which refers to what one is capable of inferring or perceiving even if one has not done so as yet. The sum of these shared assumptions constitutes the cognitive environment of a group of individuals, which provides the foundation for successful communication (Yus, 2006).

These notions are related to sociocultural factors accounting for broader contextual variables, such as communicative settings, cultures, or world knowledge. Evidently, context has also been extensively studied in discourse studies, where it has been defined as the totality of conditions under which discourse is being produced, circulated and interpreted (Blommaert, 2005, p. 251). In the same area, Van Dijk (2005, p. 237) gives an even wider view by dividing context into different dimensions, namely, the cognitive, social, political, cultural, and historical environments of discourse.

In Cognitive Science, since the emergence of situated cognition, background situations have also become an essential element in the analysis of context. This has had an impact on cognitive linguistics, where meaning is thought to be based mostly on situational context and constructed on-line (Croft and Cruse, 2004; Evans and Green, 2006). Meaning thus does not exist without context. For example, the theory of situated cognition argues that knowledge is situated, and is partly a product of the activity, context and culture in which it is developed and used (Brown et al., 1989, p. 32). Clancey (1994) adds that the situated aspect of cognition is that the world is not given as objective forms. Rather, what we perceive as properties and events is constructed according to the context.

Elman (2009, p. 572) highlights the importance of context in language comprehension and asserts that the meaning of a word is rooted in our knowledge of both the material and social world. Therefore, the meaning of a word is never 'out of context' even when we are not aware of what this context is. He also highlights the importance of larger knowledge structures: "events play a major role in organizing our experience. Event knowledge is used to derive inference, to access memory, and affect the categories we construct. An event may be defined as a set of participants, activities, and outcomes that are bound together by causal relatedness." Consequently, all lexical units, apart from their micro-context in discourse, need to be understood within the context of a larger event.

According to Yeh and Barsalou (2006, p. 350), knowledge of a larger event or situation restricts the entities and events likely to occur in it. Conversely, knowledge of current entities and events constrains the event or situation likely to be unfolding. Context thus plays a crucial role in knowledge understanding and acquisition since it can trigger one meaning while inhibiting another.

Cognitive processing necessarily includes linking an utterance or object to the right context, something that the human brain does with relative ease. In this sense, according to Flowerdew (2014), speakers and writers are remarkably adept at knowing which features of context to rely on to make their utterances meaningful, and listeners and readers are equally adept at contextualizing what they read or hear in order to understand it. However, what is not so easy is to agree on how to characterize context types and describe how they interact with each other. In fact, context was for a long time omitted in linguistic accounts because it was considered to be too chaotic and idiosyncratic to be systematically characterized (Ervin-Tripp, 1996, p. 35).

Despite the evident challenge, the benefits of formalizing context are well known in computing. Computer Science has been dealing with context as a formal object –although more limited in scope– for some time now since McCarthy (1987, 1993), who stated that there is simply no most general context where all the stated axioms always hold and are meaningful.

From a computational perspective, contexts are useful for putting together a set of related axioms. In this way, contexts are used as a means for referring to a group of related assertions about which something can be said (Guha, 1991). However, the notion of context in computer science has two sides (Brézillon, 2005). Firstly, there is the cognitive science view, where context is used to model interactions and situations in a world of infinite breadth and human dimension, which is the key for extracting a model. Secondly, there is the engineering view, where context is useful in representing and reasoning about a restricted state space within which a problem can be solved. Since context, knowledge and reasoning are closely intertwined (Brézillon, 2005), the main aims of artificial intelligence with regard to the formalization of context seem obvious: (i) performing automatic inferences and reasoning (Guha, 1991; Lenat, 1995); (ii) identifying relational constraints for human–computer interaction and context-aware applications (Dey, 2001); (iii) improving automatic information retrieval, resolving ambiguities in natural language processing (NLP), etc. Also relevant to the parameterization of context is the concept of explanatory coherence (Thagard, 1989), which formalizes and computes coherence as a constraint satisfaction problem (Thagard and Verbeurgt, 1998).

Although some of these applications go beyond the scope of this proposal, there are others that could benefit from the systematization of context features in specialized knowledge resources, especially those related to NLP and domain ontologies. More specifically, with regard to knowledge representation and reasoning, context is needed to derive new knowledge from what is already known.

However, context is more than a set of previously specified discrete variables that have an impact on the knowledge of a language and a person's ability to use it. Context and language are considered to be in a mutually reflexive relationship, such that language shapes context as much as context shapes language (House, 2006).

## CONTEXT AND TERMINOLOGY

As is well known, Terminology is the study of how specialized knowledge concepts are structured, described, and designated in one or various languages within a specialized domain. One of the practical tasks in Terminology is the design and creation of terminological resources so that users, whether human or artificial, can effectively access concepts and associated information in order to understand, acquire, or produce specialized knowledge.

Although the tendency in the General Theory of Terminology (Wüster, 1979) was initially to disregard context and contextual variables as well as the terminological variation that they produce, it soon became apparent that specialized terms are lexical items that are used in communicative contexts (Sager, 1990; Cabré, 1999), and that these contexts can affect their potential meaning. In fact, specialized knowledge units or terms acquire their meaning in context, more specifically, within a frame including a semantic and pragmatic background (Reimerink et al., 2010).

Nevertheless, contextual information is rarely found in specialized knowledge resources. As pointed out by Bowker (2011), most term banks present terms out of context, or in only a single context. A possible reason is the widespread belief that terms in the same field never have more than one meaning and thus have a one-to-one relation with the object or process designated. However, terms and concepts are dynamic and context-sensitive. For instance, concepts may be recategorized so as to constrain their relational behavior, and terms may show several types of variants with different cognitive, semantic, and usage consequences (León-Araúz and Faber, 2014) (see examples in Sections "Local Pragmatic Contexts" and "Global Pragmatic Contexts").

User understanding of an entity or group of entities depends on having access to the necessary information to activate the right frame or knowledge structure in which the word or term should be processed. In turn, the effective production of a specialized utterance also depends on the user having access to the combinatorial potential of the terms involved. When a terminological resource includes multilingual correspondences, contextual information becomes even more crucial because of the lack of isomorphism between languages and cultures.

Generally speaking, even when contextual information is included in the concept or term entries of knowledge resources, it is not inserted in a systematic way since there is no consensus on the exact nature of context. The most common form of context found in terminological resources is a textual excerpt where terms are shown in real use, whether such contexts are in the form of KWIC (Key Word in Context) concordances or longer full-sentence segments. These can be useful to meet both the linguistic and cognitive needs of users since they can provide valuable information regarding the collocational behavior of the terms and/or the relational behavior of the concepts activated. However, this is only a small fraction of what context representation should be.
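To make the notion of a KWIC concordance concrete, the following minimal Python sketch extracts the left and right co-text of each occurrence of a keyword. The function name `kwic`, the `width` parameter, and the two sample sentences are assumptions introduced here for illustration only; they are not taken from EcoLexicon or any other resource.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left co-text, keyword, right co-text) for each whole-word match."""
    rows = []
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Take up to `width` characters on each side of the match.
        left = text[max(0, match.start() - width):match.start()].strip()
        right = text[match.end():match.end() + width].strip()
        rows.append((left, match.group(0), right))
    return rows

# Invented sample text, used only to exercise the function.
sample = (
    "Wave energy is dissipated by bottom friction near the shore. "
    "The storm dissipated as it moved over land."
)
rows = kwic(sample, "dissipated", width=25)
for left, node, right in rows:
    # Right-align the left co-text so that the keyword forms a column.
    print(f"{left:>25} [{node}] {right}")
```

A character-based window is the simplest option; a token-based window would be closer to how concordancers usually delimit co-text.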

Dubuc and Lauriston (1997, pp. 81–83) were among the first to underline the importance of context in Terminology: "Contexts are important to terminology with respect to the relationship of a term with its field of application. The context embodies the discourse bearing the term [. . .]. It is the presence of conceptual features relevant to the term that determines the extent of the context." Despite the fact that their interest in context was restricted to evidence of the term being used in the specialized field and the conceptual content associated with the term, this was still a relatively new assertion for the time. They classified contextual excerpts as associative, explicative, or defining, depending on the quantity and quality of concept descriptors obtained. This seminal study focused on context as a way of enhancing the reader's mental image of a concept.

Pearson (1998) goes somewhat further and explains why context is a great deal more than a text excerpt included in a term entry for purposes of knowledge acquisition. She affirms that the only way of determining what a term is and whether language is behaving 'terminologically' is by examining context. A context thus reflects a certain communicative setting, which is the most important factor that shows whether a given lexical unit is being used as a term or as a general language word. Finally, she also highlights the usefulness of metalanguage patterns retrieved from corpora in the formulation of terminological definitions (ibid: 191–203). This was subsequently complemented by Meyer (2001), who introduced the notion of knowledge-rich context.

Not surprisingly, in the last 15 years, context has become an important focus in Terminology research and its uses have multiplied accordingly. In its co-text sense, it is currently a primary data source for elaborating and constraining the scope of meaning definitions. It has thus become a rich source of complementary conceptual information, linguistic usage, and knowledge representation, inter alia. Nevertheless, as observed in Section "What is Context?," context encompasses much more. Other than static text-based usage examples, context representation in Terminology should also cover background situations, cultures, communicative settings, etc.

The vital role of specifying context and of embedding specialized concepts in situations has been highlighted as a way of enriching conceptual representations in TKBs. According to Meyer et al. (1992), TKBs should reflect conceptual structures similarly to how concepts are related in the human mind. Similarly, Faber (2011) states that the organization of semantic information in the brain should underlie any theoretical assumption concerning the retrieval and acquisition of specialized knowledge concepts as well as the design of specialized knowledge resources.

For example, in an fMRI study of expert-novice differences in the identification of geological field instruments, Faber et al. (2014b) found that in contrast to novices, experts activated the bilateral precuneus, posterior cingulate, and insula, three regions previously implicated in mental imagery, episodic memory, and context representation. In addition, the importance of visual scene generation was reinforced by brain activation in the parahippocampal gyrus, which encodes meaningful contextual associations.

In Frame-Based Terminology (FBT; Faber et al., 2005, 2006, 2007; Faber, 2011, 2012, 2015), specialized knowledge units are only understood with reference to their underlying conceptual frame, whose elements are selected according to context. Context determines the activation of previously stored knowledge and the formation of new categories (Croft and Cruse, 2004, p. 75). In this sense, Barsalou (1983, 1991) found that conceptual categories can be created in an ad hoc goal-derived way, which indicates that context determines the conceptual organization underlying a concrete situation. Since categorization itself is a dynamic context-dependent process, the representation and acquisition of specialized knowledge should certainly focus on contextual variation (León-Araúz et al., 2013).

For this reason, one of the keys to the enhancement of specialized knowledge resources lies in parameterizing contextual information. This entails distinguishing different types of context, their scope and facets as well as how they interact with each other. This is not a simple objective to achieve despite the fact that specialized discourse does not have as many contextual variables as those in general language (e.g., figurative meaning, irony, etc.).

A solid theory of context and context types would be a timely contribution to lexical semantic research which would have repercussions in a wide range of fields. A principled set of context modeling parameters would facilitate knowledge acquisition and understanding. Such resources would ideally allow non-experts to understand a given domain by focusing on and capturing essential knowledge. However, they would also benefit diverse applications in NLP and in the Multilingual Semantic Web (MSW; León-Araúz and Faber, 2014). The MSW is envisioned as an information space where language-independent knowledge would be accessible across different natural languages. This entails the improvement of many NLP techniques related to both comprehension and production, such as word sense disambiguation, cross-lingual mappings, or question answering –always depending on general language resources such as WordNet. Thus, for the web to be truly semantic and multilingual, different NLP tools and techniques need to rely on high-quality multilingual resources –whether general or specialized– that account for the representation of context, a major barrier to successful communication.

## CONTEXT PARAMETERS

Many authors have proposed the characterization of context types, based on a wide range of different criteria. In Cognitive Linguistics, Evans and Green (2006, p. 21) underline the importance of different types of context in the modulation of any given instance of a lexical item as it occurs in a particular usage event. Broad context types mentioned are the following: (1) encyclopedic context (information accessed within a network of knowledge); (2) sentential context (utterance meaning); (3) prosodic context (intonation pattern); (4) situational context (physical location where the text is emitted); and (5) interpersonal context (relationship holding between text sender and receiver). Most other approaches give a more binary vision of context. For instance, Harris (1988) proposes world knowledge vs. language knowledge, whereas Halliday (1989) makes the distinction, context of situation vs. context of culture. This duality can also be found in the distinction between context and co-text.

In reference to specialized knowledge units, the primary division of context is based on scope, since contexts can be either local or global. Context may be a few words on either side of a term (He et al., 2010), the sentence or paragraph in which it appears (Soricut and Marcu, 2003), a set of documents containing it (Cilibrasi and Vitanyi, 2007), a communicative act, or even a whole culture. According to Akman and Bazzanella (2003, p. 325), an adequate multi-modal coding of context on both the global and local levels would be useful in delimiting inferences, disambiguating deictic expressions, and solving the problem of indeterminacy.

Thus, the distinction of local vs. global can be found elsewhere in the literature though not with the same meaning. Bazzanella (1998), Akman and Bazzanella (2003), and Miecznikowski and Bazzanella (2007) refer to local context to denote a specific setting where the participants interact; and use global context for referring to the members of a community, their social norms, culture, beliefs, ideology, etc. In the same way, Mihalcea (2007) uses the same distinction to refer to a different context span within textual excerpts (a pair of words vs. lexical chains), whereas Dash (2008) proposes a continuum of four contexts from local to global: (i) local context (the immediate environment of a word); (ii) sentential context (syntactic-based); (iii) topical context (domain-based); and (iv) global context (extralinguistic reality).

In our view, local contexts are usually limited to the words within the term itself, to a small number of words in the immediate vicinity of a term, or to words connected by syntactic dependencies to the term. According to Agirre and Stevenson (2007, p. 225), the data that can be derived from local contexts are the following: part of speech, morphology, collocations, subcategorization, frequency of senses, syntagmatic and paradigmatic word association, selectional preferences, semantic roles, domain, topical word association, and pragmatics. Evidently, these categories are not watertight containers since there is a great deal of overlap between information types but they are all valuable data categories to be included in a TKB.

In contrast, global contexts can encompass the whole text or go beyond the text: to the communicative situation (i.e., formal vs. informal); to the conceptual networks reflected in it; to the culture in which the text is interpreted, etc. This means that global contexts refer to items that are often quite a distance from the term or even outside of the text altogether though within the specialized domain.

Both local and global contexts can be subdivided, based on whether they are mainly syntactic, semantic, or pragmatic. In our opinion, it is extremely difficult to trace a clear boundary line between syntax, semantics, and pragmatics because there is a significant degree of overlap. In fact, in Cognitive Linguistics, the distinction between semantics and pragmatics is even rejected. For example, at the local level, the use of term variants drainage basin or catchment area instead of river basin or watershed (all expressing the same concept) has pragmatic significance since it signals that the text sender has expert knowledge and is British or Australian instead of American. However, the choice of drainage basin also has a semantic dimension since drainage foregrounds water movement and accumulation which are the processes that occur in this area whereas river basin only foregrounds the location of the basin without any reference to water flow.

At the same time, the term also possesses a syntactic dimension. Drainage modifies basin, the head of the multi-word term. The implicit relation between modifier and head can be expressed by the preposition for (basin for drainage) since the basin is where drainage occurs. However, the structure of the term can be unpackaged to basin where water drains in and then drains out. It is thus the result of meaning compression given the fact that drainage encodes both the incoming and outgoing flow of water.

This interaction reflects the fuzzy boundaries between syntax, semantics, and pragmatics in general and specialized language. As Dash (2008, p. 29) points out, each context is interlinked with the other in an invisible thread of interdependency. This fuzzy three-level approach to context goes hand in hand with the micro-theories proposed by FBT, which are related to the information encoded in term entries, the relations between specialized knowledge units and the concepts that they designate (Faber, 2015, p. 15).

## Local Contexts

Local contexts are generally regarded as spans of ±5 items before and after the term occurrence. They are important in the design stage of a TKB for a wide variety of reasons, which include (but are not limited to): (i) term disambiguation; (ii) meaning definition formulation; (iii) specification of linguistic usage; (iv) conceptual modeling; and (v) term extraction. Thus, local contexts can be used either by resource creators in order to develop terminological by-products (i.e., definitions, conceptual networks, usage examples, etc.); or by the users themselves (i.e., obtaining direct access to the corpus).
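As a rough illustration of how a local context span can be operationalized, the sketch below counts the tokens that fall within a fixed window on either side of a node term. The function `window_collocates`, the `span` parameter, and the toy token sequence are assumptions made for this example; raw counts of this kind are only a first step and are not a statistical association measure.

```python
from collections import Counter

def window_collocates(tokens, node, span=5):
    """Count tokens occurring within `span` positions of each occurrence of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        # Left window (clipped at the start of the text) plus right window.
        window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
        counts.update(window)
    return counts

# Invented token sequence, used only to exercise the function.
tokens = (
    "the breaking waves dissipate energy through friction "
    "and the storm waves dissipate energy too"
).split()
collocates = window_collocates(tokens, "dissipate", span=2)
print(collocates.most_common(3))
```

Widening `span` to 5 would reproduce the window size mentioned above; a narrower span is used here only because the toy text is short.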

In Corpus Linguistics, a recurring local context is known as a collocation. However, collocation is a rather vague term that does not cover the same range of linguistic phenomena for all linguists (Mollin, 2009, p. 176). The definition of a collocation, its length, the neighborhood of possible collocates and their strengths of occurrence (inter alia) are all part of the analysis of specialized language texts and the terms that they contain. Needless to say, the representation of this type of information should be an important element in the design of data fields in terminological entries. However, there are different ways to approach collocational information: they can be seen as a combination of grammatical elements, as the codification of semantic relations, or as pointers to pragmatic information.

Therefore, such local contexts can be syntactically parameterized based on syntagmatic patterns and/or semantically mapped in terms of the interaction, foregrounding, or specification of the definitional features of the concepts activated in them. As shown in Section "Local Pragmatic Contexts," certain types of pragmatic information are also reflected in local contexts. This occurs, for instance, when term variants indicate changes in the knowledge area, specialization level, geographic region, cultural community, and/or historical period.

### Local Syntactic Contexts

Local syntactic contexts are those that reflect the recurrent structural patterns in which the term participates. Terms have a combinatorial value and distinctive syntactic projections. However, a term's position in a subject or direct object slot or as the head of a prepositional phrase is often not very informative since the fact that a term has a certain grammatical function in a sentence is not always relevant to its meaning. Nor is the analysis of a multi-word terminological unit as a mere combination of grammatical categories much more helpful unless this pattern is linked in some way to its underlying semantics. It is more productive to take a semantic view of syntax and to analyze syntactic contexts as the linguistic codifications of predicate-argument structure.

In this regard, each predicate can be said to have an argument structure or valence, specifying the number of arguments that it can take. The concept of valence was first proposed by Tesnière (1959) and now plays a crucial role in the majority of today's linguistic theories. Generally speaking, valence is regarded as the ability of certain lexical units (e.g., verbs) to open slots which are filled by other lexical units. Valence can be envisaged syntactically, semantically or as a combination of the two. Again, this is proof of the fuzzy interaction of context types and parameters.

A predicate's valence depends on its meaning since its arguments are essentially the participants which are minimally required for the activity or state described. Such representations should thus include the decomposition of the predicate and the specification of the semantic characteristics of the arguments (Faber and Mairal Usón, 1999).

Despite the fact that verbs have never been a primary focus in Terminology, approaches to syntax in Terminology can benefit greatly from linguistically sensitive theories of lexical structure that focus on verbs and on how their meaning relates to syntactic forms within a sentence. One reason for this is that verbs play an important role in specialized discourse because their position in a lexical domain and degree of semantic specificity is in direct relation to the number and type of arguments that they can combine with (Faber and Mairal Usón, 1999). In specialized texts, these arguments are terms or specialized knowledge units, whose semantic characteristics constrain the polysemy of the verb and even model its meaning. In this sense, one can say that the meaning of general language verbs can be significantly modified or even transformed by their context of activation. When general language verbs appear in domain-specific texts, they become specialized because their arguments constrain their meaning (L'Homme, 2003). At the same time, the presence of a particular verb also constrains the type of argument slots that specialized terms may fill.

For example, dissipate is a polysemic general language verb, which is often found in scientific discourse. When it is used transitively in the sense of one entity dissipating another entity, it has two arguments. The first argument has the semantic role of agent and the second has the role of theme. In this regard, the argument structure of dissipate is fairly straightforward since X (agent) causes Y (a theme undergoing the action) to be dissipated:

(1) Dissipate (x)agent (y)theme
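One possible way of encoding such an argument structure in a term entry is sketched below. The class names `Argument` and `Predicate` and their fields are hypothetical and are not part of EcoLexicon or FBT; the sketch only shows that valence can be derived from the list of argument slots.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Argument:
    variable: str  # placeholder such as "x" or "y"
    role: str      # semantic role, e.g., "agent" or "theme"

@dataclass(frozen=True)
class Predicate:
    lemma: str
    arguments: tuple  # ordered Argument slots opened by the predicate

    @property
    def valence(self):
        # Valence is the number of argument slots.
        return len(self.arguments)

    def __str__(self):
        slots = " ".join(f"({a.variable}){a.role}" for a in self.arguments)
        return f"{self.lemma} {slots}"

# Transitive use of dissipate, as in (1).
dissipate = Predicate(
    "dissipate",
    (Argument("x", "agent"), Argument("y", "theme")),
)
print(dissipate, "| valence:", dissipate.valence)
```

Semantic restrictions on each slot (for instance, that the theme is a type of energy in a given subdomain) could be added as a further field on `Argument`.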

According to the Merriam–Webster Dictionary, this transitive use of dissipate has one of the following four senses: (i) to break up and drive off (as a crowd); (ii) to cause to spread thin or scatter and gradually vanish; (iii) to lose (as heat or electricity) irrecoverably; (iv) to spend or use up wastefully or foolishly. Contextual data extracted from the enTenTen12 general English corpus in Sketch Engine show that dissipate is often used unaccusatively. In other words, the first argument is not made explicit. The most frequent meanings of dissipate in general language are ii (2) and iv (3):

- (2) To cause to spread thin or scatter and gradually vanish.
	- (i) Temperature (e.g., warmth, heat). [When you exercise on land, sweat evaporates, and cools your skin to **dissipate** heat.]
	- (ii) Meteorological phenomena (e.g., storm, fog, mist). [By afternoon, however, the air traffic from the city had become normal again when the fog **dissipated** almost completely.]
	- (iii) Visual/olfactory perception (e.g., mirage, smell). [I hope most of the smell **dissipates** by the time that I ride my bike this afternoon.]
	- (iv) Emotions/feelings (e.g., fear, anxiety). [With the knowledge, the anger **dissipated** as quickly as it had come.]
- (3) To spend or use up wastefully or foolishly.
	- (i) Valuable possessions (e.g., wealth, resources). [The entrepreneurs have become wealthy while showing how extravagance and luxuries **dissipate** wealth.]

As can be seen in the case of sense (2), the dissipated entity is most frequently related to temperature, weather, sensory perception, and emotions, whereas in sense (3), it is generally wealth or financial resources.

However, in specialized contexts, the meaning of dissipate does not really correspond to any of these possibilities. The reason for this is the semantic content of domain-specific arguments, which interact with the base meaning of dissipate and model it to create a new sense that is apt for scientific contexts. In the EcoLexicon corpus (subdomain Coastal Engineering), the statistically significant collocates of dissipate in the theme slot are the following: (i) energy (e.g., flux, gradient, power); (ii) cyclone or a storm-related term (tide, wave, wind, etc.), which can also be regarded as a type of energy. More specifically, energy appears as the generalization of heat whereas cyclone is a specification of energy. **Table 1** shows examples of specialized contexts for dissipate.

As can be observed, the arguments of dissipate are all NPs that belong to the same semantic categories and combine in similar patterns. In (4) in **Table 1**, the first argument is a process involving some type of friction and the second argument is energy. In (5) in **Table 1**, the storm entity appears unaccusatively without explicitly referring to the reasons for energy loss, which reflects the fact that the target audience is already aware that reasons for storm dissipation include colder sea surface temperatures, shearing winds, sinking air, moving over land, depending on their type, location, and intensity.

In both cases, the interaction of the semantic characteristics of both the dissipated entity (energy) and dissipating process (friction, breaking, falling, uprushing) clearly points to a new (specialized) sense of dissipate, which responds to the Laws of Thermodynamics:

(6) To cause (energy) to be lost through its conversion to heat.

The definition of dissipate in (6) fits the domain of Coastal Engineering. The energy produced by wave movement is dissipated (lost), typically from friction or turbulence when the waves are near the shore and come into contact with the sea bottom. Of course, the energy is not actually lost but rather is transformed into heat, which raises the temperature of the system. The conversion to heat, though explicit in the definition, is not lexicalized in contexts since it is part of the shared knowledge in the domain. This is one example of how verbs become transformed when they are used in specialized texts since the terms that fill the slots in their argument structure contextualize, modify, and/or restrict their meaning. Moreover, since arguments are specialized terms and verbs are relational constructs, the analysis of argument structure can lead to the construction of semantic networks or frames, which again reflects the fuzzy boundaries between syntax and semantics. For instance, all arguments in the second specialized sense of dissipate are cohyponyms (tropical cyclone, hurricane, tornado).

Another way of viewing a local syntactic context is as a colligation, initially defined as the co-occurrence of grammatical categories (Firth, 1968, p. 181) or the grammatical company that a word keeps. In multi-word expressions (MWEs), the relations between words and their grammatical categories cover a wide spectrum. In most cases, the words are linked by both grammatical and lexical relations. In fact, it is difficult if not impossible to determine which relation is stronger in each case.

According to Hoey (2005, p. 43), the basic idea of colligation is that in the same way that a lexical item can be primed to occur with another lexical item, it can also be primed to occur in or with a particular grammatical function. Colligation is concerned with the typical grammatical patterning of words (or word classes). As such, collocation and colligation are not totally separate concepts, but together create a network of meaning. Distinguishing collocations (co-occurrences of words) from colligations (co-occurrences of word forms with grammatical phenomena; Gries and Divjak, 2009) is not always a simple task. There is no clear boundary between various types of word combinations inasmuch as they can be simultaneously a collocation and a colligation.

This highlights the interdependence of syntax and lexis. For example, whereas V + out + NP is a colligation, spew (V) + out + air pollution (NP) is a collocation which exemplifies the colligation. However, the meaning of colligation has since expanded to include the specification of semantic preference or semantic prosody. Accordingly, it can now refer to the co-occurrence of lexis and grammatical categories.
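The difference between the abstract pattern and its lexical instances can be illustrated with a small pattern matcher over part-of-speech-tagged tokens. The sketch is only an approximation under stated assumptions: the tags are assigned by hand, the tag labels ("VERB", "NOUN", "ADJ") are a convenience chosen for this example, and the NP is crudely approximated as a run of adjectives and nouns.

```python
def match_colligation(tagged, particle="out"):
    """Find V + particle + NP sequences in a list of (word, tag) pairs."""
    hits = []
    for i in range(len(tagged) - 2):
        word, tag = tagged[i]
        if tag != "VERB" or tagged[i + 1][0].lower() != particle:
            continue
        # Approximate the NP as a maximal run of ADJ/NOUN tokens.
        j = i + 2
        while j < len(tagged) and tagged[j][1] in {"ADJ", "NOUN"}:
            j += 1
        if j > i + 2:
            noun_phrase = " ".join(w for w, _ in tagged[i + 2:j])
            hits.append((word, particle, noun_phrase))
    return hits

# Hand-tagged, invented sentence used only to exercise the function.
tagged = [
    ("Factories", "NOUN"), ("spew", "VERB"), ("out", "ADP"),
    ("air", "NOUN"), ("pollution", "NOUN"), ("daily", "ADV"),
]
hits = match_colligation(tagged)
print(hits)
```

Each hit is a collocation (a specific lexical combination), whereas the pattern that the function searches for corresponds to the colligation.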

Semantic preference is the "relation between a lemma or word-form and a set of semantically related words" (Stubbs, 2002, p. 65). Semantic prosody (Louw, 1993) captures the fact that some elements attract lexical items designating negative things, features, actions, etc., whereas others show a characteristic co-occurrence with positive elements. Together these notions expand into the notion of semantic colligation: the mutual attraction holding between a grammatical construction and a semantic category (Gabrielatos, 2007).

For example, when something is projected (spewed, pumped, etc.) from a container, the ejected entity (air pollution, CO2) is frequently undesirable. Moreover, this semantic preference is confirmed by corpus data showing that when pollution is the theme argument, it tends to consistently combine with verbs belonging to two semantic domains:


When pollution or one of its components is the agent argument, it also tends to combine with verbs of change, but primarily with those predicates belonging to the subdomain to cause something to become worse (contaminate/foul/degrade, etc.).

(7) Still more fluorides from such pollution **contaminate** the animals and plants we use as food.

### TABLE 1 | Contexts of dissipate in specialized texts.




Optionally, it is also activated with verbs of causative existence in the subdomain of to cause something bad to happen. More specifically, threaten means to cause something or someone to be vulnerable or at risk.

(10) The oceans are now so **threatened** by pollution and exploitation that many shorelines will soon be totally denuded of marine life.

The predisposition to appear in certain syntactic structures and combine with predicates from specific semantic subdomains is directly related to the semantic load of pollution, a dot object according to the Generative Lexicon (Pustejovsky, 1995), which can be regarded either as a process or the result of a process.

Both colligations and predicate argument structures reflect the fuzzy boundary between syntax and semantics. The fact that words that occur together tend to be semantically similar explains why local syntactic contexts can also be used for semantic clustering. Popularized by Firth (1957) in his famous line "a word is characterized by the company it keeps," this approach has been implemented in the distributional hypothesis (Harris, 1985) and the strong contextual hypothesis (Miller and Charles, 1991). Furthermore, within the scope of a sentence, word sense disambiguation is usually determined by a combination of two factors: (1) the syntactic frame into which the word is embedded, and (2) the semantics of the words with which it forms syntactic dependencies (Rumshisky, 2008, p. 217).

### Local Semantic Contexts

Local semantic contexts can either refer to semantic relations between the constituents of the specialized knowledge unit (term-internal semantic context) or to semantic relations between different specialized knowledge units in the text (term-external semantic context). In the first case, the scope of the context is the multi-word term itself, whose interpretation is based on the meanings as well as the dependency relations between the head and the modifiers from which a semantic relation can be inferred. In the second case, the context is the linguistic codification of a triplet: two specialized knowledge units linked by a phrase that explicitly marks a semantic relation.

Term-internal semantic contexts are exemplified by MWEs, which constitute a large portion of the lexicon of any natural language. It is estimated that the number of MWEs in the lexicon of a native speaker is the same as the number of single words (Jackendoff, 1997), and this proportion is probably even higher in the case of domain-specific language, in which the specialized vocabulary and terminology are composed mostly of multiword expressions. According to Erman and Warren (2000, p. 29), the fact that half of spoken and written language comes in pre-constructed multiword combinations makes it impossible to consider them as marginal phenomena. In fact, these specialized MWEs are rapidly increasing because of the continuous addition of new terms that designate new concepts. This makes it virtually impossible to store all of them in a dictionary.

Since the meaning of a multi-word term is often a specialization of the meaning of its head, in many cases, term structure can be used as a way to automatically extract information for the specification of conceptual hierarchies, one of the main components of TKBs. In morphologically poor languages, such as English, multi-word terms can take the form of sequences or stacks of nouns of varying length: (i) two constituents (capillary wave); (ii) three constituents (long-wavelength surface wave); (iii) four constituents (ocean surface gravity wave); and (iv) five constituents (surface gravity wave elevation spectra). It is for the addressee to unpack their meaning and determine the implicit relationship between the constituents. One way of doing this is to understand such compounds in terms of left or right branching dependency relations:


Evidently, the more numerous the terms in the stack, the more difficult it is for a computer to automatically establish dependencies. The most complicated cases are (13) and (14). For example, in (13), it is necessary to know that the term must be interpreted as a right-branching compound, since the term does not refer to an ocean surface. Instead, surface gravity wave designates an important type of wave (which is a synonym of surface wave), which propagates in the ocean. In contrast, (14) is a left-branching compound, which is gradually generated from surface gravity wave by adding the subsequent heads of elevation and then spectra. The analysis of these dependencies is often based on term entrenchment and extremely difficult for a computer to perform automatically unless it has been previously trained to do so.

Another way of understanding term creation is to think of the general concept as an entity with an underspecified meaning. Its nature is what predetermines the potential specification of its meaning that will lead to the generation of hyponyms.

For example, wave designates an oscillation propagating through a medium. As such, it has a series of basic defining attributes, such as height, wavelength, steepness, period, speed, and frequency. This can be regarded as a kind of frame or type of implicit context that opens up slots. The specification of one or more of these attributes that modify the head (wave) creates different wave types (long-period wave, high-frequency wave, etc.).

However, these defining attributes are not the only source of new terms. When a wave is regarded as a process, this creates a more abstract context or frame that opens the door to other possible subtypes. A wave can thus be regarded as caused by an agent (wind wave), affected by a force (gravity wave), moving in a certain way (plunging breaking waves), taking place at a certain location (surface wave), and occurring in a certain medium (water wave). However, in some cases, both scenarios or contexts can combine to produce terms such as long-wavelength surface gravity wave. This term has the following meaning relations:

(15) Long-wavelength surface gravity wave

	- **Head**: wave
	- **Located\_at**: surface
	- **Affected\_by**: gravity
	- **Length\_of**: long-wavelength

A long-wavelength surface gravity wave can thus be regarded as an oscillation (wave) with a long wavelength (long-wavelength) on the air–sea interface (surface) affected by a restoring force (gravity).
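The meaning relations in (15) can be pictured as slots that are progressively filled on the frame of the head. The following Python sketch is only an illustration under that assumption; the `TermFrame` class and its methods are hypothetical names introduced here and do not describe the data model of any actual TKB.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TermFrame:
    """A head noun plus the slots filled by its modifiers (illustrative only)."""
    head: str
    slots: Dict[str, str] = field(default_factory=dict)

    def fill(self, relation: str, value: str) -> "TermFrame":
        # Specifying one more slot yields a new, more specific frame (a hyponym).
        return TermFrame(self.head, {**self.slots, relation: value})

    def term(self) -> str:
        # Each newly added modifier is stacked to the left of the existing term.
        return " ".join(list(reversed(list(self.slots.values()))) + [self.head])


wave = TermFrame("wave")
term15 = (
    wave.fill("Affected_by", "gravity")       # gravity wave
        .fill("Located_at", "surface")        # surface gravity wave
        .fill("Length_of", "long-wavelength") # long-wavelength surface gravity wave
)
print(term15.term())
```

Because `fill` returns a new frame rather than modifying the old one, each intermediate term (gravity wave, surface gravity wave) remains available as a separate node, which mirrors the way in which hyponyms specialize the meaning of the head.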

Evidently, accessing the meaning of such a combination is not a trivial process since it activates a whole specialized frame that requires previous knowledge. In this sense, Maguire et al. (2010, pp. 49–50) cite the concept specialization model (Murphy, 1988) and dual process theory (Wisniewski, 1997), which propose a two-stage interpretation process. The first stage involves a slot-filling mechanism where the modifier is inserted into a slot in the head-noun schema to form an interpretation. The type of noun modifier is directly related to the basic meaning of the head.

For instance, in N + N compounds in which energy is the headword, the slot activated is usually agent (e.g., wave energy, wind energy, heat energy, etc.), which highlights the source or natural force producing the energy. In contrast, in sediment compounds, in most cases, the headword opens a <location> slot (e.g., intertidal zone sediment, streambed sediment, aquifer sediment, etc.) since sediment is solid fragmented material that is transported and deposited by water or wind at a certain location. Alternatively, there is also a set of sediment terms in which a <made-of> slot is opened up (lithogenous sediment, biogenous sediment, hydrogenous sediment, cosmogenous sediment). These A + N compounds foreground the "fragmented material" part of the definition.

Consequently, this indicates that the membership of the head noun in a small number of broad semantic categories reveals consistent patterns in modifier and head use, and that the semantic categories are not randomly paired.

Term-external semantic contexts take the form of KWIC concordances or knowledge-rich contexts that provide information about a concept's attributes or the relations that it forms with other concepts. They contain Knowledge Patterns (KPs; Barrière, 2004), which are lexico-syntactic patterns that indicate a semantic relationship, and at least two specialized knowledge units. Studies in this tradition include Pearson (1998), Meyer (2001), Barrière (2004), Aussenac-Gilles and Jacques (2006), and Sierra et al. (2008), inter alia.

Mitkov (1998, 2002) and Meyer (2001) distinguish between knowledge-rich contexts and knowledge-poor contexts. Knowledge-poor contexts do not include any item of domain knowledge related to the search word. In contrast, knowledge-rich contexts contain at least one item of domain knowledge that is useful for the conceptual analysis of the search word. Such contexts should indicate at least one conceptual characteristic, whether it is an attribute or relation (Meyer, 2001, p. 281).

For example, the concordances of erosion in **Figure 1** show how different KPs convey different relations with other specialized concepts. The main relations reflected in erosion concordances are caused\_by, affects, has\_location, and has\_result, which highlight the procedural nature of the concept and the important role played by non-hierarchical relations in knowledge representations.

In **Figure 1**, erosion is related to various types of agent, such as storm surge (1, 7), wave action (2, 13), rain (3), wind (4), jetty (5), construction projects (6), mangrove removal (8), surface runoff (9), flood (10), human-induced factors (11), storm (12) and meandering channels (14). They can be retrieved thanks to all KPs expressing the relation caused\_by, such as resultant (1), agent for (2, 3), due to (6, 7), responsible for (11), and lead to (13). This relation can also be conveyed through compound adjective phrases, such as flood-induced (10) or storm-caused (12) and any expression containing cause as a verb or noun: one of the causes of (9), cause (4, 5, 8), and caused by (14).

Erosion is also linked to the patients it affects, such as water quality (15), sediments (16), coastlines (16), beaches (17), buildings (18), deltas (19), and cliffs (20). However, the affected entities, or patients, are often equivalent to locations (e.g., if erosion affects beaches it actually takes place at the beach). The difference lies in the KP linking the propositions. The affects relation is often reflected by the preposition of (10) or by verbs such as threatens (18), damaged by (17) or provides (19). In contrast, the has\_location relation is conveyed through directional prepositions (around, 21; along, 22; downdrift, 23) or spatial expressions, such as takes place (24). In this way, erosion is linked to the following locations: littoral barriers (21), coasts (22), and structures (23). Result is an essential dimension in the description of any process since it is not only initiated by an agent affecting a patient in a particular location, but also has certain effects, namely, the creation of a new entity (sediments, 25; primary coasts, 26; beach material, 27; shorelines, 28; marshes, 29; bays, 30) or the beginning of another process (seawater intrusion, 31; profile steepening, 32).

As can be seen, all these related concepts are quite heterogeneous. They belong to different paradigms in terms of category membership and/or hierarchical range. For instance, some of the agents of erosion are natural (wind, wave action) or artificial (jetty, mangrove removal) and others are general concepts (storm) or very specific ones (meandering channel). This explains why knowledge extraction must still be performed manually or semi-automatically and how local semantic contexts can be conceptually valuable. Nevertheless, it also illustrates one of the major problems in knowledge representation: multidimensionality. Multidimensionality has been defined by many authors (Bowker, 1997; Kageura, 1997; Wright, 1997; Rogers, 2004) as the phenomenon in which certain concepts can be classified according to different points of view or conceptual facets. Evidently, multidimensionality has important consequences regarding how domains are categorized and modeled (León-Araúz et al., 2013). This is better exemplified in the concordances shown in **Figure 2** since multidimensionality is most often codified in the is\_a relation.

FIGURE 2 | Hierarchical relations associated with EROSION.

In the scientific discourse community, concepts are not always described in the same way because they depend on perspective and subject-fields. For instance, erosion is described as a natural process of removal (33), a geomorphological process (34), a coastal process (35), or a stormwater impact (36). The first two cases can be regarded as conventional ontological hyperonyms. The choice of one or the other depends on the upper-level structure of the representational system, its level of abstraction and the support for context. However, the other two cases (i.e., coastal process and stormwater impact) cause the concept to be framed in more concrete subject-fields and referential settings.

The multidimensional nature of erosion is also clearly shown in subtypes, which are codified in term-internal semantic contexts. Erosion can thus be classified according to the dimensions of result (sheet, rill, gully, 37; differential erosion, 38), direction (lateral, 39; headward erosion, 40), agent (wave, 41; fluvial, 42; wind, 43, 46; water, 44; glacial erosion, 45), and patient (sediment, 47; dune, 48; shoreline erosion, 49).

These dimensions are contexts that need to be specified in the TKB in order to delimit information retrieval and make it more relevant. They can be represented as part of a definitional template (all cohyponyms being defined according to the same dimensions). Alternatively, they can be codified as a specification of the subsumption relation (fluvial erosion is\_a (agent) erosion), or simply as a concordance or knowledge-rich context.

In order to retrieve new related term pairs, KPs can be collected and systematized in the form of local grammars. For instance, **Figure 3** shows part of the formalization of the causal relation, which is based on causative verbs in any of their inflected forms (cause, produce, generate, trigger), morphological particles (-driven, -induced) and other literal causative expressions (responsible for), as exemplified in the concordances of erosion (**Figure 1**). When this grammar is applied to the corpus, it identifies structures such as "tsunamis, usually caused by large earthquakes" or "rain produced severe flooding," from which we can derive the conceptual propositions, or triplets, EARTHQUAKE causes TSUNAMI and RAIN causes FLOODING.
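As a rough illustration of how a causal KP can be turned into an extraction rule, the sketch below uses two regular expressions in Python. It is not the local grammar of Figure 3: the patterns, the helper `concept`, and the function `extract_causal_triples` are hypothetical simplifications, and both the noun-phrase shape (at most one premodifier) and the singularization rule are deliberately naive. The point it illustrates is that passive patterns (caused by) reverse the order of cause and effect in the sentence.

```python
import re

VERBS = r"(?:causes?|caused|produces?|produced|generates?|generated|triggers?|triggered)"

# Passive: "<effect>, usually caused by <cause>" (the cause follows "by").
PASSIVE = re.compile(
    r"(\w+),? (?:usually |often )?(?:caused|produced|generated|triggered) by (\w+(?: \w+)?)"
)
# Active: "<cause> produced <effect>" (the negative lookahead excludes passives).
ACTIVE = re.compile(rf"(\w+) {VERBS} (?!by\b)(\w+(?: \w+)?)")


def concept(noun_phrase: str) -> str:
    """Reduce a noun phrase to an upper-case concept label (naive assumption)."""
    head = noun_phrase.split()[-1]  # assume the head is the last word
    if head.endswith("s") and not head.endswith("ss"):
        head = head[:-1]            # crude singularization, for illustration only
    return head.upper()


def extract_causal_triples(text: str):
    """Return (CAUSE, 'causes', EFFECT) triples found by the two patterns."""
    triples = []
    for match in PASSIVE.finditer(text):
        effect, cause = match.groups()
        triples.append((concept(cause), "causes", concept(effect)))
    for match in ACTIVE.finditer(text):
        cause, effect = match.groups()
        triples.append((concept(cause), "causes", concept(effect)))
    return triples


print(extract_causal_triples("tsunamis, usually caused by large earthquakes"))
print(extract_causal_triples("rain produced severe flooding"))
```

A realistic grammar would also need the morphological particles (-driven, -induced) and expressions such as responsible for, as well as proper noun-phrase chunking instead of word counting.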

Once again, syntactic and semantic local contexts are not discrete variables. However, this approach must also be contrasted with global semantic and pragmatic contexts, since conceptual knowledge as reflected in the text is not always reliable. This means that texts do not reflect perfectly designed conceptual networks. For instance, in hyponymic term-external semantic contexts (x such as y, y and other x, x is a type of y, etc.), authors do not always choose the direct parent of a concept. Many times, they will use a grandparent (WORK > GROIN instead of COASTAL STRUCTURE > GROIN) or will even create an ad hoc category (OBSTACLE TO FLOW > GROIN instead of COASTAL STRUCTURE > GROIN; León-Araúz and Reimerink, 2016).

Furthermore, the existence of multiple hyperonyms can indicate two types of multidimensionality: intracategorial and inter-categorial multidimensionality. In intracategorial multidimensionality, hyperonyms point to the same concept but highlight different dimensions or different levels of granularity. However, in intercategorial multidimensionality, hyperonyms point to a paradigm change, which makes the different facets incompatible.

One example is FOREST, which is found in local contexts as a type of ecosystem or as a type of renewable resource. This means that the concept is viewed as a type of one hyperonym to the exclusion of the others. This evidently affects the way in which the concept relates to other concepts. For instance, when a forest is viewed as a renewable resource, it is more closely related to concepts such as solar energy and biofuel, whereas when viewed as an ecosystem, wetlands and lakes are its closest concepts. This necessarily has an impact on knowledge and context modeling. Contrasting these results with a global approach (see section "Global Contexts") and analyzing lexical cohesion in the whole text where these structures occur can result in a reliable reconstruction of a text-driven conceptual system.

### Local Pragmatic Contexts


Local pragmatic contexts basically refer to parameters of terminological variation and culturemes. Although in Terminology, the initial goal was to have one linguistic designation for each concept for greater precision, it soon became obvious that in descriptive terminology work, this is not always the case. Such one-to-one correspondence occurs more frequently in standardization settings (e.g., institutional, legal, technical, etc.) where the objective is to harmonize terminologies for the sake of efficient unambiguous communication. However, in the same way as for general concepts, the same specialized concept can often have many linguistic designations depending on the context. Alternatively, the same linguistic designation can also refer to various concepts.

As in general language, it is possible to establish reasons for terminological variation based on user-based parameters of geographic, temporal or social variation or usage-based parameters of field, tenor, and channel (Gregory and Carroll, 1978). Nevertheless, these basic parameters only provide a very partial representation of a very complex situation, since there are other reasons for terminological variation that are often considerably more difficult to represent.

Freixa (2006, p. 52), for example, classifies the causes of terminological variation into the following categories: (1) dialectal, based on the origin of the authors; (2) functional, based on communicative registers; (3) discursive, based on the style of the authors; (4) interlinguistic, based on the contact between languages; and (5) cognitive, based on different conceptualizations. These are all pointers to different types of extra-textual contexts, which mainly stem from the author's identity, location, language, and way of thinking. According to Freixa (2002), cognitive term variants are not only formally different, but also semantically diverse, as they give a particular vision of the concept. They are thus the natural reflection of multidimensionality (Fernández-Silva et al., 2011). Very often, the choice of one term instead of another stems from different perspectives of reality. Nevertheless, there are certain types of variation that do not fall into any of these categories, such as morphological variants, orthographic variants, ellipted variants, abbreviations, graphical variation, variation by permutation, etc. (Bowker and Hawkins, 2006, p. 81). Their use in texts often seems to be random without responding to any pattern or regularity. Although initially, the existence of such variants may not seem to be a problem, reality is somewhat different. Since term variants are rarely interchangeable, it is not a question of merely adding more terms to the TKB. What is needed is more information in term entries so that users can know which term to select. In terminological resources, users are often confronted with a vast array of variants with no indication of how term variation arises or how their use may be constrained.

In fact, variants often have a communicative and/or cognitive motivation. Therefore, the use of one term or another may affect the semantics of a concept or the communicative situation in which the concept is activated. Based on this distinction, our experience in EcoLexicon and other foundational work on term variation, we propose that term entries should include the following extended classification of pragmatic markers. When building a multilingual TKB, these markers can also enhance interlingual correspondences, because users will be able to make a cognitively sound choice. Otherwise, translators may actually over-standardize, creating consistency in places where the use of variants was deliberate and well-reasoned (Bowker and Hawkins, 2006, p. 80).

	- (i) Abbreviation
	- (ii) Acronym (e.g., laser, Light Amplification by Stimulated Emission of Radiation).


The nature and scope of these variants are very diverse. Furthermore, terms can activate more than one type of variant, which might make term choice more difficult. For example, H2O and/or water are domain-based variants since the first is more frequently used in chemistry and water treatment domains than in hydrology or geology. However, their use also depends on the communicative situation and the knowledge level of the speaker and receiver. In the same way, lap-appy could be classified as jargon as well as a short form. On the other hand, the same type of variant can also be expressed by more than one term. Diaphasic variants, in particular, form a continuum ranging from more formal to informal (e.g., thermal low pressure system, thermal low, thermal trough, and heat low). The same happens within expert variants, which can be graded on a scale of frequency or acceptance. For instance, coastal defense and coastal management are both expert variants, but coastal management is the preferred term.

Moreover, in those cases where the same concept can be designated by different dialectal variants stemming from the geographic origin of the writer, this can also mean that, conversely, the same term can be used to designate different entities in different cultures. For instance, pier (a structure built on posts extending from land out over water, used as a landing place for ships, an entertainment area, or a promenade area) is often designated as jetty in the Great Lakes. In contrast, a jetty is most often a structure designed to prevent the shoaling of a channel and is not considered a recreational area. However, in British English, jetty is the synonym of a wharf, whereas, in American English, pier may also be a synonym of dock. Nonetheless, in British English a dock is the area of water used for loading or unloading cargo in a harbor, which in American English is called a port.

Geographical variation in this domain can often be conceptually motivated and mainly based on the dimensions of location and function. For instance, a dike may be called a levee when it is located on a river, whereas a breakwater may be called a mole when it is covered by a roadway. In contrast, when a breakwater serves as a pier, it is called a quay in British English and a wharf in American English. Needless to say, when the knowledge base includes a conceptual representation or ontology, important design decisions must be taken. A base concept must be chosen (e.g., PIER) and be specified to accommodate the references of these variants (e.g., WORKING\_PIER, PLEASURE\_PIER, FISHING\_PIER, etc.).

Local pragmatic contexts are thus reflected in terms and multiword expressions that are pointers to larger (global) situational, linguistic, and cultural contexts. Therefore, local and global pragmatic contexts constrain each other. Local contexts point to global contexts by constraining all possible situations (i.e., geographic, communicative, cognitive), and global contexts drive the choice of one variant over the rest.

Consequently, term variation should not be regarded as a linguistic phenomenon isolated from conceptual and cultural representations since it is one of the manifestations of the dynamicity of categorization and expression of specialized knowledge (Fernández-Silva et al., 2014).

## Global Contexts

Contexts can also be global with a wider scope. The scope of such contexts can be a whole document, a communicative situation (e.g., formal vs. informal), a subject domain (e.g., Geology, Meteorology, etc.), or an entire language-culture.

Global contexts affect the underlying design of the data fields of a TKB since they are too large to be included in a term entry unless it is in the form of syntactic, semantic, or pragmatic markers, which would be more suitable for local contexts. They can also be analyzed with a view to tagging and classifying corpora in macro- and micro-structural terms.

The macro- and micro-structure of a text, or even a set of texts where intertextuality plays a role in understanding a specialized domain, provide a larger context to be analyzed with regard to grammatical and lexical cohesion (syntactic and semantic context). When global contexts are extra-textual, they are global pragmatic contexts characterized by different combinations of authorship, readership, function, domain, culture, etc.

### Global Syntactic Contexts

When the document is used as the context, global syntactic contexts consist of the means of grammatical cohesion that tie the text together. These include endophoric reference (anaphora, cataphora), substitution, and ellipsis, as well as other grammatical cohesive markers that connect the different sentences of a text in a logical manner, such as however, on the other hand, consequently, etc. Such contexts could presumably also refer to the use of verb tenses throughout a discourse. For example, the typical Introduction, Methodology, Results, and Discussion (IMRAD) format of research articles is reflected in the verb tenses used in each section.

These verb tenses set the scene for the description of a research study and the presentation of results. In English, for example, the introduction is generally in the present tense when the author is describing the cause or (problematic) situation that produced the need for the research and the review of the literature on the topic. However, the past tense is used when referring to how the experiment was carried out. The present tense is again used when the author outlines the sections of the article. This use of verb tenses can differ when the article is written in a different language. For instance, in Spanish, there is a greater tendency to use the present and future tenses as though the research being described were being carried out in the paper itself.

Although it might seem that the use of tenses and syntax in general has little impact on terminology structure and selection, this is not the case. As observed by Gotti (2003), the fact that specialized discourse is characterized by elementary surface structures and relatively simple syntax allows the author a certain license to use complex and long pre-modified MWEs, which leads to a far longer sentence length. This means that the use of more or less complex nominal compounds is in direct relation to the relative simplicity of the syntactic structures in the text.

Grammatical cohesion in scientific discourse is often domain-independent, but still specialized. The same happens with the transdisciplinary scientific lexicon (Drouin, 2010), which includes abstract verbs (to think, to consider), abstract nouns (idea, factor, relation, hypothesis, data, approach), and collocations (to conduct an analysis) that refer to the description of scientific activities and reasoning but do not point to domain concepts. Thus, the study of global syntactic contexts can also have important computational applications, such as term extraction and coreference resolution.

### Global Semantic Contexts

Global semantic contexts are in turn reflected in the lexical cohesion of texts (Halliday and Hasan, 1976; Morris and Hirst, 1991). Lexical cohesion is based on the meaning relations between words in a text. Such relations are paradigmatic and link two words having a common component from the viewpoint of their meaning.

Apart from repetition, lexical cohesion is most frequently achieved by using synonyms and hyperonyms, which requires a certain previous knowledge of the domain. In this sense, the description of local semantic and pragmatic contexts in TKBs ensures lexical cohesion when TKBs are used for text production tasks. In fact, it has been shown that scientific journal articles and popularized accounts of the same research do not employ the same cohesive patterns (Myers, 1991). According to Myers (1991, p. 5), the readers of scientific texts must have previous knowledge of lexical relations to see the implicit cohesion of the text, while readers of popularizations must see the explicitly marked cohesive relations to infer lexical relations, and to link the semantic field of the specialized domain to those of everyday life.

Thus, the analysis of lexical cohesive devices, which is hardly a trivial task, has also been approached from a computational perspective. In distributional approaches, synonyms, hyperonyms, antonyms, etc., are typically calculated by means of context vectors for each word, grouping together words that appear in the same contexts. In Ellman and Tait's (1998) framework, a single instance of lexical cohesion is a lexical link, whereas a sequence of links is a lexical chain. Such chains can also be formed by relations or bonds between sentences that are related by two or three links (Hoey, 1991).
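The idea of grouping words that appear in the same contexts can be made concrete with a very small Python sketch. It assumes whitespace tokenization, raw co-occurrence counts within a fixed window, and cosine similarity as the comparison measure; the toy sentences and the function names `context_vectors` and `cosine` are invented for illustration and do not reproduce any of the cited systems.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + window + 1]
            vectors.setdefault(word, Counter()).update(left + right)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Toy corpus (invented): 'wave' and 'surge' share contexts, 'sediment' does not.
corpus = ["wave erodes beach", "surge erodes beach", "sediment forms delta"]
vecs = context_vectors(corpus)
print(cosine(vecs["wave"], vecs["surge"]))     # close to 1: identical contexts
print(cosine(vecs["wave"], vecs["sediment"]))  # 0: no shared contexts
```

Note that similarity of this kind does not by itself distinguish synonyms from hyperonyms or antonyms, which is one reason why the analysis of lexical cohesive devices is not trivial.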

Lexical chains are identified by using relationships between word senses. Nevertheless, in order to build lexical chains, it is necessary to know word senses and semantic relations between words. A lexical chain for a text contains a subset of the words (word senses) in the text that are semantically related. Although the length of such chains may cover a larger or smaller portion of the text, in this case, we are referring to those that cover the whole document. Evidently, the number of words and the number of semantic relations between words can be different for each lexical chain. According to Ercan and Cicekli (2007), the coverage and size of a lexical chain can indicate how well the lexical chain represents the semantic content of the text. Lexical chains are evidently meaning-based but can also be derived from collocational frequencies. For example, Phillips (1989, p. 51) states that the collocation between electric and charge is also linked to the patterns in the text between their collocations (e.g., charge collocates with distribution, density, point, and uniform; electric collocates with dipole). Bondi (2010, p. 4) affirms that this network of semantic relations identifies the 'aboutness' of a text, and is a marker of text content.
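A minimal sketch of chain building, written in Python, may help to fix the notions of link, chain, and coverage. It assumes that word senses have already been resolved and that semantic relations are available as a hand-written set of pairs (`RELATED`, invented here); a real system would draw these relations from a lexical resource. The greedy strategy and the names `build_chains` and `coverage` are illustrative assumptions, not the algorithm of any cited author.

```python
# Hypothetical, hand-written relation set standing in for a lexical resource.
RELATED = {
    frozenset(pair)
    for pair in [("erosion", "sediment"), ("sediment", "deposition"), ("wave", "surge")]
}


def related(a, b):
    """Two words are linked if they are identical (repetition) or listed as related."""
    return a == b or frozenset((a, b)) in RELATED


def build_chains(tokens):
    """Greedily attach each token to the first chain containing a related word."""
    chains = []
    for token in tokens:
        for chain in chains:
            if any(related(token, member) for member in chain):
                chain.append(token)
                break
        else:
            chains.append([token])  # no link found: start a new chain
    return chains


def coverage(chain, tokens):
    """Share of the text's tokens that belong to the chain."""
    return len(chain) / len(tokens)


text = ["erosion", "wave", "sediment", "surge", "deposition", "erosion"]
chains = build_chains(text)
longest = max(chains, key=len)
print(chains)
print(coverage(longest, text))
```

In this toy text, the chain built around erosion covers four of the six tokens, so under the coverage criterion it would be taken as the better indicator of what the text is about.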

### Global Pragmatic Contexts

Global pragmatic contexts are the most complex form of context to systematize and thus to represent in a TKB, since they involve different interrelated variables.

Pragmatics is at the core of the dynamics of both terms and concepts (León-Araúz and Faber, 2014), since changes in conceptualization and in the lexicon are clearly not independent of each other but interact in a number of unforeseeable ways (Cimiano et al., 2010).

Generally speaking, pragmatics focuses on the effect of context on communicative behavior as well as on how inferences are made by the receiver (Faber, 2012). Crucial pragmatic dimensions in specialized communication contexts include (1) the beliefs and expectations of the text sender; (2) the knowledge shared by the text sender and text receivers; (3) the communicative objectives of the oral or written text stemming from the interaction of the participants; and (4) the factors that cause receivers to interpret the text in a certain way (Faber and San Martín, 2012, p. 178).

Strictly speaking, by its very definition, any type of pragmatic context is global. As previously mentioned, even local pragmatic contexts, as reflected in term variants or culturemes, are markers that point to larger communicative and cultural situations, which have an impact on conceptualizations in a given language-culture. Precisely for that reason, the description of entities is necessarily constrained by contextual variation across communicative situations, cultures, and disciplines, as well as the fuzzy category boundaries that they establish.

For example, texts with a high term density (percentage of specialized knowledge units) are written by experts who wish to transmit knowledge to other experts in the same domain. Texts written for semi-experts or for non-experts have a correspondingly lower term density, although more term variants tend to be employed for the sake of transparency.
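Term density, understood here as the percentage of tokens that are specialized knowledge units, can be computed as in the following sketch. The term list and the two sample sentences are invented for illustration, and the function only handles single-word terms; multiword terms would require a matching step that is not shown.

```python
def term_density(tokens, terms):
    """Percentage of tokens that appear in a list of specialized terms."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t.lower() in terms)
    return 100.0 * hits / len(tokens)

# Hypothetical term list and sample texts.
TERMS = {"estuarine", "palustrine", "lacustrine", "wetlands"}
expert = "Estuarine and palustrine wetlands differ from lacustrine wetlands".split()
lay = "Some wetlands form near the mouth of a river".split()

expert_density = term_density(expert, TERMS)  # 5 of 8 tokens
lay_density = term_density(lay, TERMS)        # 1 of 9 tokens
```

With these inputs, the expert-oriented sentence has a density of 62.5%, whereas the sentence written for a lay audience has a density of about 11%.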

In specialized communication, genre and register are important concepts even though their definitions often seem to confusingly run together. However, following Lee (2001, pp. 46–47), we use register to refer to lexical-grammatical and semantic discourse patterns associated with situations, whereas genre is used to refer to the membership of a text in culturally recognizable categories, which may invoke more than one register. As such, genre is a socio-pragmatic phenomenon. According to Unger (2002, p. 2), a socio-pragmatic phenomenon is a set of shared assumptions that governs the communicative behavior of the members of a group. It also relates communicative behavior to the structure of cultural institutions.

Hoffmann (1985, 1990) states that the purpose of a text depends on the context in which the text was created. In this sense, a text is both an instrument and a result that comes into being because of the specific productive activity (Hoffmann, 1985, p. 233). Similarly, Roelcke (1999, p. 42) underlines the importance of the specialized text regarded as a whole, and observes that the context of language usage also goes hand in hand with an increasing specialization of scientific and professional fields.

Although a definitive inventory and classification of specialized language genres and registers does not as yet exist, specialized language genres would doubtlessly be linked to specialized knowledge activities and text function within the context of a specialized knowledge field (cf. Hoffmann, 1985). Registers would presumably be subdivided primarily according to levels of formality. These formality levels would be constrained by parameters inherent in the context of specialized communication.

However, in TKBs, communicative context should not only be codified as a local pragmatic marker in term entries, especially when the aim of querying a TKB is multilingual communication. The reason for this is the fact that register-based variants in different languages do not necessarily establish 1:1 correspondences. This means that if a concept is designated by an informal term variant, it should not always be translated by its informal counterpart in another language and vice versa, because pragmatic conventions can also change from culture to culture. For instance, in an English doctor–patient communicative act, doctors tend to use more informal variants than in a similar situation in Spain. Even if the members of a term pair such as intestinos and intestines are full equivalents, bowels would be more appropriate in an English situation.

Nevertheless, the influence of culture is reflected in specialized domains in much more complex ways than it is in culture-specific terms or register-based differences (Faber and León-Araúz, 2014). Cultural differences may also affect conceptual structures. For instance, one might think that natural landforms are more or less the same all over the world, but the truth is that there is a great deal of plasticity in how language models the earth and what is considered to be the essence of its features (Burenhult and Levinson, 2008, p. 148). Until recently, it was believed that entities such as mountain and river were candidates for universals (Smith and Mark, 2001). However, research in cognitive ethnophysiography has found that this is not the case. Apart from the problem of establishing interlinguistic correspondences, this also makes it hard to agree on how concepts are classified in the same language.

For example, the diversity of wetlands is an obstacle to arriving at a consensus with regard to their classification. One of the most widely used classifications was created by Cowardin et al. (1979), who divided wetlands into marine, estuarine, riverine, lacustrine, and palustrine environments. Nevertheless, this classification was eventually found to be too restrictive, and a more comprehensive categorization was required. The Ramsar classification system for wetland types (1996) thus proposed new categories to cover all types of wetlands in the world: marine/coastal wetlands, inland wetlands, and human-made wetlands. In turn, the Canadian National Wetlands Working Group (1997) established five classes: bog, fen, marsh, swamp, and shallow water.

However, labeling categories in terms of basic level concepts (Rosch, 1978) can be confusing, because they are highly localized. For instance, bogs or fens are usually grouped together and referred to as mires in Europe, but not in America. Marshes in Europe are often called reed swamps, but swamps in America are not dominated by reeds but rather by trees. Carr is the northern European term for the Southeast American wooded swamp, which in the United Kingdom is also called wet woodland. There are also specific types of wetlands that only predominate in certain geographic areas and that are not lexicalized in all cultures, such as the Australian billabong, the African dambo, or the Canadian muskeg. In these cases, the local terms are only borrowed when describing these particular wetlands. Thus, when one of these terms is activated in a text, the location-related category features of the concept are constrained.

Multidimensionality is also found in discipline-based contexts. In Terminology, multidimensionality is often regarded as a way of enriching traditional static representations, enhancing knowledge acquisition through different points of view in the same semantic network or conceptual system. However, it can also produce an excessive information load. This is the case of certain general top-level concepts such as water (**Figure 4**), which is a classic example of information overload in EcoLexicon (León-Araúz et al., 2012, 2013; Faber et al., 2014a).

Water certainly holds different relations with a myriad of different concepts. However, EcoLexicon users would not acquire any meaningful knowledge if all dimensions of water were shown in the same network. Moreover, water rarely, if ever, activates those concepts at the same time, since this would evoke completely different and incompatible scenarios (León-Araúz et al., 2013). In this sense, although it is true that concepts cannot be activated in isolation, they can also retain sufficient autonomy so that the activation of one does not necessarily entail the activation of the rest (Langacker, 1987, p. 162). Their activation should thus be domain-dependent.

Frontiers in Psychology | www.frontiersin.org | February 2016 | Volume 7 | Article 196

According to Picht and Draskau (1985, p. 48), multidimensionality depends on the classifier as well as the different knowledge sources that may reflect different criteria when organizing the same domain. In conceptual modeling, facets and contexts can be established according to different criteria. However, in EcoLexicon, a discipline-oriented approach was found to be the most appropriate, since concepts may have different roles and degrees of prominence in the different disciplines that constitute the environmental sciences.

As opposed to formal approaches where concepts are ascribed to particular categories on the basis of a set of necessary and sufficient features, semantic networks in EcoLexicon take the form of a set of conceptual relations that might be highlighted or suppressed, depending on pragmatic factors. We agree with Michalski (1991) when he states that the context of a concept is the set of concepts relevant to its intended meaning.

The environmental domain was thus divided into a set of domain-based contexts (e.g., hydrology, geology, oceanography, civil engineering, etc.) and the relational power of concepts was constrained accordingly. This is done by assigning each conceptual proposition to one or more contextual domains based on a previously domain-based classified corpus. For example, the proposition CONCRETE made\_of WATER only appears relevant in Civil Engineering texts, but not in a geological context. Thus, when constraints are applied, the network of WATER within the civil engineering sub-domain is recontextualized and becomes more meaningful (**Figure 5**).
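The assignment of conceptual propositions to contextual domains can be pictured with a small data structure. In the sketch below, only the proposition CONCRETE made_of WATER and its restriction to civil engineering come from the text; the other two propositions, the class `Proposition`, and the function `network` are hypothetical and serve only to show how a domain constraint prunes the network of a concept. This is not a description of EcoLexicon's actual implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposition:
    source: str
    relation: str
    target: str
    domains: frozenset  # contextual domains in which the proposition is relevant

PROPOSITIONS = [
    Proposition("CONCRETE", "made_of", "WATER", frozenset({"civil engineering"})),
    # The next two entries are invented for illustration only.
    Proposition("WATER", "causes", "EROSION", frozenset({"geology", "hydrology"})),
    Proposition("AQUIFER", "stores", "WATER", frozenset({"hydrology", "geology"})),
]

def network(concept, domain=None):
    """Return the propositions involving `concept`, optionally restricted to one domain."""
    return [
        p for p in PROPOSITIONS
        if concept in (p.source, p.target) and (domain is None or domain in p.domains)
    ]

unconstrained = network("WATER")
civil = network("WATER", "civil engineering")
geology = network("WATER", "geology")
```

Without constraints, all three propositions are returned for WATER. With the civil engineering constraint, only CONCRETE made_of WATER remains, and with the geology constraint that proposition is filtered out.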

Recontextualization is in line with Cruse's (2002) approach to meaning (i.e., ways of seeing, microsenses, or context modulation) or Sperber and Wilson's (1986) Relevance Theory, since semantic networks are dynamically built according to context salience. Thus, concepts themselves can also have their own situated nature. In this sense, Barsalou (2009, p. 1283) states that a concept produces a wide variety of situated conceptualizations that support goal achievement in specific contexts. In a similar way, this would be in consonance with semantic priming, which according to McNamara (2005) can be influenced by the context created by the types of semantic relations present in a test list.

## CONCLUSION AND FUTURE WORK

In this paper, we have proposed a taxonomy of context primarily based on scope (local and global) and further divided into syntactic, semantic, and pragmatic facets for TKB design. Although context is a controversial notion interpreted and represented as needed in each field, we believe that for specialized knowledge representation in terminological resources, context should be much more than a textual excerpt.

Context modeling formally describes aspects of the linguistic, physical, and social world around us for purposes of understanding and communication. In this regard, it is necessary to determine what aspects to include and exclude from the model, and at what level of detail to model each of them.

Ideally, context specification and representation in specialized knowledge resources is conducive to the formulation of a common structure applicable to and valid for different languages and cultures based on a representational framework that allows for correspondences at different levels as well as for the inclusion of the syntactic, semantic and pragmatic features upon which this correspondence is based.

In EcoLexicon, various context parameters have been explored. Their specification has materialized in various modules representing different contextual aspects, ranging from term variation, collocations or knowledge-rich contexts, to dynamic conceptual networks, flexible definitional templates, or conceptually enriched graphical resources. However, much still remains to be done, especially with regard to the interdependence between all modules and the transition from local to global constraints. In the future, users will be provided with different types of information selected according to context. For this to be possible, context in all of its facets must be accounted for in a systematic and principled way.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## ACKNOWLEDGMENTS

This research was carried out within the framework of project FF2014-52740-P, Cognitive and Neurological Bases for Terminology-enhanced Translation (CONTENT) funded by the Spanish Ministry of Economy and Competitiveness.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Faber and León-Araúz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Contexts as Shared Commitments

### Manuel García-Carpintero\*

LOGOS-Departament de Lògica, Història i Filosofia de la Ciència, Universitat de Barcelona, Barcelona, Spain

Contemporary semantics assumes two influential notions of context: one coming from Kaplan (1989), on which contexts are sets of predetermined parameters, and another originating in Stalnaker (1978), on which contexts are sets of propositions that are "common ground." The latter is deservedly more popular, given its flexibility in accounting for context-dependent aspects of language beyond manifest indexicals, such as epistemic modals, predicates of taste, and so on; in fact, properly dealing with demonstratives (perhaps ultimately all indexicals) requires that further flexibility. Even if we acknowledge Lewis's (1980) point that, in a sense, Kaplanian contexts already include common ground contexts, it is better to be clear and explicit about what contexts constitutively are. Now, Stalnaker (1978, 2002, 2014) defines context-as-common-ground as a set of propositions, but recent work shows that this is not an accurate conception. The paper explains why, and provides an alternative. The main reason is that several phenomena (presuppositional treatments of pejoratives and predicates of taste, forces other than assertion) require that the common ground includes non-doxastic attitudes such as appraisals, emotions, etc. Hence the common ground should not be taken to include merely contents (propositions), but those together with attitudes concerning them: shared commitments, as I will defend.

### Edited by:

Alessio Plebe, University of Messina, Italy

### Reviewed by:

Mark Jary, Roehampton University, UK

Alessandro Capone, University of Messina, Italy

### \*Correspondence:

Manuel García-Carpintero m.garciacarpintero@ub.edu

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 16 September 2015 Accepted: 01 December 2015 Published: 22 December 2015

### Citation:

García-Carpintero M (2015) Contexts as Shared Commitments. Front. Psychol. 6:1932. doi: 10.3389/fpsyg.2015.01932

Keywords: context, presupposition, accommodation, meaning normativity, rules

## TWO NOTIONS OF CONTEXT

As Stalnaker (2014, pp. 13–34) reminds us, in formal semantics/pragmatics there have been two prominent theoretical articulations of the intuitive notion of a context, that is, a concrete situation relative to which linguistic exchanges take place. The first is the one described in Kaplan's (1989) work, by means of which Kaplan's important notion of character (the linguistic meaning of context-dependent expressions) is defined. On this view, a context is a sequence of items on which the content of a sentence ("what is said" with it) might depend, given the character of some of the expressions in it. Thus, a context includes a speaker, the value of the character of "I"; a time, the value of the character of "now"; a place, the value of the character of "here"; a possible world, the value of the character of "actual." In contemporary intensional semantics, this is modeled as a centered possible world: a possible world together with a designated time and subject.

The second is the notion characterized in Stalnaker's (1978) influential work on presupposition and assertion: "a body of information that is available, or presumed to be available, as a resource for communication" (Stalnaker, 2014, p. 24). This is modeled as the "context set," the set of possible worlds compatible with the presumed common knowledge of the participants<sup>1</sup>. This second notion is supposed to encompass the previous one, because the information needed to interpret indexicals (who the speaker is, what the time is, the place and the world in which the exchange takes place) is included in the context set. This raises delicate issues, not unrelated to the ones that I will be discussing, concerning how the "propositions" that make up contexts-as-common-ground should be understood for this claim to be justified; but it seems intuitively acceptable as a starting point<sup>2</sup>.

<sup>1</sup>For purposes of the present contrast, I take Lewis' (1979) model as a variant of the Stalnakerian model. I mention some relevant differences below.

At first sight, the second conception is considerably more flexible than the first one, and as a result more adequate as a theoretical tool. In addition to "pure indexicals" like those already mentioned, there are demonstratives such as "he," "you," "that"; and their contribution to content appears to depend not on "objective" features of the concrete situation, but on what the participants take for granted (about who is the salient/demonstrated male or female, etc.) when they are uttered. To some researchers, including the present author, "answering machines" and related examples suggest that the divide between pure indexicals and demonstratives is spurious (cf. Cohen and Michaelson, 2013 and references there, although the authors do not subscribe to those views). And most linguists also contend that the distinction between deictic uses of indexicals, whose reference is determined by means of demonstrations, and anaphoric uses, determined rather by means of their links to the previous discourse, does not draw a genuine semantic boundary. As Heim and Kratzer (1998, p. 240) put it, "anaphoric and deictic uses seem to be special cases of the same phenomenon: the pronoun refers to an individual which, for whatever reason, is highly salient at the moment when the pronoun is processed." To the extent that we are clear as to how the information that o is the speaker comes to be in the context-as-common-ground so that utterances of "I" can be interpreted, it does not seem more problematic to understand how the information that o is the demonstrated male comes to be there.

Lewis (1980, pp. 85–86) points out, however, that the Kaplanian notion of context can also be sensibly taken to encompass the Stalnakerian one, and the previous points with it:

That is not to say that the only features of context are time, place, and world. There are countless other features, but they do not vary independently. They are given by the intrinsic and relational character of the time, place, and world in question. The speaker of the context is the one who is speaking at that time, at that place, at that world . . . The audience, the standards of precision, the salience relations, the presuppositions... of the context are given less directly. They are determined, so far as they are determined at all, by such things as the previous course of the conversation that is still going on at the context, the states of mind of the participants, and the conspicuous aspects of their surroundings.

Thus, the two notions of context might be perfectly compatible, "complementary, rather than alternative theories of the same thing" (Stalnaker, 2014, p. 16). For present purposes, however, I'll assume the Stalnakerian one; even if Lewis is right, it has at least the advantage of allowing for a more perspicuous presentation of the relevant features on which our theoretical proposals rely<sup>3</sup>.

As we have seen, the Stalnakerian notion is characterized as a set of propositions, or contents. The point I want to make in this article is that we should think of contexts as having instead a richer structure; more specifically, as having illocutionary features, understood in non-psychological, normative terms<sup>4</sup>. I will argue that a context should not be understood as a set of propositions (or other representational contents) that are (presumed to be) mutually known, or mutually believed, but, more generally, as a class of shared propositional commitments, some in the belief-mode, but some in other illocutionary modes too<sup>5</sup>.
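The contrast between the two pictures can be made concrete with a toy data structure. The following sketch is only an illustration under stated assumptions: the class `Commitment`, the mode labels, and the string used as a stand-in for a proposition are all hypothetical, and nothing here is meant as a formal model of the proposal.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Commitment:
    mode: str     # illocutionary mode, e.g. "belief" or "directive"
    content: str  # placeholder label standing in for a proposition

def propositional_view(context):
    """What a purely propositional common ground retains: contents without modes."""
    return {c.content for c in context}

# The same content shared in two different modes.
context = {
    Commitment("belief", "the addressee closes the door"),
    Commitment("directive", "the addressee closes the door"),
}
flattened = propositional_view(context)
```

The structured context distinguishes two shared commitments, whereas the flattened set contains a single element, which is the kind of information that a purely propositional conception leaves out.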

The argument in the following pages proceeds by laying down five illustrative examples, observing that each of them constitutes a particular instance of the main claim just stated. They are: the contribution to the determination of what is said of a "question under assumption"; the interpretation of directives; the interpretation of pejoratives and slurs; the semantics of predicates of taste; the interpretation of fictions. Before going into the discussion of the examples, however, I need to say something about meaning and norms: both discourse norms, such as conversational norms and rules of accommodation, and illocutionary norms.

I should admit at the outset that the point I want to make should not be controversial, and in fact it is in a way obvious to researchers in this field. It is enough to pay attention to the fact that questions and commands make contributions of their own to the context in order to realize this. It is sometimes noted, and just put aside for reasons of expediency, because the semantics of declaratives is more familiar and well-studied. However, I will show that not having it clearly in mind leads to faulty arguments and overlooked possibilities. The section "Example 3: Pejoratives and Slurs" below is thus the core of the paper. In defending his truth-conditional account of pejoratives that I will question there, Hom (2012) approvingly quotes MacFarlane (2011):

The beauty of truth-conditional semantics is that it provides a common currency that can be used to explain indefinitely many interaction effects in a simple and economical account. We should be prepared to accept a messy, non-truth-conditional account . . . only if there is no truth-conditional account that explains the data.

<sup>2</sup>Huvenes and Stokke (2015) question "information-centrism," the view that context-as-body-of-information is what is needed in semantic theories of context-dependent expressions. This is also the view I am arguing against, although I will follow a different route. They use confusion cases involving indexicals and demonstratives, arguing that something beyond bodies of information is needed for proper theorizing about them. I think a more structured view of contexts along the lines to be suggested below might handle their cases, and hence that their arguments are interestingly complementary to those given here. More specifically, I think a proper handling of their cases requires adding further structure to contexts, distinguishing presuppositions that are semantic requirements (Fine, 2007) from those that are just shared knowledge with different sources (cp. Huvenes and Stokke, 2015, fn. 12 and surrounding text).

<sup>3</sup>There are other notions of context in the literature, which might be free from the problems I'll raise; cf., for instance, Capone (2013), Fetzer (2012), and Gross (2001). I take it, however, that the Kaplan-Stalnaker stance is sufficiently influential to merit discussion.

<sup>4</sup>This would not come as a surprise to those who contend that propositions themselves are constitutively endowed with force-like traits (King et al., 2014; Hanks, 2015); but the considerations here will not presuppose such a highly controversial view (wrong, I think), and will be compatible with more traditional views on which propositional contents themselves lack force-like features and can be put forward in different illocutionary modes.

<sup>5</sup>Green's (2000, p. 468) notion of the conversational record, defined in terms of the illocutionary commitments of discourse participants, offers a good formal model for the sort of structure I'll be arguing for.

The uncontroversial point that my two initial examples make shows this rhetoric to be highly problematic; in addition, these two initial examples will show how general the point is, in fact affecting all ordinary discourses. The three final examples show that we ignore it at our peril, starting with the very case for which Hom invokes the rhetoric of the methodological priority of the familiar semantics for declaratives.

## MEANING AND NORMS

In the previous section, I contrasted two different ways of thinking of contexts, and favored the Stalnakerian one. In this section I will discuss another contrast, between normative and descriptive, non-normative views of meaning, and I will indicate why I favor the former.

In recent work already mentioned, Stalnaker (2014, pp. 36–37) contrasts two more different ways of thinking of contexts, which, as he points out, reflect the contrasting ways in which Austin and Grice thought of speech acts. Austin (1962) suggests thinking of them as social practices constituted by social norms, usually established and maintained by conventions; Grice (1957) takes them instead to be definable in natural, psychological terms, appealing to a peculiar kind of reflexive intention. Stalnaker's own views favor the latter sort of account; Lewis (1979) offers a model well adapted to the former. With respect to this issue, I depart from Stalnaker's views and favor the ones he rejects.

What is at stake in such debates? For present purposes, I'll just mention two relevant concerns that Austinians have with the Gricean account, which I take very seriously. On the negative side, Austinians emphasize that speech acts might well take place even when their authors lack the complex intentions that Griceans posit (Alston, 2000, pp. 48–49). A clerk in an information booth makes an assertion when she utters "the plane will arrive on time," even though she does not care at all what psychological impact this has on her audience. On the positive side, Austinians emphasize that speech acts are governed by norms, not just "regulative" ones (be clear!, polite!, witty!) but constitutive ones, and that this has a stronger impact on the determination of the speech act made than whatever communicative intentions the author had. Thus, for instance, the clerk in the above example might be criticized if she cannot have known the information she provided: we had been reliably told that the plane had only just taken off from the departing airport, and so we reply, "you cannot know that!"

Williamson (1996/2000) has defended an account of assertion along Austinian lines, on which the following norm (the knowledge rule) is constitutive of the act, and individuates it:

(KR) One must [(assert p) only if one knows p].

Other writers have accepted Williamson's view that assertion is defined by constitutive rules such as KR, but have proposed alternative norms; thus, Weiner (2005) proposes a truth rule, TR, and Lackey (2007) a reasonableness rule, RBR:

(TR) One must [(assert p) only if p].

(RBR) One must [(assert p) only if it is reasonable for one to believe p].

Norms like these are sui generis: they do not have their sources in moral or prudential codes, but in specifically illocutionary ones. They are defeasible and pro tanto: they can be overridden by stronger norms.

And it is possible to violate them, thereby rendering the acts wrong but occurring: what is constitutive of asserting p is not that one knows p, but that in performing it one is thereby subject to the requirement that one knows p. There are plenty of situations in which p is asserted when p is false, or the speaker lacks justification for it. The assertion is then wrong, and wrong relative to norms defining the nature of such a speech act.

Stalnaker (1978) provides an account of assertion, and of what I take to be an ancillary speech act, presupposition, in a Gricean spirit, on which a presupposition is a requirement on the context, and an assertion is a proposal to change it by adding to it its content, which will take effect if the assertion is not rejected. He (ibid., 87) puts forward several reasons why his suggestion cannot be taken as a definition sensu stricto: it is not individuative, in that acts other than assertion are such proposals, and it would presumably be circular if taken in that way because it helps itself to the notion of another speech act, rejection. He nonetheless shows the account to be able to provide explanations for different phenomena.

One of those is presupposition accommodation, as when we decline an invitation by uttering "I cannot come, I have to pick up my wife at the airport." This will not be felt to be in any way incorrect even in contexts in which it is not mutually known, previous to the utterance, that the speaker is married. Stalnaker's suggestion to account for this relies on the correct point that whether or not the presuppositions of an utterance are satisfied should be checked right after the utterance has been produced. This is so because in many cases it is the very occurrence of the utterance that makes it the case that the context includes the information that must be in it for some presuppositions to be correct. Thus, an utterance u of "I am hungry" asserts that x is hungry, for some assignment to x, and presupposes that x is the speaker of u; but this latter information comes to be in place concurrently with the utterance. Something similar obtains, according to Stalnaker, in the "my wife" case.

Now, in previous work (García-Carpintero, 2015) I have argued that, although this is correct as far as it goes (so that in standard cases of informative presuppositions they have become common knowledge at "presupposition evaluation time," so that the common knowledge norm for presuppositions is ultimately not violated), in order to sustain it two assumptions that Stalnaker rejects are needed:<sup>6</sup> first, that some presuppositions (such as those assumed here for "I" and "the") are lexically triggered, and an adequate semantics for natural languages should countenance them. Second, that we think of presupposing as an ancillary speech act, understood along normative-Austinian lines, its constitutive norm being that the presupposed content is commonly known.

<sup>6</sup>Cf. von Fintel (2008).

If those points are right, the model that Lewis (1979) provides for presupposing, asserting, and their interrelated effect on contexts such as accommodation is more appropriate than the psychological one that Stalnaker assumes. As is usual when it comes to understanding normative notions, Lewis takes games as a model, and offers different rules of accommodation for different expressions, understood in normative terms. This model can also be helpfully used in order to properly understand indirect speech acts. Grice (1975) offered a deservedly influential analysis for a very specific case, conversational implicatures, in which assertions are indirectly conveyed by other assertions. The specific maxims that Grice provided were attuned to that case and cannot be generalized. For instance, the maxim of quality ("Try to make your contribution one that is true") cannot be applied to explain how assertions are indirectly conveyed by questions, because questions are not constitutively either true or false. The Cooperative Principle ("make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange"), from which Grice derives the specific maxims, is a regulative norm that would be involved in any general account of indirect speech acts.

In sum, the view that I will assume henceforth as a point of departure, which my arguments suggest we should improve, has it that contexts are Stalnakerian sets of propositions as opposed to Kaplanian sets of parameters, and that their dynamics is to be understood along Lewisian normative lines, as opposed to Stalnakerian intentional ones.

## EXAMPLE 1: UNDERDETERMINACY AND THE QUESTION UNDER DISCUSSION

I will now start developing my argument. In the next two sections I will focus on two cases for which there is widespread agreement that the propositional account of context is inadequate: questions and directives, the former in this section and the latter in the next. I will argue that, even though the point that questions need to be treated as adding to contexts more than propositions is sufficiently well established already, it has more pervasive consequences than usually acknowledged or even realized.

Some writers (e.g., Alston, 2000, pp. 116–120; Jary, 2010, pp. 15–16; Pagin, 2011, p. 123) defend accounts of assertion that imply that this act cannot be indirectly made, by requiring that an assertion consists of the communication of the proposition p by means of a sentence that means p. I think that this incorrectly makes it impossible by definition to make assertions of p with sentences that mean something else (or even by fully nonlinguistic means): in rhetorically asking "Who the heck wants to read this book?," I think I am asserting that (to put it mildly) nobody wants to read it. Aside from direct counterexamples like this, we might ask: why would assertion be special, in being the only speech act that cannot be made indirectly? Unless the point generalizes, and no speech act can be made indirectly; but it seems clear that, say, a literal expression of thanks such as "thanks for not browsing our journals" in a newsstand indirectly conveys a request.

Recently, however, other writers have provided arguments that would answer this worry, and hence would support views of assertion along the lines indicated. While Camp (2006) and Lepore and Stone (2010) have discussed the argument for specific cases such as metaphorical assertions, Fricker (2012) advances more general considerations. A point on which these authors rely is that indirectly conveyed claims are too ambiguous or underdetermined in their contents for the speaker to fully commit to them in the way constitutive of assertions<sup>7</sup>.

Notoriously, similar points have been made by so-called "minimalists" about semantic content such as Cappelen and Lepore (2005) and Borg (2012) against so-called "moderate contextualists" such as Bach (1994). Minimalists hold that semantic contents are truth-conditional (i.e., given a specification of a possible world, they deliver a truth-value), but nonetheless context-invariantly determined except when it comes to the value of "pure indexicals" (those for which Kaplanian contexts reserve parameters, "I," "now," "today," and a few more). On the basis of much-debated examples such as "I am ready," "I am tall," "I have had breakfast," or "there is milk in the fridge," moderate contextualists plausibly contend that compositionally determined semantic contents are truth-conditionally incomplete: they do not yield a truth-value given a possible world. However, context might help to complete them, fixing a fully determinate content that the utterance literally and directly conveys. Against this, minimalists produce slippery slope arguments (Cappelen and Lepore, 2005, pp. 43–44; Borg, 2012, p. 83), allegedly showing that moderate contextualism is an unstable standpoint, ultimately committing its proponents to the much more radical view put forward, say, by Travis (1985), which does away with any recognizable notion of semantic content. These arguments purport to show that moderate contextualists' strategies would leave literal, directly expressed truth-conditional contents wildly underdetermined.

Thus, if these combined appeals to considerations of semantic underdetermination were valid, we would end up with the absurd consequence that the only contents we can ever assert are trivially true claims such as (on Borg's view, in the case of "I am ready") that I am ready for something or other, or their trivially false negations. Fortunately, there is a compelling reply, which Schoubye and Stokke (2015) develop in detail for the case of minimalists' criticisms of moderate contextualism. They appeal to Roberts's (2012) proposal, elaborating on previous work by Carlson (1982) and others, that contexts are structured by a "question under discussion" (QUD) for which discussants try to provide adequate answers<sup>8</sup>. The QUD might have been explicitly asked, but it can also be merely implicit; in some cases, it may be very general, including the "Big Question,"

<sup>7</sup>Fricker offers two more specific reasons. First, a secondary message will be too ambiguous for the speaker to fully commit to it. Second, the audience will have to choose to draw certain inferences and it is thus they, not the speaker, who are responsible for the inferences that they choose to draw.

<sup>8</sup>Schaffer (2008, pp. 3–5) offers a very clear, short presentation of the idea.

what is the way things are? Schoubye and Stokke convincingly argue that, taking this into consideration, in many ordinary contexts moderate contextualists are able to provide sufficiently well-determined (i.e., leaving aside the vagueness and lack of specificity in fact existing in ordinary communication) contents literally and directly expressed, completions of the typically non-truth-conditional, compositionally determined meaning of the uttered sentence.

Bergmann (1982) had in fact already made a similar point in defense of metaphorical assertions, which also gives a compelling reply to Fricker (2012) and Lepore and Stone (2010). As she puts it (ibid., 231): "without knowing the context in which a metaphor occurs and who its author is, it is impossible to state conclusively what the metaphor 'means' without drawing out all that it could mean . . . But bring in a well-defined context and a real author, and matters may change drastically." She (ibid.) illustrates this with the following example:

Suppose I say to you, after hearing the latest report on Three Mile Island, "As far as I'm concerned, nuclear reactors are time bombs." You correctly interpret my remark as an assertion to the effect that nuclear reactors are likely to fail, at any moment, of course, with disastrous consequences. A while later you say, "That was an interesting metaphor: nuclear reactors being time bombs. Although I don't think that the guys responsible for those things want people to get killed by them, still it seems that, like people who use time bombs, they have a frightening disregard for human lives." This, then, is something else that I could have used the metaphor to assert. But it does not follow, from the possibility of using a metaphor to make different assertions, that anyone who does use that metaphor is making all of those assertions.

Bergmann's point can be articulated by means of Schoubye and Stokke's strategy, by taking the specific feature of context required to develop her argument to be a particular QUD. Thus, we can take the QUD implicitly assumed in her example to be something like this: Which consequences should we derive from the Three Mile Island (28/3/1979) accident?

Using current formal semantics frameworks to model questions, Roberts (2012) provides a particular theoretical representation of QUD, which Schoubye and Stokke invoke to develop their point in a sufficiently precise way. I will not go into those details here. It is, however, clear how these (to my mind very plausible) views help to illustrate the main claim I want to make here. QUDs interact with Stalnakerian contexts in the sort of rule-governed way Lewis (1979) set out to model with his scorekeeping analogy; but they are not such contexts, for they are not propositions. So, to adopt proposals of the sort just outlined requires us to abandon the simple-minded way of thinking of context presented at the outset. We should think of contexts as more complex, structured into at least two different components endowed with illocutionary features: a class of propositions that should be mutually known, and a class of questions (also commonly known as such in felicitous cases) for which discussants aim in a coordinated way to provide answers<sup>9</sup>.

## EXAMPLE 2: DIRECTIVES

As I said above, it is relatively uncontroversial that, while questions make contributions to context, their contributions differ from those that declaratives make. The previous section showed that this has more encompassing consequences than generally acknowledged, in that all contexts should be thought of as structured by a QUD, which is then highly relevant to determining what ordinary utterances of declarative sentences add to the Stalnakerian context of commonly accepted propositions. Contexts thus include the Stalnakerian set of propositions to which speakers are committed in the way they are committed to their beliefs, updated by accepted assertions; but they also include a separate class of propositions to which speakers are committed in the way they are to the questions that direct their inquiry (in whatever way this is formally represented), updated by new questions and by the assertions that partially answer them. Both components are mutually known, in felicitous cases. Now, as Lewis (1969) suggested, questions can be taken as a particular kind of directive (what utterances of imperative sentences signify by default); and directives in general independently help to establish the general point we are making here. I will also use the discussion of this second, less controversial case to confront the "flattening" strategy which opponents of the main claim I will be making tend to use to sustain their view.

Let us say first a few things about how directives should be understood in the normative framework I sketched in the second section. Alston (2000, pp. 97–103) characterizes the constitutive norm for strong directives such as orders or commands as an obligation on the addressee to carry them out, emanating from a relevant authority on the side of the speaker. Kissine (2013, ch. 4) provides a related account of directives as supplying the hearer with a (mutually manifest) reason to act. In the Williamsonian format of (KR), the constitutive condition for the specific case of ordering that these authors advance could be put like this:

(D) One must [(order A to p) only if one lays down on A as a result an obligation to p].

As in the case of the assertion norms, the obligations here in question are sui generis and prima facie. As in that case too, the combinations that the rules forbid (there, to assert what is not the case, or not known, etc.) should be possible: it should be possible to command p to A without A's acquiring thereby the relevant sui generis prima facie obligation to p. This requirement is met: even in the army there are specified situations under which certain orders (to perform unconstitutional acts, to violate human rights, etc.), although they come into existence as emanating from the requisite authority, are nonetheless incorrect in that the addressees do not thereby incur the intended prima facie obligation.

Several authors have advanced semantic accounts of directives on which these are semantically distinctive objects, distinct from assertions (what declarative sentences signify by default), just as questions (what interrogative sentences signify by default) are; Han (2011), Portner (forthcoming), and Jary and

<sup>9</sup>Once again, I refer the reader to Green (2000, pp. 467–470) for a perspicuous way of formally representing the complexity I suggest we should take contexts to have, here and below.

Kissine (2014) provide good overviews. Along the lines of Stalnaker (1978), researchers such as Han, Portner and Jary and Kissine suggest that strong directives also have a content to be added (when successful) to a collection of propositions. However, these are not those constituting the Stalnakerian common ground, but rather a "To Do List" or "Plan Set" representing something like the active projects of the addressee. This is consistent with (D); in fact, it is nicely explained by it.

Like their declarative counterparts, imperative sentences have uses that go beyond the core cases of strong directives. Uttering "take bus 44" in reply to "how do I get from here to the airport?" is not a command, but a suggestion, a piece of advice or proposal; similarly for an utterance of "come round to my house to watch the game!," after the addressee has manifested interest in watching the game tonight and a lack of any plans for seeing it. "Help me!" is not a command, but a request. "Come in!" uttered after someone knocks on my door issues an authorization. "Get well soon!" said to someone who is ill or "Please don't rain!" looking at the sky are expressions of wishes, rather than orders. Semanticists adopt different views in light of this. Han focuses on commands as core cases, and leaves the other cases to be explained pragmatically as indirect speech acts. Portner and Jary and Kissine aim instead to provide an account general enough to encompass at least some other uses.

For our purposes, we do not need to go into these debates. We have said enough to indicate how directives add to the cumulative point we are making. Contexts are structured in complex ways, including different classes of propositions to which speakers are committed in different modes: in the way we are committed to our beliefs, but also in the way we are committed to our intentions, and to the questions guiding our inquiries. And, as we pointed out above, in felicitous contexts it is all these different commitments that are matters of mutual knowledge. As Stalnaker's (1978) account of assertion emphasizes, an accepted assertion comes to be presupposed afterwards, allowing for the satisfaction of presuppositional requirements later on in the discourse. Similarly, an accepted directive is taken for granted afterwards, constraining the legitimate moves that can be made in the discourse game, and obviously the same applies to the QUD.
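For readers who find a toy model helpful, the structure just described can be sketched as a small data type. This is only an illustration under simplifying assumptions, not a formal proposal and not the representation used in any of the works cited: propositions, questions and directive contents are represented as plain strings, and the names `Context`, `common_ground`, `questions_under_discussion` and `to_do` are labels introduced here for the purpose.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Context:
    """Toy model of a structured context (all names are illustrative)."""

    # Contents speakers are committed to in the way they are to beliefs.
    common_ground: Set[str] = field(default_factory=set)
    # Questions guiding the inquiry; the last item is the current QUD.
    questions_under_discussion: List[str] = field(default_factory=list)
    # Contents of accepted directives, indexed by addressee.
    to_do: Dict[str, Set[str]] = field(default_factory=dict)

    def accept_assertion(self, proposition: str) -> None:
        # An accepted assertion updates the belief-like component only.
        self.common_ground.add(proposition)

    def accept_question(self, question: str) -> None:
        # An accepted question becomes the current question under discussion.
        self.questions_under_discussion.append(question)

    def accept_directive(self, addressee: str, content: str) -> None:
        # An accepted directive updates the addressee's list, not the common ground.
        self.to_do.setdefault(addressee, set()).add(content)


ctx = Context()
ctx.accept_question("Which consequences should we derive from the accident?")
ctx.accept_assertion("nuclear reactors are likely to fail")
ctx.accept_directive("A", "A closes the door")
```

The only point of the sketch is that the three kinds of accepted move land in different components: the content of the directive is recorded for the addressee without thereby being added to the set of propositions held in the way beliefs are.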

Davidson (1979) and Lewis (1970) suggest dealing with nondeclaratives by taking them to be synonymous with explicit performatives, and then taking the latter to have, from a semantic standpoint, the truth-conditions they appear to have compositionally. Thus, "take bus 44!" would just mean, from a semantic point of view, the proposition that the speaker thereby requests the audience to take bus 44. Can we not just adopt this line and avoid having to ascribe to contexts the complex structure we have so far posited? By taking questions and directives to express the propositions self-ascribing speech-acts that these views envisage, we could just stick to the Stalnakerian view of context as a set of propositions. It will be convenient to have a label for this strategy, for we will encounter other versions of it later in our discussion. Let me refer to it as the flattening scheme, or simply flattening.

In previous work (García-Carpintero, 2004) I have argued that these views are unmotivated<sup>10</sup>. However, even if we accept them (perhaps invoking the sort of methodological rhetoric discussed at the end of the first section, which is not far away from Lewis's (1970) own motivation), it is important to appreciate that such flattening will not ultimately prevent the need for extra complexity that I am advocating. Let me argue for this here, before we move to the next example where the same point may not be equally clear.

In the first place, flattening is unmotivated because the distinction between the three moods, declarative, interrogative and imperative, appears to be as semantically relevant as any syntactic distinctions can be. It is even productive and systematically reflected in English and other languages in corresponding distinctions in ascriptions of the types of acts they indicate: "I told Peter that it is raining," "I asked Peter whether it is raining," "I told Peter to stop the rain." But let us grant that at a certain "core semantic" level we might disregard this, moving the distinctions to pragmatics. I have already mentioned above the debates between minimalists, moderate and radical contextualists, involving the proper account of examples such as "I am ready," "I am tall," "I have had breakfast," or "there is milk in the fridge," and have expressed my sympathies for the moderate camp (cf. García-Carpintero, 2006, 2013a). The point that makes the effect of flattening irrelevant is that, unless radicals are right, we should distinguish two kinds of pragmatic intervention. There are the processes producing clearly secondary, derivative meanings, such as particularized conversational implicatures and indirect speech acts. And there are the processes that contribute to determine what intuitively are the literal, directly conveyed meanings of ordinary utterances, as in the examples above—Bach's (1994) implicitures. Even if pragmatic processes are involved here, the data make it clear that they operate at a subsentential level, contributing together with the semantic compositional core to meanings that are productive and systematically determined.

So we should acknowledge three different levels of meanings, to account for which we need specific theoretical tools, reflecting three distinct robust kinds of fact. There is the core semantic, compositionally-driven level, at which the temporal contents of "I have had lunch" and "I have had measles" do not differ. Then there is the secondary pragmatic level, at which an utterance of the former sentence conveys a rejection of an invitation to go to a restaurant. And then there is the intermediary level of the intuitive literal and direct meaning, at which the temporal contents of the claims made by the two sentences differ—the former indicating a shorter interval between the activity and the current time. Even if pragmatic processes are involved at this intermediary level, we still need an account of it, and one that adequately interacts with the core semantic level.

As I understand views like the ones of Lewis and Davidson I am discussing, they contend that "semantics proper" (the theoretical pursuit dealing with the first level) should not care

<sup>10</sup>Davidsonians would do much better to adopt the "success semantics" that Ludwig (1997) and Lepore and Ludwig (2007, ch. 12) advance; when deployed in the Stalnakerian framework we are assuming, this would mean accepting the main claims I am making.

about the distinction between declaratives, interrogatives and imperatives, by invoking the flattening strategy. As I said, I do not believe this is correct. The taxonomical proposals in the debates about the semantics/pragmatics distinction I have sketched are not merely terminological arbitrary options: they have theoretical consequences. In particular, they should allow us to explain how the meanings at the intermediary level are determined, productively and systematically so, given the alleged outputs of the semantic core. It is clear that at the intermediary level a sentence in the interrogative does not mean the assertion that the speaker is asking for the relevant content, and the corresponding point applies to imperatives: they mean, respectively, a question and a directive. I believe that flattening would make it difficult to explain how the intuitively literal, direct meaning is conveyed<sup>11</sup> . But never mind. The important point is that the intermediary level—whether purely semantic or pragmatically intruded—is real;<sup>12</sup> it systematically interacts with the core compositional determination of meaning, and we are entitled to theorize about it. Once inside it, there is no way of avoiding the complex structured contexts we have shown the need to envisage.

## EXAMPLE 3: PEJORATIVES AND SLURS

As announced, this is the core section of the paper; here I use the case of slurs and pejoratives to defend my main claim about the nature of contexts. Kaplan (ms<sup>13</sup> ) started a fruitful debate on the meaning of pejoratives—as in "that bastard Kresge is famous" including slurs and racial epithets as in "there are too many chinks in our neighborhood." Kaplan suggests that a different dimension of expressive meaning ("use-conditional," as opposed to truth-conditional) is required. Hom (2008) makes a case for a straightforward truth-conditional account; thus, for instance, according to him "chink" makes a truth-conditional contribution akin to that of other predicates such as "Chinese"—a property determining according to him a necessarily empty extension, which can be roughly expressed as: ought to be subject to higher college admissions standards, and ought to be subject to exclusion from advancement to managerial positions, and . . . , because of being slanty-eyed, and devious, and good-at-laundering, and . . . , all because of being Chinese (Hom, 2008, p. 431). As many have pointed out (cf. Jeshion, 2013a, pp. 316–319), a main difficulty for this view lies in the projection behavior of these terms: when sentences such as those mentioned above are negated, are antecedents of conditionals, or embedded under modal operators or in interrogative or directive mood, they still derogate the relevant targets.

To account for this, writers have argued that the expressive meaning of pejoratives and slurs is instead either a conventional

<sup>13</sup>Kaplan, D. (ms) "The Meaning of 'Ouch' and 'Oops.'"

implicature (Potts, 2007) or a presupposition (Macià, 2002, 2014; Schlenker, 2007) <sup>14</sup>. In defense of his truth-conditional account, Hom (2012, pp. 398–401) appeals to generalized conversational implicatures to explain the projection data. Now, I think a presuppositional account is more adequate; however, in order to deflate a very serious objection that has been raised against it, it is essential that we understand it relative to an extension of the proposal on the complexity of contents that I am making here. In any case, the other two proposals, the conventional implicature account and even perhaps Hom's generalized conversational implicature view, would also need to assume the extra complexity in contexts I will show we need. This is what I'll try to show in what remains of this section.

Both conventional implicatures (that somehow being poor contrasts with being honest, for "but" in "he is poor but honest"; that John is married, for the non-restrictive wh-clause in "John, who is married, will come to the party") and presuppositions (that someone broke the computer, for the cleft-construction in "it was John who broke the computer") are semantic, in that they are conventionally associated with some lexical items or constructions, and grasping them is required for full competent understanding<sup>15</sup>. Both are ways of conventionally indicating "non-at-issue" content. This is the most general reason why they project: thus, for instance, the negation in both "he is not poor but honest" and "it was not John who broke the computer" negates the "at issue" content, and so the same conventional implicature and presupposition as before are expressed. Neither can therefore be rejected by straightforward denials, so speakers must resort to oblique means such as Sadock's "hey, wait a minute" objection (Potts, 2012, pp. 2521–2522; Camp, 2013, pp. 341–342). Thus, it is not easy to tell them apart. Some researchers appeal to subtle projection differences (Potts, 2005; Tonhauser et al., 2013), but there is no agreement on this among linguists. In particular, their behavior when they occur in ascriptions of beliefs or acts of saying does not clearly distinguish between them, because, on the one hand, conventional implicatures might not project in such cases (Bach, 1999, pp. 338–343)<sup>16</sup>, as presuppositions typically do; and, on the other, presuppositions also project in

<sup>11</sup>I have defended that explicit performatives such as "I hereby promise not to drink again" literally say that the speaker promises by that very act not to drink again, and only indirectly convey the promise, as a form of generalized indirect speech act (García-Carpintero, 2013b). However, the account there presupposes that the three moods semantically encode information about speech-act types.

<sup>12</sup>I do not mean to suggest that the intermediate level is real in the straightforward psychological sense that many contextualists commit themselves to; cf. García-Carpintero (2001, 2006). However, as a reviewer helpfully suggested, it is at least real in that speakers are rationally committed to its deliverances.

<sup>14</sup>Williamson (2009) argues for a similar view. He classifies the expressive contents he proposes as conventional implicatures, but he understands that category in a traditional way, wider than the one I assume following Potts's work (ibid., 151, 153). I take his view to be compatible with the presuppositional account as much as with Potts's view. All these proposals can be viewed as different ways to elaborate on Kaplan's view that pejoratives should be accounted for by adding a "use-conditional" layer of meaning.

<sup>15</sup>In the case of presuppositions, Stalnaker and other writers dispute this; García-Carpintero (2015) defends it, for constructions such as the one given here for illustration.

<sup>16</sup>As I have pointed out elsewhere (García-Carpintero, 2006, pp. 45–47), Bach (1999) in fact does not show that conventional implicatures (or presuppositions, for that matter), as understood here following Potts, are a "myth." He only shows that they are not part of "what is said" in his "illocutionary" sense, which is just to say that they are not part of the "at issue" content of declaratives. Rather they are, according to him, part of "what is said" in his "locutionary" sense. But this just means that they are conventional, semantic in the sense that they need to be grasped for full competent understanding. This is part of current standard views on conventional implicatures, such as Potts's. Hom (2008, pp. 424–426; 2012, pp. 391–392) appears to have been misled by Bach's suggestions in his criticisms of the conventional implicature view.

some such cases, like conventional implicatures (Schlenker, 2007, p. 244)<sup>17</sup>.

Presuppositions and conventional implicatures have different natures (Potts, 2007, 2012). Conventional implicatures have the job of providing new information, exactly like assertions, except that it is information which (even if relevant) has a somehow background character. Felicitous presuppositions articulate (for some relevant purpose) part of what is already commonly known. Unfortunately, this again does not offer a straightforward distinction, because, as we already pointed out above with the "my wife" example, the fact that a sentence carries a presupposition can be exploited by speakers to provide uncontroversial background information, through accommodation. Nonetheless, I am convinced by the arguments by Macià and Schlenker that the data of projection and rejection, given clear-headed assumptions about the respective nature of the two phenomena, show that the best way of classifying the expressive meanings of pejoratives and slurs counts them as presuppositions.

However, probably guided by the simple-minded assumptions about context that I am questioning here, both Macià and Schlenker give an inadequate characterization of the expressive presuppositions of pejoratives, which opens the view to spurious criticism. Schlenker (2007, p. 238) offers this characterization for the slur "honky": the agent of the context believes in the world of the context that white people are despicable. This is a clear-cut condition on a Stalnakerian context. But, as Williamson points out (2009, pp. 151–152), it cannot be right, because it does not capture the normative status of slurs. Exposed to utterances in the above examples, we would challenge the speaker (using perhaps some variation of the "hey, wait a minute" strategy) to retract the derogation of Kresge or Chinese people; but we would hardly challenge her to retract the suggestion that she believes that Kresge or Chinese people are despicable: for all we care, she might well believe it, but this is not what we need to dissociate ourselves from when our interlocutors utter slurs we find objectionable. As Camp (2013, p. 333) points out, Potts's conventional implicature account has the same problem, for he just posits a condition on the subjective emotional state of the speaker: something to the effect that s/he actually is in a heightened emotional state (Potts, 2007, p. 171; 2012, p. 2532)<sup>18</sup>.

How, then, should contexts be understood as properly capturing expressive meanings? This depends on what emotions, and the speech acts conveying them, are. What pejoratives and slurs express, in my view, is that a certain emotional state (which, as researchers on these issues have made clear, can contextually vary along different parameters, cf. Potts, 2007; Hom, 2008; Camp, 2013, among others) is fitting or appropriate. Some philosophers have argued that emotions are just a particular kind of judgment—one to the effect that an object or situation instantiates their "formal objects," say, that Kresge or Chinese people are worthy of contempt, in our examples (cf. de Sousa, 2014; Todd, 2014, and references there). If this is right, then we do not need to go beyond the Stalnakerian context, on the assumption (which I am making) that presuppositions are ancillary speech acts, with normative essences like others, whose specific norm individuates them as requiring their contents to be common knowledge. That a speaker of "there are too many chinks in our neighborhood" takes it to be common knowledge that Chinese people are worthy of contempt explains the appropriate reaction to the utterance by non-prejudiced participants in the same conversation. Williamson (2009) seems to assume something like this<sup>19</sup> .

This would be a way of dealing with pejoratives analogous to the one offered by the flattening strategy for non-declaratives. Like that suggestion, however, the view of emotions and their expression on which it relies is controversial, and is rejected by many researchers (D'Arms and Jacobson, 2000, p. 67; Deonna and Teroni, 2014, pp. 18–21). If emotions are instead, as I believe, sui generis normative states (Mulligan, 1998; D'Arms and Jacobson, 2000; Deonna and Teroni, 2014), and their expressions speech acts defined by distinctive norms, then in order to properly incorporate the presuppositional view we should add further illocutionary structure to the context set, and thus encompass them. This additional structure will be constituted by the intentional objects of the emotional states (say, Chinese people, with their (alleged) condition of generically having such-and-such features in the case of "chink"), subject to the normative condition that such intentional targets are thereby worthy of contempt and hence adequate recipients of mistreatment. On this view, the "formal object" of the emotion (the property of being contemptible in this case) is not part of the represented content, but the normative

<sup>17</sup>Ascriptions of propositional attitudes and speech acts are notoriously context-dependent; this explains the existential quantifications. In his interesting discussion of hybrid theories of evaluative terms, modeled on the views on pejoratives I am discussing, Schroeder (2009, 2014) places a strong emphasis on a distinction between hybrid expressions whose expressive contents project even in attitude ascriptions, and those that do not. But, as far as I can tell, these are not properties of expressions themselves: we can only trace tendencies here. Slurs tend to project in ascriptions, but, as the examples by Schlenker and others show, they do not always do so. Such tendencies are orthogonal to the conventional implicature/presupposition divide. Quoting Bach (1999) (a work that he, unlike Hom [see previous fn.], appraises properly; cf. op. cit., 287–8, fn. 19), Schroeder shows that "but" might well not project in some ascriptions; but, following Potts (2005), I am taking non-restrictive wh-clauses as paradigm cases of conventional implicatures, and they do typically project in attitude ascriptions: John said that Peter, who will be coming soon, is welcome to the party.

<sup>18</sup>Boisvert (2014) provides a hybrid account of pejoratives and evaluative terms in the framework of "success" semantics, along the lines of the modified Davidsonian proposals in Ludwig (1997) and Lepore and Ludwig (2007, ch. 12).

As I said before, this is better than flattening, and also (as a result) compatible with the main claims I am making here. However, like Schlenker and Potts, Boisvert assumes a psychological expressivist, non-normative account of the nondeclarative additional speech acts that his account posits, which make it in my view similarly inadequate. To illustrate: there clearly is a semantic tension between uttering "thank you for p!" together with "shame on you for p!," but this cannot be adequately captured by an account on which the sentences merely indicate that the utterer actually feels grateful and disappointed regarding p; for, of course, there is no inconsistency in having such feelings regarding the same situation (cp. Boisvert, 2014, p. 34). In contrast, an account on which the sentences indicate acts subject to norms such that for them to be correct the same situation is to be both worthy of gratitude and of indignation does capture the tension.

<sup>19</sup>Likewise, Macià (2014) poses as the expressive presupposition of "chink" that speakers in the context are willing to treat Chinese people with a certain kind of contempt , on account of being Chinese. This is better than Schlenker's and Potts' subjectivist proposals, but is still objectionable along the lines that I develop in the main text.

condition that allegedly justifies addressing the emotional attitude toward it.

On the suggested view of emotions and the speech acts expressing them, the additional "emotive" structure of contexts should be assumed not only on a presuppositional account of pejoratives, but also on one on which they are conventional implicatures. For, even if the expressive content of pejoratives is background but novel "information," if unchallenged it would become part of the context set, licensing presuppositions down the line. The fact that we need to dissociate ourselves from such a prospect explains our normative reaction to utterances including slurs we disapprove of. This is why, even if Potts (2007, 2012) is right that such contents are conventional implicatures, as I mentioned above his subjective characterization of the expressive implicatures should be revised to support the present view of contexts.

Presuppositions are "filtered" in some contexts: they do not project when their triggers occur in the consequent of a conditional whose antecedent states them, or in the second conjunct of a conjunction whose first conjunct states them: if someone broke the computer, it was John who broke it; someone broke the computer, and it was John who did it. Schroeder (2014, p. 176) uses this point to dismiss the view that the expressive contents we are considering are presuppositions: "I cannot see how to construct a sentence of the form 'if P, then Mark is a cheesehead' that does not implicate the speaker in disdain for people from Wisconsin." This is well taken, but I take it to be only a consequence of the fact that the expressive contents we are discussing (be they presuppositions or conventional implicatures) are not just forceless propositions, which is what antecedents of conditionals or conjuncts must be. This leaves open whether they are presented as requirements on the common ground (and hence have a presuppositional character), or as new background commitments (and hence are conventional implicatures). Schroeder's argument is one more example of the misleading consequences of ignoring the main claim about the nature of contexts I am making here<sup>20</sup>.

Some of Hom's (2010, pp. 176–179; 2012, pp. 390–391) criticisms of the presuppositional and conventional implicature view have already been discussed, or have received adequate replies in the literature. The data about projection and "cancelation" are less clear than he assumes, and in any case can be accounted for by both proposals (cf. Macià, 2014). Intuitions about the truth-values of utterances are much less clear-cut than he and others take them to be (cf. Jeshion, 2013a, 317), and again can be accounted for by both the presuppositional and the conventional implicature proposals. Hom mentions "nonorthodox" cases that lack derogatory implications; but, again, defenders of alternative views have shown them to have enough resources to deal with them, as pragmatic effects or cases of polysemy (Jeshion, 2013b, pp. 326–330). Last but not least, what Hom (2010, p. 177) thinks is the "more fundamental problem with the presupposition account" can be adequately resisted if contexts are assumed to have the sort of illocutionary complexity I am arguing for. This is how he summarizes it:

To focus on slurring as a means of efficiently entering information into the conversational record is to miss the fundamental point of slurs, namely, that they are typically used to verbally abuse their targets, with no regard to whether the negative content actually gets accommodated within a framework of rational, cooperative behavior.

He (ibid.) summarizes this by approvingly quoting Richard (2008, p. 21): rather than trying to enter something into the conversational record, "someone who is using these words is insulting and being hostile to their targets." Now, the reply that the present proposal allows should be obvious. The contrast that Hom and Richard presume between making a requirement on the conversational record (or making an attempt at smuggling it there) and insulting/being hostile to some target does not exist, when what is thereby assumed to be in the context is a represented target presented as fitting the normative condition that it is contemptible and thereby liable to receive mistreatment: for this is precisely what the insult and the hostility amount to. It should be granted that Hom's and Richard's presumption that presuppositions merely concern "information" in the conversational record is shared by most of the theorists they oppose, but it is nonetheless wrong.

Actually, it is not at all obvious how Hom's own view properly captures the insulting character of utterances including slurs. His proposal is a form of the by now familiar flattening strategy for rejecting the main point of this paper in favor of straightforward truth-conditional treatments, the Davidson-Lewis line for non-declaratives, or the view that emotions are ordinary judgments, and their expression corresponding assertions. As we said, an immediate concern this raises has to do with the "projective" behavior of all such expressions under negation, conditionalization, etc.: as we have seen, intuitively expressive contents "escape" the operators under which they are embedded in such cases, while, if the expressive content is just straightforward truth-conditional content, it should remain embedded. But in fact, the problem already affects simple positive sentences: in principle, an assertion that a command is given can occur without the command being given; and an assertion that an emotional state, or the occasion for it, obtains (that something is frightening or contemptible) can equally occur without the emotional state obtaining (without the fear or contempt occurring)<sup>21</sup>.

As indicated above, Hom (2012, pp. 398–401) purports to explain the generation of the expressive content (in embedded and simple constructions) as a Gricean generalized conversational implicature<sup>22</sup>. I have serious doubts that this proposal can work on its own terms, but this need not concern us here. I want to make a point about it related to the one I made at the end of the previous section. In some cases, generalized conversational implicatures are not projected, but rather generated "locally," i.e., interacting with the compositional determination of contents, exactly as "implicitures"/"explicitures" are. The data suggest that, in some cases, expressive contents are thus generated locally (Schlenker, 2007, p. 244). It remains to be investigated whether these should be truly handled locally by our best theories; but, if they are, a full theoretical account of the data will need to contemplate the structurally enriched contexts we have advanced, even if metaphysically/metasemantically we classify the generation of expressive contents as a generalized conversational implicature.

<sup>20</sup>It is a particularly revealing one, because it occurs in a paper that is otherwise admirably clear about the distinction between contents and forces; Schroeder's (2014, pp. 278–280) toy formal model is as clear as Green's (2000) when it comes to the proper articulation of meanings that, like expressive contents in my view, are propositions-cum-illocutionary forces.

<sup>21</sup>The same can intuitively obtain in the opposite direction: the non-cognitive attitude/act (the command or the derogation) can occur, without the cognitive one (the belief/assertion that the command or the derogation takes place) taking place, because the thinker/speaker lacks the conceptual resources to describe the non-cognitive state/act. Hom deals with this apparent lack of necessity of his account by appealing to semantic externalism: semantically the equivalence obtains, even if ordinary speakers lack the resources to appreciate it.

## EXAMPLE 4: PREDICATES OF TASTE

The final two examples will be more cursorily discussed, but I hope they still have the power of contributing to the cumulative point I am making. In previous work with Teresa Marques (Marques and García-Carpintero, 2014), I defended a contextualist account of predicates such as "tasty" against criticisms that such a view cannot account for disagreement. In that paper we replied to proponents of recent forms of relativism, although the contextualist view that we defend is also relativist in traditional terms.

In ordinary cases of apparent disagreement between speakers who assert, respectively, S and not-S, the impression dissolves after recognition that S includes context-dependent expressions and that the utterances are made in different contexts. We agree, however, that an impression of disagreement remains among subjects who assent to "this is tasty" and its negation, even when it is clear that they make these judgments relative to different standards of taste. The main claim we make to account for this is that the remaining disagreement is practical. It concerns a noncognitive pro-attitude that people making this sort of claim (also in thought) have, favoring shared standards, which we take to be presupposed. We take this to be a pragmatic form of a "hybrid" view of a specific set of "thick" terms<sup>23</sup>.

Väyrynen (2013, ch. 3–6) defends a noncommittal pragmatic view of the evaluations associated with thick concepts based on interesting data about "objectionable" concepts, on which such implications are pragmatic but are neither conventional nor conversational implicatures nor presuppositions. Objectionable thick terms are those associated with evaluations found not acceptable (such as "lewd" or "blasphemous" for many today). Väyrynen provides interesting data showing that such evaluations project under different embeddings, and also that they can be (somehow) "canceled." As the reader will guess, however, I am not convinced by his arguments that the evaluations do not have a presuppositional status. Although he does not specifically discuss them, one of his arguments, the "appropriateness problem," would also affect the sort of non-cognitive presuppositions we posit for terms like "tasty." He argues (op. cit., 113) that someone approving the evaluation associated with "lewd" can sensibly use it while talking to commonly known objectors to such evaluations. Something similar might obtain in a case in which two food critics disagree about the food served at a restaurant, knowing full well that they do not share standards and that neither of them is at all disposed to adopt the other's standards.

<sup>23</sup>Cf. Väyrynen (2013) and Schroeder (2014) for discussion.

However, the same situation might obtain with any ordinary presupposition, as when someone says in anger, "if the idiot I am talking to were listening, we would have less trouble." As Stalnaker (1978, p. 87) pointed out, presupposing (or asserting, for that matter) something that is not accepted and will not be accommodated might not be pointless, because it might well have a further point, "as Congress may pass a law knowing it will be vetoed, a labor negotiator may make a proposal knowing it will be met by a counterproposal, or a poker player may place a bet knowing it will cause all the other players to fold"<sup>24</sup>.

Be this as it may, it is essential that the view, once properly developed, be articulated in the sort of framework I am arguing for here, because according to it what is presupposed is not a proposition, but a pro-attitude: a preference for shared practical views.

## EXAMPLE 5: PRETENSE AND PRESUPPOSITIONS

In previous work (García-Carpintero, 2013c), I have defended a specific form of a "pretense-theoretic" account (alternative to those of Currie, 1990, and Walton, 1990) of fiction-makers' utterances of sentences such as "When Gregor Samsa woke, he found himself transformed into a gigantic vermin" by Kafka in the creation of Metamorphosis. I defended a speech-act account, assuming the sort of Austinian view of such acts in terms of social norms, in contrast to Gricean views in terms of psychological reflexive intentions, that I advertised above. On my proposal, while non-fictions constitutively result from constatives (acts of saying, the genus of speech acts characterized in terms of norms requiring truth for their correctness, of which assertion is the core species), fictions constitutively result from directives (the genus of which commands are the core species) characterized by a norm of providing the intended audience with reasons to imagine the fiction's content.

I modeled my proposal on the normative account of directives derived from Alston's (2000) outlined above in Section Example 2: Directives. As indicated there, I take commands to be subject to a norm such that they are correct only if their audiences are thereby provided with a reason to see to it that their content obtains. The reason itself is to be based on different sources, depending on the specific nature of the directive: the authority of the speaker in the case of commands, or the good will or presumed interests of the audience in the case of requests, suggestions, or entreaties. My proposal was that a fiction with the content p is a result of an act that is correct only if it gives relevant audiences (audiences of the intended kind, with the desire to engage with such works) a reason to imagine p. The reasons in question have to do with whatever makes engaging with good fictions worthwhile; say, to experience the succession of emotions provoked by engagement with well-drafted, suspenseful thrillers for those of us who enjoy these things, or to emotionally engage with the psychological nuances that Henry James's last novels allow us to consider in depth.

<sup>22</sup>The semantic externalism to which Hom appeals in order to deal with the necessity problem (see previous fn.) puts a strain on his appeal to conversational implicature to deal with this sufficiency one, because implicatures are supposed to be derivable. It is difficult to understand how ordinary speakers intuiting the allegedly implicatured condition (in our cases, the derogation of Chinese people, which is what everybody perceives in utterances of "there are too many Chinks in our neighborhood") can make the inferences, if they themselves lack the resources to articulate the content of Hom's truth-conditional analysis.

<sup>24</sup>Väyrynen's other problem for presuppositional views, the "triggering problem" (op. cit., 112), does not affect our proposal, which takes the relevant presupposition to be pragmatic, not lexically triggered. I am not convinced that it shows that the evaluations he discusses, associated with thick terms, are also mere pragmatic implications; but it does raise an interesting issue concerning the triggering of presuppositions (to wit, whether there are general explanations for some families of triggers; cf. Abrusán, 2011), which I cannot address here.

Now, consider an utterance of "When Gregor Samsa woke, he found himself transformed into a gigantic vermin." in its assumed context. This is a declarative sentence that would be used by default to make an assertion. The assertion in this case is merely pretend, which is why we would not complain that it cannot be true or impart knowledge by its including an empty name. The speaker, the fiction-maker, is using the sentence to make a different speech act, a sort of invitation or proposal to audiences of a certain kind to imagine certain contents. However, it behaves with respect to the dynamics of discourse exactly like the corresponding assertion would have done, by legitimizing presuppositions; thus, the next sentence could have been "it was not only gigantic, it was also frightening"—a cleft construction presupposing that the insect was gigantic—and it would feel entirely felicitous (as opposed to "it was not only tiny . . . ").

In virtue of examples like this, the common ground is not taken to consist of propositions that are strictly speaking common knowledge, but merely commonly "accepted" (Stalnaker, 2002). But in the framework I am advancing, we should take such an "acceptance" in the case of fiction to be a matter of further pretense: accepted pretend assertions become pretend presuppositions. The presuppositions, like the initial assertions, occur under the scope of a pretense. A pretend assertion is one advanced merely for its having taken place to be imagined, so that it is not subject to ordinary norms for assertions, only to norms for invitations to imagine. A pretend presupposition is similarly one put forward merely to be imagined, hence not subject to norms for presuppositions: one that does not fail because its content is not common knowledge. Fully understanding fictional discourse involves additional pretend presuppositions to the ones created by pretend assertion: the reference-fixing presuppositions that different views associate with empty names such as "Gregor Samsa" are similarly merely pretend presuppositions. It is thus irrelevant that they cannot be true, nor therefore matters of common knowledge<sup>25</sup>.

Not all presuppositions that a piece of fictional discourse assumes are pretend, however. Even the most fanciful tales assume facts that truly are (taken to be) common knowledge, in order to determine their contents. Special among them are presuppositions constitutive of the meaning of the terms the tale uses; these cannot be pretend. Reference-fixing presuppositions associated with names already in use also belong in this category of non-pretend presuppositions. They interact with merely pretend presuppositions to determine the content of the fiction, in ways that have been famously explored by Lewis (1978) in his influential analysis of "truth in fiction," by Walton (1990) for his "principles of generation" and by many others under their influence. When exposed to a fiction it is very important to disentangle those presuppositions that are merely pretend from those that are truly taken to be matters of common knowledge; and there is psychological evidence that the distinction is not lost on ordinary thinkers<sup>26</sup>.

We have thus another compelling reason to add structure to contexts, relative to what is ordinarily assumed: it is mandatory to distinguish in them propositions to which discourse participants take themselves to be committed in the way they are to their beliefs, from those to which they are merely committed in the "to be imagined" mode.
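The structural point can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not a formalization offered in the paper: the names `Mode`, `StructuredContext`, `accept`, `licenses_presupposition` and `is_pretend` are hypothetical, propositions are crudely represented as strings, and only two commitment modes are modeled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Mode(Enum):
    """Hypothetical labels for the way participants are committed to a content."""
    BELIEF = "belief"
    IMAGINING = "to be imagined"


@dataclass
class StructuredContext:
    """Toy context: each accepted proposition is stored with its commitment mode."""
    commitments: Dict[str, Mode] = field(default_factory=dict)

    def accept(self, proposition: str, mode: Mode) -> None:
        # An unchallenged utterance adds its content in the given mode.
        self.commitments[proposition] = mode

    def licenses_presupposition(self, proposition: str) -> bool:
        # Any accepted content, pretend or not, can be presupposed later on.
        return proposition in self.commitments

    def is_pretend(self, proposition: str) -> bool:
        # Pretend presuppositions are those held in the "to be imagined" mode.
        return self.commitments.get(proposition) is Mode.IMAGINING


ctx = StructuredContext()
# Content of the pretend assertion from the Kafka example.
ctx.accept("Gregor Samsa was transformed into a gigantic vermin", Mode.IMAGINING)
# Hypothetical meaning-constitutive presupposition, held in the belief mode.
ctx.accept("'gigantic' applies to very large things", Mode.BELIEF)
```

On this toy picture both entries license later presuppositions, but only the first is flagged as pretend, which mirrors the distinction between the two kinds of commitment drawn in the text.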

## SUMMARY AND CONCLUSIONS

In this paper I have assumed a broadly Stalnakerian view of contexts, the concrete situations relative to which linguistic exchanges take place; I have assumed, that is, that they are not just sets of parameters, but meanings shared by the speakers participating in the relevant linguistic exchange. I have argued against Stalnaker's "info-centric" view of such contexts: they cannot be just propositions, or more in general representational contents, but rather these contents together with commitments toward them in different modes by speakers. This should be clear to everybody, just on the basis of the fact that conversations involve not just assertoric utterances, but also directives and questions. With respect to this familiar fact, I have made two not so familiar points: firstly, that it might well affect all sensible conversations (because all of them might involve commitments to share discourse projects); and secondly, that the familiar "flattening" strategy that attempts to reduce non-declaratives to declaratives, at least for core semantic purposes, will not make the contention ineffectual. Then, using the case of the semantics of pejoratives, I have argued that the consequences of the familiar fact are many times ignored, at the theorist's peril: flattening will not suffice in that case either; and, even among many of those who do not ignore the fact that such constructions indicate meanings additional to "at issue" contents, undue fixation on declaratives and truth-conditional content leads us to ignore the distinctive normative features of expressive meanings, providing as a result incorrect accounts. Finally, I have mentioned two other interesting examples that would allow us to sustain the same claim: the account of evaluative disagreements, and the interpretation of fictional discourse.

<sup>25</sup>Cf. Sainsbury's (2010, pp. 143–148) related discussion of "truth under a presupposition."

<sup>26</sup>Although the psychological data are controversial, having provided some support to those who want to deny a substantive distinction between fact and fiction (Matravers, 2014), there is also sufficient evidence supporting the phenomenological impression that we do separate what we get from fiction and non-fiction; cf. Friend (2014, forthcoming).

## FUNDING

Financial support for my work was provided by the DGI, Ministerio de Economía y Competitividad, Spanish Government, research project FFI2013-47948-P and Consolider-Ingenio project CSD2009-00056; and through the award ICREA Academia for excellence in research, 2013, funded by the Generalitat de Catalunya. Thanks to three reviewers for this journal for their comments, which led to several improvements, and to Michael Maudsley for his grammatical revision.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer Alessandro Capone and handling Editor Alessio Plebe declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2015 García-Carpintero. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Disagreeing in context

### Teresa Marques \*

Law Department, Philosophy of Law Area, Universitat Pompeu Fabra, Barcelona, Spain

This paper argues for contextualism about predicates of personal taste and evaluative predicates in general, and offers a proposal of how apparently resilient disagreements are to be explained. The present proposal is complementary to others that have been made in the recent literature. Several authors (for instance, López de Sa, 2008; Sundell, 2011; Huvenes, 2012; Marques and García-Carpintero, 2014; Marques, 2014a) have recently defended semantic contextualism for those kinds of predicates from the accusation that it faces the problem of lost disagreement. These authors have proposed that a proper account of the resilient disagreement in the cases studied is to be achieved by an appeal to pragmatic processes, and to conflicting non-doxastic attitudes. It is argued here that the existing contextualist solutions are incomplete as they stand, and are subject to objections because of this. A supplementation of contextualism is offered, together with an explanation of why failed presuppositions of commonality (López de Sa), disputes over the appropriateness of a contextually salient standard (Sundell), and differences in non-doxastic attitudes (Sundell, Huvenes, Marques, and García-Carpintero) give rise to conflicts. This paper claims that conflicts of attitudes are the reason why people still have impressions of disagreement in spite of failed commonality presuppositions, that those conflicts drive metalinguistic disputes over the selection of appropriate standards, and hence that conflicting non-doxastic attitudes demand an explanation that is independent of those context-dependent pragmatic processes. The paper further argues that the missing explanation is twofold: first, disagreement prevails where the properties expressed by taste and value predicates are response-dependent properties, and, secondly, it prevails where those response-dependent properties are involved in evolved systems of coordination that respond to evolutionarily recurrent situations.

Keywords: contextualism in semantics, disagreement, conflicting attitudes, de nobis attitudes, dispositional evaluative properties

## 1. Introduction

When people have disagreements about taste, or about aesthetic or moral values, what is their disagreement about? What explains the apparent fact that it is legitimate for people to hold on to their views about the issue under discussion? And what explains that the disagreements at stake are often resilient and persistent? Is there an account of this kind of disagreement that can capture the perspective dependence of a given domain while preserving the sense of resilient disagreement between those with different perspectives?

### Edited by:

Alessio Plebe, University of Messina, Italy

### Reviewed by:

Nat Hansen, University of Reading, UK

Filippo Ferrari, University of Aberdeen, UK

### \*Correspondence:

Teresa Marques, C/ Trías Fargas, 25–27, Campus de la Ciutadella, Edifici Roger de Llúria, Law Department, Philosophy of Law Area, Universitat Pompeu Fabra, Barcelona, Spain

mariateresa.marques@upf.edu

#### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 02 January 2015; Paper pending published: 28 January 2015; Accepted: 21 February 2015; Published: 19 March 2015

### Citation:

Marques T (2015) Disagreeing in context. Front. Psychol. 6:257. doi: 10.3389/fpsyg.2015.00257

In the recent debate that has opposed contextualists to relativists about predicates of personal taste, aesthetics, and morality, contextualists have tried to resist objections raised by non-indexical contextualists and assessment-relativists by adopting two distinct strategies. The first strategy is to argue that none of the relativist positions now available fares better than contextualist ones<sup>1</sup>. The second strategy is to show how resilient disagreements are to be explained. On the one hand, contextualists have appealed to a combination of pragmatic mechanisms to account for these disagreements: presuppositions of commonality<sup>2</sup> and further metalinguistic considerations about the choice of salient standards<sup>3</sup>. On the other hand, contextualists have added a more thorough explanation of the practical dimension of the disagreements at stake, for instance appealing to conflicts of non-doxastic attitudes<sup>4</sup>. Neither of these approaches, the pragmatic or the attitudinal, has been sufficiently developed so far. In this paper, I will indicate which aspects are still wanting. What is required is an account that frames both the pragmatic and the conative aspects within an explanation of inter-subjective or group coordination. The paper further argues that the missing account is twofold: first, disagreement prevails where the properties expressed by taste and value predicates are response-dependent properties, and, secondly, it prevails where those response-dependent properties are involved in evolved systems of coordination that respond to evolutionarily recurrent situations.

This paper is structured as follows. In Section 2, I present and indicate what is lacking in the otherwise promising contextualist proposals mentioned here. Thus, in Section 2.1, I show that appealing to presuppositions of commonality by itself is insufficient, because in other similar cases the awareness that a presupposition fails dispels the impression of disagreement. In Section 2.2, I consider Sundell's suggestion that the disputes take place at a metalinguistic level, and that in some of the relevant cases what is at stake is the choice of a salient standard. One problem with this proposal is that we need a better understanding of how disputes of this sort are to be adjudicated, and of what motivates speakers to pursue them. A further problem for both pragmatic explanations is that we have the impression that there are disagreements between subjects who are not part of the same conversational setting, or do not even interact in any form. Both presuppositions of commonality and metalinguistic disputes seem to require that some interaction exists. In Section 2.3, I raise a problem for solutions that rely on the incompatibility of (pro) attitudes. The most plausible explanation for the source of conflict, preclusion of joint satisfaction, seems not to yield the desired result. Nonetheless, I think these three proposals made on behalf of contextualism are basically correct.

Section 3 offers the beginning of a solution. In Section 3.1, I suggest that we should follow Lewis and Hume in treating practical agreements as solutions to coordination problems. Disagreements would arise when people's dispositions are obstacles to coordination. The suggestion is supported by research on group action and rationality. In Section 3.2, I offer a conjecture that can resolve the objections raised against contextualism. The main problems are, first, that we have impressions of disagreement even where subjects do not share a conversational setting, do not know of each other, and do not have common goals. Second, we have impressions of disagreement even when the apparent dispositions revealed in the disagreement can be satisfied. My conjecture is that the kind of coordination problems that the different types of dispute pose are at the root of our having, as humans, evolved to have the emotional responses we have, to make value judgments about matters of taste, aesthetics or morality, and, crucially, to hear conflicts in the expressions of different personal preferences. This section reviews some research that corroborates this conjecture.

<sup>3</sup> See (Sundell, 2011).

In Section 4, I examine the consequences of the proposal offered here for the current debate between contextualists and relativists. First, disagreement prevails where the properties expressed by taste and value predicates are response-dependent properties, and, second, it prevails where those response-dependent properties are (i) de nobis<sup>5</sup>; and (ii) involved in evolved systems of coordination that respond to evolutionarily recurrent situations.

## 2. Contextualist Strategies

Why be a contextualist in the first place?

Contextualism (also called "indexical relativism" by Kölbel (2004), or "indexical contextualism") is a semantic thesis. A contextualist about a given class k of expressions holds that an utterance u of a sentence S where a k-expression occurs as made at a given context C expresses a proposition p at C that is evaluated with respect to w<sub>C</sub>, the world of the context C. The class k is composed of expressions whose characters compositionally determine in context what content or proposition is expressed by an utterance of a sentence that contains a k-expression, and the content determined varies from context of use to context of use. The content or proposition expressed is then a function from possible circumstances of evaluation to extensions (in the case of sentences, to truth-values).

As Kölbel (2004) says, contextualists allow for different utterances of the same sentence to express different contents. An utterance expresses a given proposition as its content, and that content's truth will only be relative to possible worlds (as standard semantic theories require). Truth-relativists (whether moderate, i.e., non-indexical relativists, or radical, i.e., assessment-relativists) allow for the truth of the content expressed in context to depend on more than (just) a possible world.
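
The contrast drawn in the last two paragraphs can be put schematically. What follows is only a sketch in broadly Kaplanian notation: the brackets $[\![\cdot]\!]$, the set of worlds $W$, and the extra parameter $s \in \Sigma$ (for a standard or perspective) are labels introduced here for illustration and are not the author's own notation.

$$
\begin{aligned}
\text{Contextualism:}\quad & [\![S]\!]^{C} = p_{C}, \qquad p_{C} : W \to \{\mathrm{T}, \mathrm{F}\}, \qquad u \text{ is true iff } p_{C}(w_{C}) = \mathrm{T},\\
& \text{and for some contexts } C_{1}, C_{2}: \; p_{C_{1}} \neq p_{C_{2}}.\\
\text{Truth-relativism:}\quad & p : W \times \Sigma \to \{\mathrm{T}, \mathrm{F}\}, \qquad \text{so truth is fixed only relative to a pair } \langle w, s \rangle.
\end{aligned}
$$

Put this way, the contextualist locates the variation in which proposition is expressed, whereas the truth-relativist locates it in what the truth of the expressed proposition depends on.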

Contextualism has both a linguistic and a metaphysical motivation. Speakers' judgments and linguistic intuitions are normally used in favor of contextualism in the various domains where contextualism has been defended, in particular judgments concerning:


(c) On disagreement between people making different claims within any of these domains.

<sup>1</sup> See for instance (Glanzberg, 2007; Stojanovic, 2007; Rosenkranz, 2008; Schaffer, 2009; Coliva and Moruzzi, 2012; Marques, 2014a,b).

<sup>2</sup> See (López de Sa, 2008; Marques and García-Carpintero, 2014).

<sup>4</sup> See (Sundell, 2011; Huvenes, 2012; Marques and García-Carpintero, 2014).

<sup>5</sup> De nobis attitudes are plural de se attitudes. Where de se attitudes are specific kinds of attitudes or mental states about oneself, de nobis attitudes are a specific kind of attitudes or mental states about ourselves. There are well-known motivations for de se thought, and different theories that try to accommodate what is essentially de se in thought. In Marques (Unpublished Manuscript), I draw a parallel between de se and de nobis attitudes by showing that the same reasons that support the existence of a distinctive kind of first-personal attitudes can be replicated for the first-personal plural case.

Yet the available data on speakers' intuitions do not decide conclusively in favor of contextualism, as relativist objections make clear. In the current debate about the meaning of predicates of personal taste and other evaluative predicates, several authors have raised objections against contextualist approaches, mainly on the basis that contextualism misses intuitions of disagreement that these writers show we have. The problem, as they argue, is that of lost disagreement.

Thus, Kölbel (2004) argues on this basis for what is usually called "moderate truth-relativism" (also known as "non-indexical contextualism," by contrast with the more standard contextualist views). Egan (2010), Lasersohn (2005), and MacFarlane (2014) have argued for another version that can be called "assessment-relativism." As this paper is not dedicated to discussing the limitations that either form of relativism may have in accounting for disagreement, I will not explain here the differences between these views. In any case, further arguments must be provided to settle this discussion, preferably arguments that assume some common ground with relativists. By framing the discussion within a broad dispositionalist metaphysical theory, I am sharing at least this common ground with relativists.<sup>6</sup>

There are good reasons for a relational metaphysical account of the properties expressed by predicates like "is funny," "is disgusting," "is tasty," "is beautiful," "is good," etc. It seems highly implausible that claims about, for example, humor, taste, aesthetic value, and perhaps moral value, should be independent of how people react to funny, disgusting, tasty, or beautiful things.

An analogy with other dispositional properties can be helpful in understanding the motivation for a relational account of the relevant evaluative properties. Recently, Cohen (2009) argued that a metaphysical view of this kind about colors has a natural contextualist semantic implementation. Cohen draws attention to the fact that a single color stimulus can produce multiple psychophysically distinguishable perceptual effects in respect of color. He further adds that there is no well-motivated reason for thinking that just one of those variants is veridical. Thus, he concludes, predicates like "red" express relational properties, more specifically "response-dependent" ones such as looking red to subjects of kind S under circumstances K. By analogy with the color case, we can say that aesthetic and taste predicates—and perhaps moral predicates—express relational properties. A predicate like "is tasty," or "is disgusting," uttered in context C, expresses properties such as tasty for the perceivers relevant in context C under the perceptual circumstances relevant in C, or simply tasty for the standard relevant in C.

For the rest of the paper, I will assume that a dispositional account of the properties expressed by many evaluative predicates is correct, and that aesthetic, taste, humor, and moral predicates express relational properties. Saying this is not settling who the "subjects of kind S" are for each relational property. In some cases, one may expect universality (everyone) and in other cases expectations of universality might be unjustified.

It does not follow that any possible claims in matters of taste, or morality, exhibit such variability, and the extent to which there is any variability at all may vary between domains. Perhaps there is more variability in claims on matters of taste, and less in morality. Concrete sociological, historical or anthropological analysis would need to corroborate the actual degree to which such a variability exists.

The aim of this paper is to show that a contextualist can explain the resilient cases of disagreement, and, in so doing, take the wind out of the relativist's sails. The remainder of this section reviews three ways for a contextualist to secure disagreement: presuppositions of commonality, metalinguistic disputes, and conflicts of non-doxastic attitudes.

## 2.1. Presuppositions of Commonality

López de Sa (2008, 2015) defends contextualism (indexical relativism) from criticism based on disagreement data by pointing out that the proper semantic implementation of the proposal should envisage the presuppositions of commonality that assertions expressing judgments of taste carry. According to him, the failure of these presuppositions accounts for the data. The main problem with López de Sa's proposal, as I see it, is that when presuppositions of the kind he envisages fail, we should not feel that any relevant disagreement remains. This is corroborated in the case of gradable adjectives like "rich" or "tall." But a strong impression of disagreement is still felt even by semantically enlightened speakers, which cannot be explained by semantically blind folk invariantist intuitions.

Consider the following exchange between Clarissa and Jennifer, both excellent cooks with vast experience and good taste.<sup>7</sup>

	- (b) Jennifer: No, it's not disgusting; it's delicious.

People feel that Clarissa and Jennifer straightforwardly disagree. On contextualist semantics, however, if the relevant standard of taste is subject-relative, in their context the claims are equivalent to these:<sup>8</sup>

	- (b) Jennifer: No, it's not disgusting, it's delicious [given Jennifer's standards].

There seems to be no impression of disagreement in (2). In fact, as Kölbel (2004) points out, now both speakers can rationally accept what the other has said while maintaining their respective assertions, unlike what seemed to be the case in (1).

These are cases of what Egan (2010, p. 251) calls first-personally committed (auto-centric) uses, to be distinguished from sympathetic (exocentric) uses in which we ascribe tastes by adopting alien perspectives ("that fodder must be delicious").

<sup>6</sup> See for instance (Egan, 2012). The present paper can be seen as offering a justification for maintaining the more classic dispositional theory, instead of the relativist modification offered by Egan. In Marques (Unpublished Manuscript), I argue that Egan's de se version of dispositionalism about values fails to accommodate conflicting attitudes, and, given the nature of the theory, it also fails to accommodate doxastic disagreement.

<sup>7</sup> The example honors Clarissa Dickson Wright and Jennifer Paterson, the Two Fat Ladies.

<sup>8</sup> Assuming subject-relative standards here plays a dialectical role. Many authors in the literature assume subject-relative standards, and most objections to relativism focus on individual standards of taste.

Yet the contextualist acknowledges that there must be cases of pointless disputes, where subjects have contrasting sensibilities. Subjects are thereby either expressing different relational properties or wrongly purporting to express a nonexistent one shared by both of them. (1) is an example of such a case of a "faultless" dispute, one that does not involve any doxastic disagreement over a unique context-dependent content that Clarissa accepts and Jennifer rejects. But how will the contextualist explain the persisting intuitions of disagreement concerning such cases?

López de Sa's explanation (López de Sa, 2008, pp. 304–305) appeals to presuppositions of commonality. The relevant predicate "triggers the presupposition that the participants in the conversation are similar" with respect to the relevant standard. López de Sa assumes a Stalnakerian account of presuppositions (cf. Stalnaker, 2002). On this account, presuppositions are requirements on the "common ground" (the class of propositions that participants in the conversation take to be known by all, known to be known by all, etc.) that may be triggered by specific expressions or constructions. Utterances carrying the presuppositions are not felicitous unless the common ground includes them, or, if it does not, they are "accommodated" by the conversational participants, i.e., included in the common ground as a result of the utterance.

Impressions of disagreement in (1) are then explained because "in any non-defective conversation ... it would indeed be common ground" that the participants are relevantly alike. In such a conversation, one would be right and the other wrong. Of course, in (1) the presupposition fails, and as a result both claims are infelicitous. In other words, the impression of disagreement is to be explained by the fact that the following conditional is true about (1): had Clarissa and Jennifer been in a felicitous context in which the presupposition of a common standard was met, then they would have disagreed.<sup>9</sup>

But impressions of disagreement in analogous cases also disappear, as witnessed by the case of the vagueness-inducing relativity to "perspectives" or "ways of drawing the line" for gradable adjectives, as the example offered below illustrates.<sup>10</sup> However, such impressions remain among the fully reflective in the case of judgments of taste like the ones considered here. The comparison with gradable adjectives shows that the presuppositional account does not help.

The next example (originally from Richard, 2004) suggests the indexicality of gradable adjectives (adjectives that admit comparative and superlative degrees, intensifiers like "much" and "very," and so on), and illustrates how an impression of disagreement should disappear once different standards of wealth are made explicit. Imagine that Mary wins a million-dollar lottery. Didi is impressed; but for Naomi, a million dollars is not much. Taking New Yorkers to be the relevant fields of comparison, they judge:

	- (b) Naomi: Mary is not rich.

The information about differential standards of richness provided by context, which accounts for the intuition that different contents are being affirmed and denied in (3a) and (3b), can in some other cases be explicitly articulated in the uttered sentence:

	- (b) Naomi, as before: Mary is not rich (given Naomi's standard).

This evidence can be handled by means of a contextualist proposal, following suggestions about the semantics of gradable adjectives in the literature. Assuming that the speaker's intentions play a crucial role in determining degree significance, Didi and Naomi either do not disagree, or participate in an infelicitous conversation where presuppositions of commonality fail. The problem for López de Sa's proposal is that the impression of disagreement also vanishes among semantically enlightened speakers in this case. However, his counterfactual still applies: Didi and Naomi would be disagreeing, if they were speaking in a felicitous context. Didi's possible reply to (3b) illustrates this.

5. (a) Didi: Mary is rich given what counts as rich for me; I see that you have a different perspective on these matters.

Therefore, what explains the impression of resilient disagreement between Clarissa and Jennifer in (1) cannot be that a counterfactual of that sort applies. A proposal along the lines of López de Sa's might be the beginning of an explanation of such a perception of disagreement. However, as (4) and (5) show, the presence of presuppositions of commonality is not enough to explain the perception of disagreement that remains even for the semantically enlightened subjects who adopt a contextualist semantics for value predicates.<sup>11</sup>

### 2.2. Metalinguistic Disputes

Sundell (2011) advances a well-argued defense of contextualism for predicates of personal taste and aesthetics that makes some progress with respect to the position held by López de Sa. Sundell argues, on the one hand, that impressions of disagreement or conflict like the ones we have with (1) also exist in the cases where it is clear that the asserted sentences not only do not contradict each other, but are in fact both true. On the other hand, by appealing to pragmatic and metalinguistic processes, he shows how many of the disputes of this kind can be analyzed as disputes over the selection or appropriateness of a contextually salient standard. I am sympathetic to Sundell's proposal, as I am to López de Sa's. But once more, as it stands, it is incomplete.

<sup>9</sup> Baker (2012) criticizes this proposal. He invokes three commonly accepted tests for presuppositions (cf. von Fintel, 2004), and points out that they do not appear to support López de Sa's claims. For discussion, see (Marques and García-Carpintero, 2014); the presentation of the discussion in this section summarizes our work in that paper.

<sup>10</sup> Kennedy (2007) and Kennedy and McNally (2010), for instance, argue for a contextualist treatment for relative gradable adjectives such as "tall" or "rich," although not for absolute gradable adjectives like "spotted" or "full." Thanks to an anonymous referee for pointing this out.

<sup>11</sup> It might be questioned that the impression of disagreement is anyway resilient in the dialogue between Clarissa and Jennifer even for semantically enlightened subjects, after Jennifer says, for instance, "Cow's tongue is delicious given what counts as delicious for me; I see that you have different tastes." Perhaps it is not obvious that there is a resilient sense of disagreement, but I am taking as veridical the reports given by many people that even after a qualification of this kind is made, they still perceive a conflict between Jennifer and Clarissa.

As indicated, a contextualist about the meaning of predicates of personal taste (and other predicates) should acknowledge that the perception of disagreement that is left in cases like (1) cannot be accounted for as a straightforward case of doxastic disagreement. For present purposes, let us accept that when two people doxastically disagree, the following inter-subjective doxastic attitude incompatibility holds:

### 2.2.1. Doxastic Attitude Incompatibility

If subject A's attitude is correct, then subject B's attitude cannot be correct.<sup>12</sup>

The occurrence of doxastic disagreement justifies the disapproval of other people's doxastic attitudes. But the notion of doxastic disagreement does not play any role in a contextualist account of the remaining impression of disagreement between enlightened subjects in (1). Both utterances, Clarissa's and Jennifer's, express true propositions. Now, as López de Sa suggests, the impression of doxastic disagreement may be explained by errors about contextual presuppositions. But once we acknowledge that those presuppositions fail, the impression of disagreement should also vanish.

Thus, if in the following dialogue, Clarissa takes a visible male to be the salient one referred to by "he" in that context and Jennifer objects because she takes the salient male to be the person the previous discourse was about, any impression of doxastic disagreement vanishes when they become aware that they have different referential presuppositions.

	- (b) Jennifer: He is not Scottish.
	- (c) Clarissa: He is Scottish, because the salient male I meant was not the one you have in mind but that one. [pointing to the visible person].

It is naturally possible to feel a disagreement about a "metalinguistic" proposition (concerning who is the salient male in the context, the referent of "he"), especially if participants have common knowledge about the nationalities of the visible male and the one previously spoken about, and Jennifer places a proper emphasis on her token of "he." In this case, Jennifer's objection is similar to the one metalinguistically expressed by (6c).

Sundell (2011) resists the disagreement-based arguments of relativists that target contextualism. According to him, both intuitive impressions of disagreement or conflict, and disagreement indicated by uses of denial, or metalinguistic negation,<sup>13</sup> are compatible with the absence of some forms of doxastic disagreement. He argues that many intuitive impressions of disagreement can be explained as cases of conflicting non-doxastic attitudes (p. 271), for instance, those manifested in this variation over (1):

	- (b) Clarissa: Well, I don't like it!
	- (c) Clarissa: # Nope/Nuh uh, I don't like it.

There is a perception of disagreement or conflict in (7a) and (7b), even though it is clear that the contents asserted by Clarissa and Jennifer are consistent—both are actually true. But Clarissa's disagreement with Jennifer would not have been felicitous if expressed via the denial in (7c). Disputes like the one in (7a) and (7b) should rather be explained by appealing to conflicting non-doxastic attitudes.

As an improvement over the notions of substantial disagreement that he discusses, Sundell proposes that we accept as (a kind of) disagreement "the relation between speakers that licenses linguistic denial" (Sundell, 2011, p. 274). Sundell gives us some examples that illustrate the variety of denial-licensing disputes. They cover presupposition disagreement [illustrated by (6a)–(6c) above], implicature, manner, character (after Kaplan, 1989), and finally context disagreement.

Context disagreement can include cases where sentences like those in (3) are uttered (Sundell, 2011, pp. 278–279). Consider this variation of the example. Adapting the point made by Barker (2002), we can imagine a case where Naomi is visiting Athens, and is curious to know what nowadays counts as rich in Greece. In reply, Didi utters (3a), "Mary is rich." In so doing, Didi is giving "some guidance concerning the relevant standard" for richness in Greece. Barker considers these as metalinguistic uses of gradable adjectives, uses that "produce a context-sharpening effect" (Barker, 2002, p. 1) (see also García-Carpintero, 2008 for a similar discussion). If these uses exist, then we can conceive of a dispute between Didi and Naomi that concerns what the relevant standard of richness is in their context, a dispute which can be expressed by (3a) and (3b). In other words, "if context sharpening is a commonly available mode of conveying information, then a natural prediction is that such information is a possible focus of dispute" (Sundell, 2011, p. 279).

There is, however, a further possible kind of context disagreement. Not only can people dispute which is the contextually salient standard in a conversation; speakers can also dispute which standard should be adopted, when none is settled. There are two issues that need further explaining. One concerns the contextual disagreements where speakers dispute which contextual standard should be selected. How are such disputes to be adjudicated? A second, related issue is what drives such disputes.

Presumably, there is nothing prior to some aesthetic disputes or disputes over matters of taste about which standard to adopt when nothing in the context settles a standard. There are no doubt culture-wide paradigms of beauty that are part of the background of many aesthetic disputes, and likewise for discussions over matters of taste, etc. But culture-wide paradigms do not suffice to resolve all such disputes. They cannot settle, for example, a disagreement over who is more truthful to nature, Turner or the pre-Raphaelites.

Where nothing prior settles a dispute, a plausible hypothesis to explain the persistence of a disagreement is that in those cases conflicts of pro-attitudes merge with contextual disagreements (discarding other explanations for persistence, such as lack of knowledge of the nature of the dispute, of the relevant background, individual stubbornness, etc.). In a dispute of the kind now contemplated, each speaker tries to impose her own standard as the salient standard of the context, insofar as the speakers are motivated to push their own standard. But why should anyone do so? In other words, why would anyone want her own perspective on the things she appreciates (or doesn't) to be the perspective that others also have about what they appreciate (or don't)? If we assume that there is reason to treat different perceptions of taste as equally veridical, it becomes evident that an explanation of how these different perceptions can ground conflicts is missing from most of the recent literature on these issues.

<sup>12</sup> For more on doxastic disagreement and exclusion, see (Marques, 2014a).

<sup>13</sup> See (Horn, 1989; Carston, 1998, 1999).

## 2.3. Conflict

Huvenes (2012) discusses examples similar to (7), "I like cow's tongue"/"Well, I don't!" He considers whether examples of this kind [and others, like (1)] admit linguistic denials, and other markers of disagreement like "that's not true"/"that's false" or "I disagree." He argues that considerations having to do with disagreement do not undermine contextualism. Like Sundell, Huvenes also considers that there are a variety of forms of disagreement. He tries to defend the idea that two people can disagree, even if they both speak truthfully. These are the cases like (7), where speakers voice their different dispositions toward given foods. Huvenes mentions that the idea of appealing to conflicting pro-attitudes, desires or preferences, is not original. His idea is to use the distinction that Stevenson (1963) made between "disagreement in belief" and "disagreement in attitudes," i.e., between doxastic and non-doxastic disagreement. Although the idea of conflicting conative attitudes is assumed to play a role in conflicts over evaluative matters in general, it is seldom explained.

The first chapter of Stevenson (1963) is dedicated to the nature of ethical disagreement, and the book starts by drawing the above-mentioned distinction between doxastic and conative attitude disagreement, a distinction that philosophers, mostly meta-ethicists, have assumed to exist ever since it was made. Expressivists (Stevenson, 1963; Blackburn, 1984; Gibbard, 1990), relativists (Egan, 2012; MacFarlane, 2014), and contextualists (Sundell, 2011; Huvenes, 2012; Marques and García-Carpintero, 2014; Marques, Unpublished Manuscript, etc.) all embrace it.

We are concerned with the possibility of conflicting conative attitudes accounting for the resilient impressions of disagreement that most theorists argue exist in the cases under consideration. How should conflicting attitudes be explained? Two hypotheses for the conditions under which attitudinal conflicts occur have been put forward in the literature. The first condition is one of subjective rationality, and the second is one of satisfaction.

The rationality condition is what Kölbel has in mind when he describes disagreements thus: "we could not rationally accept what the other has asserted without changing our minds" (Kölbel, 2004, p. 305). The nature of the modality would need elucidation. Moreover, attitudes that are not beliefs, i.e., are non-doxastic, seem to raise further difficulties for a rationality constraint.

The satisfaction condition is what Stevenson has in mind with that sense of disagreement that "involves an opposition of attitudes both of which cannot be satisfied" (Stevenson, 1963, pp. 1–2). The two conditions can be summarized as follows.

**Rationality:** It is not possible for an individual to rationally have a pair of attitudes X and Y just in case there is an attitudinal conflict between subjects A and B when A has attitude X and B has attitude Y.

**Satisfaction:** If a subject A's attitude can be satisfied, then B's attitude cannot be satisfied.
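
For readers who find a schematic statement helpful, the two conditions might be rendered as follows. This is a sketch under stated assumptions: $\Diamond$ (possibility), $\mathrm{Conf}$ (attitudinal conflict), $\mathrm{Rat}_{i}$ (individual $i$ rationally holds both attitudes), and $\mathrm{Sat}$ (the attitude is satisfied) are symbols introduced here rather than the author's, and the modality is left uninterpreted.

$$
\begin{aligned}
\textbf{Rationality:}\quad & \mathrm{Conf}(A{:}X,\, B{:}Y) \;\leftrightarrow\; \neg\Diamond\,\exists i\; \mathrm{Rat}_{i}(X, Y)\\
\textbf{Satisfaction:}\quad & \Diamond\,\mathrm{Sat}(X_{A}) \;\rightarrow\; \neg\Diamond\,\mathrm{Sat}(Y_{B})
\end{aligned}
$$

Stevenson's own wording, on which both attitudes cannot be satisfied, arguably suggests the weaker joint reading $\neg\Diamond\big(\mathrm{Sat}(X_{A}) \wedge \mathrm{Sat}(Y_{B})\big)$, which precludes only the joint satisfaction of the two attitudes.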

We may however have reasons to doubt that RATIONALITY is true. It is not clear whether it is ever irrational to have a pair of conative attitudes like desires, or certain emotions (love and hate, fear and hope, say). In his Treatise, Hume argued that

it is only in two senses, that any affection can be called unreasonable. First, When a passion, such as hope or fear, grief or joy, despair or security, is founded on the supposition or the existence of objects, which really do not exist. Secondly, When in exerting any passion in action, we chuse means insufficient for the designed end, and deceive ourselves in our judgment of causes and effects. Where a passion is neither founded on false suppositions, nor chuses means insufficient for the end, the understanding can neither justify nor condemn it

(Hume, 1978, II, iii, 3, p. 415).

Both senses support the idea that the "unreasonableness" of the passions depends on the possibility of their satisfaction (whether their objects exist, and whether the means to attain them are sufficient). If Hume is right, then the individual rationality constraint for conative attitudes depends on an individual satisfaction condition. It is hence conceivable that someone is "not unreasonable" for having two attitudes X and Y, even if there is an attitudinal conflict between A's attitude X and B's attitude Y. Since we are left with SATISFACTION as the real condition on the rationality of attitudes, a question arises as to how it impacts on the existence of inter-personal conflict.

For SATISFACTION to be an acceptable condition for conflict, more has to be said about why certain pairs of attitudes, when held by two or more people, give rise to conflicts. Simply mentioning that two attitudes cannot be both satisfied will not account for many of the conflicts arising from the manifestation of different dispositions. In other words, there are pairs of attitudes held by different people that can be satisfied and nonetheless the people at stake seem to be in conflict. If the conative attitudes expressed are like those conveyed in (7) "I like cow's tongue," and these are strictly individual dispositions, then clearly the attitudes conveyed can be both satisfied. Since both dispositions or desires toward cow's tongue can be satisfied (Jennifer can eat what she desires and Clarissa can refrain from eating what she doesn't desire), there seem to be no grounds for those attitudes to be in conflict or incompatible, apart from the fact that they are different.<sup>14</sup>

<sup>14</sup>Schroeder (2008) criticizes several versions of expressivism for failing to explain, and merely assuming, that pairs of different conative attitudes are incompatible, or inconsistent. He says "I think that none of these looks remotely satisfactory as an expressivist explanation of why 'murdering is wrong' and 'murdering is not wrong' are inconsistent. None answers the basic question of what makes disapproval and tolerance of murdering inconsistent with one another. Each posits that there are such mental states that are inconsistent with one another, but none explains why" (Schroeder, 2008, p. 587). I agree with Schroeder's criticism of expressivism. Contextualists and relativists should be careful not to make the same mistake.

On the other hand, having different desires, or desiring different things, can't be a basis by itself for conflict or disagreement, as this example clearly illustrates: Jennifer quite fancies Ferran Adrià, but Clarissa fancies his brother Albert instead. There's no conflict there, surely. Difference in attitudes does not establish conflict.

In what sense are Jennifer's and Clarissa's different dispositions toward cow's tongue in conflict? As long as they can concur in not forcing their choices on each other, both can have their preferences satisfied. Yet presumably we may still hear a disagreement in straightforward expressions of preferences like (7). An appeal to different individual dispositions by itself does not explain why even in this case we hear them disagreeing. If each of them is expressing a personal preference, with no consequences for what the other will eat, where is the remaining conflict?

Given that we have dismissed RATIONALITY, and that SATISFACTION seems unsatisfactory if the attitudes at stake are purely first-personal singular, it seems to follow that we can only read (7) as expressing conflict between two people insofar as we see it as an expression of an expected common disposition shared by Clarissa and Jennifer. And unless we have a good explanation of why having the same dispositions matters, we will be incapable of explaining why people with different desires, preferences, or dispositions, have incompatible attitudes, or of explaining the role of conative attitudes in conflicts about evaluative thought and discourse, for instance in cases like (1).<sup>15</sup>

A theorist that aims to account for evaluative dispositional properties should answer several questions. (I) Are the dispositional properties first-order or higher-order? (II) Are the dispositions first-personal singular or plural? And (III) what is the nature of the dispositions at stake?

I am inclined to opt for the higher-order nature of these dispositions, because of examples of the following sort: Suppose I have a terrible cold. I've lost my sense of smell and taste. I'm offered a dish that has been prepared by the chef at my favorite restaurant. There's nothing he cooks that I don't like, so although I have not tried this one dish, I am almost certain it is delicious. But the dish does not taste like anything to me now (and I have never tried it before). It is not incoherent to believe "this does not taste delicious to me now, but I know it is delicious." Mutatis mutandis for something cooked by a hypothetical friend with terrible taste and poor hygiene habits. It is also, in similar conditions, not incoherent to believe "this does not taste bad to me now, but it is disgusting." This speaks at least in favor of denoting by "disgusting"/"delicious" whatever my gustatory experience would be in ideal conditions, or at least in normal conditions.

Are these dispositions first-personal singular or plural? How can I generalize from the cook at one of my favorite restaurants and the hypothetical friend with bad taste? Presumably my generalization encompasses not just why the restaurant is good for me (in normal or ideal conditions), or why my friend has terrible taste (for me in normal or ideal conditions), but also why this holds for anyone who is sufficiently like me in relevant respects (in constitution or in cultural background, or whatever turn out to be the relevant respects). But there is a further possible variation here, depending on which evaluative property is expressed: who "we" designates may vary from a large group (possibly everybody) to a very small group (oneself only). Finally, what is the nature of the dispositions at stake? The attitudes at stake may be desires, but presumably they could be other more primitive emotional reactions.

The hypothesis that "disgusting" expresses higher-order plural dispositions, and not just first-personal singular first-order dispositional responses, seems to be confirmed by Rozin and Fallon's work:

The notion that disgusting items taste bad may be problematic. Whereas most people have never tasted most things they find disgusting, they are convinced that these substances would taste bad. Of course, bad refers not to sensory properties but to their interpretation of them. Thus, even if ground dried cockroach tasted just like sugar, if one knew it was cockroach, this particular sweet powder would taste bad... It is the subject's conception of the object, rather than the sensory properties of the object, that primarily determines the hedonic value. Although certain strong negative tastes (e.g., bitter tastes) may not be reversible by manipulation of the object source or context, we suspect that any positive taste can be reversed by contextual or object information (Rozin and Fallon, 1987, p. 24).

Now, David Lewis (1989) offers a schematic definition of what a value is:

[S]omething of the appropriate category is a value if and only if we would be disposed, under ideal conditions, to value it (Lewis, 1989, p. 68).

To value something is, for Lewis, to be in a certain sort of motivational mental state: to desire to desire it. This guarantees the internalist connection between value and motivation. Values are the things that we are disposed to desire to desire in certain circumstances. There are two categories of such things. First, there are the states of the world we desire to be the case, i.e., the propositions we desire to be true; these are the objects of de dicto desires. Second, there are the ways we desire to be ourselves; these are the objects of de se desires. Lewis's dispositionalist theory fits well with the kind of relational account of evaluative properties described by analogy with the color case in §2. On this theory, to find that cow's tongue is tasty is to be disposed in the right way toward cow's tongue, i.e., it is to value having pleasant gustatory experiences when eating cow's tongue. And to find that cow's tongue is disgusting is to be disposed in the right way against cow's tongue, i.e., to value not being in contact with cow's tongue.

On the Lewisian theory, the evaluative property expressed involves the relevant group to which the speaker belongs. It is, if we want, a first-person plural secondary property, or a de nobis secondary property. The theory offers further advantages. It is cognitivist, since it accounts for the evaluative property expressed by the value predicate or word, and it can be true or false that cow's tongue is tasty (or disgusting), and even that Jennifer (or Clarissa) can be mistaken about cow's tongue being tasty or not. At the same time, the theory is sufficiently subjectivist and dependent on people's desires to accommodate the perceived importance of conative attitudes in disputes of taste.<sup>16</sup>

<sup>15</sup>For discussion and examples illustrating the need for a good theory of conflicting conative attitudes, see Lewis (1989) and Marques (Unpublished Manuscript).

Group membership for the purposes of identifying the relevant evaluative properties expressed by value terms cannot be the sort of thing that depends exclusively on one's occurrent desires. One may be mistaken at a given moment about one's overall dispositions, and one's occurrent desires may be affected by extraneous causes. If this occurs in the personal case, a fortiori it can happen in the first-person plural case, and one may be mistaken at any given time about what one's group values. Group identity and membership cannot depend exclusively on one's conversational interlocutors at a given moment. Because of this, what we (i.e., me and people sufficiently like me in the relevant respects) find delicious or disgusting is not determined by intra-conversational contextual factors, or at least, not entirely.

If the disagreement in (1), as in (7), results from conflicting dispositions and is about which standard should be adopted, what exactly drives Clarissa and Jennifer to try to impose their own standard? The previous paragraph indicates various ways the "selection of a standard" or a "dispute over a standard" can take place: people may be mistaken about what standards they actually endorse, they may be mistaken about group membership (who "we" are), or it may simply be indeterminate who "we" are, or how "we" are to respond in ideal conditions. If and when subjects are disputing which standard should be adopted, they are disputing what they collectively should be disposed to (dis)value.

To repeat, on a dispositional account of value along the lines of Lewis's, a standard of taste is a kind of dispositional property, the disposition to value certain things. The dispositions at stake are first-person plural. This should yield the desired result. Clarissa finds cow's tongue disgusting. The theory should ascribe to her the disposition to value, i.e., to desire that we desire not to eat cow's tongue. Jennifer, however, desires that we desire to eat cow's tongue. Clarissa and Jennifer's desires amount to a disagreement in attitudes because they cannot be jointly satisfied at the same world.
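The contrast between the plural and the singular reading can be made explicit in a rough formal sketch. The notation is an assumption introduced here for illustration and is not Lewis's own: $e$ stands for the proposition that we eat cow's tongue, $e_x$ for the proposition that $x$ eats it, $D_x\,p$ for "$x$ desires that $p$," and $D_G\,p$ for "the group $G$ (we) desires that $p$." The sketch further assumes that a group's desires are consistent at any given world.

$$
\begin{aligned}
\text{Clarissa:}\quad & D_C\, D_G\, \neg e, && \text{satisfied at } w \text{ iff } D_G\, \neg e \text{ holds at } w,\\
\text{Jennifer:}\quad & D_J\, D_G\, e, && \text{satisfied at } w \text{ iff } D_G\, e \text{ holds at } w,\\
\text{Consistency (assumed):}\quad & \neg\,(D_G\, e \wedge D_G\, \neg e) && \text{at every world } w.
\end{aligned}
$$

Under the consistency assumption, no world satisfies both desires, which is one way of spelling out the claim that they cannot be jointly satisfied. On a first-personal singular reading, by contrast, the contents of the two desires would be $\neg e_C$ and $e_J$, and these hold together at any world where Jennifer eats cow's tongue and Clarissa does not. If the consistency assumption is dropped, the formalism alone no longer rules out joint satisfaction.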

A remaining question is the following: Why does it matter that people share a common value standard? In particular, why does a shared standard matter for tasty or disgusting, but not for fancying?<sup>17</sup> The next section tries to offer an answer to this question, relying on the role of coordination in the evolution of the relevant dispositions.

## 3. Coordination

The beginning of a solution should take coordination into account. Coordination plays a role in a sense of agreement different from the ones discussed so far, namely agreement as a convention. For Lewis (1969), conventions are solutions to coordination problems. Lewis follows Hume's account of convention and agreement in the Treatise:

It is only a general sense of common interest... I observe, that it will be to my interest [e.g.,] to leave another in the possession of his goods, provided he will act in the same manner with regard to me. When this common sense of interest is mutually expressed and is known to both, it produces a suitable resolution and behavior

(Hume, 1978, III.ii.2, p. 490).

What connects coordination with the kind of de nobis dispositions claimed to be central in evaluative properties?

Bacharach (2006) and Gold and Sugden (2007) have done a considerable amount of work on the role of first-personal plural intentions in decision-theoretic reasoning irreducibly involving groups with which agents identify.<sup>18</sup> These dispositions are essential for group cohesion. Let us call them "de nobis dispositions." When Clarissa and Jennifer have de nobis dispositions, there is an increased probability that their actions will be coordinated with respect to an indefinite plurality of projects. An explicit indication that the presupposition of commonality (see López de Sa, 2008) fails, as in a metalinguistic expression of disagreement over the relevant standard (see Sundell, 2011), manifests the absence of such common de nobis dispositions, and it may undermine group cohesion. This is the practical aspect that is missing in other semantically similar cases, such as the disagreement about being rich, or who "he" refers to. And it explains where the conflict of attitudes arises.

I next offer a conjecture as to why we have such de nobis dispositions.

## 3.1. A Conjecture

The conjecture advanced here involves various components. The first is a commonly shared assumption among evolutionary cognitive scientists, namely that various kinds of coordination problems are at the root of our having, as humans, evolved to have the dispositions we have. The second component connects this evolutionary assumption with dispositional theories of value, such as Lewis's. As a result, value judgments about matters of taste, aesthetics or morality are such that they both express dispositional properties and, crucially, reveal conflicts when dispositions vary. The present conjecture seems to be confirmed by research in biology, evolutionary psychology and anthropology. The conjecture, to be clear, is that our preference for some converging dispositions, and our aversion to some diverging dispositions, has an evolutionary explanation connected with finding needed solutions to recurrent coordination problems.

<sup>16</sup>Anonymous referees pointed out that there seems to be a difference between clearly evaluative predicates (moral terms for instance) and many taste predicates of the kind discussed here. Although we may expect convergence, they suggested, it would not be plausible to claim that "delicious" or "disgusting" express de nobis dispositional properties. I admit that there may be some cases where apparent taste predicates express first-personal singular, i.e., de se properties: a disposition to have a certain response or reaction in the presence of certain substances. The present account can be seen as defending an outline of the conditions a theory must satisfy if it is to be evaluative and to allow for conflict and disagreement. Singular de se dispositional properties can still be evaluative, but not allow for conflict or disagreement. Singular first-order de se dispositional properties will not be evaluative nor allow for conflict or disagreement.

<sup>17</sup>As an anonymous referee pointed out, "tasty" and "disgusting" are adjectives whereas "fancies" is a verb. I don't think this affects the main point. The background story for taste adjectives of this kind is that they express certain kinds of dispositions, namely dispositions that involve desires. "Fancies" expresses the occurrence of a given desire. The main point here is that the desires at stake appear to be satisfiable and hence there's an explanation of the conflict missing. The example with "fancies" could be changed to an example with an adjective, for instance "simply irresistible."

<sup>18</sup>See also Marques and García-Carpintero (2014).

The conjecture is corroborated, for instance, by Tooby and Cosmides's work. Our distinctive capacity for cooperative behavior was, they have argued, evolutionarily important for human survival. Tooby and Cosmides (2010) summarize many of their results. According to them, alliances pose a "series of adaptive problems that selected for cognitive and motivational specializations for their solution" (p. 200), where the two biggest obstacles to alliances are the problem of free-riders and the problem of coordination. Coordination to achieve common goals is necessary for coalitions, and it is also necessary that cooperators are not outcompeted by free-riders. We have evolved both anti-free-rider adaptations and coordination adaptations. Tooby and Cosmides indicate that adaptations for coordination include programs implementing

a theory of group mind; programs implementing a theory of interests; programs implementing a theory of human nature; programs for leadership and followership; the outrage system; theory of mind; co-registration programs for solving common knowledge problems; language; and an underlying species-typical system of situation representation which frames issues in similar ways for different individuals

(Tooby and Cosmides, 2010, p. 202).

Sharing the same evolved architecture, they claim, provides a partial foundation for resolving the game theoretic problem of common knowledge with finite cognitive resources. For cooperative action to be taken, evolved procedures must exist for inducing or recognizing sufficient coordination in situation representation.

Among the adaptations that contribute to coordination are our emotional responses. Specific emotions are evolved systems of internal coordination, activated in response to evolutionarily recurrent situations such as danger, contamination, conflict or pleasure.

More generally, there seems to be a psychophysics of mutual coordination and coregistration, involving (for example) joint attention and mutual gaze, especially timed when salient new information could be expected to activate emotional or evaluative responses in one's companions. The benefits of coregistration and mental coordination can explain (at least in part) an appetite for co-experiencing (watching events is more pleasurable with friends and allies), the motivation to share news with others, for emotional contagion, for gravitation in groups toward common evaluations, for aversion to dissonance in groups, for conformity, for mutual arousal to action as with mobs (payoffs shift when others are coordinated with you), and so on.

(Tooby and Cosmides, 2010, p. 205).

The research on the evolution of taste and disgust, the education of taste, and eating customs illustrates this broad description of the importance of coordination in human cognition. I mention briefly the case of what is disgusting, following Rozin and Fallon (1987) and Rozin (1996). Here is a very short summary of the explanation. As omnivores, humans have a very varied diet, but this means that they are at a high risk of consuming toxic substances. The evolution of gustatory taste permits discriminating potentially edible things. According to Rozin, disgust is the fear of incorporating an offending substance into one's body. The things that humans find disgusting are, mostly, those coming from animals (in particular, some animal parts, like tongues and other internal organs). But there is a problem: it seems there is a wide variability in what is found disgusting (and conversely, tasty) from culture to culture, which suggests that there is a crucial learning period. Elizabeth Cashdan (1994) argues that there is indeed a sensitive period for learning about food in the first 2–3 years of a child's life. After 3 years, children's willingness to accept new foods diminishes drastically. Coordinating eating habits with those of the immediate group may be one of the first requirements for survival. It then becomes a way of identifying one's group and community.

Pinker (1997) discusses the significant case of food taboos. According to him, food taboos indicate that the coordination of eating habits with those of one's group is important because it contributes to strengthening the cohesion of the group. Being able to eat together may permit the formation of new alliances. The feast days of many religions have as a central component rituals involving food and "breaking bread together" (Pinker, 1997, p. 385).

Now, conflicts may occur in actual situations where coordination toward common goals may be hindered—for instance, when Clarissa and Jennifer cannot agree on what they should eat. On the other hand, conflicts may occur in evolutionarily recurrent situations that have posed coordination problems, and thus led to the selection of specific emotional responses (responses toward edible things, toward dangerous or pleasurable situations, toward other people or their actions).

The conjecture advanced here is that in cases of this kind, where sharing the relevant dispositions has played a role in finding coordination solutions in recurrent situations, the existence of divergent emotional responses is perceived as signaling potential conflicts, and thus grounds the conflict among conative non-doxastic attitudes: not all of the parties' desires will be satisfied.

The conjecture is illustrated by the case of tasty and disgusting things. Being disposed to eat the same sort of things enables further cooperation and altruistic behavior, and is more likely to lead to future benefits. Humans have evolved to approve of others with similar dispositions, and have evolved to disapprove of others with dissonant dispositions. Not being similarly disposed in some relevant aspects may hinder further cooperation. The desires that concern the benefits that result from others' cooperative behavior toward oneself may fail to be satisfied.

This research supports the claim that humans have a preference for consonance and an aversion to dissonance in certain kinds of dispositions. Other research supports the claim that certain modes of cognition are first-person plural or de nobis. Frith and Frith (2012) have reviewed recent work in cognitive science and psychology on the various "mechanisms of social cognition." Among such mechanisms are, for instance, empathy or emotional contagion, which permit alignment of representations, as well as forward modeling, which allows the prediction of others' behavior. Some of the neural mechanisms involved in the observation of others and in learning, at the implicit level, are association, reward, gaze following and mirroring. Ongoing research on these mechanisms of social cognition is revealing the role they play in learning, cooperation, and language acquisition.

## 4. Consequences

How does the conjecture fit in with (i) the dispositional account of values à la Lewis that is assumed here, and with (ii) a contextualist semantic account of evaluative predicates in general?

The Lewisian theory is not only internalist and cognitivist, but it is also naturalist. Values are dispositional states. If any dispositional theory is correct, it has to fit with what the best theories of the natural and social sciences tell us about the relevant kind of dispositions. Evolutionary psychologists' work on the evolution of altruism and cooperation, and on the evolution of the sense of taste for instance, corroborates a dispositional theory, at least with respect to taste properties. It may be that further research on the mechanisms of social cognition will tell us more about the character of such dispositions.

This paper started with a discussion of challenges to contextualist semantics. The issue was whether a contextualist semantic account of evaluative predicates like "tasty," "disgusting" (and others more robust than taste predicates) can accommodate and explain disagreement data. A contextualist semantics that respects the metaphysical view of evaluative properties as secondary or dispositional properties of the sort discussed here will allow for the possibility that two speakers may be in dispute over different evaluative properties. One speaker may express a property about group<sub>1</sub> to which she belongs, i.e., that it values X, and another speaker expresses a property about group<sub>2</sub> to which she belongs, i.e., that it does not value X. Both speakers may be speaking truly. Wasn't this the main objection to contextualist accounts, that there seems to be some sense of disagreement left that cannot now be captured by the semantics? On a dispositional theory like Lewis's account, we have an explanation that covers doxastic disagreements, as well as an explanation of conflicting desires, where such a conflict exists if and when interlocutors are members of the same group. This could however mean that the challenge of lost doxastic disagreement results in a challenge of lost conflict of attitudes too. Or does it?

Contextualists have appealed to presuppositions of commonality to deal with the challenge of lost disagreement—we suppose that our interlocutors are like us in the relevant respects. They have appealed to metalinguistic disputes about the selection of standards—even if we are both speaking truly, we may in fact be engaged in a dispute over what standard should be implemented. And they have moreover appealed to conflicting conative attitudes that in any case remain. The main aim of this paper was to show the need to say more about these kinds of explanation. What drives disputes over the selection of evaluative standards, and why does it matter that common standards be accepted? What makes it the case that a pair of conative attitudes are in conflict? What is the role of coordination in finding common standards and in attitudinal conflicts? Keeping in line with the naturalistic motivation for a dispositional account of value properties, the paper has offered a brief review of some of the central research in evolutionary psychology and cognitive science that can begin to fill in the blanks, connecting on the one hand dispositions that help to find solutions to recurrent coordination problems, and on the other hand evaluative thought and discourse in general. Does this put us in a better position to answer the challenges of disagreement and conflict?

On the contextualist account, Clarissa and Jennifer express distinct but equally true propositions. This offers a semantic implementation of the relational account of the dispositional properties expressed by taste predicates. But contextualists about predicates of taste and value face the challenge of explaining how two people can accept different true propositions and nonetheless disagree. The suggestion here offered tried to develop the proposals put forward by López de Sa, Sundell, Huvenes, and Marques and García-Carpintero, by offering what the appeals to presuppositions of commonality, metalinguistic disputes and conflicting non-doxastic attitudes were missing.

The impression that there is a doxastic disagreement could presumably be explained by the existence of folk invariantist semantic intuitions. But there seem to be resilient disagreements even where semantically informed speakers like Clarissa and Jennifer still insist on uttering sentences like (1a) "Cow's tongue is disgusting" and (1b) "No, it isn't, it's delicious!" These can be presumably explained as metalinguistic disagreements over the selection of an appropriate standard, as Sundell proposes. What distinguishes cases like (1) from cases where speakers simply express their individual preferences, like (7), is that the former cases trigger presuppositions of commonality that the latter do not. But other cases of presupposition failure do not generate disagreements; in fact, learning that a presupposition fails usually dispels disagreements. We can anyway assume that a conflict of attitudes remains. If these attitudes were simply the expression of individual desires, and since two people with different personal desires can both be satisfied, it is hard to see what the cause of the remaining conflict can be. I have offered a broader explanation of conflicting conative attitudes, in line with a dispositional theory. However, this explanation still leaves us with a problem: if, on assumption, what Clarissa and Jennifer say is true with respect to their respective standard (which concerns two distinct groups), and if their non-doxastic attitudes are in conflict only if they concern the same group, then we still do not have an explanation of the conflict of attitudes.

In Section 2.3, I pointed to the fact that group membership cannot be the sort of thing that depends exclusively on one's occurrent desires. One may be mistaken at a given moment about one's overall dispositions, and one's occurrent desires may be affected by extraneous causes. Moreover, one may be mistaken at any given time about what one's group values. Also, group identity and membership cannot depend exclusively on one's conversational interlocutors in a context. It is not that whatever is a value is whatever the interlocutors in a conversation are disposed to value; context contributes to determining which value property is expressed in a context. As a simplified illustration, context determines whether by "tasty" the interlocutors mean tasty for people with sophisticated gourmet training, or tasty for the typical north-European 3-year-old child. But a conversational context does not constitute the value property itself. The property at stake, whatever it is, is whatever the relevant group is disposed to value in the right conditions.

The concern that contextualist explanations are limited to intracontextual disputes does not arise straightforwardly. Because the relevant group's identity, membership and composition are not context-dependent matters, whether or not there is a disagreement or a conflict of attitudes is not straightforwardly a result of whether two people participate in the same conversation. Rather, it is a result of whether their doxastic or conative attitudes are compatible or in conflict. Finally, group identity and group membership may be indeterminate. This indeterminacy, together with some indeterminacy concerning what we should do in ideal conditions of full imaginative acquaintance, leaves ample room for meaningful disputes about evaluative matters, and for metadisputes about what values we should share.

In the previous section, I reviewed some work that shows the importance of common evaluations for cooperative projects. The possibly variable extension of a given group (and the indeterminacy of the group identity and extension in question), together with "the benefits of coregistration and mental coordination" can at bottom be the reason why, even when people have different standards, they strive to establish a common ground, or, to put it another way, to extend group membership. Attitudinal conflicts can endure wherever there are expectations concerning what we, together, should come to value.

This is, in summary, a rephrasing of Lewis's conditionally relative view. There is no absolute answer as to who we are: "What I mean to commit myself to is conditionally relative: relative if need be, but absolute otherwise" (Lewis, 1989, p. 85).

References

von Fintel, K. (2004). "Would you believe it? the king of France is back! presuppositions and truth-value intuitions," in Descriptions and Beyond, eds M. Reimer and A. Bezuidenhout (Oxford: Oxford University Press), 315–341.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Marques. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Expressivism, Relativism, and the Analytic Equivalence Test

*Maria J. Frápolli<sup>1,2</sup>\* and Neftalí Villanueva<sup>2</sup>*

*<sup>1</sup> Department of Philosophy, University College London, London, UK, <sup>2</sup> Department of Philosophy I, University of Granada, Granada, Spain*

The purpose of this paper is to show that, pace Field (2009), MacFarlane's assessment relativism and expressivism should be sharply distinguished. We do so by arguing that relativism and expressivism exemplify two very different approaches to context-dependence. Relativism, on the one hand, shares with other contemporary approaches a bottom-up, *building-block* model, while expressivism is part of a different tradition, one that might include Lewis' epistemic contextualism and Frege's content individuation, with which it shares an *organic model* to deal with context-dependence. The building-block model and the organic model, and thus relativism and expressivism, are set apart with the aid of a particular test: only the building-block model is compatible with the idea that there might be analytically equivalent, and yet different, propositions.

*Edited by:*

*Marco Cruciani, University of Trento, Italy*

### *Reviewed by:*

*Teresa Marques, Universitat Pompeu Fabra, Spain; Francesca Ervas, University of Cagliari, Italy*

*\*Correspondence: Maria J. Frápolli, mjfrapolli@gmail.com*

### *Specialty section:*

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

*Received: 20 June 2015; Accepted: 06 November 2015; Published: 24 November 2015*

### *Citation:*

*Frápolli MJ and Villanueva N (2015) Expressivism, Relativism, and the Analytic Equivalence Test. Front. Psychol. 6:1788. doi: 10.3389/fpsyg.2015.01788*

Keywords: context-dependence, assessment relativism, expressivism, Frege, pragmatism, compositionality, principle of context

## INTRODUCTION

MacFarlane (2014, p. 172) has recently claimed that his own kind of relativism and contemporary expressivism, more specifically the one defended by Allan Gibbard, use 'essentially the same compositional semantics.' This claim, despite being accurate concerning the semantic value of the specific sentences that MacFarlane focuses on, might blur a fundamental difference between the expressivist analysis and other semantic approaches. Expressivism, we will argue, is in general compatible with standard compositional semantics, but its basic take on how propositional contents are individuated concedes priority not to the principle of compositionality, but rather to the principle of context. Under expressivism, content is individuated by its inferential import, and thus the compositionalist (building-block) order of explanation is challenged.

The aim of this paper is threefold. First, we will contrast two different models to accommodate context-dependence, that is, the idea that explaining our linguistic practices requires both linguistic and contextual information. The *building-block* model, on the one hand, and the *organic* model, on the other, can be set apart by taking into consideration whether they give prominence to the principle of compositionality over the principle of context, or the other way around. Second, we will argue that expressivism, unlike relativism and other competitors, fits snugly under the latter, organic, model. Third, we will propose a test to determine whether a given theory belongs to the building-block or the organic model: if it is possible for a theory to accommodate the idea that there are analytically equivalent propositions that nevertheless differ, then this theory belongs to the compositional group. According to this test, the *analytic equivalence* test, assessment relativism belongs to the building-block model, while expressivism remains an alternative for advocates of the organic model.

## CONTEXT-DEPENDENCE: THE BUILDING-BLOCK SPECTRUM

Almost no theory of meaning available aspires to explain our meaningful communicative exchanges in a way that is completely independent from contextual considerations. An elaborate example of this extreme view might be Stojanovic's *what is said* (Stojanovic, 2007), where content is explicitly designed to be neutral with respect to context-dependent parameters. At one level or another, though, most theories of meaning assume that whatever we can say about the meaning of a string of symbols, as viewed in isolation, differs from what a normal speaker would say while uttering it, or an audience would get while understanding it.

Under the *building-block model*, meaning's order of explanation proceeds in successive stages, starting from the most basic considerations, and building up from them. At any level, information from the context might be acknowledged by different theoretical alternatives. Here are some examples. With speech-act pluralism, Cappelen and Lepore (2005) claim to put forward an 'insensitive' semantics, that is, a context-independent one, but they move most contextual effects to the realm of pragmatics, making the communicated information ultimately dependent on the context. Some other "minimalist" alternatives include in the semantic content only the contextual information that is retrieved with the aid of the linguistic meaning of certain expressions, such as indexicals (Stanley, 2000). Sometimes contextual information is meant to have an impact both on what is said and on what is globally communicated. Pragmatic explanations of the opacity of belief reports tend to exhibit this feature (see Salmon, 1986, but also Saul, 1998). These theoretical alternatives thus concede a place to contextual information, but are not usually dubbed 'contextualists,' because they explain in a context-independent way speakers' intuitions about the truth of what is said. Contextualists, on the other hand, explain our semantic intuitions by appealing to contextual information.

Within the realm of contextualism, indexical and non-indexical contextualism (cfr. MacFarlane, 2007, 2014) should be distinguished both from Truth-Conditional Pragmatics (cfr. Recanati, 2010) and from Relevance Theory (Carston, 2002). 'Indexical contextualism' is the general label for views according to which the context affects the semantic value of the subsentential linguistic items. Non-indexical contextualism, by contrast, restricts certain contextual processes to the realm of post-semantics. Truth-Conditional Pragmatics and Relevance Theory are instances of "radical contextualism" (Searle, 1992; cfr. Recanati, 2002, p. 303), whose central motto is that there is no truth-evaluable level of meaning that is unaffected by contextual information.

Assessment relativism (MacFarlane, 2014) recognizes the impact of contextual information on our intuitions about the truth of what we say, but holds that some contextual information is accessible only from a particular context: that of assessment. On occasions, it is not the context in which the sentence is uttered that matters, but the context in which the utterance is received. This type of context-dependence is usually set apart from the aforementioned versions of contextualism, even if it has been argued that the alleged benefits of this view, especially those concerning disagreement, can be accommodated within enhanced contextualist approaches (see, e.g., Kölbel, 2009; Lopez de Sa, 2015, but also Marques and Garcia-Carpintero, 2014; Marques, 2015). At times, it has even been conflated with certain context-dependent approaches (expressivist approaches) whose starting point seems to be quite distant from the building-block model (vid. Field, 2009, p. 252,<sup>1</sup> but cf. Yalcin, 2011, p. 327). We will show in the third section of this paper that assessment relativism truly belongs to the building-block model, and in doing so we will be able to establish a principled difference between this form of context-dependence and another common alternative, i.e., expressivism.

This quick list is by no means intended to be exhaustive; it is meant only to show the spectrum within which different takes on context-dependence can be accommodated. Whether we admit only a minimal amount of contextual information or we are radical contextualists, we form part of the building-block model if contextual information enters a step-by-step process of meaning construction that starts from the meanings associated with subsentential components and arrives, at a later stage, at a complete content.

Depending on the *stage* at which contextual information has an impact, pragmatic processes under the building-block model might be:

Prelinguistic. Input: unsegmented marks or sounds, not recognized as signs belonging to a language. Output: a piece of discourse.

Lexical. Output: a univocal string of words. Take 'I saw her duck under the table'; only after 'duck' is interpreted, either as a verb or a noun, do we proceed to the following stage.

Syntactic. Output: a univocal structure. Compare 'every ball has a red dot on it' and 'every kid at school has a pet.' The second sentence exhibits a syntactic ambiguity. Even though a single red dot cannot be on every ball, every kid at school can be truly said to have a pet either if each kid has a different pet or if they all treat the school turtle as their very own pet.

Pre-semantic. Output: a univocal set of meanings-cum-structure. Reference fixing for indexicals and semantic disambiguation are commonly assumed to require contextual information.

Semantic. Output: a proposition. Quantifier domain restriction (see Stanley and Szabó, 2000), modulation (see Recanati, 2004, *passim*, see for instance p. 136 and ff.), etc. are typically associated with *local* pragmatic processes.

Post-semantic. Output: a proposition plus a circumstance of evaluation. Typically associated with *global* processes.

<sup>1</sup>"What I'm advocating for normative terms is very different from contextual relativism, so different that in my 1994 paper I decided not to call it 'relativism' at all, and to label it a kind of expressivism (though one very different from old-fashioned versions of expressivism, in that it gives evaluative statements a cognitive role). But MacFarlane (2007) has recently introduced the term 'assessor-relativism' for what seems at first blush to be just this sort of thing" (Field, 2009, p. 252).

Pragmatic. Output: multiple propositions. Secondary inferential processes, for the most part taken to be not sub-personal. Implicatures.

Depending on the *way* in which contextual information is accounted for, pragmatic processes might be:

Primary/secondary. For theories that defend a principled distinction between the semantic core of our utterances and other levels of meaning conveyed, primary pragmatic processes will be those affecting the semantic core, *what is said*, and secondary pragmatic processes will derive other layers of propositional content inferentially from what is said plus other contextual considerations. The latter will typically have an impact at the pragmatic level, even though interactions with other, lower levels are recognized by some approaches, such as Relevance Theory.

Local/global. Local pragmatic processes have an impact on subsentential phrases; global pragmatic processes modify the circumstances of evaluation, placing the whole sentence, as it were, in a different light to be evaluated. Global processes are usually located at the post-semantic level.

Mandatory/optional. A pragmatic process is mandatory if its intervention is necessary in order to arrive at a level of content that can be evaluated as true or false. Otherwise, it is optional.

Mandatory∗/optional∗. A pragmatic process is mandatory∗ if its intervention is "recruited" by the linguistic meaning of a lexical item, as it occurs in the sentence. Otherwise, it is optional∗. Indexicals trigger mandatory∗ pragmatic processes. Mandatory∗ and optional∗ processes are also sometimes called 'bottom–up' and 'top–down' processes, respectively.

Context-dependence under the building-block model covers the vast majority of theories of meaning on the market. So much so that it is often forgotten that there are alternatives to this spectrum of theories. Within this model, contextual information finds its way into the explanation of linguistic communication as part of a progressive building process. But, as we will see in the following section, there are well-known semantic alternatives that exhibit a completely different kind of context-dependence. This form of context-dependence might look alien to some, but it is exemplified by some of the best-known theoretical approaches of the analytical tradition. As we will see, the basic insight of this alternative approach was shared by Frege and David Lewis, to mention only two well-known examples. Under the *organic model* context plays a truly preeminent role. Putting context first is what Frege, Lewis, and others did, and it is also part of the agenda put forward by contemporary expressivism.<sup>2</sup>

## PROPOSITIONAL PRIORITY AND THE ORGANIC MODEL

In the *organic model*, content individuation is not an issue of assembling pieces into a particular shape. Rather, the basic unit of analysis has to be able to move the chain on the conversational scoreboard, and thus the analysis should take as primitive only linguistic units that can be used to acquire certain inferential commitments. Context is not needed to fill in the holes left in the logical form by semantic underdetermination, but rather to supply the information that is needed to make sense of a certain communicative exchange.

Dealing with contextual information organically requires being able to apply the contribution of the context to the content expressed, and to do so in a way that cannot be specified by taking into account how the linguistic meaning of the subsentential bits becomes modified when introduced in that particular situation, only to be afterward assembled into a meaningful whole. As we saw in the previous section, whether we take the contextual information to be gathered with the aid of linguistic instructions (through mandatory∗ pragmatic processes) or freely (as the result of optional∗ processes or secondary pragmatic processes), the building-block model always proceeds from subsentential units to a whole proposition. The organic model needs to start from a completely different stance. It no longer suffices to check how the contextual information bears upon the particular meaning of the phrases as they are currently used: a large amount of contextual information can also have an impact on the content, an impact that cannot be domesticated into the modulation of some pieces of the whole. The starting point of the organic model is the content of judgments, whatever we can put forward as a premise or a conclusion, what we stand for and become responsible for in a conversation.

In communicative acts the immediate data are contents of a propositional nature, expressed by sentences. These contents are individuated within a given context, and this makes the organic model context-dependent, even though context provides information in a way that cannot be equated with those mentioned above. To "move the chain," agents have to perform some kind of act, since acts are the minimal moves in the communicative game. Brandom gave flesh to this classical pragmatist intuition: 'sentences are the kind of expression whose freestanding utterance [*...*] has the pragmatic significance of performing a speech act' (Brandom, 2001, p. 125). 'Without expressions of this category,' Brandom went on, 'there can be no speech acts of any kind, and hence no specifically linguistic practice' (loc. cit.). Both logically and chronologically, rational agents' first contact with language is somebody saying something. Only afterward is the identification of words and structures available.

<sup>2</sup>An argument could be made to the effect that Relevance Theory does actually belong to the theories grouped under the "organic model" label. Within Relevance Theory, individuation of content is performed with the aid of the presumption of optimal relevance – the cognitive impact of an utterance needs to match the effort that is required to interpret it. As the cognitive impact of an utterance is established with respect to the status of the audience's belief box at the time of the utterance, whether the presumption of optimal relevance is upheld can only be determined by paying attention to the whole judgment, instead of its subsentential components. Moreover, Relevance Theory acknowledges the existence of top–down pragmatic processes, even acting from the level of implicatures, with an impact on the explicature. These reasons, the presumption of optimal relevance as a guiding principle for content individuation and the existence of top–down pragmatic processes, could be sufficient to persuade some of the idea that Relevance Theory is unfairly listed within the building-block model theories. Tempting as this might be, we think that this inclination must be resisted, for the following reason: the sheer distinction between bottom–up and top–down pragmatic processes *only makes sense* within a building-block background. Relevance Theory's commitment to a logical form that gets enriched with different components was essential to the position as it was introduced, and continues to be part of the standard description of the theory (Carston, 2000, p. 10; Clark, 2013, p. 305; Romero and Soria, 2014, p. 490).

Lewis's epistemic contextualism (Lewis, 1996) is a well-known example of an organic use of contextual information. His view cannot be forced into any of the building-block varieties of context-dependence introduced in the first section of the present paper. Lewis faces the challenge of the skeptic and provides a definition of knowledge that can explain, on the one hand, why the skeptical maneuver makes sense, as traditionally discussed in epistemology, and, on the other hand, the fact that we truly know many things. The skeptic, by continuously forcing us to look at alternatives that we had not previously considered, makes us doubt our firmest beliefs, and therefore it seems that none of our beliefs can ever after be secured, so as to be called 'knowledge.' Lewis's strategy allows for our knowledge attributions to be true before meeting the skeptic, while our post-skeptic knowledge attributions become false. Meeting the skeptic has brought about a crucial change in the context, and knowledge attributions turn out to be context-sensitive.

Here is Lewis's definition:

S knows that P iff P holds in every possibility left uneliminated by S's evidence —Psst!— except for those possibilities that we are properly ignoring (Lewis, 1996, p. 561).

'S knows that p' will then be true if every alternative in which not p is eliminated by S's evidence. I know that my pen is inside my bag at 00:35 because I can rule out every possible chain of events leading up to my pen being elsewhere. I saw it in the bag a minute ago, lighting conditions are ok, I am under the influence of no perception-altering substances, nobody has entered the room since the last time I saw it in the bag, etc. My evidence eliminates every possibility in which my pen is not in my bag. If I know it, my attribution at 00:35 will always be true. But, 'is that so?' the skeptic would ask at 00:36, only to introduce subsequently an exotic alternative, previously ignored, in which my pen is absent from my bag, an alternative that my current evidence cannot eliminate. What if everything I see is nothing but a cleverly produced illusion, conducted by a demon who, as a matter of fact, happens to have my pen in his hand? I can no longer truly say that I know that my pen is in my bag, since my evidence tells me nothing about the existence of that demon. How can my attributions differ so drastically in a minute? Lewis's response is that the clever skeptic makes it inappropriate to ignore certain possibilities. It was true at 00:35 that I knew that my pen was in the bag, and it is also true that I do not know at 00:36 that my pen is in the bag. Being sometimes susceptible to the reasons of the skeptic does not make me an illogical person.

The context alters the content of the epistemic attribution by changing the alternatives that can properly be ignored, and it does so tacitly (thus, Lewis's 'psst'). Crucially, the context does not modify any of the subsentential items of the sentence to make it fit into the conversational occasion. Lewis's context-dependence of knowledge attributions can be accommodated only within the organic model, one in which we start by looking at the conditions under which a particular judgment, my knowledge attribution in this case, makes sense.
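The structure of Lewis's definition can be illustrated with a small sketch. The code below is only a toy model under stated assumptions: possibilities are represented as string labels, propositions as predicates over those labels, and the function name `knows` and the three possibility labels are hypothetical names introduced for illustration, not anything found in Lewis (1996). It aims to show that the sentence and the evidence stay fixed between 00:35 and 00:36, while the set of properly ignored possibilities changes.

```python
from typing import Callable, FrozenSet

Possibility = str
Proposition = Callable[[Possibility], bool]


def knows(
    p: Proposition,
    possibilities: FrozenSet[Possibility],
    eliminated: FrozenSet[Possibility],
    ignored: FrozenSet[Possibility],
) -> bool:
    """S knows that P iff P holds in every possibility that is neither
    eliminated by S's evidence nor properly ignored in the context."""
    relevant = possibilities - eliminated - ignored
    return all(p(w) for w in relevant)


# Hypothetical possibility labels for the pen example.
PEN_IN_BAG = "pen_in_bag"
PEN_ON_DESK = "pen_on_desk"
DEMON_HAS_PEN = "demon_has_pen"
possibilities = frozenset({PEN_IN_BAG, PEN_ON_DESK, DEMON_HAS_PEN})


def pen_is_in_bag(w: Possibility) -> bool:
    return w == PEN_IN_BAG


# The evidence (I saw the pen in the bag a minute ago) is the same at both times.
eliminated = frozenset({PEN_ON_DESK})

# Only the set of properly ignored possibilities differs between contexts.
ignored_at_0035 = frozenset({DEMON_HAS_PEN})  # before the skeptic speaks
ignored_at_0036 = frozenset()                 # the demon can no longer be ignored

knows_at_0035 = knows(pen_is_in_bag, possibilities, eliminated, ignored_at_0035)
knows_at_0036 = knows(pen_is_in_bag, possibilities, eliminated, ignored_at_0036)
print(knows_at_0035, knows_at_0036)
```

In this sketch no argument corresponding to a word of the sentence 'S knows that p' is altered between the two calls; the change is confined to the `ignored` parameter, which stands in for the contextual shift that the skeptic provokes.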

Frege, one of the founding figures of semantic analysis, and therefore an unavoidable reference for current alternatives within the philosophy of language, also assumed the organic model of individuation as the backbone of his logical and semantic proposal. It would be a disservice to restrict Frege's *organic inclinations* to his first works. Not only did he maintain them in his first significant works, but he also sided with the principle of context until the end of his career.

The project of defining the concept 'number' in 'Grundlagen' is an illustration of the organic procedure. Frege exposed the flaws of the classical strategy of defining numbers by putting 'units' together and shifted to a different method: 'It should throw some light on the matter to consider number in the context of a judgment which brings out its basic use' (Frege, 1884/1960, §46, p. 59). This is an application of the second principle that he introduced in the introduction to this work and that defined his logico-semantic project, 'never to ask for the meaning of a word in isolation, but only in the context of a proposition' (p. xxii), the principle of context that has shaped the development of logic and semantics ever since.

The principle of context is a rich indication that can be understood as making a point about the contextually modulated meaning of the words in a sentence, or else as a statement about the logical priority of propositions over concepts. The first reading, which has become the centerpiece of several varieties of contextualism, elaborates it in the notion of modulation of meaning (vid. Recanati, 2004, p. 39 and ff.). It is nevertheless the second reading that characterizes the organic model. To avoid misunderstandings, we will call this second reading the Principle of Propositional Priority:

[Principle of Propositional Priority] Propositions are the primary bearers of logical, semantic, and pragmatic properties.

Two judgments, Frege explains (Frege, 1879, §3), can differ in two ways: (i) From the two of them together with a certain set of premises, the same set of consequences follows. (ii) Alternatively, the sets of their consequences might not coincide. In the first case, the two judgments have the same content; in the second case, their contents are different. A propositional content, the content of a possible judgment, is thus individuated by the contents that follow from it (together with some auxiliary information). In this model, subsentential and subpropositional elements play no essential role in content individuation. As Frege put it:

Let us assume that the circumstance that hydrogen is lighter than carbon dioxide is expressed in our formula language, we can then replace the sign for hydrogen by the sign for oxygen or that for nitrogen. This changes the meaning in such a way that 'oxygen' or 'nitrogen' enters into the relations in which 'hydrogen' stood before. If we imagine that an expression can thus be altered, it decomposes into a stable component, representing the totality of relations, and the sign, regarded as replaceable by others, that denotes the object standing in these relations. The former component I call a function, the latter its argument. *The distinction has nothing to do with the conceptual content; it comes about only because we view the expression in a particular way* (our italics; Frege, 1879, p. 22).

Frege's approach to the other classical principle, the Principle of Compositionality, is patent in this text. The interpretation of the principle that characterizes the building-block model takes it as a criterion of propositional individuation in which propositions are complex entities made up of simpler parts. We claim, nevertheless, that this is not Frege's interpretation. The organic model is compatible with a view of compositionality as a method of propositional analysis, not as a criterion of propositional individuation. A single proposition can be expressed by different sentences, which open up diverse possibilities of propositional analysis. Even if propositions are, in the organic model, non-structured entities, the structure of sentences can be projected, for the sake of a particular analytic aim, onto the propositional contents expressed by them. This fact should not make us forget that there is a sharp distinction between the ontological characterization of propositions as structured entities built up from blocks, on the one hand, and the semantic project of assigning semantic values to expressions in a sentence, on the other. Lewis (1980)<sup>3</sup> is an example of the defense of the organic model of propositional individuation together with the compositional approach to the semantic value of expressions.

That the classical building-block interpretation of compositionality is alien to Frege's thought is no longer news. It has been defended by Jansen (2001) and Pelletier (2001), among others. In what follows we will offer new evidence.<sup>4</sup>

From the principle of propositional priority follows one of Frege's longstanding insights, one that plays a particularly relevant role in this paper and that is the core of the organic model: it makes no sense to admit the possibility that there might be different, yet analytically equivalent, thoughts. This insight, we contend, is not compatible with the building-block model. Furthermore, as a defining feature of the organic approach to propositions, it serves as a test to set apart two essentially distinct uses of context, as we will do in the next section of this paper. If propositional contents are organically individuated, analytically equivalent sentences express the same proposition. This claim amounts to a rejection of a possible isomorphism between sentences and the propositions expressed by them, and is a major consequence of the organic model. In Frege's writings the rejection of the isomorphism between sentences and thoughts is represented by his move toward contents by overlooking the grammatical surface of judgments. Languages serve thoughts to get 'clothed in the perceptible garb of a sentence' (Frege, 1918–1919a, p. 354) and can be used 'as a bridge from the perceptible to the imperceptible' (Frege, 1923–1926, p. 259). Nevertheless, cloth and flesh, the perceptible and the imperceptible maintain their independence, and the principle of propositional priority establishes which one takes the lead. In 'Logical Generality' (Frege, 1923–1926), for instance, Frege says: 'We should not overlook the deep gulf that yet separates the level of language from that of the thought, and which imposes certain limits on the mutual correspondence of the two levels' (Frege, 1923, p. 259). Passive transformation becomes one of his favorite examples. From 'Begriffsschrift' to 'Logical Investigations,' he resorts to it to show that non-synonymous sentences (in the standard sense) can systematically be used to elicit the same thought:

A sentence can be transformed by changing the verb from active to passive and at the same time making the accusative into the subject. In the same way we may change the dative into the nominative and at the same time replace 'give' with 'receive.' Naturally such transformations are not trivial in every respect; but they do not touch the thought, they do not touch what is true or false (Frege, 1918–1919a, p. 357).

But passivization is not the only case. Frege's substitution mechanism to determine the contribution of subsentential expressions is a further example of his use of context, a mechanism that Brandom takes over to explain the inferential function of singular terms and predicates (Brandom, 2001, Chap. 4 *passim*). 'Frege was the first,' Brandom concedes, 'to use distinctions such as these to characterize the roles of singular terms and predicates. Frege's idea is that predicates are the substitutional sentence frames formed when singular terms are substituted for in sentences' (Brandom, 2001, p. 131).

An example of a different sort that nevertheless illustrates the same point occurs in the realm of logic. Logical terms mean unsaturated notions whose arguments can be sentences, truth-values or thoughts, depending on the perspective we take on them. In particular, thoughts can be compounded to form more complex ones by means of logical operations. Nevertheless, the logical operations applied to sets of thoughts can be rendered in natural and logical languages through sentences with different ingredients. The thought expressed by any instance of the schema '(A & A)' is the thought expressed by the corresponding instance of 'A' (Frege, 1923–1926, p. 393, n. 21). The thought expressed by any instance of the schema 'Not [(not A) and (not B)]' is the thought expressed by the corresponding instances of 'Not [Neither A nor B]' and by the corresponding instances of 'A or B' (Frege, 1923–1926, p. 396).
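The equivalences behind these schemata can be checked mechanically. The following sketch is an illustration only, and it rests on a loud assumption: it uses sameness of classical truth tables as a coarse stand-in for Frege's criterion of sameness of consequences, which is a simplification and not Frege's own procedure. The helper names `truth_table` and `same_truth_conditions` are hypothetical.

```python
from itertools import product
from typing import Callable, Tuple

# A schema is modelled as a function of the truth-values of A and B.
Schema = Callable[[bool, bool], bool]


def truth_table(schema: Schema) -> Tuple[bool, ...]:
    """Value of the schema under every assignment to (A, B)."""
    return tuple(schema(a, b) for a, b in product((True, False), repeat=2))


def same_truth_conditions(left: Schema, right: Schema) -> bool:
    """Assumption: identical truth tables approximate 'same thought'."""
    return truth_table(left) == truth_table(right)


def plain_a(a: bool, b: bool) -> bool:
    return a


def a_and_a(a: bool, b: bool) -> bool:
    return a and a


def not_both_negated(a: bool, b: bool) -> bool:
    # 'Not [(not A) and (not B)]'
    return not ((not a) and (not b))


def a_or_b(a: bool, b: bool) -> bool:
    return a or b


def a_and_b(a: bool, b: bool) -> bool:
    return a and b


idempotence_holds = same_truth_conditions(a_and_a, plain_a)
disjunction_holds = same_truth_conditions(not_both_negated, a_or_b)
and_matches_or = same_truth_conditions(a_and_b, a_or_b)
print(idempotence_holds, disjunction_holds, and_matches_or)
```

The sentences differ in their ingredients, yet the first two pairs cannot be told apart by this test, whereas conjunction and disjunction can. On the organic reading this is the pattern one would expect if contents are individuated by what follows from them and not by the structure of the sentences that express them.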

Thus, even if Frege explicitly uses the building-block image (as in Frege, 1914, p. 225), he takes the idea that thoughts are made out of simpler parts that correspond to the parts of the sentences metaphorically. In 'On Sense and Meaning' he says:

Here, I have used the word 'part' in a special sense. I have in fact transferred the relation between the parts and the whole of the sentence to its meaning, by calling the meaning of a word part of the meaning of the sentence, if the word itself is a part of the sentence. This way of speaking can certainly be attacked, because the total meaning and one part of it do not suffice to determine the remainder, and because the word 'part' is already used of bodies in another sense. A special term would need to be invented (Frege, 1892, p. 165).

At the end of his life, Frege still maintains the same view:

If one thought contradicts another, then from a sentence whose sense is the one it is easy to construct a sentence expressing the other. Consequently the thought that contradicts another thought appears as made up of that thought and negation [*...*]. But the words 'made up of,' 'consists of,' 'component,' 'part' may lead to our looking at it the wrong way. If we choose to speak of parts in this connection, all the same these parts are not mutually independent in the way that we are elsewhere used to find when we have parts of a whole (Frege, 1918–1919b, p. 386).

<sup>3</sup>"The less I have said about what so-called semantic values must be, the more I am entitled to insist on what I did say. If they don't obey the compositional principle, they are not what I call semantic values" (Lewis, 1980, p. 91).

<sup>4</sup>Reverse Compositionality (Fodor, 1998, cfr. Szabó, 2013) is no better candidate to do justice to Frege's ideas on this issue. If the principle is interpreted as a sort of semantic version of "reverse engineering," then it is incompatible with the Fregean stance regarding the fact that multiple logical forms can result from the analysis of a single judgment. If it only amounts to the platitude that whatever the analysis of a judgment, the final components should be somehow related to the whole, then it is compatible with both the organic and the building-block models. Similarly, if it is only meant as 'a statistical psychological generalization that holds with great regularity' (Johnson, 2006, p. 52), then Reverse Compositionality is not particularly useful when discussing content individuation.

In summary, the Fregean principle of propositional priority introduces a way of individuating propositional contents that makes an idiosyncratic use of context, a use that cannot be accommodated in any of the contemporary positions that attempt to harbor the effect of contextual factors in what is said. The Fregean organic model and the compositionalist model that serves as a background for the theories depicted in the first section of this paper stand in sharp contrast, with profound philosophical consequences.<sup>5</sup> The two models are incompatible, as we will show in the next section.

## EXPRESSIVISM AND THE ORGANIC MODEL

It is the purpose of this section to show that contemporary expressivism, at least in the way in which some of its most popular varieties are commonly understood, is incompatible with the building-block model. We do so by focusing on the expressivist's commitment to the idea, mentioned in the previous section, that there cannot be different, and yet analytically equivalent, propositions. In so doing, we will argue for a somewhat controversial statement (see, for example, Field, 2009, p. 252) that we introduced in the first section of the paper – that MacFarlane's assessment relativism, as a representative of the building-block model, needs to be sharply distinguished from expressivism, a paradigmatic example of the organic model.

Classical Expressivism analyzes sentences with ethical terms, such as 'cheating on your husband is bad,' as having the general import of 'boo for cheating!,' i.e., as interjections devoid of propositional content that cannot qualify as true or false. Contemporary expressivism, by contrast, acknowledges an evaluable content, organically individuated, to acts with expressive terms. 'The expressivist strategy,' as Gibbard puts it, 'is to change the question. Don't ask directly how to define 'good'*...* shift the question to focus on judgments: ask, say, what judging that is good consists in' (Gibbard, 2003, p. 6). This pattern applies to a wide variety of topics. Gibbard (2012) applies it to semantics, Chrisman and Field to knowledge ascriptions (Chrisman, 2007, 2012; Field, 2009; Carter and Chrisman, 2012), Bar-On to first-person ascriptions (Bar-On, 2004), and so on.

In an expressivist setting, the content of normative claims is individuated organically. Higher-order functions, functions like 'is wrong,' 'is good,' 'S knows that,' 'S believes that,' or 'necessarily,' are non-truth-conditional functions that do not describe how the world is. Some of these functions are functions of propositions whose semantic role does not consist in adding a conceptual component to the propositional content of the communicative act in which they are used. For those, expressivists like Gibbard (2012, p. 179) propose an *oblique* approach – by focusing on the mental states that are expressed by the use of normative utterances, inferential relations of entailment and incompatibility are exposed, and these are the touchstone of the expressivist analysis.

So far, the characterization of the meaning of functions of propositions is negative: they are non-truth-conditional, non-descriptive, non-contributive. But if they do not describe the world and do not contribute to the proposition, what semantic role do they perform? How are they individuated? A temptation for many expressivists, old and new, is to identify the meaning of the relevant terms with some kind of mental state, attitude or feeling. An example is Gibbard (1990): 'According to any expressivistic analysis, to call something rational is not, in the strict sense, to attribute a property to it. It is to do something else: to express a state of mind' (Gibbard, 1990, p. 9). But, as we argued in Frápolli and Villanueva (2012, p. 485), this is unnecessary, 'since the meaning of these expressions is exhausted once their inferential potential is indicated.' A look at this inferential potential makes it apparent that normative expressions are distinctively connected with other expressions that include functions of propositions (they entail some and are incompatible with others), and that these connections suffice to explain their semantico-pragmatic behavior. In the next few paragraphs, we sketch the kind of minimal expressivist analysis of functions of propositions that we have developed in Frápolli and Villanueva (2012).

To give the meaning of 'S believes that p' – we consider modal, but also doxastic and epistemic, attributions to belong to the realm of the normative – is to identify the circumstances under which an agent is entitled to utter this sentence, and the consequences that can be derived from the attribution. It is constitutive of the meaning of 'believe' that an agent cannot attribute to a subject the belief that p and at the same time the belief that p cannot be true. This is the standard truth norm. Attributing beliefs to an agent commits the attributor to the further attribution of plans to act according to his/her beliefs. If we attribute to Victoria the belief that she is late for work, we should attribute to her the intention to leave immediately (even if other factors preclude her from acting in this way).

Similarly, the meaning of 'know' is such that if an agent attributes to a subject the knowledge that p, the agent will be committed to the truth of p. Attributing the knowledge of p is incompatible with our belief that p is false. As M. Williams puts it, 'in attributing knowledge to another person, I concede both the truth of what he believes and his right to believe it. And in advancing this double endorsement, I take on the same commitments and lay claim to the same entitlements' (Williams, 2001, p. 17). Knowledge and belief are different concepts *because* the conditions for their use and the commitments acquired by their attribution do not coincide.

<sup>5</sup>It might perhaps be surprising for some not to find a mention in this section of Donald Davidson, one of the best-known champions of the cause against the organic model. The reason for this is that we wanted to avoid any possible confusion between holism, Davidson's own brand of anti-building-block theory, and expressivism, which is the target of this paper. The kind of contemporary expressivism that we explore in this paper is different from holism – and from 'global expressivism' (Price, 2011), or inferentialism – in at least two crucial respects: expressivism is not committed to the idea that *every* expression needs to receive an organic analysis, and expressivism does not need to accept that every inference is a *meaning-determining* inference (cf. Gibbard, 2012, p. 109 and ff.).

The same can be said of pairs of logical terms such as 'or' and 'and.' Utterances of 'Yum likes licorice and Yuk dislikes it' and 'Yum likes licorice or Yuk dislikes it' express different contents because the latter, unlike the former, is compatible with the assertion that Yum dislikes licorice. Conjunction and disjunction are distinct concepts *because* they derive from, and produce, distinct permissions and prohibitions, i.e., because they sanction divergent behavioral responses. Generality and instantiation likewise give rise to different permissions and commitments. A rational agent cannot believe that an individual *x* has the property *P* and at the same time reject that something is *P*, for *Px* and *It is not the case that there is a P* express incompatible contents. On the other hand, if two expressions systematically give rise to the same set of commitments and share the circumstances under which they can be properly used, they also share their content. One might feel that the meanings of 'every' and 'all' are slightly different. In fact, they are not universally interchangeable *salva congruitate*. But if there is no detectable difference in claiming that *every child likes football* and *all children like football* in terms of the agent's entitlements and commitments, there is just one proposition expressed by the two claims. And the same happens with the following sentences,

'tame tigers exist,' 'some tigers are tame,' 'there are tigers that are tame,' and 'not all tigers are not tame.'

These sentences are not isomorphically identical; some words occur in some of them and not in others, and they do not possess the same structure. But the inferential moves that would be made by the use of any of them in a communicative act would not be affected by the replacement of any one by any of the others, for nothing follows from any of them that does not follow from the others too.
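As a rough illustration, assume a standard first-order regimentation, with $T$ for 'is a tiger' and $M$ for 'is tame'. Both the symbols and the regimentation are assumptions introduced here; the text itself individuates content inferentially rather than model-theoretically. On that reading the first three sentences share the form $\exists x\,(Tx \wedge Mx)$, and the fourth reduces to it by classical steps:

$$
\begin{aligned}
\neg\forall x\,(Tx \rightarrow \neg Mx) &\equiv \exists x\,\neg(Tx \rightarrow \neg Mx) \\
&\equiv \exists x\,(Tx \wedge \neg\neg Mx) \\
&\equiv \exists x\,(Tx \wedge Mx)
\end{aligned}
$$

Each step uses only quantifier duality, the classical definition of the conditional, and double-negation elimination, so on this reading nothing follows from one of the four sentences that does not follow from the others.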

Now it should be patent that the expressivist approach hinted at so far falls within the organic model. The content of the set of expressions to which the expressivist analysis applies is individuated by reference to the inferential links granted or precluded by assertions in which they occur, rather than by factoring in the modulated meanings of subsentential items. This minimal brand of expressivism we take to be compatible with the core of major contemporary expressivist approaches, and the features that make this position an example of the organic model belong to this common core.<sup>6</sup>

Our claim about the organic nature of the expressivist enterprise will now be assessed with the aid of the test that we introduced in the previous sections. Expressivism is naturally committed to the idea that there cannot be different, but analytically equivalent, propositions, and this tenet has been put to use as a premise by John MacFarlane in an argument to undermine the expressivist analysis of predicates of personal taste. We will show that, even though he is right in attributing that principle to the expressivist, his argument does not work. In the process, the crucial difference between expressivism and relativism will become apparent.

John MacFarlane has, in recent years, developed an analysis of predicates of personal taste that makes them context-dependent but that also differs from most previously known versions of contextualism. According to MacFarlane, what makes these predicates special is that they require the intervention, at a postsemantic level, of certain information that can be gathered only from the context of assessment – rather than the context of utterance. Even if it was true when originally produced, I can retract my claim that 'licorice is not tasty' because I am assessing it now in a different context. With this, MacFarlane manages to offer an alternative both to objectivism – the idea that claims that contain predicates of personal taste are true or false *simpliciter* – and to contextualism – the idea that every taste claim involves a reference to a set of taste standards. Expressivism, MacFarlane posits, can nevertheless be developed into a position on the matter that looks dangerously close to his own assessment relativism. Not so close, though, that it cannot be differentiated from it and evaluated accordingly as a different theory.

Relativism and expressivism would offer similar treatments of descriptive beliefs, which are individuated in terms of compatibility and incompatibility among mental states (MacFarlane, 2014, p. 170). The conflict would arise when assessment-sensitive beliefs are considered. In these cases, MacFarlane's reconstruction of the expressivist position would offer an indirect characterization of beliefs via the language of preference. In such an expressivist framework, to attribute to someone the belief that licorice is tasty would be to attribute to that individual 'the very same kind of state' (MacFarlane, 2014, p. 173) that we would attribute by saying that he/she likes licorice. By contrast, MacFarlane's relativism rejects the contention that beliefs with taste-relative contents can be identified with any 'state we could attribute using the language of preference' (ibid.). 'Why might it matter whether there is one state or two?' he asks. And his answer brings into the open a qualitative difference between the two accounts: for an expressivist it is 'conceptually impossible to think that something whose taste one knows first-hand is tasty while not liking its taste.' This, MacFarlane argues, would be going too far:

The relativist [*...*] can agree that the questions are 'not separate' in the following sense: first-person deliberation about each gets resolved by the same considerations. It does not follow from this, however, that the questions concern the same psychological state (MacFarlane, 2014, p. 174).

In other words, a first-person avowal of a belief with the content that licorice is tasty is practically indistinguishable from a claim to the effect that the speaker likes its taste. But even so, an agent who is working hard to improve his/her taste standards could make sense of a situation in which he/she still likes licorice but would be willing to accept that it might not be tasty after all. And clearly, a subject who assesses these avowals can mark the first one as saying something false while ascribing truth to the second claim.

Thinking that we like the taste of something having a taste we know first-hand, and thinking that something is tasty are conceptually, i.e., analytically, equivalent, and yet, MacFarlane argues, an agent can be in a position in which it is not irrational to attribute one but not the other. Only MacFarlane's own assessment relativism can account for this fact. Expressivism, no matter how close to relativism it might appear, is necessarily committed to the opposite idea. MacFarlane's remarks disclose an irresoluble conflict with expressivism that can be expressed in terms of the two models of content-individuation described in the foregoing sections of the present paper. If MacFarlane's diagnosis is accurate, relativism and expressivism come apart at this deeper level, and, to the despair of the proponents of the organic model, there is something intuitively correct in the idea that one might acknowledge that something is tasty without liking its taste herself.

<sup>6</sup>Please note that our claims concerning expressivism and relativism, but also the building-block model and the organic model, concern only the *individuation of content*. Thus we take them to be for the most part orthogonal with respect to the much debated issue of the identity of propositions. Our goal is to explore when two contents differ, rather than to establish what propositions are.

Thus, once it is assumed that the building-block model and the organic model can be used to spell out the differences between such close views as assessment relativism and expressivism, MacFarlane's arguments could be taken even a step further and used against the whole organic model. If 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' are analytically equivalent, but it can be rational to believe one but not the other, maybe it is inappropriate as a general policy to claim that there cannot be different but analytically equivalent propositions. We close this paper by providing some evidence in favor of the organic take, arguing (i) that 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' are not analytically equivalent within the organic model, and (ii) that, with respect to those sentences that are declared to be analytically equivalent by the organic model, it is indeed irrational to believe one but not the other. A crucial aspect of our argument is the rejection, already mentioned, of the idea that the job of sentences like 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' is to voice mental states. We are willing to accept, as both MacFarlane and Gibbard do, that the state of mind that makes an agent utter either of them might be identical. But it does not follow from this that they are analytically equivalent. Their meanings do not equate to the expression of feelings or attitudes; they are instead individuated by the inferential commitments that a speaker acquires when uttering them, commitments that belong to the public sphere and for which what happens in the agent's head is strictly irrelevant.

The general question underlying the conflict does not have an easy answer. Whether it makes sense to accept that two contents can be different even if the sentences by means of which we systematically express them are analytically equivalent crucially depends on the kinds of concepts involved. For ordinary first-level sentences, i.e., the kinds of sentences that express what Ramsey calls beliefs of the 'primary sort' (Ramsey, 1929, p. 146) and Boole and Frege 'primary propositions' (Frege, 1880–1881, p. 14), the possibility of finding cases of the kind put forward by Benson Mates (Mates, 1952) always seems open. Under the organic model, as we examined in Section "Propositional Priority and the Organic Model," content is individuated inferentially, but that does not mean that no mechanism can be devised to check whether or not two particular linguistic items, sentential or subsentential, express the same content. Whenever two expressions are not interchangeable *salva veritate*, it is proved that their inferential behavior crucially differs, and that they therefore cannot be taken to express the same content. Mates' cases are a particularly telling way to explore the inferential content of pairs of expressions.


While it still makes sense to ask whether 1b could be false, the truth of 1a goes unquestioned. 'Being an ophthalmologist' and 'being an oculist' are not, as a consequence, analytically equivalent. Comparing 2a and 2b offers a similar result. While 2a is obviously true, the truth of 2b can be challenged. A person attending a wine-tasting course might rationally think that he/she likes the taste of a certain wine while not thinking that it is tasty. This would not show that two analytically equivalent sentences can be rationally entertained as different, and therefore that expressivism fails. Rather, this would only show that expressivism – or MacFarlane's reconstruction of it as applied to taste predicates – had gone too far in claiming that those two thoughts are analytically equivalent. Within an expressivist-organic approach it makes perfect sense to think that licorice is not tasty, while still liking its taste, because 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' are *not* analytically equivalent. Thus 2a and 2b prove that they have different inferential import. MacFarlane's insistence on 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' being analytically equivalent, in spite of their distinct inferential behavior, only confirms that his assessment relativism belongs to the building-block model. The expressivist does not need to shy away from MacFarlane's argument precisely because organic content individuation is incompatible with the claim that 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' have the same content – they are not analytically equivalent. In fact, MacFarlane's claim that they are is only possible if relativism falls outside the sphere of the organic model.

'Being an ophthalmologist,' like 'being tasty,' is a first-order predicate. Expressivism, we have claimed, is nevertheless essentially concerned with functions of propositions. Modal, epistemic, and doxastic operators, along with ethical terms and logical constants, were among the examples that we offered to characterize the view as an instantiation of the organic model. In fact, Frege's examples, the ones that we introduced in order to argue for the idea that an organic individuation of content was incompatible with the existence of analytically equivalent, and yet different, propositions, involved a difference only in functions of propositions – logical constants, or the operation of passivization. Using the Mates test to check on sentences differing only at the level of functions of propositions yields striking results:


Sentences 3a and 4a are trivially true. But, contrary to what happens with 1b and 2b, sentences 3b, 4b, and 4c also appear to be unquestionably true. A rational agent cannot attribute the belief that lawyers are wealthy without attributing the belief that it is not the case that lawyers are not wealthy. Otherwise the agent would display serious rationality flaws. Those who reject 4c remove themselves from the community of rational agents. In these cases, what is in question is not lexical mastery but the basic understanding of the rules of language.

Concerning functions of propositions, our intuitions agree with the predictions of the organic model. Rational agents cannot believe *that Alan is an expressivist* without believing *that it is true that Alan is an expressivist*, for what is at stake is a single belief, not two beliefs tightly connected. The same content is expressed by the sentences 'it is not the case that Alan is not an expressivist,' and 'Alan is an expressivist or Alan is an expressivist.'
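The point about logical constants can be checked mechanically under one simplifying assumption: that classical truth-functional equivalence is an acceptable stand-in for sameness of inferential role. The sketch below is only an illustration under that assumption. The function names and the regimentations are hypothetical, and treating 'it is true that' as a transparent operator is itself a stipulation.

```python
from itertools import product

def equivalent(f, g, n_vars=1):
    """Return True if f and g agree on every classical truth-value assignment."""
    return all(f(*vals) == g(*vals)
               for vals in product([False, True], repeat=n_vars))

# Hypothetical regimentations of the sentences about Alan.
plain = lambda p: p                      # 'Alan is an expressivist'
double_negation = lambda p: not (not p)  # 'it is not the case that Alan is not an expressivist'
idempotent_or = lambda p: p or p         # 'Alan is an expressivist or Alan is an expressivist'
truth_ascription = lambda p: p           # 'it is true that ...' (assumed transparent)

# Contrast case from earlier in the text: 'and' and 'or' come apart.
conj = lambda p, q: p and q
disj = lambda p, q: p or q

same_content = all(equivalent(plain, f)
                   for f in (double_negation, idempotent_or, truth_ascription))
and_or_differ = not equivalent(conj, disj, n_vars=2)
```

On this proxy, the three variants are indistinguishable from the plain sentence, whereas conjunction and disjunction are separated by the assignments in which exactly one conjunct is true.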

## CONCLUSION

MacFarlane's assessment relativism is necessarily different from any sensible reconstruction of an expressivist position. The expressivist is committed to the organic model, while MacFarlane's position illustrates the building-block approach. He is right that it is easy to imagine situations in which 'Licorice is not tasty' and 'I know the taste of licorice first-hand and I do like it' can be thought at the same time, about the same licorice, without the thinker being irrational, but that only shows that 'Licorice is tasty' and 'I know the taste of licorice first-hand and I do like it' are not analytically equivalent. Whenever real analytically equivalent cases can be found within the organic model, as in 3a to 4c, it is irrational to attribute one but not the other.

The two models can be discriminated by the analytic-equivalence test. The negative answer to the question whether analytically equivalent sentences always express the same proposition characterizes the building-block model of content individuation. The positive answer is the semantic core of the organic model. In the dispute between MacFarlane and Gibbard, there is an essential mismatch that underlies their local disagreement about the identification of normative contents with expressions of mental states, which classifies either view under a different model. MacFarlane is a representative of the building-block model, while Gibbard represents the organic model. Their views are thus more dissimilar than meets the eye. Both models have strengths and weaknesses and at the level of first-order contents the two parties propose possibly compatible accounts. Nevertheless, when functions of propositions are involved, the analytic-equivalence test settles the issue for the organic model. Only the organic model agrees with the speakers' intuitions and thus it is the only one appropriate for the analysis of higher-order functions, in general, and functions of propositions, in particular. We might reject that the speakers' intuition plays any role in the analysis of meaning, as the proponents of the various error theories do, but this move would take the study of language away from the game of science. We chose the empirical path in 'Minimal Expressivism' (Frápolli and Villanueva, 2012) by assuming that semantic hypotheses on the behavior of functions of propositions were, in this sense, *a posteriori*. The analytic-equivalence test adjudicates between the principle of compositionality and the principle of propositional priority and confirms that when higher-order concepts are at stake, expressivism is the correct approach.

## FUNDING

This project has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie Grant Agreement No. 653056. The Ministerio de Ciencia e Innovación del Gobierno de España (Project FFI2013-44836-P), the Consejería de Innovación de la Junta de Andalucía (Project HUM-0499), and the University of Granada (Plan Propio, Programa de Sabáticos, and Proyecto Expresivismo Doxástico) have also supported our research.

## ACKNOWLEDGMENTS

We are grateful to David Bordonaba and to the two Frontiers reviewers for useful comments on an earlier draft.

## REFERENCES

Bar-On, D. (2004). *Speaking My Mind*. New York, NY: Oxford University Press.

Brandom, R. (2001). *Articulating Reasons. An Introduction to Inferentialism*. Cambridge, MA: Harvard University Press.

Cappelen, H., and Lepore, E. (2005). *Insensitive Semantics*. Basil: Blackwell.

Carston, R. (2000). Explicature and semantics. *UCL Work. Pap. Linguist.* 12, 1–44.

Chrisman, M. (2012). Epistemic expressivism. *Philos. Compass* 7, 118–126. doi: 10.1111/j.1747-9991.2011.00465.x

*Pragmatics*, eds G. Preyer and G. Peter (Oxford: Oxford University Press), 240–250.

Price, H. (2011). *Naturalism Without Mirrors*. New York: Oxford University Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Frápolli and Villanueva. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Constructing the context through goals and schemata: top-down processes in comprehension and beyond**

### *Marco Mazzone\**

*Department of Humanities, University of Catania, Catania, Italy*

My main purpose here is to provide an account of context selection in utterance understanding in terms of the role played by schemata and goals in top-down processing. The general idea is that information is organized hierarchically, with items iteratively organized in chunks (here called "schemata") at multiple levels, so that the activation of any item spreads to the schemata that are the most accessible due to previous experience. The activation of a schema, in turn, activates its other components, so as to predict a likely context for the original item. Since each input activates its own schemata, conflicting schemata compete with (and inhibit) each other, while multiple activations of a schema raise its likelihood of winning the competition. There is therefore a double movement, with bottom-up activation of schemata enabling top-down prediction of other contextual components, triggered by multiple sources. Another claim of the paper is that goals are represented by schemata placed at the highest levels of the executive hierarchy, in accordance with Fuster's model of the brain as a hierarchically organized perception-action cycle. This account can be considered, in part at least, a development of ideas contained in Relevance Theory, though it may imply that some other claims of the theory are in need of revision. Therefore, a secondary purpose of the paper is a contribution to the analysis of that theory.

### *Edited by:*

*Gabriella Airenti, University of Torino, Italy*

### *Reviewed by:*

*Valentina Bambini, Institute for Advanced Study, Italy Jacques Moeschler, University of Geneva, Switzerland*

### *\*Correspondence:*

*Marco Mazzone, Department of Humanities, University of Catania, Piazza Dante 32, I-95124 Catania, Italy mazzonem@unict.it*

### *Specialty section:*

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

*Received: 22 December 2014 Accepted: 04 May 2015 Published: 19 May 2015*

### *Citation:*

*Mazzone M (2015) Constructing the context through goals and schemata: top-down processes in comprehension and beyond. Front. Psychol. 6:651. doi: 10.3389/fpsyg.2015.00651*

**Keywords: schemas, context, mindreading, hierarchical representation, pragmatics**

## **Introduction**

The problem of adequately accounting for the cognitive role of context does not affect only pragmatics: most, and possibly all, human behaviors require taking into account indefinitely changeable contexts and even deciding what counts as the relevant context in the present case. As a matter of fact, one of the most developed theories in cognitive pragmatics is Relevance Theory (from now on, RT; Sperber and Wilson, 1986/1995), in which communication is analyzed as a special case of cognition precisely because both cognition in general and communication in particular have the problem of selecting what is relevant in the present context (or which is the presently relevant context).

My main purpose here is to provide an account of context selection in utterance understanding in terms of the role played by schemata and goals in top-down processing. This account can be considered, in part at least, a development of ideas contained in RT, though it may imply that some other claims of the theory are in need of revision. Therefore, a secondary purpose of the paper is a contribution to the analysis of RT. On the other hand, I also aim to show that the proposed account, which is based on a quite general mechanism, is consistent with explanations of flexible behavior in linguistics, in theory of concepts, and in psychological, neuroscientific, and computational theories of action control.

Relevance Theory has conceived of utterance interpretation as a special case of the search for relevance in cognition. Utterances raise expectations of relevance in the addressees, thus triggering a search for contexts in which they are actually made relevant. In practice, non-demonstrative inferences are constructed, with encoded meaning and contextual assumptions acting as premises that license contextual conclusions.<sup>1</sup> Utterance interpretation thus amounts to identifying the relevant cognitive context, that is, the appropriate and intended set of contextual assumptions (and conclusions). In this account, an important role is played by the organization of memory, more precisely by the differential accessibility of contents: these can be more or less strongly associated with (and thus more or less easily activated by) the inputs to be processed. Relevance theorists have occasionally noted that such differential accessibility may depend on the fact that memory is organized in chunks, a point that notions such as schemata, frames, scripts etc. are intended to account for.

In this paper, I take this idea very seriously and attempt to frame it within a general model of the human brain architecture and cognitive processing. This model, proposed by Fuster (2001, 2003, 2014), conceives our cortex as organized along two highly interconnected hierarchies of representations, the sensory and the motor one, which together constitute a perception-action cycle. The representations are hierarchically organized in the sense that higher cortical layers provide the structure by which items at lower levels are arranged together, which is a different way to say that items are iteratively organized in chunks at multiple levels. I will call "schemata" the higher-level representations describing the organization of items at lower levels.

The general idea I will pursue is the following.<sup>2</sup> The activation of items at each level gives access in a probabilistic manner to the schemata they pertain to, that is, activation spreads to the schemata that are the most accessible due to previous experience. The activation of a schema, in turn, activates its other components, so as to predict a likely context for the original item. However, such a prediction can be either confirmed or refuted by the actual context, or more precisely by the variety of the current inputs, each of which activates its own schemata and therefore its own predictions about context. Conflicting schemata compete with (and inhibit) each other, while multiple activations of a schema raise its likelihood of winning the competition. There is therefore a double movement, with bottom-up activation of schemata enabling top-down prediction of other contextual components, triggered by multiple sources.

In utterance understanding, this picture applies both to linguistic and non-linguistic inputs. Each of them spreads activation to schemata, thus providing probabilistic predictions about their possible context. Since each input acts as context for the others, those predictions are in fact assessed against each other.
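The double movement just described can be made concrete with a toy sketch. Everything in it is a hypothetical illustration rather than a model proposed in the paper: the schema names, their components, the accessibility weights, and the linear inhibition rule are assumptions chosen only to show bottom-up activation, competition, and top-down prediction working together on the ambiguous word "bank".

```python
# Hypothetical schemata: each lists the lower-level items it organizes.
SCHEMATA = {
    "GETTING_MONEY_FROM_BANK": {"bank/financial", "money", "cashier"},
    "WALKING_ALONG_RIVER": {"bank/riverside", "river", "water"},
}

# Assumed accessibility of each schema, standing in for previous experience.
ACCESSIBILITY = {"GETTING_MONEY_FROM_BANK": 0.6, "WALKING_ALONG_RIVER": 0.4}

# Assumed lexicon: an ambiguous word maps to several candidate items.
LEXICON = {"bank": ["bank/financial", "bank/riverside"]}

def interpret(inputs, inhibition=0.5):
    """Return the winning schema and the context items it predicts."""
    observed = {item for word in inputs for item in LEXICON.get(word, [word])}

    # Bottom-up: every observed item activates each schema it pertains to.
    activation = {name: 0.0 for name in SCHEMATA}
    for item in observed:
        for name, components in SCHEMATA.items():
            if item in components:
                activation[name] += ACCESSIBILITY[name]

    # Competition: each schema is inhibited in proportion to its rivals.
    total = sum(activation.values())
    net = {name: a - inhibition * (total - a) for name, a in activation.items()}
    winner = max(net, key=net.get)

    # Top-down: the winner predicts its not-yet-observed components.
    predicted = SCHEMATA[winner] - observed
    return winner, predicted

money_reading = interpret(["bank", "money"])
river_reading = interpret(["bank", "river"])
```

In this sketch "bank" alone favors the more accessible financial schema, but a second input such as "river" activates the rival schema twice and lets it win despite its lower accessibility, which is one simple way to read the claim that multiple activations raise a schema's likelihood of winning.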

Another crucial assumption of this paper is that goals are represented by schemata placed at the highest levels of the executive hierarchy, that is, at the top of the motor stream of the perception-action cycle described by Fuster. In particular, the most abstract goals are located within the prefrontal cortex (PFC), which is responsible for controlled action, that is, for top-down control of action in a processual sense. I will briefly examine how these different senses of "top-down" are related to each other and actually involved in utterance understanding: not only do linguistic and non-linguistic inputs activate schemata in general, they also activate schemata specifically representing goals, and this activation may result in attentive (PFC-driven) processing of utterances.

In line with Grice, but also with suggestions coming from Levinson and from RT, I will propose in fact that utterance interpretation requires forming hypotheses about goals/intentions. On the one hand, utterances are evidence provided by the speaker to the addressee in order for her to recognize a communicative intention that may go far beyond coded meaning. On the other hand, recognition of this intention requires its being placed within an entire system of goals, since communicative intentions are in general means for other goals, which may or may not themselves be communicative. In a sense, then, the purpose of communication is the shared representation of a set of (communicative and noncommunicative) goals by the speaker and the addressee.

In this perspective, language production and understanding appear to be just components of a more general top-down/bottom-up cortical dynamic involved in the execution and understanding of intentional action. However, such an action-oriented view of language is not uncontroversial, and I will discuss some theses of RT that might turn out to be in conflict with it.

My discussion of RT requires an important qualification. What I propose here, apart from being a development of ideas put forth by RT, can be interpreted in part as an attempt to specify how RT might be implemented at the neuro-computational level. In practice, my account of the associative dynamic by which inputs activate a variety of schemata that compete with, or strengthen the activation of, each other, may provide a unitary explanation of the neuromechanics of a range of phenomena spanning different levels of linguistic processing. To this extent, I see my proposal as largely compatible with RT.<sup>3</sup> The only problem I raise here concerns a quite specific issue, that is, the role assigned to quantitative expectations of relevance. More specifically, I discuss the idea that in pragmatic interpretation there is an assessment of the *amount* of cognitive effects. As an alternative, I will consider a different route that has been explored by RT, based on the idea of expectations about specific *types* of cognitive effects. As I will argue, while the idea of a quantitative assessment of cognitive effects is consistent with the view of communication as geared to maximization of information, the alternative proposal is more compatible with the view of communication as based on the variety of human purposes.

<sup>1</sup>Strictly speaking, the inference is from the explicit meaning to the intended conclusions, with explicit meaning being the result of pragmatic processes applied to coded meaning. However, for the sake of simplicity I will speak of coded meaning whenever the distinction is of no import to our discussion.

<sup>2</sup>Let me introduce explicit definitions of the most important terms I will use. I call "schema" any higher-level cognitive representation, which is apt to specify the relationships between its components at a lower level. Schemata are based on co-occurrences in previous experience and they provide memory with structure. In the present context, I mainly use "bottom-up activation" to refer to the process by which pieces of information activate the schemata they pertain to, while "top-down activation" is the process by which the activation of a schema activates in turn its (other) components (for a different sense of "top-down," see below The Architecture of the Brain and the Prefrontal Cortex). "Competition" among representations, and specifically among schemata, may occur for the simple fact that they are differentially activated and, therefore, one has stronger effects than the other. However, strictly speaking, activation is just one side of the coin: there can be both excitatory and inhibitory links between representations. As a consequence, competition can also occur by way of inhibition, when schemata represent alternative states of affairs. For an example, see below (example 1) the discussion of how the ambiguity between two meanings of "bank" is resolved thanks to the activation of a schema for GETTING MONEY FROM THE BANK<sub>1</sub>. Although I will emphasise the excitatory role of this schema on the contextually appropriate meaning, the possibility of an inhibitory link between this meaning and its alternative(s) should also be considered.

## **RT and Pragmatic Context**

## **Context and Relevance**

First of all, let us consider in some detail the crucial role that RT assigns to context, to the point that constructing the right context comes to be seen as the main part of the entire process of utterance interpretation. Some terminological clarifications are in order. What RT is in fact concerned with is the cognitive context, that is, the set of assumptions needed in order for the addressee to infer the intended conclusions from the coded meaning of utterances.<sup>4</sup> This notion has then to be distinguished from the more standard notion of linguistic and situational context, that is, the factual linguistic and extra-linguistic environment in which the utterance is embedded and which provides further inputs to cognitive processes. Those inputs contribute to activate the assumptions involved in the interpretation of coded meaning: that is, the *factual* context contributes to the activation of the *cognitive* context.

In this perspective, the context is not something given before the interpretation starts, contrary to what has often been assumed:

In much of the pragmatic literature, events are assumed to take place in the following order: first the context is determined, then the interpretation process takes place, then relevance is assessed. [*. . .* RT] suggests a complete reversal of the order of events in comprehension. It is not that first the context is determined, and then relevance is assessed. On the contrary, people hope that the assumption being processed is relevant (or else they would not bother to process it at all), and they try to select a context which will justify that hope: a context which will maximize relevance (Sperber and Wilson, 1986/1995, pp. 141–142).

Thus, the interpretation process is not preceded by the selection of a context; rather, the latter is constitutive of the former. Nor does this process precede relevance assessment: on the contrary, it is driven from the beginning by expectations of relevance. Such expectations are embodied, so to speak, in the mechanism by which interpretation is performed, insofar as this mechanism works so as to ensure that a relevant context is selected. In this sense, RT claims that cognition in general, and utterance interpretation as a special case, is geared to maximization of relevance.

What then is relevance, and by which mechanism is it attained in utterance understanding? According to RT, intuitively an input is relevant when its processing yields a positive cognitive effect, specifically it "is relevant to an individual when it connects with background information he has available to yield conclusions that matter to him" (Wilson and Sperber, 2002a, p. 251). However, in a realistic account cognitive effects must be balanced against the cognitive effort required in order to get them. Therefore, a complete definition of relevance has two sides, a positive and a negative one:


When utterance interpretation is at issue, this definition is intended to refer to relevance of interpretations (vs. inputs). The mechanism by which interpretations that are relevant in this sense are construed is described as a heuristic in two steps, the "relevance-theoretic comprehension procedure":


The first step of the comprehension procedure is easily understood in the following terms. An interpretation is built following a path of least effort, that is, contextual assumptions licensing contextual conclusions (in relation to the coded meaning) are selected in order of accessibility. This step does not require more than a simple associative mechanism, which makes some assumptions more accessible than others given the utterance and the factual context. As to the second step, it prescribes some sort of assessment of the obtained interpretation against previous expectations of relevance. However, I see here a potential problem, which has consequences for the proposed definition of relevance as well. RT has provided only vague suggestions about how this assessment might be performed, as Sperber and Wilson (1987, p. 742) themselves admit:

Relevance, as it affects cognition, is not computed or numerically measured but monitored or assessed, yielding only gross absolute judgments and, in certain types of cases only, finer relative judgments. Suppose that the brain is sensitive to the amount of reorganization brought about by the processing of some information and to the expenditure of energy thus incurred, just as it is sensitive to changes of posture and expenditure of energy in the case of bodily

<sup>3</sup>With the further qualification that my proposal has consequences whose compatibility with other aspects of RT requires discussion: for a first example, RT seems to be committed to the view that the inferential component of comprehension cannot be implemented in associative terms (for a discussion, see Mazzone, 2014a); for a second example, my proposal seems to trivialize the notion of modularity (Mazzone, submitted) in a way that might not fit with relevance theorists' views.

<sup>4</sup>But see note 1.

movement. This is very vague—hopelessly so, some AI [artificial intelligence] people may think—but it is not so vague that it could not be false, and it is what we are claiming anyhow.

Starting from Sperber and Wilson (1986/1995, p. 130), relevance theorists have occasionally repeated without further development such "speculation," as they call it, according to which "contextual effects and mental effort, just like bodily movements and muscular effort, must cause some symptomatic physico-chemical changes." To my knowledge, none of the supporters of RT has ever tried to relate this speculation to any known cognitive mechanism. However, I want to show that RT has provided a number of clues pointing in a different direction.

To start with, it should be noted that, while this speculation concerns the assessment of both effects and effort, there is in fact an asymmetry between them in the comprehension procedure. The minimization of effort is apparently ensured already by the first step, that is, by accessing the most accessible (i.e., the least costly) interpretations in the first place. This may suggest that what needs to be further assessed, as required by the second step, is (the maximization of) cognitive effects. As a matter of fact, relevance theorists often refer to expectations of relevance specifically in terms of expectations about the amount of cognitive effects. In sum, while the criterion of effort can be accounted for very naturally in terms of associative accessibility, it is the criterion of cognitive effects that needs to make an appeal to the above speculation about "symptomatic physico-chemical changes."

However, RT has also considered two alternative views about expectations of cognitive effects, even if only implicitly.

## **Does Effort do Everything?**

According to the first suggestion, the criterion of minimization of effort alone might be sufficient to drive the cognitive system toward the maximization of benefits. This is suggested in passing by Sperber and Wilson (1996), in a passage where effort is first considered, as one would expect, as the purely negative side of relevance, but then an unexpected question follows (emphasized in italics):

when expectations of effect are wholly indeterminate, the mind should base itself on considerations of effort: pick up from the environment the most easily attended stimulus, and process it in the context of what comes most readily to mind. *Ceteris paribus*, what is easier is more relevant, if it is relevant at all. *But what are the chances that what comes more easily to mind is, in fact, relevant?* [emphasis mine] They would be close to nil, if saliency in the environment and accessibility in memory were both random, and moreover uncorrelated (Sperber and Wilson, 1996).

The question is unexpected, because there seems to be no reason why "what comes more easily to mind" should be relevant over and beyond the fact that it is, *ceteris paribus*, relevant *by definition* simply because it demands little effort (negative side of relevance). Of course, easily accessed stimuli (and interpretations) might happen to be almost entirely irrelevant on the positive side, that is, they might have little or no cognitive effects. But it is precisely in order to avoid that risk that the second step of the comprehension procedure is required, while here Sperber and Wilson seem to wonder whether ease of access can *by itself* ensure some relevance on the positive side. And in fact the previous quotation is followed by an evolutionary argument to the effect that what requires little cognitive effort is also likely to be, so to speak, the right sort of information, independently of any further mechanism for ensuring that sufficient cognitive effects are attained:

But humans are evolved organisms with learning capacities of sorts, so it is not too surprising to find that they spontaneously pay more attention [*. . .*] to objects and events that, on average, are more likely to be relevant to them.

For the same reason, it is not surprising that the perceptual categorization of a distal stimulus should tend to activate related information in memory. [*. . .*] Nor is it surprising that memory is so organized that pieces of information that are likely to be simultaneously relevant tend to be co-accessed or co-activated in chunks variously described in the literature as "concepts," "schemas," "scripts," "dossiers," etc. (Sperber and Wilson, 1996).

In practice, the suggestion is made that relevance on the negative side of the notion (the ease-of-access side) is sufficient to ensure relevance also on the positive side. More specifically, the organization of information in memory, by means of concepts, schemas etc., is suggested to ensure that ease of access of a given content is a reliable sign of its (probabilistic) contextual significance.

I want to emphasize that this suggestion is very close to a proposal made by Recanati (2004) with regard to what he calls "primary pragmatic processes." These are processes by which the coded meaning of utterances is adjusted and expanded in order to get the contextually appropriate and complete proposition that is today called the "explicit meaning" of the utterance. In Recanati's view, these processes, unlike the genuinely inferential ones required for deriving the implicit meaning of the utterance from its explicit meaning, are simple associative processes based on spreading of activation in conceptual networks. According to Recanati, this spreading activation is not wholly unconstrained and blind insofar as it activates schemata<sup>5</sup>, which ensure a search for coherent interpretations: "Coherent, schema-instantiating interpretations [*. . .*] tend to be selected and preferred over non-integrated or "loose" interpretations" (Recanati, 2004, p. 37). This occurs because of a double associative dynamic: on the one hand, on the bottom-up direction of the dynamic, "a schema is activated by, or accessed through, an expression whose semantic value corresponds to an aspect of the schema"; on the other hand, on the top-down direction, the "schema thus activated in turn raises the accessibility of whatever possible semantic values

<sup>5</sup>Recanati prefers the plural "schemata" whereas Sperber and Wilson use "schemas." From now on, I will always use the former for the sake of uniformity. For a more extensive discussion of the notion of schema, see Mazzone (2014a).

for other constituents of the sentence happen to fit the schema" (Recanati, 2004, p. 37).<sup>6</sup>

Interestingly, not only have Sperber and Wilson (1996) suggested a similar role for schemata (scripts etc.) in the context of the theoretical discussion mentioned above; relevance theorists have also appealed to this mechanism in various analyses of concrete examples. For instance, let us consider Carston's (2007) analysis of the following utterance:

(1) I'm going to the bank now to get some cash.

Since there are two possible meanings for "bank" (FINANCIAL INSTITUTION, RIVER SIDE), the problem is how the addressee may come to choose the right one. Carston (2007) makes the hypothesis that starting from the activation of CASH, a stereotypical frame or script for GETTING MONEY FROM A BANK<sup>1</sup> (where BANK<sup>1</sup> = FINANCIAL INSTITUTION) is recalled, thus strengthening the activation of BANK<sup>1</sup>. As in Recanati (2004), the idea is that something like a schema is activated bottom-up by one of its components (GETTING MONEY FROM A BANK<sup>1</sup> is activated by the concept GETTING MONEY, which is activated in turn by the words "to get some cash"), and then it raises top-down the accessibility of its other components (BANK<sup>1</sup> = FINANCIAL INSTITUTION), so that the concept FINANCIAL INSTITUTION comes to be preferred as the interpretation of "bank."

In this example, the relevant meaning of "bank" can be selected by nothing else than ease of access, thanks to the fact that—in Sperber and Wilson's (1996) words—"memory is so organized that pieces of information that are likely to be simultaneously relevant tend to be co-accessed or co-activated in chunks."<sup>7</sup>
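The dynamic at work in example (1) can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: the node names, the starting activations and the three weights (`up`, `down`, `inhibit`) are hypothetical values chosen for the example, not parameters proposed by Carston, Recanati, or Sperber and Wilson. The inhibitory step reflects the possibility, mentioned in note 2, of an inhibitory link between alternative meanings.

```python
def disambiguate(word_evidence, schemas, rivals, up=0.5, down=0.5, inhibit=0.3):
    """Toy bottom-up / top-down pass over a hand-built concept network."""
    # Bottom-up: a schema collects activation from its currently active components.
    schema_act = {
        name: up * sum(word_evidence.get(c, 0.0) for c in components)
        for name, components in schemas.items()
    }
    # Top-down: an active schema raises the accessibility of all its components.
    boosted = dict(word_evidence)
    for name, components in schemas.items():
        for c in components:
            boosted[c] = boosted.get(c, 0.0) + down * schema_act[name]
    # Inhibition: alternative readings lower each other's activation.
    final = dict(boosted)
    for a, b in rivals:
        final[a] = boosted.get(a, 0.0) - inhibit * boosted.get(b, 0.0)
        final[b] = boosted.get(b, 0.0) - inhibit * boosted.get(a, 0.0)
    return final


# Illustrative inputs (assumed values, not measurements): the word "bank"
# activates both readings equally; "to get some cash" activates GETTING_MONEY.
evidence = {"BANK1": 0.5, "BANK2": 0.5, "GETTING_MONEY": 1.0}
schemas = {"GETTING_MONEY_FROM_BANK1": ["GETTING_MONEY", "BANK1"]}
result = disambiguate(evidence, schemas, rivals=[("BANK1", "BANK2")])
preferred = max(("BANK1", "BANK2"), key=result.get)
print(preferred, result)
```

With these assumed values the two readings start out equally accessible, and the only thing that separates them at the end is the schema linking GETTING_MONEY to BANK1. No separate assessment of the amount of cognitive effects enters the computation.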

## **Expectations about Either the Amount or the Type of Effects?**

If our previous considerations are right, it might be the case that, contrary to the standard view in RT, no assessment of cognitive effects is required in addition to the negative criterion of accessibility. However, on many occasions relevance theorists have claimed instead that simple accessibility does not constrain interpretations enough, and that some independent assessment of cognitive benefits is needed. Specifically, the standard view is that interpretations must be assessed against some expected amount of cognitive effects. But relevance theorists have also explored a different route, that is, the idea that our expectations of relevance concern the type rather than the amount of cognitive benefits. I intend to argue, first, that there is a substantial difference in conceiving expectations of relevance in terms of the type vs. the amount of cognitive effects, and, second, that, on closer analysis, the type-based hypothesis points in the same direction as the suggestion that considerations of cognitive effort may suffice to explain the search for relevance.

First of all, let us note that relevance theorists explicitly mention expectations about the *type* of cognitive effects, either with or without mention of their *amount*. For an example of the mention of both, consider this quotation from Carston (2007, p. 20, emphasis mine): "an utterance automatically triggers quite specific expectations of relevance in its addressee, that is, expectations concerning *both the quantity and the kind* of cognitive effects (implications) it will yield if optimally processed." Mention of the type has become especially frequent in recent versions of RT, the ones characterized—in Wilson's (2004, p. 352) words—by "the introduction of the mutual adjustment process (e.g., Sperber and Wilson, 1998; Wilson and Sperber, 2002b, 2004)." The idea is that pragmatic processing does not operate sequentially, by means of only forward inferences from the proposition expressed to the intended cognitive effects (passing through the selection of appropriate contextual assumptions). On the contrary, there is a parallel process based on both forward and backward inferences, in the course of which explicit content, contextual assumptions and cognitive effects are mutually adjusted to each other:

Mutual adjustment is seen as taking place in parallel rather than in sequence. The hearer does not first identify the proposition expressed, then access an appropriate set of contextual assumptions and then derive a set of cognitive effects. In many cases [*. . .*], he is just as likely to reason backward *from an expected cognitive effect* to the context and content that would warrant it (Wilson, 2004, p. 353; emphasis mine).

As the last sentence suggests, the backward inferences involved in the mutual adjustment process require expectations about specific kinds of cognitive effects. For one example (from Wilson and Carston, 2007), consider the following exchange:

(2) Peter: Will Sally look after the children if we get ill?

Mary: Sally is an angel.

Apparently the implicit content conveyed by Mary's utterance is an affirmative answer to the question raised by Peter, something like SALLY WILL LOOK AFTER THE CHILDREN IF WE GET ILL. This can be seen as the conclusion of an inference having as its premises the explicit content of Mary's utterance and possibly some contextual assumptions. As to the explicit content, however, the concept that the word "angel" contributes to it cannot be the encoded concept ANGEL which has as its property SUPERNATURAL BEING OF A CERTAIN KIND. It must be instead a different concept obtained by adjusting the encoded concept to the context.

<sup>6</sup>Mazzone (2011a, 2014a) argues for a generalization of this explanation (based on associative processing and schemata) beyond the limits of "primary pragmatic processes."

<sup>7</sup>As correctly pointed out by one of the referees, relevance theorists have developed a view of lexical pragmatics (with an important role for the notion of ad hoc concepts) that is not mainly based on ease of access. This view is in fact consistent with their general assumption that, although associative links may affect the accessibility of contextual assumptions and conclusions, the overall interpretation will only be accepted "if it satisfies the hearer's expectations of relevance and is properly warranted by the inferential comprehension heuristic" (Wilson and Carston, 2006, p. 429). I have discussed these proposals in more detail elsewhere (for RT's lexical pragmatics, see Mazzone, 2011a, 2014c; for ad hoc concepts, Mazzone, 2014a). My only point here is that, insofar as RT's lexical pragmatics ultimately depends on the inferential comprehension heuristic and expectations of relevance, it is crucial to understand how those expectations are assessed. In section Context and Relevance I raised a problem for RT's standard proposal based on the quantitative notion of expectations of relevance, while in the next section I argue that that problem can be avoided by adopting a different, qualitative, notion.

A natural explanation of this adjustment is precisely by means of a backward inference from the expected conclusion. Since Peter's question requires a yes/no answer, it can be thought to raise the expectation that Mary intends to claim either SALLY WILL LOOK AFTER THE CHILDREN IF WE GET ILL or its negation, and this expectation in turn licenses a backward inference toward the explicit content, which has to be coherent with either the affirmative or the negative claim. Thus, the concept ANGEL has to be adjusted until the explicit content provides a premise (for instance, SALLY IS KIND AND CARING) which has either the affirmative or the negative claim as its conclusion.

The example clearly shows how expectations about *specific cognitive effects* are involved in drawing backward inferences. This makes the notion of expected *type* (of cognitive effects) significantly different from the one of expected *amount*: while the former concerns specific contents that impose backward constraints on the content of the premises, the latter is devoid of any content and therefore can at most permit a comparison with the amount of actual cognitive effects. Another key difference is that the notion of backward inferences from expected cognitive effects admits of a natural explanation in terms of ease of access via schemata. In our previous example, the expected cognitive effect that Mary intends to give a yes/no answer to Peter depends on a well-learned schema connecting yes/no questions and yes/no answers. Peter's question is likely to activate this schema, which in turn activates the expectation about Mary's possible answer. On the contrary, with regard to the assessment of the amount of cognitive effects, RT provides no better explanation than the vague speculation about "symptomatic physico-chemical changes."

To summarize, we have described two alternatives to RT's standard claim that actual cognitive effects are assessed against expectations about their amount. Now it turns out that these alternatives are not only complementary but also explainable in terms of the same mechanism: ease of associative access and the schematic organization of memory (i.e., the organization of memory in "chunks"). In fact, expectations about specific kinds of cognitive effects apparently amount to associative activations of contextual conclusions via schemata. Thanks to this common mechanism, contextual assumptions and conclusions can be activated both by words constituting the utterance (via forward inferences) and by inputs from the linguistic and non-linguistic context (via backward inferences). In this perspective, instead of an assessment of the amount of cognitive effects against expectations of relevance, the process may be described as a mutual assessment of different predictions about the context. In other words, the suggestion is that hypotheses about the cognitive context are activated from different sources (utterance, linguistic and extra-linguistic context) and then assessed against each other, in a way that appeals only to ease of access (the negative side of relevance) and the organization of memory: hypotheses that are coherent with each other within the schematic organization of memory are activated more strongly and win the competition.
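The mutual assessment of hypotheses can likewise be sketched for example (2). This is a toy illustration, not an implementation of RT or of any model in the literature: the accessibility values and the single coherence link are assumptions made for the example, and `best_coherent_pair` is a hypothetical helper name. Candidate premises come from the word "angel" (forward direction), candidate conclusions from the yes/no schema activated by Peter's question (backward direction), and a pair is preferred when memory links its two members.

```python
from itertools import product


def best_coherent_pair(premises, conclusions, links):
    """Return the (premise, conclusion) pair with the highest joint activation."""
    def score(pair):
        premise, conclusion = pair
        # Accessibility of each hypothesis plus any schematic link between them.
        return premises[premise] + conclusions[conclusion] + links.get(pair, 0.0)
    return max(product(premises, conclusions), key=score)


# Assumed accessibilities: the encoded concept starts out more accessible
# than the adjusted one; the two expected answers are equally accessible.
premises = {"SUPERNATURAL_BEING": 0.6, "KIND_AND_CARING": 0.4}
conclusions = {"SALLY_WILL_LOOK_AFTER": 0.5, "SALLY_WILL_NOT_LOOK_AFTER": 0.5}
# Assumed schematic link: being kind and caring coheres with helping.
links = {("KIND_AND_CARING", "SALLY_WILL_LOOK_AFTER"): 1.0}

winner = best_coherent_pair(premises, conclusions, links)
print(winner)
```

Under these assumptions the adjusted concept wins even though it starts out less accessible than the encoded one, because it is the only premise that coheres with one of the expected answers. The selection compares hypotheses with each other and never measures an amount of cognitive effects.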

Let me briefly specify what this reconstruction amounts to, with regard to RT as a whole. The mechanism I have been describing (based on bottom-up activation of schemata, top-down activation of contextual information, and an assessment of these hypotheses on context against each other) is not intended to be an entirely alternative view of utterance interpretation. As I said, there are components of RT that I am explicitly endorsing, and others for which my proposal can be seen as an implementation from a neuro-computational perspective. In particular, I do not need to discuss the central core of the theory, that is, its rational reconstructions of the inferential structure leading from explicit meaning and a number of contextual assumptions to contextual implications. My proposal can rather be seen as a contribution to the understanding of this inferential mechanism, specifically, of how it is implemented by the basic activation/inhibition dynamic of the brain. My suggestion is that schemata at different levels of abstraction provide memory with the rational structure that is needed not only to activate explicit meaning and contextual assumptions (and implications), but also to assess which of these components of pragmatic inferences are coherent with each other and which are not.<sup>8</sup>

## **How Goals Enter into the Picture**

Now, I intend to argue that the above picture is entirely compatible with consideration of goals in utterance interpretation, along the lines suggested by Paul Grice. Grice (1989) has described utterance understanding as a rational enterprise. More precisely, in his view the hearer assumes that the speaker is a rational agent pursuing her communicative goals and producing utterances that can be inferentially interpreted by the hearer as means to express those communicative intentions. Thus, in a sense utterance understanding is a matter of reconstructing coherent means-end structures. In this perspective, Grice also makes an appeal to context as a way to make guesses about the speaker's goals, so as to license inferences backward to (what now is called) the explicit content of the utterance, as in the following example:

in cases where there is doubt, say, about which of two or more things an utterer intends to convey, we tend to refer to the context (linguistic or otherwise) of the utterance and ask which of the alternatives would be relevant to other things he is saying or doing, or which intention in a particular situation would fit in with some purpose he obviously has (e.g., a man who calls for a "pump" at a fire would not want a bicycle pump; Grice, 1957, p. 387).

In this example, since the context suggests the non-communicative goal of extinguishing a fire, the interpretation of a request for "a pump" is adjusted accordingly. A first thing to be stressed is the structural similarity with our previous example (2), where Peter's question can be said to play the same role played here by extra-linguistic context: it settles the goal thanks to which the explicit meaning of Mary's answer is adjusted (via a backward inference). That is, based on our knowledge of language we expect that Mary will adopt the goal of answering Peter's question affirmatively or negatively. Assuming she has that goal, Mary can be expected to provide an explicit content which is a proper means to pursue it.

But not only is there a structural similarity which allows us to describe both RT's and Grice's examples in terms of the

<sup>8</sup>For a wider discussion of this idea, see Mazzone (2014a).

retrodiction of means from contextually inferred goals. Moreover, with regard to Grice's example, it is natural to think that the man who calls for a "pump" has literally, not just as a figure of speech, the goal of extinguishing a fire. Having goals/intentions is legitimately considered constitutive of the notion of (intentional) action. If that is correct, in Grice's example the representation of an extra-linguistic goal is key to the pragmatic interpretation of the man's request. To the extent that this can be generalized, it seems that pragmatic processing needs to be embedded within a more general ability of mind-reading. This is explicitly recognized by Sperber and Wilson (2002), who approvingly mention Grice for having described human communication as a case of expression and recognition of intentions, define pragmatic interpretation as "an exercise in mind-reading" (Sperber and Wilson, 2002, p. 3), and propose in fact that the relevance-guided comprehension procedure is a "sub-module of the human mind-reading ability" (idem: 21). Although Sperber and Wilson do not draw such a conclusion, it seems reasonable to conclude that communicative intentions are embedded within wider goal structures and that this has a role to play in linguistic production and comprehension.<sup>9</sup>

Levinson (1992) has interestingly developed this idea in terms of the notion of "activity type." Activity types are defined as social patterns of goal-directed behaviors in specific settings, delivering as such expectations about what's going on next. Specifically, activity types raise expectations about the communicative actions to come. This means that communicative actions tend to be interpreted as moves in the current activity type, and therefore as something whose goals are expected to be sub-goals of the general activity. Levinson gives the following example: the sentence "C'mon Peter" may have a variety of meanings, but if one hears it during a basketball game it acquires a very clear sense, based on the kind of goal the speaker may have in that precise context. Other examples of activity types are trials and lessons, analyzed by Levinson in order to show that questions in English may have very specific uses (i.e., goals), which "are closely tied—indeed, derived from—the overall goals of the activities in which they occur" (idem: 82).

Let me summarize. Up to this point I have explored, mostly through an analysis of RT, the idea that utterance understanding is accomplished by a mechanism based on ease of access and the structure of memory. The key idea is that schemata in memory are activated (bottom-up) by multiple sources and then compete with each other for the (top-down) construction of cognitive contexts. I have also proposed that this process involves representation of goals.

In the rest of the paper, my purpose is to make this proposal both clearer and wider in scope by showing that a mechanism of the same sort has been invoked in a number of different cognitive domains.

## **Schemata and Top-down Processes in the Cortex**

## **Concepts**

As noted by relevance theorists, theories of memory assume that concepts are not isolated entities; they are organized instead in networks where some connections are stronger than others. Specifically, concepts are organized in chunks as a consequence of regular covariations, so as to ensure probabilistic coherence between them. For one example, Barsalou (2005) has argued for the notion of situated conceptualization, that is, the idea that conceptual representations in memory preserve information about specific settings in which the represented objects appear. On this background, Barsalou provides a nice formulation of the dynamic of activation between concepts and the situated conceptualizations they are embedded in:

The situated conceptualization that becomes active constitutes a rich source of inference. The conceptualization is essentially a pattern, namely, a complex configuration of multimodal components that represent the situation. When a component of this pattern matched the situation, the larger pattern became active in memory. The remaining pattern components – not yet observed – constitute inferences, that is, educated guesses about what might occur next. Because the remaining components co-occurred frequently with the perceived components in previous situations, inferring the remaining components is justified (Barsalou, 2005, p. 628).

It is easy to see that "patterns" are assigned here the same role played by schemata in our previous explanation of utterance understanding: a pattern or schema receives activation from any of its components and, once activated, it raises in turn the accessibility of its other components. Importantly, Barsalou's analysis is not concerned with utterance understanding; it is devoted instead to explaining the general functioning of concepts, specifically with regard to helping construct perception, predicting entities and events, supporting categorization, and providing inferences in general (idem: 621). Thus, it seems that our above explanation of utterance understanding is just a special case of a cognitive mechanism with a much wider scope.
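The inferential role that Barsalou assigns to patterns can be illustrated with a short sketch of pattern completion. The stored patterns and their components below are invented for the example (they are not taken from Barsalou, 2005), and matching by simple overlap is a deliberately crude stand-in for the frequency-based co-occurrence he describes.

```python
def complete_pattern(observed, patterns):
    """Activate the stored pattern that best matches what is observed,
    and return its unobserved components as inferences."""
    # The pattern sharing the most components with the observation wins.
    best = max(patterns, key=lambda name: len(patterns[name] & observed))
    # Components of the winning pattern not yet observed are educated guesses.
    return best, patterns[best] - observed


# Hypothetical situated conceptualizations stored in memory.
patterns = {
    "KITCHEN_BREAKFAST": {"kettle", "toaster", "mug"},
    "OFFICE_MEETING": {"projector", "laptop", "mug"},
}

active, inferences = complete_pattern({"kettle", "mug"}, patterns)
print(active, inferences)
```

Here the observed components play the bottom-up role and the returned set plays the top-down role: nothing about "toaster" was perceived, yet it becomes accessible because it belongs to the pattern that the perceived components activated.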

## **Language**

As a matter of fact, a very similar mechanism is invoked by Ray Jackendoff (2007a) in his proposal of a parallel architecture in language processing. The main idea is that the generative engine at work in language production and comprehension is not exclusively based on syntax. On the contrary, syntax is just one of the layers involved—thanks to their respective principles of organization—in the generative arrangement of linguistic materials. Crucially, Jackendoff abandons the assumption of a radical distinction between grammar and lexicon, which was based on the idea that while lexicon is constituted by *representations*, syntactic rules are implemented instead by specific *processes*, with the former being inert entities processed by the latter. His alternative proposal is that linguistic entities at any layer, including syntactic structures, are bits of information stored in long-term memory

<sup>9</sup>This proposal is further analysed in Mazzone (in press). One of the referees observes that RT has developed a complex account of the role of mindreading, metarepresentations and the mechanism of epistemic vigilance in utterance understanding. Although there is no room to address here in any detail the issue, my view might be intended as a proposal about the low-level implementation of mindreading (an associative account of mindreading is defended in Mazzone, 2014d).

and organized hierarchically, with higher levels prescribing the way in which items at lower levels must be arranged together. For each layer (syntax, semantics, phonology), the very same process of "unification" is held to be responsible for assembling specific items in accordance with the respective hierarchical organizations. Interestingly, Jackendoff's proposal is just the most prominent representative of a general trend within syntactic theory, of which even Chomskyan minimalism is an example: that is, the trend toward the substitution of representations for procedural rules. In other words, the weight of explanation for language processing is nowadays mostly placed upon the organization of (linguistic) memory, not upon specialized processes.

Against this background, Jackendoff describes the syntactic arrangement of a sentence as the result of a double movement: on the one hand, an initial word sets up "grammatical expectations" about the possible sentence structures, based on the syntactic patterns associated with that word at higher levels of the hierarchy; on the other hand, "further words in the sentence may be attached on the basis of the [previously activated] top-down structure" (Jackendoff, 2007a, p. 8). This amounts to the dynamic of bottom-up activation of schemata and top-down activation of their other components that is by now familiar to us. It is no surprise, then, that Jackendoff characterizes the process as non-directional, such that it may work "from the bottom up or from top down or from anywhere in the middle" (idem: 8), and as based on competition between (and mutual inhibition of) alternative hypotheses, as in our previous description of pragmatic processing.
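The bottom-up/top-down dynamic just described can be made concrete with a deliberately minimal sketch. It is not Jackendoff's unification mechanism: the three clause patterns, the overlap-based scoring, and the normalization used to stand in for competition are all illustrative assumptions introduced here.

```python
# Toy schemata (hypothetical): each pattern lists the components it organizes.
SCHEMATA = {
    "intransitive_clause": {"subject", "verb"},
    "transitive_clause": {"subject", "verb", "object"},
    "ditransitive_clause": {"subject", "verb", "object", "recipient"},
}


def bottom_up(observed):
    """Activate each schema by its overlap with the observed items (Jaccard)."""
    return {
        name: len(parts & observed) / len(parts | observed)
        for name, parts in SCHEMATA.items()
    }


def compete(scores):
    """Normalize activations so that support for one schema weakens its rivals."""
    total = sum(scores.values())
    return {name: (s / total if total else 0.0) for name, s in scores.items()}


def top_down(schema, observed):
    """Return the components the selected schema still leads us to expect."""
    return sorted(SCHEMATA[schema] - observed)


# Items encountered so far, in no particular order (the process is non-directional).
observed = {"verb", "object", "recipient"}
activation = compete(bottom_up(observed))
winner = max(activation, key=activation.get)
expected = top_down(winner, observed)
print(winner, expected)
```

In this toy setting the items already encountered activate the competing patterns to different degrees, and the most active pattern then supplies an expectation about what is still missing. Nothing depends on the order in which the items arrive, which is the sense in which the process can start "from anywhere in the middle."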

## **Hierarchies in Action**

Jackendoff's theory of parallel architecture shows very convincingly how, as far as language is concerned, hierarchical organization of representations is apt to explain generative processing. But hierarchical representations have been taken to explain the generative nature of action as well.

The similarity between language and action with regard to their common generative nature is explicitly addressed in Jackendoff (2007b) and is largely recognized in psychological and neuroscientific theories of action (see Mazzone, 2014b, for a review). For one example, Baars and Gage (2010) observe that making plans for the future requires the ability to reconfigure elements of prior experiences in a way that does not exactly copy past experiences. This ability, they claim, is apparent in tool-making, one of the fundamental features of primate cognition, but "the generative power of language to create new ideas depends on this ability as well" (Baars and Gage, 2010, p. 402). According to the authors, "the ability to manipulate and recombine internal representations depends critically on the PFC [prefrontal cortex], which probably made it critical for the development of language" (idem: 402). We will turn below to this suggestion about PFC.

There is much research, in particular, on the relationship between hierarchical representations and generative processing in action understanding. Baldwin and Baird (2001, p. 171), for instance, claim that a "generative knowledge system underlies our skill at discerning intentions, enabling us to comprehend intentions even when action is novel and unfolds in complex ways over time" and suggest that this system "is probably just as rich and complex as the generative system underlying language" (idem: 171). They cite evidence that children can parse continuous actions along intention boundaries. However, they claim, the ability to parse and process hierarchically organized actions applies more generally:

Adults also appear to process continuous action streams in terms of hierarchical relations that link smaller-level intentions (e.g., in a kitchen cleaning-up scenario: intending to grasp a dish, turn on the water, pass the dish under the water) with intentions at higher levels (intending to wash a dish or clean a kitchen; Baldwin and Baird, 2001, p. 172).

The idea of a strict analogy (together with common neurological bases) between hierarchical structures in language and action is further developed by Pastra and Aloimonos (2012), who offer some detailed examples of how actions can be analyzed in terms of parse trees, within the framework of "a biologically inspired generative grammar of action, which employs the structure-building operations and principles of Chomsky's Minimalist Program as a reference model" (Pastra and Aloimonos, 2012, p. 103).

Moreover, Glenberg and Gallese (2012) show how a mechanism that is firmly grounded in the study of motor control might have "been exploited for language learning, comprehension and production" (idem: 905). Their proposal is based on HMOSAIC (Haruno et al., 2003), which is a hierarchical version of MOSAIC, a model-based theory of motor control developed by Wolpert et al. (2003). Haruno et al. (2003) have demonstrated that, within such a hierarchical architecture, higher-level layers "can learn to select the basic motor acts and learn the appropriate temporal orderings of those acts" (Glenberg and Gallese, 2012, p. 910). The whole mechanism is explicitly described as associative, but the hierarchical structure allows nonetheless for abstract representations, standing as a whole for intentions of the agent: in practice, while at the lowest level in the model motor acts are simply chained with each other so that any of them triggers the next one, higher-level representations provide abstract patterns that capture action structure and timing more explicitly.

Let me summarize. In all of these approaches to action, flexible and generative processing is explained by means of hierarchical representations, in which patterns at higher levels prescribe predictable arrangements at lower levels. As should be clear, those accounts place the explanatory weight on the organization of memory, not on specialized processes; in some cases, simple associative processing is explicitly mentioned as the appropriate mechanism for memory acquisition and exploitation. This picture is entirely compatible with the above considerations on concepts and language processing, and with our previous account of pragmatic understanding. On the other hand, as we saw, consideration of action brings into focus notions such as goal and intention. It is therefore opportune to analyze in some detail how these notions are related to our key notion of schema.

### **Schemata and Goals**

It is reasonable to think that goals and intentions are complex entities, whose representation involves a number of components of different nature.<sup>10</sup> However, for our purposes we can confine our attention to a simplified notion of goal/intention, along the lines of the above considerations on action. The idea, implicit in Baldwin and Baird (2001), Pastra and Aloimonos (2012), and Glenberg and Gallese (2012), is that the goal underlying an action is the endpoint of that action, with more complex actions being constituted by a sequence of smaller actions each of which is a means to (and a sub-goal of) the overarching goal, while actions at the bottom of the hierarchy are constituted by simple motor acts.

<sup>10</sup>Mazzone (2011b) proposes that goals can be analysed in terms of (a) motoric and perceptual representations of end-states; (b) attributions of value to those

There are two points to this idea. The first concerns the existence of goal-directed patterns in memory, the second the thesis of a hierarchical structure of goals in the cortex.

As to the first point, Glenberg and Gallese (2012) argue, as we saw, that higher layers in HMOSAIC contain abstract patterns capturing the structure of actions. Based on our previous definition of schemata as the higher-level representations responsible for the organization of items at lower levels, such patterns can be legitimately considered as schemata. In the psychological, computational and neuroscientific literature on action, the existence of goal-directed patterns of this sort is commonplace. The most explicit defense of this claim—actually expressed in terms of the existence of "hierarchical schemas and goals in the control of sequential behavior"<sup>11</sup>—is provided by Cooper and Shallice (2006), mostly on the basis of computational considerations.<sup>12</sup> They adopt the notion of schema proposed by Bartlett (1932) and further developed by Rumelhart and Ortony (1977) among others, according to which a schema is a self-contained memory structure with a variable number of component parts. In their words, as far as action control is concerned,

a schema may be seen as a means of achieving a goal or subgoal. More generally, recent computational accounts [*. . .*] take schemas to be goal-directed structures, with goals serving to mediate schema–subschema relationships. Thus, schemas achieve goals and, apart from at the lowest level of the schema hierarchy, consist of partially ordered sets of subgoals (which may themselves be achieved by other schemas; Cooper and Shallice, 2006, p. 888).

Consistently, the authors describe the role that schemata play in action control in terms of the bottom-up/top-down dynamic we considered above: "Schemas are explicit and play a causal role in determining behavior: Excitation and subsequent selection of a schema cause excitation and then selection of subschemas or actions" (idem: 892).

The hierarchical organization of schema and goal representations is claimed to account for flexibility of sequential behavior (idem: 887), an issue to which I will return in a moment. However, contextual flexibility is also explained by Cooper and Shallice by appealing to optional elements in schema representations. This would allow schemata to be highly context-sensitive, since optional subgoals can either be activated or not on any particular occasion as a function of the context in which the schema is performed (idem: 897). In order for this to be possible, schemata should also contain representations of the contextual cues whose excitation causes the activation of optional subgoals. The representation of contexts is explicitly mentioned by Badre (2013) as a component of what, in the literature on reinforcement learning of actions, is called a "policy," that is, a rule that relates an action, a desired outcome and *a state in which the rule has to be applied*. This notion of context is clearly more specific than the one involved in our previous suggestion that schemata provide hypotheses about context. The point here is the specific requirement that certain situational cues must be present in order for certain goals to be pursued. Based on these considerations, we can describe a goal-directed schema as constituted by a final goal, a number of subgoals (or actions that are means to that goal), and some specification of the conditions in which both the final goal and the subgoals apply. In another, wider sense goal-directed schemata, as schemata in general, are chunks in memory providing appropriate contexts for each of their components.
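The description of a goal-directed schema given above (a final goal, a set of subgoals, and conditions of application, with optional subgoals triggered by contextual cues) can be rendered as a small data structure. The sketch below is only an illustration under stated assumptions: the class names, the coffee-making example, and the cue labels are hypothetical and are not taken from Cooper and Shallice's model.

```python
from dataclasses import dataclass, field


@dataclass
class Subgoal:
    name: str
    optional: bool = False
    # Contextual cues whose presence activates an optional subgoal.
    cues: frozenset = frozenset()


@dataclass
class Schema:
    goal: str
    # Situational cues that must be present for the goal to be pursued.
    conditions: frozenset
    subgoals: list = field(default_factory=list)

    def applies(self, context):
        """The schema applies only if all its conditions hold in the context."""
        return self.conditions <= context

    def plan(self, context):
        """List the subgoals selected in this context, in their stored order."""
        if not self.applies(context):
            return []
        steps = []
        for sg in self.subgoals:
            triggered = bool(sg.cues) and sg.cues <= context
            if not sg.optional or triggered:
                steps.append(sg.name)
        return steps


# Hypothetical example schema with one optional subgoal.
make_coffee = Schema(
    goal="drink_coffee",
    conditions=frozenset({"in_kitchen"}),
    subgoals=[
        Subgoal("add_coffee"),
        Subgoal("add_sugar", optional=True, cues=frozenset({"takes_sugar"})),
        Subgoal("stir"),
    ],
)

plain = make_coffee.plan({"in_kitchen"})
sweet = make_coffee.plan({"in_kitchen", "takes_sugar"})
elsewhere = make_coffee.plan({"in_office", "takes_sugar"})
```

The same schema yields different sequences in different contexts, and no sequence at all when its conditions of application are not met, which is the kind of context sensitivity that optional subgoals are meant to provide.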

The other important point, in the view of action as driven by hierarchies of goals, concerns the question of whether such hierarchies are actually present in the brain, an issue that is better addressed against the background of a general understanding of brain architecture.<sup>13</sup>

## **The Architecture of the Brain and the Prefrontal Cortex**

As recently recalled by Badre (2013), Fuster (2001, 2003, 2014) was the first to associate a concept of abstraction in action control with the functional organization of frontal cortex. There is today some evidence that the hierarchical structure of goal-directed motor actions correlates with specific neurological regions (Hamilton and Grafton, 2006; Koechlin and Jubault, 2006; Grafton and Hamilton, 2007; Koechlin and Summerfield, 2007; Badre, 2008, 2013; Botvinick, 2008; Botvinick et al., 2009; O'Reilly, 2010). This suggests, in Botvinick's (2008, p. 205) words, that "a topographical organization might exist within the frontal cortex and the DLPFC [dorsolateral prefrontal cortex], according to which progressively higher levels of behavioral structure are represented as one moves rostrally." For one example of these studies, Koechlin and Jubault (2006, p. 936) report evidence from magnetic resonance imaging showing "phasic activation at the boundaries of action segments that constitutes a hierarchical action plan"; on this basis, they propose that Broca's area and its homolog in the right hemisphere might "implement a specialized executive system governing action selection in hierarchically structured action plans."

representations by the reward system; (c) representation of means to those ends together with appropriate contexts (including an appreciation of the fact that, for a given end-state, different means are needed in different contexts). Moreover, intentions are usually thought of as consciously attended goals.

<sup>11</sup>This is in fact the title of the paper.

<sup>12</sup>The defence of goal-directed schemata in Cooper and Shallice (2006) is part of a larger debate, markedly with Botvinick and Plaut, about symbolic and connectionist models of action representation. Interestingly, in their reply to Cooper and Shallice (2006), Botvinick and Plaut (2006) admit that schemata and goals need to be represented somehow; they only object that "it is too strong to say [that their own model] is eliminativist with respect to task and subtask representations (i.e., schemas), it is true that the relevant patterns of activation may be more difficult to isolate within [their model than in the one proposed by Cooper and Shallice]" (Botvinick and Plaut, 2006, p. 921). Moreover, they argue for a "quasi-hierarchical structure" of action representation (idem: 922), that is, a structure in which there is a balance between hierarchy and context sensitivity (I will say something more in a moment on context sensitivity in hierarchical representations). In sum, none of the claims we report here from Cooper and Shallice (2006) is really disputed by Botvinick and Plaut (2006).

<sup>13</sup>One of the referees has correctly pointed out that there is neuroscientific literature on pragmatic processing and the interplay between pragmatics and intention recognition, involving different areas than the PFC, which is not accounted for in this paper (see, for instance, Catani and Bambini, 2014; Hagoort and Levinson, 2014). However, I want to emphasize that the purpose of the next section is not to address the neuroscience of pragmatics; it is instead to show that neuroscience too has proposed a hierarchical organization of representations, in line with cognitive theories of concepts, language, and action. This is further support for my general view of context construction as based on a bottom-up/top-down dynamic of activation in hierarchical representations. Thus, what I am interested in is theorizing (together with the supporting evidence) about hierarchical representations in the brain. The PFC is especially well studied in this regard, in particular in connection with the issue of goal representation, and this is why I focus my attention on it. This said, the issue of how the PFC and other cortical areas contribute to the representation of intentions and goals undoubtedly requires further investigation.

Although the focus of those studies is on hierarchical representations of action in the frontal/prefrontal cortex, it should be noticed that on Fuster's account hierarchical organization is a general phenomenon concerning the entire brain:

The physiology of the cerebral cortex is organized in hierarchical manner. At the bottom of the cortical organization, sensory and motor areas support specific sensory and motor functions. Progressively higher areas—of later phylogenetic and ontogenetic development—support functions that are progressively more integrative. The prefrontal cortex constitutes the highest level of the cortical hierarchy dedicated to the representation and execution of actions (Fuster, 2001, p. 319).

In other words, Fuster proposes that the brain is organized along two distinct—though highly interconnected—pathways, respectively constituting a sensory and a motor hierarchy of cortical maps, which together form a perception-action cycle. The PFC lies at the top of the motor hierarchy and it seems to contain neuronal networks that, both in monkeys and in humans, represent abstract programs or plans of action (Fuster, 2003, p. 76).

Two considerations are worth noting.

First, the above literature on action control emphasizes the role that hierarchies may play in flexibly dealing with large spaces of options. As Badre (2013) specifically notes, hierarchies permit a divide-and-conquer approach such that, on the one hand, choices about which actions to take can be made at multiple levels of abstraction, while, on the other hand, choices at the higher levels constrain the space of possible actions at lower levels. Compare this with a situation in which an inflexible routine has to be performed, and a single set of criteria for its application has to be coded and then assessed against the factual context. On the contrary, on the hierarchical account each component at lower levels has its own set of application criteria, and the selection of goals at the higher levels is the result of parallel activation of (and competition between) components at lower levels, with substantial gain in contextual flexibility. But this applies not only to goal selection in the frontal cortex: if Fuster (and our whole picture of the functioning of concepts, language and motor control) is right, the mechanism of bottom-up/top-down activations along hierarchical representations extends to the entire cortex, thus accounting for contextual flexibility in a wide range of cognitive processes.

Second, since we described the prefrontal cortex as the seat of hierarchical representations, one might wonder whether this is compatible with the well-established view according to which this area has a crucial role to play in executive processes. As a matter of fact, a "representational" rather than "processing" approach to the PFC has gained consensus in the last decade (Huey et al., 2006; Miller et al., 2002; Wood and Grafman, 2003), in line with the influential model of executive functions proposed by Miller and Cohen (2001). As they observe, "one of the most fundamental aspects of cognitive control and goal-directed behavior [is] the ability to select a weaker, task-relevant response (or source of information) in the face of competition from an otherwise stronger, but task-irrelevant one" (Miller and Cohen, 2001, p. 170). Now, Miller and Cohen's suggestion is that the PFC contains patterns of activity which map onto configurations of representations in more posterior cortical areas. When such a pattern within the PFC is activated, this increases the activation of the posterior configuration it is connected to and allows that configuration to overcome task-irrelevant competing ones. In other words, plans of action in the PFC are here conceived as schemata, whose activation is transmitted to their components distributed in different cortical areas. This does not necessarily mean that the spreading of activation up and down the hierarchy is all there is to executive functions. An influential proposal made by Dehaene et al. (2006) is that self-sustaining loops play a crucial role in the neural dynamic, to the extent that they prevent the rapid decaying of spreading activation; more specifically, Dehaene et al. (2006) claim that consciousness depends on the establishing of such loops between strongly activated sensory-motor representations and higher association cortices. This might explain how prefrontal activation ensures stability of processing in accordance with current goals and tasks of the agent: thanks to recurrent loops, plans of action within the PFC might sustain the activation of related sensory-motor representations for the time needed to attain the goals. Under this account, there is no inconsistency between the suggestion that the PFC is the top of the hierarchy of representations in the cortex and the widespread opinion that it is key to conscious processing.
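The biasing idea attributed above to Miller and Cohen, whereby an active prefrontal pattern lends extra activation to a weaker but task-relevant response, can be illustrated with a very small sketch. The two competing responses, the numerical activations, and the additive form of the bias are assumptions made for illustration only; the scenario is loosely modeled on a conflict between reading a word and naming its ink color.

```python
def select_response(bottom_up, task_bias=None):
    """Return the response with the highest summed activation.

    bottom_up: activation each response receives from the stimulus alone.
    task_bias: extra activation sent top-down by an active task schema.
    """
    task_bias = task_bias or {}
    total = {r: a + task_bias.get(r, 0.0) for r, a in bottom_up.items()}
    return max(total, key=total.get)


# Hypothetical activations: the habitual response is stronger by default.
stimulus_driven = {"read_word": 1.0, "name_color": 0.6}

# Without an active task schema, the stronger response wins.
without_control = select_response(stimulus_driven)

# With a schema for the color-naming task active, its component is boosted.
with_control = select_response(stimulus_driven, {"name_color": 0.7})
```

The sketch captures only the selection step. It leaves out the recurrent loops that, on the proposal of Dehaene et al. (2006), keep the top-down bias active for as long as the task requires.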

Executive functions are a third sense in which processes are usually said to be top-down. First, low-level processing can be constrained by higher-level schemata of various kinds; second, it can be specifically driven by plans of action, that is, by goal-directed schemata lying at the top of the perception-action cycle; third, it can be under the control of action plans in circumstances in which those plans and sensory-motor representations form self-sustaining loops. I claimed above that pragmatic processing is affected by top-down processing in the first two senses: in utterance interpretation, hypotheses about the cognitive context are constructed by exploiting the schematic organization of memory and, specifically, by activating goals within which the current communicative intention is embedded. I would like to suggest, though only in passing, that top-down processing in the third sense might have a role to play in utterance understanding as well. For instance, since in the normal case the speaker is consciously attended by the addressee, speaker-related information is likely to receive prominent activation in the course of utterance understanding (with consequences that are analyzed at some length in Mazzone, 2013).

## **Conclusion**

The main thesis defended in this paper is that, in understanding an utterance, the organization of memory is what essentially drives the construction of the appropriate cognitive context. More specifically, in the present account contextual assumptions and conclusions are provided by schemata, which are activated associatively by a variety of inputs (the utterance, its linguistic and situational context) and then assessed against each other. Goals have a crucial role to play in this process, insofar as goal-directed schemata are the highest levels in our cortical hierarchy of representations. I showed that this picture is consistent with suggestions made by RT and by Recanati, and with influential accounts of concepts, language, and action control. I also provided reasons to think that this hierarchical organization of memory and the related mechanism of bottom-up/top-down activation can account for generative processing and contextual flexibility.

The relation between the present account and RT invites some final comments. As I said above, despite the suggestions developed here, in its most general formulations RT takes a different view, based on expectations about the amount of cognitive effects and assessment of their actual amount against those expectations (in what follows, I will call this the "standard view"). In particular, let us focus on the fact that goal understanding plays no explicit role in this view. It is interesting to consider what Sperber and Wilson (1987) have to say on this issue:

Some commentators [*. . .*] think our definition of relevance fails to do justice to pretheoretical intuitions. Utterances are relevant, they feel, to purposes, goals, topics, questions, interests, or matters in hand.

We define relevance in a context and to an individual. We say what a context is, how it is constructed and how, once constructed, it affects cognition and comprehension. One reason we did not set out to define relevance to a purpose, goal, and so on, is that we had no idea how to answer the analogous questions for any of these terms [*. . .*]. Given a definition of relevance in a context, and a method of context construction, however, there is no reason that assumptions about the goals and purposes of the individual, or of the participants in a conversation, should not form part of the context and give rise to contextual effects in the usual way. Such assumptions are likely to be particularly rich in contextual effects, since purposes and goals imply plans for action. We see no incompatibility between our work and a belief in the importance of goals, purposes, and plans; on the contrary, RT sheds light on how these important notions may play the roles they play (Sperber and Wilson, 1987, p. 742).

The suggestion is that explaining comprehension directly in terms of goals is at least a difficult (and perhaps an impossible) enterprise. Most of all, according to the authors such explanation is not needed anyway, since RT can account for the importance of goals in comprehension without any explicit mention of them. However, the standard view might succeed in this ambition only by providing a satisfying account of how the amount of cognitive effects is assessed, while in fact we are left with no better explanation of this than the speculation about "symptomatic physicochemical changes." On the other hand, we provided here at least the general sketch of an explanation of comprehension based on schemata and goals, which is in fact consistent with the following ideas of RT: the interpretation process requires the construction of an appropriate cognitive context; this depends on the organization of memory, which determines the ease of access of contextual assumptions and conclusions; a mutual adjustment occurs between explicit meaning, contextual assumptions and contextual conclusions; specifically, backward inferences are based on expectations about the type of intended cognitive effects. The account of comprehension developed in this paper along those lines appears better grounded than the standard view, if only for the following two reasons.

First, it makes an appeal not to controversial claims about sensitivity to cognitive costs and effects, but instead to well-established cognitive facts (mechanisms of associative access and the organization of memory in chunks), which can be argued to play a key role in theories of concepts, language, and action control, and specifically in the explanation of contextual flexibility in those domains.

Second, this account embeds utterance understanding within a general ability to understand goals, in line with Grice's view and in accordance with explicit claims made by Sperber and Wilson. Interestingly, Sperber and Wilson's notion of relevance is, in a sense, a reinterpretation of Grice's maxim of quantity. There is, though, a clear difference between the two as to how they conceive the purpose of communication: while the notion of relevance assumes that the speaker aims to be as informative as possible (compatibly with considerations of effort), the maxim of quantity prescribes instead that the speaker is "as informative as is required (*for the current purposes* of the exchange)" (Grice, 1975, p. 45; emphasis mine). In other words, in Grice's account the amount of information exchanged is not a purpose in itself; it is instead a means for pursuing other goals. From this point of view, the notion of relevance proposed by Sperber and Wilson seems to fall back into a pre-pragmatic view in which communication is conceived as instrumental not to the variety of human actions and goals, but instead to the acquisition and transmission of knowledge *per se*. On the contrary, in line with Fuster's proposal of a perception-action cycle, the view defended here is that communication in particular, as well as cognition in general, is geared to goal management and action (instead of to maximization of information). This makes communication, in Tomasello's (2008, p. 49) words, an exercise in "practical reasoning."

It is, I maintain, RT's notion of relevance that is in the end responsible for the problems affecting the standard view. The point is that it is very difficult (and perhaps impossible) to give a sensible cognitive instantiation to the idea of maximization of information. If we abandon this idea, even the above quotation may make new sense. As Sperber and Wilson say, "there is no reason that assumptions about the goals and purposes of the individual [*. . .*] should not form part of the context and give rise to contextual effects." In fact, I maintain, goal representations are part of our repertoire of schemata in memory and they can contribute to determine context via backward inferences. But this is because communication is essentially a goal-oriented activity.

In sum, my claim is that the quantitative notion of relevance and the related idea of a quantitative assessment of cognitive benefits raise serious problems. In my view, the good news for RT is that large parts of the theory stay unaffected by these problems.

## **References**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Mazzone. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Context as Relevance-Driven Abduction and Charitable Satisficing

### Salvatore Attardo\*

*College of Humanities, Social Sciences, and Arts, Texas A&M University-Commerce, Commerce, TX, USA*

It has been widely assumed that the full meaning of a linguistic expression can be grasped only within a situation, the context of the utterance. There is even agreement that certain factors within the situation are particularly significant, including gestures and facial expressions of the participants, their social roles, the setting of the exchange, the objects surrounding the participants, the linguistic, cultural and educational backgrounds of the participants, their beliefs, including those concerning the situation, the social procedures and conventions that regulate the situation. Finally, there is some agreement that context is dynamic, reflexive (the speakers are mutually aware of their beliefs), not limited to linguistic actions, and last but not least, a psychological construct. This definition of context is not (very) controversial, but it leaves out two major problems, which will be addressed in this paper: how is context arrived at? And, since a perfectly natural interpretation of the above definition could be that the context of each utterance is the entire universe, how is the relevant context delimited? Four related concepts will provide the answer to both questions: abductive reasoning, driven by relevance and cooperation, and bounded rationality and the principle of charity. Simply put, context is derived abductively by the speakers assuming that for the speakers to behave the way they behave and do so rationally, a given context must be available to them. The context is bounded by the simple requirement that speakers not try to optimize their interpretation/calculation, but rather satisfice, i.e., find the first acceptable solution, and by the need to follow the principle of charity, which forces intersubjective agreement. Thus, abductive reasoning and bounded rationality will be shown to be sufficient to calculate the relevant context of utterances (or other rationality-driven interactions) and to effectively delimit the potentially infinite search space that must be explored to do so.

### Edited by:

*Gabriella Airenti, University of Torino, Italy*

### Reviewed by:

*Maurizio Tirassa, University of Torino, Italy*

*Pietro Perconti, University of Messina, Italy*

\*Correspondence: *Salvatore Attardo salvatore.attardo@tamuc.edu*

#### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *12 August 2015* Accepted: *17 February 2016* Published: *08 March 2016*

### Citation:

*Attardo S (2016) Context as Relevance-Driven Abduction and Charitable Satisficing. Front. Psychol. 7:305. doi: 10.3389/fpsyg.2016.00305*

Keywords: context, linguistics, pragmatics, cooperative principle, principle of charity, relevance, abduction, satisficing

I would like to begin discussing context by using a metaphor.<sup>1</sup> As is well known, metaphors have heuristic powers, which will help us in this complex and fraught subject. The metaphor is that studying context is akin to studying non-foveal vision. Peripheral (non-foveal) vision is quite important in many situations (for example, in driving one becomes aware of the presence of a car passing to the left or right through non-foveal vision, at first). This image helps us realize that context exists only in opposition to a text. A context never exists by itself. It exists because it is something other than the text. However, the metaphor is even more interesting, because it highlights another feature of context: if we notice something in our non-foveal vision and we shift

1The metaphor is used also in Schegloff (1992: p. 223).

of visual angle to focus on the thing (for example, by turning one's head to look at the car passing us on the right lane) then the object is no longer part of the non-foveal vision, but it is now in the foveal vision angle. To put it differently, one cannot study peripheral vision by focusing on it, because the act of focusing on it changes radically the nature of the thing to be observed.

As we will see in the present paper, most of the history of the research on context has consisted, roughly speaking, of focusing our gaze in the general direction of the object glanced at in peripheral vision and then trying to describe or enumerate the salient features of what is seen. I will argue that that approach misses largely, but not completely, the point. The remainder of the paper will be organized in two main parts: the first one will provide a cursory and non-representative, but nonetheless enlightening review of definitions of context, primarily within linguistics, but with some extracurricular forays. The second part will present the constructive side of the paper, presenting some of the tools needed to derive and bound context. I should stress that the present discussion should not be read as antagonistic to but rather as complementary to traditional definitions of context.

## A PARTIAL HISTORY OF CONTEXT

There are as many approaches to context as there exist disciplines in the humanities and the social sciences. It would be unrealistic to attempt to encompass them all. Therefore, I have settled for presenting a largely linguistic overview of definitions of context, to the detriment of other disciplines, such as psychology, philosophy, and phenomenology.

## Context Free Grammars

We will begin this review of definitions of context, perhaps perversely, with the zero-degree of the term, or more specifically with Bloomfield's rejection of the tractability of the very idea of meaning, let alone context. Bloomfield famously rejected mentalistic psychology and espoused behaviorism. For Bloomfield, since the meaning/context continuum is potentially infinite, it is eo ipso intractable:

We have defined the meaning of a linguistic form as the situation in which the speaker utters it and the response which it calls forth in the hearer.... In order to give a scientifically accurate definition of meaning for every form of a language, we should have to have a scientifically accurate knowledge of everything in the speakers' world. The actual extent of human knowledge is very small compared to this.... The statement of meanings is therefore the weak point in language-study, and will remain so until human knowledge advances very far beyond its present state (Bloomfield, 1933, pp. 139–140; my italics, SA).

Bloomfield believed that sub-morphemic analysis of meaning was impossible: "There is nothing in the structure of morphemes like wolf, fox, and dog to tell us the relation between their meanings; this is a problem for the zoölogist." (1933: p. 162). However, as Langendoen (1998) points out, by the time the American Structuralist school had reached its peak, submorphemic analyses of semantics were being performed (e.g., Goodenough, 1956). The view that the semantic content of a morpheme can be broken down into semantic features was incorporated in Katz and Fodor's (1963) semantics, which became the de facto semantics of generative grammar (Chomsky, 1965). As the very name of the type of grammar strongly suggests, context-free grammars were not sensitive to, or interested in, context. The idea behind context-free grammars is that rewriting rules and transformations do their work regardless of the context in which they occur. NP rewrites as Art + N, regardless of whether the NP is the first or the last of a sentence.
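The context-free character of such rewriting rules can be illustrated with a minimal sketch. The toy grammar below is a hypothetical example built around the NP rule mentioned above; the `RULES` table and the `rewrite` function are assumptions made for illustration and are not drawn from any of the works cited.

```python
# Hypothetical toy grammar: each nonterminal has a single expansion.
RULES = {
    "S": ["NP", "V", "NP"],
    "NP": ["Art", "N"],
}


def rewrite(symbols):
    """Expand every nonterminal until only unexpandable symbols remain."""
    result = []
    for symbol in symbols:
        if symbol in RULES:
            # The expansion is chosen by looking at the symbol alone,
            # never at its neighbours or its position in the string.
            result.extend(rewrite(RULES[symbol]))
        else:
            result.append(symbol)
    return result


expanded = rewrite(["S"])
print(expanded)  # ['Art', 'N', 'V', 'Art', 'N']
```

Both NPs, the first and the last of the sentence, receive the same expansion, because the rule has no access to anything outside the symbol it rewrites.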

Generative semantics attacked the context-free nature of generative grammar "semantics" (the scare quotes acknowledge the reluctance that many generative grammarians would have had in using the term) using a barrage of examples such as the following:

(1) "John called Mary a republican and then she insulted him." (Lakoff, 1971: p. 333)

Example (1) above, in the emphatic prosody reading, works only if we assume that Mary considers "Republican" an insult. Or to put it differently, we need to know what Mary's state of mind is, in order to decide on the intonation of the sentence. Clearly, someone's state of mind cannot be part of the morphemic meaning of a sentence.

Worse, even such a concept as grammaticality, one of the core ideas of generative linguistics, could be shown to depend heavily on context, with examples such as the following:

(2) "Kissinger conjectures poached." (McCawley, 1976/1979)

Example (2) would be rejected by most speakers of English as non-grammatical, unless they can imagine it as the answer to the question: "How does President Nixon like his eggs in the morning?"

As is well known, generative semantics self-destroyed (Harris, 1993) and was reborn as pragmatics. We take up that thread next.

## Pragmatics

The linguistic tradition that Bloomfield was reacting against in 1933 came from such German thinkers as Humboldt and Wegener. Humboldt makes it very clear that, for him, meaning does not come just from the forms of language but from the "act of speaking" (see Nerlich and Clarke, 1996: p. 53). Wegener (1885) claims that interpretation depends on the "situation" and named his entire theory "Situationstheorie." Wegener contemplates three types of "situations": the objective observations (views), the elements associated with the situation by memory, and the (self-) awareness of the participants.


Other factors in Wegener's definition are the "ongoing or just completed activity" and the Kultursituation [historical culture] (Knobloch, 1991: p. XVI). Wegener also anticipates speech act theory and Gricean pragmatics, witness the following quote:

Wegener was among the first to realize that speaking and understanding are preconditioned by and embedded in practical action and also dependent on the cooperation among the speakers (Knobloch, 1991: p. XVI).

The German tradition of seeing linguistic meaning as part of a broader context found fertile ground in the work of Malinowski. Malinowski is considered the founder of the pragmatic concept of context: he is considered the "first to use [the term] context in a systematic way" (Nerlich and Clarke, 1996: p. 316). The following quote makes the central role of the pragmatic context in Malinowski's thought very clear:

language [...] has an essential pragmatic character [...] it is a mode of behavior, an indispensable element of concerted human action (Malinowski, 1923: p. 316)

whereas the following is a definition of context, from Malinowski's masterpiece, Coral Gardens and Their Magic:

it is very profitable in linguistics to widen the concept of context so that it embraces not only spoken words but facial expression, gesture, bodily activities, the whole group of people present during an exchange of utterances and the part of the environment on which these people are engaged (Malinowski, 1935 vol. II: p. 22)

Malinowski's influence, concerning the concept of context, on the London School cannot be exaggerated. Raymond Firth, in a 1956 piece on Malinowski, states that Malinowski "has been one of the outstanding influences in shaping modern British social anthropology" (Firth R., 1956: p. 1). Both J. R. Firth and Halliday will be influenced by Malinowski's definition, but also Hymes' (1972) definition of ethnography of speaking is strongly reminiscent of Malinowski's definition of context. Senft (2007) goes so far as claiming that

the factors Hymes (1972: p. 65) summarizes in his famous acronym SPEAKING—"settings, participants, ends, act sequences, keys, instrumentalities, norms," and "genres"—are not only constitutive for the 'ethnography of speaking'paradigm but also for Malinowski's "context of situation" (Senft, 2007: p. 148)

J. R. Firth acknowledges Wegener's and Malinowski's influence very directly: "The key concept of the semantic theory he [Malinowski] found most useful for his work on native languages was the notion of context of situation" (Firth J. R., 1956: p. 101).

Firth's own definition of "context of situation" is as follows:

the linguistic text (...) finds a place and function in relation to other categories such as the participants, relevant non-verbal behavior, relevant objects and effect or result (Firth, 1957: p. 7)

Within the London school, which comprises Malinowski and Firth, the author who has had the broadest impact on linguistics is probably Halliday, who, somewhat ironically, emigrated to Australia in the mid-1970s. Halliday's definition of context is formulated in terms of cultural meanings, but is also influenced by Firth, for example in the insistence that speech is an act of meaning:

Context is (...) a construct of cultural meanings, realized functionally in the form of acts of meaning in the various semiotic modes, of which language is one. The ongoing processes of linguistic choice, whereby a speaker is selecting within the resources of the linguistic system, are effectively cultural choices, and acts of meaning are cultural acts (Halliday, 1971: p. 165).

Context, in Halliday's functional model, is articulated in terms of field, tenor, and mode, but a discussion of his model would take us too far afield. In a different context, namely a discussion of intelligibility in spoken language, and 20 years earlier, Catford also defines context in the now familiar terms of the speakers, the situation, and culture. His definition can be summed up as:


With Ochs' (1979) definition of context, we are fully in the domain of interactional sociolinguistics. Accordingly, the following feature prominently in her definition of context:


However, despite her socio-interactional orientation, Ochs' definition reflects the zeitgeist of when it was presented, being still very tied to the linguistic form. For example, Ochs discusses extensively the grammaticalization of context.

Ochs' definition proved very influential. Duranti and Goodwin (1992: p. 6) explicitly take Ochs' (1979) definition as their starting point. The parameters they use are as follows:


Their discussion ranges very widely, to reach the conclusion that context should not be seen as "a set of variables [the parameters listed above] that statically surround strips of talk" (1992: p. 31) but rather as having a "mutually reflexive relationship to each other, with talk, and the interpretive work it generates, shaping context as much as context shapes talk." (Ibid.) The other major contribution that Duranti and Goodwin provide is a focus on the relationship between talk and context, which they see as paralleling that between figure and ground. The figure would here be talk (the text) and the ground would be the context. This is a crucial aspect of the definition of context, which Fetzer (2004: p. 3) calls the "core meaning" found, according to her, in "all the usages" of the term "context."

Despite approaching context from the perspective of the agents, recognizing the figure/ground relationship between text and context, and acknowledging that language is a part of a "stream of activity" (1992: p. 3), Duranti and Goodwin's definition is still very language-centric.

Jakob Mey, long the editor of the Journal of Pragmatics, presented in his pragmatics textbook a definition of context, slightly clarified in the second edition of the book, which follows below:

Context is a dynamic, not a static concept: it is to be understood as the continually changing surroundings, in the widest sense, that enable the participants in the communication process to interact, and in which the linguistic expressions of their interaction become intelligible (Mey, 2001: p. 39)

This definition has the great merit of stressing the dynamic nature of context, which of course entails the fact that it is enacted and brought into being by the actors in the situation and not somehow pre-existing them. The other aspect that should be stressed is that language performance becomes intelligible only within the context, again not pre-existing it.

Finally, some aspects of Schegloff's (1992) discussion of context fit in with the approach presented in this paper. Schegloff recognizes the potential infinity of contexts (see below) and upholds a dynamic view of context: the text helps determine ("invokes") the context: "the [text] (...) may be understood as displaying which out of that potential infinity of contexts (...) should be treated as relevant and consequential" (Schegloff, 1992: p. 197)<sup>2</sup>.

## Philosophy and Psychology

Philosophers have developed highly technical notions of context that need not detain us. However, it is worth considering Stalnaker's "informal (...) intuitive" definition of context:

[T]he concrete situation in which a conversation takes place, a situation with a more or less definite group of participants with certain beliefs, including beliefs about what the others know and believe, and certain interests and purposes, and interests and purposes that are recognized to diverge. (...) intelligible independently of any institutional linguistic practice (...) not defined by the constitutive rules of some language game (...). It is not just linguistic actions, but actions of any kind that take place in a context (Stalnaker, 2014: p. 14)

Let us note the very significant "divorce" of the concept of context from linguistic behavior ("not just linguistic actions") and the iterative nature of the common ground ("beliefs about what the others know and believe"), familiar from Gricean semantics. Stalnaker also notes that context is a dynamic concept which evolves "in the course of a conversational exchange" (2014: p. 14).

There are numerous definitions of context in psychology, but I will not be concerned with them. Instead, I will consider a psychological definition of context advanced within Relevance Theory:

context is a psychological construct, a subset of the hearer's assumptions about the world. It is these assumptions, of course, rather than the actual state of the world, that affect the interpretation of an utterance. A context in this sense is not limited to information about the immediate physical environment or the immediately preceding utterances: expectations about the future, scientific hypotheses or religious beliefs, anecdotal memories, general cultural assumptions, beliefs about the mental state of the speaker, may all play a role in interpretation. (Sperber and Wilson, 1986: pp. 15–16; my emphasis, SA)

What is relevant in Sperber and Wilson's definition is not the typically expansive listing of factors and components of the context, but the clear and important realization that the context of an utterance is not a physical reality but the mental representations of the physical reality, or as they put it that context is a psychological construct. Context does not exist "out there" in the world. It exists "in the head" of the speakers/interactants.

The risk with this definition is to attempt to make it fit the standard "toolkit" of propositional logic, as for example the following claim that an utterance conveys

a propositional complex that contains both explicit and implicit information. (...) this information is constructed on the fly as the interpreter processes every lexical item (...) While (...) the propositional complex communicated by an utterance is pragmatically narrowed and simultaneously pragmatically broadened in order to incorporate only the set of optimally relevant propositions (...) (Assimakopoulos, 2006: p. 1; emphasis mine, SA)

Unless we assume that all information is propositional by definition, which would render the word "propositional" hard to define, there is no guarantee that all implicatures are propositional. Consider the difference between:


which is certainly not propositional, as presumably both would be represented as

(c) ready(x)

whereas the connotations evoked, even outside of a rich context, are clearly different. We will address the assumption that optimal relevance needs to be sought below.

<sup>2</sup> Schegloff (1992; 1997) is concerned primarily with the difference between the view of context for the purpose of analysis (the analysts' view) as opposed to the view of context as integral to the unfolding of the speech event (the participants' view). Moreover his approach belongs to a tradition that does not fit easily within the methodological underpinning of contemporary linguistics and in many ways is antagonistic to (some of) them. It would be far too time consuming to review these in any detail, not to mention that none of the discussion would be particularly relevant to the present article.

Finally, we can conclude this selective review of the literature with a book that defies classification, but that, given the title (Context as Other Minds), I chose to list among the psychological approaches. Givón (2005) is indeed such a wide-ranging discussion that I will merely mention a few "recurrent themes" (the title of a section of the first introductory chapter) that are clearly related to the idea of context, to a greater or lesser extent.

We can start with the idea of relevance, which is key to the expansion of the search of the context, starting from the text. Abduction and analogical reasoning figure prominently on Givón's list. As we will see, relevance and abduction are central to the discussion below. Givón also lists the concepts of similarity, analogy, and metaphor which all are "dependent on the choice of relevant context" (2005: p. 9) and other concepts such as categorization and taxonomy which are dependent on "the capacity to tell "major" traits" (2005: p. 9). While the connections between all these concepts and the idea of context may not be immediately obvious, it is clear that similarity, analogy and metaphor "work" in relation to a frame (a context): an argument is like a war, but only up to a point: if one of the participants starts raising an army, or bombing some territory, we are no longer facing an argument. A whale and a submarine are similar, in that, for example, they both function under water, but they are obviously also very different. Along the same lines, dogs and coyotes are similar, but of course also different, as captured by their Linnaean taxonomy. Likewise, it would be absurd to argue that Fluffy, one's beloved poodle, is a completely different dog than the neighbor's mongrel, and should belong to a different genus.

Before leaving off the psychological and pragmatic theories, we can address a point suggested to me by one of the referees, who asks to compare the current approach to "social" theories such as common ground (e.g., Clark, 1996) and the "theory of mind" (ToM; e.g., Wimmer and Perner, 1983). Both common ground and ToM "complicate" the notion of context by making it relative to two or more participants, in the sense that the parties must "share" some amount of knowledge. Obviously, context is shared, to an extent, by the various parties that are co-participants in the situation. To be aware of which parts of context are shared and which parts are only available to an individual (or the self) is a crucial need for the participants. However, the present approach applies to the construction of context regardless of whether it is shared or not: the expansive and delimitative principles need to apply regardless of the shared nature of it. Imagine, as a Gedankenexperiment, a solitary climber on the face of a mountain. In a difficult spot of the climb, she says: "You can do it!" Assume, for the sake of the argument, that we treat this as speech directed to herself, and not as an imaginary conversation between the climber's self and another self. The principles that would govern a shared interpretation of "do" (namely, climb this mountain) obviously must also govern a solipsistic interpretation.

## Literature Review Conclusions

There are a few recurring, central ideas that have emerged and that are, in my mind, crucial to the understanding of context. I will review them briefly and then move on to the proactive part of the paper. The first idea is that context is not immanent: it does not pre-exist the communicative exchange and/or the speakers' consciousness. The second idea is that context is a mental state that is constructed by the speakers and/or participants to the situation "on the fly" as they go about their business in the conversation/interaction. Specifically, context is constructed along the lines of relevance and abductively, hence it is largely a matter of implicatures. Context is bounded, i.e., it is not infinite. Finally, we may note two competing forces at play: one is an expansionist force that impels speakers or interactants to seek out relevant parts of the environment to make sense of the text/events. The other, less visible but just as important, is a force that bounds the expansive search for relevant context, so that it is limited effectively to what is necessary for the purposes of the speakers/interactants. What follows examines the two tendencies.

## DERIVING AND BOUNDING CONTEXT

I believe that one observation that emerges from the consideration of the various definitions of context is that there is an expansive tendency: the definitions get more and more complicated in an effort to encompass all possible relevant contextual factors. While that is understandable and even probably necessary, it creates a problem, to which we turn next.

Fetzer (2004: p. 3) notes that "context can refer to the whole universe." We might add, just to ante up a little, that within the multiverse cosmological theory context might refer to numerous, in fact infinite, universes. As Fetzer concludes, "that extremely general definition of context requires some delimitation" (2004: p. 3), if for no other reason than that the computability and psychological reality of an infinite set of concepts is questionable.

The problem, of course, is that even if many have proposed various theoretical constructs to define the domain of context, some of which are reviewed by Fetzer, very few if any scholars have explained how practically context is delimited by the participants of the interaction. Ex post facto, it is always possible to look at a conversation and find that socio-economical factors were at play in delimiting the context to members of a given socio-economical group (for example, the expression "he lived alone with his servants"<sup>3</sup> presupposes that the servants do not count as people, since "alone" requires the absence of others). However, it is unlikely that the writer of the sentence above was aware of and deliberately wanted to express this fact<sup>4</sup>. So, if the delimitation of context is subconscious, happens in real time, and is intersubjective (i.e., the participants to the conversation share it or agree to it, implicitly), how can the speakers take care of it? Moreover, it should not escape our attention that the derivation of context takes place in real time (i.e., as the interaction takes place or the conversation unfolds) since the speakers and the context, as Duranti and Goodwin remind us, are in a dialectical relationship.

<sup>3</sup>The sentence occurs on p. 65 of the Beadle's monthly, volume 3, which appeared in 1867, in New York. The author is Kate Putnam Osgood. Our New House, A Story.

<sup>4</sup>Interestingly, this presents a serious problem for Schegloff's (1992: p. 215) demand that "demonstrable relevance to the participants" be the warrant for claims about context. Ideology can be very relevant to the establishment of context, but by definition, it will be invisible to its followers.

## Deriving Context: Relevance, Cooperation, Abduction

It is clear that relevance is the tool used to expand the context: the search for relevant implicatures or other implicit parts of meaning obviously drives the process whereby speakers look for features in the situation that may be relevant to what was said. For example, if speaker A says "hold this" while handing a hammer to speaker B, relevance determines that the reference of "this" is the hammer. A's order/request could be paraphrased as a request to hold the item identified by a deictic and the gesture in the situation resolves the ambiguity, under the assumption of relevance. If speaker A had wanted B to hold a book, why hand them a hammer?

Along the same lines, it is obvious that an assumption of cooperation is necessary to process the search for the relevant context: unless he/she assumes that speaker A means what he/she said, is being clear about it, etc. speaker B would have no reason to assume that A wants him/her to hold the hammer and isn't going to drop the hammer or is trying to sell them the hammer, or is a lunatic who likes holding hammers in their hand.

I assume that the reader is familiar with the Principle of Cooperation, proposed by Paul Grice (see Grice, 1989). Likewise, I assume that relevance is a known pragmatic principle or maxim (see Sperber and Wilson, 1986). This is not the place to review the discussion on the subjects, so I will not discuss them further. Instead I will briefly deal with abduction, which is definitely less known in linguistic circles. A fuller discussion can be found in Attardo (2003).

Abduction, "discovered" by Peirce (1960-1966), is a "third" form of reasoning, besides induction and deduction. The general form of abduction is as follows:

The surprising fact, C, is observed; But if A were true, C would be a matter of course, Hence, there is reason to suspect that A is true (Peirce 5 p. 189).

Clearly abduction is not a matter of certainty: it is a probabilistic "guess" to the best explanation. Moreover, the formulation of the rule (A) which explains C is not itself part of the abductive process for Peirce and so has to be justified externally. Some scholars (e.g., Hoffmann, 1999: pp. 281–284) argue that the generation of A is itself abductive, which opens the possibility of a regression ad infinitum.

Other strategies are possible. Brogaard (1999: p. 141) stresses the role of "unexpected or sudden regularities" (Peirce has "surprising") in triggering the abductive process. Regularities are perceived against the background of observed facts (Kapitan, 1997: p. 482) that are "separated from other facts" (Kapitan, 1997); in fact, "[a]n essential step in the process of abduction is the classification whereby a particular assembly of phenomena comes to be regarded as a single explanandum." (Kapitan, 1997) So, according to Brogaard (1999), the process of abduction works as follows: the subject observes an undifferentiated stream of phenomena; at some point in time, some of these phenomena exhibit some common feature, which leads the subject to group the phenomena in a single explanandum; furthermore, the presence of unexpected or surprising regularity in the phenomena leads the subject to the formation of an hypothesis which explains the phenomena, if true. The subject then assumes prima facie the truth of the hypothesis. Presumably the explanandum is more abstract than the mere collection of phenomena, which strikes me as a significant part of the explanatory power of the abductive hypothesis.

Other strategies have been used to ground the abductive process, but I believe the present discussion is sufficient to show how the search for a satisfactory context to explain the speakers' utterances will be largely abductive in nature. The metaphor of non-foveal vision I used at the beginning of the paper comes in handy now as well. Much like non-foveal vision cannot rely on foveal fixation, context searching cannot rely only on what is inferable or deducible from what is literally said in the utterance or on what is said and its pragmatic enrichments. Context searching and building needs to rely on abductive jumps. If my wife walks into my office and asks "are you hungry?" to assume that the time of the day (around noon) is relevant and that therefore she is asking me to prepare lunch can only be an abductive process. No logical inference ever could bridge the gap between a question about my inner states and a request to prepare a meal. Note that the time of the day, our habits (I feed the humans, she feeds the animals), what we are doing (both of us are working on papers), and numerous other factors contribute to the successful abduction. These could never be accounted for in inferential or deductive reasoning, as a matter of principle, as the list is open ended.
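Peirce's schema (a fact C is observed; if A were true, C would be a matter of course; hence A is worth suspecting) can be given a rough computational reading. The sketch below is only an illustration under strong assumptions: the `EXPLAINS` table, its hypotheses, and the `abduce` function are all hypothetical, and the closed table is a toy stand-in for what is described above as an open-ended list of contextual factors.

```python
ASKS = "asks 'are you hungry?'"
NOON = "it is around noon"

# Hypothetical knowledge: each hypothesis A is paired with the facts
# that would be "a matter of course" if A were true.
EXPLAINS = {
    "she wants me to prepare lunch": {ASKS, NOON},
    "she is conducting a survey": {ASKS, "holds a clipboard"},
    "she wants to go for a walk": {"holds the dog leash"},
}


def abduce(observed, explains=EXPLAINS):
    """Return every hypothesis under which all observed facts are expected."""
    return [
        hypothesis
        for hypothesis, consequences in explains.items()
        if observed <= consequences  # subset test: nothing observed is left unexplained
    ]


print(abduce({ASKS}))        # two candidate hypotheses survive
print(abduce({ASKS, NOON}))  # ['she wants me to prepare lunch']
```

With the question alone, two hypotheses remain, which reflects the point that abduction yields a guess and not a certainty. Adding the time of day as a relevant fact narrows the candidates to one.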

## Bounding Context: Satisficing and Charity

The speakers are in need of a context-delimiting algorithm. I propose the following as a first approximation.


## Bounding Through Satisficing

Simon (1983) proposed the idea of bounded rationality (reasoning), i.e., a much more "realistic" view of rationality based on limited knowledge and limited resources, which does not arrive at optimal solutions. Simon introduced in the definition of a (bounded) rational agent the following features:


The significance of these decisions is great: there is no guarantee that the agent will find an "optimal" (best) solution, because satisficing will lead to accepting a solution that achieves a given goal, when a better option might have been available. Similarly, because not all possibilities are searched, inconsistent facts may be present in the system. Inconsistency and suboptimality are problems, but a conception of reasoning that is bounded is preferable to conceptions of optimal rationality because, simply put, real agents in the real world never have perfect information and therefore bounded reasoning is more realistic. As a side note, we can observe that bounded reasoning solves the problems associated with the need to find optimal relevance, in some formulations of relevance, since bounded searches for a solution are guaranteed to find a solution, although it may not be the best one.
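The contrast between optimizing and satisficing can be made concrete with a short sketch. Everything in it is an assumption made for illustration: the candidate contexts, their numeric relevance scores, the acceptance threshold, and the function names `optimize` and `satisfice` are hypothetical and are not part of Simon's proposal.

```python
def optimize(candidates, score):
    """Examine every candidate and return the best-scoring one."""
    return max(candidates, key=score)


def satisfice(candidates, score, threshold):
    """Return the first acceptable candidate and how many were examined."""
    examined = 0
    for candidate in candidates:
        examined += 1
        if score(candidate) >= threshold:
            return candidate, examined  # stop searching as soon as one is good enough
    return None, examined  # nothing met the threshold


# Hypothetical candidate contexts with made-up relevance scores.
contexts = [
    ("the hammer in A's hand", 0.8),
    ("the toolbox on the floor", 0.5),
    ("A's carpentry project as a whole", 0.9),
]


def relevance(context):
    return context[1]


print(satisfice(contexts, relevance, threshold=0.7))
# (("the hammer in A's hand", 0.8), 1)
print(optimize(contexts, relevance))
# ("A's carpentry project as a whole", 0.9)
```

The satisficing search stops after one candidate and never sees the higher-scoring third one, which is the suboptimality noted above; in exchange, it does not have to traverse the whole search space.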

### Charity

We now turn to a discussion of the principle of charity, which is of necessity longer and more complex than the discussion of all the other tools we have examined so far, for the connected reasons that, within linguistics, virtually no use has been made of the principle of charity and, within philosophy, it has been applied to different problems than those to which we have applied it here.

There have been several proposals of charity principles. In the specific sense that we are interested in, we may begin with Wilson's (1959: p. 532) "principle of charity" which states that "We select as designatum [the referent of a proper noun] that individual which will make the largest possible number of (...) statements true." Wilson's discussion is technical and need not detain us further, but the basic idea is clear: interpret speech so as to maximize the truth of what the speaker says.

Quine (1960) generalizes the principle of charity, in the context of his discussion of the feasibility of radical translation (i.e., between non-related languages which have never been in contact; Quine, 1960: p. 28). Quine essentially says that if a speaker says something that seems clearly false, a bad translation is more likely than imputing irrationality to the speaker.

Grandy introduces a "humanity principle" in the same context. Grandy is critical of Quine's definition and replaces it with the assumption that "the purpose of translation is to enable the translator to make the best possible predictions and to offer the best possible explanations of the translatee" (Grandy, 1973: p. 442). Since obtaining a complete account of the translatee's beliefs and desires is practically impossible (Grandy argues that all or at least many psychological states can be reduced to these), Grandy concludes that

we use ourselves in order to arrive at the prediction: we consider what we should do if we had the relevant beliefs and desires. Whether our simulation of the other person is successful will depend heavily on the similarity of his belief-and-desire network to our own. (...) it is of fundamental importance to make the interrelations between these attitudes as similar as possible to our own. If a translation tells us that the other person's beliefs and desires are connected in a way that is too bizarre for us to make sense of, then the translation is useless for our purposes. So we have, as a pragmatic concern on translation, the condition that the imputed pattern of relations among beliefs, desires and the world be as similar to our own as possible. This principle I shall call the principle of humanity (Grandy, 1973: p. 443).

The most significant discussion of a principle of charity is to be found in Davidson's philosophy. There is not a single, standard discussion of Charity in Davidson's work; rather, his observations on the Principle of Charity are scattered throughout his work. We can start with one of the last presentations of the principle:

the principle directs the interpreter to translate or interpret so as to read some of his[/her] own standards of truth into the pattern of sentences held true by the speaker. The point of the principle is to make the speaker intelligible, since too great deviations from consistency and correctness leave no common ground on which to judge either conformity or difference (2001: p. 148; my emphasis, SA)

The emphasized passage clearly shows that, in Davidson's model, without an assumption of a charitable reading, communication would be impossible. Davidson insists repeatedly that his conception of charity is both indebted to Quine (and to Wilson, via Quine) and significantly different from Quine's, since Quine applies it only to logical operators, whereas Davidson's Charity applies "across the board," i.e., is a general interpretive principle (1984: p. xvii; 1984: p. 153; 2001: p. 148). As Jackman (2003) notes, Davidson's charity principle is broader than Wilson's or Quine's, since it is supposed to determine not only referential semantic issues, but also which propositions are (likely) to be true (or at least believed to be so by the speaker).

Charity, in Davidson's work, is a central tenet because it allows him to bridge between the observable external behavior of the participants and their beliefs/desires. The importance of this step is of course due to his adherence to the behaviorist tenet that only observable behavior can be relied on in scientific work. Needless to say, behaviorism was discredited by modern linguistics (Chomsky and most linguists after him) and we now freely speak of inner mental states, ideas, concepts, cognition, and meanings. Without a charity principle, Davidson has no way of guaranteeing that, absent social consensus, people mean the same thing when they say something. Charity, by ascribing the same true thoughts to all speakers, guarantees that there is intersubjective agreement and therefore translation between languages<sup>5</sup>.

This is quite visible if we look at some of Davidson's statements on Charity. Charity, according to Davidson, is holistic, as can be seen from the following quotations: "we make sense of particular beliefs only as they cohere with other beliefs" (1980: p. 221); "the content of a propositional attitude derives from its place in the pattern" (Ibid.); "a belief is identified by its location in a pattern of beliefs; it is this pattern that determines the subject matter of the belief, what the belief is about" (1984: p. 169).

Furthermore, speakers are generally consistent:

crediting people with a large degree of consistency cannot be counted mere charity: it is unavoidable if we are to be in a position to accuse them meaningfully of error and some degree of irrationality (Davidson, 1980: p. 221)

<sup>5</sup>One might ask, since I assume that mental states exist, why use charity at all? The answer is that we need charity, but for different purposes than Davidson did.

Davidson notes that since we sometimes accuse people of being mistaken or of contradicting themselves, these charges can exist only against a background of consistency. Davidson allows for the presence of disagreement, error, and irrationality ["it cannot be assumed that speakers never have false beliefs" (1984: p. 168); "of course a speaker can be wrong" (1984: p. 169)], but against the backdrop of general agreement, truth, and rationality: "disagreement and agreement alike are intelligible only against a background of massive agreement" (1984: p. 137); "the methodological presumption of rationality does not make it impossible to attribute irrational thoughts and actions to an agent, but it does impose a burden on such attribution" (1984: p. 159).

Speakers are also rational, for Davidson: "successful interpretation [communication] necessarily invests the person interpreted with basic rationality" (Davidson, 2001: p. 211). Consistency, rationality, and coherence go hand in hand. Davidson remarks that "we necessarily impose conditions of coherence, rationality, and consistency" (Davidson, 1980: p. 231). In fact, the attribution to an agent of rationality, of beliefs, and of intentional communication is founded on charity:

If we cannot find a way to interpret the utterances and other behavior of a creature as revealing a set of beliefs largely consistent and true by our own standards, we have no reason to count the creature as rational, as having beliefs, or as saying anything (Davidson, 1984: p. 137).

Because it needs to rely on observable behavior, Charity is based on public (social) assent:

A theory for interpreting the utterances of a single speaker, based on nothing but his[/her] attitudes toward sentences, would, we may be sure, have many equally eligible rivals, for differences in interpretation could be offset by appropriate differences in beliefs attributed. Given a community of speakers with apparently the same linguistic repertoire, however, the theorist will strive for a single theory of interpretation [...] What makes a social theory of interpretation possible is that we can construct a plurality of private belief structures: belief is built to take up the slack between sentences held true by individuals and sentences true (or false) by public standards. [...] Attributions of belief are as publicly verifiable as interpretations, being based on the same evidence: if we can understand what a person says, we can know what he[/she] believes (1984: p. 153; my emphasis, SA).

Davidson's argument here echoes Wittgenstein's against a private language. A language with one speaker could not be said to exist, because there would be no checking the meanings of the signs. Evnine (1991: pp. 105–108) acutely notes that Davidson, having acknowledged clearly the importance of the social aspect of language, goes on to more or less completely reject the idea of convention (Davidson, 1986), shifting his attention back from the social aspect of language to the individual role.

Charity forces the interpreter to attribute to the interpretee a set of true beliefs. Davidson seems to waver as to whether the truth of the speaker's beliefs is assumed relative to the interpreter's set of beliefs: he speaks of "assigning truth conditions to alien sentences that make native speakers right when plausibly possible, according, of course, to our own view of what is right" (1984: p. 137; my emphasis, SA) and of interpreting the behavior of another as "revealing a set of beliefs largely consistent and true by our own standards" (Ibid.; my emphasis, SA). Elsewhere, however, Davidson states that the speaker is objectively right: "massive error about the world is simply unintelligible" (1984: p. 201) and "successful communication proves the existence of a shared, and largely true, view of the world" (Ibid.). Davidson's position that Charity guarantees the truth of most beliefs of a community has not gone without challenges, e.g., McGinn (1999: pp. 178–179; 180–196). McGinn rejects the truth claim, but accepts the consistency and rationality claims in Davidson's account.

To my mind, the objections to the claim that the principle of charity requires speakers to attribute true beliefs to each other are misguided. Obviously Davidson never graded a set of midterms in my introduction to linguistics class, otherwise he would not have maintained that the possibility of "massive error" (Davidson, 1984: p. 197) is ruled out. However, this is beside the point. I understand Davidson as saying something much deeper and more interesting than the obviously erroneous claim that people cannot be massively wrong about something. What I think Davidson meant (or should have meant, see below) is that even in order for my students to be massively wrong about linguistics they have to be massively right about many more things than linguistics in order to count as being capable of being right or wrong about linguistics. So, for example, they would have to have the true beliefs that linguistics exists, that the exam exists, that I exist, that exams are graded, that one wants to score well on an exam, etc. Here my interpretation of Davidson is along the lines of Wittgenstein's recognition of an unquestionable background of knowledge which anchors the very possibility of doubting something (e.g., Wittgenstein, 1969: p. 94; Stroll, 1994: p. 180 makes the connection to Davidson's charity).

Davidson himself seems, however, to deviate from this sort of reading, when he asks, rhetorically, "how clear are we that the ancients (some ancients) believed that the earth was flat? This earth? (...)" (1984: p. 168) and continues to argue that our earth is a planet in the solar system, etc. and if one does not have these beliefs, then (holistically) one cannot really be thinking about the same earth. But this line of reasoning is unnecessary: why not simply concede that some ancients were wrong about a few thousand beliefs and nonetheless right about millions of others (e.g., that the earth exists, that it is larger than one can walk in several days, etc.)?

Within Davidson's model, interpreters (speakers and hearers) need to attribute to one another beliefs that "minimize disagreement" (1984: p. xvii) or "maximize agreement" (1984: p. 101): "a good theory of interpretation maximizes agreement. Or, given that sentences are infinite in number (...) a better word might be optimize" (1984: p. 169).

Davidson also notes that his account of interpretation via charity goes against relativism (1984: p. 197); however, he adds that this does not amount to a universalist view (1984: p. 198). Lukes (1982) considers the important implications of Davidson's Charity and Grandy's humanity against relativism (i.e., the possibility that there exist radically different cultures, logics, or worldviews across which ideas are untranslatable). Rationality and some beliefs need to exist across the board (i.e., to be part of the definition of humanity), although obviously not all beliefs.

Finally, one may object to my treatment of charity principles proposed by philosophers as linguistic principles, but in fact it has been argued, convincingly to my mind, that this is precisely what Davidson's intention was (or at least that this is the outcome of his views):

[Davidson's] tacit equation of [the charity principle] with his own views about the constitutive role of rationality in determining what sentences we can be understood as holding true further blurs the nature of the Principle and makes it seem more like a general maxim guiding interpretation (Jackman, 2003; my emphasis, SA).

This concludes our discussion of the principle of charity, as proposed by Davidson. It might be interesting, however, to reflect briefly on specific linguistic aspects of charity. Let us recall that Davidson points out that charity forces us to attribute "coherence, rationality, and consistency" to the speakers we are engaging in conversation with, as well as substantive knowledge about the world. The attribution of rationality entails the attribution of some sort of set of Gricean principles/maxims, since the principle of cooperation and/or the maxims are characteristics of rational communication. This is familiar ground, which I will not repeat in this context. The attribution of coherence is, in a sense, part of the injunction of speaking to the point (relevance), but in a different perspective it is responsible for the need to assume that a speaker is, say, answering a question even if prima facie the utterance does not seem to meet the requirements of coherence and/or relevance. In other words, the option of assuming that the speaker is incoherent is literally the dispreferred option. Finally, the assumption of consistency is perhaps the most intriguing of the linguistic aspects of charity. Essentially, it boils down to the assumption that the speaker is using language units in the same way throughout the exchange, so that the meaning and/or reference of the units does not change during the exchange. In a more sophisticated sense, it also requires that we attribute to a speaker the meanings that we know that speaker to intend. A feature of (some) uncharitable readings is that they attribute to the speaker meanings that the speaker would not have meant. For example, the Wikipedia article on "Controversies about the word niggardly"<sup>6</sup> reports several instances of speakers attributing racist intentions to those who used the word, apparently under the misconception that it would be etymologically related to a slur against African Americans<sup>7</sup>.

Charity and satisficing form the two bounding principles that constrain the expansive tendencies of relevance and abduction, which seek features of the environment to make sense of the exchange or of the text. If we had not already introduced one metaphor, we could suggest thinking about the centrifugal (expansive) and centripetal (bounding) forces in physics that define the orbits of celestial bodies. In what follows we will examine a few examples of context definition with emphasis on the bounding, centripetal forces.

### Examples of Context Derivation and Bounding

Let us begin with a non-controversial example: deictics are by general agreement context-sensitive. Consider now the quantifier "everything" in the following examples (a shorthand indication of the "context" is given after the equal sign):


Example (a) is Fetzer's un-delimited account in which the quantifier refers literally to every entity in the universe. In example (b) the transactional situation of purchasing provides a justification for the direct request, while the hot-dog vendor cart situation provides the antecedent for "one" (i.e., one hotdog), which in turn provides the boundary of the expansion of everything to toppings by its affordances (see Attardo, 2005 for an analysis of an "everything" bagel along the same lines). Note that in order for the reading to go through we must charitably not impute to the customer the reading in (a). Example (c) is different, in the sense that the situation in which the unlucky investor utters it is very salient, as presumably many speakers are discussing the market crash, newspapers and other media are commenting on it, etc. Again, relevance provides us with a direction in which to look for a referent for "everything": the investor could have lost his/her suitcase, his/her laptop, or any set of things that form a group, but since we are talking about the stock market the most relevant reading is that the investor lost everything in his/her investments (and note how that shifts the meaning of "lost" from a literal sense of losing to a metaphorical one, since obviously it is not the case that the investor cannot find his/her stocks or municipal funds anymore, but that the value thereof is greatly diminished). Once more charity bids us to ignore the fact that prima facie (c) is likely to be false, since it is unlikely that all of the investments lost all of their value. Generally, even bankrupt companies retain some assets that are worth a small fraction of the valuation of the company. Thus, (c) is to be taken as an exaggeration corresponding literally to "the value of my investments has dropped significantly and to the point that they are unlikely to fulfill the purposes for which I had invested these sums."

Example (d) shows very clearly both the importance of charitable reading and of the affordances of the term "test." Let us start with the fact that tests generally consist of questions about a subject. The situation bounds the test to an academic test in the humanities (as opposed to an endurance test in the marine corps, for example). The students are trying to determine what topics will be on the test. Abductively, we can infer that they are interested in this information in order to study those subjects (we will ignore whether they will then neglect the other subjects, or whether they merely wish to fine-tune their preparation). My response is cooperative (it provides them with the relevant information, it is clear, to the point, succinct, and truthful), if adversarial (I deny that there might be topics that should be prepared to the exclusion of others). However, note how the students must charitably attribute to me a bounded interpretation of "every topic covered in class" or "every topic on the syllabus" or "every topic in the book." Had they failed to do so, they could have come to the conclusion that every topic in their major might be on the test (a relevant interpretation, but one that would be irrational on my part, as such tests are not ordinarily given to students in a regular class) or perhaps every topic in the discipline (again, a relevant interpretation, since the course was called "Introduction to linguistics," but again one that non-charitably would impute irrational or abnormal behavior on my part).

<sup>6</sup>Wikipedia. Controversies About the Word "Niggardly". Accessed December 26, 2015.

<sup>7</sup>Interestingly, the Wikipedia article also reports that some would be using the term as a "code" word for the slur, essentially to provide deniability. In that case, the reading that attributes racist intentions would not be uncharitable. Incidentally, uncharitable readings are not inherently bad. They may serve political purposes. For example, the Democrats made much of then-Vice President Dan Quayle's misspelling of the word "potato" in a 1992 school photo-op. Obviously and charitably, Quayle knew the correct spelling, but accusing him of ignorance served the Democrats' agenda.

Finally, in example (e) we must imagine a situation, familiar from many narratives, in which a husband has discovered evidence of the unfaithfulness of his wife (obviously, the example also works with the genders reversed). Clearly the relevant, bounded interpretation is that the husband knows everything about the wife's affair. Note how, if the wife's response were, "Oh really? What's the square root of 1243?," the assumption of relevance and boundedness would disappear. But let us now consider another response by the wife, who presumably has a PhD in logic, and might object, "No you don't. For example, you do not know how many times I have made love to Arthur, nor do you know what my pet name for him is!" The wife's objection is technically correct, since the topics she lists are within the domain of the relevant information. However, her response violates the maxim of charity, because it does not maximize the instances of shared true propositions between her and her husband. Indeed, by assuming that her husband means "I know everything that is important about the fact that you are having an affair" the number of shared true propositions would increase. Note, once again, the important abductive inferential work done by charity in narrowing down the domain of applicability of the statement<sup>8</sup>.

All these examples highlight the dynamic, changing process whereby context is determined and bounded for the communicative purposes of and by the dynamics of the interaction of the speakers/participants. An observation should also be made about the partially facetious nature of some of the examples. Humor is a mode of communication that deliberately switches between one set of expectations (a reading of a text, for example) and another set of expectations that are different enough to be incongruous in relation to the first set. The switch can be achieved or merely partially explained or justified in a number of ways, for example through an ambiguous term, but also through a number of mechanisms, only some of which have been identified and described in the literature. Because of this switch in which the first interpretation of the text is rejected in favor of a second one, the context of the text also changes radically, although not entirely. Thus, in keeping with the metaphor introduced at the beginning of the essay, humorous switches between interpretations of texts may shed some indirect light on the processes that constitute context.

To illustrate this claim, and to provide a further example of the inferential processes in deriving context, we will examine a joke. The text is taken from a monolog delivered by a famous Italian comedian, who goes by the name of "Crozza," on October 25th, 2014 on La7, an Italian private television channel. The entire clip of the performance is available online at http://www.la7.it/crozza/video/leopolda-renzi-come-steve-jobs-25-10-2014-139314.

The fragment we are interested in occurs within the times 1:13–1:45. Below is my translation of the relevant parts of the text.

Renzi said: we will begin the proceedings in a location that will remind us of a garage. A garage is a symbol of a place where ideas become startups, create employment. He [Renzi] thinks he is Steve Jobs. It's clear. (...) Perhaps, the comparison with Steve Jobs is appropriate. Since Renzi has been there, every year a new thinner model of the Democratic Party is released.

The performance occurs in a relatively impoverished context, with Crozza alone on an empty stage. During the first part of the text, relevant quotations from the speech of Italian Prime Minister Renzi are projected on the screen, behind Crozza. The quotes are from Repubblica, a prestigious daily newspaper. Obviously, the function of these quotes is to show that what Crozza is saying is true and that the PM did in fact say those things. Until the last sentence, Crozza is establishing a script (or frame) for political governance. Within this script, the creation of new and innovative employment opportunities figures prominently as a positive action that politicians may undertake to stimulate the economy and increase the well-being of the citizens. In turn, the context of "political discourse" is broadly activated. Then Crozza introduces the second script: Renzi thinks he is Steve Jobs. Since Steve Jobs was a great entrepreneur who created one of the most successful companies on earth, Renzi's comparison of himself with Jobs is inferred by the audience to be a case of megalomania and hence ridiculous. Note the shift from the political discourse context to the psychology of Renzi. After another jab at Renzi, which I have not included for simplicity, Crozza returns to the comparison between Renzi and Steve Jobs, introduced as a joke in the first part. Now Crozza says the comparison is actually accurate. In other words, Crozza is now claiming that there is at least one trait in which Renzi and Steve Jobs resemble each other. Since Jobs is a high-status individual, the comparison will elevate Renzi's own status. Crozza then reveals that the similarity lies in the parallelism that both Jobs's company, Apple, and Renzi's party, the Partito Democratico (PD), release a thinner model each year. Of course, while in technology thinner is better, in politics thinner means fewer electors, which is of course bad. Note again how the introduction of the scripts for technological gadgets and political parties shifts the context, narrowing it down until it is pinpointed on the PD's loss of electoral share since Renzi's appointment as Prime Minister. Note also how charity not only bounds the limits of the context to the shrinking of the Prime Minister's Democratic Party, but also forces the hearer to accept the somewhat forced parallelism between Apple's releasing thinner telephones every year and the Democratic Party getting smaller every year. To deny the validity of the parallelism would be a serious violation of charity, as it would reduce Crozza's point to nonsense.

<sup>8</sup>One may object that since the wife has deceived the husband, there is no need or point to be charitable. This objection misses the point entirely. Charity is needed to communicate, not to be nice to people.

## CONCLUSION

Hopefully, to use the metaphor of non-foveal vision, we have presented a set of principles and mechanisms that surround context and that, operating in conjunction, determine it, if not completely, at least to a large extent. While this is not the same thing as describing context, it is a meaningful step in the direction of being able to determine what speakers do when they think-within-context.

Recapitulating briefly, speakers subconsciously generate a mental construct of context using the concrete situation they are in, in all its richness, the semantics of the utterances, and all the inferences and abductive implicatures they can draw from those, led primarily by the assumption of relevance, but also by cooperation at large. The output of these expansions is bounded by satisficing and by the need to provide charitable interpretations of the speakers' (linguistic or non-linguistic) behavior. Generally speaking, this mental construct is never above the threshold of consciousness, but some features in it can be brought to the speaker's attention through humor or other marked situations.

I have based this account of the concept of context on Davidsonian charity, which I believe to be a significant addition to the related concepts of cooperation and rationality which are necessary to process implicatures and, generally speaking, the pragmatics of texts. However, care should be taken when applying a concept developed by philosophers within philosophy to a more empirical field such as linguistics. First, it is quite possible that my account of the principle of charity might not match exactly what Davidson says about it and/or might not dovetail exactly with other things Davidson said that are connected with it. As I have said elsewhere about Grice and my reading of his work, as it pertains to the principle of cooperation, sometimes it is necessary to read what Grice or Davidson should have said. Reading a work should be grounded in a reading as precise and as close to the intentions of the author as possible, but that should not stop one from deviating from what the author says, if it is possible to improve on it. Needless to say, one should be clear, as I hope I have been, about which of the two one is doing. Second, the use I have made of the principle is probably not one of the uses that Davidson or, more generally, philosophers would have intended. About this, I am unapologetic. Using a door as a table may be a good or a bad idea, but its success depends on how well it functions as a table, not on whether its door-related teleology has been fulfilled. Third, by using a concept from one theory of philosophy, one does not enter into a binding contract requiring him/her to solve all the problems related to that theory. If the door sticks, as a table user, I am not morally, ethically, or otherwise bound to fix the sticking problem (my advice: lightly plane the offending surface). Fourth, using a concept from a theory does not require one to adhere to the rest of the theory: if I am using a door as a table, I have not committed myself to purchasing the house the door comes from. Specifically and outside of metaphor, I do not buy the behaviorist undercurrent in Davidsonian semantics.

## AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

## REFERENCES

Wittgenstein, L. (1969). On Certainty. Oxford: Blackwell.

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, MT and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Attardo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Pragmatics as Metacognitive Control

Mikhail Kissine\*

Autisme en Contexte: Théorie et Expérience, Université libre de Bruxelles, Bruxelles, Belgium

The term "pragmatics" is often used to refer without distinction, on the one hand, to the contextual selection of interpretation norms and, on the other hand, to the context-sensitive processes guided by these norms. Pragmatics in the first sense depends on language-independent contextual factors that can, but need not, involve Theory of Mind; in the second sense, pragmatics is a language-specific metacognitive process, which may unfold at an unconscious level without involving any mental state (meta-)representation. Distinguishing between these two ways in which context drives the interpretation of communicative stimuli helps dissolve the dispute between proponents of an entirely Gricean pragmatics and those who claim that some pragmatic processes do not depend on mind-reading capacities. According to the model defended in this paper, the typology of pragmatic processes is not entirely determined by a hierarchy of meanings, but by contextually set norms of interpretation.

Keywords: pragmatic process, metacognition, Theory of Mind, autism spectrum disorder, indirect speech act, irony, implicature, Relevance theory

### Edited by:

Marco Cruciani, University of Trento, Italy

### Reviewed by:

Valentina Bambini, Istituto Universitario di Studi Superiori di Pavia, Italy; George Charilaos Spanoudis, University of Cyprus, Cyprus

> \*Correspondence: Mikhail Kissine mkissine@ulb.ac.be

### Specialty section:

This article was submitted to Psychology, a section of the journal Frontiers in Psychology

Received: 30 May 2015 Accepted: 29 December 2015 Published: 14 January 2016

### Citation:

Kissine M (2016) Pragmatics as Metacognitive Control. Front. Psychol. 6:2057. doi: 10.3389/fpsyg.2015.02057

## 1. INTRODUCTION

Everyone agrees that, at some level or another, utterance interpretation involves integrating contextual information to fill in the gap between the encoded linguistic meaning and what is actually communicated. To facilitate the discussion to come, context-dependent contents may be subsumed under two categories. First, context is needed to determine the meaning literally conveyed by the utterance (e.g., Bach, 1994; Carston, 2002; Recanati, 2004). Let us call such cases primary meanings:


Second, context is of course needed to recover meanings that are clearly different from or independent of the utterance's literal content. Let us call context-dependent interpretations of this sort secondary meanings. Standard examples of secondary meanings include:

	- a. The candidate's command of English is excellent and his attendance to tutorials regular. [in a letter of recommendation for a lectureship in philosophy, strongly suggesting that the candidate is not suitable for the position];

The list of pragmatic phenomena just introduced is by no means exhaustive; there are many other respects in which context influences utterance interpretation. For instance, I leave aside here the much discussed issue of "generalized" implicatures (e.g., Noveck, 2001; Geurts, 2010). Furthermore, the main claim of the paper is precisely that from a processing point of view neither secondary nor primary meanings constitute a natural class. Nonetheless, this twofold distinction is useful to introduce a chief theoretical divergence within the field of cognitively oriented pragmatics. The first major view stems from an adoption of Grice's (1957) rational reconstructions into a psychological theory of interpretation.

Proponents of this position claim that any kind of pragmatic processing involves inferring the intentions that underlie the speaker's communicative behavior. They see pragmatic processing, associated with the derivation of both primary and secondary meanings, as a homogeneous cognitive capacity, inherently rooted within Theory of Mind (understood as the capacity to attribute and reason about mental states).

The second camp accepts that Gricean inferences about communicative intentions are needed to reach secondary meanings but holds that the derivation of primary meanings is underpinned by accessibility-based, non-inferential processes. Under this view, the derivation of primary meanings would then involve Theory of Mind-independent pragmatic processing.

This debate raises crucial issues about the relationship between pragmatics and Theory of Mind, and about the (alleged) modularity of pragmatic processing. But instead of taking sides, this paper advocates a change of perspective. The term "pragmatics" is often used to refer without distinction, on the one hand, to the contextual choice of norms of interpretation and, on the other hand, to the context-sensitive processes guided by these interpretative norms. I will argue that pragmatics in the first sense depends on language-independent contextual factors that can, but need not, involve Theory of Mind; in the second sense, pragmatics is a language-specific metacognitive process, which may unfold at an unconscious level without involving any kind of meta-representation.

This paper is organized as follows. In the next two sections I will outline the main features of the two conflicting visions of pragmatic processing: monolithic, post-Gricean inferential accounts and more heterogeneous, accessibility-based approaches. In Section 2, I will take Relevance theory as a paradigmatic example of the former kind of analysis (Sperber and Wilson, 1995, 2002), and, in Section 3, the distinction between primary and secondary pragmatic processes, advocated by Recanati (2004), as an example of the latter. By no means should this choice be taken as a limitation of the argument's scope to these two particular theories. Simply, while many authors leave implicit the workings of the cognitive model they adhere to, both Sperber and Wilson and Recanati provide starkly articulated descriptions of their cognitive commitments. From the critical discussion of these two polar positions, I will argue that a psychologically valid pragmatic model should distinguish, as independent dimensions, between types of meanings (primary vs. secondary), and types of pragmatic processes (accessibility-based vs. Gricean). I will then outline a model that would meet such a constraint. The solution I propose is based on the two-tiered theory of epistemic acceptance, which I borrow from Proust (2013, pp. 169–184) and summarize in Section 4. Proust's insight is that one should not confuse the choice or acceptance of an epistemic norm with the acceptance of a level of epistemic success relative to this norm. Transposing this idea to pragmatics, I will suggest, in Section 5, that one should distinguish between, on the one hand, contextually determined norms of interpretation and, on the other hand, cognitive processes that control and lead to the achievement of this interpretative goal.
The change of perspective advocated in this paper naturally accommodates experimental data that indicate that pragmatic processing is possible without sophisticated Theory of Mind, and opens interesting perspectives on the interpretation of experimental results in pragmatics.

## 2. RELEVANCE THEORY

Grice's long-lasting insight is that (non-natural) meaning can be rationally reconstructed in terms of complex communicative intentions (Grice, 1957). Under such a reconstruction, a speaker S (non-naturally) means that p if, and only if:


This rational reconstruction of communicative behavior was quickly transposed into a psychological view of how utterance contents are recovered by addressees, and has become deeply entrenched in experimental psychology and cognitive science. To date, the most fully articulated version of such a post-Gricean approach remains Relevance theory (Sperber and Wilson, 1995).

## 2.1. Classic Relevance Theory: A Pragmatic Module

According to Sperber and Wilson, communicative stimuli activate a specific interpretation process. While noncommunicative intentional behavior is interpreted by attributing to the agent an intention to act, according to them the interpretation of communicative behaviors is mediated by the attribution of an informative intention. Informative intentions are intentions to provide the addressee with (dispositions to acquire) new beliefs or to reinforce existing ones. To exemplify Sperber and Wilson's distinction between intentions to act and informative intentions, think first of a stranger on the bus scratching her head. This is an instance of a non-communicative gesture; the stranger's behavior will be interpreted as resulting from an intention to relieve itching. By contrast, imagine next that, when asked for my opinion about a particularly difficult paper, I demonstratively scratch my forehead. Here my gesture is communicative; according to Sperber and Wilson, it will be interpreted by inferring a certain informative intention from my behavior, e.g., an intention to make my addressee acquire or reinforce the belief that I do not have a ready-made answer.

Communicative stimuli, linguistic and non-linguistic alike, can be associated with a virtually infinite number of informative intentions. The act of scratching my forehead could for instance mean that I find the answer difficult, but also that I do not feel comfortable with answering your question because I am personally acquainted with the author of the paper. Or, to take a linguistic example, an utterance of I can't drink may mean that I cannot drink alcohol because I am driving, that I do not want to have alcohol because I am often aggressive when inebriated, that I have already had too much alcohol, that I cannot ingest any liquid because I have a blood test in an hour, etc. Relevance theory explains how the range of possible interpretations gets narrowed down by appealing to the (relatively uncontroversial) hypothesis that human cognition is geared toward an optimal balance between the cognitive effects of processing and the processing efforts required to reach these effects. In the case of communication, the quantity of the effects of an utterance can be modeled as the number of new practical and theoretical implications allowed by the output of its interpretation. Communicative behaviors, according to Sperber and Wilson, are always perceived as worth processing: to use their terms, communicative stimuli come "with their own presumption of relevance."

The simplest interpretative procedure would be, then, to infer from the communicative stimuli the informative intention that is the most relevant from one's own point of view. Sperber (1994) suggests that this strategy, which he dubs "Naive Optimism," is used by young children. Now, what is relevant from one's own, egocentric point of view may be different from the meaning the speaker actually intended to communicate. The core of the Gricean conception of speaker's meaning is that it should be overt; a speaker usually intends that her addressee recognizes her informative intention. Exploiting this idea, Relevance theory posits that the optimal way to reach communicative success, called "Sophisticated Understanding" by Sperber, is to attribute to the speaker the informative intention this speaker is likely to have intended to make mutually manifest to her and to her addressee. That is, one should base one's interpretation on attributing to the speaker a communicative intention to make mutually manifest an informative intention. The interpretative inference then runs from the communicative stimuli to the communicative intention that is the most relevant, given the speaker's abilities and preferences, to the informative intention embedded within this communicative intention.

Importantly, while Relevance theorists admit the existence of different interpretative strategies, with varying levels of complexity, they hold that the output of any kind of interpretative process—be it Naive Optimism or "Sophisticated Understanding"—involves the attribution of an informative intention to the speaker. It is this assumption that compels Sperber and Wilson to posit the existence of a unitary pragmatic module.

To see why, recall that an informative intention is a mental state whose content includes the representation of mental states (speaker's beliefs). In spite of recent evidence of early first-order Theory of Mind (Onishi and Baillargeon, 2005; Baillargeon et al., 2010), there is a consensus that children are not capable of attributing such complex, second-order mental states until the age of seven (Perner and Wimmer, 1985; Leekam and Prior, 1994). And yet, very young children are apt conversationalists, who prove to be sensitive to the context of the conversation and to the interlocutor's perspective. To give a few examples, infants display pointing behavior with a clearly informative function, which is, moreover, constrained by their social partner's state of knowledge (Liszkowski et al., 2006, 2008). They also interpret ambiguous requests relative to their partner's needs and intentions (Grosse et al., 2010; Schulze et al., 2013). Around thirty months, children attempt to correct an adult who misunderstood their request even though they are handed the requested object (Shwe and Markman, 1997). Three-year-olds also display sensitivity to the speaker's perspective in reacting to synonymous labels; they are puzzled when their conversational partner suddenly shifts from using one name for an object to another, synonymous one, but not when a new speaker, who did not participate in the ongoing exchange, uses this synonym (Matthews et al., 2010).

In brief, there is a robust set of developmental data showing that, from a very young age, children use contextual cues to interpret and produce communicative behavior, even though they do not master second-order mental state attribution. In order to account for these empirical facts, Sperber and Wilson (2002) propose that pragmatic processing is underpinned by a specific cognitive module, devoted to the interpretation of communicative behavior. This pragmatic module would be rooted within a more general Theory of Mind, and would have an independent, and more precocious, developmental trajectory. Its output inevitably is the representation of the speaker's communicative or, at least, informative intention. Importantly, for Sperber and Wilson, this holds for the output of any interpretative process that goes beyond conventional, linguistically encoded meaning.

## 2.2. Implicatures: Material vs. Behavioral

As pointed out by Jary (2013), in Sperber and Wilson's model, the functioning of the pragmatic module itself does not necessarily involve the representations of the speaker's mental states. That is, utterance content is not necessarily recovered through inferences about speaker's intentions. Within the context of the conversation, the linguistic content of the utterance activates certain interpretations, the selection of these interpretations being warranted by the general expectation of relevance. Imagine a context where S is offered a coffee and a croissant and responds with an utterance of (8). This utterance makes accessible the contextually enriched primary meaning in (9). The topic of the conversation also makes accessible the background assumption in (10). The conjunction of (9) and (10) (non-monotonically) allows the conclusion in (11)—which corresponds to an implicature of the utterance in (8).


Importantly, the derivation of the implicature in (11) does not have to chronologically follow that of the primary meaning (9). Rather, the principle of Relevance leads the interpreter to expect that the speaker's utterance will make manifest a range of additional consequences, viz. secondary meanings. Some such secondary meanings are made salient in the context; in the course of the interpretation process primary and secondary meanings are then adjusted, so that the primary meanings provide inferential warrant to the secondary ones. But, as we just saw, such an inference is possible without any reasoning about speaker's intentions taking place. [Even though Jary's (2013) argument for non-mentalistic derivation of material implicatures thus allows for a mutual adjustment between secondary and primary meanings, in still unpublished work (Jary, unpublished manuscript), he suggests that the derivation of primary meaning may not even be necessary.]

A similar rationale may be applied to indirect speech acts. Imagine a context where the window is open and the speaker utters (12). Provided that it is desirable for the addressee to relieve the speaker's unpleasant feeling of cold, the primary content (13) can combine with the assumption that closing windows makes the air warmer to lead to the decision to close the window. In other words, without the mediation of any hypotheses about speaker's intentions the utterance of (12) can serve as a reason to close the window, and thus lead to a contextually appropriate interpretation (Kissine, 2013, pp. 102–125).


The secondary meaning (11), derived from (8), is an instance of what Jary calls material implicature. As we just saw, provided an overall expectation of Relevance, the derivation of material implicatures does not require hypotheses about the speaker's mental states. In this respect, material implicatures contrast with what Jary calls behavioral implicatures, whose derivation does require premises about the speaker's intentions, beliefs or desires. Take Grice's (1975) classic example of a recommendation letter which reads as (14). In order to derive the implicature that the candidate is not suitable for the position, in addition to the general presumption of cooperativeness, one needs premises such as (15) and (16).


That is, behavioral implicatures require understanding the speaker's motives, as well as making assumptions about the addressee's beliefs. This entails that the derivation of behavioral implicatures is underpinned by at least second-order Theory of Mind.

The same holds for irony; what the speaker intends to communicate through an ironical utterance is inherently different from the primary content. In order to grasp irony it is therefore necessary to make hypotheses about what the speaker believes, as well as about her assumptions about her addressee's beliefs (e.g., Bryant, 2012). For instance, to understand that the speaker of (17) actually hated the movie, the interpreter needs to assume not only that the speaker did not like the movie, but also that the speaker assumes that it is mutually obvious to her and to her addressee that she did not like the movie.

(17) This is the best movie I ever saw.

## 2.3. Pragmatic Processing with No Theory of Mind

At this stage, it becomes natural to question the Relevance theoretic assumption that the output of any type of pragmatic processing consists in a representation of complex communicative intentions. Recall that while Relevance theorists hold that the output of pragmatic processing is always a representation of the speaker's informative intention, they admit different stages of interpretative complexity. Following Sperber's strategy of Naive Optimism, the interpreter may just choose, among different interpretations activated in the context, the one that is the most accessible from his point of view. This interpretative strategy perfectly suits the derivation of material implicatures; as we just saw, these do not require any explicit representation of speaker's mental states. While Jary (2013) holds that Naive Optimism relies on non-mentalistic processes to reach speaker's informative intention or communicative goal, there is no reason why the resulting secondary meaning should necessarily be embedded within a meta-representationally complex attribution of informative intentions to the speaker.<sup>1</sup> The output of the interpretation of I have already had breakfast may just be a doxastic-type representation of the content [The speaker does not want a coffee and a croissant]. Likewise, the output of the interpretation of the indirect request It is cold in here may just be a conative representation of the addressee's closing the window. Context-sensitive, pragmatic processing should then be possible even in the absence of a second-order Theory of Mind. In addition, as also pointed out by Jary, such pragmatic processing need not be entirely egocentric. As mentioned earlier, very young children are sensitive to other people's perspective, which should allow them to inhibit interpretations that are relevant from their own point of view, but incompatible with the speaker's point of view.
It is therefore possible to posit an interpretation process which is sensitive to the speaker's beliefs but that does not rely on complex Theory of Mind either in its functioning or, pace Relevance theory, in its output.

<sup>1</sup>But see Jary (2010, pp. 183–185).

One may then envision a less homogenous picture of pragmatic processing and modify Sperber's (1994) scale of interpretative strategies as follows (see Jary, 2010, p. 186; Kissine, 2013, pp. 78–80):


This three-pronged hierarchy of interpretative strategies renders it unnecessary to resort to a specific pragmatic module. In addition, it is fully consistent with what is known about typical and atypical development. I have argued elsewhere that the first kind of interpretative strategy is at work in persons with autism spectrum disorder (ASD, henceforth) and the second in typically developing children below seven (Kissine, 2012, 2013). There is a robust consensus that, despite individual differences, most children and adults with ASD fail to pass first-order Theory of Mind tasks (e.g., Happé, 1995; Yirmiya et al., 1998; Baron-Cohen, 2000). However, impairment on first-order Theory of Mind does not necessarily prevent people with ASD from using pragmatic processing of the first kind, based on egocentric relevance (Kissine, 2012, 2013). True, there is a broad consensus that individuals with ASD struggle with "social" or "intersubjective" dimensions of language use. For instance, they often fail to produce informative, new and relevant conversational contributions, to respond on topic and to detect conversational "faux-pas" (e.g., Eales, 1993; Surian et al., 1996; Capps et al., 1998; Surian and Leslie, 1999; Kaland et al., 2002, 2011; Ziatas et al., 2003). However, recent research also shows that persons with ASD are capable of understanding metaphors, scalar implicatures (such as non-logical readings of some) and even indirect requests (Norbury, 2005; Pijnacker et al., 2009; Chevallier et al., 2010; Gernsbacher and Pripas-Kapit, 2012; Kissine et al., 2012, 2015).<sup>2</sup> Such a selective pragmatic profile is difficult to explain on a modular theory of pragmatics. By contrast, it makes sense once one admits the existence of egocentric pragmatic processing.

Traditional, verbally demanding versions of first-order Theory of Mind tasks also prove difficult for typically developing children below the age of four (e.g., Wellman et al., 2001). However, there is evidence that implicit understanding of other people's beliefs is present in typical development as early as at fifteen months (Onishi and Baillargeon, 2005; Southgate et al., 2007; Baillargeon et al., 2010).<sup>3</sup> Accordingly, typically developing children below four display awareness of their interlocutor's perspective and are already apt conversationalists (see above). That is, typically developing children can exhibit the second, allocentric type of interpretation.

However, until second-order Theory of Mind is mature, roughly around the age of seven, children have difficulties in understanding irony (e.g., Winner and Leekam, 1991; Filippova and Astington, 2008), and do not reach the third, Gricean stage. (Note that on Sperber and Wilson's idea of a pragmatic module, whose functioning and maturation are independent from Theory of Mind, it is unclear why reaching the developmental stage required for understanding irony should be concomitant with the development of second-order Theory of Mind.)

## 2.4. Interim Summary

The foregoing discussion of Relevance theory may be summarized in two general points. First, the steps leading to context-dependent, pragmatic interpretations do not necessarily involve assumptions about the speaker's mental states, but may be based exclusively on contextual accessibility considerations. Second, there are good empirical reasons for believing that context-dependent interpretation of linguistic meaning does not always result in hypotheses about the speaker's complex communicative intentions. In some cases, the interpretation output will consist only in a content that is relevant from the interpreter's own egocentric point of view; in some others, the output will be a content relevant from the speaker's perspective, but without necessarily involving the attribution of complex, multilayered communicative intentions to the speaker.

## 3. RECANATI: PRIMARY VS. SECONDARY PRAGMATIC PROCESSES

Recanati's (2004) two-tiered theory of pragmatic processing is a major contender to the monolithic, modular versions of Relevance theory, discussed in the previous section. The gist of his position is to posit two distinct types of pragmatic processing, primary and secondary, which differ both in their workings and in the output they yield. As his two-pronged theory of pragmatic processing does not necessarily rely on assumptions about speaker's communicative intentions, the division of pragmatic labor posited by Recanati seems to be in a better position than Relevance theory to accommodate the empirical data mentioned above. However, on closer inspection the distinction he proposes, be it in terms of internal pragmatic process functioning or output, is not entirely straightforward.

## 3.1. Process Workings

Recanati's secondary pragmatic processes are Gricean inferences, based on hypotheses about the speaker's intentions. According to him, such secondary pragmatic processes are reserved for the derivation of what has been called here secondary meanings, viz. of contents different from the utterance's literal content. By contrast, context-dependent derivation of primary meanings, in Recanati's model, is handled by primary pragmatic processes, which operate locally on the linguistic structure of the utterance.

<sup>2</sup>To be more precise, metaphor comprehension is impaired in many individuals with autism, but this impairment seems to be caused, to a large extent, by reduced receptive vocabulary, and not by Theory of Mind deficits (e.g., Norbury, 2005).

<sup>3</sup>Individuals with ASD, by contrast, do not seem to deploy such implicit belief understanding (Senju et al., 2010).

Primary pragmatic processes are determined by accessibility considerations, and do not rely on Theory of Mind. Lexical items give rise to concurrent activation of multiple semantic values. Primary pragmatic processes consist in selecting the most accessible among these values, to subsequently enter within the compositional computation of primary meanings. Primary meanings are thus gradually built as lexical items undergo context-dependent "saturation" (e.g., indexicals such as she or demonstratives such as this are assigned a referent), "enrichment" (e.g., and in Peter left Mary and she started to drink is interpreted as and then, as a result), "loosening" (e.g., the meaning of swallow in The ATM swallowed my credit card is relaxed to apply to non-living organisms), or "free transfer" (e.g., parked associated with the speaker and not her car in I'm parked in the back).

Now, as discussed in the previous section, it makes sense to assume that some accessibility-based, primary pragmatic processes are sensitive to the speaker's perspective, without involving complex mind-reading processes. Admitting that accessibility-based primary pragmatic processing may be allocentric remains compatible with the "Availability criterion" Recanati uses to draw a line between primary and secondary processes. The outputs of primary pragmatic processes, viz. primary meanings, can be made available to conscious introspection, as, for instance, in the case of explicit truth evaluation. Given a (declarative) sentence and a context of utterance, our judgements of truth and falsity, claims Recanati, bear on the contents yielded by primary pragmatic processes. The unfolding of primary pragmatic processes, however, is irreducibly unconscious: interpreters are not conscious of the steps that lead them from linguistic form to pragmatically enriched, primary meanings. Secondary pragmatic processes, by contrast, build on primary meanings, and may thus take the form of genuine propositional reasoning. Consequently they should be entirely available to conscious introspection. For instance, the derivation of irony may be made available to the interpreter's consciousness as a series of inferential steps. In other words, secondary pragmatic processes consist in (or, at least, can be reconstructed as) a sequence of (non-monotonic) inferential steps about the speaker's beliefs and intentions.

## 3.2. Types of Processes vs. Types of Meanings

A consequence of the way Recanati defines primary and secondary pragmatic processes is that in his theory the selection of the type of pragmatic processing (primary or secondary) is entirely determined by the interpretation input. While primary processes operate on lexical items, secondary pragmatic processes are intrinsically "post-propositional"; they consist in an inference from primary to secondary meanings. It thus seems that any pragmatic interpretation that does not directly build on the utterance's linguistic structure should only be derivable, in Recanati's theory, through Gricean inferences about speaker's intentions. This is problematic. Implicatures and indirect speech acts are derived, according to Recanati, from primary meanings. However, as we saw in the previous section, material implicatures and indirect speech acts, which are secondary meanings, may be derived with no appeal to the reconstruction of speaker's communicative intentions.

To be fair, it is not obvious whether, for Recanati, the conceptual precedence of primary over secondary pragmatic processes extends to psychological processing. He does go to great lengths to argue that implicature derivation is not necessarily handled by conscious and voluntary inferences about the speaker's intentions. Yet, such pragmatic processing is still secondary, according to him, because the inference from primary meaning to the implicature is available, ex post facto, to the interpreter's consciousness (Recanati, 2004, pp. 46–50, 70–71). Under one interpretation of this claim, secondary pragmatic processes may occur both at unconscious and conscious levels, but still differ from primary ones in terms of their workings. Derivation of secondary meanings should then always presuppose a complex Theory of Mind, which would make the developmental data discussed above as problematic for Recanati as they are for Sperber and Wilson. Recall, for example, that children with ASD (Kissine et al., 2012, 2015), as well as typically developing toddlers (Reeder, 1978; Shatz, 1978; Schulze and Tomasello, 2015), understand indirect requests. Another reading of Recanati's theory, more in line with the view argued for at the end of the previous section, is that some secondary meanings, such as material implicatures and indirect speech acts, may be derived through either primary or secondary pragmatic processes. This reinterpretation of Recanati's theory entails that types of pragmatic processing do not necessarily correlate with types (primary vs. secondary) of meanings. While this is the view I wish to defend, it is important to emphasize that the challenge now becomes to explain what drives the selection of the pragmatic process type.

## 3.3. Interim Summary

At this stage, we reach a rather complex picture. In agreement with Jary and Recanati, it makes sense to posit that some context-dependent, pragmatic processes do not involve Theory of Mind, but are accessibility-based. However, types of pragmatic processes do not correlate with types of meanings, as some secondary meanings (material implicatures and indirect speech acts) may be derived using contextual accessibility alone, without involving Theory of Mind. In addition, while the output of some pragmatic processing consists in attributing complex communicative intentions to the speaker, contextual interpretation of linguistic meaning may also yield representations of the utterance contents without involving any representation of the speaker's mental states. That is, secondary meanings, such as material implicatures or indirect speech acts, need not correspond to a complex meta-representation of speaker's informative intention. In Section 5, I will sketch a proposal where types of pragmatic processes are not determined by types of meanings. The main idea will be that types of meanings (primary or secondary) are recovered relative to contextually determined norms of interpretation, which may, but need not, target speaker's communicative intentions, and may be entirely egocentric or partly depend on the speaker's perspective. Pragmatic processes lead to and monitor the construction of utterance contents relative to norms of interpretation.

While this way of thinking about pragmatics may seem quite unusual, it has in fact straightforward parallels in cognitive science. On such a conception, pragmatic processing belongs to the broader category of meta-cognitive processes associated with epistemic acceptance. More particularly, the model I will defend has clear parallels with a contextualist view of epistemic acceptance, to the discussion of which I now turn.

## 4. TWO KINDS OF EPISTEMIC ACCEPTANCE

Acceptance refers to a mental action that consists in including a proposition among one's beliefs (hence making it available for subsequent action planning and inference). Thus defined, acceptance includes, for instance, accepting that one's recollection of an event is faithful enough, that a story one hears is truthful or that one's interpretation of a difficult passage in a book is accurate enough.

From an epistemological point of view, there are two, equally plausible, norms for acceptance that a rational system should follow. The first norm for acceptance is set relative to a certain confidence threshold; whenever a rational agent has a certain degree of confidence n that a proposition p is true (such that, say, 0.5 < n ≤ 1) she should accept p. The second norm obeys consistency requirements: a rational agent should accept any consequence that follows from a proposition or a conjunction of propositions she previously accepted. However, the conjunction of these two norms begets two notorious paradoxes (Hempel, 1962; Makinson, 1965). The first paradox is standardly illustrated with a lottery example. Imagine a lottery with one winning ticket out of a thousand. When one buys a ticket, there is a probability of 0.999 that it will lose. Since 0.999 is a fairly reasonable confidence threshold, the first epistemic norm of acceptance dictates that the buyer should accept that her ticket will lose. Now, every ticket has exactly the same chance to win, and, accordingly, one should accept, about each individual ticket, that it will not win. However, it follows, from the second norm of acceptance, that one should accept that no ticket in the whole lot will win. To see the second paradox, imagine a historian who compiles a lifelong work on, say, the reign of Peter the Great. As she writes, she has sufficient confidence for accepting each claim she makes. However, given the breadth of her endeavor it seems that it would be rational for her to accept that, as a whole, her book may contain some inaccuracies. Yet, if the book is taken as the sum of individual claims she accepted on the basis of the first norm, this acceptance should be irrational.
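The arithmetic behind both paradoxes can be checked with a short sketch. The Python below is an illustration only: the 0.5 threshold is one admissible value from the range given above, and the figures used for the historian case (500 claims, each held with confidence 0.99 and treated as independent) are hypothetical assumptions that do not come from the works cited.

```python
from fractions import Fraction

N_TICKETS = 1000             # lottery size used in the text
THRESHOLD = Fraction(1, 2)   # illustrative threshold (assumption, within 0.5 < n <= 1)


def accept_by_threshold(confidence, threshold=THRESHOLD):
    """First norm: accept a proposition when confidence exceeds the threshold."""
    return confidence > threshold


# Lottery: each ticket loses with probability 999/1000, so the first norm
# licenses accepting "ticket i will lose" for every single ticket.
p_single_loss = Fraction(N_TICKETS - 1, N_TICKETS)
each_loss_accepted = all(
    accept_by_threshold(p_single_loss) for _ in range(N_TICKETS)
)

# The second norm (closure under consequence) then demands accepting the
# conjunction "no ticket will win". Exactly one ticket wins, so that
# conjunction has probability 0 and fails the first norm.
p_no_winner = Fraction(0)
conjunction_accepted = accept_by_threshold(p_no_winner)


def confidence_all_true(per_claim_confidence, n_claims):
    """Historian case: confidence that n independent claims are all true.

    Independence is a simplifying assumption made for this sketch.
    """
    return per_claim_confidence ** n_claims


# Hypothetical book: 500 claims, each held with confidence 0.99.
book_confidence = confidence_all_true(0.99, 500)

print(each_loss_accepted, conjunction_accepted)
print(accept_by_threshold(0.99), accept_by_threshold(book_confidence))
```

In both cases each individual proposition clears the threshold while the conjunction that the consistency norm requires does not, which is the tension the two paradoxes trade on.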

These two infamous paradoxes may be dissolved by acknowledging that norms for epistemic acceptance vary from context to context (Kaplan, 1981). In some contexts, it is the proximity to truth that is important (e.g., How exact is my recollection of a particular utterance?), and it is the first norm that applies. In some others, it is the internal consistency of a set of propositions that matters (e.g., How consistent is my recollection of a conversation?), and it is the second norm that applies. Building on Kaplan's idea, Proust (2013, chapter 8) points out that there are more norms than these two. Acceptance may be guided by adherence to a shared disposition to act in a group; for instance, in conducting peace talks, a proposition ought to be accepted if, and only if, it is coherent with the team's general plan of negotiation. Or, in writing a novel, the writer will be guided in her acceptance of a proposition by coherence with the fictional background.

The major consequence of this contextualist view of acceptance is that the selection of the acceptance norm, viz. of the kind of acceptance at stake, is independent of the monitoring of success relative to this norm. The selection of this or that norm of acceptance depends on one's appraisal of the environment and practical goals: should I privilege truth, consistency, or adequacy with my group plans? Acceptance of the proposition itself, e.g., its integration within one's beliefs or within a line of argument, proceeds relative to this norm. The adequacy of the process of acceptance relative to the norm is then monitored and controlled at a metacognitive level, by specific procedural loops operating on aspects of cognitive processes (Koriat, 2000; Proust, 2013).

It is standard to draw a distinction between types of metacognitive control that can be made available to conscious introspection and those that are best seen as unconscious processes (Koriat, 2000; Shea et al., 2014; for a more nuanced view, see Metcalfe and Son, 2012). Metacognitive judgements may be brought to consciousness and take the form of deliberate inferences about one's beliefs and memories. An instance of a metacognitive judgement is making an inference about one's likelihood to provide an answer in a memory task, based, for instance, on the task complexity and previous experience. Metacognitive feelings also provide feedback in the epistemic domain, but they are difficult to reconstruct in inferential terms. The clearest example of a metacognitive feeling is the "tip-of-the-tongue" experience: the subject can more or less accurately assess the likelihood of her recalling a piece of information to which, however, she has no conscious access. Applied to epistemic acceptance, metacognitive feelings may provide the subject with an assessment of the adequacy of the output relative to the norm without her having conscious access to the grounds of this normative assessment.

This latter point is consistent with the idea that although metacognition operates on cognitive processes, it does not entail meta-representation of mental states. The two main arguments for holding that metacognition does not require mindreading are: (a) the differential appraisal of one's own and other people's performance on a cognitive task, and (b) the evidence of metacognitive processes in vertebrates that have no Theory of Mind (e.g., Koriat and Goldsmith, 1996; Koriat, 2000; Proust, 2013; Shea et al., 2014).

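To sum up, a contextualist theory of epistemic acceptance entails the following threefold procedural distinction:

1. The context-dependent selection of the acceptance norm.
2. The metacognitive monitoring and control of the acceptance process relative to this norm.
3. The resulting acceptance (or integration).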

I will now argue that this distinction exactly parallels pragmatic processing of a linguistic utterance. To the context-dependent norms of acceptance correspond context-dependent norms of interpretation; to the metacognitive processes correspond pragmatic processes, and to the result, i.e., to the acceptance itself, correspond the final representation(s) of the utterance content(s).

## 5. NORMS OF INTERPRETATION VS. PRAGMATIC PROCESSES

Deriving the meaning of an utterance is an epistemic operation, which terminates with the acceptance of a particular interpretation. Just as one should distinguish between the contextual selection of an acceptance norm and the acceptance process relative to this norm, one should not confuse the norm of an interpretation process with the interpretation process itself. A good way to understand this point is to consider the different ways utterance interpretation may go wrong. The distinction between the selection of the acceptance norm, and acceptance as assessment relative to this norm entails that one may be wrong in two different ways: either by failing to select the contextually appropriate acceptance norm or because of a meta-cognitive failure to assess adequately one's judgement relative to this norm. The same applies to pragmatic processing. Take, as an illustration, irony misunderstandings. There are two ways one can fail to understand irony. One may fail to understand that the speaker is being ironic and stick to the literal interpretation. In this case, the norm of interpretation has not been adequately set relative to the context of conversation. Or, one may understand that the speaker is being ironic but fail to discern what she actually means (realizing one's failure or not). This time the interpretative norm has been adequately set; however, either no interpretation is arrived at (as the interpreter adequately rejects all candidate contents) or the interpretation process delivers a content the interpreter mistakenly accepts as contextually adequate.

As we saw earlier, understanding irony requires grasping the speaker's beliefs and intentions; the appropriate contextual norm here is the recovery of the speaker's communicative intentions. Interpreting the utterance relative to this norm thus requires a specific monitoring and control process, which must draw on the attribution of second-order mental states. Once such a complex interpretative norm has been set, the control mechanism that yields awareness of interpretative failure or success is probably an instance of metacognitive judgement, which can be explicitly reconstructed as an inferential explanation (based on standard Gricean considerations of discrepancy between the literal meaning and the context).

Contrasting with irony, the first two interpretative strategies identified at the end of Section 2—egocentric and allocentric relevance—correlate with more modest norms of egocentric relevance, mitigated or not by the integration of the speaker's perspective. Pragmatic processes guided by such less complex interpretative norms rely on contextual accessibility without involving complex mind-reading. That is, they terminate once a sufficiently accessible interpretation has been reached.

It is worth emphasizing that such processes are genuinely context-dependent, and not guided by mere salience. In Recanati's (2004, p. 30) definition, the most accessible meaning of a lexical item "corresponds to the most active interpretation when the interpretation process stabilizes." Salience of a lexical meaning may be determined by its frequency of use, familiarity to the interpreter or prototypicality. Lexical meanings are activated according to their relative salience independently of and in parallel to contextual factors, which means that in contexts that favor non-salient meanings of a lexeme, its salient, but contextually inadequate meanings are still activated (see Giora, 2003). For instance, the most salient (frequent) meaning of bulb is [light bulb]. Peleg et al. (2001) found that this meaning is activated in (18), even though it is contextually inappropriate.

(18) The gardener dug a hole. The bulb was inserted.

Accordingly, even when the interpretation norm is simple egocentric relevance, metacognitive control will be intrinsically context-sensitive. For instance, it is very plausible that the contextually adequate interpretation of bulb in (18) may be reached using an entirely egocentric interpretative norm. However, this interpretation process is pragmatic, as it involves inhibiting the salient, but contextually inadequate, meaning [light bulb], which was automatically activated.

In line with Recanati's theory, accessibility-based processes probably remain out of conscious reach of the interpreters. While the progression of Gricean inferential processes, such as irony derivation, is controlled by metacognitive judgements, control of unconscious pragmatic processes is more likely to correspond to a metacognitive feeling. To be sure, this idea needs extensive empirical confirmation. However, it is intuitively plausible that garden-path interpretations are accompanied by a distinctive feeling "of something being wrong." As we just saw, the most salient meaning of bulb is "light bulb"; as a result, when the interpretation of (19) reaches the end of the sentence, backtracking is likely to occur.

(19) The bulbs John stored in his closet have flowered.

It seems plausible that this backtracking is accompanied by a metacognitive feeling of interpretative failure. If so, there is a similarity between the metacognitive feelings associated with non-Gricean pragmatic processes and the tip-of-the-tongue phenomenon: in both cases, the metacognitive feeling provides conscious feedback on an unconscious process.

Independently of the validity of the contrast between metacognitive judgements and feelings, be it in pragmatics or more generally, the parallel I am drawing between metacognitive control of epistemic acceptance and pragmatic processing provides a fresh conceptual framework for thinking about the ways context determines utterance interpretation. Context plays two distinct roles, which should not be confused. First, context is required to set up an interpretation norm. In some contexts, this norm will be complete recovery of the speaker's communicative intentions (for instance, the conversation is full of innuendo or the speaker is being clearly sarcastic). In some other contexts, the norm is the interpretation that is the most relevant given the speaker's perspective. And in still some other contexts, simple egocentric relevance is sufficient. Contextual selection of appropriate interpretative norms is largely independent of the linguistic input; drawing on world-knowledge and interactional experiences, it consists in the assessment of the frame of interaction<sup>4</sup>.

Second, the success of interpretative processes must be monitored and controlled relative to this norm. That is, pragmatic processes involve contextual selection among activated meanings and assessment of the unfolding interpretation relative to the interpretative norm. The input of the pragmatic interpretative processes is restricted to linguistic, or at least communicative, stimuli. However, the kind of contextual resources required for pragmatic processing, and in particular the extent to which it draws on Theory of Mind, depends on the interpretative norm it is geared to. In sum, the selection of interpretative norms is context-driven, while pragmatic processing is context-sensitive but driven by the selected interpretative norm.

The model I propose is summarized in **Table 1**. Its most important feature is that the typology of pragmatic processes is not determined by a hierarchy of meanings. Whether pragmatic processing involves Theory of Mind or not depends on the kind of interpretative norm that has been contextually selected. The crucial empirical prediction that follows is that some kinds of meaning may be derived through different types of pragmatic processing. This is consistent with the fact that, as we saw above, material implicatures or indirect speech acts may sometimes be interpreted in an entirely egocentric way.

Another straightforward prediction of this model is that the repertoire of available interpretative norms varies across interpreters. Drawing again a parallel with epistemic acceptance, some epistemic norms, e.g., logical consistency, require considerable cognitive maturation and emerge late in ontogenesis. Likewise, the interpretative norm consisting in the full recovery of the speaker's communicative intentions should not emerge until second-order Theory of Mind is mature. Some secondary meanings, such as irony, cannot be derived in the absence of such complex interpretative norms; some others, however, such as material implicatures and indirect speech acts, may be reached through less complex processing. This is why appropriate indirect request understanding has been observed in typically developing toddlers (Shatz, 1978; Reeder, 1978; Schulze and Tomasello, 2015) and children with ASD (Kissine et al., 2012, 2015), two populations with no complex Theory of Mind<sup>5</sup>.

This last point should not be taken as implying that once a more complex interpretative norm is operational it overwrites less complex norms, which were available earlier on; rather, pragmatic development enriches the repertoire of interpretative norms, which all remain available to the interpreter. It is plausible to assume that when a less demanding norm, such as egocentric relevance, appears to be suitable, it will be privileged over the more complex, Gricean norm. In many contexts, competent adult interpreters limit themselves to such an egocentric interpretation<sup>6</sup>. This is also consistent with the idea that, in most situations, interpreters automatically integrate the utterance content within their beliefs. On the model defended here, interpretation outputs are not always embedded within complex meta-representations of the speaker's communicative intentions (see Kissine, 2013, pp. 80–101; Kissine and Klein, 2013). While such meta-representational outputs probably form a barrier against automatic integration (cf. Sperber et al., 2010), they will not emerge in contexts where interpretation is geared toward a less complex interpretation norm.

A connected prediction is that differential processing of the same stimuli may be evidenced in experimental paradigms that make different interpretation norms salient to participants<sup>7</sup>. Therefore, great care must be taken, in the interpretation of experimental results, not to confuse the type of meaning supposed to be illustrated by the stimuli with the actual processing that took place in the participants' minds.

As a brief example of this last point, take irony comprehension in autism. As already mentioned, there is a widespread consensus that irony comprehension requires second-order Theory of Mind (e.g., Bryant, 2012). It is therefore expected that persons with ASD who do not have second-order Theory of Mind fail to understand irony (Happé, 1993; Leekam and Prior, 1994; Martin and McDonald, 2004). It may seem surprising, then, that in Chevallier et al. (2011) and Colich et al. (2012) participants with ASD correctly discriminated between "ironical" and "literal" interpretations. However, the task in these two studies consisted in choosing between two responses, literal vs. ironic.

<sup>4</sup>Of course, in some cases the linguistic input may trigger the switch from an egocentric to a more complex Gricean interpretative norm. In particular, search for ironic interpretation may be primed by prosody or discourse context (e.g., Kowatch et al., 2013; Spotorno and Noveck, 2014). More often, however, ironic interpretation will be triggered because interpretation driven by a more modest norm fails to deliver any plausible output.

<sup>5</sup> See also Spotorno and Noveck (2014) on individual differences in strategies used for irony detection.

<sup>6</sup>This assumption has interesting parallels with the "Good enough" approach, according to which syntactic and semantic processing remains shallow whenever a detailed interpretation is not required by the task at hand (Ferreira and Patson, 2007).

<sup>7</sup> In fact, this point is consonant with the well-established finding that some cognitive biases may be reduced by preventing participants from reading the experimenter's intentions within experimental instructions (e.g., Wright and Wells, 1988; Schwarz et al., 1991).

Furthermore, unlike their literal counterpart, ironic stimuli were incongruent with the preceding context and characterized by a marked intonation. In real-life situations, the interpretative norm associated with ironical interpretation is the recovery of the speaker's intentions. However, in the experimental studies under discussion, the more modest norm, consisting in rejecting the literal interpretation, sufficed to provide the correct response (a consequence acknowledged by Colich et al., 2012). Pragmatic processing guided by such a norm may remain entirely egocentric and accessibility-based in its workings.

## 6. CONCLUSION

This paper urges a change of perspective on pragmatic processing by distinguishing contextually dependent selection of interpretative norms from context-sensitive pragmatic processing. A crucial feature of this proposal is that types of processing (accessibility-based vs. Gricean) do not correlate with types of meaning (primary vs. secondary). At this stage, the model remains largely speculative, and many details need to be filled in. For instance, while I focused on three interpretative norms, inspired by Sperber (1994) and Jary (2010, pp. 185–187), there may be more. In addition, links between general metacognitive control and pragmatic performance should be empirically investigated. However, the model proposed allows a better integration of experimental data on pragmatic processing in early typical and atypical development. Furthermore, it contributes to building a research framework within which the interpretation of experimental results is sensitive to the interpretative norms participants are likely to select.

## FUNDING

This research is part of a research project funded by the F.R.S.-FNRS Research Incentive Grant F.4502.15 and the Fédération Wallonie-Bruxelles ARC-Consolidator Grant "Context in Autism."

## ACKNOWLEDGMENTS

I am grateful to Mark Jary and to Marc Dominicy for comments and discussions on an earlier draft of this paper. Comments by the two reviewers prompted me to greatly clarify the structure of the paper.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Kissine. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Theory of mind in utterance interpretation: the case from clinical pragmatics

### Louise Cummings \*

*English, Culture and Media Studies, School of Arts and Humanities, Nottingham Trent University, Nottingham, UK*

The cognitive basis of utterance interpretation is an area that continues to provoke intense theoretical debate among pragmatists. That utterance interpretation involves *some* type of mind-reading or theory of mind (ToM) is indisputable. However, theorists are divided on the exact nature of this ToM-based mechanism. In this paper, it is argued that the only type of ToM-based mechanism that can adequately represent the cognitive basis of utterance interpretation is one which reflects the rational, intentional, holistic character of interpretation. Such a ToM-based mechanism is supported on conceptual and empirical grounds. Empirical support for this view derives from the study of children and adults with pragmatic disorders. Specifically, three types of clinical case are considered. In the first case, evidence is advanced which indicates that individuals with pragmatic disorders exhibit deficits in reasoning and the use of inferences. These deficits compromise the ability of children and adults with pragmatic disorders to comply with the *rational* dimension of utterance interpretation. In the second case, evidence is presented which suggests that subjects with pragmatic disorders struggle with the *intentional* dimension of utterance interpretation. This dimension extends beyond the recognition of communicative intentions to include the attribution of a range of cognitive and affective mental states that play a role in utterance interpretation. In the third case, evidence is presented that children and adults with pragmatic disorders struggle with the *holistic* character of utterance interpretation. This serves to distort the contexts in which utterances are processed for their implicated meanings. The paper concludes with some thoughts about the role of theorizing in relation to utterance interpretation.

### Edited by:

*Gabriella Airenti, University of Turin, Italy*

### Reviewed by:

*Massimo Marraffa, Roma Tre University, Italy Maurizio Tirassa, University of Turin, Italy*

### \*Correspondence:

*Louise Cummings, English, Culture and Media Studies, School of Arts and Humanities, Nottingham Trent University, Clifton Lane, Nottingham, NG11 8NS, UK louise.cummings@ntu.ac.uk*

### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *06 July 2015* Accepted: *11 August 2015* Published: *26 August 2015*

### Citation:

*Cummings L (2015) Theory of mind in utterance interpretation: the case from clinical pragmatics. Front. Psychol. 6:1286. doi: 10.3389/fpsyg.2015.01286*

Keywords: clinical pragmatics, theory of mind, utterance interpretation, pragmatic disorder, reasoning

## Introduction

Few post-Gricean pragmatists would deny the central role of mind-reading or theory of mind (ToM) in utterance interpretation. But what is altogether more contentious is the exact nature of ToM in the complex cognitive processes whereby speakers produce, and hearers interpret utterances. This paper presents a particular view of this ToM-based process that is not popular among pragmatists or cognitive scientists in general. But it is a view that is supported by evidence of how utterance interpretation is impaired in children and adults with a range of pragmatic disorders. The view in question depends on three main claims. The first of these claims is that utterance interpretation involves the full exercise of rationality. When language users produce and interpret utterances, they are not constrained to operate within a particular rational sub-domain that has been identified by some theorists as communicative rationality. Rather, they are exercising a rational capacity, the key attribute of which is that it transcends efforts to circumscribe it. The second claim is that utterance interpretation goes well beyond the recognition of intentions à la Grice. In fact, it involves the full gamut of cognitive and affective mental states as well as the attribution of these states in more or less complex ways to the minds of language users. The third claim relates to a feature of utterance interpretation which is almost never explicitly acknowledged by theorists. That feature concerns the holism of the knowledge that language users draw upon during their interpretation of utterances. Pragmatic accounts of utterance interpretation tend not to emphasize the essential unity of this knowledge, preferring instead to represent certain aspects of knowledge as relevant to the interpretation of utterances. The way in which such accounts misrepresent the holism of knowledge will be challenged in this article.

So, it will be argued that any ToM-based process that is to play a role in utterance interpretation must have a fully rational, intentional, holistic character in the particular senses outlined above. But such an understanding of utterance interpretation will not be acceptable to very many pragmatists and cognitive theorists. The contention that it is not possible to circumscribe the rational capacity that is exercised during utterance interpretation—at least if we are to end up with an intelligible account of this interpretation—will be unpalatable to cognitive theorists and pragmatists, many of whom have a substantial appetite for theory construction. However, it will be argued that although this proposal is unpalatable for many theorists in the area, it is an authentic representation of the rational character of utterance interpretation. The contention that the type of mental state attribution involved in utterance interpretation extends well beyond the recognition of communicative intentions will be troubling for those theorists (e.g., Sperber and Wilson) who believe that such attribution is the province of a highly specialized ToM module. And the contention that the knowledge and beliefs which we bring to utterance interpretation exist as a unified whole will be unsettling to any pragmatist who has ever talked about the beliefs and knowledge that are relevant to utterance interpretation (the implication, of course, is that there are other beliefs and knowledge that are not relevant to interpretation). Although each of these contentions will be disturbing to theorists who hold dear certain assumptions about utterance interpretation, these assumptions must be challenged if we are to begin to think in more productive ways about the cognitive basis of such interpretation. At least this will be our starting point for the following discussion of the nature and role of ToM in utterance interpretation.

That the three claims introduced above are valid statements about normal utterance interpretation will be demonstrated by examining impairments in the use and understanding of utterances in children and adults with pragmatic disorders. To the extent that utterance interpretation involves the exercise of a rational capacity, we might expect to find deficits in reasoning and the use of inferences in individuals with pragmatic disorders. Moreover, to the extent that this rational capacity has an open texture which evades circumscription, we may expect these deficits to be evident in domains beyond communication. The claim that utterance interpretation has an intentional character that goes beyond the recognition of intentions may also be verified on the basis of evidence obtained from clients with pragmatic disorders. We may expect to find deficits in the attribution of cognitive and affective mental states other than intentions in children and adults with these disorders. These states play a vital role in the interpretation of utterances, although this role is seldom acknowledged by pragmatists. The claim that the knowledge we bring to utterance interpretation exists as a unified whole also receives empirical validation from clients with pragmatic disorders. To the extent that the holism of this knowledge poses difficulties for clients with pragmatic disorders, we may expect them to process utterances within highly restricted contexts that are isolated from the wider body of knowledge to which they belong. It will be the aim of later sections to demonstrate that there is substantial empirical support for all three of these claims in clinical subjects. In the meantime, we consider the implications of these claims for the analysis of a standard communicative exchange of the type routinely examined in pragmatics.

## A Standard Communicative Exchange

The analysis of a standard communicative exchange serves as a useful starting point for the following discussion. This analysis will emphasize the rational, intentional, holistic character of utterance interpretation. In doing so, it will force us to think differently—and, it is hoped, more critically—about the mainly modular proposals<sup>1</sup> that have tended to dominate cognitive accounts of interpretation. Consider the exchange below between Mark and Jane:

Jane: Do you fancy going to Spain again this summer with my parents?

Mark: They didn't cope well with the heat last year.

Jane: Okay then. I'll ask Bill instead.

The apparent ease with which Jane recovers the implicature of Mark's utterance (Mark clearly does not wish to go to Spain in the summer with Jane's parents) belies the complexity of the cognitive processes that are integral to this exchange. In demonstration of these processes, we need to examine the subconscious steps which Jane must take in order to recover the implicature of Mark's utterance. Before Mark can establish the communicative intention that motivates Jane's utterance, he must first undertake a number of pragmatic developments of the logical form of Jane's utterance. He must establish the referent of the pronoun "you" and the period of time that Jane has in mind when she uses the expression "this summer." He must also know the individuals that Jane is referring to through the use of the noun phrase "my parents." Only when referents are obtained for the indexicals "you," "this," and "my" can Mark even be said to be in possession of the proposition that is expressed by Jane's utterance. But Mark's cognitive input to this exchange does not end with the pragmatically enriched proposition of Jane's utterance. For this proposition is then subject to further pragmatic processing. At least part of this processing leads Mark to the presupposition of the iterative expression "again" in Jane's utterance: the presupposition that Jane and Mark have been to Spain before. It is also this additional processing which enables Mark to see that Jane is doing more than merely posing a question in the above exchange. For Jane is simultaneously suggesting that Spain should be the destination of their next summer vacation and that her parents should be their traveling companions during this trip. It is only when this particular speech act is established that Mark can be said to have recognized the communicative intention that motivated Jane's original utterance.

<sup>1</sup>These proposals include most notably contributions from relevance theory (Wilson and Sperber, 1991; Sperber and Wilson, 2002; Sperber, 2005; Wilson, 2005) and modular pragmatics theory (Kasher, 1991a,b, 1994).

From assigning referents to indexicals to establishing the illocutionary force of Jane's utterance, Mark must perform a range of complex cognitive processes in the above exchange. But he is not alone in this regard. Jane, too, must exercise similar cognitive processes if she is to succeed in making sense of Mark's contribution to this exchange. Jane must also undertake pragmatic developments of the logical form of Mark's utterance. She must establish that her parents are the referent of the pronoun "they" in Mark's utterance. She must also be able to establish a temporal referent for the expression "last year" in this utterance. Some concept narrowing is required to appreciate the meaning of "heat" in Mark's utterance. Jane must understand this term to mean the high temperatures in Spain rather than just a general state or quality of being hot. Even after she has arrived at the proposition which is expressed by Mark's utterance, Jane must engage in further pragmatic processing in order to obtain the implicature of his utterance. That implicature is calculable on the assumption that Mark is attempting to make a relevant contribution to the exchange notwithstanding his apparent failure to address the specific issue raised by Jane's direct question. That issue—Mark's willingness to undertake another trip to Spain in the company of Jane's parents—has significant implications for the social relationship that exists between Mark and Jane, particularly if Mark does not welcome the opportunity to spend more time with Jane's parents. Jane must use her knowledge of that relationship to decide that if Mark is going to decline her proposal to travel to Spain, he is most likely to do so indirectly by way of an implicature. The recognition of this particular implicature is signaled by Jane in her final utterance in the exchange when she states that she will present the same proposal to Bill instead.

It is through a complex interplay of cognitive processes that the utterances in the above exchange are meaningful to Mark and Jane. I have argued elsewhere that these processes take the form of a single, undifferentiated ToM-based mechanism which achieves the pragmatic enrichment of the logical form of an utterance and the recovery of implicatures proper (Cummings, 2014a). In order for Jane to establish the referent of the indexical "they" in Mark's utterance in the above exchange, she must attribute to him many of the same mental states that she will use to recover the implicature of his utterance. However, of more interest in the present discussion is not that a ToM-based mechanism is used in both the pre- and post-propositional processing of an utterance—we will take it as unproblematic that it is—but the exact nature of this ToM-based mechanism. Expanding on an earlier discussion in Cummings (2014b), it will be argued that this mechanism cannot be a cognitive module or other highly specialized inferential device and be an authentic representation of the cognitive processes involved in utterance interpretation, at least to the extent that the latter has the type of rational, intentional, holistic character proposed in this paper. To see that this is the case, we need only examine in more detail the cognitive processes which Mark and Jane must undertake in order to participate in the above exchange. These processes involve nothing short of a full-blown ToM of a type that lies well beyond the representational capacity of a cognitive module or other specialized inferential device. Empirical support for these processes is presented in later sections. In the remainder of this section, these processes are examined on their own terms.

That an unbounded rational capacity is exercised by Mark and Jane in the above exchange is a key component of the cognitive account of utterance interpretation proposed in this paper. If this capacity is a truly unbounded entity, as it is contended, then it should not be possible to place a limit on the rational considerations which come into play in the above exchange. However, if this capacity is a bounded construct which can be circumscribed and even modularized, then we must recognize a point at which a line can be drawn around the rational capacity that Mark and Jane are using in this exchange. That the former scenario best represents this rational capacity, both in relation to the calculation of implicatures and the primary pragmatic processes<sup>2</sup> used to obtain the propositions of utterances, will now be demonstrated. Let us return to Mark's utterance in the above exchange. That utterance was taken to generate the implicature that Mark does not want to travel to Spain with Jane's parents in the summer. According to the standard pragmatic account of utterance interpretation, that implicature is arrived at by a process of reasoning which uses as its "premises" certain mutually held expectations about the rational conduct of (verbal communicative) behavior. These expectations require Mark to contribute only those utterances to the exchange which will have some relevance to, or salience for, his communicative partner Jane. Accordingly, any utterance that Mark contributes must relate to the topic of Jane's question (a summer trip to Spain) and to the specific proposal contained in that question (the proposal to travel to Spain with Jane's parents). Mark's actual utterance fulfills these criteria only to the extent that Jane is able to draw the following inferences:


<sup>2</sup>The term "primary pragmatic processes" is used by Recanati (1993) to refer to pragmatic developments of logical form. These processes include saturation, free enrichment, and transfer. The reader is referred to Bezuidenhout (2010) for an excellent discussion of these processes.

The primary pragmatic processes, which Jane must employ in order to achieve reference assignment and lexical narrowing in (1)–(3) above, are thus guided by her rational expectations of Mark in this communicative exchange. But unlike most pragmatic accounts, which would have Jane's communicative rationality end here, the inferences in (1)–(3) are themselves only intelligible to the extent that Jane is in possession of a number of other rational expectations. Several of these expectations relate to Mark's competence as a user of the English language. For example, Jane must have as a rational expectation that if Mark wants to refer to more than one person, he will know that he must use a plural pronoun in order to do so. Jane will also have a series of other rational expectations. For example, she will have an expectation that Mark will have a sound understanding of concepts such as time and physical properties like temperature, and that he can appropriately capture these concepts and properties in linguistic expressions such as "year" and "heat," respectively. But Jane's rational expectations do not even end here. She will also have rational expectations about Mark's world knowledge such that she will expect him to know that Spain is a European country which has a warm climate. Jane will also expect Mark to know that it is this warm climate which makes Spain a popular destination for many tourists. In short, the inferences in (1)–(3) above presuppose an entire network of rational expectations which are not bounded in any way and cannot be circumscribed, as most pragmatic accounts of utterance interpretation would have it. What started out as a series of inferences, which were aimed at achieving reference assignment and lexical narrowing, quickly opened up into an array of rational expectations which were as complex as human thought itself.

The situation is no less complex when we consider the steps which Jane must take in order to obtain the implicature of Mark's utterance. To derive the implicature that Mark does not want to undertake a trip to Spain with Jane's parents, Jane must again engage in a process of reasoning which has certain rational expectations as its "premises." These expectations lead Jane to search for the relevance of Mark's utterance as a response to her question. It is that search for the relevance or salience of Mark's utterance to Jane that leads her to draw the inference in (7) from the propositions in (4)–(6):


On most pragmatic accounts of utterance interpretation, the role of (communicative) rationality is limited to a number of rational expectations which secure the recovery of the implicature in (7). But it is not difficult to demonstrate that this cannot be the case. This implicature is derivable on the basis of certain rational expectations which serve to establish the relevance of Mark's utterance as a response to Jane's question. One of these expectations is that Mark is behaving as a cooperative communicator in his exchange with Jane. But this single expectation presupposes a range of other rational expectations that are equally important to the recovery of the implicature in (7). For example, Jane cannot have a rational expectation that Mark will be a cooperative communicator in the exchange in the absence of further rational expectations to the effect that Mark has intact linguistic competence and that he can employ this competence to communicate effectively in a range of contexts. The expectation that Mark has intact linguistic competence in turn presupposes other rational expectations about a range of conditions which Jane may reasonably assume apply to Mark. For example, she must have a rational expectation that Mark's language development has proceeded along normal lines, that his linguistic competence has not been impaired by disease or injury, and that Mark is not currently under the influence of chemical substances which may, temporarily at least, disrupt his competence. These additional rational expectations are integral to what it means for Jane to have a rational expectation that Mark is a cooperative communicator. As such, they are no less important to the recovery of the implicature in (7) than the expectation of cooperation which is routinely included in pragmatic accounts of utterance interpretation.

It is important to point out that the difference between the view of utterance interpretation proposed in this paper and that which is adopted by standard pragmatic accounts is not merely one of emphasis. For the network of rational expectations examined above plays a particularly critical role in utterance interpretation. It is through this network that Jane's expectation that Mark is a cooperative communicator is even an intelligible thought. Put quite simply, no sense can be made of Jane's expectation that Mark is behaving as a cooperative communicator in her exchange with him in the absence of this extensive network of rational expectations. For most pragmatists and cognitive theorists, this network is rarely even alluded to in their theoretical accounts of the cognitive basis of utterance interpretation. In fact, modular accounts of this interpretative process actively eschew the types of rational considerations which are emphasized in the present context. The modularization of any body of knowledge, should that knowledge be used to interpret utterances or perform other cognitive processes, can only proceed by excluding the prior rational expectations which, it was argued above, Jane must have in order for her to view Mark as a cooperative communicator. But in the absence of this prior rationality, this modularized knowledge is not even intelligible as an account of the cognitive basis of utterance interpretation. That so many present-day pragmatists and cognitive theorists subscribe to modular accounts of utterance interpretation, I have argued elsewhere, is symptomatic of an impulse to theorize about concepts such as rationality and meaning (Cummings, 2002a,b, 2005a,b, 2012a,b, 2014b). Those arguments will not be rehearsed here. Rather, we continue our examination of the cognitive basis of utterance interpretation.

Alongside an emphasis on the rational character of utterance interpretation, the view proposed in this paper also challenges us to think differently about the intentional character of this process. Of course, all post-Gricean pragmatic accounts of utterance interpretation acknowledge the central role of the recognition of intentions in this interpretation. It is only when a speaker's intention in producing an utterance is recognized that a hearer may even be said to have understood what the speaker means. However, communicative intentions, whilst important, are merely one type of mental state which hearers must attribute to the minds of speakers during utterance interpretation. Indeed, if anything, intentions are dependent upon a range of other cognitive and affective mental states which assume a primary role in the interpretation of utterances. To see this, let us return to the above exchange between Mark and Jane. In order for Jane to recover the implicature of Mark's utterance in this exchange, she must be able to establish the intention that motivated this utterance. Mark produced his utterance with the intention of inducing in Jane the belief that he, Mark, does not want to travel to Spain with Jane's parents. But this intention is a type of secondary mental state which is dependent upon other mental states. Although these other mental states are not intentions, they are no less important to the recovery of the implicature of Mark's utterance than the mental state of intention which is privileged in pragmatic accounts of utterance interpretation. An examination of the different mental states that Jane must attribute to Mark before she can even recognize the intention that motivates his utterance illustrates this point well.

In order to recover the implicature of Mark's utterance, Jane must attribute certain knowledge and belief states to Mark. These states include knowledge that Spain is a European country, and the belief that Jane's parents are intolerant of heat. Jane must also attribute to Mark a range of states based on desire. One such state is that Mark wants to maintain his pre-existing social relationship to Jane by declining her proposal to travel to Spain with her parents indirectly by way of an implicature. Jane must also attribute to Mark a desire to undertake foreign travel in order to present her proposal to him in the first place. Alongside knowledge, belief, and desire states, Jane must also attribute certain states of ignorance or lack of knowledge to Mark. For example, she must attribute a lack of knowledge of her summer travel plans to Mark in order to make her own verbal behavior, the revelation of those plans, a rational move in the above exchange. Jane's verbal behavior is also only rational to the extent that she is able to attribute to Mark a lack of knowledge of any forthcoming events that may coincide with, and preclude, a summer trip to Spain. Alongside cognitive mental states, a range of affective mental states are also integral to Jane's recovery of the implicature of Mark's utterance. Mark's smiling face and relaxed demeanor may lead Jane to attribute a state of happiness to him. To the extent that Jane wants Mark to accept her travel proposal, the attribution of this particular affective state to Mark may encourage Jane to present her proposal to him now rather than in 2 days' time, when she knows Mark must have a tooth extracted at the dentist. Jane may also attribute to Mark a disgust for Spanish food and a fear of flying, two affective mental states which she recognizes may incline Mark to reject her travel proposal.

It emerges that the full panoply of mental states—intentions, knowledge, beliefs, desires, ignorance, happiness, disgust, and fear—may be attributed to the mind of a speaker during the recovery of an implicature. It also emerges that intentions hold no special logical position within this wider set of mental states, notwithstanding their dominance in pragmatic accounts of utterance interpretation. The combination of these factors leads one to doubt whether any cognitive module which is specialized to process intentional data could even begin to represent the mental states that are involved in utterance interpretation. Like rational expectations before them, mental states exist not as isolable units, but as part of a larger network of intentional phenomena. Indeed, it is on account of this wider network that intentions and other states are even intelligible mental phenomena. It was demonstrated above that Jane is not simply attributing an intention to Mark when she interprets his utterance as a rejection of her proposal to travel to Spain. If anything, that intention was a type of secondary mental state that was only attributed to Mark after Jane had already attributed a range of other cognitive and affective mental states to him. That these other states are also instrumental to the recovery of Mark's implicature has implications for the type of cognitive structure which can play a role in utterance interpretation. Specifically, that structure cannot be a cognitive module that is specialized for the recognition of intentions, as most pragmatists would have it. In fact, any type of cognitive module serves only to exclude the very intentional phenomena that make the recognition of intentions during utterance interpretation intelligible. A quite different form of description is needed. Some thoughts about what that description may involve are addressed subsequently.

We turn to the final feature of the alternative view of utterance interpretation proposed in this paper. That feature is the holistic character of interpretation. To some extent, this feature has already been addressed. It has been argued that the rational expectations and intentional phenomena which are integral to utterance interpretation are not isolable in any sense but exist as a unified whole. This same holism applies to the knowledge which speakers and hearers bring to utterance interpretation. That knowledge is variously captured by pragmatists in expressions such as "background knowledge," "mutual knowledge," "shared knowledge," and "world knowledge." These expressions reflect the fact that speakers and hearers use knowledge not only of each other but also of states of affairs in the world during their interpretation of utterances. This was evident in Jane's exchange with Mark, where Jane used knowledge of Mark's mental states as well as her knowledge of people, places, and events in the world to derive the implicature that Mark does not want to travel to Spain in the summer with her parents. For most pragmatists, only certain aspects of Jane's knowledge are relevant to her interpretation of Mark's utterance in this exchange. So, while her knowledge that Spain is a European country may be judged to be relevant to her interpretation of Mark's utterance, her knowledge that Spain has had three recessions in the last 5 years may not be considered to be relevant. Also for most pragmatists, the former knowledge can be circumscribed within a cognitive module or other specialized inferential device, while the latter knowledge can be disregarded as somehow irrelevant to Jane's interpretative task. For the sake of argument, let us assume that this account of the knowledge that is used in utterance interpretation is not just possible but is obtained in the particular case of Jane's exchange with Mark. What would such a body of knowledge look like?

The answer to this question is that we do not have the first idea what such a body of knowledge would look like. In fact, we must concede the complete unintelligibility of this knowledge. To understand why, we need only examine further the knowledge that Jane brings to her interpretation of Mark's utterance in the above exchange. It was suggested that this knowledge might contain the proposition that Spain is a European country. This proposition may even be represented within a cognitive module that is specialized for utterance interpretation. And for most pragmatists, the matter ends here. But it is not difficult to demonstrate that this single proposition depends on other propositions for its own intelligibility. For example, the proposition that Spain is a European country presupposes propositions to the effect that Europe is a continent and that Spain is one of several nations in the continent of Europe. Let us assume that both of these propositions are also represented within the cognitive module that Jane uses to interpret Mark's utterance. Surely now we can draw a line around the knowledge that must be permitted entry to the module. But the matter does not end here either. For the module must also contain knowledge to the effect that Spain and other European nations have their own cultures and languages, that some of these languages (e.g., Portuguese) are also spoken in South American countries and that bull fighting is a cultural tradition in Spain. The point that is demonstrated by means of this example is that there is no stage at which we can throw a net around the knowledge that Jane uses during utterance interpretation and then claim this knowledge to be complete. A fortiori, Jane's knowledge cannot be fully circumscribed within a cognitive module, even one which is specialized for utterance interpretation.

The problem that the holism of knowledge poses for pragmatists and cognitive theorists is that it is not possible to circumscribe the knowledge that we bring to utterance interpretation and arrive at an intelligible account of that knowledge. Regardless of where we think we can draw a boundary around the knowledge that is relevant to utterance interpretation, it can be readily demonstrated that it is only possible to make sense of this circumscribed knowledge by using knowledge that lies outside of the boundary. This boundary typically takes the form of an encapsulated cognitive module or a series of such modules, each of which is specialized to perform a particular function. This modular account has a certain appeal to pragmatists. It appears to be complete in the sense that a cognitive module contains all the knowledge that is relevant to utterance interpretation. It also appears to embody cognitive efficiencies in that the need for extensive searches of background knowledge is obviated when knowledge that is relevant to utterance interpretation is brought together in a specialized cognitive module. But this completeness and efficiency are more illusory than real. For what we have produced is not a complete account of utterance interpretation but an unintelligible account, which lacks a prior concept of knowledge with which to make sense of the circumscribed contents of a cognitive module. The dilemma that confronts pragmatists is the same dilemma that confronts any cognitive theorist who believes it is possible to produce a complete theoretical account of concepts such as meaning, rationality, and knowledge. Such an account appears to achieve the completeness of a theory. However, it only does so by eschewing the very rational and epistemic concepts that make that account intelligible.

Thus far, the discussion has addressed a number of conceptual issues relating to the cognitive basis of utterance interpretation. It has been important to reflect on these issues for at least two reasons. First, they have encouraged us to take a critical stance toward dominant (modular) accounts of utterance interpretation. Second, these issues have also encouraged us to think about what an alternative account of utterance interpretation might look like, especially one that is construed along the rational, intentional, holistic lines proposed in this paper. Having addressed these conceptual issues, we are now in a position to consider if there is any empirical support for this alternative view of utterance interpretation. That support, it will be argued, is to be found in a range of clinical disorders. Specifically, children and adults with pragmatic disorders exhibit problems in the use and understanding of utterances which are consistent with the alternative view of utterance interpretation that has been outlined above. It is to an examination of these disorders, and their implications for an account of utterance interpretation, that we now turn.

## Empirical Support from Pragmatic Disorders

The view of utterance interpretation proposed in this paper receives substantial empirical support from a range of pragmatic disorders. That view is expressed in a claim to the effect that utterance interpretation has a rational, intentional, holistic character. As a means of validating this claim, three types of clinical case will be considered in this section. To the extent that utterance interpretation involves the exercise of a fully unconstrained rational capacity, and not some narrowly defined communicative rationality, the first of these clinical cases presents evidence of the presence of deficits in reasoning and the use of inferences in domains beyond communication in subjects with pragmatic disorders. It was also argued above that intentions represent a mere subset of the cognitive and affective mental states that must be attributed to the minds of speakers during utterance interpretation. To the extent that this is the case, we may expect to find evidence of deficits in the attribution of a range of mental states beyond those of intention in children and adults with pragmatic disorders. It was also argued above that any account of the cognitive basis of utterance interpretation must be able to represent the holism of knowledge. To the extent that the knowledge we bring to utterance interpretation exists as a unified whole, we may expect to find evidence of a tendency in children and adults with pragmatic disorders to process utterances within restricted or limited contexts. These contexts may be expected to privilege certain (dominant) meanings of words and utterances and limit the extent to which hearers seek alternative (nondominant) meanings. Having examined the empirical support which exists for this view of utterance interpretation, the paper concludes with some thoughts about its implications for theories of utterance interpretation.

### Deficits in Reasoning and Inference

There is now extensive evidence of deficits in a range of inferences related to utterance interpretation in clients with pragmatic disorders<sup>3</sup>. Deficits in reasoning and inference have been reported in children with specific language impairment (SLI) and primary pragmatic difficulties (Botting and Adams, 2005; Adams et al., 2009), high-functioning children with autism (Dennis et al., 2001), children with attention deficit hyperactivity disorder (McInnes et al., 2003; Berthiaume et al., 2010) and hydrocephalus (Dennis and Barnes, 1993; Barnes and Dennis, 1998), and in pediatric traumatic brain injury (Dennis and Barnes, 2001; Moran and Gillon, 2004). Deficits in inferential aspects of utterance interpretation have also been reported in adult-onset conditions including schizophrenia (Corcoran, 2003), multiple sclerosis (Laakso et al., 2000), and right-hemisphere damage (RHD) (Tompkins et al., 2000, 2001, 2009; Lehman-Blake and Tompkins, 2001). These studies certainly support the claim that there is disruption to inferences which play a role in utterance interpretation. But this claim does not go far enough for our present purposes. In order to support the contention that utterance interpretation involves the exercise of a fully unconstrained rational capacity, we must also be able to identify deficits in reasoning and inference in noncommunicative domains. In effect, we must be able to give an affirmative answer to the question: Is there any evidence that children and adults with pragmatic disorders also experience deficits in reasoning and inference in areas other than utterance interpretation? These deficits include impairments across a range of inference types and cognitive domains, and not just those inferences which are associated with utterance interpretation. It will be argued that evidence to this effect can be readily presented for a number of the pragmatic disorders introduced above.

It is not difficult to demonstrate the existence of impairments in a range of inference types in clients with pragmatic disorders. The breadth of these inferential impairments across domains provides support for the view that an unconstrained rational capacity is exercised during utterance interpretation. Children with SLI exhibit poorer deductive reasoning (Newton et al., 2010) and analogical reasoning (Leroy et al., 2012, 2014) than normally developing children. Individuals with autism spectrum disorder (ASD) exhibit deficits in analogical reasoning particularly about non-living items (Krawczyk et al., 2014) and defeasible conditional reasoning (Pijnacker et al., 2009). Adolescents with moderate to severe traumatic brain injury exhibit impairments in analogical reasoning ability (Krawczyk et al., 2010). A broad range of inferential deficits also exists in adult-onset conditions. Adults with schizophrenia display impaired probabilistic inference (Averback et al., 2011), transitive inference (Titone et al., 2004), associative inference (Armstrong et al., 2012), analogical reasoning (Simpson and Done, 2004), inductive reasoning (Corcoran, 2003), and deductive reasoning (Mirian et al., 2011). In a study of patients with acute aphasia, non-linguistic abstract reasoning was the only cognitive domain not to show improvement in the first year after stroke (El Hachioui et al., 2014). Adults with penetrating head injuries and focal lesions to the parietal cortex display deficits in transitive reasoning (Waechter et al., 2013). Adults with a range of dementias also exhibit deficits in reasoning. Yoshiura et al. (2011) found evidence of deterioration of abstract reasoning ability in individuals with Alzheimer's disease and amnestic mild cognitive impairment. Vartanian et al. (2009) reported that patients with frontal variant frontotemporal dementia (FTD) display impairments when engaging in transitive reasoning about familiar spatial environments.

These studies clearly demonstrate that individuals with pragmatic disorders experience an array of inferential deficits. The fact that these deficits also occur across domains such as reasoning about concrete and abstract entities, about living and non-living items and during language processing and visuospatial cognition suggests that there is disruption to a central rational capacity in individuals with pragmatic disorders rather than impairment of a specialized communicative rationality. Just such a rational capacity is posited in the view of utterance interpretation proposed in this paper. That the exercise of a fully unconstrained rationality is at work in utterance interpretation is now supported on conceptual and empirical grounds. On conceptual grounds, it was shown that the rational expectations which make communication possible are only intelligible to the extent that there exist other rational expectations which are as wide-ranging as human thought itself. It is simply not possible to circumscribe or modularize the rational expectations, thoughts, and concepts that play a role in utterance interpretation. This conceptual argument in favor of an unconstrained rational capacity receives empirical support from the study of pragmatic disorders. It was argued that children and adults with pragmatic disorders do not merely display impairments in the use of language-based inferences. Rather, inferential impairments in these subjects cut across cognitive domains and types of reasoning. The latter findings suggest that a central rational capacity is disrupted in individuals with pragmatic disorders, and not some rational sub-domain that is specialized for communication. But we must go further than the demonstration of an unconstrained rational capacity if the present view of utterance interpretation is to be upheld. For that view also makes specific claims about the intentional character of this process. It is to an examination of the empirical support for these claims that we now turn.

### Deficits in Mental State Attribution

It was argued above that post-Gricean pragmatic accounts of utterance interpretation routinely acknowledge the central role of the recognition of intentions in communication. Many of these accounts also argue for the existence of an inferential device or cognitive module that has become specialized to the task of intention recognition (e.g., Sperber and Wilson's relevance theory). That the recognition of intentions is integral to communication is one of the few indisputable facts of utterance interpretation. But what is often overlooked is that the type of mental state attribution involved in utterance interpretation extends more widely than the attribution of communicative intentions to the minds of speakers. In fact, the interpretation of any linguistic utterance involves the attribution of the full range of cognitive and affective mental states to the minds of other communicators. To the extent that this wide-ranging intentional capacity is implicated in utterance interpretation, it should be possible to find evidence of deficits in the attribution of mental states other than intentions in children and adults with pragmatic disorders. These states include cognitive mental states like knowledge, belief, and pretense and affective mental states such as happiness, fear, and anger. To the extent that evidence of this kind is forthcoming, it may be used to support two claims. The first of these claims is that there is no limit on the type of mental states that may play a role in utterance interpretation and that may be disrupted when interpretation is impaired. The second claim is that it makes no sense to talk about a cognitive module that is specialized to undertake the recognition of intentions when such a device would require nothing less than the modularization of the whole of human thought about the minds and behavior of other people. The necessary general nature of this cognitive module precludes any such specialization.

<sup>3</sup>For a detailed examination of this evidence, the reader is referred to chapter 2 in Cummings (2014a).

That the recognition of communicative intentions is impaired in individuals with pragmatic disorders has been demonstrated in a number of studies. For the most part, these studies reveal a failure on the part of subjects to recover the implicatures of utterances or establish the illocutionary force of speech acts. In this way, children with SLI have been found to have difficulty deriving scalar implicatures (Katsos et al., 2011), while children with pragmatic language impairment (PLI) perform significantly more poorly than those with SLI on questions targeting implicature (Ryder et al., 2008). Pragmatic impairments in schizophrenia are known to compromise the comprehension and recognition of speech acts, maxims and implicatures (Tényi et al., 2002; Mazza et al., 2008). McNamara et al. (2010) reported that patients with Parkinson's disease are less likely than control subjects to activate indirect meanings of implicatures. The interpretation of implicatures is also impaired in adults with RHD (Kasher et al., 1999). These studies support the claim that communicative intentions are a problematic mental state category for children and adults with pragmatic disorders. But then so, too, are a range of other cognitive and affective mental states. Children with autism and Asperger syndrome (AS) have been found to refer predominantly to desire and make few references to thought and belief in their use of assertive speech acts (Ziatas et al., 2003). Normally developing children in the same study used a higher proportion of references to thought and belief. Pretense is a problematic mental state for young children and adolescents with autism (Bigham, 2008; Morsanyi and Handley, 2012). Affective mental states are also impaired in autism. Philip et al. (2010) found deficits in the recognition of basic emotions (happiness, sadness, anger, disgust, and fear) across facial, body movement and vocal stimuli in adults with ASD.

Beyond autism, cognitive and affective mental states are also impaired in a range of other clinical populations with significant deficits of utterance interpretation. Children and adults who have genetic syndromes with or without intellectual disability exhibit difficulties with a range of mental state categories. Porter et al. (2008) found a specific deficit in understanding false belief in the subjects with Williams syndrome in their study. Ho et al. (2012) found that individuals with velo-cardio-facial syndrome show impairments in the attribution of complex mental states to abstract visual stimuli. The attribution of a range of mental states is also disrupted in adult-onset conditions. Cognitive and affective ToM is impaired in patients with paranoid schizophrenia (Montag et al., 2011). Individuals with a high level of negative symptoms of schizophrenia have been found to display selective impairment in their ability to attribute affective mental states (Shamay-Tsoory et al., 2007). There are impairments of cognitive and affective ToM in individuals with semantic dementia, with awareness of affective but not cognitive ToM persisting into the moderate stage of the disease (Duval et al., 2012). Patients with FTD are impaired relative to controls in the recognition of the emotions of anger, fear, disgust, and happiness through facial features (Oliver et al., 2014). These patients also mislabeled negative facial expressions as happy more often than controls, a finding that suggested a deficit in the representation of positive affect in FTD. Henry et al. (2006) found that the recognition of basic emotions (e.g., disgust, anger) and the capacity for mental state attribution was significantly reduced in 16 adults with traumatic brain injury relative to controls.

It is clear from these studies that the meta-representational deficit in clients with pragmatic disorders extends well beyond the recognition of intentions. What can we conclude from this finding? The only possible conclusion is that it makes no sense to talk about a cognitive module that is specialized for the recognition of intentions, or even just communicative intentions, when the type of meta-representational capacity involved in utterance interpretation extends into every aspect of our thinking about the thoughts and behavior of other people. Such a general cognitive capacity cannot be represented by a cognitive module, even a module that is constructed along the broadest possible lines, and be intelligible in the absence of a range of intentional data that lie outside of the module. In the end, the intentional character of utterance interpretation comes to mean much more than the recognition of communicative intentions. For these intentions make sense only within a complex network of other cognitive and affective mental states which are as pervasive as human thought itself. What makes it seem that these intentions can be removed from this network and represented in their entirety within a cognitive module is the assumption that it is possible to develop a theory of these mental phenomena. That assumption will be critically evaluated in the final section below. In the meantime, we turn to the third and last feature of utterance interpretation that is proposed in this paper. That feature concerns the holism of the knowledge that we bring to utterance interpretation.

## Deficits in Background Knowledge

It is important to begin the discussion of this final feature of utterance interpretation with a note of caution. The deficits in background knowledge that we will address in this section should not be taken to mean that individuals with pragmatic disorders do not know that France is a European country, that potatoes are a type of vegetable and that fish live in water. On the contrary, most children and adults with pragmatic disorders know all these things and more. Rather, what is being claimed here is that individuals with pragmatic disorders tend to interpret utterances within limited or restricted epistemic contexts. The key feature of these contexts is that they circumscribe the knowledge that hearers could potentially use to interpret utterances. While this tendency may simply reflect the wider processing limitations of subjects with pragmatic disorders—a context of just a few propositions is easier to retain in memory, etc.—its effect on utterance interpretation can be devastating. For example, the implicature that a hearer may derive from an utterance in a small, restricted context may not be the implicature that the speaker intended to convey. Also, it may not be possible to overturn or defeat an implicature that is immune to changes within the wider network of knowledge that attends utterance interpretation. Such a hearer may persist in upholding a particular implicature of an utterance or the dominant meaning of a word when it is clear from the wider context that such interpretations are erroneous. In this section, we will be concerned to establish if such a pattern of misinterpretation actually exists among children and adults with pragmatic disorders. To the extent that it does, we will have some empirical support for the claim that any account of utterance interpretation must succeed in representing the essential holism of knowledge.

Children and adults with pragmatic disorders often experience significant difficulties in the processing of context. These difficulties are typically documented during tasks that require the resolution of ambiguities based on linguistic context<sup>4</sup>. Jolliffe and Baron-Cohen (1999) reported that normally intelligent adults with autism or Asperger's syndrome are less able than normal controls to use context to interpret lexically or syntactically ambiguous sentences that are presented auditorily. Using a lexical ambiguity resolution task, Norbury (2005) demonstrated that children with language impairment and ASD plus language impairment do not use context as efficiently as their language-intact peers to suppress irrelevant meanings. Difficulties suppressing contextually irrelevant meanings have also been reported in children with hydrocephalus (Barnes et al., 2004). Andreou et al. (2009) examined sentence context effects in homonym meaning activation in patients with schizophrenia. Unlike control subjects, who exhibited a pattern of selective target facilitation following the presentation of sentences which biased either the first or second meaning of equibiased homonyms, no significant target facilitation was observed in the patients with schizophrenia in this study. Grindrod and Baum (2003) examined the ability of subjects with right-hemisphere damage (RHD) and left-hemisphere damage (LHD) and nonfluent aphasia to use local sentence context information to resolve lexically ambiguous words. Subjects with nonfluent aphasia activated both meanings of ambiguous words regardless of context at a short interstimulus interval and neither meaning at a long interstimulus interval. The only contextually appropriate meanings to be activated in the subjects with RHD occurred in second-meaning biased contexts at a long interstimulus interval. Grindrod and Baum concluded that LHD and RHD lead to deficits in using local context information to complete ambiguity resolution.

Aside from ambiguity resolution, there is also extensive evidence of the failure of subjects with pragmatic disorders to use context appropriately during the processing of utterances for their implicatures. Ryder et al. (2008) examined the ability of two groups of typically developing children and 27 children with SLI to use context to generate implicatures in response to questions. Nine of the 27 children with SLI were pragmatically impaired. Only when an answer to a question was provided by pictorial context did the children with SLI perform similarly to their peers in the use of context to generate implicatures. However, children with PLI performed significantly more poorly than the rest of the SLI group on questions that required implicatures, leading the authors to conclude that these children have particular difficulty in integrating contextual information. Loukusa et al. (2007) examined the answers given by children with AS or high-functioning autism (HFA) to contextually demanding questions. Analyses of the answers given by these children revealed that they had all tried to use contextual information, albeit that they had done so incorrectly. The examination of a category of error not produced by the normally developing children in the study indicated that the children with AS or HFA continued to process questions even after a contextually relevant answer had been given. Titone et al. (2002) found that patients with schizophrenia showed reduced priming for literally plausible idioms (e.g., kick the bucket) but intact priming for literally implausible idioms (e.g., be on cloud nine) compared with control subjects. These authors concluded that patients with schizophrenia can make normal use of context only when conditions (e.g., the implausibility of certain idiomatic meanings) reduce the need for controlled processing.

By way of summary, let us reflect on the significance of these empirical findings for the holism of knowledge during utterance interpretation. These studies demonstrate that processing limitations in individuals with pragmatic disorders lead to the interpretation of utterances in highly restricted contexts. Within these contexts, utterances and lexically ambiguous words are frequently misinterpreted, as individuals with pragmatic disorders are unable to revise their understanding of language to reflect wider contextual information. In effect, pragmatic disorders directly disrupt the holism of the knowledge that we bring to utterance interpretation. The processing limitations of children and adults with these disorders force them to view this background knowledge as containing isolable elements that can exist apart from other contextual information. The erroneous interpretations arrived at by the subjects in the above clinical studies are a clear demonstration of what can go wrong when such a view of this knowledge exists. What appears to be a restricted, self-contained context of background knowledge is in fact a complex informational nexus that is coextensive with human thought itself. No component or element

<sup>4</sup> It may be objected that linguistic context is distinct from epistemic context and that, for this reason, these tasks cannot reveal anything about the background knowledge that we bring to utterance interpretation. However, this knowledge should be interpreted broadly to refer to any information that we may use to interpret utterances. All information, including information from the linguistic context of an utterance, is background knowledge in this sense.

of this nexus can be separated from any other component or element and be an intelligible representation of the context that attends the interpretation of utterances. What makes it seem otherwise is a strong impulse to theorize about the cognitive basis of utterance interpretation. This impulse can now be seen to distort the holistic character of utterance interpretation in much the same way that it distorted the rational and intentional character of this cognitive process. In the next section, we consider the only possible route through this theoretical impasse.

## The Way Forward

Throughout this discussion, the urge to theorize about the cognitive basis of utterance interpretation has been cast in the role of villain. It is now time to examine that urge directly, and explain why it has such disastrous consequences when utterance interpretation is at issue. Theories of a whole range of phenomena abound in science and elsewhere. We do not think it strange if physicists develop theories of the gravitational forces between the earth and the moon. In fact, we would be surprised if we discovered that such theories did not exist. Theories explain and predict events and behavior in the world, and our ability to make sense of our environment would be significantly diminished without them. Theories strive for completeness in that they must account for all the data within a particular domain. And as any scientist will tell you, a theory that cannot account for all the data in an area will be very short lived indeed. But when we turn to utterance interpretation, the idea that it is possible to develop a theory of this phenomenon is quite a different proposition altogether. The completeness aspired to by theorists in other areas of inquiry is decidedly destructive when we turn to a rational, intentional, holistic phenomenon like utterance interpretation. For here the focus of our theoretical efforts is concepts such as rationality and intentionality which, as we have seen, are nothing short of human thought itself. The physicist who develops a theory, even a fully complete theory, of the gravitational forces between the earth and the moon still has a set of rational concepts with which to make sense of that theory. The pragmatist who develops a theory of the cognitive basis of utterance interpretation must arrive at a fully complete account of rationality and intentionality. But in the absence of rational concepts outside of this theory, the pragmatist's theoretical enterprise lacks the intelligibility of the physicist's enterprise.

This view of the unintelligibility of theories of rationality and intentionality derives from the philosophical insights of Hilary Putnam (e.g., Putnam, 1988, 1994, 1995)<sup>5</sup>. For many years, Putnam has railed against a certain way of doing philosophy which can make it seem that the only way in which we can make progress on concepts such as truth, meaning, and rationality is to construct theories of these concepts. Such theories, Putnam argues, only appear intelligible on the assumption that we can occupy a metaphysical standpoint. From this standpoint, it seems that we can survey human thinking in its entirety without in turn presupposing the rational concepts which make that thinking intelligible. But, to the extent that this standpoint is devoid of rational concepts (how else are we to achieve the completeness of a theory of human rationality?), what we end up with is not a complete account of rationality or meaning but an unintelligible account. In fact, in the absence of prior rational concepts, such a standpoint is a "we know not what." In effect, the pragmatist who believes it is possible to generate a theory of utterance interpretation is in the same position as the metaphysical realist who has theoretical aspirations in relation to philosophical concepts. The pragmatist believes it is possible to capture the rational, intentional, holistic character of utterance interpretation within a cognitive scientific theory. This theory might be constructed around a cognitive module, or a series of such modules, or some other inferential device. However, if the discussion of the preceding sections has demonstrated anything, it is that such a theory is nothing less than an account of human thought. But at that point, what we have is not a complete account of utterance interpretation but an unintelligible account. Like the metaphysical realist, the pragmatist has not succeeded in producing an account that we can recognize, let alone make sense of.

Putnam's challenge is to the theoretical impulse which makes it seem that a cognitive scientific theory of utterance interpretation is possible and intelligible. But he is not alone in finding the entire research program that this impulse represents flawed and incoherent. John Searle exhibits the same concerns in relation to cognitivism, which is the view that the brain is a digital computer. For Searle, the proposal that the mechanisms by which brain processes produce cognition are supposed to be computational, and that by specifying programs we have specified the causes of cognition, is no type of coherent explanation at all:

"I used to believe that as a causal account, the cognitivist's theory was at least false, but I now am having difficulty in formulating a version of it that is coherent even to the point where it could be an empirical thesis at all" (Searle, 1992: 215; italics added).

This lack of coherence arises, according to Searle, because the cognitivist denies that the characterization of a process as computational is an observer-relative characterization. A conscious agent must assign a computational interpretation to a pattern of physical events. In the absence of this agent, all we have are neurobiological processes which are not causal explanations of anything:

"The point is not that the claim "The brain is a digital computer" is simply false. Rather, it does not get up to the level of falsehood. It does not have a clear sense. The question "Is the brain a digital computer?" is ill defined. If it asks, "Can we assign a computational interpretation to the brain?" the answer is trivially yes, because we can assign a computational interpretation to anything. If it asks, "Are brain processes intrinsically computational?" the answer is trivially no, because nothing is intrinsically computational,

<sup>5</sup>The reader is referred to Cummings (2012b) and chapter 4 in Cummings (2005a) for discussion of those insights such as they relate to utterance interpretation.

except of course conscious agents intentionally going through computations" (Searle, 1992: 225; italics added).

A complete cognitive scientific theory (Putnam) and an intrinsically computational brain process (Searle) are just different manifestations of the same aberrant impulse in cognitive science. That impulse is to deny the existence of any rationality outside of the theory or the causal explanation. Yet, without this prior rationality we lack the very concepts that are needed to make sense of the cognitive scientist's theories and causal explanations. That such theories and explanations are unintelligible by their own standards is a clear sign that all is not well in the cognitive enquiry which has brought us to this point.

So, if a theory of utterance interpretation is not just a bad idea, but an unintelligible one, then what is the alternative? Can we afford to take seriously the proposal to reject modular and other theoretical accounts of utterance interpretation? And if we reject these accounts, is there anything that we can intelligibly say about utterance interpretation? For Searle, the way forward lies in an inversion of the order of our cognitive scientific explanations so that we get a different account of cause-and-effect relations in these explanations. Our psychological explanations are misguided when they posit deep unconscious mental causes of desired effects such as perceptual judgments or grammatical sentences. Rather, what appear to be mental causes of patterns in perception or language are actually the judgments of a conscious agent who is outside the perceptual and linguistic systems:

"The inversion radically alters the ontology of cognitive science explanation by eliminating a whole level of deep unconscious psychological causes. The normative element that was supposed to be inside the system in virtue of its psychological content now comes back in when a conscious agent outside the mechanism makes judgments about its functioning" (Searle, 1992: 237; italics in original).

Applied to utterance interpretation, it is Searle's claim that we are mistaken when we posit modular processes that somehow stand behind, and give a causal explanation of, our understanding of utterances. There are "brute physical mechanisms" in our brain which cause and sustain conscious thoughts, experiences, actions, and memories. But that is all there is. There is no level of deep unconscious mental processes which give a causal explanation of these thoughts and experiences. There is no intrinsic intentionality in any of the mechanisms we are attempting to explain, only in the conscious agents who are making judgments of these mechanisms:

"The elimination of the deep unconscious level marks two major changes: It gets rid of a whole level of psychological causation and it shifts the normative component out of the mechanism to the eye of the beholder of the mechanism" (Searle, 1992: 238).

Searle's dissatisfaction with cognitive scientific accounts of mind is matched by Putnam's concerns that the entire cognitive scientific venture has led us into unintelligibility. Putnam, too, seeks a different type of explanation, one in which the "eye of the beholder" can tell us something about normative concepts such as meaning and rationality in a way that cognitive scientific theories have failed to. Importantly, the eye of the beholder is not a "God's Eye point of view" or metaphysical standpoint, from which it is assumed we can survey the whole of rational thought without, in turn, presupposing rational concepts. It is the assumption of this standpoint which makes it seem that it is possible to generate complete cognitive scientific theories in the same way that it is possible to generate complete scientific theories of physical phenomena in the world. Like Searle's "rediscovery of the mind," Putnam believes it is possible to recover an intelligible position in the philosophy of mind. It is part of Putnam's own attempt at recovery—what he has described as commonsense realism and a "deliberate" or "second naivete" about conception—that he would have us take seriously the teachings of Wittgenstein. This requires that we engage in a process of description, the aim of which is an accurate characterization of the consequences that a particular picture, and the concepts inherent in it, has for its user. In his Lectures and Conversations on Aesthetics, Psychology, and Religious Belief, Wittgenstein (1966) describes the considerations that are subsumed within this type of description:

"God's eye sees everything"—I want to say of this that it uses a picture.

I don't want to belittle...the person who says it...

We associate a particular use with a picture...

What conclusions are you going to draw?...Are eyebrows going to be talked of, in connection with the Eye of God?...

If I say he used a picture, I don't want to say anything he himself wouldn't say. I want to say he draws these conclusions. Isn't it as important as anything else, what picture he does use?...

The whole weight may be in the picture...When I say he's using a picture, I am merely making a grammatical remark: [What I say] can only be verified by the consequences he does or does not draw...

All I wished to characterize was the consequences he wished to draw. If I wished to say anything more I was merely being philosophically arrogant (pp. 71–72).

The most outstanding feature of this descriptive process is the restrictions placed on the extent of the description. Wittgenstein (1966) doesn't want to say anything he—the user of the picture—himself wouldn't say. Indeed, to say more is "being philosophically arrogant." In fact, to say more is to proceed to philosophize in the manner urged by the metaphysical spirit, a manner in which we describe the application of a picture through an understanding of that same picture in isolation from its applications. Under the influence of the metaphysical spirit, we inevitably go forward by erecting standards about what must be the case in order for our thoughts to represent or refer to reality. These standards can make it seem that there must be something which stands behind thoughts and which makes it possible for them to represent the world. This "something" is unconscious mental processes which, it is claimed, provide a causal explanation of our conscious thoughts. It is these processes which the cognitive scientist aims to give an account of in his or her theories. But these processes are nothing but an illusion which arises, Putnam contends, when we attempt to characterize normative concepts like meaning apart from the wider nexus of rational concepts that is their home. As Searle (1992) remarks "deep unconscious rules satisfy our urge for meaning" (246). However, we are looking in the wrong place if we think an account of meaning, rationality, and other normative concepts lies anywhere other than conscious agents who use utterances to mean such and such.

Cognitive scientific accounts of utterance interpretation also appear to "satisfy our urge for meaning." Unconscious modular processes in particular appeal to our sense that "if the input to the system is meaningful and the output is meaningful, then all the processes in between must be meaningful as well" (Searle, 1992: 246). But there is no intentionality in the utterance interpretation system, only in the conscious agents who attempt to characterize that system. And it is from these agents that serious philosophical work on concepts such as meaning and rationality must proceed. In unpicking the complexity of these concepts in Section A Standard Communicative Exchange, we employed a form of description that opened up the rich interconnections between them. Such was the extent of our mining of these interconnections that it very quickly became apparent that we could not make any sense of a concept like the context in which an utterance is interpreted without also countenancing a vast array of interrelated notions. In this way, it made no sense to talk about the context of utterance interpretation without addressing the knowledge of speakers and hearers, their purpose or goal in speaking, their pre-existing social obligations and much else besides. As our mining continued, we gradually became aware that we were embarked on a descriptive process which had no end in sight. Nevertheless, this was a process which revealed valuable insights into the nature of the rational and other processes by means of which utterance interpretation proceeds. Moreover, this descriptive process revealed those insights without the slightest pretension of being a cognitive scientific theory of utterance interpretation. It is this very same process of description which I now urge pragmatists to adopt as they pursue their many and varied explorations of utterance interpretation.

## Acknowledgments

The author wishes to express her gratitude to the reviewers of this paper for their helpful comments on an earlier version.

## References

evidence from a video-based assessment. Psychiatry Res. 186, 203–209. doi: 10.1016/j.psychres.2010.09.006


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Cummings. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

fpsyg-07-01392 September 16, 2016 Time: 17:1 # 1

# Playing with Expectations: A Contextual View of Humor Development

## Gabriella Airenti\*

Center for Cognitive Science, Department of Psychology, University of Torino, Torino, Italy

In the developmental literature, the idea has been proposed that young children do not understand the specificity of non-literal communicative acts. In this article, I focus on young children's ability to produce and understand different forms of humor. I explore the acquisition of the communicative contexts that enable children to engage in humorous interactions before they possess the capacity to analyze them in the terms afforded by a full-fledged theory of mind. I suggest that different forms of humor share several basic features and that we can construct a continuum from simple to sophisticated forms. In particular, I focus on teasing, a form of humor already present in preverbal infants that is also considered a typical feature of irony. I argue that all forms of humor can be regarded as a type of interaction that I propose to call "playing with expectations."

Keywords: humor development, communicative games, teasing, irony, theory of mind

### Edited by:

Tamara Swaab, University of California, Davis, USA

### Reviewed by:

LouAnn Gerken, University of Arizona, USA; Ina Bornkessel-Schlesewsky, University of South Australia, Australia

> \*Correspondence: Gabriella Airenti gabriella.airenti@unito.it

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 28 October 2015 Accepted: 31 August 2016 Published: 20 September 2016

### Citation:

Airenti G (2016) Playing with Expectations: A Contextual View of Humor Development. Front. Psychol. 7:1392. doi: 10.3389/fpsyg.2016.01392

## INTRODUCTION

In studies of communication, the interpretation of non-literal, indirect or figurative meaning occupies a distinct place. Various views have been advanced on this topic. In pragmatics, the classical two-stage theories distinguish a primary literal interpretation from a non-literal secondary interpretation, which can be developed only through the analysis and failure of the former (Grice, 1957, 1969; Searle, 1969). Subsequent psycholinguistic studies show the cognitive implausibility of the classical perspective (see, for instance, Clark and Lucy, 1975), and more recent theories analyze the multifaceted aspects of non-literal meaning interpretation (Wilson and Sperber, 1992; Giora, 1997). It is generally agreed that the differences between models correspond to the way context is analyzed and considered (Gibbs and Colston, 2012).

Non-literal communication includes numerous forms, such as indirect speech acts, metaphors, jokes, irony, and hyperbole. These forms occur commonly in everyday adult communication. Do they also occur in children's communication? Certainly, when communicating with children, adults do not refrain from using non-literal expressions. Consider, for instance, the following examples:


When are children able to master these forms (i.e., to comprehend and produce them)? Are all of them cognitively equivalent? The few systematic studies that have been conducted suggest that acquisition does not follow a unique progression. Some forms are simpler to master than others (Bosco et al., 2013). It is therefore important to understand the reasons for differences in the ease of comprehension and to delineate specific paths of acquisition.

One hypothesis asserts that different forms of non-literal communication can be distinguished based on the role of theory of mind (ToM) abilities. The most demanding tasks require developed ToM abilities to comprehend the speaker's meaning. For instance, according to Winner (1988), children comprehend metaphors before irony because understanding metaphors does not involve questioning the speaker's beliefs, whereas comprehending irony involves attributing second-order beliefs to the speaker. In this paper, I argue that children may perform complex non-literal communicative acts before developing full-fledged ToM abilities.

An important question arises concerning the relationship between use and interpretation. Adults and children differ markedly with respect to this relationship. Theories may differ on how the chain of inferences that enables interpretation is constructed. Nonetheless, adults are undoubtedly able to interpret non-literal communication. If an adult laughs at a joke, we presume that s/he has understood its humor, and if s/he produces a joke, we assume that s/he has intentionally produced humor. When children produce humor, however, we are unsure whether they do so intentionally. Does the fact that a child laughs at a joke indicate that s/he has understood it? If the child makes us laugh, did the child do so intentionally, or was the humor unintended such that we, the audience, are creating it? In the case of adults, we do not pose the problem of the meaning of comprehension. Instead, we presume that use and comprehension are linked. In the case of children, this link remains unclear.

In the developmental literature, the idea is often advanced that young children do not understand the specificity of non-literal communicative acts and cannot distinguish, for instance, between an ironic statement or a hyperbole and a lie (Peterson et al., 1983; Demorest et al., 1984; Winner and Leekam, 1991; Sullivan et al., 1995; Winner et al., 1998). For young children, utterances are either true or false, and when they are false, they can only be lies. Thus, it is reasoned, young children cannot properly appreciate non-literal communication.

This perspective is limited; it highlights only the tasks at which young children fail. Conversely, I aim to understand what young children are able to do. I believe this perspective might help to reconstruct the developmental path and thus to more effectively understand mature comprehension of non-literal communication. In this article, I focus specifically on young children's ability to produce and understand different forms of humor.

My argument proceeds as follows. I identify the forms of humor that children typically use through several examples drawn primarily from parents' reports. I then discuss the difficulties highlighted in the literature regarding the definition and categorization of different forms of humor. I specifically address the relationship between humor and irony. I explore the acquisition of the communicative contexts that constitute the background that enables children to engage in humorous interactions before being able to analyze them using full-fledged ToM abilities. I assume that young children react differently to lies and to non-literal communication. Finally, I present a theoretical proposal: I argue that different forms of humor share some basic features and that we can construct a continuum from simple to sophisticated forms. I focus on teasing, a form of humor already present in preverbal infants that is also considered a typical feature of irony. I conclude that all forms of humor can be considered a type of interaction that I propose to call "playing with expectations."

## CHILDREN'S USE OF HUMOR<sup>1</sup>

Children are involved in humorous communicative interactions from a very young age (Groch, 1974; Bainum et al., 1984; Dubois et al., 1984; Bergen, 1989; Reddy, 1991, 2008; Loizou, 2005; Cameron et al., 2008; Hoicka and Akhtar, 2012; Mireault et al., 2012). From a developmental perspective, the earliest cases of humorous interactions are amusing situations that occur between infants and adults. Two cases are typical. Adults propose an amusing action, such as tickling, odd faces or sounds, or blowing a raspberry. Children playfully respond to the action, and the interaction becomes a shared game. Sometimes the child initiates the interaction, often inadvertently, with a gesture or a sound that provokes amusement in the adult. This amused response pleases the child, who intentionally repeats the gesture to obtain the same reaction, and the game becomes shared. These humorous games are non-verbal and simple. Reddy (2008) classifies them as clowning, or the violation of normal patterns of behavior to elicit amusement.

The other type of humor commonly observed with young children is teasing.

Consider two examples.

When asked to make the sound of a horse (Come fa il cavallo?), a 2.5-year-old girl answers, "Moo" (Muh) and laughs.

Another parent reports an incident with her daughter, also 2.5 years old: "I asked Becky, 'What is the cat's call?' (Come fa il gatto?). She answered 'chirp' (cip cip) and laughed. Then, she corrected herself: 'No, mom, it meows!' (Ma no, mamma, fa miao miao)."

Reddy (2008) showed that this form of humor is precocious, starting at approximately 9 months of age. Relying on parents' reports, Reddy distinguished three types of teasing in young children: provocative non-compliance, offer and withdrawal of an object or of the self and disrupting others' activities. In all of these types, children playfully disturb an interaction by performing "the mis-expected" (Reddy and Mireault, 2015). As these authors note, teasing, even in its simplest forms, requires the display of cognitive abilities. In particular, the child must have expectations regarding the interlocutor's actions. For instance, in an offer/withdrawal, the infant must expect the interlocutor to extend her arm, open her hand and wait for the child to release the object. The child also expects the interlocutor to express surprise and disappointment after the withdrawal, and this response is the source of amusement. The authors assert that the wide spectrum of typical cases of teasing observed in young children indicates that "the range of things infants can do to tease their parents seems as large as the expectations parents have of the infants" (ibid.).

<sup>1</sup>Unless otherwise specified, all examples of young children's humor production presented in this paper are from parents' reports collected under the supervision of the author in various Italian regions. We instructed parents of children aged 2–6 years to record all humorous communicative acts produced by their children in a given month and the context in which they were produced. We conducted a quantitative analysis on the reports of 90 children (Airenti and Angeleri, submitted). However, the examples presented here are derived from a larger sample of 300 reports. The author thanks the families who participated and Giulia Giacone, Sara Ferrero, Caterina Mancini, and Rachele Barresi for their assistance with collecting and coding the reports.

More precisely, based on my analysis of the existing literature and the parents' reports I collected, it appears that parents' expectations exploited by young children may be either relational or linked to newly acquired skills. As examples of the first situation, consider the cases of contradicting expectations of kissing or hugging, withdrawing at the last moment, or playing with parents' fears of approaching a dangerous or precious (and forbidden) object and withdrawing at the last moment.

One example was observed in a 2.3-year-old girl. "The aunt asked her, 'Marta, will you give me a kiss?,' to which she replied: 'No, never!' (No, mai!). The aunt looked sad, and [the girl] smothered her with kisses."

A good example of fears is reported in Corsaro (1997). Corsaro's daughter had just begun climbing chairs and other objects that parents consider dangerous to climb. Once, she climbed onto the seatback of a large armchair. When her father attempted to remove her from the seatback, she smiled broadly. According to the author, she seemed to be saying, "Look, dad, where I have gone this time!"

Common examples of playing with skills include those introduced earlier, such as deliberately attributing the wrong calls to animals, calling the father "mom" or the mother "dad," or claiming that the sister (or the grandmother or the aunt) is a male, whereas the brother (or the grandfather or the uncle) is a female. Children typically play with newly acquired skills, a tendency confirmed in the literature. Garvey (1977) includes cases of misnaming in a form of social play that consists of playing with speech acts and discourse conventions. She suggests that as soon as children have learned a rule, they have fun distorting or exaggerating it. Dunn (1988) argues that such episodes, which characterize the beginning of the development of a sense of humor in children, are motivated by the pleasure of performing forbidden acts. The fact that young children perform this game with newly acquired abilities indicates intentional teasing. Children play with parents' uncertainty, as parents may be unsure whether the child is making a joke or a genuine mistake. The expression of uncertainty or trouble is, in turn, the source of amusement.

What form of communication does this type of humor represent? In at least one of the examples mentioned, the child explicitly indicated that her first response was not a mistake but an intentional joke. Often, however, the child makes no explicit declaration but displays what parents identify as a pert or ironic smile.

If we consider the development of teasing in older children, we may add a new category to the categories identified by Reddy, namely, mocking (Airenti and Angeleri, submitted).

Consider an example mentioned by Garvey (1977). David, a 5-year-old boy, laughed and misnamed parts of his face. For instance, he pointed to his forehead and said, "Here is my mouth" to mock his 2-year-old sister, who had previously shown an adult her presumably newly acquired ability to identify parts of her face (i.e., eyes, nose, and mouth). In Garvey's case, the target is the little sister. In other situations, the targets may be strangers or other family members. Typical cases may involve imitating adults' funny behavior or appearance, such as a grandfather snoring or a mother putting on makeup.

Consider an example from my corpus. A 3.3-year-old boy, exaggerating his mother's thinking mood, says, "Let us see, let us see..." (Vediamo un po', vediamo un po'....).

However, older children also use forms of humor typically used by younger children, such as offer/withdrawal: "Mom, I brought you a cookie!" says a 4.7-year-old boy. When the mother, thanking him, approaches her son to obtain the cookie, the child eats it.

The following example illustrates a case of playing with expectations regarding new skills: a 6.3-year-old boy tells his mother, "Today the teacher scolded me because I was not able to read...I got an A! (10 e lode)."

The following example demonstrates play with relational expectations. A mother reports an incident with her 6.5-year-old daughter: "We are at the table, and my daughter looks at us and says, 'You are old, but dad is the oldest in the house! Ah, ah! I am kidding, you are the most beautiful parents in the world,' and she gets up and hugs us."

The following example is of disrupting others' activity (3.1-year-old girl): the grandmother is counting money aloud, and her granddaughter says numbers at random to confuse her.

In conclusion, teasing represents an intriguing form of humor because it develops precociously, yet older children and adults also use it. Keltner et al. (2001) conducted a comprehensive review of this form of interaction at different ages and proposed the following definition: teasing is intentional provocation accompanied by playful markers that together comment on something relevant to the target.

Consider the following instances of other forms of humorous interactions drawn from parents' reports.

(1) "The child, looking at the rain, says, 'What a beautiful day, mom! It is ideal to go to the beach!' (Che bella giornata mamma! È proprio l'ideale per andare al mare!)." (girl, 6.5 years old)

(2) "I said, 'Today I have prepared pizza,' and my child said, 'Mom, how disgusting! You made the wurstel pizza!' (Mamma che schifo! Hai fatto la pizza con i wurstel!), his preferred pizza, and he plunged [toward his plate] to eat it (e si è tuffato a mangiarla)." (boy, 5.3 years old)

(3) "Today, during lunch at [the] grandparents' [home], grandfather made [a] noise when eating his spaghetti. She [the girl] started laughing and said, 'Grandpa you are very elegant!' (Lei ha cominciato a ridere e a dire 'Nonno sei molto elegante!')." (girl, 5.3 years old)

(4) "She says, 'Mom let's go to the beach. Today we have a fine weather' (Mamma andiamo al mare! È bello oggi il tempo!), while outside it rains like all the other days." (girl, 4.1 years old)

(5) "Sofia dropped her glass of water, and she looked at me in a loving mood (modo amorevole) and said, 'It was not me! It was the glass, which did not want to live anymore!' (non sono stata io! è stato il bicchiere che non voleva più vivere)." (girl, 4.1 years old)

(6) "While I was doing the dishes, two glasses slipped from my hands and broke. Martina immediately commented, 'Eh, mom, you are really good at doing the dishes!' (Ehi, mamma, ma come sei brava a lavare i piatti!)." (girl, 3.8 years old)

(7) "He plays with the sand, rolling over (rotolandosi) and getting dirty all over (sporcandosi tutto); he makes castles with a bucket and spade. I tell him, 'You really like playing with sand, eh?' And he says: 'No, I don't like it at all!' (No, non mi piace affatto!)." (boy, 3.0 years old)

(8) "Today, while she was playing with her doll, she said, 'Mom, she is your child! She wants chocolate, she, no me!' (Mamma lei è tua bimba! Vuole cioccolata. . . lei, no io!)." (girl, 2.3 years old)

(9) "At lunch, she spilled water all over herself. She started laughing and said, 'Bath I take!' (Bagnetto ho fatto!)." (girl, 2.3 years old)

Examples 1–3 appear to be typical ironic utterances. In fact, the children who performed them were in the age range in which, according to the literature, we can expect irony production (Pexman et al., 2009; Recchia et al., 2010). Are the other examples also instances of irony? They appear to resemble the previous examples, but they are uttered by younger, sometimes much younger, children. Example 4 is nearly identical to Example 1, the most classical instance of irony. Example 6 is a strong case of irony. The other examples are less typical, though nonetheless recognizable instances. Consider how we would interpret equivalent utterances if an adult had performed them. For instance, someone is finishing a glass of wine with evident pleasure. In response to the remark "So you like white wine, eh?," the person answers, "No, I don't like it at all" (semantically equivalent to Example 7). We would consider this answer ironic. Consider now the following example, analogous to Example 9: someone spills a glass of water on his or her T-shirt and, laughing, says, "I took a shower." This statement seems to be a textbook case of self-irony. If we consider the former two statements ironic when adults utter them, why should we not do so when similar statements are uttered by young children? Admittedly, with young children, we may doubt that the production of humor was intentional.

In the following sections, I discuss different ways of interpreting children's ability to use and understand these forms of non-literal communication. First, however, I introduce additional examples of humor that children typically perform from a young age.

(10) "Hello, my little doll" (Ciao bambolotto mio). A 4.11-year-old girl uses the phrase "little doll" to refer to her father, imitating her mother, who had greeted her with the words, "Hello, little doll" (bambolotta mia). Note that when addressing the father, the child correctly switches from the feminine form to the masculine form.

(11) A 3.3-year-old boy tells his mother, "Mind that if you don't behave I'll call the traffic policemen" (Guarda che se non ti comporti bene, chiamo i vigili!). In this case, the child employs a typical expression used by Italian parents to calm overly excited or misbehaving children.

Examples 10 and 11 are noteworthy because the humor arises from the fact that children use with their parents expressions that the parents normally use with them and that become incongruous in this inverted form.

Consider another example:

(12) A 6.7-year-old girl is seated with a group of adults and children at a coffee shop. When everyone orders hot chocolate, she exclaims, "Well...I'll take a beer!"

In this example, irony results from the child's incongruous appropriation of a behavior that the child cannot yet perform.

Examples 10–12 indicate that a typical way in which children produce humor is by uttering a statement that is normal when spoken by an adult but becomes incongruous when uttered by a child. Note that this incongruity applies to different forms of humor. In clowning, we may observe gestures rather than words, such as when a child wears her mother's heels or walks with a grandparent's cane. It applies also to teasing, as in Example 11, and to irony, as in Examples 10 and 12.

Based on the analysis of the reviewed examples of humor, the following observations are evident:


Thus far, we have analyzed forms of humor observed in children of different ages. We have not closely examined the current definitions of humor and irony. The next section discusses these definitions from a more theoretical perspective.

## HUMOR, IRONY, AND TEASING

This section presents the theoretical assumptions of my work. I propose that the relationship among humor, irony and teasing may be clarified by considering them different forms of a more general communicative ability that appears early in development and is characterized by playing with others' expectations.

Definitions of humor and irony and their relationship are widely debated in the literature. In cognitive studies, the most accepted definition of humor derives from the work of Shultz (1976) and McGhee (1979), who claim that incongruity with respect to reality is the source of humor. Divergences exist in the literature regarding the type of relation that a subject must entertain with incongruity to perceive humor. McGhee maintains that the subject must be able to represent the incongruity, an ability that children acquire at approximately 18 months of age. Shultz considers a necessary condition to be the resolution of incongruity, an ability that children acquire at 6 years of age. Other authors, by contrast, consider the detection of incongruity to be sufficient (Pien and Rothbart, 1976). This latter stance opens the possibility that infants also display humor. An incongruity becomes the source of humor when it arises within a playful interaction. Recent research has shown that the social emotional context is fundamental to infants' humor perception (Hoicka and Gattis, 2012; Mireault et al., 2015). In this relational perspective, humor appreciation cannot be evaluated outside the interaction in which it arises. In general, the relational approach admits infants and very young children to its definition of humor. I aim to extend this approach to forms of humor that older children and adults produce, specifically irony.

Humor is difficult to define, and irony is even more difficult to characterize (Gibbs and Colston, 2007). Since Grice's definition of irony as a violation of the maxim of quality (Grice, 1975, 1978), many authors have attempted to define irony to give an account of forms of irony that do not result from such a violation. Examples include the echo-mention theory (Wilson and Sperber, 1992), the echoic reminder theory (Kreuz and Glucksberg, 1989), the pretense theory (Clark and Gerrig, 1984), the joint pretense theory (Clark, 1996), the allusional pretense theory (Kumon-Nakamura et al., 1995), the relevant inappropriateness theory (Attardo, 2000), the implicit display theory (Utsumi, 2000), and more recent neo-Gricean accounts (Dynel, 2013; Garmendia, 2015). The fact that no theory has definitively prevailed over others can be attributed to the fact that irony is a multifaceted phenomenon. Thus, every definition explains certain aspects of irony, but no definition can explain all aspects. The same claim can be asserted regarding the function of irony. Some authors contend that irony functions to criticize a behavior in a particularly aggressive way (Colston, 1997; Toplak and Katz, 2000). By contrast, the tinge hypothesis views irony as a way to lessen criticism (Dews et al., 1995). Both situations are commonly corroborated by empirical reports. Sometimes sarcasm makes a criticism particularly harsh, whereas other times the mitigating aspect of the indirect form prevails and the criticism is alluded to but not explicitly uttered. A recent study has shown that ironic criticism is perceived as simultaneously more mocking and more polite (Boylan and Katz, 2013). Additionally, the relationship between irony and sarcasm is debated. Some authors use these two terms interchangeably, whereas other authors stress the more aggressive nature of sarcasm that aims to hurt the interlocutor (Lee and Katz, 1998). 
Other problems are posed by the function of ironic compliments, such as in the utterance "Selfish, as always!" directed toward someone who has just acted generously. Thus, it seems that there are various forms of irony and that irony may have different functions.

However, these distinct theories appear to agree that the recognition of irony always requires shared background knowledge. In fact, it is only shared knowledge that allows one to interpret an ironic utterance as such. This requirement is also the cause of misunderstandings because sharedness is necessarily attributed by actors to each other, and it is always possible for an actor to interpret as shared something that is not (Airenti et al., 1993b). Irony is not only based on shared presuppositions; it may also stress them. "Irony is, in this way, a particularly compelling means of reaffirming presuppositions common to both the speaker-author and the audience" (Gibbs and Izett, 2005). These authors claim that empirical research shows that people use irony to "specifically and succinctly comment on the disparity between expectations or beliefs and what is actually happening."

Another unresolved question concerns the relationship between humor and irony. As in the case of defining irony, different answers have been proposed in the literature. Some authors suggest that humor and irony share basic mechanisms (Giora, 1995), whereas for others, humor is not the final goal of irony but an associated phenomenon (Bryant, 2012). Gibbs et al. (2014) maintain that it is impossible to discern a direct link between irony and humor, even if laughter (or at least, a smile) may often be associated with irony.

I suggest that the relationship between irony and humor may be clarified if, rather than considering only adults, we analyze forms of humor that young children also use, specifically teasing. Linguists assert that teasing and irony must be considered distinct phenomena, even if irony may be used to tease an interlocutor (Dynel, 2014). Some psychologists have highlighted the teasing aspect of irony (Pexman et al., 2005), but the relationship between teasing and irony is more involved. Following the earlier remarks about irony by Gibbs and Izett, irony can be defined in terms of the disparity between reality and expectations, where an expectation is based on shared presuppositions. From this perspective, irony is a phenomenon continuous with teasing. In fact, the two forms of humor differ only in the degree of complexity of the presuppositions, which can be highly basic in teasing, at least in young children's teasing, but considerably more sophisticated in irony. Compare irony and teasing with respect to humor. If irony does not necessarily provoke laughter, teasing also need not do so. Teasing, moreover, involves a latent aggressive component that makes the teasing not necessarily amusing, at least for one of the interlocutors. This lack of amusement is clear in the case of disrupting others' activities, but in other forms of teasing, humor may also originate from the disconcertment (or related feelings, such as disappointment, embarrassment, and fear) displayed by the interlocutor. In such cases, laughter may occur, but it is not always the immediate expression.

Defining humor is complicated by the fact that the boundaries separating its different forms are blurred (Norrick, 1993; Attardo, 1994, 2002). However, if we adopt a cognitive perspective and study humor in development, we notice that very young children display basic aspects that evolve with age. Specifically, I hypothesize that young children learn to play humorous communicative games and that the main cognitive and interactional features of these games persist in adult life.

In other words, I propose that humor is a form of communication. Rather than delimiting different categories of humor in linguistic terms, I suggest analyzing the cognitive and interactive components of humor. I argue that different forms of humor depend on the degree of elaboration of different components that define different types of communicative games.

From this perspective, let us consider the relationship between irony and teasing. Angeleri and Airenti (2014) proposed the following componential definition of irony: irony is a non-literal utterance that is based on a common ground shared between interlocutors, focuses on an unexpected incongruity, and includes a teasing component.

We can adopt this perspective more generally and consider that all forms of humor combine different constituents that may co-occur to different degrees. Different communicative games arise from these constituents. Without claiming to be exhaustive, the following examples demonstrate such cognitive-interactional constituents:


Because all of the identified components are already present in young children's teasing acts, I propose that teasing is the prototypical form of humor.

Therefore, we can draw the following two conclusions:


## THE DEVELOPMENT OF HUMOR IN COMMUNICATION: THE ROLE OF THEORY OF MIND ABILITIES

In the developmental literature, a clear distinction has been proposed between the acquisition of spontaneous forms of humor, which is typical of infants and young children, and sophisticated forms of humor, including irony. The use of simple humor has been observed in children's familiar contexts. For these forms, the problem of comprehension has not been posed. By contrast, the comprehension of sophisticated forms of humor is considered a conceptual attainment that must be assessed with classical experimental procedures. Most experimental studies have shown that children's understanding of irony does not begin before 5 or 6 years of age (Dews and Winner, 1997). According to the few published studies on this topic, production likewise begins at this age (Pexman et al., 2009; Recchia et al., 2010). Only Recchia et al. found examples of hyperbole in 4-year-olds<sup>1</sup> that could be considered a display of irony. In these studies, observations were completed for a predefined limited time in specific contexts.

The late acquisition of irony is explained in terms of ToM. The comprehension of irony implies the attribution of second-order beliefs to the speaker, or a full-fledged ToM (Winner and Leekam, 1991; Sullivan et al., 1995; Hancock et al., 2000; Filippova and Astington, 2008, 2010). However, as the previous sections demonstrated, instances of children's humor in natural situations show that young children also make utterances that would be defined as ironic when performed by adults. Thus, one can argue that these utterances may seem ironic, but in claiming that they are ironic, we would be attributing to the child an intentionality that has not been proven. Considering these utterances ironic would constitute an over-interpretation. This perspective is supported by the fact that in experimental studies, young children do not seem to understand the ironic character of utterances.

I believe these two assumptions should be questioned. On the one hand, it is not clear that adults produce ironic utterances deliberately (Gibbs, 2012). It has been proven that adults may comprehend the meaning of an ironic utterance without explicitly recognizing its ironic character (Gibbs and O'Brien, 1991). Rather, what we expect is that a communicative act be used appropriately. Our data indicate that young children may sometimes use ironic utterances appropriately. On the other hand, recent experimental studies have shown that children as young as 3 years old can understand the communicative, non-literal intent of ironic utterances (Loukusa and Leinonen, 2008; Angeleri and Airenti, 2014).

The previous considerations prompt us to reconsider the relationship between the use of sophisticated forms of humor and ToM abilities. A result of this reconsideration might be to extend the concept of ToM. A number of recent studies have shown that infants can attribute epistemic states to agents, including false beliefs (Onishi and Baillargeon, 2005; Luo, 2011). These findings support the hypothesis that psychological reasoning, or an abstract capacity to represent and reason about false beliefs, emerges early in infancy (Baillargeon et al., 2016). This reasoning capacity, often characterized as implicit (i.e., intuitive), would persist in older children and adults when the capacity of explicit reasoning has developed.

These results are widely debated in the developmental literature. The core of the debate centers on resolving the discrepancy between these results and the fact that 3-year-old children fail the classical false-belief tasks (Low and Perner, 2012; Perner and Roessler, 2012). More generally, the problem entails explaining the relationship between the capacities exhibited by infants during spontaneous tasks and the capacities that older children and adults display when they are requested to perform verbal ToM tasks. Two questions are fundamental with respect to this problem. One question concerns whether precocious abilities are mentalistic. The second question concerns the role of language acquisition and executive functions in the development of more mature reasoning skills.

<sup>1</sup>The status of hyperbole is discussed in the literature. Although it has been traditionally associated with metaphor and irony, recent work designates hyperbole as a distinct figure of speech (Carston and Wearing, 2015).

To explain the discrepancy between infants' and older children's performances on false belief tasks, Butterfill and Apperly (2013) postulate the existence of two distinct systems. Before being able to represent mental states, children would develop a minimal ToM, an efficient yet inflexible system involved in precocious social abilities. The researchers assume that a minimal ToM involves representing belief-like states but does not involve representing propositional attitudes as such. Therefore, due to its limitations, this system would be unable to deal with complex sets of mental states.

San Juan and Astington (2012) note that no plausible theory exists to explain how children progress from implicit (i.e., automatic) reasoning to explicit (i.e., controlled) reasoning. In particular, they stress the possible role of social and linguistic experiences in facilitating this progression.

Other authors have emphasized the influence of social experiences, which may help to explain individual differences in the development of ToM abilities (Apperly, 2012; Hughes and Devine, 2015). Baillargeon et al. (2016) also concede that we possess insufficient knowledge regarding the development of infants' ability to infer and reason about others' mental states and the factors that contribute to individual differences.

Kovács et al. (2010) conducted a series of experiments that supported the hypothesis of a typically human disposition to encode others' beliefs. They showed that the mere presence of social agents is sufficient to automatically trigger online belief computations in both 7-month-old infants and adults. On the other hand, some studies show that in false belief tests, perspective tracking can be disrupted by the request to use explicit language, a phenomenon attested in both children and adults (Rubio-Fernández, 2013; Rubio-Fernández and Geurts, 2013). Considered together, these findings support the assumption that intuitive and explicit reasoning are alternative forms of reasoning, even though they may coexist in principle.

I believe these findings show that young children are able to monitor others' behavior and to adapt to their actions regardless of how precocious forms of intuitive comprehension of agents' actions are characterized (Airenti, 2015). Studies on intersubjectivity have shown that children can interact with adults very early in life (Trevarthen, 1998). It is reasonable to believe that this finding implies that young children react to adults' behavior. However, no evidence exists that infants can represent propositional attitudes as such. Moreover, it is not clear that implicit reasoning can be considered indicative of having developed a minimal ToM. At issue is more than a terminological problem; defining implicit reasoning as evidence of ToM, though a minimal form thereof, hides the specific nature of intuitive reasoning about others, namely, the fact that such reasoning develops in interactions and is inseparable from the communicative intentionality characterizing infant behavior. In naturalistic situations, perspective tracking is one way of establishing common ground that enables communication. Other ways exist, such as emotion recognition, which is particularly relevant for acknowledging a playful interaction and, subsequently, humor.

Hence, I argue that we must postulate not precocious ToM abilities but precocious communicative abilities. Precocious communicative abilities allow young children to interact with others efficiently and to enter what has been called a community of minds (Nelson et al., 2003; Nelson, 2005).

Humor is a form of communication that children acquire as they do all other forms of communication. They perform it in their interactions with adults and, later, with peers. Developmental pragmatics assumes that children acquire speech acts, or communicative units, that initially entail only acts and subsequently include language and acts (Bruner, 1975). Two facts are crucial to consider. First, children acquire communicative acts simultaneously with the conditions of their use, that is, communicative formats (Bruner, 1983) or games (Airenti et al., 1993a; Airenti, 2010). Second, they learn that playing with the conditions of applicability of communicative acts can generate amusement. Thus, we can provide an interactional definition of how the unexpected that creates humor is produced, namely, by playing with the conditions of applicability of a communicative game. This definition explains why children may use nonliteral communication in interactions yet be unable to define its features.

The observation of infants has shown that humorous interactions occur in the first year after birth (Reddy, 1991). Already at this preverbal age, children play with others' expectations. The simplest and most common children's social game is peek-a-boo. With respect to the categories previously mentioned, it is the mildest form of teasing. It provokes immediate laughter and is based on shared knowledge of the immediate physical surroundings.

Other precocious typical teasing games combine acts and language. Adults help familiarize children with these formats. Consider, for instance, the use of nursery rhymes. In many cultures, parents perform nursery rhymes, which prompt the child to expect a particular unexpected event, namely, tickling. The following is a common English example:

Round and round the garden / like a teddy bear / one step, two step / tickle you under there!

This nursery rhyme associates a simple story with movements performed on the child's arm that result in a teasing episode, which provokes laughter. In this case, the child learns to play with expectations using both gestures and words in an already elaborated manner, compared with, for instance, the simpler game of peek-a-boo.


What children acquire is a specific form of communication, namely, teasing, which is performed by playing with others' expectations. Parents are often amazed by the creativity of their children. In fact, if we examine the literature and our corpus of collected data, we may be surprised to find a repeated appearance of a limited number of communicative games that have a teasing quality and that most children play. As discussed, children play with others' feelings and with expectations regarding their own abilities. They mock others by imitating and ridiculing their behavior. They can negate an understanding that is evidently shared. They may justify their own mistaken or clumsy behavior by redefining it in an amusing manner. They provoke laughter by assuming an adult's stance. All of these communicative games are continuous with forms of humor that have a teasing component and are classified as irony or sarcasm when performed by adults.

According to the existing literature, young children cannot distinguish between non-literal communication and lies. In fact, children's immature ToM prevents them from explicitly expressing this distinction. However, this fact tells us nothing about their capacity to produce and comprehend nonliteral communicative acts (i.e., to use these communicative acts appropriately). It is reasonable to infer that a 3.3-year-old girl who laughs and tells her mother, "I am not sucking my thumb," while actually sucking her thumb is not lying but playing a teasing game. If the child intended to lie, she would have employed a hiding strategy, regardless of its naiveté or effectiveness. As noted, other typical situations exist in which the use of irony constitutes an alternative to lying, for example, when a child cannot conceal a wrongdoing such as not having eaten a disliked food or having soiled an article of clothing. In these cases, irony is used as a possible escape. As the 3.3-year-old girl who soiled her T-shirt with ice cream says while looking at her mother, "This way, I can also eat the ice cream at home!" ('Così mangio il gelato anche a casa!').

I have argued that the results of numerous experimental studies indicate only where young children fail in attempting to interpret non-literal communication. Children are unable to explicitly appreciate the nature of non-literal communication. We can now add what young children can accomplish with respect to this particular form of non-literal communication: humor. Children intentionally produce teasing communicative acts. Among these forms of teasing communicative acts, we also find incipient forms of irony.

Consider a hypothetical example of irony in adult communication. Two colleagues are standing near a coffee machine and see another colleague approaching. One of them exclaims, "Here is the hidden gem!" This exclamation is a perfect example of a "specific and succinct" comment about a situation in which the richness derives from the presuppositions shared between the two interlocutors. We can imagine that the two colleagues share the knowledge that the newcomer is particularly praised by his superiors, that he considers himself worthy of this consideration and that they, on the contrary, do not think that he deserves it. Let us now consider an actual example from our corpus that involves a young child. A 3-year-old boy is playing in a park. He sees another little boy approaching and tells his mother, "Look! My friend is coming!" Here, too, is a specific and succinct comment about a situation. The difference from the earlier example is that the presuppositions are simpler: the boy shares with his mother the knowledge that he does not like the other boy.

In conclusion, children may produce and understand nonliteral communication without having developed a full-fledged ToM. Teasing communicative games do not differ in principle from serious communicative games; in fact, they are acquired in an identical way and become increasingly complex with age. Many elements may relate to their development, such as an improvement in language abilities, executive functions, social and ToM skills. Importantly, however, the considerably more sophisticated adult forms of communication involve the development of communicative formats that are already present in childhood.

## DISCUSSION

This work aimed to delineate a developmental framework for humor. The intent was twofold: to examine the different forms that humor can assume in childhood and to use the analysis of children's humor to illuminate typical problems that are debated in research on humor in general.

The existing literature typically distinguishes between spontaneous forms of humor, which infants and young children perform, and refined forms of humor, which only older children and adults can produce and understand. Based on this distinction, young children would be able to perform only forms of humor that generate immediate laughter. Other forms of humor that adults perform, such as irony, are generally regarded as considerably more complex and thus beyond the capacity of children to perform. In particular, young children are regarded as lacking the ability to produce and understand such forms of humor because they cannot comprehend their non-literal character. This task of comprehension would require inferring others' mental states and, hence, the development of a full-fledged ToM.

If, however, we conduct naturalistic observation, the situation appears differently. When asked to record their children's humorous interactions in everyday life, parents report that even young children may use complex forms of humor appropriately. I have contended that young children acquire complex forms of humor within communicative games and that this acquisition does not require the ability to explicitly express an understanding of the implied mental states.

I argue that to more fully comprehend the problem of acquiring complex forms of humor, it is helpful to analyze what humor entails as a form of communication in general. To facilitate this analysis, I proposed to focus on a form of humor that begins developing very early in life and is present in a more advanced form in adults, namely, teasing. I discuss several examples of teasing that are typical of different ages. Reddy (1991), who studied teasing in infants, defined teasing as playing with others' expectations. I propose that playing with others' expectations by teasing should be considered the crucial feature that characterizes humor in general and constitutes the link between the simplest and the most sophisticated forms of humor. Teasing might thus be considered the prototypical form of humor from which irony and sarcasm also arise. In fact, even if teasing and irony are considered distinct forms of communicative acts, it is widely accepted that irony may include a teasing component.

Thus, I propose the possibility of a basic form of human communication characterized by playing with others' expectations by teasing. Children acquire this form of communication very early in their interactions with adults. Such communication games may assume different forms that become increasingly sophisticated during development, aided by the development of ToM abilities. However, even young children may acquire communicative games that include sophisticated forms of humor and use them. For instance, irony may be part of a game of justification. Children may know that a parent who laughs at a self-ironic description of a misdeed will likely be more indulgent and will abstain from scolding the child. Another example is the appropriation of a communicative game typical of adults. This is a simple move, and it is clear that this appropriation is unexpected and provokes laughter in the audience. We can thus see that acquiring the ability to perform more complex acts of humor does not differ from the way an interactional perspective explains the acquisition of the ability to perform the simplest acts of humor. Consider the case of a child who discovers that if she wears her mother's shoes, she will provoke laughter in the audience. These acts exemplify young children's communicative cleverness. This cleverness is not apparent if we ask children to provide explicit explanations of conceptual differences. The same claim could be made of other communicative acts. We do not doubt that when children make a request, they produce an intentional act, even if they are unable to define a request as a communicative act.

Therefore, we can assert that in early stages of development, a child's use of the communicative game is deliberate, whereas irony is not (Gibbs, 2012). Later in development, acquiring ToM in connection with linguistic proficiency and other cognitive capacities enables the performance of more elaborate forms of humor and the possibility of using communicative games more flexibly.

From a cognitive perspective, the degree of complexity of the different forms of humor depends on the complexity of the communicative game. Thus, rather than claiming that one form of humor is simple while another form (irony, for instance) is difficult, I suggest that the difficulty or simplicity of producing and comprehending a given instance of humor derives from the combination of several constituents that construct the specific communicative game. Most remarkable among these constituents is common ground. Every utterance draws its communicative meaning from a common ground that the interlocutors share. Common ground constitutes the context for comprehension. Nonetheless, the aspects considered when identifying common ground may differ considerably (Clark, 1996). For instance, the common ground that is merely the immediate physical context (what the interlocutors see or hear, for instance) differs notably from one that is an element of general knowledge. In Angeleri and Airenti (2014), we showed that children more easily understand the communicative intent of ironic utterances when the common ground is directly perceived by the interlocutors (contingent irony) than in situations in which irony is based on background knowledge that the interlocutors are supposed to share but that is not directly perceived or mentioned (background irony).

Another factor that may influence the ease of comprehension is the degree of indirectness. Planning an indirect act to hurt someone's feelings, as in the case of sarcasm, is considerably more difficult than directly mimicking an interlocutor's behavior to ridicule him or her.

At least two research directions are apparent. I propose several characteristics as relevant for defining different forms of humor. However, it is possible that other characteristics could be considered. I contend that such additional characteristics would make the present model more elaborate but would not invalidate it. Another direction that could be examined in depth is the relationship between comprehension and production. Are these two processes symmetrical in acts of humor? Only a systematic study could indicate whether production and comprehension develop simultaneously.

Comparing production and comprehension is not easy because of the different methods that may be utilized to study these two aspects. With respect to the production of humor, the only effective method is an observation technique. We cannot provoke the use of humor in an experimental situation. Moreover, we must resort to parent reports, which are observations made by non-professional observers. Naturally, parents are given precise instructions; for instance, they are asked to describe the context in which any specific humorous utterance is produced. The main problem involved in the use of this method is that it does not allow precise quantitative analysis because it is impossible to ensure that all parents devote the same attention to the observation of their children's behavior. However, these limitations are balanced by the possibility of accessing the child's spontaneous behavior at any time. I expect that future work will confirm that even very young children use a wide range of humorous utterances. Moreover, I expect to find similar typologies of humor in all children, namely, the forms that we have observed in our sample.

In contrast, comprehension can be assessed through experiments. Experiments may also be used to evaluate the factors that influence performance in humor tasks. According to the theoretical assumptions expressed in this paper, one would expect no direct correlation between performance in humor tasks and performance in ToM verbal tasks. This is the result that we obtained in Angeleri and Airenti (2014). In this study, we tested children aged 3.0–6.5 years in a task of comprehension of different forms of humor. Children were administered the Peabody Picture Vocabulary Test – Revised (PPVT-R; Dunn and Dunn, 1981; Italian adaptation: Stella et al., 2000) and three classical ToM tasks of first and second order: the Smarties task (Perner et al., 1987), the Sally-Ann task (Wimmer and Perner, 1983), and the ice-cream van story task (Baron-Cohen, 1989). To identify the specific effects of ToM and language on humor comprehension, we used path analysis. Our analyses suggested that the correlation between humor understanding and ToM was spurious, as indicated by the shared effects of language ability on ToM and humor and by the shared indirect effects of children's age on language and ToM.

My perspective is compatible with the point of view expressed by Reddy (2007) that intentional insincere communication is acquired alongside intentional sincere communication. My perspective differs in its approach to insincere communicative acts. I suggest that two different forms of insincerity must be distinguished: proper deceit and non-literal communication. From a pragmatic perspective, we may regard non-literal utterances as instances of insincerity because they violate the Gricean maxim of quality. However, these different forms of insincerity are acquired differently. Planning a deceit requires the use of ToM abilities, whereas the other forms of insincerity are precociously developed as part of children's communicative repertoire.

I believe my perspective is advantageous for explaining the fact that young children may produce sophisticated forms of humor, as the empirical evidence shows, without attributing to them ToM abilities that are not demonstrated in other domains. The latter is true particularly for deceit, which children do not perform until a later age (Peskin, 1992; Airenti and Angeleri, 2011; Lee, 2013). Acquiring the ability to play communicative games is unrelated to acquiring the ability to distinguish true from false statements and to instill false beliefs in others. As shown, in communicative interactions, young children use non-literal communication, particularly humorous communication, as an alternative to lying.

This perspective is also useful for obtaining a better understanding of humor in general and of the relationship among humor, irony and teasing in particular. Studies on humor aim to define and categorize the different forms of humor. They encounter difficulty with the fact that humor manifests multifariously and that it is difficult to formulate definitions that allow the construction of a categorization without overlaps. The fact of laughter cannot be a criterion because several forms of humor exist in which an association with laughter is indirect and even loose or absent. We also cannot identify a function that characterizes all forms of humor. Sometimes humor represents a simple way to express immediate amusement, whereas other times it may function primarily to strengthen the relationship between the interlocutors by stressing and confirming shared knowledge. It may also be used to indirectly criticize an interlocutor in ways that can range from mild to harsh.

I propose the construction of a unifying cognitive framework underlying the communicative games from which different manifestations of humor arise. I argue that this framework can be constructed by analyzing one of the most precocious and pervasive forms of communication, namely, teasing. Teasing is a feature that can be found in all forms of humor, whether simple or complex. I thus propose to characterize humor as a form of communication that has a teasing component and that plays with expectations. Children acquire this general communicative format during their initial interactions with adults. This communicative format becomes increasingly flexible and articulated with age and with cognitive acquisitions, including language abilities, ToM and relational competence.

Consider a final example of a child's utterance. When his mother's car does not start, the 3.6-year-old boy asks, "Are we going to sleep here, mom?" How might we determine, in this case, whether this utterance is ironic? I would propose this remark as a typical form of teasing and, therefore, of humor.

## AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

## FUNDING

This research was supported by the University of Torino (Fondi di Ricerca Locale, 2014).

## REFERENCES

and representation of false belief. Br. J. Dev. Psychol. 30, 105–122. doi: 10.1111/j.2044-835X.2011.02051.x


Winner, E. (1988). The Point of Words. Cambridge, MA: Harvard University Press.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Airenti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Communicating numeric quantities in context: implications for decision science and rationality claims

David R. Mandel\*

*Socio-Cognitive Systems Section, Defence Research and Development Canada and Department of Psychology, York University, Toronto, ON, Canada*

Keywords: decision-making, rationality, numeric quantifiers, context, linguistic interpretation

Perhaps more than most areas of cognitive psychology, the study of human judgment and decision-making relies heavily on experimental tasks communicated through written descriptions that convey numeric quantifiers as primary sources of information. Subjects in such studies usually are required to use such information to make choices, indicate preferences, or offer judgments. These responses are compared to normative benchmarks, resulting in the researchers drawing conclusions about the quality, coherence, or rationality of human judgment and decision-making (Arrow, 1982; Tversky and Kahneman, 1986; Stanovich and West, 2000). Most of this body of research has paid little attention (a) to how subjects interpret numeric information conveyed in writing and (b) to how those interpretations are influenced by context (Mandel and Vartanian, 2011; Teigen, in press). More often than not, researchers simply assume that subjects will interpret numeric quantities conveyed in experimental tasks as exact values, and also that subjects should interpret expressed numbers as precise quantities.

### Edited by:

*Marco Cruciani, University of Trento, Italy*

Reviewed by: *Pietro Perconti, University of Messina, Italy*

\*Correspondence: *David R. Mandel, david.mandel@drdc-rddc.gc.ca*

### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *20 February 2015* Accepted: *14 April 2015* Published: *30 April 2015*

#### Citation:

*Mandel DR (2015) Communicating numeric quantities in context: implications for decision science and rationality claims. Front. Psychol. 6:537. doi: 10.3389/fpsyg.2015.00537*

Yet it is uncontroversial in linguistics that numeric quantifiers may be treated as exact or approximate values, and where their interpretations are approximate, they may be treated as one-sided (e.g., at least, which is lower bounded, or at most, which is upper bounded) or two-sided (e.g., roughly or about). Linguistic accounts of numeric quantifiers (e.g., Horn, 1989; Carston, 1998; Levinson, 2000; Geurts, 2006; Breheny, 2008) do not support the normative claim (or assumption) that a precise "bilateral" reading of such quantifiers consistent with exactly is the proper reading. Although linguistic accounts differ in what they posit as possible semantic defaults, even those proposing a bilateral semantics, such as Breheny (2008), specify pathways for pragmatically derived unilateral interpretations, such as interpreting a numeric quantifier, x, as at least x or at most x.

More generally, the degree to which decision researchers seem confident in defining the meaning of linguistic terms for others runs counter to a fundamental idea in the philosophy of language, which holds that the meanings of words are definable only through their actual use in language (e.g., Wittgenstein, 1953; Austin, 1979). It also runs counter to psycholinguistic evidence indicating that even 5-year-olds understand that numeric quantifiers should be interpreted as "at least" in some contexts (Musolino, 2004). And it runs counter to work in experimental pragmatics indicating that people acquire context-sensitive scalar implicatures over the course of development. For instance, they come to understand that although some is logically compatible with all, it usually pragmatically excludes all because it would be infelicitous to use some if one meant all (Moxey and Sanford, 2000; Noveck, 2001; Noveck and Reboul, 2008).

## Studies of Option Framing as a Case in Point

Consider the following influential test of the coherence of decision-making:

Imagine that the U.S. is preparing for the outbreak of an unusual Asian disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the programs are as follows:

[Positive Frame]

If Program A is adopted, 200 people will be saved.

If Program B is adopted, there is 1/3 probability that 600 people will be saved, and 2/3 probability that no people will be saved.

[Negative Frame]

If Program C is adopted, 400 people will die. If Program D is adopted, there is 1/3 probability that nobody will die, and 2/3 probability that 600 people will die.

According to Tversky and Kahneman (1981), options A and C in the Asian disease problem (ADP) are extensionally equivalent and likewise for options B and D. The former, moreover, are regarded by virtually all researchers who have used or commented on the problem as "certain" or "sure," whereas the latter are regarded as "uncertain" or "risky." Coherent choices thus require that a decision-maker who chooses A over B would also choose C over D (or vice versa).

Yet Tversky and Kahneman (1981) and others (e.g., see Levin et al., 1998, for an overview) found that most subjects choose A in the first pair and D in the second, ostensibly violating one of the most consensual normative principles of choice (Tversky and Kahneman, 1986)—description invariance, which states that extensionally equivalent events should not be differentially regarded merely because of the way in which they are described.

I say "ostensibly" because the claim that subjects presented with this problem violate description invariance (and, hence, are incoherent in their decision-making) rests on a shaky argument I call proof by arithmetic, which goes like this:


A similar argument can be expressed for the claim that options B and D are equivalent.

The reason the proof-by-arithmetic argument is shaky is that it assumes people interpret numeric quantifiers as exact values, when as noted earlier this reflects a naïve view on quantifiers, in particular, and language, in general.
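The arithmetic on which such an argument relies can be sketched as follows. This is an illustrative reconstruction under the exact ("bilateral") reading of the quantifiers, not the author's own wording; the symbols $N$, $S$, and $D$ (people at risk, people saved, people who die) are introduced here only for illustration.

$$
\begin{aligned}
N &= 600, \qquad S + D = N \\
\text{A: } S = 200 \;&\Rightarrow\; D = N - S = 400 \quad \text{(option C)} \\
\text{B: } p = \tfrac{1}{3}:\ S = 600 \;&\Rightarrow\; D = 0; \qquad p = \tfrac{2}{3}:\ S = 0 \;\Rightarrow\; D = 600 \quad \text{(option D)}
\end{aligned}
$$

Each step goes through only if "200" is read as exactly 200. If it is read as at least 200, the second line yields only $S \geq 200 \Rightarrow D \leq 400$, which is not what option C states.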

To put that view to a proper test requires asking subjects not only about their choices but also about their interpretations of the quantifiers in the options presented to them. This approach was adopted in a recent experiment (Mandel, 2014, Experiment 3). After subjects were presented with a choice problem much like the ADP (except that it focused on 600 people at risk in a war-torn region rather than 600 people at risk due to an unusual Asian disease), they were asked whether they interpreted "200" in the positive frame or "400" in the negative frame as meaning (a) "at least [n]," (b) "exactly [n]," or (c) "at most [n]." Sixty-four percent responded "at least," 30% responded "exactly," and the remaining 6% responded "at most."

This finding shows how untenable the proof-by-arithmetic argument is as a basis for the claim that subjects violate description invariance in framing problems like the ADP. Simply put, the researchers' interpretation was not shared by most subjects, who instead viewed the quantifiers presented to them as lower bounds. That result has profound consequences for the interpretation of subjects' choice data. Take the modal response: It is evident that saving at least 200 people [out of 600] is objectively better than letting at least 400 people [out of 600] die. Rather than being a "preference reversal" of dubious decision-making quality, for subjects who interpret the options as such, the pattern of choosing A over B and D over C may maximize subjective expected utility. In fact, when subjects' interpretations of the quantifiers in both options (i.e., A and B in the positive frame and C and D in the negative frame) were taken into account, a majority (76%) chose the option that was utility maximizing. Moreover, the framing effect reported by Tversky and Kahneman (1981) was found only in the subsample that reported a lower-bound interpretation of the quantifiers in options A or C. For those subjects who interpreted the quantifiers as exact values, there was no effect of frame.
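A minimal sketch of why the lower-bound reading changes the comparison is given below. It assumes, purely for illustration, that utility is linear in the number of people saved and that the quantifiers in the risky options are read as exact; the function names and the `reading` labels are hypothetical and do not come from the study.

```python
from fractions import Fraction

TOTAL = 600  # people at risk in the ADP


def expected_saved_risky():
    """Expected number saved under option B (equivalently D), read exactly."""
    return Fraction(1, 3) * TOTAL + Fraction(2, 3) * 0


def saved_range(option, reading):
    """Return (min, max) people saved under a 'sure' option.

    option:  "A" (200 will be saved) or "C" (400 will die)
    reading: "exactly" or "at_least"
    """
    if option == "A":
        n_saved = 200
        return (n_saved, n_saved) if reading == "exactly" else (n_saved, TOTAL)
    if option == "C":
        n_dead = 400
        # "at least 400 die" means that at most 200 are saved
        exact = TOTAL - n_dead
        return (exact, exact) if reading == "exactly" else (0, exact)
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    risky = expected_saved_risky()
    for reading in ("exactly", "at_least"):
        for option in ("A", "C"):
            low, high = saved_range(option, reading)
            print(f"{option} ({reading}): {low} to {high} saved; "
                  f"risky option expectation: {risky}")
```

Under the exact reading, A and C both amount to 200 people saved, matching the expectation of the risky options. Under the lower-bound reading, A can never fall below that expectation and C can never exceed it, so preferring A to B and D to C is not incoherent on these assumptions.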

Teigen and Nikolaisen (2009) also found evidence that numeric quantifiers are often interpreted as lower bounds. In one framing experiment that used a financial version of the ADP, subjects were asked which of two financial forecasters would be more accurate. Forecaster A predicted that NOK 250,000 (of 600,000) would be saved (in the positive frame) or that NOK 350,000 (of 600,000) would be lost (in the negative frame). Forecaster B predicted that NOK 150,000 (of 600,000) would be saved (in the positive frame) or that NOK 450,000 (of 600,000) would be lost (in the negative frame). In fact, NOK 200,000 was saved (NOK 400,000 was lost). In other words, the experiment was set up so that one forecaster overestimated the outcome and the other underestimated it, but they did so by the same amount (NOK 50,000). Supporting the hypothesis that people often spontaneously adopt a lower-bound interpretation of numeric quantifiers, the forecaster who overestimated the actual amount was judged to be more accurate. This was so regardless of whether the outcome entailed saving money or losing it (thus ruling out an alternative explanation based on desirability).
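The symmetry of the design can be checked with a few lines of code. The sketch below uses only the NOK figures quoted above; the dictionary layout and function names are illustrative assumptions, not part of the original study materials.

```python
# Figures in NOK, as described in the text.
ACTUAL = {"saved": 200_000, "lost": 400_000}
FORECASTS = {
    "saved": {"A": 250_000, "B": 150_000},  # positive frame
    "lost": {"A": 350_000, "B": 450_000},   # negative frame
}


def signed_errors(frame):
    """Forecast minus actual outcome for each forecaster in a frame."""
    return {name: value - ACTUAL[frame]
            for name, value in FORECASTS[frame].items()}


def overestimator(frame):
    """Name of the forecaster whose forecast exceeded the actual amount."""
    errors = signed_errors(frame)
    return max(errors, key=errors.get)


if __name__ == "__main__":
    for frame in ("saved", "lost"):
        print(frame, signed_errors(frame), "overestimator:", overestimator(frame))
```

Both forecasters miss by the same absolute amount in each frame, so a preference for the overestimator cannot be explained by the size of the error. The overestimator is also a different forecaster in each frame.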

## Contexts Matter

The preceding examples suffice to show that it is untenable for decision researchers to assume that subjects interpret numeric quantifiers as exact values. The next examples further demonstrate how aspects of context can moderate those interpretations. First, quantifier interpretations may be affected by the degree to which decision options are explicated. Consider the so-called certain options in the ADP: in A, nothing is said about the remaining 400 people; in C, nothing is said about the remaining 200. In contrast, the so-called uncertain options better (if not fully) resolve the uncertainty resulting from partial explication. That is, in options B and D the explicit probabilities add up to unity and for each possible outcome all 600 people in the focal set are accounted for—either they are all saved or else they all die. In this sense, the certain options seem less certain than the uncertain options. Mandel (2014, Experiment 3) resolved the uncertainty by filling in the missing information:

If Plan A is adopted, it is certain that 200 people will be saved and 400 people will not be saved.

If Plan C is adopted, it is certain that 400 people will die and 200 people will not die.

The effect of explicating the missing information on subjects' numeric quantifier interpretations was striking: only 24% selected "at least [n]," whereas 59% selected "exactly [n]." The percentage of the remaining subjects, who indicated "at most [n]," also nearly tripled (from 6 to 17%). The direction of these context-induced shifts is predictable: when all members of a focal set are referenced, it is likely that the speaker intends for the quantifiers to be exact. Thus, one might expect the bilateral interpretation to be modal, as was found. Yet one might also expect a smaller shift in favor of "at most," which may reflect the reader's appreciation that the sum of the quantified subsets cannot exceed the value of the total set.

Moreover, the effect of sentential context (via the manipulation of explication) extends to choice: when both of the paired options were fully explicated, there was no effect of frame on subjects' choices (also see Kühberger, 1995; Mandel, 2001; Tombu and Mandel, 2015). Evidently, the interpretation of numeric quantifiers depends on aspects of sentential context, such as the explication of complementary implicit numeric quantifiers, and these context effects also affect subjects' choices. In this regard, the present discussion adds to a small literature that has highlighted the importance of context in decision-making (e.g., Wagenaar et al., 1988; Hilton, 1995; Goldstein and Weber, 1997; Rettinger and Hastie, 2001; Mandel and Vartanian, 2011).

Numeric quantifier interpretations are also affected by linguistic inferences that may be drawn from the broader semantic context of the decision-making problem. For instance, when a rationale for the values presented in the ADP was provided to subjects (namely, that only 200 vaccines for the disease would be available), a majority (71%) interpreted "200" as an upper bound ("at most") in the positive frame, whereas a majority (64%) interpreted "400" as a lower bound ("at least") in the negative frame (this experiment is reported in the General Discussion of Mandel, 2014). In contrast, when the standard ADP was presented, 58 and 54% gave the "at least" response in the positive and negative frames, respectively.

Once again, the direction of these interpretational shifts (both as a function of frame and whether a rationale was provided to subjects) is predictable, reflecting subjects' awareness that maximum quantities (i.e., having only 200 vaccines) set upper bounds on positive expected outcomes. And, once again, there is evidence that the effect of context on linguistic interpretation, in turn, influences the choices people make. When the vaccine rationale was provided in the ADP, no effect of frame on choice was found (Jou et al., 1996).

## Conclusion

William James wrote:

"The great snare of the psychologist is the confusion of his own standpoint with that of the mental fact about which he is making his report. I shall hereafter call this the 'psychologist's fallacy' par excellence." (1890/1950, p. 196, italics in original)

Decision researchers have perennially committed this fallacy by projecting their understanding of decision-task structure and meaning onto their subjects and then assessing the rationality of their subjects' judgments and choices as if their subjects invariably shared their views.

Over the years, a minority of psychologists have objected to that approach, having noted how subjects' task construals often differ from those assumed by experimenters (e.g., Henle, 1962; Berkeley and Humphreys, 1982; Phillips, 1983; Hilton, 1995). For instance, Gigerenzer (1996) stated:

"Semantic inferences—how one infers the meaning of polysemous terms such as probable from the content of a sentence (or the broader context of communication) in practically no time—are extraordinarily intelligent processes. They are not reasoning fallacies." (p. 593)

The research and arguments summarized here continue in a similarly critical vein, extending the alternative-task-construal argument to problems involving the linguistic interpretation of numeric quantifiers. As the examples provided here have illustrated, those linguistic interpretations are not only, at times, modally different from experimenters' interpretations, but also predictably moderated by multiple aspects of context. Such findings certainly do not prove that humans are rational, but they do show that some influential claims about human irrationality in decision-making are unwarranted. Such claims would benefit from careful consideration of possible linguistic effects on people's judgments and decisions.

## References

Arrow, K. J. (1982). Risk perception in psychology and economics. Econ. Inq. 20, 1–9. doi: 10.1111/j.1465-7295.1982.tb01138.x

Austin, J. L. (1979). "The meaning of a word," in Philosophical Papers, 3rd Edn., eds J. O. Urmson and G. J. Warnock (Oxford, UK: Oxford University Press), 55–75.

Berkeley, D., and Humphreys, P. (1982). Structuring decision problems and the "bias heuristic." Acta Psychol. 50, 201–252. doi: 10.1016/0001-6918(82)90042-7


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Her Majesty the Queen in Right of Canada, as represented by Defence Research and Development Canada. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## Pitch enhancement facilitates word learning across visual contexts

## *Piera Filippi\* †, Bruno Gingras and W. Tecumseh Fitch*

Department of Cognitive Biology, Faculty of Life Sciences, University of Vienna, Vienna, Austria

### *Edited by:*

Alessio Plebe, University of Messina, Italy

### *Reviewed by:*

Julien Mayor, University of Nottingham Malaysia Campus, Malaysia

Marco Mazzone, University of Catania, Italy

### *\*Correspondence:*

Piera Filippi, Department of Cognitive Biology, Faculty of Life Sciences, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria e-mail: pie.filippi@gmail.com

### *†Present address:*

Piera Filippi, Artificial Intelligence Laboratory, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium

This study investigates word-learning using a new experimental paradigm that integrates three processes: (a) extracting a word out of a continuous sound sequence, (b) inferring its referential meanings in context, (c) mapping the segmented word onto its broader intended referent, such as other objects of the same semantic category, and to novel utterances. Previous work has examined the role of statistical learning and/or of prosody in each of these processes separately. Here, we combine these strands of investigation into a single experimental approach, in which participants viewed a photograph belonging to one of three semantic categories while hearing a complex, five-word utterance containing a target word. Six between-subjects conditions were tested with 20 adult participants each. In condition 1, the only cue to word-meaning mapping was the co-occurrence of word and referents. This statistical cue was present in all conditions. In condition 2, the target word was sounded at a higher pitch. In condition 3, random words were sounded at a higher pitch, creating an inconsistent cue. In condition 4, the duration of the target word was lengthened. In conditions 5 and 6, an extraneous acoustic cue and a visual cue were associated with the target word, respectively. Performance in this word-learning task was significantly higher than that observed with simple co-occurrence only when pitch prominence consistently marked the target word. We discuss implications for the pragmatic value of pitch marking as well as the relevance of our findings to language acquisition and language evolution.

**Keywords: cross-situational learning, prosody, word learning, sound-meaning mapping, infant directed speech, language evolution**

### **INTRODUCTION**

A crucial issue in the study of word learning is the inherent uncertainty of the referential act of naming in sound-meaning associations (Quine, 1960), sometimes called the "Gavagai!" problem. Both the child acquiring spoken language and the adult learning a new language have to map sounds onto referents, a problem that involves the triple challenge of (a) extracting (i.e., identifying and remembering) a word out of a continuous sound sequence, (b) inferring one or more possible referents within the current visual scene, and (c) mapping the segmented word onto its broader intended referential/pragmatic meaning(s), and/or grammatical role(s) (Bloom, 2000). The final step includes the possibility of extending the reference over a potentially infinite set of instances of the same semantic category (Brown, 1958; Waxman and Gelman, 2009), and to an open-ended set of novel utterances (Chomsky, 2000).

Language learners might infer the referential meaning of the spoken words by hearing them in various contexts of use (Wittgenstein, 2009), and by using multiple pragmatic or linguistic cues such as eye gaze (Nurmsoo and Bloom, 2008), discourse novelty, syntax (Wagner and Watson, 2010), and tactile interaction (Seidl et al., 2014). Here we focus on two important sources of information for word learning: cross-situational statistics and prosodic cues in the speech signal. Most research has investigated the role of these two cues separately (Morgan et al., 1987; Medina et al., 2011; Shukla et al., 2011; Vlach and Sandhofer, 2014). In the present study we simulate the complexity of real-world word learning processes in the laboratory, and bring research on prosody and statistical learning together. Specifically, our study builds on three key findings from previous research: (i) cross-situational statistical regularities, expressed as co-occurrence between labels and their intended referent across different visual scenes, favor learning of conventionally defined sound-meaning associations (Yu and Smith, 2007), (ii) the statistically regular co-occurrence between a target word and its intended referent through learning trials facilitates object categorization, i.e., the extension of target words to multiple exemplars of the visual referent (Waxman and Braun, 2005); and (iii) the exaggerated pitch parameter cross-culturally employed in infant-directed speech (IDS) provides markers of acoustic salience that guide selective attention and are often used to highlight target words (Grieser and Kuhl, 1988; Fernald and Mazzie, 1991; Aslin et al., 1996).

Based on these findings, we test the prediction that marking target words with IDS-typical pitch differential contrasts plays a key role in supporting word learning across different visual scenes. Although numerous studies have addressed the positive effect of speech directed to infants or strangers in conveying language-specific phonological information (Burnham et al., 2002; Kuhl, 2004), as cues to word segmentation (Thiessen et al., 2005; Shukla et al., 2011), or as cues to the syntactic structure of the sentence (Sherrod et al., 1977), no research we know of has investigated the effects of IDS-typical emphatic stress of single target units in the service of word learning, thus extending beyond the first step of sound extraction or single object labeling. The present study aims at filling this gap. With this overall aim, the research reported here specifically compares the learning effects of IDS-typical pitch emphasis with those of other visual and acoustic attentional cues.

### **A NEW PARADIGM FOR THE INVESTIGATION OF WORD LEARNING: THE EIM TASK**

Much previous research in word learning has addressed the acquisition of sounds spoken in isolation in association with objects represented in pictures isolated from any surrounding visual scene (Gleitman, 1990; Markman, 1990; Baldwin, 1993). But such paradigms greatly simplify what actually happens in natural learning situations, which typically include an indefinite number of potential referents (Medina et al., 2011), and where the target words are typically spoken not in isolation, but in connected discourse, within a sequence of continuous sounds. Thus, in a real label-referent mapping situation, learners have to somehow identify the key word(s) to be linked to the visual scene.

To address these issues, we introduce a new paradigm for studying word learning, which we call the Extraction/Inference/Mapping (EIM) task (target sound string **E**xtraction, referential category **I**nference, and label-meaning **M**apping). This paradigm uses photographs of complex visual scenes, providing a naturalistic visual parsing challenge that poses Quine's problem of indeterminacy of the intended referent (Quine, 1960) in a laboratory environment. Simultaneously, a stream of spoken words is presented acoustically.

To control the key features of the auditory stimuli to which learners are exposed, we created an artificial language made of non-sense monosyllabic words (cf. Gomez and Gerken, 2000). Each utterance in this artificial language is a stream of five monosyllabic words containing a single target word ("target label" hereafter) at an arbitrary position. These target labels are consistently associated with the intended category of the photograph (**Figure 1**). Participants must identify the target labels within the speech stream, infer the intended referent category from the photographs, and link these two together into a label-meaning pair which allows them to subsequently extend the acquired word to novel utterance contexts and to new instances of the intended referential category (novel images).

The EIM task can be used with children or adults (and potentially animals). It uses computer-modified natural speech to provide precise acoustic control, and allows for both explicit and implicit learning approaches. The paradigm can be varied in many ways to address multiple questions concerning word learning and language acquisition.

## **MATERIALS AND METHODS**

### **ETHICS STATEMENT**

The experiment reported in this article was conducted in accordance with Austrian law and the policies of the University of Vienna. According to the Austrian Universities Act 2002, the appointment of ethics committees is required only for medical universities engaged in clinical tests, the application of new medical methods, and/or applied medical research on human subjects. Accordingly, ethical approval was not required for the present study. Nevertheless, all participants gave written informed consent and were aware that they could withdraw from the experiment at any time without further consequences. All data were stored anonymously.

### **EXPERIMENTAL CONDITIONS: OVERVIEW**

Six different experimental conditions were tested (see **Figure 2**). In each condition, the set of referential images was the same, but the signal was manipulated in different ways to provide cues to the location of the target label in the speech stream:


**FIGURE 2 | Speech waveforms and pitch contours corresponding to the utterance "minajifoke" (where the target label was "mi") as synthesized in each experimental condition. (A)** Co-occurrence only: here the pitch contour is flat; the only cue to word learning is the consistent co-occurrence between target labels and their respective visual referents. **(B)** Consistent pitch peak: a consistent pitch emphasis marks each target label. **(C)** Inconsistent pitch peak: pitch emphasis marks a random word of the utterance. **(D)** Duration: a temporal length increase marks each target label. **(E)** Buzz cue: the attentional cue during the target label is a buzz sound played from the left channel of the headphones, precisely in correspondence with, and for the duration of, the target labels. **(F)** Visual cue: the target labels are highlighted by an abrupt temporary color change in the background screen, which is synchronized with the duration of each target label.

addition to the statistical cue. We used a large pitch deviation of one octave, a magnitude cross-linguistically typical of IDS (Fernald, 1992). The manipulation of pitch cues typically employed in IDS allows us to examine the specific role of pitch in the process of word learning, which we hypothesized would enhance the effect of the pure cross-modal co-occurrence cue.


Our manipulation of these different types of sensory information as selective attention markers to the target label allowed us to examine whether pitch enhancement has a special status in facilitating word learning.

### **PARTICIPANTS**

For each condition, 20 individuals at the University of Vienna were recruited via posters or Internet advertisement, for a total of 120 adult participants (71 females and 49 males, mean age = 23.7, range = 18–37) in a between-subjects design. Custom software (Experimenter version 3.5) written in Python 2.6 was used to present the stimuli and collect mouse-click responses. Participants were given modest monetary compensation or candy in exchange for their participation in this short (roughly 8 min) experiment.

## **MATERIAL**

The stimuli consisted of photographed images, presented on an LCD monitor, paired with artificial language utterances presented over headphones.

(1) Images. Forty-five unique full-color images of real life scenes were selected, each depicting one of three intended semantic categories: humans, mountains and non-human animals ("animals" hereafter). The images were downloaded from the National Geographic website (http://www.nationalgeographic.com), and scaled to 300 × 300 pixels. Care was taken that no obvious emotional or written content was depicted in these pictures.

(2) Sounds. Strings of five CV (consonant + vowel) words (our artificial language "utterances") containing the target label at a random position in the string were presented. Forty-five utterances were subdivided into three different sets of 15, each of which referred to one of the three image categories (see supplemental data online). Each set shared one distinctive word that consistently occurred in association with the corresponding image category. The target labels were /mi/, /ga/, and /lu/, and the image set to which they were paired was varied randomly for each participant. The position of the target label was varied systematically across utterances, appearing in each of the five "slots" with equal frequency (Kuhl, 2004). All other words of the utterances were treated as "stems" that were systematically shared across the three utterance sets, and which therefore had no consistent referential link to the visual stimuli. Hence, only the words shared within each utterance set constituted statistically valid target labels.
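The design constraints on the auditory stimuli can be sketched in code. This is a minimal illustration, not the authors' stimulus-generation procedure: the filler inventory (beyond the syllables of the example utterance "minajifoke"), the function names, and the choice to reuse identical filler frames across the three sets are all assumptions made for the example.

```python
import random
from collections import Counter

TARGETS = ["mi", "ga", "lu"]          # target labels named in the text
# Hypothetical filler inventory: only "na", "ji", "fo", "ke" appear in the
# example utterance "minajifoke"; the remaining syllables are invented.
FILLERS = ["na", "ji", "fo", "ke", "bo", "du", "pe", "ti"]
N_SLOTS = 5        # words per utterance
PER_SET = 15       # utterances per image category


def build_frames(rng):
    """Return 15 (slot, fillers) frames: three frames per target position."""
    frames = []
    for slot in range(N_SLOTS):
        for _ in range(PER_SET // N_SLOTS):
            frames.append((slot, rng.sample(FILLERS, N_SLOTS - 1)))
    return frames


def build_utterance_sets(seed=0):
    """Insert each target label into the same filler frames (an assumption)."""
    rng = random.Random(seed)
    frames = build_frames(rng)
    return {
        target: [fillers[:slot] + [target] + fillers[slot:]
                 for slot, fillers in frames]
        for target in TARGETS
    }


utterance_sets = build_utterance_sets()
# Count how often each target label lands in each of the five slots.
slot_counts = {
    target: Counter(words.index(target) for words in utterances)
    for target, utterances in utterance_sets.items()
}
print(slot_counts["mi"])
print("".join(utterance_sets["mi"][0]))
```

With this construction every set contains 15 five-word utterances, each target label occupies each slot three times, and the filler words carry no information about the image category because they recur identically in all three sets.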

In order to avoid co-articulation between adjacent words, as in Johnson and Jusczyk (2001), each word was recorded individually. Each word was then acoustically modified in PRAAT (Boersma and Weenink, 2007). In particular, the words' pitch and duration were modulated using the pitch-synchronous overlap-add (PSOLA) algorithm. Word amplitude was made consistent: each word's intensity was adjusted to mean 70.0 dB (SD = 0.2) relative to peak amplitude. Except for the "duration" condition, the duration of each word was normalized (mean 400 ms; SD = 2 ms).

### *Perceptual manipulation of the signal in each experimental condition*

*Co-occurrence only.* The pitch-, loudness- and duration-normalized words were concatenated without pauses to form five-word utterances. In this first condition the target labels' pitch was normalized to have the same F0 as the four other words (*M* = 210.7 Hz; SD = 0.6 Hz).

*Consistent pitch peak.* The target label was manipulated to have a much higher pitch peak (*M* = 421.8 Hz; SD = 1.5 Hz) than the rest of the words, which were presented in a monotone frequency (*M* = 210.7 Hz; SD = 0.6 Hz). The frequency ratio between the peak and the baseline corresponded closely to a musical interval of an octave, with the peak frequency doubling the F0 of the monotonous words. Such large pitch excursions are cross-linguistically typical of IDS (Fernald, 1992).

*Inconsistent pitch peak.* An octave pitch peak was applied randomly to one word of the utterance, with the condition that each word of the artificial language was stressed at least once and no more than twice. To prevent the *absence* of a pitch cue from itself signaling the target, each target label was also stressed, but only once over the training. Thus, in this condition, pitch emphasis was inconsistent with the co-occurrence cue between the target label and its corresponding image category (Morgan et al., 1987; Shukla et al., 2011). The focused words were again given an average F0 of 421 Hz (SD = 1.4 Hz), while the rest of the words were presented in monotone (*M* = 210.7 Hz; SD = 0.6 Hz). As in condition 2, the frequency ratio between the pitch peak and the baseline corresponded closely to an interval of an octave.

*Duration.* The duration of the target label was adjusted to twice that of the non-target words (target label: *M* = 800 ms; SD = 2 ms; non-target words: *M* = 400 ms; SD = 3 ms). For each word, F0 was normalized to a mean of 210.7 Hz (SD = 0.5 Hz).

*Buzz cue.* The five-word utterances were those used in condition 1, but now a buzz sound (a low-pass filtered pulse train at 80 Hz, intensity: 67 dB relative to peak) was played during the entire duration of the target label. To prevent clicks, a 20 ms fade-in/fade-out transition was applied to the buzz sound. In order to maintain optimal separation of the spoken utterance and the buzz, utterances (including the target label) were played from both stereo channels of the headphones (centering the auditory image), while the buzz sound was played only from the left side.

*Visual cue.* Again, the same set of utterances as in condition 1 was played, but now a visual cue was used during the target label: the "standard" light blue background color surrounding the images was changed to red during the entire duration of the target label.
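The octave relation stated for the pitch-peak conditions can be verified from the reported mean frequencies. The short sketch below converts the frequency ratio into semitones using the standard equal-tempered formula (12 semitones per doubling of frequency); the function name is hypothetical.

```python
import math


def interval_in_semitones(f_high, f_low):
    """Size of the interval between two frequencies in equal-tempered semitones."""
    return 12 * math.log2(f_high / f_low)


BASELINE_HZ = 210.7   # mean F0 of the monotone words
PEAK_HZ = 421.8       # mean F0 of the emphasized word (Consistent pitch peak)

ratio = PEAK_HZ / BASELINE_HZ
semitones = interval_in_semitones(PEAK_HZ, BASELINE_HZ)
print(f"frequency ratio: {ratio:.3f}")
print(f"interval: {semitones:.2f} semitones")
```

A ratio of 2 corresponds to exactly 12 semitones, so the reported means imply an interval within a small fraction of a semitone of a true octave.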

### *Training and testing procedure*

An explicit learning paradigm was used. Participants were randomly assigned to one of the six experimental conditions. They were told that they would participate in an "Alien Language Learning Study" (see Kirby et al., 2008) in which they would see a series of pictures and hear the sounds that an imaginary alien would use to describe those pictures. They were informed that the experiment consisted of a training phase, during which they were simply asked to do their best to understand as much as they could of this language. They were also told that their mastery of the language would be evaluated in a test phase right after the training. After being instructed, participants were seated in a quiet room, at around 60 cm from a 23" monitor (1,920 × 1,080 pixels) and wore Sennheiser HD 520 headphones. The experiment lasted around 8 min.

In both the training and the test phase, the artificial language was manipulated as described above for each condition. To avoid the possibility that some specific image-label correspondences are easier than others, which could bias interpretation, each target label was randomly assigned to an image category across subjects. As illustrated in **Figure 1**, during the training session each utterance was randomly paired with one image (centered on the monitor) including the appropriate referential category, yielding 45 auditory utterance-image pairs (see Yu and Smith, 2007). Each utterance, and each image, was presented only once. The auditory unit-image pairs were presented in a random order across participants. For each slide, playback of the utterance was initiated synchronously with the onset of image presentation. The image remained on screen for a further 1500 ms after the end of the auditory unit's ∼2 s presentation, for a total of approximately 3500 ms per slide.

After the training session, participants received a multiple-choice test. Participants were presented with a novel five-word utterance, containing one of the three target labels, and three novel images simultaneously (one from each category). Each five-word utterance was associated once with a set of three probe images, yielding 45 test trials. The onset of images coincided with the onset of the auditory utterance. The mouse pointer was hidden during sound playback to prevent premature responses. Participants were asked to indicate which image matched the auditory unit by clicking on that image. They could thus make their choice anytime from the end of the auditory stimulus playback up to 4 s after the sound ended. No feedback was provided. An interval of 1 s followed the subject's response on each trial prior to the onset of the next trial. The order of presentation of the utterance-image trials, as well as the left-to-right arrangement of the three images on the monitor, was randomized for each subject. Presenting novel images probed the participants' ability to apply the acquired reference to members of a potentially infinite set of new images, while the novel utterances examined their ability to process the acquired label within an open-ended set of new utterances.

### **RESULTS**

Statistical analyses were performed using SPSS for Mac OS X version 19. A binary logistic regression model was built within the generalized linear model framework, to compare overall responses across conditions. Data across all subjects were modeled using a binomial distribution and a logit link function. *Participant ID* was entered as subject variable, *image category* as a within-subject predictor variable and *experimental condition* as a between-group predictor variable. The dependent variable was the proportion of correct choices in participants' responses (where chance = 33.3%). Five participants were excluded from the analysis because their responses comprised more than 15% timeouts, which could not be analyzed. The model provided a good fit (*R*<sup>2</sup> = 0.65; see Nagelkerke, 1991), and revealed a significant main effect of experimental condition [Wald χ<sup>2</sup>(5) = 28.525, *p* < 0.001], no significant effect of image category [Wald χ<sup>2</sup>(2) = 4.181, *p* = 0.124], and no significant interactions between image category and experimental condition [Wald χ<sup>2</sup>(10) = 11.732, *p* = 0.303]. Consistent with these analyses, a non-parametric Kruskal–Wallis test (with individual responses collapsed across image categories) confirmed that participants' performance was significantly affected by the experimental condition [*H*(5) = 17.734, *p* = 0.003].

Pairwise comparisons between the *Co-occurrence only* condition and all the other experimental conditions, using the sequential Bonferroni correction procedure (Holm, 1979), revealed a significant difference only between *Co-occurrence only* and the *Consistent pitch peak* condition [Wald χ<sup>2</sup>(1) = 14.138, *p* = 0.001]. Differences in learning performance did not reach significance between the *Co-occurrence only* condition and the *Duration* condition [Wald χ<sup>2</sup>(1) = 5.351, *p* = 0.083], the *Inconsistent pitch peak* condition [Wald χ<sup>2</sup>(1) = 2.692, *p* = 0.302], the *Visual cue* condition [Wald χ<sup>2</sup>(1) = 1.246, *p* = 0.529], or the *Buzz cue* condition [Wald χ<sup>2</sup>(1) = 0.031, *p* = 0.860]. Thus, only the consistent pitch cueing provided a significant boost in learning efficacy over the ever-present statistical association cue (**Figure 3**).
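The sequential Bonferroni procedure can be illustrated with the Wald statistics listed above. The sketch below is a plain-Python reimplementation for illustration only, not the SPSS routine used in the study. It assumes that each comparison has 1 degree of freedom, as printed, and that the *p* values in the text are the Holm-adjusted ones; the function names are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def holm_adjust(p_values):
    """Holm (1979) step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Wald statistics and p values as printed in the text.
wald = {
    "Consistent pitch peak": 14.138,
    "Duration": 5.351,
    "Inconsistent pitch peak": 2.692,
    "Visual cue": 1.246,
    "Buzz cue": 0.031,
}
reported = {
    "Consistent pitch peak": 0.001,
    "Duration": 0.083,
    "Inconsistent pitch peak": 0.302,
    "Visual cue": 0.529,
    "Buzz cue": 0.860,
}

raw = [chi2_sf_1df(x) for x in wald.values()]
adjusted = dict(zip(wald, holm_adjust(raw)))
for name, p_adj in adjusted.items():
    print(f"{name}: adjusted p = {p_adj:.3f} (reported {reported[name]:.3f})")
```

Under these assumptions the adjusted values come out close to the printed ones, with small differences attributable to rounding of the Wald statistics.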

To further investigate this finding, we calculated the proportionate changes in odds (odds ratio) between the *Co-occurrence only* and all other conditions as a measure of effect size. This analysis revealed that the odds of getting the correct response were 4.851 times higher in the *Consistent pitch peak* condition than in the *Co-occurrence only* condition, while none of the other conditions yielded odds ratios greater than 1.842 times the odds obtained with *Co-occurrence only*, indicating a much stronger effect of the presence of the pitch peak than any of the other attention-highlighting modifications.
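To make the odds ratios more concrete, the sketch below converts them into implied proportions correct. The baseline accuracy of 0.40 is a purely hypothetical value chosen for illustration, since the condition means are not given in the text; the function names are likewise assumptions.

```python
def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1.0 - p)


def odds_ratio(p_condition, p_baseline):
    """Ratio of the odds of a correct response in two conditions."""
    return odds(p_condition) / odds(p_baseline)


def apply_odds_ratio(p_baseline, ratio):
    """Probability implied by multiplying the baseline odds by `ratio`."""
    new_odds = odds(p_baseline) * ratio
    return new_odds / (1.0 + new_odds)


BASELINE = 0.40          # hypothetical Co-occurrence only accuracy
OR_PITCH = 4.851         # Consistent pitch peak vs. Co-occurrence only
OR_OTHER_MAX = 1.842     # largest odds ratio among the other conditions

p_pitch = apply_odds_ratio(BASELINE, OR_PITCH)
p_other = apply_odds_ratio(BASELINE, OR_OTHER_MAX)
print(f"implied accuracy, pitch peak: {p_pitch:.3f}")
print(f"implied accuracy, next-best cue: {p_other:.3f}")
```

Because odds ratios are multiplicative on the odds scale, the same ratio implies different absolute gains in accuracy depending on the baseline, which is why the actual condition means are needed to interpret the effect on the proportion scale.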

A one-sample Wilcoxon signed-rank test was performed for each condition, to test if the median % correct was significantly different from chance (above or below 33.3%). This test revealed that in all experimental conditions except the *Inconsistent pitch peak* condition, percent correct was significantly higher than expected by chance (*Co-occurrence only* condition: *z* = 2.951, *p* = 0.003; *Consistent pitch peak* condition: *z* = 3.473, *p* = 0.001; *Duration* condition: *z* = 3.816, *p* < 0.001; *Visual cue* condition: *z* = 2.951, *p* = 0.003; *Buzz cue* condition: *z* = 3.286, *p* = 0.001). For the *Inconsistent pitch peak* condition, participant performance did not differ significantly from chance (*z* = 1.645, *p* = 0.100).

### **DISCUSSION**

We found that performance in the EIM task was significantly higher than that observed with simple co-occurrence *only* when pitch prominence consistently marked the target label. Successful learning of target labels occurred in all conditions except the *Inconsistent Pitch Peak* condition. The fact that participants performed above chance in the *Co-occurrence only* condition shows that consistent cross-modal co-occurrence between target labels and their referents was sufficient to allow word learning, consistent with previous research on statistical cross-modal coherence in learning labels for individual objects (Gogate and Bahrick, 1998).

Comparisons between the *Co-occurrence only* condition and the other experimental conditions showed that only one condition yielded a significant increase in performance: the *Consistent pitch peak* condition. When duration, screen color change, or buzz cues were used to draw attention to the target label, no significant increase in learning performance was observed (although a trend at *p* = 0.083 was seen for *Duration*). These results provide compelling evidence that, of all the cues examined here, *only* exaggerated pitch contour values typical of IDS are salient enough that, if used as markers of the target label, and thus of statistical cross-modal regularities, they significantly aid word learning. Although the manipulations of duration, visual, and non-prosodic acoustic cues were quite extreme (especially the visual cue), they did not significantly improve participants' performance in word acquisition over simple cross-modal statistical regularities, strongly suggesting that the pitch effect demonstrated here goes beyond any general attentional effects (i.e., "von Restorff" effects).

Previous research has shown that IDS-typical parameters such as prominent pitch values, exaggerated formant space (vowel hyperarticulation) and/or grammatical simplicity can assist spoken word identification (Sherrod et al., 1977; Burnham et al., 2002; Thiessen et al., 2005). Our results extend these findings, demonstrating a positive didactic effect of IDS-typical pitch prominence in the complex process of word learning as operationally defined here, and contrast with the suggestion that pitch highlighting has no positive didactic effects (Uther et al., 2007). Future work should evaluate the role of vowel hyperarticulation or grammatical simplicity in this task.

Given that performance in the *Duration* condition was marginally significantly higher than in the *Co-occurrence only* condition, our data are compatible with findings indicating that prominent lengthening of utterances and/or specific words at the ends of utterances can also assist learners in communicative tasks (Church et al., 2005). The trend in our data suggests that, with larger samples or more extensive training, duration might also show a significant augmentation of word learning. However, our odds ratio comparisons suggest a stronger role of pitch prominence relative to timing as a learning booster, mirroring the findings of Seidl (2007). Our findings support the hypothesis that the natural predisposition to perceive cross-modal regularities, and the exposure to prosodically highlighted stimuli, are intertwined aspects of language learning (Christiansen and Dale, 2001).

Regarding the *Inconsistent pitch peak* condition, the impairment or lack of learning improvement compared with the *Co-occurrence only* condition demonstrates that the mere presence of exaggerated pitch contours somewhere in an utterance does not aid learning. This finding suggests that pitch enhancement might override cross-situational statistical learning, being a more salient cue for adult participants. Furthermore, evidence on this condition indicates that improved learning is not simply due to a general increased attentiveness induced by the presence of an arousing pitch peak (Fernald, 1992).

It is notable that the addition of a visual cue synchronous with the target label did not significantly improve learning performance in comparison to the *Co-occurrence only* condition. This is consistent with some previous research on the interpretation of pragmatic cues as intentional acts of reference. Neither simply pointing to an intended referent (Grassmann and Tomasello, 2010), nor highlighting it with a flashlight and a general attentional phrase (Keates and Graham, 2008), is sufficient for correct label acquisition. In these cases, what makes a communicative act is its intentional connotation, i.e., the interlocutors' ability to engage in joint attention frames of reference (Tomasello, 2000).

Importantly, our results suggest that adults exploit pitch enhancement as a pragmatic cue to relevant similarities among referents across multiple visual contexts. This finding contributes to a research framework that warrants further work: the effect of prosodic modulation as an invitation to generate referential categories across multiple visual environments in spoken interactions.

Our results may also have implications for models of language evolution, and are compatible with the suggestion that the increased use of prosodic and gestural modifications typical of motherese might have been a useful cue that made vocal language easier to process for hominins (Falk, 2004; de Boer, 2005a,b). Our data suggest possible links between two crucial hypotheses in the literature on the evolution of language: (a) Darwin's hypothesis that a music-like modulation of voice had a special role in the initial evolution of verbal language (Darwin, 1871) and (b) the hypothesis that mutual segmentation of speech streams and situational contexts initiated a subsequent evolutionary process of linguistic elaboration (Wray, 1998; Okanoya and Merker, 2007).

Some limitations of the current study are worth noting. First, we used adult participants who, unlike neonates, already know that a language-learning task implies the association between sounds and referents. Future work should examine the learning effects of attentional highlighting markers for preverbal infants, who presumably do not possess the mental categories employed here (humans, animals, and mountains). Future work could also examine novel referential categories in adults, and again evaluate the effects of pitch and other cues in guiding the formation of *new* categories.

There are many ways in which the paradigm introduced here can be extended. The simple paradigm used here lacks any syntactic relation between units, a property that no doubt plays an important role in word learning. Future studies might include multiple target labels in each utterance, or investigate the ability to map words to referents of different kinds (e.g., nouns versus verbs, or statements versus requests). This could provide new insights into how prosodic highlighting interacts with the syntactic and semantic organization of the utterance, and *vice versa* (bootstrapping process). Moreover, one could investigate additional statistical information with our design, utilizing, for example, multi-syllabic words defined by transition probabilities between syllables, rather than the monosyllabic target labels we employed. Image complexity could also be manipulated (e.g., referent size, number of distractors, or emotional connotation). Finally, it would be interesting to employ our task in animals, or using wordless melodies (rather than speech). Clearly, the paradigm introduced here opens up multiple research avenues to investigate word-learning across contexts in a controlled, yet naturalistically complex, experimental environment. We hope that further research along these lines will lead to a richer understanding of the complex cognitive processes involved in language acquisition.

### **AUTHOR CONTRIBUTIONS**

Piera Filippi developed the study concept. All authors contributed to the study design. Piera Filippi performed testing and data collection. Piera Filippi and Bruno Gingras performed data analysis. Piera Filippi drafted the manuscript, and Bruno Gingras and W. Tecumseh Fitch provided critical revisions. All authors approved the final version of the manuscript for submission.

### **ACKNOWLEDGMENTS**

We especially thank Marco Carapezza for his invaluable support. This work was supported by European Research Council Advanced Grant SOMACCA (No. 230604) awarded to W. Tecumseh Fitch. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2014.01468/abstract

### **REFERENCES**

Aslin, R. N., Woodward, J. Z., LaMendola, N. P., and Bever, T. G. (1996). "Models of word segmentation in maternal speech to infants," in *Signal to Syntax*, eds J. L. Morgan and K. Demuth (Hillsdale, NJ: Erlbaum), 117–134.


Quine, W. V. O. (1960). *Word and Object*, Cambridge, MA: MIT Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 October 2014; paper pending published: 22 November 2014; accepted: 30 November 2014; published online: 22 December 2014.*

*Citation: Filippi P, Gingras B and Fitch WT (2014) Pitch enhancement facilitates word learning across visual contexts. Front. Psychol. 5:1468. doi: 10.3389/fpsyg.2014.01468 This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Filippi, Gingras and Fitch. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Evoking Context with Contrastive Stress: Effects on Pragmatic Enrichment

### Chris Cummins \* and Hannah Rohde

Linguistics and English Language, University of Edinburgh, Edinburgh, UK

Although it is widely acknowledged that context influences a variety of pragmatic phenomena, it is not clear how best to articulate this notion of context and thereby explain the nature of its influence. In this paper, we target contextual alternatives that are evoked via focus placement and test how the same contextual manipulation can influence three different phenomena that involve pragmatic enrichment: scalar implicature, presupposition, and coreference. We argue that focus placement influences these three phenomena indirectly by providing the listener with information about the likely question under discussion (QUD) that a particular utterance answers (Roberts, 1996/2012). In three listening experiments, we find that the predicted interpretations are indeed made more available when focus placement is added to the final element (to the scalar adjective, to an entity embedded under the negated presupposition trigger, and to the predicate of a pronoun). These findings bring together several distinct strands of work on the effect of focus placement on interpretation all in the domain of pragmatic enrichment. Together they advance our empirical understanding of the relation between focus placement and QUD and highlight commonalities between implicature, presupposition, and coreference.

Keywords: question under discussion (QUD), scalar implicature, presupposition projection, coreference, focus placement

## INTRODUCTION

The study of pragmatics examines how hearers infer meaning beyond that which is explicitly expressed by the speaker. This process crucially depends upon the consideration of what is not said as well as what is said. To take one much-discussed example, quantity implicature has traditionally been assumed to rely on the hearer's ability to identify and reason about more informative alternatives that the speaker could have uttered. For example, a hearer is expected to reason that a speaker who utters (1) had available to them a stronger statement, as in (2), and that because the speaker chose not to utter (2), the hearer is entitled to infer the classic scalar implicature from (1), namely the negation of (2).


### Edited by:

Marco Cruciani, University of Trento, Italy

### Reviewed by:

Bob Van Tiel, Université Libre de Bruxelles, Belgium; John Michael Tomlinson, ZAS Berlin, Germany

> \*Correspondence: Chris Cummins ccummins@staffmail.ed.ac.uk

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 14 July 2015; Accepted: 05 November 2015; Published: 26 November 2015

### Citation:

Cummins C and Rohde H (2015) Evoking Context with Contrastive Stress: Effects on Pragmatic Enrichment. Front. Psychol. 6:1779. doi: 10.3389/fpsyg.2015.01779

A major concern arising from this line of reasoning is what we mean by alternatives that the speaker "could have uttered." As Grice (1975) sketched out, we expect that cooperative speakers will adhere to several principles of interaction. They will not make statements which are false or for which they lack evidence, they will produce utterances that are relevant, they will be concise, and they will make their contribution only as informative as is required for the current purposes of the dialogue in which they are engaged.

The net effect of this is to substantially narrow down the space of alternatives that are pragmatically consequential in a particular set of circumstances. For instance, implicatures are predicted not to be available based on informationally stronger statements about which the speaker is not knowledgeable, because the speaker could not have made these statements without violating Grice's quality maxim—therefore, the speaker's unwillingness to utter them is not intended to signal their falsity.<sup>1</sup> Similarly, implicatures should not be available when the additional information provided by the stronger statement would have been irrelevant to the current discourse purpose, putatively because the speaker could not convey this additional information without violating the maxim of relation. These predictions have been borne out experimentally (Breheny et al., 2006; Goodman and Stuhlmüller, 2013).

Nevertheless, as Grice himself acknowledged, the issue of determining whether or not a potential utterance would have been relevant to the current discourse purpose, had it been uttered, is not a straightforward matter. Roberts (1996/2012) approaches this by appeal to the notion of Question Under Discussion (QUD), which she defines as the immediate topic of discussion and which she takes to proffer a set of relevant alternatives. A felicitous assertion is, on this view, one which bears upon the QUD by choosing among the alternatives that it proffers. For instance, a QUD of "How many of John's children did Mary see today?" would proffer a set of alternatives including "some of them" and "all of them," and both (1) and (2) would be felicitous responses to this QUD. If that particular question is indeed the one under discussion, the hearer of (1) is expected to identify that (2) would have been a felicitous alternative and (given some additional assumptions) understand (1) to implicate the negation of (2).
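The reasoning in the preceding paragraphs can be made concrete with a small sketch. It treats a QUD as nothing more than the set of alternatives it proffers and derives an implicature only when a stronger scalemate is among those alternatives and the speaker is assumed to be knowledgeable. The function name `scalar_implicatures`, the two-term scale, and the second QUD are illustrative assumptions, not part of Roberts's formal account.

```python
# Toy model: the scale and the QUD alternative sets are illustrative assumptions.
SCALE = ("some", "all")  # ordered from informationally weaker to stronger

def scalar_implicatures(uttered, qud_alternatives,
                        speaker_knowledgeable=True, scale=SCALE):
    """List the implicatures licensed by uttering a term from the scale.

    A stronger scalemate is negated only if the QUD proffers it (so it
    would have been a relevant answer) and the speaker is assumed to
    know whether it holds.
    """
    if uttered not in scale or not speaker_knowledgeable:
        return []
    stronger = scale[scale.index(uttered) + 1:]
    return [f"not {term}" for term in stronger if term in qud_alternatives]

# QUD "How many of John's children did Mary see today?" proffers both terms.
how_many = {"some", "all"}
# A hypothetical QUD whose answers do not include the stronger term.
at_least_one = {"some", "none"}

print(scalar_implicatures("some", how_many))
print(scalar_implicatures("some", at_least_one))
print(scalar_implicatures("some", how_many, speaker_knowledgeable=False))
```

The sketch deliberately leaves out how the hearer identifies the QUD in the first place, which is the question the rest of this paper addresses.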

What if the QUD is not explicitly given, though? Roberts (1996/2012) takes the view that the QUD is often merely implicit and has to be inferred on the basis of other considerations. Specifically, she cites the use of prosodic focus as a cue to QUD in English. As she puts it (2012: 27), "assertions, like questions, are conventionally associated with a set of alternatives, although these alternatives are presupposed by the prosody rather than proferred" (see also Büring, 2003). This proposal develops the observation of Jackendoff (1972) that the prosody of an assertion constrains the set of questions to which it could be an answer: on Roberts's account, we can go further and use the prosody of an assertion to identify relevant alternatives that could have been uttered in place of the actual assertion.

In this paper, we discuss the use of focus marking to evoke sets of alternatives and experimentally test the impact of such alternatives on three distinct pragmatic phenomena: scalar implicature, presupposition cancelation, and coreference. We argue that a QUD-based analysis potentially offers a unified explanation of what appear, on the surface, to be very different pragmatic consequences; and we introduce novel experimental data to show that these effects are indeed evident in comprehenders' behavior.

## SOME PRAGMATIC CONSEQUENCES OF FOCUS MANIPULATION

## Scalar Implicature

Since Horn (1972), scalar implicatures have been widely discussed as a special case of quantity implicature. As Geurts (2010, p. 49) puts it, "the distinctive feature of scalar implicatures is that we can use lexical substitution to generate the relevant alternatives from the sentence uttered." This is evident in the case of (1) above: the alternative, (2), is generated simply by replacing the informationally weaker "some" with the stronger "all." We can think of <some, all> as constituting an informational scale.

A widespread intuition within the literature is that (at least some) scales of this form are privileged in terms of their pragmatics, in that the use of a weak term from one of those scales robustly tends to implicate the falsity of the corresponding utterance with any stronger scalemate. Indeed, for cases such as <some, all>, the inference (that "some" tends to mean "not all") is sufficiently robust to have motivated accounts in which it is generated by default (Levinson, 2000) or is grammaticalised (Chierchia et al., 2012). From a QUD point of view, we can understand this observation as a generalization about the kinds of question to which a weak scalar is an appropriate answer: that is, whenever a weak scalar is a felicitous answer to a given QUD, any stronger scalemate would likewise be a felicitous answer. Consequently, it is generally appropriate for the hearer to embark on pragmatic reasoning concerning the stronger alternative (thus deriving the implicature), safe in the knowledge that the stronger alternative would indeed have been an appropriate thing for the speaker to have uttered, had the speaker known it to be true.

The extensive recent experimental literature on scalar implicature has demonstrated that things are not quite so clearcut as had previously been supposed. In fact, there is considerable variability between participants as to whether or not they endorse scalar inferences such as "some" -> "not all," with the overall response rates also depending on task factors (see Katsos and Bishop, 2011 for a review). A possible explanation for this is that the tendency of a particular weak scalar to evoke a suitable context for implicature (i.e., a context in which the stronger alternative would also have been felicitous) is not necessarily as strong as had been postulated. This may reflect the fact that, under certain circumstances, it is possible to use weak scalars in contexts in which their stronger scalemates are not judged to be especially relevant, as was shown by Breheny et al. (2006). For instance, in the context of (3), the use of the weak scalar "or" [as in (4)] already adequately answers the question of whether there is at least one person who will be available. The use of the stronger scalemate "and" [as in (5)] would not necessarily be warranted, inasmuch as the extra information it conveys is not essential for the current discourse purpose. Indeed, (5) could be altogether less useful than (4), to the extent that it introduces an ambiguity concerning whether Kate and Rob will be available separately. Correspondingly, readers/hearers do not tend to infer, on the basis of (4), that (5) is false.

<sup>1</sup>Of course, the speaker's unwillingness to make a stronger statement may have the effect of signaling their lack of certain knowledge about the truth or falsity of the stronger statement.


Given that scalar implicatures are not obligatory in all contexts, we can ask whether their availability is sensitive to the kind of focus manipulation discussed by Roberts (1996/2012). The intuition is that placing focus on the weak scalar term emphasizes its potential for being substituted: that is, that the relevant alternatives to the utterance involve the substitution of some other lexical item in place of the weak scalar. For instance, by stressing "some," we call particular attention to the possibility that other items such as "all" could be used in its stead, and consequently feed these into the calculation of potential pragmatic enrichments. By hypothesis, the use of a weak scalar already tends to evoke this set of alternatives, but [as shown by cases such as (4)] this is not invariably the case. We might therefore expect that placing focus on the weak scalar will increase the rate at which comprehenders infer scalar implicatures.

This hypothesis has been partially tested by placing utterances containing elements from the <or, and> scale (Zondervan, 2006) and the <some, all> scale in contexts in which a preceding question designates the target utterance's intended focus structure (Zondervan et al., 2009). Using dialogue fragments like (6) and (7), Zondervan et al. show that comprehenders draw significantly more implicatures (agreeing more often with the statement that "not all pizzas were delivered") in the context of a question that evokes the stronger scalar alternative (6) than one that does not (7).


Intuitively, B's utterance in (6) has focus on "some," and would perhaps most naturally be read aloud with focal stress on that word, whereas B's utterance in (7) does not, and would be read with stress on "were." The finding therefore coheres with Roberts' account. However, it should be noted that (6) does not merely evoke the stronger alternative "all" through the presumed focus placement in B's utterance, but explicitly introduces it in A's utterance. By contrast, (7) makes no mention of "all." Moreover, in (7), B's utterance is unnecessarily verbose (B could just reply "yes"), and it seems possible that a reader could doubt B's full cooperativity. Therefore, it might be premature to attribute the difference in judgments between (6) and (7) entirely to focus considerations.

Recent experimental work points to a role for the combination of preceding context and explicit manipulations of prosody in the interpretation of scalar implicatures. De Marneffe and Tonhauser (2015) test for effects of prosody in two different contexts which provide the background against which to interpret a scalar adjective—either an explicit utterance similar to A's polar question in (7) or a preceding statement regarding speaker A's commitments. In both contexts, a rise-fall-rise intonation on B's subsequent utterance containing the scalar adjective leads listeners to report stronger degrees of belief in the pragmatically strengthened meaning compared with a neutral intonation. However, this leaves open the question of whether prosody alone can shift the hearer's understanding of what the preceding context is likely to contain, in such a way as to influence the pragmatic interpretation of the utterance. Under a QUD-based account, this should be possible: an utterance's prosody is one of the cues that listeners use to infer what question the utterance may be a relevant answer to.

As in the case of much of the experimental research on scalar implicature, the existing work on focus effects has attended to a limited number of potential scales. More recent work by Van Tiel et al. (2014) demonstrates substantial variability among potential implicature scales with respect to the availability of their corresponding implicatures. They demonstrate that, within a neutral context, the rates of endorsement of 43 candidate scalar implicatures ranged from 4% (e.g., "tired" +> "not exhausted") to 100% (e.g., "sometimes" +> "not always"), with "some" +> "not all" very near the top of the range at 96%. This variability raises the question of whether the effect of focus in promoting scalar implicature is general across a broad range of triggers. On the one hand, "some" and "or" (which was not tested by Van Tiel et al.) may be atypically strong implicature triggers, and consequently the effect of focus may be particularly clearcut in these cases, as the stronger scalar alternatives are especially susceptible to being evoked. On the other hand, it is possible that "some" and "or" could be influenced less by the presence of focus, as they already evoke the stronger scalar alternatives to the fullest extent possible even without additional stress being introduced.

Experiment 1 of this paper evaluates the availability of a variety of different scalar implicatures, using intonation to signal focus placement on a weak scalar. The study goes beyond prior work that has manipulated the preceding context against which a scalar is interpreted (Zondervan et al., 2009; de Marneffe and Tonhauser, 2015). If hearers can instead make use of focus placement on an utterance in isolation to recover a likely QUD that is operative in the context, that QUD and the set of alternatives it evokes are predicted to influence the perceived availability of the scalar implicature.

As we will show, this prediction is borne out. However, the finding follows from the fact that scalar implicatures necessarily depend on the presence of alternatives ("scalemates"). Arguably a more substantive result would be a demonstration that the manipulation of focus influences scale-independent pragmatic phenomena. To that end, we next consider presupposition and coreference.

## Presupposition Cancellation

The tendency of content to project from under the scope of negation has long been identified as diagnostic of presupposition, as opposed to other forms of non-asserted content. For instance, both (8) and its negation (9) presuppose (10). By appeal to accommodation, either (8) or (9) can be used to convey the fact of (10) to a hearer who was not previously aware of it.


Nevertheless, it is quite possible for a presupposition under the scope of negation to be canceled, or to fail to project to the discourse level. (11) is an apparently felicitous example.

(11) John didn't quit smoking—he never smoked in the first place.

In principle, the acceptability of (11) suggests that the hearer is confronted with a difficult problem when she encounters an utterance like (9)—should the presupposition (10) be added to her discourse model, even though this might turn out to be an erroneous inference? Or should she wait until it is made clear whether or not the speaker intends to communicate (10)? This puzzle appears to vitiate the communicative benefits of being able to exploit accommodation to convey a presupposition.

It may be possible to solve this puzzle by appeal to the notion of QUD. An utterance like (11), in which a presupposition is apparently triggered (in this case, by the use of "quit") and then canceled, may suggest the presence of a current QUD that already assumes that presupposition. For (11), the QUD appears to be something like "Did John quit smoking?" The set of proffered answers then effectively comprises (8) and (9), both of which contain the presupposition trigger "quit," and the speaker's subsequent utterance of one of them does not constitute an attempt to convey the presupposition. If the hearers are aware of, or can infer, the existence of such a QUD, then they should not take the speaker's utterance of "quit" as necessarily committing the speaker to the belief that John used to smoke. By contrast, an utterance like (9) is potentially compatible with a wider range of QUDs (for example, "What did John do after he saw his doctor?"), some of which proffer alternatives that do not involve the presupposition trigger "quit." The subsequent use of "quit" thus represents the outcome of a choice on the part of the speaker, and consequently has the potential to convey meaning (i.e., the presupposition).

How might focus effects come into play here? A speaker who utters (9) neutrally, or placing stress on "didn't," seems merely to evoke an alternative such as (8). This is compatible with a situation in which the QUD is "Did John quit smoking?" and the speaker does not wish to challenge the presupposition. However, a speaker who utters (9) but places focal stress on "John" appears to give rise to a different set of alternatives, involving all the people who might have quit smoking. This kind of focus appears to suggest a continuation such as (11) or (12).

(12) JOHN didn't quit smoking—you're thinking of Bill.

As far as the QUD is concerned, focus on "John" suggests that it is likely to be of the form "Who (didn't) quit smoking?" There is still a presupposition built into this question, namely that someone (in the universe of discourse) used to smoke at some point prior to the time of utterance, but the specific presupposition that John used to smoke is now absent.

The story is similar if stress is placed on "smoking." In this case, the implied QUD is "What did John quit doing?" and the alternatives are the things that John might have quit (e.g., "drinking"). Again, the QUD encompasses a presupposition that John used to do something (of interest to the discourse purpose), but not specifically that he used to smoke.

If this line of reasoning is correct, then the hearer's inference from (9) to (10)—the projection of the presupposition from under the scope of negation to the discourse level—relies upon the assumption that (9) answers a QUD that presupposes (10). This inference will therefore be obstructed if focus is placed on "John" or "smoking." In either case, the hearer will be encouraged to infer a QUD which does not presuppose (10), and hence not project the presupposition. This observation and variants of the QUD-based analysis have been outlined in similar form in several recent papers (Beaver and Clark, 2008; Cummins, 2014; Simons et al., to appear). Experiment 2 tests these predictions experimentally. As in our first experiment, Experiment 2 manipulates focus placement to influence the QUD a hearer infers, and as we show, this manipulation in turn modulates the projection of the presupposition from under negation.
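The prediction can be illustrated with a toy sketch in which focus on one element of the utterance evokes alternatives that vary only in that element, and the specific presupposition projects only if every evoked alternative shares it. The `Quit` type, the function names, and the small domains of agents and activities are hypothetical choices made for this illustration; they are not the materials or the analysis used in Experiment 2.

```python
from typing import NamedTuple

class Quit(NamedTuple):
    """An utterance of the form '<agent> did / didn't quit <activity>'."""
    agent: str
    activity: str
    negated: bool

    def presupposition(self) -> str:
        # "quit" triggers the same presupposition with or without negation.
        return f"{self.agent} previously engaged in {self.activity}"

# Hypothetical domains of alternatives, for illustration only.
AGENTS = ["John", "Bill"]
ACTIVITIES = ["smoking", "drinking"]

def focus_alternatives(utt, focus):
    """Alternatives evoked by focus: vary the focused element only."""
    if focus == "agent":
        return [utt._replace(agent=a) for a in AGENTS]
    if focus == "activity":
        return [utt._replace(activity=a) for a in ACTIVITIES]
    if focus == "polarity":
        return [utt._replace(negated=n) for n in (False, True)]
    raise ValueError(f"unknown focus position: {focus}")

def projects(utt, focus):
    """True if all evoked alternatives share utt's specific presupposition."""
    target = utt.presupposition()
    return all(alt.presupposition() == target
               for alt in focus_alternatives(utt, focus))

utterance = Quit(agent="John", activity="smoking", negated=True)
for position in ("polarity", "agent", "activity"):
    print(position, projects(utterance, position))
```

On this simplification, stress on "didn't" leaves the presupposition that John used to smoke intact across the alternatives, whereas stress on "John" or "smoking" evokes alternatives that do not share it.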

We now turn to a phenomenon that is known to be sensitive to QUD but that has not been typically analyzed alongside implicature or presupposition: pronoun interpretation.

## Coreference

Assigning reference to pronouns gives rise to ambiguity in cases such as (15), where more than one suitable potential referent is present in the preceding context.

(15) Mary scolded Sue. She praised Bob.

An extensive literature posits a number of factors that comprehenders bring to bear on the process of pronoun interpretation. Some factors are taken to reflect surface structure—e.g., a preference for antecedents in subject position or a preference for grammatical role parallelism (Sheldon, 1974; Smyth, 1994; Stevenson et al., 1994). Other factors reflect deeper properties of the utterance such as the lexical semantics of the verb or its thematic role assignments (Caramazza et al., 1977; Stevenson et al., 1994; Arnold, 2001). An alternative approach (Hobbs, 1979; Kehler, 2002; Kehler et al., 2008) argues that such preferences emerge as a by-product of reasoning about the most likely interpretation of an utterance in relation to adjacent utterances. These intersentential relationships can be understood either as coherence relations or as QUDs which can influence pronoun interpretation (Rohde, 2008; Kehler and Rohde, under review).

In many discourse contexts, all of these approaches make the same prediction regarding a pronoun's preferred interpretation. However, an example like (15) reveals key differences and allows us to highlight the role of the inferred QUD. While parallelism and subjecthood preferences both favor the interpretation of "she" in (15) as referring to Mary, the status of the verb "scold" as a member of the class of so-called NP2-biased Implicit Causality (IC) verbs is posited to yield a preference for Sue, the referent filling the patient thematic role and appearing in object position (Garvey and Caramazza, 1974; Brown and Fish, 1983; Au, 1986; McKoon et al., 1993; Koornneef and van Berkum, 2006). This difference is unsurprising if the preferred interpretation of the pronoun is understood to depend on the coherence relation that is inferred to hold between the two sentences (Kehler et al., 2008).

If the second sentence in (15) serves as an explanation of the first, then the combination of the lexical semantics of "scold" and the causal coherence relation yields a preference to interpret "she" as the causally implicated referent of a scolding event, namely the scoldee, Sue (i.e., Mary scolded Sue because she<sub>Sue</sub> praised Bob). In this case, we are led to assume some set of circumstances under which, from Mary's point of view, the action of praising Bob is worthy of reproach. This is taken to be a more plausible state of affairs than a reading in which Mary praising Bob (she<sub>Mary</sub> praised Bob) stands as an explanation for Mary scolding Sue (although we might be able to imagine contexts in which this is conceivable). If instead the second sentence is interpreted to be relevant to the first via a discourse relation centered on parallelism, then what is important is the similarity of the entities and actions in the two sentences, e.g., Mary as the Agent of the scolding event (and the subject of the first sentence) can be mapped to Mary as the Agent of the praising event (and the subject of the second sentence), with Sue and Bob as the respective Patients. The fact that "scold" and "praise" are both members of the class of agent-patient IC verbs while differing in affect supports the inference of a contrast relation (i.e., Mary scolded Sue, but she<sub>Mary</sub> praised Bob).

The different interpretations of (15) seem to suggest the existence of different QUDs. Under the parallel interpretation, both sentences can naturally be construed as partial answers to a single QUD "What did Mary do?" Under the causal interpretation of (15), the sentences, respectively, answer two distinct QUDs to the effect of "What did Mary do?" and "Why did she do that?"

On this analysis, we would again predict that the interpretative preference for the pronoun would be influenced by the presence of focal stress in the second sentence. Suppose that stress is placed on the word "Bob" in (15). For the same reasons discussed earlier, this suggests that the QUD in effect at the second sentence is "Who did X praise?," where X denotes the referent of "she," i.e., Mary or Sue. If X = "Mary," then the question that the second sentence partially answers is "Who did Mary praise?," which is a subquestion of "What did Mary do?," which in turn is the QUD most likely to be operable for the first sentence. By contrast, if X = "Sue," the second sentence partially answers "Who did Sue praise?," which is not a subquestion of "What did Mary do?" Moreover, it is not transparently a subquestion of "Why did Mary scold Sue?," although it could be interpreted as such under some additional assumptions. It is not, after all, a likely state of affairs that Mary scolding Sue was caused by Sue praising anyone (though the fact that one may attempt to formulate such a scenario is a testament to the bias in favor of causal coherence relations following IC verbs). A similar argument applies if focal stress is placed on "praised": again, if "she" refers to Mary, the second sentence is a partial answer to the first sentence's likely QUD, whereas if "she" refers to Sue it is not.
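The subquestion reasoning above can be made concrete with a toy model. The sketch below is only an illustration under assumptions that are not part of the original analysis: questions are treated as partitions of a small set of possible worlds, and one question counts as a subquestion of another when every complete answer to the larger question also settles the smaller one. The world encoding and the names `partition` and `is_subquestion` are hypothetical.

```python
from itertools import product

# Toy worlds (an assumption for illustration): each world fixes three facts,
# (Mary scolded Sue, Mary praised Bob, Sue praised Bob).
WORLDS = list(product([True, False], repeat=3))

def partition(key):
    """Model a question as a partition: worlds with the same answer share a cell."""
    cells = {}
    for world in WORLDS:
        cells.setdefault(key(world), set()).add(world)
    return [frozenset(cell) for cell in cells.values()]

def is_subquestion(sub, sup):
    """True if every complete answer to `sup` fits inside one cell of `sub`."""
    return all(any(cell <= sub_cell for sub_cell in sub) for cell in sup)

what_did_mary_do = partition(lambda w: (w[0], w[1]))
who_did_mary_praise = partition(lambda w: w[1])   # X = Mary
who_did_sue_praise = partition(lambda w: w[2])    # X = Sue

print(is_subquestion(who_did_mary_praise, what_did_mary_do))  # True
print(is_subquestion(who_did_sue_praise, what_did_mary_do))   # False
```

On this toy encoding, only the reading with X = Mary yields a question that is settled by every complete answer to "What did Mary do?", which mirrors the asymmetry described in the text.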

In summary, then, on QUD grounds, we would expect the placement of stress on "praised" or "Bob" in the second sentence of (15) to promote the parallel interpretation, in which the pronoun refers to the subject, Mary, over a causal interpretation, in which the pronoun refers to the object, Sue. A similar theoretical case is made by Kehler (2005) for differences in the interpretation of an ambiguous pronoun depending on the coherence relation that is inferred to hold between two adjacent clauses, although in Kehler's example [see (16)], subject coreference is favored by the causal coherence relation and object coreference by parallelism.

(16) Powell defied Cheney, and Bush punished him.

Kehler argues that the parallel interpretation is associated with accent placement on each word of the second clause, whereas the causal interpretation leaves the final word unaccented. Focus marking is thus predicted to influence the inferred relation or question under discussion. Experiment 3 uses IC contexts to test the prediction that accent placement can guide listeners' inferred relation, which in turn has repercussions for coreference. We replicate the widely reported NP2 bias (for pronoun coreference with the object of NP2-biased verbs) and present the first experimental evidence of this novel effect, showing that IC biases are reduced when there is focus placement on the predicate of the subsequent clause.

## Interim Summary

We have argued in the preceding subsections that the same form of manipulation—introducing focal stress on a particular constituent—should have pragmatic consequences of an apparently diverse nature across a range of structures. In the case of scalar implicatures, we argue that focusing a weak scalar term should increase the availability of the implicature, although the effect of this may vary between scales. In the case of presupposition, we argue that focusing any of various arguments of a presupposition trigger may result in the presupposition being less likely to project from under the scope of negation. And in the case of pronominal coreference, we argue that focusing the predicate of a subject-pronominal sentence is likely to promote a parallel interpretation of the pronoun over alternative causal readings. All of these consequences flow naturally from a view in which focus presupposes a set of alternatives, as argued by Roberts (1996/2012). The following sections present a short series of experimental studies designed to test these predictions.

Before we proceed, it is worth asking why these three phenomena have not previously been linked together via QUD. This may reflect a difference in emphasis over the fields' histories and their treatment of literal and inferred meaning. On the one hand, work on implicature has assumed that the literal message is easy to identify and that complexity emerges in the subsequent calculation of what is meant beyond that literal meaning. Likewise, in the case of presupposition and presupposition accommodation, the emphasis has been placed on identifying what additional meaning is at stake given the words used to convey a particular literal message. On the other hand, coreference models typically target the ambiguity in the literal message, specifically concerning which individual in the available set of entities in the preceding context is most likely to be referenced here. This in turn depends on inferences about the operative coherence relation. Only recently have these three areas been analyzed in terms of QUD. Pronouns historically were modeled primarily in terms of entity salience (see Ariel, 1990; Gundel et al., 1993; Arnold, 2001) and more rarely in terms of QUDs or coherence relations (Winograd, 1972; Hobbs, 1979; Kehler, 2002). Work in implicature and presuppositions has only recently focused on the importance of QUD (Breheny et al., 2006; Beaver and Clark, 2008; Zondervan et al., 2009; Cummins, 2014; de Marneffe and Tonhauser, 2015; Simons et al., to appear). Our studies represent the inevitable convergence of these separate research strands.

## EXPERIMENT 1: SCALAR IMPLICATURE

This experiment uses a rating task to test the hypothesis that the availability of scalar implicatures is sensitive to QUD, as evoked via focus placement. Participants listen to sentences containing weak scalars in two conditions (neutral vs. focus) and then answer a question about the status of a stronger statement. The design is a within-participants and within-items manipulation.

## Participants

Seventy-seven English-speaking participants were recruited from Amazon Mechanical Turk, location restricted to the United States. After eliminating data from 12 bilinguals and 5 participants who failed to complete the task, data from 60 monolingual participants remained for the main analysis. Participants were paid between \$1.80 and \$2.50.

For this and the subsequent experiments, each participant was provided in advance with information about the procedure and gave informed consent. The experiments were conducted in accordance with the University of Edinburgh's ethics policy and the UKRIO Code of Practice for Research, and under the oversight of the departmental Ethics committee.

## Materials

Target stimuli consisted of 20 recorded sentences, each containing a weak scalar in sentence-final position, as in (17), interleaved with 20 sentences for Experiment 2. The full stimuli set is listed in Appendix A.<sup>2</sup>

(17) The view from the hotel window is pretty.

The target sentences were recorded in two conditions: neutral intonation and focus placement on the scalar. The stimuli were recorded by a native speaker of English (the first author of this paper). Note that any variability in the recordings of these two conditions would serve only to reduce our ability to observe a difference between conditions.

The experiment consisted of 40 items: The 20 target items for Experiment 1 were intermixed with 20 items for Experiment 2, which were likewise one-sentence items with variable intonation.

## Procedure

Participants accessed the experiment via a website linked within Mechanical Turk. Each participant listened to all 20 sentences, half in the neutral intonation condition and half in the focus placement condition. Across participants, each sentence appeared in both conditions. Participants were asked to listen to the sentence and answer a question about the speaker's intended meaning on a scale of 1 to 7. The text showing the question was visible on the screen during and after playback of the recorded sentence. Participants could replay the sentence as many times as they wished. Each item appeared on a page by itself, with a radio-button interface for participants to record their rating.

<sup>2</sup>The recordings of the target stimuli for all three experiments are available here: http://dx.doi.org/10.7488/ds/315.

For the Experiment 1 target items, the question asked about a relevant stronger scalemate: For example, the question for the recording of (17) was (18), with answer "1" labeled as "unlikely" and "7" labeled as "likely."

(18) How likely is it that the view is not gorgeous?

The task took roughly 20 min.
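The counterbalancing described in this procedure (each participant hears half of the items in each condition, and each item appears in both conditions across participants) can be sketched as follows. This is a minimal illustration, assuming two presentation lists with alternating assignment; the actual list construction used in the experiment is not reported, and the function name `assign_conditions` is hypothetical.

```python
def assign_conditions(n_items, list_id):
    """Return the condition of each item on one of two counterbalanced lists.

    Alternating assignment is an assumption made for illustration only.
    """
    conditions = ("neutral", "focus")
    return [conditions[(item + list_id) % 2] for item in range(n_items)]

list_a = assign_conditions(20, 0)
list_b = assign_conditions(20, 1)

# Each list contains 10 items per condition, and every item
# switches condition between the two lists.
print(list_a.count("focus"), list_b.count("focus"))  # 10 10
print(all(a != b for a, b in zip(list_a, list_b)))   # True
```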

## Results

We modeled the ratings using a mixed-effect linear regression with a fixed effect of condition. All models reported in this paper contain random participant-specific and item-specific intercepts and slopes where permitted by the data (Barr et al., 2013). As predicted, participants endorsed the stronger statement ("not gorgeous") more in the focus condition (mean = 5.05) than the neutral condition (mean = 4.74), showing a main effect of condition (β = 0.26, t = 2.707). We conducted a likelihood-ratio test between mixed-effects models differing only in the presence or absence of the fixed main effect of condition. The model comparison showed a main effect of condition (p < 0.05, 1 d.f.). **Figure 1** shows the difference between ratings in the focus placement and neutral conditions, broken down by item.<sup>3</sup>
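For readers unfamiliar with the model comparison, the arithmetic of a likelihood-ratio test with 1 d.f. can be sketched as below. This is not the analysis code used for the experiment: the log-likelihood values are hypothetical placeholders, and the function name is an assumption. The test statistic is twice the difference in log-likelihood between the nested models, and for one degree of freedom the chi-square tail probability reduces to a complementary error function.

```python
import math

def likelihood_ratio_test_1df(loglik_null, loglik_full):
    """Return (statistic, p_value) for nested models differing by one parameter."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_null))
    # Upper tail of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods, NOT values from the experiment.
stat, p = likelihood_ratio_test_1df(loglik_null=-2051.3, loglik_full=-2048.1)
print(round(stat, 2), p < 0.05)  # 6.4 True
```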

## EXPERIMENT 2: PRESUPPOSITION

This experiment uses a rating task to test the hypothesis that the projection of a presupposition under negation is sensitive to QUD, again evoked via focus placement. Participants listen to sentences containing presupposition triggers in two conditions (neutral vs. focus) and then answer a question about the status of the presupposition.

## Participants

Because the Experiment 1 and Experiment 2 stimuli were interleaved in a single task, the same participants from Experiment 1 also completed this experiment.

## Materials

Target stimuli consisted of 20 recorded sentences, each containing a presupposition trigger, as in (19), in either a neutral or focus condition. The focus condition placed a pitch accent on the last word of the sentence. This word was either part of an embedded clause under a factive trigger (e.g., be sorry that the jewels were in the SAFE) or was otherwise within the scope of the trigger by being mentioned as part of an argument of a trigger verb (e.g., return to a job at CHRYSLER) or as an adjunct (e.g., finish a degree at HARVARD).

<sup>3</sup>One of our reviewers points out that the direction of the scale may not be obvious for the items "delayed" ("not on time" vs. "not canceled") and "smoldering" ("not alight" vs. "not out"). The participants in the experiment were asked how likely it was that the "train was not canceled" and "the fire was not out," respectively. Removing those two items does not affect the overall analysis: The main effect of intonation is still significant. Note that "smoldering," arguably the item whose pragmatically strengthened meaning is harder to identify, is shown in **Figure 1** to be the worst performing item, so it is possible that asking the question differently ("how likely was it that the fire was not alight?") might have yielded different responses.

(19) Bill doesn't regret arguing with his boss.

## Procedure

The procedure is described in Experiment 1 above. For the Experiment 2 target items, the question asked directly about the presupposition: For example, the question for the recording of (19) was (20), with answer "1" labeled as "unlikely" and "7" labeled as "likely."

(20) How likely is it that Bill argued with his boss?

## Results

As predicted, participants gave lower ratings to the presupposed statement ("Bill argued with his boss") in the focus condition (mean = 5.97) than the neutral condition (mean = 6.15). As in Experiment 1, we modeled the ratings using a mixed-effect linear regression with a fixed effect of condition. The effect of condition (β = 0.20, t = 2.30) was significant under model comparison (p < 0.05, 1 d.f.). **Figure 2** shows the difference between ratings in the focus placement and neutral conditions, broken down by item.

## EXPERIMENT 3: COREFERENCE

This experiment uses a pronoun interpretation task to test the hypothesis that coreference is sensitive to QUD, again evoked via focus placement. Participants listen to two-sentence discourses containing an ambiguous pronoun. The second sentence varies between a neutral condition and a focus condition.

## Participants

Seventy-five English-speaking participants were recruited from Amazon Mechanical Turk, location restricted to the United States. Data was eliminated from seven bilinguals and three participants who either did not complete or did not understand the task. Participants were paid between \$1.25 and \$2.00.

## Materials

Target stimuli consisted of 16 recorded passages, as in (21). The first sentence mentioned two referents in a situation described with an NP2-biased IC verb. The two referents were of the same gender, counterbalanced between male and female names. The second sentence started with an ambiguous pronoun followed by a continuation that was intended to be plausible under either interpretation of the pronoun.

(21) Charles congratulated Simon. He had criticized Stephanie.

The passage varied between a neutral condition and a focus condition. The focus condition was uttered with the intention of conveying that the two sentences both provided answers to a question about what the first referent had done. For this manipulation to work, the preferred pronoun interpretation with neutral intonation must be to the non-subject. That is precisely why the class of NP2-biased IC verbs provides an ideal test case.

The target stimuli were interleaved with 16 fillers that were produced with either neutral or focus-marked intonation.

## Procedure

Participants were asked to listen to a sentence and answer a question about the speaker's intended meaning in the provided text box. As in Experiments 1 and 2, the text showing the question was visible on the screen during and after playback of the recorded sentence, and participants could replay the sentence as many times as they wished.

For the Experiment 3 target items, the question asked who did the action described in the second sentence: For example, the question for the recording of (21) was (22).

(22) Who criticized Stephanie?

The task took roughly 15 min.

## Results

Responses to target items were coded as SUBJECT [e.g., an answer of "Charles" to (22)], OBJECT ("Simon"), or UNKNOWN (e.g., "the teacher"). The responses to filler trials were also coded and used to determine which participants to exclude from analysis. Odd responses were taken to indicate that a participant might not have been paying sufficient attention or might not have been able to hear the audio sufficiently well. For example, a participant presented with a filler like "Paul leaned across the table toward Stacie. He then asked her to marry him." who answered the question of "Who proposed?" with "Stacie" had that answer coded as an outlier (misinterpreting "he" as "she"). Likewise, a participant presented with a filler like "Vicki is attracted to Dennis. He is repulsed by her." who answered the question "Who is repulsed?" with "Becky" had that answer coded as an outlier (mishearing Vicki as Becky). After eliminating the 14 participants with 2 or more outlier answers on filler trials, data from 50 participants remained for the analysis. Responses categorized as UNKNOWN (0.5% of target trials) were also removed.
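The coding and exclusion steps just described can be sketched as follows. This is an illustrative reconstruction rather than the procedure actually used, which may have been carried out by hand: the exact-match rule, the function names, and the participant identifiers are all assumptions.

```python
from collections import Counter

def code_response(answer, subject, obj):
    """Code a free-text answer relative to the two referents of the first sentence."""
    answer = answer.strip().lower()
    if answer == subject.lower():
        return "SUBJECT"
    if answer == obj.lower():
        return "OBJECT"
    return "UNKNOWN"

def excluded_participants(filler_outliers, threshold=2):
    """Return participants with `threshold` or more outlier answers on fillers.

    `filler_outliers` lists one participant ID per outlier answer.
    """
    counts = Counter(filler_outliers)
    return {pid for pid, n in counts.items() if n >= threshold}

# Item (21): "Charles congratulated Simon. He had criticized Stephanie."
print(code_response("Charles", "Charles", "Simon"))      # SUBJECT
print(code_response(" simon ", "Charles", "Simon"))      # OBJECT
print(code_response("the teacher", "Charles", "Simon"))  # UNKNOWN

# Hypothetical participant IDs: only p07 reaches the two-outlier threshold.
print(excluded_participants(["p07", "p12", "p07"]))      # {'p07'}
```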

In keeping with previous studies on NP2-biased IC verbs, participants favored the object as the referent of the pronoun (62% object coreference overall). As predicted, however, pronouns were interpreted to refer to the subject more often in the focus condition (mean = 41%) than the neutral condition (mean = 35%). A mixed-effect logistic regression showed a main effect of condition (β = −0.47, p < 0.05, based on the Wald Z statistic; Agresti, 2002). **Figure 3** shows the difference between ratings in the focus placement and neutral conditions, broken down by item.

## DISCUSSION

The results of our experiments broadly support our hypothesis that focus-driven pragmatic effects would be observable in all three domains of interest. In the case of scalar implicature, we see a general tendency for focus marking of the weak scalar to promote interpretations involving the implicature. In the case of presupposition, focus marking within the presupposed material—under the scope of negation—tends to promote interpretations in which the presupposition fails to project to the discourse level. In the case of subject pronoun disambiguation, focus marking on the sentential object tends to promote parallel interpretations of the pronoun.

As explored earlier, all these patterns are explicable in terms of QUD effects. This relies crucially upon the assumption that the intonation employed as an indicator of focus structure in the materials used is actually used by hearers as an indication of which QUD is currently in play. In principle, this appears to be a reasonable assumption: Most and Saltz (1979) documented experimentally that hearers were able to infer the questions to which differently-intoned sentences were answers. It is important to reiterate that our materials were not constructed in such a way as to control their prosodic properties: the sentences were merely read by a native speaker who was trying to convey an intended meaning as opposed to trying to realize a specific contour. Consequently, we are not licensed to draw precise conclusions about the relationships between prosody, focus, and QUD. We can, however, conclude that a purely intonational manipulation that targets a particular constituent can have pragmatic effects on the hearers which are predictable under a QUD-based account.

The observed effects are in keeping with existing work showing that focus factors matter to interpretation by activating alternatives. Such effects have been demonstrated for the inclusive/exclusive interpretation of "or" (Chevallier et al., 2008, 2010) and for the exhaustivity inferences of "only" and the additive presupposition of "also" (Gotzner and Spalek, 2014; see also Tomlinson and Bott, 2013). With respect to scalar implicatures, our work complements ongoing research on the role of prosody in such contexts: results reported by de Marneffe and Tonhauser (2015) demonstrate that a specific prosodic contour can increase the availability of scalar implicatures compared to a neutral intonation contour, although the effect that they document is evident when the discourse context is also provided. As de Marneffe and Tonhauser note, this suggests that the prosodic influence on implicature is a more complex matter than simply whether or not the weak scalar receives a pitch accent, as this was the case for all the conditions in their experiment. This in turn suggests an important role for research on scalar implicatures using auditory stimuli, as the kind of intonation contour inferred by a participant reading written stimuli cannot always be determined with confidence.

Regarding the mechanisms by which hearers generate pragmatic enrichments, there are several possibilities that are compatible with the kinds of pragmatic enrichments that we observe. For the much-discussed case of scalar implicature, possible strategies include interpreting the weak scalar as semantically (or typically—see Geurts and van Tiel, 2013) excluding the possibility that the strong scalar holds, or inserting a tacit exhaustivity operator over the scalar when parsing the sentence (Chierchia et al., 2012). Compared to these options, the QUD-based account appears computationally more laborious, and is more in keeping with the traditional Gricean approach to quantity implicatures. However, explaining the effect of focus within these other approaches is perhaps not so straightforward: we would have to construe it as inducing either a particular interpretative preference or a particular parsing preference at the weak scalar term itself. Consequently, it seems plausible to treat the patterns observed in this experiment as supportive of the QUD-based model.

There are similarly several different routes by which a given presupposition can project to the discourse level, as discussed earlier in this paper. On one account, the hearer adds the presupposition to her discourse model immediately upon encountering the trigger, even if it occurs under the scope of negation; but this proposal runs into difficulty in cases of local accommodation (i.e., where the presupposition turns out not to be intended by the speaker). Another possibility is that the hearer considers whether the QUD that they infer on the basis of the utterance carries the presupposition, and if so, this enables them to project the presupposition from under the scope of negation. Still another possibility arises for a particular class of utterances with presupposition triggers, as in the case of (23), contrasted here with (24).

(23) Mary doesn't regret that the Tories won the election.

(24) Mary doesn't regret arguing with her boss.

If "Mary" is stressed in (24), a possible interpretation is that it is someone other than Mary who regrets arguing with their boss. Under this interpretation, (24) does not convey that Mary argued with her boss: indeed, it does not convey that anyone argued with Mary's boss, although it does convey that someone argued with his or her own boss (which may or may not be the same individual as Mary's boss). By contrast, applying the same reasoning to (23), the utterance may convey that someone other than Mary regrets that the Tories won the election, and this in turn requires it to be the case that the Tories won the election. In effect, it appears possible that the utterance with focal stress triggers some kind of ad hoc implicature which in turn introduces a presupposition into the hearer's discourse model.<sup>4</sup> In the examples tested in this paper, this possibility does not arise, but the prediction would be that focal stress on "Mary" should make no difference to the projection of the presupposition in (23).

## CONCLUSION

The three experiments reported here test how a manipulation of focus placement can influence three phenomena that involve pragmatic enrichment, all of which are sensitive to the QUD evoked by the context. Unlike previous work that has explicitly manipulated the previous context, here it is the focus placement itself that informs the listener about the possible QUD to which the current sentence may be an answer. The repercussions of this QUD manipulation can be seen in three different types of pragmatic enrichment: scalar inference, the projection of presupposition from under negation, and the identification of a referent for an ambiguous pronoun. In each case, focus placement signals what QUD is likely and that QUD in turn determines the relevance of a particular proposition for the interpretation of the target sentence. For scalar implicature, highlighting the relevance of an unstated alternative that is informationally stronger is found to heighten the availability of the implicature. For presupposition, highlighting the relevance of an alternative which does not itself carry the presupposition reduces projection. For coreference, highlighting the relevance of a particular alternative favors the inference of a parallel coherence relation between two adjacent sentences, thereby disfavoring coreference with the referent picked out by causal reasoning. Together, these experiments show that a single manipulation can influence a varied set of phenomena. Our findings suggest that the study of context can and should move beyond ad hoc explanations for specific readings and toward the identification of cues that alter context in systematic ways. Of course not all context-driven effects depend on focus placement, but the results reported here offer a first step toward a necessary inventory of targeted contextual manipulations that guide listeners' interpretations.

## ACKNOWLEDGMENTS

The authors would also like to acknowledge the feedback from participants at XPrag.de's "Formal and experimental pragmatics: methodological issues of a nascent liaison" and the University of Chicago's "What's the Question? Investigating formal and pragmatic constraints on the Question Under Discussion," as well as helpful comments from our two reviewers. The open access publication of this work was generously supported by the College of Humanities and Social Sciences at the University of Edinburgh.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2015.01779


<sup>4</sup>This inference requires the speaker to be knowledgeable; the assumption would be that such a speaker, knowing that no-one (of interest) argued with their boss [in (24)] or that the Tories didn't win the election [in (23)], would say so.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Cummins and Rohde. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Selecting Presuppositions in Conditional Clauses. Results from a Psycholinguistic Experiment

Filippo Domaneschi, Elena Carrea, Carlo Penco\* and Alberto Greco

University of Genoa, Genoa, Italy

In this paper, we propose an experiment concerning presupposition selection in conditional sentences containing a presupposition trigger in the consequent. Many theories claim that sentences like if p, qq' (where q is the presupposition of the assertive component q') have unconditional presuppositions, namely, they simply project q. Other theories suggest that these kinds of conditional sentences project conditional presuppositions of the form if p, q. Data collected suggest two results: (i) in accordance with other experiments (by Romoli), dependence between the presupposition q and the antecedent p favors the selection of a conditional presupposition if p, q; (ii) presupposition selection in conditional sentences with a trigger in the consequent is affected by speakers' cognitive load: speakers under high cognitive load are less disposed to select a conditional presupposition. We conclude by arguing that cognitive load represents a key factor for the analysis of linguistic and philosophical theories of context.

### Edited by:

Alessio Plebe, University of Messina, Italy

### Reviewed by:

Claudia Bianchi, Università Vita-Salute San Raffaele, Italy

Diana Mazzarella, Centre National de la Recherche Scientifique, France

\*Correspondence:

Carlo Penco penco@unige.it

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 25 June 2015 Accepted: 18 December 2015 Published: 12 January 2016

### Citation:

Domaneschi F, Carrea E, Penco C and Greco A (2016) Selecting Presuppositions in Conditional Clauses. Results from a Psycholinguistic Experiment. Front. Psychol. 6:2026. doi: 10.3389/fpsyg.2015.02026

Keywords: presuppositions, conditional clauses, update semantics, context change potential, cognitive load

## PRESUPPOSITIONS, CONTEXT SET, AND COMPOSITIONALITY

According to the standard semantic framework, the common ground is the set of propositions that participants in a conversation mutually assume to be taken for granted (Stalnaker, 2002). In this view, the common ground determines the context set, that is, the set of possible worlds in which all the propositions that form the common ground are true (Heim, 1983, 1992).

According to this standpoint, the meaning of a sentence is modeled via its context change potential (CCP): an instruction to update the context with new information with the effect of producing a new updated context as a result. For instance, if the context c corresponds to the set of possible worlds in which "I have a sister," "Konstanz is in Europe," and "Today is Monday" and, in this context, a speaker utters the sentence "I've bought a new car" (φ), then the assertion of this sentence in the context c (i.e., c+φ) produces, as a result, a context c' that corresponds to the set of possible worlds in which "I have a sister," "Konstanz is in Europe," "Today is Monday," and "I've bought a new car."

From this perspective, presuppositions put requirements on the context: if ψ is a presupposition of φ, then c + φ is defined only if c ⊆ ψ. For example, the sentence "My car is red" can only be uttered in contexts that entail the presupposition "I have a (unique) car." A sentence's CCP, therefore, is the extent to which the sentence changes the context in which it is uttered to produce a new context, assuming that the new context accepts as true not only the sentence itself but also the presupposition of the uttered sentence. In general, CCP may be defined as a partial function from contexts to contexts: a sentence φ can only be uttered in a given class of contexts and brings about a new class of contexts as a result<sup>1</sup>.
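The idea of a CCP as a partial function can be illustrated with a small executable model. The sketch below is a toy under stated assumptions, not a formalization taken from the cited literature: worlds are sets of atomic facts, a context is a set of worlds, and the atom names and the `update` function are hypothetical. Undefinedness is represented by returning `None`.

```python
from itertools import product

# Toy universe (an assumption for illustration): a world is the set of
# atomic facts that hold in it.
ATOMS = ("has_car", "car_is_red", "has_sister")
WORLDS = frozenset(
    frozenset(atom for atom, holds in zip(ATOMS, values) if holds)
    for values in product([True, False], repeat=len(ATOMS))
)

def prop(atom):
    """The proposition that `atom` holds: the set of worlds where it is true."""
    return frozenset(world for world in WORLDS if atom in world)

def update(context, content, presupposition=WORLDS):
    """c + phi, defined only if c is a subset of the presupposition psi."""
    if not context <= presupposition:
        return None  # undefined: the context does not entail psi
    return context & content

c = prop("has_sister")  # a context that entails only "I have a sister"

# "My car is red" presupposes "I have a car", so the update is undefined in c.
print(update(c, prop("car_is_red"), prop("has_car")))  # None

# Once the context entails the presupposition, the update goes through.
c2 = update(c, prop("has_car"))
c3 = update(c2, prop("car_is_red"), prop("has_car"))
print(len(c), len(c2), len(c3))  # 4 2 1
```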

In order to provide an explanation of how the context changes in the course of a conversation, different dynamic semantic theories have proposed formal representations of language structure aimed at modeling the growth of information in the processing and development of a discourse. Overall, this aims to provide a solution to the traditional problem of the compositionality of meaning, that is, an explanation of how the meaning of compound sentences depends systematically on the meaning of their constituents and on the logical operators in use (e.g., negation ¬φ, conjunctions φ ∧ ψ, disjunctions φ ∨ ψ, and conditionals φ → ψ).

In this respect, for many years, linguists and philosophers have been interested in the so-called "presupposition projection problem" (Heim, 1983, 1992; Geurts, 1999; Beaver, 2001; Schlenker, 2008, 2009; Singh, 2008; Kripke, 2009), that is, the problem of the compositionality of presuppositions, how complex sentences inherit their parts' presuppositions. This paper deals in particular with one of the most-discussed topics in this field of research: the Proviso Problem, the problem of the projection properties of conditional sentences with a presupposition trigger in the consequent.

## THE PROVISO PROBLEM

The Problem concerns the projection properties of a specific case of composed clauses, conditional sentences that contain a presupposition trigger<sup>2</sup> in the consequent (CpC); schematically, if p, qq' (where q is a presupposition triggered by the assertive component q'). This core problem, the Proviso Problem (Geurts, 1996), has been widely discussed in recent literature (see for instance, Beaver, 2001; Singh, 2008; von Fintel, 2008; Schlenker, 2010; Chemla and Schlenker, 2011). The discussion has generated two different kinds of answers.

On the one hand, several theories—mainly taking Discourse Representation Theory (DRT) as a framework—claim that sentences of the type if p, qq' have mainly unconditional presuppositions, namely, they simply project q (e.g., Gazdar, 1979; van der Sandt, 1992; Geurts, 1999). It is, in fact, intuitive that, in several cases, the presupposition projected by a CpC is unconditional; for instance, it is the case in the following utterance (quoted in Geurts, 1999).

(1) If John hates sonnets then his wife does so, too.

### (1a). John has a wife

(1) projects the unconditional presupposition (1a). These theories do not exclude the possibility of deriving a conditional entailment of the form if p, q, but they claim that the unconditional presupposition is the default reading, since it results from the universal preference for global over local accommodation. This is because, while the unconditional reading is derived as a presupposition, the conditional reading is inferred as an entailment. In other words, a sentence of the form if p, qq' can be represented in at least two ways in terms of discourse structure.


In the latter view, the local resolution of the presupposition is supposed to be possible only in contexts where it is supported by a "bridging inference" of the form if p then it's usual that q based on world knowledge (Geurts, 1999; Piwek and Krahmer, 2000). For example, the local resolution that leads to the conditional entailment (2a) in the case of the sentence (2) is allowed by the bridging inference, "If Mark is a Professor, then it's usual that he has students."

(2) If Mark is a Professor, then his students love him.

(2a). If Mark is a Professor, then he has students.

On the other hand, competing theories, traditionally known as "satisfaction theories," whose subscribers are often also supporters of dynamic semantics<sup>4</sup>, predict that CpC always project conditional presuppositions of the form if p, q and derive the unconditional presupposition in different ways depending on the version of the theory (e.g., Heim, 1983; Beaver, 2001; Singh, 2007; van Rooij, 2007; Chemla, 2009). A seminal idea proposed by Heim (1983) has been developed within the framework of update semantics: when a context c does not satisfy or does not admit an assertion of if p, then qq', the repair of the context is driven by the instruction c[if p, then q][if p, then qq']. For example, informally, to update the context c with the information conveyed by (2), it is first necessary to update the context set with the information (2a).
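The repair instruction c[if p, then q][if p, then qq'] can be illustrated with a toy model. The following is a minimal sketch under strong simplifying assumptions, not Heim's formal system: a context is taken to be a set of possible worlds, an update is taken to be the elimination of worlds, and conditionals are treated as material. The atom names `p`, `q`, `q_prime` and all function names are hypothetical.

```python
from itertools import product

ATOMS = ("p", "q", "q_prime")  # hypothetical atoms: antecedent, presupposition, assertive component


def all_worlds():
    """Every truth-value assignment to the atoms, each stored as a hashable tuple of pairs."""
    return {tuple(zip(ATOMS, values)) for values in product([True, False], repeat=len(ATOMS))}


def update(context, proposition):
    """c[phi]: keep only the worlds of the context in which the proposition holds."""
    return {w for w in context if proposition(dict(w))}


def conditional_presupposition(w):
    """'if p, then q', read as a material conditional (an assumption of this sketch)."""
    return (not w["p"]) or w["q"]


def assertion(w):
    """'if p, then qq'', read as: if p, then both q and q' hold."""
    return (not w["p"]) or (w["q"] and w["q_prime"])


def admits(context):
    """The context admits 'if p, then qq'' when it already entails 'if p, then q'."""
    return all(conditional_presupposition(dict(w)) for w in context)


c = all_worlds()                                  # an uninformed context
repaired = update(c, conditional_presupposition)  # c[if p, then q]
final = update(repaired, assertion)               # c[if p, then q][if p, then qq']
```

In this toy setting the initial context does not admit the sentence, the repaired context does, and the final update removes only the worlds in which p holds but q' fails.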

Let us now consider the following examples (quoted in Pérez-Carballo, 2009).

(3a). If Paul is not tired, then he has a Bible.

(3b). Paul has a Bible.

(4a). If Paul is a devout Catholic, then he has a Bible.

(4b). Paul has a Bible.

<sup>1</sup>To be precise, a sentence is only ever uttered in a particular context, but the same sentence can be correctly used in all contexts in which its presuppositions are true (we are grateful to a referee of Frontiers for making this distinction).

<sup>2</sup>Presupposition triggers are lexical items and syntactic constructions that, if used in an utterance, activate a presupposition. In contemporary debate, there are two major approaches to the problem of triggering presuppositions. In semantic approaches, it is claimed that presuppositions are a particular type of meaning determined by the lexicon (Chierchia and McConnell-Ginet, 2000; Simons, 2001). Other scholars (Karttunen, 1974; Simons, 2001; Abusch, 2010; Schlenker, 2010) have supported a pragmatic view, according to which presuppositions are the result of speakers' inferences, as well as conversational implicatures (Abusch, 2002, 2010). On more specific relations between presuppositions and scalar implicatures see Pistoia Reda (2014).

<sup>3</sup>See van der Sandt (1992) for the seminal idea of the resolution of presuppositions in DRT.

<sup>4</sup>Recently, Schlenker (2008, 2009) has proposed a more static approach that makes the same predictions as dynamic semantics with regard to the Proviso Problem. Schlenker (2008), for instance, proposes an account of presupposition projection within a classic semantic framework enriched with two pragmatic principles grounded on the Gricean maxim of manner: "Be articulated" and "Be brief."

As pointed out by Pérez-Carballo (2009), intuitively, (3) seems to project the unconditional presupposition (3b), while (4) seems to project the conditional presupposition (4a). A possible explanation for this difference is that, since "the only difference between the two examples is the antecedent clause, the antecedent clause must play an important role in the present phenomenon" (Romoli et al., 2011; p. 593). In particular, the dependence between the antecedent and the presupposition of the consequent seems to play a crucial role in the Proviso Problem<sup>5</sup>. This dependence seems specifically to affect the selection of conditional and unconditional presuppositions, which is traditionally identified by Singh (2007, 2008) and Schlenker (2011, p. 2) as the "Selection Problem." This problem needs to be distinguished from the "Strengthening Problem," that is, the question of which mechanisms generate these presuppositions.

In what follows, we focus on the Selection Problem, with a view to grasping whether and when conditional and unconditional presuppositions are selected depending on the relation between the antecedent and the consequent of CpC, specifically, depending on the bridging relation between the presupposition of the consequent and the antecedent of the conditional. In the last decade, the presupposition projection problem has been the subject of several experimental studies but, to our knowledge, no work has been directly aimed at evaluating the relationship between presupposition projection and working memory. Our central goal, besides the confirmation or disconfirmation of previous experimental results, is to study the cognitive load factor in relation to the presupposition selection in CpC. The importance of this aspect in the experimental investigations of ordinary language is due to the widely accepted idea that the greater the extent to which people are cognitively loaded, the greater their difficulty in processing certain information. Work on the relationship between cognitive load and conditional reasoning or processing conditional sentences has already produced interesting results, such as Toms et al. (1993), Markovits et al. (2002), Meiser et al. (2001), Capon et al. (2003). Our experiment uses this basic idea, generating different levels of cognitive load to assess whether this affects the subject's understanding or grasping of a conditional or unconditional presupposition in CpC. We might say, therefore, that the general question at stake here concerns the compositionality of presuppositions: what factors affect the selection of either a composed or a simple presupposition?

## AN EXPERIMENTAL STUDY

The aim of this experimental study is to test three hypotheses about presupposition selection in CpC. The first two hypotheses have already been investigated by Romoli et al. (2011), although, here, we propose a different experimental design, which is also required to test the third hypothesis.


Our experiment was designed to measure the frequency of selection of conditional and unconditional presuppositions. The preponderance of either conditional or unconditional presuppositions, however, does not by itself decide between the two approaches, DRT vs. satisfaction theories. In fact, both approaches predict that conditional and unconditional presuppositions can arise, and neither directly predicts the frequency of each kind of presupposition. For one approach, the default reading is the conditional presupposition; for the other, the unconditional. The main purpose of this paper, therefore, is to take a first step toward a better understanding of the main factors that affect the frequency of conditional vs. unconditional presuppositions.

In the experiment, participants were required to perform two tasks simultaneously. The main task consisted of listening to a short recording containing sentences of the type if p, qq' and then choosing, from a list of four alternatives, the one sentence that best fitted the recording. The second task, included in Trials 1 and 3 of the experiment, was to remember two geometrical figures during the first part of the main task (listening to the recordings). Trials 1 and 3 included the Interference condition, while Trials 2 and 4 included the Simple condition, without interference in the main task.

## Pre-experiment

Two kinds of target items (sentences of the form if p, qq') were needed for the experiment: Dependent items, in which the presupposition q in the consequent was strongly related to the antecedent content p, and Independent items, in which there was no dependence between p and q. In order to select appropriate items, a pencil-and-paper questionnaire was created, similar to the one used by Romoli et al. (2011).

The participants in the pre-experiment were 23 students (15 women, 8 men) from the University of Genoa. They were recruited for course credit. Their ages ranged between 21 and 32 (M = 23.95; SD = 3.27). All participants were native Italian speakers. Informed consent was obtained.

In the questionnaire, sentences, each followed by a question, were presented to participants; for instance, "Lucy has a dog. Does that make it more likely that she has a leash?" The task was to give an assessment on a 5-point Likert scale, from 1 (much less likely) to 5 (much more likely). The questionnaire included 39 items. The five items—four tests, plus one instruction trial—with the highest score were chosen as target Dependent items, while the five items with the scores closest to the neutral 2.5 point were chosen as target Independent items.
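The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_items`, its parameters, and the toy ratings are hypothetical, and the neutral point is passed as a parameter whose default is the value reported in the text.

```python
from statistics import mean


def select_items(ratings, n=5, neutral=2.5):
    """Pick n Dependent items (highest mean rating) and n Independent items
    (mean rating closest to the neutral point) from Likert ratings.

    ratings: dict mapping an item label to the list of scores it received.
    """
    means = {item: mean(scores) for item, scores in ratings.items()}
    # Dependent targets: the n items with the highest mean score.
    dependent = sorted(means, key=lambda item: means[item], reverse=True)[:n]
    # Independent targets: among the remaining items, those closest to neutral.
    remaining = [item for item in means if item not in dependent]
    independent = sorted(remaining, key=lambda item: abs(means[item] - neutral))[:n]
    return dependent, independent


# Invented toy ratings, for illustration only (not the study's data).
toy_ratings = {
    "dog-leash": [5, 5, 4],
    "writer-book": [5, 4, 4],
    "tall-jokes": [3, 2, 3],
    "tired-bible": [2, 3, 2],
    "filler": [1, 1, 2],
}
dependent_items, independent_items = select_items(toy_ratings, n=2)
```

With these invented ratings, the two highest-rated pairs end up in the Dependent list and the two pairs nearest the neutral point in the Independent list.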

<sup>5</sup>The idea that this kind of probabilistic reasoning is relevant to the Proviso Problem is discussed in Beaver (2001), Lassiter (2012), Schlenker (2011), and von Fintel (2008).

## Participants in the Main Experiment

Participants in the main experiment were 30 students (14 women, 16 men) from the University of Genoa. None had previously taken part in the pre-experiment. They were recruited for course credit. Their ages ranged between 20 and 31 (M = 25.8; SD = 2.94). All participants were native speakers of Italian. Informed consent was obtained.

## Stimuli

We created 5 recordings concerning fictional crimes<sup>6</sup>. The sentences of each recording were read by different female and male voices. We used a whodunit subject in order to encourage participants to be more attentive to details<sup>7</sup>. The sentences that constituted the stories were in fact seemingly unrelated, and participants had to collect and interpret them as clues, as if they were detectives. Each recording comprised between 51 and 66 words (an average of 58). Three conditional sentences, with balanced order, were included in each recording: (i) a Dependent target conditional sentence, (ii) an Independent target conditional sentence, (iii) a distractor conditional sentence. Dependent and Independent target sentences were selected on the basis of the results obtained in the pre-experiment. All the target conditional sentences activated a presupposition in the consequent via the presence of a definite description. For instance, Recording 1 ran as follows.

The thief came into the house during the night. Luke's father is the owner of the house. If Luke is a writer, then his book is sold at the bookshop [Dependent target]. Mud stains were found on the carpet in the living room. If the thief came into the house passing through the garden, then he should have left footprints [distractor]. If Luke is tall, then he will tell one of his jokes to the cops [Independent target].

Two sets of four sentences were connected to each recording, a Dependent set and an Independent one. In each set, there were included:


For example, the Dependent set of sentences related to Recording 1, printed above, was:


The Independent set included the following four sentences:


Sixteen polygons were created (**Figure 1**) by combining four shapes—triangle, square, hexagon, circle—with four colors—red, green, yellow, blue. These figures were used to load participants' working memory during the execution of the first part of the main task, namely, listening to recordings.
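The stimulus set is the full crossing of the four shapes with the four colors. A minimal sketch of that combination, with hypothetical constant names:

```python
from itertools import product

SHAPES = ("triangle", "square", "hexagon", "circle")
COLORS = ("red", "green", "yellow", "blue")

# Each figure is one (color, shape) pair; crossing 4 shapes with 4 colors gives 16 figures.
FIGURES = [(color, shape) for shape, color in product(SHAPES, COLORS)]
```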

## Procedures

The study was conducted in a laboratory setting. Instructions, stimuli, response recording, and data collection were controlled by a laptop computer running E-Prime® 1.1. Participants sat approximately 50 cm from the display, in a separate room. The lighting in the room was normal. Only a keyboard (no mouse) was available for responses.

The experiment included four trials for each participant. Only Trials 1 and 3 included the second task about geometrical figures. Trials 1 and 3 represented therefore the Interference condition, while Trials 2 and 4 represented the Simple condition, without interference.

The Interference condition trials consisted of the following phases (**Figure 2A**).


FIGURE 1 | The sixteen polygons used as stimuli to load participants' working memory.

<sup>6</sup>The experiment was run in Italian. The original items have been included in the Appendix in Supplementary Material.

<sup>7</sup>The texts of the recordings and some statements presented incongruous information mainly because the independent conditionals (see below) expressed a link between disconnected, independent contents. In order to make these incongruities plausible to the participants, we required them to act as if they were detectives, namely, by considering the information presented in the recordings as disconnected and incoherent clues provided by witnesses. The post-experimental interview revealed no particular difficulties, on the part of the participants, with these incongruities.


In the Simple condition, the trials (**Figure 2B**) included only Phases 2, 4, and 5.

By way of instruction, the task was explained to participants by using a sample trial. The trials' order did not change during the experiment, while the presentation order of the Dependent and Independent sets and the presentation order of the four sentences within each set were randomized for every participant.

The figures used to load participants' working memory were chosen randomly but kept fixed for each trial (e.g., Recording 1 was always presented with a green triangle and a red hexagon). This was in order to show participants equally difficult combinations of figures.
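The design constraints described in this section (fixed trial order, Interference in Trials 1 and 3 only, figures fixed per trial, randomized order of the sets and of the four sentences within each set) can be summarized in a short sketch. It is not the E-Prime script used in the study: all names are hypothetical, the figure pair for Trial 3 is invented, and the pairing of Recording 1 with Trial 1 is an assumption.

```python
import random

INTERFERENCE_TRIALS = {1, 3}
FIXED_FIGURES = {
    # Pair taken from the example in the text, assuming Recording 1 is used in Trial 1.
    1: (("green", "triangle"), ("red", "hexagon")),
    # Hypothetical pair, for illustration only.
    3: (("blue", "square"), ("yellow", "circle")),
}
ANSWER_TYPES = ("C", "Fc", "Fu", "U")  # labels used in the legend of Table 1


def build_schedule(participant_seed):
    """Return the four trials for one participant as a list of dicts."""
    rng = random.Random(participant_seed)
    schedule = []
    for trial in (1, 2, 3, 4):  # the order of the trials never changes
        set_order = ["Dependent", "Independent"]
        rng.shuffle(set_order)  # set order is randomized per participant
        answer_order = {}
        for set_name in set_order:
            order = list(ANSWER_TYPES)
            rng.shuffle(order)  # order of the four sentences is randomized
            answer_order[set_name] = order
        schedule.append({
            "trial": trial,
            "condition": "Interference" if trial in INTERFERENCE_TRIALS else "Simple",
            "figures": FIXED_FIGURES.get(trial),  # None in the Simple condition
            "set_order": set_order,
            "answer_order": answer_order,
        })
    return schedule
```

Calling `build_schedule` with two different seeds yields the same trial order, conditions, and figures, while the order of sets and sentences may differ.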

## Expectations

Our expectations were as follows.


a conditional presupposition is likely to be more cognitively demanding than doing the same for an unconditional presupposition. Hence our expectation was that participants would more frequently select an unconditional presupposition [U] instead of a conditional one [C] in an interference condition, where they have limited resources available for processing the conditional presupposition.

## Results

The data from two participants were excluded from the analysis because of an interruption in task performance. For the second task, memorizing geometrical figures, the mean proportion of correct answers was 0.88 (SD = 0.32). Every participant gave at least 50% correct answers, and thus none of them was excluded from the analysis on this basis.
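As a rough plausibility check, the reported spread is what one would obtain if the standard deviation were computed over individual right/wrong responses; this is an assumption, since the text does not say how the SD was calculated. The sketch below also encodes the 50% inclusion criterion; both function names are hypothetical.

```python
from math import sqrt


def binary_sd(proportion_correct):
    """Population SD of a 0/1 variable whose mean is proportion_correct."""
    return sqrt(proportion_correct * (1.0 - proportion_correct))


def meets_criterion(n_correct, n_items, threshold=0.5):
    """True when a participant reaches the minimum proportion of correct answers."""
    return n_correct / n_items >= threshold


sd_check = binary_sd(0.88)  # about 0.32, matching the reported SD
```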

The general results are reported in **Table 1** and graphically summarized in **Figure 3**. Considering Expectation 1 above, we analyzed the percentage of conditional presupposition selection [C] with respect to the percentage of unconditional presupposition selection [U] in both conditions. The results were:


The data seem to be in line with Expectation 1, and those produced under the Simple condition were consistent with Romoli et al.'s results.

Considering Expectation 2, the data collected seem to show that:

TABLE 1 | The general results of the experiment under the two conditions (Interference, Simple) and the two sets of answers (Dependent, Independent) reported as total frequency of choice.


Results concern conditional presuppositions [C], conditional fillers [Fc], unconditional fillers [Fu], and unconditional presuppositions [U].


Finally, considering Expectation 3, data collected seem to show that:


## GENERAL DISCUSSION

The general goal of this experiment was to investigate the Selection Problem in presupposition projection in conditional sentences with a presupposition trigger in the consequent. In particular, our experiment was aimed at evaluating the dependence hypothesis considered by Romoli et al. (2011) and the role played by participants' cognitive load.

Data collected showed two results.

(1) Participants, in general, selected the conditional presuppositions more frequently than the unconditional presuppositions in processing CpC, as reported in **Result 1A**.

This first result is consistent with Conclusion (i), proposed by Romoli et al., according to which conditional presuppositions are more likely to be selected than unconditional presuppositions. This conclusion was confirmed by the data we have collected under the Simple experimental condition<sup>8</sup>. Therefore, **Result 1A** seems to support the central thesis of satisfaction theories that all CpCs project mainly conditional presuppositions of the form if p, q.

(2) Participants selected the conditional presuppositions more frequently when there was dependence between the antecedent of the CpC and the presupposition activated by the trigger in the consequent, as reported in **Result 2A**.

Reconsidering the Simple condition, **Result 2A** seems to be compatible both with satisfaction theories and with theories which predict that CpC mainly project unconditional presuppositions. In fact, the former theories claim that, in cases of CpC, conditional presuppositions are selected most of the time, hence the conditional presupposition If p, q is more likely to arise when the presupposition q in the consequent is dependent on the antecedent p. According to the latter theories, even if the unconditional presupposition is the preferred reading in cases of CpC, when there is dependence between the antecedent of a CpC and the presupposition triggered in the consequent, speakers are supposed to select a conditional presupposition.

<sup>8</sup>Since Romoli et al.'s design did not include a second task generating interference, our Simple condition was more suitable than our Interference condition for comparison with Romoli et al.'s conclusions.

**Result 2A**, therefore, coheres with the idea, proposed by Romoli et al. (2011), that the conditional presupposition If p, q is more likely to arise when the presupposition q in the consequent is dependent on the antecedent p. Moreover, since we used a within-subject design instead of the between-subject design adopted by Romoli et al., we did not analyze data collected from different participants assigned to two different conditions. Rather, we analyzed the effect of dependence, for the same participant, on the selection of the presupposition, where dependence was the only manipulated variable. Hence, this analysis allows us to support a stronger claim: the dependence between the antecedent and the presupposition in the consequent of a CpC has a relevant effect in the selection of the presupposition.

To sum up, **Results 1A** and **2A** support the idea that sentences of the form If p, qq' mainly project conditional presuppositions of the form If p, q, and even more so if there is dependence between p and q<sup>9</sup>.

The second purpose of our experiment was to explore the effect of participants' cognitive load. To this end, data collected seem to suggest that:

(3) the same participant, if highly cognitively loaded, selected conditional presuppositions less frequently than in the case of low cognitive load, as occurred under our Simple condition (see **Result 3A**).

Our statistical analysis seems to show that, in the Dependent set, where participants were supposed to project conditional presuppositions (as shown by **Result 2A**), the very same participant, if highly cognitively loaded, might project an unconditional presupposition instead of the conditional presupposition that she probably would have projected if she had had more cognitive resources available. Considering the percentages of conditional and unconditional presuppositions selected within the set of dependent targets, data collected suggest that, under the Interference condition, the percentage of conditional presuppositions projected decreases, while the percentage of unconditional presuppositions increases.

**Result 3A** allows us to claim that, to a certain extent, together with the dependence between the antecedent and the presupposition in the consequent, speakers' cognitive load is a relevant factor that affects the selection of the presupposition in CpC. One explanation for this result might be that highly cognitively loaded speakers are less disposed to select a conditional presupposition, since processing the mental representation corresponding to a composed sentence, and, in particular, to a conditional sentence, requires more cognitive effort than is the case for a simple (i.e., unconditional) sentence. Toms et al. (1993), for example, have argued that mistakes in conditional reasoning are related to working memory. In particular, conditional reasoning seems to require a surplus in working memory that, in turn, requires support from the central executive. In conditional representations, the higher the number of models required, the higher the cognitive effort involved (Barrouillet and Lecas, 1999; Johnson-Laird, 2001). Hence, the limited available resources under the Interference condition might have affected the selection by changing a conditional answer [C] into an unconditional answer [U]<sup>10</sup>.

<sup>9</sup>As said before, all the target sentences used in our experiment were divided into dependent and independent conditionals on the basis of the results of the norming pre-experiment. As pointed out by Romoli et al., further research might address the question of what notion of dependence participants used while answering the questionnaire.

Some final considerations concern the set of independent targets in our experiment. First of all, we have shown in **Result 2A** that independent conditional presuppositions were selected significantly less frequently than dependent conditional presuppositions under the Simple condition. Thus, the percentage of independent conditional presuppositions under the Simple condition was close to the percentage of unconditional presuppositions. This result might be explained by assuming that, in the Independent set, since there was no bridging inference connecting the content of the antecedent and the content of the consequent of the conditional presupposition of the CpC, participants treated independent conditional presuppositions as if they were independent unconditional presuppositions. In other words, in this case the conditional and unconditional presuppositions were evaluated as equally available by the participants, so that, in terms of percentage, they were selected equally often in the course of the experiment.

Secondly, a comparison of **Results 2A** and **2B** shows that the percentage of conditional presuppositions selected in the Dependent set decreases from the Simple condition to the Interference condition, while the percentage of conditional presuppositions selected in the independent set does not change significantly from the Simple condition to the Interference condition. These data seem to support the idea that, while the cognitive load factor affects the selection of the presuppositions in the Dependent set, it does not seem to have an effect on the selection in the Independent set. The data, therefore, suggest that the cognitive load factor affects the selection of the presupposition of a CpC only when there is a dependence between the antecedent of the conditional and the content of presupposition triggered in the consequent: if the dependence holds, and the speaker is highly cognitively loaded, then she seems to be less disposed to select a conditional presupposition. This effect of the cognitive load factor on the presupposition selection in a dependent CpC might be explained by the bridging inference that supports the dependence. The reason may be that, under the Interference condition, participants had few cognitive resources for performing the main task, given that part of their cognitive resources were used in the second task (i.e., memorizing geometrical figures). Since processing the dependent conditional implied computing the bridging inference (e.g., computing that "If Paul is a devout Catholic, then he will read his Bible tonight" implies computing the bridging inference "If someone is a devout Catholic, then he or she usually has a Bible"), the interference of the second task under the Interference condition affected the selection of the dependent conditional presuppositions. 
The reason seems to be that the remaining resources for performing the main task were not sufficient for computing both the content of the conditional presuppositions and the bridging inferences, with participants consequently selecting fewer dependent conditional presuppositions under the Interference condition than under the Simple condition. Conversely, processing independent conditionals does not require computing any bridging inference; hence, under the Interference condition, the remaining resources for performing the main task were sufficient for selecting the conditional presuppositions, meaning that participants selected independent conditional presuppositions as often under the Interference condition as under the Simple condition.

To conclude, data collected support the idea that two relevant factors affecting presupposition selection in the Proviso Problem are (i) dependence between the antecedent of a CpC and the presupposition triggered in the consequent and (ii) speakers' cognitive load.

While presuppositions have, for a long time, been rather unexplored as a topic in the field of experimental pragmatics, in recent years a new wave of studies (Schwarz, 2014; Domaneschi, 2015) has suggested that, although presuppositions are typically considered background meanings that are expected to be processed automatically, their actual processing seems to involve a large share of the cognitive resources available to language users, which affects the understanding of different kinds of presuppositions.

It was expected that different factors would affect the cognitive demand of processing a presupposition. This paper has attempted to show that compositionality (i.e., conditional vs. unconditional presuppositions) is one of these crucial factors.

## SOME FINAL REMARKS CONCERNING THE COGNITIVE LOAD FACTOR

Cognitive context, the set of presuppositions assumed to be taken for granted by the participants in a conversation, has been widely discussed since the debate on informative presupposition (Gauker, 1998, 2008; Stalnaker, 1998, 2002; von Fintel, 2008) and the distinction between passive and active (or local) context (Kripke, 2009; Schlenker, 2009, 2011). Against this background, the notion of cognitive context (and of contextual felicity) is still a working theoretical notion at the boundary between semantics and pragmatics and is useful for treating the pragmatic phenomenon of accommodation (von Fintel, 2008; Tonhauser et al., 2013). However, while the relevance of different cognitive loads to the processing of conditionals is a common topic in psychological discussion, linguistic and philosophical theories of presuppositions have usually bypassed the problem. Doing so runs the risk of treating the concept of cognitive context (perhaps including the distinction between passive and active context discussed by Kripke, 2009) without considering that which context is shared, that is, which presuppositions are activated, may depend on the kind of cognitive effort required in a conversation. Without taking into account the impact of the cognitive effort behind selecting certain linguistic content in a context (e.g., presuppositions), we might overlook or misunderstand some experimental results and, consequently, be unable to select the right theory among the competing ones.

<sup>10</sup>One relevant point concerns the reasons why the conditional presupposition is more costly than the unconditional one and whether **Result 3A** says something about satisfaction vs. DRT-like theories. However, the data collected do not seem strong enough to support predictions about the possible mechanisms underlying the processing of conditional and unconditional presuppositions that might be associated with different theories of presuppositions. Rather, the data collected with this experimental design simply allow us to recognize the cognitive cost of selecting presuppositions in CpC: selecting conditional presuppositions seems to be more cognitively demanding than selecting unconditional ones. Further experimental studies should be conducted to investigate the link between processing conditional and unconditional presuppositions in CpC and the mechanisms employed in the competing theories.

One of the problems with theories of communication based on classical linguistic and philosophical theories is that they sometimes depend on hypotheses concerning how hearers should react or what they should understand to have been discussed without taking into account different possible scenarios arising from hearers' different cognitive loads. In a previous work (Domaneschi et al., 2014), we have discussed the role of cognitive effort in detecting presuppositions, showing that some presuppositions (mainly Iteratives and Change of State Verbs, which deal with temporal features) are more difficult to process when a hearer is highly cognitively loaded. The present study and Domaneschi et al. (2014) offer some provisional methodological suggestions concerning the role of cognitive load in assessing the plausibility of linguistic and philosophical models: (i) the analysis of the cognitive load factor might reveal that, even if a certain semantic reading of a sentence appears to be the default, that is, the one determined by the meaning of the clause, language users can opt for a less probable reading that is nevertheless more compatible with their available cognitive resources. Missing this point may affect how the different hypotheses under examination are assessed. (ii) The cognitive context (the set of presuppositions) might change depending not only on the logic of the discourse structure but also on the speakers' cognitive state, namely, the level of cognitive load of participants in the conversation, which affects what kind of presuppositions are selected and, consequently, how the context changes in the course of a conversational exchange. We propose these considerations as hints for future research that might reveal further unexpected results.

## FUNDING

The research was partly funded by a Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR) research project 20107738C5\_008 coordinated by Michele Marsonet and by the Research Project PRA 2012 Presupposition and Cognitive processes coordinated by CP.

## ACKNOWLEDGMENTS

We would like to thank the referees for their valuable suggestions and discussion on specific points of the paper.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2015.02026


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Domaneschi, Carrea, Penco and Greco. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# 'But' Implicatures: A Study of the Effect of Working Memory and Argument Characteristics

Leen Janssens and Walter Schaeken\*

Laboratory for Experimental Psychology, Department of Brain and Cognition, KU Leuven, Leuven, Belgium

This study aimed to investigate the possible cognitive costs involved in processing the implicatures from but and the conclusion introducing words so and nevertheless. Adult participants were asked to indicate the conclusion that the person in the story would make, based on 'p but q' sentences constructed as indirect distancing contrasts. Additionally, while performing this task, participants' working memory was burdened with a secondary dot recall task in four conditions ranging from no working memory load to high load. The results showed that working memory load did not influence participants' performance on the implicature task. This finding might be interpreted to suggest that working memory is not involved in inferring the implicatures from but, so, and nevertheless. We also found that the content of the arguments played a very important role. Whenever a strong argument is combined with a weak argument, participants mostly base their conclusion on the strong argument and consequently ignore the conventional interpretation of but (and so and nevertheless). Additionally, we found an effect of axiological value, which is in line with the positive–negative asymmetry theory.

### Edited by:

Gabriella Airenti, University of Turin, Italy

### Reviewed by:

Paul Pierre Marty, Massachusetts Institute of Technology, USA

Tiffany Morisseau, Central European University, Hungary

\*Correspondence: Walter Schaeken walter.schaeken@ppw.kuleuven.be

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 14 July 2015 Accepted: 20 September 2016 Published: 08 November 2016

### Citation:

Janssens L and Schaeken W (2016) 'But' Implicatures: A Study of the Effect of Working Memory and Argument Characteristics. Front. Psychol. 7:1520. doi: 10.3389/fpsyg.2016.01520

Keywords: conventional implicatures, but, working memory, automaticity, context

## INTRODUCTION

When people communicate with each other, they tend to follow a cooperative principle to make their message easily understood by all interlocutors (Grice, 1989). This implies following some rules that Grice describes as maxims. This cooperative principle allows interlocutors to derive implicatures, i.e., inferences that consist of attributing to a speaker an implicit meaning that goes beyond the explicit linguistic meaning of an utterance. Consider the following example:

(1) Some students passed the exam.

The utterance in (1) will be interpreted as "Not all of the students passed the exam." If all of the students had passed the exam, (1) would still be logically true. However, the hearer can assume that the interpretation of 'some' as 'not all' holds because the speaker wants his utterance to be optimally understood by the hearer by being as informative as possible. The inference from (1) that not all the students passed the exam is an example of a conversational implicature.

There are, however, implicatures that are not derived from the cooperative principle and are therefore independent of its four maxims. They are called conventional implicatures. These implicatures are attached by convention to particular lexical items or linguistic constructions. Grice (1975) wrote the following about them:

"In some cases the conventional meaning of the words used will determine what is implicated, besides helping to determine what is said. If I say (smugly), He is an Englishman; he is, therefore, brave, I have certainly committed myself, by virtue of the meaning of my words, to its being the case that his being brave is a consequence of (follows from) his being an Englishman" (Grice, 1975, p. 44).

The use of the word therefore implies a consequence link between the two sentences. This link, however, does not contribute to the truth conditions of the sentence "he is an Englishman" nor of the sentence "he is brave." Indeed, if a sentence 'p therefore q' is true, it follows that 'p and q' is true, and therefore, that p is true and that q is true too. The contribution of therefore is in other words non-truth-conditional; it is not needed for the truth-conditional analysis. This idea is also expressed in the following definition by Horn (2004):

"Unlike an entailment or logical presupposition, this type of inference is irrelevant to the truth conditions of the proposition. This inference is not cancellable without contradiction, but it is detachable, in the sense that the same truth-conditional content is expressible in a way that removes (detaches) the inference. Such detachable, but non-cancellable aspects of meaning that are neither part of, nor calculable from, 'what is said' are conventional implicatures." (Horn, 2004, p. 4)

The implicatures stemming from the connector but are classically also described as conventional implicatures. This claim will be questioned in the current paper. The materials used in our experiment consist of 'p but q' sentences ('p maar q' in Dutch, the language in which the experiment is carried out) in which but operates as a distancing contrastive connector, more specifically as an indirect one. In a distancing contrast, but connects two parts of a complex speech act and the second part is dissociated from the first part, without explicitly denying what is being expressed in the first part (Van Belle and Devroy, 1992). The speaker endorses or recognizes that p is true (Van Belle, 2003). However, but prevents the inference that would normally be derived from p. This can happen in two ways. The first possibility is that q contains a conclusion that contradicts the inference from p. Consider the following example (Van Belle, 2003):

(2) The milk is sour, but I drink it.

On the basis of p, one expects that the speaker will not drink the milk. However, q directly contradicts this expectation. This is an example where but operates as a direct distancing contrastive connector, sometimes also called a 'concluding but' (Van Belle and Devroy, 1992).

The second possibility is the one that is investigated in this article. In this construction q consists of an argument that leads to an expectation that contradicts the expectation from p. For example:

(3) The milk is sour, but I am thirsty.

The inference from p is that the speaker in (3) will not drink the milk. The inference from q, however, is that the speaker will drink the milk. Anscombre and Ducrot (1977; see also Van Belle, 2003; Potts, 2015) claim that the second phrase in such an indirect distancing contrast has more weight. Consequently the conclusion follows that the speaker will drink the milk.

The conclusion from a 'p but q' sentence can be introduced by words like so (dus in Dutch) or nevertheless (toch in Dutch). So and nevertheless also demonstrate that words might have no effect truth-conditionally, but still carry information. The word so elicits the inference from q as the conclusion. In other words, from (3), it follows:

(4) So I will drink the milk.

One can say that so strengthens the inference from but, or, stated differently, it signals that the previous information/expectation explains the next fact. It is important to notice that so plays no role in the truth conditions of (4). In other words, (4) is true if and only if it is true that

(5) I will drink the milk.

This truth-conditional analysis does not mean that so has no purpose in the sentence. It signals that what follows is causally linked with the previous information. In contrast to the previous truth-conditional analysis, the word nevertheless cancels the inference from but and elicits the inference from p as the conclusion from (3):

(6) Nevertheless I will not drink the milk.

As with so, nevertheless does not play a role in the truth conditions of (6), although it signals something, i.e., that what will be presented is in contrast with previous information/expectation. Indeed, (6) is true if (7) is true, and false otherwise:

(7) I will not drink the milk.

Janssens and Schaeken (2013) investigated experimentally how people understand but, so, and nevertheless. They presented 63 adult participants with such 'p but q' sentences followed by either two so-conclusions or two nevertheless-conclusions. Participants were asked what the person in the story would conclude. Janssens and Schaeken (2013) aimed to find out whether people understand but, so, and nevertheless as predicted by the literature on conventional implicatures, or if other factors, like the content of the sentences, were driving the interpretation. If conventional implicatures were the driving force behind the interpretation, people would choose the inference from p when the conclusion with nevertheless is asked for and the inference from q when the so-conclusion is asked for. The p- and q-arguments were either both sensible (i.e., they both made sense) or a combination of a sensible and an irrelevant argument. According to the account of Anscombre and Ducrot (1977), implicatures stemming from but, so, and nevertheless should lead to a certain conclusion, irrespective of the (relevance of the) content of the arguments. The results showed that, although people seemed to follow the conventional implicatures, the content of the arguments also greatly influenced participants' answers. When a sensible argument was combined with an irrelevant argument, participants mostly based their conclusion on the sensible argument. Even when a combination of two sensible arguments was presented, performance was not perfect. A plausible interpretation of the imperfect performance is that the content of the arguments often prevails over the implicatures that could be drawn from the 'p but q' sentences. This interpretation is in line with the results of Experiment 2 in Janssens and Schaeken (2013), where participants had to justify their responses. When reasoners gave an unconventional conclusion (i.e., not in line with what is predicted on the basis of the literature about the conventional implicatures of but, so, and nevertheless), they tended to refer to the content of that argument. Another finding from Janssens and Schaeken (2013) was that nevertheless elicited more unconventional answers than so. They argued that this might be attributed to the fact that nevertheless does not actually evoke the inference from p as was predicted. There are, however, alternative explanations. It might be that nevertheless evokes the negation of the conclusion from q. This negation does not necessarily mean the inference from p. Indeed, after (3) one might for instance say "nevertheless I'm hesitating." Another plausible explanation is in terms of effort. In order to reach the conventional nevertheless-conclusion from p, the implicature from but (i.e., the inference from q) has to be overruled, which seems likely to be effortful.

There are no existing theories claiming that there are or should be specific processing costs involved in processing these specific but implicatures. However, according to Blakemore (1987) and Iten (2005), but encodes a specific procedure. In the context of Relevance Theory, Blakemore (2002) developed a procedural analysis of but. This analysis asserts that but "encodes a constraint that triggers an inferential route involving contradicting and eliminating an assumption that is manifest in the context" (in Hall, 2004, p. 220). Iten (2005) refined Blakemore's analysis of but and claimed "what follows (q) contradicts and eliminates an assumption that is accessible in the context." If we try to translate these in terms of processing costs, it seems fair to argue that the contradiction and elimination procedure incurs extra processing costs. In order to reach the conventional conclusion from a 'p but q' sentence as an indirect distancing contrast, one must infer the specific conclusion from p and the specific conclusion from q (which is opposite to the conclusion from p). Additionally, but implies that the second argument weighs more heavily so that the putative conclusion of p is eliminated and the final conclusion is inferred from q. As a consequence, four inferences should be made in order for this final conclusion to be reached. Stated more formally, the inference steps are the following (where p stands for the p-argument, q for the q-argument, and r for the conclusion you can derive from the respective arguments):

(1) p but q

(2) p → r, q → ¬r

(3) r ∧ ¬r [=Contradiction that follows from (2)]

(4) ¬r > r [=The bigger weight of the not-r conclusion, on the basis of the but-implicature]

(5) ¬r [=Solving the contradiction in (3) by eliminating the conjunct that has the smallest weight in (4)]

(6) ∴ So not-r is the case.

The inference steps 2 and 4 are an expression of the implicatures attached to but, inference steps 3 and 5 are inferences, which are needed in order to be able to complete the reasoning process. Moreover, when the but-sentence is followed by nevertheless, the processing costs might be even higher. After the contradiction and elimination of the conclusion following the p-argument, encountering nevertheless forces the listener to undo the elimination (or eliminate and contradict the conclusion from the q-argument):

(4) ¬r > r [=The bigger weight of the not-r conclusion, on the basis of the but-implicature]

(5) ¬r [=Solving the contradiction in (3) by eliminating the conjunct that has the smallest weight in (4)]

An alternative account might be that people reverse the inference in step (4) after being confronted with the contradiction in (3) and the word nevertheless. This means that step (5) also could be

(5) ¬(¬r > r)

The subsequent reasoning steps would be:

(6) ¬r ≤ r

(7) ∴ Nevertheless r is the case.
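The conventional reading described by these inference steps can be summarized in a small sketch. This is an illustration only, not a model proposed in the study: the function name and the boolean encoding (True when an argument points toward carrying out the action under discussion) are assumptions made here for clarity.

```python
from typing import Literal

Marker = Literal["so", "nevertheless"]


def conventional_conclusion(p_supports: bool, q_supports: bool, marker: Marker) -> bool:
    """Return the conclusion predicted by the conventional reading of 'p but q'.

    p_supports / q_supports are True when the argument points toward the
    action (hypothetical encoding chosen for this sketch).
    """
    if p_supports == q_supports:
        # An indirect distancing contrast requires arguments with opposed conclusions.
        raise ValueError("p and q must point toward opposite conclusions")
    if marker == "so":
        # 'but' gives the q-argument more weight, so the conclusion follows q.
        return q_supports
    # 'nevertheless' overrules the weighting from 'but', so the conclusion follows p.
    return p_supports


# "The milk is sour, but I am thirsty."
# p (sour) points away from drinking, q (thirsty) points toward drinking.
print(conventional_conclusion(False, True, "so"))            # True: "So I will drink the milk."
print(conventional_conclusion(False, True, "nevertheless"))  # False: "Nevertheless I will not drink the milk."
```

The sketch captures only the predicted outcome. It says nothing about the number of intermediate steps or their cost, which is the question the experiment addresses.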

Thus, reasoning in line with the 'contradiction and elimination' view of Blakemore (1987) and Iten (2005), together with the general finding from Janssens and Schaeken (2013) that drawing these implicatures doesn't happen flawlessly, induces the possibility that processing but is cognitively effortful and therefore requires specific cognitive capacity. In rule-based accounts of reasoning (see e.g., Braine and O'Brien, 1991; Rips, 1994), the number of reasoning steps influences the difficulty directly, because of their working memory load. In this perspective, the extra reasoning step needed for nevertheless, which might involve a double negation, a well-known difficult reasoning operation (see e.g., Schroyens et al., 2001), should tap working memory resources even more, therefore making nevertheless more difficult than so. Also the alternative account of the reasoning steps for nevertheless, where one reverses the already made but-implicature, clearly demands extra resources.

Moreover, a closer look at the inference steps 2 and 4, which are an expression of the implicatures attached to but as a distancing contrastive connector, reveals that they have certain properties of conversational implicatures. One specific feature that characterizes conversational implicatures but not conventional implicatures is that they are cancellable. In the next paragraph we will explain that the well-known conventional implicature but is not immune to cancelation, which is surprising from a theoretical point of view.

First, there are sentences in which but connects two parts and the use of but creates a contrast between the two parts. For example:

(8) She is blonde, but she is intelligent.

The use of but in (8) elicits, in Grice's terms (1975), the implicature that being blonde contrasts with being intelligent (at least in the speaker's view) although this contrast is not explicitly expressed. This contrast is an indefeasible inference from but. The use of the word but implies a contrastive link between the two parts of the sentence. This link, however, does not contribute to the truth conditions of the sentence "she is blonde, but she is intelligent." The previous sentence and "she is blonde and intelligent" are both true if p is true ("she is blonde") and at the same time q is true ("she is intelligent"). The contrast in (8) is due to the fact that but comes with an implicature that and lacks. Since it is part of the conventions of English that but is used this way, Grice calls it a conventional implicature (Geurts, 2010, p. 8). Second, but can also be used in sentences in which the inferences from the p- and q-argument already contrast each other. Example (3) (about sour milk but being thirsty) is an example of such a but-sentence. The implicature from but indicates that the second part of the argumentation (q) attains more weight (Anscombre and Ducrot, 1977). The use of but in sentence (8) seems indeed to be in line with a classical conventional implicature, i.e., a non-cancellable implicature. However, this is not true for the use of but in sentence (3), which is the type that will be discussed in this paper. For example, by using nevertheless the implicature from but is canceled. Nevertheless denies the inference from but and guides the hearer or reader toward the inference from p. In other words, languages even have a discourse marker to signal the cancelation, namely nevertheless. As a consequence, the implicatures related to but may not be purely conventional, because they have certain features of conversational implicatures. Indeed, the differential weighting of the p- and q-argument seems not to be conventional because it is cancellable. 
This similarity with conversational implicatures is an important reason why the possibility arises that processing implicatures from but may require similar cognitive processes and capacities as conversational implicatures.

A substantial part of the experimental research on scalar implicatures has focused on the cognitive processes underlying these inferences. There is, however, no consensus in the literature with respect to the possible cognitive processing costs associated with deriving scalar implicatures. Indirect evidence suggesting that deriving scalar implicatures is cognitively effortful can be found in developmental research. Noveck (2001), among others, found that children are more logical than adults with terms such as might and some. Because children's cognitive capacities aren't fully developed yet, this was considered as indirect evidence that working memory capacities are involved in deriving scalar implicatures. Likewise, Bott and Noveck (2004, Experiment 4) observed that the number of pragmatic answers dropped when participants were forced to answer more quickly, indicating that pragmatic inferences incur processing costs. De Neys and Schaeken (2007) presented even more direct evidence. They burdened adult participants' working memory capacity by providing them with a secondary task during performance of the scalar implicature task. When working memory was burdened, pragmatic inferences dropped by 10%. Marty et al. (2013) replicated this working memory load effect associated with computing the scalar implicature from some (see also Noveck and Posada, 2003; Huang and Snedeker, 2009; Dieussaert et al., 2011).

In contrast, other literature doesn't seem to find any processing costs for scalar implicatures. For example, Marty et al. (2013) found an opposite working memory effect on numerical implicatures. Also, Feeney et al. (2004) found that working memory capacity was associated with providing the logical interpretation on infelicitous some statements. They argued that working memory is involved in inhibiting the pragmatic interpretation in favor of the logical one. Other evidence suggesting that there is no role for working memory was provided by Grodner et al. (2010). They showed in a visual-world study that there was no delay associated with the pragmatic inference from some compared to other, non-scalar expressions. Hence, the implicature generation takes place as soon as some is encountered, before the full sentence is processed. Similarly, Heyman and Schaeken (2015) observed in a latent class analysis that working memory capacity did not explain the interindividual variability in the interpretation of infelicitous some statements. In sum, findings concerning the possible cognitive processing costs associated with deriving scalar implicatures are not consistent. This mixed evidence and the possibility of cancelation of the indirect distancing contrastive but (which gives but a characteristic of a conversational implicature) make it worth looking into the processing costs underlying these specific implicatures from but.

Janssens et al. (2015) replicated Janssens and Schaeken (2013), but their participants were children aged 8–12. Additionally, working memory capacity was measured by means of the Listening Span task (Daneman and Carpenter, 1980). Their results were similar to the adult results in Janssens and Schaeken (2013), but children's competence with but seemed worse than adults' competence (although a direct comparison between adults and children was not made). Because children's working memory capacity isn't yet fully developed, this could indicate that working memory is involved in processing implicatures from but. However, no effect of working memory on children's performance was found. This finding, in turn, suggests that working memory would not be involved in processing implicatures from but as an indirect distancing connector.

In sum, there are some good reasons to investigate whether working memory is involved in processing but. The primary aim of the present study is thus to examine if working memory is involved in processing the implicatures from but, so, and nevertheless in those cases where but is used as an indirect distancing contrastive connector. We will not measure working memory capacity, but we will use the same paradigm as De Neys and Schaeken (2007) in scalar implicature research. We will look at the effect of working memory load on but-implicature competence by imposing a secondary task on participants that burdens working memory capacity. If the implicature requires specific effortful processing, deriving the implicature should be harder when cognitive resources are burdened. We want to emphasize that pragmatic theorists and previous experimental studies have not characterized the exact nature of the alleged effortful processing. The present study focuses on the role of executive working memory resources. These resources are widely recognized as the essential component of human cognitive capacity (see e.g., Engle et al., 1999).

Apart from the main question of our study (the effect of working memory load), we aim to answer three extra questions.

First, we want to investigate if the relevance of the arguments can overrule the expectations that accompany but, so, and nevertheless. Previous but-research showed a strong effect of content with adults (Janssens and Schaeken, 2013) and children (Janssens et al., 2015). Content and context have a profound effect on many pragmatic phenomena (see e.g., Bambini et al., 2016, for an effect of context on metaphors). Content and context effects are also observed for the closely related conversational implicature from some (see e.g., Breheny et al., 2006; Bonnefon et al., 2009, 2011; Heyman et al., 2012).

Actually, if one wants clear answers about the content-effect, Janssens and Schaeken (2013) and Janssens et al. (2015) used a methodology that was not fully satisfactory. They presented sentences with sensible arguments, and also sentences with irrelevant arguments. For instance, in a story where someone doubts whether or not he will eat chocolate, the person thinks: "Chocolate is very tasty, but I have blond hair." It is clear that the second argument is in principle irrelevant with respect to the question of whether the person will eat chocolate. For sentences with an irrelevant argument, participants chose the conclusion stemming from the sensible argument regardless of the direction suggested by but, so, or nevertheless. For the example above, the majority of participants chose the conclusion "so he will eat chocolate," although the combination of but and so should have led participants to infer the negation of the conclusion expected from the p-argument. Janssens and Schaeken (2013) interpreted the high number of these answers as a strong sign of the importance of the content. However, these 'awkward' sentences might have confused the participants. The complete irrelevance of one of the arguments might have canceled the differential weighting of the arguments and might have led participants to focus exclusively on the sensible argument. Therefore, in the present study a more ecologically valid measure is used to study the effect of the content. Participants are now presented with weak and strong arguments instead of, respectively, irrelevant and sensible arguments as in Janssens and Schaeken (2013). By manipulating the strength of the arguments we hope to investigate the effect of content in a more natural way. On the basis of the previous experiments, we expect strong arguments to overrule the direction suggested by but, so, or nevertheless.

Second, we want to exclude a simple alternative explanation for the claim of Anscombre and Ducrot (1977) that the q-argument has more weight. It simply might be that the last argument in a sequence always gets more weight. This alternative explanation was overlooked in previous research. To rule it out, the effect of the instruction word but will be assessed by comparing performance on sentences including but with sentences in which the arguments are simply juxtaposed. If order is important, also in simply juxtaposed sentences, the last argument should have more weight. However, given previous findings and theorizing about but, we predict that the q-argument only gets more weight in combination with but.

Third, we want to verify the impact of the axiological value of the arguments. Anscombre and Ducrot (1977) used the term 'axiological value' to describe the argumentative orientation of an argument, which is determined by a positive or negative value that can be ascribed to its content. Arguments whose axiological value is oriented toward a positive conclusion are labeled 'positive arguments' and their counterparts 'negative arguments.' For example, suppose a person hesitates to buy a necklace. She says: "I really like the necklace, but it is very expensive." In this example, the p-argument (liking the necklace) is the positive argument because it is oriented toward the positive conclusion (she will buy the necklace). The q-argument (very expensive) is the negative argument because it is oriented toward the negative conclusion (she will not buy the necklace). Janssens and Schaeken (2013; see also Janssens et al., 2014) did not find an effect of the axiological value. There were no systematic differences between the items with negative or positive arguments in their study. Therefore, we do not expect an effect of this variable. Nevertheless, in light of the importance of replication in science (Open Science Collaboration, 2015), we treat axiological value as a possible confounding variable and we add it as an extra variable in the design.

## MATERIALS AND METHODS

## Participants

A total of 210 undergraduate students from the University of Leuven (Belgium) with a mean age of 19.2 participated in our experiment. They were all native Dutch speakers and received course credit in exchange for participation.

## Implicature Task

Every participant was presented with 16 short context stories, adapted from Janssens and Schaeken (2013). These stories, programmed in E-Prime 1.1, were presented on a computer and were followed half the time by two 'p but q' constructions and half the time by two 'p. q' constructions. For example (translated from Dutch):

Mom and Ella are shopping. Ella sees a lovely teddy bear lying on the shelves. She asks Mom if she can have the teddy bear. Mom is not sure.

Mom thinks: "Ella has been bad, but she lost her teddy bear." or

Mom thinks: "Ella has been bad. She lost her teddy bear."

After each argumentative construction (either 'p but q' or 'p. q'), the participants were told to indicate the conclusion that the person in the story would make, based on the construction of his/her utterance. They were explicitly told not to take into account the decision they themselves would make. For the example above, the so-conclusions they had to choose between are "so Ella can have the teddy bear" and "so Ella cannot have the teddy bear." Next, the participants were presented with a different pair of arguments of the same type (e.g., Mom thinks: "Ella already has a lot of teddy bears, but she's been very good lately"). Now, they had to judge two conclusions from the second conclusion type (e.g., "nevertheless Ella can have the teddy bear" and "nevertheless Ella cannot have the teddy bear"). Both the 16 stories and the so- and nevertheless-conclusions were presented in a random order. In the Supplementary Data Sheet 1, the materials of a concrete trial are provided. In contrast to Janssens and Schaeken (2013), we did not use irrelevant<sup>1</sup> arguments but we did make a distinction between weak and strong sensible arguments. In the example above, both the p- and the q-argument are strong sensible arguments. In the same context, an example of two weak sensible arguments is Mom thinks: "I'm in a hurry, but it is a lovely teddy bear."

In order to choose plausible and good arguments for our constructions, we performed two pilot studies. In a first pilot study, 16 participants were instructed to read stories in which a person was always confronted with a 'dilemma' (e.g., a girl received some chocolates and has to decide whether or not to eat chocolate). We asked the participants to give both an argument why a person should do something (e.g., 'being hungry' is an argument to eat chocolate) and an argument for why she shouldn't (e.g., 'being allergic to chocolate' is an argument not to eat it). In a second pilot study we asked 16 different participants to rate the arguments that were generated in the first pilot study on a scale from 1 (very weak argument) to 7 (very strong argument). Based on these two pilot studies we created our experimental set. For both the constructions separated by a 'period' and the but constructions, there were four possible combinations of arguments: strong–strong, strong–weak, weak–strong, and weak–weak. Moreover, we also took into account the axiological value of the arguments. The argumentative orientation can be positive or negative. A negative argument (e.g., "Ella has been bad") is oriented toward a negative conclusion (she cannot have the teddy bear), whereas a positive argument (e.g., "Ella lost her teddy bear") is oriented toward a positive conclusion (she can have the teddy bear). This led to a 2 × 2 × 2 × 4 design (2 connectors: but or 'period' × 2 conclusion types: so or nevertheless × 2 axiological value combinations: negative-positive or positive–negative × 4 argument combinations: weak–weak, weak–strong, strong–weak, and strong–strong).
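As an illustration of how the factors combine, the design cells can be enumerated with a short sketch. The variable names and string labels are chosen here for readability and are not taken from the authors' materials.

```python
from itertools import product

# Factor levels as described in the text; the labels are illustrative.
CONNECTORS = ("but", "period")
CONCLUSION_TYPES = ("so", "nevertheless")
AXIOLOGICAL_ORDERS = ("negative-positive", "positive-negative")
ARGUMENT_STRENGTHS = ("weak-weak", "weak-strong", "strong-weak", "strong-strong")


def design_cells():
    """Return every combination of the four within-subject factors."""
    return list(
        product(CONNECTORS, CONCLUSION_TYPES, AXIOLOGICAL_ORDERS, ARGUMENT_STRENGTHS)
    )


cells = design_cells()
print(len(cells))  # 32
```

The 32 cells correspond to the 16 stories with two conclusion questions each, so a participant can answer one item per cell.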

## Working Memory Load Task

We manipulated working memory load in order to determine whether the number of conventional responses would be lower when working memory is burdened. For our working memory manipulation, we used a secondary task based on the Double Task Paradigm used in De Neys and Schaeken (2007). We created four load conditions: three in which participants were presented with a matrix with three, four, or six dots, and a no load control condition. A matrix was displayed for 850 ms before each of the 16 stories and participants had to remember the position of the dots in order to reproduce them in an empty matrix. After the matrix, the short context story appeared on the screen. Participants could take as much time as they wanted to read the story. When they pressed the space bar, the story disappeared and the first (but- or 'period'-) sentence appeared, together with the first choice between two conclusions (two so- or two nevertheless-conclusions). These two conclusions were presented one below the other, each preceded by a number. When the participant had indicated his or her response (by typing the number), the second (but- or 'period'-) sentence appeared, together with the second choice between two conclusions (two nevertheless-conclusions if the two so-conclusions were presented for the first sentence, and two so-conclusions if the two nevertheless-conclusions were presented for the first sentence). After the participant indicated the response, the sentence and the conclusions disappeared. An empty matrix appeared and the participant had to reproduce the previously presented matrix. In the low load condition, participants were presented with a 3 × 3 matrix with three dots that were always horizontally or vertically positioned. The moderate load condition was similar, but the dot pattern was more complex to remember. In this condition, participants were presented with a 3 × 3 matrix with four randomly positioned dots. In the high load condition, there were six randomly positioned dots in a 4 × 4 matrix. Finally, as a control, there was a no load condition in which the participants were not presented with matrices but were simply asked to perform the implicature task.
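To make the three dot-matrix conditions concrete, the following is a minimal sketch of how such patterns could be generated and scored. It is not the software used in the experiment. The function names are hypothetical, and the low load pattern is assumed here to fill one complete row or column of the 3 × 3 matrix, which is one reading of "horizontally or vertically positioned."

```python
import random


def make_pattern(condition, rng):
    """Return (grid_size, set of (row, col) dot positions) for one trial.

    Assumption: in the low load condition the three dots fill one full
    row or one full column of the 3 x 3 grid.
    """
    if condition == "low":
        line = rng.randrange(3)
        if rng.random() < 0.5:
            cells = {(line, col) for col in range(3)}  # horizontal line
        else:
            cells = {(row, line) for row in range(3)}  # vertical line
        return 3, cells
    if condition == "moderate":
        size, n_dots = 3, 4  # four random dots in a 3 x 3 grid
    elif condition == "high":
        size, n_dots = 4, 6  # six random dots in a 4 x 4 grid
    else:
        raise ValueError("unknown load condition: " + condition)
    all_cells = [(row, col) for row in range(size) for col in range(size)]
    return size, set(rng.sample(all_cells, n_dots))


def count_correct(shown, reproduced):
    """Number of dots the participant placed in a correct position."""
    return len(shown & reproduced)
```

A trial would call `make_pattern("high", random.Random())`, display the pattern for 850 ms, and later score the reproduction with `count_correct`.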

## Procedure

The participants individually performed the task in five groups of up to 50 students at the same time. In each group, the participants were randomly assigned to the different working memory load conditions. All participants were presented with the 16 stories, followed by two questions about the conclusion. This means that every participant answered one item of every sentence type. Meanwhile the participants performed the working memory load task. The whole task lasted approximately 12 min per participant.

## RESULTS

## Results of Working Memory Load Task

We calculated the average number of correctly reproduced dots in every load condition. In the low load condition the average number of correctly reproduced dots was 2.78 out of 3 (93%). In the moderate load condition, participants on average reproduced 3.45 dots out of 4 (86%) correctly. Finally, in the high load condition, the average number of correctly reproduced dots was 4.31 out of 6 (72%). This means that participants performed fairly well on the dot recall task. This was important: avoiding both floor effects and ceiling effects is essential in order to expect differences in working memory load. For this reason, we removed all participants in every load condition whose performance was more than two SDs below the average of their condition. We removed two participants from the low load condition (n = 60), three from the moderate load condition (n = 53) and one from the high load condition (n = 32). This left us with a total data set of 204 participants.
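The percentages and the exclusion rule can be illustrated with a short sketch. The reported means are taken from the text. The helper names are hypothetical, and the rule is read here as dropping participants whose score lies more than two sample SDs below their condition mean.

```python
from statistics import mean, stdev

# Reported mean number of correctly reproduced dots and dots shown.
reported = {"low": (2.78, 3), "moderate": (3.45, 4), "high": (4.31, 6)}


def percent_correct(avg_dots, total_dots):
    """Mean recall as a rounded percentage of the dots shown."""
    return round(100 * avg_dots / total_dots)


def exclude_low_performers(scores, n_sd=2.0):
    """Keep scores that are not more than n_sd SDs below the mean.

    Assumption: the cut-off uses the sample SD of the condition.
    """
    cutoff = mean(scores) - n_sd * stdev(scores)
    return [score for score in scores if score >= cutoff]


percentages = {
    name: percent_correct(avg, total) for name, (avg, total) in reported.items()
}
```

With the reported means, `percentages` reproduces the 93%, 86%, and 72% given above.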

<sup>1</sup>Note that 'irrelevant' is labeled as 'absurd' in the original Janssens and Schaeken (2013) study.

## Results on the but-Task


First, we calculated correlations between performance on the dot recall task and performance on the implicature task. In the low load condition we found a correlation of −0.005 (p = 0.97). In the moderate load condition, the correlation was 0.054 (p = 0.7) and the high load condition yielded a correlation of 0.31 (p = 0.089). These correlations indicate that there is no tradeoff between the working memory load task and the implicature task.
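For readers who want to see what such a trade-off check involves, here is a minimal sketch of a Pearson correlation between two per-participant scores. The source does not state which correlation coefficient was used, so Pearson's r is an assumption, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

A clearly negative r within a load condition would have suggested that participants bought performance on one task at the expense of the other.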

For our main analyses a generalized linear mixed model with a logit link function was used (see e.g., Baayen et al., 2008; Jaeger, 2008; Bates et al., 2011). The dependent variable was the accuracy score (0 or 1; conventional or unconventional conclusion). The model fitting procedure was implemented in R using the lmer() function from the lme4 package. We increased model complexity until the best model fit was reached. Model fit was assessed through the Bayesian Information Criterion (BIC). We included a random intercept of participants in the final model (to capture the potential degree of heterogeneity of participants) and no random slopes for participants (because we expect similar effects of our variables on participants). All fixed effects variables were dummy-coded. For a complete description of the results, see the Supplementary Data Sheet 2.
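As a reading aid, a random-intercept logistic model of this kind and the BIC can be written as below. The notation is illustrative and not taken from the source analysis: $y_{ij}$ is the response of participant $j$ on item $i$, $\mathbf{x}_{ij}$ collects the dummy-coded fixed-effect predictors, $u_j$ is the participant intercept, $k$ is the number of estimated parameters, $n$ the number of observations, and $\hat{L}$ the maximized likelihood.

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2)

\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
```

Lower BIC values indicate a better trade-off between fit and complexity, so a predictor such as working memory load is retained only if adding it lowers the BIC.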

The final model includes main effects of connector, conclusion type, axiological value combination and argument combination; a two-way interaction between conclusion type and connector; and a three-way interaction between axiological value combination, argument combination and conclusion type. We did not find an effect of working memory load: there were no significant differences between the load conditions in the mean accuracy scores. These mean accuracy scores for each load condition are depicted in **Figure 1**.

The Supplementary Data Sheet 3 displays a summary of the final model in which the intercept is compared with all other variables. T-tests were performed to further analyze significant effects in the model. There was a significant main effect of connector: but (M = 0.58, SD = 0.16) leads to more conventional answers than the 'period' [M = 0.55, SD = 0.15; t(203) = 2.312, p = 0.022]. Additionally, there was a significant main effect of conclusion type: so (M = 0.60, SD = 0.143) leads to more conventional answers than nevertheless [M = 0.53, SD = 0.163; t(203) = 6.127, p < 0.001]. Moreover, there was a main effect of axiological value: 'positive–negative' (M = 0.61, SD = 0.259) leads to more conventional answers than 'negative–positive' [M = 0.51, SD = 0.243; t(203) = 5.107, p < 0.001]. Finally, there was a significant main effect of argument combination [F(3,609) = 8.635, p < 0.001]. There were fewer conventional answers on strong–weak (M = 0.52, SD = 0.175) than on strong–strong (M = 0.58, SD = 0.193; p < 0.001), and on weak–strong (M = 0.591, SD = 0.173; p = 0.149).

**Figure 2** displays the two-way interaction between connector and conclusion type. When the connector but separates the two arguments, the mean accuracy score is significantly higher for so-conclusions (M = 0.64, SD = 0.48) than for nevertheless-conclusions [M = 0.51, SD = 0.5; t(3262) = 7.28, p < 0.001]. However, when the two arguments are separated by a 'period,' the mean accuracy scores don't differ significantly between so- and nevertheless-conclusions [so: M = 0.56, SD = 0.5; nevertheless: M = 0.54, SD = 0.5; t(3262) = 0.70, p = 0.48]. There are more so-conclusions with but than with 'period' [but: M = 0.64, SD = 0.48, 'period': M = 0.56, SD = 0.5; t(203) = −5.403, p < 0.001].
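The tests reported with 203 degrees of freedom are consistent with paired comparisons of per-participant means across the 204 participants, although the source does not say so explicitly. Under that assumption, the statistic can be sketched as follows (the function name is hypothetical).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Paired t statistic and degrees of freedom.

    Each position holds one participant's mean accuracy in
    condition A and in condition B.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1
```

With 204 participants the function returns 203 degrees of freedom, matching the t(203) notation used above.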

Concerning the three-way interaction, **Figures 3A–D** display the interactions between conclusion type and axiological value combination for each of the different levels of argument combination. In order to deal with multiple testing, Bonferroni correction was used, which set the significance cut-off at 0.000625. When a weak p-argument is combined with a strong q-argument, the axiological value combination 'positive–negative' leads to more accurate answers than 'negative–positive' for nevertheless [so/neg-pos: M = 0.79, SD = 0.41; so/pos-neg: M = 0.85, SD = 0.36; t(814) = −2.28, p = 0.023] [nevertheless/neg-pos: M = 0.27, SD = 0.45; nevertheless/pos-neg: M = 0.45, SD = 0.50; t(814) = −5.41, p < 0.00001]. We find the same results for the combination of a strong p-argument with a weak q-argument [so/neg-pos: M = 0.33, SD = 0.47; so/pos-neg: M = 0.38, SD = 0.49; t(814) = −1.39, p = 0.17] [nevertheless/neg-pos: M = 0.60, SD = 0.49; nevertheless/pos-neg: M = 0.78, SD = 0.41; t(814) = −5.86, p < 0.00001]. When two arguments of the same strength are presented, we see reverse patterns for strong–strong and weak–weak. In both these cases, it depends on the conclusion type whether 'positive–negative' or 'negative–positive' leads to more accurate answers. When both arguments are weak, the axiological value combination 'negative–positive' leads to more accurate answers than 'positive–negative' for so-conclusions, but to less accurate answers for nevertheless-conclusions [so/neg-pos: M = 0.70, SD = 0.46; so/pos-neg: M = 0.51, SD = 0.50; t(814) = 5.68, p < 0.00001] [nevertheless/neg-pos: M = 0.42, SD = 0.49; nevertheless/pos-neg: M = 0.62, SD = 0.49; t(814) = −5.56, p < 0.00001]. When both the p- and q-argument are strong arguments, we find the reverse pattern, with the exception that the difference for the nevertheless-conclusions is not significant [so/neg-pos: M = 0.50, SD = 0.50; so/pos-neg: M = 0.74, SD = 0.44; t(814) = −7.27, p < 0.00001] [nevertheless/neg-pos: M = 0.55, SD = 0.50; nevertheless/pos-neg: M = 0.54, SD = 0.50; t(814) = 0.49, p = 0.62].
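The Bonferroni cut-off can be illustrated with a small calculation. The source does not state how many comparisons entered the correction. A cut-off of 0.000625 corresponds to a family-wise level of 0.05 divided by 80, so both values are used below as inferred, illustrative numbers; the function names are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance cut-off under Bonferroni correction."""
    return family_alpha / n_tests


def is_significant(p_value, family_alpha=0.05, n_tests=80):
    """True if p_value falls below the corrected cut-off.

    Assumption: 0.05 and 80 are inferred from the reported
    cut-off of 0.000625, not stated in the source.
    """
    return p_value < bonferroni_alpha(family_alpha, n_tests)
```

Under this cut-off, reported values such as p = 0.023 and p = 0.17 do not count as significant, whereas any p below 0.00001 does.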

Additionally, we performed two post hoc exploratory analyses, one on the asymmetric conditions (weak–strong and strong– weak) and one on the symmetric conditions (weak–weak and strong–strong). This was inspired by the asymmetry that is visible in **Figures 3A,B** between the strong–weak and the weak–strong combination. We performed the same analysis as the original one. Hence, the dependent variable was again accuracy; we included a random intercept of participants in the final model and no random slopes for participants. Again, all fixed effects variables were dummy-coded. For a complete description of the results, see the Supplementary Data Sheet 4. Here, we want to highlight two important points of these post hoc exploratory analyses. First, also for these models, adding working memory does not lead to an improvement of the model fit, neither when we add working memory as a main effect nor when working memory is part of an interaction. Second, the interaction between conclusion type and argument combination was significant for both extra analyses, but the pattern was different. For the asymmetric conditions, the difference between so and nevertheless was significant for the weak–strong combination [so: M = 0.82, SD = 0.21; nevertheless: M = 0.36, SD = 0.29, t(203) = 18.069, p < 0.001] and also for the strong–weak combination, but in the other direction [so: M = 0.35, SD = 0.29; nevertheless: M = 0.69, SD = 0.24, t(203) = −12.016, p < 0.001]. For the symmetric conditions, the difference between so and nevertheless was in both conditions significant, but now so was always easier than nevertheless [for the strong–strong combination, so: M = 0.62, SD = 0.24; nevertheless: M = 0.55, SD = 0.26, t(203) = 3.116, p < 0.02; for the weak–weak combination, so: M = 0.60, SD = 0.26; nevertheless: M = 0.52, SD = 0.25, t(203) = 3.506, p < 0.001].

## DISCUSSION

The primary aim of the present study was to investigate whether working memory is necessary during the processing of the implicatures from but. There were four working memory load conditions in order to explore whether a higher burden on working memory capacity would significantly decrease the number of conventional answers. Additionally, our experimental design enabled us to investigate the effect of three other variables. First, we made a distinction between weak and strong arguments instead of irrelevant and sensible arguments, which provides a more reliable measure of the effect of content of the arguments. Second, we made a comparison between the connectors but and 'period' to explicitly look at the effect of but. Third, we manipulated the axiological value of the arguments, to check whether the null effect of previous findings could be replicated in a better designed study.

Our best fitting model included a two-way interaction between conclusion type (so and nevertheless) and connector (but and 'period'), a three-way interaction between conclusion type, argument combination and axiological value combination and the main effects of all these variables. We will discuss the consequences of these results for our different hypotheses.

## The Role of Working Memory

The working memory load variable was not included in the best fitting models for the data. This finding is in line with the results in Janssens et al. (2014), who measured working memory capacity in children and found no relation with their performance on the implicature task. This suggests that processing the implicatures from but, so, and nevertheless used in 'p but q' sentences as indirect distancing contrasts happens effortlessly without involvement of working memory. In what follows, we will place this putative conclusion into perspective by discussing four aspects.

First, this null effect might have consequences for the theoretical underpinnings of but. A procedural analysis of but (with the 'contradiction and elimination' principle; see Blakemore, 1987; Iten, 2005) seems to suggest that processing but is effortful. Since our results showed that deriving these implicatures does not seem to be effortful, one can doubt this 'contradiction and elimination' view. Hall (2004, 2007) already postulated that the clause introduced by but does not eliminate an assumption, but merely introduces an argument that points in a different direction. She explicitly says "the implication of the second clause . . . does not entirely seem to replace the implication of the first clause . . . It just has more weight, and this is all that follows from the constraint I'm proposing." (Hall, 2004, p. 229). The proposal of Hall is less demanding with respect to working memory costs, because the elimination is not part of it. Her argumentation might make it more understandable why we did not find that processing but is cognitively effortful. In addition, one could claim that this prediction can also be derived from Grice's theory. For example, Moeschler (2012) wrote:

"This is a very important point in Grice's definition of a conversational implicature, because only conversational implicatures are supposed to be worked out. When an implicature is automatically triggered, through a reference to the meaning of a word, the implicature is conventional." (Moeschler, 2012, p. 417).

Second, we investigated only one type of but-sentences (with but indicating an indirect distancing contrast). One could argue that the implicatures from but investigated in this paper are not purely conventional. One of the major characteristics of a conventional implicature is that this inference is not cancellable without contradiction (see Horn, 2004). However, the heavier weighting of the q-argument in the indirect distancing contrastive 'p but q' sentences can be canceled, for instance when nevertheless introduces the conclusion (and sometimes even when so introduces the conclusion, as our participants clearly did). Hence we are not making any claims about working memory involvement in other types of but or in conventional implicatures in general.

Third, the correlations between the number of correctly reproduced dots and accuracy on the implicature task did not support the idea of a trade-off between the two tasks. This finding is also in line with the conclusion that working memory doesn't seem to influence the implicatures investigated in this study. However, we also found that the percentages of correctly recalled dots were highest in the low load condition and lowest in the high load condition. One might argue that, if processing these specific implicatures from but, so, and nevertheless truly happens automatically, then we should expect these percentages on the dot recall task to be equal for each load condition. It can be argued that participants might have invested an equal amount of working memory capacity into the implicature task and that this goes at the expense of performance on the dot recall task (especially in the high load condition). However, this suggestive explanation seems unlikely since the percentages of correctly reproduced dots were fairly high, so we would have at least expected a difference with the no load condition (which was not found). Moreover, the moderate load condition in our study corresponds to the high load condition in the original De Neys and Schaeken (2007) study. This means that our high load condition was truly highly burdening and therefore a lower percentage of correctly reproduced dots compared to the other two load conditions is not surprising.

Fourth, the observed null effect of working memory must be seen in a wider picture. There is a difference between the task in the current paper on the one hand and for instance the task of De Neys and Schaeken (2007) on scalars on the other hand. In the latter study, there is no correct answer. Some refers to an indeterminate amount; therefore some is compatible with some and not all, but also with all. In other words, some is ambiguous and when interpreting some, participants have to decide, based on contextual information, to go either for the reading with or for the one without the scalar implicature. However, but, so and nevertheless are not ambiguous in the way some is ambiguous: their meaning is clear. One only has some freedom in taking care of the weights of the arguments when coming to an interpretation. It might well be that working memory resources play a different role in these cases. This point can be made clear by using the framework offered by Chemla and Singh (2014a,b). In their stimulating review study, Chemla and Singh provide evidence that the derivation process of scalar implicatures is indeed costly. However, their careful analysis identified different possible derivation processes and it is not clear what in the derivation process of a scalar implicature creates an extra cost. The research of Marty and Chemla (2013) suggests that the processing of the alternatives is not the most effortful part in the derivation of implicatures, but that the decision step (the choice between the two readings) is the costly process (however, see van Tiel and Schaeken, 2016). The current data can be interpreted as in line with this hypothesis. Indeed, there is no need to disambiguate between two readings when interpreting a but-sentence, because there is just one reading. One only has to play with the weights of the arguments. 
Nevertheless, we want to refrain from drawing overly strong conclusions about working memory involvement, because our study on its own does not allow us to conclude that working memory is not involved at all.

## The Role of Arguments Order

The two-way interaction in the model (between conclusion type and connector; see **Figure 2**) is informative with respect to our hypothesis about the effect of order. Indeed, it provides evidence that the conclusion from so leads to more conventional answers than the conclusion from nevertheless, at least when but separates the p- and q-argument. This means that but is interpreted in line with the expectations expressed in the introduction and contributes to the understanding of so and nevertheless. The inference from but directs the reader toward the conclusion from the q-argument and the use of so following a but-sentence confirms and strengthens this conclusion. However, nevertheless requires the reader to overrule the inference from but in favor of the conclusion from p. When a 'period' separates the p- and q-argument, there is no indication which of the two arguments has more weight and therefore what conclusion is the expected one. Consequently, there is no significant difference in the number of conventional answers between so-conclusions and nevertheless-conclusions in the period-condition. Moreover, there are fewer conventional so-conclusions in the period-condition than in the but-condition. Therefore and as predicted, we seem to be able to rule out the alternative explanation that the q-argument gets more weight simply because it is the last given argument. However, it's still possible that some reasoners interpret nevertheless as a word that gives freedom with respect to making the inference from the p- or from the q-argument. We will come back to this last issue in the paragraph on the role of content and axiological value.

When people interpret a sequence of sentences, they want to relate portions of the text or sentences. Rhetorical relations (also called discourse relations or coherence relations) have been proposed as an explanation for the construction of coherence in discourse (see e.g., Lascarides and Asher, 1993; Asher and Lascarides, 2003). Examples of rhetorical relations are condition, motivation, purpose, and volitional cause. But makes the rhetorical relationship in the but-condition explicit (contrast), but reasoners can of course infer a rhetorical relationship themselves in the period-condition. Given the fact that we had combinations of (pretested) sensible arguments in the current experiment, although with a different orientation, it seems fair to argue that most participants have inferred the contrast-rhetorical relation. Therefore, the difference between the but-conditions and period-condition for the so-conclusions is really convincing evidence in favor of the hypothesis that it is but that leads to a different weighting. Because a signaled rhetorical relation (by means of but) might be easier to construct than an unsignaled one (in the period-condition), we only used the period-condition as a control for the temporal order hypothesis and not for the working memory involvement hypothesis.

## The Role of Content and Axiological Value

The three-way interaction, although difficult to interpret, seems to be informative with respect to our hypotheses about the role of content and the axiological value. We had not anticipated an effect of axiological value combination. The variable was basically added as a control variable, although we wanted to verify its null effect found in previous studies. The role of the argument combination variable, however, was not unexpected. Previous studies (albeit in a maybe methodologically less precise way) already gave evidence for the effect of content.

**Figure 3C** depicts the situation in which two weak arguments are presented. It can be argued that this is not an obvious situation. Compared to weak–strong and strong–weak, neither of the two arguments stands out over the other. Compared to strong–strong, the weak–weak construction only contains weak arguments and it may be less clear which inference stems from these weak arguments. **Figure 3C** shows that in this weak–weak situation, people make more correct inferences from a positive argument than from a negative argument for both so and nevertheless. This can be deduced from the fact that the axiological value combination 'negative–positive' leads to more conventional so-conclusions than 'positive–negative' and the opposite applies for nevertheless. Since conventional so-conclusions are inferred from the q-argument and nevertheless-conclusions from the p-argument, this means that positive arguments facilitate the conventional conclusion in weak–weak situations.

The same seems to hold for other less obvious situations with different argument combinations. We found that, overall, nevertheless-conclusions elicited more unconventional conclusions than so-conclusions and for these nevertheless-conclusions a positive argument seems to facilitate the conventional conclusion compared to a negative argument as well. This can be seen in **Figure 3A** (weak–strong) and **Figure 3B** (strong–weak). However, this does not hold for the nevertheless-conclusions when p and q are both strong arguments. In those sentences, there was no significant difference between 'positive–negative' and 'negative–positive.' The fact that in general nevertheless is better understood with a positive p-argument might be seen as evidence in favor of the claim that reasoners are able to interpret nevertheless as pointing toward the conclusion from the p-argument, at least when the circumstances are ideal, i.e., when not too much processing is required and a preferred axiological value construction is used.

When we look at the so-conclusions, a reverse pattern seems to emerge. In the strong–strong (**Figure 3D**) situations, the axiological value combination 'positive–negative' leads to significantly more conventional so-conclusions than 'negative–positive,' which implies that a negative argument facilitates the conventional conclusion in these situations. This difference between 'positive–negative' and 'negative–positive' is not significant for the so-conclusions in the strong–weak situations (**Figure 3B**). This can be explained by the fact that this is the least obvious so-conclusion to make, since it requires the reader to ignore a strong argument in favor of a weak argument. This explanation, however, does not match with the absence of a difference in the weak–strong situation. Here the reader has in principle only an easy job to do, that is, ignore a weak argument in favor of a strong one.

These results seem to be somewhat in line with the positive–negative asymmetry theory (Peeters and Czapinski, 1990; Taylor, 1991; Rozin and Royzman, 2001). Lewicka (1988, 1998) has demonstrated that this theory can account for human deviations from normative models of reasoning (see e.g., Verschueren et al., 2006). According to the theory, human information processing bears the marks of a general tendency to have greater subjective necessity associated with avoiding negative outcomes than with obtaining positive outcomes. The general trend we observe is that so-conclusions are easier with the negative–positive form, while the opposite is true for nevertheless. In both cases this means that the conclusion is easier when it is based on the positive argument (q in the case of so and p in the case of nevertheless). We do not have a clear explanation yet why and how exactly this effect interacts with the strength of the arguments. Nevertheless, our results indicate that emotional factors can penetrate the interpretation process in a very subtle yet influential way. The current experiment seems to show that even non-intrusive but-sentences change depending on whether people read a positive or negative p- or q-argument. However, since the effect of axiological value combination was unexpected, the proposed analysis remains suggestive. Therefore, replication and variation studies are mandatory in order to firmly establish this positive–negative effect and the interaction with the strength of the arguments.

The two extra analyses not only confirmed the null-effect of working memory, they also revealed an interesting extra finding. For the weak–strong combination, we observed more expected so-conclusions than nevertheless-conclusions; for the strong–weak combination, we observed more expected nevertheless-conclusions than so-conclusions. This interaction can be phrased differently, namely, for both combinations participants just preferred conclusions on the basis of the strong argument, which is the q-argument for the weak–strong combination and the p-argument for the strong–weak combination. This seems to indicate that participants were strongly driven by the strength of the arguments in the asymmetric condition. This observation is in line with Hall's (2004, 2007) theory. She claims that the q-argument does not eliminate an assumption, but merely announces an argument that points in a different direction. The q-argument has more weight and is preferred over the p-argument, but when the content of the p-argument allows it, a conclusion can be drawn from p. Hence, Hall (2004, 2007) indirectly emphasizes the significance of the content of the arguments.

## CONCLUSION

This experiment showed that, when presented with but-constructions indicating an indirect distancing contrast, people tend to attribute more weight to the q- than the p-argument. The experiment also showed that participants under a high working memory load did not perform significantly differently from participants under a low working memory load or participants whose working memory was not burdened at all. Concerning the different conclusion types, we found that more unconventional answers are given when participants have to infer the nevertheless-conclusion than when they have to infer the so-conclusion. We also found that the content of the arguments played a very important role. Whenever a strong argument is combined with a weak argument, participants mostly base their conclusion on the strong argument and consequently ignore the conventional interpretation of but (and so and nevertheless). Hence, even sensible arguments can get annulled simply because they are weak and measured against a stronger argument. The latter effect might be modulated by the axiological value of the arguments. In other words, the strength or perceived relevance of the p- and q-arguments can override the expectations elicited by but, so and nevertheless: content and context are important forces during our interpretation process.

## AUTHOR CONTRIBUTIONS

All authors contributed to this article, both substantively and formally. LJ and WS prepared the experiment. LJ performed the experiment and did the statistical analysis. WS did the data interpretation. LJ was the driving force in writing the first version and WS of the final version. All authors contributed equally to the editing of the manuscript and approved the final version of the manuscript.

## FUNDING

This work was supported by the National Council for Scientific Research – Flanders, Belgium (FWO) under Grant G.0634.09.

## ACKNOWLEDGMENTS

We would like to thank our reviewers for their constructive remarks. Their requests, suggestions and advice greatly improved the paper.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01520/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Janssens and Schaeken. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Context in Generalized Conversational Implicatures: The Case of Some

### Ludivine E. Dupuy, Jean-Baptiste Van der Henst, Anne Cheylus and Anne C. Reboul\*

*National Center for Scientific Research, Institute for Cognitive Sciences-Marc Jeannerod-UMR 5304, University Claude Bernard-Lyon1, Lyon, France*

There is now general agreement about the optionality of scalar implicatures: the pragmatic interpretation will be accessed depending on the context relative to which the utterance is interpreted. The question, then, is what makes a context upper- (vs. lower-) bounding. Neo-Gricean accounts should predict that contexts including factual information will enhance the rate of pragmatic interpretations. Post-Gricean accounts should predict that contexts including psychological attributions will enhance the rate of pragmatic interpretations. We tested two factors using the quantifier scale <*all, some*>: (1) the existence of factual information that facilitates the computation of pragmatic interpretations in the context (here, the cardinality of the domain of quantification) and (2) the fact that the context makes the difference between the semantic and the pragmatic interpretations of the target sentence relevant, involving psychological attributions to the speaker (here a question using *all*). We did three experiments, all of which suggest that while cardinality information may be necessary to the computation of the pragmatic interpretation, it plays a minor role in triggering it; highlighting the contrast between the pragmatic and the semantic interpretations, while it is not necessary to the computation of the pragmatic interpretation, strongly mandates a pragmatic interpretation. These results favor Sperber and Wilson's (1995) post-Gricean account over Chierchia's (2013) neo-Gricean account. Overall, this suggests that highlighting the relevance of the pragmatic vs. semantic interpretations of the target sentence makes a context upper-bounding. Additionally, the results give a small advantage to the post-Gricean account.

Keywords: scalar implicature, upper-bounding context, lower-bounding context, domain of quantification, relevance, cardinality of domain of quantification

### Edited by:

*Gabriella Airenti, University of Torino, Italy*

### Reviewed by:

*Dimitrios Skordos, College of William and Mary, USA; Nausicaa Pouscoulous, University College London, UK*

\*Correspondence: *Anne C. Reboul reboul@isc.cnrs.fr*

### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *16 June 2015* Accepted: *03 March 2016* Published: *22 March 2016*

### Citation:

*Dupuy LE, Van der Henst J-B, Cheylus A and Reboul AC (2016) Context in Generalized Conversational Implicatures: The Case of Some. Front. Psychol. 7:381. doi: 10.3389/fpsyg.2016.00381*

## INTRODUCTION

Broadly speaking, context is the set of non-linguistic pieces of information that plays a role in the interpretation of an utterance. As such, it can include any relevant extralinguistic information, from notions relative to the individual (his putative knowledge and/or his set of assumptions and beliefs) to notions relative to the physical environment in which the conversation is taking place. Although there is wide agreement that the meaning of utterances (their truth-conditions) varies somewhat according to the context in which they are uttered, the extent of this variation is disputed. While contextualism argues that the context's contribution to the truth-conditional meaning of an utterance is substantial, minimal semanticists (e.g., Borg, 2004, 2012) argue that pragmatic processing plays a very limited role in semantic content. Although contextualism has become the dominant paradigm in the philosophy of language, it has also raised serious questions about the semantic-pragmatic interface in linguistics. This debate has been dubbed the "border wars" (see Horn, 2006) and has mainly centered on implicatures.

Grice (1989) introduced the notion of implicature. One utterance can have a semantic meaning (i.e., linguistic or conventional meaning) and a speaker meaning (i.e., the content the speaker intends to communicate), an implicature. Grice claimed that adult native speakers of a language easily retrieve the additional implicit meaning because communication is governed by a set of tacit maxims, summed up under the Cooperative Principle. The hearer goes beyond what the speaker literally said to recover a meaning compatible with the assumption that the speaker complied with the maxims. According to Grice, to compute a conversational implicature, a hearer must take into account both what the speaker said and what he could have said. The reasoning is thus based on both the actual utterance and its possible alternatives.

Grice proposed a further distinction among conversational implicatures based on how the alternatives are determined. He divided them into Particularized Conversational Implicatures (PCIs) and Generalized Conversational Implicatures (GCIs). According to him, PCIs are heavily context-dependent [the alternative utterances are determined by the context, as in (2)], whereas GCIs are not context-dependent [the alternative utterances are lexically determined, as in (3)]. Logical words such as or and some typically trigger a GCI:


The Gricean distinction between PCIs and GCIs has constituted the main battleground for the "border wars" between neo-Griceans<sup>1</sup> (Levinson, 2000; Horn, 2004, 2006), who endorse the distinction between PCIs and GCIs, and post-Griceans (Sperber and Wilson, 1995; Noveck and Sperber, 2007), who reject it and claim that all implicatures are context-dependent.

The neo-Griceans have defended their view by proposing a lexicalist account, according to which lexical triggers belong to scales, e.g., <all, some>, <and, or>, where the weaker terms implicate the negation of the stronger terms, producing a scalar implicature (SI; for a general overview, see Horn, 2004). Levinson (2000) proposed a strongly lexicalist model, according to which weak scalar terms automatically trigger, as a default interpretation, the SI: for instance, some will automatically be interpreted as some, but not all. The semantic interpretation will only be accessed when the SI is explicitly canceled.

The theoretical predictions of the neo-Gricean and post-Gricean views are fairly clear. On the one hand, neo-Gricean accounts predict pragmatic interpretations at ceiling and extremely low levels of semantic interpretations. They also predict that drawing pragmatic interpretations will take less time than drawing semantic interpretations. On the other hand, post-Griceans make opposite predictions: pragmatic interpretations will not be at ceiling and a number of semantic interpretations should be expected. Additionally, pragmatic interpretations should take longer than semantic interpretations.

On the whole, experimental evidence has not been favorable to the neo-Gricean default account. The results showed a strong residual percentage of lower-bounded, semantic, interpretations (20–40% depending on the experimental paradigm; see e.g., Bott and Noveck, 2004; Feeney et al., 2004; Pouscoulous et al., 2007). This has been interpreted as showing that GCIs are context-dependent to a degree and has called into question the Gricean distinction between GCIs and PCIs. Regarding interpretive cost, most results seemed to show that the semantic interpretation is more readily and easily (lower RTs) accessed than the pragmatic interpretation, which suggests that the pragmatic meaning has a higher processing cost than the semantic meaning (see e.g., Noveck and Posada, 2003; Bott and Noveck, 2004; Breheny et al., 2006; Huang and Snedeker, 2009, 2011; Bott et al., 2012). It is important to note, however, that there is conflicting evidence in the literature (see Grodner et al., 2010, who show that, given an appropriate context, the pragmatic interpretation is not more costly than the semantic interpretation). Concerning development, an early batch of experiments seemed to show a clear developmental trajectory with fewer pragmatic interpretations among younger children and an increase with age (see e.g., Gualmini et al., 2001; Noveck, 2001; Papafragou and Musolino, 2003; Guasti et al., 2005; Pouscoulous et al., 2007). However, some studies have shown that even young children (4–5-year-olds) can produce pragmatic interpretations at the adult level (Feeney et al., 2004; Papafragou and Tantalou, 2004; Katsos and Bishop, 2011; Foppolo et al., 2012). This suggests that it is not pragmatic competence per se that children lack and that their low number of pragmatic interpretations in some tasks may be due to task demands.

While RT and developmental evidence may be seen as ambiguous, the rate of pragmatic interpretations is, in itself, a strong argument against the lexicalist neo-Gricean accounts (Levinson, 2000). One might have thought that this would be the end of the border wars. However, these results are compatible with Chierchia's (2004, 2013; Chierchia et al., 2012) syntax-based account, especially in its last version (Chierchia, 2013). According to Chierchia (2013), a silent grammatical exhaustification operator (≈ only) applies on a set of alternatives on a context-dependent basis. In other words, the context will or will not make the set of alternatives available to the operator.

Thus, there now seems to be agreement between the neo-Gricean account (Chierchia, 2013) and the post-Gricean account (Sperber and Wilson, 1995; Noveck and Sperber, 2007) that the process of implicature retrieval is context-dependent. However, a major difference between the two accounts remains, as there is still no agreement as to the mechanism itself.

<sup>1</sup>As we will see below, there are two types of neo-Gricean accounts: the lexicalist neo-Gricean accounts, of which Levinson's is the best example; and the syntax-based accounts, of which Chierchia's model (see Chierchia, 2004, 2013; Chierchia et al., 2012) is the best example. The experimental literature has mainly targeted the lexicalist model, although some results are also relevant for syntax-based accounts.

For neo-Griceans, the process of exhaustification is grammar-driven: a silent grammatical exhaustification operator (≈ only) applies on a set of alternatives on a context-dependent basis. In other words, the context will or will not make the set of alternatives available to the operator. For post-Griceans, it is a pragmatic enrichment process, whereby the logical form (corresponding to the semantic interpretation) is strengthened, leading to the pragmatic interpretation. Note that even on the post-Gricean interpretation, all things being equal, the enrichment will always lead to the same interpretation, i.e., the negation of the stronger terms on the scale. Thus, though in Recanati's (2004) terms, the process of enrichment is optional, the result of the process does not vary according to the context for scalar implicatures.

Yet, there seems to be a way of testing the two accounts at the contextual level. Chierchia (2013) remains very cautious relative to how the context-dependency of the mechanism works and relative to the nature of the context (what kind of information it can include). Other semanticists (see Borg, 2004, 2012; Stanley, 2007) who also accept a modicum of context-dependency for semantics claim that grammar-based processes can only depend on factual contexts that exclude mental state attributions. This limitation does not exist in the post-Gricean view, where the context can include such psychological attributions. This suggests that a way of approaching this new border war between the neo- and post-Gricean accounts would be to see whether upper-bounding contexts (i.e., contexts that enhance the rate of pragmatic interpretations) include factual information vs. psychological attribution (see below).

## THE ROLE OF CONTEXT IN THE DERIVATION OF GCIs

Let us call contexts that favor a semantic interpretation lower-bounding contexts and those that favor a pragmatic interpretation upper-bounding contexts. What remains unclear is what makes a context upper- or lower-bounding. It is precisely this question that the present paper targets with the aim of contributing to the latest version of the border wars. Studies addressing this issue have mainly targeted children<sup>2</sup> and aimed to identify the factors that increase the rate of pragmatic interpretation. From those studies, three main factors have emerged:


• The accessibility of the alternative set (Barner et al., 2011; Aravind and de Villiers, 2014; Skordos, 2014)<sup>4</sup>.

While there is a general consensus that the third factor (the accessibility of the alternative set) is not relevant to adults with regard to SIs (Aravind and de Villiers, 2014; Skordos, 2014), the first two factors are central to our investigation, as they constitute respectively a factual context and a psychological context<sup>5</sup>. Quantifiers are normally interpreted relative to a contextually determined Domain of Quantification (DQ), which basically indicates the set of objects over which the quantifier quantifies. Additionally, the cardinality of the DQ (how many objects should be considered) is necessary to verify whether all or only some of the objects in the DQ are affected by a given process. For example, in Feeney et al. (2004), the experimental material is as follows:

(4) Charlotte finds three sweets [our emphasis] on the kitchen table. Charlotte likes sweets. Charlotte eats the first sweet. Charlotte eats the second sweet. Charlotte eats the third sweet. Charlotte's Mum says, "Charlotte, what have you been doing with the sweets?" Charlotte says: "I've eaten some of the sweets."

The mention of DQ-cardinality in the first sentence allows participants to verify that in the course of the story, the character has exhausted all the originally present items (here, the three sweets). In other words, DQ-cardinality is a necessary factor in SIs based on the quantifier scale. In addition, explicitly mentioning DQ-cardinality in the first sentence, as well as counting the objects in the following sentences [as in (4)] should make it obvious to the hearer that all the objects in the DQ are affected, which, arguably, should favor a pragmatic interpretation. This then will be the kind of context that, on a neo-Gricean account (Chierchia, 2013), should enhance the rate of pragmatic interpretations.
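The role of DQ-cardinality in verification can be made concrete with a minimal sketch. The function names below are hypothetical and are not part of the experimental materials; the sketch only assumes that the semantic reading of some is "at least one" and the pragmatic reading is "at least one but not all". Only the second reading needs the cardinality of the DQ as an input.

```python
def semantic_some(affected: int) -> bool:
    """Lower-bounded reading: true as soon as at least one object is affected."""
    return affected >= 1


def pragmatic_some(affected: int, dq_cardinality: int) -> bool:
    """Upper-bounded reading ('some but not all'): requires the DQ cardinality."""
    if not 0 <= affected <= dq_cardinality:
        raise ValueError("affected must lie between 0 and dq_cardinality")
    return 1 <= affected < dq_cardinality


# Story (4): three sweets on the table, all three eaten.
print(semantic_some(3))      # True: "some of the sweets" is literally true
print(pragmatic_some(3, 3))  # False: all sweets were eaten
```

On this sketch, the target sentence in (4) is accepted under the semantic reading and rejected under the pragmatic one, which is the contrast the verification task relies on.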

By contrast, while making the contrast between the semantic and the pragmatic interpretations relevant may encourage the derivation of pragmatic answers, it is not necessary because adults can produce pragmatic answers even when the contrast is absent. For instance, for the categorical sentences used by Noveck (2001, e.g., Some elephants have trunks), which do not make the contrast between a pragmatic and a semantic answer relevant in and of themselves, adults gave 59% of pragmatic responses when answering the question Do you agree?. Making the contrast between the pragmatic and the semantic interpretations relevant would also not be sufficient in the absence of DQ-cardinality, as participants could not check whether all or only some of the objects in the DQ are affected. But, on a post-Gricean account (Sperber and Wilson, 1995; Noveck and Sperber, 2007), this type of context should enhance the rate of pragmatic interpretations. This is because, on Relevance Theory, the interpretation process only stops when an interpretation consistent with the presumption that the utterance is optimally relevant (achieving a balance between interpretive costs and benefits) has been reached. When an upper-bounding question is present in the context, it is clear that satisfying that condition entails accessing the pragmatic interpretation.

<sup>2</sup>Those studies aimed at improving children's rate of pragmatic interpretations.

<sup>3</sup> In Feeney et al. (2004), the contrast was made relevant by including an element of deception: the character who uttered the underinformative sentence clearly intended to mislead her hearer by letting her believe that she had not, e.g., eaten all the sweets but only some of them. Papafragou and Tantalou (2004) highlighted this by the fact that the participant had to decide whether the character should receive a prize based on the fact that the character had, e.g., painted all the stars.

<sup>4</sup> Skordos (2014) manipulated the order of the experimental items in such a way that every underinformative some item was preceded by an all item, making the alternative to some Xs (all Xs) more accessible.

<sup>5</sup>The notion of relevance is relative to an utterance and expectations of relevance depend on the intentions the hearer attributes to the speaker.

Thus, we propose to combine the presence or absence of an explicit mention of cardinality in the context with the presence of another element in the context that does or does not make the contrast between the weaker and the stronger term (e.g., some and all) conversationally relevant. We chose to use a question because, theoretically, questions have been deemed to clearly indicate the type of answer that would be relevant to the speaker, and the hearer recovers that information through mental state attribution. Wilson (2000), following Sperber and Wilson (1995), proposed that questions are the metarepresentational counterparts of imperatives, representing desirable thoughts or, in other words, relevant answers. This view comes very close to the notion of Question-Under-Discussion (QUD: see Roberts, 2004). A question featuring all indicates how the hearer's reply (the target sentence) can be relevant (by saying whether the action affects **all or only some** of the objects in the DQ), whereas a lower-bounding question does not indicate whether the speaker is interested in knowing whether only some or all of the objects in the DQ are affected. Hence, a lower-bounding question (using the indefinite plural determiner) does not make the difference between a semantic and a pragmatic interpretation relevant and thus does not encourage the participant to access the pragmatic interpretation. In the following experiments, we compare pragmatic and semantic interpretations of underinformative utterances (SIs) in the following cases:


Given the above discussion on the kind of contexts that the neo- and the post-Gricean accounts will accept (the neo-Griceans favoring factual contexts, while the post-Griceans favor psychological contexts), the two accounts will make different predictions (where ">" means more pragmatic interpretations):

**Neo-Gricean account**: (DQ-cardinality and upper-bounding question = DQ-cardinality and lower-bounding question) > (No DQ-cardinality and upper-bounding question = No DQ-cardinality and lower-bounding question).

**Post-Gricean account**: (DQ-cardinality and upper-bounding question = No DQ-cardinality and upper-bounding question) > (DQ-cardinality and lower-bounding question = No DQ-cardinality and lower-bounding question).

In other words, the neo-Gricean account predicts that an upper-bounding question will make no difference to the rate of pragmatic interpretations, while the post-Gricean account predicts that an explicit mention of DQ-cardinality will make no difference to the rate of pragmatic interpretations.
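The two prediction patterns can be laid out as a small sketch over the four cells of the design. The function names and the 0/1 coding are illustrative assumptions: they encode only the predicted ordering (1 = more pragmatic interpretations, 0 = fewer), not any predicted rate.

```python
from itertools import product

# Each condition is (dq_cardinality_mentioned, upper_bounding_question).
CONDITIONS = list(product([True, False], repeat=2))


def neo_gricean_level(card: bool, upper: bool) -> int:
    """Ordinal prediction: only the factual cue (DQ-cardinality) matters."""
    return 1 if card else 0


def post_gricean_level(card: bool, upper: bool) -> int:
    """Ordinal prediction: only the psychological cue (question type) matters."""
    return 1 if upper else 0


for card, upper in CONDITIONS:
    print(
        f"card={card!s:5} upper={upper!s:5} "
        f"neo={neo_gricean_level(card, upper)} "
        f"post={post_gricean_level(card, upper)}"
    )
```

The two accounts disagree only in the mixed cells (cardinality with a lower-bounding question, and no cardinality with an upper-bounding question), which is where the experiments can discriminate between them.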

## EXPERIMENTS

## Experiment 1

### Participants

Eighty participants (53 women and 27 men) were recruited from the area around Lyon, France. The participants were between 18 and 26 years of age (mean age: 21.6) and were either students or young graduates. All were native French speakers and had normal or corrected-to-normal vision. They participated in the experiment on a voluntary basis<sup>6</sup> and received a payment of 10 euros. The experiment lasted approximately 15 min.

### Stimuli and Procedure

To investigate the role of the two contextual elements and the impact of their interactions on the derivation of the pragmatic interpretation of the quantifier some, we used a simple verification task. The experiment was displayed on a computer screen and took place entirely in French, although the English translations are presented in the paper<sup>7</sup>. As the experiment was self-paced, the participants had to press the spacebar to move from one slide to the next. They could also go back if desired by pressing the left arrow key.

The experiment proceeded as follows: the participants were presented with a story narrated through six image-sentence pairs and then saw a puppet named Lilo ask a question about the outcome of the story. Immediately following Lilo's question, another puppet (Pipo) appeared on the screen and answered the question using an underinformative sentence (target sentence). The participants were asked to judge Pipo's sentence by answering Yes or No to the question "Is Pipo right?" The answers were recorded automatically by the computer program. The procedure is illustrated in **Figure 1**.

In addition to the test items (in which the weak term some was used when the stronger term all would have been more appropriate), there were three non-target types of sentences that served as controls: two in which all was used (one in which the target sentence was true and one in which it was false) and one in which some was used and was felicitous. There were thus infelicitous some, felicitous some, false all and true all items in each condition.

For each participant, the experimental sentences were ordered randomly from a base of 4 stories for each of the three control types of items (true all, false all and felicitous some) and 8 stories for the test items (infelicitous some). Thus, there was a total of 20 stories per participant (3 × 4 control stories plus 8 test stories).
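As a rough illustration of how such an item list could be assembled and randomized per participant, consider the sketch below. The labels, the function name and the use of a seeded shuffle are assumptions made for the example; the software actually used to run the experiment is not described here.

```python
import random

CONTROL_TYPES = ["true_all", "false_all", "felicitous_some"]
STORIES_PER_CONTROL_TYPE = 4
TEST_STORIES = 8  # infelicitous (underinformative) some


def build_item_list(seed=None):
    """Return the 20 (item_type, story_index) pairs in a random order."""
    items = [
        (kind, index)
        for kind in CONTROL_TYPES
        for index in range(STORIES_PER_CONTROL_TYPE)
    ]
    items += [("infelicitous_some", index) for index in range(TEST_STORIES)]
    random.Random(seed).shuffle(items)  # independent order for each participant
    return items


items = build_item_list(seed=0)
print(len(items))  # 20
```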

### Design

The experiment followed a 2 × 2 mixed design with Cardinality as a between-subjects variable (Card vs. NoCard) and Question as a within-subjects variable (Lower-bounding vs. Upper-bounding question), resulting in four conditions. The variations concerned the first sentence of the story (in which a cardinal number was specified or not specified) and the question asked by Lilo, which was either upper-bounding or lower-bounding (see **Table 1**). We tested 40 participants in the Card condition and 40 participants in the NoCard condition.

<sup>6</sup>This study was carried out in accordance with the recommendations of the Comité de Protection des Personnes Sud Est II, which granted its agreement with the study (IRB number: 11263). All subjects gave written informed consent in accordance with the Declaration of Helsinki.

<sup>7</sup>All the materials used in the experiments presented in this paper can be found in the Appendix of Supplementary Material, where they are presented in French with an English translation.

### Results and Discussion

As the non-target types of sentences were used to assess the understanding of the task, the proportions of yes and no responses were converted into correct and incorrect responses. The accuracy rate for the various control sentences was 94%, showing that the task was correctly understood. The data from 3 participants who provided incorrect answers to 4 or more of the control stimuli were discarded.

We were interested in the acceptance (semantic responses) or rejection (pragmatic responses) of the underinformative some target sentences. The responses were coded for pragmatic correctness, that is, an answer of no to the underinformative sentences. **Figure 2** illustrates the percentage of pragmatic answers for each of the four versions of the experiment.
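The coding scheme described above can be summarized in a short sketch. The function name, the data layout and the example responses are invented for illustration; only the two rules are taken from the text: participants with 4 or more incorrect control answers are discarded, and a no answer to an underinformative sentence counts as a pragmatic response.

```python
from typing import List, Optional

MAX_CONTROL_ERRORS = 3  # 4 or more incorrect control answers leads to exclusion


def pragmatic_score(control_correct: List[bool],
                    test_answers: List[str]) -> Optional[int]:
    """Count pragmatic ('no') answers, or return None for an excluded participant."""
    errors = sum(1 for ok in control_correct if not ok)
    if errors > MAX_CONTROL_ERRORS:
        return None
    return sum(1 for answer in test_answers if answer == "no")


# Hypothetical participants (made-up responses, not data from the study).
kept = pragmatic_score([True] * 12,
                       ["no", "no", "yes", "no", "yes", "no", "no", "no"])
dropped = pragmatic_score([True] * 8 + [False] * 4, ["no"] * 8)
print(kept, dropped)  # 6 None
```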

The statistical analysis revealed a significant effect of the question: an upper-bounding question triggered significantly more pragmatic answers than a lower-bounding question both when the cardinality of the DQ was explicitly stated (Median<sub>upper-bounding</sub> = 4, Median<sub>lower-bounding</sub> = 2, Wilcoxon signed-ranks, Z = 3.997, p < 0.001) and when it was not (Median<sub>upper-bounding</sub> = 4, Median<sub>lower-bounding</sub> = 3, Wilcoxon signed-ranks, Z = 3.327, p < 0.001). A Mann-Whitney test was then performed to assess the role of the cardinality. No significant effect was found in either of the conditions (upper-bounding question: Median<sub>card</sub> = 4, Median<sub>nocard</sub> = 4, U = 700.5, p = 0.548; lower-bounding question: Median<sub>card</sub> = 2, Median<sub>nocard</sub> = 3, U = 656, p = 0.296).
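For readers unfamiliar with the Mann-Whitney test used for the between-subjects comparison, the following sketch computes the U statistic from two independent samples using average ranks for ties. It is a generic textbook computation written for this illustration, and the two small samples are made up; it does not reproduce the analysis pipeline or the data of the study.

```python
def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Return the smaller of the two U statistics for samples x and y."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y)


# Made-up pragmatic-answer counts for two small groups.
group_a = [4, 4, 3, 2]
group_b = [2, 1, 0, 3]
print(mann_whitney_u(group_a, group_b))  # 2.0
```

Because U is built from rank sums with tied values sharing average ranks, it is always a multiple of 0.5, as with the value U = 700.5 reported above.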

In this experiment, we observed a significant effect of the question type on the rate of pragmatic answers but no impact of the cardinality of the DQ. Indeed, the two conditions with an upper-bounding question triggered significantly more pragmatic interpretations than the two conditions with a lower-bounding question, regardless of whether they were combined with an indication of cardinality. This result might suggest that cardinality plays a minor role (if any) in the computation of the SIs. However, an alternative explanation is that the effect of the question is so strong that it overrides any potential effect of the cardinality. To determine whether this is, indeed, the case, we conducted a second experiment in which we removed the conversational context.

## Experiment 2

### Participants

For this second experiment, 60 participants (aged between 18 and 25 years; mean age: 21.1) were recruited. There were 16 males and 44 females. The participants were either students or young graduates from the Universities of Lyon and Bordeaux, France. They were native French speakers, had no background in linguistics and had normal or corrected-to-normal vision. They participated in the experiment voluntarily and were paid 10 euros.

### Stimuli and Procedure

To ensure consistency, minimal changes were made to the original design: the same 20 stories (sets of sentence-image pairs describing a sequence of actions: 8 test stories and 12 control stories) and the same procedure were used. The main difference between this experiment and Experiment 1 was the absence of the question. After the participants viewed the six image-sentence pairs, they were directly presented with the puppet uttering the target, underinformative sentence and were asked whether the puppet's description of the story was correct: the target sentence was simply presented as a comment on the story.

The storyboard in Experiment 2 is presented below (see **Figure 3**).



### Design

Three conditions were compared: one in which DQ-cardinality is mentioned in the first sentence-image pair and the rest of the context is the same as that in Experiment 1 (notwithstanding the absence of a question); one in which DQ-cardinality is not mentioned in the first sentence-image pair and the rest of the context is identical to that in Experiment 1; and one in which DQ-cardinality is not mentioned in the first sentence-image pair and the successive sentences do not number the objects (see **Table 2**). This last condition was added in case the numbering of the object in the description of each sequential event in the story alerted participants to DQ-cardinality.

Each participant was tested in only one condition, and there were 20 participants per condition.

### Results and Discussion

The participants answered the control sentences with an accuracy rate of 96%. The data from three participants had to be discarded because these participants gave too many incorrect responses in the three control conditions. Again, the rejection rate of underinformative responses represented the percentage of pragmatic answers (see **Figure 4**).

The Kruskal-Wallis ANOVA by Ranks and Median Test showed no significant difference in the rate of pragmatic answers between the different conditions, H(2) = 0.0595, p = 0.970, with a mean rank of 10 for the Card and NoCard conditions and 8 for the NoNumber condition.
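The relation between the reported H statistic and its p-value can be checked by hand. Under the usual chi-square approximation for the Kruskal-Wallis test, H with three groups is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-H/2). The sketch below applies that formula; it is a sanity check under this approximation, not a re-analysis of the data.

```python
import math


def chi2_upper_tail_df2(h: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-h / 2)


p_value = chi2_upper_tail_df2(0.0595)
print(f"{p_value:.2f}")  # 0.97
```

The result, approximately 0.97, is in line with the p-value reported above.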

In the first experiment, for which the neo-Gricean account predicts an effect of the cardinality of the DQ, we only obtained an effect of upper- vs. lower-bounding question. These results suggest that the cardinality of the DQ plays (at most) a minor role in the derivation of pragmatic interpretations for quantifierbased SIs. To test that hypothesis, we ran a second experiment in which we compared three conditions in the absence of a question: one condition in which the cardinality was explicitly indicated (Card), one in which it was not (NoCard) and one in which neither cardinality nor number was indicated (NoNumber). Because the Card condition does not lead to significantly more pragmatic interpretations than the NoCard condition, we can conclude that DQ-cardinality does not play a major role in the access of pragmatic interpretations. This is further evidenced by the fact that the results remain unchanged if we do not state any numbers explicitly (NoNumber).

A potential limitation of our study, however, is that in Experiment 1, the lower- vs. upper-bounding question comparison involved a within-subjects design. This raises two concerns: (a) it makes the comparison between the within-subjects conditions and the between-subjects conditions difficult and (b) it creates a strong pragmatic contrast between the two types of questions, which could have an impact on the rate of rejection of underinformative statements: the difference observed in the rate of rejection of underinformative statements could well be due to the contrast between the question types rather than the question type by itself. To rule out this possibility, we conducted a follow-up experiment in which the question type (upper- vs. lower-bounding) was manipulated as a between-subjects factor.

## Experiment 3

### Participants

Forty undergraduate students from Lyon University participated in this experiment (mean age: 21.05; 8 were male). All participants were native French speakers, had normal or corrected-to-normal visual acuity, and were given 10 euros for participation.

### Stimuli and Design

This experiment used the same stimuli as those used in Experiment 1, but the design was slightly changed to make the question type a between-subjects variable. The participants were still presented with the six images and their verbal descriptions and the two puppets. However, in this experiment, each participant saw Lilo ask the same type of question (either lower-bounding or upper-bounding) throughout the experiment (see **Table 3**).

TABLE 3 | The two experimental conditions of Experiment 3.

As familiarity with the question type was a concern, some changes were made to the fillers: instead of using upper- or lower-bounding questions in the control sentences, the questions contained the French plural definite article "les" and the French singular indefinite article "un."

### Results and Discussion

The participants correctly responded to 94 percent of the control sentences. Four participants (2 in each condition) provided incorrect answers for 4 out of 12 sentences in the control conditions and were thus removed from the set of results. The percentage of rejection of underinformative sentences was used to calculate the rate of pragmatic responses (see **Figure 5**).

As in Experiment 1, a strong effect of the question type on the rate of rejection of underinformative sentences was observed, with upper-bounding questions triggering significantly more pragmatic answers than lower-bounding questions (Median<sub>upper-bounding</sub> = 7.5, Median<sub>lower-bounding</sub> = 1.5, Mann-Whitney test, U = 3.548, p < 0.001).

The results of Experiment 3 are fairly similar to those we obtained in Experiment 1 and suggest that it is not the contrast between the two types of questions that impacts the rate of pragmatic answers but the question type itself. By changing the design slightly, we have been able to rule out a possible alternative explanation and to show that, indeed, the psychological context plays a major role in the interpretation of scalar implicatures.

## GENERAL DISCUSSION

As previously mentioned, both neo-Griceans (Chierchia, 2013) and post-Griceans (Sperber and Wilson, 1995; Noveck and Sperber, 2007) now agree that pragmatic interpretation for SIs is context-dependent. Some contexts (the upper-bounding contexts) make the pragmatic interpretation relevant, encouraging hearers to make the necessary effort to access it, whereas other contexts (the lower-bounding contexts) do not. However, what makes a context upper-bounding remains unclear and may make a difference between the two accounts. We examined two factors: the presence of explicit information that facilitates the computation of pragmatic information in the context (DQ-cardinality) and the presence of an element that makes both the information in question salient and the difference between the pragmatic and the semantic interpretations relevant in the context (upper-bounding question).

Our view was that if DQ-cardinality was the main factor triggering pragmatic answers, this would favor the neo-Gricean account. By contrast, if the presence of an upper-bounding question was the main factor triggering pragmatic answers, this would favor the post-Gricean interpretation. The greater number of pragmatic answers found in Experiments 1 and 3 was entirely due to the upper-bounding question. Additionally, an explicit mention of DQ-cardinality and object number in Experiment 2 did not result in differences between the three conditions (i.e., DQ-cardinality, no-DQ-cardinality, no-number). Thus, taken together, these results suggest that DQ-cardinality plays a minor role, or perhaps no role at all, in the derivation of pragmatic interpretation in quantifier-based SIs.

This conclusion might, however, be premature. Indeed, our contexts combined pictures and written sentences, which means that the objects in the DQ were always visually represented in the first sentence-picture. Moreover, DQ-cardinality was immediately perceptible because the number of objects (five in all the stories) is within the range of subitization<sup>8</sup>. One could thus argue that DQ-cardinality is involved in the computation of the pragmatic interpretation but that in cases in which it is immediately perceptible through subitization, it does not have to be explicitly mentioned. In other words, according to such a hypothesis, DQ-cardinality would be available in all conditions through subitization. Hence, the addition of an explicit mention of DQ-cardinality would be redundant, and one would not expect it to have an effect. In such a view, the upper-bounding question would make the difference merely by making the information relevant, regardless of whether it was made available through language or through the visual scene. Thus, according to this line of argumentation, the absence of a difference between the DQ-cardinality/no DQ-cardinality conditions in Experiments 1 and 2 would not argue against the important role of DQ-cardinality in the production of pragmatic interpretations.

<sup>8</sup> Subitization is the process through which low numbers (1–5, minus or plus 2) are perceived directly (visually) even though the relevant objects are not counted (Kaufman et al., 1949).

This is compatible with the results we obtained: in this view, one would expect the only difference in Experiments 1 and 3 to be between the lower- and upper-bounding question conditions, and this is exactly what we found. In the same way, one would not expect the explicit mention of DQ-cardinality to result in a difference between the three conditions in Experiment 2, given that the information is visually available through subitization, and this is the result that we obtained.

This issue is more complicated, however. If DQ-cardinality was the major factor in the production of pragmatic interpretation for quantifier-based SIs, one would certainly expect a difference between the lower-bounding question and the upper-bounding question conditions, such as the difference we found in Experiments 1 and 3. However, one would also expect a much higher level of pragmatic interpretations in the lower-bounding question conditions in Experiments 1 and 3 and in all three conditions in Experiment 2 than what we found. Indeed, the rates of pragmatic interpretations in both lower-bounding question conditions in Experiments 1 and 3 and in all three conditions in Experiment 2 were fairly low (between 40 and 50%). However, in the two upper-bounding question conditions, the rates of pragmatic interpretation were approximately 81% in Experiment 1 and approximately 95% in Experiment 3. These results suggest that DQ-cardinality plays, at most, a minor role in pragmatic interpretation and is certainly far from sufficient to increase its occurrence.

The main trigger of pragmatic interpretations was the presence of the upper-bounding question in the two conditions of Experiments 1 and 3. But, how exactly does the upper-bounding question foster pragmatic interpretation? In Section The Role of Context in the Derivation of GCIs above, we assumed that the upper-bounding question increases the rate of pragmatic responses by increasing the relevance of the contrast between the pragmatic and the semantic interpretations and the salience of DQ-cardinality. The proposition that the upper-bounding question makes DQ-cardinality relevant is rather convincing given the results of the NoNumber condition in Experiment 2. The results suggest that the DQ-cardinality and the number of objects affected are (visually) processed and that participants are aware of whether all or only some objects are affected. However, the majority of participants will only use that piece of information and give a pragmatic answer when they consider the pragmatic answer relevant (i.e., in the upper-bounding condition), as shown by the results of Experiments 1 and 3. This clearly agrees with the prediction of the post-Gricean account (see Section The Role of Context in the Derivation of GCIs above).

In other words, the upper-bounding question does not make DQ-cardinality relevant; rather, DQ-cardinality becomes relevant because the pragmatic answer is relevant. It should be noted that this suggestion is compatible with our assumption regarding DQ-cardinality: DQ-cardinality is a necessary ingredient in the computation of pragmatic interpretation but (contrary to the neo-Gricean prediction) is not sufficient, in and of itself, to increase pragmatic interpretation.

As noted above, both Chierchia's (2013) neo-Gricean account and Sperber and Wilson's (1995) post-Gricean account accept that access to the pragmatic interpretation is context-dependent. What the present results suggest is that the main factor that makes a context upper-bounding is that this context makes the contrast between the two interpretations relevant. In addition, on the view that grammatical mechanisms, while they can be context-dependent, should depend on factual rather than psychological contexts, our results are more consistent with the post-Gricean account than with the neo-Gricean account.

In the present experiment, we have used questions based on the relevance-theoretic view that questions provide the hearer with an indication of how to make his answer relevant to the speaker. This view comes very close to the notion of Question-Under-Discussion (QUD), which, however, is not restricted to questions. This notion can also apply to assertions, as assertions can be characterized relative to which QUD they target. In other words, while questions have the obvious advantage of making the QUD explicit, other types of sentences might play the same role in the building of an upper-bounding context. This suggests that a further direction for research might be the examination of the explicitness of the QUD when participants have to judge the pragmatic felicity of underinformative utterances. We leave this investigation for future research.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## ACKNOWLEDGMENTS

The research leading to these results has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 613465. We also want to express our thanks to the referees, whose remarks and suggestions have helped us improve the paper.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00381

## REFERENCES

Aravind, A., and de Villiers, J. (2014). "Implicit alternatives insufficient for children's scalar implicatures with some," in Poster Presented at the 39th Boston University Conference on Language Development (Boston, MA).

Barner, D., Brooks, N., and Bale, A. (2011). Accessing the unsaid: the role of scalar alternatives in children's pragmatic inference. Cognition 118, 84–93. doi: 10.1016/j.cognition.2010.10.010

Borg, E. (2004). Minimal Semantics. Oxford: Oxford University Press.

Borg, E. (2012). Pursuing Meaning. Oxford: Oxford University Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Dupuy, Van der Henst, Cheylus and Reboul. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Disentangling Metaphor from Context: An ERP Study

Valentina Bambini<sup>1</sup>\*, Chiara Bertini<sup>2</sup>, Walter Schaeken<sup>3</sup>, Alessandra Stella<sup>4</sup> and Francesco Di Russo<sup>4, 5</sup>

<sup>1</sup> Center for Neurocognition, Epistemology and theoretical Syntax (NEtS), Institute for Advanced Study (IUSS), Pavia, Italy, <sup>2</sup> Laboratorio di Linguistica "G. Nencioni", Scuola Normale Superiore, Pisa, Italy, <sup>3</sup> Laboratory of Experimental Psychology, KU Leuven, Leuven, Belgium, <sup>4</sup> Department of Movement, Human and Health Sciences, University of Rome "Foro Italico", Rome, Italy, <sup>5</sup> Istituti di Ricovero e Cura a Carattere Scientifico Santa Lucia Foundation, Rome, Italy

A large body of electrophysiological literature has shown that metaphor comprehension elicits two different event-related brain potential responses, namely the so-called N400 and P600 components. Yet most of these studies test metaphor in isolation, while in natural conversation metaphors do not come out of the blue but are embedded in linguistic and extra-linguistic context. This study aimed at assessing the role of context in the metaphor comprehension process. We recorded EEG activity while participants were presented with metaphors and equivalent literal expressions in a minimal context (Experiment 1) and in a supportive context where the word expressing the ground between the metaphor's topic and vehicle was made explicit (Experiment 2). The N400 effect was visible only in minimal context, whereas the P600 was visible both in the absence and in the presence of contextual cues. These findings suggest that the N400 observed for metaphor is related to contextual aspects, possibly indexing contextual expectations on upcoming words that guide lexical access and retrieval, while the P600 seems to reflect truly pragmatic interpretative processes needed to make sense of a metaphor and derive the speaker's meaning, also in the presence of contextual cues. In sum, previous information in the linguistic context biases toward a metaphorical interpretation but does not suppress the interpretative pragmatic mechanisms that establish the intended meaning.

### Edited by:

Marco Cruciani, University of Trento, Italy

### Reviewed by:

Ina Bornkessel-Schlesewsky, University of South Australia, Australia

Bálint Forgács, Université Paris Descartes, France

### \*Correspondence:

Valentina Bambini valentina.bambini@iusspavia.it

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 07 October 2015 Accepted: 05 April 2016 Published: 03 May 2016

### Citation:

Bambini V, Bertini C, Schaeken W, Stella A and Di Russo F (2016) Disentangling Metaphor from Context: An ERP Study. Front. Psychol. 7:559. doi: 10.3389/fpsyg.2016.00559

Keywords: metaphor, context, pragmatics, neuropragmatics, experimental pragmatics, N400, P600

## INTRODUCTION

While understanding language in the context of communication, comprehenders have to infer the so-called speaker's meaning, i.e., what the speaker intends to communicate, which is vastly underdetermined by the literal meaning of words and sentences. The speaker's intended meaning is the result of a pragmatic inference exploiting world knowledge, the context, and the lexical meaning of the expression. Metaphor offers a major example of the gap between the literal meaning and the speaker's meaning, and describing how this gap is bridged in the mind/brain of language users is one of the major concerns of experimental pragmatics and neuropragmatics (Bambini, 2010; Bambini and Bara, 2012; Hagoort and Levinson, 2014; Grossman and Noveck, 2015).

With respect to processing, metaphor has been studied mainly in relation to the steps of comprehension. Positions are traditionally divided into two main models according to whether the access to figurative meaning is considered indirect, i.e., passing through a first stage where the literal meaning is represented, or direct. The indirect access model is linked to the view of classic scholars in pragmatics, namely Grice and Searle, and supported by evidence of longer reaction times for metaphorical as compared to literal expressions (Janus and Bever, 1985). Conversely, the direct access view claims that, with appropriate context, people take no longer to understand metaphors than to understand comparable literal language (Gibbs, 1994). Somewhere in between, the Graded Salience Hypothesis claims that direct access is influenced by the degree of salience of the stimuli (Giora, 2003).

When the event-related potential (ERP) electrophysiological technique started to be used to investigate how metaphor comprehension unfolds over time, the issue of the processing steps was revived in terms of ERP components (Bambini and Resta, 2012; Rataj, 2014). Two components have been commonly reported for metaphors, namely a centro-parietal negativity (N400) and a later parietal positivity (P600/LPC) (Pynte et al., 1996; Coulson and Van Petten, 2002; De Grauwe et al., 2010; Schmidt-Snoek et al., 2015). The functional roles of these components in language processing are diverse. The N400 is generally linked to meaning processing, in relation to a plethora of stimulus types (Kutas and Federmeier, 2011). The P600, originally linked to syntactic reanalysis, is nowadays assumed to reflect also semantic and interpretation processes, such as sentence-level interpretation conflicts (Frenzel et al., 2011) and integration in the wider discourse model and communicative context (Brouwer et al., 2012).

When considered with respect to metaphor, defining the functional significance of these ERP components becomes even more complex, and entangled with the debate over the direct vs. indirect account. Most studies reported a biphasic N400-P600 effect, assumed to link different stages in conceptual mapping (Coulson and Van Petten, 2002; De Grauwe et al., 2010). Other authors reported an N400 response only (Pynte et al., 1996), or a P600 only, described as a form of reanalysis stage (Yang et al., 2013). Overall, studies with a biphasic pattern or a later effect tend to favor the indirect view, while studies focusing on the N400 argue against the indirect model. In addition, one important result evidenced in the literature is that the ERP components elicited by metaphor are modulated by the degree of conventionality of the expression, also known as familiarity (Rataj, 2014). For instance, novel metaphors seem to elicit larger N400 amplitude than conventional metaphors (Arzouan et al., 2007a; Lai et al., 2009), which might suggest an indirect access for the former and a direct access for the latter, in line with the Graded Salience Hypothesis. Moreover, it seems that conventionality affects the type of processes indexed in the N400 (Lai and Curran, 2013). This complex scenario casts doubt on the specificity of the effects reported for metaphor, by highlighting the need to carefully control for confounding variables such as familiarity, and definitely leaves the issue of direct/indirect access to metaphorical meaning unsolved.

It is indeed very likely that standard comprehension tasks like those employed in the studies above, while allowing different phases of processing to be disentangled, cannot answer the question of whether the literal meaning plays a role. A recent study employed masked priming during EEG to explore the issue for metaphor and metonymy (Weiland et al., 2014). This technique proved useful to tap into early phases of processing (Schumacher et al., 2012), and might shed light on the hypothesis of an early literal stage. Results showed that, when the literal meaning of metaphorically used words is primed (e.g., priming hyenas with furry in the metaphor Those lobbyists are hyenas), the amplitude of the N400 is reduced with respect to the unprimed condition, thus facilitating rather than interfering with the comprehension process. This speaks in favor of the involvement of literal meaning aspects in the N400 phase, and supports the indirect view, or at least the idea of the lingering of the literal meaning in early phases, consistent with recent theoretical proposals (Carston, 2010b) and behavioral priming studies (Rubio Fernandez, 2007).

One important issue when speaking of metaphor and pragmatics is context. Context is constitutive in pragmatics, where it is assumed to influence the comprehension process by adjusting meanings and shaping inferences. In natural use of language, metaphors occur in the context of a conversation, exploiting background knowledge as well as the previous discourse shared by speakers to base the non-literal use. However, electrophysiological studies have rarely considered the issue of context with the aim of explicitly assessing its role. Pynte et al. (1996) varied the contextual support in the experimental stimuli, comparing familiar metaphors with supportive context and unfamiliar metaphors with non-supportive context, but this study does not help in disentangling the role of context in the comprehension of metaphorical meanings, as the manipulation mixed familiarity and context. Interesting hints into the role of context are provided in Yang et al. (2013). Through a word-to-sentence matching paradigm, the authors compared metaphorical and literal sentences with different probe words, which might work as contextual priming. The results evidenced a modulation of the P600, in the absence of N400 effect (Yang et al., 2013). Apart from this, stimuli employed in the literature are mostly limited to metaphors in the "A is B" sentence form or metaphorical word pairs with no supportive cues.

Other information comes from research on metonymy, where the manipulation of context was shown to directly influence the N400. The resolution of metonymic shift (e.g., The ham sandwich wants to pay) evoked a biphasic N400-LPC pattern when presented in minimal context (Schumacher, 2014), while only an LPC effect is visible when the linguistic context is supportive (e.g., already activating the restaurant semantic field) (Schumacher, 2011). Leaving aside the case of non-literal language, the issue of context is indeed the topic of a large body of investigation in the field, in particular with respect to the N400 component (van Berkum, 2009; Schumacher, 2012). The N400 seems to be sensitive to different types of context, including sentence level information (Hoeks et al., 2004; Federmeier et al., 2007) as well as larger discourse (Nieuwland and van Berkum, 2006), and non-linguistic information such as world knowledge (Hagoort et al., 2004) and speaker's identity (van Berkum et al., 2008). Recently, the literature hosted a debate between different views of the N400 (Lau et al., 2008; Hoeks and Brouwer, 2014). Some assume that the N400 reflects lexical access (Lau et al., 2009), others link the N400 to predictive mechanisms (Federmeier et al., 2007; Van Petten and Luka, 2012). In both cases context is crucial: in the lexical view context facilitates access and retrieval of stored information, in the prediction view context supports the ease of pre-activation and integration of meaning.

The P600/LPC has also been described as context-sensitive. First, it is reported for several typically pragmatic phenomena that depend on context, such as irony (Regel et al., 2011; Spotorno et al., 2013), indirect requests (Coulson and Lovett, 2010), jokes (Coulson and Kutas, 2001), as well as ambiguous idiom processing (Canal et al., 2015), question/answer pairs and other aspects of conversation and discourse (Hoeks et al., 2013; Hoeks and Brouwer, 2014). Second, context-based mechanisms such as expectation (Davenport and Coulson, 2011; Van Petten and Luka, 2012) and integration (Brouwer and Hoeks, 2013; Hoeks and Brouwer, 2014) have been advocated to describe the P600 as well.

Considering the literature on metaphor and the literature on context, questions arise as to whether context specifically affects the N400 observed for metaphor, and whether the P600 is also affected. The present study aims at exploring these issues by disentangling the benefit of linguistic context from the global process of understanding the speaker's meaning conveyed in a metaphor. To this purpose, we ran two experiments where metaphors and corresponding literal sentences were presented in a minimal context (Experiment 1) and in a supportive context (Experiment 2). Supportive context was represented by the metaphor's ground, i.e., a word that expresses the relation between the metaphor's topic (the subject of the metaphor) and vehicle (the term used metaphorically). For instance, in the metaphor "Mary is a gem," Mary is the topic, gem is the vehicle, and the ground is that Mary is precious or valued (End, 1986). This type of contextual information, which resembles natural occurrences of metaphors where the figurative use arises based on elements in the previous discourse or in the communicative situation, was already used in behavioral paradigms, producing a facilitation of the comprehension process (Gildea and Glucksberg, 1983). In order to avoid confounding effects due to familiarity, we employed non-lexicalized metaphors, and we checked for familiarity as a potentially confounding variable. Based on previous ERP studies on metaphor and on the literature on context effects, our prediction was twofold: (i) we expected to replicate the biphasic patterns observed in several studies for metaphors in minimal context; (ii) we expected context to reduce the N400 and possibly affect the P600. Results could also shed light on the functional characteristics of the components.

As a second aim of the study, we addressed the issue of localization. The source of the brain response to metaphor comprehension has been widely discussed in the imaging literature. While early studies highlighted the role of the right hemisphere (Bottini et al., 1994), later studies failed to report a right hemisphere advantage (Rapp et al., 2007) or evidenced a bilateral pattern (Bambini et al., 2011). Recent meta-analyses support the bilateral distribution of activation foci (Bohrn et al., 2012; Rapp et al., 2012). As in the case of the ERP response, familiarity plays an important role also in the localization of the processes (Schmidt and Seger, 2009; Forgács et al., 2012, 2014). EEG data are in line with the bilateral view (Coulson and Van Petten, 2007). Specifically, the N400 effect for metaphors was found to be localized in the bilateral temporal cortex (Arzouan et al., 2007b). In the present study, we ran a reconstruction of the intracortical ERP origin to further explore the source of the effects, and to compare the results with the previous literature.

## EXPERIMENT 1: MINIMAL CONTEXT

## Methods

### Participants

Thirteen healthy volunteers (6 female; mean age = 25.92, SD = 3.75) took part in the study. All participants were monolingual native speakers of Italian. They were all undergraduate or graduate students with a medium-high educational level (16 years of schooling on average). All participants were right-handed. Handedness preference was tested with the 10-item version of the Edinburgh Handedness Inventory (Oldfield, 1971). Participants had an average laterality quotient of 87 (of 100 for complete right-handedness; range 71–100). All participants had normal or corrected-to-normal vision, and reported no serious psychological or physical health problems. The experimental protocol was approved by the local ethical committee and was performed in accordance with the Declaration of Helsinki. All participants gave written informed consent.

### Stimuli

Stimuli were constructed by expanding the set used in a previous neuroimaging study on metaphor comprehension (Bambini et al., 2011). Sixty-four nouns functioned as target words (e.g., "squalo," shark). Nouns were matched for the main psycholinguistic variables, i.e., frequency, word length, orthographic difficulty. Each noun was associated with two other nouns, once literally (e.g., "squalo"-"pesce," tr. shark-fish) and once metaphorically ("squalo"-"avvocato," tr. shark-lawyer). Pairs were embedded into two-sentence passages with a minimal context, e.g., literal "Sai che cos'è quel pesce? Uno squalo." (tr. Do you know what that fish is? A shark.) vs. metaphor "Sai che cos'è quell'avvocato? Uno squalo." (tr. Do you know what that lawyer is? A shark.), for a total of 128 passages (64 metaphorical, 64 literal). This passage structure was chosen in order to have an equal number of words in Experiment 1 and Experiment 2 (see below).

All selected metaphors were non-lexicalized, i.e., they were not listed as idiomatic expressions of Italian. However, given the important role of familiarity in processing metaphor in general, it is possible that, even for non-lexicalized metaphors, there are differences in perceived frequency, with impact on ERP patterns. For this reason, we decided to treat familiarity as a possible confounding variable and to control for the familiarity of the metaphorical expressions. To this purpose, we divided the metaphorical set into familiar and non-familiar metaphors, based on a pre-test run on 16 participants matched for age and education with the participants of the ERP study. Participants were presented with a list of metaphors and had to classify each of them as either familiar or non-familiar. Of the 64 metaphorical passages used in this study, 32 were judged as familiar (average agreement 0.78) and 32 were judged as non-familiar (average agreement 0.76). In analyzing behavioral and EEG data, familiar and non-familiar metaphors will be compared prior to the main metaphor vs. literal comparison, to control for the presence of effects related to familiarity.

Cloze probability was also pre-tested through a completion ("cloze") test on a sample of 15 participants matched for age and education with the participants of the ERP study. The noun pairs used to build the two-sentence passages were presented in a single sentence form, truncated before the last word (e.g., for the literal condition, That fish is a... and, for the metaphorical condition, That lawyer is a...). Mean cloze probability was 0.11 (SD = 0.17) for literal endings and 0.01 (SD = 0.03) for metaphorical endings, with a significant difference between the two conditions (paired t-test, p < 0.001).
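The cloze measure reported above can be illustrated with a minimal sketch. It assumes the usual definition of cloze probability as the proportion of participants who complete the truncated sentence with the target word; the function name and the example completions are hypothetical and are not taken from the pre-test data.

```python
def cloze_probability(completions, target):
    """Proportion of completions that match the target word (case-insensitive)."""
    if not completions:
        raise ValueError("at least one completion is required")
    normalized = [c.strip().lower() for c in completions]
    return normalized.count(target.strip().lower()) / len(normalized)


# Hypothetical completions from 15 pre-test participants for "That fish is a ..."
# (invented for illustration; the actual responses are not reported in the text).
responses = ["squalo"] * 3 + ["tonno"] * 7 + ["salmone"] * 5
p_literal = cloze_probability(responses, "squalo")
```

Averaging such per-item proportions over the 64 items of a condition would yield the condition-level mean cloze probability.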

Passages were divided in two lists so that each participant saw a target noun only once, either in the literal or in the metaphorical condition. In addition, 32 filler passages per list were included, containing literal passages of comparable structure.

### Task

Metaphor comprehension was given as an implicit task and participants were not informed about the presence of metaphors in the stimuli. In order to maintain attention, participants were explicitly instructed to perform an adjective matching task following the comprehension of the target stimuli. Two adjectives were presented after each passage, one on the right, the other on the left of the screen, one on-topic with respect to the preceding passage, the other off-topic. Participants were instructed to select the adjective that better matched the preceding passage, by pressing the button in their right or left hand. For each pair of passages (literal and metaphorical, split in the two lists), the same adjective pair was used, so that the materials employed in the task were constant across conditions (e.g., for the metaphorical and the literal passages built upon the noun "shark", the adjective pair was "feroce", tr. ferocious, vs. "geografico", tr. geographical).

### Procedure

During EEG recording, participants were comfortably seated in a dimly lit sound-attenuated room while stimuli were presented in binocular vision on a video monitor at a viewing distance of about 80 cm. Written stimuli were presented in lowercase white font on a dark background. The task sequence was controlled by a PC running Presentation software (Neurobehavioral Systems, http://www.neurobehavioralsystems.com). Each trial started with a fixation cross presented for 500 ms in the center of the screen, followed by the first part of the passage for 1300 ms. A pretest showed that this time was sufficient to read and understand the passage. Then, the determiner and the target noun were presented, one at a time for 400 ms each, preceded by 400 ms of blank screen in both conditions (metaphorical and literal). Target nouns were presented together with a dot to indicate the end of the passage. Next, the screen remained black for 1500 ms, and then the adjective pair appeared, with up to 2500 ms allocated for response. The buttons used to indicate the correct adjective (left or right hand) were counterbalanced across subjects. Thereafter, the screen remained blank until the next trial, resulting in a total trial duration of 9200 ms.
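The trial timeline described above can be tallied in a short sketch. It rests on two assumptions that the text does not settle: a single 400 ms blank screen precedes the determiner (rather than one blank before each word), and the response window is counted at its full 2500 ms. The duration of the final blank interval is therefore derived here, not reported; all names are hypothetical.

```python
# Trial events and durations (ms) as described in the Procedure.
# Assumptions: one 400 ms blank before the determiner only, and the
# response window counted at its full 2500 ms.
TRIAL_EVENTS = [
    ("fixation cross", 500),
    ("first part of passage", 1300),
    ("blank screen", 400),
    ("determiner", 400),
    ("target noun", 400),
    ("black screen", 1500),
    ("adjective pair (response window)", 2500),
]
TOTAL_TRIAL_MS = 9200  # total trial duration reported in the text


def onsets(events):
    """Return (name, onset_ms) pairs relative to trial start."""
    t, out = 0, []
    for name, duration in events:
        out.append((name, t))
        t += duration
    return out


def final_blank_ms(events, total_ms):
    """Blank interval left after the listed events (derived, not reported)."""
    return total_ms - sum(duration for _, duration in events)


target_onset = dict(onsets(TRIAL_EVENTS))["target noun"]
remaining = final_blank_ms(TRIAL_EVENTS, TOTAL_TRIAL_MS)
```

Under these assumptions the target noun appears 2600 ms after trial onset and about 2200 ms of blank screen separate the end of the response window from the next trial.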

Response time (RT) and response accuracy (percentage of correct responses) in the explicit task following the presentation of the target stimuli were recorded. In a preliminary analysis, RTs of the metaphors were submitted to a one-way ANOVA with Familiarity as the independent factor (2 levels, familiar vs. non-familiar). Next, RTs were submitted to a one-way ANOVA with Metaphoricity as the independent factor (2 levels, metaphor vs. literal). Accuracy data were analyzed non-parametrically. First, familiar vs. non-familiar metaphors were analyzed through Wilcoxon signed-rank test; next the same test was used to compare metaphor vs. literal stimuli. The overall alpha level was fixed at 0.05.

### Electrophysiological Recording and Analysis

EEG was recorded using a BrainVision™ system with 64 electrodes referenced to the left mastoid (Di Russo and Pitzalis, 2014). Horizontal eye movements were monitored with a bipolar recording from electrodes at the left and right outer canthi. Blinks and vertical eye movements were recorded with an electrode below the left eye, which was referenced to site Fp1. Electrode impedances were kept below 5 kΩ. The EEG from each electrode site was digitized at 250 Hz with an amplifier band-pass of 0.01–60 Hz including a 50 Hz notch filter and was stored for offline averaging. The EEG was segmented for each target stimulus giving epochs of 1000 ms (from −200 to +800 ms relative to the target noun). Computerized artifact rejection was performed prior to signal averaging in order to discard epochs in which deviations in eye position, blinks, or amplifier blocking occurred. On average, 6.5% of the trials were rejected. Blinks were the most frequent cause of rejection. ERPs were averaged separately according to the conditions (metaphor vs. literal) with respect to a 100 ms pre-stimulus baseline (in both conditions). To further reduce high-frequency noise, the averaged ERPs were low-pass filtered at 30 Hz.
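The segmentation and baseline steps described above can be sketched for a single channel as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the signal is a plain list of amplitude values sampled at 250 Hz, the target onset is given as a sample index, and artifact rejection and filtering are left out. Function and variable names are hypothetical.

```python
SAMPLING_RATE_HZ = 250                  # digitization rate reported in the text
EPOCH_START_MS, EPOCH_END_MS = -200, 800
BASELINE_MS = 100                       # pre-stimulus baseline length


def ms_to_samples(ms, rate_hz=SAMPLING_RATE_HZ):
    """Convert a duration in ms to a whole number of samples."""
    return ms * rate_hz // 1000


def extract_epoch(signal, onset_index):
    """Cut one epoch around the target onset and subtract the baseline mean."""
    start = onset_index + ms_to_samples(EPOCH_START_MS)
    stop = onset_index + ms_to_samples(EPOCH_END_MS)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    zero = onset_index - start          # index of t = 0 ms inside the epoch
    baseline = epoch[zero - ms_to_samples(BASELINE_MS):zero]
    baseline_mean = sum(baseline) / len(baseline)
    return [value - baseline_mean for value in epoch]


# Synthetic ramp signal, used only to show the indexing.
ramp = [float(i) for i in range(1000)]
epoch = extract_epoch(ramp, onset_index=500)
```

At 250 Hz each epoch holds 250 samples (4 ms apart), and the baseline covers the 25 samples immediately preceding target onset.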

All statistical analyses were performed on the mean ERP amplitudes in the different experimental conditions. On the basis of previous studies on the N400 and P600 in similar contexts (Arzouan et al., 2007a; De Grauwe et al., 2010) and visual inspection of the spatiotemporal ERP patterns, we defined two different time windows (320–440 ms for the N400 and 550–700 ms for the P600) and 25 electrodes (see **Table 1**) that were submitted to two analyses. In a preliminary one-way ANOVA of the ERP amplitudes for the metaphor condition, performed on each electrode site and each time window, Familiarity was the independent variable (2 levels, familiar vs. non-familiar). Next, for each electrode site and each time window, a one-way ANOVA was performed on all items, with Metaphoricity as the independent factor (2 levels, metaphor vs. literal), adjusting for nonsphericity with the Greenhouse-Geisser epsilon coefficient. In all conditions t = 0 ms marked the onset of the target word. The overall alpha level was fixed at 0.05.
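The window-based amplitude measure and the two-level comparison can be sketched as follows. The sketch assumes baseline-corrected epochs of 250 samples starting at −200 ms (4 ms per sample) and treats Metaphoricity as a within-subject factor, in which case the F statistic for two levels equals the squared paired t statistic with (1, n − 1) degrees of freedom. The names are hypothetical and the toy values are not data from the study.

```python
import math
from statistics import mean, stdev

WINDOWS_MS = {"N400": (320, 440), "P600": (550, 700)}
SAMPLE_STEP_MS = 4       # 250 Hz sampling
EPOCH_START_MS = -200    # first sample of each epoch


def window_mean(epoch, window_ms):
    """Mean amplitude over the samples whose latency falls inside the window."""
    lo, hi = window_ms
    values = [v for i, v in enumerate(epoch)
              if lo <= EPOCH_START_MS + i * SAMPLE_STEP_MS <= hi]
    return mean(values)


def paired_f(metaphor, literal):
    """F statistic for a two-level within-subject factor (squared paired t)."""
    diffs = [m - l for m, l in zip(metaphor, literal)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t * t, (1, n - 1)


# Toy per-participant mean amplitudes (invented values, three participants).
f_value, dof = paired_f([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
```

With the 13 participants of Experiment 1 the degrees of freedom would be (1, 12). With only two levels, sphericity holds trivially, so the Greenhouse-Geisser correction leaves this comparison unchanged.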

To rule out the possibility that one or two subjects unduly influenced the results, we performed a sensitivity analysis, comparing the previous results with those obtained after randomly removing two participants, a procedure that was carried out twice.
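The participant-removal step of this sensitivity analysis can be sketched as below. The seed, the function name, and the participant identifiers are assumptions made for reproducibility of the illustration; the text does not report how the random selection was implemented.

```python
import random


def leave_two_out_subsets(participant_ids, n_repeats=2, seed=0):
    """Return n_repeats participant lists, each with two randomly chosen
    participants removed (seed fixed only to make the sketch reproducible)."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_repeats):
        dropped = set(rng.sample(participant_ids, 2))
        subsets.append([p for p in participant_ids if p not in dropped])
    return subsets


# Hypothetical identifiers for the 13 participants of Experiment 1.
participants = list(range(1, 14))
subsets = leave_two_out_subsets(participants)
```

Each reduced sample would then be submitted to the same analyses as the full sample, and the outcomes compared.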

### TABLE 1 | Experiment 1 (minimal context).

Mean amplitude (µV) for the metaphorical (M) and literal (L) conditions in the N400 and P600 time windows on a sample of relevant electrodes, with significance values for the Metaphoricity factor [F(1, 12); \*p < 0.05].

Three-dimensional topographical mapping and estimation of the intracranial sources generating the effects on the N400 and the P600 were carried out using the BESA 2000 software (MEGIS Software GmbH, Gräfelfing, Germany). We used the spatiotemporal source analysis of BESA that estimates location, orientation, and time course of equivalent dipolar sources by calculating the scalp distribution obtained for a given model (forward solution). This distribution was then compared to that of the actual ERP. Interactive changes in source location and orientation lead to minimization of residual variance between the model and the observed spatiotemporal ERP distribution. The three-dimensional coordinates of each dipole in the BESA model were determined with respect to the Talairach axes. In these calculations, BESA assumed a realistic approximation of the head (based on the MRI of 24 subjects). The possibility of interacting dipoles was reduced by selecting solutions with relatively low dipole moments with the aid of an "energy" constraint (weighted 20% in the compound cost function, as opposed to 80% for the residual variance). The optimal set of parameters was found in an iterative manner by searching for a minimum in the compound cost function. Latency ranges for fitting were chosen (see above) to minimize overlap between the two topographically distinctive components. The accuracy of the source model was evaluated by measuring its residual variance as a percentage of the signal variance, as described by the model, and by applying residual orthogonality tests (ROT) (Böcker et al., 1994). The resulting individual time series for the dipole moments (the source waves) were subjected to an orthogonality test, referred to as a source wave orthogonality test (SOT) (Böcker et al., 1994). For all t-statistics, the alpha level was fixed at 0.05.

In order to further explore possible confounding effects of familiarity, we performed an additional analysis with the three conditions (non-familiar metaphors, familiar metaphors, literal). For the same time windows and the same electrode sites used in the main analysis above (metaphor vs. literal), a one-way ANOVA with 3 levels for the Metaphoricity factor was run (non-familiar metaphors, familiar metaphors, literal). Two planned contrasts were run, one between non-familiar and familiar metaphors, and one between the two metaphorical conditions together (familiar metaphors + non-familiar metaphors) and the literal condition.
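The two planned contrasts can be written as weight vectors over the three conditions. The sketch below is one common way to test such contrasts in a within-participant design, by computing a contrast score per participant and testing it against zero. It is an illustration under that assumption, not necessarily the procedure used here; the names `CONTRASTS` and `contrast_t` and the column order are hypothetical.

```python
import numpy as np

# Column order assumed: non-familiar metaphors, familiar metaphors, literal.
CONTRASTS = {
    "familiarity": np.array([1.0, -1.0, 0.0]),    # non-familiar vs. familiar
    "metaphoricity": np.array([0.5, 0.5, -1.0]),  # both metaphor types vs. literal
}

def contrast_t(amplitudes, weights):
    """One-sample t-test on per-participant contrast scores.

    amplitudes: array of shape (n_participants, 3) holding the mean
    amplitude per condition at one electrode and time window.
    Returns the t value and its degrees of freedom.
    """
    scores = amplitudes @ weights
    n = scores.shape[0]
    t = scores.mean() / (scores.std(ddof=1) / np.sqrt(n))
    return t, n - 1
```

The weights in each contrast sum to zero, and the two contrasts are orthogonal, so they partition the condition effect into a familiarity component and a metaphoricity component.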

## Results

### Behavioral Results

Within the metaphor set, RTs in the adjective matching task for familiar and non-familiar metaphors, respectively 1087 ms (SDOM ± 27) and 1098 ms (SDOM ± 26), did not differ significantly [F(1, 17) = 0.015; p = 0.904], which legitimated pooling together the two conditions. Globally, RTs were 1093 ms (SDOM ± 19) for the metaphor condition and 1091 ms (SDOM ± 18) for the literal condition, with no statistically significant differences [F(1, 17) = 0.008; p = 0.930]. Accuracy did not significantly vary for familiar and non-familiar metaphors, respectively 92.50 and 90.63% (Wilcoxon's z = −0.359, p = 0.719), which legitimated pooling together the two conditions. Accuracy was high in both conditions (91.56% for the metaphor and 90.94% for the literal condition), with no statistically significant differences (Wilcoxon's z = −0.178, p = 0.859).

### EEG Results

**Figure 1** shows the grand-average ERP for the metaphor and literal conditions over representative electrodes. The earliest detectable ERP component was the visual P1 over bilateral parieto-occipital areas peaking at about 110 ms. The N1 component peaked at about 170 ms over lateral parieto-occipital areas (not shown). The P2 component peaked at about 240 ms over bilateral central-parietal areas. The N400 peaked at about 390 ms over medial central areas and the P600 peaked at about 620 ms over medial parietal areas. The early sensorial components (P1, N1, and P2) were identical in the two conditions, but starting from 300 ms the two waveforms started to diverge showing larger N400 and P600 for the metaphor condition. The shaded gray areas indicate the time windows used for statistical analyses.

The preliminary ANOVA for the Familiarity factor did not yield any significant result on any of the electrodes considered, in either of the two time windows (all ps > 0.05), which legitimated pooling together the two metaphorical conditions (familiar and non-familiar metaphors) and comparing them with the literal condition. The main ANOVA between the metaphor and the literal conditions revealed a significant effect of condition in the N400 time window on fronto-central, central, and centro-parietal sites. The main ANOVA also revealed a significant effect of condition in the P600 time window on parietal and parieto-occipital sites. See **Table 1** for mean amplitudes and significance values.

The sensitivity analysis yielded the same effects, i.e., an N400 effect over fronto-central, central and centro-parietal sites, and a P600 over parietal and parieto-occipital sites. See Supplementary Tables 1.1 and 1.2 in Supplemental Data Sheet.

**Figure 2** shows the scalp topography of the differential ERP waveform obtained by subtracting the literal from the metaphor condition. The N400 effect had a medial central distribution spreading over the two hemispheres, which was bilaterally localized within the superior temporal lobe (BA 22). The P600 effect had a medial parietal distribution more spread out over the right hemisphere, which was localized in the right inferior temporal lobe (BA 20).

The additional analysis with the three conditions (non-familiar metaphors, familiar metaphors, literal) confirmed the main analysis. A main effect was visible in the N400 time window on fronto-central, central and centro-parietal electrodes and in the P600 window on parieto-occipital sites. Planned contrasts showed that the effect is triggered by the comparison between the two metaphorical conditions together (familiar + non-familiar) vs. the literal condition, rather than by familiarity. See Supplementary Tables 2.1 and 2.2 in Supplemental Data Sheet.

## Discussion

Behavioral data showed that participants easily performed the adjective matching task, with no differences between literal and metaphorical stimuli. This is in line with previous studies employing the same task (Bambini et al., 2011) and suggests that subjects correctly processed the passages. The comparison between the ERP waveforms for literally and metaphorically used words in minimal context showed a biphasic pattern, with higher N400 and P600 amplitudes for metaphors as compared to literal stimuli. These findings were confirmed by the sensitivity analysis. The biphasic pattern observed here is compatible with previous studies employing metaphors in the "A is B" form and in minimal context (De Grauwe et al., 2010; Weiland et al., 2014). The topography of the N400 shows a bilateral distribution over centro-medial sites, localized within the superior temporal lobes, in line with previous studies on the N400 in general (Van Petten and Luka, 2006; Kutas and Federmeier, 2011). The P600 appears more spread out over the right hemisphere, localized within the right inferior temporal lobe. The topographic distribution of the P600/LPC is still a matter of debate in the literature (Arzouan et al., 2007a; Yang et al., 2013). Supporting evidence for our data can be found in fMRI data on metaphor processing with the same materials (Bambini et al., 2011), pointing to a greater involvement of right temporal areas.

We also reported the absence of effects related to familiarity, both in the preliminary analysis comparing familiar and non-familiar metaphors, and in the analysis on all items. On the one hand, this supports the idea that the biphasic pattern observed for metaphor was not affected by confounding effects due to familiarity. On the other hand, this might seem in contrast with previous literature reporting strong familiarity modulation of the ERP response (Arzouan et al., 2007a; Lai et al., 2009). However, this discrepancy might be explained by noting that previous studies contrasted highly conventional and highly novel metaphors, while in our study all metaphors are non-lexicalized, although associated with different judgments of familiarity in the pre-test.

Although both the N400 and the P600 effects reflect pragmatic processing, in this first experiment it is not possible to specifically weigh the role of context, as it might shape the whole process of metaphor comprehension. In order to disentangle the role of context, we conducted a second experiment, embedding the target words in supportive linguistic information, expressing the ground between the metaphor's topic and vehicle.

## EXPERIMENT 2: SUPPORTIVE CONTEXT

## Methods

### Participants

Thirteen healthy volunteers (7F; mean age = 26.00 years, SD 3.70) took part in the study. All participants were monolingual native speakers of Italian. They were all undergraduate or graduate students with a medium-high educational level (16 years of schooling on average). All participants were right handed. Handedness preference was tested with the 10-item version of the Edinburgh Handedness Inventory (Oldfield, 1971). Participants had an average laterality quotient of 86 (of 100 for complete right-handedness; range 69–100). All participants had normal or corrected-to-normal vision, and reported no serious psychological or physical health problems. The experimental protocol was approved by the local ethical committee and was performed in accordance with the declaration of Helsinki. All participants gave written informed consent.

### Stimuli

The same 64 noun pairs employed in Experiment 1 were used and embedded in a supportive context. Pairs (e.g., shark-fish and shark-lawyer) were inserted into two-sentence passages where the link between the noun and its associate was made explicit. In the case of metaphor, this corresponded to the so-called ground, i.e., the property that bonds the metaphor's topic and the metaphor's vehicle. The structure of the passages was such that the overall number of words did not vary with respect to Experiment 1 (i.e., 8 words). Literal passages were of the type: "Quel pesce è molto aggressivo. È uno squalo." (tr. That fish is really aggressive. It is a shark.) and metaphorical passages were of the type: "Quell'avvocato è molto aggressivo. È uno squalo." (tr. That lawyer is really aggressive. He is a shark.), for a total of 128 passages. A pre-test of cloze probability was run on 14 participants matched for age and education to the participants of the ERP experiment, by showing the literal and the metaphorical passages truncated before the last word (e.g., for the literal condition, That fish is really aggressive. It is a... and, for the metaphorical condition, That lawyer is really aggressive. He is a...). Cloze probability was 0.35 (SD = 0.28) for literal passages and 0.12 (SD = 0.16) for metaphorical passages, with a significant difference between the two conditions (paired t-test, p < 0.001). Although still in the range of values classified as low contextual constraint (Kutas and Hillyard, 1984), these cloze probability values differed significantly from the values for the literal and metaphorical endings in Experiment 1 (paired t-test, p < 0.001). This shows that adding the link between the noun and its associate successfully increased context-based expectations both for literal and metaphorical conditions in Experiment 2. Passages were divided into two lists so that each participant saw a target noun only once.
In addition, 32 filler passages per list were included, containing literal passages of comparable structure.
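As an illustration of the cloze measure used in the pre-test, the sketch below computes the cloze probability of one passage as the proportion of respondents who complete the truncated passage with the target word. The function name, the normalization of responses (trimming and lowercasing), and the example responses are assumptions for illustration, not the scoring procedure or the data of the pre-test.

```python
def cloze_probability(completions, target):
    """Proportion of completions that match the target word."""
    normalized = [c.strip().lower() for c in completions]
    return normalized.count(target.strip().lower()) / len(normalized)

# Hypothetical responses from 14 raters to "That fish is really aggressive. It is a..."
responses = ["squalo"] * 5 + ["piranha"] * 6 + ["barracuda"] * 3
print(cloze_probability(responses, "squalo"))  # 5 of 14 raters produced the target
```

Averaging this value over all passages of a condition gives a condition-level cloze probability of the kind reported above.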

### Task

As in Experiment 1, participants were asked to perform an adjective matching task following the presentation of the target stimuli.

## Procedure

The same as for Experiment 1.

## Electrophysiological Recording and Analysis

Data were recorded as in Experiment 1. For the analysis of the ERP component, we used the same time windows selected in Experiment 1 (320–440 and 550–700 ms) to allow for the comparison of the results. As for Experiment 1, we conducted a preliminary ANOVA for Familiarity (familiar vs. non-familiar metaphors), a main ANOVA (metaphor vs. literal conditions) and a source analysis. Likewise, we also performed a sensitivity analysis and an additional analysis with the three conditions (non-familiar metaphors, familiar metaphors, literal).
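A minimal sketch of how a mean amplitude can be extracted in the two time windows is given below. The window boundaries are those stated above; the array layout, the inclusive treatment of the boundaries, and the names `WINDOWS_MS` and `mean_amplitude` are assumptions for illustration and do not describe the software pipeline actually used.

```python
import numpy as np

# Time windows from the text, in milliseconds after target onset.
WINDOWS_MS = {"N400": (320, 440), "P600": (550, 700)}

def mean_amplitude(erp, times_ms, window):
    """Average the ERP over the samples that fall inside a time window.

    erp: array whose last axis is time, e.g. (n_electrodes, n_samples).
    times_ms: 1-D array with the latency of each sample.
    window: (start, end) in ms, both boundaries included.
    """
    start, end = window
    mask = (times_ms >= start) & (times_ms <= end)
    return erp[..., mask].mean(axis=-1)
```

Applying the function to each participant's average waveform per condition yields one value per electrode and window, which can then enter the ANOVAs.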

Moreover, in order to directly compare the findings of Experiments 1 and 2, we conducted an additional cross-experiment analysis including Context as a between-participants factor. For a similar approach see Tune et al. (2014).

## Results

### Behavioral Results

RTs in the adjective matching task for familiar metaphors (1044 ms, SDOM ± 22) and non-familiar metaphors (1043 ms, SDOM ± 22) did not differ significantly [F(1, 18) = 0.000; p = 0.986], which legitimated pooling together the two conditions. RTs were 1044 ms (SDOM ± 16) for the metaphor condition and 1037 ms (SDOM ± 17) for the literal condition, with no significant differences [F(1, 18) = 0.008; p = 0.982]. Accuracy was 97.50% for familiar metaphors and 96.88% for non-familiar metaphors, with no significant differences (z = −0.333, p = 0.739), which legitimated pooling together the two conditions. Accuracy was 97.19% for metaphors and 95.31% for the literal condition, with no statistically significant differences (z = −1.403, p = 0.161).

### EEG Results

**Figure 3** shows the grand-average ERP for the metaphor and literal conditions over representative electrodes. The ERP components were similar to those in Experiment 1 except for the N400, which was almost the same in the two conditions (metaphorical and literal). The shaded gray areas indicate the time windows used for statistical analyses.

The preliminary ANOVA on Familiarity did not yield any significant effect on any electrode site in any time window (all ps > 0.05), which legitimated pooling together familiar and non-familiar metaphors. In the main ANOVA on Metaphoricity (metaphor vs. literal), no significant effects were observed in the N400 time window. By contrast, in the P600 time window the ANOVA yielded a significant effect of Metaphoricity on frontal, central, and parietal electrodes. See **Table 2** for mean amplitudes and significance values. The sensitivity analysis yielded the same results, with no N400 effects and a P600 effect on frontal, central and parietal electrodes. See Supplementary Tables 1.3 and 1.4 in Supplemental Data Sheet.

Mean amplitude (µV) for the metaphorical (M) and literal (L) conditions in the N400 and P600 time windows on a sample of relevant electrodes, with significance values for the Metaphoricity factor [F(1, 12); \*p < 0.05; \*\*p < 0.01].

**Figure 4** shows the scalp topography of the differential ERP waveform obtained from the subtraction of the metaphor minus the literal condition. The P600 effect had a clearly parietal distribution over the right hemisphere, and was localized in the right inferior temporal lobe (BA 20) similarly to the P600 localization in Experiment 1.

The additional analysis with the three conditions (non-familiar metaphors, familiar metaphors, literal) confirmed the main analysis. A main effect was visible in the P600 window on frontal, central and parietal sites, and planned contrasts showed that this effect is triggered by the comparison between the two metaphorical conditions considered together (familiar + non-familiar) vs. the literal condition. In the N400 window, only a few right posterior electrodes showed a main effect, possibly due to familiarity. See Supplementary Tables 2.3 and 2.4 in Supplemental Data Sheet.

In the N400 time window, the cross-experiment analysis showed an interaction between Context and Metaphoricity on right central and parietal sites [C4 F(2, 48) = 3.269, p = 0.047, η<sub>p</sub><sup>2</sup> = 0.12; C6 F(2, 48) = 3.974, p = 0.025, η<sub>p</sub><sup>2</sup> = 0.14; CP4 F(2, 48) = 3.321, p = 0.045, η<sub>p</sub><sup>2</sup> = 0.12; CP6 F(2, 48) = 3.987, p = 0.025, η<sub>p</sub><sup>2</sup> = 0.14; FC6 F(2, 48) = 3.334, p = 0.044, η<sub>p</sub><sup>2</sup> = 0.12; I6 F(2, 48) = 4.037, p = 0.024, η<sub>p</sub><sup>2</sup> = 0.14; P4 F(2, 48) = 3.652, p = 0.033, η<sub>p</sub><sup>2</sup> = 0.13; P6 F(2, 48) = 4.643, p = 0.014, η<sub>p</sub><sup>2</sup> = 0.16; P8 F(2, 48) = 4.626, p = 0.015, η<sub>p</sub><sup>2</sup> = 0.16; PO8 F(2, 48) = 4.070, p = 0.023, η<sub>p</sub><sup>2</sup> = 0.14; T8 F(2, 48) = 9.511, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.28; TP8 F(2, 48) = 5.839, p = 0.005, η<sub>p</sub><sup>2</sup> = 0.20]. In the P600 window, there was a significant interaction between Context and Metaphoricity only on Fp1 [F(2, 48) = 4.188, p = 0.021, η<sub>p</sub><sup>2</sup> = 0.15]. In other words, the analysis of the N400 time window showed an effect of the between-participants factor on right posterior sites: the N400 effect for metaphor was bigger without a supportive context (Experiment 1) than with a supportive context (Experiment 2).
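The effect sizes reported above can be checked against the F statistics, since partial eta squared follows from F and its degrees of freedom as (F × df1) / (F × df1 + df2). The short sketch below applies this standard identity; the function name is chosen for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values taken from the cross-experiment analysis, all with F(2, 48).
for site, f_value in [("C4", 3.269), ("T8", 9.511), ("Fp1", 4.188)]:
    print(site, round(partial_eta_squared(f_value, 2, 48), 2))
```

For these three sites the identity gives 0.12, 0.28, and 0.15, matching the reported values.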

## Discussion

As in Experiment 1, behavioral data showed that participants correctly processed the stimuli, with no differences between metaphors and literal stimuli. The ERP responses, time-locked to the target words, varied across conditions, with enhanced P600 for metaphorical as compared to literal stimuli, confirmed in the sensitivity analysis. The P600 response had a broader distribution than in Experiment 1 but it was localized in the right temporal lobe as in Experiment 1. Notably, in contrast with Experiment 1, there was no N400 effect. These data seem to suggest that context manipulation has a direct impact on the N400, while not suppressing the P600. The cross-experiment analysis supports this interpretation, showing the interaction between metaphoricity and context in the N400 time window.

As in Experiment 1, familiarity seemed to play no role, either in the preliminary analysis or in the additional analysis on all items. Effects were limited to a few posterior electrodes in the N400 time window in the planned contrasts in the analysis on all items. Although this might suggest a possible modulation of the N400 response linked to familiarity, this result is too limited to draw further conclusions. It is indeed likely that familiarity becomes evident in the electrophysiological response only over a certain threshold of difference across stimuli, as in previous studies where highly conventional and highly novel metaphors were compared. In contrast, here all metaphorical stimuli consisted of non-lexicalized metaphors, although with different judgments of familiarity in the pre-test.

In what follows the discussion will concentrate on the N400 and the P600 for metaphor, as differently modulated across the two experiments.

## GENERAL DISCUSSION

The primary aim of this study was to assess the role of context in metaphor comprehension. Results showed the presence of a biphasic N400-P600 pattern when metaphors were presented in a minimal context (Experiment 1). Crucially, when sentences were preceded by a supportive context, the biphasic pattern was not maintained, with metaphors evoking only a P600 effect (Experiment 2). The data obtained for metaphor in minimal context are in line with previous literature. The novel finding here is represented by the effect of the addition of supportive contextual material, which determined the suppression of the N400 effect, but did not suppress the P600 for metaphors. This paves the way to a number of considerations related to the functional characteristics of the observed ERP components.

With respect to the N400, the results of the two experiments suggest that this negativity is especially sensitive to the contextual aspects of pragmatic processing. This hypothesis is also supported by the cross-experiment analysis, which evidenced that the N400 effect for metaphor was bigger without a supportive context (Experiment 1) than with a supportive context (Experiment 2). Previous studies are consistent in reporting enhanced N400 amplitude for metaphorical compared to literal sentences in minimal context or word pairs (Pynte et al., 1996; Tartter et al., 2002; Lai et al., 2009; De Grauwe et al., 2010), yet vary in interpreting its functional role. Here we show that this effect is probably linked to efforts related to the absence of a supportive context, when expectations about upcoming words are not matched.

It is important to highlight that context in our experiment consists of linguistic material constituting the ground, i.e., a property of the lexical concept expressed by the metaphor's vehicle that is promoted and applied to the metaphor's topic. Adding the ground resulted in higher cloze probability rates for metaphorical expressions in Experiment 2 than in Experiment 1, as shown in the pre-test. Given that cloze probability is usually considered a measure of the degree to which the context establishes an expectation for a particular upcoming word (Kutas and Federmeier, 2011; Bambini et al., 2014), we can legitimately say that the two experiments vary with respect to contextual support, and it seems likely to assume that the different N400 response is specifically related to contextual expectations that guide lexical access and retrieval. When the ground is explicit in the context, as in Experiment 2, the retrieval of the metaphorically used words is less costly, as part of the concept is already activated. This interpretation of the N400 effect is in line with several sources of data. First, this account was already proposed in a study where lexical priming affected the N400 for metaphor (Weiland et al., 2014): although in that study the prime was a literal property, such property was still part of the lexical concept expressed by the metaphor vehicle, and it reduced the N400. Second, a similar manipulation in metonymic shift produced a suppression of the N400 when the semantic field of the metonymic concept was activated through lexical items in the context (Schumacher, 2014). More generally, context is known to affect the N400, which responds to the manipulation of semantic congruency at the level of both sentence and discourse, as well as extralinguistic context (Federmeier et al., 2007; van Berkum, 2009; Hoeks and Brouwer, 2014). 
This interpretation is consistent with the lexical pre-activation based view of the N400, where it is assumed that context has "excitatory" power in supporting lexical retrieval. This kind of proposal comes from studies arguing that the N400 reflects the mental processes that accompany the retrieval of lexical information from long-term memory as facilitated by the activation of features in the preceding context (Brouwer et al., 2012). More generally, the N400 might index the activity of a language processor that rapidly recovers information from multiple sources (e.g., syntax, semantics, discourse, world knowledge) to continuously update its interpretation of an incoming sentence (Stroud and Phillips, 2012). Interestingly, converging evidence of the N400 as an index of contextual expectations also comes from a study on 19-month-old children where context for words was represented by colored pictures of objects, suggesting that the functional characterization of this component is already very strong at early developmental stages (Friedrich and Friederici, 2004).

With respect to the P600, the positivity observed in our experiment seems to index an authentic pragmatic process of establishing the intended meaning of a metaphor. In current pragmatic models, understanding metaphors is indeed the result of a pragmatic inference exploiting world knowledge, the context, and the lexical meaning of the expression (Carston, 2010a; Pouscoulous, 2014). Specifically, metaphorical interpretation can be seen as an inferential move from the literal meaning to the intended meaning, which starts from the premises in the decoded meaning, combines contextual assumptions, and derives a set of conclusions warranted by the premises (Wilson and Carston, 2007)<sup>1</sup>. The P600 response could thus reflect the derivation of the intended meaning, which capitalizes on context beyond the process of lexical access as observed in the N400 response. Evidence in favor of this interpretation comes from several studies. Our supportive context condition can be compared to the probes employed by Yang et al. in a word-to-sentence matching paradigm where metaphors and literal sentences were preceded by differently congruent words. In line with our findings, that study showed a modulation of the P600, with no N400 effects (Yang et al., 2013). Moreover, several other pragmatic phenomena evoke a P600/LPC effect, among which indirect requests (Coulson and Lovett, 2010) and jokes (Coulson and Kutas, 2001). Interestingly, the P600/LPC effect shows up also in the absence of higher amplitude in the N400 time window, as in the case of irony (Regel et al., 2011; Spotorno et al., 2013), ambiguous idiom processing (Canal et al., 2015), question/answer pairs and other aspects of conversation and discourse (Hoeks et al., 2013; Hoeks and Brouwer, 2014).
Regel and colleagues discuss this issue for the specific case of irony, arguing that the absence of the N400 effect is motivated by the easy integration of words in context, while the complete understanding of intended meanings still requires later additional cognitive processes reflected in the P600 (Regel et al., 2011). Similarly, for metaphor, when lexical access is facilitated by providing enough supporting context, words are easily integrated, but the final interpretation remains more costly than in the literal case. In this view, the P600 might index the step in the pragmatic inferential process when the speaker comes up with an interpretation of the intended meaning of the utterance, which is observed for metaphors, as well as for irony and other pragmatic phenomena.

More generally, the present study gives additional support to the characterization of the P600 as a reflection of processing costs related to the semantic/pragmatic level (Bornkessel-Schlesewsky and Schlesewsky, 2008; Brouwer et al., 2012), overcoming the classic view of the P600 as a syntactic component, as the stimuli employed in our experiments were neither syntactically anomalous nor ambiguous, and did not differ in syntactic structure across conditions. In line with recent proposals, it might be possible that the P600 is not a single component, but actually a family of late positivities that reflect the word-by-word construction, reorganization, or updating of a mental representation of what is being communicated (Hoeks and Brouwer, 2014). Possibly this family of positivities might belong to the wider P3 family. This hypothesis has been recently revived with the description of the P600 as a point in time where a linguistic entity has achieved subjective significance and some form of adaptation process is underway (Sassenhagen et al., 2014).

With respect to the classic debate over the direct vs. indirect view, our study does not offer straightforward conclusions, as a simple comprehension task like the one employed here does not allow us to assess the presence of a literal stage. However, the modulation of context might shed some light on the processing steps of the comprehension process. Given the data of Experiments 1 and 2, it is clear that the processing of metaphors is more costly (as shown by the N400 and the P600) than the processing of literal language. Moreover, when a metaphor is preceded by a supportive context (Experiment 2), the effort in the lexical access phase is reduced (no N400), but there is still a P600. Although clearly not decisive on their own, these results are compatible with studies arguing for the lingering of literal meaning (Rubio Fernandez, 2007; Carston, 2010b; Weiland et al., 2014): literal meaning aspects are accessible early on and active throughout the lexical retrieval stage reflected in the N400 in Experiment 1. With supportive context as in our Experiment 2, lexical access becomes easier, as aspects of the metaphorical meaning are activated in the ongoing discourse context, with presumably reduced lingering effects and no visible N400 response. The presence of a P600 in both experiments seems to reflect enhanced costs in a later stage of pragmatically driven interpretative processes: the speaker's meaning, even in the case of a supportive context, must be inferentially derived, hence extra processing is required, both in Experiment 1 and Experiment 2.

<sup>1</sup> In Relevance Theory, the interpretation of a metaphorical utterance consists of a non-demonstrative inference process that combines the lexical meanings and the context of use. For instance, the interpretation of "Sally is a chameleon" takes as input a premise such as "The speaker has said 'Sally is a chameleon' (i.e., a sentence with a fragmentary decoded meaning requiring inferential completion and complementation)," together with other contextual assumptions, and yields as output a conclusion such as "The speaker meant that Sally<sup>x</sup> is a CHAMELEON<sup>∗</sup>, Sally<sup>x</sup> is changeable, Sally<sup>x</sup> has a capacity to adapt to her surroundings, it's hard to discern Sally<sup>x</sup>'s true nature (etc.)," where CHAMELEON<sup>∗</sup> is an expansion from the category CHAMELEON to the category CHAMELEON<sup>∗</sup>, which includes both actual chameleons and people who share with chameleons the encyclopaedic property of having the capacity to change their appearance in order to blend in with their surroundings (Wilson and Carston, 2007). Of course, this interpretative process takes place at risk, given that the premise cannot guarantee the truth of the conclusion. Yet speakers possess an inferential heuristic for constructing the best interpretation given the available evidence in the context of use.

Taken together, the results of the two experiments presented here suggest three main conclusions: (i) the pragmatic process of metaphor comprehension unfolds through two different stages which might be explained in terms of retrieval of lexical elements shaped by context followed by pragmatic interpretation; (ii) linguistic context reduces the effort in retrieving lexical aspects of metaphors as indexed in the N400, which has never been observed before in the literature; (iii) linguistic context does not suppress later pragmatic interpretation efforts needed in order to derive the speaker's intended meaning, as reflected in the P600. Although these conclusions seem to capture what happens in the comprehension of metaphors in natural conversation, where the linguistic material often introduces and "primes" metaphorical meaning, they cannot be extended to all possible contexts. For metaphors taken from poetry, for instance, there is behavioral evidence that the literary text cannot be simply considered as a context licensing the figurative expression, but rather it seems to promote mechanisms that make the metaphor more open to different interpretations in different scenarios, less familiar but more meaningful (Bambini et al., 2014). Moreover, the modulation of familiarity might affect both the lexical retrieval and the pragmatic interpretation stage, which was not observed here given that all metaphors were non-lexicalized.

As a second aim of the study, we explored the spatial characteristics and the localization of the two ERP effects, in order to add to the large debate in the imaging literature over the neural correlates of metaphor comprehension and the right hemisphere advantage (Bohrn et al., 2012; Rapp et al., 2012). The N400 observed in Experiment 1 has a standard centro-medial distribution. The source localization analysis indicated the bilateral superior temporal cortex (BA 22) as the origin of the effect, in line with previous accounts of the N400 in general (Van Petten and Luka, 2006; Kutas and Federmeier, 2011). This also matches previous ERP evidence on metaphor, obtained with hemi-field presentation (Coulson and Van Petten, 2007) and with source localization analysis (Arzouan et al., 2007b), disconfirming the right hemisphere advantage and supporting the idea that both hemispheres work in tandem in metaphor comprehension. This is also compatible with imaging studies, where BA 22 is involved in metaphor comprehension, both in the right (Mashal et al., 2005; Bambini et al., 2011) and in the left hemisphere (Rapp et al., 2012).

The results on the P600 are less straightforward. In Experiment 1, the P600 effect has a standard parietal distribution, which becomes broader and extends to frontal sites in Experiment 2. The topographic features of the semantic P600 are still a matter of debate in the literature, which is too modest in size to derive strong conclusions (Van Petten and Luka, 2012; Regel et al., 2014). Our data suggest that the distribution of the positivity might vary based on context, possibly with a more global process and a distributed involvement of scalp sites when context is supportive enough in early phases, and interpretation is concentrated in later stages. Moreover, in both Experiments 1 and 2, the generator of the P600 effect was localized in the right inferior temporal gyrus (BA 20). Although there seems to be some agreement on the localization of the P600 in the left hemisphere (Brouwer and Hoeks, 2013), the literature disagrees with respect to the P600 for metaphor, with some localizing the effect in the left (Yang et al., 2013) and others in the right hemisphere (Arzouan et al., 2007a). Imaging data can shed some light on this conflict. Sometimes reported for figurative language processing (Eviatar and Just, 2006), right BA 20 is a region implicated also in the evaluation of alternative meanings and interpretation of ambiguous stimuli (Zempleni et al., 2007), as well as in the attribution of intentions to story characters (Brunet et al., 2000). Once lexical retrieval is completed, the ultimate interpretative effort in understanding a metaphor involves the attribution of communicative intentions. The right-sided effect of the P600 observed in our study might thus be related to the interpretative effort in deriving the speaker's meaning.
Granted that modern literature has largely reconsidered the right hemisphere advantage (see above), the rightward asymmetry found in the present study seems to favor, at the least, a larger contribution of the right hemisphere to the final, interpretative part of the pragmatic inference process. This localization and this interpretation of the P600, however, need to be further verified, possibly combining EEG and imaging data. Overall, considering both the N400 and the P600 results, what our data seem to highlight is the role of the temporal cortex, bilaterally, and possibly with a right focus, which is in line with a recent neurofunctional proposal of temporo-parietal circuitry for pragmatic processing, at the interface between linguistic and social cognition processes (Catani and Bambini, 2014).

## CONCLUSIONS

Overall, our findings confirm the presence of two dissociable ERP signatures in the processing of metaphors, namely the N400, indexing lexical access guided by contextual expectation, and the P600, indexing a truly pragmatic interpretative mechanism of deriving the speaker's meaning. When the context is supportive, lexical access is facilitated, but the efforts related to establishing a pragmatic interpretation remain. These results shed light on the comprehension of metaphor in natural conversation and point in the direction of increasing the ecological validity of experimental approaches to pragmatics.

## AUTHOR CONTRIBUTIONS

Design and construction of the materials: VB. Data collection: VB, CB, AS. Data analysis: VB, CB, WS, FDR. Manuscript writing: VB, WS, FDR. All authors provided feedback on the draft and approved the final version of the manuscript.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00559

## REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Bambini, Bertini, Schaeken, Stella and Di Russo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Who is respectful? Effects of social context and individual empathic ability on ambiguity resolution during utterance comprehension

Xiaoming Jiang<sup>1,2</sup> and Xiaolin Zhou<sup>1,3,4,5</sup>\*

<sup>1</sup> Center for Brain and Cognitive Sciences and Department of Psychology, Peking University, Beijing, China, <sup>2</sup> School of Communication Sciences and Disorders, McGill University, Montréal, QC, Canada, <sup>3</sup> Key Laboratory of Machine Perception and Key Laboratory of Computational Linguistics (Ministry of Education), Peking University, Beijing, China, <sup>4</sup> Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China, <sup>5</sup> IDG McGovern Institute for Brain Research at PKU, Peking University, Beijing, China

### Edited by:

Marco Cruciani, University of Trento, Italy

### Reviewed by:

Thomas C. Gunter, Max Planck Institute for Human Cognitive and Brain Sciences, Germany

Mario Dalmaso, University of Padova, Italy

### \*Correspondence:

Xiaolin Zhou xz104@pku.edu.cn

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 15 June 2015 Accepted: 01 October 2015 Published: 23 October 2015

### Citation:

Jiang X and Zhou X (2015) Who is respectful? Effects of social context and individual empathic ability on ambiguity resolution during utterance comprehension. Front. Psychol. 6:1588. doi: 10.3389/fpsyg.2015.01588

Verbal communication is often ambiguous. By employing the event-related potential (ERP) technique, this study investigated how a comprehender resolves referential ambiguity by using information concerning the social status of communicators. Participants read a conversational scenario which included a minimal conversational context describing a speaker and two other persons of the same or different social status and a directly quoted utterance. A singular, second-person pronoun in the respectful form (nin/nin-de in Chinese) in the utterance could be ambiguous with respect to which of the two persons was the addressee (the "Ambiguous condition"). Alternatively, the pronoun was not ambiguous either because one of the two persons was of higher social status and hence should be the addressee according to social convention (the "Status condition") or because a word referring to the status of a person was additionally inserted before the pronoun to help indicate the referent of the pronoun (the "Referent condition"). Results showed that the perceived ambiguity decreased over the Ambiguous, Status, and Referent conditions. Electrophysiologically, the pronoun elicited a larger N400 in the Referent condition than in the Status and Ambiguous conditions, reflecting an increased integration demand due to the necessity of linking the pronoun to both its antecedent and the status word. Relative to the Referent condition, a late, sustained positivity was elicited for the Status condition starting from 600 ms, while a more delayed, anterior negativity was elicited for the Ambiguous condition. Moreover, the N400 effect was modulated by individuals' sensitivity to the social status information, while the late positivity effect was modulated by individuals' empathic ability. These findings highlight the neurocognitive flexibility of contextual bias in referential processing during utterance comprehension.

Keywords: social status, pragmatics, referential ambiguity, directly-quoted utterance, pronoun resolution, ERP

## INTRODUCTION

Establishing referential relations is vital to verbal communication (Brown-Schmidt and Hanna, 2011). Verbal expressions are often ambiguous, particularly without supportive contexts. Consider a situation in which John meets his friends Bob and Shawn and asks, "Which course are you going to teach next semester?" As a third-party observer, one may be confused as to whom John is addressing without the addressee being explicitly referred to in the speech. However, if one knows that Bob is a lecturer and Shawn is a student at the university, one may immediately infer that Bob is the target addressee. The addressee or the observer may employ a variety of information from the context to resolve this temporary referential ambiguity, building up a representation of the utterance as it unfolds over time.

Among context information, the social status of the speaker and the addressee has been demonstrated to be a significant cue relevant to attention, perception, decision-making, and inference-making (Dalmaso et al., 2012, 2014; Hu et al., 2014; Mason et al., 2014; Koski et al., 2015) and is linguistically marked in certain languages (e.g., the second-person pronoun in Mandarin, French, Spanish, etc.). The social status of communicators is typically realized by cues related to job titles and professions (e.g., professor) which are attained by individuals involved in the conversation and which form a set of features that are uniquely associated with high vs. low status (Koski et al., 2015). A linguistic marker such as nin/nin-de (you/your), a respectful form of the second-person pronoun in Mandarin Chinese, is normally used by a lower-status speaker to address a higher-status addressee; in contrast, ni/ni-de (you/your), an informal version of the second-person pronoun, is typically used by a lower-status speaker to address a lower-status and/or familiar addressee. Our previous work (Jiang et al., 2013b) demonstrated that a mismatch of the social status between the addressee and the respectful/informal form of the pronoun elicits neural responses associated with the perception of deviance, including N400, P600, and late negativity (N600) effects in event-related potentials (ERPs). A successful resolution of referential ambiguity associated with social/pragmatic information may require accessing information from long-term memory, holding multiple pieces of information in working memory, and making use of complex inference procedures (Brown-Schmidt and Hanna, 2011). A critical question is how the brain uses social status information concerning the communication partners in resolving referential ambiguities and how these processes may vary between individuals with differential social abilities during utterance comprehension.

## Social Context and Referential Ambiguity

Behavioral and neurophysiological studies have indicated that listeners use both discourse and social contexts to resolve referential ambiguity during language comprehension. The discourse context biases the interpretation of the addressee and affects the neural responses underlying ambiguity resolution on the noun (Nieuwland et al., 2007) and pronoun (Nieuwland and Van Berkum, 2006). A frontal sustained negativity effect (Nref) was observed on the ambiguous pronoun, the gender of which was congruent with two competing antecedents in the context, relative to the pronoun referring specifically to one antecedent. This effect was reduced when contextual information (e.g., verb) biased one antecedent to be more probable than the other (e.g., "The chemist hit the historian when he. . . "), or was completely absent when a discourse context implied the death or leaving of one antecedent from the discourse (Nieuwland et al., 2007). These findings suggest that context-based pragmatic inference reduces both ambiguity in referential processing and the neural activity underlying this processing. Such ambiguity-related neural responses are also modulated by the comprehenders' working memory span, with higher span comprehenders exhibiting stronger responses (Nieuwland and Van Berkum, 2006).

Evidence from eye-tracking studies has also revealed that the shared knowledge and beliefs between the speaker and addressee provide constraints on the resolution of referential ambiguity (Keysar et al., 1998; Hanna et al., 2003; Barr, 2008; Brown-Schmidt and Tanenhaus, 2008; Heller et al., 2008; Brown-Schmidt, 2009; Ferguson and Breheny, 2012; Bezuidenhout, 2013; Ferguson et al., 2015), resulting in different eye gaze patterns on the object displayed in a shared perspective vs. the object in an addressee-privileged perspective, when the object was referred to in speech. In tasks involving a real conversation, the communication partners coordinated on an object-matching task for a display of objects. The target object referred to in the speaker's instruction was accompanied by an object whose name overlapped with it in initial phonological structure (e.g., bucket/buckle, Barr, 2008) or by an object with the same shape and color (e.g., two blue triangles, Hanna et al., 2003). The display of the target object was shared between the speaker and the addressee (the participant) and the display of the competitor was either shared or was only visible to the addressee (who possessed the knowledge that the speaker could not see this object; Barr, 2008). The frequency of fixation was equally deployed when the target and the competitor were in the shared perspective but was prioritized for the target object when the competitor was only in the addressee-privileged perspective (Hanna et al., 2003; cf. Barr, 2008), indicating that access to the other's perspective reduced the competitor interference effect in the face of referential ambiguity.

Recent work on shared beliefs (e.g., Ferguson et al., 2015) required participants to watch a movie in which a character (Jane) either held a false or a true belief of an object's location while at the same time listening to a description (Jane will look for the chocolates in the container on the left/right) in which the character's belief resulted in ambiguity of the location (container) which could not be resolved until the end of the sentence. When asked to make an inference of the character's belief, the comprehenders' eye-movements were immediately guided to the object based on the false belief of the actor from the onset of the sentence (Jane); the comprehenders' eyes, however, were not fixated on the object until the sentence-final disambiguating word in an irrelevant task. These findings suggest that the successful inference of others' knowledge or perspective facilitates the resolution of referential ambiguity and that this inference process is most likely cognitively effortful.

## Neurocognitive Evidence of Contextual Effects and Individual Differences in Pragmatic Language Processing

Other evidence has also demonstrated the contextual effect on the integration of an upcoming input word. Two ERP effects, an N400 and a late positivity, are most often reported to vary as a function of contextual variables. A factual statement inconsistent with one's real-world knowledge (Hagoort et al., 2004) or with one's inference from a counter-factual construction (Nieuwland and Martin, 2012) elicited larger N400 responses. This N400 effect appears when a statement mismatches the cultural convention of the comprehender ("Every single Welsh child can sing in tune," presented to a Welsh-speaking comprehender) but is absent when it is irrelevant (e.g., the same utterance presented to an English-speaking comprehender; Ellis et al., 2015). Morally-laden statements disagreeing with one's belief system elicit stronger N400 responses than agreeing statements (Van Berkum et al., 2009). The discourse context implying the positive or negative characteristic of a person affects the integration of this person's name in the subsequent sentence in which the name was positively or negatively valenced; the name incongruent with the context elicited a larger N400 or delayed positivity as compared with the congruent one, depending on the valence endowed with the name (Wang et al., 2015). An enlarged N400 was also present on words describing a character's emotional reaction which mismatched the expected feeling in a socio-emotional vignette (Leuthold et al., 2012). The N400 effect in these studies suggests an increased integration demand for unifying a word into a broad context, ranging from linguistic to social and extending to the comprehender's own knowledge or belief system.

The context is also a useful source of information for deriving non-literal interpretation. An utterance ("Tonight we gave a superb performance") with a context facilitating an ironic interpretation (Both ladies sang off key) elicited an increased late positivity (P600), compared with an utterance containing only the literal interpretation (Spotorno et al., 2013). A similar positivity effect was observed on utterances presented with ironic vs. neutral-intending prosody (Regel et al., 2011), demonstrating a non-literal interpretation beyond linguistic input via pragmatic inference. The late positivity was preceded by an N400 when the non-literal expression was unfamiliar to the listener (Filik et al., 2014) or was constrained by minimal context (Coulson and Kutas, 2001). Some studies also reported a more sustained positivity, which was found on words inconsistent with the preceding context describing one's traits, intention, or goal of an action, indicating the comprehender's attempt to infer these implied messages (Van Duynslaeger et al., 2007; Baetens et al., 2011). This sustained effect is related to the activity of the neural network subserving the mentalizing process, including the temporoparietal junction (Van Duynslaeger et al., 2007). Jiang et al. (2013b) observed a sustained positivity following the N400 on respectful second-person pronouns (i.e., nin-de, your) incorrectly referring to a lower-status addressee as compared with the pronoun correctly referring to a higher-status addressee. This sustained positivity effect was interpreted as reflecting a second-pass reanalysis process, which resulted in a sarcastic interpretation of the input sentence. However, a sustained negativity following the N400 was elicited on a less respectful second-person pronoun (i.e., ni-de, your) incorrectly referring to a higher-status addressee as compared with the pronoun correctly referring to a lower-status addressee. This negativity effect was interpreted as reflecting a second-pass inhibitory process when no sarcasm could be derived from the input (as such derivation would violate social norms).

Individuals' characteristics, such as empathic ability, modulate language use and the neural activity underlying pragmatic processes. Differential neural responses have been revealed between individuals with autism spectrum disorders (ASD) and healthy individuals during pragmatic language comprehension (Tesink et al., 2009): individuals with ASD showed stronger activations in the right inferior frontal gyrus when comprehending speech violating the voice-inferred speaker's social status and an absence of activation in the ventromedial prefrontal cortex in comprehending speaker-consistent speech. Moreover, eye-tracking studies using the visual-world paradigm suggested that the perspective of a communication partner is immediately taken into account by the listener when interpreting what was said, especially in determining what was referred to in the context (e.g., Ferguson et al., 2010; Brown-Schmidt and Hanna, 2011). These findings highlight the role of perspective taking in utterance comprehension. A third party's interpretation of directly-quoted utterances between communication partners may involve perspective-taking that allows the comprehender to take the speaker's or the addressee's perspective. Recent ERP evidence suggests that empathy and its sub-processes modulate the use of contextual information and its effect on the integration of upcoming information. Scalar sentences such as some people have lungs in which the critical word "lungs" did not match the pragmatic interpretation of the scalar quantifiers (i.e., only some of the people have lungs) elicited a larger N400 as compared with the counterpart word in felicitous sentences (e.g., "pets" in some people have pets; Nieuwland et al., 2010).

Such neural responses are also modulated by individuals' autism-spectrum quotient (AQ, Baron-Cohen et al., 2001), an index inversely correlated with one's empathic ability. Nieuwland et al. (2010) split the group of participants based on the median AQ score and observed an N400 effect only for individuals having lower AQ (i.e., higher empathic ability). Using an empathy questionnaire (Baron-Cohen and Wheelwright, 2004), Van den Brink et al. (2012) demonstrated that the increased N400 response to a mismatch between speaker identity and speech content (I cannot sleep with my teddy bear in my arm, spoken in an adult male voice) was only observed in listeners with higher empathic ability; participants with lower empathic ability, in contrast, showed a positivity effect.
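A median split of the kind mentioned above can be sketched in a few lines of Python. This is only an illustration: the participant labels and AQ scores are hypothetical, and placing scores equal to the median in the lower group is an assumption, since the cited study's handling of ties is not described here.

```python
from statistics import median

def median_split(scores):
    """Split participant IDs into low/high groups around the median score.

    Assumption: scores equal to the median go to the low group.
    """
    cut = median(scores.values())
    low = sorted(p for p, s in scores.items() if s <= cut)
    high = sorted(p for p, s in scores.items() if s > cut)
    return cut, low, high

# Hypothetical AQ scores, for illustration only (not data from any study).
aq_scores = {"p01": 12, "p02": 25, "p03": 17, "p04": 30, "p05": 9, "p06": 21}
cut, low_aq, high_aq = median_split(aq_scores)
```

Under the reading given in the text, the low-AQ group would correspond to the participants with higher empathic ability.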

Individuals' empathic ability also modulates ERP responses to status-mismatch on the second-person pronoun (Jiang and Zhou, 2015). The N400 effect was only observed in participants displaying higher fantasizing ability as measured by the Interpersonal Reactivity Index (IRI; Davis, 1983). Moreover, the cognitive components of empathic ability, as measured by the IRI, modulated the neural activity underlying the interpretation of sentences with pragmatic underspecification or pragmatic failure (Li et al., 2014). The fantasizing ability (to imagine oneself to be the character of a novel or movie) affected the activation in the medial prefrontal cortex when a description of an event was underspecified (and hence requiring pragmatic inference), suggesting the deployment of a mentalizing process to infer a proper representation of the event satisfying the pragmatic constraints. The perspective-taking ability (to shift one's perspective to that of the other) affected the activation in the bilateral inferior frontal gyrus when the description of an event mismatched the comprehender's knowledge about the likelihood of the event. These findings suggest that cognitive empathy could be linked to the individual's ability in using contextual information and making pragmatic inference during verbal communication.

## The Present Study

We aim to investigate when and how a comprehender, as a third party, resolves referential ambiguity in a conversation scenario by using information concerning the social status of communicators in the context, and how his/her empathic ability and sensitivity to the social status information modulate ambiguity perception and the underlying neural activity. The comprehender's empathic ability was measured using the empathy score (40 items) in Baron-Cohen and Wheelwright (2004); the status sensitivity was defined as the difference in rating the appropriateness of status-incongruent and status-congruent scenarios on a 7-point scale (on a subset of stimuli from Jiang et al., 2013b). Participants were asked to explicitly rate the ambiguity of scenarios depicting social interaction involving interpersonal communication and to read these scenarios for comprehension while undergoing EEG recording. We created scenarios in Mandarin Chinese which included a context introducing a speaker of lower social status and two potential addressees with the same (the ambiguous context, in the Ambiguous condition) or different social status (the status-biased context, in the Referent and Status conditions). In both the Ambiguous and Status conditions, a directly-quoted utterance began with the respectful form of the Chinese second-person pronoun (nin/nin-de). This pronoun was referentially ambiguous in the Ambiguous condition because both the potential addressees were of equally high social status and hence both could be the target, but was not ambiguous in the Status condition because the social convention concerning the use of the respectful form would predict the person of higher status to be the target. The status of the potential addressees was indicated clearly in the context by a status word used together with the family name (e.g., Professor Wu). Finally, the Referent condition differed from the Status condition in that a status word indicating a higher-status/position (such as Professor, General, Boss, etc.), consistent with one of the status words used in the context, was inserted before the pronoun to additionally indicate which one of the two persons in the context should be the target addressee.
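The status-sensitivity measure defined above, a difference between appropriateness ratings for two scenario types, can be illustrated with a minimal Python sketch. The direction of the subtraction (congruent minus incongruent, so that larger values mean greater sensitivity), the function name, and the ratings are assumptions for illustration; the text specifies only that the measure is a difference on a 7-point scale.

```python
from statistics import mean

def status_sensitivity(congruent, incongruent, scale=(1, 7)):
    """Mean appropriateness of status-congruent scenarios minus that of
    status-incongruent scenarios (assumed direction of the difference)."""
    lo, hi = scale
    for rating in list(congruent) + list(incongruent):
        if not lo <= rating <= hi:
            raise ValueError(f"rating {rating} outside the {lo}-{hi} scale")
    return mean(congruent) - mean(incongruent)

# Hypothetical ratings from one participant (illustration only).
congruent_ratings = [6, 7, 6, 5]     # mean 6
incongruent_ratings = [2, 3, 2, 1]   # mean 2
score = status_sensitivity(congruent_ratings, incongruent_ratings)
```

A participant who rates both scenario types alike would obtain a score near zero under this convention.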

Behaviorally, we predicted a reduction in perceived ambiguity for the Referent and Status conditions as compared with the Ambiguous condition, due to a successful matching of the referent and the pronoun in the Status condition and the additional information conveyed by the status word in the Referent condition. On the ERPs time-locked to the pronoun, we would normally predict an Nref effect for the Ambiguous condition. The Nref is a sustained negativity that starts at about 300 ms and lasts for several hundreds of milliseconds (Van Berkum et al., 1999). This effect, distributed mainly over anterior sites, appears when two antecedents are equally suitable, rendering the interpretation of the pronoun ambiguous. It has been claimed to reflect the detection of ambiguity, the controlled process of ambiguity resolution, and/or the maintenance of two referential interpretations in working memory (Van Berkum et al., 2007). However, the present study did not have the unambiguous, baseline condition in which there was only one antecedent in the context. Although the pronoun in the Status or the Referent condition was unambiguous, the interpretation of this pronoun in these conditions came with a processing cost that may have overshadowed the potential Nref effect for the Ambiguous condition, especially in the early time window (see below).

We predicted increased N400 responses for the Referent condition, as compared with the Status condition. Although adding a status word before the second-person pronoun in the Referent condition would mark even more clearly who is the referent of the pronoun, the pronoun nevertheless has to be integrated with both the status word and the targeted addressee. This integration is perhaps more resource-demanding than the integration just between the pronoun and the referent. Moreover, we would also predict an N400 effect for the Status condition, as compared with an unambiguous, single-referent baseline condition if the latter were included in the design. Using pragmatic information to infer (and select) the referent of a pronoun from the two potential candidates and linking the pronoun with the referent would be more difficult than simply linking a pronoun with a single candidate in the context, resulting in increased N400 responses (cf. Jiang et al., 2013b). This prediction would lead us to compare the Status with the Ambiguous condition, which might yield no or small differences in the early time window as both the potential N400 and Nref effects were in the same direction.

For the late time windows, we predicted that the Nref effect for the Ambiguous condition, as compared with the Referent condition, would eventually be detectable. This was because the Nref effect in processing the ambiguous pronoun in the Ambiguous condition would last for a long time, whereas the processing cost for integrating the pronoun with the status word and the targeted addressee in the Referent condition would have already dissipated by this time. In contrast, we predicted an increased late positivity for the Status condition, relative to the Referent condition. To link the pronoun with one of the potential referents in the context, a pragmatic inference process must take place to decide which person was of higher status and hence could be addressed with the respectful form. Previous studies have shown that this inference process is usually accompanied by the late positivity (e.g., Jiang et al., 2013b).

As we indicated above, in the Status condition, to decide which of the two addressees should be the referent of the pronoun, a pragmatic inference process must occur. This process may vary as a function of the comprehender's empathic ability. The higher the ability, the more successful the inference process. Moreover, comprehenders with increased sensitivity to the social status information should find the sentences less ambiguous when the status information is relevant for a successful inference for the referent of the pronoun (in the Status condition) and should find the sentences even more ambiguous when no such information is available (in the Ambiguous condition).

Given the previous findings of the modulation of empathy on language comprehension (Nieuwland et al., 2010; Van den Brink et al., 2012; Li et al., 2014; Jiang and Zhou, 2015), we predicted that the magnitude of the N400 effect in the Referent condition may be modulated by individuals' empathic abilities. The status sensitivity may also affect the N400 responses because integration of the current pronoun into the preceding context depends on the matching of the status information between the antecedent and the pronoun in all the conditions. Moreover, both empathic ability and status sensitivity could modulate the late positivity effect given that the pragmatic inference process in selecting a likely addressee would depend highly on one's ability to use this information in the context.

## METHOD

## Participants

Thirty-two right-handed university students (22 females, aged 18 to 28 years, mean age = 21.2 years) gave informed consent to participate in the ERP experiment. All the participants were native Mandarin speakers born and raised in Beijing. They spoke the Beijing dialect of Mandarin and had not lived outside of the Beijing area before college. This selection criterion was used to ensure that the participants were sensitive to the use of the respectful form of the second-person pronoun, since some Mandarin dialects do not use this form. All the participants had normal or corrected-to-normal vision and none reported reading impairment or any type of neurological or psychiatric disorder. This study was carried out in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the Department of Psychology, Peking University.

## Design and Material

One hundred and sixty triplets of scenarios describing events in daily life were created, from which 150 triplets were selected as the critical material (**Table 1**). Each scenario comprised a directly-quoted utterance and a conversational context preceding the utterance. The conversational context described a daily situation in which one character was meeting or interacting with the other two characters. For all the scenarios, the first character always served as the speaker and one of the other two characters as the addressee. The social status was conveyed by the name of each character, which consisted of a common Chinese family name which had no status meaning (e.g., Li, Zhang, Yang, etc.) and a position name which conveyed a particular level of social status in the social hierarchy (e.g., higher-status: Professor, General, Manager, etc.; lower-status: Student, Soldier, Assistant, etc.). The status level of each name was pre-evaluated by a university student who speaks the Beijing dialect. The speaker in a scenario was always of lower status. The addressees were of different status in the Referent and Status conditions, with one higher than the other; the addressees were of equal status in the Ambiguous condition, with both addressees holding higher status than the speaker. For the scenarios with addressees of different status, the higher-status addressee preceded (i.e., was mentioned earlier than) the lower-status one in half of the scenarios and was preceded by the lower-status addressee in the other half.

Each utterance was composed of an object-subject-verb (OSV) structure beginning with either a status/position noun (e.g., Professor) which stood for the addressee in the Referent condition, or a singular, respectful form of the second-person possessive pronoun (i.e., nin-de) in the Status and Ambiguous conditions. The utterance delineated an action that the speaker performed for the addressee, a message to the addressee, or the speaker's attitude toward the addressee. The same possessed object (e.g., article in the exemplars in **Table 1**) was used across the three conditions. All the objects were status-neutral, that is, equally likely to be possessed/owned by a higher- or lower-status person. The predicates used in the utterance were also status-neutral.

One hundred and twenty unambiguous scenarios were created as fillers to prevent the use of potential response strategies, with 80 scenarios involving 3 characters (1 speaker and 2 addressees) and 40 scenarios involving 2 characters (1 speaker and 1 addressee). Among these fillers, 40 scenarios were created with the same context sentences as those in the Ambiguous condition, but began the utterance with a status/position name which unambiguously stood for the addressee; this was to eliminate the potential strategy of anticipating an unspecified pronoun when reading the ambiguous context. To eliminate the strategy of anticipating a higher-status person to be the addressee in comprehending status-biased context, another 40 scenarios were composed of contexts with a higher-status speaker and two characters of different status levels, but the utterances began with either the plain form of the second-person pronoun (ni-de/your) referring to the low-status addressee or a status word referring to one of the addressees (20 scenarios for each). Thus, the addressee was not predictable until the status word or the second-person pronoun was revealed. The remaining 40 two-character scenarios were selected from Jiang et al. (2013b) which included characters in different social status and a pronoun in its singular, respectful (nin-de, your) or in a singular, informal form (ni-de, your), referring to the addressee at a certain status level in the scenario.

## Scenario Rating

The scenarios were selected from a larger sample of 160 sets of scenarios based on a reference ambiguity rating conducted prior to the ERP experiment, which aimed to examine the ability of the utterance-initial pronoun to refer unambiguously to a person in the multi-character conversational context. The 160 sets of scenarios were created using the same criteria as those described for the critical scenarios, and were divided into three lists using a Latin-square procedure. Thirty native speakers of Beijing Mandarin (21 females, aged 18 to 26 years, mean age = 21.8 years) who were not tested for EEG took part in this pretest (**Table 2**). They were randomly assigned to one of the three lists (each with 10 participants) and were instructed to rate the level of ambiguity of the pronoun in referring to an antecedent in the context (1 = the most ambiguous, and 7 = the least ambiguous). To minimize


#### TABLE 1 | Examples of conversational scenarios used in the experiment.

Critical pronouns and the object nouns are underlined.

#### TABLE 2 | Mean ambiguity rating scores in two independent groups of participants in the pretest and the post-EEG test.


The ambiguity rating was based on a seven-point Likert scale, with 7 representing "the least ambiguous" and 1 representing "the most ambiguous."

the potential referential bias due to the information following the pronominal phrase, an incomplete utterance was given (e.g., in the Referent condition, Student Lin met student Yu and Professor Ye in the conference. Student Lin said, "Professor, your. . . ").
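The Latin-square division of scenario sets into three lists can be illustrated with a short Python sketch. It rotates the three versions of each set across lists so that every list contains each set exactly once and, when the number of sets is a multiple of three, each condition equally often. The function name `latin_square_lists`, the zero-based set indices, and the choice of 90 sets in the usage lines are assumptions made for illustration; the original list construction is not described at this level of detail.

```python
from collections import Counter

CONDITIONS = ("Referent", "Status", "Ambiguous")

def latin_square_lists(n_sets, conditions=CONDITIONS):
    """Rotate the versions of each scenario set across lists (hypothetical helper).

    List k receives set i in condition (i + k) mod n, so each set appears once
    per list and in a different condition in every list.
    """
    n = len(conditions)
    lists = [[] for _ in range(n)]
    for set_id in range(n_sets):
        for list_id in range(n):
            condition = conditions[(set_id + list_id) % n]
            lists[list_id].append((set_id, condition))
    return lists

# Usage with an assumed 90 sets: each list holds 30 items per condition.
lists = latin_square_lists(90)
for list_id, items in enumerate(lists):
    print(list_id, Counter(condition for _, condition in items))
```

A participant assigned to one list therefore never sees the same scenario set in two conditions, which is the property the counterbalancing is meant to guarantee.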

The critical sets of scenarios were selected to ensure that the rated ambiguity was the lowest for the Referent condition and the highest for the Ambiguous condition. ANOVA with Scenario Type as a within-participant factor revealed a main effect of scenario type, F1(2, 58) = 412.28, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.93; F2(2, 298) = 1572.20, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.91, with the lowest level of ambiguity for the Referent condition (Mean = 6.81, SD = 0.13), followed by the Status condition (Mean = 5.21, SD = 1.04), and the highest for the Ambiguous condition (Mean = 1.73, SD = 0.66). The differences between conditions were all significant, ps < 0.001 (see **Table 2**).
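For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The Python sketch below applies this identity to the two F values reported above; the function name `partial_eta_squared` is introduced here for illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# By-participant analysis, F1(2, 58) = 412.28
print(round(partial_eta_squared(412.28, 2, 58), 2))    # 0.93
# By-item analysis, F2(2, 298) = 1572.20
print(round(partial_eta_squared(1572.20, 2, 298), 2))  # 0.91
```

Both values agree with the effect sizes reported in the text.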

## Procedure

Participants were seated comfortably in a sound-proofed and electronically shielded chamber. They were instructed to move their head or body as little as possible and to keep their eyes fixated on a sign at the center of the computer screen before the onset of each scenario. The fixation sign was at eye-level and was approximately 1 m away. Scenarios were presented segment-by-segment in a rapid serial visual presentation (RSVP) mode at the center of the screen, with less than 1 degree of horizontal visual angle and 0.2 degree of vertical angle for one segment to minimize eye movements. Each scenario consisted of a series of eight frames (**Table 1**). Each segment was presented at a comfortable rate of 400 ms followed by a blank screen of 400 ms. Participants were asked to read scenarios carefully for comprehension. At the end of each scenario, participants were presented with a probe statement and were asked to verify whether the statement was consistent with the information described in the scenario. The statement could probe constituents in the context, including the speaker and the location of the conversation/interaction (e.g., for Technician Wang met Technician Zhang and Director Li in the office, Wang said, "Director, I have achieved the goal," the probe was Technician Wang met Director Li at the metro station), or constituents in the directly-quoted utterance, including the actor, the patient, and the verb (e.g., for Student Dong encountered Student Chen and Madam Chu, Dong said, "Madam, your story touches me so much," the probe was Madam Chu was touched). This task did not prompt the reader to access the social status information in the conversational context but required a certain level of comprehension of the directly-quoted utterance (Regel et al., 2010; Jiang et al., 2013b). Each condition required the same number of consistent ("yes") and inconsistent ("no") responses.
Participants were asked to respond as accurately as possible by pressing a button on a joystick with their right index fingers. Each probe statement was presented 1200 ms after the offset of the last segment of the scenario and remained on the screen until the participants made a decision. The next trial began 1000 ms after button press. Participants were randomly assigned to one

of the three experimental lists, created using a Latin Square procedure. For each list, scenarios were pseudo-randomized so that no more than three consecutive scenarios were from the same critical condition, no more than three consecutive scenarios were followed by a statement probing the same constituents in the scenario, and no more than three consecutive scenarios were followed by the same "yes" or "no" response. A practice session of 14 scenarios was presented to each participant prior to the experiment.
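The three pseudo-randomization constraints can be made concrete with a small Python sketch. It builds a trial order greedily, choosing at random among the trials that would not create a run of more than three on any constrained attribute, and restarts if it reaches a dead end. This is one possible way to satisfy the constraints, not a description of the software actually used. The field names (`condition`, `probe`, `answer`), the treatment of fillers as carrying no critical condition (`None`), and the toy trial set are all assumptions.

```python
import random

def run_would_exceed(sequence, trial, key, max_run):
    """True if appending `trial` would create a run longer than `max_run` on `key`."""
    value = trial[key]
    if value is None:  # assumption: fillers carry no critical-condition label
        return False
    tail = sequence[-max_run:]
    return len(tail) == max_run and all(t[key] == value for t in tail)

def longest_run(sequence, key):
    """Length of the longest run of identical non-None values of `key`."""
    best = current = 0
    previous = None
    for trial in sequence:
        value = trial[key]
        if value is None:
            current = 0
        elif value == previous:
            current += 1
        else:
            current = 1
        previous = value
        best = max(best, current)
    return best

def pseudo_randomize(trials, keys=("condition", "probe", "answer"),
                     max_run=3, seed=0, max_restarts=1000):
    """Greedy randomized ordering with restarts (hypothetical helper)."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        remaining = list(trials)
        sequence = []
        while remaining:
            candidates = [i for i, t in enumerate(remaining)
                          if not any(run_would_exceed(sequence, t, k, max_run)
                                     for k in keys)]
            if not candidates:
                break  # dead end: start over with a fresh random order
            sequence.append(remaining.pop(rng.choice(candidates)))
        if not remaining:
            return sequence
    raise RuntimeError("no valid order found within max_restarts")

# Toy trial set: 30 critical trials (10 per condition) and 30 fillers.
conditions = ["Referent", "Status", "Ambiguous"]
trials = [{"condition": conditions[i % 3] if i < 30 else None,
           "probe": ("context", "utterance")[i % 2],
           "answer": ("yes", "no")[(i // 2) % 2]}
          for i in range(60)]
order = pseudo_randomize(trials)
print({k: longest_run(order, k) for k in ("condition", "probe", "answer")})
```

A constructive approach is used here because simply reshuffling until all constraints hold becomes impractical for long lists with a two-valued attribute such as the yes/no response.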

A few behavioral measurements were administered after the EEG session. Participants were asked to complete the Empathy Quotient (EQ-40) questionnaire to measure self-reported empathic abilities (Baron-Cohen and Wheelwright, 2004). A Chinese version of the reading span task adapted from Daneman and Carpenter (1980) was used to measure verbal working memory performance (Ye and Zhou, 2008). Two scenario rating tests were administered to all the participants to validate the contextual manipulation and to evaluate individual differences in the sensitivity to the social status information in the context. In the reference ambiguity rating, participants were asked to rate (7-point Likert scale, 1 representing the most ambiguous and 7 representing the least ambiguous) the level of ambiguity of a given pronoun referring to a person in the conversational context (i.e., the same as the pretest) for all the critical stimuli. In the appropriateness rating, participants rated the degree of appropriateness of using a pronoun (7-point Likert scale, 1 representing the least appropriate and 7 representing the most appropriate). Included were four types of scenarios: 10 containing the correct use of and 10 containing the incorrect use of the respectful form of the second-person pronoun (Nin-de, a higher-status speaker addressing a lower-status addressee) and 10 containing the correct use of and 10 containing the incorrect use of the informal form of the second-person pronoun (Ni-de, a lower-status speaker addressing a higher-status addressee).

## EEG Recording

EEGs were recorded from 64 scalp sites using Ag/AgCl electrodes mounted in an elastic cap (Brain Products, Munich, Germany) according to the international 10–20 system. The vertical electrooculogram (VEOG) was recorded supra-orbitally from the right eye. The horizontal EOG (HEOG) was recorded from electrodes placed at the outer canthus of the left eye. All EEGs and EOGs were referenced online to an external electrode placed on the tip of the nose and were re-referenced offline to the mean of the bilateral mastoids. Electrode impedance was kept below 5 kΩ for all electrodes. The bio-signals were amplified with a band pass from 0.016 to 100 Hz and digitized online with a sampling frequency of 500 Hz.

## EEG Analysis

The EEG data were preprocessed with Brain Vision Analyzer software. The EEG signals were corrected for ocular artifacts using algorithms developed by Gratton et al. (1983), and were then segmented with an epoch of 1800 ms time-locked to the onset of the pronoun (from 200 ms before to 1600 ms after the onset). The segmented EEGs were filtered with a 30 Hz low-pass filter with a slope of 24 dB/oct. The resulting data were baseline corrected according to the mean amplitude of the pre-onset activity (−200 to 0 ms). Trials were rejected if they exceeded ±70 µV in amplitude, contained a transient of over 100 µV in a period of 100 ms, or contained activity lower than 0.5 µV in a period of 100 ms.
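The three rejection criteria can be expressed as a simple check on a single-channel epoch. The Python sketch below is an illustration under stated assumptions, not a reproduction of the Brain Vision Analyzer routines: a "transient" is read as a peak-to-peak range above 100 µV within any 100 ms window, and "low activity" as a peak-to-peak range below 0.5 µV within any 100 ms window. The function name `reject_epoch` and its parameters are hypothetical; the 500 Hz sampling rate follows the recording description.

```python
import math

def reject_epoch(samples_uv, srate_hz=500, abs_limit=70.0,
                 max_transient=100.0, min_activity=0.5, window_ms=100):
    """Return True if a baseline-corrected epoch (in microvolts) should be rejected.

    Assumed reading of the criteria: absolute amplitude above `abs_limit`,
    or a peak-to-peak range within any `window_ms` window that is above
    `max_transient` or below `min_activity`.
    """
    if any(abs(v) > abs_limit for v in samples_uv):
        return True
    win = int(srate_hz * window_ms / 1000)  # 50 samples at 500 Hz
    for start in range(len(samples_uv) - win + 1):
        chunk = samples_uv[start:start + win]
        span = max(chunk) - min(chunk)
        if span > max_transient or span < min_activity:
            return True
    return False

# 1800 ms epoch at 500 Hz = 900 samples; a clean 10 Hz, 10 µV oscillation.
clean = [10.0 * math.sin(2 * math.pi * 10 * i / 500) for i in range(900)]
spike = list(clean)
spike[450] = 80.0            # exceeds the ±70 µV limit
flat = [0.0] * 900           # no activity at all

print(reject_epoch(clean), reject_epoch(spike), reject_epoch(flat))
```

In practice such a check would be applied to every channel of a trial, with the trial discarded if any channel fails.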

Trials that were inaccurately verified or contaminated by excessive artifacts were excluded from the statistical analysis, rendering 33, 37, and 38 trials on average for the Referent, Status, and Ambiguous conditions, respectively. The differences between conditions were not significant, F < 1. Mean ERP amplitudes were calculated for each time window, participant, and condition. Based on visual inspection and previous findings on the respectful pronoun (Jiang et al., 2013b), four time windows of interest were selected: 300–600 ms for the N400, 600–900 ms for the late positivity, 900–1600 ms for the sustained late positivity, and 1300–1600 ms for the sustained anterior negativity. Repeated-measures ANOVA was performed on the mean amplitudes, with experimental conditions (3 levels: Referent, Status, Ambiguous), Hemisphere (3 levels: left, medial, right), and Region (3 levels: anterior, central, posterior) as within-participant variables. The Hemisphere and Region were crossed, forming nine regions of interest (ROI), each with 5–6 representative electrodes: left-anterior (F7, F5, F3, FT7, FC5, FC3), left-central (T7, C5, C3, TP7, CP5, CP3), left-posterior (P7, P5, P3, PO7, PO3), medial-anterior (F1, Fz, F2, FC1, FCz, FC2), medial-central (C1, Cz, C2, CP1, CPz, CP2), medial-posterior (P1, Pz, P2, O1, POz, O2), right-anterior (F4, F6, F8, FC4, FC6, FT8), right-central (C4, C6, T8, CP4, CP6, TP8), and right-posterior (P4, P6, P8, PO4, PO8). Mean ERP magnitudes for each ROI were averaged over the electrodes in each region.
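The ROI averaging step lends itself to a compact illustration. The Python sketch below encodes the nine electrode groups listed above and averages per-electrode mean amplitudes within each group. The electrode lists come from the text; the function name `roi_means` and the input format (a dictionary mapping electrode labels to mean amplitudes in µV for one participant, condition, and time window) are assumptions.

```python
ROIS = {
    "left-anterior":    ["F7", "F5", "F3", "FT7", "FC5", "FC3"],
    "left-central":     ["T7", "C5", "C3", "TP7", "CP5", "CP3"],
    "left-posterior":   ["P7", "P5", "P3", "PO7", "PO3"],
    "medial-anterior":  ["F1", "Fz", "F2", "FC1", "FCz", "FC2"],
    "medial-central":   ["C1", "Cz", "C2", "CP1", "CPz", "CP2"],
    "medial-posterior": ["P1", "Pz", "P2", "O1", "POz", "O2"],
    "right-anterior":   ["F4", "F6", "F8", "FC4", "FC6", "FT8"],
    "right-central":    ["C4", "C6", "T8", "CP4", "CP6", "TP8"],
    "right-posterior":  ["P4", "P6", "P8", "PO4", "PO8"],
}

def roi_means(amplitudes):
    """Average per-electrode mean amplitudes within each ROI (hypothetical helper)."""
    return {roi: sum(amplitudes[ch] for ch in channels) / len(channels)
            for roi, channels in ROIS.items()}

# Usage with made-up amplitudes: every electrode at 1.0 µV except the
# left-posterior group, which is given the values 1 to 5 µV.
amplitudes = {ch: 1.0 for channels in ROIS.values() for ch in channels}
amplitudes.update({"P7": 1.0, "P5": 2.0, "P3": 3.0, "PO7": 4.0, "PO3": 5.0})
print(roi_means(amplitudes)["left-posterior"])  # 3.0
```

The resulting nine values per participant, condition, and time window are what the Hemisphere by Region factors of the ANOVA operate on.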

To evaluate the effects of empathic ability and status-sensitivity on pronoun resolution in each condition, these ANOVA models also included EQ or Differential Score between status-incongruent and status-congruent sentences in the post-EEG Appropriateness Rating (as an index of status-sensitivity) as a covariate. WM span was added as a statistical control. Regression analysis was further performed on each ERP effect whenever there was an interaction involving experimental condition and EQ/Differential Score, using EQ or Differential Score as an independent factor and the magnitude difference in an ERP effect as a dependent factor. All the continuous variables were z-score transformed before entering the model. Greenhouse-Geisser correction was applied whenever the numerator degrees of freedom exceeded 1. Post-hoc comparisons between conditions were planned and the significance level was estimated with Bonferroni correction. Partial η<sup>2</sup> was reported as a measure of effect size (η<sub>p</sub><sup>2</sup>). Marginally significant effects were further examined with the Bayes factor (BF), which was calculated as the ratio between the probability of an effect to be true and the probability of a null effect based on the observation (Morey and Rouder, 2011; Rouder et al., 2012), and were only considered more likely to be true when the BF was larger than three (Rouder et al., 2009). The reported marginal effects all survived this examination.
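The two preprocessing steps for the covariates, computing a Differential Score and z-scoring, can be sketched in a few lines of Python. The direction of the subtraction (congruent minus incongruent, so that larger values mean greater status-sensitivity) and the use of the sample standard deviation are assumptions; the function names are introduced here for illustration.

```python
from statistics import mean, stdev

def differential_score(congruent_ratings, incongruent_ratings):
    """Mean appropriateness rating for congruent minus incongruent scenarios.

    Assumed direction: larger scores indicate greater status-sensitivity.
    """
    return mean(congruent_ratings) - mean(incongruent_ratings)

def z_scores(values):
    """Standardize values across participants (sample SD assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Usage with made-up ratings for one participant and made-up scores for three.
print(differential_score([7, 6], [3, 2]))  # 4.0
print(z_scores([1.0, 2.0, 3.0]))           # [-1.0, 0.0, 1.0]
```

Standardizing the covariates puts EQ, Differential Score, and WM span on a common scale, so regression coefficients can be compared across predictors.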


## RESULTS

## Individual Differences Measures

The post-ERP questionnaire revealed large individual differences in both EQ (Mean = 39.63 out of 80, ranging from 16 to 61) and WM span (Mean = 3.19 out of 7, ranging from 2 to 6.5). No correlation was observed between EQ and WM span, r = 0.01, p = 0.94.

## Post-ERP Scenario Ratings

For the appropriateness rating, the repeated-measures ANOVA included Scenario Type (Status-congruent vs. Status-incongruent) as a within-participant factor for by-participant and by-item analysis and EQ as a covariate for by-participant analysis. To control for the effect of WM on pronoun resolution (Nieuwland and Van Berkum, 2006, 2008), we included WM span as a control variable in all the by-participant analyses. The ANOVA revealed a significant effect of Scenario Type, F1(1, 29) = 219.18, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.88, F2(1, 19) = 447.85, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.96. Consistent with Jiang et al. (2013b), the appropriateness rating showed that participants rated the status-incongruent utterances (3.14 for Nin-de sentences and 2.49 for Ni-de sentences) as less appropriate than status-congruent utterances (6.49 for Nin-de sentences and 6.65 for Ni-de sentences), suggesting that the participants were sensitive to the social status information in the context and were aware of the misapplication of pronoun to an addressee of a certain social status. The by-participant analysis revealed a significant interaction between EQ and congruency, F(1, 29) = 3.11, p = 0.03, η<sub>p</sub><sup>2</sup> = 0.77. A linear regression analysis revealed that empathy only modulated the appropriateness rating in the congruent condition, b = 0.15, t = 2.10, p = 0.04, indicating that participants with higher empathy tended to judge the congruent sentences to be more appropriate than those with lower empathy (6.67 vs. 6.44 out of 7, if participants were median split and grouped according to the scores of the empathy measure).

Consistent with the rating prior to the EEG experiment, the post-EEG ambiguity rating showed that the participants rated the Referent condition as the least ambiguous (Mean = 6.89, SD = 0.25), the Status condition as more ambiguous (Mean = 5.37, SD = 1.06), and the Ambiguous condition as the most ambiguous (Mean = 1.79, SD = 1.09). ANOVAs were conducted, taking experimental condition (3 levels: Referent, Status, Ambiguous) as a within-participant factor for by-participant and by-item analysis and EQ as a covariate for by-participant analysis. Results revealed a significant main effect of experimental condition, F1(2, 58) = 237.35, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.89; F2(2, 298) = 4146.59, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.97. Post-hoc comparisons showed that the differences between conditions were all significant, ps < 0.0001. The by-participant ANOVA also revealed a significant interaction between condition and EQ, F(2, 58) = 3.56, p = 0.02, η<sub>p</sub><sup>2</sup> = 0.09. The regression analysis revealed a marginally significant effect of EQ on the rating score in the Status condition, b = 0.36, t = 2.05, p = 0.05, suggesting that participants with higher empathy tended to judge the sentences to be less ambiguous. The scores were 5.42 vs. 5.09 (out of 7) for the median split groups.

To further analyze the effect of individual sensitivity to status information on the ambiguity rating, we performed ANOVA including experimental condition as a within-participant factor and the Differential Score in the post-EEG appropriateness rating (calculated for each participant) as a covariate. Results revealed a significant interaction between Scenario Type and Differential Score, F(2, 58) = 25.38, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.47. The ambiguity rating was positively predicted by Differential Score in the Status condition, b = 0.55, t = 3.53, p < 0.005, and in the Referent condition, b = 0.08, t = 1.95, p = 0.06, and negatively predicted by Differential Score in the Ambiguous condition, b = −0.73, t = −4.60, p < 0.005. These findings suggested that the larger the difference the participant showed in the appropriateness ratings (i.e., the more status-sensitive), the less ambiguous they judged the scenarios in the Referent and Status conditions, and the more ambiguous they judged the scenarios in the Ambiguous condition. The rating scores were 6.94 vs. 6.83, 5.69 vs. 4.91, and 1.37 vs. 2.28 (out of 7) for the three conditions, respectively, if the participants were median split into the more sensitive vs. less sensitive group.

## Online Sentence Verification Task

On average, 82.5% (SD = 9.9%), 84.4% (SD = 7.6%), and 85.0% (SD = 8.5%) of scenarios were verified accurately for the Referent, Status, and Ambiguous conditions, respectively. No differences were found in accuracy between conditions, F < 1, suggesting that the participants were equally attentive to each type of scenario in the experiment.

## ERPs

**Figure 1** depicts the grand average ERPs spanning from the pronoun to the following noun. The topographic distributions of the differential ERPs between conditions are displayed in **Figure 2**. The Referent condition elicited more negative responses (the N400 effect) in the 300–600 ms time window as compared with the Status and the Ambiguous conditions. Starting from around 600 ms, however, the Status condition showed more positive responses than the Referent condition, and this positivity effect lasted until the end of the following noun (i.e., 1600 ms post onset of the pronoun). In contrast, in a later 900–1600 ms window, the Ambiguous condition showed an anteriorly distributed sustained negativity effect relative to the Referent condition. Statistical analyses confirmed these observations.

## The N400 Effect in the 300–600 ms Time Window

ANOVA with experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of Hemisphere: left, medial, right; 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and EQ as a covariate revealed a significant main effect of condition, F(2, 58) = 8.66, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.23, with the Referent condition eliciting larger N400 responses than the Status or the Ambiguous conditions, p < 0.001 and p < 0.05, respectively. No difference was found between the latter two conditions, p > 0.1. A significant interaction between experimental condition and Hemisphere was found, F(4, 116) = 4.73, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.14. As can be seen in **Figure 2**, the N400 effect for the Referent condition was larger

in the right hemisphere than in the left hemisphere. There was also a four-way interaction between Scenario Type, Hemisphere, Region, and EQ, F(8, 232) = 4.03, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.12.

To evaluate the relationship between EQ and the N400 effect, linear regression analyses were performed on each ROI, treating EQ as a covariate and WM as a control variable. The EQ significantly predicted the N400 difference between the Referent and the Status conditions in the left posterior region, b = −1.33, t = −2.91, p = 0.007, and the N400 difference between the Referent and the Ambiguous conditions in the left posterior region, b = −1.02, t = −2.41, p = 0.02. Participants with higher empathy tended to show larger N400 effects for the Referent condition relative to the Status and Ambiguous conditions (or more reduced N400 responses for the Status and Ambiguous conditions relative to the Referent condition). To illustrate this trend, we grouped participants according to their EQ scores and depicted the group ERP responses in **Figures 3A,B**. It should be noted that the N400 response in the Status condition, although much more reduced in the high-empathy group, may not represent what is typically meant by an N400 effect (**Figure 3A**). The high-empathy group showed a positive shift for the Status condition starting around 300 ms in the frontal region.

ANOVA with experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of Hemisphere: left, medial, right; 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and the Differential Score in the appropriateness rating as a covariate revealed a significant three-way interaction between Differential Score, Scenario Type, and Hemisphere, F(4, 116) = 3.11, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.10. Regression analysis in each hemisphere, which controlled for WM, revealed a significant effect of Differential Score in the right hemisphere for the N400 differences between the Referent and the Status conditions and between the Referent and the Ambiguous conditions, b = −0.75, t = −2.55, p = 0.01; b = −0.78, t = −2.90, p = 0.005, respectively. These findings suggest that, distinct from the role of empathy, which predicted the N400 effect in the left and medial posterior regions, the status-sensitivity predicted this negativity effect in the right hemisphere. Participants who displayed increased sensitivity to the difference between the status-incongruent and the status-congruent scenarios had a larger N400 effect for the Referent condition (or more reduced N400 responses for the Status and the Ambiguous conditions, **Figures 4A,B**). Similar to the high-empathy group, the high-sensitivity group also showed a less typical pattern of N400 responses in the Status condition, with a positive shift following the negative peak at about 300 ms (**Figure 4A**).

## The Late Positivity Effects in the 600–900 ms Window

ANOVA with experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of Hemisphere: left, medial, right, and 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and EQ as a covariate revealed a significant main effect of condition, F(2, 58) = 3.71, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.12, indicating that the Status condition elicited a positivity effect relative to the Referent and Ambiguous conditions (**Figure 2**), ps < 0.05. No difference was observed between the Referent and the Ambiguous conditions, p > 0.1. There was a significant three-way interaction between experimental condition, EQ, and region, F(4, 116) = 3.23, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.10. Linear regression revealed a significant influence of EQ on the magnitude of the difference between the Status and the Referent conditions in all the regions (anterior: b = 1.03, t = 3.24, p = 0.002; central: b = 0.90, t = 2.58, p = 0.01; posterior: b = 1.00, t = 3.17, p = 0.002). These findings suggest that empathy predicted the late positive effect in the Status condition. The higher the empathic ability the participant exhibited, the larger the late positive effect (**Figures 3A,B**).

ANOVA with experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of

Hemisphere: left, medial, right; 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and Differential Score as a covariate revealed a significant two-way interaction between Differential Score and experimental condition, F(2, 58) = 3.09, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.09. Regression analysis, which was performed on ERP differences collapsing over hemispheres and regions, revealed a significant effect of Differential Score on the ERP difference between the Status and the Referent conditions, b = 0.94, t = 4.93, p < 0.001. These findings suggest that the Differential Score in the appropriateness rating predicted the late positivity effect in the 600–900 ms time window: participants showing an increased sensitivity to the appropriate usage of the pronoun in a status-given context also had a larger late positivity for the Status condition (**Figures 4A,B**).

### The Delayed, Sustained Positivity Effect in the 900–1600 ms Time Window

ANOVA taking experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of Hemisphere: left, medial, right; 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and EQ as a covariate revealed a significant three-way interaction between Scenario Type, Hemisphere, and Region, F(8, 232) = 2.68, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.09. Further analysis on each ROI revealed a significant difference between the Status and the Ambiguous conditions in the left posterior, medial posterior, right central, and right posterior regions, ps < 0.05, and a significant difference between the Status and the Referent conditions in the right central and right posterior regions, ps < 0.05, suggesting that the positivity effect elicited by the Status condition relative to the Ambiguous condition in the 600–900 ms window continued to develop and was sustained until the end of the noun following the pronoun. There was a marginally significant two-way interaction between experimental condition and EQ, F(2, 58) = 2.87, p = 0.07, η<sub>p</sub><sup>2</sup> = 0.09, and a significant three-way interaction between experimental condition, EQ, and region, F(4, 116) = 4.90, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.14. Linear regressions in each region revealed a significant effect of EQ on the sustained effect between the Status and Referent conditions in the anterior, b = 1.57, t = 4.83, p < 0.001, central, b = 1.18, t = 3.33, p = 0.001, and posterior regions, b = 0.94, t = 2.87, p = 0.005, suggesting that the higher the empathy of the comprehender, the larger the sustained positivity shown in the Status condition (**Figures 3A,B**).

ANOVA taking experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of Hemisphere: left, medial, right; 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and Differential Score as a covariate revealed a significant interaction between experimental condition and Differential Score, F(2, 58) = 3.29, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.08. Regression analysis revealed a significant effect of Differential Score on the late sustained positivity between the Status and Referent condition, b = 0.86, t = 4.32, p < 0.001, and between the Status and Ambiguous condition, b = 0.89, t = 3.35, p = 0.001. These findings suggest that the

differential score between the status-incongruent and status-congruent sentences in the appropriateness rating predicted the sustained positive response. Comprehenders with increased sensitivity to the appropriate usage of pronoun in different status contexts demonstrated larger positivity effects in the Status condition (**Figures 4A,B**).

To evaluate whether the empathic ability modulated the positivity effect through the status-sensitivity in the Status condition, mediation analyses were performed for each ROI, with empathic ability as the independent variable, the ERP magnitude difference between the Status and Referent conditions as the dependent variable, and the Differential Score in the appropriateness rating of the status-incongruent vs. congruent sentences as the mediator. We tested for mediation by deriving 95% bias-corrected confidence intervals (CIs) from 5000 bootstrap estimates (MacKinnon et al., 2004; Preacher and Hayes, 2004, 2008). Higher EQ predicted greater Differential Score, which in turn predicted greater amplitude of the late positive effects in the Status condition (in the left and medial posterior regions in the 600–900 ms window and in the medial posterior region in the 900–1600 ms window). The indirect path was significant (600–900 ms: b = 1.35, t = 3.19, p = 0.003; b = 1.54, t = 3.23, p = 0.003; 900–1600 ms: b = 1.48, t = 2.73, p = 0.01), and the estimates of the direct path between EQ and the amplitude of the positive response were reduced but still marginally significant when the mediator was entered in the model (600–900 ms: b = 1.15, t = 1.91, p = 0.06; b = 1.32, t = 1.96, p = 0.05; 900–1600 ms: b = 1.28, t = 1.76, p = 0.09), suggesting that the status-sensitivity partially mediated the relationship between EQ and the late positive effects. The 95% bias-corrected bootstrap CIs for the indirect effect excluded zero, confirming the mediating role of the status-sensitivity [600–900 ms: (0.02, 1.46); (0.05, 1.90); 900–1600 ms: (0.01, 1.71)]. These findings indicate that comprehenders with a higher EQ had increased positive responses in the Status condition and this effect was partly due to the increased status-sensitivity of these individuals.
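The logic of a bootstrap test of mediation can be illustrated with a minimal Python sketch. It estimates the indirect effect as the product of two ordinary least-squares coefficients (the path from X to the mediator M, and the path from M to Y controlling for X) and derives a bias-corrected percentile interval from resampled data. This is a simplified single-mediator model without the WM control, run on synthetic data; it is not the procedure or software used in the study, and all function names are introduced here for illustration.

```python
import random
from statistics import NormalDist, mean

def slope(x, y):
    """OLS slope of y regressed on x (single predictor with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

def slope_controlling(m, x, y):
    """OLS coefficient of m in the model y ~ m + x (with intercept)."""
    mm, mx, my = mean(m), mean(x), mean(y)
    smm = sum((a - mm) ** 2 for a in m)
    sxx = sum((a - mx) ** 2 for a in x)
    smx = sum((a - mm) * (b - mx) for a, b in zip(m, x))
    smy = sum((a - mm) * (b - my) for a, b in zip(m, y))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (smy * sxx - sxy * smx) / (smm * sxx - smx ** 2)

def indirect_effect(x, m, y):
    """Product of the X-to-M path and the M-to-Y path controlling for X."""
    return slope(x, m) * slope_controlling(m, x, y)

def bc_bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Bias-corrected bootstrap CI for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    theta = indirect_effect(x, m, y)
    boots = []
    while len(boots) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            boots.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
        except ZeroDivisionError:
            continue  # skip degenerate resamples
    boots.sort()
    nd = NormalDist()
    # Bias correction: shift the percentiles by how off-center theta is.
    prop = sum(b < theta for b in boots) / n_boot
    prop = min(max(prop, 1 / n_boot), 1 - 1 / n_boot)
    z0 = nd.inv_cdf(prop)
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    lo = boots[min(n_boot - 1, int(lo_p * n_boot))]
    hi = boots[min(n_boot - 1, int(hi_p * n_boot))]
    return theta, (lo, hi)

# Synthetic example: X drives M, and M drives Y.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(60)]
m = [xi + rng.gauss(0, 0.5) for xi in x]
y = [mi + rng.gauss(0, 0.5) for mi in m]
theta, (lo, hi) = bc_bootstrap_ci(x, m, y, n_boot=2000)
print(f"indirect = {theta:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Mediation is inferred when the interval for the indirect effect excludes zero, which is the criterion applied to the CIs reported above.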

## The Late, Anterior Negativity Effect: 1300–1600 ms Time Window

ANOVA with experimental condition (3 levels: Referent, Status, Ambiguous) and topographic variables (3 levels of Hemisphere: left, medial, right; 3 levels of Region: Anterior, Central, Posterior) as within-participant factors and EQ as a covariate revealed a significant three-way interaction between experimental condition, Hemisphere, and Region, F(8, 232) = 2.98, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.09. Further analysis for each ROI revealed marginally significant differences between the Ambiguous and Referent conditions in the left anterior, medial anterior, and right anterior regions, ps < 0.05, and between the Status and Referent conditions in the left central and left posterior regions, ps < 0.05. As shown in **Figures 1**, **2**, these findings suggest that the Ambiguous condition elicited an anteriorly-distributed negativity effect, relative to the Referent condition, while the Status condition elicited a larger posterior positivity. ANOVA also revealed a marginally significant interaction between EQ and experimental condition, F(2, 58) = 3.04, p = 0.07, η<sub>p</sub><sup>2</sup> = 0.09, and a significant three-way interaction between EQ, experimental condition and Region, F(4, 116) = 5.63, p < 0.01, η<sub>p</sub><sup>2</sup> = 0.16. Linear regression analysis did not reveal any effect of EQ on the ERP differences

between the Ambiguous and the Referent conditions, ps > 0.1, but it did reveal an effect of EQ on the difference between the Status and the Referent conditions in the anterior, b = 1.60, t = 4.61, p < 0.001, central, b = 1.18, t = 3.26, p = 0.002, and posterior regions, b = 1.02, t = 2.86, p = 0.005. Such a predictability effect of EQ, consistent with the previous analysis of the ERP effects in the 600–900 ms and 900–1600 ms windows, suggests a continuous impact of the comprehender's empathic ability on ERP responses elicited by the Status condition. ANOVA with experimental condition and topographic variables as within-participant factors and Differential Score as a covariate found only a marginally significant three-way interaction, F(4, 116) = 3.26, p < 0.05, η<sub>p</sub><sup>2</sup> = 0.07. However, further analysis did not reveal any significant effect of Differential Score on the negativity for the Ambiguous condition.

## DISCUSSION

This study aimed to provide behavioral and neurophysiological evidence on how the social status information in the context and the individual's empathy and sensitivity to social status affect referential ambiguity resolution in directly-quoted utterances. We first demonstrated a graded decrease of the perceived ambiguity over the Ambiguous, Status, and Referent conditions. The perceived ambiguity was the lowest in the Referent condition, suggesting that a status word before the second-person pronoun can serve as a cue for the reactivation of the target referent and effectively facilitate pronoun resolution. The ambiguity in the Status condition was also lower than that in the Ambiguous condition; the comprehender may compare the social status of the two potential referents in the context and choose the one of higher status for the respectful pronoun in the former condition. In the Ambiguous condition, however, the two potential referents in the context were of equal status and they engaged in a dead-end competition for being the target addressee of the pronoun, which involved an effortful maintenance of the antecedent information in the limited working memory.

Consistent with the hypothesis that empathic ability plays a crucial role for comprehenders to make use of pragmatic (social status) information in the context to resolve referential ambiguity, we demonstrated that EQ modulates the perceived ambiguity in the Status condition: individuals with higher EQ perceived less ambiguity in the sentences even when a clear social status difference existed between the two potential referents. This hypothesis was further supported by the finding that comprehenders with higher sensitivity to social status information also perceived less ambiguity in the Status condition, suggesting that individuals sensitive to the social status hierarchy are more able to resolve referential ambiguity using the status information.

Findings in the ERP data are generally consistent with the above arguments. In the following discussions, we focus on the ERP effects for different experimental conditions.

## The Increased N400 Responses in the Referent Condition

Generally speaking, the N400 responses are reduced in a highly predictive sentential or discourse context in which the mental representation of contextual information or an individual lexical item facilitates the semantic access of the target word (e.g., "access account," Kutas and Federmeier, 2011); the N400 responses are enhanced when a target word is incongruent with the previous lexical, sentential, or conversational context or is difficult in being integrated into the comprehender's knowledge or belief system ("integration account," Hagoort et al., 2004; Van Berkum et al., 2009; Leuthold et al., 2012; Nieuwland and Martin, 2012; Jiang et al., 2013a,b; Ellis et al., 2015; Wang et al., 2015). A respectful pronoun used to address a lower-status addressee or a less respectful pronoun used to address a higher-status addressee elicited increased N400 responses (Jiang et al., 2013b).

The first account (Kutas and Federmeier, 2000, 2011) argues that the sentence context with more constraining information toward the upcoming word reduces N400 responses to that word. In the Referent condition, the additional status word preceding the pronoun formulates additional contextual information which may reduce the N400 response and ease the access toward the upcoming respectful pronoun. The behavioral rating revealed a lower perceived ambiguity in the Referent condition than the other two conditions. However, we found that the pronoun in the Referent condition showed larger N400 responses than those in the other conditions, a pattern opposite to what would be predicted by the access account; this account would predict an easier rather than a disruptive access for the Referent condition. Alternatively, given that there were 90 critical scenarios with Nin-de as the target pronoun but only 40 filler scenarios with Ni-de as the target pronoun, the system might be biased toward expecting the higher-status individual as the potential addressee. Such expectancy would reduce the N400 responses to nin-de in all the conditions relative to a balanced design, but less so for the Status and Ambiguous conditions. This would lead to enlarged N400 effects for the two conditions, compared to the Referent condition, a prediction, however, not supported by our data.

The integration account attributes the increased N400 responses in the Referent condition, relative to the other two conditions, to the increased effort of simultaneously integrating the pronoun to the higher-status referent and to the status word inserted before the pronoun. The modulation of N400 has been found on the pronoun with no explicit antecedent in the context to be integrated with (e.g., the in-flight meal I got was more impressive than usual. In fact, he/they courteously presented the food as well.). The pronoun (he) that highly demands an explicit antecedent elicited larger negative responses than the one (they) that is less disruptive in the absence of the antecedent (Filik et al., 2008). Another study required the listeners to discriminate a visually presented object from its competitor based on the auditory description of its color and shape. The N400 observed on the color word (e.g., "red" in the red square) was increased when this word was redundantly uttered for discrimination in the visual display (e.g., a red square and a blue star), relative to when it served as critical information (e.g., a red square and a blue square, Engelhardt et al., 2011). These findings suggest that the N400 increase is associated with the increased demand of integration between the referential expression and what it is referred to in the context. Here, although the status word could help to disambiguate which of the two characters in the context should be the addressee and make the reference tracking easier, the pronoun nevertheless has to be linked with both the status word in the utterance and one of the characters described in the context. An integrated representation of "whom is referred to" has to be established based on the context including both the character and the status word. Such integration effort was reduced in the Status condition since the pronoun merely linked with the character of higher status, which had been specified by the character's name.

Future studies can better address how the pronoun-locked N400 effect is affected by the conversational context by adding a control condition which includes an ambiguous context and an explicit status word, and by comparing the unambiguous Referent condition with that control condition. The integration account would still predict a larger N400 in the unambiguous than the ambiguous condition due to the necessity to link the pronoun with both the status word and the contextually appropriate antecedent. The access account would predict a reduced N400 for the former due to facilitated access of a higher-status antecedent in the context.

How then can we account for the correlations between the N400 effect and the individuals' empathic ability or sensitivity to social status? Previous studies have shown that nouns mismatching the pragmatic constraints of scalar quantifiers elicited an N400 effect only in readers with a low autistic quotient (Nieuwland et al., 2010); pronouns mismatching the social status in the context elicited an N400 effect only in readers exhibiting high fantasizing ability (Jiang and Zhou, 2015). Different components of cognitive empathy (i.e., perspective-taking and fantasizing) differentially modulated the neural activity underlying pragmatic failure (which demanded a re-interpretation/conflict resolution) and pragmatic underspecification (which demanded an inferential process) (Li et al., 2014). Based on these findings, one can envisage that the comprehenders' sensitivity to the social status information in the communicative context or their empathic ability in deriving the underlying message could modulate the processes in making use of the status information and specifying an appropriate antecedent for the pronoun or in dealing with pragmatic ambiguity. The stronger the ability, the weaker the N400 responses to the pronoun in the Status or the Ambiguous conditions, and the larger the N400 effect for the Referent condition. In other words, the variation of the size of the N400 effect over participants was mainly due to individual differences in the neural responses to the Status and Ambiguous conditions rather than the neural responses to the Referent condition.

Similarly, comprehenders with increased sensitivity to the status-pronoun mismatch showed a larger N400 effect when the pronoun mismatched the social context, replicating Jiang and Zhou (2015). In the latter study, the readers were presented with scenarios in which the informal form of the second-person pronoun was used to address the addressee of lower status (correct usage) or the addressee of higher status (disrespectful usage). The N400 responses were enlarged on pronouns used in a disrespectful way, and this effect was modulated by the comprehenders' Difference Score ratings for the appropriateness of the respectful and disrespectful scenarios. In the current study, the successful resolution of the pronoun in the directly-quoted utterance depended on the matching of the respectful pronoun with the person of higher social status in the context. The higher the sensitivity to the social status information, the stronger the ability to use this information, the weaker the N400 responses to the pronoun in the Status or the Ambiguous conditions, and the larger the N400 effect for the Referent condition.

It should be noted that status-sensitivity and empathic ability modulated the N400 effect in different hemispheric sites: the effect of status-sensitivity was in the right hemisphere, whereas the effect of the empathic ability was in the left and medial hemispheric sites. Individuals who excelled in recognizing social status in the context and those who showed expertise in empathizing with the conversational partner may engage different neurocognitive mechanisms in integrating the pronoun with the context, although further studies are needed to elucidate these mechanisms.

## The Late, Sustained Positivity in the Status Condition

The Status condition elicited a positivity effect post-onset on the pronoun that sustained until the end of the following object noun. This effect was similar to the positivity (P600) found on words eliciting ironic interpretations (Regel et al., 2010, 2011; Spotorno et al., 2013; Filik et al., 2014). This effect was also similar to the sustained positivity effect found on the respectful pronoun (and the word immediately following the pronoun) when the pronoun was used sarcastically in a directly-quoted utterance (i.e., used by a higher-status speaker addressing a lower-status addressee, Jiang et al., 2013b). A sustained positivity effect has also been found on words inconsistent with the preceding context describing an individual's traits, intention, or goal of an action (Van Duynslaeger et al., 2007; Baetens et al., 2011); it has been suggested to be manifested by the neural network subserving the mentalizing process (e.g., the temporo-parietal junction, Van Duynslaeger et al., 2007). These positivity effects may reflect a "pragmatic enrichment," second-pass processing strategy when a literal interpretation of the input meets difficulties and when contextual cues are sufficient to allow for the use of this strategy (Xu et al., 2015). Positivities with different latencies may subserve different components of pragmatic inference. The P600-like effect (340–730 ms) was found in vocal expressions which were ambiguous in indicating a speaker's confidence, while a more delayed sustained positivity was found in neutral-intending expressions which were acoustically different but perceptually similar to the confident expressions (Jiang and Pell, 2015); the former was associated with the attempt of continued evaluation of an ambiguous input, and the latter was responsible for successful derivation of the speaker's meaning from an incongruent perception.
In the Status condition of the current study, the late positivity (in 600–900 ms) and the delayed sustained positivity (900–1600 ms) may reflect different sub-processes. The comprehender was faced with a temporary referential ambiguity which may require continued analysis (the late positivity); this ambiguity was eventually resolved by the pragmatic inference process that was based on the status information in the context and the usage of respectful pronoun (the sustained positivity). This account is consistent with the MRC (Mental Representation of what is being Communicated) model suggested by Brouwer et al. (2012) and Brouwer and Hoeks (2013). Here the positivity effect could be interpreted as reflecting the difficulty of integrating the pronoun into the mental representation pre-established according to the communicative context. The context specifies two potential addressees, and an inference process must be conducted to establish which one is the actual addressee that could be linked with the pronoun. Only through this process can the pronoun be integrated into the MRC.

The positivity effects were modulated by the comprehender's empathic ability: the comprehender with higher empathizing ability displayed a larger positivity effect. This finding suggested that those who excel in empathizing are more likely to initiate the effort of inferring the addressee using the social status information in the context or the effort of integrating the pronoun into the communicative context. Such effort may help to reduce the ambiguity created by two potential antecedents. Indeed, we showed in the behavioral data that the increased EQ scores were associated with decreased perceived ambiguity when the context biases the selection of the target addressee. Another possibility is that the high-empathy group has higher sensitivity to pragmatic cues such as the social status of the communicator (Van den Brink et al., 2012) and this sensitivity facilitates the selection of the target addressee based on the context biases. These findings provide a first piece of evidence showing that using contextual information to implement pragmatic inference is subject to an individual's empathic ability in resolving verbal ambiguity (Li et al., 2014). In line with this argument, the EQ modulated the appropriateness rating of the respectful form usage, demonstrating its impact on an individual's sensitivity to the social status of the addressee. Moreover, the mediation analysis confirmed that the empathic ability affected the positivity effect partially through individuals' sensitivity to the status information in the context.

## The Delayed Anterior Negativity in the Ambiguous Condition

An early-starting, anteriorly distributed sustained negativity effect (Nref) has been observed on the third-person pronoun when two competing characters in the discourse are equally likely to be the antecedent of this pronoun (Van Berkum et al., 1999; Nieuwland and Van Berkum, 2006; Nieuwland et al., 2007). Different from the previous studies, the ambiguity in this experiment was developed on the second-person pronoun in a directly-quoted utterance whose referent had to be determined based on the social status information. The starting portion of Nref for the Ambiguous condition (relative to a one-referent, unambiguous baseline) may have overlapped with the N400 effect for the Referent condition, preventing us from observing this effect. However, assuming that the integration cost for the additional status word in the Referent condition had dissipated in the time windows later than the N400 window, the Nref for the Ambiguous condition, relative to the Referent condition, would be observable. The competition between the two possible referents would last for a long time until new information comes to specify which is a more possible antecedent of the pronoun.

## CONCLUSION

This study examined the role of social context as well as individual differences in ambiguity resolution during the comprehension of a conversational scenario involving a directly-quoted utterance and a singular, respectful pronoun. Behaviorally, the perceived ambiguity gradually decreased over the scenario without a disambiguating cue (the Ambiguous condition), the scenario in which differential status between the potential referents biased one to be the target addressee (the Status condition), and the scenario in which a status word unambiguously vocalized the target addressee (the Referent condition). Comprehenders with increased status-sensitivity perceived less ambiguity in the Status condition and more ambiguity in the Ambiguous condition; comprehenders with higher empathic ability also perceived less ambiguity in the Status condition. Electrophysiologically, the Referent, Status, and Ambiguous conditions were distinctively captured by increased N400 responses (300–600 ms), increased late sustained positivity (600–1600 ms), or late anterior negativity (or Nref, 1300–1600 ms), demonstrating differential neurocognitive processes underlying ambiguity resolution with different contextual cues. The late positivity effect demonstrated an inferential process in which pragmatic information was used to establish a potential referential link between the pronoun and its antecedent. The late negativity effect demonstrated an increased computational load in choosing one of the two competing antecedents. These findings highlight the role of disambiguating cues in the social context and the neurocognitive flexibility in using these cues to establish referential representations during utterance comprehension.

## ACKNOWLEDGMENTS

This work was supported by the Natural Science Foundation of China (31470976) and National Social Science Foundation of China (12&ZD119). We thank the two reviewers for their constructive comments and suggestions.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Jiang and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## A Test for the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS): Normative Data and Psychometric Properties

Giorgio Arcara<sup>1</sup> and Valentina Bambini <sup>2</sup> \*

*<sup>1</sup> Department of Neurosciences, University of Padua, Padua, Italy, <sup>2</sup> Center for Neurocognition, Epistemology and theoretical Syntax, Institute for Advanced Study (IUSS), Pavia, Italy*

The Assessment of Pragmatic Abilities and Cognitive Substrates (APACS) test is a new tool to evaluate pragmatic abilities in clinical populations with acquired communicative deficits, ranging from schizophrenia to neurodegenerative diseases. APACS focuses on two main domains, namely discourse and non-literal language, combining traditional tasks with refined linguistic materials in Italian, in a unified framework inspired by language pragmatics. The test includes six tasks (Interview, Description, Narratives, Figurative Language 1, Humor, Figurative Language 2) and three composite scores (Pragmatic Production, Pragmatic Comprehension, APACS Total). Psychometric properties and normative data were computed on a sample of 119 healthy participants representative of the general population. The analysis revealed acceptable internal consistency and good test-retest reliability for almost every APACS task, suggesting that items are coherent and performance is consistent over time. Factor analysis supports the validity of the test, revealing two factors possibly related to different facets and substrates of the pragmatic competence. Finally, excellent match between APACS items and scores and the pragmatic constructs measured in the test was evidenced by experts' evaluation of content validity. The performance on APACS showed a general effect of demographic variables, with a negative effect of age and a positive effect of education. The norms were calculated by means of state-of-the-art regression methods. Overall, APACS is a valuable tool for the assessment of pragmatic deficits in verbal communication. The short duration and easiness of administration make the test especially suitable to use in clinical settings. In presenting APACS, we also aim at promoting the inclusion of pragmatics in the assessment practice, as a relevant dimension in defining the patient's cognitive profile, given its vital role for communication and social interaction in daily life. The combined use of APACS with other neuropsychological tests could also improve our understanding of the cognitive substrates of pragmatic abilities and their breakdown.

Keywords: pragmatics, neuropragmatics, neuropsychological assessment, figurative language, discourse

Edited by: *Gabriella Airenti, University of Turin, Italy*

Reviewed by: *Pilar Prieto, Universitat Pompeu Fabra, Spain; Ivan Enrici, University of Turin, Italy*

\*Correspondence: *Valentina Bambini valentina.bambini@iusspavia.it*

### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *16 October 2015* Accepted: *12 January 2016* Published: *12 February 2016*

### Citation:

*Arcara G and Bambini V (2016) A Test for the Assessment of Pragmatic Abilities and Cognitive Substrates (APACS): Normative Data and Psychometric Properties. Front. Psychol. 7:70. doi: 10.3389/fpsyg.2016.00070*

## INTRODUCTION

Pragmatics concerns the interplay of linguistic content, contextual information and general communicative rules in guiding communication (Grice, 1975; Levinson, 1983; Sperber and Wilson, 2005). Typical domains of investigation in pragmatics are those verbal phenomena in which the gap between the literal meaning and the communicative meaning is clearly visible, and in which context plays a major role. Metaphor, irony and non-literal language in general are among those phenomena, as comprehenders are required to integrate contextual information, including beliefs and intentions, in order to reach the intended meaning. Aspects of discourse and conversation such as topic maintenance and coherence are also often included in the domain of pragmatics, as speakers need to adhere to rules of appropriateness to context in conducting the verbal exchange.

A long tradition tracing back to the early 1960s identified the right hemisphere as the site of pragmatic abilities in the brain (Joanette et al., 1990). This claim was based on research with different paradigms such as sentence-picture matching tasks for metaphor (Winner and Gardner, 1977) or completion of jokes (Brownell et al., 1983), as well as discourse analysis approaches to the patients' speech (Joanette and Brownell, 1990). However, it soon became evident that, in addition to right hemisphere brain damaged patients, a large number of clinical populations, while not being aphasic, show similar pragmatic impairments, including patients with schizophrenia, traumatic brain injury and neurodegenerative diseases (Stemmer, 2008; Bambini, 2010; Bambini and Bara, 2012).

The increasing volume of the literature in clinical pragmatics encouraged the development of standardized assessment tools for acquired pragmatic deficits. Tests for English fall into two main categories: structured batteries assessing the comprehension of non-literal language, such as the Right Hemisphere Communication Battery (Gardner and Brownell, 1986) and the Right Hemisphere Language Battery (Bryan, 1995), and tests for evaluating discourse and conversation produced by patients, such as the Pragmatic Protocol (Prutting and Kirchner, 1987) and the Profile of Communicative Appropriateness (Penn, 1985). Similarly, for Italian, both types of approaches were developed. Some tools assess pragmatic abilities with a main focus on non-literal language, among which are the Batteria sul Linguaggio dell'Emisfero Destro (BLED) (Rinaldi et al., 2004), the Italian version of the Protocole Montréal d'Évaluation de la Communication (MEC) (Tavano et al., 2013), and the Assessment Battery for Communication (ABaCo), which expands the evaluation of communicative abilities to non-verbal pragmatics (Angeleri et al., 2012; Bosco et al., 2012). Other methods focus on the analysis of the patient's speech (Marini et al., 2011), based on discourse analysis and pragmatic notions such as coherence and cohesion, measuring how sentences are connected and integrated in the global narrative context.

Despite increasing evidence of the vulnerability of the pragmatic aspects of communication in a large number of neurological and psychiatric conditions, and despite the existence of evaluation instruments, pragmatic assessment is rarely integrated in the clinical practice. Several reasons motivate this exclusion. First, language assessment usually concentrates on the formal aspects of language, for which a much larger number of standardized tools exist, in order to detect aphasic syndromes. Communicative disruptions at the pragmatic level, although frequently documented and qualitatively reported, are not considered part of the clinical profile and they are often ascribed to cognitive or social cognition deficits. This situation is probably related also to the cognitive substrates of pragmatics, which are known to be associated with a network of different abilities. Among these, Theory of Mind, i.e., the ability to represent another's mental state (Premack and Woodruff, 1978), seems to play a major role, along with executive functions (i.e., working memory, set-shifting, inhibition, planning and flexibility) (McDonald, 2008; Stemmer, 2008). Although the common opinion is that these abilities do not fully account for pragmatic deficit, the cognitive substrates of pragmatics are still considered a "puzzle" in the neuropsychological literature (Martin and McDonald, 2003; Champagne-Lavau et al., 2007). The second reason playing against the inclusion of pragmatic assessment is that the available pragmatic tests, while offering a fine-grained profile of the patient's communicative skills, are usually too long for clinical settings (90 min on average), and sometimes difficult to administer and score.

In light of this scenario, we aimed at promoting a better consideration of pragmatic aspects in describing the patient's clinical profile. To pursue this aim, we decided to expand the inventory of tools to assess pragmatic abilities, by producing a new test (Assessment of Pragmatic Abilities and Cognitive Substrates, APACS), with the following major innovative characteristics: (i) inclusion of the major domains of impairments as evidenced in the literature on patients, i.e., discourse and non-literal meaning, compacted in a single tool; (ii) careful selection of the materials, combining refined theoretical notions in pragmatics and discourse analysis as well as psycholinguistic variables, and respecting the ecological validity as much as possible; (iii) brevity and easiness of administration. We built the test in Italian, yet encouraging the development of versions in other languages, granted a careful adaptation especially of the non-literal uses, in the perspective of endorsing cross-national sharing of standardized tools and data pooling also for the important domain of social communication.

With respect to (i), our choice fell on discourse and on non-literal language, including figurative expressions (idioms, metaphors, proverbs) and humor, as these are well explored domains in studies on patients, known to be largely impaired in schizophrenia, traumatic brain injury, and neurodegenerative diseases such as fronto-temporal dementia and amyotrophic lateral sclerosis (Brüne and Bodenstein, 2005; Ash et al., 2014; Marini et al., 2014; Clark et al., 2015). Although pragmatic impairment might affect also other pragmatic dimensions, we believe that discourse and non-literal language might represent two appropriate test-grounds to detect a global deficit in social communication. APACS has the advantage of combining discourse and non-literal language in a single tool while preserving the brevity of the instrument, thus overcoming the traditional separation between tests assessing discourse and tests assessing figurative language<sup>1</sup>. Importantly, studies and meta-analyses in neuropragmatics showed that the comprehension of metaphor, humor, as well as discourse rely on a common extended language network (Ferstl, 2010), extending to Theory of Mind and executive functions hubs, with differences depending on the specific task. The rationale behind APACS acknowledges that pragmatics, while globally depending on context, is not monolithic and different pragmatic aspects might involve different cognitive skills. APACS might indeed be useful also to shed light into the cognitive substrates of pragmatic abilities, which might not completely overlap across tasks and might be differently compromised across pathologies (Champagne-Lavau et al., 2007).

<sup>1</sup>Note that, in including discourse and non-literal language understanding, APACS provides a view of pragmatic abilities which is in line with the recent classification of neurodevelopmental Semantic-Pragmatic Disorders in the DSM-5, specifically sharing information in criterion 1, story-telling and conversation in criterion 3 and non-literal meaning in criterion 4.

With respect to (ii), great attention was devoted to the construction of the materials. As a general trend, we tried to enhance the realistic nature of the stimuli, by using photographs instead of line drawings, and everyday language as in news articles. Theoretically, we took into account notions from linguistics and pragmatics (e.g., the distinctions among figurative language types such as idioms and metaphors, often blended together in previous tests). Psycholinguistic variables such as familiarity (for figurative expressions) and readability (for narrative texts) were also balanced. For figurative language in particular, research in psycholinguistics showed the importance of familiarity and previous exposure in shaping processing load and mechanisms (Cardillo et al., 2010). When possible, stimuli in APACS were extracted from norms or rating studies collected on the Italian populations, thus balancing the conventionality of the expressions. Other materials in APACS were built ex novo, paying attention to contextual appropriateness.
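Readability, one of the psycholinguistic variables mentioned above, is quantified later in this article with the Gulpease index (Lucisano and Piemontese, 1988). As a rough illustration only, the commonly cited form of that index is 89 + (300 × sentences − 10 × letters) / words. The Python sketch below applies this formula; the function name `gulpease` and the simple regular-expression tokenization are assumptions introduced here, not the procedure used by the authors, and dedicated tools may count words and sentences differently.

```python
import re


def gulpease(text):
    """Approximate Gulpease readability index for an Italian text.

    Assumed counting rules (illustrative, not taken from the article):
    - a word is a maximal run of word characters, so "l'uomo" counts as two words;
    - a letter is any alphabetic character;
    - a sentence ends at a run of '.', '!' or '?'; text with no such mark
      is treated as a single sentence.
    """
    words = re.findall(r"\w+", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(ch.isalpha() for ch in text)
    sentences = len(re.findall(r"[.!?]+", text)) or 1
    # Commonly cited Gulpease formula; the raw value is not clamped to 0-100 here.
    return 89 + (300 * sentences - 10 * letters) / len(words)


# Hypothetical two-sentence sample, used only to show the call.
sample = "Il gatto dorme. Il cane corre."
print(round(gulpease(sample), 1))
```

Higher values indicate easier text. Very short, simple samples such as the one above can exceed the nominal 0–100 range because the raw formula is left unclamped in this sketch.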

With respect to (iii), we employed widely used tasks such as sentence matching or semi-structured interview, so that no special training is required on the clinician's side, thus increasing the easiness of administration. Training requirement is minimal also for the scoring, which in APACS is done on-line based on clear instructions. Administration time averages 35–40 min, depending on the individual's characteristics. As an important caveat, APACS tasks focus on verbal pragmatic abilities as they are used in social communication, but do not directly manipulate contextual settings, nor do they involve role playing, since the use of these approaches is still controversial (Crockford and Lesser, 1994). In this respect, recent literature is orienting toward the use of functional communication scales as the best measure of communicative skills in social situations, and their impact on functioning (Long et al., 2008).

The final structure of the APACS test includes 6 tasks (Interview, Description, Narratives, Figurative Language 1, Humor, Figurative Language 2) and allows three composite scores to be derived (Pragmatic Production, Pragmatic Comprehension, APACS Total). It is advisable to administer the test in parallel with a neuropsychological assessment, evaluating especially executive functions and social cognition, to unravel different involvement across pragmatic tasks. The full name of the test ("Assessment of Pragmatic Abilities and Cognitive Substrates") captures this perspective. The use of tests assessing formal aspects of language is also advisable, to dissociate aphasic from "apragmatic" profiles. In what follows we first present the structure of the APACS test, and then we describe the psychometric properties and provide normative data from an Italian population sample.

## METHODS

## Stimuli and Structure of the APACS Test

The APACS test focuses on the assessment of two main pragmatic domains, namely discourse and non-literal language. The test is divided into two main sections, one devoted to assessing production and the other devoted to assessing comprehension, for a total of 6 tasks. Three composite scores are derived from the tasks. Below we provide a short description of the six tasks and the three composite scores. **Figure 1** summarizes the structure of the test and the derived scores. Examples of items are provided in Supplemental Data Sheet 1: APACS-Item Examples. Further information on APACS can be obtained from the authors.

## Interview

This task (duration: approximately 5 min) aims at assessing the ability of engaging in conversation through a semi-structured interview, organized around four autobiographical topics: family, home, work, organization of the day, known to be suitable topics to enhance speech in patients (Borovsky et al., 2007). The discourse produced by the subject is assessed according to a checklist including the main parameters of discourse analysis, based on previous approaches to pathological speech (Prutting and Kirchner, 1987; Marini et al., 2011). Several dimensions of discourse are rated on line for the presence of communication difficulties at the contextual-pragmatic level, namely speech (e.g., repetition, incomplete utterances, echolalia), informativeness (over- or under-informativeness, loss of verbal initiative) and information flow (missing referents, wrong order of the discourse elements, abrupt topic shift). Although the focus of the assessment is on verbal pragmatics, the paralinguistic dimension of discourse is included in the rating (e.g., altered intonation, loss of eye-contact, fixed facial expression, abuse of gesture). Errors in grammar and vocabulary are also annotated, based on classic aphasic symptoms such as anomia and paraphasia (Semenza, 2002), as they impact on the communicative effectiveness of the discourse. The frequency of each type of communication difficulty is annotated (always/sometimes/never) and then converted into scores (0/1/2). Maximal score: 44.
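The frequency-to-score conversion described for the Interview task can be illustrated with a short Python sketch. It is only a hedged illustration: the function name `interview_score` and the checklist labels in the example are hypothetical, and the full checklist is not reproduced here. The only facts taken from the text are the always/sometimes/never to 0/1/2 mapping and the maximal score of 44, which would correspond to 22 rated difficulty types if every type is scored 0 to 2 (an inference, not a figure stated by the authors).

```python
# Mapping stated in the text: always/sometimes/never -> 0/1/2.
FREQUENCY_SCORES = {"always": 0, "sometimes": 1, "never": 2}


def interview_score(ratings):
    """Sum the converted scores over all annotated communication difficulties.

    `ratings` maps a difficulty label (hypothetical names) to the
    frequency annotated by the examiner.
    """
    total = 0
    for difficulty, frequency in ratings.items():
        if frequency not in FREQUENCY_SCORES:
            raise ValueError(f"unknown frequency {frequency!r} for {difficulty!r}")
        total += FREQUENCY_SCORES[frequency]
    return total


# Illustrative subset of a checklist; the labels are examples, not the real item list.
example_ratings = {
    "repetition": "never",
    "abrupt topic shift": "sometimes",
    "missing referents": "always",
}
print(interview_score(example_ratings))  # 2 + 1 + 0 = 3
```

Under this convention a higher total means fewer observed communication difficulties, which matches the direction implied by assigning 2 to "never".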

## Description

This task (duration: approximately 5 min) aims at assessing the ability to produce informative descriptions and to share information about everyday life situations. Compared to the Interview task, here expressive abilities are measured through a more structured task, similar to a traditional picture description task, but with higher ecological validity. Ten photographs that depict scenes of everyday life (e.g., a woman waiting at the bus station, a man buying a newspaper in a shop) are presented one by one. The subject is asked to describe the photograph in relation to the main elements that characterize the scene (the location, i.e., the so-called "scene setting topic," the agent(s) and the action performed by the agent(s)). For each salient element in each picture, a score is assigned differentiating missed identification, partially correct identification, and correct identification (0/1/2). Maximal score: 48.

## Narratives

This task (duration: approximately 10 min) aims at assessing the ability to comprehend discourse and the main aspects of a narrative text. Six stories were built, inspired by real news articles, with increasing length (number of sentences ranging from 4 to 8), and complexity set on a medium difficulty level for subjects with 8 years of schooling, scoring on average 58.5 on the Gulpease readability index (range 0–100) (Lucisano and Piemontese, 1988). Each story includes two non-literal expressions. Stories are read to the subject at normal rate. Following each story, several question items are administered:




### Figurative Language 1

This task (duration: approximately 8 min) aims at assessing the ability to infer non-literal meaning through multiple choice questions, similarly to existing tests (Rinaldi et al., 2004). Fifteen sentences are presented, selected from available databases, with different degrees of lexicalization, including: five highly familiar idioms, average familiarity 6.36 on a 7-point scale, based on existing norms (Tabossi et al., 2011); five novel metaphors, average familiarity 3.78 on a 5-point scale, based on existing ratings (Bambini et al., 2013); five common proverbs extracted from a dictionary of Italian proverbs (Guazzotti and Oddera, 2006). All sentences are provided with a minimal context. For each sentence, three possible interpretations are presented and the subject is asked to choose the one that correctly expresses the figurative meaning. Options include one correct figurative interpretation and two incorrect interpretations, one literal and one unrelated with respect to the target word. Each item is scored either 1 or 0 according to accuracy. Maximal score: 15.

### Humor

This task (duration: approximately 5 min) aims at assessing the ability to comprehend verbal humor through multiple choice questions, inspired by the Joke and Story Completion Test (Brownell et al., 1983). The materials consist of seven items, each presenting a brief story. For each story, three possible endings are provided, including: a correct funny ending; an incorrect straightforward non-funny ending; an incorrect unrelated non-sequitur ending. Correct funny endings either play with literal and polysemous meanings, or require deriving non-explicit, unexpected scenarios (Yus, 2008). The subject is asked to select the ending that best functions as the punchline of the story. Each item is scored either 1 or 0 according to accuracy. Maximal score: 7.

### Figurative Language 2

This task (duration: approximately 7 min) aims at assessing the ability to infer non-literal meanings through verbal explanation, similar to previous tests (Papagno et al., 1995; Amanzio et al., 2008). The materials were selected as for the Figurative Language 1 task and consist of 15 sentences, including five highly familiar idioms (average familiarity 6.52), five novel metaphors (average familiarity 3.88), and five common proverbs listed in the dictionary. The subject is asked to explain the meaning of each expression. Responses score 2 when the subject provides a good description of the actual meaning of the figurative expression, 1 when the subject provides an incomplete explanation, such as concrete examples, but fails to provide a general meaning, and 0 when the subject paraphrases the figurative expression, provides a literal explanation, or ignores the expression. Maximal score: 30.

## Composite Scores

Three composite pragmatic scores are computed from the task scores. The Pragmatic Production composite score is calculated from the Interview and Description tasks, whereas the Pragmatic Comprehension composite score is calculated from the Narratives, Figurative Language 1, Humor and Figurative Language 2 tasks. Each composite score is obtained by transforming the original task scores into proportions and averaging these proportions. Hence, each task contributes with equal weight to its composite score, which ranges from 0 to 1. Furthermore, the Total APACS score is derived as the mean of the Pragmatic Production and the Pragmatic Comprehension scores. The APACS composite scores allow a coarse categorization of individuals' pragmatic performance and can be used to classify patients according to general notions of pragmatic abilities or to easily describe the overall status of pragmatic impairment for clinical purposes.
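The computation of the composite scores can be sketched as follows. This is an illustrative reading of the description above, not the authors' scoring script. The maximal scores are those given in the task descriptions, except for Narratives, whose maximum is not stated in this section: the value used below is a hypothetical placeholder and must be replaced with the real one. Function and variable names are likewise assumptions.

```python
PRODUCTION = ("Interview", "Description")
COMPREHENSION = ("Narratives", "Figurative Language 1", "Humor",
                 "Figurative Language 2")

# Maximal scores from the task descriptions. The Narratives value is a
# HYPOTHETICAL placeholder: its maximum is not given in this section.
MAX_SCORES = {
    "Interview": 44,
    "Description": 48,
    "Narratives": 56,  # placeholder, replace with the actual maximum
    "Figurative Language 1": 15,
    "Humor": 7,
    "Figurative Language 2": 30,
}


def composite(scores, tasks, max_scores=MAX_SCORES):
    """Average of the task scores expressed as proportions of their maxima."""
    proportions = [scores[task] / max_scores[task] for task in tasks]
    return sum(proportions) / len(proportions)


def apacs_scores(scores, max_scores=MAX_SCORES):
    """Return the three composite scores, each ranging from 0 to 1."""
    production = composite(scores, PRODUCTION, max_scores)
    comprehension = composite(scores, COMPREHENSION, max_scores)
    total = (production + comprehension) / 2
    return {"production": production,
            "comprehension": comprehension,
            "total": total}


# Hypothetical participant at ceiling on every task except Interview.
example = dict(MAX_SCORES, Interview=40)
print(apacs_scores(example))
```

Because proportions are averaged within each composite before the Total is taken, a task with few items (such as Humor) weighs as much as a task with many items within its composite.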

## Participants

Normative data for APACS were collected from 119 healthy participants. The sample selection was stratified by age and years of education to reflect as much as possible the demographic characteristics of the Italian population. Mean age was 50.03 years (SD = 16.79, range = 19–89) and mean education was 13.49 years (SD = 4.54, range = 5–23). Sixty-five participants were female and 54 were male. Among the participants, 114 were right-handed and 5 were left-handed. Details on the distribution of participants' demographic variables are reported in **Table 1**. All participants were native speakers of Italian, autonomous in their daily living and had no relevant pathologies that could affect cognitive performance. Moreover, no participant reported any developmental learning disorder. All participants took part in the study on a voluntary basis and gave their informed consent according to the Helsinki Declaration.

## Procedure

The APACS test was administered to each participant in a single session of approximately 35–40 min. Since the APACS test is meant for use on clinical populations, the tasks were presented in a fixed order, as is standard in clinical practice. The order was fixed starting with Interview, as the most natural task in the test situation, and then alternating tasks of different processing load, as follows: Interview, Description, Narratives, Figurative Language 1, Humor, Figurative Language 2. Data collection was performed by trained psychologists or linguists. All statistical analyses were performed by means of the free statistical software R (R Core Team, 2015).

## RESULTS

Raw results on APACS for the 119 controls are reported in **Table 2**. To facilitate the inspection of age and education stratification on APACS scores, results were divided into two age bins (age < 55 years and age ≥ 55 years) and two education bins (education ≤ 13 and education > 13). Results show that healthy controls have very high scores in all age and education bins (see Supplementary Tables 1 in Supplemental Data Sheet 2: APACS-Data Tables and Cut-offs). This makes APACS particularly suited to detecting impairments rather than measuring proficiency in healthy individuals.

## Internal Consistency

The internal consistency of APACS was calculated by means of Cronbach's alpha on all items in each APACS task on the whole sample of 119 participants<sup>2</sup>. In particular, we adopted the standardized alpha, which is based on the correlations among items. Results indicate that all APACS tasks have acceptable internal consistency, with alpha values ranging from 0.60 to 0.70. Specifically, the following values were obtained: 0.63 for Interview; 0.65 for Description; 0.66 for Narratives; 0.60 for Figurative Language 1; 0.63 for Humor; 0.70 for Figurative Language 2.
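For readers unfamiliar with the standardized form of Cronbach's alpha, the following sketch shows the usual textbook formula, alpha = k·r̄ / (1 + (k − 1)·r̄), where k is the number of items and r̄ the mean of the pairwise inter-item Pearson correlations. The analyses in the paper were run in R; this Python version is only an illustration, and the function names and toy data are assumptions. Note that an item with zero variance (for example, an item at ceiling for every participant) has no defined correlation and would make the computation fail.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # fails if an item has zero variance


def standardized_alpha(items):
    """Standardized alpha from item columns (one list of scores per item)."""
    k = len(items)
    correlations = [pearson(a, b) for a, b in combinations(items, 2)]
    mean_r = sum(correlations) / len(correlations)
    return k * mean_r / (1 + (k - 1) * mean_r)


# Toy data: three hypothetical items scored 0/1/2 by five participants.
toy_items = [
    [2, 1, 2, 0, 1],
    [2, 1, 1, 0, 2],
    [1, 1, 2, 0, 2],
]
print(round(standardized_alpha(toy_items), 3))
```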

<sup>2</sup> In calculating the Cronbach's alpha of Figurative Language 1 task, we removed two items almost at ceiling, i.e., Item 1 and Item 2. Nevertheless, we decided to keep these items in the final version of APACS, because they are not associated to a ceiling performance in patients and thus might be useful to detect impairment (Bosia et al., 2015). Alpha including the two items was 0.54.


### TABLE 1 | Distribution of Age, Education, and Gender for the 119 healthy participants of APACS normative data.

*The values in each cell indicate the number of males/females.*

### TABLE 2 | Descriptive statistics of APACS results.


*The table reports the means, standard deviations, median, minimum, maximum, kurtosis, skewness, first quartile and third quartile for APACS scores in the normative data group of 119 participants.*

## Test-Retest Reliability and Practice Effect

The Test-Retest reliability of APACS was assessed in a subset of 19 participants (mean age = 42.00, SD = 14.85; mean education = 16.89, SD = 4.12) tested at two separate times with a 2-week interval, by the same examiner. A short Test-Retest interval was chosen in order to maximize the chance of detecting undesired practice effects. Results indicate that Test-Retest reliability, calculated by means of Pearson correlations, is good to excellent for all APACS tasks except for Narratives, which showed a remarkably low value (i.e., 0.19, see **Table 3**). The most likely reason for this low value is the near-ceiling performance of the participants who underwent the Test-Retest, combined with the practice effect (see below). Low Test-Retest reliability in the normative samples of neuropsychological tests is not surprising (see for example Spinnler and Tognoni, 1987), especially when a ceiling effect is observed<sup>3</sup>.

The presence of practice effects in the APACS tasks and composite scores was evaluated by means of a series of paired t-tests comparing the scores at the two measurements. A significant practice effect was found only in Narratives, where participants scored slightly better in the second measurement than in the first. All other tasks and composite scores showed no trend of practice effect (see **Table 3**).
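The quantities summarized for each task in Table 3 (Test-Retest correlation, mean practice effect, paired t statistic) can be obtained as in the sketch below. It uses the standard formulas for the Pearson correlation and the paired t-test; the function name and the toy scores are hypothetical, and the p-value is omitted because the t distribution is not available in the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev


def retest_summary(first, second):
    """Test-retest correlation, mean practice effect and paired t statistic.

    `first` and `second` hold the scores of the same participants, in the
    same order, at the first and second measurement.
    """
    n = len(first)
    diffs = [b - a for a, b in zip(first, second)]
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second)) / (n - 1)
    r = cov / (stdev(first) * stdev(second))
    practice_effect = mean(diffs)  # positive = better at retest
    t = practice_effect / (stdev(diffs) / sqrt(n))  # undefined if all diffs are equal
    return {"r": r, "practice_effect": practice_effect, "t": t, "df": n - 1}


# Toy scores for four hypothetical participants.
summary = retest_summary([10, 12, 14, 16], [11, 13, 15, 18])
print(summary)
```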

Furthermore, to allow the use of APACS for detecting changes over time (for example after a treatment), we employed a statistical method that, given two scores from the same individual, determines whether a significant change occurred. Among the many possibilities to define a significant change (Jacobson and Truax, 1991; Collie et al., 2002), we used a regression-based approach (Crawford and Garthwaite, 2006). According to this method, a score in the second measurement is predicted from the score observed in the first measurement. If the score observed at the second measurement is far from the predicted value, then a significant change is inferred. The main advantage of using a regression method is that it takes into account test-retest reliability and factors out both the practice effect and the "regression to the mean" bias (Crawford and Howell, 1998a). Specifically, the method from Crawford and Garthwaite (2006), unlike several other methods, takes into account the fact that the data used to build the regression models derive from a sample drawn from a wider population. For this reason, results derived through regression-based methods are very robust, and methodologically they are the gold standard for identifying significant changes. Thresholds for significant changes are provided in the Supplementary Tables 2 in Supplemental Data Sheet 2: APACS-Data Tables and Cut-offs.
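The logic of a regression-based change score can be sketched as follows for the single-predictor case. The sketch fits an ordinary least-squares line of retest scores on test scores in the control sample, then compares a new individual's observed retest score with the predicted one, using the standard error of prediction for a new case (which grows as the individual's first score moves away from the sample mean). This is meant as an illustration in the spirit of Crawford and Garthwaite (2006), not as a reproduction of their published procedure or of the thresholds in the Supplementary Tables; names and toy data are assumptions, and the resulting t value has to be compared with a t distribution with n − 2 degrees of freedom using external tables or software.

```python
from math import sqrt


def change_t(control_first, control_second, x0, y0):
    """t statistic for the discrepancy between observed and predicted retest.

    control_first / control_second: control scores at test and retest.
    x0, y0: first and second score of the individual being evaluated.
    Returns (t, degrees of freedom, predicted retest score).
    """
    n = len(control_first)
    mean_x = sum(control_first) / n
    mean_y = sum(control_second) / n
    sxx = sum((x - mean_x) ** 2 for x in control_first)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(control_first, control_second))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(control_first, control_second))
    s_residual = sqrt(rss / (n - 2))
    # Standard error for predicting a NEW individual's score at x0.
    s_new = s_residual * sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    predicted = intercept + slope * x0
    return (y0 - predicted) / s_new, n - 2, predicted


# Toy control data (hypothetical) and one individual scoring 3 then 6.
t_value, df, predicted = change_t([1, 2, 3, 4, 5], [2, 3, 5, 4, 6], 3, 6)
print(t_value, df, predicted)
```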

## Factorial Structure and Construct Validity

The factorial structure of APACS was inspected to study the relationship between APACS task scores. APACS includes different pragmatic domains possibly associated with different cognitive substrates. For this reason, we did not expect that a single factor could explain the variability observed in APACS tasks. Rather, we expected a factorial structure where several domains correlate with the task scores, possibly in relation to the involvement of different cognitive functions.

<sup>3</sup>Notably, in a joint analysis on unpublished data that included both patients with schizophrenia and healthy controls tested with APACS, the Narratives task shows a satisfying value of Test-Retest reliability of 0.76.

### TABLE 3 | Test-Retest reliability and practice effect of APACS.

*Test-Retest analyses were conducted on a subsample of 19 participants. The following information is provided: task or score name (first column); Test-Retest reliability measured by means of Pearson correlations (second column); practice effect, calculated as the mean difference between the measurements at test and retest (third column); results of the paired t-tests comparing the scores at Test and Retest, with stars (\*) denoting a significant difference and a potentially harmful practice effect (fourth column).*

### TABLE 4 | Correlations between APACS task scores.

*The table reports the Pearson correlations between APACS task scores, performed on the sample of 119 participants. Stars (\*) denote significant correlations.*

### TABLE 5 | Results of factor analysis on APACS tasks.

*The table reports the factor loadings for the APACS tasks, after a factor analysis with varimax rotation. The Description task was not included in the factor analysis due to a marked ceiling effect.*

We performed an exploratory factor analysis (using a solution with varimax rotation) on all APACS tasks excluding Description. This task was excluded because of its near-ceiling distribution of scores, which made it unsuitable for factor analysis. A two-factor solution provided a satisfactory fit of the data [χ<sup>2</sup>(1) = 0.33, p = 0.57]. The correlations between the APACS tasks are reported in **Table 4**, and the results of the factor analysis are reported in **Table 5**.

### TABLE 6 | Content validity of APACS.


*The table shows the content validity of each task and composite score, operationalized as the appropriateness of each item (or score) in assessing the construct measured by the task, evaluated on a 5-point Likert scale. The cells report average values (standard deviations enclosed in round brackets).*

The inspection of loadings reveals that the first factor is presumably associated with the comprehension of figurative meanings, being mostly correlated to Figurative Language 1, Figurative Language 2, and Narratives (which includes questions on figurative language). For the second factor, the highest loadings are in Humor and Narratives. Overall, the results from this factor analysis may be taken as evidence that supports construct validity of APACS, as a test able to capture different aspects of the pragmatic competence, possibly related to different cognitive substrates.


*The following information is provided in the table: task or score name (first column); name of the term in the regression model (second column); coefficient estimate and standard error within round brackets (third column); t-value associated with the term (fourth column); p-value with stars "\*" denoting significant terms (fifth column); adjusted R*<sup>2</sup> *(sixth column).*

## Content Validity

Content validity refers to the extent to which the items in a test are appropriate to measure the construct that the test intends to measure. To assess content validity we followed the method adopted in Sacco et al. (2008), by asking five experts in linguistics (4 Linguists and 1 Psycholinguist) to rate on a 5-point Likert scale how well each task or score of the APACS test measures the construct it intends to measure. A set of statements was presented to the raters, one for each item or composite score of APACS. For example, for Figurative Language 1, the statement associated with each item was "This item evaluates the ability to understand figurative language." A score of 1 on the Likert scale indicated "I completely disagree with the statement," whereas a score of 5 indicated "I completely agree with the statement." The intermediate value of 3 indicated "I neither agree nor disagree with this statement." Responses for all items were collapsed within and across judges, to obtain a mean value and a standard deviation for each task. A series of questions on the quality of APACS composite scores (Pragmatic Comprehension, Pragmatic Production, and APACS Total) was also added. The overall mean responses (reported in **Table 6**) are very high (all above 4.5), indicating that all experts judged that the items of each task and the composite scores were appropriate.

## Effect of Demographic Variables on APACS Tasks and Composite Scores

In order to better characterize the effect of age, gender and education on APACS, we performed a series of multiple regressions with each APACS task and composite score as dependent variable. Age and education were included in the regression models as continuous predictors, whereas gender was included as a factor with two levels (male, female).

***Figure 2** shows the partial effects of age and education on APACS tasks, as estimated by regression analysis. The figure is an array displaying the APACS tasks (first column) and the effect of age (second column) and education (third column). A slash ("/") indicates that the effect was not significant in the regression analysis. The black line in each plot represents the predicted score at the APACS task. The colored bands around the line represent point-wise confidence bands around the prediction. Light blue is used for the tasks that compose the Pragmatic Production score. Light orange is used for the tasks that compose the Pragmatic Comprehension score.*

*In **Figure 3**, the black line in each plot represents the predicted score at the APACS composite score. The colored bands around the line represent point-wise confidence bands around the prediction. Light blue is used for the Pragmatic Production score. Light orange is used for the Pragmatic Comprehension score. Gray is used for the APACS Total score.*

For each regression, we used the following modeling strategy: starting from an initial model including the three predictors (age, education, and gender), we used a backward elimination of terms, with a method based on the Akaike Information Criterion, using the step function of R (R Core Team, 2015). After this first term selection, we further removed the terms whose coefficients were not statistically significant. After this procedure of variable selection, the final model for each dependent variable included only significant predictors. We graphically inspected the partial residuals of each variable in each model to investigate whether relaxing the assumption of linearity could improve the fit. For all the variables that showed a nonlinear trend, we tested whether adding quadratic terms yielded better models. According to the standard regression procedure, if a quadratic term was significant, we also kept the linear term in the model, regardless of its significance.
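The first step of this strategy, backward elimination guided by the Akaike Information Criterion, can be illustrated with the following self-contained sketch. It is not the R `step` function and does not reproduce the authors' analysis: it fits ordinary least-squares models via the normal equations, scores them with AIC computed up to an additive constant as n·ln(RSS/n) + 2·(number of coefficients), and repeatedly drops the term whose removal lowers AIC the most. The second step (removing non-significant terms) and the quadratic terms are not covered. All function names and the synthetic data are assumptions made for illustration.

```python
from math import log, sin


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    No check for singular systems is made (sketch only).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def aic(data, y, terms):
    """AIC (up to a constant) of an OLS fit of y on an intercept plus terms."""
    n = len(y)
    cols = [[1.0] * n] + [data[t] for t in terms]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * c[i] for b, c in zip(beta, cols))) ** 2
              for i, yi in enumerate(y))
    return n * log(rss / n) + 2 * len(cols)


def backward_eliminate(data, y, terms):
    """Drop terms one at a time for as long as doing so lowers AIC."""
    terms = list(terms)
    current = aic(data, y, terms)
    while terms:
        candidates = [(aic(data, y, [t for t in terms if t != drop]), drop)
                      for drop in terms]
        best_aic, drop = min(candidates)
        if best_aic >= current:
            break
        terms.remove(drop)
        current = best_aic
    return terms


# Synthetic, purely illustrative data: the score depends on age only.
n = 40
data = {
    "age": [20.0 + i for i in range(n)],
    "education": [8.0 + (7 * i) % 13 for i in range(n)],
    "gender": [float(i % 2) for i in range(n)],  # dummy coding 0/1
}
y = [50.0 - 0.3 * a + 0.5 * sin(1.7 * i) for i, a in enumerate(data["age"])]
kept = backward_eliminate(data, y, ["age", "education", "gender"])
print(kept)
```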

The models resulting from this procedure are reported in **Table 7** and graphically represented in **Figure 2** (for the APACS tasks) and **Figure 3** (for the APACS composite scores). Results show a consistent pattern of age and education effects across APACS tasks and scores, with some differences. Age and education showed some general effects, whereas gender was never a significant predictor. In Interview, the effect of age indicates that performance slightly decreases as age increases. In Description, no variable was significant. This means that the performance on this task is consistent across all the healthy participants, regardless of age, education, and gender. In Narratives, a significant linear effect and quadratic effect of education were observed. These results indicate that performance on Narratives increases as education increases, reaching a maximum at 16 years of education and then becoming stable. Performance in Figurative Language 1 was linearly related to both age and education, with a negative effect of age and a positive effect of education. In Humor, both age and education showed a non-linear (i.e., quadratic) relation. The age effect on Humor is slightly positive from 20 to 40 years and then negative from 40 to 89 years. The education effect on Humor is positive but, similarly to Narratives, reaches a plateau and becomes stable around 16 years. For Figurative Language 2, age had a negative linear effect, while education had a positive linear effect (similarly to the Figurative Language 1 task). For the Pragmatic Production composite score only a negative effect of age was found, reflecting the effect of the Interview task on the composite score. For the Pragmatic Comprehension and APACS Total scores, quadratic effects of both age and education were found. For these two scores, age had almost no influence from 19 to 40 years, but then it showed a negative effect. Education had a positive effect, reaching a maximum around 16 years.
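The location of a maximum such as the one reported for education can be read off a quadratic model directly. As a sketch, consider a model with education $E$ as the only predictor; the symbols $\beta_0$, $\beta_1$, $\beta_2$ are introduced here for illustration only, and the estimated coefficients are those of Table 7:

$$
\hat{y}(E) = \beta_0 + \beta_1 E + \beta_2 E^2, \qquad
\frac{d\hat{y}}{dE} = \beta_1 + 2\beta_2 E = 0
\;\Longrightarrow\; E^{*} = -\frac{\beta_1}{2\beta_2}.
$$

With $\beta_1 > 0$ and $\beta_2 < 0$ the vertex $E^{*}$ is a maximum, and a maximum near 16 years of education corresponds to $\beta_1 \approx -32\,\beta_2$. Strictly speaking, a quadratic curve bends downward beyond its vertex, so the stability described after the maximum is presumably an approximation that holds over the observed range of education.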

## Cut-offs

Cut-offs were calculated for each APACS task and for the three composite scores. Rather than stratifying arbitrarily for age, education, and gender, we used a regression approach to build demographically corrected norms, by means of the method proposed by Crawford and Garthwaite (2006). This method relies on the same mathematical formulas already used to identify thresholds for significant changes. Here the score of a participant is predicted from the demographic variables (i.e., age and education) of that participant, using the regression models reported in **Table 7**. A crucial issue when using regression-based norms is that the estimates for extreme values of the predictors (in this case age and education) could be biased as a consequence of the uncertainty in the regression model estimates. An important feature of the method by Crawford and Garthwaite is that it takes this problem into account and is also specifically designed to compare a single case with a control group<sup>4</sup>. Cut-offs are reported in the Supplementary Tables 3 in Supplemental Data Sheet 2: APACS-Data Tables and Cut-offs.
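When no demographic predictor is retained, as for the Description task, the comparison of a single case with the control sample reduces to the modified t-test of Crawford and Howell, t = (x − M) / (SD · √((n + 1)/n)) with n − 1 degrees of freedom. The sketch below shows this formula and the cut-off it implies. The control mean and standard deviation are hypothetical numbers, and the critical value (about −1.658 for a one-tailed 5% criterion with 118 degrees of freedom) is an approximate table value supplied by hand; the criterion actually used for the published cut-offs is the one documented in the Supplementary Tables.

```python
from math import sqrt


def crawford_howell_t(score, control_mean, control_sd, n):
    """Modified t for one case against a control sample of size n (df = n - 1)."""
    return (score - control_mean) / (control_sd * sqrt((n + 1) / n))


def cutoff(control_mean, control_sd, n, t_critical):
    """Score at which the modified t equals the chosen critical value."""
    return control_mean + t_critical * control_sd * sqrt((n + 1) / n)


# HYPOTHETICAL control statistics for a task with n = 119 controls.
N_CONTROLS = 119
MEAN, SD = 46.0, 2.0
T_CRITICAL = -1.658  # approximate one-tailed 5% value for df = 118 (assumption)

threshold = cutoff(MEAN, SD, N_CONTROLS, T_CRITICAL)
print(threshold)
print(crawford_howell_t(40.0, MEAN, SD, N_CONTROLS) < T_CRITICAL)
```

The factor √((n + 1)/n) widens the interval slightly compared with a plain z-score, reflecting that the control mean and standard deviation are themselves sample estimates.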

## DISCUSSION

This study presents the psychometric properties and normative data of the APACS test, a new tool to evaluate pragmatic competence taking into account discourse and non-literal language through a set of 6 tasks.

APACS shows satisfactory reliability, with acceptable internal consistency for all tasks (all Cronbach's alphas ≥ 0.60) and good test-retest reliability for almost all tasks and composite scores. A low test-retest reliability was found only for the Narratives task (r = 0.19), probably due to a combination of ceiling and practice effects in the test-retest sample. A factor analysis on APACS scores showed a meaningful pattern of results, with two factors accounting for task variance. One factor presumably reflects the ability to interpret figurative meanings such as idioms, metaphors, and proverbs, whereas the other factor seems related especially to pragmatic processes in detecting humor. The results of the factor analysis lend support to the construct validity of APACS, as composed of tasks tapping different facets of pragmatic competence. We further inspected the validity of APACS by focusing on the content validity as rated by five judges. Overall, the judges gave excellent ratings to APACS items and scores, supporting the content validity of the test. When compared to other tests for pragmatic abilities, APACS has analogous values of internal consistency and very good content validity (Sacco et al., 2008). In addition, APACS is one of the few tests for which test-retest reliability is also available, which further supports the precision of the assessment instrument.

Construct validity results are especially interesting and deserve further discussion. The factorial structure of APACS evidenced two factors, one loading especially on figurative language and the other on humor. As a first consideration, this seems to confirm the view that pragmatics is not a monolithic component, and that the different pragmatic processes involved (i.e., the inferential load) might vary across tasks. Moreover, this two-factor structure is a good starting point for discussing the role of the underlying cognitive substrates of pragmatics. There is compelling evidence on the important role of Theory of Mind and social cognition in general in inferring the speaker's intended meaning in humor and related phenomena (e.g., sarcasm and irony) (Vrticka et al., 2013). Other literature points to the role of executive functions (like working memory and set-shifting) in humor comprehension (Bozikas et al., 2007). Hence, the second factor might be especially linked to Theory of Mind and to a lesser degree to executive functions. Note that the second factor also loads on Narratives, which is another domain in which Theory of Mind might be of some importance, especially in monitoring the protagonists' perspective (Mason and Just, 2009). The first factor, on the other hand, might be especially linked to executive functions, e.g., inhibition of inappropriate literal interpretation (Papagno and Romero Lauro, 2010) and to a lesser degree to Theory of Mind. Indeed, one might argue that only a basic ability to represent mental states is necessary for understanding metaphors (Langdon et al., 2002). We want to emphasize that this is only one of the possible interpretations of our factors in terms of cognitive substrates and that independent empirical research is needed to support this interpretation. This independent empirical research should focus not only on the normal population, but also on pathological groups.
Due to the decline in patients' cognitive and social abilities, a different factorial structure might emerge when studying APACS in clinical populations. This attempt to define the cognitive substrates of pragmatics is a topic of major interest, with important theoretical consequences, since some theorists describe pragmatic interpretation as essentially an exercise in mind-reading, involving inferential attribution of intentions, and argue that pragmatics is a submodule of Theory of Mind evolved for communication (Sperber and Wilson, 2002). Conversely, others argue that pragmatics is best described as a complex domain interfacing with different cognitive systems (Stemmer, 2008). Interestingly, neuroimaging evidence showed that pragmatics and Theory of Mind share important networks of activations, specifically at the level of the temporo-parietal connections (Catani and Bambini, 2014; Hagoort and Levinson, 2014). As already said, our normative data do not offer the possibility to speculate further, but they do point to the potential of APACS to shed light on the issue of the cognitive substrates of pragmatics.

<sup>4</sup>For the Description task, since no predictor was significant in the regression analysis, we used the formula by Crawford and Howell (1998b). This formula allows the calculation of cut-offs analogous to those obtained with the regression method by Crawford and Garthwaite (2006).

Besides the factor analysis reported here, further corroboration for the construct validity of APACS comes from an exploratory study that compared 39 patients with schizophrenia and 32 healthy controls on the APACS test (Bosia et al., 2015). In this study, patients showed an impaired performance in all APACS tasks, falling below the 5th percentile of data from the control group. The highest effect sizes of the impairment were observed in Interview, Narratives and Figurative Language 2 tasks. These findings show that APACS is a useful tool to detect the well-known pragmatic deficit in schizophrenia.

The effect of demographic variables was investigated in APACS by means of regressions, which showed a consistent pattern across tasks. Age and education influenced almost all APACS tasks and composite scores, with a negative effect of age and a positive effect of education. These results are consistent with what is commonly observed in many neuropsychological tests (Strauss et al., 2006). Moreover, these results match experimental research on the effects of age on specific pragmatic abilities, where aging is shown to affect the comprehension of jokes (Mak and Carpenter, 2007), written text (Borella et al., 2011) and the neural response to metaphor (Bonnaud et al., 2002; Mejía-Constaín et al., 2010). Studies on aging and pragmatics also pointed out that the decline in pragmatic performance in the aged population is probably related to a combination of other cognitive abilities (Mak and Carpenter, 2007), and it is possibly reduced once the working memory load is factored out (Borella et al., 2007). These results further highlight the importance of exploring the cognitive substrates of pragmatics, complementing the assessment of pragmatic abilities with neuropsychological tests targeting executive functions and social cognition. Interestingly, studies showed that the ability to comprehend figurative uses of language improves during adolescence, reaching a plateau in adulthood (Nippold et al., 1997), which remains relatively stable in elderly subjects with a high education level (Bonnaud et al., 2002). In APACS we found an interplay between age and education that could be consistent with these findings.

Finally, we reported cut-offs for clinical purposes, calculated by using state-of-the-art techniques based on regression analysis (Crawford and Howell, 1998b; Crawford and Garthwaite, 2006). Importantly, and innovatively with respect to previous tests, we also provided thresholds to detect significant changes, which make it possible to determine whether a single patient has improved or worsened between two repeated measurements. Thresholds for significant change can be used to test whether a patient changes after a treatment or after a neurosurgical intervention, or to test whether the patient shows a decline in pragmatic abilities over time.

Overall, this study shows that APACS is a valuable tool to detect impairments in verbal pragmatic abilities, which could be employed for research as well as for clinical purposes. In this respect, the total duration of the test (around 35–40 min) and the use of traditional tasks and scoring systems not requiring effortful training on the clinician's side should add to the feasibility of APACS in clinical settings. In terms of clinical utility, the importance of a test assessing pragmatic abilities like APACS comes from two main considerations. First, a large body of research reports communicative breakdowns in specific pragmatic tasks across several clinical populations, from schizophrenia to traumatic brain injuries, where deficits are documented for instance in metaphor comprehension or discourse and conversation (Martin and McDonald, 2003; Brüne and Bodenstein, 2005). The number of clinical populations that exhibit pragmatic impairments has recently been expanded with data from neurodegenerative diseases, including frontotemporal dementia and amyotrophic lateral sclerosis (Orange and Hillis, 2012; Ash et al., 2014). APACS is suitable for use in both psychiatric and neurological patients, including patients with dysarthria and other production difficulties, as it contains tasks that do not require production and separate cut-offs are provided for each task. Second, pragmatics is intimately related to communication, and it lies at the heart of our social life, with high impact on the individual's life and on society at large. A compact test like APACS could contribute to providing a complete picture of pragmatic competence in the different clinical populations, targeting a vital domain in the patient's social life, and ultimately leading to a more precise characterization of the different clinical profiles.

An important aspect deserving consideration for future uses of APACS is related to the description of the cognitive substrates of pragmatic abilities. Factor analysis offered hints in this direction, with Figurative Language tasks and Humor clustering separately, possibly in relation to different cognitive substrates. Coupling APACS with neuropsychological tests could contribute to clarifying how cognitive functions are involved in pragmatics. Although clearly unified by their close relation to the communicative context, the pragmatic tasks included in APACS might differ from each other and might tax cognitive abilities differently. Research on patients might shed light on the inventory of pragmatic phenomena by highlighting specific interplays of communicative performance and neurocognitive deficits.

To conclude, with APACS we aim to provide a tool that could promote the inclusion of pragmatics in clinical assessment practice, as a relevant dimension in defining the patient's cognitive profile, as well as research on the neurocognitive underpinnings of the typically human ability to adjust communicative behavior to context.

## AUTHOR CONTRIBUTIONS

The authors designed the APACS test and ran the study together. VB is especially responsible for the pragmatic aspects of the test and GA for the statistical analyses.

## ACKNOWLEDGMENTS

VB is partially supported by the Italian PRIN project "I meccanismi neurocognitivi alla base delle interazioni sociali" (MIUR 2010YJ2NYW\_001). This work was also partially supported by Regione Toscana under the framework of the project "Assessing Pragmatic Abilities and Cognitive Substrates" (Bando Salute 2009; Grant number: 19), awarded to VB while affiliated to Scuola Normale Superiore of Pisa. We thank all the people who helped us in data collection.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00070

## REFERENCES

Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Arcara and Bambini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Tapping into neural resources of communication: formulaic language in aphasia therapy

Benjamin Stahl <sup>1</sup> \* and Diana Van Lancker Sidtis <sup>2</sup> \*

<sup>1</sup> Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, Berlin, Germany, <sup>2</sup> Department of Communicative Sciences and Disorders, New York University, New York, NY, USA

Keywords: formulaic language, left-hemisphere stroke, aphasia, apraxia of speech, Melodic Intonation Therapy, Intensive Language-Action Therapy, Constraint-Induced Aphasia Therapy, post-stroke depression and anxiety

Decades of research highlight the importance of formulaic expressions in everyday spoken language (Vihman, 1982; Wray, 2002; Kuiper, 2009). Along with idioms, expletives, and proverbs, this linguistic category includes conversational speech formulas, such as "You've got to be kidding," "Excuse me?" or "Hang on a minute" (Fillmore, 1979; Pawley and Syder, 1983; Schegloff, 1988). In their modern conception, formulaic expressions differ from newly created, grammatical utterances in that they are fixed in form, often non-literal in meaning with attitudinal nuances, and closely related to communicative-pragmatic context (Van Lancker Sidtis and Rallon, 2004). Although the proportion of formulaic expressions to spoken language varies with type of measure and discourse, these utterances are widely regarded as crucial in determining the success of social interaction in many communicative aspects of daily life (Van Lancker Sidtis, 2010).

### Edited by:

Alessio Plebe, University of Messina, Italy

### Reviewed by:

Matti Laine, Abo Akademi University, Finland; Marcelo L. Berthier, University of Malaga, Spain

### \*Correspondence:

Benjamin Stahl and Diana Van Lancker Sidtis, stahl@zedat.fu-berlin.de; diana.sidtis@nyu.edu

### Specialty section:

This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology

Received: 27 August 2015 Accepted: 22 September 2015 Published: 20 October 2015

### Citation:

Stahl B and Van Lancker Sidtis D (2015) Tapping into neural resources of communication: formulaic language in aphasia therapy. Front. Psychol. 6:1526. doi: 10.3389/fpsyg.2015.01526

The unique role of formulaic expressions in spoken language is reflected at the level of their functional neuroanatomy. While left perisylvian areas of the brain support primarily propositional, grammatical utterances, the processing of conversational speech formulas was found to engage, in particular, right-hemisphere cortical areas and the bilateral basal ganglia (Hughlings-Jackson, 1878; Graves and Landis, 1985; Speedie et al., 1993; Van Lancker Sidtis and Postman, 2006; Sidtis et al., 2009; Van Lancker Sidtis et al., 2015). It is worth pointing out that parts of these neural networks are intact in left-hemisphere stroke patients, leading to the intriguing observation that individuals with classical speech and language disorders are often able to communicate comparatively well based on a repertoire of formulaic expressions (McElduff and Drummond, 1991; Lum and Ellis, 1994; Stahl et al., 2011). An upper limit on the number of such expressions has not yet been identified, with some estimates reaching into the hundreds of thousands (Jackendoff, 1995).

The above literature suggests that formulaic expressions may be viewed as a valuable resource in speech-language therapy. However, surprisingly little is known about their potential impact on the success of popular programs in clinical rehabilitation. The current opinion paper seeks to address this matter by outlining the contribution of formulaic expressions to seminal approaches in recovery from speech and language disorders after stroke.

## Utterance-Oriented Approaches: Music-Based Rehabilitation Programs

According to analytical language philosophy and communicative-pragmatic theory, the meaning of an utterance emerges from its ordinary use by performing so-called "speech acts," such as greeting a person (Wittgenstein, 1953; Austin, 1962; Searle, 1969; Horn and Ward, 2008). Adopting this idea for clinical rehabilitation, treatment programs in speech-language therapy should be grounded in behaviorally relevant situations that enable patients to benefit from a range of communicative features, including the turn-taking structure underlying everyday conversation (Pulvermüller, 1990). For example, the speech act of greeting offers the conversation partner a number of possibilities to respond, typically by using individual sets of formulaic expressions, such as "Good to see you," "How's it going?" or "Long time no see." One may claim that incorporating this turn-taking structure in speech-language therapy does not provide any added value for the outcome of the treatment. If this is true, the training of formulaic expressions in communicative-pragmatic context should be as successful as exercises that focus on articulatory quality of the same utterances, regardless of their social function. However, it remains questionable how effective such utterance-oriented approaches are in improving everyday language abilities over and above articulatory quality of trained expressions.

Prominent examples of utterance-oriented approaches in speech-language therapy are, in some respects, music-based rehabilitation programs, among them a treatment known as Melodic Intonation Therapy (Albert et al., 1973). The treatment protocol requires persons with non-fluent aphasia to produce sentences and phrases in different modalities, including singing and rhythmic sprechgesang (Helm-Estabrooks et al., 1989). While the higher difficulty levels of the protocol encourage the use of grammatical utterances, the lower levels involve formulaic expressions, such as "I am fine," "How are you?" or "Thank you." Although most of these expressions may occur naturally in a conversation, their repetitive training does not meet the criteria of communicative-pragmatic speech-language therapy. Among other caveats, Melodic Intonation Therapy does not benefit systematically from the turn-taking structure underlying everyday conversation in the training sessions. This may limit the transfer of trained sentences and phrases into real life, a goal of primary importance in clinical practice.

In line with this view, randomized controlled trials on Melodic Intonation Therapy should not consistently reveal generalized effects on standardized aphasia test batteries, even if the sample of trained sentences and phrases is relatively large (cf. van der Meulen et al., 2014; van der Meulen, 2015; for evidence on model-oriented approaches in speech-language therapy, see Brady et al., 2012). Nonetheless, music-based rehabilitation programs have been demonstrated to directly benefit the production of trained expressions in individuals with chronic non-fluent aphasia and apraxia of speech (Wilson et al., 2006; Stahl et al., 2013; Zumbansen et al., 2014). One may argue that the reported progress in the production of such expressions depends, at least in part, on increased activity in right-hemisphere neural networks engaged in the processing of formulaic language, especially when considering the repetitive character of the training (cf. Berthier et al., 2014). If this notion is correct, it would help to explain conflicting results from neuroimaging studies, indicating either left perilesional or right frontotemporal reorganization of language in patients treated with Melodic Intonation Therapy (Belin et al., 1996; Schlaug et al., 2008, 2009; Vines et al., 2011). Future trials will hopefully determine whether or not these discrepant findings arise from different degrees of formulaicity in the experimental tasks (cf. Stahl and Kotz, 2014).

## Communicative-Pragmatic Approaches: Therapeutic Language Games

Communicative-pragmatic rehabilitation programs for individuals with aphasia aim at training verbal expressions in behaviorally relevant settings, so-called "language games" (Davis and Wilcox, 1985; Pulvermüller and Roth, 1991; Bastiaanse and Prins, 1994). Based on a variety of utterances, patients are invited to communicate with others by performing different types of speech acts, such as requesting objects from a person. Importantly, the turn-taking structure of language games offers the conversation partner a number of possibilities to respond, including a series of formulaic expressions ("Here you are," "You're welcome," "I'm sorry," "Too bad," "Pardon me?"). In contrast to utterance-oriented approaches, language games focus less on articulatory quality of sentences and phrases than on their suitability in communicative-pragmatic context. One may therefore claim that such approaches should, in principle, be effective in improving everyday language abilities over and above articulatory quality of trained expressions.

Prominent examples of communicative-pragmatic approaches are clinical language games, including a treatment known as Intensive Language-Action Therapy (cf. Constraint-Induced Aphasia Therapy; Difrancesco et al., 2012). The treatment protocol requires up to three individuals with aphasia and a therapist to obtain picture cards from each other, for example by making verbal requests. Utterances are combined with manual actions by handing over the requested card to other players. Depending on the availability of picture cards, the players use adequate sets of formulaic expressions to indicate whether a request was accepted ("Here you are," "Thank you," "You're welcome"), rejected ("I'm sorry," "No problem," "Too bad") or unclear ("Pardon me?"). That is, the repetitive interaction with formulaic expressions benefits from the rich turn-taking structure underlying everyday conversation, with possible implications for the success of the language game.

There is indeed ample evidence from randomized controlled trials suggesting that Intensive Language-Action Therapy induces generalized effects on standardized aphasia test batteries (Pulvermüller et al., 2001; Meinzer et al., 2005, 2007; Berthier et al., 2009). Although several elements included in the program are likely to contribute to this finding, the use of formulaic expressions may particularly account for the practicability of communicative-pragmatic approaches, allowing patients to tap into right-hemisphere language resources in the training sessions. Interestingly, neuroimaging studies have revealed either left perilesional or right frontotemporal functional reorganization in patients treated with Intensive Language-Action Therapy (Meinzer et al., 2004, 2008; Pulvermüller et al., 2005; Breier et al., 2006, 2009; MacGregor et al., 2014; Mohr et al., 2014; Barbancho et al., 2015). Future trials may help to clarify to what extent these results depend on increased activity in neural networks supporting formulaic language.

## Possible Impact on Motivation, Well-Being, and Quality of Life

Individuals with left-hemisphere brain damage often experience a sudden inability to engage in communication with others based on propositional, grammatical utterances. This loss of social interaction skills may be one reason for the high prevalence of severe psychopathological symptoms in the first year following acquired brain injury (cf. Lewinsohn, 1974). Notably, almost half of the patients suffer from post-stroke depression or anxiety during this period of time (Kauhanen et al., 1999; Fleminger et al., 2003; Schöttke and Giabbiconi, 2015). While antidepressant medication is an option for most patients with speech and language disorders, classical forms of psychotherapy remain challenging due to constrained verbal expression and comprehension.

A number of approaches in psychotherapy seek to identify and activate resources in order to overcome cognitive-affective distress (Priebe et al., 2014). When this goal is adopted for clinical rehabilitation after stroke, formulaic expressions frequently remain one of the few communicative resources available to patients with left-hemisphere brain damage. However, patients are commonly unaware of their ability to perform sets of formulaic expressions correctly. Using these utterances in therapy may therefore play a key role in compensating for loss of social interaction, with a possible beneficial influence on motivation, subjective well-being, and quality of life (Doering et al., 2011; Hilari et al., 2012; Kuenemund et al., 2013). Although anecdotal evidence supports the positive non-linguistic effects of formulaic language in stroke patients, this hypothesis has not been studied experimentally.

We wish to emphasize that current programs in speech-language therapy differ considerably in how they take advantage of formulaic expressions, drawing on neural resources of communication, to support social interaction. As discussed previously, utterance-oriented approaches focus mainly on articulatory quality in the training sessions. In contrast, communicative-pragmatic approaches benefit from the rich turn-taking structure underlying everyday conversation, thus encouraging the use of formulaic expressions in natural settings. We believe that methods relying on preserved language abilities in contexts of social interaction may have a substantial impact on recovery from cognitive-affective distress, especially in persons with concomitant post-stroke depression and anxiety, a claim yet to be confirmed empirically.

## Open Questions for Future Research

A growing body of research provides compelling evidence for the contribution of right-hemisphere cortical and bilateral subcortical neural systems to the production and comprehension of formulaic language. These data are consistent with the notion that the efficacy of prominent approaches in speech-language therapy is, to some degree, dependent on the intensive use of formulaic expressions. However, it is still poorly understood how exactly the language system of the damaged brain benefits from neural resources associated with formulaic expressions. There is, in fact, a range of neurophysiological scenarios that may account for descriptions of preserved language skills in clinical rehabilitation.

According to Hebbian learning, the synchronous firing of cell assemblies is likely to strengthen the neural connectivity between them, even if they are located in distributed areas of the brain; in other words, "cells that fire together, wire together" (Hebb, 1949). This neurobiological model may be appropriate in addressing three fundamental questions in future research: (i) Does intensive training of formulaic expressions stimulate neural activity in right-hemisphere cortical and bilateral subcortical language circuits? (ii) Does the combined training of grammatical utterances and formulaic expressions lead to functional reorganization in the interplay of left perilesional and intact right-hemisphere language networks? (iii) Does this bilateral neural interplay affect treatment-induced generalized effects observed on standardized aphasia test batteries?
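In its simplest textbook form, the Hebbian principle is written as a weight change proportional to the product of presynaptic and postsynaptic activity. The sketch below illustrates only that rule. The function name, the learning rate, and the binary activity values are illustrative assumptions, and nothing here models the language circuits discussed above.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1):
    """Basic Hebbian rule: the connection strengthens in proportion to joint activity."""
    return weight + learning_rate * pre * post

# Hypothetical activity pairs (pre, post): 1 = the cell assembly fires, 0 = it is silent.
activity = [(1, 1), (1, 0), (0, 1), (1, 1)]

weight = 0.0
for pre, post in activity:
    weight = hebbian_update(weight, pre, post)
```

Only the two steps in which both assemblies fire together change the weight. Activity on one side alone leaves the connection untouched, which is the sense of "fire together, wire together."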

With this article, we wish to raise awareness of neural resources of communication in the treatment of left-hemisphere stroke patients. We readily acknowledge that the examples included may only be the "tip of the iceberg." Based on our experience, the ability to use formulaic expressions is often well documented in clinical practice, commonly under a variety of different terms. However, the possible influence of such expressions on the outcome of speech-language therapy frequently remains unnoticed. Uncovering the behavioral and neural dynamics of formulaic expressions may therefore be crucial in identifying and activating resources of communication more systematically. This may help to improve the success of current attempts to promote recovery from speech and language disorders and cognitive-affective distress after stroke.

## References

Albert, M. L., Sparks, R. W., and Helm, N. (1973). Melodic Intonation Therapy for aphasia. Arch. Neurol. 29, 130–131. doi: 10.1001/archneur.1973.00490260074018

Austin, J. L. (1962). How to do Things with Words. Oxford: Clarendon Press.

Barbancho, M. A., Berthier, M. L., Navas-Sánchez, P., Dávila, G., Green-Heredia, C., García-Alberca, J. M., et al. (2015). Bilateral brain reorganization with memantine and Constraint-Induced Aphasia Therapy in chronic post-stroke aphasia: an ERP study. Brain Lang. 145–146, 1–10. doi: 10.1016/j.bandl.2015.04.003


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Stahl and Van Lancker Sidtis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Bridging the gap between DeafBlind minds: interactional and social foundations of intention attribution in the Seattle DeafBlind community

### Terra Edwards \*

*Department of Linguistics, Gallaudet University, Washington, DC, USA*

### Edited by:

*Gabriella Airenti, University of Torino, Italy*

### Reviewed by:

*Marie Coppola, University of Connecticut, USA; Elinor Ochs, University of California, Los Angeles, USA*

### \*Correspondence:

*Terra Edwards, Department of Linguistics, Gallaudet University, 800 Florida Avenue NE, Washington, DC 20002, USA, terra.edwards@gallaudet.edu*

### Specialty section:

*This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology*

Received: *13 July 2015* Accepted: *16 September 2015* Published: *09 October 2015*

### Citation:

*Edwards T (2015) Bridging the gap between DeafBlind minds: interactional and social foundations of intention attribution in the Seattle DeafBlind community. Front. Psychol. 6:1497. doi: 10.3389/fpsyg.2015.01497*

This article is concerned with social and interactional processes that simplify pragmatic acts of intention attribution. The empirical focus is a series of interactions among DeafBlind people in Seattle, Washington, where pointing signs are used to individuate objects of reference in the immediate environment. Most members of this community are born deaf and slowly become blind. They come to Seattle using Visual American Sign Language, which has emerged and developed in a field organized around visual modes of access. As vision deteriorates, however, links between deictic signs (such as pointing) and the present, remembered, or imagined environment erode in idiosyncratic ways across the community of language-users, and as a result, it becomes increasingly difficult for participants to converge on objects of reference. In the past, DeafBlind people addressed this problem by relying on sighted interpreters. Under the influence of the recent "pro-tactile" movement, they have turned instead to one another to find new solutions to these referential problems. Drawing on analyses of 120 h of videorecorded interaction and language-use, detailed fieldnotes collected during 12 months of sustained anthropological fieldwork, and more than 15 years of involvement in this community in a range of capacities, I argue that DeafBlind people are generating new and reciprocal modes of access to their environment, and this process is aligning language with context in novel ways. I discuss two mechanisms that can account for this process: *embedding in the social field* and *deictic integration.* I argue that together, these social and interactional processes yield a deictic system set to retrieve a restricted range of values from the extra-linguistic context, thereby attenuating the cognitive demands of intention attribution and narrowing the gap between DeafBlind minds.

Keywords: intention attribution, deictic reference, pointing, DeafBlind, Tactile American Sign Language, deictic integration, practice

## Introduction

This article analyzes some of the social and interactional mechanisms that constrain pragmatic acts of intention attribution among DeafBlind people in Seattle, Washington. In particular, it focuses on the use of pointing signs and the means by which potential referents in the environment are narrowed down. In visual signed languages, pointing signs can be used gesturally, but they are also recruited by the grammar, taking on a range of linguistic functions (Friedman, 1975; Klima and Bellugi, 1979; Supalla, 1982; Petitto, 1987; Padden, 1988; Engberg-Pedersen, 1993; Liddell, 1995; Taub, 2001; McBurney, 2002; Meier, 2002; Rathmann and Mathur, 2002; Pfau and Steinbach, 2006; Pizzuto, 2007; Coppola and Senghas, 2010; Meier and Lillo-Martin, 2013; Gökgöz et al., 2015). Evidence for this includes, among other things, that some pointing signs are acquired by children according to developmental patterns similar to learners of corresponding spoken language forms (Petitto, 1987; Pizzuto, 2007: p. 292; Gökgöz et al., 2015), they appear to have syntactic distributions that are the same as corresponding elements in spoken languages (Padden, 1988), but different from co-speech pointing gestures (Cormier, 2014:4, Cf. Johnston, 2013), and they are subject to visual and processing constraints that apply to linguistic, but not gestural phenomena (Siple, 1978; Emmorey, 2002).

There are, however, unresolved theoretical issues regarding characteristics of pointing signs that are difficult to account for from phonological, morphological, and syntactic perspectives (Mathur, 2002; Pizzuto, 2007). These problems have been approached from many different angles (see Mathur and Rathmann, 2012, for a review), and yet, scholars are converging on the fact that pointing signs, no matter how far into the grammar they penetrate, cannot be adequately described via linguistic analytics alone (Liddell, 2003; Dudis, 2004; Johnston, 2013; Meier and Lillo-Martin, 2013; Cormier, 2014). This pushes pointing in signed languages into the realm of pragmatics, where questions of intention attribution inevitably arise (Grice, 1971; Levinson, 1983; Searle, 1983).

In the sign language linguistics literature, intention attribution, and more generally, speech act theory, has played a fairly limited role in addressing problems associated with pointing. Instead, concepts such as cognitive capacity, gesture, and iconicity have been central and from those perspectives, constructs such as "real space" (Liddell, 2003: p. 82), "gestural space" (Rathmann and Mathur, 2002: p. 144), and "iconic prototypes" (Sandler, 2012) have been proposed. These constructs tend to assume a non-problematic relationship between representations (both linguistic and cognitive) and embodied experience. Liddell, for example, defines real space as "a person's current conceptualization of the immediate environment based on sensory input" (Liddell, 2003: p. 82). Real space is isomorphic with the conceptualizer's experience of the environment, and is assumed to be reciprocal across the group of language-users since "[i]n general, real space lines up well with physical things in the world" (Liddell, 2003: p. 84). From this perspective, intention attribution seems pretty simple: you and I inhabit the same world and/or representation of it, so when I point to an object or location in real or imaged space, the object of my attention will likely be self-evident to you<sup>1</sup>.

For DeafBlind people in Seattle, however, seamlessness between experience and representation can rarely be assumed. Most members of the community are born deaf and become blind slowly. Everyone becomes blind in different ways and at different rates. Therefore, sensory capacities and habitual modes of sensory orientation vary significantly across the group. These differences are compounded by differences in race, ethnicity, gender, age, disability, socioeconomic status, sexual orientation, and school experience (i.e., growing up in a residential school for the deaf vs. a deaf program within a hearing school, etc.). Furthermore, tactile reception of Visual American Sign Language (VASL) was, until recently, the only available choice. Since VASL emerged and developed among sighted people, and is therefore built around visual modes of access and orientation, it is only partially perceptible via tactile reception, much as spoken English is only partially perceptible via lipreading (see Edwards, 2014b). In other words, for DeafBlind people, the systems of representation historically available to them are shaped by a world that they can no longer access. In addition, authority accrued to sighted social roles, and legitimacy accrued to visual modes of communication; therefore, in order to maintain one's position and status in the social order, tactility had to be avoided. In the past, these barriers were considered too great to surmount and direct communication between DeafBlind people was rarely attempted (Edwards, 2014a: pp. 86–90). Instead, DeafBlind people communicated via sighted interpreters. However, since 2007, a socio-political movement known as the pro-tactile movement has opened up new possibilities for direct communication between DeafBlind people.

The pro-tactile movement is based on the idea that all human activity can be realized via tactile-kinesthetic channels, including interaction and language-use. Therefore, interpreters are not necessary for DeafBlind people to interact with each other or their environment. However, in order to legitimize practices built around tactile modes of access, social restrictions on touch have to be relaxed and experimentation encouraged. In 2010 and 2011, a series of 20 pro-tactile workshops was led by two DeafBlind instructors for 11 DeafBlind participants with these aims in mind. This paper focuses on several interactions between DeafBlind people that took place as part of the pro-tactile workshops<sup>2</sup>. The interactions were videorecorded by a team of three videographers from multiple angles (120 h of video data was collected in total). Sequences of communicative activity where DeafBlind people coordinated and directed each other's attention to particular dimensions of setting were subsequently isolated, and the ways in which pointing signs were produced and responded to were considered<sup>3</sup>. Examining these moments among DeafBlind people in Seattle offers unique insight into how language and context are brought into alignment with our embodied experience of the world.

<sup>1</sup>See Duranti (2010) for a critique of intersubjectivity in the social sciences, which touches on this and related points.

<sup>2</sup>This study has been approved by the Committee for Protection of Human Subjects at the University of California, Berkeley, and all research subjects have given their informed consent.

<sup>3</sup>When these data were collected, I made an extensive list of transcribed entries, describing and translating interactions that were of interest, given my theoretical interests at the time. I also included things that seemed to diverge in significant ways from what would be expected in visual signed language communities. Each entry includes a video code and a time code, and the overall document functions like an index of notable interactional events. This index is 60 pages long and contains roughly 100 descriptions and/or translations of communicative sequences. For purposes of this article, I reviewed this index, and drew on it for some of the examples. In addition, I returned to the raw data, looking for patterns specific to demonstratives and locatives; please see footnote at the end

In Speech Acts and Intentional States from a Practice Perspective, I begin with a discussion of intention attribution viewed through the lens of practice theory (Giddens, 1979; Bourdieu, 1971, 1990 [1980]; Hanks, 1996, 2005a,b; Edwards, 2012, 2014a,b). From this perspective, embodied knowledge takes on a crucial role for language-users as they work to converge on specific, pragmatically situated meanings and effects in interaction. I argue that these embodied forms of knowledge arise in dynamic tension with structured and historically pre-given fields of social and interactional activity. In Embedding in the Social Field, I consider the effects of these tensions on language-use. Drawing on Hanks (2005b), I argue that embedding in the social field involves the legitimation of new styles, modalities, and genres, as well as the authorization of some language users (and not others) to evaluate linguistic forms and communicative practices as correct, appropriate, polite, or not. I argue that this dual constraint of legitimation and authorization restricts the range of feasible moves and interpretations in interaction among DeafBlind people in ways that simplify the cognitive tasks required for intention attribution. In The Deictic System and the Deictic Field, I ask how these two constructs work in tandem to structure deictic reference (Bühler, 2001 [1934]; Hanks, 2005a). When a deictic sign is instantiated, contextual values must be retrieved and coordinated, and patterns in retrieval have an effect on the internal organization of the language. I call this process "deictic integration." Via detailed analysis of interactional sequences among DeafBlind people, as well as attention to their metapragmatic commentary, I show how deictic integration is accomplished in the workshops.
In Deictic Integration and Appropriate Pointing in TASL: Embedding and Integration in the Social and Deictic Fields, I argue that in conjunction with embedding in the social field, deictic integration is giving rise to a deictic system in Tactile American Sign Language (TASL), which diverges from the visual system on which it is scaffolded. Evidence for this claim includes an emerging distinction between demonstratives and locatives in TASL represented by a difference in movement (tapping vs. tracing, respectively). I show how these changes emerged as certain practices for pointing were deemed appropriate and others were deemed inappropriate by DeafBlind leaders, who are invested with the requisite authority. I conclude in the final section, with some reflections on the role of deictic integration and embedding in the social field for simplifying the task of intention attribution from the perspective of the DeafBlind participant. In particular, I emphasize the importance of socially transmitted forms of embodied knowledge in fitting the linguistic system to particular fields of activity, thereby narrowing the gap between DeafBlind minds.

## Speech Acts and Intentional States from a Practice Perspective

When people apply linguistic resources in the speech situation, they are not only producing semantic meanings; they are also performing pragmatic actions such as informing, requesting, and asserting (Austin, 1965; Searle, 1969; Grice, 1989). And yet, when utterances are taken out of context, the pragmatic layer can collapse, revealing a kind of "residual semanticity" or indeterminacy that can be manipulated by speakers to deny specific inferences: "Thus, the characteristic speaker's denial of speech offensive to the hearer takes the form of 'all I said was. . .'" (Silverstein, 1976: p. 47). Reducing an utterance in this way produces many possible interpretations, which must be narrowed to generate specific meanings and effects in interaction. One of the ways that participants accomplish this is by attributing communicative intentions to their interlocutor (Grice, 1971; Levinson, 1983; Searle, 1983).

Intention, in the sense of meaning to do something, is just one of many intentional states. Broadly construed, a mental state is intentional insofar as it is directed toward an object or state of affairs (Searle, 1983: pp. 1–37). Other intentional states include, for example, belief, love, elation, anxiety, irritation, and remorse<sup>4</sup> (Searle, 1983: p. 4). It is in this broader sense that the term is taken up here. Intentional states correspond in many ways to speech acts; speakers can insist that their interlocutor leave the room in much the same way as they can believe, fear, or hope their interlocutor will leave the room (Searle, 1983: pp. 5–6). These kinds of correspondences come together in Searle's "conditions of satisfaction," including, for example, his sincerity condition<sup>5</sup>. Each time an illocutionary act is performed, an intentional state is expressed via the same propositional (or representative) content (Searle, 1983: p. 9). Insofar as the intentional state and the illocution correspond, the speaker satisfies the sincerity condition. For example, if I say, "It is snowing," I have produced an assertion (speech act), which corresponds to the belief (intentional state) that it is snowing. If I believe it is snowing when I assert that it is snowing, I have satisfied the sincerity condition. The sincerity condition is one among many, which link utterances (and other representative content) to a psychological and/or illocutionary mode, thereby specifying its meaning or effect.

However, anthropologists have shown that such conditions are culturally and historically specific, that they presuppose certain notions of personhood, and they can be more or less attenuated in different communicative contexts (e.g., Silverstein, 1976; Rosaldo, 1982; Duranti, 1984; Ochs, 1984; DuBois, 1987; Hanks, 1990; Kockelman, 2010). My aim is to build on this work by considering the role of embodied knowledge and practical circumstance in structuring the convergence of the speaker and addressee's intentional states on objects of reference in the immediate environment (Bourdieu, 1971, 1990 [1980]; Giddens, 1979; Hanks, 1996; Edwards, 2012, 2014b; Hanks, 2005a,b). To

of Appropriate Pointing in TASL: Embedding and Integration in the Social and Deictic Fields. I have also participated in the practices described here over the past 15 years of in-depth ethnographic engagement, and have developed an intuition for patterns in language that are new, and those that are not. I am relying on all of these forms of knowledge in my analyses.

<sup>4</sup>These are intentional states insofar as they are directed. Diffuse anxiety or elation with no identifiable cause does not count as an intentional state for Searle.

<sup>5</sup>Cf. Austin's first Gamma condition (1965: p. 15) and Grice's maxims of quality (1989: pp. 27–28).

this end, I begin with the practical communicator, who exists in a world of routine, where much of what is said is anticipated and much of what is done could be done without saying much (Hanks, 1996). Informed by practice theory, I assume that patterns that emerge out of that regularity do not inhere solely in the linguistic system, nor can they be isolated in a static and detachable set of conditions or rules. Rather they cohere in the relations between the language-user, the language, and the specific fields of socio-historical and interactional activity where each is shaped. Embedded in routine patterns of embodied activity, the cognitive tasks required for generating pragmatically situated meanings appear less demanding than they otherwise might. In what follows, I outline three key concepts required for understanding intention attribution in a practice framework: habitus, field, and embedding.

### Habitus

Habitus is an acquired system of generative schemes, which predisposes actors to perceive, think and act in ways that feel correct, appropriate, and polite (Bourdieu, 1990 [1980]: pp. 52–65). Individuals share a habitus insofar as they are subject to social and material conditions that reinforce a ground of common sense ideas and behaviors, which, in turn, tend to reproduce the conditions that gave rise to those ideas (Bourdieu, 1971: p. 80). This circular process tends to convert history into second nature and in doing so, harmonizes the practices of the group in ways that are not transparent to its members. Harmonization is most apparent, analytically, in non-reflective patterns of thought, action, perception, and navigation<sup>6</sup>. Individual differences can only be evaluated reciprocally against the backdrop of a common habitus (Bourdieu, 1971: pp. 81–86).

Frames for evaluation, which tend to restrict possibilities for action, are an integral part of the habitus. These frames derive from the Aristotelian notion of hexis: an intention (or desire) to act together with reflexive judgments of that intention, guided by, or weighed against, frames of social value and meaning (Hanks, 2005a: pp. 69–72). Under the influence of Merleau-Ponty, Bourdieu's notion of hexis shifts analytic attention from the mind to the habituated activity of the body:

The evaluative perspective, once embodied, emerges as active perception, and the intentional states of desire and purpose become the inclination of body posture (Hanks, 2005b: pp. 71–72).

While Bourdieu locates hexis in the body and its dispositional tendencies, Giddens locates this kind of reflexive monitoring on three distinct planes of consciousness: practical, discursive, and unconscious (1987: pp. 1–49). I focus here on the first two: practical and discursive consciousness. Practical consciousness accounts for "the tacit knowledge that is skillfully applied in the enactment of courses of conduct, but which the actor is not able to formulate discursively" (Giddens, 1979: p. 57). While practical consciousness accounts for what actors know how to do, discursive consciousness accounts for the knowledge actors are able to talk about.

Among DeafBlind people, dimensions of practice that would normally remain tacit are projected onto a discursive plane. This provides an unusual opportunity to see how embodied modes of knowledge contribute to the narrowing of interactional potentials in practice. For example, during the pro-tactile workshops, a group of DeafBlind people were playing a tactile version of charades. Someone would enact a character or person, everyone would explore the enactment tactually, and then they would take turns guessing who it was. After one of these games, an instructor asked a participant about his experience<sup>7</sup>.


Here, DeafBlind participants are talking about a breakdown in practical sequence, where an embodied disposition that works well for the sighted leads to injury. When breakdowns like this occur, practical activity becomes an object of discursive reflection as the instructor and the student explicitly talk about, and try out, different combinations of communicative cues and body postures. After a few attempts, they agree on a particular strategy and the instructor says she will tell her co-instructor (Adrijana) what they have come up with. This shows that a strategy has been chosen and legitimized, making it a candidate for the communicative repertoires collaboratively constructed in the workshops. This process gives rise to novel practices while also linking them to social, evaluative frames associated with correctness, politeness, and appropriateness. Innovations that stick recede from discursive consciousness into practical consciousness. Part of what determines whether something will stick, and therefore recede, is the degree to which the practice is commensurate with the emergent, reciprocal body-schema of pro-tactile people.

### Reciprocity and the Body Schema

In a practice framework, the body schema is neither a representation of the body, nor a mere physical fact about the body (Hanks, 2005a: p. 69). Rather, it accounts for the "momentary grasp that actors have of being a body" (Hanks, 2005a). When the Marilyn Monroe charade was over and the

<sup>6</sup>Cf. Grice's cooperative principle (1989: p. 26).

<sup>7</sup>Dialog was translated from American Sign Language into English by the researcher.

participants began to sit down, their heads collided because their grasp of being a body, or their body schema, was not commensurate with tactile modes of access. The collision presented an opportunity to bring representation, practice, and the physical surround into alignment. Practical strategies that do not lend themselves to such alignments tend to fall away over time as an inhabitable world coheres.

In a coherent, inhabitable world, hexis and the body schema work together to generate a reciprocity of perspectives (Schutz, 1970: p. 183). Where there is reciprocity, shared evaluative frames are applied to the reflexive grasp that DeafBlind actors have of being a body. Without this kind of reciprocity, people collide and are injured. They also have difficulty communicating, and these two facts are not unrelated. Where perspectives on the physical and social world are not reciprocal, propositional content appears under no particular perspective; the pragmatic layer never quite crystalizes, and the indeterminacy of language becomes a persistent, practical problem. The body plays a crucial role in addressing such problems, not as a physical or representational mechanism, but as the site of a reflexive grasp that social actors have of being a body. However, the tactile body demands relations to the world and to other people, which may appear inadmissible from a social perspective. In order to account for these constraints and their reconfiguration among DeafBlind people in Seattle, I appeal to the notion of the social field.

## The Social Field

The social field is a structured space of positions and roles, along with the historically specific means by which those positions and roles are occupied by social actors (Hanks, 2005a: p. 72). In the social field, speaking is a means of position-taking, which is dually constrained by legitimation and authorization (Hanks, 2005a: pp. 72–73). Legitimation accrues to styles and genres of language use, knowledge of which is limited by social and economic position. Limitations on who has access to legitimate styles and genres restrict access to power, reinforcing unequal power relations. Authorization, on the other hand, is invested in the actors themselves, via the social roles they occupy (Hanks, 2005a: p. 76).

For DeafBlind language-users, dynamics in the social field include not only genres and styles of language-use, but also the relative legitimacy of different channels through which linguistic signs are exchanged (i.e., visual-kinesthetic vs. tactile-kinesthetic), the modes of access used to link linguistic signs to people, things, and events in the environment (i.e., memory, perceptual access, shared knowledge), and the relative authorization of social actors who habitually draw on and reproduce those channels in reciprocal ways (i.e., "visual people" vs. "tactile people"). Historically, visual channels and modes of access accrued more legitimacy than their tactile counterparts (Edwards, 2014b). Therefore, as DeafBlind people competed for resources in the social field, it was advantageous for them to continue communicating via visual channels and modes of access, even after they had become blind<sup>8</sup>. This meant that they did not have direct access to things like body posture, eye-gaze, and other embodied behaviors, which are transmitted by the habitus. Therefore, there was no way for one DeafBlind person to evaluate another against shared frames of social value. Instead, they relied on sighted people to share their interpretations and impressions. DeafBlind people were always removed from the embodied knowledge required for position-taking in the social field.

The inception of the pro-tactile movement brought with it a reconfiguration of social roles and positions, new ways of linking evaluative frames to embodied experience, and novel patterns in position-taking. Rather than DeafBlind people accruing legitimacy by communicating as sighted people do, an internal hierarchy was established within the Seattle DeafBlind community. A small minority of DeafBlind signers, who were, importantly, "tactile people," emerged as leaders, and they applied their authority in judgments about the correctness of certain linguistic forms and communication practices. As a result, some embodied behaviors (and not others) became legitimate ways of being smart, polite, interesting, "culturally DeafBlind," and so on. Patterns in language-use were caught up in this broader transformation, and novel linguistic forms began to mark new social distinctions (Edwards, 2014b). From there, pro-tactile practices could be used to acquire resources in the social field (e.g., prestige, membership, employment), without relying on the impressions, opinions, or interpretations of the sighted.

For example, in the following exchange, Lee, one of the instructors, identifies some linguistic forms and practices as appropriate, and others as inappropriate. In doing so, she is also legitimizing tactile modes of access to the environment and downplaying the necessity of visual access<sup>9</sup>.

**Lee:** The announcements for today are about the new rules. First, the video people—[. . . ]—you're not allowed to talk with them. You're not allowed to ask them, as sighted people, where things are or where people are. So the film people are "not here." That's crucial. [. . . ] That's the first rule. [. . . ] The second rule is that you have to be assertive—and feel around! You can't just stand there and wait for someone to tell you what to do.

Rules like this pushed participants toward tactile modes of access and exchange. DeafBlind leaders naturalized the practices that emerged as a result by labeling them culturally appropriate, correct, polite, and "pro-tactile." For example, Adrijana explained to one of her students that

if someone is eating, and you touch their arms and their face, you will figure out they are eating. Then you know to leave them to their meal. It's the same thing with any other activity someone might be doing. You feel their arms, and then that leads you to some other part of their body, maybe their hands, and then you know what they're doing and how to interact with them. The point is that touching people for the purpose

<sup>8</sup> See Edwards (2014b: pp. 118–143) for further discussion.

<sup>9</sup>This exchange took place in the second of 10 pro-tactile workshops, when basic problems were being identified. The only sighted people present during the workshops were working as part of the research team, videorecording the interactions.

of gathering information is perfectly acceptable. So that is, in essence, what today's class has been about. [. . . ]

Adrijana is inviting her student to reconsider habitual, dispositional ways of interacting with others and with the environment. The appropriateness or inappropriateness of these practices is deeply ingrained from childhood, so abandoning established practices feels like a risk. Because Adrijana is invested with the requisite authority, her students took her up on her invitations, regardless. Where they felt resistance, they were encouraged to reflect. In one workshop, a participant identified the relative physicality of tactility and visuality as a potential problem for the pro-tactile movement. She argued that sighted people don't understand the range of things that touch can do, and too often assign sexual or romantic meanings to tactile signals.

They have to understand that touching is about feeling—it's about having access to emotion—just like they have through vision. Touch is no more physical than vision.

These kinds of social facts—for example the commonly held idea that touch is more physical than vision—become apparent to DeafBlind people when they try to substitute tactile communication strategies for visual ones, and have the reflexive sense that they are doing something inappropriate. In response, Adrijana insisted that DeafBlind people must not comply with those impulses. Instead, she encouraged them to apply pressure to established social norms (i.e., join the pro-tactile social movement), or else suffer the effects of isolation:<sup>10</sup>

When people use their eyes for seeing, that causes them to feel. When people use their ears for hearing, that causes them to feel. [M]ost DeafBlind people have been missing out on feeling because we've been so focused on [language], and that's all. But there's this whole environment around us—a whole world and we can't feel it. So that's why [the pro-tactile movement] is so important and why it has to include ways for us to feel things again. [. . . ] We need more stimuli for our bodies to interpret. All of that is part of "pro-tactile."

Adrijana is not arguing that DeafBlind people are physically incapable of "feeling the world." Rather, she is arguing that tactile modes of knowing have historically been limited by excessive social restrictions. Relaxing those restrictions requires new relations to be established between the habitus, the linguistic system, and the social field where touching is evaluated. I analyze this process as a kind of embedding in the social field (Hanks, 2005b).

## Embedding in the Social Field

Broadly speaking, embedding is a process through which highly schematic form-meaning correspondences undergo reshaping, conversion and transformation as contextual values are retrieved (Edwards, 2014b; Hanks, 2005b: p. 194). Through patterns in retrieval, the linguistic system is aligned with its contexts of use, generating a restricted range of feasible interpretations. Four mechanisms of embedding have been proposed: practical equivalences, counterparts, rules of thumb (Hanks, 2005b) and integration (Edwards, 2012: pp. 52–63, 2014a: pp. 26–27). Practical equivalences, counterparts, and rules of thumb transform the meaning associated with forms as they are instantiated. Integration, in contrast, affects both form and meaning.

Embedding in the social field involves: (1) the legitimation of certain styles, modalities, and genres of language-use for taking up recognizable social positions, along with the embodied knowledge necessary to do so, and (2) authorization of some language-users to evaluate linguistic forms and communicative practices as correct, appropriate, polite, or not. In the social field, the effect of an utterance will be different depending on who produces it and what social position that person occupies. For example, in Yucatec Maya, an utterance produced by a shaman about a divining crystal will have a particular effect because of the authority invested in him and the social position he occupies, just as a radiologist's position authorizes him to interpret x-rays (Hanks, 2005b: p. 202). However, position is not enough. Legitimate modes of language-use, body posture, dress, overall comportment, and other aspects of practice must be convincingly enacted as well.

Legitimation and authorization constrain position-taking, thereby restricting the range of feasible moves in any interaction and the feasible interpretations of any utterance. In a practice framework, these restrictions are not listed a priori as maxims (Grice, 1989) or conditions (Austin, 1965; Searle, 1983). They are instead historically specific relations that cohere between: (1) actors, (2) social roles and positions, along with the structures they fit into, and (3) the embodied and linguistic knowledge required for taking up those roles and positions in legitimate ways. These relations are amenable to ethnographic, historical, and interactional analysis, and they have a role in shaping the internal organization of the language.

However, position-taking in the social field does not have a direct or determinate effect on the linguistic system. Rather, the social configuration of the body acts indirectly on the language as the ground against which reference is achieved. In other words, in order to individuate an object of reference in the immediate environment, the language must be aligned with the capacities of the body, the physicality of the world, and the reciprocal modes of access that are established across a group of language-users. In order to grasp these dimensions of practice, a shift in analytic perspective from the social to the deictic field is required.

## The Deictic System and the Deictic Field

The deictic system and its corresponding deictic field structure how people refer to objects and events in the immediate environment (Bühler, 2001 [1934]; Hanks, 2005a). The deictic system is composed of semantic elements which are organized by contrastive opposition (e.g., this is not that). These oppositions contribute to the definiteness of reference, or the capacity of speaker and addressee to pick out a bounded thing among other

<sup>10</sup>See also Sauerburger (1993: pp. 87–98) on isolation in DeafBlind populations.

things. Deictic signs also direct the attention of the addressee to the object by way of mutually accessible relations; this is the directivity of reference. While definiteness derives from the deictic system, directivity derives from the deictic field, where patterns in memory, sensory perception, navigation, and modes of attention cohere to generate pathways, channels, grids, and coordinate schemes that speaker and addressee draw on to converge on an object. Therefore, all deictic signs are composite, composed of both "symbols" and "signals" (Bühler, 2001 [1934]: p. 99). Any time a deictic sign is instantiated, values must be retrieved from two distinct sources: the deictic system and the deictic field.

Given stable and reciprocal sensory capacities, relations of embedding between the two should be so seamless that reference to objects in the immediate environment feels self-evident, concrete, and natural to the language-user (Hanks, 1990: p. 5). However, in the context of radical shifts in sensory capacity, this apparent concreteness is disrupted, and the means by which the deictic system and the deictic field are brought into alignment is revealed.

The deictic system also registers social relations in an indirect way by aligning the grammar with modes of access that are reciprocal across a group of language-users (Edwards, 2014b). Modes of access include patterns in how the body perceives, moves through, remembers, and inhabits its environment. Any time signer and addressee converge on a referent in the immediate environment, modes of access in the deictic field must be coordinated. Analytically, the body must be viewed under distinct perspectives in the social and deictic fields<sup>11</sup>. However, in practice, the body that grounds reference is also the body that takes up positions in the social field. If a group of DeafBlind language-users has been socialized to avoid touching objects in their environment, it will be difficult to converge on an object of reference that is available via strictly tactile modes of access.

Therefore, social and deictic pressures are dually exerted on the linguistic system via the body. Nevertheless, as mentioned before, distinct analytic approaches are required for grasping the social and deictic processes that exert those pressures. In the social field, the analyst aims to understand how particular styles, genres, and channels are differentiated and legitimized for purposes of position-taking. In the deictic field, the analyst focuses instead on how pathways, relations, and dynamics in the environment are made reciprocal across a group of language-users as signer and addressee converge on objects of reference. Possibilities for how pathways in the deictic field can be organized are constrained at the outset, since many of the routes, relations, and modalities that could link speaker and addressee to the object, given the physical and cognitive capacities of humans, are ruled out on social grounds in historically contingent ways.

## Deictic Integration

As social restrictions on touch were loosened among DeafBlind people, new pathways in the deictic field became available, and these pathways affected the internal organization of the deictic system. I use the term "deictic integration" (Edwards, 2014b: pp. 27–61, 159–190) to account for the coordination of linguistic elements that derive from the deictic system with non-linguistic elements that derive from the deictic field into tighter and more restricted configurations over time so that (a) when a deictic sign is instantiated, retrievable values are restricted to a small and alternating set, and (b) deictic signs are organized by contrastive opposition (e.g., this and that in English). For example, the pronominal system of VASL makes a two-way distinction between first and non-first person (Meier, 1990: p. 377). The first person form is encoded in a pointing sign directed toward the signer and the non-first person form is encoded in a pointing sign directed away from the signer. This distinction retrieves values from basic participant frameworks, which inhere in the deictic field. In other words, these pointing signs are organized by contrastive opposition, which derives from the linguistic system, and are set to retrieve one of a restricted set of values (i.e., first or non-first person) from the deictic field. The participant frameworks themselves derive from the deictic field, and only the most schematized, basic, or expectable configurations make their way into the deictic system (Hanks, 1990: p. 149).

Linguistic pointing signs are therefore distinguished from pointing gestures according to the tightness of the relations that obtain between (1) schematic, oppositional categories, which are repeatable and transportable across contexts, and (2) relations, roles, and dynamics in the deictic field, where those forms are routinely instantiated. If a pointing sign is momentarily altered as it is brought into alignment with some dimension of context, linguistic and deictic elements are merely coordinated. If there is a restricted set of values (e.g., person and number values), and one of those values must be selected in order to produce a grammatical utterance, linguistic and deictic elements are integrated. The process whereby the deictic system and the deictic field are coordinated into tighter and more restricted configurations is what I am calling "deictic integration."

Together, embedding in the social field and deictic integration narrow interactional and referential possibilities, thereby reducing the cognitive burden that interactants are faced with as they attribute intentional states to one another. This process became evident in the pro-tactile workshops as a range of linguistic forms were deemed inappropriate and fell out of use. It is not trivial that the first forms to go were VASL pointing signs. From the perspective of the DeafBlind language-user, many forms that derive from VASL feel intuitive despite the fact that they do not describe or articulate to a perceptible world. This is because the habitus is not redundant with, or even consistent with, sensory capacity. For example, it may feel natural or intuitive for DeafBlind people to sit down in the same manner that sighted people do. However, among exclusively DeafBlind people, this practice leads to collisions. This highlights the fact that the habitus can be in direct conflict with the capacities of the body and its ways of interacting with the physical world.

The same disconnect affects deictic reference. At the beginning of the pro-tactile workshops, many participants referred to objects in the immediate environment as if their

<sup>11</sup>See Edwards (2012) for discussion.

interlocutors could see what they were pointing at. It took a person imbued with authority to change such practices by deeming them inappropriate and suggesting an alternative. From there, modes of access were brought into alignment and made reciprocal each time an object of reference was individuated by way of mutually accessible relations. This process, which involved the embedding of language in both the social and deictic fields, narrowed the range of potential linguistic resources to those that were "fieldable" (Bühler, 2001 [1934]) and it narrowed the range of retrievable contextual values to those that were mutually accessible (Hanks, 2005b). When a fieldable pointing sign is instantiated, the addressee is not abandoned in unstructured space with no clues for how to proceed; rather, they are the recipient of a signal, telling them to choose one path over another in a highly restricted field of possibilities (Bühler, 2001 [1934]).

## Appropriate Pointing in TASL: Embedding and Integration in the Social and Deictic Fields

Prior to the pro-tactile movement, pointing signs were produced for DeafBlind people by sighted interpreters, as would be expected in Visual American Sign Language (VASL). For example, in **Figure 1**, a sighted interpreter (right) is pointing to a referent in the environment by extending her pointing finger toward it, along a visually accessible trajectory. The DeafBlind person (left) receives the sign tactually.

In the pro-tactile workshops, this type of pointing was deemed inappropriate by the instructors, Adrijana and Lee, and pro-tactile philosophy became a way of legitimizing alternate practices. For example, in the following exchange Adrijana demonstrates to her student that he can't resolve reference using VASL pointing signs and she explains that this failure is predictable from the perspective of pro-tactile (or "PT") philosophy.

**Adrijana**: I'm going to explain PT philosophy to you. I'm not going to preach. It's going to be a discussion between the two of us. So let's say that I come up to you, and I start explaining: "There's a table over there [pointing], and there's a wall over there, and there's a door further over there." Do you understand me?

**DB Participant**: Yes.

**Adrijana**: No you don't. . .

**DB Participant**: You said that there is a wall over there [points] and a door over there [points] right?

**Adrijana**: No, the door is over there [points].

**DB Participant**: Well, whatever.

**Adrijana**: Yeah, but that's exactly it. It's important. When people point like that to direct you, and you're standing in the middle of the room, you're totally lost. Right? [DB participant nods]. You're sitting here, and it might seem clear for a minute, but when you stand up and try to find the things I just located for you, the directions won't seem to match the environment and you'll be confused. Deaf [sighted] people do that: they point to places, but that's not clear.

**DB Participant**: Well, yeah. That's visual information.

**Adrijana**: Right, but it has to be adapted to be pro-tactile. So instead of pointing, we have to teach them to do this (See **Figure 2**).

FIGURE 2 | TASL Pointing Sign.

To demonstrate the appropriate procedure for referring to the location of the door, Adrijana replaced VASL pointing signs like the one in **Figure 1** with TASL pointing signs like the one in **Figure 2**.

Notice that in the above exchange, Adrijana flatly contradicts her student's claim that he understands, and her student responds by adopting the practice she proposes. As discussed above, that move is successful because Adrijana is invested with the requisite authority. This is a social fact, which has a particular history (see Edwards, 2014b: pp. 65–113). This exchange is part of a larger discourse that grew during the pro-tactile workshops, aimed at associating specific tactile communication practices with "pro-tactile people" so that using particular forms is not only a means of accomplishing reference by linking people, language, and the physical environment, but also a means of taking up new and increasingly valued social positions (Edwards, 2014b). In interactions like these, novel linguistic forms are embedded in the social field: to be a pro-tactile person is to point in a particular way. However, the designation of the form as pro-tactile also derives from the fact that it is fieldable, and is therefore a feasible candidate for a process of deictic integration. Novel, pro-tactile pointing signs articulate to the deictic field of TASL, as opposed to the deictic field of VASL. Where embedding in the social field and deictic integration come together, novel linguistic forms tend to emerge.

The deictic field of TASL is organized around tactile modes of access to the immediate environment. For example, in **Figure 2**, Adrijana points to a location on the addressee's palm and associates it with where they are at the time. She then locates the wall and the door relative to that location, against the tactually accessible backdrop of the addressee's hand. Then she says, "That's more clear, right? Better than [VASL] pointing?" And the participant says, "Yes. It helps because it's kind of like drawing a map. Then you can really visualize where things are." Notice that the handshape in both the VASL and TASL pointing signs is roughly the same: one extended index finger directed toward the location of a referent. However, the trajectories launched by the handshape articulate to distinct pathways. Given the body-schema of a sighted person, the sightline in **Figure 3** will feel like a commonsensical trajectory with which a pointing sign can align.

Given the body-schema of a pro-tactile DeafBlind person, however, the sightline in **Figure 3** is likely to be inaccessible and/or inappropriate. Instead, some kind of tactually accessible pathway must be located, such as the one in **Figure 4**, which includes a straight orienting line that can be identified with a cane and tracked. Over time, patterns in how lines of travel intersect, where doors tend to be located, how materials are organized into common sequences, and so on, become intuitive as they are incorporated into the habitus, and an orienting grid becomes available. In order for reference to be reliably resolvable, participants must be able to act as if orienting grids are reciprocal across the group of language-users, and this as-if clause has some minimal threshold of actuality built in. If everyone acts as if they are sighted when they are actually DeafBlind, they will not be able to locate the door. Nevertheless, sensory capacities will not be consistent across the group: some will have more or less vision, better or worse vestibular function, and so on. Therefore, a reciprocal orienting grid need not be identical, just calibrated to a coordinate scheme that is good enough for all involved. In other words, the body-schema must be reciprocal, and it must be calibrated to the interactional and social fields inhabited by DeafBlind people.

Prior to the pro-tactile workshops, DeafBlind people oriented to the environment in many different ways, which were more or less commensurate with their sensory capacities. Those who relied heavily on sighted people as guides were less likely to develop navigational habits organized around tactile modes of access, while those who relied less on sighted people were more likely to do so. Therefore, body-schemas were not consistent across the group. This became apparent in many ways to participants of the pro-tactile workshops, and that recognition led to new practical routines. For example, before anyone started talking about or referring to an object, participants would often explore it tactually. In the following sequence, Adrijana leads a napkin-folding exercise, which involves learning how to do a "pocket fold." In **Figure 5**, she grips the top of Hank's hand, and guides it carefully along the top edge of the napkin. In **Figure 6**, she guides his hand along the parallel edge of the napkin. The two sides have different thicknesses because one side includes hemmed edges and the other one doesn't. She does the same thing with the remaining two sides of the napkin.

FIGURE 5 | Adrijana guides Hank's hand across top edge of napkin.

FIGURE 6 | Adrijana guides Hank's hand across bottom edge of napkin.

In **Figure 7A**, Adrijana signs FEEL, and in **Figure 7B**, she signs NONE. Then, in **Figure 7C**, she says, "RIGHT?" followed by a question marker (not pictured here), meaning, "You don't feel any [thickness] there, right?" Then she runs her fingers over the bottom edge and the left edge of the napkin, drawing attention to the fact that both of those sides are flat and smooth, unlike the hemmed edges. Hank acknowledges this by signing YES (not pictured). Then, Adrijana rotates the napkin so that the two flatter edges extend away from Hank, and the corner is pointed toward the edge of the table. Hank's hands remain on top of Adrijana's as she rotates the napkin and also remain in contact with the table under it, so he can feel the relative position of the napkin shift. In **Figure 8**, she uses a flat handshape to refer to the two flatter edges by moving the edge of her hand back and forth in line with them, meaning something like, "Here is one flat edge and here is another flat edge." In this sequence, Adrijana draws Hank's attention to a tactually perceptible difference in two aspects of the object: two of the napkin's edges are thicker because they are hemmed, and two are flatter because they are not. Adrijana then taps twice on the corner of the napkin where the two flat edges come together (**Figure 9**), which I have glossed as THIS.

In contrast to many other attempts to single out a bounded referent, this attempt worked, as evidenced by the fact that later in the interaction, Hank was able to perform the napkin fold successfully, and also by consistent signals of understanding throughout this stretch of the interaction. The reason for its success is that the field of potential referents was restricted significantly by interactional and social processes. When Adrijana signs THIS, she signals to Hank to choose one aspect of the object over another: this corner and not some other aspect of the object we have previously singled out. This restriction emerged over the course of several turns, prior to the moment in **Figure 9**. In addition, before this interaction, admissible dimensions of the object were restricted to those that could be accessed given a particular habitus, and the orienting scheme that DeafBlind participants were building over the course of the workshops. Pro-tactile people were beginning to narrow things down in ways that visual people wouldn't think to.

The deictic sign registers these restrictions in two senses: first, it is fieldable, i.e., it articulates to a field organized around tactile modes of access by being directed toward a location that both speaker and addressee can touch and distinguish from other aspects of the object. In contrast, a pointing sign that launches a trajectory into a visually organized space would not be fieldable. Second, the form of this deictic sign is perceptible and easily contrasted with other, perceptible forms. In this example, two taps on the referent function as a demonstrative: Adrijana is trying to single out this part of the object. While more data is needed, this appears to be an emerging pattern. In contrast, tracing movements on the body of the addressee are used to identify the location of one referent in relation to another (for example, the door, relative to "us" in **Figure 2**). This suggests that tapping vs. tracing may be taking on a contrastive relation in TASL, which corresponds to demonstrative vs. locative functions<sup>12</sup>.

<sup>12</sup>The pro-tactile workshops were naturalistic interactional contexts where there were many variables in play (as opposed to an experimental context, where variables are more tightly controlled). Therefore, I am hesitant to make a definitive claim here, and am currently conducting more controlled elicitations to follow up on these findings. However, this provisional claim is based on what appears to be a fairly stable pattern in certain portions of the data. I discovered this pattern first by developing an intuition as I participated in these practices with DeafBlind people. I followed up on that intuition by jumping to places in the video footage where I thought demonstratives and locatives might appear, including the beginning of each workshop, where the instructors would give directions for the day, and activities that included instructions on how to manipulate objects, such as a crocheting exercise, and a direction-giving exercise. I looked for questions like "which one?," "where?" or moments where it seemed that the signer was trying to single out one thing as opposed to something else. For example, if there were two chairs, one sitting next to the other one, and a signer tried to draw attention to the one they wanted their interlocutor to sit in, I recorded the form that was used to accomplish that task. In the first 6 classes, I noticed that there were a lot of avoidance behaviors, even when participants were asked specifically to provide locational information, or to single out a referent among others. There were also many cases where visual pointing signs were used, and these forms were usually followed by confusion, requests for clarification, re-statement, or the use of English calques, which do not require explicit locational information to be disclosed. In the first 6 classes, I recorded 21 occasions on which signs were used to single out a thing among others, or to provide information about its location relative to other things, and I took note of the form that was employed. The "TAP.TAP" form that later took on a stable demonstrative meaning was only produced by one signer, three times in that data set. Then, in the 7th class alone, I identified 42 tokens of TAP.TAP produced by 7 signers, all of which occurred in contexts that suggested a demonstrative meaning. I identified 36 tokens of tracing, like the kind described in the napkin example, produced by 4 signers, in contexts that suggested locative meanings. In addition, when these forms were used, the addressee often produced backchanneling signals used for agreement, understanding, and continued attentional focus in response. However, these interactional contexts are not comparable across the different classes in the pro-tactile workshops, and there are too many variables to be sure about exactly when the pattern emerged, and whether, in fact, these forms map consistently onto a distinction between demonstrative and locative meanings. Therefore, further evidence is currently being collected and analyzed.

FIGURE 9 | Adrijana refers to a corner of the napkin by tapping on it twice with a flat hand.

The ability of the addressee to attribute an intentional state to the signer is augmented by emergent distinctions like these in the language. It is also reinforced by an emergent, pro-tactile habitus and the fields with which it articulates. In order for Adrijana to be successful in teaching Hank to do a pocket fold, he must be able to grasp the directedness of her mental states to answer questions like: what is she focusing on and singling out for me? A perceptible contrast between demonstrative and locative clues is invaluable when faced with such tasks. In addition, Hank does not have to entertain the possibility that Adrijana might direct his attention to dimensions of the setting that she knows he can't perceive. This was not a safe assumption prior to the pro-tactile movement. These kinds of mutual alignments between the body, language, and the social world are helping participants rule out many logically, linguistically, and physically possible intentional states that could be attributed to their interlocutors.

## Conclusion

In examining social and deictic processes of embedding among DeafBlind people, I have shown how embodied forms of knowledge can simplify pragmatic acts of intention attribution, particularly with respect to deictic reference. I have argued that as social, interactional, and physical pressures are exerted on the language via the body, a process of integration is set in motion and the internal structure of the language is reconfigured. This suggests that language and context are not linked by way of external rules, maxims, or conditions. Rather, the linguistic system is continually adapted to, and shaped by, the historically specific fields of activity in which it is used. In other words, as contextual values are retrieved in interaction, patterns begin to sediment. From within those patterns, some values become more likely candidates for retrieval than others. In this sense, the language develops receptors, with particular sensitivities built in; a tactile language is not set to retrieve values from a field organized around visual modes of access.

In this article, I have argued that one of the key components of this process is deictic integration, or the coordination of linguistic and deictic elements into tighter and more restricted configurations over time. When an individual acquires a deictic system, they are acquiring a relational configuration of receptors, set to retrieve certain dimensions of context and not others. From this perspective, a range of pragmatic inferences will feel commonsensical, while others will feel like strange leaps that only philosophers would make. Following Bourdieu (1971, 1990 [1980]), Giddens (1979), and Hanks (1996, 2005a,b), I locate this commonsense, practical knowledge in the body, where it is registered neither as a representation, nor as a physical fact, but as a reflexive grasp that social actors have of being in a concrete world, which is often expressed as a dispositional tendency.

The cognitive tasks required for generating pragmatically situated meanings are attenuated when viewed from within the constraints of an individual's dispositional tendencies. This is particularly true if, as I have argued in this article, the social configuration of the body grounds relations between the language-user, the linguistic system, and the modes of access that are reciprocal across the group. Caught up in these complex relations, the body exerts an indirect but consequential effect on the contextual receptors that develop in any linguistic system; language anticipates context. I am not arguing, however, for an assumed or pre-determined fit between conceptual representations (linguistic or not) and the world<sup>13</sup>. Rather, the integration of language and context is the outcome of socio-historical and interactional processes, which from the perspective of the addressee, reduce the range of feasible, intentional objects (i.e., objects to which mental states are co-directed).

The approach sketched out in this article can also be distinguished from traditional approaches to speech acts. Searle's language-user, for example, would never come out of an interaction concluding that the reason their assertion or command was unsuccessful was that the linguistic system itself was inadequate to the task. Likewise, they would not presume that a description was unsuccessful because the world was not accessible in reciprocal ways. However, these are precisely the assumptions DeafBlind leaders acted on. The practices that were subsequently established linked language to context in new ways, and in the process, a range of potential interpretations and attributions was ruled out, not by a static and detachable set of conditions, rules, or maxims, but by the reconfiguration of the language as it was embedded in, and integrated with, new social and interactional fields.

## Acknowledgments

Thank you to the Wenner-Gren Foundation for Anthropological Research (Grant # 8110) and the Diebold Foundation for Linguistic Anthropological Research for funding this research. Thank you also to the College of Arts and Sciences at Gallaudet University for supporting the publication of this work. Many thanks to E. Mara Green for extensive comments, Mark A. Sicoli for helpful suggestions early on, and two anonymous reviewers. Finally, thank you to the DeafBlind people who participated in, and contributed to, this project, especially Jelica Nuccio and aj granda.

## References

DuBois, J. W. (1987). Meaning without intention: lessons from divination. IPrA Pap. Pragmat. 1, 80–122. doi: 10.1075/iprapip.1.2.04boi


<sup>14</sup>As, for example, in Liddell's "real space" (bib31: p. 84).


Mathur, G. (2002). Verb Agreement in Signed Languages. Ph.D. dissertation, MIT.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Edwards. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.