# MENTAL STATE UNDERSTANDING: INDIVIDUAL DIFFERENCES IN TYPICAL AND ATYPICAL DEVELOPMENT

EDITED BY: Daniela Bulgarelli, Anne Henning and Paola Molina PUBLISHED IN: Frontiers in Psychology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-268-2 DOI 10.3389/978-2-88945-268-2

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

# What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# **MENTAL STATE UNDERSTANDING: INDIVIDUAL DIFFERENCES IN TYPICAL AND ATYPICAL DEVELOPMENT**

Topic Editors:

**Daniela Bulgarelli**, Università degli Studi di Torino, Italy

**Anne Henning**, SRH Hochschule für Gesundheit Gera, Germany

**Paola Molina**, Università degli Studi di Torino, Italy

The current book addresses the development of mental state understanding in typically and atypically developing children, and reports new suggestions about how to evaluate it and how to support it through training. The framework presented is multifaceted. With respect to typical populations, the role of maternal reflective functioning, language, communication, and educational contexts is examined in depth, and the association with internalizing/externalizing behaviors, performance in spatial tasks, and pragmatics is addressed as well. As to atypical populations, deficits in mental state understanding are reported for children with different developmental disorders or impairments, such as agenesis of the corpus callosum, Down Syndrome, preterm birth, Autism Spectrum Disorder, hearing impairment, and personality difficulties such as anxiety. Overall, the papers collected in this book allow a better understanding of the mechanisms influencing mental state understanding and of the effects of mental state comprehension on development.

**Citation:** Bulgarelli, D., Henning, A., Molina, P., eds. (2017). Mental State Understanding: Individual Differences in Typical and Atypical Development. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-268-2

# Table of Contents



Francesca M. Bosco and Ilaria Gabbatore

# **Section 2: Atypical development**

## **Section 2.1: Mental State Understanding in children with Hearing Impairment or Speech and Communication disorders**

*Facial Expression Recognition in Children with Cochlear Implants and Hearing Aids*

Yifang Wang, Yanjie Su and Song Yan

*Theory of Mind and Reading Comprehension in Deaf and Hard-of-Hearing Signing Children*

Emil Holmer, Mikael Heimann and Mary Rudner

*Theory of Mind Deficits and Social Emotional Functioning in Preschoolers with Specific Language Impairment*

Constance Vissers and Sophieke Koolen

## **Section 2.2: Mental State Understanding and personality**

*Putting Ostracism into Perspective: Young Children Tell More Mentalistic Stories after Exclusion, But Not When Anxious*

Lars O. White, Annette M. Klein, Kai von Klitzing, Alice Graneist, Yvonne Otto, Jonathan Hill, Harriet Over, Peter Fonagy and Michael J. Crowley

## **Section 2.3: Mental State Understanding in congenital and perinatal originated disabilities**

*Social Cognition in Children Born Preterm: A Perspective on Future Research Directions*

Norbert Zmyj, Sarah Witt, Almut Weitkämper, Helmut Neumann and Thomas Lücke

*The Role of Executive Functions in Social Cognition among Children with Down Syndrome: Relationship Patterns*

Anna Amadó, Elisabet Serrat and Eduard Vallès-Majoral

*Mental State Understanding in Children with Agenesis of the Corpus Callosum*

Beatrix Lábadi and Anna M. Beke

## **Section 2.4: Mental State Understanding in Autism Spectrum Disorder**

*Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder*

Francesco Margoni and Luca Surian

*Corrigendum: Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder*

Francesco Margoni and Luca Surian

*Implicit Mentalizing Persists beyond Early Childhood and Is Profoundly Impaired in Children with Autism Spectrum Condition*

Tobias Schuwerk, Irina Jarvers, Maria Vuori and Beate Sodian

# **Section 3: Perspectives on intervention**

*Promoting Mentalizing in Pupils by Acting on Teachers: Preliminary Italian Evidence of the "Thought in Mind" Project*

Annalisa Valle, Davide Massaro, Ilaria Castelli, Francesca Sangiuliano Intra, Elisabetta Lombardi, Edoardo Bracaglia and Antonella Marchetti

*The ToMenovela – A Photograph-Based Stimulus Set for the Study of Social Cognition with High Ecological Validity*

Maike C. Herbort, Jenny Iseev, Christopher Stolz, Benedict Roeser, Nora Großkopf, Torsten Wüstenberg, Rainer Hellweg, Henrik Walter, Isabel Dziobek and Björn H. Schott

# Editorial: Mental State Understanding: Individual Differences in Typical and Atypical Development

#### Daniela Bulgarelli<sup>1</sup>\*, Anne Henning<sup>2</sup> and Paola Molina<sup>1</sup>

<sup>1</sup> Dipartimento di Psicologia, Università degli Studi di Torino, Turin, Italy, <sup>2</sup> SRH Hochschule für Gesundheit Gera, Gera, Germany

Keywords: theory of mind, emotion understanding, social cognition, mentalization, typical development, atypical development

#### **Editorial on the Research Topic**

#### **Mental State Understanding: Individual Differences in Typical and Atypical Development**

We often refer to mental states such as intentions, desires, and beliefs to explain and predict our own behavior and that of others. Mental state understanding develops from infancy through adolescence and adulthood. A deeper understanding of the factors that influence this development may be obtained by studying individual differences in typical and atypical populations.

The current Research Topic addresses several topics about mental state understanding and development in childhood. It is organized into three sections, comprising 18 papers in total.

The first section addresses the development of social cognition in typical populations through seven papers. Unlike most research on Theory of Mind (ToM), which commonly focuses on age-related changes, Blijd-Hoogewys and van Geert investigated whether non-linearities occur during ToM development in childhood. Within an overall developmental trend that leveled off toward the age of 10 years, results showed two non-linearities suggesting a developmental shift in ToM understanding: a stagnation at around 4 years and 8 months of age and a dip between 6 and 6.5 years of age.

Four papers concern social factors that influence children's ToM. Rosso and Airaldi showed that maternal reflective functioning (but not maternal attachment security) predicted their preadolescent child's reflective functioning, and that maternal ability to mentalize mixed-ambivalent mental states predicted the corresponding ability in their child. While maternal education and linguistic competence are well-researched influencing factors (e.g., NICHD HLB, 1998; Pons et al., 2003; Sammons et al., 2004), Bulgarelli and Molina showed that preschoolers' linguistic competence mediated the effect of maternal education. Moreover, center-based care in the first 3 years of life eliminated the effect of maternal education, suggesting a protective role of center-based care for children with less educated mothers. Göbel et al. assessed the relation between emotion understanding and internalizing and externalizing behavior in 7- to 10-year-old children in a non-clinical community sample. In contrast to prior research, the overall level of emotion understanding, comprising nine components, was not related to externalizing symptoms, but correlated positively with elevated levels of somatic complaints and anxious/depressed symptoms. More specifically, higher levels of social withdrawal were associated with worse performance in understanding emotions elicited by reminders. Pinto et al. showed that joint narratives improved 6- to 10-year-old children's mental state talk only when children were at the point of initial elaboration or emergence of mental state talk, and when intersubjectivity levels were high, that is, when children produced more utterances to orchestrate and regulate the dialog.

Edited and reviewed by: Jessica S. Horst, University of Sussex, United Kingdom

> \*Correspondence: Daniela Bulgarelli daniela.bulgarelli@unito.it

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 07 June 2017 Accepted: 28 June 2017 Published: 13 July 2017

#### Citation:

Bulgarelli D, Henning A and Molina P (2017) Editorial: Mental State Understanding: Individual Differences in Typical and Atypical Development. Front. Psychol. 8:1183. doi: 10.3389/fpsyg.2017.01183

Two other papers concern possible implications of ToM development for social interaction. Bosco and Gabbatore suggested that first-order ToM may play a causal role in explaining 3- to 8-year-old children's performance in handling pragmatic phenomena, namely sincere and deceitful speech acts. As to children's cognitive performance in social interaction involving spatial tasks, Viana et al. showed that 5- to 9-year-old children's ToM was a better predictor of their spatial performance in a dyadic condition than their age, gender, and spatial performance in an individual setting.

Overall, the papers regarding typically developing children present some interesting ideas about the development of mental state understanding. This competence proves to be linked with different aspects of development at social, cognitive, and relational levels: Rosso and Airaldi showed that only maternal reflective functioning, and not maternal attachment security, predicted children's mental state understanding; in turn, the paper by Bulgarelli and Molina confirmed the role of language, and that by Pinto et al. the role of communicative context, in children's ToM. Understanding mental states proves to be a complex ability that involves different functions and affects different aspects of development. Finally, the contribution of Blijd-Hoogewys and van Geert presented an interesting new approach to the study of ToM development that has implications for the debate on whether this development is stage-like or continuous.

The second section of this Research Topic encompasses nine papers that address the development of mental state understanding and its correlates in atypical populations. The possibility of comparing results from studies carried out with typical and atypical populations is of key importance. In fact, similarities and differences between typical and atypical development can shed light on the processes underlying the ability to understand, attribute, and interpret mental states.

Lábadi and Beke's study concerned the role of structural connectivity across the hemispheres in neurodevelopmental disorders. They showed that 6- to 8-year-old children with agenesis of the corpus callosum exhibited mild impairments in recognizing emotions and in understanding theory of mind, and also showed more behavioral problems than control children matched by IQ and sociodemographic variables.

White et al. showed differential effects of social exclusion on children's use of their capacity to understand mental states in relation to anxiety. After children were non-accidentally excluded in a virtual game, typically developing 5-year-olds' (Study 1) completions of peer-scenario stories portrayed story characters more strongly as intentional agents, used more mental state language, and showed more between-character affiliation. In contrast, 4- to 8-year-old children with anxiety disorder (Study 2) told stories in which story characters exhibited less intentionality and less use of mental state language. Thus, while exclusion may induce young children to mentalize, and thus to reconnect with others more effectively, excessive anxiety may impair this use of controlled mentalizing.

The study by Amadó et al. investigated the relation between social cognition and executive functioning in children with Down Syndrome (DS). Children with DS were delayed in social cognition and in executive functioning, with unequal impairment of different functions. Moreover, working memory explained more of the variability in social cognition performance in children with DS than in typically developing children matched by age.

Implicit mentalizing consists of a spontaneous anticipation of an agent's false belief-based action that can be observed through anticipatory looking biases in tasks where eye movements are assessed. Using eye-tracking devices, Schuwerk et al. showed that implicit mentalizing persists from infancy into childhood in the typical population; in contrast, children with Autistic Spectrum Disorder (ASD) appeared to be impaired in this skill, even when their performance in the explicit tasks was similar to that of the matched control group. The results of this study (intact explicit mentalizing, impaired implicit mentalizing, and no relation between that and executive function in children with ASD) support theories that propose two dissociable mentalizing systems.

The review by Margoni and Surian and its corrigendum discussed the idea that impairment in mental state understanding is the main factor explaining why children with ASD face difficulties in moral judgements: in fact, these children mainly rely on the consequences of actions and other external factors rather than on the agents' mental states when solving moral reasoning tasks.

The literature reports that deaf and hard-of-hearing signing children can display delays in the development of mental state understanding, due to restricted discussion of abstract concepts and to a possible mismatch between the language capabilities of children and their parents (Peterson, 2009). Wang et al. compared children with a cochlear implant or a hearing aid with normally hearing participants matched by age and gender, and showed that children with cochlear implants and hearing aids were developmentally delayed not only in verbally labeling the facial expressions of happiness, sadness, anger, and fear, but also in a nonverbal emotion-matching task. Holmer et al. showed that deaf and hard-of-hearing signing children were delayed in ToM task performance; only three of them had been exposed to sign language since birth. ToM was associated with reading comprehension and working memory, but not with sign language comprehension.

The inter-relation between language and ToM has been clarified in a meta-analysis by Milligan et al. (2007). Examining this relation more closely in children with Specific Language Impairment (SLI) is interesting, because some studies found delays in this population while others did not (Perner et al., 1989; Shields et al., 1996; Bulgarelli and Molina, 2013). In the review by Vissers and Koolen, preschoolers with SLI appeared to be impaired both in cognitive ToM (imitation, joint attention, false belief understanding) and in affective ToM (recognizing and understanding emotions).

The review by Zmyj et al. addressed the role of joint attention as a precursor of social cognition, focusing on preterm-born children: they were less likely to initiate joint attention with others and to respond to others' attempts at engagement. The authors suggest that these deficits in joint attention might lead to impairments in social cognition and in social interaction skills.

Deficits in mental state understanding are reported for children with different developmental disorders or impairments, including neurological conditions (agenesis of the corpus callosum), prematurity, ASD, and personality difficulties such as anxiety. The paper by Schuwerk et al. suggested an interesting topic for future research: the possibility of differentiating implicit from explicit ToM based on different results in typical and ASD populations. Moreover, as in typical development, the role of language is also supported by the present studies on children with SLI (Vissers and Koolen) and hearing impairment (Holmer et al.; Wang et al.).

A third and final section in this Research Topic is composed of two papers regarding evaluation and training tools. Valle et al. presented the "Thought in Mind (TiM) Project", which aims at training mentalizing skills in adults (e.g., teachers and parents) to positively affect children's mentalization. They reported first evidence of the efficacy of the training when done with teachers: only the TiM Project training group significantly improved in third-order false belief understanding and in two of the three components of a Mentalizing Task. Herbort et al. presented a new tool to assess ToM, the ToMenovela, which consists of 190 scenes depicting daily-life situations, addressing cognitive and affective ToM, emotional reactivity, and complex emotion judgment with respect to Ekman's basic emotions. First results on the use of the test with neurologically and psychiatrically healthy adults were reported. The tool proposed by Herbort et al. is very interesting because tools assessing adults' ToM are very scarce. Valle et al. proposed a teacher training that was effective in improving children's abilities, which is a relevant aspect of research on the effectiveness of interventions targeting mental state understanding.

The current Research Topic addressed the development of mental state understanding in typically and atypically developing children, and reported new suggestions about how to evaluate it and how to support it through training. The framework presented was multifaceted. With respect to typical populations, the role of maternal reflective functioning, language, communication, and educational contexts was examined in depth, and the association with internalizing/externalizing behaviors, performance in spatial tasks, and pragmatics was addressed as well. As to atypical populations, deficits in mental state understanding were reported for children with different developmental disorders or impairments, such as agenesis of the corpus callosum, Down Syndrome, prematurity, ASD, hearing impairment, and personality difficulties such as anxiety.

### AUTHOR CONTRIBUTIONS

DB, AH, and PM contributed equally to the writing of the Editorial.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Bulgarelli, Henning and Molina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Non-linearities in Theory-of-Mind Development

#### Els M. A. Blijd-Hoogewys<sup>1,2</sup>\* and Paul L. C. van Geert<sup>2</sup>

<sup>1</sup> INTER-PSY, Groningen, Netherlands, <sup>2</sup> Department of Clinical and Developmental Psychology, Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, Netherlands

Research on Theory-of-Mind (ToM) has mainly focused on ages of core ToM development. This article follows a quantitative approach focusing on the level of ToM understanding on a measurement scale, the ToM Storybooks, in 324 typically developing children between 3 and 11 years of age. It deals with the eventual occurrence of developmental non-linearities in ToM functioning, using smoothing techniques, dynamic growth model building and additional indicators, namely moving skewness, moving growth rate changes and moving variability. The ToM sum-scores showed an overall developmental trend that leveled off toward the age of 10 years. Within this overall trend two non-linearities in the group-based change pattern were found: a plateau at the age of around 56 months and a dip at the age of 72–78 months. These temporary regressions in ToM sum-score were accompanied by a decrease in growth rate and variability, and a change in skewness of the ToM data, all suggesting a developmental shift in ToM understanding. The temporary decreases also occurred in the different ToM sub-scores and most clearly so in the core ToM component of beliefs. It was also found that girls had an earlier growth spurt than boys and that the underlying developmental path was more salient in girls than in boys. The consequences of these findings are discussed from various theoretical points of view, with an emphasis on a dynamic systems interpretation of the underlying developmental paths.

#### Edited by:

Daniela Bulgarelli, Aosta Valley University, Italy

#### Reviewed by:

Veronica Ornaghi, University of Milano-Bicocca, Italy Davide Massaro, Università Cattolica del Sacro Cuore, Italy

#### \*Correspondence:

Els M. A. Blijd-Hoogewys e.blijd-hoogewys@inter-psy.nl

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 19 September 2016 Accepted: 05 December 2016 Published: 04 January 2017

#### Citation:

Blijd-Hoogewys EMA and van Geert PLC (2017) Non-linearities in Theory-of-Mind Development. Front. Psychol. 7:1970. doi: 10.3389/fpsyg.2016.01970

Keywords: Theory-of-Mind, ToM Storybooks, development, dynamic systems theory, non-linearities, anomaly

# INTRODUCTION

# Theory-of-Mind

The child's Theory-of-Mind (ToM) is an important condition for showing socially adequate behavior (Astington and Jenkins, 1995; Imuta et al., 2016). ToM refers to the ability to attribute mental states – such as beliefs, desires, intentions, emotions, and perceptions – to oneself and others and to use these mental states in understanding, predicting, and explaining the behavior of oneself and others (Premack and Woodruff, 1978; Mitchell, 1997).

For instance, a child comprehends that if Sam thinks his soccer ball is in the garage (a belief), he will look in the garage for this soccer ball (the resulting action), even though the soccer ball may in reality be in the garden. A typical five-year-old who is questioned about the actions of Sam and who also knows the true location of the soccer ball will be able to predict the action of Sam correctly. A typical three-year-old, however, will not be able to do so: he will most likely say that Sam will look in the garden. The three-year-old cannot distance himself from the knowledge of the true location, and he does not comprehend that others can hold beliefs that do not match reality as he sees it. He does not grasp false beliefs yet.

Numerous studies have suggested that a distinct change occurs in understanding these false beliefs between the ages of 3 and 5 years (for meta-analyses on false beliefs see Wellman et al., 2001; Liu et al., 2008). Although false belief understanding is considered a cornerstone ability, ToM is far more than false belief understanding alone and also extends beyond the 3–5 age period. Infants already possess 'implicit' mindreading capacities (Slaughter, 2015), treating themselves and others as intentional agents and experiencers, while older children understand lies and deception (Peterson and Siegal, 2002). Research shows that ToM development even continues into late adolescence (Dumontheil et al., 2010; Vetter et al., 2013; Valle et al., 2015).

# Developmental Sequences in Theory-of-Mind

Research has shown that ToM develops in normally developing children according to a particular, age-related sequence. It evolves from a simple desire theory to a complete belief-desire theory, from true beliefs to false beliefs, and from the understanding of first-order beliefs to second-order beliefs (Wellman, 1990). Deviations from this normal developmental path have been used in describing ToM difficulties of, for instance, children with autism (Baron-Cohen, 2000; Peterson et al., 2012). Relatively little research has been done on the effect of gender on ToM development, but some studies have found an advantage for girls (Charman et al., 2002; Calero et al., 2013), including the finding that the association between ToM and prosocial behavior is stronger in girls than in boys (Imuta et al., 2016).

Wellman and Liu (2004) looked into the conceptual changes of different ToM aspects, using the ToM Scale. They found a consistent progression of conceptual achievements that pace ToM understanding in normally developing children: diverse desires > diverse beliefs > knowledge access > false belief > hidden emotion (Wellman, 2012, 2014). Wellman and Liu (2004, p. 536) argue that the ToM developmental order is not one of addition or substitution, but one of modification or mediation. Initial insights broaden or generalize into later insights, following orderly conceptual progressions. Such a conceptual progression has recently also been demonstrated for more advanced ToM tasks (Osterhaus et al., 2016).
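The idea of an ordered progression can be made concrete with a small sketch. The Python snippet below checks whether one child's set of passed tasks is consistent with the order listed above, that is, whether every passed task is preceded only by passed tasks. The function names, the pass/fail representation, and the two example response patterns are hypothetical illustrations, not part of the ToM Scale or its scoring procedure.

```python
# Order of the five steps as listed in the text (earliest achievement first).
SCALE_ORDER = [
    "diverse desires",
    "diverse beliefs",
    "knowledge access",
    "false belief",
    "hidden emotion",
]


def follows_scale_order(passed):
    """Return True if the passed tasks form an unbroken run from the first step.

    A conforming pattern is a run of passes followed only by failures,
    so no pass may appear after the first failure.
    """
    seen_failure = False
    for task in SCALE_ORDER:
        if task not in passed:
            seen_failure = True
        elif seen_failure:
            return False
    return True


def scale_score(passed):
    """Hypothetical summary score: number of scale tasks passed (0 to 5)."""
    return sum(task in passed for task in SCALE_ORDER)


# Invented response patterns, for illustration only.
conforming = {"diverse desires", "diverse beliefs", "knowledge access"}
nonconforming = {"diverse desires", "false belief"}

print(follows_scale_order(conforming), scale_score(conforming))
print(follows_scale_order(nonconforming), scale_score(nonconforming))
```

In this sketch a child who passes a later step while failing an earlier one is flagged as deviating from the sequence, which is the kind of pattern the ordered progression would treat as unexpected.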

# Temporary Regressions in Development

One can question how these ToM generalizations come about. Is there a gradual development or are there temporary accelerations, delays or even regressions observable during ToM development? Temporary regressions imply that children can have a temporary relapse before a newly acquired ability consolidates. This phenomenon is often referred to as U-shaped or N-shaped development (Siegler, 2004; Zelazo, 2004).

Temporary regressions have been found in a variety of domains, including motor and verbal development (Gershkoff-Stowe and Thelen, 2004; Swingley, 2009), non-verbal symbol learning (Namy et al., 2004), face perception (Cashon and Cohen, 2004), false belief understanding (Bernstein et al., 2011), intent-based moral judgments (Margoni and Surian, 2016), creativity, reasoning, and auditory localization (for an early collection of studies, see Strauss and Stavy, 1982; for modeling this U-shaped development, see Morse et al., 2011; for a recent overview, see Pauls et al., 2013).

In addition to temporary regressions, developmental curves may also show accelerations, which are often the hallmark of rapid changes that mark developmental transitions (see for instance Fischer and Bidell, 2006). Such developmental transitions are likely to be preceded by temporary regressions (Van Geert, 1991; Fischer and Bidell, 2006).

Temporary regressions, accelerations, and temporary plateaus are examples of non-linear forms of developmental change. One can question to what extent such non-linearities also apply to the development of ToM.
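As a rough illustration of what these terms mean for a group-based curve, the following Python sketch labels each step between consecutive age-bin means as a rise, a plateau, or a dip. It is a minimal sketch under stated assumptions: the function name, the tolerance threshold, and all numbers are invented for illustration and are neither the study's data nor its analysis method.

```python
def classify_changes(means, tolerance=0.5):
    """Label each step between consecutive age-bin means.

    'rise' if the mean increases by more than `tolerance`,
    'dip' if it decreases by more than `tolerance`,
    'plateau' otherwise. The threshold is an arbitrary assumption.
    """
    labels = []
    for previous, current in zip(means, means[1:]):
        delta = current - previous
        if delta > tolerance:
            labels.append("rise")
        elif delta < -tolerance:
            labels.append("dip")
        else:
            labels.append("plateau")
    return labels


# Invented toy values: mean score per age bin (age in months).
ages_months = [42, 48, 54, 60, 66, 72, 78, 84]
mean_scores = [20.0, 26.0, 31.0, 31.2, 36.0, 40.0, 38.0, 43.0]

labels = classify_changes(mean_scores)
for age, label in zip(ages_months[1:], labels):
    print(age, label)
```

A linear trend would produce only "rise" labels of similar size; a plateau or dip inside an otherwise rising sequence is what the text calls a non-linearity.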

# Measuring Potential Non-linearities in Theory-of-Mind Development

In order to be able to observe potential regressions and accelerations in ToM development, one should take two issues into account. First, since ToM development does not solely depend on the development of false belief understanding, the research instrument used to measure ToM development should involve a variety of ToM components, like emotion understanding, belief understanding linked to actions (such as false beliefs) and emotions, desire understanding linked to actions and emotions, and relevant ToM precursors and associated abilities, like the understanding of the difference between mental and physical entities (Wellman, 1990). For that purpose, we developed the ToM Storybooks (Blijd-Hoogewys et al., 2008; see also "Materials and Methods"). Second, as there is no convincing evidence that by the age of six ToM is fully acquired (e.g., Hala and Carpendale, 1997; O'Hare et al., 2009) and stable, research should aim at a considerably broader age range, for instance up to 12 years old and even older (until adulthood).

At first glance, a time-serial design would be superior in order to follow the changing level of ToM over the course of developmental time. This is a design with as many measurements as are needed to capture the temporary and often non-linear forms of change characteristic of a particular developmental phenomenon in individual children (Steenbeek and van Geert, 2002; van Geert and Steenbeek, 2005). However, such a method also brings along considerable logistic problems. Children need to be tested repeatedly over an extended period. Also, since so little research has focused on the dynamics of ToM development, it is hard to predict at what time intervals children should be tested in order to find evidence of developmental phenomena such as accelerations and decelerations, transitions, and temporary regressions.

Meanwhile, a cross-sectional design might provide a preliminary answer to the question of age-related changes and potential critical points in ToM in the population, and is a first step toward future time-serial research of developmental paths, as they occur in individual children. However, it is becoming a well-established fact that a developmental curve based on cross-sectional data should never be automatically identified as a representation of individual developmental curves (e.g., as the curve that applies to the 'average' child or the majority of children), until it has been empirically demonstrated, with the aid of a sufficient number of individual developmental curves, that individual-based curves are statistically and structurally similar to the developmental curve based on group data. This latter condition is known as the homology or ergodicity condition, and is rather unlikely to occur in the case of developmental processes (Molenaar and Campbell, 2009).

# Using Cross-Sectional Data to Tap Potential Non-linearities in Theory-of-Mind Development

Cross-sectional growth curves may serve yet another purpose than serving as first approximations of phenomena that require further scrutiny by means of individual time-serial designs. In this article, we propose an alternative perspective on the interpretation of cross-sectional data, which is based on the obvious fact that taking a test amounts to the performance of a particular task, in which the child is asked to solve a particular series of problems, framed in a particular format.

In general, there exist various ways in which task performance can be used to obtain information about children's development and about what they have learned from experiences. One way is to ask children to perform a familiarized and trained type of task independently and without help, such as solving math problems that are framed in a familiar and trained format, to see how much they have learned from their math lessons. Another way is to ask children to perform a particular task that lies beyond the child's capability to solve this particular task independently, and to provide the child with help for doing so. This is the approach taken in dynamic testing (Grigorenko and Sternberg, 1998). A third possibility is to confront the child with a novel task, and to observe how far the child can get if it has to rely entirely on its own capabilities, eventually pushing the child to its limits by giving counter-suggestions or by repeatedly asking the same sort of question. The novel task can be novel in terms of content, and/or in terms of the problem format. In this case, the level of capability, development or learning is defined as the ability to transfer knowledge or skills from one context (e.g., the context of spontaneous daily experience and actual behavior) to another context, which can be of various kinds.

We contend that the administration of a ToM test that the child is unfamiliar with amounts to observing a child's developmental capabilities by providing it with a novel task content and format. A cross-sectional administration of the test that is likely to be a novel context for virtually every tested child can thus be seen as a way of mapping individual and age-related variability in the way children process this novel task, by means of a highly simplified measure, which is the set of sub-scores and the total test score of each individual. In this way, a cross-sectional procedure provides yet another perspective on a complex phenomenon, namely children's development of ToM, which can only be fully understood if it is viewed from a wide variety of perspectives. Hence, a cross-sectional design provides an answer to the question of age-related changes in how children transfer their knowledge about ToM that functions in daily contexts of spontaneous activity to a new context, namely that of explicit verbal questions and pictorial representations. It should be noted that a repeated administration of the same test may provide yet another kind of information, namely differences between children in their ability to spontaneously learn from repeatedly performing the same task (without feedback; Blijd-Hoogewys et al., 2010).

# Statistical Indicators of Non-linear Developmental Phenomena

In order to describe changes in development, different fitting models can be used to represent the general underlying trend. In research, linear or quadratic models are often used. Unfortunately, such models do not sufficiently take local deviations of the distribution of data into account. This may lead to over- and underestimations of the expected average scores in certain age periods.

In contrast, non-parametric models, such as the Loess (or Lowess) smoothing procedure, follow local distributions of data as reliably as possible. They apply a locally weighted least squares estimate, and are commonly used as smoothing techniques (see for instance Simonoff, 1996). Such non-linear techniques can be of substantial value for testing non-linear changes even when applied to cross-sectional data. Examples of such non-linear changes are accelerations, decelerations, and temporary regressions. Additional indicators of developmental transition are changes in the skewness of the distribution, temporary changes in growth rate and changes in variability (van Geert and van Dijk, 2002; Bassano and van Geert, 2007; Van Dijk and van Geert, 2007).
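To make the idea of a locally weighted least squares estimate concrete, the following is a minimal sketch of a Loess-type smoother. It is an illustration only, not the implementation used in this study: the tricube kernel, the local linear fit, the function names, and the default `span` are all assumptions chosen for the example.

```python
def tricube(u):
    """Tricube kernel weight for a scaled distance u (zero outside |u| < 1)."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_point(xs, ys, x0, span=0.3):
    """Locally weighted linear fit at x0, using roughly span * n nearest points.

    The span plays the role of the 'window size' (assumed default: 30%).
    """
    n = len(xs)
    k = max(2, min(n, round(span * n)))
    # Bandwidth: distance to the k-th nearest point, slightly inflated so that
    # this point still receives a small positive weight.
    h = sorted(abs(x - x0) for x in xs)[k - 1] * 1.0001 + 1e-12
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    if sxx == 0.0:
        # All weighted points share one x value: fall back to the local mean.
        return my
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def loess_curve(xs, ys, span=0.3):
    """Smoothed value at every observed x."""
    return [loess_point(xs, ys, x, span) for x in xs]
```

Because each fitted value depends only on nearby observations, a local dip in the data is preserved rather than averaged away, which a single global linear or quadratic fit would tend to do.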

Changes in the skewness of the distribution over time may provide information about alternations between periods of relative stability (zero skewness) and periods of rapid change beginning with a minority of rapid developers (positive skewness) heading toward a new period of relative stability with a minority of children lagging behind (negative skewness).
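The three skewness patterns described above can be illustrated with a small sketch. The moment-based skewness formula is standard; the toy score lists are invented for illustration and are not taken from the ToM data.

```python
def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0.0:
        return 0.0  # no spread, so no asymmetry
    return m3 / m2 ** 1.5


# Invented toy scores for three hypothetical age groups.
stable = [48, 49, 50, 51, 52]          # symmetric: relative stability
early_movers = [50, 50, 51, 51, 70]    # a few rapid developers: positive skew
late_stragglers = [50, 69, 69, 70, 70] # a few children lagging: negative skew
```

In this toy example the first group has zero skewness, the second a positive value, and the third a negative value, mirroring the sequence of stability, onset of change, and consolidation described in the text.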

A temporary change in growth rate can be demonstrated in the form of marked oscillations in the first derivative of the developmental curve, which represents the rate of growth at that point. A particularly strong instance of change in the growth rate occurs in the form of a temporary regression (a local dip), where the growth rate temporarily drops down to negative values. Since the changes in skewness over time are related to accelerations in the growth of the developmental phenomenon at issue, we expect to find a certain level of coherence between the first derivative of the non-linear ToM growth curve and the change of skewness over time.
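As a rough illustration of how a temporary regression shows up in the first derivative, the sketch below approximates the growth rate by finite differences between successive points of a smoothed curve and reports the intervals in which it is negative. The ages and curve values are invented toy numbers, and the helper names are hypothetical.

```python
def growth_rates(ages, curve):
    """Finite-difference growth rate for each interval between successive ages."""
    return [
        (a0, a1, (y1 - y0) / (a1 - a0))
        for a0, a1, y0, y1 in zip(ages, ages[1:], curve, curve[1:])
    ]


def regression_intervals(ages, curve):
    """Intervals (start_age, end_age) in which the growth rate is negative."""
    return [(a0, a1) for a0, a1, rate in growth_rates(ages, curve) if rate < 0]


# Invented toy curve (age in months, smoothed score) with one local dip.
toy_ages = [60, 66, 72, 78, 84, 90]
toy_curve = [50, 58, 64, 61, 66, 72]
```

For these toy values, `regression_intervals(toy_ages, toy_curve)` returns a single interval, from 72 to 78 months, where the rate drops below zero before growth resumes.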

Change in variability, the third indicator of developmental transition discussed in this article, can be observed as intra- and inter-individual variability. A temporary increase in the intra-individual variability is considered a strong indicator of a developmental transition (van Geert and van Dijk, 2002). However, such an indicator can only be used in repeated measures designs. Inter-individual variability, which is applicable to cross-sectional data and which is expressed in terms of standard deviation over a certain period of time, might also temporarily increase during a transition.
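Inter-individual variability in cross-sectional data can be sketched as the standard deviation of scores within successive age bins. The fixed bin width and the function name below are assumptions made for illustration; the analyses reported later use smoothed curves rather than fixed bins.

```python
from statistics import stdev


def sd_by_age_bin(ages, scores, bin_width=12):
    """Sample standard deviation of scores per age bin (bins need at least 2 children)."""
    bins = {}
    for age, score in zip(ages, scores):
        start = (age // bin_width) * bin_width  # lower edge of the bin, in months
        bins.setdefault(start, []).append(score)
    return {start: stdev(vals) for start, vals in sorted(bins.items()) if len(vals) >= 2}
```

A temporary rise in these per-bin values, followed by a return to the earlier level, would be the cross-sectional counterpart of the variability peak expected around a transition.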

Although these three indicators are likely to be correlated, if they are indeed indicative of an underlying developmental transition, they are, in principle, independent of one another. For instance, an acceleration in the group-based growth curve might occur without any change in intra-individual variability, or without any change in skewness.

# Aims, Hypotheses, and Research Questions

In the sections above, we have provided evidence for the occurrence of developmental regressions and various other non-linearities in the development of a wide variety of skills and forms of knowledge. So far, no studies have explicitly looked at such possible non-linearities in ToM development. The objective of this study is to investigate whether developmental regressions and other non-linearities occur during ToM development in childhood.

In addition, we have seen that ToM is not a monolithic ability. It consists of various sub-abilities, each with their characteristic developmental timing. Hence, if ToM development is characterized by non-linearity, it is likely that the forms of these eventual non-linear properties will differ between various aspects of ToM.

Finally, gender differences have been found in ToM development, with girls having a slight advantage over boys. The question is whether this difference is also observable in the form of the cross-sectional developmental trajectories in boys and girls, i.e., whether eventual non-linearities in the curves have a gender specific timing or form.

Given the present state of our knowledge, all these issues amount to open questions. So far, there is no theory from which the answers to these questions can be predicted and that allows us to formulate these questions in the form of hypotheses. In this article, we will formulate a dynamic model of ToM development that might serve as a first attempt toward such a theory.

To summarize, our research questions are as follows: (1) Are there non-linearities in the cross-sectional growth curve of ToM in the form of temporary regressions and accelerations? (2) If such non-linearities are observed, are they real or are they due to statistical or sampling artifacts? (3) Are any observed non-linearities supported by additional indicators of non-linear change as described above? (4) Are there differences in any observed non-linearities between (4a) the various aspects of ToM as represented by the ToM sub-scores, and (4b) boys and girls?

In order to answer these questions, we follow a cross-sectional design, for reasons explained in the section on tapping potential non-linearities in ToM development. We use the ToM Storybooks, an instrument that incorporates a variety of ToM components. In order to describe the possible temporary regressions and accelerations in ToM development, we use techniques that also look at additional indicators for developmental transitions: changes in the skewness of the distribution, temporary changes in growth rate and changes in variability.

## MATERIALS AND METHODS

# Ethics Statement

The ethical committee of the University of Groningen approved this study. Because minors were involved, written consent was obtained in advance from their parents or guardians. The methods were carried out in accordance with the approved guidelines.

#### Participants and Setting

We tested 324 children. The ages ranged from three up to and including 11 years, with approximately the same number of boys and girls per age range (see **Table 1** for the age distribution).

The children came from preschools, kindergartens, and elementary schools, from both provincial and urban regions in the Netherlands. All children had a Dutch linguistic background, and did not have language acquisition problems that could have hampered their performance on the tasks (for the role of language in ToM development and ToM performance: Milligan et al., 2007; de Villiers and de Villiers, 2014; Ebert, 2015). Two Dutch language tests were used, depending on the age of the child. For 3–6 year olds, the Reynell was administered (test for receptive language comprehension; Van Eldik et al., 1997); and for 6–9 year olds, the TvK (Taaltest voor Kinderen, Language Test for Children: subtests 'vocabulary' and 'sentence construction'; Van Bon, 1982) was used. Language scores were available for 249 children (Reynell: n = 170, TvK: n = 79). Those children who did not receive a language test were older than 6 years and judged as having appropriate language skills by their teachers. Thirteen percent of the children came from a lower social background, distributed over the whole age range. This percentage corresponds with the percentage as known from the Dutch National Bureau of Statistics at the time of the research.

TABLE 1 | Age distribution of sample being administered the Theory-of-Mind (ToM) Storybooks (N = 324).

| Age (in years) | 3 | 4 | 5 | 6 | 7 | 8–9 | 10–11 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Boys | 32 | 31 | 31 | 31 | 15 | 14 | 13 | 167 |
| Girls | 29 | 24 | 32 | 26 | 16 | 12 | 18 | 157 |
| All | 61 | 55 | 63 | 57 | 31 | 26 | 31 | 324 |

#### Measure

Children's ToM knowledge was tested with the ToM Storybooks, version Sam (Blijd-Hoogewys et al., 2008). It is a comprehensive test, composed of multiple tasks, that measures a variety of ToM components and associated aspects. The ToM tasks are incorporated in short stories. These stories are illustrated with full color pictures and enlivened by the use of cuddly patches of fur, toy doors that can be opened, and magnetized emotion faces that can be placed on the characters. The test takes 40–50 min, including a short break (5 min of free play), also for the youngest age group (no significant effect of fatigue was found; Blijd-Hoogewys et al., 2008). Children experience the assessment as a 'being read to' activity, rather than a 'being tested' activity.

In total, there are 34 tasks spread over six storybooks. A maximum sum-score of 110 points can be obtained, which can be divided into five sub-scores (for example tasks, see Appendix A in Blijd-Hoogewys et al., 2008): (1) emotion recognition (maximum = 14 points), (2) distinction between physical and mental entities (real-mental, real-imaginary, and close impostors; maximum = 44 points), (3) understanding that seeing leads to knowing (maximum = 3 points), (4) understanding of desires (maximum = 17 points), and (5) understanding of beliefs (maximum = 32 points). The latter encompass tasks on standard belief, changed belief, not own belief, explicit false belief, false belief, inferred belief, and inferred belief control. In addition to providing a single, quantitative measure of the level of ToM ability, the ToM Storybooks also allow investigators to compare various relevant ToM components.

Each task incorporates one to five questions, including both test questions and justification questions. There are in total 74 binary test questions and 18 justification questions. The answers to the test questions are coded as correct or incorrect (1 or 0 points; maximum of all test questions = 74). The justification questions result in 2, 1, or 0 points, depending on the amount and correctness of the mental state terms spontaneously used by a child (maximum of all justification questions = 36). In order to evaluate the justifications, a category system is used (for more details, see the Appendices in Blijd-Hoogewys et al., 2008).
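The arithmetic of the sum-score can be sketched as follows: 74 binary test questions plus 18 justification questions worth up to 2 points each give 74 + 36 = 110 points. The function name and input layout are hypothetical, and the category system used to assign justification points is not modelled here.

```python
N_TEST_QUESTIONS = 74   # each scored 0 or 1
N_JUSTIFICATIONS = 18   # each scored 0, 1, or 2


def tom_sum_score(test_answers, justification_points):
    """Sum-score from already coded answers (maximum 74 + 2 * 18 = 110)."""
    if len(test_answers) != N_TEST_QUESTIONS:
        raise ValueError("expected 74 test answers")
    if len(justification_points) != N_JUSTIFICATIONS:
        raise ValueError("expected 18 justification scores")
    if any(a not in (0, 1) for a in test_answers):
        raise ValueError("test answers must be coded 0 or 1")
    if any(p not in (0, 1, 2) for p in justification_points):
        raise ValueError("justifications must be coded 0, 1, or 2")
    return sum(test_answers) + sum(justification_points)
```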

Since the ToM Storybooks is a comprehensive test, no other ToM measures were included in this study. The test has good psychometric qualities. The internal consistency [Cronbach's alphas: ToM total score = 0.95, ToM sub-score 1 (emotion recognition) = 0.83, ToM sub-score 2 (physical/mental) = 0.88, ToM sub-score 3 (seeing knowing) = 0.51, ToM sub-score 4 (desires) = 0.84, ToM sub-score 5 (beliefs) = 0.89], test-retest reliability (r = 0.86, p = 0.001 for typically developing children, r = 0.98 for children with PDD–NOS), inter-rater reliability (Cohen's Kappa = 0.81–0.97), and divergent and convergent validity are good (see also Blijd-Hoogewys et al., 2008, 2010). The ToM Storybooks has been translated into different languages, such as English, Finnish, French, Italian, and Spanish; and it has been standardized on two European populations, namely Dutch children (Blijd-Hoogewys et al., 2008) and Italian children (Molina and Bulgarelli, 2012; Bulgarelli et al., 2015).
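For readers unfamiliar with the internal consistency coefficient reported above, the standard formula for Cronbach's alpha, with k items, is k/(k − 1) × (1 − sum of item variances / variance of total scores). The sketch below implements that textbook formula; the data layout is an assumption, and it does not reproduce the values reported for the ToM Storybooks.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; items[i][p] is the score of person p on item i."""
    k = len(items)
    totals = [sum(person_scores) for person_scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))
```

With only a handful of items the coefficient tends to be lower, which is one plausible reading of the comparatively low value for the three-point 'seeing leads to knowing' sub-score.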

#### Procedure

All subjects were individually tested in a quiet room at school. Test administrators were carefully instructed to follow standard procedures. For practical reasons, kindergarten children were tested at home. If necessary, the parent was allowed to be present during testing but was requested not to interfere. The justification questions were judged later on. The inter-rater reliability of the justifications is known to be high (Cohen's Kappa = 0.81–0.97, see Blijd-Hoogewys et al., 2008). The few differences left were unanimously agreed on after discussion by four researchers.

## Data Analysis

In order to acquire insight into non-linear changes in ToM development, we used a descriptive non-parametric method, namely Loess curve smoothing (Simonoff, 1996). Next to that, we used random permutation techniques, and more generally, Monte Carlo analyses, which are assumption-free techniques (Kroese et al., 2014). Wellman et al. (2001) have argued for the use of more assumption-free techniques, such as bootstrap methods, in ToM research. A Monte Carlo analysis entails a simulation of the test statistic at issue (e.g., a particular numerical indicator of change or of non-linearity) as based on the null hypothesis, which can be compared to our empirical ToM data (Good, 2001; Todman and Dugard, 2001; Manly, 2007).

# RESULTS

# Non-linearities in ToM Sum-Scores

The Loess smoothed curves of the ToM sum-scores (maximum = 110 points) reveal three points of developmental interest (**Figure 1**; see also Blijd-Hoogewys et al., 2010). To determine the exact timing of these points, minima and maxima of the second derivative (acceleration of growth) of the developmental curve were inspected. The most marked inflection points are seen at 56 months (4 years and 8 months), 72 months (6 years), and 78 months (6 years and 6 months). The second inflection point (72 months) is followed by a dip in the curve, which shows its deepest point at 78 months. This dip is a temporary regression or local U-shaped age curve. It is the most striking deviation from monotonicity in the non-linear developmental curve based on the data from boys and girls taken together. More detailed analyses can be found in the section 'Non-linearities in ToM sub-scores and in gender based sub-groups.'

FIGURE 1 | The Loess fitting curve of the Theory-of-Mind (ToM) sum-score data plotted versus age displays a non-linearity. Based upon the second derivative of this curve three points of developmental interest were found, namely at 56, 72, and 78 months.
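Inspecting the extremes of the second derivative can be sketched with second-order finite differences on an evenly spaced smoothed curve. The helper names and the toy values are invented for illustration; they are not the ToM data.

```python
def second_differences(curve, step):
    """Discrete second derivative at the interior points of an evenly spaced curve."""
    return [
        (curve[i - 1] - 2 * curve[i] + curve[i + 1]) / step ** 2
        for i in range(1, len(curve) - 1)
    ]


def extreme_acceleration_ages(ages, curve):
    """Ages at which the discrete acceleration is largest and smallest."""
    step = ages[1] - ages[0]  # assumes equally spaced ages
    acc = second_differences(curve, step)
    interior = ages[1:-1]
    i_max = max(range(len(acc)), key=acc.__getitem__)
    i_min = min(range(len(acc)), key=acc.__getitem__)
    return interior[i_max], interior[i_min]


# Invented toy curve (age in months, smoothed score) with a dip after 72 months.
toy_ages = [60, 66, 72, 78, 84, 90]
toy_curve = [50, 58, 64, 61, 66, 72]
```

In this toy curve the strongest deceleration falls at 72 months, where growth turns into decline, and the strongest acceleration at 78 months, the bottom of the dip.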

# Is the Temporary Regression at 72–78 Months Real?

It should be checked whether the non-linearity in the form of a temporary dip is not the result of inadequate selection procedures or of statistical artifact, such as accidental sampling effects or the influence of specific, biased or incompetent test administrators.

To begin with, inadequate selection procedures are highly unlikely since the selection procedure was carried out with the utmost care and selection criteria were uniform over all ages. Second, it is unlikely that the non-linearity in the form of a temporary regression (a dip) is a statistical artifact. In order to demonstrate this, the null hypothesis was tested that the generic curve underlying the data is actually a monotonically rising curve and that the dip is due to accidental sampling variations. The latter could amount to an accidental overrepresentation of low scoring individuals. In order to test this possibility, we calculated the best fitting monotonic growth curve and a regression model for the variances. Since we had no prior assumption about where a non-linearity, in the form of a temporary regression, in ToM ability should occur, we tested for the accidental occurrence of an apparent temporary regression anywhere along the time interval. Because a theoretical expectation about the length of the temporary regression is also lacking, the null hypothesis was also tested for time windows of different length. By means of a Monte Carlo technique, we calculated the probability that the null hypothesis model yields a temporary regression, comparable to the observed one. The pattern of probabilities supported the conclusion that it is unlikely that the observed temporary regression is an accidental sampling effect of an otherwise continuous, monotonically rising simple curve (Monte Carlo, p = 0.01 through p = 0.05, depending on the length of the tested interval). Another indicator for non-linearity in the form of a temporary regression, namely negative slope (over intervals of variable length), provided converging evidence (Monte Carlo, p = 0.02).
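The logic of this Monte Carlo test can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually used: the null curve is supplied as a list of monotonically rising group means, residuals are drawn from a normal distribution with one common standard deviation, the dip statistic is the largest drop within a window of age points, and all function names are hypothetical.

```python
import random


def max_drop(curve, window=1):
    """Largest decrease curve[i] - curve[j] with 0 < j - i <= window (0.0 if none)."""
    best = 0.0
    for i in range(len(curve)):
        last = min(i + window, len(curve) - 1)
        for j in range(i + 1, last + 1):
            best = max(best, curve[i] - curve[j])
    return best


def monte_carlo_p(null_means, sd, n_per_age, observed_drop,
                  window=1, n_sim=2000, seed=0):
    """Share of simulated null datasets whose largest drop reaches the observed one."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        # Simulate the mean score of n_per_age children at every age point.
        sim = [
            sum(rng.gauss(m, sd) for _ in range(n_per_age)) / n_per_age
            for m in null_means
        ]
        if max_drop(sim, window) >= observed_drop:
            hits += 1
    return hits / n_sim
```

Because the drop is searched for anywhere along the curve, the resulting p-value already accounts for the absence of a prior expectation about where a regression should occur; varying `window` corresponds to testing time windows of different length.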

Next, we checked if particular test administrators caused the non-linearity in the form of a temporary regression (dip). We defined eight groups of data sets by leaving out the data of one particular test administrator at a time. If the temporary regression is due to an anomalous test administrator, it should disappear in the dataset from which this particular person is lacking. We repeated the statistical procedure described above for each of the reduced data sets. The resulting p-values showed that the dip remained significant for each of the reduced data sets (Monte Carlo, p < 0.001 through 0.05).
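The leave-one-administrator-out check amounts to building one reduced dataset per administrator and repeating the same test on each. A minimal sketch, assuming each record is a tuple whose first element identifies the administrator (a hypothetical layout):

```python
def leave_one_administrator_out(records):
    """Yield (administrator, reduced_records) with that administrator's data removed."""
    administrators = sorted({record[0] for record in records})
    for admin in administrators:
        yield admin, [record for record in records if record[0] != admin]
```

Each reduced dataset would then be passed through the same smoothing and Monte Carlo steps as the full dataset.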

In summary, neither selection errors, nor accidental sampling errors nor a deficient test administrator can account for the occurrence of the observed non-linearities.

# Non-linearities in ToM Sub-Scores and in Gender Based Sub-Groups

We checked whether the main temporary regression (a dip at 72–78 months) is observable in all five ToM sub-scores and preferably in the core ToM sub-scores (on desires and beliefs). For this purpose the ToM sub-scores were rescaled, to make comparisons easier (otherwise they would have different maximum scores). Loess curves with a 20% window size were calculated. All ToM sub-scores showed dips at roughly the same age (Monte Carlo, p = 0.01; see **Figure 2**).

The curves of the ToM sub-scores showed the same characteristics as that of the ToM sum-score, with start and end of the major dip at roughly the same age as the dip based upon the ToM sum-score (start: respectively, 71–74 months vs. 72 months; end: respectively, 77–79 months vs. 78 months). The sub-score on false beliefs displayed the steepest dip.

Second, we examined whether there are gender differences. On average, girls had slightly higher ToM sum-scores than boys (M = 71.71 versus M = 68.73, respectively; independent samples t-test, p = 0.098). The variance hardly differed between both sexes (20.82 and 20.44) and is considered equal (Levene's test, p = 0.749). When we divided the group into three age groups, however, (n = 87, <54 months; n = 119, 54 to <78 months; n = 118, ≥78 months), we found the gender difference to be significant for the youngest and oldest group (p = 0.05); and the variances within these two age groups were not equal (Levene's test, p = 0.01 and p = 0.05, respectively). Subsequently, we compared the Loess curves for both genders (**Figure 3**). The girls showed two non-linearities: an increase between the fourth and fifth year, followed by a plateau (first temporary regression) and then again a growth spurt between the fifth and sixth year, followed by a dip (second temporary regression) and ending with an ultimate growth spurt. The boys showed only one non-linearity, namely a dip that was more pronounced than the simultaneous dip of the girls. Through slope hunting techniques, we investigated the statistical significance of these dips in the null hypothesis model. The dip of the boys was significant (Monte Carlo, p = 0.007). The dip of the girls (their second temporary regression) was flatter and did not reach significance (Monte Carlo, p = 0.20). However, this dip appeared around 3 months earlier than in boys. If we reckon with the fact that ToM develops earlier in young girls than in young boys (Charman et al., 2002), the earlier appearance of the dip in the girls seems a meaningful phenomenon. The occurrence of a dip of this magnitude, appearing up to 3 months earlier but not later than in the boys, is unlikely to be accidental (this difference is statistically significant; Monte Carlo, p = 0.03).

# Skewness and Variability as Additional Indicators of an Underlying Transition

In the introduction, we discussed three qualitative indicators of developmental transition, namely skewness, temporary changes in growth rate and change in variability. Before further analyzing the developmental ToM pattern, we first wished to determine whether the hypothesized properties are indeed characteristic of a developmental transition of the kind we now expect to find in the ToM data. In order to do so, we mathematically simulated a transition model in order to check whether the expected qualitative indicators occur. A good example of a developmental transition is a two-step growth process (for details on how such models can be specified and simulated, see Van Geert, 1991, 1994; a two-step growth process can easily be extended toward a three- and more-step model if needed). This transition model can be found in the Supplementary Material.

FIGURE 2 | Loess curves of the ToM sub-scores plotted versus age. All ToM sub-scores showed dips at the same ages as the dips found for the ToM sum-score. ER, Emotion Recognition; D, Desire; B, Belief; FB, False Belief; MP, Mental Physical. The deepest point of the dip based upon the ToM sum-score is pointed out with the arrow and dotted line.

**Figure 4** shows the Loess curves with a 30% window of the skewness, growth rate, and variability of the real ToM data. A mixture between a two-step and a three-step growth process is apparent. There are two large peaks, with a smaller peak in between, most clearly observable in the variability measure (standard deviation) and less in the other two measurements. The qualitative similarity with the model simulation of a two-step process is striking. There are two peaks, both in the skewness and in the first derivative (i.e., growth rate) curve. As is the case in the simulation, the peaks of skewness largely coincide with those in the first derivative (growth rate), and the skewness peaks come somewhat earlier than those of the first derivative. The covariance of the series is 0.88, which is comparable to (and even higher than) the high covariance that the simulation model predicted (0.70).

Before concluding that the skewness and first derivative data support the notion of a two-step developmental process, we need to know what the probability is that a similar covariation of skewness and first derivative curves can be obtained if the underlying statistical variation of the sum-scores is in fact symmetrical across age (and not varying systematically, as hypothesized). This null hypothesis model can be tested by generating random series of sum-scores based on a normal distribution model, with means equal to the successive values of the non-linear growth curve and standard deviations equal to the observed standard deviation of the residuals. Only 2 out of the 200 simulated series had a covariance greater than or equal to the observed covariance (p-value is ∼2/200, i.e., p = 0.01). We can thus conclude that the skewness data provide further independent evidence for the existence of at least a two-step process in the development of ToM.
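The structure of this null-hypothesis simulation can be sketched as below. It is a simplified stand-in rather than the analysis reported here: scores are simulated per age point instead of along a continuous smoothed curve, the growth rate is a simple difference between successive group means, and all names and defaults are assumptions.

```python
import random


def skewness(values):
    """Moment-based skewness (0.0 when there is no spread)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def covariance(a, b):
    """Sample covariance of two equally long series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def null_covariances(curve, sd, n_per_age, n_sim=200, seed=0):
    """Covariance of skewness and growth rate when scores are symmetric at every age."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sim):
        groups = [[rng.gauss(m, sd) for _ in range(n_per_age)] for m in curve]
        means = [sum(g) / len(g) for g in groups]
        rates = [later - earlier for earlier, later in zip(means, means[1:])]
        skews = [skewness(g) for g in groups[:-1]]
        results.append(covariance(skews, rates))
    return results


def empirical_p(simulated, observed):
    """Proportion of simulated statistics at least as large as the observed one."""
    return sum(1 for s in simulated if s >= observed) / len(simulated)
```

With 200 simulated series, two values at or above the observed statistic give 2/200 = 0.01, which is how the p-value in the text is obtained.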

# Is There a Two or Three-Step Developmental Model?

As we mentioned before, girls evidenced a three-step development and boys more a two-step development (**Figure 3**). However, it is highly probable that also boys show a three-step development. It can be hypothesized that the first transition is observable only in girls, because of differences in major parameters – in particular the value of the main parameter, which is the growth rate – and not because of differences in the underlying variables affecting the growth of ToM.

In order to show that this interpretation is indeed feasible, we fitted a three-step growth pattern of ToM knowledge, based on the emergence of two underlying, supportive variables, one around the age of 56 months (A) and another around the age of 72 months (B; **Figure 5**). These supportive variables are hypothetical and may for instance include executive functions, which are known to be an important facilitator in ToM functioning (for the relation between ToM and executive function see, e.g., Carlson et al., 2002, 2013; Devine and Hughes, 2014), also found across cultures (Wang et al., 2016).

The growth model that was fitted to the smoothed data is of the type described by Van Geert (1991, 1994), and by Fischer and Bidell (2006). It contains positive parameters, i.e., a supportive relationship, for the A and B levels and negative parameters, i.e., a competitive relationship, for the first derivative of the hypothetical A and B levels (which corresponds with the actual change in these levels). **Figure 5** shows the fit with the smoothed curves of boys and girls separately, based on underlying hypothetical variables A and B, which are of the same magnitude and occur at the same age in both sexes. **Table 2** shows the values of the model parameters.
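To give an impression of how such a model can produce a temporary dip, the sketch below simulates logistic growth of a ToM level that is supported by the level of one emerging variable A and hindered by the change in A. The specific equation, the parameter names, and all numerical values are assumptions made for illustration; they are not the fitted model or the parameter values of **Table 2**.

```python
def logistic_step(level, rate, capacity):
    """One discrete step of logistic growth toward a carrying capacity."""
    return level + rate * level * (1.0 - level / capacity)


def simulate_tom(steps, rate, capacity, support=0.0, compete=0.0,
                 onset=None, a_rate=0.5):
    """ToM level over time; A emerges at `onset` and grows logistically toward 1.

    Assumed form: the level of A adds to ToM growth (support), while the
    change in A subtracts from it (compete).
    """
    tom, a = 1.0, 0.0
    path = []
    for t in range(steps):
        path.append(tom)
        if onset is not None and t >= onset:
            new_a = 0.01 if a == 0.0 else logistic_step(a, a_rate, 1.0)
        else:
            new_a = 0.0
        delta_a = new_a - a
        tom = logistic_step(tom, rate, capacity) + tom * (support * new_a - compete * delta_a)
        a = new_a
    return path


plain = simulate_tom(100, rate=0.2, capacity=100.0)
with_a = simulate_tom(200, rate=0.2, capacity=100.0,
                      support=0.05, compete=1.0, onset=60)
```

In this sketch `plain` rises monotonically toward its ceiling, whereas `with_a` shows a temporary regression while A is changing quickly and then settles at a higher level once A has stabilized. Whether the regression appears as a marked dip or only as a plateau depends on the relative size of the growth rate and the competitive parameter.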

A striking difference between boys and girls is that the parameter values cause faster growth and more effect of supportive and competitive variables in girls than in boys. The first discontinuity, a plateau, which is observable in the girls thanks to their higher growth rate, is in fact concealed in boys, as a consequence of their lower growth rate and lesser effect from the A-variable (which is a hypothetical variable emerging around the age of 4.6 years). The second discontinuity is observable in both girls and boys. Although the competitive effect of B on ToM is greater in girls than in boys, the observable effect is more salient in boys. This finding may lead to the conclusion that girls evidenced a three-step development and boys only a two-step development. However, in dynamic growth models, parameters often show non-linear co-variations, for instance competitive effects among variables can be masked by higher growth rates. The dynamic growth model (**Figure 5**) showed that the expression of the steps in the form of observable plateaus and marked dips may depend on the values of the growth parameters, in particular the value of the main parameter, which is the growth rate. It can be concluded that a dynamic growth model involving the effect of two variables affecting the growth of ToM, one occurring around the age of 56 months and the other around the age of 72 months, can account for the variety of non-linear phenomena observed in the data, including the differences and similarities, the plateaus and dips between boys and girls.

FIGURE 5 | … A and B. The top graph shows the fit of girls, the bottom part shows the fit of boys. The underlying variables A and B are of the same magnitude and occur at the same age in both genders.

TABLE 2 | Values of the model parameters used for the dynamic systems growth model.

# DISCUSSION

# Non-linearities in the Development of ToM: One or Two Temporary Regressions

Our findings support the general developmental view of ToM. Based on cross-sectional analyses of a ToM task that is new to the children, our results show that ToM increases with age – with the greatest increase between 42 and 56 months, that is between 3.5 and 4.7 years of age – and that it continues to develop after the age of six. The development before the age of four and a half is evidently monotonic. However, after this age non-linearities occur. Two temporary regressions – one around the age of 4 years and 8 months and one at the age of six to six and a half – are found not only in the ToM sum-score Loess curve but also in the ToM sub-score Loess curves.

The temporary regressions can be viewed as indicators of non-linearity in ToM development. We have demonstrated that the probability that the main temporary regression (a dip at 72–78 months) is either a statistical selection artifact or an experimenter artifact is very small. The application of additional indicators – skewness, growth rate, and variability – provided further support for the occurrence of a transition – or two transitions – in the development of ToM, as evidenced by an instrument that requires children to transfer their daily knowledge to a context of explicit verbal questions and pictorial representations. Also, the non-linearity found cannot be attributed to gender differences. Both boys and girls showed a marked regression around the age of six. However, girls also showed evidence for an additional earlier regression (a plateau), around the age of five.

There are different views on the manner in which ToM develops in preschoolers. For instance, one view implies continuous increases in ToM related processing abilities rather than radical conceptual shifts in understanding mental states (e.g., German and Leslie, 2000; Carlson and Moses, 2001; Birch and Bloom, 2004). A second view assigns central importance to the occurrence of a conceptual change. This change takes place between the age of three and four/five for simple ToM skills (Perner, 1991; Gopnik, 1993; Wellman et al., 2001; Wellman, 2014), and this conceptual development continues into more advanced ToM skills at the age of eight/ten (Osterhaus et al., 2016). Our data show a pattern of overall continuous increase, with a steep growth of ToM knowledge around the age of four, followed by a more continuous increase of ToM knowledge leveling off toward the age of five and interrupted by a temporary regression around the age of six, which occurs in boys and girls alike. Overall, we found boys and girls to follow the same developmental path. However, we also found some gender differences in ToM development. Such differences have seldom been reported in ToM research (for exceptions: Charman et al., 2002; Walker, 2005; Calero et al., 2013). In fact, most studies find no statistically significant differences between boys and girls, which might be due to the use of tests that are insufficiently capable of capturing subtle individual ToM differences (Baron-Cohen et al., 1997), or to insufficient statistical power. Our study included a more extensive sample than the majority of studies did. In addition, we employed statistical techniques that are sensitive to more subtle developmental patterns. Under such methodological conditions, possible gender differences are more easily detected in the data, not only in the appearance of ToM skills but also in the rate of ToM development. The early ToM growth in girls was more rapid than that of boys. Gender difference in the rate of ToM development has been hypothesized before by Baron-Cohen et al. (1997) and by Charman et al. (2002) who found that young girls have a ToM advantage, which disappears as children get older. Such a higher early rate of growth results in a greater likelihood of a later temporary standstill (Van Geert, 1994), which has indeed been demonstrated in our data, for girls showed two non-linear changes in the form of temporary regressions (a plateau and a dip), and boys only one (a dip). This is consistent with the scarce research on the effect of gender on ToM showing slight ToM advantages in young girls (2.3–4.3 year olds, Charman et al., 2002) and more profound ToM advantages in older girls (6–8 year olds, Calero et al., 2013).

The more rapid ToM growth in girls might be due to the fact that, from the beginning, girls are more focused on sociability. For instance, already in 1-day-old neonates, a definite sexual dimorphism is observable (Connellan et al., 2000). In addition, girls also have better verbal abilities than boys (Halpern, 2000), stronger syntactic abilities and more social experiences (Charman et al., 2002). Language is considered an important factor in ToM functioning (e.g., de Villiers and de Villiers, 2014). Finally, there is some evidence that females show more pronounced responses of the mirror neuron system than males (Cheng et al., 2006); the mirror neuron system has been hypothesized to directly relate to ToM abilities in both children and adults (for a review see Oberman and Ramachandran, 2007).

# Potential Explanations for the Observed Temporary Regressions

In this article we reported the discovery of one or two temporary regressions, indicative of either a two- or three-step development. The literature on U-shaped growth and non-linear growth curves in general provides some hints on possible explanations.

The first explanation is that the non-linearities reflect a temporary conflict between competence and performance (Marcus, 2004). According to this view, the development of ToM competence follows in reality a monotonically rising function, but for some accidental reason, performance on ToM tests gets a little worse around the age of six, maybe because a particular performance component interferes negatively. The question is of course what this performance factor is. In addition, one may question whether this competence-performance distinction is relevant on the level of testable psychological functions. Dynamic systems theory, as advocated by the late Esther Thelen and her collaborators, makes no distinction between these two levels, and sees a temporary regression as a direct consequence of dynamic interactions between components that are responsible for the production of answers to ToM questions in specific problem contexts (Gershkoff-Stowe and Thelen, 2004). According to this view, there is no ToM in the sense of an identifiable, internal conceptual structure. All behavior is soft assembled, and temporary regressions reflect the "continuous changes in the collective dynamics of multiple, contingent processes" (Gershkoff-Stowe and Thelen, 2004, page 11).

Another point that we wish to re-emphasize is that, from a dynamical point of view, cross-sectional data based on test scores provide an answer to the question of how children transfer their daily, probably non-discursive, experiences to a context of repeated, explicit verbal questions and pictorial representations. From a dynamic systems point of view, all forms of knowledge expression reflect the process by which this expression has come about. In that sense, all information about development reflects the contextual conditions under which it has been obtained. It is thus possible that the non-linearities found in our study are a typical property of the current test conditions. However, this possible context dependency does not reduce the developmental significance of the information obtained. The question is of course which aspects of ToM related knowledge and behavior are responsible for the observed non-linearities, in particular the temporary regression.

According to Brainerd (2004), temporary regressions in performance occur if a particular performance class – for instance the class of ToM related questions – is served by opposing strategies, or dual processes. It is conceivable that up to the age of six, the child has employed an intuitive and direct solution to ToM problems, while at around the age of six a new approach begins to emerge, which is more cognitive and reflective in nature (see also the hypothesis of embodied/enacted and explicit/reflective perspectives on other persons, e.g., Bohl and van den Bos, 2012; Fuchs, 2013; Gallagher and Varga, 2014). The emergence of a second strategy – for instance implying an explicit third person perspective as Fuchs (2013) has called it – requires a form of reorganization of components responsible for ToM performance, and the observed non-linearities are likely to reflect this reorganization (Feldman and Benjamin, 2004; Friend, 2004; Marcovitch and Lewkowicz, 2004; Rogers et al., 2004; Werker et al., 2004). That such non-linearities indeed occur as a consequence of continuous, long-term growth in a developing system has been demonstrated by modeling development, either by means of connectionist networks (Rogers et al., 2004) or by means of dynamic systems models of the type advocated by Van Geert, Fischer, and others (see Demetriou and Raftopoulos, 2004, for a discussion regarding U-shaped growth). In these models, long-term development is context-specific and dependent on dynamic interactions among many components – biological, cognitive, emotional, behavioral – that constitute the developing system (Van Geert, 1991, 1994, 1998; Fischer and Rose, 1994; Fischer and Bidell, 2006; Fischer and Van Geert, 2014). Relationships between the multiple components in a system can be supportive, competitive, conditional, or neutral. The dynamics of these relationships over time explain the emergence of phenomena such as accelerations, decelerations, and regressions.

Based on dynamic modeling and indirect evidence from brain development, neo-Piagetian theory predicts relatively major shifts in development around the age of 6 years, dependent on the context or content of the developmental function (Case, 1991; Fischer and Bidell, 2006). The shift is broadly associated with a marked increase in more reflexive, coordinated ways of thinking in contrast with the more intuitive, uni-dimensional ways of thinking that precede it. Although the application is purely speculative, it might be that around the age of six the intuitive ToM judgment, which is considered to be largely based on biologically founded forms of empathy (Preston and de Waal, 2002), is supplemented by a more reflective, cognitive form of ToM reasoning (already constructed from age 4 onward; Low, 2015). In this regard, it has been shown that six-year-olds have little trouble assigning false beliefs to others, but only arrive at a truly interpretive ToM at the age of seven (Carpendale and Chandler, 1996; Lalonde and Chandler, 2002); however, ToM continues to develop and change throughout life (Moran, 2013; Vetter et al., 2013). Children with autism seem to have an implicit ToM deficit (Schuwerk et al., 2015). As predicted by the theories discussed earlier, this emergence of a new ToM specific strategy in typical development might explain the temporary regression found in our data. The fact that this regression was found for all ToM sub-scores supports this way of thinking.

The previous explanations all rely on the notion of distinctive, developmentally ordered strategies for solving ToM problems. In fact, there is supportive but indirect evidence of two 'approaches' to ToM: an intuitive (or automatic) and a reflective (or controlled) route (Lieberman, 2007). Indirect evidence for an intuitive, neuro-physiologically based understanding of ToM related properties of other persons comes from the rapidly growing literature on the neuronal systems that underlie the spontaneous understanding of human actions and psychological states of others. An example of such a system is the mirror neuron system (for a systematic review see Hamilton, 2013). It is hypothesized that through cognitively mediated routes people with autism are able to compensate for the lack of an intuitive ToM (Eisenmajer and Prior, 1991; Baron-Cohen et al., 1993; Dissanayake and Macintosh, 2003). It is a strategy they can only master if a verbal mental age of 11 years is attained (e.g., Happé, 1995). Typically developing subjects, on the other hand, use the direct biology-based routes as well as the more cognitive ones. Their understanding of ToM is a combination of approaches and strategies (Lieberman, 2007), the combination of which changes across development (Kobayashi et al., 2007). It is not unlikely that the temporary regressions found in our study reflect a major reorganization in the composition of strategies.

It should be noted though that the non-linearities found in our data need not reflect a difference in ToM understanding per se, but could reflect a developmental difference in other factors necessary for the task. For instance, attention, inhibition, and 'curse of knowledge' may play a role (e.g., German and Leslie, 2000; Carlson and Moses, 2001; Birch and Bloom, 2004). At the age of six, the development of executive functions undergoes its first active stage of maturation (Brocki and Bohlin, 2004). It is not unthinkable that this development also has consequences for the ToM development of children (Carlson et al., 2002). According to the emergence account, executive function is even considered a necessary condition for the acquisition of ToM understanding (Moses, 2001; Devine and Hughes, 2014). San Juan and Astington (2012) have even suggested that executive function and language abilities can aid the developmental step from an implicit to an explicit ToM. However, Osterhaus et al. (2016) found that advanced ToM abilities were not determined by information-processing capacities (such as executive control: working memory and inhibition), instead indicating conceptual development.

Finally, data collected on children with PDD–NOS, an autism spectrum disorder (Blijd-Hoogewys et al., 2010), show a highly comparable dip in ToM scores. However, in accordance with the developmental delay in ToM typical of such children, the dip occurs at a slightly later age than in the typically developing children. This delay in the timing of the dip supports the conclusion that the dip is a genuine phenomenon of ToM development, and not of interference with some other non-ToM factor, which is not necessarily delayed in children with PDD–NOS. Note that children with autism spectrum disorder are also known to have executive function problems (Blijd-Hoogewys et al., 2014; Craig et al., 2016).

# A Three-Step Developmental Model

Visual inspection of the graphs revealed that girls showed two discontinuities (a plateau and a dip) and boys only one (a dip). The dip of the boys coincided with the second (more shallow) dip of the girls. The dynamic growth model showed that the observable properties of the growth trajectories depend on the values of the parameters governing the growth rate and the supportive and competitive relations between the variables in the model. A typical prediction of the model is that higher growth rates will result in more clearly observable plateaus and less clearly observable temporary regressions. This prediction is in line with the observed trajectory of the girls: the fact that they show an earlier growth spurt than the boys suggests that the growth rate of their underlying ToM components is higher than that of the boys. Consistent with this presumably higher growth rate, the girls show more clearly observable plateaus and more shallow dips. In short, the proposed dynamic growth model might provide a speculative explanation of the non-linear phenomena observed in the data, including the differences and similarities, the plateaus and dips between boys and girls.

# Limitations of the Research, Prospects for Further Study, and Implications for Clinical Practice

One limitation of our research is that it had fewer children in the older age range (from 8 years on), which implies a reduction in reliability at the older ages. Also, the test was probably too easy for the older children since we did not include more advanced ToM tasks that are typically mastered at later ages. Perhaps additional regressions would have been found at the older ages if second-order belief tasks (Perner and Wimmer, 1985) or more complex emotional constructs had been used. However, not having included such tasks does not alter our main message, that there are non-linearities in ToM development, if it is viewed from a cross-sectional perspective, with children being confronted with an essentially unfamiliar task, as far as their ToM knowledge is concerned.

A second limitation of our research is that the growth curve of ToM is based on cross-sectional data. This is only one particular perspective on ToM development, namely the perspective provided by asking children to transfer their knowledge to a new and unfamiliar ToM context, namely that of a storybook with explicit verbal questions. Various other complementary perspectives can be provided, for instance that of time-serial frequent measurements or observations of individuals. As is now becoming well-established knowledge, models based on group data should not be seen as models of typical individual curves (see the earlier remark on ergodicity in the introduction, Molenaar and Campbell, 2009). However, there is also converging evidence from longitudinal ToM research both in typically developing children (see Figure 3 in Serra et al., 2002) and children with PDD–NOS (Serra et al., 2002; Blijd-Hoogewys et al., 2010), further supporting the robustness of this developmental phenomenon.

Concerning future research, it might be interesting to include a broader age group, also including second-order and third-order belief tasks. In addition, it might be interesting to focus on directly perceived and enacted forms of other person understanding in the form of micro-observations of social interaction in young children and to compare these implicit forms of understanding with the more explicit forms of understanding that a test like the ToM Storybooks is trying to capture. A third possibility is to focus on the nature of the explanatory schemes that children use or enact while answering questions about desires and intentions. After all, many questions focusing on the understanding of desires and intentions evoke a potential conflict between a scheme of persons as rational agents (acting on the basis of the real states of affairs in the world) and a scheme of persons as psychological agents (acting on the basis of their knowledge and perception of states of affairs in the world). Of course, these schemes must be coordinated into a scheme of the person as a rational psychological agent, but this process of coordination might not be an easy accomplishment for many children.

The findings of the current research may have implications for clinical assessment and intervention. Around the age of six (72–78 months), a dip in ToM understanding and reasoning – in the form of answering explicit questions about imaginary situations – seems common. Note that this is also the age period in which ASD is often diagnosed in children (Miodovnik et al., 2015). Test developers and diagnosticians should take into account that children with ASD may at that time 'appear' to have less severe problems on a ToM test if compared to their typically developing peers who are undergoing a temporary ToM dip. Children with ASD show this dip much later (Blijd-Hoogewys et al., 2010). This may appear counterintuitive, for children with ASD do have ToM problems (Baron-Cohen, 2000). Research concerning the impact of the ToM dip on clinical assessment is needed. In individual children, the temporary dip found on the group level might be expressed in the form of temporarily increased intra-individual variability in their reactions to questions involving ToM decisions, for example shifts between direct, rapid, and primarily implicit understanding on the one hand, and reflective, thoughtful and primarily explicit understanding on the other hand, or shifts between rational-agent and psychological-agent perspectives. In principle, clinical interventions might explicitly reckon with the non-linearities in the processes of ToM development, and focus on individual indicators of such non-linearities in the form of rapid learning, resistance to learning, response variability, and so forth, to adapt the intervention to the idiosyncratic nature of the young client's developmental pathway. Also, ToM training should perhaps focus mainly on acquiring an intuitive and direct way of ToM, only taking into account the cognitive and reflective approach after the dip-age has been reached (Gallagher, 2004; Gallagher and Varga, 2015). How exactly this should be done is of course a matter of further clinical research.

# CONCLUSION

In sum, this article has explored the existence of non-linearities, in particular temporary regressions, in ToM development. Because little is known about the dynamics in ToM development, a cross-sectional design was applied in combination with non-linear fitting methods. Data from the ToM Storybooks, a comprehensive measurement of ToM, showed that a two or three-step developmental model can be distilled. One non-linearity occurs at the age of 4 years and 8 months (a plateau), and one between the ages of six and six and a half (a dip). These non-linear phenomena could not be explained as accidental sampling effects and were supported by additional indicators of non-linearity, namely changes in skewness, in growth rate, and in variability. The non-linearities, for instance in the form of temporary regressions or dips, were observable not only in the ToM total score, but also in the ToM sub-scores and in both boys and girls. Boys and girls differed somewhat in the form and timing of the non-linear properties. Finally, the dynamic growth models presented in this article might serve as a starting point for the formulation of a theory of ToM in a broader developmental context, focusing on the individual-in-interaction as the locus of the developmental process.

# AUTHOR CONTRIBUTIONS

Both authors were involved from beginning to end. EB-H is the first author. She gathered the data, did data-analyses and interpretation together with PvG and wrote the article. PvG was responsible for the research design, did data-analyses and interpretation together with EB-H and made revisions to the manuscript. Both authors are in agreement with the content of the manuscript and agree to the byline order and to submission of the manuscript in this form. They agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

### FUNDING

This research was supported by an internal research Grant from the University of Groningen, Department of Psychology, Heymans Institute.

#### ACKNOWLEDGMENT

We thank all the children and their parents for participating in this research project and the numerous students who helped in collecting data.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01970/full#supplementary-material

#### REFERENCES

Osterhaus, C., Koerber, S., and Sodian, B. (2016). Scaling of advanced theory-of-mind tasks. Child Dev. doi: 10.1111/cdev.12566 [Epub ahead of print].


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Blijd-Hoogewys and van Geert. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Intergenerational Transmission of Reflective Functioning

#### Anna M. Rosso\* and Cinzia Airaldi

Department of Education, University of Genoa, Genova, Italy

The present study investigated whether, and to what extent, reflective functioning (RF) during preadolescence is associated with maternal attachment security and RF, and with the child's attachment security. Thirty-nine mother–preadolescent child dyads from a non-clinical population participated in the study. Maternal and child RF were assessed by applying the Reflective Functioning Scale to the Adult Attachment Interview (AAI) and to the Child Attachment Interview transcripts. Children of mothers who showed a secure attachment model regarding the relationship with their parents during childhood reported higher levels of RF than the children of mothers who were classified as insecure on the AAI. Child RF was positively associated with maternal "Coherence of the Mind" on the AAI and negatively associated with maternal derogation of attachment. A strong, significant association was also found between child attachment security and child RF. Children who were rated as being more emotionally open, more able to balance positive and negative descriptions of their parents, more prone to support their assertions through examples, and more able to positively resolve conflicts with their parents showed higher RF. On the contrary, children who resorted to a greater extent to idealization and dismissal toward their parents showed a lesser degree of RF. Notably, a very strong association was found between the score on the "Overall coherence" subscale and the child's ability to mentalize mixed-ambivalent mental states in the context of their family relationships. As expected, child and maternal RF were significantly and positively correlated with each other. In particular, only maternal RF (and not maternal attachment security) predicted child RF, and only maternal ability to mentalize mixed-ambivalent mental states predicted the corresponding ability in the children.

Keywords: child reflective functioning, maternal mentalization, Child Attachment Interview, preadolescence, dismissing attachment model

Edited by: Paola Molina, University of Turin, Italy

#### Reviewed by:

Cristina Riva Crugnola, University of Milano-Bicocca, Italy Susanna Pallini, Università Degli Studi Roma Tre, Italy

> \*Correspondence: Anna M. Rosso rosso@unige.it

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 24 August 2016 Accepted: 21 November 2016 Published: 06 December 2016

#### Citation:

Rosso AM and Airaldi C (2016) Intergenerational Transmission of Reflective Functioning. Front. Psychol. 7:1903. doi: 10.3389/fpsyg.2016.01903

## INTRODUCTION

The development of the human ability to understand the mental states of oneself and of others has been studied by philosophers (e.g., Brentano, 1924; Dennett, 1987; Fodor, 1987), cognitive and developmental psychologists (e.g., Baron-Cohen et al., 1985; Dunn, 1988; Gopnik and Astington, 1988), and neuroscientists (e.g., LeDoux, 1996). This ability is commonly referred to as "mentalization." However, a growing body of evidence supports the notion that the construct of mentalization includes several components which are only partially correlated to each other (Fonagy et al., 2012). In addition, the term "mentalization" often refers to different constructs [e.g., theory of mind (ToM), mind-mindedness, emotional intelligence] which, albeit partially overlapping, originated from different theoretical frameworks and were investigated by means of different experimental paradigms or tasks (Sharp and Fonagy, 2008).

In this study, we focused on reflective functioning (RF), a specific operationalization of mentalization that was suggested by Fonagy and Target (1997) in the context of attachment theory. RF was defined as the ability to mentalize in the context of close, interpersonal relationships, thus allowing "to distinguish inner from outer reality, pretend from 'real' modes of functioning, intra-personal mental and emotional processes from interpersonal communications" (Fonagy et al., 1998). It promotes a more coherent sense of self as well as a better understanding of others, thereby making the individuals' behavior meaningful and predictable. It is assumed that RF originates in the context of early attachment relationships and is promoted by a mentalizing mother who is able to treat her child as a being with a mind, and can keep her child's feelings, desires as well as intentions in her own mind (Fonagy et al., 2002). Such a mentalizing mother helps the child to recognize, tolerate, and regulate his/her emotional experiences through her ability to represent them, through her gestures and actions, and later also by playing and talking in terms of mental states (Gergely and Watson, 1996; Meins et al., 2002).

Reflective functioning was initially assessed in adults by applying the Reflective Functioning Scale (RFS) to the Adult Attachment Interview (AAI) (Main and Goldwyn, 1998), a semistructured interview which focuses on the subject's attachment experiences with their parents during childhood. As will be explained in more detail below, some questions in the AAI (e.g., "Why did your parents behave as they did during your childhood?," "Do you think your childhood experiences have an influence on who you are today?") require RF, while others allow it. Based on the RFS, RF emerges when the interviewee shows that he/she is aware of the nature of mental states, an explicit effort to tease out mental states underlying behavior, the proneness to recognize developmental aspects of mental states or mental states in relation to the interviewer (Fonagy et al., 1998). The longitudinal London Parent–Child Project (Fonagy et al., 1991) found that mothers with higher RF (who were interviewed during their first pregnancy) were more likely to have a child with a secure attachment model at the age of 1 year. In particular, the longitudinal study highlighted that elevated RF in mothers who had suffered from painful and/or traumatic experiences in their childhood was a protective factor against the risk of the child developing insecure and/or disorganized attachment models. On the contrary, these were frequently found in the children of mothers who had suffered traumatic experiences in their own childhood, and who never developed the protective ability to mentalize their own, or their parents' mental states that were involved in the painful emotional experiences (e.g., severe neglect, loss, physical, or sexual abuse) they experienced (Fonagy et al., 1991). A more recent study (Arnott and Meins, 2007) found that mothers with higher RF showed better mind-mindedness (i.e., the parent's ability to represent their children's thoughts and feelings) when their children were 6 months old. In addition, in this study the mothers' RF predicted child attachment security at 12 months. A later study (Ensink et al., 2016) replicated the results of the London Parent–Child Project (Fonagy et al., 1991), and found that the RF of the mothers, as assessed during pregnancy, was associated with later adequate parenting as well as infant attachment security.

Afterward, a modified version of the RFS (Slade et al., 2004) was developed to be applied to the Parent Development Interview (PDI) (Aber et al., 1985), a semi-structured interview designed to evaluate the mental representation the parent has of him/herself, as well as of the child, and of their relationship. Studies found that good maternal RF, as assessed in the context of the PDI, mediated the intergenerational transmission of attachment security and was associated with more sensitive and adequate caregiving behavior (Grienenberger et al., 2005; Slade et al., 2005).

A third version of the RFS, i.e., the Child Reflective Functioning Scale (CRFS), was recently developed and validated (Target et al., 2001; Ensink, 2004; Ensink et al., 2015) to be applied to the Child Attachment Interview (CAI) (Shmueli-Goetz et al., 2000). It is a semi-structured interview that was developed to assess attachment models in children aged 7–12. Children with secure attachment showed that higher RF was significantly associated with higher scores on some CAI subscales, namely "Emotional openness" and "Coherence" (Ensink, 2004). A recent study found that maternal RF, as assessed by the PDI, was associated with child RF, and that the latter was impaired in children who had experienced sexual abuse (Ensink et al., 2015).

The availability of the CRFS has led to progress in this field, ultimately overcoming some of the previous study limitations. Until recently, the lack of a measure to assess child RF in the context of attachment narratives prevented us from exploring both the impact of mother–child attachment security and the influence of maternal RF on child RF. Previous studies, which focused primarily on preschool aged children, mostly used measures of different components of child mentalization, such as ToM, or emotional understanding in impersonal contexts. These studies found that maternal attachment security predicted the child's ability to identify painful emotions, to cope with challenging circumstances (Steele et al., 2002), to recognize emotions, especially negative ones (Laible and Thompson, 1998; Steele et al., 1999, 2003, 2008), and to solve false-belief tasks (Fonagy and Target, 1997). Maternal mind-mindedness, as well as maternal RF, predicted the child's performance in ToM tasks as well (Meins et al., 2002, 2003; Steele and Steele, 2008). To date, only a few studies have focused on preadolescence (Rosso et al., 2015; Scopesi et al., 2015).

The aim of the current study was to investigate whether, and to what extent, RF during preadolescence is associated with maternal attachment security and RF, and with the child's attachment security. Based on the available literature, we expected to find an association between child RF, maternal attachment security and RF, and child attachment security, even though some studies (de Vito and Muscetta, 1998; Ammaniti et al., 2000; Ammaniti and Sergi, 2003) pointed out that, in the transition to adolescence, children might more frequently adopt dismissing strategies toward their parents, which could decrease their ability to mentalize in the context of their closest familial relationships.

Previous studies also showed that dismissing attachment correlated both with an impairment of the ability to process negative emotions, particularly sadness (Strathearn et al., 2009), and with a proneness to inhibit negative affective responses (Leckman et al., 2004; Strathearn, 2006; Crittenden, 2008). Conversely, secure mothers showed better attunement with their children and greater ability to repair mismatched states during free play (Riva Crugnola et al., 2013), and the maternal proclivity to talk about painful emotions predicted emotional understanding in children (Dunn and Brown, 2001) as well as the early acquisition of ToM (Hughes and Dunn, 2002). Children's understanding of mixed emotions was also predicted by their attachment security (Ensink, 2004). Recent studies (Rosso et al., 2015; Scopesi et al., 2015) confirmed the association between a dismissing model and impaired mentalization, as well as the association between the maternal ability to mentalize mixed emotions and mentalization in their children. Thus, the aims of the current study also include investigating (1) whether and to what extent dismissing and preoccupied maternal defensive strategies are associated with an impairment of RF in children, and (2) whether the maternal ability to mentalize mixed-ambivalent mental states is associated with higher RF in children. We expected to replicate the findings of the previous studies. To our knowledge, no empirical studies have been conducted to investigate the impact of the preoccupied state of mind on mentalization ability. Fonagy et al. (2010) hypothesized that preoccupied individuals showed strong activation of the attachment system and simultaneous deactivation of the mentalization system. More recently, Fonagy et al. (2016) found that psychologically suffering mothers used mental state talk extensively in their narrative which, however, was not really a marker of authentic mentalization. In line with these hypotheses, we could assume that children of preoccupied mothers do not show good mentalizing, but it is more cautious to regard the current study as exploratory in this respect.

# MATERIALS AND METHODS

## Participants

Thirty-nine mother–child dyads were recruited on a volunteer basis at an Italian public school. Children were aged 12.3–12.9 years; 25 (64.1%) were male and 14 (35.9%) were female, and most (74.4%) came from intact families. In order to exclude children with physical or psychological impairments, mothers were interviewed regarding the child's developmental history, while teachers were briefly questioned about learning and/or behavioral disorders. Twenty-three parents (59%) gave consent for their children to be administered the verbal scale of the Wechsler Intelligence Scale for Children (WISC)-III. The Verbal IQ of the children was found to range from 99 to 145 (M = 116.96, SD = 12.8). Mothers came from working- and middle-class backgrounds. They were aged between 37 and 53 years (M = 42.95; SD = 4.36), and their level of education ranged from 8 to 23 years (M = 13.31; SD = 3.65). All but two were employed.

## Measures

#### Maternal Attachment Models

The AAI (George et al., 1985) was administered to the mothers. It is a semi-structured, hour-long interview designed to classify the state of mind with respect to early attachment experiences. The protocol consists of 18 questions. The interview begins by asking the subject to describe his/her relationship with his/her parents during childhood. Then he/she is requested to give five adjectives that describe the relationship with each parent and to recall specific memories that would support the previously chosen adjectives. The next questions ask about experiences of emotional distress, physical injury, illness and separation from parents during childhood. The subject is further asked about any possible experiences of rejection, abuse, maltreatment and loss. The interviewee is also asked to give his/her opinion about the impact of his/her childhood experiences on his/her personality and about the mental states underlying his/her parents' behavior. Finally, the interview questions shift to the current relationship with the parents, and the present relationship with the interviewee's own children, if any. The last question requires interviewees to describe how their experiences of being parented impact on their own parenting. According to the Main and Goldwyn (1998) coding system, the subjects are judged "secure/autonomous" if the narrative is sufficiently coherent regardless of the positive or negative quality of their relationships during childhood. The transcripts are classified as "dismissing" when the speaker shows an attempt to minimize the influence of attachment experiences, in particular idealizing or derogating the attachment figures. The category "preoccupied" is assigned to people who appear entangled in their past experiences. They may be confused, passive, vague, fearful, overwhelmed or angry, conflicted and unconvincingly analytical. "Unresolved/Disorganized" is an additional category that is assigned when the narrative contains markers of lapses in the monitoring of reasoning or discourse during the discussion of experiences of loss and/or abuse. The category "cannot classify" is assigned to those transcripts that show a mixture of inconsistent and incompatible states of mind. In non-clinical populations the latter classification is rarely assigned. According to the findings of the most recent meta-analysis, the following distribution was observed in the non-clinical adult population: 58% secure, 23% dismissing, 19% preoccupied, and 18% additionally classified unresolved/disorganized (Bakermans-Kranenburg and van IJzendoorn, 2009). Several studies have supported the power of the AAI to predict parenting and subsequent infant–parent attachment (Fonagy et al., 1991; van IJzendoorn, 1995; Bakermans-Kranenburg and van IJzendoorn, 2009; Berthelot et al., 2015).

In the current study, the two-way classification (Secure vs. Insecure) was used. Dichotomizing the sample was the only viable choice because, given the limited number of participants in the study, our sample included only 15 mothers who were classified as Insecure (five Dismissing, seven Preoccupied, and three Unresolved). Furthermore, a dimensional approach to the AAI was also utilized, as suggested by recent studies (Bakermans-Kranenburg and van IJzendoorn, 2009; Whipple et al., 2011) after Roisman et al. (2007) explored the AAI latent structure and found two dimensions, namely the dismissing and the preoccupied dimension. Using the state of mind scales in the analyses is also recommended because it allows the impact of the dismissing and the preoccupied dimensions to be investigated with enhanced statistical power (Roisman et al., 2007). Thus, we considered the subscales "Idealization regarding mother," "Idealization regarding father," "Overall derogation of attachment," and "Coherence of mind" to explore the specific impact of the dismissing strategies, and the subscales "Passivity," "Involving anger toward mother," and "Involving anger toward father" to investigate the influence of the maternal preoccupied state of mind on the children's RF. All of the AAIs were coded in terms of the Berkeley AAI System (Main and Goldwyn, 1998) by a licensed coder, blinded to scores on other measures. Eight transcripts (20%) were then randomly selected and recoded by the first author. The resulting inter-rater reliability was satisfactory (Cohen's k = 0.86 for overall classification, and ICC ranging from 0.81 to 0.85 for the subscales).
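As an illustration of how an agreement coefficient such as Cohen's k is obtained for a double-coded subset of transcripts, the following minimal Python sketch computes it from two coders' categorical classifications. The labels below are hypothetical and are not the study's data; the function name `cohens_kappa` is likewise an illustrative choice.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over the labels (a label unused by one rater contributes 0).
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical classifications of eight transcripts (NOT the study's data).
coder_1 = ["Secure", "Secure", "Secure", "Secure",
           "Dismissing", "Dismissing", "Preoccupied", "Unresolved"]
coder_2 = ["Secure", "Secure", "Secure", "Secure",
           "Dismissing", "Preoccupied", "Preoccupied", "Unresolved"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))  # 0.81
```

In this invented example the coders agree on seven of eight transcripts (observed agreement 0.875), and the coefficient is lower than that because part of the agreement would be expected by chance.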

#### Maternal Reflective Functioning

The Adult Reflective Functioning Scale (ARFS) (Fonagy et al., 1998) was applied to the AAI transcripts to evaluate maternal RF. In coding RF, some AAI questions are considered "Demand Questions" in that they require RF (e.g., "Why do you think your parents behaved as they did during your childhood?"), while other questions are called "Permit Questions" in that they do not require, but only allow RF (e.g., "Could you describe your first separation from your parents?"). According to the scoring guidelines, the following four markers of RF are identified: "Awareness of the nature of mental states" (marker A), "Explicit effort to tease out mental states underlying behavior" (marker B), "Recognizing developmental aspects of mental states" (marker C), and "Mental states in relation to the interviewer" (marker D). After rating each identified passage of the AAI, an overall classification is assigned to the interview as a whole, ranging from −1 (negative RF) to 9 (exceptional RF). In this study, in addition to the overall RFS rating score, we considered three further RF variables on the basis of a recent study (Rosso et al., 2015), namely the frequency of RF in the context of positive, negative, and mixed-ambivalent mental states (e.g., "I felt secure with my mum, because she always tried to comfort me"; "Unfortunately, I often got mad at my mother, it seemed that she could not understand me when I was sad"; "I really don't know how the relationship with my mother was when I was a child, sometimes I felt well with her, sometimes I felt some kind of irritation, maybe I was really sensitive to her sudden mood swings, without understanding that she was terribly depressed"). Validation studies of the RFS (Fonagy et al., 1998) showed discriminant and predictive validity, good inter-rater reliability, low correlation with education level, and no correlation with socioeconomic status (SES) or age.

In the present study, the RF score did not correlate with either the mothers' level of education (r = −0.032, p = 0.845) or the mothers' age (r = 0.121, p = 0.463). The first author, who was blinded to the scores on the other measures, rated the transcripts according to the guidelines manual (Fonagy et al., 1998); eight transcripts (20%) were then randomly selected and re-coded by an independent coder. The resulting inter-rater reliability was excellent (ICC = 0.82).

#### Child Attachment Models

The CAI (Shmueli-Goetz et al., 2000, 2008, 2011; Target et al., 2003) was administered to the children. It is a measure designed to assess attachment models in children from 7 to 12 years of age. The protocol includes 19 questions about the composition of the family, and about the child him/herself and the relationship with his/her parents. The child is encouraged to talk about specific relationship episodes involving each parent, including moments in which he/she was ill, felt troubled, was in conflict with them, or was in need of help. Similarly to the AAI, the CAI investigates the emotional reactions to experiences of mourning as well as of separations. Coding the protocol takes into account not only an analysis of the speech, but also the non-verbal behavior of the child. A score ranging from −1 to 9 is assigned to the following subscales: "Emotional openness," "Balance of positive/negative references to attachment figures," "Use of examples," "Preoccupied involving anger," "Idealization," "Dismissal of attachment," "Resolution of conflict," "Atypical/Disorganized behavior," and "Overall coherence." Then, a main attachment classification (Secure, Dismissing, Preoccupied, Disorganized) is assigned separately with respect to the mother and to the father. Secure children show greater ability to express and to identify emotions and to give examples, as well as low levels of anger, idealization, and dismissal/derogation of attachment, a higher balance of positive and negative references, and the ability to resolve conflicts constructively. Preoccupied children are entangled in their painful experiences, sometimes overwhelmed by feelings of anger, and excessively focused on the parent. Dismissing children are rated high on "Idealization" and/or "Dismissal," and low on "Emotional openness," "Balance of positive/negative references to attachment figures," "Use of examples," "Resolution of conflict," and "Overall coherence." 
Disorganized children often show a proclivity to control through punitive or care-giving behavior. During the interview, these children may show sudden changes in the affective tone, interruptions in speech, affective inadequacy, and/or bizarre behavior. In some cases, they exhibit unrealistic self-representations. CAI validation studies (Target et al., 2003; Shmueli-Goetz et al., 2008; Venta et al., 2014; Borelli et al., 2016) conducted on clinical and non-clinical populations showed good psychometric properties. Inter-rater reliability was good (k between 0.58 and 0.93), both between expert coders and between students who had received 3 days of training. The distribution of attachment classifications in non-clinical samples was in line with what is reported in the literature (i.e., 66% secure, 30% dismissing, 4% preoccupied with respect to the mother, and 64% secure, 30% dismissing, and 6% preoccupied with respect to the father). The concordance of classifications between mother and father was very high (92%, k = 0.84). The group of scales related to the state of mind showed a high internal consistency (Cronbach's α = 0.87). The test-retest reliability showed k values between 0.52 and 0.81 after 3 months, and k values between 0.52 and 0.74 after 1 year. The classification with respect to the mother showed higher temporal stability compared to the classification toward the father. No significant differences were observed when comparing secure and insecure children, with regard to age, gender, SES, ethnicity and verbal IQ. A significant association was instead observed between attachment classification of the children and that of their mothers (Shmueli-Goetz et al., 2008).

In the current study, the second author, who was blinded to scores on other measures, rated the transcripts according to the guidelines (Shmueli-Goetz et al., 2011); eight transcripts (20%) were then randomly selected and re-coded by an independent coder. The resulting inter-rater reliability was excellent (k = 0.88 for the overall classification and ICCs ranging from 0.84 to 0.88 for subscales).

#### Child Reflective Functioning

The CRFS (Target et al., 2001) was developed on the conceptual basis of the ARFS, with modifications to the guidelines so as to apply it to children. As for the AAI, the markers of RF include "Awareness of qualities of mental states," "Explicit effort to tease out mental states underlying behavior," "Recognizing that mental states develop in the context of developmental, psychobiological, and social processes," and "Mental states in relation to the interviewer." It must be kept in mind that, compared to adults, children often give evidence of RF in more implicit ways, for example by mimicking, by changing their tone of voice, and through facial expressions. For this reason, coding from transcripts alone is not sufficient, and the videotaped interviews must also be coded. CRFS inter-rater reliability was found to be good, with ICC ranging from 0.60 to 1.00 and a median of 0.93; temporal stability was found to be high over a 3-month period and adequate over 12 months (Ensink, 2004). A recent study (Ensink et al., 2015) supported the validity of the CRFS in distinguishing sexually abused children from a community control group.

In this study, in addition to the overall CRFS rating score, we considered the frequency of RF in the context of positive, negative, and mixed-ambivalent mental states, just as we did when coding RF in the mothers. All the CAI transcripts were coded according to the CRFS guidelines (Target et al., 2001) by a licensed coder, blinded to scores on the other measures. Eight transcripts (20%) were then randomly selected and re-coded by the first author. The resulting inter-rater reliability was satisfactory (ICC = 0.85).

#### Children's Verbal Intelligence

The WISC-III verbal scale was administered to assess the children's verbal IQ.

## Procedure

Mothers and children agreed to participate in this study after receiving a letter from the headmaster of the school attended by the children. The letter presented our research project as a study aimed at investigating the inter-generational transmission of attachment models. Only 13% of the families of children attending the second year of the middle school agreed to be contacted further. Mothers had a brief interview with the researchers aimed at further illustrating the study and at collecting the developmental history of the children to rule out physical or mental disorders, after which the children were informed about the aim of the study. All of the contacted children agreed to participate, then both parents gave their written consent. While all of them gave their consent for the interviews, only 23 families gave their consent to administer the WISC-III verbal scale. Graduate psychology students, who had previously been trained by the first author in the administration of the AAI and the CAI, administered the interviews in rooms made available by the headmaster inside the school. The AAIs were audiotaped, the CAIs were videotaped, and then both were transcribed verbatim. All the coders involved in the study had received their coding license after ad hoc training at the Anna Freud Centre and University College in London. The study followed the 2010 ethical guidelines of the APA (American Psychological Association, 2010).

# RESULTS

## Preliminary Analyses

First, we checked the distribution of the variables of interest. All variables except maternal derogation, maternal involving anger toward mother and toward father, and maternal references to mixed-ambivalent mental states were normally distributed. Thus, in the subsequent analyses, non-parametric statistics were used only for these four non-normally distributed variables. Then, we explored the data for possible confounding variables. Gender differences in CRF-overall score, F(1,38) = 0.342, p > 0.05, CRF-references to positive mental states, F(1,38) = 0.172, p > 0.05, CRF-references to negative mental states, F(1,38) = 0.064, p > 0.05, and CRF-references to mixed-ambivalent mental states, F(1,38) = 2.152, p > 0.05, were not significant. No significant correlation emerged between maternal level of education, maternal RF (r = −0.032), and child RF (r = 0.206). The children's verbal IQ did not correlate with child RF (r = 0.062).

## Child Reflective Functioning and Maternal Attachment

According to the AAI coding system, 24 mothers were classified as secure and 15 mothers as insecure. The children of secure and insecure mothers were compared on CRFS scores using independent-samples t-tests. A moderate effect of the group (Cohen's d = 0.63) was found regarding overall CRFS, with higher scores being observed in the children of secure mothers. Comparisons are reported in **Table 1**.

Correlation analysis was used to investigate the association between the maternal AAI state of mind scales and child RF. The results are shown in **Table 2**.

Child RF correlated significantly positively with maternal Coherence of Mind (r = 0.326, p = 0.043) and negatively with Maternal overall derogation of attachment (ρ = −0.327, p = 0.043). No significant associations emerged between child RF and the maternal idealization of her relationships with her parents during her childhood. A negative association (ρ = −0.252), albeit not statistically significant, was found between maternal "Involving anger toward mother" and child RF.
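The link between a correlation coefficient and its p-value can be made explicit with a short Python sketch. It assumes a two-tailed test based on the statistic t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom and n = 39 dyads; the text does not state the software or exact procedure used, so this is a consistency check under those assumptions rather than a reconstruction of the original analysis. The function names are illustrative, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return max(0.0, 1 - 2 * area_zero_to_t)


# Reported value: maternal Coherence of Mind vs. child RF, r = 0.326, n = 39.
n_dyads = 39
t_stat = t_from_r(0.326, n_dyads)
p_value = two_tailed_p(t_stat, n_dyads - 2)
print(round(t_stat, 2), round(p_value, 3))
```

Under these assumptions the sketch gives a p-value of roughly 0.043, in line with the reported figure. The Spearman coefficients (ρ) in the text would need a rank-based procedure and are not covered here.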


TABLE 1 | Child Reflective Functioning Scale (CRFS) descriptive statistics and group comparisons between children of secure and insecure mothers.

M, mean; SD, standard deviation; t, t-statistic; p, p-value; d, Cohen's measure of effect size (|d| < 0.20: negligible; 0.20 ≤ |d| < 0.50: small; 0.50 ≤ |d| < 0.80: moderate; |d| ≥ 0.80: large); CRFS, Children's overall Score reported on Child Reflective Functioning Scale; CPMS, Children's references to positive mental states in the context of RF; CNMS, Children's references to negative mental states in the context of RF; CMMS, Children's references to mixed-ambivalent mental states in the context of RF.

## Child Reflective Functioning and Child Attachment

Twenty-two children (56.4%) were classified as secure toward their mother, and 17 children were rated as insecure, of whom 14 (35.9%) were dismissing and three (7.7%) were rated as preoccupied. None of the children were classified as disorganized. Secure children obtained higher scores on CRFS (M = 4.14, SD = 1.36) compared to insecure children (M = 2.94; SD = 1.03). The comparison was carried out using the independent t-test and yielded a significant difference between the two groups (t = −3.021, p = 0.005) as well as a large effect size (Cohen's d = 0.99).
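The reported group comparison can be approximately reproduced from the summary statistics alone. The Python sketch below assumes a Student's t-test with pooled variance and shows two common definitions of Cohen's d; the text does not say which formula was used, so both are given, and the function names are illustrative.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation, weighting each group by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic assuming equal variances."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / standard_error


def cohens_d_pooled(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def cohens_d_rms(m1, sd1, m2, sd2):
    """Cohen's d using the root mean square of the two standard deviations."""
    return (m1 - m2) / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)


# Summary statistics reported in the text (secure vs. insecure children).
secure = (4.14, 1.36, 22)    # mean, SD, n
insecure = (2.94, 1.03, 17)

t_stat = student_t(*secure, *insecure)
d_pooled = cohens_d_pooled(*secure, *insecure)
d_rms = cohens_d_rms(secure[0], secure[1], insecure[0], insecure[1])
print(round(t_stat, 2), round(d_pooled, 2), round(d_rms, 2))  # 3.03 0.98 0.99
```

The magnitude of t (about 3.03) is close to the reported value of 3.021, with the small gap plausibly due to rounding of the published means and SDs; the sign depends only on the order in which the groups are subtracted. Either definition of d falls in the large range.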

Correlation analysis was used to explore the association between the scores obtained on the CAI subscales and the CRFS. Results are provided in **Table 3**.

Overall CRFS score correlated significantly with "Emotional openness" (r = 0.607), "Balance of references to Attachment Figures" (r = 0.382), "Use of examples" (r = 0.552), "Resolution of conflicts" (r = 0.472), and "Overall coherence" (r = 0.549). An inverse correlation was observed between Overall CRFS score and "Idealization of father" (r = −0.350), "Dismissal of mother" (r = −0.458), and "Dismissal of father" (r = −0.423). The children's ability to mentalize positive mental states was significantly positively associated with "Emotional openness" (r = 0.402) and "Use of examples" (r = 0.378), and significantly negatively associated with "Dismissal of mother" (r = −0.466) and "Dismissal of father" (r = −0.416). The children's ability to mentalize negative mental states was significantly positively associated with "Emotional openness" (r = 0.469), "Use of examples" (r = 0.445), and "Overall coherence" (r = 0.352), whereas it was significantly negatively associated with "Dismissal of mother" (r = −0.474) and "Dismissal of father" (r = −0.360). The ability of the children to mentalize mixed-ambivalent mental states was also significantly positively associated with "Emotional openness" (r = 0.416), "Use of examples" (r = 0.385), and "Overall coherence" (r = 0.898), whereas it was significantly negatively associated with "Dismissal of mother" (r = −0.431) and "Dismissal of father" (r = −0.369).

## Child Reflective Functioning and Maternal Reflective Functioning

Correlation analysis was also conducted to investigate the association of child RF with maternal RF. As reported in **Table 4**, a positive significant association emerged between child and maternal overall scores on RFS (r = 0.375). In particular, the children's overall RF score was associated with maternal ability to mentalize negative mental states (r = 0.348), as well as maternal ability to mentalize mixed-ambivalent mental states (ρ = 0.508).


TABLE 2 | Descriptive statistics for maternal scores on AAI subscales and correlations between maternal AAI subscales and child reflective functioning scores.

M, mean; SD, standard deviation; CRFS, Children's overall score reported on Child Reflective Functioning Scale; CPMS, Children's references to positive mental states in the context of RF; CNMS, Children's references to negative mental states in the context of RF; CMMS, Children's references to mixed-ambivalent mental states in the context of RF; Maternal Idealization M, maternal score on "Idealization toward mother" AAI subscale; Maternal Idealization F, maternal score on "Idealization toward father" AAI subscale; Maternal Overall Derogation, maternal score on "Overall Derogation" AAI subscale; Maternal Passivity, maternal scores on "Passivity" AAI subscale; Maternal Involving Anger M, maternal score on "Involving anger toward mother" AAI subscale; Maternal Involving Anger F, maternal score on "Involving anger toward father" AAI subscale; Maternal Coherence of Mind, maternal score on "Coherence of Mind" AAI subscale; <sup>∗</sup>p < 0.05.



M, mean; SD, standard deviation; CRFS, Children's overall score reported on Child Reflective Functioning Scale; CPMS, Children's references to positive mental states in the context of RF; CNMS, Children's references to negative mental states in the context of RF; CMMS, Children's references to mixed-ambivalent mental states in the context of RF; <sup>∗</sup>p < 0.05; ∗∗p < 0.01; ∗∗∗p < 0.001.

The maternal ability to mentalize mixed-ambivalent mental states was also significantly associated with the children's ability to mentalize positive (ρ = 0.325), negative (ρ = 0.426), and mixed-ambivalent (ρ = 0.434) mental states.

To explore the extent to which maternal security of attachment and maternal RF might predict RF in children, a stepwise regression analysis was performed using maternal "Coherence of mind," maternal overall RFS score, and maternal references to mixed-ambivalent mental states as predictors of the children's RF. The final models are shown in **Table 5**. The models account for approximately 21% of the variance in children's overall RF score, and about 22% of the variance in children's references to mixed-ambivalent mental states. Specifically, only maternal overall RFS score predicted children's overall RFS score (t = 3.082, p = 0.004), and only maternal ability to mentalize in mixed-ambivalent mental states predicted the corresponding ability in the children (t = 3.167, p = 0.003).

# DISCUSSION

## Child Reflective Functioning and Maternal Attachment

The children of mothers who showed a secure attachment model regarding the relationship with their own parents during their childhood reported higher levels of RF than did the children of mothers who were classified as insecure on the AAI. Child RF was positively associated with maternal "Coherence of the Mind" on the AAI and negatively associated with maternal derogation of attachment. No association was found between Child RF and maternal idealizing strategies in the context of the AAI. A negative association, albeit not statistically significant, was found between maternal "Involving anger toward mother" and child RF.

These findings were mostly consistent with our hypotheses, and replicated results from previous studies. Thus, support was given to the notion that the maternal coherent mental representation of her personal history, free from rigid defensive strategies, both maximizing and minimizing the importance of attachment relationships, allows the mother to freely access and process emotions in herself as well as in her child, in turn promoting the child's RF. Previous studies already found that securely attached mothers showed more emotional openness, whereas dismissing mothers were prone to minimize internalizing emotions in themselves as well as in their children, specifically by not being responsive to emotions of fear and sadness in their children (DeOliveira et al., 2005). The ability to accurately identify the child's emotions and to understand the causes of his/her distress was found to be related to attachment security, while experiences of neglect in childhood were found to be associated with an impairment of this maternal ability. Insecure women were less accurate in identifying emotions in children, and were more prone to negative attributions, and to be amused or neutral in the face of the child's distress (Leerkes and Siepak, 2006). In line with these findings, the results of the current study confirm that maternal derogation of attachment is specifically associated with impaired RF in children. A mother who derogates her emotional and attachment needs may be unable to be sympathetic with her child's emotional needs, and it could be argued that her empathetic deficit in turn weakens her child's ability to recognize, to pay attention to, and to place importance on mental states. It has been found that the maternal proneness to contemplate children's negative emotions predicted emotional understanding in children (Dunn and Brown, 2001) whereas maternal difficulties in understanding the child's mind predicted an impairment in the children's ability to identify and deal with negative emotions (Sharp et al., 2006).

#### TABLE 4 | Correlations between maternal and child reflective functioning scores.

CRFS, Children's overall score reported on Child Reflective Functioning Scale; CPMS, Children's references to positive mental states in the context of RF; CNMS, Children's references to negative mental states in the context of RF; CMMS, Children's references to mixed-ambivalent mental states in the context of RF; MRFS, Mothers' Overall Score reported on Reflective Functioning Scale; MPMS, Mothers' references to positive mental states in the context of RF; MNMS, Mothers' references to negative mental states in the context of RF; MMMS, Mothers' references to mixed-ambivalent mental states in the context of RF; <sup>∗</sup>p < 0.05; ∗∗p < 0.01.

Final model in bold; MMS, mixed-ambivalent mental states; MRFS, Mothers' Overall Score reported on Reflective Functioning Scale; CoM, Maternal Coherence of Mind; MMMS, mothers' references to mixed-ambivalent mental states in the context of RF.

It is noteworthy that the results of the current study highlighted that maternal derogation, rather than maternal idealization, was associated with the child's impairment in RF. We could assume that idealizing strategies have a less destructive influence on mentalization, possibly impairing only the awareness of hostile feelings toward the attachment figures rather than emotional awareness as a whole, and thereby damaging RF to a lesser degree. This finding suggests that the overall dismissing category might be confusing in that it includes different sub-classifications: DS1 and DS3 (based mostly on the idealizing strategy), and DS2 (based on the derogating strategy). Results from the current study suggest that it is the maternal dismissing strategy based on derogation of the attachment figures as well as of one's own attachment needs that has a more disruptive impact on the child's mentalization.

## Child Reflective Functioning and Attachment Security

A highly significant association was also found between child attachment security and child RF, thus replicating the results of the CRFS validation study (Ensink, 2004). Children who were rated as being more emotionally open, more able to balance positive and negative descriptions of their parents, more prone to support their assertions through examples, and more able to positively resolve conflicts with their parents showed better RF. On the contrary, children who more often resorted to idealization and dismissal toward their parents showed a lesser degree of RF. Moreover, it is remarkable that the child's dismissal strategy and not the child's idealizing strategy negatively correlated with the child's RF. Once again, findings from the current study highlighted the more disruptive influence of the dismissal strategy on the mentalizing ability. Notably, a very strong association was found between the score on the "Overall coherence" subscale and the child's ability to mentalize mixed-ambivalent mental states in the context of their family relationships. Thus, these results strongly support the relationship between attachment security and RF in the context of family relationships. Fonagy et al. (2010) recently reported specific associations between different attachment models and responses to the activation of the attachment system. Whereas secure individuals were able to maintain the mentalization and attachment systems simultaneously, dismissing individuals did not activate the attachment system, and preoccupied individuals showed strong activation of the attachment system and simultaneous deactivation of the mentalization system. Early studies assumed that since secure children feel an inner sense of emotional security in their relationship with their parents, they do not activate the attachment system and therefore are able to maintain an active mentalization system (Fonagy, 2006; Fonagy and Target, 2008). However, it was more recently hypothesized (Fonagy et al., 2010) that maternal mentalization mediated the relationship between secure attachment and mentalization in children.

# Child and Maternal Reflective Functioning

As expected, child and maternal RF were significantly and positively correlated with each other. Correlation analysis showed that, above all, maternal ability to mentalize negative as well as mixed-ambivalent mental states correlated with child RF. In particular, only maternal RF (and not maternal attachment security) predicted child RF, and only maternal ability to mentalize mixed-ambivalent mental states predicted the corresponding ability in the children. Thus, results from the present study add support to the hypothesis that maternal mentalization, more than maternal attachment security, promotes mentalizing ability in children.

According to Fonagy et al. (2010), the maternal ability not to be overwhelmed by the emotional experiences of the child, especially when they are intense and/or painful, and her ability to mirror them in a marked and contingent way (Gergely and Watson, 1996), enhance the child's ability to effectively regulate emotions, allowing him/her to keep both attachment and mentalization systems activated. On the basis of this hypothesis, emotional regulation, rather than secure attachment, would allow mentalization. In other words, effective emotional regulation, promoted by a mother who is able to mentalize even in conditions of increased arousal as well as in the context of negative and ambivalent mental states, mediates the relationship between attachment security and mentalization ability. Results of the current study seem to support the hypothesis put forth by Fonagy et al. (2007). They argued that particularly negative affects related to inevitable conflicts (provided they were moderate and experienced in the context of a good enough relationship) elicit the emergence of mentalization. At the same time, a good enough mother–child relationship provides the necessary emotional containment to promote the ability to mentalize. Our study suggests that mothers who are open to recognizing the emotional experience related to mixed-ambivalent mental states both in themselves and in their children, and to reflecting upon it without being overwhelmed or needing to deny or avoid it, are more able to promote the corresponding mentalizing ability in their children. However, further studies are needed to investigate whether and to what extent mothers with better mentalizing abilities use more mental state talk in conversations with their children, and whether and to what extent maternal mental state talk mediates the intergenerational transmission of RF.

Furthermore, findings from the current study provide a fresh contribution to the research in this field, in that previous studies investigating the relationship between maternal and child mentalization compared different components (e.g., mind-mindedness, emotional understanding, ToM, mental-state talk) of the multifaceted construct of mentalization in mothers and in children. To the best of our knowledge, this is the first study to compare the same operationalization of mentalization, namely RF, in mothers and their children by using narratives about attachment in close family relationships both for mothers and for children. A previous study (Ensink et al., 2015) investigated RF in mothers and children of about 10 years of age on average, by assessing maternal RF in the context of the PDI. Ensink's study differs from ours because in that context the authors specifically evaluated the maternal ability to mentalize the child, rather than the mother's ability to mentalize her own mental representations of her early attachment relationships. As Ensink et al. (2016) stated, taking into account maternal RF in the AAI, and not only in the PDI, is crucial because the mother's mentalization regarding her attachment experiences in childhood plays a critical role in her parenting. Maternal RF about her personal attachment history helps the mother to put herself in her child's shoes and be interested in his/her emotional experience and mental states. In addition, maternal RF might help the mother to understand what impact her feelings and thoughts could have on the child, thus preventing negative parenting.

Furthermore, the present work contributes to the study of the intergenerational transmission of RF in preadolescence, a rarely investigated developmental phase with regard to mentalization. As expected, we found a slightly increased frequency of the dismissing model in preadolescents. This finding, which is in line with previous studies (e.g., Weinfield et al., 2004; Doyle et al., 2009), could raise questions about the generalizability of the results of the current study. However, it is noteworthy that a significant association was observed between child and maternal RF, even in this developmental stage in which children are usually striving to achieve more autonomy.

The relatively small sample size (due in part to the very time-consuming measures of attachment model and RF) prevented us from investigating the association between the distinct models of insecure attachment, namely dismissing, preoccupied, and disorganized, and distinct impairments of RF. Lastly, in addition to the above-mentioned limitations of the study, it should be pointed out that only a very small number of the contacted families agreed to participate in the study. On the one hand, this was expected because of the very confidential and intimate nature of the measures that were used; on the other hand, it might be questioned whether and to what extent the sample could be considered representative of the population.

# CONCLUSION

The present study investigated whether, and to what extent, RF during preadolescence was associated with maternal attachment security and RF, and with the child's attachment security. Results yielded significantly positive associations between child RF, maternal attachment security, maternal RF as well as child attachment security. On the contrary, maternal derogation of attachment and children's dismissing strategies were associated with lower RF in children. Specifically, only maternal RF (and not maternal attachment security) predicted child RF, and only maternal ability to mentalize mixed-ambivalent mental states predicted the corresponding ability in the children.

# AUTHOR CONTRIBUTIONS

AR designed the study, coordinated data collection, performed the statistical analyses and prepared the first draft of the article. CA contributed to the search for references, coded the CAI transcripts, cooperated in performing the statistical analyses, and contributed to the final version.

# ACKNOWLEDGMENTS

The authors are grateful to Eleonora Abbondanza, Giulia Alloro, Davide Dondero, Cinzia Firpo, Alessandra Lombardo, Franca Pezzoni, Daniel Joy Pistarino, Sara Maggio, and Marta Tonelli for collecting, transcribing, and coding the interviews.

# REFERENCES


predictors of theory of mind understanding. Child Dev 73, 1715–1726. doi: 10.1111/1467-8624.00501


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Rosso and Airaldi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Social Cognition in Preschoolers: Effects of Early Experience and Individual Differences

#### Daniela Bulgarelli<sup>1,2</sup>\* and Paola Molina<sup>1</sup>

<sup>1</sup> Department of Psychology, Università degli Studi di Torino, Torino, Italy, <sup>2</sup> CHILD, Collegio Carlo Alberto, Moncalieri, Italy

Social cognition is the way in which people process, remember, and use information in social contexts to explain and predict their own behavior and that of others. Children's social cognition may be influenced by multiple factors, both external and internal to the child. In the current study, two aspects of social cognition were examined: Theory of Mind and Emotion Understanding. The aim of this study was to analyze the effects of type of early care (0–3 years of age), maternal education, parents' country of birth, and child's language on the social cognition of 118 Italian preschoolers. To our knowledge, the joint effect of these variables on social cognition has not previously been investigated in the literature. The measures used to collect social cognition and linguistic data were not parent- or teacher-reports, but were based on direct assessment of the children through two standardized tests, the Test of Emotion Comprehension and the ToM Storybooks. Relationships among the variables showed a complex pattern. Overall, maternal education and linguistic competence showed a systematic effect on social cognition; linguistic competence mediated the effect of maternal education. In children who had experienced centre-based care in the first 3 years of life, the effect of maternal education disappeared, supporting the protective role of centre-based care for children with less educated mothers. The children with native and foreign parents did not significantly differ on the social cognition tasks. Limits of the study, possible educational outcomes and future research lines are discussed.

Keywords: Theory of Mind, emotion understanding, childcare, language, maternal education, parents' country of birth

# INTRODUCTION

Social cognition is the way in which individuals process, remember, and use information in social contexts to explain and predict how people behave (Fiske and Taylor, 2013). In the current study, two aspects of social cognition were examined: Theory of Mind (ToM) and Emotion Understanding (EU). ToM concerns the attribution of mental states (beliefs, desires, intentions, etc.) to oneself and others, and the ability to use these attributions to understand, predict and explain one's own behavior and that of other people (Mitchell, 1997). EU, on the other hand, is a component of social cognition and emotional competence, which concerns how individuals understand, predict, and explain their own and others' emotions (Harris, 1989; Denham, 1998; Saarni, 1999).

#### Edited by:

Markus Paulus, Ludwig Maximilian University of Munich, Germany

#### Reviewed by:

Andrea Saffran, Ludwig Maximilian University of Munich, Germany Anna Amadó Codony, University of Girona, Spain

> \*Correspondence: Daniela Bulgarelli daniela.bulgarelli@unito.it

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 28 August 2016 Accepted: 26 October 2016 Published: 14 November 2016

#### Citation:

Bulgarelli D and Molina P (2016) Social Cognition in Preschoolers: Effects of Early Experience and Individual Differences. Front. Psychol. 7:1762. doi: 10.3389/fpsyg.2016.01762

From a theoretical point of view, ToM and EU are partly correlated. In Pons and Harris' (2000) view, EU is made of nine components hierarchically organized. The simplest ones are recognition of emotional expressions and external causes of emotion, followed by the role of desire, beliefs and external reminder on emotions, emotion regulation, displayed emotions, role of moral dimension and mixed emotions. In Wellman's (1990) approach, basic ToM in childhood consists of five components: recognition of emotion expressions and external causes of emotion, understanding of desire and beliefs, ability to distinguish between physical and mental entities, and awareness of the link between perception and knowledge. Thus, the external features of emotions are necessary to read and predict people's internal states, while beliefs and desires can shape emotions. The correlation between ToM and EU is also supported by research outcomes (Hughes and Dunn, 1998; Cutting and Dunn, 1999; Pears and Fisher, 2005).

The current study focused on some factors, both external and internal to the child, that can influence social cognition abilities in a group of Italian pre-schoolers: the role of early type of care on ToM and EU was examined together with the effects of other intervening variables such as maternal education, parents' country of birth, and linguistic competence. The literature reviewed in what follows shows that the effect of type of care on social cognition has not yet been studied, that a complex interplay among these factors can be expected, and that a study taking these variables into account simultaneously is necessary. This study focused primarily on early type of care and other variables that are closely related to it; some other factors that could influence the development of children's social cognition, such as socio-economic status (Shatz et al., 2003; Cutting and Dunn, 1999; Meins et al., 2013), cognitive functioning and executive functioning (De Stasio et al., 2014; Schneider et al., 2014), were not examined in depth.

In early childhood, toddlers receive two main types of care: centre-based and home-based. In centre-based care, children experience daily life in a group setting with adults and peers, and routines, spaces and toys are organized for a group of children and adults; in addition, the adults providing the care are trained professionals. In home-based arrangements, children are more likely to be alone with adults or to share routines and toys with a very small number of other children, usually younger or older siblings. In these informal settings, caretakers are usually mothers, grandparents or non-professional baby-sitters (for a broader discussion, see Bulgarelli and Molina, 2016).

The literature emphasizes that type of care is associated with children's later development, reporting positive effects of centre-based care on cognitive and linguistic outcomes (Broberg et al., 1997; NICHD Early Child Care Research Network, 2002, 2004, 2006; Sylva et al., 2004; Belsky et al., 2007; Loeb et al., 2007; Magnuson et al., 2007; Hansen and Hawkes, 2009). With regard to more general social behavior, centre-based care appears to be related to teacher-reported externalizing problems in preschool and school age children (NICHD Early Child Care Research Network, 2002, 2005). A study on Canadian families showed that maternal care acts as a protective factor in the first year of life as compared to non-maternal care (provided by relatives, non-relatives, day care centres, etc.): parent-reported physical aggression and emotional problems at 4 years of age were lower in children from low-risk families who had been in maternal care (Côté et al., 2008). In the US, on the other hand, high quality centre-based care has been found to protect against internalizing and externalizing behavior problems in preschoolers from low-income families (Votruba-Drzal et al., 2004). Thus, although different types of care have been shown to affect the cognitive and social dimensions of children's development, as far as we are aware, no studies to date have examined the relationship between social cognition at preschool age and the type of care received during early childhood. Maternal education predicted centre-based care usage in several countries: Norway (Zachrisson et al., 2013), Finland and West Germany (Krapf, 2014), Belgium (Vandenbroeck et al., 2008), UK (Sylva et al., 2007), Italy (Del Boca et al., 2005) and US (NICHD Early Child Care Research Network, 1997a, 2006). Moreover, maternal education is the most robust sociodemographic predictor of mother and infant behavior (Bornstein et al., 2003; Mistry et al., 2008).

Previous research has shown that children's social-cognitive development is positively associated with parental education level (Perner et al., 1994; Cutting and Dunn, 1999; Pons et al., 2003). In the UK and US, maternal education is positively associated with cognitive and linguistic outcomes (NICHD Early Child Care Research Network, 1997b; NICHD Human Learning Branch, 1998; Peisner-Feinberg et al., 2001; Sammons et al., 2004). Similarly, Italian children's cognitive and linguistic competence has been found to be systematically related to maternal education (Bulgarelli and Molina, 2016). Moreover, type of care has been shown to moderate the maternal education effect in preschool and school-aged children: specifically, linguistic and cognitive outcomes improve in line with level of maternal education only in children who receive home-based care, indicating that centre-based care can play a protective role in the first 3 years of life (Bulgarelli and Molina, 2016). For these reasons, when examining the role of early type of care in children's social cognition, it is crucial to take the effect of maternal education into consideration as well.

Some studies reported that migrant status is related to type of care, specifically by predicting lower utilization of centre-based care (Sammons et al., 2004; Turney and Kao, 2009; Miller et al., 2013, 2014; Zachrisson et al., 2013); however, it is worth noting that other studies did not find this relationship (Kahn and Greenberg, 2010; Krapf, 2014). A migrant is defined in the United Nations Educational, Scientific and Cultural Organization Glossary (2016) as "any person who lives temporarily or permanently in a country where he or she was not born, and has acquired some significant social ties to this country"; the parents of first-generation children are both migrants. Social cognition is partly affected by culture (for a review, see Molina et al., 2014), but migrant status is more than a question of cultural belonging: it is a condition with specific features related to entering a new social context, for example, separation from one's family of origin, changes in economic status, negative stereotypes and discrimination, language barriers and higher levels of stress. Very often, the migrant condition combines with other variables that affect children's development, such as poverty status and dual language learning, whereby children acquire both their parents' mother tongue and the language of the host country (De Feyter and Winsler, 2009; Winsler et al., 2014). A Canadian study by Wade et al. (2014) showed that ToM performance at 5 years was predicted by children's language competence, but not by family income, migrant status or the presence of siblings in the household. Another study by the same research group (Prime et al., 2015) showed that mothers' communicative clarity and mind-reading skills (termed cognitive sensitivity) were positively related to children's ToM at 5 years, and to receptive language and academic achievement at preschool age. This pattern of associations between mothers' cognitive sensitivity and children's outcomes was similar in both native and migrant dyads of mothers and children, suggesting that the underlying process was similar. Nevertheless, migrant status appeared to be a risk factor, because it was negatively associated with maternal cognitive sensitivity. In keeping with the findings of Prime et al. (2015), U.S. immigrant mothers have been shown to report higher levels of parenting stress than native mothers, with stress predicting aggressive behavior in pre-school age children (Mistry et al., 2008).

The theoretical framework outlined so far highlights that examining the relationship between social cognition development and early type of care requires a focus on other intervening variables, such as maternal education and migrant condition, which in turn are related to linguistic issues. Moreover, social cognition and linguistic competence are also "directly" associated with one another. A meta-analysis by Milligan et al. (2007) reported that the predictive correlations between language and ToM were significant, even after controlling for age. When linguistic tasks were administered at an earlier time-point than ToM tasks, the correlations were higher than under the opposite condition, suggesting that the influence of language on ToM is stronger than the influence of ToM on language (Milligan et al., 2007). It may be that an overarching developmental factor such as working memory (Astington and Jenkins, 1999) or executive functioning (Carlson and Moses, 2001) influences both competences. Multiple aspects of linguistic competence may be interrelated with ToM: lexicon (for instance, Lohmann and Tomasello, 2003), syntax (de Villiers and Pyers, 2002) and conversational experience (Harris, 2005; Deleau, 2012). In the literature, debate is ongoing concerning the specific contribution to ToM of the different components of language competence. In the context of this discussion, Miller (2004) has proposed the performance hypothesis, which postulates that the influence of linguistic competence on performance on ToM tasks is affected by the linguistic complexity of the ToM task itself; evidence in support of this hypothesis has also come from a study by Bulgarelli and Molina (2013). For a wider discussion of these topics, see Bulgarelli and Molina (2013).

The current study examines the role of early type of care, maternal education, parents' country of birth, and children's linguistic competence in the social cognition of a group of Italian pre-schoolers: the reviewed literature showed that a complex interplay among these factors can be expected; thus, it is worth investigating them together in one study. Moreover, as reported earlier in the Introduction, previous studies focused on the effect on social behavior: to our knowledge, our study is the first to analyze the role of type of care on ToM and EU. Finally, social behavior was usually measured through parent- or teacher-reported questionnaires (NICHD Early Child Care Research Network, 2002, 2005; Votruba-Drzal et al., 2004; Côté et al., 2008). Parents could be considered reliable observers when they are requested to evaluate children's behaviors: they have a privileged perspective on their child's development and can observe the child over time and in a familiar environment (Matheny et al., 1984). Nevertheless, parents are not trained observers: their judgment may be biased by social desirability, they may be incapable of perceiving their children's real competence (Fenson et al., 1994), and social representations of childhood may play a role in distorting adults' observations and reducing the reliability of the measures (for a wider discussion of this topic, see: Molina and Bulgarelli, 2012b). It is also worth noting that children's social cognition involves internal states that are not always directly observable: thus, parents may not be accurate in evaluating this competence (Kårstad et al., 2014). For these reasons, in the current study social cognition was measured directly with the children, through standardized tools that are internationally used to assess ToM and EU.

The current study focused on four research questions, mainly deduced from the literature. The first question related to the effects of type of early childcare on social cognition: given that this was the first study to investigate such a question, we relied on earlier findings reported by Bulgarelli and Molina (2016) concerning cognitive outcomes to formulate the first hypothesis, predicting that type of care would only yield an effect in interaction with maternal education: specifically, higher maternal education would positively affect children's social cognition only in those who had been in home-based care in the first 3 years of life. The second question concerned the role of maternal education in social cognition, and we expected that maternal education would directly affect children's social cognition, in line with the literature reviewed above (Perner et al., 1994; Cutting and Dunn, 1999; Pons et al., 2003). The third question concerned the role of parents' country of birth: in keeping with the existing literature, no direct effect of this variable on social cognition was expected (Wade et al., 2014; Prime et al., 2015). Finally, the fourth question related to the role of the child's language: in line with earlier studies reported in the literature, linguistic competence was expected to be directly associated with social cognition and also to be associated with maternal education (NICHD Early Child Care Research Network, 1997b; NICHD Human Learning Branch, 1998; Peisner-Feinberg et al., 2001; Sammons et al., 2004; Milligan et al., 2007; Bulgarelli and Molina, 2016); we therefore set out to analyze the possible joint effect of maternal education and linguistic competence on social cognition.

#### MATERIALS AND METHODS

#### Sample

The sample comprised 118 typically developing children (average age = 59.6 months, SD = 10.4, range: 38.5–76.7 months; average IQ = 99.6, SD = 13.5), all of them attending kindergartens in Turin (Italy): see **Table 1**. Data were collected between 2009 and 2012; most of the children in the current study also took part in earlier reported research by Bulgarelli and Molina (2016).

Sixty-four children were girls (54.2%). A t-test analysis confirmed that the two subsamples of boys and girls were similar with respect to age (p = 0.449), IQ (p = 0.174), type of early childcare received (p = 0.530), maternal education (p = 0.187), parents' country of birth (p = 0.650) and verbal quotient (VQ; p = 0.450).

With regard to education, 53 mothers had completed lower secondary school (44.9%), 52 held an upper secondary school diploma (44.1%) and 13 were university graduates (11.0%). Overall, the sample displayed a slightly lower level of educational achievement than the Italian population between 25 and 64 years of age in 2011, in which 44% had completed lower secondary education, 41% upper secondary education, and 15% tertiary education (OECD, 2014). For the purposes of the statistical analysis, the groups of mothers with upper secondary and university-level education were collapsed into one group termed the "highly educated group," after it had been verified that they did not significantly differ in relation to the independent variables in the research design. A t-test analysis confirmed that the two final subsamples of children, with less educated and more highly educated mothers, respectively, were similar in terms of age (p = 0.644), gender (p = 0.784), type of care (p = 0.116) and parents' country of origin (p = 0.163). The VQ scores of the children with more highly educated mothers were significantly higher than those of the children whose mothers had completed a lower level of education, and their IQ scores were marginally higher (IQ: mLOW = 96.98, mHIGH = 101.78, tIQ = −1.94, p = 0.055; VQ: mLOW = 76.70, mHIGH = 84.05, tVQ = −3.18, p = 0.002).

With regard to parents' country of birth, 92 of the children had two native-born parents (77.9%); 14 had one foreign-born parent (11.9%) and 12 two foreign-born parents (10.2%). In our sample, the percentage of children with two foreign-born parents was slightly lower than in the Italian population (14.5%) and the percentage of children from mixed couples was higher (4.9% in the Italian population; Istat, 2012b). For the purposes of the statistical analysis, the groups of children with two native-born parents and one native-born parent were collapsed into a single subsample labeled native children, after it had been verified that these two groups did not differ in relation to the independent variables in the current research design. A t-test analysis confirmed that the two final subsamples, composed of children with at least one native-born parent and first-generation children with two foreign-born parents, respectively, were similar with respect to age (p = 0.433), IQ (p = 0.104), VQ (p = 0.319), gender (p = 0.627), type of early childcare (p = 0.402) and maternal education (p = 0.166).

In relation to type of care, in the first 3 years of life 54 children had received centre-based care (45.8%) and 64 children had been in exclusively home-based care. Home-based care had consisted of either exclusive maternal care or being looked after by other family members or babysitters. In 2010/11, 14.0% of Italian children between 0 and 2 years of age were enrolled in centre-based care, with marked differences among the different geographical regions: for instance, in the North, 29.4% of children attended day care in Emilia Romagna and 15.4% in Piemonte, while in the South, percentages varied from 9.6% in Abruzzo to 2.4% in Calabria (Istat, 2012a). A t-test analysis confirmed that the two subsamples of children who had received home-based care and centre-based care were similar with respect to age (p = 0.852), IQ (p = 0.276), VQ (p = 0.136), gender (p = 0.530) and parents' country of birth (p = 0.215), but differed significantly in relation to maternal education: highly educated mothers were more likely to choose centre-based care (p = 0.021).

#### Measures and Procedures

At three separate sessions conducted within a month of each other, the children were individually assessed at kindergarten using four standardized tests: the ToM Storybooks (Molina and Bulgarelli, 2012a; Bulgarelli et al., 2015) were used to assess ToM and the Test of Emotion Comprehension (TEC, Pons and Harris, 2000; Albanese and Molina, 2013) to assess EU; the Leiter-R (Roid and Miller, 2002; US version: 1997) was used to assess non-verbal IQ; the Peabody Picture Vocabulary Test (Dunn and Dunn, 1981), in its Italian version (Stella et al., 2000) was used to assess receptive language, reported as Verbal Quotient (VQ).

The ToM Storybooks is a comprehensive 93-item instrument tapping the five components in Wellman's (1990) model of ToM: emotion recognition, understanding of desire and beliefs, ability to distinguish between physical and mental entities, and awareness of the link between perception and knowledge; a classical False Belief task is also included. The total score varied from 0 to 111; in this study the total score was used because the standardization of the test is still ongoing. The ToM Storybooks is made up of six full-picture books telling stories about a boy called Sam. Each book recounts an adventure of Sam's (Sam going to the swimming pool, visiting his grandparents, etc.) and contains 5 or 6 tasks assessing one or more ToM components. The experimenter reads the story while the child looks at the images. In one of the tasks that tap the role of desire in generating behaviors, Sam is searching for his dog: "Where is Puckie? Puckie has hidden [point the picture] behind the tree or [point the picture] behind the trash can. Sam wants to play with Puckie. First, he goes to look behind the trash can. But Puckie is not there. What will Sam do now?"

The TEC evaluates nine hierarchically organized components of EU that emerge between 3 and 11 years. The simplest ones are recognition of emotional expressions and external causes of emotion, followed by the role of desire, beliefs and external reminder on emotions, emotion regulation, displayed emotions, role of moral dimension and mixed emotions. The TEC raw score varied from 0 to 9 and in this study the Italian standardized z-score was used. Each TEC component is presented in the frame of a short illustrated story; the child answers the task questions by indicating the facial expression of the correct emotion, according to what happened. For example, in the displayed emotion task, in which the difference between apparent and real emotion is tapped, the experimenter reads this story: "This is Sarah and this is Dorothy. Dorothy is teasing Sarah because Dorothy has lots of marbles and Sarah doesn't have any. Sarah is smiling because she doesn't want to show Dorothy how she is feeling inside. How is Sarah feeling inside? Is she happy, alright, angry or scared?"

Parents were asked to complete a questionnaire on their sociodemographic background, which assessed both parent-related characteristics (country of birth, level of education) and child-related characteristics (country of birth, gender, siblings, type of childcare during the first 3 years of life). Thus, the data concerning the type of early care received in the first 3 years of life were collected retrospectively.

Mothers' level of education was coded in terms of the Italian school system: (0) lower level of education (i.e., mothers had obtained a low school degree, corresponding to a maximum of 8 years' school); (1) more highly educated (i.e., mothers had attended at least 13 years of school/university, with high school, bachelor's, master's and doctoral degrees all collapsed together into a single category).

For each child, parents' country of birth was coded as follows: (0) native children (i.e., two native-born parents or one native-born and one foreign-born parent); (1) first-generation children (i.e., two foreign-born parents).

# Analysis

T-tests for small sample sizes were performed to check for significant differences in the children's ToM and EU scores as an effect of type of early childcare, parents' country of birth and maternal education. The direct effect of language on social cognition was assessed by analyzing the correlations between linguistic competence scores and ToM and EU scores, respectively.
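The group comparison described above can be sketched as follows. This is a minimal illustration that assumes a Student's t-test with pooled variance and Cohen's d as the effect size; the text does not specify which variant of the test was used. The function name `pooled_t_and_d` and the scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def pooled_t_and_d(group_a, group_b):
    """Return (t, df, d) for two independent groups, using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    diff = mean(group_a) - mean(group_b)
    t = diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    d = diff / math.sqrt(pooled_var)  # Cohen's d with the pooled SD
    return t, df, d


# Hypothetical ToM total scores (illustration only, NOT study data).
higher_edu = [62, 70, 58, 75, 66, 71]
lower_edu = [55, 60, 49, 63, 57]

t, df, d = pooled_t_and_d(higher_edu, lower_edu)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

Reporting d alongside t is useful with small groups, because d does not shrink when the sample is small, whereas the t statistic (and hence the p value) does.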

To test for interactions between maternal education and type of early childcare or parents' country of birth, separate t-tests for the effect of maternal education on ToM and EU were performed on the type of care and parental country of origin subsamples. No ANOVA was run because the sample was too small to test the interaction effects; for the same reason, a regression analysis testing the interactions was not run either.

The mediating effect of linguistic competence was investigated by conducting two regression analyses, with ToM and EU scores as dependent variables and maternal education and parents' country of birth as independent variables.

# RESULTS

# Direct Effect of Individual Variables

Type of early childcare did not lead to significant differences in ToM and EU, nor was the effect size relevant (**Table 2**). Maternal education was found to have a significant direct effect on ToM and EU scores. First-generation children obtained the lowest mean scores on the social cognition measures: although these scores did not significantly differ from those of the other children, the effect size was relevant (**Table 2**). Finally, linguistic competence was found to be correlated with both ToM and EU scores (r = 0.503, p < 0.01 and r = 0.406, p < 0.01, respectively).

# Interaction among Variables: The Role of Type of Care

Type of care and maternal education were found to interact, in that maternal education had an effect on the social cognition abilities of children who had received home-based care only, but not on those of children who had been in centre-based care. More specifically, children whose mothers had completed a lower level of education only obtained significantly lower ToM scores than children with more highly educated mothers when they had received exclusively home-based care in the first 3 years of life (**Table 3**).

# Interaction among Variables: The Role of Parents' Country of Birth

Parental country of birth and maternal education were found to interact: namely, maternal education had an effect on the social cognition abilities of children with native-born parents, but not on those of first-generation children. More specifically, children whose mothers had completed a lower level of education only obtained significantly lower ToM and EU scores than children with more highly educated mothers when both parents were native-born (**Table 4**). Nevertheless, considering the effect size, the differences due to parents' country of birth were smaller than the differences observed with respect to maternal education.

# Interaction among Variables: The Role of Linguistic Competence

With regard to the role of linguistic competence, both a direct effect of language on ToM and EU scores and a mediation effect of language on the relationship between maternal education and ToM and EU were found.

With respect to ToM (**Figure 1**), the correlation between maternal education and language ability scores was 0.283 (p < 0.01), the partial correlation between linguistic competence and ToM scores (after controlling for the effect of maternal education) was 0.465 (p < 0.01), while the direct correlation between maternal education and ToM scores was 0.269 (p < 0.01), and this correlation was reduced if the language effect was considered (Beta = 0.137, NS).

TABLE 3 | Differential effects of maternal education as a function of home- versus centre-based early childcare.

TABLE 4 | Differential effect of maternal education in children as a function of having native-born versus foreign-born parents.

Turning to EU, the same pattern of results was found (**Figure 1**): the partial correlation between linguistic competence and EU scores (while controlling for the effect of maternal education) was 0.380 (p < 0.01), the direct correlation between maternal education and EU scores was 0.235 (p < 0.05), and this correlation was reduced if the linguistic competence effect was considered (Beta = 0.127, NS).
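The reduction of the maternal education effect once language is taken into account can be roughly checked from the zero-order correlations reported above. The sketch below uses the standard two-predictor formulas for a standardized regression coefficient and a partial correlation. It assumes that all correlations were computed on the same cases, which the text does not state, so small discrepancies with the reported values are to be expected. The function and variable names are illustrative.

```python
import math


def standardized_beta(r_xy, r_xm, r_my):
    """Standardized coefficient of X predicting Y when M is also in the model."""
    return (r_xy - r_xm * r_my) / (1 - r_xm ** 2)


def partial_r(r_my, r_xm, r_xy):
    """Correlation between M and Y after partialling X out of both."""
    return (r_my - r_xm * r_xy) / math.sqrt((1 - r_xm ** 2) * (1 - r_xy ** 2))


# Zero-order correlations as reported in the text (total sample).
r_edu_lang = 0.283  # maternal education and linguistic competence
r_edu_tom = 0.269   # maternal education and ToM
r_lang_tom = 0.503  # linguistic competence and ToM
r_edu_eu = 0.235    # maternal education and EU
r_lang_eu = 0.406   # linguistic competence and EU

beta_tom = standardized_beta(r_edu_tom, r_edu_lang, r_lang_tom)
partial_tom = partial_r(r_lang_tom, r_edu_lang, r_edu_tom)
beta_eu = standardized_beta(r_edu_eu, r_edu_lang, r_lang_eu)

print(round(beta_tom, 3), round(partial_tom, 3), round(beta_eu, 3))
```

Under these assumptions the recomputed coefficients come out at about 0.14 (ToM) and 0.13 (EU) for maternal education, and about 0.46 for the partial correlation between language and ToM, which is close to the reported values of 0.137, 0.127, and 0.465.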

When these correlations were analyzed separately in the two groups of children who had received home-based only versus centre-based care, the pattern of results differed (**Figure 1**). With respect to children in home-based care, two significant direct correlations were found: between maternal education and linguistic competence (r = 0.422, p < 0.01) and between maternal education and ToM scores (r = 0.347, p < 0.01); furthermore, linguistic competence and ToM scores were correlated when controlling for maternal education (Beta = 0.529, p < 0.01). Moreover, linguistic competence mediated the relationship between maternal education and ToM: in fact, the direct correlation between maternal education and ToM was reduced if the linguistic competence effect was considered (Beta = 0.124, NS). With respect to children in centre-based care, the correlation between linguistic competence and ToM was the only significant relationship identified (Beta = 0.379, p < 0.01), with no correlations found between maternal education and ToM or between maternal education and language.

A similar pattern of results was found for EU (**Figure 1**): in the subsample of children who had received home-based care only, there were direct correlations between maternal education and linguistic competence (r = 0.422, p < 0.01), and between maternal education and EU scores (r = 0.271, p < 0.05); linguistic competence correlated with EU (Beta = 0.500, p < 0.01), and mediated the relationship between maternal education and EU: more specifically, the direct correlation between maternal education and EU was reduced if the linguistic competence effect was taken into account (Beta = 0.061, NS). In contrast, in children in centre-based care the only significant relationship identified was the correlation between linguistic competence and EU (Beta = 0.245, p < 0.01).

When parents' country of birth was included in the analysis, only linguistic competence was strongly correlated with ToM and EU, in both migrant parent and native-born parent subgroups (**Table 5**). However, no mediation effect was found: parents' country of birth was not correlated with language (r = 0.113, NS for the total sample; r = 0.055, NS for children in home-based care; and r = 0.154, NS for children in centre-based care), nor with ToM (r = 0.101, NS for the total sample; r = 0.061, NS for children in home-based care; and r = 0.154, NS for children in centre-based care), nor with EU (r = 0.125, NS, for the total sample; r = 0.071, NS for children in home-based care; and r = 0.106, NS for children in centre-based care).

#### DISCUSSION AND CONCLUSION

The aim of this study was to contribute to the debate about the effects of type of early childcare, maternal education, parents' country of birth, and child's linguistic competence on children's social cognition as observed at preschool age by analyzing Italian data. We analyzed two specific social cognition abilities, ToM and EU, finding them to display a systematically similar pattern of relationships with the independent variables under study.

Interestingly, in our study type of early childcare did not have a direct effect on social cognition and, as predicted according to the first hypothesis, interacted with maternal education: the ToM and EU scores of children who received their early childcare in the home were affected by maternal education, whereas this was not the case for children in centre-based care. It seemed that centre-based care could play a protective role for children with lower-educated mothers: on one hand, professionals provide stimulating contexts and intentional educational practice (for a wider discussion, see: Molina, 2016; Molina et al., 2016); on the other hand, in day care services children experience stable and numerous relationships with peers that could foster ToM development, although the debate is still open about the positive effect of the presence of siblings in the family and peers in kindergarten, which has been observed in some studies (McAlister and Peterson, 2007; Wang and Su, 2009) but not in others (Das and Babu, 2004; Molina and Bulgarelli, 2012a).

Pearson correlation (Two-tailed): <sup>∗</sup>p < 0.05, <sup>∗∗</sup>p < 0.01.

According to the second hypothesis, maternal education was found to have a direct effect on ToM and EU: evidence of the effect of maternal education on social cognition has been found in other studies (Perner et al., 1994; Cutting and Dunn, 1999; Pons et al., 2003) as well as in our own earlier study on cognitive outcomes (Bulgarelli and Molina, 2016). A possible explanation for the positive effect of maternal education on children's development could lie in mothers' greater awareness of the importance of the quantity and quality of time spent with their offspring: higher-educated parents spend more time with their children than lower-educated parents; they are more aware of the link between spending time with their children and their future development; and are more likely to internalise and implement the social norms and behaviors associated with "involved parenting" (Sayer et al., 2004; Craig, 2006; Monna and Gauthier, 2008). Moreover, the development of social cognition is specifically supported by parents' ability to mentalise: mind-mindedness is defined as adults' tendency to comment appropriately on their children's internal states and it plays a protective role for children's social development, specifically in low socioeconomic families (Meins et al., 2002, 2013).

In line with the Canadian study by Wade et al. (2014) and the previous Italian study on cognitive outcomes (Bulgarelli and Molina, 2016), an effect of parents' country of birth on children's social cognition was not expected. Nevertheless, the results were partly different: no significant differences were found between children with native and foreign parents, but the effect size of the difference between the two groups was not negligible (0.45 and 0.42 for ToM and EU scores, respectively). Thus, the lack of significance of the difference could be due to the insufficient power of the statistical test, taking into consideration that the sample was highly unbalanced in favor of children with native parents. Similarly, the very small number of first-generation children could explain the lack of differences due to maternal education observed in this subsample.
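The point about statistical power in an unbalanced sample can be illustrated with a rough calculation. The sketch below uses a normal approximation to the power of a two-sided, two-sample test for a given standardized effect size. The group sizes are purely hypothetical, chosen only to contrast an unbalanced split with a balanced one for the same total N; they are not the sizes of the groups in this study. The function name `approx_power` is likewise an assumption.

```python
import math
from statistics import NormalDist


def approx_power(d, n_1, n_2, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample test for effect size d."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Non-centrality: the effect size scaled by the effective sample size.
    ncp = d * math.sqrt(n_1 * n_2 / (n_1 + n_2))
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)


# HYPOTHETICAL group sizes (illustration only), same total N = 100.
unbalanced = approx_power(0.45, 15, 85)
balanced = approx_power(0.45, 50, 50)

print(f"unbalanced: {unbalanced:.2f}, balanced: {balanced:.2f}")
```

For a fixed total sample, the term `n_1 * n_2 / (n_1 + n_2)` is largest when the groups are equal, so an effect of the same size is harder to detect when one group is much smaller than the other.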

As expected based on the literature (for instance, Milligan et al., 2007), the fourth hypothesis was confirmed: linguistic competence directly affected social cognition. On analyzing the role of children's linguistic competence, which is related to both children's social cognition and maternal education, linguistic competence was shown to mediate the maternal education effect on social cognition, but only in children in home-based care. As stated before, professional care appeared to play a protective role for children with less educated mothers. The protective role of early type of care was less clear when considering the two groups of children with native and foreign parents: in this case, the linguistic competence seemed the relevant aspect to differentiate children's performances in the social cognition tasks. In sum, when not correlated with maternal education, language was the variable that mainly correlated with the ToM and EU scores. More highly educated mothers had children with greater linguistic competence, but centre-based care in the early years compensated for this difference. As previously discussed elsewhere (Bulgarelli and Molina, 2016), designing educational intervention and training professionals to better support children's linguistic development from the early years of life seem crucial: day care services are the context where such support could be better provided (Scopesi and Viterbori, 2008; Molina et al., 2016) and such intervention could be crucial for children with two foreign-born parents.

The role of linguistic competence in shaping the differences among children's social cognition performances should be interpreted with caution, because children's performance on the ToM and EU tasks was also affected by the linguistic format of the task itself (Miller, 2004). Moreover, this study focused on receptive language: this measure was chosen because it is a good index of children's general linguistic competence yet easy and fast to assess; nevertheless, language is a complex construct and future research could examine in greater depth the role of other linguistic aspects, such as syntax and conversational ability.

With regard to the limits of the current study, the quasi-experimental design requires the results to be interpreted with caution. The sample was recruited in a specific Italian region: this guaranteed a higher homogeneity of social influence on our sample, but limited the generalizability of the results to the Italian population. Italian children's ToM and EU showed a specific pattern of development compared to British and German children (Lecce and Hughes, 2010; Molina et al., 2014): thus the generalizability of the pattern of the current results to other Western countries should be specifically tested. Furthermore, the sample included a relatively low number of children with two foreign-born parents, which did not allow a multiple regression analysis to be performed to test the interaction of the independent variables; nevertheless, the percentage of this group of subjects was in line with the percentage of children with two foreign parents living in Italy in the period when the data were collected. It is worth noting that the sample was balanced between medium-low and medium-high socio-economic status, avoiding the biases due to the difficulty in enrolling low socio-economic families. The pattern of the effect of type of early childcare, maternal education, and parents' country of birth on social cognition was similar to that observed in a previous study in which verbal and cognitive competence were the dependent variables (Bulgarelli and Molina, 2016) and this could be read as partial support for the validity of the current research.

Nevertheless, this pattern of effects might be limited to preschool age, and further investigation with older children is needed. In future research, it would be of interest to explore the role of cognitive functioning and gender in greater depth, together with an index of quality of the type of care in early infancy.

To our knowledge this is the first study to have investigated together the role of early childcare, maternal education, parents' country of origin and children's receptive language in the development of social cognition. Type of care, in interaction with maternal education and children's linguistic competence, affected social cognition and early centre-based care seemed to play a protective role for those children with lower-educated mothers. The protective role of centre-based care was less clear when considering the effect of parental country of birth and further research is needed.

#### ETHICS STATEMENT

The Comitato di Bioetica dell'Ateneo (the Committee) approved the current research, which was conducted with voluntary human participants. The Committee approved: 1) the design of the research and the assessment tools; 2) the sample recruitment criteria; 3) the procedure for the collection of the informed consent form. The study involved preschool-aged children: their parents' consent for participating in the study was collected. The children were observed at school, after an agreement with the school Director and the teachers. Each child gave her or his personal oral consent to participate in the study assessment.

#### AUTHOR CONTRIBUTIONS

DB and PM substantially contributed to the conception of the work and to the acquisition, analysis, and interpretation of data of the current study. DB and PM wrote the manuscript and revised it critically, adding important intellectual content. DB and PM approved the final version of the manuscript to be published. DB and PM agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

## FUNDING

The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 320116 for the research project FamiliesAndSocieties.

# ACKNOWLEDGMENT

We are grateful to the kindergarten staff for their support and to the parents and children who participated in the study.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer AS and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Bulgarelli and Molina. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Beyond Conceptual Knowledge: The Impact of Children's Theory-of-Mind on Dyadic Spatial Tasks

#### Karine M. P. Viana<sup>1</sup>\*, Imac M. Zambrana<sup>2,3</sup>, Evalill B. Karevold<sup>1</sup> and Francisco Pons<sup>1</sup>

<sup>1</sup> Department of Psychology, University of Oslo, Oslo, Norway, <sup>2</sup> Department of Special Needs Education, University of Oslo, Oslo, Norway, <sup>3</sup> The Norwegian Center for Child Behavioral Development, Oslo, Norway

Recent studies show that Theory of Mind (ToM) has implications for children's social competences and psychological well-being. Nevertheless, although it is well documented that children overall take advantage when they have to resolve cognitive problems together with a partner, whether individual difference in ToM is one of the mechanisms that could explain cognitive performances produced in social interaction has received little attention. This study examines to what extent ToM explains children's spatial performances in a dyadic situation. The sample includes 66 boys and girls between the ages of 5–9 years, who were tested for their ToM and for their competence to resolve a Spatial task involving mental rotation and spatial perspective taking, first individually and then in a dyadic condition. Results showed, in accordance with previous research, that children performed better on the Spatial task when they resolved it with a partner. Specifically, children's ToM was a better predictor of their spatial performances in the dyadic condition than their age, gender, and spatial performances in the individual setting. The findings are discussed in terms of the relation between having a conceptual understanding of the mind and the practical implications of this knowledge for cognitive performances in social interaction regarding mental rotation and spatial perspective taking.

Edited by: Daniela Bulgarelli, Aosta Valley University, Italy

#### Reviewed by:

Andrea C. Samson, University of Geneva, Switzerland Frances Buttelmann, Cognitive Development Center, Budapest, Hungary

\*Correspondence:

Karine M. P. Viana karinepv@gmail.com

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 25 July 2016 Accepted: 05 October 2016 Published: 20 October 2016

#### Citation:

Viana KMP, Zambrana IM, Karevold EB and Pons F (2016) Beyond Conceptual Knowledge: The Impact of Children's Theory-of-Mind on Dyadic Spatial Tasks. Front. Psychol. 7:1635. doi: 10.3389/fpsyg.2016.01635

Keywords: theory-of-mind, spatial task, cognitive performance, dyadic interaction, children

# INTRODUCTION

The development of 'theory of mind' (ToM), which is the ability to understand the nature, origins, and consequences of the mind (beliefs, intentions, desires, feelings, etc.) in the self and others, has been investigated extensively (e.g., Wellman et al., 2001; Shahaeian et al., 2011; Harris et al., 2016). However, much less is known about the implications of ToM for children's social and cognitive development (e.g., Harris, 2006; Grüneisen et al., 2015). On one hand, recent studies have shown that children's ToM is positively associated with their overall prosocial behaviors and social competences (Caputi et al., 2012; Roazzi et al., 2013; Farina and Belacchi, 2014). On the other hand, the implications of ToM for children's cognition have received less attention and the findings are typically inconsistent (Meins et al., 2006; Veneziano et al., 2008; Guajardo and Cartwright, 2016). Furthermore, although it is well documented that on a range of cognitive problems children perform better when solving these together with a partner (e.g., Doise and Mugny, 1984; Tversky and Hard, 2009), to the best of our knowledge, no study has addressed the degree to which individual differences in ToM can account for cognitive problem-solving performance in dyadic settings. In light of the particular reliance on collaborative task performance in the modern academic and vocational life that awaits children, understanding whether collaborative tasks depend on individual abilities can have both practical and pedagogical implications. The present study therefore aims to investigate whether ToM can explain children's performance in a dyadic spatial transformation task which demands the cognitive ability to mentally rotate objects and the coordination of different viewpoints.

When the term 'theory-of-mind' was originally introduced, it was thought of as the competence to attribute mental states to self and others, involving the ability to theorize about others' minds by making inferences regarding mental phenomena (Premack and Woodruff, 1978). It was thus recognized as a socio-cognitive skill enabling human beings to predict, explain and manipulate others' actions and representations. Traditionally, false-belief tasks, based on the attribution of a mistaken belief, have been central in assessing children's ToM capacities (Wimmer and Perner, 1983). Today, however, it has become more and more common to consider ToM through a wider lens; not only involving the understanding of belief and knowledge, but also encompassing the competence to conceptually understand intentions, desires, and emotions (e.g., Astington, 2001; Wellman et al., 2001; Pons et al., 2004; Dunn, 2006; Shahaeian et al., 2011).

The first studies in the ToM field presented strong evidence for the progress children make between 3 and 5 years of age on classical false-belief, appearance-reality, and Level-2 visual perspective taking tasks (Flavell et al., 1983; Flavell, 2004). Although important milestones in ToM development occur in the preschool years, knowledge about mental states continues to increase later on (Flavell, 2004). Research has shown that from infancy to adolescence, ToM develops from a "peripheral and superficial" understanding of rather visible or non-reflective dimensions of the mind (e.g., recognition of basic emotions, understanding of first order false-beliefs and impact of desires on emotions) to a more "central and deeper" understanding of more invisible or reflective dimensions of the mind (e.g., understanding of moral and mixed emotions, of second order false-beliefs and double-bluffs; Pons et al., 2009).

Different directions of research emerged from these early works on trends in ToM development. These studies have been exploring, for instance, antecedents that might contribute to ToM development, intra and intercultural differences, and real world consequences of ToM abilities (e.g., Flavell, 2004; Shahaeian et al., 2011). It is well documented that ToM development depends on many social and cognitive factors, such as language, intelligence, executive function, attachment, and relationships with peers (e.g., Cutting and Dunn, 2006; Pons et al., 2014). Recent studies have also found positive impacts of ToM on social competences at the ages of 3–6 years and psychological well-being at the ages of 8–12 years (e.g., Farina and Belacchi, 2014; Bender et al., 2015). However, the implications of understanding mental states for children's cognition remain unclear. For instance, Veneziano et al. (2008) found that 6–7 year-olds with higher ToM test scores were better able to express epistemic states when they narrated a story. A longitudinal study conducted by Guajardo and Cartwright (2016) tested children at 3–5 years and later at 6–9 years and showed that those who had a better understanding of others' perspectives were more aware of their thoughts involved in reading. Lecce et al. (2010) found the same results in a study assessing children between 9 and 10 years of age. On the other hand, Meins et al. (2006) argue that between 6 and 9 years of age, having ToM capacities, measured through conceptual tasks, is different from being able to use them either to narrate a book or to describe friends. Likewise, Guajardo and Cartwright (2016) showed that false-belief understanding did not contribute uniquely to reading comprehension. Together, this suggests at least two gaps when it comes to understanding the role of ToM for children's cognition.
First, previous studies do not cover a broad measure of ToM that also includes the understanding of desires and emotions; and second, there is still a need to explore other cognitive dimensions potentially influenced by ToM in school-aged children that go beyond the use of mental terms and reading comprehension. One such dimension is the performance on cognitive tasks completed together with peers.

Studies on the impact of ToM on cognitive performances in dyadic interaction are rare, and have especially focused on false-belief reasoning and the process (rather than the outcome) of cooperation. Although it has been shown that ToM works as a powerful social tool that facilitates children's interactions with peers (Moore and Frye, 1991), it remains unclear whether ToM has implications for the cognitive outcome produced in social interaction. For instance, Grüneisen et al. (2015) recently found that 6-year-olds could use first and second order false-beliefs to coordinate actions with peers, showing that recursive mind-reading is an important component of dyadic interaction. Similarly, Flobbe et al. (2008) demonstrated that 8–10 year-olds passing a second order false-belief task are able to apply this when playing a strategic game with a peer. Curry and Chesters (2012) showed that adults scoring lower on a self-report measure of autistic traits and understanding of others' minds were also less successful at coordinating their behaviors with others in coordination games. These researchers subsequently called for studies using a broader range of ToM measures to investigate the impact of children's understanding of the mind on their performances in dyadic settings. Investigating how children solve a spatial transformation task in a dyadic situation might be particularly relevant in this context because it requires both the cognitive ability to mentally rotate objects and the adoption of the spatial perspective of someone else (Kessler and Thomson, 2010).

Spatial abilities comprise activities such as perception of horizontality, mental rotation of objects, or location of simple figures within complex figures (Linn and Petersen, 1985). Specifically, spatial transformation demands the ability to mentally rotate objects and make transformations in their positions based on a specific referential mark (Hegarty and Waller, 2004). Piaget and Inhelder (1952) focused in particular on one aspect of spatial relations called "coordination of perspectives," which refers to the ability to identify the appearance of an object as something dependent on the spatial position from which it is viewed. Based on the classical "three mountains task," they found that children younger than 6 years locate objects with respect to their own points of view, and it is only between 7 and 9 years of age, when children reach the concrete operational stage, that they would be aware of other perspectives than their own and thus deal with an external frame of reference (Piaget and Inhelder, 1952). Spatial relations, therefore, comprise both the cognitive process of projecting relationships between objects, and the social process of understanding the relation between two different perceptions, as exemplified by the "If I were in your place I would see what you see" line of thinking (Fishbein et al., 1972). Flavell et al. (1981) claimed that even under the age of 3, children recognize that people can perceive different objects at the same time (Level-1 perspective taking) but they have difficulties with recognizing that they can see the same object from different perspectives (Level-2 perspective taking). This more sophisticated ability is likely to be developed around 5 years of age. Newcombe and Huttenlocher (1992), for instance, tested children between 3 and 9 years of age and found that children as young as 5 years can take the spatial perspective of others when the task does not entail conflict between two frames of reference.

The Piagetian paradigm presented strong evidence for the role of socio-cognitive conflicts in the development of coordination of perspectives. In the "three mountains task," children have to visualize themselves in a different position and these conflicting representations within the individual promote a breakdown in the cognitive equilibrium that boosts a reinterpretation of the object (Zapiti and Psaltis, 2012). However, in the "three mountains task," the perceptions were not confronted by someone else. Doise and Mugny (1984) contributed enormously to this issue by considering the spatial coordination not only as an intra-individual process but also as an inter-individual one. Based on a critical review of Piaget and Inhelder's (1952) work, they proposed a series of experiments where the coordination of real viewpoints could take place. They tested children between 5 and 8 years of age in a spatial transformation problem called "The reconstruction of the village task," involving both an individual and a dyadic condition. The findings demonstrated a positive impact of peer collaboration on spatial performances as children progressed on the task after they had worked with a partner. The authors argue that when solving a spatial task individually, children have to create intra-individual cognitive conflicts to envision and arrive at different solutions, and that this could be less powerful than collaborative settings where the inter-individual conflict and the mutual action context promote subsequent individual progress. Moreover, it could be more effective if each member of the dyad has access to only one part of the resources needed to complete the task (Buchs and Butera, 2004). In a recent study, Zapiti and Psaltis (2012) applied the same "village task" used by Doise and Mugny and tested children between 6.5 and 7.5 years of age to analyze the impact of interaction types on task performance.
They found that the pair composition in terms of the children's gender and spatial knowledge affected the expression of point of view and the type of the socio-cognitive conflict that emerged. In a meta-analysis on gender differences in spatial ability, Linn and Petersen (1985) demonstrated how gender relates to spatial performance by showing that males are better than females in mental rotation problems and that the magnitude of this difference is smaller in spatial visualization. The authors also pointed out that the impact of gender might vary depending not simply on the task type but also on the age range of the participants (e.g., Voyer et al., 1995; Yilmaz, 2009).

Thus, even though previous research has demonstrated a positive impact of peer collaboration on spatial performances both with children and adults (Doise and Mugny, 1984; Tversky and Hard, 2009), more studies are needed in order to deepen our understanding of the mechanisms underlying spatial performance in dyadic settings. Because the "village task" demands both the cognitive ability to mentally rotate objects and the coordination of different viewpoints, it can be particularly fruitful for the purpose of examining whether broader ToM capacities play a role in children's spatial performance in social interaction. Therefore, the current study addresses two main questions: (1) whether children improve their performance when resolving a spatial transformation task with a partner as compared to alone; and (2) to what extent children's achievements on ToM tasks explain their spatial performances in a dyadic setting. The reasons for focusing on a dyadic setting are twofold: the need to understand potential mechanisms related to individual differences in dealing with spatial problems in social interaction; and the intention to explore the impact of ToM on an advanced cognitive problem, as the performance in the dyadic condition implies not only mental rotation of objects but also the coordination of different hands-on spatial perspectives.

One could argue that the village task is a perspective taking problem in itself, so why investigate whether ToM impacts another perspective taking task? First, in this study ToM is not measured based solely on perspective taking ability but as a broad competence including the understanding of beliefs, desires, and emotions (e.g., Shahaeian et al., 2011). Moreover, the "village task" cannot be reduced to its perspective taking dimension. Different from the "three mountains" (Piaget and Inhelder, 1952) and other classical perspective taking tasks, such as the picture and turtle tasks (Masangkay et al., 1974), in the dyadic version of the "village task" a child can be confronted by the other, so that both children have to deal with two sociocognitive operations at the same time: (1) the mental rotation of the objects based on an external frame of reference; (2) the perspective of the other child about the position of the objects in relation to the referential mark. When confronted with another spatial representation, the child is challenged to make some changes in his own spatial representation, and as Gopnik and Astington (1988) suggested, it is much easier to ignore your own contradictions than to ignore the contradictions between your own representation and the representation of others. Previous studies have shown that adopting others' perspectives remains cognitively demanding even for adults, especially when the perspectives are conflicting (Keysar et al., 2000; Epley et al., 2004; Qureshi et al., 2010). Surtees et al. (2011) tested adults and children between 6 and 11 years of age and found that those who succeeded on direct tasks of Level-2 perspective taking showed no evidence of this competence when it was measured in an indirect task where the participants were not explicitly asked about what the partner was seeing. 
This is also the case with the "village task" in which the participants are encouraged to work together but there is no explicit question about the perspective of the other, though the children need to coordinate their spatial representations to find the solution to the problem. Consequently, we are not applying two simple perspective taking tasks. In addition, the aim is not to assess whether ToM and spatial performance are related competences, but to examine specifically to what extent the performances on classical ToM tasks with different levels of complexity and where the child attributes mental or emotional states to a character in a fictional scenario (without being confronted with another's perspective) can explain the variation in spatial performances in an interactional scenario where the spatial representation of one child can be confronted by that of the other child. In other words, does a broad conceptual knowledge about the mind have implications for children's cognition in the domain of a dyadic spatial task?

In accordance with previous studies, the first hypothesis is that children perform better on the Spatial task in the dyadic setting compared to when doing it by themselves, even when we consider age and gender. Because resolving the Spatial task together with a partner depends on mental rotation of objects and understanding of the other's point of view, the second hypothesis is that children's ToM has a positive impact on spatial performances in the dyadic version of the task, even after taking into account age, gender, and the performance in the individual condition. We expect the results to contribute to the fields of ToM development and social development in at least three ways: by consolidating previous results showing that children benefit from a dyadic setting when resolving a cognitive problem; by providing new information on the role of individual differences in ToM in children's spatial performances in a dyadic setting (illuminating potential mechanisms underpinning spatial abilities in social interactions); and by pointing out a link between conceptual understanding of the mind and its practical implications for children's cognitive performance in the domain of spatial transformation abilities.

#### MATERIALS AND METHODS

#### Participants

Initially, 120 parents were contacted through two middle-class private schools in Recife (Brazil). The parents of 90 children (75% of those invited) signed a consent form that described the study aims and procedures, allowing their children to be asked to participate. Subsequently, all children invited agreed to participate in the study. The Norwegian Social Science Data Service and the Ethics Committee in Brazil approved the project.

To avoid floor and ceiling effects, children who did not succeed on the simplest item in the individual condition of the Spatial task (n = 14), as well as those who achieved the maximum score in the individual setting (n = 10), were excluded from the sample (Doise and Mugny, 1984). This ensured that all children had the same minimum level and still had room to progress on the task. Among the children who failed the first item, there were equal numbers of boys and girls, and 12 were in the younger group. Amongst the children who achieved the maximum score, four were girls, six were boys, and all of them were in the older group. This is consistent with previous findings (e.g., Doise and Mugny, 1984), as younger children failed more often than the older ones and only children in the older group achieved the maximum score. Thus, the final sample included 66 typically developing children (32 boys; 34 girls) between 5 years 7 months and 9 years 8 months (M = 89.94 months; SD = 13.09 months) with Portuguese as their native language. In order to obtain more variation in terms of ToM competence (the younger group with ToM in progress and the other with well-established ToM), children were divided into two groups according to their age (n = 36 in the Younger group: 5;7–7;5 years; n = 30 in the Older group: 7;6–9;8 years). Because we wanted to encourage the children to work together – and because asymmetry in terms of knowledge and gender might create a competitive relationship instead of collaboration (Buchs et al., 2004; Sommet et al., 2015) – the dyads consisted of children of the same gender, similar age, from the same classroom, and with similar performances on the individual version of the Spatial task (SD = 0.84) and the ToM tasks (SD = 2.19). 
For the same reason, we wanted to ensure that the children in each dyad were neither best friends nor non-friends, so information from the children's rankings of their friends in the classroom was also used when composing the dyads.

#### Procedure, Tasks, and Scoring

The data collection consisted of three sessions carried out at the children's schools. In the first session, the children were tested individually on the Spatial task. In the second session, the children completed the ToM tasks, and in the third and last session, they participated in the dyadic version of the spatial problem. Each session lasted around 10 min, with an average interval of 15 days between each session.

#### Spatial Task

Children were first tested individually in an adapted version of the spatial transformation task "the reconstruction of the village," developed by Doise and Mugny (1984) and derived from Piaget's famous "three mountains" task (Piaget and Inhelder, 1952). The task material included a miniature village placed on a model cardboard (50 cm by 50 cm), which was fixed on a table, and comprised a lake (the referential mark) and three or four houses (depending on task complexity, which is described below) with different colors and marked with doors on one side. On a different table, offset 90° from their left, children could see another cardboard also marked with a lake. They received three or four houses equivalent to the ones previously placed by the researcher on the model cardboard, and they were instructed to make a similar village. In order to emphasize the referential mark, the experimenter said that if a man came out of the lake, he would find the houses in the same positions as the ones in the model constructed by the experimenter. Chairs were placed in such a way that the children could not move beyond a limited area.

**Figure 1** gives an overview of the task. There were four different items with increasing complexity. The simplest item had three houses with no rotation required. The second demanded a rotation of 90° and an inversion of the left-right and front-back orders of the houses. The third and fourth items had four houses and both required 180° rotations and inversions of the left-right and front-back orders. After the completion of each item, children were instructed to move to the opposite side of the cardboard to check whether or not they wanted to make changes to their villages. When solved individually, this part of the procedure generated an intra-individual cognitive conflict, as the child could look at the same village from different perspectives (Doise and Mugny, 1984).

The same four items were applied in the dyadic condition, but in this situation children were placed in different face-to-face positions (position X and position Y in **Figure 1**). This required them to coordinate their viewpoints to make a copy of a village, which entails an inter-individual cognitive conflict, as it involved looking at the same village from different angles (Doise and Mugny, 1984). To make sure that one child would not act alone, the dyadic condition operated with interdependent resources (Buchs and Butera, 2004), so that each child received only a certain number of houses (either one or two) and was only allowed to touch and move their "own" houses. To have the houses of the "other" child moved, a child had to convince the partner to do so, providing opportunities for negotiations within the dyad.

The same scoring method, based on the original work of Doise and Mugny (1984), was applied for both the individual and dyadic conditions. The children first got a spatial score for each item of the Spatial task. Children showing no compensation (NC) got 0 points. They did not manage to mentally rotate the cardboard and just reproduced the perceptual tableau that they were able to observe without making any inversion regarding the position of the houses. Children who displayed partial compensation (PC) received one point, meaning that they achieved one of the inversions required, either the left-right order or the front-back order, but not both. Children who demonstrated total compensation (TC) got two points, and this involved correct transformation of both dimensions (left-right and front-back) simultaneously. Subsequently, in both conditions, a total sum score was calculated from the points on the four items, which could therefore range from zero to eight in each condition of the Spatial task. Because two dyads did not reach an agreement regarding the resolution of the problem, the score was computed for each child separately in both conditions. Thus, the score in the dyadic setting represents an individual result of the social interaction.
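The scoring rule described above can be illustrated with a minimal sketch. It assumes that each item can be coded by two yes/no judgments (whether the left-right order and the front-back order are correct); the names `Compensation`, `score_item`, and `total_spatial_score` are hypothetical, and the text does not state how the first item (which required no rotation) was coded, so treating it like the others is an assumption.

```python
from enum import IntEnum
from typing import Iterable, Tuple


class Compensation(IntEnum):
    """Item-level categories, with point values as given in the text."""
    NC = 0  # no compensation: neither order transformed correctly
    PC = 1  # partial compensation: exactly one order correct
    TC = 2  # total compensation: both orders correct


def score_item(left_right_ok: bool, front_back_ok: bool) -> Compensation:
    """Map the two correctness judgments for one item to NC, PC, or TC."""
    return Compensation(int(left_right_ok) + int(front_back_ok))


def total_spatial_score(items: Iterable[Tuple[bool, bool]]) -> int:
    """Sum the item scores over the four items (range 0 to 8)."""
    items = list(items)
    if len(items) != 4:
        raise ValueError("the task has exactly four items")
    return sum(int(score_item(lr, fb)) for lr, fb in items)


# Hypothetical child: TC on two items, PC on one, NC on one.
example = [(True, True), (True, True), (True, False), (False, False)]
print(total_spatial_score(example))
```

Because the score is computed per child in both conditions, the same function would apply unchanged to the individual and the dyadic session.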

#### Theory of Mind Task

Children were tested individually for their ToM with items extracted from the Theory of Mind Test (TMT; Pons and Harris, 2002), and the Test of Emotion Comprehension (TEC; Pons and Harris, 2000). Both tests are the result of an extensive review of the developmental literature and of a selection of the most common tasks used to assess children's ToM. Giménez-Dasí et al. (2016) have also combined these two tests to obtain a broad measure of ToM. However, the authors used a short version of the two tests by reducing the number of items and keeping all the components. In addition, they applied the tests in two separate sessions. Because we had an extensive data collection, we applied the TEC and the TMT in the same session which, in turn, required the exclusion of some components. This was a strategy to ensure that the children would stay concentrated and motivated during the assessment, as merely reducing the number of items would still have made the tests very lengthy. Moreover, more items per component should be more reliable than fewer items within more components. Thus, based on the review of the literature which focuses on ToM as an understanding of multiple concepts rather than a single task paradigm (e.g., Pons et al., 2004; Wellman and Liu, 2004; Blijd-Hoogewys et al., 2008), we selected components that did not overlap and that represented different levels of difficulty. Children were therefore assessed for their perspective taking (two items in Level 1 and one item in Level 2), understanding of false-belief (three items), understanding of second-order false-belief (three items), recognition of basic emotion (five items), understanding of the impact of situational variations on emotions (five items), and understanding of desire-based emotion (two items). This choice kept the tests from becoming too long while ensuring the inclusion of both visible or non-reflective dimensions of the mind and more invisible or reflective dimensions of the mind (Pons et al., 2009). For each item, the examiner showed a drawing while reading a story regarding the depicted characters, and the child was asked to attribute either a cognitive or an emotional mental state to the main character of the story by pointing to one of two or four possible answers. A composite score ranging from 0 to 21 was calculated by summing the number of correct items.
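As a quick check on the composite, the item counts listed above can be tallied in a short sketch. The dictionary keys and the function name `tom_composite` are hypothetical labels introduced here for illustration; only the item counts per component come from the text.

```python
# Number of items per ToM component, as listed in the text.
TOM_ITEMS = {
    "perspective_taking_level_1": 2,
    "perspective_taking_level_2": 1,
    "false_belief": 3,
    "second_order_false_belief": 3,
    "basic_emotion_recognition": 5,
    "situational_variation_emotion": 5,
    "desire_based_emotion": 2,
}

# The maximum composite is the total number of items.
MAX_TOM_SCORE = sum(TOM_ITEMS.values())


def tom_composite(correct_by_component: dict) -> int:
    """Sum the number of correct items, validating each component's range."""
    total = 0
    for name, n_items in TOM_ITEMS.items():
        n_correct = correct_by_component.get(name, 0)
        if not 0 <= n_correct <= n_items:
            raise ValueError(f"{name}: expected 0..{n_items}, got {n_correct}")
        total += n_correct
    return total


print(MAX_TOM_SCORE)
```

The seven item counts add up to 21, which matches the stated range of the composite score.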

#### Statistical Analyses

SPSS Statistics 22.0 was used for all analyses in the current study. First, preliminary analyses assessed the performances on the ToM tasks by age and gender through analysis of variance. Subsequently, the first hypothesis was examined through a mixed between-within-subjects analysis of variance to assess the impact of age, gender, and condition (individual and dyadic) on the performance in the Spatial task. To test the second hypothesis, correlation analysis and regression analysis were performed to assess the role of ToM in explaining the variation in children's spatial performance in the dyadic condition, while accounting for age, gender, and individual spatial performance.

#### RESULTS

**Table 1** shows the performances on the ToM tasks and on the Spatial task (individual and dyadic conditions) by age and gender. An Age × Gender analysis of variance indicated a significant and large effect of age on ToM performances (F(1,62) = 10.91, p = 0.002, η<sup>2</sup> = 0.15), but no significant effect of gender or interaction between gender and age. Regardless of gender, older children had higher ToM performances than younger children.

TABLE 1 | Theory of Mind (ToM) by age group and gender and Spatial Performance by condition, age group and gender.


#### First Hypothesis

An Age × Gender × Condition analysis of variance showed a moderate effect of age (F(1,62) = 4.72, p = 0.034, η<sup>2</sup> = 0.07), a large effect of condition (F(1,62) = 65.29, p < 0.001, η<sup>2</sup> = 0.51), and no effect of gender on children's performances on the Spatial task. The older children had higher performances (M = 4.3; SD = 1.7) than younger children (M = 3.7, SD = 1.6), regardless of condition and gender. Moreover, children had higher performances in the dyadic condition (M = 5.05, SD = 2.23) than in the individual condition (M = 2.92, SD = 0.96), regardless of age and gender. There was also an interaction effect of moderate size between age and gender (F(1,62) = 7.90, p = 0.007, η<sup>2</sup> = 0.11), indicating that older girls performed better (M = 4.8, SD = 2.3) than older boys (M = 3.79, SD = 2.5), regardless of condition. An interaction of moderate effect size between gender and condition (F(1,62) = 6.62, p = 0.012, η<sup>2</sup> = 0.10) furthermore showed that girls were better than boys in the dyadic setting, whereas there were no significant gender differences in the individual condition. Finally, a moderate interaction effect was found between condition, age, and gender (F(1,62) = 6.09, p = 0.016, η<sup>2</sup> = 0.09), suggesting that older girls obtained higher scores than younger girls and that they were better than boys from both age groups in the dyadic version of the Spatial task, but not in the individual condition.
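The effect sizes can be related to the F statistics with a small sketch. The text does not say which variant of eta squared is reported; the sketch assumes partial eta squared (the variant SPSS reports for these designs), which can be recovered from F and its degrees of freedom. The function name and the effect labels are hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values as reported in the text; every effect has df = (1, 62).
reported_f = {
    "age": 4.72,
    "condition": 65.29,
    "age x gender": 7.90,
    "gender x condition": 6.62,
    "condition x age x gender": 6.09,
}

for effect, f_value in reported_f.items():
    print(f"{effect}: {partial_eta_squared(f_value, 1, 62):.2f}")
```

Under this assumption the F values give approximately 0.07, 0.51, 0.11, 0.10, and 0.09, respectively; the value for the age by gender interaction (F = 7.90) works out to about 0.11.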

#### Second Hypothesis

Correlation analysis showed that ToM performances correlated with the spatial performances both in the individual (r = 0.26, n = 66, p < 0.038) and in the dyadic (r = 0.39, n = 66, p < 0.001) conditions, even when we controlled for age and gender (r = 0.26, n = 66, p < 0.038 and r = 0.32, n = 66, p < 0.010 for the individual and dyadic conditions, respectively). In the regression analysis, the role of age, gender, spatial performance in the individual condition, and scores on the ToM tasks in predicting performance on the Spatial task in the dyadic condition was examined. This regression model (Multiple R = 0.45, F(4,61) = 3.81, p < 0.008) showed that the predictors explained in total 20% (R<sup>2</sup> = 0.20) of the variation in the dependent variable. When examining the impact of the different predictors, ToM (b = 0.31; t = 2.4, p < 0.020), but not age, gender, or spatial performance in the individual condition, had a significant effect on the spatial performance in the dyadic condition. ToM accounted for 15% of the shared variance (r = 0.39) and alone explained 8% (r = 0.29) of the variance in the children's performance in the dyadic Spatial task.
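The percentages above follow from squaring the reported coefficients, and the F value can be roughly recovered from R<sup>2</sup>. The worked arithmetic below is a sketch that assumes k = 4 predictors and n = 66 children; reading r = 0.29 as the correlation that isolates the unique contribution of ToM (for example, a semipartial correlation) is also an assumption, since the text does not name it.

$$
R^2 = 0.45^2 \approx 0.20, \qquad
F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} = \frac{0.20 / 4}{0.80 / 61} \approx 3.81
$$

$$
0.39^2 \approx 0.15 \;\; (\text{shared variance}), \qquad
0.29^2 \approx 0.08 \;\; (\text{variance explained by ToM alone})
$$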

#### DISCUSSION

The goal of this study was to investigate: (1) whether children improve their performance when resolving a Spatial task with a peer; and (2) whether individual differences in ToM affect children's spatial performances in a dyadic setting. In line with prior research (Doise and Mugny, 1984; Psaltis and Duveen, 2007), we found that children improved their performance on the Spatial task when they resolved it together with a partner compared to when resolving it alone. For the first time, this study showed that children's performances in a dyadic Spatial task were predicted by their ToM, even when accounting for age, gender, and the children's spatial performances on the same task in an individual condition.

#### Spatial Performances Across Age, Gender, and Condition

Confirming our first hypothesis, children performed better in the dyadic compared to the individual setting. This is consistent with the original experiments carried out by Doise and Mugny (1984) and other studies showing that children between 5 and 9 years of age profit from resolving tasks with a partner (e.g., Psaltis and Duveen, 2007; Zapiti and Psaltis, 2012). It has been argued that such results demonstrate that inter-individual conflicts are central for children's cognitive development, and that this is particularly the case when children work with complementary resources to resolve problems (Buchs and Butera, 2004). The current study extends prior results on spatial problems that have reported beneficial effects of social interaction on cognitive performances in samples of older children and adults (Teasley, 1995; Tversky and Hard, 2009). One potential explanation is that the non-verbal and verbal behaviors of the other support the understanding of the objects and their spatial relations, so that the mutual action context promoted by social interaction helps children to (re-)think about the activity from the other's perspective (Tversky and Hard, 2009; Frick and Wang, 2013).

As has been suggested earlier in the field (Piaget and Inhelder, 1952), the effect of age on children's overall spatial performance indicates that spatial ability follows a developmental trend. The absence of an effect of age on spatial performance when children resolved the task by themselves might be related to the way we divided the groups. According to the literature, it is typically somewhere between the ages of 7 (younger group) and 9 (older group) years that children start to imagine an orientation outside their body, and work with relations such as before/behind and left/right (Piaget and Inhelder, 1952; Yilmaz, 2009). The enhanced performance of older compared to the younger children in the dyadic condition could be related to the higher reliance on more advanced social and linguistic abilities in this setting (Siegal, 2008).

The fact that gender had no main impact on children's overall spatial performance contrasts with previous work that found that males perform better than females on mental rotation problems (Voyer et al., 1995; Yilmaz, 2009). However, this gender difference in mental rotation seems to appear from the age of 10, and could possibly be related to boys having more experiences with manipulation of symbolic information than girls by that age. Thus, gender differences may occur as the children get older, which might explain the interaction effect between age and gender showing that older girls were better than older boys, independent of the condition. Indeed, the literature suggests that the impact of gender varies according to both age and the type of task (Yilmaz, 2009), which sheds some light on the interaction effects between gender and condition, and between age, gender, and condition. Thus, one reason for the gender differences in the dyadic setting may be that this condition depends more on broader social and language skills, which are dimensions where girls and older children typically demonstrate better abilities than boys and younger children (Walker et al., 2002; Siegal, 2008). More research with a larger age range is needed, however, to understand why gender differences appear in different conditions and how they might evolve over time.

#### The Impact of ToM on Spatial Performances

The impact of age on ToM performances was expected, as previous studies have shown that ToM follows a clear developmental trend, both in boys and girls (e.g., Harris et al., 2005; Shahaeian et al., 2011). The results showed, for the first time, relations between ToM and spatial performances, both in the individual and in the dyadic conditions, even when age and gender were taken into account. Moreover, confirming the second hypothesis of this study, ToM had a positive impact on the spatial performance when children worked together, even when we controlled for age, gender, and spatial performance in the individual condition.

The link between ToM and the spatial performance in the individual setting indicates that the abilities to conceptually understand the mind in terms of thoughts and emotions and to cognitively visualize objects in different positions based on an external frame of reference are related competences. The findings therefore expand previous results by demonstrating that understanding mental states has positive consequences not only for social competences (e.g., Roazzi et al., 2013; Farina and Belacchi, 2014) and the use of mental terms and metacognition (Veneziano et al., 2008; Lecce et al., 2010; Guajardo and Cartwright, 2016), but also for the domain of children's cognition with regard to spatial visualization, which is a spatial transformation where "the positions of objects are moved with respect to an environmental frame of reference" (Hegarty and Waller, 2004, p. 127). In the present study, this means that children with a higher level of conceptual ToM were better able to mentally rotate the object and correctly transform the positions of the houses by taking the lake as the referential mark.

One could argue that once a relation between ToM and the spatial performance in the individual condition was found, a relation between ToM and the performance in the dyadic condition would be expected. Yet, the performance in the two conditions relies on different levels of spatial skills, as indicated by the findings showing the absence of a relation between the performance in the individual condition (making object-based transformation) and the performance in the dyadic condition (coordinating different perspectives). This is in line with the dissociation between tests of perspective taking and tests of mental rotation reported by others (Hegarty and Waller, 2004). Thus, we could not interpret the correlation between ToM and the performance in the dyadic condition as parallel to the correlation between ToM and the performance in the individual condition. It is also noteworthy that the relation between ToM and spatial performance was stronger in the dyadic compared with the individual setting. Moreover, beyond examining how ToM and the spatial performance in the dyadic condition were related, our aim was to investigate the degree to which ToM abilities could explain variation in the spatial performances in a social interaction setting. It was only ToM that significantly explained the performance in the Spatial task when children worked together, while the children's age or their previous experience with the task did not. This finding therefore suggests the existence of socio-cognitive mechanisms underpinning spatial performance in social interactions.

A comparison of the two conditions of the Spatial task might deepen our understanding of such socio-cognitive mechanisms. When resolving the task alone, children had to visualize the houses in different positions by taking the lake as a reference. Even when the child changed position to see the cardboard from a different angle (intra-individual conflict), the task in the individual setting centered around object-based transformations, while in the dyadic setting the children needed to go beyond their own spatial visualization and deal with the other's spatial perspective. In fact, the performance in the dyadic condition of the Spatial task seems to be more strongly dependent on the performance on the ToM tests where the child also had to take the mental perspective of the character. Thus, one could argue that a link between ToM and the spatial performance in the dyadic setting would be expected because the Spatial task in the dyadic condition essentially demands perspective taking. Nevertheless, the task in the dyadic condition cannot be reduced to its perspective taking dimension as the children also needed to manage the object-based transformation while coordinating different viewpoints with the other child, which is an advanced form of cognitive problem. In addition, we used a broad measure of ToM that assessed not only perspective taking but also false-belief and emotion comprehension, in which – different from the Spatial task – children's beliefs and perspectives were not confronted by the experimenter or another child. Thus, the main explanation is that the findings add a new factor to the previous results on the reconstruction of the village task (e.g., Doise and Mugny, 1984; Zapiti and Psaltis, 2012) by pointing out that the better the child is at conceptually theorizing about the mind in a fictional scenario in terms of beliefs, perspectives, and emotions, the better he mentally rotates the objects while taking the spatial perspective of a real partner.

The current findings can therefore shed new light on the link between conceptual understanding of the mind and its practical implications for children's cognition, especially for cognitive performance in social interaction. According to Tversky and Hard (2009), seeing another person in a scene near objects can elicit spontaneous perspective taking, which, in turn, creates mutual expectations between partners attempting to coordinate actions, requiring each person to engage with multiple levels of perspective. Nevertheless, Keysar et al. (2000) showed that even adults with high levels of ToM can have difficulties in applying these abilities to take others' perspectives. Accordingly, Samson and Apperly (2010) argue that using ToM could be a cognitively costly process involving the need to resist the interference from the egocentric perspective and to select relevant information necessary for ToM inferences, potentially creating a gap between competence and performance. We should point out some distinctions between the previous and the current findings. Notwithstanding the differences in age ranges, the aforementioned studies focused on perspective taking, while we have assessed a broad measure of ToM. This might suggest that the implication of ToM for children's spatial performances cannot be seen as a uniform fact, as it can vary depending on the age range of the participants, how ToM is measured, and the context in which it is applied. A broad measure of ToM potentially accounts for more variability in spatial performances than measures of perspective taking or false-belief alone, especially when the task is spatial and social at the same time (i.e., the village task). Perhaps a broad measure of ToM that includes the understanding of beliefs, desires, and emotions is part of a broader socio-cognitive process underlying spatial and social perspective taking. 
In light of findings suggesting that social abilities are related to a more visually driven form of perspective taking (Clements-Stephens et al., 2013; Hamilton et al., 2014), future studies analyzing how children consider the other's point of view while cooperatively resolving a spatial problem may contribute to understanding the extent to which and how ToM, social perspective taking and spatial performance are intertwined.

In sum, our results showed that conceptual competence can account for variation in cognitive performances on a Spatial task in children between 5 and 9 years of age, particularly when the ToM measure includes different concepts. This does not indicate that we can directly translate ToM competence into spatial performance, and future studies should examine the role of potential third variables, such as language, cooperative behavior, intelligence, and executive functions (Wellman, 2014), to obtain a more complete picture of the role of ToM in spatial performance. For now, the findings illustrate that, although not sufficient (Astington, 2003; Samson and Apperly, 2010), higher ToM levels can have positive implications for cognitive performances in terms of mental rotation and spatial perspective taking during peer interaction.

#### Limitations

Some limitations should be mentioned. A larger sample size would have provided more power to detect significant relations and group differences in the present study. The inclusion of a post-test section (Doise and Mugny, 1984) would inform on possible long-term effects of the dyadic experiences. Future studies could also apply a longitudinal approach to address potential developmental processes. In addition, training studies aiming at strengthening ToM competences might provide stronger evidence of the positive impact of ToM on spatial performances. Inclusion of additional ToM concepts, as well as examination of the contributions of the separate components of the TEC and TMT, could also contribute to a deeper understanding of the role of ToM in cognition.

Another limitation is that we did not analyze the interactional processes in the dyadic setting. Zapiti and Psaltis (2012), for instance, showed that what happens in the interaction affects the final spatial performance. In addition, Caputi et al. (2012) underlined that the relation between having and using ToM in social interaction is mediated by social factors. It could be argued that having the same intention toward the task does not specify the kind of social relation children would establish (Thomsen and Carey, 2013) and that different dyadic profiles, either more unilateral/hierarchical or more cooperative, could affect performances in dyadic settings (Psaltis and Duveen, 2007). Thus, investigating how children interact and deal with the socio-cognitive conflict could help to better understand how ToM explains spatial performance in the dyadic Spatial task. Last, but not least, it is not certain that the same results would have occurred with other types of cognitive problems or if the spatial abilities had been examined in a non-structured task. Investigating the impact of ToM in everyday interaction could deepen our understanding of the implications of ToM for children's cognition with regard to the nature of the task and the nature of the interaction.

#### CONCLUSION

Both hypotheses of the current study were confirmed: (1) children performed better in the dyadic setting than when working by themselves; and (2) children's ToM had a positive impact on spatial performance in the dyadic condition. Theoretically, these findings add a new aspect to the explanations based on inter-individual conflict and action-based reasoning (Doise and Mugny, 1984; Tversky and Hard, 2009; Zapiti and Psaltis, 2012) by illuminating socio-cognitive mechanisms that link conceptual competence in understanding the mind with spatial performance within interactional settings. The results demonstrate that individual differences in ToM (not only in terms of false belief or perspective taking, but also in terms of emotion comprehension) impact children's cognition and have to be taken into account in order to get a more complete picture of what promotes spatial performances in social interactions. Hence, three practical implications can be derived from these findings. First, they imply the need to elaborate more adequate and sensitive measures to grasp the cognitive consequences of ToM in a wide range of interactional contexts. Second, pedagogues might need to consider children's ToM abilities when composing dyads and groups to solve spatial problems in cooperation, as such grouping might yield different outcomes. Finally, the findings suggest that teaching and strengthening children's ToM competences can have a positive impact on children's cognitive performance in important settings, such as school, at least when it comes to spatial problems. To conclude, the link between what ToM is and what ToM is for (Liszkowski, 2013) does not indicate that ToM concepts are sufficient to efficiently promote successful cognitive outcomes in social interaction (Astington, 2003). However, it shows that having such concepts goes beyond conceptual knowledge and can have practical implications for children's cognition. This study demonstrates how this is the case in the domain of spatial transformation in peer interaction.

# AUTHOR CONTRIBUTIONS

KV and FP designed the study. KV coordinated data collection and KV, IZ, EK, and FP contributed to the analysis and interpretation of the data for the work. KV prepared the first draft of the article and all authors revised it critically and approved the version to be published.

# FUNDING

This work was supported by Lånekassen – The Norwegian State Educational Loan Fund – as part of the Quota Scheme Program which supports students from developing countries.

### ACKNOWLEDGMENTS

The authors thank the children for their participation in this project and the parents who authorized their participation; Carina Pessoa Santos for helping with data collection; Maria Isabel Pedrosa and the members of the developmental group of Labint (Laboratory of Human Social Interaction) of the Federal University of Pernambuco (Recife/Brazil) for providing the video recording equipment and for giving support to the analysis of the pilot of this study; and participating schools for giving us access to their facilities.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Viana, Zambrana, Karevold and Pons. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Peer Interaction Does Not Always Improve Children's Mental State Talk Production in Oral Narratives. A Study in 6- to 10-Year-Old Italian Children

Giuliana Pinto, Christian Tarchi\* and Lucia Bigozzi

Department of Education and Psychology, University of Florence, Florence, Italy

Joint narratives are a means through which children develop and practice their Theory of Mind (ToM); thus, they represent an ideal means to explore children's use and development of mental state talk. However, creating a learning environment for storytelling based on peer interaction does not necessarily mean that students will automatically exploit it by engaging in productive collaboration; thus, it is important to explore under what conditions peer interaction promotes children's ToM. This study extends our understanding of social aspects of ToM, focusing on the effect of joint narratives on school-age children's mental state talk. Fifty-six Italian primary school children participated in the study (19 females and 37 males). Children created a story in two different experimental conditions (individually and with a randomly assigned partner). Each story told by the children, as well as their dialogs, was recorded and transcribed. Transcriptions of narratives were coded in terms of text quality and mental state talk, whereas transcriptions of dialogs were coded in terms of quality of interaction. The results from this study confirmed that peer interaction does not always improve children's mental state talk performances in oral narratives; rather, certain conditions need to be satisfied. Peer interaction was more effective for mental state talk when children's individual levels were lower and when interactions were productive, particularly in terms of the capacity to regulate the interaction. When children were able to focus on the interaction, as well as the product, they were also exposed to each other's reasoning behind their viewpoint. This level of intersubjectivity, in turn, allowed them to take greater account of the contribution of mental states to the narrative.

Keywords: mental state talk, peer interaction, storytelling, narrative competence, theory of mind

#### Edited by:

Paola Molina, University of Turin, Italy

#### Reviewed by:

Antonella Marchetti, Catholic University of the Sacred Heart, Italy; Serena Lecce, University of Pavia, Italy

> \*Correspondence: Christian Tarchi christian.tarchi@unifi.it

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 15 March 2016 Accepted: 11 October 2016 Published: 25 October 2016

#### Citation:

Pinto G, Tarchi C and Bigozzi L (2016) Peer Interaction Does Not Always Improve Children's Mental State Talk Production in Oral Narratives. A Study in 6- to 10-Year-Old Italian Children. Front. Psychol. 7:1669. doi: 10.3389/fpsyg.2016.01669

# INTRODUCTION

Research into the development of children's mental state understanding has recently focused on mental state talk in social interactions as a powerful tool to both explore and foster Theory of Mind (ToM). Mental state talk is defined as the set of words used by children to attribute thoughts, feelings, emotions, and desires to people, when referring to either themselves or other people (Bretherton and Beegley, 1982). Mental state talk is facilitated by interactional contexts in which young children communicate with other people about thoughts and feelings. In this study, we will analyze the effect of joint story-telling on children's mental state talk. Creating a learning environment for storytelling based on peer interaction does not necessarily mean that students will automatically exploit it by engaging in productive collaboration; thus, it is important to explore under what conditions peer interaction promotes children's ToM. Our understanding of such conditions is limited, as most of the studies conducted on joint storytelling have focused on adult-child interactions rather than on peer interactions. Moreover, prior studies on children's ToM have mainly focused on its cognitive aspects and on preschoolers. This study extends our understanding of social aspects of ToM, focusing on the effect of joint narratives on school-age children's mental state talk.

## Theory of Mind and Mental State Talk

Children's ToM includes several basic skills, namely recognizing emotions, making a distinction between physical and mental entities, appreciating the causal link between perception and knowledge, understanding how desires and beliefs influence behavior, and understanding how beliefs affect behavior (Wellman, 1990; Bulgarelli et al., 2015). The strict interconnection between language and children's understanding of other people's mental states has led several scholars to use children's mental state talk as an indicator of their ToM (Dunn and Hughes, 1998; Astington and Baird, 2005; Symons et al., 2005; Antonietti et al., 2006). Mental state talk includes terms that children use to attribute physiological (e.g., being hungry), perceptual (e.g., see), willing (e.g., desire), emotional (e.g., anger), cognitive (e.g., knowing), moral (e.g., judge), and socio-relational (e.g., helping) states to others (Bretherton and Beegley, 1982; Symons, 2004).

Several studies have used mental state talk as a measure of ToM, for instance to analyze maternal mind-mindedness (Meins et al., 2002), mother–child conversations (Ruffman et al., 2002), conversations between young friends (Hughes and Dunn, 1998) and siblings (Brown et al., 1996), and autistic children (Tager-Flusberg, 1992; Happé, 1994; Capps et al., 2000). A few studies have also validated mental state talk by finding significant correlations with standardized measures of ToM, such as the false-belief task (Peterson and Slaughter, 2006; Hughes et al., 2011; Accorti Gamannossi and Pinto, 2014). Thus, evidence from typically and atypically developing populations confirms mental state talk as a reliable indicator of children's ToM.

Mental state talk brings some advantages with respect to more traditional assessments of ToM (e.g., 'false belief task,' Wimmer and Perner, 1983): it is a more ecological instrument as it relies on children's spontaneous production; it allows us to include and analyze several mental states (e.g., desires and feelings, besides the cognitive-related aspects of ToM); it allows us to study the development of ToM in school-age children, since it does not reach a ceiling as other measures do (Wellman and Liu, 2004). Previous studies have demonstrated that individuals' mind-reading ability grows with age, even beyond school years (adolescence and young adulthood, Valle et al., 2015; and adulthood and elderly age, Cabinio et al., 2015).

## Mental State Talk in Narratives

Narratives represent an ideal context to analyze children's mental state talk, as through them children develop, practice, and re-describe their ToM (Guajardo and Watson, 2002; Accorti Gamannossi and Pinto, 2014), as is also confirmed by neuropsychological studies (Marchetti et al., 2015). According to the redescription theory ("representational redescription," Karmiloff-Smith, 1995), the human mind first develops by learning a process, and then further develops by turning the information that is in the mind into explicit knowledge to the mind. In this way, processes increase the flexibility of the knowledge we possess. In other words, the mind re-describes its knowledge by representing in different formats what is internally stored. Redescription theory applies to ToM too. When children are in the process of understanding mental states, they need to understand that a certain event can be represented and viewed differently (Qu et al., 2015). Thus, children's ToM might be improved by promoting children's representation with the support of narrative tasks.

Children's development of narrative competence begins early and increases significantly during school years (Makinen et al., 2013). In primary school, children begin to tell or write stories with a basic and conventional macrostructure, which includes initiating events, several interlinked episodes, goal-directed actions, internal responses, and a final resolution (Stein and Glenn, 1982; Gelmini-Hornsby et al., 2011; Squires et al., 2014). Thus, children need advanced mental state talk to create a narrative centered around a protagonist's intentions and subsequent actions (Pelletier and Beatty, 2015). The relationship between narrative competence and mental state talk develops in particular during primary school years. Generally, primary school children tell stories as a list of actions (Carnine et al., 1982; McConaughy et al., 1984), but if they possess a certain level of mental state talk, which allows them to connect action with consciousness, then they are also able to integrate the plot actions with the characters' mental states (Pelletier and Astington, 2004). Moreover, if the characters' intentions are explicitly stated, primary school children are able to identify the characters' mental states (Feathers, 2002). Pelletier and Beatty (2015) examined children's developing understanding of Aesop's fables from Kindergarten through Grade 6, and found that as children grow, they are increasingly able to understand fables through their mental state talk, beyond the contribution of general vocabulary. According to Dyer et al. (2000), it is possible that narratives themselves can be an important source of mental state information. The authors analyzed 90 children's books and found that they included high rates of mental state terms, regardless of the children's age (they compared books aimed at 3- to 4-year-olds vs. books aimed at 5- to 6-year-olds). They also noted that the pictures, by contrast, did not represent any mental state, nor did they refer to mental states mentioned in the text.

The development of ToM is particularly facilitated by communication between young children and other people (e.g., mother, father, siblings, peers, and the like) about others' mental states (Symons, 2004), also through the effect of socially shared norms (Massaro et al., 2014). A specific case of interpersonal discourse about mental states is represented by joint narratives. In preschool, kindergarten, and school, children are exposed to narratives through joint story-telling or story-reading activities. Besides being an activity in which children naturally engage, joint story-telling represents one of the ways in which individual performances can be improved. The effect of peer interaction on children's mental state talk is explained by several mechanisms. Firstly, peer learning is strictly interrelated with intersubjectivity. The two partners need to achieve a certain degree of intersubjectivity, which can be negotiated or achieved through mutual adjustments (Devescovi and Baumgartner, 1993). Intersubjectivity is strictly interrelated with mental state talk too (Symons, 2004). According to the literature, two conceptual traditions in developmental psychology have focused on intersubjectivity in meaning co-construction activities: Piaget's socio-cognitive conflict hypothesis and Vygotsky's internalization hypothesis. According to the former perspective, in a joint activity an individual has to take the perspective of the other participant as well, rather than just dealing with his/her own (Mugny and Doise, 1978). If the two participants are able to achieve a mutual understanding of the activity, then they can achieve a new and more advanced perspective on the problem. According to Vygotsky's internalization process (1978), higher-level processes appear first at an interpsychological level, and through it are transformed into intrapsychological processes. Children's participation in interpersonal discourse about the thoughts and feelings of other people facilitates the internalization of the reasoning about mental states, which implies a cognitive reorganization of their own ToM (Symons, 2004). These two perspectives can in fact be considered complementary if we focus on the cooperation between partners, rather than simply the presence of a partner (Kruger, 1993). Both perspectives, although focusing, respectively, on conflict and cooperation, claim that children are able to benefit from a joint activity if they engage in an extended discourse that explores the reasoning behind the various viewpoints being presented (Kruger, 1993). In this way, the two participants are introduced to each other's intentions and thoughts on the activity, with a beneficial retroactive effect on their own mental state talk. On the other hand, the type of task assigned to students also has fundamental implications for the efficacy of peer interaction (Slavin, 2004). An exploration of the levels of participation allows us to explore interactional patterns and the source of interaction. In other words, it allows us to understand to what extent students engage in conversations, who initiates the conversations, and whether the response aims at developing the meaning-construction endeavor or rather at providing some feedback to the partner. An exploration of the use of language, instead, allows us to analyze the semiotic tools used by participants to mediate the social construction of meaning. Children could engage in a conversation to negotiate meaning, provide and/or justify their perspective, share personal experiences or relevant information, manage the interaction, express agreement or disagreement with what the partner said, evaluate the partner's contributions to the meaning-making process, and the like. As previously described, narratives represent a perfect outlet for children to reflect on the characters' inner states of mind, providing an ideal context for peer learning to positively influence children's own mental state talk. In a joint story-telling task, narratives become an object of metacognitive reflection: talking about a narrative means talking about ToM.

The understanding of the ways through which children's mental state talk in primary school can be improved is affected by a few limitations. Firstly, as Hughes et al. (2007) noted, few studies have explored school-age children's mental state talk (Lecce et al., 2010; Longobardi et al., 2016). As with traditional forms of ToM assessment, most studies on children's mental state talk have explored preschoolers. This is particularly concerning, since several components which have an effect on mental state talk develop during school years (e.g., expansion of vocabulary, working memory, referential communication, and the like). Moreover, schooling introduces a new set of experiences into the child's life, which creates a new set of applications of mental state talk in everyday life (e.g., more social settings).

Secondly, studies in this area have focused especially on parent–child interactions (e.g., Adrian et al., 2005) and on conversations between siblings and/or friends, but they have rarely explored the facilitation of peer-interaction practices promoted in school, in which students work together toward a convergent outcome. This is particularly surprising, considering the bulk of research available on the efficacy of peer-assisted learning (Ginsburg-Block et al., 2006; Riese et al., 2012). Such practices are often promoted in school for their positive effects on academic achievement in several different learning processes (Palincsar and Brown, 1984; Ginsburg-Block et al., 2006; Tarchi and Pinto, 2015). In particular, narratives allow us to explore the effect of peer interaction on an open-ended school activity, which is particularly interesting as it provides children with more opportunities to negotiate meaning and exchange information (Tarchi and Pinto, 2015).

Thirdly, prior studies on socio-cognitive conflict and peer learning (e.g., Mugny and Doise, 1978) have emphasized the importance of taking into consideration children's levels of individual competence to assess the magnitude of the improvement due to working with a partner. For instance, prior studies found that a socio-cognitive conflict between children is most likely to foster progress in a specific process if children are at the moment of initial elaboration or emergence (Mugny et al., 1981). Most of the studies on children's mental state talk have assessed it in interactional contexts, but without untangling the relationship between individual and joint mental state talk performance. Moreover, when interacting, children reciprocally influence each other's use of mental state talk. However, previous studies demonstrated that children's mental state talk, which is generally highly correlated with performance on standardized ToM tests when assessed through an individual task, correlates less strongly with such performance when children interact with older partners (Symons et al., 2005). On the other hand, children might use more mental state talk when interacting with peers than with older partners (Dunn, 2000). Thus, it is important to explore under what conditions peer interaction promotes children's ToM. Some studies focused on the individual levels of participants, with two different approaches. According to the peer tutoring approach, peer learning is effective when there is a discrepancy in individual mastery of the target skill (Topping, 2005). Instead, according to the reciprocal peer learning approach, peer learning activities are mostly successful when the two members have similar levels in the target skill and scaffold each other (Duran and Monereo, 2005). In this study, we assessed children's mental state talk twice, in an individual and in a joint condition.

Lastly, from past studies on peer learning we know that creating a learning environment for storytelling based on peer interaction does not necessarily mean that students will automatically exploit it by engaging in productive collaboration (Prangsma et al., 2007). Prior studies on the discursive practices in peer-interaction educational contexts have put emphasis on both the level of participation in the discourse and the participants' use of language (Kovalainen and Kumpulainen, 2005; Tarchi and Pinto, 2015).

A few studies have investigated the relationship between the quality of the interaction and mental state talk. Hughes et al. (2006) studied the quality of sibling interactions in relation to children's mental state talk. One hundred and one families participated in the study, which included 111 2-year-olds and 111 female siblings, for a total of 61 same-sex dyads and 50 opposite-sex dyads. Dyads were video-taped during a 2-h play session at home. Transcripts were coded for the presence of mental state talk (referred to as inner state talk in the original article). The frequency of mental state talk was significantly correlated with video-based ratings of reciprocal play, even when the effects of age, verbal ability, and ToM performance were controlled. O'Connor and Hirsch (1999) investigated whether adolescents' understanding and attribution of mental states was a function of the quality of the relationship, rather than a context-independent characteristic of the individual. Participants were presented with six school situations through a semi-structured interview to assess their mentalising about teachers. Two factors were manipulated to verify the context-dependence hypothesis: most liked compared with least liked teacher, and self compared with other student. According to the results, early adolescents exhibit a more advanced understanding and attribution of mental states for the behavior of teachers whom they like, compared to those whom they do not like. An indirect measure of the relationship between quality of interaction and mental state talk derives from a study conducted by Meins et al. (2006), who explored 7- to 9-year-old children's mental state talk in two tasks, book narration versus describing a friend. Children's mental state talk scores correlated between the two tasks, even after the effects of age and verbal ability were controlled. According to the authors, children's mental state talk in non-interactional situations generalizes across relational contexts. Furthermore, their mental state talk measures did not correlate with ToM measures, whereas previous studies found that interactional measures of mental state talk were related to ToM. One explanation of this discrepancy could lie in the different ages at which these associations have been explored. Generally, children's mental state talk in interactional contexts has been studied in preschoolers, thus their developing ToM could have constrained their mental-state reasoning capacities. In older children, ToM capacities are more advanced and might no longer influence children's mental state talk. Alternatively, in preschoolers the association between children's mental state talk and ToM might be mediated by the mind-mindedness of their partner. This study contributes to the research area on the relationship between the quality of the interaction and mental state talk by exploring mental state talk produced by school-age children interacting with a classmate in comparison to individual levels of mental state talk.

## Aims of the Study

The aim of this study was to analyze the effect of a peer-interaction condition on mental state talk through a joint narrative task. Consistent with Vygotsky's internalization hypothesis, participating in a joint narrative task might facilitate children's development of mental state talk and, in turn, foster a cognitive reorganization of their own ToM (Symons, 2004). Also, consistent with the socio-cognitive conflict hypothesis, peer learning stimulates children to talk about the story, the plot, and the characters' intentions, actions, and internal responses. Talking about a narrative makes the narrative itself an object of metacognitive reflection.

This study addressed the limitations of the literature by (i) exploring ToM through mental state talk in school-age children, (ii) while engaged in a peer learning task (story-telling in school), (iii) with a focus on the contribution of children's individual mental state talk, the discrepancy between mental state talk of the two members of a couple, and the quality of the interaction during the joint story-telling task.

Several studies supported the efficacy of peer learning on several aspects of the child's psychology (Ginsburg-Block et al., 2006; Riese et al., 2012), and emphasized the importance of the interaction with others for the development of ToM (Symons, 2004). Nevertheless, several studies also pointed out that peer interaction does not always produce an improvement in children's performances, if certain conditions are not satisfied (Devescovi and Baumgartner, 1993; Slavin, 2004; Prangsma et al., 2007). Thus, we investigated whether the efficacy of peer interaction on mental state talk was systematic or not. Specifically, we explored the following conditions of efficacy:


and Baumgartner, 1993; Symons, 2004; Kovalainen and Kumpulainen, 2005), thus we expected peer learning to be more effective in couples that were able to engage in interactions characterized by a higher quality of the dialogs.

# MATERIALS AND METHODS


## Participants

Sixty-four Italian children participated in the study (23 females and 41 males). Eight children were excluded from the study as they did not participate in either the individual storytelling or the joint story-telling task. The final sample included 56 participants. Participants were randomly selected from one predominantly middle-class primary school located on the outskirts of Florence. Four classes were involved (**Table 1**).

At the time of the study, no participant had been diagnosed with a physical or mental disability, was undergoing a diagnostic process, or had been identified by the teachers as having special educational needs. Parents and school authorities, as well as the children themselves, gave consent to participate in the study. Regarding the Italian educational system, children start formal teaching of literacy at the age of six, when they enter primary school, and finish primary school at the end of fifth grade, at the age of 10 or 11.

## Procedure

Participants were asked to produce oral stories under two different experimental conditions: (a) a free story production by a single child; (b) a free story production by a couple: two children of the same gender constructed and told an invented story together. Joint-narrative partners were randomly assigned. The order of the two tasks was counter-balanced. Each story told by the children, as well as their dialogs, was recorded and transcribed. For joint narratives, the dialogs and the story were separated and considered as distinct sets of data. The researcher, in agreement with the teachers, first explained the storytelling tasks to the entire class so as to reassure the children and promote a climate of trust. Children were asked to make up a story without any book, visual materials, or topic to guide them. As a consequence, children generated stories with very different content. It is important to note that individual and joint story-telling are daily school activities, since they are often used by teachers, making them an ecologically valid method to explore children's mental state talk. We included an example of narrative production of a couple of children from 1st grade (two individual narratives and one joint narrative) as Supplementary Table S1. After that, the activity continued in a room adjacent to the classroom both with the individual children and with the couples. In the first phase, a free story production was requested from the child (Task 1): "I would like you to tell me a story." In the second phase, a free story production was requested from a couple of children (Task 2): "I want you and your partner to tell me a story invented by you together." In the joint condition, children could plan their performance as they preferred. Some first planned and agreed on the title and/or plot; others just started telling the story and interacted during its construction. Each child, as well as each couple, stayed with the researcher for 15 to 30 min, and every story was recorded. Overall, we collected 56 individual stories and 28 stories told by two children together. The data collection took place in agreement with the school and following the requirements of privacy and informed consent requested by Italian law (Legislative Decree DL-196/2003). Regarding the ethical standards for research, the study referred to the latest version of the Declaration of Helsinki (World Medical Association, 2013). The present study was approved by the Ethical Committee of the Department of Psychology at the University of Florence, Italy.

TABLE 1 | Description of the sample: total number, age, distribution of males and females, and mental state talk performance in individual and joint condition (mean and standard deviation).

# Coding Systems

Two independent judges coded the narratives in terms of narrative competence and mental state talk in individual and joint narratives, and quality of dialog in joint narratives. Interrater agreement scores were all acceptable (k > 0.70).
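The text reports an agreement threshold (k > 0.70) without naming the statistic. As a minimal sketch, assuming the index is Cohen's kappa for two raters, agreement could be computed as follows; the function name and the category labels used in the usage note are hypothetical illustrations, not the authors' materials.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items (assumed statistic)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)
```

For example, `cohens_kappa(["emotional", "cognitive"], ["emotional", "cognitive"])` describes perfect agreement, while values near zero indicate agreement no better than chance.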

#### Mental State Talk

Mental state talk was analyzed by identifying terms and expressions referring to mental states (adapted from Bretherton and Beegley, 1982). In particular, we identified the following categories: perceptual-physiological states, emotional states, willingness states, cognitive states, and moral and socio-relational states (**Table 2**).

#### Narrative Competence

Children's narrative competence was assessed in terms of structure, cohesion, and coherence, using a coding scheme developed by Spinillo and Pinto (1994), and adapted by Pinto et al. (2015).

#### **Structure**

On the basis of the presence, absence, and/or combination of fundamental elements of a story (title, conventionalized story opening, characters, setting, problem, central event, resolution, and conventionalized story closing), children's productions were given an index score ranging from 0, "non-story," to 5, "complete story" (see Supplementary Material for details and examples on the narrative coding system, Supplementary Table S2).

#### **Cohesion**

Causal and temporal linguistic connectives were counted. Examples of causal connectives are: thus, because, therefore, it follows that, to this aim, as things stand, and the like (e.g., The fox wanted to eat the chicken. **To this aim**, the fox decided to hide).


TABLE 2 | Description of the coding system for mental state talk (adapted from Bretherton and Beegley, 1982).

Examples of temporal connectives are: after, before that, at the beginning, suddenly, soon, and the like (e.g., Suddenly, the two boys heard a noise). Based on the number of connectives per total number of words, we assigned the narratives to four categories of cohesion: absent; low (the ratio of connectives/words was below the 33rd percentile); medium (the ratio of connectives/words was between the 33rd and 66th percentiles); and high (the ratio of connectives/words was above the 66th percentile). Absent was assigned a score of 0, low a score of 1, medium a score of 2, and high a score of 3.
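The cohesion scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, the cut points are computed only over narratives with at least one connective (the text does not say which narratives enter the percentile distribution), and `statistics.quantiles(..., n=3)` yields cut points at one third and two thirds, which only approximates the 33rd and 66th percentiles.

```python
import statistics


def cohesion_scores(connective_counts, word_counts):
    """Map connectives-per-word ratios to cohesion scores 0 (absent) to 3 (high)."""
    if any(w <= 0 for w in word_counts):
        raise ValueError("Every narrative needs a positive word count.")
    ratios = [c / w for c, w in zip(connective_counts, word_counts)]
    # Assumption: tercile cut points come from narratives with some connectives.
    nonzero = [r for r in ratios if r > 0]
    low_cut, high_cut = statistics.quantiles(nonzero, n=3)
    scores = []
    for r in ratios:
        if r == 0:
            scores.append(0)      # absent
        elif r < low_cut:
            scores.append(1)      # low
        elif r <= high_cut:
            scores.append(2)      # medium
        else:
            scores.append(3)      # high
    return scores
```

Because the categories depend on the sample's own distribution, the same narrative could receive a different score in a different sample.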

#### **Coherence**

Incongruences were identified and counted (sentences introduced by an adversative even though they did not contradict the previous sentence, such as: the monsters did not want to make peace, **but** the monsters wanted to attack). Based on the number of incoherencies per total number of propositions, we assigned the narratives to four categories of coherence: absent; low (the ratio of incoherencies/propositions was below the 33rd percentile); medium (the ratio of incoherencies/propositions was between the 33rd and 66th percentiles); and high (the ratio of incoherencies/propositions was above the 66th percentile). Absent was assigned a score of 0, low a score of 1, medium a score of 2, and high a score of 3.

#### Quality of dialogs

The quality of dialogs was analyzed in terms of discourse moves and communicative functions (Kovalainen and Kumpulainen, 2005).

#### **Discourse moves**

The analysis of discourse moves shows the participatory roles of each member in collective meaning making. The units of analysis are participants' utterances. We coded three types of discourse moves: children's initiation moves, that is, utterances used to open a discourse on a particular topic; children's response moves, that is, utterances that elaborated other initiations or responses; and children's follow-up moves, that is, utterances that provided feedback on the ongoing interaction. This analysis allowed us to explore to what extent children engaged in dialogs, rather than producing solo utterances, and what the role of the experimenter was.

#### **Communicative functions**

The analysis of communicative functions focuses on the message unit and permits us to explore the nature of the interaction and its construction in ongoing interactions. The units of analysis are participants' utterances. We coded nine categories of communicative function (**Table 3**).

## Data Analysis

Mental state talk was divided by the fluency of the participants' productions: the total number of words used to tell the stories was counted to create ratios, standardize participants' performances, and check for the potentially confounding effect of narrative length. Ratios were also calculated for cohesion and coherence scores, dividing raw scores by the total number of words. Subsequently, mental state talk scores were transformed into percentiles. There are several ways to explore children's narrative competence, adopting both continuous data (Haden et al., 1997; Fivush et al., 2006) and categorical data (Bigozzi and Vettori, 2015; Pinto et al., 2015, 2016b). In this study, narrative competence variables (i.e., mental state talk, structure, cohesion, and coherence) were re-coded into a 3-point scale using the percentile distribution: the first point was for scores lower than the 33rd percentile, the second point for scores between the 33rd and the 66th percentile and, finally, the third point corresponded to scores higher than the 66th percentile. Each variable was recoded consistently with this positional criterion, both for individual and for joint narrative tasks.

To verify whether the joint condition systematically improved students' mental state talk when compared to their individual performances, we identified incremental and decremental subjects. To this aim, we compared the individual and joint performances of each subject, and identified two groups: individuals who increased their mental state talk from the individual to the joint condition (incremental), and individuals who decreased their mental state talk from the individual to the joint condition (decremental) (**Table 4**).

TABLE 3 | Analysis of communicative functions.

TABLE 4 | Frequencies of decremental and incremental individuals/couples (total scores and divided by grade).

Since prior research showed that children's narrative competence develops throughout primary school (Bamberg, 1997), we verified the influence of children's narrative competence on the efficacy of peer interaction on mental state talk. To this aim, we compared performances in structure, cohesion, and coherence of incremental children versus decremental children. Then, we verified whether the joint condition is particularly effective for individuals with low levels of mental state talk. We tested the frequency of participants' distribution in the three groups through a binomial statistical test.
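A binomial test of this kind can be sketched in a few lines. The version below is an exact two-sided test against a chance level of 0.5; the choice of a two-sided test, the function name, and any counts passed to it are assumptions made for illustration, since the text does not report these details.

```python
from math import comb


def binomial_test_two_sided(successes, trials, p=0.5):
    """Exact two-sided binomial test: total probability of outcomes that are
    no more likely than the observed count under the null proportion p."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials.")
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)]
    observed = pmf[successes]
    # Small tolerance so that ties in probability are counted despite rounding.
    p_value = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, p_value)
```

With hypothetical counts, `binomial_test_two_sided(incremental_count, classified_total)` would return a large p-value whenever incremental and decremental participants are roughly equally frequent.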

To verify the conditions under which joint narratives have a beneficial effect on children's mental state talk, we changed the unit of analysis from the individual to the couple, and proceeded to identify incremental and decremental couples. A couple was defined as incremental if the percentile score in the joint condition was higher than the scores obtained by the two participants of the couple in the individual condition. A couple was defined as decremental if the percentile score in the joint condition was lower than the scores obtained by the two participants of the couple in the individual condition (**Table 4**). We explored two conditions through a series of Mann–Whitney U tests: (i) whether the joint condition is particularly effective for couples made up of individuals with discrepant individual performances in mental state talk; and (ii) whether incremental couples were engaged in interactions of higher quality than decremental couples were. For all statistical analyses, the effect size was estimated (Fritz et al., 2012).
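The couple-level definitions above translate into a simple rule. The sketch below is illustrative only: the function name is hypothetical, and the "unclassified" label for couples whose joint score falls between (or equals one of) the two individual scores is an assumption, because the text does not state how such couples were handled.

```python
def classify_couple(joint_score, individual_a, individual_b):
    """Classify a couple from its joint score and the two members' individual scores."""
    if joint_score > max(individual_a, individual_b):
        return "incremental"   # joint score exceeds both individual scores
    if joint_score < min(individual_a, individual_b):
        return "decremental"   # joint score falls below both individual scores
    return "unclassified"      # assumption: neither definition applies
```

Comparing the joint score with both members, rather than with their mean, keeps the classification conservative: a couple counts as incremental only when the joint narrative outperforms even its stronger member.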

# RESULTS

Descriptive statistics for mental state talk and narrative competence in the individual and joint condition are reported in **Table 5**. Descriptive statistics for quality of interaction in the joint condition are reported in **Table 6**.

In the individual condition, mental state talk did not correlate with any narrative competence score, namely structure (r = 0.14, p = 0.31), cohesion (r = −0.13, p = 0.34), or coherence (r = −0.02, p = 0.89). In the joint condition, mental state talk correlated with cohesion (r = 0.41, p = 0.04), but not with structure (r = 0.12, p = 0.56) or coherence (r = −0.06, p = 0.77). According to the Mann–Whitney test, the performances in structure (U = 359.50, z = −0.81, p = 0.94, η <sup>2</sup> = 0.00), cohesion (U = 300.00, z = −1.11, p = 0.27, η <sup>2</sup> = 0.07), and coherence (U = 267.50, z = −1.80, p = 0.07, η <sup>2</sup> = 0.18) of incremental and decremental children were statistically similar.

## Effects of Joint Narratives

The joint condition was not systematically beneficial for all students' mental state talk performances. The probability of using more mental state talk in the joint condition than in the individual one was not above chance (Binomial test, p = 0.89). On a descriptive level, we compared the differences from the individual to the joint performances of incremental versus decremental participants (**Figure 1**). In the joint condition, incremental children are able to increase their use of perceptual, moral, and willingness terms, whereas emotional terms are substantially stable in the two conditions. However, incremental children also decrease their use of cognitive terms in the joint condition. Decremental individuals decrease the use of mental state talk in all categories from the individual to the joint condition, with cognitive terms displaying the highest percentage of change.

TABLE 5 | Descriptive statistics for mental state talk and narrative competence (ratios: mental state term/number of words): Mean (M), standard deviation (SD), median (Mdn), skewness (Skw), and kurtosis (Kur).

TABLE 6 | Descriptive statistics for quality of interaction (count of discourse moves and communicative functions): Mean, standard deviation, median, skewness, and kurtosis.

# Conditions of Efficacy of the Joint Condition

To explore the conditions under which joint narratives increase children's mental state talk, we changed our unit of analysis to couples (incremental and decremental). To illustrate the differences in incrementation and presence of mental state in the individual narrative across grades, in **Table 1** we report the means of the incremental couples' mental state talk ratios in the individual and joint condition, for the total sample as well as for each grade.

When analyses were conducted at the individual level, one statistically significant result emerged. According to the Mann–Whitney U test, incremental individuals (Rank mean = 20.54) had lower levels of mental state talk in the individual condition than decremental individuals had (Rank mean = 33.96), U = 183.00, z = −3.14, p < 0.01, η <sup>2</sup> = 0.55.
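For readers unfamiliar with the statistics reported here, the following sketch shows how mean ranks and the U statistic are obtained from two groups of scores. It is a generic illustration with a hypothetical function name, not a reconstruction of the authors' analysis: it uses midranks for ties and does not compute z, p, or the effect size.

```python
def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic and the mean rank of each group."""
    values = list(group_a) + list(group_b)
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return {
        "U": min(u_a, u_b),
        "mean_rank_a": rank_sum_a / n_a,
        "mean_rank_b": sum(ranks[n_a:]) / n_b,
    }
```

A lower mean rank for one group, as reported above for incremental individuals, means that its scores tend to sit lower in the pooled ordering of all participants.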

When analyses were conducted at the couple level, two statistically significant results emerged, both related to differences in quality of interaction. When we compared types of couples on the basis of discrepancy among individual performances in mental state talk of the two members of each couple, the Mann–Whitney U test did not report a statistically significant difference (**Table 7**). When we compared types of couples on the basis of quality of interaction (discourse moves and communicative functions), the Mann–Whitney U test showed that incremental couples are characterized by more dialogs initiated by the teacher, and more utterances aimed at orchestrating the interaction, than decremental couples are. Although not a significant result, the Mann–Whitney U test showed a tendency for students in incremental couples to speak more than students in decremental couples (**Table 7**).

# DISCUSSION

The aim of this study was to analyze whether a joint narrative condition influenced children's production of mental state talk. Mental state talk is a valid and reliable indicator of children's ToM (Dunn and Hughes, 1998; Astington and Baird, 2005; Symons et al., 2005; Antonietti et al., 2006), thus the results of this study can contribute to our understanding of the influence of interactional contexts and discursive practices in school on children's understanding of other people's thoughts, beliefs, feelings, and intentions. School peer-interaction practices have a demonstrated positive effect on several aspects of the child's psychology (e.g., academic performances, O'Donnell and King, 1999; cognitive development, Riese et al., 2012; and social skills, Ginsburg-Block et al., 2006), and we extended this effect to mental state talk. We were interested in the conditions under which a peer-interaction context improves children's mental state talk.

TABLE 7 | Mean rank comparison between the two types of couples (incremental vs. decremental) in terms of mean discrepancy between individual performances of the two members of each couple and quality of interaction (discourse moves and communicative functions): sample sizes, mean ranks, Mann–Whitney U test (ZU), p-value and effect-size (η<sup>2</sup>).

T, teacher; S, student.

Firstly, we controlled for the effect of narrative competence. Narratives themselves are an important source of mental state talk (Dyer et al., 2000), thus children's production of mental states could be influenced by their capacity to represent the protagonist's intentions and subsequent actions (Pelletier and Beatty, 2015). Our results indicated that children's production of mental state talk was unrelated to their competence in producing a narrative with a conventional structure, either in the individual or joint condition. Mental state talk appears to be an independent component of children's mind, which can be facilitated or hindered by contextual variables, such as a narrative task, but does not overlap with other skills involved in the task itself, such as narrative competence. In other words, children's mental state talk is activated by narratives, rather than being a by-product of narrative competence. Prior research showed that children's narrative competence develops throughout primary school (Bamberg, 1997). In this study, we controlled for this potentially confounding effect by comparing incremental and decremental children's performances in structure, cohesion and coherence. No significant difference emerged, suggesting that children's developing narrative competence did not play a significant role in supporting mental state talk. Narrative competence and ToM appear to be independent constructs.

The results of this study confirmed that peer interaction does not automatically lead to increased performances, as two students are not necessarily able to engage in a productive collaboration (Prangsma et al., 2007). Before turning our attention to the conditions under which peer interaction produces an increase in mental state talk, let us discuss changes in the patterns of mental state talk from the individual to the joint condition in incremental and decremental couples. In the joint condition, incremental couples increase their use of perceptual and physiological terms, willingness terms, and moral terms. In particular, incremental and decremental couples display the largest difference in the use of moral terms. Thus, peer interaction seems to act on the core component of a narrative. According to Linde (2010), it is the inclusion of a moral meaning that distinguishes a story from a list of events or a chronicle. Interestingly, moral components cannot be completely defined structurally, as confirmed by the lack of correlation between mental state talk and narrative competence, including the structural component. Linde (2010) also added that a narrative can be considered successful if there is an agreement on the moral meaning of a story. Generally, such an agreement should take place between the narrator and the interlocutor, whereas a joint narrative activity requires this agreement to be reached by the two narrators. In this sense, peer interaction might be a tool for reflecting on the moral aspects of a story and on its dialogical nature.

The other two main differences between incremental and decremental couples in terms of change across the two conditions concern perceptual-physiological terms and willingness terms. As suggested by a previous study (Pelletier and Beatty, 2015), children need high levels of mental state talk to create a narrative based on the characters' intentions and the subsequent actions. Thus, peer interaction might stimulate children to share and negotiate the intentions of the characters of the joint narrative (i.e., willingness states) and the actions connected to such intentions (i.e., perceptual and physiological states). Feathers (2002) stated that if the characters' intentions are explicitly described in a narrative, then children are better able to identify each mental state present in the story, and peer interaction might contribute to this link.

Having confirmed that peer interaction does not automatically lead to higher performances in mental state talk, we proceeded to explore the conditions under which children increased their mental state talk from the individual to the joint condition. A first variable controlled for in this study was children's individual levels of mental state talk. Prior studies have demonstrated that, in certain cases, children's ToM, as assessed by a standardized test, is more strictly related to their individual mental state talk than to the mental state talk produced while interacting with a partner (i.e., older partner, Symons et al., 2005). According to our data, children included in the incremental couples had lower levels of mental state talk in individual narratives than children included in the decremental couples did. Thus, the facilitating effect of a peer-interaction condition is confirmed for children who are at the moment of initial elaboration or emergence of mental state talk, in line with prior studies demonstrating the conditions under which group performance is superior to individual performance (Mugny et al., 1981).

A second variable examined in this study to identify the conditions under which peer interaction positively influences children's mental state talk was the discrepancy between the individual mental state talk of the two members of a couple. According to our data, the individual levels of mental state talk of members of incremental couples were not more or less discrepant than the ones of decremental couples. This finding emphasizes that for peer learning to be effective, there is no need to create a couple with an asymmetrical relationship ("peer tutoring;" Duran and Monereo, 2005), a model advocated by Vygotsky, who claimed that problem-solving in interaction with more expert peers allows the child to enter new areas of potential (i.e., zone of proximal development), with both members of the couple benefitting from the interaction by internalizing all the processes enacted during the meaning-constructing discourse (Vygotsky, 1978).

Finally, we examined the interaction between partners in joint narratives in terms of source of interaction and communicative use of language. According to our data, incremental couples interacted more than decremental couples did, as shown by a higher number of interventions by the children. Also, children produced more utterances to orchestrate and regulate the dialog, which is probably the reason why children in the incremental couples interacted more and, in turn, benefitted more from the joint narrative condition. Peer-assisted learning contexts require high levels of intersubjectivity, which needs to be accomplished by mutual adjustments of the two partners (Ginsburg-Block et al., 2006; Tarchi and Pinto, 2015). None of the other comparisons was statistically significant. Students in incremental and decremental couples seemed to interact in a similar way: they mainly interacted to define and elaborate the topic of their narrative, exchanged information and confirmed that they agreed on their partner's story-lines.

# CONCLUSION

This study describes the effect of peer interaction on mental state talk. Our results suggest that a peer interaction intervention is mostly beneficial for children with lower levels of individual mental state talk. This is consistent with a traditional line of research on socio-cognitive conflict emphasizing that children's progress as a function of interacting with others is significant when they are in the initial stages of the elaboration of the target process (Mugny et al., 1981). Moreover, interaction played an essential part in the effect of peer learning. Children who improved their mental state talk in the joint condition were able to create a high level of intersubjectivity with their partner, as demonstrated by the higher number of interventions to orchestrate the dialog. When focusing on the interaction, as well as the product, children were also able to achieve a mutual understanding of the activity by being exposed to each other's reasoning behind their own viewpoint (Kruger, 1993). This mechanism appeared to be more important than having students working with a more expert peer (peer tutoring, Duran and Monereo, 2005).

This finding provides useful information for educators: children's ToM can be improved through children's engagement in a peer-assisted learning task. Moreover, in agreement with Pelletier and Beatty (2015), we believe that this study also contributes to improving children's appreciation of narratives, which could be hindered by an impaired understanding of the story characters' mental states. Furthermore, our results emphasize the importance of the role played by the teacher. Incremental couples were characterized by more interventions by the adult, which scaffolded children's interactions and co-construction of the story.

This study was affected by a few limitations. Firstly, our results are limited by the small sample size, which entails problems of statistical power and the risk of not finding existing associations between variables. Moreover, our sample size did not allow us to test the moderation effect of age on the association between peer interaction and mental state talk. Secondly, although several studies used and validated mental state talk as an implicit measure of ToM (e.g., Dunn and Hughes, 1998; Astington and Baird, 2005; Symons et al., 2005; Antonietti et al., 2006), results from this study would be sounder if an explicit evaluation of ToM with a specific test was included. Thirdly, past studies have shown that children's ToM and mental state talk correlate with other variables, such as executive functions (Bianco et al., 2015). Future studies should include these variables and examine whether the results obtained in this study partially depend on their influence. Fourthly, generalization of results is limited by the research design of this study, in particular by the use of oral narratives. Prior studies have demonstrated the presence of a discontinuity in children's narrative competence when writing is introduced (Pinto et al., 2015, 2016b). In primary school children are asked to write their narratives, rather than tell them, but we believe in the importance of keeping oral narratives in primary school too, given their fundamental role in eliciting and organizing children's ToM through the use of mental state talk (Guajardo and Watson, 2002; Pinto et al., 2016a). Finally, in this study children were allowed to create stories without specific directions. As a consequence, children's narratives resulted in a wide variety of contents. Prior studies emphasized the influence of the context and instructions on children's narrative production (e.g., Berman, 1995; Cameron and Hutchison, 2009), and future studies should verify whether mental state talk also depends on the instructions given and the content of the stories produced.

# AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct and intellectual contribution to the work, and approved it for publication.

### ACKNOWLEDGMENT

We would like to thank Anna Tosi for her help with data collection.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01669

# REFERENCES

Vygotsky, L. S. (1978). Mind in Society. Cambridge, MA: Harvard University Press.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Pinto, Tarchi and Bigozzi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Relationship between Emotion Comprehension and Internalizing and Externalizing Behavior in 7- to 10-Year-Old Children

#### Ariane Göbel<sup>1</sup> \*, Anne Henning<sup>2</sup> , Corina Möller<sup>3</sup> and Gisa Aschersleben<sup>3</sup>

<sup>1</sup> Clinic for Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Hamburg-Eppendorf, Hamburg, Germany, <sup>2</sup> Early Intervention Institute, SRH College of Health, Gera, Germany, <sup>3</sup> Department of Psychology, Developmental Psychology Unit, Saarland University, Saarbrücken, Germany

The influence of internalizing and externalizing problems on children's understanding of others' emotions has mainly been investigated on basic levels of emotion comprehension. So far, studies assessing more sophisticated levels of emotion comprehension reported deficits in the ability to understand others' emotions in children with severe internalizing or externalizing symptoms. The aim of this study was to investigate the relation between emotion comprehension and interindividual differences, with a focus on internalizing and externalizing behavior in children aged 7–10 years from the general population. A sample of 135 children was tested for emotion understanding using the Test of Emotion Comprehension. Information on internalizing and externalizing behavior was assessed with the Child Behavior Checklist 4/18. Age, bilingual upbringing, and amount of paternal working hours were significant control variables for emotion comprehension. In contrast to prior research, overall level of emotion understanding was not related to externalizing symptoms and correlated positively with elevated levels of somatic complaints and anxious/depressed symptoms. In addition, and in line with previous work, higher levels of social withdrawal were associated with worse performance in understanding emotions elicited by reminders. The present results implicate not only an altered understanding of emotions among more specific internalizing symptoms, but also that these alterations occur already on a low symptom level in a community based sample.

Keywords: emotion comprehension, emotion understanding, behavioral problems, internalizing, externalizing, child behavior checklist

#### Edited by:

Ilaria Grazzani, University of Milano-Bicocca, Italy

#### Reviewed by:

Tilmann Habermas, Goethe Business School, Germany

Alessandro Pepe, University of Milano-Bicocca, Italy

#### \*Correspondence:

Ariane Göbel a.goebel@uke.de

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 29 August 2016 Accepted: 22 November 2016 Published: 06 December 2016

#### Citation:

Göbel A, Henning A, Möller C and Aschersleben G (2016) The Relationship between Emotion Comprehension and Internalizing and Externalizing Behavior in 7- to 10-Year-Old Children. Front. Psychol. 7:1917. doi: 10.3389/fpsyg.2016.01917

# INTRODUCTION

Emotion comprehension is defined as the knowledge to identify and understand others' emotions by facial or bodily cues, and within specific social contexts (Harris et al., 2016). Emotion comprehension develops up until early adolescence along with increasing abilities in perspective taking and understanding of social and moral norms, and can therefore be described as the affective side of social cognition (Wellman, 2014).

Pons et al. (2004) identified at least nine successive components of emotion comprehension, which children master until the age of 11 years and which can be assessed via the Test of Emotion Comprehension (TEC, Pons et al., 2004). The proposed components can be categorized into three developmental phases: First, in the external phase, children between 3 and 5 years are able to recognize and label facial expressions of basic emotions (e.g., Bullock and Russell, 1985), infer a person's emotion based on situational cues (e.g., Cutting and Dunn, 1999), and understand that an external reminder may reactivate a past emotional state (e.g., Lagattuta et al., 1997). By the age of 7 years, children master the mental phase characterized by the improved ability of perspective taking. Within this phase, children understand that two persons can experience different emotions regarding an object depending on how strong their desire to receive the object is (e.g., Harris et al., 1987). Further, children understand that the same situation can elicit different emotions in two persons depending on their belief about the context of the situation (e.g., being aware of danger or not; Pons et al., 2004). At the same age, they know that a person's outward emotional expression and internal emotional experience can deviate from each other (Wellman and Liu, 2004). Finally, by around 9 years, children take different, divergent perspectives on the same scenario into account and by doing so, learn to understand the components of the reflective phase. They understand that incorporating different perspectives on a given situation may result in conflicting, mixed emotions in the same person, that a transgression of moral rules to satisfy a desire leads to negative emotions (e.g., Lagattuta, 2005), and moreover that cognitive strategies can effectively regulate emotions (e.g., Stegge and Terwogt, 2007).

Although it was shown that, despite small variations on component level, the development of emotion understanding from a rather external to a deeper, more complex understanding is similar among western and non-western cultures (Janke, 2008; Roazzi et al., 2009; Molina et al., 2014), some studies report evidence that interindividual differences influence the ontogenesis of emotion comprehension. Especially the child's receptive (understanding and comprehension of spoken or written words/sentences) and expressive (production of spoken or written words/sentences) language abilities are related to emotion comprehension throughout childhood (e.g., Cutting and Dunn, 1999; Pons et al., 2003; Beck et al., 2012). Moreover, cross-sectional and longitudinal studies indicate that not only general language abilities, but also the amount of communication of mental and emotional states with parents and older siblings influences children's current and later emotion understanding (Brown and Dunn, 1992; Cutting and Dunn, 1999). In accordance with these findings, training studies confirm the positive effect of explicit emotion state talk on emotion comprehension (e.g., Pons et al., 2003; Gavazzi and Ornaghi, 2011). Additionally, factors like the socioeconomic status, the quality of attachment between family members, or non-verbal intelligence scores have been identified as positive predictors of emotion comprehension (Meins and Fernyhough, 1999; Albanese et al., 2010; Colle and Del Giudice, 2011).

Besides the aforementioned factors, children's behavioral problems in the form of internalizing and externalizing symptoms have been found to be associated not only with social adjustment but also with their ability to understand others' emotions. Among children and adolescents, internalizing and externalizing problems are the most common mental health problems, with prevalence rates of ∼10 and 14%, respectively (Ihle and Esser, 2002; Hölling et al., 2007). Internalizing problems are characterized by anxious and depressive symptoms, social withdrawal, and somatic complaints. Externalizing problems, on the other hand, are defined as aggressive, oppositional, and delinquent behavior. Long-term consequences are problems within social, school, and later professional environments (Ihle et al., 2000). Over the course of development, gender differences in the prevalence rates of emotional and behavior problems emerge: While boys show higher rates of internalizing symptoms during childhood, an increase in internalizing symptoms with greater long-term stability is reported for teenage girls and young women (Ihle et al., 2000; Hölling et al., 2007). Externalizing symptoms are in general more common among boys, have an earlier onset in childhood, and show higher persistence rates with more unfavorable courses (Plück et al., 2000).

Longitudinal studies investigating the relation between behavioral problems and emotion understanding in samples from the general population report that pre-schoolers' ability to recognize others' facial expressions negatively predicted the level of externalizing symptoms during the pre-school years (Denham et al., 2002). Further, the comprehension of facial emotional expressions and the emotion situation knowledge of socially disadvantaged first-graders predicted their level of self-reported internalizing symptoms 4 years later (Izard, 2001; Fine et al., 2003). In a recent meta-analysis, Trentacosta and Fine (2010) investigated the comprehension of discrete emotions, which can be identified directly via facial expression, gesture, vocalization, or social context. The authors report robust low to medium negative effect sizes for internalizing and externalizing symptoms in both community and clinical samples. When the age subgroups were considered separately, effect sizes varied as a function of age, with small negative effect sizes for 3- to 5-year-olds and medium negative effect sizes for 9- to 15-year-olds. However, for 6- to 11-year-old children, no significant effects of internalizing or externalizing symptoms were found.

Notably, the few studies comprising the oldest age group mostly assessed emotion knowledge in samples of children with clinically relevant behavioral problems. Overall, the above-mentioned results support the assumption that children with internalizing and externalizing behavior, in both clinically diagnosed and community samples, have difficulties in understanding discrete emotions.

However, only a few studies have investigated this relation by testing more complex aspects of emotion comprehension (Southam-Gerow and Kendall, 2000). Regarding internalizing symptoms, most evidence for a relationship with higher levels of emotion comprehension derives from studies investigating clinically referred children with depression and anxiety disorders. Researchers focusing on individual components of emotion comprehension report that children diagnosed with depression or anxiety disorders show a significantly worse understanding of situations eliciting mixed emotions (Meerum Terwogt, 1990) and less knowledge of strategies to regulate or hide emotions (Southam-Gerow and Kendall, 2000). In a study testing performance on all TEC components in a sample of clinically anxious 8- to 12-year-olds, no association was found between emotion comprehension and the general level of anxiety symptoms, although associations were found between emotion comprehension and more specific forms of anxiety in obsessive-compulsive and post-traumatic stress disorder (Bender et al., 2015). To our knowledge, no study has so far investigated the association between all nine TEC components and internalizing behavior in a sample from the general population.

Regarding the understanding of higher levels of emotions in children with externalizing behavior, studies are rare and report mixed results. Sutton et al. (1999) report that 7- to 10-year-old children who were identified by their teachers as ringleader bullies (defined as verbally or physically attacking others) actually showed a better understanding in social cognition tasks, including emotion comprehension, than their classmates. Belacchi and Farina (2010) were the first to test the correlation between the TEC and teacher-rated social role in a pre-school-aged sample and confirmed a better emotion understanding in children categorized as bullies, their reinforcers, or assistants for the first component of emotion comprehension (facial recognition) only. No relations were found for the remaining components or the total TEC score. In contrast, children categorized as members of prosocial groups showed positive associations with the understanding of affective expressions and emotions elicited by situational cues or desires. However, these studies focus on teachers' reports of students' social roles in the school environment, which might not validly reflect externalizing behavior in general. Therefore, research on the relation between higher levels of emotion comprehension and symptoms of behavioral problems is needed.

A model to explain the association between deficits in emotion comprehension and behavioral problems is the Social Information Processing Model (SIP) by Crick and Dodge (1994). Originally developed to explain the association between an altered perception and processing of social information in children with externalizing behavior, the model was later adapted to children with internalizing symptoms (Luebbe et al., 2010). Within SIP, six steps are proposed to explain how information is processed, and how subsequent actions are planned and performed. First, stimuli from the social environment or one's own body signals (e.g., increased heartbeat) are encoded. In a second step, they are interpreted based on the evaluation of possible intentions of the other person (e.g., to be nice or mean), one's own goals and motivations, and the perception of one's own personality. In a third step, based on this evaluation, actual personal goals for this interaction are defined (e.g., wish to play with the other person or to leave the situation). Next, based on information from long-term memory, possible actions are generated to reach this goal. The fifth step is to choose the reaction that best fits the previously defined goals or outcome expectations, which is then performed in the sixth step. Part of this process is also to consider how socially accepted the generated reaction options are. For example, punching someone might seem like the easiest way to end a conflict, but would most likely not be the most socially acceptable solution. Social responses following the reaction and its effectiveness are themselves processed, evaluated, and stored in long-term memory. Early subjective experiences from the social environment form scripts and schemata on how social interactions are constructed and usually take place. These scripts and schemata are necessary to ensure fast, efficient, and subconscious information processing, orientation, and capacity to act in everyday life (e.g., Salzer Burks et al., 1999).
The more often they are activated, the more strongly they are anchored in long-term memory, leading to an even faster and more automated activation in the right context. Therefore, they can show high stability from childhood to adulthood. Despite its efficiency, information processing based on automatically activated scripts and schemata leaves out non-compliant information and situational stimuli. This leads to a bias in processing social situations and consequently influences the development and stabilization of maladjusted behavior patterns. Maladjusted behavior, in turn, leads to negative experiences from the environment and might strengthen processing biases in a feedback loop (Crick and Dodge, 1994).

Several studies support this model. A meta-analysis of 36 studies investigating clinical and community samples found that children with disruptive and aggressive behavior showed a bias in the first steps of SIP during the processing of situational cues and stimuli (Yoon et al., 1999). The authors report not only an attentional bias, in which relevant general situational cues are neglected, but also higher rates of encoding cues associated with hostility. Furthermore, children more often attributed negative intentions to the behavior of others. Longitudinal studies investigating community samples report that altered and biased information processing in preschoolers predicted both lower popularity among peers and an increased amount of aggressive behavior in the same children when they were school-aged and teenagers (Dodge et al., 2003; Lansford et al., 2006). Studies applying the SIP paradigm to children with anxious and depressive symptoms also find a negative information processing style, with higher rates of interpreting ambiguous situations as negative, attributing negative intentions to others in social interactions, and generating counterproductive behavior responses to solve a hypothetical situation (Daleiden and Vasey, 1997; Luebbe et al., 2010). Bell et al. (2009) found associations between negative attributional biases and symptoms of both anxiety and depression in a factor-analytic study assessing 8- to 13-year-old children recruited from public schools. Additionally, only depressive symptoms were negatively associated with positive SIP. SIP biases in the perception and evaluation of information from social contexts might also explain the associations reported above between both externalizing and internalizing behavior and the comprehension of others' emotions.

Overall, the aforementioned studies indicate a lower understanding of others' emotions in children showing both internalizing and externalizing symptoms. However, impairments in emotion comprehension cannot easily be traced back to behavioral problems in general. Most studies investigating this association focus on the comprehension of discrete emotions in pre-school-aged children. Little is known about the relation between more sophisticated aspects of emotion comprehension and internalizing and externalizing behavior in children beyond the pre-school years from the general population. Therefore, the aim of the present study was to investigate the nine components of emotion comprehension in normally developing, school-aged children (7–10 years of age) and interindividual differences in emotion comprehension related to internalizing and externalizing behavior. In accordance with prior research, we expected to find an age effect for emotion comprehension, with older children outperforming younger ones.

In line with the above-mentioned results from both community and clinical samples, and based on the assumption of altered information processing in maladjusted children, we expected internalizing and externalizing behavior to correlate negatively with emotion comprehension and to independently explain unique variance in emotion understanding.

## MATERIALS AND METHODS

### Participants

The final sample comprised N = 135 children (72 female), with 34 7-year-olds (age in months M = 90.5, SD = 3.19, 17 female), 33 8-year-olds (M = 100.64, SD = 3.56, 19 female), 34 9-year-olds (M = 113.85, SD = 3.8, 18 female), and 34 10-year-olds (M = 124.62, SD = 3.7, 18 female). Of these, 12% were raised bilingually, 42.2% had one sibling, and 34.8% had two or more siblings. Regarding parental educational background, 69.7% of the fathers and 58.2% of the mothers had a high school or university degree. Further, 14.8% of the mothers and 3.7% of the fathers were unemployed, 66.7% of the mothers and 9.6% of the fathers were working half-time, and 18.5% of the mothers and 86.7% of the fathers were working full-time. All participants lived in Saarbrücken or its adjacent municipalities. The sample was recruited during open house events at Saarland University, Germany, by handing out flyers at schools, and by contacting interested families who had already participated in other studies of the department. An additional 12 children were tested but excluded due to developmental disorders (n = 2) or chronic diseases (n = 2) possibly influencing their performance, or due to insufficient data provided by the parents (n = 8). The experiment was approved by the Ethical Committee of the Faculty for Social and Applied Human Sciences at Saarland University (running number of ethical approval EK16-10).

### Measures

#### Socio-Demographic Questionnaire

Parents were asked to report the maternal and paternal education level as indices of socioeconomic status (no degree = 0; general school certificate = 1, secondary school certificate = 2, advanced technical college certificate = 3, vocational technical diploma = 4, high school diploma = 5, university degree = 6), marital status (married/cohabiting = 0 and separated/single-parent household = 1), and the child's number of siblings and friends. Maternal and paternal working hours (unemployed = 0, half-time employment = 1, and full-time employment = 2) were assessed as indices of the potential time spent with the family. Since Saarbrücken is located within the German-French border region, the child's mother tongue (German = 0; other = 1) and bilingual upbringing (monolingual = 0, bilingual = 1) were also assessed. Further, the child's medical history regarding chronic diseases or developmental disorders was assessed to check the exclusion criteria.

#### Child Behavior Checklist 4/18

The degree of internalizing and externalizing symptoms was assessed with the German version of the Child Behavior Checklist (CBCL 4/18, Döpfner et al., 1994a). The CBCL 4/18 is a widely used parent-report questionnaire assessing the child's social competence and problematic behavior within the last 6 months. This screening tool was developed as part of the empirically based dimensional classification system by Achenbach and Rescorla (2001). Within the dimensional approach, mental health problems are understood as characteristics along continuous dimensions of psychological functioning, differing from normal development in the intensity of reported symptoms. In total, 118 problem items form eight syndrome scales, which can be combined into second-order broad scales. The broad scale Internalizing Problems, which is relevant for this study, is formed by the sum score of the three syndrome scales Withdrawn, Somatic Complaints, and Anxious/Depressed. Further, the broad scale Externalizing Problems is formed by the two syndrome scales Rule-Breaking Behavior and Aggressive Behavior. Each syndrome scale consists of those items that loaded together on one factor in factor and principal component analyses and therefore form a syndrome cluster on the specific scale. Answers to each item are coded on a 3-point Likert scale with 0 = not true, 1 = somewhat or sometimes true, 2 = very true or often true. Final T-values for broad scales and syndrome scales are calculated separately by gender. The T-value cut-offs of the CBCL broad scales differ from those of the syndrome scales. For the broad scales Internalizing Problems and Externalizing Problems, a score from 60 to 62 marks behavior problems on a subclinical level and a score of 64 or higher on a clinically relevant level. On the syndrome scale level, a T-value between 67 and 70 marks behavior problems on a subclinical level and a score of 71 or higher on a clinical level.
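The cut-offs described above can be sketched as a small lookup. This is only an illustration, not the scoring procedure of the CBCL manual: the function name `classify_t_value` is hypothetical, and the text leaves a broad-scale T-value of 63 unassigned, so the sketch assumes it falls into the subclinical band.

```python
def classify_t_value(t_value: float, scale_type: str = "broad") -> str:
    """Classify a CBCL T-value with the cut-offs quoted in the text.

    scale_type is "broad" (Internalizing/Externalizing Problems) or
    "syndrome" (e.g., Withdrawn, Aggressive Behavior).
    """
    if scale_type == "broad":
        # Text: 60 to 62 subclinical, 64 or higher clinical.
        # ASSUMPTION: T = 63 is treated as subclinical here.
        subclinical_from, clinical_from = 60, 64
    elif scale_type == "syndrome":
        # Text: 67 to 70 subclinical, 71 or higher clinical.
        subclinical_from, clinical_from = 67, 71
    else:
        raise ValueError("scale_type must be 'broad' or 'syndrome'")

    if t_value >= clinical_from:
        return "clinical"
    if t_value >= subclinical_from:
        return "subclinical"
    return "normal"
```

For example, `classify_t_value(65)` falls into the clinical band of a broad scale, whereas the same value on a syndrome scale (`classify_t_value(65, "syndrome")`) is still within the normal range.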

For the German translation, Cronbach's alpha of the syndrome scales and broad scales ranges between α = 0.56 and α = 0.91. Further, convergent and discriminant validity of the German version have previously been confirmed (Döpfner et al., 1994b; Schmeck et al., 2001). In this study, Cronbach's alpha of the broad scale Internalizing Problems was α = 0.82 and ranged between α = 0.36 and 0.79 for its syndrome subscales. For the Externalizing Problems scale, Cronbach's alpha was α = 0.88. Cronbach's alpha values for the subscales Rule-Breaking Behavior and Aggressive Behavior were α = 0.39 and α = 0.88, respectively.
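For readers less familiar with the reliability coefficient reported here, the following minimal sketch computes Cronbach's alpha from item-level responses using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score). The function name and the input layout (one list of item scores per respondent) are assumptions for illustration; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    All respondents must have answered the same items (no missing values).
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of the respondents' sum scores.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)
```

With purely hypothetical data in which two items always agree, for instance `cronbach_alpha([[0, 0], [1, 1], [2, 2]])`, the coefficient reaches its maximum of 1; the less the items covary, the lower it becomes.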

#### Intelligence and Development Scales 5–10

To control for the level of language comprehension, a test taken from the Intelligence and Development Scales 5–10 (IDS 5–10, Grob et al., 2009) was conducted to assess receptive language comprehension. The IDS 5–10 is a test battery to assess cognitive development, language comprehension, mathematics, achievement motivation, psychomotor, and socio-emotional development. In the test for receptive language comprehension, the examiner first shows different toys to the child, introducing them one after the other with a specific name (e.g., girl, boy, dog, cat), and puts them on the table in front of the child. Afterward, the examiner reads aloud a behavior description of one or more characters, which the child is asked to reenact using the specific toys. The test consists of 12 successive sentences, with increasing difficulty due to higher complexity of the behavior sequences. Each sentence is scored 0 (wrong reenactment), 0.5 (right reenactment but wrong order of sequences), or 1 (right reenactment, right order of sequences). To compare the different age groups, T-values are calculated for each age group individually. Reliability of the receptive language subtest is reported with a Cronbach's alpha of α = 0.88 and a test-retest reliability of rtt = 0.57. Construct and criterion validity are also reported (see Grob et al., 2009 for more details). In this study, Cronbach's alpha was α = 0.57 and thus below the value reported by Grob et al. (2009).

#### Test of Emotion Comprehension (TEC)

The TEC (Pons and Harris, 2000) was developed to test nine components of emotion understanding, namely (I) recognition of facial expression, (II) external causes of emotions, (III) desire-based emotions, (IV) belief-based emotions, (V) the influence of a reminder on present emotional states, (VI) regulation of emotional states, (VII) hiding emotional states, (VIII) having mixed emotions, and (IX) emotions caused by moral considerations. In this study, the German version of the TEC was used (Janke, 2008). The test material consists of a picture book with simple drawings. The examiner presents nine short stories to the child, each accompanied by at least one drawing. Below the drawing for each vignette, its protagonist is portrayed with four out of five possible emotion outcomes ("happy," "sad," "angry," "scared," or "just alright"). Depending on the participant's gender, a corresponding version of the picture book with either female or male protagonists was presented. At the end of each story, the child is asked to point to the most appropriate emotion outcome. For each component, a score of 1 is assigned if it is answered correctly. The overall score of emotion understanding ranges from 0 to 9. Since research by Janke (2008) with a German sample and Albanese et al. (2006) with an Italian sample showed that, especially for components III and IV, children often chose the neutral instead of the expected positive emotion outcome, scoring was adjusted accordingly. Therefore, in addition to the original scoring by Pons et al. (2004), choosing the neutral emotion outcome for components III and IV was accepted as a correct answer. In a sample of 9-year-olds, the British version of the TEC showed good test-retest reliability after 3 months with r(18) = 0.84 (Pons et al., 2002). In another study with a sample of 7- to 11-year-olds, test-retest reliability after 13 months, controlled for age and gender, was r(40) = 0.68 (Tenenbaum et al., 2004). Its validity was also positively evaluated (see Pons et al., 2004).
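The adjusted scoring rule can be illustrated with a short sketch. The answer key passed to the function is purely hypothetical, since the actual TEC key is not reproduced in this article, and the function and constant names are likewise assumptions; only the rule itself (one point per correct component, with the neutral outcome also accepted for components III and IV) is taken from the description above.

```python
COMPONENTS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
NEUTRAL_OUTCOME = "just alright"
# Components for which the neutral outcome is additionally accepted.
NEUTRAL_ACCEPTED = {"III", "IV"}


def score_tec(responses, answer_key):
    """Return the total TEC score (0 to 9) for one child.

    responses and answer_key map component labels ("I" ... "IX") to the
    chosen and the expected emotion outcome, respectively.
    """
    total = 0
    for component in COMPONENTS:
        chosen = responses.get(component)  # None if unanswered
        correct = chosen == answer_key[component]
        if component in NEUTRAL_ACCEPTED and chosen == NEUTRAL_OUTCOME:
            correct = True
        total += int(correct)
    return total
```

Under this rule, a child who picks "just alright" on component III receives the point even if the key expects "happy", whereas the same choice on any other component is only scored as correct if it matches the key.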

### Statistical Analysis

Cases were excluded from further analyses if information on sociodemographic variables was missing (n = 6). Additionally, in line with the manual guidelines, CBCL scores were not calculated when answers on more than eight items were missing (Döpfner et al., 1994a). Further, if more than 20% of items were missing for a specific scale, this scale was excluded from analyses (n = 2). Answers missing completely at random were replaced by the average of the other items for this scale (Schafer and Graham, 2002). Since only one child was raised with another mother tongue, this variable was excluded from further analyses.
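The scale-level handling of missing answers might look as follows. This is a sketch under stated assumptions: missing answers are represented as `None`, the replacement value is taken to be the mean of the same child's remaining items on that scale (the text does not spell out this detail), and the function name is hypothetical.

```python
def impute_scale(items, max_missing_share=0.20):
    """Fill missing item answers on one scale, or drop the scale.

    items: item scores for one child on one scale, with None for a
    missing answer. Returns the completed list, or None if more than
    max_missing_share of the items are missing.
    """
    if not items:
        return None
    n_missing = sum(1 for value in items if value is None)
    if n_missing / len(items) > max_missing_share:
        return None  # scale excluded from the analyses
    observed = [value for value in items if value is not None]
    # ASSUMPTION: person-level mean of the observed items on this scale.
    scale_mean = sum(observed) / len(observed)
    return [scale_mean if value is None else value for value in items]
```

With one of five items missing (exactly 20%), the scale is kept and the gap is filled; with two of five missing, the scale is dropped.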

Inspection of means, standard deviations, skewness, and kurtosis of the assessed variables revealed that both the CBCL broad and syndrome scales had right-skewed distributions. Therefore, their T-values were log-transformed for further analyses. In order to assess differences in emotion comprehension for gender and the four age groups, a 2 (gender) × 4 (age: 7, 8, 9, and 10 years) ANOVA was performed with the total TEC score as the dependent variable. Prior to conducting the ANOVA, the data were checked for homogeneity of variance and independence of observed variables. Additionally, a Kruskal–Wallis test was conducted to test for significant age differences on the individual components of the TEC. To test for associations between the TEC total score and potential control variables (IDS score, maternal/paternal educational background, maternal/paternal working hours, bilingual upbringing, number of siblings, number of friends, marital status), bivariate Pearson and Spearman correlations were performed. Since a specific direction of relation was expected for each variable based on the literature, significance of the correlations was tested one-tailed.
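To illustrate why a log transformation is used for right-skewed scores, the sketch below computes the moment-based skewness before and after the transformation. The natural logarithm is an assumption (the base is not stated in the text), the function names are hypothetical, and any data passed in would be illustrative rather than taken from the study.

```python
import math


def skewness(values):
    """Moment-based skewness; positive values indicate a right skew."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def log_transform(values):
    """Natural-log transform (ASSUMPTION: base e) of positive scores."""
    return [math.log(v) for v in values]
```

For a hypothetical set of T-values with a long upper tail, such as `[50, 51, 52, 53, 55, 60, 75]`, the skewness of the log-transformed values is smaller than that of the raw values, although a transformation of this kind reduces rather than removes the asymmetry.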

To test the hypotheses of negative relations between emotion comprehension and internalizing or externalizing symptoms, Spearman correlations (one-tailed) were conducted. Following this, backward regression analyses were performed to investigate the amount of variance in emotion comprehension explained by each predictor. Finally, the associations between the CBCL broad and syndrome scales and each TEC component were explored using Pearson bivariate correlations (two-tailed). All statistical analyses were conducted using IBM SPSS® 22, and the significance level was set at α = 0.05.
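As a reminder of what the rank-based coefficient measures, here is a minimal, dependency-free sketch of Spearman's correlation: both variables are converted to ranks (ties receive the average rank) and the Pearson formula is applied to the ranks. The function names are hypothetical, and significance testing, which SPSS provides, is not covered.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the rank order enters the computation, the coefficient is suited to ordinal variables such as the coded working hours and is insensitive to the skew of the CBCL scores.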

## RESULTS

### Test of Emotion Comprehension

For the total sample, the TEC score ranged from 3 to 9 (M = 7.21, SD = 1.8, n = 135). A 2 (gender) × 4 (age group) ANOVA revealed a significant main effect of age on the total TEC score, F(3, 127) = 6.95, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.14. Neither a significant main effect of gender, F(1, 127) = 0.862, p = 0.36, η<sub>p</sub><sup>2</sup> = 0.007, nor a significant interaction between age and gender, F(3, 127) = 6.95, p = 0.574, η<sub>p</sub><sup>2</sup> = 0.015, was found. Post hoc contrast analyses using Tukey HSD to control for Type 1 error revealed that 7-year-olds (M = 6.56; SD = 1.44) had significantly lower TEC scores than 9-year-olds (M = 7.38; SD = 1.02; p < 0.05) and 10-year-olds (M = 7.85; SD = 0.989; p < 0.01). In addition, the 8-year-olds showed significantly lower TEC scores than the 10-year-olds (p < 0.05). The 9-year-olds did not significantly differ from the 8-year-olds (M = 7.03; SD = 1.29; p ≥ 0.629) or 10-year-olds (p ≥ 0.375). Further, while component I (recognition) and component II (external cause) had a correct response rate of 100% for the whole sample, a Kruskal–Wallis test revealed significant age differences in the expected direction for components IV (belief-based emotion), V (reminder), VII (hiding emotions), and VIII (mixed emotions; see **Table 1**).
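The partial eta squared values reported for the ANOVA can be related to the F statistics through the identity η<sub>p</sub><sup>2</sup> = (F · df<sub>effect</sub>) / (F · df<sub>effect</sub> + df<sub>error</sub>). The following sketch, with a hypothetical function name, applies this identity to the printed values as a plausibility check; it is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its df."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Main effect of age: F(3, 127) = 6.95
age_effect = partial_eta_squared(6.95, 3, 127)
# Main effect of gender: F(1, 127) = 0.862
gender_effect = partial_eta_squared(0.862, 1, 127)
```

Rounded, `age_effect` and `gender_effect` reproduce the reported values of 0.14 and 0.007. Entering the F value printed for the interaction term into the same identity does not reproduce its reported effect size of 0.015, so that F value should be read with some caution.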

Regarding the assessed control variables, analyses revealed significant correlations only between the total TEC score and paternal working hours (r<sub>s</sub> = −0.235, p < 0.01) and between the total TEC score and bilingual upbringing (r = 0.154, p < 0.05). Children of fathers with fewer working hours or no employment and those raised with a second language scored higher on the total TEC score. For the variables IDS score, maternal educational background, paternal educational background, maternal working hours, number of siblings, number of friends, and marital status, no significant correlations were found (all p-values ≥ 0.111).

### Internalizing and Externalizing Behavior

An independent t-test revealed a significant gender difference for the Externalizing Problems scale, with girls (M = 49.2, SD = 8.82) having a significantly lower score than boys (M = 55.3, SD = 9.59), t(133) = −3.6, p < 0.001, d = 0.66. For Internalizing Problems, the difference in mean scores between girls (M = 52.0, SD = 9.13) and boys (M = 54.9, SD = 8.88) was not significant, t(133) = −1.77, p ≥ 0.08, d = 0.32.
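The reported effect sizes can be approximately recovered from the group means and standard deviations. The sketch below assumes Cohen's d with a pooled standard deviation (the text does not state which variant was used) and group sizes of 72 girls and 63 boys, which follow from the sample description and the 133 degrees of freedom; the function name is hypothetical.

```python
import math


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d (group 2 minus group 1) using the pooled standard deviation."""
    pooled_variance = (
        (n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2
    ) / (n_1 + n_2 - 2)
    return (mean_2 - mean_1) / math.sqrt(pooled_variance)


# Girls first, boys second (ASSUMPTION: 72 girls, 63 boys).
d_externalizing = cohens_d(49.2, 8.82, 72, 55.3, 9.59, 63)
d_internalizing = cohens_d(52.0, 9.13, 72, 54.9, 8.88, 63)
```

Under these assumptions, both values round to the effect sizes given in the text (0.66 and 0.32), which corresponds to a medium and a small difference by common conventions.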

Regarding the broad CBCL scale Internalizing Problems (total M = 53.4, SD = 9.10), 13.3% of the sample reached a score of 64 or higher, which indicates behavioral problems on a clinically relevant level. Regarding the broad scale Externalizing Problems (total M = 52.1, SD = 9.65), 10.6% of the total sample reached a score of 64 or higher. These rates of behavioral problems are overall comparable to prevalence rates reported for German community samples (Ihle et al., 2000; Hölling et al., 2007). When considering clinically relevant scores of T ≥ 70 on the individual syndrome scales, 3.7% of the sample reached this score on the Withdrawn syndrome scale (total M = 55.4, SD = 6.86), 2.5% on the Somatic Complaints syndrome scale (total M = 54.1, SD = 5.82), 2.5% on the Anxious/Depressed syndrome scale (total M = 56.1, SD = 7.37), 5.2% on the Rule-Breaking Behavior syndrome scale (total M = 54.3, SD = 5.72), and 3% on the Aggressive Behavior syndrome scale (total M = 55.1, SD = 8.40).

### Relationship between Emotion Comprehension and Behavioral Symptoms

Pearson bivariate correlations (one-tailed) revealed significant positive correlations between the total TEC score and the syndrome scales Somatic Complaints (M = 54.0, SD = 5.84; r = 0.156, p < 0.05) and Anxious/Depressed (M = 56.0, SD = 7.3; r = 0.191, p < 0.05). None of the other syndrome or broad scales were significantly correlated with the TEC total score (all rs < 0.114, all ps ≥ 0.094).

To further investigate the amount of variance in the TEC score explained by the two syndrome scales Somatic Complaints and Anxious/Depressed and the relevant control variables age, bilingual upbringing, and paternal working hours, backward regression analyses were performed. The ordinal variable paternal working hours was dummy coded before entering it into the regression model. Since the majority of fathers worked full-time (n = 117), the category full-time employment was used as the reference group. Therefore, two dummy variables were calculated for the categories part-time employment and unemployment. Since the scales Somatic Complaints and Anxious/Depressed were not independent of one another (r = 0.268, p < 0.001), individual regression analyses were performed for both scales (see **Table 2**). The final model including Somatic Complaints explained 21.6% of the variance in the total TEC score, with age explaining 12.8% and bilingual upbringing 5%. Unemployment explained 3.49% and part-time employment 2.31%. Finally, Somatic Complaints independently explained 2.37% of the variance.
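The dummy coding of paternal working hours can be sketched as follows. The numeric codes (0 = unemployed, 1 = half-time, 2 = full-time) follow the socio-demographic questionnaire described in the Measures section; the function name and the dictionary keys are hypothetical.

```python
def dummy_code_working_hours(code):
    """Dummy-code working hours with full-time employment as reference.

    code: 0 = unemployed, 1 = half-time (part-time), 2 = full-time.
    Returns two 0/1 indicators; the reference group scores 0 on both.
    """
    if code not in (0, 1, 2):
        raise ValueError("code must be 0, 1, or 2")
    return {
        "unemployed": int(code == 0),
        "part_time": int(code == 1),
    }
```

A father working full-time is thus represented by zeros on both indicators, so each regression coefficient expresses the difference between the respective category and full-time employment.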

The same regression analysis was conducted including the Anxious/Depressed scale. This model explained 23.1% of the variance. The variance explained by the control variables was comparable to the first model, with age explaining 13.1%, bilingual upbringing 4.97%, unemployment 3.65%, and part-time employment 2.13% of the variance. Anxious/Depressed independently explained 3.84% of the variance in the total TEC score.

Further exploration of the associations between the CBCL scales and each TEC component revealed significant correlations (one-tailed) for both the internalizing broad and syndrome scales. For component VIII (mixed emotions), positive correlations were again found with the syndrome scales Somatic Complaints (r = 0.177, p < 0.05) and Anxious/Depressed (r = 0.203, p < 0.05) and with the broad scale Internalizing Problems (r = 0.194, p < 0.05). In contrast, component V (reminder) was negatively correlated with the syndrome scale Withdrawn (r = −0.266, p < 0.01). None of these associations could be explained by extreme values. No further correlations between the remaining components and CBCL syndrome or broad scales were found (all rs < 0.132, all ps ≥ 0.063).

## DISCUSSION

The aim of this study was to investigate emotion comprehension in 7- to 10-year-old children from the general population. Moreover, interindividual differences in the association between emotion comprehension and internalizing and externalizing symptoms were investigated. Based on the SIP model (Crick and Dodge, 1994), we hypothesized a negative association between emotion comprehension in the TEC and both externalizing and internalizing symptoms.

Regarding performance in the emotion comprehension tasks, 7-year-olds scored, as expected, significantly lower than 9-year-olds and 10-year-olds. Also, 10-year-olds significantly outperformed the group of 8-year-olds. On the component level, components I (recognition of facial expressions) and II (external causes of emotions) showed a ceiling effect, with a rate of 100% correct answers by all children within the four age groups. This is not a surprising result, since even toddlers understand facial expressions of their caregiver during social referencing (Sorce et al., 1985), and 3- to 4-year-old children make precise distinctions between facial expressions of basic emotions and understand situations as their elicitors.

TABLE 1 | Distribution (frequency and percentage) of correct answers to each TEC component across age groups and total sample.


TEC, Test of Emotion Comprehension; n, sample size; M, mean; SD, standard deviation; I, recognition; II, external cause; III, desire-based; IV, belief-based; V, reminder; VI, regulation; VII, hiding; VIII, mixed emotions; IX, morality; K–W, χ<sup>2</sup>(3) of Kruskal–Wallis test, two-tailed; ∗p < 0.05, ∗∗p < 0.01.

TABLE 2 | Explained variance in total TEC score of the control variables age, bilingual upbringing, paternal working hours, and either Somatic Complaints (Model 1) or Anxious/Depressed (Model 2).


Multiple regression with total TEC score as the dependent variable and age, bilingual upbringing (0–1 dummy coded), paternal working hours (0–1 dummy coded, with full-time employment as reference group), and mean T-value of either Somatic Complaints (Model 1) or Anxious/Depressed (Model 2) as predictors. ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001.

Looking at the components of the mental and reflective phases, an increase in correct responses across the age groups is evident. These results are in line with the assumption of a maturation of emotion comprehension throughout pre-school years (e.g., Pons et al., 2004; Janke, 2008). Component IX (asking for emotions elicited by an action that satisfies desires but leads to a moral conflict) was the most difficult component and was answered correctly by only 47.4% of the total sample. This replicates the finding by Janke (2008) that German children less often ascribe the correct negative emotion "sad" to a moral conflict than British children do. This may be explained by differences in the development of social norms or in parental rules between both countries, or by changes in the meaning of the emotional expressions caused by the translation of the TEC. Lagattuta (2005) investigated emotion comprehension in ambiguous situations in 4- to 7-year-olds and adults. In her study, participants were asked to describe the emotion elicited by scenarios in which the protagonists break rules to fulfill a specific desire. Older children and adults more often predicted mixed emotions in addition to negative ones as a result of a transgression. Therefore, ascribing the single basic emotion "sad" as the right answer to morally wrong behavior might not fully cover the content of the elicited emotional experience. Instead of using forced-choice answers, it would be interesting to let children with German as their mother tongue freely describe the elicited emotions.

In this study, we were not able to replicate the abovementioned association between language abilities and emotion comprehension. One explanation might be the chosen age range. In the study by Pons et al. (2003), the groups of 8- to 9-year-olds and 10- to 11-year-olds did not differ significantly in their language abilities either. Therefore, it is likely that differences in language development at this age are not strong enough to be detected by the language test used in this study.

Despite the missing association between general receptive language skills and emotion comprehension, children with a bilingual background showed better overall emotion comprehension than monolingual children. The second languages of the bilingual children in our sample were French, Polish, Russian, English, Italian, or Montenegrin. Therefore, their advantage cannot be explained by a simple difference between the cultures of the first and second languages. In research on social cognition, Goetz (2003) conducted a study with 3- and 4-year-olds being raised either monolingually or bilingually in English and/or Mandarin Chinese, and reported an advantage for bilingual children in Theory of Mind tasks. The author's explanation is that children raised bilingually learn to refer to the same concept from different language perspectives, which enhances their understanding that different systems exist at all and also trains their ability for metarepresentation. The explanations presented by Goetz (2003) might also apply to the association between bilingualism and emotion comprehension, since the higher components of emotion comprehension from the mental and reflective phases in particular require sophisticated abilities in perspective taking. Particularly with regard to the observed cultural deviations in the ranking order of the TEC components (see e.g., Roazzi et al., 2009), it would be interesting to compare emotion comprehension not only in mono- vs. bilingual samples, but also in subgroups of bilingual children from different cultural backgrounds.

Paternal working hours showed a low-to-medium negative correlation with the total TEC score. Children with fathers working full-time thus showed a significantly lower level of emotion comprehension than children of fathers with either half-time or no employment. It is important to note that we did not individually assess why fathers were unemployed or working part-time. Therefore, we cannot conclude whether part-time or no employment necessarily represents a critical living situation regarding socioeconomic status. Ongoing studies or paternal leave could be reasons for reduced working hours, which is not equivalent to a critical living situation with one partner being long-term unemployed. Given the importance of relationships among family members for the development of emotion comprehension (e.g., Colle and Del Giudice, 2011) and results from the attachment literature on the unique contribution of the quality of child-father attachment to children's social development and adjustment (e.g., Grossmann et al., 2002), one possible explanation for this association could be that children have both a more regular, vivid exchange and a closer relationship with fathers who do not work full-time. Lagattuta and Wellman (2002) found that conversations with the father play an important role in the development of mental state understanding. Since higher levels of emotion comprehension require sophisticated skills in perspective taking, a more regular exchange with the father might positively influence higher levels of social cognition and thereby also influence emotion understanding. Future research needs to investigate the paternal role in emotion comprehension beyond the pre-school years to test this assumption.

The main goal of this study was to investigate the relation between internalizing and externalizing symptoms and emotion comprehension. As the ceiling effect of the TEC components I and II revealed, none of the children in our sample had difficulties labeling emotions or understanding external events as their causes. This is in line with findings by Trentacosta and Fine (2010), who reported no significant associations between discrete emotion comprehension and externalizing or internalizing symptoms for the age group 6–11 years. Notably, the reported negative associations between understanding of discrete emotions and behavior problems in older children originated mostly from samples categorized by clinical diagnoses or by placement status (e.g., in a detention center).

For externalizing symptoms, no significant correlations were found with the total TEC score or with individual, more complex components, neither for the overall Externalizing Problems scale nor at the level of the two syndrome scales Rule-Breaking and Aggressive Behavior. Therefore, the hypothesis of an altered comprehension of others' emotions in children showing externalizing behavior could not be confirmed. This is in line with findings by Belacchi and Farina (2010) that pre-schoolers classified as members of hostile, aggressive social groups, such as bullies and their reinforcers and assistants, overall show no correlation with the TEC components, in contrast to children attributed prosocial roles.

For internalizing symptoms, the initial hypothesis could only be confirmed in terms of a relatively small negative correlation between the syndrome scale Withdrawn and component V (reminder). Hence, children showing higher levels of withdrawn behavior were less often able to name the right emotion elicited by specific reminders. This result is in line with the abovementioned results of longitudinal and cross-sectional studies investigating the relation between emotion comprehension and internalizing symptoms (e.g., Fine et al., 2003). However, further research is needed to confirm the component specificity of this exploratory finding. Contrary to our expectation, we found a small positive correlation between the overall score of emotion comprehension and the syndrome scales Somatic Complaints and Anxious/Depressed, explaining 2.37 and 3.84% of variance, respectively. Therefore, children with reported higher levels of these internalizing symptoms showed a significantly better understanding of others' emotions. In general, children with these symptoms more often report low self-esteem, insecurity, and fear of exclusion and devaluation in social contexts (Luebbe et al., 2010). Further, brooding and worrying about potential or actual experiences and emotions is a common symptom among anxious-depressed children and youth (Verstraeten et al., 2011). Because of their fear of devaluation, children with internalizing symptoms might therefore think more frequently about others' minds and motives and thereby actually train their understanding of others' emotions. An interesting phenomenon supporting this assumption is that of co-rumination among children and adolescents with internalizing problems. Co-rumination is defined as the excessive discussion of personal problems within a dyadic relationship (Rose, 2002). According to Rose (2002), the repeated discussion of especially problematic experiences and negative emotions does on the one hand strengthen the specific friendship, but on the other hand also contributes to the stability of internalizing symptoms (see also Rose et al., 2007). Given the reported important role of internal state and emotional talk for understanding others' minds and emotions (e.g., Pons et al., 2003), the mutual encouragement in co-ruminating dyads to focus on and analyze emotions might, in contrast to the situation of withdrawn children, further train their perspective taking skills and emotion comprehension.

With regard to the diverging effects for the internalizing subgroups, Bell et al. (2009) reported individual differences in information processing patterns among children with internalizing symptoms. While children with both depressive and anxious symptoms showed a negative style in processing social information, children with depressive symptoms additionally made positive attributions to social situations less often than anxious children. While the assessment of emotion comprehension with the TEC focuses on the understanding of others' emotional states in a specific situation, studies focusing on SIP often present scenarios in which the participant is asked to evaluate the social behavior and intentions of the protagonist beyond merely labeling a specific emotion. Therefore, one explanation for these divergent results could be that children with elevated levels of anxious/depressed symptoms can generally understand others' emotions, but struggle with the evaluation of social scenarios, others' intentions and social behavior.

Further, studies reporting a negative association between higher order emotion comprehension and anxious/depressed symptoms mainly stem from clinical samples (e.g., Meerum Terwogt, 1990; Bender et al., 2015). The divergent direction of associations in this study suggests that, while children from the general population with elevated but not clinically relevant levels of symptoms can master the different components of emotion comprehension, and even show better performance, clinically diagnosed children might have problems in emotion comprehension due to the severity of their symptoms.

The assumptions about the relation between the TEC components and internalizing and externalizing syndromes in community and clinical samples clearly need further investigation. More research is required especially for children with externalizing symptoms.

Further, it would be interesting to investigate the nature behind the differences found for the three internalizing syndrome scales, and which individual associations occur between higher levels of emotion comprehension and somatic complaints, depression and different forms of anxiety with or without social withdrawal.

One limitation of this study is that we assessed children's behavior and emotional problems by parent-report only. It has been critically noted that parents and children show discrepancies in reporting on externalizing and internalizing symptoms: While for externalizing symptoms children tend to report fewer problems than parents due to a lack of awareness, the reverse holds for internalizing symptoms (Seiffge-Krenke and Kollmar, 1998). However, experts agree that around the age of eight children can reliably report on their mental health (Riley, 2004). Therefore, it might be more reliable to assess internalizing symptoms in particular directly by children's reports or from multiple sources. Another limitation of this study is the lack of diversity in educational background. 69.7% of the fathers and 58.2% of the mothers had a high school or university degree, which is above the German average. Therefore, generalization of these results is limited. We conducted regression analyses to investigate how much variance of emotion comprehension is explained by the assessed variables. We want to emphasize that due to the cross-sectional design of our study, a causal interpretation of the observed associations is not possible.

Finally, gender was unbalanced in our sample, with 72 girls and 63 boys. Even though the difference in the number of participants between the two genders was rather small compared to the full sample, this should be mentioned concerning the generalizability of our results.

In sum, the results of this study support the assumption of the development of emotion comprehension from a general, external to a deeper, complex understanding (e.g., Pons et al., 2004). In concordance with Janke (2008), the ninth TEC component addressing emotion after transgression of a moral rule turned out to be the most difficult one. Whether this is caused by cultural differences or due to a methodological artifact needs further investigation.

With regard to interindividual differences in this development, we found that emotion comprehension develops with increasing age, and benefits from bilingual upbringing. Moreover, emotion comprehension seems to be related to paternal working hours in such a way that children showed slightly worse abilities in understanding others' emotions with increasing working hours of their fathers.

To our knowledge, this is the first study simultaneously testing the impact of both externalizing and internalizing symptoms on emotion understanding operationalized by the TEC. The assumption of a negative association of externalizing and internalizing symptoms with emotion comprehension due to an altered SIP could only be partially confirmed. While no relationship was found between externalizing symptoms and emotion comprehension, we found different response patterns in the TEC in children with reported anxious/depressed symptoms and social withdrawal, implying an individual style of SIP. These findings need further investigation in samples both from the general population and from children clinically diagnosed with internalizing problems.

# ETHICS STATEMENT

The present study was approved by the local ethics committee of Saarland University (Faculty 5, leader: Prof. Dr. König). Informed written consent was required for all participants prior to the onset of the study. All parents were informed in detail about the study procedure and that participation was voluntary and could be stopped at any point of the experiment. The present study conducted research with school-aged children. However, the present study only included behavioral measurements and questionnaires without any risk of harm for our participants. No drugs, potentially dangerous setups or other risky procedures were applied.

#### AUTHOR CONTRIBUTIONS

AG is the first author of the manuscript. She conducted the experiment of the present study and ran the main analyses of the collected data. AH was responsible for the main supervision during the conduct of the experiment. She helped with the manuscript by giving comments on the report of statistical results and extensive proofreading. Furthermore, she was mainly involved in formulating the research question and hypotheses. CM gave extensive feedback on the drafts of the manuscript. Her main contribution was in the writing process of the introduction and the discussion of the manuscript. Moreover, she gave extensive feedback on the statistical analyses, the report of statistical results and on formal requirements for the submission. Moreover, she was responsible for proofreading the manuscript. GA gave extensive feedback during both the conduct of the experiment and the writing of the manuscript. She extensively discussed the findings and wrote parts of the manuscript. Moreover, GA was involved in the proofreading process and the final feedback for the manuscript. Overall, all authors extensively contributed to the present submission and were involved in the writing and feedback process at any time.

#### ACKNOWLEDGMENTS

We thank Christoph Kowalski for his help with data collection and all parents and children who participated in this study. We also thank Bettina Janke for helpful comments regarding the testing procedure and coding of the German version of the Test of Emotion Comprehension.

#### REFERENCES

and aggression. J. Child Psychol. Psychiatry 43, 901–916. doi: 10.1111/1469-7610.00139

Erwachsenenalter. Z. Kl. Psych. Psychoth. 29, 263–275. doi: 10.1026//0084-5345.29.4.263


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer AP and the handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Göbel, Henning, Möller and Aschersleben. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sincere, Deceitful, and Ironic Communicative Acts and the Role of the Theory of Mind in Childhood

Francesca M. Bosco<sup>1,2</sup> and Ilaria Gabbatore<sup>1,3</sup>\*

<sup>1</sup> Department of Psychology, Center for Cognitive Science, University of Turin, Turin, Italy, <sup>2</sup> Neuroscience Institute of Turin, University of Turin, Turin, Italy, <sup>3</sup> Faculty of Humanities, Research Unit of Logopedics, Child Language Research Center, University of Oulu, Oulu, Finland

The aim of the study is to investigate the relationship among age, first- and second-order Theory of Mind and the increasing ability of children to understand and produce different kinds of communicative acts – sincere, ironic, and deceitful communicative acts – expressed through linguistic and extralinguistic expressive means. To communicate means to modify an interlocutor's mental states (Grice, 1989), and pragmatics studies the inferential processes that are necessary to fill the gap, which often exists in human communication, between the literal meaning of a speaker's utterance and what the speaker intends to communicate to the interlocutor. We administered brief video-clip stories showing different kinds of pragmatic phenomena – sincere, ironic, and deceitful communicative acts - and first- and second-order ToM tasks, to 120 children, ranging in age from 3 to 8 years. The results showed the existence of a trend of difficulty in children's ability to deal with both linguistic and extralinguistic pragmatic tasks, from the simplest to the most difficult: sincere, deceitful, and ironic communicative acts. A hierarchical regression analysis indicated that age plays a significant role in explaining children's performance on each pragmatic task. Furthermore, the hierarchical regression analysis revealed that first-order ToM has a causal role in explaining children's performance in handling sincere and deceitful speech acts, but not irony. We did not detect any specific role for second-order ToM. Finally, ToM only partially explains the observed increasing trend of difficulty in children's pragmatic performance: the variance in pragmatic performance explained by ToM increases between sincere and deceitful communicative acts, but not between deceit and irony. The role of inferential ability in explaining the improvement in children's performance across the pragmatic tasks investigated is discussed.

Keywords: pragmatics, development, theory of mind, deceit, irony, direct and indirect speech acts

#### Edited by:

Anne Henning, SRH Hochschule für Gesundheit Gera, Germany

#### Reviewed by:

Cecilia Ines Calero, Unidad de Neurobiología Aplicada (CEMIC) and Torcuato Di Tella University, Argentina

Cristina Colonnesi, University of Amsterdam, Netherlands

#### \*Correspondence:

Ilaria Gabbatore ilaria.gabbatore@oulu.fi; ilariagabbatore@gmail.com

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 01 August 2016; Accepted: 04 January 2017; Published: 30 January 2017

#### Citation:

Bosco FM and Gabbatore I (2017) Sincere, Deceitful, and Ironic Communicative Acts and the Role of the Theory of Mind in Childhood. Front. Psychol. 8:21. doi: 10.3389/fpsyg.2017.00021

# INTRODUCTION

Pragmatic ability refers to the use of language (Levinson, 1983) and other expressive means, such as non-verbal/extralinguistic means, i.e., gestures and body movements (Bara, 2010), to convey a specific meaning in a given context. Interesting examples of such ability are indirect speech acts, meaning acts through which the speaker communicates more than is literally said to the listener (Searle, 1975); deceitful communicative acts, meaning intentional attempts to manipulate the listener's mental state in order to induce them to believe something untrue (Perner, 1991); and irony, meaning communicative acts expressing the opposite of what is meant by the speaker (Grice, 1989).

Evidence in the literature shows that pragmatic ability correlates with Theory of Mind (ToM; Premack and Woodruff, 1978), i.e., the human ability to attribute mental states to oneself and to other individuals. Evidence also shows that this ability increases during childhood (Wellman and Liu, 2004) and adolescence (Bosco et al., 2014, 2016; Brizio et al., 2015). Similarly, the increasing ability of children to manage these pragmatic phenomena as they grow older has been well documented in the literature. At around one year of age children start to understand and use direct speech acts, meaning utterances that express literally and exactly what the speaker intends to say (Searle, 1975), in order to communicate with another person (Garvey, 1984). However, children also become able to handle indirect speech acts early on in their development. Reeder (1980), for example, showed that starting from 2;6 years of age, children understand equally well, in an adequate context, that direct utterances like 'I want you to do that' and indirect requests like 'Would you mind doing that?' have the same conventional communicative meaning (see also Bernicot and Legros, 1987). Bucciarelli et al. (2003) also showed that, starting from 2;6 years of age, children understand direct and conventional forms of indirect speech acts ("Would you mind?", "Do you know?", etc.) equally well, and that such ability increases with age (see Bosco and Bucciarelli, 2008).

Studies in the literature have shown that children's ability to deal with verbal deceit also increases with age. In particular, starting from three years of age (Lewis, 1993; Bussey, 1999), children start using lies, meaning false utterances proffered with the intention of avoiding a disagreeable consequence such as a punishment (Leekman, 1992). Talwar and Crossman (2011) argue, in their review of the literature, that a child's ability to lie could be considered normative, testifying to the child's social and cognitive development. The ability to handle lies of different complexity evolves during the pre-school to school period: as they grow up, children become able to consider the speaker's intention and the impact of the social acceptability of lying and they start deceiving (for a review, see Talwar and Crossman, 2011). A deceitful communicative act is a speaker's intentional attempt to manipulate the listener's mental states in order to induce them to believe something untrue (Perner, 1991). Peskin (1996) claims that, in order to comprehend deceit, the speaker must take as shared with the listener something the speaker does not really believe. Peskin also claims that it is necessary to understand that the listener thus comes to hold a false belief, and observes that, starting from the age of 4, children fully comprehend the speaker's intention to deceive. The ability to deceive has thus been frequently explained on the basis of the ability to use a fully developed ToM (Chandler et al., 1989; Polak and Harris, 1999; Ma et al., 2015). Talwar et al. (2007, p. 804), for example, affirm that "Lying, in essence, is ToM in action"; to deceive consists of creating a false belief in the interlocutor's mind (Lee, 2000). In particular, Talwar et al. (2007) investigated the relation between first- and second-order ToM and children's ability to lie.
First-order ToM involves the comprehension of another person's belief about a certain state of the world, while second-order ToM involves the ability to infer what one person believes about another person's thoughts, meaning to understand nested mental states (Perner and Wimmer, 1985). Talwar et al. (2007) reported a correlation between 6- and 11-year-old children's second-order ToM and the ability to lie. Similarly, Cheung et al. (2015) found a correlation between 7- and 9-year-old children's second-order ToM and their ability to understand a liar's intention. The increasing ability of children to manage more complex forms of deceit has thus been explained on the basis of the children's development from first-order to second-order ToM. However, the exact role of (first- and second-order) ToM in explaining a growing child's ability to manage deceit is not yet completely clear (Talwar and Crossman, 2011). For example, some authors (Russell et al., 1995) claimed that not yet fully developed ToM abilities are not the best factor for explaining children's difficulty in managing complex deceit, and proposed that the executive demand (in terms of executive functions such as planning and shifting) that the comprehension of complex deceit requires is the best explanatory factor. Bosco and Bucciarelli (2008) also argued that ToM did not seem to be the best factor for explaining children's ability to manage deceitful speech acts of increasing complexity.

Focusing now on irony, in its easiest form, this typically takes place when an utterance expresses the opposite of what the speaker means (Grice, 1989). In particular, irony involves a discrepancy between the literal meaning and the speaker's communicative intent (Mey, 2001). Children usually start to develop the ability to recognize ironic speech acts between five and six years of age (Dews et al., 1996; Harris and Pexman, 2003; Filippova and Astington, 2010), although younger children may sometimes also understand irony (Loukusa and Leinonen, 2008; Angeleri and Airenti, 2014) and such ability improves over time (Demorest et al., 1984; Dews et al., 1996; Dews and Winner, 1997; Bosco and Bucciarelli, 2008; Filippova and Astington, 2008). Loukusa and Leinonen (2008), for example, found a significant difference between 6- and 7-year-olds in their ability to provide a correct explanation in a comprehension task on a simple ironic utterance, and Bosco and Bucciarelli (2008) reported that children of 6, 8, and 9 years of age found it easier to comprehend simple forms of irony, that is, utterances directly in contrast with the background knowledge, than complex ones, involving utterances implying knowledge that is in contrast with the background scenario.

According to Winner (1997), in order to interpret an ironic utterance correctly, the child must have the ability to detect incongruity or falsehood, to avoid mistaking irony for error, to understand another person's beliefs, and to avoid interpreting irony as deception. In line with this theoretical proposal, Nilsen et al. (2011) showed that second-order ToM is correlated with children's comprehension of verbal irony. Specifically, they pointed out that adults and older children aged between 8 and 10, but not younger children aged 6 to 7 years, were able to recognize that listeners require contextual knowledge to comprehend irony.

The studies in the literature have mainly focused on one single pragmatic phenomenon (and its possible relation with ToM) at a time, and only a few have undertaken a direct comparison of the different phenomena (Bosco and Bucciarelli, 2008; Bucciarelli et al., 2003). In particular, Bosco et al. (2012a, 2013) provided a broad assessment of the abilities of children ranging in age from 5 to 8;6 years to comprehend and produce direct and indirect speech acts (which the authors define as standard communication acts), and deceitful and ironic communication acts, using both linguistic and nonverbal/extralinguistic means of expression, such as gestures and body posture. The authors reported that the ability to perform all the pragmatic tasks investigated increases with age in children aged between 5 and 8;6 years, and their ability to deal with standard communication acts (direct and indirect), and deceitful and ironic speech acts also improves. In line with Bucciarelli et al. (2003), the authors explained the existence of such an increasing trend of difficulty on the basis of the Cognitive Pragmatic theory (Bara, 2010) and the increasing complexity of the inferential processes involved in the various pragmatic tasks investigated. The ability to infer refers to the cognitive capacity necessary to fill the gap, which often exists, between the literal meaning of an utterance and what the speaker actually means (Searle, 1975). According to the Cognitive Pragmatic theory, in expressing a sincere communicative act (direct and indirect communicative acts), the actor says something that is in line with his/her private beliefs. In terms of the inferential processes involved, the comprehension or production of a sincere communicative act merely requires the partner to refer to the background knowledge shared between the interlocutors. By contrast, the comprehension and production of deceitful and ironic communicative acts require more complex inferential processes.
In particular, in deceit, what the speaker says is in conflict with his private knowledge, but it does not contradict the knowledge given as shared with the partner. In a case of deceit, the partner has to recognize the difference between what is expressed and what the speaker privately entertains. In irony, the actor's communicative intention is again in conflict with his private knowledge, as in the previous case, but it also contradicts the knowledge given as shared with the partner. This makes an ironic communicative act more difficult to entertain than a deceitful one (for a detailed description, see Bucciarelli et al., 2003; Bosco and Bucciarelli, 2008; Bara, 2010).

However, a possible different explanation for the increasing trend of difficulty in the comprehension and production of the pragmatic tasks described above implies a role for ToM, and in particular it states that ToM could play a greater role in deceitful communicative acts (Flanagan, 1992; Sodian and Frith, 1992) and ironic communicative acts (Happé, 1993) as compared to standard (direct and indirect) ones. Winner and Leekman (1991) assume that it is more difficult to understand irony than deceit because the former requires second-order ToM, whereas the latter only requires first-order ToM. In particular, Sullivan et al. (1995) found that starting from 7 years of age, children can distinguish lies from jokes, and they attribute this to the acquired ability to attribute second-order mental states.

To the best of our knowledge, no previous studies have empirically investigated such a hypothesis by assessing, in the same sample of participants, the possible role of first- and second-order ToM in explaining children's increasing ability to comprehend and produce sincere (direct and indirect), deceitful, and ironic communicative acts. For this reason, the aim of the study was to investigate the increasing ability of children to manage different kinds of pragmatic phenomena, i.e., direct and indirect, deceitful, and ironic speech acts, and the possible role of first- and second-order ToM in explaining such performance. In detail, we wished to replicate the findings of Bosco et al. (2013, see also Bucciarelli et al., 2003; Bosco and Bucciarelli, 2008), and expected: (i) to find children's performance on all the investigated tasks to improve with age; (ii) to find an increasing trend of difficulty in the comprehension and production of the investigated pragmatic phenomena, namely sincere communicative acts (direct and indirect), and deceitful and ironic communicative acts, in both the linguistic and non-verbal/extralinguistic modalities, including the use of gestures and body movements. In particular, the novelty of the present study was (iii) to explore the causal role of ToM (first- and second-order) in explaining such an improvement in their performance, in both the linguistic and non-verbal/extralinguistic modalities, within each investigated phenomenon. Moreover, (iv) we investigated the possible role of ToM in explaining the increasing trend of difficulty we expected to find across the various pragmatic phenomena investigated.

#### MATERIALS AND METHODS

#### Participants

The sample consisted of 120 Italian children (60 males and 60 females) ranging in age from 3 to 8 years. In order to compare the subsamples' performance in a more reliable way, the sample was organized in 4 age groups so that there was a one-year difference between one age group and the next: Group A (3 years; 6 months – 4 years) (M = 3;10; SD = 0;2); Group B (5–5;6) (M = 5;3; SD = 0;2); Group C (6;6–7) (M = 6;10; SD = 0;2); Group D (8–8;6) (M = 8;3; SD = 0;3). Each age group was composed of 30 children and was balanced for gender, including an equal number of males and females.

The children were recruited from four different schools in the Piedmont area (Italy). Research assistants visited the schools before data collection commenced, and provided the teachers with details about the study. A letter containing all the details about the research was sent to the children's families, together with an informed consent form, which the parents were required to complete. Only children whose parents gave their consent were included in the sample.

#### Material

The experimental protocol consisted of a selection of 48 items taken from the linguistic and extralinguistic scales of the ABaCo (Sacco et al., 2008; Angeleri et al., 2012; Bosco et al., 2012a), a validated assessment tool to evaluate pragmatic abilities in typical (Bosco et al., 2013) and atypical (Angeleri et al., 2016) development. Examples of ABaCo items are provided in Appendix A.

For each expressive modality (i.e., linguistic and extralinguistic), the experimental task contained the same number and structure of items, and assessed the same type of pragmatic phenomena (half in comprehension and half in production):


Each item consists of a video lasting 20–25 s, comprising a controlled number of words (range: 7 ± 2), and representing a communicative interaction between two people. The linguistic items investigate pragmatic phenomena expressed primarily through linguistic means, while the extralinguistic items are composed of communicative acts expressed through gestures (for a detailed description of the items, see Bosco et al., 2013).

In comprehension tasks, participants observed an interaction between two actors, and they were required to understand what was communicated (e.g., In your opinion, what did the girl want to say to the boy?). In production tasks, participants observed only the initial part of an interaction, and they were asked to produce a communicative act appropriate with respect to the proposed communicative situation (e.g., The child doesn't want to be discovered. What could he say?).

For each pragmatic task, it was possible to obtain a score of "0" when the answer was considered incorrect and "1" when the answer was considered correct. More details concerning scoring criteria are reported in Appendix A (see also Sacco et al., 2008; Bosco et al., 2013). Inter-rater reliability was calculated using Cohen's Kappa on the scores assigned to 40 randomly selected children (about 33% of the total sample): K was 0.67 (p < 0.001) 95% CI (0.653, 0.696), indicating substantial agreement (Landis and Koch, 1977).
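The agreement statistic mentioned above can be illustrated with a minimal sketch. The function name `cohens_kappa` and the two score lists are hypothetical and are not data from the study; the sketch only shows how observed and chance agreement combine into Cohen's Kappa for binary (0/1) scores.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same score.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical 0/1 scores from two raters on ten answers (illustration only).
rater_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater_2 = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1]
print(round(cohens_kappa(rater_1, rater_2), 3))
```

In this made-up example the raters agree on 8 of 10 answers, but because half of that agreement is expected by chance, the resulting Kappa is noticeably lower than the raw agreement rate.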

In addition to the pragmatic tasks, a number of ToM tests were administered to the children. See Appendix B for a detailed description of the items.

#### Sally and Ann Task

In this task (Baron-Cohen et al., 1985), the child is required to observe a scene acted out by two paper dolls, Sally and Ann: Sally places her ball in the basket and leaves the scene. Ann moves the ball from the basket to the box. Then the child is required to reply to a test question (When Sally comes back, where does she think the ball is?) and a justification question (Why does Sally think the ball is there?). A score of 1 is gained when both the test and the justification questions are answered correctly.

#### Modified Smarties Task

This is a revised version of the original task developed by Perner et al. (1987). Because nowadays many children are no longer familiar with the famous candy brand, we introduced a packet of a currently famous brand of potato chips as the target object. During the task, the experimenter shows the packet of chips to every child and asks: What is in there? Then the experimenter opens the packet, showing that it contains pencils rather than the expected chips. The next question is: What will someone else, who has not seen what the packet contains, think is in there, before it is opened? A score of 1 is obtained when the child replies "chips" and a score of 0 is attributed to any other kind of answer.

#### John and Mary and Maxi Stories

These tasks (Sullivan et al., 1995) are a modified version of those used by Wimmer and Perner (1983) and Perner and Wimmer (1985) respectively, and they are told using cardboard puppets in order to reduce the memory load. The two stories assess second-order ToM, and they have an identical structure but different characters and settings. In the Maxi story, for example, the scenario is the following: Maxi and Bobby are in their kitchen when their mother brings in some chocolate. Maxi would like to have some chocolate and his mom tells him he can have some after walking the dog. Unbeknownst to Maxi but not to Bobby, their mother takes the chocolate to the neighbor's place. Unbeknownst to Bobby, Maxi discovers that their mom has taken the chocolate to the neighbor's place. Bobby then goes to look for Maxi in the yard. His mother tells him that Maxi has gone to get some chocolate. The task consists of a 'second-order ignorance question' (i.e., Does Bobby know that Maxi knows where the chocolate is?) and a 'second-order belief question' (i.e., Where does Bobby go to look for Maxi?). Along with the story-telling, a number of factual questions (e.g., Why is Maxi in the yard?) and first-order ToM questions (e.g., Does Maxi know where the chocolate is now?) were used to help the children to follow the storyline, but they were not taken into account in the scoring procedure. The children's answers could be scored 1 (correct) or 0 (incorrect) for both second-order ignorance and belief questions. The mean of the scores obtained on the two test questions was used in the analyses.

#### Picture Sequencing Task

Within the present study, only part of the original task (Langdon and Coltheart, 1999; Porter et al., 2008) was administered: the tasks used comprise six stories, including two social scripts (more than one person interacting in everyday social routines) and four false-belief sequences (a person, unaware of an event in a story, acts on a false belief). Internal consistency among these items was calculated (Cronbach's alpha = 0.77). Each story was depicted in a set of four black-and-white picture cards. Two practice runs were used to allow the child to become familiar with the procedure of the task, and these were not considered in the scoring procedure. The set of cards for each story was placed face down in front of the child, and the child was required to arrange the cards in the correct order to tell the story according to the logical sequence of events, as in a comic strip. Scores for each sequence ranged from 0 to 6: a sequence scored 2 points if the first card was in the correct position, 2 points if the last card was in the correct position, and 1 point for each of the second and third cards being in the correct position. Failure to produce a sequence was scored as 0.
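The position-based scoring rule for a single four-card story can be sketched as follows. This is only an illustration of the rule as described in the text; the function name `score_sequence` and the card labels are hypothetical.

```python
def score_sequence(child_order, correct_order):
    """Score one four-card story.

    2 points each for the first and last card in the correct position,
    1 point each for the second and third card, 0 if no sequence was produced.
    """
    if child_order is None:  # the child did not produce a sequence
        return 0
    if len(child_order) != 4 or len(correct_order) != 4:
        raise ValueError("each story uses exactly four cards")
    points_by_position = (2, 1, 1, 2)
    return sum(
        points
        for points, placed, expected in zip(points_by_position, child_order, correct_order)
        if placed == expected
    )

# Hypothetical card labels (illustration only).
correct = ["A", "B", "C", "D"]
print(score_sequence(["A", "B", "C", "D"], correct))  # 6: all cards correct
print(score_sequence(["A", "C", "B", "D"], correct))  # 4: only first and last correct
print(score_sequence(None, correct))                  # 0: no sequence produced
```

Weighting the first and last cards more heavily rewards a child who identifies how the story starts and ends even when the middle cards are swapped.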

Inter-rater reliability was also calculated, using Cohen's Kappa, for the ToM task scores attributed by two raters in about 33% of the total sample: Sally and Ann task, Modified Smarties task, John and Mary and Maxi Stories task. It was not calculated on the Picture Sequencing task scores, because the scoring procedures for this test only involve comparing the order of the sequences provided by the child with the correct ones provided by the test instructions. Since no different interpretation is possible, we did not consider it necessary to have a second rater for such scores. K ranged from 0.76 to 1 (p < 0.001), indicating substantial to almost perfect agreement (Landis and Koch, 1977).

#### Procedure

The experimenters visited the schools before the beginning of the study, in order to familiarize themselves with the children. The children completed the experimental tasks in a single individual session, lasting approximately 50 min and held in a quiet room at the school. The video-taped stories were shown to the children one at a time, using a portable computer, and each session was video-recorded, to allow offline coding procedures. The tasks were presented in two different random orders, A and B; the participants in each group were balanced for age and gender, and were assigned to order A or B of the protocol in a balanced way. The ToM tests were also balanced, so that they were presented to half of the participants before the presentation of the pragmatic protocol and to half of the participants after the presentation of the pragmatic protocol. Moreover, the ToM tasks were presented in two different orders (first-order tasks followed by second-order tasks and vice versa). When performing the analyses, first- and second-order ToM scores were considered separately. In particular, the first-order ToM value was obtained by averaging the scores gained from the Sally and Ann, Smarties, and Picture Sequencing tasks. Likewise, the second-order ToM value was obtained by averaging the scores obtained from the John and Mary and Maxi tasks.

#### Data Analysis

The distribution of scores for each kind of task was not normal in most age groups. In particular, the Kolmogorov–Smirnov test showed the distribution to be normal only in a few cases, namely extralinguistic deceit in group B and extralinguistic irony in both groups C and D, while data were not normally distributed in any of the other cases: linguistic sincere acts 0.001 < p < 0.011; linguistic deceit 0.001 < p < 0.042; linguistic irony 0.001 < p < 0.027; extralinguistic sincere acts p < 0.001; extralinguistic deceit 0.001 < p < 0.200; extralinguistic irony 0.001 < p < 0.181. We also performed a Shapiro–Wilk test, which confirmed the previous results: the distribution of scores was normal in only a few cases, namely linguistic irony in group D, extralinguistic deceit in group B, and extralinguistic irony in groups C and D; data were, instead, not normally distributed in any of the other cases: linguistic sincere acts 0.001 < p < 0.022; linguistic deceit 0.001 < p < 0.010; linguistic irony 0.001 < p < 0.164; extralinguistic sincere acts 0.002 < p < 0.010; extralinguistic deceit 0.001 < p < 0.178; extralinguistic irony 0.001 < p < 0.214. We thus conducted an arcsine transformation on the children's answers in each pragmatic task (linguistic and extralinguistic sincere communicative acts, deceit, and irony) and each ToM task (first- and second-order). We were thus able to perform parametric analyses while satisfying the required assumptions.
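The text does not state which form of the arcsine transformation was applied. A common choice for proportion-correct scores is the arcsine square-root transformation, sketched below as an assumption; the function name `arcsine_transform` and the example proportions are hypothetical.

```python
import math

def arcsine_transform(proportion):
    """Arcsine square-root transform of a proportion in [0, 1] (result in radians)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))

# Hypothetical proportions of correct answers (illustration only).
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, round(arcsine_transform(p), 3))
```

The transform stretches values near 0 and 1 more than values near 0.5, which tends to stabilize the variance of bounded proportion scores before parametric tests are run.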

To investigate children's performance in managing different kinds of pragmatic tasks, we conducted a multivariate analysis of variance (MANOVA), with age as between-subject factor (type of age group: Group A 3;6–4, Group B 5–5;6, Group C 6;6–7, and Group D 8–8;6) and performance at sincere communicative act, deceit, and irony as dependent variables on both the linguistic and the extralinguistic scales. Analogously, children's ability to manage different kinds of ToM tasks was investigated by conducting a MANOVA with age as between-subject factor (type of age group: Group A 3;6–4, Group B 5–5;6, Group C 6;6–7, and Group D 8–8;6) and performance at first- and second-order ToM tasks as dependent variables.

Moreover, in order to investigate the effect of performance on the different pragmatic tasks within each age group (type of pragmatic phenomena: sincere, deceit, irony), we performed separate ANOVA analyses, for both linguistic and extralinguistic tasks.

In order to investigate the correlation between pragmatic and ToM ability, we calculated the partial correlation (Pearson's r, controlling for age) between children's performance on pragmatic and ToM tasks in the overall sample.
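A first-order partial correlation can be computed from the three pairwise Pearson coefficients. The sketch below assumes a single control variable (age); the function names and the scores are hypothetical and do not come from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3 values)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def partial_r(x, y, control):
    """Correlation between x and y after removing the linear effect of control."""
    r_xy = pearson_r(x, y)
    r_xc = pearson_r(x, control)
    r_yc = pearson_r(y, control)
    return (r_xy - r_xc * r_yc) / math.sqrt((1 - r_xc ** 2) * (1 - r_yc ** 2))

# Hypothetical scores for six children (illustration only).
age_months = [46, 52, 63, 70, 82, 99]
pragmatic = [0.30, 0.45, 0.50, 0.70, 0.65, 0.90]
tom = [0.20, 0.50, 0.40, 0.80, 0.60, 0.90]
print(round(partial_r(pragmatic, tom, age_months), 3))
```

Controlling for age matters here because both pragmatic and ToM scores improve with age, so a raw correlation between them could reflect shared growth rather than a specific link.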

Lastly, in order to investigate the specific effect of age and of first- and second-order ToM in explaining children's pragmatic performance, we conducted a hierarchical regression analysis, including three steps: Age (step1), first-order ToM (step2) and second-order ToM (step3). Such variables were entered into the regression model as predictors to detect their impact on children's performance on the pragmatic tasks (i.e., sincere, deceit and irony). Statistically significant correlations were found between linguistic and extralinguistic performance on the different types of tasks: sincere (r = 0.29; p = 0.001), deceit (r = 0.74; p < 0.001), irony (r = 0.63; p < 0.001). For this reason, and since the trends in scores were the same for both modalities in all age groups, in this regression analysis we collapsed the scores obtained for the linguistic and extralinguistic tasks into a single type of pragmatic task score. Despite the differences implied in these pragmatic phenomena, collapsing them into a single score provides a more statistically robust measure of overall pragmatic ability (Cronbach's alpha = 0.93).
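The internal-consistency index reported for the collapsed score can be illustrated with a minimal sketch of Cronbach's alpha. The function name `cronbach_alpha` and the score matrix are hypothetical; each row stands for one child and each column for one sub-score.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of participants, each a list of item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    if any(len(row) != n_items for row in rows):
        raise ValueError("every participant must have a score on every item")
    # Sum of the sample variances of the single items.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of the total score across participants.
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical scores for five children on three sub-scores (illustration only).
scores = [
    [0.2, 0.3, 0.1],
    [0.5, 0.4, 0.4],
    [0.6, 0.7, 0.5],
    [0.8, 0.7, 0.9],
    [0.9, 1.0, 0.8],
]
print(round(cronbach_alpha(scores), 3))
```

Alpha approaches 1 when the sub-scores rise and fall together across children, which is the property that justifies summing them into one measure.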

#### RESULTS

The scores obtained by each age group on the pragmatic and ToM tasks are summarized in **Tables 1** and **2**.

In **Tables 3** and **4** the correlation coefficients among pragmatic tasks and ToM tasks, respectively, are provided.

TABLE 1 | Performance of each age group at the pragmatic tasks, mean (standard deviation).

TABLE 2 | Performance of each age group at the Theory of Mind (ToM) tasks.

On the linguistic scale, the MANOVA revealed a significant effect of age on pragmatic performance [F(9,348) = 8.97; p < 0.001; η<sup>2</sup> = 0.19]. Separate univariate ANOVAs on the outcome variables revealed a significant effect of age on deceits [F(3,116) = 36.77; p < 0.001; η<sup>2</sup> = 0.49] and ironies [F(3,116) = 12.09; p < 0.001; η<sup>2</sup> = 0.24] but a non-significant effect on sincere communicative acts [F(3,116) = 2.12; p = 0.10; η<sup>2</sup> = 0.05]. Post hoc pairwise comparisons (Bonferroni) between the performance of the A vs. B, B vs. C and C vs. D age groups at each pragmatic task highlighted the following results: no differences were detected among the groups at the sincere acts (p = 1.0); the groups performed significantly differently at the deceitful acts (p < 0.001), with the only exception being Group C vs. Group D, which showed no differences (p = 1.0); finally, at the ironic tasks, a significant difference was found between Group A and B (p = 0.003), while no differences were detected between the remaining groups (0.999 < p < 1.0).

In terms of the extralinguistic scale, the MANOVA revealed a significant effect of age on pragmatic performance [F(9,348) = 10.11; p < 0.001; η<sup>2</sup> = 0.21]. Separate univariate ANOVAs on the outcome variables revealed a significant effect of age on all the communicative acts investigated: sincere [F(3,116) = 20.51; p < 0.001; η<sup>2</sup> = 0.35], deceits [F(3,116) = 48.63; p < 0.001; η<sup>2</sup> = 0.56] and ironies [F(3,116) = 13.18; p < 0.001; η<sup>2</sup> = 0.25]. Post hoc pairwise comparisons (Bonferroni) between the performance of each age group at each pragmatic task highlighted the following results: the groups performed significantly differently at the sincere (0.003 < p < 0.016) and deceitful acts (p < 0.001), with the only exception being Group C vs. Group D, which showed no differences at either sincere (p = 1.0) or deceitful acts (p = 0.863); as for ironies, a significant difference was again found between Group A and B (p = 0.033), while no differences were detected between the remaining groups (0.235 < p < 1.0).
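As a reading aid, the effect sizes of the univariate ANOVAs reported above can be recovered from the F values and their degrees of freedom, since for a one-way design η<sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The sketch below assumes that the reported η<sup>2</sup> values follow this definition; the function name `eta_squared_from_f` is hypothetical, while the F and η<sup>2</sup> values are copied from the text.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta squared implied by an F ratio and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Degrees of freedom for 120 children in 4 age groups.
n_children, n_groups = 120, 4
df_effect = n_groups - 1          # 3
df_error = n_children - n_groups  # 116

# (F value, reported eta squared) as given in the text.
reported = {
    "linguistic deceit": (36.77, 0.49),
    "linguistic irony": (12.09, 0.24),
    "linguistic sincere": (2.12, 0.05),
    "extralinguistic sincere": (20.51, 0.35),
    "extralinguistic deceit": (48.63, 0.56),
    "extralinguistic irony": (13.18, 0.25),
}
for label, (f_value, reported_eta) in reported.items():
    recovered = eta_squared_from_f(f_value, df_effect, df_error)
    print(f"{label}: recovered {recovered:.2f}, reported {reported_eta:.2f}")
```

For each of the six tests, the recovered value rounds to the reported one, which also confirms that the degrees of freedom F(3,116) match a design with four groups of 30 children.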

The ANOVA analyses performed within each age group separately and concerning the linguistic tasks revealed an effect of the type of task in all age groups [20.09 < F(2,58) < 57.06; p < 0.001; 0.41 < η<sup>2</sup> < 0.66]. Moreover, introducing contrasts for each analysis, we detected a linear contrast, depending on the type of pragmatic task in each age group [25.39 < F(1,29) < 145.53; p < 0.001; 0.47 < η<sup>2</sup> < 0.83]. The same pattern of results was found concerning extralinguistic tasks: an effect of the type of task was detected in all age groups [14.21 < F(2,58) < 36.48; p < 0.001; 0.33 < η<sup>2</sup> < 0.56] and contrast analysis revealed a linear contrast, depending on the type of pragmatic task in each age group [24.60 < F(1,29) < 39.10; p < 0.001; 0.46 < η<sup>2</sup> < 0.61].

In terms of children's ability to manage different kinds of ToM tasks, the MANOVA revealed a significant effect of age on the children's performance [F(6,232) = 10.86; p < 0.001; η<sup>2</sup> = 0.22]. Separate univariate ANOVAs on the outcome variables revealed a significant effect of age on first-order ToM performance [F(3,116) = 27.60; p < 0.001; η<sup>2</sup> = 0.42] as well as on second-order ToM performance [F(3,116) = 7.37; p < 0.001; η<sup>2</sup> = 0.16]. Post hoc pairwise comparisons (Bonferroni) between the performance of each age group at first- and second-order ToM tasks revealed the following results: the groups performed significantly differently at the first-order ToM tasks (0.003 < p < 0.005), with the only exception being Group C vs. Group D, which showed no differences (p = 1.0); at the second-order tasks, no differences were found between the performance of the age groups (0.512 < p < 1.0).

Partial correlation coefficients between linguistic and extralinguistic pragmatic tasks (comprehension and production ability) and overall ToM ability (first- and second-order tasks) are reported in **Table 5**.

As shown in **Table 5**, we found a significant correlation between overall ToM tasks and all the pragmatic tasks investigated, in both the Linguistic and Extralinguistic scales, with the only exception of sincere linguistic communicative acts and Extralinguistic irony. The same result applies for first-order ToM tasks. By contrast, the only significant relation we detected for second-order ToM tasks was between linguistic deceit and second-order ToM.

**Table 6** displays the results of the multiple hierarchical regression analysis on the overall sample. In particular, it shows all the coefficients of the regression models as well as summary information about each model: adjusted regression coefficients (R<sup>2</sup> Adj.) for each predictor variable, the change in R<sup>2</sup> after the addition of first- and second-order ToM (R<sup>2</sup> Change), the change in F (F Change), and its significance value (Sig. F Change).

The regression analysis revealed that age (Step 1) explains 26% of the variance in children's performance on sincere tasks, 53% on deceitful tasks and 26% on ironic tasks. The model also including children's performance on first-order ToM tasks as a regressor (Step 2) only significantly improved the prediction for sincere (i.e., direct and indirect communicative acts) and deceitful tasks, but not for ironic ones. The addition of scores for performance on second-order ToM tasks (Step 3) did not improve the prediction for any of the pragmatic tasks.

As a final point, the analyses also showed that within both the model including first-order ToM (Step 2) and the model comprising second-order ToM (Step 3), R<sup>2</sup> only partially follows the trend of increasing difficulty exhibited by children in solving pragmatic problems, when considering both linguistic and extralinguistic tasks, i.e., first-order ToM (sincere, R<sup>2</sup> = 0.304; deceit, R<sup>2</sup> = 0.558; irony, R<sup>2</sup> = 0.269), second-order ToM (sincere, R<sup>2</sup> = 0.310; deceit, R<sup>2</sup> = 0.565; irony, R<sup>2</sup> = 0.269). In particular, the R<sup>2</sup> value increases across tasks between sincere and deceitful communicative acts, but not between deceitful and ironic ones.
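The F Change statistic of a hierarchical regression step can be derived from the R<sup>2</sup> values of two nested models. The sketch below is only an illustration: it assumes that the R<sup>2</sup> values quoted above are unadjusted (the text does not say), and the function name `f_change` is hypothetical.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added=1):
    """F statistic for the R-squared gain of a nested regression step.

    n       : number of participants
    k_full  : number of predictors in the larger model
    k_added : number of predictors added in this step
    """
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    df_error = n - k_full - 1
    if df_error <= 0 or k_added <= 0:
        raise ValueError("not enough participants for this model")
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / df_error)

# Sincere acts, Step 2 (age + first-order ToM) vs. Step 3 (+ second-order ToM),
# with n = 120 children and 3 predictors in the full model.
print(round(f_change(0.304, 0.310, n=120, k_full=3), 2))
```

Under these assumptions the gain from adding second-order ToM for sincere acts yields an F close to 1 on (1, 116) degrees of freedom, well below the conventional 0.05 critical value of roughly 3.9, which is in keeping with the finding that Step 3 did not improve the prediction.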

#### DISCUSSION

The goal of the present study was to investigate the possible role of ToM, both first- and second-order, in explaining children's ability to comprehend and produce different kinds of pragmatic phenomena, namely sincere (direct and indirect), deceitful, and ironic communicative acts, expressed through linguistic and non-verbal/extralinguistic modalities.

First of all, and in line with our expectation and the relevant literature (Bucciarelli et al., 2003; Bosco and Bucciarelli, 2008; Filippova and Astington, 2010; Talwar and Crossman, 2011; Bosco et al., 2013), overall our results showed that children's ability to comprehend and produce the pragmatic phenomena investigated increases with age, in both the linguistic and non-verbal/extralinguistic modalities. Analyzing this result in more depth for each pragmatic task and comparing age groups, we found that, for the linguistic modality, children showed no differences at the sincere acts, while they performed significantly differently at the deceitful acts, with the only exception being the two oldest groups (6- vs. 8-year-old children); as for ironic acts, the youngest group of 3-year-old children performed significantly worse than all the other groups, while children belonging to the remaining age groups had comparable performance. We explain such results on the basis of the Cognitive Pragmatic theory, proposing that, because of the inferential process involved, sincere communicative acts are the easiest task for children to solve, and thus they performed quite well starting from 3;6–4 years of age. Again following the tenets of the Cognitive Pragmatic theory, a deceitful communicative act represents a more difficult pragmatic task to solve, and only starting from 6;6–7 years of age do children handle it without errors. Finally, irony is the most difficult task to solve and it represents a really hard task to manage for children as young as 3;6–4 years of age. However, it remains quite a difficult task for the older children as well. Considered globally, the same pattern of results and the same explanation hold for the extralinguistic modality; the only exception is represented by the younger 3- and 5-year-olds, who showed differences in performance at sincere communicative acts. A possible explanation for this difference is that the extralinguistic modality was harder for 3- and 5-year-old children to deal with, and this additional difficulty allowed this difference in performance to emerge.

TABLE 5 | Partial correlation (Pearson r, controlling for age) between overall ToM tasks (first- and second-order) and pragmatic tasks, in the overall group.

∗∗p < 0.01; <sup>∗</sup>p < 0.05.

TABLE 6 | Hierarchical regression analysis: pragmatic tasks (linguistic and extralinguistic) in the overall group.

Variables significantly predicting pragmatic performance are marked in bold.

In line with our hypothesis, and considering each age group separately, we also found an increasing trend of difficulty in children's performance across the pragmatic tasks investigated: children were able to comprehend and produce sincere communicative acts more accurately than deceit, which was followed by ironic speech acts, which were the most difficult task to deal with. Considered overall, this linear increase in difficulty holds in both the linguistic and extralinguistic modality following the patterns of results found in previous studies (Bucciarelli et al., 2003; Bosco and Bucciarelli, 2008; Bosco et al., 2013).

The novelty of the present research was to explore the causal role of age and ToM, both first- and second-order, in explaining children's pragmatic performance, in both the linguistic and non-verbal/extralinguistic modalities. Some authors have indeed proposed that pragmatic/communicative ability involves mentalizing, i.e., ToM, abilities (Sperber and Wilson, 2002; Tirassa et al., 2006a,b; Tirassa and Bosco, 2008; Fernandez, 2011; Bosco et al., 2012b; Cummings, 2015). In line with this proposal we found a correlation, controlling for age, between overall ToM tasks (first- and second-order tasks) and the linguistic and extralinguistic pragmatic tasks, with the exception of linguistic sincere communicative acts and extralinguistic irony. The same pattern of results holds for first-order ToM. Considering second-order ToM, we only found a significant correlation with linguistic deceit.

The correlation we found between children's performance on ToM tasks (overall and first-order) and sincere acts may be considered a surprising result. We can explain this result by considering that half of the items making up our experimental material were indirect communicative acts. Studies in the literature have suggested that ToM has a role in the comprehension of indirect speech acts. For example, Corcoran et al. (1995) and Corcoran (2003) showed that patients with schizophrenia, a disorder explained (e.g., Frith, 1994) on the basis of a primary deficit in ToM, have difficulties in the comprehension of indirect speech acts.

Our results concerning the correlation between ToM (overall and first-order) and deceit are in line with the current literature (see for example Chandler et al., 1989; Polak and Harris, 1999; Ma et al., 2015). In particular, our result regarding the significant role played by second-order ToM in dealing with deceitful acts is in line with Talwar and Lee (2008). The authors showed that the performance of children aged from 3 to 8 years on second-order ToM tasks is related to their ability to maintain a plausible explanation in order not to reveal their lies. Some authors also found that second-order ToM ability correlates with pro-social lies (Cheung et al., 2015; Williams et al., 2016), which are considered more sophisticated than other kinds of lies. In particular, Broomfield et al. (2002) found that only pro-social lies, but not other forms of lies, are related to second-order ToM. However, our experimental material did not include pro-social lies, so a direct comparison is not possible. Our results concerning the correlation between ToM (second-order) and irony are also consistent with the literature, in particular with Winner (1997), who argued for the role of second-order ToM in irony comprehension, and Nilsen et al. (2011), who reported that second-order ToM is correlated with children's comprehension of verbal irony.

Taken globally, our results are also in line with the literature concerning autism, a pathology characterized by a ToM impairment (Baron-Cohen et al., 1985) and showing how people with autism have difficulties in comprehending and producing indirect, deceitful and ironic communicative acts (Happé, 1993; Angeleri et al., 2016, for a review see Loukusa and Moilanen, 2009).

However, in order to conduct an in-depth investigation of the possible role of age and of first- and second-order ToM in explaining the improvement in children's performance across each pragmatic task (linguistic and non-verbal/extralinguistic), we performed a hierarchical multiple regression analysis. We found that, as expected, age has a significant role in explaining children's performance on all the investigated tasks. The results also showed, consistently with the correlation analysis, a significant role for first-order ToM in explaining children's performance in the comprehension and production of sincere (direct and indirect) communicative acts as well as their ability to manage deceitful communicative acts. By contrast, we did not detect any significant role for second-order ToM in explaining any of the pragmatic tasks investigated, thus testifying, when the role of age and first-order ToM is kept under control, to a limited causal role of this more sophisticated ToM aspect in explaining children's ability to deal with sincere, deceitful and ironic communicative acts.

A direct comparison of this result with the current literature is not possible, since other studies (see for example Talwar and Lee, 2008; Nilsen et al., 2011) usually limit their investigation to correlation analyses. An exception is the study by Angeleri and Airenti (2014) where, despite the significant correlation found between ToM (first- and second-order) ability and the comprehension of linguistic ironic tasks, a more detailed investigation, run through path analysis, underlined that ToM had no direct effect on humor comprehension. In line with the results provided by Angeleri and Airenti (2014), our hierarchical regression analysis showed that, when the role of age is kept under control, neither first- nor second-order ToM has a direct impact on children's performance on irony tasks. Our finding thus did not provide empirical support for theories proposing that ToM (Happé, 1993), and specifically second-order ToM (Winner and Leekman, 1991), plays a key role in explaining irony comprehension. Furthermore, the results of the present investigation, in addition to those of Angeleri and Airenti (2014), indicate that the use of ironic statements, as is the case, for example, for some of the items composing the Strange Stories (Happé, 1994), might not be a reliable measure for investigating ToM ability in children.

Lastly, we now wish to focus on the role of ToM in explaining the increasing trend of difficulty shown by children in dealing with sincere, deceitful and ironic communicative acts, using both linguistic and extralinguistic expressive means. We found that ToM, whether first- or second-order, could not be considered the best factor explaining the increasing trend of difficulty in children's performance. Indeed, we found that R<sup>2</sup> only partially follows the trend of increasing difficulty exhibited by children in solving each kind of investigated task, i.e., sincere, deceitful and ironic communicative acts. The R<sup>2</sup> value indicates how much variance is explained by a certain variable. If ToM (both first- and second-order) were the factor that best explained the difference in difficulty among the three tasks, we would expect the R<sup>2</sup> value to increase according to the level of difficulty detected in managing linguistic and extralinguistic sincere communicative acts, deceit, and irony. However, this value increases when considering sincere and deceitful communicative acts, but not when considering deceit and irony.

To summarize, our results on the existence of an increasing trend of difficulty across pragmatic tasks seem only partially explained by the role of ToM (see also Bosco and Gabbatore, 2017). Considered overall, our results suggest a role for first-order ToM in explaining the differences in performance only when considering sincere and deceitful acts, but not when considering deceit and irony. A possible alternative explanation for the existence of such an increasing trend of difficulty is based on the inferential complexity underlying the pragmatic tasks investigated (see Bucciarelli et al., 2003; Bosco and Bucciarelli, 2008; Bosco et al., 2013). The existence of an increasing trend of difficulty in the comprehension and production of sincere (direct and indirect), deceitful, and ironic communicative acts has been experimentally demonstrated, not only in studies on children (see Bosco et al., 2009, 2012c), but also through the assessment of pragmatic abilities in patients with schizophrenia (Colle et al., 2013), and individuals with brain injury (Bara et al., 2001; Angeleri et al., 2008), left brain damage (Gabbatore et al., 2014), and right brain damage (Parola et al., 2016). Other authors in the literature have also highlighted the key role that inferential processes play in the comprehension process (Leinonen et al., 2000). In particular, Pexman and Glenwright (2007) highlighted the role of inferential ability in the comprehension of an ironic statement.

A limitation of the present investigation is that it does not consider the role that other cognitive functions, such as the executive functions of planning, working memory, inhibition, and shifting, might play in explaining the development of children's communicative-pragmatic performance. In the future, it might be useful to conduct a longitudinal study in order to observe the development of pragmatic abilities in a specific group of children over time. Even though the present investigation focuses on pragmatics, a further interesting topic of study is the influence of linguistic development on children's ToM ability. As a final point, the merit of the present study was to help clarify the (limited) causal role of first- and second-order ToM in explaining the improvement in children's pragmatic performance across different kinds of pragmatic tasks, such as sincere, deceitful, and ironic communicative acts.


#### ETHICS STATEMENT

Bio-ethical Committee of the University of Turin (Protocol no. 13620121).

#### AUTHOR CONTRIBUTIONS

IG took care of the preparation and administration of the experimental material, ran the statistical analysis, and wrote the corresponding part of the paper (Methods and Results). FB is responsible for the whole research project. She took care of the review of the literature and wrote the introductory part of the manuscript and its discussion.

#### ACKNOWLEDGMENTS

The research was funded by MIUR – Ministero Italiano dell'Università e della Ricerca – PRIN – Progetti di Ricerca di Rilevante Interesse Nazionale\_2017. Project "The interpretative brain: Understanding and promoting pragmatic abilities across lifespan and in mental illness" project code 201577HA9M.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00021/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Bosco and Gabbatore. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Facial Expression Recognition in Children with Cochlear Implants and Hearing Aids

Yifang Wang<sup>1</sup> , Yanjie Su<sup>2</sup> \* and Song Yan<sup>3</sup>

<sup>1</sup> Beijing Key Laboratory of Learning and Cognition, Department of Psychology, Capital Normal University, Beijing, China, <sup>2</sup> Department of Psychology, Peking University, Beijing, China, <sup>3</sup> School of Humanities and Social Sciences, Jacobs University, Bremen, Germany

Facial expression recognition (FER) is an important aspect of effective interpersonal communication. In order to explore whether the development of FER was delayed in hearing impaired children, 44 child participants completed labeling and matching tasks to identify four basic emotions (happiness, sadness, anger, and fear). Twenty-two participants had either a cochlear implant (CI) or a hearing aid (HA), while 22 had normal hearing; participants were matched across conditions by age and gender. The results showed that children with a CI or HA were developmentally delayed not only in their emotion-labeling (verbal) tasks but also in their emotion-matching (nonverbal) tasks. For all participants, the emotion-labeling task was more difficult than the emotion-matching task. Additionally, the relative difficulty of recognizing four different emotional expressions was similar between verbal and nonverbal tasks.

#### Edited by:

Anne Henning, SRH Hochschule für Gesundheit Gera, Germany

#### Reviewed by:

Maciej Haman, University of Warsaw, Poland Karin Wiefferink, Dutch Association for the Deaf and Hard of Hearing Child, Netherlands

> \*Correspondence: Yanjie Su yjsu@pku.edu.cn

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 07 June 2016 Accepted: 06 December 2016 Published: 21 December 2016

#### Citation:

Wang Y, Su Y and Yan S (2016) Facial Expression Recognition in Children with Cochlear Implants and Hearing Aids. Front. Psychol. 7:1989. doi: 10.3389/fpsyg.2016.01989

Keywords: facial expression recognition, cochlear implants, hearing aids, verbal task, non-verbal task

# INTRODUCTION

Facial expression recognition (FER) is important for social interactions and effective communication. Deficits in young children's ability to recognize facial expressions can lead to impairments in social functioning (Herba and Phillips, 2004; Batty and Taylor, 2006). Denham et al. (1990) showed that peer-rated popularity and academic achievement correlated strongly with the ability to recognize others' emotional expressions.

Hiroko and Yamaguchi (2014) found that Japanese babies between the age of 6 and 7 months were highly sensitive to angry facial expressions. This could possibly be an adaptation that might allow them to determine if they are in potentially dangerous situations. Infants also use emotional expressions as behavioral cues. For example, when their mothers appeared happy, they were more likely to participate in novel situations. Developmental researchers used different paradigms (matching and labeling) to measure how accurately children recognized facial expressions of different emotions (Markham and Adams, 1992; Bruce et al., 2000). They found that, compared to children who were 3 to 6 years old, older children more accurately recognized facial expressions (Denham et al., 1990; Widen and Russell, 2008).

In China, there are 115,000 children under 7 years old with severe to profound or complete deafness, and 30,000 babies are born with hearing impairments annually (Liang and Mason, 2013). A cochlear implant (CI) is a device that provides direct electrical stimulation to the auditory nerve in the inner ear, giving deaf individuals the ability to hear. Children with severe to profound hearing loss (71 and 90 dB HL or greater) who cannot be helped with hearing aids (HA) may resort to CIs.

Some researchers have examined the broader effects of a CI or HA on children's emotional and social development. Ziv et al. (2013) investigated the FER of hearing children, deaf children who communicated with sign language, and children with a CI who communicated orally. They found that when completing labeling tasks, no significant difference in performance existed among the three groups. Additionally, they found that in pointing tests, children with a CI and those in the hearing group achieved higher scores than deaf children who could communicate using sign language. Finally, they found no significant difference between children with a CI and children with normal hearing. However, Wang et al. (2011) found that children with a CI or HA displayed less developed FER compared to children with normal hearing, especially regarding their ability to recognize anger and fear. Additional support for the finding of Wang et al. (2011) was provided by Wiefferink et al. (2013), whose findings revealed that, compared to hearing children, children between the ages of 2.5 and 5 years old with a CI were less proficient in recognizing emotions in facial expressions. In all three studies, children were presented with four photos of facial expressions and then randomly asked "Who looks happy/sad/angry/fearful?" The respective image that the children indicated was recorded. In another study, Most and Aviner (2009) used a written emotion vocabulary test containing 36 items to examine 40 children between the ages of 10 and 17 years. Their sample included children with normal hearing, children with a CI, and children with a HA. Each item was designed to trigger a specific emotion. The participants were asked to indicate if each facial expression showed happiness, sadness, anger, or fear. The children in the study were matched by age and gender and the results showed that children with normal hearing were no more proficient at FER than children with a CI or HA. Additionally, no differences were found between children with a CI and HA. Moreover, Hopyan-Misakyan et al. (2009) found similar results in a study of children with a CI who were between 7 and 13 years old.

McClure (2000) asserted that there are two phases of FER development. The first is the ability to discriminate between different facial expressions, independent of language skills. The matching task mentioned above is a demonstration of this initial stage of FER. In this task, children had to match the emotional expressions of persons in one group to those of persons in another group based purely on the visual stimuli presented. Subsequently, researchers found amygdala activation during these types of non-verbal FER tasks with low cognitive demands (Herba et al., 2006). Another study involved identifying and labeling facial expressions. Children were asked to point to the facial expression that matched the label. Attenuated amygdala activation and increased prefrontal activation were observed during the verbal FER tasks (Phillips et al., 2003).

Székely et al. (2011) found that normally developing 3-year-olds showed a difference in the levels of recognition of the four basic emotions (happiness, sadness, anger, and fear) between the verbal and nonverbal FER tasks. For example, on the nonverbal (emotion-matching) task, fear was most easily recognized, while on the verbal (emotion-labeling) task, fear was the most difficult to recognize. It is therefore possible that in the early rehabilitation of children with a CI or HA, a nonverbal (e.g., matching) task is more suitable.

The primary purpose of the present study was to explore the differences between the performance of children with a CI or HA and children with normal hearing, who were matched by age and gender, on emotion-matching (nonverbal) tasks and emotion-labeling (verbal) tasks. The secondary purpose was to examine which FER task was more difficult and which emotional expressions (happiness, sadness, anger, and fear) were the most difficult to recognize during the verbal and nonverbal tasks. We hypothesized that children with a CI or HA in early rehabilitation would be developmentally delayed on both emotion-matching and emotion-labeling tasks.

# MATERIALS AND METHODS

# Participants

The experiment included 22 children with a CI or HA (13 boys and 9 girls) from Beijing Sullivan Rehabilitation Center and Beijing Sullivan kindergarten. There were 10 children with a CI and 12 children using a HA. In addition, the study included 22 children with normal hearing from Beijing Normal University kindergarten (13 boys and 9 girls). The teachers in the kindergarten assisted in acquiring parental consent. The children in the two groups were matched by age and gender, and the match on age was checked with an independent-samples t-test. The results of the analysis showed that the difference between the mean ages of normal hearing children (54.41 ± 10.76 months) and children with a CI or HA (54.86 ± 11.97 months) was not statistically significant, t(42) = 0.132, p > 0.05. Of the 22 children with a CI or HA, 19 had over half a year of CI or HA experience and language rehabilitation. One had been using a HA for one month, and two had been using a CI for 4 months. None of the children had an additional disability (such as blindness or autism). All children attended kindergarten from 8:00 am to 5:00 pm, 5 days a week, Monday to Friday, and received daily one-hour, individual, auditory-oral therapy sessions. In addition, none of the parents had hearing impairments. Six children lived with their teachers because their parents worked in other cities. The participants with a CI or HA were selected by teachers who believed that they could understand the tasks. They all had prelingual deafness and did not know sign language. Mandarin was the children's first language. The attributes of the children are shown in **Table 1**.
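The age comparison reported above can be approximately re-derived from the group summaries alone. The following is a minimal sketch, assuming a pooled-variance (Student's) t-test, which is consistent with the reported 42 degrees of freedom; the function name and the tabulated critical value are introduced here for illustration and are not taken from the original analysis.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent-samples t statistic computed from group summaries."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df


# Mean age in months (M, SD, n): normal hearing group, then CI or HA group.
t_stat, df = pooled_t_from_summary(54.41, 10.76, 22, 54.86, 11.97, 22)

# Two-tailed critical value for df = 42 at alpha = 0.05 (standard table value).
T_CRIT_05 = 2.018

print(f"t({df}) = {t_stat:.3f}; significant at 0.05: {abs(t_stat) > T_CRIT_05}")
```

Run on the rounded means and standard deviations, the sketch gives a t value of about 0.13, in line with the reported t(42) = 0.132 up to rounding, and well below the critical value.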

# Materials and Procedure

Color images of four basic emotions (happiness, sadness, anger, and fear) (Wang et al., 2011) were used in emotion-matching tasks and emotion-labeling tasks. Black and white images of four shapes (circle, square, rectangle, and triangle) were used as control tasks to measure children's basic abilities of matching and labeling. The images were 7 cm by 9.5 cm.

#### Practice

Prior to the test trials, a color-matching task and an emotion-matching task, different from those used in the test trials, were used to ensure that the children understood both the concept of matching and the tasks. First, the experimenter or the teacher asked children to match the color. If a child did not successfully complete the color-matching task, the experimenter conducted more trials until the child correctly completed two consecutively. The children were then asked to match the images of emotional expressions. If a child could not complete the emotion-matching task, the experimenter or the teacher would show him or her the correct response and ask him or her to match the emotional expression again. Children who completed these two practice tasks proceeded to the formal test trials (matching task and labeling task).

#### TABLE 1 | Characteristics of participants in each group.

fpsyg-07-01989 December 19, 2016 Time: 12:27 # 3

#### Matching Test Task

This session included two tasks: an emotion-matching task and a shape-matching task. The shape-matching task was used to control for the presence of basic matching abilities. Children were asked to match the emotion or shape of a target stimulus at the top of a paper with one of the four choices presented at the bottom (see **Figure 1**). For example, the instruction was "please match the same facial emotional expression". The session comprised eight emotion-matching trials, which used two female and two male identity pairs, and eight shape-matching trials. Each emotion and shape was used as the target stimulus twice. The position of the stimuli was balanced and the order in which they were presented was randomized. A correct response was given a score of 1 and an incorrect response was given a score of 0. The total scores for each of the emotion- and shape-matching tasks ranged from 0 to 8.

#### Labeling Test Task

This session also included two tasks: emotion-labeling task and shape-labeling task. The shape-labeling task was used to control for the presence of basic labeling abilities. Children were asked to point to the item that the experimenter asked for randomly, either an emotion (happiness, sadness, anger, and fear) (Wang et al., 2011) or a shape (circle, square, rectangle, and triangle) (control task). For example, "who is happy" was the child's cue to point out the respective emotional expression that matched the label. The positions of the four facial expressions or shapes were counterbalanced. The order that men and women were presented in was also counterbalanced. A correct response was given a score of 1, and an incorrect response was given a score of 0. The total scores for the emotion- and shape-labeling tasks ranged from 0 to 8.

The paradigms of the matching and labeling tasks followed those used by Székely et al. (2011). The labeling task in the present study corresponds to what Ziv et al. (2013) called the "pointing task". The order of the shape and emotion matching and labeling tasks was determined using a Latin-square design. SPSS 19.0 was used to analyze the data.
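The text does not specify how the Latin square was constructed. As an illustration only, a cyclic 4 × 4 square over the four tasks can be generated as follows; the cyclic construction and the rotation rule for assigning orders to participants are assumptions, not the authors' documented procedure.

```python
def cyclic_latin_square(items):
    """Return len(items) orders; each item occurs once per row and once per position."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


TASKS = ["emotion-matching", "shape-matching", "emotion-labeling", "shape-labeling"]
ORDERS = cyclic_latin_square(TASKS)


def order_for(participant_index):
    """Assign the task orders to participants in rotation (hypothetical rule)."""
    return ORDERS[participant_index % len(ORDERS)]


for row in ORDERS:
    print(" -> ".join(row))
```

Across every four consecutive participants, each task then appears exactly once in each serial position, which is the property a Latin-square design uses to balance order effects.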

# RESULTS

Two scatter plots (see **Figures 2** and **3**) show the distribution of scores across the different tasks (shape and emotion) and participant groups (normal hearing, CI, and HA). **Figures 2** and **3** show that some participants received the same score, most notably for the shape tasks.

Following inspection of the score distributions, four homogeneity of variance tests were conducted. The scores of emotion-matching and emotion-labeling tasks showed homogeneity of variance [F(1,42) = 3.77, p > 0.05; F(1,42) = 2.65, p > 0.05]. However, the scores of shape-matching and shape-labeling tasks showed heterogeneity of variance [F(1,42) = 14.94, p < 0.05; F(1,42) = 26.96, p < 0.05].

We conducted a repeated-measures ANOVA with the type of participant (normal hearing/CI or HA) as a between-subject independent variable, and the type of task (matching/labeling) and type of stimuli (shapes/emotions) as within-subject independent variables. Because of the heterogeneity of variance, we used a Greenhouse–Geisser correction. The analysis showed a significant main effect of the type of participant, F(1,42) = 8.95, p < 0.01, η<sup>2</sup> = 0.18, which indicated that hearing children did significantly better than children with a CI or HA. It also showed a significant main effect of the type of stimuli, F(1,42) = 63.38, p < 0.01, η<sup>2</sup> = 0.60. The only significant interaction was between the type of task and the type of stimuli, F(1,42) = 11.43, p < 0.01, η<sup>2</sup> = 0.21. The other interactions and the main effect of the type of task were not significant (ps > 0.05).
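The reported effect sizes can be cross-checked against the F ratios. The sketch below assumes that the η<sup>2</sup> values are partial eta squared, as SPSS reports by default for this type of model, and uses the identity partial η<sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>); the function and variable names are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Effect label -> (reported F(1, 42), reported eta squared), as given in the text.
REPORTED = {
    "type of participant": (8.95, 0.18),
    "type of stimuli": (63.38, 0.60),
    "task x stimuli interaction": (11.43, 0.21),
}

for effect, (f_value, reported_eta) in REPORTED.items():
    eta = partial_eta_squared(f_value, 1, 42)
    print(f"{effect}: computed {eta:.2f}, reported {reported_eta:.2f}")
```

Under this assumption, all three computed values agree with the reported ones to two decimal places.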

Based on the significant interaction between the type of task and the type of stimuli, we conducted a simple effect analysis. For both matching and labeling tasks, the scores using shapes as stimuli were significantly higher than those using emotions as stimuli (p < 0.05). When the stimuli were emotions, the scores of the matching tasks were significantly higher than those of the labeling tasks, p < 0.05. However, when the stimuli were shapes, no significant difference was present, p > 0.05.

#### TABLE 2 | Descriptive statistics for the labeling and matching task for each group (M (SD)).

Because the four emotion scores used as dependent variables ranged only from 0 to 2, nonparametric tests were used. Two Friedman tests, one for the matching task and one for the labeling task, indicated significant differences among the four types of emotions (ps < 0.05). Combined with the descriptive statistics shown in **Table 2**, the order of the four emotion scores from high to low was sadness, happiness, anger, and fear for the labeling task and happiness, sadness, anger, and fear for the matching task.
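For readers unfamiliar with the Friedman test, the following sketch shows how the tie-corrected statistic is computed when each child contributes one score per emotion. The scores below are made up purely for illustration and are not the study's data; because scores range only from 0 to 2, ties within a child are frequent, so average ranks and the tie correction matter. If every child had identical scores for all four emotions, the denominator would be zero and the statistic would be undefined.

```python
def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def friedman_statistic(rows):
    """Tie-corrected Friedman chi-square for n subjects (rows) by k conditions."""
    n, k = len(rows), len(rows[0])
    ranked = [average_ranks(row) for row in rows]
    rank_sums = [sum(r[j] for r in ranked) for j in range(k)]
    sum_sq_ranks = sum(x * x for r in ranked for x in r)
    c1 = n * k * (k + 1) ** 2 / 4
    between = sum(s * s for s in rank_sums) - n * c1
    return (k - 1) * between / (sum_sq_ranks - c1)


# HYPOTHETICAL scores (0-2) for three children; columns are
# happiness, sadness, anger, fear. Illustration only, not the study's data.
HYPOTHETICAL_SCORES = [
    [2, 2, 1, 0],
    [2, 1, 1, 0],
    [2, 2, 2, 1],
]

q = friedman_statistic(HYPOTHETICAL_SCORES)
CHI2_CRIT_05 = 7.815  # chi-square critical value for df = 3 at alpha = 0.05
print(f"chi-square(3) = {q:.3f}; significant at 0.05: {q > CHI2_CRIT_05}")
```

With so few subjects the chi-square approximation is rough, so the example is meant to convey the mechanics of the test rather than a substantive result.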


# DISCUSSION

The results showed that children with a CI or HA were developmentally delayed in the performance of both emotion-labeling and emotion-matching. These findings contradict the prior findings of Ziv et al. (2013), who found that for labeling and pointing tasks, there was no significant difference between children with a CI and hearing children. The two studies differ in two ways: the attributes of the participants and the experimental stimuli. In Ziv et al.'s (2013) study, the mean age of children with a CI was 6.6 years, whereas in the present study, the mean age of children with a CI or HA was 4.3 years. This is relevant because, according to Denham et al. (1990), FER development progresses with age, meaning that differences in findings could possibly be attributed to the children being in different FER developmental phases. In addition, in Ziv et al.'s (2013) study, the mean age at implantation was 2.5 years, whereas in the present study, 19 children with a CI or HA had between half a year and 2 years of CI or HA experience and language rehabilitation, and the two children with a CI and one child with a HA had less than half a year. During the early stage of rehabilitation, participants in the present study could not yet communicate with others fully and effectively, although they communicated orally. The present study also used the facial expressions of adult males and females, while Ziv et al.'s (2013) study used photographs of boys and girls who were the same age as the participants. Anastasi and Rhodes (2005) showed that it was easier for participants to interpret the emotional expressions of individuals in their own age group.

The relative difficulty of recognizing the four emotional expressions was similar between verbal and nonverbal tasks, except for the order of happiness and sadness. These findings were inconsistent with the results of Székely et al. (2011). One important difference was the number of alternatives available to choose from during each trial. Székely et al. (2011) used only two alternatives during each trial of the matching task. In contrast, the present study used four alternatives. Another important difference lay in the participants. All children in Székely et al. (2011) were 3-year-olds with normal hearing, while the participants in our experiment were children with normal hearing and children with a CI or HA who were between 30 and 84 months old.

The findings showed that both children with normal hearing and children with a CI or HA were most accurate when matching and labeling happy and sad faces, followed by angry and fearful faces. Vicari et al. (2000) found a similar rank order for the four types of emotions. They reported that children between 5 and 10 years old consistently recognized happiness and sadness regardless of age, whereas the recognition of anger and fear improved with age. In contrast, the findings of Ziv et al. (2013) indicated that happiness was the most difficult emotion to recognize in the "pointing task". This discrepancy is possibly due to individual socio-cultural experience and the complexity of facial expressions (Montirosso et al., 2010; Helen et al., 2015). Hence, it is hard to reach a universal conclusion on the developmental sequence of FER (Vicari et al., 2000; Herba and Phillips, 2004; Helen et al., 2015).

The primary limitation of the present study was that we did not compare children with a CI to those with a HA due to the small sample size. However, Most and Aviner (2009) found no difference between children with a CI and those with a HA in the ability to recognize facial expressions. An additional limitation of this study was that language ability was not measured.

To summarize, the recognition of facial expressions during verbal and nonverbal tasks was delayed in children with a CI or HA who were in the early rehabilitation stage. For all participants, the emotion-labeling task was more difficult than the emotion-matching task. The relative difficulty of recognizing the four emotional expressions was similar between verbal and nonverbal tasks. The results of this study suggest that a future study of the rehabilitation process should be conducted to understand how it affects the development of FER in children with a CI or HA.

#### ETHICAL APPROVAL

All procedures performed in the study involving human participants were conducted in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Written informed consent was obtained from all participants included in the study.

#### AUTHOR CONTRIBUTIONS

YW: Substantial contributions to the conception or design of the work. Analysis and interpretation of data for the work. Drafting the work. YS: Final approval of the version to be published. Revising it critically for important intellectual content. SY: Drafting the work. Acquisition of data.

#### FUNDING

This research was supported by National Natural Science Foundation of China [grant number 31371058 to YW] and State Administration of Press, Publication, Radio, Film, and Television of The People's Republic of China [grant number GD1608 to YW].

#### ACKNOWLEDGMENT

We are grateful to the children and teachers in the Beijing Sullivan Rehabilitation Center and Beijing Sullivan kindergarten.

# REFERENCES


in 4- to 18-year-olds. Rev. Soc. Dev. 19, 71–92. doi: 10.1111/j.1467-9507.2008.00527.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Wang, Su and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Theory of Mind and Reading Comprehension in Deaf and Hard-of-Hearing Signing Children

Emil Holmer<sup>1</sup> \*, Mikael Heimann<sup>2</sup> and Mary Rudner<sup>1</sup>

<sup>1</sup> Linnaeus Centre HEAD, Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden, <sup>2</sup> Infant and Child Lab, Division of Psychology and Swedish Institute for Disability Research, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden

Theory of Mind (ToM) is related to reading comprehension in hearing children. In the present study, we investigated progression in ToM in Swedish deaf and hard-of-hearing (DHH) signing children who were learning to read, as well as the association of ToM with reading comprehension. Thirteen children at Swedish state primary schools for DHH children performed a Swedish Sign Language (SSL) version of the Wellman and Liu (2004) ToM scale, along with tests of reading comprehension, SSL comprehension, and working memory. Results indicated that ToM progression did not differ from that reported in previous studies, although ToM development was delayed despite age-appropriate sign language skills. Correlation analysis revealed that ToM was associated with reading comprehension and working memory, but not sign language comprehension. We propose that some factor not investigated in the present study, possibly represented by inference making constrained by working memory capacity, supports both ToM and reading comprehension and may thus explain the results observed in the present study.

Keywords: deaf and hard-of-hearing, children, Theory of Mind, sign language, working memory, reading comprehension

#### Edited by:

Daniela Bulgarelli, University of Turin, Italy

#### Reviewed by:

Gábor Péter Háden, Hungarian Academy of Sciences, Hungary Jo Van Herwegen, Kingston University London, UK

> \*Correspondence: Emil Holmer emil.holmer@liu.se

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 06 March 2016 Accepted: 23 May 2016 Published: 07 June 2016

#### Citation:

Holmer E, Heimann M and Rudner M (2016) Theory of Mind and Reading Comprehension in Deaf and Hard-of-Hearing Signing Children. Front. Psychol. 7:854. doi: 10.3389/fpsyg.2016.00854

# INTRODUCTION

Theory of Mind (ToM) is the ability to understand and predict the mental worlds of oneself and others and how they relate to behavior (Frith and Frith, 2012), or, simply, to represent and understand minds. Our understanding of the functional correlates of ToM is still evolving (Carlson et al., 2013); however, one interesting finding is that ToM is associated with reading ability (e.g., Kim, 2015). Early studies assessed ToM using false belief tasks (e.g., Wimmer and Perner, 1983), in which a correct solution requires a protagonist's false belief to be kept in mind. This procedure reflects early conceptualizations of ToM as an all-or-nothing capacity (cf. Baron-Cohen et al., 1985), typically in place at age four (for a meta-analysis, see Wellman et al., 2001). Using a five-point scale, Wellman and Liu (2004) showed that ToM is in fact an ability with a developmental progression, in which representation and understanding of mind emerge in a specific order over time. Their original finding relating to North American children has been replicated in other cultures (e.g., Germany: Henning et al., 2011; China: Wu and Su, 2014; for a review, see Wellman, 2014). According to the five-point scale (Wellman and Liu, 2004), the first stage in ToM development is the ability to understand that the desires of oneself and others may not be the same. This ability appears around the age of 2 years in typically developing children. The second stage, typically emerging at the age of three, is the ability to understand that the beliefs of oneself and others may differ. The third stage, also emerging at 3 years, is the ability to understand that someone else's knowledge may not be the same as one's own. The ability to understand false belief is the fourth stage in the Wellman and Liu (2004) ToM scale, and this is followed by a fifth stage which involves the ability to understand that displayed and experienced emotions may not be the same. The validity of this scale is supported by other work showing that while children have a basic understanding of desires at the age of two, at the age of three, they also start to differentiate between their own beliefs and knowledge and those of others (for reviews, see Carlson et al., 2013; Wellman, 2014). During a similar phase in development, an increase in working memory capacity and executive skills is typically also observed, and the level of these skills seems to constrain development of ToM (Moses and Tahiroglu, 2010; Carlson et al., 2013).

Several disabilities are related to changes in the development of ToM. For example, in children with autism spectrum disorder (ASD), ToM shows atypical developmental progression (Peterson et al., 2005, 2012) which has been attributed to atypical neurobiological development (Lord and Bishop, 2015). In particular, it has been reported that children with ASD have a better ability to understand hidden emotions than false beliefs, possibly because it is easier to form representations of emotions, which are concrete, than of beliefs, which are abstract (Peterson et al., 2005, 2012). From an Australian cultural setting, it has been reported that deaf and hard-of-hearing (DHH) signing children display the same progression in ToM as typically developing hearing children do, but that there might be delays in the age at which different ToM concepts are understood (Peterson et al., 2005, 2012). Such delays have been attributed to socio-cultural factors, including restricted discussion of abstract concepts, including mental states, due to mismatch between the language capabilities of the child and its caregivers (Peterson, 2009; Lederberg et al., 2013; Sundqvist and Heimann, 2014). Mismatch of this nature may arise either because parents underestimate the importance of such speech-based talk or because they lack adequate sign language skills. These situations are common since only about 5% of all DHH children grow up with deaf parents who primarily use sign language themselves (Lederberg et al., 2013). DHH signing children who grow up with hearing parents having restricted knowledge of sign language, typically display delayed ToM development (Peterson, 2009; Lederberg et al., 2013). Other studies have shown that DHH children who have been exposed to a sign language from birth, i.e., DHH native signing children, perform on par with typically developing hearing children on ToM tasks (Lederberg et al., 2013). 
DHH children with poorer language capabilities are likely to have poorer representations of mental states. According to flexible resource models of working memory, when it is more difficult to form representations it may be harder to process them in working memory (Ma et al., 2014). Thus, delayed development of ToM in DHH children may be due in part to poor language skills and the limitations of working memory. Indeed, associations between ToM and working memory have been reported for DHH children (Meristo and Hjelmquist, 2009). The first purpose of the present study is to determine whether DHH signing children in Sweden follow the typical developmental trajectory in ToM and whether their level of ToM skills is related to working memory and home language.

It is estimated that between 100 and 200 DHH children are born each year in Sweden (Assessing Health Care Interventions, 2006). With the right support, many DHH children can achieve good speech development with technical aids<sup>1</sup> (Kral and Sharma, 2012), as well as age-appropriate reading skills (Geers et al., 2011; Nakeva von Mentzer et al., 2014; Asker-Árnason et al., 2015). However, there is large variation in speech outcome (Campbell et al., 2014) and some DHH children in Sweden use Swedish Sign Language (SSL; Svartholm, 2010). In order to achieve adequate linguistic development, it is important for these children that SSL is used during both social and learning activities (Svartholm, 2010; Lederberg et al., 2013). Sign languages are natural languages that are used to share thoughts, ideas and beliefs and can be understood at the same linguistic levels as spoken languages but differ from ambient spoken languages in their phonological and syntactic structure (Emmorey, 2002). Thus, sign languages and spoken languages are functionally equivalent. However, sign languages do not have written forms, and DHH children learn to read the written form of the spoken language in the cultural setting in which they grow up, even though their primary language may be signed. Generally, children learn to read by mapping written symbols onto mental representations of speech sounds (Wagner and Torgesen, 1987). When mapping is successful, lexical items are accessed, revealing the meaning of written language (Perfetti and Stafura, 2014). DHH children may lack well-established, speech-based representations. Thus, for DHH children who use sign language, learning to read depends both on the ability to learn a new language system (Perfetti and Sandak, 2000; Trezek et al., 2011), and the ability to utilize sign language skills to understand text (Chamberlain and Mayberry, 2000; Hoffmeister and Caldwell-Harris, 2014).

The bilingual approach to deaf education adopted at Swedish state primary schools for DHH children involves teachers translating written Swedish into SSL and discussing differences between the two languages with the pupils (Svartholm, 2010). Such discussions involve mutual reflection on the child's thoughts and beliefs about the content of texts. Apart from the intended purpose of supporting reading development, such reflection is likely to promote the ability to differentiate between the thoughts and beliefs of oneself and others that is fundamental to the development of ToM (Wellman, 2014). Furthermore, ToM is likely to influence the establishment of reading skills (Astington and Pelletier, 2005; Blair and Razza, 2007). Indeed, ToM has been shown to explain unique variance in reading comprehension in both typical children (Kim, 2015) and children with ASD (Ricketts et al., 2013). In other words, ToM is likely to be associated with reading comprehension in DHH children. However, to our knowledge, this association has not hitherto been studied. Thus, the second purpose of the present study is to investigate the association of ToM and reading comprehension in DHH children being educated using the bilingual approach.

<sup>1</sup>The term technical aids refers to hearing aids, cochlear implants or a combination of these.

Language comprehension and word reading skills predict reading comprehension in DHH children (Marschark and Wauters, 2008), and they have been estimated to explain around 50% of the variance in reading comprehension in hearing children (Ripoll Salceda et al., 2014). In order to secure variance in reading comprehension ability while holding word reading skill roughly constant, we selected participants who had Grade 1 word reading skills. In addition, to rule out general language delays as a factor, we wanted participants to display age-appropriate sign language skills.

In the present study, we investigated ToM in children who are at an early stage of reading development and are attending Swedish state primary schools for DHH children. We predicted typical developmental progression in ToM, although delayed in children with whom caregivers did not primarily use SSL. We also predicted that ToM would be positively related to sign language comprehension and reading comprehension, as well as working memory.

# MATERIALS AND METHODS

#### Participants

Sixteen DHH children (8 boys) with a mean age of 10.1 years (SD = 2.1; range 7.3–14.5), attending grades 1–7 in Swedish state primary schools for DHH children, were recruited. Three of the participants had an additional diagnosed medical or developmental disability and were therefore excluded from the study. These individuals performed below the 5th percentile on a test of non-verbal intelligence, i.e., Raven's Colored Progressive Matrices (Raven and Raven, 1994), indicating possible atypical cognitive functioning. Staff members at the schools selected participants they considered to be at a word reading level corresponding to Grade 1 of hearing children and subsequent testing showed that performance on word reading in the sample did not differ from Grade 1 hearing children (Holmer et al., 2016a). After selection, participants and their parents provided informed consent, attested in writing by the parents. The study was approved by the Regional Ethical Review Board, Linköping, Sweden.

Demographics are presented in **Table 1**. Mean age was 10.2 (SD = 2.3). All participants but one performed within the normal range on non-verbal intelligence. This participant scored only one point below (M = 25.2, SD = 5.88) the normal range and was not excluded since no additional disability was reported. Furthermore, performance on tests of word reading skills of this participant was within ±1SD of average performance of Grade 1 hearing children. Two participants had a vision deficit which was corrected. Eleven used technical aids and the mean age at fitting was 3.9 years (SD = 2.2). Up-to-date audiological records were not available and since ToM and other cognitive and linguistic skills were the focus of this study, audiological measurements were not made. Seven of the participants were born abroad, one in an expatriate family, and age at which residence in Sweden commenced ranged from 2.2 to 10.6 years (n = 5). The age of exposure to SSL was on average 4 years (SD = 3; range 0–12). Three participants had been exposed to SSL since birth;

#### TABLE 1 | Demographics (N = 13).


SSL, Swedish Sign Language; HA, Hearing Aid; CI, Cochlear Implant.

two of these participants had parents who were themselves deaf and used SSL. One further participant had parents who primarily used SSL; the rest had parents who spoke a language from Europe, Asia, or Africa, sometimes with the support of signs from SSL when interacting with the participant. The families of three participants partly or fully omitted to provide background data.

#### Measures

#### Theory of Mind

To assess ToM, a Swedish version (Sundqvist et al., 2014a) of Wellman and Liu's (2004) five-step ToM scale was adapted for use in SSL (see Procedure below). The Swedish version of the scale was created by translating the original scale in English into Swedish and back-translating into English in consultation with the authors. The scale includes a set of tasks (see **Table 2**) which were administered in an order recommended by Wellman and Liu (2004): diverse desires, knowledge access, content false belief, diverse beliefs, and hidden emotions. The SSL adaptation of the scale differed from standard versions in two ways. First, all names were replaced with their category designator (e.g., "the man", "the girl"). This choice was made because the particular name was not pertinent to the task and in sign languages all names have to be fingerspelled the first time they are used, probably leading to letter-by-letter representation and increased working memory load, reducing resources for performing the ToM task. The second change was based on recommendations of Peterson et al. (2005); a control question was added to the hidden emotions task, i.e., when the child had pointed to the neutral, smiling, or sad face, the child was asked why the protagonist felt that way. In accordance with the standard procedure (Wellman and Liu, 2004), one point was awarded for each of the tasks where both target question and control questions were answered correctly and the total number of tasks solved constituted an index that was used when computing correlations.
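The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring procedure: the function name `tom_index`, the task labels used as dictionary keys, and the treatment of unadministered tasks as failed are all assumptions made for the example.

```python
# Hypothetical task labels, listed in the administration order given above.
TASKS = [
    "diverse desires",
    "knowledge access",
    "content false belief",
    "diverse beliefs",
    "hidden emotions",
]


def tom_index(responses):
    """Return the number of tasks passed (0 to 5).

    `responses` maps a task label to a (target_correct, control_correct)
    pair of booleans. A task earns one point only if both are True.
    Tasks missing from the mapping count as not passed (an assumption).
    """
    score = 0
    for task in TASKS:
        target_ok, control_ok = responses.get(task, (False, False))
        if target_ok and control_ok:
            score += 1
    return score


# Made-up responses for one child, for illustration only.
example = {
    "diverse desires": (True, True),
    "knowledge access": (True, True),
    "content false belief": (True, False),  # control question failed
    "diverse beliefs": (True, True),
    "hidden emotions": (False, True),       # target question failed
}
print(tom_index(example))  # 3
```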



#### Reading Comprehension

A Swedish version of the Woodcock Passage Reading Comprehension (Woodcock, 1998) test was used (i.e., Furnes and Samuelsson, 2009). The participant silently read Swedish sentences and paragraphs of different lengths in which one word was omitted. The placement of the omitted word varied over items. In total, there were 68 passages of text and testing was stopped after a sequence of 6 errors. The first few passages consisted of one short sentence (3 or 4 words) with subsequent passages increasing in difficulty such that the last few included two or three sentences with both principal and subordinate clauses. The participant's task was to complete the passage by either saying, signing, or writing an appropriate word. The dependent measure was the number of correct answers.
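The discontinuation rule and the dependent measure can be illustrated as follows. The sketch assumes that "a sequence of 6 errors" means six consecutive errors and that items after the stopping point are not scored; the names `reading_score`, `MAX_ITEMS` and `STOP_AFTER` are hypothetical.

```python
MAX_ITEMS = 68   # passages in the test, as stated above
STOP_AFTER = 6   # consecutive errors that end testing (assumed reading of the rule)


def reading_score(answers):
    """Count correct answers up to the stopping point.

    `answers` is a sequence of booleans (True = correct), one per passage
    in presentation order. Scoring stops once STOP_AFTER errors in a row
    have been made; any later items are ignored.
    """
    correct = 0
    run_of_errors = 0
    for is_correct in answers[:MAX_ITEMS]:
        if is_correct:
            correct += 1
            run_of_errors = 0
        else:
            run_of_errors += 1
            if run_of_errors == STOP_AFTER:
                break
    return correct


# Made-up response pattern: three correct answers, then six errors in a row.
pattern = [True, True, False, True] + [False] * 6 + [True]
print(reading_score(pattern))  # 3
```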

#### Sign Language Comprehension

A version of the British Sign Language Receptive Skills Test (Herman et al., 1999) adapted for SSL was used to assess sign language comprehension. Testing started with a vocabulary check, and then the participant was presented with 40 videos of SSL sentences. For each sentence, the participant judged which picture out of three or four alternatives best matched the meaning of the sentence. The participant was awarded one point for each correct response and the dependent measure was total number of correct answers. Testing was conducted by native SSL users who had been trained to administer the test. For two of the participants, results dating from 10 months prior to testing were available and these participants were not re-tested due to ethical considerations.

#### Working Memory

To assess working memory capacity, a visuo-spatial task called The Clown test (Sundqvist and Rönnberg, 2010; Birberg Thornberg, 2011) was used. The Clown test is based on the Mr Peanut task introduced by Kemps et al. (2000). The participant was presented with a clown figure on a magnetic board, which had a set of colored magnets placed on it in a predefined pattern. After a number of seconds, corresponding to the number of magnets placed on the figure, it was turned away from the participant and the magnets were removed. Then the participant was asked to report the color of the magnets. When a response was given, the participant was asked to replace the magnets in their original configuration or point it out. The number of magnets increased from one to ten across trials, with three trials at each level (30 possible trials in all). On each trial, the original configuration of the magnets had to be correctly specified for a response to be counted as correct, and the participant had to answer correctly on at least two out of three trials with a particular number of magnets to move on to the next level. The participant was awarded one point for each correct trial, and the dependent measure was the total score.
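The progression and scoring rules of the task can be sketched as below. This is an illustration under stated assumptions rather than the published scoring protocol: the text does not say whether correct trials at the level where testing stops are counted, and the sketch assumes that they are. The function name `clown_score` and the input format are hypothetical.

```python
LEVELS = range(1, 11)    # one to ten magnets
TRIALS_PER_LEVEL = 3
PASS_CRITERION = 2       # correct trials needed to move on to the next level


def clown_score(trial_results):
    """Total number of correct trials before testing stops.

    `trial_results` maps a level (number of magnets) to a list of three
    booleans (True = configuration correctly specified). Testing ends at
    the first level with fewer than two correct trials; correct trials at
    that final level still count (an assumption).
    """
    total = 0
    for level in LEVELS:
        trials = trial_results.get(level)
        if trials is None:
            break
        n_correct = sum(trials[:TRIALS_PER_LEVEL])
        total += n_correct
        if n_correct < PASS_CRITERION:
            break
    return total


# Made-up results: the child passes levels 1 and 2 and stops at level 3.
results = {
    1: [True, True, True],
    2: [True, False, True],
    3: [False, True, False],
    4: [True, True, True],   # never administered under the stopping rule
}
print(clown_score(results))  # 6
```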

#### Procedure

Participants were tested individually at their school by members of staff who were fluent signers and familiar to the participants. In total, there were five test leaders, of whom two deaf native signers administered the test of sign language comprehension. The other three were trained to administer all other tests by the first author. Instructions were available in SSL and in Swedish, and mode of instruction was adapted to the needs of the participant. SSL instructions were provided by the test leader and were based on written instructions following a formalized coding system for rephrasing the Swedish instructions in SSL (Bergman, 2012). This procedure was used to minimize divergences in the instructions different participants were given. Test leaders made sure that the participant understood each task before testing took place, and participants practiced all but the ToM tasks before administration.

For the ToM scale, the rephrasing was done by a licensed sign language interpreter, and was then checked by two native signing teachers of DHH children. In the few instances that disagreement occurred, it was discussed until a consensus was formed. The rest of the instructions were coded by a deaf native SSL user, and checked by three of the test leaders in the study. For practical reasons, even though there was a recommended order, test order was individually adapted and breaks were taken when needed.

#### Data Analysis

fpsyg-07-00854 June 4, 2016 Time: 11:44 # 5

Descriptive statistics for reading comprehension, sign language comprehension, and working memory are reported elsewhere (Holmer et al., 2016a,b), and here we perform new analyses not previously reported. First, normality assumptions were tested, descriptive statistics were computed and data were visually inspected. Progression on the ToM scale was determined by calculating the proportion of participants who correctly solved each task. The total number of tasks correctly solved by each participant was used as an individual index of ToM ability (cf. Peterson et al., 2005, 2012). We used an independent samples t-test to test our prediction that participants with caregivers who mainly used SSL at home would score higher than other participants on the individual index of ToM ability. We then computed correlations (Pearson's r) to investigate our predictions that ToM would be associated with reading comprehension, SSL comprehension and working memory. The parametric approach was applied because Shapiro–Wilk's test statistics indicated that all variables were normally distributed (p > 0.05). A significance level of 0.05 (two-tailed) was applied for all tests. To obtain maximum power, despite low n, no correction was made for multiple comparisons and one missing data point on the sign language comprehension test was replaced with group mean when calculating correlations. All statistical computations were conducted using IBM SPSS Statistics (Version 22.0).
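Two of the steps described above, mean imputation of a missing score and computation of Pearson's r, can be illustrated with standard-library Python. The analyses in the study were run in SPSS; the sketch below is only a generic illustration, the function names are hypothetical, and the scores are made up.

```python
import math
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up scores for illustration only; not the study's data.
tom = [1, 2, 2, 3, 4]
ssl = [28, None, 33, 35, 38]   # one missing sign language score
r = pearson_r(tom, impute_mean(ssl))
print(round(r, 2))
```

An imputed value sits exactly at the group mean and therefore adds nothing to the covariance, so mean imputation tends to attenuate a correlation rather than inflate it.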

#### RESULTS

#### Descriptive Statistics

In **Table 3**, performance on the ToM scale is shown and compared to published results relating to similar groups. This reveals that the developmental progression of participants in the present study did not differ from that found in previous studies relating to children with typical development (Wellman and Liu, 2004; Peterson et al., 2005, 2012; Henning et al., 2011; Wu and Su, 2014) and DHH signing children (Peterson et al., 2005, 2012). However, it should be observed that there was no difference in the proportion of children who solved the diverse beliefs and knowledge access tasks in the current data set. Furthermore, comparing ToM index of the present sample to that of groups from earlier studies revealed that the participants in the present study were delayed in ToM (see **Table 3** and **Figure 1**). Overall, comparisons indicated small to large between-group effect sizes (Cohen's d, with 0.2 reflecting a small, 0.5 a medium, and 0.8 a large effect size, Cohen, 1992; see **Figure 1**). In particular, performance was worse than the mean score of deaf native signers in Peterson et al. (2005), t(12) = 4.23, p = 0.001, d = 1.15, and that of hearing children in Peterson et al. (2012), t(12) = 6.13, p < 0.001, d = 2.82, despite similar ages across groups. Thus, although developmental progression did not differ from that demonstrated in earlier studies, there was a clear delay in development of ToM.
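A comparison of a sample against a published group mean can be illustrated with a one-sample t-test and a standardized mean difference. The sketch below is not the procedure used in the study: the function names and scores are hypothetical, and the choice of the sample standard deviation as the standardizer for d is an assumption, so the output is not expected to reproduce the values reported above.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, reference_mean):
    """Return (t, df) for a one-sample t-test against `reference_mean`."""
    n = len(scores)
    standard_error = stdev(scores) / math.sqrt(n)
    return (mean(scores) - reference_mean) / standard_error, n - 1


def cohens_d_one_sample(scores, reference_mean):
    """Mean difference in units of the sample standard deviation."""
    return (mean(scores) - reference_mean) / stdev(scores)


# Made-up ToM index scores and reference mean; not the study's data.
scores = [1, 2, 2, 3, 2, 1, 3, 2]
t, df = one_sample_t(scores, reference_mean=4.0)
d = cohens_d_one_sample(scores, reference_mean=4.0)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

With this definition, t equals d multiplied by the square root of n. Other standardizers (for example, a pooled standard deviation across the two groups) give different values of d.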

We have reported (Holmer et al., 2016b) that the performance of the DHH participants in the present study on reading comprehension (M = 3.8, SD = 1.2) was significantly worse than that of Grade 1 hearing children (M = 14, SD = 8.8), but that there was no difference in working memory (DHH, M = 2.1, SD = 0.7; hearing, M = 1.8, SD = 0.8; a similar pattern from a Swedish context was reported by Rudner et al., 2015).

Mean performance on the SSL comprehension test was 33 (out of a possible 40; SD = 5.0, n = 12). No norms are available for the SSL version of this test. However, norms are available for the equivalent test in British Sign Language (BSL) for children between the ages of 3 and 12 (Herman and Roy, 2006). One participant in the present study was older than 12 years and performed almost 1 SD above the mean according to the BSL norm for 12-year olds. Of the remaining 11 participants, 9 scored within ±2SD of the mean according to the BSL norm for their age group and 2 performed even better.

Descriptive statistics for all tasks are reported in **Table 4**. Participants with parents who primarily used SSL did not differ from other participants on ToM, t(11) = 0.07, p = 0.95, d = 0.04. In fact, no between-group differences were initially detected on study variables (t-test statistics, p > 0.05). However, there was a large effect size (d > 0.8; Cohen, 1992) on SSL comprehension (see **Figure 2**), suggesting that performance was better among participants with parents who primarily used SSL. When age was entered as a covariate, this difference reached significance, F(1, 10) = 5.70, p = 0.038, d = 1.62.


Results from previous studies are provided for comparison purposes and ToM index is based on the total number of solved tasks. HC, hearing children; NS, native signing deaf children; LS, late signing deaf children. <sup>a</sup>Total score on the five-step scale is not reported.

FIGURE 1 | Effect sizes (Cohen's d) for comparisons of index score on the Theory of Mind (ToM) scale of the present sample with that of selected groups of deaf native signing (NS), deaf late signing (LS), and hearing children (HC) from Peterson et al. (2005, 2012). The mean age of each comparison group is displayed next to the group label. ∗∗One-sample t-test, p < 0.01. ∗∗∗One-sample t-test, p < 0.001.

TABLE 4 | Descriptive statistics on study variables for participants with parents who primarily use Swedish Sign Language at home (SSL) and participants with parents who primarily use a spoken language at home (other).


WPRC, Woodcock Passage Reading Comprehension; SSL, Swedish Sign Language; CI, Confidence Interval. <sup>a</sup>n = 8 on SSL comprehension.

#### Correlations

Associations between variables are reported in **Table 5**. In accordance with our predictions, index score on the ToM scale was positively associated with reading comprehension, r(13) = 0.69, p = 0.009 (see **Figure 3**), and working memory, r(13) = 0.61, p = 0.028. However, contrary to our prediction, no statistically significant association was found between ToM and sign language comprehension, r(13) = 0.39, p = 0.18. Furthermore, there was no statistically significant correlation between sign language comprehension and reading comprehension, r(13) = 0.42, p = 0.15, and the association between ToM and reading comprehension was still significant after partialling out the effect of sign language comprehension, rp(10) = 0.63, p = 0.028. Despite the high variability in age of exposure to SSL, there was no association with the ToM index, r(10) = 0.11, p = 0.66.
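The partial correlation reported above can be checked against the three zero-order correlations using the standard first-order formula. The sketch below uses the rounded coefficients given in the text, so agreement is expected only to two decimals; the function and variable names are hypothetical.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Rounded coefficients taken from the paragraph above.
r_tom_reading = 0.69   # ToM and reading comprehension
r_tom_ssl = 0.39       # ToM and sign language comprehension
r_ssl_reading = 0.42   # sign language comprehension and reading comprehension

rp = partial_r(r_tom_reading, r_tom_ssl, r_ssl_reading)
print(round(rp, 2))  # 0.63
```

The small drop from 0.69 to 0.63 reflects the modest correlations of sign language comprehension with both of the other variables.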

#### DISCUSSION

In the present study, we investigated ToM in children attending Swedish state primary schools for DHH who use SSL and are

#### TABLE 5 | Correlations between study variables.


WPRC, Woodcock Passage Reading Comprehension; SSL, Swedish Sign Language. <sup>a</sup>One mean imputation. ∗∗p < 0.01, <sup>∗</sup>p < 0.05.

at an early stage of their reading development. To achieve this, we used a version of the Wellman and Liu (2004) five-step ToM scale adapted for SSL. The main finding was that the order of progression in ToM development did not differ from that reported for typically developing children (Wellman and Liu, 2004; Henning et al., 2011; Wu and Su, 2014) as well as for DHH signing children (Peterson et al., 2005, 2012) in other cultural settings. However, we did not find that DHH signing children whose parents mainly communicated with them in SSL had more advanced ToM than the other participants, even though their SSL comprehension


was better. Furthermore, all participants appeared to be delayed in their ToM, compared to the performance of native signing and typically developing hearing children reported in earlier studies. We did find an association between ToM and both reading comprehension and working memory, in line with our predictions, although not between ToM and SSL comprehension, and the association between ToM and reading comprehension was still significant after controlling for general language skills.

#### Progression in Theory of Mind

While early studies regarded ToM as an all-or-nothing capacity (e.g., Baron-Cohen et al., 1985), more recent work has shown that there is a sequence in the development of different aspects of this skill. In particular, Wellman and Liu (2004) showed that there is a typical developmental progression ranging from the understanding of diverse desires and beliefs through knowledge access to understanding of false belief and hidden emotion. Previous work has suggested that DHH children who are not native signers are at risk of delayed ToM development (Peterson, 2009; Lederberg et al., 2013; Sundqvist et al., 2014b).

The results of the present study are in line with previous studies showing typical progression of ToM development in DHH signing children. Importantly, while earlier work has been able to generalize findings of ToM progression in typically developing children from the English-speaking world to cultures with other languages (Henning et al., 2011; Wu and Su, 2014), the present work partially supports generalization of findings of typical ToM progression in DHH signing children from English-speaking cultures to a Swedish setting, and thus lends support to the notion of a progression in ToM during childhood (Wellman, 2014). DHH signing children seem to advance in ToM across a set of developmentally differentiated but psychologically linked achievements in much the same way as typically developing children. However, in contrast to previous studies there was no difference in the percentage of participants who solved the diverse beliefs and knowledge access tasks. Although our data cannot definitively determine the order of these developmental stages, they do not suggest that the order is different in DHH signing children in a Swedish setting from that found in previous studies. It is likely that this phenomenon is related to the process of adapting the scale to a new culture (Wellman, 2014) or random errors. To learn more about the usefulness and psychometric properties of the ToM scale in a Swedish context, future studies should use the scale to investigate ToM development in larger samples of typically developing Swedish children as well as children with diagnoses previously associated with ToM difficulties (e.g., ASD; Lord and Bishop, 2015).

#### Delay in Theory of Mind

Although the developmental progression of ToM was not altered in the present study, it was delayed in relation to the ToM performance of DHH native signing and typically developing hearing children of similar age reported in earlier studies (Peterson et al., 2005, 2012). This applied both to participants whose parents primarily used speech and to participants whose parents mainly used SSL, despite the stronger sign language skills of the latter group. It is well established that linguistic environment and establishment of functional language skills influence ToM development in DHH children (Peterson, 2009; Lederberg et al., 2013; Sundqvist and Heimann, 2014). However, another important aspect is the nature of the social interactions in the environment in which development occurs (Reddy, 2008; Wellman, 2014). It has been shown that the degree to which parents adapt their behaviors to the mental world of their infants during social interaction predicts ToM development (Meins et al., 2002; Kirk et al., 2015). Thus, belonging to a sign language rich setting and developing age-appropriate sign language skills may be necessary but not sufficient for typical ToM development in DHH signing children. Investigating parent–child interaction was beyond the scope of the present work, but should be considered in future studies.

ToM performance in the present sample was weaker than that of DHH native signing children in the study by Peterson et al. (2005) but statistical testing did not reveal that it was weaker than that of late signing DHH children reported in previous studies (Peterson et al., 2005, 2012), although effect sizes indicated small to medium mean differences. SSL comprehension for the sample was age appropriate, and thus there was no general language delay that could explain the observed delay in ToM. Furthermore, age of first exposure to SSL was not related to ToM performance. In fact, it is hard to identify any factor taken into account in the present study that can explain the obviously delayed ToM in this group. However, at the same time, we cannot rule out that any of these factors has explanatory value, considering the small and heterogeneous sample as well as the concomitant disproportionately large effect of any confounding variables. In particular, it should be noted that in the group of participants whose parents mainly used SSL, only two were native signers, defined as having at least one deaf signing parent and having been exposed to sign language since birth. Hence, as a group, the present sample may be very similar to late signing groups included in earlier studies, and the lack of association between ToM development and age of SSL exposure on the one hand and general SSL skills on the other should be interpreted with caution.

# Correlations between Theory of Mind, Reading Comprehension, Sign Language Skills and Working Memory

In line with previous work in typically developing children (Kim, 2015) and children with ASD (Ricketts et al., 2013), we observed a positive association between ToM and reading comprehension in the DHH signing participants in the present study. To our knowledge, this is the first time this relationship has been studied in DHH children. In earlier studies, a relationship between ToM and reading comprehension has been discussed in relation to general language skills (Astington and Pelletier, 2005; Ricketts et al., 2013; Kim, 2015), working memory and executive skills as well as inference making (Ricketts et al., 2013; Kim, 2015), and it has been suggested that ToM is a prerequisite for learning socially mediated skills like reading (Frith and Happé, 1994; Ricketts et al., 2013).

#### Lack of an Association between Sign Language Skills and Theory of Mind and Reading Comprehension

Sign language comprehension was not significantly associated either with ToM or with reading comprehension. However, the literature indicates that general language skills and ToM are related in typically developing children (Milligan et al., 2007), DHH signing children (Lederberg et al., 2013), DHH children with technical aids who use speech (Sundqvist and Heimann, 2014), and individuals with dual sensory loss (Frölander et al., 2014). Furthermore, general language skills are related to reading comprehension in typically developing (Ripoll Salceda et al., 2014) as well as DHH (Mayberry et al., 2011) children.

It is possible that the lack of statistically significant associations between sign language skills and ToM as well as reading comprehension is due to low power or heterogeneity of the sample in the present study. However, the lack of association between ToM and sign language comprehension may also be due to the fact that although the sign language test used here provides an estimate of general sign language skills (e.g., Woolfe et al., 2002; Jones et al., 2015), it does not tap into linguistic aspects of central importance to ToM. For example, it has been suggested that the ability to represent mental states linguistically and to embed propositions under mental verbs, e.g., "He/she thought that . . .", is a prerequisite for reasoning about the minds of others (Milligan et al., 2007; de Villiers and de Villiers, 2014), and neither of these aspects was assessed in the present study. Astington and Pelletier (2005) suggested that general language skills explain shared variance between ToM and reading comprehension. However, controlling for general language skills in the present study did not seem to affect the correlation between ToM and reading comprehension. This is in line with findings from a structural equation model (SEM) by Kim (2015), where ToM, vocabulary and grammatical knowledge all explained unique variance in reading comprehension. Ricketts et al. (2013) also reported that ToM predicted unique variance in reading comprehension after controlling for general language skills in children with ASD. Thus, the findings of Kim (2015) and Ricketts et al. (2013) and the correlations between sign language, ToM and reading comprehension in the present study, suggest that a positive relationship between ToM and reading comprehension cannot be completely explained by general language skills.

#### Associations between Working Memory, Theory of Mind and Reading Comprehension

As predicted, working memory capacity was related to both ToM and reading comprehension. In typically developing individuals, working memory is related to comprehension of both texts (Daneman and Merikle, 1996) and minds (Moses and Tahiroglu, 2010; Carlson et al., 2013). Positive relationships between working memory and ToM (Meristo and Hjelmquist, 2009) and between working memory and reading comprehension (Garrison et al., 1997; Daza et al., 2014) have also been reported in DHH individuals. Kim (2015) reported that working memory had a direct relationship to ToM; however, the relationship between working memory and reading comprehension was mediated by vocabulary and ToM.

In the five-step ToM scale (Wellman and Liu, 2004), the working memory demands increase across tasks. In the two most fundamental tasks, diverse desires and diverse beliefs, the participant has to differentiate between their own preference and another person's preference. Because pictures are provided to support this decision, mental representation is supplemented. However, the more advanced tasks (i.e., Knowledge access, Content false Belief, and Hidden emotion) all rely more on mental representation. To test the constraining influence of working memory capacity on progression in ToM, we suggest adding further tasks to the scale to determine whether individuals who fail to solve the more advanced tasks are able to solve the diverse desires and diverse belief tasks without the support of pictures. If they cannot, this would suggest that working memory capacity constrains performance on the ToM scale.

#### On the Relation between Theory of Mind and Reading Comprehension

Language skills and working memory capacity seem to be important for comprehension of both texts and minds. However, we suggest that neither of these variables on their own, or in combination, can fully explain the set of results of the present study. Kim (2015) and Ricketts et al. (2013) noted that both ToM and reading comprehension involve inference making, and suggested that this ability may link ToM to reading comprehension. Furthermore, Kyle and Cain (2015) showed that both deaf and hearing children who were poor reading comprehenders had poorer inference making skills than hearing controls with good reading comprehension. Since DHH signing children learn to read in a second language, their lack of relevant language-specific background knowledge may make it especially difficult to make appropriate inferences during reading (Hoffmeister and Caldwell-Harris, 2014). Poor ToM has also been suggested to negatively influence skills that rely on socially mediated learning (Frith and Happé, 1994; Scheuffgen et al., 2000; Carlson et al., 2013), such as reading (Ricketts et al., 2013), and it is possible that this is reflected in the relationship between ToM and reading comprehension (cf. Lecce et al., 2011, 2014). However, we tentatively suggest that in addition to working memory and language skills, inference making may play a crucial role in both ToM and reading comprehension and is a plausible mechanism behind the positive correlation between these skills in the present study. Future studies should consider the role of inference making ability, as well as other possible key mechanisms, when further exploring the association between ToM and reading comprehension.

# CONCLUSION

Children who attend Swedish state primary schools for DHH children and are at an early stage of their reading development displayed a progression in ToM that did not differ from that reported in previous studies. However, they had delayed ToM and poor reading comprehension. These skills were positively associated with each other and related to working memory capacity. Our tentative interpretation of this set of results is that some factor not investigated in the present study, possibly represented by inference making constrained by working memory capacity, is involved in constructing a representational model both of minds and of texts.

# AUTHOR CONTRIBUTIONS

EH, MH, and MR designed the study. EH co-ordinated data collection and performed the statistical analyses. EH prepared the first draft of the article and all authors contributed to the final version.

# FUNDING

This work was supported by grant number 2008-0846 to MR from the Swedish Research Council for Health, Working Life and Welfare.

# ACKNOWLEDGMENTS

The authors thank the children and their parents for their participation in this project; Jenny Carlsson, Gunilla Turesson-Morais, Hanna Åkerblom, Elisabeth Thilén, and Lisbeth Wikström for help with data collection; Lena Davidsson and Magnus Ryttervik for translating administration instructions into Swedish Sign Language; Annette Sundqvist and Katarina Forssén for technical assistance; and, participating schools for giving us access to their facilities.

# REFERENCES



the Development of Executive Functions, eds B. Sokol, U. Müller, J. I. M. Carpendale, A. R. Young, and G. Iarocci (New York, NY: Oxford University Press), 218–233.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Holmer, Heimann and Rudner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Theory of Mind Deficits and Social Emotional Functioning in Preschoolers with Specific Language Impairment

Constance Vissers<sup>1,2</sup> \* and Sophieke Koolen<sup>1,3</sup>

*<sup>1</sup> Royal Dutch Kentalis, Kentalis Academy, St Michielsgestel, Netherlands, <sup>2</sup> Behavioural Science Institute, Radboud University Nijmegen, Nijmegen, Netherlands, <sup>3</sup> Pro Persona for Mental Health, Arnhem, Netherlands*

Children with Specific Language Impairment (SLI) often experience emotional and social difficulties. In general, problems in social emotional functioning can be cognitively explained in terms of Theory of Mind (ToM). In this mini-review, an overview is provided of studies on social-emotional functioning and ToM in preschoolers (average age from 2.3 to 6.2 years) with SLI. It is concluded that, similar to school-aged children with SLI, preschoolers with SLI have several social-emotional problems and that both cognitive and affective aspects of ToM are impaired in those children. On this basis, three possible causal models for the interrelation between language, ToM and social emotional functioning are put forward. It is proposed that future research on the construct and measurement of early ToM, social emotional functioning and language development in preschoolers with SLI is needed to achieve early detection, tailored treatment, and ultimately insight into the pathogenesis of SLI.

#### Edited by:

*Daniela Bulgarelli, Aosta Valley University, Italy*

#### Reviewed by:

*Veronica Ornaghi, University of Milano-Bicocca, Italy Carol A. Miller, Penn State University, USA*

> \*Correspondence: *Constance Vissers c.vissers@kentalis.nl*

#### Specialty section:

*This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology*

Received: *29 August 2016* Accepted: *20 October 2016* Published: *04 November 2016*

#### Citation:

*Vissers C and Koolen S (2016) Theory of Mind Deficits and Social Emotional Functioning in Preschoolers with Specific Language Impairment. Front. Psychol. 7:1734. doi: 10.3389/fpsyg.2016.01734*

Keywords: specific language impairment (SLI), social emotional functioning, theory of mind (ToM), neuropsychological functioning, language

# INTRODUCTION

Children with SLI have often been reported to experience behavioral, emotional and social difficulties (Mawhood et al., 2000; Yew and O'Kearney, 2013; Helland et al., 2014). They have low social self-esteem, poorer social skills and peer relationships and rate themselves as having a higher risk of being bullied (Fujiki et al., 1996; Knox and Conti-Ramsden, 2003; Marton et al., 2005). While behavioral problems appear to decrease during adolescence, emotional problems persist and social problems have even been reported to increase (St Clair et al., 2011). Adolescents and adults with SLI have social emotional problems like low self-esteem and symptoms of anxiety and depression (Howlin et al., 2000; Wadman et al., 2008; Whitehouse et al., 2009; Durkin and Conti-Ramsden, 2010; Conti-Ramsden et al., 2013; Lewis et al., 2016).

Problems in social-emotional functioning can be explained in terms of Theory of Mind (ToM). The concept of ToM was introduced in the 1970s in primate research by Premack and Woodruff (1978), who defined ToM as the ability to represent mental states of oneself and others in order to understand behaviors. Nowadays, distinctive dimensions of human ToM, each with different neuroanatomical underpinnings, can be discerned (Westby and Robinson, 2014). ToM can be explained along cognitive, affective, interpersonal, and intrapersonal dimensions. Cognitive ToM refers to thinking about thoughts, knowledge, beliefs and intentions and affective ToM involves thinking about and experiencing emotions (e.g., Dvash and Shamay-Tsoory, 2014), which can refer either to oneself (intrapersonal) or to others (interpersonal) (e.g., Tine and Lucariello, 2012).

Given the above, it is not surprising that ToM abilities are associated with social emotional maturity and social skills (e.g., Lalonde and Chandler, 1995; Dunn and Cutting, 1999; Carpendale and Lewis, 2004; Caputi et al., 2012). Children with SLI have been reported to have both social-emotional problems and ToM deficits, which bolsters this association (e.g., Andrés-Roqueta et al., 2016). ToM development in SLI is taken to follow a trajectory similar to that in typically developing (TD) children, but at a different pace and with a lower final level of ToM performance (Nilsson and de López, 2016; Spanoudis, 2016). Hence, ToM deficits in SLI continue into adulthood (Clegg et al., 2005; Botting and Conti-Ramsden, 2008).

Studies in typically developing (TD) preschoolers show that important progress in ToM is made during this period. During the second year of life, joint attention, imitation and pretend play develop, which can be taken as evidence for the understanding of others as intentional agents, the ability to form and coordinate representations of self and others, and the capacity to form meta representations (Leslie, 1987; Rogers and Pennington, 1991; Tomasello, 1995). At this stage, emotional recognition and mental state vocabulary also start to develop (Astington and Baird, 2005). With a sense of self, children begin to realize that they are separate from others, can have different emotions from others and they start to show empathy by intentionally comforting/helping another person (Thompson and Newton, 2013). Between 4 and 5 years of age, first order ToM, the ability to think about what someone else is thinking or feeling, develops (Wellman et al., 2011).

Up to now, most studies on ToM in SLI have focused on school-aged children. Given the early onset of ToM development, it is surprising that little research has focused on ToM in preschoolers with SLI. Since early childhood is the primary period for both language and ToM to develop, early language development and early ToM development plausibly interact in a facilitative or inhibitory manner. In order to achieve early detection, tailored treatment, and ultimately insight into the pathogenesis of SLI, research on the construct and measurement of early ToM, social emotional functioning, and language development and their existing deficits is necessary. The aim of this review is to provide an overview of state-of-the-art evidence on social functioning and ToM in preschoolers with SLI (average age range: 2.3–6.2 years), to elaborate on theoretical and clinical implications of these empirical data and to give suggestions for future research.

# Social Emotional Functioning in Preschoolers with SLI

Social skills of preschoolers with SLI are shown to be less well developed or at least delayed. For instance, preschoolers with SLI were rated lower by parents and teachers on social competence (e.g., assertiveness, peer social skills) than TD children (McCabe, 2005). Moreover, they were found to be less likely to verbally address other children and to engage more in adjacent rather than sociointeractive play (McCabe and Marshall, 2006). Further, preschoolers with SLI were rated significantly lower by their parents on skills such as cooperation, assertion and responsibility (Stanton-Chapman et al., 2007), although in a later study, language-impaired preschoolers were found to score within the average range (Pentimonte et al., 2016). Andrés-Roqueta et al. (2016) showed young children with SLI to receive a significantly higher number of negative peer-nominations compared to typical children. Likewise, withdrawal was reported as the most frequent problem behavior in language-impaired preschoolers (Maggio et al., 2014).

# ToM in Preschoolers with SLI

Deficits in social emotional functioning can be explained in terms of ToM (e.g., Lalonde and Chandler, 1995; Ford and Milosky, 2003; Creusere et al., 2004; Andrés-Roqueta et al., 2016). Below, empirical evidence on ToM deficits in preschoolers with SLI is presented (see **Table 1** for an overview of essential aspects of ToM and observed ToM deficits during preschool).

## Imitation

Several studies have focused on imitation abilities in preschoolers with SLI. Within a sentence imitation paradigm, Snow (2001) found that although 4-year-olds with language impairment imitate rising intonation contours in the same way as TD children, they are impaired in terms of their segmental phonology. Others have shown that children with SLI have more difficulties in imitating sentences with different linguistic and affective intonation contours and with different emphatic stress (Van Der Meulen et al., 1997). Hence, language-disordered children seemed to be less able to imitate prosodic features, although both children with SLI and typical children were found to show an increase in performance on prosodic imitation and emotion identification with age.

In addition, research has been done on the effectiveness of imitation/modeling procedures for children with SLI. The underlying assumption is that imitation-based interventions should generate language production under control of the clinician aiming to facilitate spontaneous language use (e.g., Camarata et al., 1994). Kouri (2005) studied the effectiveness of modeling (input that requires imitation without any other response requirements) vs. elicitation (input that includes prompts for production) training procedures for late-talking preschoolers with SLI and developmental delay on the production of comprehended lexical items. Overall, it was concluded that both training methods are effective training procedures for preschoolers with language impairment. The exact mechanism through which those procedures facilitate linguistic functioning, however, remains to be specified. Verbal imitation is proposed as the key component, as verbal practice is expected to stimulate language functioning in children who have impaired verbal production. Another explanation would be that the use of minds is what stimulates linguistic functioning in this group of children.



## Joint Attention

As far as we know, only a few studies have directly investigated joint attention in preschoolers with SLI. Farrant et al. (2011) studied the associations between child and maternal socio-emotional engagement, joint attention, imitation and conversation skill in preschoolers with SLI. Deficits were found on all of those skills in these children, compared with TD children. It was proposed that small impairments in parent-child socio-emotional engagement may lead to larger deficits in joint attention, child imitation and conversation skills. In another study (Loveland and Landry, 1986), the focus was on attention-directing language and gesture in children with developmental language delay and children with autism. Language delayed children were reported to be better responders to joint attention interactions than autistic children. The two groups did not differ from each other in the number of joint attention behaviors, nor in the types of joint attention behaviors used. Gestural behavior of language delayed children was more communicative than that of autistic children. Because no typical control group was included, no conclusions could be drawn about the level of performance [(mal)functioning] relative to children without autism or developmental language delay.

## Emotion Recognition and Understanding

A few studies have examined emotion recognition and understanding in preschoolers with SLI. Courtright and Courtright (1983) observed young children with language impairment to perform less well on interpreting vocal cues to affect than typical controls. Similarly, Creusere et al. (2004) examined affect comprehension in young children with SLI using an affect discrimination task and found lower scores for the language impaired group for measures of recognition of facial expressions and nonfiltered speech. The authors argued that children with SLI may miss cues to the emotional state of their conversational partner, which in turn may hamper their understanding of the speaker's communicative intentions. Similarly, other researchers (Ford and Milosky, 2003) found young children with SLI to be able to identify facial expressions, yet, to have problems inferring the appropriate emotion and choosing the corresponding facial expression when presented with an event context. McCabe and Meller (2004) showed no differences between preschoolers with and without SLI on an emotional expression identification test. Language impaired children did, however, perform more poorly on a stereotyped emotional knowledge task, based on which the authors proposed that under certain circumstances children with SLI are impaired in identifying emotions.

## False Belief Understanding

Various studies have examined false belief (FB) understanding in SLI, some of which have focused on preschool children. Jester and Johnson (2016) showed young children with SLI to perform more poorly on a FB task than their TD peers. Farrant et al. (2006) found impairments in both visual perspective taking and FB understanding in young children with SLI, based on which they propose language to have a facilitating role in ToM. Interestingly, Farrar et al. (2009) found that while syntactic complementation was not correlated with FB performance in preschoolers with SLI, general grammatical development and vocabulary were significant predictors of ToM ability. In line with this, Andrés-Roqueta et al. (2013) found that, compared to age-matched control children, children with SLI showed more problems on several FB tasks; moreover, FB performance in SLI was best predicted by their overall linguistic abilities, and their grammatical abilities in particular. In another study from this group (Andrés-Roqueta et al., 2016), similar results were found; preschoolers with SLI were shown to have a significant delay both in language and performance on FB and strange stories tasks. Several studies (Miller, 2001, 2004; Guiberson and Rodriguez, 2013) found young children with SLI to be able to perform FB tasks with low linguistic complexity, but to show impairments on linguistically more complex FB tasks.

# DISCUSSION

Given the empirical findings presented above, we conclude that preschoolers with SLI have moderate to severe social-emotional problems. ToM deficits can be taken to play an underlying role in these social emotional problems. That is, in preschoolers with SLI impairments in cognitive ToM (imitation, joint attention, false belief understanding) as well as affective ToM (recognizing and understanding emotions) have been found.

The association between social emotional functioning, ToM and language abilities in SLI is not surprising. From early childhood until adolescence, language development and ToM development are entangled (Tager-Flusberg, 2000). Further, the ability to form a ToM is indispensable for mastering language and efficient communication and interaction (e.g., Baldwin and Moses, 2001). Mental representations of one's own and others' inner world are necessary to come to adequate communicative skills. At the same time, language is essential in understanding mental representations and controlling/regulating emotions and thus in mastering ToM (e.g., Dunn and Brophy, 2005; Grazzani and Ornaghi, 2012; Kolk, 2012; Grazzani et al., 2016). Hence, both ToM and language abilities promote social communication, the understanding and regulation of one's own and others' inner worlds and social emotional maturation. Thus, it is not unexpected that children's level of social emotional functioning can be explained in terms of ToM (e.g., Lalonde and Chandler, 1995) and language abilities (e.g., Jenkins and Astington, 1996). Once they emerge, social emotional problems can, in turn, further affect the development of language and ToM. Importantly, deficits in ToM and language cannot account for the full range of social emotional difficulties in SLI. Plausibly, other cognitive functions (such as level of executive functioning) but also environmental factors (such as parental social emotional engagement) influence the development of language and ToM and social emotional maturation (e.g., Leslie, 1987; Bishop, 1997; Cutting and Dunn, 1999; Farrant et al., 2011; Stanzione and Schick, 2014; Vissers et al., 2015).

Social emotional problems in preschoolers with SLI can thus (at least partly) be understood in terms of an interplay between ToM and language, and social emotional problems may in turn further hamper both language and ToM development. The findings presented do not reveal whether ToM impairments cause language impairments or vice versa. Three possible causal models for the relation between language and ToM can be put forward.

According to the first model, ToM facilitates language development. Following this approach, social understanding informs word learning, even in the infancy period (Baldwin and Moses, 2001). ToM is proposed to allow children to learn new words through their sensitivity to referential intentions of others. Accordingly, word learning problems can be explained by ToM deficits (Bloom, 2001; Birch and Bloom, 2002). Bolstering the essential role of ToM in language development, Morales et al. (2000) found that the capacity to respond to joint attention of infants across the first and second year is related to subsequent vocabulary acquisition (see also Mundy and Gomes, 1998).

The second model argues that language fuels ToM development. Stronger relations were found between early language ability and later ToM performance than the reverse, which suggests a causal role for language in ToM development (see Milligan et al., 2007, for a meta-analysis combining results of 104 studies). The importance of language in developing ToM is further emphasized by the finding that deaf children of hearing parents, who typically demonstrate language delays, have ToM deficits, whereas deaf children from deaf families perform identically to same-aged hearing controls on ToM tasks (e.g., Schick et al., 2007). This is explained by assuming that deaf children with deaf parents share a common sign language and are thus exposed to a rich language context. In line with a central role for language in ToM development, Rosenqvist et al. (2014) found language to be the most important predictor (compared to several neurocognitive capacities) of children's emotion recognition ability. There is no consensus on which aspects of language influence ToM development. The semantic approach argues that the development of mental state verbs (e.g., think and feel) enhances the understanding of own and others' mental representations (Bartsch and Wellman, 1995; Peterson and Siegal, 2000). Others highlight syntactic processing to play an essential role in ToM acquisition (de Villiers, 2007), from the mastering of basic syntax, such as word order (Astington and Jenkins, 1999), to the use of linguistic structures which are embedded or the mastery of syntactic complementation (e.g., De Villiers and Pyers, 2002; Schick et al., 2007). Interestingly, Slade and Ruffman (2005) state that both syntax and semantics contribute to FB understanding. Further, there is substantial evidence for the conversational approach, proposing that ToM development is influenced by conversational interactions about events and aspects of the external world as well as about inner concepts and states. 
For instance, it has been suggested that parent-child conversations about situations that involve the mind enhance children's understanding of psychological terms and thereby the development of ToM (Turnbull et al., 2009). Talking about the mind is said to promote the differentiation of one's own viewpoint from others' and to stimulate reflection on social and emotional experiences (e.g., Appleton and Reddy, 1996; Symons, 2004; De Rosnay and Hughes, 2006). Bianco et al. (2016) suggest that conversations about the mind promote ToM by enhancing the accuracy of mental-state attributions. Others found that the use and comprehension of meta-cognitive language correlates with FB performance and emotion comprehension (Grazzani and Ornaghi, 2012). In support of this, training 2-year-old children in using mental-state talk appears to enhance ToM (Grazzani et al., 2016). Moreover, engagement in conversations on emotions appears to stimulate ToM (Ornaghi et al., 2014). Emotion understanding can also be enhanced by participation in explanatory conversations (i.e., about emotional reactions) (Tenenbaum et al., 2008). Hence, according to the second model it is language (semantics and syntax but also conversational interactions) that promotes ToM.

According to a third model, language deficits and ToM deficits co-occur because they are driven by a single factor. Both language abilities and ToM abilities could be manifestations of a single underlying neuropsychological structure, for instance working memory (WM), an aspect of executive functioning. Accordingly, various studies have revealed correlations between WM ability and FB performance (e.g., Jenkins and Astington, 1996; Gordon and Olson, 1998), and also between WM and language development (e.g., Adams and Gathercole, 1996; Baddeley et al., 1998; Vissers et al., 2015).

Future research is needed to investigate the nature of the interplay between language, ToM and social-emotional functioning in SLI. Longitudinal designs are helpful to monitor progress in this interplay across the lifespan. As ToM starts to develop already within the first months from birth, at which point linguistic (dis)abilities are still far from clear, longitudinal cohort studies would be of value starting at birth with children at-risk. Further, up to now, most research has focused mainly on aspects of (interpersonal) cognitive ToM. In order to gain more insight into ToM development in SLI, it is necessary to examine interpersonal/intrapersonal cognitive and affective ToM abilities (Westby and Robinson, 2014).

Neuropsychological insight into social-emotional functioning has important clinical implications. The effects of training studies exposing (young) children to ToM vocabulary, for instance, are promising (e.g., Hale and Tager-Flusberg, 2003; Lohmann and Tomasello, 2003; Bianco et al., 2016). The fact that language and ToM development start in infancy and continue into adulthood implies that, to prevent and treat social emotional dysregulations, language and ToM interventions should extend into adulthood (see also Stanzione and Schick, 2014).

#### AUTHOR CONTRIBUTIONS

Both authors contributed to developing the hypotheses and searched for/studied literature. SK focussed on the empirical part of the mini review. CV integrated all empirical findings, wrote the Introduction and Discussion (conclusions and theoretical/clinical implications) and finalized the review.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Vissers and Koolen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Putting Ostracism into Perspective: Young Children Tell More Mentalistic Stories after Exclusion, But Not When Anxious

Lars O. White<sup>1</sup> \*, Annette M. Klein<sup>1</sup> , Kai von Klitzing<sup>1</sup> , Alice Graneist<sup>1,2</sup> , Yvonne Otto<sup>1</sup> , Jonathan Hill<sup>3</sup> , Harriet Over<sup>4</sup> , Peter Fonagy<sup>5</sup> and Michael J. Crowley<sup>6</sup>

<sup>1</sup> Department of Child and Adolescent Psychiatry, Psychotherapy and Psychosomatics, University of Leipzig, Leipzig, Germany, <sup>2</sup> Institute of Psychology, Goethe University Frankfurt, Frankfurt, Germany, <sup>3</sup> School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK, <sup>4</sup> Department of Psychology, University of York, York, UK, <sup>5</sup> Research Department of Clinical, Educational and Health Psychology, University College London, London, UK, <sup>6</sup> Yale Child Study Center, Yale University, New Haven, CT, USA

Much is known about when children acquire an understanding of mental states, but few, if any, experiments identify social contexts in which children tend to use this capacity and dispositions that influence its usage. Social exclusion is a common situation that compels us to reconnect with new parties, which may crucially involve attending to those parties' mental states. Across two studies, this line of inquiry was extended to typically developing preschoolers (Study 1) and young children with and without anxiety disorder (AD) (Study 2). Children played the virtual game of toss "Cyberball" ostensibly over the Internet with two peers who first played fair (inclusion), but eventually threw very few balls to the child (exclusion). Before and after Cyberball, children in both studies completed stories about peer-scenarios. For Study 1, 36 typically developing 5-year-olds were randomly assigned to regular exclusion (for no apparent reason) or accidental exclusion (due to an alleged computer malfunction). Compared to accidental exclusion, regular exclusion led children to portray story-characters more strongly as intentional agents (intentionality), with use of more mental state language (MSL), and more between-character affiliation in post-Cyberball stories. For Study 2, 20 clinically referred 4- to 8-year-olds with AD and 15 age- and gender-matched non-anxious controls completed stories before and after regular exclusion. While we replicated the post regular-exclusion increase of intentional and MSL portrayals of story-characters among non-anxious controls, anxious children exhibited a decline on both dimensions after regular exclusion. We conclude that exclusion typically induces young children to mentalize, enabling more effective reconnection with others. However, excessive anxiety may impair controlled mentalizing, which may, in turn, hamper effective reconnection with others after exclusion.

Keywords: social exclusion, early childhood, theory of mind, mentalizing, prosocial behavior

#### Edited by:

Anne Henning, SRH Fachhochschule für Gesundheit Gera GmbH, Germany

#### Reviewed by:

Virginia Slaughter, University of Queensland, Australia Ruth Ford, Anglia Ruskin University, UK

> \*Correspondence: Lars O. White white@medizin.uni-leipzig.de

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 19 September 2016 Accepted: 24 November 2016 Published: 22 December 2016

#### Citation:

White LO, Klein AM, von Klitzing K, Graneist A, Otto Y, Hill J, Over H, Fonagy P and Crowley MJ (2016) Putting Ostracism into Perspective: Young Children Tell More Mentalistic Stories after Exclusion, But Not When Anxious. Front. Psychol. 7:1926. doi: 10.3389/fpsyg.2016.01926

# INTRODUCTION

The preschool years have long been noted for fundamental advances in mentalizing – the social-cognitive capacity to construe oneself and others in terms of intentional mental states (Dennett, 1978; Fonagy et al., 2002). The timetable of the development of mentalizing has received much attention over the past decades (see Wellman, 2014). Yet, as mentalizing enters the child's repertoire, the question arises as to when and which children make use of this new mental tool by mentalizing in varying social contexts. Despite the importance of such work for theories of mentalizing – particularly the interaction of mentalizing with motivational states and stress regulation (Ickes and Simpson, 2001; Tomasello et al., 2005; Fonagy and Luyten, 2009) – few if any experimental studies directly address the roles of context and disposition in mentalizing. Indeed, if mentalizing varies systematically as a function of context or arousal, it could be crucial to assess context-specific mentalizing capacities of clinical populations whose symptoms primarily appear under certain conditions, such as anxiety disorder (AD).

Mentalizing may be relevant to a broad set of social interactions, from dyadic emotion regulation and caregiving to cooperative and competitive interactions, more broadly (Dennett, 1987; Moore and Frye, 1991; Fonagy et al., 2002). Accordingly, individuals may be thought to mentalize in a wide variety of contexts with many authors proposing that mentalizing permeates our everyday social cognition (e.g., Wellman, 2014). Importantly, for the present purposes, the degree and cognitive control of mentalizing may still show cross-situational variation as the need and expectation to cooperate and compete with others fluctuates.

With this in mind, one important context for inducing shifts in social cognition may be exclusion from groups. As a fundamental process for humans, social exclusion blocks access to various group resources that, across phylogeny, were essential to survival, from group protection, to collaboration for provisions, to exchange of social information (Leary and Cottrell, 2013). Potentially for this reason, threats of exclusion still act as powerful triggers for conformity. Serving as a deterrent for exploiting others, threats of exclusion therefore also stabilize and promote cooperation (Ouwerkerk et al., 2005; Williams, 2009; Feinberg et al., 2014). Critically, to act on the first hints of and avoid further exclusion, excluded parties may potentially increase vigilance regarding social cues to promote more skillful re-affiliation (Pickett and Gardner, 2005; see below). Yet, few studies address such exclusion-responses early in development, especially with young children.

To date, the bulk of work on peer exclusion in early childhood has focused on risk factors for chronic peer rejection and its adverse developmental sequelae (e.g., Crick et al., 1999; von Klitzing et al., 2014). Consequently, we know relatively little about typical and atypical responses to experimental social exclusion at this age. A handful of studies examining exclusion among preschoolers uses indirect primes where the child observes the exclusion of a third party. Even this simple manipulation leads some preschoolers to behave in a way that suggests a reconnection motive has been engaged, including more accurate imitation of others (Over and Carpenter, 2009b; Watson-Jones et al., 2014) and drawing pictures of themselves and friends standing closer to one another (Song et al., 2015). Consistent with these findings, a recent study exposed preschoolers to firsthand exclusion while playing the virtual ball-toss game, Cyberball, also finding increased fidelity of imitation post-exclusion (Watson-Jones et al., 2016). Overall, these findings in young children resemble research on adults, showing increased affiliative tendencies (e.g., conformity, generosity, mimicry) following exclusion compared to control conditions (see Molden and Maner, 2013).

Given the behavioral affiliation-inducing effect of social exclusion, we sought to examine whether young children would also attend to mental states more closely after exclusion. Indeed, some theorists propose that exclusion gives rise to a state of "social hunger" (Gardner et al., 2000, p. 486) that stimulates social monitoring processes, akin to increased attention to food stimuli after fasting. Among adults, social exclusion thus promotes attentional biases to relevant social information (Pickett and Gardner, 2005), including others' perspectives (Knowles, 2014). Coping with social exclusion by attending to others' perspectives and mental states may enable more adept detection and selection of new partners likely to reciprocate while weeding out less promising partners. Many affiliative actions (e.g., helping) could also improve (in quality and quantity) if excluded parties attend to mental states of potential targets for re-affiliation so as to tailor affiliative actions to the needs, goals, and knowledge of those targets (Tomasello et al., 2005). Despite its clear potential for informing developmental theories on mentalizing, little or no research currently extends these findings to social exclusion in young children. We therefore sought to address this gap in the literature with Study 1.

In a second Study, we moved beyond examining mentalizing in typically developing youth, to consider young children with elevated anxiety concerns. Deficits in social cognition and mentalizing have been linked to numerous childhood psychopathologies (Sharp et al., 2008). However, in the case of AD, one of the most prevalent conditions in childhood (Costello et al., 2011), the deficit in mentalizing has proven somewhat difficult to pin down (see Banerjee, 2008). While socially anxious young children have shown normal responses on standard false-belief tasks in most studies (Banerjee and Henderson, 2001; Broeren et al., 2013; but see Colonnesi et al., 2016), they have exhibited impairments in social behaviors requiring insight into mental states, in self-presentational tactics toward peers as well as in understanding the causes and emotional effects of unintentional insults (Banerjee and Watling, 2010).

Arguably, this pattern of data could be at least partly accounted for by context-specific deficits in mentalizing under affectively charged conditions, such as social exclusion. Thus, it has been proposed that controlled mentalizing varies as a function of the arousal induced by a specific context, following a trajectory of an inverted U-curve, i.e., first rising and then falling with increasing arousal (Fonagy and Luyten, 2009). Given the excessive negative arousal inherent in acute anxiety, deficits in stress-related mentalizing may typify anxious children (Nolte et al., 2011), much like what has been shown by pilot data in adults with panic disorder (Rudden et al., 2006). Moreover, in acute anxiety, one's own and others' thoughts often take on an imminent and threatening quality, which may derive from insufficient distinctions between one's mental representation and reality, one of the hallmarks of a prementalizing mode (e.g., fear of imagined catastrophic separation outcomes, fear of negative evaluation by others; Fonagy et al., 2002). Thus, in Study 2 we examine young anxious children's usage of mentalizing in an acute stress-context, following social exclusion.

In the current pair of studies, we used the virtual ball-toss game "Cyberball" (Williams et al., 2000) to manipulate social exclusion. Children were ostensibly connected to the Internet to toss a ball back and forth with two peers. The peers eventually stopped passing the ball to the subject (exclusion). In earlier work, we demonstrated that, compared to included children, 5-year-olds excluded in Cyberball report higher threat to relational needs, attribute more bad intentions to co-players on post-Cyberball puppet interviews, and tattle more to experimenters about co-players (White et al., unpublished).

Here, to capture young children's mentalizing and affiliative responses to exclusion, we adapted a widely used narrative story-stem task that children completed before and after Cyberball. In this task, children are exposed to scripted story-beginnings and asked to show and tell the experimenter what happens next using toy figures (see Emde et al., 2003). Story-completion measures have a long history of use in studies of typical and atypical child development. Many of these studies have focused on the way children portray characters in their stories (e.g., parents, children) as a window to their internal representations of themselves and others (see Yuval-Adler and Oppenheim, 2014 for a review). Accordingly, studies suggest that the manner in which children portray the child- and parent-characters in their stories partly overlaps with actual real-world behaviors of these children and their caregiving experiences (Oppenheim et al., 1997; Toth et al., 1997). For example, the magnitude of children's affiliative and aggressive themes in such narratives is associated with the tendency to express similar behaviors in various social contexts, as reported by clinicians, parents, or teachers (e.g., Kochanska et al., 1996; Hill et al., 2007; von Klitzing et al., 2007).

Recently, the story-stem approach has been broadened to assess children's tendency to mentalize in their stories (Hill et al., 2007, 2008; Luyten and Fonagy, 2014). More specifically, this approach assesses the degree to which children treat story-characters as intentional agents, i.e., portraying figures as if they have goals and mental states.<sup>1</sup> For story-stems with positive themes, previous research has documented an association between mentalizing, as indexed by the story-stem approach, and theory of mind, as indexed by a traditional false-belief measure (Hill et al., 2008). By contrast, for stories with distressing themes, mentalizing was associated with the child's previous attachment history and their risk for externalizing disorders (Hill et al., 2007, 2008).

For the present studies, children completed scripted story beginnings, themed with peer exclusion and victimization. Importantly, and unlike most exclusion research to date (see Wesselmann et al., 2015), the open-ended story-completion method offers subjects much latitude to express a range of post-exclusion responses. Specifically, we chose this measure as it enabled assessment of spontaneous prosocial and aggressive responses as well as children's tendency to mentalize before and after exclusion. Though rarely, if ever, used in the context of an experimental task such as Cyberball, the story-completion approach is particularly appealing for use with young children, who may otherwise struggle to verbalize their thoughts (Emde et al., 2003).

# STUDY 1

Given the aforementioned links between affiliative and aggressive themes in children's story-completions and parallel behaviors in various social contexts, it seemed plausible that exclusion would affect children's play analogous to adults' affiliative responses to exclusion (e.g., Maner et al., 2007). For typically developing children in Study 1, we predicted that compared to controls, excluded children would portray more affiliation between characters in stories. While studies report that social exclusion can elicit aggression (e.g., Twenge et al., 2001; Will et al., 2014), few if any child studies report such effects. Thus, we explored, but did not predict any effects of exclusion on aggression between characters.

Beyond affiliation and aggression, story-completion narratives are well-placed to examine post-exclusion attention to mental states. Thus, we assessed the degree to which children treat story-characters as intentional agents (Hill et al., 2008). In line with enhanced post-exclusion social monitoring (Pickett and Gardner, 2005), we predicted that exclusion, compared to a control condition, would lead children to portray characters using more mental state language (MSL) and with more intentionality. Because social monitoring is thought to enhance reconnection (Molden and Maner, 2013), we also predicted that increases in mentalizing would mediate the effect of exclusion on affiliative story-themes.

Aside from testing our main hypotheses, in Study 1 we also employed character-specific codes to assess whether or not children selectively describe mental states of some story-characters and direct affiliation toward some characters over others (i.e., victims vs. perpetrators in the story). Social monitoring putatively helps to select good and weed out poor targets for affiliation (Pickett and Gardner, 2005). Accordingly, we predicted that a social exclusion condition would result in increased references to both the victim's and perpetrators' mental states compared to a control condition. Regarding affiliative portrayals, we expected that excluded children would favor victims over perpetrators, as victims should qualify as more promising sources of affiliation.

Finally, in selecting an appropriate control condition for Study 1, we were aware that inclusion cues can also promote both prosocial and antisocial responses (see Over and Carpenter, 2009a; Waytz, 2013) and that inclusion also activates fewer behavioral responses compared to exclusion (e.g., tattling; White et al., unpublished). Also, we aimed to ensure that children are responding to the perceived intentions of excluders. We therefore opted for an accidental exclusion control condition in which children were informed afterward that exclusion occurred due to a computer malfunction. This maps onto procedures in adult studies showing that affiliative responses are reliably elicited by rejecting departures compared to accidental departures (e.g., Maner et al., 2007). As a manipulation check for this control condition, we assessed whether or not children attributed more bad intentions to regular vs. accidental excluders on a puppet interview, after learning about the alleged computer malfunction.

<sup>1</sup>Various dimensions of mentalizing have been operationalized (see Luyten and Fonagy, 2014, for an overview). Story-stem based measures primarily focus on the child's tendency to attribute cognitive-affective mental states to others (i.e., story-characters) starting from the portrayals in the story-beginning. Notably, unlike standard false belief tasks (e.g., Wimmer and Perner, 1983), story-stem based assessments focus on the child's spontaneous usage of mental state attribution rather than the accuracy of these attributions.

# Method

#### Sample

Thirty-six 5-year-olds with a mean age of 68.26 months (SD = 2.43 months; 18 females) were recruited drawing on a database of families volunteering to participate in development studies. All subjects were native speakers. No ethnicity or SES data were available. Boys and girls were separately randomized to exclusion and accidental conditions. Ethical approval was obtained from Leipzig University's institutional review board.

#### Procedure

Children initially completed a warm-up story themed with a Birthday party to acclimatize children to storytelling (Emde et al., 2003). After completing the story, they were informed that they could tell some more stories later. Next, children were furnished with a real-life glove and baseball, which they tossed back and forth with the experimenter. After a few throws, they were told that they would now play this game on the computer over the Internet. In the event that children were unfamiliar with the Internet, the experimenter explained that the Internet would allow them to play on the computer with two other children who were playing the game on a computer in different places, just like they were. Next, children played a first inclusion round of Cyberball, followed by an experimenter administering the first set of baseline story-stems. Then the child played a second experimental round of Cyberball during which they were initially included and then eventually either excluded or accidentally excluded (see section on Cyberball for manipulation details). Following either exclusion condition, a second set of story-stems was administered (stems counterbalanced to pre- and post-test). Puppet interviews were collected after administration of the second set of story-stems to assess attribution of bad intentions to co-players. Afterward, all children were over-included in Cyberball. An over-inclusion phase was deemed more suitable than debriefing for 5-year-olds in keeping with ethical guidelines for young children (see Thompson, 1990). Parents were fully debriefed after their child entered the lab, providing ample time to withdraw from the study before the child played Cyberball (no parents withdrew). Experimenters were blind to all research questions.

#### Measures

#### **Cyberball (see Figure 1)**

Cyberball is a computerized ball-toss game designed for adults (Williams et al., 2000) that was adapted for use with children (Crowley et al., 2010; see below). Subjects ostensibly played online with two other peers using a response pad. In fact, subjects were the only ones playing the game. Peers were computer-generated and their throws adhered to a pseudo-random event script. An initial inclusion period comprising 30 trials aimed to acclimatize children to the game interface. To help with comprehension of the task, an experimenter initially sat beside the child explaining the task and, if necessary, demonstrating the first throw before inviting children to try for themselves. After the eighth trial (third subject throw), experimenters complimented children on their performance and told them they had to do some paper work, taking a seat behind the child (while children completed the acclimatization round). The "acclimatization" round alternated between 9 "my turn" events (ball is thrown to participant), 9 "ball-toss" events (participant throws the ball) and 12 "not my turn" events (ball is passed between co-players).

For the second experimental round of Cyberball, the experimenter immediately took a seat behind the child, pretending to work. The round was divided into a brief initial inclusion period of nine trials for all children (3 "my turn," 3 "ball toss," 3 "not my turn" events) seamlessly transitioning into exclusion (2 "my turn" events, 2 "throw events," and 35 "not my turn" events). The exclusion and accidental conditions only differed in the two final screenshots appearing after the final ball-pass in the accidental condition. In the accidental condition, a first screenshot suggested that an error had occurred in red capital letters. Experimenters read this information out loud to children and terminated screenshots using the spacebar. The second screenshot showed a figure holding two disconnected ends of a red cable. To match this screenshot, response pads were connected to computers with a red sparkling USB cable and experimenters tampered with this cable when the second screenshot appeared. They also asked children if they had only received few balls, and told them that the other players could not toss the ball to them because the cable was disconnected. After the second set of story-stems and the puppet interview, all children played a third 38-trial over-inclusion round (16 "my turn," 15 "ball toss," and 6 "not my turn" events).
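The event counts reported for the acclimatization and experimental rounds can be tabulated as a quick consistency check. The following is only an illustrative sketch: the dictionary layout, the key names, and the helper functions are hypothetical and are not part of the Cyberball software; the two "throw" events of the exclusion phase are treated here as "ball toss" events.

```python
from collections import Counter

# Event counts per phase as reported in the text (names are hypothetical).
PHASES = {
    "acclimatization": Counter(my_turn=9, ball_toss=9, not_my_turn=12),
    "experimental_inclusion": Counter(my_turn=3, ball_toss=3, not_my_turn=3),
    "experimental_exclusion": Counter(my_turn=2, ball_toss=2, not_my_turn=35),
}


def total_trials(phase):
    """Total number of scripted events in a phase."""
    return sum(PHASES[phase].values())


def share_of_passes_received(phase):
    """Fraction of co-player throws that go to the participant.

    Co-player throws are either 'my turn' (to the participant) or
    'not my turn' (between the two co-players).
    """
    counts = PHASES[phase]
    return counts["my_turn"] / (counts["my_turn"] + counts["not_my_turn"])


if __name__ == "__main__":
    for name in PHASES:
        print(name, total_trials(name), round(share_of_passes_received(name), 3))
```

Under these counts, the share of co-player throws reaching the participant is 9 of 21 during acclimatization and 2 of 37 during the exclusion phase.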

Crowley et al.'s (2010) version of Cyberball adds a number of child-friendly features. For example, a pre-recorded female narrator asks the child to pick their favorite from a selection of six baseball gloves before the game commences. For each throw the ball travels in one of many arcs from player to player (e.g., curved line), accompanied by a variety of swoosh sounds. Names and pictures of co-players were displayed above their gloves. Pictures of co-players were age and gender-matched, drawing on a picture bank of neutral child faces. Besides adding a new narrator to this version, we aimed to scaffold understanding of game controls. Thus, each time the subject caught the ball, names of co-players changed colors from white to red and blue to match the color of the respective button children had to press to throw the ball to that player (see **Figure 1**).

#### **Story-stem administration**

Following the MacArthur Story-Stem method (Bretherton and Oppenheim, 2003; Emde et al., 2003), standardized story-completions, enacted with Lego® DUPLO® figures, were used to elicit narratives from each child. Trained experimenters presented story beginnings to children following a standardized script before they asked children to "tell and show me what happens next". Experimenters employed standardized prompts if children failed to address the problem presented in the stem. Before playing the acclimatization round of Cyberball, children completed a positively themed warm-up stem about a child's birthday to check engagement and introduce all characters (Emde et al., 2003). Before and after the experimental Cyberball round, children first completed a stem themed with peer-exclusion ("Sandbox," "Snowman") followed by a stem themed with peer-victimization ("Fight with a Friend," "Favorite Chair"; Warren, 2003; Hill et al., 2007). Exclusion-themed stems were newly developed for this study (see Supplemental Material). We counterbalanced stems to baseline and experimental phases, so that each stem occurred equally often before and after exclusion. To standardize temporal gaps between stories and Cyberball, children were allowed to narrate stories for up to 3 min each.

#### **Story-stem coding**

All stories were transcribed and scored drawing on two different coding manuals and extensions of these systems (Robinson et al., 2002; Hill et al., 2009). All ratings were completed individually for each narrative from verbatim transcripts. Raters remained blind to the condition of subjects, other narratives of that child, order in which the stems were administered, and all other subject information. Raters received training from authors and/or experts of the respective coding systems. A second rater double-coded a random sample of 25% of stories (ICCs: 0.61 to 0.93).

Based on the first manual (Robinson et al., 2002) and in line with previous studies (von Klitzing et al., 2007), a composite of affiliative themes was formed for each story, involving empathy or helping (e.g., character puts band aid on other character), affection (e.g., characters hug), sharing (e.g., characters share items), reparation (e.g., character apologizes) and affiliation (e.g., characters play together) between characters. The presence of each theme was coded in a story and summed to a maximum score of five per story (affiliation). Each instance of affiliation was also coded in a new character-specific fashion. Two separate character-specific affiliative codes were derived by identifying the beneficiaries or recipients of each affiliative action. Affiliative actions were summed with the victimized party as recipients (victim-directed affiliation) and peers who perpetrated victimization as recipients (perpetrator-directed affiliation).
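The scoring rule described above (presence of each of five themes, summed per story, plus recipient-specific counts) can be sketched as follows. This is a minimal illustration under stated assumptions: the theme labels, the `(theme, recipient)` tuple format, and the function names are hypothetical and do not reproduce the coding manuals themselves.

```python
from collections import Counter

# The five affiliative themes named in the text (labels are hypothetical).
AFFILIATIVE_THEMES = (
    "empathy_helping", "affection", "sharing", "reparation", "affiliation",
)


def affiliation_score(coded_themes):
    """Presence-based composite: each theme counts once, maximum of five."""
    return len(set(coded_themes) & set(AFFILIATIVE_THEMES))


def character_specific_affiliation(coded_actions):
    """Count affiliative actions by recipient.

    coded_actions: iterable of (theme, recipient) pairs, where recipient is
    'victim' or 'perpetrator' (an assumed encoding).
    """
    counts = Counter(
        recipient for theme, recipient in coded_actions
        if theme in AFFILIATIVE_THEMES
    )
    return {"victim": counts["victim"], "perpetrator": counts["perpetrator"]}


if __name__ == "__main__":
    story = [("sharing", "victim"), ("sharing", "victim"), ("affection", "perpetrator")]
    print(affiliation_score(theme for theme, _ in story))
    print(character_specific_affiliation(story))
```

Note the difference between the two scores in this sketch: the composite counts a theme once however often it occurs, whereas the character-specific codes sum individual actions.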

Based on a second coding manual (Hill et al., 2009), we coded the extent to which children globally portrayed characters as intentional agents (intentionality), i.e., as if they were goal-directed and had mental states (see Hill et al., 2007, 2008). Extending Hill et al.'s (2009) manual, we summed explicit intentional or mental state words children used to describe story-characters (e.g., "She wants to play with her in the snow.") to create a score for mental state language (MSL) per story. To create a new set of character-specific scores we determined whether the child described a mental state of the victimized character (victim-focused MSL) or the characters perpetrating the victimization (perpetrator-focused MSL).

Additionally, we scored aggression between characters (Hill et al., 2009). Aggression assesses the extent to which children portray characters as acting aggressively toward one another, with higher scores reflecting more severe aggression. For example, verbal aggression usually scores in the lowest range (1–3), minor physical aggression in the intermediate range (4–6) while severe aggression resulting in injuries or even death rate in the high (7–9) or highest range (10–12), respectively.

To gain a more complete picture of narratives, we also scored story-quality (coherence) following a coding manual (Hill et al., 2009) and derived word counts from transcripts as a control variable using a standard software package (Pennebaker et al., 2007).

#### **Preschool Ostracism Puppet Interview (POPI; White et al., unpublished)**

We used a puppet interview protocol informed by the Berkeley Puppet Interview (Ablow and Measelle, 1993) to assess the extent to which children attributed bad intentions to their fellow players. Puppets claimed they had played the game as well and made opposing attributional statements regarding motives of their co-players (four items; "I think the other boys/girls wanted to tease me" vs. "I don't think the other boys/girls wanted to tease me"). Interviews were videotaped and coded on seven-point scales (higher scores indicating stronger attribution of bad intentions; Cronbach's α = 0.92). Over 25% of interviews were double-coded (n = 12; ICC = 1.00). Due to time-constraints, two children did not complete the interview.

#### Data-Analysis

We compared attribution of bad intentions by children in the exclusion and accidental conditions using analysis of variance (ANOVA). To compare conditions in regard to changes in global narrative codes from pre- to post-Cyberball on affiliation, MSL, aggression, intentionality, coherence, and word-count, we conducted a series of mixed-design ANOVAs, with time (pre- to post-Cyberball) as within-subject factor, and condition as between-subject factor. To analyze character-specific affiliation and MSL, we conducted two mixed-design ANOVAs, with time (pre- to post-Cyberball) and story-character (victim, perpetrator) as within-subject factors, and condition as between-subject factor. For all analyses, we averaged scores on peer-exclusion and peer-victimization stories before and after the manipulation after ensuring absence of Time by Condition by Story Type interactions. In a final step, we entered pre–post change in word count as a covariate in analyses of global narrative codes that yielded Condition × Time interactions, to ensure their independence of changes in story-length. The PROCESS macro (Hayes, 2013) was used to assess if changes in intentionality or MSL mediated effects of regular vs. accidental exclusion on changes in affiliative themes. Post-Cyberball affiliation and intentionality/MSL scores were entered as independent and mediator variables, respectively, while pre-Cyberball scores functioned as covariates. We conducted ordinary least squares (OLS) path analyses using 10,000 bootstrapping samples, a bias-corrected 95% confidence interval (CI), and omitted covariates to compute Preacher and Kelley's (2011) κ<sup>2</sup> as an effect size (small: 0.01 to 0.089, intermediate: 0.09 to 0.249, large: ≥0.25).
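The logic of a bootstrapped indirect effect can be illustrated with a short sketch. It is not the PROCESS macro and does not reproduce the reported analysis: it uses a plain percentile interval rather than the bias-corrected interval named above, omits covariates, and runs on made-up toy data. All function names and the data generator are hypothetical.

```python
import random


def _cross(u, v):
    """Sum of cross-products of u and v around their means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """a*b from two OLS fits: M ~ X (slope a) and Y ~ X + M (slope b of M).

    Returns None if the sample is degenerate (no variance in X, or M
    perfectly collinear with X).
    """
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2
    if sxx < 1e-12 or det < 1e-12:
        return None
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    return a * b


def bootstrap_ci(x, m, y, n_boot=2000, seed=0, alpha=0.05):
    """Percentile bootstrap CI for the indirect effect (cases resampled)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        est = indirect_effect([x[i] for i in idx],
                              [m[i] for i in idx],
                              [y[i] for i in idx])
        if est is not None:  # skip degenerate resamples
            estimates.append(est)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def make_toy_data(n_per_group=18):
    """Made-up data: X is a 0/1 condition, M depends on X, Y depends on M."""
    x = [0] * n_per_group + [1] * n_per_group
    m = [2.0 * xi + 0.1 * ((2 * i) % 5 - 2) for i, xi in enumerate(x)]
    y = [1.5 * mi + 0.05 * ((3 * i) % 7 - 3) for i, mi in enumerate(m)]
    return x, m, y


if __name__ == "__main__":
    x, m, y = make_toy_data()
    print("indirect effect:", indirect_effect(x, m, y))
    print("95% percentile CI:", bootstrap_ci(x, m, y))
```

In this sketch, mediation is inferred when the interval for the product of the two paths excludes zero, which is the same decision rule applied to the confidence intervals for the indirect effect.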

#### Results

#### Manipulation check

An ANOVA revealed that excluded children attributed more bad intentions to their co-players, compared to children in the accidental condition, F(1,32) = 7.436, p = 0.010, η<sub>p</sub><sup>2</sup> = 0.189; M<sub>excl</sub> = 4.094, SD<sub>excl</sub> = 1.837; M<sub>accid</sub> = 2.625, SD<sub>accid</sub> = 1.284. This finding supports the validity of the accidental condition, indicating that preschoolers distinguish between types of exclusion based on the intentions of excluders.
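For readers who wish to check the reported effect sizes, partial eta squared for a single effect can be recovered from the F statistic and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (F × df<sub>effect</sub>) / (F × df<sub>effect</sub> + df<sub>error</sub>). The helper below is an illustrative sketch with a hypothetical function name.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # Manipulation check reported above: F(1,32) = 7.436
    print(round(partial_eta_squared(7.436, 1, 32), 3))  # 0.189
```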

#### Effects of exclusion on story-completions

To test our hypotheses that exclusion would give rise to an increase in affiliation, intentionality, and MSL compared to the accidental condition, a series of 2 (Condition) by 2 (Time) repeated measures ANOVAs were performed (see **Table 1** for descriptives, F-values and effect sizes). No main effects of Condition or Time emerged for affiliation, intentionality, or MSL (ps > 0.12). Confirming our hypotheses, Condition × Time interactions were detected indicating greater increases after exclusion for affiliation (p < 0.001) as well as MSL (p = 0.004) and intentionality (p = 0.001) compared to the accidental condition (see **Figure 2**). Condition × Time Interaction effects on affiliation, MSL, and intentionality were robust to controlling for pre- to post-word count changes (ps < 0.014). The same analyses were conducted for coherence, aggression, and word count. Coherence yielded a main effect of time (p = 0.025), but neither an effect of condition (p = 0.652), nor a Condition × Time interaction (p = 0.593). No main effects or Condition × Time interactions emerged for word count (p = 0.131) or aggression (p = 0.626; see **Table 1**).

To test our hypothesis that excluded children, but not controls, would preferentially direct affiliation toward the victim of the story, a 2 (Time) by 2 (Condition) by 2 (Character: victim or perpetrator) mixed-design ANOVA was performed. For affiliation, we detected a Condition × Time interaction, F(1,34) = 11.900, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.259, which was further moderated by a Condition × Time × Character interaction, F(1,34) = 5.100, p = 0.030, η<sub>p</sub><sup>2</sup> = 0.130. Two follow-up 2 (Time) by 2 (Condition) ANOVAs revealed a Condition × Time interaction for affiliation that was victim-directed (p = 0.001), but only a trend-level interaction for affiliation that was perpetrator-directed (p = 0.057). This pattern of results suggested that excluded children increased victim-directed affiliation, but not perpetrator-directed affiliation compared to children in the accidental condition (see **Figure 3**, lower panels). For MSL, we also performed a 2 (Time) by 2 (Condition) by 2 (Character: victim or perpetrator) mixed-design ANOVA. Here, we detected a Condition × Time interaction, F(1,34) = 9.047, p = 0.005, η<sub>p</sub><sup>2</sup> = 0.210, but no evidence for a Condition × Time × Character interaction, F(1,34) = 0.468, p = 0.499, η<sub>p</sub><sup>2</sup> = 0.014. This pattern of results indicated that excluded children increased victim-focused and perpetrator-focused MSL to a comparable extent relative to children in the accidental condition (see **Figure 3**, upper panels).

From simple mediation models employing OLS path analysis, we found evidence that regular vs. accidental exclusion generated an increase in affiliation through their indirect effects on intentionality (CI for indirect effect: −0.416 to −0.017) as well as MSL (CI for indirect effect: −0.385 to −0.044). The mediation effects were medium to large for intentionality (κ<sup>2</sup> = 0.201; CI = 0.053 to 0.395) and MSL (κ<sup>2</sup> = 0.165; CI = 0.052 to 0.332).

# STUDY 2

In Study 2, we aimed to test the proposal that childhood anxiety may coincide with stress-induced deficits in mentalizing (e.g., Nolte et al., 2011). Accordingly, we predicted that children with ADs would exhibit a decline in depicting story-characters using MSL and intentionality after exclusion compared to controls. In this study, we thus exposed all children to regular exclusion and examined its effect as a function of anxiety. Concerning affiliative themes, we did not make specific predictions because the research is inconsistent, with some work suggesting that anxious children are highly motivated to be accepted by others (Banerjee, 2008), but other research indicating that individuals with (social) anxiety have trouble enacting reconnection behaviors after exclusion (Mallott et al., 2009). For this study, we also broadened our age-range as compared to Study 1. We did this, first, because we aimed to provide initial evidence that the patterns documented in Study 1 are not circumscribed to preschoolers, but also generalize to older children. Second, pragmatic reasons also played a role, as the recruitment of clinically referred young children with diagnosed ADs posed a challenge.

TABLE 1 | Means and ANOVA results testing effect of condition (exclusion, accidental exclusion) on global codes in pre- and post-Cyberball doll-play narratives in Study 1.

†p < 0.10; ∗p ≤ 0.05; ∗∗p ≤ 0.01; ∗∗∗p ≤ 0.001.

#### Sample

Twenty clinically referred 4- to 8-year-olds with AD participated in this study prior to enrollment in a treatment-evaluation study (see Göttken et al., 2014). Following referral by a senior child psychologist of the outpatient services, presence of AD was independently established by a trained researcher using a diagnostic interview with the parent (see below). As a control group, 15 non-referred age- and gender-matched children were recruited via telephone from a group of volunteers for studies of child development. All children of the comparison group scored below the clinical cut-off of the emotional symptoms subscale of the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997; see below), which assesses anxiety and mood symptoms. The control group (hereafter referred to as non-anxious children or controls) was also comparable to the AD group in regard to years of parental schooling as well as rate of parental separation (see **Table 2**). All children in the AD group were recommended for enrollment in a treatment-evaluation study (see Göttken et al., 2014). Ethical approval was obtained from Leipzig University's institutional review board.

#### Procedure

All steps matched the regular exclusion condition of Study 1, with the following exceptions: AD children completed a puppet interview on their symptoms (not analyzed herein) prior to engaging in the procedure. To minimize the time-burden for AD children, the POPI was omitted after completion of the second set of story beginnings.

#### Measures

#### **Cyberball**

The identical set-up was used as for the exclusion condition in Study 1.

#### **Story-stem narratives**

Administration (e.g., counterbalancing) and coding procedure of child narratives matched Study 1 in all regards, except the following: coding was limited to hypothesis-related dimensions of affiliation, aggression, coherence, intentionality, and MSL. A random sample of 20% of the present stories was double-coded by trained coders (ICCs: 0.66 to 0.86).

# Psychiatric Disorders and Symptoms

#### **Preschool Age Psychiatric Assessment (PAPA)**

The interviewer-based Preschool Age Psychiatric Assessment (PAPA; Egger and Angold, 2004) was administered to mothers of the AD group. The PAPA is a 2–3 h structured clinical interview to assess DSM-IV criteria of preschool and young school-age children below age 9 (Egger, 2012, personal communication). Across a 3-month primary period, mothers report frequency, duration and onset of child psychiatric symptoms to the interviewer. After entering all data into the electronic interview interface of the PAPA, algorithms designed by the developers of the PAPA and implementing DSM-IV criteria generate symptom scores and categorical diagnoses. The PAPA was translated and adapted between 2009 and 2010 by a research group at the University of Leipzig, assisted by the US PAPA authors. PAPA modules included in this study were: Oppositional Defiant Disorder (ODD), Conduct Disorder (CD), Depression (D), Social and Specific Phobia (SOP; SP), General Anxiety Disorder (GAD), and Separation Anxiety Disorder (SAD). A high degree of inter-rater reliability was established on primary diagnoses and subthreshold diagnoses (kappa coefficient = 0.92; range: 0.62 to 1.00; Göttken et al., 2014). The PAPA has shown good test-retest reliability and construct validity (Egger and Angold, 2006; Egger et al., 2006).

#### **Strengths and Difficulties Questionnaire**

All caregivers completed the 25-item Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997) – a commonly used child-psychiatric screener that yields symptom scores for emotional symptoms (i.e., anxiety and mood symptoms), conduct problems, hyperactivity, and peer problems. Validity and adequate reliability for English and German versions were established in several studies (Goodman, 2001; Klein et al., 2013), for example, showing significant overlap between clinician-rated emotional disorders and parent-rated emotional symptoms (Becker et al., 2004). To screen the control group negative for anxiety symptoms, the Emotional symptoms subscale was checked to ensure that all controls scored below the clinical cut-off of 5, established within a representative German sample (Woerner et al., 2004).

#### Verbal Competence

Receptive verbal ability was assessed using the picture-based Peabody Picture Vocabulary Test-Revised (PPVT-R; Dunn and Dunn, 1981) to ensure that groups were comparable in terms of verbal competence.

#### Data-Analysis

First, to confirm successful matching, anxiety-disordered children and controls were compared on all demographic factors and verbal competence using χ<sup>2</sup> tests and a series of one-way analyses of variance (ANOVA). For the main analyses, a series of two-way 2 (Time: Pre- vs. Post-exclusion) by 2 (Group: AD group vs. Controls) mixed-design ANOVAs were conducted to assess group by time interactions on intentionality, MSL, coherence, aggression and affiliation.<sup>2</sup> Significant interactions were followed up with separate one-way repeated measures ANOVAs in both groups to analyze whether effects of time (Time: Pre- vs. Post-exclusion) in the AD or the control group or both accounted for the results.
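In a 2 by 2 mixed design of this kind, the group by time interaction can be understood through difference scores: the interaction F equals the squared pooled-variance t statistic that compares the two groups on post-minus-pre change. The sketch below illustrates that equivalence in plain Python. The function name and the change scores are hypothetical; this is not the authors' analysis script, and the software they used is not stated here.

```python
from statistics import mean, variance


def interaction_f(change_group_a, change_group_b):
    """Group x Time interaction F for a 2 x 2 mixed design.

    Computed as the squared pooled-variance t statistic on
    post-minus-pre change scores of two independent groups.
    Returns the F value and the error degrees of freedom.
    """
    n_a, n_b = len(change_group_a), len(change_group_b)
    df_error = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(change_group_a) + (n_b - 1) * variance(change_group_b)
    ) / df_error
    squared_se = pooled_var * (1 / n_a + 1 / n_b)
    f_value = (mean(change_group_a) - mean(change_group_b)) ** 2 / squared_se
    return f_value, df_error


# Hypothetical change scores (post-exclusion minus baseline), not study data.
anxious_change = [-1.0, -2.0, 0.0, -1.0]
control_change = [1.0, 2.0, 1.0, 0.0]
f_value, df_error = interaction_f(anxious_change, control_change)
print(f"F(1, {df_error}) = {f_value:.2f}")
```

A large F here means that the two groups changed in different directions or by different amounts, which is the pattern the follow-up repeated measures ANOVAs then unpack within each group.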

#### Results

Children with ADs were comparable to non-anxious controls on child age, gender, verbal competence, rate of parental separation, and parental education (all ps > 0.10; see **Table 2**). To compare AD children with controls on pre- to post-exclusion changes in narrative dimensions (prosociality, aggression, coherence, intentionality, MSL), a series of mixed-design ANOVAs were conducted (see **Table 3** for means, standard deviations, and test statistics). For intentionality and MSL, no main effects of group or time were observed, but, as predicted, an interaction between group and time emerged for intentionality (p < 0.001) and MSL (p < 0.006), showing that intentionality and MSL decreased from baseline to post-exclusion in the AD group, but increased for controls (see **Figure 4**). To check whether the interaction effect mainly derived from the decrease in the AD group or the increase among controls, a post hoc repeated measures ANOVA was conducted separately for each group with time as the within-group variable. This revealed an increase in the non-anxious control group on intentionality, F(1,14) = 13.55, p = 0.002, η<sub>p</sub><sup>2</sup> = 0.492, and MSL, F(1,14) = 6.175, p = 0.026, η<sub>p</sub><sup>2</sup> = 0.306, as well as a decrease in the AD group on intentionality, F(1,19) = 10.322, p = 0.005, η<sub>p</sub><sup>2</sup> = 0.352, and a trend for a decrease on MSL, F(1,19) = 3.048, p = 0.097, η<sub>p</sub><sup>2</sup> = 0.138. Similarly, coherence also revealed a significant interaction effect (p < 0.001). Again, separate post hoc repeated measures ANOVAs were conducted for each group with time as the within-group variable. This revealed both an increase in the control group, F(1,14) = 11.455, p = 0.004, η<sub>p</sub><sup>2</sup> = 0.450, and a decrease in the AD group, F(1,19) = 5.93, p = 0.022, η<sub>p</sub><sup>2</sup> = 0.246. No main effects of group or time, or interactions between time and group emerged for affiliation (ps > 0.23) and aggression (ps > 0.11).
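The partial eta squared values reported with each follow-up test can be recovered from the F ratio and its degrees of freedom. The sketch below uses the standard identity, partial eta squared = (F × df effect) / (F × df effect + df error), and applies it to the two intentionality tests. The function name is an illustrative assumption.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Follow-up tests reported for intentionality.
controls = partial_eta_squared(13.55, 1, 14)   # control group, F(1,14) = 13.55
ad_group = partial_eta_squared(10.322, 1, 19)  # AD group, F(1,19) = 10.322
print(round(controls, 3), round(ad_group, 3))  # 0.492 0.352
```

Read this way, time accounts for roughly half of the variance in intentionality that is not attributable to stable differences between children in the control group, and roughly a third in the AD group.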

## DISCUSSION

This research is the first to show that exclusion leads young children to shift how much they attend to others' mental states and that the extent to which they do so depends on their level of anxiety. Thus, exclusion, but not accidental exclusion, led typically developing preschoolers to tell stories that portrayed characters as intentional agents, with more references to characters' mental states, and increased affiliation between characters (Study 1). Conversely, young children with ADs were less likely to portray characters as intentional agents and made fewer references to story-characters' mental states after exclusion compared to a non-anxious control group who showed similar increases on these dimensions as in the first study (Study 2).

Across Studies 1 and 2, we provide the field with first experimental data documenting young children's systematic moment-to-moment fluctuations in attention to others' mental states. During this crucial stage of development in understanding mental states, children already appear capable of flexibly increasing or decreasing mentalizing to meet the needs of a given situation. Indeed, exclusion may compel children to increase mentalizing, paving the way toward more effective reconnection (Pickett and Gardner, 2005), as suggested by the parallel increase in affiliative story-themes and their mediation by intentionality and MSL in Study 1. Moreover, considering the character-specific findings, children appear to monitor other minds broadly (victims and perpetrators alike), but direct their affiliative motivation specifically to those targets who are most open to cooperation (victims).<sup>3</sup> Excluded children's contemplation of the mental states of those around them may thus help them navigate toward target individuals who are most worthwhile to approach in order to restore a sense of connection. In turn, closely attending to a target's mental states may also facilitate post-exclusion affiliative behaviors by the excluded party, given that genuinely prosocial and cooperative actions demand that the actor keeps the needs and goals of the recipient in mind (Tomasello, 2014). In that sense, excluded children may be thought of as adopting a "cooperative mindset."

<sup>2</sup> Including story-type in a three-way 2 (Time: Baseline vs. Post-exclusion) by 2 (Group: AD group vs. Controls) by 2 (Story Type: exclusion vs. peer-conflict) mixed-design ANOVA yielded no evidence of a three-way interaction. Therefore, as in Study 1, we collapsed children's scores across stories (i.e., using mean scores at baseline and post-exclusion).

<sup>3</sup> The post-exclusion increase in victim-directed affiliation may also reflect an "attraction" to story-characters who share the subject's plight (i.e., victimization), resembling classic findings reporting that subjects expecting a novel threat preferred to wait with similarly threatened others, rather than others in a dissimilar situation (Schachter, 1959; Gump and Kulik, 1997). Potentially, other excluded parties may afford especially promising targets for reconnection, as they may share the subject's desire to reconnect, given their equally excluded state.

#### TABLE 2 | Demographic data of children with and without anxiety disorder in Study 2.

#### TABLE 3 | Means and ANOVA results testing effect of group (anxious, non-anxious) on global codes in pre- and post-Cyberball doll-play narratives in Study 2.

†p < 0.10; ∗∗p ≤ 0.01; ∗∗∗p ≤ 0.001.

A distinct, but related interpretation of our data may suggest that exclusion prompted children to more strongly anthropomorphize story-characters in an attempt to cope with exclusion. Indeed, other studies have documented that exclusion or a dispositionally high need for inclusion leads individuals to anthropomorphize ambiguous or inanimate agents, thus augmenting the perception of social connection (Epley et al., 2008; Powers et al., 2014). Scholars have speculated that these patterns may assist excluded individuals in seeking solace in imaginary "parasocial" relationships or reflect adjustment of information-processing thresholds after exclusion to seek out new partners in more places (Knowles, 2013; Molden and Maner, 2013). We would suggest that this account complements the view that excluded children adopt a "cooperative mindset," in that increased mentalizing post-exclusion may prepare children should opportunities for reconnection arise.

However, adopting a "cooperative mindset" does not appear to be a universal response to exclusion. Indeed, young children with ADs instead showed a decline in attending to mental states after exclusion. This deficit in mentalizing upon social threat therefore provides one potentially important reason why anxious children may have trouble applying their intact mentalizing skills to affectively charged social situations (see Banerjee, 2008). Excessive negative arousal, typical of childhood anxiety, may interfere with controlled mentalizing, potentially resulting in a more automatic mode of mentalizing after exclusion, coinciding with reflexive assumptions about others' internal states (Fonagy and Luyten, 2009).

Notably, we recently reported neural data suggesting that insecure attachment strategies lead children to respond to the Cyberball paradigm with more excessive and enduring negative expectations regarding re-inclusion than securely attached children (White et al., 2012, 2013). The present anxiety-related drop in mentalizing could set the stage for an over-extension of these negative expectations to other encounters after exclusion. Specifically, anxious children might effectively be making unjustified, reflexive, and sweeping assumptions about the mental attitudes of others toward themselves (automatic mentalization) that promotes generalization of their own negative views of themselves, others, and the world ("Nobody will ever let me back in"). Inasmuch as reduced mentalizing may then, in turn, impede affiliation after exclusion, it may partly explain why childhood anxiety is associated with increased risk for peer rejection in many studies (e.g., Perren et al., 2006; von Klitzing et al., 2014). Indeed, given that most individuals get exposed to exclusion at some point or another (Nezlek et al., 2012) – perhaps especially so in early childhood when children are less socially skilled and exclusion may even occur accidentally (Monks, 2011) – much may depend on the capacity to recover from exclusion once it has transpired.

### Limitations and Future Directions

First, it may seem surprising that anxious children did not also evidence diminished affiliative themes in their story-completions in Study 2. However, scholars frequently caution against equating portrayals in story-completions with the actual experiences they denote (e.g., Bretherton and Oppenheim, 2003). The exclusion-induced increase in affiliative portrayals in Study 1 may thus potentially signify a behavioral disposition of the excluded child or a wish for such behavior from others, rather than the behavior or experience itself. Perhaps anxious children preserve their wish and motivation to be accepted by others, despite a failure to act accordingly to reach this goal (Banerjee, 2008), which would reconcile our findings with data showing diminished post-exclusion reconnection behaviors among socially anxious adults (Mallott et al., 2009). Given that we have shown that social exclusion impacts what children "think about," future work may examine how attention to mental states relates to what they actually do, for instance, if given an opportunity to "reunite" (White et al., 2013) or if aggressive options are available (Warburton et al., 2006).

Second, our data also raise important questions regarding the exclusion-specificity of the observed changes in mentalizing for typically developing and anxious children. To draw conclusions on this issue, we would need to compare effects of various types of stressors (e.g., negative pictures, tackling unsolvable tasks, losing a game). However, we speculate that other social-evaluative stressors (e.g., giving a presentation to an audience) would also generate similar results. Indeed, even non-social threat may sometimes kindle an affiliative motivation (Schachter, 1959), and may therefore, by extension, also lead to elevated mentalizing among healthy individuals. Future research could attempt to disentangle the effects of arousal and affiliative motivation in different populations.

Third, in a related vein, future research should also aim to specify the dispositional factors that influence context-dependent shifts in mentalizing. Indeed, in other work using the story-completion method, conduct disorders and externalizing symptoms have also been associated with reduced portrayals of characters as intentional agents, but only in stories with distressing themes (Hill et al., 2007, 2008). In keeping with recent proposals, stress-induced mentalizing deficits may therefore reflect a transdiagnostic vulnerability to mental disorder, rather than a vulnerability specific to anxiety (see Fonagy et al., 2016). Future work could examine children with other clinical problems that promote high arousal under challenge (e.g., aggression), likely impeding children in bouncing back from rejection.

Fourth, it is also noteworthy that unlike some behavioral data in adults (Twenge et al., 2001), we did not observe any increases in aggressive story-themes in our data among either typical or anxious young children. Interestingly, this corresponds to a finding in our previous study, showing that preschoolers in contrast to adults do not feel threatened in their subjective sense of control by exclusion (White et al., unpublished). Notably, control-threat has been identified as the single most important mediator of aggressive responses to exclusion, as excluded individuals act aggressively to regain a sense of agency and influence over events (Gerber and Wheeler, 2009). Potentially, during this early period when children are still gaining familiarity with peer interactions and may show greater generosity than at later stages (Fehr et al., 2008), peer exclusion may serve as a stronger suppressant of aggression than at later stages (Barner-Barry, 1986). More generally, this null finding additionally strengthens our conclusion that the increases in mentalizing observed here primarily occurred in the context of a motivation to reconnect. Yet, a sample which included dispositionally aggressive children may potentially yield increases in aggressive story-themes.

Fifth, in this study we used a story-completion measure to assess the degree to which children engage in mentalizing following exclusion. However, it is conceivable that other measures of mentalizing, such as standard false belief tasks that tap into the capacity to infer beliefs that contrast with the child's own knowledge (Wellman, 2014), may yield divergent results. For a more complete picture, researchers should also aim to administer such tasks before and after exclusion in future studies.

Sixth, future work should also assess healthy and anxious children's responses to inclusion conditions. For the present study, an inclusion condition was primarily deemed less appropriate, given that previous studies document that inclusion cues may also promote cooperation and trust (Over and Carpenter, 2009a; Hillebrandt et al., 2011). Therefore, inclusion may prove suboptimal as a control condition to examine reconnection responses to exclusion. However, inclusion responses may be of interest in their own right.

Finally, a set of alternative interpretations also deserves attention. First, it might be suggested that children merely ponder mental states of others after exclusion because they are wondering why they were excluded. Indeed, Cyberball is a causally ambiguous task (Williams and Zadro, 2005), i.e., participants are not informed why their co-players stopped passing them the ball. However, if increased mentalizing merely reflected a wish to understand the reasons for exclusion in Cyberball, excluded children would be expected to focus their attention more narrowly on mental states of perpetrators in their stories. Yet, we did not find evidence for this using character-specific scores in Study 1. A second account might suggest that Cyberball gives children a firsthand experience of exclusion that leads to a better understanding of mental states of story-characters facing similar situations. However, if this were the sole explanation, excluded children might primarily be expected to better understand mental states of the story-victim. Instead, we observed an increase in mentalizing in relation to victims and perpetrators. Notably, we are not claiming that these social-cognitive processes play no role after exclusion. Rather, we are suggesting that they are unlikely to fully explain our pattern of findings. Indeed, neither of these lean interpretations of our data is easily reconciled with the fact that intentionality and MSL mediated the effect of exclusion on affiliative story-themes in Study 1, suggesting that mentalizing in this context provides a means for reconnection and that young children may already flexibly adapt their level of mentalizing to match their affiliative goals.

# CONCLUSION

A developmental theory of mental state understanding is incomplete as long as we know relatively little about the circumstances and dispositions that determine the extent to which children actually use this competence or not. Our findings show that social exclusion offers an important stimulus for the usage of mentalizing from preschool age onward. As excluded children weigh the benefits of reconnection (promotion) against the cost of potential further rejection (prevention; Molden and Maner, 2013), attending to others' mental states may provide a useful "mental reconnection tool" to vigilantly filter, approach, and re-engage with potential social partners. However, this "mental reconnection tool" may not be readily available to all children facing social exclusion. Thus, we showed that children with ADs exhibit a drop in mentalizing following exclusion. Given a general model of mentalization and regulation of negative affect (Fonagy et al., 2002), it is likely that the process of impaired mentalizing under the social challenge of exclusion reflects a transdiagnostic vulnerability factor that more broadly lies at the core of developmental psychopathology.

# ETHICS STATEMENT

Informed written consent was collected from all parents and all children also orally assented to the study. Four- to 8-year-olds participated in a computerized ball-toss game during which they were eventually excluded. At the end of the procedure, all children were over-included in the game by a new set of children to dispel any potential negative emotions. An over-inclusion phase was deemed more suitable than debriefing for this age group in keeping with ethical guidelines for young children (Thompson, 1990). Parents were fully debriefed before the experimental procedure, providing ample time to withdraw from the study before the child played the ball-game (no parents withdrew).

# AUTHOR CONTRIBUTIONS

Writing and revision of manuscript: LW, MC, AK, KvK, AG, YO, JH, HO, and PF; study design: LW, AK, KvK, MC, AG, YO, and PF; data-collection: LW, AK, and AG; and data analysis: LW, AK, AG, and YO.

# FUNDING

The preparation of this manuscript was supported by the Heidehof Foundation (Germany) and the Economic and Social Research Council, Grant number: ES/K006702/1. Special thanks are due to Dr. Michael Tomasello and Katharina Haberl for their generous support, especially regarding recruitment of child subjects for these studies. The authors would also like to thank Dr. Robert Emde and Dr. Martin Debbané for their comments at the Annual Research Training Programme of the International Psychoanalytic Association at University College London. Moreover, the authors are grateful to Dr. Malinda Carpenter, Dr. Maria Plötner, Antonia Misch, and Dr. Robert Hepach for their feedback on this work.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01926/full#supplementary-material

#### REFERENCES

J. Child Psychol. Psychiatry 47, 313–337. doi: 10.1111/j.1469-7610.2006.01618.x

children of women with post-natal depression. Philos. Trans. R. Soc. B Biol. Sci. 363, 2529–2541. doi: 10.1098/rstb.2008.0036

of deception. Cognition 13, 103–128. doi: 10.1016/0010-0277(83)90004-5


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 White, Klein, von Klitzing, Graneist, Otto, Hill, Over, Fonagy and Crowley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Social Cognition in Children Born Preterm: A Perspective on Future Research Directions

Norbert Zmyj<sup>1</sup>\*<sup>†</sup>, Sarah Witt<sup>1†</sup>, Almut Weitkämper<sup>2</sup>, Helmut Neumann<sup>2</sup> and Thomas Lücke<sup>2</sup>

<sup>1</sup> Institute of Psychology, TU Dortmund University, Dortmund, Germany, <sup>2</sup> Department of Neuropediatrics, University Children's Hospital, Ruhr-Universität Bochum, Bochum, Germany

Preterm birth is a major risk factor for children's development. It affects children's cognitive and intellectual development and is related to impairments in IQ, executive functions, and well-being, with these problems persisting into adulthood. While preterm children's intellectual and cognitive development has been studied in detail, their social development and social-cognitive competencies have received less attention. Namely, preterm children show problems in interactions with others. These interaction problems are present in relationships with parents, teachers, and peers. Parents' behavior has been identified as a possible mediator of children's social behavior. Maternal sensitivity and responsiveness as well as absence of mental disorders foster children's social development. In this article, we will report on the social side of impairments that preterm children face. The review of the literature revealed that preterm infants' joint attention abilities are impaired: They are less likely to initiate joint attention with others and to respond to others' efforts to engage in joint attention. These deficits in joint attention might contribute to later impairments in social cognition, which in turn might affect social interaction skills. Based on these three domains (i.e., problems in social interaction, parental behavior, and impairments in joint attention), we suggest that preterm children's social cognitive abilities should be investigated more intensively.

Keywords: preterm birth, social cognition, social problems, Theory of Mind, joint attention

#### Edited by:

Paola Molina, University of Turin, Italy

#### Reviewed by:

Ruth Ford, Anglia Ruskin University, United Kingdom

Alessandra Sansavini, University of Bologna, Italy

#### \*Correspondence:

Norbert Zmyj norbert.zmyj@tu-dortmund.de

†These authors have shared first authorship.

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 09 February 2016 Accepted: 13 March 2017 Published: 29 May 2017

#### Citation:

Zmyj N, Witt S, Weitkämper A, Neumann H and Lücke T (2017) Social Cognition in Children Born Preterm: A Perspective on Future Research Directions. Front. Psychol. 8:455. doi: 10.3389/fpsyg.2017.00455

# INTRODUCTION

Preterm birth is a major risk factor for children's development (Aarnoudse-Moens et al., 2009b). It affects preterm children's motor development (Jeyaseelan et al., 2006; Sansavini et al., 2015) and somatic health (Saigal and Doyle, 2008), as well as their cognitive and intellectual development: Impairments in IQ, executive functions, and well-being are related to a preterm birth, and these problems persist into adulthood (Løhaugen et al., 2010). While these factors of preterm children's intellectual and cognitive development have been studied in detail, their social development and social-cognitive competencies have received less attention. This lesser interest in social-cognitive development is surprising, as preterm children face problems not only in their intellectual development but also in social interaction (for a review, see Chapieski and Evankovich, 1997). When reading the following paragraphs, it should be noted that the definitions of preterm and very preterm birth vary across studies, both in the criteria used (birth weight or gestational age or both) and the specific critical values. Usually, the critical values are a birth weight of less than 1500 g and a gestational age under 33 weeks (Aarnoudse-Moens et al., 2009b). According to WHO criteria, preterm birth is defined by a gestational age of less than 37 weeks. Therefore, criteria defining preterm birth should be taken into account thoroughly before comparing various findings (for an overview of definitions of preterm birth given by the studies reported below, see Appendix, Table 1).

# INTERACTION DIFFICULTIES WITH OTHERS

Preterm children's interaction difficulties are reported to be manifold: A systematic review of 23 studies dealing with social development in children between 0 and 17 years of age revealed 16 out of 21 studies reporting more peer problems and social withdrawal in preterm children compared to full-term children (Ritchie et al., 2015). More specifically, at 2 years of age, children born very preterm already have lower social competence (e.g., listening to parents or playing with other children, Spittle et al., 2009) and are rated as less socially competent by their parents (Alduncin et al., 2014; Johnson et al., 2015) than their full-term peers. Preterm children also show more externalizing behaviors than their full-term peers (Bhutta et al., 2002; Potijk et al., 2012), imposing special challenges on their social environment.

Other studies considering very-low-birth-weight infants between 5 and 10 years of age have reported a persistence of social problems into school age (Ross et al., 1990; Hille et al., 2001; Reijneveld et al., 2006), underlining the relevance of this topic. Preterm children were found to be not as accepted by peers as full-term children, and were more likely to withdraw from social situations (Hoy et al., 1992; McCormick and Workman-Daniels, 1996; Nadeau et al., 2003). They were also verbally victimized more often (Nadeau et al., 2004), and rated as socially immature (Nadeau et al., 2003). Various possible reasons for these findings have been discussed (Nadeau et al., 2004). For instance, minor motor difficulties might lead to exclusion from the peer group and to victimization, and preterm children have more of these motor difficulties than their full-term peers (Holsti et al., 2002). Preterm children might themselves feel uncomfortable during physical activities with their peers who are more dexterous than themselves (Yude et al., 1998).

However, some studies indicate that preterm children do not, in general, show more difficulties in social interaction than their peers. A study differentiating between two subgroups of preterm children revealed only preterm children with medical risk factors (e.g., intraventricular hemorrhage) exhibiting more difficulties in social interaction than full-term peers (Landry et al., 1990). In accordance with this finding, brain abnormalities could be identified as a predictor of social competence (Ritchie et al., 2015). The predictive power of gestational age and brain abnormalities might serve as an explanation for one report that does not support the suggestion of differences in social competence between preterm and full-term children (Jacob et al., 1984). This study included preterm children with a birth weight up to 2500 g and a gestational age up to 37 weeks. These values are higher than in the studies that reported differences in social competence between preterm and full-term children, thereby favoring the inclusion of preterm children at lower medical risk.

Besides brain abnormalities and motor difficulties, parental behavior emerged as a crucial factor in preterm children's interaction problems. Therefore this aspect will be considered in more detail in the following section.

# THE ROLE OF PARENTS' BEHAVIOR IN THEIR PRETERM CHILDREN'S SOCIAL BEHAVIOR

Preterm children's social behavior cannot be considered without taking a closer look at its relationship to parents' behavior and mental condition. A recent study revealed that mothers who reported more depressive symptoms, more perceived stress as a parent, and a reduced sense of coherence had children with fewer social skills. This relationship, however, was not domain-specific for social skills, but was also prevalent in emotional-behavioral problems as well as in fewer executive functions (Huhtala et al., 2014). The relationship between maternal stress and children's social problems applies to preterm as well as full-term children (Assel et al., 2002). However, there is a higher prevalence of perceived stress (Huhtala et al., 2011), anxiety (Brooten et al., 1988; Bener, 2013) and depression (Brooten et al., 1988; Huhtala et al., 2011; Bener, 2013) among mothers of preterm infants compared to mothers of full-term children.

In addition to mothers' mental condition, the parental interaction style seems to be important for preterm children's social development. The first point to mention is maternal directiveness. In general, parental behavior that is not highly controlling or that does not restrict children's behavior predicts a larger and faster increase in social development (e.g., compliance with maternal requests, Landry et al., 1997b). Mothers of preterm children were found to give their 3-year-old children fewer choices in interaction than mothers of full-term children (Landry et al., 1990), and this directive behavior was negatively associated with children's initiation of activities. For preterm infants with medical risk factors, this might have been an adaptive strategy, because it takes into account the individual cognitive delay. However, for preterm infants without these risk factors, maternal directiveness was not related to the children's cognitive delay or social problems.

Using a micro-analytic coding system, 12-month-old preterm infants could be shown to differ from full-term controls concerning co-regulation and affective intensity in mother–infant interaction (Sansavini et al., 2015). More precisely, co-regulation patterns of preterm dyads were less frequently characterized by symmetry and showed more frequent unilateral elements. These characteristics of mother–infant interaction pose a risk to preterm children since symmetrical co-regulation was positively related to motor development in this group. Additionally, dyads including preterm infants were characterized by less positive and more neutral affective intensity exhibited by infants as well as their mothers.

Examining parental behavior from a long-term perspective reveals growing evidence that it is also predictive of preterm children's later development: Positive parenting during early childhood resulted in better cognitive as well as social-emotional outcomes at kindergarten entry (Maupin and Fine, 2014). More specifically, maternal sensitivity (i.e., mother following child's topic in play) and verbal reciprocity (i.e., responding vocally to infant vocalization) in 1-year-olds predicted social competence (i.e., solving hypothetical problems in a non-hostile way) in 5-year-olds (Beckwith and Rodning, 1996). In a recent study, researchers found not only that preterm infants have more problems in social situations than full-term infants (Forcada-Guex et al., 2006), but also that the mothers' and the infants' interaction behavior at 6 months of age predicted problems in social situations at 18 months of age. Mother–infant dyads in which the mothers were rated as controlling and the infants were rated as compulsive-compliant had more problems in social situations than other dyads, including preterm infants and full-term infants.

As mentioned above, preterm children show more externalizing behaviors than same-aged full-term children (Bhutta et al., 2002; Potijk et al., 2012). Again, this relationship is not independent of parental behavior: Maternal responsiveness has been found to moderate the prevalence of externalizing behavior. Preterm children of highly responsive mothers at 2 years of age show less externalizing behavior at 8 years of age than preterm children of less responsive mothers (Laucht et al., 2001). Converging findings come from another longitudinal study with a group of full-term and preterm children, in which the mothers' warm sensitivity at 2 years of age predicted social responsiveness at 4 years of age (Miller-Loncar et al., 2000).

# SOCIAL-COGNITIVE SKILLS IN PRETERM CHILDREN: EVIDENCE FROM STUDIES ON JOINT ATTENTION AND THEORY OF MIND SKILLS

This article focuses on the role of social-cognitive skills in explaining preterm children's interaction problems, since these skills develop rapidly within the first years of life and might be impaired in similar ways to intellectual and cognitive skills. In the first year of life, full-term infants typically start to attribute goals to another person's behavior (Gergely et al., 1995) and are even able to imitate observed behaviors (Meltzoff, 1988). They also start to learn words for objects (Friedrich and Friederici, 2008). In order to learn novel actions or novel words in social interactions, infants have to direct their attention to the same object as the interaction partner. This so-called 'joint attention' is regarded as a basic social-cognitive skill (Tomasello et al., 2005) that also predicts preterm infants' later social language and intelligence (Smith and Ulvund, 2003): In particular, the initiation of joint attention (and not the response to offers of joint attention) contributes to later IQ. Preterm infants' attention also mediates the link between the risks of prematurity and later cognitive development. Therefore, prematurity per se does not directly affect cognitive development. More likely, gestational age correlates with focused attention, which in turn is related to cognitive performance (Reuner et al., 2014).

Joint attention skills differ between preterm and full-term infants. Responding to joint attention signals (i.e., following the gaze of an experimenter) was more often observed in full-term than in preterm infants at 9 months of age (De Schuymer et al., 2011). Likewise, initiating joint attention (e.g., pointing toward an object) was more often observed in full-term than in preterm 2-year-olds (De Groote et al., 2006). Preterm infants in the first 2 years of life also showed less joint attention in terms of exploratory responses such as toy manipulation as well as communicative responses such as following eye gaze, vocalizations and imitating social interaction (Garner et al., 1991). These deficits translate to the infants' behavior: Preterm infants were less likely than full-term infants to reach for toys in joint attention situations (Landry and Chapieski, 1988). Difficulties in motor skills might additionally contribute to the latter finding. In contrast, there is one report that preterm infants responded to joint attention interactions with their mothers in the same manner as full-term infants. However, preterm infants moved their attention away from situations of joint attention more often than full-term infants (Landry, 1986). Another study also demonstrated that preterm infants with medical risk factors showed a slower increase in social initiation (but not in social response) than preterm infants without medical risk factors or full-term infants (Landry et al., 1997a). The reasons for differences in joint attention skills between full-term and preterm infants may be manifold. First, preterm infants look away from the parents' face more often and are less responsive than full-term infants (Crnic et al., 1983; De Schuymer et al., 2012). Second, preterm infants show general problems in attention, such as shifting gaze to peripheral stimuli, in which they are slower than full-term infants (De Schuymer et al., 2012). Third, the severity of medical risk factors of preterm infants is negatively correlated with abilities to regulate attentional processes such as longer looks to an experimenter's talking in motherese compared to full-term infants (Eckerman et al., 1994). This finding indicates that preterm infants are not less attentive in general. Rather, they are more reactive and less self-regulated in their attentional behavior than full-term infants.

However, preterm and full-term 2-year-olds were also reported not to differ in the amount of initiation of social interaction (Greenberg and Crnic, 1988). This discrepancy might be partly explained by methodological aspects: The inclusion criterion for preterm infants in Greenberg and Crnic's (1988) study was a gestational age of 38 weeks or younger, and in Landry's (1986) study, the sample size was rather low, with around 24 infants per group. These details may have obscured differences between groups.

Social-cognitive skills besides joint attention, such as imitation, goal understanding, and self-other differentiation have not yet been tested in preterm infants. Research in this regard would complement the existing knowledge about infants' social cognition and potential underlying mechanisms for preterm children's difficulties in social interaction. These social-cognitive skills might be mediated by environmental factors. For example, neonatal care, such as the Newborn Individualized Developmental Care and Assessment Program (NIDCAP), embeds the infant in the natural parent niche, avoids over-stimulation, stress, pain, and isolation, and supports self-regulation, competence, and goal orientation. NIDCAP improves brain development, functional competence, health, and life quality (Als and McAnulty, 2011). Additionally, administration of some nutrients (e.g., omega-3 long-chain polyunsaturated fatty acids) to children with a gestational age of less than 29 weeks also shows beneficial effects (Zhang et al., 2014).

Impairments in preterm children's social-cognitive abilities are not restricted to early forms like joint attention but apply to later forms as well. A variety of findings on social-cognitive skills related to Theory of Mind indicate deficits in preterm children. For example, at the age of 7 they were found to show weaker empathic development compared to full-term controls (Campbell et al., 2015). Between 8 and 11 years of age, preterm children struggle to interpret non-verbal cues from facial expressions and body movements properly (Williamson and Jakobson, 2014b). Compared to full-term children, they show a lack of competence in reasoning about somebody's emotions on the basis of these cues. This deficit may result from a preference for looking at eyes over the mouth which is not as pronounced in preterm children as it is in full-term ones (Telford et al., 2016). Additionally, when confronted with the animated triangle task (Abell et al., 2000), school-aged preterm children demonstrated weaker social attribution skills relative to full-term peers (Williamson and Jakobson, 2014a). These difficulties were indicated by inappropriate descriptions of the animations, including overattribution of mental states to randomly moving triangles and underattribution of mental states to shapes interacting socially. Future research on Theory of Mind should clarify if these attribution problems are restricted to a rather abstract level or if they exist on the interpersonal level as well. Both of the studies mentioned above revealed an association between social-cognitive deficits and negative behavioral outcomes in preterm children. These difficulties are expressed by increased 'autistic-like' traits. However, both estimations of these traits refer to parent-report exclusively. Since autistic-like traits are likely to be overestimated in preterm children (Stephens et al., 2012), especially when rated by parents (Gray et al., 2015), they have to be treated with caution.

Theory of Mind represents a social-cognitive skill that has considerable predictive power in terms of social acceptance (Slaughter et al., 2002). By means of Theory of Mind, children acknowledge the representational nature of an individual's mental state. Theory of Mind allows cognition such as perception and beliefs to be conceived of as the result of mental acts, as well as the realization that these mental acts can be wrong. The insight into false beliefs is therefore a key aspect of developing a mature understanding of others' cognitive functioning. At around the age of 4, children are able to solve a classic task of false-belief understanding by Wimmer and Perner (1983), in which the protagonist of a story, called "Maxi," puts a chocolate in a blue cupboard and goes outside. While Maxi is playing outside, his mother moves the chocolate from the blue cupboard to a green cupboard. When children were asked where Maxi will look for his chocolate when he comes back, 3-year-olds incorrectly assumed that he will look in the green cupboard, where the mother put the chocolate. In contrast, 4-year-olds were aware that Maxi believes that the chocolate is still in the blue cupboard and will accordingly look for it there. Only one study has directly tested false-belief understanding in preterm children at the age of 4 so far. The authors used two standard false-belief understanding tasks and one rather novel false-belief understanding task. Preterm children did not perform differently from full-term children on the tasks (Jones et al., 2013). This finding is surprising, because in the same study sample, preterm children showed the typical deficits in social interactions compared to full-term children. Nevertheless, the finding might be explained by the type of tasks the researchers used. Despite the standard nature of two of the tasks, their psychometric properties are rather unexplored, and there is no standardized way of conducting them.

# FUTURE RESEARCH: THEORY OF MIND IN PRETERM CHILDREN

In the present article, we showed that preterm infants' joint attention is impaired in comparison to that of full-term infants. This basic social-cognitive skill is important for the infants' later development of social interactions and learning of novel behavior. This early impairment might represent a first step in a cascade of maladjusted social development (see Bornstein et al., 2013 for a similar account on cognitive development). It is interesting, however, that little is known about preterm children's later social-cognitive skills, such as Theory of Mind.

Impaired social-cognitive skills are mirrored in problems in social interactions (Badenes et al., 2000; Slaughter et al., 2002; Banerjee and Watling, 2005). These studies showed that lower Theory of Mind abilities are associated with less social acceptance by peers. There is also evidence that the way in which parents interact with their children is related to their children's Theory of Mind. Parents who use more words that focus on mental states (e.g., to believe, to want) have children with higher Theory of Mind abilities than parents who use fewer of these words (Dunn et al., 1987; Sabbagh and Callanan, 1998; Jenkins et al., 2003). Based on the social difficulties and altered maternal interaction styles reported above, one might assume that preterm children's development of a Theory of Mind is delayed or even impaired.

Further evidence for the necessity to find out more about Theory of Mind abilities in preterm children is provided by deficits in cognitive skills that are associated with prematurity and impaired Theory of Mind abilities simultaneously. First, prematurity is related to impairments in language development (Barre et al., 2011) showing a linear relationship between gestational age and language skills (Foster-Cohen et al., 2007). Preterm children show problems in a variety of language outcomes including vocabulary size, quality of word use as well as morphological and syntactic complexity (Foster-Cohen et al., 2007). Since it is well known that several language abilities contribute to the development of Theory of Mind (Cutting and Dunn, 1999; Milligan et al., 2007; Farrar et al., 2009), one might assume that preterm children's language deficits hinder their Theory of Mind abilities.

Second, children born at 34 weeks of gestation or earlier and having a birth weight of less than 2500 g show impaired executive functions (Alduncin et al., 2014): More precisely, preschoolers born preterm were found to have difficulties concerning inhibitory control (Bayless and Stevenson, 2007; Aarnoudse-Moens et al., 2009a, 2012), working memory (Ni et al., 2011; Aarnoudse-Moens et al., 2012; Brumbaugh et al., 2014) and attention shifting (Bayless and Stevenson, 2007; Aarnoudse-Moens et al., 2009a). With the exception of inhibition, these problems persist up to adolescence (Aarnoudse-Moens et al., 2012). The executive functions listed above are closely linked to Theory of Mind tasks, which require working memory to bear in mind different perspectives and inhibitory control to suppress one's own knowledge in favor of a correct answer. Associations between executive functions and Theory of Mind are well established, especially for inhibition (Carlson and Moses, 2001) and working memory (Carlson et al., 2002). Again, these relationships indicate impaired Theory of Mind in preterm children.

As mentioned above, evidence concerning Theory of Mind abilities in preterm children is limited to one study relying solely on two tasks with unknown psychometric properties. Therefore, future research should apply a Theory of Mind battery with better psychometric properties (e.g., Peterson et al., 2012) and additional established procedures like the Children's Faux Pas Test (Baron-Cohen et al., 1999) or the "Reading the Mind in the Eyes" Test (Baron-Cohen et al., 2001). To gain a more complete insight into preterm children's Theory of Mind abilities, it would be desirable to take into account parental judgment as a further source of information by using a questionnaire inquiring about children's behavior in everyday situations (e.g., Tahiroglu et al., 2014).

# CONCLUSION

Preterm children face problems in social interactions. These problems might be based on difficulties in social-cognitive skills, and can be moderated by parental behavior. The emphasis on preterm children's motor, physiological, and intellectual development in past research should be enriched by a closer look at preterm children's social-cognitive development.

# AUTHOR CONTRIBUTIONS

NZ and SW share the first authorship. All authors made substantial contributions to the conception, writing, and editing of the work. All authors approved the final version to be published. All authors agreed to be accountable for all aspects of the work.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2017.00455/full#supplementary-material





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Zmyj, Witt, Weitkämper, Neumann and Lücke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Role of Executive Functions in Social Cognition among Children with Down Syndrome: Relationship Patterns

#### Anna Amadó<sup>1</sup> \*, Elisabet Serrat<sup>1</sup> and Eduard Vallès-Majoral<sup>1,2</sup>

<sup>1</sup> Department of Psychology, University of Girona, Girona, Spain, <sup>2</sup> Servei Neuropsicopedagògic Arlot, Girona, Spain

Many studies show a link between social cognition, a set of cognitive and emotional abilities applied to social situations, and executive functions in typically developing children. Children with Down syndrome (DS) show deficits both in social cognition and in some subcomponents of executive functions. However, this link has barely been studied in this population. The aim of this study is to investigate the links between social cognition and executive functions among children with DS. We administered a battery of social cognition and executive function tasks (six theory of mind tasks, a test of emotion comprehension, and three executive function tasks) to a group of 30 participants with DS between 4 and 12 years of age. The same tasks were administered to a chronological-age control group and to a control group with the same linguistic development level. Results showed that, apart from deficits in social cognition and executive function abilities, children with DS displayed a slight improvement in those abilities with increasing chronological age and language development. Correlational analysis suggested that working memory was the only component that remained constant in the relation patterns of the three groups of participants, and that the relation patterns were similar among participants with DS and the language development control group. A multiple linear regression showed that working memory explained more than 50% of the variability of social cognition in DS participants and in the language development control group, whereas in the chronological-age control group this component only explained 31% of the variability. These findings, and specifically the link between working memory and social cognition, are discussed on the basis of their theoretical and practical implications for children with DS. We discuss the possibility of using working memory training to improve social cognition in this population.

Keywords: children, Down syndrome, executive functions, social cognition, working memory

#### Edited by:

Daniela Bulgarelli, Aosta Valley University, Italy

#### Reviewed by:

Josie Booth, University of Edinburgh, UK

Laura J. Hahn, University of Illinois at Urbana–Champaign, USA

> \*Correspondence: Anna Amadó anna.amado@udg.edu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 19 April 2016 Accepted: 26 August 2016 Published: 13 September 2016

#### Citation:

Amadó A, Serrat E and Vallès-Majoral E (2016) The Role of Executive Functions in Social Cognition among Children with Down Syndrome: Relationship Patterns. Front. Psychol. 7:1363. doi: 10.3389/fpsyg.2016.01363

# INTRODUCTION

Down syndrome (DS) is the most common genetic syndrome associated with intellectual disability (Canfield et al., 2006). Many studies have described the pattern of relative weaknesses and strengths in this population. Previous studies also suggest that social cognition and executive functions are critical abilities to ensure a better quality of life for infants, children and adults, including those with DS. However, the number of studies on the relation between social cognition and executive functions in DS is small.

# Social Cognition in Children with Down Syndrome

The importance of competent social cognition abilities in having a satisfying personal life and social interactions is widely accepted. In this study, we understand social cognition as a set of abilities that involve cognitive capability applied to social situations (Harvey and Penn, 2010). Thus, this set of abilities includes understanding mental states and intentions in oneself and in others (or what has traditionally been known as theory of mind), emotional recognition and perception, and social knowledge, among others. According to this definition, some authors have suggested that socio-cognitive abilities can be divided into two parts (Shamay-Tsoory et al., 2006; Tirapu-Ustárroz et al., 2007): a part more connected with cognitive aspects, and another part more related to affective aspects. On the one hand, from the cognitive perspective, understanding the difference between knowledge of self and others has been considered essential. On the other hand, from the affective perspective, the empathic appreciation of the emotional state of others has been considered essential.

This study will address both aspects of social cognition. Regarding the more cognitive aspect, we will focus on one of the most widely studied developmental milestones (the comprehension of explicit first-order false beliefs) as well as on other abilities that are acquired either before or after it. As for the more affective aspect, we will explore different dimensions of emotional comprehension above and beyond facial expression recognition.

Initially, as a result of studies like the one conducted by Baron-Cohen et al. (1985), and descriptions of people with DS as individuals who are especially friendly and interested in others, highly sociable and with few social problems, it was postulated that these children had no particular difficulty in theory of mind development.

However, subsequent research has revealed such difficulties (Binnie and Williams, 2002; Giaouri et al., 2010). For example, the study conducted by Giaouri et al. (2010) suggested that children with DS have difficulties in understanding false beliefs and appearance-reality compared with typically developing children and children with intellectual disability of unknown etiology. Molina and Amador (2010) found that, when offered the necessary help, children with DS are able to exhibit a similar performance to that of their peers with typical development.

With regard to the more affective aspect, most studies have focused on the recognition of facial expressions in others, this being a necessary ability to respond appropriately in situations requiring social interaction. Studies like the one by Wishart et al. (2007) show that, among all aetiologies of intellectual disability, children with DS are the only ones that exhibit a significantly lower performance than children with typical development in interpreting facial expressions. Previous studies reinforce this idea regarding difficulties with emotional recognition among children with DS (e.g., Kasari et al., 1995), some suggesting particular difficulty in recognizing fear, surprise and anger (Hippolyte et al., 2008). In fact, as suggested in the work carried out by Kasari et al. (2001), it is possible that emotional recognition among children with DS is more related to their mental than their chronological age.

# Executive Functions in Children with Down Syndrome

For a long time, there has been a lack of clarity over which abilities are included under the concept of executive functions. Some authors have posited that they are higher-order control processes, while others have defined them as processes aimed at achieving a milestone; some have emphasized the constructive and creative aspect, while others still have focused on working memory. In an attempt to encompass all these approaches, Hughes (2011) and Low and Simpson (2012) conceptualized executive functions as an umbrella term that includes a set of complex cognitive abilities that guide actions aimed at a goal and adaptive responses to new or complex situations. According to Diamond (2013), the core components of executive functions are inhibition, working memory, and cognitive flexibility. Our study therefore focuses on these three components.

To succeed in life and have a healthy social, cognitive and psychological development we need to: control our attention, thoughts, emotions or behavior (called inhibitory control), hold information in mind and work with it (called working memory), and based on these two skills, change spatial and interpersonal perspectives (called cognitive flexibility). As suggested by Pennington and Bennetto (1998), people with DS are expected to exhibit deficits in these executive function skills due to the fact that they have often been described as having persistent behavior (Wilding et al., 2002). However, some studies suggest that in children with DS, difficulties with executive functions do not occur equally across all of these components (Rowe et al., 2006; Kogan et al., 2009). For example, the study by Pennington et al. (2003) only describes difficulties in the components associated with the functioning of the hippocampus (such as long-term visual and verbal memory). The one conducted by Lanfranchi et al. (2009) finds them in a simultaneous task on spatial working memory but not in one on spatial sequencing. The study conducted by Carney et al. (2013) shows that, compared with children with typical development of the same mental age, children and adolescents with DS have difficulty with working memory but not with inhibition and fluency.

In a developmental study by Costanzo et al. (2013) designed to test the hypothesis of etiological specificity, different aspects of the executive functions were evaluated in children, adolescents and adults with DS, Williams syndrome, and a typical development group of the same mental age. Both groups with intellectual disabilities displayed difficulties with some components of the executive functions, such as selective attention and working memory, but not others, such as inhibition. A different pattern was also found according to the etiology of the disability, with participants with DS displaying a more affected performance in cognitive flexibility, memory and verbal inhibition.

Recent studies have analyzed executive functions in DS using parent and teacher reports such as the Behavior Rating Inventory of Executive Function-Preschool (BRIEF-P; Gioia et al., 2003). In the study conducted by Lee et al. (2011), caregivers of children with DS completed this report. Results suggest a specific pattern of executive function weaknesses in this population; working memory was the most impaired domain and emotional control the least impaired one. In the study by Daunhauer et al. (2014), by contrast, it was the teachers of children with DS who completed the report. The results of this study suggested that in the area of school function, children with DS showed a distinct pattern of strengths and weaknesses. In the Activity Performance domain, children with DS were reported to have the greatest challenges in recreational movement, following social conventions, functional communication, positive interaction, and behavior regulation, among others. In the Task Supports domain, children with DS were reported to need more assistance than adaptations, and supports for cognitive-behavioral tasks formed the subdomain in which they required the highest levels of assistance. Apart from that, it is important to highlight that executive function was the only predictor of school function in this study. The idea of greater difficulty with cool EF suggested by these studies was also found in the work by Lee et al. (2015).

Furthermore, some studies suggest a dissociation between verbal and visuo-spatial abilities in children with DS (Laws, 2002; Brock and Jarrold, 2005), because they show deficits in verbal but not in visuo-spatial working memory abilities (Lanfranchi et al., 2012).

Research on executive functions thus seems to indicate a particular profile of abilities and difficulties in children with DS, with some components more preserved (such as inhibition or visuo-spatial working memory) and others more affected (such as working memory, verbal inhibition, or cognitive flexibility). Therefore, this second group of components could require the design of interventions to improve them.

# The Role of Executive Functions in Social Cognition

In the pioneering study by Russell et al. (1991) on children with typical development, a positive association was found between performance in a false belief task and in a strategic deception task. Although deception tasks have traditionally been considered a social cognition measure, in the study by Russell et al. (1991) the task was treated as an executive control task. According to Hala and Russell (2001), children's difficulty in this task is related to its executive control demands, particularly the dual requirement to hold the task rules in mind and to inhibit a prepotent response of pointing directly at the treat.

Since then, several studies have confirmed the relationship between individual differences in executive functions and individual differences in theory of mind (Carlson and Moses, 2001; Carlson et al., 2002, 2004a,b). The nature of this relationship is unclear. However, one of the perspectives argues that executive functions are needed for theory of mind (Russell, 1996, 1998; Hughes and Ensor, 2007; Austin et al., 2014). So, in this study, we will focus on the role of executive function components in social cognition.

Of all the theory of mind abilities, the most studied in this regard has been the understanding of false belief, which in general terms has been positively associated with flexibility, inhibition and working memory, but not planning.

With regard to cognitive flexibility, both correlational and training studies show the presence of a relationship between this component and theory of mind. Carlson and Moses (2001), for example, found relationships between various theory of mind tasks and a cognitive flexibility task. Specifically, Carlson and Moses (2001) found that inhibitory tasks requiring a novel response in the face of a conflicting prepotent response significantly predicted performance in theory of mind tasks. However, inhibitory tasks requiring the delay of a prepotent response were not significant in the same analysis. Additionally, Zelazo et al. (2002) found that a poor performance in theory of mind tasks might be caused by a lack of ability to integrate two contradictory rules into one system. A training study conducted by Kloo and Perner (2003) showed that false belief training improves performance in a card classification task that evaluates cognitive flexibility, and vice versa.

In relation to inhibition, Hughes (1998) found a correlation between the performance of a deception task and inhibitory control. A year later, Perner and Lang (1999) confirmed this relationship, and Carlson and Moses (2001) subsequently also found a strong association between inhibition and false belief. Further studies such as that by Carlson et al. (2002) have confirmed this association.

Regarding working memory, Olson, Keenan and colleagues (Gordon and Olson, 1998; Keenan et al., 1998) suggested that the ability to hold two conflicting perspectives on the same stimulus is a prerequisite for promoting the development of social cognition. In line with this, Davis and Pratt (1995) found that children under 4 did not succeed in false belief tasks because of difficulties in working memory. Gordon and Olson (1998) described a correlation between working memory and both an appearance-reality task and a false belief task. Subsequent studies have confirmed this relationship (Keenan et al., 1998; Hala et al., 2003; Mutter et al., 2006), although there are also studies that suggest the opposite (Hughes, 1998; Slade and Ruffman, 2005).

The relationship between the understanding of false belief and the executive functions has been described for different stages of development (Carlson et al., 2004a; Dumonthiel et al., 2010), in longitudinal studies (Flynn, 2006; Hughes and Ensor, 2007), when the tasks involve minimal executive demands (Perner et al., 2002; Moses and Carlson, 2004) or in populations with atypical development, such as autism spectrum disorders (Pellicano, 2007).

However, it is unclear which components of the executive functions display a stronger relationship with theory of mind abilities. Zelazo et al. (1996) examined the relationship between social cognition and executive functions in adults with DS. They found that performance in a set of theory of mind tasks and in a card sort task was positively correlated when mental age was controlled. However, we are not aware of further studies or results in infants or children with DS.

Therefore, the aim of this study is to take a more in-depth look at this relationship in children with DS. We want to investigate the role of executive functions in social cognition among children with DS and compare this relationship with the one described for children with typical development of the same chronological age and a similar level of linguistic development (LD). In addressing this aim we will be analyzing the social cognition and executive function abilities of children with DS in depth and will focus on how these abilities evolve with increasing chronological age and language development.

# MATERIALS AND METHODS

#### Participants

A total of 90 participants (aged 2;9 – 12;2) took part in the study, divided into three groups of 30 participants: one group of children with a medical diagnosis of DS, and two control groups composed of children with typical development; one with the same chronological age as the participants with DS (CA), and a second with the same level of LD. Children with other associated difficulties were excluded from the study.

As shown in **Table 1**, the CA group participants had the same chronological age and gender distribution as the DS group participants. According to the t-test, the age of these two groups was significantly higher than that of the LD group participants (p < 0.0005), while the LD group participants had a similar level of language development (±4 months) to that of the DS group participants (the gender variable was not taken into account in the construction of this control group). The t-test also shows that the language level of the CA group participants was significantly higher than that of the DS and LD groups (p < 0.0005).

#### Tasks

In this paper, we evaluate two aspects of participants' cognitive functioning: social cognition and the executive functions. Below, we detail the tasks used to evaluate each of these aspects.

#### Social Cognition Tasks

To evaluate social cognition, we used six tasks that have been traditionally used to evaluate theory of mind, and one test of emotional understanding.

We divided the theory of mind evaluation tasks into three levels of difficulty according to their distance from the understanding of first-order false beliefs, and we included two tasks in each level. In addition, a pilot study with children with DS was conducted to ensure that participants understood the tasks (Amadó et al., 2012). We designed visual and verbal aids to compensate for the comprehension difficulties detected.

In Level 1, we used two tasks to evaluate abilities that children with typical development acquire before first-order false belief understanding. We administered an adaptation of the Diverse Beliefs task designed by Wellman and Liu (2004) and an adaptation of the Seeing is Knowing task developed by Pratt and Bryant (1990). In the Diverse Beliefs task, children had to predict the behavior of the story character according to the character's beliefs. Note that the character always held a belief contrary to the child's. In the Seeing is Knowing task, we evaluated the child's capacity to understand the relationship between seeing (or not seeing) the content of a closed box and knowing what object was inside the box. In both tasks, we used pictures to tell the story and facilitate understanding. The score for each of these tasks was one point, meaning the highest score at this level was two points. In order to make the score at this level equivalent to the scores in the other theory of mind levels, we doubled the total score for Level 1.

In Level 2, we included first-order false belief tasks. Thus, we administered the Unexpected Content task, based on the procedure designed by Gopnik and Astington (1988), and the Change of Location task, designed by Wimmer and Perner (1983). In the Unexpected Content task, we used a tube of Smarties® with rocks inside. After exploring the tube, we showed its real content to the child. Then we asked them what they thought there was inside the tube before opening it, and what their friend would think the tube contained without seeing the real content. In the Change of Location task the child had to predict, in a story represented with small dolls, the behavior of a character when the character held a false belief about the location of a hidden object.

Each of the Level 2 tasks was awarded two points, meaning the highest score for this level was four points.

At Level 3, we used tasks which, according to developmental research, are successfully completed after the first-order false belief. Specifically, we used an adaptation of the Deception task designed by Sodian (1991) and a Second-Order Change of Location task based on the procedure devised by Sullivan et al. (1994). The Second-Order Change of Location task followed a procedure similar to that described in the first-order false belief task. However, in this story the child had to predict the behavior of a character involved in a second-order recursive situation. In the Deception task the child played a game with puppets of Little Red Riding Hood and the Wolf, in which the player who collected more stars was the winner. To win stars the child had to help Little Red Riding Hood (who gave the stars to the child whenever she found one) and had to deceive the Wolf (who kept the stars for himself). Each of the tasks was awarded two points, meaning the highest score for this level was also four points.

TABLE 1 | Characteristics (sex, age, language, and IQ) of each group of participants.

<sup>a</sup>Language development was calculated using the score obtained in the Peabody Picture Vocabulary Test or PPVT (Dunn et al., 2006); <sup>b</sup>The IQ score was obtained via administration of the Raven Progressive Matrices (Raven et al., 1996).

Finally, in order to evaluate emotional understanding, we administered an adaptation of the Emotional Comprehension Test by Albanese and Molina (2008). Of the nine components included in the original version of this instrument, we administered six, selected on the basis of results from the study by Pons et al. (2004): recognition (I), external causes (II), desires (III), beliefs (IV), memory (V), and hidden emotions (VII). In the administration of each component, we followed the instructions of the original test. Also following the instructions in the manual, each component that was passed scored one point, meaning the maximum score in this task was six points.

In all of our analyses in the results section, we consider scores from the theory of mind and emotional understanding tasks together, giving a maximum score of 18 points for social cognition.
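The composite described above can be illustrated with a short sketch. It follows the point values stated in the text (Level 1 tasks worth one point each and doubled at level total, Level 2 and Level 3 tasks worth two points each, six emotional understanding components worth one point each). The class and field names are hypothetical and are not taken from the original materials.

```python
from dataclasses import dataclass


@dataclass
class SocialCognitionScores:
    """Raw task scores for one child (field names are illustrative only)."""
    diverse_beliefs: int            # Level 1, range 0-1
    seeing_is_knowing: int          # Level 1, range 0-1
    unexpected_content: int         # Level 2, range 0-2
    change_of_location: int         # Level 2, range 0-2
    deception: int                  # Level 3, range 0-2
    second_order_location: int      # Level 3, range 0-2
    emotion_components_passed: int  # emotional understanding, range 0-6


def total_social_cognition(s: SocialCognitionScores) -> int:
    """Return the 0-18 composite: three theory of mind levels plus emotions."""
    # Level 1 is doubled so that each theory of mind level is worth 4 points.
    level1 = 2 * (s.diverse_beliefs + s.seeing_is_knowing)
    level2 = s.unexpected_content + s.change_of_location
    level3 = s.deception + s.second_order_location
    return level1 + level2 + level3 + s.emotion_components_passed


# A child who passes every task reaches the maximum composite score.
perfect = SocialCognitionScores(1, 1, 2, 2, 2, 2, 6)
print(total_social_cognition(perfect))
```

Doubling Level 1 keeps the three theory of mind levels equally weighted (4 points each), so the 12 theory of mind points and the 6 emotional understanding points add up to the stated maximum of 18.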

#### Executive Function Tasks

In accordance with the results presented in the study by Miyake et al. (2000), in this study, we used different tasks to evaluate three of the components of the executive functions: working memory, inhibition, and cognitive flexibility.

To evaluate working memory, we administered a version of the visual-spatial memory task used by Lanfranchi et al. (2004), which we called the Frog Task. This entailed administering a total of eight tests (and two trial tests) divided into four levels of difficulty, in which the child had to follow two rules simultaneously. In this task, we used a frog and a board with squares. The frog jumped from one square to another. The child had to remember the frog's starting position on the board (first rule) and to hit the table with their hand when the frog jumped onto a red square (second rule). Each test was awarded one point only if the participant completed the game successfully, meaning the maximum score for the task was eight points.

To evaluate inhibition, we used a simplified version of the Stroop test, the Day-Night Task (Gerstadt et al., 1994). After two trial tests, we administered 16 tests in random order in which the participant had to inhibit their predominant response to a visual stimulus. We designed a PowerPoint presentation in which two images (a sun and a moon) appeared randomly. When the child saw the sun they had to say "night" (inhibiting the predominant response, "day"), and when the moon appeared on the screen the child had to say "day" (inhibiting the predominant response, "night"). One point was awarded for each correct answer, meaning the maximum score for the task was 16 points.
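As a minimal sketch of the scoring rule just described, the following function awards one point for each trial in which the child gives the response opposite to the picture shown. The function name, the stimulus labels, and the input format (a list of stimulus and response pairs) are assumptions made for illustration.

```python
# Correct (inhibited) response for each picture in the Day-Night Task.
EXPECTED_RESPONSE = {"sun": "night", "moon": "day"}


def score_day_night(trials):
    """Count correct answers; `trials` is a list of (stimulus, response) pairs."""
    return sum(
        1
        for stimulus, response in trials
        if EXPECTED_RESPONSE[stimulus] == response.strip().lower()
    )


# Hypothetical session: 16 trials, with two errors on sun trials.
session = [("sun", "night"), ("moon", "day")] * 7 + [("sun", "day"), ("sun", "day")]
print(score_day_night(session))
```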

Finally, we evaluated cognitive flexibility by means of an adapted version of the Wisconsin Card Sorting Test developed by Fisher and Happé (2005), which comprises a card classification game using different shapes (triangle, circle, square), colors (red, blue, yellow), and numbers (1, 2, 3). The experimenter placed three stimulus cards in front of the child to define the three categories into which the response cards were to be classified. Then, the experimenter gave the child each of the response cards in turn and asked them to place it in the correct category. In accordance with the hidden classification rule governing each part of the game (color, shape, number, and color again), the experimenter gave the child feedback (correct or incorrect classification). The child had to discover the classification rules during the game. Following the recommendations of these authors, we used the number of completed dimensions (unaided) as a measure of overall success in this task. Therefore, the maximum score in this task was four points (classification criteria: color, shape, number, and color).

For a more detailed description of the procedure of the tasks used, see the appendix of the study conducted by Amadó et al. (2012).

#### Procedure

We contacted the DS participants through various organizations dedicated to the care of people with this etiology of intellectual disability. Participants in the CA and LD groups were selected according to their chronological age, gender, and language development from different schools in provinces of Catalonia (Spain).

In all three groups, the parents/legal guardians were first duly informed of the purpose and requirements of the study by means of an explanatory letter requesting consent for their son/daughter's participation. We then carried out between two and four individual sessions (at the school, foundation/association or child's family home) in order to administer the tasks. The amount of time spent administering each of the tasks varied from one participant to another, but the order of administration was always the same: vocabulary, intelligence, executive functions (working memory, inhibition, and cognitive flexibility), and social cognition (emotional understanding, theory of mind tasks according to their order of difficulty).

In some analyses, participants with DS will be divided into different subgroups. According to their chronological age, we will distinguish between three subgroups with 10 participants in each: the younger group (4;0 – 6;11), the middle group (7;0 – 9;11), and the older group (10;0 – 12;11). According to their level of language development (based on the score obtained in the PPVT by Dunn et al., 2006), we will establish two groups with 15 participants in each: the low LD group (0;0 – 4;0) and the high LD group (4;1 – 8;12).
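The subgroup boundaries above are given in years;months notation. A minimal sketch of how such ages could be converted to months and assigned to a subgroup is shown below. The helper names are hypothetical, and the range limits are copied from the text as written (including the upper limit 8;12, which the sketch simply reads as 8 years and 12 months).

```python
def to_months(age):
    """Convert an age written as 'years;months' (e.g. '4;1') to total months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


# Inclusive ranges as stated in the text.
AGE_GROUPS = [
    ("younger", "4;0", "6;11"),
    ("middle", "7;0", "9;11"),
    ("older", "10;0", "12;11"),
]
LD_GROUPS = [
    ("low LD", "0;0", "4;0"),
    ("high LD", "4;1", "8;12"),
]


def assign_group(age, groups):
    """Return the label of the first range containing `age`, or None."""
    months = to_months(age)
    for label, lower, upper in groups:
        if to_months(lower) <= months <= to_months(upper):
            return label
    return None


print(assign_group("8;3", AGE_GROUPS))  # chronological age subgroup
print(assign_group("4;1", LD_GROUPS))   # linguistic level subgroup
```

Working in months avoids comparing the years;months strings directly, which would sort "10;0" before "4;0".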

# RESULTS

As **Table 2** shows, the DS group participants scored significantly lower than the children with typical development in both control groups (CA and LD) in all of the administered tasks. The CA group obtained the highest scores in every task, followed by the LD group and then the DS group.



<sup>a</sup>Social Cognition (range: 0–18); <sup>b</sup>WM, working memory (range: 0–8); <sup>c</sup>INH, inhibition (range: 0–16); <sup>d</sup>FLEX, cognitive flexibility (range: 0–4); <sup>e</sup>Contrasts were identified using the t-test. ∗∗∗p < 0.001; ∗∗p < 0.01; <sup>∗</sup>p < 0.05; t.s. when p > 0.05; and n.s. when p > 0.1. <sup>f</sup>The values of Cohen's d are presented in this order: DS-CA, DS-LD, and CA-LD.

TABLE 3 | Means (standard deviations) in the social cognition and executive function tasks in each group of participants with Down syndrome.


<sup>a</sup>Means (and standard deviations) of chronological age and linguistic age (calculated using the score obtained in the Peabody Picture Vocabulary Test or PPVT by Dunn et al., 2006) of the DS groups; <sup>b</sup>Social Cognition (range: 0–18); <sup>c</sup>WM, working memory (range: 0–8); <sup>d</sup>INH, inhibition (range: 0–16); <sup>e</sup>FLEX, cognitive flexibility (range: 0–4); <sup>f</sup>Contrasts were identified using the t-test. ∗∗∗p < 0.001; ∗∗p < 0.01; <sup>∗</sup>p < 0.05; t.s. when p > 0.05; and n.s. when p > 0.1.

As the t-test shows, the mean score of participants with typical development was significantly better than the mean score of participants with DS in all the tasks. According to Cohen (1988), the effect sizes of all these comparisons are large. It is also important to note that, in all the tasks, the largest effects are observed between the DS and CA participants.
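For readers unfamiliar with the effect size used here, the sketch below computes Cohen's d for two independent groups and labels it with the conventional benchmarks from Cohen (1988): 0.2 small, 0.5 medium, 0.8 large. The text does not state which variant of d was computed, so the pooled standard deviation form is an assumption, and the scores in the example are invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


def label_effect(d):
    """Label the magnitude of d with Cohen's (1988) conventional benchmarks."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Invented scores, not data from the study.
control = [15, 16, 17, 18, 16]
clinical = [8, 10, 9, 12, 11]
d = cohens_d(control, clinical)
print(round(d, 2), label_effect(d))
```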

However, it is interesting to analyze how performance in these tasks changes with increasing chronological age and language development in children with DS. On the one hand, to test the effect of chronological age we divided participants with DS into three groups: the younger group (4;0 – 6;11), the middle group (7;0 – 9;11), and the older group (10;0 – 12;11). On the other hand, to test the effect of linguistic level, we divided participants with DS into two groups based on the score obtained in the PPVT (Dunn et al., 2006): the low LD group (0;0 – 4;0) and the high LD group (4;1 – 8;12). The mean scores for the chronological age and linguistic level groups into which we divided the participants with DS are shown in **Table 3**.

As the above table shows, both chronological age and LD were relevant factors in mastering social cognition in participants with DS. Thus, as the chronological age and linguistic level of this group of children increased, their social cognition abilities improved, with significant differences observed in the group of participants with older chronological age and both linguistic level groups.

With regard to the executive functions, we saw that performance in the tasks of working memory and cognitive flexibility also improved with both chronological age and LD of participants with DS. Therefore, these two developmental factors were also relevant to mastering these two components of the executive functions. Specifically, we observed a significant improvement in working memory in the younger chronological age group and the two linguistic level groups, and a significant improvement in flexibility in the older age group and the two LD groups. In contrast, in the inhibition task, we only observed a significant improvement in the two linguistic level groups into which we divided the participants with DS. With inhibition, it seems that chronological age was not an important factor, and in fact, the scores obtained by all three age groups for this task were very close to the maximum score.

#### TABLE 4 | Correlations between social cognition and executive function components for each group of participants.

The values in the table are Pearson correlation coefficients (r) and their significance (∗∗∗p < 0.001; ∗∗p < 0.01; <sup>∗</sup>p < 0.05; t.s. when p > 0.05; and n.s. when p > 0.1). The correlations appear in this order in the table: DS, CA, and LD.

**Table 4** illustrates the patterns of relationship between social cognition and the executive functions for each of the groups participating in this study. To grade the strength of a correlation, we used the criteria described by Bisquerra (2004): r = 1 perfect correlation, 0.8 < r < 1 very high correlation, 0.6 < r < 0.8 high correlation, 0.4 < r < 0.6 moderate correlation, 0.2 < r < 0.4 low correlation, 0 < r < 0.2 very low correlation, and r = 0 no correlation.
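The grading criteria above can be written as a small lookup, sketched below. Two details are assumptions rather than part of the cited criteria: values falling exactly on a boundary (for example r = 0.6) are assigned to the higher band, since the stated intervals leave them undefined, and negative coefficients are graded by their absolute value.

```python
def correlation_strength(r):
    """Grade a Pearson coefficient with the bands attributed to Bisquerra (2004).

    Assumptions: boundary values go to the higher band, and the sign of r is
    ignored when grading strength.
    """
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size == 1:
        return "perfect"
    if size >= 0.8:
        return "very high"
    if size >= 0.6:
        return "high"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "low"
    if size > 0:
        return "very low"
    return "none"


for r in (0.72, 0.45, 0.15):
    print(r, correlation_strength(r))
```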

In the DS group there was a significant correlation between social cognition and all components of the executive functions evaluated. The strongest correlation was with working memory, this being both positive and high. The correlation with components of cognitive flexibility and inhibition was also positive but moderate. Similarly, in the LD group social cognition displayed a significant, positive and high correlation with both working memory and flexibility. In contrast, in the CA group a significant correlation was only found between social cognition and working memory, this being positive and moderate. It is therefore interesting to point out that working memory was the only component of the executive functions that remained constant in the relationship patterns of the three groups of participants.

It is worth noting that, as the above table shows, internal correlations between the different components of the executive functions followed different patterns across groups. The CA group diverged most from the other two, showing no significant correlations between executive function components. In the DS and LD groups, on the other hand, working memory correlated with cognitive flexibility.

To evaluate the predictive capacity of each executive component, we fitted a multiple linear regression model (independent/predictor variables: scores in working memory, inhibition, and cognitive flexibility; dependent variable: total score in social cognition abilities). The results of the multiple linear regressions (using the Enter method in SPSS) pointed in the same direction: for all three groups, the regression model included only working memory as a predictor of the social cognition score. However, it is interesting to note that although the predictor of social cognition is the same for all three groups, the percentage of variance in social cognition that this component of the executive functions was able to explain was not the same. In the DS and LD groups, the model which includes only the significant predictors (constant and working memory) explained more than 50% of the variability, whereas in the CA group this component of the executive functions explained only 30% of the variability (see **Table 5**).
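Because the final models retain only a constant and working memory, the "percentage of variability explained" corresponds to the coefficient of determination (R²) of a one-predictor least-squares fit. The sketch below shows that calculation in plain Python; it is not the SPSS procedure used in the study, the function name is hypothetical, and the scores are invented for illustration.

```python
from statistics import mean


def simple_regression(x, y):
    """Ordinary least squares with one predictor.

    Returns (slope, intercept, r_squared), where r_squared is the share of
    variance in y explained by x.
    """
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_residual / ss_total


# Invented scores, not data from the study.
working_memory = [1, 2, 3, 4, 5]
social_cognition = [4, 6, 9, 10, 14]
slope, intercept, r_squared = simple_regression(working_memory, social_cognition)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r_squared:.2f}")
```

With a single predictor, R² equals the square of the Pearson correlation between the two variables, which is why the explained variance in each group tracks the strength of the corresponding correlation.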

Although this is not the aim of the present work, we used the same procedure to analyze the predictive capacity of social cognition for each component of the executive functions (note that each of these regressions includes only one predictor variable). As **Table 6** shows, in the DS group, social cognition significantly predicted all the components of executive functions assessed, explaining more than 50% of the variance in working memory, 30% in cognitive flexibility, and 15% in inhibition. Similarly, in the LD group, social cognition was a significant predictor of working memory (explaining more than 60%) and cognitive flexibility (explaining more than 40%). Finally, in the CA group, social cognition predicted only working memory, explaining more than 30% of its variance.

# DISCUSSION

The aim of this study was to investigate the role of executive functions in social cognition among children with DS and compare it with that described for children with typical development of the same linguistic level and chronological age. We will first discuss the results that children with DS obtained for social cognition and the executive functions in the administered tasks, as well as their evolution by chronological age and language development. We will then discuss the role of executive functions in social cognition and briefly comment on the reverse relationship.

Previous studies have reported difficulties in social cognition among children with DS, both from the cognitive aspect (e.g., Binnie and Williams, 2002; Giaouri et al., 2010) and the emotional aspect (e.g., Kasari et al., 1995; Wishart et al., 2007). The results of our study point in the same direction, showing that participants with DS have deficits in all aspects of social cognition that we evaluated. We would also add, in line with the findings of Kasari et al. (2001), that these difficulties, although remaining present, are not as severe when mental age (or level of LD) is taken into account.

As for mastery of the executive functions, the results of our study on children with DS were also in line with those suggested by previous research, in particular the fact that the different components of the executive functions are affected unequally (e.g., Rowe et al., 2006). In our study, participants with DS displayed less alteration in the component of inhibition, especially when language ability was taken into account. This relative preservation of inhibition compared to the other components of executive functioning has also been described in research by Carney et al. (2013) on children and adolescents with intellectual disability, and in the study by Costanzo et al. (2013) on adolescents and adults with DS and Williams syndrome. Furthermore, in the latter study participants with DS showed a greater alteration in the components of flexibility and working memory when compared with children with Williams syndrome. Danielsson et al. (2012) found that adults with intellectual disabilities have difficulties in working memory and accessing lexical items, but not in inhibition. So it would seem that this tendency continues in later stages of development.

#### TABLE 5 | Multiple linear regression models of social cognition for each group of participants.

<sup>a</sup>WM, working memory; <sup>b</sup>INH, inhibition; <sup>c</sup>FLEX, cognitive flexibility. The predictive models described in this table include only the significant predictors.

TABLE 6 | Multiple linear regression models of working memory, inhibition, and cognitive flexibility for each group of participants.

<sup>a</sup>WM, working memory; <sup>b</sup>INH, inhibition; <sup>c</sup>FLEX, cognitive flexibility; <sup>d</sup>SC, social cognition. The predictive models are described only in the components in which social cognition is a significant predictor.

Beyond the difficulties described in the two aspects of social cognition and the different components of the executive functions, the results of this study suggest that, at the ages studied at least, children with DS experience improvements in these abilities with increased chronological age and language development. Lee et al. (2015) found that inhibition is the only component of executive functions that improves with age in a sample of 85 youth with DS. However, it is important to consider that their study assessed executive functions through a report completed by parents. Molina and Amador (2010) concluded that, despite the described difficulties in social cognition, when a group of children with DS were offered the necessary assistance they improved and exhibited a similar performance to that of their peers with typical development. We believe that the improvement described in participants with DS in our study supports this finding. Thus, contrary to the stagnation that has sometimes been suggested in both individuals with DS and other forms of intellectual disability, social cognition and executive function abilities improve with development, at least in the group that we have studied. However, in order to verify the presence of this improvement with increasing age in participants with DS, we would need to conduct a longitudinal study, like the one conducted by Lee et al. (2015). Our study is not longitudinal. Therefore, we can only conclude that older children with DS performed better than younger children with DS, because we cannot rule out that the older participants already had better executive function abilities at earlier ages.

With regard to the role of executive functions in social cognition, we should first take a moment to discuss the relationship between the various components of the executive functions. Miyake et al. (2000) suggested that at the beginning of their development, executive functions may be grouped under the same domain and no differentiation is made between them. According to these authors, as development progresses, these functions can be grouped into more specialized and separate components. The results of our study could be seen to agree with this, because in participants with DS and peers of the same linguistic level, both at a lower developmental level than the age-matched control group, there was a high correlation between most of the executive function components evaluated. In the age-matched control group, however, which had a higher level of development than the above groups, the correlation between the different executive function components was nonexistent. Therefore, we believe that at this age the different executive function components have become specialized, which is why the correlational analysis presented them as independent components.

Above and beyond discussion of the relationship between the different components of the executive functions, the main aim of this study was to analyze the role of executive functions in social cognition abilities in DS children.

In all three of the study groups, a relationship was found between social cognition and working memory, as described in previous studies on children with typical development (Hala et al., 2003; Mutter et al., 2006). Surprisingly, and unlike the findings of previous studies such as that by Carlson and Moses (2001) or Carlson et al. (2002), the components of cognitive flexibility and inhibition did not display any relationship with the social cognition abilities of participants in the control group by chronological age. However, a relationship was found between social cognition and cognitive flexibility in the other two groups and between social cognition and inhibition in the group of children with DS. It is worth noting that participants in the age control group obtained very close to maximum scores in all the tasks, and the lack of a relationship between these components and social cognition could therefore be caused by this ceiling effect. As the results show, participants with typical development in the age-matched control group obtained a score near the ceiling on the social cognition, working memory, inhibition, and cognitive flexibility tasks. This ceiling effect may be masking differences, whether major or subtle, between groups. The results of this study therefore need to be corroborated in future work using more appropriate instruments or tasks to evaluate social cognition and executive functions in typically developing children and children with DS.

With regard to the pattern of relationships between social cognition and the executive functions, it is important to refer to the study conducted by Zelazo et al. (1996) in adults with DS. They found that theory of mind was correlated with cognitive flexibility. However, at present we know of no previous studies conducted in children with DS with which to compare our results. What we can state, however, is firstly that the relationship established between social cognition and the executive functions in populations with typical development also extends to children with DS, and secondly that the relationship pattern described in individuals with DS is similar to that displayed by their peers with typical development with similar linguistic skills.

We might therefore consider working memory, being the only component of the executive functions in our study to display a relationship with social cognition in all three groups of participants, to be an essential element in improving social cognition in children with both typical and atypical development. However, our study is correlational, so we cannot infer from our results a causal relationship between working memory and social cognition abilities. Nevertheless, our results, together with those of other studies, point in this direction. For example, Hughes (1998) described a correlation between the understanding of false belief and working memory, a relationship that persisted with age and when controlling for verbal ability (Davis and Pratt, 1995; Keenan et al., 1998). Other studies, however, such as that by Slade and Ruffman (2005), did not find that working memory facilitated subsequent understanding of false belief in a group of children aged three to four.

Regarding the predictive capacity of working memory for theory of mind, we know of only one study that supports this hypothesis, namely that conducted by Davis and Pratt (1995). This study, using a multiple linear regression analysis on children aged between three and five, showed that working memory predicts performance in a false belief task, even when controlling for age and vocabulary. Nevertheless, the authors themselves say that success in working memory is a necessary requirement for competent performance in theory of mind.

We have also explored the opposite direction of this relationship: the role of social cognition in each component of executive functions. Our results show that social cognition is a predictive variable for working memory in all the groups, and also for other executive components in children with DS and their peers with a similar LD. However, we must be cautious with these results because a recent meta-analysis conducted by Devine and Hughes (2014) found an asymmetrical pattern of relationship between social cognition and executive functions; early executive functions predict later variation in false belief understanding more strongly than vice versa.

According to Diamond (2013), we have enough empirical evidence to say that executive functions can be improved at any age across the life cycle. More specifically, training studies like that conducted by Klingberg et al. (2005) on children with attention difficulties, but also others conducted on children with typical development, suggest that the component of working memory can be trained. However, according to Shipstead et al. (2012), the aforementioned training studies displayed many shortcomings and a definitive demonstration of the possibility of improving working memory through training was therefore still required. In response to this debate, the meta-analysis conducted by Melby-Lervåg and Hulme (2013) showed that working memory training produces only short-term effects that do not generalize to other cognitive abilities. However, in this same study the authors observed, through the analysis of a small and surely insufficient number of studies, that in visuospatial working memory the effects of training can last for up to 5 months. More recent studies report specific data about the possibility of training working memory in individuals with DS. For example, the study by Costa et al. (2015) showed that two adolescents with DS improved their performance in some (trained and non-trained) working memory tasks, specifically in visuo-spatial working memory tasks, after a six-week school-based intervention. Along the same lines, Pulina et al. (2015) found that thirty-nine children and adolescents with DS improved their performance in the spatial-simultaneous component of working memory after training administered individually (by parents or experts) over one month. It is worth noting that these improvements were maintained after a month in both groups.

Aside from the results suggested by training studies, and given that cognitive flexibility is also found to be related to working memory in children with DS and their peers of the same linguistic level, we might consider it to be another important element in improving social cognition. This was demonstrated by the findings of Fisher and Happé (2005) in research where training in the executive functions (based on cognitive flexibility) was found to be useful in improving the understanding of false belief in children with autistic disorders.

Working memory or cognitive flexibility training could be an open window for improving social cognition in this population. However, with the research available to us today, we can state that the effects of working memory training (or that of other components of the executive functions) on the understanding of false beliefs are not clear, and even less so when applied to social cognition. It is for this reason that, taking into account the contributions by Diamond (2012) regarding repeated practice as a key element in the success of executive function training and the greater benefit of this to children with poorer executive functions, we believe it necessary and useful to continue to explore this relationship in these populations.

In the future, this line of inquiry could provide the key to promoting the cognitive domains of social cognition and executive functions in children with DS. More importantly, it could also provide the key to ensuring higher levels of inclusion in society and a better quality of life for people with DS.

# CONCLUSION

The aim of this study was to investigate the role of the executive functions in social cognition among children with DS and compare it with that of their peers with typical development. Apart from one study of adults with DS, we do not know of any previous studies that have addressed this question in this population, nor have we found many studies comparing the performance of children with DS with that of their peers with typical development of the same linguistic level.

The results of our study show that, in line with the findings of previous studies, participants with DS underperformed in comparison with their peers with typical development, both in terms of social cognition and the executive functions. The most interesting finding is that the predictive role of executive functions in social cognition observed in children with DS is similar to that exhibited by their peers with typical development with the same language skills. The results of this study confirm the importance of the different components of the executive functions in this relationship and highlight the central role of working memory. Moreover, they suggest that the executive functions may be displayed as undifferentiated in early stages of development.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

This research was supported by a grant from the University of Girona (BR09/19).

# ACKNOWLEDGMENTS

We are grateful to the many children and families who participated in this study. We would also like to thank all of the schools and associations that collaborated in this study, especially Fundació Catalana Síndrome de Down (Barcelona, Spain), Fundació síndrome de Down de Girona i comarques Àstrid 21 (Girona, Spain), and Associació Espai 21 (Vic, Spain).

# REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Amadó, Serrat and Vallès-Majoral. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mental State Understanding in Children with Agenesis of the Corpus Callosum

#### Beatrix Lábadi <sup>1</sup> \* and Anna M. Beke<sup>2</sup>

<sup>1</sup> Department of General and Evolutionary Psychology, Institute of Psychology, University of Pécs, Pécs, Hungary, <sup>2</sup> Obstetric and Gynecology Clinic No. 1, Semmelweis University, Budapest, Hungary

Impaired social functioning is a well-known outcome in individuals with agenesis of the corpus callosum. Social deficits in nonliteral language comprehension, humor, social reasoning, and recognition of facial expression have all been documented in adults with agenesis of the corpus callosum. In the present study, we examined the emotional and mentalizing deficits that contribute to social-cognitive development in children with isolated corpus callosum agenesia, including emotion recognition, theory of mind, executive function, working memory, and behavioral impairments as assessed by the parents. The study involved children between the ages of 6 and 8 years along with typically developing children who were matched by IQ, age, gender, education, and caregiver's education. The findings indicated that children with agenesis of the corpus callosum exhibited mild impairments in all social factors (recognizing emotions, understanding theory of mind), and showed more behavioral problems than control children. Taken together, these findings suggest that reduced callosal connectivity may contribute to the development of higher-order social-cognitive deficits by limiting the processing of complex and rapidly occurring social information. Studies of AgCC shed light on the role of structural connectivity across the hemispheres in neurodevelopmental disorders.

Keywords: agenesis of the corpus callosum, mentalizing ability, emotion recognition, executive function, behavioral problems

# INTRODUCTION

Agenesis of the corpus callosum (AgCC) is a common cerebral malformation resulting from a failure to develop fibers that provide the largest connective tract between the two hemispheres. The corpus callosum consists of over 200 million axons that transfer information between the two hemispheres. Callosal anomalies are the most frequent malformations in the brain, with imaging studies indicating that AgCC occurs in 1:4000 live births (Wang et al., 2004; Glass et al., 2008), and 3–5% of neurodevelopmental disorders involve callosal malformation (Bodensteiner et al., 1994). The developmental absence (agenesis) of the corpus callosum can occur in a variety of conditions that disrupt the early development of the callosal fibers. Current studies suggest that callosal dysgenesis can be reflected in inborn errors of metabolism, chromosomal anomalies, or genetic syndromes (Bedeschi et al., 2006). AgCC can encompass either total or partial absence of the corpus callosum, as well as hypoplasia (formation of a thinner than expected corpus callosum). Surprisingly, the comparison of partial and total agenesis of the corpus callosum showed only slight differences in medical and behavioral outcomes (Paul et al., 2007). Patients with the syndromic form of corpus callosum dysgenesis (a callosal abnormality associated with other genetic syndromes, e.g., Aicardi syndrome) show severe developmental delay and intellectual disabilities (Sztriha, 2005), whereas individuals with isolated AgCC (Symington et al., 2010), meaning they do not have additional syndromes or disorders or evidence of other brain pathology, typically have only mild behavioral and cognitive problems (Moutard et al., 2003). However, the outcome of isolated AgCC is often unclear because intellectual development can range from severely delayed to "perfectly normal" (Paul et al., 2007).

#### Edited by:

Anne Henning, SRH Hochschule für Gesundheit Gera, Germany

#### Reviewed by:

Andrea Poretti, Johns Hopkins School of Medicine, USA; Przemyslaw Tomalski, University of Warsaw, Poland; Andrea Berger, Ben-Gurion University of the Negev, Israel

> \*Correspondence: Beatrix Lábadi labadi.beatrix@pte.hu

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 29 August 2016 Accepted: 16 January 2017 Published: 06 February 2017

#### Citation:

Lábadi B and Beke AM (2017) Mental State Understanding in Children with Agenesis of the Corpus Callosum. Front. Psychol. 8:94. doi: 10.3389/fpsyg.2017.00094

Initial studies examining individuals diagnosed with AgCC suggested impairments in their higher-order cognitive functions and social interaction. Neuropsychological studies found evidence for cognitive impairments in abstract reasoning (Brown and Paul, 2000), problem solving, and processing speed (Marco et al., 2012). These cognitive abilities become more impaired as the task's complexity increases (Brown and Paul, 2000). However, those with isolated AgCC do not show severe general cognitive disabilities (Sauerwein et al., 1994) or language impairments regarding naming, receptive language, and lexical reading abilities (Brown and Paul, 2000). Deficits were nevertheless observed in linguistic pragmatics: individuals with AgCC have difficulty understanding idioms, proverbs (Banich and Brown, 2000), and narrative humor (Paul et al., 2003), as they tend to ignore the second-order meaning of narratives or conversations.

Acallosal patients generally exhibit difficulties in social cognition and social behavior, with adults showing impairment in understanding others' mental states (Symington et al., 2010) and in recognizing emotions (Bridgman et al., 2014). The deficit in emotion recognition seems to be directly associated with atypical facial scanning; adults with AgCC spend less time in the eye region while observing emotional expressions of others (Bridgman et al., 2014), resulting in poorer detection of others' emotional and mental states (Baron-Cohen et al., 1997). The social-cognitive impairment in AgCC was also demonstrated by mild theory of mind (ToM) deficits (Symington et al., 2010), poor social self-awareness (Brown and Paul, 2000), and difficulties in social perspective taking (Symington et al., 2010; Turk et al., 2010). Overall, the findings of social cognition research suggest that AgCC patients have particular difficulties understanding complex socio-emotional and life-like contexts of everyday situations. These social cognitive impairments in people with callosal agenesis overlap with the profile of autism spectrum disorders (ASD). Individuals with ASD show similar patterns of emotion recognition, being significantly worse at recognizing emotions compared with normal controls, particularly when only the eye region of faces is presented. There is also evidence that AgCC individuals share the characteristic of impaired social cognition with ASD patients, especially with respect to the difficulties in recognizing another person's mental states, feelings, intentions, and goals. Survey studies, completed by caregivers of children with AgCC, reported that a significant number of children and adults have problems with social behaviors (Badaruddin et al., 2007), and exhibit significant autistic symptomatology (Moes et al., 2009; Lau et al., 2013). 
To clarify the relationship between the autistic-like behavior and callosal agenesis, a recent study (Paul et al., 2014) directly compared AgCC adults with ASD adults. Using the Autism Diagnostic Observation Schedule, they found that one third of the adults with AgCC met the clinical criteria for autism, whereas very few subjects were consistent with the ASD diagnosis when developmental history was included. The autistic traits seen in AgCC patients appear to emerge in differing time-courses, depending on the age of the AgCC patient.

Despite the convergent evidence reviewed here with regard to social and cognitive deficits in AgCC, much work is still needed. At present there are only a few studies that have directly examined mentalizing abilities in persons with agenesis of the corpus callosum, and these have mainly been case studies. Even less research has specifically investigated the developmental course of social-cognitive domains, including emotion recognition and theory of mind, in children with AgCC. Previous studies involving adults implicitly proposed that these impairments are not likely to be exhibited in younger children with AgCC (Paul et al., 2014), because the corpus callosum is normally not fully myelinated until adolescence (Giedd et al., 1996). However, parent-reported assessment studies have suggested that children with AgCC are more likely to exhibit autistic symptoms, including social-cognitive deficits, compared with adults (Moes et al., 2009). Additionally, the specific social and communicative abnormalities emerge early on, at about the age of three, in ASD children, who share important clinical and neuroanatomical parallels with AgCC children. Therefore, the social and cognitive deficits in emotion recognition and mentalizing capacities are more likely to occur in childhood in AgCC children. The present study addresses this issue in a sample of 6–8-year-old children with isolated AgCC. We chose this age range because the social and higher-order cognitive functions (theory of mind, understanding emotions, inhibitory control, working memory), which are necessary for school readiness, are available to typically developing children at this age.

The first aim of this study was to characterize the social and higher-order cognitive functions in a sample of 18 children diagnosed with isolated AgCC. In light of previous studies examining social and cognitive functions in AgCC and ASD individuals, we predicted that children with AgCC would perform poorly when recognizing complex mental states from faces, but would perform normally when recognizing basic emotions. Additionally, we expected that AgCC children would have more difficulty identifying mental states or emotions that involved the eye region. We also hypothesized that children with AgCC would have difficulties in inferring the mental states of others, but that this deficiency would only become apparent in more complex situations, when more information must be processed and integrated. We expected that children would be more likely to pass first-order false belief tasks, but would perform poorly in second-order false belief tasks. Alternatively, the impaired social cognitive function could reflect deficits in inhibitory control and working memory. Here, we predicted that inhibitory control would be more impaired in children with AgCC relative to typically developing children, and that impaired executive functions would make a unique contribution to theory of mind ability. In this study, we test the relative contributions of inhibitory control, working memory, and intelligence to AgCC children's social abilities (ToM and emotion and mental state recognition). Our second aim was to examine the relationship between social cognition and the severity of behavioral problems in children with and without corpus callosum agenesia. To answer these research questions, we employed validated social cognitive tasks (emotion recognition, theory of mind, executive function, and working memory), and parent-reported assessments. 
Our study is the first comprehensive direct comparison of AgCC children with typically developing children, in cognitive and social domains.

# MATERIALS AND METHODS

#### Participants

Participants included 18 children with isolated corpus callosum agenesia between the ages of 6 and 8 years, and 18 typically developing children as controls (**Table 1** shows the demographic and psychological background information). Groups were matched with respect to IQ, age, gender, children's education, and caregiver's education. The two groups had exactly the same number of males (14) and females (4), and they were perfectly matched for age and education level; each child with AgCC was individually paired with a typically developing child. In the AgCC cohort, five were left-handed and four were ambidextrous, while in the control group two were left-handed and 16 were right-handed, with handedness determined by the Edinburgh Handedness Inventory. The AgCC group included 16 children with complete agenesis of the corpus callosum and two with partial agenesis (we did not exclude the two children with partial AgCC because individual connectivity patterns and differences were beyond the scope of our study, and previous studies also included both partial and complete AgCC individuals, e.g., Lau et al., 2013; Paul et al., 2014). The inclusion criteria for both groups were: 6–8 years of age, IQ scores >75, no major head trauma or neurosurgery, no additional genetic syndromes (e.g., Aicardi syndrome), and no severe psychopathology (children with anxiety or ADHD, and children undergoing psychotherapy treatment and/or taking psychotropic medication, were excluded). Regular and neuropsychological examinations for all AgCC participants were conducted at the Neurology Department of Obstetric and Gynecology Clinic (Semmelweis University) in Budapest. The controls were recruited from the local kindergarten and primary school.

Children with AgCC were first diagnosed before birth; the absence of the corpus callosum in utero was identified on routine high-resolution ultrasound, and a magnetic resonance imaging (MRI) scan then confirmed the diagnosis. For all participants with AgCC, previous MRI and radiological reports were gathered so that an independent second neuroradiologist could confirm the diagnosis of complete or partial AgCC. Images were evaluated for the presence and size of the corpus callosum, Probst bundle, anterior commissure, white matter abnormalities, and cortical malformations (e.g., subcortical heterotopia, polymicrogyria). Participants with AgCC were included if they had structural findings that commonly co-occur with AgCC: Probst bundle, colpocephaly, and interhemispheric cysts. Children with other structural brain abnormalities (known genetic syndrome, frontal lobe dysgenesis) were excluded. The presence of the anterior commissure was confirmed in all participants. The intelligence scores, based on the Wechsler Intelligence Scale for Children - III (Hungarian standard version), were also collected from previous neuropsychological records (assessed within a year). Control participants' intelligence scores were also established using the Wechsler Intelligence Scale for Children - III.

TABLE 1 | Demographic and background psychological measures.

Group comparisons were based on the t-test for age, FSIQ, PIQ, VIQ, and caregiver education, and on the χ<sup>2</sup> test for gender, handedness, and children's education.

The caregiver of each participant signed an informed consent. All participants were treated in accordance with the Hungarian Psychological Association Ethical Codes. This study was carried out in accordance with the recommendations of Psychology Research Guidelines of the Ethical Committee of the Hungarian Psychological Association, with written informed consent from each caregiver of the subjects, in accordance with the Declaration of Helsinki.

#### Measures

#### Theory of Mind

To assess ToM, we used two classic False Belief Tasks, with some modification, to test AgCC children's ability to understand others' minds. The first-order false belief task was the traditional Smarties tube test (Perner et al., 1989). The task involves a familiar Smarties box filled with pencils instead of candies. The experimenter first asks the child "What do you think is in this box?," and the child naturally replies "Candies," because they have an expectation of what is in the box (false belief). The child is then shown the pencils inside the box. Then the experimenter closes the lid of the box and asks the child two belief questions. The first question is "When I first showed you this box what did you think was inside?," and the second question is "What will your mother (who did not see the pencils) think is inside the box?" If the child has a theory of mind, they will realize that their mother would also think candies are inside. A typically developing 4-year-old child mostly answers "Candies," referring to the other's false belief, but 3-year-old children or children with impaired ToM usually reply "pencils."

The second-order false belief task was a modification of the Sally-Anne test (Baron-Cohen et al., 1999). In the original task (Baron-Cohen et al., 1985) the child is introduced to two dolls, Sally and Anne, who are playing with a marble. The dolls put the marble in a basket then Sally leaves the scene. Anne takes the marble out of the basket and she puts it away in a different container. When Sally returns the child is asked "Where will Sally search for the marble?" The child fails the theory of mind task if she answers that Sally will search for it in the second container. The second-order modification of the Sally-Anne task is that when Sally leaves, she looks back through the key-hole while Anne is transferring the marble to the new location. When Sally returns, the test question is no longer "Where will Sally search for the marble?," instead it is "Where does Anne think Sally will search for the marble?" We used this modification of the Sally-Anne Task to test the children's second-order theory of mind ability. Children were successful if they responded correctly to both the test and control questions. These tasks were scored as pass = 1, fail = 0. Performance across the two tasks was summed (range 0–2) to create a single indicator of false belief understanding. Additionally, we also analyzed each test individually as the indicator of first-order false belief test and second-order false belief test.
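The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' scoring procedure: the names `TaskResponse`, `task_score`, and `false_belief_total` are hypothetical, and it assumes that each task has a single test question and a single combined control-question outcome.

```python
from dataclasses import dataclass


@dataclass
class TaskResponse:
    """One child's outcome on one false belief task (hypothetical structure)."""
    test_correct: bool      # answered the belief (test) question correctly
    control_correct: bool   # answered the control (reality/memory) questions correctly


def task_score(response: TaskResponse) -> int:
    """Pass = 1 only if both test and control questions are correct, else fail = 0."""
    return int(response.test_correct and response.control_correct)


def false_belief_total(first_order: TaskResponse, second_order: TaskResponse) -> int:
    """Sum the two pass/fail scores into a single indicator with range 0-2."""
    return task_score(first_order) + task_score(second_order)


# Example: passes the first-order task, fails the second-order test question.
total = false_belief_total(TaskResponse(True, True), TaskResponse(False, True))
print(total)  # 1
```

Keeping the per-task score separate from the summed score mirrors the text, which analyzes each task individually as well as the 0–2 total.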

#### Emotion and Mental State Recognition

We administered the Faces Test (Baron-Cohen et al., 1997) to measure the children's emotion and mental state recognition. The Faces Test consists of 20 photographs of an actress posing: 10 photos of basic emotions (happy, sad, angry, afraid, disgust, distress, surprise), and 10 photos of complex mental states (scheme, guilt, admire, interest, thoughtfulness, quizzical, bored, arrogant, flirting, quizzical). In our experiment, two words were typed under each photo, but only one described the target basic emotion or mental state the actress was posing. Subjects were presented with the 20 photos (10 basic emotions and 10 complex mental states) separately in a random order. For each photo, the experimenter read the two words under the photo, and the child was asked to choose the emotion/mental state that best described what the person was thinking or feeling in the picture. Each trial was scored as pass = 1, fail = 0. The dependent measure was the number of correct answers for basic emotions and for complex mental states. Performance across the two conditions was summed (range 0–20) to create a single indicator of emotion and mental state recognition.

#### Executive Function

Two executive function tasks were administered, providing measures of inhibitory control (Day and Night Stroop), and working memory (Digit Span forward and backward).

For working memory, we administered the standard Wechsler Digit Span Task to measure the working memory capacity. This test requires the examiner to verbally present digits at a rate of one per second, and children are asked to repeat the digits verbatim in the same order (forward test). The backward test requires the participant to repeat the digits in reverse order. The number of digits increases by one until the participant consecutively fails two trials of the same digit span length. The task was preceded by a brief training procedure. Two practice items preceded the experimental trials, and the task was only started if the child passed the practice trials. Children were administered two different trials of each sequence length, which ranged from two to nine.
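The discontinuation rule of the Digit Span task can be illustrated with a short Python sketch. It is not the Wechsler scoring procedure: the digit sequences produced by `make_trials` are hypothetical, and the returned span (the longest length with at least one correct trial) is an assumption, since the text does not state how the span score was derived.

```python
from typing import Callable, Dict, List

Trials = Dict[int, List[List[int]]]


def make_trials(min_len: int = 2, max_len: int = 9) -> Trials:
    """Build two hypothetical digit sequences per length (illustrative only)."""
    trials: Trials = {}
    for length in range(min_len, max_len + 1):
        first = [(i * 3 + length) % 9 + 1 for i in range(length)]
        second = [(i * 5 + length + 1) % 9 + 1 for i in range(length)]
        trials[length] = [first, second]
    return trials


def digit_span(trials: Trials,
               respond: Callable[[List[int]], List[int]],
               backward: bool = False) -> int:
    """Administer trials by increasing length; stop after two failures at one length.

    Returns the longest length with at least one correct trial (assumed scoring).
    """
    span = 0
    for length in sorted(trials):
        failures = 0
        for digits in trials[length]:
            target = digits[::-1] if backward else digits
            if list(respond(digits)) == list(target):
                span = length
            else:
                failures += 1
        if failures >= 2:  # both trials of this length failed: discontinue
            break
    return span


# Simulated child who can repeat at most four digits in the presented order.
forward_child = lambda digits: digits if len(digits) <= 4 else digits[:4]
print(digit_span(make_trials(), forward_child))  # 4
```

Passing `backward=True` scores the same responses against the reversed sequence, which corresponds to the backward test.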

To measure inhibitory control, we used the Day and Night Stroop task (Gerstadt et al., 1994), which assesses interference control in young children. We used two conditions: an Incongruent Condition to test inhibition and a Congruent Condition as a control. In the Incongruent Condition, children were instructed to say the word "night" when they saw a white sun card and to say "day" when shown a black moon card. They were thus required to say the opposite of what was shown on the day-night cards, maintain the task instructions over the procedure, and inhibit a dominant response associated directly with the stimulus while executing the subdominant response. In the Congruent Condition, children simply said what the stimulus represented. The order of the conditions was counterbalanced across participants: half of the participants started with the Congruent Condition, and the other half started with the Incongruent Condition. Each participant completed four practice trials. In each condition, 16 trials were administered, in which eight night cards and eight day cards were shown in a pseudo-random order (n = night, d = day: n, d, d, n, d, n, n, d, d, n, d, n, n, d, n, d). No feedback was given to the children. The task was presented on a computer screen and was controlled by PsychoPy, which presented the stimuli and recorded the participants' responses. The dependent measure was the total number of correct answers for each condition. Response latency was not measured because most of the children were unable to use the response panel correctly.
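As a rough illustration of how the fixed card order and the accuracy score fit together, the sketch below encodes the 16-trial sequence and counts correct answers per condition. It is not the PsychoPy script used in the study: the names `SEQUENCE`, `expected_response`, and `score` are hypothetical, and it assumes that the leading "n(ight), d(ay)" in the reported order is a legend rather than two extra trials, which yields exactly eight night and eight day cards.

```python
from typing import List

# Fixed pseudo-random card order (n = night card, d = day card), 16 trials.
SEQUENCE: List[str] = "n d d n d n n d d n d n n d n d".split()
CARD_NAME = {"n": "night", "d": "day"}


def expected_response(card: str, condition: str) -> str:
    """Correct verbal answer for one card under the given condition."""
    if condition == "congruent":
        return card  # say what the card shows
    if condition == "incongruent":
        return "day" if card == "night" else "night"  # say the opposite
    raise ValueError(f"unknown condition: {condition}")


def score(responses: List[str], condition: str) -> int:
    """Total number of correct answers (0-16) for one condition."""
    cards = [CARD_NAME[code] for code in SEQUENCE]
    return sum(resp == expected_response(card, condition)
               for card, resp in zip(cards, responses))


# A child who always names the picture fails every incongruent trial.
naming_child = [CARD_NAME[code] for code in SEQUENCE]
print(score(naming_child, "congruent"), score(naming_child, "incongruent"))  # 16 0
```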

#### Behavior Questionnaire

We used the validated Hungarian version of the extended Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997, 1999) to measure the children's emotional and behavioral difficulties. The SDQ was completed by parents; it covers the major areas of emotional and behavioral difficulties and predicts psychiatric disorders. The SDQ consists of 25 items, divided into five subscales: the prosocial subscale, the inattention-hyperactivity subscale, the emotional symptoms subscale, the conduct problems subscale, and the peer problems subscale. Each item can be scored "not true," "somewhat true," or "certainly true." The extended version includes questions that ask whether the respondent considers the young person to have a problem, and about its impact on their social and emotional life. All subscales except the prosocial subscale are summed to compute a total difficulties score. The dependent variables taken from the SDQ include the total difficulties score and the five subscale scores. The SDQ is available online at www.sdqinfo.com.
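The way the total difficulties score is built from the subscales can be sketched as follows. This is an illustrative sketch rather than the official SDQ scoring routine: the function name `sdq_scores` and the dictionary keys are hypothetical, and it assumes that each of the five subscales has five items already coded 0 ("not true"), 1 ("somewhat true"), or 2 ("certainly true"), with any reverse-keyed items recoded beforehand.

```python
from typing import Dict, List

# The four problem subscales that enter the total difficulties score.
DIFFICULTY_SUBSCALES = (
    "emotional_symptoms",
    "conduct_problems",
    "inattention_hyperactivity",
    "peer_problems",
)
ALL_SUBSCALES = DIFFICULTY_SUBSCALES + ("prosocial",)


def sdq_scores(items: Dict[str, List[int]]) -> Dict[str, int]:
    """Return the five subscale scores plus the total difficulties score.

    Assumes five items per subscale, each already coded 0, 1, or 2.
    """
    scores: Dict[str, int] = {}
    for name in ALL_SUBSCALES:
        values = items[name]
        if len(values) != 5 or any(v not in (0, 1, 2) for v in values):
            raise ValueError(f"subscale {name!r} needs five items coded 0-2")
        scores[name] = sum(values)
    # The prosocial subscale is deliberately left out of the total.
    scores["total_difficulties"] = sum(scores[n] for n in DIFFICULTY_SUBSCALES)
    return scores


example = {
    "emotional_symptoms": [1, 0, 2, 1, 0],
    "conduct_problems": [0, 0, 1, 0, 0],
    "inattention_hyperactivity": [2, 1, 1, 0, 1],
    "peer_problems": [1, 1, 0, 0, 0],
    "prosocial": [2, 2, 1, 2, 2],
}
print(sdq_scores(example)["total_difficulties"])  # 12
```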

#### Procedure

Subjects were tested individually in a quiet room, at the clinic of Semmelweis University for AgCC children, and at the local primary school for control children. All children were tested in a single session for the target tasks; the control children completed the intelligence test in a separate session. Intelligence was assessed in the AgCC children prior to the study by a neuropsychologist during a regular yearly visit. On arrival, the child was asked to sit at the table, and the experimenter explained that they were going to play some "games." Prior to each test, participants were trained on how to do the task. Parents received a child behavior questionnaire and were asked to complete and return it. All children were tested individually over two separate sessions, and there was no time limit.

#### Data Analysis

All statistical analyses were conducted using IBM SPSS Statistics (Version 22.0). We used independent-samples t-tests and χ<sup>2</sup> tests to examine the differences between the AgCC group and the control group in social-cognitive factors and behavioral problems. We then computed correlations (Pearson's r) to test our prediction that social-cognitive abilities would be associated with behavioral symptoms. We used a significance level of 0.05 (two-tailed) for all tests.

# RESULTS

#### Theory of Mind

First, we evaluated the control questions (reality and memory) of the false belief tasks, and only those subjects who passed the control questions were included in the present analysis. One AgCC child was excluded from the analysis due to failing the control questions. We compared the performance of AgCC children with that of control children on each false belief test using the χ<sup>2</sup> test. The proportion of subjects in the AgCC group who passed the first-order false belief task [Smarties tube test, χ<sup>2</sup>(1, N = 35) = 7.098, p = 0.00] and the second-order false belief task [modified Sally-Anne false belief task, χ<sup>2</sup>(1, N = 35) = 3.736, p = 0.05] was significantly smaller than that of the control group. Finally, the Mann-Whitney non-parametric test confirmed that children with AgCC performed more poorly, with lower total scores on the ToM tasks [z(35) = −2.612, p = 0.009]. **Table 2** shows the number of children who passed the Smarties or Sally-Anne M false belief tasks.

An additional analysis of covariance (ANCOVA) was conducted to investigate the effect of verbal intelligence on false belief performance. With VIQ as a covariate, false belief scores remained significantly poorer for the AgCC group [F(1, 32) = 6.42, p = 0.01, η<sub>p</sub><sup>2</sup> = 0.01]. The main effect of verbal ability (VIQ) on false belief performance was not significant [F(1, 32) = 2.90, p = 0.09, η<sub>p</sub><sup>2</sup> = 0.08].

TABLE 2 | Number of subjects in each group who passed the Smarties or Sally-Anne M false belief tasks (one AgCC child did not pass the control questions).

#### Emotion and Mental State Recognition

First, we compared the performance on recognition of basic emotions and complex mental states in AgCC and control subjects. The AgCC group was less accurate than the control group on overall Faces scores [for total scores t(35) = −3.483, p = 0.001, d = 1.16]. Subjects with AgCC showed poorer performance (M = 12.83, SD = 3.05) in selecting the target emotional and mental states compared with the control children (M = 15.55, SD = 1.29). A repeated-measures ANOVA with group (AgCC vs. control) and complexity of mental states (basic vs. complex) as factors revealed a significant group effect [F(1, 34) = 12.13, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.26], indicating that children with AgCC were less accurate than control subjects. A significant effect was also found for complexity of mental states [F(1, 34) = 15.93, p = 0.00, η<sub>p</sub><sup>2</sup> = 0.31]; children in both groups recognized the basic emotions more accurately than the complex mental states. The group × complexity interaction was not significant [F(1, 34) = 1.602, p = 0.21, η<sub>p</sub><sup>2</sup> = 0.045]. The mean scores (**Figure 1**) indicate that subjects with AgCC performed less accurately on both basic emotion trials (M = 7.16, SD = 1.65) and complex mental state recognition trials (M = 5.66, SD = 1.87) compared with the control children (for basic emotion: M = 8.16, SD = 1.04, and for complex mental states: M = 7.38, SD = 1.29).

The first term is the target (correct) term. The position of the target term and the order of the picture pairs were randomized. \*\*p < 0.01, \*p < 0.05.
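For readers who want to see how an effect size of this kind follows from summary statistics, the sketch below recomputes Cohen's d and the corresponding t statistic for the total Faces score from the means and standard deviations reported in this section. It is a sketch under stated assumptions, not the SPSS analysis itself: it assumes 18 children per group, a pooled-variance (Student) t-test, and the hypothetical helper names `cohens_d` and `t_from_d`.

```python
import math


def cohens_d(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> float:
    """Cohen's d using the pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def t_from_d(d: float, n1: int, n2: int) -> float:
    """Student t statistic implied by Cohen's d for two independent groups."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


# Total Faces scores: control M = 15.55, SD = 1.29; AgCC M = 12.83, SD = 3.05.
# Group sizes of 18 each are an assumption taken from the Participants section.
d = cohens_d(15.55, 1.29, 18, 12.83, 3.05, 18)
t = t_from_d(d, 18, 18)
print(round(d, 2), round(t, 2))  # 1.16 3.48
```

Under these assumptions the recomputed values agree with the reported effect size to rounding; the sign of t depends only on which group mean is subtracted from which.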

**Table 3** shows the number of children choosing the correct basic emotion or complex mental state for each trial. Using the χ<sup>2</sup> test to compare the performance of AgCC individuals and control subjects on each trial, the analysis revealed a significant difference between the AgCC group and the control group on only two of the basic emotion trials, "angry vs. afraid" and "sad vs. disgust." As for the complex mental state trials, children with AgCC were significantly less accurate than control subjects on the "guilt vs. arrogant," "interest vs. disinterest," and "scheming vs. arrogant" trials (**Table 3**).

#### Executive Function

#### Working Memory

The independent sample t-test results showed no significant effect for the forward digit span (p = ns.), or for the backward digit span (p = ns.).

#### Inhibitory Control

A 2 groups (AgCC vs. control) × 2 conditions (congruent vs. incongruent) mixed-model ANOVA was conducted for performance (correct response rate). The analysis of performance showed a significant main effect of condition [F(1, 30) = 65.29, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.684], a significant main effect of group [F(1, 30) = 36.39, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.548], and a significant group × condition interaction [F(1, 30) = 33.61, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.527]. The results demonstrate that Stroop-like interference is higher in children with AgCC (M = 14.0, SD = 0.41) compared with control children (M = 17.35, SD = 0.46), regarding the performance (**Figure 2**).
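The partial eta squared values reported with these F tests can be recovered from the F statistic and its degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three Stroop effects; the function name `partial_eta_squared` is hypothetical, and small discrepancies in the third decimal are expected because the reported F values are themselves rounded.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Day and Night Stroop effects as reported in the text, each with F(1, 30).
reported = {
    "condition": (65.29, 0.684),
    "group": (36.39, 0.548),
    "group x condition": (33.61, 0.527),
}

for effect, (f_value, reported_eta) in reported.items():
    recomputed = partial_eta_squared(f_value, 1, 30)
    print(f"{effect}: recomputed {recomputed:.3f}, reported {reported_eta:.3f}")
```

The same identity can be used as a quick consistency check on any F test in the Results for which the degrees of freedom are given.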

#### Relationship between Social and Cognitive Factors

We computed additional correlational analysis for both groups separately to determine correlations between social and cognitive factors (**Table 4**). Within the AgCC group, there was a medium strength correlation between forward digit span and basic emotion recognition, as well as between backward digit span and complex mental state recognition and inhibitory controls, with significant correlation coefficients ranging from r = 0.41 to r = 0.62, p < 0.05. For the inhibitory control measures, there was no significant correlation with social factors (false belief and emotion and mental state recognition). Similarly, intelligence factors (general IQ, verbal IQ, and non-verbal IQ) also showed no significant correlation with social and cognitive factors (theory of mind, emotions, and mental state recognition, inhibitory control and working memory). For the control group, significant correlations were only found between backward digit span and false belief [r(18) = 0.51, p = 0.03].

#### Behavioral Questionnaire

We compared differences in behavioral, emotional and relationship problems between the two groups using independent sample t-tests and computed Cohen's d effect sizes. Comparison of the SDQ subscale scores and the total difficulties score revealed that children with AgCC had significantly higher mean scores (p < 0.001) on the total difficulties scale and on all subscales, except for the inattention-hyperactivity subscale (p = ns.). **Table 5** shows the mean scores and Cohen's d effect size values in the AgCC cohort (**Figure 3**).

False belief, FACES BE, Basic Emotion recognition, FACES MS Mental States, FACES Total, Day-night, Day and Night Stroop, Digit Span Forward, Digit Span Backward \*\*p < 0.001, \*p < 0.05.

Number of children (N), SDQ mean scores, standard deviations (SD) and effect sizes for SDQ (sub) scales and t-test values.

Statistical significance of differences between children with and without AgCC (t-test): \*\*p < 0.001.
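The effect size used for the SDQ comparison can be sketched as follows. The text does not state which variant of Cohen's d was computed, so this example assumes the common pooled-standard-deviation form; the scores are hypothetical and the function name is illustrative.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical SDQ total difficulties scores (not data from the study)
agcc_scores = [14, 17, 12, 19, 15, 16]
control_scores = [8, 10, 7, 11, 9, 9]

d = cohens_d(agcc_scores, control_scores)
print(f"Cohen's d = {d:.2f}")
```

A positive d here means that the first group scored higher, which on the SDQ difficulty scales corresponds to more reported problems.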

## Relationships between Social-Cognitive Factors and Behavioral Problems

We carried out further correlational analysis for each group separately to determine correlations between social and cognitive factors, and the severity of behavioral problems. The analysis involved the following factors: ToM ability (the total score of false belief tasks), emotion and mental state recognition (Faces: basic emotion recognition, complex mental states recognition, and Faces total score), and executive function (inhibitory control: Day and Night Stroop incongruent condition, working memory: digit span forward and digit span backward). The relationship of social cognitive factors to behavioral problems was examined separately for each SDQ subscale (**Table 6**). The AgCC group revealed a borderline significant correlation between ToM and the SDQ Peer Problems subscale [r(17) = −0.45, p = 0.06], and the Prosocial subscale marginally correlated with ToM [r(17) = 0.46, p = 0.06], complex mental state recognition [r(18) = 0.42, p = 0.08], and inhibitory control [r(15) = 0.48, p = 0.06]. In the control group, a significant correlation was found between complex mental state and the SDQ Inattention-hyperactivity subscale [r(17) = −0.51, p = 0.03], and digit span backward and SDQ Conduct problems subscale [r(17) = −0.49, p = 0.03].

To assess whether correlation coefficients differed between the AgCC group and the control group, we used a z-test for the difference between two correlations (Weaver and Wuensch, 2013) and computed it for each pair of correlations. The z-test results showed that the observed correlations did not differ between the two groups (z coefficients ranged from z = −1.08 to z = 1.66, p = ns.), except for the relationship between the Hyperactivity-inattention subscale and complex mental state recognition (z = −1.98, p = 0.02). This finding indicates that, with this one exception, the relationship between social cognition and the severity of behavioral problems does not differ significantly between children with and without AgCC. This means that typically developing children and AgCC children represent the two endpoints of the same scale. Control children showed fewer behavioral difficulties, with good performance in social and cognitive tasks, while AgCC children showed more behavioral difficulties associated with weaker cognitive and social abilities.
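A common way to compare correlations from two independent groups is Fisher's r-to-z transformation. The sketch below assumes this is the form of z-test that was applied, which the text does not spell out. The first correlation and both sample sizes are hypothetical (n = 19 would follow from r(17) if the parenthesized value is n − 2), and the function name is illustrative.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """z-test for the difference between two independent Pearson correlations.

    Each r is Fisher-transformed with atanh; the standard error of the
    difference is sqrt(1 / (n1 - 3) + 1 / (n2 - 3)).
    Returns (z, two-tailed p).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-tailed p-value from the standard normal distribution
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


# Hypothetical example: r = 0.10 in one group vs. r = -0.51 in the other
z, p = compare_independent_correlations(0.10, 19, -0.51, 19)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

The sign of z depends only on which group is entered first, so it carries no meaning beyond the direction of the difference.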

#### DISCUSSION

To clarify the role of the corpus callosum in understanding others' minds, we investigated the main socio-emotional and cognitive functions in a group of children with isolated AgCC. We administered theory of mind tasks, emotion/mental state recognition tasks, executive measures, and a parent-reported behavioral problems questionnaire. The findings of the present study are in line with previous studies showing typical mild social cognitive impairments in individuals with AgCC, even in childhood.

On the theory of mind tasks, children with AgCC performed significantly poorer than age- and IQ-matched controls. AgCC children performed poorly on both false belief tasks, with only half of the AgCC children passing the first-order false belief task compared with 89% of control children. Only 23% of AgCC children passed the second-order false belief task compared with 55% of normally developing children in our sample. These findings suggest that children with AgCC have an increased risk of having problems in understanding other people's minds. Similar deficits in theory of mind are known from findings in children with autism. They typically fail the false belief task (e.g., Smarties Tube Task or Sally Anne task), while 4-year-old normally developing children, or even children with Down syndrome, are able to pass it. A previous study with AgCC adults (Symington et al., 2010) also found mild theory of mind deficits in various mentalizing tasks, such as understanding sarcasm and interpreting visual and textual social cues. However, this study did not report serious deficits in specific theory of mind tasks that required mental state attribution (Faux Pas Test and Happé Theory of Mind Stories). A more recent study (Paul et al., 2014), directly comparing the social functions in an AgCC group and an ASD group, reported that AgCC adults had higher empathizing scores than ASD adults. Together, these findings indicate that individuals with AgCC have difficulties in understanding others' mental states and share some impairments in social cognition with individuals with ASD, but seem to have better mentalizing capacities. Our findings support this conclusion; children between the ages of 6 and 8 years with callosal dysgenesis also showed developmental delay in standard ToM tasks, but their performance showed high variability, ranging from severely impaired to "perfectly normal," and their ToM performance was not associated with intelligence factors or any executive functions.

FIGURE 3 | Parent rating on Strength and Difficulties Questionnaire. Mean scores for parent rating scales including (A) Emotional problems, (B) Conduct problems, (C) Inattention-Hyperactivity, (D) Peer problems, (E) Prosocial behavior, and (F) Total difficulties. Higher scores indicate greater symptomatology. Error bars represent standard deviation.

TABLE 6 | Correlation r between social factors and behavioral problems measures for the whole sample and separately for each group (AgCC Group/Control Group).

False belief, FACES BE, Basic Emotion recognition, FACES SE, Social emotion recognition, FACES Total, Day-night, Day and Night Stroop, Digit Span Forward, Digit Span Backward \*\*p < 0.001, \*p < 0.05.

With respect to emotional and mental state understanding (Faces Test, Baron-Cohen et al., 1997), children with AgCC had fewer difficulties recognizing basic emotions from photographed facial expressions; they only showed some deficits on negative emotion trials (angry-fear and disgust-sad distinctions). In contrast, the AgCC children showed more deficits in recognizing complex mental states compared to the control children. In particular, AgCC children failed to identify expressions that depicted a specific mental state (interest vs. disinterest), or a complex social emotion (scheming vs. arrogant, and guilt vs. arrogant). Similar impairments in emotion recognition have been shown in previous studies that examined social-emotional abilities in clinical samples. Individuals with schizophrenia have difficulties recognizing expressions of disgust (Bediou et al., 2005), anger (Gogharie and Sponheim, 2013), fear, and surprise (Barkl et al., 2014). Individuals with autism often fail to recognize fear and anger, and they tend to mislabel anger as fear (Pelphrey et al., 2002). Additionally, impaired recognition of complex mental states is well known in subjects with autism (Baron-Cohen et al., 1997). A prior study with AgCC subjects (Bridgman et al., 2014) found similar patterns of emotion recognition. AgCC individuals also had difficulties in recognizing fearful expressions and they often mislabelled anger as disgust or sadness, and fear as surprise. Individuals with callosal dysgenesis also showed atypical face perception, including reduced gaze to the eyes and increased focus on the mouth. In line with this evidence, the present findings indicate that children with AgCC are able to identify most of the primary emotions; however, they have some difficulties in detecting complex mental states from facial expressions, particularly those expressions that require processing information from the eyes (e.g., interest, guilt, arrogance, scheming). This finding supports the idea that AgCC and ASD individuals share some impairment in social cognition.

Regarding executive functions, AgCC children showed working memory abilities in the normal range but difficulties in inhibitory control; Stroop-like interference was higher than in control children. Executive functions are a set of higher-order cognitive skills that depend upon specific callosal connectivity. A comprehensive study (Marco et al., 2012) directly investigating the executive functions in an AgCC cohort also found impairments in inhibition and flexibility tasks, but the performance in executive tests was attributed to slow cognitive processing. In contrast, Brown et al. (2001) found evidence that individuals with AgCC have normal executive functions with respect to the inhibition/flexibility skills. They speculated that the presence of other cerebral commissures in AgCC allows for the interhemispheric transfer of information in inhibitory control tasks. The inconsistency of findings probably comes from the high variability of difficulty levels in the test batteries that were used. It is likely that AgCC individuals exhibit more errors and slower processing speed when the information is more complex and less easily encoded. Our findings also showed that children without the corpus callosum committed more errors in interference tasks. Response time was not measured reliably in our study; therefore, we have no information on whether this performance is a consequence of processing speed when the nervous system uses the alternative routes connecting the hemispheres.

Taken together, the findings presented here support several potential explanations that may account for the impaired social cognition in children with AgCC. First, the language explanation suggests that the impaired social function is mediated by the decreased capacity in linguistic pragmatics. Previous findings of linguistic studies have also suggested that AgCC patients tend to use the literal meaning of the narrative information and ignore the second-order meaning of narratives or conversations. It is possible that children with AgCC have trouble understanding the false belief tasks because they do not understand different perspectives in the context of ToM tasks required to describe the world linguistically. Additionally, AgCC individuals tend to use fewer "mentalizing words" (Symington et al., 2010) that reflect others' mental states ("know," "think," "feel"), resulting in deficits in inferring the mental and emotional processes of others. The lack of callosal interconnectivity may explain the decreased capacity in linguistic pragmatics, as the callosal dysgenesis reduces the accessibility to the more complex integration of the semantic network, which is widespread in the two hemispheres. According to the second explanation, executive functioning is also a potential candidate for mediating impaired social cognition in AgCC patients. In the absence of callosal connections, functional brain connectivity in AgCC seems to be more limited during tasks that require complex cognitive operations, such as inhibitory control, working memory, and flexible switching. A previous brain imaging study (Hinkley et al., 2012) demonstrated that impairment of functional interaction appears in regions in the frontal, parietal, and occipital cortices, which are associated with social-cognitive functions known to be impaired in AgCC patients. Indeed, performance in executive function (Tower of London task) directly correlated with resting-state functional connectivity of dorsolateral pre-frontal cortex in individuals with AgCC (Hinkley et al., 2012). Our findings, however, do not support the view that executive functions, namely inhibitory control, account for impaired social cognitive functions such as theory of mind and mental state recognition. A third possible explanation is the deficit in the process of integration of multiple sources of information. AgCC patients have difficulties using the context of a complex situation to interpret the meaning of linguistic information, and inferring the mental states of other persons based on the available, simpler information. The absence of the corpus callosum disrupts the interhemispheric connection and limits the size of the functional processing network of complex social cognitive functions. However, individuals with complete AgCC do not experience disconnection syndrome; they exhibit a limited amount of interhemispheric transfer, mediated by the anterior commissure and additional alternative anatomical tracts developed in AgCC subjects, such as Probst and heterotopic bundles, providing compensatory mechanisms for social cognitive processes (Marco et al., 2012). However, these alternative connections cannot compensate fully for the complex function of the corpus callosum, which results in a slower processing speed. Moreover, the lack of interhemispheric connection leads to an alteration in intrahemispheric connectivity that increases the likelihood of other cognitive deficits. Other developmental disorders, such as ASD, ADHD, and schizophrenia, have also shown reduced callosal size, which contributed to impairments in interhemispheric transfer and processing speed (Paul et al., 2007).

The secondary aim of this study was to investigate whether difficulties in social cognition and/or executive functioning are related to the severity of behavioral problems, and whether they increase the prevalence of emotional and social problems in children with AgCC. Mental health problems were assessed by the Strength and Difficulties Questionnaire, which had never been used before in a sample of children with AgCC. Using parent-administered questionnaires, we found that children with AgCC had more problems in all domains (behavior, emotions, prosocial behavior, and relationship), except the inattention-hyperactivity domain, compared with control subjects. However, the lack of differences in the Inattention-Hyperactivity subscale can be explained by the fact that there were three children in the control group who also reached the cut-off score on the Inattention-Hyperactivity subscale. The results of the SDQ indicate that the functional changes in brain connectivity might contribute to behavioral problems in childhood, as several previous studies reported mild to severe behavioral anomalies in AgCC individuals, with the most frequently mentioned disorders being autism and attention deficit hyperactivity disorder (ADHD) (Paul et al., 2007). The abnormal development of connectivity during childhood is likely to mediate the reduced capacity in complex social cognitive processes, which may contribute to the symptoms of behavioral problems or early onset of psychiatric disorders. The findings within the AgCC group demonstrated that the deficits in social cognitive skills are only marginally correlated with behavioral problems. AgCC children show mild dysfunction in three domains, namely mental state recognition, theory of mind, and executive function (inhibitory control), and these dysfunctions are associated with behavioral problems. It seems that peer problems are linked to mentalizing capacity (false belief), whereas prosocial behavior is linked to mentalizing capacity (false belief and complex mental state recognition) and inhibitory control. Children who lack the capacity for understanding others' minds have more problems in social domains.

One limitation of this study is the relatively small sample size, which might have prevented us from finding the hypothesized correlation between the socio-cognitive factors and behavioral adjustment. Another limitation is that the AgCC cohort included children with partial corpus callosum agenesis (pAgCC), whose behavioral performance did not differ from that of children with complete AgCC, but whose different brain condition might predict variable social-cognitive performance. Previous studies showed that the residual callosal fibers in pAgCC probably provide higher variability in the pattern of interhemispheric connectivity (Wahl et al., 2009), increasing the variability of the behavioral and cognitive outcome. Further study is necessary in order to understand how the compensatory anatomical changes and residual callosal tracts contribute to social and cognitive functioning. It may be more informative to investigate emerging social cognitive performance in parallel with the mapping of developing tracts assessed with MR and DTI imaging techniques.

In conclusion, this study demonstrated that there are mild deficits in mental state understanding and executive functions, as well as behavioral symptoms, in children with AgCC. The findings indicate that dysgenesis of the corpus callosum constitutes a specific risk factor for developing social cognitive symptoms. AgCC individuals tend to misconstrue social information and misunderstand the mental states of others within complex social contexts, including problems with emotion recognition and complex mental state recognition, theory of mind, and inhibitory control. The absence of the corpus callosum seems to affect the development of behavioral characteristics and cause specific behavioral anomalies. Taken together, evidence from previous studies with AgCC patients suggests that social cognitive impairments may relate to the missing corpus callosum. The callosal agenesis results in deficiencies in imagining and inferring the mental, emotional, and social functioning of others. This pattern of cognitive and social deficits has been labeled primary AgCC syndrome by Symington et al. (2010), referring to the condition in which there is callosal absence without evidence of other brain pathology, or in which the observable cognitive and social impairments are primarily related to the absence of the corpus callosum. The primary AgCC syndrome profile includes impaired emotion recognition, restricted verbal interpretation of social scenes and emotional experiences, as well as mild deficits in theory of mind. The lack of callosal interconnectivity might explain the decreased capacity in the higher-order cognitive domain, as the callosal dysgenesis reduces the accessibility to the more complex integration of social networks, which are widespread in the two hemispheres. In AgCC individuals it is likely that the functions involved are those that are hemispherically lateralized (emotions, language, visuospatial processing), or the complex social functions in which the information is spatially distributed between the two hemispheres. If the development of the corpus callosum is impaired, the normal interaction and competition between the hemispheres is abolished, resulting in alternative routes in the adult brain.

#### INFORMED CONSENT

Written informed consent was obtained from patients who participated in this study.

#### AUTHOR CONTRIBUTIONS

BL contributed to planning the research method, collecting and analyzing data, and writing the manuscript. AB contributed to planning the research, examining the children's medical conditions, analyzing data, and writing the manuscript.

#### ACKNOWLEDGMENTS

This research was supported by an OTKA research grant (PD-109597).

#### REFERENCES

affect processing in schizophrenia. Psychiatry Res. 133, 149–157. doi: 10.1016/j.psychres.2004.08.008


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Lábadi and Beke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder

Francesco Margoni and Luca Surian\*

*Department of Psychology and Cognitive Sciences, University of Trento, Rovereto, Italy*

Keywords: moral judgment, mental state understanding, theory of mind, autism spectrum disorders, moral development

Do children with autistic spectrum disorder (ASD) develop the ability to take into account an agent's mental states when they are judging the morality of his or her actions? The present article aims to answer this question by reviewing recent evidence on moral reasoning in children with autism and typical development. A basic moral judgment (e.g., judgments of violations in which negative intentions are followed by negative consequences) and the ability to distinguish between conventional and moral violations appear to be spared in autism (Leslie et al., 2006). However, a closer look at the data reveals that these capacities can be explained by the tendency of ASD individuals to rely heavily on action consequences and other external factors rather than agents' mental states. By contrast, studies that presented typically developing (TD) children with accidental actions and failed attempts have shown that even preschoolers can display an intent-based moral judgment (e.g., Cushman et al., 2013; Margoni and Surian, 2016). The tendency to rely on outcome in ASD children is further confirmed by those studies that directly show that ASD individuals fail to attend to the agents' intentions when the cases are more complex or ambiguous, as in accidentally harmful actions or failed attempts to harm. We propose that the impairment in understanding others' minds hinders the development of an intent-based moral judgment in children with ASD.

# MENTAL STATE REASONING IN THE MORAL JUDGMENT TASKS

In our social life, we often engage in the evaluation of others' actions and intentions, and we are very sensitive to harmful acts and violations of rights. For example, we maintain friendships on the basis of an assessment of our friends' moral behaviors toward us. The production and the justification of a moral judgment is a complex socio-cognitive task that often requires the use of mental state reasoning abilities (Young et al., 2007; Moran et al., 2011). In particular, when people are asked to evaluate accidental harming (or helping) actions or failed attempts to harm (or help), they need to weigh the agents' intention, which requires a mental state analysis, against the external consequences of the action. Several neuroscientific studies confirm the association between moral judgment and theory of mind (Young et al., 2007, 2010; Young and Saxe, 2009).

To what extent, then, do individuals with ASD, who present deficits in theory of mind abilities (Baron-Cohen et al., 1985, 2000; Bowler, 1992; Surian and Leslie, 1999; Abell et al., 2000; Castelli et al., 2002), encounter difficulties in the acquisition of an intent-based moral judgment? Individuals with ASD are characterized by impaired social interactions and communication abilities, and a set of restricted and repetitive behaviors. Here we focus on their impairment in mentalizing, which has been shown to be a main factor affecting their socio-moral abilities.

Edited by: *Paola Molina, University of Turin, Italy*

Reviewed by: *Annette M. Klein, Leipzig University, Germany*

> \*Correspondence: *Luca Surian luca.surian@unitn.it*

#### Specialty section:

*This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology*

Received: *19 July 2016* Accepted: *14 September 2016* Published: *27 September 2016*

#### Citation:

*Margoni F and Surian L (2016) Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder. Front. Psychol. 7:1478. doi: 10.3389/fpsyg.2016.01478*

Studies on the moral judgment of ASD children have traditionally focused on (a) the capacity to distinguish between moral and social-conventional transgressions and (b) the ways in which individuals with autism judge the moral rightness or wrongness of an action.

# MORAL AND CONVENTIONAL TRANSGRESSIONS

One fundamental aspect of moral competence has been identified by social domain theorists in the capacity to distinguish between moral and social-conventional violations. While the former involve a victim and are to be blamed regardless of the social context, the latter do not need to involve a victim and are contingent on a specific group consensus or authority mandate (Turiel, 1978; Nucci, 1981; Killen and Smetana, 2015). By the age of three, children judge moral violations, like hitting someone, more harshly and as less authority-dependent than social-conventional violations, like wearing pajamas at school (Nucci, 1985; Smetana and Braeges, 1990).

The capacity to distinguish between these two types of violation is intact in ASD individuals (Blair, 1996; Rogers et al., 2006; Zalla et al., 2011; Shulman et al., 2012). However, ASD individuals produce poorer justifications compared to TD individuals, and they do not evaluate moral violations as more serious than non-moral but disgusting actions, such as drinking tomato soup out of the bowl at a dinner party. Moreover, contrary to TD children, school-aged children with ASD are swayed by the victims' emotion and judge wrong actions that caused the crying of the victim more harshly than wrong actions that did not cause any crying (Weisberg and Leslie, 2012). ASD children usually succeed in tasks devised to investigate the moral-conventional distinction, but they rely mainly on external factors that could depend on irrelevant variables such as the particular emotional level of the agents.

# THE RELATIVE WEIGHT OF INTENTION AND OUTCOME IN THE JUDGMENTS OF ASD INDIVIDUALS

A working hypothesis here is that ASD children respond as TD children do when they are presented with simple, unambiguous moral cases (i.e., a negative/positive outcome produced by an intentional action with the same valence). In those cases, the difficulties encountered in integrating the mental state understanding in the moral reasoning can be overcome by the children's reliance on action outcomes and victims' emotional reactions. For this reason, ASD children appear to develop a basic moral judgment.

ASD school-aged children evaluate actions that are motivated by positive or negative intentions and are followed by congruent outcomes as TD children do (Leslie et al., 2006; Li et al., 2014). Moreover, they are able to judge an agent who intentionally caused a bad outcome more harshly than an agent who caused it accidentally, although they do not produce verbal justifications that refer to the agent's intention (Grant et al., 2005). However, Steele et al. (2003) found that children with ASD aged 4–14 failed to distinguish between intentional and accidental bad acts (e.g., failing to come to a planned meeting as a result of canceling the plan without telling or as a result of the bus breaking down). Studies on ASD adults also showed that they judge an accidental harm as both more punishable and more intentional than TD adults do, suggesting a partial impairment in the ability to rely on intentions (Buon et al., 2013; see also Rogé and Mullet, 2011; Zalla and Leboyer, 2011; Salvano-Pardieu et al., 2015). Nevertheless, ASD school-aged children distinguish between a distressed victim and an individual who is in distress but is not a victim (Leslie et al., 2006). So, their judgments do not rely completely on the assessment of external outcomes.

However, what about the judgments of more complex cases, such as failed attempts to help or harm, which require a more substantial contribution of mental state reasoning? In judging an ambiguous case such as a failed attempt to harm, it is not possible to rely solely on action outcomes and still produce a moral condemnation of the agent.

A first piece of evidence for an outcome bias in the judgments of ASD individuals comes from studies that reported a "heteronomous" (i.e., rules are understood as handed down by authority, and violations are wrong because they produce bad outcomes, namely they lead to punishment) rather than an "autonomous" (i.e., rules are based on socially agreed-on principles, and violations are wrong because of the agent's beliefs and motivations) moral reasoning in ASD school-aged children (Grant et al., 2005; Takeda et al., 2007; see also Fadda et al., 2016). ASD children attributed moral wrongness and badness to actions that caused bad outcomes. A second and more direct piece of evidence comes from a study that presented ASD individuals with accidental harms and failed attempts to harm. Moran et al. (2011) found that they failed to distinguish between the two scenarios, and they judged the accidental harm significantly more harshly than TD individuals did. Moreover, there is evidence of an activation of the right temporo-parietal junction (RTPJ), an area associated with mental state reasoning, in TD individuals during the evaluation of intentional vs. accidental harm, but no such result has been found in adults with ASD (Koster-Hale et al., 2013). These results suggest that ASD individuals fail to integrate the agent's mental states in their moral reasoning when judging situations in which intentions and outcomes present different valences (see **Figure 1**).

# THEORETICAL IMPLICATIONS OF THE STUDIES ON MENTAL REASONING IN ASD INDIVIDUALS' MORAL JUDGMENTS

Three main theoretical implications relevant for the current understanding of the relationship between theory of mind and moral reasoning could be inferred from the results we briefly discussed. First, the evidence that ASD individuals, who are characterized by an impaired mental state understanding, show an atypical moral judgment, further confirms that theory of mind is fundamental for the development of a mature moral reasoning.

Second, the study of moral judgment in ASD individuals could prove useful in assessing the role of cognitive empathy in the production of a moral evaluation. ASD individuals show a spared capacity for emotional empathy (e.g., Blair, 1999; Rogers et al., 2007), that is, the appropriate emotional response to others' emotions, but an impaired capacity for cognitive empathy, that is, the appropriate understanding of how others may feel. While emotional empathy skills help ASD children develop a basic moral judgment by relying on the emotional and external aspects of the moral case, such as the victims' emotional reactions or the action outcomes (Leslie et al., 2006; Hobson et al., 2009; Weisberg and Leslie, 2012), the poor understanding of the cognitive aspects hinders the development of an intent-based moral judgment. Further studies confirm this interpretation by reporting that aspects related to cognitive empathy impairment affect the moral evaluations of ASD individuals (Channon et al., 2010; Gleichgerrcht et al., 2013; Patil et al., 2016).

A third relevant theoretical implication concerns whether the action understanding required in moral evaluation is mentalistic. A mentalistic understanding represents and explains others' actions by ascribing mental states such as beliefs, desires, and internal representations to the agents (Baron-Cohen et al., 1985; Leslie, 1987; Surian et al., 2007; Baillargeon et al., 2010). By contrast, a non-mentalistic or teleological understanding represents others' actions without ascribing mental states, by directly linking the agent's actions, the goal-states and the situational constraints through the principle of rational action (i.e., agents act to achieve certain goals choosing the most efficient means; Gergely and Csibra, 2003; Schlottmann et al., 2009). According to the proponents of teleological accounts of action understanding, humans first develop a non-mentalistic understanding very early in life, and only later acquire a mentalistic understanding. While it could be argued that ASD individuals possess the ability to interpret actions in a non-mentalistic way already during preschool years (Hamilton, 2009; Vivanti et al., 2011), we have seen that they do not develop a mature intent-based moral judgment. Therefore, the literature on ASD individuals suggests that a non-mentalistic understanding is not sufficient for the development of a full-blown intent-based moral reasoning.

# CONCLUSIONS

The ability to produce moral evaluations often requires the understanding of others' mental states and it is central for living in human social groups. While much more research is needed to acquire a full understanding of the development of moral judgment in ASD individuals, the current state of the literature suggests that this clinical population encounters some difficulties in developing a mature intent-based moral judgment because of the well-known impairment in mental state understanding. Nevertheless, ASD individuals show the ability to produce a basic moral judgment by relying on external cues such as the action outcomes and the victims' emotional reactions.

Can these results turn out to be useful in guiding programs designed to improve moral judgment in children with ASD? Since a main result of the literature we reviewed is that individuals with ASD show difficulties in integrating mental state information in their judgments, clinical treatments and educational programs aimed at improving their theory of mind abilities are likely to have, as a side-effect, a positive impact on their moral reasoning abilities as well. Further research is needed to determine whether such a desirable effect is achieved equally by any effective training on mentalizing skills (e.g., Silver and Oakes, 2001; Fisher and Happé, 2005; Begeer et al., 2011), or whether it is best achieved by a program that requires both mental state attribution and the generation of moral judgments.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# REFERENCES


Zalla, T., and Leboyer, M. (2011). Judgment of intentionality and moral evaluation in individuals with high functioning autism. Rev. Philos. Psychol. 2, 681–698. doi: 10.1007/s13164-011-0048-1

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Margoni and Surian. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Corrigendum: Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder

Francesco Margoni and Luca Surian\*

*Department of Psychology and Cognitive Sciences, University of Trento, Rovereto, Italy*

Keywords: moral judgment, mental state understanding, theory of mind, autism spectrum disorders, moral development

#### **A corrigendum on**

**Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder** by Margoni, F., and Surian, L. (2016). Front. Psychol. 7:1478. doi: 10.3389/fpsyg.2016.01478

We realized that **Figure 1** was misleading. The figure showed the mechanisms underlying the moral judgment of attempted harm cases in individuals with autistic spectrum disorder (ASD). However, it would be more in line with the current studies, which primarily presented ASD individuals with cases of accidental harm, to show in the figure the ASD individuals' processing of accidental harm. Therefore, we replaced "Attempted Harm" with "Accidental Harm" in the top boxes, and we changed the bottom boxes accordingly. The authors apologize for the mistake.

The original article has been reproduced with the correct image; it was originally published with the version of **Figure 1** displayed here.

The original files have been updated.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Margoni and Surian. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Edited and reviewed by: *University of Turin, Italy*

> \*Correspondence: *luca.surian@unitn.it*

#### Specialty section:

*This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology*

Received: *06 October 2016* Accepted: *17 October 2016* Published: *27 October 2016*

#### Citation:

*Margoni F and Surian L (2016) Corrigendum: Mental State Understanding and Moral Judgment in Children with Autistic Spectrum Disorder. Front. Psychol. 7:1705. doi: 10.3389/fpsyg.2016.01705*

# Implicit Mentalizing Persists beyond Early Childhood and Is Profoundly Impaired in Children with Autism Spectrum Condition

Tobias Schuwerk<sup>1,2</sup>\*, Irina Jarvers<sup>1</sup>, Maria Vuori<sup>1,3</sup> and Beate Sodian<sup>1</sup>

<sup>1</sup> Department of Psychology, Ludwig-Maximilians-University, Munich, Germany, <sup>2</sup> Department of Psychiatry and Psychotherapy, University of Regensburg, Regensburg, Germany, <sup>3</sup> Institute of Medical Psychology, Ludwig-Maximilians-University, Munich, Germany

Implicit mentalizing, a fast, unconscious and rigid way of processing others' mental states, has recently received much interest in research on typical social cognitive development in early childhood and in adults with autism spectrum condition (ASC). This research suggests that infants already implicitly mentalize, and that adults with ASC have a sustained implicit mentalizing deficit. Yet, we have only sparse empirical evidence on implicit mentalizing beyond early childhood, and deviations thereof in children with ASC. Here, we administered an implicit mentalizing eye tracking task, which assesses sensitivity to false beliefs, to a group of 8-year-old children with and without ASC, matched for chronological age, verbal and non-verbal IQ. As previous research suggested that presenting outcomes of belief-based actions leads to fast learning from experience and false belief-congruent looking behavior in adults with ASC, we were also interested in whether children with ASC already learn from such information. Our results provide support for a persistent implicit mentalizing ability in neurotypical development beyond early childhood. Further, they confirmed an implicit mentalizing deficit in children with ASC, even when they are closely matched to controls for explicit mentalizing skills. In contrast to previous findings with adults, no experience-based modulation of anticipatory looking was observed. It seems that children with ASC have not yet developed compensatory general purpose learning mechanisms. The observed intact explicit, but impaired implicit mentalizing in ASC, and correlation patterns between mentalizing tasks and executive function tasks, are in line with theories on two dissociable mentalizing systems.

Keywords: mentalizing, implicit Theory of Mind, autism spectrum condition, executive function, eye tracking

#### Edited by:

Daniela Bulgarelli, Aosta Valley University, Italy

#### Reviewed by:

Evelyne Thommen, Haute École de Travail Social et de la Santé – EESP, Switzerland

Emmanuelle Rossini, University of Applied Sciences and Arts of Southern Switzerland (SUPSI), Switzerland

> \*Correspondence: Tobias Schuwerk tobias.schuwerk@psy.lmu.de

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 27 August 2016 Accepted: 14 October 2016 Published: 28 October 2016

#### Citation:

Schuwerk T, Jarvers I, Vuori M and Sodian B (2016) Implicit Mentalizing Persists beyond Early Childhood and Is Profoundly Impaired in Children with Autism Spectrum Condition. Front. Psychol. 7:1696. doi: 10.3389/fpsyg.2016.01696

# INTRODUCTION

Implicit mentalizing, or implicit Theory of Mind (ToM), a fast, unconscious and rigid way of processing others' mental states, such as beliefs or desires, has recently received much interest in research on typical and atypical social cognitive development (Apperly and Butterfill, 2009; Baillargeon et al., 2010; Sodian, 2011; Senju, 2012). One key finding is that already children younger than 4 years of age implicitly mentalize. For example, results from violation of expectation paradigms suggest that already in their second year of life, children process an agent's true and false belief. In these tasks, they look longer at events in which an agent acts incongruently to her mental state, e.g., when she does not look for an object where she believes it is located, but at a different location (Onishi and Baillargeon, 2005; Surian et al., 2007; Song et al., 2008). According to recent theories, implicit mentalizing continues to exist alongside the ability to explicitly mentalize, that is, to deliberately consider another's mental state, in children above the age of 4 years, as well as in adults (cf., Apperly and Butterfill, 2009; Perner and Roessler, 2012).

Anticipatory looking paradigms revealed that children younger than 4 years of age not only expect an agent to act according to her beliefs, but that they also use her mental state to predict upcoming actions (Clements and Perner, 1994; Surian and Geraci, 2012; Thoermer et al., 2012; Low and Watts, 2013). In an influential study, Southgate et al. (2007) familiarized 25-month-olds with an agent and her goal to retrieve an object in one of two boxes. In the test trial, the agent was distracted and did not witness that the object was removed from the box she had last seen it in. When the agent turned back to the scene, the children's anticipatory gaze indicated that they predicted the agent would open the door next to the now empty box. This suggests that they were sensitive to the agent's false belief.

Another key finding is that individuals with autism spectrum condition (ASC) have an implicit mentalizing deficit (Senju et al., 2009; Schneider et al., 2013; Sodian et al., 2015). This deficit is thought to contribute to social interaction deficits in ASC (Frith, 2012; Senju, 2012). Employing anticipatory looking paradigms equivalent to those described above, Senju et al. (2009) found that adults with ASC lacked the spontaneous appreciation of the agent's false belief and did not reliably produce the corresponding action predictions.

Interestingly, this and other studies found a dissociation between implicit and explicit mentalizing (Senju et al., 2009; Schneider et al., 2013; cf., Abell et al., 2000): While implicit mentalizing seemed to be persistently impaired, participants with ASC were able to solve explicit mentalizing tasks. This led to the conclusion that individuals with ASC, especially those with high-functioning autism and Asperger syndrome, cope with explicit mentalizing deficits (Baron-Cohen et al., 1985) by developing compensatory strategies (Bowler, 1992; Happé, 1995; Ozonoff and Miller, 1995).

However, a recent study by Schuwerk et al. (2015) found that an implicit mentalizing deficit might be addressable by compensatory learning. In contrast to previous anticipatory looking false belief tasks, Schuwerk et al. (2015) presented the false belief-based action and its outcome, i.e., looking for the object in the empty box. Based on this perception-action contingency it is possible to form the non-mentalistic rule "if the agent did not witness the object transfer, she will look into the box that is empty by now." Interestingly, after adults with ASC had the critical outcome information at the end of the first test trial, their performance on the second test trial in this implicit mentalizing task was comparable to that of neurotypical controls. This suggests that individuals with ASC might be able to modify their performance in an implicit mentalizing task by experience.

To date, we lack a clear understanding of implicit mentalizing in typical and atypical social cognitive development in two respects. First, we know little about an implicit mentalizing deficit in children with ASC. In the one study that documented this deficit in 8-year-old children with ASC, the children with and without ASC also differed to some extent in their explicit mentalizing competence (Senju et al., 2010). Thus, to draw firmer conclusions about the implicit-explicit dissociation and impaired implicit mentalizing in ASC, one has to look at implicit mentalizing in children with and without ASC who pass explicit mentalizing tasks.

Moreover, a recent study employing a presumably more engaging mentalizing paradigm showed that 10-year-old children with ASC were able to spontaneously track another's belief (Peterson et al., 2013). In a competitive game, one of two agents ended up with a false belief about the location of a prize. The other agent and the child knew where the prize was hidden. The child was encouraged to get the prize, but only after he or she had nominated one of the agents to look for it first. Thus, to gain the prize, the child should choose the agent with the false belief, who would not find it, so that the child could get it for himself or herself. The majority of 10-year-olds with ASC opted for the agent with the false belief. At the same time, they performed poorly on a standard explicit false belief paradigm.

Although it is unclear to what extent the task by Peterson et al. (2013) tapped into the same implicit/spontaneous mentalizing ability as the anticipatory looking paradigm employed by Senju et al. (2010), their findings suggest that the conclusion that children with ASC have an implicit mentalizing deficit may be premature.

Second, we have only sparse empirical evidence on implicit mentalizing beyond early childhood, as most previous research studied infant or adult samples (for an example of the latter, see van der Wel et al., 2014). One recent example of a study investigating implicit and explicit mentalizing in older children comes from Grosse Wiesmann et al. (2016). These authors found the usual developmental trend in explicit false belief understanding between 3 and 4 years of age. However, implicit false belief understanding was already present and stable in this age range (see also Low, 2010). More evidence is necessary to conclude that implicit mentalizing continuously persists in parallel to a corresponding explicit system beyond early childhood, when children become increasingly proficient in advanced and second order mentalizing (e.g., Perner and Wimmer, 1985; Osterhaus et al., 2016).

To address these issues, we administered an implicit mentalizing task to a group of 8-year-old children with and without ASC, who were matched for chronological age, verbal and non-verbal intelligence, executive function skills<sup>1</sup> and explicit mentalizing ability. We employed an anticipatory looking eye tracking paradigm that was previously used with adults (Schuwerk et al., 2015) and that was adapted from previous paradigms (Southgate et al., 2007; Senju et al., 2010). In this task, an agent looks for an object in one of two boxes. In two familiarization trials, she observes the self-propelled object entering one box, opens the box, and finds the object. In two test trials, the agent is distracted, and consequently ends up with a false belief about the object's location and reaches for the object in the empty box.

<sup>1</sup>The two groups were meant to be matched for executive function. However, although both children with ASC and controls performed comparably well, they differed significantly in their executive function skill. We provide details on this result, post hoc analyses and a discussion of this finding.

We reasoned that if implicit mentalizing is a phenomenon dissociated from explicit mentalizing and specifically impaired in ASC, we should observe a group difference in the implicit mentalizing task in the current sample of children with ASC who are competent in explicit mentalizing tasks.

Further, as the previous study by Schuwerk et al. (2015) suggested that presenting the outcome of a false belief-based action leads to fast learning from experience and false belief-congruent looking behavior in adults with ASC, we were interested in whether already children with ASC learn from such information. If this were the case, we should find an effect of test trial repetition on false belief-congruent action anticipation in children with ASC.

# MATERIALS AND METHODS

#### Participants

A total of 14 children with ASC (Mage = 8.0 years, SD = 1.8 years; one female) participated in the study. Another 10 children with ASC were tested, but had to be excluded due to missing gaze data in one or both test trials of the implicit mentalizing task (n = 6) or not fulfilling inclusion criteria in the implicit mentalizing task (n = 4, see the data analysis section for details). All participants with ASC were clinically assessed by a psychologist or a psychiatrist and were required to fulfill the International Classification of Diseases, 10th Revision (ICD-10) criteria (American Psychiatric Association, 2013) for either Asperger Syndrome (n = 8), high-functioning autism (n = 5) or atypical autism (n = 1). To corroborate group assignment, we used two ASC screening tests that were filled in by the caregivers: the Social Responsiveness Scale (SRS), introduced by Constantino and Gruber (2005; German version by Bölte et al., 2008; cut-off criterion: T-score ≥ 60), and the Social Communication Questionnaire (SCQ, current form; discriminative cut-off: sum score ≥ 15), introduced by Rutter et al. (2003; German version by Bölte and Poustka, 2006). The SRS was used as a more general assessment of autistic traits, whereas the SCQ was applied as a measure of current communication skills and social functioning. In the SRS, the ASC group had a mean T-score of 80.8 (range = 70–100, SD = 9.1). The mean SCQ sum score was 16.6 (range = 9–27, SD = 6.0). Note that seven participants with ASC scored below the cut-off value of 15, indicating a currently alleviated deficit in communication skills and social functioning in this subgroup.

The control group consisted of 21 neurotypical children (Mage = 7.2 years, SD = 1.4; six females). Five additional participants had to be excluded due to missing gaze data in one or both test trials of the implicit mentalizing task (n = 1) or not fulfilling inclusion criteria in the implicit mentalizing task (n = 4). The control group was matched for chronological age, t(31) = 1.45, p = 0.156, Cohen's d = 0.52, non-verbal IQ, t(31) = −0.02, p = 0.988, Cohen's d = −0.01, verbal IQ, t(31) = 1.26, p = 0.219, Cohen's d = 0.45, and explicit ToM ability, t(31) = −1.64, p = 0.112, Cohen's d = −0.59, based on the ToM scales by Wellman and Liu (2004, for details see the tasks and material section). The verbal and non-verbal IQ were obtained using subtests of the Wechsler Preschool and Primary Scale of Intelligence-III (WPPSI-III; Wechsler, 2002; German Version: Hannover-Wechsler-Intelligenztest für das Vorschulalter – III, HAWIVA-III; Ricken et al., 2007) and the Wechsler Intelligence Scale for Children-IV (WISC-IV; Wechsler, 2004; German Version: Hamburg-Wechsler-Intelligenztest für Kinder – IV, HAWIK-IV; Petermann and Petermann, 2007). For the verbal IQ, the subtest used from the WPPSI-III was Vocabulary (passive and active), and the subtests used from the WISC-IV were Vocabulary and Picture Concepts. For the non-verbal IQ, the subtests Block Design and Matrix Reasoning were used from both IQ tests. The control group had significantly fewer autistic traits as assessed by the SRS and SCQ. The average SRS T-score was 38.9 (range = 25–55, SD = 8.5), and the average SCQ sum score was 2.7 (range = 0–7, SD = 1.7). There was a significant difference between the ASC group and the control group in both the SRS, t(33) = 13.91, p < 0.001, Cohen's d = 4.84, and the SCQ, t(33) = 10.11, p < 0.001, Cohen's d = 3.52.

The caregivers gave informed written consent. Children received a present for their participation. Their caregivers received monetary compensation for travel expenses. The ethics committee of the Department of Psychology and Education of LMU Munich approved the study.

#### Tasks and Material

#### Implicit Mentalizing Task

This implicit mentalizing task was adapted from previous eye tracking false belief paradigms (Southgate et al., 2007; Senju et al., 2010; Thoermer et al., 2012). The same task was previously used with a sample of adults with and without ASC (Schuwerk et al., 2015). **Figure 1** provides an overview of trials and scene setup. In two familiarization trials (each lasting for 32 s), an agent watched a toy car moving from one box into another. Subsequently, the agent disappeared behind a screen. Two doors, one next to each box, were illuminated, accompanied by a chime. This scene was frozen for 3 s and served as the anticipatory period. After that, the agent opened the door next to the box she had seen the toy car disappear in. Finally, she reached for the car. These familiarization trials served two purposes. First, they familiarized the participants with the fact that the agent wants to get the car and therefore opens one of the two doors. Second, participants learned about the contingency between the illumination of the doors/chime and the opening of the door. The subsequently presented two test trials lasted for 41 s each. In the test trials, the agent was distracted by a phone ring and did not see that the car, after arriving at the second box, drove back to the first box and then left the scene. Then the phone ringing stopped, the agent turned back to the scene, and disappeared behind the screen. The doors were illuminated, the chime sounded, and the 3 s-long anticipatory period started. Finally, the agent opened the door and reached for the box she falsely believed the car would be located in. Half of the participants watched horizontally flipped movies to counterbalance for the laterality of events.

#### Explicit Mentalizing Task

We used the ToM scale by Wellman and Liu (2004) to assess explicit mentalizing ability. The ToM scale consists of 6 tasks including the following concepts: diverse desires, diverse beliefs, knowledge access, contents false belief, real apparent emotion, and explicit false belief. We used the validated German version by Kristen et al. (2006). All tasks were presented with the help of toys and pictures. The first two tasks do not require a representational understanding of mental states whereas the following tasks do. Overall, a score of 6 for solving all six tasks could be achieved. We adhered to the procedure as described by Wellman and Liu (2004).

#### Executive Function Tasks

Executive function was assessed employing two card-sorting tasks, which draw on cognitive flexibility and self-control, namely the dimensional change card sorting task (DCCS) and the reversal shift test. Both tasks consist of two different sets of cards, which include two goal cards and 30 test cards. The goal cards were assigned to a box each and the test cards had to be sorted into the boxes according to a certain rule. There were always three phases for the tasks: a pre-switch phase, a post-switch phase, and a mix-phase that consisted of a mix of the previous two phases.

The DCCS (modified by Kloo et al., 2008) was administered according to the procedure described by Zelazo (2006). The two goal cards depicted a green apple and a red banana. Children were asked to sort cards either according to the form or according to the color. In the pre-switch phase, children had to sort everything according to color. In the post-switch phase, the rule was to sort according to form. The last phase, namely the mix-phase, was added to the procedure by Zelazo (2006), to have an additional level of difficulty and thereby the ability to further differentiate performance. In this last phase participants had to switch back and forth between the previous two tasks and rules.

The reversal shift test was based on the one-dimensional card sorting task by Perner and Lang (2002). The two goal cards showed an elephant and a bunny. Both cards had the color beige and therefore only differed in the type of animal shown. Here, the pre-switch phase was to play the game "correctly", i.e., put the elephants to the elephants, etc., and the post-switch phase required playing the game "incorrectly", i.e., putting the elephants to the bunnies and bunnies to the elephants. In the mix-phase, which was added just like in the DCCS, the rules were intermixed.

#### Procedure

The children performed the implicit mentalizing task first. Eye tracking stimuli were presented with Tobii Studio (Version 2.2, Tobii Technology) on a Tobii T60 eye tracker (60 Hz sampling rate, inbuilt 17-inch TFT screen, 1280 × 1024 pixels; Tobii Technology, Stockholm, Sweden). The participants sat on a chair at a distance of approximately 60 cm from the screen. A 5-point calibration procedure preceded the stimulus presentation. The explicit mentalizing task and the executive function tasks were performed at a table with the experimenter seated across from the child. Subsequently, verbal and non-verbal IQ subtests were administered. Caregivers filled in the SRS and SCQ questionnaires during the experimental session with the child.

#### Data Analysis

Statistical analyses were conducted using IBM SPSS Statistics 23 (SPSS Inc., Chicago, IL, USA). As preliminary analyses revealed no influence of sex, data was collapsed across this variable. The significance level was p ≤ 0.05.

#### Implicit Mentalizing Task

Analyses of raw data were conducted using customized scripts in R (R Core Team, 2013). A velocity-based fixation filter (Salvucci and Goldberg, 2000) with a velocity threshold of 0.05°/ms was used to define the fixations. Additionally, a temporal threshold was set to exclude fixations that lasted less than 80 ms. As fixations on the doors during the 3 s-long anticipatory period were the critical measure, the two doors were chosen as areas of interest (AOIs; approximately 2.8° × 2.8°) for data analysis. The door the character opened after the anticipatory period was defined as the "correct door," whereas the other door is referred to as the "incorrect door." A differential looking score (DLS) according to Senju et al. (2009) was calculated by subtracting the total duration of fixations on the incorrect door from the total duration of fixations on the correct door, and then dividing it by the sum of the total duration of fixations on both doors. The DLS ranges from 1 (visual preference for correct door) to −1 (visual preference for incorrect door). A value around 0 indicates no preference for one of the two doors. Participants who had a looking bias toward the correct door in at least one of the two familiarization trials were included in the further analysis. Four children from the ASC group and four children from the control group had to be excluded as they did not show this belief-congruent anticipatory looking behavior. Note that this differs slightly from Senju et al. (2010), who only included participants who looked longer to the correct than to the incorrect door in the last familiarization trial. Senju et al. (2010) presented four familiarization trials in contrast to only two in the current study. This presumably made it easier for their participants to learn the contingency between the door illumination and the reaching action. Because of this, and to be consistent with our previous study with adults, we adjusted our criterion. Applying Senju et al.'s criterion to our sample would have resulted in excluding five additional participants with ASC and one additional control participant. Notably, preliminary analyses revealed that using Senju et al.'s criterion did not change the pattern of DLS results.
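The preprocessing and scoring steps described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the sample format (timestamps in ms, gaze position in degrees of visual angle), the simple point-to-point velocity estimate, and the choice to return `None` when neither door was fixated are all assumptions made for illustration.

```python
from math import hypot


def detect_fixations(samples, velocity_threshold=0.05, min_duration=80.0):
    """Velocity-threshold fixation detection (illustrative sketch).

    samples: list of (t_ms, x_deg, y_deg) tuples sorted by time (assumed format).
    Consecutive samples moving slower than velocity_threshold (deg/ms) are
    grouped; groups shorter than min_duration (ms) are discarded.
    Returns a list of (start_ms, end_ms, mean_x, mean_y) tuples.
    """
    fixations, current = [], []

    def flush():
        # Keep the current group only if it lasts at least min_duration.
        if len(current) >= 2 and current[-1][0] - current[0][0] >= min_duration:
            xs = [s[1] for s in current]
            ys = [s[2] for s in current]
            fixations.append((current[0][0], current[-1][0],
                              sum(xs) / len(xs), sum(ys) / len(ys)))
        current.clear()

    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        distance = hypot(cur[1] - prev[1], cur[2] - prev[2])
        velocity = distance / dt if dt > 0 else float("inf")
        if velocity < velocity_threshold:
            if not current:
                current.append(prev)
            current.append(cur)
        else:
            flush()
    flush()
    return fixations


def differential_looking_score(correct_ms, incorrect_ms):
    """DLS = (correct - incorrect) / (correct + incorrect), ranging from -1 to 1."""
    total = correct_ms + incorrect_ms
    if total == 0:
        return None  # assumption: score is undefined without any door fixation
    return (correct_ms - incorrect_ms) / total


def meets_inclusion_criterion(familiarization_scores):
    """True if at least one familiarization trial shows a bias to the correct door."""
    return any(s is not None and s > 0 for s in familiarization_scores)
```

For example, 300 ms of fixation on the correct door and 100 ms on the incorrect door yield a DLS of 0.5, and a participant with familiarization scores of −0.2 and 0.4 would be retained under the criterion used here.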

Additionally, first looks toward the two doors in the anticipatory period were analyzed. It was coded whether the first fixation after the illumination of the doors was on the correct or incorrect door.
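The first-look coding can be sketched as follows. This is an illustrative Python example rather than the authors' procedure in detail: the fixation format (onset time plus AOI label), the label strings, and the decision to ignore fixations that began before the doors were illuminated are assumptions.

```python
def code_first_look(fixations, illumination_ms, period_ms=3000):
    """Return "correct", "incorrect", or None for one trial.

    fixations: list of (start_ms, aoi) tuples, where aoi is "correct",
    "incorrect", or None for fixations outside both door AOIs (assumed format).
    Only fixations starting within the anticipatory period are considered.
    """
    window_end = illumination_ms + period_ms
    for start_ms, aoi in sorted(fixations, key=lambda f: f[0]):
        in_window = illumination_ms <= start_ms < window_end
        if in_window and aoi in ("correct", "incorrect"):
            return aoi
    return None  # no door was fixated during the anticipatory period
```

A trial in which the first door fixation after illumination lands on the correct door is coded "correct", even if the incorrect door was fixated before the doors lit up.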

#### Explicit Mentalizing Task

To pass each of the ToM scale subtests it was required to answer both the test and the control questions correctly. For each solved task a point was given, resulting in a maximum of 6 points. The percentage of correct responses was used for statistical analyses. One child with ASC and one child from the control group refused to take part in the ToM scale. A second coder recoded test and control questions of 33% of the whole sample from a video recording of the test session. The inter-rater reliability analysis revealed an agreement of 100% (Cohen's kappa = 1).
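For readers unfamiliar with the agreement statistic, the following Python sketch shows how Cohen's kappa is computed from two coders' categorical judgments. The function name and the toy codes are hypothetical; the study's own coding data are not reproduced here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected is
    the chance agreement implied by each rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined if only one category was used
    return (observed - expected) / (1 - expected)
```

When both coders agree on every item and more than one category occurs, the observed agreement is 1 and kappa equals 1, which is the situation reported above.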

#### Executive Functions Task

To pass the DCCS and the reversal shift tasks a certain number of cards had to be sorted correctly. For the first two phases of the tasks it was necessary to sort at least five cards correctly. To pass the last phase, at least nine cards had to be sorted correctly (Zelazo, 2006). The third phase was only administered if a child sorted at least five cards in each of the other phases correctly. None of the children failed the pre-switch-phase. One control child refused to take part in the executive function tasks. The maximum that could be achieved in this set of tasks was a score of 6 for passing all three phases of both tasks. Statistical analyses are based on the percentage of correct responses. Inter-rater reliability was assessed as in the explicit mentalizing task. It again revealed a perfect agreement (Cohen's kappa = 1).
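The pass rules described above can be summarized in a short Python sketch. The dictionary keys and function names are hypothetical, and the sketch assumes that a phase that was not administered simply contributes no point.

```python
def score_task(correct_counts):
    """Score one card-sorting task (0 to 3 points).

    correct_counts: dict with the number of correctly sorted cards per phase,
    using the hypothetical keys "pre_switch", "post_switch" and "mix"
    ("mix" may be missing or None if the phase was not administered).
    """
    passed_pre = correct_counts["pre_switch"] >= 5
    passed_post = correct_counts["post_switch"] >= 5
    score = int(passed_pre) + int(passed_post)
    # The mix-phase counts only if both earlier phases were passed.
    if passed_pre and passed_post:
        mix = correct_counts.get("mix")
        if mix is not None and mix >= 9:
            score += 1
    return score


def executive_function_percent(dccs, reversal_shift):
    """Percentage of passed phases across both tasks (maximum score of 6)."""
    return 100.0 * (score_task(dccs) + score_task(reversal_shift)) / 6
```

A child who passes all three phases of the DCCS but only the pre-switch phase of the reversal shift test would score 4 out of 6 under these rules.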

## RESULTS

#### Implicit Mentalizing Task

The DLS was analyzed via a 2 × 2 repeated measures ANOVA with the between factor group (ASC group, control group) and the within factor test trial (first, second). **Figure 2** displays mean DLS scores for group and test trial. The ANOVA revealed a significant effect of group, F(1,33) = 10.55, p = 0.003, η<sup>2</sup> = 0.24, but no effect of test trial, F(1,33) = 0.60, p = 0.446, η<sup>2</sup> = 0.02. There was also no significant interaction between group and test trial, F(1,33) = 0.34, p = 0.441, η<sup>2</sup> = 0.02. Overall, the control group showed a stronger looking bias toward the correct door (M = 0.21, SD = 0.42) compared to the ASC group (M = −0.23, SD = 0.33).
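As a reading aid, an effect size can be recovered from an F statistic and its degrees of freedom. The Python sketch below assumes that the reported η<sup>2</sup> is partial eta squared, which the text does not state; under that assumption the group effect above reproduces the reported value.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from F: (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Group effect reported above: F(1,33) = 10.55
group_effect = partial_eta_squared(10.55, 1, 33)
print(round(group_effect, 2))
```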

Because of our a priori interest in a potential learning effect from one test trial to another, we checked for significant differences in DLS scores between the first and second test trial within each group. The DLS did not differ between the first and second test trial in either the children with ASC or the control group [ASC group: t(13) = 0.96, p = 0.357, Cohen's d = 0.53; control group: t(20) = −0.01, p = 0.995, Cohen's d = 0.00]. Consequently, we collapsed the DLS score across both test trials for further analyses.

To check whether children from the ASC group and the control group had a looking bias significantly different from chance, we calculated one-sample t-tests against zero for each group. The control group looked significantly more at the correct door compared to the incorrect door, t(20) = 2.28, p = 0.034, Cohen's d = 1.02, whereas the ASC group looked significantly more at the incorrect door compared to the correct door, t(13) = −2.53, p = 0.025, Cohen's d = −1.41.
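The effect sizes reported alongside these tests are numerically consistent with the conversion d = 2t/√df. This is an inference from the reported numbers, not a formula stated by the authors, and other definitions of Cohen's d for one-sample designs exist. A minimal Python sketch of the conversion:

```python
from math import sqrt


def cohens_d_from_t(t_value, df):
    """Convert a t statistic to Cohen's d via d = 2t / sqrt(df) (assumed formula)."""
    return 2 * t_value / sqrt(df)


# Control group, one-sample test reported above: t(20) = 2.28
print(round(cohens_d_from_t(2.28, 20), 2))
# ASC group, one-sample test reported above: t(13) = -2.53
print(round(cohens_d_from_t(-2.53, 13), 2))
```

Under this conversion the control group value matches the reported d = 1.02, and the ASC group value comes out at about −1.40, close to the reported −1.41 given that the t statistic itself is rounded.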

For the first fixations, a binomial logistic regression was calculated with the dichotomous dependent variable performance (0 or 1) and the categorical independent variables group (ASC group = 1, control group = 0) and test trial (first test trial = 0, second test trial = 1). **Figure 3** shows the percentage of correct first fixations per group and test trial. The intention was to assess the influence of group and test trial repetition on the location of first fixations. The logistic regression model was statistically significant, χ<sup>2</sup>(1) = 6.84, p = 0.033, and explained 12.4% (Nagelkerke R<sup>2</sup>) of the variance in first fixations. The model correctly classified 62.9% of all cases. There was a significant effect of group as a predictor (B = 1.02, SE = 0.52, Wald = 3.86, p = 0.049). Participants with ASC were 2.77 times more likely to direct their first fixation to the incorrect door as compared to the controls. There was no significant effect of test trial as a predictor (B = 0.86, SE = 0.51, Wald = 2.91, p = 0.088).
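The "2.77 times more likely" statement follows from exponentiating the logistic regression coefficient for group. The short Python sketch below shows this step and the usual Wald statistic, (B/SE)<sup>2</sup>; the function names are illustrative, and because B and SE are reported in rounded form, the recomputed Wald value only approximates the reported one.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient: exp(B)."""
    return math.exp(coefficient)


def wald_statistic(coefficient, standard_error):
    """Wald chi-square for a single coefficient: (B / SE) squared."""
    return (coefficient / standard_error) ** 2


# Group predictor reported above: B = 1.02, SE = 0.52
print(round(odds_ratio(1.02), 2))
print(round(wald_statistic(1.02, 0.52), 2))
```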

Analogous to the DLS analysis, we compared first fixations between the first and second test trial within each group. First fixations did not differ between trials in either the ASC group (p = 0.414, McNemar's test, one-tailed) or the control group (p = 0.687, McNemar's test, one-tailed).
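
For readers unfamiliar with McNemar's test on paired dichotomous data, a minimal sketch of the exact one-tailed version is given below. It assumes the exact binomial form of the test, which may differ from the variant used in the analysis; the counts are hypothetical and are not the study's data.

```python
from math import comb


def mcnemar_exact_one_tailed(b: int, c: int) -> float:
    """Exact one-tailed McNemar p value from the discordant pairs.

    b: children correct on the first trial only
    c: children correct on the second trial only
    Under H0, b follows Binomial(b + c, 0.5); the p value is the lower-tail
    probability at the smaller of the two counts.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of change
    k = min(b, c)
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


# Hypothetical counts for illustration only
p = mcnemar_exact_one_tailed(b=2, c=5)
print(round(p, 3))
```

Only children whose first fixation changed between trials enter the test; concordant pairs carry no information about a trial effect.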

To check whether children from the ASC group and the control group had a first fixation preference for the correct door significantly different from chance, we created a percentage score over both test trials, which was then tested against zero for each group. Neither the ASC group, t(13) = −1.47, p = 0.165, Cohen's d = −0.82, nor the control group differed significantly from chance, t(20) = 1.45, p = 0.162, Cohen's d = 0.65.

#### Explicit Mentalizing Task

The control children achieved an average performance of 85% (range = 25–100%, SD = 21%). The children with ASC solved 70% (range = 25–100%, SD = 31%) of tasks from the ToM scale. Performance of children with ASC and controls did not significantly differ, t(31) = −1.51, p = 0.112, Cohen's d = −0.54.

#### Executive Function Task

In the executive function tasks, the control children solved an average of 98.3% of the tasks correctly (range = 5–6, SD = 6.6%). The ASC group solved 81.7% of the tasks on average (range = 2–6, SD = 21.7%). The groups differed significantly in their executive function skills, t(32) = −3.26, p = 0.003, Cohen's d = −0.99.

#### Post hoc Analyses

As the two groups differed in executive function, the ANOVA of the DLS performance, the variable of key interest, was repeated with executive function task performance as a covariate. The pattern of results did not change. We again found a significant effect of group, F(1,31) = 5.57, p = 0.025, η<sup>2</sup> = 0.15, but no effect of test trial, F(1,31) = 0.23, p = 0.634, η<sup>2</sup> = 0.01. The group × test trial interaction was also not significant, F(1,31) = 0.28, p = 0.471, η<sup>2</sup> = 0.02. Additionally, we checked for each group whether executive function task performance was related to performance in the implicit and explicit mentalizing tasks. No significant correlation between performance in the implicit mentalizing task and the executive function tasks was observed in either group (ASC group: r = 0.34, p = 0.232; control group: r = 0.36, p = 0.879). However, there was a significant correlation between executive function task performance and explicit mentalizing task performance in the ASC group (r = 0.61, p = 0.012) and the control group (r = 0.45, p = 0.045).
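
The effect sizes for the group and test trial effects can be related to the F values with the usual formula for partial eta squared. The sketch below assumes that the reported η<sup>2</sup> is a partial eta squared, which the text does not state; the function name is hypothetical.

```python
def partial_eta_squared(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f * df_effect) / (f * df_effect + df_error)


# F values as reported for the covariate-adjusted analysis
group_eta = partial_eta_squared(5.57, 1, 31)   # reported 0.15
trial_eta = partial_eta_squared(0.23, 1, 31)   # reported 0.01

print(round(group_eta, 2), round(trial_eta, 2))
```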

## DISCUSSION

The current implicit mentalizing task revealed a difference between 8-year-old children with ASC and matched control children in the spontaneous anticipation of an agent's false belief-based action. Whereas neurotypical children's looking bias over two test trials suggests that they predicted the agent's action based on her false belief, children with ASC lacked this appreciation of the agent's false belief-congruent action. In contrast, over both test trials, they even displayed a significant looking bias toward the incorrect door. Repeating the test trial had no effect on anticipatory looking.

The finding that 8-year-olds with ASC did not systematically generate false belief-based action anticipations confirms an implicit mentalizing deficit in children with ASC, previously documented in 10-year-olds (Senju et al., 2010). However, in the previous study by Senju et al. (2010), children with ASC and their controls differed not only in implicit, but also in explicit mentalizing task performance. Thus, it could not be ruled out completely that group differences in implicit mentalizing arose from differences in explicit mentalizing, maybe mediated by verbal intelligence. Our results advance previous research by showing that poor performance persists, even when children with ASC and their control group are closely matched for chronological age, verbal and non-verbal intelligence and explicit mentalizing skills. Together, both studies point to a specific deficit in implicit mentalizing in children with ASC.

By repeating the test trial, we were able to check whether participants learned from the presentation of the false belief-based action and its outcome. The absence of an effect of test trial repetition indicated no experience-based modulation of anticipatory looking in the repeated presentation of the test trial. Consequently, we collapsed gaze data over the two test trials for each group. Comparing anticipatory looking over both test trials against chance performance showed that neurotypical children systematically predicted that the agent would open the box in which she falsely believed the car would be located. This finding helps to close a gap in the evidence on implicit mentalizing beyond early childhood. Recent two-systems accounts of mentalizing claimed that implicit mentalizing is already present in infancy and co-exists in parallel with later developing explicit mentalizing (Apperly and Butterfill, 2009). Yet, this remains to be proven empirically. Together with other recent work (Low, 2010; Senju et al., 2010; Grosse Wiesmann et al., 2016), our findings suggest that implicit mentalizing is indeed a phenomenon that presumably persists across the lifespan.

Contrary to what we expected, test trial repetition had no effect on anticipatory looking in 8-year-old children with ASC. In a previous study with adults with ASC, showing the false belief-based action and its outcome (i.e., the agent opens the empty box and vainly looks for the ball) only once was sufficient to increase the looking bias toward the false belief-congruent door in the second test trial (Schuwerk et al., 2015). In the second test trial, adults with ASC performed as well as neurotypical controls. This suggested that rapid learning from action-outcome contingencies modulated gaze behavior in this implicit mentalizing task. However, although the current stimulus material was identical, this was not what we observed in the present sample of children with ASC. On the contrary, in the second test trial the DLS was even more negative than in the first test trial, which led to a significant looking bias toward the incorrect box.

A possible explanation for this finding is the counterbalancing of our stimulus material across the two test trials. In the first test trial, the agent opened the right door (left door, respectively) to look for the car. In the second test trial, the presented movie was flipped horizontally, so that the door that would be opened was on the opposite side. It could be that children with ASC perseverated on the location in which they saw the agent reaching for the car in the previous test trial. This could in turn reflect a simple action prediction strategy, which might be fruitful in several cases, but not in the current situation. It seems that children with ASC let themselves be guided by superficial scene properties (i.e., location) and that they were not yet able to use action-effect contingencies. Yet, future research is needed to pin down whether children with ASC make use of such a location-bound action prediction strategy.

In summary, our findings point to a sustained implicit mentalizing deficit that cannot be easily addressed by experience. It seems that 8-year-old children with ASC, in contrast to adults with ASC, are not yet capable of employing information about perception-action contingencies to compensate for an implicit mentalizing deficit.

Notably, our group of children with ASC performed more poorly than the control children in the executive function tasks. To check whether poorer executive function could contribute to the observed group difference in the implicit mentalizing task, we ran post hoc analyses. First, including the performance in the executive function tasks as a covariate in the DLS analysis revealed the same pattern of results. Second, within each group, executive function performance was unrelated to anticipatory looking in the implicit mentalizing task. This gives us good reason to conclude that anticipatory looking in the current implicit mentalizing task did not rely on voluntary cognitive control and that the lack of systematic false belief-based action prediction in children with ASC cannot be explained by poorer executive function skills.

Interestingly, our post hoc analyses revealed a positive correlation between performance in the executive function tasks and explicit mentalizing ability. This is in line with a large body of evidence on the close link between both cognitive domains (for a recent meta-analysis, see Devine and Hughes, 2014). Further, consistent with recent evidence (Low, 2010; Grosse Wiesmann et al., 2016), we found that executive function task performance was related to performance in the explicit, but not in the implicit mentalizing task. This provides further support for two-systems accounts of mentalizing (Apperly and Butterfill, 2009; cf., Perner and Roessler, 2012). The implicit system enables even young children to be sensitive to false beliefs. This system works involuntarily, quickly, and effortlessly, but inflexibly. The explicit system, which develops around the age of 4, allows one to voluntarily switch perspectives and to consider another's false belief to generate action explanations. This system is flexible but slow, and draws on cognitive resources.

When investigating social cognition in ASC using eye tracking, potentially confounding deficits in general visual processing have to be taken into account. In other words, is the group difference we observed in the implicit mentalizing task attributable to an implicit mentalizing deficit, or did this difference arise from general – and not specifically social – visual processing deficits in ASC? The following aspects of our paradigm help to address alternative explanations in terms of general atypical visual processing in the group of children with ASC. First, to account for a potentially weaker saccadic accuracy in ASC (e.g., Schmitt et al., 2014), a calibration procedure prior to the implicit mentalizing task ensured that the fixation of targets was sufficiently accurate. Second, Wang et al. (2015) recently reported atypical visual saliency in the first few seconds of scene perception in ASC. The scene, the agent, and the objects of the present paradigm were introduced for several minutes to avoid potential group differences in early visual processing of the scene. Third, events took place slowly, and the anticipatory period was static without displaying an agent. This should render any impact of movement/biological motion and social stimuli processing deficits in ASC (Blake et al., 2003; Dakin and Frith, 2005; Guillon et al., 2014) negligible.

Future research that carefully contrasts social and nonsocial stimuli is necessary to unravel the relationship between implicit social cognitive and rather general visual processing characteristics in ASC (for an example, see von Hofsten et al., 2009). To date, it is unclear whether these two are independent phenomena, whether visual processing characteristics contribute to social cognitive deficits (Behrmann et al., 2006; Hellendoorn et al., 2014), or whether both are manifestations of an impaired underlying cognitive ability (Sinha et al., 2014).

In summary, our findings provide support for a persistent implicit mentalizing ability in neurotypical development beyond early childhood. The observed intact explicit, but impaired implicit mentalizing in ASC, and the observed link between executive functions and explicit, but not implicit mentalizing, is in line with theories on two dissociable mentalizing systems. Further, it seems that 8-year-old children with ASC are not yet capable of employing information about perception-action contingencies to compensate for an implicit mentalizing deficit.

#### AUTHOR CONTRIBUTIONS

Conceptualization, MV and BS; Methodology, MV and BS; Formal Analysis, TS, MV, and IJ; Investigation, MV; Resources, BS; Writing-Original Draft, TS and IJ; Writing-Review and Editing, TS, IJ, MV, and BS; Visualization, TS; Supervision, BS; Funding Acquisition, BS.

### FUNDING

This work was funded by a grant from VolkswagenStiftung.

#### ACKNOWLEDGMENTS

We thank all participants and their caregivers who took part in the study. We are grateful to Nicosia Nieß and Gertrud Niggemann (Autismus Oberbayern e.V.), Martina Schabert (Autismuszentrum Oberbayern), and Martin Sobanski (Heckscher-Klinikum gGmbH) for their continuous help with recruiting participants. We further thank Stefanie Brock and Djulia Tucev for their help in data acquisition and Iyad Aldaqre for preprocessing the gaze data.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Schuwerk, Jarvers, Vuori and Sodian. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Promoting Mentalizing in Pupils by Acting on Teachers: Preliminary Italian Evidence of the "Thought in Mind" Project

Annalisa Valle<sup>1,2</sup>, Davide Massaro<sup>1,2</sup>\*, Ilaria Castelli<sup>1,3</sup>, Francesca Sangiuliano Intra<sup>1,2</sup>, Elisabetta Lombardi<sup>1,2</sup>, Edoardo Bracaglia<sup>2</sup> and Antonella Marchetti<sup>1,2</sup>

<sup>1</sup> Research Unit on Theory of Mind, Università Cattolica del Sacro Cuore, Milan, Italy, <sup>2</sup> Department of Psychology, Università Cattolica del Sacro Cuore, Milan, Italy, <sup>3</sup> Department of Humanities and Social Sciences, Università degli Studi di Bergamo, Bergamo, Italy

#### Edited by:

Daniela Bulgarelli, Aosta Valley University, Italy

#### Reviewed by:

Ruth Ford, Anglia Ruskin University, UK; Veronica Ornaghi, University of Milano-Bicocca, Italy

> \*Correspondence: Davide Massaro davide.massaro@unicatt.it

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 14 April 2016; Accepted: 02 August 2016; Published: 31 August 2016

#### Citation:

Valle A, Massaro D, Castelli I, Sangiuliano Intra F, Lombardi E, Bracaglia E and Marchetti A (2016) Promoting Mentalizing in Pupils by Acting on Teachers: Preliminary Italian Evidence of the "Thought in Mind" Project. Front. Psychol. 7:1213. doi: 10.3389/fpsyg.2016.01213

Mentalization research focuses on different aspects of this topic, highlighting individual differences in mentalizing and proposing intervention programs for children and adults to increase this ability. The "Thought in Mind Project" (TiM Project) provides training targeted at adults (teachers or parents) to increase their mentalization and, consequently, to improve mentalization in children. The present research aimed to explore, for the first time, the potential of teacher training based on the TiM Project, that is, training that enhances the mentalizing of an adult who interacts with children as a teacher. For this reason, two teachers, similar in metacognitive and meta-emotional skills, and their classes (N = 46) were randomly assigned to the training or control condition. In the first case, the teacher participated in training on how to promote mentalizing through everyday school teaching strategies; in the second case, the teacher participated in a control activity, similar to the training in scheduling and methods, but without promoting the implementation of mentalization (in both conditions, two meetings lasting about 3 h at the beginning of the school year and two supervisions during the school year were conducted). The children were tested with tasks assessing several aspects of mentalization (second- and third-order false belief understanding, Strange Stories, Reading the Mind in the Eyes, Mentalizing Task) both before and after the teacher participated in the TiM or control training (i.e., at the beginning and at the end of the school year). The results showed that, although some measured components of mentalization progressed over time, only the TiM Project training group significantly improved in third-order false belief understanding and changed, to a greater extent than the control group, in two of the three components of the Mentalizing Task. This evidence is promising for the idea that the creation of a mentalizing community promotes the mentalization abilities of its members.

Keywords: mentalizing, theory of mind, training, teacher-pupil relationship, TiM Project, resilience

# INTRODUCTION

Mentalizing and theory of mind are two constructs often used interchangeably, although they cannot be considered perfectly overlapping (Sharp and Venta, 2012). An analysis of the studies in this area shows that mentalizing is the construct more often used in the "clinical framework," whereas theory of mind is the construct more often used in the "cognitive and socio-constructivist" framework. This study is in line with those theoretical positions that highlight the similarities rather than the differences between these two concepts. We also explicitly refer to the literature that stresses the relational co-construction of children's theory of mind thanks to their relationships with significant caregivers (Dunn et al., 1991; Dunn, 1994). The importance of this interpersonal dimension is largely responsible for the individual differences in the developmental paths of mentalization. Mentalization, or mentalizing (Allen, 2006), is a mental activity consisting in the ability to understand and to interpret human behavior on the basis of intentional mental states such as beliefs, desires, intentions, goals, and emotions (Bateman and Fonagy, 2006; Fonagy, 2006; Choi-Kain and Gunderson, 2008; Fonagy and Allison, 2012). Mentalizing is an imaginative activity including a wide range of cognitive operations about one's own and others' minds, such as interpreting, inferring, remembering and so on (Allen, 2003). Choi-Kain and Gunderson (2008) identified three dimensions of the construct of mentalization: (1) the functioning (implicit and explicit), (2) the objects (self and others), and (3) the aspects (cognitive and affective). 
The first dimension refers to the fact that mentalization can be an implicit, automatic, and pre-reflective process when the subject acts on the basis of an intuition about mental contents (for example, during a conversation), but also an explicit, symbolic, and conscious activity when the individual intentionally reflects about the mind (for example, in psychotherapy; Allen et al., 2008). The second dimension indicates that mentalizing happens during interactions (Allen, 2006) where people reflect about the minds of all the participants of the social exchange. The third dimension highlights the fact that reasoning about intentional mental states is usually cognitively focused and affectively laden; the cognitive and affective aspects are closely connected. Moreover, the mentalizing process integrates the ability to reason about the epistemic mental contents and about emotions. Finally, the developmental model suggested by Allen et al. (2008) argues that the mentalization process is rooted in the attachment relationship established with the first caregiver in infancy and early childhood.

The concept of mentalization that Fonagy (1991) proposed derives both from the psychoanalytic term "reflective functioning," and from the psychological construct of "theory of mind" (Choi-Kain and Gunderson, 2008). Based on psychoanalytic work with borderline patients, a Mentalization Based Treatment (MBT) was created (Allen and Fonagy, 2006; Bateman and Fonagy, 2006, 2013): it is a clinical treatment designed to improve mentalization processes, which are impaired in these individuals. Recently, MBT has been adapted and applied to other clinical or atypical situations, including substance abuse, eating disorders, antisocial personality disorder, parental relationships at risk (Bateman and Fonagy, 2011), families with adopted children (Muller et al., 2012), and self-harm in adolescence (Rossouw and Fonagy, 2012). On the basis of the positive effects obtained from MBT in increasing mentalizing abilities in the above-mentioned situations, in recent years several researchers have been developing programs of intervention for non-clinical settings, such as schools. For example, Twemlow and colleagues (2005a,b) applied the mentalization principles in the Peaceful Schools Program, with the aim of creating mentalizing school communities to reduce violence and bullying. The authors illustrated the two key assumptions of their approach: (1) violent individuals and communities are impaired in mentalization, and (2) power dynamics involving these individuals and their communities tend to further reduce mentalization abilities. "The difference between a violent and a non-violent community must be the degree to which the implicit social conventions are structured to encourage all participants to be aware of the mental states of others" (Twemlow and Sacco, 2012, pp. 195–196). 
The main components of the Peaceful Schools Program are the following: (1) positive climate campaigns, stimulating and supporting the awareness of mental states and their role in violent contexts; (2) classroom management; i.e., training teachers not to use coercive discipline, but rather to refer to their mentalization abilities and to those of children; (3) peer and adult mentorship; i.e., training other adults to become mentors, able to intervene in a mentalistic way during violent episodes outside the classroom; (4) the "gentle warrior physical education program"; i.e., teaching children physical self-control in violent situations (a low activation of the body allows high activation of the mind); and (5) reflection time; i.e., the introduction in the classrooms of a 10 min period at the end of each day devoted to talking, from a mentalistic point of view, about how the day went and any situations of violence that occurred. The results of the evaluation of the Peaceful Schools Program, longitudinally applied to children aged 8–11 years, are encouraging (Fonagy et al., 2009). In contrast with traditional school psychiatry consultation and with usual treatment at school, this program moderated the increase of aggressiveness typical of this age period, the victimization phenomena, and the decline in empathy. Additionally, the program decreased the number of self-reported aggressive acts and aggressive bystanding.

Another proposal for the educational application of mentalization is the "Thought in Mind Project" (TiM Project), also named "Resilience Program," created by Bak (2012). The TiM Project shares with the Peaceful Schools Program the assumption that the creation of a mentalizing community promotes the mentalization abilities of its members. Furthermore, it claims that in this type of community mentalizing children can develop several strategies to react to the difficulties in their life, thus increasing their resilience (see Stein, 2006). This approach is also in line with the recent rethinking of resilience within a developmental systems framework, which claims, among other things, "the possibility of changes that spread across domains and levels through the many interactions of systems" (Masten, 2016, p. 301). The TiM Project addresses mentalization, resilience, and self-control concepts using simple language, metaphors, pictures, and short movies available on a dedicated website<sup>1</sup>. Clinicians or researchers propose and explain these materials to a target group (usually teachers and/or parents), who then use the materials as they deem most appropriate for their condition. A follow-up supervision is sometimes provided. An exploratory pilot study (Bak et al., 2015) proposed the TiM Project to the staff members of a social club for adolescents with disruptive behavior in a low income urban area in Denmark. Results showed that as a consequence of the TiM Project training, the yearly frequency of situations where the staff members of the club had to use physical force to solve high-risk conflicts among adolescents decreased significantly. The mental health of the staff improved and the methods introduced by this project continued to be used by the majority of the staff 3 years later.

The TiM Project training aims to clarify those cognitive processes strongly impregnated with mental contents through a metacognitive approach related to both emotional and epistemic contents. In addition, the training emphasizes the relational dimension, because it proposes an intervention directed to the caregivers that is likely to have a positive and long-term effect on children or adolescents. It may be interesting also to consider some indicators of the potential changes in children's mentalizing ability. In our opinion, the psychological construct that fits this goal is the theory of mind. The reason is threefold: (1) it is a key component of mentalizing, (2) it has been explored through a broad and substantial range of tasks, and (3) its development can be supported by training specifically designed for this purpose.

Theory of mind is the ability to understand mental states (intentions, desires, thoughts, and beliefs), and to predict one's own and others' behavior on the basis of these understandings (Premack and Woodruff, 1978). Theory of mind develops during childhood and continues to evolve in adolescence (Valle et al., 2015) and adulthood (Apperly et al., 2009; Sommerville et al., 2013). According to a socio-constructivist approach, theory of mind emerges within contexts of social interactions, thanks to the participation in social exchanges (Astington and Olson, 1995; Carpendale and Lewis, 2004). In this theoretical perspective, an interesting construct that focuses on the relational potential in the mother–child dyad in supporting the development of theory of mind is mind-mindedness. It is the maternal proclivity to consider infants as intentional agents with mental states and to interact with them on the basis of such a belief (Meins et al., 2002, 2003). In this regard, it was highlighted that maternal mind-mindedness, operationalized as the ability to individuate and comment appropriately on their 8-month-olds' internal states, was negatively related to children's externalizing and internalizing behaviors specifically in low socioeconomic status families (Meins et al., 2013). Furthermore, mind-mindedness appears to be an important aspect of personal relationships rather than a trait-like quality (Meins et al., 2014). In this sense, it is likely that adults—supported in the development of activities of mentalization—may find it easier to engage in mentalization-oriented relationships. This evidence provides support to the implementation of the TiM Project, whose strong point is the involvement of the adults who take care of the children in the educational setting. 
Moreover, many research studies have shown that high levels of theory of mind are linked to different abilities, such as social competences (Jenkins and Astington, 2000; Razza, 2009), prosocial behaviors (Caputi et al., 2012), academic results (Malecki and Elliot, 2002; Lecce et al., 2011), and attribution of intentions in different daily situations (for example, attribution of fair or unfair intention during economical exchanges; Castelli et al., 2010, 2014). In light of these findings, several interventions have been constructed and evaluated in order to implement theory of mind in children. In typical development, different types of training positively affect theory of mind abilities in the short and medium term (Slaughter and Gopnik, 1996; Kloo and Perner, 2008; Grazzani and Ornaghi, 2011; Lecce et al., 2014; Ornaghi et al., 2014; Grazzani et al., 2016). In the case of learning disorders (Ashcroft et al., 1999) and intellectual disabilities (Adibsereshki et al., 2014), theory of mind training improved reflective and social skills. To construct and directly evaluate such training and its effects on the psychological development of children, classical and advanced theory of mind tasks are used. The possibility to rigorously evaluate the effect of these different types of training using psychological tasks has supported our idea that it is possible to realize a similar assessment in the TiM Project, which has been subjected only to indirect evaluations thus far (Bak et al., 2015). Any confirmation of the validity of the TiM project would be particularly interesting. In fact, this training is aimed at teachers and supports them in the implementation of strategies for the development of children's mentalizing. The effects of this training are therefore indirect, as the aim is to support the children through an intervention involving teachers. 
If effective, the potential is considerable: maximum efficiency with low costs (since the teachers can use these strategies with all the children with whom they come into contact), and a greater likelihood of generalization and persistence of acquired skills due to the high integration of support practices to mentalizing within normal teaching strategies.

#### Aims and Hypotheses

This research aimed to evaluate for the first time the efficacy of the TiM Project on a group of 10-year-old pupils. The hypothesis was that children whose teacher participated in the TiM Project training would improve theory of mind and mentalization styles more than a control group of children whose teacher participated in a training without mentalistic contents.

#### MATERIALS AND METHODS

#### Participants

Forty-six ten-year-old children belonging to two school classes and the two teachers who spent the most time with each class during the school year took part in the study. The two school classes were randomly assigned to the study groups: the TiM Project training group (N = 23, M = 10.26 years, SD = 3.16 months; 10 boys, 13 girls) and the control training group (N = 23, M = 10.23 years, SD = 5.16 months; 13 boys, 10 girls). All children were Italian and of middle socioeconomic status based on the parents' education and socioeconomic levels. Children were not clinically referred for any cognitive or learning difficulties and were neither referred to social services nor reported by teachers for learning and socio-relational difficulties. The children were tested for those skills on which we hypothesized that the TiM Project training with the teachers would have a positive effect. The two teachers who participated in the study were both female, 34 and 35 years of age, and had a master's degree and 10 years of working experience at the school. The teachers, depending on the class, participated in either the meetings for the TiM Project training or the meetings for the control group training.

<sup>1</sup>http://myresilience.org

#### Tasks and Training

All children were evaluated with the following tasks in both the pre- and post-training phases (i.e., at the beginning and at the end of the school year, with an interval of approximately 6 months between the two phases).

#### Mentalizing Task

The Mentalizing Task (Sharp et al., 2007; Di Terlizzi, 2010) evaluates children's mentalizing attributional styles in everyday life situations. The styles include the following: overly negative (ON), a cognitive mentalizing bias characterized by a global, negative, and stable self-attribution of the causes of social situations ("They would think nobody likes me") typical of children with symptoms of depression and anxiety (Quiggle et al., 1992; Barrett et al., 1996); overly positive (OP), a cognitive mentalizing bias characterized by a global, positive, and stable self-attribution of the causes of social situations ("They would think I'm cool not to play silly games with the rest of the kids") typical of aggressive children (David and Kistner, 2000) idealizing their own competence in interpersonal relationships; rational or neutral (R), a non-self-referent, non-stable type of interpretation of social situations ("They would think I'm just sitting down to have a think and a rest") typical of children with a helpful, functional, and adaptive coping style. This forced-choice task, which lasts 10 to 15 min per child, includes 15 stories and vignettes about social situations that happen at school to a certain child. At the end of each story, the researcher asks the participant the following: "Imagine you are [the character]. If you were, what do you think the other kids would be thinking about you?" The participant chooses among three options, each reflecting one of three mutually exclusive categories (ON, OP, or R), which represent the three final variables. Each variable score can range from 0 to 15.
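
The scoring rule described above (one choice per story, three mutually exclusive styles, each score between 0 and 15) can be sketched as follows. The function name and the example responses are hypothetical; only the category labels and the number of stories come from the text.

```python
from collections import Counter

STYLES = ("ON", "OP", "R")   # overly negative, overly positive, rational
N_STORIES = 15


def score_mentalizing_task(choices):
    """Count how often each attributional style was chosen.

    `choices` holds one style label per story. Because the options are
    mutually exclusive, the three scores always sum to 15.
    """
    if len(choices) != N_STORIES:
        raise ValueError(f"expected {N_STORIES} choices, got {len(choices)}")
    unknown = set(choices) - set(STYLES)
    if unknown:
        raise ValueError(f"unknown style labels: {sorted(unknown)}")
    counts = Counter(choices)
    return {style: counts[style] for style in STYLES}


# Hypothetical responses of one child, for illustration only
example = ["R"] * 9 + ["ON"] * 4 + ["OP"] * 2
print(score_mentalizing_task(example))
```

Because the three scores sum to a constant, they are not independent of one another, which is worth keeping in mind when interpreting changes in a single style.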

#### False Belief Tasks

To test children's cognitive theory of mind competence we used two second-order false belief tasks (second FBTs; Sullivan et al., 1994; Astington et al., 2002; Liverta Sempio et al., 2005) and a third-order false belief task (third FBT; Valle et al., 2015), all based on the unexpected transfer paradigm. The two second FBTs were the Look Prediction version (LP) and the Say Prediction version (SP) (Sullivan et al., 1994; Astington et al., 2002; Liverta Sempio et al., 2005). In the LP and SP versions there are two control questions, two false belief questions, and a justification question. In the third FBT there are two control questions, a second-order false belief question with its justification, and a third-order false belief question with its justification. We attributed 1 point for each correct answer and 0 points for each wrong answer. The total score range is 0–2 for the second FBT and 0–2 for the third FBT. Two raters independently coded 33% of the responses at pre- and post-test, and inter-rater agreement was established using Cohen's Kappa. This agreement was very high for both the second FBTs (at pre-test, LP: κ = 0.92; SP: κ = 0.89; at post-test, LP: κ = 1; SP: κ = 0.90) and the third FBT (at pre-test, κ = 0.92; at post-test, κ = 0.93).
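For readers unfamiliar with the agreement index used here, the following is a minimal sketch of how Cohen's Kappa is computed for two raters, using the standard formula κ = (p_o − p_e) / (1 − p_e). The rater codes below are invented for illustration and are not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of items")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    if expected == 1.0:
        # Both raters used one and the same category for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical 0/1 codes from two raters for ten responses.
codes_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
codes_b = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(codes_a, codes_b)
```

In this invented example the raters agree on 9 of 10 items, yet kappa is lower than 0.90 because part of that agreement would be expected by chance.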

#### Strange Stories

The Strange Stories (Happé, 1994) evaluate the application of theory of mind ability in everyday social situations. This task consists of 24 short stories in which the protagonist does or says something strange; the participant is asked to explain the character's strange behavior or provide a statement referring to the mental contents of the protagonist. As a control task the Physical Stories were used, in which the participant has no need to refer to the mental contents of the character in order to explain the character's behavior or provide a statement. In the present research we selected four Strange Stories (concerning sarcasm, double bluff, persuasion, and contrary emotions) and four Physical Stories. Two Strange Stories have one comprehension question, whereas the other two Strange Stories have two comprehension questions. Furthermore, each story has a justification question. Each comprehension question is scored 1 if correct and 0 if wrong. The justification question is scored 2 if correct and has an explicit answer, 1 if partially correct, and 0 if wrong. The total score range is 0–18. Each Physical Story has a comprehension question coded 2 if correct and has an explicit answer, 1 if partially correct, and 0 if wrong. The total score range is 0–8. Two raters independently coded 33% of the responses at pre- and post-test, and inter-rater agreement was established using Cohen's Kappa. This agreement was very high for both the Strange Stories (at pre-test, κ = 0.89; at post-test, κ = 0.90) and the Physical Stories (at pre-test, κ = 0.92; at post-test, κ = 0.98).

#### Reading the Mind in the Eyes Test-Child Version

To test the affective component of theory of mind we used the Reading the Mind in the Eyes Test-Child Version (RMET; Baron-Cohen et al., 2001; Castelli, 2010) that requires the attribution of mental states to other people by observing the eye region of their face. The test comprises 28 pictures of the eye region of different people. The participant has to choose among four options the one that best represents what the character is thinking or feeling. Only one option is correct and is scored 1 point, with all other answers receiving a score of 0 points. The total score range is 0–28.

#### Teacher Characteristics and Teacher Training

The two teachers took part in the training and control activities, depending on their group assignment. We constructed and proposed training based on the TiM Project principles and methods, and supervised the teacher during the application of the TiM Project methods with two meetings during the school year. We also developed a control activity, similar to training for scheduling and methods, but without promoting the implementation of mentalization within the standard educational strategies. Meta-cognitive and meta-emotional skills of the two teachers were evaluated prior to the study by administering the MESI (Moè et al., 2010), a set of questionnaires that assess working practice, teaching satisfaction, positive and negative emotions related to work, positive and negative emotions related to the role of teacher, teaching strategies, self-efficacy and upgradeability (see **Table 1**). Both teachers showed values in line with the psychometric characteristics derived from the Italian validation of the measure. Specifically, all the scores were significantly distant from the critical thresholds identified for each scale, and the two teachers' values for each scale differed from one another by appreciably less than one standard deviation.

#### Test Condition: the TiM Project Training

The aims of the TiM Project training were to introduce and to explain the key concepts and methods of the TiM Project, to involve the teacher in the direct experience of these methods, and to reflect together on the way to apply the TiM Project methods in the classroom with children. The TiM Project training was organized in two meetings, each lasting 3 h. At the end of the training, the teacher proposed the TiM Project methods to the classroom in the way the teacher liked, meeting the researcher for a supervision session on 2 days during the school year. Moreover, the teacher could ask for support at any time, contacting the researcher by e-mail or by phone.

In the first meeting of the TiM Project training the researcher explained "The Thinking Brain and The Alarm Center," two concepts regarding brain functioning that form the basis of the TiM Project (Bak, 2012; Bak et al., 2015). Moreover, the researcher explained the importance of the ability to direct attention to one's own thoughts in order to know one's own mind. For each concept, activities and games were proposed to clarify the explanations and to suggest possible activities to use with children. In the case of "The Thinking Brain and The Alarm Center," the teacher had to draw a picture representing her brain on alert; the teacher then had to build a spotlight of attention with paper and use it to observe the world around her. In both cases, reflection on the activities was promoted by the researcher.

In the second meeting, the term "resilience" was introduced and it was linked to the body-mind relationship. The activity proposed was the construction of a poster with a list of stressful situations of everyday life at school and the identification of the strategies that the teacher could use, with a focus on cognitive and emotional regulation strategies (involving the management of the alarm system). Moreover, the researcher introduced the TiM stories, such as the story of the "House of Thoughts" (Bak, 2012; Bak et al., 2015): a metaphor of the brain as a house of thoughts with the possibility to visit different rooms containing positive and negative thoughts (an example of the story is provided in the **Appendix**). To better understand the contents of this story, the teacher participated in a role play acting the role of a thought that inhabits one's own brain.

At the end of this training the researcher guided a reflection on how to use the TiM Project methods with the children, and then the researcher delivered to the teacher the TiM Project Manual consisting of the Italian translation of the contents of the TiM Project website. During the following months, the researcher met the teacher twice to know how the teacher proceeded in applying the techniques, and to guide her in the preparation of new activities for the classroom. The teacher could also benefit from online or telephone support (advice, clarification, and suggestions) provided by the researcher over the entire length of the project.

#### Control Condition: the Non-mentalizing Training

The aims of the control condition training were to promote reflection about the teaching strategies that the teacher can apply in the classroom. More specifically, the focus was on the advantages and disadvantages of the traditional lecture method, and the strategies to support collaborative and cooperative learning. The control condition training was organized in two meetings.

In the first meeting the researcher explained the advantages and disadvantages of the traditional lecture method, and the teacher discussed her professional experiences with this method and her role as tutor of collective reasoning.

In the second meeting, the properties and the differences of cooperative learning (Johnson and Johnson, 2012) and collaborative learning (Nagata and Ronkowski, 1998) were discussed. The researcher explained the strategies and methods to encourage the active participation of students, and to promote the responsibility of each pupil in the working group. As in the TiM Project training, the teacher could also benefit from online or telephone support (advice, clarification, and suggestions) provided by the researcher over the entire length of the project.


WP, working practice; TSA, teaching satisfaction; ERW+, positive emotions related to work; ERRT+, positive emotions related to the role of teacher; ERW−, negative emotions related to work; ERRT−, negative emotions related to the role of teacher; TS, teaching strategies; SE, self-efficacy; UP, upgradeability.

#### Procedures

The research was organized in three steps.


Step 1: Children were tested for their mentalization and theory of mind abilities (pre-test, 5 weeks after the beginning of the school year), and teachers participated in the TiM Project training or the control group training.

Step 2: Each teacher applied the training that she participated in, and teachers received supervision both in the presence of the researcher (two meetings during the school year, respectively, 2 and 4 months from the pre-test) and remotely (on-line).

Step 3: Children were re-tested for their mentalization and theory of mind abilities (post-test, 5 weeks at the end of the school year).

Each child was interviewed individually in two sessions of about 20–25 min each in a quiet room at the school. The procedure was identical for each participant. All tasks were administered in a fixed order. No feedback was given to children's answers in the pre-test and in the post-test sessions. Teachers were trained in a room of the school. Informed parental consent was obtained for the children, and informed consent was obtained from each teacher. The three steps of the research were conducted by independent researchers. The research was conducted according to APA ethical standards and was approved by the local ethics committee.

### RESULTS

**Table 2** reports the descriptive statistics for the explored variables at pre-test and post-test for the two groups; namely, the total scores of each task as they have been used in subsequent analyses unless otherwise specified.

We conducted some preliminary analyses to verify the homogeneity of the groups for the considered variables at the pre-test session. The t-test for independent samples did not show any statistically significant differences between the children assigned to the TiM Project training group and the children assigned to the control training group (all ps > 0.05).
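As an illustration of this kind of baseline check, the sketch below computes a pooled-variance (Student) t statistic for two independent groups using only the Python standard library. The scores are invented, the function name is hypothetical, and the source does not state which variant of the t-test was used, so the pooled-variance form is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_value = (mean(group_a) - mean(group_b)) / standard_error
    return t_value, df


# Hypothetical pre-test scores for two small groups (not the study's data).
training_scores = [2, 3, 4, 5, 6]
control_scores = [1, 2, 3, 4, 5]
t_value, df = independent_t(training_scores, control_scores)
```

Turning the t statistic into a p-value requires the cumulative distribution function of the t distribution, which is available in statistical packages but not in the standard library used here.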

Next, to test the training effect, we performed a GLM for repeated measures for each variable explored (Mentalizing task, second- and third-order false belief tasks, Strange Stories, RMET), with time (pre-test and post-test) as the within-subjects factor and training group (TiM Project and control) as the between-subjects factor. The results showed a significant main effect of time for the LP and SP tasks, Strange Stories, and the OP and R Mentalizing styles. Performance increased over time for the second-order false belief tasks, LP [F(1,44) = 9.85, p = 0.003, η<sub>p</sub><sup>2</sup> = 0.186, θ = 0.866] and SP [F(1,44) = 9.40, p = 0.004, η<sub>p</sub><sup>2</sup> = 0.227, θ = 0.845], and for Strange Stories understanding [F(1,44) = 27.46, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.384, θ = 0.999]. Furthermore, the OP style decreased [F(1,44) = 30.1, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.406, θ = 1], whereas the R style increased [F(1,44) = 37.30, p < 0.001, η<sub>p</sub><sup>2</sup> = 0.459, θ = 1]. The results also showed a significant interaction between time and training groups for the third FBT [F(1,44) = 24.18, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.392, θ = 0.999] and the Mentalizing task [F(2,43) = 4.48, p = 0.017, η<sub>p</sub><sup>2</sup> = 0.173, θ = 0.737].
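The effect size reported alongside each F-test, partial eta squared, is related to the F statistic and its degrees of freedom by the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to three of the F values reported above; the function name is hypothetical, and values recomputed this way can differ slightly from published ones because the F values are themselves rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    # Follows from eta_p^2 = SS_effect / (SS_effect + SS_error)
    # and F = (SS_effect / df_effect) / (SS_error / df_error).
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values for the main effect of time, as reported in the text, all with df = (1, 44).
strange_stories = partial_eta_squared(27.46, 1, 44)
op_style = partial_eta_squared(30.1, 1, 44)
r_style = partial_eta_squared(37.30, 1, 44)
```

Rounded to three decimals, these three values agree with the effect sizes given in the text for Strange Stories and for the OP and R styles.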

More specifically, pairwise comparisons revealed that, for the third FBT, the children in the TiM Project training group showed a significantly higher post-test performance compared to the post-test performance of children in the control training group [F(1,44) = 26.62, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.377, θ = 0.999] (see **Figure 1**).

With regard to the Mentalizing task, in the post-test the children in the TiM Project training group showed a significantly higher performance on the R style of the task [F(1,44) = 12.44, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.220, θ = 0.932] and a significantly lower performance on the OP style of the task [F(1,44) = 24.24, p = 0.001, η<sub>p</sub><sup>2</sup> = 0.355, θ = 0.998] than children in the control training group (see **Figures 2** and **3**).

# DISCUSSION

The present research preliminarily explored the efficacy of the TiM Project training on mentalization performance in 10-year-old pupils. To this aim, we tested children's cognitive, affective, and social components of theory of mind as well as mentalizing styles. The training succeeded in promoting specific elements of mentalistic ability. We will discuss these results by first disentangling this specificity from the mere time effect that occurred for some variables.

Performance on the second-order false belief tasks and Strange Stories showed an increase over time. The second level of recursivity begins to be mastered around 7 years of age, although in his review Miller (2009) pointed out that the available studies indicate that performance on this type of task continues to improve until pre-adolescence. This period also appears to be a sensitive one for the development of the comprehension of ambiguous social situations (here measured through the Strange Stories) in which mentalization is implied. On the contrary, this was not the case for the capacity to "read" the mind through the eyes, because performance on average was already well developed and the RMET did not improve with time. The rational attributional style and the overly positive style also changed with time, the former increasing and the latter decreasing. However, while the improvement in second-order false belief understanding and in the understanding of ambiguous social situations seems not to depend on the TiM project training, third-order false belief understanding and the changes in mentalizing styles appear to be significantly supported by the training itself.

As for the comprehension of the third level of recursivity, the presence of the training effect could be interpreted in terms of the efficacy of the teacher's intervention in the pupils' zone of proximal development (ZoPed), although no classroom observations were taken. In fact, this action pulls the comprehension from very low levels to intermediate ones. The same does not happen in the case of the second-order false belief tasks (LP and SP). Considering the false belief results together, the ZoPed acts on the comprehension of the third level of recursivity much as time does on the second level of recursivity. The absence of an effect of time and training on the RMET is not surprising; in fact, the average performance is already medium-high in the pre-test session. Furthermore, it is in line with the performance of slightly older subjects (see for example Sharp, 2008, in which a sample of children with an average age of 11 obtained a mean performance of 17.96 on the RMET). So, we can hypothesize that the time frame considered was not long enough to have an effect on this ability. Kaland et al. (2008) showed that a sample with an average age of 15.6 obtained a mean performance score of 23.16. In addition, the training may not have had an effect on this ability because it focused more on metacognitive abilities than on the affective aspects directly implied in the RMET.

n, number of participants; M, mean score; SD, standard deviation; SS, Strange Stories; PS, physical stories; RMET-C, Reading the mind in the eyes test-child version; third FBT, third-order false belief task; second FBT SP, second-order false belief task say prediction version; second FBT LP, second-order false belief task look prediction version; MT ON, mentalizing task overly negative style; MT R, mentalizing task rational style; MT OP, mentalizing task overly positive style. P-values refer to the t-test for paired samples between pre- and post-conditions within each group.

With regard to the training effect on the OP and R mentalizing styles, the literature shows that the critical age for a change in attributional style is 7–11 years. Indeed, from 4 to 7 years of age children generally attribute an overly positive judgment to peers about their behavior, whereas from 8 years on the attributional style becomes more rational and more congruent with objective indicators (Damon and Hart, 1991; Berndt and Burgy, 1996; Harter, 1999). Furthermore, Sharp et al. (2007) also corroborated the presence of a critical period for variations in the attributional style of children aged 7 to 11 years, suggesting that these changes are closely related to the ability to take the perspective of others in complex social situations. The participants in the present study are at the upper end of this critical range. Therefore, it is plausible that they have already undergone the developmental changes. This fact would explain the absence of the effect of time. On the contrary, the training may promote a change in the ZoPed, anticipating a change that is likely to appear later. This explanation is also consistent with the work of Meins et al. (2002, 2006) showing that the maternal proclivity to consider the child as an individual with mental states and not just as the bearer of needs supports the acquisition of the child's mentalistic skills according to longitudinal dynamics. This attitude, otherwise known as mind-mindedness, offers the child the opportunity to engage actively with his or her own and others' mental states, and to understand the mentalistic attitudes that people have toward the world. This relational competence is exercised in the ZoPed and, through the process of internalization, affects the child's ability to interact mentally with partners (Laranjo et al., 2014; Meins et al., 2014).

FIGURE 2 | MT-Rational style for TiM Project group and control group at pre-test and post-test.

The fact that the teacher had attended the TiM Project training and had used the training in the classroom increased the children's capacity to apply a rational attributional style to others' minds, and decreased their tendency to use an overly positive attributional style. This result supports the efficacy of the TiM Project training application in the classroom, and suggests that teachers involved in it can help their pupils to strengthen an attributional style that can act as a potential protective factor against psychopathology (Baumeister et al., 1996). In fact, although it has been observed that children have the tendency to misperceive the thoughts, feelings, and intentions of others (O'Connor and Hirsch, 1999), it seems that emotional disorders are associated with specific attributional styles in childhood (Ingram et al., 1998). Sharp et al. (2007) affirmed that an overly positive attributional style (i.e., estimating the judgment of peers on themselves in an overly positive way) combined with a lack of a rational attributional style (i.e., an objective evaluation of other people's thoughts) is associated with symptoms of externalizing disorder (as identified by teachers). Additionally, Hughes et al. (1997, 2001), David and Kistner (2000), and Brendgen et al. (2004) linked aggression in primary school children with the over-estimation of peer acceptance and the tendency to idealize the perception of one's own qualities.

Although the Mentalizing task and the Strange Stories have the common aim of investigating theory of mind understanding in complex social situations, the performance on the Strange Stories was not affected by the training. This discrepancy can be explained by considering two aspects related to the tasks. The first one concerns the characteristics of the tasks in terms of instructions and test questions. In fact, theory of mind understanding is evaluated in the Mentalizing task by asking the child to put him/herself in another person's shoes, whereas the Strange Stories ask the child to explain another person's behavior. The Mentalizing task requires a first-person simulation of another's mind, which is isomorphic to the way the content of the TiM project training is implemented by the teachers in the classroom. The second aspect that can explain the abovementioned discrepancy regards the structure of the test questions. In the Mentalizing task, children are faced with a forced choice among three possible answers (corresponding to the three mentalizing styles), while in the Strange Stories children are faced with open questioning. Due to its intrinsic metacognitive features, the TiM project training appears to be better suited to promoting a form of mentalization more coherent with the forced-choice format than with the open questioning format. Finally, the social situations proposed in the Strange Stories imply the understanding of numerous components of theory of mind that the TiM Project training does not involve [for example, the case of irony and sarcasm (Massaro et al., 2014)].

This study, despite offering some interesting evidence supporting the implementation of mentalizing strategies, presents some methodological issues that need to be carefully evaluated for the interpretation of the results. First, the sample size is limited: only two classes were compared. Accordingly, only two teachers, for the training and the control groups, were involved. Secondly, classroom observations should be implemented in order to evaluate the teacher's strategies applied to support mentalization and to identify situations in which the TiM Project can be used with the greatest impact. Finally, although teachers did not differ in metacognitive and metaemotional skills, their mentalizing abilities were not directly evaluated in the pre-training phase. The possibility that the significant variation observed in children's mentalizing abilities depended on past differences between the teachers cannot be excluded. However, it is important to note that teachers were involved in training and the children tested for mentalizing abilities during the 5th year of primary school (i.e., after 4 years of interaction with their teachers), and that in the pre-training phase there were no differences in children's mentalizing abilities between the two groups of children.

Future research should replicate these results with better management of these issues. Furthermore, given that recently Bak et al. (2015) evaluated the welfare of the operators involved in the training, the inclusion of teacher evaluations may prove to be a significant element. Finally, the inclusion of a wider sample of teachers that allows the exploration of possible covariates of the implementation of mentalizing strategies is highly desirable.

# CONCLUSION

This study provides some preliminary evidence to support the validity of the TiM Project. It is likely that a teacher who has an increased understanding of mental functioning, and who can talk about it in the classroom, is able to help children to increase their mentalistic skills. In particular, children improve their mentalizing attributional style (from overly positive to rational), which can in turn reduce the risk of psychopathology, and they increase their level of recursive thinking in cognitive theory of mind, learning to reason at a third level of recursivity. Although these findings require further investigation, they are promising for the idea that the creation of a mentalizing community promotes the mentalization abilities of its members; this study evaluated this efficacy on children's competencies for the first time. These communities can be consistently regarded as the extension of the ZoPed within which, as just mentioned, the mother uses mind-mindedness (Laranjo et al., 2014; Meins et al., 2014) to support the child's mentalization. Similarly, specifically trained teachers will accompany children in the acquisition of increasingly effective and socially adaptive mentalistic abilities (Meins et al., 2013).

# AUTHOR CONTRIBUTIONS

AV, DM, IC, FSI, EL, EB, and AM conceived and designed the experiments. AV, DM, IC, FSI, EL, and EB performed the experiments. AV, DM, IC, FSI, EL, and AM analyzed the data. AV, DM, IC, FSI, EL, EB, and AM wrote the paper.

# FUNDING

This research was also made possible by a D1-2016 research grant from the Università Cattolica del Sacro Cuore to DM. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

# ACKNOWLEDGMENTS

We are grateful to Valentina Cornetti for data collection. A special thanks to children, parents, and school for their collaboration.

# REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Valle, Massaro, Castelli, Sangiuliano Intra, Lombardi, Bracaglia and Marchetti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# APPENDIX

#### Example of TiM Project Story: The Story of The House of Thoughts<sup>2</sup>


In some way, we may say that our thoughts live inside our heads. Imagine that your thoughts live in a house with many rooms where you can wander around and discover them. When you discover thoughts you are using the world's finest tool – your attention, which is a kind of spotlight. When you throw light on a thought, you spot it and discover it. Thereafter you can shift your attention and discover another thought. The House of Thoughts has plenty of rooms – a number of exciting thoughts may live in one room, perhaps some sad or angry thoughts live in another room and various happy thoughts live in a third room. From The House of Thoughts, your thoughts can call you if they want to be discovered. This may be really exciting and good, but could be irritating too – especially if the thoughts are annoying and they keep knocking all the time, trying to take charge over your attention. In the case where you have sad or angry thoughts that take charge and force you into their room all the time, you might end up believing there are no exciting or happy thoughts to be found anywhere and that is not much fun.... Yet this is not the case at all. All the happy and exciting thoughts are just waiting in other rooms in the House of Thoughts, waiting for you to discover them with your attention. Maybe there even are tools to be found in one room that could be used to fix some other thoughts in another room in the house. There may also be thoughts in a room who need to be left in peace, so they won't disturb you too much. If you often go to explore The House of Thoughts with your attention, then it becomes easier to be in charge of your thoughts.

<sup>2</sup> From the web site "Resilience" http://robusthed.dk/en/stories-even-stories-fromreal-life/stories/the-story-of-the-house-of-thoughts.

# The ToMenovela – A Photograph-Based Stimulus Set for the Study of Social Cognition with High Ecological Validity

Maike C. Herbort<sup>1,2,3</sup>\*, Jenny Iseev<sup>4</sup>, Christopher Stolz<sup>3,5</sup>, Benedict Roeser<sup>4</sup>, Nora Großkopf<sup>3,6</sup>, Torsten Wüstenberg<sup>1</sup>, Rainer Hellweg<sup>1,2</sup>, Henrik Walter<sup>1,2</sup>, Isabel Dziobek<sup>2</sup> and Björn H. Schott<sup>1,3,6,7</sup>\*

<sup>1</sup> Department of Psychiatry and Psychotherapy, Campus Mitte, Charité Universitätsmedizin Berlin, Berlin, Germany, <sup>2</sup> Humboldt University, Berlin, Germany, <sup>3</sup> Leibniz Institute for Neurobiology, Magdeburg, Germany, <sup>4</sup> Free University of Berlin, Berlin, Germany, <sup>5</sup> Department of Psychology, Philipps University of Marburg, Marburg, Germany, <sup>6</sup> Otto von Guericke University, Magdeburg, Germany, <sup>7</sup> Center for Behavioral Brain Sciences, Magdeburg, Germany

#### Edited by:

Daniela Bulgarelli, Aosta Valley University, Italy

#### Reviewed by:

Virginia Slaughter, University of Queensland, Australia Ruth Ford, Anglia Ruskin University, UK

#### \*Correspondence:

Björn H. Schott bjoern.schott@med.ovgu.de Maike C. Herbort maike.herbort@charite.de

#### Specialty section:

This article was submitted to Developmental Psychology, a section of the journal Frontiers in Psychology

Received: 18 July 2016 Accepted: 15 November 2016 Published: 02 December 2016

#### Citation:

Herbort MC, Iseev J, Stolz C, Roeser B, Großkopf N, Wüstenberg T, Hellweg R, Walter H, Dziobek I and Schott BH (2016) The ToMenovela – A Photograph-Based Stimulus Set for the Study of Social Cognition with High Ecological Validity. Front. Psychol. 7:1883. doi: 10.3389/fpsyg.2016.01883

We present the ToMenovela, a stimulus set that has been developed to provide a set of normatively rated socio-emotional stimuli showing varying numbers of characters in emotionally laden interactions for experimental investigations of (i) cognitive and (ii) affective Theory of Mind (ToM), (iii) emotional reactivity, and (iv) complex emotion judgment with respect to Ekman's basic emotions (happiness, anger, disgust, fear, sadness, surprise, Ekman and Friesen, 1975). Stimuli were generated with a focus on ecological validity and consist of 190 scenes depicting daily-life situations. Two or more of eight main characters with distinct biographies and personalities are depicted in each scene picture. To obtain an initial evaluation of the stimulus set and to pave the way for future studies in clinical populations, normative data on each stimulus of the set were obtained from a sample of 61 neurologically and psychiatrically healthy participants (31 female, 30 male; mean age 26.74 ± 5.84), including a visual analog scale rating of Ekman's basic emotions (happiness, anger, disgust, fear, sadness, surprise) and free-text descriptions of the content of each scene. The ToMenovela is being developed to provide standardized material of social scenes that is available to researchers in the study of social cognition. It should facilitate experimental control while keeping ecological validity high.

Keywords: Theory of Mind, stimulus set, ecological validity, social cognition, photographs, empathy, emotions

# INTRODUCTION

Recent years have seen a steep increase in behavioral and brain imaging research on human social cognition. Defining, differentiating and operationalizing cognitive and emotional subprocesses of social cognition, such as empathy, Theory of Mind (ToM), and emotion recognition, has attracted increasing interest from psychologists and neuroscientists. Two related yet separable constructs have been employed by researchers to describe the cognitive processes that may enable humans to understand others' cognitive and affective states: empathy and ToM. While ToM describes the ability to understand and predict another's mental states, intentions, or beliefs, empathy as a psychological construct rather describes the phenomenon of sharing other people's affective states, which is likely to form the basis for social emotions like guilt or compassion. Hein and Singer (2008) explicitly distinguish empathy from "cognitive perspective taking as the ability to understand intentions, desires, beliefs of another person, resulting from (cognitively) reasoning about the other's state", a concept that can be called "cognitive empathy", whereas the classical definition could be referred to as "affective empathy." The related concept of mentalizing (Frith and Frith, 2006) has been defined as "the process by which we make inferences about mental states" and comprises an immediate recognition and understanding of emotional states, also via cognitive inference. A triple-dissociation of the ToM/empathy complex suggested by Walter (2012) divides the ToM concept into three separable cognitive mechanisms: Cognitive ToM comprises the ability of an individual to mentalize about cognitive states of others, Affective ToM (or Cognitive Empathy) is defined as an individual's ability to cognitively reflect on affective states of others, and Affective Empathy is characterized by the induction of others' affective states in the perceiving individual.

Numerous experimental paradigms have been developed to formalize the ToM construct in a way that allows researchers to assess both behavioral manifestations and neural underpinnings of ToM-related cognitive mechanisms. These include the well-known False Belief Task (initially developed by Wimmer and Perner, 1983), a paradigm commonly used in developmental research, and the related Sally-Anne Tasks (Baron-Cohen et al., 1985), which have been employed to demonstrate ToM deficits in children with Down's Syndrome and Asperger's Syndrome. A different approach to the experimental assessment of ToM and empathy was introduced with the publication of the Reading the Mind in the Eyes Task (RMET; Baron-Cohen et al., 1997), in which participants have to assign mental states to static pictures of eye regions. Notably, comparisons of the behavioral performance in different ToM tasks have yielded poor correlations (Ahmed and Miller, 2013).

Despite this lack of correlation, the cognitive processes tested by the presently available tasks most likely all contribute to enabling ToM in real-life social situations. It is conceivable that, in the real world, people rely on highly multimodal information when engaging in social cognitive tasks, and different individuals are therefore likely to employ distinct strategies during social cognition. Achim et al. (2013) have proposed the Eight Sources of Information Framework (8-SIF) as a theoretical framework to analyze mentalizing tasks with respect to the information participants can use for task performance. It consists of a 2 × 2 matrix, with the axes reflecting the temporal characteristics of information [immediate (I), with the subcategories "linguistic" and "perceptual", vs. stored (S), with the subcategories "general" and "source-specific"] and agent-related versus context-related information. The authors suggest that the multimodal nature of information described in the 8-SIF framework is best met by more naturalistic – or ecologically valid – paradigms or stimuli.

The need for ecologically valid stimulus material has been recognized in cognitive neuroscience, and several stimulus sets of various categories have been developed for this purpose. For example, a number of photograph-based sets of object stimuli have been developed as an alternative to the commonly used Snodgrass pictures, line drawings of common objects (Snodgrass and Vanderwart, 1980). These include the Amsterdam Library of Object Images (ALOI; Geusebroek et al., 2005) and the Bank of Standardized Stimuli (BOSS; Brodeur et al., 2014).<sup>1</sup> The importance of examining ecologically valid information is well-established in the field of visual perception research (Kayser et al., 2004), but only a few ecologically valid stimulus sets applicable to emotion processing and social cognition have been published so far. A notable exception is the International Affective Picture System (IAPS; Lang et al., 2008), which contains images of different degrees of emotional valence and arousal, including highly aversive images of accidents and mutilation.

Based on the IAPS stimuli, the MET (Multifaceted Empathy Test; Dziobek et al., 2008) has been developed to study both affective ToM and affective empathy. In this photograph-based stimulus set, human beings are depicted in various emotional situations and participants are asked to infer the mental states of the persons depicted (affective ToM) and to indicate the level of their own emotional involvement when perceiving or evaluating the scenes (affective empathy). The MET has been extensively validated by experts and is therefore suitable for assessing response accuracy in social cognitive tasks. One potential limitation of the MET is that the images are based on IAPS stimuli, which are, to a large extent, not representative of daily-life situations.

With a strong focus on ecological validity, Dziobek et al. (2006) have developed the MASC (Movie for the Assessment of Social Cognition). The stimulus set consists of a 15-min video showing four main characters at a dinner party. In 46 breaks, subjects have to answer questions on the feelings, thoughts, and intentions of the characters. The task shows rather high ecological validity, but its design as a movie with a fixed location and a small number of protagonists limits its use, particularly in neuroimaging studies that require precise trial timings and appropriate baseline conditions. In neuroimaging studies of ToM and empathy, it is also important to employ appropriate controls, both at the task level (e.g., first-person perspective versus "pure" ToM) and at the item level (e.g., different degrees of task difficulty or emotional salience and valence), preferably using the same stimulus material. Schnell and Walter have developed a task that allows one to distinguish first-person and third-person perspective during emotional and cognitive/visual-perceptual processing (Schnell et al., 2011; Walter et al., 2011). The stimulus set consists of cartoon stories that are usable as false-belief tasks, but have been designed in a way that suitable first-person perspective control questions can also be applied to all stories. Cartoon stories consisting of three sequentially presented pictures are shown, and participants are instructed to either count the number of animate objects (self-cognitive), to state whether the protagonist can see more or less animate objects than in the previous picture (third-person cognitive), whether they feel better or worse than during the picture presented before (first-person affective), or whether the protagonist feels better or worse than during the previous picture (other-affective).

<sup>1</sup>http://www.cogsci.nl/stimulus-sets

Notably, that stimulus set is devoid of any direct indicators of the protagonists' affective states, like expressive facial elements.

Here, we present a stimulus set (The ToMenovela) that was specifically designed to combine the high ecological validity of the MASC and the MET with the applicability of first-person control tasks as in the cartoon task by Schnell and Walter. We chose to base the task on photographs rather than movies, in order to make it more suitable for event-related fMRI and EEG studies. To achieve high ecological validity, we set up a fictional circle of eight friends (four male and four female; see **Figure 1**) and designed a background story that contains biographies and personalities of each protagonist as well as the relationships between the characters. Each of the characters possesses stable characteristics (traits) that are distinct from one another (e.g., homely, outgoing, artistic, etc.). Based on this social arrangement, we scripted a series of scenes that would be comprehensible from a single still photograph. We aimed to balance the scenes with respect to location (indoor vs. outdoor) and appearance of the characters (each scene depicts at least two of the protagonists). After selection of the suitable stimuli, we collected normative data on the stimulus set in a cohort of 61 healthy study participants (31 women, 30 men) with respect to content, emotional salience and valence, as well as cognitive and affective ToM. Because emotion recognition constitutes an important facet of human social cognition, the scenes were designed to depict Ekman's basic emotions (happiness, anger, disgust, fear, sadness, surprise; Ekman et al., 1972; Ekman and Friesen, 1975, 1978) to varying degrees, and the evaluation contained specific questions testing for emotion recognition (see Methods section for details). One important reason for including Ekman's emotions was the potential for future clinical applications: Emotion recognition and cognitive ToM show parallel deficits in certain neuropsychiatric disorders like schizophrenia (Sparks et al., 2010; Barbato et al., 2015) or temporal lobe epilepsy (Amlerova et al., 2014), but may be differentially affected in other conditions like Alzheimer's disease and frontotemporal dementia (Gregory et al., 2002; Freedman et al., 2013).

As will be outlined in the following sections, the ToMenovela has several potential advantages for future studies of human social cognition:


higher-level vision, memory, or face and scene processing (Zweynert et al., 2011; Hofstetter et al., 2012; Rossion et al., 2012).

# MATERIALS AND METHODS

In order to generate a stimulus set of pictures depicting daily-life social interactions for use in future studies of social cognition, we scripted a total of 220 distinct daily-life scenes, 193 of which were subsequently staged and photographed (see **Figure 2** for example scenes). Because we aimed to generate stimuli that would be particularly suitable for neuroimaging studies, we opted for photographs rather than video clips. Two scenes were excluded due to technical problems, and one due to ambiguous evaluation results, resulting in a final set of 190 scenes.

In a subsequent validation study, each scene was rated with respect to principal content, cognitive and affective first- and third-person perspective, and emotional valence along six basic emotions (happiness, anger, disgust, fear, sadness, surprise; Ekman et al., 1972; Ekman and Friesen, 1975, 1978). Those ratings were complemented by two free-text open questions, the response data for which will be reported in a future publication.

#### Generation of the Stimulus Material Script

We first developed an initial sketch of eight distinct human characters that constitute a circle of friends with diverse relationships (a long-term married couple, a new romantic relationship, two sisters, colleagues, high school friends, the "new guy in town", etc.). **Figure 1** describes the biography and personality traits of the main characters and the interpersonal relations within the group.

We next scripted a total of 220 scenes, each of which was to depict at least two of the eight main characters. Each scene was constructed with respect to general content, basic emotions (happiness, anger, disgust, fear, sadness, surprise), dramatic setting, characters displayed, requisites, and location. The scripts also included mindsets of the different protagonists instructing the actors to feel and express specific emotions (for example scripts, see Supplementary Tables S1A,B). When scripting the scenes, we aimed to balance the appearance of the eight main characters, basic emotions and location (indoor vs. outdoor). Due to external conditions during the shooting of the scenes (e.g., sicknesses of actors or unexpected weather changes), some scenes deviated in details from their original script.

#### Team

We recruited eight professional and semi-professional actors as main cast and, depending on the specific scene, additional experienced lay actors. The cast for the main characters and the recurring background actors were recruited in early 2013. The final ensemble consisted of two professionally educated actors and six amateurs with previous stage experience (drama and/or music). The actors were known to each other prior to the shootings and specifically selected based on their particular style and personality, although it should be noted that their actual biography and personality differ from that of the fictional characters described here. All actors gave written informed consent for the use of the resulting photographs for research purposes.

All main actors were familiarized with their respective character by authors MCH, a trained psychologist, and BR, who holds a B.A. in theater studies and has extensive previous experience in directing. MCH and BR also directed and supervised the shootings of all scenes.

Photographs were acquired and processed by Sven Reichelt,<sup>2</sup> a photographer with extensive previous experience in portrait photography.

<sup>2</sup>http://www.lensbreaker.com

#### Shootings

To ensure a continuous look and feel of each character, clothes, accessories, and make-up were obtained from a previously assembled pool of equipment prior to the beginning of the shootings. Each shooting session was carefully prepared in terms of location, equipment, clothes, make-up, and look. Depending on the complexity of the scene and external conditions (e.g., availability of the actors, weather conditions at the time of shooting), between four and 22 different scenes were shot on one day. All shootings took place in Berlin, Germany, between May 4th, 2013 and July 20th, 2013. Because the scenario is intended to take place in an unnamed major city in an unspecified country in Europe (possibly also North America or Australasia), we aimed to minimize recognizable German writing and strictly avoided any iconic buildings (e.g., the Brandenburg Gate or the Emperor William Memorial Church) in the pictures.

Photographs were taken using a Nikon D300s digital SLR camera with a sensor size of 23.6 mm × 15.8 mm and a resolution of 12.3 megapixels (4352 × 2868). All pictures were taken in sRGB color mode. Depending on the requirements posed by the scene, either an AF-S Nikkor 16–85 mm 1:3.5–5.6G ED medium-angle lens or a Sigma 10–20 mm F4.0–5.6 EX DC HSM wide-angle lens was used. If necessary, two Nikon SB900 units were used as flashes.

#### Post-processing and Picture Selection Procedure

We used a multi-level picture selection and processing procedure to obtain a final set of images that best represented the intended social interactions and emotional valence.

Pictures were first screened for technical, compositional, and photographic aspects. All of the approximately 10,000 pictures were screened with respect to sharpness, lighting conditions, unintended facial expressions, and the final aspect ratio. To this end, the photographer and the first author selected between one and eight pictures per scene for post-processing. Post-processing of the pictures was done using PhotoShop (Adobe, San José, CA, USA) and the open source image manipulation software GIMP.<sup>3</sup> Camera RAW images were adjusted for brightness, contrast and color, and converted into JPG format. All images were clipped horizontally to set the horizontal to vertical aspect ratio to 4:3. When necessary (e.g., due to distracting content outside the focus of the picture), images were clipped further, keeping the aspect ratio.

<sup>3</sup>http://www.gimp.org
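The 4:3 clipping step described above can be illustrated with a small sketch. It only computes a crop rectangle; the function name `crop_box_4_3` and the choice of a centered crop are assumptions made for illustration, since the text does not state where the images were clipped.

```python
def crop_box_4_3(width, height):
    """Return (left, top, right, bottom) of a 4:3 crop rectangle.

    Assumption: the crop is centered. The article only states that
    images were clipped horizontally to reach a 4:3 aspect ratio.
    """
    target_width = height * 4 // 3
    if target_width <= width:
        # Image is wider than 4:3: trim the left and right edges.
        left = (width - target_width) // 2
        return (left, 0, left + target_width, height)
    # Image is narrower than 4:3: trim the top and bottom edges instead.
    target_height = width * 3 // 4
    top = (height - target_height) // 2
    return (0, top, width, top + target_height)


# Pixel dimensions reported for the camera in the text.
box = crop_box_4_3(4352, 2868)
print(box, "->", box[2] - box[0], "x", box[3] - box[1])
```

The resulting tuple follows the (left, top, right, bottom) convention used by common image libraries, so it could be passed to a cropping routine of one's choice.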

A resulting set of 555 pictures belonging to 191 scenes was presented to five raters who had not been involved in the initial shootings and did not know the actors personally (authors CS and NG, prior to their further participation in normative data collection and/or data analysis; and one other man and two other women). They were asked to answer two questions on a 5-point Likert scale.


Based on the raters' responses, weighted sum scores were calculated (clarity × 3 + emotion), and the pictures with the highest sum scores were selected for the final picture set. The aim of this pre-rating procedure was to retain only one picture per scene, namely the one with the highest possible clarity rating. This left 46 scenes for which two or more pictures had equally high scores. The pictures in question were inspected by the first and last authors, and the final image was selected based on consensus. The resulting final set of 191 unique images was used in the validation study. **Figure 2** depicts four example images [Note: The pictures displayed here are not part of the actual stimulus set and may be used for illustrative purposes in publications].
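The pre-rating selection rule can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the data layout, and the assumption that the five raters' answers have already been aggregated into one clarity value and one emotion value per picture are all hypothetical.

```python
from collections import defaultdict


def weighted_score(clarity, emotion):
    """Weighted sum score as given in the text: clarity * 3 + emotion."""
    return 3 * clarity + emotion


def select_pictures(ratings):
    """Pick the top-scoring picture per scene and report ties.

    ratings: iterable of (scene_id, picture_id, clarity, emotion), where
    clarity and emotion are assumed to be aggregated across raters.
    Returns (selected, ties): scenes with a unique winner, and scenes
    whose best score is shared and would need manual inspection.
    """
    by_scene = defaultdict(list)
    for scene, picture, clarity, emotion in ratings:
        by_scene[scene].append((weighted_score(clarity, emotion), picture))

    selected, ties = {}, {}
    for scene, scored in by_scene.items():
        best = max(score for score, _ in scored)
        winners = sorted(pic for score, pic in scored if score == best)
        if len(winners) == 1:
            selected[scene] = winners[0]
        else:
            ties[scene] = winners
    return selected, ties


# Made-up example values, for illustration only.
example = [
    ("scene_1", "a", 4.0, 3.0),
    ("scene_1", "b", 4.5, 2.0),
    ("scene_2", "c", 3.0, 4.0),
    ("scene_2", "d", 3.5, 2.5),
]
selected, ties = select_pictures(example)
print(selected, ties)
```

Returning ties separately mirrors the described workflow, in which pictures with equally high scores were resolved by consensus rather than automatically.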

## Normative Data Collection Study


The evaluation of the final stimulus set of 191 pictures was performed using a computer-based rating procedure and was carried out in Berlin and Magdeburg, Germany, from December 2014 to November 2015.

#### Participants

Sixty-one participants (31 women, 30 men) were recruited for the validation study via advertisements, through various academic mailing lists, and by contacting former participants of earlier experiments done by the authors. A total of 41 participants (26 female) were recruited and tested in Berlin, and 20 participants (five female) performed the task in Magdeburg. Detailed demographic data of the study cohort are displayed in **Table 1**. People interested in participating were first informed about the evaluation process via e-mail and were asked to complete a set of psychological questionnaires at home, including a general health questionnaire and the screening questionnaire of the Structured Clinical Interview for DSM-IV, Section II (SCID-II; First et al., 1996, 1997; Saß et al., 2003). Participants were interviewed for present or past DSM-IV psychiatric disorders using a SCID-I-based screening questionnaire and the appropriate SCID-I modules when applicable. Clinical interviews were performed by the first author under supervision of the last author, who is a board-certified psychiatrist. Exclusion criteria were insufficient knowledge of the German language, a history of head trauma, neurological illness, bipolar disorder, schizophrenia or substance use disorder, and the use of centrally acting medication. Participants with above-cut-off values in the SCID-II questionnaire were interviewed according to the SCID-II manual by the first author, and a potential clinically relevant diagnosis led to exclusion from the study. All participants gave written informed consent prior to participation in the study, in accordance with the Declaration of Helsinki, and received financial reimbursement. The study was approved by the Ethics Committee of the University of Magdeburg, Faculty of Medicine.

#### TABLE 1 | Demographic and psychometric parameters.

Demographic information and psychometric measures are displayed separately for male and female participants. PR, percentile rank; LPS, Leistungsprüfsystem – subtests 3 + 4; MWT-B, Mehrfachwahl-Wortschatz-Intelligenztest form B; BDI, Beck Depression Inventory (Beck et al., 1961); STAI, State-Trait Anxiety Inventory (Laux et al., 1981); STAXI, State-Trait Anger Expression Inventory (Schwenkmezger et al., 1992); BIS, Barratt Impulsiveness Scale (Patton et al., 1995); ADHS, ADHS-Diagnose-Checkliste (Rösler et al., 2004); AQ, Autism Spectrum Quotient (Dammann, 2002); SPF, Saarbrücker Persönlichkeitsfragebogen. Standard deviations are given in parentheses. For normally distributed variables, two-tailed t-tests were calculated; all scales met the homogeneity-of-variance assumption (Levene's test). For variables that were not normally distributed, Mann–Whitney U (U) and Kolmogorov–Smirnov Z (Z) tests were calculated.

#### Schedule

Participants received the biographical chart (**Figure 1**) to familiarize them with the characters and their backgrounds and relationships. This was done for the purpose of further increasing ecological validity, as most daily-life social interactions occur with familiar individuals. Seven days (±2 days) after receiving the chart, participants were scheduled for the actual rating procedure. Due to the length of the procedure, the experiment was split into three experimental sessions that were performed within three to seven days. At the beginning of the study, participants were asked to provide their individual impression of the eight protagonists in written form and to fill in a paper–pencil two-alternative forced-choice quiz designed to ensure that they were sufficiently familiar with the characters (for example questions, see Supplementary Table S2; the complete quiz is available along with the stimulus set).

#### Experimental Paradigm

The actual experiment started with a standardized instruction provided by the experimenter (authors MCH, JI, or NG). Participants were told that they would be presented with scenes depicting the eight characters in various daily-life situations in a total of 191 pictures. The pictures would have no chronological timeline and were to be considered independently from each other.

Pictures were presented on a computer screen (resolution 1600 × 1200 or 1920 × 1080) at a size of 700 × 525 pixels, together with a set of task instructions presented sequentially. The same rating tasks were performed for each of the images:

	- (a) one scale assessing emotional salience (first-person affective)
	- (b) valence ratings across the six basic emotions according to Ekman

The affective and cognitive ToM tasks were designed to closely match the cognitive ToM tasks used in the previously described cartoon-based ToM paradigm developed by Schnell et al. (2011) and Walter et al. (2011). Because single pictures rather than sequences were presented, we opted for the use of a comparative task between two protagonists (instead of the within-subject across-sequence rating employed by Schnell and Walter). Also to match the task by Schnell and Walter, the cognitive ToM task required visual perspective taking (original task: number of animate objects seen by the protagonist; present task: number of human beings seen by the two protagonists).

Because all ratings were performed by lay participants – that is, no data from either experts or clinical populations were collected – they represent normative data rather than accuracy scores at this point. Expert ratings of the ToMenovela are, however, currently in preparation. While absolute accuracy scores cannot be conclusively determined from the ratings performed so far, our normative data do provide information with respect to ambiguity, which in part reflects the difficulty of an item. Thus, researchers may use this information to generate subsets of stimuli with different degrees of ambiguity and thus varying difficulty.

All task instructions, along with the corresponding response options and the purpose of each question are summarized in **Table 2**. The task was self-paced, and participants could interrupt the rating procedure at any time to ensure that they would remain alert for the entire experiment. **Supplementary Figure S1** depicts an example trial. The software used for the rating procedure was programmed in Java (Oracle, Redwood City, CA, USA) by author CS and is available from the authors upon request.

#### Psychometric Questionnaires and Correlations with Stimulus Rating Data

To ensure that participants of the rating procedure were psychopathologically healthy, all participants received a set of well-established psychometric questionnaires, including the Beck Depression Inventory (BDI, Hautzinger et al., 1994), questions 21–40 from the State-Trait Anxiety Inventory (STAI-trait, Spielberger and Lushene, 1966; Laux et al., 1981), the State-Trait Anger Expression Inventory (STAXI, Schwenkmezger et al., 1992), the Barratt Impulsiveness Scale (BIS, Preuss et al., 2003) and an attention deficit hyperactivity disorder checklist (ADHS-CL, adapted from Rösler et al., 2004). The Autism Questionnaire by Baron-Cohen (AQ, Baron-Cohen et al., 2001) and the Saarbrücker Persönlichkeitsfragebogen (SPF, Paulus, 2009) were administered to the participants in an online-based follow-up survey in autumn 2015. As measures of cognitive functions, the Leistungsprüfsystem (LPS, Horn, 1983) and the Mehrfachwahl-Wortschatz-Intelligenztest (MWT, Lehrl, 2005) were obtained, either before or after the evaluation session.

#### TABLE 2 | Task instructions.

To allow for correlational analyses of stimulus ratings and psychometric data, we computed numeric measures that reflected individuals' "typical" response behavior across the stimuli. Specifically, we computed a measure of decisiveness in the third-person affective and third-person cognitive conditions ($[\mathrm{OA}_{A} + \mathrm{OA}_{B}]/\mathrm{OA}_{\mathrm{both}}$), a measure of the tendency to make non-standard responses (i.e., the tendency to choose a response not chosen by the majority of the participants), as well as the mean emotion recognition ratings for the Ekman emotions across scenes. These measures were correlated with the SPF subscales and with the AQ, employing non-parametric Spearman correlations and robust Shepherd's Pi correlations that include an outlier exclusion based on the bootstrapped Mahalanobis distance (Schwarzkopf et al., 2012). All correlations were computed for 59 participants, due to missing SPF and AQ data from one male and one female participant.
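As an illustration of the two response-behavior measures described above, the following sketch computes them from per-participant response lists. The response labels (`"A"`, `"B"`, `"both"`), the function names, the handling of a zero denominator, and the use of the most frequent response per scene as the "majority" response are assumptions made for this example; the article does not specify these details.

```python
from collections import Counter


def decisiveness(responses):
    """Ratio of A-or-B choices to 'both equally' choices for one participant.

    Assumption: returns None when the participant never chose 'both',
    because the ratio is undefined in that case.
    """
    counts = Counter(responses)
    if counts["both"] == 0:
        return None
    return (counts["A"] + counts["B"]) / counts["both"]


def nonstandard_rate(data):
    """Fraction of each participant's responses that deviate from the
    most frequent response given to the same scene.

    data: {participant_id: {scene_id: response}}
    Assumption: ties for the most frequent response are broken by
    first occurrence, which is how Counter.most_common orders them.
    """
    per_scene = {}
    for answers in data.values():
        for scene, response in answers.items():
            per_scene.setdefault(scene, Counter())[response] += 1
    majority = {s: c.most_common(1)[0][0] for s, c in per_scene.items()}

    rates = {}
    for participant, answers in data.items():
        if not answers:
            rates[participant] = None
            continue
        deviations = sum(r != majority[s] for s, r in answers.items())
        rates[participant] = deviations / len(answers)
    return rates


# Made-up responses, for illustration only.
toy = {
    "p1": {"s1": "A", "s2": "B"},
    "p2": {"s1": "A", "s2": "B"},
    "p3": {"s1": "B", "s2": "B"},
}
print(decisiveness(["A", "B", "both", "A"]), nonstandard_rate(toy))
```

Per-participant values of this kind could then be rank-correlated with questionnaire scores, as described in the text.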

# RESULTS

#### Stimuli

As a result of the rating procedure, one image (#164) had to be excluded due to ambiguous interpretation by the raters, leaving a total of 190 images in the stimulus set. Supplementary Table S3 displays the basic characteristics of the images.

### Demographic and Psychometric Results

The demographic and psychometric data of the study cohort are presented in **Table 1**, separated by gender. Women and men in our sample did not differ with respect to age, education, and cognitive measures (assessed with LPS and MWT). There were also no significant differences regarding depressive symptoms (BDI), trait anxiety (STAI), anger (STAXI), or impulsivity (BIS-11). Fisher's exact test yielded no difference [F = 1.607, p = 0.460] with respect to smoking status.

Across the study sample, autism- and empathy-related questionnaires revealed scores in line with previous normative data of the AQ (Baron-Cohen et al., 2001) and the SPF.<sup>4</sup> In both questionnaires, we observed gender differences in the expected directions: male participants had higher mean scores in the AQ (t<sub>59</sub> = −2.985, p = 0.004), while in the SPF, male participants had lower scores on the subscales fantasy (t<sub>59</sub> = 3.731, p < 0.001), empathic concern (t<sub>59</sub> = 3.485, p < 0.001), personal distress (t<sub>59</sub> = 2.389, p = 0.02), and the overall score (t<sub>59</sub> = 3.44, p < 0.001), but no significant difference in perspective taking (t<sub>59</sub> = 5.20, p < 0.605).

#### Behavioral Results

The results from free-text ratings (descriptions of each scene's content and one's own behavioral reactions) are not part of the present work and will be reported separately.

#### Ratings of Emotional Salience and Valence

**Figure 3** depicts the result of the affective salience rating, separated by gender. When asked "How much do you feel affected by the picture?" and responding on a slider comparable to a Likert scale, participants gave the scenes a median rating of approximately 30 percent (women: 29.8; men: 31.4), with a broad range from approximately 10 to 60 percent (women: 8.8 – 64.2; men: 11.0 – 59.3). We provide detailed descriptive statistics of the affective salience ratings (mean, median, mode, standard deviation, skewness, standard deviation of skewness, kurtosis, standard deviation of kurtosis) for each scene along with the stimulus set.

Emotional valence ratings were conducted for the six basic emotions defined by Ekman (happiness, anger, disgust, fear, sadness, surprise; Ekman et al., 1972; Ekman and Friesen, 1975, 1978). The distribution of the emotional valence ratings across scenes is depicted in **Figure 4**, separated by gender. A MANOVA with the six emotions as dependent variables and gender and scene as fixed factors suggested a small but significant tendency for men to rate the images somewhat higher with respect to all six emotions (main effect of gender: Wilks' λ = 0.978, F<sub>6,11205</sub> = 42.83, p < 0.001; interaction gender × scene: Wilks' λ = 0.868, F<sub>6,11205</sub> = 1.21; p < 0.001). However, post hoc univariate tests revealed that the gender effect was not observed for disgust (F<sub>1,11210</sub> = 0.610, p = 0.435), but was present for all other emotions (all F > 14.20, all p < 0.001). Interaction effects reflecting gender differences in the rating of individual scenes were observed for anger, fear, and sadness (all F > 1.19, all p < 0.037), but not for happiness, disgust, and surprise (all F < 1.085, all p > 0.202). Detailed descriptive statistics of the emotional valence ratings (mean, median, mode, standard deviation, skewness, standard deviation of skewness, kurtosis, standard deviation of kurtosis) for each scene are available along with the stimulus set.

<sup>4</sup>The original normative data of the SPF can be found at http://psydok.psycharchives.de/jspui/handle/20.500.11780/3343

#### Cognitive and Affective ToM Ratings

To obtain a measure of ambiguity with respect to the ToM tasks (cognitive: "Can person A or person B see more people"; affective: "Does person A or B feel better"), we computed a simple measure of agreement, namely the ratio of the difference to the sum of A versus B responses, with 1 added to avoid division by 0: $|\Delta_{AB} + 1| / |\Sigma_{AB} + 1|$. Scenes yielding values lower than 1/3 were considered ambiguous with respect to the participants' responses. **Figure 5** displays the results of our evaluation, separated by condition and gender. In the cognitive ToM condition, 15 photographs came out as ambiguous among female participants, and nine among male participants. In the affective ToM condition, 19 images came out as ambiguous in both men and women, although there was only partial overlap. Supplementary Table S4 lists the potentially ambiguous scenes, separated by task and gender.

Note that the "both equally" responses were not considered in this approach, and users of the stimulus set may choose to include "ambiguous" scenes in an experiment when the "both equally" answer was the most common one in the group. Cumulative response data for each scene are available along with the stimulus set.
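The agreement measure and the 1/3 cutoff used in this subsection can be sketched in a few lines. The function names are hypothetical, and the code assumes that the "difference" is the unsigned difference between the numbers of A and B responses; as in the text, "both equally" responses are not part of the computation.

```python
from fractions import Fraction

# Cutoff stated in the text: values below 1/3 count as ambiguous.
AMBIGUITY_THRESHOLD = Fraction(1, 3)


def agreement(n_a, n_b):
    """(difference + 1) / (sum + 1) for the counts of A and B responses.

    Assumption: 'difference' means the absolute difference |n_a - n_b|.
    Fractions keep the comparison with 1/3 exact.
    """
    difference = abs(n_a - n_b)
    total = n_a + n_b
    return Fraction(difference + 1, total + 1)


def is_ambiguous(n_a, n_b):
    """True if the agreement value falls below the 1/3 cutoff."""
    return agreement(n_a, n_b) < AMBIGUITY_THRESHOLD


# Made-up counts, for illustration only.
for n_a, n_b in [(10, 10), (12, 9), (20, 5)]:
    print(n_a, n_b, agreement(n_a, n_b), is_ambiguous(n_a, n_b))
```

With evenly split responses the value approaches zero as the number of responses grows, whereas a unanimous choice yields a value of 1, so lower values indicate less agreement among raters.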

#### Correlations of Stimulus Ratings and Psychometric Data

To assess a potential relationship between response behavior during stimulus evaluation and psychometric measures of self-reported social cognitive abilities, we computed numeric measures that reflected individuals' "typical" response behavior across the stimuli. Across the cohort of study participants (N = 59, due to missing SPF and AQ data from two participants), we observed a significant negative correlation between the empathic concern subscale of the SPF (SPF – EC) and the decisiveness measure in the third-person affective condition (i.e., the tendency to decide for either person A or B to feel better versus choosing the option "both equally"; Spearman's r = −0.30375; p = 0.0193). This correlation remained significant when bivariate outliers were excluded by bootstrapping the Mahalanobis distance (Shepherd's Pi correlation; Schwarzkopf et al., 2012; see **Figure 6**). No other correlations between stimulus ratings and psychometric data reached significance (all p > 0.30).

# DISCUSSION

We have developed a photograph-based normative stimulus set (The ToMenovela) specifically designed for the experimental assessment of social cognition, particularly suitable for neuroimaging studies. All stimuli were designed in a way that (a) ecological validity would be high and (b) different types of ToM- and empathy-related constructs can be assessed experimentally (i.e., affective empathy, affective ToM (≈ cognitive empathy) and cognitive ToM; see Walter, 2012). The stimulus set will be available to other researchers for non-commercial research, free of charge, upon contacting the authors.<sup>5</sup>

# Applicability to the Study of Social Cognition

Our focus during the generation of the stimulus set presented here was high ecological validity. To this end, we scripted a background story and individual scenes revolving around a fictional circle of friends, the eight main characters. The scenes all depict at least two of the eight protagonists, but are nevertheless independent of each other, showing the characters in different combinations and across a variety of social situations and locations. While certain basic characteristics are fixed due to the nature of the stimulus set (e.g., the age of the protagonists, who are in their twenties or early thirties, or the urban setting of the scenes), it should readily be possible for an experimenter to adapt the background story to their requirements.

By using a plausible real-life setting, our stimulus set bears some similarity with the MASC, a movie-based test instrument for the study of social cognition (Dziobek et al., 2006). While the MASC has previously constituted a considerable advance in ecological validity of test instruments of social cognitive processing, it is not without limitations. Its fixed composition as a movie of people at a dinner party limits the spectrum of emotions displayed and the use of non-social control tasks. These two limitations are less prominent in the MET (Dziobek et al., 2008) and in the cartoon-based ToM task developed by Schnell et al. (2011) and Walter et al. (2011), but the ecological validity of those tasks is on the other hand limited by the somewhat artificial construction of the MET stimuli and the lack of facial expressions in the cartoon-based task. Here, we provide a stimulus set that combines a plausible ecological setting with a broad range of emotions displayed across stimuli and the possibility to apply different tasks to the same stimuli.

<sup>5</sup>Please contact us via the ToMenovela website (http://neuro2.med.unimagdeburg.de/~bschott/ToMenovela) to gain access to the stimulus set.

One important limitation of the present stimulus set may be the ethnic background and age range of the eight main characters. First, the ethnic composition was rather narrow, albeit somewhat representative of a European urban area (seven Europeans, one East Indian), which may be an advantage when testing the typically available study population in Europe (or, to some extent, North America or Australia), namely participants drawn from the student body of the researchers' institution (Henrich et al., 2010), but may limit interpretation when the stimulus set is used with a non-Western study population (Adams et al., 2010; Koelkebeck et al., 2011; Hu et al., 2015). Similar considerations apply with respect to age. The protagonists of the ToMenovela are all in their twenties or early thirties. They may thus be highly comparable to the typical cohort of participants in psychological experiments at educational institutions (Henrich et al., 2010). As the biographies were written with our anticipated study populations in mind, we cannot exclude that the biographies provided may have influenced the ratings. Future experimenters may further improve the comparability by adapting the characters' biographies to their specific study populations, although it must be cautioned that doing so might warrant the collection of new normative data. The authors had considered the inclusion of elderly protagonists in the stimulus set, to make it more approachable for older study participants. That would, however, raise the potential confound that the (healthy) elderly are generally capable of imagining or retrieving information from memories of their own youth, while younger participants cannot to the same extent imagine themselves as being old. The authors are aware of the limitations that may arise when applying our stimulus set to a study population that differs substantially from our protagonists with respect to age, ethnicity, or cultural background. We strongly encourage researchers to expand the stimulus set presented here by including other ethnicities or age groups, paving the way for investigations of individual differences in social cognition.

With respect to the 8-SIF framework, it must be noted that the ToMenovela does not contain any immediate (written or auditory) verbal information. Therefore, the factors I2 and I4 of the 8-SIF, the immediate linguistic information about agents or context, could not be implemented in our stimulus set, at least in its present form. While the authors do understand that this may constitute a potential limitation, it should be noted that all images were intended to be comprehensible without verbal information, and preliminary analyses of the free-text responses in our validation study confirm that the content of the images was indeed understood by the participants.<sup>6</sup> We encourage future researchers interested in factors I2 and I4 of the 8-SIF to expand the stimuli by adding spoken or written verbal information to the photographs.

## Normative Evaluation

During our normative data collection, each scene was rated with respect to principal content, cognitive and affective ToM, and to first-person emotional salience and valence – the latter with respect to the six basic emotions according to Ekman (Ekman and Friesen, 1975). Ratings were performed by 61 participants (31 women, 30 men). Women and men in our sample were highly comparable with respect to age, education, intelligence, depressiveness, trait anxiety, anger, and impulsivity. In line with previous studies, autism-related traits were more pronounced in male participants, and men scored lower on several subscales of the empathy-related questionnaires (fantasy, empathic concern, personal distress, and sum scores, but not perspective taking). Supplementary Table S5 displays an overview of the tasks employed during evaluation and their potential applications in future research.

#### Emotional Salience and Valence

Analysis of the salience ratings ("How much do you feel affected by the picture?") revealed a median rating of approximately 30 percent with a broad range from approximately 10 to 60 percent (**Figure 3**). The relatively low median arousal with a broad range was not unexpected, as the authors had aimed to depict real-life situations and interactions in the stimulus set. Along the same lines, the rating of the scenes with respect to basic emotions revealed that happiness was most strongly represented across the stimuli, while, for example, only a few scenes received high ratings for disgust (**Figure 4**). Importantly for future users of our stimulus set, all six emotions were represented in subsets of the scenes, and researchers can select the subset of pictures suitable for their specific research questions.

We found small but significant gender differences in the ratings: men tended to rate the images somewhat higher with respect to emotional salience (first-person affective: "How much do you feel affected by the picture?") and on all emotion ratings except for disgust. Post hoc univariate tests confirmed gender differences for all emotions assessed except disgust. Surprisingly, rather few studies have thus far investigated gender differences in emotion processing. Previous studies using images from the IAPS (Lang et al., 1998) either suggested that women had a higher tendency to rate pictures as fearful (Barke et al., 2012) or found no gender differences at all (Gruhn and Scheibe, 2008). With respect to happiness – and possibly surprise – ratings, on the other hand, our results are in line with previous studies that have shown men to rate pictures more positively (Barke et al., 2012), particularly pictures with erotic content (Bradley et al., 2001). Our stimulus set, while not displaying explicit nudity, does contain scenes with (in most cases implicit) erotic content that might have contributed to the overall more positive ratings by male participants. It must be cautioned, however, that the scenes were not designed to elicit extreme emotional responses as is the case with the IAPS pictures. Therefore, further research is required to systematically characterize the gender differences observed here. Finally, the authors would like to emphasize that all differences observed were, albeit significant, quantitatively small and should therefore be unlikely to affect the usability of our stimulus set. Furthermore, we did not include experts such as psychotherapists or people well versed in the Facial Action Coding System (FACS; Ekman and Friesen, 1978) to evaluate the pictures from a professional point of view, and we therefore do not provide a gold standard for salience and valence norms.

<sup>6</sup>Please note that one picture (#164), for which the free-text responses suggested ambiguity of content, was excluded from the stimulus set for that reason.

#### Results on Third-Person ToM: Agreement across Raters

Analysis of the cognitive and affective ToM conditions revealed that only a small subset of images yielded ambiguous responses. In the cognitive condition ("Who can see more people?"), 15 photographs were rated as ambiguous among female participants, and nine among male participants (Supplementary Table S4). In the affective ToM condition ("Does person A or B feel better?"), 19 images were rated as ambiguous by both men and women, although there was only partial overlap. Depending on future researchers' need for unambiguous stimulus material, scenes with little or no disagreement can be selected from our stimulus set. The detailed results of the rating procedure are available along with the stimulus set. It should be noted at this point that a certain degree of ambiguity of the scenes may be unavoidable, given that our focus was on ecological validity of the stimulus material, and ambiguity of certain stimuli is most likely not unique to the ToMenovela. For example, rating studies of the well-established IAPS stimuli suggest that several pictures did not receive high ratings on the initially intended emotions in a normative rating procedure (Barke et al., 2012). On the other hand, some researchers may want to explicitly include ambiguous scenes, for example in order to vary cognitive load or task difficulty. Most ToM or mentalizing tasks currently in use employ simplified settings, unimodal structures or highly simplified fictional characters. As mentalizing can be conceptualized as "an executive component managing the multiple aspects of representations that are concurrently activated by the inherently complex everyday social interactions" (Brunet-Gouet et al., 2011), we suggest that the naturalistic setting employed in our paradigm invariably includes some degree of ambiguity, at least in a subset of the stimuli, while rather accurately representing daily life social interactions.

#### Relationship of Stimulus Ratings with Self-report Measures of Social Cognition

Correlational analyses revealed a negative relationship between decisiveness in the third-person affective condition and the empathic concern subscale of the SPF (**Figure 6**). This may appear somewhat surprising, as this negative correlation suggests that participants with higher empathic concern show more difficulties in judging an individual's emotion. On the other hand, there is considerable debate with respect to potential subdivisions of the ToM construct into different subprocesses like emotion recognition, understanding of causality, or the ability to distinguish knowledge and facts (Kanske et al., 2015). Furthermore, a distinction has been suggested between affective empathy, affective ToM/cognitive empathy, and cognitive ToM (Walter, 2012; Schaafsma et al., 2015). Kanske et al. (2015) recently demonstrated that empathy and ToM can be orthogonalized within the same task at both the behavioral and neural level. With respect to the present results, this notion points to the possibility that increased empathic concern may induce difficulties in some individuals when it comes to making (comparative) decisions about other people's feelings. One limitation in this context is that we did not record reaction time data, which would provide a more objective measure to further substantiate this interpretation.

# Limitations and Directions for Future Research

It should be noted that, as of now, expert evaluation of the ToMenovela has not been completed, and thus the stimulus set does not yet represent a performance test that can be used for investigating mentalizing skills or deficits at the behavioral level. Future studies are planned that will obtain both expert ratings on the stimulus set and ratings from clinical populations like individuals with autism spectrum disorders, both of which will be used to establish concurrent and discriminant validity. In addition, other researchers may develop new questions applicable to our stimulus set, for example with respect to social cue recognition or potential gender-related differences in ToM for male versus female characters. We have summarized the purpose of each question used in the initial evaluation, along with potential use cases, in Supplementary Table S5, in order to provide suggestions for future applications of the ToMenovela stimuli.

# Availability

The ToMenovela stimulus set is freely available for use in noncommercial scientific research. Functionalities of this online service include the picture set in three different resolutions, full normative data and the full quiz. To prevent circulation of the pictures unrelated to research usage, scientists will be requested to provide contact details and a brief outline of their research purpose when accessing the ToMenovela database. All details required for access can be found at http://neuro2.med.unimagdeburg.de/~bschott/ToMenovela. The script of the scenes is available in German only and can be obtained from the first author (maike.herbort@charite.de).

# ETHICS STATEMENT

The study was approved by the Ethics Committee of the Otto von Guericke University, Magdeburg, Faculty of Medicine. All actors gave written informed consent for the use of the resulting photographs for research purposes. All participants of the evaluation study gave written informed consent prior to participation in the study in accordance with the Declaration of Helsinki. Some photographs display children as supporting actors. All parents were informed about the purpose of the stimulus set and consented to have their children participate in the photo shootings. At least one parent or (in the case of children over 10) a person entrusted by the parents was always present when photographs involving children were taken. No children served as supporting actors in photographs with potentially disturbing content (e.g., accidents, fighting, sexually suggestive scenes).

#### AUTHOR CONTRIBUTIONS

MCH, BR, HW, and BHS designed research; MCH, BR, JI, CS, and NG performed research; CS programmed the stimulus rating software; MCH, JI, CS, TW, and BHS analyzed the data; RH and ID supervised evaluation of stimulus material and data analysis; MCH, HW, ID, and BHS wrote the paper. All authors approved the final version of the manuscript.

# FUNDING

This work was supported by the Deutsche Forschungsgemeinschaft (DFG, SFB 779, TP A08 and A10) and by the Leibniz Association (Leibniz Graduate School "Synaptogenetics").

## REFERENCES


# ACKNOWLEDGMENTS

The authors would like to thank all actors for their participation. We are particularly grateful to our main actors, Kai Kittler-Packmor (Oliver), Fabian Dott (Jonas/Noah), Vinzenz Rothenburg (Viktor/Victor), Jörn Kriehmig (Hannes/Jack), Carla Junghans (Theresa), Annika Packmor (Kathrin/Catherine), Mandy Promok (Lea/Leah), and Lisa Budzinsky (Celine) for their exceptional effort. We would like to express our gratitude to Sven Reichelt (http://lensbreaker.com) for photography and picture post-processing and to Adriana Barman for helpful and inspiring discussions during the initial planning phase of the project. Furthermore, the authors would like to give special thanks to Marlene Promok, Alessa Tschaftary, Ramona Henkel, Thilo Krause, Alina Kirichenko, and Christa Herbort, and to all shopkeepers, café-owners, medical practitioners, and other professionals who allowed us to perform shootings at their places.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.01883/full#supplementary-material

#### FIGURE S1 | Example trial of the normative data collection study.


current concepts. Alzheimer Dis. Assoc. Disord. 27, 56–61. doi: 10.1097/WAD.0b013e31824ea5db


Horn, W. (1983). Leistungsprüfsystem L-P-S. Göttingen: Hogrefe.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Herbort, Iseev, Stolz, Roeser, Großkopf, Wüstenberg, Hellweg, Walter, Dziobek and Schott. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.