# DYNAMICS OF SENSORIMOTOR INTERACTIONS IN EMBODIED COGNITION

EDITED BY: Guillaume T. Vallet, Benoit Riou, Lionel Brunel and Nicolas Vermeulen PUBLISHED IN: Frontiers in Psychology

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

*All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-807-8 DOI 10.3389/978-2-88919-807-8



Topic Editors:

**Guillaume T. Vallet,** Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Montreal, Canada

**Benoit Riou,** Université Lyon 2, France

**Lionel Brunel,** Université Paul Valery, France

**Nicolas Vermeulen,** Université Catholique de Louvain, Belgium

Image by Thomas Camus

We interact with our environment through perception and action. Perception is based on sensory components while actions are based on motor components. It is commonly accepted that these sensorimotor components constitute the foundation of knowledge (i.e., percepts and concepts), action and emotion. However, whether or not these components remain part of knowledge, action and emotion is still being debated (see Glenberg, Witt, & Metcalfe, 2013). According to the classical symbolic/abstracted approach to cognition, cognitive processes operate on symbols that are abstracted from these components. Conversely, embodied cognition theory states that knowledge, action and emotion remain grounded in these sensorimotor components (see Wilson, 2002). This embodiment revolution assumes that the interactions between present and absent (but simulated in memory) sensorimotor components determine the emergence of knowledge, action and emotion (Barsalou, 2008). It also implies that perception, memory (in particular conceptual knowledge), action and emotion interact more closely than previously thought (e.g., Riou, Lesourd, Brunel & Versace, 2011; Corveleyn, Lopez-Moliner & Coello, 2012; Vermeulen et al., 2013).

Despite the accumulation of empirical evidence showing that perception, memory, action and emotion interact, less is known about the dynamics of these interactions. It remains to specify their temporal dynamics (when these interactions occur), the underlying neural networks, and the factors that modulate them. The present research topic focuses on the dynamic relationship between present and absent sensorimotor components across perception, memory, action and emotion in a grounded cognition perspective. This research topic aims (1) to demonstrate the validity of embodied cognition theories, (2) to highlight the dynamics of the emergence of conceptual knowledge, action and emotion, and (3) to provide comprehensive, state-of-the-art theoretical explanations and/or models.

**Citation:** Vallet, G. T., Riou, B., Brunel, L., Vermeulen, N., eds. (2016). Dynamics of Sensorimotor Interactions in Embodied Cognition. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-807-8

# Table of Contents

*Editorial: Dynamics of Sensorimotor Interactions in Embodied Cognition*

Guillaume T. Vallet, Lionel Brunel, Benoit Riou and Nicolas Vermeulen

*Mechanisms of embodiment*

Katinka Dijkstra and Lysanne Post

*What are memory-perception interactions for? Implications for action*

Loïc P. Heurley and Laurent P. Ferrier

*Manipulation gesture effect in visual and auditory presentations: the link between tools in perceptual and motor tasks*

Amandine E. Rey, Kévin Roche, Rémy Versace and Hanna Chainay

*The embodied dynamics of perceptual causality: a slippery slope?*

Michel-Ange Amorim, Isabelle A. Siegler, Robin Baurès and Armando M. Oliveira

*Visiting Richard Serra's "Promenade" sculpture improves postural control and judgment of subjective visual vertical*

Zoï Kapoula, Alexandre Lang, Thanh-Thuan Lê, Marie-Sarah Adenis, Qing Yang, Gabi Lipede and Marine Vernet

*Evidence for the embodiment of space perception: concurrent hand but not arm action moderates reachability and egocentric distance perception*

Stéphane Grade, Mauro Pesenti and Martin G. Edwards

*For your eyes only: effect of confederate's eye level on reach-to-grasp action*

François Quesque and Yann Coello

*Children's looking preference for biological motion may be related to an affinity for mathematical chaos*

Joshua L. Haworth, Anastasia Kyvelidou, Wayne Fisher and Nicholas Stergiou

*Starting off on the right foot: strong right-footers respond faster with the right foot to positive words and with the left foot to negative words*

Irmgard de la Vega, Julia Graebe, Leonie Härtner, Carolin Dudschig and Barbara Kaup

*When "good" is not always right: effect of the consequences of motor action on valence-space associations*

Denis Brouillet, Audrey Milhau and Thibaut Brouillet

*Affective valence facilitates spatial detection on vertical axis: shorter time strengthens effect*

Jiushu Xie, Yanli Huang, Ruiming Wang and Wenjuan Liu

*Spatial biases during mental arithmetic: evidence from eye movements on a blank screen*

Matthias Hartmann, Fred W. Mast and Martin H. Fischer

*Pushing forward in embodied cognition: may we mouse the mathematical mind?*

Martin H. Fischer and Matthias Hartmann

*Extending the reach of mousetracking in numerical cognition: a comment on Fischer and Hartmann (2014)*

Thomas J. Faulkenberry and Amandine E. Rey


# Editorial: Dynamics of Sensorimotor Interactions in Embodied Cognition

#### Guillaume T. Vallet<sup>1</sup>\*, Lionel Brunel<sup>2</sup>, Benoit Riou<sup>3</sup> and Nicolas Vermeulen<sup>4</sup>

<sup>1</sup> Department of Psychology, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Montreal, Canada, <sup>2</sup> Laboratoire EPSYLON, Department of Psychology, Université Paul-Valéry Montpellier III, Montpellier, France, <sup>3</sup> Laboratoire d'Etude des Mécanismes Cognitifs, Department of Psychology, Université Lyon 2, Lyon, France, <sup>4</sup> Psychological Sciences Research Institute, Université Catholique de Louvain, Louvain-la-Neuve, Belgium

Keywords: embodied cognition, grounded cognition, situated cognition, sensorimotor interactions, memory, action, perception, emotion

### **The Editorial on the Research Topic**

### **Dynamics of Sensorimotor Interactions in Embodied Cognition**

The concept of incorporating the current situation and the body state within cognitive processes, referred to as embodiment, has revolutionized cognitive research (Glenberg et al., 2013). Interest in this approach has grown substantially in the last few decades. Embodied cognition has now been demonstrated across a wide range of topics, from babies (e.g., Smith and Gasser, 2005) to elderly adults (e.g., Vallet et al., 2013a), from normal cognition to neuropsychology (e.g., Vallet et al., 2013b) as well as in emotion (e.g., Vermeulen et al., 2007), and in neuroscience as a whole (e.g., Pulvermüller, 2013). Nevertheless, much remains to be discovered in order to better understand embodiment.

One of the striking arguments of embodiment is that sensorimotor features are at the core of mental processes (Pecher and Zwaan, 2005; Barsalou, 2010). Their relationship should also be considered to be dynamic. In embodied cognition approaches, perception, memory, and action are no longer regarded as (relatively) independent functions, but rather as closely interacting components. Some authors even argue for an overlap between these processes (Brunel et al., 2009, 2015; Vermeulen et al., 2009) that would rely on the same neural code (e.g., Hommel). One key direction in embodiment that needs to be further explored is the dynamic interaction between the sensory and motor components across the different cognitive functions. The present Research Topic aims to shed new light on this issue.

The importance of studying this issue is well presented in Dijkstra and Post's critical review. Their article highlights the crucial role of sensorimotor simulation across several cognitive activities (e.g., reasoning, evaluating) and how the interaction between the current situation and the simulation of past sensorimotor states mediates the emergence of adaptive behavior. This is possible since perception and memory interact closely and share processes and resources (e.g., Vermeulen et al., 2008; Riou et al., 2011; Rey et al., 2014). In other words, "direct" perception of an object and the mental simulation of this object involve common sensorimotor units. Heurley and Ferrier argue in their perspective that these interactions serve to plan and control actions to interact with well-known objects in non-optimal perceptual conditions, thus producing adaptive behavior.

The idea that cognitive activities are strongly influenced by the given social and physical environment is referred to as situated cognition. Rey et al. demonstrate that motor simulation relies on the relevance of the object as a function of the task. Further evidence also supports the hypothesis that sensorimotor simulation is heavily influenced by situational aspects (Kapoula et al.). For instance, Amorim et al. have demonstrated that the visual angle and screen orientation, along with contextual information, modulate the resulting movement.

Edited and reviewed by: Bernhard Hommel, Leiden University, Netherlands

> \*Correspondence: Guillaume T. Vallet gtvallet@gmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 27 November 2015 Accepted: 30 November 2015 Published: 05 January 2016

#### Citation:

Vallet GT, Brunel L, Riou B and Vermeulen N (2016) Editorial: Dynamics of Sensorimotor Interactions in Embodied Cognition. Front. Psychol. 6:1929. doi: 10.3389/fpsyg.2015.01929

The reverse is also applicable, since sensorimotor simulation shapes our perception of, as well as our interaction with, our environment. Grade et al. found that similar action simulation underlies both reachability and egocentric distance perception. This effect can be generalized across individuals and possibly incorporate the role of social environment in cognition. Accordingly, Quesque and Coello show that during reach-to-grasp actions, participants unconsciously modify their trajectory curvature based on their partner's eye level. This adaptation to biological motion is one of the key components of social interaction, but human movements are so complex that they might be described as mathematically chaotic. Nonetheless, children quickly develop the ability to coordinate gaze, and in some respects posture, in response to complex chaotic motion structures (Haworth et al.).

The dynamics of sensorimotor interactions allow for situational behavioral adaptation in reaching and grasping, but also in more complex evaluative processes. For instance, associations between handedness and valence have previously been found (e.g., Casasanto, 2011). de la Vega et al. extend this finding by showing that strong right-footers respond to positive words faster with the dominant foot. The association between valence and laterality, however, becomes less clear when motor fluency is taken into account. Brouillet et al. observe that most participants prefer choosing stable supports for "good" items, regardless of side and handedness. The space-valence association has also been reported for the vertical axis. Xie et al. demonstrate that the processing of affective valence concepts activates the vertical spatial axis (positive in the up position).

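The association between cognition and space in the context of emotion is well documented, but it is not restricted to this cognitive domain. Hartmann et al. report that simple arithmetic is associated with gaze shifting along the vertical axis. Interestingly, operand magnitude partly modulated horizontal gaze position as well. The reciprocal influence of motor components on numerical cognition may also provide the opportunity to assess numerical cognition with methods such as mouse tracking. In their perspective, Fischer and Hartmann point out important insights and methodological considerations on conceptual aspects related to numerical cognition. This perspective has been further elaborated by Faulkenberry and Rey.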

These recent reports offer a new perspective on cognitive functioning, one that combines sensorimotor dynamics with contextual and body information. These new concepts provide new opportunities to explore related domains such as motivation (Shalev) and anticipation (Raab) and offer a new framework to interpret well-known and sometimes contradictory results in fields such as short-term memory (Macken et al.) or healthy cognitive aging (Vallet).

### AUTHOR CONTRIBUTIONS

All authors participated in writing the editorial article. All of them also contributed significantly to editing the articles of the Research Topic.

### FUNDING

GV is supported by a postdoctoral grant from the Fonds Québécois de la Recherche en Santé (FRQS). LB is supported by ACCEPT ("Assistance Tools and Cognitive Contribution: Embodied Potential of Technology"), French research ministerial mission (MiRe-DREES and CNSA). BR is supported by FS-CH (Formation Universitaire à distance, Suisse) and this work by the LabEx Cortex ("Construction, Function, and Cognitive Function and Rehabilitation of the Cortex," ANR-10-LABX-0042) of Université de Lyon, within the program "Investissements d'Avenir" (ANR-11-IDEX-0007) operated by the French National Research Agency (ANR).

### ACKNOWLEDGMENTS

The authors wish to thank Kristina Aurousseau for her precious help and meticulous proofreading as well as all authors and reviewers involved in the present research topic.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Vallet, Brunel, Riou and Vermeulen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Mechanisms of embodiment**

### *Katinka Dijkstra\* and Lysanne Post*

*Department of Psychology, Faculty of Social Sciences, Erasmus University Rotterdam, Rotterdam, Netherlands*

This paper is a critical review of recent studies demonstrating the mechanism of sensorimotor simulation in different cognitive domains. Empirical studies that specify conditions under which embodiment occurs in different domains will be discussed and evaluated. Examples of relevant domains are language comprehension (Tucker and Ellis, 1998), autobiographical memory (Dijkstra et al., 2007), gestures (Alibali et al., 2014), facial mimicry (Stel and Vonk, 2010), and problem solving (Wiemers et al., 2014). The focus of the review is on supporting claims regarding sensorimotor simulation as well as on factors that modulate dynamic relationships between sensorimotor components in action and cognitive domains, such as expertise (Boschker et al., 2002). This discussion takes place within the context of currently debated issues, specifically the need to specify the underlying mechanisms of embodied representations (Zwaan, 2014; Körner et al., 2015).

**Keywords: embodied cognition, sensorimotor simulation, memory, cognition, online embodiment, offline embodiment**

#### *Edited by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada*

#### *Reviewed by:*

*Lionel Brunel, Université Paul Valéry, France; Denis Brouillet, Université Paul Valéry, France*

> *\*Correspondence: Katinka Dijkstra k.dijkstra@fsw.eur.nl*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 28 January 2015 Accepted: 22 September 2015 Published: 15 October 2015*

#### *Citation:*

*Dijkstra K and Post L (2015) Mechanisms of embodiment. Front. Psychol. 6:1525. doi: 10.3389/fpsyg.2015.01525*

## **INTRODUCTION**

More than two decades after the grounding problem in symbol theories was brought up (Harnad, 1990), embodied cognition approaches have gained a stronghold in the study of cognition (Dijkstra and Zwaan, 2014). Since the first notion that cognition is grounded in perception and action (Glenberg, 1997), an abundance of empirical studies have demonstrated support for this groundedness (see for example Glenberg et al., 2013; Dijkstra and Zwaan, 2014).

Recently, the need to take stock of what all these studies contribute to the concept of grounded and embodied cognition has been expressed (Willems and Francken, 2012). Moreover, discontent with the current state of affairs has been noted. For example, opposing results in different studies have been interpreted as supporting embodied cognition approaches, indicating that the predictions may be too general to be falsified (Willems and Francken, 2012). Also, boundary issues have been raised regarding what phenomena embodied cognition approaches may or may not be able to explain (Mahon and Caramazza, 2008). For example, the issue of whether, or to what extent, abstract symbols are also grounded in action and perception is a recurring topic in the debate on embodied versus disembodied approaches. Abstract concepts generally do not have physical or spatial referents, which renders a direct mapping of an abstract concept onto a sensorimotor domain problematic at the very least (Mahon and Caramazza, 2008; Dijkstra et al., 2014).

This criticism does not stand on its own but is complemented by constructive proposals to counter the potential "erosion of the embodiment concept" (Willems and Francken, 2012) and to get out of the "impasse" regarding the discussion around the grounding of language comprehension (Zwaan, 2014). These proposals converge on the issue that the focus of current research should not be on supporting either embodied or disembodied accounts but on *how* sensorimotor systems and cognitive processes interact. The *pluralist view of cognition* proposed by Zwaan (2014) entails that abstract and grounded symbols contribute differently to language comprehension depending on how language is embedded in the environment in which it is used. In this view, research should focus on what the representations consist of and assess when and how they interact (Zwaan, 2014). Other proposals claim that research on embodiment lacks explanatory power (Körner et al., 2015), that it should specify the conditions under which certain phenomena do or do not occur (Willems and Francken, 2012), and that it should integrate existing knowledge regarding embodiment to describe its underlying mechanisms (Körner et al., 2015).

This review is written within the scope of these proposals to more specifically assess conditions under which embodiment occurs or not and to gain insight into underlying mechanisms of embodiment. To accomplish this, studies are drawn from various domains in research in which embodiment effects have been found under clearly defined conditions and that demonstrate the mechanism that underlies these embodiment effects. The selection of studies discussed in this review is by no means exhaustive but illustrates how a limited number of claims that are empirically supported can contribute to a deeper understanding of embodiment effects. "Embodiment" is defined here broadly as the effect that the body or parts of the body (movement, position) can have on cognition or vice versa.

### **MECHANISMS OF EMBODIMENT: SENSORIMOTOR SIMULATION**

Recent reviews on embodied cognition have focused on establishing mechanisms of embodiment (Körner et al., 2015), describing relevant domains in embodiment effects (Dijkstra and Zwaan, 2014), such as language comprehension (Fischer and Zwaan, 2008), or stipulating a theory on embodied simulation (Gallese, 2007, 2009), or grounded cognition (Barsalou, 2008, 2010). A common element in these reviews is that sensorimotor simulation is considered to be a core element in cognitive processing.

Most recently, the idea of sensorimotor simulation as one of the main mechanisms underlying embodiment has been developed further into specific ways in which this sensorimotor simulation takes place (Körner et al., 2015). The first claim is that the perception of a stimulus automatically triggers a simulated reenactment of interaction with it. Actions are facilitated when they match the simulated actions and impeded when there is a mismatch between the two. The second claim is that simulation may be blocked by a concurrent task that involves the same sensorimotor resources. The third claim is that simulation may also work offline, whereas the fourth claim entails that simulation depends on previous experiences and skills. Two other claims are discussed later on in this review.

In this review, both the mechanism of sensorimotor simulation and the claims derived from it are discussed in the context of research domains that illustrate the conditions under which embodiment occurred or failed to occur. Some of these domains have already been studied extensively, such as language comprehension, whereas fewer studies have been dedicated to other domains, such as autobiographical memory, gestures, expertise, facial mimicry, and problem solving. The review will remedy this omission by devoting more attention to these domains.

### **SENSORIMOTOR SIMULATION IN LANGUAGE COMPREHENSION**

The idea of sensorimotor simulation is that there are neural correlates between the content of what is being read and represented (i.e., action words) and the areas in the brain being activated (i.e., actions). Embodied cognition research on language comprehension has focused on examining the simulations being formed when reading sentences that are compatible with certain actions that are performed (Glenberg and Kaschak, 2002; Fischer and Zwaan, 2008; Kaschak et al., 2014). In one study, participants read sentences describing actions toward the body (John gave you a pencil) or away from the body (I closed the drawer) and had to make sensibility judgments regarding the content of the sentence by moving their hand toward or away from their body to press a button. Participants responded faster when the direction implied in the sentence (e.g., toward the body) was congruent with their own action (Glenberg and Kaschak, 2002). This supports the claim of congruence effects in sensorimotor simulation, with performance facilitated in congruent relative to incongruent conditions.
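
A congruence effect of this kind is commonly summarized as the difference between mean response times in incongruent and congruent trials. The following is a minimal sketch of that computation, not the analysis used by Glenberg and Kaschak (2002): the function name `congruency_effect`, the trial fields, and the response times are all hypothetical and invented purely for illustration.

```python
from statistics import mean


def congruency_effect(trials):
    """Return mean RT (incongruent) minus mean RT (congruent), in ms.

    Each trial is a dict with hypothetical keys 'sentence_dir' and
    'response_dir' (either 'toward' or 'away') and 'rt_ms'.
    A positive value means congruent trials were answered faster.
    """
    congruent = [t["rt_ms"] for t in trials
                 if t["sentence_dir"] == t["response_dir"]]
    incongruent = [t["rt_ms"] for t in trials
                   if t["sentence_dir"] != t["response_dir"]]
    if not congruent or not incongruent:
        raise ValueError("need at least one trial of each type")
    return mean(incongruent) - mean(congruent)


# Invented illustrative values, not experimental data.
example_trials = [
    {"sentence_dir": "toward", "response_dir": "toward", "rt_ms": 1300},
    {"sentence_dir": "away", "response_dir": "away", "rt_ms": 1340},
    {"sentence_dir": "toward", "response_dir": "away", "rt_ms": 1400},
    {"sentence_dir": "away", "response_dir": "toward", "rt_ms": 1420},
]
effect = congruency_effect(example_trials)
```

With these made-up values the congruent mean is 1320 ms and the incongruent mean is 1410 ms, so the sketch yields a facilitation of 90 ms for congruent trials.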

Other studies have shown how simulation is prevented from happening and how this affects cognitive processing. If the neural system is already engaged with a task at the time that a similar task requires a response and also involves the same neural circuits, interference will occur and embodiment is impeded. Consequently, there will be no facilitation of a congruent bodily manipulation on cognitive processes. This has been demonstrated in a study in which participants listened to sentences that implied upward or downward motion (e.g., the cat climbed the tree) and concurrently watched a display on a screen that scrolled upward or downward (Kaschak et al., 2005). Participants took longer to decide whether the sentence made sense when sentence content and visual display were congruent than when they were incongruent. They could not simulate the motion described in the sentence because they were processing the same motion in a different task. In the second experiment (Kaschak et al., 2005), participants responded faster to sentences when the visual stimulus concurrently showed motion in the direction opposite to that implied by the sentence content. Apparently, when neural circuits are already engaged for the motion part of the experiment, it is not possible to simulate sentence content that requires a mental representation of motion information at the same time. Thus, interference occurred for congruent but not for incongruent mappings.

The studies discussed above illustrate how and when the simulation mechanism operates and when it does not. They also support the claim that the perception of stimuli triggers their simulation when they match and the claim that simulation may be blocked by a concurrent task that involves the same sensorimotor resources. Compatibility effects are only present when neural circuits are available for simulation of congruent stimulus materials. If not, interference for processing congruent stimulus materials occurs. The third claim, that simulation can work offline, is also supported by empirical research and is described in more detail next.

Sensorimotor simulation may happen offline (Niedenthal et al., 2005; Körner et al., 2015) in the absence of a bodily state or action. Research has shown that retrieving an object from memory, and thinking about or imagining an object, generates responses in the body that show similarities to the responses that would occur if the object were present (Brouillet et al., 2010). Such an offline embodiment effect was demonstrated in a study conducted by Tucker and Ellis (1998). In this study, lists of words were presented to participants. Half the words represented small objects (penny) and half represented large objects (beach ball), and the objects could be either natural or artificial. Participants indicated whether the object was natural or artificial with a joystick and had to apply either a full grasp of the joystick or a precision grasp between thumb and forefinger. The results indicated that congruence between the grasp response and the type of object resulted in faster responses (small object-precision grip) than when this congruence was absent (large object-precision grip). Apparently, participants respond in the absence of the object much as they would if the object were actually present.

Another study demonstrating offline embodiment under congruent conditions of sensorimotor simulation was a language comprehension study (Pecher et al., 2009) in which participants indicated whether a pictured object had occurred in the sentence preceding the picture. Previous studies have demonstrated that participants tend to simulate the shape of the object: they are slower to respond "yes" when the pictured shape does not entirely match the shape implied by the sentence than when it matches closely (Stanfield and Zwaan, 2001; Zwaan et al., 2002). These were online effects. In the study by Pecher et al. (2009), however, the matching or mismatching pictures were presented at the end of the experiment when all sentences had been presented or after a 45-min delay. In both retention conditions, a match effect was found. Simulation of relevant aspects of the stimuli (shape) from memory should have occurred because only this mechanism can explain the faster responses under congruent (match) than under incongruent (mismatch) conditions.

The studies discussed above demonstrate sensorimotor simulation while processing the stimuli or a considerable time after these stimuli have been processed or reconstructed from memory. Sensorimotor simulation affects the speed with which responses are made and requires that relevant aspects of the stimuli (for example, shape or motion direction) are congruent. It does not transpire, however, when the neural circuits are already engaged in a concurrent sensorimotor simulation, which blocks the simulation from happening. These results illustrate what the favorable conditions for sensorimotor simulation are. Simulation happens under favorable conditions, such as compatibility, without the requirement that a bodily state or action is present at the same time. Simulation is hindered, however, when the same sensorimotor resources engaged in one task are also needed for a second, concurrent task, suggesting that simulation may be a default mechanism that is constrained by how resources are engaged. More compatibility effects have been demonstrated in research on language comprehension (for an overview, see Fischer and Zwaan, 2008; Kaschak et al., 2014) but it exceeds the purpose of this review to mention them all. Importantly, we aim to illustrate how certain claims regarding sensorimotor simulation also operate in different domains, such as gestures.

## **SENSORIMOTOR SIMULATION IN GESTURES**

According to the Gesture as Simulated Action (GSA) framework (Hostetter and Alibali, 2008), gestures emerge when premotor activation exceeds a certain threshold and spreads to motor activation. This premotor activation is evoked by embodied (sensorimotor) simulations, which occur, for instance, while thinking about a task that would be facilitated by the use of gestures, such as pointing. The motor activation that results from exceeding the threshold is a gesture. This way, gestures express embodied simulations.

Gestures do not only express embodied simulations, but also have a causal role in the sense that gestures highlight perceptual and motor information, which is consequently more likely to be incorporated in reasoning (Alibali et al., 2014). Specifically, the information conveyed in gestures influences the listeners' reasoning, because gestures refer to present and absent entities through pointing gestures and representational gestures, respectively, which helps the listener to comprehend what the speaker is saying. Besides letting the listener literally see what the speaker means through pointing, referential gestures can influence the listener's simulations through activation of the motor system. They can also cause the listener to mimic the speaker's gestures, which allows the gestures to influence the listener's reasoning in the same way they influenced the speaker's. This influence of gestures on simulations proposed in the GSA framework is therefore in line with the idea of sensorimotor simulation as a major mechanism underlying embodiment as proposed by Körner et al. (2015).

Indeed, several studies suggest that perception of gestures automatically triggers their simulation and that simulation may also work offline. A study by Cook and Tanenhaus (2009), for example, showed that speakers' co-speech gestures while explaining a problem solving task affected listeners' behavior when solving the problem on a computer (later in time, i.e., offline): the mouse trajectories resembled the observed gestures. There is even evidence that speakers' gestures activate listeners' motor systems (i.e., automatic simulation; Ping et al., 2014). This study also showed that, in line with the claims made by Körner et al. (2015), actions are facilitated when they match the simulated actions and impeded when they mismatch. That is, the initially found positive effect of observing gestures on reaction times for pictures after congruent sentences disappeared when listeners moved their arms and hands. In contrast, the effect did not disappear when participants moved their legs and feet, indicating that the speakers' gestures indeed activated the listeners' motor system, specifically the part of the motor system that controls the gesture-related body parts, the arms and hands (Ping et al., 2014).

There seems to be, however, more to gesturing than sensorimotor simulation. Pouw et al. (2014) focus on the mechanism of offloading onto the environment, a claim on embodiment stipulated by Wilson (2002). According to Pouw et al.'s (2014) theory, gestures are external placeholders for internal processes. This offloading on the environment allows more internal processes to take place at the same time. Their point is illustrated by the following two examples. First, gesturing as if one actually rotates an object during a mental rotation task (e.g., Chu and Kita, 2008) can reveal information that is otherwise difficult to mentally compute and provide a physical platform to support internal processes (Pouw et al., 2014). Secondly, pointing gestures can aid internal cognitive processes by helping to keep track of counting (Delgado et al., 2011) or mental calculations (Hatano et al., 1977; Hatano and Osawa, 1983). Thus, Pouw et al. (2014) argue that gestures have a cognitive function, because they are used for and support cognition when the costs of mental computation are high (either by internal or external constraints).

Now, within the scope of the current paper, we ask what can the interaction between gestures and memory tell us about the underlying mechanisms of embodiment? Are gestures an externalization of cognition (Pouw et al., 2014) or are the relationships between gestures and cognition bidirectional, with one influencing the other and vice versa (Alibali et al., 2014)? Even though these two theories are not mutually exclusive, empirical research has mainly shown support for the latter claim (Beilock and Goldin-Meadow, 2010; Hostetter and Alibali, 2010; Post et al., 2013).

Positive effects of gestures on children's memory were found in a substantial number of studies on mathematics (Broaders et al., 2007; Cook et al., 2008), memory of a fictional story (Stevanoni and Salmon, 2005), and word learning (Tellier, 2008; see Macedonia and Von Kriegstein, 2012, for a review). Word learning in a foreign language was also facilitated by gesture observation for adults in a study by Kelly et al. (2009). These results are in line with the current ideas about embodiment. In the different studies, gesture production and observation seem to reflect and even elicit sensorimotor simulations. Because it is theorized that these simulations automatically emerge, gestures should be an effective way to improve memory. Indeed, memory was improved when participants used gestures (Kelly et al., 2009).

The studies discussed above demonstrated positive effects of gestures on learning and memory. But there are also circumstances under which gestures do not enhance learning (Beilock and Goldin-Meadow, 2010; De Nooijer et al., 2013; Post et al., 2013). Does that mean there is no embodiment in those cases? Or do those studies perhaps even provide evidence that embodiment is not characterized by sensorimotor simulations? Not necessarily. Beilock and Goldin-Meadow (2010) had participants gesture while explaining the Tower of Hanoi task. Their gestures reflected disk size and weight, as they used one-handed gestures when referring to the light disks and two-handed gestures when referring to the heavy disks. In a second task, weight was switched in a way that was less compatible with the gestures that were made earlier (i.e., small disks were heavier than large disks). Performance for participants in the gesture condition declined, indicating that the gestures actually represented the weight of the disks. Thus, in this case a detrimental effect of gestures is actually revealing sensorimotor simulations (i.e., embodiment).

Other studies in which negative effects of gestures were found shed light on the boundary conditions of sensorimotor simulation. For example, Post et al. (2013) found that, for children with poor general language skills, gestures increase cognitive load and hamper learning when imitated online during the first encounter with new material. For these children, sensorimotor simulation, and thus embodiment, was prevented by the cognitive overload they experienced. De Nooijer et al. (2013) found that gestures do help when imitated during either learning or testing, but not when imitated during both. It is not entirely clear what caused the effects found by De Nooijer et al. (2013), but it is clear that gestures are not always beneficial for memory.

There is also empirical evidence of the influence of cognition on gestures. Gesture frequency during descriptions of images of dot patterns was influenced by participants' physical experience with the patterns (Hostetter and Alibali, 2010). Participants gestured more when they had specific physical experience with the patterns than when they had only learned visually. The experiments show that this was not due to a decrease in verbal rehearsal or motor priming. Hostetter and Alibali conclude that gestures occur when people talk about thoughts that involve action simulations.

In sum, within the domain of gestures, the interaction of action and cognition is bidirectional. Gestures are a form of actions that can influence learning and memory and this is interpreted as support for embodied cognitive processes (Hostetter and Alibali, 2008; Pouw et al., 2014). In line with the claims made by Körner et al. (2015), the described gesture studies support the idea that perception can automatically trigger sensorimotor simulations, that actions can be facilitated or hampered depending on congruency with the simulated actions, and that simulations also work offline. However, there are boundaries as to how far this embodiment of gestures reaches, because gestures are not always supportive for learning, for example when cognitive resources are overloaded (Post et al., 2013).

The domain of gestures is, after language comprehension, the second cognitive domain that illustrates how sensorimotor simulation operates, under which conditions, and what its boundaries are. Another domain in which relevant aspects of the body can influence memory through simulation is autobiographical memory.

## **SENSORIMOTOR SIMULATION IN AUTOBIOGRAPHICAL MEMORY**

A simulation view of autobiographical memory entails that modality-specific states of perception, action, and introspection that were activated when an event was experienced, are activated again when the experience is represented at a later point in time (Niedenthal et al., 2005, 2014; Dijkstra et al., 2007; Dijkstra and Zwaan, 2014). If autobiographical memory is a simulation and reconstruction of the original experience along with the relevant perceptual, sensorimotor and affective components of the experience, facilitation of the retrieval process should occur if these components of the original experience are triggered when the retrieval process is initiated (Dijkstra and Zwaan, 2014).

Support for this prediction was found in a study in which participants retrieved autobiographical memories in body-congruent and body-incongruent positions relative to that of the original experience (Dijkstra et al., 2007). Participants were not only faster retrieving the memory when prompted after assuming a body-congruent position (talking about a previous visit to the dentist while being reclined in a chair) than in a body-incongruent position (talking about this dentist visit while standing up with the legs out and the hands on the waist) but also retained the memory better after a period of 2 weeks, when they were asked to talk about the memories they had retrieved 2 weeks earlier. This study demonstrates sensorimotor simulation under congruent conditions of body position both during retrieval and long-term retention tasks. It seems as if body position is a sensorimotor component of the original experience that acts as an embodied cue to facilitate the reconstruction of the original experience. The ease of retrieval is reflected in faster retrieval times to access a relevant (i.e., body-congruent) memory and even strengthens the memory trace so that body-congruent memories are remembered better at a later point in time.

This study supports two of the claims that were discussed earlier. The first is that compatibility effects arise when there is a correspondence between the stimulus condition (body position) and its simulation (the body position that was relevant during the original experience). The second is offline embodiment, because experiences from the past are being reconstructed rather than experiences that happen at this moment. The study also supports the claim regarding embodiment by Wilson (2002) that offline cognition is body based, specifically for memories that are records of spatiotemporally localized events that are relived by the person who remembers them. Studies demonstrating congruence effects with a body manipulation thus provide strong support for claims regarding sensorimotor simulation.

Do these simulations extend to more indirect relationships between bodily states and autobiographical memory simulations? For example, the mapping of certain actions, such as moving hands "up" or "down" with certain emotions, such as "positive" for "up" and "negative" for "down" illustrate such an indirect relationship. This mapping is an example of a conceptual metaphor, abstract concepts that have metaphorical associations with actual experiences (Dijkstra et al., 2014). Conceptual metaphors arise from a pattern of associations of concrete experiences (cheering, jumping out of joy) with certain body movements (upward movement). Conceptual metaphors, such as "positive is up" can be understood in terms of concrete concepts and experiences, such as the times that you cheered when your favorite soccer team scored a goal, or the times you slumped in your seat when the other team scored against your team. Based on the conceptual metaphor that maps verticality with valence, the prediction can be made that the retrieval of positive or negative autobiographical memories should be facilitated when an up or down action is performed, but only under movement-valence congruent conditions (up and positive, down and negative).

This facilitation was demonstrated in a study involving body movement and emotional memory retrieval (Casasanto and Dijkstra, 2010). Participants retrieved autobiographical memories in response to prompts while moving both hands upward to deposit a marble from each hand into two containers that were placed in a higher location, or downward into containers below. The idea behind the study was that an upward movement triggers the "up is positive" metaphor and facilitates retrieval of positive memories, while a steady downward movement triggers the "down is negative" metaphor and facilitates retrieval of negative memories. The to-be-retrieved memories were either positive or negative (Tell me of a time you felt proud of yourself/Tell me of a time you felt ashamed of yourself) in Experiment 1, and neutral (Tell me of an event that happened yesterday) in Experiment 2. In Experiment 1, reaction times to congruent (positive is up) and incongruent (positive is down) movement/valence memories were assessed. Participants were faster retrieving memories in congruent than incongruent trials. In the second experiment, participants again moved their hands but first retrieved memories to neutral prompts silently and then retold the memories afterward when their hands did not move. They were more likely to retell memories they later judged as positive when they moved their hands upward.

Just as in the previous study with body position (Dijkstra et al., 2007), body movement seemed to facilitate access to the memory itself by activating a relevant aspect of the experience, the emotion that was experienced at that time. This was an indirect trigger, however, because the emotion arose as a result of the mapping of a motor action and the associated emotion with that action. Both studies demonstrate the role of offline sensorimotor simulation under congruent conditions. The last study also supports the claim by Körner et al. (2015) that simulation may play a causal role in processing emotion. Only the activation of the mapping between valence ("positive is up") and the vertical movement of the hands (up) explains why participants have better access to emotion-movement congruent memories and attribute a certain emotion to memories when they move their hands a certain way.

A similar embodiment effect involving the mapping of body position with valence was demonstrated in a study by Riskind (1983). Participants were instructed to be either in an upright position and smile or in a slumped position with a downcast expression and their head and neck down. While being in this position, they had to retrieve unpleasant and pleasant experiences. The results indicated that participants were faster in retrieval of these experiences in body position-valence congruent conditions (upright and pleasant or slumped and unpleasant) compared to body position-valence incongruent conditions. Again, research demonstrated that manipulations of the body affect cognitive processes under specific conditions: those of body-valence congruence.

A last study that contributes to our insight into sensorimotor simulation in autobiographical memory does not involve the retrieval of emotional experiences but the generation of emotion words, such as "disappointment" and "pride," that are associated with previous emotional experiences in which the word was relevant (Oosterwijk et al., 2009). Given the association of positive words with up movements and negative words with down movements, changes in body posture (straight up or slumped) were expected depending on the emotion that was elicited. Participants' height was measured (with a hidden camera) during the word generation procedure, which would give an indication of the posture demonstrating pride (upright posture) or disappointment (slumped posture). The results showed that participants indeed changed their body position to a lower, more slumped position when generating disappointment words compared to the body position during the generation of pride words (Oosterwijk et al., 2009).

Unlike the previous study, this study illustrates a more subtle embodiment effect. Body position was influenced by the simulated sensation that was triggered by the valence of the presented words. A major difference with the autobiographical memory studies discussed above is that perceiving a valenced stimulus triggered the simulation of the valence-matching body position, not the other way around as in the previous studies. Such bidirectional effects were also demonstrated in the discussion of research on gestures. A similarity with the autobiographical memory studies on valence was that all three studies illustrate the claim that simulation plays a causal role in processing emotion. A stimulus or action that is congruent in valence with the simulated stimulus or action can be simulated more readily and easily than when these favorable conditions are absent. Autobiographical memory retrieval and sensorimotor processes tap into long-term memory stores of experiences that are being simulated when triggered appropriately. In other domains, the accumulation of motor experiences builds up to a certain level of motor fluency and a knowledge base that can be tapped into when being triggered. Individuals who have done this over a long period of time can tap into this store more easily and efficiently than those who have not. They are considered to be experts.

## **SENSORIMOTOR SIMULATION: EXPERTISE**

What happens when a person has performed complex motor movements and motor sequences so many times that a high level of expertise has been reached? From an embodied cognition perspective, if someone has extensive experience in an action domain, such as tennis, more grounded representations will be activated when playing or talking about the sport compared with a novice. This fits with the claim by Körner et al. (2015) that automatic simulation depends on previous experiences and skills. Experts are likely to form a full-blown first-person mental simulation of the described actions (with many grounded representations) whereas a domain novice would form shallow, word-like representations instead (Sutton and Williamson, 2014). The motor expertise that has been developed would then facilitate any processing and memory of expertise-related information that is presented to this expert. Several studies have demonstrated support for this assumption.

In one study, expert climbers were tested to see if they would remember climbing routes of wall elements better compared to novices (Boschker et al., 2002). Expert climbers have built up motor fluency of specific climbing movements and should therefore remember these routes better. In addition, they should notice elements of the climbing environment as potential holds for grasping and establishing a route for ascent or descent to a greater extent than novices. The results supported the assumption that experts remembered the climbing routes with the affordances that the wall elements offered whereas the more inexperienced climbers tended to focus on structural features of the climbing wall and did not remember the routes as well. Expertise thus facilitates the way expertise-related information is taken in and remembered, which results in superior performance on cognitive tasks. Here, the sensorimotor system involves an accumulation of experiences that feed into the cognitive domain.

There is another relevant aspect of expertise when considered from an embodied cognition perspective, which may actually hinder performance on certain tasks, and this is loss of attentional control. Expert skills are built up over a substantial period of time, resulting in competencies that are part of procedural knowledge and no longer require explicit attentional control (Beilock et al., 2004). Consequently, if actions have to be performed with unlimited time available, novice learners in a domain should benefit from this whereas it may actually harm an expert because control processes may come into play for a task that only requires procedural skills (Beilock et al., 2004).

These effects of expertise were demonstrated when experts imagined actions that were within their expert motor repertoire (Beilock and Gonso, 2008). Novice golfers had lower accuracy scores in putting when speed was stressed in the instruction and they had to actually make a putt, or imagine a certain sequence of putting actions. In contrast, expert golfers performed better on actual and imagined putting when time was more limited because they tapped into their proceduralized skill under conditions of time pressure but allowed conscious control when more time was available, which impaired their performance. Imagery appeared to serve the function of action readiness in these experts. This is similar to the step-by-step unfolding of the action itself and involves the same cognitive and motor parameters. When speed is no issue, other elements become important in the imagery process of experts that involve explicit control of the skill, which gets in the way of the planning of actual steps.

Motor expertise may therefore enhance memory performance and performance accuracy because complex patterns are practiced frequently and readily available. However, superior memory performance in experts tends to be limited to expertise-related information relative to everyday information and novices when stimuli were encoded through observation or enactment (Dijkstra et al., 2008). The knowledge and experience base are specific to the domain of expertise and do not necessarily transfer to other domains.

Because the action patterns that form the basis of expert performance are overlearned, motor expertise may also lead to reconstruction bias (Barsalou, 1999) in tasks that may cause some confusion as to whether an item tapping into this fluency was encountered before or not. As a result, it may be more difficult to differentiate between patterns that were or were not presented or imagined previously. This false recognition bias was examined by Yang et al. (2009), who assessed the effect of the expert skill of typing on recognition rates of letter dyads that would normally be typed with the same finger (j and h with the index finger) or with different fingers (j and l), reflecting motor fluency among expert typists. The idea was that the activation of action plans that are associated with the different letter dyads, such as dyads reflecting higher motor fluency, could lead to decision errors on a recognition task among experts but not novices. Experts have more consistent mappings between certain letters and how they type them. The results supported this assumption. Skilled typists more frequently than novices incorrectly recognized different-finger letter dyads that had not been presented earlier as having been presented before. Recognition memory was influenced by the covert simulation of repeated action patterns among experts. As a typist, you cannot help but covertly simulate the motor action of typing letters when you are merely processing them visually. Therefore, you think you saw letter dyads before because you have a covert motor trace of activating these letters. In other words, your motor fluency gets in the way of simply performing the cognitive task.

These studies on the role of expertise show how sensorimotor simulation operates in the domain of expertise. The activation of previous experiences relevant for an expert task results in the availability of relevant knowledge for a particular task. This knowledge can be utilized in the performance of motor and memory tasks. If other information is allowed to enter the system when there is ample time available or if processing information in one modality allows for covert simulation of action patterns that are part of motor fluency, expertise can actually get in the way of performance in the cognitive domain. What we learn from these studies is that there are different aspects of expertise that play a differential role in the interaction between sensorimotor systems and cognitive domains. These have to be taken into account when examining the role of expertise from an embodied cognition perspective.

The studies discussed so far were chosen because their results support claims regarding sensorimotor simulations. Language comprehension is facilitated under congruent stimulus-simulation conditions and hindered when the neural circuits needed for simulation are already engaged with another task. Gestures can be considered as a way to offload information into the environment, reducing working memory load and increasing the availability of cognitive resources for internal cognitive processes. Bodily cues, such as body position, help the reconstruction process of the original experience when retrieving an autobiographical memory. Expertise intensifies the interactions between action and cognitive domains. Repeated action patterns allow experts to take in and remember more relevant aspects of a scene than novices (Boschker et al., 2002) because they have learned to perceive their environment in a different manner. All these effects operate through sensorimotor simulation under congruent conditions and both online and offline. However, once motor fluency has been developed in experts, it cannot be undone easily, which suggests that performance may include activated yet incorrect information from a motor repertoire, forming a bias in memory processes.

In the preceding sections, research on offline embodiment effects was discussed in the domains of language comprehension, gestures, autobiographical memory, and expertise. Simulation may occur in the absence of an object, in a later stage of stimulus processing, when retrieving an event from the past, and as an accumulation of previous experiences. Another domain in which offline embodiment effects can be explained with sensorimotor simulation is problem solving.

## **SENSORIMOTOR SIMULATION: PROBLEM-SOLVING**

Embodied research on problem-solving often demonstrates the use of physical or imagined simulation to facilitate finding the solution for a task. A well-known example in this respect is a study by Kirsh and Maglio (1994) in which participants have to rotate and flip falling block shapes quickly to make them fit with the surface and the blocks they fall on. Participants use the strategy of actually rotating the blocks to determine the best fit rather than solving the problem by mental computation. This fits with the idea of sensorimotor simulation and the facilitating role of the body in the simulation process (Dixon et al., 2014).

An example of such simulation is a study that employed a gear-system problem. Participants have to predict the turning direction of the final gear in a series based on the turning direction of the gear that drives the system. Participants typically solve this problem by employing their body, that is, by manually simulating with their hand the movement of each gear in turn (Dixon et al., 2014). This manual simulation then helps participants to discover the higher order solution that applies to all these problems, which is the insight that the turning direction alternates from one gear to the next.
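The contrast between gear-by-gear simulation and the higher order alternation rule can be made concrete in a short sketch. It is an illustration only, not the procedure or materials used by Dixon et al. (2014): it assumes a simple linear chain of meshed gears in which each gear turns opposite to its neighbor, and the function names `simulate_gear_directions` and `final_direction_by_parity` are hypothetical.

```python
# Illustrative sketch only: assumes a simple linear chain of meshed gears,
# where each gear turns in the opposite direction to the one driving it.
# Function names are hypothetical and not taken from the cited study.

OPPOSITE = {"clockwise": "counterclockwise", "counterclockwise": "clockwise"}


def simulate_gear_directions(n_gears, first_direction="clockwise"):
    """Trace the chain gear by gear, as in manual (hand) simulation."""
    if n_gears < 1:
        raise ValueError("n_gears must be at least 1")
    directions = [first_direction]
    for _ in range(n_gears - 1):
        # Each meshed gear flips the direction of the previous one.
        directions.append(OPPOSITE[directions[-1]])
    return directions


def final_direction_by_parity(n_gears, first_direction="clockwise"):
    """Apply the alternation insight directly: only odd/even position matters."""
    if n_gears < 1:
        raise ValueError("n_gears must be at least 1")
    # Odd-numbered gears turn like the first gear, even-numbered ones opposite.
    return first_direction if n_gears % 2 == 1 else OPPOSITE[first_direction]


if __name__ == "__main__":
    for n in (1, 2, 5, 8):
        traced = simulate_gear_directions(n)[-1]
        shortcut = final_direction_by_parity(n)
        print(n, traced, shortcut)
```

Under this assumption both functions always agree on the final gear, but the parity version needs no step-by-step tracing, which mirrors the shift from manual simulation to the abstract rule described above.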

Simulation based on indirect but commonly occurring mappings between abstract concepts and concrete experiences, as we saw in the studies in the autobiographical memory domain, has also been demonstrated in the domain of problem solving. Apart from the "positive is up" mapping, other mappings exist, for example in relation to the mental number line: "many is up" or "few is left." They can also be activated with sensorimotor manipulations and facilitate processing of information if the mapping is present. Numerical magnitude can be represented both horizontally, with left representing a smaller quantity than right (Pinhas and Fischer, 2008), and vertically, with up representing a larger quantity than down (Shaki and Fischer, 2012). We stack coins into piles, with higher piles indicating higher quantities, and numbers are usually written horizontally, with lower numbers on the left and higher numbers on the right.

Wiemers et al. (2014) examined the activation of these two representations, horizontal and vertical, in a mental arithmetic task. In Experiment 1, participants performed addition and subtraction tasks while making upward and downward movements, movements to the right and left, or no movement at all with their right arm. They solved more problems under movement/magnitude congruent than incongruent conditions. In Experiment 2, they did the same arithmetic task but this time the arithmetic problem (and not their arm) moved in the directions described above. Again a compatibility effect of movement with the spatial representation of the task (up with addition and down with subtraction) was shown. This study demonstrates the idea of sensorimotor simulation through compatibility of actual and perceived movement in order to solve the task. Moreover, earlier experiences with arithmetic and the representation of magnitude were activated when the magnitude-spatial mapping occurred, which resulted in facilitation of the response.

Werner and Raab (2014) also investigated the mapping between horizontal spatial representations and the abstract concept of magnitude but in a different type of problem solving task. The authors used the water-jar problem, which requires participants to obtain a required volume of water when they are given certain empty jars for measure. Participants were primed to a left or right gaze direction in a perception task in which they mentally compared full jars either to the left or the right of a similar empty jar. This should bias them to either the left or right when being presented with the water-jar problem and after they sorted marbles from outer bowls inward (plus group) or the other way around (minus group). Participants indeed demonstrated a spatial bias in gaze direction to the right for the plus group and to the left for the minus group without there being differences in overall problem solving ability.

A similar gaze design was used by Thomas and Lleras (2007), who manipulated participants' gaze behavior before exposing them to a problem solving task known as the Duncker radiation task. Participants have to destroy a tumor in a patient on a computer screen with lasers while keeping the tissue around the tumor healthy. The solution to this task is to have multiple laser beams fire at low intensity at the tumor from various locations around the tumor so that the convergence of these beams can destroy the tumor, yet keep the tissue intact. Participants whose gaze behavior was manipulated to make saccadic eye movements between the tumor and the surrounding locations, which hints at the solution, were more successful in solving the problem later than participants who were instructed to fixate their gaze on the tumor. Here, the practiced eye movements (moving in and out versus fixed) facilitated later problem solving by activating the sensorimotor simulation to tackle the problem.

A last study demonstrating sensorimotor simulation in a problem solving task involved spatial working memory (Thomas, 2013). Again, the underlying assumption was that movement with arms or eyes may embody the solution to an insight problem through simulation. A second assumption was, however, that actions will only affect problem solving when the representations of these actions are active in working memory. If these representations are not (or no longer) active, no effect will occur. Participants tried to solve the Duncker radiation problem but, unlike in the study discussed above, they had to occasionally perform a visual tracking task. Moreover, they held a spatial stimulus (a grid filled with dots) or a verbal stimulus (a string of digits) active in working memory at the same time. It was expected that holding a spatial stimulus in memory would engage spatial working memory resources and therefore interfere with the problem solving task, even if the eye movements are directed at various locations around the tumor. The results showed that being assigned to an embodied-solution condition (observing different colored stimuli from different locations around the tumor toward the tumor and out again) indeed was not sufficient for better problem solving performance. Only if participants also held a verbal stimulus active in working memory did they solve the Duncker problem after fewer attempt intervals. They also did this faster than participants who were in the same embodied-solution condition but also held a spatial stimulus active. In other words, representations in spatial working memory affected problem solving even if participants were visually primed toward a solution of the problem. This demonstrates the simulation blocking effect that we encountered in the discussion of language comprehension research. If neural circuits are already engaged with a concurrent task, simulation is impeded or blocked.

The studies discussed from the domain of problem solving also support the claims regarding sensorimotor simulation. Under specific conditions of congruence, and as long as neural circuits are not engaged beyond capacity, sensorimotor simulation may aid problem solving. This simulation may be a physical or mental simulation and can take place during the task or afterward. It may also apply to the last domain to be discussed in this review, facial mimicry. Research conducted within this domain also supports the last claim from Körner et al. (2015) that has not yet been discussed, namely the claim that simulation is important in social interactions.

### **SENSORIMOTOR SIMULATION: FACIAL MIMICRY**

Facial mimicry is the imitation of facial expressions by a person who is observing these expressions in another person. This commonly happens when emotions are expressed on the face (smile or frown), usually in a situation in which someone is communicating with someone else but also when observing pictures or videos of other persons who display an emotion on their face. The mimicry of these emotions on the face is the simulation of what one observes. How does this happen? When individuals observe actions and emotions of others, the same neural circuits are engaged as when the same actions and emotions are experienced personally (Oosterwijk and Barrett, 2014). We therefore understand the feelings and emotions of another person through the engagement of neural circuits based on a recreation of these feelings in ourselves (Niedenthal et al., 2005). From an embodiment perspective, this is an interesting notion because it illustrates how embodiment is a shared rather than an individual phenomenon and it fits with a sensorimotor simulation account that presupposes a causal role regarding emotion processing and is important in social interactions (Körner et al., 2015).

Researchers on facial mimicry agree that emotions are grounded in online bodily states that arise from a context-specific experience in which the emotion is evoked and distributed across multiple regions of the brain (Niedenthal et al., 2014). Facial mimicry then is the way in which the sharing of emotion becomes apparent to the other person (Niedenthal et al., 2014). Facial mimicry can help to strengthen social bonds between people because a speaker receives, through the facial expression of the listener, the feedback that the other understands what is being communicated. If the expression is emotional, facial mimicry may signal that the other is feeling the same way. This benefit of understanding the other may not be limited to understanding how the other person is feeling but may extend to understanding what this person is communicating.

Empirical support for these assumptions comes from research manipulating how an emotion can be expressed on the face (Niedenthal et al., 2001; Oberman et al., 2007; Stel and Vonk, 2010). An emotional expression on the face can either be induced by placing a pen between the teeth, or blocked by placing a pen between the lips to block the expression of smiling. In one study employing the emotion-blocking procedure, some participants held a pen in the mouth between the lips to hinder the expression of smiling whereas other participants could display their emotions freely. When instructed to detect changes in facial emotion expression in morphed pictures (from sad to happy), participants were slower to recognize the emotion change when their ability to display the emotion was blocked than participants who did not have this restriction (Niedenthal et al., 2001). In another emotion-blocking study, participants placing pressure on a pen showed an impaired ability to recognize happy faces relative to participants who could show facial expressions without being hindered in their expression and could thus engage the relevant muscles (Oberman et al., 2007). Both studies demonstrated better performance on a cognitive task when participants were able to freely express their emotions on the face relative to a condition in which the expression was blocked. Unlike in the studies discussed above, in which simulation was impeded because of engagement of neural circuits with a concurrent task, here simulation is physically blocked by limiting the ability of the muscles to contract to express emotion.

These effects of manipulations of the face can also be the result of more invasive procedures, for example when getting the first Botox injection. People have these injections to eliminate wrinkles from their forehead for a certain period of time. A by-effect is that at the peak of the effectiveness of the injection, the corrugator muscle that normally is involved in frowning does not participate in the frowning after Botox is injected. This non-participation of the muscle was used in a study in which participants read sentences with angry, sad, and positive content right before and 3 weeks after the Botox injection when the effectiveness of the injection reaches a peak (Havas et al., 2010). The results indicated that reading times were slower after the injection than before but only for the sentences with negative content. Because participants were no longer able to frown due to the paralysis of the muscles in the corresponding area, facial feedback of expressing the emotion to emotion-relevant neural circuits was disrupted and processing of negative sentences affected because the associated frowning expression could not be made.

The studies discussed so far on facial expression and mimicry illustrate how sensorimotor simulation occurs and under what conditions this may or may not happen. This effect is specific to the emotion that can be displayed on the face and the emotion that is depicted in or activated with the stimuli materials. The effects discussed so far were online effects where the observation and display of emotions occurred while processing emotional expressions on a face, or valenced information in sentences. A few studies have also examined offline effects of facial expression, supporting the claim that simulation may work offline when the bodily state no longer is relevant for a cognitive task to be performed.

Specifically, studies have assessed the effects of manipulations by having participants remember or retrieve emotional information after the task was completed (Schnall and Laird, 2003; Halberstadt et al., 2009). A study by Halberstadt et al. (2009) included pictures of ambiguous facial expressions that were paired with an emotion concept, such as "happy," or with a concept unrelated to emotion, such as "reliable." Afterward, faces encoded with the happy concepts were remembered as happier than in the learning phase whereas no such effect was demonstrated for the concepts that were unrelated to emotion (Halberstadt et al., 2009). The encoding process thus was relevant for later memory when the pictures of the faces and the labels of the concepts were no longer available for comparison. In another study, participants practiced facial expressions and body postures based on emotional cues (Schnall and Laird, 2003). After this practice phase, participants responsive to personal cues recalled more life events that were emotionally congruent (positive memories to an upright body posture and smiling facial expression) than incongruent with the practiced expressions.

Both studies demonstrated offline simulation because once an emotion-congruent association of facial or bodily expression with emotion concepts was established, memory for encoded or newly retrieved materials was higher under congruent than under incongruent conditions. These studies on facial mimicry also underscore the causal role in emotion processing. An emotional stimulus is needed in order for the simulation to occur or be blocked. Otherwise, simulation of the emotional expression on the face would be irrelevant. Moreover, if the emotion cannot be simulated on the face because of a physical (pen) or mental (suppression) prevention of emotion to be shown, the benefits as a result of the compatibility effects disappear. Finally, the simulation mechanism may work offline, after a bodily state has occurred. This was shown in the studies demonstrating benefits of facial mimicry in memory tasks in which earlier encoded information under simulated conditions had to be retrieved at a later point in time when the manipulation of the body was no longer relevant.

Facial mimicry was the last domain that demonstrated sensorimotor simulation, in particular its effect on emotion processing and within a social context. Together, these six domains, language comprehension, gestures, autobiographical memory, expertise, problem solving, and facial mimicry, support an account of sensorimotor simulation to explain embodiment effects in specific ways. Now, it is time to take stock of what this empirical support in those domains has brought us. How stable and general is sensorimotor simulation?

## **CONCLUSION**

Some researchers claim that the degree of simulation required for a task is inversely related to the abstraction needed for it (Myachykov et al., 2014). In this view, the more abstract the process, the more offline it is, and the more stable it is. In contrast, online processes are less stable because of their sensitivity to individual differences in the sensorimotor experience. Although this may be a promising notion that could be developed further, our review shows rather how universal sensorimotor simulation is, as much for abstract and offline processes as for concrete and online processes, by accounting for a wide variety of embodiment effects in various domains. It remains to be seen whether online processes are less stable than offline processes and how sensitive online processes are to individual differences in the sensorimotor experience. Based on the discussion of the studies on online processes in this review, there is no evidence that online processes are less stable than offline processes.

Interference of embodiment was demonstrated in research on language comprehension, gestures, and problem solving. This phenomenon occurs when the same neural circuits are engaged for the sensorimotor as for the cognitive processes in an experiment. At this point, it is not entirely clear exactly how the interference occurs and whether it is strictly a capacity problem or a dual engagement problem. It appears that involvement in one task leaves insufficient resources for the other and that this interferes with simulation. This was also found in some of the gesture studies that demonstrated negative effects of gestures among children with poor general language skills (Post et al., 2013). For this group, gestures increased cognitive load and had a negative effect on performance. Further examination of these conditions is essential to a deeper understanding of how sensorimotor simulation operates.

As said before, this overview of research on sensorimotor simulation in various domains is by no means exhaustive. Examples from different domains were chosen to illustrate this mechanism and specific claims operating under this mechanism. This review can be considered a first step toward an approach that further defines conditions under which embodiment occurs or not, and how they depend on previous experiences and skills, affect emotion processing, and operate in social situations. The evidence so far points in the direction of congruence/compatibility being a necessary condition for embodiment to occur and for engagement of similar systems to hinder embodiment. Only in this context can offline simulation and emotion processing be accomplished. The role of simulation in social interactions requires further study. Only support from one domain was discussed in this review.

Future research should also start to address questions regarding individuals who may be less able or susceptible to the effects of the body. How does sensorimotor simulation work for individuals with depression, aphasia or dementia? Would children be more or less affected by manipulations of the body, or does it not make a difference? For example, children with poor language skills may not benefit as much from body manipulations, such as gestures, compared to children with average to good language skills (Post et al., 2013). Likewise, older adults may not benefit as much from embodied encoding compared to young adults when the memories that are being retrieved are more remote (Dijkstra et al., 2007). On the other hand, in the posture study (Dijkstra et al., 2007), older adults demonstrated as much facilitation from congruent postures as younger adults.

With these studies, domains, and claims, we now have a better understanding of one of the major mechanisms underlying embodiment: sensorimotor simulation. The future holds promise for further exploration and specification of the claims associated with this mechanism and for gathering empirical evidence to support these claims.

### **ACKNOWLEDGMENT**

During the realization of this work, LP was supported by a grant from the Netherlands Organization for Scientific Research (NWO-PROO; Project number 411-10-907).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Dijkstra and Post. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# What are memory-perception interactions for? Implications for action

#### *Loïc P. Heurley<sup>1</sup>\* and Laurent P. Ferrier<sup>2,3</sup>*

*<sup>1</sup> Laboratory CeRSM EA2931, Center of Research on Sport and Movement, Université Paris Ouest-Nanterre La Défense, Nanterre, France*

*<sup>2</sup> Institut Français des Sciences et Technologies des Transports, de l'Aménagement et des Réseaux, Laboratoire de Recherche Mécanismes d'Accidents, Salon de Provence, France*

*<sup>3</sup> Laboratory EPSYLON EA4556, Dynamics of Human Abilities and Health Behaviors, Université Montpellier III Paul Valéry, Montpellier, France*

*\*Correspondence: heurleyloic@yahoo.fr*

#### *Edited by:*

*Benoit Riou, Université Lyon 2, France*

#### *Reviewed by:*

*Dermot Lynott, Lancaster University, UK*

**Keywords: embodied cognition, memory, knowledge, perception-action interaction, size perception, grasping movements**

Currently, a growing body of studies demonstrates memory-perception interactions (see Barsalou, 2008; Heurley et al., 2012; Lobel, 2014, for reviews). Even if such interactions are highly relevant to support embodied approaches of cognition as well as to better understand memory and perception (e.g., Zwaan, 2008; Versace et al., 2009; Landau et al., 2010; Kiefer and Barsalou, 2013), their functional role remains unclear: Why would perception integrate memory and knowledge while it seems highly efficient without such influences? To understand the functional relevance of these interactions, we assume that it is necessary to take into account two important conditions in which our cognitive systems have evolved during the phylogenesis and continue to evolve during our ontogenesis. More precisely, we develop a view where memory-perception interactions are highly relevant to plan and control actions when we interact with well-known objects in non-optimal perceptual conditions.

It is widely accepted that to properly parameterize action components and to control them during the course of action, it is necessary to perceptually process some of an object's features (Hommel and Elsner, 2009). As already claimed by Glover (2004), these "action-relevant perceptual features" (ARPF) can be spatial (e.g., shape, orientation) as well as non-spatial (e.g., fragility, weight). Among spatial ARPF, size is usually recognized as important because of its involvement in a great variety of actions. Jeannerod (1984) has for example demonstrated that the magnitude of the grip aperture, a component of the grasping movement, is a function of the visual size of objects (see also Ellis et al., 2007; Fagioli et al., 2007; Wykowska et al., 2009). Visual size processing also seems highly important in order to intercept flying objects (Lee, 1976). Nevertheless, very frequently and for various reasons, the perceptual processing of ARPF is far from being optimal, especially in "out-of-laboratory" conditions. For instance, some ARPF cannot be processed by the available perceptual channels. Indeed, when we want to grasp an object, we are only able to visually perceive it and therefore we are unable to directly perceive its fragility, weight, and temperature, whereas they are extremely relevant to plan the force and the velocity of the grasp (Glover, 2004). Even when the right channels are available, some environmental conditions can impair perception. For example, the occlusion of an object by other surfaces can limit our ability to visually perceive it and thus process its shape, size, or distance (Tanaka et al., 2001). Furthermore, short- or long-term injuries to perceptual systems can also induce non-optimal conditions of perception. Indeed, the eyes can be long-term-impaired by cell aging or short-term-impaired by an intense flash of light, but in both cases our ability to process visual features is affected.
Accordingly, how do we plan and control actions in conditions where the features suited to plan and control relevant parameters of action often cannot be optimally perceived? First of all, it is important to note that non-optimal processing of ARPF does not necessarily induce object-recognition problems. Indeed, as mentioned in several models, object recognition can be accurately based on non-ARPF such as the color and/or the texture of objects and the context (Tanaka et al., 2001; Bar, 2009). Therefore, even if some ARPF cannot be processed, objects can be accurately identified in many cases. Secondly, because in everyday life we mainly interact with well-known objects, a preserved ability to recognize object identity can automatically induce the retrieval of a myriad of knowledge associated with the recognized objects, including associated ARPF (e.g., shape, size). Thus, we claim that recognition processes used to identify objects during the planning phase of actions involve the retrieval of previously experienced ARPF that are automatically integrated into perception. We also claim that they make it possible to compensate for non-optimally perceived ARPF and so to maintain a high level of action efficiency even in non-optimal conditions of perception. To summarize, we assume that the functional relevance of memory-perception interactions (i.e., an embodied cognitive architecture) emerges when humans interact with well-known objects in degraded-perceptual-conditions. We discuss three potential sets of evidence in support of this view, all coming from studies focusing on the ARPF size.

First, numerous experiments suggest that memory would be able to store objects' perceptual features and especially ARPF (see Barsalou, 2008, for a review). For instance, a great variety of studies support the idea that the size of objects is accurately stored in memory and closely matches their real visual organization (Moyer, 1973; Holyoak, 1977; Holyoak et al., 1979; Shoben and Wilson, 1998; Bertamini et al., 2011; Konkle and Oliva, 2011; Linsen et al., 2011). More importantly, it seems that the known size of objects can be automatically retrieved even when objects are briefly perceived, suggesting the possible automatic retrieval of ARPF during fast real interactions with objects. Ferrier et al. (2007) have for example demonstrated that a target picture (e.g., an elephant) is more easily categorized as an animal when the brief prime picture (150 ms) has a similar known size (e.g., a giraffe or a car) rather than a different one (e.g., a bee or a key), while both pictures have the same visual size on the screen (see also Setti et al., 2009; Gabay et al., 2013). It is noteworthy that size is generally stable across items of a category as well as across experiences. Because all the ladybugs we have experienced have approximately the same small size, their size can be easily stored at a conceptual level (i.e., general knowledge, Whittlesea, 1987). However, in some cases, ARPF could be stored in a more specific or short-term format. For example, because the size of your car is not shared by all the exemplars of the "car" category, this feature is undoubtedly stored in a more autobiographic format. Furthermore, some ARPF are so variable that we can only store them for a short period of time, like the last position of your car in the supermarket car park or the distance of some objects on a table (see also Borghi, 2013 for a related distinction).
Thus, we claim that ARPF can be stored and automatically retrieved from memory but perhaps in various ways according to the stability of the ARPF across experiences.

Moreover, several studies suggest that ARPF are not only stored but can also influence conscious perception. Among others, the case of size perception has been extensively studied. In an early study, Paivio (1975) demonstrated that the comparison of the known sizes of objects is faster when they are congruent with their visual sizes. In other words, it is easier to say that in general an elephant is larger than a mouse when in the experiment the picture of the elephant is presented larger than the picture of the mouse rather than smaller (see also Srinivas, 1996; Rubinsten and Henik, 2002; Konkle and Oliva, 2012, for similar results). The works of Riou et al. (2011) and Rey et al. (2014) go further and suggest the automatic nature of this influence. Riou et al. (2011) have demonstrated that the known size of objects can influence the detection of a visually odd-sized stimulus in a visual search task even though this object feature is absolutely useless to complete the task. Other studies have demonstrated an influence of the known size of objects on judgments of distance, which are often derived from visual size, suggesting that the stored size can automatically impact not only the perception of visual size but also the perception of other ARPF derived from it (Epstein, 1965; Predebon, 1992, 1994; Hershenson and Samuels, 1999; Distler et al., 2000). Besides the known size of objects, the perception of visual size can be affected by a more abstract kind of size representation: numbers. Henik and Tzelgov (1982) have replicated the interaction between visual and stored size reported by Paivio (1975) but with numbers. In a classic bisection task requiring implicit length estimation, de Hevia and Spelke (2009) have found a bias of bisection toward the side of the line where the larger number is printed.
In a reproduction task, Viarouge and de Hevia (2013) have demonstrated that large numbers (e.g., 9) presented at each corner of a square induce larger reproduction of this square compared to the condition where smaller numbers are presented (e.g., 2). Altogether, these studies support the possibility that the size stored in memory (i.e., known size of objects or numbers) can directly influence the perception of size or of size-related features (e.g., distance) supporting the possible completion of perception by stored-ARPF when some of them are missing or ambiguous (see Barsalou, 2009, for a similar idea).

A further step is achieved by recent works demonstrating the influence of size stored in memory on more automatic perception-action links (rather than conscious judgments). Indeed, some studies have been able to show an influence of the known size of objects on action parameters that are dependent on the visual size. For instance, Hosking and Crassini (2010) have conducted experiments in which participants have to carry out time-to-contact judgments on stimuli for which a linear or a parabolic trajectory of approach is simulated. Such a judgment is highly important for a great variety of interceptive actions and is mainly based on the online processing of the visual size of the approaching stimulus. In their experiments, the stimuli used have different known sizes (i.e., large: a football vs. small: a tennis ball). Results elegantly demonstrated that this stored feature of objects influences time-to-contact judgments, suggesting that it could interfere with our ability to intercept moving objects (see also DeLucia, 2005; Hosking and Crassini, 2011, for similar results). Another set of studies also suggests an influence of the known size of objects on another well-established perception-action link: our ability to adapt our grip aperture according to the visual size of the to-be-grasped objects (Jeannerod, 1984). Several studies demonstrate that participants are faster to carry out a precision grip on typically small objects (e.g., cherry) and a power grip on typically large objects (e.g., eggplant; Ellis and Tucker, 2000; Tucker and Ellis, 2004; Derbyshire et al., 2006; Girardi et al., 2010) even when visual size cannot interfere (Glover et al., 2004; Tucker and Ellis, 2004; Heurley et al., in revision). The same effect on grip aperture is obtained when size-related adjectives are concomitantly processed (e.g., SMALL/LARGE, LONG/SHORT) rather than known objects (Gentilucci and Gangitano, 1998; Gentilucci et al., 2000; Glover and Dixon, 2002).
These results are also replicated when numbers are used. More concretely, Moretto and di Pellegrino (2008) have shown that large number processing facilitates power grips while small number processing facilitates precision grips (see also Andres et al., 2004; Lindemann et al., 2007). In addition, some results suggest that such interactions are highly automatic (Moretto and di Pellegrino, 2008; Namdar et al., 2014) and seem to be restricted to the planning phase of grasping (Glover and Dixon, 2002; Glover et al., 2004; Badets et al., 2007; Andres et al., 2008). Taken together, these works demonstrate that stored ARPF, such as size, can influence automatic perception-action links and not only conscious perception, supporting the possibility that perception can be completed by stored ARPF, itself influencing the planning of some action components.

This short review suggests that the ARPF size can be stored in memory, automatically retrieved during object perception, and can influence conscious perception of visual size (or related features) as well as the planning of action components mainly based on visual size processing. We used this evidence to support the view that the interactions between present and absent (but simulated in memory) perceptual features are important for action, especially in "out-of-laboratory" conditions in which ARPF cannot be optimally perceived and in which interactions mainly occur with well-known objects. Of course, the reported evidence is limited to size, but several studies have already demonstrated that other ARPF such as distance, position, and weight could be stored and automatically retrieved (Estes et al., 2008; Scorolli et al., 2009; Winter and Bergen, 2012). This strongly suggests that our view can be extended. Even if many questions remain open and a lot of work has to be done to best support this view, it has the advantage of searching for the functional relevance of memory-perception interactions (i.e., an embodied cognitive architecture) by taking into account two main constraints under which our cognitive systems have certainly evolved at phylogenetic and ontogenetic scales: interactions with (i) well-known objects in (ii) more or less degraded-perceptual-conditions.

#### **ACKNOWLEDGMENT**

We are grateful to Gabrielle Chesnoy-Servanin for her revision of the English text.

### **REFERENCES**


memory level: manipulation of the memory size and not the perceptual size. *Exp. Psychol.* 61, 378–384. doi: 10.1027/1618-3169/a000258


knowledge. *J. Exp. Psychol. Learn. Mem. Cogn.* 13, 3–17. doi: 10.1037/0278-7393.13.1.3


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 November 2014; accepted: 15 December 2014; published online: 08 January 2015.*

*Citation: Heurley LP and Ferrier LP (2015) What are memory-perception interactions for? Implications for action. Front. Psychol. 5:1553. doi: 10.3389/fpsyg.2014.01553*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2015 Heurley and Ferrier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Manipulation gesture effect in visual and auditory presentations: the link between tools in perceptual and motor tasks

#### Amandine E. Rey\*†, Kévin Roche†, Rémy Versace and Hanna Chainay

Laboratoire d'Etude des Mécanismes Cognitifs, Université Lumière Lyon 2, Lyon, France

There is much behavioral and neurophysiological evidence in support of the idea that seeing a tool activates motor components of action related to the perceived object (e.g., grasping, use manipulation). However, the question remains as to whether the processing of the motor components associated with the tool is automatic or depends on the situation, including the task and the modality of tool presentation. The present study investigated whether the activation of motor components involved in tool use in response to the simple perception of a tool is influenced by the link between prime and target tools, as well as by the modality of presentation, in perceptual or motor tasks. To explore this issue, we manipulated the similarity of the gestures involved in the use of the prime and target (identical, similar, different) with two modalities of tool presentation (visual or auditory) in perceptual and motor tasks. Across the experiments, we also manipulated the relevance of the prime (i.e., associated or not with the current task). The participants saw a first tool (or heard the sound it makes), which was immediately followed by a second tool on which they had to perform a perceptual task (i.e., indicate whether the second tool was identical to or different from the first tool) or a motor task (i.e., manipulate the second tool as if it were the first tool). In both tasks, the similarity between the gestures employed for the first and the second tool was manipulated (Identical, Similar or Different gestures). The results showed that responses were faster when the manipulation gestures for the two tools were identical or similar, but only in the motor task. This effect was observed irrespective of the modality of presentation of the first tool, i.e., visual or auditory. We suggest that the influence of manipulation gesture on response time depends on the relevance of the first tool in motor tasks.
We discuss these motor activation results in terms of the relevance and demands of the tasks.

Keywords: embodied cognition, gesture, visual and auditory presentation, perceptual task, motor task, situated cognition

#### Edited by:

Roberta Sellaro, Leiden University, Netherlands

#### Reviewed by:

Filomena Anelli, University of Bologna & Fondazione Salvatore Maugeri, Italy

Christine Sutter, RWTH Aachen University, Germany

#### \*Correspondence:

Amandine E. Rey, Laboratoire d'Etude des Mécanismes Cognitifs, EA 308 Université Lumière Lyon 2, 5 Avenue Pierre-Mendès France - 69676 Bron Cedex, Lyon, France

amandine.rey@univ-lyon2.fr

† These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 23 January 2015; Accepted: 06 July 2015; Published: 22 July 2015

#### Citation:

Rey AE, Roche K, Versace R and Chainay H (2015) Manipulation gesture effect in visual and auditory presentations: the link between tools in perceptual and motor tasks. Front. Psychol. 6:1031. doi: 10.3389/fpsyg.2015.01031

## 1. Introduction

Grounded and embodied cognition theories claim that knowledge is assembled in order to prepare for action (Wilson, 2002) and is grounded in sensory-motor systems (Barsalou, 1999). The cognitive processes that underpin the use of knowledge are thought to be deeply rooted in physical action, with close links existing between perception, action and the environment (Glenberg, 1997; Clark, 2008). Consequently, it has been suggested that seeing an object does not involve only the processing of its different sensory properties but also the activation (or simulation) of motor components related to the object's typical action/use (Barsalou, 2008). The present paper focuses on the following question: Does the activation of motor components always result in an influence on the current task or does this influence depend on the relevance of motor activation for the current task? The study reported here focused more specifically on manipulation gestures that are typically associated with a tool (e.g., cutting for a knife, screwing for a screwdriver). We first explored whether the facilitation effect in perceptual and motor tasks depends on the relevance of the manipulation gesture activated by the prime. Second, because real-life experience is inherently multimodal and depends on our knowledge or our environment (Slotnick, 2004; Jääskeläinen et al., 2007; Versace et al., 2014), we compared the facilitation of manipulation gestures in response to visual (static tools in this study) and auditory (dynamic action-related) tool presentations in both perceptual and motor tasks.

Certain data reported by neurological and behavioral studies involving perceptual tasks have partially confirmed the idea that motor components can be automatically activated. It has been suggested that seeing objects typically activates actions that are associated with these objects irrespective of the task. Some neuroimaging studies have lent support to this argument by showing that neural motor areas are activated by a visual presentation of tools even when there is no intention of acting upon them (Chao and Martin, 2000; Vingerhoets, 2008). At the behavioral level, Ellis and Tucker (2000) argued that visually presented objects activate motor components that are appropriate for grasping these objects. They showed that even if participants did not have to use the objects, the response times in categorization tasks were faster in congruent conditions (when the grip potentiated by an object was the same as that required by the ongoing task) than in incongruent conditions (see also Tucker and Ellis, 1998). Some studies have reported that actions with tools directly activate representations of their typical manipulation and have suggested that knowledge about manipulation gesture is involved in the selection of appropriate action plans (Creem-Regehr and Lee, 2005; Jax and Buxbaum, 2010; Ranganathan et al., 2011). According to these authors, specific memorized movements include action knowledge about manipulation and use that is automatically activated when a tool is seen (Buxbaum and Kalénine, 2010).

However, some studies have reported results that indicate non-automatic motor effects. For instance, the study conducted by McNair and Harris (2012) showed that seeing a tool automatically activates the grasp component rather than the manipulation component of motor activity in order to prepare for possible future use of the tool. They tested this assumption by comparing congruent vs. incongruent grasp and congruent vs. incongruent manipulation gestures between a prime and a target (both presented as pictures on a computer screen). The participants' task was to recall the name of the previously seen tool from a choice of many other tool names. The results showed that only grasp congruency enhanced participants' accuracy when identifying the previously seen tool. Furthermore, Pecher (2013) showed that a concurrent motor task did not interfere with the processing of the motor components of manipulable tools. This author asked participants to perform a perceptual task based on the perceptual or motor components of the stimulus, while also performing a concurrent motor task (i.e., various movements with their free hands). For instance, the participants performed visual tasks on manipulable and non-manipulable objects (e.g., they had to indicate whether a photograph of a tool was the same as or a mirror image of a preceding photograph) while performing a concurrent motor task. The author assumed that if the processing of manipulable tools is based partially on the activation of motor components, a concurrent motor task should interfere with processing. However, this study, like certain others that have been conducted, revealed no difference between the perceptual processing of manipulable and non-manipulable tools in a concurrent motor task paradigm (see also Pecher, 2013; Quak et al., 2014).

One possible explanation of motor activation effects (automatic or not) could lie in the intention to act (Massen and Prinz, 2009). Indeed, it has been suggested that relevant motor components are selected depending on the intention of the actor (Allport, 1987), which may be absent in perceptual tasks. Intention to act determines the nature of the information that is relevant for processing and this information can be processed irrespective of the target of the action (Craighero et al., 1998; Bekkering and Neggers, 2002; Lee et al., 2013; Roche and Chainay, 2014). One possibility is that activation of tool knowledge is not automatic but selectively modulated by the purpose of the action. If this is indeed the case, tool manipulation knowledge will not be activated in full and only those aspects relevant for the present situation will be activated. Contrary to sensorimotor theories, ideomotor theories have proposed that, rather than being automatic, the activation of tool knowledge depends on the intention (Prinz, 1997; Hommel, 2009). The intention to act in a given situation activates certain motor components that result from similar tool uses in the past and that are associated with the current environment (Massen and Prinz, 2009). Thus, the goal of the action could be taken into account at a very early stage during the planning of a movement, i.e., when the relevant information is selected for the planning and execution of the action (van Elk et al., 2010). For example, the study by Lindemann et al. (2006) focused on how tool manipulation knowledge is involved in the preparation for an action. Their results suggest that tool manipulation knowledge is not activated automatically, but is only activated when the subject intends to grasp the object in a typical way instead of just making a finger-lifting movement. In a more recent study by Ranganathan et al. (2011), participants had to interact in three different ways with a glass placed either upright or upside-down: by grasping it, touching it with a clenched fist, or grasping it with a magnetic implement. Shorter initiation times were found in the case of simple grasping and grasping with the magnetic implement when the glass was placed upright as opposed to upside-down. This effect was not present when the participants touched the glass with their fist. These results, together with those obtained by Lindemann et al. (2006), suggest that an object does not activate motor components automatically but instead does so in light of the purpose of the action and the possibilities and intentions of the person performing the action. Consequently, if there is no intention to manipulate the tool, manipulation gesture components remain irrelevant to the task. However, it is unclear whether a tool-associated manipulation gesture component can be activated when it is irrelevant to the task, as in a physical identity judgment task.

The second aim of the present study was to examine activation of the manipulation gesture processes as a function of the presentation modality of the tool (i.e., visual or auditory). In the traditional approach in which motor activation is considered to include manipulation gestures, visual information seems: (1) to be the preferred basis for the efficient execution of actions (Jacob and Jeannerod, 2005; Milner and Goodale, 2008) and (2) to rely on processes different from those involved in recognition and action knowledge (Milner and Goodale, 1995; Buxbaum and Kalénine, 2010). Accordingly, the visual processing of a tool would be sufficient in order to select the appropriate manipulation gesture for executing the action (Jax and Buxbaum, 2010). On the other hand, in a grounded cognition perspective, motor components are thought to be activated by another modality such as auditory information. Indeed, in everyday life, some actions are accompanied by a specific sound. Grounded theories suggest that the typical use of a tool is part of our knowledge about it (Gallese, 2005; Barsalou, 2008). This type of activation is consistent with the suggestion made by Gallese (2000) that tools differ from other objects because knowledge about tools includes one particular usage (Creem-Regehr and Lee, 2005). According to sensorimotor theories, knowledge about tools comes from sensorimotor traces that result from previous experience with tools. According to this view: (1) the auditory modality, just like any other sensory modality, could form the basis for the activation of motor components (see also Trumpp et al., 2014) and (2) motor and perceptual processes, including recognition and action knowledge, share common processes (see also Helbig et al., 2006, 2010; Sim et al., 2014).

In the present study, we tested the possible effect of motor component activation as a function of manipulation gesture in perceptual and motor tasks. In both tasks, a first tool was presented to the participants just before the presentation of a second tool on which they had to perform a perceptual and a motor task. In all the experiments, we manipulated the factor of Gesture Similarity between the first and second tool in three conditions. The gesture used for the two tools could be Identical (same tool, same gesture), Similar (different tools but similar gesture) or Different (different tools and gestures). We assumed that the activation of a similar gesture for the two tools would facilitate the subject response (i.e., faster response times or initiation times) if manipulation gesture is activated by the presentation of a tool. In addition, different results should be observed depending on the demands of the motor task. Indeed, if motor components are activated when they are relevant to the task, participants should only respond faster in the Similar than in the Different condition in the motor task since the activation of motor components is not relevant for the perceptual tasks. In addition to the type of the task (perceptual or motor), we also manipulated the presentation modality (visual or auditory) of the first object across the experiments. By using familiar tools associated with a well-known sound during utilization, we assumed that the auditory presentation of tools could activate manipulation gesture components in the same way as a visual presentation. More specifically, the first and second tools were presented visually in a perceptual (Experiment 1) and a motor task (Experiments 1 and 2), whereas the first tool was presented auditorily in Experiment 3 in perceptual and motor tasks.

## 2. Experiment 1

### 2.1. Method

### 2.1.1. Participants

Sixteen participants from the University of Lyon 2 took part in Experiment 1 (13 females, M = 20.06, SD = 2.05) after completing a written consent form. All of them reported themselves as right-handed and with normal or corrected-to-normal vision and audition. The study was approved by the local Ethics Committee.

### 2.1.2. Stimuli

The stimuli consisted of six manipulable objects which were presented as pictures in the perceptual tasks (in order to remain consistent with the habitual perceptual paradigms, e.g., Labeye et al., 2008; Borghi et al., 2012; Pecher, 2013) and were physically placed in front of the participants for the motor tasks. As in the McNair and Harris (2012) study, they were subdivided into three pairs depending on the similarity of the gestures required for their use. The first and the second pairs corresponded to pistol/spraybottle and hammer/maracas (the maracas replaced the bell from the McNair and Harris study, because in the pre-test we found it to be too noisy when manipulated by the experimenter). The third pair (whistle/party blowout) was chosen on the same principle as the first and the second pair (see **Figure 1**).

The pictures were colored photographs of the six tools (2725 × 1187 pixels with a resolution of 300 × 300 dots per inch), taken from the same angle as that at which they were presented in the motor task. The photographs were presented at a distance of 65 cm from the participant's eyes.

### 2.1.3. Tasks Assignment and General Design

The participants were tested individually in each of the tasks. The order of presentation of the perceptual and motor tasks was counterbalanced across participants. Before starting the experiment, we made sure that the participants knew the tools and the gesture associated with their use: the participants were first asked to say the name of the tools. If they failed to say the correct name, they were asked to describe the context of use (this was particularly useful for the party blowouts and spray cleaner since they have unfamiliar names in French). They were then asked to grasp the tool and demonstrate how to use it. If the gesture they made was only approximate, we told them "normally we use it like this," showed them the correct movement and asked them to replicate the gesture (for example, maracas are moved front-to-back and not left-to-right). After this preparatory phase, the participants were asked to start performing the task to which they had been assigned.

… correspond to the pairs with similar utilization gestures.

In all the experiments presented in this study, we manipulated the factor of Gesture Similarity between the first and second tool over three conditions: (1) Identical: the first tool was identical to the second; (2) Similar: the first tool belonged to the same pair of tools as the second; (3) Different: the first tool was different and did not belong to the same pair of tools as the second (the first tool was chosen pseudo-randomly across the four remaining tools).
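The three Gesture Similarity conditions can be expressed as a simple rule over the tool pairs listed in the Stimuli section. The following is a minimal illustrative sketch, not the software used in the study; the names `GESTURE_PAIRS`, `PAIR_OF`, and `gesture_similarity` are hypothetical.

```python
# Illustrative sketch only: names and structure are assumptions,
# not taken from the experimental scripts of the study.

# Tool pairs that share a similar utilization gesture (Section 2.1.2).
GESTURE_PAIRS = [
    ("pistol", "spraybottle"),
    ("hammer", "maracas"),
    ("whistle", "party blowout"),
]

# Map each tool to the index of the pair it belongs to.
PAIR_OF = {tool: i for i, pair in enumerate(GESTURE_PAIRS) for tool in pair}


def gesture_similarity(first_tool: str, second_tool: str) -> str:
    """Return the Gesture Similarity condition for a first/second tool."""
    if first_tool == second_tool:
        return "Identical"  # same tool, same gesture
    if PAIR_OF[first_tool] == PAIR_OF[second_tool]:
        return "Similar"  # different tools, similar gesture
    return "Different"  # different tools, different gestures


if __name__ == "__main__":
    for first in ("hammer", "maracas", "whistle"):
        print(first, "-> hammer:", gesture_similarity(first, "hammer"))
```

For any given second tool, exactly four of the six tools yield the Different condition, which matches the "four remaining tools" from which the first tool was drawn.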

### 2.1.4. Material and Procedure

### **Perceptual task**

### **Material**

The experiment was conducted on a Macintosh iMac. OpenSesame software was used to set up and control the experiment (Mathôt et al., 2012).

### **Procedure**

For the perceptual task, the participants were positioned facing the computer, with their right hand on the keyboard. Following the display of a fixation point, a first tool was presented to the participants for 1000 ms. After an Inter Stimulus Interval (ISI) of 500 ms, a second tool was presented. The second tool was displayed until the subject responded and was followed by an inter-trial interval of 1500 ms. The participants were asked to respond as quickly as possible by pressing the appropriate key on the keyboard (corresponding to the J and K keys, with the key assignment being counterbalanced across participants). After the phase of familiarization with the material, the perceptual task consisted of a physical identity judgment task (e.g., Vingerhoets et al., 2009). The participants had to indicate whether the tools were visually identical or different. Both tools were presented in one of the Gesture Similarity conditions (Identical, Similar, or Different). The first tool was always presented at 45° to the right (relative to the participant's midline), while the second tool was presented twice at 0° (aligned with the participant's midline) and twice at 90°, thus giving a total of 72 trials.
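The total of 72 trials is consistent with crossing six second tools, three Gesture Similarity conditions, and four presentations of the second tool (twice at 0° and twice at 90°). The sketch below enumerates such a trial list under that reading, which is an assumption; the function `build_trials`, the dictionary keys, and the seeded random draw for the Different condition are hypothetical and do not reproduce the OpenSesame script.

```python
import random

# Illustrative sketch only: the factorial reading (6 x 3 x 4 = 72) and all
# names below are assumptions, not the authors' experimental script.
PAIRS = [("pistol", "spraybottle"), ("hammer", "maracas"),
         ("whistle", "party blowout")]
TOOLS = [tool for pair in PAIRS for tool in pair]
PARTNER = {a: b for a, b in PAIRS} | {b: a for a, b in PAIRS}
CONDITIONS = ("Identical", "Similar", "Different")
TARGET_ORIENTATIONS = (0, 0, 90, 90)  # second tool: twice at 0, twice at 90
PRIME_ORIENTATION = 45  # first tool always at 45 degrees to the right


def build_trials(seed: int = 0) -> list[dict]:
    """Enumerate one shuffled list of prime/target trials."""
    rng = random.Random(seed)
    trials = []
    for target in TOOLS:
        for condition in CONDITIONS:
            for orientation in TARGET_ORIENTATIONS:
                if condition == "Identical":
                    prime = target
                elif condition == "Similar":
                    prime = PARTNER[target]
                else:
                    # Draw from the four tools outside the target's pair.
                    others = [t for t in TOOLS
                              if t not in (target, PARTNER[target])]
                    prime = rng.choice(others)
                trials.append({
                    "prime": prime,
                    "prime_orientation": PRIME_ORIENTATION,
                    "target": target,
                    "target_orientation": orientation,
                    "condition": condition,
                })
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    print(len(build_trials()))
```

Under this reading each condition contributes 24 trials, so the three conditions are equally frequent.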

### **Motor task**

### **Material**

A Dell computer equipped with E-Prime 2 software (Psychology Software Tools, Inc., USA) was used to run the experiment and record initiation times. Liquid-crystal goggles (Plato Translucent Technologies, Toronto, ON, USA) were used to control the subject's vision, and a home-made spherical trigger button of 4 cm diameter was connected to the computer and used to collect gesture initiation times. The tools were placed on a board measuring 40 cm by 50 cm.

### **Procedure**

The participants were positioned facing the experimental board, with their right hand on the button. In order to be consistent with the other tasks, the primes and targets, i.e., the first and second tools, respectively, were presented on the experimental board one at a time at a distance of approximately 45 cm from the participant's hand and with their graspable component facing the participant. To avoid the affordance of exactly the same grasp movement between the first and second tools, we always used different orientations for the two tools. The orientations were 0° (aligned with the participant's midline) or 90° for the second tool, and always 45° to the right for the first tool.

After 10 training trials in which all the conditions and tools were presented, each participant performed 72 trials which were identical to those used in the perceptual task (see **Figure 2**). The second tool was oriented at either 0° or 90° to create a variation in the grasp parameters and thus avoid repetitive grasp movements across trials. The trials were divided into 3 mini-blocks which were counterbalanced across participants.

All the trials started with a beep to remind the participant to place his/her hand on the release button. At the same time, the goggles became opaque for 1500 ms and a first tool was placed on the experimental board during this period. The goggles then became transparent for 1000 ms so that the prime was visible, before turning opaque again for a further 1500 ms. During this ISI, the experimenter replaced the first tool on the experimental board with the second or, in the Identical condition, simply changed the orientation of the tool. At the end of the ISI, the goggles became transparent again and a simultaneous go signal indicated to the participant that he/she should grasp the second tool and show how to use it. The next trial then started with a beep. The participants were told to initiate the movement toward the tool as quickly as possible and simulate its use. They were given 3000 ms to do so.

To summarize, there were three differences between the tasks: the time interval between the stimuli (500 ms in the perceptual tasks and 1500 ms in the motor task), the presentation of the stimuli (pictures in the perceptual tasks and real tools in the motor task) and the nature of the task (comparison of two tools in the perceptual tasks and execution of the utilization gesture in the motor task).

### 2.1.5. Statistical Analyses

We measured Reaction Times (RT) and error rates in the perceptual tasks and Initiation Time (IT) in the motor task (IT corresponded to the time that elapsed between the go signal and the time when the participants removed their hand from the release button). Errors in the motor task were not analyzed due to their very small number (i.e., only three subjects made one or two errors in the motor task of the first experiment). Reaction (or Initiation) times that were greater than 1500 ms or less than 250 ms, as well as those that differed by more than 2.5 standard deviations from the individual participant's mean for each condition, were removed (less than 3% of the data). Preliminary analyses were conducted to check for normality (Shapiro-Wilk test) and sphericity (Mauchly's test) and no violations were found. We used the mean correct RT for the analyses. Separate analyses of variance (ANOVA) were performed for RT and error rates in the Perceptual task and for IT in the Motor task, with subjects as a random variable and Gesture Similarity (identical, similar, different) as a within-subjects factor.
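The latency trimming described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' pre-processing code: it treats each criterion as sufficient for exclusion, applies the absolute 250 to 1500 ms window first, and then computes the mean and the sample standard deviation on the remaining values for one participant in one condition. The function name `trim_latencies` and its parameters are hypothetical.

```python
from statistics import mean, stdev

# Illustrative sketch only: the order of the two steps and the use of the
# sample standard deviation are assumptions, not stated in the article.


def trim_latencies(latencies, low=250.0, high=1500.0, n_sd=2.5):
    """Trim one participant's latencies (in ms) for one condition."""
    # Step 1: drop values outside the absolute window.
    in_window = [x for x in latencies if low <= x <= high]
    if len(in_window) < 2:
        return in_window  # a standard deviation cannot be computed
    # Step 2: drop values further than n_sd standard deviations from the mean.
    m, s = mean(in_window), stdev(in_window)
    return [x for x in in_window if abs(x - m) <= n_sd * s]


if __name__ == "__main__":
    sample = [500, 520, 480, 510, 2000, 100]
    print(trim_latencies(sample))
```

Computing the mean and standard deviation after the absolute cutoffs keeps extreme values from inflating the standard deviation; a different order would give slightly different exclusions.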

Given that we tested specific hypotheses, planned comparisons were performed. A significance level of α = 0.05 was used for all the statistical analyses. Means and standard errors of RT/IT for all the tasks and experiments are presented in **Table 1**.

For control purposes, we checked for a possible Tool Pair effect as well as for an interaction with the Gesture Similarity factor. We also checked for a possible Task Order effect and for an interaction between this and the Gesture Similarity factor. The data analyses were performed using STATISTICA (version 8.0, StatSoft, Inc.). The same analyses and controls were used for all the data presented in this article.

### 2.2. Results and Discussion

### 2.2.1. Perceptual Task

The analysis of RT revealed a significant effect of Gesture Similarity, F(2, 30) = 4.49, p = 0.023, $\eta_p^2 = 0.23$. Planned comparisons showed that RT were faster in the Identical condition (M = 575 ms, SE = 38) than in either the Similar condition (M = 600 ms, SE = 41, p = 0.012) or the Different condition (M = 602 ms, SE = 40, p = 0.018), but no difference was observed between the Similar and Different conditions (p = 0.91). No simple effect of Task Order or Tool Pair was observed and there was no interaction between either Task Order or Tool Pair and Gesture Similarity (p > 0.1).

TABLE 1 | Means of reaction (initiation) times (in ms) for Experiments 1, 2, and 3 (between-subjects standard errors in parentheses).

The analysis of error rates showed a significant effect of Gesture Similarity, F(2, 30) = 2.40, p = 0.029, $\eta_p^2 = 0.14$. The participants were more accurate in the Identical condition (M = 1.95, SE = 0.56) than in the Different condition (M = 3.22, SE = 0.59, p = 0.029). No difference was observed between the Similar (M = 2.77, SE = 0.54) and either the Identical, p = 0.20, or Different conditions, p = 0.42.

### 2.2.2. Motor Task

No simple effect of Task Order or Tool Pairs was observed, and neither of these interacted with Gesture Similarity (p > 0.1). No significant effect of Gesture Similarity was observed, F(2, 30) = 0.49, p = 0.62. We did not observe a significant difference between Identical (M = 521 ms, SE = 35 ms), Similar (M = 528 ms, SE = 71 ms) and Different (M = 520 ms, SE = 67 ms) conditions.

In the perceptual task, the fact that the two tools required a similar gesture did not facilitate the subjects' response (no difference between the Similar and Different conditions was observed). The processing of the first tool seemed irrelevant for the processing of the second tool even if use of the two tools shared a similar manipulation gesture.

Surprisingly, no effect at all was found in the motor task. A previous study using an identical protocol, but with a grasping task, found priming effects when the prime and the target were identical tools (Roche and Chainay, 2013). One way to explain this difference compared to the present motor task is to consider that the movement is planned and controlled as a function of its purpose and that this determines the different steps involved in the movement, including the grasp (Rosenbaum and Halloran, 2006; Ansuini et al., 2008). If this is indeed the case, then it is possible that a priming effect will be found in a grasping task, whereas no such effect will occur in a task in which a tool-specific gesture guides the entire movement (e.g., only grasping a toothbrush vs. grasping a toothbrush and performing the toothbrushing movement) (Massen and Prinz, 2009). Another possible explanation for this effect might be that the participants had learned that the first tool was irrelevant to the task and that they therefore ignored it. Indeed, Pfannmüller et al. (2012) have shown that visuomotor priming effects depend on the quality of prime processing and its memorization. It is possible that some of the processes involved in grasping are more likely to be activated automatically and are less intentional than those involved in the utilization gesture. As in the Pfannmüller et al. (2012) study, we changed the protocol used for our motor task in Experiment 2 so that the first tool involved a preparation for a subsequent response. This ensured that the participants could not ignore the first tool as they could in Experiment 1. We asked the participants to grasp the second tool while performing the action corresponding to the first tool. This change of protocol increased the memorization and quality of prime processing (Pfannmüller et al., 2012). Thus, in this experiment, the intention to act was directed toward the second tool, whereas the planned gesture was determined in advance by the first tool.

## 3. Experiment 2

### 3.1. Method

### 3.1.1. Participants

Sixteen self-reported right-handed participants took part in this experiment (8 women, M = 23.25, SD = 5.65). None of them had participated in Experiment 1.

### 3.1.2. Stimuli and Procedure

The same material and procedure as in the motor task in the previous experiment were used, except that we did not use a visuomotor protocol. In this experiment, the participants were told to grasp the second tool as quickly as possible while reproducing the action corresponding to the first tool, irrespective of the grasped tool.

### 3.1.3. Statistical Analyses

The same cutoff as in Experiment 1 was used (which eliminated less than 1% of the data). The IT were pre-processed according to the same criteria as in Experiment 1. ANOVAs were performed on the IT with subjects as a random variable and Gesture Similarity as a within-subjects factor. In addition, the interactions of Tool Pair and Task Order with Gesture Similarity were tested for control purposes.

### 3.2. Results and Discussion

The results showed a significant effect of Gesture Similarity, F(2, 30) = 11.92, p < 0.001, $\eta_p^2 = 0.44$. IT were shorter for the Identical condition (M = 543 ms, SE = 23) than for the Similar (M = 565 ms, SE = 27, p = 0.026) and the Different conditions (M = 604 ms, SE = 29, p < 0.001). Moreover, IT were shorter for the Similar condition than for the Different condition (p = 0.011). No simple effect of Tool Pair was observed, and Tool Pair did not interact significantly with Gesture Similarity (p > 0.1).

To gain a better understanding of the relevance of the first tool for the task, which could explain the different patterns of results in the motor tasks in Experiments 1 and 2, we performed an ANOVA with the relevance of the first tool (irrelevant/Experiment 1 vs. relevant/Experiment 2) as group factor and Gesture Similarity as repeated-measure factor. The analysis revealed a significant effect of Gesture Similarity [F(2, 60) = 7.67, p = 0.002, $\eta_p^2 = 0.20$] and, more interestingly, showed that the interaction between Experiments and Gesture Similarity [F(2, 60) = 8.55, p < 0.001, $\eta_p^2 = 0.22$] was significant (see **Figure 3**). Planned comparisons are reported separately in the results section of each experiment.
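As a reading aid, the partial eta squared values reported alongside each F test can be recovered from the F statistic and its degrees of freedom, using the standard identity that partial eta squared equals F times the effect degrees of freedom, divided by that product plus the error degrees of freedom. The short sketch below applies this identity to the two tests reported above; the function name `partial_eta_squared` is hypothetical and the code is not part of the original analysis, which was run in STATISTICA.

```python
# Illustrative sketch only: a generic identity for partial eta squared,
# not the analysis code used in the study.


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Main effect of Gesture Similarity, F(2, 60) = 7.67
    print(round(partial_eta_squared(7.67, 2, 60), 2))
    # Experiment x Gesture Similarity interaction, F(2, 60) = 8.55
    print(round(partial_eta_squared(8.55, 2, 60), 2))
```

Both values round to the effect sizes given in the text (0.20 and 0.22).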

In Experiment 2, we found an effect of Gesture Similarity. First of all, the results showed that movement IT were faster when the first and second tools were Identical rather than Similar or Different. We interpret this finding in terms of a facilitatory effect which enables subjects to plan their action in advance on the basis of the first tool presented just before manipulating the same (Identical) tool. Secondly, we found shorter IT in the Similar than in the Different condition. In both conditions, although the tools changed between the first and second presentation, their similarity in terms of motor manipulation nevertheless facilitated the initiation of the movement. In addition, and unlike in the motor task in Experiment 1, increasing the memorization and quality of the processing of the first tool enabled us to obtain an effect of Gesture Similarity. It seems possible that, unlike in a perceptual or grasping task, a more complex action such as demonstrating the actual utilization of a tool demands more situated processing. More generally, the results of Experiment 2 suggest that it is the intention to act that determines the processing of motor components (Allport, 1987) in the light of the overall goal of the action (Massen and Prinz, 2009).

To extend our study, Experiment 3 explored the possibility that a facilitation effect might be observed in response to an auditory presentation of the tool. If all the sensorimotor components are activated during the situation, then this activation should be induced by any sensory modality (e.g., the sound of a hammer should allow access to its action in just the same way as a hammer presented visually). Consequently, the first tool was not presented visually but auditorily by playing the sound associated with its utilization. The participants performed the perceptual identity task from Experiment 1 and the motor task from Experiment 2.

## 4. Experiment 3

### 4.1. Method

### 4.1.1. Participants

Sixteen participants took part in this experiment (13 females, M = 21.38, SD = 3.12). None of them had taken part in the previous experiments.

### 4.1.2. Stimuli

The respective sounds of the six objects replaced the presentation of the first tool in the motor task and the presentation of the photographs in the perceptual task.

### 4.1.3. Procedure

The same general material and procedure as in the first experiment were used for this experiment. The only difference concerned the modality in which the first tool was presented. The visual presentation in Experiments 1 and 2 was replaced by the corresponding sound of tool utilization. We kept the same exposure duration of 1000 ms for the first tool. The second tool was presented in the same way as in the previous experiments (pictures in the perceptual task and the physical tool in the motor task). In the motor task, the goggles did not become transparent during the presentation of the sound. As in Experiment 2, the participants were told to grasp the second tool as quickly as possible while reproducing the action corresponding to the first tool (which had been presented auditorily), irrespective of the grasped tool.

### 4.1.4. Statistical Analyses

The same cutoff as in the previous experiments was used (which eliminated less than 3% of the data in the perceptual task and 1% in the motor task). The RT/IT were pre-processed using the same criteria as in Experiment 1. The ANOVAs conducted on the RT, error rates, and IT were performed with subjects as a random variable and Gesture Similarity as a within-subjects factor. In addition, the interactions of Tool Pair and Task Order with Gesture Similarity were tested for control purposes.

### 4.2. Results and Discussion

### 4.2.1. Perceptual Task

A significant effect of Gesture Similarity was observed on RT, F(2, 30) = 3.81, p = 0.033, $\eta_p^2 = 0.19$. Planned comparisons showed that RT were faster for the Identical condition (M = 609 ms, SE = 27) than for either the Similar condition (M = 641 ms, SE = 31, p = 0.05) or the Different condition (M = 640 ms, SE = 29, p = 0.017). However, no difference was observed between the Similar and Different conditions (p = 0.94). No simple effect of Task Order or Tool Pair was observed and neither Task Order nor Tool Pair interacted with Gesture Similarity (p > 0.1).

The analyses of error rates revealed no significant difference between the identical (M = 1.91, SE = 0.57), similar (M = 1.78, SE = 0.53) and different conditions (M = 2.55, SE = 0.53), p = 0.58.

### 4.2.2. Motor Task

The IT as a function of Gesture Similarity revealed a significant effect, F(2, 30) = 10.91, p < 0.001, $\eta_p^2 = 0.42$ (see **Figure 4**). Planned comparisons showed that IT were faster for the Identical condition (M = 458 ms, SE = 30) than for the Similar (M = 476 ms, SE = 32, p < 0.03) and the Different condition (M = 490 ms, SE = 32, p < 0.001), and that IT were faster for the Similar than for the Different condition (p = 0.05). No simple effect of Task Order or Tool Pair was observed, and neither Task Order nor Tool Pair individually interacted with Gesture Similarity (p > 0.1).

In the perceptual task, which in this case involved an auditory presentation of the first tool, the same pattern of results was observed as in the perceptual task of Experiment 1.

In the motor task, the results revealed shorter movement IT when the two tools were Identical than when they were Similar or Different. This result showed that there was a facilitatory effect on the planning of an action with the second tool when the participants had heard the same tool before. Moreover, and in line with our assumption, the effect of Gesture Similarity (previously observed in the motor task of Experiment 2, in which a similar protocol was used) was also observed when the two tools were similar. In fact, the participants responded faster in this condition than in a condition in which the tools were different. The difference between these two conditions lay in the similarity of the motor manipulation between the two tools in the Similar condition.

### 4.2.3. Comparison of Visual and Auditory Conditions in Perceptual and Motor Tasks

We ran supplementary analyses to compare the visual and auditory modalities in the perceptual (visual modality in Experiment 1 vs. auditory modality in Experiment 3) and motor tasks (visual modality in Experiment 2 vs. auditory modality in Experiment 3). Separate ANOVAs were performed on RT/IT for the perceptual and motor tasks, with subjects as random variable, Gesture Similarity as within-subjects factor and Modality (of the first tool) as between-subjects factor.

Concerning the perceptual tasks, a significant main effect of Gesture Similarity [F(2, 60) = 7.88, p < 0.001, $\eta_p^2 = 0.20$; Identical: M = 592, SE = 23, Similar: M = 621, SE = 25, Different: M = 621, SE = 25] was found, but no main effect of Modality (p = 0.44) and no interaction between Gesture Similarity and Modality (p = 0.92). In Experiments 1 and 3, the participants were faster in the Identical condition than in either the Similar or Different conditions.

As far as the motor tasks are concerned, we found a significant main effect of Gesture Similarity [F(2, 60) = 21.07, p < 0.001, $\eta_p^2 = 0.41$; Identical: M = 501, SE = 20, Similar: M = 521, SE = 22, Different: M = 547, SE = 24] and a main effect of Modality [F(1, 30) = 5.77, p = 0.022, $\eta_p^2 = 0.16$; Visual: M = 571, SE = 26, Auditory: M = 474, SE = 31]. Participants were faster in the auditory condition (Experiment 3) than in the visual condition (Experiment 2). However, no interaction between Gesture Similarity and Modality was observed [F(2, 60) = 2.41, p = 0.098].

### 5. Discussion

The present study investigated motor facilitation by presenting familiar tools that either did or did not require the same gesture when manipulated. The activation of motor components should be reflected by shorter reaction times or movement initiation times when the two presented tools share a similar gesture. More specifically, we explored whether the facilitation induced by manipulation gesture congruency depends on the relevance of the first tool for response preparation. We asked our participants to perform both perceptual and motor tasks, with the motor task requiring the physical execution of the movement. The originality of the present study lies in the manipulation of the relevance of the first tool for motor preparation within a motor priming paradigm. We also investigated whether motor preparation would be induced by auditory stimulation, i.e., whether an auditory presentation modality of tools can influence the initiation of a corresponding gesture in the same way as visual presentation.

As far as the presentation modality of the prime, which might play a role in motor activation, is concerned, we focus our discussion on the perceptual tasks in Experiment 1 (visual) and Experiment 3 (auditory) and the motor tasks in Experiment 2 (visual) and Experiment 3 (auditory), which were identical except for the modality of presentation of the first tool. With regard to the difference between the "similar" and "different" modalities of Gesture Similarity in these experiments, we observed an effect of Gesture Similarity in the motor task but not in the perceptual task, irrespective of visual or auditory presentation of the first tool. It has been argued that vision is the preferred sense for tool use (Jeannerod and Jacob, 2005; Milner and Goodale, 2008). However, audio-motor interaction has also been explored in the literature. For instance, D'Ausilio et al. (2010) showed that the congruency between motor preparation induced by an auditory stimulation and the future motor state had different consequences on motor performance. The present results are consistent with the theoretical framework of embodied and situated cognition. According to this framework, individuals encode all the sensory components of the situation when they interact with the environment, with there being no difference between a static (in this experiment, visual) and a more dynamic, action-related (auditory) presentation (Versace et al., 2009). Behavioral research has shown that congruent motor interactions between object pairs facilitate perceptual processes such as object recognition (e.g., Helbig et al., 2006; Kiefer and Martens, 2010). The results of the present study support the idea that motor components such as manipulation gesture can be reactivated not only by visual presentation but also by auditory presentation.

The comparison of motor tasks revealed a main effect of modality, with faster initiation times being observed in Experiment 3 (auditory presentation of the first tool) than in Experiment 2 (visual presentation). This effect must be interpreted carefully as we did not observe an interaction between Modality and Gesture Similarity. However, the faster initiation times in response to auditory presentation can be explained by the multimodality of the presentations (an auditory presentation of the first tool and a visual presentation of the second tool). Indeed, the literature reports that multimodal objects are processed faster and more accurately than unimodal objects (Giard and Peronnet, 1999). In an embodied perspective, the sound of a tool refers to its direct utilization and may accelerate the activation of components of the manipulation gesture. This difference might also be related to the experimental design of the study. Indeed, the goggles were opaque throughout the entire auditory presentation of the first tool, whereas they were successively opaque/transparent/opaque during the visual presentation of the first tool. Thus, participants might have focused more on the sound and the task with the auditory presentation. To determine whether this is indeed the case, it would be possible to perform a further study with the same experimental design, i.e., in which the goggles are also opaque, then transparent, and then opaque again during the auditory presentation of the object.

Action planning can affect perceptual processing (see Theory of Event Coding, Hommel et al., 2001). Consequently, the presentation of a tool (or another stimulus which is associated with a particular action) automatically induces the production of the same action by the system. However, in the present study, the gestures associated with the tools in the perceptual tasks in Experiments 1 and 3 were irrelevant to the task and there was no intention to act. In these perceptual tasks, we did not find any difference between the similar and different gesture conditions. These results are consistent with the suggestion made by Vingerhoets et al. (2009) that motor knowledge about tools, and especially about their manipulation (corresponding gesture), is not activated by simply seeing a tool. These authors also found that grasp motor components could be automatically activated by seeing a tool, a finding which is consistent with the observation of shorter reaction times in the identical gesture condition than in the similar and different gesture conditions in our perceptual task, as well as with other studies (Ellis and Tucker, 2000; Sumner and Husain, 2008; McNair and Harris, 2012). In the present study, we cannot exclude the possibility that the participants did not pay attention to the first tool in Experiment 1. Further investigations (in which the participants cannot ignore the first tool) should help to determine whether the results are due to the relevance of the first tool or to the attention paid to the tool.

All our motor tasks involved an intention to act. However, while in Experiment 1 the first tool presentation was irrelevant to the task, the motor tasks in Experiments 2 and 3 required the participants to plan their movements as a function of the initially presented tool and to perform the gesture with the second tool. Thus, in Experiments 2 and 3, the results of the motor tasks additionally revealed shorter initiation times in the similar than in the different gesture condition. The different patterns of results between the motor tasks used in Experiment 1 and Experiments 2 and 3 showed that an intention to act is not the only source of motor component activation. Indeed, it seems that the motor components need to be relevant to the task if they are to induce motor facilitation, especially when the task demands a more complex activity than simply grasping and carrying the tool. Unlike grasping, which is the non-reducible first step for actions with tools, it seems likely that tool use requires more specific processing of the situation and of the individual's needs. The fact that, in complex motor tasks such as tool use, individuals process only specific, relevant information can be seen as economical at the level of cognitive resources (Randerath et al., 2013). Embodied cognition theories claim that knowledge about tools comes from previous sensorimotor experiences with them (e.g., Gallese, 2005; Binkofski and Buxbaum, 2013). However, the question remains as to whether conceptual knowledge about tools might include manipulation knowledge (Garcea and Mahon, 2012; Osiurak, 2014). It seems that the best way to address this question would be to explore it in relation to the intention to act on the tool and the relevance of the motor components in the current situation.

### Acknowledgments

AR and KR were supported by a graduate research allocation from the French Ministry for Higher Education and Scientific Research. This work was supported by the LabEx Cortex (ANR-10-LABX-0042) of Université de Lyon, within the program "Investissements d'Avenir" (ANR-11-IDEX-0007).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Rey, Roche, Versace and Chainay. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The embodied dynamics of perceptual causality: a slippery slope?

Michel-Ange Amorim<sup>1</sup>\*, Isabelle A. Siegler<sup>1</sup>, Robin Baurès<sup>2,3</sup> and Armando M. Oliveira<sup>4</sup>

<sup>1</sup> CIAMS, Univ Paris-Sud, Université Paris-Saclay, Orsay, France, <sup>2</sup> Centre de Recherche Cerveau et Cognition, Université de Toulouse, UPS, Toulouse, France, <sup>3</sup> Centre National de la Recherche Scientifique, CerCo, Toulouse, France, <sup>4</sup> Institute of Cognitive Psychology – Faculty of Psychology and Educational Sciences, University of Coimbra, Coimbra, Portugal

In Michotte's launching displays, while the launcher (object A) seems to move autonomously, the target (object B) seems to be displaced passively. However, the impression of A actively launching B does not persist beyond a certain distance identified as the "radius of action" of A over B. If the target keeps moving beyond the radius of action, it loses its passivity and seems to move autonomously. Here, we manipulated implied friction by drawing (or not) a surface upon which A and B are traveling, and by varying the inclination of this surface in screen- and earth-centered reference frames. Among 72 participants (n = 52 in Experiment 1; n = 20 in Experiment 2), we show that both physical embodiment of the event (looking straight ahead at a screen displaying the event on a vertical plane vs. looking downwards at the event displayed on a horizontal plane) and contextual information (objects moving along a depicted surface or in isolation) affect interpretation of the event and modulate the radius of action of the launcher. Using classical mechanics equations, we show that representational consistency of friction from radius of action responses emphasizes the embodied nature of frictional force in our cognitive architecture.

### Keywords: causality, friction, embodied cognition, event perception, prediction

### Introduction

"It was a slippery, slippery, slippery slope I feel me slipping in and out of consciousness" Thom Yorke (2006) Harrowdown Hill lyrics excerpt

We inhabit a world where friction is omnipresent and crucial in our everyday life actions. Consider a world without friction (i.e., full of slippery surfaces): we could not walk without the friction between our shoes and the ground; neither could we hold any object (e.g., a pencil). When walking, our foot pushes backwards on the ground and the reaction force pushes us forwards thanks to friction. Actually, we often experience friction as a source of effort, such as when rearranging the furniture in a room. There is strong evidence that actions in our world, and the accompanying forces we experience, influence visual perception of distance and motion (Proffitt, 2006), and are at the origin of our causal understanding (White, 2006, 2009). For example, apparent distance or terrain inclination increases when wearing a heavy backpack or throwing a heavy object (Proffitt, 2006), and because our brain predicts the sensory effects and causal effects of our actions we can learn to play drums or basketball, but can hardly tickle ourselves (Blakemore et al., 2000). However, the mastery of physical forces accompanying our actions (e.g., playing music or sport) stands in sharp contrast to our poor explicit declarative knowledge about their underlying dynamics (McCloskey, 1983; Hecht and Bertamini, 2000).

#### Edited by:

Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada

#### Reviewed by:

Kelly M. Goedert, Seton Hall University, USA; Timothy L. Hubbard, Texas Christian University, USA

#### \*Correspondence:

Michel-Ange Amorim, CIAMS, Univ Paris-Sud, UFR STAPS, 91405 Orsay, France michel-ange.amorim@u-psud.fr

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 30 November 2014; Accepted: 02 April 2015; Published: 21 April 2015

#### Citation:

Amorim M-A, Siegler IA, Baurès R and Oliveira AM (2015) The embodied dynamics of perceptual causality: a slippery slope? Front. Psychol. 6:483. doi: 10.3389/fpsyg.2015.00483

Perception of collision events offers a variety of paradigms to unravel the structure of our mental representation of physical principles, e.g., perception of objects bouncing on the ground (Twardy and Bingham, 2002); control of rhythmic ball bouncing (Siegler et al., 2010), etc. Michotte (1963) and others (e.g., Schlottmann and Anderson, 1993; Schlottmann et al., 2006) showed that when an object A moves toward an initially stationary object B, and B is set into motion when A reaches B (and in turn A becomes stationary), the impression of A actively launching B does not persist beyond a certain distance identified as the "radius of action" of A over B (Yela, 1954; Boyle, 1961). While the launcher (object A) seems to move autonomously, the target (object B) seems to be displaced passively. This distinction is reminiscent of Newton's definition of force as both an action (vis impressa or impressed force) and also as a property of motion (vis inertiae or force of inertia). If the target keeps moving beyond the radius of action, it loses its passivity (vis inertiae) and seems to move autonomously (vis impressa). Our contention is that measurement of the radius of action (RA) provides insight into the perceived kinetic properties of the event (Sinico and Parovel, 2002).

The notion that our understanding of dynamics stems from our experiences of acting on objects has been argued to offer a unifying account of visual impressions of forces, imagery implicated in the simulation of dynamic events, and explicit judgments about forces (White, 2009, 2011). This view assigns a unique role to proprioception and mechanoreceptors in the way forces are perceived and representationally construed. In contrast to claims of a direct perception of causality (Michotte, 1963), or of a direct specification of dynamics by kinematics (Runeson and Frykholm, 1983), it stipulates that visual impressions of force arise through the coupling of visual input with a knowledge base of embodied dynamics (White, 2006, 2009, 2012c,d). Distinctions between force and resistance, active and passive, or cause and effect, which do not form part of mechanics (standing in violation of its third law), become thereby a part of our understanding of mechanical interactions, and give rise to visual impressions of causality (White, 2006). One entailed consequence is that haptically embodied representations do not presume an isomorphism with physical invariants—e.g., kinematic geometry (Shepard, 1984, 2001), spatio-temporal coherence (Freyd, 1987, 1993), Newtonian principles (Sanborn et al., 2013). While they correspond to a form of internalization, they may as well be described as an externalization of body dynamics (Hecht, 2001), with the consequence that both their internal and external consistency remain in every case a matter for inquiry (Hecht and Bertamini, 2000; White, 2012b). Evidence exists that which forces come into awareness, and how they are interpreted, depends on mental simulations driven by our embodied knowledge of dynamics: discrepancy from predictions (in a forward model of action) may thus bring into awareness a force otherwise unnoticed (White, 2009, 2012a).

Michotte's launching display is suggestive of an elastic collision (e.g., between steel or pool balls) where the momentum (and kinetic energy) of the launcher is entirely transferred to the target and the target would keep traveling at the same velocity as the launcher just before contact, onto a surface without friction. However, both the concept of RA and the fact that the causal impression is increased when the target moves at a reduced velocity relative to the launcher (Michotte, 1963; Schlottmann and Anderson, 1993), suggest that implied friction is in the eye of the beholder. In the present study, we manipulated implied friction by drawing (or not) a surface onto which the launcher and the target are traveling (which is usually not the case in causal displays), and by varying the inclination of this surface in a screen-centered (horizontal or diagonal) and earth-centered (vertical or horizontal screen) frame of reference. We expected that the radius of action would vary with inclination of the drawn surface when objects are displayed on a vertical screen but not for a horizontal screen (seen from above). More precisely, due to implied gravity (Hubbard, 1990, 1997; Bertamini, 1993), RA would be greater for objects moving downwards (than upwards) onto an (screen-centered) inclined surface displayed on a vertical (earth-centered) screen. In contrast, when the animations are displayed on a horizontally-oriented screen, no effect of inclination is expected. Finally, we quantified the representational friction coefficient from RA responses using classical mechanics equations, in order to study the representational consistency of friction in causal displays, as a first approximation toward mental tribology (tribology is the science of interacting surfaces that are in relative motion, see Ludema, 1996). 
Consistency in the use of the coefficient of friction would follow if people acted according to physics (for the same pair of contacting materials, the coefficient of friction should remain the same across conditions). Inconsistency in its use, meaning a significant change in its estimated value across conditions, corresponds in turn to a deviation from what is entailed by physics.
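As a point of reference for the frictionless elastic collision evoked above, the following is a minimal worked sketch rather than a derivation taken from the study. It assumes that launcher A and target B have the same mass $m$ (which the description implies but does not state) and that B is initially at rest. The symbols $u$ (launcher velocity before contact) and $v_A$, $v_B$ (velocities after contact) are introduced here for illustration only.

```latex
\begin{aligned}
m u &= m v_A + m v_B && \text{(conservation of momentum)}\\
\tfrac{1}{2} m u^2 &= \tfrac{1}{2} m v_A^2 + \tfrac{1}{2} m v_B^2 && \text{(conservation of kinetic energy)}\\
(v_A + v_B)^2 &= v_A^2 + v_B^2 \;\Rightarrow\; v_A v_B = 0\\
v_B \neq 0 &\;\Rightarrow\; v_A = 0,\quad v_B = u
\end{aligned}
```

Under these assumptions the launcher stops and the target leaves at the launcher's full velocity and keeps it indefinitely. Any slowing of the target, or any limit on how far the launch is felt to extend, therefore has to come from something beyond this idealization, such as implied friction.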

While several meanings of embodiment co-exist in the psychology literature (Wilson, 2002; Shapiro, 2007; Wilson and Golonka, 2013), we take here a view of embodied cognition as the conjoining of two claims: that cognition is situated, on the one hand, and that "environmental invariants" (i.e., physical regularities in the world around us, such as gravity or friction) have been internalized in our cognitive system (Hubbard, 1995b), on the other hand. As an illustration of the first point, nobody expects equivalent collisional kinematics for a tennis ball dropped on the floor or rolling on the ground toward a wall: the ball will bounce several times on the floor, with observers having been shown capable of judging how natural the bouncing looks (Twardy and Bingham, 2002), whereas only once on the wall. Similarly, on the basis of our locomotor experience, we can expect moving furniture uphill to be more effortful than downhill. This is because we inhabit a world with gravity and friction that we sense with our body. As an illustration, now, of the second point, we do learn from experience that driving on a dry road is much less dangerous than on a wet or icy surface where friction is reduced and braking distance increases dramatically. On the basis of this knowledge, we can anticipate from weather forecast news the consequences for driving of impending poor weather conditions (e.g., need for reducing the speed and increasing the distance behind the vehicle in front by given amounts, depending on the particular route conditions and type of vehicle driven). Building on this 2-fold view, we examine here the contention that perception of collision events in a Michotte-type launching display will be driven by both environmental (i.e., orientation of the display with respect to gravity) and internal (e.g., coefficient of friction in prospective mental simulations of dynamical events) constraints resulting from embodied cognition.

## Experiment 1 (Main Experiment)

### Materials and Methods

### Participants

Fifty-two individuals took part in this experiment (mean age = 23 years, range = 19–34 years) after providing informed consent. They were divided into two groups according to the experimental conditions. Local ethical approval from the EA 4532 ethics committee of Université Paris-Sud was granted for this study.

### Stimuli and Apparatus

In the "friction" group, the launcher and target moved on a surface (thick gray line, 10 pixels wide) either horizontally, or on a diagonal trajectory downwards or upwards, at −30° and +30° in screen-centered coordinates (Motion slope condition). In the "no friction" group, the launcher and the target motions were the same but without any surface displayed. The distance traveled by the launcher before coming into contact with the target was always the same in each trial (300 pixels). However, this launcher-target system could be initially positioned at three possible screen-centered coordinates (A, B, C) differing in 160-pixel steps along the motion path. The distance from the initial position of the launcher to the screen border in the direction of motion could be either 660, 500, or 340 pixels in the 0° Motion slope condition, and either 740, 580, or 420 pixels in the +30° and −30° conditions (see sample **Videos 1**–**7** in the Supplementary Material available online). The movies comprised 700 frames and lasted 7 s in total, with each frame displayed for 10 ms. The motion could be rightward or leftward (mirror animations were used). The animations were 1024 × 768 pixel movies, and each side of the squares (launcher or target) measured 20 pixels. Because the launching effect is perceived as more natural with a velocity ratio of 3:1 for the launcher vs. the target (Michotte, 1963; Schlottmann and Anderson, 1993), we used that velocity ratio in our displays. The launcher moved at 330 pixels/s (during the first 90 frames), whereas the velocity of the target was 110 pixels/s (approximately 3.3° of visual angle per second) after contact.

The experiment ran on an HP Compaq nx9500 Laptop (17″ LCD screen) using ERTS-VIPL, a software package for programming psychology experiments (http://www.berisoft.com/). Depending on the Motion plane condition, participants faced either a vertical screen or a screen placed horizontally on a table surface, viewed from approximately 57 cm (see **Figure 1**). In contrast to Motion slope, which is defined in a screen-centered (or viewer-centered) reference frame (referring to the configuration within the display), Motion plane is defined in an earth-centered frame of reference (referring to the orientation of the screen).

FIGURE 1 | Illustration of the vertical (left) and horizontal (right) Motion plane conditions, for objects traveling on a +30° slope surface.

### Procedure

In each movie, the launcher moved toward the target at constant velocity until it "collided" with the target. When the launcher reached the target, it stopped abruptly, and the target started to move in the same direction at reduced constant velocity (1/3 launcher velocity) until leaving the screen. It has been previously shown that such displays induce strong launching impressions (Schlottmann et al., 2006) and that the motion of the launched object systematically appears passive (Parovel and Casco, 2006). While we did not ask participants to rate the strength of the perceived physical causality, we note that all participants mentioned during an exit debriefing that they had perceived the target to be launched.

After the target disappeared, a mouse cursor (in the form of a plus sign) was displayed in the center of the screen. Participants were required to place the mouse cursor over where the target square had been located when its motion was perceived to become "autonomous." Participants were instructed that "autonomous" meant that the motion of the launched object was no longer passive, and were explicitly asked to match the center of the mouse cursor with the center of the target square. Note that the launcher remained displayed on the screen until participants responded. The movie duration was the same whatever the Motion slope condition: 7 s. Therefore, although the distance from the ending screen border was 80 pixels shorter for the 0° condition as compared to the diagonal trajectories, the mouse cursor always appeared 6.1 s after the target started to move.
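The timing figures reported for the stimuli and the procedure can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and variable names are hypothetical, and it assumes that the launcher moves for exactly the first 90 frames and that the target moves for the rest of the movie.

```python
# Stimulus timeline reconstructed from the figures reported in the text.
# All names are hypothetical; the numeric values come from the article.
FRAME_MS = 10          # each frame is displayed for 10 ms
N_FRAMES = 700         # frames per movie
LAUNCHER_FRAMES = 90   # frames during which the launcher moves
LAUNCHER_SPEED = 330   # pixels per second
TARGET_SPEED = 110     # pixels per second


def phase_ms(frames):
    """Duration in milliseconds of a phase lasting `frames` frames."""
    return frames * FRAME_MS


movie_ms = phase_ms(N_FRAMES)            # whole movie
launcher_ms = phase_ms(LAUNCHER_FRAMES)  # launcher motion until contact
target_ms = movie_ms - launcher_ms       # time from contact to end of movie

# Distance covered by the launcher before contact, in pixels.
launcher_px = LAUNCHER_SPEED * launcher_ms / 1000
# Distance the target would cover in the remaining time if it stayed visible.
target_px = TARGET_SPEED * target_ms / 1000
speed_ratio = LAUNCHER_SPEED / TARGET_SPEED

print(movie_ms, launcher_ms, target_ms, launcher_px, target_px, speed_ratio)
```

Under these assumptions the movie lasts 7 s and the interval between contact and the end of the movie is 6.1 s, matching the delay before the cursor appears. The computed launcher path is 297 pixels, close to the 300 pixels stated; the article does not account for the small gap.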

### Results and Discussion

### Statistical Analyses of Behavioral Data

Repeated-measures ANOVAs on RA (in pixels) were conducted with Implied friction (friction vs. no-friction) as a between-subjects factor, and with Motion plane (earth-centered: horizontal vs. vertical) and Motion slope (screen-centered: −30° vs. 0° vs. +30°) as within-subjects factors. Motion direction (whether leftward or rightward) as well as objects' Initial position (A, B, or C) were not considered in the statistical analyses. ANOVA, $\eta_p^2$ and Scheffé post-hoc tests were computed using SPSS 16.0 and Statistica 7, and the significance threshold was set to α = 0.05 unless otherwise specified. Means are reported together with 95% Confidence Intervals (see **Table 1**).


NB. Mean RA and O-displacement as a function of each condition (±95% CI).

First, the ANOVA showed a significant Implied friction effect with shorter RA in the condition of friction (M = 181 ± 27) than in the condition without friction (M = 232 ± 34), F(1, 50) = 5.67, p = 0.021, $\eta_p^2 = 0.10$. The effect of Implied friction on RA varied with Motion slope, F(2, 100) = 3.90, p = 0.023, $\eta_p^2 = 0.07$, but not with Motion plane, F(1, 50) < 1. However, there was a Motion slope × Motion plane × Implied friction interaction, F(2, 100) = 7.40, p = 0.001, $\eta_p^2 = 0.13$. In order to examine the latter interaction, separate ANOVAs were conducted for each Motion plane or Implied friction condition.
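The effect sizes reported above can be recovered from the F ratios and their degrees of freedom. The sketch below uses the standard identity for partial eta squared, which the article does not itself spell out. The function name and the list of tuples are hypothetical; the numeric values are those reported in the paragraph above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, value reported in the text)
reported = [
    ("Implied friction", 5.67, 1, 50, 0.10),
    ("Implied friction x Motion slope", 3.90, 2, 100, 0.07),
    ("Slope x Plane x Implied friction", 7.40, 2, 100, 0.13),
]

for label, f_value, df_effect, df_error, stated in reported:
    recovered = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: recovered {recovered:.2f}, reported {stated:.2f}")
```

A check of this kind only confirms that an F ratio, its degrees of freedom and the effect size are mutually consistent. Small mismatches can arise from rounding of the reported F.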

Separate ANOVAs for each Motion plane showed that the Motion slope × Implied friction interaction was significant in the vertical Motion plane, F(2, 100) = 6.84, p = 0.002, $\eta_p^2 = 0.12$, but not for the horizontal Motion plane, F(2, 100) < 1. Moreover, post-hoc analyses confirmed that RA did not differ significantly between each cell of the Motion slope × Implied friction interaction for the horizontal Motion plane condition. In contrast, although RA did not differ significantly between the levels of Motion slope in the vertical Motion plane condition for the no-friction group, it presented differences in the friction group. More precisely, the RA for +30° was marginally (p = 0.065) smaller than for 0°, but significantly smaller than for −30° (p < 0.00001); and RA for 0° and −30° differed significantly (p < 0.000001). The patterns of means are shown in both **Figure 2** and **Table 1**.

Moreover, separate ANOVAs for each Implied friction group showed that the Motion slope × Motion plane interaction was significant for the friction group, F(2, 50) = 13.51, p < 0.0001, $\eta_p^2 = 0.35$, but not for the no-friction group, F(2, 50) < 1. Post-hoc analyses for the no-friction group confirmed that RA did not differ significantly between each cell of the Motion slope × Motion plane interaction. In contrast, for the friction group, RA varied with Motion slope in the vertical Motion plane (RA for −30° was greater than for 0°, p < 0.00001, and for +30°, p < 0.00001), but not in the horizontal Motion plane. These differences in RA depending on Implied friction are visible in **Figure 2** illustrating the 95% confidence interval ellipses (computed using the Student's t distribution) around mean response for each cell of Motion slope × Motion plane. The 0,0 coordinate in the left-hand y-axis on each panel refers to the initial location of the target (i.e., its location at the time it is contacted by the launcher).

FIGURE 2 | Illustration of the 95% confidence interval ellipses around the mean position where participants felt the target movement started to be autonomous among each Implied friction group, for each condition of the Motion slope × Motion plane interaction. The 0,0 coordinate in the left-hand y-axis on each panel reflects the initial location of the target (i.e., its location at the time it is contacted by the launcher). The dashed lines (not visible in the stimulus) illustrate the direction traveled by the target.

Finally, in order to check that RA did not vary due to spatial memory displacement away from the target trajectory (Hubbard, 1997), especially in the no-friction group where the motion surface was not displayed, we computed the orthogonal distance between participants' response and the target trajectory ("trajectory" meaning the path of motion of the target center). Orthogonal distance is a measure of deviation from the target path combining the x and y distances, √(x<sup>2</sup> + y<sup>2</sup>), between the point indicated by the participant and the point on the true path of motion, along a line passing through the center of the target and intersecting the true trajectory at an angle of 90°. A positive or negative O-displacement sign was then assigned to this distance, depending on whether the indicated point was above or below the path of motion in screen-centered coordinates, respectively (for the original distinction between M- and O-displacements, see Hubbard, 1995a). **Table 1** shows negligible mean O-displacements (within the range of the target's height), and the ANOVA showed neither main effects of the experimental factors nor interactions on O-displacement values, which suggests that variations of RA are not related to spatial memory displacement orthogonal to the target trajectory.
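The decomposition of a response into along-path (M) and orthogonal (O) components can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the function names are hypothetical, the path is assumed to be a straight line through the target's initial location (0, 0), and the y-axis is assumed to point upward so that positive O values lie above the path.

```python
import math

def o_displacement(x, y, slope_deg):
    """Signed orthogonal distance from the click (x, y) to a straight path
    through the origin inclined at slope_deg (assumed between -90 and 90).

    Assumption: y increases upward, so the unit normal (-sin, cos) has a
    positive y component and points above the path.
    """
    theta = math.radians(slope_deg)
    return -x * math.sin(theta) + y * math.cos(theta)

def m_displacement(x, y, slope_deg):
    """Component of the click (x, y) along the path direction (cos, sin)."""
    theta = math.radians(slope_deg)
    return x * math.cos(theta) + y * math.sin(theta)

# Hypothetical click, in pixels relative to the target's initial location.
click_x, click_y = 180.0, 95.0
o = o_displacement(click_x, click_y, 30.0)   # negative: slightly below the +30 degree path
m = m_displacement(click_x, click_y, 30.0)   # distance traveled along the path
```

A click lying exactly on the path yields an O-displacement of zero whatever the slope, which is the property the control analysis relies on.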

### Physical Modeling of Friction from Radius of Action Responses: Friction Coefficients Computation

The friction coefficient µ is a dimensionless scalar value which describes the ratio of the force of friction between two bodies and the force pressing them together. The coefficient of friction depends on the materials used; for example, ice on steel has a low coefficient of friction, while rubber on pavement has a high coefficient of friction. Coefficients of friction range from near zero to greater than one. A friction coefficient µ = 1 means that the force needed to move the object is equal to its weight, <1 that it is less than its weight, and >1 that it is larger than its weight. For example, under good conditions, a tire on concrete may have a coefficient of friction of 1.5 (Ludema, 1996). In order to study the representational consistency of friction in our causal displays, we computed the (representational) friction coefficient from radius of action responses using classical mechanics equations.

The subjective value of the friction coefficient µ for the target was computed with the equations below under the assumption that the target will decelerate post-collision due to friction, with the following parameters: object mass m; gravitational acceleration g, a constant (9.81 m s<sup>−2</sup>); the target's deceleration dec (<0) after collision, as inferred from the participant's response (see below); and the slope angle Θ. Note that µ does not actually depend on m. This is because m appears as a factor in both the numerator and the denominator of the fraction and therefore cancels out, so the fraction can be simplified.

$$\mu = \frac{-m.dec - m.g.\sin\Theta}{m.g.\cos\Theta} = \frac{-dec}{g.\cos\Theta} - \tan\Theta \tag{1}$$
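As a quick numerical check of Equation (1), the sketch below evaluates both the unsimplified form (with mass) and the simplified form (without mass). The function names are hypothetical and the example deceleration values are chosen for illustration only; positive slope angles are taken to mean ascending motion, in line with the sign convention of the equation.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2, as given in the text

def friction_coefficient(dec, slope_deg, g=G):
    """Simplified form of Equation (1): mu = -dec / (g cos(theta)) - tan(theta).

    dec is the (negative) acceleration of the target along the slope.
    """
    theta = math.radians(slope_deg)
    return -dec / (g * math.cos(theta)) - math.tan(theta)

def friction_coefficient_with_mass(m, dec, slope_deg, g=G):
    """Unsimplified form of Equation (1), kept to show that m cancels out."""
    theta = math.radians(slope_deg)
    return (-m * dec - m * g * math.sin(theta)) / (m * g * math.cos(theta))

# On a level surface, a deceleration of exactly -g corresponds to mu = 1.
mu_flat = friction_coefficient(-G, 0.0)

# On a +30 degree slope with mu = 1, the deceleration is -g (sin + cos);
# feeding it back recovers mu = 1 for any mass.
dec_up = -G * (math.sin(math.radians(30.0)) + math.cos(math.radians(30.0)))
mu_up_1kg = friction_coefficient_with_mass(1.0, dec_up, 30.0)
mu_up_5kg = friction_coefficient_with_mass(5.0, dec_up, 30.0)
```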

To calculate the forces on an object placed on an inclined plane (with slope angle Θ), one must consider the three forces acting on it, namely its weight, the normal force exerted by the surface, and the friction force (air resistance is neglected for the sake of simplicity), as illustrated in **Figure 3**.


We assumed that the position indicated by participants reflects the moment (time Δt) when they detected a difference ΔV between the predicted velocity of the target (on the basis of a representational friction coefficient and an analogy to the physics of friction) and its perceived velocity (taken to be approximated by the actual velocity), as indicated in the instructions: "the movement of the target seems to be autonomous." Δt was inferred from the participant's response (mouse click on XY screen coordinates: Xobs, Yobs) with Equation (2), where Vobs stands for the actual target's velocity on the screen (110 pixels per second).

$$
\Delta t = \frac{\sqrt{(Xobs^2 + Yobs^2)}}{Vobs} \tag{2}
$$
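Equation (2) amounts to dividing the straight-line distance of the click from the target's initial location by the constant on-screen velocity. A minimal sketch, assuming the click coordinates are expressed in pixels relative to that initial location (the function name is hypothetical):

```python
import math

V_OBS = 110.0  # on-screen target velocity in pixels per second, from the text

def detection_time(x_obs, y_obs, v_obs=V_OBS):
    """Equation (2): time (s) at which the target reached the clicked position.

    Assumes (x_obs, y_obs) are in pixels relative to the target's initial
    location and that the target moved at constant velocity v_obs.
    """
    return math.hypot(x_obs, y_obs) / v_obs

# A hypothetical click 220 pixels along a level path corresponds to 2 s.
dt = detection_time(220.0, 0.0)
```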

Because the literature indicates that our visual system detects a change in velocity when it is greater than 25% of the actual velocity (e.g., Calderone and Kaiser, 1989; Babler and Dannemiller, 1993), we assume that participants detected a change in target velocity ΔV when it reached a threshold value expressed as a percentage of the actual target velocity Vobs. Thus, Equation (3) expresses ΔV as a function of a coefficient k, with k = 0.25. Furthermore, in order to express ΔV in meters per second, the self-to-scene subjective distance D<sub>Scene</sub> (i.e., subjective distance to the objects, which has to be estimated from participants' responses) and the self-to-screen distance (D<sub>Screen</sub> = 0.57 m) had to be taken into account:

$$
\Delta V = k.Vobs \times \frac{D_{Scene}}{D_{Screen}} \tag{3}
$$

This stems from the fact that the moving square can be perceived either as the motion of an object on the screen plane or as the projection of a distant moving object onto that plane; in the two cases, the visual angle is the same. Since the physics model and the calculations of µ require the kinematics to be expressed in physical units (m, m/s, m/s<sup>2</sup>), this self-to-scene distance factor had to be included in the model and computations.

Equation (4) provides the predicted target's subjective deceleration as a negative variation of the predicted target velocity over time.

$$dec = -\frac{\Delta V}{\Delta t} = -\frac{k.Vobs}{\Delta t} \times \frac{D_{Scene}}{D_{Screen}} \tag{4}$$
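Equations (3) and (4) can be chained as in the sketch below. It mirrors the equations as written; the function names and the example value of the scene distance are hypothetical, since in the study that distance is estimated from the reference condition rather than fixed in advance.

```python
K = 0.25         # velocity-change detection threshold, from the text
D_SCREEN = 0.57  # self-to-screen distance in meters, from the text
V_OBS = 110.0    # on-screen target velocity in pixels per second, from the text

def delta_v(d_scene, v_obs=V_OBS, k=K, d_screen=D_SCREEN):
    """Equation (3): detected velocity difference, scaled by scene distance."""
    return k * v_obs * d_scene / d_screen

def deceleration(delta_t, d_scene, v_obs=V_OBS, k=K, d_screen=D_SCREEN):
    """Equation (4): expected deceleration (negative) given detection time delta_t."""
    return -delta_v(d_scene, v_obs, k, d_screen) / delta_t

# Hypothetical scene distance, chosen only to make the arithmetic easy to follow.
example_d_scene = 0.57
dv = delta_v(example_d_scene)             # 0.25 * 110 = 27.5
dec = deceleration(2.0, example_d_scene)  # -27.5 / 2 = -13.75
```

A later detection time (larger Δt, hence larger RA) gives a deceleration of smaller magnitude, which is why larger RA values map onto smaller friction coefficients on a given slope.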

The vertical plane (earth-centered) condition with implied friction for objects traveling on a horizontal surface (Θ = 0°) served as a theoretical reference condition. Setting µ = 1 for this condition allowed us to examine the variation of µ in the other conditions, in proportion to this 0° reference condition. In this reference condition, with µ = 1, Equation (1) yields dec = −g. In order to compute µ for the other conditions using Equations (4) and (1), we first computed the average subjective D<sub>Scene</sub> value for the reference condition, using Equation (5) derived from Equation (4).

$$D_{Scene} = g.D_{Screen} \times \frac{\Delta t}{k.Vobs} \tag{5}$$
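The whole chain from an RA response to a representational friction coefficient can be sketched by combining Equations (5), (4), and (1). This is a simplified illustration under stated assumptions: a single reference RA stands in for the average over the reference condition, RA is converted to time with the constant on-screen velocity, and all RA values used below are hypothetical rather than taken from the data.

```python
import math

G = 9.81         # m/s^2
K = 0.25         # detection threshold
D_SCREEN = 0.57  # m
V_OBS = 110.0    # pixels per second

def calibrate_d_scene(ra_ref_px, g=G, k=K, d_screen=D_SCREEN, v_obs=V_OBS):
    """Equation (5): scene distance that makes mu = 1 in the 0 degree reference."""
    delta_t_ref = ra_ref_px / v_obs
    return g * d_screen * delta_t_ref / (k * v_obs)

def mu_from_ra(ra_px, slope_deg, d_scene, g=G, k=K, d_screen=D_SCREEN, v_obs=V_OBS):
    """Equations (4) then (1): friction coefficient implied by an RA response."""
    delta_t = ra_px / v_obs
    dec = -(k * v_obs / delta_t) * (d_scene / d_screen)
    theta = math.radians(slope_deg)
    return -dec / (g * math.cos(theta)) - math.tan(theta)

# Hypothetical reference RA (pixels) for the 0 degree friction condition.
ra_ref = 200.0
d_scene = calibrate_d_scene(ra_ref)

mu_ref = mu_from_ra(ra_ref, 0.0, d_scene)            # 1 by construction
mu_down = mu_from_ra(1.36 * ra_ref, -30.0, d_scene)  # hypothetical descending-slope RA
```

With this calibration the result depends only on the ratio of the reference RA to the observed RA and on the slope, so the arbitrary reference value of 200 pixels does not affect the outcome.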

In brief, the strategy of our approach consisted in using a behavioral measure of RA, via a set of classical mechanics equations and some identified reasonable assumptions, to compute the representational equivalent of a "friction coefficient." The key assumption in the process is that, in order to comply with the instructions to locate where the target motion became autonomous, participants rely on the detection of a discrepancy (ΔV) between a predicted diminished velocity of the target, reflecting an expected deceleration due to friction, and the actual observed velocity. The deceleration expected by participants was then derived from the ratio between ΔV and the time when they detected this difference (as inferred from RA responses). From this deceleration value, a representational friction coefficient µ was computed using Equation (1). After setting the theoretical reference value to µ = 1 in the vertical Motion plane condition for objects traveling along a 0° Motion slope visible surface (friction group), we proceeded to compare the µ values of the other conditions to this reference value after Bonferroni correction (α = 0.05/11).

Results, illustrated in **Figure 4**, show similar values of the representational friction coefficients in the horizontal Motion plane for each Implied friction group. In contrast, in the vertical Motion plane, the friction coefficient varied with Motion Slope and Implied Friction conditions. The variation of the representational friction with slope for the no-friction group results from the invariance of RA across Motion slope conditions, which would be unexpected from the standpoint of physics. As a consequence, if friction was to explain this invariance, the friction coefficient would have to be smaller for ascending slopes and greater for descending slopes, as compared to 0°. Such representational inconsistency of friction seems unreasonable and suggests rather that participants of the no-friction group reasoned about object motion independently of environmental invariants, such as gravity or friction. Finally, the RA estimate of the friction group in the vertical Motion plane condition is in agreement with physics for the ascending slope, whereas it is not for the descending slope where the representational friction is greater than the reference value, t(25) = 10.93, p < 0.05/11, Cohen's d = 2.14 (see **Figure 4**).
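The reported statistics can be cross-checked with a few lines of arithmetic. The sketch below assumes a one-sample t-test against the reference value µ = 1 with n = df + 1 = 26 participants, in which case Cohen's d equals t divided by the square root of n. The helper names are hypothetical, and the sample passed to `one_sample_t` is made up purely to show the call.

```python
import math
import statistics

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison Type I error threshold after Bonferroni correction."""
    return alpha / n_comparisons

def cohens_d_from_t(t, n):
    """Cohen's d for a one-sample t-test: d = t / sqrt(n)."""
    return t / math.sqrt(n)

def one_sample_t(values, mu0):
    """t statistic and Cohen's d for a sample tested against the constant mu0."""
    n = len(values)
    d = (statistics.mean(values) - mu0) / statistics.stdev(values)
    return d * math.sqrt(n), d

alpha_corrected = bonferroni_alpha(0.05, 11)   # about 0.0045
d_reported = cohens_d_from_t(10.93, 26)        # about 2.14, matching the text

# Made-up friction coefficients, for illustration only.
t_demo, d_demo = one_sample_t([1.2, 1.5, 1.4, 1.6, 1.3], 1.0)
```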

Friction from the ground-object contact acts as a brake that dissipates kinetic energy from the moving objects into thermal energy. Interactive Physics 2000© allowed us to simulate this dissipation over time for the launcher and the target, for equivalent launcher velocity just before contact, 1 kg objects and µ = 1 (see **Videos 8**–**10** in the Supplementary Material available online). Remember that, as mentioned above, µ does not depend on m. However, the mass of each object had to be specified in order for the software to perform the simulations. Contrary to Michotte's displays, the simulated motion is that of an inelastic collision (some of the initial kinetic energy is lost as it is converted into internal excitation, heat, or eventually deformation) with decreasing object velocity due to friction. Both the launcher and the target continue to move after collision to a different extent depending on motion slope. Indeed, the distance traveled after collision varies as a function of Motion slope with a pattern resembling that of the RA estimates of the friction group in the vertical Motion plane condition. However, if RA reflects the expected stopping point, the estimate for the descending slope is smaller than expected according to physics (mean RA for −30° is about 1.36 times greater than for 0°, whereas Interactive Physics 2000© simulations indicate a target traveled distance about 2.82 times greater for −30° as compared to 0°), suggesting an increase in the representational friction as compared to the 0° reference condition. In other words, the observed RA in the −30° condition would have to be larger to agree with the physical model (with an invariant friction coefficient µ = 1). To explain through friction the smaller ratio of the observed RA in the −30° condition with respect to the 0° condition, the physical model would imply an increase in the "friction coefficient."
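For orientation, the expected effect of slope on stopping distance can be approximated with a point-mass model. This sketch is not the Interactive Physics 2000 simulation: it assumes the target leaves the collision with the same speed on every slope, slides with constant µ, and is subject only to gravity, the normal force, and friction. Function names and the initial speed are hypothetical.

```python
import math

G = 9.81  # m/s^2

def sliding_deceleration(mu, slope_deg, g=G):
    """Magnitude of deceleration along the slope: g (mu cos(theta) + sin(theta)).

    Positive slope angles denote ascending motion.
    """
    theta = math.radians(slope_deg)
    return g * (mu * math.cos(theta) + math.sin(theta))

def stopping_distance(v0, mu, slope_deg, g=G):
    """Distance traveled before stopping, v0^2 / (2a); infinite if never stopping."""
    a = sliding_deceleration(mu, slope_deg, g)
    if a <= 0:
        return math.inf
    return v0 ** 2 / (2.0 * a)

v0 = 1.0  # hypothetical post-collision speed in m/s; it cancels out in the ratios
ratio_down = stopping_distance(v0, 1.0, -30.0) / stopping_distance(v0, 1.0, 0.0)
ratio_up = stopping_distance(v0, 1.0, 30.0) / stopping_distance(v0, 1.0, 0.0)
```

Under these assumptions the −30° to 0° ratio comes out at about 2.73, in the same range as the value of about 2.82 reported from the simulations. The sketch does not model the collision itself, so the two figures need not match.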

In order to test whether the increased representational friction for −30° might result from a low-level effect due to the proximity of the screen border, which would impose a limit on RA responses, we performed an additional ANOVA on RA while introducing Initial position (A, B, or C) as a supplementary within-subject factor. Results indicated that although there was a main effect of Initial position on RA (M<sub>A</sub> = 219 ± 26; M<sub>B</sub> = 206 ± 23; M<sub>C</sub> = 195 ± 20), F(2, 100) = 19.05, p < 0.000001, η<sub>p</sub><sup>2</sup> = 0.28, this factor did not interact with any of the other three factors. Therefore, the initial target distance to the screen border had an overall effect not specific to descending slopes. The observed decrease of RA with initial distance of the target from the screen border was negligible (range = 13–19 pixels, i.e., smaller than the target size) when initial distance to the screen border decreased by 160-pixel steps between Initial position A and B or B and C. Finally, although we did not formulate any hypothesis about an effect of Motion direction (whether leftward or rightward), we ran an ANOVA including both Initial position and Motion direction in addition to the previous factors, as a further control to make sure that counterbalancing factors did not matter. As it turned out, Motion direction was not significant, F(1, 50) = 2.99, n.s., η<sub>p</sub><sup>2</sup> = 0.06, nor did Motion direction interact with Initial position, F(2, 100) = 1.83, n.s., η<sub>p</sub><sup>2</sup> = 0.04. Moreover, regarding the Motion slope × Motion plane × Implied friction interaction of interest, neither Motion direction, Initial position, nor both taken together interacted with it, Fs(2, 100) < 1.

### Experiment 2 (Control Experiment)

The validity of our friction coefficient computations rests on the assumption that RA provides a measure of the time when observers detected a difference between the predicted vs. actual behavior of the target. In order to test this assumption, we performed a control experiment where participants indicated this time point on-line vs. a posteriori.

## Materials and Methods

### Participants

Twenty individuals took part in this experiment (mean age = 20 years, range = 18–27 years) after providing informed consent. None of them had participated in Experiment 1. Local ethical approval from EA 4532 ethics committee of Université Paris-Sud was granted for this study.

### Stimuli and Apparatus

Stimuli and apparatus were similar to those used for the friction group in Experiment 1.

#### Procedure

Participants indicated the position/moment from which they felt that "the target movement started to be autonomous" either a posteriori with the mouse cursor (as in the Main Experiment) or on-line by pressing the space bar. These Response type conditions were performed in two separate counterbalanced blocks. Moreover, participants were tested only in the vertical Motion plane with friction.

### Results and Discussion

Time data (in ms) of the on-line response condition were converted into RA values (in pixels). A simple regression was performed in order to test if spatial RA of a posteriori responses predicted temporal RA of on-line responses (see **Figure 5**). The linear regression model (Y = 12.54 + 0.81X) significantly predicted the data, with a significant slope [β = 0.84, t(58) = 12.09, p < 0.000001] and an intercept not different from zero [t(58) = 0.88, n.s.]. These results are consistent with our hypothesis that spatial RA provides a good approximation for the time point when observers detected a difference between the predicted vs. actual behavior of the target.
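The conversion and regression described above can be sketched as follows. The conversion assumes the constant on-screen velocity of 110 pixels per second used in Experiment 1, the least-squares fit is a plain textbook implementation rather than the authors' statistical software, and the paired values are invented for illustration.

```python
V_OBS = 110.0  # on-screen target velocity in pixels per second

def time_ms_to_ra_px(t_ms, v_obs=V_OBS):
    """Convert an on-line response time (ms) into a distance traveled (pixels)."""
    return v_obs * t_ms / 1000.0

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data: a posteriori RA (pixels) and on-line response times (ms).
spatial_ra = [150.0, 180.0, 210.0, 240.0]
online_ms = [1400.0, 1650.0, 1900.0, 2200.0]
temporal_ra = [time_ms_to_ra_px(t) for t in online_ms]

intercept, slope = fit_line(spatial_ra, temporal_ra)
```

A slope close to 1 together with an intercept close to 0 would indicate that the two response types locate the same point along the target path.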

Furthermore, in order to see whether the results of Experiment 1 replicate in the vertical plane, friction condition, we computed representational friction coefficients after setting the theoretical reference value to µ = 1 in the vertical Motion plane condition for objects traveling along a 0° Motion slope visible surface. For each Response type condition, the resulting mean µ-values for the +30° and −30° Motion slope conditions were compared to this reference value after Bonferroni correction. In addition to these four comparisons we also tested if the µ-values for +30° and −30° differed between Response type conditions. Therefore, the Type I error threshold was set to α = 0.05/6. Results, illustrated in **Figure 6**, indicated that representational friction coefficients differed from the reference only for −30° for both the on-line [µ = 1.35, t(19) = 6.53, p < 0.05/6, Cohen's d = 1.46] and a posteriori responses [µ = 1.32, t(19) = 6.34, p < 0.05/6, d = 1.42], thus replicating the findings of Experiment 1. In addition, mean µ-values did not differ between Response type conditions either for the +30° Motion slope [t(19) = 0.88, n.s., d = 0.20] or for the −30° Motion slope [t(19) = 0.91, n.s., d = 0.20].

Taken altogether, the empirical elements available (regression model and friction coefficients) speak in favor of the soundness of using the spatial measure of RA as a reasonable approximation to the moment when the participants detect a difference between predicted and observed velocity, and thus as a means for the estimation of "friction coefficients."

### General Discussion

In spite of variations in surface layout or in friction in the world we inhabit, we seldom fall or slip in our adult lifetime, especially when standing or walking on (earth-centered) horizontal surfaces. However, when considering potentially slippery slopes for unfamiliar surfaces, we tend to rely more on visual than tactile information regarding friction (Joh et al., 2007). As a consequence, we explicitly overestimate our locomotor or standing ability on low-friction surfaces (e.g., vinyl), and underestimate our ability on high-friction surfaces (e.g., rubber). In contrast, since the brain can predict the consequences of the forces we generate (e.g., to adjust grip force when striking a ketchup bottle to prevent it from slipping; Wolpert and Flanagan, 2001) and experience (e.g., when returning a tennis ball), relations among forces appear to be internalized to some extent. Here, we used computer-generated displays based on Michotte's causal displays to investigate to what extent mental representations of collision events preserve similarity relations with properties of physical invariants (viz., second-order isomorphism; Shepard and Chipman, 1970). For this purpose, we quantified the representational friction coefficient from radius of action (RA) responses to Michotte's causal displays using classical mechanics equations, in order to identify how representational consistency of frictional force in collision events is modulated by contextual information.

We reasoned that if physical invariants are embodied in the cognitive architecture, implied friction will affect RA with a pattern depending on the orientation of the display with respect to environmental constraints such as gravity direction. Because friction is proportional to the magnitude of the normal force acting on the object, which depends on the orientation of the support surface with respect to gravity, if cognition is situated then implied friction should be instantiated differently as a function of (viewer-centered) motion slope and (earth-centered) motion plane. The literature provides several examples where embodiment of gravity via proprioception influences visual perception and our prediction of motion: whatever the observer's orientation with respect to gravity, tunnel turns are perceived as more bent when the end is pointing in the direction of gravity (Vidal et al., 2006); virtual objects moving at constant velocity in the direction of gravity induce earlier interceptive responses (Senot et al., 2005) and greater spatial memory displacement (Nagai et al., 2002) than when moving in the direction opposite to gravity.

First, we found that when the support surface on which both launcher and target are traveling is not explicitly displayed, RA did not vary as a function of motion slope or motion plane, as if the representational friction (resistive force) was constant across orientations. Moreover, participants were very consistent in their response, as indicated by the negligible orthogonal deviations from the target path. At first glance, this might suggest that when the support surface is not part of contextual information, participants would reason about object motion independently of environmental invariants, just as young infants believe that self-propelled objects require no external support to move in midair (Luo et al., 2009). However, on the one hand, invariance of RA across motion slope in the horizontal motion plane might be compatible with implied physics, since the gravity direction is constant across motion slope, which thus suggests some internal consistency with respect to the earth-centered reference frame. On the other hand, the similar invariance found for the vertical motion plane appears more consistent with an impetus heuristic (Kozhevnikov and Hegarty, 2001). In the more influential versions of the impetus theory, the imparted impetus that sets a body in motion in a given direction has to fully dissipate or to be strongly diminished for gravity to exert an effect and eventually prevail. This framework would thus predict invariance of the RA across motion slope as long as dissipation of the imparted impetus did not reach the required point for gravity to be factored in. In the vertical motion plane, such invariance is definitely inconsistent with motion along an invisible surface with a constant friction coefficient across motion slope.

When both launcher and target are traveling in sequence along a path displayed in the visual scene, the effect of this implied surface varies as a function of motion plane. RA is invariant across motion slope when motion plane is horizontal; however, mean RA is smaller by a constant amount (about 2 times an object size) than when no implied surface is displayed. This suggests that the object's motion was interpreted as an instance of wall hugging, resulting in an expected overall reduction of the target's velocity by virtue of friction, while the computed friction coefficient still did not differ from the reference condition (objects moving along a horizontal slope surface in the vertical motion plane). As for objects' motion in the vertical plane, RA varied as a function of motion slope in a way roughly consistent with the effect of representational friction being modulated by gravity, i.e., smaller RA for ascending slopes and greater RA for descending slopes (as compared to 0°). However, although there was internal consistency between the representational friction coefficients for ascending and horizontal slopes, the computed friction coefficient for descending motion was 1.5 times greater than for the two other conditions. One possible conjecture consistent with this result is that the increased friction coefficient for descending targets would reflect an embodied braking of the target. Similar to experimental participants who project their intention to throw objects and mistakenly believe that objects would initially continue to accelerate shortly after leaving their hand (Hecht and Bertamini, 2000), our participants might have embodied the target to simulate braking in order to rapidly reach the deceleration required to stop the target safely (Fajen, 2005).
This interpretation is consistent with authors arguing that our actions in the world are the origin of our visual impressions of force between interacting objects (White, 2009), and that environmental constraints are perceptually scaled to the economy of action (Proffitt, 2006). For example, when we descend hills it requires earlier/greater braking to counteract the gravity pull. Similarly, walking on a slippery floor is more likely to lead to falls and consequently needs to be carefully handled.

Recognition of surface slipperiness has been embodied in our perceptual system through experience. Although visual and auditory cues to slipperiness are less informative than sliding resistance from tactile cues (Cohen and Cohen, 1994), subjective slipperiness ratings made from available visual cues (such as reflectiveness and texture) are consistent with actual coefficients of friction of surfaces (Lesch et al., 2008). This remains true even among vision-impaired elderly persons, though to a lesser extent (Hsu, 2011). Our brain has built-in bisensory (haptics and vision) texture-selective regions located in the medial occipital (Stilla and Sathian, 2008) and medial occipitotemporal (Podrebarac et al., 2014) cortices. This suggests that we may, in a quite literal sense, "feel" (in a haptic-like manner) sliding resistance from the textures we see (White, 2012b). Visual features of a surface, such as the thick gray line on/aside which the launcher and target moved in our experiment (see **Figure 1**), may be used to elaborate a mental image of its roughness (Newman et al., 2005) and in turn predict the forces encountered by the launched object in a Michotte display. Such a visuo-haptic mental simulation process would modulate the radius of action of a launcher over a target.

Our data suggest that representational friction affects the radius of action of a launcher over a target in Michotte-type displays, with a pattern reflecting the embodiment of physical invariants in the cognitive architecture. These findings run counter to Michotte's statement, directed against the projective interpretation of visual events, that only the visual structure of the event can cause a fusion between kinaesthetic and visual impressions (Michotte, 1941, p. 123–124). Here, we show that both physical embodiment of the event (looking at a vertical vs. horizontal plane, cf. **Figure 1**) and contextual information (objects moving along a surface) affect interpretation of the event and modulate the radius of action of the launcher. The literature also provides similar evidence of embodied friction. Objects sliding along a surface exhibit less forward displacement in memory (representational momentum) than when moving in isolation (Hubbard, 1998), due to representational friction (Hubbard, 1995a); representational momentum is even more reduced when the object moves between two surfaces. In addition to replicating the general finding of greater representational friction for objects sliding along a surface than when moving in isolation, we provide quantitative estimates of representational friction coefficients, in order to test the consistency of their embodiment (internalization through experience) in our cognitive architecture. Moreover, we provide evidence that mental simulation of resistive force is situated, as implied friction varied with orientation of the display with respect to gravity in an earth-centered rather than eye-centered reference frame.
Our results are in line with a series of experiments by Senot and colleagues (Senot et al., 2005, 2012; Le Seac'h et al., 2010) showing earlier interceptive responses for catching a virtual ball falling from "above" (as compared to approaching from "below") in alignment with the gravity pull, but not when looking at the same scene (with visual "up" and "down") orthogonally to gravity (Senot et al., 2005). In our study, body posture varied while looking down at a horizontal display versus straight ahead at a vertical display. However, with experiments run on Earth or in weightlessness (parabolic flights), Senot et al. showed that the only important parameter is the orientation of the moving object with respect to the gravity pull (sensed by our otolith receptors), and not the body posture (Le Seac'h et al., 2010; Senot et al., 2012). Similarly, the up/down asymmetry in estimating the pitch angle of a virtual corridor traveled on Earth reduces dramatically while free-floating in weightlessness (International Space Station). This asymmetry may be restored by attaching astronauts with belts and foot straps providing haptic inputs that help the brain reconstruct the missing gravitational cues (De Saedeleer et al., 2013). In summary, sensing gravity's pull (by the otoliths or contact forces) appears to be the main parameter for cognition to be situated and, in our case, for simulating the forces exerted on a launched target moving along a surface.

The embodiment of both friction and gravity forces in our cognitive architecture may well be more pervasive than we may at first think. Embodied theories of conceptual representation propose that the human sensorimotor system may serve to embody abstract ideas and metaphors (Lakoff and Johnson, 1999; Gallese and Lakoff, 2005). The gravity pull is "tightly linked" to our conception of emotion and morality. Positive words are associated with UP whereas negative words are associated with DOWN (Lakoff and Johnson, 1999; Lakens, 2012), and both metaphorical associations automatically reactivate corresponding directional body movements (Koch et al., 2011) and postures (Dudschig et al., 2015). Multimodal simulation from language may also sustain The Mind is A Body and Thinking Is Physical Functioning metaphors (Lakoff and Johnson, 1999, p. 237) and explain why one might feel slipping in and out of consciousness just as on a slippery slope (see introductory excerpt), or why it is difficult "to resist the force of an argument" or "the overwhelming weight of evidence." Similarly, while analyzing the mechanisms of slippery slope arguments in the legislative-judicial domain, Volokh (2003) emphasized that "People resist attempts to take rights away outright, but not if the rights are eroded slowly." In the latter metaphor, erosion supposedly alters the coarseness of the surface, which in turn diminishes the resistive force of friction. In the present study, although roughness of surfaces was unspecified in the displays, the coefficient of friction emerged as a parameter in the mental simulations of participants, seemingly driven by their embodied knowledge of dynamics.

### Conclusion

In Michotte's launching displays, if the target keeps moving beyond the radius of action (RA) of the launcher (over the target), the target loses its passivity and seems to move autonomously. Our findings show that when both launcher and target are traveling along a thick gray line displayed in the visual scene, the effect of this implied surface on the moment when the target movement is perceived as autonomous reveals how friction is embodied in our cognitive system. For objects' motion in the vertical plane, RA varied as a function of motion slope in a way roughly consistent with the effect of representational friction being modulated by gravity. In contrast, when the surface on which both launcher and target are traveling was not explicitly displayed, RA did not vary as a function of motion slope or motion plane, as if the representational friction from the milieu was constant across orientations, which is in turn more consistent with an impetus heuristic interpretation. Therefore, the possibility remains that embodied knowledge of forces (whether exerted or resistive) gives rise to scattered concepts or micro-theories which mediate judgments about force (Hecht, 2001) and make up distinct branches of expression of a bodily-based dynamics (White, 2012b,c). When studying perceptual causality, not keeping in mind the role of embodied dynamics and the potential heterogeneous coexistence of resulting micro-theories in our cognitive system is what might well turn out to be a slippery slope in the end.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpsyg.2015.00483/abstract

Video 1 | This movie illustrates a rightward 0° Motion slope with implied friction from Initial position A.

Video 2 | Idem from Initial position B.

Video 3 | Ibidem from Initial position C.

Video 4 | Rightward +30° Motion slope with implied friction from Initial position B.

Video 5 | Idem but for a −30° Motion slope.

Video 6 | Leftward +30° Motion slope without implied friction from Initial position B.

Video 7 | Idem but for a rightward motion.

Video 8 | This movie shows a friction simulation from the Interactive Physics 2000© software for the vertical Motion plane condition and +30° Motion slope.

Video 9 | Idem for a 0° Motion slope.

Video 10 | Ibidem for a −30° Motion slope.

### References


Psychology, Artificial Intelligence, and Cognitive Neuroscience, eds D. Meyer and S. Kornblum (Cambridge, MA: MIT Press), 99–119.


Michotte, A. (1963). The Perception of Causality. Oxford: Basic Books.

Nagai, M., Kazai, K., and Yagi, A. (2002). Larger forward memory displacement in the direction of gravity. Vis. Cogn. 9, 28–40. doi: 10.1080/13506280143000304


Podrebarac, S. K., Goodale, M. A., and Snow, J. C. (2014). Are visual texture-selective areas recruited during haptic texture discrimination? Neuroimage 94, 129–137. doi: 10.1016/j.neuroimage.2014.03.013


White, P. A. (2012c). The impetus theory in judgments about object motion: a new perspective. Psychon. Bull. Rev. 19, 1007–1028. doi: 10.3758/s13423-012-0302-2


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Amorim, Siegler, Baurès and Oliveira. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Visiting Richard Serra's "Promenade" sculpture improves postural control and judgment of subjective visual vertical

### **Zoï Kapoula\*, Alexandre Lang, Thanh-Thuan Lê, Marie-Sarah Adenis, Qing Yang, Gabi Lipede and Marine Vernet**

IRIS Team, Physiopathologie de la Vision et Motricité Binoculaire, CNRS FR3636, UFR Biomédicale, Université Paris Descartes, Paris, France

#### **Edited by:**

Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada

#### **Reviewed by:**

Katinka Dijkstra, Erasmus University Rotterdam, Netherlands; Xavier Corveleyn, Centre National de la Recherche Scientifique, France

#### **\*Correspondence:**

Zoï Kapoula, IRIS Team, Physiopathologie de la Vision et Motricité Binoculaire, CNRS FR3636, UFR Biomédicale, Université Paris Descartes, 45 rue des Saints-Pères, 75006 Paris, France e-mail: zoi.kapoula@parisdescartes.fr; zoi.kapoula@gmail.com

Body sway while maintaining an upright quiet stance reflects an active process of balance based on the integration of visual, vestibular, somatosensory, and proprioceptive inputs. Richard Serra's Promenade sculpture, featured in the 2008 Monumenta exhibition at the Grand Palais in Paris, France, is herein hypothesized to have stimulated the body's vertical and longitudinal axes as it showcased five monumental rectangular solids pitched at a 1.69° angle. Using computerized dynamic posturography we measured the body sway of 23 visitors when fixating a cross, or when observing the artwork (fixating it or actively exploring it with eye movements) before and after walking around and alongside the sculpture (i.e., before and after a promenade). A first fixation on the sculpture increased medio-lateral stability (in terms of spectral power of body sway). Eye movement exploration in the depth of the sculpture increased antero-posterior stability [in terms of spectral power and canceling time (CT) of body sway] at the expense of medio-lateral stability (in terms of CT). Moreover, a medio-lateral instability associated with eye movement exploration before the promenade (in terms of body sway sensu stricto) was canceled after the promenade. Finally, the overall medio-lateral stability (in terms of spectral power) increased after the promenade. Fourteen additional visitors were asked to stand in a dark room and adjust a luminous line to what they considered to be the earth-vertical axis. The promenade executed within the sculpted environment afforded by Serra's monumental statuary works resulted in significantly improved performances on the subjective visual vertical test. We attribute these effects to the sculpted environment provided by the exhibition, which may have acted as a kind of physiologic "training ground," thereby improving the visitors' overall sense of visual perspective, equilibrium, and gravity.

**Keywords: art, posture, sculpture, subjective vertical, eye movements**

#### **INTRODUCTION**

Neuroaesthetics is an innovative and rapidly expanding research field, which currently champions non-invasive neuroimaging techniques as the research methodology of choice in the investigation of the neural basis of the cognitive and affective processes triggered during an aesthetic episode. These neuroimaging studies are principally directed toward the cortical and sub-cortical activations associated with producing or viewing pictorial art, listening to music, etc. Such studies revealed that various brain areas, such as areas belonging to the reward system (e.g., orbitofrontal cortex) or to emotional processing (e.g., amygdala, insula), and areas related to high-level cognitive processes (e.g., prefrontal cortex), participate in aesthetic experience (e.g., Di Dio et al., 2007; Di Dio and Gallese, 2009; Brown et al., 2011; Ishizu and Zeki, 2011). Interestingly, in line with theories of embodied cognition, several studies revealed that artworks also impact the human sensorimotor system. For instance, parts of occipito-temporal and parietal areas important for observing actions as well as sensorimotor cortex are particularly activated when observing aesthetical dance movements (Calvo-Merino et al., 2008; Cross et al., 2011). Beyond the observation of real aesthetical movements, artworks conveying a sense of motion also impact the sensorimotor system. For example, the motion-sensitive brain area MT<sup>+</sup> is more activated when observing abstract paintings with implied motion than abstract paintings with little motion impression, but only in observers with prior experience viewing those kinds of paintings (Kim and Blake, 2007). Finally, an EEG study revealed that observation of artworks by Lucio Fontana made of cuts on canvas evoked mu rhythm suppression, which is a typical neurophysiologic marker of movement initiation (Umilta et al., 2012).

Beyond these brain responses indicating that observers of artworks can simulate creative processes, artworks can also impact the full body's physiology. To evaluate such impact, posturography emerges as a field capable of providing novel insights into creative ideation, artistic creation, and aesthetic experience. By providing empirical data for theories of embodied cognition, posturography can potentially span multiple interrelated disciplines and bridge the gap between seemingly distinctive, physiology-related disciplines with their specialized and mutually exclusive focus on brain and body. Posturography quantifies aspects of postural control peculiar to upright quiet stance and it does so in a non-invasive manner across relatively short time intervals. It provides valuable information concerning the central nervous system's ability to integrate multiple inputs (visual, vestibular, cutaneous, and muscle proprioceptive) and to generate muscular responses adapted as corrective torque by way of a feedback control system. The body is never perfectly still but is rather constantly in motion and physiologic body sway reflects these active processes. Posturography is mainly used in neurophysiology as a tool in the diagnosis and follow-up of patients suffering from balance and equilibrium disorders. Recently, two teams inaugurated this new field in aesthetics by questioning the effects of depicted body movements (Nather et al., 2010) and of pictorial depth (Kapoula et al., 2011) on body sway. Nather et al. (2010) showed that participants exhibited significantly greater body sway when observing a picture of a Degas sculpture of a dancing ballerina than when observing a picture of a Degas sculpture of a static ballerina, demonstrating that images of body movement internally generate unconscious body oscillations. Kapoula et al. (2011) showed that a pictorial representation of depth in the visual field increased body sway in much the same way that the perception of real physical depth does. This effect was, however, a function of the painting used: two abstract paintings by Maria Helena Vieira Da Silva (*Egypt*, 1948; *O quarto cinzento*, 1950) were tested in this study. These two paintings exhibited various formal and stylistic similarities and produced an equally vivid sense of depth in the observing subjects; however, only the second painting was able to modulate body sway. It was suggested that highly salient perspective constructions and other ingenious visual features of paintings can literally move the body, impacting body sway in a unique manner.

The present study pursues this innovative posturographic approach further by analyzing Richard Serra's *Promenade* sculpture, which was featured in the context of the 2008 *Monumenta* exhibition held at the Grand Palais in Paris, France. This study was conducted *in situ* and is truly unique in terms of recent experimental investigations into art and aesthetics in so far as it deals with sculpture, a fine art which has been the object of comparatively few empirical investigations.

The first hypothesis we made was that, as with other aesthetic stimuli displaying a strong sense of depth and movement, the mere observation of the sculpture would impact postural control. Such impact could be observed in either medio-lateral or antero-posterior direction as the sculpture involved strong depth (alignment of monumental solids in depth) and lateral (tilted monumental solids) components. Moreover, the effects were expected to be subtle, as the postural parameters would remain within the normal range for healthy young individuals.

In addition, artworks are never passively observed; rather, eye movement exploration is expected to participate in aesthetic experience. For instance, illusory movements while observing artwork from the Op Art movement could emerge from an unstable eye movement behavior made of numerous small saccades (Zanker et al., 2003; Zanker and Walker, 2004). Such small, or, to a greater extent, larger saccades, probably involving vergence components as well, can in turn impact postural control (Kapoula et al., 2014). Indeed, laboratory studies revealed that vergence state modulates postural parameters (Kapoula and Le, 2006; Le and Kapoula, 2008; Matheron et al., 2008) and that vergence movements have a positive impact on posture (Kapoula et al., 2013). Thus, the second hypothesis we made in this study was that actively exploring the sculpture with eye movements would modulate postural control.

Finally, we took advantage of our *in situ* experimental configuration allowing us to test our third hypothesis, according to which a real visit of the exhibition, including walking around and alongside the sculpture while listening to information about the exhibition, stimulating physical, cognitive, and emotional processes, would have a long-term impact on the postural control.

In addition to posturography (Experiment 1), this study also introduces the use of the subjective visual vertical (SVV) test (Experiment 2), widely used since 1970 (Friedmann, 1970) in the diagnosis of disequilibrium and vestibular disorders. Here, this diagnostic tool was used to determine the impact of Serra's sculpture on the accuracy with which visitors were able to make apparent verticality judgments. Our fourth hypothesis was that changes in postural control after visiting the exhibition could be paralleled by changes in SVV evaluation.

This experimental approach applies a novel methodology. The artwork being a unique stimulus within a unique museum environment, the results can only be compared to a basic measure (fixation of a cross, run before the visit of the exhibition). This study focuses on demonstrating the effect of this unique stimulus. This, however, does not imply that ordinary objects with similar physical properties would not lead to similar physiologic impact.

### **EXPERIMENT 1: POSTUROGRAPHY**

### **MATERIALS AND METHODS**

#### **Ethics statement**

The investigation adhered to the tenets of the Declaration of Helsinki and was approved by the local ethics committee for human experimentation, CPP Ile de France II (No: 07035, Hôpital Necker in Paris). Written consent approved by the committee was obtained from all subjects after the nature of the experiment had been explained.

### **The artwork**

The primary stimulus employed in this study consisted of Richard Serra's *Promenade* sculpture featured in the 2008 *Monumenta* exhibition held at the Grand Palais in Paris, France (Monumenta: A New Look at the Grand Palais, n.d.). *Monumenta* is an annual exhibition which calls upon contemporary artists to showcase artworks specially designed to accommodate the Grand Palais' 13,500 m<sup>2</sup> nave of filigreed iron and glass. In 2008, the leading American artist Richard Serra accepted the *Monumenta* challenge (tribeca75tv, 2008). Serra is highly adept at working on a large scale, whether within the confines of extensive interior spaces or in outdoor exhibition venues. The artist created five massive, sculpted rectangular solids, which constitute his *Promenade* sculpture, and adapted their construction to the Grand Palais' unusually spacious exhibition space. Serra's proposed *Promenade* sculpture dealt principally with formal themes of verticality and the rectangular solids' transverse axis constituted the works' rhythmic counterpart. The sculptures can be minimally reduced to six determining physical qualities (height = 1700 cm, width = 400 cm, thickness = 13 cm, weight = 75 metric tons, tilt = 1.69°, distance between any two consecutive steel beams = 28 m). Their height was determined in relation to the Grand Palais' arching, cruciform glass nave (rising to a height of 45 m) and they were spread across the full length of the nave. The installation of Serra's gigantic steel slabs was determined by calculating their respective alignment according to a rhythmic sequence of asymmetrical inclines at a 1.69° angle (see **Figure 1**).

During the exhibition, observers typically walked around and alongside the steel beams, taking in multiple viewpoints. Such behavior constitutes a unique aesthetic experience whose underlying physiologic correlates form the object of this study. Indeed, one's sense of gravity as well as both the visual and vestibular systems are challenged by Serra's *Promenade* sculpture. We therefore designed a set of physiologic measures consisting of posturography (Experiment 1) and the SVV test (Experiment 2) in order to assess the impact of this artwork on the physiology of the observer.

**FIGURE 1 | Illustration of the artwork stimuli. (A)** Photograph by Didier Plowy of Monumenta 2008, Richard Serra's A stroll in the Nave, © Didier Plowy/RMNGP. **(B)** Dimensions of the sculptures. **(C)** Illustration of the experimental configuration including the sculptures and location of the posturography platform.

### **Participants**

Twenty-three young adults (26.1 ± 6.1 years) volunteered to take part in this experiment. Some of the participants came to the exhibition in response to an ad posted on a student mailing list and approached the experimenters at the entrance of the museum, while other participants were recruited directly at the entrance of the museum. None of the participants knew the hypotheses of the study.

### **Posturography apparatus**

Postural stability was measured using a posturography apparatus which consists of two dynamometric soles (TechnoConcept, Céreste, France), one for each foot. The excursions of the center of pressure (CoP) were measured during 51.2 s in each condition; the equipment contained an analog-to-digital converter of 16 bits. The sampling frequency of the CoP was 40 Hz. The duration of posturography recording is not standardized. While short durations (typically 30 s) are commonly used in clinics, longer durations might allow more discriminative power on non-linear parameters (Schubert et al., 2012a,b). Our 51.2-s recording is thus a compromise between short recordings avoiding the occurrence of transient and particular events due to postural changes and consequently deteriorating measures related to sway area, and long recordings allowing a better accuracy in time-frequency analysis.
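The recording parameters quoted above fix the size and frequency limits of each CoP trace. The short Python sketch below only restates that arithmetic; the variable names are illustrative, and the observation that the sample count is a power of two is ours, not a reason given by the authors.

```python
# Recording parameters quoted in the text.
DURATION_S = 51.2     # recording length per condition, in seconds
SAMPLING_HZ = 40.0    # CoP sampling frequency, in Hz

# Number of CoP samples per recording: 51.2 s x 40 Hz = 2048 = 2**11.
n_samples = round(DURATION_S * SAMPLING_HZ)

# Highest frequency that can be represented at this sampling rate.
nyquist_hz = SAMPLING_HZ / 2.0

# Spacing between Fourier frequencies for a recording of this length.
resolution_hz = 1.0 / DURATION_S

print(n_samples, nyquist_hz, resolution_hz)
```

A 20 Hz upper limit and a resolution of about 0.02 Hz comfortably cover the three frequency bands used in the data analysis (from 0.05 Hz to above 1.5 Hz).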

### **Procedure**

The participants were placed on the posturography platform along the central axis of the Grand Palais at 15 m from the first plate (see **Figure 1C**). Throughout all of the posturography tests, participants stood on the posturography platform while assuming an upright position with the feet placed side-by-side forming a 30° angle and the heels positioned 4 cm apart. The participants were required to maintain a quiet stance by holding their arms at their sides while breathing normally and refraining from both speaking and clenching their teeth. The following five conditions were run.

– Condition 1 (basic): fixating a cross

When the participant entered the nave, before she/he had the opportunity to see the artwork, a 51.2-s posturography recording session was performed with the participant fixating the center of a cross (30 cm × 30 cm) located 30 m in front of her/him while turning her/his back away from the artwork.

– Condition 2: fixating the sculpture

Immediately after the basic condition, the participant performed a 180° turn in order to face the sculpture directly. She/he was then instructed to fixate the upper right edge of the last sculpture as precisely as possible. Again, the posturography was recorded for 51.2 s.

– Condition 3: eye navigation in depth along the sculptures' transverse plane

In the next condition, participants were asked to fixate each sculpture in succession, going back and forth between the farthest and the nearest sculpture, repeatedly during the allotted 51.2 s recording period. Oculomotor behaviors of this kind necessitate saccade–vergence eye movements, which are a commonly observed class of eye movements executed during visual explorations of the environment (Enright, 1984; Collewijn et al., 1995; Yang et al., 2002).

The participants were then asked to walk freely both alongside and around the sculptures, thereby assuring that the subjects took in multiple points of view as would likely be the case had they freely visited the exhibition under normal circumstances. During this stage, they were listening to the exhibition audioguide (tracks 102, 103, 104). The total duration of the *promenade* was about 1 h.

– Conditions 4 and 5 (post visit testing): fixation and eye navigation

Following their *promenade*, posturography was performed under the same conditions as those followed in conditions 2 and 3.

### **Data analysis**

Data were analyzed using the following postural parameters: (1) standard deviation of medio-lateral and antero-posterior body sway (SDx and SDy); (2) surface of the CoP excursions, measured with the confidence ellipse including 90% of the CoP positions which were sampled by eliminating the extreme points (Takagi et al., 1985); (3) variance of speed.
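The three time-domain parameters listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' software: the function name `sway_parameters` is hypothetical, the 90% ellipse is computed from the sample covariance under a bivariate-normal assumption (the text instead cites Takagi et al., 1985, whose procedure of eliminating extreme points is not reproduced here), and speed is taken as the distance between consecutive samples multiplied by the sampling frequency.

```python
import math
import statistics


def sway_parameters(x, y, fs=40.0, coverage=0.90):
    """Summarize one CoP recording in the time domain.

    x, y     : medio-lateral and antero-posterior CoP positions (e.g., in mm)
    fs       : sampling frequency in Hz (40 Hz in the text)
    coverage : fraction of points the ellipse should contain (0.90 in the text)
    """
    n = len(x)
    sd_x = statistics.stdev(x)  # SDx: medio-lateral sway
    sd_y = statistics.stdev(y)  # SDy: antero-posterior sway

    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

    # Assumption: bivariate-normal ellipse. For 2 degrees of freedom the
    # chi-square quantile has the closed form -2 * ln(1 - coverage).
    k = -2.0 * math.log(1.0 - coverage)
    det = (sd_x * sd_y) ** 2 - cov_xy ** 2
    ellipse_area = math.pi * k * math.sqrt(max(det, 0.0))

    # Speed between consecutive samples: distance travelled times fs.
    speeds = [math.hypot(x[i + 1] - x[i], y[i + 1] - y[i]) * fs
              for i in range(n - 1)]
    speed_variance = statistics.variance(speeds)

    return {"SDx": sd_x, "SDy": sd_y, "surface": ellipse_area,
            "speed_variance": speed_variance}


# Hypothetical four-sample recording, for illustration only.
demo = sway_parameters([1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0])
```

With a real recording, `x` and `y` would each hold the 2048 samples of one condition; the units of the surface are the square of the position units.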

A non-linear wavelet analysis using Morlet wavelets (Framiral, n.d.) was applied to CoP displacements in order to build a time-frequency chart of body sway (Dumistrescu and Lacour, 2006; Bernard-Demanze et al., 2009). Such analysis reveals temporal fluctuations in the body sway spectrum. From this analysis, several parameters were extracted for both medio-lateral and antero-posterior body sway and for three frequency bands (F1: 0.05–0.5 Hz; F2: 0.5–1.5 Hz; F3: higher than 1.5 Hz): (1) spectral power (Px and Py for F1, F2, and F3); (2) canceling time (CT) (CTx and CTy for F1, F2, and F3); as well as (3) a global postural instability index (PII). The hypothetical physiologic significance of the spectral power (Px and Py) of different bands is as follows: 0–0.5 Hz visual–vestibular (Naschner, 1979; Kohen-Raz et al., 1996; Paillard et al., 2002), 0.5–1.5 Hz cerebellar (Paillard et al., 2002), >1.5 Hz reflexive loops (Lacour et al., 2008; Bernard-Demanze et al., 2009). As a rule, power in the higher band (F3) is minimal in healthy subjects during quiet standing, but it can nevertheless be non-negligible in the elderly and in the presence of a postural pathology, or in dynamic postural conditions (Bernard-Demanze et al., 2009). The CT is the total time during which the spectral power of the body sway for a specific frequency band is canceled by the posture control mechanisms; the longer the CT of a given frequency band, the better the postural control (Dumistrescu and Lacour, 2006; Bernard-Demanze et al., 2009). The fact that, over a period of time, a certain frequency's power is reduced to 0 demonstrates that the postural control system has been successfully engaged given that the overall entropy of the sway has been reduced. Unlike most healthy subjects, who do exhibit these zero-power instances in their postural sway spectrum, pathological subjects do not.
Precisely how the canceled frequencies are "chosen" by the postural control system is not known, but the minimization of muscular effort required for controlling the sway is perhaps one of the system's major deciding factors. The PII, which quantifies the postural performance by taking into account the two aforementioned indices (P and CT), was also calculated (Dumistrescu and Lacour, 2006; Bernard-Demanze et al., 2009) as follows: $\mathrm{PII} = \sum_{x,y} P(F1, F2, F3)/CT(F1, F2, F3)$. For healthy adults the PII is close to unity during the quiet stance task (Bernard-Demanze et al., 2009). This time-frequency analysis and associated parameters were obtained with the software PosturoPro (Framiral, Cannes, France).
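To make the two frequency-domain indices more concrete, here is a minimal Python sketch under explicit assumptions. It does not reproduce PosturoPro: the function names are hypothetical, the band power over time is assumed to be already available from a wavelet transform, "canceled" is read as power at or below a threshold `eps` (zero by default), and the PII is read literally as a sum of P/CT ratios over the two axes and three bands. The scaling that brings the PII close to unity in healthy adults is not specified in the text and is not modeled.

```python
def canceling_time(band_power, dt, eps=0.0):
    """Total time (s) during which the power of one band is canceled.

    band_power : power of one frequency band at successive time steps
    dt         : time step in seconds (1/40 s if one value per CoP sample)
    eps        : assumed threshold at or below which power counts as canceled
    """
    return dt * sum(1 for p in band_power if p <= eps)


def postural_instability_index(power, cancel_time):
    """One literal reading of the PII: sum over axes and bands of P / CT.

    power, cancel_time : nested dicts indexed as [axis][band]
    A band that is never canceled (CT == 0) would make the ratio undefined
    and is not handled in this sketch.
    """
    total = 0.0
    for axis in ("x", "y"):
        for band in ("F1", "F2", "F3"):
            total += power[axis][band] / cancel_time[axis][band]
    return total


# Hypothetical values, for illustration only.
ct_demo = canceling_time([0.0, 0.2, 0.0, 0.0, 1.0], dt=0.025)
bands = ("F1", "F2", "F3")
p_demo = {axis: {b: 1.0 for b in bands} for axis in ("x", "y")}
c_demo = {axis: {b: 2.0 for b in bands} for axis in ("x", "y")}
pii_demo = postural_instability_index(p_demo, c_demo)
```

Under this reading, lower power and longer canceling times both lower the index, which matches the interpretation given in the text that a longer CT indicates better postural control.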

### **Statistical analysis**

In order to test Hypothesis 1 (a first glance at the sculpture has an immediate effect on posture), we ran one-way ANOVAs with the main factor SCULPTURE (fixating the cross before viewing the sculpture *versus* fixating the sculpture). This analysis was performed on data from conditions 1 and 2.

In order to test Hypotheses 2 and 3 (exploration of the sculpture, with eye movements for Hypothesis 2 or during the *promenade* for Hypothesis 3, has an effect on posture), we ran two-way ANOVAs with the main factors EYE MOVEMENTS (sculpture fixation *versus* eye navigation) and PROMENADE (before *versus* after visiting the exhibition). This analysis was performed on data from conditions 2–5.

*Post hoc* analyses were conducted with Fisher's LSD test. The significance level was set at *p* < 0.05.
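For a factor with only two levels measured in the same participants, a one-way within-subject ANOVA is equivalent to a paired *t* test, with *F*(1, *n* − 1) = *t*<sup>2</sup>. The text does not state explicitly that the ANOVAs were within-subject, but the reported degrees of freedom (e.g., *F*(1,13) for the 14 participants of Experiment 2) suggest it. The Python sketch below illustrates this equivalence; the function name `paired_f` and the data are hypothetical.

```python
import math
import statistics


def paired_f(first, second):
    """F statistic for a two-level within-subject factor (F = t squared).

    first, second : one value per participant in each condition,
                    listed in the same participant order
    Returns (F, (df_effect, df_error)).
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    # Paired t statistic: mean difference over its standard error.
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical scores for four participants, for illustration only.
f_value, dfs = paired_f([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 5.0])
```

In this toy example the differences are 1, 2, 2, and 1, so the mean difference is 1.5, *t*<sup>2</sup> is 27, and the degrees of freedom are (1, 3).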

### **RESULTS**

### **The SCULPTURE effect: the power of medio-lateral body sway decreased at first glance**

The one-way ANOVAs revealed that the SCULPTURE factor significantly decreased Px for F2 [*F*(1,20) = 6.7; *p* < 0.02] and F3 [*F*(1,20) = 5.0; *p* < 0.04; **Figure 2A**]. Thus, compared to the basic condition, a first glance at the laterally tilted sculptures caused a decrease in the power of the medio-lateral body sway. This is a subtle but significant, medio-lateral axis-specific, immediate effect.

### **The EYE MOVEMENT effects: exploring the sculpture with eye movements improved antero-posterior but deteriorated medio-lateral posture performance**

The two-way ANOVAs showed that the EYE MOVEMENTS factor significantly (i) decreased Py for F1 [*F*(1,21) = 6.5; *p* < 0.02] and F3 [*F*(1,21) = 4.7; *p* < 0.04; **Figure 2B**]; (ii) increased CTy for F1 [*F*(1,21) = 10.1; *p* < 0.005; **Figure 2C**]; and (iii) decreased CTx for F2 [*F*(1,21) = 5.16; *p* < 0.03, **Figure 2D**]. Thus, compared to merely staring at a sculpture, moving the eyes across sculptures decreased the energy spent in controlling antero-posterior body sway and increased the CT of antero-posterior body sway, thereby improving postural control along the main axis of eye movements. This was accompanied by a decrease in the CT of medio-lateral body sway, i.e., a decreased postural performance along that perpendicular axis.

**FIGURE 2 | Significant results of Experiment 1.** Group mean and standard deviation. Legend: Basic condition: condition 1. First glance: condition 2. Pre: pre-promenade (conditions 2 and 3). Post: post-promenade (conditions 4 and 5). Fixation: fixating the sculpture (conditions 2 and 4). Exploration: eye navigation in depth along the sculptures' transverse plane (conditions 3 and 5). F1: 0.05–0.5 Hz. F2: 0.5–1.5 Hz. F3: >1.5 Hz. The symbol \* indicates significant *p* values. **(A)** Compared to the basic condition, fixating the sculpture significantly decreased the spectral power of medio-lateral sway (Px) for F2 and F3. **(B)** Navigating with eye movements along the sculpture significantly decreased the spectral power index of antero-posterior sway (Py) for F1 and F3. **(C)** Navigating with eye movements significantly increased the canceling time of antero-posterior sway (CTy) for F1. **(D)** Navigating with eye movements significantly decreased the canceling time of medio-lateral sway (CTx) for F2. **(E)** The promenade significantly decreased the spectral power of medio-lateral sway (Px) for all frequency bands, F1, F2, and F3. **(F)** Before the promenade, navigating with eye movements significantly increased the standard deviation of medio-lateral body sway (SDx); this was no longer the case after the promenade. **(G)** Before the promenade, navigating with eye movements tended to increase the surface of the CoP excursions (Surface); this was no longer the case after the promenade.

### **Interaction between EYE MOVEMENTS and PROMENADE: the promenade eliminated medio-lateral instability related to eye movements**

The two-way ANOVAs showed a significant interaction between the PROMENADE factor and the EYE MOVEMENTS factor for SDx [*F*(1,27) = 7.16; *p* = 0.0142; **Figure 2F**]. Before visiting *Promenade*, the subjects exhibited higher lateral body sway when they performed eye movements as compared to the fixation condition (*p* = 0.003); after visiting *Promenade*, this eye movement-related medio-lateral instability disappeared. In other words, after the *promenade* the medio-lateral body sway remained small indicating improved stability along this axis even when the eyes were moving. There was a similar tendency for the surface of body sway to increase in the eye moving condition before the *promenade* but to decrease in the eye moving condition after the *promenade* [*F*(1,27) = 3.42; *p* = 0.07; **Figure 2G**]. Consequently, after the *promenade* the postural stability improved in the exploration condition, indicating optimal overall control with respect to both viewing conditions (fixation, exploration).

### **The PROMENADE effect: the power of the medio-lateral body sway decreased after the visit**

The two-way ANOVAs showed that the PROMENADE factor significantly decreased Px for F1 [*F*(1,21) = 5.9; *p* < 0.02], F2 [*F*(1,21) = 9.5; *p* < 0.01], and F3 [*F*(1,21) = 11.2; *p* = 0.003; **Figure 2E**]. Thus the *promenade* around the laterally tilted sculptures resulted in a substantial decrease in the power of medio-lateral body sway across all frequency ranges.

#### **EXPERIMENT 2: SUBJECTIVE VISUAL VERTICAL**

#### **MATERIALS AND METHODS**

#### **The artwork**

The artwork was the same as in Experiment 1.

#### **Participants**

Fourteen new participants (29.5 ± 9.0 years) volunteered to take part in this second study.

#### **Procedure**

In Experiment 2, participants were invited to visit the exhibition under the same conditions as in Experiment 1. The participants' SVV was examined before and after visiting the exhibition with a dedicated device (Framiral, Cannes, France). The SVV was determined by inviting participants to stand in a dark room and adjust a luminous line to what they considered to be the earth-vertical axis. This test was carried out in an adjacent room of the museum where complete darkness was easily achieved. The reported error measures were absolute. To our knowledge, learning of the SVV task when examinations are performed tens of minutes apart has never been reported. Thus, any change in SVV errors can be reasonably attributed to the *promenade* itself (i.e., visiting the *Promenade* sculpture while listening to the audio-guide).

### **Statistical analysis**

In order to test Hypothesis 4 (the *promenade* has an effect on SVV), a one-way ANOVA was run on the SVV error data with the PROMENADE factor (before *versus* after visiting the exhibition).

### **RESULTS**

**Figure 3A** indicates group mean and standard deviations of SVV errors before and after the promenade. The one-way ANOVA showed that the PROMENADE factor significantly decreased the SVV errors [*F*(1,13) = 8.81; *p* = 0.01].

In addition, examination of **Figure 3B** displaying individual SVV errors before and after the promenade showed that: (i) there was a correlation between pre- and post-*promenade* measures (which was confirmed by a significant Pearson correlation: *R*<sup>2</sup> = 0.52; *p* = 0.0036) and (ii) the *promenade* improved the SVV for 10 participants, worsened it for three participants and did not change it for one participant. Thus, although the performance of individual participants before and after the *promenade* was correlated, the *promenade* induced a small but significant improvement of the SVV, found for the large majority of the participants (71%).
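The figures in this paragraph can be cross-checked with a little arithmetic. The Python sketch below uses the standard conversion from a Pearson correlation to a *t* statistic on *n* − 2 degrees of freedom, assuming the correlation was computed over the 14 participants of Experiment 2; the variable names are illustrative.

```python
import math

n = 14            # participants in Experiment 2
r_squared = 0.52  # reported squared Pearson correlation

# t statistic for a Pearson correlation, on n - 2 = 12 degrees of freedom.
r = math.sqrt(r_squared)
t_stat = r * math.sqrt(n - 2) / math.sqrt(1.0 - r_squared)

# Share of participants whose SVV improved after the promenade.
improved_percent = round(100 * 10 / n)

print(round(t_stat, 2), improved_percent)
```

Here *t*<sup>2</sup> = 0.52 × 12 / 0.48 = 13, so *t* is about 3.61 on 12 degrees of freedom, which is of the order expected for the reported *p* value, and 10 of 14 participants corresponds to the 71% quoted in the text.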

### **DISCUSSION**

#### **FIRST GLANCE: THE IMMEDIATE EFFECT**

The first glance at these laterally tilted monumental sculptures causes a subtle decrease of the spectral power of medio-lateral body sway. We note the immediacy and specificity of this effect on body sway which is shown to occur along the same anatomical axis as the geometric axis about which the sculpture itself is tilted. This observation is in keeping with our previous posturographic analyses of paintings (Kapoula et al., 2011; Kapoula and Gaertner, 2014), which also revealed the immediate effects of paintings on postural control. Indeed, the present study further corroborates these previous findings while at the same time extending their reach for the first time to include statuary works. The observed immediacy and specificity indicate the impact, however subtle, of artwork in general on human physiology.

### **EFFECTS OF EYE MOVEMENTS**

Testing the effects of visual exploration on postural control was primarily achieved by comparing fixation of the sculpture to eye movement navigation in depth along the sculptures' transverse plane. As explained in Section "Materials and Methods," the eye movements involved convergence and divergence eye movements, which were most likely combined with lateral and/or vertical saccades. Although eye movement recording was not possible, inspection of the subjects' eyes by the experienced eye movement investigators clearly helped to confirm the occurrence of the aforementioned naturally combined eye movements. Subjects made approximately 15 back and forth excursions fixating the five sculptures successively one after the other. Such combined eye movements are in fact the most natural eye movements we make in real life to explore our 3D environment. The impact of such movements on postural control has not yet been studied. Instead some studies from our own research group have examined the effect of saccades and the effect of vergence along the median plane separately, using basic experimental stimuli such as fixating a cross or a laser dot that moves in depth. Saccades alone were not found to compromise postural stability and in fact some benefits were even observed (Rey et al., 2008). In so far as vergence eye movements were concerned, a clear beneficial effect was reported in both healthy subjects and adult patients (Kapoula et al., 2013). The effects observed in the present study relative to combined saccade–vergence eye movements along the sculptures also indicate an improvement in the time-frequency domain along the antero-posterior, depth axis. The spectral power decreased and the cancellation time increased, which indicate longer time periods of optimal control. 
Whether or not this effect is purely physiologic and could be observed any time we make such eye movements even when directing our gaze at stimuli which were not necessarily designed with an artistic or aesthetic end in mind (e.g., a laser dot moving in depth with some lateral or vertical component) is not known. It is likely that this would be the case and we fully expect the sculptures to accentuate effects of this kind; however, this hypothesis requires further investigation. It is important to note that eye navigation in depth along the sculptures' transverse plane improved body sway control in precisely this antero-posterior direction. Another physiologic correlate of interest was the decrease in cancellation time of medio-lateral body sway which indicates that the optimization in the antero-posterior depth along the predominant eye movement axis was obtained at the expense of the lateral axis.

In summary, whether or not these observations are specific to sculptures, this study demonstrates that moving the eyes across sculptures that are laterally inclined and aligned in depth modifies the control of body sway. Navigating Serra's *Promenade* using solely one's eyes can thus be considered an active visual process, as opposed to a purely static, passive and receptive one; a *promenade* of the eyes so to speak, which calls the body into action during the dynamic control of quiet stance.

### **INTERACTION BETWEEN EYE MOVEMENTS AND THE PROMENADE**

Another major finding concerns the interaction between eye movements and the *promenade* around and alongside the sculptures. Visiting the sculptures for approximately 1 h with an audioguide was a multisensory, physiologic and aesthetic experience. Indeed, at the purely physiologic level, while the observers walk around and alongside the sculptures, they generate a large bank of visual, vestibular, and sensorimotor data, which is integrated into and associated with the aesthetic experience. Walking involves canal and otolithic stimulation, compensatory vestibularly generated eye movements as well as somatosensory and visual experiences (the optic flow contingent on one's body motion as one navigates into the sculpted space of the artwork). Walking as a motor behavior is essentially organized around the vertical axis and the body's orientation and stabilization in space depend on the integration of all of these signals (Montgomery, 1985; Hegeman et al., 2007). This sensorimotor experience combined with listening to the exhibit's handheld audio-guide involves physiologic, cognitive, and emotional processes whose combined effects are apparently quite lasting. Indeed, the results indicate that the partial destabilization in body control along the medio-lateral axis triggered during simulated visual navigation into the sculpted environment was ultimately abolished after engaging in the full-body navigation, i.e., the *promenade.* After the *promenade*, the medio-lateral body sway was under much better control even when the eyes were moving as it became similar to the sway under the fixating condition. As previously stated, we attribute this effect to the physiologic experience provided by the *promenade* and its lasting beneficial effects. In other words, exposure to the sculptures would seem to have acted as a physiologic "training" stimulus, not unlike the training modalities used in vestibular rehabilitation techniques. 
This interpretation is further corroborated by overall improvement of postural control and SVV to be discussed below.

#### **THE EFFECT OF THE PROMENADE**

The *promenade* decreased the spectral power of medio-lateral body sway. This was a major, if subtle, lasting effect that was significant for all frequency ranges. Again, this beneficial effect concerned the medio-lateral body sway, which was presumably related to the medio-lateral tilt of the sculptures. It has been demonstrated in laboratory studies that exposing healthy subjects to a tilted visual reference can cause a deviation in their body's position (Isableu et al., 1997). Presumably, the tilted sculptures initially produced such an effect, resulting in a subsequent self-correcting and compensatory postural response designed to keep the body in an upright vertical position while ambulating and navigating in the sculpted environment of the artwork. All of these self-corrections might have also improved orthostatic posture during quiet stance. Postural control during quiet stance and postural control while ambulating are highly interdependent (Bent et al., 2002). A decrease in the spectral power is indicative of improved control over medio-lateral body sway, which is presumably mediated by the prolonged sensorimotor, vestibular, and visual experience of the observer's *promenade* around and alongside the tilted sculptures.

### **IMPROVEMENT OF THE SUBJECTIVE VISUAL VERTICAL**

Healthy subjects are able to perceive verticality with considerable accuracy and this faculty is dependent on inputs from visual, vestibular, proprioceptive, and somatosensory systems (Friedmann, 1970; Pavlou et al., 2003). The perception of verticality also depends on a functioning central nervous system (Yelnik et al., 2002). The otolithic organs in the vestibular system sense gravity and the utricle and saccule contribute to a sense of verticality. Following injury to the otoliths or to the nerve that transmits signals from the otoliths and other parts of the ear to the brain, verticality judgments may be altered. Visual influences on verticality may be measured by putting a frame around a rectilinear bar. Alterations in the angle that the frame forms relative to the bar may disturb a person's judgment of the bar's verticality (e.g., Gueguen et al., 2012). Subjects with vestibular lesions may orient the bar such that it deviates from true vertical by as much as 10° (Vibert et al., 1999; Gomez Garcia and Jauregui-Renaud, 2003).

In this study, we found that the *promenade* was associated with a statistically significant reduction in errors in the estimation of SVV. Prior to the participants' *promenade*, the mean value for SVV was small, i.e., 1.1°, which is within normal values (<2°; see Friedmann, 1970). Immediately following their *promenade*, this error decreased further, falling to a mere 0.8°, which reflects a performance of considerable accuracy. We attribute this ameliorative effect to the same mechanism responsible for the postural results, i.e., to the vestibular–visual signals generated while walking and interacting with the artwork. Indeed, it is known that both postural control and one's perception of SVV rely on the integration of vestibular, somatosensory, proprioceptive, and visual signals (Friedmann, 1970; Pavlou et al., 2003; Blumle et al., 2006). As mentioned above, walking around the sculptures stimulates otolith, visual, proprioceptive, and somatosensory signals and these signals are ultimately integrated, however imperfectly. It therefore follows that an improvement in one's estimation of SVV could also be due to a concomitant improvement in one's ability to effectively integrate these diverse signals. Whether or not, in addition to relying on common signals, posture control and SVV may directly influence each other (see, e.g., the model suggested by Barra et al., 2012) in an experimental set-up similar to ours remains to be investigated in future studies.

#### **ARTWORKS ILLUSTRATE THEORIES OF EMBODIED COGNITION**

Artworks in general offer excellent opportunities to illustrate the theories of embodied cognition. Wilson (2002) distinguished and evaluated six main propositions that are embedded within the concept of embodied cognition. By definition, visiting an exhibition, e.g., walking around sculptures, illustrates that human cognition is a situated activity (proposition #1): we do not merely get informed about an artist and her/his ideas but we go and experience her/his artwork. Our findings concerning Richard Serra's artwork specifically add evidence for other propositions. For instance, the statement that the environment is part of the cognitive system (proposition #4) can be understood within the framework of a complex interaction between two minds, the artist's and the visitor's, mediated by their bodies and the environment. Indeed, the artist creates the physical artwork, which influences the body of the observer (her/his physiology, including posture and sense of verticality), which in turn probably influences her/his cognition. Our experiment specifically probes the link between the physical artwork and the body of the observer, while future studies might address other links, e.g., between the body of the observer and her/his subjective appreciation of the artwork. Serra's sculpture also allowed us to test the proposition that cognition is for action (proposition #5). Indeed, just as the mere sight of a mug is an affordance and triggers a motor activation related to the action of seizing, the experience of tilted sculptures probably caused compensatory movements to adapt body balance to such an unusual environment, which would be at the origin of the effects described in the present study.
We hope the role of body physiology in cognition will be further demonstrated in future studies, e.g., by exploring whether remembering an artwork triggers postural effects similar to those of actually visiting the exhibition, which would help to establish whether off-line cognition is body-based (proposition #6). Thus, we believe that measuring body physiology in various cognitive situations, including when interacting with artwork, contributes importantly to theories of embodied cognition.

### **CONCLUSION**

Visiting this exhibition involved emotional (e.g., facing the monumental sculpture within the Grand Palais's nave), cognitive (e.g., listening to the audio-guide), and physiologic (e.g., observing these stimuli, which play strongly with the sense of depth and verticality, and exploring them with combined eye movements and ambulation around them) processes. Ambulating within the sculpted environment of Serra's monumental statuary work, *Promenade*, contributed, according to the evidence marshaled in the present study, to subtle improvements in the participants' postural control as well as in their capacity to accurately judge the SVV. Such improvements in postural control and sense of verticality might be related to the fact that such an aesthetic stimulus is physiologically challenging for the human body. Such findings might be of interest to the field of Art Therapy. Notwithstanding cognitive and emotional evaluations, aesthetic experiences doubtless involve the body in its entirety, including our sense of gravity and verticality. The participants came out of the exhibition with an improved sense of verticality; this can contribute to the elaboration of aesthetic experience, which is existential and contributes to personal integrity according to philosophers (Funch, 1997).

### **ACKNOWLEDGMENTS**

The authors thank the administration of the Grand Palais, Paris, France, and the *commissaire d'exposition* Alfred Pacquement, director at the Musée National d'Art Moderne at the Centre Pompidou, Paris, France.

#### **REFERENCES**




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 June 2014; accepted: 05 November 2014; published online: 12 December 2014.*

*Citation: Kapoula Z, Lang A, Lê T-T, Adenis M-S, Yang Q, Lipede G and Vernet M (2014) Visiting Richard Serra's "Promenade" sculpture improves postural control and judgment of subjective visual vertical. Front. Psychol. 5:1349. doi: 10.3389/fpsyg.2014.01349*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Kapoula, Lang, Lê, Adenis, Yang, Lipede and Vernet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# **Evidence for the embodiment of space perception: concurrent hand but not arm action moderates reachability and egocentric distance perception**

#### *Stéphane Grade<sup>1,2</sup>, Mauro Pesenti<sup>1,2</sup> and Martin G. Edwards<sup>1,2</sup>\**

*<sup>1</sup> Institut de Recherche en Sciences Psychologiques, Université Catholique de Louvain, Louvain-la-Neuve, Belgium, <sup>2</sup> Institute of Neuroscience, Université Catholique de Louvain, Louvain-la-Neuve, Belgium*

#### *Edited by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada*

#### *Reviewed by:*

*J. S. Jordan, Illinois State University, USA Kévin Roche, Le Laboratoire d'Étude des Mécanismes Cognitifs, France*

#### *\*Correspondence:*

*Martin G. Edwards, Institut de Recherche en Sciences Psychologiques, Université Catholique de Louvain, Place Cardinal Mercier, 10, 1348 Louvain-la-Neuve, Belgium martin.edwards@uclouvain.be*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 07 January 2015 Accepted: 12 June 2015 Published: 26 June 2015*

#### *Citation:*

*Grade S, Pesenti M and Edwards MG (2015) Evidence for the embodiment of space perception: concurrent hand but not arm action moderates reachability and egocentric distance perception. Front. Psychol. 6:862. doi: 10.3389/fpsyg.2015.00862*

The perception of reachability (i.e., whether an object is within reach) relies on body representations and action simulation. Similarly, egocentric distance estimation (i.e., the perception of the distance an object is from the self) is thought to be partly derived from embodied action simulation. Although motor simulation is important for both, it is unclear whether the cognitive processes underlying these behaviors rely on the same motor processes. To investigate this, we measured the impact of a motor interference dual-task paradigm on reachability judgment and egocentric distance estimation, while allocentric length estimation (i.e., how distant two stimuli are from each other independent from the self) was used as a control task. Participants were required to make concurrent actions with either hand actions of foam ball grip squeezing or arm actions of weight lifting, or no concurrent actions. Results showed that concurrent squeeze actions significantly slowed response speed in the reachability judgment and egocentric distance estimation tasks, but that there was no impact of the concurrent actions on allocentric length estimation. Together, these results suggest that reachability and distance perception, both egocentric perspective tasks, and in contrast to the allocentric perspective task, involve action simulation cognitive processes. The results are discussed in terms of the implication of action simulation when evaluating the position of a target relative to the observer's body, supporting an embodied view of spatial cognition.

**Keywords: reachability judgment, distance estimation, action simulation, dual-task, space perception**

## **Introduction**

Space perception arises from multimodal integration (Andersen et al., 1997). Some studies show that neurons are active for both tactile and visual stimulation within a delimited space surrounding and anchored to specific body parts (see Graziano et al., 1994; Gross and Graziano, 1995; Fadiga et al., 2000), while other studies indicate that space perception can be derived from sensorimotor processes, for example, in the discrimination of peripersonal space (i.e., portion of space within arm reach allowing manual interaction with objects) and extrapersonal space (i.e., space beyond reaching capacity; Rizzolatti et al., 1997, 2002; Gallese et al., 1999; Coello and Delevoye-Turrell, 2007; Gallese, 2007). These findings suggest that space may be represented by multiple sub-spatial maps partly delimited by body and action capabilities (Gross and Graziano, 1995; Rizzolatti et al., 1997). Therefore, it appears that the motor system not only plans and controls actions, but that the same neural processes appear to be involved in internal simulation of actions (Fadiga et al., 2000; Jeannerod, 2001), and that these simulations may be used to derive an embodied perception of space (Witt and Proffitt, 2008).

The delimitation of space perception into different subspaces according to action capacities has also been emphasized in neuropsychological (see Previc, 1998, for a review) and neuroimaging research (Weiss et al., 2000; Committeri et al., 2007; Quinlan and Culham, 2007; Brozzoli et al., 2012). Peripersonal space (for visuomotor interaction in reaching range) differs from three extrapersonal spaces (focal extrapersonal, action extrapersonal and ambient extrapersonal; for visual scanning and orientation in space), and these spaces are proposed to rely on different cortical networks (Previc, 1998). For example, bisecting lines in near space has been shown to activate dorsal visuomotor areas, whereas performing the same task on a more distant screen was shown to use ventral perceptual areas (Weiss et al., 2000). In addition to different subspaces, other neuroimaging studies have highlighted neural differences between the frames of reference taken by a participant (Committeri et al., 2004; Zaehle et al., 2007; Galati et al., 2010). The egocentric perspective (i.e., the location of an object from one's own body; also termed body-referencing) involves a bilateral but mainly right-sided fronto-parietal network related to goal directed action planning (Vallar et al., 1999; Galati et al., 2000). The allocentric perspective (i.e., the location of an object relative to the location of another object or person) involves activation of similar areas of the dorsal stream to the egocentric perspective, though to a lesser extent, but also many additional areas in the ventral stream (for a review, see Galati et al., 2010). These observations suggest that the egocentric perspective might recruit motor representations to a greater extent than the allocentric perspective.
Consistent with these arguments, it has also been shown that the perception of reachable versus unreachable objects activates fronto-parietal networks (i.e., the precuneus and the parieto-occipital junction, the anterior parts of the cingulate gyrus and superior and medial frontal gyri, bilaterally) and the cerebellum, suggesting a contribution of dynamic motor representations facilitating the perceptive discrimination of peri- and extrapersonal spaces (Gallivan et al., 2009; Bartolo et al., 2014).

Visually determining whether an object is at a reachable distance is thought to rely on pre-reflective representations of body capacities for action (for a review, see Coello and Delevoye-Turrell, 2007). In reachability judgment tasks, it has been shown that the perceived reaching limit can be influenced by the manipulation of action capability, with postural or environmental constraints (Carello et al., 1989; Rochat and Wraga, 1997; Fischer, 2000; Gabbard et al., 2007), height or position of the table where stimuli are presented (Carello et al., 1989), or participants wearing weights on the wrist (Rochat and Wraga, 1997). For example, hiding participants' hands and providing them with biased visual feedback about the end-point location of their pointing movement has been used to modify a person's perceived action capacity, and this manipulation moderated perceived reachable space (Bourgeois and Coello, 2012). In a second example, a study used a motor constraint paradigm (i.e., by blocking the arms of participants) and showed response speed and accuracy interference for spatial localization decisions of stimuli within peripersonal space (Iachini et al., 2014). Further, physical (non-manipulated) differences in action capability such as handedness and visual laterality of target placement can also moderate reachability judgments (Fischer, 2005a; Gabbard et al., 2005a,b). Finally, motor disruption, for example through the use of transcranial magnetic stimulation (TMS) applied over the hand motor cortex of the left hemisphere, was shown to moderate response latencies in reachability judgments, particularly for stimuli positioned near the boundary of peripersonal space (Coello et al., 2008).
Therefore, together, these effects demonstrate that moderations that normally influence action, also influence judgments of reachability, suggesting that reachability may be based on action representations that are constrained by the context in which the action could be performed (Fischer, 2000).

In parallel, studies focusing on the cognitive processes underlying distance perception have observed similar behavioral effects of action manipulation on distance estimation tasks (Proffitt, 2006; Witt, 2011). For instance, participants wearing a heavy backpack showed an increase in egocentric distance estimation compared to not wearing any backpack (Proffitt et al., 2003). Also, throwing a heavy compared to light ball to a target caused a subsequent greater estimation of the distance between the person and the same target (Witt et al., 2004). These two studies demonstrated that the manipulation of the effort associated with the action influenced space perception (Proffitt, 2006). In a further study, Witt and Proffitt (2008) added a concurrent ball squeezing task to the ball throwing task. It appeared that squeezing a rubber ball during distance estimation eliminated the influence of the heavy ball throwing, presumably through preventing ball throwing simulation (Witt and Proffitt, 2008). Distance estimation has also been investigated following the use of tools, understood to extend peripersonal space (Berti and Frassinetti, 2000; Farnè and Làdavas, 2000; Longo and Lourenco, 2006). For instance, after using a tool, participants perceived targets as closer than when no tool was used (Witt et al., 2005; Witt and Proffitt, 2008; Osiurak et al., 2012). Interestingly, when participants squeezed a rubber ball while making distance estimation judgments, the impact of tool use on distance estimation was reduced compared to making judgments without ball squeezing (Witt and Proffitt, 2008). As distance perception was moderated by tool use, and because the dual-task of ball squeezing reduced this moderation, it seems that motor simulation must provide a calibration metric for distance perception.

Altogether, these findings suggest that space perception benefits from and is scaled to the representations of the body and its capacities. However, whether internal simulated actions do contribute to the perception of spatial distance is still an open question (Proffitt, 2013). The goal of the present study was to examine the contribution of action representations in both reachability and distance perception behaviors by investigating whether a concurrent motor task that may disrupt internal action representations would influence the perception of space. To assess this, participants completed three different spatial perceptual tasks (i.e., reachability, egocentric distance and allocentric length estimation) while performing concurrent hand (i.e., foam ball squeezing; Witt and Proffitt, 2008) or arm (i.e., weight lifting) actions in a within-participant design. With this manipulation, we tested whether similar interference of the concurrent actions would be observed in the reachability judgment task and the egocentric distance estimation task. We propose that these two tasks should show similar patterns of response time interference as, in previous studies, the manipulation of reach capacities influenced distance estimation (Witt and Proffitt, 2008; Osiurak et al., 2012; Morgado et al., 2013). In contrast, no dual-task moderation is expected in the allocentric length estimation task. Indeed, we argue that allocentric length estimation does not involve spatial localization relative to the body (or body referencing) and that action simulation processes are thus not recruited in this task. For the reachability judgment task, we predict a typical increase of response latency and error rate for targets placed near the boundary of peripersonal space (Gabbard et al., 2007; Bartolo et al., 2014).
Moreover, an interaction between target location and dual-task for response latencies is expected in the reachability judgment task, with a stronger action dual-task effect for stimuli placed near the boundary of peripersonal space (Coello et al., 2008).

## **Method**

### **Participants**

There were 18 participants (aged between 18 and 25 years, *M* = 20.3, SD = 2.2, nine women, three left-handed), all with normal or corrected-to-normal visual acuity and all naïve to the purpose of the experiment. The experiment was non-invasive and was approved by the ethics committee of the Institut de recherche en Sciences Psychologiques of the Université catholique de Louvain, in accordance with the ethical standards established by the Declaration of Helsinki.

### **Apparatus, Stimuli, and Procedure**

The apparatus consisted of a projector placed above a white table that was 215 cm long, 122 cm wide and 70 cm high. Black curtains surrounded the table in order to isolate the experimental environment from the rest of the room and reduce distractions. Participants sat on a chair situated in the middle of the small edge of the table and a microphone was placed above their head in order to record response latencies. The stimuli were composed of white rectangles (5 cm width and 2.5 cm length) displayed on a black background at various locations on the table (see **Figure 1**). A customized E-prime program (Schneider et al., 2002) was used to display the stimuli on the table and to control the experimental procedures.

All participants were required to perform three different tasks: reachability judgment, egocentric distance estimation, and allocentric length estimation. The reachability judgment and egocentric distance estimation tasks used the same stimuli. Rectangular shapes were projected on the tabletop at 16 different distances along the participant's sagittal body-midline axis (35; 55; 65; 75; 80; 85; 87.5; 90; 92.5; 95; 100; 105; 115; 125; 145; and 165 cm), such that approximately half of the stimuli were placed within reach and the other half out of reach, with more closely spaced stimuli placed at the boundary of reach space. In the reachability judgment task, participants were asked to judge whether they could touch the stimuli displayed on the table without actually performing any reaching actions. They responded aloud "yes" if they thought that they could touch the rectangle, or "no" if they thought that the stimulus was out of reach. It was explicitly mentioned that they could imagine themselves leaning forward, but that their bottom could not leave the chair in their action simulation. Moreover, they were asked to keep their back against the chair backrest during the entire experiment. In the egocentric distance estimation task, participants were asked to estimate the distance in centimeters separating them from the rectangular stimulus. In the allocentric length estimation task, they estimated the distance separating two rectangles presented on the tabletop. The two rectangles were presented either at 70 or 130 cm from the participant (i.e., within and out of reach space), and there were eight possible lengths between the rectangles (8; 16; 24; 32; 40; 48; 56; and 64 cm). The two rectangles were always equidistant from the center of the table relative to the participant's sagittal body mid-line. Within each task, the participants performed three different dual-task conditions (run in separate trial blocks and counterbalanced within the tasks). Participants were either instructed to simply place their hands on the edge of the table (baseline condition), to perform arm actions (i.e., from a fully laterally outstretched arm span to flexion of the elbows with the hands above the shoulders) with one-kilogram weights (arm action condition), or to perform foam ball squeezing hand actions while placing their arms alongside their body (hand action condition).
They were also trained with a metronome in order to perform the different actions at a specific rate (i.e., 40 per min).

Task order was counterbalanced across the participants. Each task consisted of three blocks of trials, each with a different dual-task condition (order of dual-tasks was also counterbalanced). Each block consisted of 16 different stimuli repeated four times, resulting in 64 randomized trials per block. Each trial started with a beep sound lasting 700 ms, then a stimulus displayed until participants responded, and finally, a blank screen for 1000 ms. In all tasks, the participant had to respond as fast as possible while keeping errors to a minimum. Each task started with a small practice session to make sure that participants fully understood the instructions and experimental set up, and that they performed the different dual-task actions at a specific frequency paced by a metronome. At the end of the experiment, the experimenter measured the height of the participant's eyes, the length of their arms (from neck base to the edge of the middle finger) and their actual reaching limit (the participant's furthest reach while being seated on the chair).
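
The block structure described above (16 stimuli, four repetitions each, presented in random order) can be sketched in a few lines of Python. This is only an illustration: the experiment itself was run in E-Prime, whose randomization settings are not reported, so a plain shuffle of the full trial list is an assumption, and the names `make_block`, `repetitions` and `seed` are hypothetical.

```python
import random

# The 16 egocentric target distances (cm) listed in the procedure.
EGOCENTRIC_DISTANCES_CM = [35, 55, 65, 75, 80, 85, 87.5, 90,
                           92.5, 95, 100, 105, 115, 125, 145, 165]


def make_block(stimuli, repetitions=4, seed=None):
    """Return one block: every stimulus repeated `repetitions` times, shuffled.

    Assumption: unconstrained shuffling of the whole trial list. The paper
    only states that the 64 trials of a block were randomized.
    """
    rng = random.Random(seed)  # a local generator keeps the order reproducible
    trials = list(stimuli) * repetitions
    rng.shuffle(trials)
    return trials


block = make_block(EGOCENTRIC_DISTANCES_CM, seed=1)
```

Each call yields 64 trials in which every distance occurs exactly four times; three such blocks, one per dual-task condition, make up a task.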

### **Data Analyses**

For the reachability judgment and egocentric distance estimation tasks, the 16 distances were averaged into four different distance categories (i.e., very close with distances 35; 55; 65; 75; close with distances 80; 85; 87.5; 90; far with distances 92.5; 95; 100; 105; and very far with distances 115; 125; 145; 165). Repeated measures analyses of variance (ANOVAs) were conducted with distance categories and dual-task conditions (i.e., no actions; arm actions or hand actions) as within-subject factors. For the allocentric length estimation task, two length categories were formed from the eight different stimuli (short: 8; 16; 24; 32; long: 40; 48; 56; 64). Repeated measures ANOVAs were conducted with length categories, the position from the participant (70 cm vs. 130 cm) and the dual-task conditions as within-subject factors. The dependent variables were response latency and accuracy. For the reachability judgment, accuracy was computed on the basis of the participant's real reach capability measured at the end of the experiment. For each stimulus distance, a trial was considered an error when participants underestimated (i.e., responded that they could not reach a target when they could) or overestimated (i.e., responded that they could reach a target when they could not) their reachability. To assess accuracy for egocentric distance and allocentric length estimations independently of the magnitude of the target to be estimated, an error rate was computed by subtracting the actual distance from the participant's response and dividing this difference by the actual distance (i.e., [(participant's response − actual distance)/actual distance], where a positive value indicates overestimation, a negative value indicates underestimation, and a value of 0 means perfect accuracy; for a similar index, see Crollen et al., 2013).
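
As a minimal sketch of the scoring rules just described, the following Python functions group a target distance into its category, compute the signed relative error index, and flag a reachability error. The function names are hypothetical, and treating a target located exactly at the measured reach limit as reachable is an assumption, since the text does not specify that boundary case.

```python
# Distance categories (cm) exactly as listed in the Data Analyses section.
DISTANCE_CATEGORIES = {
    "very close": (35, 55, 65, 75),
    "close": (80, 85, 87.5, 90),
    "far": (92.5, 95, 100, 105),
    "very far": (115, 125, 145, 165),
}


def distance_category(distance_cm):
    """Return the category name for one of the 16 target distances."""
    for name, members in DISTANCE_CATEGORIES.items():
        if distance_cm in members:
            return name
    raise ValueError(f"{distance_cm} cm is not one of the 16 target distances")


def relative_error(response_cm, actual_cm):
    """Signed error index: > 0 overestimation, < 0 underestimation, 0 exact."""
    return (response_cm - actual_cm) / actual_cm


def is_reach_error(said_yes, distance_cm, reach_limit_cm):
    """True when the yes/no answer contradicts the measured reach limit.

    Assumption: a target exactly at the limit counts as reachable.
    """
    actually_reachable = distance_cm <= reach_limit_cm
    return said_yes != actually_reachable
```

For instance, estimating 60 cm for a target at 80 cm gives `relative_error(60, 80) == -0.25`, an underestimation of one quarter of the actual distance, whatever the absolute size of the target distance.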

Bonferroni correction (*BC*) was applied where multiple *post hoc* comparisons were used. Data from unreliable trials (no response or microphone failures), and outlier responses (for which response latency was more than 2.5 standard deviations above or below the overall mean) were excluded from the analyses. This led to the removal of 2.6, 1.0, and 4.2% of unreliable trials, and 2.8, 2.3, and 1.3% of outlier response latencies from the total trials of the reachability judgment, egocentric distance estimation and allocentric length estimation tasks respectively.
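
The outlier criterion can be illustrated with a short Python sketch. It is written under stated assumptions: the text does not say whether the "overall mean" was computed per participant, per task, or per condition, nor whether the sample or population standard deviation was used, so the function below simply takes one list of latencies and uses the sample standard deviation. The name `exclude_outliers` and its parameters are hypothetical.

```python
from statistics import mean, stdev


def exclude_outliers(latencies_ms, n_sd=2.5):
    """Keep latencies within n_sd standard deviations of the mean.

    Assumption: mean and sample SD are computed over the list passed in;
    the grouping actually used in the study is not reported.
    """
    centre = mean(latencies_ms)
    spread = stdev(latencies_ms)
    lower = centre - n_sd * spread
    upper = centre + n_sd * spread
    return [rt for rt in latencies_ms if lower <= rt <= upper]


# Illustrative latencies only (not data from the study).
latencies = [560, 580, 600, 620, 640] * 4 + [2000]
kept = exclude_outliers(latencies)
```

With these made-up values, the single 2000 ms response lies beyond the upper bound and is dropped, while the 20 remaining latencies are kept.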

## **Results**

### **Reachability Judgment Task**

The analysis of response latency revealed a significant main effect of the dual-task conditions [*F*(2,34) = 3.5, *p* < 0.05], with a significant difference between the hand action condition (*M* = 610 ± 76) and the no action condition (*M* = 579 ± 93; *p*<sub>BC</sub> < 0.05, η<sub>p</sub><sup>2</sup> = 0.17). A significant main effect of the distance categories was observed [*F*(3,51) = 10.1, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.37], with the very near (*M* = 558 ± 90) and very far (*M* = 583 ± 71) categories not being significantly different from each other (*p*<sub>BC</sub> > 0.05), but significantly different from the near (*M* = 606 ± 98) and far (*M* = 624 ± 82) categories (all *p*<sub>BC</sub> < 0.05). The difference between the near and far categories was not significant (*p*<sub>BC</sub> > 0.05). The interaction between the two variables was significant [*F*(6,102) = 3.6, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.17]. Separate ANOVAs were run for each distance category with the dual-task condition as factor. A significant effect of the dual-task was only observed in the near distance category [*F*(2,34) = 7.3, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.30], and pairwise comparisons showed that the hand action condition was significantly different from the arm action and the no action conditions (*p*<sub>BC</sub> < 0.05; see **Figure 2A**).
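
As a consistency check on the statistics reported above, partial eta squared can be recovered from an *F* ratio and its degrees of freedom with the standard identity η<sub>p</sub><sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The sketch below is not part of the original analysis, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F ratios and degrees of freedom reported for the reachability latencies.
reported = {
    "dual-task main effect": (3.5, 2, 34),
    "distance main effect": (10.1, 3, 51),
    "interaction": (3.6, 6, 102),
    "dual-task effect, near category": (7.3, 2, 34),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(f"{label}: {partial_eta_squared(f_value, df_effect, df_error):.2f}")
```

The recomputed values agree with the reported effect sizes (0.17, 0.37, 0.17 and 0.30) to two decimal places.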

The analysis of accuracy (i.e., absolute error, irrespective of over- or underestimation of reach) showed a significant main effect of distance categories [*F*(3,51) = 8.5, *p <* 0.001, η<sup>2</sup><sub>p</sub> = 0.33], with the very close (*M* = 0.024 *±* 0.042) and the very far (*M* = 0.042 *±* 0.11) distances significantly different from the close (*M* = 0.21 *±* 0.19) and far (*M* = 0.26 *±* 0.24) distances (all *pBC <* 0.05). Participants made more errors for distances situated on the boundary between reachable and unreachable space than for the very close or very far distances. There was no main effect of the dual-task conditions [*F*(2,34) = 0.12, *p >* 0.05, η<sup>2</sup><sub>p</sub> = 0.007], and no interaction between distance categories and dual-task conditions [*F*(6,102) = 1, *p >* 0.05, η<sup>2</sup><sub>p</sub> = 0.03; see **Figure 2B**].

#### **Egocentric Distance Estimation Task**

The ANOVA on response latencies revealed a significant main effect of the dual-task condition [*F*(2,34) = 9.5, *p <* 0.01, η<sup>2</sup><sub>p</sub> = 0.36], with the hand action condition (*M* = 1176 *±* 221) significantly different from the no action condition (*M* = 1082 *±* 195; *pBC <* 0.01). A significant main effect of the distances was also observed [*F*(3,51) = 6.3, *p <* 0.01, η<sup>2</sup><sub>p</sub> = 0.27], with the very near category (*M* = 1159 *±* 196) significantly different from the near category (*M* = 1091 *±* 197; *pBC <* 0.05). The two variables did not interact [*F*(6,102) = 0.87, *p >* 0.05, η<sup>2</sup><sub>p</sub> = 0.05; see **Figure 3A**].

The analysis of accuracy revealed no effect of the dual-task conditions [*F*(2,34) = 2.12, *p* = 0.15, η<sup>2</sup><sub>p</sub> = 0.11], no effect of the distance category [*F*(3,51) = 2.9, *p* = 0.069, η<sup>2</sup><sub>p</sub> = 0.15], and no interaction [*F*(6,102) = 0.36, *p >* 0.05, η<sup>2</sup><sub>p</sub> = 0.02; see **Figure 3B**].

#### **Allocentric Length Estimation Task**

The ANOVA of the response latencies did not reveal a significant main effect of the dual-task conditions [*F*(2,34) = 0.8, *p >* 0.05, η<sup>2</sup><sub>p</sub> = 0.04]. There was, however, a significant main effect of the position [*F*(1,17) = 10.8, *p <* 0.01, η<sup>2</sup><sub>p</sub> = 0.39], indicating that lengths positioned near (*M* = 1084 *±* 186) the participants were responded to faster than lengths presented further away (*M* = 1114 *±* 195). There was also a significant main effect of length category [*F*(1,17) = 9.9, *p <* 0.01, η<sup>2</sup><sub>p</sub> = 0.37], with participants responding faster to shorter (*M* = 1071 *±* 172) than longer (*M* = 1127 *±* 216) lengths. There were no significant interactions (see **Figure 4A**).

The analysis of accuracy showed no significant main effect of dual-task conditions [*F*(2,34) = 0.76, *p >* 0.05, η<sup>2</sup><sub>p</sub> = 0.04], but a significant main effect of the position [*F*(1,17) = 38.9, *p <* 0.001, η<sup>2</sup><sub>p</sub> = 0.69], with the lengths appearing near (*M* = 0.072 *±* 0.25) showing a smaller overestimation bias than lengths appearing far (*M* = 0.18 *±* 0.30) from participants. There was also an effect of length categories [*F*(1,17) = 30.3, *p <* 0.001, η<sup>2</sup><sub>p</sub> = 0.64], with short lengths (i.e., 8; 16; 24; 32 cm; *M* = 0.252 *±* 0.31) showing a larger overestimation bias than long lengths (i.e., 40; 48; 56; 64 cm; *M* = 0.004 *±* 0.28). The only significant interaction observed was the one between positions and length categories [*F*(1,17) = 4.9, *p <* 0.05, η<sup>2</sup><sub>p</sub> = 0.22]. The accuracy difference between short and long lengths was significantly smaller in the near compared to the far position [*t*(17) = 2.2, *p <* 0.05; see **Figure 4B**].

### **Discussion**

The perception of space is thought to benefit from the ability to mentally represent action (Rizzolatti et al., 1997; Gallese, 2007; Witt and Proffitt, 2008), and to use action capability as an index for understanding how objects are positioned relative to ourselves (Coello and Delevoye-Turrell, 2007; Witt and Proffitt, 2008; Proffitt, 2013). As many findings show similarities between action execution and action simulation (Decety et al., 1989; Parsons, 1994; Hanakawa et al., 2008), an interference of action execution on action simulation and beyond it, on space processing, is expected. However, it is currently unclear if reachability judgment (whether an object is in reach) and egocentric distance estimation (the distance between the viewer and the object) rely on the same body capability representations and action simulation processes. The aim of the present study was to evaluate whether performing a motor dual-task interfering with motor simulation processes would moderate space perception (extending dual-task effects to distance estimation; Witt and Proffitt, 2008). Furthermore, within the same participants, we wanted to determine for the first time whether two commonly used measures of space perception (reachability judgment and egocentric distance estimation) were based on a common action simulation mechanism.

For the reachability judgment task, significantly slower and less accurate responses were observed for targets presented close to the boundary between reachable and unreachable spaces, in comparison to the distances further from the boundary (very near and very far). This finding is in accordance with other studies investigating reachability judgment (e.g., Fischer, 2005b; Gabbard et al., 2005a, 2007; Bartolo et al., 2014). Furthermore, the dual-task conditions only affected response latencies, where performing foam ball squeezing actions significantly increased response latencies compared to the arm action and the control condition. This was particularly the case for judgments to targets placed in peripersonal space and near the boundary between peri- and extrapersonal spaces. This observation extends the finding that TMS application to the hand-associated motor cortex slowed down participants' perceptual judgments of whether a target was reachable or not, with a greater disruption when the targets appeared at the boundary of peripersonal space (Coello et al., 2008). Therefore, using the current method, we support the argument that when performing visually based judgments of whether a target is reachable, internal motor representations appear to be recruited rather than being an epiphenomenal consequence of the reachability judgment (Coello and Delevoye-Turrell, 2007; Bartolo et al., 2014).

For egocentric distance estimation, participants tended to respond faster to targets positioned near compared to very near, but there was no influence of target position on accuracy. Although the influence of distance on the speed of response had a different profile than the one observed in the reachability judgment task, the ball-squeezing dual-task again slowed participants' responses relative to the no-action condition, but showed no interaction between target distance and the dual-task condition. This result supports the idea that motor simulation processes are involved in perceiving egocentric distances of objects (Witt et al., 2004; Proffitt, 2006, 2013; Witt and Proffitt, 2008), and that the simulation of reach capability may serve as a calibration metric for distance perception, scaling spatial locations in the environment relative to the body and its capacities (Witt et al., 2005; Linkenauger et al., 2009; Osiurak et al., 2012; Morgado et al., 2013).

In contrast to reachability judgment and egocentric distance estimation, allocentric length estimation was not influenced by the concurrent actions. This suggests that allocentric length estimation does not rely on motor simulation processes or body representation. An effect of stimulus magnitude was observed, with participants overestimating short compared to long lengths. This effect can be attributed to a contraction bias described in the literature, where participants have their magnitude estimations pulled toward the center of the stimulus range, leading to the underestimation of large and the overestimation of small stimuli (Poulton, 1979). Additionally, an effect of position was observed, where participants made faster and more accurate responses to targets positioned near compared to far from them. More specifically, participants reported far positioned lengths as being longer than near positioned lengths despite the fact that the stimuli were identical in length. This effect may be due to a compensation of size constancy in depth where objects presented further from participants appear smaller (Gregory, 1963). This effect could also be due to a Ponzo illusion (Ponzo, 1912), where identical lines are perceived as different when placed within a triangle or a trapezoid shape (Fisher, 1968). In our experimental set up, even if the table was rectangular, it appeared as a trapezium for participants (i.e., the near edge is visually larger than the far edge of the table), perhaps causing a Ponzo illusion-like effect on the stimuli in this task.

Interestingly, only the ball squeezing dual-task interfered with reachability judgment and egocentric distance estimation. Although one might have expected that the concurrent arm movement would have had a similar disruptive effect, this was not the case. One interpretation of this effect could lie in the manner in which reaching actions might be internally simulated or represented. The ball squeezing actions specifically required acting on an object rather than just moving the object continuously as in the arm action dual-task. We propose that object- or goal-directed actions may be more interfering, involving both reach and grasp integrated representations (Jeannerod, 1997; Paulignan et al., 1997). Perhaps similarly, the effect could be explained by the theory of event coding (TEC) framework for interactions between perception and action (Hommel et al., 2001; Hommel, 2009). According to TEC, perceived events and action intentions/goals (or to-be-produced events) are coded within a common representational medium of distal events. Perception thus includes action planning or simulation that takes into account the goals an individual has regarding a distal event (i.e., the intended change to be performed). In the present study, the ball-squeezing dual-task required participants to act upon an object with the intention of modifying the object structure, generating an object-directed distal event. As the target stimuli in the reachability and distance estimation tasks were also processed as distal events, the two different events would have had to be represented simultaneously, and this competition in representation might have caused interference in event coding processes leading to longer response latencies. Moreover, for goal-directed grasping in peripersonal space, the representation of the object position had to be coded in hand-centered coordinates rather than regarding arm position (Brozzoli et al., 2012). Experiments have shown that multisensory coding of peripersonal space (measured with cross-modal visual-tactile interactions) was particularly influenced by object-oriented grasping actions (Brozzoli et al., 2009, 2010). Their results showed that cross-modal congruency effects were stronger during action execution compared to a static condition (Brozzoli et al., 2009) and that this congruency effect was stronger during the execution of grasping action compared to pointing (reaching) action (Brozzoli et al., 2010). If the perception of peripersonal space was equally considered to be based on hand-centered simulation processes, then only the squeezing dual-task would influence space perception, and not the arm action dual-task condition.

From the common finding of the dual-task effects on both reachability judgment and egocentric distance estimation, we argue that egocentric space processing requires motor simulation processes where the viewer evaluates the position of a target in relation to their body and simulates an action within their capacity. This finding is consistent with several fMRI studies showing that, in egocentric perceptual tasks (e.g., judging which of two objects is closest to the viewer), there was a greater activation of a fronto-parietal network including the posterior parietal cortex and premotor areas (Committeri et al., 2004; Galati et al., 2010). These brain areas are used for action planning processes and, furthermore, are shown to be active when participants imagine executing actions (Hanakawa et al., 2008; Macuga and Frey, 2012). This is in contrast to allocentric perceptual tasks (e.g., judging which of two objects is closest to a third one) that show activations scattered across both the ventral and dorsal areas (Committeri et al., 2004; Galati et al., 2010), suggesting that allocentric length estimation may require different cognitive processes not related to action simulation.

In the present study, the effect of the dual-task was limited to response latency effects. Our analyses showed that the dual-task conditions had no influence on actual reachability judgments or distance estimates. This finding is consistent with previous dual-task experiments (Witt and Proffitt, 2008), where it was reported that squeezing soft balls at the same time as estimating the distance of targets showed no moderation of the reported distances when no tool was present. This suggests that represented body metrics are not modified by the concomitant execution of actions, as no extension or reduction of perceived peripersonal space was observed. Another interesting finding was that the effect of the ball squeezing dual-task on response latency differed for reachability judgment and egocentric distance estimation. For reachability judgment, responses were slowed specifically for targets in the peripersonal space whereas the dual-task slowed responses to all target positions for egocentric distance estimation. These two findings suggest that while common motor resources are recruited for the two tasks, the manner in which those motor resources are used might be different. For reachability judgments, responses are particularly slowed for peripersonal targets close to the boundary between peripersonal and extrapersonal space. This could be explained through inefficiency or difficulty in the use of action simulation when at the limit of the participant's capability. For egocentric distance estimation however, the perceptual response latencies were similar for all targets, irrespective of whether they were within or outside of reach.

In conclusion, our study supports the idea that internal representations of action contribute to the perception of external space (Coello and Delevoye-Turrell, 2007; Witt and Proffitt, 2008; Bartolo et al., 2014). Visually determining what is reachable engages the simulation of a motor act that can be interfered with by a hand motor dual-task. Moreover, we find that similar internal simulations of reach might also serve as a metric for egocentric distance perception. Therefore, reachability and egocentric distance perception appear to be linked (Osiurak et al., 2012; Morgado et al., 2013), requiring overlapping cognitive processes of reach simulation. However, despite the finding that similar motor resources appeared to be recruited for both behaviors, it could be that these resources are used differently for each behavior. This motor contribution to egocentric space perception may require body referencing, and we argue that in allocentric space, no such motor contribution is recruited. These findings suggest that in order to perceive the environmental layout surrounding a person, the viewer not only represents perceived space from sensory inputs, but also simulates the potential body and action interactions within space, supporting an embodied view of space perception (Coello and Delevoye-Turrell, 2007; Proffitt, 2013).

### **Acknowledgments**

This study was supported by grant FSR 2011 (ADi/DB/1058.2011) from the Fonds Spéciaux de Recherche of the Université catholique de Louvain (Belgium) and by grant 1.A.234.13 from the National Fund for Scientific Research (Belgium). SG is a research fellow and MP a research associate of the Fonds National de la Recherche Scientifique (Belgium). We thank Dominique Hougardy (Institut de Recherche en Sciences Psychologiques, Université catholique de Louvain) for his technical help in setting up the experimental environment.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Grade, Pesenti and Edwards. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## For your eyes only: effect of confederate's eye level on reach-to-grasp action

### *François Quesque and Yann Coello\**

*Psychology Department, Unité de Recherche en Sciences Cognitives et Affectives, Charles de Gaulle-Lille 3 University – University of Lille Nord de France, Villeneuve d'Ascq, France*

#### *Edited by:*

*Benoit Riou, Université Lyon 2, France*

#### *Reviewed by:*

*Cristina Becchio, Università degli Studi di Torino, Italy François Osiurak, Université de Lyon, France Richard Palluel Germain, Université de Grenoble, France*

#### *\*Correspondence:*

*Yann Coello, Psychology Department, Unité de Recherche en Sciences Cognitives et Affectives, Charles de Gaulle-Lille 3 University – University of Lille Nord de France, Rue du Barreau 59653 Villeneuve d'Ascq, France e-mail: yann.coello@univ.lille3.fr*

Previous studies have shown that the spatio-temporal parameters of reach-to-grasp movement are influenced by the social context in which the motor action is performed. In particular, when interacting with a confederate, movements are slower, with longer initiation times and more ample trajectories, which has been interpreted as implicit communicative information emerging through voluntary movement to catch the partner's attention and optimize cooperation (Quesque et al., 2013). Because gaze is a crucial component of social interactions, the present study evaluated the role of a confederate's eye level on the social modulation of trajectory curvature. An actor and a partner facing each other took part in a cooperative task consisting, for one of them, of grasping and moving a wooden dowel under time constraints. Before this *Main action*, the actor performed a *Preparatory action*, which consisted of placing the wooden dowel on a central marking. The partner's eye level was unnoticeably varied using an adjustable seat that matched or was higher than the actor's seat. Our data confirmed the previous effects of social intention on motor responses. Furthermore, we observed an effect of the partner's eye level on the *Preparatory action*, leading the actors to exaggerate unconsciously the trajectory curvature in relation to their partner's eye level. No interaction was found between the actor's social intention and their partner's eye level. These results suggest that other bodies are implicitly taken into account when a reach-to-grasp movement is produced in a social context.

**Keywords: perception, motor action, social intention, eye level, kinematics**

### **INTRODUCTION**

Humans live in social groups and spend much time engaging in cooperative actions (Richerson and Boyd, 1998) or interpreting observed behaviors (Barresi and Moore, 1996), even when there is no clear intention to interact with conspecifics (Frith and Frith, 2006). Motor actions have the special feature of being influenced by both the goal pursued (Marteniuk et al., 1987; Ansuini et al., 2006, 2008; Naish et al., 2013) and the social context in which they are performed (Ferri et al., 2011; Gianelli et al., 2011; Innocenti et al., 2012; Quesque et al., 2013; Scorolli et al., 2014). As a consequence, observers can detect, from kinematic variations in motor performances, the goal of an ongoing action before it is entirely executed (Orliaguet et al., 1997; Elsner et al., 2012; Stapel et al., 2012; Lewkowicz et al., 2013) and even the actor's social intention (Sebanz and Shiffrar, 2009; Manera et al., 2011; Sartori et al., 2011). For instance, Manera et al. (2011) showed that observers could readily categorize from movement information whether an object was grasped to perform an individual action or with the intention of socially cooperating. In line with this specific sensitivity to social cues borne by action, recent neuroimaging studies highlighted the capacity of the brain to discriminate very rapidly from the optic flow information relating to human bodies (see de Gelder et al., 2010 for a review) and bodily expressions (see Blake and Shiffrar, 2007 for a review). It has been suggested that the implicit process of socially relevant motor features could optimize cooperation between agents and contribute to the selection of adapted responses depending on the social demands (Gallagher, 2008).

The role of sensorimotor cues in social interactions is a particular aspect of human communication that originates from the very early motor experiences that infants share with their parents (Brand et al., 2002; Brand and Shallcross, 2008). The so-called "motionese" strategy reflects the fact that parents exaggerate their movements when addressing their children. Although less accentuated in later life, this effect does not seem restricted to childhood since changes in kinematics have also been observed when communication occurs between adults in pointing (Cleret de Langavant et al., 2011) and grasping tasks (Sartori et al., 2009). In the latter experiment, participants were asked to reach, grasp, and lift colored spheres for an individual or cooperative purpose requiring an observer to decode a message from the alternation of colors via a simplified Morse code. Although the goal of the motor action was identical in the two conditions for the actor, the reach-to-grasp movements were performed differently when there was a social communication constraint. More precisely, the reaching movements were slower with less straight trajectories in the communicative than in the non-communicative condition. Thus, it appears that when endorsing social intention – that is, when other actors are crucial elements for satisfying the intended goals (Ciaramidaro et al., 2007) – humans tend to modify the kinematics of their motor behaviors, even when there is no explicit instruction to communicate. In agreement with this, when actors move an object to allow a partner (rather than themselves) to perform a goal-directed action, they move and place the object using a more curved trajectory (Becchio et al., 2008b; Quesque et al., 2013) and a longer movement initiation time (Quesque et al., 2013). Such an increase in movement amplitude has been interpreted as an implicit strategy to catch the partner's attention and communicate social intention (Quesque et al., 2013); the movements being performed with a higher amplitude due to the partner's eye level representing a social target that influences the implementation of goal-directed action.

Supporting the assumption of an influence of eye level on cooperative tasks, several studies have pointed out the predominant role of gaze in human social interactions (Argyle and Cook, 1976; Kleinke, 1986; Langton et al., 2000; Becchio et al., 2008a) from the early days of life (Farroni et al., 2002). In comparison with other primate species, humans have especially visible eyes (Kobayashi and Kohshima, 1997) which renders their gaze direction much more salient, thus facilitating cooperative behaviors and joint actions. In studying how social context affects movement kinematics, recent research has led to the conclusion that the appropriate direction of a partner's gaze is a prerequisite to effective social interactions (Ferri et al., 2011; Innocenti et al., 2012; Scorolli et al., 2014).

In this context, the present study aimed to evaluate the effect of a partner's eye level on the execution of individual or cooperative voluntary reach-to-grasp movements. If hand elevation when performing an action in a social context is influenced by the height of a partner's eyes, as suggested by previous studies, hand trajectories would be expected to be higher when a motor action was performed in the presence of a partner taking a higher seated position. Furthermore, this study investigated whether the effect of eye level on the spatio-temporal parameters of motor responses depends on the communicative context, i.e., when a social intention is endorsed, or if it depends on a more implicit influence occurring even in the absence of any social interaction (e.g., Bateson et al., 2006; Ernest-Jones et al., 2011).

### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Twenty-one healthy, right-handed (as determined by the Edinburgh Handedness Inventory, Oldfield, 1971) adults (mean age = 21.05 years, SD = 1.96 years, four males) were tested. They had no prior knowledge about the scientific aim of the study and provided their written informed consent before participating. The protocol followed the general ethics rules defined by the local ethics committee and was in accordance with the principles of the Declaration of Helsinki (World Medical Organization, 1996). The experimenter (the first author of this paper) was a 24-year-old man who played the role of the social partner for all participants.

#### **APPARATUS AND STIMULI**

Participants and the partner sat on either side of a table (120 cm × 80 cm), facing each other. 2 cm × 2 cm black markings on the table indicated three specific locations, which will be hereinafter referred to as the initial, central and final positions. In addition, the starting positions used for the right hand of the participants and the partner were indicated by black markings located at each edge of the table. The object to be manipulated was a wooden dowel (diameter 2 cm, length 4 cm), which was to be moved from one spatial landmark to the next following a defined sequence, each movement in the sequence being triggered by an auditory cue (see **Figure 1**).

#### **PROCEDURE**

The task for the participants was to reach and grasp the wooden dowel using their thumb and index finger and move it from one position to the next in a sequence of three successive actions. Before performing each action, both the participants and the partner were requested to remain stationary with their thumb and index finger pinched together and resting upon the starting position. Each sequence started with the wooden dowel placed at the initial position. The first action was the *Preparatory action,* which consisted of moving the wooden dowel from the initial to the central position with no specific time constraints. The second action was the *Main action,* in which the wooden dowel was moved horizontally from the central to the final position as fast as possible. The third action was the *Repositioning action,* in which the wooden dowel was moved from the final to the initial position with no specific time constraints, thus setting up for the next sequence. Time constraints were thus only applied to the *Main action*, in which the velocity of the participant's wrist had to be more than 80% of the maximum reachable velocity (computed from the peak velocity recorded in a previous practice session, see below and Quesque et al., 2013 for a detailed description). Each movement was triggered by a specific auditory cue, always broadcast in the same order (cue 1 initiated the *Preparatory action*; cue 2 the *Main action;* cue 3 the *Repositioning action)*. Thus, participants and the partner had their right hand on the starting positions before initiating any of the movements in the sequence, while their left hand remained in their lap. When the participant or the partner was acting, the other person had to keep motionless. Furthermore, participants were not allowed to communicate and were asked to fix their gaze on the table during the course of the experiment in all sessions. 
In order to prevent participants from anticipating the time of movement initiation, between-sequences intervals varied randomly between 3 and 3.5 s. In addition, the interval between the first and second auditory cue was varied randomly between 3.5 and 4 s while the interval between the second and third auditory cue was fixed at 2 s in order to provide feedback on the participant's performance immediately after they had completed the *Main action*.
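The timing of the three auditory cues and the speed criterion for the *Main action* can be sketched as follows. This is a hedged illustration, not the software used in the experiment: the function names are hypothetical, intervals are assumed to be drawn from a uniform distribution, and the between-sequence interval is assumed to run from cue 3 of one sequence to cue 1 of the next.

```python
import random


def cue_schedule(n_sequences, seed=None):
    """Return (cue1, cue2, cue3) onset times in seconds for each sequence.

    Assumptions: uniform sampling of the random intervals, and a
    between-sequence interval measured from cue 3 to the next cue 1.
    """
    rng = random.Random(seed)
    t = 0.0
    schedule = []
    for _ in range(n_sequences):
        cue1 = t + rng.uniform(3.0, 3.5)      # between-sequence interval
        cue2 = cue1 + rng.uniform(3.5, 4.0)   # Preparatory -> Main action
        cue3 = cue2 + 2.0                     # fixed Main -> Repositioning
        schedule.append((cue1, cue2, cue3))
        t = cue3
    return schedule


def main_action_fast_enough(peak_velocity, max_velocity, criterion=0.8):
    """True when wrist velocity exceeds 80% of the practice-session maximum."""
    return peak_velocity > criterion * max_velocity


# One session of 25 sequences, as in the design described above.
session = cue_schedule(25, seed=1)
print(len(session), session[0])
```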

Participants performed four successive sessions of 25 sequences of action. In these sessions, the *Main action* was carried out by either the participant or the partner (block trials), with the seat of the partner being either at the same height as or higher than that of the participant (block trials). The eye level of the partner was manipulated using an adjustable seat, which was either at the same height as that of the participant (0 cm condition) or 5 cm higher (5 cm condition, counterbalanced order). In order to minimize the risk that participants detected this manipulation, the two height conditions were performed on different days separated by 1 week and the height of the seat was adjusted before the arrival of the participants. In each of these conditions, the participants and the partner performed the *Main action* in two block sessions presented in a counterbalanced order on the same day. Then, depending on the session, when performing the *Preparatory action*, participants could place the wooden dowel for themselves (personal intention) or for their partner (social intention).

#### **PRACTICE SESSIONS**

Before the experimental session started, all participants underwent two practice blocks, each containing 15 sequences of action. The first practice block was done to obtain an estimate of the maximum speed at which participants could grasp the wooden dowel from the central position and place it on the final position. An adjustment procedure similar to the one used in Quesque et al. (2013) was used. The second practice block was done to check that instructions were understood and that the different auditory cues were accurately identified and the appropriate motor responses provided.

#### **DATA RECORDING AND ANALYSIS**

Participants' motor performances were recorded using Qualisys 4 Oqus infrared cameras (Qualisys AB, Gothenburg, Sweden). Infrared reflective markers were placed on the forefinger (base and tip), thumb (tip) and wrist (scaphoid) of the right hand of participants. An additional marker was placed on the wooden dowel. Cameras were calibrated before each session, enabling the system to reach SD accuracies of less than 0.2 mm, at a 200 Hz sampling rate. Only the *Preparatory action* data were analyzed, because the social influence on motor performances can be estimated only from this action. The *Preparatory action* was characterized by a reaching phase (reach-to-grasp action) and a transport phase (moving the wooden dowel from the initial to the central position). The focus was on movement parameters that are known to be affected by social intentionality, namely reaction time, movement time, peak wrist velocity, and height of the trajectory in the reaching and transport phases (Becchio et al., 2008b; Quesque et al., 2013). Reaction time, movement time and trajectory elevation were computed from the 3D coordinates of the reflective marker placed on the wrist of participants. Temporal and kinematic parameters of the (x,y,z) coordinates of the wrist marker were computed from tangential velocity profiles after filtering the data using a second-order Butterworth dual pass filter (cutoff frequency: 15 Hz). Movement onset was defined as when the first velocity value reached 20 mm s−1. Movement end was defined as the time the velocity profile reached the minimum value following the peak velocity of the transport phase. Reaction time corresponded to the time separating the *Preparatory action* auditory cue from movement onset. Movement time corresponded to the time separating movement onset from movement end. Peak wrist velocity corresponded to the maximum velocity reached by the wrist during the grasping and transport phase, respectively. 
The maximum height of trajectory was defined as the maximum z coordinate of the wrist measured in the grasping and transport phases.
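The parameter definitions above can be illustrated with a minimal sketch. It assumes the wrist coordinates have already been low-pass filtered and treats a single movement rather than the separate reaching and transport phases; the function names and the `cue_index` argument are hypothetical and not taken from the original analysis.

```python
import math

FS = 200.0          # sampling rate in Hz (from the text)
ONSET_MM_S = 20.0   # movement-onset threshold in mm/s (from the text)


def tangential_velocity(xyz, fs=FS):
    """Speed in mm/s between consecutive 3D samples given in mm."""
    return [math.dist(a, b) * fs for a, b in zip(xyz, xyz[1:])]


def wrist_parameters(xyz, cue_index=0, fs=FS):
    """Hypothetical helper: timing and kinematic measures for one movement.

    Assumes `xyz` is an already filtered list of (x, y, z) wrist positions
    and that `cue_index` is the sample at which the auditory cue occurred.
    """
    v = tangential_velocity(xyz, fs)
    # Onset: first sample at or above the velocity threshold after the cue.
    onset = next(i for i in range(cue_index, len(v)) if v[i] >= ONSET_MM_S)
    peak = max(range(onset, len(v)), key=lambda i: v[i])
    # End: minimum of the velocity profile following the peak.
    end = min(range(peak, len(v)), key=lambda i: v[i])
    return {
        "reaction_time_ms": (onset - cue_index) * 1000.0 / fs,
        "movement_time_ms": (end - onset) * 1000.0 / fs,
        "peak_velocity_mm_s": v[peak],
        "max_height_mm": max(p[2] for p in xyz[onset:end + 1]),
    }
```

In a two-phase analysis, the same helper could be applied to each phase separately once the phase boundaries are known.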

Concerning reaction time, a 2 (Intention: Social vs. Personal) × 2 (Partner's eye level: 0 cm vs. 5 cm condition) ANOVA was conducted. Concerning movement time and kinematic parameters analysis, 2 (Intention: Social vs. Personal) × 2 (Partner's eye level: 0 cm vs. 5 cm condition) × 2 (Movement phase: Grasping vs. Transport) ANOVAs were conducted, all variables being associated with within-participants measures. The significance level was set at 0.05 and the problem of multiple comparisons was corrected using the Bonferroni method.

### **RESULTS**

Trials were excluded from the data analysis when a participant responded erroneously, when the marker was not correctly recorded during the movement, or when the reaction time was shorter than 200 ms or longer than 2.5 SDs from the median (Leys et al., 2013) computed from the *Preparatory actions*. In total, 5.2% of the trials, homogeneously distributed across the conditions, were thus excluded.
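The reaction-time criterion can be sketched as follows. The text only states "2.5 SDs from the median"; this sketch assumes that the deviation was estimated with the scaled median absolute deviation (MAD) approach advocated by Leys et al. (2013), which is an assumption rather than a documented detail of the analysis. The function name and the example values are hypothetical.

```python
from statistics import median


def keep_trial_mask(rts_ms, min_rt=200.0, k=2.5, scale=1.4826):
    """Return True for trials whose reaction time passes both criteria.

    Assumption: the upper bound is median + k * (scale * MAD), where the
    factor 1.4826 makes the MAD comparable to an SD for normal data.
    """
    med = median(rts_ms)
    mad = scale * median(abs(rt - med) for rt in rts_ms)
    upper = med + k * mad
    return [min_rt <= rt <= upper for rt in rts_ms]


# Example with made-up reaction times (ms): the first and last are rejected.
mask = keep_trial_mask([150, 480, 500, 520, 540, 560, 1500])
```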

#### **REACTION TIME**

Reaction time was influenced by social intention [*F*(1,20) = 50.69, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.72]. Participants showed a longer reaction time when they placed the wooden dowel for the partner (564 ms) than for themselves (480 ms). However, no effect of partner's eye level on reaction time [*F*(1,20) = 0.62, ns] was found, and there was no interaction between the two factors [*F*(1,20) = 0.99, ns; see **Figure 2**].
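For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an *F* ratio and its degrees of freedom using the standard identity below. The worked value uses the social intention effect on reaction time; the subscript names are illustrative notation, not the authors'.

$$
\eta_p^2 = \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}, \qquad \frac{50.69 \times 1}{50.69 \times 1 + 20} \approx 0.72
$$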

#### **MOVEMENT TIME**

Movement time was influenced by social intention [*F*(1,20) = 5.49, *p* < 0.05, η<sub>p</sub><sup>2</sup> = 0.22], increasing when a social (480 ms) rather than a personal (468 ms) intention was endorsed. Furthermore, movement time increased in the transport phase (485 ms) compared to the grasping phase [463 ms, *F*(1,20) = 7.15, *p* < 0.05, η<sub>p</sub><sup>2</sup> = 0.26]. However, no effect of partner's eye level on movement time [*F*(1,20) = 1.38, ns] was found, and there was no interaction between the three factors [social intention/movement phase: *F*(1,20) = 2.5, social intention/eye level: *F*(1,20) = 1.71, movement phase/eye level: *F*(1,20) = 2.25, social intention/movement phase/eye level: *F*(1,20) = 0.13, all ns; see **Figure 3**].

#### **WRIST ELEVATION**

Wrist elevation was influenced by social intention [*F*(1,20) = 8.01, *p* < 0.01, η<sub>p</sub><sup>2</sup> = 0.29], with a higher trajectory when participants endorsed a social (62.3 mm) rather than a personal (60.7 mm) intention. Wrist elevation was also influenced by the movement phase [*F*(1,20) = 73, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.78], with a higher trajectory during the transport (64.3 mm) than the grasping (58.3 mm) phase. Finally, wrist elevation was influenced by eye level [*F*(1,20) = 5.3, *p* < 0.05, η<sub>p</sub><sup>2</sup> = 0.21], with a higher trajectory when participants were in the presence of a partner who had a higher seat (5 cm condition, 63.4 mm) than a seat at the same height as theirs (0 cm condition, 59.6 mm). However, there was no interaction between the three factors [social intention/movement phase: *F*(1,20) = 0.01, social intention/eye level: *F*(1,20) = 0.29, movement phase/eye level: *F*(1,20) = 0.87, social intention/movement phase/eye level: *F*(1,20) = 1.17, all ns; see **Figure 4**].

#### **PEAK WRIST VELOCITY**

Peak wrist velocity was not influenced by social intention [*F*(1,20) = 0.39, ns], nor by eye level [*F*(1,20) = 0.88, ns]. However, it was influenced by the movement phase [*F*(1,20) = 34, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.63], with a lower peak wrist velocity during the transport (522 mm s−1) than the grasping (607 mm s−1) phase. This decrease was larger when endorsing a social (−101 mm s−1) rather than a personal (−70 mm s−1) intention, as shown by the significant intention/movement phase interaction [*F*(1,20) = 24.4, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.55]. None of the other interactions was significant [social intention/eye level: *F*(1,20) = 0.12, movement phase/eye level: *F*(1,20) = 1.37, social intention/movement phase/eye level: *F*(1,20) = 0.86, all ns].

### **DISCUSSION**

In the present study, we examined the role of the eye level of a confederate in the execution of individual or cooperative voluntary reach-to-grasp movements. First of all, our data confirm previous findings concerning the effect on an actor of endorsing social intention. Analyses of the *Preparatory action* revealed that participants took more time to initiate their action, which was performed at a slower speed and with a higher hand trajectory, when they placed the wooden dowel knowing that the *Main action* would be performed by the partner (Becchio et al., 2008b; Quesque et al., 2013). It is also worth noting that, although spatio-temporal variations relating to social intention were quite subtle (around 80 ms for reaction time, 20 ms for movement duration and 2 mm for wrist elevation), they were significant and consistent across different studies (Quesque et al., 2013). Taken together, these effects support the hypothesis that when endorsing a social intention, humans tend to exaggerate the spatio-temporal parameters of their movements, probably in order to facilitate the confederate's detection of the motor and social goals of the planned action, and thus improve cooperative situations. This interpretation is supported by the findings showing that humans tend to increase the amplitude of their actions when performing explicit intentional communicative (pantomime) compared to non-communicative (actual use) object-related movements (Hermsdörfer et al., 2006, 2012).

The novelty of this study is that the unnoticed modification of the body characteristics of a partner had a sharp effect on the spatio-temporal parameters of object-oriented voluntary action.

**FIGURE 3 | Movement duration as a function of social intention and partner's eye level for both (A) the reaching and (B) the transport phases of the** *Preparatory action***.** Error bars represent the SE.

In particular, modifying the partner's eye level had an effect on the *Preparatory action* with participants producing movements with a higher amplitude when the partner's seat was 5 cm higher than when it was at the same height as their own, suggesting that the properties of the other person's body are implicitly taken into account when producing a motor action in a social context. These modulations of hand trajectory when acting in a social context might reflect specific attention allocation to several sources of information, as requested by the task (Howard and Tipper, 1997; Diedrichsen et al., 2004). For example, one may speculate that when a person performs a voluntary motor action in the presence of a partner, the latter's eye level represents a spatial target influencing the movement parameters specified to reach a particular object in the environment. The fact that social context influences object-oriented motor actions has already been suggested in previous studies (e.g., Cleret de Langavant et al., 2011; Gianelli et al., 2011; Quesque et al., 2013). In particular, Cleret de Langavant et al. (2011) showed that trajectories of pointing movements performed after giving a verbal instruction to a confederate slightly shifted in the direction of the confederate compared to pointing movements performed in a non-communicative context. The new result here is that the height of the confederate is also considered when planning and executing object-oriented motor actions. Though the present study focuses on eye level, it is worth noting that other cues might have contributed to the observed effect, as for instance body, shoulder or head height, the bending of the head or even a change in arm posture. 
However, previous works have shown that gaze is a crucial component of social interactions (Argyle and Cook, 1976; Kleinke, 1986; Langton et al., 2000; Becchio et al., 2008a) and that gaze direction influences motor kinematics in a social context (Ferri et al., 2011; Innocenti et al., 2012; Scorolli et al., 2014). No previous study has shown that body height in itself or a change in arm posture influences motor kinematics in a social context. Taken together, these data suggest that eye level contributes to the social effect observed in the present study, though whether other information contributes to the effect remains an open question.

Strikingly, the effect of the partner's eye level appeared both when social intention was endorsed and when participants followed personal goals. In fact, no interaction was found between the factors Social intention and Partner's eye level, suggesting that even in the absence of communicative instructions, the gaze characteristics of a conspecific are taken into account when planning an object-oriented motor action. These results confirm the special importance of human bodies in motor performances in a social context (Cleret de Langavant et al., 2012). They also corroborate previous findings that the presence of conspecifics automatically leads to considering their perspectives (Mainwaring et al., 2003; Tversky and Martin Hard, 2009; Qureshi et al., 2010; Samson et al., 2010) and to processing objects in the environment with reference to them (Becchio et al., 2011). It remains possible that changing the eye level of the partner influenced the head-eye coordination strategies of participants, resulting in a change in movement control. However, although head-eye movements were not recorded, this could hardly account for the observed effect of the partner's eye level on movement amplitude since the participants and the partner had to keep their gaze on the wooden dowel when either was acting. It is worth noting that, in our study, the interactions between participants and the partner occurred in a pre-specified cooperative context. It thus remains to be established whether the influence of conspecific gaze characteristics on motor performances is still effective when the conspecifics are no longer partners but competitors. It would also be interesting to evaluate in future research whether the influence of body characteristics of conspecifics arises in a non-predefined communicative context and in a multi-agent social context.

In conclusion, although further investigations are necessary to unravel the effect of social intention on voluntary motor action, the present study demonstrates that the body characteristics of a partner, in particular their eye level, are implicitly taken into account when performing a motor action in cooperative and non-cooperative tasks. This suggests that the conspecific's body represents one of the crucial variables that constrain motor planning and execution.

#### **ACKNOWLEDGMENTS**

This study was supported by a grant from the French Research Agency ANR-11-EQPX-0023, FEDER SCV-IRDIVE and University Lille 3. The authors are grateful to Carol Robins for her helpful comments on the manuscript.

#### **REFERENCES**


to actual use and the effect of brain damage. *Exp. Brain Res.* 218, 201–214. doi: 10.1007/s00221-012-3021-z


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 September 2014; accepted: 17 November 2014; published online: 04 December 2014.*

*Citation: Quesque F and Coello Y (2014) For your eyes only: effect of confederate's eye level on reach-to-grasp action. Front. Psychol. 5:1407. doi: 10.3389/fpsyg.2014.01407 This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Quesque and Coello. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Children's looking preference for biological motion may be related to an affinity for mathematical chaos

*Joshua L. Haworth<sup>1,2,3</sup>\*, Anastasia Kyvelidou<sup>2</sup>, Wayne Fisher<sup>4</sup> and Nicholas Stergiou<sup>2,3</sup>*

*<sup>1</sup> Center for Autism and Related Disorders, Kennedy Krieger Institute, Baltimore, MD, USA, <sup>2</sup> School of Health, Physical Education and Recreation, University of Nebraska Omaha, Omaha, NE, USA, <sup>3</sup> College of Public Health, University of Nebraska Medical Center, Omaha, NE, USA, <sup>4</sup> Center for Autism Spectrum Disorders, University of Nebraska Medical Center, Omaha, NE, USA*

#### *Edited by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada*

#### *Reviewed by:*

*Joshua L. Williams, Armstrong Atlantic State University, USA Adam W. Kiefer, Cincinnati Children's Hospital Medical Center, USA*

#### *\*Correspondence:*

*Joshua L. Haworth, Center for Autism and Related Disorders, Kennedy Krieger Institute, 3901 Greenspring Avenue, Baltimore, MD 21211, USA haworth@kennedykrieger.org; JoshuaHaworth@gmail.com*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 29 November 2014 Accepted: 26 February 2015 Published: 17 March 2015*

#### *Citation:*

*Haworth JL, Kyvelidou A, Fisher W and Stergiou N (2015) Children's looking preference for biological motion may be related to an affinity for mathematical chaos. Front. Psychol. 6:281. doi: 10.3389/fpsyg.2015.00281*

Recognition of biological motion is pervasive in early child development. Further, viewing the movement behavior of others is a primary component of a child's acquisition of complex, robust movement repertoires, through imitation and real-time coordinated action. We theorize that inherent to biological movements are particular qualities of mathematical chaos and complexity. We further posit that this character affords the rich and complex inter-dynamics throughout early motor development. Specifically, we explored whether children's preference for biological motion may be related to an affinity for mathematical chaos. Cross recurrence quantification analysis (cRQA) was used to investigate the coordination of gaze and posture with various temporal structures (periodic, chaotic, and aperiodic) of the motion of an oscillating visual stimulus. Children appear to competently perceive and respond to chaotic motion, both in rate (cRQA-percent determinism) and duration (cRQA-maxline) of coordination. We interpret this to indicate that children not only recognize chaotic motion structures, but also have a preference for coordination with them. Further, stratification of our sample (by age) uncovers the suggestion that this preference may become refined with age.

Keywords: eye tracking, cross recurrence quantification analysis, sensorimotor, perception, complex systems

## Introduction

Developing children typically recognize biological motion in point-light displays (Fox and McDaniel, 1982). Interestingly, they specifically prefer to watch locomotion coherent with their own mode of locomotion; i.e., crawlers prefer to watch crawling whereas walkers prefer to watch walking (Sanefuji et al., 2008). Point-light studies such as this one provide insight into how children recognize and replicate movements of others in their social environment. The development of motor behavior relies, in part, on being able to incorporate the lessons learned from viewing others' attempts at similar motor performance. By watching others, we are able to vastly multiply our own experience and knowledge of successful movement strategies. Several experiments have provided evidence that very young children are able to attribute intentionality and action goals to human models of motor behavior (Hofer et al., 2005; Klein et al., 2006; Hauf, 2007). Similar findings demonstrate the ability to distinguish intentional behaviors when viewing both familiar and novel motor actions (Woodward, 1998, 1999; Hofer et al., 2007; Jovanovic and Schwarzer, 2007; Jovanovic et al., 2007) and also when viewing both live and televised models of motor performances (Meltzoff, 1988a; Klein et al., 2006). In other studies, it has been shown that when the visual information is contrived to the point that it becomes unreliable, the reliance on this information for action-production is averted (Longo et al., 2008). These studies collectively point to the richness of a child's perception of the movement behavior of other persons. Several additional investigations have sought to describe the nature and extent to which children are able to imitate movement behaviors of observed performers, either immediately or after a delay (Meltzoff, 1988b; Barr et al., 1996; Bremner, 2002; Herbert et al., 2006). 
Consistently, it is found that children 'develop' their ability to demonstrate delayed imitation throughout their early experiences, highlighting the complex milieu of sensory maturation, memory processes, and the formation of awareness of self and others.

Another key feature of these investigations that has received less attention is the notion that imitative behavior serves to directly foster the development of a rich motor behavioral repertoire, by eliciting motor repetition, and the production of potentially novel motor behaviors. From these findings others have built a case for the potential presence of a mirror neuron system in humans that is active from birth (Bertenthal and Longo, 2007). The differences in proposed mechanism behind the observed imitation are underpinned by the question of whether imitation is inherently an active or passive process. Regardless of the resolution of this mechanistic dispute, functional outcomes of motor imitation result in increases in motor behavioral experience.

We propose that imitation events are related to sensorimotor couplings resultant from the neural integration of oscillatory patterns of the viewed individual onto the viewer's actions. Description of these temporal patterns involves a class of variables which are derived from mathematical chaos (Abarbanel, 1996), and are useful in describing the temporal structure of motion, including that of the individual point-lights provided in discrimination studies, posture, and human gait. Computations developed from chaos and dynamical systems theories (non-linear analyses) have demonstrated a unique capacity to characterize biological movement (i.e., posture, gait, and cardioballistics) according to its inherent temporal structure of motion variability (Glass and Mackey, 1988; Harbourne and Stergiou, 2003; Stergiou et al., 2004). Healthy biological motion exhibits a complex variability, meaning it is neither rigid nor random (Stergiou et al., 2006). The ability to perceive this complexity may be a discriminating factor in identifying biological motion from point-light displays.

Therefore, the current project focused on assessing the influence of perceived object motion on concurrent sensorimotor behavior of young children (age 4–6 years). The children were presented with an oscillating visual stimulus moving with various temporal structures: periodic, chaotic, and aperiodic. Gaze and posture responses to the stimulus motion were measured to determine whether information about the quality of its motion was able to guide the responsive strategies of these motor systems for gaze and postural movement control. We expected that children would demonstrate an ability to coordinate both gaze and posture to the various motion structures, with a particular affinity for more biologically relevant, chaotic motion. In order to pursue further resolution, we subdivided our cohort into age-matched groups (4-, 5-, and 6-year-olds) to consider the possibility of developmental changes in sensorimotor responsiveness.

## Materials and Methods

### Participants and Procedures

Seventeen children participated in this study, ages 4–6 years. This included six boys and 11 girls, with average height of 112.8 ± 8.3 cm and average weight of 20 ± 5.8 kg. All children were verified to have normal vision and no neurological history. Typical development was confirmed using the Denver II scale (Frankenburg et al., 1992). Participants attended a single data collection session during which synchronous measures of eye movement and standing posture were collected while viewing a series of point-light stimulus motions (**Figure 1**). Children stood atop a force platform (Advanced Mechanical Technology Inc., OR6–7, with MSA-6 amplifier) where center of pressure data was collected at 50 Hz. Stimuli were presented on a 55″ 1920 × 1200 pixel LCD display, with a black curtain surround to block sight of objects in the peripheral visual field. FaceLab 4.5 (Seeing Machines, Acton, MA, USA) eye-tracking equipment was mounted on the monitor stand, and was used to track eye movements also at 50 Hz. This sampling rate was selected as the highest common frequency across all equipment, and is also sufficient to capture the dynamics of the measured gaze and posture behaviors. Stimulus velocity was designed to prevent saccade or rapid postural perturbation. Custom LabView (National Instruments, Austin, TX, USA) software was designed to synchronize all data collection and stimulus display. Procedures were approved by the Institutional Review Board of the University of Nebraska Medical Center, and consent was obtained from the parent(s) of each child before participating.

FIGURE 1 | Child participant during data collection, standing on a force platform while watching a point-light stimulus oscillate left–right on the display monitor (motion not shown).

### Stimulus Presentation

The motion of the stimulus differed across three conditions by scaling temporal complexity, including periodic (Sine), pseudoperiodic (Chaos), and aperiodic (Brown Noise) motion structures. Each signal was created in Matlab (MathWorks, Natick, MA, USA) with a length of 10500 data points, which represents three and a half minutes of continuous stimulus motion when displayed at 50 Hz. The generated data series were displayed via the main LabView application during each testing condition. The Sine signal was generated with the embedded sin() function. This signal represents simple periodic redundancy, similar to what would be seen from a frictionless clock pendulum. Chaos is a complex signal that was created from the horizontal aspect of the distal position of a two-linkage, double-pendulum model which has previously been shown to emulate the dynamics of human posture (Suzuki et al., 2012). Shinbrot et al. (1992) assert that such a model would have sufficient degrees of freedom to afford chaotic dynamics. Surrogation analysis (Theiler et al., 1992) confirmed that the generated signal exhibited chaotic dynamics. Brown Noise demonstrates aperiodic dynamics, and is equivalent to integrated white noise (which is perfectly stochastic). Brown Noise was created by iterative perturbation of the position of the stimulus, by random direction and distance (within a specified boundary distance). This signal structure was selected as it provides an aperiodic motion structure, but also affords continuous smooth pursuit eye movement responses. Furthermore, early work by Collins and Deluca (1994) provided the observation that human posture expresses Brown motion.
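As a rough illustration of the Sine and Brown Noise conditions, the following sketch generates both signals in Python rather than Matlab. The oscillation frequency, step size, boundary, and clamping rule are assumed values chosen for illustration, since the text does not report them; the Chaos signal is omitted because it requires the double-pendulum model.

```python
import math
import random

FS = 50      # display rate in Hz (from the text)
N = 10500    # samples per signal, i.e. 210 s at 50 Hz (from the text)


def sine_signal(freq_hz=0.25, n=N, fs=FS):
    """Periodic stimulus; freq_hz is an assumed value, not from the paper."""
    return [math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]


def brown_noise(n=N, max_step=0.01, bound=1.0, seed=0):
    """Random walk: each step has a random direction and distance.

    max_step and bound are assumed values. The position is clamped to
    [-bound, bound], which is one possible reading of the boundary rule.
    """
    rng = random.Random(seed)
    pos, out = 0.0, []
    for _ in range(n):
        step = rng.uniform(-max_step, max_step)
        pos = max(-bound, min(bound, pos + step))
        out.append(pos)
    return out


sine = sine_signal()
brown = brown_noise()
```

Accumulating bounded random steps in this way is equivalent to integrating white noise, which is what gives the signal its aperiodic but smooth character.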

### Data Processing

Trials lasted for three and a half minutes, with data recorded at 50 Hz. Gaze represents the on-screen pixel coordinate to which the participant was looking throughout the trial. Center of pressure was recorded as the measure of posture. Only mediolateral aspects of posture and gaze were further processed, to provide assessment of response to the horizontal motion of the stimulus. Gaze data was treated to remove blink events that occurred during collections. When the eye-tracker is unable to track the eyes, it reports a zero (0) value for gaze position. These values were removed from the time series, and replaced using a fifth order cubic spline (Matlab, interp1 function). Gaze and posture data were then filtered using a double pass Butterworth filter with a 10 Hz cutoff (Mathworks Inc.), which corresponds to previous observations of standing posture in children. Further processing included normalization and the identification of segments during which the child was not speaking or making overt motions with their head or arms. The longest common segment was 1500 data points, or 30 s of continuous data. These segments were selected and submitted for analysis. Cross recurrence quantification analysis (cRQA) was used to assess coupling of gaze (Gaze) and posture (COP) to the stimulus, separately, as well as between gaze and posture to gauge sensorimotor coupling (SensMot).
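The blink-handling step can be sketched as below. Note that this sketch substitutes simple linear interpolation for the spline interpolation described in the text, purely to keep the example short and dependency-free; the function name is hypothetical, and the subsequent Butterworth filtering is not shown.

```python
def fill_blinks(gaze, missing=0.0):
    """Replace tracker dropouts (reported as 0) by linear interpolation.

    Linear interpolation is a stand-in here; the original processing
    used spline interpolation via Matlab's interp1.
    """
    valid = [i for i, g in enumerate(gaze) if g != missing]
    if not valid:
        raise ValueError("no valid gaze samples")
    out = list(gaze)
    # Hold the nearest valid value at the start and end of the series.
    for i in range(valid[0]):
        out[i] = gaze[valid[0]]
    for i in range(valid[-1] + 1, len(gaze)):
        out[i] = gaze[valid[-1]]
    # Interpolate across each gap between neighbouring valid samples.
    for a, b in zip(valid, valid[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = gaze[a] + t * (gaze[b] - gaze[a])
    return out


filled = fill_blinks([0.0, 100.0, 0.0, 120.0, 0.0])
```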

### Cross Recurrence Quantification Analysis

The cRQA tests the relative likelihood of recurrence of behaviors across two time series in a common embedded phase space (Zbilut et al., 1998; Shockley et al., 2002; Shockley, 2005). Parameters of time delay (from average mutual information algorithm; Fraser and Swinney, 1986) and embedding dimension (from False Nearest Neighbors algorithm; Abarbanel, 1996) are used to unfold the signals into phase space (Takens, 1981). These algorithms indicated that values of 15 and 6, respectively, were appropriate for the data in this study. It must also be decided what proximity of two points would be required in order for them to be considered recurrent. This typically involves establishing a distance threshold, or radius (Shockley, 2005). We chose to algorithmically select the radius for each data series, such that a common rate of recurrence (5%) could be established across all data. This approach was chosen to maintain consistency amongst all comparisons and to prevent saturation of determinism (Shockley, 2005). We evaluated the selected radius, group-wise and condition-wise, using a 3 × 3 (Age × Stimulus) Mixed ANOVA, which is analogous in a way to using a fixed radius and evaluating percent recurrence (**Table 1**). We also applied a minimum line length (minline) threshold of 25, such that only pairs of points which are contiguously recurrent for 25 time points will be considered to form lines of recurrence. At a sampling rate of 50 Hz, then, recurrent lines mean that the two data series are coordinated for a minimum of 500 ms, which will prevent short coincidental events, e.g., saccades, from being considered as coordination events. We also ran the COP and SensMot data with a more traditional minline value of 2, to allow better resolution of the postural data for percent determinism. Group- and condition-wise comparisons were similar regardless of the minline selected. Therefore, we present the results for minline of 25 for all signals, to maintain consistency.
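The embedding and radius-selection steps described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, Euclidean distance is assumed, and the radius is taken as the smallest value that yields at least the target recurrence rate.

```python
import math


def embed(series, dim=6, delay=15):
    """Time-delay embedding: row i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n = len(series) - (dim - 1) * delay
    return [[series[i + k * delay] for k in range(dim)] for i in range(n)]


def cross_distances(a, b, dim=6, delay=15):
    """Euclidean distance between every embedded point of a and every one of b."""
    ea, eb = embed(a, dim, delay), embed(b, dim, delay)
    return [[math.dist(p, q) for q in eb] for p in ea]


def radius_for_rate(dist, rate=0.05):
    """Smallest radius at which at least `rate` of all point pairs recur."""
    flat = sorted(d for row in dist for d in row)
    k = max(1, math.ceil(rate * len(flat)))
    return flat[k - 1]


def cross_recurrence(dist, radius):
    """Boolean cross-recurrence matrix: True where two points are within radius."""
    return [[d <= radius for d in row] for row in dist]
```

With a delay of 15 and a dimension of 6, a 1500-point segment unfolds into 1500 − 5 × 15 = 1425 embedded points, so the full cross-recurrence matrix has 1425 × 1425 entries.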

TABLE 1 | Condition-wise reporting of radius values selected to maintain 5% recurrence across all signal comparisons.


*COP<sub>radius</sub> was found to differ across stimulus condition (p < 0.001, η<sub>p</sub><sup>2</sup> = 0.599, 100% observed power), with pairwise comparisons indicating differences between all conditions. No group-wise differences or interactions were found.*

The cRQA output includes percent determinism and maxline, representing the probability and duration (respectively) of recurrent behavior (Shockley et al., 2002; Shockley, 2005). Percent determinism is calculated as the number of recurrent points that form lines divided by the total number of recurrent points, reported from 0 to 100%. So, if no lines are formed, then percent determinism will be 0%. Conversely, if all recurrent points form lines, then percent determinism will be 100%. Maxline represents the longest bout of continuous recurrence between the two signals, measured in number of consecutive points. With minline set at 25, the smallest value of maxline possible is also 25. The upper limit is the length of the data in phase space, which would occur only if the two behaviors were coordinated at every time step of the trial. With data sampled at 50 Hz, each 50 point increment of maxline relates to 1 s of experiment time wherein the signals coordinate.
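The two outcome measures can be computed from a boolean cross-recurrence matrix as sketched below. The function names are hypothetical, and the sketch returns a maxline of 0 when no line reaches the minline threshold, which is a convention chosen here rather than one stated in the text.

```python
def diagonal_line_lengths(rec):
    """Lengths of all runs of consecutive recurrent points along each diagonal."""
    rows, cols = len(rec), len(rec[0])
    lengths = []
    for offset in range(-(rows - 1), cols):
        run = 0
        i, j = max(0, -offset), max(0, offset)
        while i < rows and j < cols:
            if rec[i][j]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
            i += 1
            j += 1
        if run:
            lengths.append(run)
    return lengths


def determinism_and_maxline(rec, minline=25):
    """Percent determinism (0 to 100) and the longest diagonal line."""
    lengths = diagonal_line_lengths(rec)
    total = sum(lengths)  # each recurrent point lies in exactly one diagonal run
    on_lines = [n for n in lengths if n >= minline]
    pct_det = 100.0 * sum(on_lines) / total if total else 0.0
    maxline = max(on_lines, default=0)
    return pct_det, maxline
```

Dividing the returned maxline by the 50 Hz sampling rate converts it to seconds of continuous coordination.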

### Statistical Analysis

Separate 3 × 3 (Age × Stimulus) Mixed ANOVAs were used for statistical comparisons, for percent determinism and maxline for each of Gaze, COP, and SensMot. This design provided the opportunity to explore the primary hypothesis of the current study, regarding a main effect of stimulus. Planned, follow-up pairwise comparisons (LSD method) were used to identify where differences occurred, if any. The 2-way design additionally provided for assessment of the possible influence of age and/or interactions between age and stimulus. This second assessment is slightly more speculative and is underpowered for the current report, with only six 4-year-olds, six 5-year-olds, and five 6-year-olds, but provides strong support for extended investigations into the effect of age. Significance level was set to 0.05 for all comparisons.

## Results

### Stimulus

The statistical analysis produced significant results with respect to the main effect of stimulus (**Figure 2**). Gaze significantly responded to the structure of the stimulus motion with changes in percent determinism (*p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.691, 100% observed power). Planned, follow-up testing indicated that rate of coordination (measured by percent determinism) of gaze to stimulus motion was similar in response to Sine and Chaos conditions (*p* = 0.906), but was lower in the Brown Noise condition than in either of them (*p* < 0.001 for both Sine and Chaos to Brown Noise). Duration of coordination (measured by maxline) also responded to the structure of stimulus motion (*p* < 0.028, η<sub>p</sub><sup>2</sup> = 0.225, 67.5% observed power). Follow-up testing showed that the difference between Sine and Chaos was not significant (*p* = 0.132); however, differences between Sine and Brown Noise (*p* = 0.021) and between Chaos and Brown Noise (*p* = 0.040) were significant.

A significant stimulus effect was found for COP percent determinism (*p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.749, 100% observed power), but not for COP maxline. Planned, follow-up testing indicated significant differences among each of the three conditions, with the highest rate of coordination in response to the Sine stimulus and the lowest rate of coordination in response to the Brown Noise stimulus (*p* < 0.01 for all comparisons). No main effect of stimulus was found for SensMot, for either percent determinism or maxline.

### Age and Interactions

No main effect of age was found for any outcome for Gaze, COP, or SensMot. Moreover, no interactions were found for any outcomes for Gaze, COP, or SensMot. However, when examining more closely the responses of each age group to each stimulus (**Figure 3**), it is noticeable that the 4-year-old group seems to have a consistently short maxline response to each stimulus and lacks an increased response to chaos. These observations were further explored through the main effect of age (*p* = 0.406, η<sub>p</sub><sup>2</sup> = 0.129, 28.3% observed power) and the interaction of stimulus and age (*p* = 0.174, observed power = 34.2%), neither of which was significant.

shown here. ∗indicates differences with *p* < 0.05.

## Discussion

Our results showed that typically developing children are able to coordinate aspects of their gaze and postural behaviors with the motion of a dynamic stimulus. Further, children appear to competently perceive and respond to chaotic motion, both in rate (percent determinism) and duration (maxline) of coordination. Being able to address complex motion might be a fundamental aspect of biological motion recognition. This elucidates a possible mechanism for the initial proposition that imitation behavior among children might be facilitated by an ability to perceive, and respond to, the chaotic complexity of viewed actions.

Chaotic motion is known to be inherent in human movement (Stergiou et al., 2006; Haworth et al., 2013), and may serve as a critical perceptive feature in biological motion recognition. It is important to know, though, whether only general features of motion are observed, or whether even the minutiae of movement variability are available to a viewer. Our results argue for an ability to coordinate gaze, and in some respects posture, even with complex motion structures, echoing the complex stimulus behavior even at the level of fine detail. The lack of similar responses to brown noise indicates that motion structure coordination is possible as long as some form of deterministic order can be found in the stimulus motion. This may relate to the intentionality found in human movements, which operates through potentially infinite degrees of freedom to become realized as actual movements (Bernstein, 1996).

FIGURE 3 | Although no main effect of age (or interaction) was found, we noticed an interesting trend in the maxline of each age group. In particular, the 4-year-old group appears to have the least response to the presence of chaotic dynamics∗. Replication with larger group sizes could provide greater power, and thus potential identification of significant differences between age groups.

The critical benefit of chaotic dynamics may be its underlying deterministic variability; in other words, its non-random unpredictability. Riley and Turvey (2002) provide extensive support for the notion that movement variability represents exploratory strategies, not 'noise,' and is a characteristic feature of a healthy mover. Exploration is also well known to be an integral part of the child's experience (Campos et al., 2000). It is important that children attempt new skills in new ways, finding their way to strategies that work best for themselves within their own self-typical environments (Adolph and Joh, 2009). Observation of others can be a major source of information in this process, allowing a child to access the action-effects of particular behaviors through a sort of egocentric proxy (Siegler, 1996; Siegler et al., 2010). Observation and imitation of others is both a natural and efficient way to gain added movement experience, conceivably even at the fine detail level of the particular movement. Bertenthal and Longo (2007) propose that a mirror neuron system operates in this direct mapping of viewed movements onto the motor cortex of the viewer, suggesting that even highly complex motions can be mapped in real-time.

Following this line of thought, we expected to find strong coupling of posture to viewed motion dynamics, particularly in response to chaotic motion. We found, though, that children did not seem to transfer the viewed motion structure to the regulation of posture in the way expected. Children generally coordinated posture with any stimulus for up to 200 data points at a time, which works out to roughly 4 s, and showed no difference across stimuli with regard to duration of coordination (maxline). Percent determinism showed, though, that posture was most often coordinated with the rhythmic stimulus and least often coordinated with the arrhythmic stimulus. We contend that there are two possible explanations for this phenomenon. Firstly, the nature of the task may not have allowed for coupling of posture to the stimulus. We should note here that the instructions given to the subjects were to stand on the force plate as naturally as possible and look at the screen. No instruction was given to them on tracking the stimulus with their gaze or body, which may explain the lack of continuous coordination of posture with the visual stimulus. Secondly, inertial effects may have influenced the ability to coordinate posture with the nuanced variance of the chaotic motion stimulus. In contrast to the eye, the body takes considerably more force to move and may rely on reflexes with longer time delays. These results partially echo those of Scharli et al. (2012), who found weaker sensorimotor coordination in children of this age and evidence of a mechanical component. They argue that vision and posture should be regarded as individual systems, with *only mechanical interdependence* (Scharli et al., 2012). Our data show intermittent, non-trivial coordination (bouts of up to four continuous seconds) of posture to stimulus, which provides evidence of active sensorimotor coordination and thus *dynamic informational interdependence*.
Strong evidence for such sensorimotor coupling has been found in adults during both posture and gait (Kay and Warren, 2001; Stoffregen et al., 2006; O'Connor and Kuo, 2009; Giveans et al., 2011). It seems plausible that the results of our current study imply an immature system that is not yet able to coordinate sensorimotor processes successfully over long durations. Hong et al. (2008) discuss at length the evolution of complexity and internal coordination of posture during sitting. They show that with age (from children at 6 years, to children at 10 years, to young adults at 18–23 years) there seems to be a release of control over internal degrees of freedom, which leads to a relatively more adaptable posture. Possibly, children merely have a shorter sensorimotor coordination 'attention' span than adults.

This provokes the question of whether cognitive awareness ('attention') is necessary to translate viewed information into subsequent movement behavior. Olivier et al. (2008) reported that children ranging from 4 to 11 years of age show considerable changes in the control of posture, with a compounding influence of attention. Donker et al. (2007) have reported similar observations that attention affects postural organization. Our results of gaze maxline response to various motion structures (**Figure 3**) reveal the possibility that younger children possess a less diverse attention than their older peers. We wonder if, in line with the work of Hong et al. (2008), development brings with it a general complexification of behaviors, including posture and gaze. If so, this may confound our perspective of the results of the current study regarding sensorimotor coordination.

Finally, this work could become a point of focus for those looking to foster the development of a child. Therapists could potentially focus on the development of sensorimotor 'attention' as an indirect means toward increased functional movement behavior. Essentially, children could be encouraged to take advantage of behavioral imitation to motivate the acquisition of motor experience in ways that increase awareness of successful variations of movement strategies.

## Acknowledgments

Funding was provided by a Dennis Weatherstone Predoctoral Fellowship (Autism Speaks grant #7070, awarded to author JH), with additional support for materials from the American Society of Biomechanics. Authors AK and NS currently receive support from the National Institutes of Health Centers of Biomedical Research Excellence (1P20GM109090-01). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Haworth, Kyvelidou, Fisher and Stergiou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Starting off on the right foot: strong right-footers respond faster with the right foot to positive words and with the left foot to negative words

*Irmgard de la Vega\*, Julia Graebe, Leonie Härtner, Carolin Dudschig and Barbara Kaup*

*Department of Psychology, Language and Cognition, University of Tübingen, Tübingen, Germany*

Recent studies have provided evidence for an association between valence and left/right modulated by handedness, which is predicted by the body-specificity hypothesis (Casasanto, 2009) and also reflected in response times. We investigated whether such a response facilitation can also be observed with foot responses. Right-footed participants classified positive and negative words according to their valence by pressing a key with their left or right foot. A significant interaction between valence and foot only emerged in the by-items analysis. However, when dividing participants into two groups depending on the strength of their footedness, an interaction between valence and left/right was observed for strong right-footers, who responded faster with the right foot to positive words, and with the left foot to negative words. No interaction emerged for weak right-footers. The results strongly support the assumption that fluency lies at the core of the association between valence and left/right.

Keywords: embodiment, body-specificity hypothesis, handedness, footedness, emotional valence, fluency

#### *Edited by:*

*Lionel Brunel, Université Paul-Valéry Montpellier III, France*

#### *Reviewed by:*

*Denis Brouillet, Université Paul-Valéry Montpellier III, France; Audrey Milhau, Université Charles-de-Gaulle – Université Lille III, France*

#### *\*Correspondence:*

*Irmgard de la Vega, Department of Psychology, Language and Cognition, University of Tübingen, Schleichstraße 4, 72076 Tübingen, Germany irmgard.delavega@uni-tuebingen.de*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 10 December 2014 Accepted: 01 March 2015 Published: 20 March 2015*

#### *Citation:*

*de la Vega I, Graebe J, Härtner L, Dudschig C and Kaup B (2015) Starting off on the right foot: strong right-footers respond faster with the right foot to positive words and with the left foot to negative words. Front. Psychol. 6:292. doi: 10.3389/fpsyg.2015.00292*

### Introduction

Recent research has provided evidence for an association between positive/negative and left/right. In a series of studies, Casasanto (2009; see also Casasanto and Chrysikou, 2011, and Casasanto and Henetz, 2012) showed that positive valence is associated with the dominant hand or side, and negative valence with the non-dominant hand. For example, right- and left-handers were presented with pairs of novel objects, one located on the right side, one located on the left, and had to decide which of these objects was more attractive or happier. Results showed that handedness influenced this decision: right-handers tended to ascribe these positive characteristics to the object located on the right, whereas for left-handers, the opposite pattern emerged: they tended to choose the object on the left for these positive characteristics. This association between valence and left/right is also reflected in response times: when participants classify positive and negative words according to their valence, right-handers respond faster to positive words with their right hand compared to their left hand, and faster to negative words with their left hand compared to their right hand. Left-handers, on the other hand, show the opposite pattern: they respond faster with their left hand to positive stimuli, and with their right hand to negative stimuli (de la Vega et al., 2012). This pattern – faster responses to positive items with the dominant hand, faster responses to negative words with the non-dominant hand – even shows when participants hold their hands crossed (de la Vega et al., 2013). Thus, it appears that the interaction between valence and left/right has its origin in the different experiences individuals have with their hands, and is therefore also modulated by handedness.

How does this association between valence and left/right emerge? A plausible assumption is that motor fluency – the ease with which an action is performed (see Oppenheimer, 2008) – lies at the core of this association. The dominant hand, used for the vast majority of manual actions in everyday life, such as writing or using a knife, is, of course, much more fluent than the non-dominant hand. Motor fluency is therefore directly associated with the dominant hand. A high degree of fluency, in turn, is associated with positive affect (see Reber et al., 1998; Winkielman and Cacioppo, 2001; Winkielman et al., 2003; Reber et al., 2004; Beilock and Holt, 2007; Oppenheimer, 2008). It seems that motor fluency serves as a link between positive valence and dominant hand, and negative valence and non-dominant hand (see also Casasanto and Chrysikou, 2011; see Milhau et al., 2014, for the influence of situated motor fluency on the interaction between valence and left/right). This impact of everyday experiences of individuals in their physical environment, reflected in the association between valence and left/right, is in line with the body-specificity hypothesis (Casasanto, 2009), which postulates that mental representations shaped through interactions with the physical environment should differ for individuals with different bodies and, consequently, different experiences (see also Willems et al., 2009, 2010; Casasanto and Jasmin, 2010; Brookshire and Casasanto, 2012; Brunyé et al., 2012). With respect to its theoretical orientation and underlying assumptions, the body-specificity hypothesis can be embedded in theories of embodied cognition (e.g., Glenberg, 1997, 2010; Barsalou, 1999, 2008; Glenberg and Kaschak, 2002; Barsalou et al., 2003; Zwaan, 2004; Zwaan and Madden, 2005; Fischer and Zwaan, 2008).

Of course, most individuals do not only have a dominant hand, but also a dominant foot. Although the difference in motor fluency between the left and right foot does not necessarily come to attention in everyday life as often as the difference in fluency between the left and right hand, there are situations when footedness becomes relevant. For example, any hardcore football fan – or at least any coach worth his money – knows that when it comes to football, footedness matters. Most football players prefer one foot to kick the ball; for example, it is well known among fans that Diego Armando Maradona scored his goals mostly with the left foot (and once with the help of his left hand; Gutwinski et al., 2011), and a player managing to score a goal with his "wrong" foot often earns jubilant praise from coach, teammates, and commentator. Interestingly, while being left-handed is still seen as a disadvantage in certain cultures, being left-footed can even have advantages in football (see Porac and Coren, 1981; see also McMorris and Colenso, 1996; Baumann et al., 2011; Memmert et al., 2013).

Of course, there are many other situations in everyday life where footedness plays a role, although we may not always be aware of it. For example, footedness determines which foot we use when we step on a chair, or which foot we use as the front foot when snowboarding. However, in contrast to handedness, the notion of the preferred foot is not always clear (Gabbard and Hart, 1996). Peters (1988; see also Gabbard and Hart, 1996) defines the preferred foot as the foot used to interact with an object (e.g., kicking a ball; picking up a pebble) or the one that leads out (e.g., when jumping or when stepping up on a chair). The non-preferred foot supports the preferred foot and stabilizes its activities.

In analogy to handedness, most individuals prefer their right foot for motor actions such as kicking a ball (Coren and Porac, 1980; Gabbard and Iteya, 1996). This bias toward the right side, however, is more pronounced in handedness (around 90%; Peters et al., 2006) than in footedness (around 80%; Porac and Coren, 1981). There is evidence that footedness is a better predictor of cerebral lateralization than handedness (Elias et al., 1998), which has been attributed to less social pressure when using the left foot in comparison to the left hand (Chapman et al., 1987; see also Meng, 2007, and Tran et al., 2014). Interestingly, although research indicates that for most individuals, lateral preference is the same for hand and foot, there is, in fact, an important difference between right- and left-handers. Peters and Durding (1979), for example, found that of 56 right-handers, 95% preferred their right foot for kicking a ball, whereas of 56 left-handers, only 50% preferred their left foot for this task (Chapman et al., 1987, report similar numbers when measuring the foot preference of left- and right-handers on their 11-item foot preference inventory).

Moreover, foot preference seems to undergo a shift during childhood. Gentry and Gabbard (1995) noted that whereas 26% of 4- and 8-year-olds showed a mixed foot preference, a significant shift toward a preference of the right foot was observed for children from the age of 8 to the age of 11 years: the prevalence of right-footedness for 4- and 8-year-olds, 66%, differed significantly from the prevalence of right-footedness for 11- to 20-year-olds, 81%. There was no significant difference between 4- and 8-year-olds, nor between the different groups between 11 and 20 years, indicating that after the age of 11, no major shift of footedness preference occurs (see also Gabbard and Iteya, 1996, for a review of the data indicating prevalence of footedness and handedness in children and adults, and Porac, 1996).

In the study described here, we investigated whether the motor fluency of the dominant foot influences – in analogy to the fluency of the dominant hand – the association between valence and left/right. In contrast to handedness, footedness does not play a role in everyday life for most individuals, or only marginally. Moreover, the difference in fluency between dominant and non-dominant foot should be much smaller than the one between dominant and non-dominant hand, given that for many actions such as walking or running, both feet are used. Additionally, in everyday life and in contrast to manual actions such as writing or sewing, we usually do not perform any fine motor actions with our feet. In spite of these differences between manual actions and motor actions performed with a foot, most people should show a difference between their dominant and non-dominant foot with respect to the degree of fluency, although this difference might be less pronounced than the one between the dominant and non-dominant hand. According to the body-specificity hypothesis (Casasanto, 2009), this difference should then lead to an association of the dominant foot and positive affect.

To assess the potential association between dominant/non-dominant foot and positive/negative valence, we followed the procedure reported in de la Vega et al. (2012, Experiments 2 and 3), with the difference that participants used their feet to respond instead of their hands. We presented positive and negative words to participants, who classified them according to their valence. They pressed a key with their right foot in response to positive words and with their left foot in response to negative words, or the other way around (response with the right foot to negative words, response with the left foot to positive words). If the dominant foot is associated with positive affect due to its greater degree of fluency, responses to positive words should be faster with the dominant foot, and responses to negative words faster with the non-dominant foot, in analogy to previous findings concerning manual responses (de la Vega et al., 2012, 2013). We decided to assess only right-footers in this first study, although the pattern emerging for their dominant vs. non-dominant foot should be the same for left-footers' dominant vs. non-dominant foot.

### Experiment

### Method

#### Participants

All participants gave informed consent. Forty volunteers participated in the experiment. Only right-footed participants were included in the analyses, reducing the total number of participants to 37 (9 male). Mean age of participants was 23.2 years (*SD* = 2.7). All remaining participants were native German speakers and right-handers, according to a translated version of the Edinburgh Handedness Inventory (Oldfield, 1971; *M*<sub>handedness score</sub> = 83.68; *SD* = 18.24). We assessed footedness with a self-constructed questionnaire adapted from Chapman et al. (1987). This questionnaire contained adapted versions of the five footedness items with the highest validity, as described in Chapman et al. (1987). Participants indicated their preferred foot for the following actions: try to kick a ball into a basket; write your name in sand; after writing the name, smooth the sand; roll a golf ball around a printed circle as rapidly as possible; kick as high as possible on a wall. Participants indicated their preference by ticking a box on the left, on the right or – in case they did not prefer a foot – by leaving both boxes blank. A ticked box on the right counted as 1, a ticked box on the left as −1. Participants could therefore obtain a footedness score between −5 (strong left-footer) and +5 (strong right-footer). The mean score obtained in this footedness inventory was 4.27 (*SD* = 0.99).
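The scoring rule described above can be sketched in a few lines of Python. The item labels are shorthand invented here for the five questionnaire actions, and scoring a blank answer as 0 is an assumption implied by the stated range of −5 to +5 rather than something the text spells out.

```python
# Hypothetical shorthand labels for the five questionnaire actions.
ITEMS = (
    "kick_ball_into_basket",
    "write_name_in_sand",
    "smooth_sand",
    "roll_golf_ball_around_circle",
    "kick_high_on_wall",
)

# Right box ticked counts +1, left box ticked counts -1.
# Assumption: an item left blank (no preference) contributes 0.
POINTS = {"right": 1, "left": -1, None: 0}


def footedness_score(responses):
    """Sum the item scores; `responses` maps item label -> 'right', 'left', or None."""
    return sum(POINTS[responses.get(item)] for item in ITEMS)


# Example: right foot for four actions, no preference for one.
example = {item: "right" for item in ITEMS}
example["smooth_sand"] = None
print(footedness_score(example))
```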

#### Materials and Apparatus

The material and procedure employed during the response time study were the same as in Experiments 2 and 3 in de la Vega et al. (2012), with the difference that participants responded with their feet instead of with their hands. Forty German words (20 positive, 20 negative) were used. Forty German pseudowords served as fillers. The words were matched with regard to their frequency, but not with regard to arousal (for details, see de la Vega et al., 2012). Responses were collected with the help of a computer keyboard placed on the floor. The keyboard had a self-constructed overlay with one key on the left, and one key on the right. Participants pressed the right key (the key END) with their right foot, and the left key (the key TAB) with their left foot.

#### Procedure and Design

Right before the response time study, participants were asked to dribble a small ball around obstacles (plastic bottles filled with water). They did this first with one foot, afterward with the other foot, to get a feeling for which foot they might prefer. After having done this, they indicated their foot preference and filled out the translated version of the Edinburgh Handedness Inventory (Oldfield, 1971) and the footedness questionnaire.

Each trial started with a fixation cross appearing centrally for 400 ms. Afterward, the item was presented for 2000 ms. Participants were asked to respond during this period. A blank screen was then shown for 1000 ms during experimental trials; during practice trials, feedback was shown for 1500 ms.

Half of the participants started by responding with their right foot for positive words, and with their left foot to negative words. In the second part of the experiment, this stimulus-response assignment was reversed. For the other half of participants, this order was the other way around. Participants did not respond to pseudowords. The same set of stimuli was used in both parts of the experiment, resulting in a total of 160 trials. Each part of the experiment started with 20 practice trials.
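The trial counts follow directly from the design, as the short Python sketch below shows. It assumes that the 160 trials exclude the practice trials (160 equals two parts of 40 words plus 40 pseudowords) and that the two mapping orders simply alternate across participants; the actual assignment scheme is not described, so `block_order` is purely illustrative.

```python
N_WORDS = 40          # 20 positive + 20 negative words (go trials)
N_PSEUDOWORDS = 40    # fillers; participants withhold the response (no-go trials)
N_PARTS = 2           # same stimulus set, reversed stimulus-response mapping
N_PRACTICE = 20       # practice trials at the start of each part

trials_per_part = N_WORDS + N_PSEUDOWORDS
total_trials = trials_per_part * N_PARTS
go_trials = N_WORDS * N_PARTS

POSITIVE_RIGHT = {"positive": "right", "negative": "left"}
POSITIVE_LEFT = {"positive": "left", "negative": "right"}


def block_order(participant_index):
    """Hypothetical counterbalancing: alternate the starting mapping by participant index."""
    if participant_index % 2 == 0:
        return [POSITIVE_RIGHT, POSITIVE_LEFT]
    return [POSITIVE_LEFT, POSITIVE_RIGHT]


print(trials_per_part, total_trials, go_trials)
```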

#### Results

Incorrect responses were discarded (4.0% of all Go-Trials). RTs of correct responses were submitted to two 2 (valence: positive vs. negative) × 2 (response foot: left vs. right) ANOVAs. One ANOVA treated participants as the random factor (*F*1), and one ANOVA treated items as the random factor (*F*2).
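The difference between the two analyses lies in the unit over which RTs are averaged before testing. As a rough sketch (not the authors' analysis code), the Python below aggregates trials by participant or by item and computes the valence × foot interaction in each case. It relies on the fact that a 1-df interaction *F* equals the square of the corresponding *t*: a one-sample *t* on the per-participant contrast for *F*1 (both factors within participants), and a pooled two-sample *t* on per-item foot differences for *F*2 (each word has a fixed valence, so valence is a between-items factor). The trial format with keys `participant`, `item`, `valence`, `foot`, and `rt` is an assumption.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev, variance


def cell_means(trials, unit):
    """Mean RT per (unit, valence, foot) cell; unit is 'participant' or 'item'."""
    cells = defaultdict(list)
    for t in trials:
        cells[(t[unit], t["valence"], t["foot"])].append(t["rt"])
    return {key: mean(rts) for key, rts in cells.items()}


def f1_interaction(trials):
    """By-participants interaction: squared one-sample t on the difference of differences."""
    m = cell_means(trials, "participant")
    participants = sorted({key[0] for key in m})
    contrasts = [
        (m[(p, "positive", "left")] - m[(p, "positive", "right")])
        - (m[(p, "negative", "left")] - m[(p, "negative", "right")])
        for p in participants
    ]
    n = len(contrasts)
    t = mean(contrasts) / (stdev(contrasts) / sqrt(n))
    return t * t, (1, n - 1)


def f2_interaction(trials):
    """By-items interaction: squared pooled two-sample t on left-minus-right differences."""
    m = cell_means(trials, "item")
    diffs = {"positive": [], "negative": []}
    for (item, valence, foot) in m:
        if foot == "left":
            diffs[valence].append(m[(item, valence, "left")] - m[(item, valence, "right")])
    a, b = diffs["positive"], diffs["negative"]
    df = len(a) + len(b) - 2
    pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / len(a) + 1 / len(b)))
    return t * t, (1, df)
```

With 37 participants and 40 words, these functions would return degrees of freedom of (1, 36) and (1, 38), matching the values reported for the full sample.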

The overall mean RT was 730 ms. As in earlier studies employing the same stimulus material (de la Vega et al., 2012, 2013), a main effect for valence emerged [*F*1(1,36) = 39.11, *p* < 0.001; *F*2(1, 38) = 9.97, *p* = 0.003], with faster responses to positive in comparison to negative words (709 ms vs. 750 ms). Responses with the right foot were numerically faster than responses with the left foot (725 vs. 735 ms); however, this difference was only significant in the by-items analysis [*F*1(1,36) = 1.89, *p* = 0.18; *F*2(1,38) = 4.45, *p* = 0.04]. Crucially, although responses to positive words were faster with the right foot than with the left foot (690 ms vs. 729 ms), and responses to negative words were faster with the left foot than with the right foot (741 ms vs. 760 ms; see **Figure 1**), this interaction was highly significant only in the by-items analysis, not in the by-participants analysis [*F*1(1,36) = 2.69, *p* = 0.11; *F*2(1, 38) = 24.75, *p* < 0.001].
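The reported marginal means can be recovered from the four cell means, which also makes the crossover pattern explicit. The sketch below uses only the rounded cell means given in the text, so the recomputed values can differ from the reported ones by a fraction of a millisecond; the variable and function names are illustrative.

```python
from statistics import mean

# Rounded cell means (ms) for the whole sample, as reported in the text.
CELL_MEANS = {
    ("positive", "right"): 690.0,
    ("positive", "left"): 729.0,
    ("negative", "right"): 760.0,
    ("negative", "left"): 741.0,
}


def marginal_mean(factor, level):
    """Average the cell means over the other factor."""
    index = {"valence": 0, "foot": 1}[factor]
    return mean(rt for key, rt in CELL_MEANS.items() if key[index] == level)


# Simple effects of foot within each valence (positive value = advantage in ms).
right_advantage_positive = CELL_MEANS[("positive", "left")] - CELL_MEANS[("positive", "right")]
left_advantage_negative = CELL_MEANS[("negative", "right")] - CELL_MEANS[("negative", "left")]

print(marginal_mean("valence", "positive"), marginal_mean("valence", "negative"))
print(marginal_mean("foot", "right"), marginal_mean("foot", "left"))
print(right_advantage_positive, left_advantage_negative)
```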

The fact that the by-items analysis shows divergent results from the by-participants analysis might indicate that some, but not all, of the participants show the expected pattern. As the association between footedness and positive valence should depend on the degree of fluency individuals have acquired with their dominant foot, we decided to split the participants into two groups depending on their fluency with the right foot to further explore the data. We classified participants as "strong" and "weak" right-footers. Strong right-footers included all participants who had indicated using only their right foot in all five questions of the footedness questionnaire (21 participants; mean score = 5.00, *SD* = 0.0); weak right-footers included the rest (16 participants; mean score = 3.31, *SD* = 0.79). The difference between the footedness scores of these groups was significant [*t*(35) = 9.79, *p* < 0.001], while the difference between handedness scores was not [*t*(35) = 1.24, *p* = 0.22].
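As a plausibility check, the group comparison of footedness scores can be approximated from the summary statistics alone. The sketch below assumes a pooled-variance (Student's) independent-samples *t*-test, which the 35 degrees of freedom suggest but the text does not state. Because the means and standard deviations are rounded to two decimals, the result lands close to, but not exactly on, the reported *t* value.

```python
from math import sqrt


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t from summary statistics, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Strong right-footers: M = 5.00, SD = 0.0, n = 21; weak right-footers: M = 3.31, SD = 0.79, n = 16.
t, df = pooled_t(5.00, 0.0, 21, 3.31, 0.79, 16)
print(f"t({df}) = {t:.2f}")
```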

We conducted the same analyses as for the whole set of participants for strong right-footers and for weak right-footers. For strong right-footers, a main effect of valence showed [*F*1(1,20) = 19.88, *p* < 0.001; *F*2(1, 38) = 5.30, *p* = 0.03], with faster responses to positive in comparison to negative words (723 vs. 759 ms). There was no main effect of foot used to respond [*F*1(1,20) = 1.18, *p* = 0.29; *F*2(1, 38) = 3.87, *p* = 0.06]. Most important for our research question, an interaction between foot and valence emerged for the strong right-footed participants [*F*1(1, 20) = 7.49, *p* = 0.01; *F*2(1,38) = 66.29, *p* < 0.001; see **Figure 2**]. We conducted separate analyses for positive words and negative words only. Faster responses showed with the right foot vs. the left foot to positive words [687 ms vs. 759 ms; *F*1(1,20) = 13.43, *p* = 0.002; *F*2(1, 19) = 35.03, *p* < 0.001]. For negative words, responses were faster with the left foot in comparison to the right foot (735 ms vs. 783 ms). However, this difference was only significant in the by-items analyses [*F*1(1,20) = 2.83, *p* = 0.11; *F*2(1, 38) = 35.21, *p* < 0.001]. Interestingly, the weak right-footers showed a different pattern: they showed a main effect for valence [*F*1(1,15) = 18.92, *p* < 0.001; *F*2(1,38) = 12.19, *p* = 0.001], with positive words eliciting faster responses than negative words, and no effect for foot used for response [*F*1 < 1; *F*2(1,38) = 1.81, *p* = 0.19]. However, they did not show an interaction between dominant foot and valence [*F*1 < 1; *F*2(1,38) = 2.66, *p* = 0.11; see also **Figure 2**].<sup>1</sup> This different pattern was further corroborated when we included the strength of right-footedness as a factor and conducted an additional analysis across all participants, as a three-way interaction between valence (positive vs. negative), response foot (left vs. right), and strength of right-footedness (weak vs. strong) emerged [*F*1(1,35) = 4.16, *p* = 0.049; *F*2(1,38) = 68.39, *p* < 0.001]. Additionally, we correlated the compatibility effect of each participant (calculated by the difference between the mean response time in the incongruent condition and the mean response time in the congruent condition) with his or her footedness score. A moderate correlation showed (*r* = 0.39, *p* = 0.017). It should be noted, however, that the correlation of difference RT scores can easily lead one astray (see Miller and Ulrich, 2013) and has to be interpreted with caution, as the observed correlation does not necessarily show the same pattern as the underlying correlation.
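The compatibility effect and its correlation with footedness can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the congruent condition is taken to be positive/right plus negative/left (the mapping expected to be fluent for right-footers), condition means are formed by averaging the two cell means, and the per-participant numbers in the demo are invented purely to show the call pattern.

```python
from math import sqrt
from statistics import mean


def compatibility_effect(cell_rt):
    """Incongruent minus congruent mean RT for one participant.

    cell_rt maps (valence, foot) -> mean RT in ms. Assumption: condition means are
    the average of the two cell means belonging to each condition.
    """
    congruent = mean([cell_rt[("positive", "right")], cell_rt[("negative", "left")]])
    incongruent = mean([cell_rt[("positive", "left")], cell_rt[("negative", "right")]])
    return incongruent - congruent


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented demo data: (footedness score, cell means) for three hypothetical participants.
demo = [
    (5, {("positive", "right"): 680, ("positive", "left"): 760,
         ("negative", "right"): 780, ("negative", "left"): 730}),
    (4, {("positive", "right"): 700, ("positive", "left"): 720,
         ("negative", "right"): 750, ("negative", "left"): 745}),
    (3, {("positive", "right"): 710, ("positive", "left"): 712,
         ("negative", "right"): 748, ("negative", "left"): 750}),
]
scores = [score for score, _ in demo]
effects = [compatibility_effect(cells) for _, cells in demo]
print(effects, pearson_r(scores, effects))
```

A positive compatibility effect means faster responses under the positive/right, negative/left mapping, so a positive correlation indicates that the effect grows with the strength of right-footedness.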

<sup>1</sup>Order of stimulus-response mapping was almost equally distributed among the strong right-footers, but not among the weak right-footers (this was due to the initial exclusion of participants due to left-footedness). An equal distribution of the order is, however, important, as participants typically become faster and less prone to errors during the second part of the experiment. A majority of participants for whom a mapping of positive to the right and of negative to the left comes in the second half of the experiment could consequently inflate the effect artificially, whereas a majority of participants responding in the second part to negative with the right and to positive with the left might deflate it. Indeed, the majority of the participants in the group of weak right-footers had started with a mapping of positive to the right and negative to the left, whereas eleven out of 21 participants in the group of strong right-footers had started with a mapping of negative to the right and positive to the left. To rule out that the observed effects are due to this unequal distribution, we ran the same analyses cutting off the last participants of each group to obtain an equal distribution. The strong right-footers (20 participants) showed basically the same results as described above, as did the remaining 12 weak right-footers [*F*1 < 1; *F*2(1,38) = 1.11, *p* = 0.30]. The latter result cannot be attributed to the small number of participants left in this analysis, as we see when cutting off strong right-footed participants to get the same sample size [*F*1(1,11) = 3.38, *p* = 0.09; *F*2(1,38) = 20.98, *p* < 0.001].

Related to this issue, and as a reviewer pointed out, one might expect differences between participants with regard to the compatibility effect depending on the order of stimulus-response mapping, that is, whether participants start with the fluent mapping (positive to the right, negative to the left) or with the non-fluent mapping (negative to the right, positive to the left). We ran an analysis including all 37 participants and the order of stimulus-response mapping (participants start with the mapping of positive to the right and negative to the left and proceed with the mapping of positive to the left and negative to the right, or the other way around) as a factor. In the by-participants analysis, the factor "order of stimulus-response mapping" did not interact with any variable (all *F*s < 1.80). In the by-items analysis, a significant three-way interaction between "order of stimulus-response mapping," "valence," and "foot" emerged [*F*2(1,38) = 13.19, *p* < 0.001]. This interaction can be attributed to the fact that the second part of the experiment is overall faster than the first part, which is due to the increased practice of participants. This practice enhances the compatibility effect for participants who had started with a non-fluent mapping and who go on with a fluent mapping in the second part, but obviously not for participants who had started with a fluent mapping. However, the compatibility effect described above cannot be attributed merely to a practice effect, as it is present already in the first part of the experiment [*F*1(1,35) = 4.71, *p* = 0.036; *F*2(1,38) = 70.51, *p* < 0.001]; albeit not, when analyzing weak and strong right-footers separately, for weak right-footers (both *F*s < 1).

### Discussion

Previous studies have found evidence for an association between dominant hand and positive valence, presumably based on the high degree of fluency of the dominant hand (Casasanto, 2009; Casasanto and Chrysikou, 2011; Casasanto and Henetz, 2012; de la Vega et al., 2012, 2013). If fluency indeed lies at the core of such an association, then it is plausible to expect a link between the dominant foot and positive affect – albeit maybe, due to the fact that the difference between dominant and non-dominant foot is far from being as marked as the one between dominant and non-dominant hand, not as strong as the one found between hands and valence.

We investigated whether people associate positive valence with their dominant foot in a response time study. Right-footed participants classified positive and negative words according to their valence, responding with their right foot to positive and with their left foot to negative, or the other way around. When looking at the whole set of participants, results showed a significant interaction between valence and foot only in the by-items analysis. However, when dividing participants into two groups according to the strength of their footedness, strong right-footers showed an interaction between valence and left/right foot, whereas no interaction emerged for weak right-footers.

The findings are in line with the body-specificity hypothesis, according to which the physical experiences individuals make with their individual physical characteristics in their environment should have an impact on their mental representations, such as the association between valence and left/right. Furthermore, the difference between strong and weak right-footers strongly supports the assumption that motor fluency lies at the core of the association between valence and left/right. To our knowledge, this study is the first to provide evidence that the strength of preference of one limb might have a direct impact on this association. This finding implies that an association between valence and left/right foot only comes into existence when the degree of fluency of the dominant foot is strong enough, or when the difference in fluency between left and right foot is large enough. Although we investigated only right-footers in the experiment presented here, we expect the same pattern (an association between dominant foot and positive valence, and non-dominant foot and negative valence) to show also for strong left-footers. Future studies are needed to corroborate this assumption. An interesting question in this regard is whether this pattern can also be observed for handedness: do strong right-handers exhibit an association between valence and left/right, whereas weak right-handers do not? Or can such an association always be observed, independent of the strength of handedness? Interestingly, a recent study by Huber et al. (2014) points toward the possibility that in certain cases, the degree of handedness does play a role, as a reversed MARC effect showed for stronger left-handers, but not for weaker left-handers.

If the pattern observed here is confirmed in subsequent studies investigating, for example, manual responses, the results would be difficult to reconcile with what has been considered up to now an alternative explanation for the valence-by-left/right association as found in response time tasks, namely the polarity correspondence principle (Proctor and Cho, 2006; see also Lakens, 2011, 2012). According to this principle, stimulus categories and response categories are both coded binarily, with the salient category receiving a +, the non-salient category receiving a −. It is assumed that when two categories match in the sense that they both receive a +, or that they both receive a −, response times should be faster. For the categories valence and left/right, it might be assumed that positive and right (for right-handers or right-footers) are the salient dimensions. Negative and left (for right-handers and right-footers) should be the non-salient dimensions. In a response time study, faster responses with the right to positive and with the left to negative might then be explained on the basis of a match with regard to the coding of the dimensions (see also de la Vega et al., 2013). However, this alternative explanation is difficult to reconcile with the pattern observed here: if we always code different dimensions of categories binarily according to their salience, it should not matter if someone is a strong or a weak right-hander or right-footer; as soon as the right hand or foot is stronger than the left one, it should be coded as the more salient dimension. It follows from this that no difference should show between weak and strong right-footers in a valence judgment task in which they respond with their right or left foot; both groups should respond faster with the right foot (coded as +) to positive items (also coded as +), and with their left foot to negative items (−).
The fact that only strong right-footers show a valence-by-left/right interaction indicates, however, that this account cannot explain the findings presented here; rather, it indicates that the degree of fluency of hand or foot plays a crucial role in the emergence (or non-emergence) of an association between valence and left/right. This observation also fits nicely with a recent study that investigated positive and negative words and their association with vertical space and found compelling evidence for the role of specific experiences in such an association (Dudschig et al., 2015).
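The argument against the polarity correspondence account can be made concrete with a small sketch. It is an illustration only, under the coding stated above for right-footers (positive and right coded +, negative and left coded −); the names `POLARITY`, `predicts_fast_response`, and `predicted_pattern` are hypothetical and do not come from the cited work.

```python
# Binary polarity codes as described in the text for right-handers/right-footers:
# the salient pole of each dimension gets "+", the non-salient pole "-".
POLARITY = {"positive": "+", "negative": "-", "right": "+", "left": "-"}


def predicts_fast_response(valence: str, foot: str) -> bool:
    """A polarity match (+/+ or -/-) is assumed to yield a faster response."""
    return POLARITY[valence] == POLARITY[foot]


def predicted_pattern(footedness_strength: str) -> dict:
    """Predicted fast/slow pattern for every valence-by-foot combination.

    footedness_strength ("weak" or "strong") is accepted but deliberately
    ignored: a binary code cannot represent the degree of dominance.
    """
    return {
        (valence, foot): predicts_fast_response(valence, foot)
        for valence in ("positive", "negative")
        for foot in ("right", "left")
    }


weak = predicted_pattern("weak")
strong = predicted_pattern("strong")
print("Same prediction for weak and strong right-footers:", weak == strong)
```

Because the strength of footedness never enters the computation, the sketch yields identical predictions for both groups, which is the prediction the observed group difference speaks against.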

An interesting question following from this observation is whether individuals who are weak right-footers but strong right-handers should exhibit different patterns, depending on whether they respond with their hand or with their foot. Furthermore, if fluency of limbs lies at the core of the association between valence and left/right, individuals whose preferred hand and foot differ – that is, right-handers who prefer their left foot, and left-handers who prefer their right foot – should show opposing patterns, depending on the limb with which they respond in a valence judgment task. If the outcome for these participants does differ depending on the limb used for response, it would be interesting to see in non-motoric tasks whether individuals whose preferred hand and foot differ prefer the right or the left. In Experiment 3 reported by Casasanto (2009), participants indicated verbally whether they thought a positively associated entity should go in a box on the left or on the right. Employing such a paradigm could provide clues to whether individuals prefer the side of their dominant hand or of their dominant foot.

A final remark on the procedure employed in the present study: footedness of participants was made salient before the actual experiment. Participants first dribbled a small ball around obstacles, first with one foot, then with the other, in order to get a feeling which foot they might prefer. Afterward, they filled out the footedness questionnaire. In sum, participants' attention was drawn to their footedness before the valence judgment task started. It would be interesting to see whether the salience of footedness had an impact on the observed results, or whether the association between valence and left/right for strong right- or left-footers would also show if participants' attention is not drawn to their footedness, i.e., by not mentioning footedness before the experiment and filling out the footedness questionnaire afterward. For manual responses, we have observed valence-by-left/right interactions previously although no handedness questionnaire was given to participants before the actual experiment (de la Vega et al., 2012, Experiments 2 and 3). However, as footedness does not play such a prominent role as handedness in everyday life, we cannot conclude from these previous findings that an interaction between valence and foot responses shows if attention is not drawn to footedness. If no interaction shows when footedness is not salient, this would imply that rather than actual footedness, the *idea* of whether an individual thinks she is a (strong) right-footer should have an impact on responses. In this case, of course, fluency should not play any role at all for the emergence of such an interaction.

In sum, the present study not only extends existing literature by providing evidence for a compatibility effect between positive/negative and left vs. right foot for strong right-footers; it also hints at an influential role of the degree of fluency for the emergence of the association between valence and left/right. Future studies are needed to address whether the degree of handedness also influences bimanual responses to positive and negative items, and whether the self-conceptualization of an individual as (strong) right-footer or left-footer influences the valence-by-left/right association observed in a response time study.

### Acknowledgments

The experiment described here was conducted by JG and LH (in alphabetical order) as part of their BA thesis under supervision of IV and BK. We thank the reviewers for helpful comments on a previous version of the manuscript. This work was supported by a research grant from the German Research Foundation (DFG) to BK (SFB 833, project B4). We acknowledge support by the German Research Foundation (DFG) and the Open Access Publishing Fund of the University of Tübingen to cover the publication costs of this article.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 de la Vega, Graebe, Härtner, Dudschig and Kaup. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# When "good" is not always right: effect of the consequences of motor action on valence-space associations

#### Denis Brouillet <sup>1</sup> \*, Audrey Milhau<sup>2</sup> and Thibaut Brouillet <sup>3</sup>

<sup>1</sup> EPSYLON Laboratory, Paul Valery University, Montpellier, France, <sup>2</sup> URECA Laboratory, Lille 3 University, Lille, France, <sup>3</sup> CERSÆM Laboratory, Université Paris Ouest-La Défense, Nanterre, France

Since the work of Casasanto (2009), it is now well established that valence and laterality are associated. Participants tend to prefer objects presented on their dominant side over items presented on their non-dominant side, and to place good items on their dominant side and bad items on the other side. Several studies highlight that those associations of valence and laterality are accounted for by the greater motor fluency of the dominant hand and various studies noted that these associations could be reversed depending on the way people interact with their environment. Consistently with the Theory of Event Coding, the aim of this work is to show that the consequences of motor actions could also reverse the associations between valence and laterality. Thus, if participants had to place two animals (one good, one bad) on two supports, one stable (no risk of falling), one unstable (risk of falling), we hypothesized that the good item would be placed on the stable support, regardless of the side where it would be put (i.e., on the dominant or non-dominant side). We expected the opposite for the bad item. The results of two experiments are consistent with this prediction and support the claim that the consequences of motor action bias the hedonic connotation of our dominant side.

#### Edited by:

Nicolas Vermeulen, Université Catholique de Louvain, Belgium

#### Reviewed by:

Ramesh Kumar Mishra, University of Hyderabad, India Derrick L. Hassert, Trinity Christian College, USA

#### \*Correspondence:

Denis Brouillet, EPSYLON Laboratory, Site St. Charles, Route de Mende, 34199 Montpellier, France denis.brouillet@univ-montp3.fr

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 15 November 2014 Accepted: 16 February 2015 Published: 05 March 2015

#### Citation:

Brouillet D, Milhau A and Brouillet T (2015) When "good" is not always right: effect of the consequences of motor action on valence-space associations. Front. Psychol. 6:237. doi: 10.3389/fpsyg.2015.00237

Keywords: Body-Specificity Hypothesis, motor fluency, valence-laterality associations, event coding, consequences of actions

## Introduction

In various languages and cultures, phrases and idioms express a link between valence and horizontal space: to be someone's right hand, to have two left feet. . . In all these expressions, good things tend to be associated with the right side and bad things with the left side.

According to Casasanto's Body-Specificity Hypothesis (2009, 2011), the way we interact with our environment participates in our conceptualization of concepts and meaning. For instance, valence is associated with horizontal space because of the motor fluency by which one acts with one's dominant hand (i.e., the association valence-laterality). Indeed, one of our main body specificities is our handedness. Right- and left-handers act differently in their environment, and experience fluency from opposed movements: our most fluent actions are those carried out by our dominant hand, and on our dominant side.

Various studies have shown that motor fluency carries a hedonic connotation, such that fluent actions are positively connoted compared with less fluent or non-fluent actions (Reber et al., 1998; Winkielman and Cacioppo, 2001; Winkielman et al., 2003; Hayes et al., 2008; Brouillet et al., 2011; de la Vega et al., 2012).

However, Casasanto and Chrysikou (2011, Experiment 2) highlighted that the valence-laterality association could be reversed by short-term changes in the way one interacts with one's environment (i.e., participants manipulated dominos while wearing a bulky ski glove on their dominant hand). The aim of this work is to show that the consequences of motor actions could also reverse the associations between valence and laterality.

Indeed, since James's (1890) ideomotor theory, it is known that learning establishes direct and automatic links between actions and the perceptual results they generate (for a review, see Stock and Stock, 2004; Shin et al., 2010). In other words, action execution is triggered by a stimulus, and is necessarily followed by a feedback, which is a function of this action (action effect, Hommel, 1996, 2013; Prinz, 1997). In line with the ideomotor theory (James, 1890), the Theory of Event Coding (i.e., TEC, Prinz, 1997; Hommel et al., 2001) considers that action and its effects constitute one and the same event. In this line, Brouillet et al. (2014) could show that sensory-motor consequences of past actions form part of the memory trace components cued by a current action.

The purpose of this research is to test the hypothesis that the hedonic connotation of the action performed with our dominant hand on our dominant side (i.e., motor fluency) is relative to the consequences of this action. If the consequences of this action are detrimental to the object being placed (e.g., a risk of falling), we can assume that the association "right space – positive valence" will not occur.

### Valence and Laterality

Casasanto (2009) conducted the first study that directly tested the associations between valence and laterality. In the first experiment, participants were presented with a character named Bob who was supposed to like pandas and think they were good, and to dislike zebras, thinking they were bad. Participants had to place those two animals into two boxes presented on the left and on the right parts of a piece of paper. The main result obtained by Casasanto is that right-handers tended to place the good panda in the right box, and the bad zebra in the left box, while a majority of left-handers placed the good panda in the left box, and the bad zebra in the right one (Casasanto, 2009, Experiment 1). Identical results were found when Bob was supposed to like zebras, and to dislike pandas. In the following experiments, participants tended to prefer the product, person or creature presented on their dominant side to the items presented on their non-dominant side (Casasanto, 2009, Experiments 3–5). For example, when facing two alien creatures placed on either side of a piece of paper and having to choose which of them looks the most intelligent, funny or honest, participants tended to choose the one situated on their dominant side (Casasanto, 2009, Experiment 4). Simple observations of hand gestures during speech highlight similar combinations of valence and side, depending on the orator's handedness: during the campaigns for the U.S. presidential election, Casasanto and Jasmin (2010) identified different gestures during positive and negative speeches, depending on the candidates' dominant hand: right-handed candidates (Bush and Kerry) used their right hand during positively connoted speeches, and their left hand during negatively connoted discourses, while left-handed candidates (Obama and McCain) used their left hand for positive speeches and their right hand for negative ones.

These spontaneous associations of positive valence with the dominant side seem to be rather robust since they have been demonstrated in children as young as 5 years old (Casasanto and Henetz, 2012) and in various cultures (de la Fuente et al., 2014). Furthermore, they affect people's memory as well as their judgments: Brunyé et al. (2012) showed participants a map featuring the positions of fictitious positive and negative events. Later, participants had to recall those locations on the map. Results demonstrated memory biases dependent on the valence of the event and on the handedness of the participants: right-handers tended to locate positive events too far to the right and negative events too far to the left, whereas left-handers showed opposite biases.

These studies focused on the spontaneous associations of valence and laterality. Recently, the question was raised of whether those associations could affect response times in a valence judgment task manipulating the compatibility between valence and response hand. de la Vega et al. (2012) tested right-handers and left-handers using a valence judgment task of emotional words. Participants were asked to respond with both hands (i.e., dominant and non-dominant) by pressing response keys located respectively on the left and on the right-hand side of a keyboard, corresponding to either a positive or negative response (the position of each valence was reversed in the second part of the experiment). Their results showed compatibility effects: both right- and left-handers responded faster with their dominant hand to positive words than to negative words (see Kong, 2013, for similar results on evaluation of faces).

One experiment has focused on the effect of the lateral position of the positive and negative responses on the evaluation of neutral words. Milhau et al. (2013) showed that in a valence judgment task the way positive and negative labels are presented on each side of a horizontal scale has an impact on judgment: right-handers' evaluations are more positive on a scale associating positive with the right and negative with the left (congruent with Casasanto's associations of valence and laterality in right-handers) than on the reversed scale, especially after carrying out a fluent movement of the right hand.

Since those associations of valence and laterality are explained on the basis of manual dominance, one could expect that these links are fixed and constant. Yet, Casasanto and Chrysikou (2011) noted that these associations could be reversed by both long-term and short-term changes in the way one interacts with one's environment. In a first experiment, stroke-induced hemiplegic patients were asked to perform an oral version of the good/bad animals experiment conducted by Casasanto (2009). In the second experiment, the authors used a simple motor task requiring the two hands (manipulating dominos), and temporarily disabled right-handers' and left-handers' dominant hand with a bulky ski glove, leaving their non-dominant hand more efficient for task performance. Results showed that right-handers paralyzed (Experiment 1) or virtually disabled (Experiment 2) on the right side tended to manifest valence/laterality associations usually encountered in left-handers, specifically to associate positive with the left and negative with the right. Similarly, left-handed patients paralyzed/disabled on the left side manifested right-handers' valence/laterality mapping: positive on the right and negative on the left (Casasanto and Chrysikou, 2011). These results confirmed that modifications in the way people interact with their environment (even in the very short term) modify their motor fluency and therefore their valence/laterality associations.

Similarly, Milhau et al. (2014) showed that in a valence judgment task, right-handers and left-handers manifest the same pattern of compatibility effects when using the same hand of response. In a valence judgment task of positive and negative words, participants responded with lateralized actions of either their dominant or non-dominant hand. Results highlighted that for both right- and left-handers, when the location of responses was congruent with the fluency of the responding hand (for the right hand: negative/left and positive/right; for the left hand: positive/left and negative/right), response times to positive evaluations were shorter than for negative evaluations. Conversely, when the location of responses was non-congruent with the fluency of the responding hand, we observed faster responses for negative evaluations than for positive evaluations.

These studies support an account of the associations of valence and laterality based on motor fluency and not only in terms of handedness. It is not always the dominant side that is positively connoted, but the side of the most fluent action. A recent experiment by de la Vega et al. (2013) confirms this explanation, demonstrating that the compatibility effects of valence and laterality in a valence judgment task are based on the response hand, and not on the response side: dominant-hand responses are facilitated for positive evaluations, even when the responses are located on the non-dominant side.

From these last experiments motor fluency seems to be the key factor to explain hedonic connotation linked to action. However, it appears that all these studies overlooked a crucial aspect of action: each action is followed by consequences. The status of action in these experiments is only a response to a stimulus, the command you run to respond to the task. Yet, in real life, an action is not only an output of the system, it is also informative of the consequences associated for the person performing it (see above ideomotor theory and TEC). For example, Anelli et al. (2012) show that if generally graspable objects activate a facilitating motor response, dangerous objects do not. Our objective is therefore to take into account the impact of action's consequences on the link between horizontal space and valence.

## Experiment 1

### Overview

Our objective in this experiment was to determine whether associations between valence and horizontal space in right-handed participants are influenced by the consequences of the participants' actions.

We first intended to replicate Casasanto's (2009) classic result that right-handers tend to associate good with right and bad with left (Experiment 1a). Then we tested whether these classical associations would appear with our specific response device (Experiment 1b). Finally, our last and main objective was to determine the impact of action consequences on these associations, by manipulating the risk associated with the result of an action (Experiments 1c and 1d).

In four variants of the experiment, participants were presented with two figurines of animals (one presented as good and the other as bad, see Supplementary Material), and were instructed to place, with their right hand, each of them on one of two small planks situated on the participant's left- and right-hand sides (see Supplementary Material). In Experiment 1a, the two small planks were laid flat on the table. In Experiment 1b, the two small planks were placed in stable equilibrium on a wood lath. In Experiment 1c, the small plank on the right side was placed precariously on a wood lath and was in unstable equilibrium, while the plank on the left side stood stable. In Experiment 1d, the small plank on the left side was placed precariously on a wood lath and was in unstable equilibrium, while the plank on the right side stood stable. Thus, in Experiments 1c and 1d, putting the animal on the unstable small plank would result in the animal falling.

## Experiment 1a

### Participants

Thirty-two students from Montpellier University (25 females), all native French speakers and all right-handed, took part in this experiment after having signed informed consent to participate in the study.

### Material

The material comprised two plastic colored animals and two small planks.

The plastic animals were a hyena (8 cm long and 4 cm high) and a zebra (8 cm long and 6 cm high). We chose these two plastic animals because they were equally long (this feature is important for the position of each animal on the small plank). A black mark shaped like an inverted triangle was painted on the belly of each animal to indicate its middle. A pre-test on 50 persons determined that the hyena was negatively valenced and the zebra positively valenced.

The two small planks were 15 cm long, 3 cm wide and 1 cm deep. A white point with a diameter of 0.5 cm was painted on the middle of the long face of each small plank. This point indicated where participants had to put each animal.

### Procedure

Participants sat at a table with the experimenter sitting in front of them. A box containing the material was placed on a chair next to the experimenter, so that participants could not see the material before the experimenter presented them with it. A sheet of stiff paper was placed on the table (80 cm long and 50 cm wide), the bottom edge aligned with the edge of the table. Two crosses were horizontally aligned and drawn 50 cm apart on the paper sheet. Each cross was positioned 15 cm from the right or left edge of the sheet. This device allowed for similar conditions of presentation among participants.

Each participant was tested individually in an isolated room by a single experimenter. Each experiment started with a brief presentation of the upcoming task; participants indicated whether they agreed to participate or not. Upon their approval, the experiment started.

The experiment consisted of only one trial. The experimenter took out the two animals in one hand. He asked participants to hold out their left hand and deposited the two animals in it together. He then asked them to put their hands on their knees until further instructions. This procedure was meant to avoid presenting one animal before the other, which could induce a preferential choice. Once the participant's hands were on their knees, the experimenter took out the wooden planks and placed one on each cross. Participants were then asked to place, with their right hand, the good zebra on one of the two small planks laid on the table and the bad hyena on the other by matching the mark on the animal's belly with the white point on the small plank. As in Casasanto (2009), the order in which participants were instructed to place the good and bad animals was counterbalanced, to ensure that any associations between space and valence in participants' judgments were not confounded with associations between the side of space and the temporal order in which they placed the animals (16 participants were instructed to place the good animal first and 16 the bad animal first).
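The counterbalancing of instruction order (16 participants per order out of 32) can be sketched as follows. The article does not say how participants were assigned to orders, so the shuffled assignment, the function name `counterbalanced_orders`, and the labels `"good_first"` and `"bad_first"` are assumptions made purely for illustration.

```python
import random


def counterbalanced_orders(n_participants: int, seed: int = 0) -> list[str]:
    """Return one instruction order per participant, split evenly.

    Hypothetical helper: half of the participants get "good_first",
    the other half "bad_first", in a shuffled sequence.
    """
    if n_participants % 2 != 0:
        raise ValueError("an even number of participants is needed for an equal split")
    half = n_participants // 2
    orders = ["good_first"] * half + ["bad_first"] * half
    # A seeded generator keeps the assignment reproducible.
    random.Random(seed).shuffle(orders)
    return orders


orders = counterbalanced_orders(32)
print(orders.count("good_first"), orders.count("bad_first"))
```

Fixing the two halves before shuffling guarantees the equal split regardless of the random sequence, which independent coin flips per participant would not.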

### Results and Discussion

When asked to freely place a good animal and a bad animal on their left and right side, a majority (78%) of our right-handed participants placed the good animal on the small plank on the right, and the bad one on the small plank on the left (sign test on 25 vs. 7, Z = 4.00, p < 0.01, see **Figure 1**).
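For readers who wish to check a split such as 25 vs. 7 themselves, a sign test can be sketched as an exact two-sided binomial test against chance (probability 0.5 for each side). This is a minimal sketch using only the Python standard library; the function name `sign_test_two_sided` is hypothetical, and because the text does not specify how the reported Z value was computed, the sketch does not attempt to reproduce it.

```python
from math import comb


def sign_test_two_sided(k: int, n: int) -> float:
    """Exact two-sided sign test p-value for k choices of one side out of n.

    Under the null hypothesis, each participant is equally likely to put
    the good animal on either side (binomial with p = 0.5).
    """
    # Size of the smaller group; the binomial is symmetric for p = 0.5.
    tail = min(k, n - k)
    # Probability of a split at least this extreme in one direction.
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    # Double for the two-sided test and cap at 1.
    return min(1.0, 2 * one_sided)


p_value = sign_test_two_sided(25, 32)
print(f"25 vs. 7 out of 32: p = {p_value:.4f}")
```

For the 25 vs. 7 split the exact p-value is roughly 0.002, in line with the reported p < 0.01.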

These results replicate Casasanto's (2009) classic effect that people tend to place positive items on their dominant side, and bad items on their non-dominant side.

Our effect is consistent with the Good is Right association usually found in right-handers (Casasanto, 2009; de la Vega et al., 2012; Milhau et al., 2013, 2014). The following experiment will try to extend this effect to our specific device.

## Experiment 1b

### Participants

Thirty-two students from Montpellier University (23 females) who had not participated in Experiment 1a took part in this experiment. They were all native French speakers and all right-handed, and signed informed consent to participate in the study.

### Material

The material was the same as in Experiment 1a, except that the small planks had a black mark, shaped like an inverted triangle, painted in the middle of the edge of the plank. We also used two wood laths that were 10 cm long, 3 cm wide and 3 cm deep. A black triangle was painted at the end of the wood laths, on their edges.

### Procedure

The procedure was very similar to that of Experiment 1a, except that here the experimenter first took out the two wood laths and placed them vertically on the crosses on the table. Then the experimenter took out one center-marked small plank and asked participants to put it on the wood lath on the right by matching the mark of the small plank with the mark of the wood lath. When the small plank was placed, the experimenter took out the other center-marked small plank and asked participants to put it on the wood lath on the left by matching the mark of the small plank with the mark of the wood lath. After these manipulations the two small planks were in stable equilibrium on the two wood laths and did not present any risk. Participants were then asked to place, with their right hand, the two animals on each of the two small planks. The order in which participants were instructed to place the small planks and the two animals was counterbalanced.

### Results and Discussion

When asked to freely place a good animal and a bad animal on their left and right sides on stable planks, a majority (69%) of our right-handed participants placed the good animal on the small plank on the right, and the bad one on the small plank on the left (sign test on 22 vs. 10, Z = 3.17, p < 0.01, see **Figure 1**).

These results replicate both Casasanto's effect (2009) and our results in Experiment 1a.

They confirm that our device with planks in equilibrium is adequate to further explore the associations of valence and laterality.

Our objective in Experiments 1c and 1d is now to explore the impact of action consequences on these associations.

## Experiment 1c

### Participants

Thirty-two students from Montpellier University (26 females) who had not participated in Experiments 1a and 1b took part in this experiment. They were all native French speakers and all right-handed, and signed informed consent to participate in the study.

### Material

The material was the same as in Experiment 1b, except that one small plank had a black mark painted in the middle of the edge of the plank (center-marked) while the other plank had a black mark painted 5 cm from the left edge of the plank (left-marked). We also used two wood laths that were 10 cm long, 3 cm wide, and 3 cm deep. A black triangle was painted at the end of the wood laths, on their edges.

### Procedure

The procedure was very similar to that of Experiment 1b, except that here participants were instructed to place the left-marked plank on the wood lath on the right, matching the mark on the small plank with the mark on the wood lath. The matching of the marks resulted in the small plank being in unstable equilibrium on the wood lath. Note that by placing the small plank participants experienced the instability and therefore the fact that an object placed on it would fall systematically from the structure. When the small plank was placed, the experimenter took out the center-marked small plank and asked participants to put it on the wood lath on the left by matching the mark of the small plank with the mark of the wood lath. In this case the small plank was in stable equilibrium on the wood lath. Participants had to place, with their right hand, one animal on each plank.

### Results and Discussion

Contrary to our previous results and consistent with our predictions, a majority (69%) of our right-handed participants chose to place the good animal on the small plank on the left and the bad animal on the plank on the right (sign test on 22 vs. 10, Z = 3.17, p < 0.01, see **Figure 1**).

This result indicates an inversion of right-handers' usual associations, and highlights that participants took into account the consequences of their action. Because the animal placed on the unstable plank on the right would fall, participants mostly chose not to place the good animal on it and instead placed it on the stable ("safe") plank, even though it was on the left side.
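The sign tests reported for these experiments compare the observed split of choices (for example, 22 vs. 10) with the even split expected if placement were unrelated to valence. The following is a minimal sketch of that logic, assuming an exact two-sided binomial form of the test. The function names are illustrative, and the text does not state which variant or correction produced the reported Z values, so the p-values from this sketch need not coincide with them.

```python
from math import comb


def sign_test_two_sided(n_a: int, n_b: int) -> float:
    """Exact two-sided sign test p-value for a split of n_a vs. n_b choices.

    Under the null hypothesis each participant is equally likely to choose
    either arrangement, so the larger count follows Binomial(n, 0.5).
    """
    n = n_a + n_b
    k = max(n_a, n_b)
    # Probability of a split at least this extreme in one direction
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1
    return min(1.0, 2 * upper_tail)


def majority_percent(n_a: int, n_b: int) -> int:
    """Share of participants in the larger group, rounded to a whole percent."""
    return round(100 * max(n_a, n_b) / (n_a + n_b))


print(majority_percent(22, 10))      # majority share for a 22 vs. 10 split
print(sign_test_two_sided(22, 10))   # exact two-sided p-value for that split
```

The percentages given in the text (69% for 22 of 32, 81% for 26 of 32) follow directly from `majority_percent`.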

## Experiment 1d

### Participants

Thirty-two students from Montpellier University (25 females) who had not participated in the previous experiments took part in this experiment. They were all native French speakers and all right-handed, and signed informed consent to participate in the study.

### Material

The material was the same as in Experiment 1c, except that one small plank had a black mark painted in the middle of its edge (center-marked) while the other plank had a black mark painted 5 cm from its right edge (right-marked). We also used two wood laths that were 10 cm long, 3 cm wide, and 3 cm deep. A black triangle was painted at the end of the wood laths, on their edges.

### Procedure

The procedure was very similar to that of Experiment 1c, except that here the plank in unstable equilibrium was on the wood lath on the left, and the stable plank was on the wood lath on the right. Participants had to place, with their right hand, one animal on each plank. Note that the animal placed on the plank on the left-hand side systematically fell from the structure.

### Results and Discussion

The pattern of results in this fourth experiment is in line with those of Experiments 1a and 1b and is the exact opposite of the pattern in Experiment 1c. A majority (81%) of our right-handed participants chose to place the good animal on the small plank on the right and the bad animal on the plank on the left (sign test on 26 vs. 6, Z = 4.24, p < 0.01, see **Figure 1**).

When it was the plank on the left that fell, participants had no problem assigning it to the bad animal. Their choices consequently reflect their typical associations of valence and laterality, linking good with right and bad with left.

## Conclusion of Experiment 1

The aim of this experiment was to test associations between valence and horizontal space in right-handed participants to determine whether these associations were influenced by the consequences of participants' actions.

Right-handers usually tend to associate positive valence with the right part of space and negative valence with the left part (Casasanto, 2009; de la Vega et al., 2012), because they interact with the world more easily with their dominant right hand (motor fluency; for an extended explanation see Milhau et al., 2014).

Experiment 1a allowed us to replicate these associations when we manipulated only the animal affective trait and the position of the planks. In this situation the action of placing each animal on the plank did not have any overt consequence, and right-handers mostly placed the good animal on the plank on the right and the bad animal on the plank on the left (see **Figure 1**).

In Experiment 1b the consequences of the participants' actions were limited: placing the animal on the planks was not likely to make it fall, whether it was placed on the right-hand-side plank or on the left-hand-side plank. This situation therefore did not differ significantly from the one in Experiment 1a, and we observed a similar pattern of response: good is right.

Our main hypothesis was that an overt "bad" consequence of action would impact participants' choices in this task. In Experiments 1c and 1d we modified the task so that one of the response choices would induce a risk for the animal: by making one of the planks unstable, one of the animals fell from the plank when the participant put it on it.

Our main result in Experiment 1 is the demonstration of the inversion of responses in Experiment 1c. When the unstable small plank was on the right side, participants did not choose to place the good animal on it as right-handers had done in the previous experiments. In contrast they largely preferred to place the bad animal on it, as if they were trying to "protect" the good animal from the risk of falling.

Experiment 1d confirmed this interpretation: when the unstable plank was on the left, right-handers mostly chose to place the bad animal on it, showing once again the usual associations according to which good is right and bad is left.

To test the validity of our interpretation in terms of "trying to protect the good animal from the risk of falling", we conducted a second experiment in which, in a first phase, the action of placing the good animal on the plank had no overt consequence (i.e., similar to Experiment 1a), but in a second phase the action of placing the animal could have dramatic consequences (i.e., similar to Experiment 1c). In other words, we wanted to see whether, after placing the good animal on the right, participants would change their choice when their action could have bad consequences for it (i.e., the risk of falling).

## Experiment 2

Experiment 2 was composed of two variants that differed only in the second phase. In the shared first phase the two small planks were laid on the table, one on the right side and one on the left side of the participants. During the second phase the small planks were placed on top of a wood lath. In Experiment 2a one of the small planks was in stable equilibrium on the wood lath on the right and the other in unstable equilibrium on the wood lath on the left. In Experiment 2b one of the small planks was in unstable equilibrium on the wood lath on the right and the other in stable equilibrium on the wood lath on the left.

## Experiment 2a

### Participants

Thirty-two students from Montpellier University (22 females), all native French speakers and all right-handed, took part in this experiment after having signed informed consent to participate in the study.

### Material

The material was the same as in Experiment 1a and in Experiment 1c.

### Procedure

Participants were not informed that they would have to place animals on the two small planks two times in a row. The procedure for the first placement was very similar to that of Experiment 1a, with the two small planks laid directly on the table. The procedure for the second placement was very similar to that of Experiment 1d: the two small planks were in equilibrium on the wood laths, the one on the right being stable and the one on the left being unstable.

### Results and Discussion

The results of the first placement show that a majority (75%) of our right-handed participants placed the good animal on the small plank on the right, and the bad one on the small plank on the left (sign test on 24 vs. 8, Z = 3.75, p < 0.01, see **Figure 2**).

The results of the second placement show that a majority (81%) of our right-handed participants chose to place the good animal on the small plank on the right and the bad animal on the plank on the left (sign test on 26 vs. 6, Z = 4.24, p < 0.01, see **Figure 2**).

The results are consistent with those obtained in Experiments 1a and 1d. When asked to freely place a good animal and a bad animal on their left and right sides, participants placed the good animal on the small plank on the right, and the bad one on the small plank on the left. Their choices were similar in both phases, showing that the risk of falling associated with the left plank did not affect their responses.

## Experiment 2b

### Participants

Thirty-two students from Montpellier University (26 females), all native French speakers and all right-handed, took part in this experiment after having signed informed consent to participate in the study.

### Material

The material was the same as in Experiment 2a.

### Procedure

Participants were not informed that they would have to place animals on the two small planks two times in a row. The procedure for the first placement was very similar to that in Experiment 1a, with the two small planks laid directly on the table. The procedure for the second placement was very similar to that in Experiment 1c: the two small planks were in equilibrium on the wood laths, the one on the right being unstable and the one on the left being stable.

### Results and Discussion

The results of the first placement show that a majority (78%) of our right-handed participants placed the good animal on the small plank on the right, and the bad one on the small plank on the left (sign test on 25 vs. 7, Z = 4.00, p < 0.01, see **Figure 3**).

The results of the second placement show that a majority (72%) of our right-handed participants chose to place the good animal on the small plank on the left and the bad animal on the plank on the right (sign test on 23 vs. 9, Z = 3.47, p < 0.01, see **Figure 3**).

The results are consistent with those obtained in Experiments 1a and 1c. When asked to freely place a good animal and a bad animal on safe small planks laid on their left and right sides, participants placed the good animal on the small plank on the right, and the bad one on the small plank on the left. But when an animal placed on the small plank on the right would fall, participants reversed their previous placement: they chose not to place the good animal on it but rather on the safe (stable) plank, even though it was on the left side.

## Conclusion of Experiment 2

The aim of Experiment 1 was to test whether the associations between valence and horizontal space in right-handed participants are influenced by the consequences of participants' actions. The results of Experiment 1c show that when the unstable plank was on the right side, participants preferred to place the bad animal on it, as if they were trying to "protect" the good animal from the risk of falling.

The aim of Experiment 2 was to test this interpretation. The results show that, even though participants spontaneously placed the good animal on the right side in a first phase, they did not choose to place the good animal on the right when the unstable plank was on that side. Instead they largely preferred to place it on the left side. These results support our hypothesis that an overt "bad" consequence of action would affect participants' choices.

### General Discussion

The Body-Specificity Hypothesis (Casasanto, 2009, 2011) suggests that the body and the way one uses it shape one's thoughts. This proposition has been tested by comparing right- and left-handers' emotional judgments. Because of handedness, people experience horizontal space differently: they act more easily, that is to say more fluently, with their dominant hand. Since motor fluency is associated with hedonic connotation, a fluent movement being ascribed a positive connotation (Hayes et al., 2008), people tend to associate positive valence with the dominant side: right is good and left is bad for a right-hander, and the reverse for a left-hander (Casasanto and Henetz, 2012; de la Vega et al., 2012, 2013; Kong, 2013; de la Fuente et al., 2014; Milhau et al., 2014). Nonetheless, Casasanto and Chrysikou (2011) showed that variations in the way people interact with their environment modify their valence-laterality associations. Moreover, Milhau et al. (2013) showed that the location of responses interacts with the fluency of the responding hand. Thus, these studies highlight that the valence-laterality association is supported by motor fluency more than by handedness. It is not always the dominant side that is positively connoted, but the side of the most fluent action.

Here we considered that these last studies overlooked a crucial aspect of action: each action is followed by consequences. As supported by the ideomotor theory (James, 1890) or the Theory of Event Coding (Prinz, 1997; Hommel et al., 2001), action and its effects constitute one and the same event. Therefore, the aim of this article was to take into account the impact of action's consequences on the link between valence and laterality.

Our experiments are inspired by Casasanto's (2009) classic paradigm. Participants were presented with two figurines of animals (one presented as good and the other as bad), and were instructed to place each of them on one of two small planks situated on the participant's left- and right-hand sides. These small planks were either laid flat on the table or first had to be placed on wood laths by the participants. In the latter case the small planks were in stable or unstable equilibrium.

The main result in Experiment 1 is that when the unstable small plank was on the right side, participants did not choose to place the good animal on it. In contrast they largely preferred to place the good animal on the left side on the stable small plank, as if they were trying to protect the good animal from the risk of falling.

The results of Experiment 2 highlight that, even if the participants spontaneously placed the good animal on the right when the device was safe, they changed their responses when it was made dangerous (i.e., risk of falling): they did not put the good animal on the risky right side but preferred to put it on the left side, in a safer place.

Taken together, these results support our hypothesis that the hedonic connotation associated with the fluency of the dominant hand and dominant side is relative to the consequences of the actions performed. The classic association of right space with positive valence for a right-hander did not occur if the action performed could have negative consequences (e.g., fall risk). In other words, right is good if right is safe. However, the experiment should be replicated with left-handers for the results to be generalizable.

To conclude, the originality of this work lies in three points. First, it is to our knowledge the first study that shows that the consequences of an action modify emotional judgments and reverse the valence-space associations. This result provides additional support to the claim that the compatibility effects of valence and laterality in a valence judgment task are based on the fluency of the response hand, and not on the response side: this is true despite the consequences of the action performed. Second, the perception of a situation is underpinned by the activation of the perceptual outcome of action, that is to say the action performed and its consequences. Consequently, the judgment on an object that appears in this situation and the action to perform with it are dependent on this perceptual outcome. Third, we have created an original paradigm that simulates what happens in real life: actions are always followed by informative feedback. Indeed, the functional characteristics are crucial to define a fluent action (i.e., effector and orientation, for example the dominant hand acting in the dominant side), but in order to adapt to the situation at hand, cognition has to take into account action's consequences.

### Acknowledgments

All co-authors would like to thank Jean-Michel Ganteau for proofreading the document as an English Professor and Zineb Malki for her precious help in data collection.

### Supplementary Material

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2015.00237/abstract



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Brouillet, Milhau and Brouillet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Affective valence facilitates spatial detection on vertical axis: shorter time strengthens effect

*Jiushu Xie1, Yanli Huang2, Ruiming Wang1\* and Wenjuan Liu1*

*<sup>1</sup> Center for Studies of Psychological Application, School of Psychology, South China Normal University, Guangzhou, China, <sup>2</sup> Department of Educational Psychology, The Chinese University of Hong Kong, Hong Kong, China*

Affective concepts can be described in terms of space, which is known as the valence-space metaphor. Previous studies have not investigated either the specifics of this metaphor on the transverse and vertical axes or the time course of this metaphoric association. With Chinese participants, we used a spatial cue task to study the valence-space metaphor on the transverse (left-and-right; Experiment 1A) and vertical (upper-and-lower; Experiment 1B) axes. After being shown an affective word and asked to keep it in mind, the participants were given a spatial target detection task. The results revealed that the metaphoric association was only found on the vertical axis. More specifically, keeping a positive word in mind facilitated the detection of the upper target, but no such effect was found in the detection of the lower target. Furthermore, in Experiment 2, we manipulated the duration of time (100, 500, and 1000 ms) between the offset of the affective word and the onset of the spatial target (i.e., interstimulus intervals, ISI), to test the dynamic time course of the valence-space metaphor on the vertical axis. The results showed that when ISI was 100 ms, keeping a positive word in mind facilitated the detection of the upper target and keeping a negative word in mind facilitated the detection of the lower target. However, when the ISI was 500 or 1000 ms, keeping a positive word in mind facilitated the detection of the upper target and no such effect was found in the detection of the lower target, indicating that ISI might be important in the valence-space metaphoric association. In sum, we found that the processing of affective valence activated the vertical spatial axis but not the transverse axis. Further, the association might be modulated by ISI, indicating that it may be related to attention allocation.

Keywords: embodied cognition, conceptual metaphor theory, grounded cognition, valence-space metaphor, affective word, spatial axis, interstimulus interval

#### *Edited by:*

*Nicolas Vermeulen, Université Catholique de Louvain, Belgium*

#### *Reviewed by:*

*Kerstin Dittrich, Albert-Ludwigs-Universität Freiburg, Germany; Brice Beffara, Université Grenoble-Alpes, France*

#### *\*Correspondence:*

*Ruiming Wang, Center for Studies of Psychological Application, School of Psychology, South China Normal University, No. 55, West Zhongshan Avenue, Tianhe District, Guangzhou 510631, China wangrm@scnu.edu.cn*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 14 November 2014 Accepted: 25 February 2015 Published: 24 March 2015*

#### *Citation:*

*Xie J, Huang Y, Wang R and Liu W (2015) Affective valence facilitates spatial detection on vertical axis: shorter time strengthens effect. Front. Psychol. 6:277. doi: 10.3389/fpsyg.2015.00277*

### Introduction

People frequently use space to understand emotion. When we say, "s/he is in low/high spirits," we are employing down or up to interpret sadness or happiness, respectively. Previous studies have found this valence-space metaphoric association along the vertical axis (up vs. down; Meier and Robinson, 2004; Schubert, 2005; Lakens, 2012; Gozli et al., 2013a). Moreover, other studies have also found a relationship between valence and dominant/non-dominant hands such that positive emotional valence relates to the dominant hand and negative emotional valence relates to the non-dominant hand (Casasanto, 2009; de la Vega et al., 2013). However, the valence-space metaphoric association along the transverse axis (left vs. right side) has remained unclear. More importantly, the dynamic change of this metaphoric association is still unknown. We aimed to test these questions in the experiments described below.

Concepts are the basis for human cognition. However, there are no concrete examples for abstract concepts. How then can we understand and represent abstract concepts? The idea of embodied cognition is that abstract concepts (e.g., affective concepts) could be grounded in sensorimotor systems (Niedenthal, 2007; Barsalou, 2008). Vermeulen et al. (2007) found that verifying properties from different modalities (i.e., visual, auditory, and affective systems) led to longer reaction times and higher error rates, known as switching costs, indicating that affective concepts were grounded in sensorimotor systems. Furthermore, verifying affective properties but not sensory properties of concepts resulted in an interference effect when participants kept the affective load (i.e., emotional faces) in mind (Vermeulen et al., 2014). Hence, on the embodied view, processing affective concepts engages perceptual symbols in sensorimotor systems.

Like the embodied view, Lakoff and Johnson (1980) propose that people use metaphor to understand abstract concepts. With metaphor, people use source concepts (usually concrete concepts) to elaborate other target concepts (usually abstract concepts). Take affective words, for example. We often use space (concrete concepts) to elaborate affective valence (abstract concepts), such as saying, "you bring me up/down." In this sentence, we understand "happy is up and sad is down" or "good is up and bad is down." Hence, when we understand and express positive and negative emotion, we also activate upper and lower spatial information, respectively.

Many studies have found this metaphoric association with spatial information in processing affective words along the vertical axis. Meier and Robinson (2004) found that discrimination for stimuli in the upper position was facilitated following positive words, whereas discrimination for lower stimuli was facilitated following negative words. Gozli et al. (2013a; Experiment 2) tested the effect of stimulus-onset asynchrony (SOA) on the valence-space metaphor. Priming stimuli (i.e., affective or non-affective words) appeared at the center of the screen for either a short or long SOA. Then a visual target letter "X" or "O" appeared above or below the priming stimuli. The participants identified the visual target regardless of whether the priming stimuli were affective or non-affective words. Identification was facilitated when upper spatial targets followed positive words or lower spatial targets followed negative words.

In addition to the valence-space metaphor along the vertical axis, the association between affective valence and dominant/non-dominant hand has also been tested in many studies. Casasanto (2009) proposes a body-specificity hypothesis to interpret the association between body and mind. This hypothesis holds that left- and right-handed people represent information in different ways. Taking affective valence as an example, right-handers associate positive with right and negative with left, whereas left-handers associate positive with left and negative with right. In other words, people with different dominant hands think of affective valence differently (Casasanto, 2011).

The association between affective valence and dominant/nondominant hand is limited but stable. Research by de la Vega et al. (2012) found that this association along the transverse axis emerged only when participants made valence judgments and explicit mapping between valence and hand (left or right hand). Subsequently, de la Vega et al. (2013) asked participants to cross their hands and use their left/right hand (Experiment 1) or the left/right key (Experiment 2) to respond to affective words. In this way, their right hands were at the left side and left hands were at the right side. Hand and side carried incongruent information. They found that participants' responses were facilitated only when they used their right hands to respond to positive words and left hands to respond to negative words, no matter where the hand was. This finding indicates that the valence-space metaphor is related to hand along the transverse axis (de la Vega et al., 2013).

However, it is still unclear whether the valence-space metaphor is related to the left/right side, relative to the body. Previous studies only asked participants to respond to affective words using their hands, so they only tested the association between affective valence and response hand. Given this instruction, participants might indicate a mapping between affective valence and hand (which is one part of the body). Yet, the association between affective valence and left/right side (which is not a part of the body) has not been well investigated. Whether the valence-hand bodily association would extend its influence to space (left or right side) needs further research. Hence, in the current Experiment 1A, we adopted a spatial cueing task to test whether processing affective words activated spatial information along the transverse axis (left/right side). Meanwhile, we also tested whether processing affective words activated spatial information along the vertical axis (up/down; Experiment 1B) in order to replicate previous studies.

A further question concerns how affective words affect spatial processing in the priming paradigm. One potential explanation may be that the processing of affective words might engage spatial attention (Gozli et al., 2013b). Thus, an affective word could be treated as a special type of endogenous cue. Studies on typical endogenous cues found that reaction times to targets were a function of SOA. Frischen and Tipper (2004) found that valid gaze cues improved performance (cueing effect) when the SOA was 200 but not 1200 ms, indicating that such attention shifts were rapid, automatic, and transient. Fischer et al. (2003) presented a smaller digit (e.g., 1 or 2) or a larger digit (e.g., 8 or 9) at the center of a screen with two boxes at the left and right side. They found that larger or smaller digits shifted participants' attention to the right or left side, respectively. This effect was found when the interval before targets was 400 and 500 ms, but decayed when the interval became longer. In a Simon task, cue-induced spatial coding was found when the SOA was 100, 500, or 1000 ms, which suggested that there was a short-duration coding period (at least 1 s) before decay (Danziger et al., 2001). Hence, the cueing effect for typical endogenous cues lasts for a short period.

However, if we treat affective words as a special type of endogenous cue, it is unknown whether the valence-space association would change with the duration between the offset of the affective word and the onset of the spatial target, hereinafter called the interstimulus interval (ISI). If affective words in a valence-space metaphoric association had a similar function to typical endogenous cues, this association would change under different ISIs. If this is true, we could predict that this association would become weak or disappear when the ISI becomes longer. Testing the time course of the valence-space metaphor could help us to better understand the function of affective words and the dynamic change of this metaphoric association. Thus, in Experiment 2, we manipulated the ISI between affective words and targets to test the time course of the valence-space metaphor.

As mentioned above, in Experiment 1A, we tested whether processing affective words affects a spatial detection task along the transverse axis; that is, whether the valence-space metaphoric association exists along the transverse axis. If it exists, the likely pattern might be that processing positive words facilitates detection for right spatial targets and processing negative words facilitates detection for left spatial targets. In Experiment 1B, we aimed to replicate the results of previous studies indicating that processing affective words affects spatial detection tasks along the vertical axis: specifically, that processing positive words facilitates detection for upper spatial targets and processing negative words facilitates detection for lower spatial targets. The ISI in Experiments 1A and 1B was fixed at 750 ms. Experiment 2 tested the time course of the valence-space metaphoric association along the vertical axis by manipulating the ISI between affective words and spatial targets (100, 500, or 1000 ms). If affective words have similar functions to typical endogenous cues, the attention allocation they induce would rapidly decay after processing affective words, and the valence-space metaphoric effect would be found with short ISIs (e.g., 100 ms), but not long ISIs (e.g., 1000 ms). In other words, the processing of affective words may facilitate participants' performance along the vertical axis when the affective valence is metaphorically congruent with the vertical position, and this facilitation effect may decay rapidly.

### Experiment 1A

In this experiment, we tested whether the metaphoric congruency effect exists on the transverse (left-to-right) axis (i.e., whether processing affective words affects spatial target detection tasks on the transverse axis), using a procedure in which participants kept an affective word in mind and completed a spatial target detection task on the transverse axis.

### Method

### Participants

Forty-five students (mean age = 20.33 years, SD = 1.73 years, 38 of them female) from South China Normal University, Guangzhou, China, participated in this experiment for a small monetary compensation. They were recruited randomly by posting an advertisement on a campus forum. All of them had normal or corrected-to-normal vision. They participated in this experiment voluntarily and could withdraw at any time. All participants provided written informed consent prior to the experiment. The study was approved by the ethics review board of South China Normal University.

### Materials

We used the same 240 affective words as those used by Xie et al. (2014), which were selected from the Chinese Affective Words System (CAWS). Half of them were positive, while the other half were negative (Wang et al., 2008). Positive and negative words were matched for number of first character strokes, number of second character strokes, number of word strokes, and word occurrence frequency. E-Prime 2.0 (Psychological Software Tools, Inc., Sharpsburg, PA, USA) was used for material presentation and data recording.

### Research Design and Procedure

This experiment adopted a 2 (valence of affective word: positive vs. negative) × 2 (location of spatial target: left vs. right) full within-subjects design.

The procedure was similar to that of Ouellet et al. (2010; Experiment 1). The participants sat in a dimly lit soundproof cube. All materials were presented on a black background. To begin, a red cross was presented at the center of the screen for 500 ms, followed by a Chinese affective word (bold, 24-point SimSun Font) presented at the center of the screen for 1500 ms. The participants were instructed to memorize the word and were told that they would be tested on it at the end of each trial. After a 500-ms blank screen, two empty white squares (1.3 cm × 1.3 cm) were presented for 250 ms, one at the left and one at the right of the screen. Then a white dot (diameter = 5 mm) appeared in one of these two squares for 50 ms. The squares remained on the screen for 2300 ms or until the participants made a response indicating the location of the white dot. Participants were asked to put their left index fingers on the "Z" key and right index fingers on the "M" key on the keyboard. They were asked to press the "Z" key if the dot appeared in the left square or the "M" key if the dot appeared in the right square. After that, a blank screen was presented for 1000 ms. Then a question "?Positive?" or "?Negative?" was presented in Chinese for 4000 ms or until a response was given. The participants needed to judge whether this word (i.e., positive or negative) described the valence of the previously shown affective word correctly. Participants were asked to press the "Yes" key ("Z" or "M") if the question word described the valence of the previous word correctly or the "No" key ("M" or "Z") if it described the valence of the previous word incorrectly. The allocation of "Yes" or "No" to "Z" or "M" was counterbalanced between participants. After a 1000-ms blank screen, the next trial was presented. Three short rest sessions were included in the experiment (see **Figure 1**).
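The trial structure described above can be written down as an ordered list of events, which makes the interval between word offset and target onset easy to check. This is only a sketch: the event names and helper functions are illustrative, and it assumes that the 2300-ms response window starts after the 50-ms dot, which the text does not state explicitly.

```python
# Event sequence of one trial, in presentation order: (event, duration in ms).
# Durations come from the procedure text; the event names are illustrative.
TRIAL_EVENTS = [
    ("fixation_cross", 500),
    ("affective_word", 1500),
    ("blank_after_word", 500),
    ("empty_squares", 250),
    ("target_dot", 50),
    ("response_window", 2300),    # upper bound; ends early on a key press
    ("blank_before_question", 1000),
    ("memory_question", 4000),    # upper bound; ends early on a key press
    ("inter_trial_blank", 1000),
]


def duration_of(event: str) -> int:
    """Duration (ms) of the named event."""
    for name, duration in TRIAL_EVENTS:
        if name == event:
            return duration
    raise KeyError(event)


def onset_of(event: str) -> int:
    """Onset (ms from trial start) of the named event.

    For events after the response window this is an upper bound, because
    the window closes as soon as the participant responds.
    """
    elapsed = 0
    for name, duration in TRIAL_EVENTS:
        if name == event:
            return elapsed
        elapsed += duration
    raise KeyError(event)


def word_to_target_interval() -> int:
    """Interval (ms) between the offset of the word and the onset of the dot."""
    word_offset = onset_of("affective_word") + duration_of("affective_word")
    return onset_of("target_dot") - word_offset


print(word_to_target_interval())
```

Under these assumptions the interval is the 500-ms blank plus the 250-ms empty squares, which corresponds to the fixed 750-ms ISI stated for Experiments 1A and 1B.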

Reaction times and accuracy data from the spatial detection task were dependent variables and were used for further analyses in all experiments reported here. We included the affective word memory test to ensure that participants memorized the affective words correctly in each trial.
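The timing of a trial can be summarized as a short sketch. The event names and the list layout below are illustrative assumptions, not the authors' experiment script; the durations are those given in the procedure. Under these durations, the interval between the offset of the affective word and the onset of the target dot is the 500-ms blank plus the 250-ms empty squares.

```python
# Event durations (ms) for one Experiment 1A trial, up to the target dot.
# Names and structure are hypothetical; durations come from the procedure text.
TRIAL_EVENTS = [
    ("fixation_cross", 500),
    ("affective_word", 1500),
    ("blank", 500),
    ("empty_squares", 250),
    ("target_dot", 50),
]


def onset_of(event_name, events=TRIAL_EVENTS):
    """Return the onset time (ms from trial start) of the named event."""
    elapsed = 0
    for name, duration in events:
        if name == event_name:
            return elapsed
        elapsed += duration
    raise KeyError(event_name)


def word_to_target_interval(events=TRIAL_EVENTS):
    """Interval (ms) between affective-word offset and target-dot onset."""
    word_offset = onset_of("affective_word", events) + dict(events)["affective_word"]
    return onset_of("target_dot", events) - word_offset


print(word_to_target_interval())  # 750
```
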

#### Results

Given that we used the same paradigm as Ouellet et al. (2010; Experiment 1), we also adopted their trimming and data analysis methods. Two participants' data were deleted because their accuracy in the spatial target detection task or the affective word memory task was lower than 75%. For the remaining participants, trials with an erroneous response in the detection or memory task (675 trials, 6.39%) were discarded. Then, correct trials with reaction times below 100 ms or above 650 ms (195 trials, 1.85%) were discarded as outliers. All remaining participants' accuracy data from the spatial detection task were included in the analyses.
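The exclusion and trimming steps can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the `Trial` record and the function names are hypothetical, and treating reaction times of exactly 100 ms or 650 ms as valid is an assumption, since the text does not say how boundary values were handled.

```python
from dataclasses import dataclass

ACCURACY_CUTOFF = 0.75            # participants below this in either task are excluded
RT_MIN_MS, RT_MAX_MS = 100, 650   # correct trials outside this range count as outliers


@dataclass
class Trial:  # hypothetical record layout
    participant: str
    detection_correct: bool
    memory_correct: bool
    rt_ms: float


def accuracy(trials, attr):
    """Proportion of trials for which the boolean attribute is True."""
    return sum(getattr(t, attr) for t in trials) / len(trials)


def excluded_participants(trials):
    """Participants whose detection or memory accuracy is below the cutoff."""
    by_participant = {}
    for t in trials:
        by_participant.setdefault(t.participant, []).append(t)
    return {
        p for p, ts in by_participant.items()
        if accuracy(ts, "detection_correct") < ACCURACY_CUTOFF
        or accuracy(ts, "memory_correct") < ACCURACY_CUTOFF
    }


def trim(trials):
    """Return (kept, errors, outliers) for the reaction time analysis."""
    dropped = excluded_participants(trials)
    kept, errors, outliers = [], [], []
    for t in trials:
        if t.participant in dropped:
            continue
        if not (t.detection_correct and t.memory_correct):
            errors.append(t)      # erroneous in the detection or memory task
        elif not (RT_MIN_MS <= t.rt_ms <= RT_MAX_MS):
            outliers.append(t)    # correct, but too fast or too slow
        else:
            kept.append(t)
    return kept, errors, outliers
```

Error trials are removed before the reaction time window is applied, so the outlier count refers to correct trials only, as in the text.
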

Reaction times and accuracy data from the spatial detection task were analyzed with two separate 2 (valence: positive vs. negative) × 2 (location: left vs. right) ANOVAs, using both participants (*F*1) and items (*F*2) as random factors. Valence was a within-subjects and between-items factor when participants and items were random factors, respectively. For studies on the valence-space metaphor, it is common to conduct two separate ANOVAs to include both participants and items as random factors in the analyses (e.g., de la Vega et al., 2012; Kong, 2013). Using the same data analyses as those used in previous studies makes it possible to compare the current results with those of previous studies.
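To make the two analyses concrete, the sketch below computes the valence × location interaction once with participants and once with items as the random factor. It is an illustration under stated assumptions rather than the authors' procedure. It relies on two standard equivalences: in a 2 × 2 within-participants design, the interaction *F* equals the squared one-sample *t* on each participant's interaction contrast; with valence between items and location within items, it equals the squared pooled-variance two-sample *t* on each item's location difference. The tuple layout, the `"pos"`/`"neg"` labels, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, variance


def cell_means(trials, unit_index):
    """Mean RT per (unit, valence, location) cell.

    trials: iterable of (participant, item, valence, location, rt_ms).
    unit_index: 0 to aggregate by participant, 1 to aggregate by item.
    """
    cells = defaultdict(list)
    for trial in trials:
        cells[(trial[unit_index], trial[2], trial[3])].append(trial[4])
    return {key: mean(rts) for key, rts in cells.items()}


def f1_interaction(trials, loc_a="left", loc_b="right"):
    """By-participant interaction F and its degrees of freedom."""
    m = cell_means(trials, 0)
    participants = sorted({key[0] for key in m})
    contrasts = [
        (m[(p, "pos", loc_a)] - m[(p, "pos", loc_b)])
        - (m[(p, "neg", loc_a)] - m[(p, "neg", loc_b)])
        for p in participants
    ]
    n = len(contrasts)
    t = mean(contrasts) / (variance(contrasts) / n) ** 0.5
    return t ** 2, (1, n - 1)


def f2_interaction(trials, loc_a="left", loc_b="right"):
    """By-item interaction F (valence between items, location within items)."""
    m = cell_means(trials, 1)
    diffs = {"pos": [], "neg": []}
    for item, valence in sorted({(key[0], key[1]) for key in m}):
        diffs[valence].append(m[(item, valence, loc_a)] - m[(item, valence, loc_b)])
    a, b = diffs["pos"], diffs["neg"]
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5
    return t ** 2, (1, na + nb - 2)
```

The degrees of freedom this yields, (1, number of participants − 1) by participant and (1, number of items − 2) by item, have the same form as the reported *F*1(1,42) and *F*2(1,238). Passing `loc_a="top", loc_b="bottom"` would apply the same sketch to a vertical layout.
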

No significant results were found in the accuracy data analyses (*p*s > 0.05). Specifically, the interaction between valence and target location was not significant [*F*1(1,42) = 0.54, *p* = 0.468, η<sub>p</sub><sup>2</sup> = 0.01; *F*2(1,238) = 0.48, *p* = 0.489, η<sub>p</sub><sup>2</sup> < 0.01].

In the reaction times analyses, no significant results were found (*p*s > 0.05). Of particular note, the interaction between valence and target location was not significant [*F*1(1,42) = 0.11, *p* = 0.748, η<sub>p</sub><sup>2</sup> < 0.01; *F*2(1,238) = 0.60, *p* = 0.441, η<sub>p</sub><sup>2</sup> < 0.01; see **Figure 2** and **Table 1**].

#### Discussion

In Experiment 1A, we did not find any metaphoric congruency effects. Responses to the spatial target were not affected by affective words, indicating that participants may not use transverse space to understand affective words. In Experiment 1B, we intended to test whether participants use vertical (top-to-bottom) space to understand affective words. According to previous findings, metaphoric congruency effects have been reported with the vertical axis. We predicted that a metaphoric congruency effect would appear in Experiment 1B.

### Experiment 1B

In Experiment 1B, we tried to replicate previous findings on metaphoric congruency effects using the same paradigm as Experiment 1A. The procedure was similar to Experiment 1A, except that the spatial target was presented on the vertical axis.

### Method

#### Participants

Forty-five students (mean age = 21.24 years, SD = 2.05 years, 35 of them female) from South China Normal University, Guangzhou, China, participated in this experiment voluntarily. None of them had participated in the previous experiment. They were recruited using the same methods as Experiment 1A. All participants provided written informed consent prior to the experiment and were paid after the experiment. The study was approved by the ethics review board of South China Normal University.

#### Materials

The same affective words were used as those in Experiment 1A.

#### Research Design and Procedure

This experiment also adopted a 2 (valence of affective word: positive vs. negative) × 2 (location of spatial target: bottom vs. top) full within-subjects design.

The procedure was similar to Experiment 1A, except for the following: (1) two empty white squares were presented at the bottom and top of the screen, (2) the white dot was presented in a bottom or top square, (3) participants used "Y" and "B" to respond to the spatial target detection task. When the dot was presented in the bottom or top square, participants pressed "Y" or "B" to make a response, respectively; (4) in the memory test task, participants pressed "Y" or "B" for a "Yes" or "No" response. The allocation of "Yes" or "No" to "Y" or "B" was counterbalanced between participants.

#### Results

Two participants' data were deleted due to their response accuracies in the spatial target detection task or affective word memory task measuring lower than 75%. The same trimming method was adopted as in Experiment 1A. First, erroneous trials (722 trials, 7.00%) in the detection or memory task were discarded. The remaining correct trials in the spatial detection task with reaction times below 100 and above 650 ms (200 trials, 1.94%) were also discarded as outliers.

Two separate 2 (valence of affective word: positive vs. negative) × 2 (location of spatial target: bottom vs. top) ANOVAs were employed to analyze reaction times and accuracy data from the spatial detection task. Both participants (*F*1) and items (*F*2) were considered random factors in the analyses.

In the accuracy data analyses, the main effect for location of spatial target was significant [*F*1(1,42) = 7.44, *p* = 0.009, η<sub>p</sub><sup>2</sup> = 0.15; *F*2(1,238) = 14.00, *p* < 0.0005, η<sub>p</sub><sup>2</sup> = 0.06]. Responses to upper targets were more accurate than responses to lower targets. The interaction between valence of affective word and location of spatial target was not significant [*F*1(1,42) = 0.89, *p* = 0.351, η<sub>p</sub><sup>2</sup> = 0.02; *F*2(1,238) = 0.53, *p* = 0.466, η<sub>p</sub><sup>2</sup> < 0.01].

In the reaction times analyses, the main effect for valence of affective word was only significant by participant [*F*1(1,42) = 8.25, *p* = 0.006, η<sub>p</sub><sup>2</sup> = 0.16; *F*2(1,238) = 2.35, *p* = 0.127, η<sub>p</sub><sup>2</sup> = 0.01]. The main effect for location of spatial target was significant [*F*1(1,42) = 33.71, *p* < 0.0005, η<sub>p</sub><sup>2</sup> = 0.45; *F*2(1,238) = 268.00, *p* < 0.0005, η<sub>p</sub><sup>2</sup> = 0.53]. Importantly, the interaction between the valence of affective word and the location of spatial target was significant only by item [*F*1(1,42) = 2.87, *p* = 0.097, η<sub>p</sub><sup>2</sup> = 0.06; *F*2(1,238) = 4.29, *p* = 0.039, η<sub>p</sub><sup>2</sup> = 0.02]. Further simple effect analyses found that when the spatial targets were presented at the top, positive words facilitated the participants' responses [*F*1(1,42) = 9.62, *p* = 0.003, η<sub>p</sub><sup>2</sup> = 0.19; *F*2(1,238) = 5.64, *p* = 0.018, η<sub>p</sub><sup>2</sup> = 0.02], compared to negative words. No such effect was found when the spatial targets were presented at the bottom [*F*1(1,42) = 0.87, *p* = 0.357, η<sub>p</sub><sup>2</sup> = 0.02; *F*2(1,238) = 0.02, *p* = 0.884, η<sub>p</sub><sup>2</sup> < 0.01]. No further effects reached statistical significance (*p*s > 0.05; see **Figure 3**; **Table 2**).

#### Discussion

In Experiment 1B, we found a metaphoric congruency effect when the spatial target was presented at the top. Processing positive words facilitated participants' responses to upper spatial targets, indicating that positive words may activate upper spatial information. However, we did not find any such congruency effect when the spatial target was presented at the bottom. In the experiment by Gozli et al. (2013b), processing positive words made participants' saccade trajectories switch above fixation. Hence, our current findings are consistent with previous findings.

However, as it was not clear whether the metaphoric congruency effect on the vertical axis observed in Experiment 1B would be modulated by the ISI (i.e., the duration between the offset of the affective word and the onset of the spatial target), we intended to test this hypothesis in Experiment 2.

## Experiment 2

In Experiment 1B, we found a metaphoric congruency effect on the vertical axis. In the following experiment, we intended to test whether this effect would be modulated by the interval between the offset of the special endogenous cue (i.e., affective word) and the onset of the spatial target. Hence, we manipulated the ISI for 100, 500, or 1000 ms in the following experiment.

### Method

#### Participants

Fifty students (mean age = 21.40 years, SD = 1.80 years, 44 of them female) from South China Normal University, Guangzhou, China, participated in this experiment voluntarily. All of them were right-handed and had normal or corrected-to-normal vision. None of them had participated in previous experiments. They were recruited using the same methods as Experiment 1. All participants provided written informed consent prior to the experiment. The study was approved by the ethics review board of South China Normal University.

#### Materials

To ensure a sufficient number of valid trials in each condition, we selected 288 affective two-character words from the CAWS (Wang et al., 2008). Words were matched for number of first character strokes (*M*<sub>positive</sub> = 8.83, *M*<sub>negative</sub> = 8.83), number of second character strokes (*M*<sub>positive</sub> = 8.62, *M*<sub>negative</sub> = 8.54), number of word strokes (*M*<sub>positive</sub> = 17.44, *M*<sub>negative</sub> = 17.38), and word occurrence frequency (*M*<sub>positive</sub> = 68.26, *M*<sub>negative</sub> = 62.56), all of which were statistically equivalent between positive and negative words (*p*s > 0.05).

#### Research Design and Procedure

This experiment adopted a 2 (valence of affective word: positive vs. negative) × 2 (location of spatial target: bottom vs. top) × 3 (ISI: 100, 500, or 1000 ms) full within-subjects design.

The procedure was modified from Experiment 1B. We removed the blank screen after the affective word, and the empty boxes were shown for 100, 500, or 1000 ms. In other words, we manipulated the duration between the offset of the affective word and the onset of the spatial target (i.e., the duration of the empty square boxes), which was the ISI in Experiment 2. Other aspects were the same as those in Experiment 1B (see **Figure 4**).
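The factorial structure of Experiment 2 can be enumerated in a few lines. The sketch below is illustrative only. The trials-per-cell figure assumes that each of the 288 words was shown once per participant and that words were spread evenly over the cells; the text does not state either point explicitly.

```python
from itertools import product

VALENCES = ("positive", "negative")
LOCATIONS = ("bottom", "top")
ISIS_MS = (100, 500, 1000)
N_WORDS = 288  # number of affective words reported in the Materials section

# Every combination of valence, target location, and ISI is one design cell.
cells = list(product(VALENCES, LOCATIONS, ISIS_MS))

# Assumption: one presentation per word, balanced across cells.
trials_per_cell = N_WORDS // len(cells)

for valence, location, isi in cells:
    print(f"{valence:8s} {location:6s} {isi:4d} ms -> {trials_per_cell} trials")
```
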

TABLE 2 | Mean reaction times (ms) and correct responses (%) per condition in Experiment 1B.


#### Results

We used the same trimming method as that in Experiment 1. Data from two participants were removed because their accuracy was lower than 75% in the spatial target detection task or affective word memory task. First, erroneous trials (983 trials, 7.11%) in the detection or memory task were deleted. Correct trials in the spatial detection task with reaction times below 100 ms and above 650 ms (569 trials, 4.12%) were deleted as outliers.

Accuracy data and reaction times from the spatial detection task were submitted to two separate 2 (valence of affective word: positive vs. negative) × 2 (location of spatial target: bottom vs. top) × 3 (ISI: 100, 500, or 1000 ms) ANOVAs. Participants (*F*1) and items (*F*2) were both considered random factors in the analyses.

Accuracy analyses found the main effect for ISI was only significant by item [*F*1(2,94) = 3.29, *p* = 0.063, η<sub>p</sub><sup>2</sup> = 0.07; *F*2(2,572) = 5.28, *p* = 0.006, η<sub>p</sub><sup>2</sup> = 0.02]. The participants' responses were most accurate when the ISI was 500 ms and least accurate when the ISI was 1000 ms. No other significant effects were found (*p*s > 0.05).

Reaction times analyses found the main effect for valence was only significant by participant [*F*1(1,47) = 13.08, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.22; *F*2(1,286) = 2.70, *p* = 0.101, η<sub>p</sub><sup>2</sup> = 0.01]. The main effect for the location of spatial target was significant [*F*1(1,47) = 4.27, *p* = 0.044, η<sub>p</sub><sup>2</sup> = 0.08; *F*2(1,286) = 8.90, *p* = 0.003, η<sub>p</sub><sup>2</sup> = 0.03]. The main effect for ISI was significant [*F*1(2,94) = 114.957, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.71; *F*2(2,572) = 344.202, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.55]. The interaction between valence and location was significant [*F*1(1,47) = 4.15, *p* = 0.047, η<sub>p</sub><sup>2</sup> = 0.08; *F*2(1,286) = 4.06, *p* = 0.045, η<sub>p</sub><sup>2</sup> = 0.01]. Further simple effect analyses revealed that when the spatial targets were presented at the top, positive words facilitated the participants' responses [*F*1(1,47) = 12.87, *p* = 0.001, η<sub>p</sub><sup>2</sup> = 0.22; *F*2(1,286) = 7.16, *p* = 0.008, η<sub>p</sub><sup>2</sup> = 0.02], compared to negative words. No such effect was found when spatial targets were presented at the bottom [*F*1(1,47) = 0.98, *p* = 0.327, η<sub>p</sub><sup>2</sup> = 0.02; *F*2(1,286) = 0.03, *p* = 0.875, η<sub>p</sub><sup>2</sup> < 0.01]. The three-way interaction between valence, ISI, and the location of spatial target was not significant [*F*1(2,94) = 1.71, *p* = 0.187, η<sub>p</sub><sup>2</sup> = 0.04; *F*2(2,572) = 1.24, *p* = 0.289, η<sub>p</sub><sup>2</sup> < 0.01]. No other significant effects were found (*p*s > 0.05; see **Figure 5**; **Table 3**).

We further conducted planned analyses to test whether interactions between valence and location were consistent under different ISIs. Results revealed that when the ISI was 100 ms, the interaction between valence and location was significant [*F*1(1,47) = 7.48, *p* = 0.009, η<sub>p</sub><sup>2</sup> = 0.14; *F*2(1,286) = 7.08, *p* = 0.008, η<sub>p</sub><sup>2</sup> = 0.02]. However, the interaction between valence and location was not significant when the ISI was 500 ms [*F*1(1,47) = 0.32, *p* = 0.572, η<sub>p</sub><sup>2</sup> = 0.01; *F*2(1,286) = 0.45, *p* = 0.501, η<sub>p</sub><sup>2</sup> < 0.01] and when the ISI was 1000 ms [*F*1(1,47) = 0.68, *p* = 0.415, η<sub>p</sub><sup>2</sup> = 0.01; *F*2(1,286) = 0.54, *p* = 0.462, η<sub>p</sub><sup>2</sup> < 0.01]. Simple effect analyses on the condition of 100-ms ISI found that when spatial targets were presented at the top of the screen, the facilitation effect of processing positive words on participants' responses was significant by participant [*F*1(1,47) = 5.45, *p* = 0.024, η<sub>p</sub><sup>2</sup> = 0.10; *F*2(1,286) = 2.60, *p* = 0.108, η<sub>p</sub><sup>2</sup> = 0.01]. Such a facilitation effect of processing negative words on responses to the bottom spatial targets was also marginally significant [*F*1(1,47) = 2.87, *p* = 0.097, η<sub>p</sub><sup>2</sup> = 0.06; *F*2(1,286) = 3.07, *p* = 0.081, η<sub>p</sub><sup>2</sup> = 0.01]. However, we only found a facilitation effect of processing positive words with upper spatial targets when the ISI was 500 ms [*F*1(1,47) = 5.05, *p* = 0.029, η<sub>p</sub><sup>2</sup> = 0.10; *F*2(1,286) = 3.28, *p* = 0.071, η<sub>p</sub><sup>2</sup> = 0.01] and 1000 ms [*F*1(1,47) = 8.05, *p* = 0.007, η<sub>p</sub><sup>2</sup> = 0.15; *F*2(1,286) = 4.45, *p* = 0.036, η<sub>p</sub><sup>2</sup> = 0.02]. No significant effect was found with lower spatial targets when the ISI was 500 and 1000 ms (*p*s > 0.05).



#### Discussion

In this experiment, we found that when the ISI was 100 ms, keeping a positive word in mind facilitated the detection of the upper target, and keeping a negative word in mind facilitated the detection of the lower target. However, when the ISI was 500 or 1000 ms, keeping a positive word in mind facilitated the detection of the upper target, but no such effect was found in the detection of the lower target. Thus, the valence-space metaphoric association might be modulated by ISI. Further, in Experiment 1B, where the ISI was 750 ms, we also found that positive words facilitated participants' responses to upper spatial targets, relative to negative words. This matches the pattern in Experiment 2 when the ISI was 500 or 1000 ms. Hence, we successfully replicated the finding from Experiment 1B.

## General Discussion

The present studies tested the metaphoric association along the transverse and vertical axes and further investigated dynamic variations of this association under different intervals between affective words and spatial targets. Experiment 1A did not reveal that processing affective words affected spatial detection along the transverse axis. However, Experiment 1B demonstrated that processing affective words did affect spatial detection along the vertical axis. Experiment 2 replicated the findings from Experiment 1B and further found that the interaction between valence and location along the vertical axis might be modulated by ISI. Together, results indicate that both axis and ISI might be important in the metaphoric association.

Experiment 1A did not find a valence-space metaphoric association along the transverse axis. However, the body-specificity hypothesis holds that right-handed people associate right with positive and left with negative, whereas left-handed people show a reversed pattern (Casasanto, 2009). This association even emerges in children and cannot be explained by language or cultural conventions (Casasanto and Henetz, 2012). The absence of metaphoric association in Experiment 1A does not contradict the body-specificity hypothesis. This hypothesis posits a natural association between valence (e.g., positive vs. negative) and dominant/non-dominant hand (i.e., body). However, Experiment 1A tested the relationship between affective valence (e.g., positive vs. negative) and left/right space (which is not a part of body, i.e., non-body). Hence, the body-specificity hypothesis posits bodily association, whereas our Experiment 1A tested non-bodily association; this may explain the difference between the prediction of the body-specificity hypothesis and our results.

Although we did not control for handedness in Experiments 1A,B, the absence of metaphoric association in Experiment 1A could not be attributed to handedness. First, all participants were Chinese. The percentage of left-handed people is only 0.23% among Chinese (Li, 1983). Considering that we recruited participants randomly for Experiments 1A,B, the probability of recruiting left-handed participants was very low. Second, in Experiment 2, we only recruited right-handed participants and found the same results as those in Experiment 1B. Hence, we believe that the results would remain unchanged if we controlled for handedness.

We found a metaphoric association along the vertical axis in Experiment 1B, when the ISI was 750 ms. This metaphoric congruency effect was only found when spatial targets were presented at the upper location. Processing positive words facilitated the detection of upper spatial targets compared to negative words. We did not find that processing negative words facilitated the detection of lower spatial targets, in contrast to the results of Meier and Robinson (2004; Experiment 2). In their experiment, participants verbally evaluated affective valence and then discriminated a non-valenced target (i.e., q or p) that appeared at the bottom or top of the screen. They found that responding to positive and negative words facilitated the discrimination of upper and lower targets, respectively. They interpreted this symmetric facilitation effect as a Stroop-like effect. The Stroop-like effect predicts that upper targets following positive words and lower targets following negative words will be processed faster than upper targets following negative words and lower targets following positive words. This prediction was not fully supported by our results in Experiment 1B.

On the contrary, Schubert (2005; Experiment 5B) asked participants to judge the affective valence of group names that appeared at the top or bottom of the screen. They only found that positive group names were evaluated faster when they were presented at the top of the screen compared to negative group names. No such effect was found when they were presented at the bottom of the screen. Hence, whether the metaphoric congruency effect is symmetric is debatable. Further studies are needed to test this question. However, this finding is compatible with our results in Experiment 1B.

The asymmetric metaphoric congruency effect observed in Experiment 1B may be interpreted in terms of polarity correspondence (Lakens, 2012). Proctor and Cho (2006) proposed that polarity correspondence could be used to interpret mapping effects in binary choice reaction tasks. People code stimulus and response alternatives as positive (+polar) or negative (−polar) along several dimensions, which is a basic aspect of human information processing. People's responses are facilitated to a greater extent when the polarities correspond than when they do not. Lakens (2012) extended polarity correspondence to the metaphoric congruency effect. Lakens (2012) further manipulated the polarity frequency (75% +polar and 25% −polar, or vice versa) and found that the interaction between words and vertical position was removed in the 75% −polar condition, indicating that the metaphoric congruency effect could be explained by polarity correspondence.

In Experiment 1B, we only found a metaphoric congruency effect when spatial targets appeared at the top of the screen. This result is consistent with polarity correspondence, which holds that processing +polar (up) is faster than −polar (down). Only +polar spatial responses receive a polarity benefit; −polar spatial responses do not (Lynott and Coventry, 2014). In the current experiment, positive words and upper spatial targets could be treated as +polar and negative words and lower spatial targets could be treated as −polar. Hence, responses are facilitated by positive words (+polar) only when spatial targets appear at the top of the screen (+polar), which is supported by the results of Experiment 1B.

In Experiment 1, we replicated previous findings and found that processing affective words facilitated spatial target detection along the vertical axis. However, we did not find this effect along the transverse axis. One potential explanation for this may be that, in Chinese culture, people treat the left side as more positive than the right side. However, the body-specificity hypothesis holds that the right side is more positive than the left side for right-handed people (Casasanto, 2009). This conflict between Chinese culture and the body-specificity hypothesis may eliminate the mapping between valence and left/right side. Further studies are needed to test this hypothesis.

In Experiment 2, we intended to test the time course of the metaphoric association. The results found that when ISI was 100 ms, processing upper targets was faster, following positive words compared to negative words. Meanwhile, this effect was also marginally significant for the processing of lower targets. On the contrary, when ISI was 500 or 1000 ms, we only found the facilitation effect of positive words on upper targets. The facilitation effect of negative words on lower targets disappeared. It seems that the metaphoric association was stronger with ISI of 100 ms. The symmetric facilitation effect of affective words on targets (i.e., positive words facilitated the processing of upper targets and negative words facilitated the processing of lower targets) only emerged at 100-ms ISI. This finding suggests that ISI is important in metaphoric association.

This metaphoric association observed in Experiments 1B and 2 is most likely based on attention allocation. Xie et al. (2014) adopted a similar paradigm using a 750-ms ISI and found that when spatial targets appeared at the top and bottom of the screen, a larger P200 amplitude was found after positive and negative words, respectively. This P200 may be related to attention allocation. Zhang et al. (2013) tested the neural mechanisms underlying spatial target discrimination after spatial cue words using a random ISI from 400 to 500 ms. Spatial cue words denoted objects that typically appeared in upper or lower space. The spatial targets were "p" or "q" that appeared at the top or bottom of the screen. Although the cue words did not predict discrimination of spatial targets, the authors still found enhanced N1 amplitudes for congruent conditions (e.g., "eagle" followed by upper spatial targets) compared to incongruent conditions (e.g., "eagle" followed by lower spatial targets). These findings indicated that the effect of processing of words on the subsequent spatial task might be based on spatial attention.

If the attention allocation interpretation is indeed a basis for metaphoric association in Experiments 1B and 2, this attention allocation may be different from attention allocation that is endogenously oriented by central cues. In Experiments 1B and 2, the central cues were affective words that did not contain valid spatial information for the subsequent spatial targets. In this view, the affective words were invalid cues. However, conceptual metaphor theory holds that affective valence is based on space: positive is based on upper space, whereas negative is based on lower space (Lakoff and Johnson, 1980). Vermeulen (2010) found that positive emotion decreased the attentional blink (the negative effect of identification of the first target on identification of the second target), whereas negative emotion increased it, implying that attention was distinctively affected by current affective states. Hence, affective words may be treated as special endogenous cues that may have typical cueing effects. The typical endogenous cueing effect was modulated by SOA. Funes et al. (2005) adopted color as a central symbolic cue in a spatial Stroop task. They found that the facilitation effect was maximal at an 850-ms SOA following central cues. The facilitation effect of typical endogenous cues would disappear after a long SOA. This may explain why interactions between the valence of affective word and the location of spatial target were different under short ISI (100 ms) versus long ISI (500 or 1000 ms) in Experiment 2. In addition, more studies are needed to test the difference between typical endogenous cues and special central cues (i.e., words that have metaphoric association with spatial information) in cueing tasks.

The present studies found that axis might be important in the valence-space metaphoric association. The role of axis in other metaphors has also been debated. Boroditsky et al. (2011) found that Chinese speakers prefer to think about time vertically and horizontally compared to English speakers who prefer to think about time horizontally. However, Tse and Altarriba (2008) revealed that both Chinese–English bilinguals and English monolinguals think about time vertically. From this debate, we could hypothesize that processing abstract concepts (e.g., emotion or time) mainly relies on the vertical spatial axis. The processing of abstract concepts in relation to the transverse spatial axis still needs further study. More studies on the development of metaphor may shed light on this question.

First, in Experiment 1, we used a between-subjects design to test the valence-space metaphor along the transverse and vertical axes because it is difficult for participants to respond using four keys simultaneously in one study. Follow-up studies may modify the current paradigm and adopt a non-hand-operated response to test the three-way interaction between valence of affective word, location of spatial target, and axis directly within one experiment. Second, we did not test the valence-space metaphoric association along the transverse axis under different ISIs, because this association did not emerge under a moderate ISI in Experiment 1A (i.e., 750 ms). However, in Experiment 1B, this association emerged along the vertical axis at the same ISI. Further studies may test the time course of the valence-space metaphoric association along the transverse axis. Third, in Experiment 2, we did not include a longer ISI (e.g., 2000 ms). The three-way interaction between valence of affective word, location of spatial target, and ISI was not significant. Follow-up studies may consider replicating the current study including longer ISIs.

In sum, our studies found that axis might be important in the valence-space metaphor. The interaction between valence of affective word and location of spatial target only emerged along the vertical axis. Further, ISI might also be important in the valence-space metaphor. When ISI was 100 ms, positive words facilitated the detection of the upper targets, compared to negative words. Meanwhile, negative words facilitated the detection of the lower targets, compared to positive words. However, when the ISI was 500 or 1000 ms, results only revealed that positive words facilitated the detection of the upper targets, compared to negative words. No such effect was found in the detection of the lower targets. The current study thus provides evidence for the spatial specificity and time course of valence-space metaphors.

## Author Contributions

RW developed the study concept. JX and RW designed experiments. JX and YH ran data-collection procedures of Experiments 1A,B. JX and WL ran data-collection procedure of Experiment 2. JX and RW analyzed and interpreted the data. JX drafted the manuscript. RW and YH provided critical revisions for the manuscript.

## Acknowledgments

We thank Chi-Shing Tse and two reviewers for their helpful suggestions and comments on earlier versions of this manuscript. This research was funded by the National Excellent Doctoral Dissertation Foundation of China (201204), the Humanities and Social Science Research Base Project from the Ministry of Education of China (13JJD190006), the Excellent Young Teacher Foundation of University in Guangdong (Yq2013047), and the Graduate Research Innovation Foundation of South China Normal University (2013kyjj077).

## References


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Xie, Huang, Wang and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Spatial biases during mental arithmetic: evidence from eye movements on a blank screen

### *Matthias Hartmann<sup>1,2</sup>\*, Fred W. Mast<sup>2</sup> and Martin H. Fischer<sup>1</sup>*

*<sup>1</sup> Division of Cognitive Sciences, University of Potsdam, Potsdam, Germany*

*<sup>2</sup> Department of Psychology, University of Bern, Bern, Switzerland*

#### *Edited by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada*

#### *Reviewed by:*

*Luisa Lugli, University of Bologna, Italy; Thomas J. Faulkenberry, Tarleton State University, USA*

#### *\*Correspondence:*

*Matthias Hartmann, Division of Cognitive Sciences, University of Potsdam, Karl-Liebknecht-Strasse 24-25 House 14, D-1446 Potsdam OT Golm, Germany e-mail: hartmann.psychology@gmail.com*

While the influence of spatial-numerical associations in number categorization tasks has been well established, their role in mental arithmetic is less clear. It has been hypothesized that mental addition leads to rightward and upward shifts of spatial attention (along the "mental number line"), whereas subtraction leads to leftward and downward shifts. We addressed this hypothesis by analyzing spontaneous eye movements during mental arithmetic. Participants solved verbally presented arithmetic problems (e.g., 2 + 7, 8 − 3) aloud while looking at a blank screen. We found that eye movements reflected spatial biases in the ongoing mental operation: Gaze position shifted more upward when participants solved addition compared to subtraction problems, and the horizontal gaze position was partly determined by the magnitude of the operands. Interestingly, the difference between addition and subtraction trials was driven by the operator (plus vs. minus) but was not influenced by the computational process. Thus, our results do not support the idea of a mental movement toward the solution during arithmetic but indicate a semantic association between operation and space.

**Keywords: mental arithmetic, eye movements, mental number line, operational momentum, embodied cognition, grounded cognition**

In Western cultures small numbers are typically represented to the left of larger numbers, both in external space (e.g., on rulers and timetables) and in cognitive space, following the concept of the "mental number line" (e.g., Dehaene et al., 1993; Hubbard et al., 2005; Fischer and Shaki, 2014a). The pervasive small-left and large-right-association is captured by the SNARC (spatial-numerical association of response codes) effect, showing left-sided response facilitation for small numbers and right-sided response facilitation for large numbers. Spatial-numerical associations have been well established in the horizontal dimension of space (see Fischer and Shaki, 2014a, for a review), and have recently been extended to vertical space (e.g., Ito and Hatta, 2004; Loetscher et al., 2010; Grade et al., 2012; Hartmann et al., 2012, 2014a; Holmes and Lourenco, 2012; Shaki and Fischer, 2012; Fischer, 2012; Winter and Matlock, 2013). For example, when participants name numbers at random, they generate smaller numbers during downward when compared to upward body motion (Hartmann et al., 2012; Winter and Matlock, 2013). These spatial-numerical associations are in line with the embodied approach of knowledge representation, according to which our sensory and motor experiences during concept acquisition remain associated with these concepts (e.g., Barsalou, 2008; Pulvermüller, 2013). In the case of numbers, their horizontal association has been attributed to reading and writing, as well as finger counting habits, while the vertical association might reflect the experience that "more" usually corresponds to higher space (e.g., Lakoff and Johnson, 1980; Zebian, 2005; Fischer and Brugger, 2011; Göbel et al., 2011; Fischer, 2012; Holmes and Lourenco, 2012; but see Hartmann et al., 2014a).

Spatial biases during number processing have predominantly been studied by means of simple number categorization tasks (small vs. large, even vs. odd). The role of spatial biases during more complex numerical tasks, such as mental arithmetic, is less clear (Fischer and Shaki, 2014b). In a seminal study, McCrink et al. (2007) asked participants to judge whether a final set of objects was the correct result of a preceding addition or subtraction process. Participants were more likely to accept a solution with too many objects for addition and with too few objects for subtraction. This systematic bias has been labeled "operational momentum effect." In the perceptual domain, the "representational momentum effect" describes the misperception of the vanishing position of a moving dot. Particularly, the vanishing position is perceived as being further along the dot's movement trajectory. In analogy, the operational momentum effect suggests that addition is conceptualized as excessive rightward movement and subtraction as excessive leftward movement along the mental number line (McCrink et al., 2007; Knops et al., 2009b). Further empirical evidence for such a spatial process during mental arithmetic comes from Wiemers et al. (2014) who found that addition and subtraction problems were solved faster when participants made arm movements congruent with the hypothesized movements along the mental number line (i.e., rightward or upward for addition, and leftward or downward for subtraction). Similarly, adding was found to be easier when participants rode upward in an elevator whereas riding downward facilitated subtracting (Lugli et al., 2013). Moreover, Marghetis et al. (2014) observed systematic leftward and rightward deflections in participants' hand trajectories when they indicated the results of addition and subtraction problems with a mouse cursor movement. 
Furthermore, Masson and Pesenti (2014) found that targets in the left visual field were detected faster after solving subtraction problems whereas targets in the right visual field were detected faster after solving addition problems. Finally, patients suffering from hemispatial neglect after right-hemispheric brain lesion show selective deficits for subtraction but not for addition problems, in line with their selective deficit in orienting attention toward the left side of space (Dormal et al., 2014).

Despite this empirical evidence, the exact mechanism leading to the spatial bias during mental arithmetic is far from clear (Fischer and Shaki, 2014b). First of all, the idea of moving leftward (for subtraction) and rightward (for addition) along the mental number line is only one of several possible explanations for the operational momentum effect (for a discussion of alternative accounts see Knops et al., 2013, 2014; Marghetis et al., 2014; Fischer and Shaki, 2014a). Moreover, most evidence for a spatial bias in mental arithmetic comes from tasks that imposed a specific spatial setting, for example by requiring participants to respond with a left or a right key (e.g., Masson and Pesenti, 2014), or by involving movements along a specific spatial axis (Pinhas and Fischer, 2008; Lugli et al., 2013; Marghetis et al., 2014; Wiemers et al., 2014). These bipolar spatial assignments imposed by the task setting might also shape the spatial bias during mental arithmetic (Proctor and Cho, 2006). Lastly, Pinhas et al. (2014) showed that the operation sign itself (+ or −) has a spatial connotation (plus-right and minus-left) in a speeded manual classification task. Therefore, it is unclear to what extent the reported results reflect spatial biases induced by the actual mental computation (i.e., the activated magnitudes) or rather by the semantic spatial association of the operation sign.

The aim of this study is to further investigate spatial biases during mental arithmetic by means of eye movements. Eye movements reflect the spatial focus of attention (e.g., Sheliga et al., 1994; Corbetta et al., 1998) and have been used to study the spatial character of ongoing mental processes with high temporal resolution (e.g., Spivey and Geng, 2001; Grant and Spivey, 2003; Altmann, 2004; Van Gompel et al., 2007; Huette et al., 2014; Johansson and Johansson, 2014; Hartmann et al., 2014b). Eye movement studies have contributed to the understanding of cognitive processes involved in numerical tasks (Suppes, 1990; Loetscher and Brugger, 2007; Loetscher et al., 2008; Moeller et al., 2011; Sullivan et al., 2011; Schneider et al., 2012; Zhou et al., 2012; Chesney et al., 2013; Van Viersen et al., 2013; Klein et al., 2014; Huber et al., 2014a,b). Most importantly in the context of the present study, spontaneous eye movements (i.e., eye movements that are not triggered in response to a perceptual event) follow spatial-numerical associations: Loetscher et al. (2010) were able to predict the magnitude of numbers in their participants' mind during random number generation, based on the direction and magnitude of spontaneous saccades occurring before the number was spoken out. Particularly, rightward and upward saccades were more frequent when the next number was larger than the previous one (see also Loetscher et al., 2008).

In this study, we analyzed spontaneous eye movements on a blank screen while participants solved verbally presented arithmetic problems. Based on horizontal and vertical spatial-numerical associations and on previous arithmetic-space compatibility effects (e.g., Lugli et al., 2013; Wiemers et al., 2014), we hypothesized that participants' gaze would shift rightward and upward during addition, and leftward and downward during subtraction. Crucially, analysis of the time course of the spatial bias will help to clarify the temporal dynamics of the spatial bias induced by the different elements involved in the operation (magnitude of the first and second operand, the operator, and the size of the solution). Moreover, our paradigm should help to further describe the nature of the spatial bias in mental arithmetic since no predefined spatial dimension was imposed by the stimulus or response arrangement.

### **METHOD**

### **PARTICIPANTS**

Twenty-five undergraduate students from the University of Bern participated in this study for course credit (19 women, mean age: 23.0, range: 19–45 years, three left-handed). Participants gave written informed consent prior to the study, and the study was approved by the local Ethics Committee. All participants had normal or corrected-to-normal visual acuity.

### **STIMULI AND PROCEDURE**

The following operands were used: 2, 3, 4, 5, 6, 7, 8, 9. Eighteen pairs of different operands were selected for constructing the arithmetic problems (see Appendix). For both addition and subtraction trials, the 18 pairs were presented once in the original order and once in the reverse order, resulting in a total number of 72 unique problems. For the purpose of this study it was important that the magnitude of the first operand does not allow participants to predict which operation (addition vs. subtraction) will follow. In most previous studies, addition trials were more likely when the first operand was a small number, and subtraction more likely when the first operand was a large number (e.g., Pinhas and Fischer, 2008; Wiemers et al., 2014), which could induce predictive eye movements along the mental number line before the onset of the operator. We minimized this effect by choosing a similar amount of small and large numbers as first operand for addition and subtraction trials. As a result of this, half of the subtraction trials had a negative solution size. Negative numbers, when intermixed with positive numbers, are located on the left side of the mental number line (Fischer, 2003; Ganor-Stern et al., 2010).
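The counting behind this design can be checked with a short sketch. The pair list below is a placeholder chosen for illustration only (the actual 18 pairs are listed in the Appendix), and the function name `build_problems` is hypothetical. Because every pair consists of two different operands and is presented in both orders, exactly one of the two subtraction problems per pair has a negative solution.

```python
def build_problems(pairs):
    """Return (first, operator, second, solution) tuples for every pair,
    in both operand orders and for both operations."""
    problems = []
    for a, b in pairs:
        for first, second in ((a, b), (b, a)):
            problems.append((first, "+", second, first + second))
            problems.append((first, "-", second, first - second))
    return problems


# Placeholder pairs, NOT the pairs used in the study (see Appendix for those).
example_pairs = [(2, 7), (3, 8), (5, 9)]

problems = build_problems(example_pairs)
subtractions = [p for p in problems if p[1] == "-"]
negative = [p for p in subtractions if p[3] < 0]

# 3 pairs x 2 orders x 2 operations = 12 problems; half of the 6
# subtraction problems have a negative solution.
print(len(problems), len(subtractions), len(negative))  # 12 6 3
```

With 18 pairs the same construction yields 18 × 2 × 2 = 72 problems, matching the number reported above.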

Participants were seated 70 cm in front of the screen and instructed to solve an auditorily presented addition or subtraction problem as quickly and accurately as possible. Auditory stimuli were presented via loudspeakers positioned 30 cm to the left and right side of the screen. At the beginning of each trial, a central fixation cross was presented for 1 s. The fixation cross was implemented to shift spatial attention to the center of the screen at the beginning of each trial. This allowed us to compare the development of spatial biases relative to the center of the screen across trials.

The cross disappeared at the onset of the first operand, and the screen then remained blank. Each audio file (first operand, operator, second operand) lasted 500 ms. The operator followed 750 ms after the offset of the first operand, and the second operand followed 750 ms after the offset of the operator. The time course of a trial is shown in **Figure 2**. Participants pressed the space bar as soon as they had solved each problem and at the same time spoke the solution aloud. The solution was noted by the experimenter. The inter-stimulus interval (i.e., the time between offset of the second operand and onset of the fixation cross preceding the next trial) was 5 s.
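The event onsets implied by these durations can be made explicit with a small sketch. The constant and function names are illustrative; only the 500 ms and 750 ms values come from the text.

```python
AUDIO_MS = 500  # duration of each audio file (from the text)
GAP_MS = 750    # silence between the offset of one file and the next onset


def event_onsets(audio_ms=AUDIO_MS, gap_ms=GAP_MS):
    """Onset of each auditory event relative to the onset of the first operand."""
    step = audio_ms + gap_ms  # onset-to-onset interval
    return {"operand1": 0, "operator": step, "operand2": 2 * step}


onsets = event_onsets()
print(onsets)  # {'operand1': 0, 'operator': 1250, 'operand2': 2500}
```

The operator therefore starts 1250 ms and the second operand 2500 ms after the onset of the first operand.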

#### **APPARATUS**

Eye movements were recorded with an SMI RED tracking system (SensoMotoric Instruments, Teltow, Germany). Eye gaze was registered with a sampling rate of 50 Hz, a spatial resolution of 0.1° and a gaze position accuracy of 0.5°. The stimuli were presented on a 17-inch screen (1280 × 1024 pixels) using Experiment Center Software and eye data were recorded with I-View X Software, both developed by SensoMotoric Instruments (SensoMotoric Instruments, Teltow, Germany). The primary output events of the eye tracker were fixations (the sample frequency of 50 Hz did not allow us to detect and analyze saccade latencies accurately). Fixations were extracted using Be-Gaze software (SensoMotoric Instruments, Teltow, Germany) and were defined by a minimum duration of 80 ms (4 samples) and a maximal dispersion of 100 pixels<sup>1</sup>.
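A dispersion-based fixation criterion of this kind can be sketched as follows. This is a simplified illustration based on the footnote's description (dispersion as the sum of the x and y ranges, a 100-pixel threshold, a 4-sample minimum), not the Be-Gaze implementation; the function names and the handling of window boundaries are assumptions.

```python
def dispersion(points):
    """Sum of the x range and the y range of a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion=100, min_samples=4, sample_ms=20):
    """samples: list of (x, y) gaze positions in pixels, one per 20 ms (50 Hz).
    Returns a list of dicts with onset, duration, and centroid of each fixation."""
    fixations = []
    start, n = 0, len(samples)
    while start + min_samples <= n:
        end = start + min_samples  # exclusive end of the candidate window
        if dispersion(samples[start:end]) < max_dispersion:
            # Expand the window while the dispersion stays below the threshold.
            while end < n and dispersion(samples[start:end + 1]) < max_dispersion:
                end += 1
            window = samples[start:end]
            fixations.append({
                "onset_ms": start * sample_ms,
                "duration_ms": (end - start) * sample_ms,
                "x": sum(p[0] for p in window) / len(window),
                "y": sum(p[1] for p in window) / len(window),
            })
            start = end
        else:
            start += 1  # slide the window by one sample
    return fixations


# Two artificial clusters of five samples each (made-up coordinates).
demo = [(100, 100), (102, 101), (101, 99), (100, 100), (103, 100),
        (400, 300), (401, 302), (399, 301), (400, 300), (402, 300)]
for fixation in detect_fixations(demo):
    print(fixation)
```

With the artificial input above, the sketch reports two fixations of 100 ms each, the second starting 100 ms after the first.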

#### **STATISTICAL ANALYSIS**

We first documented arithmetic performance of our participants to show that they complied with our instructions. Response times (RTs) were measured from the onset of the second operand. Trials with RTs larger than 3 s were excluded from further analysis (0.4%). For the eye movement data analyses, we defined the following three time windows: Time window 1 started at the onset of the first operand and ended with the onset of the operator (0–1250 ms). Time window 2 started with the onset of the operator and ended with the onset of the second operand (1250–2500 ms). Finally, Time window 3 started with the onset of the second operand and ended when participants pressed the response key (2500 ms–response time). Within each time window, we analyzed the position of the first fixation that participants initiated (i.e., after the onset of the first operand, the onset of the operator, and the onset of the second operand). Moreover, we also analyzed within the same time windows the horizontal and vertical position at each sample of the full data stream (which allowed us to describe spatial biases with a high temporal resolution). One participant was excluded from the analysis of eye movements due to data loss. Trials where the initial fixation position was not on the central fixation position (more than ±1° of visual angle) were excluded from the analysis (4% of trials).
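The window definitions can be expressed as a small lookup, sketched below. The function name is hypothetical, and the treatment of boundary samples (half-open intervals, with the response sample included in Time window 3) is an assumption that the text does not specify.

```python
OPERATOR_ONSET_MS = 1250  # onset of the operator (start of Time window 2)
OPERAND2_ONSET_MS = 2500  # onset of the second operand (start of Time window 3)
MAX_RT_MS = 3000          # trials with longer RTs were excluded


def time_window(t_ms, rt_ms):
    """Return 1, 2, or 3 for a sample at t_ms (relative to the onset of the
    first operand), or None if the sample lies outside the trial or the
    trial is excluded. rt_ms is measured from the onset of the second operand."""
    if rt_ms > MAX_RT_MS or t_ms < 0:
        return None
    if t_ms < OPERATOR_ONSET_MS:
        return 1
    if t_ms < OPERAND2_ONSET_MS:
        return 2
    if t_ms <= OPERAND2_ONSET_MS + rt_ms:
        return 3
    return None


print(time_window(600, 1200), time_window(1300, 1200), time_window(3000, 1200))  # 1 2 3
```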

#### *Analysis of the first fixation*

In order to recognize the content of the audio file (numbers and operator) it was necessary to hear approximately the first 150 ms of the audio file. We therefore excluded fixations that were initiated within the first 150 ms from the onset of the audio file. Moreover, fixations outside of the screen were excluded from this analysis (4.9% of fixations). Importantly, the position of the first fixation was always expressed relative to the x and y coordinates of the sample at the onset of the respective time window; this normalization controls for differences in the previous trial history and allows the comparison of fixation positions across trials.
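The selection and normalization of the first fixation might look as follows. This is a sketch under stated assumptions: the function and field names are hypothetical, an off-screen first fixation is assumed to drop the window from this analysis rather than being skipped, and screen coordinates are assumed to have their origin at the top left, so that an upward shift corresponds to a negative vertical offset.

```python
def first_fixation_offset(fixations, window_onset_ms, window_end_ms, onset_xy,
                          screen=(1280, 1024), min_latency_ms=150):
    """fixations: list of dicts with 'onset_ms', 'x', 'y', sorted by onset.
    onset_xy: gaze sample (x, y) at the onset of the time window.
    Returns (dx, dy) of the first eligible fixation relative to onset_xy,
    or None if the window contains no eligible fixation."""
    for fix in fixations:
        if fix["onset_ms"] < window_onset_ms + min_latency_ms:
            continue  # initiated before the audio content could be recognized
        if fix["onset_ms"] >= window_end_ms:
            break     # no new fixation was initiated within this window
        on_screen = 0 <= fix["x"] < screen[0] and 0 <= fix["y"] < screen[1]
        if not on_screen:
            return None  # assumption: off-screen fixation excludes this window
        return (fix["x"] - onset_xy[0], fix["y"] - onset_xy[1])
    return None


# Made-up example for Time window 2 (operator onset at 1250 ms).
demo_fixations = [
    {"onset_ms": 1300, "x": 650, "y": 500},  # too early (< 1250 + 150 ms)
    {"onset_ms": 1500, "x": 660, "y": 490},
]
print(first_fixation_offset(demo_fixations, 1250, 2500, onset_xy=(640, 512)))  # (20, -22)
```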

For each time window, a repeated measures regression (using a linear mixed model approach with random intercepts for participants and fixed effects for the predictors) was computed for the horizontal and vertical position of the first fixation. For Time window 1, the magnitude of the first operand was used as predictor. For Time window 2, again the magnitude of the first operand was used as predictor (since this factor could still influence behavior in later time windows) along with the operator (+, −). For Time window 3, the magnitude of the first operand, the operator, the magnitude of the second operand, and the solution size were predictors. We also included the interaction between the operator and the solution size as predictor. This interaction captures a possible rightward shift for larger solution sizes during addition trials and a possible leftward shift for smaller (and negative) solution sizes during subtraction trials. For all analyses, the variables magnitude of operands and solution size were treated as covariates since we were interested in the *linear* effect of number magnitude on gaze position.
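One way to write down the model for Time window 3 is given below. The notation is ours and is meant only as a sketch of the description above: $y_{ij}$ is the horizontal (or vertical) position of the first fixation of participant $i$ on trial $j$, $u_i$ is the random intercept for participant $i$, and the numerical coding of the operator is an assumption.

$$
y_{ij} = \beta_0 + u_i + \beta_1\,\mathrm{Operand1}_{ij} + \beta_2\,\mathrm{Operator}_{ij} + \beta_3\,\mathrm{Operand2}_{ij} + \beta_4\,\mathrm{Solution}_{ij} + \beta_5\,(\mathrm{Operator}_{ij} \times \mathrm{Solution}_{ij}) + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
$$

The models for Time windows 1 and 2 are the corresponding reductions: only the $\beta_1$ term for Time window 1, and the $\beta_1$ and $\beta_2$ terms for Time window 2.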

#### *Analysis of the full gaze stream*

In order to get a more fine-grained picture of the spatial biases induced by the different elements (operands, operator, solution size), we analyzed the horizontal and vertical gaze position for each sample of the raw data gaze stream (i.e., every 20 ms). Gaze positions recorded during eye blinks, as well as the samples immediately before and after a blink, were treated as missing values. All missing values, including samples with coordinates outside of the screen or signal loss, were replaced by linear interpolation. Trials that consisted of more than 30% interpolated data were then removed from the analysis (3.3%). In order to average and compare gaze position in Time window 3 across trials with different numbers of recorded samples (depending on the response time), data in Time window 3 were time-normalized into 60 samples (60 samples equal 1200 ms, which roughly corresponds to the mean RT found in our sample) using linear interpolation. The same analyses as described above for the first fixations were then performed on each sample. The analyses for Time windows 2 and 3 included all samples from the beginning of the time window until the end of the trial, corrected for the position at the beginning of the respective time window. For the analyses performed on the samples of the gaze stream, we only considered effects as statistically significant when the *p*-values of at least 10 consecutive samples were below 0.05 (corresponding to a 200 ms interval; see Mathot et al., 2013 for a similar approach).
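The three preprocessing steps described here (interpolating missing samples, time-normalizing Time window 3 to 60 samples, and requiring at least 10 consecutive significant samples) can be sketched in plain Python. The function names are hypothetical, and the handling of gaps at the very start or end of a trial (filled with the nearest valid value) is an assumption the text does not cover.

```python
def interpolate_missing(values):
    """Replace None entries by linear interpolation between the nearest valid
    neighbours; leading/trailing gaps take the nearest valid value (assumption)."""
    valid = [i for i, v in enumerate(values) if v is not None]
    out = list(values)
    if not valid:
        return out
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out


def fraction_missing(values):
    """Proportion of missing samples (trials above 0.30 were removed)."""
    return sum(v is None for v in values) / len(values)


def time_normalize(values, n_out=60):
    """Resample a gaze trace to n_out points by linear interpolation."""
    n_in = len(values)
    if n_in == 1:
        return [values[0]] * n_out
    out = []
    for k in range(n_out):
        pos = k * (n_in - 1) / (n_out - 1)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        out.append(values[lo] + (pos - lo) * (values[hi] - values[lo]))
    return out


def significant_runs(p_values, alpha=0.05, min_run=10):
    """Return (start, end) index pairs (end exclusive) for runs of at least
    min_run consecutive samples with p < alpha."""
    runs, start = [], None
    for i, p in enumerate(list(p_values) + [1.0]):  # sentinel closes a final run
        if p < alpha and start is None:
            start = i
        elif p >= alpha and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs


print(interpolate_missing([0, None, None, 3]))
print(time_normalize([0, 10], n_out=3))
print(significant_runs([0.2] * 3 + [0.01] * 10 + [0.3] * 2))
```

At 20 ms per sample, a run of 10 samples corresponds to the 200 ms interval mentioned in the text.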

### **RESULTS**

#### **ARITHMETIC PERFORMANCE**

Error rate was low (1.1%) and was not further analyzed. RTs were on average higher for subtraction than for addition trials (1178 vs. 1148 ms)<sup>2</sup>, as revealed by a paired *t*-test, *t*(24) = 2.51, *p* = 0.019. RTs for the different result sizes for addition and subtraction trials are illustrated in **Figure 1**. The findings that RTs were generally higher for subtraction than for addition trials and that RTs increased with increasing (absolute) result sizes (see **Figure 1**) are both in line with general findings about mental arithmetic (Ashcraft, 1992; Campbell, 2005). Thus, our participants complied with the task instruction to solve mental arithmetic problems.

<sup>1</sup>The algorithm checks the dispersion of consecutive data points in a moving window by summing the differences between the points' maximum and minimum x and y values ([max(x) − min(x)] + [max(y) − min(y)]). If the sum is below 100 pixels, the window represents a fixation and expands until the sum exceeds 100 pixels. The final window is registered as fixation at the centroid of the window points with the given onset time and duration.

#### **GAZE POSITION**

A full statistical report for the analysis of the first fixation in each time window is presented in **Table 1**. Here we only report the most important findings.

#### *Time window 1*

Participants initiated a new eye fixation in 71% of trials in Time window 1. There was no linear influence of the magnitude of the first operand on the position of the first fixation (horizontal gaze position: *p* = 0.360; vertical gaze position: *p* = 0.097). The analysis of the full gaze stream confirmed that there was no significant effect of the magnitude of the first operand.

#### *Time window 2*

Participants initiated a new eye fixation in 74.4% of trials in Time window 2. The operator was a significant predictor for the vertical gaze position, *F*(1, 1166) = 5.01, *p* = 0.025, but not for the horizontal gaze position, *F*(1, 1166) = 0.60, *p* = 0.439: Gaze position of the first fixation initiated after the onset of the operator was located 12 pixels more upward for "plus" when compared to "minus."

**Table 1 | Statistical report of the linear mixed model analyses on the first fixation position after the onset of the operands and operator.**

*\*p < 0.05. The unit of the estimates is pixel; Operand 1, Operand 2, and solution size are treated as covariates.*

The analysis of the full gaze stream confirmed the spatial bias induced by the operator for the vertical gaze position, as well as the absence of a bias for the horizontal gaze position (see **Figure 2**). Differences between addition and subtraction trials for the vertical gaze position start to develop shortly after the onset of the operator and remain for a large part of the trial. Significant differences between addition and subtraction trials are represented by the gray areas in **Figure 2B**. The first significant difference was detected 760 ms after the onset of the operator.

#### *Time window 3*

Participants initiated a new eye fixation in 75.3% of trials in Time window 3. Remarkably, none of the variables (magnitude of the first operand, operator, magnitude of the second operand, solution size, and the interaction between solution size and operator) predicted the position of the first fixation after the onset of the second operand.

The analysis of the continuous gaze stream (including the last sample where participants gave their responses) revealed that there was no effect of the operator, the magnitude of the second operand, the solution, or the interaction between the solution and the operator. Importantly, the effect of the operator found in Time window 2 was no longer significant when positions were corrected for differences at the onset of Time window 3 (as was done for this analysis). Thus, gaze position was not systematically influenced by the computational process. There was, however, a trend for an effect of the magnitude of the first operand (a series of six consecutive samples with *p*s < 0.052) on the horizontal gaze position. This time period was identified between 2940 and 3080 ms after the onset of the first operand, or, respectively, between 440 and 580 ms after the onset of the second operand (corresponding to 25–33% of the progress for the time-normalized computation process). When we repeated the linear mixed effect model analysis for this specific time interval, the estimated linear effect was 3.4 (*SE* = 0.6; *p* < 0.001), clearly indicating that gaze position was shifted more rightward as a function of number magnitude, as shown in **Figure 3**.

<sup>2</sup>Notably, the two highest means were found for the addition trials with the result sizes 12 and 14. These were the problems 7 + 5/5 + 7 (for 12), and 6 + 8/8 + 6; 9 + 5/5 + 9 (for 14). These problems are characterized by a carry operation, which is known to increase RT (e.g., Imbo et al., 2007). The problems with even higher result sizes (15, 16) also require a carry operation, but these problems include the operand 9 (6 + 9/9 + 6 for 15 and 7 + 9/9 + 7 for 16), which facilitates the carry operation (9 is close to 10 and allows for alternative strategies, such as "add 10 and subtract 1").

### **DISCUSSION**

In this study we analyzed spontaneous eye movements on a blank screen when participants solved addition and subtraction problems. We found that the gaze was directed more upward during addition than during subtraction trials, and horizontal gaze position was partly determined by the magnitude of the operand. The former finding is in line with the small-down and large-up orientation of the vertical mental number line (e.g., Grade et al., 2012; Hartmann et al., 2014a, Experiment 1; Loetscher et al., 2010; Hartmann et al., 2012; Holmes and Lourenco, 2012; Winter and Matlock, 2013) and supports the view that addition is associated with upper space and subtraction with lower space (Lugli et al., 2013; Wiemers et al., 2014). The latter finding, a more rightward gaze position for larger magnitudes of the operand, is in line with the small-left and large-right orientation of the horizontal mental number line (e.g., Dehaene et al., 1993; Fischer and Shaki, 2014a), and confirms that numbers can induce shifts of spatial attention (Fischer et al., 2003). Thus, our results show that spontaneous eye movements reflect systematic spatial biases during mental arithmetic and provide new evidence for an active role of eye movements for magnitude processing. Of particular relevance in the context of numerical cognition are the findings that the difference between addition and subtraction trials was induced by the operator, and that this effect manifested itself in the vertical but not in the horizontal dimension of space. These two aspects are now discussed in more detail.

Previous findings of operational momentum effects during mental arithmetic (e.g., McCrink et al., 2007; Pinhas and Fischer, 2008; Knops et al., 2009b) and motion-arithmetic compatibility effects (Lugli et al., 2013; Marghetis et al., 2014; Wiemers et al., 2014) suggest that mental addition and subtraction are accompanied by a mental movement along the number line. Do our results support this idea? In this study, the difference between addition and subtraction trials developed shortly after the onset of the operator, before the second operand was presented and consequently before the computational process was initiated. The difference between addition and subtraction trials remained present from there on. Importantly, when gaze position was controlled for the differences at the onset of the second operand (see analysis of Time window 3), there was no further contribution from the operator to the spatial bias in that time window. This clearly shows that the difference between addition and subtraction trials can only be attributed to the operator (i.e., the onset of the operator in Time window 2) and not to the addition or subtraction process *per se* (i.e., the computational process that took place in Time window 3). A spatial bias induced by the computational process would also result in an interaction between the operator and the solution size, which was absent in all our analyses. We therefore conclude that the addition-up and subtraction-down association we found reflects a semantic association between the operation (addition vs. subtraction) and space rather than the consequence of a spatial shift that occurred *during* computation.
Thus, our results do not support the idea that adding magnitudes involves a simulated rightward or upward movement, and subtracting a leftward or downward movement along the mental number line, at least not when addition and subtraction problems are solved on a trial-by-trial basis (note that mental movements or OM effects might be more pronounced for continuous counting). Instead, our results confirm an operation sign spatial association (OSSA) effect that was recently demonstrated by Pinhas et al. (2014) with manual responses.

Our replication and extension of Pinhas et al.'s results has important implications. First of all, we found an OSSA effect for auditorily presented operators. This shows that the perception of the operation *sign* is not mandatory, and suggests that the semantic processing of the operator is the crucial aspect. Consequently, the OSSA effect should be renamed the OSA (operator spatial association) effect. Moreover, our results suggest that the principal role of space during mental arithmetic might be the activation of metaphorical magnitude concepts, such as "more is up" (e.g., Lakoff and Johnson, 1980; Fischer, 2012; Holmes and Lourenco, 2012). The activation of such a spatial concept might support the task by providing an intuitive spatial reference for the solution, or in other words, by providing a "rough sense of expected magnitude against which the algorithmically derived solution can be compared" (Marghetis et al., 2014, p. 13; see also Stevenson and Carlson, 2003). Thus, it is conceivable that participants changed their vertical gaze position depending on the operator because the result of the computation will be smaller (down) or larger (up) than the current reference (i.e., the magnitude of the first operand). Marghetis et al. (2014) also found that the operator induced the strongest spatial bias in hand trajectories performed during mental arithmetic (see Figure 4 in Marghetis et al., 2014). In short, there is a systematic spatial bias during mental arithmetic, but this bias might not primarily be constituted by simulating an exact movement along the mental number line but rather by the spatialization of an approximate sense of quantity, which might already be triggered by the operator.

Another interesting aspect of our findings is that the difference between addition and subtraction was only present in the vertical dimension of space. Based on the fact that the mental number line runs from left to right in ascending order (in Western cultures), we also expected an addition-right and subtraction-left association. Indeed, many studies point to such an association, for both non-symbolic (McCrink et al., 2007; Knops et al., 2009a,b) and symbolic (Pinhas and Fischer, 2008; Knops et al., 2009a; Marghetis et al., 2014; Masson and Pesenti, 2014; Pinhas et al., 2014; Wiemers et al., 2014) arithmetic. What are possible explanations for the absence of such an effect in our study? First of all, previous studies showing a spatial-arithmetic association in the horizontal dimension imposed an explicit horizontal component in their tasks. For example, target stimuli or response keys were arranged on a horizontal line on the screen or on the table, respectively (Pinhas and Fischer, 2008; Marghetis et al., 2014; Masson and Pesenti, 2014; Pinhas et al., 2014), or participants were required to make hand movements along a horizontal line (Pinhas and Fischer, 2008; Wiemers et al., 2014). Thus, in all these studies, the horizontal dimension of space was made salient to participants, which might facilitate the use of the horizontal axis of space in participants' task representation. In the present study, we used a blank screen paradigm with no predefined spatial arrangements of stimulus or responses. We argue that, in cases where no predefined spatial frame of reference is provided, participants recruit those spatial associations that are grounded deepest in their cognitive system. The experience that "more" usually corresponds to upper space might constitute a fundamental concept of our cognitive system (e.g., Lakoff and Johnson, 1980; Zebian, 2005; Göbel et al., 2011; Fischer, 2012; Holmes and Lourenco, 2012; but see Hartmann et al., 2014a).
For example, adding objects or water to a bowl raises the horizontal level. Moreover, larger objects occupy more upper space than smaller objects (e.g., skyscraper vs. cottage). Because such observations are universal and accompany the development of our cognitive system since early childhood, the concept of "more is up" might be more hard-wired (or grounded) than the concept of "more is right," which shows a great deal of flexibility across cultures and task demands (e.g., Bächtold et al., 1998; Zebian, 2005; Ristic et al., 2006; Shaki et al., 2009; Fischer et al., 2010). For these reasons, it might be more intuitive for participants to use the upper and lower space and not the left and right space in order to conceptualize addition and subtraction during mental arithmetic. In line with this view, the spatial-arithmetic compatibility effects found in Wiemers et al.'s (2014) study were more pronounced for the vertical than for the horizontal spatial axis.

As a last point, we want to discuss the effect of the magnitude of the first operand for the horizontal gaze position. As this effect did not reach our significance criterion, we do not want to overinterpret it but nevertheless point to some interesting aspects. Interestingly, the spatial bias induced by the magnitude of the first operand did not obtain in the time window where the magnitude was perceived. Rather, the effect was only evident after the onset of the second operand. This suggests that perceiving numbers does not always automatically shift spatial attention along the mental number line (Fischer et al., 2003) but requires that the number is extensively processed (see Fischer and Knops, 2014; Zanolie and Pecher, 2014, for a discussion). In this case, it might reflect the re-activation of the number meaning of the first operand in order to initiate the computational process that was triggered by the additional information provided by the operator and the second operand. A similarly delayed influence of the magnitude of the first operand was also found by Marghetis et al. (2014).

### **OUTLOOK**

A possible limitation of this study was that we used relatively simple arithmetic problems that can potentially be solved by memory retrieval (Ashcraft, 1992; Campbell, 2005). It is conceivable that more complex problems that rely more heavily on computation would recruit more pronounced spatial processing (and possibly lead to operational momentum effects). However, recent work from Fayol and Thevenot (2012) suggests that seemingly simple arithmetic problems (such as 3 + 4) can also activate a procedural strategy. Further studies are needed in order to draw final conclusions about the nature of spatial biases in mental arithmetic and its role for different levels of complexity of arithmetic problems and formats (symbolic vs. non-symbolic). The use of methods that allow for a continuous tracking of the arithmetic process, such as eye or hand movement studies, will be most fruitful for future research (Fischer and Hartmann, 2014; Marghetis et al., 2014). Moreover, future tasks should be designed in a way that allows the researcher to disentangle whether the spatial bias was induced by the computational process or by the operator alone, for example by including trials that contain only operation signs but no operands (see Pinhas et al., 2014).

### **CONCLUSION**

We showed that spontaneous eye movements reflect spatial biases during mental arithmetic and highlighted an important role of the operator for inducing these spatial biases. On a global level, our results suggest that eye movements might play an important role in cognition because they translate abstract concepts, such as number magnitudes and arithmetic, into concrete spatial relationships, possibly in order to facilitate the understanding and mental manipulation of these concepts. Our results add to a growing body of research showing that apparently abstract mental processes are accompanied by sensorimotor processes, reflecting the embodied nature of knowledge representation (Gallese and Lakoff, 2005; Fischer, 2012).

### **ACKNOWLEDGMENTS**

This research was funded by the Swiss National Science Foundation (P2BEP1\_152104). We thank Antje Stahnke for assistance in data collection and Wilhelm Klatt for preparing the stimulus material.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2015.00012/abstract

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 02 December 2014; paper pending published: 12 December 2014; accepted: 05 January 2015; published online: 22 January 2015.*

*Citation: Hartmann M, Mast FW and Fischer MH (2015) Spatial biases during mental arithmetic: evidence from eye movements on a blank screen. Front. Psychol. 6:12. doi: 10.3389/fpsyg.2015.00012*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2015 Hartmann, Mast and Fischer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Pushing forward in embodied cognition: may we mouse the mathematical mind?

### *Martin H. Fischer\* and Matthias Hartmann*

*Cognitive Science, University of Potsdam, Potsdam, Germany*

#### *Edited by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada*

#### *Reviewed by:*

*Thomas J. Faulkenberry, Tarleton State University, USA Amandine E. Rey, Lyon 2 University, France*

#### *\*Correspondence:*

*Martin H. Fischer, Division of Cognitive Science, University of Potsdam, Karl-Liebknecht-Str. 24-25, House 14, D14476 Potsdam OT Golm, Germany e-mail: martinf@uni-potsdam.de*

Freely available software has popularized "mousetracking" to study cognitive processing; this involves the on-line recording of cursor positions while participants move a computer mouse to indicate their choice. Movement trajectories of the cursor can then be reconstructed off-line to assess the efficiency of responding in time and across space. Here we focus on the process of selecting among alternative numerical responses. Several studies have recently measured the mathematical mind with cursor movements while people decided about number magnitude or parity, computed sums or differences, or simply located numbers on a number line. After some general methodological considerations about mousetracking we discuss several conceptual concerns that become particularly evident when "mousing" the mathematical mind.

**Keywords: mousetracking, numerical cognition, SNARC effect, trajectories, on-line processing**

Today, cognitive scientists no longer study higher-level cognition separate from sensory and motor processes, even when investigating supposedly abstract knowledge domains such as language comprehension or numerical cognition. The "embodied turn" over the last two decades (Varela et al., 1991; Wilson, 2002; Glenberg et al., 2013) has raised interest in dynamic responses that presumably reflect underlying conceptual competition in real time.

Freely available software (Freeman and Ambady, 2010) has popularized "mousetracking" to study cognitive processing; this involves the on-line recording of cursor positions while participants move a computer mouse to indicate their choice (e.g., Spivey, 2007). Movement trajectories of the cursor can then be reconstructed off-line to assess the efficiency of responding in time and across space. Here we focus on the process of selecting among alternative numerical responses. Several studies have recently measured the mathematical mind with cursor movements while people decided about number magnitude or parity (Weaver and Arrington, 2013; Faulkenberry, 2014), computed sums or differences (Marghetis et al., 2014), or simply located numbers on a number line (Dotan and Dehaene, 2013). After some general methodological considerations about mousetracking we discuss several conceptual concerns that become particularly evident when "mousing" the mathematical mind.

### **METHODOLOGICAL CONSIDERATIONS ABOUT MOUSETRACKING**

#### **HARD- AND SOFTWARE ISSUES**

In contrast to established kinematic motion tracking, a computer mouse does not record three-dimensional position but position changes in two dimensions along an uncalibrated part of space that changes whenever we lift the mouse off its surface. Moreover, the temporal recording of mouse coordinates relies on the computer's operating system, which introduces limitations in sampling rate and temporal uncertainties. Despite these limitations, a formal comparison reveals reasonable recording quality if users exercise some caution (O'Reilly and Plamondon, 2011; see **Box 1**).

Depending on the mouse settings in the computer's control panel, experienced users can displace the cursor by quick pivoting movements of the wrist instead of displacing the hand smoothly across the desk, so that there is no linear relationship between hand displacement and cursor displacement. In mousetracking studies, mouse settings should therefore be selected carefully to prevent scaling of mouse cursor displacement. This includes disabling the "dynamic acceleration option", which is enabled by default, and lowering the speed of the mouse (see **Box 1**). Because these mouse settings play a crucial role, we advise reporting the exact settings in the Method section, along with the display resolution, mouse sensitivity and resulting displacement ratio (see Bruhn et al., 2014, for an example). To date, the majority of studies do not report this information.
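The displacement ratio recommended for reporting can be estimated empirically by moving the mouse over a few known distances and noting the resulting cursor displacement. The following is a minimal sketch under stated assumptions: the function names, the calibration values and the 5% tolerance are hypothetical and not taken from any of the cited studies. If the per-run ratios vary strongly with movement distance or speed, this may hint that pointer acceleration is still active.

```python
from statistics import mean


def per_run_ratios(hand_cm, cursor_px):
    """Cursor pixels per centimetre of hand movement, one value per calibration run."""
    if len(hand_cm) != len(cursor_px) or not hand_cm:
        raise ValueError("need the same non-zero number of hand and cursor measurements")
    if any(cm <= 0 for cm in hand_cm):
        raise ValueError("hand displacements must be positive")
    return [px / cm for cm, px in zip(hand_cm, cursor_px)]


def displacement_ratio(hand_cm, cursor_px):
    """Mean hand-to-cursor ratio (pixels per cm) across calibration runs."""
    return mean(per_run_ratios(hand_cm, cursor_px))


def ratio_is_constant(hand_cm, cursor_px, tolerance=0.05):
    """True if every run's ratio lies within `tolerance` (relative) of the mean ratio.

    The 5% default is an arbitrary illustrative choice, not an established criterion.
    """
    ratios = per_run_ratios(hand_cm, cursor_px)
    centre = mean(ratios)
    return all(abs(r - centre) <= tolerance * centre for r in ratios)


# Hypothetical calibration: hand moved 5, 10 and 20 cm along a ruler.
print(displacement_ratio([5, 10, 20], [200, 400, 800]))
print(ratio_is_constant([5, 10, 20], [200, 500, 1400]))
```

In this made-up example the first call describes a linear mapping (the same ratio in every run), whereas the second set of measurements shows a ratio that grows with distance, which is the pattern one would want to rule out before data collection.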

#### **TRACKING DIFFERENT PHASES OF THE COGNITIVE PROCESS**

A motor response generally consists of two phases: planning and execution. Movement planning begins when the target is (at least partially) known and ends when there is a physical displacement that initiates movement execution. When using classical response time paradigms, the response movement consists merely of a finger twitch on a button and scientists therefore rarely care about movement characteristics. With hand displacements, however, the early part of movement execution will largely reflect motor planning. Due to cognitive processing times, as well as afferent and efferent neural delays, only the later part of the movement will be sensitive to new information about the current distance of the hand (or cursor) to the target.

#### **Box 1 | A checklist for conducting mousetracking studies.**

#### **Checklist for conducting a mousetracking study**

#### -**Reduce the participant's degrees of freedom**

Constrain the yaw (rotation around the vertical axis) of the mouse-pad to prevent hand rotations which are not adequately captured in the cursor trajectory, e.g., by wearing a wrist band.

#### -**Change default mouse settings**

Disable the default mouse acceleration option in the control panel of your operating system ("dynamic acceleration option" as labeled in Windows XP or "Enhance pointer precision" as labeled in Windows 7). Note that for Windows 7, additional effort is required to disable the acceleration function completely, for example by using a more sophisticated "gaming" mouse; for Macintosh users, type "defaults write .GlobalPreferences com.apple.mouse.scaling -1" into the Terminal (mouse acceleration cannot be disabled directly in the Mac control panel).

Also lower the default speed of the mouse to a reasonable range (e.g., second value from the left in the control panel) to capture cognitive effects in the trajectory measures.

#### -**Report mouse settings**

Report mouse settings as selected in the control panel and also report the resulting hand-to-cursor movement ratio (e.g., 1 cm hand movement results in x pixels mouse cursor displacement).

#### -**Report exact task instructions**

Instructing participants to begin the mouse movement at the beginning of the trial (before response selection has finished) helps to capture cognitive effects in the trajectory measures.


Control for bimodality (compute bimodality coefficients or Hartigan's dip statistic, or/and show probability plots of mouse trajectories).

Mousetracking allows researchers to push cognitive processing into movement execution and thereby makes features of the trajectory itself diagnostic. To this end, it is crucial to instruct participants to start moving their hands at the beginning of a trial, before the decision-related cognitive process is completed. To encourage this behavior, a minimal displacement requirement shortly after target onset has been defined in some studies, and participants are reminded to start moving earlier when the requirement was not fulfilled in the previous trial (e.g., Freeman and Ambady, 2009; Scherbaum et al., 2010; Dshemuchadse et al., 2013; Faulkenberry, 2014; Marghetis et al., 2014). Some studies even require participants to move the hand *before* the target information in each trial is released (e.g., Dotan and Dehaene, 2013; Bruhn et al., 2014). However, some studies do not emphasize early movement onsets, inviting participants to complete decision-related cognitive processes before initiating their response, thus making initiation time (the time until movement onset) a more diagnostic measure (e.g., Weaver and Arrington, 2013). Since this trade-off between reaction time and movement time strongly depends on task instructions, we recommend reporting exact task instructions (see **Box 1**).
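To make the notions of initiation time and a minimal displacement requirement concrete, the sketch below computes both from a list of time-stamped cursor samples. It is an illustration only: the sample format `(t_ms, x, y)`, the function names, the 5-pixel movement threshold and the 400 ms deadline are assumptions chosen for the example, not values prescribed by the studies cited above.

```python
import math


def initiation_time(samples, min_dist_px=5.0):
    """Time (ms after target onset) at which the cursor first left its start position.

    samples     -- list of (t_ms, x, y) tuples ordered by time
    min_dist_px -- hypothetical displacement threshold that counts as "movement"
    Returns None if the cursor never moved farther than the threshold.
    """
    if not samples:
        return None
    _, x0, y0 = samples[0]
    for t, x, y in samples:
        if math.hypot(x - x0, y - y0) > min_dist_px:
            return t
    return None


def moved_early_enough(samples, deadline_ms=400.0, min_dist_px=5.0):
    """Check a minimal displacement requirement: did movement start before the deadline?"""
    onset = initiation_time(samples, min_dist_px)
    return onset is not None and onset <= deadline_ms


# Hypothetical trial: the cursor barely moves for 200 ms, then leaves the start box.
trial = [(0, 0, 0), (100, 0, 1), (200, 0, 3), (300, 0, 10), (400, 5, 40)]
print(initiation_time(trial))
print(moved_early_enough(trial, deadline_ms=250.0))
```

A check of this kind could be run after each trial to decide whether the participant should be reminded to start moving earlier.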

#### **INTERPRETING MOUSE TRAJECTORIES**

Mousetracking typically involves moving the mouse cursor from the central start box at the bottom of a display to either the left or right target box at the top of the display. There are two types of resulting trajectories: those where incongruent response mappings induce crossing over into the wrong hemispace before returning into the correct hemispace (e.g., Weaver and Arrington, 2013), and those where even under the incongruent mapping all trajectories remain in the correct hemispace and merely have differentially strong curvatures (e.g., Faulkenberry, 2014; Marghetis et al., 2014). Both types of results are currently interpreted as attraction by the competing distracting stimulus, following the theoretical framework of dynamic competition (Spivey, 2007). However, in our opinion, only the former case, where trajectories actually verge into the distractor's hemifield, can be interpreted as evidence for attraction by the competing distractor. In the other case there is no spatial bias away from the correct target and curvature might simply reflect the earlier or later occurrence of the participants' decisions, due to increased task difficulty (cf. Faulkenberry, 2014). Moreover, even in the case where mean trajectories verge into the distractor's hemifield, this cannot automatically be taken as evidence for a continuous competitive cognitive process. Such a pattern can instead be the result of a small subset of trials in which participants incorrectly aimed for the wrong solution and corrected their trajectory during the motion. The latter case results in a bimodal distribution of trajectories. It is therefore crucial to test the distribution of trajectories for bimodality, for example by computing bimodality coefficients (cf. Spivey et al., 2005), or by using Hartigan's dip statistic (cf. Freeman and Dale, 2013; Faulkenberry, 2014). Given that this procedure tests the null hypothesis of uni-modal distributions, *p*-values that are only slightly larger than 0.05 should not be interpreted as evidence for a uni-modal distribution (null-hypothesis tests can yield *p*-values greater than 0.05 even when the tested assumption is violated to a degree that significantly affects the results of classic parametric tests; see Erceg-Hurn and Mirosevich, 2008). When the researcher is interested in retaining the null hypothesis, it has been suggested to increase the conventional significance level α from 0.05 to 0.1 or 0.2 (Bortz and Schuster, 2010, p. 128). An alternative (or complementary) way to illustrate whether the average curve is representative for task performance is to present probability plots of mouse trajectories (see Figure 4 in Dshemuchadse et al., 2013 or Figure 2 in Scherbaum et al., 2010, for nice examples).
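As an illustration of the bimodality check, the sketch below computes a bimodality coefficient from per-trial curvature values (for example, the maximal deviation of each trajectory). It assumes the commonly used formulation based on bias-corrected sample skewness and excess kurtosis, with values above roughly 0.555 (the value expected for a uniform distribution) taken as a hint of bimodality. The function name and the example data are hypothetical, and the text above does not specify which variant of the coefficient the cited studies computed.

```python
import math


def bimodality_coefficient(values):
    """Bimodality coefficient from sample skewness and sample excess kurtosis.

    BC = (skew^2 + 1) / (kurt + 3 * (n - 1)^2 / ((n - 2) * (n - 3)))
    Values above about 0.555 are conventionally read as a hint of bimodality.
    """
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all observations are identical")
    # Bias-corrected sample skewness and excess kurtosis.
    skew = math.sqrt(n * (n - 1)) / (n - 2) * (m3 / m2 ** 1.5)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * (m4 / m2 ** 2 - 3) + 6)
    return (skew ** 2 + 1) / (kurt + 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


# Hypothetical per-trial deviations: two separate clusters vs. a single peak.
two_clusters = [0.0] * 50 + [1.0] * 50
single_peak = [0] * 1 + [1] * 4 + [2] * 6 + [3] * 4 + [4] * 1
print(bimodality_coefficient(two_clusters) > 0.555)
print(bimodality_coefficient(single_peak) > 0.555)
```

The coefficient is a descriptive heuristic; as noted above, it is best combined with Hartigan's dip statistic or with plots of the individual trajectories.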

### **MAY WE MOUSE THE MATHEMATICAL MIND? SOME CONCEPTUAL CONCERNS**

Most conceptual domains *can* convey spatial meanings (e.g., the words "left" or "right"; or a directed gaze). However, none exhibits the rich and obligatory association of semantic features with space that characterizes number concepts. First, small and large magnitudes are associated with left/lower and right/upper space, respectively, leading to systematic biases in spatial behavior for single digit processing (the SNARC effect; Dehaene et al., 1993) as well as for mental arithmetic (the Operational Momentum effect; McCrink et al., 2007). For a recent review of both effects see Fischer and Shaki (2014). Second, odd and even numbers are associated with left and right space, respectively, probably reflecting linguistic markedness of the associated labels (MARC effect; Nuerk et al., 2004). Third, each digit presentation requires a particular font size or auditory frequency that activates spatial associations indirectly, triggering the size congruity effect (SiCE; Henik and Tzelgov, 1982) for vision and the spatial-musical association of response codes for audition (SMARC effect; Rusconi et al., 2005; Fischer et al., 2013). Finally, in the case of multi-digit strings the relative position of each digit in the string determines its meaning via the place-value system (Nuerk et al., 2011, for review). This up to 6-fold association between space and number meaning(s) makes the interpretation of mouse trajectories in numerical tasks quite challenging: We need to know when the magnitude meaning of a number is known relative to its other spatially associated features, such as its parity, its decimal structure or its perceived intensity. An interpretation of typical trajectory-based measures, such as divergence points, area under the curve, or maximal deviation, is constrained by these uncertainties (for a detailed evaluation of trajectory biases from different features of number representation, see Dotan and Dehaene, 2013).
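Two of the trajectory-based measures mentioned above, maximal deviation and area under the curve, can be sketched as follows. This is one plausible operationalization and not necessarily the one used in the cited studies: deviation is measured perpendicular to the straight line from the first to the last recorded point, and the area is obtained by trapezoidal integration of that deviation along the line. The function names and the sign convention (positive values lie to the left of the movement direction) are assumptions made for this example.

```python
import math


def _along_and_across(trajectory):
    """Project each (x, y) point onto the start-to-end line.

    Returns two lists: position along the line and signed perpendicular deviation
    (positive = left of the start-to-end direction).
    """
    if len(trajectory) < 2:
        raise ValueError("a trajectory needs at least two points")
    (sx, sy), (ex, ey) = trajectory[0], trajectory[-1]
    length = math.hypot(ex - sx, ey - sy)
    if length == 0:
        raise ValueError("start and end points coincide")
    ux, uy = (ex - sx) / length, (ey - sy) / length
    along, across = [], []
    for px, py in trajectory:
        dx, dy = px - sx, py - sy
        along.append(dx * ux + dy * uy)
        across.append(ux * dy - uy * dx)
    return along, across


def max_deviation(trajectory):
    """Signed perpendicular deviation with the largest absolute value."""
    _, across = _along_and_across(trajectory)
    return max(across, key=abs)


def area_under_curve(trajectory):
    """Signed area between the trajectory and the straight start-to-end line."""
    along, across = _along_and_across(trajectory)
    return sum(
        (along[i + 1] - along[i]) * (across[i] + across[i + 1]) / 2
        for i in range(len(along) - 1)
    )


# Hypothetical trajectory: upward movement that bulges 3 units to the left.
bulging_left = [(0, 0), (-3, 5), (0, 10)]
print(max_deviation(bulging_left), area_under_curve(bulging_left))
```

Whether a positive value means deviation toward the distractor depends on the side of the correct target on a given trial, so in practice trajectories are usually mirrored to a common target side before these measures are averaged.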

Moreover, the spatial nature of number concepts raises concerns about the validity of the mousetracking task itself, which requires movements in the horizontal plane in order to displace a cursor in the vertical plane. This task requirement raises two concerns: First, this visuo-motor mapping is non-intuitive and requires considerable mental effort to coordinate actions in one plane and their effects in another plane (e.g., Cunningham and Pavel, 1991). This non-intuitive transformation and the fact that the data reflect changes in cursor position, and not veridical hand position, make it implausible to assume that we obtain a valid proxy for "a record of the mental trajectory traversed" (Spivey et al., 2005, p. 10,398). Ideally, mousetracking users should constrain the yaw (rotation around the vertical axis) of the mouse-pad to prevent hand rotations (see **Box 1**). More suitable (and still relatively inexpensive) might be the direct recording of two-dimensional hand position with digitizing tablets or even three-dimensional body position with Kinect© technology (e.g., Festman et al., 2013).

More importantly, the continuous forward movement of the hand, as well as the continuous upward movement of the cursor, both induce systematic biases into the activation of number concepts. Additionally, the mouse itself is typically located in the participant's right hemi-space and operated with the preferred (right) hand. Together, these four factors (the two movement directions and the two right spatial codes) are all associated with larger numbers. For example, turning right activates larger numbers (Loetscher et al., 2008; Hartmann et al., 2012; Shaki and Fischer, 2014), addition is easier when moving one's hand upward (Wiemers et al., 2014), and forward and backward motion also interacts with number processing (Fischer and Campens, 2009; Seno et al., 2011; Marghetis and Youngstrom, 2014). These inherent biases make number tasks a "special case" for mousetracking investigations. For number studies, we propose to move away from the standard paradigm (starting in the middle of the lower screen and moving to the top left vs. top right), which does not allow researchers to capture adequately the various spatial-numerical associations. Instead, it may be helpful to incorporate additional spatial manipulations, such as starting at the top, placing the mouse in the center or the left side of the screen, or reversing the forward-upward-translation between mouse and visual motion. These manipulations might help to capture the various spatial-numerical associations and to advance the understanding of their dynamic influence on cognition.

### **ACKNOWLEDGMENTS**

Martin H. Fischer's work is funded by ESF grant EW12-114 "From Numbers To Knowledge—20 Years Of Spatial-Numerical Associations" and by DFG grant 1915/2 on "manumerical cognition." Matthias Hartmann was funded by the Swiss National Science Foundation (P2BEP1\_152104).

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 13 October 2014; paper pending published: 27 October 2014; accepted: 29 October 2014; published online: 20 November 2014.*

*Citation: Fischer MH and Hartmann M (2014) Pushing forward in embodied cognition: may we mouse the mathematical mind? Front. Psychol. 5:1315. doi: 10.3389/fpsyg.2014.01315*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Fischer and Hartmann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Extending the reach of mousetracking in numerical cognition: a comment on Fischer and Hartmann (2014)

#### *Thomas J. Faulkenberry1 \* and Amandine E. Rey2*

*<sup>1</sup> Department of Psychology and Counseling, Tarleton State University, Stephenville, TX, USA <sup>2</sup> Laboratoire Etudes des Mécanismes Cognitifs, Université Lumière Lyon 2, Lyon, France \*Correspondence: faulkenberry@tarleton.edu*

#### *Edited by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada*

#### *Reviewed by:*

*Jon Freeman, New York University, USA Matthias Hartmann, University of Potsdam, Germany*

**Keywords: numerical cognition, mousetracking, competition, dynamics, embodiment**

#### **A commentary on**

### **Pushing forward in embodied cognition: May we mouse the mathematical mind?** *by Fischer, M. H., and Hartmann, M. (2014). Front. Psychol. 5:1315. doi: 10.3389/fpsyg.2014.01315*

In a recent article, Fischer and Hartmann (2014) present a brief methodological review of the use of computer mousetracking in analyzing the processes involved in numerical cognition. Most certainly this review is a welcome addition to the mathematical cognition literature, especially in light of recent studies that have used the technique to study numerical decision processes. After presenting a general overview of the computer mousetracking method, Fischer and Hartmann make several recommendations (e.g., reporting exact mouse settings, constraining wrist movement, etc.) that will surely help to facilitate comparison and interpretation across a variety of studies as we continue to advance our knowledge of the dynamics of numerical processing. The purpose of the present commentary is not to be critical; rather, we hope that this commentary will be seen as complementary to (as well as complimentary of) the recommendations of Fischer and Hartmann (2014). We feel that their review is timely and informative. However, we also feel that some of the issues raised by Fischer and Hartmann warrant further discussion.

At its core, computer mousetracking is used to construct a temporally rich set of data during decision-making that allows one to conduct a more fine-grained analysis than end-state performance measures alone (such as RT and/or error rates). Most of the recent studies that use this technique to study numerical processes specifically look for the dynamic signature of increased trajectory curvature in certain comparison conditions (e.g., Santens et al., 2011; Faulkenberry, 2014). This signature has been used as evidence for parallel consideration of response options. For example, Santens et al. (2011) measured trajectories in a numerical comparison task in which participants were asked to compare a presented number to the fixed standard 5. As the distance from the stimulus number to 5 decreased, trajectories became more differentially curved, revealing a dynamic interpretation of the numerical distance effect (Moyer and Landauer, 1967). Santens et al. interpreted their results to be in line with a competition-based model of numerical representations. Faulkenberry (2014) extended this result to a numerical odd/even task and showed (via distributional analyses of the response trajectories) that such differential curvatures result from a graded competition between parallel and partially-active representations, and not from averaging across widely different trajectory types. It is important to note that neither of these results could easily have been obtained via traditional cognitive processing measures.

It is on this note that we feel the review of Fischer and Hartmann (2014) unintentionally limits the utility of computer mousetracking to only providing evidence of continuous competition in numerical processing. On the contrary, several recent studies have used the technique to analyze the selective influence of various stimulus factors over the time course of a response. For example, Freeman and Ambady (2011) showed that trajectory deviations happen earlier for inconsistencies in pigmentation cues vs. shape cues in face recognition. Similarly, Freeman et al. (2013) demonstrated earlier deviations for Chinese participants vs. American participants when processing faces with inconsistent contextual cues. While there are not yet any published studies in the domain of numerical cognition that look specifically at when trajectory deviations happen, Faulkenberry and Montgomery (2012) showed that in fraction processing, trajectory deviations which stem from components happen earlier than deviations which stem from holistic magnitude processing. The basic logic of these studies is that with an underlying mapping between response trajectories and perceptual/cognitive processes, any observed difference in the onset of motor deviations necessarily reflects a difference in the time course of the underlying perceptual and/or cognitive processes. As such, these types of manipulations hold promise for number researchers to tease apart the predictions from competing models of numerical processing.

There is one claim on which our opinion diverges from that of Fischer and Hartmann (2014). Specifically, Fischer and Hartmann propose that competitive attraction from a distractor can be inferred *only* in the case where hand trajectories actually move the mouse completely onto the distractor's side of the computer screen. Further, they propose that when trajectories remain completely on one side of the solution space, this "might simply reflect the earlier or later occurrence of the participants' decisions, due to increased task difficulty" (Fischer and Hartmann, 2014, para. 7). This is in opposition to the continuous cognition framework (Spivey, 2007), which posits that any graded deflection of trajectories is due to attraction "toward" a competing response option (this is operationalized in terms of rising and falling activation values during a decision process). In fact, under this framework, increased task difficulty and increased competition are essentially synonymous. For example, Gold and Shadlen (2000) found that when macaque monkeys were trained to indicate the perceived direction of a dot flow by making an eye movement toward that direction, the magnitude of eye movement deviations was modulated by the difficulty of perceiving the coherent dot flow direction. Similarly, Santens et al. (2011) found that numerical distance modulated the curvature of hand trajectories in numerical comparison, even though those trajectories did not deviate into the competitor's space. Both studies interpret these modulation effects as evidence of increased competition. Based on available evidence, we think that it is this increase of deviation/curvature in the presence of more difficult stimuli (or even a lack of deviation in a control condition) that serves as primary evidence for competition effects. Nevertheless, it is important for future research to investigate how the dynamics of hand trajectories reflect competition vs. indecision, and more theoretical work is needed to determine if/when such concepts can/should be dissociated.

Despite our objection to their claim, Fischer and Hartmann (2014) do raise a point worth further investigation. In those cases where hand trajectories do actually verge into the competitor's space, what does this tell us? Further study may reveal additional explanations, but we offer one possibility. In a face recognition study, Freeman et al. (2011) found that hand trajectories swerved toward the attribute that is stereotypically associated with the opposite sex when face cues partly overlapped with that sex (e.g., masculine women or feminine men). They interpreted this swerving as not only representing competing representations, but as a partial triggering of the associated stereotype. Hence, it could be the case that when hand trajectories do verge into the competing space, this says something about the nature of the competitive processes involved. Future research will need to investigate this issue more fully.

In conclusion, we appreciate and endorse the review of Fischer and Hartmann (2014). In spite of a few concerns that we have outlined above, we believe that their recommendations will be very influential, not only to number researchers, but to anyone hoping to use computer mousetracking to study real-time dynamics in cognition.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 November 2014; accepted: 24 November 2014; published online: 10 December 2014.*

*Citation: Faulkenberry TJ and Rey AE (2014) Extending the reach of mousetracking in numerical cognition: a comment on Fischer and Hartmann (2014). Front. Psychol. 5:1436. doi: 10.3389/fpsyg.2014.01436*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Faulkenberry and Rey. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The architecture of embodied cue integration: insight from the "motivation as cognition" perspective

Idit Shalev \*

Department of Education and the Zlotowski Center of Neuroscience, Ben-Gurion University of the Negev, Beer Sheva, Israel

Keywords: concept activation, cue integration, embodied cognition, feature integration, goal systems theory, motivation as cognition, social cognition

### The Malleable Nature of Embodied Cues in Judgment and Behavior

At the core of embodied cognition research is the assumption that higher level processing is grounded in the organism's lower level sensory and motor experiences (Barsalou, 1999, 2008; Meier et al., 2012; Winkielman et al., 2015b). Past research of perceptual multimodal cue integration has demonstrated that several mechanisms underlie perceptual integration (Treisman and Gelade, 1980; Zmigrod and Hommel, 2013). Based on embodied cognition theory, which indicates that activation automatically spreads from concepts driven by experiences in the physical world to their metaphorically-related social concepts (for reviews, Williams et al., 2009; Meier et al., 2012), it was proposed that to produce action, embodied cues associate between lower level and higher level cues. However, little is known about the factors that modulate this integration. This gap in the literature is of relevance because research of embodied cognition has demonstrated that perceptual symbols can lead to different patterns of activation across different contexts (Barsalou, 2008), which makes predictions about judgment and behavior difficult. For example, the associations between physical warmth/coldness and psychological warmth/coldness across different contexts yielded both assimilative effects (e.g., physical warmth increases psychological warmth) (Williams and Bargh, 2008) and contrast effects (e.g., physical coldness increases the need for social warmth) (Zhong and Leonardelli, 2008; Bargh and Shalev, 2012; Shalev and Bargh, 2014; Zhang and Risen, 2014).

Following the recent pragmatic turn in cognitive science, according to which cognitive processes and their underlying neural activity patterns should be studied primarily with respect to their roles in action generation (Glenberg et al., 2013), I argue that embodied cues are integrated according to their momentary functions within each individual's system of goals. Conceptualized as cognitive representations of desired end-points that affect evaluations, emotions and behaviors (Fishbach and Ferguson, 2007), goals serve as reference points toward which behavior is directed. I suggest that analyzing embodied cue integration from the "motivation as cognition" perspective (Kruglanski et al., 2002) may add to our understanding of which cues are perceived, what response is determined as appropriate in a given situation, and why different judgments and behaviors may be elicited by the activation of similar sets of embodied cues. In the sections below, I will discuss three types of constraints that stem from the "motivation as cognition" perspective, including the motivational properties of embodied cue integration (Eitam and Higgins, 2010), the allocational properties of embodied cue integration (based on attentional resource-limitation, see Kahneman, 1973), and the structural properties of cognitive-interconnectedness and uniqueness (Kruglanski et al., 2002).

#### Edited by:

Nicolas Vermeulen, Université Catholique de Louvain, Belgium

#### Reviewed by:

Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada

#### \*Correspondence:

Idit Shalev, shalevid@bgu.ac.il

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 22 November 2014 Accepted: 05 May 2015 Published: 21 May 2015

#### Citation:

Shalev I (2015) The architecture of embodied cue integration: insight from the "motivation as cognition" perspective. Front. Psychol. 6:658. doi: 10.3389/fpsyg.2015.00658

The three types of constraints, adopted from goal systems theory (Kruglanski et al., 2002), were invoked to explain the process of embodied cue integration.

## What Is the "Motivation as Cognition" Perspective?

The "motivation as cognition" perspective assigned distinct functions to motivational and cognitive variables. A basic assumption is that motivation can fluctuate from one moment to the next, thus determining the extent to which any kind of information (strategic and peripheral; conscious and unconscious) is processed (Kruglanski and Thompson, 1999). Mental representations of motivational networks comprise interconnected goals and means that may be automatically activated simultaneously by different cues, and as such, they may compete with each other for mental resources (Kruglanski et al., 2002). Likewise, according to this approach, several cognitive properties set the constraints within which the motivational properties may express themselves. Because both motivation and embodied cognition are types of cognition, this set of cognitive constraints may explain the way motivation influences embodied cue integration. In the sections below, I will discuss the constraints on cue integration, including the motivational properties of embodied cue integration (Eitam and Higgins, 2010), the allocational properties of embodied cue integration (based on attentional resource-limitation, see Kahneman, 1973), and the structural properties of cognitive-interconnectedness and uniqueness (Kruglanski et al., 2002).

### The Motivational Properties of Embodied Cue Integration

The first assumption of embodied cue integration is that because numerous sensori-motor cues can serve as the material for multiple social inferences, a selection process is needed to determine which cues to integrate in a given situation to create meaning. I suggest that perceptual or conceptual saliency depends on whether a mental representation reflects the individual's momentary goals, and what, if any, relationship those goals have with the salient cues in the immediate environment of the individual (Balcetis and Dunning, 2009; Balcetis et al., 2012). A similar line of thought was suggested by De Houwer (2009), indicating that associative learning effects are determined not only by the direct experience of events but also by prior knowledge and instructions. Pursuing this logic, Eitam and Higgins (2010) suggest that whether, and the degree to which, a stimulated mental representation is activated reflects the relative weights of one or any combination of three sources of motivational relevance: value relevance, or the extent to which acting on a mental representation will bring about desired results and/or prevent undesired results; control relevance, or the efficacy with which the activated representation makes things happen; and truth relevance, which establishes what is real. Thus, the relative extents to which these sources are relevant to the individual's needs determine the level and duration of activation, regardless of the content of the representation.

Indeed, recent findings of embodied cognition research have provided strong evidence for the effect of motivational relevance on cue integration. For example, one study demonstrated the source of value relevance by showing that the adoption of approach-type postures (e.g., leaning forward) was associated with increases in neural activation characteristic of approach situations (Harmon-Jones et al., 2011). In another study, the performance of avoidance type movements (pushing a shopping cart as opposed to holding it) was associated with fewer reward-oriented consumer choices at the checkout counter (Van den Bergh et al., 2011). The source of control relevance was demonstrated by showing that embodied simulations of facial expressions were expressed for conceptual understanding only if they were relevant to solving the task at hand (Niedenthal et al., 2009), indicating that embodiments are not passive byproducts of conceptual processing (Winkielman et al., 2015a). Finally, the source of truth relevance was shown in a study where participants were asked to verify or deny that a certain object has a certain property (i.e., answer a question such as "Do cats have wings?"). The results showed that the speed of property verification was related to the perceptual salience of the feature in question (Solomon and Barsalou, 2004). For example, property verification was quicker the more conspicuous the property, presumably because such properties are easier to see in a recalled or simulated visual representation.

### The Allocation Properties of Embodied Cue Integration

The second assumption of embodied cue integration is that the fundamental allocation property relies on limited mental resources. From that perspective, the allocation of cognitive resources has a functional purpose, namely, to minimize the extent to which mental resources are exploited in the creation of unified percepts. The property of limited resources is demonstrated, for instance, by motor fluency effects observed only when individuals are involved in monitoring situational constraints. For example, research showed that compared with rigid right-handers, flexible right-handers recalled product orientations better and showed a preference for objects on which the handle was oriented in the direction of the hand used for grasping (Eelen et al., 2013).

Another application for the limited resources effect is demonstrated by the switching cost entailed in shifting attention from one modality to another (e.g., from audition to vision), indicating that the second stimulus is processed more slowly than it would have been had the two stimuli used the same modality (e.g., Spence et al., 2001). The switching cost was also demonstrated by the perceptual simulation approach, indicating that verifying the properties of concepts in the auditory modality was slower after verifying a property in a different modality than after verifying one in the same modality (Pecher et al., 2003).

The limited resources assumption has several consequences. First, I suggest that the integration process is fundamentally economic and that it operates automatically by activating samples of the interconnection between sounds, sights, and other sensory signals that were encoded in memory based on previous learning (Brunel et al., 2009; Zmigrod et al., 2009; Bargh and Morsella, 2010; Vallet et al., 2010). Evidence for automatic activation is based on the ideomotor theory, which assumes the existence of an automatic action–effect integration mechanism that binds motor patterns and action effect representation (Chartrand and Bargh, 1999; Zmigrod and Hommel, 2013). Second, as was recently proposed by Winkielman et al. (2015b), I argue that non conscious automatic signals, including fluency and a sense of coherence, inform fundamental cognitive and social judgments (Winkielman and Schooler, 2011; Schwarz, 2015), thereby consuming fewer cognitive resources.

### The Structural Properties of Embodied Cue Integration

The third assumption of embodied cue integration is that unified percept configurations are influenced by sensori-motor cue interconnections, including the form and associative strength of those interconnections. The strength of association between multi-modal units is positively related to the uniqueness of the interconnections (Kruglanski et al., 2002).

This dynamic helps explain why specific embodied metaphors have stronger associative links than other metaphors with sensori-motor cues. A possible explanation could be that the repetition of specific social inferences across different contexts in response to sensori-motor contextual cues occurs when the strength of the association is high or in populations where this motivation is accessible. For example, evidence that washing one's hands also "washes away" feelings of guilt was found not only in a normal population (Zhong and Liljenquist, 2006; Lee and Schwarz, 2010, 2011), but also among patients with obsessive compulsive disorder in whom the association between contents related to physical and psychological cleanliness is stronger (Reuven et al., 2014). Likewise, research indicates that core metaphors (e.g., temperature, distance) are associated with multiple conceptual phrases (Schnall, 2014), suggesting possible variability in the strengths of the associations between sub-metaphors associated with the core metaphor. Likewise, individual and cultural differences also influence these associative strengths and may have an impact on the replicability of findings (Shalev and Bargh, 2014).

Another structural application of embodied cue integration is the substitutability relations of cues associated with an identical mental representation. For example, studies of the metaphorical links between physical and social temperatures (e.g., "warm smile," "cold as ice") showed that participants perceive others as "warmer" after they have held a warm rather than a cold cup of coffee (Williams and Bargh, 2008; IJzerman and Semin, 2009, 2010; Shalev and Bargh, 2011; Bargh and Shalev, 2012). Likewise, they experience a room as physically colder after having been socially rejected (Zhong and Leonardelli, 2008), indicating a possible substitutability between physical and semantic psychological concepts.

### Conclusions

This paper suggests that several constraints based on the "motivation as cognition" paradigm modulate the interrelations between perception, emotion and action, and in so doing, they influence embodied cue integration in both bottom-up and top-down manners. On the one hand, active goals influence the feasibility of relevant embodied cues. On the other hand, the perceiver's likelihood of drawing a specific inference may be proportional to the strengths of the associations between contextual cues and sights and sounds encountered by the individual (Zaki, 2013). Based on this reasoning, I suggest that inferences are highly flexible and context-dependent, and therefore, they vary in accordance with situational framing effects (Loersch and Payne, 2011; Wiltshire et al., 2015). As with other psychological phenomena, individual differences (e.g., physical disability, mental health conditions) could increase the likelihood that specific motivational states will be associated with particular embodied cues. Likewise, the repetition of specific social inferences in response to similar sensori-motor contextual cues is possible and may depend on the strength of the association within unique cue configurations. The contribution of the embodied cue integration approach goes beyond explaining the variability of findings across different contexts. By combining cognitive architecture, semantic metaphoric configurations and structural motivational properties, embodied cue integration offers a possible path for integrating different lines of thought in the field of embodied cognition.

### Acknowledgments

The research was partially supported by the Helmsley Charitable Trust through the Agricultural, Biological and Cognitive Robotics Center of Ben-Gurion University of the Negev. I thank Nicolas Vermeulen for handling the manuscript, Guillaume T. Vallet and reviewer for their helpful comments, and Joseph Tzelgov for his constructive comments on a previous version of this manuscript.

### References

Bargh, J. A., and Morsella, E. (2010). "Unconscious behavioral guidance systems," in Then a Miracle Occurs: Focusing on Behavior in Social Psychological Theory and Research, eds C. R. Agnew, D. E. Carlston, W. G. Graziano, and J. R. Kelly (New York, NY: Oxford University Press), 89–118. doi: 10.1093/acprof:oso/9780195377798.003.0006

Bargh, J. A., and Shalev, I. (2012). The substitutability of physical and social warmth in daily life. Emotion 12, 154–162. doi: 10.1037/a0023527


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Shalev. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# SMART-ER: a Situation Model of Anticipated Response consequences in Tactical decisions in skill acquisition — Extended and Revised

### *Markus Raab1,2 \**

*<sup>1</sup> School of Applied Science, London South Bank University, London, UK*

*<sup>2</sup> Performance Psychology, Institute of Psychology, German Sport University, Cologne, Germany*

#### *Edited by:*

*Nicolas Vermeulen, Université Catholique de Louvain, Belgium*

#### *Reviewed by:*

*Sascha Topolinski, University of Cologne, Germany Ashley James Chapman, Northumbria University, UK*

#### *\*Correspondence:*

*Markus Raab, School of Applied Science, London South Bank University, 103 Borough Road, London SE1 0AA, UK e-mail: raab@lsbu.ac.uk*

The Situation Model of Anticipated Response consequences in tactical decisions (SMART) describes the interaction of top–down and bottom–up processes in skill acquisition and thus the dynamic interaction of sensory and motor capacities in embodied cognition. The empirically validated, extended, and revised SMART-ER can now predict when specific dynamic interactions of top–down and bottom–up processes have a beneficial or detrimental effect on performance and learning depending on situational constraints. The model is empirically supported and proposes learning strategies for when situation complexity varies or time pressure is present. Experiments from expertise research in sports illustrate that neither bottom–up nor top–down processes are bad or good *per se* but their effects depend on personal and situational characteristics.

**Keywords: embodied cognition, sport, top–down process, bottom–up process, skill acquisition**

### **INTRODUCTION**

Consider the soccer goalkeeper's simple task of preventing a penalty shooter scoring a goal. The goalkeeper's behavior provides a good example of sensorimotor interaction, that is, the interaction of sensory and motor capacities. Given the distance of the ball to the goal (11 m or 12 yards), the mean speed of a ball of more than 20 m/s (Farina et al., 2013), and a required response of the goalkeeper of 100 ms or more before the actual kick (Farina et al., 2013), the goalkeeper must quickly decide which way to go. In simple terms, the goalkeeper's options are to move to the left, right, or middle. But how can one explain a specific choice—say, a move to the left—and predict when the goalkeeper will jump? The embodied cognition framework suggests that this action is based on immediate and stored sensorimotor experiences.
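The time pressure in this example can be made concrete with a short calculation. The sketch below is only an illustration: it uses the distance and the lower-bound ball speed quoted above, and it assumes constant ball speed with no drag, which is a simplification that is not part of the original argument. The constant and function names are hypothetical.

```python
# Figures quoted in the text (Farina et al., 2013).
PENALTY_DISTANCE_M = 11.0   # distance from the penalty spot to the goal
BALL_SPEED_M_PER_S = 20.0   # text says "more than 20 m/s", so this is a lower bound


def ball_flight_time(distance_m: float, speed_m_per_s: float) -> float:
    """Time needed to cover the distance at constant speed (assumption: no drag)."""
    return distance_m / speed_m_per_s


# Because the speed is a lower bound, the result is an upper bound on flight time.
max_flight_time_s = ball_flight_time(PENALTY_DISTANCE_M, BALL_SPEED_M_PER_S)
print(f"ball reaches the goal in at most {max_flight_time_s:.2f} s")
```

Under these assumptions the ball arrives within roughly half a second, which is consistent with the requirement that the goalkeeper initiates a response before the kick rather than after it.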

Sensorimotor interaction has long been described as a sequential and independent process through which an organism perceives a stimulus, cognitively processes that information, and then selects a response (Proctor and Vu, 2006). In the last decades, however, perception and action have been much more tightly linked, as reflected, for instance, in the theory of common coding (Hommel et al., 2001), which assumes that perception and action share common processes and representations. The premise that actions are coded in terms of their anticipated sensory consequences is a principle as old as psychology itself (James, 1890). A goalkeeper activating an action plan anticipates the sensory consequences of a movement when jumping to one of the goal's corners. Following previous investigations by Beilock (2008), Pizzera and Raab (2012) developed new predictions about the interaction of sensorimotor and cognitive processes. They suggested that stored sensorimotor experiences can influence cognitive judgments. For instance, umpires may judge observed movements better when they rely on their own sensorimotor system, that is, when they have experience with the movement they are being asked to judge.

In this paper I present a model that describes top–down and bottom–up processes of skill acquisition and their interactions over time and explore how learning shapes the use of these processes. This model follows other dual-process models accentuating the importance of motor activity (e.g., Strack and Deutsch, 2004) and common principles of intuitive and deliberative judgments (Kruglanski and Gigerenzer, 2011). A high level of cognitive control of sensory processing characterizes top–down processes. In the penalty example, cognitive control could use knowledge about, say, the shooter's preference to shoot to the left corner. Top–down processes are likely to influence the gaze and the interpretation of sensory information. Bottom–up processes are characterized by an absence of cognitive control in sensory processing and use present information more directly. In the penalty example, this could be the position of the shooter relative to the ball (Savelsbergh et al., 2010).

### **DYNAMIC INTERACTIONS IN EMBODIED COGNITION**

There are at least four different ways top–down and bottom–up processes interact over time (see **Figure 1**). *Selective* interaction means following either top–down or bottom–up processes (i.e., no interaction), so in the penalty example, the goalkeeper would decide to jump to the left based only on knowledge of the shooter's preferred shooting direction, or the shooter would decide to shoot to the left side independent of the goalkeeper's moves. This kind of pure selection seems unlikely from an embodied account of cognition (Myachykov et al., 2013). In *competitive* interaction both processes contribute but one process dominates. For example, the goalkeeper would use knowledge of the shooter's preference to shoot left (from the shooter's perspective) to decide to jump to his right corner (from the goalkeeper's perspective). *Consolidated* interaction means that both processes are involved in the choice. For example, if the top–down process indicates jump left and the bottom–up process indicates jump right, that choice conflict produces displacement activity or frozen behavior (Troisi, 2002); if both processes point in the same direction, faster responses occur. *Corrective* interactions describe sequential effects of processes. For example, the goalkeeper may recall the shooter's preferences and prepare an action tendency toward that preference. However, when observing the shooter approach the ball, the ultimate choice may depend more on bottom–up processes, for instance, the position of the foot relative to the ball. These interactions are dynamic and may depend on previous experience of how success was achieved. Therefore, tactical decisions in skill acquisition are embodied in the sensorimotor system and determine how situations are perceived.
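The four interaction types can be contrasted in a toy sketch of the goalkeeper's choice. Everything numerical here is an assumption made for illustration and is not part of SMART or SMART-ER: evidence is coded as a signed number (negative for left, positive for right), the `dominance` weight and the `threshold` are hypothetical parameters, and the competitive case arbitrarily lets the top–down process dominate.

```python
from typing import Optional

LEFT, RIGHT = "left", "right"


def direction(evidence: float) -> str:
    """Map signed evidence to a jump direction (negative = left, otherwise right)."""
    return LEFT if evidence < 0 else RIGHT


def selective(top_down: float, bottom_up: float, use_top_down: bool = True) -> str:
    """Only one process is followed; the other is ignored."""
    return direction(top_down if use_top_down else bottom_up)


def competitive(top_down: float, bottom_up: float, dominance: float = 0.8) -> str:
    """Both processes contribute, but one (here: top-down) carries more weight."""
    return direction(dominance * top_down + (1.0 - dominance) * bottom_up)


def consolidated(top_down: float, bottom_up: float) -> Optional[str]:
    """Agreement yields a choice; conflict yields no choice (frozen behavior)."""
    if direction(top_down) != direction(bottom_up):
        return None
    return direction(top_down)


def corrective(top_down: float, bottom_up: float, threshold: float = 0.5) -> str:
    """Prepare the top-down tendency, then let strong later evidence override it."""
    prepared = direction(top_down)
    return direction(bottom_up) if abs(bottom_up) >= threshold else prepared


# Knowledge of the shooter's preference points left; the foot position points right.
knowledge, foot_position = -0.6, 0.7
for rule in (selective, competitive, consolidated, corrective):
    print(rule.__name__, rule(knowledge, foot_position))
```

With conflicting inputs, the selective and competitive rules follow the stored knowledge, the consolidated rule returns no choice, and the corrective rule switches to the direction indicated by the later sensory evidence.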

#### **IMPLICITLY AND EXPLICITLY LEARNED SENSORIMOTOR INTERACTIONS**

Expert athletes such as the goalkeeper in the penalty example are often defined as having 10 years and about 10,000 h of training that can influence their current choices at any moment in time (Ericsson and Lehmann, 1996). Therefore, to understand the dynamics of sensorimotor interactions within an embodied cognition framework, interactions of cognitive and sensorimotor processes need to be modeled with specific types of learning. Learning is often differentiated as implicit and explicit; both types are frequently cited in the motor and cognitive learning literature and researchers largely agree on the definition of the concepts (Kleynen et al., 2014).

Implicit learning is defined as a "non-intentional, automatic acquisition of knowledge about structural relations between objects or events" (Frensch, 1998, p. 76), and explicit learning as an intentional acquisition that results in verbalizable knowledge (O'Brien-Malone and Maybery, 1998). Looking at the learning situation itself in terms of where it sits on the continuum of intentionality may help identify implicit and explicit learning. Situations in which actions are incidental in nature engender implicit learning, whereas situations in which actions are intentional in nature engender explicit learning. In the penalty example, a player learning soccer at the beach may differ from a player in an early selection training group in which a coach and verbalized information about options are available. The former may have learned soccer implicitly and would refer to experienced bottom–up processes more than the latter, who has acquired verbalizable knowledge and may use top–down processes more in subsequent behavior. The interactions of top–down and bottom–up processes might fall into the categories described above. People probably combine years of implicit and explicit learning and thus experience hybrid learning (Mathews et al., 1989). It is unclear if hybrid learning leads to better choices than those made by implicit or explicit learning alone.

#### **THE EXTENDED AND REVISED SMART: NOW SMART-ER**

One model that explicitly predicts the effects of implicit and explicit learning on anticipated response consequences of actions is SMART, the Situation Model of Anticipated Response consequences in tactical decisions (Raab, 2007), in which time-pressure decisions in sports (e.g., a tactical choice of shooting or passing) are explained as a function of the interaction of top–down and bottom–up processes. In this model (see **Figure 2**), these processes dynamically interact as described above but in addition, a representation format, or equivalence class (Hoffmann et al., 2004), is chosen. Equivalence classes are representations of sensorimotor interactions of anticipated response consequences. An anticipated response consequence describes a representation of the sensorimotor system in which we predict future (anticipated) changes in the environment as a consequence of our movements. Equivalence classes are representations of sensorimotor interactions active when anticipating response consequences that group the consequences of specific choices together. Finally, previous implicitly or explicitly learned behavior activates a choice rule that allows one to accumulate information before choosing between certain options (Glöckner et al., 2012).

*Figure caption (fragment): … which the initial option in a specific region (k) is contrasted to later evidence in other specific regions (r). Details for equations for weighting top–down and bottom–up processes can be found in Glöckner et al. (2012).*

Here I present an extended (E) and revised (R) SMART, and thus it is labeled SMART-ER. An important extension is the additional focus beyond the person on situations in which dynamic situation-specific sensorimotor interactions take place. Such situations are characterized by whether a fixed set of options is present, as has been used in most research to validate SMART (Raab, 2007), or whether people generate different options themselves, as demonstrated in option-generation paradigms (Johnson and Raab, 2003). SMART has also been extended to include specific predictions based on the complexity of a situation, which can be manifested in the number of choice options, the visual information available, and the speed at which decisions have to be made (Raab, 2003). SMART has been revised to specify when implicit learning and explicit learning, in contrast to hybrid learning, may be beneficial. SMART's predictions of when implicit motor learning is beneficial have been revised to be valid in less complex situations in which the sensorimotor interactions do not require attention regulation via top–down processes. In complex situations, explicit motor learning may be beneficial because it uses knowledge to attend to "information-rich" areas (Magill, 1998). Finally, hybrid learning may be beneficial in complex situations because this type of learning allows the interaction of top–down and bottom–up processes to be calibrated during learning. In simple situations, in contrast, bottom–up processes could regulate the choice and top–down processes would interfere and potentially deteriorate performance.
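The revised predictions can be summarized as a small lookup. This is only a restatement of the paragraph above in code form: the two-level coding of complexity as "simple" versus "complex" is an illustrative simplification, and the names `PREDICTED_BENEFICIAL` and `predicted_beneficial` are hypothetical.

```python
# Learning types that SMART-ER predicts to be beneficial, by situation complexity.
# Assumption: complexity is reduced to two levels for illustration only.
PREDICTED_BENEFICIAL = {
    "simple": {"implicit"},             # bottom-up processes can regulate the choice
    "complex": {"explicit", "hybrid"},  # top-down attention regulation is required
}


def predicted_beneficial(learning_type: str, complexity: str) -> bool:
    """Return True if the learning type is predicted to help at this complexity."""
    return learning_type in PREDICTED_BENEFICIAL[complexity]


for complexity in ("simple", "complex"):
    helpful = sorted(PREDICTED_BENEFICIAL[complexity])
    print(f"{complexity} situations: {', '.join(helpful)}")
```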

In the following sections I present empirical evidence supporting SMART-ER. Evidence has been found in laboratory research on the complex movements involved in choices in team sports (Raab, 2003; Raab et al., 2005). In other work, manipulations of top–down and bottom–up processes were tested by applying time pressure, reducing access to knowledge from long-term memory (Raab, 2003). Instruction manipulations have also been applied (Raab and Masters, 2009). Measures of bottom–up processes include the percentage of early fixations to areas in which the final choice is present (Raab and Johnson, 2007), and measures of top–down processes include whether the first option generated will be overruled if more time for generation is given (Johnson and Raab, 2003). Response time and gaze data have been used to predict participants' first and final choices and model predictions have been cross-validated to other trials or samples (Glöckner et al., 2012).

#### **EMPIRICAL EVIDENCE OF TOP–DOWN AND BOTTOM–UP PROCESSES**

Based on the tradition separating dual processes (Chaiken and Trope, 1999) and the above-cited specific models from Strack and Deutsch (2004), Beilock (2008), and Kruglanski and Gigerenzer (2011), my own work supports these notions using longitudinal data. For instance, in a longitudinal study, expert team-handball players were asked to identify the best option for a playmaker in a video displaying attack situations. The task assessed (a) participants' first choices, (b) alternative options participants deemed appropriate, and (c) after all options were generated, the option participants chose as the best one (Raab and Johnson, 2007). Decision time and gaze behavior, indicating fixations on these options, were measured. Experts abandoned their first option in about 40% of cases and chose a different best option, possibly indicating a top–down influence, as the visual display did not change. Bottom–up processes have been identified by early fixations to important options (Raab and Johnson, 2007).

#### **EVIDENCE OF TOP–DOWN AND BOTTOM–UP PROCESS INTERACTIONS**

In the above-described study (Raab and Johnson, 2007), systematic gaze behavior (e.g., fixation on options on the left attack side) became more strongly correlated with the generation strategy (e.g., generating options on the left side) over the course of the study, indicating consolidated interactions of top–down and bottom–up processes. Measuring fixations over time makes it possible to predict the weighting of early and late information. Results of a study by Raab and Laborde (2011) showed best model fits and cross-validation of model predictions for two-thirds of the participants when early information was weighted more than late information (i.e., indicating less reliance on top–down processes). For the remaining third of participants the pattern was reversed, and thus individual differences in how much deliberation using top–down processes is needed before a final choice is made may explain these patterns.
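The idea of weighting early against late information can be illustrated with a minimal sketch. This is not the model fitted by Raab and Laborde (2011) or the equations in Glöckner et al. (2012); it assumes, purely for illustration, that evidence is a fixation count per option, that early and late counts are combined linearly, and that a single hypothetical weight `w_early` captures the individual difference described above.

```python
from typing import Dict


def weighted_choice(early: Dict[str, float], late: Dict[str, float],
                    w_early: float) -> str:
    """Pick the option with the highest weighted sum of early and late evidence."""
    options = sorted(set(early) | set(late))  # sorted for a deterministic tie-break
    scores = {
        opt: w_early * early.get(opt, 0.0) + (1.0 - w_early) * late.get(opt, 0.0)
        for opt in options
    }
    return max(options, key=lambda opt: scores[opt])


# Hypothetical fixation counts: early gaze favors the left option, late gaze the right.
early_fixations = {"left": 5, "center": 2, "right": 1}
late_fixations = {"left": 1, "center": 2, "right": 6}

print(weighted_choice(early_fixations, late_fixations, w_early=0.7))  # early-weighted
print(weighted_choice(early_fixations, late_fixations, w_early=0.3))  # late-weighted
```

In this toy example a participant who weights early information more heavily settles on the option fixated first, whereas a participant who weights late information more heavily ends up with a different final choice from the same gaze data.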

#### **EVIDENCE OF SITUATION-SPECIFIC LEARNING EFFECTS**

SMART-ER predicts that situation complexity affects implicit, explicit, and hybrid learning. Results from four experiments with option-selection tasks in three different sports indicate that implicit learning indeed produces better and faster choices in low-complexity situations (complexity indexed by a low number of options and low visual complexity), whereas explicit learning does so in complex situations (Raab, 2003). Implicit learners were found to have less verbalizable knowledge that could have been used for top–down processes in contrast to explicit learners. These experiments have been ecologically validated in more realistic sports situations using different kinds of instructions (Raab and Masters, 2009). Finally, hybrid learning, as predicted, has been found to outperform more implicit or explicit learning only in complex situations, indicating a more consolidated interaction of bottom–up and top–down processes (Raab et al., 2005).

#### **CONCLUSION**

SMART-ER extends and revises a previous situation model of anticipated response consequences in tactical decisions. This extension considers the situation in a model of top–down and bottom–up processes and therefore indicates when specific sensorimotor interactions may occur and change behavior. Further, the model reconsiders the benefits of implicit, explicit, and hybrid learning strategies and how they may foster the use of top–down and bottom–up processes. This model has been tested mainly on sensorimotor interactions in quite complex situations, but further evidence has been found in fine motor control (Raab et al., 2013) and neurophysiological correlates are yet to be further tested (Hill and Raab, 2005). Future research should test the model's predictions in other domains and compare the model to others available (Glöckner et al., 2012).

#### **ACKNOWLEDGMENTS**

The author declares that he does not work for, act as consultant to, own shares in, or receive funding from any company or organization that would benefit from this article. The author developed the model and wrote the manuscript. The author agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

#### **REFERENCES**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 October 2014; accepted: 11 December 2014; published online: 06 January 2015.*

*Citation: Raab M (2015) SMART-ER: a Situation Model of Anticipated Response consequences in Tactical decisions in skill acquisition — Extended and Revised. Front. Psychol. 5:1533. doi: 10.3389/fpsyg.2014.01533*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2015 Raab. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Limitless capacity: a dynamic object-oriented approach to short-term memory

*Bill Macken\*, John Taylor and Dylan Jones*

*School of Psychology, Cardiff University, Cardiff, UK*

The notion of capacity-limited processing systems is a core element of cognitive accounts of limited and variable performance, enshrined within the short-term memory construct. We begin with a detailed critical analysis of the conceptual bases of this view and argue that there are fundamental problems – ones that go to the heart of cognitivism more generally – that render it untenable. In place of limited capacity systems, we propose a framework for explaining performance that focuses on the dynamic interplay of three aspects of any given setting: the particular task that must be accomplished, the nature and form of the material upon which the task must be performed, and the repertoire of skills and perceptual-motor functions possessed by the participant. We provide empirical examples of the applications of this framework in areas of performance typically accounted for by reference to capacity-limited short-term memory processes.

#### *Edited by:*

*Lionel Brunel, Université Paul Valery, France*

#### *Reviewed by:*

*Guillaume T. Vallet, Centre de Recherche de l'Institut Universitaire de Gériatrie de Montréal, Canada; Gaën Plancher, Université Lumiere Lyon 2, France*

#### *\*Correspondence:*

*Bill Macken, School of Psychology, Cardiff University, CF10 3AT, Cardiff, UK, macken@cardiff.ac.uk*

#### *Specialty section:*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology*

*Received: 29 September 2014; Accepted: 01 March 2015; Published: 23 March 2015*

#### *Citation:*

*Macken B, Taylor J and Jones D (2015) Limitless capacity: a dynamic object-oriented approach to short-term memory. Front. Psychol. 6:293. doi: 10.3389/fpsyg.2015.00293*

Keywords: short-term memory, limited capacity, perceptual-motor processing, perceptual organization, language and memory

It is paradigmatic in cognitive psychology to attribute performance limitations, in the final instance, to the limited capacity of the processing systems that underpin that performance. This is particularly the case in short-term memory, a 60-year-old paradigm in which the profound difference between these two concepts – contextually limited performance and structurally limited processing capacity – is not often confronted or even acknowledged. Indeed, the idea of a limited-capacity, short-term memory system appears as an integral part in explanations of aspects of cognition as broad and diverse as the development and evolution of language (e.g., Baddeley et al., 1998; Wray, 2000), individual differences in intelligence (Hornung et al., 2011), distraction from task-irrelevant material (e.g., Lavie, 2005), mental arithmetic (e.g., Lee and Kang, 2002) and logical reasoning (e.g., Gilhooly, 2004). So embedded is this explanatory device that limited performance and limited capacity present themselves as inseparable, even identical, postulates. Here, we aim to scrutinize the conceptual and empirical underpinnings of these postulates, and we end by rejecting both of them as either determinable or veridical aspects of human functioning.

We do this from the perspective of what has become known as 'embodied' or 'grounded' cognition, which is to say that our explanatory concepts stem from a focus on the corporeal organism interacting adaptively with its environment. The components of that interaction are the sensorimotor processes of the organism and the way in which they enable it to gather information about and interact dynamically with its ecology. There is a vast range of projects and approaches that fall under the general term 'embodiment' (for broad overviews, see e.g., Glenberg, 1997; Wilson, 2002; Barsalou, 2008; Lakoff, 2012), but here, the relevance stems from our attempt to provide an alternative account for phenomena that classical cognitive science has sought to explain in terms of processes (e.g., encoding, storage, decay, interference) operating on 'central' representations whose essential form transcends the perceptual processes whereby they may be transduced and the motor processes whereby they may be converted into actions. We begin to pose our alternative, embodied approach by considering the basis of the foundational ideas of capacity limitation in cognitive science and set it within a broader, general critique of cognition.

## Capacity Limitation and the Genesis of Cognitive Psychology

When the crisis of Behaviorism occurred in the psychology of the 1950s, key advances in a range of domains provided a context for new thinking about human behavior. In particular, in a decade that was to shape human endeavor in many ways, a range of ideas which could be readily applied to an understanding of human performance were those about information processing and the programmable digital computer. There are two interconnected aspects of these ideas that provide the foundational basis for cognitive psychology: first, quanta of information can be posited, and, second, there are limits to the number of quanta that can be processed at any given moment (e.g., Shannon and Weaver, 1949; Turing, 1950). From this perspective, the limited performance of a device derives ultimately from the limited capacity for information transmission of its basic processing systems.

If the basis of intelligent behavior is the manipulation and transformation of such quanta of information (e.g., Newell, 1990), then it is necessary to establish what that quantum is in a given setting. Cognitive psychology construes capacity limitation in a variety of ways. For some, it is construed as structurally limited 'slots' for the representation of information (e.g., Luck and Vogel, 1997). For others, it manifests as finite processing resources that must be allocated over units of information (e.g., Cowan, 1995; Bays and Husain, 2008), or as temporal constraints on the maintenance of certain types of information (e.g., Baddeley et al., 1975; Barrouillet et al., 2011). Others model capacity limitation in terms of interactions between representations of different elements of information leading to interference or displacement (Lewandowsky and Oberauer, 2009). Others incorporate more than one such conception. Our critique here applies to all of these approaches as they all, in positing capacity-limited processing systems, share a fundamental underlying assumption; the very idea of capacity limitation necessarily connotes some primordial unit to which that capacity relates and by which it can be determined. Most commonly, in relation to capacity, these units are thought of as *items* or *chunks* (e.g., Miller, 1956; Newell and Simon, 1976; Cowan, 2000). However, while the *item* or *chunk* is conceived of as the basic unit to which processing is addressed, its dimensions or content are not necessarily the same in all settings; for a given process, a newly-learned multisyllabic word may be seen as an item formed of pre-existing syllables, but at the same time, those syllables may be deconstructed into smaller elements, such as phonemes.

So, even though they form a basic *sine qua non* for the idea that intelligent behavior may be thought of as the manipulation of units of information, what those elements might actually be – how they are to be quantized and quantified – is already far from clear. Cognition sidesteps this problem in operational terms by asserting what the experimenter or modeler judges to be the basic unit of any given task setting. Modeling attempts, while often explicitly remaining agnostic as to what the actual unit is, nonetheless proceed by asserting what it should be. So, within theories of short-term memory (e.g., Page and Norris, 1998; Burgess and Hitch, 1999; Henson, 1999), it may be a syllable or word, defined by a set of abstract features (vector values which themselves might be conceived of as primordial, or perhaps their constituent dimensions should be, and so on). The issue is exemplified in generative linguistics. Here, a formal and explicit attempt is made to define the primordial elements of a speech act in terms of phonological features that are central, in that they are meant to map to perception and execution while transcending both. So, for example, Chomsky and Halle (1968) posit a finite (provisional) set of phonological features from which all utterances may be assembled. This is just one example of the attempt to determine the basic units to which processing is addressed; in general terms, however, the endeavor pervades cognitive psychology, most assiduously perhaps in those attempts at quantifying capacity, where consideration may be given to whether, for example, a red circle should be deemed an item, or whether the critical quantities relate separately to its shape, its color, its location, and so on (e.g., Hardman and Cowan, 2014).

The problem here is knowing where to stop in the super- or subordinate direction on any other than arbitrary (or arbitrarily expedient) grounds. More critically, by whatever means the determination of the units is achieved, a fundamental issue remains, which is that the problem that the cognitive system is meant to solve in order to support coherent behavior must disappear at some level of granularity, and if the problem can disappear at that level of granularity, why not at any other? Take, for example, the classic 'problem of serial order' (Lashley, 1951), a problem to which much formal theorizing about limited capacity short-term memory addresses itself. The problem of serial order is that since an organism's behavior is temporally extended, the organism must implement some process whereby appropriate elements of that behavior are executed in the appropriate order. However, if it is indeed the case that a fundamental problem of behavior is that of the serial order of units of behavior, then there must necessarily be a point of granularity in behavior at which it is no longer a problem. This is so, simply because to posit the problem in this way means that there must be some primordial, indissoluble unit to which the ordering process is applied, otherwise one finds oneself, *à la* Zeno, in a regression to infinitesimally smaller levels of granularity. Thus, this conceptual approach to accounting for temporally extended behavior has to posit a realm where time does not exist (see Port and Leary, 2005). As with serial order, so with any other cognitive process: if there are *a priori* units which are the objects of constrained information processing, then those units must exist outside those constraints and must themselves already be formed, fully and indissolubly.

### The Nature of Symbolic Technologies and the Nature of Human Thought

The project of limited-capacity short-term memory, therefore, rests logically and methodologically on the possibility of positing *a priori* quanta that are essentially discrete and static: if they are not discrete, then this implicates basic quanta at either sub- or superordinate levels (which is to say that the *a priori* quanta for assessing capacity have been inappropriately defined), while if they are not static, then they are not amenable to the quantification on which the idea of capacity limitation necessarily rests. Before proposing an alternative to this general approach, we first consider why it is that human thought has come to be conceptualized in these cognitivist terms, and why an alternative conceptualization is necessary.

We argue here (see also Port, 2007; Wray, 2014) that the communicative technologies (of which writing is a prime example) developed by humans exert a profound influence on our intuitions, even as scientists, about the content and constituents of our behavior. The accomplishments of *Homo sapiens*, and those that distinguish us from other animals, are less to do with the peculiarity or complexity of our behavior and our mastery over our environment than with the fact that we have developed symbolic technologies that serve to represent and organize that behavior. The products of the medieval cathedral builders are not so much distinguished from those of desert termites with respect to their architectural virtuosity, but in the fact that those cathedrals were first represented conceptually in a symbolic form – a plan – that could serve to coordinate the various inputs to those accomplishments (e.g., Marx, 1844/2007). Critical for current concerns, for about 6000 years human societies have been elaborating technologies for representing the meaningful contents of acts of speaking. The key feature of these technologies for our argument here, and that underlying their efficiency and survival, is that they comprise a finite set of discrete, static, *a priori* elements – symbols that may be assigned to the putative constituent sounds of speech – that can be lawfully combined in different ways to represent the utterances produced within a language. From early Phoenician attempts through to the International Phonetic Alphabet, we witness symbolic technologies that represent speech as an assembly of serially ordered segments that are discrete and static. As with written language, so too the technologies of logic and mathematics: they have been elaborated and refined to provide ever more powerful and abstract means of thinking about nature in terms of the lawful combination and transformation of symbols.

However, *prima facie* motivation for favoring an embodied over a traditional cognitivist approach to human thought arises because, unlike the static and discrete nature of symbolic technologies, biological systems and processes are inherently variable, graded and continuous in time and space, and so translating the behavior of those systems into descriptions based on the manipulation of discrete, static symbols necessarily loses something of the essence of those systems. In itself this is not a problem and may even be a necessity; analytic methodologies of whatever stripe have progressed by developing technologies for 'freezing' this infinite gradation and variability, be it via the mathematics of calculus or the phonetic transcription of an utterance. Furthermore, if the gradation and variability that characterizes the observable behavior of biological systems were merely a surface aspect either of our measurement processes or of the noisy contingencies of any particular instance of behavior (just such an assumption lies at the heart of the generative approach to language), then again, the translation from graded and variable to discrete and static would pose no particular underlying issue. However, the development of dynamic systems theory, and a body of evidence that would not have been obtainable without the developments in computing power in recent decades, have revealed that, far from being contingent or epiphenomenal, this graded, variable nature of biological systems is precisely the essential characteristic that enables them to give rise to new forms through phylogenetic and ontogenetic development (e.g., Thelen and Smith, 1994).

As with biological systems generally, so too with language, where the massive variability in actually perceived and produced language – variability arising from contextual factors ranging from the level of the utterance, to that of the individual physical and environmental heritage, to that of the history of a particular linguistic community – serves (while posing a challenge for the language learner) as the engine for development, both in the particular phonological form of a given language and in the acquisition of that language by any individual learner (e.g., Pierrehumbert, 2003; Port, 2010). It is unsurprising, from this perspective, that the technologies designed to capture spoken language – based on the lawful combination of a finite set of static segments – obscure many of its inherent characteristics and functions. In this respect, the disjunctions between linguistic technologies and speech are manifold and well-known, from the famous elusiveness of invariant phonemes in actual speech, to the deliquescence under critical scrutiny of things whose existence seems as self-evident as 'words' (Wray, 2014), and the remarkably formulaic behavior of utterances that, to the literate, appear readily analyzable and manipulable (Goldberg, 2003; see Beckner et al., 2009 for an overview).

So, while our symbolic technologies for studying nature are, by design, static and discrete, that which we use them to understand is variable and graded. There should be nothing controversial in the observation that these symbolic technologies are not the same as the things to which they apply. We maintain, however, that so immersed are we, not only in the use of speech and in its formal segmental representation in writing, but also in the process of translating from one to the other and back again, that the nature of the technology for representing the thing comes to appear to us as constituting the essential nature of the thing itself. Indeed, many of our intuitions about the nature of speech turn out to derive from conventions of literacy training rather than inherent essential properties of speech sounds or acts (Port, 2007; Wray, 2014). As such, we argue, the cognitivist conception of human thought constitutes a reification – one which an embodied approach has the potential to overcome – whereby those symbolic technologies that are used to represent our thought processes are taken to embody the essential nature of those processes.

A host of conceptual problems can be traced to this reification, not least the essential idea in cognition that any act is the end result of a mental plan that is both precursor to and prefigurative of that act. While some older critiques of such ideas may not receive much current consideration (e.g., Ryle, 1949), more contemporary ones continue to gain purchase (e.g., Thelen and Smith, 1994). The outcome on which we are focused here, however, is that the conflation of the technology with the process of interest – exemplified here by language – leads to the sense that the process itself involves the lawful assembly and manipulation of discrete, static *a priori* units, rather than being inherently graded, variable and dynamic. However, a particular characteristic of the typical conduct of cognitive science serves to circumvent, or at least obscure, such problems; the material input and output of the process under investigation is *already* operationally quantized and ordered, and so the relationship between those inputs and outputs which forms the basis of inference about the processes whereby the inputs are transformed to outputs will necessarily afford description in quantized and ordered terms.

A clear illustration of the distinction between approaching a process from a traditional cognitivist versus an embodied approach can be seen in relation to the study of speech errors that plays a key role in theories of speech production generally. Importantly, the pattern of such errors is typically taken to provide strong support for the idea that speaking involves the lawful assembly of segments into extended utterances according to syntactic constraints across lexical, sub-lexical, and supralexical levels (see e.g., Meyer, 1992; Dell et al., 1993; Levelt, 1993). However, as has been noted elsewhere (Port, 2010), the transcription of utterances into an ordered series of discrete tokens – transcription that provides the basis for the analysis of error patterns – means that the data, while they can reveal different kinds of constraints on segmental order errors, can *only* reveal segmental order errors, and therefore the possible accounts of those errors are already constrained to those based on the ordering of discrete segments, excluding variations of other kinds.

The problem with this becomes evident when speech errors are examined instead through real-time measurement of vocal tract movement during speaking (e.g., Goldstein et al., 2007). Such studies reveal that speech errors, like other types of errors in movement, reflect outcomes of simultaneous and graded execution of more than one gesture at any moment, rather than erroneous and exclusive assignment of segments to ordered frames. For example, when required to repeatedly utter the pair 'cop–top,' people occasionally make errors such as saying 'cop–cop.' In a segmental transcription this error can only be one of two things – either the exclusive substitution of the /k/ for the /t/ at the onset of the second syllable, or the exclusive substitution of the word 'cop' for the word 'top'; either way, the error conforms to a scheme whereby segments are assembled into ordered frames. However, the actual movement of the vocal tract reveals gradations in the errors, such that the dorsum of the tongue (the place of articulation for /k/) varies in height at the point in time when the speaker should be saying the onset of 'top,' involving the tip of the tongue. In most cases it remains low but sometimes it achieves a height beyond that which would be appropriate for execution of the target syllable 'top.' Importantly, not only do these errors vary continuously (i.e., the dorsum does not just bimodally occupy erroneous 'up' or correct 'down' positions), but the erroneous movement of the dorsum is sometimes executed simultaneously with the correct target movement of the tip of the tongue to execute the /t/. The variation is continuous, not discrete or categorical. Furthermore, when these movement errors are compared with those present in phonetic transcription of the utterances, many of them turn out not to be so recorded. Detailed real-time analysis, therefore, reveals a picture in which errors of speech involve the simultaneous, graded and continuous real-time execution of more than one articulatory gesture, some of which may be captured by the discrete, ordered phonetic transcription and some of which are not.
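The loss of information incurred by transcription can be made concrete with a minimal sketch. The dorsum heights, the category threshold and the tolerance below are invented purely for illustration; they are not values from Goldstein et al. (2007), and the function `transcribe` is a hypothetical stand-in for a listener's categorical judgment.

```python
# Hypothetical tongue-dorsum heights (arbitrary units: 0 = at rest,
# 1 = full /k/ closure) at the moment the onset of 'top' is due.
# These numbers are made up for illustration only.
dorsum_heights = [0.05, 0.10, 0.35, 0.55, 0.95]

THRESHOLD = 0.8   # assumed height at which a transcriber would record /k/
TOLERANCE = 0.2   # assumed height below which dorsum movement counts as 'low'


def transcribe(height, threshold=THRESHOLD):
    """Collapse a continuous gesture magnitude into a discrete segment label."""
    return "k" if height >= threshold else "t"


labels = [transcribe(h) for h in dorsum_heights]

# Errors visible in the segmental transcription.
n_transcribed_errors = labels.count("k")

# Graded intrusions of the dorsum gesture that the transcription records as correct.
n_hidden_intrusions = sum(
    1 for h, label in zip(dorsum_heights, labels)
    if label == "t" and h > TOLERANCE
)

print(labels, n_transcribed_errors, n_hidden_intrusions)
```

Under these assumed values, only the most extreme dorsum raising survives as a recorded substitution, while the intermediate intrusions are labeled as correct productions, which is the sense in which a discrete code can only ever return discrete errors.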

The key point here is that the methodology of converting behavior into a discrete, ordered code not only obscures the graded and continuous nature of behavior, but ensures that explanations of that behavior in terms of discrete ordered segments are the only ones that present themselves. Here the conflation of behavior with the technology for representing it not only engenders a particular way of construing the essence of that behavior, but also means that the evidence collected about that behavior comes in a form that necessarily accords with the particular paradigmatic assumptions – that is, that the behavior of interest can be understood in terms of the processing of discrete, static units – and so will never cast doubt, in itself, on the appropriateness of that very paradigm. This precise problem exists in relation to capacity: if we performatively determine what the primordial units of processing are, and we implement those units in the input and chronicle their appearance or otherwise in the output, not only may we never quite settle on a precise determination of what the capacity of the system is (and the literature has not, to date, settled on such), neither will we ever see evidence that we are fundamentally misconstruing the nature of the process we are trying to understand. So, the ongoing theoretical debate about the capacity of short-term memory focuses on questions about what the primordial unit is and how it is to be operationally observed (items, features, chunks, etc.), how many different types of unit there are (verbal, non-verbal, spatial, visual, etc.), and how many and what type of systems there are for processing each of them.

### Dissolving the Question of Capacity: Dynamic Objects Versus Static Items

Rather than seeking explanations of limited performance within the structural characteristics of purported systems and representations underpinning that performance, we instead advocate a framework that focuses on the dynamic interplay of three broad aspects of any given performance setting (see **Figure 1** for a schematic representation), none of which alone can be said to be determinate of performance. The first aspect of the setting is the particular *task* that must be accomplished. This involves not merely the attribution of a function such as 'short-term memory' to a particular task, but a specific consideration of the precise requirements of process. So, for example, serial *recall*, involving the reproduction of a sequence after a brief interval, is fundamentally different from serial *recognition* within which no such reproduction is required. A second aspect is the nature and form of the *material* upon which the particular task must be performed. Here too, consideration must go beyond mere questions of content – e.g., verbal versus visuospatial information – and must also address aspects of the formal organization of that particular content as it may impact on its perception. The third aspect is the available *repertoire* of the performer such as it may be deployed given the two preceding aspects. This repertoire is construed as including general and specific skills (e.g., the general ability for fluent speech production and the specific familiarity with a particular set of verbal material) as well as the performer's perceptual and motor functions (for similar perspectives, see e.g., Craik and Lockhart, 1972; Ericsson and Kintsch, 1995).

The limit to performance – 'capacity' – within this framework is a direct consequence of the confluence of these three broad aspects, with greater congruence amongst them leading to better performance. In this way, our account draws on and elaborates upon the notion of affordance (e.g., Gibson, 1977; Norman, 1988). It shifts the explanatory role of perceptual-motor processing into the foreground, making it a central concern in explanations of short-term memory performance, rather than, as is more conventionally the case, assigning it to the realm of peripheral input/output processing.

None of these aspects of the setting is regarded here as determined, or indeed determinate, in an *a priori* way. Therefore, the explanatory framework we propose is immune from the type of problems discussed above with *a priori* specification of the nature of the items to be processed and the structural properties of the systems that do that processing that inhere in the traditional view of capacity-limited short-term memory. Our objective here, then, is not to resolve the question of capacity, but to supplant it. The key conceptual component of this is to replace the notion of an item as the explanatory unit, with the notion of an object. This is not an unproblematic notion, although the difficulties are fundamentally different to those that accompany the notion of an item. The problematic character of the object, as we use it here, stems from its inherently dynamic nature – it resides neither in the environment, nor in the eye of the beholder, nor in the particular task goals on a given occasion, but rather in the dynamic interplay amongst all three aspects. Undoubtedly, properties of perceptual systems – a function of both ontogenetic and phylogenetic adaptations – and the environment the organism encounters play a key role in the formation of objects, but as a functional unit of performance, the object is also constrained and modified by the particular task at hand. So, an object is brought into being momentarily by the characteristics of these three broad aspects and dissolves or mutates with changes in those aspects. In one setting, the object of performance may be a holistic, temporally extended sequence of sounds, while with a change in, for example, the acoustic properties of that sound, or in the task that has to be performed, the object is transformed into larger or smaller objects. We provide detailed instantiations of these ideas in following sections.

Objects are instantiated in both perceptual and motor forms, with the impact of obligatory and deliberate processes manifested differently in the two domains. Perceptual systems serve to organize information from the environment into objects in an obligatory way, although some environmental input – exemplified in vision by the Necker cube – is also amenable to deliberate reorganization (see e.g., Macken et al., 2003). Movement is also object-oriented in this sense, but here, deliberate, goal-directed action is implicated more typically, for example, in the selection of a particular manual grip configuration to use a tool for cutting as opposed to stabbing. This mapping of perceptual-obligatory and motor-deliberate is not exclusive. So, the obligatory processes of perceptual object formation may establish in a given environment (including both the material and the task) a set of affordances that may be obligatorily mapped onto systems for motor control. Thus, motor control systems are under a degree of obligatory access from perceptual systems (e.g., Rizzolatti and Luppino, 2001; Hickok and Poeppel, 2004). At the same time, the particular activity of motor control systems may influence the precise way in which the perceptual system organizes information in the environment (e.g., Ganel and Goodale, 2003). Thus, objects are formed dynamically out of the combined influence of the three broad aspects of the setting described above. Changes in any aspect may lead to changes in the form of these functional units. The level of performance achieved in any given setting is then a function of the readiness with which appropriate objects to accomplish the task may be formed, and obstacles to task performance are due either to impediments to the ready formation of appropriate objects or to the ready formation of objects that are inappropriate to the particular task requirements.

Below, we demonstrate the usefulness of this approach for understanding short-term memory performance by providing a detailed account of empirical findings to explicate how these concepts play out, and how they provide a coherent account of performance in short-term memory. In replacing the item with the object, our approach overcomes the problem of what the *a priori* units of processing are by not seeking to posit such units in the first place. Rather, we begin analytically by doing away with the assumption that there is any primordial unit, the manipulation of which underpins performance. Necessarily this requires methodological bootstrapping whereby we quantify performance under various conditions, but without actually insisting that the process of quantification addresses the essential units of that performance. The success or otherwise of this approach may be judged in the detailed account of empirical phenomena discussed below. Beyond this, in conceiving of the functional units of performance in momentary, dynamic terms, the question of deriving an underlying capacity limitation is obviated, since a change in any of the inputs to the dynamic system may fundamentally change the system's performance in relation to the task requirements, and do so in an unlimited way. Throughout the following empirical examples, we illustrate these concepts in a concrete way, illustrating the superiority of such an account of performance that focuses on the specific nature of the task, the form and content of the material, and the repertoire of skills and perceptual-motor processes of the participant over one that seeks to invoke capacity-limited systems underpinning that performance.

### Language and Short-Term Memory: Repertoire, Material, and Task

We begin discussion of how these ideas play out empirically by considering the role of language in short-term memory performance, not only because it forms a key focus in theories of the structure and functions of putative short-term memory systems, but also because language presents itself as a cognitive system *par excellence*. Conventional approaches explain the role of linguistic familiarity in short-term memory (e.g., lexical frequency) through its effect in supporting stability and/or retrievability, via long-term representations, of the volatile short-term representations being processed within verbal short-term memory, thereby ameliorating some of the effects of limited capacity (see e.g., Hulme et al., 1991; Schweickert, 1993; Gathercole et al., 2001). Some deficits associated with this account are illustrated in a series of experiments that manipulates such familiarity with the verbal material (Woodward et al., 2008).

A typical experiment from that series involved short-term serial recall for sequences of non-words (i.e., ones which, at the outset, were novel to the participants). Twelve such non-words were divided into two sets (set A and set B, each of six non-words). The beginning of the procedure involved assessing participants' skill (fluency) in saying aloud this verbal material. They were required to read from a computer screen, as quickly and accurately as possible, three types of presentation: single non-words, pairs of non-words and six-non-word sequences. For the pairs and sequences, the stimuli were either 'pure,' in that they were drawn solely from either the A or the B set, or 'mixed,' in that they were constructed from alternate non-words drawn from each set. While all participants bring their general fluent speaking ability to the setting, they were then familiarized with this material via 60 trials of serial recall for six-non-word sequences constructed of random orderings of A and B sets (30 trials for each set). Only 'pure' sequences were presented during this familiarization phase. Importantly, each of the 12 non-words was encountered equally often within this familiarization phase, and so all the individual items should become comparably familiar.
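The logic of this design can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the procedure of Woodward et al. (2008): the non-word labels (`nw01` to `nw12`), the function names, the random interleaving of A and B trials, and the choice that a mixed sequence takes three non-words from each set starting with set A are all assumptions introduced here.

```python
import random
from collections import Counter


def build_sets(nonwords, rng):
    """Split the 12 non-words into two sets of six (A and B)."""
    shuffled = nonwords[:]
    rng.shuffle(shuffled)
    return shuffled[:6], shuffled[6:]


def pure_sequence(item_set, rng):
    """A 'pure' six-item sequence: a random ordering of one set."""
    return rng.sample(item_set, len(item_set))


def mixed_sequence(set_a, set_b, rng, length=6):
    """A 'mixed' sequence alternating items from each set (A, B, A, B, ...).

    Taking three items from each set and starting with A is an assumption.
    """
    from_a = rng.sample(set_a, length // 2)
    from_b = rng.sample(set_b, length // 2)
    sequence = []
    for a_item, b_item in zip(from_a, from_b):
        sequence.extend([a_item, b_item])
    return sequence


def familiarization_trials(set_a, set_b, rng, per_set=30):
    """60 pure serial-recall trials, 30 per set; trial order is assumed random."""
    trials = [pure_sequence(set_a, rng) for _ in range(per_set)]
    trials += [pure_sequence(set_b, rng) for _ in range(per_set)]
    rng.shuffle(trials)
    return trials


rng = random.Random(0)
nonwords = [f"nw{i:02d}" for i in range(1, 13)]  # placeholder labels
set_a, set_b = build_sets(nonwords, rng)
trials = familiarization_trials(set_a, set_b, rng)
mixed = mixed_sequence(set_a, set_b, rng)

# Count how often each non-word is encountered during familiarization.
exposure = Counter(item for trial in trials for item in trial)
print(exposure)
```

Because every familiarization trial contains all six members of one set, item-level exposure is equal across the 12 non-words by construction, so any later difference between pure and mixed sequences cannot be attributed to how often individual items were encountered.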

Following this phase, the measures of fluency described above were taken and compared with baseline skill measures. Familiarization with the material via the serial recall phase led to no reduction in spoken rate for either singles or pairs of non-words. However, the practice phase did enhance the fluency with which 6-non-word sequences were produced, indicated by a reduction in the time taken to speak aloud the sequences. Importantly, this enhancement was not merely due to practice with rehearsing and recalling sequences of this length, since the enhancement only occurred for 'pure' sequences with no change in the fluency with which 'mixed' sequences were produced as a result of the practice. This, then, indicates a very specific skill enhancement in this setting. What is more, when serial recall performance was subsequently tested, this specific skill enhancement led to superior recall performance for 'pure' versus 'mixed' lists, even though each type of list is constructed from items that have been encountered equally often during the familiarization phase.

Clearly, therefore, no explanation couched solely in terms of familiarity with supposed elements of the material ('items') or familiarity with the particular task (serial recall) can account for performance. It might be argued that the familiarization phase led to the establishment of representations of each item organized in associative networks defined by set, and that such associations provide support during recall which will be more evident for pure than mixed sequences (see e.g., Stuart and Hulme, 2000). However, in its focus on the maintenance and retrievability of volatile short-term representations, it is not clear that such a process could also account for the specific enhancement in fluency of speech output when the verbal material does not need to be recalled, but is merely read aloud from the screen. In other words, an account couched in traditional terms would need to suggest different mechanisms giving rise to the recall advantage and to the fluency advantage. On the other hand, an account which emphasizes the increased fluency in motor control associated with specific articulatory practice at assembling sequences of non-words from each of the two sets suffers no such lack of parsimony; the recall advantage and the fluency advantage are merely two consequences of the particular confluence of the task requirements, the nature of the material, and the specific skill within the participant's repertoire.

Undoubtedly, while such evidence raises problems with the general conceptualisation of the role of language in short-term memory, in itself it does not necessarily undermine the notion of 'central' items and chunks which underpin theorizing about limited capacity systems. Indeed, such evidence of effects of experience on a range of skills (memory and fluency) fits readily with classical cognitive ideas about the formation of chunks from smaller entities as a function of practice (e.g., Newell and Simon, 1976). However, a number of other aspects of these effects seem to point more specifically to a motor – specifically co-articulatory – basis to the increased performance, rather than to the formation of larger chunks in central representations of the material. For example, the type of sequential practice described above with a set of non-words led to enhanced subsequent performance with a different set of non-words if that set contained the same articulatory offsets and onsets between non-words as the familiarized sets. However, no such transfer occurred for sets of non-words that shared the internal vowel segment with the practiced set (Woodward et al., 2008, Experiment 3). Similarly, if the familiarization task involved the practiced sequences being uttered in a paced fashion so as to prevent actual co-articulation of successive non-words, then the particular advantage for pure over mixed sequences did not emerge, even though each set of non-words had been encountered together in a practice sequence the same number of times (Woodward, unpublished).

Further non-trivial shortcomings in the classical cognitivist account of the role of language in short-term memory performance become evident when the detailed aspects of the task and the perceptual form of the material are examined. Key evidence for the classical, item-oriented, account of the influence of long-term linguistic knowledge on short-term memory relates to the different influence of such knowledge on serial *recall* and serial *recognition*; a robust influence of linguistic familiarity (specifically, lexicality) is found when performance is tested by serial recall, but that effect is significantly diminished or eliminated when performance involves serial recognition. The key functional distinction between these two tasks is that in recall, the requirement is to reproduce in some form the previously presented sequence, while in recognition, a sequence is presented, followed by a second sequence, identical to or slightly different from (e.g., two adjacent items may be transposed) the original, and the task requires a *same*/*different* judgment. The substantial superiority of recall for words over non-words, and the relative absence of this effect in recognition has been taken to point to a role for linguistic knowledge in supporting the short-term maintenance or retrieval of verbal items; by re-presenting the memory items in the test cue a role for processes involved in the short-term retention of information about those items is obviated. In this way, the interaction between lexicality (words/non-words) and task (serial recall/serial recognition) provides key evidence for specifically mnemonic processes (retrieval, maintenance) operating on item-level representations in short-term memory (e.g., Baddeley, 2003; Jefferies et al., 2009).

However, since such accounts are typically focussed on processes operating on central representations (maintenance, decay, interference, retrieval, etc.), the question of modality in which the material is presented is one that, until recently, went overlooked; specifically, while serial recall tasks have commonly been presented in both auditory and visual forms, serial recognition has almost always been implemented auditorily. For this reason, theorizing about the role of linguistic familiarity in short-term memory represents a case study in the folly of a paradigm that relegates perception and motor control to matters of mere input to and output from a central system, the operation of which gives rise to the essential aspects of performance. We argued above that a key aspect of the influence of familiarity with linguistic material resides in the readiness with which it can be assembled (e.g., via co-articulatory fluency) into an extended sequence of articulatory gestures of the type that may subsume the reproduction of a sequence required in serial recall tasks. Indeed, there is abundant evidence that the influence of familiarity with a set of verbal material is particularly evident in settings in which performance involves the production or reproduction of extended sequences, rather than individual words (e.g., Wright, 1979; Woodward et al., 2008; Bybee, 2010). As already noted, serial recognition does not involve the requirement for sequence reproduction.

Furthermore, and critically in this context, temporally extended auditory sequences may be processed as holistic auditory objects such that they afford matching with other auditory objects in a global fashion – that is, one that does not rely on identification and comparison of the ostensible constituent elements across the two sequences to be judged same or different. For example, classic studies on the perception of auditory sequences have shown that people are able to judge whether two rapid sequences of the same sounds (e.g., a tone, a buzz, a hiss and a click) are in the same or different order without being able to identify what the order of the sounds is, or indeed to identify, for different sequence pairs, which elements have been re-ordered. The ability to identify the constituent source of any differences emerges only when the sequences are presented at a sufficiently slow rate (around 4–5 items per second) as to allow for each successive sound to be verbally labeled (see Warren, 1999). As such, given certain acoustic properties (such as rate and spectral similarity), the sequence of sounds functions as a single unit – an object – while changes in those acoustic properties enable its dissolution into smaller objects. As the compass of the object changes, the type of performance afforded varies, facilitating either global matching of sequences or identification of individual constituents of the sequence along with their order.

Given this, presenting a sequence at certain rates in auditory form affords sequence matching performance that is, strictly, only nominally based on sequential processing; rather, the task can be construed as one in which a single auditory object is compared with another. Serial recall, on the other hand, since it always requires at output the reproduction of a sequence, will be influenced by the readiness with which such a sequence may be assembled from the input material regardless of its presentation modality – readiness that is facilitated by linguistic familiarity (Wright, 1979; Woodward et al., 2008). From this perspective, the interaction between lexicality and task type emerges not, as the classical account has it, because of the different burden on item retention in recall versus recognition together with the increased robustness of such item retention for linguistically familiar material. Rather, it is because auditory sequences may be processed as single objects, and so tasks, like serial recognition, that may be accomplished with such a perceptual form, based on global perceptual matching, will not exhibit effects due to the readiness with which its constituents may be processed. We tested this idea via the simple extension of examining serial recognition and serial recall for both auditory and visual forms of presentation (Macken et al., 2014). Although presentation and retrieval conditions were held constant across modalities, sequential presentation of visual verbal materials in the same location does not afford the same object formation that auditory presentation does and so the burden on sequential processing of the constituents is increased in that setting relative to auditory presentation. We showed that while the effect of lexicality was present regardless of presentation modality when serial recall was required, in serial recognition it was also present with visually presented lists but absent for auditory lists.
The classical account, in which the influence of linguistic familiarity operates via enhanced processing of item-level lexical or syllabic representations, cannot account for this interaction. According to that view the influence of linguistic familiarity is minimized in settings where the burden on item retention is minimized, that is in serial recognition where the items are re-presented and there is no requirement on the participant to reproduce the content of the sequence. This condition applies whether recognition is auditory or visual – both forms involve re-presentation of the original items; hence, the effect of lexicality of the material should be the same regardless of presentation modality. From the object-based perspective, the effect is absent in serial recognition only when the material is presented auditorily since such a form of presentation affords holistic object matching, thus minimizing the need for processes associated with the deliberate assembly of the material into a sequence. For visual presentation there is no such affordance; therefore, the items must be encoded individually and composed into a sub-vocal sequence, a process that is modulated by linguistic familiarity. That the effect does indeed reside within the speech motor system, rather than within enhanced processing of lexical items, *per se*, receives further support from the fact that the effect of lexicality in visual serial recognition is eliminated if the speech motor system is otherwise occupied during the task (by requiring the participant to repeatedly whisper the task-irrelevant sequence *1, 2, 3*... ; Macken et al., 2014, Experiment 2).

The influence of language on short-term memory is, therefore, a complex one. Certainly, it is one in which perceptual and motor processes play a critical role, but even that does not tell the complete story, since a detailed consideration of the precise task requirements is also necessary in order to fully delineate its influence. Language presents itself as a skill, as both a generic aspect of repertoire that enables the formation of speech motor sequences that allow for the reproduction of target sequences, as well as a more specific skill that relates to the enhanced readiness with which particular sets of verbal material may be reproduced. However, the role of motor skill may or may not be evident in performance; some task settings may be addressed on a purely perceptual basis, within which, if the perceptual repertoire of the participant affords it, effects of linguistic familiarity may be absent, while changes to the task or to the form of presentation of the material may bring them back into play. Critically, it is not clear how the traditional approach to capacity could encompass this complex picture. With auditory serial recognition, the functional unit of performance is the whole sequence, while with visual serial recognition, the character of the constituents, along with the sequence, determines performance. The items in one setting are not manifest in the other, but neither is their manifestation determined by the form of their presentation, since if the task requirements are changed, the lexical character of the material is manifest in performance regardless of modality. Discerning in this pattern the lineaments of a system whose inherent characteristic involves a limit to the number of items of a particular type that may be processed is, we argue, a theoretically futile project.

### Objects, Affordances, and Short-Term Memory

The way in which objects are rendered by the perceptual repertoire of the participant, and the extent to which those objects determine short-term memory performance, are also evident in another canonical aspect of short-term memory performance typically attributed to processes operating at the item level: the *talker variability effect* (e.g., Greene, 1991; Goldinger et al., 1991). Serial recall for random sequences of, for example, spoken digits presented in a single voice is superior to recall for a sequence alternating between a male and female voice on successive digits. In terms of the capacity-limited processing of items, this is usually explained by reference to an additional burden placed on the encoding of those items (due to the need to represent the indexical information relating to voice in the limited capacity system or to the need to recode the variable input into a homogenous, canonical form), thereby taxing the limited capacity for storage and/or processing. It is possible, however, to reconstrue these findings within our embodied, object-oriented framework, without recourse either to the notion of item or the burden placed on the limited capacity storage system by its encoding.

This task requires serial reproduction (be it spoken, written, typed, etc.) of the presented sequence of digits. When presented in a single (say, female) voice, auditory perceptual processes obligatorily organize that input into a single relatively coherent sequential representation that matches the required output order of the material. In other words, the content and formal organization of the material affords its ready mapping onto a speech output (or rehearsal) sequence that accords with the requirements of the task. However, that formal organization is fundamentally changed when voices alternate, say between male and female, from one spoken digit to the next. Under these circumstances, perceptual organization partitions the input into two objects, one corresponding to each voice. A substantial body of work on auditory perception (e.g., Bregman, 1990; Warren, 1999) has shown that a key consequence of this process of object formation is that there is very little coherence across objects; for example, the ability to discern the relative ordering of elements belonging to different objects is very poor, compared to the ability to determine the ordering of elements within an object. Consequently, neither of the objects formed by the alternating voices maps readily onto the required output form, since they represent sequences of alternate digits in the input sequence. Thus, in alternating lists the degree to which the material affords ready mapping from the perceptual input to the motor output is reduced due to obligatory perceptual processes that organize the input into sequential representations based on acoustic similarity, in this case the pitch and spectral qualities of the voice (see Hughes et al., 2009). 
Performance is reduced, therefore, since the combination of the form of the material and the perceptual repertoire of the participant renders objects that are less appropriate to the task requirement than when that material is presented in an acoustically homogenous form.

This partitioning of serial alternating voices into two same-voice objects takes time to build up (see e.g., Bregman, 1990; Carlyon et al., 2001), a fact that can be exploited to show how object formation, not the capacity-limited processing of individual items, is responsible for the talker-variability effect. The key manipulation involves a lead-in – a sequence of to-be-ignored items (e.g., a countdown from 9 to 1) prior to, and at the same tempo as the memory list – in either single- or alternating-voice form (see Hughes et al., 2009). By allowing the build-up of object formation to take place prior to presentation of the to-be-remembered material, a lead-in of alternating voices should promote perceptual segregation by voice within the alternating to-be-remembered sequence, thereby reducing still further the affordance or mapping between input sequence and required output sequence. If item encoding were responsible, this manipulation would be expected, minimally, to have no effect, or to reduce the impact of voice alternation within the sequence by virtue of familiarization with the acoustic variability. However, the alternating voice lead-in causes further reduction in performance beyond that found with the basic alternating voice condition (Hughes et al., 2009).

This is not to say, however, that perceptual organization *alone* determines performance since if the task requirements are changed such that the same verbal material no longer needs to be retained in order, but only memory for the content is required, then the talker variability effect no longer emerges as it does when the requirement is for serial reproduction of the sequence (Hughes et al., 2011). This further illustrates that the specific interplay of the form and content of the material, the nature of the task that needs to be accomplished with that material, and the repertoire of the participant jointly determine performance in a way that cannot readily be explained by reference to structural properties of any of those individual aspects alone. Since all of these factors combine interdependently to determine the level of performance, again we see the fundamental problem with the project of limited capacity. We might frame the question rhetorically: what is the correct combination of task, material and participant repertoire to choose in order to accurately assay the capacity of the underlying system?

### Interference as Task-Irrelevant Affordance

These foregoing examples illustrate the application of our framework to settings in which only task-relevant material is presented to the participant. In our next example, we show how the same framework applies in settings in which task-relevant and task-irrelevant material is present. From the traditional, capacity-limited perspective, the presence of task-irrelevant material taxes the capacity-limited system via mechanisms such as depletion of resources or structural interference between representations of the relevant and irrelevant items, typically based on the similarity between the two (see e.g., Cowan, 1995; Kane and Engle, 2003; Lavie, 2005; Oberauer and Lewandowsky, 2008; Baddeley, 2012). Within cognitive science more broadly, the assumption that an observed reduction in performance in the presence of task-irrelevant material must be due to some process leading to the degradation of the item-level representations underpinning

The shortcomings of the classical, item-focussed, cognitivist perspective become clear when considering the impact on performance of the presence of task-irrelevant sound during a short-term memory task. The presentation of such sound, which participants are instructed to ignore and on the contents of which they are never tested, during a serial recall task leads to substantial (typically on the order of 30–50%) disruption to serial recall performance (e.g., Colle and Welsh, 1976; Salamé and Baddeley, 1982; Ellermeier and Zimmer, 1997). The effect cannot be due to the depletion of limited processing resources; the size of the effect does not diminish with time, either over the duration of an experimental procedure or over days and weeks (Hellbrück et al., 1996; Jones et al., 1997b), nor is predictable sound less disruptive than unpredictable sound (Tremblay and Jones, 1998), nor is the degree of disruption related to any other measures of what are considered a participant's ability to resist distraction (Beaman, 2004). Neither can the effect be attributed to structural interference between representations of irrelevant and relevant items; the sound need not be verbal, nor is its disruptive potency a function of its similarity to the task-relevant material (see Macken et al., 2009). There are nonetheless basic, acoustic characteristics that are necessary within the sound for the effect to emerge; the sound must be perceptually segmentable (e.g., due to the presence of silent intervals between successive sound tokens or the presence of rapid modulations in frequency and/or amplitude within continuous sound, such as are found in continuous speech), and each segmented entity must be different from the preceding one. Perceptually continuous sounds, or repetitions of an identical token do not cause disruption (Jones et al., 1993). In other words, the sound must constitute a sequence, of whatever content. 
However, again, this aspect does not, on its own, determine performance in the presence of such sound. If the task requirements are changed so that reproduction of the memory sequence is no longer required, while retaining the requirement to remember all the content, then performance is unaffected by the presence of task-irrelevant acoustic sequences. A task requiring reproduction of a sequence constituted of a random ordering of all but one of the digits 1 through 9 will be substantially disrupted by the presence of task-irrelevant auditory sequences. However, if precisely the same material is presented, in precisely the same way, but now the task requires identification of which of the digits was missing from that particular sequence, then no such disruption is evident (e.g., Jones and Macken, 1993; Beaman and Jones, 1998; Macken et al., 1999). This task still requires retention of the content of the sequence until the response is required, but the particular sequential order is no longer relevant (or useful) in accomplishing it. Clearly, then, the sound is not disrupting performance by somehow degrading the representations of the memory material, since it should otherwise affect any performance that depends on the retention of that material.
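The contrast between the two tasks can be sketched as follows, under stated assumptions. Both tasks use identical material and differ only in what the response must preserve. The function names are hypothetical, and strict position-by-position scoring of serial recall is an assumption, since the scoring rule used in the cited studies is not described here.

```python
import random

DIGITS = list(range(1, 10))


def make_trial(rng):
    """Return (sequence, missing): a random ordering of all but one of the digits 1-9."""
    missing = rng.choice(DIGITS)
    sequence = [d for d in DIGITS if d != missing]
    rng.shuffle(sequence)
    return sequence, missing


def score_serial_recall(presented, response):
    """Proportion of serial positions reproduced with the correct digit.

    Assumption: strict positional scoring; order errors are penalized.
    """
    correct = sum(p == r for p, r in zip(presented, response))
    return correct / len(presented)


def score_missing_item(presented, response):
    """True if the response names the absent digit; presentation order plays no role."""
    (missing,) = set(DIGITS) - set(presented)
    return response == missing
```

Transposing two items in a response lowers the serial recall score but leaves the missing-item judgment untouched. In that sense only the first task depends on sequential order, while both depend on retaining the content.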

How does this pattern of disruption in the presence of task-irrelevant material fit within our framework? In our account of the talker variability effect, we argued that key details of that effect emerge from the way in which the acoustically variable form of the verbal material leads to obligatory perceptual organization that does not readily afford what the participant is required to do in the particular setting; it leads to the formation of two auditory objects, neither of which corresponds to the target sequence. The general point here is that, for task-relevant material, the greater the appropriateness of the momentary object, i.e., the affordance, or perceptual-motor congruence, between the input and output forms, the better performance will be. Precisely the same mechanisms, we argue, are at play when we examine the disruptive impact of task-irrelevant sound on the ability to reproduce task-relevant sequences. To the extent that such sound affords the same activity that is required of the task-relevant material – that is, sequential reproduction – then competing affordances are established within the setting that impact upon the ready accomplishment of the specific task goal. The impact does not operate by degrading the representation of the relevant material, but by providing an alternative object competing with the task-relevant sequence for control of the motor output process required to accomplish the task (analogous to effects of competing affordances on reaching and grasping in the visuo-motor domain, see e.g., Cisek, 2007). The serial recall task is one requiring the reproduction of a sequence, and the environment contains not only the task-relevant sequence, but other sequences, which via processes of obligatory perceptual organization render objects that represent potential, alternative candidates for control of the sequential motor system utilized to perform the focal task.
Thus, when the requirement for sequential output, as embedded in serial recall, is removed from the task and all that is required is retention of the content, the presence in the setting of task-irrelevant sequential affordances is no longer of relevance for the accomplishment of that task and so performance proceeds unhindered. So, if the task does not require the construction of a sequential motor object in order to reproduce the sequence, but rather only requires retention of the content, there is no cost associated with the presence of task-irrelevant sequences.

Precisely the same de-potentiation of disruption from task-irrelevant material occurs if the sequential affordances are removed from that material. For example, Jones and Macken (1995) contrasted the disruptive effect of a repeated series of three sound tokens presented in a spatial configuration to give rise to perception of a single recurring sequence of three tokens (simultaneous stereophonic presentation via headphones) with that of the same three tokens, presented with the same recurring timing and order, but this time with each token presented from a separate location (left, center, and right channels of the headphones), thereby giving rise to the perception, not of a single sequence, but of three non-sequential streams of sound (see **Figure 2**). In the former case, the task-irrelevant material embodies sequential affordances, while in the latter, those affordances are stripped from the material by changing the form in which it is presented. This manipulation dramatically attenuates disruption, even though the content of the task-irrelevant material remains the same in both cases. So, here too, an account framed in terms of constituent items within task-irrelevant material competing with those within the task-relevant source for limited capacity storage or processing is untenable. The same content – the ostensible items – is present in both cases within the relevant and irrelevant material, but in one case the form of interaction leads to substantial impairment in performance, while in the other it does not.

### Interference as Object Formation

The power of this object-oriented approach in providing a better explanation of effects on performance classically attributed to interference becomes evident in two further, very different settings in which the ostensibly degrading effects of irrelevant on relevant items within capacity-limited systems have been charted.

The first involves the effect of interpolated sound on the ability to make *same*/*different* judgments about the pitch of two successive tones, and the second involves the disruptive effect on serial recall of auditory sequences of a redundant auditory item occurring after sequence presentation.

A key focus for the debate about the source of limited capacity in short-term memory involves the ability to retain information about simple events in the presence of both variable time delays and the presence of different types of material interpolated between the initial presentation of that event and the requirement to make a judgment about it (e.g., Oberauer and Lewandowsky, 2008; Mercer and McKeown, 2010). **Figure 3** depicts such a setting in which it appears that performance is determined by interference amongst the representations of items within a limited capacity system. The setting, from Deutsch (1970), involves the presentation of a short tone – the standard – followed several seconds later by another tone – the test – either identical to or different from (plus/minus a semitone) the standard. Different types of to-be-ignored material may be interpolated between tones, and the task is to make a *same/different* judgment.

The basic finding is that performance on the pitch discrimination task is substantially poorer when the interpolated material comprises tones in a similar frequency range and with the same timbre as the standard and test tones than when that material is formed of spoken digits. On the face of it, this seems to provide clear evidence for similarity-based interference within capacity-limited storage; the representation of the standard tone is more subject to degradation by other, similar-sounding tones, than by the acoustically different digits. As such, by the time the test tone is presented, the information necessary to perform the task is still relatively intact in the latter compared to the former condition (e.g., Crowder, 1993). The capacity limits in the system underpinning the performance are therefore revealed by its inability to sustain a level of performance when it must represent other, similar, items to the target material compared to when no such irrelevant material is present, or when the irrelevant material is sufficiently different as to lead to no structural interference.

Jones et al. (1997a). The standard tone (S) is succeeded either by a series of other tones of varying frequency or by a random sequence of spoken digits, followed in turn by the test tone (T) either identical in pitch to the standard, or a semitone higher or lower.

However, a radically different account of this classic finding is provided by our object-oriented approach.

The starting point for this account is that the process of object-formation has implications for the addressability of the constituent features of that object. We touched on this in the discussion of the perceptual basis of auditory serial recognition, with respect to the findings that judgments about whether two auditory sequences are the same or different may be made even when access to the precise constituents that differ (or not) between the two sequences cannot be made (Warren, 1999). Thus, the formation of an auditory object from an extended sequence of sounds may impede the extent to which aspects of the segments of that object may be identified. A classic demonstration of this is depicted in **Figure 4**, in which the task is to judge whether the order of a pair of tones is the same or different on two instances (Bregman and Rudnicky, 1975). On the second instance, the target tones are either presented unaccompanied (**Figure 4A**) or in the presence of preceding and succeeding 'flanker' tones close in frequency to the targets (**Figure 4B**). Performance is markedly impaired under these conditions, but not because the flanker tones have somehow interfered with the representations of the target tones. That this cannot be the source of disruption is demonstrated in **Figure 4C** in which a series of 'captor' tones is presented before and after, and at the same frequency as, the flanker tones.

Under these circumstances, performance is restored to that observed when the targets are unaccompanied, although there is now even more potentially 'interfering' material in the acoustic setting. This pattern reveals the formation of different auditory objects across the three settings; in the first, the pairs of target tones in the standard and test stimuli form an initial and subsequent object that affords global matching in order to make the judgment. In **Figure 4B**, when the target tones are re-presented at test, they no longer form an object on their own, but rather are bound into a different object, along with the flanker tones, which does not afford ready matching with the standard stimulus, since the critical information needed to perform the task is now part of, and embedded within, the test object. The presentation of the captor tones serves to change the object formation again; in this case the flankers are bound into, on the basis of their identical frequency, an object with the captor tones. The identical acoustic composition of the captor and flanker tones forms a powerful perceptual cue to them forming a coherent object in its own right, leading to the perceptual segregation of the target material from the preceding and succeeding material. This isolation as an object once more permits ready comparison of standard and test stimuli.

FIGURE 4 | Depiction of three critical conditions from Bregman and Rudnicky (1975). H, high tone, L, low tone, F, flanker tone, C, captor tone. Order of H and L tones may be either the same or different on both occasions. Panel (A) indicates condition where target test tones are presented on their own, panel (B) indicates the addition of flanker tones immediately before and after target test tones, and panel (C) indicates the further addition of captor tones preceding and following the flanker tones.

It is not interference between item-level representations in capacity-limited systems, therefore, but the addressability of information that is the crux of performance in this setting. The formation of elements into objects reduces the unique identifiability of those elements; they are no longer whole entities in themselves, but rather need to be recovered from the object, and there is a performance cost associated with this. Critically, it is the process of object formation that determines the addressability of those sub-object features, not processes of interference *per se*, since all that is required for the information therein to become readily addressable again is for a different object organization to be formed from the acoustic environment. The findings of Bregman and Rudnicky (1975) illuminate the setting shown in **Figure 3** and the different effects of interpolated tones and speech on the ability to judge whether or not the test and standard tones are of the same frequency. When followed by a series of similar sounding tones, the standard is more likely to become bound with those tones into an auditory object than when it is followed by the acoustically distinct speech, therefore the addressability of it as a tone in itself is reduced in the former compared to the latter. We tested this alternative account in a series of experiments (Jones et al., 1997a), two key aspects of which are illustrated in **Figure 5**. In the first case (**Figure 5A**), we compared pitch discrimination performance under conditions where the timing of the tones was the same as that used in the original Deutsch (1970) experiments – that is, the interval between the standard and the first interpolated tone was the same as that between interpolated tones – with conditions where that initial interval was increased. Notice here that the overall interval between standard and test tones has been increased by over a second, so the period over which the comparison has to be made has increased appreciably over the usual form of the task. Nonetheless, this manipulation improves performance significantly. However, the same timing manipulation (not depicted in **Figure 5**) with spoken digits between standard and test tones had no effect on performance, so it is not simply the case that increasing the interval between standard and interpolated material is advantageous to performance; it is only so when such an increase is likely to aid perceptual partitioning of the target material from the interpolated, and such segmentation is already so strongly signified by the change from pure tone to spoken digit that the timing is redundant: the target and the interpolated material have formed two distinct objects regardless of the timing manipulation. In **Figure 5B**, the comparison is between the original type of stimulus and one in which the rate at which interpolated tones are presented is doubled. Again, notice that this means that the number of ostensibly interfering events between standard and test tone has been doubled and yet, this leads to significantly better pitch discrimination performance; an effect diametrically at odds with the classical interference account of the effect of interpolated material. On the other hand, an account framed in terms of the consequences of object formation for the addressability of elements within those objects provides a ready explanation; just as with the manipulation depicted in **Figure 5A**, doubling the rate at which the interpolated tones are presented (**Figure 5B**) leads to them forming a coherent object in themselves, concomitantly leading to the perceptual segmentation of the target tone from that irrelevant material, thereby affording more ready perceptual matching.

Jones et al. (1997a). Panel (A) illustrates the manipulation of increasing the interval between the standard and the first interpolated tone, while Panel (B) illustrates the manipulation of doubling the number and rate of interpolated tones. In both cases, performance is better in the conditions illustrated in the lower stimulus type than in the upper one.

It is not easy to see how this set of results can be explained by reference to the decay of or interference with volatile short-term memory representations within limited-capacity storage of individual target events. Increasing the delay and increasing the quantity of interpolated material actually improves performance, and an account framed in terms of the formation of perceptual objects, and the consequences that has for the addressability of information within those objects, provides a coherent account, not only of the effects depicted in **Figure 5**, but also of the finding that performance is better when the interpolated material is acoustically different from the target material. Similarity between operationally relevant and irrelevant sources of material affects performance, by this account, not due to processes of interference leading to the degradation of item-level representations of the relevant information, but by affecting the likelihood that relevant information will be bound into a single integrated object along with the irrelevant, coupled with the concomitant loss of individual identity that accompanies such dynamic object formation. The limits to performance, by this account, arise not from burdens on limited capacity systems to store or process task-relevant information in the presence of task-irrelevant information, but rather on the appropriateness or otherwise of the objects formed in the setting – a function of the nature of the material and the perceptual repertoire of the participant – in the light of the particular requirements of the given task.

Finally, we turn to yet another classic setting in which the limited capacity of short-term retention systems has been investigated – the *suffix effect* (Crowder and Morton, 1969). This refers to the finding that redundant auditory events occurring at the end of a to-be-remembered auditory sequence disrupt serial recall of that sequence, especially recall of items toward the end of that sequence. For example, recall of a random sequence of the digits 1–9 is disrupted if the last digit in the sequence is followed by an auditory item that is not to be recalled (e.g., the spoken word 'go'). The more acoustically similar the suffix to the list content, the greater its disruptive effect; for example changing the voice between sequence and suffix, changing the location, intensity, timbre and so on, all lead to a decrease in the suffix effect (e.g., Crowder and Morton, 1969; Nairne, 1990; Surprenant et al., 2000). Here again, degradation of representations of the target items by the irrelevant suffix appears to be implicated as the mechanism of disruption. However, it turns out not to be the case. Applying the type of consideration of auditory object formation discussed above, Nicholls and Jones (2002; see also Maidment and Macken, 2012; Maidment et al., 2013) compared the effect on serial recall of a suffix presented on its own at the end of a to-be-remembered sequence with one presented in the presence of a 'captor' sequence running simultaneously with the to-be-remembered sequence of auditory digits. In both types of condition a suffix (the spoken word 'go') was presented after the final digit, so according to the item interference account, it should degrade the representation of the final sequence items. 
However, while the typical disruption to performance occurred under standard conditions, the addition of a captor sequence – repetitions of the word 'go' – throughout sequence presentation restored performance, and did so to the extent that the captor sequence formed a coherent perceptual object into which the suffix would be captured. The ability to recall the target information, therefore, is not impeded in the presence of the suffix due to degradation of its representation in limited capacity short-term storage, but rather due to the effect the suffix has, in being bound with the target sequence, of reducing the addressability of information within the sequence, just as illustrated in the examples depicted in **Figures 4** and **5** above. When the suffix, despite occupying the same temporal and acoustic relation to the target material, is captured and bound into an object other than the target sequence, then its impact on recall of that sequence is accordingly eliminated.

### Conclusion

Across these diverse settings we see a picture of performance that is not easily captured by an approach that presumes the functional units correspond to 'items' – be they tones or speech sounds, relevant or otherwise – and the form of interactions – storage, decay, interference, etc. – amongst these items in limited capacity systems. Rather, performance reflects a much more dynamic set of processes that are implicated in the formation into objects of the whole task environment, the utilization and transformation of those objects, as a function of the nature of the material and the repertoire of the participant, in order to accomplish the particular requirements of a given task. We have described a wide variety of settings in which manipulations, from the classical view of limited capacity short-term memory, would appear subtle or mere matters of input to and output from the limited capacity central system, but which from our dynamic object-oriented approach can be seen as *the* determinants, in combination, of performance.

Importantly, therefore, since they play no explanatory role, there is no requirement within such a framework to determine what the putative static, discrete items might be in terms of which the limits to performance might be quantified – all the explanatory concepts are orthogonal to such concerns. In this way, the reciprocal question of capacity and how it might be expressed or to what it might relate disappears. Further, the approach we have outlined here connotes a different conception of limited performance to that resident within 60 years of cognitive psychology. The analysis of the history and conceptual origins of the cognitive approach with which we began, implicates aspects of a particular *Zeitgeist* in the early conflation of limited performance with limited capacity of the systems underpinning performance. The approach we have outlined here attributes the limited performance which has constituted the focus of so much investigation not to the limited capacity of any system or systems, but rather (we might say, 'simply') to the fact that the settings in which such performance has been investigated have, by design, reduced the confluence between the repertoire of the participant, the task she or he is required to accomplish, and the material that forms the focus of that task. In place, then, of capacity and its concomitants, we propose performance limitations be examined by detailed focus on the dynamic interplay of multiple aspects of the task setting. Undoubtedly, these involve aspects of the perceptual-motor systems co-opted to perform a given task. But in themselves these are not determinant, since other aspects of the setting such as the precise requirements of the task, will determine when, what, and if such aspects of the participants' repertoire impact on performance. 
This means that there are, in principle, no structural limitations on performance; the more congruent the confluence of the aspects of the setting – the task material, the task requirements and the repertoire of the participant – the better performance will be. The examples we have described here show that such a framework accounts for the modulation of performance in a detailed and dynamic way, as a function of the congruence of multiple aspects of the setting; to the extent that the degree of such congruence is unlimited, then so too is performance.

### Acknowledgments

Preparation of this paper was supported by a research grant (no. ES/I028919/1) to the authors from the Economic and Social Research Council of the UK.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Macken, Taylor and Jones. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Embodied cognition of aging

Guillaume T. Vallet<sup>1,2</sup>\*

<sup>1</sup> Centre de Recherche, Institut Universitaire de Gériatrie de Montréal, Montréal, QC, Canada, <sup>2</sup> Department of Psychology, Université de Montréal, Montréal, QC, Canada

Embodiment is revolutionizing the way we consider cognition by incorporating the influence of our body and of the current context within cognitive processing. The growing number of studies supporting this view of cognition in young adults stands in stark contrast with the lack of evidence in favor of this view in the field of normal aging and neurocognitive disorders. Nonetheless, the validation of embodiment assumptions across the whole spectrum of cognition is a mandatory step in order for embodied cognition theories to become theories of human cognition. More pragmatically, aging populations represent a perfect target for testing embodied cognition theories because of the concomitant changes in sensory, motor and cognitive functioning that occur in aging, since these theories predict direct interactions between them. Finally, the new perspectives on cognition provided by these theories might also open new research avenues and new clinical applications in the field of aging. The present article aims at showing the value of exploring embodiment in normal and abnormal aging, as well as introducing some potential theoretical and clinical applications.

#### Edited by:

Anna M. Borghi, University of Bologna and Institute of Cognitive Sciences and Technologies, Italy

#### Reviewed by:

Annalisa Setti, University College Cork, Ireland

Katinka Dijkstra, Erasmus University Rotterdam, Netherlands

#### \*Correspondence:

Guillaume T. Vallet, Centre de Recherche, Institut Universitaire de Gériatrie de Montréal, 4545 Chemin Queen-Mary, Montréal, QC H3W 1W5, Canada

gtvallet@gmail.com

#### Specialty section:

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received: 28 January 2015; Accepted: 31 March 2015; Published: 16 April 2015

#### Citation:

Vallet GT (2015) Embodied cognition of aging. Front. Psychol. 6:463. doi: 10.3389/fpsyg.2015.00463

Keywords: embodiment, aging, dementia, sensory-motor processing, Alzheimer's disease, Parkinson disease

## 1. Introduction

In recent decades, the human mind was thought to work like a computer, manipulating abstract symbols through different processing steps using clear rules (see Fodor, 1975). These rules are defined as insensitive to contextual variations, except when the context itself becomes an input of the cognitive system. A cat should always be a mammal with four legs regardless of the current situation (Collins, 1975). However, different places should trigger different personal memories (Tulving et al., 1983). This approach, known as cognitivism, was dominant in the study of cognition until recently, when embodiment, also known as enactivism (Varela et al., 1991), started to revolutionize the way we conceive cognition (e.g., Glenberg et al., 2013).

In contrast to cognitivism, embodiment focuses on the interactions between cognition, the body and the environment. Cognitive processing is supposed to be directly impacted by the situation in which it occurs. A chair is processed differently when you are tired compared to when you want to change a light bulb (e.g., Barsalou, 1999; Pecher et al., 2003). Therefore, action as well as sensory and motor components are at the core of cognitive processes (Wilson, 2002). Embodiment has been successfully applied to numerous fields of cognition including language (Casasanto, 2011), memory (Versace et al., 2014), attention (Bradley, 2007), or action (Zmigrod and Hommel, 2013). Surprisingly, these theories remain almost unexplored in neuropsychology.

The present perspective aims at showing the value of exploring and applying embodied cognition theories to aging as an alternative to more traditional views. The first reason is theoretical. In order to become a theory of human cognition, embodiment should be validated across the whole spectrum of normal cognitive development, from children to elderly adults. Embodied theories should furthermore be able to explain specific cognitive dysfunctions such as those occurring in neurocognitive disorders. A growing number of results are reported in babies and children with or without cognitive disorders (e.g., Wellsby and Pexman, 2014). Yet, only very few articles are published in the field of aging (e.g., Dijkstra et al., 2007), and even fewer articles tackle neurocognitive disorders from an embodied perspective (e.g., Vallet et al., 2013a). The second reason is more pragmatic. The specific changes occurring in aging make this population particularly interesting and relevant for investigating some of the core assumptions of embodiment presented below. Moreover, the inclusion of the body and the context in cognition should open new research avenues and may also lead to new clinical evaluation and remediation methods of cognitive functions. We will first present some key points about aging that will serve as the foundation for the next sections, which will deal with the application of embodiment to healthy aging and some neurocognitive disorders. We will focus on the interaction between cognition and (a) motor/action and (b) senses/perception based on theoretical arguments and, when available, experimental evidence.

### 2. Normal Aging and Neurocognitive Disorders

Becoming older is associated with changes in almost all spheres of the individual, including physical condition, the senses, brain function, and cognition. Each sensory organ is affected in aging (Ulfhak et al., 2002), resulting in a decline in perception, including higher perceptual thresholds (Fozard and Gordon-Salant, 2001). Similar phenomena occur in the motor system, with a loss of motor neurons, a decrease in muscle mass (e.g., Vandervoort, 2002), and a decrease in strength, resulting in gait and balance alterations (Boelens et al., 2013).

As the brain suffers from aging, cognition is also changing. Elderly adults show the greatest decline in cognitive speed of processing (Salthouse, 2000), attention, and executive functions (Greenwood, 2000) as well as in some aspects of episodic memory, mainly in free recall (Danckert and Craik, 2013). In contrast, most aspects of language, semantic memory and more automatic aspects of attention or memory remain preserved (Glisky, 2007).

Most neurocognitive disorders are also accompanied by sensory and perceptual impairments. These deficits, however, are more pronounced in such disorders than those occurring in normal aging. For instance, the alteration of the nervous system in Alzheimer's disease (AD) causes significant decline in touch (Stephen et al., 2010), audition (Gates et al., 2011), vision (Kirby et al., 2010), and so forth. Yet, basic perception tasks such as detection of visual features (e.g., orientation, colors, etc.) and perceptual priming tasks remain preserved (Fleischman et al., 2005; Joubert et al., 2010). Motor and movement disorders are also very common in neurocognitive disorders, especially in Parkinson disease (PD; Beitz, 2014) and Lewy body dementia (Molano, 2013). Cognition is also more deeply affected than in normal aging, with predominant episodic memory deficits in AD (Bäckman et al., 2005) and dysexecutive syndrome in PD (Dirnberger and Jahanshahi, 2013).

The concomitant changes of perception, action, and cognition in normal aging make this population particularly relevant for studying embodied cognition theories. Indeed, these theories posit a causal relationship between sensory and motor functioning and cognition (Glenberg et al., 2013). As predicted, numerous studies have shown significant associations between altered senses/perception and cognition in aging (e.g., Lindenberger and Baltes, 1997; Madan and Singhal, 2012), but only a few have explored such a relation between motor/action and cognition (e.g., Lindenberger and Baltes, 1994; Verlinden et al., 2014). We will present the main interpretations of these relations and we will then focus on the possible implications within an embodied perspective. The cognitive impairment in neurocognitive disorders is much more severe than that reported in healthy aging. In contrast, these populations differ much less in terms of their sensorimotor functioning. Therefore, the issue here is more about the ability of embodied cognition theories to explain cognitive impairment in neurocognitive disorders than about the link between sensorimotor functioning and cognition as in normal aging. For the sake of clarity, we will only present data in PD and AD as they represent the most frequent dementias with motor and sensory impairment, respectively.

### 3. Action, Motor Components and Cognition in Aging

The relationship between motor/action and cognition in aging is mainly explained by physical health or by a common underlying cause, namely dopamine depletion. Better cardiovascular health directly improves cognition, as brain function relies on healthy vascular function (Debette et al., 2011). It is well known that motor activity, such as that involved in physical activity, improves cognition at all ages (for a review, see Hillman et al., 2008), an effect which has also been found in mild cognitive impairment (MCI) and dementia (see Scherder et al., 2007, for review). The beneficial effects of sports on cognition are not limited to long-term practice of sports and are observed following new physical intervention programs in healthy aging (e.g., Colcombe and Kramer, 2003) as well as in MCI and AD (Heyn et al., 2004).

The impact on cognition may also just come from enhanced motor execution. Numerous neuropsychological tests involve motor execution, so that faster execution and improved dexterity would lead to higher scores in these tests (e.g., video-gaming in aging; van Muijden et al., 2012). Conversely, the slower performance reported in aging appears to originate in motor processing rather than in the sensory processing that occurs at earlier steps (Yordanova et al., 2004).

Another way to account for these relationships is to posit a common underlying cause, such as the dopamine depletion hypothesis in aging. The aging brain produces less dopamine than the younger brain, which results in both motor and cognitive deficits, such as deficits in attention and executive functions (Volkow et al., 1998; Bäckman et al., 2005). This hypothesis could easily be applied to PD. Indeed, the degeneration of subcortical structures in PD causes dopamine depletion, which in turn causes motor and executive deficits (Dirnberger and Jahanshahi, 2013).

These hypotheses are not mutually exclusive and, even if they are concordant with embodied cognition theories, they are not specific to them. More directly related to embodiment, motor execution and action are supposed to be necessarily simulated in cognitive processes that rely on or involve motor or action components. Therefore, better physical health and enhanced dexterity should also facilitate and even enhance action/motor simulation in cognitive processing, and vice versa. For instance, healthy elderly adults are less accurate than younger adults in judging perceived weight (Maguinness et al., 2013). This effect of aging can be partly explained in terms of motor simulation, in other words as a function of the real or perceived motor-efficacy of elderly adults (see Potter et al., 2009, for perceived efficacy). Loss of strength in aging, real or perceived, might lead to increased difficulty lifting objects, therefore leading to an overestimation of their weight.

As predicted by embodiment, several studies found that the body itself influences cognition (Casasanto, 2011; Osiurak et al., 2014). One study found that body posture influences the retrieval of autobiographical memories in both young and older adults (Dijkstra et al., 2007). This suggests that memories depend on the context of the body, as predicted by embodied cognition theories. Following the same logic, the action/motor impairment reported in PD should not only impact how patients physically interact with the environment, but should also change their cognitive representations and processing of the world (Poliakoff, 2013).

Yet, the most common findings in embodiment deal with language and how the comprehension of action-related sentences requires motor simulation (Zwaan and Yaxley, 2004). This embodiment of language in action has been reported in healthy elderly adults (Dijkstra et al., 2004), as well as in patients with AD (De Scalzi et al., 2014), since both groups rely on the necessary simulation of the motor components involved in language in order to understand it. It may also explain how motor impairments in PD are associated with altered comprehension of literal and symbolic action-related sentences (Fernandino et al., 2013).

Finally, it is worth noting that different studies have shown that motor simulation in word processing occurs in the same brain areas that are responsible for real planning and execution of actions (e.g., Hauk and Pulvermüller, 2004). The overlap argues in favor of common resources and perhaps of common units between motor, action, and perception (e.g., Hommel, 2004). The deterioration of motor and action representations in PD should result in the same degradation of the mental simulation of whole-body images (Conson et al., 2014) and of the ability to judge the weight to be lifted by another person (Poliakoff et al., 2010).

### 4. Perception, Sensory Components and Cognition in Aging

From the 1960s (Schaie et al., 1964) to the present day (Baldwin and Ash, 2011), a growing number of studies have reported an association between the alteration of the senses and cognition in healthy aging. These links are predicted by embodiment, but they are generally explained in terms of three main hypotheses. According to the cognitive load hypothesis (Baltes and Lindenberger, 1997), degraded perception leads to higher cognitive demand, which in turn causes cognitive alterations for more demanding tasks or when fatigue occurs. For the sensory deprivation hypothesis (Valentijn et al., 2005), the decrease of sensory input results in neuronal atrophy, which in turn causes cognitive decline. This hypothesis appears congruent with embodiment as it is compatible with the assumption of common units for perception and cognition. This idea of brain atrophy is also part of the common cause theory, which states that a third factor, such as a non-specific alteration of the central nervous system, is responsible for the sensory and cognitive decline (Lindenberger and Baltes, 1997). Finally, one could argue that, as for motor performance and cognition, degraded perception should result in more time to complete the task, and consequently lead to poorer scores on cognitive tasks (Gussekloo et al., 2005). Nonetheless, this last hypothesis appears unlikely. Indeed, young and middle-aged adults under age-simulation conditions of reduced visual acuity, auditory acuity, or both did not exhibit lower cognitive performance relative to control conditions (Baltes et al., 2001; see also Scialfa, 2002).

Beyond these mutual influences of sensory/perceptual functioning on cognition, embodiment states that cognition is grounded in these perceptual components (Pecher and Zwaan, 2005). According to some authors, common resources and perhaps common units underlie perception and memory (e.g., Versace et al., 2014) as well as perception and action. Degraded perception should thus constitute degraded units in cognitive processing, which in turn impair cognitive performance. As a consequence, not only language but also conceptual knowledge should be grounded in sensorimotor components (for a review, see Barsalou, 2008). Regarding sensory components, only two studies have explored this issue in aging (Vallet et al., 2011, 2013b) and found that healthy elderly adults exhibit grounded conceptual knowledge to the same extent as young adults.

The embodiment of conceptual knowledge could also be applied to AD. However, the disconnection syndrome that characterizes AD should impact the embodiment of conceptual knowledge in patients suffering from this disease. Some of the brain structures are disconnected from each other, mainly in the medial temporal region and posterior areas of the cortex (Delbeuck et al., 2003). Therefore, the necessary simulation of sensory components required to process a concept may not occur due to this neuroanatomical disconnection (Vallet et al., 2013a). In contrast to healthy young and elderly adults, AD patients did not exhibit perceptual cross-modal priming effects in an identification task. The lack of priming was surprising according to the classical approaches of cognition, since perceptual processing, as revealed by the priming effect, is known to remain preserved even in moderate stages of the disease. In other words, embodied cognition theories have shed new light on an ongoing debate.

Grounded knowledge also constitutes an interesting starting point to explore the changes in cognition due to aging with and without cognitive disorders. If knowledge remains grounded in its sensorimotor components, the degradation of these components should directly impact how knowledge is retrieved and used, i.e., memory performance. For instance, a recent study questioned the classical view of recognition impairment in healthy elderly adults by showing that they tend to perceive new items as old rather than old items as new (Yeung et al., 2013). This effect was found for high-interference items, i.e., items with a significant degree of feature overlap. According to the authors, it is the degradation of the perirhinal cortex, which is known to be involved in higher-level visual processing, that causes this effect. In other words, memory traces of healthy seniors may be slightly degraded or perhaps slightly less distinctive. The role of distinctive processing was put forward to explain the increase of false memories in aging (Butler et al., 2010).

Furthermore, this trace distinctiveness is thought to be at the core of the emergence of episodic memory when associated with integration according to some embodied cognition theories (see Brunel et al., 2013; Versace et al., 2014). Given that memory emergence relies on the simulation/enactment of the different components of the memory trace, episodic memory should then require the dynamic integration of these different simulations into one coherent representation (see Versace et al., 2014). Yet, it is now established that integration is impaired in AD in multisensory tasks (Festa et al., 2005), working memory tasks (Parra et al., 2009), and episodic memory tasks (Stoub et al., 2006), resulting from the disconnection syndrome. Therefore, episodic memory impairment in AD may result, at least partially, from the integration deficit reported in AD (see Delbeuck et al., 2007).

The distinctiveness and the role of (re)-integration in memory suggest that multisensory stimulation should enhance memory and possibly cognition. It was observed that programs based on Multi-Sensory Stimulation (MSS), also called snoezelen, increase the well-being of patients with dementia in care facilities. Even if some reviews did not find any specific effectiveness of these programs (Chung and Lai, 2002), recent data support beneficial effects on mood, behavior, communication (for review see Sánchez et al., 2013), and sometimes on cognition as assessed by the MMSE (Ozdemir and Akdemir, 2009; Maseda et al., 2014). This possible effect on cognition is predicted by embodiment, but not by cognitivism. A future direction to improve stimulation/remediation programs may then be to enhance the distinctiveness of memory traces, along with efforts to develop the integration of the different components of the traces.

### 5. Conclusions

Numerous studies have reported the mutual influence of motor/action and sensory/perception functions on the one hand and cognition on the other hand. These relations fit naturally within embodied cognition theories since embodiment assumes that cognition interacts with the body and the environment. Possibly due to the sensorimotor and cognitive decline occurring in aging, these associations appear even stronger in this population (Baltes and Lindenberger, 1997). Yet, embodiment has been only marginally explored and applied to aging.

Nonetheless, the few studies published to date seem to have successfully validated embodied assumptions in normal aging and in neurocognitive disorders. The main result is that healthy elderly adults exhibit conceptual knowledge and language grounded in perceptual features (Vallet et al., 2013b) and motor features (Dijkstra et al., 2004). This seems to highlight one of the key differences between embodiment and cognitivism on the whole spectrum of normal cognition, ranging from children (Wellsby and Pexman, 2014) to elderly adults (Vallet et al., 2013b). Viewing the particular profile of healthy older adults from an embodied perspective also helps to formulate new hypotheses about their performance. As briefly mentioned above, the effect of age on the perirhinal cortex may cause greater difficulty in distinguishing one memory trace from another (e.g., Butler et al., 2010; Yeung et al., 2013).

The same principles may be applied to neurocognitive disorders, with validation coming from grounded language in PD (Cardona et al., 2014) and AD (De Scalzi et al., 2014). The respective impairments of motor function in PD and of cognition in AD also represent new avenues of research in embodiment (e.g., Poliakoff et al., 2010; Vallet et al., 2013a). One of the most exciting perspectives, however, remains the possible development of new interventions to help patients, based on assumptions radically different from those used within a cognitivist perspective. One can imagine focusing on the dynamic nature of memory to improve episodic memory, or on the whole-body functioning of an individual to improve cognition. For instance, a recent study claims to reverse cognitive decline in dementia following a drastic change of lifestyle without any cognitive stimulation program or dementia-related drugs (Bredesen, 2014). This program is based on a healthy diet, good sleep habits, meditation, and exercise, combined with medication used to maximize physiological changes (hormone balance, mitochondrial function, etc.).

Despite the lack of research in these areas, normal aging and neurocognitive disorders appear to be particularly interesting and useful targets for exploring and applying embodiment. We hope that this brief presentation will encourage cognitivists and neuropsychologists to further explore and consider these new avenues of research.

### Funding

Postdoctoral grant from the Fonds de Recherche du Québec – Santé (FRQS).

### Acknowledgments

GV is supported by a postdoctoral grant from the Fonds Québécois de la Recherche en Santé (FRQS). The author wishes to thank Benoit Riou for his help in conceptualizing this perspective and Sven Joubert for his valuable help and meticulous proofreading.

### References


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Vallet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.