# **MULTISENSORY INTEGRATION IN ACTION CONTROL**

**Topic Editors Christine Sutter, Knut Drewing and Jochen Müsseler**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

**ISSN** 1664-8714 **ISBN** 978-2-88919-312-7 **DOI** 10.3389/978-2-88919-312-7

## **MULTISENSORY INTEGRATION IN ACTION CONTROL**

Topic Editors:

**Christine Sutter,** RWTH Aachen University, Germany **Knut Drewing,** Giessen University, Germany **Jochen Müsseler,** RWTH Aachen University, Germany

Image by Jochen Müsseler

The integration of multisensory information is an essential mechanism in perception and in controlling actions. Research in multisensory integration is concerned with how the information from the different sensory modalities, such as the senses of vision, hearing, smell, taste, touch, and proprioception, is integrated into a coherent representation of objects. Multisensory integration is central for action control. For instance, when you grasp for a rubber duck, you can see its size and hear the sound it produces. Moreover, identical physical properties of an object can be provided by different senses. You can both see and feel the size of the rubber duck. Even when you grasp for the rubber duck with a tool (e.g. with tongs), the information from the hand, from the effect points of the tool and from the eyes is integrated in a manner that allows successful action.

Over the recent decade a surge of interest in multisensory integration and action control has been witnessed, especially in connection with the idea that multiple sensory sources are integrated in an optimized way. For this perspective to mature, it will be helpful to delve deeper into the information processing mechanisms and their neural correlates, asking about the range and constraints of these mechanisms, about their localization and involved networks.

**EDITORIAL** published: 10 June 2014 doi: 10.3389/fpsyg.2014.00544

## Multisensory integration in action control

#### *Christine Sutter <sup>1</sup> \*, Knut Drewing <sup>2</sup> and Jochen Müsseler <sup>1</sup>*

*<sup>1</sup> Department of Work and Cognitive Psychology, RWTH Aachen University, Aachen, Germany*

*<sup>2</sup> Department for Experimental Psychology, Institute for Psychology, Justus-Liebig University, Giessen, Germany*

*\*Correspondence: christine.sutter@psych.rwth-aachen.de*

#### *Edited and reviewed by:*

*Bernhard Hommel, Leiden University, Netherlands*

**Keywords: human information processing, perception, tool use, recalibration, reference frame, vision, haptic, acoustics**

The integration of multisensory information is an essential mechanism in perception and action control. Research in multisensory integration is concerned with how the information from the different sensory modalities, such as the senses of vision, hearing, smell, taste, touch, and proprioception, is integrated into a coherent representation of objects (for an overview, see e.g., Calvert et al., 2004). The combination of information from the different senses is central for action control. For instance, when you grasp for a rubber duck, you can see its size, feel its compliance and hear the sound it produces. Moreover, identical physical properties of an object can be provided by different senses. You can both see and feel the size of the rubber duck. Even when you grasp for the rubber duck with a tool (e.g., with tongs), the information from the proximal hand, from the effective part of the distal tool and from the eyes is integrated in a manner that allows successful action (for limitations of this integration see Sutter et al., 2013).

Over the recent decade a surge of interest in multisensory integration and action control has been witnessed, especially in connection with the idea of a statistically optimized integration of multiple sensory sources. The human information processing system is assumed to adjust moment-by-moment the relative contribution of each sense's estimate to a multisensory task. Each sense's contribution is weighted inversely to its variance, so that the total variance of the multisensory estimate is lower than that for each sense alone. Accordingly, the validity of a statistically optimized multisensory integration has been demonstrated by extensive empirical research (e.g., Ernst and Banks, 2002; Alais and Burr, 2004; Reuschel et al., 2010), also in applied settings such as tool use (e.g., Takahashi et al., 2009; in the present research topic: Takahashi and Watt, 2014).

For this perspective to mature, it will be helpful to delve deeper into the multisensory information processing mechanisms and their neural correlates, asking about the range and constraints of these mechanisms, and about their localization and the networks involved. The contributions to the present research topic range from how information from different senses and action control are linked and modulated by object affordances (Garrido-Vásquez and Schubö, 2014), by task-irrelevant information (Juravle et al., 2013; Wendker et al., 2014; for a review see Wesslein et al., 2014), by temporal and spatial coupling within and between senses (Cameron et al., 2014; Mueller and Fiehler, 2014; Rieger et al., 2014; Sugano et al., 2014) to childhood development of multisensory mechanisms (Jovanovic and Drewing, 2014).

Correspondences between the information from different senses play an important role for multisensory integration. Integration does not take place, for instance, when vision and touch are spatially separated (e.g., Gepshtein et al., 2005). However, cognitive approaches to action-effect control assume that information from different senses is still coded and represented within the same cognitive domain when the information concerns the same action (e.g., Müsseler, 1999; Hommel et al., 2001). The present research topic also addresses the corresponding issue of modality-specific action control (Boutin et al., 2013; Grunwald et al., 2014).

Overall, the present research topic broadens our view on how multisensory mechanisms add to action control. We thank all authors and all reviewers for their valuable contributions.

## **ACKNOWLEDGMENT**

This research was supported by a grant from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) to Christine Sutter and Jochen Müsseler (DFG MU 1298/10).

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 May 2014; accepted: 16 May 2014; published online: 10 June 2014. Citation: Sutter C, Drewing K and Müsseler J (2014) Multisensory integration in action control. Front. Psychol. 5:544. doi: 10.3389/fpsyg.2014.00544*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Sutter, Drewing and Müsseler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Visual-haptic integration with pliers and tongs: signal "weights" take account of changes in haptic sensitivity caused by different tools

#### *Chie Takahashi <sup>1,2</sup> and Simon J. Watt <sup>1</sup> \**

*<sup>1</sup> Wolfson Centre for Cognitive Neuroscience, School of Psychology, Bangor University, Bangor, UK*

*<sup>2</sup> Behavioural Brain Science Centre, School of Psychology, University of Birmingham, Birmingham, UK*

#### *Edited by:*

*Christine Sutter, RWTH Aachen University, Germany*

#### *Reviewed by:*

*Alessandro Farne, INSERM, France; Anna Schubö, Ludwig Maximilians University Munich, Germany*

#### *\*Correspondence:*

*Simon J. Watt, School of Psychology, Bangor University, Penrallt Rd., Bangor, Gwynedd, LL57 2AS, UK; e-mail: s.watt@bangor.ac.uk*

When we hold an object while looking at it, estimates from visual and haptic cues to size are combined in a statistically optimal fashion, whereby the "weight" given to each signal reflects their relative reliabilities. This allows object properties to be estimated more precisely than would otherwise be possible. Tools such as pliers and tongs systematically perturb the mapping between object size and the hand opening. This could complicate visual-haptic integration because it may alter the reliability of the haptic signal, thereby disrupting the determination of appropriate signal weights. To investigate this we first measured the reliability of haptic size estimates made with virtual pliers-like tools (created using a stereoscopic display and force-feedback robots) with different "gains" between hand opening and object size. Haptic reliability in tool use was straightforwardly determined by a combination of sensitivity to changes in hand opening and the effects of tool geometry. The precise pattern of sensitivity to hand opening, which violated Weber's law, meant that haptic reliability changed with tool gain. We then examined whether the visuo-motor system accounts for these reliability changes. We measured the weight given to visual and haptic stimuli when both were available, again with different tool gains, by measuring the perceived size of stimuli in which visual and haptic sizes were varied independently. The weight given to each sensory cue changed with tool gain in a manner that closely resembled the predictions of optimal sensory integration. The results are consistent with the idea that different tool geometries are modeled by the brain, allowing it to calculate not only the distal properties of objects felt with tools, but also the certainty with which those properties are known. These findings highlight the flexibility of human sensory integration and tool-use, and potentially provide an approach for optimizing the design of visual-haptic devices.

**Keywords: tool use, multisensory integration, vision, haptic perception, cue weights**

## **INTRODUCTION**

When humans manipulate objects with their hands while looking at them, visual and haptic size information is integrated in a manner that is highly consistent with statistically optimal models of sensory integration (Ernst and Banks, 2002; Gepshtein and Banks, 2003; Helbig and Ernst, 2007a). Such models describe how, under the assumptions that estimates from each sense are on average unbiased, and their noises are independent and Gaussian distributed, the minimum-variance unbiased estimate ($\hat{S}_{VH}$; Equation 1) is a weighted sum of visual and haptic estimates ($\hat{S}_V$, $\hat{S}_H$) where the weight given to each signal ($w_V$, $w_H$) is proportional to the inverse of its variance (Equation 2; for a review see Oruç et al., 2003).

$$
\hat{S}_{VH} = w_V \hat{S}_V + w_H \hat{S}_H \tag{1}
$$

$$
w_V = \frac{1/\sigma_V^2}{1/\sigma_V^2 + 1/\sigma_H^2} \quad \text{where} \quad w_V + w_H = 1 \tag{2}
$$
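As a minimal numerical sketch of Equations 1 and 2, the weights and the combined estimate could be computed as follows. The function name `integrate_cues` and the example values are illustrative assumptions, not taken from the study. The combined standard deviation is the standard result for independent Gaussian noise and is included only to show that it falls below both single-cue values.

```python
def integrate_cues(s_v, sigma_v, s_h, sigma_h):
    """Minimum-variance combination of a visual and a haptic size estimate.

    s_v, s_h: single-cue size estimates (mm); sigma_v, sigma_h: their SDs (mm).
    Assumes unbiased estimates with independent Gaussian noise.
    """
    r_v = 1.0 / sigma_v ** 2          # visual reliability (inverse variance)
    r_h = 1.0 / sigma_h ** 2          # haptic reliability
    w_v = r_v / (r_v + r_h)           # Equation 2
    w_h = 1.0 - w_v                   # weights sum to 1
    s_vh = w_v * s_v + w_h * s_h      # Equation 1
    sigma_vh = (1.0 / (r_v + r_h)) ** 0.5  # SD of the combined estimate
    return s_vh, w_v, w_h, sigma_vh


# Hypothetical example: vision is twice as precise as haptics.
s_vh, w_v, w_h, sigma_vh = integrate_cues(52.0, 2.0, 48.0, 4.0)
print(s_vh, w_v, w_h, sigma_vh)
```

With these made-up inputs the visual cue receives four times the haptic weight, because its variance is four times smaller.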

The empirical findings that humans perform similarly to this model demonstrate that the brain 'knows' how much to rely on each sensory signal in a given situation. This is not trivial because relative weights of visual and haptic estimates must be adjusted moment-by-moment since they vary continuously as a function of the precise properties of particular viewing situations. For example, the reliabilities of visual and haptic size estimates almost certainly vary differently as a function of object size. And more challengingly, the reliability of visual estimates varies substantially with variations in any number of "geometrical" properties of the stimulus including the type of surface texture, the object's orientation with respect to the viewer, and viewing distance (Knill, 1998a,b; Gepshtein and Banks, 2003; Knill and Saunders, 2003; Hillis et al., 2004; Keefe et al., 2011).

Given the adeptness with which humans use tools, one might expect similar visual-haptic integration processes to operate when we manipulate objects with tools. This process is complicated, however, by the fact that in tool use haptic signals must be acquired via the handles of the tool, thereby systematically disrupting the relationship between hand opening/position and (visual) object properties. We have previously shown that, in making the decision of whether to integrate signals or not, the brain compensates for the spatial offset between visual and haptic signals introduced by simple tools (Takahashi et al., 2009). When we feel objects without a tool the degree of visual-haptic integration decreases with increasing spatial separation between signals, indicating the brain is sensitive to the probability that they refer to different objects, in which case combining them would produce errors (Gepshtein et al., 2005). We observed similar patterns of changes in visual-haptic integration in tool use, except the decision to integrate was modulated not by the separation between the hand (the origin of the haptic signal) and visual object, but by the separation between the tool tips and the object, as if the haptic signal was treated as coming from the tool-tip (Takahashi et al., 2009). This suggests the brain can correctly decide the extent to which visual and haptic information should be integrated, not based on the proximal sensory stimuli, but on their distal causes (Körding and Tenenbaum, 2006; Körding et al., 2007), taking into account the dynamics and geometry of tools.

Here we consider the problem of weighting visual and haptic "cues" (sensory estimates of size) appropriately when manipulating objects with tools. As well as spatially separating the signals, tools typically also alter the "gain" between the hand opening and object size (consider pliers and tongs, for example). In principle, this could make determining correct cue weights difficult: different tool geometries cause differences in the extent to which the haptic signal at the hand is multiplied up or down relative to object size, and the absolute sensitivity, or precision, of sensory systems generally varies with signal magnitude. Thus, different tool gains could introduce variations in the precision (reliability) of haptic size estimates that would ideally be accounted for. Here we determined the nature of the variations in the reliability of haptic size estimates with different tool geometries, and examined whether visual and haptic signals are weighted appropriately to take account of them.

There are various possibilities for how variations in tool geometry might affect the reliability of haptic size estimates, with rather different implications for what appropriate visual-haptic cue weights would be. We find it more straightforward to discuss the possible effects of different tool geometries in terms of sensitivity (Just Noticeable Differences, or JNDs, of haptic size) rather than reliabilities, because experimental data and theoretical ideas such as Weber's law are typically expressed in these units. Following previous researchers (for example see Clark and Yuille, 1990; Landy et al., 1995; Knill and Richards, 1996; Ernst, 2005), we assume, however, that cue reliability relates straightforwardly to single-cue sensitivity (JND). Consider haptic size-discrimination data, measured using a standard two-interval, forced-choice (2-IFC) task, in which the participant grasps two stimuli (standard and comparison) between thumb and index finger, and reports which was larger. If the resulting data are fitted with a cumulative Gaussian psychometric function, the JND can be expressed as the standard deviation of the psychometric function which, when divided by $\sqrt{2}$, is assumed to yield the standard deviation of the underlying estimate of haptic size ($\sigma_H$). The reliability of the underlying estimate is the reciprocal of its variance ($1/\sigma_H^2$).
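The conversion from a 2-IFC JND to a reliability described above can be written out in a few lines. This is a sketch only: the helper names and the 3 mm example JND are assumptions for illustration, and the JND is taken to be the standard deviation of the fitted cumulative Gaussian, as in the text.

```python
import math


def jnd_to_sigma(jnd_mm):
    """SD of the underlying size estimate, from a 2-IFC JND.

    The JND is the SD of the fitted cumulative Gaussian; dividing by sqrt(2)
    accounts for the two noisy intervals being compared on each trial.
    """
    return jnd_mm / math.sqrt(2)


def reliability(sigma_mm):
    """Reliability is the reciprocal of the variance."""
    return 1.0 / sigma_mm ** 2


example_jnd = 3.0                        # hypothetical JND in mm
sigma_h = jnd_to_sigma(example_jnd)      # about 2.12 mm
r_h = reliability(sigma_h)               # 2 / 9 per mm^2
print(sigma_h, r_h)
```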

The possible effects of variations in the gain of pliers-like tools on haptic size sensitivity could depend on either "high-level" aspects, such as how object size is ultimately represented in the brain, or "low-level" aspects, such as how the sensitivity of the basic sensory apparatus varies with hand opening. Consider first the case where the limiting factor is the precision with which different object sizes are represented in high-level processing. This could arise because the neural population that represents size contains more neurons tuned to smaller sizes and fewer tuned to larger sizes, for example, in which case absolute sensitivity to object size would decrease with increasing size. For haptic estimates of object size derived from tools to be correct we must assume that, consistent with our previous findings regarding spatial offsets (Takahashi et al., 2009), the brain is able to correctly rescale haptic signals about hand opening so that object size estimates are encoded accurately in high-level processing, independent of the tool gain. Then, if there are no significant low-level (sensory) limits, haptic sensitivity to a given object size will be determined only by the high-level constraints, and will be unaffected by the tool (i.e., the hand opening) used to hold it. Thus, in this case there would be no need to adjust cue weights to take account of tool geometry.

Although high-level limits on sensitivity must presumably exist to some degree, it is hard to envisage a system that is unaffected by altering the input signal (at the hand), and so we consider low-level factors to be more likely to limit sensitivity. Their implications are also more difficult to visualise. We therefore consider the implications of this second case in more detail. Here, we assume that underlying sensitivity to changes in hand opening is unchanged by tool use, and so the impact of different tools on haptic sensitivity to object size depends directly on (*i*) how sensitivity to hand opening varies with magnitude of hand opening, and (*ii*) the relationship between object size and the hand opening required to feel it with a given tool (the tool "gain"). In many sensory domains, the relationship between JND and stimulus magnitude is described well by Weber's law, which here implies that JNDs in hand opening should be a constant proportion of hand opening. Empirical measurements of so-called finger-span discrimination indicate, however, that while JNDs do generally increase with hand opening, they also depart significantly from Weber's law (Stevens and Stone, 1959; Durlach et al., 1989). Indeed, it can be argued that this result is unsurprising, given that judging size from hand opening requires the comparison of the positions of two "systems" (finger and thumb), each of which contains highly non-linear relationships between position and the state of muscles and joint angles (Durlach et al., 1989; Tan et al., 2007). We note, however, that, presumably due to technical challenges in presenting haptic stimuli in quick succession, previous measurements of finger-span discrimination did not use a two-interval, forced-choice task to measure sensitivity. Durlach et al. (1989), for example, used a one-interval forced-choice task (is the stimulus length $l$ or $l + \Delta l$?). The data may therefore reflect not only perceptual sensitivity but also the precision of memory representations of size. Thus, it remains unclear whether haptic size sensitivity follows Weber's law or not.

**Figure 1** considers the implications of these two alternatives (Weber's law, and non-linear sensitivity functions) for weighting visual and haptic signals appropriately in tool use. The top row of panels shows the Weber's law case. **Figure 1A** shows a hypothetical sensitivity function for hand opening (i.e., haptic size, for an object felt directly with the hand), assuming a Weber fraction of 0.1. A similar function is also plotted for visual size, assuming a slightly different Weber fraction (0.15). **Figure 1B** shows sensitivity to *object size* when felt with pliers-like tools of three different gains (expressed as the ratio of tool-tip separation to hand opening; **Figure 2**). To calculate these values we assumed that the underlying sensitivity function for hand opening was constant, and that using the tool introduced no additional external (or internal) noise. We calculated the hand opening that would result from feeling a given object size with a particular tool (for example, feeling a 20 mm object with a 0.7:1 tool results in a hand opening of 20/0.7 = 28.6 mm). Next, we used the function in **Figure 1A** to "look up" the appropriate JND in *hand opening*. Finally, we transformed this "hand JND" into a JND in units of object size, by calculating the change in object size that, given the tool geometry, would produce 1 JND change at the hand. Obviously, given our assumptions, the sensitivity to changes in object size when using the 1:1 tool is the same as with no tool (**Figure 1A**). It can also be seen, however, that if haptic size sensitivity follows Weber's law, sensitivity to changes in object size is unaffected by tool gain. This makes intuitive sense because while the 0.7:1 tool, for example, magnifies the signal at the hand, the absolute sensitivity decreases by exactly the same amount, and so there is no net change in sensitivity to object size. 
**Figure 1C** plots the optimal cue weights for estimating object size from vision and haptics, for each of the three tool gains in **Figure 1B**, calculated using Equation 2. It can be seen that because both size estimates follow Weber's law, and tool gain does not affect sensitivity (or therefore reliability) of object size estimates, the *relative* reliabilities are unchanged with object size and tool gain, and so the appropriate signal weights remain constant. This is an interesting outcome in that it would simplify the brain's task, because there is no need to adjust visual and haptic weights for different tool gains. It also implies, however, that there is no opportunity to optimise haptic sensitivity in visual-haptic interfaces by using tool gain to improve haptic sensitivity.
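The look-up procedure described above for the Weber's law case can be sketched numerically. The Weber fractions (0.1 haptic, 0.15 visual), the 20 mm object and the three gains come from the text; the function names are assumptions for illustration, and object size is assumed to equal gain times hand opening. Because both JNDs convert to standard deviations by the same factor of $\sqrt{2}$, the weights can be computed from the JNDs directly.

```python
def weber_hand_jnd(hand_opening_mm, weber_fraction=0.1):
    """JND in hand opening under Weber's law (constant proportion)."""
    return weber_fraction * hand_opening_mm


def object_jnd(object_size_mm, gain):
    """JND in object-size units when the object is felt through a tool.

    gain = tool-tip separation / hand opening (assumed constant for a tool).
    """
    hand_opening = object_size_mm / gain       # e.g. 20 / 0.7 = 28.6 mm
    hand_jnd = weber_hand_jnd(hand_opening)    # look up the JND at the hand
    return hand_jnd * gain                     # convert back to object units


def visual_weight(jnd_v, jnd_h):
    """Equation 2; the sqrt(2) factors relating JND to SD cancel."""
    r_v = 1.0 / jnd_v ** 2
    r_h = 1.0 / jnd_h ** 2
    return r_v / (r_v + r_h)


object_size = 20.0
visual_jnd = 0.15 * object_size                # visual Weber fraction of 0.15
haptic_jnds = {g: object_jnd(object_size, g) for g in (0.7, 1.0, 1.4)}
visual_weights = {g: visual_weight(visual_jnd, j) for g, j in haptic_jnds.items()}
print(haptic_jnds)      # the same haptic JND (about 2 mm) for every gain
print(visual_weights)   # the same visual weight for every gain
```

The gain cancels algebraically: the hand JND is proportional to size divided by gain, and multiplying by the gain restores a fixed proportion of object size.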

The bottom row of panels in **Figure 1** plots the same functions as the row above, but calculated assuming haptic size sensitivity at the hand is non-linearly related to hand opening (**Figure 1D**). The pattern is quite different to the Weber's law case. **Figure 1E** shows that haptic sensitivity to object size now depends directly on the tool gain and, for a given object size, can be made better or worse by choosing particular tools. Thus, if (*i*) sensitivity to changes in hand opening does not follow Weber's law, and (*ii*) low-level sensory limits directly determine haptic sensitivity to object size when using tools, the optimal visual and haptic weights for the same object in the world will change as a function of tool gain (**Figure 1F**), and so the brain should adjust them accordingly.

**FIGURE 1** (caption fragment; opening text not recovered) [...] object size, with different tool gains (0.7:1, 1:1, and 1.4:1; see main text for details), and a hypothetical visual sensitivity function; **(C)** the optimal signal weights that result from the sensitivities in panel **(B)**, calculated using Equation 2. Panels **(D–F)** show the same calculations assuming a non-linear hand-opening sensitivity function.
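To illustrate the non-Weber case, the sketch below repeats the tool-gain calculation with a hand-opening JND that grows with the square root of hand opening. This function is a made-up stand-in chosen only because it is not proportional to hand opening. It is not the function plotted in **Figure 1D**, and the resulting numbers carry no empirical meaning.

```python
import math


def hypothetical_hand_jnd(hand_opening_mm):
    """ASSUMED non-Weber sensitivity function (square-root growth)."""
    return 0.5 * math.sqrt(hand_opening_mm)


def object_jnd(object_size_mm, gain):
    """JND in object-size units; gain = tool-tip separation / hand opening."""
    hand_opening = object_size_mm / gain
    return hypothetical_hand_jnd(hand_opening) * gain


def visual_weight(jnd_v, jnd_h):
    """Equation 2, computed from JNDs (the sqrt(2) factors cancel)."""
    r_v = 1.0 / jnd_v ** 2
    r_h = 1.0 / jnd_h ** 2
    return r_v / (r_v + r_h)


object_size = 20.0
visual_jnd = 0.15 * object_size   # visual Weber fraction of 0.15, as in the text
weights_by_gain = {
    g: visual_weight(visual_jnd, object_jnd(object_size, g))
    for g in (0.7, 1.0, 1.4)
}
print(weights_by_gain)
```

Under this assumed function the haptic JND in object units shrinks as gain decreases, so the optimal visual weight for the same 20 mm object rises with tool gain instead of staying constant.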

In Experiment 1 we examined how sensitivity to haptic size varies with hand opening in our experimental setup, using a two-interval, forced-choice procedure. We then measured the effect of different tool gain ratios on haptic sensitivity, using virtual tools created using a stereoscopic display and force-feedback robots. This allowed us first to establish whether or not sensitivity to object size is determined primarily by low-level sensory factors (i.e., in the manner modeled in **Figure 1**). Second, we could measure the shape of the haptic-size sensitivity functions with and without tools, allowing us to understand the expected effects on signal weights of different tool gains. Experiment 1 demonstrated that the reliability of haptic size estimates does vary with tool gain. We therefore examined in Experiment 2 whether the brain takes account of these changes, and adjusts signal weights based on the reliability changes induced by different tool gains. We measured weights given to the different signals at different object sizes, and with different tool gains, by measuring the perceived size of stimuli in which visual and haptic size was varied independently (so-called cue-conflict stimuli). We explore the implications of the results both for sensory integration mechanisms in visuo-motor behavior, and for the design of visual-haptic interfaces.

## **EXPERIMENT 1: MEASUREMENT OF HAPTIC SIZE SENSITIVITY**

## **MATERIALS AND METHODS**

#### *Participants*

Six right-handed participants took part in all conditions (3 males and 3 females; 19–36 years old). All participants had normal or corrected-to-normal vision, including normal stereoacuity, and none had any known motor deficits. The participants were naïve to the purpose of the experiment. The study was approved by the School of Psychology Ethics Review Committee, Bangor University, and all participants gave informed consent before taking part.

#### *Apparatus*

Participants viewed 3-D stereoscopic visual stimuli in a conventional "Wheatstone" mirror stereoscope, consisting of separate TFT monitors (refresh rate 60 Hz) for each eye, centred on the body midline. Haptic stimuli were generated using two PHANToM 3.0 force-feedback robots (SensAble Technologies, Inc.), one each for the index finger and thumb of the right hand. The robots allow participants' index fingers and thumbs to move in all six degrees of freedom (DoF), but sense and exert forces on the tips in translation (three DoF) only. The 3-D positions of the tips of the finger and thumb were continuously monitored by the robots (at 1000 Hz) and touching a virtual object resulted in appropriate reaction forces, simulating the presence of haptic surfaces in space. Participants could not see their hand, which was occluded by the stereoscope mirrors. The setup was calibrated so the visual and haptic "workspaces" were coincident. Head position was stabilized using a chin-and-forehead rest. Participants' heads were angled down approximately 33° from straight ahead (thus, the fronto-parallel plane was angled back approximately 33° from earth-vertical).

#### *Stimuli*

The stimuli were positioned on a (head-centric) fronto-parallel plane, at a distance of 500 mm from the eyes. The haptic stimulus consisted of two parallel planes (stiffness = 1.05 N/mm), whose surfaces were oriented at 90° to the fronto-parallel plane. Their separation (height in the fronto-parallel plane) was varied to change object "size."

In the no-tool condition, at the start of each trial, participants saw two spheres indicating the positions of the finger and thumb. In the tool conditions, participants saw a virtual pliers-like tool attached to the finger and thumb markers (**Figure 2**). Because of the 3-DoF limit on the robots' position sense and force production, and because we wanted to be sure the "opposition space" between finger and thumb was oriented orthogonally to the haptic surfaces, the visual tool was constrained to move in the fronto-parallel plane. We also presented a background fronto-parallel force plane (present in both no-tool and tool-use conditions), making it easier for participants to keep their fingers/tool in the correct orientations (a trial would not commence if the finger/thumb positions were not oriented in the fronto-parallel plane). Otherwise, the tool moved freely in this *x, y* plane, following the hand in real time, and opening and closing by rotating about the pivot (see **Figure 2**). Thus, the motion was akin to sliding the hand/tool along a surface such as a table, and felt intuitive and easy to carry out. We conducted pilot experiments (without a tool) to verify that the presence of the force plane did not affect size discrimination performance.

There were three differently colored tools, representing object-size:hand-opening ratios of 0.7:1 (green), 1:1 (blue), and 1.4:1 (red) (**Figure 2**). Tool gain was varied by moving the position of the pivot. All tools were 18 cm long, measured from the finger position to the corresponding tool tip. Different colors were used as an aid to learning/recalling the tool geometry. When a tool-tip touched the virtual object, the appropriate force was generated at the hand.

### *Procedure*

Size discrimination performance was measured in each condition using a two-interval, forced choice (2-IFC) procedure. On each stimulus interval, two visually presented "start zones" appeared (yellow spheres indicating the lateral position of the haptic stimulus, but not its size). Participants moved their hand to insert the finger/thumb spheres (no-tool condition), or the tool tips (tool condition) into the start zones, which then changed color to green, indicating that the participant should grasp the object. All visual information (including the finger/thumb spheres and visible tool) was extinguished on moving inward from the start zones, so only haptic information was available to judge object size. On each trial, participants completed two such intervals, then pushed a visual-haptic virtual button to indicate which interval contained the bigger object. Thresholds were obtained for six "base" hand openings (30, 40, 50, 60, 70, and 80 mm). Object sizes were therefore the same as the hand openings for the 1:1 tool-gain condition, and corresponded to object sizes of 21, 28, 35, 42, 49, and 56 mm in the 0.7:1 tool-gain condition, and 42, 56, 70, 84, 98, and 112 mm in the 1.4:1 tool-gain condition. On each trial the standard size was presented in one interval, and a comparison stimulus in the other, randomly ordered. A method of constant stimuli was used to generate comparison sizes, which on each trial were chosen at random from the set: base hand opening ± 1, 3, 6, or 9 mm. Base hand opening was selected at random on each trial. Participants completed 30 repetitions of each stimulus level, at all hand openings, and did not receive feedback about their performance. We did not measure performance at hand openings smaller than 30 mm because the smallest comparison stimulus (21 mm) was close to the minimum separation of the end effectors of the force-feedback robots.
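The object sizes listed above follow directly from multiplying each base hand opening by the tool gain. As a minimal check of that arithmetic (the function and variable names are ours, for illustration only):

```python
# Object size felt through a tool = tool gain x hand opening.
# Names are illustrative; the values are those given in the text.
BASE_HAND_OPENINGS_MM = [30, 40, 50, 60, 70, 80]
TOOL_GAINS = {"0.7:1": 0.7, "1:1": 1.0, "1.4:1": 1.4}


def object_sizes(gain, hand_openings=BASE_HAND_OPENINGS_MM):
    """Return object sizes (mm) for each base hand opening at a given gain."""
    return [round(gain * h, 1) for h in hand_openings]


sizes_by_tool = {name: object_sizes(g) for name, g in TOOL_GAINS.items()}
for name, sizes in sizes_by_tool.items():
    print(name, sizes)
```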

To control the timing of the haptic presentation across conditions, participants were trained to grasp the stimulus for approximately 1 s in each interval and then release it. Trials on which contact time was outside the window 900–1200 ms generated an error signal, and were discarded. We added a small random jitter to the vertical position of the haptic object on each trial, so the task had to be carried out by judging object size (plane separation) rather than on the position in space of a single surface.

Trials were blocked by (*i*) no-tool, and (*ii*) tool conditions (counterbalanced order). We intermingled the three different tool-gain conditions (chosen randomly on each trial) to prevent adaptation of the relationship between felt hand opening and visual size. Participants carried out a block of practice trials in both tool and no-tool conditions to familiarise themselves with the task, and the different tools.

## **RESULTS**

For each observer, in each condition, the size discrimination data were fitted with a cumulative Gaussian, using a maximum-likelihood criterion. Following previous work, JND was defined as the standard deviation (σ) of the best-fitting psychometric function (e.g., Ernst and Banks, 2002; Knill and Saunders, 2003; Hillis et al., 2004).
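To illustrate this kind of fit, the sketch below estimates the mean and standard deviation of a cumulative Gaussian by maximizing a binomial likelihood over a coarse parameter grid. It is not the fitting code used in the study: the grid search, the parameter ranges, and the response counts are all assumptions chosen for illustration.

```python
import math


def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian: probability of judging the comparison as bigger."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def neg_log_likelihood(mu, sigma, levels, n_bigger, n_trials):
    """Binomial negative log-likelihood of the response counts."""
    nll = 0.0
    for x, k, n in zip(levels, n_bigger, n_trials):
        p = min(max(cum_gauss(x, mu, sigma), 1e-9), 1.0 - 1e-9)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_psychometric(levels, n_bigger, n_trials):
    """Coarse grid search (an assumed, simple optimizer) for the ML (mu, sigma)."""
    best = None
    for mu_i in range(-50, 51):        # mu from -5.0 to 5.0 mm in 0.1 mm steps
        for s_i in range(5, 201):      # sigma from 0.5 to 20.0 mm in 0.1 mm steps
            mu, sigma = mu_i / 10.0, s_i / 10.0
            nll = neg_log_likelihood(mu, sigma, levels, n_bigger, n_trials)
            if best is None or nll < best[0]:
                best = (nll, mu, sigma)
    return best[1], best[2]


# Comparison minus standard (mm), as in the method of constant stimuli above.
levels = [-9, -6, -3, -1, 1, 3, 6, 9]
n_trials = [30] * len(levels)
n_bigger = [2, 5, 10, 13, 17, 20, 25, 28]   # hypothetical counts, not real data

mu_hat, jnd = fit_psychometric(levels, n_bigger, n_trials)
print("PSE offset (mm):", mu_hat, "JND = sigma (mm):", jnd)
```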

#### **HAPTIC SENSITIVITY WITHOUT A TOOL**

**Figure 3** shows size-discrimination performance (JNDs) as a function of object size, in the no-tool condition, averaged across the six participants. Clearly the average sensitivity function is non-linear, and this was reflected in the individual sensitivity functions (see Supplementary material). All participants showed increasing JNDs at large object sizes, and two out of the six showed clear non-monotonic trends, with JNDs also increasing at small sizes. For a further three, JNDs appeared to have reached their minima at around 30 mm hand opening. Thus, haptic size judgements in our experiment departed substantially from Weber's law (Stevens and Stone, 1959; Durlach et al., 1989). We were unable to measure thresholds at hand openings smaller than 30 mm and so we cannot determine if all participants would have shown such increases at smaller sizes. The resting positions of the thumb and finger in natural movements correspond to non-zero hand openings (Ingram et al., 2008). Non-monotonic sensitivity could therefore in principle arise from the comparison of two position systems (for finger and thumb; Durlach et al., 1989), each of which shows decreased absolute sensitivity either side of resting position (i.e., at smaller or larger hand openings).

#### **HAPTIC SENSITIVITY WITH DIFFERENT TOOL GAINS**

**FIGURE 3 | Sensitivity to hand opening in the no-tool condition.** The figure plots JNDs in hand opening as a function of hand opening, averaged across the six observers (for individual results, see Supplementary material). The dashed line shows a second-order polynomial fit to the data, used to generate predictions for the tool conditions. Error bars denote ± 1 standard error.

**FIGURE 4 | Haptic sensitivity with different tool gains. (A)** Object size JNDs as a function of object size, averaged across the six observers. Different colors denote the different tool gains. The gray dashed line shows the fit to the no-tool data in **Figure 3**, extrapolated to the whole range of object sizes. This is the predicted sensitivity function if only object size *per se* determines changes in sensitivity (i.e., a high-level limit). The red, blue, and green dashed lines show the predictions assuming that low-level hand-opening sensitivity limits performance. They were calculated in the same manner as in **Figure 1**, by combining the fit to the basic hand-opening sensitivity with the geometrical effects of the tool (see main text). **(B)** The same data re-plotted in units of hand opening. The dashed gray line again shows the fit to the sensitivity function in the no-tool condition, and the gray zone around it shows ± 1 standard error. Error bars in both plots are standard errors.

**Figure 4A** plots the average sensitivity function in each of the three tool conditions. The data are plotted in "object units" (JNDs in object size as a function of object size). The gray dotted line shows the predictions for all tool-gain conditions if sensitivity is determined by high-level object representation. The curve is simply an extrapolation of the fitted polynomial function for the no-tool condition, from **Figure 3**. The colored dashed lines are predicted sensitivity functions for each tool condition assuming that low-level factors limit sensitivity. These are again calculated based on the polynomial fit to the average no-tool data in **Figure 3**, but using the calculation described in **Figure 1** (i.e., assuming that size sensitivity with tools is a straightforward combination of the sensitivity to hand opening and the geometrical effects of the tool).
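The two sets of predictions can be sketched as follows. On our reading of the low-level account, an object of size *s* felt with gain *g* produces a hand opening of *s*/*g*, and a JND at the hand is scaled by *g* when expressed in object units. The polynomial coefficients below are hypothetical placeholders, not the fitted values from **Figure 3**.

```python
# Hypothetical second-order polynomial for hand-opening JNDs (mm);
# the real coefficients come from the no-tool fit and are not given here.
HYPOTHETICAL_COEFFS = (0.001, -0.06, 4.0)


def jnd_hand(hand_opening_mm, coeffs=HYPOTHETICAL_COEFFS):
    """JND in hand opening as a function of hand opening (no-tool fit)."""
    a, b, c = coeffs
    return a * hand_opening_mm ** 2 + b * hand_opening_mm + c


def jnd_object_high_level(object_size_mm):
    """High-level prediction: sensitivity depends on object size alone."""
    return jnd_hand(object_size_mm)


def jnd_object_low_level(object_size_mm, gain):
    """Low-level prediction: hand sensitivity combined with tool geometry."""
    hand_opening = object_size_mm / gain     # opening needed to feel the object
    return gain * jnd_hand(hand_opening)     # hand JND rescaled to object units


for gain in (0.7, 1.0, 1.4):
    print(gain, [round(jnd_object_low_level(s, gain), 2) for s in (40, 60, 80)])
```

With a gain of 1 the two predictions coincide, which is why the 1:1 tool curve and the no-tool curve are expected to overlap.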

JNDs were very similar in the no-tool and 1:1 tool conditions, indicating that basic sensitivity was unaffected by the use of a tool *per se.* Moreover, variations in the tool gain ratio caused clear changes in sensitivity to object size. It can clearly be seen that the data are better fitted by the three separate tool curves, rather than a single sensitivity function (see Supplementary material for more details). **Figure 4B** re-plots the JNDs from **Figure 4A** in units of hand opening, rather than object-size units. If our straightforward model of sensitivity during tool use assumed in **Figure 1** is correct, the sensitivities in hand-opening units should all lie on a single continuous function that represents the (unchanging) underlying sensitivity to hand opening. **Figure 4B** shows that this is indeed the case.

Taken together, Experiment 1 suggests that the sensitivity, and therefore in principle the reliability, of haptic size estimates sensed using tools with different gain ratios is governed primarily by perceptual sensitivity to hand opening, and not by high-level limits on size representation (see Discussion). This finding, coupled with the observed violation of Weber's law, means that the geometry of different tools does alter the reliability of haptic size estimates. A reliability-based cue-weighting process should therefore take these changes into account. We turn to this question in Experiment 2.

## **EXPERIMENT 2: SIGNAL WEIGHTS IN TOOL USE**

Here we first established the stimulus parameters required for "two-cue" (vision-plus-haptics) conditions such that, for the same object sizes, changing tool gain should alter the reliability of the haptic signal, and so alter the signal weights. We then measured the actual weights given to each cue in these conditions using a cue-conflict paradigm.

## **METHODS**

## *Overview*

Some previous studies have used a 2-IFC task to measure performance when both visual and haptic signals are available (Ernst and Banks, 2002; Gepshtein and Banks, 2003; Gepshtein et al., 2005). This method has two key strengths. First, it allows highly accurate and precise measurement of signal weights. Participants are asked which of two intervals is larger: a cues-consistent comparison stimulus (*SH* = *SV*), or a cue-conflict standard stimulus (*SH* ≠ *SV*), for a range of comparison sizes. The Point of Subjective Equality (PSE) of the resulting psychometric function provides an estimate of the comparison size required to match the perceived size of the cue-conflict standard, and from this reliable measures of cue weights can be derived (Ernst and Banks, 2002). This allows quantitative tests of the observed data against point predictions. Second, it provides a measure of the discrimination threshold when both cues are available simultaneously. This is a hard test of whether information from both cues is actually integrated on individual trials because, assuming that single-cue discrimination performance represents the best the observer can achieve, improvements with two cues must indicate use of information from both sources. In contrast, a measure only of bias can resemble optimal signal weighting if the system uses a single signal on each trial, but switches between them in a reliability-dependent way (Serwe et al., 2009). Unfortunately, however, a 2-IFC task was unsuitable here because it would have required participants to compare sizes across tool conditions on a single trial. For example, to measure the perceived size of objects felt with the 1.4:1 tool the resulting percept would have to be compared with perceived size using the 1:1 tool. 
The tools were necessarily rendered invisible during the judgement (to control visual reliability) and this, combined with changing tool type within a trial, introduces substantial uncertainty about the tool being used on a given interval. We piloted a 2-IFC task, and found that participants were frequently confused, and therefore chose to guess on a high proportion of trials (measured discrimination performance far exceeded single-cue performance). We therefore adopted a variant of a matching task here, so as to measure perceived size (and cue weights) from trials containing a single interval. The lower accuracy and precision of this method precluded detailed quantitative evaluation of the changes in signal weights with reliability. We therefore designed the stimulus parameters for Experiment 2 to produce a qualitative change in the pattern of signal weights if reliability-based signal weighting took place. We also based our analyses on average rather than individual data.

## *Participants*

Six right-handed participants took part in this experiment (2 males and 4 females; 19–36 years old). Four of them also participated in Experiment 1, but all participants were naïve to the specific purpose of this experiment. All participants had normal or corrected-to-normal vision with normal stereo acuity, and no known motor deficits. As before, the study was approved by the School of Psychology Ethics Review Committee, Bangor University, and all participants gave informed consent before taking part.

## *Apparatus and stimuli*

The same apparatus was used as in Experiment 1. The haptic stimuli, and the visually defined pliers-like tools were generated in the same manner as in Experiment 1.

The visual stimulus was a rectangular object in the same position and orientation (though not necessarily the same size) as the haptic stimulus. We used a random-dot stereogram stimulus very similar to that used by Ernst and Banks (2002) so that we could vary the reliability of the visual size estimates as needed. The visual stimulus is shown schematically in **Figure 5**. It consisted of a random-dot-defined square "bar" represented by a plane 20 mm in front of a random-dot background plane. The whole stimulus was 80 mm wide and 200 mm tall. Visual size was the height of the bar (i.e., visual size varied in the same direction as haptic size). The dot diameter was 4.0 mm, ± up to 1.0 mm random jitter (drawn from a uniform distribution). Average dot density was 0.20 dots per mm². We used anti-aliasing to achieve sub-pixel accuracy of dot positions. In addition, because random dot placement could effectively make the stimulus larger or smaller than intended on particular trials, we chose 3% of the dots comprising the bar, and moved them to the edges, ensuring the stimulus was always the intended size. The viewing distance to the ground plane was 500 mm. We manipulated visual reliability (added noise) in the same manner as Ernst and Banks (2002). To do this, we added a random displacement in depth to each dot, drawn from a uniform distribution, where 100% noise indicates that dot displacements were drawn from a range ±100% of the 20 mm "step" between background and bar (**Figure 5**).
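A minimal sketch of this noise manipulation is given below, assuming that a noise level of *p*% corresponds to a uniform range of ±(*p*/100) × 20 mm around each dot's nominal depth. The function name and the use of Python's `random` module are illustrative and do not describe the rendering code used in the experiment.

```python
import random

STEP_MM = 20.0  # depth step between background and bar, from the text


def dot_depth_offsets(n_dots, noise_percent, rng):
    """Draw one uniform depth displacement (mm) per dot.

    Assumes noise_percent = 100 means a range of +/- STEP_MM.
    """
    half_range = STEP_MM * noise_percent / 100.0
    return [rng.uniform(-half_range, half_range) for _ in range(n_dots)]


rng = random.Random(0)                        # fixed seed for repeatability
offsets = dot_depth_offsets(1000, 100, rng)   # 100% noise
print(min(offsets), max(offsets))
```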

### *Specifying the stimulus parameters*

Using the same method as Experiment 1, we first measured haptic-alone size sensitivity (JNDs) for the three tool gain ratios (0.7:1, 1:1, and 1.4:1), at three object sizes: 40, 60, and 80 mm. Note that here, haptic object size (as opposed to hand opening) was the same in different tool-gain conditions, because we wanted to examine the effects on signal weights of feeling the same object with different tools. Thus, hand opening varied with tool gain. The comparison sizes in "object units" were ±1, 3, 6, and 9 mm. This meant that with the 0.7:1 tool, the standard sizes in units of hand opening were 57.1, 85.7, and 114.3 mm (comparison sizes = standard ± 1.4, 4.3, 8.6, and 12.9 mm) and with the 1.4:1 tool, standard sizes at the hand were 28.6, 42.9, and 57.1 mm (comparison = standard ± 0.7, 2.1, 4.3, and 6.4 mm).
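The hand-unit values quoted above are the object-unit values divided by the tool gain. A short check of that conversion (names are illustrative):

```python
# Hand opening = object size / tool gain; the same scaling applies to the
# comparison steps. Values are rounded to 0.1 mm as in the text.
OBJECT_SIZES_MM = [40, 60, 80]
COMPARISON_STEPS_MM = [1, 3, 6, 9]


def to_hand_units(object_mm, gain):
    """Convert a size in object units (mm) to hand-opening units (mm)."""
    return round(object_mm / gain, 1)


hand_standards = {g: [to_hand_units(s, g) for s in OBJECT_SIZES_MM]
                  for g in (0.7, 1.0, 1.4)}
hand_steps = {g: [to_hand_units(d, g) for d in COMPARISON_STEPS_MM]
              for g in (0.7, 1.0, 1.4)}

for g in (0.7, 1.0, 1.4):
    print(g, hand_standards[g], hand_steps[g])
```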

Object size sensitivity in each condition is shown in **Figure 6A**. As we hoped, the different tool gains resulted in qualitatively different patterns of haptic sensitivity at different object sizes. Specifically, for the 40 mm object sensitivity was better with the 0.7:1 tool than with the 1.4:1 tool, and for the 80 mm object the pattern was reversed. This means we could manipulate haptic reliability in a manner that should result in clear differences in signal weights with variations in tool gain.

We wanted visual reliability not to be too low, because we wished to observe a clear contribution of both vision and haptics to the overall size estimate. Nor did we want visual reliability to be too high, because our manipulation of haptic reliability might then not have measurable effects. We therefore chose a visual noise level intended to approximately match each participant's visual sensitivity, with the 60 mm object and 1:1 tool, to his or her average haptic sensitivity across all conditions. It was not necessary to match visual reliabilities separately in all conditions because (*i*) we found in pilot testing that visual size JNDs with our stimuli varied with object size by only a small amount (0.86 mm with 20 mm variation in object size), and (*ii*) we were not testing point predictions. We therefore used data on the relationship between visual noise and JND from a previous pilot experiment (*N* = 7) as a "lookup table" to specify the visual noise levels required in each case (see Supplementary material for details). The average of the predicted visual sensitivities is shown in **Figure 6A** (dashed gray line).


**Figure 6B** shows the predicted pattern of signal weights based on the data from **Figure 6A**, and calculated using Equation 2 (assuming the relationship between sensitivity and reliability described in the Introduction). It can be seen that for the smallest object size, optimal integration predicts that haptics will receive more weight with the 0.7:1 tool than with the 1.4:1 tool. At the middle object size, the prediction is for similar weights (close to 0.5) for all tool types. At the largest object size (80 mm) the theory predicts a reversal of the pattern at 40 mm, with haptic weight being lower with the 0.7:1 tool, and higher with the 1.4:1 tool.
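Equation 2 is not reproduced in this section. The sketch below assumes it has the standard reliability-weighting form (Ernst and Banks, 2002), in which each signal's weight is its reliability divided by the summed reliabilities, with reliability taken as the inverse of the squared JND. The JND values used are hypothetical.

```python
def haptic_weight(jnd_haptic_mm, jnd_visual_mm):
    """Predicted haptic weight, assuming reliability = 1 / JND^2."""
    r_h = 1.0 / jnd_haptic_mm ** 2
    r_v = 1.0 / jnd_visual_mm ** 2
    return r_h / (r_h + r_v)


# Hypothetical JNDs (mm): equal sensitivities give equal weights, and the
# more sensitive signal receives the larger weight.
print(haptic_weight(4.0, 4.0))   # equal JNDs
print(haptic_weight(2.0, 4.0))   # haptics twice as sensitive as vision
```

Under this assumption, halving the haptic JND relative to the visual JND raises the haptic weight from 0.5 to 0.8, which is the kind of shift the tool-gain manipulation was designed to produce.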

#### *Procedure*


We measured the perceived size of stimuli when information was available from vision and haptics simultaneously. We varied visual and haptic sizes independently (cue-conflict stimuli) to measure the weight given to each. Each trial consisted of a stimulus presentation period and a response period. In the stimulus presentation period, visual and haptic stimuli were presented simultaneously and participants explored the virtual objects using a tool. The stimulus period closely resembled the previous haptic-only trial, except for the presence of the visual stimulus. Once again the tool tips were first inserted in start zones, then all visual information (including the tool) was extinguished at the commencement of grasp closure. When the tool tips touched the haptic object, the visual stimulus appeared. As before, participants were trained to respond within a 900–1200 ms temporal window.

In the response period, a visual rectangular cuboid appeared on the screen (width 100 mm, depth 20 mm), and participants adjusted its height (in 5 mm increments from 20 to 120 mm) to match the stimulus they had just experienced. The start "size" was randomized. In pilot experiments we found similar patterns of results for visual responses and haptic responses (reporting which of a range of felt sizes matched the stimulus) and so we tested only visual responses here (Helbig and Ernst, 2007b).

In no-conflict conditions, visual object size was equal to haptic object size (40, 60, or 80 mm). In conflict conditions, visual object size was varied ± 10 mm from the haptic object size, allowing us to determine the weights given to vision and haptics (assuming *wH* = 1 − *wV*; see below). Each participant's visual noise level was constant in all conditions, and set so as to match his or her visual and haptic sensitivities when viewing the 60 mm object and feeling it with the 1:1 tool (see earlier). The experiment was run in a series of blocks containing both no-conflict (*SV* = *SH*) and cue-conflict (*SV* = *SH* ± 10 mm) stimuli. Each block therefore contained 27 combinations of stimuli (3 haptic object sizes × 3 visual sizes × 3 tool gains), randomly interleaved. Each participant completed 20 judgements per stimulus combination.

## **RESULTS**

**Figure 7** shows mean perceived size as a function of variation in visual object size for each tool-gain condition. The three panels show the data for the three haptic object sizes (40, 60, and 80 mm), respectively. If size estimates were based only on the haptic signal, the data would lie on horizontal lines. Conversely, if estimates relied on vision alone, the curves would lie on a line with a slope around 1.0. Clearly, the observed data are between these two extremes. This is consistent with estimates based on a weighted combination of both signals (Helbig and Ernst, 2007b). It can also be seen that, at all haptic object sizes, the data for the different tool-gain conditions are separated vertically. Relative to the 1:1 tool, perceived size was on average 2–3 mm larger with the 0.7:1 tool, which magnified the hand opening relative to object size, and a similar amount smaller with the 1.4:1 tool, which reduced the hand opening. These are relatively small effects, given the variation in actual hand opening across conditions (with the 60 mm haptic object, for example, hand openings in the three tool conditions were 85.7, 60 and 42.9 mm). This result therefore suggests that haptic size estimates when using a tool are rescaled to account for the tool gain, but that this "compensation" is incomplete (we return to this issue in the Discussion).

**FIGURE 7 | Perceived size results from Experiment 2.** Each panel shows perceived size as a function of variations in the visual object size, for each tool type, averaged across all participants. Panels **(A–C)** show the data for haptic object sizes of 40, 60, and 80 mm, respectively. Error bars denote ± 1 standard error.

**FIGURE 8 |** Haptic signal weight for each "base" object size and tool type, averaged across all participants. The weights were calculated from the effect of varying visual size in each case (the slopes of the lines in **Figure 7**), and assuming that *wH* = 1 − *wV*. See main text for details of this calculation. Error bars denote ± 1 standard error.

**Figure 8** plots the average weights given to vision and haptics for each combination of object size and tool gain, based on the slopes calculated from **Figure 7**. To calculate the weights, we assumed that perceived size ($\hat{S}_{VH}$) is a weighted sum of visual and haptic estimates ($\hat{S}_V$, $\hat{S}_H$), as specified in Equations 1 and 2 (Ernst and Banks, 2002). We assumed that $\hat{S}_V$ was unbiased. As noted above, we cannot assume that $\hat{S}_H$ is unbiased. But because in each condition we fixed haptic size and varied visual size, by making the reasonable assumption that the bias in the haptic size estimate is constant for a constant object size and tool gain, the slope of the perceived size data as a function of visual size directly represents the weight given to vision (i.e., changes in haptic bias only would shift the data up or down on the *y*-axis, but not alter the slope). The visual weight, *wV*, was therefore defined as the slope of the best fitting linear regression to the data in each case, where *wH* = 1 − *wV*.
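The weight calculation can be sketched as an ordinary least-squares slope of perceived size against visual size, with the haptic weight taken as its complement. The perceived sizes below are hypothetical values for a single condition, not data from the experiment.

```python
def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# One condition: 60 mm haptic object, visual size 60 +/- 10 mm.
visual_sizes_mm = [50.0, 60.0, 70.0]
perceived_sizes_mm = [56.0, 60.5, 65.0]   # hypothetical mean settings

w_v = regression_slope(visual_sizes_mm, perceived_sizes_mm)  # visual weight
w_h = 1.0 - w_v                                              # haptic weight
print(w_v, w_h)
```

A constant haptic bias would add the same offset to every perceived size and so leave the slope, and hence both weights, unchanged.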

It can be seen that signal weights varied with both object size, and tool gain, in a manner similar to the predictions in **Figure 6B**. For the 40 mm object, the 0.7:1 tool resulted in more weight given to haptics, and the 1.4:1 tool resulted in more weight given to vision. As predicted, this pattern was reversed for the 80 mm object. Because we had clear predictions we did not conduct an omnibus ANOVA, but instead ran specific planned comparisons (one-tailed *t*-tests) to evaluate the statistical significance of the predicted effects. These tests showed that for the 40 mm object the weight given to haptics with 1.4:1 tool gain was significantly lower than with 0.7:1 tool gain [*t*(5) = 4.92; *p* < 0.01]. For the 80 mm object, the haptic weight with the 1.4:1 tool was significantly higher than with the 0.7:1 tool [*t*(5) = 2.23; *p* ≤ 0.05].
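The reported degrees of freedom (5, with six participants) are consistent with paired comparisons across participants. Assuming a paired design, the test statistic can be sketched as below. The per-participant weights are hypothetical, and the critical values are the standard one-tailed *t* values for 5 degrees of freedom.

```python
import math

# One-tailed critical t values for df = 5 (standard table values).
T_CRIT_05 = 2.015
T_CRIT_01 = 3.365


def paired_t(a, b):
    """Return (t, df) for the paired differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
    return mean / math.sqrt(var / n), n - 1


# Hypothetical haptic weights for six participants at one object size.
w_h_tool_07 = [0.60, 0.70, 0.65, 0.55, 0.70, 0.60]
w_h_tool_14 = [0.40, 0.45, 0.50, 0.35, 0.50, 0.45]

t_stat, df = paired_t(w_h_tool_07, w_h_tool_14)
print(t_stat, df, t_stat > T_CRIT_05)
```

Against these critical values, the reported statistics of 4.92 and 2.23 exceed the 0.01 and 0.05 one-tailed thresholds, respectively.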

## **DISCUSSION**

In Experiment 1 we found that variations in haptic size sensitivity, as measured with a 2-IFC task, did not follow Weber's law but instead followed a more complex pattern with increased hand opening. We also found that haptic size sensitivity when using virtual tools that altered the relationship between object size and hand opening was a straightforward combination of sensitivity at the hand, and the effects of the tool geometry. Thus, the "gain" of pliers-like tools alters the reliability of haptic size information, and so should be accounted for by an optimal visual-haptic integration process. In Experiment 2, we found that the brain took account of these changes in haptic reliability introduced by different tool geometries, and adjusted the weighting of haptic and visual signals in a manner broadly consistent with statistically optimal sensory integration.

Said another way, our results show that the visuo-motor system was able to adjust appropriately the weight given to size estimates from vision and haptics with changes in haptic reliability introduced by using different tool mappings. This extends our knowledge about the flexibility of sensory integration mechanisms, in particular by suggesting that the brain can represent not only distal properties of the world sensed via a "tool transformation," but also the certainty (reliability) with which that information is known. This potentially confers the capability to combine visual and haptic signals rationally across a wide range of situations encountered in the world. Caution should be exercised in generalising our findings to the use of real tools, however. While our manipulation of tool gain accurately represented the functioning of a real tool, our virtual stimuli differed from real-world situations in several regards. In particular, movement of the tool itself was artificially constrained, and it had no perceptible mass. We used virtual stimuli to provide the degree of experimental control required for our approach, including varying visual reliability parametrically, varying visual and haptic sizes independently (to measure cue weights), and switching rapidly between tool types. But it remains to be determined whether the visuo-motor system operates similarly in real-world tool use.

## **THE CORRESPONDENCE PROBLEM IN VISUAL-HAPTIC INTEGRATION WITH DIFFERENT TOOL GAINS**

On its own the finding that haptic sensitivity to object size was simply determined by the sensitivity at the hand, and the effects of tool geometry, is perhaps unsurprising: the task simply requires two signals, both of which are modified by the tool geometry in the same way, to be discriminated from one another, and the overall *magnitude* of the two estimates does not matter. That is, discrimination need not be carried out in units of the object's size in the world, taking into account the tool geometry, but could simply be carried out in the more basic units of hand opening. For integration of two different sensory signals to be effective, however (Experiment 2), the brain must transform the two signals into common units. That is, it must solve a sensory "correspondence problem"—knowing the statistics of the mapping between estimates that are sensed in fundamentally unrelated units (Ernst, 2005; Roach et al., 2006). This is important not only for establishing the relative reliabilities of signals, as studied here, but also for more fundamental aspects of sensory integration (Landy et al., 1995). For example, the combined estimate should also generate an accurate (least biased) combined estimate of the object's size, which is also not possible if the relationship between visual size and (altered) haptic size estimates is not accounted for. Moreover, knowledge of the mapping between signals is also important in making the basic decision about whether to integrate signals or not. As in other sensory domains, visual and haptic signals can often refer to different objects in the world. To avoid combining unrelated signals the brain must therefore determine how likely it is that they share a common cause (Ernst, 2005; Körding and Tenenbaum, 2006; Roach et al., 2006; Körding et al., 2007; Shams and Beierholm, 2010). 
Recent work suggests this process could be achieved by comparing the statistical similarity of the different signals across dimensions such as spatial location, temporal synchrony, and also signal magnitude (e.g., Deneve and Pouget, 2004; Gepshtein et al., 2005; Shams et al., 2005; Roach et al., 2006; Bresciani et al., 2006; Ernst, 2007; Knill, 2007; Körding et al., 2007; Girshick and Banks, 2009; Takahashi et al., 2009). This makes sense because the probability that two signals relate to the same object is normally directly related to how similar the estimates are.

Our method does not allow us to determine whether information was combined optimally, in the sense of producing the minimum-variance combined estimate (Ernst and Banks, 2002). Nonetheless, the observed changes in perceived size with variations in visual size in Experiment 2 are consistent with optimal integration of information from vision and haptics in all three tool-gain conditions, suggesting the brain solved this sensory correspondence problem essentially correctly. As in previous work, sensory integration occurred appropriately despite the spatial offset between visual signals (at the object) and haptic signals (at the tool handle) (Gepshtein et al., 2005; Takahashi et al., 2009). Moreover, signals appeared to be combined correctly, with appropriate weightings, across variations in tool gain. Thus, even when the proximal signals (visual size and hand opening) were discrepant, visual and haptic estimates were appropriately combined on the basis of the distal object properties, taking into account the tool geometry.

## *Remapping of haptic signals*

In principle, this correspondence of visual and haptic signals could be established by a remapping process that transforms the haptic signal at the hand to take account of the tool geometry, allowing accurate estimates of object size in the world, independent of the tool used to feel it. The perceived-size data from Experiment 2 (**Figure 7**) are broadly consistent with such a process. Assuming that visual size estimates were unbiased, and using our estimates of the weights given to each signal (*wV*, *wH*) we can calculate the haptic size estimate ($\hat{S}_H$) by rearranging Equation 1. **Figure 9** plots $\hat{S}_H$ calculated in this way for each tool-gain condition as a function of haptic object size (**Figure 9A**) and hand opening (**Figure 9B**). It can be seen that perceived size estimates with gains other than 1:1 were driven predominantly by object size, and not by hand opening. **Figure 9B** shows for example that, for the same hand opening, haptic size estimates altered substantially as a function of tool gain. Moreover, **Figure 9A** shows that $\hat{S}_H$ varied with haptic object size with a slope of near 1.0. This suggests the brain was transforming the proximal estimate (hand opening at the handle of the tool) in order to estimate the distal haptic size (object size in the world), allowing size estimates, and decisions about sensory integration, to be based on these common units.
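Equation 1 is not reproduced in this section. Assuming it is the weighted sum described in the Results, with weights that sum to one and an unbiased visual estimate (so that the visual estimate equals the presented visual size $S_V$), the rearrangement can be written as:

```latex
\begin{align}
  \hat{S}_{VH} &= w_V \hat{S}_V + w_H \hat{S}_H, \qquad w_V + w_H = 1, \\
  \hat{S}_H &= \frac{\hat{S}_{VH} - w_V \hat{S}_V}{w_H}
             = \frac{\hat{S}_{VH} - w_V S_V}{1 - w_V}.
\end{align}
```

Here $\hat{S}_{VH}$ is the measured perceived size and $w_V$ is the regression slope from the perceived-size data, so every term on the right-hand side is observable.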

An important idea in motor control is that the brain constructs internal "forward models" of limbs allowing movements of the hand and arm in space to be predicted from motor commands (Wolpert et al., 1995). In principle such a model could be extended to include representation of the tool geometry (which is relatively simple compared to the relationship between joint angles and space) allowing tools to be controlled using the same underlying systems normally used to control the arm (Takahashi et al., 2009). This is similar to the more general idea that tools are "incorporated" into our sense of body position in space, the "body schema" (Maravita and Iriki, 2004). Relatedly, there is some experimental evidence for effector-independent control processes for grasping, implying that motor outputs take tool geometry into account (Gentilucci et al., 2004; Johnson-Frey, 2004; Umilta et al., 2008; Gallivan et al., 2013).

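**FIGURE 9 |** Haptic size estimates ($\hat{S}_H$) for each tool-gain condition as a function of haptic object size **(A)** and hand opening **(B)**. See main text for details of how these values were derived. Error bars denote ± 1 standard error in each case.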

Researchers have also studied the "reverse" of this process, showing that tool use can affect perceptual and cognitive processes including perception of the extent of peri-personal space, the allocation of spatial attention, and perception of our limbs (e.g., Farnè and Làdavas, 2000; Holmes et al., 2004; Bonifazi et al., 2007; Cardinali et al., 2009; Sposito et al., 2012), as well as motor output (Cardinali et al., 2009, 2012). These results are interesting, and point to the existence of general internal models that allow forward and reverse operations. The conclusions that can be drawn regarding the accuracy of the internal models underlying complex tool use are limited, however, for two reasons. First, although humans frequently use articulated tools, which introduce complex alterations to the hand/object mapping, the above studies focused almost exclusively on the effects of using tools that only extend reachable space. Second, the accuracy of the tool model could not be assessed because studies typically measured indirect consequences of using tools, such as shifts in spatial attention, indexed by measures such as reaction time changes. Changes in motor output offer a more direct measure (e.g., Cardinali et al., 2009, 2012), but it is difficult to make quantitative predictions regarding movement kinematics, particularly for aftereffects of tool use.

The finding that haptic size estimates change systematically with changes in tool gain is consistent with the existence of internal models for relatively complex transformations, allowing the magnitude of distal signals (object size) to be computed from the proximal signals (hand opening) sensed via a tool. It is interesting to note, however, that the "compensation" for tool geometry we observed, although substantial, is incomplete, with $\hat{S}_H$ being biased slightly toward the actual hand opening. This could reflect uncertainty in the internal model of the tool (we do not have sufficient data to examine whether the effect reduces with time, as the internal model of the tool is refined, for example). It could also reflect uncertainty about which tool is currently being used in our task, because vision of the tool was extinguished during the size judgement to control visual reliability. Consider an estimate of the current tool gain based on a (Bayesian) combination of sensory input about what the current mapping state is, knowledge from previous experience of this mapping, and a prior for hand:object-size mapping built up from experience. Increasing uncertainty in either the sensory data or the knowledge of the mapping will lead to a greater influence of the prior (the typical mapping), which is presumably a 1:1 mapping between object size and hand opening (i.e., when there is no tool).
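The pull toward the prior can be illustrated with a minimal sketch, assuming Gaussian distributions and collapsing the sensory input and the learned knowledge of the mapping into a single "sensed gain" term. The function name, the prior width and all numbers are hypothetical and are not estimates from our data.

```python
def posterior_gain(sensed_gain, sensed_sd, prior_gain=1.0, prior_sd=0.3):
    """Combine a noisy estimate of tool gain with a Gaussian prior centred on 1:1.

    All default values are illustrative assumptions, not values from the study.
    Returns the posterior mean gain and the weight given to the prior.
    """
    sensed_rel = 1.0 / sensed_sd ** 2   # reliability = inverse variance
    prior_rel = 1.0 / prior_sd ** 2
    w_prior = prior_rel / (prior_rel + sensed_rel)
    estimate = w_prior * prior_gain + (1.0 - w_prior) * sensed_gain
    return estimate, w_prior


# A tool with a true gain of 1.4, sensed with increasing uncertainty.
for sd in (0.05, 0.2, 0.6):
    gain, w_prior = posterior_gain(sensed_gain=1.4, sensed_sd=sd)
    print(f"sensed sd={sd:.2f}  prior weight={w_prior:.2f}  estimated gain={gain:.2f}")
```

As the sensed gain becomes less certain, the estimate moves from the tool's gain toward the no-tool mapping, which is the direction of the bias described above.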

The full range of tool transformations that the visuo-motor system can model internally remains to be determined. In principle, equipped with an appropriate set of mathematical basis functions, any tool mapping, no matter how abstract, could be modeled. If one assumes, however, that our tool modeling capability did not evolve independently, but instead takes advantage of mechanisms that evolved for controlling our limbs in varying situations (caused by growth, fatigue, holding objects of different weights etc.), it seems likely that this architecture will impose constraints on the classes of transformation that can be modeled (consistent with findings from classical adaptation literature).

## *Estimating signal reliability in remapped haptic estimates*

Neural models of population coding offer a plausible neural mechanism by which the task of appropriately weighting different sensory signals could be achieved (see for example Zemel et al., 1998; Pouget et al., 2002; Knill and Pouget, 2004; Natarajan and Zemel, 2011; Fetsch et al., 2012). These models describe how neural populations can represent the probability distribution associated with an estimate of properties of the world, and so can represent both the magnitude and uncertainty (noise) of the estimate in a manner that is analogous to statistical models of cue integration (Ernst and Banks, 2002). In simple terms, noisier inputs, caused either by internal or external factors result in "wider" population responses, and vice versa. The product of two such probabilistic distributions (one for each signal), appropriately normalized, is equivalent to the statistically optimal sensory integration described earlier (Pouget et al., 1998; Deneve et al., 2001; Ernst and Banks, 2002; Knill and Pouget, 2004).
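The equivalence between multiplying two distributions and reliability-weighted averaging can be checked numerically. The following is a minimal sketch, assuming Gaussian distributions for the visual and haptic estimates; the function names and the example values (in mm) are hypothetical.

```python
import math


def gaussian(x, mean, sd):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def combine_closed_form(mean_v, sd_v, mean_h, sd_h):
    """Reliability-weighted average and its predicted standard deviation."""
    rel_v, rel_h = 1 / sd_v ** 2, 1 / sd_h ** 2
    w_v = rel_v / (rel_v + rel_h)
    mean = w_v * mean_v + (1 - w_v) * mean_h
    return mean, math.sqrt(1 / (rel_v + rel_h))


def combine_by_product(mean_v, sd_v, mean_h, sd_h, lo=0.0, hi=120.0, step=0.01):
    """Multiply the two densities on a grid, normalise, and read off mean and SD."""
    n = int(round((hi - lo) / step)) + 1
    xs = [lo + i * step for i in range(n)]
    prod = [gaussian(x, mean_v, sd_v) * gaussian(x, mean_h, sd_h) for x in xs]
    total = sum(prod)
    mean = sum(x * p for x, p in zip(xs, prod)) / total
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, prod)) / total
    return mean, math.sqrt(var)


# Illustrative values only: a less reliable visual and a more reliable haptic estimate.
print(combine_closed_form(50.0, 4.0, 56.0, 2.0))
print(combine_by_product(50.0, 4.0, 56.0, 2.0))
```

Both routes give the same mean and standard deviation to within the resolution of the grid, without the product route ever computing a weight explicitly.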

A key feature of such a mechanism is that by operating on probability distributions it could achieve optimal, reliability-based signal weighting moment-by-moment, without requiring the explicit calculation of signal reliabilities or weights, or explicit knowledge about the circumstances under which different signals are reliable (see Natarajan and Zemel, 2011). For quantitatively meaningful outputs to emerge, however, the two neural populations for the two senses must be appropriately calibrated with respect to one another (this is the sensory correspondence problem, described above). If we assume the brain's internal tool "model" operates at the level of the whole neural population coding for haptic size, then it could effectively scale, or remap, the output of each neuron in the population according to the geometrical transformation between hand opening and object size introduced by the tool. This "single" operation would remap the whole probability distribution and so in theory would achieve both appropriate rescaling of the magnitude of the haptic size estimate and of the "width" (uncertainty) of the distribution, allowing reliability-based combination with other signals in the manner we observed.
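A minimal sketch of this idea, assuming the remapping is a simple multiplicative scaling and that the haptic distribution can be approximated by a set of samples. The scale factor `object_per_hand` (object size divided by hand opening), the noise levels and the function names are hypothetical. For simplicity the hand opening is held fixed across scale factors, whereas in the experiments the same object produced different hand openings with different tools.

```python
import random
import statistics


def remap(hand_samples, object_per_hand):
    """Scale every proximal (hand-opening) sample into distal (object-size) units."""
    return [object_per_hand * s for s in hand_samples]


def haptic_weight(sd_h, sd_v):
    """Reliability-based weight for the haptic signal."""
    rel_h, rel_v = 1 / sd_h ** 2, 1 / sd_v ** 2
    return rel_h / (rel_h + rel_v)


rng = random.Random(0)
hand = [rng.gauss(40.0, 2.0) for _ in range(10000)]  # hand opening in mm (illustrative)

for scale in (0.7, 1.0, 1.4):
    obj = remap(hand, scale)
    sd_obj = statistics.stdev(obj)
    w_h = haptic_weight(sd_obj, sd_v=3.0)  # visual SD of 3 mm is an assumption
    print(f"scale={scale}  mean={statistics.mean(obj):.1f}  sd={sd_obj:.2f}  w_H={w_h:.2f}")
```

The single scaling operation changes the mean and the spread by the same factor, so the haptic weight that follows from the remapped distribution changes with the tool without any separate recalculation of reliability.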

This process also provides a mechanism by which basic sensory factors limit the reliability of high-level (object size) estimates from haptics during tool use, because the low-level noise propagates through all levels of the system. The haptic-alone discrimination performance in Experiment 1 does not, on its own, provide compelling evidence for our claim for low-level limitations on high-level haptic size estimates with tools because, as we discussed above, the task could have been carried out in hand-opening "units." The agreement between observed and predicted signal weights in Experiment 2 suggests, however, that the single-signal results accurately reflected the system's sensitivity in the two-signal case, when haptic estimates were presumably necessarily transformed into higher level (object-size) units in order to be combined with visual size estimates. Taken together, these results suggest that haptic-size sensitivity in tool use is indeed limited by low level sensory factors and not higher-level size-representation mechanisms.

## *Rapid switching between visuo-motor mappings*

The process described above (remapping between neural populations that encode the same object properties specified by different senses) could also describe "classical" adaptation, for example to prism displacement. We deliberately interleaved tool types at random in both our experiments (on average the tool gain was 1:1) specifically to prevent such adaptation to a constant "offset." The agreement between our predicted and observed signal weights is therefore consistent with participants switching between different visuo-motor mapping "states" on a trial-by-trial basis, and weighting signals correctly on each trial. This is consistent with other work on tool use suggesting that tool mappings are learned and can then be selected or switched by contextual information or information about tool dynamics (Imamizu et al., 2003; Massen and Prinz, 2007; Imamizu and Kawato, 2008; Botvinick et al., 2009; Beisert et al., 2010; Ingram et al., 2010). A similar ability to switch between (presumably learned) mappings has been observed in visuo-motor adaptation more generally (Cunningham and Welch, 1994). Perhaps the most commonly observed example of this is our ability to rapidly compensate for the effects of putting on and removing prescription spectacles, once we have sufficient experience with them (see Schot et al., 2012). Important questions remain, however, regarding the limitations on learning and switching between tool models, including the degree of complexity of tool transformation that can be dealt with effectively (see earlier), how many different tool models can be learned, and which signals indicate the current tool mapping state to the system. Even in our relatively straightforward experiment there are several possibilities for what the system might be learning. For example, the different tools could be modeled independently, in which case information about one "tool mapping" would confer no information regarding a similar, but novel tool.
Alternatively, the class of "simple gain tools" could be learned, along with a variable gain parameter, in which case our effects would transfer to novel tools of the same class. Indeed, it remains possible that nothing is learned, and that the current mapping state is recovered on each trial. Further studies are required to explore these possibilities.

## *Implications for designing tools and other visual-haptic interfaces*

Clearly there are many factors that must be borne in mind when designing tools and haptic interfaces, of which we have studied just one (haptic size sensitivity). Nonetheless, our data do provide pointers for how size sensitivity can be optimized in visual-haptic (or haptic-only) devices. The critical finding is that, because sensitivity to hand opening does not follow Weber's law, there is a particular tool gain that maximises haptic sensitivity for a particular object size. This is illustrated in **Figure 10**, which plots JNDs in object size as a function of both object size and tool gain, using average data from Experiments 1 and 2, and assuming the straightforward relationship already described between sensitivity to hand opening and sensitivity to object size with different tool gains (Experiment 1). The diagonal dashed line represents the locus of best haptic-size sensitivity in this space. In principle, armed with this information, a haptic device can be optimized for size sensitivity (similar analyses could also be carried out for other transformations). Because optimal tool gain varies continuously with object size, however, it will be critically important to answer the questions posed earlier regarding our ability to learn multiple mappings, and to switch between them, to determine how haptic interfaces are to be truly optimized for complex environments.

**FIGURE 10 | Effects of tool gain and object size on haptic size sensitivity.** The figure plots object size JNDs as a function of both object size and tool gain. Continuous data were obtained by using the fit to the empirical data from Experiment 1 (no-tool condition), and assuming again that sensitivity in object-size units was a straightforward combination of sensitivity to hand opening and the effects of tool geometry (**Figure 1**). The regions of the figure where no data are plotted correspond to hand openings beyond our measured data. JNDs ≥20 mm are not represented accurately, but are plotted as a "flat" dark red region.
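The logic of such an optimization can be sketched as follows. This is an illustration under loudly stated assumptions: tool gain is taken to be hand opening divided by object size, and `jnd_hand_opening` is a made-up, non-Weber placeholder function, not the fit to the Experiment 1 data. Only the structure of the calculation is intended to carry over.

```python
def jnd_hand_opening(hand_mm, a=1.0, b=0.0005):
    """HYPOTHETICAL JND (mm) for hand opening; a placeholder, not the empirical fit."""
    return a + b * hand_mm ** 2


def jnd_object_size(object_mm, gain):
    """JND in object-size units; gain is assumed to be hand opening / object size."""
    hand_mm = gain * object_mm
    # A hand-opening JND corresponds to a change of JND / gain in object size.
    return jnd_hand_opening(hand_mm) / gain


def best_gain(object_mm, gains):
    """Tool gain (from the candidates) that minimises the object-size JND."""
    return min(gains, key=lambda g: jnd_object_size(object_mm, g))


gains = [round(0.5 + 0.05 * i, 2) for i in range(31)]  # candidate gains 0.5 to 2.0
for size in (30, 45, 60):
    g = best_gain(size, gains)
    print(f"object={size} mm  best gain={g}  JND={jnd_object_size(size, g):.2f} mm")
```

With this placeholder function the best gain falls as object size increases, which illustrates why a single fixed gain cannot be optimal across a range of object sizes.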

## **CONCLUSIONS**

Tools commonly change the mapping between object size and hand opening. This potentially alters the reliability of haptic size estimates, complicating the problem of weighting visual and haptic estimates correctly in sensory integration. We first confirmed that pliers-like tools do indeed introduce such changes in haptic precision, and therefore reliability. We then examined the extent to which the brain takes account of these changes in visual-haptic integration during tool use. Our results suggest that the brain compensates (albeit incompletely) for changes in proximal haptic signals introduced by different tool geometries, allowing it to dynamically and appropriately adjust the weighting given to haptic and visual signals in a manner consistent with optimal theories of sensory integration. These findings reveal high levels of flexibility of human sensory integration and tool use, as well as providing an approach for optimizing the design of visual-haptic devices.

## **ACKNOWLEDGMENTS**

This work was supported by the Overseas Research Students Awards Scheme (Chie Takahashi), and the Engineering and Physical Sciences Research Council (Chie Takahashi, Simon J. Watt).

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2014.00109/abstract


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 October 2013; accepted: 27 January 2014; published online: 14 February 2014.*

*Citation: Takahashi C and Watt SJ (2014) Visual-haptic integration with pliers and tongs: signal "weights" take account of changes in haptic sensitivity caused by different tools. Front. Psychol. 5:109. doi: 10.3389/fpsyg.2014.00109*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Takahashi and Watt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Modulation of visual attention by object affordance

## *Patricia Garrido-Vásquez\* and Anna Schubö*

*Experimental and Biological Psychology, Philipps University Marburg, Marburg, Germany*

#### *Edited by:*

*Knut Drewing, Giessen University, Germany*

#### *Reviewed by:*

*Jay Pratt, University of Toronto, Canada*

*Ann-Katrin Wesslein, University of Trier, Germany*

#### *\*Correspondence:*

*Patricia Garrido-Vásquez, Experimental and Biological Psychology, Philipps University Marburg, Gutenbergstrasse 18, 35032 Marburg, Germany e-mail: pgarrido@uni-marburg.de*

Some objects in our environment are strongly tied to motor actions, a phenomenon called object affordance. A cup, for example, affords us to reach out to it and grasp it by its handle. Studies indicate that merely viewing an affording object triggers motor activations in the brain. The present study investigated whether object affordance would also result in an attention bias, that is, whether observers would rather attend to graspable objects within reach compared to non-graspable but reachable objects or to graspable objects out of reach. To this end, we conducted a combined reaction time and motion tracking study with a table in a virtual three-dimensional space. Two objects were positioned on the table, one near, the other one far from the observer. In each trial, two graspable objects, two non-graspable objects, or a combination of both was presented. Participants were instructed to detect a probe appearing on one of the objects as quickly as possible. Detection times served as an indirect measure of attention allocation. The motor association with the graspable object was additionally enhanced by having participants grasp a real object in some of the trials. We hypothesized that visual attention would be preferentially allocated to the near graspable object, which should be reflected in reduced reaction times in this condition. Our results confirm this assumption: probe detection was fastest at the graspable object at the near position compared to the far position or to a non-graspable object. A follow-up experiment revealed that in addition to object affordance *per se*, immediate graspability of an affording object may also influence this near-space advantage. Our results suggest that visuospatial attention is preferentially allocated to affording objects which are immediately graspable, and thus establish a strong link between an object's motor affordance and visual attention.

#### **Keywords: object affordance, attention, motor, graspability, reachability**

## **INTRODUCTION**

Gibson (1979) coined the concept of object affordance in his ecological approach to perception. In a nutshell, affordances are what the environment offers to the observer, or in other words, what we can do with the objects surrounding us. For instance, a solid object of a certain size and/or furnished with a handle may afford us to grasp it, and this potential of the object for action is already present in the visual array. The concept of object affordance therefore establishes a strong connection between visual perception and motor behavior.

In recent years, neuroscientific research has been dedicated to unraveling the neural mechanisms of object affordance. One seminal study (Chao and Martin, 2000) compared the processing of different types of stimuli (faces, houses, animals, and tools) in a functional magnetic resonance imaging (fMRI) experiment. Tools, which can be considered a stimulus category that affords grasping, elicited stronger activations in the left premotor and posterior parietal cortices than other stimulus types. These regions are associated with moving one's hand and with grasping, respectively. Remarkably, these activations occurred in the absence of any task, and were observed even though the tools were only presented in the form of photos on a screen. These results therefore indicate that the implication of the tools for action was processed independently of any intention or possibility to interact with them. More recently, Proverbio et al. (2011) confirmed these results in an event-related potentials (ERPs) study, in which passive viewing of tool pictures elicited an enhanced left-frontal negativity compared to non-tools, starting about 210 ms after picture onset. This activity was localized in motor regions, namely bilateral premotor cortex and left post-central gyrus using an ERP source reconstruction method. These lines of research, together with results from related studies (Creem-Regehr and Lee, 2005; Handy et al., 2006) provide evidence of a neural link between visual processing on the one hand and motor-related activations on the other in object affordance. This link has also been referred to as visuomotor response (Handy et al., 2003; Gallivan et al., 2011).

Handy et al. (2003) put forward the idea that one possible consequence of this previously described visuomotor response may be that more attention is allocated to affording objects. In their study, the authors presented line drawings of two objects (one graspable and one not) simultaneously, either left and right of fixation or above and below it. This design was referred to as object competition model by the authors, because it pits two objects against each other. A target was subsequently superimposed on one of the two objects, and participants were instructed to detect these targets as quickly as possible. ERP data indicated a sensory gain for graspable objects, with increased P1 amplitudes for targets superimposed on tools compared to non-tools, but only when the targets appeared on tools in the right or in the lower visual field. Target detection latencies in the reaction times were significantly decreased for tools in the lower hemifield. The authors therefore concluded that tools automatically attract attention when they appear at locations important for grasping (Handy et al., 2003). However, in the latter study the temporal delay between the onset of the object pictures and the targets to which reaction times and ERPs were measured was at least 650 ms long. Evidence suggests that selective visual attention is allocated much faster, namely within the first 200 ms of processing (Wykowska and Schubö, 2012). Therefore, the possibility arises that in the study by Handy et al. (2003) processes other than the initial allocation of selective visual attention were captured. For example, the attentional interpretation of the data is at odds with the absence of a significant reaction time advantage for target detection at the tool on the right, both in comparison to a non-tool at the same position, and with the tool-left condition. Therefore, further investigation using shorter delays between stimulus and target presentation is warranted.

In line with the findings by Handy et al. (2003), who reported that affording objects may attract attention as a function of their spatial location, some studies suggest that distance of an object from the observer could be a mediating factor in object affordance. More precisely, a tool may be more affording when it is close to the observer (peripersonal space) rather than located farther away and not immediately reachable (extrapersonal space). For example, Gallivan et al. (2009, 2011) reported that a region in superior parieto-occipital cortex (SPOC) reacts more strongly to objects immediately reachable with the right hand (for right-handers) compared to more distant objects or objects within immediate reach of the left hand but not the right, even during passive viewing. The authors suggest that SPOC encodes the potential of immediately acting on objects, which relates to their affordance (Gallivan et al., 2009). The role of object distance has also been corroborated in a TMS study by Cardellicchio et al. (2011), who reported significantly enhanced motor evoked potentials (MEPs) in participants' right hands when a graspable object (a cup) was presented in near space, compared to the presentation of a non-graspable object (a large cube) at the same location. However, in far space, this MEP difference was no longer significant, while the MEP elicited by the cup in near space was significantly greater than the MEP elicited by the cup in far space. In a similar vein, another study found significant congruency effects for pantomime movements in near space, depending on whether the handle orientation of a cup was congruent with the hand the participant was supposed to move. There were, however, no such congruency effects in far space (Costantini et al., 2010, 2011a). Furthermore, evidence shows that at the level of semantics, possible interactions with an object are triggered faster when these objects are immediately reachable than when not (Costantini et al., 2011b).
Thus, experimental findings suggest that object affordance and the visuomotor response triggered by an object may be modulated by its distance to the observer. What has not been investigated so far is whether attention deployment to a graspable object is also modulated by object distance.

## **RATIONALE OF THE STUDY**

We investigated attentional deployment to affording and non-affording objects as a function of their graspability and reachability. To this end, we applied a task similar to the one used by Handy et al. (2003) with two objects presented simultaneously and a probe subsequently appearing on one of them. In order to avoid physical imbalances induced by lateralized stimulus presentation, objects were presented on the vertical meridian. To create an impression of depth, a virtual three-dimensional space was designed, depicting a table in a room. Due to their arrangement at the front versus at the back of the table one of the objects appeared close, the other one far from the observer. We used luminance change of the whole object as probe in order to avoid emphasizing a particular part of the object. A cup was used as an affording object and a cactus represented the non-affording category (see **Figure 1**).

In line with the procedure of previous studies (Gallivan et al., 2009, 2011) and in order to strengthen the motor association with the cup, grasp trials in which participants interacted with a real cup were included in the experiment. The cup on which participants performed the grasp trials was identical to the one presented on screen, such that haptic experience could be transferred to the cup on screen. Recent research has outlined the importance of haptic experience with objects, specifically when visual information is incomplete (Takahashi and Watt, 2012). This was also the case in our experiment, because even though the stimuli presented on screen were designed to appear as realistic as possible, they cannot achieve the level of visual completeness of a real cup. Furthermore, responding to graspable objects with button presses is a rather arbitrary action and may interfere with lifelong experience of interacting with objects (Handy et al., 2003; Valyear et al., 2012). Thus, interspersing the probe detection task with trials in which a natural interaction with the object is performed constantly reminds participants of the cup's real function.

Probe detection has been repeatedly used as a means to measure the allocation of visuospatial attention. The underlying idea is that probe detection is speeded up when the probe appears at a currently attended location (Posner, 1980; Humphreys et al., 2004; Wykowska and Schubö, 2010). As discussed above, the delay between stimulus presentation and probe appearance in Handy et al. (2003) may be too long to tap initial allocation of visual attention. Therefore, it is unclear what the reported reaction time advantage for tools presented in the lower or right visual hemifield may actually have reflected. Building upon previous evidence (Wykowska and Schubö, 2012), we assumed that a 200 ms delay between stimulus and probe onset would be suitable to test initial allocation of attention in the present experiment. If affording objects in peripersonal space selectively attract initial attention, we would expect reaction times for probes appearing at the cup in near space to be faster than for the cactus at the same position. Furthermore, probe detection at the cup in near space should be faster than at the same object in far space. These predictions were tested and confirmed in Experiment 1. In a second experiment, we additionally varied immediate graspability by means of handle orientation. Based on studies relating an object's grasp affordance to its immediate reach- and graspability with the right hand in right-handers (Buccino et al., 2009; Gallivan et al., 2009, 2011), the near cup with its handle to the left should not attract more attention than the cactus at the same position, and there should be no significant difference between probe detection at the near and far cup position with handles facing to the left.

**FIGURE 1 |** ... detection task. Two objects appeared on the table on screen, one of which changed its luminance 200 ms after stimulus onset. The two left displays represent the identical-objects conditions, the two right displays show the different-objects conditions.

## **EXPERIMENT 1**

In Experiment 1 we sought to determine whether the affording object (a cup) located in near space attracts more visuospatial attention than (1) the same object presented in far space, and (2) the non-affording object (a cactus) in near space. This was measured indirectly with a probe detection task, interspersed with grasp trials.

## **MATERIALS AND METHODS**

## *Participants*

Thirty-nine students from the Philipps University Marburg participated in the experiment for course credit. Data sets from four participants were discarded, one due to technical problems during the measurement, one because of too many errors (11% of the trials, which is four standard deviations above the group mean), and two because of very slow responses (reaction time more than two standard deviations above the overall reaction time mean of all participants). The remaining 35 participants (23 female) had a mean age of 22.91 years (*SD* = 2.70) and were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971). They reported normal or corrected-to-normal vision and performed normally in a color vision test (Ishihara, 1917). Participants gave written informed consent before the experiment. All experimental procedures were in accord with the Declaration of Helsinki.

## *Stimuli and apparatus*

A 22-in. computer screen with a refresh rate of 100 Hz was used for stimulus presentation. The screen was placed at a distance of 70 cm from the observer and its height was individually adjusted such that fixation was exactly at eye level for each participant. A button box was placed centrally in front of the participants, with the left and right buttons of the box under their respective index fingers. Behind the box, a pedestal with a height of 14 cm was positioned and a wooden, custom-made cup was placed on a marked position at the middle of it, with its handle facing to the right. Stimulus delivery and experimental timing were controlled with Presentation Software (Neurobehavioral Systems, Albany, CA, USA).

The background for all stimuli was a colored virtual three-dimensional room with a table in it (**Figure 1**), which extended across the entire screen. Vanishing point perspective was used to create an impression of depth. A black dot surrounded by a black circle was placed on the table as fixation spot. Two different objects could appear on the table during the experiment: a cup and a cactus. The cup consisted of a photograph of the wooden cup that was on the pedestal during the experiment. The cactus matched size and shape of the cup and was adjusted for its mean luminance value. The objects were 7.5 cm wide and 7.2 cm high when presented on screen. Two objects were always shown simultaneously on the table, one virtually near, the other virtually far from the observer. The near and far objects started at a distance of 2.8° below and above the center of the fixation spot, respectively. They subtended a horizontal viewing angle of 2.5° to the left and 4.1° to the right. Note that the stronger extension toward the right was caused by the handle orientation to that side. Due to the identical physical size of the near and the far object, the latter was subjectively larger than the former.

There were four different display types (**Figure 1B**). Two of them contained identical objects (either two cactuses or two cups); the other two displays contained two different objects (one cup and one cactus). The go stimulus for grasp trials consisted of a single cup placed on the virtual table, subtending the same horizontal angles as in the two-objects presentation, but vertically placed exactly halfway in between the near and far positions. Stimulus creation and manipulation was realized using Gimp Version 2.8.

Grasping movements of the participants were recorded with a Polhemus Liberty electromagnetic motion tracker at a sampling rate of 240 Hz providing X, Y, and Z coordinates of each sensor in space. Motion sensors were attached to the right wrist and to the thumb and index finger of the right hand using adhesive tape. Motion data recording was controlled with custom Matlab scripts (Mathworks, Natick, MA, USA) and interfaced with Presentation software using the Matlab Workspace Extension implemented in Presentation.

## *Procedure*

Each probe detection trial started with a fixation display for 1000 ms, during which the background stimulus, that is, the table in the virtual room, was visible to avoid abrupt visual changes upon the subsequent appearance of stimuli. Two objects then appeared on the table for 200 ms. Then, a probe (luminance change of the whole object) appeared on one of them for 100 ms, and subsequently the object returned to its original luminance. The other object did not change. Participants were instructed to press the corresponding button as quickly as possible in order to indicate the location of the luminance probe. Upon the participant's reaction or after a maximum of 2000 ms in case no response was registered, an empty gray screen was presented for a random duration between 1000 and 1500 ms (inter-trial interval) before the next trial started. Assignment of left and right buttons to the object positions was counterbalanced among participants. Thus, there were eight different experimental conditions in the probe detection task, as shown from left to right in **Figure 1B**: cactus/cactus, cup/cup, cactus/cup, and cup/cactus, each of these in a probe-near and a probe-far version.

Grasp trials also started with 1000 ms fixation depicting the table. Upon the presentation of the go stimulus participants were to move their right hand toward the actual cup, grasp it by its handle, lift it in order to match the height of the cup on screen, and then put it back on the pedestal. The experimenter observed the movement and pressed a button as soon as the participants had returned with their right index finger to the right key of the button box. This initiated the inter-trial interval as described above, and then the next trial began.

The experiment consisted of 660 trials in total, among which 440 were probe detection trials and 220 were grasp trials. Trials were divided into 11 blocks of 60 trials each (40 probe detection, 20 grasp). After each block, participants received feedback about their mean reaction times and number of errors in the probe detection task during the last block. Each of the four stimulus displays appeared 110 times during the experiment, with half of the probes presented at the near and far positions, respectively. Stimulus sequence was randomized throughout the experiment and differed for each participant.
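The trial counts above can be cross-checked with a minimal Python sketch. It assumes, as one plausible reading, that probe trials are shuffled across the whole experiment and then dealt into blocks together with the grasp trials; the text does not describe how randomization was constrained within blocks, and all names (`build_trial_list`, `display_counts`) are hypothetical.

```python
import random
from collections import Counter

DISPLAYS = ["cactus/cactus", "cup/cup", "cactus/cup", "cup/cactus"]
PROBE_LOCATIONS = ["near", "far"]
N_BLOCKS = 11
PROBE_PER_BLOCK = 40
GRASP_PER_BLOCK = 20
REPS_PER_CONDITION = 55  # 110 presentations per display, half near and half far


def build_trial_list(seed):
    """Return a list of blocks; each block is a shuffled list of trial dicts.

    Assumption: the blocking scheme is illustrative, not the authors' method.
    """
    rng = random.Random(seed)
    probe_trials = [
        {"task": "probe", "display": d, "probe": loc}
        for d in DISPLAYS
        for loc in PROBE_LOCATIONS
        for _ in range(REPS_PER_CONDITION)
    ]
    rng.shuffle(probe_trials)
    blocks = []
    for b in range(N_BLOCKS):
        block = probe_trials[b * PROBE_PER_BLOCK:(b + 1) * PROBE_PER_BLOCK]
        block += [{"task": "grasp"} for _ in range(GRASP_PER_BLOCK)]
        rng.shuffle(block)
        blocks.append(block)
    return blocks


def display_counts(blocks):
    """Count how often each display occurs across all probe detection trials."""
    return Counter(
        t["display"] for block in blocks for t in block if t["task"] == "probe"
    )
```

With these constants, 8 conditions × 55 repetitions give the 440 probe detection trials, and 11 blocks × 60 trials give the 660 trials stated in the text.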

### *Data analysis*

*Probe detection data.* Trials with false or missing responses were excluded. Reaction times were computed from the onset of the probe until the button press was registered. For the purpose of outlier correction the experiment was divided into three parts to account for potentially slower reaction times in the first blocks or fatigue at the end of the experiment. The parts comprised the following blocks: part 1: blocks 1–3, part 2: blocks 4–7, and part 3: blocks 8–11. Mean reaction times were calculated for each participant, condition, and part. Trials with reaction times deviating by more than 2 standard deviations from these individual means were excluded from the analysis (5% of trials on average).
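The outlier correction described above could be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the trial dictionary keys are hypothetical, and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def part_of_block(block):
    """Map block number (1-11) to the three parts used for outlier correction."""
    if block <= 3:
        return 1
    if block <= 7:
        return 2
    return 3


def exclude_outliers(trials, n_sd=2.0):
    """Keep correct trials whose RT lies within n_sd SDs of the cell mean.

    trials: list of dicts with the (hypothetical) keys participant,
    condition, block, rt, correct. A cell is one participant x condition x
    part combination.
    """
    cells = defaultdict(list)
    for t in trials:
        if t["correct"]:  # false or missing responses are dropped first
            key = (t["participant"], t["condition"], part_of_block(t["block"]))
            cells[key].append(t)
    kept = []
    for cell_trials in cells.values():
        rts = [t["rt"] for t in cell_trials]
        if len(rts) < 2:  # no SD can be estimated from a single trial
            kept.extend(cell_trials)
            continue
        m, sd = mean(rts), stdev(rts)
        kept.extend(t for t in cell_trials if abs(t["rt"] - m) <= n_sd * sd)
    return kept
```

Computing the criterion within each part means that a trial is judged only against reaction times from the same phase of the experiment, which is the stated purpose of the three-part split.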

After outlier correction, mean reaction times were computed separately for each subject in each of the eight conditions. Three factors were of interest in the current experiment: (1) *probe location*, that is, whether the probe appeared at the near or at the far position, (2) *object type*, that is, whether the probe appeared on the cup or on the cactus, and (3) *identity*, that is, whether two identical or two different objects (cup and cactus) were presented in a current trial. The inclusion of the latter factor allowed us to elucidate whether the context in which an object is presented also affects probe detection. A 2 × 2 × 2 repeated-measures analysis of variance (ANOVA) with these within-subjects factors (*identity* – identical vs. different, *object type* – cup vs. cactus, and *probe location* – near vs. far) was computed on the reaction times. Error rates were very low in the present experiment (*M* = 2.36%, *SD* = 1.59) and were therefore not analyzed.
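Because every factor in this design has two levels, each main effect and interaction has one degree of freedom, and its *F* value equals the squared one-sample *t* statistic of a per-participant contrast score. The following Python sketch illustrates that equivalence; it is not the software used in the study, and the condition keys and function names are hypothetical.

```python
from statistics import mean, variance

# Hypothetical condition keys: (identity, object_type, probe_location)
LEVELS = {
    "identity": ("identical", "different"),
    "object_type": ("cup", "cactus"),
    "probe_location": ("near", "far"),
}
FACTORS = ("identity", "object_type", "probe_location")


def contrast_weight(condition, effect):
    """+1/-1 weight of one condition for a main effect or interaction."""
    w = 1
    for factor in effect:
        level = condition[FACTORS.index(factor)]
        w *= 1 if level == LEVELS[factor][0] else -1
    return w


def within_effect_f(cell_means, effect):
    """F(1, n-1) for an effect in a fully within-subjects 2x2x2 design.

    cell_means: list (one entry per participant) of dicts mapping the
    condition tuple to that participant's mean RT.
    effect: tuple of factor names, e.g. ("object_type", "probe_location").
    """
    scores = [
        sum(contrast_weight(cond, effect) * rt for cond, rt in subj.items())
        for subj in cell_means
    ]
    n = len(scores)
    # Squared one-sample t statistic of the contrast scores against zero
    f_value = n * mean(scores) ** 2 / variance(scores)
    return f_value, (1, n - 1)
```

For example, `within_effect_f(cell_means, ("identity", "object_type", "probe_location"))` would give the three-way interaction; with 35 participants the error degrees of freedom are 34.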

*Motion tracking data.* Due to technical problems, no motion data could be acquired for three participants. Therefore, only 32 participants were included in this analysis. Motion tracking data were processed with custom Matlab scripts. Movement onset was calculated separately for each trial as latency from the presentation onset of the go stimulus (a single, centrally presented cup) to the moment when the velocity of the wrist sensor exceeded 10 m/s and the index finger was at least 1 mm away from its starting position. Duration of the hand movement toward the cup was measured as the duration between movement onset and the point in time at which wrist velocity dropped below 10 m/s and the index finger was at least 20 cm away from its individual starting position. These criteria had to be fulfilled for a minimum of 10 consecutive sample points. Trials in which movement onset time was below 100 or above 1500 ms were excluded from further analysis. Movement durations below 200 ms or above 2000 ms also led to the exclusion of affected trials.
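The onset criterion can be illustrated with a short Python sketch (the original analysis used custom Matlab scripts that are not reproduced here). The velocity and displacement thresholds are passed in as parameters and must be expressed in the same units as the position data; the function names and the convention of assigning each speed value to the later of its two samples are assumptions.

```python
import math

SAMPLING_RATE_HZ = 240   # sampling rate stated in the text
MIN_CONSECUTIVE = 10     # number of consecutive samples stated in the text


def speeds(positions, rate_hz=SAMPLING_RATE_HZ):
    """Speed between successive 3-D samples, in position units per second."""
    return [math.dist(a, b) * rate_hz for a, b in zip(positions, positions[1:])]


def movement_onset_index(wrist, finger, speed_threshold, min_displacement,
                         n_consecutive=MIN_CONSECUTIVE):
    """Index of the first sample of a run in which both criteria hold.

    wrist, finger: lists of (x, y, z) tuples sampled at the same times.
    Returns None if no run of n_consecutive qualifying samples exists.
    """
    wrist_speed = speeds(wrist)
    start = finger[0]
    run = 0
    for i, v in enumerate(wrist_speed):
        # wrist_speed[i] describes the step from sample i to sample i + 1
        moved = math.dist(finger[i + 1], start) >= min_displacement
        if v > speed_threshold and moved:
            run += 1
            if run == n_consecutive:
                return i + 1 - (n_consecutive - 1)
        else:
            run = 0
    return None
```

The onset latency in milliseconds then follows as `index / SAMPLING_RATE_HZ * 1000`, measured from the first recorded sample; aligning that sample with go-stimulus onset is left to the caller.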

In order to compute the motion trajectories, a fourth-order low pass Butterworth filter with a cutoff frequency of 40 Hz was applied to the data. The motion tracking data between movement onset and the end of the movement were subsequently time-normalized in terms of percentages, 0% reflecting movement onset and 100% reflecting the end of the movement, that is, the touch of the object.
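The time normalization step can be sketched in Python as below. Linear interpolation onto 101 points (0% to 100% in 1% steps) is an assumption, since the text does not state the resampling method or resolution, and the low-pass filtering step is not covered because it would require a signal-processing library.

```python
def time_normalize(samples, n_points=101):
    """Resample a 1-D trajectory to n_points equally spaced in normalized time.

    samples[0] corresponds to 0% (movement onset) and samples[-1] to 100%
    (end of the movement). Linear interpolation is used between neighbors.
    """
    if len(samples) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(samples) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)   # fractional index into samples
        lo = min(int(pos), last - 1)      # left neighbor, clamped at the end
        frac = pos - lo
        out.append(samples[lo] + frac * (samples[lo + 1] - samples[lo]))
    return out
```

Applying the function separately to the X, Y, and Z coordinates of a sensor puts trials of different durations on a common time axis, so that they can be averaged per participant.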

## **RESULTS**

## *Probe detection data*

**Table 1** and **Figure 2** provide an overview of all eight conditions. Probes which appeared in near space (*M* = 337.00 ms, *SD* = 36.53) were responded to faster than probes in far space (*M* = 349.75 ms, *SD* = 38.98), reflected in a significant main effect of probe location, *F*(1,34) = 7.236, *p* = 0.011, η<sup>2</sup> = 0.175. When the probe appeared on the cup (*M* = 337.86 ms, *SD* = 38.95), participants detected it more quickly than when it appeared on the cactus (*M* = 348.88 ms, *SD* = 36.84), as indicated by the significant main effect of object type, *F*(1,34) = 11.114, *p* = 0.002, η<sup>2</sup> = 0.246. A significant two-way interaction emerged between the factors identity and object type, *F*(1,34) = 10.929, *p* = 0.002, η<sup>2</sup> = 0.243. To follow up on this result, we computed separate ANOVAs for displays with identical objects and displays containing different objects. In the identical-objects condition, probes appearing on the cup (*M* = 340.26 ms, *SD* = 33.04) were not detected significantly faster than at the cactus (*M* = 341.73 ms, *SD* = 30.31; *F* < 1, *p* > 0.47). In the different-objects condition, that is, when one cup and one cactus were presented simultaneously, participants detected probes appearing on the cup (*M* = 335.47 ms, *SD* = 36.23) faster than on the cactus (*M* = 356.04 ms, *SD* = 32.65), independent of their location, *F*(1,34) = 12.309, *p* = 0.001, η<sup>2</sup> = 0.266. This reaction time increase for probes appearing at the cactus in the different-objects setting was also visible in the main effect of identity, *F*(1,34) = 11.990, *p* = 0.001, η<sup>2</sup> = 0.261 (identical: *M* = 341.00 ms, *SD* = 36.45, different: *M* = 345.75 ms, *SD* = 39.95).
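The text does not state which variant of η<sup>2</sup> is reported. As a hedged cross-check, the following Python sketch recovers partial eta squared from each *F* value and its degrees of freedom; the reported effect sizes appear to be numerically consistent with that variant. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F values and effect sizes as reported for Experiment 1, all with df = (1, 34)
reported = {
    "probe location": (7.236, 0.175),
    "object type": (11.114, 0.246),
    "identity x object type": (10.929, 0.243),
    "identity": (11.990, 0.261),
}

for effect, (f_value, eta) in reported.items():
    computed = partial_eta_squared(f_value, 1, 34)
    print(f"{effect}: computed {computed:.3f}, reported {eta:.3f}")
```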

Most importantly, the three-way interaction between identity, object type, and probe location was significant, *F*(1,34) = 23.560, *p* < 0.001, η<sup>2</sup> = 0.409. To test our hypothesis of a reaction time advantage for probe detection at the cup as compared to the cactus in near space, we computed an identity × object type ANOVA for near space probes. As hypothesized, probes appearing on the cup in near space were detected faster than probes appearing on the cactus at the same spatial position (see **Table 1**). This was reflected in a significant main effect of object type, *F*(1,34) = 16.115, *p* < 0.001, η<sup>2</sup> = 0.322. The identity × object type interaction was not significant in near space (*F* < 1, *p* > 0.37), indicating that probe detection was generally faster at the near cup compared to the near cactus, independent of whether another cup or a cactus was simultaneously present in far space. To test the second part of our hypothesis, namely whether probes at the near cup were detected more quickly than at the far cup, we computed an additional one-factorial ANOVA in which probe detection at the near cup was statistically compared to probe detection at the far cup, independent of identity. As predicted, probe detection at cups in near space (*M* = 329.79 ms, *SD* = 36.62) was faster than in far space (*M* = 345.94 ms, *SD* = 35.83), *F*(1,34) = 10.746, *p* = 0.002, η<sup>2</sup> = 0.240.

**Table 1 | Mean reaction times in all eight conditions, Experiment 1.**

*Note: the order of conditions in this table is the same as in Figure 2.*

In far space, the identity × object type ANOVA yielded a significant interaction between both factors, *F*(1,34) = 4.673, *p* = 0.038, η<sup>2</sup> = 0.12. In the identical-objects condition, probes appearing at the far cactus were reacted to faster than probes appearing at the far cup (see **Table 1**), *F*(1,34) = 4.673, *p* = 0.038, η<sup>2</sup> = 0.121. This pattern was reversed in the different-objects condition, with faster probe detection at the far cup than the far cactus, *F*(1,34) = 13.813, *p* = 0.001, η<sup>2</sup> = 0.289. Thus, there was no general probe detection advantage for cups in far space. The results furthermore indicate that responses were significantly slowed for probes appearing on the cactus at the far position when a cup was simultaneously present in near space.

To sum up, our results confirm a reaction time advantage for probe detection at the near cup, both compared to probe detection at the cactus in near space and at the cup in far space. Additionally, in the presence of the near cup probe detection times at the far cactus were significantly increased.

To explore whether button assignment influenced reaction times in terms of a facilitation of right-hand responses to near graspable stimuli (Costantini et al., 2011a), we conducted an additional ANOVA, which included the twofold between-subjects factor "button assignment." This analysis revealed only a marginally significant probe location × button assignment interaction, *F*(1,33) = 3.831, *p* = 0.059, η<sup>2</sup> = 0.104. Therefore, faster right-hand responses to the near cup could not be statistically confirmed.

#### *Motion tracking data*

On average, < 1% (*SD* = 0.88) of grasp trials were discarded due to premature or very slow initiation of the movement. Furthermore, 1.68% (*SD* = 4.43) of grasp trials were excluded after applying the minimum and maximum movement duration criterion. Grasping movements were initiated on average 695.33 ms (*SD* = 79.44) after the onset of the go stimulus and had a mean duration of 752.17 ms (*SD* = 195.59). Average movement onset latency and duration were correlated with the reaction times in each of the eight probe detection conditions using Pearson correlations. This analysis did not yield any significant results (uncorrected *p*s > 0.06).
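A Pearson correlation of this kind, for example between each participant's mean movement onset latency and mean reaction time in one probe detection condition, could be computed as in the following Python sketch. The function names are hypothetical, and the *p* value is omitted because it requires the *t* distribution, which the Python standard library does not provide.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic with n - 2 degrees of freedom for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))
```

With two motor variables and eight conditions this amounts to 16 separate tests, which is why the text labels the *p* values as uncorrected.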

**Figure 3** shows the averaged motion trajectories of the index finger for each participant. As can be seen, the motion tracking data confirm that participants performed the grasping task in an appropriate manner.

#### **DISCUSSION**

In the present experiment we aimed to investigate how object affordance influences the allocation of visual attention as a function of object distance. In line with our predictions, probes appearing at an affording object (a cup) were reacted to faster in near space than in far space. Furthermore, this reaction time advantage at reachable distance was only observed for probes on the cup, but not for probes on a physically matched non-affording object (a cactus).

**Figure 3 | Averaged motion trajectories of the index finger for each participant (single lines depict single participants). (A)** Trajectories in Experiment 1, plotted into the experimental setup. Participants started their movements from the right key of the button box and moved toward the cup which was on the pedestal (big gray shade). **(B)** The diagram shows the same finger trajectories in a two-dimensional coordinate system (X- and Y-coordinates), 0 represents the starting point of the grasping movement. **(C)** Finger trajectories in Experiment 2.

We interpret this reaction time advantage as indirect evidence that attention was preferentially allocated to the near cup, tracing back to the idea that a probe is detected faster when it appears at a currently attended location (Posner, 1980; Humphreys et al., 2004). The point in time at which we presented the probe (200 ms after stimulus onset) is thought to tap initial allocation of selective visual attention, which can be measured in the N2pc component of the ERP (Luck and Hillyard, 1994; Eimer, 1996). We therefore consider the reaction time advantage to probes at the near cup an indirect reflection of initial attention deployment to the near cup. Thus, our data are in line with the idea that graspable objects preferentially attract attention when they appear at a spatial location important for grasping (Handy et al., 2003). Moreover, our results fit previous reports in the literature showing a near-space preference for graspable objects (Gallivan et al., 2009, 2011; Cardellicchio et al., 2011; Costantini et al., 2011a,b).

## **EXPERIMENT 2**

The data from Experiment 1 are in line with the idea that an affording object in near space might result in an attention bias toward that object; however, the results cannot tell apart whether the near-space preference for the cup was due to its affordance *per se* (as opposed to the clearly non-affording cactus) or due to its arrangement in a graspable position, with a right-facing handle.

Behavioral studies indicate that the way in which handled objects are arranged visually influences motor reactions. Right-hand responses are executed faster when an object is presented with its handle facing to the right and vice versa, even when handle orientation is completely task-irrelevant (Tucker and Ellis, 1998; Costantini et al., 2010, 2011a; Goslin et al., 2012). This pattern is interpreted in terms of an activation of the hand which would be used for actually grasping the object, and it has been referred to as affordance effect (Riggio et al., 2008) or spatial alignment effect (Costantini et al., 2010, 2011a) in the literature. Even though these effects are not always easy to disentangle from spatial compatibility effects (Simon and Rudell, 1967), studies have indicated that spatial compatibility effects and affordance-related reaction time effects may be at least partly dissociable (Pellicano et al., 2010; Cho and Proctor, 2013). In contrast to the evidence on handle-hand correspondence, other studies suggest that right-handers prefer objects ready to the right hand, such that they are immediately graspable with their preferred hand (Buccino et al., 2009; Gallivan et al., 2009, 2011). Importantly, the visuomotor response to objects that are immediately graspable with the right hand is enhanced compared to objects graspable with the left hand (Gallivan et al., 2009, 2011). Thus, if the visuomotor response leads to an attention bias as proposed by Handy et al. (2003), a manipulation of handle orientation in near space would influence probe detection performance.

To shed more light on this issue, Experiment 2 included objects with handles to the left. A cup with a left-facing handle at the near position would not appear immediately graspable to a right-hander. We therefore hypothesized that in the case of left-facing handles, probes appearing on the near cup would not be detected faster than probes on the far cup or on the near cactus.

## **MATERIALS AND METHODS**

#### *Participants*

Twenty-six students from the Philipps University Marburg participated in the second experiment for course credit. Data sets from three participants were discarded, two of them due to technical problems during the measurement. Another participant reported problems with three-dimensional vision and was therefore also excluded. The remaining 23 participants (19 female) had a mean age of 22.35 years (*SD* = 3.59) and were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971). They reported normal or corrected-to-normal vision and performed normally in a color vision test (Ishihara, 1917). Participants gave written informed consent before the experiment. All experimental procedures were in accord with the Declaration of Helsinki.

#### *Stimuli and apparatus*

The experimental setup was identical to Experiment 1. The same held true for the stimuli; however, the four stimulus displays for the probe detection task used in Experiment 1 (see **Figure 1B**) were now also presented with the handles of both objects oriented to the left, subtending a horizontal viewing angle of 2.5° to the right and 4.1° to the left.

#### *Procedure*

The inter-trial interval was shortened to a random duration between 250 and 750 ms in order to include more trials. Because of the additional variation of handle orientation in probe detection trials, we reduced the number of identical-objects trials to a total of 176, equally distributed across cactus and cup and the two handle orientations. Each of the four different-objects displays (cup near-cactus far, cactus near-cup far in both orientations) was presented 110 times throughout the experiment. The experiment thus consisted of 836 trials in total, of which 616 were probe detection trials and 220 were grasp trials. Trials were divided into 11 blocks of 76 trials each (56 probe detection, 20 grasp). After each block, participants received feedback about their mean reaction times and number of errors in the probe detection task during the last block.

Apart from these changes, the procedure was identical to Experiment 1.

## *Data analysis*

Probe detection data were analyzed as in Experiment 1. The exclusion of reaction time outliers in the probe detection task affected 5.1% of trials on average (*SD* = 1.83). For data analysis, three factors were of interest in the current experiment: (1) *probe location*, that is, whether the probe appeared at the near or at the far position, (2) *object type*, that is, whether the probe appeared on the cup or on the cactus, and (3) *orientation*, that is, whether the handles were oriented to the left or right. A 2 × 2 × 2 repeated-measures ANOVA with these within-subjects factors was computed on reaction times. Only trials with different-stimulus displays were used for analysis in the present experiment, due to the low number of identical-stimulus displays. Error rates were again very low (*M* = 3.4%, *SD* = 2.12) and were therefore not analyzed.

Motion data analysis was identical to Experiment 1. The motion tracker failed to record data in the case of one participant, and another participant exceeded the maximum movement duration in 63% of the trials and was therefore excluded from this analysis. Thus, the movement data were calculated on a set of 21 participants.

#### **RESULTS**

#### *Probe detection data*

**Table 2** and **Figure 4** provide an overview of all eight conditions in Experiment 2. First and foremost, the three-way interaction (object type × probe location × orientation) was significant, *F*(1,22) = 6.737, *p* = 0.017, η<sup>2</sup> = 0.234. We therefore tested the object type × probe location interaction separately for the two handle orientations.

For handles oriented to the right, the pattern reported in Experiment 1 could be replicated, with a significant object type × probe location interaction, *F*(1,22) = 4.339, *p* = 0.049, η<sup>2</sup> = 0.165. Probes which appeared on the near cup were detected faster than on the near cactus, *t*(22) = 3.572, *p* = 0.001, and faster than on the far cup, *t*(22) = 1.942, *p* = 0.033. In addition, reactions to probes on the far cactus were significantly slower than reactions to probes on the far cup, *t*(22) = 5.99, *p* < 0.001, which replicates the reaction time increase also observed in Experiment 1. For handles oriented to the left, the object type × probe location interaction was not significant (*p* = 0.35).

The three-way ANOVA also revealed a significant probe location × orientation interaction, *F*(1,22) = 13.222, *p* = 0.001, η<sup>2</sup> = 0.375. Near probes were generally detected faster than far probes when handles were oriented to the right (near: *M* = 335.81 ms, *SD* = 36.08; far: *M* = 357.23 ms, *SD* = 50.27), *F*(1,22) = 7.807, *p* = 0.011, η<sup>2</sup> = 0.262; for left-oriented handles there was no significant main effect of probe location (*p* = 0.655). Thus, the data for right-oriented objects are in accord with Experiment 1.

Additionally, analysis of the object type × orientation interaction, *F*(1,22) = 78.736, *p* < 0.001, η<sup>2</sup> = 0.782, revealed that for objects oriented to the right, probes appearing on the cup were responded to faster than on the cactus (cup: *M* = 329.47 ms, *SD* = 38.93; cactus: *M* = 363.57 ms, *SD* = 45.21), *F*(1,22) = 32.898, *p* < 0.001, η<sup>2</sup> = 0.599, in line with results from Experiment 1. Descriptively, this also held true for left-oriented handles, but the effect was not significant (*p* = 0.169).

**Table 2 | Mean reaction times in all eight conditions, Experiment 2.**

*Note: the order of conditions in this table is the same as in Figure 4.*

The faster reaction times for probes appearing on the cup as compared to the cactus were also evident in a significant main effect of object type in the three-way ANOVA, *F*(1,22) = 13.269, *p* = 0.001, η<sup>2</sup> = 0.376. Furthermore, responses were generally faster in trials with left-oriented (*M* = 330.52 ms, *SD* = 35.46) as compared to right-oriented handles (*M* = 346.52 ms, *SD* = 39.71). This pattern was reflected in a significant main effect of orientation, *F*(1,22) = 63.122, *p* < 0.001, η<sup>2</sup> = 0.742.

We also explored whether button assignment influenced reaction times, in a way that participants who pressed the right button for near stimuli would be faster with cup handles to the right, as compared to participants with the left button assigned to near probes, who might experience reaction time facilitation by a left-facing cup handle (Costantini et al., 2011a). We conducted an additional ANOVA, which included the twofold between-subjects factor "button assignment." However, the object type × probe location × orientation × button assignment interaction was non-significant (*p* > 0.66).

## *Motion tracking data*

Outlier correction of the movement data led to the exclusion of 1.37% (*SD* = 2.42) of trials on average due to reaction time errors; 2.86% (*SD* = 4.02) of the grasp trials were discarded because they did not meet the inclusion criteria for the minimally and/or maximally admitted movement duration. Grasping movements were initiated on average 742.42 ms (*SD* = 94.18) after the onset of the go stimulus and had a mean duration of 841.70 ms (*SD* = 229.72). No significant correlations between these two motor variables and reaction times in the eight probe detection conditions were observed (uncorrected *p*s > 0.07).

The motion tracking data again confirm that participants performed the grasping task appropriately. The mean trajectories of each participant can be seen in **Figure 3C**.

## **GENERAL DISCUSSION**

Based on different studies indicating a stronger visuomotor response to affording objects in near as compared to far space (Gallivan et al., 2009, 2011; Costantini et al., 2010, 2011a,b; Cardellicchio et al., 2011), the present set of experiments used a probe detection task to investigate whether initial deployment of visual attention is stronger to graspable than non-graspable objects in near space, and whether such a difference also holds true for a graspable object in near as compared to the same object in far space. Such a pattern was in fact revealed in Experiment 1: probe detection was fastest when probes appeared at the cup in near space, which indicated that attention was preferentially allocated to the near, affording object. In the second experiment we could show that this reaction time advantage for probe detection at the near cup was no longer present with handles facing to the left.

As suggested by Handy et al. (2003), graspable objects which appear at locations important for grasping may draw attention even when they are task-irrelevant. This could happen because observers implicitly recognize an object's potential for action, thereby leading to an attentional bias toward that object. With the present set of experiments, we were able to extend previous findings by showing that such attention bias can also be observed as a function of object distance.

From our results, we can conclude that the attention bias induced by the near, graspable cup cannot be explained by attentional capture due to basic physical stimulus differences between our two object types. In fact, cup and cactus were matched for size, shape, luminance, and orientation, all of which are attributes that undoubtedly or very likely capture attention in a bottom-up fashion (Wolfe and Horowitz, 2004). The only basic attribute which might still work in such an attention-capturing fashion would be the object's color. However, two observations in our study allow us to rule out this possibility: on the one hand, if one color captured attention more than the other, this would also be evident in reaction times to far space probes; however, the results pattern was rather mixed in Experiment 1: neither the probes appearing on the cup nor those appearing on the cactus had a clear advantage in far space. Furthermore, in Experiment 2 there was no significant reaction time difference between probes on the cactus and probes on the cup even in near space, when handles were oriented to the left. This would provide additional evidence against the idea of bottom-up attentional capture by mere physical stimulus differences.

Instead, the pattern we reported in Experiment 2 clearly points to the role of immediate graspability of an affording object, such that only when the object is "ready to hand" an attentional bias toward it is induced (Handy et al., 2003; Handy and Tipper, 2007). This is in line with a study by Buccino et al. (2009), who reported that MEPs at the right hand were significantly higher for objects with an intact handle as compared to a broken one, but only when objects were oriented to the right. Thus, not only affordance *per se* (in terms of the affording cup compared to the non-affording cactus in our experiment) is crucial for recognizing action potentials, but also the possibility to immediately interact with an object. This possibility, in turn, appears to be influenced by object distance as well as handle orientation. Here, it is particularly interesting to consider the case of a patient with lesions in parietal cortex, who had problems recognizing the action affordance of an object when its handle was oriented away from him, but whose performance benefitted significantly when the handle orientation was adjusted such that the object appeared immediately graspable to him (Humphreys and Riddoch, 2001). Thus, in the light of these findings the present results are in accord with the idea that not only object characteristics *per se*, but rather their potential for immediate action may bias attention toward an affording object. However, even though the object type × probe location interaction was clearly non-significant for objects with left-oriented handles in Experiment 2, from a merely descriptive point of view, participants were fastest at responding to probes on the near cup in the handle-left condition as well (see **Figure 4**). Therefore, it is possible that the near cup, even though not immediately graspable, still may retain some of its behavioral relevance because it is close to the observer.

In both experiments, participants responded exceptionally slowly to probes at the cactus in far space while a cup was simultaneously present at the near position. We had initially not predicted such an effect; however, it would also be in line with an attentional account of the present data. The reaction time increase in the latter condition can be interpreted in terms of a strong attention bias toward the near cup, and subsequently increased costs of shifting the focus to the far cactus. Such a strong reaction time increase was not observed when participants had to react to the far cup while there was also a cup at the near position (Experiment 1). It appears that engagement of attention (Posner and Petersen, 1990) by the near cup was comparable in identical-objects and different-objects contexts, reflected in almost identical mean reaction times in both conditions. However, when the probe was presented at the far location, the reaction time difference between near and far was 19 ms in the identical-objects condition containing two cups, contrasted with 37 ms in the different-objects condition with the cup at the near position and the cactus at the far one. According to Desimone and Duncan (1995), objects in our environment compete for selective visual attention, a process which may be biased, among others, by their behavioral relevance. In this vein, the far cup would still have some behavioral relevance, but the cactus would not. This, in turn, would increase the competition between cup and cactus specifically when the cup appears immediately graspable. In line with this interpretation, in the display with two cactuses probe detection latencies were highly comparable for near and far, suggesting that no attention bias was present. The same is true for probe detection at the far cactus with handles oriented to the left. Due to the apparent lack of immediate behavioral relevance of the left-handled cup in near space, no attention bias was induced toward it, and therefore no reaction time increase could be observed.

The selective reaction time advantage for probes at right-oriented cups in near space allows us to rule out a general lower visual field preference as explanation for our results. In the present set of experiments, near space was always located below fixation while far space was located above. The reduction of reaction times to the cup in near space is therefore also in line with research supporting a lower visual field preference for grasping (Rossit et al., 2013), and it makes perfect sense that the cup advantage disappeared when it was oriented to the left and therefore not immediately graspable. Thus, our results support enhanced processing of immediately graspable objects at a location important for grasping, namely the lower visual field (Handy et al., 2003). However, even though the factors of distance on the one hand and upper/lower visual field on the other cannot be disentangled in the present experiment, they seem to be partly independent of each other. For example, enhanced activation in the SPOC during passive viewing of graspable objects in near space compared to far is also observed with all objects located below fixation (Gallivan et al., 2009, 2011). Research using methods with a higher temporal resolution than fMRI is needed to gain more insight into the mechanisms triggered by object distance and graspability on the one hand, and upper versus lower visual field on the other.

One might argue that the probe detection advantage at the cup could be due to a more frequent appearance of the cup on screen as compared to the cactus, because it was also presented on grasp trials. We acknowledge that the more frequent presentation of the cup during the experiment may cause higher familiarity with the cup than the cactus. It is also reasonable to assume that cups are generally more familiar to participants than cactuses due to everyday experience. However, we do not consider familiarity a likely explanation for our results, because there was no overall reaction time advantage for the cup, which would be expected from the familiarity interpretation. In Experiment 1, faster reactions to the cup were corroborated in near space, but in far space the pattern was not that clear. In Experiment 2, no significant reaction time difference between these two objects emerged when handles were oriented to the left. Furthermore, research suggests that high familiarity or motor experience with an object may in fact reduce the visuomotor response to it (Handy et al., 2006).

The failure to find a significant interaction with button assignment in our data seems to be at odds with findings from Costantini et al. (2010, 2011a). These authors reported that right-hand responses to a cup with its handle facing to the right were executed faster than left-hand responses, but only in near space. With handles facing left, this pattern was reversed. Thus, in the present experiment those participants who pressed the right button for probes appearing on right-oriented cups in near space should have had a reaction time advantage in this condition, compared to participants who pressed the left button for near space probes, who would be faster with cup handles facing to the left. Our data did not support such a pattern, suggesting that a near object with a right-facing handle does not necessarily facilitate right-hand responses and vice versa. On the one hand, this may depend on the action which is executed. In the present study, participants pressed a button while Costantini et al. (2011a) had their participants perform pantomime movements. Furthermore, we varied button assignment as a function of distance, but not handle orientation in the present experiment. Therefore, it is not possible to directly compare reactions to different handle orientations within-subjects considering only near-space objects. Moreover, while several studies report that handle orientation facilitates responses with the corresponding hand, including button presses (Tucker and Ellis, 1998; Buccino et al., 2009; Goslin et al., 2012), the TMS study by Cardellicchio et al. (2011) showed generally enhanced MEPs at the right hand for near, graspable stimuli independent of handle orientation. To sum up, the evidence on handle-hand correspondence is somewhat equivocal; however, this also shows that object affordance might in fact be much more than just spatial compatibility.

In Experiment 2 we observed a main effect of handle orientation, which was characterized by generally faster probe detection when handles were oriented toward the left as compared to the right. One possible explanation for this observation is that due to the lack of action relevance of objects with handles oriented to the left, no attention bias toward the cup could be induced and thus the object competition model adapted in our experiments (Handy et al., 2003) would not result in competition between near and far objects (Desimone and Duncan, 1995). Another reason might be interference between the go cue for grasping (a single, central cup with its handle oriented to the right) and probe detection on right-oriented objects. In fact, one participant reported after the experiment that she had experienced the condition with handles toward the left as easier than responding to right-oriented objects because of their similarity with the go stimulus. Even though many other participants reported afterward that they were not aware of the variation of handle direction, such subtle response interference might still be present in the data.

In sum, our results fit the literature showing a near-space advantage for graspable over non-graspable objects, both when comparing an affording object to a clearly non-affording one, and also when the immediate graspability of an affording object is manipulated. Specifically, we could show that this near-space advantage for graspable objects goes along with an attention bias toward that object. Therefore, the data are in line with the idea of differential attention allocation to objects depending on their potential for action (Handy et al., 2003). However, in order to test this attentional account of the present data more directly, ERP studies are needed to gain more insight into the processes triggered by the objects and their respective positions in space.

#### **ACKNOWLEDGMENTS**

The authors would like to thank Mateja Lasnik, Alan Zemljic, and Ruben Brandhofer for help in data acquisition and the reviewers for helpful comments on an earlier version of this manuscript.

#### **REFERENCES**


Ishihara, S. (1917). *Test for Colour Blindness* (Tokyo: Kanehara Trading Inc.).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 September 2013; accepted: 16 January 2014; published online: 06 February 2014.*

*Citation: Garrido-Vásquez P and Schubö A (2014) Modulation of visual attention by object affordance. Front. Psychol. 5:59. doi: 10.3389/fpsyg.2014.00059*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Garrido-Vásquez and Schubö. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Context-dependent changes in tactile perception during movement execution

## *Georgiana Juravle1,2\*, Francis McGlone3 and Charles Spence1*

*<sup>1</sup> Crossmodal Research Laboratory, Department of Experimental Psychology, Oxford University, Oxford, UK*

*<sup>2</sup> Department of Systems Neuroscience, Center for Experimental Medicine, University Medical Center Hamburg-Eppendorf, Hamburg, Germany*

*<sup>3</sup> Faculty of Science, School of Natural Sciences and Psychology, Liverpool John Moores University, Liverpool, UK*

#### *Edited by:*

*Jochen Müsseler, RWTH Aachen University, Germany*

#### *Reviewed by:*

*Arnaud Boutin, Leibniz Research Centre for Working Environment and Human Factors (IfADo), Germany*

*Martin Grunwald, University Leipzig, Germany*

#### *\*Correspondence:*

*Georgiana Juravle, Department of Systems Neuroscience, Center for Experimental Medicine, University Medical Center Hamburg-Eppendorf, Martinistr. 52, Building West 34 (W34), Room 320b, 20251 Hamburg, Germany e-mail: g.juravle@uke.de*

Tactile perception is inhibited during movement execution, a phenomenon known as *tactile suppression*. Here, we investigated whether the type of movement determines whether or not this form of sensory suppression occurs. Participants performed simple reaching or exploratory movements. Tactile discrimination thresholds were calculated for vibratory stimuli delivered to participants' wrists while executing the movement, and while at rest (a *tactile discrimination task*, TD). We also measured discrimination performance in a same vs. different task for the explored materials during the execution of the different movements (a *surface discrimination task*, SD). The TD and SD tasks could either be performed singly or together, both under active movement and passive conditions. Consistent with previous results, tactile thresholds measured at rest were significantly lower than those measured during both active movement and passive touch (that is, tactile suppression was observed). Moreover, SD performance was significantly better under conditions of single-tasking, active movements, as well as exploratory movements, as compared to conditions of dual-tasking, passive movements, and reaching movements, respectively. Therefore, the present results demonstrate that when active hand movements are made with the purpose of gaining information about the surface properties of different materials an enhanced perceptual performance is observed. As such, it would appear that tactile suppression occurs for irrelevant tactual features during both reaching and exploratory movements, but not for those task-relevant features that result from action execution during tactile exploration. Taken together, then, these results support a context-dependent modulation of tactile suppression during movement execution.

#### **Keywords: tactile perception, reaching, exploration, active/passive movement, dual-tasking**

## **INTRODUCTION**

In order to achieve our goals in everyday life, we constantly move and interact with the environment; that is, we frequently perform goal-directed actions. By using simple detection and discrimination paradigms, researchers have provided evidence to suggest that tactile perception changes over the execution phase of goal-directed movements: Tactile sensitivity declines significantly over the execution phase of a movement (Buckingham et al., 2010; Gallace et al., 2010; Juravle et al., 2010; Juravle and Spence, 2011), while tactile stimuli are detected more rapidly (Juravle et al., 2011). Such findings suggest that two psychologically-grounded mechanisms (one of *attentional* facilitation and the other of *suppression*) may work in parallel over the execution phase of a planned movement. The facilitatory attentional effect may help an organism to detect and promptly respond to the incoming sensory novelty, whereas sensory suppression might help an organism to filter out those inputs that are deemed irrelevant to the current task.

That said, the experimental results that have been published to date can be criticized for not taking into account the relevance of the tactile input to the organism's goals: If goal-directed reaching were shown to impair what is felt during the course of a movement, then exploratory movements could provide an answer to the question of whether or not the sensory information arriving at our mechanoreceptors is treated as being of little relevance as soon as we start to move. Alternatively, however, what is felt might be relevant for the goal-directed action and thus necessary to our overall interaction with the environment. To the best of our knowledge, there are no experimental accounts in the literature that have attempted to contrast the characteristics of tactile perception during the execution of reaching movements with the execution of exploratory movements.

The motivation behind the present study therefore relates to a simple paradox: If, when moving, tactile perception is impaired (Dhyre-Poulsen, 1978; Chapin and Woodward, 1982; Chapman, 1988; Cohen et al., 1994), then how can one account for a lack of tactile suppression over the course of an exploratory movement? For example, consider for a moment an ecologically valid task, such as a blind person reading Braille. This task normally involves specific, most often bi-manual, disjoint movements in order to extract useful information from the display (Hughes and Jansson, 1994). Nevertheless, under those conditions in which the participants are required to detect displacements in refreshable Braille displays, tactile suppression occurs (Ziat et al., 2010). Tactile suppression of displacement refers to an inability to detect tactual changes in a moving stimulus, when such changes appear while the fingers are no longer in contact with the specific stimulus, i.e., the tested Braille displays in the above example (Ziat et al., 2010; see also Keyson and Houtsma, 1995). Exploratory hand and arm movements are used in order to identify 3D objects, as well as to distinguish specific characteristics or features of objects in our proximal environment. In daily life, when tactile information is needed in order to achieve our goals, exploratory movements are typically amongst the first to be used. Note that the perceptual-motor process of actively exploring a 2D/3D object is commonly referred to as *haptic perception* (Gibson, 1962; Klatzky et al., 1985; Hatwell et al., 2003; Grünwald, 2008; Lederman and Klatzky, 2009).

Just imagine that you are about to buy a new cashmere sweater: Provided that all of the visual attributes are indistinguishable for two garments, when deciding on the quality of the clothing, you will often move your hands and *feel* the material between your fingers. The movement of one's fingers across the material's surface provides the necessary information with regard to its perceived quality. Such exploratory haptic/tactile behaviors are associated in market research with a *'need for touch'* that some customers exhibit when evaluating products that they may be interested in purchasing (Peck and Childers, 2003). Indeed, this general liking for haptic input has typically been documented when people interact with clothing, and with novel, or high-quality, products (see Spence and Gallace, 2011; Gallace and Spence, 2014, for reviews). Indeed, possibly mirroring the visual modality, exploratory movements have been metaphorically compared to *'windows through which the haptic system can be viewed'* (Lederman and Klatzky, 1987, p. 344).

The experiment reported here was designed to test whether the relevance of the tactile/haptic stimulus to a participant's goals modulates the degree to which sensory suppression is observed. For this, we introduced a movement task that is characteristic of haptic perception (i.e., an exploratory movement), together with the goal-directed reaches that have been used previously by researchers (e.g., Juravle et al., 2010). One of the possible implications of the evidence regarding the tactile suppression that occurs during the execution of goal-directed reaching movements (Juravle et al., 2010, 2011) is that touch may appear to be of little relevance to reaching. On the other hand, on a daily basis, exploratory movements are used with the aim of extracting and analyzing important features of the objects that we interact with. Therefore, by comparing the characteristics of tactile perception during the execution of exploratory and reaching movements, it was hypothesized that one could extract the *functional significance* of what is felt while moving.

During the experiment reported here, the participants were seated at a table with their hands and arms on top of the table surface. The participants performed a speeded goal-directed movement as a primary task, together with an unspeeded perceptual task, as a secondary task. The primary movement task involved either simple reaching or exploratory hand movements (i.e., the same as the reaching movements, with the only difference being that contact with the table surface was maintained). For the perceptual task, tactile discrimination thresholds were assessed for vibratory stimuli delivered to the participant's wrists while executing the movement and while at rest. Moreover, in another perceptual task, surface discrimination performance was measured (in a same vs. different task) for the materials covering the surface of the table, during the execution phase of the different movements (i.e., exploration and reaching). This performance measure was intended to evaluate the specificity of exploratory movements. For both perceptual tasks, tactile perception was specifically tested during the movement execution period, where sensory suppression effects have been reported previously (see Juravle et al., 2010, 2011). The vibratory discrimination and surface discrimination tasks could be performed either singly or together, both under active movement and under passive conditions (i.e., when no movement was required, but with the tactile stimulation delivered to the participant's skin by the experimenter, mimicking the surface contact specific to the reaching and exploratory movements).

For the tactile vibratory discrimination task (TD task), the first hypothesis predicted higher discrimination thresholds when participants attempted to report what they felt during the active execution of the movement (both reaching and exploration), as compared to the control rest condition. We hypothesized that if the acuity of a participant's tactile perception were to deteriorate during the course of movement execution, irrespective of the type of movement that is being executed, then no difference in TD task performance between the two active reaching and exploration movements would be observed. However, if exploration brings about an enhancement in what is felt, then tactile sensitivity should be higher for exploratory movements, rather than being diminished, during reaching movements.

The surface discrimination task (SD task) was conceived of as a task that would result in the best behavioral perceptual performance for exploratory movements. Therefore, a significant improvement in surface discriminatory performance was predicted during the execution phase of the exploratory movements, as compared to simple reaches. Moreover, significantly improved performance was expected for active hand movements (i.e., where the participants actively explored the table surface with the aim of gaining some information about its features), as opposed to the passive execution mode, which entailed no voluntary movement. Lastly, for both perceptual tasks, a significant deterioration in participants' performance was expected under conditions of dual-task, as opposed to single-task, performance.

## **METHODS**

## **PARTICIPANTS**

Eight participants (1 male, all right-handed by self-report) took part in this experiment. The mean age was 26 years (ranging from 21 to 29 years). All of the participants reported normal touch and normal hearing. The experimental session lasted for approximately 150 min and the participants received a £20 gift voucher in return for taking part. The experiment was conducted in accordance with the ethical guidelines laid down by the University of Oxford.

## **APPARATUS**

The participants were seated at a table (81 cm wide); the experimenter was seated/standing at the other end of the table. A rectangular piece of sponge-like material (24 cm long, 11 cm wide, and 2.5 cm high) was attached to the left side of the table in order for the participants to rest their left hand during the experiment (*left hand resting position*). On the participant's right side, a rectangular piece of wood (21.2 cm long, 17.2 cm wide, and 2.5 cm high) was attached to the table, together with an additional piece of spongy material (same physical measures as for the piece of wood) on top of it (*right hand resting position*). A rectangular object (6.5 cm long, 1.5 cm wide, and 3 cm high; *start position*) was positioned between the two resting positions for the left and right hand at the edge of the table nearest to the participant. The *goal position* was signaled with an identical object placed 19 cm in front of the right resting position. See **Figure 1** for a depiction of the experimental set-up.

On each trial, a board (handmade from painting board; 56.5 cm long, 40 cm wide, 0.6 cm high) covered in cling-film, tinfoil, or a combination of the two materials, was placed in front of the participant, between the two resting positions (see **Figure 2** for the different types of board that were used in the experiment). Tactors (VW32 skin stimulators, 1.6 × 2.4 cm vibrating surface, Audiological Engineering Corp., Somerville, MA, USA) were attached with tape to the ventral part of both of the participant's wrists and their wrists were then covered in several layers of thin sponge. The participants were blindfolded and wore closed-ear headphones (Beyer Dynamic DT 531) for the duration of each block of trials in the experiment. Two loudspeaker cones, one placed on either side of the table, delivered white noise throughout the experimental blocks. Depending on the task, the participants responded by means of two footpedals connected to the computer, as well as vocally; the latter response was entered into the computer by the experimenter.

## **PROCEDURE**

The experiment involved a speeded motor task (goal-directed reaching movement vs. exploratory movement) and two unspeeded perceptual tasks (tactile vibratory discrimination, TD, vs. surface discrimination, SD). The motor tasks were designed either as active movements of *only* the right hand, or as passive movements; the left hand was thus always at rest during the experimental trials.

## *Speeded motor task*

Prior to the start of each trial, the participants rested their arms at the resting positions. The experimenter ensured that the participants' hands were at the start position and instructed them by saying 'Ready' and pressing a key on the keyboard to initiate the trial. At the experimenter's signal, participants brought their hand to the start position (i.e., since they were blindfolded, they learned to feel the start position object positioned at the edge of the table with the side of their index finger). At the start position, the participants were instructed to place their hand over the surface of the board such that they would feel the board's surface with just their index, middle, ring, and little fingers. The thumb, as well as the palmar region of the hand, was not used during this experiment. 500–750 ms after the experimenter's vocal instruction, a short auditory signal was delivered over the headphones (50 ms, 800 Hz). This acted as the Go signal for participants to initiate their movement.

In the *speeded reaching movement condition*, the participants executed an outward movement with their right hand lying across the board's surface. As such, if, at the start of the movement, the participant's forearm was parallel to their torso, it formed an angle of approximately 90° with respect to their torso at the end of the movement. The reaching movement involved a 'jump' over the surface of the board, from the start position to the goal position. At the end of the movement, participants touched the object at the goal position with the side of their little finger. Once the goal position was reached, the participants brought their hand back to the right hand resting position (see **Figure 3** for a depiction of the trial timeline in the active and passive execution modes of the REACH movement conditions). There was also a control *rest* condition in which no movement of the participant's right hand was required. For this, the participants kept their right hand in the resting position and only performed the perceptual vibratory tactile discrimination task. In the *speeded exploratory movement* condition, at the go signal, participants executed the same outward movement as for the reaching movement, this time keeping contact with the surface of the board until the goal position was reached.

## *Unspeeded perceptual tasks*

Two types of perceptual tasks were used to test tactile perception. In the *tactile vibratory discrimination task (TD)*, shortly after the go signal, both of the tactors that the participants wore on their wrists were activated (250 Hz, 12 dB sensation level, 1000 ms). The participants made an unspeeded intensity discrimination response: That is, they had to compare the right hand pulse to the left hand pulse and decide whether the intensity of the former was stronger or weaker than that of the latter once they had completed the movement task (and returned their hand to the starting position). The participants were instructed to press one footpedal whenever a stronger pulse was presented to their right hand and the other footpedal whenever the right hand pulse was weaker. Response assignments for the left and right footpedals were counterbalanced across participants; see Juravle et al. (2010) for a detailed description of the TD task methodology. In the *surface discrimination task (SD)*, participants had to indicate whether they perceived a change in the material covering the board (by making a same vs. different response).

## *Single versus dual perceptual tasks*

Depending on the experimental condition, the two types of unspeeded perceptual tasks could either be performed separately or together. The participants were informed at the start of each block of trials whether it was a single or a dual task block. For the *single task* blocks, the participants only performed a single perceptual task for the entire duration of a block of trials. For example, they performed an exploratory movement on each trial and at the end they either pressed one pedal or the other in order to respond to the quality of the vibratory stimulation, or alternatively, they gave a vocal response with respect to the quality of the surface of the board. In the *dual task* blocks, the participants always performed the vibratory TD task as a first perceptual task. In randomly chosen trials, after they had made their pedal response for the first perceptual task, the experimenter requested their response to the second SD task. Once again, the participants responded vocally and the experimenter entered their response into the computer. Within an experimental dual task block, the participants did not know in advance which trials would require an additional perceptual response.

## *Active versus passive execution modes*

Two types of execution mode were used: active and passive. The active mode corresponds to the (active) speeded reaching or exploration movement executed by the participants, presented in Section *Speeded motor task*. In contrast, the passive execution mode was introduced as a means of mimicking the participant's movement, without the actual movement of the limb. In this respect, participants always kept their hands at the resting positions. At the start of each trial in the *passive exploration* condition, the experimenter placed the board underneath the participant's fingertips. At the Go signal, the experimenter touched their fingertips with the board and then slid the board at a constant speed along the surface of their fingertips. In the *passive reaching* condition, the experimenter used two smaller boards, one covered in cling film, and the other in tinfoil (see **Figure 2A**). The experimenter kept the boards with their lower corners superimposed in her left hand, such that they formed a V at an angle of about 45° (see **Figure 2B**). At the beginning of the trial, depending on the condition, the experimenter positioned one of the ends of the board underneath the participant's fingertips. At the Go signal, the experimenter gently touched the participant's fingertips with the prepared end, after which the experimenter moved the V, such that the other end could be positioned underneath and touch the fingers as well. This is an example of the procedure for a trial that required the use of different materials. When the same material was to be used for the passive reaching condition, the experimenter simply paused briefly after the first touch and then performed the second touch with the same material. Once the participants had made their response, the experiment progressed onto the next trial following a random inter-trial interval of 1500–2500 ms. The experimenter read the trial definition on the computer screen at the very beginning of each trial and changed the table board according to the up-coming condition.

## **DESIGN**

The experiment consisted of nine blocks of trials. Each block corresponded to an experimental condition. The manipulated variables were: type of movement (reaching vs. exploration), execution mode (active vs. passive), and type of perceptual task (single vs. dual). Therefore, it was a 2 × 2 × 2 design; the ninth block consisted of the control rest condition for the TD. The order in which the various experimental blocks were presented was counterbalanced across participants (see **Table 1** for a summary of the experimental design). Note that given the psychophysical task utilized, it was appropriate to use a small sample size to extensively test the various experimental dimensions (Quinn and Watt, 2012).
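As a rough illustration of how the nine blocks arise from the factorial structure described above, the following Python sketch enumerates the 2 × 2 × 2 cells and appends the rest control. The dictionary keys and the label used for the rest block are hypothetical names chosen for this sketch; they are not taken from the original experiment software.

```python
from itertools import product

# Factor levels as named in the Design section.
MOVEMENT_TYPES = ("reaching", "exploration")
EXECUTION_MODES = ("active", "passive")
TASK_TYPES = ("single", "dual")


def build_blocks():
    """Return nine block definitions: the 2 x 2 x 2 cells plus the rest control."""
    blocks = [
        {"movement": movement, "mode": mode, "task": task}
        for movement, mode, task in product(
            MOVEMENT_TYPES, EXECUTION_MODES, TASK_TYPES
        )
    ]
    # Ninth block: the TD task performed at rest (label is an assumption).
    blocks.append({"movement": "rest", "mode": None, "task": "TD at rest"})
    return blocks


blocks = build_blocks()
for index, block in enumerate(blocks, start=1):
    print(index, block)
```

The sketch only lists the conditions; how block order was counterbalanced across the eight participants is not specified in the text and is therefore not modeled here.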

The design of the TD task was detailed previously in Juravle et al. (2010, 2011). The SD consisted of 28 trials per block. For half of the trials, boards consisting of the same material plates were used (i.e., 7 trials with cling film-only boards, and 7 trials with tinfoil-only boards). For the other half of the trials, the boards were made of different materials (i.e., 7 trials in which the covering material of the board changed from cling film to tinfoil, and 7 trials in which the material changed from tinfoil to cling film). The SD-only blocks consisted of the 28 randomly intermixed trials. The dual-task blocks had the 28 SD trials randomly intermixed amongst the TD trials. Given the psychophysical procedure used to determine the perceptual threshold in the TD task, the total number of trials needed for the completion of the TD conditions varied between participants; the maximum number of trials per staircase/condition was set to 120.
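The composition of the 28 SD trials can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names, the helper `build_sd_trials`, and the use of a seeded shuffle are hypothetical and simply reproduce the counts given above (7 trials for each of the four board types).

```python
import random

MATERIALS = ("cling film", "tinfoil")
TRIALS_PER_BOARD = 7  # from the text: 7 trials for each of the four board types


def build_sd_trials(seed=None):
    """Return 28 randomly ordered SD trials with their correct responses."""
    rng = random.Random(seed)
    trials = []
    for first in MATERIALS:
        for second in MATERIALS:
            for _ in range(TRIALS_PER_BOARD):
                trials.append({
                    "first_material": first,
                    "second_material": second,
                    # Same-material boards require a "same" response.
                    "correct_response": "same" if first == second else "different",
                })
    rng.shuffle(trials)  # randomly intermix the four board types
    return trials


sd_trials = build_sd_trials(seed=1)
print(len(sd_trials))
```

In a dual-task block these SD trials would additionally be interleaved with TD trials, whose number depends on the staircase and is therefore not fixed in advance.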

## **DATA ANALYSIS**

Perceptual thresholds were calculated for the TD data, together with percentages of correct responses for the SD data. Depending on the experimental question, several analyses were performed that involved the use of repeated measures analyses of variance (ANOVAs).

## *TD data analysis*

For the TD analysis, three different ANOVAs were performed. In order to investigate whether the execution of the movement interfered with what participants felt, a first ANOVA was conducted with the factor condition (rest vs. active exploration vs. active reaching). The next step involved investigating whether it was not only the movement that affected tactile sensitivity, but also the concomitant tactile input delivered to a resting hand. For this, a second ANOVA was performed with the factor of task type (TD at rest vs. TD at rest plus passive exploration vs. TD at rest plus passive reaching). The third analysis was designed to investigate whether the movement type and the mode of movement execution gave rise to differential changes in tactile perception. For this, a 2 × 2 repeated measures ANOVA was conducted with the factors of movement type (exploration vs. reaching) and execution mode (passive vs. active).
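To make the structure of the first two analyses concrete, the sketch below computes the F ratio of a one-way repeated-measures ANOVA from a participants × conditions table. It is an illustrative hand computation in plain Python, not the software used by the authors, and the threshold values are invented placeholders rather than data from the study. With eight participants and three conditions, the degrees of freedom come out as (2, 14).

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: list of rows, one per participant, one score per condition.
    Returns (F, df_condition, df_error).
    """
    n = len(data)       # number of participants
    k = len(data[0])    # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Error term: variability left after removing condition and participant effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical thresholds (arbitrary units): rest, active exploration, active reaching.
hypothetical_thresholds = [
    [0.40, 0.62, 0.55],
    [0.35, 0.58, 0.49],
    [0.44, 0.70, 0.52],
    [0.38, 0.55, 0.57],
    [0.41, 0.66, 0.50],
    [0.36, 0.61, 0.46],
    [0.43, 0.59, 0.58],
    [0.39, 0.64, 0.51],
]

f_value, df_cond, df_error = rm_anova_oneway(hypothetical_thresholds)
print(df_cond, df_error)
```

The same partitioning extends to the 2 × 2 analysis, where each main effect and the interaction is tested against its own participant-by-effect error term.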

Lastly, for the dual-task conditions, bivariate correlations were conducted between the data from the two perceptual tasks, TD task and SD task, performed together under the dual task conditions. In order to investigate whether a relationship between the distribution of tactile thresholds in the TD task and performance in the SD task arose from the dual task situation, these primary correlations were followed by further partial correlations between the data from the two perceptual tasks performed together, while controlling for the data from the single task SD condition.
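A first-order partial correlation of the kind described here can be obtained from the three pairwise Pearson coefficients. The following Python sketch shows that computation; the function names and the example values are hypothetical and serve only to illustrate the logic of controlling for a third variable (here, standing in for single-task SD performance).

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented example values for eight participants (not data from the study).
td_threshold_dual = [0.62, 0.58, 0.70, 0.55, 0.66, 0.61, 0.59, 0.64]
sd_correct_dual = [71.0, 75.0, 64.0, 79.0, 68.0, 72.0, 77.0, 70.0]
sd_correct_single = [82.0, 86.0, 79.0, 89.0, 80.0, 85.0, 84.0, 83.0]

r_controlled = partial_r(td_threshold_dual, sd_correct_dual, sd_correct_single)
print(round(r_controlled, 3))
```

If the controlled variable is uncorrelated with both measures, the partial correlation reduces to the ordinary bivariate correlation.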

## *SD data analysis*

For the SD analysis, one general 2 × 2 × 2 repeated measures ANOVA was performed with the factors of movement type (exploration vs. reaching), execution mode (passive vs. active), and task type (single vs. dual). Furthermore, bivariate correlations were conducted for each type of movement, between the data collected under single and dual task conditions. In order to investigate whether a relationship between the distribution of correct responses in the SD task and that of the tactile thresholds in the TD task could be explained by the simultaneous performance of the TD task, separate partial correlations were conducted between the same data while controlling for the variable TD at rest.

#### **Table 1 | Summary of experimental design.**


*SD, surface discrimination task and TD, tactile discrimination task.*

## **RESULTS**

## **TD TASK RESULTS**

Mean thresholds and individual data from all participants are presented in **Figure 4**. In **Figure 5A**, the TD threshold data are plotted against the performance in the SD task for all of the dual-task conditions in the experiment.

As expected for the TD task, the results indicated that tactile perception was impaired during movement, *F*(2, 14) = 8.26, *p* = 0.004. The participants were significantly less sensitive in discriminating the quality of tactile stimulation while they were performing the active exploration movement (*p* < 0.001), as well as the active reaching movement (*p* = 0.045), as compared to the control rest condition, where no movement was performed. No significant difference was observed between the mean thresholds of the two active movement conditions, exploration and reaching (*p* = 0.451).

With regard to the TD task when performed at rest, the results indicated a significant effect of the type of task, *F*(2, <sup>14</sup>) = 18.95, *p* < 0.001. That is, participants were significantly more sensitive to the quality of tactile stimulation for the TD at rest, as compared to those conditions in which the same task was performed while receiving 'passive exploratory input' (*p* = 0.003), or 'passive reaching input' (*p* = 0.001). Moreover, a significant difference was observed between the mean thresholds in the two passive dual task conditions: That is, tactile thresholds were significantly elevated (i.e., performance was significantly poorer) for the passive exploratory input, as compared to the passive reaching input (*p* = 0.030).

Lastly, there was no significant main effect of the mode of movement execution, *F*(1, <sup>7</sup>) <1, n.s., movement type, *F*(1, <sup>7</sup>) = 4.59, *p* = 0.069, nor any interaction between the two variables, *F*(1, <sup>7</sup>) <1, n.s., when comparing the data from the two movement types, across the two movement execution modes. No significant correlations were found between the distribution of the TD and that of SD performance under dual task conditions.

## **SD TASK RESULTS**

Percentages of correct responses for all the experimental conditions are presented in **Figure 6A**. The results highlighted significant main effects of all of the experimental variables: movement type, *F*(1, 7) = 44.71, *p* < 0.001, execution mode, *F*(1, 7) = 10.19, *p* = 0.015, and task type, *F*(1, 7) = 36.29, *p* = 0.001. As such, participants' SD discrimination performance was significantly better under conditions of active movement as compared to passive movement, single tasking as compared to dual tasking, and while exploring the surface of the board, as compared to reaching movements; see **Figure 6B** for a depiction of the significant main effects. No significant interactions between the experimental variables were found.

For the passive execution mode, a positive correlation was demonstrated between the Reach SD task performance under conditions of single and dual tasking, *r* = 0.853, *p (one-tailed)* = 0.007, *R*<sup>2</sup> = 0.727. When controlling for the performance in the TD when performed at rest, the same strong correlation between the two variables was observed, *r* = 0.851, *p (one-tailed)* = 0.008, *R*<sup>2</sup> = 0.724, suggesting that the variance found in the passive SD reaching condition could not be explained by the additional TD task.

Similarly, for the active execution mode, a positive correlation was demonstrated between the Reach SD task performance under conditions of single and dual tasking, *r* = 0.679, *p (one-tailed)* = 0.032, *R*<sup>2</sup> = 0.461. When controlling for the performance in the TD when performed at rest, the same positive correlation between the two variables was observed, *r* = 0.683, *p (one-tailed)* = 0.045, *R*<sup>2</sup> = 0.466, suggesting that the variance found in the active SD reaching condition could not be explained by the performance of the additional task. See **Figure 5B** for plots of the significant correlations. No other significant correlations were found.
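As a brief worked check, the coefficients of determination reported above correspond to the squared correlation coefficients. The sketch below recomputes them from the reported *r* values; the tolerance is an assumption that allows for the fact that the reported *r* values are themselves rounded to three decimals.

```python
# (label, reported r, reported R^2), taken from the two paragraphs above.
reported = [
    ("passive, zero-order", 0.853, 0.727),
    ("passive, partial", 0.851, 0.724),
    ("active, zero-order", 0.679, 0.461),
    ("active, partial", 0.683, 0.466),
]


def r_squared(r):
    """Proportion of shared variance implied by a correlation coefficient."""
    return r * r


for label, r, reported_r2 in reported:
    # Tolerance (assumed) covers rounding of r to three decimals.
    consistent = abs(r_squared(r) - reported_r2) < 0.0015
    print(f"{label}: r = {r:.3f}, r^2 = {r_squared(r):.4f}, consistent = {consistent}")
```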

## **DISCUSSION**

The present experiment was designed to investigate, at a behavioral level, *whether* and *how* tactile suppression manifests itself during specific hand movements. To this end, a demarcation between *movement types*, as well as between *modes of movement execution*, was utilized in order to obtain a comprehensive view regarding what actually happens to tactual information during movement. Here, some of the methodological issues raised are considered, followed by a discussion of the results of each perceptual task, and ending with some general conclusions.

## **METHODOLOGICAL CONSIDERATIONS**

## *Speed of movement*

A first factor that could not be controlled, and which needs to be taken into account, is the speed of movement, since it is known that the degree of experienced tactile sensory suppression decreases as the speed of the movement decreases (Angel and Malenka, 1982; Schmidt et al., 1990). Following on from this, it has been argued that when performing exploratory movements, participants may adjust the speed of their hand movements such that the desired features of the surface are more easily assessed (Chapman, 2009). As such, a slowing of the hand movement occurs during exploration, as opposed to the more rapid hand movement that occurs during reaching actions. On these grounds, it has been suggested that the known attenuation of tactile perception will occur for faster movement speeds, but not for the slower ones that are typically used in exploration.

Indeed, in this respect, a recent study tested the critical speed of movement needed for sensory suppression to occur (Cybulska-Klosowicz et al., 2011). Participants performed a simple inward movement of their right hand, with the speed of the movement entrained to a signal presented on an oscilloscope; brief electrical stimuli were delivered to participants' middle fingers during the movement execution period, or in a control condition performed at rest. Participants had to make unspeeded perceptual judgments regarding the presence or absence of the weak tactile stimulation. In a blocked design, speeds ranging from very slow through to ballistic were tested. For each participant, the critical speed at which tactile detection dropped to chance level was calculated. Not surprisingly, for all of the participants, the critical speed exceeded 200 mm/s, with a mean of 472 mm/s. Such a result was taken to show that tactile suppression occurs with movement speeds outside the typical range of 50–200 mm/s that are used in exploration (Essick and Whitsel, 1985).

Moreover, the participants in Cybulska-Klosowicz et al.'s (2011) study were asked at the end of each block of trials whether they would use the respective tested speed for an exploratory movement. Most of the participants indicated that they would use speeds slower than 200 mm/s for exploration, but at the same time, a significant proportion indicated faster speeds as being appropriate for an exploratory movement. The authors explained these results as follows: The participants did *not* have surface contact for the movement tested in their experiment, which made the evaluation of the speeds difficult. This raises the question of what exactly happens with tactile suppression when tested with specific exploratory movements, a question that was specifically addressed in the experiment reported here.

## *Locus of tactile stimulation*

Tactile perception was measured by means of two perceptual tasks: one TD task measuring tactile discrimination thresholds at participants' wrists, and another SD task, measuring surface discrimination at the participants' fingertips. One could criticize the present design on the grounds that we did not measure tactile perception in both tasks at the same skin location. However, considering the perceptual tasks that we were interested in, having different skin locations to measure tactile performance was the most practical solution. In support of this, tactile suppression has nevertheless been shown to "invade" the moving limb, such that when moving a finger, a decrease in what is felt is also present for the surrounding regions of the arm (see Williams et al., 1998).

## *Dual-task effects in both TD and SD perceptual tasks*

The two tasks used in the present study to measure tactile discrimination performance could either be performed alone, or else together, within the same experimental block. Given this experimental design, some deterioration in performance was to be expected and was indeed detected when comparing conditions of single versus dual tasking: Participants demonstrated significantly higher tactile thresholds (i.e., poorer discrimination performance) when the TD task was performed together with the SD task, under both active and passive execution modes for the tested movements. Conversely, participants' performance in the SD task was significantly worse when this accompanied the TD task, as compared to those conditions in which the participants only performed a single task.

## **DISCUSSION OF TD TASK PERFORMANCE**

Given previous experimental results on tactile suppression during the execution of goal-directed reach-to-grasp movements (Juravle et al., 2010, 2011), it was hypothesized that increased tactile thresholds (i.e., poorer performance) would be observed for the *active* goal-directed reaching movements, as well as for exploratory movements, as compared to thresholds measured in a control no-movement condition. This hypothesis was confirmed: Participants' performance deteriorated during movement execution (i.e., tactile suppression was observed). Moreover, the two types of active hand movements (exploratory or simple reaching movements) resulted in a similar deterioration of what was felt during movement, thus indicating that irrespective of the type of movement performed, tactile perception was affected.

Furthermore, tactile perception was significantly impaired during the *passive* execution modes for both movements tested (i.e., participants kept their hand at rest, while the experimenter touched it such that it mimicked the contact with the table surface from the active conditions), as compared to the control rest condition (see also Williams and Chapman, 2000). Such a result could indicate that not only does actively moving the hand trigger tactile suppression, but also that the additional *distracting* tactile input provided to a resting hand leads to the same suppressive effect on tactile perception (Juravle and Spence, 2011). This distracting factor, one that gives rise to impaired performance in those conditions in which the two perceptual tasks were performed together, could, of course, be taken as a dual-task effect: TD task performance was impaired when participants performed the concomitant SD task. In a similar vein, the necessity for participants *to divide* their attention between the two tasks could have contributed to the clear deterioration in TD task performance, when the additional tactile SD was performed. However, in the case of the passive movement execution mode, tactile discrimination thresholds were significantly higher for the exploratory movements, as compared to the reaching movements. Such a result hints at the possibility that *distraction* was the more likely explanatory mechanism for the present results. Note, though, that the passive exploration task involved a sustained contact between participants' fingers and the experimental board, as opposed to the passive reaches that involved two temporally segregated touches, delivered by the experimenter.
In this respect, the time given to the participant during the trial to extract the needed tactile discrimination cues in this experiment (e.g., tactile memory, Gallace and Spence, 2009) could be taken as an additional factor accounting for the significant difference between tactile sensitivity measured under conditions of passive exploration and passive reaching.

Lastly, when comparing the data from the two types of movement, across the two modes of movement execution, no significant difference in tactile sensitivity was observed for the two execution modes and the two movement types, nor was any interaction observed between the two variables. The latter result is particularly important since it underlines the fact that tactile sensitivity is similarly affected by the two types of movement, exploration and reaching. Such a finding could be taken to account for the fact that (i) either participants did not adjust the speed of their movement, in order for the exploratory movement to be performed appropriately (see Cybulska-Klosowicz et al., 2011); or else (ii) if participants adjusted their speed (i.e., they slowed down their movement), then speed alone does not delineate between a suppressed state of tactile perception and a non-suppressed state. Following on from this, a discussion of the *movement type* *relevance* for what is felt is needed. This possibility is considered in the next section.

## **DISCUSSION OF SD TASK PERFORMANCE**

Since the goal of tactile exploration is to gain information concerning the characteristics of objects that we come into contact with (i.e., the surface of the experimental board in this case), the first prediction with respect to the SD task entailed significantly higher discrimination performance for exploratory movements, as opposed to the simple reaching movements. This hypothesis was confirmed: Participants were significantly better at discriminating between the two materials covering the table surface when they performed exploratory movements, as compared to simple reaches. This is an important result, since it highlights *the relevance* of the movement chosen when measuring tactile performance during movement (Knecht et al., 1993). This result is further strengthened by the positive correlations found for both active and passive reaching movements between the SD task performance under conditions of single and dual tasking: If performance declined for the single reach SD task, it also declined for the dual reach SD task (performed together with the TD task), and the variance in either of their distributions was not explained by the additional perceptual task. Therefore, with respect to the question of the relevance of the task, it appears that the goal-directed reaches may not be the ideal movements with which to investigate tactile perception enhancements during movement execution.

Furthermore, as expected, participants' performance was significantly higher when actively performing the tested movements, as opposed to the passive execution condition. Note that for simple tactile features of objects (i.e., tactile roughness discrimination thresholds), the movement execution mode (active or passive) was not found to make a difference with respect to the performance on the task (Hsiao et al., 1993; Jones, 2009). These studies, however, have only used the natural exploratory movements in their design. In the present study, where exploration was contrasted with reaching movements, when performing a purposeful movement (i.e., moving the hand on the surface of the board in order to get tactual information about it), performance was significantly better as compared to simply receiving the same tactual information, in the absence of overt movement. Such a result thus highlights the importance of purposeful movement for tactile perception.

In conclusion, it would appear that for unspeeded perceptual tasks involving the delivery of tactile stimuli during the execution of simple reaching movements, as well as exploratory movements, a dichotomy based on *sensory-relevance* for movement is apparent: The characteristics of tactile stimulation that are not relevant to the motor task at hand will most likely be suppressed, in order to highlight other incoming valuable sensory information. However, tactile information that is relevant to the motor task at hand, such as that used in active exploration, will be enhanced. From this perspective, the attentional/suppressive influences on what is felt during movement could thus be regarded as being *context-dependent* (Chapin and Woodward, 1982; Fanselow and Nicolelis, 1999; Ferezou et al., 2007).

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 September 2013; accepted: 16 November 2013; published online: 06 December 2013.*

*Citation: Juravle G, McGlone F and Spence C (2013) Context-dependent changes in tactile perception during movement execution. Front. Psychol. 4:913. doi: 10.3389/fpsyg.2013.00913*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Juravle, McGlone and Spence. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Visual target distance, but not visual cursor path length produces shifts in motor behavior

## *Nike Wendker, Oliver S. Sack and Christine Sutter\**

*Department of Work and Cognitive Psychology, RWTH Aachen University, Aachen, Germany*

#### *Edited by:*

*Knut Drewing, Giessen University, Germany*

#### *Reviewed by:*

*Knut Drewing, Giessen University, Germany Loes C. J. Van Dam, University of Bielefeld, Germany*

#### *\*Correspondence:*

*Christine Sutter, Department of Work and Cognitive Psychology, RWTH Aachen University, Jägerstraße 17-19, 52056 Aachen, Germany e-mail: christine.sutter@psych.rwth-aachen.de*

When using tools, effects in body space and distant space often do not correspond. Findings so far have demonstrated that in this case visual feedback has more impact on action control than proprioceptive feedback. The present study varies the dimensional overlap between visual and proprioceptive action effects and investigates its impact on aftereffects in motor responses. In two experiments participants perform linear hand movements on a covered digitizer tablet to produce ∩-shaped cursor trajectories on the display. The shape of hand motion and cursor motion (linear vs. curved) is dissimilar and therefore does not overlap. In one condition the length of hand amplitude and visual target distance is similar and constant while the length of the cursor path is dissimilar and varies. In another condition the length of the hand amplitude varies while the lengths of visual target distance (similar or dissimilar) and cursor path (dissimilar) are constant. First, we found that aftereffects depended on the relation between hand path length and visual target distance, and not on the relation between hand and cursor path length. Second, increasing contextual interference did not reveal larger aftereffects. Finally, data exploration demonstrated a considerable benefit from gain repetitions across trials when compared to gain switches. In conclusion, dimensional overlap between visual and proprioceptive action effects modulates human information processing in visually controlled actions. However, adjustment of the internal model seems to occur very fast for this kind of simple linear transformation, so that the impact of prior visual feedback is fleeting.

**Keywords: aftereffect, tool use, dimensional overlap, contextual interference, human information processing**

## **INTRODUCTION**

Humans use tools either to extend their own capacities, to enlarge and strengthen single parts of their body, or as a way to sort out problems. In modern life we are confronted with technologies that transform body movements into tool movements by linear and dynamical perturbations (e.g., a computer mouse), and/or by inverting movement directions (e.g., a laparoscope in minimally invasive surgery). These sensorimotor transformations challenge the human information processing system, since the sensory feedback from the moving hand (proximal action effect) and the sensory feedback from the moving effective part of the tool (distal action effect) do not correspond.

For controlling human actions, it is widely accepted that the proximal movement-effect loop is essential to generate an action plan from the very beginning. This so-called ideo-motor principle of action planning holds that agents select, initiate and execute a movement by activating the anticipation of the sensory codes of the movement's effects (James, 1890; Greenwald, 1970; for an overview see Hommel et al., 2001). However, in tool use distal action effects dominate action control while proximal action effects are attenuated or even ignored (Mechsner et al., 2001; Sutter and Ladwig, 2012; Wang et al., 2012; Ladwig et al., 2013; for an overview and limits in distal action effect control see, e.g., Sutter et al., 2013).

Fourneret and Jeannerod (1998) demonstrated that participants are not very aware of their own hand movements. Participants traced sagittal lines on a graphic tablet using a stylus held in their right hand while a mirror hid their hand movements. The mirror presented visual feedback, so that participants saw their lines projected from a computer screen. While in control trials the line was exactly the same as seen in the mirror, in perturbed trials the line appeared to deviate in one direction (right or left) by a variable angle (2, 5, 7, or 10°). The main finding was that participants consistently displaced their hand in the opposite direction for drawing a visually sagittal line. When asked in which direction they thought their hand had moved, participants largely underestimated their hand deviation in perturbed trials.

Ladwig et al. (2012) investigated the recall of proprioceptive information after performing a hand movement with perturbed visual feedback. In phase 1 (**Figure 1**, upper part), participants were asked to move the cursor horizontally from one target bar to the other by moving a pen on a digitizer tablet. The cursor amplitude presented on the display was shorter, equal to or longer than the hand amplitude. The digitizer tablet and the hand were covered with an occluder, so that participants only received perturbed visual feedback on the display. After reaching the target area the movement direction had to be reversed. In phase 2 (**Figure 1**, lower part) participants were asked to replicate the formerly performed hand amplitude as accurately as possible without any visual feedback. In one condition the hand amplitude was held constant while the cursor amplitudes on the display were shorter or longer than the hand amplitudes (**Figure 1**, left). In another condition the cursor amplitude was constant and hand amplitudes were shorter or longer (**Figure 1**, right). In control trials in each condition hand and cursor amplitudes were equal.

In untransformed trials participants replicated movements very accurately. In perturbed trials hand amplitudes prominently shifted, influenced by the formerly received visual feedback. When participants had seen shorter (longer) cursor amplitudes the replicated hand amplitudes were accordingly shorter (longer). These shifts occurred in constant and varying hand amplitudes, but they were more pronounced when proximal effects varied. That means visual information from phase 1 biased motor replications in phase 2. The authors interpreted the shifts as a visual aftereffect. Common coding approaches (e.g., Prinz, 1997; Hommel et al., 2001) propose that sensory information from perceived actions and intended actions is coded and stored in a common representational domain. As a result of this, sensory information from different senses is likely to interact and to affect subsequent action control. The findings by Ladwig et al. (2012) demonstrate this kind of cross talk in terms of visual aftereffects. If visual information from phase 1 could have been completely ignored in motor replication (phase 2), then inaccuracy in motor replications should have been independent of the visual information in phase 1. This was not the case. Ladwig et al. (2012, 2013) observed a systematic pattern of under- and overshoots that depended on the length of the formerly seen cursor amplitudes: When participants had seen shorter (longer) cursor amplitudes (phase 1) the replicated hand amplitudes in phase 2 were accordingly shorter (longer). This pattern was even observed for constant hand amplitudes but varying cursor amplitudes. In this condition, replicated hand movements could have been performed without any corrections of the previously used motor program (Wolpert and Flanagan, 2001). The fact that motor replications were still influenced by the formerly perceived visual information speaks in favor of the common representational domain of sensory information from perceived actions and intended actions (e.g., Prinz, 1997; Hommel et al., 2001).

Furthermore, the theory of event coding and the dimensional overlap model (Kornblum et al., 1990; Kornblum and Lee, 1995; Hommel et al., 2001) assume that when perceptual stimuli share some features with planned actions, these stimuli can either foster those actions or interfere with them depending on their similarity. Dimensional overlap is treated as a dichotomous variable and describes the match or mismatch between stimulus (S) and response (R) along functionally separable object dimensions (Kornblum and Lee, 1995). Orientation, size and shape are object dimensions, whereas "vertical," "long," or "curved" are features on those dimensions (Treisman and Gelade, 1980). The impact of the dimensional overlap on aftereffects was investigated in a second condition (Ladwig et al., 2012). The horizontal hand motion on the tablet produced a vertical cursor motion on the display. The orientation of hand and cursor motion no longer overlapped (horizontal vs. vertical), and this resulted in smaller aftereffects when compared to the condition in which the orientation of hand and cursor motion did overlap (both horizontal).

The aim of the present study is to further investigate the impact of dimensional overlap on aftereffects in motor replications. For this we adapt the task introduced by Ladwig et al. (2012). Again, participants move the cursor on the display from a start position to a target, but now the cursor motion follows the shape of an inverted U while the hand motion still follows a straight horizontal line (**Figure 3**, upper part). In the condition *perturbed cursor motion* (**Figure 3**, left) the length between start and target area (= visual target distance) and the length of the hand motion are similar and remain constant. The variable length (short, middle, long) and the shape of the cursor trajectory (∩-shaped) are dissimilar from the horizontal hand motion. When features are similar (dissimilar), then dimensions do (not) overlap. In the condition *perturbed hand motion* (**Figure 3**, right) the constant visual target distance and the variable length of the hand motion are similar (middle) or dissimilar (short, long). The constant length and shape of the cursor trajectory are dissimilar from the varying horizontal hand motion. In phase 2 participants replicate the formerly performed hand amplitude (**Figure 3**, lower part).

The variation of length and shape of hand and cursor motion decouples two different relations in visually controlled aiming movements: first, the relation between hand motion and cursor motion (dissimilar length and dissimilar shape), and second, the relation between hand motion and visual target distance (similar or dissimilar length and similar shape). The experimental variations of the dimensions shape and length, and their dimensional overlap in phases 1 and 2 are depicted in **Table 1**.

#### **Table 1 | The experimental variations of the dimensions shape and length, and their dimensional overlap in phases 1 and 2 ("=", similar; i.e., dimension does overlap; "≠", dissimilar; i.e., dimension does not overlap).**

Thus, the first hypothesis (H1) concerns the dimensional overlap and its impact on aftereffects. H1a: If it is the relation between hand motion and cursor motion (dimensions do not overlap) that accounts for the aftereffects, then in both conditions aftereffects should be present in terms of overshoots, since the cursor motion is always longer than the hand motion. Overshoots in motor replications should increase from short to long cursor motions. H1b: However, if it is the relation between hand amplitude and visual target distance (dimensions do overlap), then we do not expect any aftereffects in the condition with perturbed cursor motions. In the condition with perturbed hand motions (dimensions do or do not overlap) aftereffects should follow the same pattern as observed by Ladwig et al. (2012). When the visual target distance is shorter (longer) than the hand amplitude, participants should undershoot (overshoot). Therefore, we do not expect any aftereffects when the relation is 1:1. The ideo-motor principle (James, 1890; Greenwald, 1970) would predict the same pattern of results as H1b. Actions are cognitively represented with respect to the goal of the action, not with respect to the way we achieve the action's goal. In this sense, the relation between hand amplitude and visual target distance (H1b) should be more important for controlling actions than the relation between hand amplitude and cursor path length (H1a).
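The contrast between H1a and H1b can be made explicit with a minimal sketch. Both hypotheses compare the hand amplitude with a visual extent and differ only in which extent is used (cursor path length for H1a, visual target distance for H1b). The function names and the numeric values below are hypothetical and serve only to illustrate the predicted direction of the aftereffect.

```python
def predicted_shift(hand_amplitude_mm, visual_extent_mm):
    """Predicted direction of the replication error in phase 2.

    A visual extent longer than the hand amplitude predicts an overshoot,
    a shorter one an undershoot, and a 1:1 relation no aftereffect.
    """
    if visual_extent_mm > hand_amplitude_mm:
        return "overshoot"
    if visual_extent_mm < hand_amplitude_mm:
        return "undershoot"
    return "none"


def predict_h1a(hand_amplitude_mm, cursor_path_length_mm):
    """H1a: the relation between hand motion and cursor path length matters."""
    return predicted_shift(hand_amplitude_mm, cursor_path_length_mm)


def predict_h1b(hand_amplitude_mm, visual_target_distance_mm):
    """H1b: the relation between hand amplitude and visual target distance matters."""
    return predicted_shift(hand_amplitude_mm, visual_target_distance_mm)


# Hypothetical trial from the condition with perturbed cursor motion:
# hand amplitude equals the visual target distance, the cursor path is longer.
hand_mm, target_mm, cursor_path_mm = 50.0, 50.0, 90.0
print("H1a predicts:", predict_h1a(hand_mm, cursor_path_mm))
print("H1b predicts:", predict_h1b(hand_mm, target_mm))
```

In this condition the two hypotheses diverge, which is what allows the design to discriminate between them.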

The second hypothesis (H2) considers the impact of the context in phase 1 on aftereffects in motor replications (phase 2). The contextual interference effect (Magill and Hall, 1990; Guadagnoli and Lee, 2004) describes a benefit for (motor) skill acquisition when tasks are presented in a blocked practice condition, but a disadvantage for retention and transfer, and the other way around for the random practice condition. The reason for this seems to lie in the simple and automated (learning a task in one context: blocked practice blocks) vs. elaborated (learning a task in multiple contexts: random practice blocks) cognitive processing when learning a task. In our experiments, the task-irrelevant visual feedback in phase 1 can be considered as the context in which the motor task is performed. In Experiment 1 we present two small, randomized blocks of trials in which three different gains perturb either the cursor motion (one block: cursor motion varying, hand motion constant) or the hand motion (another block: hand motion varying, cursor motion constant). In Experiment 2 we present the same trials of perturbed cursor or hand motions as in Experiment 1 but randomly mixed within a block. We assume that participants in Experiment 1 may, at least implicitly, realize that one aspect of the task in phase 1 remains constant within a block, either the hand motion or the cursor motion. The motion constancy and the smaller set size of contexts to be learned in Experiment 1 should lead to smaller aftereffects when compared to Experiment 2, either in both conditions (H2a) or in the condition with perturbed hand motions only (H2b).

Finally, we explore how experience shapes subsequent motor behavior. Prior reaching to a familiarized visual target reduced subsequent reaching variability for this target position, but also reduced subsequent reaching accuracy for other target positions (Verstynen and Sabes, 2011). In other words, performance in target repetitions is better than in target switches. This makes sense: Movements are usually pre-programmed with the previously used internal model (Wolpert and Flanagan, 2001). When sudden changes (e.g., a gain change) occur, the motor system compensates for and adapts to these changes by modifying the pre-programmed action during movement execution (e.g., Rieger et al., 2005). Consequently, any error at that time reflects the specification of the pre-programmed movement. In the present experiment, participants did not receive any visual feedback in phase 2. Thus, they were not able to observe the difference between the to-be-replicated hand amplitude and their actual replication. However, this is relevant information for the motor system to adjust the forward model (Wolpert and Flanagan, 2001). Thus, in the present experiment the forward model cannot be adjusted if the gain changes from trial to trial (switch condition). However, if the gain is repeated, then the repeated closed-loop control in phase 1 serves to adjust the internal model. Consequently, the forward model becomes more accurate and smaller aftereffects are expected for gain repetitions than for gain switches.
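The distinction between gain repetitions and gain switches can be sketched as a simple labeling of consecutive trials. The function name and the example gain sequence are hypothetical; the sketch only assumes that each trial carries one gain value and that a trial counts as a repetition when its gain equals that of the preceding trial.

```python
def label_gain_transitions(gains):
    """Label each trial as a gain 'repetition' or 'switch' relative to the previous trial.

    The first trial has no predecessor and is labeled None.
    """
    if not gains:
        return []
    labels = [None]
    for previous, current in zip(gains, gains[1:]):
        labels.append("repetition" if current == previous else "switch")
    return labels


# Hypothetical gain sequence across five trials (illustration only).
trial_gains = [1.0, 1.0, 0.8, 1.2, 1.2]
for trial, (gain, label) in enumerate(zip(trial_gains, label_gain_transitions(trial_gains)), start=1):
    print(f"trial {trial}: gain = {gain}, transition = {label}")
```

Aftereffects could then be averaged separately over the two labels to compare repetitions with switches.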

## **EXPERIMENT 1**

#### **METHODS**

#### *Apparatus, task, and stimuli*

The experimental setting (**Figure 2**) was the same as used by Ladwig et al. (2012). Participants sat in a dimly lit room in front of a DIN-A3 digitizer tablet (WACOM Intuos2, 100 Hz sampling rate). A wooden cover with a curtain prevented direct vision of the digitizer tablet and the participant's hand. In Experiment 1a the experimental tasks and cursor motions were presented on a 22" color CRT display, with a distance of approximately 58 cm between participant and display (Iiyama HM204DT, Vision Master Pr514, 100 Hz refresh rate, 1024 × 768 pixels). Moving the tip of the pen (WACOM Intuos2 Grip Pen) horizontally inside a cut out groove mounted onto the digitizer tablet (width and length of the groove: 0.4 and 50 cm) controlled the cursor on the display. The experimenter sat next to the participant and monitored the log file providing information about participant's performance on a separate display. An Apple Macintosh computer running Matlab software with the Psychophysics Toolbox extension controlled the experiment (Kleiner et al., 2007).

In phase 1 (**Figure 3**, upper part) of each trial two black dots (circle diameter 5.6 mm each, distance between dots 50 mm = visual target distance) and a gray circular cursor (circle diameter 4 mm) appeared on the white screen. The cursor was positioned on the right dot, and the task in phase 1 required moving it to the dot on the far side as accurately as possible by moving the pen leftward along the groove on the digitizer tablet. The horizontal hand movement produced a ∩-shaped cursor motion on the display (i.e., the upper half of a vertical ellipse). When the cursor had reached the left dot, phase 2 (**Figure 3**, lower part) started: The screen turned blank, and participants had to move the pen back (rightward) without any visual feedback. The task in phase 2 required reproducing the initially performed hand amplitude as accurately as possible. When the cursor started on the left dot instead, movement directions were inverted.

In phase 1 the relation between hand amplitude and cursor path length, and/or between hand amplitude and visual target distance was perturbed by three different gains. **Figure 3** (left) depicts the task for *perturbed cursor motions*. The hand amplitude (*d* = 50 mm) and the visual target distance (minor axis of the ellipse) were 50 mm and remained constant across trials. The constant length of the semi-minor axis (b; Equation 1) and a varying circumference (c; Equation 2 with gain factors 0.5, 1, or 1.5) defined the ellipse. The length of the major axis was approximated (A; Equation 3). Please note, for correctly fulfilling the task the cursor motion followed only the upper half of the vertical ellipse.

$$b = \frac{d}{2} = 25\,\text{mm} \tag{1}$$

$$c = \text{gain} \ast 240 \,\text{mm} \tag{2}$$

$$A \approx \sqrt{\frac{c^2}{2\pi^2} - b^2} \tag{3}$$
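Equations 1–3 can be checked numerically. The following is a minimal Python sketch, not the authors' Matlab code; the function and constant names are hypothetical, and it assumes that the cursor path is half the circumference because only the upper half of the ellipse is traversed.

```python
import math

HAND_AMPLITUDE_MM = 50.0       # d in Equation 1
BASE_CIRCUMFERENCE_MM = 240.0  # constant in Equation 2


def ellipse_parameters(gain, d=HAND_AMPLITUDE_MM):
    """Return (b, c, A) in mm for one gain, following Equations 1-3."""
    b = d / 2.0                                         # Equation 1
    c = gain * BASE_CIRCUMFERENCE_MM                    # Equation 2
    A = math.sqrt(c ** 2 / (2 * math.pi ** 2) - b ** 2)  # Equation 3
    return b, c, A


if __name__ == "__main__":
    for gain in (0.5, 1.0, 1.5):
        b, c, A = ellipse_parameters(gain)
        # Assumption: the cursor follows only the upper half of the ellipse.
        print(f"gain={gain}: b={b:.1f} mm, c={c:.0f} mm, "
              f"A={A:.1f} mm, cursor path={c / 2:.0f} mm")
```

Under these assumptions the three gains give cursor path lengths of 60, 120, and 180 mm, which matches the values reported in the text.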

For perturbed cursor motions, Equations 4–6 present the transformation of the x-coordinates of the pen on the tablet (*xp*) into visual x- and y-coordinates along the ∩-shaped cursor path (*xc* and *yc*). The length of the major axis and the circumference of the ellipse (= cursor path length) varied as a function of the applied gain. The relations between hand amplitude (50 mm) and cursor path length (60, 120, or 180 mm) were 1:0.83, 1:0.42, or 1:0.28, and the relation between hand amplitude and visual target distance (50 mm) was 1:1. In phase 2, when participants were instructed to replicate the initially performed hand amplitude, the reproduction required moving the pen by 50 mm. Thus, the motor reproduction in phase 2 required the recall of the constant motor information from phase 1, while the visual information from phase 1 was irrelevant for solving the task and had to be ignored.

**FIGURE 3 |** [...] were short (25 mm), middle (50 mm), or long (75 mm), while cursor path length and visual target distance remained constant (120 and 50 mm, respectively). In phase 2 (lower part) the initially performed hand amplitude was replicated without any visual feedback.

$$\text{For perturbed cursor motions: } \alpha = \frac{\text{start\_x}_{p} - x_{p}}{d} \ast 180^{\circ}$$

$$\text{For perturbed hand motions: } \alpha = \frac{\text{start\_x}_{p} - x_{p}}{\text{gain}} \ast \frac{180^{\circ}}{d} \tag{4}$$

$$x_{c} = \text{start\_x}_{p} - (1 - \cos(\alpha)) \ast b \tag{5}$$

$$y_{c} = 148\,\text{mm} - \sin(\alpha) \ast A \tag{6}$$

where 148 mm defines the horizontal midline of the screen, i.e., the position of the minor axis on the screen.
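The mapping in Equations 4–6 can be sketched as a small function. This is an illustrative Python version under stated assumptions, not the original Matlab implementation: the function name and arguments are hypothetical, pen and screen coordinates are assumed to share one millimetre frame, and the screen y-axis is assumed to point downward so that subtracting from 148 mm moves the cursor up.

```python
import math

D_MM = 50.0            # d: hand amplitude / visual target distance
B_MM = D_MM / 2.0      # b: semi-minor axis (Equation 1)
MIDLINE_Y_MM = 148.0   # horizontal midline of the screen (Equation 6)


def cursor_position(x_p, start_x_p, A, gain=1.0, perturbed="cursor"):
    """Map the pen x-coordinate (mm) to cursor (x_c, y_c) via Equations 4-6."""
    travelled = start_x_p - x_p  # leftward pen travel since movement start
    if perturbed == "cursor":
        alpha_deg = travelled / D_MM * 180.0          # Equation 4, cursor case
    elif perturbed == "hand":
        alpha_deg = travelled / gain * 180.0 / D_MM   # Equation 4, hand case
    else:
        raise ValueError("perturbed must be 'cursor' or 'hand'")
    alpha = math.radians(alpha_deg)
    x_c = start_x_p - (1.0 - math.cos(alpha)) * B_MM  # Equation 5
    y_c = MIDLINE_Y_MM - math.sin(alpha) * A          # Equation 6
    return x_c, y_c
```

With these definitions the cursor reaches the opposite dot (α = 180°) after 50 mm of pen travel for perturbed cursor motions, and after 25, 50, or 75 mm for perturbed hand motions with gains 0.5, 1, or 1.5.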

**Figure 3** (right) depicts the task for *perturbed hand motions*. The visual target distance (minor axis of the ellipse) was again constant across trials. The constant length of the semi-minor axis (b; Equation 1) and the constant circumference (c; Equation 2 with gain factor 1) defined the ellipse. The length of the major axis was approximated (A; Equation 3). Equations 4–6 present the transformation of the x-coordinates of the pen on the tablet (*xp*) into visual x- and y-coordinates along the ∩-shaped cursor path (*xc* and *yc*). The elliptical cursor path remained constant and the hand amplitude varied as a function of the applied gain. The relations between hand amplitude (25, 50, or 75 mm) and cursor path length (120 mm) were 1:0.21, 1:0.42, or 1:0.63, and the relations between hand amplitude and visual target distance (50 mm) were 1:0.5, 1:1, or 1:1.5.

In phase 2, when participants were instructed to replicate the initially performed hand amplitude, the reproduction required moving the pen by 25, 50, or 75 mm. Thus, motor replications of hand amplitudes in phase 2 required the recall of varying motor information from phase 1, while the visual information from phase 1 was irrelevant for solving the task and had to be ignored.

The combination of hand amplitude (= 50 mm), cursor path length (= 120 mm) and visual target distance (= 50 mm) appeared in both conditions (perturbed cursor motions and perturbed hand motions), and these trials were considered control trials.

In Experiment 1b we did not provide any visual feedback on the display. Comparing results of conditions with and without feedback should clarify the impact of visual feedback on observed deviations. If visual feedback in phase 1 induced deviations in phase 2, then the hypothesized pattern of over- and undershoots should occur in Experiment 1a, but not in Experiment 1b.

A second experimenter sat opposite the participant. A perforated plastic plate (size 255 × 255 mm) was attached to the experimenter's side of the cut out groove. Two plastic blocks (95 × 15 × 9 mm) adjusted to the plate functioned as barriers and restricted the distance of the hand movement. All other materials were the same as in Experiment 1a.

In phase 1 of each trial the second experimenter adjusted both plastic barriers on the plate 25, 50, or 75 mm apart. The participant moved the pen along the groove from the right barrier to the left barrier. After movement initiation the experimenter removed the right barrier. When the pen had reached the left barrier, phase 2 started. Participants had to move the pen rightward to reproduce the initially performed hand amplitude of 25, 50, or 75 mm as accurately as possible. When the pen started at the left barrier instead, movement directions were inverted.

#### *Procedure and design*

Experiment 1a consisted of two blocks: In block 1 the path length of cursor motions varied [short (60 mm) vs. middle (120 mm) vs. long (180 mm)]. Cursor motions were always longer than the constant hand motions (50 mm). In block 2 the path length of hand motions varied [short (25 mm) vs. middle (50 mm) vs. long (75 mm)]. Hand motions were always shorter than the constant cursor motions (120 mm). The order of blocks was counterbalanced across participants. Participants were randomly assigned to movement directions.

Each block consisted of 45 trials (three gains with 15 repetitions each, randomly presented) and another six trials presented in advance of each block in order to familiarize subjects with the task (the same three gains as used in the experimental trials with two repetitions each, randomly assigned).
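The block structure described above can be sketched as follows. This is a hypothetical Python illustration of the randomization, not the original experiment script; the function name, the seed argument, and the use of gain values as trial labels are assumptions.

```python
import random

GAINS = (0.5, 1.0, 1.5)  # short, middle, long


def build_block(gains=GAINS, repetitions=15, practice_repetitions=2, seed=None):
    """Return (practice, experimental) trial lists in random order."""
    rng = random.Random(seed)
    # Six familiarization trials: each gain twice.
    practice = [g for g in gains for _ in range(practice_repetitions)]
    # 45 experimental trials: each gain 15 times.
    experimental = [g for g in gains for _ in range(repetitions)]
    rng.shuffle(practice)
    rng.shuffle(experimental)
    return practice, experimental
```

Shuffling the full list keeps the number of repetitions per gain exactly balanced, which drawing each trial independently at random would not guarantee.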

Before a block started, participants were instructed to move as accurately as possible and to produce continuous and smooth back-and-forth movements with the pen without interruption. They were further instructed to reproduce the initially performed hand amplitude in phase 2 as accurately as possible and to monitor their hand motion in phase 1 carefully. At the beginning of each trial, the cursor as well as the start and target dot were presented on the screen. Participants were free to choose a start position within the groove on the tablet. That is, hand and cursor motions were not spatially aligned. A first click of the pen's button unlocked the cursor, and participants moved it to the opposite target dot while receiving continuous visual feedback. When the cursor was positioned on the target dot, participants pressed the pen's button a second time. Then, both dots as well as the cursor disappeared, and participants started the replication of the hand amplitude by reversing the movement direction with the pen. When they believed they had reproduced the initially performed hand amplitude, they pressed the pen's button a final time to terminate the trial. Subsequently, a new trial was presented. In summary, each trial consisted of two phases: the initial phase with visual feedback (1) and the inverse replication phase without any visual feedback (2). The non-dominant hand rested relaxed on the participant's lap. The experiment lasted about 30 min.

Experiment 1b consisted of one block of 45 trials with hand amplitudes being 25, 50, or 75 mm (three hand amplitudes with 15 repetitions each, randomly presented). Another six trials were presented in advance to familiarize subjects with the task and procedure (the same three gains as used in the experimental trials with two repetitions each, randomly assigned). In this experiment two experimenters were present: the first experimenter fulfilled the same tasks as described for the experimenter in Experiment 1a, the second experimenter was responsible for presenting the trials (see below).

Before Experiment 1b started, participants were instructed to produce continuous and smooth back-and-forth movements with the pen without interruption. They were further instructed to reproduce the initially performed hand amplitude in phase 2 as accurately as possible and to monitor their hand motion in phase 1 carefully. At the beginning of each trial, the second experimenter positioned the start barrier next to the pen and the second barrier at a distance of 25, 50, or 75 mm. A trial started with a first click of the pen's button. Then, participants moved the pen to the opposite barrier, while the second experimenter removed the start barrier. When the pen had reached the opposite barrier, participants pressed the pen's button a second time. They reversed the movement direction and started to reproduce the initially performed hand amplitude. When they believed they had reproduced the initially performed hand amplitude, they pressed the pen's button a final time to terminate the trial and the second experimenter presented a new trial. The experiment lasted about 20 min.

Experiment 1a was based on a 2 × 3 design with the within-subject factors *perturbed motion* (cursor motion vs. hand motion) and *length variation* (short vs. middle vs. long). Experiment 1b served as a control experiment with the within-subject factor *hand amplitude* (25 vs. 50 vs. 75 mm). The dependent variable was the mean estimated amplitude (in %), the gain between the observed replicated hand amplitude and the to-be-replicated hand amplitude (= observed replicated hand amplitude / to-be-replicated hand amplitude ∗ 100). Trials were considered as erroneous and omitted from analyses when the initial movement trajectory was non-continuous (with *v* = 0 within the initial hand movement) and/or its direction changed, when the initial movement overshot the target area, when the second button click occurred while the cursor was outside the target area and when the observed replicated amplitude was shorter than or equal to 10 mm.
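The dependent variable and the exclusion criteria can be written down compactly. The sketch below is a hypothetical Python illustration, not the authors' analysis code; the dictionary keys used to flag each error type are invented for this example.

```python
MIN_REPLICATED_MM = 10.0  # replications at or below this length were omitted


def estimated_amplitude_percent(replicated_mm, target_mm):
    """Replicated amplitude as a percentage of the to-be-replicated amplitude."""
    return replicated_mm / target_mm * 100.0


def is_error_trial(trial):
    """Apply the exclusion criteria; `trial` is a dict with hypothetical keys."""
    return (
        trial["paused"]                   # v = 0 within the initial movement
        or trial["direction_changed"]     # initial movement changed direction
        or trial["overshot_target"]       # initial movement overshot the target
        or trial["click_outside_target"]  # second click outside the target area
        or trial["replicated_mm"] <= MIN_REPLICATED_MM
    )


def mean_estimated_amplitude(trials):
    """Mean estimated amplitude (%) over error-free trials."""
    valid = [t for t in trials if not is_error_trial(t)]
    if not valid:
        raise ValueError("no error-free trials")
    scores = [estimated_amplitude_percent(t["replicated_mm"], t["target_mm"])
              for t in valid]
    return sum(scores) / len(scores)
```

A value of 100% corresponds to an exact replication; values above and below 100% indicate overshoots and undershoots, respectively.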

#### *Participants*

For Experiment 1a 17 students (4 female) of the RWTH Aachen University, aged from 18 to 31 years (*M* = 24; *SD* = 4.2) volunteered. All participants were right handed, had normal or corrected-to-normal vision and were naïve with respect to the purpose of the experiment. Another 16 students (9 female) of the RWTH Aachen University, aged from 18 to 36 years (*M* = 24; *SD* = 5.1) volunteered for the control experiment (Experiment 1b). Fourteen of them were right handed, and all of them had normal or corrected-to-normal vision and were naïve with respect to the purpose of the experiment.

#### **RESULTS**

Mean estimated amplitudes (in %) were calculated for error-free trials [error rates at 4.7% (Experiment 1a) and 8.3% (Experiment 1b)]. First, we analyzed data from Experiment 1a using a 2 (*perturbed motion:* cursor motion vs. hand motion) × 3 (*length variation*: short vs. middle vs. long) analysis of variance for repeated measurements (ANOVA). Second, we compared replicated hand amplitudes with and without visual feedback in phase 1 (Experiment 1a, perturbed hand motion vs. Experiment 1b) by using a two-factorial ANOVA for repeated measurements with the within-subject factor hand amplitude (25 vs. 50 vs. 75 mm) and the between subject factor visual feedback (with vs. without visual feedback in phase 1).

**Figure 4** depicts the results for blocks with perturbed cursor motions (squares) and perturbed hand motions (black triangles). The ANOVA revealed a significant main effect for the factors *perturbed motion* [*F*(1, 32) = 14.35; *p* < 0.01, η<sup>2</sup> = 0.47] and *length variation* [*F*(2, 32) = 41.70; *p* < 0.01, η<sup>2</sup> = 0.72], and their significant interaction [*F*(2, 32) = 42.10; *p* < 0.01, η<sup>2</sup> = 0.73].

For perturbed cursor motions (**Figure 4**, squares) observed hand amplitudes (phase 2) did not differ from the to-be-replicated hand amplitudes (phase 1: 50 mm). That means for all cursor path lengths (phase 1: 60, 120, or 180 mm) replications were very accurate [*M* = 102% (0.96 mm) vs. 103% (1.77 mm) vs. 106% (2.37 mm); *t*-tests not significant with *p*'s > 0.16]. Perturbed hand motions (**Figure 4**, black triangles) were most accurate when in phase 1 the visual target distance (50 mm) was equal to the performed hand amplitude (50 mm), although observed hand amplitudes deviated from the to-be-replicated hand amplitudes [*M* = 105% (2.60 mm); *t*(16) = 2.66; *p* < 0.05]. When in phase 1 the hand amplitude was short (25 mm), in phase 2 significant overshoots occurred [*M* = 133% (8.17 mm); *t*(16) = 6.23; *p* < 0.01]. When in phase 1 the hand amplitude was long (75 mm), in phase 2 significant undershoots occurred [*M* = 96% (−3.54 mm); *t*(16) = −3.21; *p* < 0.01].

Second, **Figure 4** depicts the results for replicated hand amplitudes with (black triangles) and without (gray triangles) visual feedback in phase 1. In Experiment 1b observed hand amplitudes deviated from the to-be-replicated hand amplitudes when the hand amplitude in phase 1 was short or middle [25 mm: *M* = 128% (6.83 mm), *t*(15) = 9.10; *p* < 0.01; 50 mm: *M* = 109% (4.92 mm), *t*(15) = 4.42; *p* < 0.01]. Replications were accurate when the hand amplitude in phase 1 was long [75 mm: *M* = 103% (2.76 mm); n.s.]. The ANOVA revealed a significant main effect of the factor hand amplitude [*F*(2, 62) = 101.48; *p* < 0.01; η<sup>2</sup> = 0.76] and a significant interaction with the factor visual feedback [*F*(2, 62) = 3.93; *p* < 0.05; η<sup>2</sup> = 0.11]. That means replicated amplitudes without visual feedback in phase 1 were more accurate and did not undershoot when compared with replicated amplitudes with visual feedback in phase 1. Consequently, the increased inaccuracy observed in replicated amplitudes with visual feedback in phase 1 can be interpreted as visual aftereffects.

#### **DISCUSSION**

In this experiment we asked about the impact of dimensional overlap on aftereffects in motor replications. In the condition *perturbed cursor motion* we did not find any aftereffects. That means the variation of cursor path length did not induce aftereffects. Aftereffects occurred in the condition *perturbed hand motion* only. Considering our first hypothesis the results confirm H1b: Aftereffects vary as a function of the relation between hand path length and visual target distance, and not with respect to the relation between hand and cursor path length. Two conclusions can be drawn from these findings: First, the dimensional overlap modulated aftereffects. Concerning the dimension shape, hand motion (phases 1 and 2) and visual target distance (phase 1) did overlap, but hand motion (phases 1 and 2) and cursor motion (phase 1) did not overlap. Consequently, visual aftereffects appeared from length variations between hand path length and visual target distance, but not from cursor path length variations. However, the restrictions of measuring deviations along the x-axis only will be discussed later in more detail.

Second, manual actions are pre-programmed on the basis of target amplitude and target width (Fitts' law; Fitts, 1954). In other words, they are cognitively represented with respect to the action's goal (James, 1890; Greenwald, 1970; Hommel et al., 2001). In our task (phase 1), target amplitude was the distance between start dot and target dot. Thus, our finding also supports the notion of the ideo-motor principle and action effect account (James, 1890; Greenwald, 1970; Hommel et al., 2001). The present experimental setting, more generally speaking every visual target presentation, does not allow distinguishing between both conclusions. This point will also be discussed later.

Furthermore, aftereffects in the condition *perturbed hand motion* (8, 3, and –4 mm; range 12 mm) are considerably smaller than those obtained in a similar condition by Ladwig et al. (2012; **Figure 3**, asterisks: 24, 7, and –8 mm; range 32 mm). There is a simple explanation for this. Hand path length and visual target distance in the present experiment are 50 mm (1:1 condition) and therefore shorter than the 120 mm amplitude used by Ladwig et al. (2012). If one accounts for the amplitudes, the ratio between the range of aftereffects and the hand amplitude remains nearly the same (12/50 and 32/120, with ratios of 0.24 and 0.27, respectively). Consequently, it makes sense that we find smaller aftereffects for the smaller hand amplitudes in the present experiment.
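Written out, the normalization uses the range of aftereffects (largest overshoot minus largest undershoot) divided by the 1:1 hand amplitude, with the rounded values quoted in this paragraph:

$$\frac{8 - (-4)}{50} = \frac{12\,\text{mm}}{50\,\text{mm}} = 0.24, \qquad \frac{24 - (-8)}{120} = \frac{32\,\text{mm}}{120\,\text{mm}} \approx 0.27$$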

## **EXPERIMENT 2**

#### **METHODS**

#### *Stimuli, design, and procedure*

These were the same as in Experiment 1, except for the constancy of either cursor or hand motion and set size of presented trials per block. Instead of the blocked presentation of trials with perturbed cursor motion (one block, set size: 3) or hand motion (another block, set size: 3), we presented trials with perturbed cursor motions and perturbed hand motions randomly within a block (set size: 5) to increase contextual interference. The two blocks consisted of 48 trials each (the same 2 × 3 combinations of perturbed motion and length variation as in Experiment 1 with 8 repetitions each, randomly presented) and another six trials presented in advance of each block in order to familiarize subjects with the task (the same 2 × 3 combinations of perturbed motion and length variation as used in the experimental trials with one repetition each, randomly assigned). The experiment lasted 30 min.
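The mixed block of Experiment 2 and its set size can be illustrated as follows. This is a hypothetical Python sketch, not the original experiment script; it assumes that the set size of 5 arises because the two "middle" combinations are physically identical control trials (50 mm hand amplitude, 120 mm cursor path), as described for Experiment 1.

```python
import random
from itertools import product

PERTURBED = ("cursor", "hand")
LENGTHS = ("short", "middle", "long")


def build_mixed_block(repetitions=8, seed=None):
    """Return 2 x 3 x repetitions trials, randomly mixed within one block."""
    rng = random.Random(seed)
    trials = [combo for combo in product(PERTURBED, LENGTHS)
              for _ in range(repetitions)]
    rng.shuffle(trials)
    return trials


def distinct_contexts(trials):
    """Collapse the two 'middle' combinations into one shared control context."""
    return {("control",) if length == "middle" else (motion, length)
            for motion, length in trials}
```

Under this assumption, `len(build_mixed_block())` is 48 and `len(distinct_contexts(build_mixed_block()))` is 5, compared with a set size of 3 in each blocked condition of Experiment 1.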

#### *Participants*

Another 14 students (5 female) of the RWTH Aachen University, aged from 17 to 34 years (*M* = 26; *SD* = 4.5) volunteered for the experiment. All but one participant were right handed. All participants had normal or corrected-to-normal vision and were naïve with respect to the purpose of the experiment.

#### **RESULTS**

Again, mean estimated amplitudes (in %) were calculated for error-free trials (error rate at 11.6%) and analyzed using a 2 (*perturbed motion:* cursor motion vs. hand motion) × 3 (*length variation*: short vs. middle vs. long) ANOVA for repeated measurements. Additionally, we compared replicated hand amplitudes with and without visual feedback in phase 1 (Experiment 2, perturbed hand motion vs. Experiment 1b) by using a two-factorial ANOVA for repeated measurements with the within-subject factor hand amplitude (25 vs. 50 vs. 75 mm) and the between-subject factor visual feedback (with vs. without visual feedback in phase 1).

**Figure 5** depicts the results for blocks with perturbed cursor motions (squares) and perturbed hand motions (triangles). The ANOVA revealed significant main effects for the factors *perturbed motion* [*F*(1, 13) = 20.42; *p* < 0.01, η<sup>2</sup> = 0.61] and *length variation* [*F*(2, 26) = 33.02; *p* < 0.01, η<sup>2</sup> = 0.71], and their significant interaction [*F*(2, 26) = 32.18; *p* < 0.01, η<sup>2</sup> = 0.71].

For perturbed cursor motions (**Figure 5**, squares) observed hand amplitudes (phase 2) were quite accurate and did not differ from the to-be-replicated hand amplitudes (phase 1: 50 mm). That means the variation of cursor path length (phase 1: 60, 120, or 180 mm) did not induce any aftereffects [*M* = 109% (4.40 mm) vs. 107% (3.56 mm) vs. 108% (4.08 mm); *t*-tests not significant with *p*'s > 0.058]. Perturbed hand motions (**Figure 5**, triangles) were very accurately replicated when the to-be-replicated hand amplitude (phase 1) was 50 mm [*M* = 105% (2.66 mm); n.s.]. When in phase 1 the to-be-replicated hand amplitude was short (25 mm), significant overshoots occurred in phase 2 [*M* = 141% (10.47 mm); *t*(13) = 4.95; *p* < 0.01]. When in phase 1 the to-be-replicated hand amplitude was long (75 mm), significant undershoots occurred in phase 2 [*M* = 93% (−5.74 mm); *t*(13) = −2.75; *p* < 0.05]. Again, we compared replicated hand amplitudes with and without visual feedback in phase 1 (Experiment 2, perturbed hand motion vs. Experiment 1b). The ANOVA revealed a significant main effect of the factor hand amplitude [*F*(2, 56) = 83.71; *p* < 0.01; η<sup>2</sup> = 0.74] and a significant interaction with the factor visual feedback [*F*(2, 56) = 8.47; *p* < 0.01; η<sup>2</sup> = 0.23]. That means replicated amplitudes without visual feedback in phase 1 were more accurate and did not undershoot when compared with replicated amplitudes with visual feedback in phase 1. Again, this indicates that the increased inaccuracy observed in replicated amplitudes with visual feedback in phase 1 reflects visual aftereffects.

**FIGURE 5 | Experiment 2.** Mean estimated amplitude (%) for perturbed cursor motions (squares) and perturbed hand motions (triangles) as a function of length variation. A performance of 100% indicates exact replications. Error bars represent the standard error of the mean.

Further analyses were done to investigate the impact of contextual interference (lower in Experiment 1 than in Experiment 2) on aftereffects in motor replications. For the conditions perturbed cursor motions and perturbed hand motions estimated amplitudes (%) were analyzed separately using a 2 [*contextual interference:* low (Experiment 1) vs. high (Experiment 2)] × 3 (*length variation*: short vs. middle vs. long) mixed ANOVA for repeated measurements. For perturbed cursor motions the analysis did not reveal any significant main effect or interaction (all *p*'s > 0.28). For perturbed hand motions the ANOVA confirmed the significant main effect of the factor *length variation* [*F*(2, 58) = 87.74; *p* < 0.01, η<sup>2</sup> = 0.75]. Other effects did not reach significance (*p* > 0.24).

To address our research question of how experience shapes subsequent motor behavior we combined data from both experiments, since aftereffects did not differ between them. Data were analyzed separately for the conditions *perturbed cursor motion* and *perturbed hand motion*. For perturbed cursor motions (24% repetition trials, 76% switch trials) the 2 (*path length switch*: repetition vs. switch) × 3 (*length variation*) ANOVA for repeated measurements did not reveal any significant main effects or interaction (all *p*'s > 0.27). For perturbed hand motions (27% repetition trials, 73% switch trials) the ANOVA confirmed the significant main effect of the factor *length variation* [*F*(2, 50) = 53.92; *p* < 0.01, η<sup>2</sup> = 0.68] and a significant interaction between *length variation* and *path length switch* [*F*(2, 50) = 10.31; *p* < 0.01, η<sup>2</sup> = 0.29]. The main effect *path length switch* was not significant (*p* = 0.15). **Figure 6** depicts the results for the condition with perturbed hand motions. There are switch costs for path length changes that result in larger aftereffects [**Figure 6**, dashed line; 139% (10.0 mm), 106% (3.1 mm), and 95% (−3.8 mm); range 44% (13.8 mm)] compared to path length repetitions [**Figure 6**, solid line; 129% (7.4 mm), 108% (4.0 mm), and 97% (−2.2 mm); range 32% (9.6 mm)].
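The repetition vs. switch coding can be sketched as a comparison of each trial with its predecessor. The Python function below is a hypothetical illustration; how the first trial of a block and trials following an error trial were treated is not stated in the text, so leaving the first trial unlabeled is an assumption.

```python
def label_switches(lengths):
    """Label each trial as 'repetition' or 'switch' relative to the previous one.

    `lengths` is the sequence of path-length levels within one condition,
    e.g. ["short", "short", "long"]. The first trial gets None (assumption).
    """
    if not lengths:
        return []
    labels = [None]  # the first trial has no predecessor
    for previous, current in zip(lengths, lengths[1:]):
        labels.append("repetition" if current == previous else "switch")
    return labels
```

For example, `label_switches(["short", "short", "long"])` returns `[None, "repetition", "switch"]`. With three equally frequent path lengths in random order, switches are expected to be more frequent than repetitions, in line with the proportions reported above.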

#### **DISCUSSION**

Again, we did not find any aftereffects in the condition *perturbed cursor motion*. But in the condition *perturbed hand motion* replications in phase 2 varied as a function of visual target distance. The finding replicates the pattern of results from Experiment 1, and supports hypothesis 1b once more.

In our second hypothesis we assumed that the motion constancy and the smaller set size in Experiment 1 would benefit motor replications in phase 2, and predicted smaller aftereffects in Experiment 1 than in Experiment 2. Although the data show a numerical increase in aftereffects the way we predicted, differences were not statistically significant. We will open a deeper discussion on that in the following section.

Finally, data exploration concerning a performance benefit in gain repetitions revealed a significant reduction of aftereffects for repetitions compared to switches. The pattern of results is similar to that found by Verstynen and Sabes (2011), who demonstrated the adaptation benefit for angular deviations. However, the task of the present experiment did not address adaptation.

Remember, participants did not receive any visual feedback in phase 2, where they had to replicate the hand amplitude from phase 1 as accurately as possible. It is known that movements are usually pre-programmed with the previously-used internal model (Wolpert and Flanagan, 2001). Deviations between predicted and actual outcome reflect the specifications of the pre-programmed movement, and are usually corrected online when they become apparent. In this way, the forward model is continuously adjusted. We assumed that gain repetitions, and more specifically the closed-loop control in phase 1 functioned in a way to adjust the internal model. This seemed to be the case and smaller aftereffects occurred for gain repetitions than for gain switches.

## **GENERAL DISCUSSION**

The aim of the present study was to investigate the contribution of dimensional overlap to aftereffects in motor replications. The task, adapted from Ladwig et al. (2012), was to move a cursor on the display from a start position to a target. The cursor motion followed the shape of an inverted U while the hand motion followed a straight horizontal line (phase 1). Then movement direction had to be inverted to replicate the formerly performed hand amplitude on the return without visual feedback (phase 2). The variation of length and shape of hand and cursor motion in phase 1 decoupled two different relations in visually controlled aiming movements (**Table 1**): First, the relation between hand motion and cursor motion (dissimilar length and dissimilar shape = dimensions did not overlap), and second, the relation between hand motion and visual target distance [similar or dissimilar length (dimension did or did not overlap) and similar shape (dimensional overlap)].

Both experiments confirm that aftereffects occur when dimensions between visual and proprioceptive action effects overlap. Thus, when the shape of hand motion and visual target distance was similar, and the hand path length (in phase 1) was shorter (longer) than the visual target distance, participants overshot (undershot) in phase 2. This pattern of aftereffects in terms of systematic over- and undershoots was observed in several studies by Ladwig et al. (2012, 2013) for motor responses. In their experiments, even though the hand amplitude was constant, the varying cursor amplitude in phase 1 produced aftereffects in phase 2. In this condition movements could have been performed without any corrections of the previously-used motor program (Wolpert and Flanagan, 2001), but they weren't. That motor replications were still influenced by the formerly perceived visual information speaks in favor of a common representational domain for sensory information belonging to the same event (e.g., Prinz, 1997; Hommel et al., 2001). The common representation in form of an event code makes it possible for sensory information to interact with each other and to influence subsequent actions. However, this is not the whole explanation of the present findings.

Comparing conditions with and without visual feedback in phase 1 should clarify the impact of visual feedback on observed deviations. We assumed that if visual feedback in phase 1 induced deviations in phase 2, then the hypothesized pattern of over- and undershoots should occur in Experiments 1a and 2, but not in Experiment 1b. But this was not what we found. Deviations in phase 2 were present in both conditions, and although deviations were considerably smaller without visual feedback, they were in the same direction as in the condition with visual feedback. This strongly points to other factors besides visual feedback that influence motor replications in phase 2, for instance a regression-to-the-mean effect (Teghtsoonian and Teghtsoonian, 1978). In our experiments the middle path length (50 mm) represents the mean length. In the condition without visual feedback, motor replications of the short (25 mm) and the long (75 mm) path lengths deviated by nearly the same amount from the "mean" (short-middle: −1.91 mm; long-middle: −2.16 mm). The larger deviations in the condition with visual feedback showed the same symmetry around the mean.
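The two distances can be recovered from the mean deviations reported for Experiment 1b (6.83, 4.92, and 2.76 mm for the 25, 50, and 75 mm amplitudes). Taking magnitudes, since the sign depends only on the order of subtraction:

$$\lvert 6.83 - 4.92 \rvert = 1.91\,\text{mm}, \qquad \lvert 2.76 - 4.92 \rvert = 2.16\,\text{mm}$$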

Rieger et al. (2005) found the same pattern of over- and undershoots when investigating the compensation for gain changes. Participants were asked to perform up- and downward strokes between two visual target lines by moving a pen on a covered digitizer tablet up and down. After six baseline strokes (gain 1:1) a gain was introduced for further six strokes. In one experimental condition the gain resulted in constant cursor amplitudes while the hand amplitude was shorter or longer (cf. Ladwig et al., 2012: condition varying hand amplitude). In another experimental condition the gain resulted in constant hand amplitudes while the cursor amplitude was shorter or longer (cf. Ladwig et al., 2012: condition constant hand amplitude). After that, another six baseline strokes were presented. Compensation for changes was measured by analyzing the deviation from the target line (in mm) for the first stroke performed after an experimental condition. When the hand amplitude (both experimental conditions) was longer (shorter) than the cursor amplitude, undershoots (overshoots) occurred in the first stroke performed afterwards. This result closely resembles the pattern found by Ladwig et al. (2012, 2013) as well as the pattern found in the present experiment for perturbed hand amplitudes. Although the differences between experimental tasks don't allow a direct comparison, the finding—that subsequent motor actions are influenced by formerly perceived visual information—is again in line with the predictions of common coding approaches (e.g., Prinz, 1997; Hommel et al., 2001).

Considering the impact of dimensional overlap on motor replications further, Ladwig et al. (2012) reduced the overlap between visual and proprioceptive action effects along one dimension. In one condition (Ladwig et al., 2012; Experiment 1) a 90° rotation of the visual cursor motion resulted in upward-downward movements of the cursor when the hand produced horizontal leftward-rightward movements on the tablet. Consequently, the orientations of hand and cursor motion no longer overlapped (horizontal vs. vertical). The shape was still similar (linear movements = dimensional overlap) and the length was either similar or not (dimension did or did not overlap). Aftereffects in terms of over- and undershoots were still significantly present. But they were considerably smaller when the dimensional overlap was limited (horizontal vs. vertical) compared to when dimensions did overlap (both motions horizontal). In the present experiments dimensional overlap concerns the shape (linear vs. curved) and length of motion (similar vs. dissimilar). Concerning shape, hand motion and visual target distance did overlap (both linear), but hand motion and cursor motion did not overlap (linear vs. curved). We observed aftereffects depending on length variations between hand path length and visual target distance only. However, future studies should also consider measuring deviations along the y-axis. We did not observe any aftereffects from curved amplitudes along the x-axis. But if hand movements had not been restricted along the y-axis, as they were in our experiments, aftereffects from the curved amplitude could have been observed. Measuring deviations along both axes allows distinguishing between "length-aftereffects" (= deviations along the x-axis) and "shape-aftereffects" (= deviations along the y-axis).

Further experiments are necessary to fully confirm our conclusion about the dimensional overlap being responsible for aftereffects. If it is the dimensional overlap between visual and proprioceptive effects and not (only) the cognitive representation of the action's goal (James, 1890; Greenwald, 1970; Hommel et al., 2001) then performing curved hand motions instead of linear ones should lead to the pattern of aftereffects we predicted in hypothesis 1a.

To sum up, although in visually controlled manual movements visual and proprioceptive action effects might not be integrated (for instance, because they do not overlap or are spatially separated; e.g., Ernst and Banks, 2002; Gepshtein et al., 2005), they nevertheless affect motor performance in terms of aftereffects.

The data did not support our second hypothesis, in which we expected smaller aftereffects in Experiment 1 than in Experiment 2 because of the simplified context in phase 1. In Experiment 1, one aspect of the task in phase 1 always remained constant within a block, either the hand motion or the cursor motion. This constancy and the smaller set size of contexts to be learned should benefit motor replications in phase 2. Aftereffects increased numerically in the way we predicted; however, the differences were not significant. There are several possible explanations for this. First, in both experiments trials varied randomly, and although motion constancy and set size differed, it seems that the contextual changes between Experiments 1 and 2 were not very distinct and did not induce (enough) interference. Second, in our experiments, the mapping between hand path length and visual action effects was very simple (short vs. middle vs. long) and consisted of 5 different trials in total. Participants could have been able to acquire implicit knowledge about the transformations. A higher number of gain factors should increase contextual interference. In an as yet unpublished experiment, a signal between phases 1 and 2 indicated whether participants had to reproduce the hand motion or the cursor motion in phase 2. Aftereffects increased considerably when compared to a blocked reproduction of either hand or cursor amplitude. This could be another manipulation to increase contextual interference.

The exploratory analysis of gain-repetition and gain-switch trials supports the view that gain repetitions adjusted the forward model. It seemed to become more accurate, so that smaller aftereffects occurred in gain repetitions than in gain switches (range of aftereffects: 9.6 mm vs. 13.8 mm). Aftereffects significantly dropped by 4.2 mm; that is a 30% benefit from a repeated prior trial. In the present experiment we did not control for the number of gain repetitions and gain switches. Comparable to former studies in our lab (e.g., Ladwig et al., 2012, 2013), trials were presented completely randomly to control for confounds in task presentation. Nevertheless, the results are quite promising, and further experiments will give a more detailed insight into the processes of sensorimotor control. Finally, one could assume that aftereffects are not (much) influenced by linear mappings, but might be more affected by dynamic mappings. For the latter, action effects become less predictive, and research on these kinds of transformations demonstrates great inaccuracies in motor behavior. Moreover, users are not able to fully acquire a correct cognitive representation of the transformations, but approximate the internal model (e.g., Sülzenbrück and Heuer, 2009).
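The arithmetic behind the reported benefit can be retraced with a minimal sketch. It uses only the two ranges given in the text; the assumption (not stated explicitly in the source) is that the 30% figure is expressed relative to the gain-switch range, which is the only baseline that reproduces it. The function name is hypothetical.

```python
def relative_reduction(switch_mm: float, repetition_mm: float) -> float:
    """Proportional drop in aftereffect range, relative to gain-switch trials.

    Assumption: the switch-trial range is the baseline for the percentage.
    """
    return (switch_mm - repetition_mm) / switch_mm


switch_range = 13.8      # mm, range of aftereffects after gain switches (from the text)
repetition_range = 9.6   # mm, range of aftereffects after gain repetitions (from the text)

drop_mm = switch_range - repetition_range
benefit = relative_reduction(switch_range, repetition_range)
print(f"{drop_mm:.1f} mm, {benefit:.0%}")  # prints "4.2 mm, 30%"
```

Taking the repetition range as the baseline instead would give roughly 44%, so the reported value only follows under the assumption above.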

In conclusion, dimensional overlap between visual and proprioceptive action effects modulates human information processing in visually controlled actions. However, adjustment of the internal model seems to occur very fast for this kind of simple linear transformation, so that the impact of prior visual feedback is fleeting.

#### **ACKNOWLEDGMENTS**

We wish to thank Florian Bade, Stefan Ladwig, Jens Tiggelbeck, Michael Wagner, and Nora Zekorn for research support. Special thanks are dedicated to Dr. Simon Watt for inviting Christine Sutter to his lab at Bangor University (Wales/UK) and for the inspiring discussions we had there.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 September 2013; accepted: 27 February 2014; published online: 17 March 2014.*

*Citation: Wendker N, Sack OS and Sutter C (2014) Visual target distance, but not visual cursor path length produces shifts in motor behavior. Front. Psychol. 5:225. doi: 10.3389/fpsyg.2014.00225*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Wendker, Sack and Sutter. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**REVIEW ARTICLE** published: 06 February 2014 doi: 10.3389/fpsyg.2014.00084

## Vision affects tactile target and distractor processing even when space is task-irrelevant

## *Ann-Katrin Wesslein<sup>1,2</sup>, Charles Spence<sup>2</sup> and Christian Frings<sup>1</sup>\**

*<sup>1</sup> Cognitive Psychology, Department of Psychology, University of Trier, Trier, Germany*

*<sup>2</sup> Department of Experimental Psychology, University of Oxford, Oxford, UK*

#### *Edited by:*

*Knut Drewing, Giessen University, Germany*

#### *Reviewed by:*

*Flavia Cardini, Royal Holloway University of London, UK; Patricia Garrido-Vásquez, University of Marburg, Germany*

#### *\*Correspondence:*

*Christian Frings, Cognitive Psychology, Department of Psychology, University of Trier, Campus I, D-54286 Trier, Germany e-mail: chfrings@uni-trier.de*

The human brain is adapted to integrate the information from multiple sensory modalities into coherent, robust representations of the objects and events in the external world. A large body of empirical research has demonstrated the ubiquitous nature of the interactions that take place between vision and touch, with the former typically dominating over the latter. Many studies have investigated the influence of visual stimuli on the processing of tactile stimuli (and vice versa). Other studies, meanwhile, have investigated the effect of directing a participant's gaze either toward or else away from the body-part receiving the target tactile stimulation. Other studies, by contrast, have compared performance in those conditions in which the participant's eyes have been open versus closed. We start by reviewing the research that has been published to date demonstrating the influence of vision on the processing of tactile *targets*, that is, on those stimuli that have to be attended or responded to. We outline that many – but not all – of the visuotactile interactions that have been observed to date may be attributable to the direction of spatial attention. We then move on to focus on the crossmodal influence of vision, as well as of the direction of gaze, on the processing of tactile *distractors*. We highlight the results of those studies demonstrating the influence of vision, rather than gaze direction (i.e., the direction of overt spatial attention), on tactile distractor processing (e.g., tactile variants of the negative-priming or flanker task). The conclusion is that no matter how vision of a tactile distractor is engaged, the result would appear to be the same, namely that tactile distractors are processed more thoroughly.

**Keywords: touch, multisensory integration, selective attention, distractor processing, visuo-tactile interaction**

## **INTRODUCTION**

At each and every waking moment, our brains are likely to be processing some combination of visual, auditory, tactile, and even olfactory stimuli. That said, we are nevertheless able to focus our attention on a single sensory modality at a time, such as on audition when listening to a concert, or on vision when reading a book. However, no less remarkably, we can also integrate the inputs arriving from the different senses, such as when watching a movie, where the auditory and visual inputs are both likely to be attended to simultaneously, or when looking at the object that we happen to be palpating in our hands. The basic ability to process information from two or more sensory modalities simultaneously and to integrate that information in order to form coherent representations of the external world renders us multisensory creatures (e.g., Stein et al., 1996; Ernst and Bülthoff, 2004).

It can be argued that the interactions observed between vision and touch represent a special case of multisensory integration. For, unlike other combinations of the "spatial senses" (including the modalities of vision, touch, and audition), these two senses are very often stimulated at one and the same time. The reason for this is that tactile stimulation is almost always accompanied by some visual event, that is, by a potentially observable object touching the body surface. Hence, an organism can often use visual information in order to help predict impending tactile stimulation. Often, visual information can also be used to specify the location from which that stimulation happens to have originated in external space (see Gallace and Spence, 2014, for a review).

Due to its relevance to our everyday lives, the interplay between vision and touch has been investigated by a large body of research over the last 75 years or so (see Tastevin, 1937; Gibson, 1943 for early work), which has taken a variety of different approaches to the topic. While a number of researchers have utilized independent visual and tactile stimuli, other studies have investigated how vision of the body-part being stimulated can influence a participant's performance in a purely tactile task. Strikingly, and irrespective of the approach that has been taken, many studies that have looked at interactions between the modalities of vision and touch can be classified as being, in some sense, spatial (cf. Spence, 2013). In many studies, this is because the participants have had to perform tasks that were explicitly spatial, such as, for example, in the orthogonal spatial-cuing paradigm, where the target property to be judged by the participant is its relative elevation (see Spence and Driver, 2004; Spence, 2013, for reviews).

Other studies that have utilized, for example, the recently repopularized rubber-hand illusion (RHI) paradigm (Botvinick and Cohen, 1998; see also Tastevin, 1937, for early work in this area), have tended to utilize a visuotactile illusion resulting from the misattribution of the location of one's own limb in external space (see Makin et al., 2008, for a review; though see also Ehrsson et al., 2005). Meanwhile, many other studies have investigated the influence of variations in the direction of a participant's gaze (and hence vision) either toward, or away from, the body-part that is being stimulated, on tactile perception. Overt visual (and hence spatial) attention is, by definition, associated with the current direction of a person's gaze. As such, to the degree that visual attention may give rise to enhanced tactile information processing at attended locations, these studies were not designed to reveal visuotactile interactions outside the realm of spatial attention.

Since in one way or another the participants in these commonly utilized tasks have needed to attend to a specific location, it is unclear whether vision actually affects tactile information processing merely when/because gaze (i.e., overt spatial attention) happens to be directed toward the location in space where a tactile event subsequently happens to be presented. That is, when interpreting the results of such studies, spatial attention (or the misattribution of the location of one's limb in external space) is a mechanism that can potentially explain the effects allegedly mirroring influences of vision on tactile information processing.

Recently, it has been suggested that vision and touch (as has been shown to be the case for other combinations of the spatial senses) likely interact in a "*what*" as well as in a "*where*" system (Spence, 2013; see, e.g., Schneider, 1969; Goodale and Milner, 1992; Creem and Proffitt, 2001, for the distinction of two pathways in the visual modality, and see, e.g., Reed et al., 2005; Van Boven et al., 2005, for this distinction in the tactile modality). Within such a dual-stream model, we are especially interested in the "what" system, that is, in the pathway by which vision influences the identification and identity- (rather than location-) based selection of tactile stimuli (see, e.g., Rock and Victor, 1964; Rock and Harris, 1967; Ernst and Bülthoff, 2004; Moeller and Frings, 2011, for a few of the studies that have, intentionally or otherwise, attempted to focus on visuotactile interactions within the "what" system).

In this review, the influence of vision on tactile information processing will be critically evaluated. In particular, we review the various evidence that supports a spatial, as well as a nonspatial, influence of vision on the processing of tactile distractors. In the first part of this review, however, we will consider the extant literature that has looked at the influence of vision on the processing of tactile targets. There, we present the results of spatial cuing studies and those studies that have investigated the impact of changes in the direction of a participant's gaze on tactile information processing. Then, turning to those studies in which spatial influences have been controlled for, we go on to present evidence demonstrating that the speeded detection of tactile targets can be facilitated, and tactile resolution enhanced, at those locations on the body surface that can be seen (as compared to when vision of the body-part isn't allowed; e.g., Tipper et al., 1998; Kennett et al., 2001b) even when the direction of a participant's gaze is held constant.

To date, far less is known about the influence of vision on the processing of tactile distractors. Thus, in the second part of this review, we will take a closer look at the literature that has attempted to analyse the influences of distinct visual stimuli, gaze direction, and vision (or rather gaze direction) on tactile distractor processing. We will argue that vision appears to enhance the processing of tactile distractors by spatial as well as non-spatial means – just as is the case for tactile targets – even when vision is entirely irrelevant to a participant's task.

## **SPATIAL CONTRIBUTIONS TO THE INFLUENCE OF VISION ON THE PROCESSING OF TACTILE TARGETS**

The research on visuotactile interactions that has been conducted to date can be broken down into two broad categories; on the one hand, both visual and tactile stimuli have been presented to test whether visual stimuli (e.g., cues or distractors) exert a significant influence over the processing of tactile events (e.g., targets) that happen to occur at around the same time. Here, tasks that are explicitly spatial have typically been used (Spence et al., 2004a). So, for example, in a number of studies, the location of the target has been the stimulus property that participants have had to respond to. In variants of the orthogonal spatial-cuing task (see Spence and Driver, 2004, for a review), as well as in variants of the crossmodal congruency task (see Spence et al., 2004b, 2008, for reviews), participants have often been required to discriminate whether a vibrotactile target presented to the thumb or index finger of either hand has been presented from one of the two upper locations versus from one of the two lower locations instead. Typically, participants have had to respond by making a speeded toe versus heel response to indicate the elevation of the target.

On the other hand, there are those studies in the literature in which the participants have either been instructed to direct their gaze toward the part of their body that is being touched, or to divert their gaze elsewhere. Within this group of studies, researchers have also compared participants' performances in those conditions in which vision of the body-part that was being stimulated was available versus those conditions in which the participants have been blindfolded (and hence vision was unavailable). Another comparison that researchers have been fond of making is between those conditions in which the participants either have, or have not, been able to see the tactile stimulus impacting on their skin surface. Note that these studies implicitly inherit a spatial bias, since the participants had to direct their (visual and tactile) attention selectively toward a particular location rather than another. For each of these kinds of visual manipulation, we will outline the role of space, and thus highlight how it might contribute to the interaction of interest.

In order to specify the spatial constraints on any interactions between visual and tactile stimuli, many studies have implemented variants of the crossmodal spatial attentional-cuing paradigm (see Posner, 1978, 1980, for the original unimodal spatial-cuing paradigm of visual spatial attention). This has become a well-established tool used by researchers in order to investigate how attention is directed spatially by the presentation of a pre-cue (see Spence and Driver, 2004, for a review of the crossmodal cuing literature; see **Table 1**).

In a typical exogenous study, spatially non-predictive visual pre-cues are presented shortly before tactile targets (or vice versa). Importantly, in the unimodal as well as the crossmodal variants of this task, the participant has to judge the external location (normally the elevation) from which the target stimulus has been presented (e.g., Kennett et al., 2001a, 2002; Spence and McGlone, 2001), and thus their task is inherently spatial in nature (see Spence and McDonald, 2004; Spence et al., 2004a, on this point).

*Within this table the term vision describes whether a condition where vision of the body-part stimulated was provided is compared to a condition where vision of the body-part stimulated was prevented irrespective of whether vision was manipulated by blindfolding the participants or by selectively occluding the body-part stimulated. V* = *visual stimuli were presented; T* = *tactile stimuli were presented.*

It is now well-known from those visuotactile studies that have used the crossmodal orthogonal spatial-cuing paradigm<sup>1</sup> (and where the cue does not elicit a response bias; see e.g., Spence and Driver, 1997) that the responses of participants toward those tactile targets that happen to be presented from the same location (or side) as the visual pre-cues tend to be faster and more accurate than their reactions toward the same targets when the visual cue happens to be presented from the other side of central fixation instead (e.g., Kennett et al., 2002). Such a pattern of performance facilitation has normally been explained in terms of an exogenous shift of spatial attention. As an aside, if the temporal interval between the onset of the visual cue and the tactile target is increased, then the facilitation that is normally observed at the cued location can sometimes be replaced by a longer-lasting inhibitory aftereffect, known as inhibition of return (IOR; e.g., Spence et al., 2000a).
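How such a cuing effect is typically quantified can be illustrated with a minimal sketch: the mean reaction time (RT) on trials with the cue on the opposite side minus the mean RT on trials with the cue on the same side, so that positive values indicate facilitation and negative values indicate IOR. The function names and all RT values below are hypothetical and invented purely for illustration; they are not data from any of the studies cited here.

```python
from statistics import mean


def cuing_effect(rt_cued, rt_uncued):
    """Mean RT (ms) on uncued-side trials minus mean RT on cued-side trials."""
    return mean(rt_uncued) - mean(rt_cued)


def label(effect_ms):
    """Name the direction of a cuing effect."""
    if effect_ms > 0:
        return "facilitation"
    if effect_ms < 0:
        return "inhibition of return"
    return "no effect"


# Hypothetical RTs (ms) for a short and a long cue-target interval
short_interval = {"cued": [410, 420, 430], "uncued": [440, 450, 460]}
long_interval = {"cued": [470, 480, 490], "uncued": [450, 460, 470]}

short_effect = cuing_effect(short_interval["cued"], short_interval["uncued"])
long_effect = cuing_effect(long_interval["cued"], long_interval["uncued"])

print(short_effect, label(short_effect))  # 30 facilitation
print(long_effect, label(long_effect))    # -20 inhibition of return
```

In this invented example the sign of the difference flips as the cue-target interval grows, which is the qualitative pattern described above.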

<sup>1</sup>The "orthogonal" here refers to the fact that the dimension along which the cue varies is orthogonal to the dimension along which the targets have to be discriminated, hence ruling out a response-bias explanation for any cuing effects that may be observed (see Spence and Driver, 1997).

Those studies that have used the crossmodal congruency task (see **Figure 1A**, for a schematic figure of the experimental set-up) have typically demonstrated that a participant's performance in a speeded elevation-discrimination task is worse when visual distractors are presented from an incongruent elevation with respect to the tactile target than when both target and distractor happen to be presented from the same (i.e., congruent) elevation (see Spence et al., 2004c, 2008, for reviews). The "crossmodal congruency effect" is largest when the stimuli are presented from the same lateral position (or side of fixation), and has been shown to fall off as the lateral separation between the target and distractor increases (e.g., as when the target and distractor are presented to separate cerebral hemispheres).
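The crossmodal congruency effect described above is commonly expressed as a difference score: performance on incongruent trials minus performance on congruent trials. The sketch below computes it from reaction times separately for distractors on the same side as the target and on the other side. The trial format, the function name, and every RT value are hypothetical, chosen only to mirror the qualitative pattern reported in the text.

```python
from statistics import mean


def congruency_effect(trials, side):
    """Incongruent-minus-congruent mean RT (ms) for one lateral arrangement."""
    def mean_rt(congruent):
        return mean(
            t["rt"] for t in trials
            if t["side"] == side and t["congruent"] == congruent
        )
    return mean_rt(False) - mean_rt(True)


# Hypothetical trials: distractor on the same or the different side as the target
trials = [
    {"side": "same", "congruent": True, "rt": 500},
    {"side": "same", "congruent": True, "rt": 520},
    {"side": "same", "congruent": False, "rt": 590},
    {"side": "same", "congruent": False, "rt": 610},
    {"side": "different", "congruent": True, "rt": 505},
    {"side": "different", "congruent": True, "rt": 515},
    {"side": "different", "congruent": False, "rt": 540},
    {"side": "different", "congruent": False, "rt": 560},
]

same_side = congruency_effect(trials, "same")
different_side = congruency_effect(trials, "different")
print(same_side, different_side)  # 90 40
```

With these invented numbers the effect is larger for same-side than for different-side distractors; error rates could be scored in the same way.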

The results of the large number of studies that have been conducted over the last decade or so using either one of these experimental paradigms – the crossmodal orthogonal spatial-cuing paradigm or the crossmodal congruency task – have generally converged on the conclusion that the relative location from which the multisensory stimuli have been presented determines the degree to which they exert an influence over one another (excepting any effects that can be attributed to mere eccentricity effects).

Referring to the distinction between *exogenous* and *endogenous* spatial cuing, there is now robust evidence to support the role of space in both types of crossmodal spatial-cuing paradigms (e.g., Spence et al., 2000b; Driver and Spence, 2004). What this means is that the relative location of visual and tactile stimuli determines visuotactile interactions in both a "bottom-up" as well as a "top-down" manner. More specifically, in those studies that have used the exogenous spatial-cuing paradigm, the influence of a salient pre-cue on a participant's reaction toward a subsequently presented target has been investigated. As a result, the pre-cue is non-informative with regard to the likely location (or identity) of the target (and may thus be regarded as a to-be-ignored task-irrelevant distractor). Consequently, the target is as likely to occur in the same location as the pre-cue as it is to occur at a different one; thus, the *stimulus-driven* effect of a pre-cue on a target is obtained within exogenous spatial-cuing paradigms (e.g., Spence et al., 1998a, 2000a; Chong and Mattingley, 2000; Kennett et al., 2001a; Gray and Tan, 2002; Spence and McDonald, 2004; see also Spence et al., 2004a, for an overview of the crossmodal research utilizing the exogenous spatial-cuing paradigm).

By contrast, in those studies that have attempted to investigate endogenous spatial-attentional cuing, a pre-cue that is predictive with regard to the location of the target has been documented to give rise to attentional shifts. Thus, within the endogenous cuing paradigm, the *top-down* crossmodal effects of a pre-cue on a target have been examined (Spence et al., 2000b; see also Spence et al., 1998b, 2004b; Driver and Spence, 2004, for a review).

Importantly, visuotactile interactions have largely been obtained within both variants of the crossmodal spatial-cuing paradigm, despite the striking differences that have sometimes been observed between exogenous and endogenous spatial attention (see Spence and Driver, 1994; Klein and Shore, 2000; see also Spence and Gallace, 2007, for exogenous and endogenous attentional effects specifically in the tactile modality), thus indicating that space (supramodally) moderates stimulus-driven as well as top-down effects between vision and touch.

The available research that has been published to date therefore suggests that it is the *relative location* from which the visual and tactile stimuli are presented in external space that determines the magnitude of any crossmodal spatial-cuing effects. So, for example, holding the hands in a crossed posture causes a reversal of the observed effects in exogenous (Kennett et al., 2002) as well as in endogenous cuing paradigms (Spence et al., 1998b, 2004b): a visual cue presented on the left (right) side elicits more pronounced interference effects for tactile targets presented on the right (left) hand when the hands are crossed. The same crossing effect has also been documented in those studies that have used the crossmodal congruency task (see Spence et al., 2004c, 2008, for reviews). Hence, irrespective of the posture adopted by the participant's hands, the influence of vision on tactile information processing is especially pronounced when the visual distractor occurs on the same side of external space as the tactile target. This result means that it is the location of the stimuli in external space, rather than their initial hemispheric projections, that is the crucial factor when it comes to determining how space moderates the integration of visual and tactile stimuli (see Sambo and Forster, 2009, for supporting evidence from an event-related potentials, ERP, study), at least in neurologically normal participants (see Spence et al., 2001a,b, for patient data; see also Valenza et al., 2004, for the effects of changes to the posture of the hands on the tactile discrimination performance in a patient with bilateral parietal damage).

The influence of vision on tactile information processing has also been analyzed using the *attentional blink* (AB) paradigm. The AB refers to an impairment in responding to a target that happens to be presented after another target that requires a response, as compared to a target that happens to be presented after another target that does not require a response. Besides the well-established AB that has been documented repeatedly in the visual modality over the last couple of decades or so (see Raymond et al., 1992, for the original study), an AB has also been demonstrated in both the auditory (e.g., Soto-Faraco and Spence, 2002) and tactile modalities (Hillstrom et al., 2002; Dell'Acqua et al., 2006). Importantly, however, with regard to the scope of the present review, Soto-Faraco et al. (2002) reported evidence in support of the existence of a crossmodal visuotactile AB. Given that Soto-Faraco et al. (2002) implemented a spatial-localization task (i.e., a speeded target elevation-discrimination task), this result is in line with the evidence obtained within crossmodal spatial-cuing paradigms in highlighting that visuotactile interactions may be more apparent in those tasks where space is somehow relevant to the participant's task (cf. Spence, 2013). As an aside, note that the asymmetrical pattern of results in the blocked conditions indicates that responses associated with visual stimuli exhibited a stronger aftereffect over subsequent target processing than responses associated with tactile stimuli.

Building on the research demonstrating that a neutral visual stimulus enhances the processing of co-located tactile stimuli that happen to be presented subsequently, Poliakoff et al. (2007) demonstrated the modulation of the magnitude of this visuotactile spatial-cuing effect by the threat value of the visual stimulus (i.e., threatening pictures of snakes and spiders vs. non-threatening pictures of flowers and mushrooms). That is, threatening visual cues enhanced tactile processing at the pre-cued location more than did non-threatening visual pre-cues, indicating that threat value modulates the amount of (spatial) attention allocated to a visual stimulus, thereby influencing the processing of a subsequent tactile target at exactly this location.

From a somewhat different viewpoint, Poliakoff et al.'s (2007) results indicate that proximity to the hands can increase the amount of attention that is allocated to a threatening stimulus. As an aside, then, Abrams et al. (2008) demonstrated that proximity of the hands can also augment the capture of attention by a non-threatening visual stimulus. In their study, proximity of the hands (hands held close to vs. far from the display where visual stimuli happened to be presented) moderated visual search, visual IOR, and visual AB. That is, the processing of visual stimuli was prolonged for those stimuli near the hands (i.e., participants were slower to disengage their attention from those visual stimuli close to the hands) as compared to those far from the hands. These results show that the disengagement of attention from visual stimuli is delayed near the hands. Thus, proximity to the hands can be concluded to alter visual information processing.

Yet, going beyond the investigation of effects of proximity to the hands (and again investigating tactile information processing), Van Damme et al. (2009) implemented an experimental setup similar to that of Poliakoff et al. (2007) but compared the effects of visual stimuli showing different types of threat, namely the threat of physical harm to the hand (which was the body region receiving the tactile stimulation in this study) versus general threat to the whole person. Extending Poliakoff et al.'s (2007) findings, these researchers demonstrated that physical threat *selectively* elicited a shift of their participants' tactile spatial attention. This was reflected in the prioritization of tactile information presented at the hand positioned at about the same location where the visual pre-cue showing physical threat had happened to occur over tactile information presented to the other hand. By contrast, the shift of auditory spatial attention was not modulated by the type of threat. Hence, auditory spatial attention may generally be enhanced in threatening situations, while the amount of attention captured by a tactile stimulus delivered to the hands further depends on the degree of apparent threat of physical harm toward specifically this body-part. Summing up, it seems that not only does the proximity of the hands to (threatening) visual stimuli determine the allocation of spatial attention, but the focus of tactile spatial attention can also be guided precisely by information about which body-part is threatened.

What we have covered so far in this review are those tasks concerned with *covert* spatial attention within the visual modality. In these tasks, participants focus their visual attention on a specific location without making any overt head, eye, or bodily movements (e.g., Spence et al., 1998a, 2000b; Kennett et al., 2002). Importantly though, the presentation of a visual stimulus has also been demonstrated to enable faster and more accurate saccades toward a to-be-detected tactile target (i.e., speeded *overt*-orienting response) if the visual stimulus is located at approximately the same spatial position (Diederich et al., 2003). Note here that covert and overt tactile spatial attention are typically linked, but – as for vision – can be separated under a subset of experimental conditions (e.g., Rorden et al., 2002). In conclusion, it seems that the magnitude of visuotactile interaction effects elicited by covert as well as overt spatial orienting is moderated by the distance between the visual and tactile stimuli involved.

In line with the assumption of a spatially specific influence of vision on touch, Valenza et al. (2004) reported a facilitation of tactile discrimination performance on trials where a visual distractor was presented on the same side as the tactile target as compared to trials where the visual distractor was presented on the opposite side. What is more, this facilitation was observed only in healthy individuals and not in a patient with bilateral parietal damage. Still, the patient's left-hand responses were speeded up by a concurrent visual distractor (as compared to when no visual distractor was presented) irrespective of whether the distractor occurred on the right or the left side. As these are results from a single-case study, they provide only indicative evidence. Nevertheless, it should be noted that these results suggest that there are *spatial* mechanisms by which visual stimuli can affect tactile information processing but that, in addition, vision also exerts a spatially non-specific influence over tactile information processing.

Rather than using distinct visual and tactile stimuli, several researchers varied whether or not their participants were able to see the body-part receiving the tactile stimulation by manipulating the direction of participants' gaze. For example, faster tactile target detection has been reported when the eyes are directed toward the stimulated region on the skin surface than when they are directed toward another area (in the same or the opposite visual hemisphere; Honoré et al., 1989).

Accordingly, Tipper et al. (1998) compared two conditions within the same group of participants, whose hands were occluded from view by cardboard boxes: a condition in which gaze was directed toward the hand receiving the tactile stimulation and a condition in which gaze was directed toward the other hand. Tactile target detection was faster in the former condition. Remarkably, these studies imply influences of the visual modality on the performance of a purely *tactile* task.

Still, the observed effects of gaze direction might reflect effects of spatial attention toward the body-part that is being stimulated. More specifically, the direction of gaze toward the body-part stimulated might enhance processing of stimuli occurring within the respective region on the body surface, thus causing the effects reported. In the studies of tactile target detection that have just been reviewed, the contributions of vision and gaze direction to tactile perception cannot be disentangled from these effects attributable to attention.

Combining the approaches of presenting a visual pre-cue prior to tactile stimulation and of manipulating whether vision of the body-part stimulated is provided, it has been demonstrated that visual events increase the probability of participants erroneously reporting a tactile sensation (as measured by the Somatic Signal Detection Task). However, this only holds true when vision of the stimulated body-part (i.e., the hand) is provided (Mirams et al., 2010). This finding emphasizes that non-informative visual stimuli may not only enhance the processing of tactile stimuli occurring at roughly the same location, but may also interfere with tactile processing, thus possibly leading to the sensation of touch in the absence of any actual stimulation. In other words, the direction of spatial attention toward a body-part that the participant expects to receive a tactile stimulus can have detrimental effects on tactile target detection (i.e., it can give rise to higher false-alarm rates).

Taken together, the findings presented so far show that what a person sees can affect their tactile perception and facilitate responding to a tactile target. Still, the degree to which the influences that have been obtained result from spatial attention processes elicited by gaze direction remains an open issue, as does the question of whether vision can influence tactile target processing in a non-spatial fashion.

## **HOW VISION INFLUENCES TACTILE TARGET PROCESSING EVEN WHEN SPACE IS COMPLETELY TASK-IRRELEVANT**

In this section, we review those studies that have investigated whether tactile information processing can be influenced by vision while the direction of gaze (and thus the direction of spatial attention) is held constant. Strikingly, these studies still provide evidence in favor of the influence of vision on the tactile modality, thus suggesting mechanisms beyond those mentioned so far that underlie the influence of vision on touch.

In order to provide insights regarding the influence of vision on touch *irrespective of* the effect of the spatial domain, the experimental procedures implemented need to meet some important criteria. Most importantly (and unlike the crossmodal spatial-cuing tasks presented in the previous section), tasks in which the to-be-judged target property is not spatial have to be used. Furthermore, the direction of gaze needs to be controlled for, since it represents a spatial confound when the influence of vision of a body-part being stimulated is under investigation.

Going beyond the influence of gaze direction, Kennett et al. (2001b) have reported that vision of the body-part stimulated *per se* can enhance tactile resolution. Control of the direction of gaze and thus of spatial attention was achieved by comparing two conditions with the participant's gaze being directed toward the same location in both conditions. While in one condition the participants were able to see the body-part that was being stimulated (i.e., the forearm) shortly before the stimulus was delivered, participants in the other condition were presented with a neutral object that appeared as though it was positioned at the location to which the stimulation was being delivered.

Tactile resolution, as assessed by means of the two-point discrimination threshold, was enhanced when vision of the body-part that was about to receive a tactile stimulus was provided (as compared to when gaze was directed to the same location but the body-part was occluded by a neutral object). Importantly, vision at the moment when the tactile stimulus touched the skin surface was prevented in any case. Hence, the observed results indicate that vision (beyond the orienting of gaze) can enhance the sensitivity of the tactile receptor field corresponding to the visually attended region on the body surface (see also Haggard et al., 2007; Cardini et al., 2011). Similarly, performance in a tactile orientation-discrimination task was enhanced when the body-part stimulated (the hand in this case) rather than a neutral object was viewed, even if the neutral object in the latter condition was seen at the location of the body-part that was stimulated (see Cardini et al., 2012). Finally, seeing a hand has been shown to enhance tactile acuity on the face (Serino et al., 2009).

Utilizing a different approach to control for the influence of gaze direction, Tipper et al. (1998) provided one group of participants with *indirect* vision of one of their hands via a real-time image of their right versus left hand on a video monitor placed at the body midline. In particular, their gaze was never directed toward the real hand in this condition. In contrast, a second group of participants oriented their head and eyes toward their right versus left hand, and was thus provided with *direct* vision. In both conditions, tactile target detection was faster for those targets occurring on the hand viewed as opposed to the other hand, implying that vision without gaze affects tactile detection. What remains unclear, however, is whether directing attention to the body-part stimulated in another way, as, for example, by presenting participants with the word describing it, would also be sufficient to induce enhanced tactile information processing.

Nevertheless, the findings obtained so far at least provide suggestive evidence that vision may enhance the sensitivity of the tactile receptors on those locations on the body surface that are visually attended. Notably, this assumption is further supported by the results of yet another study in which tactile spatial detail has been demonstrated to be even further enhanced when the participant's view of the body-part that had been stimulated is magnified (i.e., when viewing the arm through a magnifying glass) than when seeing it without magnification (Kennett et al., 2001b). Importantly, however, in neither condition of this study could the arm be seen at the moment when the tactile stimulus impacted on the participant's skin.

One could argue that some kind of "habitual effects" underlie the effects of the studies by Tipper et al. (1998) and Kennett et al. (2001b) as participants are used to seeing the body-parts that were stimulated in these studies (i.e., the hand and the forearm). Yet, overcoming this limitation, Tipper et al. (2001) replicated Tipper et al.'s (1998) earlier findings using a body-part that is usually unavailable for proprioceptive orienting (namely the back of the neck). In sum, the results of these studies indicate that vision generally enhances the speed of tactile target detection and tactile resolution at the visually attended location on the body surface.

Here, mention should also be made of the studies conducted by Graziano and Gross (1992, 1993; see also Desimone and Gross, 1979; Bruce et al., 1981; Hikosaka et al., 1988), which revealed that there are bimodal visuotactile neurons in macaque monkeys (e.g., in the face- and the arm-region of the somatotopically organized putamen). For these neurons, the tactile receptive field has been demonstrated to approximately match the visual receptive field, meaning that these neurons respond to visual and tactile stimuli at the same location on the body surface. When the arm is moved, the visual receptive field thus moves with it (Graziano and Gross, 1994). The finding that visual information about a specific body-part enhances tactile detection performance as well as tactile resolution on this specific body-part may be attributable to such neurons responding to visual and tactile stimuli at the same location on the body surface (see Graziano et al., 2004, for an overview of multimodal areas in the primate brain).

Attempting to analyse another potential pathway by which vision might affect tactile information processing, we developed a visuotactile response-priming paradigm in order to investigate whether visual stimuli hamper the processing of tactile stimuli if they are associated with distracting information. More specifically, we addressed the question of whether responses that are associated with irrelevant visual pre-cues interfere with (or facilitate) the responses that are elicited by tactile targets that happen to be presented at about the same time and vice versa (Mast et al., unpublished manuscript). To control for the effects of variations in spatial attention, the visual and tactile stimuli were presented from roughly the same location in external space. To this end, participants positioned the hand to which the tactile stimuli were to be delivered directly behind a small monitor on which the visual stimuli were presented (see **Figure 1C**). Note that given this experimental set-up, the participant's spatial attention was always directed toward the same position irrespective of the mapping of the pre-cues and targets to modalities.

Within the response-priming paradigm developed by Mast et al. (unpublished manuscript), all of the stimuli – both the pre-cues and targets – were associated with one of two responses. Hence, on each trial, the pre-cue and the target could be mapped onto the same response (these are known as compatible trials) or onto opposite responses (known as incompatible trials). The participants were instructed to ignore the pre-cue and to discriminate which of the two possible targets had been presented according to the target-intensity (for the visual modality, the targets differed with regard to their brightness; for the tactile modality, they differed with regard to their amplitude). Thus, there were four different stimuli: one high intensity visual stimulus, one low intensity visual stimulus, one high intensity tactile stimulus, and one low intensity tactile stimulus.
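The trial classification just described can be sketched in a few lines of Python. This is only an illustration: the response labels `"A"` and `"B"`, the assumption that high-intensity stimuli share one response and low-intensity stimuli the other, and the function names are all hypothetical and are not taken from Mast et al.

```python
from statistics import mean

# Hypothetical stimulus-to-response mapping (an assumption for illustration):
# high-intensity stimuli call for response "A", low-intensity ones for "B".
RESPONSE_FOR = {
    ("visual", "high"): "A",
    ("visual", "low"): "B",
    ("tactile", "high"): "A",
    ("tactile", "low"): "B",
}

def trial_type(pre_cue, target):
    """Label a trial by whether pre-cue and target map onto the same response."""
    same = RESPONSE_FOR[pre_cue] == RESPONSE_FOR[target]
    return "compatible" if same else "incompatible"

def priming_effect(trials):
    """Mean incompatible RT minus mean compatible RT (in ms).

    `trials` is an iterable of (pre_cue, target, rt_ms) tuples.
    A positive value indicates a response-priming effect.
    """
    rts = {"compatible": [], "incompatible": []}
    for pre_cue, target, rt_ms in trials:
        rts[trial_type(pre_cue, target)].append(rt_ms)
    return mean(rts["incompatible"]) - mean(rts["compatible"])
```

Running `priming_effect` separately on trials with visual pre-cues and tactile targets and on trials with the reverse assignment would give one effect per direction of crossmodal influence.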

The presentation of the visual pre-cues exerted a significant crossmodal influence over tactile target processing, that is, response latencies were significantly shorter in the compatible trials than in the incompatible trials. In other words, a significant response-priming effect was observed. This result shows that vision can aid tactile information processing by facilitating the retrieval of relevant information (here the S-R mapping) from memory, as, for example, by pre-activating the to-be-executed response.

Remarkably, no significant response-priming effect emerged when tactile pre-cues preceded the visual targets. Note that these contrary results as a function of the mapping of pre-cues/targets to modalities cannot be attributed to the operation of spatial attention (since spatial attention would have been expected to lead to comparable response-priming effects in both directions). Rather, these results suggest that the information that is attached to visual stimuli (associated responses in this case) is either more automatically retrieved from memory than the information that is associated with tactile stimuli or else that it is more difficult to inhibit those responses that happen to be elicited by task-irrelevant *visual* pre-cues than to inhibit those responses that are elicited by task-irrelevant *tactile* pre-cues. Both possible mechanisms may contribute to the stronger response-priming effects from vision to touch than in the opposite direction. As an aside, in Soto-Faraco et al.'s (2002) study, an asymmetrical visuotactile AB has accordingly been obtained. With experimental blocks in which the visual target constantly led the tactile target or vice versa, these researchers reported a crossmodal AB only in the former condition (see also Dell'Acqua et al., 2001, Experiments 3–4). This is further evidence pointing to the conclusion that the information associated with visual stimuli somehow dominates over information attached to tactile stimuli.

Note that these mechanisms may also play a role within the crossmodal congruency paradigm. First, the *response-competition* account explaining the crossmodal congruency effect also incorporates the idea that pre-cues elicit the retrieval of a particular response (or response tendency) even if no response to the stimulus is required. In the case of distractors presented from a location that happens to be different from the subsequent target location, this tendency is incongruent with the required response, whereas in the case of distractors presented from the same location as the subsequent target, it is congruent with the required response. Consequently, a response conflict is only present in the former condition, possibly contributing to the observed visuotactile effect (see Shore et al., 2006). Second, corroborating the pattern of results obtained within our response-priming paradigm, crossmodal congruency effects from vision to touch have been found to be stronger than those from touch to vision (Spence et al., 2004c; Walton and Spence, 2004; Spence and Walton, 2005). These findings are further in line with the body of evidence indicating a generalized bias of attention allocation toward the visual modality (e.g., Posner et al., 1976; Spence et al., 2001c).

Summing up, in those studies that have controlled for the influence of the spatial dimension, an influence of vision on tactile target processing is still observed. The evidence suggests, on the one hand, that vision enhances the processing of tactile stimuli applied to tactile receptor fields that correspond to the viewed locations on the body surface and, on the other, that visual stimuli can prime categorization responses to tactile targets when gaze is kept constant.

## **HOW SPACE CONTRIBUTES TO THE INFLUENCE OF VISION OVER TACTILE DISTRACTOR PROCESSING**

Most studies that have examined the influence of vision on tactile information processing have been concerned with the processing of tactile targets; that is, researchers have typically analyzed whether vision modulates responses to tactile *targets*. Consequently, much less is known about the influence of vision on tactile *distractor* processing, that is, on the processing of tactile stimuli that should be ignored or are irrelevant for (or may even interfere with) responding. One exception is a series of experiments that were conducted by Driver and Grossenbacher (1996). These researchers presented results suggesting that vision, guided by the direction of gaze, not only exerts an influence over tactile target processing but also over tactile distractor processing. In their study, a tactile target and a tactile distractor were delivered to the participant's right and left little fingers, respectively. Driver and Grossenbacher (1996) separately analyzed the influences of both vision (i.e., participants were blindfolded vs. not blindfolded) and gaze direction on performance.

More effective tactile selection (i.e., smaller differences in the latencies on those trials with distractors dissimilar to the targets as compared to trials with distractors similar to the targets) was observed when the participant's gaze was directed toward the finger that received the target than when their gaze was directed toward the finger receiving the distractor. Note, once again, that this finding implies that tactile information processing is generally enhanced at those locations where gaze happens to be directed, irrespective of whether the tactile stimuli happen to be targets (and therefore relevant with regard to the task at hand) or distractors (and therefore irrelevant with regard to the task at hand).

Accordingly, even in blindfolded participants, Driver and Grossenbacher (1996) observed less effective (or efficient) tactile selection when the hands were placed close together in external space than when they were placed far apart. This result is in line with the assumption that gaze direction generally enhances tactile information processing, as both the target and the distractor might have been positioned within the direction of gaze when the distance between the target and the distractor location was small. Thus, given the small distance between the participant's hands, spatial attention (as elicited by the direction of gaze) is likely to be simultaneously directed toward both the target and the distractor location. As a result, the processing of both the target and the distractor should be enhanced in the hands-close condition but not in the hands-far condition (where the gaze, and therefore spatial attention, is selectively directed toward either the target or the distractor location), in turn causing a stronger interference from dissimilar as compared to similar distractors within the former condition.

Somewhat differently, Soto-Faraco et al. (2004) provided strong support for the influence of vision over tactile information processing by demonstrating that the visually perceived distance between a participant's hands affects tactile selection when it is at odds with the actual proprioceptively specified distance. Once again simultaneously receiving a vibrotactile target and distractor stimulation on the previously defined target and distractor hand, respectively, their participants had to perform a speeded target elevation-discrimination task. In the critical experimental condition, a mirror was positioned vertically close to the participant's right hand, in such a way that the participants had the visual impression of their left hand lying close to their right hand (although they actually saw a mirror-image of their right hand, with their left hand being placed further away from the right hand than the mirror-image; see **Figure 1B**). Just as in a hands-close condition without the mirror, tactile selection was less effective (i.e., the detrimental impact of a dissimilar as compared to a similar distractor was more pronounced) in this mirror-condition than in the hands-far condition without the mirror.

In another study in which a mirror was used to vary the visually perceived distance between the hands, participants performed a temporal order judgment (TOJ) task with tactile stimuli being presented to either index finger (Gallace and Spence, 2005). Significant performance differences were observed as a function of the participant's perceived hand separation (elicited by means of the mirror reflection of their own left hand). Performance was significantly worse when the participant's hands appeared visually to be close together than when the hands appeared at either middle or far distances. Importantly, just as was the case in the study by Soto-Faraco et al. (2004), the observed pattern of results was consistent with that obtained when the proprioceptively specified distance between the hands had been varied (investigated in a dark room, where vision of the hands was prevented; see Shore et al., 2005).

Although the results of these studies varying the visually perceived separation between the hands cannot be explained in terms of the direction of spatial attention by variations of the orientation of gaze, they are nonetheless highly dependent on space. Indeed, they point to a further mechanism by which space may contribute to the influence of vision on tactile information processing. More specifically, as the participant's hands are falsely perceived to be positioned near one another in the mirror-condition, these results indicate that visual information exerts a more profound effect on the spatial distribution of tactile selective attention than proprioceptive information concerning the distance between the hands. Consequently, the illusory visual perception of the left hand being positioned close to the right hand may lead to the allocation of attention onto the hand hidden behind the mirror as if that hand were actually positioned at the visually defined location.

Taken together, then, the findings presented in this section of the review demonstrate, on the one hand, that the direction of gaze toward the stimulated body-part enhances the processing of to-be-ignored tactile stimuli (i.e., distractors) just as it enhances the processing of tactile target stimuli independently of vision, possibly by guiding a participant's spatial attention. On the other hand, they show that the interference between tactile target and distractor stimuli crucially depends on the visually perceived relative location of tactile target and distractor stimuli rather than on their proprioceptively specified relative location.

## **HOW VISION INFLUENCES TACTILE DISTRACTOR PROCESSING EVEN WHEN SPACE IS TASK-IRRELEVANT**

When controlling for the direction of gaze and thereby usually for spatial attention (although one could of course always argue that it is possible that covert attention and gaze are directed toward different locations in external space), tactile selection tasks represent an especially useful tool with which to examine nonspatial influences of vision on tactile distractor processing. This is because, in these experimental studies, the effects of spatial attention as well as any attentional effects elicited (explicitly or implicitly) by the nature of the task instructions (namely to attend to the location where the tactile target will occur rather than to the distractor location) are controlled for. Note that Driver and Grossenbacher (1996) also used a tactile selection task in order to examine the influence of vision on target *and* distractor processing. However, to the extent that these researchers investigated the effects of the direction of gaze at the same time as they assessed the effects of vision, their results might be attributable to the variation of the direction of spatial attention by gaze.

Furthermore, Driver and Grossenbacher (1996) did not obtain any *crossmodal* effect of vision on tactile selection (i.e., no differences in performance were observed between blindfolded and sighted participants). It is, however, important to note that any potential effects here may have been masked by the effects of spatial attention. In this sense, our own more recent research can be seen as complementing Driver and Grossenbacher's earlier findings. More specifically, we utilized a negative-priming paradigm and a flanker paradigm in order to investigate how vision influences the processing of tactile distractors.

Implementing a tactile variant of the negative-priming paradigm, Frings and Spence (2013) conducted a study designed to compare a condition in which the participant's hands were positioned close together/touching with a condition in which their hands were positioned far apart. In both cases, the participants were unable to see their limbs since they were occluded from view by a cover (see their Experiments 2 and 3). The participants were presented with two vibrotactile stimuli at a time, one delivered to either hand. They were instructed to ignore one of these stimuli while responding to the other vibration as rapidly and accurately as possible (a color cue was presented on the screen to indicate whether the participants should respond to the vibrotactile stimulus presented to their right hand or the stimulus presented to their left hand).

Tactile negative-priming effects were computed as the slowing of response latencies in those (probe) trials in which the target constituted the vibrotactile stimulus that had been presented as the distractor (and thus had to be ignored) in the preceding trial (i.e., the prime trial), as compared to response latencies in those probe trials in which the vibrotactile targets had not been presented in the prime trial. Overall, the data revealed that the influence of the distance between the hands was qualified by a disordinal interaction with vision. This means that, while significant negative-priming effects were obtained when the participants' hands were occluded from view in the hands-close condition, they disappeared when the participant's hands were visible in this posture. The presence of a disordinal interaction implies that significant negative-priming effects were also obtained when the participants' hands were visible in the hands-far condition but not when the hands were occluded from view.
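The computation of the negative-priming effect described above can be sketched as follows. The tuple layout, the stimulus labels, and the function name are assumptions made for illustration and do not reproduce Frings and Spence's (2013) actual analysis; probe trials whose target repeats the prime target fit neither category and are simply left out here.

```python
from statistics import mean

def negative_priming_effect(trials):
    """Mean RT on ignored-repetition probes minus mean RT on control probes.

    `trials` is an iterable of
    (prime_target, prime_distractor, probe_target, probe_rt_ms) tuples.
    A positive value means responding was slowed when the probe target
    had been the to-be-ignored distractor on the prime trial.
    """
    ignored_repetition, control = [], []
    for prime_target, prime_distractor, probe_target, rt_ms in trials:
        if probe_target == prime_distractor:
            # Probe target was the ignored stimulus on the prime trial.
            ignored_repetition.append(rt_ms)
        elif probe_target != prime_target:
            # Probe target did not appear on the prime trial at all.
            control.append(rt_ms)
        # Remaining case (probe target == prime target) is excluded.
    return mean(ignored_repetition) - mean(control)
```

Computing this value separately for each combination of hand posture (close vs. far) and vision (hands visible vs. occluded) would yield the four cell values among which the interaction described above is assessed.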

Note here that in Frings and Spence's (2013) study, the attention of the participants should have been directed to the target hand while performing the tactile selection task. Hence, the observed influence of vision on tactile information processing likely represents an effect that occurs regardless of a participant's voluntarily guided (spatial) attention. However, this study did not provide any information concerning the mechanism by which vision influences the processing of tactile distractors. In this regard, the Eriksen flanker paradigm (see Eriksen and Eriksen, 1974, for the original study conducted within the visual modality; and Chan et al., 2005, for its extension to the auditory modality; see also e.g., Evans and Craig, 1992; Craig, 1995; Craig and Evans, 1995, for tactile variants of the paradigm) provides a useful tool with which to investigate the depth of distractor information processing.

As in the negative-priming paradigm, a target and a distractor are presented simultaneously, with each of the four stimuli possibly serving as a target or as a distractor. Consequently, another common feature is that not only are the targets associated with a response but so too are the distractors. The crucial aspect of the flanker-interference paradigm, however, is that a 4-to-2 mapping is used, meaning that the four stimuli are mapped onto two responses. As a result, three types of trials can be distinguished along two dimensions, namely the dimension of perceptual congruency, whereby trials with distractors that are identical to the current target are compared to those trials on which the distractors are different (i.e., perceptually incongruent) from the target, and the dimension of response compatibility, whereby trials with distractors that are mapped onto the same response as the current target are compared to those trials in which the distractors are mapped onto the opposite response.
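To make the three trial types concrete, the following sketch enumerates all target-distractor pairings under a 4-to-2 mapping. The stimulus names `s1` to `s4` and the response labels are hypothetical placeholders, not the stimuli used in any of the studies cited.

```python
from collections import Counter
from itertools import product

# Hypothetical 4-to-2 mapping: four stimuli, two responses.
RESPONSE_FOR = {"s1": "left", "s2": "left", "s3": "right", "s4": "right"}

def classify(target, distractor):
    """Return the trial type for a given target-distractor pairing."""
    if target == distractor:
        # Perceptually congruent (and necessarily response-compatible).
        return "identical"
    if RESPONSE_FOR[target] == RESPONSE_FOR[distractor]:
        # Perceptually incongruent, but mapped onto the same response.
        return "response_compatible"
    # Perceptually incongruent and mapped onto the opposite response.
    return "response_incompatible"

# Count how many of the 16 possible pairings fall into each trial type.
counts = Counter(
    classify(target, distractor)
    for target, distractor in product(RESPONSE_FOR, repeat=2)
)
```

With this mapping, the 16 pairings comprise 4 identical, 4 response-compatible, and 8 response-incompatible combinations, so an experiment that wants equal numbers of each trial type has to sample the pairings unevenly.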

Two different interference effects can be computed reflecting these dimensions, the so-called flanker-interference effect at the level of perceptual congruency (calculated by comparing perceptually congruent with perceptually incongruent trials), and the so-called flanker-interference effect at the level of response compatibility (by comparing response-compatible with response-incompatible trials). The occurrence of flanker effects allows one to draw conclusions as to the level to which the distractors have been processed: if there is interference only at the level of perceptual congruency, then it implies that the distractor stimulus was not processed up to the level of response preparation. By contrast, if the distractor is processed up to the level of response preparation, then the responses elicited by the target and the distractor would be expected to interfere in response-incompatible trials (but not in the response-compatible trials), resulting in a flanker effect at the response level.
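One common way of separating the two effects is sketched below. It assumes that mean response latencies are available for the three trial types and that each effect is isolated by comparing adjacent trial types; the dictionary keys and the function name are hypothetical, and the exact contrasts used in any given study may differ.

```python
def flanker_effects(mean_rt):
    """Return (perceptual_effect, response_effect) in ms.

    `mean_rt` maps the three trial types to mean response latencies:
      "identical"             - distractor identical to the target
      "response_compatible"   - different stimulus, same response
      "response_incompatible" - different stimulus, opposite response
    """
    # Perceptual level: cost of a perceptually different distractor
    # while the required response stays the same.
    perceptual = mean_rt["response_compatible"] - mean_rt["identical"]
    # Response level: additional cost when the distractor is also
    # mapped onto the opposite response.
    response = mean_rt["response_incompatible"] - mean_rt["response_compatible"]
    return perceptual, response
```

Under this scheme, a reliable perceptual effect together with a response effect near zero would correspond to the case in which the distractor is not processed up to the level of response preparation.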

Note that those studies investigating tactile congruency effects (e.g., Driver and Grossenbacher, 1996; Soto-Faraco et al., 2004; Gallace et al., 2008; Frings and Spence, 2010) have typically implemented a paradigm inspired by the Eriksen flanker paradigm. Yet, strikingly, only incongruent and congruent trials have been compared and hence it has not been possible to separate the effects of perceptual and response compatibility.

To investigate the crossmodal influence of vision on the depth of tactile distractor processing, we implemented a tactile variant of the 4-to-2 Eriksen flanker paradigm (see also Evans and Craig, 1992; Craig, 1995; Craig and Evans, 1995). Participants simultaneously received two tactile stimuli on every trial (see **Figure 1D**, for the experimental set-up). Once again, one of these stimuli was presented to either hand, with the blockwise instructions to attend to the stimuli presented to one hand (i.e., the target hand), while ignoring the distractor stimuli presented to the other (i.e., distractor) hand. In order to control for any influence of (overt) spatial attention, we kept the direction of gaze constant. Furthermore, the participant's hands were placed next to each other, separated by a distance of about 40 cm, which makes it unlikely that spatial attention covers the external space including both hands, since participants appear to be able to split their attention between the two hands (Craig, 1985, Experiments 4–5; see also Craig, 1989). Next, we compared a condition in which the participants were blindfolded to another condition in which the participants were provided with a complete view of the experimental set-up (Wesslein et al., in press). Interestingly, vision was found to enhance the processing of tactile distractors from the perceptual level all the way up to the level of response preparation: while flanker effects at both levels were observed in the full-sight condition, only the perceptual flanker effect was apparent in the blindfolded condition.

The differential effects reported in the conditions with blindfolded and seeing participants cannot be accounted for in terms of the effects of spatial attention, since that should have been directed toward the target hand in both conditions. Hence, spatial attention need not be directed toward the location at which a tactile distractor is delivered in order for vision to influence its processing. Furthermore, the crucial effect of vision was concerned with irrelevant tactile stimuli, suggesting that attention need not be voluntarily guided toward the location at which a tactile stimulus happens to occur for vision to exert an influence over tactile information processing either. Importantly, then, the pattern of results provides some of the first evidence to suggest that vision alone may give rise to a deeper processing of both tactile target *and* distractor stimuli (namely to their processing up to the response level), thus supporting the view that there can be a strong crossmodal influence of vision on tactile information processing through a process of enhanced tactile processing by vision of the (non-attended) body-part stimulated.

Taken together then, these results suggest that vision affects tactile distractor processing beyond its role in guiding a participant's spatial attention toward the location of the tactile distractor. In fact, we have found evidence to demonstrate that vision might influence how deeply a tactile distractor is processed (e.g., whether it is processed up to the level of response selection) or how the eccentricity between tactile targets and distractors, that is, their distance from the body midline or maybe also the separation between them, is perceived.

## **SUMMARY AND CONCLUSION**

We have outlined the various ways in which vision influences the processing of tactile targets as well as tactile distractors. Discussing the cognitive mechanisms that may underpin such effects, we have attempted to highlight the important role that space plays in many of the crossmodal studies that have been published to date. Consequently, the visual modality – that is, the presentation of distinct visual stimuli, the direction of gaze, or the visually perceived location of one's limbs in external space – was suggested to affect the allocation of spatial attention relative to the body-parts, thus enhancing the processing of tactile stimuli at visually attended locations. What's more, the information that was associated with irrelevant visual stimuli was demonstrated to interfere with information associated with tactile stimuli. The information associated with visual stimuli has therefore been suggested to be automatically retrieved from memory, thereby impairing tactile performance. We have also presented a number of findings that together point to there being an influence of vision on touch that is independent of the spatial dimension (see **Table 1**). In reviewing the latter studies, we have highlighted how vision, irrespective of the orientation of gaze, affects the processing of both tactile target and distractor stimuli, for example, by furthering the sensitivity of the tactile receptor fields seen.

At present, knowledge concerning the influence of vision on tactile distractor processing is relatively scarce. Yet, one may ask whether there is any need to discuss the influence of vision on tactile targets and tactile distractors separately. Here, it is important to note that tactile targets will likely always receive attention since the participant has to respond to them in one way or another. By contrast, tactile distractors have to be ignored and would, ideally, not receive any attention. As a consequence, one might argue that vision can have different influences on the processing of to-be-attended and to-be-ignored tactile stimuli: so, for example, one could argue that vision of the location where a (previously) unattended tactile stimulus happens to occur might have a larger impact on tactile information processing than vision of the location where an attended stimulus happens to be delivered (as the latter will receive attention in any case). However, concerning the impact of the guidance of spatial attention due to vision or gaze on tactile information processing, it can be concluded that there is no difference between the processing of tactile targets and tactile distractors. In particular, while responding to tactile targets is typically facilitated due to visually guided spatial attention (e.g., Honoré et al., 1989), interference from tactile distractors is increased due to visually guided spatial attention (Driver and Grossenbacher, 1996). Both phenomena can be attributed to the fact that spatial attention furthers the processing of the respective tactile stimuli, thereby making it easier to respond to them in the case of tactile targets while making it harder to ignore them in the case of tactile distractors.

Turning now to the non-spatial influences of vision on the processing of tactile targets and distractors, a somewhat different picture emerges. In fact, we have recently published data suggesting that vision of the stimulated body-part receiving the tactile distractor is a precondition for the processing of the distractor up to the level of response selection (see Wesslein et al., in press). This influence of vision is "distractor-specific," as targets always have to be processed up to the level of response selection simply because participants have to respond to targets. Once again, one might consider this influence of vision on tactile distractors as some kind of attentional effect. Looking at information processing models that assume three stages of information processing (a perceptual one, a central bottleneck in which the S-R mapping is applied, and a motoric one in which the concrete response is planned; see e.g., Welford, 1952; Allport, 1989; Pashler, 1991, 1994; Spence, 2008), one may argue that vision is needed to move tactile distractors through all three stages whereas interference at the first stage (i.e., the perceptual stage) is independent of vision (note that perceptual masking of tactile targets due to tactile distractors was independent of vision; Wesslein et al., in press). In conclusion, we would like to argue that vision influences tactile distractor processing by modulating the amount of attention that is directed to the tactile distractor. Notably, it seems as though not only spatial attention but also non-spatial attention to tactile distractors is affected by vision.

#### **REFERENCES**


pairings of vision, touch and audition. *Exp. Brain Res.* 134, 42–48. doi: 10.1007/s002210000442


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

#### *Received: 23 October 2013; accepted: 21 January 2014; published online: 06 February 2014.*

*Citation: Wesslein A-K, Spence C and Frings C (2014) Vision affects tactile target and distractor processing even when space is task-irrelevant. Front. Psychol. 5:84. doi: 10.3389/fpsyg.2014.00084*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Wesslein, Spence and Frings. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The role of differential delays in integrating transient visual and proprioceptive information

## *Brendan D. Cameron<sup>1,2</sup>, Cristina de la Malla<sup>1,2</sup> and Joan López-Moliner<sup>1,2</sup>\**

*<sup>1</sup> Vision and Control of Action Group, Departament de Psicologia Bàsica, Universitat de Barcelona, Barcelona, Spain <sup>2</sup> Institute for Brain, Cognition and Behaviour (IR3C), Barcelona, Spain*

#### *Edited by:*

*Knut Drewing, Giessen University, Germany*

#### *Reviewed by:*

*Valeriya Gritsenko, West Virginia University, USA Stefanie Mueller, Justus-Liebig-University Giessen, Germany*

#### *\*Correspondence:*

*Joan López-Moliner, Departament de Psicologia Bàsica, Universitat de Barcelona, Passeig Vall d'Hebron 171, Barcelona 08930, Catalonia, Spain e-mail: j.lopezmoliner@ub.edu*

Many actions involve limb movements toward a target. Visual and proprioceptive estimates are available online, and by optimally combining (Ernst and Banks, 2002) both modalities during the movement, the system can increase the precision of the hand estimate. The notion that both sensory modalities are integrated is also motivated by the intuition that we do not consciously perceive any discrepancy between the felt and seen hand's positions. This coherence as a result of integration does not necessarily imply realignment between the two modalities (Smeets et al., 2006). For example, the two estimates (visual and proprioceptive) might be different without either of them (e.g., proprioception) ever being adjusted after recovering the other (e.g., vision). The implication that the felt and seen positions might be different has a temporal analog. Because the actual feedback from the hand at a given instantaneous position reaches brain areas at different times for proprioception and vision (shorter for proprioception), the corresponding instantaneous unisensory position estimates will be different, with the proprioceptive one being ahead of the visual one. Based on the assumption that the system integrates optimally and online the available evidence from both senses, we introduce a temporal mechanism that explains the reported overestimation of hand positions when vision is occluded for active and passive movements (Gritsenko et al., 2007) without the need to resort to initial feedforward estimates (Wolpert et al., 1995). We set up hypotheses to test the validity of the model, and we contrast simulation-based predictions with empirical data.

**Keywords: position estimates, vision, proprioception, perceptual judgments, reaching**

## **INTRODUCTION**

The more rapidly the hand moves, the harder it becomes for the sensorimotor system to localize it in real time. We might have the intuition that we can directly see and feel where our hand is at any time, but sensory feedback takes time to reach the central nervous system, so each sensory sample lags the hand's real location. This is an important problem for the sensorimotor system, because motor commands are noisy [especially when the limb is moving quickly (van Beers et al., 2004)] and there is likely to be some error in the initial motor plan. If a correction is to be applied, the sensorimotor system needs to acquire a reliable estimate of the hand's location. In other words, to effectively correct the hand in flight, the sensorimotor system must know where the hand will be relative to the target when the correction occurs. It has been suggested that such an estimate is achieved by optimally integrating vision, proprioception, and a copy of the original motor command (efference copy) (Desmurget and Grafton, 2000). An internal model of the motor system could, theoretically, use the integrated information to forecast the hand's location.

An important piece of evidence for the putative role of efference copy in real-time hand localization is the tendency for participants to overestimate the current location of their unseen moving hand (Dassonville, 1995; Wolpert et al., 1995). The temporal pattern of overestimation is consistent with the involvement of an internal forward model (Wolpert et al., 1995). Here we re-examine the overestimation phenomenon, and we propose an alternate explanation, one based on the optimal integration of differentially-weighted visual and proprioceptive location estimates when the hand moves in the dark.

## **ESTIMATING THE CURRENT LOCATION OF THE UNSEEN MOVING HAND**

Where do people perceive their moving hand? One way to measure this is to provide a visual, tactile, or auditory cue during motion of the hand and then have participants retrospectively report where the hand was when the cue was presented (Dassonville, 1995; Gritsenko et al., 2007). Alternatively, a "stop" signal can be provided during the movement, after which participants report the location of their stopped hand (Wolpert et al., 1995; Gritsenko et al., 2007). These methods tend to show that participants overestimate how far their hand has traveled; however, the effect is not universal, as we will discuss shortly.

In an influential study testing perception of the unseen moving hand, Wolpert et al. (1995) observed a hand position overestimate of 0.5–0.9 cm that was present throughout the measured range of time points (movement durations of 0.5–2.5 s). In that study, participants first viewed their static hand for 2 s, after which vision was occluded, and then participants generated slow planar movements to the left or right of the start location, stopping the movement as soon as a tone was played. Movement distance, dictated by when the tone was played, ranged from 0 to 30 cm. After stopping their movements, participants used a trackball to position a visual marker over the perceived location of their unseen hand. The pattern of perceptual reports (an increasing then decreasing overestimate as movement time increased from 0.5 to 2.5 s) was consistent with a state estimation process where efference copy (combined with the initial estimate of the limb) is initially weighted more heavily than sensory information. Wolpert et al. proposed that as the movement progresses, the reliability of the prediction based on the initial state estimate decreases and the contribution of sensory information to the following state estimates accordingly increases. This shift in weighting from the forward model prediction to the sensory-based estimate was modeled with a Kalman filter, where the weights assigned to the prediction-based estimate and the sensory estimate are dependent on their relative accuracies. Wolpert et al. argued that a pattern of increasing then decreasing overestimation could not be explained by a purely sensory model. We will outline later how a sensory processing model may, in fact, be able to account for such a pattern.
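
The reliability-dependent weighting that a Kalman filter applies can be illustrated with a single scalar update. The following is only a minimal sketch, not Wolpert et al.'s model: the function name `kalman_update` and all numerical values are assumptions chosen for illustration.

```python
def kalman_update(pred, pred_var, obs, obs_var):
    """Fuse a forward-model prediction with a sensory observation (scalar case).

    Returns the fused estimate, its variance, and the gain, i.e., the weight
    given to the sensory observation.
    """
    gain = pred_var / (pred_var + obs_var)
    est = pred + gain * (obs - pred)
    est_var = (1.0 - gain) * pred_var
    return est, est_var, gain


# Hypothetical values: the prediction becomes less reliable (larger variance)
# as the movement progresses, while the sensory variance stays fixed.
gains = []
for pred_var in (0.1, 1.0, 10.0):
    _, _, g = kalman_update(pred=10.0, pred_var=pred_var, obs=9.0, obs_var=1.0)
    gains.append(g)
```

In this sketch the gain grows as the prediction variance grows, so the fused estimate moves from the prediction toward the sensory observation, which is the direction of re-weighting described above.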

Dassonville (1995) also observed a position overestimate of the moving hand. In Dassonville's study, participants began each trial with their arm extended and pointing toward an LED. A second LED was then illuminated, and participants rapidly moved their hand to point at the second LED. Prior to each trial room lights were illuminated, providing participants full vision of their limb and surroundings, but each trial was conducted in the dark, such that only the LEDs were visible. A tactile stimulus was applied to the index finger, either before, during, or after completion of the movement. After completing the pointing movement to the target LED, participants used the same limb to reach back to the location at which they sensed the application of the tactile stimulus. Dassonville observed that, on average, participants reported a location that was approximately 100 ms farther along the trajectory of the initial reach than where the tactile stimulus was applied. In spatial terms, the overestimate ranged from 0 to 30 cm, depending on the stage of the reach at which the tactile stimulus was applied. Interestingly, participants reported overestimation of the stimulus position even when it was presented just before the onset of the reaching movement. Dassonville argued that consistent overestimation of hand position during movement may be caused by the sensory processing delay for the tactile stimulus. By the time the participant registers the stimulus, their internal representation of the moving limb (presumably aligned with the actual position of the moving limb) has moved beyond the position at which the stimulus was applied. Accordingly, the participant reports a position that is positively biased. However, this explanation is difficult to reconcile with the pattern of results observed by Gritsenko et al. (2007), described next.

Gritsenko et al. (2007) also examined perception of limb position during movement, but they examined not only active movements, in which participants move their own arms, but also passive movements, in which participants' arms are moved for them. Gritsenko et al.'s goal was to test whether a position overestimate would occur in the absence of active movement; that is, would participants report a position overestimate during passive movement, when no efference copy is present? In Gritsenko et al.'s study, participants executed/experienced planar, single-joint 140° movements of their lower arm, which was occluded for the entirety of the testing session. In one condition, the participant's task was to remember the location of their moving hand at the time that a sensory cue was presented, and then to execute a return movement to that location, as in Dassonville (1995). Gritsenko et al. observed very similar results for active and passive exposure: Participants tended to overestimate limb position early in the movement (approximately the first 60° of the movement), but they then underestimated it later in the movement (approximately the last 60° of the movement). Gritsenko et al. suggested that this pattern could be explained by a Bayesian process, in which the unreliability of sensory estimates during motion of the limb led to a heavy weighting of the prior (previously experienced elbow angles in this case). They speculated that this prior might have been biased toward the midpoint of the elbow's range of motion, which would then have caused early cues to be overestimated and later cues to be underestimated.
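
How a prior centered on the midpoint of the range would produce early overestimation and late underestimation can be sketched with a Gaussian prior and a Gaussian sensory likelihood. This is an illustration only and not Gritsenko et al.'s analysis: the prior mean of 70° (the midpoint of a 0–140° range), the variances, and the function name `posterior_mean` are all assumptions.

```python
def posterior_mean(obs, obs_var, prior_mean, prior_var):
    """Posterior mean for a Gaussian prior combined with a Gaussian likelihood."""
    w_obs = prior_var / (prior_var + obs_var)  # weight on the sensory sample
    return w_obs * obs + (1.0 - w_obs) * prior_mean


PRIOR_MEAN = 70.0  # deg, assumed midpoint of the elbow's range of motion
PRIOR_VAR = 400.0  # deg^2, assumed
OBS_VAR = 400.0    # deg^2, assumed to be large while the limb is moving

# Signed bias (reported minus actual angle) for an early and a late cue.
early_bias = posterior_mean(30.0, OBS_VAR, PRIOR_MEAN, PRIOR_VAR) - 30.0
late_bias = posterior_mean(110.0, OBS_VAR, PRIOR_MEAN, PRIOR_VAR) - 110.0
```

With these assumed values the early cue is pulled forward toward the midpoint (positive bias) and the late cue is pulled back toward it (negative bias).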

In another condition of Gritsenko et al.'s (2007) study, the participant's task was to stop their movement when the cue was presented and then report the location of the stopped hand, as in Wolpert et al. (1995). Gritsenko et al. again observed little difference between active and passive exposure; however, participants underestimated the distance traveled by the arm at all tested angles, a result that contrasts with the consistent overestimation effect observed by Wolpert et al. (1995). Several methodological differences exist between Gritsenko et al.'s and Wolpert et al.'s stop tasks, so we do not know which difference is responsible for the conflicting results. There was no visual information regarding hand start location in Gritsenko et al., information that was available in both Wolpert et al. (1995) and Dassonville (1995); furthermore, Gritsenko et al. studied a single-joint movement, whereas Wolpert et al. (1995) and Dassonville (1995) studied multi-joint movements. Either or both of these factors may be responsible for the different effects. However, for our purposes the important finding from Gritsenko et al. (2007) is the close correspondence of the position estimates from the active and passive exposures. This finding suggests that some mechanism that is independent of efference copy might explain position misestimation during reaching.

None of the studies we have described here included a comparison condition in which vision of the reaching hand was available during the reach. Presumably, the researchers assumed that vision would allow for highly accurate position estimates and so they did not include full-vision conditions. However, if efference copy contributes to early position estimates, its effect on the movement should be present regardless of the type of sensory information (visual or proprioceptive) that is available. If, on the other hand, misestimation of the reaching hand actually depends on removing real-time vision (as is implicit in the studies discussed above), a mechanism for misestimation of the moving hand that relies on intersensory re-weighting (instead of prediction-to-sensory re-weighting), is worth considering.

## **A TEMPORAL MECHANISM BASED ON DIFFERENTIAL DELAYS**

We propose that some of the perceptual overestimation effects that have previously been reported can be explained with a temporal mechanism. Our temporal sensory-integration hypothesis is based on two premises: (1) that proprioceptive feedback is processed more quickly than visual feedback, and (2) that the integrated estimate of the reaching hand is more strongly influenced by the more reliable unisensory estimate (Ernst and Banks, 2002; Smeets et al., 2006). Accordingly, when people reach in the dark, the integrated estimate of their hand shifts toward the more reliable (and temporally leading) proprioceptive estimate. After presenting evidence for the differential delays between proprioception and vision, we will provide a basic rationale for how such a mechanism would work.
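
Premise (2) can be written compactly in the standard maximum-likelihood form for two independent Gaussian cues (cf. Ernst and Banks, 2002). The symbols below are our own shorthand for this sketch: $\hat{x}_P$ and $\hat{x}_V$ denote the proprioceptive and visual position estimates, and $\sigma_P^2$ and $\sigma_V^2$ their variances.

$$
\hat{x} = w_P\,\hat{x}_P + w_V\,\hat{x}_V, \qquad
w_P = \frac{1/\sigma_P^2}{1/\sigma_P^2 + 1/\sigma_V^2}, \qquad
w_V = 1 - w_P
$$

$$
\sigma^2 = \frac{\sigma_P^2\,\sigma_V^2}{\sigma_P^2 + \sigma_V^2}
$$

Under these assumptions, as $\sigma_V^2$ grows (e.g., when the hand is not visible), $w_P$ approaches 1 and the integrated estimate $\hat{x}$ approaches $\hat{x}_P$.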

## **EVIDENCE FOR DIFFERENTIAL DELAYS**

## *Visual delays*

It takes at least 40 ms for a visual stimulus to reach V1. This relatively long latency (compared to other transduction latencies, such as the ones in auditory processing) is mainly due to the time that photoreceptors need to encode information. About 120 ms after visual stimulation, activation can be found in most cortical areas, and leads to conscious visual experience (e.g., Raiguel et al., 1989; Nowak et al., 1995; Lamme et al., 1998; Lamme, 2000, 2003; Lamme and Roelfsema, 2000). In total, the time that one needs to react to a visual stimulus has been estimated to be approximately 150–200 ms, as will be described below (e.g., Brenner and Smeets, 2003; Barnett-Cowan and Harris, 2009).

One common task to measure differential delays and to compare them across modalities is the simple reaction time (RT) task, in which the experimenter measures the time that it takes to react to a stimulus of a determined sensory modality. In RT tasks, the difference between the sensory modalities provides us with an approximate value of the lag that one of the sensory modalities has to have with respect to another one in order for the participant to perceive them as simultaneous. From RT results, the time needed to react to a visual stimulus is about 150–220 ms (e.g., Brenner and Smeets, 2003; Barnett-Cowan and Harris, 2009), although this value can vary depending on factors such as the intensity of stimulation (e.g., Schiefer et al., 2001). However, one must take into account that RT is a behavioral measure and so the values provided contain not only the signal processing time but also the time needed to react. To deal with the "extra time" added by the motor output, some authors have used neurophysiological techniques like ERPs (e.g., Rugg and Coles, 1995; Thorpe et al., 1996) to measure how long the processing period takes. By using this method Thorpe et al. (1996) concluded that highly demanding tasks involving visual image processing can be solved in 150 ms or even less.

Another way of measuring delays in the visual system is by looking at response times to target location changes. By perturbing the target's position one can measure how long it takes to correct an ongoing movement (e.g., Georgopoulos et al., 1981; Soechting and Lacquaniti, 1983; Prablanc and Martin, 1992; Brenner and Smeets, 1997, 2003; Veerman et al., 2008; Oostwoud-Wijdenes et al., 2011). Brenner and Smeets (1997) found that it takes about 400 ms to react (to start moving) to a visual stimulus subjects had to hit (this is the result of processing the visual stimulus, planning the hitting movement, and initiating the response). When the target is displaced during a movement, several factors influence how quickly the movement can be adjusted toward the target's new position. One of these factors may be the uncertainty about the direction of the possible position change. Soechting and Lacquaniti (1983), using double-step paradigms in which the direction of the change was known, reported that the time that it takes to modify trajectories was similar to reaction times toward the first stimulus and on the order of 110 ms. The time to respond may increase if the direction of the target change is not known. Boulinguez and Nougier (1999), for instance, showed a faster correction time (191 ms) for a 75% predictable location than for a 50% and 25% predictable location (213 and 211 ms, respectively) (cf. Cameron et al., 2013).

The timing of the perturbation can also affect the latencies of the corrections. Liu and Todorov (2007) found that the latency to correct an ongoing movement is about 100 ms, independent of the timing of the perturbation. Although this result is in accordance with others (e.g., Gritsenko et al., 2009; Oostwoud-Wijdenes et al., 2011), some authors have suggested that the closer to the end of the movement the perturbation takes place, the longer the latency of the correction (e.g., Reichenbach et al., 2009). Other factors affecting how quickly subjects can respond to a target position change are the attributes of the target: faster responses are observed toward targets defined by orientation, size or luminance than by color, texture or shape (e.g., Veerman et al., 2008).

There has also been some research on responses to visual perturbations of the position of the hand or of a cursor or a tool representing the hand's position (e.g., Saunders and Knill, 2003, 2004, 2005; Franklin and Wolpert, 2008; Proteau et al., 2009; Brière and Proteau, 2011). The most common situation in the cursor-jump experiments is that subjects have to move a cursor that represents the hand position toward a target and at some point the cursor jumps so that the trajectory of the movement has to be corrected. The reported latencies of the corrections for cursor jumps are about 140–160 ms (Saunders and Knill, 2003; Franklin and Wolpert, 2008; Veyrat-Masson et al., 2010), slightly larger than the ones for target jumps.

## *Proprioceptive delays*

Proprioception, which provides information related to body posture, is derived from receptors in skin, muscles, tendons, and joints. Accordingly, proprioceptive transmission time to the brain depends on the body part from which the signal originates. For this and other reasons, it is not easy to arrive at a precise estimate of proprioceptive processing times, but we outline some data in the following paragraphs that allow for an approximation.

In non-human primates, the time needed for afferent signals from proprioception to reach brain areas has been estimated to be as little as about 30 ms (Fetz et al., 1980; Soso and Fetz, 1980; Evarts and Fromm, 1981). In a study comparing reaction times to a visual stimulus and to a kinaesthetic one in humans, Flanders and Cordo (1989) found that it took approximately 250 ms to react to a visual stimulus and only 150 ms to react to a kinaesthetic one. In that study subjects had to modulate the left elbow torque in response to a stimulus that could be presented either visually or kinaesthetically. For the visual task, subjects saw the stimulus moving for 70 ms and had to increase or decrease the left elbow torque in a determined direction depending on the final position of the stimulus. For the kinaesthetic task, subjects' right elbow was rotated and they had to increase or decrease the left elbow torque in response to how the right elbow was rotated. In another study with an easier task, Flanders et al. (1986) reported smaller values but in the same direction (110 ms for kinaesthetic information and 190 ms for visual information). Shorter latencies were reported by Johansson and Westling (1987), who found compensatory responses after 75 ms in response to feedback from the skin receptors in a grip task.

Alary et al. (1998) recorded ERPs while passively moving the right index finger of healthy subjects. The shortest latencies they found were about 56 and 32 ms for the right and the left index finger, respectively, for P1 (parietal areas) and 115 and 96 ms, respectively, for N1 (frontal areas). Similarly, Mima et al. (1996) used passive movement of the index finger and evoked potentials and reported earliest cortical latencies of 34.6 ms for P1 and 44.8 ms for N1. Seiss et al. (2002) showed that the latency values obtained for flexion and extension were similar, at about 90 ms for the N90 component. Factors such as the kind of stimulation (or the device used to create it) or the stimulated area could be responsible for the different values obtained in studies using ERP measures.

Although it is difficult to come up with a reliable estimate of the differential delays, from the data presented above we can estimate sensory delays of about 50–60 ms for proprioception and of about 100–120 ms for vision. So, in conclusion, we can say that proprioception leads vision by approximately 40–50 ms.

## **RATIONALE OF THE MECHANISM**

**Figure 1** illustrates the main features of the proposed mechanism by showing the changing position of a hand along a one-dimensional path (gray curve) through time. The slope of this curve thus denotes the velocity of the hand. Two colored points indicate samples at two timepoints (*T*<sub>0</sub> and *T*<sub>1</sub>) along the trajectory. The green dot denotes the instant position at *T*<sub>0</sub> in the early part of the path, after the hand has just started to move and the speed is not yet very high. The red dot represents the instant position of the hand at time *T*<sub>1</sub>, when the hand is moving at peak velocity. Assuming the presence of differential delays, in accordance with the evidence reported above, the main idea is that the unisensory positional feedback of the hand at each of these two instant positions will reach the corresponding unisensory brain areas at different times. For example, when the hand moves slowly at time *T*<sub>0</sub> the corresponding instant position will be acquired by visual areas later than by proprioceptive areas. As a consequence, the online visual estimate lags the proprioceptive one. This differential latency in reaching the corresponding areas also manifests in a spatial shift between the visual and proprioceptive position estimates. This situation is represented by the vertical distance between the visual and proprioceptive feedback around the green dot in **Figure 1**. Because this spatial discrepancy results from a temporal difference, we predict that the felt and seen position will be sensed as being the furthest apart when the hand moves at peak velocity (time *T*<sub>1</sub>), as illustrated by the separation between proprioceptive and visual estimates around the red dot. From this point on, the spatial separation of the two unisensory position estimates will decrease.
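
The link between the temporal lag and the spatial separation can be made concrete with a short simulation. This is a sketch under stated assumptions rather than the model used for the figures: the minimum-jerk position profile, the 1 s duration, the 30 cm amplitude, and the names `min_jerk` and `separation` are all introduced here for illustration; the two delays are picked from within the ranges estimated in the previous section.

```python
def min_jerk(t, duration=1.0, distance=30.0):
    """Hand position (cm) at time t (s) for an assumed minimum-jerk reach."""
    tau = min(max(t / duration, 0.0), 1.0)  # normalized time, clamped to [0, 1]
    return distance * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)


VIS_DELAY = 0.110   # s, assumed value within the 100-120 ms range given above
PROP_DELAY = 0.055  # s, assumed value within the 50-60 ms range given above


def separation(t):
    """Proprioceptive minus visual position estimate (cm) available at time t."""
    return min_jerk(t - PROP_DELAY) - min_jerk(t - VIS_DELAY)


# Sample every millisecond and find when the two estimates are furthest apart.
times = [i / 1000 for i in range(1201)]
seps = [separation(t) for t in times]
t_peak = times[seps.index(max(seps))]
```

In this sketch the separation is zero before the movement starts, is largest when the two delayed samples straddle the moment of peak velocity (the midpoint of the assumed reach), and shrinks again as the hand decelerates.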

## **INTEGRATED INFORMATION AND DELAYED FEEDBACK**

Relying only on available re-afferent signals to update changing positions of the limbs will necessarily lead to delayed actions or overreaching to static targets. We therefore have to assume some kind of adjustment when we integrate both unisensory estimates of position. In **Figure 1A** the integrated percept is ahead of the two unisensory ones and aligned with the actual hand position. This is an important problem in perception mainly caused by the neural transmission times in the sensory systems and has led to the persistent question of whether the perceived position of a moving object lags its "real position" (e.g., Cavanagh, 1997; Krekelberg and Lappe, 2001). Neural delays are present at both sensory and motor stages and, similarly to the internal models proposed to compensate for motor delays, additional compensatory sensory mechanisms have also been put forward (e.g., Nijhawan, 2008) to extrapolate the position of moving objects at the perceptual level. Most evidence for such a sensory mechanism comes from the flash-lag phenomenon (Nijhawan, 1994; Linares et al., 2007), in which a flashed object is perceived to lag a physically aligned moving object. This fundamental problem would apply to both vision and proprioception. In **Figure 1A** we have aligned the integrated percept of the hand with the actual hand position. This situation is also reproduced in **Figure 1B**, which illustrates one position sample of a moving hand. The integrated estimate is shifted (magnitude C) ahead to compensate for the neural delay. One can think of these perceptual mechanisms that correct for sensory delays as calibration mechanisms that would shift the corresponding integrated percept in space. However, what is important for our explanation is the relative difference between the visual and proprioceptive estimates irrespective of any compensation mechanism (e.g., extrapolation) to correct for these delays. 

When visual information is not present the visual estimate of the changing position is no longer reliable, but the system will still integrate, we assume, the information according to a maximum likelihood principle (Ernst and Banks, 2002). This will cause the integrated estimate to be shifted toward the proprioceptive position, which is now (without vision) more reliable. As a consequence, and after the compensation mechanism that shifts the position estimate to compensate for neural delays in feedback processing, the felt position of the hand is further ahead (**Figure 1C**) relative to when vision is available.
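
The shift described above can be sketched numerically by combining two delayed estimates with inverse-variance weights and comparing a full-vision case with a no-vision case. All variances, delays, and function names below are hypothetical values chosen for illustration. Because the same compensation shift is assumed to apply in both cases, it cancels in the difference and is omitted.

```python
def integrate(est_p, var_p, est_v, var_v):
    """Inverse-variance (maximum-likelihood) combination of two estimates."""
    w_p = (1 / var_p) / (1 / var_p + 1 / var_v)
    return w_p * est_p + (1 - w_p) * est_v


def relative_bias(velocity, var_p=1.0, var_v_light=0.25, var_v_dark=25.0,
                  d_p=0.055, d_v=0.110):
    """Integrated estimate without vision minus with vision (cm).

    velocity is in cm/s; the delays d_p and d_v are in seconds. The actual
    hand position is set to 0, so each unisensory estimate equals its lag.
    """
    est_p = -velocity * d_p  # proprioceptive estimate lags by d_p
    est_v = -velocity * d_v  # visual estimate lags by d_v
    light = integrate(est_p, var_p, est_v, var_v_light)
    dark = integrate(est_p, var_p, est_v, var_v_dark)
    return dark - light


bias_slow = relative_bias(10.0)  # hand moving at 10 cm/s
bias_fast = relative_bias(50.0)  # hand moving at 50 cm/s
```

Under these assumptions the no-vision estimate lies ahead of the full-vision estimate, and the difference scales with hand velocity.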

## **PREDICTION OF PERCEPTUAL BIAS FROM TRANSIENT PROPRIOCEPTIVE INFORMATION**

The perceptual bias for the moving limb tends to be in the direction of motion, that is, the limb is felt ahead of the actual position. Importantly, this bias does not appear to be constant along the whole limb trajectory but rather increases in the first part of the movement and decreases afterwards (e.g., Wolpert et al., 1995), and sometimes a bias in the opposite direction (behind the actual position) has been reported during the last part of the movement (e.g., Gritsenko et al., 2007).

**FIGURE 1 | Sketch to illustrate our rationale. (A)** The gray curve denotes the actual path traveled by a hand: the changing position in one dimensional space is plotted against time. The slope of the curve at any given point describes the tangential velocity at this particular time. The green and red points therefore correspond to moments at which the hand moves slowly early in the path (green) and when it moves at the highest speed half way to the target (red). See text for details. **(B)** Sketch of the position estimate based on integrated information. The dashed curve denotes the integrated estimate based on visual feedback (red) and proprioceptive feedback (blue). A constant shift is assumed to correct for the sensory delays. **(C)** The same as **(B)** but without visual information. The visual estimate (red) has a larger uncertainty.

It is important to keep in mind that this bias is relative to the actual position of the hand and it is often implicitly assumed that there would be no bias when the transient position was judged with vision of the hand. To our knowledge, however, no evidence has been reported for this. In **Figure 2** we plot the basic predictions regarding the perceptual bias in judging a transient position of the hand at the time of an external cue.

Our model, which is based on differential delays, makes strong predictions about the trend of the bias along the movement. Specifically, the amount of bias is mainly determined by the velocity of the limb at the time the probe or cue signals the moment of the judgment. This prediction is largely consistent with the observation in both Dassonville (1995) and Gritsenko et al. (2007) that position overestimates increase with increasing velocity. Unfortunately, the studies reporting perceptual biases of the unseen limb provide limited information about instantaneous limb velocity. In addition, the velocity profiles in those studies are not always easy to infer from the performed limb movements. Furthermore, the movements in some of the studies were not very fast, with movement times typically longer than 1.5 s. Slower movements may have been used to facilitate tracking of the felt position of the limb. In **Figure 2A** we plot the velocity profile from a pursuit task (Rodríguez-Herreros and López-Moliner, 2008) which is similar to the speed of the movements used in some of the studies addressing the perceptual bias during movement (Wolpert et al., 1995; Gritsenko et al., 2007). The velocity profiles determine the expected biases for the different delays, which are shown in **Figure 2B**. We outline in the next section how we computed these biases.

#### **SIMULATIONS OF PERCEPTUAL OVERESTIMATION**

We assume that the integrated estimate of hand position is aligned with the actual hand position. As a corollary of this assumption, the resultant percept is shifted to the proprioceptive one when vision is absent. The scenario depicted in **Figure 1A** would be equivalent to having an integrated estimate in which the variance for vision is very large, but with residual visual memory preventing it from reaching infinite values. To demonstrate the possible effects of such a mechanism, we use a real movement from a pursuit task (Rodríguez-Herreros and López-Moliner, 2008) so that we have the velocity $v_t$ and actual position $p_t$ of the hand across time $t$. Because the integrated position ($p_t^{vp}$) is aligned in time with the actual one ($p_t^{vp} = p_t$), the integrated position can be expressed as:

$$p_t^{vp} = w_v \times p_t^{v} + w_p \times p_t^{p} \tag{1}$$

However, due to the different visual and proprioceptive delays, the unisensory estimates of positions $p_t^{v}$ and $p_t^{p}$ will lag behind the actual position ($p_t$) to different extents (the visual position will lag more). $p_t^{v}$ would correspond with the visual estimate of the position at time $T_1$ in **Figure 1A** and $p_t^{p}$ would correspond with the proprioceptive estimate at the same time. $w_v$ and $w_p$ are the weights given to the visual and proprioceptive estimates and are detailed below.

In order to compute the bias one could, therefore, obtain the unisensory estimates for vision and proprioception by finding earlier positions within a movement, such that the proprioceptive estimate would correspond to the actual position some time steps prior to the current position, and the visual estimate would correspond to an even earlier position. However, because the bias only depends on the differential delay between vision and proprioception, for simplicity we assumed no delay for proprioception in our simulation. Accordingly, we included in Equation 1 delayed positions for vision and updated positions for proprioception. We used 30, 40, 50, and 60 ms of delay (proprioception leading vision), values that include lower and upper bounds for the differential delays reported in the literature (discussed above).

The weights for vision and proprioception depended on the reliability of each modality for localizing the hand. We set $w_v = (1/\sigma_v^2)/(1/\sigma_v^2 + 1/\sigma_p^2)$ and $w_p = (1/\sigma_p^2)/(1/\sigma_v^2 + 1/\sigma_p^2)$, where $\sigma^2$ denotes the variance of the modality. We used variances of 1 and 0.56 cm<sup>2</sup> for proprioception and vision, respectively, which are very similar to corresponding uncertainties of position estimates reported in previous studies (van Beers et al., 1999; de la Malla and López-Moliner, 2012). The actual perceptual bias was finally computed as the difference between the estimated position when there is no vision and the integrated position when there is full vision. (We assumed an infinite variance for the visual estimate when vision was absent, so the no-vision estimate is essentially equal to the proprioceptive estimate.) **Figure 2B** shows the predicted bias obtained from the same velocity profile shown in **Figure 2A** for the different delays.
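The computation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for **Figure 2**: the recorded pursuit movement is not available here, so a hypothetical minimum-jerk trajectory (30 cm in 1.5 s) stands in for it, and the function names `integration_weights` and `perceptual_bias` are our own. Proprioception is treated as undelayed and vision is delayed by the differential delay, as in the text.

```python
import numpy as np

def integration_weights(var_v, var_p):
    """Reliability-based weights for vision and proprioception."""
    inv_v, inv_p = 1.0 / var_v, 1.0 / var_p
    return inv_v / (inv_v + inv_p), inv_p / (inv_v + inv_p)

def perceptual_bias(t, pos, delay, var_v=0.56, var_p=1.0):
    """No-vision estimate minus full-vision integrated estimate (Equation 1)."""
    w_v, w_p = integration_weights(var_v, var_p)
    # Delayed visual sample; before movement onset the hand is at its start position
    p_vis = np.interp(t - delay, t, pos)
    p_prop = pos                      # proprioception assumed undelayed, as in the text
    integrated = w_v * p_vis + w_p * p_prop
    no_vision = p_prop                # infinite visual variance, so vision gets zero weight
    return no_vision - integrated

# Hypothetical minimum-jerk movement, standing in for the recorded pursuit data
T, D = 1.5, 30.0                      # duration (s) and amplitude (cm), both assumed
t = np.linspace(0.0, T, 1501)
tau = t / T
pos = D * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)

for delay in (0.03, 0.04, 0.05, 0.06):
    bias = perceptual_bias(t, pos, delay)
    print(f"delay {delay * 1000:.0f} ms: peak bias {bias.max():.2f} cm")
```

In this sketch the bias at each sample equals the visual weight multiplied by the distance the hand travelled during the differential delay, so it rises and falls with the velocity profile.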

The model captures the main trend reported in many of the studies: the bias is larger in the early part of the movement and decreases by the end. As we think the bias is caused by the differential delays, its magnitude will follow the velocity profile of the movement. For example, in Wolpert et al. (1995) the bias reaches a maximum of about 1 cm after 1 second of movement and drops afterwards. In **Figure 2**, there is a higher acceleration in the early part of the movement due to the fact that subjects had to catch up with the moving target after they started to move. In spite of these differences, the magnitude of the predicted bias caused by the differential delays is not very different from that reported in Wolpert et al. (1995). The inset of **Figure 2** illustrates the relation between the predicted bias and the tangential speed of the limb at the time of the judgments. The bias as a function of the tangential speed could be approximated by a linear function whose slope would be very close to the differential delay between vision and proprioception.

One important feature of the explanation based on the heavy weighting of efference copy in the early part of the trajectory (Wolpert et al., 1995) is that the bias should vanish when the movement is passive. As mentioned earlier, Gritsenko et al. (2007) found the same pattern of estimation error for active and passive movements. Our model, which is based on differential sensory delays, makes the same predictions for both active and passive movements.

Interestingly, Gritsenko et al. (2007) found a difference in the reported bias between fast and slow movements in the same direction that we would predict from our model. A larger bias was observed for fast movements which is consistent with the bias having originated, at least in part, from the differential delays. However, they also report a bias in the opposite direction (judgments behind the actual position) by the end of the movement. Our model cannot explain this finding, but a confound could be present during the last tested positions in their study. They used eight angle positions to obtain the judgments and the cue changed color to signal the transient position at which subjects had to make the judgment. The cue was uniformly distributed across the different angles, so that as the movement unfolded the cue expectancy was progressively increasing. Therefore, the expectancy was higher for larger angles (late part) than for smaller ones (early part). This attentional factor could have accelerated the processing of the late cues relative to early ones, thereby reducing the bias at late cues.

### **FITTING PERCEPTUAL OVERESTIMATION DATA**

Gritsenko et al. (2007) provide information about the hand's velocity at the moment of the probe; therefore, we are in the position to illustrate to what extent our model can predict part of this study's data. In **Figure 3** we reproduce the data points from the active movement conditions shown in Figure 7A of Gritsenko et al. (2007). In this study the authors found no significant differences between active and passive movements. For the data that we show here, subjects extended their arms from 40° flexion between the upper arm and forearm to full extension (180°). At some designated angle during the movement a cue (change of color plus a beep) was presented as a mnemonic cue. After completing the movement, subjects had to report the perceived position of the hand at the time of the cue. **Figure 3** shows data for four of the tested angles (60, 75, 90, and 105°), which are color-coded.

**FIGURE 3 |** [...] assuming a differential delay of 60 ms between vision and proprioception. The blue solid line denotes the best fit (slope 0.066 s and zero intercept) including the data points that fall within the gray rectangle. The dashed line (slope 0.133 s and zero intercept) denotes the fit to all data points.

At first glance one can see the strong dependency between the bias and the speed of the hand. However, the differential delay account predicts a linear dependency between hand speed at the time of the probe and the reported bias. Therefore, our explanation cannot fully account for the data pattern shown in **Figure 3**. Nevertheless, note that the bias can go as high as 60° and is certainly larger than biases of about 1 cm like those reported in Wolpert et al. (1995), which are in the prediction range of our differential delays hypothesis. One can also notice that there is an initial linear trend for all the conditions shown in **Figure 3** (data points within the gray rectangle). We fitted a linear model with only a single parameter (slope only and zero intercept) to this set of points. The blue solid line represents this linear fit, which yielded a slope of 0.066 s, very close to the black solid line that denotes the predicted bias given a differential delay of 60 ms (near the upper bound of our estimate of the differential delay). This model accounts for about 70 percent of the variability ($R^2 = 0.71$). Although it is clear that the data do not behave linearly across the entire velocity range, we also provide the fit to all the data points for information purposes only. The slope for this fitted line is 0.13 s (dashed line in **Figure 3**), which is well beyond the upper bound of estimated differential delays between vision and proprioception. Some other factors must cause the exponential increase of the biases. Another important point is that data points within the linear part scatter quite a lot around the linear fit. Part of this variability appears to be explained by the angle at the time of the probe, with smaller angles showing larger biases.
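A single-parameter fit of this kind (slope only, zero intercept) has a closed-form least-squares solution. The sketch below shows one way to compute it. The data are synthetic and purely illustrative (they are not the values from Gritsenko et al., 2007), the function name `fit_through_origin` is our own, and the $R^2$ convention used (relative to the mean of the data) is one of several possible choices for fits without an intercept.

```python
import numpy as np

def fit_through_origin(speed, bias):
    """Least-squares slope of bias = slope * speed (no intercept), plus R^2."""
    speed = np.asarray(speed, dtype=float)
    bias = np.asarray(bias, dtype=float)
    slope = np.sum(speed * bias) / np.sum(speed ** 2)
    residual = bias - slope * speed
    # R^2 relative to the mean of the data; other conventions exist for no-intercept fits
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((bias - bias.mean()) ** 2)
    return slope, r2

# Synthetic, illustrative data only: a 60 ms slope plus Gaussian scatter (all values assumed)
rng = np.random.default_rng(0)
speed = np.linspace(20.0, 200.0, 25)                      # deg/s, hypothetical range
bias = 0.06 * speed + rng.normal(0.0, 1.5, speed.size)    # deg
slope, r2 = fit_through_origin(speed, bias)
print(f"slope = {slope:.3f} s, R^2 = {r2:.2f}")
```

Because bias is expressed in units of position and speed in position per second, the fitted slope has units of seconds and can be compared directly with a differential delay.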

In sum, our differential delay hypothesis can account fairly well for the linear trends shown in the overestimation biases reported in Gritsenko et al. (2007).

## **PREDICTION OF "UNDER-REACHING" TO STATIC TARGETS**

If people tend to overestimate the real-time position of their moving hand, it makes sense that they would also tend to under-reach the target: if the moving hand is felt to be closer to a target than it really is, movements should tend to be halted prematurely. However, it is not clear whether the perceptual and the motor phenomena have common underlying mechanisms. At first glance there are some clear differences. When participants are instructed to make perceptual reports, the system is encouraged to monitor the changing position of the limb, and this goal constrains the speed of the limb in order to meet the task requirements. On the other hand, reaching to static targets does not necessarily involve monitoring the changing position of the limb. For very fast reaching movements, it is unlikely that the system keeps track of the changing position. Instead, for movement times less than 200 ms, open-loop strategies probably control the hand. Yet, it is possible that for longer movement durations the under-reaching reported in some studies could in part be explained by the temporal mechanisms we are proposing. In the next section we explore this possibility. In order to do so, we conducted simulations to obtain some indicative magnitudes of the bias based on the differential delays.

#### **SIMULATIONS OF UNDER-REACHING**

We started with 9 movements with bell-shaped velocity profiles, all of which had equal movement times but different peak velocities. **Figure 4A** shows the one-dimensional trajectories for the different movements. **Figure 4B** reproduces the velocity profile (noisy version) for each movement. For each movement we simulated 1500 trajectories as follows. In each of the 1500 iterations we first obtained a noisy version of the velocity profile. The noise was signal-dependent Gaussian noise (Harris and Wolpert, 1998), and **Figure 4B** shows one example for each type of movement. We then integrated the information to obtain the varying time series of the actual hand position for each trial. From the actual trajectory we then derived the feedback-delayed proprioceptive and visual estimates for each time step as follows.

We assumed that the initial position (before any movement) of the integrated hand estimate was aligned with the actual hand position (Smeets et al., 2006). In each trial of the simulation, the initial felt and seen positions of the hand were randomly drawn from a Gaussian distribution with a *SD* of 1 cm and 0.75 cm for proprioception and vision, respectively, centered around the actual position of the hand. These values correspond to the variances used before. Once we had the unisensory estimates of the initial positions we computed the delayed unisensory running estimates based on the previously obtained velocity profile of the actual movement. This produced two time series of changing position, one for vision and another for proprioception, with the only difference being the starting position, which was drawn at random.
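The trial-generation steps described above can be sketched as follows. This is an illustrative reading of the procedure, not the original simulation code: the time step, the noise gain, the exact bell shape, and the function names `bell_velocity` and `simulate_trial` are assumptions. The movement time (0.7 s) and the initial-position *SD*s (1 cm and 0.75 cm) are taken from the text.

```python
import numpy as np

DT = 0.001          # time step in s (assumed)
MT = 0.7            # movement time in s, as in the text

def bell_velocity(peak, mt=MT, dt=DT):
    """Bell-shaped speed profile (cm/s) that reaches `peak` halfway through."""
    t = np.arange(0.0, mt + dt / 2, dt)
    tau = t / mt
    return t, peak * 16.0 * tau**2 * (1.0 - tau)**2

def simulate_trial(peak, delay, rng, noise_gain=0.1, sd_p=1.0, sd_v=0.75):
    """One trial: actual path plus proprioceptive and delayed visual running estimates."""
    t, v = bell_velocity(peak)
    # Signal-dependent noise: SD proportional to instantaneous speed (gain is assumed)
    v_noisy = v + rng.normal(0.0, 1.0, v.size) * noise_gain * v
    actual = np.cumsum(v_noisy) * DT                  # integrate velocity to get position
    lag = int(round(delay / DT))
    # Vision reports the path `lag` samples late; before onset it reports the start position
    delayed = np.concatenate([np.full(lag, actual[0]), actual[:actual.size - lag]])
    prop = actual + rng.normal(0.0, sd_p)             # random initial felt offset
    vis = delayed + rng.normal(0.0, sd_v)             # random initial seen offset
    return t, actual, prop, vis

rng = np.random.default_rng(1)
t, actual, prop, vis = simulate_trial(peak=100.0, delay=0.05, rng=rng)
print(f"final position: {actual[-1]:.1f} cm")
```

Each unisensory series differs from the actual path only by a constant, randomly drawn starting offset, with the visual series additionally lagging by the differential delay.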

**FIGURE 4 | The nine different movements used in the simulation of under-reaching bias. (A)** Changing position of the finger for the different movements. The movement time was always 0.7 s and the peak velocity varied from 10 cm/s (the slowest movement) to 200 cm/s (the fastest movement). **(B)** The corresponding velocity profiles with signal-dependent noise. Note that in **(A)** the noise is not noticeable after integrating the tangential velocity.

We then computed the integrated estimate by using Equation 1 as we did before. At each time step in which we computed the integrated estimate of position, the proprioceptive estimate corresponded to the same time step, but the visual estimate corresponded to a past position. The amount that the visual estimate lagged the proprioceptive estimate depended on the size of the differential delay. For each simulated trial and movement we used the same set of differential delays between vision and proprioception that we used before (30, 40, 50, and 60 ms, with proprioception leading vision). In order to get a measure of the bias, we compared the no-vision estimate (assuming infinite variance for visual reliability) and the integrated full-vision estimate. The running bias would then be maximum at peak velocity. To stop the movement we computed an error signal between the running estimate and the target position, which was defined as the final position of a template movement (in this case, final position when vision and proprioception are both available). When the error was less than a threshold we stopped the movement. The inset in **Figure 5** illustrates the threshold mechanism that we used. We computed a distance between two Gaussians, one representing the felt (or integrated) position of the hand (red-dashed Gaussian) and the other denoting the estimated target position (black-solid Gaussian). In the case of the no-vision estimate, as shown in the inset, the *SD* was set to 1 cm while for the integrated condition the *SD* was 0.6 cm (variance of 0.36 cm<sup>2</sup>, derived from optimally combining proprioception and vision). The *SD* for the target localization was 0.75 cm (which results in a variance of 0.56 cm<sup>2</sup>, the same used in **Figure 2**). The final end point was obtained by using the following expression:

$$Q(p = 0.75, \mu, \sigma) - Q(p = 0.25, \mu_T, \sigma_T) < 1 \tag{2}$$

where $Q$ is the inverse Gaussian cumulative distribution function; $\mu$ and $\mu_T$ are the no-vision or full-vision position estimate and the target position estimate, respectively. $\sigma$ and $\sigma_T$ are the corresponding uncertainties (*SD*) for limb and target estimates. When the difference between quantile 0.75 of the limb position and quantile 0.25 of the target position (d in the inset of **Figure 5**) was less than 1 cm the movement was stopped. In this way we obtained the full-vision and no-vision endpoints, and could compute the relative difference between both as a measure of the bias. **Figure 5** shows this bias as a function of the peak velocity of the nine different movements that we simulated in **Figure 4**. The biases are shown for the four different differential delays used to compute the running position estimates for vision and proprioception.

**FIGURE 5 | The expected values for a bias in under-reaching static visual targets with the unseen hand as a function of peak velocity in simulated movements.** Different colors and symbols denote differential delays between visual and proprioceptive feedback. Inset: illustration of the estimation of the felt position of the hand (dashed red Gaussian) and the estimation of the static target (solid black Gaussian). A running distance (denoted by d) between both Gaussians was computed to determine the final end point based on unisensory estimates of the hand. See text for details.

Note that, as before, the reported simulated bias is independent of the compensation mechanism that shifts the integrated percept of the hand to account for the transmission delays.
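The stopping rule of Equation 2 can be illustrated with a short, noise-free sketch. Several choices below are assumptions made for illustration only: the bell-shaped profile and its 100 cm/s peak, the 1 ms time step, the 50 ms differential delay, and the helper names `quantile` and `stop_index`. We also compare the magnitude of the quantile difference with the threshold, so that a hand still far short of the target does not satisfy the criterion; this is our reading of the distance d in the inset of **Figure 5**.

```python
from statistics import NormalDist
import numpy as np

def quantile(p, mu, sigma):
    """Inverse Gaussian CDF, Q(p, mu, sigma) in Equation 2."""
    return NormalDist(mu, sigma).inv_cdf(p)

def stop_index(estimate, sd_hand, target, sd_target=0.75, threshold=1.0):
    """Index of the first sample at which the criterion of Equation 2 is met."""
    q_target = quantile(0.25, target, sd_target)
    for i, mu in enumerate(estimate):
        d = quantile(0.75, mu, sd_hand) - q_target
        # Assumption: the magnitude of d is compared with the threshold
        if abs(d) < threshold:
            return i
    return len(estimate) - 1

# Noise-free template movement (all parameter values assumed for illustration)
dt, mt, peak, delay = 0.001, 0.7, 100.0, 0.05
t = np.arange(0.0, mt + dt / 2, dt)
tau = t / mt
actual = np.cumsum(peak * 16 * tau**2 * (1 - tau)**2) * dt
target = actual[-1]                   # final position of the template movement

lag = int(round(delay / dt))
visual = np.concatenate([np.full(lag, actual[0]), actual[:-lag]])   # delayed vision
w_v = (1 / 0.75**2) / (1 / 0.75**2 + 1 / 1.0**2)                    # visual weight
full_vision = w_v * visual + (1 - w_v) * actual                     # Equation 1
no_vision = actual                                                  # proprioception only

end_full = actual[stop_index(full_vision, 0.6, target)]
end_no = actual[stop_index(no_vision, 1.0, target)]
print(f"under-reach relative to full vision: {end_full - end_no:.2f} cm")
```

In this sketch the no-vision estimate runs ahead of the integrated one and carries a larger *SD*, so the criterion is met earlier and the hand stops short of the full-vision endpoint.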

#### **COMPARISON WITH PREVIOUSLY REPORTED UNDER-REACHING**

The majority of studies of endpoint bias, to our knowledge, report an under-reach bias during targeted reaching (e.g., Soechting and Flanders, 1989; Chieffi et al., 1999; Engelbrecht et al., 2003; Diedrichsen et al., 2004; Elliott et al., 2004; Krigolson and Heath, 2004; Oliveira et al., 2005), but some have reported an over-reach bias (e.g., Lönn et al., 2000; Westwood et al., 2003), and some studies suggest that the presence and magnitude of an under-reach bias depend on the delay between target occlusion and the onset of the reach (Westwood et al., 2003; Krigolson and Heath, 2004). It is not clear why the recency of target information influences the magnitude of under-reaching, but it may have more to do with trial-to-trial error minimization than with the real-time estimate of the hand. Indeed, motor optimization is likely to contribute significantly to endpoint biases. Under-reaching a target has potential benefits for system efficiency, as it protects against movement reversals, which can incur time and energy costs to the performer (Engelbrecht et al., 2003; Elliott et al., 2004; Oliveira et al., 2005). It is likely, therefore, that under-reach biases are in part caused by strategic modulation of the feedforward impulse as a way to minimize costly error corrections (Engelbrecht et al., 2003).

The sensory-integration hypothesis that we have presented here is consistent with an under-reaching behavior, but our hypothesis can only explain the portion of the bias that is related to the real-time estimate of the moving hand. Unfortunately, this putative sensory portion of the under-reach bias has not been isolated from feedforward contributions in previous research, so the model-based estimate that we provided here may not be directly comparable to previously reported under-reach magnitudes. Matters are further complicated by the different protocols used in previous studies of movement under-reaching; often, participants are directly immersed in a no-feedback reach environment, with no prior calibration of motor commands. (Our model assumes prior calibration of reaching, such that the feedforward component is properly calibrated to target distance.) This absence of calibration in some studies might explain, for instance, dramatic under-reaching for some open-loop tasks [up to 15 cm (Soechting and Flanders, 1989)], and overreaching in others (e.g., Khan and Franks, 2000; Westwood et al., 2003). Without motor calibration, different reach conditions, such as unconstrained whole-arm reaching to remembered targets in Soechting and Flanders (1989) and 1-dimensional planar movements constrained by a manipulandum in Khan and Franks (2000), may produce distinct biases that are unrelated to the online estimation phenomenon we address with our model. Future studies will be needed to test whether our model can account for any under-reach effects.

## **ALTERNATIVE EXPLANATIONS FOR POSITION OVERESTIMATION DURING REACHING**

#### **BIASES CAUSED BY EFFERENCE COPY**

In the introduction we described Wolpert et al.'s (1995) explanation for position overestimation during reaching, which proposed that the early state estimates of the moving limb are dominated by an efference copy-based prediction. Wolpert et al.'s model provides a nice fit for their data; however, it also relies on the assumption that the motor system has access to both an efference copy and an internal forward model. In contrast, our model assumes neither efference copy nor forward modeling to produce a similar pattern of increasing and decreasing overestimation as a movement progresses, and in this sense it is the simpler model. However, we did make the assumption that a perceptual shift of the position estimate compensates for sensory delays (**Figure 1**). This predictive processing is more 'general-purpose' than the efference copy-based predictive processing employed in Wolpert et al.'s model, in that the same perceptual mechanisms that allow someone to predict the upcoming location of any moving stimulus, despite sensory processing delays, could also be employed for forward-shifting the estimated location of the moving hand. Whether or not our assumption of sensory compensation is simpler than Wolpert et al.'s assumption of motor-based prediction is arguable. However, it is important to note that our proposed compensatory shift does not have any influence on the *pattern* of overestimation produced by our model, whereas for Wolpert et al. (1995) motor prediction is integral to the pattern of overestimation. In fact, our model's predictions about differences between visual closed- and open-loop position estimates do not rely on any assumptions about compensatory shifting of the sensory estimates. Perhaps the best reason for favoring our model over an efference copy-based one, though, is that ours can account for the presence of limb position overestimation during both active and passive movements (Gritsenko et al., 2007). Moreover, our model can explain the velocity-dependence of the overestimation effect in both active and passive movements (Gritsenko et al., 2007).

One shortcoming of our model, however, is that it does not explain the underestimation performance that has been reported for later parts of a movement (Gritsenko et al., 2007). At this point, we cannot be sure whether the late position underestimation is an artefact of the experimental protocol employed by Gritsenko et al. and whether the effect is, therefore, independent of the estimation process we are attempting to explain. This, however, puts us in the tenuous position of potentially cherry-picking effects from Gritsenko et al. that support our model, such as the similar behaviors for passive and active exposures. That being said, we believe that the similarity of the *patterns* of performance observed in the passive and active conditions is a more important effect than the actual size and direction of the estimation bias, which is likely to be sensitive to the specific protocol employed. Furthermore, because there was no comparison between vision and no-vision conditions in the Gritsenko et al. study (the comparison, strictly speaking, that our model is designed to describe), we cannot know the extent to which the underestimation effect at late cue times is inconsistent with our model.

In the end, we cannot state with certainty that our model is superior to an efference copy model for explaining position misestimation during movement. At the very least, however, we have presented a plausible sensory-driven mechanism for the misestimation phenomenon. Future comparisons between visual closed- and open-loop position estimation will test the quality of our model.

## **BIASES IN THE PROPRIOCEPTIVE MAP**

One possibility that we have not yet addressed is that position overestimates are not related to movement per se, but rather to differences between visual and proprioceptive spatial maps. When the hand moves away from the body (as it does for most reaching movements), the hand may occupy locations at which the proprioceptively-sensed position is different from, and farther away from the body than, the visual one. Wilson et al. (2010), for instance, have shown that the right hand tends to be felt as though it is farther to the right than it really is. Thus, if participants make a rightward reach with their unseen right hand, their hand estimate might lead the real hand, producing perceptual position overestimates and, potentially, under-reach performance. (Position overestimates during leftward reaching with the right hand (e.g., Wolpert et al., 1995) would be harder to explain.)

While such proprioceptive biases may contribute to the overestimation phenomenon during reaching, we suspect that they do not account for all of it. Gritsenko et al. (2007), for instance, showed a speed-dependent overestimation effect, which suggests that motion of the hand does have an influence on the position estimate that is independent of the hand's current location relative to the body. Furthermore, Gritsenko et al. (2007) showed that reports of the stopped (i.e., static) hand exhibited a different pattern (one that did not meaningfully vary as a function of spatial location) than reports of the remembered location of the moving hand. Future experiments that directly compare static position reports with spatially-matched motion reports would help to clarify the contribution of a participant's proprioceptive map to the misperception of his or her moving limb.

## **SWITCHING BETWEEN VISUAL AND PROPRIOCEPTIVE ESTIMATES OF THE HAND AND THE POSSIBLE EFFECTS OF REALIGNMENT**

We have assumed that visual and proprioceptive estimates of limb position are integrated and that integration depends on the relative reliabilities of each estimate (van Beers et al., 1999; Ernst and Banks, 2002; Smeets et al., 2006). We have also assumed that the estimates are independent of each other, i.e., that one sense does not realign the other one (Smeets et al., 2006).

It is possible, however, that sensory estimates are not integrated. Rather, it may be that when vision is available it dominates position sense, and when vision is absent proprioception dominates position sense. This would not affect the direction of the bias that we have modeled, but it would increase the size of the bias. The predicted bias would be equal to the difference between the proprioceptive estimate (reaching in the dark) and the delayed visual estimate (reaching in the light), rather than the difference between the proprioceptive estimate (reaching in the dark with infinite variance for the visual estimate) and the integrated estimate (reaching in the light with weighted estimates).
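To make the difference in size concrete, the two accounts can be written side by side in the notation of Equation 1. The labels $b_{\text{switch}}$ and $b_{\text{int}}$ are introduced here only for illustration, and we assume, as in our simulations, that the weights sum to one:

$$b_{\text{switch}} = p_t^{p} - p_t^{v}, \qquad b_{\text{int}} = p_t^{p} - \left(w_v \times p_t^{v} + w_p \times p_t^{p}\right) = w_v \left(p_t^{p} - p_t^{v}\right)$$

Because $0 < w_v < 1$, the bias under integration is the fraction $w_v$ of the bias under switching. With the variances used in our simulations (1 and 0.56 cm<sup>2</sup>), $w_v \approx 0.64$, so under these assumptions a pure switching account would predict a bias roughly 1.5 times larger.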

It is also possible that the proprioceptive estimate is spatially realigned by vision (e.g., Cressman and Henriques, 2009). The effects of such spatial realignment on the running estimate of the hand as it moves in the dark would depend on the rate and direction of the deterioration of the alignment when people reach without vision. If the proprioceptive estimate remained stable after removal of vision, the direction of effector misestimation would be similar to what we have proposed here. If the proprioceptive estimate decayed, the effect on position estimation in the dark would depend on the direction of the decay.

Perhaps a more pertinent consideration is whether the proprioceptive estimate is *temporally* realigned by vision when visual feedback is available (that is, whether the sensorimotor system delays proprioceptive feedback in order to sync it with slower visual feedback). The effect of such alignment on position estimates following visual occlusion would depend on the rate of its decay in the dark. If temporal alignment decayed quickly, we would expect position overestimates to arise after only a few movements in the dark. If the decay occurred slowly, the overestimation bias would develop more gradually. As long as the decay was toward the baseline processing speed for proprioception (i.e., faster than vision), one should observe an overestimation bias. However, the rate at which the bias developed might differ from what we have modeled here.

## **BIASES IN THE LOCALIZATION OF MOVING OBJECTS**

Judgments about the location of moving objects at the time of a probe usually result in reported positions that are too far along their path (e.g., Brenner and Smeets, 2000; Whitney et al., 2000; Alais and Burr, 2003; Ögmen et al., 2004; Brenner et al., 2006). This is the very same pattern obtained for transient positions of the unseen moving limb, with the only difference being that in the former case the target is an external object. This similarity raises the question of whether the phenomenon addressed here is caused by the same mechanisms as the biases generally reported for moving objects. One needs a time of interest at which to judge the position of a moving object and this is usually signaled by using flashes or tones. However, there is still much debate about the mechanism and functionality of this bias that is consistent with an extrapolation of motion. The idea that this bias in the direction of motion compensates for sensory delays motivates one of the explanations of this phenomenon and the flash-lag effect (Nijhawan, 1994). By the time a physically aligned flash is detected (as a time marker), the moving object will have moved to a new position, causing the spatial misalignment. This explanation is not very different from the one proposed by Dassonville (1995) to account for the positive bias in the estimate of the moving hand.

Interestingly, Nijhawan and Kirschfeld (2003) reported a flash-lag effect between a flash and a rod moved with an unseen wrist. Subjects perceived a spatial misalignment between the rod and the flash. Note that this type of judgment involves comparing the position of the controlled rod relative to the cue, as in the typical flash-lag task, rather than ascertaining the position of a moving object at the time of the probe. The bias reported in Nijhawan and Kirschfeld (2003) is, however, in the same direction as the ones discussed in this article: subjects perceived the flash lagging the tip of the rod. In this study the flash or probe was presented when the rod was moving at the maximum velocity. Although the value was not reported, the average speed of the movement was 63.8 cm/s, which means that the maximum speed was higher than this value. The magnitude of the flash-lag was between 6 and 8 cm which is, admittedly, larger than would be predicted from the differential delays between proprioception and vision. There is, however, a clear difference between this study and the others. In Nijhawan and Kirschfeld (2003) the judgment relied on always comparing visual information and not a proprioceptive location at the time of a probe. Like the model outlined here, the flash-lag effect also has a clear dependency on velocity of the moving object (e.g., López-Moliner and Linares, 2006). Carefully designed experiments will, therefore, be needed to address the question of whether the bias when judging proprioceptive positions is actually a consequence of compensatory mechanisms for proprioceptive delays.
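To give a sense of scale for the comparison above, a spatial lag can be converted into the temporal offset it would imply. The sketch below assumes, purely for illustration, that the rod moves at a constant speed so that lag = speed × delay; the function name is hypothetical and this is not the model described in this article. Because the probe was presented at peak velocity, which exceeded the 63.8 cm/s average, the values obtained with the average speed are upper bounds on the implied delay.

```python
def implied_delay_ms(lag_cm: float, speed_cm_per_s: float) -> float:
    """Temporal offset (ms) that would produce a given spatial lag at constant speed."""
    if speed_cm_per_s <= 0:
        raise ValueError("speed must be positive")
    return 1000.0 * lag_cm / speed_cm_per_s


AVERAGE_SPEED_CM_PER_S = 63.8  # average movement speed quoted in the text

for lag_cm in (6.0, 8.0):  # range of flash-lag magnitudes quoted in the text
    delay = implied_delay_ms(lag_cm, AVERAGE_SPEED_CM_PER_S)
    print(f"{lag_cm:.0f} cm lag corresponds to about {delay:.0f} ms at the average speed")
```

At the (unreported) peak speed the same spatial lag would correspond to a shorter delay.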

## **FUTURE DIRECTIONS**

Our hypothesis that differential delays between vision and proprioception contribute to position overestimation provides a new perspective on how the sensorimotor system monitors the real-time location of a moving limb. If our hypothesis is correct, it might imply that efference copy is either not incorporated into the real-time estimate of the limb or that it is incorporated in an un-biasing way.


Our model makes specific predictions about how the estimate of the limb should be influenced by different movement speeds and, while these predictions are consistent with previously reported overestimation effects, future experiments are needed that specifically examine position estimates as a function of the hand's instantaneous velocity, while controlling for both cue expectancy and the spatial location of the cue relative to the participant.

We also recommend some control procedures for future investigations of real-time position estimation: (1) probing position estimates in both visual open-loop *and* closed-loop conditions, and (2) probing position estimates when the hand is moving *and* when the hand is static (or, alternatively, changing the start location and direction of reaches across trials, such that they span the workspace and thereby control for any effects of the location of the cue/target relative to the body). We also suggest that more agreement among studies might be obtained if researchers ensure that participants' reaches remain calibrated across trials. Such calibration might be achieved, for instance, by randomly inserting, among test trials, motor calibration trials in which performance feedback is provided.

Finally, we hope that future studies will examine the relationship between real-time perceptual estimates of the reaching limb and goal-directed reach performance. While it is tempting to assume that perceptual position overestimation is directly related to an under-reaching bias, we are not aware of any studies that have tested this link.

## **ACKNOWLEDGMENTS**

The research group is supported by Grant 2009SGR00308 from the Catalan Government. The first author (Brendan D. Cameron) was supported by a Juan de la Cierva fellowship from the Spanish Government and the Marie Curie fellowship PCIG13-GA-2013-618407. The third author (Joan López-Moliner) was supported by an ICREA Academia Distinguished Professorship award.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 September 2013; accepted: 15 January 2014; published online: 03 February 2014.*

*Citation: Cameron BD, de la Malla C and López-Moliner J (2014) The role of differential delays in integrating transient visual and proprioceptive information. Front. Psychol. 5:50. doi: 10.3389/fpsyg.2014.00050*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Cameron, de la Malla and López-Moliner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Gaze-dependent spatial updating of tactile targets in a localization task

## *Stefanie Mueller and Katja Fiehler\**

*Department of Psychology, Justus-Liebig University Giessen, Giessen, Germany*

#### *Edited by:*

*Knut Drewing, Giessen University, Germany*

#### *Reviewed by:*

*Joan López-Moliner, Universitat de Barcelona, Spain Laurence R. Harris, York University, Canada*

#### *\*Correspondence:*

*Katja Fiehler, Department of Psychology, Justus-Liebig University Giessen, Otto-Behaghel-Str. 10F, 35394 Giessen, Germany e-mail: katja.fiehler@psychol.uni-giessen.de*

There is concurrent evidence that visual reach targets are represented with respect to gaze. For tactile reach targets, we previously showed that an effector movement leads to a shift from a gaze-independent to a gaze-dependent reference frame. Here we aimed to unravel the influence of effector movement (gaze shift) on the reference frame of tactile stimuli using a spatial localization task (yes/no paradigm). We assessed how gaze direction (fixation *left/right*) alters the perceived spatial location (point of subjective equality) of sequentially presented tactile standard and visual comparison stimuli while effector movement (gaze *fixed/shifted*) and stimulus order (*vis-tac/tac-vis*) were varied. In the fixed-gaze condition, subjects maintained gaze at the fixation site throughout the trial. In the shifted-gaze condition, they foveated the first stimulus, then made a saccade toward the fixation site where they held gaze while the second stimulus appeared. Only when an effector movement occurred *after* the encoding of the tactile stimulus (shifted-gaze, *tac-vis*), gaze *similarly* influenced the perceived location of the tactile and the visual stimulus. In contrast, when gaze was fixed or a gaze shift occurred *before* encoding of the tactile stimulus, gaze *differentially* affected the perceived spatial relation of the tactile and the visual stimulus suggesting gaze-dependent coding of only one of the two stimuli. Consistent with previous findings this implies that visual stimuli vary with gaze irrespective of whether gaze is fixed or shifted. However, a gaze-dependent representation of tactile stimuli seems to critically depend on an effector movement (gaze shift) after tactile encoding triggering spatial updating of tactile targets in a gaze-dependent reference frame. Together with our recent findings on tactile reaching, the present results imply similar underlying reference frames for tactile spatial perception and action.

**Keywords: movement, reference frames, spatial localization, spatial updating, tactile, visual**

## **INTRODUCTION**

Object locations in our environment can be derived through various sensory channels which end in sensory-specific spatial maps. One inherent complexity arises when spatial information represented in different coordinate systems needs to be compared for future action. For example, directing the hand toward a glowing object in the dark requires the spatial comparison of the locations of the effector (the hand) derived through proprioception and the object derived through vision. So far, it is still an open question which reference frames are used to localize and compare spatial information of different sensory modalities.

Previous studies suggest the use of a gaze-dependent reference frame when people are asked to localize a visual target in space. Bock (1986) first found that participants overestimate the remembered location of peripherally viewed visual targets; an effect called the retinal magnification effect (RME). They point too far to the right if gaze is directed left of the target and vice versa. Similar gaze-dependent errors were later reported when participants initially foveate the target and then shift gaze to an eccentric location after the target was extinguished (Henriques et al., 1998). It has been argued that the mental representation of the (remembered) visual target had been updated, or remapped, into the visual periphery due to the gaze shift and that this remapped target representation is used to plan the movement resulting in the RME. Gaze-dependent spatial updating of visual targets has consistently been observed for goal-directed reaching (e.g., Medendorp and Crawford, 2002; Beurze et al., 2006; Schütz et al., 2013) and grasping (Selen and Medendorp, 2011) as well as spatial localization tasks where participants judged the position of an eccentric (remembered or present) visual target with respect to a proprioceptive (Fiehler et al., 2010), visual (Eggert et al., 2001) or auditory (Lewald and Ehrenstein, 1996, 1998) comparison stimulus. This suggests similar spatial coding mechanisms for action and perception.

An influence of gaze on spatial localization has also been found for somatosensory targets; however, the findings are less consistent compared to visual targets. Behavioral studies demonstrated gaze-dependent reach (Pouget et al., 2002b; Blangero et al., 2005; Jones and Henriques, 2010; Reuschel et al., 2012) and localization errors (Harrar and Harris, 2009, 2010; Fiehler et al., 2010; Pritchett et al., 2012) for proprioceptive and tactile targets as obtained in experiments with visual targets. This may imply similar spatial coding mechanisms for visual and somatosensory target modalities. However, a neuroimaging study which examined the reference frames for visual and proprioceptive reach targets argued for a flexible use of gaze-centered and body-centered coordinate systems depending on the sensory target modality (Bernier and Grafton, 2010). More specifically, the authors suggest a dominant use of a gaze-dependent reference frame for visual targets and a gaze-independent, body-centered reference frame for proprioceptive targets.

In a recent study we showed that an effector movement probably accounts for the incongruent results reported in previous research on goal-directed reaching to proprioceptive targets (Mueller and Fiehler, 2013). We investigated whether reach errors varied as a function of gaze relative to target depending on the presence or absence of an effector movement before the reach. An effector movement could be either a gaze shift after target presentation and before reaching or an active movement of the target (non-reaching) arm to the target location. The movement conditions were compared with a stationary condition where gaze was fixed throughout the trial and the target arm remained at the target location. We observed gaze-dependent reach errors only in conditions where an effector movement was introduced before the reach. Thus, an effector movement (of the eyes or the target limb) seems to trigger a switch from a gaze-independent to a primarily gaze-dependent representation of somatosensory reach targets. We obtained this result for tactile and proprioceptive-tactile targets.

However, in our previous study (Mueller and Fiehler, 2013), the introduced effector movement of the target limb might have interfered with the reaching hand in contrast to the condition without effector movement. Dessing et al. (2012) recently suggested that the consistently observed gaze-dependent reach errors to visual targets originate (at least in part) from hand-related biases due to a misestimation of the proprioceptive feedback from the hand instead of a misestimation of the remembered target location (e.g., Henriques et al., 1998; Blohm and Crawford, 2007; Khan et al., 2007). Assuming that gaze-dependent reach errors are primarily caused by a mislocalization of the reaching hand, the question arises whether the influence of the effector movement on spatial coding, as observed in our previous study (Mueller and Fiehler, 2013), is merely due to interference of the introduced effector movement (eyes or arm) with the reaching hand. The goal of the present study was to test the influence of effector movement on gaze-dependent spatial coding of tactile targets in a perceptual localization task, thus eliminating the impact of reach-related localization errors of the hand. By applying a cross-modal approach, we were able to directly contrast the reference frames of tactile and visual stimuli while varying the presence or absence of an intervening effector movement.

We conducted a psychophysical spatial localization task (yes/no paradigm) where the remembered location of a tactile standard stimulus had to be judged relative to the location of a remembered visual comparison stimulus. By exploiting the profound evidence on gaze-dependent coding and updating of visual stimuli obtained in localization tasks (Lewald and Ehrenstein, 1996, 1998; Eggert et al., 2001; Fiehler et al., 2010), we aimed to unravel the underlying reference frames used to encode tactile location. Gaze was varied relative to a tactile standard (fixation: left or right) and held eccentric during the response. We further included two gaze conditions which differed by whether a gaze shift was introduced *between* the presentation of the tactile standard and the visual comparison (*shifted-gaze*) or gaze was held fixed at an eccentric location *from the beginning* of the trial (*fixed-gaze*). The two gaze conditions were combined with two possible stimulus orders where the visual comparison was presented before the tactile standard (*vis-tac*) or vice versa (*tac-vis*).

The points of subjective equality (PSEs) were assessed as an indicator for the perceived spatial relation of the tactile and the visual stimulus. In particular, we examined how the PSEs varied as a function of gaze direction, thus allowing conclusions about the underlying gaze-dependent reference frame of the tactile stimulus. We would expect PSEs to vary similarly with fixation if both the tactile and the visual stimulus were represented in a gaze-dependent reference frame. In contrast, PSEs should vary differentially with respect to gaze if only the visual but not the tactile stimulus is coded in gaze-dependent coordinates.

**Figure 1** depicts the respective result patterns for the two potential outcomes. The first row (A) shows a *differential influence of gaze* direction on the visual and tactile stimulus. Here, we assume that the location of the (physical) visual stimulus (gray circle) is perceptually displaced opposite to gaze direction (orange circle) while the location of the (physical) tactile stimulus (gray star) remains unaffected by gaze direction (yellow star). Consequently, the (physical) visual comparison has to be presented farther left to be perceived as aligned with the tactile stimulus (=PSE) if the subject fixates to the left [row (A), 1st panel]. Conversely, if the subject fixates to the right, the perceived visual comparison is misestimated to the left and thus, the (physical) visual comparison has to be presented farther right to be perceived as aligned with the tactile stimulus [row (A), 2nd panel]. This finally leads to a divergence of PSEs as a function of fixation [row (A), 3rd and 4th panel]. The second row (B) depicts a *similar influence of gaze* on the tactile and the visual stimulus. Here, we assume that directing gaze to the left or right leads to a misestimation opposite to gaze for both the visual (orange circle) and the tactile (yellow star) stimulus [row (B), 1st and 2nd panel]. Thereby, the spatial relation between the two stimuli (which is reflected in the PSE) is preserved and should result in similar PSEs irrespective of fixation [row (B), 3rd and 4th panel].

Based on our previous findings (Mueller and Fiehler, 2013), we hypothesize that an effector movement (i.e., a gaze shift) which is executed after the presentation of the tactile standard leads to gaze-dependent spatial updating of the remembered tactile target. Note that this case depends on both gaze condition (shifted-gaze) *and* stimulus order (*tac-vis*). When the location of both the tactile standard and the visual comparison is updated with respect to gaze (orange circle), the spatial relation between the two stimulus modalities should be preserved resulting in similar PSEs (**Figure 1B**). In contrast, if no intervening gaze shift is present after the presentation of the tactile standard (i.e., in the conditions fixed-gaze, *vis-tac/tac-vis* and shifted-gaze, *vis-tac*) we hypothesize the tactile stimulus to be represented in a gaze-independent reference frame. Consequently, gaze direction should only affect the visual but not the tactile stimulus and thereby result in different PSEs varying as a function of fixation (**Figure 1A**).
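The two predicted result patterns can be illustrated with a toy calculation. The sketch below assumes, as a simplification that is not part of the study, that each stimulus is perceived as displaced opposite to gaze by a fixed amount; the function name and the 2° bias are hypothetical placeholders chosen only to show the direction of the effects.

```python
def predicted_pse(tactile_deg, gaze_sign, visual_bias_deg, tactile_bias_deg):
    """Physical location of the visual comparison perceived as aligned with the tactile standard.

    gaze_sign is -1 for fixation left and +1 for fixation right.
    Each stimulus is assumed to be perceived displaced opposite to gaze by its bias:
        perceived_visual  = visual  - visual_bias_deg  * gaze_sign
        perceived_tactile = tactile - tactile_bias_deg * gaze_sign
    Setting the two equal and solving for the physical visual location gives the PSE.
    """
    return tactile_deg + (visual_bias_deg - tactile_bias_deg) * gaze_sign


BIAS_DEG = 2.0  # hypothetical magnitude, for illustration only

# Pattern (A): only the visual stimulus is coded relative to gaze.
differential = [predicted_pse(0.0, g, BIAS_DEG, 0.0) for g in (-1, +1)]

# Pattern (B): both stimuli are coded relative to gaze with the same bias.
similar = [predicted_pse(0.0, g, BIAS_DEG, BIAS_DEG) for g in (-1, +1)]

print("differential influence (left, right):", differential)
print("similar influence (left, right):", similar)
```

Under these assumptions the PSE moves leftward for left fixation and rightward for right fixation in pattern (A), and does not depend on fixation in pattern (B).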

**FIGURE 1 |** […] be misestimated opposite to gaze direction (orange circle) while the tactile stimulus remains unaffected by gaze direction (yellow star). Consequently, the same spatial relation between the physical tactile (gray star) and the physical visual stimulus (gray circle) results in a perceived spatial relation (colored star/circle) that varies as a function of fixation (1st and 2nd panel). For a specific standard location to be perceived as aligned to a visual comparison, the visual comparison has to be presented more leftwards when the fixation is left and more rightwards when the fixation is right. This is reflected in a shift of psychometric functions and thus on PSEs depending on […] preserving their spatial relation (1st and 2nd panel). This is reflected in similar psychometric functions and according PSEs for left- and rightward gaze shifts (3rd and 4th panel; fixation left: lilac, fixation right: orange). **(C)** Trial sequence of the critical shifted-gaze, tac-vis condition where we expect a similar influence of gaze. (I) Presentation of the foveated tactile stimulus; (II) gaze shift to a peripheral fixation; gaze-dependent spatial update of the tactile stimulus; (III) presentation of the visual comparison; gaze-dependent spatial update of the visual stimulus; (IV) relation between the two stimuli is preserved and thus, the response does not change with fixation.

## **METHODS**

## **PARTICIPANTS**

Fifteen healthy participants took part in the experiment. After data cleaning (see section Data Analysis) the number of participants was reduced to 10 (males/females: 6/4, age range: 21–28 years, mean ± *SD*: 24 ± 2.4 years). All participants had normal or corrected to normal vision, were right handed, and provided written informed consent according to the local ethics committee. Course credits were received for participation.

## **GENERAL EXPERIMENTAL SETUP**

A schematic of the experimental setup is depicted in **Figure 2**. Subjects sat in front of the apparatus which was mounted on a table. The left forearm was placed inside the apparatus, parallel to the torso. Three solenoids on a height-adjustable board were arranged directly above the arm. When an electrical current was applied to a solenoid it drove out a small pin (length: 9 mm, diameter: 1 mm) which gently touched the dorsal surface of the arm. Touches (*tactile standard stimuli*) were located at 10° left, 10° right, and central (0°) to the right eye and with the midpoint of the arm (from elbow to wrist) roughly aligned with the central stimulus. The distance between the right eye and the central stimulus was approximately 25.5 cm.
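For readers who want to relate the angular stimulus locations to distances on the apparatus, the sketch below converts an eccentricity in degrees into a lateral offset in centimeters. It assumes a flat stimulus plane perpendicular to the line of sight at the central stimulus, which the text does not state; the function name is hypothetical and the results are therefore only rough approximations.

```python
import math

EYE_TO_CENTER_CM = 25.5  # approximate eye-to-central-stimulus distance given in the text


def eccentricity_to_offset_cm(angle_deg, distance_cm=EYE_TO_CENTER_CM):
    """Lateral offset (cm) of a point seen at angle_deg from the central stimulus.

    Assumes a flat plane perpendicular to the line of sight (illustrative assumption).
    """
    return distance_cm * math.tan(math.radians(angle_deg))


for angle in (10, 20, 25):  # standard, fixation, and staircase start eccentricities
    print(f"{angle} deg is roughly {eccentricity_to_offset_cm(angle):.1f} cm")
```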

To mask the sounds associated with touch presentation, subjects wore in-ear headphones (Philips SHE8500) presenting white noise.

The arm and the solenoids were covered with a horizontally mounted black cardboard on which the visual comparison and fixation stimuli were projected by an LCD projector. Before each session a calibration grid was projected directly on the tactile stimuli (cardboard was removed) that were fixed within the apparatus in order to ensure that tactile and visual presentations were aligned. *Visual comparison stimuli* were single white dots (diameter: 5 mm) varying in location between 25° left and 25° right of each standard location. For each trial, the location of the visual comparison stimulus was determined by an adaptive staircase procedure with variable step size (see section Adaptive Staircase Procedure).

Fixation stimuli consisted of a white cross (height/length: 10 mm) which was presented 20° left or right of the location of the tactile standard stimuli. To ensure compliance with instructions, we recorded movements of the right eye by a head-mounted EyeLink II eye tracker system (SR Research) at a sampling rate of 250 Hz. Before each experimental block the eye tracker was calibrated with a horizontal three point calibration at 10° left, 10° right and 0° (the tactile standard locations). Responses were given by left or right button presses. Participants performed the task in a dark room. To avoid dark adaptation a small halogen table lamp was switched on for 800 ms before every trial. The experiment was performed using Presentation® software (Version 15.0, www.neurobs.com).

## **PROCEDURE**

Subjects completed a spatial localization task (yes/no paradigm) by indicating if they perceived a tactile standard stimulus left or right of a visual comparison stimulus. The task was performed under two gaze conditions (*fixed* vs. *shifted*) and two orders of stimulus presentation (*vis-tac* vs. *tac-vis*).

## *Adaptive staircase procedure*

Each standard-fixation combination (−10°/0°/10° × −20°/20°) was performed with two opposing staircases which differed by the initial location of the visual comparison. While one staircase started with an initial position at 25° left, the other started at 25° right of the standard location. Within each staircase, an adaptive algorithm determined both the magnitude of the shift of the visual comparison for the next trial (the step size) and the direction in which the visual comparison was shifted depending on the subject's response in the previous trial. More specifically, the step size was gradually decreased as the visual comparison approached the perceived standard location, thus placing more observations around the parameter of interest (PSE).

The applied algorithm was the accelerated stochastic approximation developed by Kesten (1958), which reduces the step size each time the response (left or right) changes with respect to the preceding trial within one staircase. The initial step size was set to 28°, which was halved before the first step was carried out, i.e., the largest possible step size was 14° (for details, see Treutwein, 1995). The minimal step size was set to 2°. The direction of each step depended on the previous response within the respective staircase and placed the visual comparison closer to the perceived standard location; e.g., if the subject indicated that the tactile standard was left (right) of the visual comparison, in the next trial the visual comparison was shifted leftwards (rightwards). We further imposed the restriction on the first trial within each staircase (where the visual comparison was separated from the standard by 25°) that it had to be classified correctly; otherwise the first trial was repeated. Distances were set on the basis of previous findings that were obtained with a comparable setup (Mueller and Fiehler, 2013). **Figure 3** displays exemplary data obtained from one subject for both fixations and the two gaze conditions [panel (A) and panel (B)].
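The logic of such a staircase can be sketched in a few lines. The example below is a simplified illustration and not the implementation used in the experiment: it assumes a step schedule of first_step / (1 + number of response changes), floored at the minimal step size, which mimics the reversal-driven shrinking described above but is not Kesten's exact formula. The simulated observer is deterministic and hypothetical, and all function and parameter names are our own.

```python
def run_staircase(start_deg, respond, n_trials=25, first_step_deg=14.0, min_step_deg=2.0):
    """Return the visual comparison locations (deg, relative to the standard) over trials.

    respond(x) must return "left" if the tactile standard is judged left of the
    visual comparison at location x, and "right" otherwise.
    """
    x = start_deg
    track = []
    reversals = 0
    previous = None
    for _ in range(n_trials):
        track.append(x)
        response = respond(x)
        if previous is not None and response != previous:
            reversals += 1  # the response changed with respect to the preceding trial
        # Assumed schedule: the step shrinks with every response change, down to the minimum.
        step = max(min_step_deg, first_step_deg / (1 + reversals))
        # Standard judged left of the comparison: shift the comparison leftwards, and vice versa.
        x += -step if response == "left" else step
        previous = response
    return track


def make_observer(pse_deg):
    """Hypothetical noise-free observer whose point of subjective equality is pse_deg."""
    return lambda x: "left" if x > pse_deg else "right"


observer = make_observer(3.0)
from_right = run_staircase(25.0, observer)
from_left = run_staircase(-25.0, observer)
print("last locations (start right):", [round(v, 2) for v in from_right[-4:]])
print("last locations (start left):", [round(v, 2) for v in from_left[-4:]])
```

With a noise-free observer both staircases end up oscillating around the simulated PSE at the minimal step size; real responses are noisy, which is why the PSE is estimated by fitting a psychometric function to the pooled trials rather than read off the track.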

## *Gaze conditions*

In order to examine how an eye movement intervening between stimulus presentation and response affects the reference frame of tactile and visual stimulus localizations, we applied two gaze conditions: (a) a fixed-gaze condition and (b) a shifted-gaze condition (see **Figure 4**). In both gaze conditions, gaze was directed at a fixation location during the response. In every trial, the tactile standard and the visual comparison were presented for 50 ms each.

The *fixed-gaze condition* (**Figure 4A**) started with the presentation of the fixation cross for 750 ms. Subjects were asked to fixate the indicated location and to maintain gaze at this location until they delivered the response. Thereby, gaze was fixed while the standard and the comparison stimulus were presented.

The *shifted-gaze condition* (**Figure 4B**) began with the presentation of the first stimulus; this could be either the standard or the comparison stimulus depending on stimulus order. Subjects were asked to fixate the felt or viewed location of the first stimulus before they shifted gaze toward the fixation location as soon as the fixation cross was presented. Gaze had to be held at the fixation location until the response was given. Thereby, an eye movement was introduced between the first and the second stimulus (i.e., between the presentation of the tactile standard and the visual comparison); thus the second stimulus was presented while gaze was directed at the fixation location.

**FIGURE 3 | Observed data of one subject for the central standard location.** Depicted are the locations of the visual comparison presented in each trial of the two staircases when fixating to the left (1st panels) and to the right (2nd panels). Each staircase comprised 25 trials, resulting in 50 trials per fixation-standard combination. Over the course of trials the two staircases approached the PSE (dashed line). The 3rd panel in each row shows the resulting psychometric functions fitted to the responses collapsed across the two opposing staircases. **(A)** Data of the condition fixed-gaze, *vis-tac* for which we expected a differential influence of fixation on the visual and tactile stimulus, i.e., a significant difference between the PSEs for the left and right fixation (vertical dashed lines). **(B)** Data of the condition shifted-gaze, *tac-vis* for which we expected the same influence of gaze on both stimulus modalities, i.e., no significant difference between the PSEs (vertical dashed lines).

### *Stimulus order*

To examine the differential effects of a gaze shift on the two stimulus modalities we varied the order in which they were presented. In the *vis-tac condition,* the visual comparison was presented before the tactile standard. In the *tac-vis condition,* the tactile standard was presented before the visual comparison stimulus. Note that depending on the gaze-condition gaze was aligned with the fixation location *before* or *between* the presentation of the standard and the comparison.

## *Control condition*

We further applied a control condition (**Figures 4C**, **5**) where subjects were asked to fixate the first stimulus and keep gaze at this location until the response. This condition was introduced to assess the perceived location of the tactile standards while gaze was either maintained at the standard (*tac-vis*, **Figure 5**, left) or held eccentric to the standard (*vis-tac*, **Figure 5**, right); i.e., no gaze shift occurred between the presentation of the standard and the comparison. For neither stimulus order did we expect gaze direction to bias the spatial judgment of the tactile standard stimulus.

## **DATA ANALYSIS**

We assessed PSEs as a function of fixation (left or right) depending on gaze condition and stimulus order.

Two opposing staircases (starting 25° left/right of the standard, see **Figure 3**) were conducted for the three standard locations (10° left/right and central) combined with the two fixations (20° left/right of the standard), resulting in 12 staircases (2 × 3 × 2) for each gaze condition (fixed/shifted). The 24 staircases were performed in two stimulus orders (*vis-tac/tac-vis*), totaling 48 staircases. The control condition did not involve different fixations, reducing the number of staircases to 6 (2 × 3), which were carried out for the two stimulus orders, i.e., in total 12 staircases. Each staircase comprised 25 trials. Trials were randomized across the staircases within each gaze condition and within the control condition. Every 100 trials, short breaks with the light turned on were included.
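The staircase counts above can be checked by enumerating the design. The snippet below is only a bookkeeping sketch with variable names of our own choosing; the resulting trial total is nominal, since it ignores repeated first trials and trials later discarded as invalid.

```python
from itertools import product

standards_deg = (-10, 0, 10)     # tactile standard locations
fixations_deg = (-20, 20)        # fixation relative to the standard
start_offsets_deg = (-25, 25)    # starting points of the two opposing staircases
gaze_conditions = ("fixed", "shifted")
stimulus_orders = ("vis-tac", "tac-vis")
TRIALS_PER_STAIRCASE = 25

# Main conditions: gaze condition x stimulus order x standard x fixation x staircase start.
main_staircases = list(
    product(gaze_conditions, stimulus_orders, standards_deg, fixations_deg, start_offsets_deg)
)
# Control condition: no separate fixations.
control_staircases = list(product(stimulus_orders, standards_deg, start_offsets_deg))

nominal_trials = (len(main_staircases) + len(control_staircases)) * TRIALS_PER_STAIRCASE

print("main staircases:", len(main_staircases))
print("control staircases:", len(control_staircases))
print("nominal trials per participant:", nominal_trials)
```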

**FIGURE 4 |** […] indicated location until the response was given. **(A)** Fixed-gaze: gaze was aligned with the fixation location before the tactile standard and the visual […] foveated the tactile standard and held gaze at this location while the visual comparison was presented.

**FIGURE 5 | Schematic hypotheses of the control condition for both stimulus orders.** *Tac-vis* **(left panel)**: subjects fixated the perceived location of the tactile standard (yellow star) and held gaze at this location while the visual comparison (gray circle) was presented subsequently. Viewing the visual comparison peripherally should lead to a misestimation of its position opposite to gaze direction (orange circle), however, leaving the left/right judgment unaffected. *Vis-tac* **(right panel)**: subjects fixated the visual comparison (gray circle) and held gaze at this location while the tactile standard (gray star) was presented subsequently. The foveated visual stimulus should be localized accurately (orange circle), thus providing a spatially correct reference when judging the location of the tactile standard (yellow star).

Stimulus order was varied in separate sessions and counterbalanced across participants. Within each stimulus order, gaze and control conditions were performed in randomized order. For data analyses, we collapsed the two opposing staircases that belonged together.

Eye tracking data were exported into a custom graphical user interface (GUI) written in MATLAB R2007b (The MathWorks Inc., Natick, MA) to ensure subjects' compliance with instructions in every trial. Trials were classified as valid and included in data analyses if gaze stayed within ±2.5° of the fixation location until the response was recorded. The percentage of valid trials had to be higher than 60% in every condition, otherwise the subject was excluded, yielding 10 (out of 15) remaining subjects for further analyses.
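The inclusion rule can be expressed as a short sketch. It is not the GUI used in the study: the function names and data layout are hypothetical, and it assumes that the gaze samples passed in already cover only the interval during which fixation was required.

```python
def trial_is_valid(gaze_samples_deg, fixation_deg, tolerance_deg=2.5):
    """True if every horizontal gaze sample lies within the tolerance of the fixation location."""
    return bool(gaze_samples_deg) and all(
        abs(sample - fixation_deg) <= tolerance_deg for sample in gaze_samples_deg
    )


def subject_is_included(trials_by_condition, min_valid_fraction=0.60):
    """True if the fraction of valid trials is higher than the criterion in every condition.

    trials_by_condition maps a condition label to a list of
    (gaze_samples_deg, fixation_deg) pairs, one pair per trial.
    """
    for trials in trials_by_condition.values():
        if not trials:
            return False
        n_valid = sum(trial_is_valid(samples, fixation) for samples, fixation in trials)
        if n_valid / len(trials) <= min_valid_fraction:
            return False
    return True
```

A subject with, say, two valid trials out of three in each condition would pass this check, whereas exactly 60% valid trials in any condition would not, because the criterion is "higher than 60%".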

For valid trials, psychometric functions were fitted using psignifit version 2.5.6 (see http://bootstrap-software.org/psignifit/), a software package which implements the maximum-likelihood method described by Wichmann and Hill (2001). In order to account for our sampling scheme where high intensity values were underrepresented we fixed gamma at 0. The fitting procedure was conducted separately for each participant and standard-fixation combination (−10°/0°/10° × *left/right*) in each condition (shifted/fixed gaze × *vis-tac/tac-vis* and control × *vis-tac/tac-vis*), totaling 30 psychometric functions per subject. **Supplementary Figure 1** depicts all psychometric functions of one subject in the two gaze conditions. The fitted parameter estimations for the PSE and the 84% difference threshold were exported to SPSS (SPSS Inc., Chicago, IL), with which all further computations were performed.
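To make the fitting step concrete, the sketch below estimates a PSE by maximum likelihood with a plain grid search. It is not psignifit and does not reproduce its API or bootstrap procedure. It assumes a cumulative Gaussian psychometric function with guess and lapse rates both fixed at 0 (only the former is stated in the text), and the response counts are made up for illustration. For a cumulative Gaussian, the distance between the 50% and roughly 84% points equals the standard deviation, which is how the sketch reports a threshold.

```python
import math


def cumulative_gaussian(x, mu, sigma):
    """Probability of one response category at comparison location x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def neg_log_likelihood(mu, sigma, levels, n_yes, n_total):
    """Binomial negative log-likelihood of the counts under the model."""
    nll = 0.0
    for x, k, n in zip(levels, n_yes, n_total):
        p = min(max(cumulative_gaussian(x, mu, sigma), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_pse(levels, n_yes, n_total):
    """Grid-search maximum-likelihood estimate of (PSE, sigma) in degrees."""
    mus = [m / 10.0 for m in range(-150, 151)]   # -15.0 to 15.0 in 0.1 steps
    sigmas = [s / 10.0 for s in range(5, 101)]   # 0.5 to 10.0 in 0.1 steps
    _, mu, sigma = min(
        (neg_log_likelihood(m, s, levels, n_yes, n_total), m, s)
        for m in mus
        for s in sigmas
    )
    return mu, sigma


# Made-up counts of one response category at each comparison location (deg).
levels = [-12, -8, -4, 0, 4, 8, 12]
n_yes = [0, 0, 1, 6, 14, 19, 20]
n_total = [20] * len(levels)

pse, sigma = fit_pse(levels, n_yes, n_total)
print(f"PSE: {pse:.1f} deg, 50%-to-84% distance: {sigma:.1f} deg")
```

A grid search is used here only to keep the example dependency-free; dedicated packages use proper optimizers and also provide confidence intervals.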

## **STATISTICAL ANALYSES**

We conducted a cross-modal spatial localization task in which the location of a remembered tactile stimulus had to be judged as left or right to a remembered visual comparison stimulus.

In order to check whether participants were able to discriminate the three standard locations, we first analyzed the PSEs of the control condition with a Two-Way RM ANOVA [standard location (3) × stimulus order (2)]. Analogously, we analyzed the slopes of the control condition indicating the precision of the spatial judgments.

To test our hypothesis (see **Figure 1**) that PSEs vary as a function of fixation depending on both stimulus order and gaze condition (Three-Way interaction), we conducted a 2 × 2 × 2 repeated measures analysis of variance (RM ANOVA) on PSEs with the factors gaze condition (*fixed/shifted*), stimulus order (*vis-tac*/*tac-vis*) and fixation (*left/right*).

Second, we analyzed PSE shifts as a function of fixation depending on stimulus order within each gaze condition by conducting a 2 × 2 RM ANOVA with the factors stimulus order and fixation. According to our hypothesis (see **Figure 1**), we expected a main effect of fixation in the fixed-gaze condition and an interaction between stimulus order and fixation in the shifted-gaze condition. To further examine a putative interaction in the shifted-gaze condition, one-tailed paired *t*-tests were performed to test for significant differences between the left and right fixation (*PSE*<sub>left</sub> < *PSE*<sub>right</sub>) within each stimulus order. We expected PSEs to differ significantly across fixations for the *vis-tac* but not for the *tac-vis* condition.

Finally, we analyzed the precision of the spatial judgments as a function of stimulus order, gaze condition, and fixation [RM ANOVA: stimulus order (2) × gaze condition (2) × fixation (2)]. However, the conclusive value of this analysis is restricted by the fact that the applied adaptive algorithm aimed to estimate the PSE and not the slope (see Levitt, 1971, for details on the features of psychophysical procedures).

Each time sphericity was violated as determined with Mauchly's test, Greenhouse-Geisser corrected *p*-values are reported.

## **RESULTS**

The present study aimed to examine how a gaze shift after the presentation of a tactile target changes its reference frame. Based on the assumption that spatial localization and goal-directed movements to targets share similar spatial coding mechanisms, we expected a switch from a gaze-independent to a gaze-dependent reference frame for tactile targets in a visuotactile spatial localization task, consistent with our previous findings on goal-directed reaching to tactile targets (Mueller and Fiehler, 2013).

## **CONTROL CONDITION**

In order to assess the perceived location of the tactile standard in the absence of a bias with respect to gaze direction (see **Figure 5**), we conducted a control condition where participants were asked to fixate the first stimulus which could either be the tactile standard or the visual comparison depending on stimulus order (*tac-vis/vis-tac*). For the stimulus order *tac-vis,* subjects fixated the perceived location of the tactile standard and judged its relative location by simply indicating if it was left or right of the visual comparison that was subsequently presented into the visual periphery. Even if the visual stimulus was shifted with respect to gaze, it should not change the subject's response and thus, the PSEs (see **Figure 5**, left). For the stimulus order *vis-tac,* we assume that the fixated location of a visual stimulus can be judged quite accurately (Bock, 1986; Henriques et al., 1998) and therefore should provide a veridical reference when judging the location of the tactile stimulus which was subsequently presented (see **Figure 5**, right).

Results are shown in **Figure 6**. Mean PSEs (see **Table 1**) significantly varied with standard location irrespective of stimulus order [main effect standard location: *F*(2, 18) = 23.7, *p* = 0.001], indicating that subjects were able to discriminate the three touch locations. As reported in previous studies (Harrar and Harris, 2009; Jones et al., 2012; Mueller and Fiehler, 2013), we observed a constant bias toward the side of the body where the limb was stimulated, i.e., when a somatosensory target is on the left hand or arm it is felt more leftward than it actually is. However, the magnitude of mislocalization reported in the literature (Harrar and Harris, 2009; Jones et al., 2012) is on average smaller (about 2 cm) compared to our results (about 4 cm). Since we observed similar (gaze-independent) biases in another experiment conducted with this setup (Mueller and Fiehler, 2013) we consider the increased magnitude of biases as reflecting a peculiarity of the setup which does not vary across conditions.

Slopes did not vary with standard location (−10°/0°/10°) or stimulus order (*vis-tac*/*tac-vis*) in the control condition (*p*'s > 0.05; **Table 1**). Therefore, for further analyses we collapsed the data across the three standard locations.

## **INFLUENCE OF GAZE ON TACTILE LOCALIZATION**

We conducted a Three-Way RM ANOVA with the factors stimulus order (2) × gaze condition (2) × fixation (2) on PSEs. We expected a different effect of *fixation* on PSEs in the condition where gaze was shifted *after* the encoding of the tactile standard (shifted-gaze, *tac-vis*) compared to the conditions where gaze was *fixed* (fixed-gaze, *tac-vis* and fixed-gaze, *vis-tac*) or gaze was shifted *before* the tactile standard was presented (shifted-gaze, *vis-tac*), resulting in a Three-Way interaction of stimulus order, gaze condition and fixation. Indeed, gaze condition interacted with fixation depending on the level of stimulus order [Three-Way interaction: *F*(1, 9) = 21.6, *p* = 0.001]. **Figure 7** displays the mean PSEs as a function of fixation (x-axis) for each stimulus order combined with each gaze condition. To further explore this effect of gaze, we examined the interaction for fixed- and shifted-gaze, separately.

**subjects for each standard location and stimulus order.** Error bars display the standard errors of the mean. Horizontal dashed lines indicate the physical standard locations.

In the fixed-gaze condition, we expected the PSEs to vary as a function of fixation due to a gaze-dependent mislocalization of the visual comparison but not of the tactile standard (represented in a gaze-independent frame) as illustrated in **Figure 1A**. This effect should occur irrespective of the order in which the standard and the comparison stimuli were presented. Consistent with our hypothesis, we found a main effect of fixation [*F*(1, 9) = 20.3, *p* = 0.001; **Figure 7**, green and blue line] that did not vary with stimulus order [interaction: *F*(1, 9) = 0.2, *p* = 0.650].

**Table 1 | Mean PSEs and slopes with standard errors of the means of the control condition.**

*Data were averaged across subjects for each tactile standard location and stimulus order.*

**order).** Error bars display the standard errors of the mean.

In the shifted-gaze condition, we hypothesized that the effect of fixation would critically depend on the order in which the tactile standard and the visual comparison were presented. Specifically, we expected that a gaze shift *after* the presentation of the tactile standard would trigger a shift from a gaze-independent to a gaze-dependent representation of the tactile standard. This, in turn, should result in a predominantly gaze-dependent representation of both the tactile standard and the visual comparison reflected by similar PSEs. That is, the effect of fixation should be comparable for the tactile standard and the visual comparison, thereby keeping their spatial relation constant (see **Figure 1B**). In accordance with our hypothesis, PSEs varied as a function of fixation depending on stimulus order [interaction: *F*(1, 9) = 15.4, *p* = 0.003]. We further explored this effect by calculating *post-hoc* paired *t*-tests. The results demonstrated that PSEs significantly differed as a function of fixation [*t*(9) = −6.0, *p* < 0.001] if gaze was shifted from the visual comparison to the fixation location *before* the tactile standard was presented (stimulus order *vis-tac*; **Figure 7**, black line). However, if gaze was shifted *after* the encoding of the tactile standard, this effect vanished [*t*(9) = −1.2, *p* = 0.270; **Figure 7**, red line].

To check for putative effects caused by the three different standard locations (−10°/0°/10°) we further performed the respective paired *t*-tests separately for the three individual touch locations using Bonferroni-adjusted alpha levels of *p* < 0.008 (0.05/6). Results were confirmed for each of the standard locations with significantly smaller PSEs for fixations to the left than to the right in the *vis-tac* condition (*p*'s < 0.005) but not in the *tac-vis* condition (*p*'s > 0.063).
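The Bonferroni adjustment used here is simple arithmetic, sketched below in Python. The divisor of 6 is taken from the text (0.05/6) and presumably reflects three standard locations times two stimulus orders. The helper name is hypothetical, and the two *p*-values passed to it are the bounds reported above, not individual test results.

```python
family_alpha = 0.05
n_tests = 6  # divisor reported in the text (0.05/6)

alpha_per_test = family_alpha / n_tests
print(round(alpha_per_test, 4))  # 0.0083


def is_significant(p, alpha=alpha_per_test):
    """Compare a single p-value against the Bonferroni-adjusted alpha level."""
    return p < alpha


print(is_significant(0.005))  # True: upper bound reported for vis-tac
print(is_significant(0.063))  # False: lower bound reported for tac-vis
```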

## **SLOPES**

To test for differences in precision, the slopes of the psychometric functions were analyzed. We conducted a Three-Way RM ANOVA for stimulus order (2) × gaze condition (2) × fixation (2), analogous to the analysis performed on the PSEs, and obtained an interaction between gaze condition and fixation [*F*(1, 9) = 8.8, *p* = 0.016]. We further explored this effect by testing the difference within each gaze condition between the left and right fixation as well as the difference across gaze conditions within each fixation (averaged across stimulus orders). *Post-hoc* paired *t*-tests yielded no significant differences (*p*'s > 0.076).

## **DISCUSSION**

The present study investigated the role of an effector movement (gaze shift) on spatial coding and updating of tactile stimuli in a gaze-dependent reference frame. To this end, we examined how the spatial relation of a tactile and a visual stimulus varied with gaze direction (fixation *left/right*) depending on stimulus order (*vis-tac/tac-vis*) and gaze condition (fixed/shifted) in a visuotactile spatial localization task (yes/no paradigm). We found that gaze direction similarly influenced the localization of both the tactile and the visual stimulus when a gaze shift occurred after the presentation of the tactile stimulus (shifted-gaze, *tac-vis*). In contrast, when gaze was fixed at an eccentric location before the tactile stimulus was presented (shifted-gaze, *vis-tac* and fixed-gaze, *tac-vis/vis-tac*) gaze direction differentially affected the spatial localization of the tactile and the visual stimulus.

The present results support our previous findings obtained in a goal-directed reaching task where we observed gaze-dependent reach errors when subjects either moved their eyes or arm/hand (effector movement) before they reached to a somatosensory (tactile or proprioceptive-tactile) target in comparison to conditions where no effector movement occurred (Mueller and Fiehler, 2013). This finding suggests a switch from a gaze-independent to a gaze-dependent spatial representation of remembered tactile stimuli triggered by an effector movement. Because the positional judgment task, used here, required no reaching movement we can rule out that the observed bias in spatial localization is due to proprioceptive mislocalization of the reaching hand (cf., Dessing et al., 2012), strengthening our previous findings (Mueller and Fiehler, 2013). Instead, the obtained biases rather reflect a mislocalization of the remembered target opposite to gaze direction (cf., Bock, 1986; Henriques et al., 1998).

Since we assessed the *relative* location of two subsequently presented stimuli in a cross-modal task, different hypotheses about the relative mislocalization of the tactile and the visual stimulus can be generated. Based on a considerable amount of research on spatial coding and updating of visual targets [localization tasks: (Lewald and Ehrenstein, 1996, 1998; Eggert et al., 2001; Fiehler et al., 2010); goal-directed reaching tasks: (Henriques et al., 1998; Lewald, 1998; Jones and Henriques, 2010)], we assume that the location of the visual comparison was always overestimated in the opposite direction of gaze. Following this assumption, we are able to infer the perceived location of the tactile stimulus by interpreting the positional judgments of the tactile standard relative to the visual comparison, expressed by the PSE. We interpret similar PSEs for both fixations as evidence for a gaze-dependent spatial representation of both visual and tactile stimuli while differences in PSEs are taken as evidence for a gaze-dependent representation of the visual but not of the tactile stimulus. We are aware that the PSEs differing between fixations could also be explained by opposing localization errors of visual (opposite to gaze) and tactile (in the direction of gaze) stimuli (cf., Harrar and Harris, 2009, 2010). The direction of gaze-dependent localization errors of tactile stimuli (in the direction or opposite to gaze) seems to depend on the task, in particular on head eccentricity during the time of response (Pritchett et al., 2012). However, opposing error patterns are unable to explain similar PSEs for both fixation sides, as we found for the condition where an effector movement (gaze shift) occurred after tactile stimulus encoding.

While tactile spatial information enters the nervous system in somatotopic coordinates unaffected by gaze direction, a gaze shift after the encoding of the tactile stimulus seems to trigger an update of its remembered location in gaze coordinates. Pritchett et al. (2012) also observed a switch from a body-centered to a gaze-centered reference frame when participants turned their head with the eyes (gaze = head angle + eye angle) after target presentation and before reporting the touch location on a visual scale. In line with the present findings, they concluded that tactile targets are coded in a gaze-centered reference frame "when the locations of the touches need to be remembered and reconstructed after a move." These findings together with previous studies on gaze-dependent spatial updating of visual and proprioceptive targets (Henriques et al., 1998; Pouget et al., 2002b; Beurze et al., 2006; Fiehler et al., 2010; Jones and Henriques, 2010; Reuschel et al., 2012; Schütz et al., 2013) indicate that spatial updating seems to be a mechanism which operates in gaze-centered coordinates irrespective of the modality by which the location was originally perceived. However, it does not exclude a contribution of additional non-retinotopic reference frames, not tested here (cf., Pouget et al., 2002a). The use of a shared gaze-centered representation might facilitate the integration of spatial information from different sensory modalities, especially in situations where an effector movement requires a fast and continuous update of information in space. Electrophysiological studies in monkeys have demonstrated that gaze-centered spatial updating is based on predictive signals of neurons in the posterior parietal cortex which provoke a shift of visual receptive fields to the new updated location even 80 ms before the beginning of the eye movement (Duhamel et al., 1992). Little is known about predictive spatial updating of tactile receptive fields. Avillac et al. (2005) determined the reference frame of tactile targets (air puffs) in area VIP of the posterior parietal cortex while the monkey fixated one of three visual targets. They found that eye position did not affect tactile receptive fields suggesting spatial coding in head/body-centered coordinates, consistent with our results in the conditions where gaze was held at an eccentric location before the tactile stimulus was encoded (fixed-gaze, vis-tac/tac-vis and shifted-gaze, vis-tac). So far (at least to our knowledge), studies investigating spatial updating of tactile receptive fields triggered by a gaze shift are lacking.

Further evidence for gaze-centered spatial updating comes from research on goal-directed reaching where an influence of gaze shifts on reach endpoints has been reported for visual (for reviews see, Medendorp et al., 2008; Medendorp, 2011), auditory (Pouget et al., 2002b), proprioceptive (Pouget et al., 2002b; Jones and Henriques, 2010; Reuschel et al., 2012) and tactile (Buchholz et al., 2013) targets. Together with the present findings on tactile spatial localization, these findings suggest a similar underlying reference frame for spatial perception and goal-directed movements. The use of a common frame of reference may facilitate the interaction of space perception and action; two functions that are tightly coupled at the behavioral and neuronal level.

In sum, our results suggest that an intervening effector movement (gaze shift) changes the reference frame of tactile targets in a spatial localization task. While spatial information about visual and tactile stimuli enters the nervous system through different sensory channels associated with different reference frames, it is updated in gaze-centered coordinates triggered by an intervening gaze shift. This mechanism seems to apply for goal-directed reaching (Mueller and Fiehler, 2013) as well as for spatial localization.

### **AUTHOR CONTRIBUTIONS**

Stefanie Mueller and Katja Fiehler designed the experiment, Stefanie Mueller collected and analyzed the data, Stefanie Mueller and Katja Fiehler wrote the paper.

## **ACKNOWLEDGMENTS**

This project was supported by the German Research Foundation (DFG Fi1567/4-1 assigned to Katja Fiehler).

## **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fpsyg.2014.00066/abstract

#### **Supplementary Figure 1 | Psychometric functions of one exemplary subject for the fixed- and shifted-gaze conditions.**

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The Associate Editor, Knut Drewing, declares that, despite being affiliated with the same institution as the authors, Stefanie Mueller and Katja Fiehler, the review process was handled objectively and no conflict of interest exists.

*Received: 30 September 2013; accepted: 17 January 2014; published online: 10 February 2014.*

*Citation: Mueller S and Fiehler K (2014) Gaze-dependent spatial updating of tactile targets in a localization task. Front. Psychol. 5:66. doi: 10.3389/fpsyg.2014.00066*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Mueller and Fiehler. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Effects of angular gain transformations between movement and visual feedback on coordination performance in unimanual circling

## *Martina Rieger<sup>1,2</sup>\*, Sandra Dietrich<sup>1,3</sup> and Wolfgang Prinz<sup>1</sup>\**

*<sup>1</sup> Department of Psychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany*

*<sup>2</sup> Department for Medical Sciences and Management, Institute for Psychology, University for Health Sciences, Medical Informatics and Technology, Hall in Tirol, Austria*

*<sup>3</sup> Department of Education, Leipzig University, Leipzig, Germany*

#### *Edited by:*

*Christine Sutter, RWTH Aachen University, Germany*

#### *Reviewed by:*

*Christine Sutter, RWTH Aachen University, Germany Chase Joseph Coelho, Lockheed Martin Corporation, National Aeronautics and Space Administration (Johnson Space Center), USA*

#### *\*Correspondence:*

*Martina Rieger, Department for Medical Sciences and Management, Institute for Psychology, University for Health Sciences, Medical Informatics and Technology, Eduard Wallnöfer Zentrum 1, A-6060 Hall in Tirol, Austria e-mail: martina.rieger@umit.at; Wolfgang Prinz, Department of Psychology, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1a, 04103 Leipzig, Germany*

*e-mail: prinz@cbs.mpg.de*

Tool actions are characterized by a transformation (of spatio-temporal and/or force-related characteristics) between movements and their resulting consequences in the environment. This transformation has to be taken into account when planning and executing movements, and its existence may affect performance. In the present study we investigated how angular gain transformations between movement and visual feedback during circling movements affect coordination performance. Participants coordinated the visual feedback (feedback dot) with a continuously circling stimulus (stimulus dot) on a computer screen in order to produce mirror symmetric trajectories of the two dots. The movement angle was multiplied by a gain factor (0.5–2; nine levels) before it was presented on the screen. Thus, the angular gain transformations changed the spatio-temporal relationship between the movement and its feedback in visual space, and resulted in a non-constant mapping of movement to feedback positions. Coordination performance was best with gain = 1. With high gains the feedback dot led the stimulus dot; with small gains it lagged behind. Anchoring (reduced movement variability) occurred when the two trajectories were close to each other. Awareness of the transformation depended on the deviation of the gain from 1. In conclusion, the size of an angular gain transformation as well as its mere presence influence performance in a situation in which the mapping of movement positions to visual feedback positions is not constant. When designing machines or tools that involve transformations between movements and their external consequences, one should be aware that the mere presence of angular gains may result in performance decrements and that there can be flaws in the representation of the transformation.

**Keywords: unimanual coordination, visuo-motor transformation, gain transformation, sensorimotor integration, tool transformation, circling, synchronization**

## **INTRODUCTION**

Movements of the limbs are limited by the speed and the distance they can cover without moving the whole body at the same time. Tools, however, allow us to overcome motor system limitations. By using tools, we can reach distances out of bodily reach or achieve movement effects in the environment which are faster or slower than our actual movements. Tool use requires that an adjustment to some type of transformation between motor activity and resulting consequences in external space takes place. The transformation can be kinematic (i.e., refers to the relationship between the spatio-temporal characteristics of limb movement and the associated spatio-temporal characteristics of the tool movement) and/or dynamic (i.e., refers to the relationship between the forces the limb exerts and the forces that a tool exerts on the environment; Massen and Rieger, 2012). Kinematic transformations consist of two aspects (see Bedford, 1994). First, the consequences in external space happen in a different location than the actual motor activity. For example, when using a computer mouse, motor activity takes place on a mouse-pad but the resulting consequences happen on a computer screen. Second, the term transformation indicates that the mapping between motor activity and consequences in external space is not 1:1. When using a computer mouse or a touchpad the cursor on the screen covers a larger distance than the actual movement (correspondingly, the speed of the feedback is faster than the actual movement, gain larger than 1). When driving a car, turning the steering wheel by 90° does not result in the wheels also turning by 90°, but less (gain smaller than 1). Thus, use of tools implies that a transformation has to be taken into account when planning and executing movements. The transformation itself seems to be an important part of the cognitive representation of tool-use actions (Massen and Prinz, 2007).

In the present study we were interested in gain transformations, a specific way to vary the mapping between motor activity and its consequences in external space. A transformation of gain means that the resulting consequences in external space are larger or smaller than the actual movement, as is the case when using a computer mouse or turning a steering wheel. Gain transformations are generally thought to be easy to adapt to (Bedford, 1994; Bock and Burghoff, 1997; Seidler et al., 2001; Rieger et al., 2005). For example, drawing three strokes after a gain change is introduced is sufficient for adaptation (Rieger et al., 2005).

Gain transformations also influence movement difficulty as described by Fitts' Law (Fitts, 1954). Movements are more difficult (i.e., movement time is higher, they are performed less accurately) with higher gains than with lower gains (Rosenbaum and Gregory, 2002; Rieger et al., 2005; Mohler et al., 2007; Sutter et al., 2008). For instance, Mohler et al. (2007) asked participants to walk on a treadmill. They received visual input of half, the same, or twice the speed of actual walking. Preferred walking speed was lower with doubled visual speed and higher with halved visual speed compared to when visual and walking speed were the same. Similar results have been obtained with hand movements (Rosenbaum and Gregory, 2002; Rieger et al., 2005; Sutter et al., 2008). Again, movements are more difficult with higher gains, resulting in a deterioration in endpoint accuracy when movement frequency is given (Rosenbaum and Gregory, 2002), or in slower movements when participants are free to choose their movement speed but are instructed to adhere to spatial accuracy requirements (Rieger et al., 2005; Sutter et al., 2008). Presumably, those adjustments reflect that the cognitive system tries to maximize the predictability of the perceived trajectory in external space.

Most of the previous studies have investigated the influence of different gains in movements along a straight line (along the medial or sagittal axis). In contrast, in the present study we investigated transformations scaling gain in circling movements. Such a transformation is for example present when using a hand driven spinning wheel. A hand driven spinning wheel requires that one hand rotates a drive wheel (usually the bigger wheel, which is often rotated by a handle) which turns the smaller spindle assembly, with the spindle turning several times for every turn of the drive wheel. Circling movements differ from movements in a straight line when an angular gain unequal to 1 between the movement and its feedback is introduced. Whereas the mapping of positions on the movement trajectory to positions on the visual trajectory is constant in movements on a straight line, this is not the case in circular movements. Rather, an angular gain unequal to 1 in a circular movement results in a constant change of the mapping of positions on the movement trajectory to positions on the visual trajectory, even though the gain itself remains constant. As an example, imagine that the starting position of the movement trajectory and the starting position of the visual feedback trajectory are both on the right side of a circle. If a gain of 1.5 is introduced, after moving one circle in movement space (the hand is again on the right side), 1.5 circles in visual space have been covered, and now the visual feedback is on the left side of the circle. After another circle in movement space, hand and visual feedback are both on the right side again: in movement space, two circles have occurred, in visual space three circles have occurred.
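The gain-1.5 example can be traced numerically. The following Python sketch is an illustration only, not the software used in the experiment: the function names are hypothetical, angles are expressed in degrees, and 0° is taken to be the right side of the circle.

```python
def feedback_angle(hand_angle_deg, gain):
    """Position of the visual feedback on the circle (0-360 degrees)
    for a cumulative hand angle multiplied by the gain factor."""
    return (gain * hand_angle_deg) % 360.0


def offset(hand_angle_deg, gain):
    """Angular offset between feedback position and hand position (0-360 degrees)."""
    return (feedback_angle(hand_angle_deg, gain) - hand_angle_deg) % 360.0


gain = 1.5
for revolutions in (0, 1, 2):
    hand = 360.0 * revolutions  # cumulative hand angle after whole circles
    print(revolutions, feedback_angle(hand, gain), offset(hand, gain))
```

After one hand revolution the feedback sits at 180° (the left side), and after two revolutions it is back at 0°, so the offset between hand and feedback positions keeps changing although the gain is constant. With a gain of 1 the offset stays at 0 throughout.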

Circling movements have often been investigated in bimanual coordination studies (e.g., Swinnen et al., 1997). Research on bimanual coordination has demonstrated that people are more accurate and consistent if they execute bilateral mirror symmetric movements (movements in which the hands move toward and away from the body midline at the same time, e.g., moving one hand clockwise and the other hand counterclockwise) than when they perform any other type of movement pattern (e.g., moving both hands clockwise, Swinnen et al., 1997). Transformed visual feedback has been used to study the relevance of motor constraints/motor related feedback (kinesthesis and proprioception) and perceptual-cognitive constraints/visual feedback for coordination performance. For instance, visual feedback of a circling movement has been shifted 180° (Tomatsu and Ohtsuki, 2005), or transformed to result in an easily perceivable pattern (mirror-symmetric, Mechsner et al., 2001, Lissajous displays, Kovacs et al., 2010a,b) such that participants are able to perform complicated or awkward bimanual movement patterns (such as 4:3, Mechsner et al., 2001), which are otherwise impossible or very difficult to perform. These studies indicate that visual processes play an important role for bimanual coordination (see also Bogaerts et al., 2003; Mechsner, 2004). The perceptual ease of horizontally aligned symmetry information is also illustrated by perceptual studies: it is easier to judge images which are mirrored along a horizontal axis than images which are mirrored along a vertical axis (Quinlan, 2002). We therefore decided to instruct participants to coordinate transformed movement feedback with a stimulus in a way that a symmetric pattern emerges in visual space, which should be perceptually easy.

Coupling phenomena found in bimanual coordination tasks seem to persist in unimanual coordination, i.e., when coordination occurs between a single limb and a computer display (e.g., Wimmers et al., 1992; Buekers et al., 2000). In unimanual coordination there is no second limb with which movements need to be coordinated, but rather a coordinative stimulus/event. Since there can be no constraints on the motor level related to bimanual coordination (only one hand is moving), unimanual coordination has to follow the perceptual characteristics of the movement feedback, which can be either visual and/or proprioceptive/kinesthetic. Studies indicate that visual feedback dominates in many situations of unimanual coordination (Buekers et al., 2000; Roerdink et al., 2005; Dietrich et al., 2012). However, the states of the limb, and the perception of those states, must also be taken into account (Wilson et al., 2005a,b). Further, it depends on the type of task whether visual or kinesthetic/proprioceptive information is most beneficial (Alaerts et al., 2007). Similar to the present task, Dietrich et al. (2012) asked participants to perform a unimanual coordination task that required participants to coordinate the visual feedback of hand movements with a circling stimulus. To dissociate movements and the associated proprioceptive/kinesthetic feedback from visual movement feedback, participants performed the task under regular and transformed visual feedback (180° angular shift). Results indicated that coordination mainly occurs in visual space (similar data patterns with regular and transformed feedback), but subtle effects of coordination in movement space were also observed. Further, the presence of a transformation affected performance negatively. Thus, if movement and its feedback do not correspond, performance may suffer. However, the transformation in Dietrich et al. (2012) did not consist of a gain transformation, but rather a constant shift of the feedback relative to the hand. Müsseler and Sutter (2009) also investigated transformed circular movements. Participants drew circles on a display while the hand movements followed either vertical or horizontal ellipses. Even though a gain transformation was involved to achieve this feedback (either in the *x*- or *y*-axis), the mapping of movement positions to feedback positions was constant, similar to when gain transformations are introduced in movements on a straight line. In contrast to those studies, in the present study gain transformations were introduced in such a way that the mapping of movement positions to feedback positions was not constant. The effect of such a transformation on performance as well as on awareness of the transformation is largely unknown.

In the present study, we used a unimanual coordination task in order to investigate how the perceptual-motor system deals with angular gain transformations resulting in a non-constant mapping of movement positions to feedback positions in circling. Participants were asked to coordinate a feedback dot (produced by the participants' movement and presented on the right side of a screen) with a continuously circling stimulus dot (presented on the left side of the screen), in order to produce mirror symmetric circular movements of the two dots on the screen. The movement angle of the hand was multiplied by a gain factor before being presented on the screen: we used four gains smaller than 1, a gain of 1, and four gains larger than 1. This allowed us not only to compare transformed vs. regular conditions (e.g., Mechsner et al., 2001; Roerdink et al., 2005; Dietrich et al., 2012), but also to study the impact of transformation magnitude on coordination performance.

If only perceptual characteristics in visual space are important for unimanual coordination, the different gains between hand movement and its feedback should have no effect on performance, as the pattern participants were asked to produce in visual space was always the same. Thus, *accuracy* of performance, i.e., the time participants spend in the instructed visual pattern, should be equal for different gains. If movement speed, or some biomechanical variable related to movement speed, is important for coordination performance, performance should decline the smaller the gain, because smaller gains imply that more distance has to be covered by the hand movement to produce the desired distance on the screen. Therefore, hand movements have to be faster. However, if it matters that a transformation is introduced between movement and its feedback, the best performance should be observed at a gain of 1 and performance should be worse at gains both smaller and larger than 1. If performance is worse in gains unequal to 1, we were further interested in whether the magnitude of the transformation matters for performance. On the one hand, one could expect that all gains which are not equal to 1 are performed equally well (or badly), because they all imply a constant change in the mapping of hand position to feedback position. On the other hand, the mapping change is more drastic in gains which show a larger deviation from 1 than in gains that show a smaller deviation. Thus, performance may vary gradually.
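The speed-based prediction can be made concrete with a minimal sketch. It assumes that the feedback angle equals the hand angle multiplied by the gain, so that matching a stimulus circling at a given frequency requires a hand frequency of stimulus frequency divided by gain. The function name and the example values are illustrative and are not taken from the original experiment software.

```python
def required_hand_frequency(stimulus_hz: float, gain: float) -> float:
    """Hand circling frequency (Hz) needed so that the feedback dot,
    which moves at gain * hand speed, matches the stimulus frequency."""
    if gain <= 0:
        raise ValueError("gain must be positive")
    return stimulus_hz / gain


# Illustrative gains: smaller gains demand faster hand movements.
for gain in (0.5, 1.0, 2.0):
    hz = required_hand_frequency(1.0, gain)
    print(f"gain {gain}: hand must circle at {hz:.2f} Hz")
```

Under this assumption, halving the gain doubles the required hand speed, which is why a purely speed-based account predicts a monotonic decline in performance toward smaller gains.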

We further varied the speed of the stimulus dot in three levels, because previous studies have shown that coordination performance deteriorates when movement and/or feedback speed increases (Kelso, 1984; Haken et al., 1985; Heuer, 1993; Byblow et al., 1995; Carson et al., 1997; Roerdink et al., 2005), especially under transformation conditions (e.g., Salter et al., 2004; Alaerts et al., 2007; Dietrich et al., 2012). We therefore expected to find deterioration in performance with increasing speed.

In addition to the accuracy of performance, we were interested in *how* participants perform the task. First, we were interested in whether participants' movement feedback is on the ideal position as instructed (in perfect mirror symmetry), or whether it systematically lags behind or is in advance of (leads) that position. We assumed that the feedback dot would be in advance of the stimulus dot, as the movements were performed with the right (dominant) hand and the feedback was presented on the right side of the screen. In bimanual coordination the dominant hand usually shows a slight lead over the non-dominant hand when coordinating symmetrical movements (Treffner and Turvey, 1995), an effect which seems to be due to attention rather than motoric factors, because the lead of the dominant hand disappears when attention is directed to the non-dominant hand (Amazeen et al., 1997). However, this lead might be affected by the gain transformation, because gain transformations may evoke subjective feelings of feedback being slow or fast.

The second way to investigate how participants perform the task was to analyze whether they show anchoring, that is, a reduced variability at specific locations on the trajectories (Roerdink et al., 2008). Regions at which anchoring occurs are often located at or near movement reversals or maximal excursions (e.g., Beek, 1989; Kelso and Jeka, 1992; Byblow et al., 1995; Fink et al., 2000), that is, regions in which critical task-specific information is available for organizing cyclical movements (Beek, 1989; Kelso and Jeka, 1992). In addition to reducing kinematic variability at/around movement transition points, anchoring stabilizes entire movement cycles (Roerdink et al., 2008). Anchoring has therefore often been regarded as a motoric phenomenon. However, Roerdink et al. (2005) found support for visual as well as motoric contributions to anchoring. Furthermore, Roerdink et al. (2008) found that anchoring in visual space and in movement space were independent from each other. Usually, anchoring is studied in reference to externally generated events like a metronome (e.g., Fink et al., 2000), or in relation to self-generated events like movement reversals (Roerdink et al., 2008), ball release in juggling (Beek, 1989), or feedback tones in tapping (Keller and Repp, 2008), all of which provide discrete information which can be used for anchoring. Such information was not available in our task. We therefore assumed that anchoring would occur in a visually salient location, that is, when the two dots are closest together in the middle of the screen. Due to the non-constant mapping between movement positions and feedback positions, and correspondingly between movement positions and stimulus positions, such a position is difficult to conceive in movement space; we therefore investigated anchoring only in visual space.

We were further interested in whether awareness of the transformation depends on the magnitude of the transformation or whether a mismatch between movement and feedback position (i.e., any transformation) is sufficient to detect the transformation. Previous studies indicate that participants are not very good at knowing their actual hand positions when transformations between movements and their feedback are introduced and that the magnitude of a perturbation plays an important role for detecting it (Fourneret and Jeannerod, 1998; Knoblich and Kircher, 2004; Sutter et al., 2008; Müsseler and Sutter, 2009). This low awareness of one's own hand movement seems to stem from characteristics of the tactile and proprioceptive systems as well as insufficient spatial reconstruction of this information in memory (Müsseler and Sutter, 2009). Based on the previous studies, one can expect that the detection of the transformation depends on the magnitude of the gain. However, even with small gain transformations, positions in movement and visual space eventually become very discrepant. For instance, with a gain of 1.2, 2.5 circles in movement space result in three circles in visual space, and hand position and feedback position are thus on opposite sides of the circle. Thus, the mere presence of a transformation may be important for its detection in the present task, but not its size.
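The growing discrepancy between hand and feedback position can be sketched as follows. The sketch assumes that hand and feedback start at the same angle and that the feedback angle is simply the gain times the accumulated hand angle; the function name is hypothetical.

```python
def hand_feedback_offset_deg(hand_revolutions: float, gain: float) -> float:
    """Angular offset (0 to 360 deg) between feedback dot and hand after
    the hand has completed `hand_revolutions` circles from a common start."""
    hand_deg = hand_revolutions * 360.0
    feedback_deg = gain * hand_deg  # assumed transformation: angle * gain
    return (feedback_deg - hand_deg) % 360.0


# Example from the text: gain 1.2 and 2.5 hand circles.
offset = hand_feedback_offset_deg(2.5, 1.2)
print(f"offset after 2.5 hand circles at gain 1.2: {offset:.1f} deg")
```

With these values the feedback has travelled three circles while the hand has travelled 2.5, so the two positions are half a circle apart.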

## **MATERIALS AND METHODS**

### **PARTICIPANTS**

Fourteen adults (eight female and six male, aged 20–28 years, mean = 24.6 years, SD = 2.2 years) took part in the experiment. Two additional participants were tested but excluded from data analysis because they had difficulties performing the task. All participants were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971) and reported normal or corrected-to-normal vision. They were paid seven Euros/hour to participate in a single session. Participants gave informed consent. The study was conducted in accordance with the Declaration of Helsinki and was approved by the local ethics committee.

## **APPARATUS AND STIMULI**

The experiment was programmed in C in a Microsoft DOS environment. Movements were recorded using a Wacom UD A3 writing pad (resolution: 500 pixels per centimeter, sampling rate: 100 Hz), which was connected to the computer via a serial port. The serial port was open all the time, and as soon as a new data sample was available it was processed by the program. The writing pad was positioned on a desk horizontally in front of participants. Stimuli were presented on a 17-inch screen (refresh rate: 75 Hz, resolution: 800×600 pixels, positioned vertically). The center of the screen was aligned with the midsagittal axis of the participant's body and located behind and 15 cm higher than the writing pad. The background of the screen was black.

The stimulus was presented as a white dot (diameter = 0.43 cm, stimulus dot), moving clockwise on a circular trajectory (radius = 4.32 cm). A second white dot (feedback dot, radius 0.43 cm) was controlled by a stylus for the writing pad. The stylus was fixed inside a crank (radius 5 cm) that participants held, which could only be moved in circles. The crank was fixed below a wooden board (15 cm above the writing pad), which also served to shield the hand from view. The center of the circular trajectory of the hand was positioned 10 cm to the right of the body midline. The distance between the centers of the stimulus and feedback trajectories on the screen was 17.27 cm. Participants sat on a height-adjustable chair, which they could adjust to their comfort before the experiment started. Eye-screen distance was approximately 60 cm.

### **PROCEDURE AND DESIGN**

Participants were instructed to produce mirror symmetric movements of the dots on the screen: they were always asked to move their hand in a counter-clockwise direction and to match the speed of the feedback dot to the speed of the stimulus dot (which always moved clockwise). The stimulus dot was presented at three different speeds: 0.8, 1, and 1.2 Hz (i.e., 0.8, 1.0, and 1.2 circles per second, respectively). The relation between the speed of the hand movement and the speed of the feedback dot was manipulated by introducing different gains.

The angle the hand moved between two measuring points (angular displacement) was multiplied by a gain factor between 0.5 and 2 before being displayed on the screen at the next refresh. The average delay until a data sample was presented on the screen was 7.67 ms; the maximum delay was 14.33 ms. This was due to the refresh rate of the screen and a maximum of 1 ms for data transmission and the necessary calculations. There were nine different gains: four smaller than 1 (0.5, 0.6, 0.75, 0.8), requiring the hand movement to be faster than the movement of the feedback dot (MoFast gains), gain = 1, and four larger than 1 (1.25, 1.3, 1.5, 2), requiring the hand movement to be slower than the movement of the feedback dot (MoSlow gains). For an illustration see **Figure 1**.
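The reported delays appear consistent with simple refresh-rate arithmetic. The sketch below assumes that a new sample waits between zero and one full refresh period for the next screen refresh (half a period on average) and that the stated 1 ms bound for transmission and computation is added on top. These assumptions are ours and the function name is hypothetical.

```python
from typing import Tuple

REFRESH_HZ = 75.0        # screen refresh rate reported in the text
PROCESSING_MAX_MS = 1.0  # stated upper bound for transmission + computation


def display_delay_ms(refresh_hz: float, processing_ms: float) -> Tuple[float, float]:
    """Return (average, maximum) delay in ms, assuming a sample waits
    uniformly between 0 and one refresh period for the next refresh."""
    period_ms = 1000.0 / refresh_hz
    average = period_ms / 2.0 + processing_ms
    maximum = period_ms + processing_ms
    return average, maximum


avg_ms, max_ms = display_delay_ms(REFRESH_HZ, PROCESSING_MAX_MS)
print(f"average delay: {avg_ms:.2f} ms, maximum delay: {max_ms:.2f} ms")
```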

The experiment started with a short trial in which participants were asked to turn the crank in order to check whether the writing pad worked properly and to allow participants to familiarize themselves with the apparatus. After that participants read the instructions and saw a demonstration of the mirror symmetric pattern they were later asked to produce. The demonstration consisted of two dots in the positions of the stimulus and feedback dots, moving clockwise and counter-clockwise, respectively. Participants were told that in the experiment the feedback dot would sometimes cover a larger or smaller angular distance than their hand and that they would occasionally be asked to indicate how likely they considered the presence of such a transformation in a preceding trial. They were also told that the speed of the stimulus dot increases during each trial. After that, the procedure was the same for every trial. Participants were instructed to hold their hand in the leftmost position at the beginning of a trial. They started trials themselves by pressing the space bar on a keyboard with their left hand. As soon as the space bar was pressed the stimulus dot appeared at the rightmost position of the stimulus trajectory and started moving. The stimulus dot increased its speed every 10 circles by 0.2 Hz (one trial thus consisted of all three speeds). Each trial lasted 30.83 s. Each gain was presented in one block for eight trials. After the sixth trial in each block participants were asked to rate whether a transformation was present in the last trial on a scale from 1 to 5 (1 = certainly not present; 2 = likely not present; 3 = undecided; 4 = likely present; 5 = certainly present), which was presented on the screen. Participants' decision was recorded by the experimenter. The order of gains (i.e., the nine blocks) was randomized between participants. After five blocks there was a break of at least 3 min. 
It took participants between 1 h and 1 h 30 min to complete an experimental session, as they had the opportunity to take breaks for as long as they wished between trials.
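The stated trial duration follows from the speed schedule, as the following minimal sketch shows. It assumes exactly 10 stimulus circles at each of the three speeds and no pauses between speed levels; the function name is illustrative.

```python
def trial_duration_s(speeds_hz, circles_per_speed):
    """Total trial time in seconds when the stimulus completes
    `circles_per_speed` circles at each frequency in `speeds_hz`."""
    return sum(circles_per_speed / hz for hz in speeds_hz)


duration = trial_duration_s((0.8, 1.0, 1.2), 10)
print(f"trial duration: {duration:.2f} s")
```

The three segments take 12.5 s, 10 s, and about 8.33 s, which sum to the 30.83 s given in the text.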

## **DATA ANALYSIS**

Because we were interested in performance after participants had adjusted to a certain transformation and not in the process of adaptation, we excluded the first three trials of every block from analysis, as they were regarded as training trials. Further, we excluded the first three circles of every speed level, to allow time for adaptation to the new speed requirements. For each remaining data point we calculated the angular difference by subtracting the ideal position of the feedback from the actual position of the feedback. Because the shortest distance between the two points was used, the angular difference cannot be smaller than −180° or larger than 180°.
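The signed shortest angular difference can be computed with modular arithmetic, as in the minimal sketch below. The function name is hypothetical, and the half-open interval from −180° (inclusive) to 180° (exclusive) is a convention chosen here; how the original software treated a difference of exactly 180° is not stated.

```python
def angular_difference_deg(actual_deg: float, ideal_deg: float) -> float:
    """Signed shortest angular difference (actual - ideal) in degrees,
    wrapped into the half-open interval [-180, 180)."""
    return (actual_deg - ideal_deg + 180.0) % 360.0 - 180.0


print(angular_difference_deg(350.0, 10.0))  # wraps across 0 deg
print(angular_difference_deg(10.0, 350.0))
```

Whether a positive value corresponds to a lead or a lag depends on the direction in which angles are measured relative to the feedback dot's movement direction.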

Based on the angular difference values we calculated the percentage of time participants spent in the instructed pattern [Instructed Mode (IM); angular differences between −45° and 45°] in order to assess the *accuracy* of coordination. The expected value (if performance is random) is 25%. In order to assess *how* the task was performed, we calculated the spatial Constant Error (CE), a signed value indicating the average angular difference between the ideal and the actual angle, which indicates whether participants lead or lag behind the stimulus. We also calculated the temporal CE. The data patterns of the spatial and temporal CE were very similar (as they are related in our task). We therefore decided only to report the spatial CE.
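A minimal sketch of both measures is given below. It assumes equally spaced samples, so that the proportion of samples equals the proportion of time, and it takes CE as the arithmetic mean of the wrapped angular differences. Function names and the sample values are illustrative only.

```python
from statistics import mean


def instructed_mode_percent(diffs_deg, half_width_deg=45.0):
    """Percentage of samples whose angular difference lies within
    +/- half_width_deg of the ideal position (0 deg)."""
    inside = sum(1 for d in diffs_deg if -half_width_deg <= d <= half_width_deg)
    return 100.0 * inside / len(diffs_deg)


def constant_error_deg(diffs_deg):
    """Signed mean angular difference (assumed definition of spatial CE)."""
    return mean(diffs_deg)


# Illustrative (made-up) angular differences in degrees.
sample = [-50.0, -10.0, 0.0, 20.0, 100.0]
print(instructed_mode_percent(sample), constant_error_deg(sample))

# Chance level: a 90 deg window out of 360 deg.
print(100.0 * 90.0 / 360.0)
```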

Further, as an indicator of anchoring, we analyzed the spatial variable error (VE), the standard deviation of the CE, at four locations of the stimulus trajectory (east, south, west, and north, as in a compass card). Note that east in the stimulus trajectory meant that participants were supposed to be in the west of the feedback trajectory. To calculate VE, we defined windows of 30° around the respective points of interest. A window of 30° was chosen in order to (a) cover a relatively narrow area around the points of interest and (b) still have several measuring points even with higher speeds. Angular difference values within this window were averaged for each circle. Then the standard deviation across circles was calculated from those values. Thus, VE describes the variability of the movement position across circles in those areas. We also calculated the temporal VE, as it has been argued that temporal and spatial aspects of anchoring should be separated (e.g., Roerdink et al., 2008). The data patterns of the spatial and temporal VE were very similar (again because they are related in our task). However, spatial VE increased with speed, whereas temporal VE decreased with speed. This is in accordance with studies showing that spatial variability is inversely related to movement time, whereas temporal variability is positively related to movement time (Schmidt et al., 1979). Since no additional information was gained from temporal VE, we only report the spatial VE.

Instructed Mode and CE were then analyzed using ANOVAs with the factors Gain (0.5, 0.6, 0.75, 0.8, 1, 1.25, 1.3, 1.5, and 2) and Visual Speed (0.8, 1, and 1.2 Hz). VE was analyzed with the additional factor Location (east, south, west, north). *Post hoc* comparisons were conducted using *t*-tests. The ratings of the presence of a transformation were analyzed only with the factor Gain using Friedman's test; Wilcoxon signed-rank tests were conducted as *post hoc* tests. The significance level for *post hoc* tests was corrected using the Holm–Šídák procedure. Where appropriate, exact, minimum (*p*min), and/or maximum (*p*max) *p*-values are reported.
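For readers unfamiliar with the Holm–Šídák procedure, the following sketch shows the standard step-down version expressed as adjusted *p*-values, which is equivalent to adjusting the significance level at each step. It is a generic illustration rather than the software used for the reported analyses, and the example *p*-values are made up.

```python
def holm_sidak_adjust(p_values):
    """Step-down Holm-Sidak adjusted p-values, returned in input order.

    The k-th smallest p-value (k = 1..m) is adjusted to
    1 - (1 - p)^(m - k + 1); a running maximum keeps the adjusted
    values monotonically non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):  # rank 0 is the smallest p-value
        adj = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        running_max = max(running_max, adj)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# Made-up p-values for three comparisons.
print(holm_sidak_adjust([0.01, 0.04, 0.03]))
```

A comparison is then declared significant if its adjusted value is below the nominal level (e.g., 0.05).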

## **RESULTS**

## **ACCURACY OF PERFORMANCE: INSTRUCTED MODE**

The results for IM are depicted in **Figure 2A**. A significant main effect of Visual Speed, *F*(2,26) = 27.20, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.68, indicated that IM declined with increasing speed (0.8 Hz: *M* = 55.4%, 1.0 Hz: *M* = 48.7%, 1.2 Hz: *M* = 42.7%, *p*max = 0.005). A significant main effect of Gain, *F*(8,104) = 10.11, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.44, indicated that IM was higher with gain = 1 (*M* = 67.7%) than in all other gains (*M*min = 40.4%, *M*max = 54.9%, *p*min < 0.001, *p*max = 0.026). A significant interaction between Gain and Visual Speed, *F*(16,208) = 2.76, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.18, was also observed.
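The reported effect sizes can be cross-checked against the *F* statistics, because partial eta squared can be recovered from *F* and its degrees of freedom as *F*·df<sub>effect</sub> / (*F*·df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies this identity to the three effects reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


for label, f, df1, df2 in [
    ("Visual Speed", 27.20, 2, 26),
    ("Gain", 10.11, 8, 104),
    ("Gain x Visual Speed", 2.76, 16, 208),
]:
    print(f"{label}: {partial_eta_squared(f, df1, df2):.2f}")
```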

At 0.8 Hz speed, IM was significantly higher with MoFast gains (*M* = 58.6%) than MoSlow gains (*M* = 48.8%, *p* = 0.017). At the two faster speeds, IM did not significantly differ between MoSlow and MoFast gains (1.0 Hz: *p* = 0.62, 1.2 Hz: *p* = 0.52). Comparisons of the MoFast gains showed no significant differences in IM between gains at 0.8 Hz speed (*p*min = 0.15), but a significant decline in IM was observed between gain = 0.75 and 0.6 at 1.0 and 1.2 Hz speed (*p* = 0.016 and *p* = 0.002, respectively). The reverse was observed in MoSlow gains. Comparisons showed a decline in IM with higher gain at 0.8 Hz speed: IM was significantly lower with gain = 2 and gain = 1.5 than with gain = 1.3 and gain = 1.25 (*p* = 0.001), but the magnitude of gain did not significantly influence IM at 1.0 Hz (*p*min = 0.81) and 1.2 Hz speed (*p*min = 0.42).

#### **LEAD/LAG: CONSTANT ERROR**

The results for CE are depicted in **Figure 2B**. A significant main effect of Visual Speed, *F*(2,26) = 45.13, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.78, indicated that participants were more in advance/lagged less behind the stimulus with lower speed than with higher speed (0.8 Hz: *M* = 18.2°, 1.0 Hz: *M* = 6.5°, 1.2 Hz: *M* = −1.2°, *p*max = 0.002). A significant main effect of Gain, *F*(8,104) = 18.15, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.58, indicated that participants lagged more behind/were less in advance of the stimulus with smaller gains than with larger gains. A significant interaction between Gain and Visual Speed, *F*(16,208) = 2.98, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.19, was also observed. In MoFast gains CE was significantly more positive at 0.8 Hz speed (*M* = 8.1°) than at 1.0 Hz speed (*M* = −12.5°, *p* < 0.001) and 1.2 Hz speed (*M* = −16.8°, *p* < 0.001). CE did not significantly differ between the latter two speeds (*p* = 0.25). In MoSlow gains CE did not significantly differ between the 0.8 Hz (*M* = 27.8°) and 1.0 Hz speed (*M* = 23.4°, *p* = 0.15), but was significantly less positive at 1.2 Hz speed (*M* = 12.8°, *p*max = 0.005).

#### **CONTROL ANALYSES: IM CALCULATED FROM MEAN CE**

One may argue that variations in IM are due to systematic variations in CE. Because IM was calculated by using CE values within ±45° around the ideal position, it may be that when the mean CE is not zero, parts of the distribution around it are systematically not used in the calculation of IM. To rule out this possibility, we recalculated IM, using a window of ±45° around each participant's mean CE for each condition. The results for IM corrected for mean CE are depicted in **Figure 2C**. Results were similar to the original analysis of IM. Significant main effects of Gain, *F*(8,104) = 10.69, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.45, and Visual Speed, *F*(2,26) = 44.96, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.78, indicated that IM was highest with gain = 1 (*M* = 76.5%, *p*max = 0.008) and that IM decreased with increasing speed (0.8 Hz: *M* = 63.8%, 1.0 Hz: *M* = 57.0%, 1.2 Hz: *M* = 48.5%, *p*max = 0.004). A significant interaction between Gain and Visual Speed, *F*(16,208) = 2.06, *p* = 0.01, η<sup>2</sup><sub>p</sub> = 0.14, was also observed. In this analysis, IM did not significantly differ between MoFast and MoSlow gains at any speed (*p*min = 0.09). Comparisons between the MoFast gains showed again that the magnitude of gain did not significantly influence IM at 0.8 Hz speed (*p*min = 0.13), but a significant decline in IM was observed between gain = 0.75 and gain = 0.6 at 1.0 and 1.2 Hz speed (*p* = 0.02 and *p* = 0.001, respectively). Again, a different pattern was observed in MoSlow gains. Comparisons between the MoSlow gains showed a decline in IM with higher gain at 0.8 Hz speed: IM was significantly lower with gain = 2 and gain = 1.5 than with gain = 1.3 and gain = 1.25 (*p* = 0.004). No significant differences in IM were observed between gains at faster speeds (1.0 Hz: *p*min = 0.27, 1.2 Hz: *p*min = 0.54). Thus, negative and positive CE values did not obscure the general data pattern of IM.
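The logic of this control analysis can be illustrated with a short sketch in which the ±45° window is centered on the mean angular difference of a condition instead of on the ideal position. The function names and data are made up, and the arithmetic mean is assumed as the estimate of the mean CE.

```python
from statistics import mean


def _wrap(deg):
    """Wrap an angle into [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def im_around_mean_percent(diffs_deg, half_width_deg=45.0):
    """Percentage of samples within +/- half_width_deg of the mean
    angular difference (rather than of the ideal position at 0 deg)."""
    center = mean(diffs_deg)
    inside = sum(1 for d in diffs_deg
                 if abs(_wrap(d - center)) <= half_width_deg)
    return 100.0 * inside / len(diffs_deg)


# Made-up condition with a consistent lead of about 60 deg.
diffs = [40.0, 50.0, 60.0, 70.0, 80.0]
print(im_around_mean_percent(diffs))
```

In this made-up condition only one of the five samples falls within ±45° of the ideal position, whereas all five fall within ±45° of the mean, which is the kind of systematic offset the corrected analysis is meant to account for.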

#### **ANCHORING: VARIABLE ERROR**

Results for VE are depicted in **Figure 3**. A significant main effect of Visual Speed, *F*(2,26) = 58.10, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.82, showed that VE increased with increasing speed (0.8 Hz: *M* = 51.1°; 1.0 Hz: *M* = 61.0°; 1.2 Hz: *M* = 70.8°, *p*max < 0.001). A significant main effect of Gain, *F*(8,104) = 11.91, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.48, indicated lower VE in gain = 1 (*M* = 39.3°) than in all other gains (*M*min = 58.3°, *M*max = 71.8°, *p*max < 0.001). The interaction between Gain and Visual Speed, *F*(16,208) = 2.08, *p* = 0.01, η<sup>2</sup><sub>p</sub> = 0.14, indicated that the increase in VE from 0.8 to 1.0 Hz was significantly larger in MoFast gains (*M* = 13.4°) than with gain = 1 (*M* = 2.4°, *p* = 0.013). Results were intransitive: the increase in MoSlow gains (*M* = 6.0°) did not differ significantly from the increase in MoFast gains (*p* = 0.04) and gain = 1 (*p* = 0.41). The increase in VE from 1.0 to 1.2 Hz did not significantly differ between MoFast gains (*M* = 12.4°), MoSlow gains (*M* = 7.7°), and gain = 1 (*M* = 7.9°, *p*min = 0.046).

Most importantly, a significant main effect of Location, *F*(3,39) = 164.35, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.93, indicated that VE was lower when the stimulus dot was in the east (and, correspondingly, the feedback dot in the west, *M* = 55.1°) than in the other locations (south: *M* = 65.4°; west: *M* = 65.3°; north: *M* = 63.4°, *p*max < 0.001). The significant interaction between Gain and Location, *F*(24,312) = 4.05, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.24, reflected that the difference in VE between the east and the other locations was smaller for gain = 1 (*M* = 12.6°) than for all other gains (*M*min = 20.0°, *M*max = 38.0°, *p*max = 0.002). The interaction between Visual Speed and Location, *F*(6,78) = 9.94, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.43, together with the significant interaction between Gain, Visual Speed, and Location, *F*(48,624) = 2.55, *p* < 0.001, η<sup>2</sup><sub>p</sub> = 0.16, indicated that the difference between locations in VE increased with increasing speed in MoFast gains (differences east vs. other locations, 0.8 Hz: *M* = 9.2°, 1.0 Hz: *M* = 14.8°, 1.2 Hz: *M* = 21.4°, *p*max = 0.001), but no significant increase was found in gain = 1 (*p*min = 0.23) and MoSlow gains (*p*min = 0.12).

#### **AWARENESS OF THE TRANSFORMATION**

Box plots of the awareness ratings are displayed in **Figure 4**. Friedman's test showed a significant effect of gain, χ<sup>2</sup>(8) = 48.7, *p* < 0.001. The presence of a transformation was less likely to be reported with gain = 1 than with other gains, apart from gain = 1.3 (*p* = 0.47, others: *p*min = 0.001, *p*max = 0.016). In MoFast gains the presence of a transformation was rated significantly more likely with gain = 0.5 and gain = 0.6 than with gain = 0.75 and gain = 0.8 (*p*min = 0.011, *p*max = 0.036). In MoSlow gains the presence of a transformation was rated less likely with gain = 1.3 than with all other gains (*p*min = 0.009, *p*max = 0.023); awareness ratings did not significantly differ between the other gains (*p*min = 0.21).

## **DISCUSSION**

In the present experiment we investigated how the perceptual-motor system deals with gain transformations in unimanual circling. Participants were instructed to coordinate a visual feedback dot of their hand movement with a continuously circling stimulus dot in order to produce mirror symmetric circular movements of the two dots on the screen. The movement angle of the hand was multiplied by a gain factor before being presented on the screen. We used four gains smaller than 1 (MoFast), a gain of 1, and four gains larger than 1 (MoSlow). Speed of the stimulus dot was varied in three levels. Accuracy of performance (IM) was highest with gain = 1. Accuracy declined with increasing speed. In MoFast gains the magnitude of gain did not matter in slow speed but performance declined in lower gains with increasing speed. In MoSlow gains accuracy declined in higher gains with slow speed, but not with faster speed. The analysis of CE showed that participants were more likely to lag behind the stimulus with higher speed than with lower speed. Further, with small gains participants lagged behind the stimulus, whereas with higher gains participants were in lead of the stimulus. Because systematic variations in CE may cause variations in IM, we recalculated IM corrected for mean CE. The data pattern remained the same, showing that CE did not compromise the original IM analysis. Participants showed anchoring in the middle of the screen where the two circles were closest to each other (east location of the stimulus dot, west location of the feedback dot). The difference between the east and the other locations was smaller for gain = 1 and increased less with speed in gain = 1 and MoSlow gains than MoFast gains.

**FIGURE 4 | Boxplots of the ratings of the presence of a transformation.** Mild outlier, between 1.5 and 3 × interquartile range above the third or below the first quartile, \*extreme outlier, more than 3 × interquartile range above the third or below the first quartile. Verbal coding for ratings: 1 = certainly not present, 2 = likely not present, 3 = undecided, 4 = likely present, 5 = certainly present.

The data show that the mere presence of an angular gain transformation affects coordination in unimanual circling negatively. Performance with regular feedback (gain = 1) was more accurate than performance with gains larger or smaller than 1. Thus, the same (perceptually easy) visual pattern was harder to produce if a transformation was present. If only the visual pattern had mattered for performance, the different gains between hand movement and feedback should have had no effect on performance. Difficulty of the task also did not depend on movement speed in a simple manner, because then a decline in performance from large to small gains should have been observed. Rather, the results are in favor of the assumption that the presence of a transformation affects performance negatively. This is in accordance with results showing that the transformation itself is an important part of the cognitive representation of tool-use actions (Massen and Prinz, 2007; Lepper et al., 2008). The results are in contrast to studies in which straight movements were investigated: there, either accuracy decreases with increasing gain, or higher gains are compensated for with longer movement durations (Rosenbaum and Gregory, 2002; Rieger et al., 2005; Sutter et al., 2008). An explanation is that introducing a gain in circling movements implies a constant change in the mapping of hand position to feedback position, which is not the case in straight movements. It seems that with a constant mapping change limitations in performance do not (only) depend on a speed-accuracy relationship. Rather, there may be flaws in the representation of the transformation, resulting in an increased difficulty to predict the movement's consequences in external space (see below).

It was further of interest whether the magnitude of the transformation or merely its presence matters for performance. This depended on speed. In MoFast gains the magnitude of gain did not matter with slow speed but performance declined in lower gains with increasing speed. The effect of transformation magnitude in the MoFast gains with higher speed may be due to movement speed: coordination may be more difficult with faster speed due to higher demands on the motor system. This is corroborated by the finding that accuracy generally declined with increasing speed (see also Kelso, 1984; Haken et al., 1985; Heuer, 1993; Byblow et al., 1995; Carson et al., 1997; Roerdink et al., 2005). A different picture was apparent in MoSlow gains: accuracy declined in higher gains with slow speed, but not with faster speed. How can this be explained? It could be that slow movements with high gain are difficult because of the slowness of the hand movements; participants may have preferred to move faster. Studies have shown that there is a preferred movement speed for continuous movements, which also influences how movements at other speeds are performed (Naruse et al., 2001). This interpretation is corroborated by the results on the CE, which indicated that participants were more in lead of the stimulus with higher gains and slower speed.

The CE was systematically influenced by the magnitude of the transformation and visual speed. With lower speed and higher gain participants were more in lead of the stimulus; with higher speed and lower gain participants lagged behind the stimulus. With gain = 1 and in MoSlow gains participants were slightly in lead of the stimulus. The tendency that overall feedback was more likely to be in lead of the stimulus may be due to participants' use of the dominant hand in the task, as the dominant hand shows a slight lead over the non-dominant hand when coordinating symmetrical movements in bimanual coordination (Treffner and Turvey, 1995). However, as this effect seems to be due to attentional rather than motoric factors (the lead of the dominant hand disappears when attention is directed to the non-dominant hand, Amazeen et al., 1997), an alternative explanation is that participants paid more attention to the feedback than the stimulus. The data pattern also suggests that the CE is related to movement speed: higher visual speed (and correspondingly movement speed) resulted in more lag/less lead. Similarly, lower gain, also implying higher movement speed, resulted in more lag/less lead.

It is assumed that the nervous system controls movements using internal models (Wolpert and Flanagan, 2001), with inverse models choosing appropriate motor commands for desired goals and forward models predicting the sensory consequences of motor commands. The predictions can refer to bodily consequences (the hand movement itself) but also to the movement consequences in external space, like visual feedback. External consequences do not necessarily coincide with the bodily consequences when the movement is transformed as in tool use (Wolpert and Flanagan, 2001). In tool use people develop internal models of the tool transformation (Imamizu et al., 2000, 2003, 2007; Verwey and Heuer, 2007; Rieger et al., 2008; Sülzenbrück and Heuer, 2009, 2012). In the present task internal models need to take the gain transformation into account. Our data suggest that this may be insufficiently accomplished: with high gains/low movement speed the feedback resulting from a movement might be underestimated, resulting in the feedback being in advance of the stimulus. Conversely, with small gains/high movement speed, the feedback produced by the movement may be overestimated, resulting in the feedback lagging behind the stimulus. This is in accordance with findings that the nervous system does not necessarily completely adapt to observed errors (Wei and Kording, 2009). Thus, there seem to be flaws in the representation of the transformation.

We also investigated whether participants show visual anchoring, i.e., reduced variability at salient locations of the dots' trajectories. Anchoring occurred where the two trajectories were closest to each other (east position of the stimulus dot and west position of the feedback dot). Because the position of the hand could not be determined by the position of the feedback dot in the present experiment, except with gain = 1, the actual hand position was not relevant for anchoring to occur in this position. Larger and smaller differences between the east and the other locations in variability (smaller difference for gain = 1 than other gains, higher difference with higher visual speed in MoFast gains) can be explained by overall task performance. Conditions in which variability was lower also showed lower differences between the east and the other locations. Importantly, the data show that for anchoring to occur, discrete timing events like tones (Fink et al., 2000; Keller and Repp, 2008), or movement reversals/maximal excursions (cf. Roerdink et al., 2008) are not necessary. Rather, visually salient locations are sufficient. They may serve a similar function as such events.

Circle drawing usually results in equal temporal variability along the entire trajectory (Spencer and Zelaznik, 2003). Therefore, circle drawing tasks are thought to require emergent timing, in contrast to other tasks like tapping to a metronome, which require event-based timing (Zelaznik et al., 2002). In contrast to our study, previous results indicate that anchoring does not occur in circle drawing even when participants are asked to produce one circle between two beats of a metronome (Studenka and Zelaznik, 2011). However, when participants are not drawing freely, but place the stylus inside a circular track, anchoring at the timing target seems to occur (Repp and Steinman, 2010). The use of a crank in the present task may thus have contributed to the occurrence of anchoring.

We argued that anchoring occurs when the circles are in the position closest to each other. An alternative explanation is that rather than the visual proximity of stimulus and feedback, the leftmost position of the circle produces the effect. Being in the leftmost position of a circle may have perceptual advantages over being at other positions of a circle. However, the comparison locations we chose were at points for which similar arguments could be made (rightmost, topmost, and lowermost). Nevertheless, such an effect might also explain the differences in results between previous studies: Repp and Steinman (2010) used the west position of the circles for synchronization and found anchoring with a metronome, whereas Studenka and Zelaznik (2011) used the north and found no anchoring. As we did not vary the closest position between stimulus and feedback in our experiment, this has to remain an open question for future studies.

The magnitude of gain had an impact on participants' awareness of the transformation in MoFast gains. The more the gain diverged from gain = 1, the more likely participants were to notice the presence of a transformation in MoFast gains. In MoSlow gains this effect was also apparent but less clear (the presence of a transformation was rated more likely with gain = 1.25 than gain = 1.3, only the latter one was rated less likely than the higher gains). The observation that the magnitude of the gain mattered for awareness of the transformation is interesting: one could have expected that due to the constant change of the mapping of movement positions to feedback position with any gain other than 1 a transformation would always be detected equally well. Even with small deviations in gain from 1 there are eventually circles in which movement and feedback are on opposite sides. The results are in accordance with studies indicating that participants are not very good at knowing their actual hand positions in similar tasks and that the magnitude of a perturbation plays an important role for detecting it (Fourneret and Jeannerod, 1998; Knoblich and Kircher, 2004; Sutter et al., 2008; Müsseler and Sutter, 2009). During visible movements, proprioception does not seem to be attended to (Proteau and Isabelle, 2002), and processing of proprioceptive feedback may be masked by processing of visual feedback (Tremblay and Proteau, 1998). The observation that even with gain = 1 participants were not sure that no transformation was presented corroborates the interpretation that participants' awareness of the actual hand position may have been limited. Thus, the magnitude of the transformation may be more important for detecting it than a mismatch between movement and feedback position. Nevertheless, the observation that participants were not sure that no gain was present with gain = 1 may also be due to the design of the experiment: the presence of a transformation was more likely than its absence, which may have led participants to believe that a transformation was always present.

The present results have implications for the use of tools with gain transformations, which involve a constant change in the mapping of movement positions to feedback positions. First, such movements are more difficult to perform than untransformed movements. Thus, there are limits to the dominance of visual feedback in controlling actions involving tool transformations (see also Sutter et al., 2013). Second, the representation of the transformation in internal models can be flawed. It is important to note that the performance decrements and flaws in the representation of the transformation were observed even though the initial adaptation phases to gains and speeds were excluded from the data analysis. It could, however, be that with extended practice further adaptation processes take place.

In conclusion, the size of an angular gain transformation as well as its mere presence influence performance in a situation in which the mapping of movement positions to visual feedback positions is not constant. The representation of angular gain transformations by internal models may be flawed. Anchoring (reduced variability) at visually salient locations supports the coordination of transformed feedback with external events. Participants' conscious experience of the transformation depends on its magnitude. When designing machines or tools that involve transformations between movements and their external consequences, one should be aware that the mere presence of angular gains may result in performance decrements.

## **ACKNOWLEDGMENTS**

We thank Gudrun Henze and Silke Meissner for their contribution in helping with the experimental setup, data collection and data preparation. Further thanks go to Andreas Romeyke and Henrik Grunert for helping with the programming of the experiments and building the experimental apparatus.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 September 2013; accepted: 07 February 2014; published online: 05 March 2014.*

*Citation: Rieger M, Dietrich S and Prinz W (2014) Effects of angular gain transformations between movement and visual feedback on coordination performance in unimanual circling. Front. Psychol. 5:152. doi: 10.3389/fpsyg.2014.00152*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Rieger, Dietrich and Prinz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Concurrent sensorimotor temporal recalibration to different lags for the left and right hand

## *Yoshimori Sugano<sup>1</sup>\*, Mirjam Keetels<sup>2</sup> and Jean Vroomen<sup>2</sup>\**

*<sup>1</sup> Department of Industrial Management, Kyushu Sangyo University, Fukuoka, Japan*

*<sup>2</sup> Department of Cognitive Neuropsychology, Tilburg University, Tilburg, Netherlands*

#### *Edited by:*

*Jochen Müsseler, RWTH Aachen University, Germany*

#### *Reviewed by:*

*Eckart Altenmüller, University of Music and Drama Hannover, Germany*

*Clemens Maidhof, University of Helsinki, Finland*

#### *\*Correspondence:*

*Yoshimori Sugano, Department of Industrial Management, Kyushu Sangyo University, 3-1 Matsukadai 2-Chome Higashi-ku, Fukuoka 813-8503, Japan e-mail: sugano@ip.kyusan-u.ac.jp; Jean Vroomen, Department of Cognitive Neuropsychology, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, Netherlands e-mail: J.Vroomen@uvt.nl*

Perception of temporal synchrony between one's own action and the sensory feedback of that action is quite flexible. We examined whether sensorimotor temporal recalibration (TR) involves central or motor-specific components by concurrently exposing the left and right hands to different lags. The experiment was composed of a pre-test, an adaptation phase, and a post-test. During the adaptation phase, participants tapped their left and right index fingers in alternating fashion while each tap induced an auditory feedback signal (a short click sound). One hand was exposed to a long delay between the tap and the sound (∼150 ms), while the other hand was exposed to a subjective no-delay (∼50 ms). Before and after the adaptation phase (the pre- and post-test), participants tried to tap in synchrony with pacer tones (ISI = 1000 ms). The results showed that the hand that was exposed to the delayed sound corrected for this delay by tapping earlier (a larger anticipation error) than the no-delay hand, indicating TR. Different amounts of TR were found when the left and right hand were concurrently exposed to the same versus different delays. With different exposure delays for the two hands, there was a TR even for the hand that did not experience any delay in the feedback signal. However, this was not the case with the same exposure delay for the two hands. TR of the hand that experienced delayed feedback also occurred faster and was more complete (∼40% greater than that of the hand with no subjective delay) if the two hands were exposed to the same rather than different delays (∼20% greater than that of the hand with no subjective delay). These results suggest the existence of cross-talk between the hands, where both central and motor-specific components might be involved.

**Keywords: adaptation, temporal recalibration, motor-sensory synchrony, tapping, sensorimotor coordination, delayed auditory feedback, internal clock**

## **INTRODUCTION**

Perception of temporal synchrony between one's own action (e.g., tapping) and a sensory feedback following the action (e.g., a flash or a tone) can be flexibly changed after prolonged exposure to an artificially induced temporal delay of the sensory feedback, which sometimes leads to a reversed sensation of the cause-effect relationship (Cunningham et al., 2001; Stetson et al., 2006; Heron et al., 2009; Sugano et al., 2010, 2012; Stekelenburg et al., 2011; Keetels and Vroomen, 2012). This remarkable flexibility of sensorimotor timing is often explained by the concept of temporal recalibration (TR; Fujisaki et al., 2004; Vroomen et al., 2004). However, the mechanism underlying sensorimotor TR is still unclear (for review, see Vroomen and Keetels, 2010). One plausible hypothesis for sensorimotor TR is that a single supramodal mechanism, which is usually referred to as a "central clock," is responsible for adapting to the perceived time across all sensory pairings, including motor timing. This central clock refers to a single, dedicated centralized internal-time-keeper mechanism in which pulses are generated by a pacemaker and are counted by a counter (Creelman, 1962; Treisman, 1963). This idea is in line with data showing equal amounts of sensory TR across all sensory pairings (Hanson et al., 2008). Support for this concept also comes from studies showing that sensorimotor TR readily transfers between sensory modalities (Heron et al., 2009; Sugano et al., 2010), and transfers from learned to novel tasks (Fujisaki et al., 2004; Pesavento and Schlag, 2006).

However, there is other evidence that is difficult to reconcile with a centralized-clock model. Instead, this evidence points toward early, peripheral timing mechanisms that are selective for modality and low-level stimulus features (for review, see Eagleman, 2008). For example, some researchers reported a complete absence of recalibration outside the audio-visual domain (Navarra et al., 2007; Harrar and Harris, 2008), while others reported relatively lower levels of a visuo-tactile recalibration mechanism that operates separately for the left and right hand (Takahashi et al., 2008). The magnitude of audio-motor recalibration has also been found to be greater than visual-motor and tactile-motor recalibration, and there are also costs involved when the modality of the sensory event changes between the adaptation phase and test phase (Heron et al., 2009). Moreover, it has been reported that audio-motor recalibration does not transfer to visuo-motor synchronization tasks (Sugano et al., 2012). Training in a visual temporal order judgment (TOJ) task also does not transfer to an auditory TOJ task and vice versa (Alais and Cass, 2010). Furthermore, training on auditory interval discrimination does not transfer to visual interval discrimination (Lapid et al., 2009; Grondin and Ulrich, 2011). It has been demonstrated that when presenting a beep and flash coming from a single location after a voluntary action with variable delays, the motor-auditory timing was recalibrated independently from the motor-visual timing (Parsons et al., 2013).

Striking evidence against the notion of a central clock involves concurrent recalibration in audio-visual synchrony perception (Roseboom and Arnold, 2011; Heron et al., 2012; Yuan et al., 2012). Here, it has been reported that observers can have multiple concurrent estimates of audio-visual synchrony for different audio-visual pairings, and TR can occur in positive and negative directions concurrently, provided that the signals are spatially or contextually separated.

However, it is unclear if such concurrent recalibration is possible for domains other than audio-visual temporal processing. It is of special interest if concurrent TR occurs for sensorimotor synchronization, because in the sensorimotor domain the perceived delay between an action and its consequence can be diminished due to intentional binding (Haggard et al., 2002). Some studies have indeed suggested that separate multiple clocks exist in sensorimotor temporal processing (e.g., Parsons et al., 2013; Yarrow et al., 2013). Yarrow et al. (2013) compared within- and across-limb transfer of sensorimotor TR and suggested that the former reflected a genuine shift in neural timing (peripheral mechanism), while the latter was achieved via a criterion shift (central mechanism), suggesting the existence of separate peripheral timing mechanisms between limbs. Parsons et al. (2013) have shown that independent shifts of timing in response to an auditory and a visual stimulus occur when they are presented with different delays after a motor action, suggesting multiple independent timelines coexisting within the brain. Moreover, it has been shown that patients with a unilateral deficit in the cerebellum showed more variable tapping with their hand and foot corresponding to the impaired side. However, such variability is not observed in the case of the effectors corresponding to the contra-lateral side (Ivry et al., 1988). This observation also indicates that there can be separate timing systems for the two sides of the body (Ivry and Richardson, 2002).

Though these studies offer support for a multiple-clock model in controlling sensorimotor coordination, the concept has not been directly tested in the context of concurrent adaptation. Here, we therefore have sought to verify whether or not concurrent sensorimotor TR occurs for the left and right hand after exposure to different lags. We used motor-auditory pairings rather than motor-visual pairing since the former is expected to evoke greater effects (Heron et al., 2009; Sugano et al., 2012).

## **PREDICTIONS**

We hypothesized three possible models for temporal control mechanisms that might explain multiple concurrent TR for different sensorimotor delays: a single-central-clock model, a multiple-peripheral-clock model, and a hybrid-clock model (single-central plus multiple-peripheral clocks). Predictions generated by these three models are shown in **Figure 1**.

We predicted that TR in a tapping task, in which participants try to tap in synchrony with an auditory pacing signal, will manifest itself as a compensatory shift in the natural negative asynchrony between the tap and pacing signal. After exposure to delayed sensory feedback, observers thus were expected to tap *earlier* to compensate for the previously experienced delay (Sugano et al., 2012). The rationale for this comes from the Paillard–Fraisse hypothesis and its modified version, the sensory accumulator model (e.g., Aschersleben and Prinz, 1997; Aschersleben et al., 2001; Aschersleben, 2002). This model assumes that, in a synchronization task, the perceived timing of a pacing signal and the perceived timing of a tap should be synchronized at the level of central representations, and that the difference in perceptual latencies between them causes the tap-asynchrony.

The single-central-clock model assumes that a single, unified (e.g., amodal) clock regulates all temporal coordination in the brain. It predicts that tap asynchronies do not differ between the left and right hands if they were exposed to different delays, because the effects of lag adaptation for the left and right hand are "pooled" together via a single central mechanism (**Figure 1A**). In contrast, the multiple-peripheral-clock model assumes that different limbs are timed by different clocks. It thus predicts that tap asynchronies will be different for the two hands after exposure to different delays, because the clocks for the left and right hand are separated and adapted separately to each specific delay (**Figure 1B**).

The hybrid-clock model assumes that there are both central and peripheral clocks, and that the peripheral clocks are linked together via the central clock. It predicts that the tap-asynchrony for the left and right hand can be recalibrated separately, but the difference will be smaller than in the multiple-clock model due to cross-talk mediated by the central clock (**Figure 1C**).
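The three predictions can be made concrete with a toy calculation. The sketch below is only an illustration and not a model that was fitted or proposed in this study: it assumes that each hand's recalibration is a linear mix of its own exposure-driven shift and the shift pooled across both hands, controlled by a hypothetical coupling weight `w`. The function name, the mixing rule, and the shift values (−40 ms and 0 ms) are placeholders chosen for illustration.

```python
def predicted_shifts(own_shift_delayed, own_shift_nondelayed, w):
    """Toy prediction of per-hand recalibration (ms) for a coupling weight w.

    w = 0.0   -> multiple-peripheral-clock model (hands adapt independently)
    w = 1.0   -> single-central-clock model (adaptation is fully pooled)
    0 < w < 1 -> hybrid-clock model (partial cross-talk via a central clock)
    The linear mixing rule is an illustrative assumption, not the authors' model.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    pooled = (own_shift_delayed + own_shift_nondelayed) / 2.0
    delayed = (1.0 - w) * own_shift_delayed + w * pooled
    nondelayed = (1.0 - w) * own_shift_nondelayed + w * pooled
    return delayed, nondelayed


# Placeholder shifts: alone, the delayed hand would shift by -40 ms, the other by 0 ms.
for label, w in [("multiple-peripheral", 0.0), ("hybrid", 0.5), ("single-central", 1.0)]:
    d, n = predicted_shifts(-40.0, 0.0, w)
    print(f"{label:>20}: delayed = {d:6.1f} ms, non-delayed = {n:6.1f} ms, "
          f"difference = {d - n:6.1f} ms")
```

In this toy version the between-hand difference shrinks from the full amount (`w = 0`) to zero (`w = 1`), which mirrors the qualitative ordering of the three predictions.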

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Fifty-two participants from Kyushu Sangyo University and Tilburg University (twenty-five female, mean age 21.8, three left-handed, all using a computer mouse with their right hand) participated. Twenty-seven were assigned to a mixed-exposure condition in which the feedback delay (lag) was a within-subjects factor. The other twenty-five were assigned to a pure-exposure condition in which the feedback delay was a between-subjects factor. In the mixed-exposure condition, approximately half of the participants (fourteen) performed right-hand tapping with delayed feedback and left-hand tapping with non-delayed feedback. For the other half, the hand-delay assignment was reversed. In the pure-exposure condition, approximately half of the participants (twelve) received delayed feedback; the remaining thirteen received non-delayed feedback. All participants had normal hearing and normal or corrected-to-normal vision. Informed consent was obtained from each participant. The experiment was approved by the Local Ethics Committee of Kyushu Sangyo University and Tilburg University, and followed the declaration of Helsinki.

#### **STIMULI AND APPARATUS**

Participants sat at a desk in a dimly lit booth looking at a white fixation cross on a CRT display (100 Hz refresh rate) at approximately 65 cm viewing distance. The auditory stimulus was a 2,000 Hz pure tone pip (30 ms duration with 2 ms rise/fall slope) presented via headphones. White noise was continuously presented via headphones to mask the sound of taps. Two special gaming mice (Logitech G300) were connected to a PC to collect the tapping data with high temporal precision (<1 ms). Participants' hands were occluded so that they could not see the movement of their fingers.

## **DESIGN**

There were three factors in the experimental design. The exposure type (mixed- vs. pure-exposure) was a between-subjects factor. The test type (pre- vs. post-test) was a within-subjects factor. The feedback delay (50 ms as non-delayed vs. 150 ms as delayed) was a within-subjects factor in the mixed-exposure condition, and a between-subjects factor in the pure-exposure condition. These three factors yielded eight different experimental conditions. Each condition consisted of 20 trials.

In the mixed-exposure condition, the delay was fixed for each hand but it was different for the left and right hand. The combination of the hand (right vs. left) and the feedback delay (50 vs. 150 ms) was fixed for each participant but changed across participants. It was treated as a residual factor and was counterbalanced between participants. The order of which hand tapped first was also treated as a residual factor and was counter-balanced between participants as well.

In the pure-exposure condition, the participants were exposed to the same amount of delay (50 vs. 150 ms) for the left and right hand in the adaptation phase. The two exposure delays were run with different participants to avoid carryover effects between adaptations to different lags. The alternating order of hands was also treated as a residual factor and was counter-balanced between participants. Experimental and residual factors are summarized in **Table 1**.

## **PROCEDURE**

The experiment was composed of a pre-test, an adaptation phase, and a post-test (see **Figure 2**). In the pre-test, participants tried to tap (i.e., mouse-press) their left and right index fingers in synchrony with the tone that served as a pacing signal. The taps (mouse-presses) were not accompanied by any feedback signals (i.e., sounds). The tone was delivered 16 times per trial at a constant inter-stimulus interval (ISI) of 1000 ms. Participants skipped the first two pacing signals to get into the rhythm, and then synchronized their mouse presses with the following fourteen pacing signals. For each trial, there were thus seven taps for each hand.

#### **Table 1 | Experimental design and factors.**

After completion of the pre-test, the adaptation/post-test phase began. Each trial started with a short adaptation phase immediately followed by the post-test. In the adaptation phase, participants made 16 voluntary mouse-presses with their left and right index fingers in alternating order, trying to keep the inter-tap interval at approximately 1000 ms. The order (the right first, or the left first) was the same as in the pre-test. After each mouse-press, a tone was delivered at a constant delay of either 50 ms (non-delayed condition) or 150 ms (delayed condition), following earlier studies (e.g., Sugano et al., 2010, 2012). These values were expected to elicit quantifiable adaptive shifts, while they were still perceived as a single event (150 ms), or were expected to be perceived as subjectively simultaneous (50 ms). In the post-test that immediately followed the adaptation phase, the participants then performed the synchronization task, which was identical to the pre-test. Trials were repeated if more than two taps were missed (1.05% in total: 11 trials by 10 participants).
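The structure of one synchronization trial (pre- or post-test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the experiment software: the function name is hypothetical, and the first tone is assumed to start at 0 ms. Only the counts (16 tones, 1000 ms ISI, two skipped tones, alternating hands) come from the procedure described above.

```python
def synchronization_trial(first_hand="right", n_tones=16, isi_ms=1000, n_skipped=2):
    """Return (tone_onset_ms, hand) pairs for the pacing tones that require a tap.

    Assumes the first tone starts at 0 ms; the first `n_skipped` tones are
    listened to without tapping, and the remaining tones alternate between hands.
    """
    if first_hand not in ("left", "right"):
        raise ValueError("first_hand must be 'left' or 'right'")
    other_hand = "left" if first_hand == "right" else "right"
    schedule = []
    for i in range(n_skipped, n_tones):
        hand = first_hand if (i - n_skipped) % 2 == 0 else other_hand
        schedule.append((i * isi_ms, hand))
    return schedule


trial = synchronization_trial("right")
taps_per_hand = {h: sum(1 for _, hand in trial if hand == h) for h in ("left", "right")}
print(len(trial), taps_per_hand)  # 14 tapped tones, seven per hand
```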

Participants also completed a short practice session before the experimental session in order to get acquainted with the experimental procedure. The whole experiment lasted ∼50 min including the instruction, the practice session, and the experimental session.

Trials from the practice session were excluded from further analysis. The data from the pre- and the post-test were analyzed as follows. Data were discarded if participants tapped on the wrong side. The tap-asynchrony was defined as the timing difference between the tap and the pacer tone, and was negative if the tap preceded the pacer. Missing responses (error taps) were only 0.48% of the total number of taps. Tap asynchronies outside the range of the mean ± 3 standard deviations (−300 to +110 ms) were regarded as outliers and were eliminated from the analysis (1.07% of the total number of taps). The first tap for each hand was also removed from the analysis because of possible instability. The rest of the tap asynchronies (six measurements per hand and per trial) were averaged over the 20 trials for each experimental condition.
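The exclusion steps described above can be sketched as a small preprocessing routine. This is an illustrative sketch, not the analysis code used in the study: the function names and the example values are hypothetical, missed taps are assumed to be coded as `None`, and pooling all retained taps before averaging is an assumption (the text does not specify whether trial means were computed first). The outlier bounds are the −300 to +110 ms range reported above.

```python
from statistics import mean


def tap_asynchrony(tap_ms, pacer_ms):
    """Asynchrony in ms; negative when the tap precedes the pacer tone."""
    return tap_ms - pacer_ms


def clean_trial(asynchronies, lower=-300.0, upper=110.0):
    """Drop the first tap of one hand's sequence, missed taps (None) and outliers."""
    kept = []
    for a in asynchronies[1:]:      # the first tap per hand is discarded
        if a is None:               # missed tap
            continue
        if lower <= a <= upper:     # keep only values inside the reported range
            kept.append(a)
    return kept


def condition_mean(trials):
    """Average all retained asynchronies over the trials of one condition."""
    pooled = [a for trial in trials for a in clean_trial(trial)]
    return mean(pooled) if pooled else None


# Hypothetical asynchronies (ms) for one hand in two trials, seven taps each.
example_trials = [
    [-10.0, -50.0, -60.0, None, -40.0, -350.0, -30.0],
    [-80.0, -20.0, -45.0, -55.0, 120.0, -35.0, -25.0],
]
print(condition_mean(example_trials))
```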

### **RESULTS**

#### **AVERAGE TAP ASYNCHRONIES**

The group-averaged tap asynchronies are presented in **Table 2**. All tap asynchronies were negative, which reflects the well-known anticipation tendency in sensorimotor synchronization (see, e.g., Aschersleben, 2002). The temporal recalibration effect (TRE) was defined as the change in tap-asynchrony between the pre- and the post-test. All TREs were, as expected, negative, meaning that the anticipation tendency became greater after exposure to delayed and non-delayed feedback in voluntary tapping.

Firstly, we analyzed tap asynchronies separately for the mixed- and pure-exposure conditions. The TRE in the mixed-exposure condition was stronger for the delayed (−46.7 ms) than the non-delayed (−27.2 ms) hand. Note that there was a TRE for the non-delayed hand that possibly indicates cross-talk between the delayed and non-delayed hands. A repeated-measures ANOVA


Similar analyses were run in the pure-exposure conditions. The TRE was similar to the mixed-exposure condition, except that the difference between the delayed and non-delayed hand was greater in the pure-exposure (39.4 ms, ∼40%) than the mixed-exposure (19.5 ms, ∼20%). A mixed-model ANOVA was conducted on the individual tap asynchronies in the pure-exposure condition with test type (pre- vs. post-test) as a within-subjects factor and exposure delay (50 vs. 150 ms) as a between-subjects factor. There was a significant main effect of test type, *F*(1,23) = 36.23, *p* < 0.001, and an interaction between test type × exposure delay, *F*(1,23) = 26.02, *p* < 0.001. The main effect of the exposure delay was not significant, *F*(1,23) = 1.61, *p* = 0.22. A subsequent ANOVA for each test type with exposure delay as a between-subjects factor revealed that the effect of exposure delay was not significant in the pre-test, *F*(1,23) = 0.04, *p* = 0.85, but was significant in the post-test, *F*(1,23) = 8.94, *p* < 0.01. As with the mixed-exposure data, the TREs were entered into separate one sample *t*-tests, showing that the TRE was significantly negative for the delayed condition (−43.7 ms), *t*(11) = 7.65, *p* < 0.001, but not for the non-delayed condition (−4.3 ms), *t*(12) = 0.83, *p* = 0.21.

*Standard errors of mean are shown in parentheses.*

*\*\*p* < *0.001 (i.e., negative numbers indicate tap before pacing stimulus).*

We also compared the TRE between mixed- versus pure-exposure by ANOVAs per exposure delay (50 vs. 150 ms) with exposure type (mixed- vs. pure-exposure) as a between-subjects factor. This ANOVA showed a main effect of exposure type (mixed- vs. pure-exposure) in the non-delayed condition (−27.2 ms vs. −4.3 ms for mixed- vs. pure-exposure, respectively), *F*(1,38) = 8.19, *p* < 0.01, while it was not significant in the delayed condition (−46.7 ms vs. −43.7 ms for mixed- vs. pure-exposure, respectively), *F*(1,37) = 0.13, *p* = 0.72. In mixed-exposure, the delayed hand thus affected the non-delayed hand, but not *vice versa*.
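The differences discussed in this section follow directly from the four group-mean TREs reported in the text. The short sketch below only recomputes that arithmetic; the dictionary layout and function names are illustrative, and no statistical testing is performed.

```python
# Group-mean TREs (ms) as reported in the text: post-test minus pre-test asynchrony.
reported_tre = {
    ("mixed", "delayed"): -46.7,
    ("mixed", "non-delayed"): -27.2,
    ("pure", "delayed"): -43.7,
    ("pure", "non-delayed"): -4.3,
}


def delay_effect(exposure_type):
    """TRE of the delayed minus the non-delayed condition within one exposure type (ms)."""
    return round(reported_tre[(exposure_type, "delayed")]
                 - reported_tre[(exposure_type, "non-delayed")], 1)


def exposure_effect(delay):
    """TRE in mixed-exposure minus TRE in pure-exposure for one feedback delay (ms)."""
    return round(reported_tre[("mixed", delay)] - reported_tre[("pure", delay)], 1)


for exposure in ("mixed", "pure"):
    print(exposure, "delayed vs. non-delayed:", delay_effect(exposure), "ms")
for delay in ("delayed", "non-delayed"):
    print(delay, "mixed vs. pure:", exposure_effect(delay), "ms")
```

The recomputed values match the 19.5 ms and 39.4 ms differences given above, and show that the mixed- versus pure-exposure contrast is large only for the non-delayed condition.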

#### **BUILD-UP OF TR**

Secondary analyses were performed to examine the build-up of the TRE. To examine this, we divided the 20 trials of each condition into 10 blocks of two trials each. The mean tap asynchronies per block are shown in **Figure 3**.

An exponential decay function, $P_2 + (P_0 - P_2) \times \exp(-P_1 \times x)$, was fitted to the mean tap asynchronies over the trial-blocks, where $P_0$ reflects a "starting point" before adaptation ($x = 0$), $P_1$ reflects a "rate of change" (the greater, the faster the decay) and $P_2$ reflects an "end point" after adaptation was completed ($x \to \infty$). The fitting was carried out using the nls function in the statistical package R version 2.12.1 (R Development Core Team, 2010) with the NL2SOL algorithm, which gave the nonlinear least-squares estimates of fitting parameters. The fitted lines are shown in **Figure 3**, and the estimated values of the parameters are shown in **Table 3**.
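To illustrate what such a fit involves, the sketch below estimates the three parameters by least squares in plain Python. It is not the procedure used in the study (which relied on R and the NL2SOL algorithm) and it does not produce standard errors. Instead it uses a simple grid search over the rate parameter: for a fixed rate the function is linear in the end point and in the difference between starting point and end point, so those two are solved in closed form. The block numbering (1 to 10), the grid range, and the synthetic data are assumptions made for the example.

```python
import math


def decay(x, p0, p1, p2):
    """Exponential decay: p2 + (p0 - p2) * exp(-p1 * x)."""
    return p2 + (p0 - p2) * math.exp(-p1 * x)


def fit_decay(xs, ys, rates=None):
    """Least-squares estimates (p0, p1, p2) via a grid search over the rate p1.

    For each candidate p1, y = a + b * exp(-p1 * x) is an ordinary linear
    regression with a = p2 and b = p0 - p2; the p1 with the smallest residual
    sum of squares is kept.
    """
    if rates is None:
        rates = [k / 1000.0 for k in range(1, 5001)]  # 0.001 ... 5.000 (assumed range)
    n = len(xs)
    y_mean = sum(ys) / n
    best = None
    for p1 in rates:
        e = [math.exp(-p1 * x) for x in xs]
        e_mean = sum(e) / n
        s_ee = sum((ei - e_mean) ** 2 for ei in e)
        if s_ee == 0.0:
            continue
        b = sum((ei - e_mean) * (yi - y_mean) for ei, yi in zip(e, ys)) / s_ee
        a = y_mean - b * e_mean
        sse = sum((yi - (a + b * ei)) ** 2 for ei, yi in zip(e, ys))
        if best is None or sse < best[0]:
            best = (sse, a + b, p1, a)
    _, p0, p1, p2 = best
    return p0, p1, p2


# Synthetic, noise-free example: ten trial-blocks generated from known parameters.
blocks = list(range(1, 11))
asynchronies = [decay(x, -50.0, 0.5, -90.0) for x in blocks]
p0_hat, p1_hat, p2_hat = fit_decay(blocks, asynchronies)
print(round(p0_hat, 3), round(p1_hat, 3), round(p2_hat, 3))
```

With real, noisy block means the grid search would still return the least-squares rate on the grid, but a dedicated nonlinear least-squares routine is preferable when standard errors are needed, as in the comparison of rates between conditions.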

As can be seen in **Figure 3**, although the mean tap-asynchrony in the pre-test slowly declined across trial-blocks, the trends were almost the same across experimental conditions, confirming that the baseline was the same across conditions. A possible reason for the gradual increment of the asynchrony in the pre-test might be a reduced tactile sensitivity (due to a fatigue of mechano-receptors or a decrease of attention for the tactile feedback) caused by repeated tapping. If tactile sensitivity declines, then the latency of the tactile feedback might increase, thus causing a bigger tap-asynchrony (e.g., Aschersleben et al., 2001).

The mean tap-asynchrony of the delayed condition in the post-test sharply declined and quickly reached a plateau in the pure-exposure condition when compared with the mixed-exposure condition. This observation was supported by the fact that the estimated parameter reflecting "rate of change" ($P_1$) was greater in the pure- than the mixed-exposure condition (0.890 vs. 0.289, respectively), albeit not significantly different, *t*(16) = 0.84, *p* = 0.41. The $P_1$ parameter of the non-delayed condition showed the same pattern between the pure- and mixed-exposure (0.816 vs. 0.334, respectively), though the difference was again not significant, *t*(16) = 0.30, *p* = 0.77. The reason why we could not get significant $P_1$ differences in the post-test is, at least in part, the relatively larger standard error in estimating $P_1$ in the pure-exposure condition than in the mixed-exposure condition (1.612 and 0.713 vs. 0.113 and 0.082, see **Table 3**). These results suggest that the build-up of the TRE tended to be slower and less complete in the mixed-exposure condition than in the pure-exposure condition.

#### **DISSIPATION OF TR**

To examine if there was dissipation of TR, mean tap asynchronies for each tap within a trial were calculated. The mean tap asynchronies across hands for the 2nd until the 7th tap (the 1st tap was omitted from the analysis as mentioned before) are shown in **Figure 4**. As is clearly visible, although the tap asynchronies in the post-test became more negative as the number of taps increased, the difference between the delayed and the non-delayed conditions remained constant in all taps. To confirm this, mean tap asynchronies per tap were entered into a repeated-measures or mixed-model ANOVA per exposure type (mixed- vs. pure-exposure) and test type (pre- vs. post-test), with tap position (2nd to 7th tap) as a within-subjects factor and exposure delay (50 vs. 150 ms) either as a within-subjects (mixed-exposure) or a between-subjects factor (pure-exposure). As expected and shown in **Figure 4**, these ANOVAs revealed a significant main effect of tap position (in the pre- and post-test) and exposure delay (in the post-test only) under both exposure types (all *p*s < 0.05). Most importantly, the tap position did not interact with exposure delay in either exposure type in the post-test, *F*(5,130) = 1.71, *p* = 0.14, in mixed-exposure and *F*(5,115) = 0.84, *p* = 0.53, in pure-exposure, indicating that the TRE did not dissipate within the short-term period of one trial (∼14 s).

#### **VARIABILITY OF TAP ASYNCHRONIES**

Similar analyses were conducted on the variability (the standard deviation) of the tapping responses. The group-averaged standard deviations are shown in **Table 4**.

A repeated-measures and a mixed-model ANOVA were applied separately for the mixed- and the pure-exposure conditions respectively, with test type as a within-subjects factor and exposure delay as a within-subjects (mixed-exposure) or a between-subjects factor (pure-exposure). ANOVAs showed that only a main effect of test type (pre- vs. post-test) was significant in both exposure types, *F*(1,26) = 15.45, *p* < 0.001 (mixed-exposure), *F*(1,23) = 8.16, *p* < 0.01 (pure-exposure), showing that the variability of taps became greater (less stable) in the post-test (50.4 ms for the mixed-exposure, 56.7 ms for the pure-exposure) than in the pre-test (43.5 ms for the mixed-exposure, 52.6 ms for the pure-exposure). The variability of tapping after an exposure to delayed sensory feedback was comparable to that after non-delayed feedback, suggesting that the TR occurs without changing the stability of tapping.

#### **DISCUSSION**

In the present study, we tested whether concurrent TR for different feedback delays is possible in sensorimotor coordination (finger tapping). During a short adaptation phase, participants tapped their left and right fingers in alternating fashion while a tone was delivered 50 ms (a subjective no-delay) or 150 ms (delayed) after each tap. After this adaptation phase, participants tried to tap in synchrony with pacing tones. In line with previous studies (Sugano et al., 2012), results showed that the tap-asynchrony became greater (i.e., a larger anticipation error) after exposure to delayed feedback, presumably because participants shifted their motor timing or the perceived timing of the sensory signal to compensate for the delay (i.e., a temporal recalibration effect: TRE). Importantly, when the left and right hands were concurrently exposed to different delays, each hand displayed a different amount of TRE. This means that concurrent TR for different delays is possible in the sensorimotor domain, as it is in the audio-visual domain (Roseboom and Arnold, 2011; Heron et al., 2012; Yuan et al., 2012).

**FIGURE 3 | Mean tap asynchronies per trial-block. (A)** The mixed-exposure condition. **(B)** The pure-exposure condition. One trial-block contains two consecutive trials. A negative tap-asynchrony means that the tap comes before the tone (i.e., an anticipation error). Error bars represent 1 standard error of mean (SEM). An exponential decay function, *P*<sub>2</sub> + (*P*<sub>0</sub> − *P*<sub>2</sub>) × exp(−*P*<sub>1</sub> × *x*), was fitted to the mean tap asynchronies over the trial-blocks. The meaning of each parameter is explained in the text. The fitted functions are shown as solid lines.

**Table 3 | Estimated parameters in fitting a decay function for the mean tap asynchronies.**

*Standard errors are shown in parentheses.*

*Note: the decay function is f*(*x*) = *P*<sub>2</sub> + (*P*<sub>0</sub> − *P*<sub>2</sub>) × exp(−*P*<sub>1</sub> × *x*).

It is also of note that the non-delayed hand in the mixed-exposure condition increased its anticipation error, but not so in the pure-exposure condition. This suggests that there might be a cross-talk of TRE between hands in mixed-exposure. Further evidence for cross-talk can be found in the build-up course of the TRE, which tended to be slower and less complete in mixed than in pure-exposure.

The results of the present study argue in favor of a motor-shift account for sensorimotor TR, which assumes that it is the motor timing (e.g., when did I move my finger or when did my finger hit the pad?) rather than sensory timing (e.g., when did I hear the sound?) which is recalibrated with delayed sensory feedback, as the left and right hand can be recalibrated differently. Earlier research also suggested that TR of sensorimotor events is mainly caused by a shift in the motor component (Sugano et al., 2010, 2012; Stekelenburg et al., 2011). However, when discussing the motor or sensory nature of TR, it is important to be cautious because motor timing is not a single entity; rather, it can be decomposed into several components such as an intention to move, an actual motor command, an efference copy of that command, proprioceptive feedback about the movement and the position of the joints, and tactile feedback from clicking the mouse (e.g., Frissen et al., 2012; Sugano et al., 2012). At present, it is difficult to elucidate which component has actually been adjusted by TR because all these components are correlated. To disentangle them, future research might measure the timing of various action-related components. One approach might use the Libet clock-hand paradigm (Libet et al., 1983) to measure the timing of the intention to move.

## **POSSIBLE MECHANISM FOR CONCURRENT ADAPTATION**

The present results are most easily accounted for by a hybrid-clock model in which a single central clock and multiple peripheral clocks are linked together (**Figure 1C**). This model is closely related to a two-level model for motor timing in which a timing goal is represented at a central level and a movement itself is generated by an automatic motor system (e.g., Semjen and Ivry, 2001). In this view, the left and right hands have their own peripheral clocks that are linked via a central clock. The peripheral clocks are in charge of sensorimotor timing for the left and right hand independently. The central clock is a master clock that regulates a global timing goal and links multiple peripheral clocks. This hybrid-clock model predicts that the left and right hand can be recalibrated differently, though the size of this difference should be smaller than that predicted by the multiple-clock model.

The single-central-clock model and the multiple-peripheral-clock model do not predict the present results well. The former assumes that a single central clock regulates all the timing in the brain and predicts that the left and the right hand are recalibrated the same way (**Figure 1A**). The latter assumes that different limbs are timed by different clocks and predicts that the two hands are recalibrated independently after exposure to different delays (**Figure 1B**).
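The qualitative predictions of the three clock models can be contrasted with a deliberately simple toy calculation. The sketch below is not the authors' model: the linear mixing rule, the `coupling` parameter, and the shift values are hypothetical and serve only to show how a single parameter could interpolate between fully independent peripheral clocks and a single shared clock.

```python
def predicted_tre(own_shift, other_shift, coupling):
    """Toy prediction of one hand's TRE (ms).

    own_shift / other_shift: recalibration each hand would show in isolation.
    coupling: 0.0 = independent peripheral clocks, 1.0 = one shared central clock;
              intermediate values stand in for a hybrid arrangement.
    """
    if not 0.0 <= coupling <= 1.0:
        raise ValueError("coupling must lie between 0 and 1")
    shared = (own_shift + other_shift) / 2.0  # value a single shared clock would adopt
    return (1.0 - coupling) * own_shift + coupling * shared


# Hypothetical isolated shifts (ms) for the delayed and the non-delayed hand.
DELAYED, NON_DELAYED = 40.0, 0.0

for label, c in [("multiple clocks", 0.0), ("hybrid", 0.5), ("single clock", 1.0)]:
    d = predicted_tre(DELAYED, NON_DELAYED, c)
    n = predicted_tre(NON_DELAYED, DELAYED, c)
    print(f"{label:15s}: delayed {d:5.1f}, non-delayed {n:5.1f}, difference {d - n:5.1f}")
```

In this toy version the between-hand difference is largest for independent clocks, vanishes for a single clock, and takes an intermediate value for the hybrid case, which mirrors the ordering of predictions described above.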

It has been suggested that the mechanism of interval timing is separated into two systems, a cognitively controlled one and an automatic one (Lewis and Miall, 2003; Buhusi and Meck, 2005; Repp and Su, 2013). The central clock component might correspond to the cognitively controlled timing mechanism, whilst the multiple peripheral components might correspond to the automatic timing mechanism. It is important to note that these two distinct timing systems work on different time scales. The cognitively controlled mechanism works mainly in the supra-second range, whilst the automatic system works in a sub-second range (Lewis and Miall, 2003; Buhusi and Meck, 2005; Repp and Su, 2013). In line with this dichotomy, motor timing is thought to be controlled by the automatic system that works in a sub-second range.

Another well-known dichotomy in timing control of rhythmic finger tapping is the difference between phase correction (i.e., adjustment of a tap-stimulus synchronization) and period correction (i.e., adjustment of inter-tap intervals; Mates, 1994a,b; Repp, 2001a,b). They differ in their degrees of cognitive control and may be associated with different brain circuits (Repp, 2005). Period correction may need more cognitive control than phase correction, which is largely automatic (Repp, 2005). The peripheral clocks might be more related to phase correction while the central clock might be engaged in period correction. If so, then more cross-talk of TRE might be observed in a synchronization task with tempo-changing pacing signals, such as gradual tempo acceleration and/or deceleration, because tuning in to the tempo requires a cognitively controlled period correction mechanism (e.g., Repp, 2005; Schulze et al., 2005).

due to instability. A negative tap-asynchrony means that the tap comes before the tone (i.e., an anticipation error). Error bars represent 1 standard error of mean (SEM).

#### **Table 4 | Mean standard deviation of tap-asynchrony.**

*Standard errors of mean are shown in parentheses.*

*\*p* < *0.01, \*\*p* < *0.001.*

#### **NEURAL CORRELATES OF CENTRAL AND PERIPHERAL CLOCKS**

What neural mechanisms or regions are candidates for the centralized single clock and the peripheral localized clocks? The cerebellum and the thalamo-cortico-striatal circuits might be candidates for the peripheral and the central timing mechanism, respectively.

The automatic timing system is thought to be controlled mainly by the cerebellum. Dysfunction of the cerebellum causes impairments in synchronization (Repp and Su, 2013) and motor adaptation (Bastian, 2008). It has been shown that patients with a unilateral deficit in the cerebellum showed impaired tapping only for the impaired, ipsilateral side (Ivry et al., 1988), suggesting that each cerebellar hemisphere has a separate clock controlling the sensorimotor synchronization task for its side. Moreover, activation of the cerebellum is context-dependent, thus suggesting a localized, rather than centralized, representation of time (Coull and Nobre, 2008). Although the cerebellar hemispheres might be interconnected during bimanual tapping (Pollok et al., 2005b), this might not be the case during unimanual tapping (Pollok et al., 2005a). The alternate tapping by either hand, as used in the present study, is thought to utilize a similar timing control mechanism as the one in unimanual tapping (Summers et al., 1989; Semjen and Ivry, 2001). Accordingly, the left and right cerebellar hemispheres might be working in isolation during the adaptation phase. The cerebellum might thus be a possible candidate for the peripheral timing control to different feedback delays.

The thalamo-cortico-striatal circuits are thought to be involved in the cognitively controlled timing system which works in a supra-second range (Buhusi and Meck, 2005) and are subject to attentional modulation (Repp and Su, 2013). In the thalamo-cortico-striatal network, it has been shown that the left dorsal premotor cortex (dPMC) is crucial for accurate timing of either hand (Pollok et al., 2008; Bijsterbosch et al., 2011). More generally, the left hemisphere dominates over both the left- and right-hand tapping (Pollok et al., 2005b, 2008), irrespective of hand dominance (Pollok et al., 2006). Activity of the basal ganglia has also been shown to be independent of the motor effectors (right/left hand, speech) used in rhythmic timing (Bengtsson et al., 2005). Pollok et al. (2005b) found a coupling between the left and right premotor areas during bimanual tapping, which led them to suggest that a cross-talk between the limbs might occur at the level of motor planning and programming. Furthermore, recent studies have shown that there is nearly perfect intermanual transfer of various motor-skills including visual-motor learning (Imamizu and Shimojo, 1995), anticipatory timing (Teixeira, 2000), motor-skill learning (Perez et al., 2007) and a pegboard task (Schulze et al., 2002), suggesting that there is cross-talk between the hemispheres in the intermanual transfer of motor learning. The cerebral hemispheres, especially the PMC and the supplementary motor area (SMA), are probably the crucial loci for it (e.g., Schulze et al., 2002; Perez et al., 2007; Kirsch and Hoffmann, 2010). Possibly, then, cross-talk of TR between hands might occur in these higher cortical networks.

## **WHY WAS TR OF THE DELAYED HAND IN MIXED-EXPOSURE SIMILAR TO PURE-EXPOSURE?**

One result that is somewhat difficult to reconcile with a cross-talk account is that the TRE of the delayed hand reached the same level in mixed as in pure-exposure, whereas a cross-talk account would predict a smaller effect in mixed-exposure. One speculation is that the 150 ms delay in the mixed-exposure condition became more noticeable due to a contrast effect. Although somewhat anecdotal, several participants in the mixed-exposure condition reported that they noticed the 150 ms delay, whereas this was rarely the case in pure-exposure. Possibly, the staggered pattern of delays after the right- and left-hand tapping might have led participants to attend to the delayed feedback itself instead of the task-relevant inter-tap intervals. It has been reported that attention to delayed timing can boost TR in the audio-visual domain (Heron et al., 2010), and possibly a similar mechanism works in the sensorimotor domain, thus boosting the TR for the delayed hand in the mixed-exposure condition.

## **CONCLUSION**

This study demonstrates that tapping in synchrony with a pacer tone can be used as a viable measure of TR. It appears that TR has both central and motor-specific components (see also Sugano et al., 2012). The timing of the left and right hand could be adjusted differently after exposure to different delays of sensory feedback. This concurrent adaptation to different delays occurred more slowly and was less complete than when both hands were exposed to the same delay, thus suggesting that there was cross-talk of adaptation between the hands. These results are best explained by a hybrid-clock model with linked central and peripheral internal time-keepers.

#### **AUTHOR CONTRIBUTIONS**

Yoshimori Sugano, Mirjam Keetels, and Jean Vroomen designed the research; Yoshimori Sugano and Mirjam Keetels performed the experiment; Yoshimori Sugano analyzed data; and Yoshimori Sugano, Mirjam Keetels, and Jean Vroomen wrote the paper.

### **ACKNOWLEDGMENTS**

This work is supported by Kyushu Sangyo University and Tilburg University. We thank anonymous reviewers for their helpful comments and Dr. T. D. Keeley for his suggestions on English style.

### **REFERENCES**


Pesavento, M. J., and Schlag, J. (2006). Transfer of learned perception of sensorimotor simultaneity. *Exp. Brain Res.* 174, 435–442. doi: 10.1007/s00221-006-0476-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 September 2013; paper pending published: 19 December 2013; accepted: 03 February 2014; published online: 25 February 2014.*

*Citation: Sugano Y, Keetels M and Vroomen J (2014) Concurrent sensorimotor temporal recalibration to different lags for the left and right hand. Front. Psychol. 5:140. doi: 10.3389/fpsyg.2014.00140*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology. Copyright © 2014 Sugano, Keetels and Vroomen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The influence of intersensory discrepancy on visuo-haptic integration is similar in 6-year-old children and adults

## *Bianca Jovanovic\* and Knut Drewing*

*Department for Developmental Psychology, Institute for Psychology, Justus-Liebig University, Giessen, Germany*

#### *Edited by:*

*Christine Sutter, RWTH Aachen University, Germany*

#### *Reviewed by:*

*Chie Takahashi, University of Birmingham, UK; Priscila Caçola, University of Texas at Arlington, USA*

#### *\*Correspondence:*

*Bianca Jovanovic, Department for Developmental Psychology, Institute for Psychology, Giessen University, Otto-Behaghel-Str., 10F, 35394 Giessen, Germany e-mail: bianca.jovanovic@psychol.uni-giessen.de*

When participants are given the opportunity to simultaneously feel an object and see it through a magnifying or reducing lens, adults estimate object size to be in-between visual and haptic size. Studies with young children, however, seem to demonstrate that their estimates are dominated by a single sense. In the present study, we examined whether this age difference observed in previous studies can be accounted for by the large discrepancy between felt and seen size in the stimuli used in those studies. In addition, we studied the processes involved in combining the visual and haptic inputs. Adults and 6-year-old children judged objects that were presented to vision, haptics or simultaneously to both senses. The seen object length was reduced or magnified by different lenses. In the condition inducing large intersensory discrepancies, children's judgments in visuo-haptic conditions were almost dominated by vision, whereas adults weighted vision just by ∼40%. Neither the adults' nor the children's discrimination thresholds were predicted by models of visuo-haptic integration. With smaller discrepancies, the children's visual weight approximated that of the adults and both the children's and adults' discrimination thresholds were well predicted by an integration model, which assumes that both visual and haptic inputs contribute to each single judgment. We conclude that children integrate seemingly corresponding multisensory information in similar ways as adults do, but focus on a single sense when information from different senses is strongly discrepant.

**Keywords: multisensory, child development, integration, visuo-haptic display, intersensory integration**

## **INTRODUCTION**

Perception is essentially multimodal, with different senses contributing different aspects to the overall appearance of the environment. In some cases, different senses can also convey redundant information about the same object property, as for example, size: the size of an object can be seen and felt at the same time. While it is obvious that being able to process different information enriches our perception, it is at first glance less clear how the convergence of redundant information from different senses contributes to perception. Well-established models on perceptual integration [overview in Ernst and Bülthoff (2004)] suggest that adults combine redundant information from different senses in a way that enhances the reliability of the resulting percept. In contrast, studies with 5- to 6-year-old children imply that their judgments are dominated by one sense, that is, for example, either by visual or by haptic information, depending on testing conditions (Misceo et al., 1999; Gori et al., 2008). A corresponding dominance has been suggested to reflect a lack of multisensory integration (Gori et al., 2008). This conclusion, however, stands in contrast to studies with infants implying that the integration of spatially and temporally coordinated information from different senses already begins during the first year of life (e.g., Rosenblum et al., 1997; Kerzerho et al., 2008). The present paper investigates one possible alternative explanation for the failure to find integration in the 5- to 6-year-old children. We argue that in the studies finding unisensory dominance in children (Misceo et al., 1999; Gori et al., 2008) the ways in which stimuli were presented might have suggested to the children that the information provided by the different senses did not originate from one and the same object, and thus children did not relate the inputs.

How do adults combine redundant information from different senses? A model that has been widely applied to different instances of information integration in human perception is the now well-established Maximum-Likelihood-Estimate (MLE) model of "optimal integration" (Landy et al., 1995; Ernst and Bülthoff, 2004). According to this model the brain takes into account all perceptual information (or cues) available to judge a property, e.g., size information from different senses for size judgments, and combines them in order to obtain a maximally reliable percept. In a first step, estimates (*ŝ<sub>i</sub>*) for the property are derived from each cue (*i*) and in a second step, by weighted averaging, all estimates are combined into a coherent percept (*ŝ<sub>P</sub>*):

$$
\hat{s}_P = \sum_i w_i \hat{s}_i \quad \text{with} \quad \sum_i w_i = 1;\ 0 \le w_i \le 1 \tag{1}
$$

Estimates derived from each perceptual cue are prone to noise (variance σ<sub>*i*</sub><sup>2</sup>). By averaging different estimates, the system can reduce the variance in the combined percept (Landy et al., 1995). How the weights are set depends on the reliability of the individual estimates. The reliability is the inverse of the variance (reliability *R<sub>j</sub>* = 1/σ<sub>*j*</sub><sup>2</sup>). "Optimal" cue weights *w<sub>j</sub>* that result in the minimal variance σ<sub>*P*</sub><sup>2</sup> of the final percept *ŝ<sub>P</sub>* are proportional to the relative reliabilities of the estimates (Oruç et al., 2003):

$$
w_j = \frac{R_j}{\sum_{i=1,\dots,j,\dots,N} R_i} \quad \text{with} \quad R_P = \sum_{i=1,\dots,N} R_i \tag{2}
$$
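As a brief illustration of Equations (1) and (2), the following Python sketch computes reliability-based weights, the combined percept, and its predicted variance for two cues. The numerical values are hypothetical and the function names are chosen only for this example.

```python
def mle_weights(variances):
    """Weights per Equation (2): each cue's reliability (1 / variance) over the summed reliability."""
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    return [r / total for r in reliabilities]


def mle_combine(estimates, variances):
    """Weighted-average percept per Equation (1) and its predicted variance (1 / summed reliability)."""
    weights = mle_weights(variances)
    percept = sum(w * s for w, s in zip(weights, estimates))
    combined_variance = 1.0 / sum(1.0 / v for v in variances)
    return percept, combined_variance


# Hypothetical length estimates (mm) and variances (mm^2): vision first, haptics second.
estimates = [30.0, 24.0]
variances = [1.0, 4.0]

print(mle_weights(variances))
print(mle_combine(estimates, variances))
```

In this example the less noisy visual estimate receives a weight of 0.8 and the haptic estimate a weight of 0.2, and the predicted variance of the combined percept is lower than either single-cue variance.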

Accordingly, dominance of a single sense results when the variance of the estimate derived from that sense is rather low as compared to the variance of the other senses' estimates. If the variances are similar, the different cues are predicted to have similar contributions to each perceptual judgment. Predictions from the MLE model have been confirmed for the integration of different cues within a single sense, such as different visual depth cues like shading or stereo cues (e.g., Young et al., 1993; Perotti et al., 1998; Backus et al., 1999; Hillis et al., 2004). Concerning multisensory integration, the predictions of the MLE model have been tested in several studies (e.g., Ernst and Banks, 2002; Alais and Burr, 2004; Helbig and Ernst, 2007a). In these studies variances of the single senses' estimates of a property, such as object size, have been assessed by measuring discrimination thresholds for that property in a condition in which subjects were using a single sense alone. The single senses' actual variances were, then, used to predict the expected optimal variances and weights of the estimates in a bisensory condition. Actual bisensory variances were assessed by measuring discrimination thresholds in bisensory conditions, and the senses' weights were assessed by introducing small unnoticeable intersensory discrepancies between the information given in the two senses on the same object, e.g., discrepancies between seen and felt length of that object. Many studies quantitatively confirm that multisensory integration is well described by the assumptions of the MLE model, including the optimal weighting of information (Ernst and Banks, 2002; Alais and Burr, 2004; Helbig and Ernst, 2007a). However, there is controversy over the situations in which the weights are indeed set "optimally" (Oruç et al., 2003; Rosas et al., 2005; Cellini et al., 2013).
Thus, suboptimal integration has also been observed, for example, for the integration of signals to slant from visual texture cues and cues from haptic exploration (Rosas et al., 2005). In a few studies predictions from integration models have been contrasted with predictions from the model of "probabilistic cue switching" (or "stochastic selection"; e.g., Nardini et al., 2008; Serwe et al., 2009; Kuschel et al., 2010). Probabilistic cue switching means that participants do not integrate the cues in multi-cue situations, but focus on one single cue of a given stimulus per trial, with a constant relative choice probability for each cue (hence, "probabilistic"). That is, which one of several cues is used for a perceptual judgment, alternates between stimulus presentations. This contrasts with integration, where each of the available cues contributes to the judgment on each stimulus presentation. The model of probabilistic cue switching has proven useful to identify conditions under which cues are not integrated, which, for instance, has been observed when cues are related to each other only on a symbolic level (Serwe et al., 2009).
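The contrast between integration and probabilistic cue switching can be made explicit by comparing the judgment variance each account predicts. The sketch below is a simplified illustration, not the analysis used in the studies cited: it assumes independent Gaussian noise in the two senses, treats the integration weight and the choice probability as the same number, and uses hypothetical means and variances. The switching variance is the standard variance of a two-component mixture.

```python
def integration_stats(mu_v, var_v, mu_h, var_h, w_v):
    """Both cues contribute on every trial: weighted average of independent estimates."""
    w_h = 1.0 - w_v
    mean = w_v * mu_v + w_h * mu_h
    variance = w_v ** 2 * var_v + w_h ** 2 * var_h
    return mean, variance


def switching_stats(mu_v, var_v, mu_h, var_h, p_v):
    """One cue per trial (vision with probability p_v): mean and variance of the mixture."""
    p_h = 1.0 - p_v
    mean = p_v * mu_v + p_h * mu_h
    variance = p_v * var_v + p_h * var_h + p_v * p_h * (mu_v - mu_h) ** 2
    return mean, variance


# Hypothetical values: seen length 30 mm (variance 1), felt length 24 mm (variance 4).
print(integration_stats(30.0, 1.0, 24.0, 4.0, 0.8))
print(switching_stats(30.0, 1.0, 24.0, 4.0, 0.8))
```

Both accounts predict the same mean judgment here, but switching predicts a markedly larger variance, which is why discrimination thresholds, and not the weights alone, are informative about which process is at work.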

Developmental studies have examined whether children combine multisensory information in similar ways as proposed for adults in the integration models. Psychophysical measurement is applicable only with children who are able to give verbal judgments on the object properties in question and to compare these in a systematic way. Thus, the relevant studies with children have mainly focused on children from the age of 5 years and older. The findings from the relevant child studies are mixed. Supporting evidence that is in line with the adult findings comes from a study by King et al. (2010). They measured 7- to 13-year-old children's integration of visual and proprioceptive cues on target position. Children's estimates of target position were influenced by both visual and proprioceptive cues, and the weight of proprioception increased with age. This increase was linked to age-related improvements in proprioceptive precision. The pattern of findings was interpreted as being consistent with the assumption that information is weighted more strongly the more reliable it is. However, the authors did not test directly for integration mechanisms, as they did not have enough data. In contrast, Misceo et al. (1999) found little evidence for integration in a visuo-haptic matching task. These authors used an anamorphic lens to induce an intersensory discrepancy between the observed and the felt size of an object. Children aged 6, 9, and 12 years viewed objects through a lens while manually grasping them through a hand-concealing cloth. Then, they selected a match from a set of comparison objects. When adults performed this task (Hershberger and Misceo, 1996), the size of the match was in-between the observed and the felt object size. For adults, the size of the match deviated from the felt size by 30–70% of the discrepancy between the seen and the felt size, which corresponds to a visual weight of 30–70% in the judgment.
Six-year-old children, however, exhibited nearly complete visual dominance (about 80% visual weight; Misceo et al., 1999). In another recent developmental study on visuo-haptic integration (Gori et al., 2008) 5- to 6-year-old children (but not adults) were again found to display almost complete unisensory dominance: haptic dominance in a size discrimination task (∼20% visual weight, age group 5 years) and visual dominance in an orientation discrimination task (∼90% visual weight, age group 6 years). In addition, Gori et al. (2008) tested the predictions of the MLE model of optimal integration. For the 5- to 6-year-old children both weights and bisensory variances clearly deviated from the model's predictions. In the age groups between 8 and 10 years, however, the children's behavior increasingly resembled that of adults, suggesting that the ability to integrate visual and haptic input develops during this period. In contrast, the data from the 5- to 6-year-olds were interpreted as indicating that children of this age do not yet integrate information from different senses, but rather rely on one single sense (Gori et al., 2008).
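The visual weights quoted for the matching task follow from simple arithmetic: the deviation of the match from the felt size is divided by the discrepancy between seen and felt size. A minimal sketch, with a hypothetical function name and hypothetical object sizes:

```python
def visual_weight(match_size, felt_size, seen_size):
    """Share of the seen-felt discrepancy by which the match deviates from the felt size."""
    discrepancy = seen_size - felt_size
    if discrepancy == 0:
        raise ValueError("the weight is undefined without an intersensory discrepancy")
    return (match_size - felt_size) / discrepancy


# Hypothetical object: felt length 40 mm, seen through a lens that halves it to 20 mm.
print(visual_weight(34.0, 40.0, 20.0))  # match close to the felt size
print(visual_weight(24.0, 40.0, 20.0))  # match close to the seen size
```

A match of 34 mm corresponds to a visual weight of 0.3, whereas a match of 24 mm corresponds to a visual weight of 0.8, comparable to the near-complete visual dominance described for the 6-year-olds.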

This is quite surprising, given the evidence that even infants are able to relate information originating from different senses [overview in Lickliter and Bahrick (2004)]. One example of early multisensory integration is the McGurk effect (McGurk and MacDonald, 1976), which shows that the combination of multisensory information can lead to a percept that is qualitatively different from that provided by the single senses: when participants are presented with discrepant auditory and visual syllables, often some kind of fusion occurs between the syllables (Rosenblum et al., 1997). Rosenblum et al. (1997) have found that the McGurk effect is already present in 5-month-old infants (see also Kushnerenko et al., 2008). The McGurk effect has also been found in 4- to 10-year-old children with similar effects as in adults, or, in some cases, somewhat smaller than in adults (e.g., Massaro et al., 1986). In the same vein, recent work indicates that visual speech cues can help infants to discriminate phonemes (Teinonen et al., 2008). Finally, and of crucial importance, Kerzerho et al. (2008) showed that 5-month-old infants' discrimination of different haptically experienced orientations can be influenced by the presentation of consistent or discrepant visual context cues: when the visual context cues were consistent with the haptic cues, infants became able to differentiate between orientations they were unable to differentiate with haptic information alone. In contrast, when the spatial orientation presented visually was discrepant to the one presented haptically, infants' performance was disrupted.

Given that there is evidence of infants' ability to match and integrate perceptual information from different senses, it is puzzling that multisensory integration is so hard to find in older children. On the one hand, it could be the case that these early abilities rely on qualitatively different mechanisms for processing and integrating perceptual information. On the other hand, children's low integration performance in the studies cited above (Misceo et al., 1999; Gori et al., 2008) could be explained by several methodological issues. As an example, Misceo et al. (1999) used a lens to introduce intersensory discrepancy. However, the lens used in these studies was relatively strong, so that it halved the visually presented object size. Stimuli with correspondingly large discrepancies might not induce natural multisensory processes, because large discrepancies provide a cue suggesting that the information from the different senses probably stems from different objects and does not belong together. In the study by Gori et al. (2008), this specific problem was avoided, as the induced discrepancies were quite small. However, the objects used in that study consisted of two spatially divided parts. Thus, participants examined a pair of objects attached to the front and rear surfaces of a panel so as to simulate a single object protruding through a hole. The participants felt the one in the back while viewing the one in front. Crucially, with this method it might not have been apparent to the participants that haptic and visual inputs stemmed from the same object, as they did not originate from the same location. Earlier work on the bi-partite task has shown that even for adults task instructions regarding a shared origin of visual and haptic inputs are required to promote integration (Miller, 1972). Correspondingly, Gepshtein et al. (2005) demonstrated that physical proximity is an important precondition for the combination of different sensory cues.

Taken together, in the studies discussed here (Misceo et al., 1999; Gori et al., 2008), the cues provided by the experimental paradigms might have hindered the young children from perceiving a relation between the visual and the haptic inputs, as on the one hand there were great size discrepancies between seen and felt stimuli (Misceo et al., 1999) and on the other hand the physical proximity between felt and seen parts of the stimuli was low (Gori et al., 2008). In the present studies, we sought to overcome these various methodological problems.

We studied visuo-haptic length judgments using the lens technique, because this technique allowed us to provide the participants with perceptual cues that indicate the common origin of haptic and visual inputs: while participants looked at the stimulus, they simultaneously saw how their fingers touched the stimulus through a soft cloth, which should be a strong cue indicating that they felt and saw the same single object (Helbig and Ernst, 2007b). Participants touched through a soft cloth in order to ensure that they were able to see their finger movements without any optical distortion of their fingers. To test for the influence of magnification, we used different lenses. The lenses induced either large or small visuo-haptic discrepancies. We expected that with small intersensory discrepancies even 6-year-old children would be able to use inputs from both senses, much in the same way as adults do. Besides adults, we decided to investigate 6-year-old children, because the previous studies (Misceo et al., 1999; Gori et al., 2008) suggest that children in this age group can understand and accomplish the necessary experimental tasks, while at the same time their abilities to integrate visual and haptic information are not yet developed. If, however, we found positive evidence of integration already in 6-year-old children, the assumption that integration abilities do not develop before school age would be cast into doubt. We tested behavior against predictions from models of optimal and suboptimal integration and of probabilistic switching between the senses.

## **MATERIALS AND METHODS**

Six-year-old children and adults compared the length of different rectangular standard stimuli (of 20–30 mm length) with a set of comparison stimuli in a two-interval forced choice task combined with the method of constant stimuli. Standard stimuli were presented either haptically (precision grip), or visually, or to both senses. Comparison stimuli were presented only haptically in each condition. In visuo-haptic conditions we used cylindrical reducing and magnifying lenses in order to dissociate the seen length of the standard stimulus from its felt length. For the groups with large discrepancies between seen and felt length the magnifying/reducing factor was 1.5; for the small-discrepancy groups the factor was 1.25. Due to their cylindrical shape the lenses did not affect the seen width of the objects.

Participants successively explored the standard and the comparison stimulus and afterwards had to indicate which of the two stimuli they had perceived to be larger. We assessed length judgments of the standard stimuli by the points of subjectively equal length (PSE). From visuo-haptic length judgments we derived the senses' weights in bisensory judgments. In addition, we measured 84%-discrimination thresholds (just noticeable difference, JND) in order to assess uni- and bisensory variances. We predicted bisensory weights and thresholds using models that assume optimal integration, suboptimal integration or no integration at all (probabilistic switching between senses).

In contrast to most of the previous studies (e.g., Ernst and Banks, 2002; Gori et al., 2008), in which comparison stimuli were presented in the same modality as the standard stimuli, in the present experiment the comparison stimuli were always presented only haptically. First, our modified method measured JNDs and PSEs on the same scale in all modality conditions, namely as compared to haptic stimuli, and differential biases between the senses were assessed and considered in the further analyses [cf. Equation (3) and footnote 1; cf. also Reuschel et al. (2010)]. In contrast, the previously used methods measured these values on different modality-specific scales and did not assess potential biases. Second, it has been argued that automatic aspects of multisensory integration are better captured when participants compare bisensory stimuli to unisensory comparison stimuli, as bisensory comparison stimuli can also trigger deliberate processes of integration (cf. e.g., Shams et al., 2000; Bresciani et al., 2006; Ernst, 2006; Helbig and Ernst, 2007b). Previous adult studies, in which bisensory stimuli were matched to unisensory stimuli, show, however, that judgments can be slightly shifted toward the sense to which the comparison stimuli were presented (Hershberger and Misceo, 1996; Helbig and Ernst, 2007b). Ernst (2006) explains this shift by an incomplete fusion between the senses, while with complete fusion judgments on bisensory stimuli that refer to either of the two senses are predicted to be the same [corresponding findings in Lederman et al. (1986)]. If fusion is complete, findings are consistent with the predictions of the MLE model on optimal integration, while incomplete fusion corresponds to suboptimal automatic integration.

## **PARTICIPANTS**

Children were sampled from different kindergartens in the regions of Hagen and Giessen; adult participants were mainly sampled from Giessen University. Informed consent was obtained from the parents before testing. We collected complete data sets from 40 adults and 40 children. However, we removed the data from 10 participants (7 children, 3 adults), who had an outlier JND, defined as a measured value more than 3 standard deviations above or below the average in the respective condition. Given the small number of data points per JND, a temporary lack of attention to the task is a potential reason for such outliers. The final sample of the large discrepancy group included 22 children with a mean age of 6;2 [years; months] and an age range of 5;5–6;11 (50% females; 68% right-handed) and 24 adults with a mean age of 32 years and an age range of 18–51 years (46% females; 79% right-handed). The final sample of the small discrepancy group included 11 children with a mean age of 5;6 and an age range of 5;1–5;11 (36% females; 73% right-handed) and 13 adults with a mean age of 25 years and an age range of 20–34 years (62% females; all right-handed).

## **APPARATUS AND STIMULI**

The entire apparatus was mobile and the experiments were conducted in a quiet room in the respective kindergartens or at the university. Participants sat at a table, facing the experimenter. Two "presentation boxes" stood side by side on the table. One presentation box contained the standard stimulus, the other one contained the comparison stimulus. Participants could look at the stimulus through diving goggles and an exchangeable lens; a blind in the box occluded left-eye views. The experimenter placed one stimulus at the bottom of each box. After placement, participants could reach through a soft cloth at the sides of the box to simultaneously see and feel the stimulus. The soft cloth prevented participants from seeing their fingers through the lens while they were able to see their finger movements. Stimuli were rectangular plastic plates that were covered with a red-colored smooth film (1 mm high, 20 mm wide, length 14–36 mm). A custom-made computer program prescribed the presentation order and collected the participants' responses.

## **DESIGN AND PROCEDURE**

The design comprised the between-participant variables Age Group (Children vs. Adults) and Intersensory discrepancy (large vs. small) and the within-participant variables Modality (Haptics alone, Vision alone, Haptics and Vision) and Stimulus Set (short-magnified vs. long-reduced). Each of the 12 combinations of the conditions of the variables Intersensory discrepancy, Modality, and Stimulus Set was realized by a specific "standard stimulus," as we will explain below in detail.

Each standard stimulus was paired with a range of comparison stimuli that were presented haptically in each condition several times—using the method of constant stimuli in a two-interval forced choice paradigm. In each trial we presented the participants successively with a standard and a comparison stimulus. Participants were instructed to indicate which stimulus had been larger by pointing to the corresponding presentation box. From these responses we calculated the individual points of subjective equality (PSE) and the 84%-discrimination thresholds (just noticeable difference, JND) of each standard compared to the comparison stimuli.

The 12 different standard stimuli were implemented using two different physical stimuli: a shorter physical stimulus that was visually presented via a magnifying lens (Stimulus Set: short-magnified) and a longer physical stimulus that was visually presented via a reducing lens (Stimulus Set: long-reduced). For the large discrepancy condition the physical stimuli were 20 and 30 mm long so that haptic standard stimuli in the large discrepancy condition were 20 and 30 mm long. For visual presentation the 20 mm-stimulus was magnified and the 30 mm-stimulus reduced by a factor of about 1.5 so that visual standard stimuli in the large discrepancy condition should have had seen lengths of about 30 and 20 mm. In visuo-haptic conditions the discrepant visual and haptic information was presented simultaneously. For the small intersensory discrepancy condition the physical stimuli were 20 and 25 mm long and optically magnified or reduced by a factor of 1.25 (seen length about 25 and 20 mm).

Each standard was paired with comparison stimuli that were distributed around the comparison's value that we expected to be perceived as equal to the standard in length (i.e., the PSE). In haptic and visual alone conditions, the expected PSEs were the standard's felt and seen lengths. In visuo-haptic conditions we only expected that the PSEs would be in-between seen and felt length, and hence used a slightly higher number of comparison stimuli around the mean of felt and seen length. Details can be found in **Table 1**. We accepted the unequal number of comparison stimuli, because we aimed to keep the experiment as short as possible in order to keep the children's attention engaged during the entire experiment.

Each standard-comparison pair was presented three times. The experiment was divided into three parts. Each part involved three blocks of trials, one block for each Modality condition (Haptics alone/Vision alone/Haptics and Vision). The order of Modality blocks was balanced across participants. Each block contained each standard-comparison pair once. The order of presentations in each block was randomized, preventing adaptation in visuo-haptic conditions. The experiment was conducted in 2–3 sessions of less than 30 min duration each.

#### **Table 1 | Length of comparison stimuli for each standard.**

In each trial participants first explored the standard stimulus. We chose to present standard and comparison stimuli in a fixed order, because this facilitated the participants' task, as they were required to treat standard and comparison stimuli slightly differently. Note that potential perceptual bias due to the fixed order is implicitly considered and invalidated when we calculate weights [Equation (3) and footnote 1]. In haptic alone conditions participants felt the standard for about 1–2 s. They were instructed to grasp with the thumb and the index finger of their dominant hand. Participants always touched the stimulus through a soft cloth. In visual conditions participants looked at the standard for about 2 s. In visuo-haptic conditions participants grasped the standard while looking at it, keeping visual and haptic presentation times approximately equal to the unisensory conditions, i.e., 1–2 s. In visuo-haptic conditions participants saw their finger movements through the soft cloth, so that they knew that the visual and haptic input stemmed from the same object. After having explored the standard stimulus, participants felt the comparison stimulus through a soft cloth in the other presentation box for about 1–2 s. Then, they indicated which of the two stimuli they had perceived as being larger by pointing to the corresponding presentation box. The experimenter entered the participant's response in a computer program. The experimenter also guided the participant through the experimental trials: she instructed the participants online when to explore each stimulus and when to respond, while ensuring that the stimulus exploration times and the time between the exploration of the two stimuli did not exceed 2 s and that response times did not exceed 10 s. Between trials the experimenter changed the stimuli and lenses in the presentation boxes as indicated by the computer program.

#### **DATA ANALYSIS**

Applying the method of constant stimuli, we acquired 21–27 responses per participant and condition. We plotted the proportion of trials in which the standard was perceived as being longer than the comparison against the length of the comparison. The PSE is defined as the amplitude of the comparison stimulus at which either stimulus is equally likely to be chosen. The JND is defined as the difference between the PSE and the amplitude of the comparison when it is judged longer than the standard 84% of the time. We fitted cumulative Gaussians to the psychometric functions using the psignifit toolbox for Matlab, which implements maximum-likelihood estimation methods (Wichmann and Hill, 2001). The parameter μ of the Gaussian estimates the PSE, and σ estimates the 84%-discrimination threshold (JND).
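The fitting step can be illustrated with a minimal sketch. It is not the psignifit implementation used in the study: it assumes a plain cumulative Gaussian without lapse parameters, codes responses as "comparison judged longer" (so that the function increases with comparison length), and replaces a proper optimizer by a coarse grid search. All function names and the response counts are hypothetical.

```python
import math

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian psychometric function (mu = PSE, sigma = JND)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def neg_log_likelihood(mu, sigma, lengths, n_longer, n_trials):
    """Binomial negative log-likelihood of the 'comparison longer' counts."""
    nll = 0.0
    for x, k, n in zip(lengths, n_longer, n_trials):
        # Clamp probabilities to avoid log(0) at the extremes.
        p = min(max(cum_gauss(x, mu, sigma), 1e-6), 1.0 - 1e-6)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_psychometric(lengths, n_longer, n_trials, step=0.05):
    """Grid-search maximum-likelihood estimates of (PSE, JND) in mm."""
    lo, hi = min(lengths), max(lengths)
    n_mu = int(round((hi - lo) / step)) + 1
    n_sigma = int(round((hi - lo) / step))
    best_nll, best_mu, best_sigma = float("inf"), None, None
    for i in range(n_mu):
        mu = lo + i * step
        for j in range(1, n_sigma + 1):
            sigma = j * step
            nll = neg_log_likelihood(mu, sigma, lengths, n_longer, n_trials)
            if nll < best_nll:
                best_nll, best_mu, best_sigma = nll, mu, sigma
    return best_mu, best_sigma

# Hypothetical data: 7 comparison lengths (mm), 3 presentations each.
lengths = [14, 16, 18, 20, 22, 24, 26]
n_trials = [3] * 7
n_longer = [0, 0, 1, 1, 2, 3, 3]
pse, jnd = fit_psychometric(lengths, n_longer, n_trials)
```

With only three presentations per comparison length, as in this hypothetical data set, the estimate of σ is necessarily noisy, which is in line with the remark above that a small number of data points per JND can produce outliers.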

From the PSEs we estimated the individual weights of visual information *w<sub>v_emp</sub>* in visuo-haptic judgments for each standard stimulus; the haptic PSE<sub>*H*</sub> and the visual PSE<sub>*V*</sub> were estimated from group means; they were combined with the individual visuo-haptic PSE<sub>*VH*</sub> as follows<sup>1</sup>:

$$w_{v_{emp}} = \frac{\text{PSE}_{VH} - \text{PSE}_{H}}{\text{PSE}_{V} - \text{PSE}_{H}} \tag{3}$$

Further, we aimed to predict the variance in the visuo-haptic conditions from the variances of haptic and visual estimates. This required, first, estimating the variances of the haptic and visual estimates from the JNDs, which we did as follows (left side)<sup>2</sup>:

$$\text{JND}_{H} = \sqrt{\sigma_{h}^{2} + \sigma_{h}^{2}} \rightarrow \sigma_{h}^{2} = \frac{1}{2} \text{JND}_{H}^{2}$$

$$\text{JND}_{V} = \sqrt{\sigma_{h}^{2} + \sigma_{v}^{2}} \rightarrow \sigma_{v}^{2} = \text{JND}_{V}^{2} - \frac{1}{2} \text{JND}_{H}^{2}$$

$$\text{JND}_{VH} = \sqrt{\sigma_{h}^{2} + \sigma_{vh}^{2}} \rightarrow \sigma_{vh}^{2} = \text{JND}_{VH}^{2} - \frac{1}{2} \text{JND}_{H}^{2} \tag{4}$$

The uppercase letters *H*, *V*, and *VH* indicate the three modality conditions, the lowercase letters refer to the modality-specific estimates derived from haptic (*h*), visual (*v*), and visuo-haptic stimuli (*vh*). Further, it is assumed that modality-specific variances are similar for different stimulus values.
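Equations (3) and (4) can be sketched in a few lines of code. The function names and the numerical values below are hypothetical and serve only to illustrate the arithmetic.

```python
def visual_weight(pse_vh, pse_h, pse_v):
    """Empirical visual weight, Equation (3)."""
    return (pse_vh - pse_h) / (pse_v - pse_h)

def estimate_variances(jnd_h, jnd_v, jnd_vh):
    """Variances of the haptic, visual and visuo-haptic estimates, Equation (4).

    Every JND contains the variance of the haptic comparison estimate,
    which is removed by subtracting 0.5 * jnd_h ** 2.
    """
    var_h = 0.5 * jnd_h ** 2
    var_v = jnd_v ** 2 - var_h
    var_vh = jnd_vh ** 2 - var_h
    return var_h, var_v, var_vh

# Hypothetical values in mm: PSEs of 20 (haptic), 30 (visual), 24 (visuo-haptic)
w_v_emp = visual_weight(pse_vh=24.0, pse_h=20.0, pse_v=30.0)
var_h, var_v, var_vh = estimate_variances(jnd_h=2.0, jnd_v=3.0, jnd_vh=2.5)
```

Note that the subtraction in Equation (4) can yield very small or even negative variance estimates when a measured visual or visuo-haptic JND is smaller than JND<sub>*H*</sub>/√2, so noisy JNDs propagate strongly into anything derived from these variances.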

<sup>1</sup>Remember that the PSEs assessed internal length estimates *ŝ<sub>i</sub>* in Equation (1), and that they do so on a single scale across modality conditions, namely as compared to haptic-only stimuli. We calculate the weights from the different PSEs only, and, thus, directly calculate the estimates' weights *w<sub>i</sub>* in Equation (1). By using this method, perceptual bias cannot bias the calculation of estimate weights. This is an advantage over previously used methods to calculate weights, because these were partly based on physical stimulus values and require the untested assumption that perception is unbiased (Ernst and Banks, 2002; Helbig and Ernst, 2007a; Gori et al., 2008).

<sup>2</sup>The equations follow from the following considerations: If in a discrimination task standard and comparison stimuli are presented in the same modality, it is assumed that the squared 84%-discrimination threshold (JND) equals twice the variance of the underlying modality-specific estimates (Ernst and Banks, 2002). The underlying model is that in each discrimination trial two independent internal estimates with equal variance, one derived from the standard and one from the comparison stimulus, contribute to the overall judgment variance that is assessed by the JND. Because in the present experiment standard and comparison are not presented in the same modality, we assume that the two corresponding estimates contribute with unequal, but modality-specific variances σ<sup>2</sup> to the JNDs in the different conditions. Further, it is assumed that estimates from the two stimuli have uncorrelated noises and, thus, their variances add up.

Visual and haptic variance estimates were used to predict visuo-haptic variances and visual weights according to the MLE model of optimal integration [from Equation (2); (Ernst and Banks, 2002)]:

$$
\sigma_{vh_{MLE}}^2 = \frac{\sigma_v^2 \, \sigma_h^2}{\sigma_v^2 + \sigma_h^2}, \qquad w_{v_{MLE}} = \frac{\sigma_h^2}{\sigma_v^2 + \sigma_h^2} \tag{5}
$$
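As a minimal illustration of Equation (5), the optimal predictions follow directly from the two unisensory variances. The function name and the numbers are hypothetical.

```python
def mle_prediction(var_h, var_v):
    """Optimal visual weight and visuo-haptic variance, Equation (5)."""
    w_v_mle = var_h / (var_v + var_h)
    var_vh_mle = (var_v * var_h) / (var_v + var_h)
    return w_v_mle, var_vh_mle

# Hypothetical unisensory variances (mm^2): vision is the noisier sense here.
w_v_mle, var_vh_mle = mle_prediction(var_h=2.0, var_v=6.0)
```

For positive variances the predicted visuo-haptic variance is never larger than the smaller of the two unisensory variances, and the less variable sense receives the larger weight. If a variance estimate is close to zero or negative, the denominator becomes small and the weight estimate can fall far outside the interval from 0 to 1.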

Further, we tested for suboptimal integration. For the weights of suboptimal integration we assumed the empirical weights *w<sub>v_emp</sub>*. Assuming that the estimates' noises are uncorrelated, visuo-haptic variance was predicted as follows (Kuschel et al., 2010):

$$
\sigma_{vh_{sub}}^2 = w_{v_{emp}}^2 \, \sigma_v^2 + (1 - w_{v_{emp}})^2 \, \sigma_h^2 \tag{6}
$$

Finally, we predicted visuo-haptic variances assuming probabilistic cue switching. In this case the empirical weights *w<sub>v_emp</sub>* estimate the probability that only the visual input is used to estimate the length of a visuo-haptic stimulus. Visuo-haptic variances were predicted as follows [adapted from Kuschel et al. (2010); cf. Nardini et al. (2008)]:

$$
\sigma_{vh_{switch}}^2 = w_{v_{emp}} \sigma_v^2 + (1 - w_{v_{emp}}) \sigma_h^2 + w_{v_{emp}} (1 - w_{v_{emp}}) \left( \text{PSE}_V - \text{PSE}_H \right)^2 \tag{7}
$$

Visuo-haptic variances from all three models were transformed back into JND predictions and compared to the actual JNDs. Predictions were based on individually averaged values (averaged over Stimulus Sets) and weight estimates were confined to be between 0 and 1.
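The suboptimal-integration and cue-switching predictions, together with the back-transformation into JNDs, can be sketched as follows. The sketch assumes that the back-transformation inverts Equation (4), i.e., that the predicted JND is the square root of the sum of the haptic comparison variance and the predicted visuo-haptic variance. Function names and numbers are hypothetical.

```python
import math

def clip_weight(w):
    """Confine a weight estimate to the interval [0, 1]."""
    return min(1.0, max(0.0, w))

def var_suboptimal(w_v, var_v, var_h):
    """Weighted average of uncorrelated estimates: squared weights."""
    return w_v ** 2 * var_v + (1.0 - w_v) ** 2 * var_h

def var_switching(w_v, var_v, var_h, pse_v, pse_h):
    """Mixture of visual-only and haptic-only trials: the mean shift
    between the two senses adds to the variance."""
    return (w_v * var_v + (1.0 - w_v) * var_h
            + w_v * (1.0 - w_v) * (pse_v - pse_h) ** 2)

def predicted_jnd(var_vh, var_h):
    """Assumed inverse of Equation (4): JND_VH = sqrt(var_h + var_vh)."""
    return math.sqrt(var_h + var_vh)

# Hypothetical values: variances in mm^2, PSEs in mm.
w = clip_weight(0.5)
jnd_sub = predicted_jnd(var_suboptimal(w, 6.0, 2.0), var_h=2.0)
jnd_switch = predicted_jnd(var_switching(w, 6.0, 2.0, 30.0, 20.0), var_h=2.0)
```

Because the switching prediction contains the squared difference between the visual and haptic PSEs, it grows quickly with the intersensory discrepancy, whereas the integration prediction does not depend on the discrepancy at all.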

#### **RESULTS**

#### **PSEs**

As should be the case, PSEs from the haptic conditions were, on average, close to the actual values of the physical stimuli, i.e., close to 20 mm for the Set "short-magnified," and close to 25 and 30 mm, respectively, for the Set "long-reduced" (**Figure 1**). They did not significantly differ between age groups (*p*s > 0.10 for main effect and interactions with Age group: ANOVA of haptic PSEs with variables Intersensory discrepancy, Age group, Stimulus set). Visual PSEs indicated that the optical magnification and reduction of the physical stimuli was successful, but the PSEs did not perfectly correspond to the expected values: The physical stimulus of 20 mm length was expected to be magnified to a seen length of 25 mm in the small discrepancy condition and to 30 mm in the large discrepancy condition, but visual PSEs were, on average, 23.8 and 29.9 mm, respectively. The physical stimulus of 25 mm in the small discrepancy condition and that of 30 mm length in the large discrepancy condition were both expected to be reduced to a seen length of 20 mm, but visual PSEs were, on average, 20.7 and 17.2 mm, respectively. Additionally, the magnifying or reducing effect of the lens as assessed by the visual PSEs was slightly more pronounced for children than for adults, except for the reducing lens in the small discrepancy condition (**Figure 1**). An ANOVA (variables Intersensory discrepancy, Age group, Stimulus set) showed significant interactions Age group × Stimulus set, *F*(1, 66) = 5.132, *p* = 0.027, and Age group × Stimulus set × Intersensory discrepancy, *F*(1, 66) = 6.935, *p* = 0.011.

The deviations of the visual PSEs from the target values do, however, not limit the conclusions that can be drawn from the experiment, because further analyses were based on the visual and haptic PSEs of the stimuli, not on their physical parameters.

Finally, as can be clearly seen in **Figure 1**, PSEs from the visuo-haptic conditions were in-between the PSEs from corresponding visual and haptic conditions. This indicates some combination of the discrepant visual and haptic information. The relative shift of the visuo-haptic PSEs from the haptic toward the visual PSEs will be analyzed in the next subsection on visual weights.

#### **WEIGHTS**


The empirical visual weights *w<sub>v_emp</sub>* were submitted to an ANOVA with the within-participant variable Stimulus set and the between-participant variables Age group and Intersensory discrepancy. The visual weights were larger for children as compared to adults, *F*(1, 66) = 45.42, *p* < 0.001. Further, visual weights were, on average, larger for the large intersensory discrepancy as compared to the small one, *F*(1, 66) = 18.15, *p* < 0.001, but this effect was modified by the age group [interaction Age Group × Discrepancy, *F*(1, 66) = 5.20, *p* = 0.027]. Separate tests confirmed the effect of Intersensory discrepancy on the visual weight only for the children, *F*(1, 31) = 21.80, *p* < 0.001, but not for the adults, *F*(1, 35) = 2.00, *p* = 0.17. There were no other significant effects on the visual weights (*p*s > 0.15).

#### **JNDs**

JND values were submitted to an ANOVA with the between-participant variables Age group and Intersensory discrepancy and the within-participant variables Modality and Stimulus set (**Figure 3**). On average, the JND values were larger for children than for adults, *F*(1, 66) = 18.87, *p* < 0.001. Further, JNDs differed between Modalities, *F*(2, 132) = 4.22, *p* = 0.017, and this effect was modified by the extent of the intersensory discrepancy, interaction: *F*(2, 132) = 3.48, *p* = 0.034. With large intersensory discrepancies, visual and haptic JNDs were similar but visuo-haptic JNDs were significantly larger than both unisensory JNDs, suggesting a bisensory disadvantage. In contrast, with small intersensory discrepancies, visual JNDs were significantly larger than haptic ones, while visuo-haptic JNDs did not reliably differ from the unisensory JNDs (*post-hoc t*-tests, Bonferroni-adjusted per intersensory discrepancy, α = 5%). Numerically, with small intersensory discrepancies the bisensory JNDs were in-between the unisensory ones (**Figure 3**). Other effects in the ANOVA were not significant (*p*s > 0.15). Note that the lack of interaction with Age group suggests that the pattern of results was similar for children and adults.

#### **MODEL PREDICTIONS**

Average and individual model predictions for visuo-haptic JNDs are depicted in **Figures 4**, **5**. Although individual JND values are somewhat spread, which is a consequence of the relatively low number of data points that we were able to collect per child, the figures already provide a clear overall picture of the results. It can be seen that with the small intersensory discrepancy both the adults' and the children's data are well predicted from the model of suboptimal integration (similar average JNDs for observed and predicted values in **Figure 4** corresponding to a slope close to 1 in **Figure 5**). In contrast, with the large intersensory discrepancy none of the models provides a good fit. The following section reports the inferential statistics in detail.


#### *Optimal integration*

Using the model of optimal integration, we predicted optimal visual weights and optimal JNDs in bisensory conditions based on the corresponding unisensory JNDs.

In a certain number of cases (27%) optimal weight estimates were clear outlier values (>1, up to 31)<sup>3</sup>, which is a consequence of the magnification of measurement errors through the estimation procedure. Hence, we compared predicted and observed weights using non-parametric Wilcoxon tests, because these are based on ranks rather than on absolute values, and we report median weights instead of means. With the large intersensory discrepancy the children's empirical weights were significantly higher than predicted from optimal integration (*Med*: 0.86 vs. 0.35, *Z* = 3.339, *p* = 0.001), while the adults' empirical weights were significantly lower than predicted (*Med*: 0.44 vs. 0.68, *Z* = 2.686, *p* = 0.007; note that the median weights used in these analyses slightly differ from the averages depicted in **Figure 2**). With small intersensory discrepancies the same numerical trends were visible, but not significant (children: *Med*: 0.48 vs. 0.26, *Z* = 1.067, *p* = 0.268; adults: *Med*: 0.28 vs. 0.42, *Z* = 1.013, *p* = 0.311).

Further, we compared predicted and observed visuo-haptic JND values using two ANOVAs with the variables Age group and Value (predicted vs. observed), one analysis for each of the two intersensory discrepancies. For both discrepancies the observed JNDs were significantly larger than predicted from optimal integration, independent of Age group [Value effect, large discrepancy: *F*(1, 44) = 29.87, *p* < 0.001, small discrepancy: *F*(1, 22) = 8.07, *p* = 0.009; interaction Value × Age group, large: *F*(1, 44) = 1.56, *p* = 0.23, small: *F*(1, 22) = 2.29, *p* = 0.14]. We further tested the slopes of the linear regression (no constant, **Figure 5**) of observed upon predicted JNDs against a slope of 1 (=perfect prediction). These analyses confirm that optimal predictions underestimate the actual visuo-haptic JNDs for almost every combination of discrepancy and age group [adults-large, slope 1.46, *t*(23) = 2.45, *p* = 0.01; children-large, slope 1.54, *t*(21) = 2.68, *p* = 0.007; children-small, slope 1.47, *t*(10) = 2.25, *p* = 0.02; exception: adults-small, slope 1.17, *t*(12) = 1.09, *p* = 0.15, one-tailed].
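The slope analysis can be illustrated with a small sketch of a regression through the origin whose slope is tested against 1. It assumes an ordinary least-squares fit without constant and a *t* statistic with *n* − 1 degrees of freedom, which matches the degrees of freedom reported above; it is not necessarily the exact procedure used for the reported values. The function name and the data are hypothetical.

```python
import math

def slope_test_through_origin(predicted, observed):
    """Least-squares slope of observed on predicted (no constant) and
    the t statistic for the null hypothesis slope = 1."""
    n = len(predicted)
    sxx = sum(x * x for x in predicted)
    sxy = sum(x * y for x, y in zip(predicted, observed))
    slope = sxy / sxx
    # Residual variance with n - 1 degrees of freedom (one fitted parameter).
    sse = sum((y - slope * x) ** 2 for x, y in zip(predicted, observed))
    df = n - 1
    se = math.sqrt(sse / df / sxx)  # undefined if the fit is perfect (sse = 0)
    t = (slope - 1.0) / se
    return slope, t, df

# Hypothetical predicted and observed JNDs (mm) of four participants.
slope, t, df = slope_test_through_origin([1.0, 2.0, 3.0, 4.0],
                                         [1.5, 2.9, 4.6, 6.0])
```

A slope above 1 means that the model underestimates the observed JNDs, and a slope below 1 means that it overestimates them.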

#### *Suboptimal integration*

We further predicted visuo-haptic JND values under the assumption that participants integrate visual and haptic information suboptimally with the measured empirical weights. Again, two ANOVAs with the variables Age group and Value were conducted. For large intersensory discrepancies observed JNDs were significantly larger than predicted from suboptimal integration, *F*(1, 44) = 16.14, *p* < 0.001, again independent of Age group [interaction Value × Age group, *F*(1, 44) = 0.357, *p* = 0.55]. In contrast, for small intersensory discrepancies the observed JND values did not significantly differ from the predicted values [Value effect, *F*(1, 22) = 0.45, *p* = 0.51; interaction Value × Age Group, *F*(1, 22) = 0.10, *p* = 0.75], indicating that these data are consistent with the assumption of a suboptimal integration both in children and adults. Slope analyses confirm that suboptimal predictions underestimate the actual visuo-haptic JNDs for large discrepancies [adults, slope 1.32, *t*(23) = 2.13, *p* = 0.04; children, slope 1.28, *t*(21) = 2.12, *p* = 0.046], but fit well with the data for small intersensory discrepancies [adults, slope 1.06, *t*(12) = 0.48, *p* = 0.64; children, slope 1.01, *t*(10) = 0.04, *p* = 0.97].

<sup>3</sup>This did not occur for JND predictions, as can be seen in **Figure 5**. Nevertheless, we conducted all analyses on predicted JNDs twice: with all data and excluding data with outlying optimal weights (>1). The conclusions were the same and we, hence, only report analyses on all data.

#### *Cue switching*

Finally, we tested the measured visuo-haptic JNDs against predictions from cue switching, i.e., assuming that participants used either only visual or only haptic cues with probabilities that are estimated by the empirical weights. Again, two ANOVAs were conducted. For both large and small intersensory discrepancies observed JNDs were significantly smaller than predicted from cue switching [large: *F*(1, 44) = 42.93, *p* < 0.001; small: *F*(1, 22) = 5.85, *p* = 0.024]. For small discrepancies the Value effect was independent of Age Group [interaction, *F*(1, 22) = 0.004, *p* = 0.95], while for large discrepancies it was more pronounced for adults than for children, *F*(1, 44) = 5.89, *p* = 0.019. However, the slope analyses confirm that cue switching overestimates the actual visuo-haptic JNDs for each intersensory discrepancy and age group [adults-large, slope 0.45, *t*(23) = 12.24, *p* < 0.001; children-large, slope 0.66, *t*(21) = 3.26, *p* = 0.002; adults-small, slope 0.71, *t*(12) = 3.16, *p* = 0.004; children-small, slope 0.81, *t*(10) = 1.46, *p* = 0.04; one-tailed].

## **DISCUSSION**

In the present study, we investigated how adults and 6-year-old children combine seen and felt object length. We studied visuo-haptic judgments introducing a large or a small intersensory discrepancy between seen and felt length. We assessed the contribution of each sense to the bisensory judgments via the points of subjectively equal length of discrepant visuo-haptic stimuli to stimuli that were only felt. Further, we tested different models on how visual and haptic inputs are combined by comparing their predictions with the actual data.

In adults, the contribution of vision to the judgments was moderate and did not reliably depend on the magnitude of the intersensory discrepancy (average 33 and 42% for small and large discrepancies, respectively). In contrast, the children's judgments were dominated by seen length (85%) for large discrepancies, but less so for small discrepancies (54%). We conclude that children—but not adults—concentrate on a single sense, here vision, when inputs from two senses are in large conflict. However, when the inputs from the different senses seem to correspond to each other, children also can use both inputs.

But how exactly did children and adults combine the inputs from the different senses? We tested model predictions of optimal integration (using optimal weights), suboptimal integration (assuming that participants used the measured weights) and probabilistic cue switching. Integration models predict that on each presentation of a visuo-haptic stimulus participants combine both the visual and the haptic input into a single length estimate for that stimulus using a weighted average of estimates from the two senses. In contrast, probabilistic cue switching means that participants never integrate but, with a certain probability, use either only the visual input or only the haptic input to estimate the stimulus' length. Overall, with the small intersensory discrepancy, visuo-haptic JNDs tended to be in-between visual and haptic ones, but were, in contrast to the predictions of the optimal integration account (Kuschel et al., 2010), not lower than each of the unisensory JNDs. In addition, JND predictions from the model of suboptimal integration provided a good match for the data both for adults and children, whereas predictions from cue switching, and also partly predictions from optimal integration, were rejected. We, hence, conclude that with the small intersensory discrepancy both adults and children integrated visual and haptic information suboptimally. In contrast, with large discrepancies, visuo-haptic JNDs were higher than predicted from optimal and suboptimal integration both in children and adults. Moreover, the bisensory JNDs were higher than the unisensory JNDs in either sense. Kuschel et al. (2010) have shown that if inputs from two senses are integrated, the variance of the bisensory estimates cannot be higher than the maximum of the variances of the two unisensory estimates. Because the present bi- and unisensory JNDs monotonically relate to the corresponding variances [cf. Equation (4)], we can, hence, conclude that with the large discrepancy neither adults nor children integrated the inputs from the two senses. However, their performance was still better than predicted from probabilistic cue switching, which we will discuss below. Taken together, we conclude that both children and adults integrated the visual and haptic input when the discrepancy between the inputs was small, but failed to integrate with large discrepancies.

Overall, the results are able to explain discrepancies between earlier findings on the development of multisensory integration. While studies on infants tend to suggest that the perception of intersensory relations and multisensory integration have an early onset, previous studies on pre-school and school children that used psychophysical tasks similar to those used in adults found little evidence for integration in visuo-haptic tasks before the age of 8–10 years (Misceo et al., 1999; Gori et al., 2008). In these studies children rather focused on a single sense and did not combine the information from both senses. However, in these studies cues might have suggested that visual and haptic inputs did not have a common origin: either the discrepancy between the visual and the haptic input was quite large (Misceo et al., 1999) or the two inputs originated from clearly different locations (Gori et al., 2008). In the present study we explicitly tested whether a large discrepancy between the two inputs has an impact on integration. In fact, we confirmed that children do not integrate visual and haptic inputs when the discrepancies between the two inputs are large, but rather focus on a single sense (here: vision). However, we also provide evidence that with small intersensory discrepancies and cues promoting common origin, 6-year-old children integrate the two inputs, much as adults do. Hence, we conclude that children's ability to integrate information from different senses develops earlier than suggested before. Other studies are in line with this conclusion. As an example, King et al. (2010) found multisensory-motor integration (vision and proprioception) in children aged between 7 and 13 years that was dependent on the acuity, i.e., the reliability of the single senses. The authors assume that the processes involved in multisensory integration are similar in children and adults.
Furthermore, the already mentioned McGurk effect found in 4- to 10-year-old children, provides a strong case for the general abilities of pre-school children to integrate information across different senses. Massaro et al. (1986) demonstrated that children displayed the McGurk effect, but weighted the auditory information more strongly than adults. Obviously, the weighting depended on children's inferior abilities in lip-reading: children (as well as adults) who were better in lip-reading weighted the visual information more strongly than children who were less proficient in lip-reading. However, the basic integration processes were the same, according to the authors.

The finding of suboptimal integration in the small discrepancy condition seems at first glance at odds with previous findings on adults' multisensory integration as being optimal (Ernst and Banks, 2002; Alais and Burr, 2004; Helbig and Ernst, 2007a). However, in contrast to many previous studies we presented comparison stimuli only to a single sense and, thus, focused on the automatic aspects of integration in the bisensory conditions while diminishing deliberate ones (cf. Ernst, 2006). The present data do not allow us to decide whether integration including also deliberate aspects would have been optimal. Still, the findings provide clear evidence for integration in the small discrepancy condition. It is less clear how visual and haptic inputs were used in the large discrepancy condition. Note that our results are only partly consistent with the previous literature. Like previous studies using the lens paradigm with large intersensory discrepancies (Misceo et al., 1999), we observed dominance of a single sense in children's bisensory judgments, and unambiguous contributions of both senses in adults. However, as opposed to the study by Gori et al. (2008) using bi-partite objects that found optimal integration in the adult but not in the children sample, we found no evidence of optimal integration in either the child or the adult sample. In the present study, the condition with large intersensory discrepancies led to a performance in both age-groups that rejected models of integration. The difference between the studies might originate in a particularity of the bi-partite task. Earlier work on the bi-partite task has shown that for adults task instructions regarding a shared origin of visual and haptic inputs are required to promote integration (Miller, 1972). We, hence, speculate that the explicit cognitive cues on the same origin given in the study of Gori et al. (2008) might have been differentially efficient in children and adults, while in the present study only implicit cues on the origin of the inputs were given that were similarly efficient in the two groups. Hence, we observed the same behavior in adults and children in the large discrepancy condition. It is, however, not entirely clear how participants combined the two inputs in this context. While we can reject integration, the same is true for probabilistic cue switching as an overall explanation. However, it might be the case that the overall data reflect a mixture of different combination strategies: e.g., some individuals might have switched while others might have integrated, or single individuals might have alternated between these strategies. Fluctuations in the weights over the experiment would also be able to predict high bisensory variances. The present data, however, allow no distinction between these options and further research is required on this issue. It is important to note that while large discrepancies hindered a proper integration in both adults and children, they probably did not lead to a complete single-cue strategy.

Altogether, we can conclude, however, that children combine multisensory information in similar ways as adults do, both under conditions promoting and hindering integration.

## **ACKNOWLEDGMENTS**

Thanks to Max Krefft and Manuela I. Vielhaber for running the experiments. This work was supported by a grant from the Deutsche Forschungsgemeinschaft (DR 730/1-2, FOR 560). A preliminary partial analysis of the data has been previously presented in a conference paper (Drewing and Jovanovic, 2010).

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 October 2013; accepted: 16 January 2014; published online: 30 January 2014.*

*Citation: Jovanovic B and Drewing K (2014) The influence of intersensory discrepancy on visuo-haptic integration is similar in 6-year-old children and adults. Front. Psychol. 5:57. doi: 10.3389/fpsyg.2014.00057*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Jovanovic and Drewing. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Modality-specific organization in the representation of sensorimotor sequences

## *Arnaud Boutin\*, Cristina Massen and Herbert Heuer*

*IfADo – Leibniz Research Centre for Working Environment and Human Factors, Dortmund, Germany*

#### *Edited by:*

*Christine Sutter, Rheinisch-Westfaelische Technische Hochschule Aachen University, Germany*

#### *Reviewed by:*

*Martina Rieger, University for Health Sciences, Austria Francesca Perini, Bangor University, UK*

#### *\*Correspondence:*

*Arnaud Boutin, IfADo – Leibniz Research Centre for Working Environment and Human Factors, Ardeystraße 67, D-44139 Dortmund, Germany e-mail: boutin@ifado.de*

Sensorimotor representations of movement sequences are hierarchically organized. Here we test the effects of different stimulus modalities on such organizations. In the visual group, participants responded to a repeated sequence of visually presented stimuli by depressing spatially compatible keys on a response pad. In the auditory group, learners were required to respond to auditorily presented stimuli, which had no direct spatial correspondence with the response keys: the lowest pitch corresponded to the leftmost key and the highest pitch to the rightmost key. We demonstrate that hierarchically and auto-organized sensorimotor representations are developed through practice, which are specific both to individuals and stimulus modalities. These findings highlight the dynamic and sensory-specific modulation of chunk processing during sensorimotor learning – sensorimotor chunking – and provide evidence that modality-specific mechanisms underlie the hierarchical organization of sequence representations.

**Keywords: sensorimotor representation, stimulus modality, chunking, implicit/explicit processing, sequence learning**

## **INTRODUCTION**

In daily life we are surrounded by multiple sources of sensory information (Robertson and Pascual-Leone, 2001). Our capacity to act on the external world by efficiently gathering and processing sensory information coming from different modalities (e.g., visual, auditory) is a fundamental aspect of human cognition, which constitutes the bedrock for coherent and skilled behaviors (see Conway and Christiansen, 2005). Understanding the ability to integrate and represent behaviorally relevant sensory information devoted to action production is a central issue in the sensorimotor control and learning literature (e.g., Robertson and Pascual-Leone, 2001; Abrahamse et al., 2009; see Abrahamse et al., 2010, for review; Boutin et al., 2010). Here we ask whether the organization of the internal representation of a sensorimotor sequence is affected by the modality – visual versus auditory – of the sensory signals.

Insights into how the brain represents sensorimotor skills are provided by sequence learning paradigms (e.g., Verwey, 2001; Verwey et al., 2010). These paradigms are highly suitable for the study of the organization of relevant environmental information for action production (e.g., Boutin et al., 2010). Learning of complex serial behaviors involves the binding of discrete, independent actions into unified sequences of actions, called motor chunks (e.g., Sakai et al., 2004, for review Verwey et al., 2010). It has been suggested that improved performance in the course of learning entails a gradual transition from a sequence of individual movements to the preparation and execution of one or more series of movements, which is the hallmark of chunk processing (e.g., Verwey et al., 2002; Rhodes et al., 2004). Several lines of evidence suggest that the resultant segmentation of the movement sequence reflects a hierarchical organization at the representational level (Wright et al., 2010). In theory, processing within a motor chunk is considered to be carried out automatically by the motor system, while processing between motor chunks is thought to be controlled by the cognitive system (e.g., Rushworth et al., 2004; Sakai et al., 2004).

Operationally, motor chunks are defined by certain characteristics of a response-time profile, where response times (RTs) are plotted as a function of the serial position of the responses within a sequence. The response-time profile is not only determined by the physical characteristics of the responses, such as the fingers used, and the transitions between responses, such as within-hand and between-hand transitions, but primarily by characteristics of the sequence representation. In particular, a long RT followed by one or more considerably shorter RTs marks the beginning of a chunk (Verwey et al., 2010). The organization of the sequence representation can go beyond the chunking of individual responses, with chunks becoming integrated into higher-level units, so that a hierarchical organization emerges (e.g., Povel and Collard, 1982; Rosenbaum et al., 1983; Koch and Hoffmann, 2000).
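The operational criterion above can be illustrated with a minimal sketch. It assumes a hypothetical numeric threshold (`ratio=1.3`), since no fixed cut-off is stated here; the function name `chunk_starts` and the RT values are invented for illustration and are not taken from the study.

```python
def chunk_starts(rts, ratio=1.3):
    """Flag serial positions (0-based) that may mark the beginning of a chunk.

    A position is flagged when its RT is at least `ratio` times the next RT
    and, unless it is the first position, longer than the preceding RT.
    `ratio` is a hypothetical threshold chosen only for this illustration.
    """
    starts = []
    for i in range(len(rts) - 1):
        much_longer_than_next = rts[i] >= ratio * rts[i + 1]
        local_peak = i == 0 or rts[i] > rts[i - 1]
        if much_longer_than_next and local_peak:
            starts.append(i)
    return starts


# Invented response-time profile (ms) for a 12-element sequence.
example_profile = [620, 380, 360, 590, 370, 350, 365, 600, 390, 370, 380, 360]
print(chunk_starts(example_profile))
```

With this invented profile, the long RTs at serial positions 1, 4, and 8 (indices 0, 3, and 7) are flagged, which would correspond to a segmentation into three chunks.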

In previous studies of chunking and the hierarchical organization of sequence representation, mostly sequential key-press tasks with visual-spatial stimuli have been used (e.g., see Sakai et al., 2004, for review; Verwey et al., 2010; Wright et al., 2010). In addition, the sequence was typically constructed with an inherent and obvious organization (Koch and Hoffmann, 2000). The advantage of such sequences is that almost all participants adopt the a priori organization, as reflected by mean response-time profiles. In contrast, when the sequence does not adhere to an obvious organization, individual participants adopt individual organizations (Sakai et al., 2003). In the present study, we test whether individual organizations of a sequence, which is void of an obvious inherent organization, reflect the modality of the stimuli.

"fpsyg-04-00937" — 2013/12/9 — 17:20 — page 1 — #1

Sequences can be represented in terms of the stimuli, in terms of the responses, or in terms of stimulus-response (S–R) compounds (see Abrahamse et al., 2010, for review). The hypothesis that the organization of sequence representations might be affected by stimulus modality could be taken to imply a stimulus component of the representation. However, an effect of stimulus modality could also occur if only motoric features of the responses were represented. For instance, visual and auditory stimulus sets would in general have different S–R compatibility with the response set, as is the case in the present experiment. These differences are likely to be associated with different variations of RTs across the sequential responses. The role of temporal factors, such as longer inter-stimulus intervals, for the organization of sequence representations is well known (e.g., Stadler, 1995). Thus, a regular pattern of RTs could shape the organization of the sequence representation, even if only motoric features were represented, and this shaping should be different for different response-time patterns associated with different stimulus modalities.

There are other differences, such as the spatial frame which is present for visual, but not for auditory stimuli. Such a spatial frame could support the organization in other units than the ones preferred without a frame. Consider the characterization of relational structures that can be used in organizing a sequence representation (example elements are 1 2 3 4), such as runs (1 2 3...), trills (1 2 1 2...), repetitions (1 1 1...), reflections (1 2 4 3...), and transpositions (123234...; Restle, 1970). Without a spatial frame, for instance, reflections might be less conspicuous than with such a frame, which provides boundaries at which reflections could occur. More generally, biases toward certain organizations are likely to be different for spatial patterns of successive visual-stimulus locations and for "musical" patterns of successive auditory-stimulus pitches. Even without specific hypotheses on what these differences are, they should affect the individual organization of a to-be-learned sequence that is not dominated by an inherent and obvious organization.

In this study, we contrast two practice conditions with different modality-based S–R compatibility with the response set. Specifically, both practice groups are required to perform the very same sequence of motor responses. However, their practice conditions differ in terms of sensory stimuli (visual and auditory). That is, sequence production relies on different sensory-motor mappings in the two groups: visual-motor and auditory-motor. Hence, based on the assumption that sensory-based mechanisms contribute to sequence structuring and the formation of motor chunks, we hypothesize that practice of a motor sequence that does not convey any a priori hierarchical organization leads to the development of individual and modality-specific organizations of sequence representations.

## **MATERIALS AND METHODS**

## **PARTICIPANTS**

Thirty undergraduate students from TU Dortmund University participated in this study in exchange for course credit or 15€. They were randomly assigned to the "Visual" (*N* = 15; Mean age = 22.2 ± 1.9 years; six females) and "Auditory" (*N* = 15; Mean age = 23.2 ± 2.8 years; four females) practice conditions when arriving at the laboratory. All participants were right-handed according to the Edinburgh Handedness Inventory (Oldfield, 1971), had normal or corrected-to-normal vision, and were unaware of the specific purpose of the study. They gave written informed consent prior to participation in the experiment, which was conducted with the general approval of the local ethics committee.

## **APPARATUS AND STIMULI**

The display was viewed from a distance of approximately 50 cm. It showed four horizontally aligned squares presented white-on-black in the center of the screen. The squares were 2 cm wide and 2 cm high, spaced 1 cm apart. The mode of stimulus presentation (visual or auditory) was dependent on group assignment (see **Figure 1**). In the visual condition, a stimulus was one of the four squares filled green. In the auditory condition, stimuli were computer-synthesized tones of 50 ms duration, presented binaurally through stereo headphones. The four different tones used in this study had frequencies of 300, 675, 1552, and 3565 Hz. Stimulus presentation and response registration were controlled by custom-made programs using the Matlab® R2011b software (The MathWorks, Inc., Natick, MA, USA) and the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007).

## **TASK AND PROCEDURE**

In the visual condition, participants were required to respond to visually presented stimuli (filled squares), which were spatially compatible with the response keys. In the auditory condition, participants were required to respond to auditorily presented stimuli, which had no direct spatial correspondence with the response keys. Each tone was assigned to a unique key on the computer keyboard, with the lowest pitch corresponding to the leftmost key (i.e., index finger) and the highest pitch to the rightmost key (i.e., little finger). **Figure 1** illustrates the S–R mapping used in both groups together with the practiced sequence and the test sequence.

In the auditory condition the S–R mapping was more difficult than in the visual condition (see Buchner et al., 1998). Therefore, participants assigned to the auditory condition underwent an initial *familiarization* phase of unrecorded trials in order to make them familiar with the (tone-key) S–R mapping, and to avoid high error rates during practice. They had to complete a 40-element sequence of randomly presented tones by depressing the corresponding keys. The instructions emphasized accuracy, which had to be above 85%. Only when participants had reached the pre-set learning criterion for the tone-discrimination performance, the practice phase started.

During the practice phase, participants were required to respond as rapidly and accurately as possible to a sequence of stimuli (visual or auditory) by depressing the appropriate response keys with their dominant right hand on a standard German QWERTZ keyboard. They held their right-hand index, middle, ring and little fingers on the response keys V, B, N, and M, respectively. Each practice trial began with the presentation of four empty squares. The first imperative stimulus was presented after a random foreperiod of 1–3 s (in 0.5-s steps). The response of the participant triggered the presentation of the next stimulus, and so forth until the end of the practice block. Each block consisted of 10 repetitions of a 12-element sequence. The time needed to produce the 120 key presses was shown as feedback at the end of each practice block for 5 s. The display was then erased, and the screen remained black for 20 s. Breaks were inserted between the 14 blocks that composed the practice phase to prevent fatigue. When participants were ready to proceed with the next practice block, they pressed any one of the four response keys.

To differentiate sequence learning from generalized practice effects, a test block with a new sequence of stimuli was presented after the end of practice. Performance with the 12-element test sequence served as a reference to determine sequence-specific learning of the practiced sequence (see Abrahamse et al., 2010). The order of stimuli in the test sequence was different from the practiced sequence, but the test sequence contained all of the two-finger transitions that composed the training sequence. In both the practiced and the test sequence, the same key was never pressed twice in succession, and the same two-finger transition never occurred twice. No mention was made of the regularities in the order of stimuli.
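The constraints on the two sequences can be checked mechanically. The sketch below is illustrative only: the actual sequences are given in Figure 1 and are not reproduced here, so `practiced` and `test_sequence` are hypothetical examples, and treating the sequence as cyclic (because it was repeated within a block) is an assumption.

```python
def transitions(seq, cyclic=True):
    """Return the ordered two-element transitions of a sequence.

    With cyclic=True the transition from the last element back to the first
    is included, on the assumption that the sequence repeats without a break.
    """
    pairs = list(zip(seq, seq[1:]))
    if cyclic:
        pairs.append((seq[-1], seq[0]))
    return pairs


def satisfies_constraints(seq, cyclic=True):
    """True if no key repeats immediately and no transition occurs twice."""
    pairs = transitions(seq, cyclic)
    no_immediate_repeat = all(a != b for a, b in pairs)
    each_transition_once = len(set(pairs)) == len(pairs)
    return no_immediate_repeat and each_transition_once


def same_transition_set(seq_a, seq_b, cyclic=True):
    """True if both sequences are built from the same set of transitions."""
    return set(transitions(seq_a, cyclic)) == set(transitions(seq_b, cyclic))


# Hypothetical 12-element sequences over four keys (1 = leftmost key).
practiced = [1, 2, 1, 3, 1, 4, 2, 3, 2, 4, 3, 4]
test_sequence = practiced[::-1]

print(satisfies_constraints(practiced), satisfies_constraints(test_sequence))
print(same_transition_set(practiced, test_sequence))
```

With four keys there are 12 possible ordered transitions between different keys, so a cyclic 12-element sequence that satisfies both constraints uses each of them exactly once.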

After the test block participants were given a post-experimental free-recall test to evaluate their conscious awareness of the sequence, that is, their explicit knowledge. They were instructed to write down the sequential order of the 12 elements that composed the practiced sequence on a sheet of paper. Performance was scored by determining the number of serial positions for which the correct element was recalled.
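The scoring rule for the free-recall test amounts to a position-by-position comparison. The following sketch assumes that positions left blank count as incorrect, which is not stated explicitly; the sequences shown are hypothetical.

```python
def recall_score(recalled, practiced):
    """Count serial positions at which the recalled element is correct.

    Positions left blank are represented by None and never match
    (an assumption made for this illustration).
    """
    return sum(r == p for r, p in zip(recalled, practiced))


# Hypothetical practiced sequence and a fragmentary recall attempt.
practiced = [1, 2, 1, 3, 1, 4, 2, 3, 2, 4, 3, 4]
recalled = [1, 2, 1, 3, 2, 4, None, None, None, None, None, None]
print(recall_score(recalled, practiced))
```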

#### **DATA ANALYSIS**

Response time was defined as the time interval between stimulus onset and depression of the corresponding key. We designate the response times for the successive responses to the stimuli of the sequence as RT1, RT2 ... RT11, and RT12. For each block of trials, we determined the error rate and the mean of all response times (neglecting error trials). For the analysis of response-time profiles, we computed the means of RT1, RT2 ... RT11, RT12 from the 10 repetitions of the sequence in each practice block. These means were subjected to statistical analyses as detailed in the results section.
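As a rough illustration of how a response-time profile could be obtained from one block, the sketch below averages the RTs at each serial position across repetitions. Excluding error trials from the profile means is an assumption carried over from the block means; the data layout and the function name are hypothetical.

```python
def mean_profile(rts, correct):
    """Average RTs per serial position across repetitions of a sequence.

    rts:     list of repetitions, each a list of RTs (one per serial position)
    correct: list of the same shape with True for correct responses
    Error trials are skipped (an assumption); a position without any correct
    response yields NaN.
    """
    n_positions = len(rts[0])
    profile = []
    for pos in range(n_positions):
        valid = [rep[pos] for rep, ok in zip(rts, correct) if ok[pos]]
        profile.append(sum(valid) / len(valid) if valid else float("nan"))
    return profile


# Toy block with 2 repetitions of a 3-element sequence (invented values, ms).
toy_rts = [[500, 400, 300], [600, 420, 310]]
toy_correct = [[True, True, True], [True, False, True]]
print(mean_profile(toy_rts, toy_correct))
```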

## **RESULTS**

## **MEAN RT AND ACCURACY DURING PRACTICE**

Mean response times and error rates in the practice blocks are shown in **Figure 2**. They were submitted to separate 2 (Group: visual, auditory) × 14 (Block: 1–14) ANOVAs with repeated measures on the factor block. When relevant, Duncan's multiple range test was used to determine the specific effects contributing to the general ANOVA effects.

Analysis of RT during practice revealed a significant group × block interaction, *F*(13, 364) = 24.93, *p* < 0.001, $\eta_p^2$ = 0.47, reflecting group differences in the speeding-up of responses across practice blocks. *Post hoc* analysis indicated that both groups improved their performance from Block 1 to Block 14 (from 522 to 440 ms for the visual group, and from 868 to 468 ms for the auditory group; *ps* < 0.001). More specifically, participants of the visual group responded more rapidly than those of the auditory group from Block 1 (*p* < 0.001) to Block 3 (*p* < 0.01), but not from Block 4 (*p* = 0.07) to Block 14 (*p* = 0.73).
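For readers who wish to check an effect size of this kind, partial eta squared can be recovered from an *F* ratio and its degrees of freedom with the standard identity F × df_effect / (F × df_effect + df_error). The sketch below applies it to the interaction reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Group x block interaction for RT during practice: F(13, 364) = 24.93
print(round(partial_eta_squared(24.93, 13, 364), 2))
```

The rounded result agrees with the reported value of 0.47.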

Mean error rate during practice amounted to 3.4% in the visual group and 16.8% in the auditory group. The analysis revealed a significant group × block interaction, *F*(13, 364) = 2.85, *p* < 0.001, $\eta_p^2$ = 0.09, reflecting group differences in the development of error rates during practice. *Post hoc* comparisons detected a significant decline of the error rate in the auditory group across practice blocks (from 19.4 to 13.0%; *p* < 0.001), but not in the visual group (from 2.4 to 3.3%; *p* = 0.57).

#### **SEQUENCE-SPECIFIC LEARNING**

Mean response times and error rates in the test block are also shown in **Figure 2**. Differences to the last practice block reveal sequence-specific learning. RTs and error rates in the last practice block and the test block were analyzed in separate 2 (Group: visual, auditory) × 2 (Block 14, Test) ANOVAs with repeated measures on the factor block. When relevant, Duncan's multiple range test was used to determine the specific effects contributing to the general ANOVA.

Analysis of the response times revealed a significant group × block interaction, *F*(1, 28) = 7.43, *p* = 0.01, $\eta_p^2$ = 0.21. *Post hoc* comparisons indicated that both groups responded more rapidly on Block 14 than on the test block (440 and 508 ms for the visual group, *p* = 0.05; 468 and 665 ms for the auditory group, *p* < 0.001), which is the hallmark of sequence-specific learning. Moreover, the analysis indicated that the visual group was faster than the auditory group on the test block (*p* = 0.01), while no performance difference was observed on Block 14 (*p* = 0.64).

The analysis of the error rates revealed a significant group × block interaction, *F*(1, 28) = 9.95, *p* < 0.01, $\eta_p^2$ = 0.26. *Post hoc* analysis showed higher error rates for the auditory group on the test block than on Block 14 (22.3 versus 13.0%, *p* < 0.001), while no difference was observed in the visual group (4.2 versus 3.3%, *p* = 0.64).

### **RESPONSE-TIME PROFILES**

For each practice block, we computed the response-time profile across the 12 serial positions of the sequence. As expected, these profiles varied across participants. **Figure 3** shows example profiles of two participants, one in the visual and one in the auditory condition. Before plotting them, the profiles were normalized, that is, the deviations from the profile means were divided by the standard deviations of the RT1 ... RT12 of the respective profiles. In addition, the profiles were grouped into those during early practice (Blocks 1–5) and those during late practice (Blocks 10–14). During early practice the decline of response time was faster than during late practice, and more marked changes of the profiles were expected.

We analyzed the profiles by computing distances between them. More specifically, we used 1 − *r* as a distance measure, with *r* as the correlation between two profiles. Geometrically, each profile can be conceived as a vector in 12-dimensional space. The correlation between two profiles is equivalent to the cosine of the angle between the (mean-centered) vectors. Our measure of distance or dissimilarity is invariant against overall differences in response times between profiles as well as against different scalings of the RT variations across serial positions, but sensitive to shape differences, that is, to differences between the relative durations of the RTs in the profiles compared (see Heuer, 1984). The measure varies between 0 and 2 (respective correlations between 1 and −1), which corresponds to angles ranging from 0° to 180° between the respective vectors. The mean distances between pairs of profiles of blocks 1–5 (early practice) in **Figure 3** were 0.610 and 0.477 for the two participants in the visual and auditory group, respectively. For blocks 10–14 (late practice), they were 0.176 and 0.375, respectively.
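The distance measure and its invariance properties can be made concrete with a short sketch. It uses the Pearson correlation computed from mean-centered profiles; the function name and the example profile are hypothetical.

```python
import math


def profile_distance(a, b):
    """Distance 1 - r between two response-time profiles of equal length,
    where r is the Pearson correlation across serial positions."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(ss_a * ss_b)


# Invented 12-position profile (ms).
profile = [620, 380, 360, 590, 370, 350, 365, 600, 390, 370, 380, 360]
slower_and_stretched = [2 * rt + 100 for rt in profile]  # same shape
mirrored = [-rt for rt in profile]                       # opposite shape

print(profile_distance(profile, slower_and_stretched))
print(profile_distance(profile, mirrored))
```

A profile that is merely slower overall or more strongly scaled yields a distance of (numerically) 0, whereas a profile of opposite shape yields a distance of 2, the two ends of the range described above.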

**Figure 4A** shows the mean within-participant distances for early practice (Blocks 1–5; for each participant this was the mean of 10 distances computed between profiles from pairs of blocks 1–2, 1–3, 1–4, 1–5, 2–3, 2–4, 2–5, 3–4, 3–5, 4–5) and late practice (Blocks 10–14; for each participant this was the mean of 10 distances computed between profiles from pairs of blocks 10–11, 10–12, 10–13, 10–14, 11–12, 11–13, 11–14, 12–13, 12–14, 13–14) in the visual and auditory groups. We compared early and late practice by means of Wilcoxon signed-rank tests. Neither in the visual group, *T*(15) = 34, *p* > 0.10, nor in the auditory group, *T*(15) = 52, *p* > 0.20, was the distance between profiles significantly reduced during late practice. In addition, we compared the two groups by means of Mann–Whitney *U*-tests. These tests did not reach statistical significance, neither for early practice, *U* = 81, *p* > 0.10, nor for late practice, *U* = 122, *p* > 0.20.
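The mean within-participant distance for one practice phase is the average over all pairs of the five blocks. A minimal sketch, with hypothetical function names and invented four-position profiles (the statistical tests themselves are not reproduced here):

```python
import math
from itertools import combinations


def distance(a, b):
    """1 - Pearson correlation between two profiles of equal length."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(ss_a * ss_b)


def mean_pairwise_distance(block_profiles):
    """Mean distance over all unordered pairs of block profiles."""
    pairs = list(combinations(block_profiles, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Invented profiles of one participant in five successive blocks.
early_blocks = [
    [600, 420, 410, 500],
    [560, 430, 380, 470],
    [540, 380, 400, 480],
    [500, 390, 360, 450],
    [480, 370, 350, 420],
]
print(len(list(combinations(early_blocks, 2))))  # number of block pairs
print(mean_pairwise_distance(early_blocks))
```

Five blocks give the 10 pairs listed in the text.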

In **Figure 4A**, the mean within-participant distances are compared with mean between-participant distances. In 1000 simulated random samples, we re-shuffled participants within groups for each practice block. For each of these pseudo "participants" we computed the same mean distances as in the case of the within-participant distances, but now – for each distance – the second profiles (from the later practice blocks) were from a randomly chosen different participant. These between-participant distances were considerably larger than the observed within-participant distances. In fact, for none of the simulated samples was the mean distance smaller than the corresponding observed mean within-participant distance. Thus, profiles were significantly more similar within than between participants. This is clear evidence of specific individual response-time profiles.
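One possible reading of this re-shuffling procedure is sketched below for a single group and a single practice phase. The exact sampling scheme is an interpretation rather than a description of the original analysis code, and all names and data are hypothetical.

```python
import math
import random
from itertools import combinations


def distance(a, b):
    """1 - Pearson correlation between two profiles of equal length."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(ss_a * ss_b)


def mean(values):
    return sum(values) / len(values)


def within_mean(profiles):
    """profiles[p][b] is the profile of participant p in block b."""
    return mean([mean([distance(a, b) for a, b in combinations(blocks, 2)])
                 for blocks in profiles])


def simulated_between_mean(profiles, rng):
    """Same block pairs, but the later block comes from a random other participant."""
    n, n_blocks = len(profiles), len(profiles[0])
    per_participant = []
    for p in range(n):
        dists = []
        for b1, b2 in combinations(range(n_blocks), 2):
            q = rng.choice([i for i in range(n) if i != p])
            dists.append(distance(profiles[p][b1], profiles[q][b2]))
        per_participant.append(mean(dists))
    return mean(per_participant)


def count_below_observed(profiles, n_samples=1000, seed=0):
    """Number of simulated samples whose mean distance falls below the observed one."""
    rng = random.Random(seed)
    observed = within_mean(profiles)
    return sum(simulated_between_mean(profiles, rng) < observed
               for _ in range(n_samples))


# Invented data: three participants with stable but distinct profile shapes.
shapes = [[600, 400, 400, 400], [400, 600, 400, 400], [400, 400, 600, 400]]
toy = [[[rt * s for rt in shape] for s in (1.0, 0.9, 0.8)] for shape in shapes]
print(count_below_observed(toy, n_samples=100))
```

A count of zero across the simulated samples corresponds to the pattern reported above, in which no simulated mean fell below the observed within-participant mean.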

**Figure 4B** shows mean within-group (but between-participant) distances and mean between-group distances. For each participant and each block of trials we computed the mean distance to the profiles of the other 14 participants of the same group and of the 15 participants of the other group, and these means were averaged across Blocks 1–5 and Blocks 10–14. Thus, all distances were between profiles from the same practice block of different participants in same or different groups, whereas the distances shown in **Figure 4A** were always between profiles from different practice blocks of same or different participants within the same group. Note that the mean between-group distances shown in **Figure 4B** are identical for the two groups for mathematical reasons, whereas the standard errors are different. For each participant in each group, different distances entered the mean between-group distance, but across participants the set of distances was the same.
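The within-group and between-group distances for a single practice block can be sketched as follows. The profiles are invented and much shorter than the real 12-position profiles, and the function names are hypothetical. The last line illustrates why the group means of the between-group distances coincide: both average the same set of cross-group pairs.

```python
import math


def distance(a, b):
    """1 - Pearson correlation between two profiles of equal length."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(ss_a * ss_b)


def mean(values):
    return sum(values) / len(values)


def within_and_between(own_group, other_group):
    """For each participant in own_group, return a pair:
    (mean distance to the other members of own_group,
     mean distance to all members of other_group)."""
    result = []
    for i, own in enumerate(own_group):
        same = [p for j, p in enumerate(own_group) if j != i]
        result.append((mean([distance(own, p) for p in same]),
                       mean([distance(own, p) for p in other_group])))
    return result


# Invented profiles from one practice block (three participants per group).
visual = [[600, 400, 400, 500], [620, 410, 390, 480], [580, 420, 410, 520]]
auditory = [[400, 600, 500, 400], [390, 620, 480, 410], [420, 580, 520, 400]]

vis = within_and_between(visual, auditory)
aud = within_and_between(auditory, visual)
print(vis)
print(mean([b for _, b in vis]), mean([b for _, b in aud]))
```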

According to **Figure 4B**, within-group distances were smaller than between-group distances. For the statistical comparison by means of Wilcoxon signed-rank tests we collapsed both groups of participants. The within-group distances were significantly smaller both early in practice, *T*(30) = 3, *p* < 0.01, and late in practice, *T*(30) = 80, *p* < 0.01. In addition, **Figure 4B** shows larger distances later in practice than earlier. The increase from early to late practice was significant both for within-group distances, *T*(30) = 52, *p* < 0.01, and between-group distances, *T*(30) = 83, *p* < 0.01, and it did not differ significantly between the two types of distance, *T*(30) = 186, *p* > 0.20. Thus, response-time profiles were more similar within each of the two groups who practiced with different stimulus modalities than between these two groups, and similarity declined in the course of practice, that is, individual profiles became more dissimilar.

#### **EXPLICIT KNOWLEDGE**

The results of the free-recall test at the end of the experiment revealed that none of the participants was able to identify and report the entire practice sequence. They exhibited only fragmentary sequence recall with a mean of 5.2 ± 2.5 elements in the visual group, and a mean of 2.9 ± 1.3 elements in the auditory group. We compared the number of recalled elements between groups by means of Mann–Whitney *U*-tests. The statistical analysis indicated a significant difference between groups, *U* = 50, *p* < 0.01, revealing that the visual group expressed better explicit knowledge than the auditory group.

In an additional step we tested whether the difference in explicit knowledge could be critical for the smaller between-group (and between levels of explicit knowledge) than within-group (and within levels of explicit knowledge) similarity of response-time profiles. For this purpose we formed sub-groups with poorer and better explicit knowledge. For the visual group, there were nine participants with four or fewer correctly recalled elements and six participants with more than four correctly recalled elements; for the auditory group there were seven participants with two or fewer correctly recalled elements and eight participants with more than two correctly recalled elements. For each participant we contrasted the mean distance to the profiles of the other participants of the same group and the same sub-group with the mean distance to the other participants of the same group, but the other sub-group with a different level of explicit knowledge. Collapsed across all participants, early in practice the mean distances were 0.684 and 0.665 within and between sub-groups with different levels of explicit knowledge, respectively, and late in practice the mean distances were 0.792 and 0.767. According to Wilcoxon signed-rank tests, the differences were not significant, neither early in practice, *T*(30) = 168, *p* > 0.10, nor late in practice, *T*(30) = 161, *p* > 0.10.

## **DISCUSSION**

The present results reveal individual response-time profiles that become more different in the course of practice and, more importantly, a smaller variation of the profiles within than between the two groups who practiced with visual and auditory stimuli, respectively. Thus, the modality of the stimuli during sequence learning shapes the individual organization of the sequence representation, in particular the formation of motor chunks. These findings highlight the dynamic and sensory-specific modulation of chunk processing during sensorimotor learning and provide evidence of modality-specific mechanisms, which contribute to the hierarchical organization of sequence representations.

#### **SENSORY-BASED MECHANISMS FOR MOTOR CHUNKING**

The present findings show that individuals organize sequence representations in different ways. Response-time profiles in successive practice blocks of the same person were more similar than response-time profiles in successive practice blocks of different persons. In addition, response-time profiles of different persons in the same practice blocks became more dissimilar in the course of practice. This elaborates observations according to which chunked representations of sequences are formed even when there is no a priori organization of the sequence, but these chunked representations differ between individuals (Sakai et al., 2003). In addition to the individual specificity of sequence organization, we show specificity for stimulus modalities. Thus, individual factors combine with the influence of the stimulus modality so that the response-time profiles at each stage of practice are more similar for persons for whom the stimulus modality is the same than for persons for whom the stimulus modality is different. In which way stimulus modalities shape the organization of sequence representations is currently as unknown as the individual factors that shape the organization. Only hypotheses are possible at the time being.

With respect to the underlying neural structures, there is considerable evidence that basal-ganglia circuits contribute to the formation of newly acquired skills by promoting the gradual structuring of the entire set of actions into ordered subsets (e.g., Graybiel et al., 1994; Graybiel, 1998; see Graybiel, 2008, for review). Intracranial recordings from the human basal ganglia provide evidence of an integrative role of this structure in the processing of sensory, cognitive, and motor information (see Bares and Rektor, 2001). Thus, considering that the basal ganglia also contribute to S–R learning (Graybiel, 1998), this opens the possibility that both visual and auditory sensory inputs participate in the shaping of the hierarchical organization of sequential sensorimotor behaviors.

The present study started with the hypothesis that the individual organization of sequence representations might be shaped by stimulus modalities. This hypothesis was confirmed, and it opens the question of how the effects of stimulus modalities come about. Regarding this question, we have no conclusive answer, but only a number of possibly contributing factors. The first type of factors relates to spatial characteristics of the visual stimuli that were not inherent to the auditory stimuli. This difference between stimulus sets is accompanied by a number of differences that might affect sequence representations. First, there is a difference in mean response time because of different levels of S–R compatibility. This difference could affect the degree to which stimulus characteristics are included in the sequence representation. Second, there are probably different patterns of delays between successive responses, which could shape the sequence representation. Third, the presence versus absence of spatial stimulus characteristics goes along with the presence of a spatial reference frame for visual stimuli, but not for auditory stimuli. Such a frame could modulate the conspicuity of certain relational structures such as reflections.

The second type of factors relates to the self-organizing tendencies that are inherent to sequences of visual and auditory stimuli. The spontaneous organization of both concurrently and successively presented stimuli, both visual and auditory, has been studied since the emergence of Gestalt psychology (e.g., Wertheimer, 1923), and it should be different for the stimulus sets of the present study. However, at present the spontaneous organizations of the stimulus sequences remain unknown because they are neither evident nor have they been studied empirically. Thus, although we provide firm evidence of modality-specific organization of sequence representations, the nature of the mechanisms involved remains an unsolved problem.

## **IMPLICIT LEARNING, EXPLICIT LEARNING, AND CHUNKING WITH VISUAL AND AUDITORY STIMULI**

Sequential learning, and the accompanying chunking, is a multifaceted process with both implicit and explicit components (e.g., Nissen and Bullemer, 1987; Hikosaka et al., 2002; Robertson et al., 2004). The interplay between implicit (unconscious) and explicit (conscious) processing during sequence learning has long been the subject of theory and research (e.g., see Cleeremans et al., 1998, for review; Destrebecqz and Cleeremans, 2001; Robertson et al., 2004). Typically, while learners show improved performance when the same behavior is rehearsed, they often fail to exhibit verbalizable (explicit) knowledge about the acquired information (Willingham et al., 1989). This kind of learning is considered to be implicit (Cleeremans et al., 1998). So far, most studies have explored the extent to which awareness of the sequence relates to performance and learning (e.g., Destrebecqz and Cleeremans, 2001; Robertson et al., 2004). However, to the best of our knowledge, nothing is known about the extent to which explicit knowledge about the sequence affects the binding of the associated motor responses. We shall discuss these relations of explicit knowledge to learning and to chunking in turn.

A common indicator of (implicit) sequence learning is the difference between response times at the end of practice and in a test block in which the practiced sequence is replaced by a new one, often a random sequence. In the present study these learning scores were 68 ms in the visual group, but 197 ms in the auditory group. According to established standards, one would conclude that implicit learning was better with auditory than with visual stimuli. However, contrary to empirical evidence (e.g., Curran and Keele, 1993) and theoretical underpinnings (e.g., Keele et al., 2003), poorer learning scores in the visual group went along with better explicit knowledge, while better learning scores in the auditory group went along with poorer explicit knowledge. This finding nourishes doubts that the learning score is always an adequate measure of sequence-specific learning. These doubts are justified as soon as different levels of S–R compatibility are involved. The reasons are detailed in the following.
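The learning score described above can be illustrated with a minimal sketch. It assumes that the score is the mean response time in the test block minus the mean response time in the final practice block; the function name and the response times are hypothetical and serve only as an illustration, not as data from the present study.

```python
from statistics import mean


def learning_score(practice_end_rts_ms, test_block_rts_ms):
    """Mean RT in the test block minus mean RT at the end of practice (ms).

    A positive score means responses slowed down when the practiced
    sequence was replaced, which is taken as sequence-specific learning.
    """
    return mean(test_block_rts_ms) - mean(practice_end_rts_ms)


# Hypothetical response times in milliseconds (illustration only).
practice_end = [400, 420, 410, 390]   # final block with the practiced sequence
test_block = [480, 470, 500, 470]     # block with a new or random sequence

score = learning_score(practice_end, test_block)
print(score)
```

As argued in the text, such a score should be interpreted with caution when groups differ in S–R compatibility, because the test-block term depends on stimulus-based response selection.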

Consider initial performance in the present experiment. Response times were clearly faster in the visual group than in the auditory group because of different levels of S–R compatibility. At the end of practice this difference had almost disappeared, although effects of S–R compatibility typically survive extended practice periods, though they become smaller (Dutta and Proctor, 1992). Thus, in the present experiment the difference between the two groups would not have disappeared because the difference in S–R compatibility vanished as a result of practice, but because the stimuli, and thus the S–R mapping, became largely irrelevant for response selection. This is a consequence of acquiring a sequence representation and making use of it. The almost identical response times in the visual group and the auditory group at the end of practice could even be taken to suggest that the stimuli did not play any role whatsoever for response selection, and the sequence representations were based on response features only. This situation changes when a new sequence replaces the practiced sequence. Then response selection is based on stimuli again, and different levels of S–R compatibility matter. Thus, the common learning score is affected by S–R compatibility, in particular by the differences between the simple mapping of sequence representations on responses and the high or low compatibility of S–R mappings. In general, learning scores should be larger when the S–R compatibility is low.

Regarding the question whether conscious awareness of the motor sequence influences sensorimotor chunking, the present data provide a partial answer. Response-time profiles of participants with similar levels of explicit knowledge were not more similar between participants than response-time profiles of participants with dissimilar levels of explicit knowledge. Accordingly, the dissimilar profiles of the two practice groups cannot be attributed to their difference with respect to explicit knowledge. Further, response-time profiles are not shaped by explicit knowledge in a way comparable to how they are shaped by stimulus modality.

Thus, we tentatively suggest that the hierarchical organization of a sensorimotor sequence – sensorimotor chunking – is essentially an implicit process. This suggestion is tentative because explicit knowledge could modulate characteristics of response-time profiles for which our analysis is insensitive, such as the variability of response times across serial positions.

## **CONCLUSION**

We demonstrated that the sensory signals guiding task production have an important influence upon the process of skill acquisition. The structured response pattern that emerged through practice is dependent on the available sensory information, suggesting that sensory-based mechanisms mediate the formation of motor chunks. The findings indicate an individual and modality-specific organization in the representation of sensorimotor sequences. Our results challenge purely motor-based accounts of chunk processing and lend support to the claim that sensory-based mechanisms underlie motor chunking, that is, sensorimotor chunking.

## **ACKNOWLEDGMENTS**

The German Academic Exchange Service (A/11/93877) and the German Research Foundation (BO 3797/1-1) provided support for this research.

## **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 September 2013; accepted: 26 November 2013; published online: 11 December 2013.*

*Citation: Boutin A, Massen C and Heuer H (2013) Modality-specific organization in the representation of sensorimotor sequences. Front. Psychol. 4:937. doi: 10.3389/fpsyg.2013.00937*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2013 Boutin, Massen and Heuer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Human haptic perception is interrupted by explorative stops of milliseconds

#### *Martin Grunwald<sup>1</sup>\*, Manivannan Muniyandi<sup>2</sup>, Hyun Kim<sup>3</sup>, Jung Kim<sup>4</sup>, Frank Krause<sup>1</sup>, Stephanie Mueller<sup>1</sup> and Mandayam A. Srinivasan<sup>3</sup>*

*<sup>1</sup> Haptic-Research Laboratory, Paul-Flechsig-Institute for Brain Research, University of Leipzig, Leipzig, Germany*

*<sup>2</sup> Department of Applied Mechanics, Biomedical Engineering, Indian Institute of Technology, Madras, Chennai, India*

*<sup>3</sup> Laboratory for Human and Machine Haptics, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, USA*

*<sup>4</sup> Department of Mechanical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea*

#### *Edited by:*

*Jochen Müsseler, RWTH Aachen University, Germany*

#### *Reviewed by:*

*Holger Mitterer, University of Malta, Malta*

*Christian Frings, University of Trier, Germany*

#### *\*Correspondence:*

*Martin Grunwald, Haptic-Research Laboratory, Paul-Flechsig-Institute for Brain Research, University of Leipzig, Johannisallee 34, 04103 Leipzig, Germany e-mail: mgrun@medizin.uni-leipzig.de*

**Introduction:** The explorative scanning movements of the hands have been compared to those of the eyes. The visual process is known to be composed of alternating phases of saccadic eye movements and fixation pauses. Descriptive results suggest that during the haptic exploration of objects short movement pauses occur as well. The goal of the present study was to detect these "explorative stops" (ES) during one-handed and two-handed haptic explorations of various objects and patterns, and to measure their duration. Additionally, the following were analyzed: (a) the association between mean exploration time and duration of ES, (b) the association between certain stimulus features and ES frequency, and (c) the duration of ES during the course of exploration.

**Methods:** Five different Experiments were used. The first two Experiments were classical recognition tasks of unknown haptic stimuli (A) and of common objects (B). In Experiment C space-position information of angle legs had to be perceived and reproduced. For Experiments D and E the PHANToM haptic device was used for the exploration of virtual (D) and real (E) sunken reliefs.

**Results:** In each Experiment we observed explorative stops of different average durations. For Experiment A: 329.50 ms, Experiment B: 67.47 ms, Experiment C: 189.92 ms, Experiment D: 186.17 ms and Experiment E: 140.02 ms. Significant correlations were observed between exploration time and the duration of the ES. Also, ES occurred more frequently, but not exclusively, at defined stimulus features like corners, curves and the endpoints of lines. However, explorative stops do not occur every time a stimulus feature is explored.

**Conclusions:** We assume that ES are a general aspect of human haptic exploration processes. We have tried to interpret the occurrence and duration of ES with respect to the Hypotheses-Rebuild-Model and the Limited Capacity Control System theory.

**Keywords: haptic exploration, movement stops, finger exploration, active touch perception, haptic perception process**

## **INTRODUCTION**

The sense of touch has already been described by Aristoteles (1986) and Weber (1846) as the most complex sensory system of humans. Fundamental Experiments by von Skramlik (1937), Gibson (1962), and Revesz (1950) revealed that we have to distinguish between active touch (haptic perception) and passive touch (tactual perception). On account of its connection to motor processing, haptic perception is, among others, elementary for learning, body image, body schema, orientation in space, motor control, sexual activities, and perception of the blind (Schiff and Foulke, 1982; Heller and Schiff, 1991; Hatwell et al., 2003; Grunwald, 2008). Despite the huge importance of haptic perception for humans, far more studies exist on the topic of passive, tactile stimulus perception. One reason for this may be the methodological difficulties posed by the investigation of ten-finger tasks. Even though various studies concerning human haptic perception already exist, many aspects of information processing during haptic perception are still to be explained. In particular, the specifics of complex finger and body movements need to be investigated more thoroughly in healthy and unhealthy humans. Early on, the pioneers of haptic research (among others, Revesz, 1950, and Katz, 1925) pointed out that it is crucial for the comprehension of human haptic perception to understand how and with what kind of exploratory procedures surface and object characteristics are observed (e.g., with the fingers). Therefore, the precise analysis of exploratory procedures is essential to understanding the dynamics of movement and exploration during haptic perception. Accordingly, the analysis of these processes has the same significance as the precise analysis of ocular movements for the comprehension of visual perception. The exact knowledge of the interactions between visual scanning movements and cognitive stimulus processing has led to considerable methodological as well as substantive progress in the field of visual research (Rayner, 1995; Kennedy, 1997; Walker, 1997; Inhoff et al., 2002; Vaughan, 2002; Findlay, 2005; Lansdown, 2005; Klein and Ettinger, 2008; Sui, 2008; Wade, 2010).

Concerning object exploration it is known that human haptic perception is accompanied by different touch movements, especially of the fingers. With the help of these active exploration movements, stimulus features are detected by different receptors (thermal receptors, vibration receptors, pain, and pressure receptors of the skin, muscles, tendons, soft tissue and joints). Furthermore, some studies have shown that exploration movements depend on task features as well as stimulus features (Lederman and Klatzky, 1987, 1993; Klatzky et al., 1993; Klatzky and Lederman, 1995). As early as the middle of the last century, Ananev stated that touch and exploration movements of haptic perception include phases during which the fingers or hands hardly move or do not move at all (Ananev et al., 1959). He found that interruptions of movements occurred primarily on corners and edges. On a descriptive level it has, therefore, been known for quite a while that haptic explorative finger movements are interrupted by pauses. Lederman and Klatzky (1987) have described several "exploratory procedures" (EP) that were typically used by test subjects to explore object properties. To capture these global exploratory procedures, video footage was analyzed frame by frame. Their classification of EPs includes two static procedures: "static contact" and "unsupported holding." The authors postulated that "static contact" is used to perceive object temperature and that "unsupported holding" is associated with the perception of object weight. The durations of these procedures were between 0.01 and 0.08 s for static contact and between 0.03 and 2.12 s for unsupported holding.

However, besides these global object EPs, little is known about the haptic perception of complex structures (e.g., raised-line pictures). The breaks and pauses that may occur during the exploration of complex haptic features have hardly been analyzed in healthy humans so far.

In view of these facts, the consensus seems to be that haptic exploration is strongly linked with exploratory procedures. But it remains unclear to what purpose, why, when and for how long the explorative movements of the fingers stop. A theoretical and functional integration of explorative stops (ES) into the knowledge base of the haptic perception process is missing to date. That this problem has been addressed so little so far is all the more surprising as the direct comparison of explorative finger movements and eye movements virtually suggests itself. More than half a century ago Russian psychologists formed the first theoretical ideas that explorative hand and eye movements may be similar to each other (Zinchenko, 1957; Leontew, 1959; Zinchenko and Ruzskaia, 1962). A central aspect of this comparison concerns the scanning movements that both the hands and the eyes require in order to perceive. Alternations of saccadic movement and fixation periods that occur during the active visual process are well established. The oculomotor actions of vision are marked by a perpetual alternation between fixation pauses and saccadic eye movements. These fixation pauses are neither accidental occurrences nor an epiphenomenon of the oculomotor system. Results from cognition research and eye movement research have shown that the duration of the fixation pauses is associated with stimulus complexity (e.g., Krause, 1988; Kaller et al., 2009). The duration of the fixation pauses of the eyes increases with increasing complexity of the stimuli and, therefore, with increasing cognitive demands. Several theoretical concepts and empirical studies document a direct correlation of visual information processing and oculomotor activity (Lüer et al., 1988; Liversedge and Findlay, 2000; Engbert and Kliegl, 2004; Martinez-Conde et al., 2004; Thomas and Lleras, 2007; van Gompel et al., 2007; Hutton, 2008).

Since we start with the premise that a psycho-physiological correspondence exists between the visual and the haptic system, it may follow that the exploration process of the human fingers is composed of alternations of explorative movement and fixation periods as well. The present study was designed to capture possible ES of milliseconds during haptic exploration of various objects and surfaces. To make this possible, a measurement method with a higher temporal resolution than a video recording (frame-to-frame analyses) was necessary. Up to now, neither a digital measuring method able to measure the precise length of breaks during motion nor corresponding psychophysiological studies have been reported. Therefore, we have developed a new measuring method to obtain experimental evidence for the existence of ES during the haptic exploration of objects and patterns.

The present study consisted of 5 Experiments (A–E). The first part of the study consisted of Experiments A–C. The second part of the study (Experiments D and E) will be presented further below. Experiment A was used to test whether ES occur during the haptic exploration of sunken relief structures of unknown stimuli. In Experiment B the participants had to explore and recognize common objects. In Experiment C the spatial and angular position of angle legs had to be recognized and reproduced by the participants. The experimental settings and procedures are presented in **Figures 1A–D**. Further methodological descriptions are given in methods part one. We assumed that ES of milliseconds would occur during all three experiments (Hypothesis 1).

Furthermore, we expected to find that ES would occur during uni- and bi-manual haptic exploration (Hypothesis 2). To test Hypothesis 2, Experiments A and B were conducted single-handedly as well as with both hands.

In correspondence with findings from eye movement research we assume, furthermore, that the mean duration of the ES is associated with the familiarity and complexity of the stimuli (e.g., Krause, 1988; Kaller et al., 2009). Therefore, the duration of the ES should increase with increasing complexity and novelty of the stimuli. To test this, the difficulty and complexity of the stimuli differed between Experiments A–C. The stimuli within each Experiment (A–C), however, were similar in their difficulty and complexity. Accordingly, we expect to find the shortest ES during the exploration of common objects. The longest ES should occur during the exploration of the unknown sunken relief stimuli. It is well established through visual research as well as haptic research that reaction and recognition time are associated with the complexity of a task or stimulus (Krause, 1978, 1981; Lüer et al., 1988; Grunwald et al., 1999, 2001c). The more complex a stimulus is, the longer are the corresponding times. Therefore, the processing times serve as a direct measure of the internal cognitive information processing procedures. If this relationship exists for the ES as well, we expect to find a positive correlation of mean exploration time (ET) and mean duration of ES (Hypothesis 3). If, however, the ES are a random and reflexive occurrence that is unrelated to the stimulus properties, no association between ET and ES should be found.
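The prediction of Hypothesis 3 can be illustrated with a minimal sketch of a correlation between mean ET and mean ES duration. The sketch assumes a Pearson product-moment correlation, which is an assumption made here for illustration and is not stated in this section; the function name and all values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-participant means in seconds (illustration only).
exploration_time_s = [10.0, 20.0, 30.0, 40.0]
es_duration_s = [0.10, 0.15, 0.25, 0.30]

r = pearson_r(exploration_time_s, es_duration_s)
print(round(r, 3))
```

A coefficient clearly above zero would be in line with Hypothesis 3, whereas a coefficient near zero would be expected if ES were unrelated to the stimulus properties.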

To map the precise spatial locations of ES the PHANToM device was used in the second part of the study. As described above, the fixation pauses of the visual system do not occur accidentally, but are directly associated with information processing. Analogously, we assume that the ES of haptic perception reflect cognitive information processing. Therefore, we assume that a spatial and temporal relationship exists between the occurrence and the duration of ES during haptic exploration and the stimulus properties. It is known from eye movement research that visual information processing occurs only during fixation pauses and not at all during saccadic eye movements. Correspondingly, we assume that the ES intervals represent phases of stimulus processing and sensory integration as well as aspects of motion control. Also, we assume that ES will not occur independently of the spatial structures of the stimulus features. We expect that ES will occur more often and with longer duration on stimulus areas which are high in information content (e.g., on corners, edges, and curves; Hypothesis 4). Along the same lines, ES should occur less often and with shorter duration on less complex stimulus areas like straight horizontal or vertical lines.

The perception of haptic patterns and objects is a serial process that requires gradual processing as well as cognitive integration of sensorimotor information parts, similar to visual stimulus processing. Therefore, we expect to find a temporal dynamic of (a) the frequency of ES occurrence and (b) the duration of ES during the course of haptic object and pattern recognition. Based on Richard Gregory's perception theory (Gregory, 1973) we believe that the haptic perception process consists of sequences of proposing and testing hypotheses until a final percept is generated. Therefore, we expect that the duration and frequency of ES will vary during the course of haptic exploration, especially on complex stimulus areas (e.g., corners) (Hypothesis 5). Specifically, we assume that the decoding of stimulus features at the beginning of haptic exploration will be accompanied by longer ES than at the end of the exploration.

To test hypotheses 4 and 5 a technology was necessary that would facilitate a high resolution analysis of the spatial stimulus features as well as of the temporal course of haptic exploration. Therefore, the PHANToM haptic device was used in Experiments D and E. The participants had to explore virtual and real sunken reliefs with the tip of the PHANToM device (**Figures 2A–C**; see methods part 2). Both virtual and real stimuli were explored with the help of the PHANToM device to assess whether differences exist in the exploratory procedures.

## **MATERIALS AND METHODS PART 1**

## *Measurement methods for experiments A–C*

During the haptic tasks of Experiments A–C, finger and hand movements were measured by a digital apparatus designed to measure the smallest changes within a magnetic field. The apparatus consisted of three linked, highly sensitive magnetic sensors (sensor type: KM 110BH/2310, Philips Semiconductors U.S.). The magnetic sensors were located within the stimulus fixture in Experiments A and B (**Figure 1**), whereas in Experiment C the magnetic sensors were located within the moveable angle legs. The sensitive measuring range of the sensors amounted to 9 cm. Small magnets (3 mm in diameter), which were glued to the dorsal side of the distal interphalangeal joints of the test persons, generated a measurable magnetic field. During finger and hand movements the magnetic field changed, and the electric output of the magnetic sensors varied between 0 and 300 μV, whereas during an absolutely motionless state the electric output varied only between 0 and 1 μV. Therefore, this measurement method had a very high temporal resolution. The output signals were recorded with a sampling rate of 166.66 Hz and were saved digitally. To record the measurements a digital EEG device (Walther Graphtek, Munich, Germany) was used. The measurement of ET began when the hands first touched the stimuli (Experiments A and B) or when they first touched the angle legs (Experiment C). The output signals of the magnetic sensors were recorded separately for each angle leg (Channel 1, 2). The analysis of the output signals was carried out with the software BRAIN VISION (Munich, Germany). Signals within a range of 0–1 μV were marked as ES. Output signals >1.0 μV were marked as motions.

virtual and real sunken relief stimuli in two types of orientation.
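The classification rule for the sensor output (0–1 μV marked as ES, values above 1.0 μV marked as motion, sampling rate 166.66 Hz) can be sketched as follows. This is only an illustration under stated assumptions and not the procedure implemented in the analysis software: the function name and the sample trace are hypothetical, amplitudes are assumed to be non-negative, and no minimum stop duration is imposed because none is given in this section.

```python
SAMPLING_RATE_HZ = 166.66   # sampling rate reported in the text
ES_THRESHOLD_UV = 1.0       # outputs of 0-1 microvolt are marked as ES


def find_explorative_stops(signal_uv,
                           sampling_rate_hz=SAMPLING_RATE_HZ,
                           threshold_uv=ES_THRESHOLD_UV):
    """Return (start_index, duration_ms) for each run of samples <= threshold."""
    stops = []
    run_start = None
    for i, value in enumerate(signal_uv):
        if value <= threshold_uv:
            if run_start is None:
                run_start = i          # a new motionless run begins
        elif run_start is not None:
            # Motion resumed: close the run and convert samples to ms.
            stops.append((run_start, (i - run_start) * 1000.0 / sampling_rate_hz))
            run_start = None
    if run_start is not None:
        # The trace ended while the finger was still motionless.
        n_samples = len(signal_uv) - run_start
        stops.append((run_start, n_samples * 1000.0 / sampling_rate_hz))
    return stops


# Hypothetical sensor output in microvolts (illustration only).
trace = [120.0, 80.0, 0.4, 0.7, 0.2, 150.0, 0.9, 0.5, 200.0]
stops = find_explorative_stops(trace)
for start, duration_ms in stops:
    print(start, round(duration_ms, 1))
```

At 166.66 Hz one sample corresponds to roughly 6 ms, which bounds the precision with which the duration of a stop can be determined by such a rule.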

## *Experiment A*

The test persons had to explore (with their fingers) the structure of different sunken reliefs while their eyes were closed. The structure of the reliefs consisted of milled traces with a depth of 3 mm and a width of 7 mm (**Figure 1B**). Optimal positioning of the stimuli in relation to the fingers was ensured by an adjustable holder. During haptic exploration the participants' forearms rested on a wide base in order to allow free movements of the fingers only. No arm and shoulder movements were made during haptic exploration. The ET per stimulus was not limited. After haptic exploration the participants were asked to open their eyes and to draw the perceived structure on a piece of paper. The test persons were prevented from gathering any visual information from the stimuli. They received no feedback on the quality of their reproduction or on the stimulus structure itself. Before the Experiment proper began, the participants were allowed to look at and explore a sample stimulus (that was not included in the following Experiment) to become familiar with the haptic material. They practiced the exploration task for 1 min.

Three task types were distinguished: left hand tasks, right hand tasks and two-handed tasks. Each participant had to complete all three tasks. To complete each task the participants had to explore and draw two sunken reliefs. In other words, they explored two sunken reliefs with the left hand, two with the right hand and two with both hands, consecutively. For the three task types different haptic stimuli were used to prevent memory effects (Richardson and Richardson, 1996). Based on the study by Ballesteros et al. (1997) we used one symmetrical sunken relief and one asymmetrical sunken relief for each task. The same haptic stimuli of sunken reliefs have been used before in psycho-physiological and clinical Experiments to investigate brain electrical changes, e.g., in patients with anorexia nervosa (Grunwald et al., 2001a,b), Alzheimer's disease (Grunwald et al., 2002a) and healthy participants (Grunwald et al., 1999, 2001c).

## *Experiment B*

In Experiment B the participants had to explore and recognize 15 common objects: 5 objects with the left hand (left hand tasks), 5 objects with the right hand (right hand tasks), and 5 objects with both hands (two-handed tasks). The following stimuli were used: corkscrew, pen, notebook, walnut, screwdriver, battery, toothbrush, glasses, candle, eggcup, crown cap, matchbox, cigarette lighter, woodscrew, blister pack (**Figure 1C**). The order of the stimuli as well as the order of the tasks varied between the test persons (some test persons started with right hand tasks, some with left hand tasks, and so on). No time limit was set for haptic exploration. While exploring, the participants' eyes were closed. Additionally, a shield prevented participants from seeing the stimuli. An acoustic signal indicated that participants should start with the haptic exploration. They were allowed to move and explore freely without restrictions as long as they did not lift the stimuli from the holder. As soon as the test persons recognized the common object, they were to take the hand (or hands) away from the stimulus and name the object. **Figure 1C** shows an example of a two-handed task with the stimulus "matchbox." Before the Experiment began, the participants performed a training trial with a pair of scissors (training stimulus) to become familiar with the course of the Experiment.

#### **Table 2 | Descriptive data for Experiment C (Angle Paradigm).**

*Mean Time of explorative stops (ES) in seconds, Mean number of explorative stops (N Stops), mean time of haptic exploration (ET) in seconds and ANOVA results.*

*Mean Time of explorative stops (ES) in seconds, mean number of explorative stops (N Stops), mean time of haptic exploration (ET) in seconds.*

## *Experiment C*

In Experiment C we used the experimental setting of the Angle Paradigm as outlined in **Figure 1D**. This setting has already been used in several studies (Grunwald et al., 2002b; Grunwald and Weiss, 2005). The design consisted of two angle legs, one of which had to be adjusted by the participant. We distinguished between two task types: right parallel tasks (right side tasks) and left parallel tasks (left side tasks). For the right side tasks the *left* angle leg was locked at a certain angle position and the participant was asked to bring the right angle leg into a position parallel to the locked left angle leg. In contrast, to solve the left side tasks the left angle leg had to be adjusted to the locked right angle leg. Each task type consisted of five different angular positions. No time limit was given and no visual feedback was provided. The starting position of the angle legs that had to be adjusted by the test subjects was 90°. All participants performed two training trials to become familiar with the assignment. After the training trials the participants were given visual as well as verbal feedback about their results in the form of degrees of deviation.

Afterwards, the participants were blindfolded while their hands rested on touch sensitive switches. These switches indicated when the participant began moving toward the angle legs. Then, the experimenter prepared the first task. **Figure 6** shows the left angle leg (as seen by the test person) which was adjusted to a defined angle (nominal value) by the experimenter. The right angle leg had a starting position of 90°. Next, the participant was asked to bring the right angle leg into a position parallel to the left (target) angle leg. Then, the experimenter noted the adjusted angle (actual value) and prepared the next task. Nominal values for the right side tasks were 135, 158, 125, 165, and 145°; the nominal values for the left side tasks were 45, 22, 65, 15, and 35°. All participants had to solve the tasks of one task type in the same order, but the order of the task types varied.

#### **Table 3a | Description of experimental data from Experiment D (virtual) and Experiment E (real).**


*Statistical comparison between orientation 1 and orientation 2.*

*\*Significance level paired t-test.*

**Table 3b | Description of experimental data and statistical comparison between virtual stimuli (Experiment D) and real stimuli (Experiment E).**


*\*Significance level paired t-test.*

During the exploration, the left hand was only allowed to touch the left angle leg, and the right hand was only allowed to touch the right angle leg. No cross-over or both-handed exploration or touching of the opposite angle leg and the tabletop was allowed. Both hands should leave the table and explore the angle positions simultaneously. The measurement of ET started with the first contact of the hands and the angle legs. The target angle leg was not moveable by the test person. The exploration of the angle legs was performed through various up-and-down movements of one or more fingers along the angle legs. The participants were asked to return their hands to the starting position on the table as soon as they finished a task.

The angle position was assessed by a digital measuring instrument with an accuracy of one hundredth of a degree, provided by the company NESTLE (Dornstetten, Germany). Additionally, the deviations of the angles were shown on a separate display.

Two hollow metal bars (5 mm × 10 mm × 240 mm) served as angle legs. The distance from the table to the end of the angle legs was 28.7 cm in the position of 90°. The distance between the angle pivots was 28 cm. After the angles were adjusted the angle data was recorded manually by the experimenter. ETs (the time needed for the adjustment of the angle leg), duration and number of ES were assessed.

#### **PART 2**

#### *Measurement methods for experiment D (virtual)*

The participants had to explore the structure of virtual sunken reliefs with the PHANToM device. Their eyes were closed. During the exploration the participants held the PHANToM device in their right hand, in standard position. They were able to move their hand and forearm freely. The participants could choose their individual starting position for each stimulus. The PHANToM device generates the virtual stimulus by giving force-feedback signals while the participant moves the device through the air. No time limit was set for the haptic exploration. No visual feedback was given on the stimulus structure at any point of the Experiment.

The stimuli had a virtual size of 13 × 13 cm. Their structure resembled milled traces of 3 mm depth and 7 mm width. To prevent the tip of the PHANToM device from slipping off the sidewalls, they were programmed as 6 cm high walls. The virtual stimuli were constructed with the program package Autodesk 3D MAX. The actual sunken relief stimuli that were used in Experiment E served as a data base.

#### *Measurement methods for experiment E (real)*

The participants had to recognize the structure of real sunken relief stimuli (for an example, see **Figure 2A**) with explorative movements of a metal tip. The metal tip was fixed to the upper end of the PHANToM device (see **Figure 2A**). The participants' eyes were closed during the procedure. No visual feedback was given at any time of the Experiment. The stimuli consisted of hard plastic boards of 13 × 13 × 0.5 cm with a relief structure of milled traces with a depth of 3 mm and a width of 7 mm. All test persons held the PHANToM device in their right hand. Neither a starting position nor a time limit was given for the exploration. For the exploration, the sunken relief stimuli were fixed in a solid holding device. A screen was placed so that participants could not gather any visual information about the stimulus, not even by chance.

**Table 4 | Number (No) of explorative Stops (ES), and mean time of ES in relation to stimulus features for Experiment D (virtual sunken relief) and experiment E (real sunken relief).**


**orientation 1 and orientation 2 (10 subjects).** The duration of explorative stops is marked in different grayscales. The stimulus structure is displayed in the middle (blue).

#### *Experiments D and E*

A *Phantom Desktop* (Sensable Technologies, USA) was used, with the following specifications: six-degrees-of-freedom positional sensing; nominal position resolution > 1100 dpi (∼0.023 mm); force feedback with 3 degrees of freedom (x, y, z); stiffness: x-axis > 10.8 lb/in (1.86 N/mm), y-axis > 13.6 lb/in (2.35 N/mm), z-axis > 8.6 lb/in (1.48 N/mm). A precise mapping of the spatial positions of the ES was therefore possible with this device.

The experiments encompassed five sunken relief stimuli that were presented twice (virtual stimuli in Experiment D and real stimuli in Experiment E, as described above). The stimuli were presented in standard orientation (0°; orientation 1) and then the same five stimuli were presented again, but turned by 180° (orientation 2). The order of the presented stimuli varied for each participant. For each stimulus, corrected x-, y-, and z-coordinates of the PHANToM device (i.e., the position of the metal tip during the exploration) were recorded with a sampling frequency of 1 kHz and stored digitally. During the exploration the participants could move their hand and forearm freely (they did not rest on a base). After the exploration process, the participants were asked to open their eyes and to draw the perceived sunken relief pattern on a piece of paper. After they finished drawing, the participants closed their eyes again and the next stimulus was presented.

Before the Experiment began, a test stimulus (real/virtual) was presented and the experimental task, the PHANToM device and its operation were explained. During a 10 min training presentation the participants were free to open and/or close their eyes to become familiar with the stimulus structure and the task.

## *Reference study to experiments D and E*

Due to the peculiarities of performing haptic tasks with the PHANToM device, we assessed which characteristic movements occur during haptic perception of a horizontal line under virtual and real conditions. These reference studies were performed prior to the actual Experiments. We used these tasks as references because they place few demands on perceptual-cognitive processing and depend mainly on motor control. To perform the reference task the participants (*n* = 10) traced a single horizontal line with the PHANToM device for 5 min with their eyes closed. The subjects were informed that their task would be to repeatedly follow the horizontal line and that the resulting measures would be used as reference values. Two conditions were used: first, the participants were presented with a real sunken relief line; second, a virtual sunken relief line was presented. The scanning velocity during this test differed from subject to subject. All participants generated motion stops with a mean length of *M* = 89 ms (*SD* = 40 ms). These stops occurred only at the end points of the line (left or right side, under both the virtual and the real condition) and were termed "mechanical stops." Since no theoretical basis exists for discriminating between mechanical stops and ES on the basis of duration, the somewhat artificial value of 89 ms was used to discriminate between ES and mechanical stops during the PHANToM Experiments. The cut-off was introduced merely to account for the technical limitations that the PHANToM device imposes on the exploration movements. Therefore, for Experiments D and E only those motion stops that lasted longer than 89 ms were marked as ES and used for data analysis.
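The 89 ms cut-off can be illustrated with a minimal Python sketch. It assumes that a stop is a run of consecutive 1 kHz samples in which the tip speed stays below a small threshold. The text does not state how motion stops were detected, so the threshold `SPEED_EPS_MM_S`, the function names, and the synthetic trace are illustrative assumptions and not the authors' analysis pipeline.

```python
import math

SAMPLE_RATE_HZ = 1000   # sampling frequency reported in the text (1 kHz)
MIN_ES_MS = 89.0        # cut-off derived from the reference study
SPEED_EPS_MM_S = 1.0    # ASSUMPTION: below this speed the tip counts as stationary


def find_stops(samples, rate_hz=SAMPLE_RATE_HZ, eps_mm_s=SPEED_EPS_MM_S):
    """Return (start_index, duration_ms) for every run of near-zero tip speed.

    `samples` is a sequence of (x, y, z) positions in mm, one per sample.
    """
    stops = []
    run_start = None
    for i in range(1, len(samples)):
        speed = math.dist(samples[i - 1], samples[i]) * rate_hz  # mm/s
        if speed < eps_mm_s:
            if run_start is None:
                run_start = i - 1
        elif run_start is not None:
            n_intervals = (i - 1) - run_start
            stops.append((run_start, n_intervals * 1000.0 / rate_hz))
            run_start = None
    if run_start is not None:  # trace ended while the tip was stationary
        n_intervals = (len(samples) - 1) - run_start
        stops.append((run_start, n_intervals * 1000.0 / rate_hz))
    return stops


def explorative_stops(stops, min_ms=MIN_ES_MS):
    """Keep only stops longer than the cut-off; shorter ones count as mechanical."""
    return [stop for stop in stops if stop[1] > min_ms]


def make_trace(segments, step_mm=0.1):
    """Build a synthetic 1-D trace from (n_samples, moving) segments (hypothetical data)."""
    x = 0.0
    trace = [(0.0, 0.0, 0.0)]
    for n_samples, moving in segments:
        for _ in range(n_samples):
            if moving:
                x += step_mm
            trace.append((x, 0.0, 0.0))
    return trace


# Synthetic example: a 150 ms pause and a 40 ms pause between movements.
trace = make_trace([(50, True), (150, False), (50, True), (40, False), (50, True)])
stops = find_stops(trace)
es = explorative_stops(stops)
print(es)
```

In this synthetic trace only the 150 ms pause exceeds the cut-off, so the 40 ms pause is treated as a mechanical stop and excluded.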

During manual haptic exploration pure motor stops may possibly occur as well. However, again, there is no theoretical basis for discriminating between motor and ES at the current stage of research. The present study is the first to explore the mere existence of stops during the haptic exploration process. The determination of possible subgroups of stops needs to be left to future studies. Since no technical movement limitations exist during manual haptic exploration and since there is as much reason to assume that ES that are shorter than 89 ms exist just the same, no such cut-off was used for Experiments A–C.

## **PARTICIPANTS**

## *Experiments A–C*

The same eight healthy volunteers (4 men, 4 women) took part in all three Experiments (A–C). All participants were right-handed according to a test of handedness by Salmaso and Longoni (1985). After all test persons had been fully informed about the aim and content of the investigation, written consent was obtained. The participants received 10 € compensation for each session. The study was approved by the local Ethics Committee of the University of Leipzig (Germany).

## *Experiments D and E*

Ten test persons (5 women, 5 men) took part in the investigation. The participants were students and assistants of the Research Laboratory of Electronics (RLE, MIT, Boston, USA). All participants were right-handed according to a test of handedness by Salmaso and Longoni (1985). All participants observed a waiting period of 4 weeks between Experiments D and E. Each participant received a monetary compensation of $10 for each session. All participants were informed about the aims of the study and gave their written consent. The study was approved by the local Ethics Committee at the Massachusetts Institute of Technology (Boston, USA).

## **STATISTICS**

For all analyses the statistics software SPSS 20.0.0 was used. For statistical comparisons between tasks (Experiments A–E) ANOVAs were calculated. For the statistical comparisons between task types (orientation 1, 2) paired *t*-tests (critical alpha 0.05) were used. The standard Pearson correlation coefficient (one-tailed) was used to assess the correlations between ET and length of ES.

## **RESULTS**

## **PART 1**

The analysis of the data revealed that during haptic exploration of sunken reliefs (Experiment A) several ES of on average 300 ms occurred. ES were observed during one-handed as well as two-handed exploration (**Table 1**). During the exploration of common objects (Experiment B) stops with an average length of 70 ms occurred (**Table 1**). Again, stops were observed during one-handed as well as two-handed tasks. Explorative movements during the "angle paradigm" (Experiment C) were also interrupted by ES, which had an average length of 190 ms (**Table 2**). Thus, for haptic exploration, it could be demonstrated that the fingers persisted in a static position on the stimulus during phases of ES during all three task types (Hypothesis 1) as well as during one- and two-handed exploration (Hypothesis 2). The number and duration of ES did not differ between one- and two-handed tasks.

**Table 5 | Pearson correlation (one-tailed) between relative exploration time and length of ES (explorative stops in ms) per stimulus for Experiment D (virtual sunken relief) and Experiment E (real sunken relief).**

Hypothesis 3 was also confirmed. Remarkably, the mean length of the ES differed significantly between the three experimental conditions [*F*(2, 61) = 34.05, *p* < 0.001]. The shortest average length of ES was measured during the exploration of common objects, whereas the longest average length of stops occurred during the exploration of sunken reliefs. To calculate the correlative relationships between length of ES and ET the data were log-transformed. The standard Pearson correlation revealed a significant correlation between length of ES and ET (*r* = 0.730, *p* < 0.001; see **Figure 3**). Accordingly, the mean length of ES increased with the mean duration of the ET (Hypothesis 3).
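As a sketch of the correlation analysis reported for ES length and ET, the following Python snippet log-transforms paired ET and ES values and computes a Pearson coefficient. The natural logarithm, the function names, and the example numbers are assumptions for illustration only. The text does not specify the base of the logarithm (the coefficient does not depend on it), and the numbers are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_pearson(et_ms, es_ms):
    """Correlate log(ET) with log(ES length); all values must be positive."""
    return pearson_r([math.log(v) for v in et_ms],
                     [math.log(v) for v in es_ms])


# Hypothetical values: ES length follows an exact power law of ET,
# so the log-log relationship is perfectly linear.
et_ms = [4000.0, 9000.0, 16000.0, 25000.0]
es_ms = [2.0 * math.sqrt(t) for t in et_ms]
r = log_pearson(et_ms, es_ms)
print(round(r, 3))
```

The log transform makes a multiplicative relationship between ET and ES length linear, which is the situation in which a Pearson coefficient is most informative.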

#### **PART 2**

During the two PHANToM experiments (Experiment D with virtual sunken reliefs and Experiment E with real sunken reliefs) ES were observed as well. ES with a mean length of *M* = 186.17 ms were measured in Experiment D (virtual condition) and of *M* = 140.02 ms in Experiment E (real condition). No differences in mean length of ES, number of ES, and mean ET (see statistical results in **Table 3a**) were found between the two orientation conditions (stimulus orientation 1, 2). As expected, however, significant differences existed between virtual and real stimuli. ETs as well as the length of the ES were much shorter in Experiment E (real sunken relief stimuli) than in Experiment D (virtual stimuli). Furthermore, significantly fewer ES occurred in Experiment E than in Experiment D (see **Table 3b**). Additionally, a linear correlation between ET and mean length of ES was found for Experiments D and E. This result is in line with the correlation found across Experiments A–C. The correlation coefficient was *r* = 0.289, *p* = 0.035 for virtual stimuli (Experiment D), and *r* = 0.331, *p* = 0.043 (Pearson, one-tailed) for real stimuli (Experiment E) (see **Figure 4**). Thus, the mean length of ES increased with increasing ET in both experiments. Therefore, Hypothesis 3 was confirmed for the PHANToM Experiments D and E as well.

The analysis of how ES are distributed spatially during the haptic exploration of virtual and real sunken reliefs (with the PHANToM device) revealed that ES vary in frequency at different stimulus features (Hypothesis 4). ES occur more frequently at corners, endpoints of lines and on curves (see **Table 4**), whereas fewer ES were observed on vertical and horizontal lines. For an example of the spatial distribution of ES in Experiments D and E please see **Figure 5**. The mean ES length did not differ between the different stimulus features (**Table 4**), neither for virtual stimuli [*F*(6, 71) = 1.175, *p* = 0.330] nor for real stimuli [*F*(6, 78) = 0.393, *p* = 0.882]. The number of ES per stimulus feature did differ significantly, however, in both experiments [*F*virtual(6, 71) = 6.228, *p* < 0.001; *F*real(6, 78) = 9.094, *p* < 0.001; **Table 4**].

An additional, explorative analysis revealed that the number of ES differed from the number of motions. That is, ES did not occur during every haptic motion at every stimulus feature, as illustrated by the example in **Figure 6**.

To investigate Hypothesis 5 (whether the duration of ES varied during the exploration process) the *relative* ET and the duration of ES were correlated per stimulus. We expected to find a systematic decrease of stop duration toward the end of the ET. The Pearson correlations (one-tailed) revealed divergent and non-significant results. Both positive and negative correlations occurred, none of which reached the critical alpha value (α = 0.0025, Bonferroni correction for multiple comparisons). The results are presented in **Table 5**. Exemplary regression plots for one real and one virtual stimulus are displayed in **Figure 7**.
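The corrected threshold of α = 0.0025 is what a Bonferroni correction of α = 0.05 yields for 20 comparisons. The following Python sketch works through that arithmetic. The count of 20 (5 stimuli × 2 orientations × 2 experiments) is an inference from the design and is not stated explicitly in the text; the function names and the example *p*-values are hypothetical.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def significant_after_correction(p_values, alpha=0.05):
    """Flag each p-value that stays below the Bonferroni-corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]


# ASSUMPTION: 5 stimuli x 2 orientations x 2 experiments = 20 correlations.
N_COMPARISONS = 5 * 2 * 2
corrected = bonferroni_alpha(0.05, N_COMPARISONS)
print(corrected)

# Hypothetical p-values, not taken from Table 5.
flags = significant_after_correction([0.001, 0.03, 0.2])
print(flags)
```

Under this correction a single correlation with, for example, *p* = 0.03 would not count as significant, which is consistent with the non-significant pattern described above.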

### **DISCUSSION**

All five experiments demonstrated that haptic exploration movements include ES lasting milliseconds. Thus, basically, the haptic exploration process (with closed eyes) may be regarded as an alternating cycle of explorative motions (EM) and ES. During haptic exploration of unknown sunken relief stimuli (Experiment A) ES with a mean duration of 329.50 ms occurred, whereas during haptic exploration of common objects (Experiment B) ES lasted only 67.47 ms, on average. The average duration of ES during the processing of space-position information (angle leg adjustments, Experiment C) was 189.92 ms. Mean ES of 186.17 ms were observed during the exploration of virtual sunken reliefs (Experiment D) with the PHANToM haptic device. ES of 140.02 ms, on average, were found during the exploration of real sunken reliefs (Experiment E) which were touched with the PHANToM device. ES were observed during one-handed as well as two-handed tasks. The results confirm the hypothesis that human haptic perception is generally accompanied by movement pauses of the exploring fingers and hands in healthy humans.

A strong correlation was revealed between mean duration of ES and mean ET per stimulus (see **Figure 3**). Short ETs coincided with shorter ES, whereas ES lasted significantly longer during longer ETs. Therefore, the duration of ES is not independent of ET. The same correlation was also found in Experiments D and E. Hence, the correlation of mean ET and length of ES was found for both virtual and real stimuli, during both manual and PHANToM exploration. The stimuli of the different experiments differed in complexity. As introduced above, ET serves as an indicator of information processing and cognitive demands. According to studies by Rösler et al. (1993) and Grunwald et al. (1999, 2001c), ET varies depending on the perceptive-cognitive processing effort during haptic exploration. We found that longer ETs and increasing stimulus complexity coincided with a longer average duration of the ES. The shortest ES were measured during the haptic exploration of common familiar objects. Thus, the strong correlation between mean ET and duration of ES may be understood as reflecting the perceptive-cognitive processing effort during information integration. Similar results, showing that stimulus complexity and the duration of fixation pauses are correlated, have been presented for the visual modality (Krause, 1988; Kaller et al., 2009).

To answer the question where and at which stimulus features ES occur, we used an experimental design and apparatus (Experiment D and E) which allowed us to precisely register the Cartesian coordinates (x, y, z) of the haptic exploration process. The PHANToM haptic device makes haptic perception in virtual space possible (Salisbury and Srinivasan, 1997). In Experiment D the participants had to recognize five different virtual sunken reliefs with the PHANToM device while their eyes were closed. To compare virtual and real stimuli, the virtual sunken reliefs of Experiment D were presented as real sunken reliefs in Experiment E. The test persons had to explore these real sunken reliefs with a special one-point-stick that was mounted to the end of the PHANToM holding device (see methods Part 2).

In both cases the touch perception with the PHANToM device presents a profound reduction of the natural dimensions of haptic perception. Natural haptic perception should be considered as far more complex, as it is not restricted to the information from one single point as the haptic perception with PHANToM is. Despite these limitations, haptic perception with the PHANToM device is roughly comparable to the haptic perception of a single finger or with a handheld pen.

The spatial distribution of ES in Experiments D and E showed that ES occur more frequently at certain stimulus features (e.g., corners) than at other features (e.g., lines). However, ES did occur at all stimulus positions, not only at cross points and end points. The analysis also showed that salient stimulus features are explored more often than ES occur at them. That is, ES do not occur every time the finger moves along a stimulus feature, so the number of haptic motions observed at a certain stimulus feature may be higher than the number of ES that occur at the same feature. This indicates that stimulus features themselves are not responsible for ES, but that the occurrence of ES is more likely to be related to the perception process, possibly even to information processing.

Hypothesis 5 was based on the assumption that the participant generates a hypothesis about the whole stimulus right from the beginning of the exploration process. Therefore, the ES were expected to be longer at the beginning of the exploration than at the end, because more new information has to be processed at the beginning than at the end of the exploration. The assumption implies that the amount of information that has to be processed corresponds with the duration of the stops. However, the supposition ignored the differential exploration properties of the PHANToM device as opposed to natural 10-finger exploration (Experiments A–C). The exploration with the PHANToM device consists of only one contact point with the stimulus and, therefore, only one-point information. Consequently, it is not possible to generate a hypothesis about the whole stimulus at the beginning of the exploration during Experiments D and E.

Therefore, we are not surprised that the assumption of a systematically negative correlation between stop duration and temporal position during the exploration process had to be dismissed. The temporal allocation of ES and stop duration showed positive as well as negative correlative associations at low significance levels for different stimuli. Additionally, a temporally stable distribution of ES was observed across the exploration process. These findings (the occurrence as well as the dynamics of ES) may still be in line with the "hypotheses-rebuild-model" (see **Figure 8**), however. In this model, perception is understood as an active constructional process and not as a passive observation of environmental stimuli. Analogous to Richard Gregory's perception theory (Gregory, 1973) the haptic perception process may consist of sequences of proposing a hypothesis and testing the hypothesis. Hypotheses about the expected structure of stimulus features (nominal value) are serially compared with incoming information about actual stimulus features (actual value) by bottom-up as well as top-down processing.

During the first phases of the perception process the hypotheses are pre-attentive. If there are no differences between the expected value and the actual value the result of the comparison will be stored. This process lasts until a difference between actual and nominal value is detected on a conscious level. If an unexpected stimulus feature occurs (e.g., corner instead of line) the nominal value hypothesis has to be corrected. For the proposition of new nominal value hypotheses only limited processing resources of working memory are available. The necessary resources are regulated by the limited capacity control system (LCCS) (Gopher and Donchin, 1985). A possible consequence of nearly exhausted processing resources may be that the further intake of sensory information is put on hold. Explorative movements may come to a standstill during the reorganization of working memory resources, which may be measurable as ES. The results of the present study showed that ES were distributed in a temporally stable manner across the exploration process. This may be due to continuously incoming information that needs to be processed by working memory. Likewise, a continuous generation of hypotheses about the expected actual values is necessary during the one-point exploration with the PHANToM device.

The hypotheses-rebuild-model may also serve as an explanation of why ES are shorter during the exploration of common objects (see Experiment B) than during the exploration of unknown objects. The degrees of freedom for hypotheses about common objects may be limited by existing knowledge. Thus, hypotheses are generated faster and less information has to be stored in working memory.

Furthermore, the model may even be fit to explain why ES do not occur *per se* at complex stimulus features (e.g., cross points). In terms of Gregory's perception theory, sensory intake would only be interrupted at those features for which the internal hypotheses are not validated yet. In terms of the hypotheses theory the occurrence of ES would be a function of a perceptive-cognitive test process. However, the validity of this model cannot be clarified by the results of the present study. Future studies need to examine whether the frequency and/or duration of ES systematically changes after unforeseeable changes of the stimulus structure (e.g., virtual stimuli) during the exploration process. In that case, the duration and number of ES should increase with each structural change of the stimulus because the participants would have to constantly adjust their hypotheses.

For the time being, individual variations that may be due to different explorative strategies and differences in processing time cannot be explained conclusively by the present results. Furthermore, the methodological limitations of the PHANToM device call for the analysis of temporal and spatial characteristics of ES during 10-finger tasks in future studies. Nevertheless, during the present study ES were observed during the haptic exploration of a wide variety of stimuli. Therefore, it seems safe to assume that ES are a stable aspect of human haptic perception.

Future studies may evaluate the possible relevance of ES for diagnostic purposes. Possibly, differences in the distribution, frequency and duration of ES may be found for people with different kinds of psychiatric disorders or cognitive strategies.

With the help of electrophysiological parameters (EEG, MEG or fMRI) further studies may reveal corresponding cortical processes of touch motions and of ES during human haptic perception. Spectral EEG analyses of the theta-band may elucidate whether ES are associated with working memory consolidation. If that is the case, a significant increase of theta would be expected. If ES are accompanied by hypotheses-rebuild-processes, on the other hand, increases of beta and gamma frequencies may be more likely. Besides the analyses of cortical processes, future studies should focus on the question of which perceptive-cognitive processes form the basis of human haptic perception. In our opinion, more detailed analyses of ES could contribute essentially to the understanding of human haptic perception, perhaps as much as the analysis of fixation pauses contributed to the understanding of visual perception.

In that regard, it would also be interesting to analyze whether ES occur when additional visual information is present during haptic exploration. During all present experiments the participants' eyes were closed. Future studies should examine whether a functional correspondence exists between fixation pauses of the eyes and ES of the haptic system. Additionally, the occurrence of ES in congenitally blind participants should be tested. Although Braille reading studies have shown that the fingers regularly stop during the reading process (Millar, 1987; Appelle, 1991; Davidson et al., 1992), it is not yet known whether ES occur in congenitally blind persons during the exploration of objects and surfaces as well.

## **AUTHOR CONTRIBUTIONS**

Martin Grunwald: experimental design, data analysis, wrote the paper. Manivannan Muniyandi, Stephanie Mueller, Frank Krause: data analysis, discussion of the results, wrote the paper. Hyun Kim and Jung Kim: experimental hardware and software design. Mandayam A. Srinivasan: experimental design, discussion.

## **ACKNOWLEDGMENTS**

This work was supported by the G.A. Lienert Foundation (Germany), the German Research Foundation of Eating Disorders, and the RLE at MIT. We thank *Two.null Design* (Germany) for digital conversion of the real sunken reliefs, K. H. Beier for developing the measuring apparatus for Experiments A–C, and Travis Baumgartner for language advice.

## **REFERENCES**


haptic object recognition task. *Cogn. Brain Res.* 11, 33–37. doi: 10.1016/S0926-6410(00)00061-6


Katz, D. (1925). *Der Aufbau der Tastwelt*. Leipzig: Johann Ambrosius Barth.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 09 September 2013; accepted: 21 March 2014; published online: 09 April 2014.*

*Citation: Grunwald M, Muniyandi M, Kim H, Kim J, Krause F, Mueller S and Srinivasan MA (2014) Human haptic perception is interrupted by explorative stops of milliseconds. Front. Psychol. 5:292. doi: 10.3389/fpsyg.2014.00292*

*This article was submitted to Cognition, a section of the journal Frontiers in Psychology.*

*Copyright © 2014 Grunwald, Muniyandi, Kim, Kim, Krause, Mueller and Srinivasan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*