**ATTENTION, PREDICTIONS AND EXPECTATIONS, AND THEIR VIOLATION: ATTENTIONAL CONTROL IN THE HUMAN BRAIN**

**Topic Editors Simone Vossel, Joy J. Geng and Karl J. Friston**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2015 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

**ISSN** 1664-8714 **ISBN** 978-2-88919-367-7 **DOI** 10.3389/978-2-88919-367-7

# **ATTENTION, PREDICTIONS AND EXPECTATIONS, AND THEIR VIOLATION: ATTENTIONAL CONTROL IN THE HUMAN BRAIN**

Topic Editors:

**Simone Vossel,** University College London, United Kingdom & Research Centre Jülich, Germany

**Joy J. Geng,** University of California Davis, USA

**Karl J. Friston,** University College London, United Kingdom

Rethinking the role of the rTPJ in attention and social cognition in light of the opposing domains hypothesis: findings from an ALE-based meta-analysis and resting-state functional connectivity; Benjamin Kubit and Anthony Ian Jack; doi: 10.3389/fnhum.2013.00323

# Table of Contents



Jocelyn L. Sy, James C. Elliott and Barry Giesbrecht

*Paired-Pulse Transcranial Magnetic Stimulation Reveals Probability-Dependent Changes in Functional Connectivity Between Right Inferior Frontal Cortex and Primary Motor Cortex During Go/No-Go Performance*

A. Dilene van Campen, Franz-Xaver Neubert, Wery P. M. Van Den Wildenberg, K. Richard Ridderinkhof and Rogier B. Mars


## Attention, predictions and expectations, and their violation: attentional control in the human brain

## *Simone Vossel<sup>1,2</sup>\*, Joy J. Geng<sup>3</sup> and Karl J. Friston<sup>1</sup>*

*<sup>1</sup> Wellcome Trust Centre for Neuroimaging, University College London, London, UK*

*<sup>2</sup> Cognitive Neuroscience, Institute of Neuroscience and Medicine (INM-3), Research Centre Jülich, Jülich, Germany*

*<sup>3</sup> Department of Psychology, Center for Mind and Brain, University of California Davis, Davis, USA*

*\*Correspondence: s.vossel@fz-juelich.de*

#### *Edited and reviewed by:*

*John J. Foxe, Albert Einstein College of Medicine, USA*

**Keywords: attentional networks, predictions, trial history, reward, emotions, neuroimaging, TMS, EEG**

In the complex scenes of everyday life, our brains must select from among many competing inputs for perceptual synthesis so that only the most relevant are fully processed and irrelevant (distracting) information is suppressed. At the same time, we must remain responsive to salient events outside our current focus of attention—and balancing these two processing modes is a fundamental task our brain constantly needs to solve.

This Research Topic examines how attentional control is guided by sensory predictions, prior knowledge, reward, task sets, and emotional factors. Moreover, the neural signatures of these mechanisms are investigated in Original Research Articles or summarized in Review, Perspective and Hypothesis and Theory Articles. Findings from a wide range of state-of-the-art complementary neuroscientific methods such as fMRI, M/EEG, TMS, and ALE-based meta-analysis are presented.

The collection of papers in this Research Topic provides an overview of our current knowledge in the field and presents novel, stimulating hypotheses on how attention is controlled in the human brain. It also bridges the gap to other disciplines such as decision-making and social and affective neuroscience.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 May 2014; accepted: 17 June 2014; published online: 02 July 2014.*

*Citation: Vossel S, Geng JJ and Friston KJ (2014) Attention, predictions and expectations, and their violation: attentional control in the human brain. Front. Hum. Neurosci. 8:490. doi: 10.3389/fnhum.2014.00490*

*This article was submitted to the journal Frontiers in Human Neuroscience. Copyright © 2014 Vossel, Geng and Friston. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Attention and predictions: control of spatial attention beyond the endogenous-exogenous dichotomy

## *Emiliano Macaluso<sup>1</sup>\* and Fabrizio Doricchi<sup>1,2</sup>*

*<sup>1</sup> Neuroimaging Laboratory, Fondazione Santa Lucia, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Rome, Italy*

*<sup>2</sup> Dipartimento di Psicologia, Università degli Studi "La Sapienza", Rome, Italy*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Ralph Weidner, Forschungszentrum Jülich, Germany*

*Klaartje Heinen, Institute of Cognitive Neuroscience, UK*

#### *\*Correspondence:*

*Emiliano Macaluso, Neuroimaging Laboratory, Fondazione Santa Lucia, Via Ardeatina 306 00179, Rome, Italy e-mail: e.macaluso@hsantalucia.it*

The mechanisms of attention control have been extensively studied with a variety of methodologies in animals and in humans. Human studies using non-invasive imaging techniques highlighted a remarkable difference between the pattern of responses in dorsal fronto-parietal regions vs. ventral fronto-parietal (vFP) regions, primarily lateralized to the right hemisphere. Initially, this distinction at the neuro-physiological level was related to the distinction between cognitive processes associated with strategic/endogenous vs. stimulus-driven/exogenous attention control. Nonetheless, it soon became evident that, in almost any situation, attention control entails a complex combination of factors related to both the current sensory input and endogenous aspects associated with the experimental context. Here, we review several of these aspects, first discussing the joint contribution of endogenous and stimulus-driven factors during spatial orienting in complex environments and then turning to the role of expectations and predictions in spatial re-orienting. We emphasize that strategic factors play a pivotal role in the activation of the ventral system during stimulus-driven control, and that the dorsal system makes use of stimulus-driven signals for top-down control. We conclude that both the dorsal and the vFP networks integrate endogenous and exogenous signals during spatial attention control and that future investigations should manipulate both these factors concurrently, so as to reveal the full extent of these interactions.

**Keywords: endogenous, exogenous, salience, prediction, parietal cortex**

## **INTRODUCTION**

The ability to suitably allocate processing resources is fundamental for the efficient processing of incoming sensory signals and for the generation of appropriate behavior in complex environments. The brain continuously receives a large number of sensory signals that cannot all be fully processed. Attention selection allows preferential processing of some signals and filtering out of other inputs. What are the constraints that govern this selection process, thus determining which signals should gain access to in-depth processing?

Traditionally, attention research has distinguished between two types of mechanisms that contribute to the selection process. On the one hand, there are endogenous, top-down factors that primarily relate to the subject's "internal" goals and intentions. On the other hand, there are exogenous, bottom-up effects that primarily relate to the characteristics of "external" stimuli. Endogenous factors are associated with top-down voluntary control, while exogenous factors are associated with stimulus-driven control that is thought to take place automatically. The distinction between these two types of control can be rather intuitive: when searching for a friend in a crowd, we will voluntarily shift attention from one face to another until the external input (i.e., what we see) matches our internal knowledge about the physical appearance of our friend. By contrast, in a crowd of people all dressed in dark clothes, we will quickly notice a person dressed in bright red, who will catch attention automatically.

Experimentally, the impact of endogenous and exogenous factors on selection has been studied with a variety of paradigms. Somewhat related to the "real-life" examples above, visual search tasks can be used to highlight endogenous vs. exogenous factors for the allocation of spatial attention. Search tasks including a target item "similar" to the distracters (e.g., look for a vertical red bar presented among vertical green bars and tilted red bars) will be slow and inefficient. The search time will depend on the number of competing distracters, thus indicating that search proceeds serially, with attention shifted voluntarily from one item to another. By contrast, when the target item stands out (e.g., look for a red bar presented among green bars) the search is rapid and efficient, with the red target capturing attention in an automatic manner irrespective of the number of distracters.
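The contrast between serial and "pop-out" search can be illustrated with a minimal sketch. It assumes a self-terminating serial search in which the target is equally likely to be inspected at any position, so that on average (n + 1) / 2 items are inspected. The function name and the timing parameters (`base_ms`, `per_item_ms`) are hypothetical values chosen for illustration, not estimates from the studies cited here.

```python
def expected_search_time(n_items, base_ms=400.0, per_item_ms=50.0, pop_out=False):
    """Expected time (ms) to find a target that is always present.

    base_ms and per_item_ms are illustrative assumptions, not measured values.
    """
    if pop_out:
        # Efficient search: the target is found regardless of display size.
        return base_ms
    # Serial self-terminating search: on average (n + 1) / 2 items are inspected.
    return base_ms + per_item_ms * (n_items + 1) / 2


for n in (4, 8, 16):
    print(n, expected_search_time(n), expected_search_time(n, pop_out=True))
```

Under these assumptions the serial search time grows linearly with the number of items, whereas the pop-out time stays flat, which is the behavioral signature described in the text.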

Another way to investigate endogenous and exogenous control of spatial attention is to use variants of the Posner spatial cueing paradigm (Posner, 1980). Rather than presenting distracters during the search for a target, the target is presented in isolation but is preceded by a spatial cue. In the endogenous version of the task, the cue is presented centrally (e.g., a leftward or rightward pointing arrow) and it predicts the target location on most of the trials (e.g., 75–80% "valid" trials). In the remaining trials (20–25%) the cue is "invalid" and the target is presented at a different position from the location initially signaled by the cue. Behaviorally, targets preceded by a valid cue are detected/discriminated more rapidly/accurately than targets preceded by invalid cues. The reason for this is that subjects can strategically use the central cue to voluntarily shift attention towards the cued location. This leads to enhanced processing when the target is presented there (valid trials), while additional re-orienting operations are needed when the target is presented somewhere else (invalid trials). In the exogenous version of this paradigm, the cue is presented peripherally (e.g., a box is flashed in the left or right visual hemifield), again followed by a target either at the same (valid) or a different (invalid) location. Unlike the endogenous version, now the cue does not predict the target location (i.e., 1:1 ratio of valid and invalid trials) and the subject has no strategic reason to use the cue to voluntarily direct attention. Nonetheless, the detection of the target is faster and more accurate at the cued/valid location compared with the opposite/invalid location, provided that the interval between cue and target is short (e.g., Klein, 2000). In this case, the interpretation is that the sudden appearance of the box on one side attracts attention automatically, triggering exogenous mechanisms of spatial orienting.
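As a rough illustration of the logic of the cueing paradigm, the sketch below builds a trial list with a fixed proportion of valid trials and computes the validity effect as the mean reaction time on invalid trials minus the mean on valid trials. The helper names (`make_trials`, `validity_effect`) and the reaction times are hypothetical and only serve to show the arithmetic.

```python
import random
from statistics import mean


def make_trials(n_trials, p_valid, seed=0):
    """Return a shuffled list of (cue_side, target_side) pairs.

    A share p_valid of the trials is valid (target at the cued side).
    """
    rng = random.Random(seed)
    n_valid = round(n_trials * p_valid)
    trials = []
    for i in range(n_trials):
        cue = rng.choice(["left", "right"])
        if i < n_valid:
            target = cue  # valid trial
        else:
            target = "right" if cue == "left" else "left"  # invalid trial
        trials.append((cue, target))
    rng.shuffle(trials)
    return trials


def validity_effect(trials, rts_ms):
    """Mean RT on invalid trials minus mean RT on valid trials (ms)."""
    valid = [rt for (cue, target), rt in zip(trials, rts_ms) if cue == target]
    invalid = [rt for (cue, target), rt in zip(trials, rts_ms) if cue != target]
    return mean(invalid) - mean(valid)


# Made-up reaction times: 350 ms on valid trials, 400 ms on invalid trials.
trials = make_trials(100, p_valid=0.8)
rts_ms = [350 if cue == target else 400 for cue, target in trials]
print(validity_effect(trials, rts_ms))
```

Setting `p_valid=0.5` gives the non-predictive (1:1) contingency used in the exogenous version of the task.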

Many functional imaging and event-related potential (ERP) studies have investigated the neural basis of endogenous and exogenous visuo-spatial attention control. These highlighted the central role of the frontal and parietal lobes, with a notable distinction between the response patterns in dorsal and ventral regions. A variety of tasks requiring endogenous control of spatial attention highlighted the activation of dorsal fronto-parietal regions (e.g., Corbetta et al., 1993; Gitelman et al., 1999; Yantis et al., 2002). The activation pattern typically includes the intraparietal sulcus (IPS) and/or the posterior parietal cortex (PPC) plus the frontal eye-field (FEF) in the premotor cortex. These areas are activated in serial search tasks (Gitelman et al., 2002; Himmelbach et al., 2006; Fairhall et al., 2009), as well as in many non-spatial attention tasks that also require the voluntary allocation of processing resources (e.g., see Wojciulik and Kanwisher, 1999). Accordingly, the "dorsal attention network" is commonly accepted to be involved in the endogenous control of attention (Corbetta and Shulman, 2002). Furthermore, specific experimental designs allowed segregating brain activity associated with the different phases of attention tasks, which often comprise multiple control processes. In spatial cueing tasks, the dorsal attention network has been found to activate upon the presentation of the predictive cue (e.g., Hopfinger et al., 2000; Doricchi et al., 2010) and to show sustained activity in the cue-to-target delay (Kastner et al., 1999; see also Hahn et al., 2006). These preparatory effects, and the sustained activity in the absence of any external stimulus, are consistent with the "internal"/endogenous nature of the responses in the dorsal attention system.
Analogous preparatory effects have also been found within the occipital visual areas that represent the attended/cued location (e.g., see also Chawla et al., 1999; Kastner et al., 1999, for related effects in non-spatial selective attention tasks). These findings support the proposal that attention selection entails top-down signals that originate in the dorsal system and modulate activity in sensory areas that represent the currently relevant location (e.g., Simpson et al., 2011). This, in turn, would lead to greater responses when visual stimuli appear at the attended location, particularly when the display also contains other distracting visual stimuli (Desimone and Duncan, 1995; Kastner et al., 1999; Geng et al., 2006; see also Moore, 2006, for review).

On the other hand, areas in the ventral parietal and frontal cortex activate when a behaviorally relevant stimulus is presented outside the current focus of attention, for example a visual target that follows an invalid predictive cue in spatial cueing paradigms (Arrington et al., 2000). These target stimuli are thought to capture attention in a stimulus-driven manner, and the "ventral attention network" has been associated with the control of exogenous attention (Arrington et al., 2000; Corbetta and Shulman, 2002; Macaluso et al., 2002; Corbetta et al., 2008, for review). This network comprises the temporo-parietal junction (TPJ) and the inferior frontal gyrus (IFG) and is considered to be typically lateralized to the right hemisphere (Shulman et al., 2010). Nonetheless, it soon became evident that the conditions typically associated with activation of the ventral fronto-parietal (vFP) network entail not only spatial aspects of attention control (i.e., the visual target triggering a shift of attention from one position to another), but also produce some breach of expectation (e.g., the subject expects the target at one location, but it appears at a different position) and/or require some task-related judgment of the stimulus at a previously unattended location.

In the following sections we discuss recent evidence that, on the one hand, highlighted the role of bottom-up signals for attention control in the dorsal fronto-parietal system and, on the other hand, demonstrated that the ventral system combines stimulus-driven aspects of spatial reorienting with endogenous signals related to expectation and the representation of task-related contextual information (predictions). These data challenge the traditional dichotomy linking the dorsal system with endogenous attention and the ventral system with exogenous attention, and suggest a novel perspective for understanding the multifaceted mechanisms underlying the control of spatial attention.

## **STIMULUS-DRIVEN EFFECTS IN THE DORSAL ATTENTION SYSTEM**

As noted above, a wealth of imaging evidence has associated activation of the dorsal fronto-parietal network with endogenous/voluntary attention control (Wojciulik and Kanwisher, 1999; Corbetta and Shulman, 2002). Nonetheless, sensory aspects of the incoming signals may also play an important role, particularly in posterior parietal regions. Indeed, while showing increased activation even in the absence of any sensory input (Kastner et al., 1999; Hopfinger et al., 2000), the parietal cortex also shows some residual spatially-specific responses to unattended stimuli (Saygin and Sereno, 2008). The co-occurrence of sensory-driven responses and top-down internally-controlled activity, also associated with motor planning, has triggered an intense debate about the format of the spatial representations in parietal cortex (e.g., Andersen et al., 1993; Colby and Goldberg, 1999). Among the many hypotheses, the concept of a "saliency map" has recently gained much interest. Saliency maps are topographical representations of the visual space and code for the relative relevance of different locations. Depending on the definition, these account for bottom-up interactions between multiple sensory inputs/features (Itti and Koch, 2001, see also Borji and Itti, 2013) or integrate bottom-up effects with top-down goals (Gottlieb et al., 1998; Treue, 2003, for review; and Gottlieb, 2007; see also "priority maps", Ptak, 2012). Saliency maps have been associated with neuronal responses in dorsal parietal (Gottlieb et al., 1998; Kusunoki et al., 2000; Constantinidis and Steinmetz, 2001) and dorsal pre-motor regions (Thompson et al., 2005), as well as visual areas in occipital cortex (Mazer and Gallant, 2003). In most of these regions, the patterns of response reflect both bottom-up and top-down factors that jointly contribute to making a region of space relevant.

In a series of imaging studies we made use of saliency maps to investigate attentional selection using naturalistic stimuli (i.e., complex pictures and videos). In order to explicitly characterize the effect of bottom-up signals, we considered the saliency model proposed by Itti and Koch (2001). This decomposes complex images into a set of conspicuity maps, representing local contrast in intensity, orientation and color, plus motion and flicker for dynamic visual stimuli (i.e., videos). These maps are then combined in a single map that represents the bottom-up salience of each location in the image, irrespective of the feature dimension making the location salient. The final saliency map reflects only sensory-driven aspects (i.e., competition between locations in the image) and can predict fixation patterns during the viewing of complex visual stimuli (Carmi and Itti, 2006; Elazary and Itti, 2008). Nonetheless, eye-movement studies have highlighted that factors other than pure bottom-up signals influence spatial orienting in complex environments (Navalpakkam and Itti, 2005; Einhauser et al., 2007; see also Borji and Itti, 2013). Accordingly, in our studies we combined saliency-related bottom-up indexes with measures derived from overt eye-movements recorded during free viewing of the stimuli. With this we sought to track the influence of bottom-up and top-down factors, and the interaction between the two, during selective processing of complex and naturalistic stimuli (see also Seidl et al., 2012).
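The idea of combining feature-specific conspicuity maps into a single saliency map can be sketched as follows. This is a deliberately simplified illustration and not the Itti and Koch (2001) implementation: it assumes a crude center-surround contrast computed from box averages, a rescaling of each map to the range [0, 1], and a plain average across features. All function names and parameter values are hypothetical.

```python
def box_mean(img, y, x, radius):
    """Mean of img over a square window around (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    rows = range(max(0, y - radius), min(h, y + radius + 1))
    cols = range(max(0, x - radius), min(w, x + radius + 1))
    values = [img[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


def conspicuity(feature, center_radius=1, surround_radius=4):
    """Crude center-surround contrast of one feature map, rescaled to [0, 1]."""
    h, w = len(feature), len(feature[0])
    contrast = [
        [abs(box_mean(feature, y, x, center_radius)
             - box_mean(feature, y, x, surround_radius))
         for x in range(w)]
        for y in range(h)
    ]
    peak = max(max(row) for row in contrast)
    if peak == 0:
        return contrast  # a uniform feature map contributes no conspicuity
    return [[v / peak for v in row] for row in contrast]


def saliency_map(feature_maps):
    """Average the conspicuity maps so that the feature dimension is discarded."""
    maps = [conspicuity(f) for f in feature_maps]
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) / len(maps) for x in range(w)]
            for y in range(h)]


def peak_location(smap):
    """(row, column) of the most salient location."""
    coords = [(y, x) for y in range(len(smap)) for x in range(len(smap[0]))]
    return max(coords, key=lambda p: smap[p[0]][p[1]])


# Toy input: a bright 3 x 3 patch in the intensity map, a uniform color map.
intensity = [[0.0] * 16 for _ in range(16)]
for y in range(5, 8):
    for x in range(9, 12):
        intensity[y][x] = 1.0
color = [[0.5] * 16 for _ in range(16)]

smap = saliency_map([intensity, color])
print(peak_location(smap))
```

In this toy case the most salient location is the center of the bright patch, irrespective of which feature map made it conspicuous.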

In a first study, we investigated the impact of visual salience on spatial orienting using a virtual visual environment (Nardo et al., 2011). We computed the mean salience of each visual frame and used this as a covariate for the analysis of the imaging data. This revealed that activity in the occipital visual cortex and the PPC increased with increasing salience of the visual input (see also Bordier et al., 2013, who separated the effect of salience vs. single visual features). Notably, when we combined saliency data and patterns of eye-movements we found that activity in the entire dorsal fronto-parietal network (i.e., including both parietal and frontal nodes) increased specifically when the salient bottom-up signals successfully/efficaciously attracted the subjects' gaze. These activations were found also when we asked subjects to maintain central fixation (i.e., now indexing the efficacy of the bottom-up signals in one session, and using these indexes to analyze the fMRI data in a different session), consistent with an interpretation based on spatial attention/selection rather than mere oculomotor control. Related results were also obtained by Bogler et al. (2011), who used multivariate pattern recognition to reveal that PPC represents "selected" saliency signals following a winner-takes-all mechanism.
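The two kinds of indexes mentioned in this paragraph, the mean salience of each frame and whether the salient signal attracted gaze, can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not specify how efficacy was computed, so the distance criterion used here (`max_dist`) and the function names are hypothetical stand-ins.

```python
from math import hypot


def mean_salience(smap):
    """Average salience over all locations of one frame."""
    cells = [v for row in smap for v in row]
    return sum(cells) / len(cells)


def salience_peak(smap):
    """(row, column) of the most salient location in one frame."""
    coords = [(y, x) for y in range(len(smap)) for x in range(len(smap[0]))]
    return max(coords, key=lambda p: smap[p[0]][p[1]])


def attracted_gaze(smap, gaze_yx, max_dist=1.0):
    """True if gaze landed within max_dist of the salience peak (assumed criterion)."""
    peak_y, peak_x = salience_peak(smap)
    return hypot(gaze_yx[0] - peak_y, gaze_yx[1] - peak_x) <= max_dist


# Two toy frames (2 x 2 saliency maps) with one gaze position per frame.
frames = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.5, 0.5], [0.0, 1.0]],
]
gaze = [(0, 1), (0, 0)]

covariate = [mean_salience(f) for f in frames]
efficacy = [attracted_gaze(f, g) for f, g in zip(frames, gaze)]
print(covariate, efficacy)
```

The first list plays the role of a frame-by-frame salience covariate; the second separates frames in which the salient location was selected from frames in which it was not.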

Previous studies considered the role of the dorsal fronto-parietal cortex for the processing of bottom-up signals within two main frameworks. On the one hand, it has been proposed that activation of the dorsal system "follows" the detection of relevant stimuli by the ventral attention system (Geng and Mangun, 2011; Vossel et al., 2012). According to this view, the dorsal system generates top-down signals that modulate processing in visual cortex, and these signals require re-setting when a relevant/infrequent "bottom-up" stimulus is presented outside the current focus of attention (Corbetta and Shulman, 2002; Shulman et al., 2009). A different perspective entails a more "direct" activation of the dorsal system due to forward input from the visual cortex. In this context, differences between the involvement of the posterior IPS and the anterior FEF nodes of the dorsal fronto-parietal system have been proposed. For example, using concurrent Transcranial Magnetic Stimulation-fMRI over either IPS or FEF, Ruff et al. (2008) showed that the inter-regional influence of FEF over visual cortex does not depend on the visual input, while the influence of IPS can change according to the current visual context. IPS-TMS affected activity in retinotopic visual areas (V1-V4) only in the absence of visual input, while activity in area V5/MT+ was modulated only in the presence of visual stimuli. The authors suggested that the difference between the effect of TMS over IPS and FEF reflects the primarily top-down nature of the FEF-to-occipital connectivity vs. the presence of strong bottom-up driving projections from visual-to-IPS areas (see also Buschman and Miller, 2007, who suggested fast bottom-up attentional signals in IPS vs. slow top-down effects in prefrontal areas).
Our data with complex stimuli also showed a greater influence of bottom-up signals in IPS than in FEF: while both regions were modulated according to the efficacy of the salient visual input in attracting the subject's attention, the activity in IPS also co-varied with the overall stimulus salience irrespective of efficacy (cf. above and see Geng and Mangun, 2009, who reported that the saliency of non-target visual stimuli is represented in IPS but not in FEF). Moreover, our study did not show any effect of sensory salience in vFP areas, which instead activated when the display included attention-grabbing events (i.e., virtual human-like characters that entered the scene at unpredictable times; Nardo et al., 2011). Overall, these results are consistent with a "direct" influence of bottom-up signals in the IPS, rather than one mediated by ventral areas, and fit with the hypothesis that the dorsal fronto-parietal cortex codes for the current attention priorities (Gottlieb, 2007).

These findings demonstrate that in complex and naturalistic experimental situations that involve high levels of sensory competition, the dorsal fronto-parietal system responds to bottom-up salient signals. However, these responses do not only reflect the impact of the external input but also appear to index the efficacy of these signals for driving spatial selection. Eye-movement data provide us with one index of such selection processes, but do not give us any hint about the possible consequences of the spatial selection bias. Using a working memory (WM) task, we sought to establish whether stimulus-driven attention to salient elements of a complex image has any consequence on subsequent memories of that image (Santangelo and Macaluso, 2013). Specifically, we asked whether salient objects associated with activation of the dorsal attention network are more likely to be remembered than non-salient objects or salient objects that do not activate the dorsal system (i.e., non-efficacious salient stimuli, cf. also Nardo et al., 2011). For this, we presented pictures of complex natural scenes and, following a short delay, we probed memory for one object. Critically, the probe was extracted either from a highly salient location of the image or from a location of minimal saliency (see also Fine and Minnery, 2009). Behaviorally, we found better WM performance for high than low saliency probes. Analysis of the imaging data during encoding revealed increased activation in the posterior parietal cortex when the trial included a salient probe that would later be successfully remembered at retrieval. This demonstrates that the selection of salient elements in a complex environment does not only trigger on-line spatial orienting (cf. eye-movements), but also leads to in-depth processing, including storage of the stimulus in WM.

The finding of an interplay between attention and WM is not surprising, given the wealth of data showing analogous patterns of activation during attention and WM tasks (LaBar et al., 1999; Corbetta et al., 2002; Anderson et al., 2010). Several accounts have been put forward to explain the role of attention in WM. One of the main proposals is that attention contributes to the maintenance of items/objects in WM and that activation of the dorsal attention system, and of IPS in particular, reflects the process of shifting endogenous attention between these items (see Awh and Jonides, 2001; Majerus et al., 2007; Magen et al., 2009; and Nee et al., 2013; albeit note that the latter meta-analysis pointed to inferior rather than superior parietal areas for attention shifting in WM). Related more directly to the issue of the contribution of bottom-up signals, Bundesen et al. (2011) proposed a computational framework that integrates mechanisms of selective attention and the access to WM storage by a subset of items in multi-item displays. The model emphasizes the dynamic remapping/changes of neurons' receptive fields and the modulation of inter-regional communication (i.e., what information is sent from lower to higher levels of the visual hierarchy), via attentional weights generated by "priority" maps (cf., Gottlieb, 2007; Ptak, 2012).

These accounts are primarily concerned with the interplay between attention and WM in situations where multiple, unrelated items compete for access to a limited-capacity WM buffer (typically holding 3–4 items, see Todd and Marois, 2004; Xu and Chun, 2006). However, in our experiment with naturalistic stimuli (Santangelo and Macaluso, 2013) the various items/objects in the scenes were not unrelated; rather, they existed in relation to each other (e.g., glasses are typically found on a table, while shoes are on the floor). This, together with the finding of an increased functional coupling between the parietal cortex and the hippocampus when subjects encoded objects that would later be correctly retrieved, led us to interpret our results in a slightly different framework. Specifically, we suggested that prior knowledge about the global scene configuration plays a role during the encoding of the position of objects within natural scenes. This would fit with theories proposing that WM entails the activation of long-term memory (LTM) representations, and that selective attention provides a possible mechanism to activate relevant portions of the LTM system (Cowan, 1993; Oberauer, 2002). In this context, attention does not only provide "pointers" to objects that are relevant, but may also help integrate bottom-up signals with information stored in LTM.

In sum, by making use of naturalistic stimuli that entail high levels of competition for processing resources, we showed activation in posterior parietal regions when salient visual signals trigger orienting of spatial attention (Nardo et al., 2011) and when salient visual objects are successfully encoded in WM (Santangelo and Macaluso, 2013). These results indicate that bottom-up salience contributes to making a given location/object relevant (stimulus-driven selection), which in turn increases the likelihood of on-line spatial orienting towards that location and the storage in memory of information concerning that location.

Based on these findings, it would also be important to assess whether any effect of bottom-up salience is maintained or disrupted in patients with unilateral attentional deficits, such as left spatial neglect (see also Section Hemispheric Lateralization and the Spatial Neglect Syndrome, below). The dorsal attention system is usually structurally intact in these patients; nonetheless, the activity of dorsal areas can be functionally reduced due to structural lesions of adjacent areas in the ventral attention system. Consistent with this, the study of visual ERPs suggested that in neglect patients bottom-up visual processing can be impaired at around 130 ms post-stimulus, at the level of structurally intact dorsal attention areas (i.e., the IPS plus dorsal occipital cortex, cf. N1 component; DiRusso et al., 2008). However, the population of neglect patients is characterized by high between-patient variability, linked to the location and extent of brain damage, so that a corresponding variability in the processing of bottom-up salience can be expected.

## **SPATIAL REORIENTING AND EXPECTANCY COMPONENTS IN THE VENTRAL ATTENTION SYSTEM**

As noted in the introduction, the classical comparison highlighting the activation of the vFP system involves contrasting "invalid" versus "valid" trials within spatial cueing paradigms including central predictive cues. A main assumption of this approach is that the predictive cues generate a "validity context" associated with the ratio of valid over invalid cues. The statistical contingency between cues and targets accumulates during task performance and is thought to modify behavioral performance and brain activity related to the different operations implicated in attentional orienting. Under this assumption, one may expect that symbolic cues indicating the position of the upcoming targets at chance level will be ineffective in biasing attention and will not produce any validity effect. Here we review evidence demonstrating that, unexpectedly and contrary to the initial assumption, effects of cue validity can also be observed within a neutral "validity context", i.e., when valid trials are as frequent as invalid ones and cues are statistically non-predictive. These experimental conditions are of particular relevance because they dissociate the effects of spatial re-orienting on invalid trials from any violation of expectations that characterizes invalid trials presented within a predictive "validity context".

The first study that emphasized the importance of isolating the neural correlates of the violation of expectancy carried by targets presented at unexpected spatial or temporal locations is a PET investigation by Nobre et al. (1999). Through the reanalysis of data from a previous study (Coull and Nobre, 1998), these authors compared brain activations recorded during phases of a Posner-like task in which the percentage of valid trials was 100% to those recorded in phases with 60% valid and 40% invalid trials. Invalid trials selectively engaged inferior parietal areas bilaterally, though with a larger activation in the right hemisphere, the right lateral premotor cortex and the inferior-orbitofrontal cortex bilaterally. The authors concluded that the response of the inferior-orbitofrontal cortex to invalid trials is driven by the violation of the expected cue-target congruency and that the parietal and premotor areas might instead be predominantly involved in more strictly spatial-attentional processes, such as the shift of the attentional focus. Although the experimental design did not include any comparison of brain activations related to blocks with frequent vs. infrequent invalid trials, i.e., trials implying identical spatial-attentional operations though carrying different degrees of unexpected violation of cue-target congruency, the points made by these authors were the first to set the terms of the spatial/expectancy components in reorienting.

A few years later, Giessing et al. (2004) reasoned that although event-related fMRI designs are more suitable to study transient neural processes related to infrequent events like invalid trials in a conventional endogenous Posner task, they can be insensitive to more sustained neural processes related to reorienting of attention. To isolate both transient and sustained neural components of reorienting, these authors used a hybrid event-related/block design arranging invalid trials within blocks having different ratios of valid vs. invalid trials: 100% valid–0% invalid, 75% valid–25% invalid and 50% valid–50% invalid. This allowed comparing both transient event-related (invalid vs. valid contrast independently of block type) and sustained (100% valid–0% invalid blocks vs. 50% valid–50% invalid blocks) brain activity related to reorienting. The study revealed common event- and block-related activations in the right intraparietal sulcus. Moreover, in the blocked design the activation of this area was positively correlated with the number of invalid trials in a block. The blocked design also revealed the activation of the right occipital-parietal junction whereas the event-related analysis isolated additional activations in the right superior parietal lobule and in the left intraparietal sulcus. Unfortunately, this study did not provide detailed results of the contrast between the 75% and 50% block validity conditions, which would have allowed a first look at the spatial and expectancy components of reorienting. At a behavioral level, this study reported the maintenance of the validity effect in blocks with non-predictive cues (50% valid–50% invalid); however, one should consider that this finding can still be accounted for by the general positive "validity context" of the task, since across the different types of validity blocks the majority of trials were valid (73.4%).

Capitalizing on behavioral evidence showing that the ratio of valid vs. invalid cues modulates the validity effect (Jonides, 1980, 1983; Eriksen and Yeh, 1985; Madden, 1992), Vossel et al. (2006) were the first to directly assess the influence of the validity context on neural activity related to attentional reorienting. These authors used an event-related design where, in each trial, differently colored central arrow-cues predicted with different validity the position of ensuing targets, e.g., green arrow with 90% validity vs. blue arrow with 60% validity. The comparison between trial-related brain activities (i.e., with no separation of cue- and target-related brain response) in the two validity conditions showed that infrequent invalid trials (i.e., 90% validity) determined greater activation in the right IFG and middle frontal gyrus (MFG) and in the right inferior parietal cortex (angular and supramarginal gyrus, plus the intraparietal sulcus). As in previous studies, the reaction time (RT) advantage for validly cued targets was larger in the high-validity condition (90%). The authors advanced alternative hypotheses for the heightened response in parietal and frontal areas observed for invalid trials presented in the high-validity context (90%). For the frontal areas, they proposed that infrequent invalid trials might be more unexpected and novel or might induce stronger prefrontal inhibition of motor responses until accomplishment of spatial reorienting. The parietal response to the validity context received two possible interpretations: (1) more demanding reorienting effort to infrequent invalid targets; (2) less frequent and thus more surprising violation of expectancy with infrequent invalid targets.

More recently, Shulman et al. (2009) investigated spatial vs. expectancy components of reorienting using a modified version of the shift/stay paradigm devised by Yantis et al. (2002). Participants had to detect a target-object that was presented within one of two visual streams of object-groups, one presented to the left and one to the right side of central fixation. At the beginning of each stream-trial, the stream to be attended was cued by a peripheral red box briefly appearing to the left or the right of fixation. Crucially, streams were presented within three different block-conditions: (1) in a first condition the probability that on ensuing stream-trials the red-cue shifted from one side of fixation to the other was high (86% shift cues) whereas the probability of remaining on the same side-stream was low (14% stay cues); (2) in a second condition the probability of occurrence of shift and stay cues was equal (50%); (3) in a third condition the probability of shifting attention from the left to right of fixation or vice versa was low (14% shift cues) while the probability of remaining on the same side-stream was high (86%). In this way reorienting of spatial attention to the peripheral red box was made orthogonal to the likelihood of operating reorienting. In other words, spatial reorienting and unexpectedness of reorienting were operationally dissociated among blocks with different shift/stay probabilities. Analysis of the blood-oxygen-level-dependent (BOLD) signal revealed that the right TPJ was significantly activated by shift cues independently of the likelihood of reorienting (note that this is at variance with previous data by Vossel et al. (2006), showing sensitivity of the right TPJ to the expectancy component of reorienting; see also below for a further discussion of this). In contrast, the response of the right IFG was strongly influenced by unexpected shifts of attention (High Stay/Low Shift cuing condition). 
The basal ganglia and the frontal insular region were also activated by unexpected shifts of attention; however, analysis of resting state connectivity demonstrated that this network was functionally separated from the classical TPJ-IFG ventral system. The authors proposed that the response of the basal ganglia-insular network might be linked to the change in the attentional set, or to higher inhibition of competing attentional processes elicited by unexpected peripheral cues triggering reorienting of attention. Finally, dorsal attentional areas in the IPS and FEF showed intermediate sensitivity to expectancy: according to the authors the increased response of these areas to unexpected shifts of attention was driven by expectedness-related responses in the right IFG and in the basal ganglia-insular network.

The studies summarized above investigated the influence of the "validity context" by examining brain responses related to the entire cue-target period or to peripheral stimuli that acted both as attentionally relevant stimuli and cues (Shulman et al., 2009). However, one important finding in the study of attention is that accumulated knowledge about the occurrence of previous stimuli (trial history) can influence the focusing of attention ahead of the occurrence of new relevant stimuli. Shulman et al. (2007) showed that the more probable target occurrence in a rapid visual stream becomes along the timeline, the more the TPJ is deactivated. In other words, the response of the attention-controlling networks is modulated according to the likelihood/predictability of a given target type occurring. With respect to studies using the Posner task with central cues, the observation by Shulman et al. (2007) highlights the importance of disentangling the influence of the validity context on cue- and target-related brain activity.

With this aim we recently reinvestigated the influence of cue predictiveness on cue- and target-related brain activity (Doricchi et al., 2010). Two groups of participants were considered: one performed an endogenous Posner task with highly predictive cues (80% valid–20% invalid trials) whereas the other performed the task using non-predictive cues (50% valid–50% invalid trials). Keeping a fixed cue-target probabilistic relationship during the entire task allowed avoiding the possible influence of strategic factors related to trial-by-trial or block-by-block changes in the probabilistic cue-target congruency. In the same study, we also introduced spatially-neutral cued trials to assess the influence of the validity context on attentional benefits (RT advantage to validly cued vs. neutrally cued targets) and costs (RT advantage to neutral vs. invalidly cued targets). We found that during endogenous orienting, predictive cues led to a greater deactivation of the right TPJ as compared to non-predictive ones. Since deactivation of the TPJ has been previously related to the filtering out of unattended events (Shulman et al., 2007), this finding suggests that when cues are not predictive and invalid targets frequent, the TPJ reduces the filtering out of uncued locations to facilitate reorienting. The study of target-related activation showed that, compared to valid targets, frequent and infrequent invalid targets equally activated the right TPJ, whereas infrequent invalid targets produced a stronger response in the right IFG and MFG: this shows that the TPJ is sensitive to the simple mismatch between cued and actual target location and that IFG-MFG are sensitive to the unexpectedness of such a mismatch. Both the left and right TPJ displayed no preference for targets presented in the ipsilateral or contralateral space (unpublished data). No effect of validity context was found in cue- and target-related responses of dorsal attentional areas (IPS and FEF). 
This seems at variance with the findings of Shulman et al. (2009). This difference, however, could be accounted for by the fact that in our study the validity context was stable, whereas in the study by Shulman et al. (2009) the predictiveness of peripheral shift/stay cues was randomly alternated between 14%, 50%, and 86% across the 20 blocks of trials. Finally, unlike the right IFG/MFG that was modulated according to expectancy (cf. infrequent invalid targets), the left TPJ and left IFG responded to frequent valid targets matching cue-related expectations. In agreement with these results, DiQuattro and Geng (2011) reported activation of the left TPJ and IFG in response to salient contextual stimuli predicting, i.e., matching, the concomitant task-related stimuli.

The comparison of brain and behavioral responses to valid and invalid targets with those to neutrally cued ones provided a number of additional functional and behavioral observations. Independently of cue predictiveness, valid targets activated the left TPJ, whereas invalid targets activated both the left and right TPJs. These findings suggest that the selective activation of right TPJ that is usually found in the direct comparison between invalid and valid trials (but see below for a number of studies reporting bilateral TPJ response to invalid targets) may result from a common response to both valid and invalid targets in the left TPJ. At the behavioral level, in the non-predictive condition the validity effect was reduced, though not abolished. This is important because it shows that even when the general validity context of an endogenous Posner task is constant and neutral, non-predictive cuing can still bias participants' attention. Even more importantly, the analyses of the benefits and costs showed, quite surprisingly, that the reduction of the validity effect in the non-predictive condition was entirely explained by a drop in the costs, whereas benefits were equivalent in the predictive and non-predictive conditions (it is worth noting that only participants showing a reliable validity effect, i.e., RT advantage to valid vs. invalid targets, during a training pre-test session with 80% valid cuing were included in the study). 
Altogether these findings show that: (a) the left TPJ hosts both neuronal populations coding the mismatch between cued and actual target location on invalid trials and cue-target matches on valid trials; (b) the right TPJ hosts neuronal populations coding the mismatch between cued and actual target location; (c) the validity context modulates TPJ activity during the cue period but shows no comparable influence on target-related brain response: this suggests that in fMRI studies investigating the influence of the validity effect it is advisable to disentangle cue- and target-related effects. Cue- vs. target-related effects also provide a possible explanation for the different findings between the study of Vossel et al. (2006), which reported sensitivity of the TPJ to the validity context, and those by Shulman et al. (2009) who did not report any such effect. In the study by Vossel et al. (2006) there was no separation between central/cue- and peripheral/target-related brain responses, whereas in the study by Shulman et al. (2009) only peripheral/target-related responses were studied. In our study (Doricchi et al., 2010), cue- and target-related brain responses were investigated separately. This showed that the right TPJ is sensitive to expectancy during the cue period, when it showed a greater deactivation for informative compared to non-informative cues, though not during the target period, when it showed equivalent levels of activity both in the low- and high-expectancy conditions. These results suggest that in Vossel et al. (2006) the expectancy effects in the right TPJ may have been partially due to cue-related activity, whereas the effects in the right IFG-MFG can be attributed to the sensitivity of these areas to target-related expectancy, i.e., greater responses to unexpected/infrequent invalid targets.
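
The decomposition of the validity effect into benefits and costs relative to neutrally cued trials can be made concrete with a minimal sketch. The reaction times below are hypothetical placeholders chosen only to illustrate the qualitative pattern described above (equal benefits, reduced costs with non-predictive cues); they are not data from any study, and the names `MeanRT` and `decompose` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class MeanRT:
    """Mean reaction times (ms) for one cueing condition; values are hypothetical."""
    valid: float
    neutral: float
    invalid: float


def decompose(rt: MeanRT) -> dict:
    """Split the validity effect into benefits and costs relative to neutral trials."""
    benefit = rt.neutral - rt.valid   # RT advantage of valid over neutral targets
    cost = rt.invalid - rt.neutral    # RT advantage of neutral over invalid targets
    # The validity effect (invalid minus valid RT) is the sum of the two terms.
    return {"benefit": benefit, "cost": cost, "validity_effect": benefit + cost}


# Illustrative numbers only, not taken from any experiment.
predictive = MeanRT(valid=350.0, neutral=380.0, invalid=420.0)
non_predictive = MeanRT(valid=350.0, neutral=380.0, invalid=385.0)

for label, rt in (("predictive", predictive), ("non-predictive", non_predictive)):
    print(label, decompose(rt))
```

With these placeholder values the validity effect shrinks in the non-predictive condition while the benefit term is unchanged, so the whole reduction is carried by the cost term.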

In two other studies we investigated the response pattern of the vFP system seeking to further dissociate pure stimulus-driven effects from expectation and task-relevance (Natale et al., 2009, 2010). In one study (Natale et al., 2010), we made use of an attention capture paradigm with non-predictive cues (Folk et al., 1992) to investigate whether task-irrelevant visual features associated with the target modulate activity in TPJ-IFG. The paradigm consisted of a classical exogenous cueing task, with a box briefly flashed in one or the other hemifield, shortly followed by a task-relevant visual target at the same (50% valid trials) or the opposite side (50% invalid trials). By virtue of this 1:1 ratio between valid and invalid cues, the experiment entailed a fully neutral "validity context". The target was a triangle presented within the box and the task of the subjects was to report whether the triangle was pointing upward or downward. Critically, here we manipulated the color of the targets (red or blue) and the color of the cues (red or green). Although the red/blue color of the target was fully task-irrelevant, at the behavioral level we found larger effects of cue-validity (invalid vs. valid trials) when the cue was red (set-relevant cues: same color as the target set) than when the cue was green (set-irrelevant cues: color not included in the target set). The imaging data revealed that invalid set-relevant red-cues activated the right TPJ, irrespective of target color, while invalid trials with set-irrelevant green-cues did not activate the TPJ. These results confirm that in the absence of any relationship between cues and targets (green cue + red/blue target), exogenous invalid cues do not trigger any spatial selection processes in the ventral attention system. However, it is sufficient that the cue shares some property with the current task-set (here, "redness") to activate the ventral network, even in the absence of any breach of expectation. 
Since this effect was specific for the invalid trials (vs. valid trials), we assumed that it arose at the onset of the target. However, because of the specific timing of the stimuli, we were unable to separately assess the effect of set-relevance on cue-related vs. target-related activity (i.e., targets were presented immediately after the cues) or to determine whether any differential pattern of de-activation compared to resting state activity played any role here (i.e., the range of inter-trial intervals was only 1.9–3.0 s; cf. Shulman et al., 2007). Nonetheless, these results highlighted that external signals (specifically, the spatial relationship between the cue and the target) and internal information about the current task-set jointly contribute to the re-orienting effects in the ventral attention system (see also Serences et al., 2005).

Another approach that enabled us to investigate the interplay between non-predictive signals and endogenous task-settings involved using a double-cue paradigm (Berger et al., 2005). The aim of a double-cue paradigm is to engage endogenous and exogenous attention control concurrently within the same trial, thus providing a direct measure of whether/how these two mechanisms interact with each other. Specifically, the presentation of a peripheral non-predictive cue after a predictive endogenous cue enables studying the effect of a fully task-irrelevant and non-predictive stimulus (i.e., the exogenous cue) presented under conditions of top-down focused attention. This may be important when considering that the typical situation leading to activation of TPJ-IFG involves shifting attention from a relevant location (e.g., that signaled by a predictive cue) to another relevant location (i.e., the position of an invalidly cued target). On each trial we presented first an endogenous predictive cue, then an exogenous peripheral cue and, finally, the visual target (Natale et al., 2009). The results highlighted activation of the vFP network when the endogenous cue was invalid, irrespective of the validity/invalidity of the exogenous cues. This demonstrated that this network does not process task-irrelevant and non-predictive stimuli, even when attention has been endogenously focused and the task-irrelevant stimuli provide spatial information that mismatches the current spatial expectations (i.e., trials including "invalid" exogenous cues). These results support the view that the ventral attention system is involved in stimulus-driven updating of spatial expectations only when the stimulus (e.g., a task-relevant target or a set-relevant cue) signals a "new" location that is potentially relevant.

To summarize, there is now ample evidence that activity in the vFP system reflects some interplay between stimulus-driven factors (e.g., the onset of an unexpected stimulus) and other endogenous/internal constraints. Mere bottom-up stimulus onset is insufficient to activate this network (e.g., Kincade et al., 2005; Indovina and Macaluso, 2007), while the specific relationship between the characteristics of the external signal and the internal goals/expectations plays a pivotal role for the activation of this system (e.g., see Corbetta et al., 2008, for review; Natale et al., 2010). This relationship may be relatively direct, e.g., the stimulus requires some overt judgment/response despite breaching current expectations (e.g., a task-relevant invalid target, following a predictive cue); or can be more subtle, e.g., the stimulus is task-irrelevant but still taps into task constraints that are currently relevant (e.g., contingent capture paradigms).

## **INTEGRATION OF ENDOGENOUS AND EXOGENOUS SIGNALS: OPEN ISSUES AND FUTURE WORK**

#### **HEMISPHERIC LATERALIZATION AND THE SPATIAL NEGLECT SYNDROME**

Despite the common assumption that mechanisms of attentional reorienting are predominantly, if not exclusively, housed in the right hemisphere (Shulman et al., 2010), a number of studies have reported bilateral rather than right unilateral activation of the TPJ in response to unexpected and invalid targets (Serences et al., 2005; Asplund et al., 2010). Through comparisons with neutral trials, we have recently shown that the left TPJ activation to invalid targets is often missed because this area responds both to invalid and valid targets (Doricchi et al., 2010). Here, we wish to emphasize that these effects in the left hemisphere can help in understanding the reorienting deficits in neglect patients. The observation that the regions of the right ventral attention network (TPJ and IFG-MFG) that are activated by reorienting towards both sides of space are also the most frequently damaged in patients with left spatial neglect (Doricchi and Tomaiuolo, 2003; Mort et al., 2003; Bartolomeo et al., 2007; He et al., 2007; Doricchi et al., 2008; Verdon et al., 2010) was taken as evidence to explain the higher incidence of neglect after right brain damage (Corbetta et al., 2008). However, assuming an exclusive competence of the right ventral network in reorienting towards both sides of space would imply that damage to this network should cause a bilateral reorienting deficit. At variance with this prediction, neglect patients show severely impaired reorienting towards invalid targets in the contralesional left side of space, but no comparable ipsilesional deficits (Posner et al., 1984; Morrow and Ratcliff, 1988; Friedrich et al., 1998; Losier and Klein, 2001; Vossel et al., 2010; Rengachary et al., 2011). A possible explanation of this is that the residual ipsilesional reorienting abilities in neglect patients are based on the reorienting response of the intact left hemisphere. 
The competence of the left ventral network in detecting both cue-target matches on valid trials and cue-target mismatches on invalid trials may also explain the preserved ability of neglect patients in representing and exploiting the statistical contingency governing the spatial distribution of attentionally relevant stimuli (Bartolomeo et al., 2001; Geng and Behrmann, 2002). We advise reconsidering hemispheric lateralization during spatial reorienting, as this will potentially refine our understanding of pathologies associated with deficits of spatial attention.

#### **EARLY OR LATE ATTENTIONAL FUNCTION FOR THE TPJ?**

Electrophysiological investigations suggest that the TPJ plays a crucial role in the late phases of attentional processing, when it is believed to generate the P300b component signaling the match or mismatch between actual and predicted sensory input (Knight and Scabini, 1998). Capitalizing on this evidence we have proposed (Doricchi et al., 2010) that the response of the right and left TPJ to invalid targets may reflect the activity of a late-processing "MisMatch" system and the response of the left TPJ to valid targets that of a complementary "Match" system. Both systems would provide signals that help the brain update the internal models of statistical cue-target congruency which, in turn, would help in maintaining or switching the attentional task-set and in building up predictions about the position of upcoming targets. DiQuattro and Geng (2011) have recently provided evidence showing that one important function of the left hemispheric "Match" system concerns the processing of salient visual contextual cues that regularly predict, i.e., match, the position of concurrent but less salient targets. Interestingly, while the reorienting of attention after invalid central cues does activate the right (and left) TPJ, reorienting after invalid exogenous peripheral cues does not (e.g., Kincade et al., 2005). Based on our interpretation of the role of TPJ, this finding suggests that a key difference between exogenous and endogenous orienting is that only in the latter case a template of the expected target can be prepared during the cue period and then compared, through a Match/MisMatch process, with the actual target.

However, in a recent fMRI study Vossel et al. (2012) provided evidence challenging the idea that spatial reorienting is first initiated in dorsal parietal areas and that the TPJ comes into play only in a late phase of target processing. These authors used dynamic causal modeling to characterize the effective connectivity within and between the ventral and dorsal attentional systems during orienting to valid targets and reorienting to invalid targets. One key finding of this study was that invalid cuing enhanced connectivity from the ventral right TPJ to the dorsal right IPS rather than vice versa. The authors concluded that the violation of the expected cue-target congruency signaled by the TPJ may precede and help the reorienting-related shift of spatial attention governed by dorsal areas.

The notion that the allocation of processing resources involves some comparisons between expectations stored in "internal models" and the incoming sensory input (cf. Match/MisMatch systems, above) bears some relationship with recent proposals concerning the role of "predictive coding" in visual processing (Rao and Ballard, 1999). These postulate a hierarchical organization of processing where higher-order nodes represent the expected signal and inform lower-level nodes about this prediction (top-down influences). Upon stimulation, the input is compared with the predictions and any resulting error is fed forward in order to update the internal model. Such architectures have been used to explain several visual phenomena (e.g., extra-classical receptive fields, Rao and Ballard, 1999) and, more recently, expectation-related effects in visual attention (Spratling, 2008; see also Summerfield and Egner, 2009, for review; Feldman and Friston, 2010). In this context, low probability invalid trials may generate a prediction error, possibly with an additional update of an internal model that would keep track of the probabilistic relationship between the positions signaled by the cues and where the targets are actually presented. Nonetheless, we should note that a central tenet of predictive coding concerns the hierarchical organization of processing, which appears suitable to describe attention-related effects within occipital areas (e.g., Spratling, 2012, who implemented predictive coding to model saliency-related effects in primary visual cortex) or between high-order parietal (frontal) regions and visual areas (see also den Ouden et al., 2010, for a possible role of sub-cortical structures); but may be less appropriate to explain interactions between the dorsal and the vFP networks or between the left and the right TPJ.
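
The idea that infrequent invalid trials generate larger prediction errors, which in turn update an internal estimate of cue validity, can be illustrated with a toy sketch. This is a simple delta-rule learner used here as a stand-in, not a predictive coding implementation from any of the cited studies; the function name `run_block`, the learning rate and the trial counts are all assumptions made for illustration.

```python
import random


def run_block(p_valid: float, n_trials: int = 2000, lr: float = 0.05, seed: int = 0) -> float:
    """Return the mean absolute prediction error on invalid trials.

    The learner keeps a running estimate of P(valid) and nudges it toward
    each observed outcome (a toy stand-in for an internal model).
    """
    rng = random.Random(seed)
    estimate = 0.5                      # initial belief: the cue is uninformative
    errors_on_invalid = []
    for _ in range(n_trials):
        outcome = 1.0 if rng.random() < p_valid else 0.0   # 1 = valid, 0 = invalid
        error = outcome - estimate      # prediction error for this trial
        if outcome == 0.0:
            errors_on_invalid.append(abs(error))
        estimate += lr * error          # update the internal model
    return sum(errors_on_invalid) / len(errors_on_invalid)


high = run_block(p_valid=0.8)      # predictive context (hypothetical 80% validity)
neutral = run_block(p_valid=0.5)   # neutral context (50% validity)
print(f"mean |error| on invalid trials: 80% context={high:.2f}, 50% context={neutral:.2f}")
```

In this sketch an invalid target is more surprising when the learner has come to expect valid trials, which mirrors the qualitative claim above without committing to any particular hierarchical architecture.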

In summary, the interplay between dorsal areas adjacent to the IPS and the ventral TPJ appears to be an important challenge for future studies. These will have to take carefully into account that a given area can show both fast and slow responses to invalid targets (Chambers et al., 2004) and that the interaction between different visual-attentional areas can be characterized by a complex reciprocal exchange of feedback/feedforward signals. In addition, it would be of interest to investigate the role played by the ventral and dorsal attentional networks when the statistical predictiveness of cue stimuli is updated and exploited to anticipate upcoming target events. Available evidence allows us to hypothesize that the ventral attentional system may perform a dynamic, trial-by-trial evaluation of the cue-target contingency and that the results of this evaluation are fed to dorsal areas. In turn, dorsal regions would make use of this information to update higher-order salience/priority maps that, via top-down control, modulate the processing of incoming signals in occipital sensory areas. This may be tested using TMS perturbations of the TPJ and dorsal parietal regions at different intervals, following the presentation of cues with variable validity/predictiveness.

## **ONE OR MULTIPLE FILTERS FOR THE ALLOCATION OF ATTENTIONAL RESOURCES?**

We reviewed several studies showing that the validity context modulates the validity effect: the higher the statistical predictiveness of the cues, the higher the RT advantage for valid as compared to invalid targets. This relationship has been modeled as the result of a single-filtering operation, where the ratio between the accumulation of attentional resources at the attended position and the withdrawal of resources from unattended locations is positively correlated with the predictiveness of the cues. However, we have recently found that the reduction of the validity effect with non-predictive endogenous cues (i.e., neutral "validity context") can be selectively driven by the abatement of attentional costs, whereas benefits can be maintained with both predictive and non-predictive cuing (i.e., with both positive and neutral "validity contexts"). In an ensuing ERP study (Lasaponara et al., 2011) we found that the abatement in attentional costs is matched by the disappearance of differences in the amplitude of the P1 wave evoked by invalidly vs. neutrally cued targets (see Luck et al., 1994). By contrast, the maintenance of benefits in predictive and non-predictive cuing conditions is paralleled by a larger N1 amplitude in response to validly vs. neutrally cued targets. This difference between the effect of predictiveness on costs and benefits cannot be easily accounted for by a single filtering mechanism. A linear relationship between the level of cue predictiveness and the ratio between the activation of the cued vs. the uncued location would predict a symmetrical reduction of benefits and costs with non-predictive cuing. A more articulated interpretation is needed (Lasaponara et al., 2011).
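The distinction between costs and benefits can be made concrete with a small calculation. The sketch below assumes the conventional decomposition against a neutral-cue baseline (benefit = neutral RT minus valid RT, cost = invalid RT minus neutral RT). The function name and the reaction times are invented for illustration and are not data from the studies discussed above.

```python
def validity_components(rt_valid, rt_neutral, rt_invalid):
    """Split the validity effect into benefit and cost (all RTs in ms).

    benefit = speed-up for validly cued targets relative to neutral cues
    cost    = slow-down for invalidly cued targets relative to neutral cues
    The overall validity effect (invalid - valid) is their sum.
    """
    benefit = rt_neutral - rt_valid
    cost = rt_invalid - rt_neutral
    return {
        "benefit": benefit,
        "cost": cost,
        "validity_effect": benefit + cost,
    }


if __name__ == "__main__":
    # Hypothetical mean RTs (ms), chosen only to show an asymmetric pattern:
    # the benefit is preserved while the cost shrinks with non-predictive cues.
    predictive = validity_components(rt_valid=300, rt_neutral=330, rt_invalid=370)
    non_predictive = validity_components(rt_valid=300, rt_neutral=330, rt_invalid=335)
    print("predictive cues:    ", predictive)
    print("non-predictive cues:", non_predictive)
```

With these made-up numbers the validity effect halves (70 ms to 35 ms) entirely because the cost drops, which is the kind of asymmetry that a single filter scaling costs and benefits together would not produce.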

At a neurophysiological level, visual-spatial orienting is regulated by the combined action of different, though functionally related, pools of visual, visuomotor and saccadic neurons within the dorsal fronto-parietal network (e.g., in the intraparietal cortex: Ipata et al., 2006; Thomas and Paré, 2007; Superior Colliculus: McPeek and Keller, 2002; and in the FEF: Bruce and Goldberg, 1985; Schall and Hanes, 1993; Hanes and Schall, 1996; Thompson et al., 1996, 1997; Sommer and Wurtz, 1998; Bisley and Goldberg, 2003; Juan et al., 2004; Schall, 2004; Thompson et al., 2005). This allows us to hypothesize synergic filtering mechanisms, implemented via anatomically separated but functionally related pools of neurons. Some degree of independence between these pools can provide us with an explanation for the differential effect of the "validity context" on benefits and costs. For example, in the case of non-predictive cuing, the activation of visuomotor responses directed toward the cued location could guarantee the preservation of attentional benefits, whereas concomitant visual selection of both cued and uncued locations might help reduce attentional costs. In humans, some segregation between mechanisms of visual and visuomotor selection was demonstrated in cortical areas traditionally involved in saccadic programming. Muggleton et al. (2003) showed that the inactivation of the FEF by TMS slows down the selection of poorly salient or non-predictable visual targets in visual search tasks (see Thompson et al., 1997, for equivalent neurophysiological findings in the macaque). These observations highlight the need to explore further the complex and multi-stage mechanisms that regulate the strategic allocation of attentional resources in space.

## **CONCLUSION**

Traditional views of attention control posit a distinction between endogenous control in dorsal fronto-parietal regions and stimulus-driven control in vFP areas. However, in recent years, such a strict dichotomy has been challenged. Here we reviewed evidence that the dorsal system makes use of bottom-up salient signals to select relevant elements in complex environments; and that the processing of external stimuli in the ventral system takes into account endogenous factors associated with the experimental context (e.g., task-set, expectations, predictiveness). We emphasize that attention control must pick up statistical ir-/regularities of the environment and integrate these with on-line information about the current sensory input. This interaction determines the selection of the most relevant stimuli and governs the allocation of the attentional resources. At the physiological level, this is likely to require some interplay between the dorsal and the ventral attention systems. We propose that the ventral system performs moment-to-moment match/mismatch operations, comparing current expectations/predictions with the actual sensory input. The result of these operations leads to a continuous update of the expectations and predictions, which the dorsal system utilizes to control the allocation of spatial attention.

## **ACKNOWLEDGMENTS**

The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Program (FP7/2007-2013) / ERC grant agreement 242809 to Emiliano Macaluso; and from the Italian Ministry of Health to Fabrizio Doricchi (grant RF10.091). The Fondazione Santa Lucia is supported by the Italian Ministry of Health.

## **REFERENCES**


*Curr. Opin. Neurobiol.* 3, 171–176. doi: 10.1016/0959-4388(93)90206-e

Arrington, C. M., Carr, T. H., Mayer, A. R., and Rao, S. M. (2000). Neural mechanisms of visual attention: object-based selection of a region in space. *J. Cogn. Neurosci.* 12(Suppl. 2), 106–117. doi: 10.1162/089892900563975

Asplund, C. L., Todd, J. J., Snyder, A. P., and Marois, R. (2010). A central role for the lateral prefrontal cortex in goal-directed and stimulus-driven attention. *Nat. Neurosci.* 13, 507–514. doi: 10.1038/nn.2509

task. *Cereb. Cortex* 20, 1574–1585. doi: 10.1093/cercor/bhp215

*Sci.* 13, 520–525. doi: 10.1111/1467-9280.00491

430. doi: 10.1126/science.274.5286.427

47, 1790–1798. doi: 10.1016/j.neuropsychologia.2009.02.015

meta-analysis of executive components of working memory. *Cereb. Cortex* 23, 264–282. doi: 10.1093/cercor/bhs007

activation of frontal, parietal, and sensory regions underlying anticipatory visual spatial attention. *J. Neurosci.* 31, 13880–13889. doi: 10.1523/jneurosci.1519-10.2011

identified in the activity of macaque frontal eye field neurons during visual search. *J. Neurophysiol.* 76, 4040–4055.

involvement in visual attention. *Neuron* 23, 747–764. doi: 10.1016/s0896-6273(01)80033-7


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 May 2013; accepted: 29 September 2013; published online: 21 October 2013.*

*Citation: Macaluso E and Doricchi F (2013) Attention and predictions: control of spatial attention beyond the endogenous-exogenous dichotomy. Front. Hum. Neurosci. 7:685. doi: 10.3389/fnhum.2013.00685*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Macaluso and Doricchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The dorsal attentional system in oculomotor learning of predictive information

*Philip Tseng<sup>1,2</sup>, Chi-Fu Chang<sup>1,2</sup>, Hui-Yan Chiau<sup>1,2</sup>, Wei-Kuang Liang<sup>1,2</sup>, Chia-Lun Liu<sup>1,2</sup>, Tzu-Yu Hsu<sup>1,2</sup>, Daisy L. Hung<sup>1,2</sup>, Ovid J. L. Tzeng<sup>1,3</sup> and Chi-Hung Juan<sup>1,2</sup>\**

*<sup>1</sup> Institute of Cognitive Neuroscience, National Central University, Jhongli, Taiwan*

*<sup>2</sup> Laboratories for Cognitive Neuroscience, National Yang-Ming University, Taipei, Taiwan*

*<sup>3</sup> Institute of Linguistics, Academia Sinica, Taipei, Taiwan*

#### *Edited by:*

*Joy Geng, University of California Davis, USA*

#### *Reviewed by:*

*Britt Anderson, Brown University, USA; Amanda Ellison, Durham University, UK*

#### *\*Correspondence:*

*Chi-Hung Juan, Institute of Cognitive Neuroscience, National Central University, No.300, Jhongda Rd. Jhongli City 320, Taiwan e-mail: chijuan@cc.ncu.edu.tw*

The dorsal attentional network is known for its role in directing top-down visual attention toward task-relevant stimuli. This goal-directed nature of the dorsal network makes it a suitable candidate for processing and extracting predictive information from the visual environment. In this review we briefly summarize some of the findings that delineate the neural substrates that contribute to predictive learning at both levels of the dorsal attentional system: the frontal eye field (FEF) and the posterior parietal cortex (PPC). We also discuss the similarities and differences between these two regions when it comes to learning predictive information. The current findings from the literature suggest that the FEFs may be more involved in top-down spatial attention, whereas the parietal cortex is involved in processing task-relevant attentional influences driven by stimulus salience; both contribute to the processing of predictive cues at different time points.

**Keywords: probability, predictability, visual attention, eye movements, TMS, transcranial magnetic stimulation**

Regularities in the visual environment are predictive of future events. Learning of such regularities may therefore help the visual system generate anticipatory activities (i.e., expectations) and reduce computational burden. Take traffic lights, for example: there is a serial order in which the signals light up (temporal), as well as the color (feature-based) and location (spatial) of each light, thus reflecting regularities in several domains of one's daily life. Such information is fully predictive of future events if it is always 100% valid, as with the traffic light. But even with partial predictive power (not always 100% valid), which we refer to as probabilistic, it remains advantageous to pick up such information, because knowledge of any regularity allows future events to be anticipated and better managed in advance, reducing the computational load on the visual system. This learning of predictive information is especially useful given that our ability to represent multiple objects within a scene at any given moment is rather limited (e.g., Buschman et al., 2011; Tsubomi et al., 2013). Indeed, many studies have already demonstrated the visual system's capability to learn and exhibit knowledge of regularities in the environment with and without subjective awareness (e.g., Chun and Jiang, 1998; Fiser and Aslin, 2001; Kristjánsson et al., 2001; Nakayama et al., 2004; Geng and Behrmann, 2005). Eye movement studies have also shown that people can direct their eyes and attention to highly probable locations faster than to low-probability locations without employing an explicit strategy to do so (e.g., Farrell et al., 2009). In this paper, we review evidence demonstrating the existence of predictive learning as well as its possible neural mechanisms, such as the dorsal attentional network.

The dorsal attentional network comprises the frontal eye fields (FEF) and the posterior parietal cortex (PPC), where PPC includes the superior parietal lobule (SPL) and intraparietal sulcus (IPS). The dorsal network has been shown to be involved in voluntary top-down orienting of visual attention. This network shows strong activity when a spatial cue is presented, explicitly indicating where participants should direct their attention (Corbetta et al., 2008). However, it is important to note that even in the absence of a cue, contextual information, if predictive, can sometimes work in a similar way to a spatial cue in directing one's attention. For example, visual search with a repetitive and predictive distractor layout can promote implicit learning and efficient attentional orienting to the target location in observers over time, a phenomenon known as contextual cuing (Chun and Jiang, 1998). In this case, the contextual information (i.e., distractor configuration) is predictive of the target location, thus functioning like a probabilistic spatial cue in directing visual attention, and thereby activating the dorsal network (Manginelli et al., 2013). Therefore, the dorsal network has been suggested to mediate the processes of top-down attentional set, such as the expectation of a cue (Corbetta et al., 2008). This pre- and post-stimulus activation of the dorsal network can facilitate efficient selection of, and orienting toward, the stimuli of interest.

## **POSTERIOR PARIETAL CORTEX**

The idea that one's visual attention is efficiently oriented toward predictive and salient cues in the environment is plausible, because one of the functions of PPC is indeed attentional orienting and capture. Specifically, the right (hemisphere) PPC has been associated with a variety of functions in visual attention (Behrmann et al., 2004; Rushworth and Taylor, 2006), including attentional control (Nobre et al., 1997; Ellison and Cowey, 2006; Morris et al., 2007), updating spatial mapping (Andersen et al., 1985; Merriam et al., 2003; Morris et al., 2007), and shifting spatial attention (Ellison et al., 2003; Chambers et al., 2004; Constantinidis and Steinmetz, 2005; Mevorach et al., 2006; Schenkluhn et al., 2008). Damage to the right PPC may lead to hemifield neglect, where patients become unaware of the contralateral visual space although their visual acuity is not affected (Heilman and Van Den Abell, 1980; Mesulam, 1981). In the contextual cuing example above, it is important to note that the predictive context is facilitative because it efficiently directs gaze toward the target location (Neider and Zelinsky, 2006; also see Kunar et al., 2007, for an effect in response selection). Indeed, Peterson and Kramer (2001) monitored participants' eye movements and found that fewer saccades were made when the context was repeated, suggesting a more efficient allocation of attention in the old (repeated) condition. Thus, predictive contextual information, even with uneven probability (Tseng et al., 2011), can have a direct impact on visual attention and eye movements, suggesting a critical involvement of the dorsal attentional network that is also heavily involved in oculomotor control. Indeed, Schankin and Schubo (2009) measured event-related potentials (ERP) using this paradigm and found greater negativity in the posterior parietal region around 200 ms after display onset.
This component is known as the N2pc (Luck and Hillyard, 1994), which reflects the allocation of visual attention in the parietal region that is contralateral to the visual field of the attended stimulus. This physiological evidence demonstrates an important correlation between attentional allocation and PPC. In contextual cuing, however, there exists a confound between distractor saliency and predictability, so the exact role of PPC remains unclear. That is, it is unclear whether the activation of PPC was due to the bottom-up salience of the distractors, or to their predictive nature that matched the top-down task set. To dissociate these two factors, one needs a situation where salient but nonpredictive distractors compete with the target and therefore need to be suppressed. To address this question, one important study by Geng and Mangun (2008) used a visual search paradigm in which the search target appeared with either a low-contrast (low competition) or a high-contrast (high competition) distractor. These authors found that target-induced aIPS (part of PPC) activation scaled linearly with increasing RT, especially when the distractor was salient. Thus, PPC activation seemed to be coupled with salient distractors at the perceptual level. A subsequent study by Mazaheri et al. (2011) recorded EEG activity using the same paradigm. Critically, in the 1-s pre-stimulus time window before each high-competition trial, these authors found higher alpha activity from PPC on trials where the target was correctly identified by the first saccade. Based on the wealth of literature that suggests alpha activity as an indication of cortical inhibition (e.g., see Klimesch, 2012, for a review), the results suggest that PPC alpha was indicative of suppression of salient distractors, consistent with the conclusion by Geng and Mangun (2008) that PPC is involved in processing salient perceptual information.

Another direct test of this idea was carried out by Chao et al. (2011), who combined transcranial magnetic stimulation (TMS) with a similar 1-target and 1-distractor visual orienting paradigm (see also Hodsoll et al., 2009) in which the display was preceded by either a spatially predictive or a neutral cue. In this study, saccade curvature, along with saccade latency and accuracy, was recorded, as curvature has been shown to be indicative of the active excitation and suppression processes that the visual system must resolve between the target and the distractor (e.g., Walker et al., 2006; McSorley et al., 2009). Not surprisingly, people could saccade faster and with more precision and smaller curvature (less "pulling" effect from the distractors) when the target location was predictable. The pattern became interesting when TMS was applied over the right PPC. When target location was predictable, PPC TMS had no significant effect on saccade curvature. However, when target location was unpredictable, TMS over the right PPC decreased saccade curvature toward the distractor, such that impaired PPC functioning actually strengthened distractor suppression (**Figure 1**). These results suggest that PPC plays an important role in attentional capture (which is why PPC TMS would decrease attentional capture by the distractor), and a subsequent study has shown that this effect operates by modulating the torque of each eye movement (Liang and Juan, 2012). Together, the alpha inhibition and TMS suppression of PPC from Mazaheri et al. (2011) and Chao et al. (2011) demonstrate that PPC is sensitive to salient distractors, and is indirectly sensitive to predictive information when predictability also becomes a salient feature.

**FIGURE 1 | Effect of TMS on PPC in predictable and unpredictable contexts, from Chao et al. (2011).** The Y axis denotes the range of saccade curvatures, where negative numbers indicate less curvature toward the distractor. Panel **(A)** shows that PPC TMS decreased saccade curvature toward the distractor when distractor location is unpredictable, whereas Panel **(B)** shows that PPC TMS had no effect when distractor location can be predicted in advance. This suggests a critical role for the right PPC in attentional capture, and how predictability can modulate PPC involvement.

Without a salient distractor, is PPC still involved in predictive processing? As mentioned, research has shown that as long as predictability is a relevant aspect of the task, PPC should also play an important role in processing targets and predictive information. For example, a TMS study by Ellison et al. (2003) found that TMS over PPC disrupted performance in visual search when participants had to decide whether a single item presented was a target or not. This impairment, however, was only present when target location was unpredictable. Further support comes from an fMRI study by Kristjánsson et al. (2007), who found decreased PPC activation when target form or location was known ahead of time via priming, suggesting that PPC is sensitive to predictability and unpredictability beyond salience competition between target and distractors.

Here it is important to emphasize again the goal-directed nature of the dorsal network when interpreting these results regarding PPC. Specifically, PPC does not respond to all perceptually salient distractors, but only to those that are relevant to the current task set. In one important fMRI study, Downar et al. (2001) instructed participants to monitor a change in either the visual or the auditory modality. These authors observed increased activation in right PPC when a change occurred, but only when the change happened in the modality that was relevant to participants' current behavior. Indeed, monkey neurophysiology has already shown that the lateral intraparietal cortex (LIP, equivalent to the vicinity of human PPC) encodes behaviorally salient objects (Gottlieb et al., 1998), desirable actions (Dorris and Glimcher, 2004), and the color of a cue if it is associated with an eye movement (Toth and Assad, 2002). In addition, target predictability, when leading to reward (i.e., behaviorally salient), can have a significant effect in modulating LIP activity, as LIP can represent predictive information such as the weighted likelihood of certain shape combinations (Konen et al., 2004; Yang and Shadlen, 2007). Together, it is plausible to conclude from these studies that manipulations of psychological constructs such as salience (Geng and Mangun, 2008), predictability (Chao et al., 2011), reward (Yang and Shadlen, 2007), desirability (Dorris and Glimcher, 2004), and behavioral relevance (Toth and Assad, 2002; Muggleton et al., 2011) are essentially similar in nature. That is, PPC activation is observed as long as a stimulus, be it a target or distractor, closely matches one's current task/goal. This idea may help generalize the nature of PPC activation across different contexts.
As such, PPC is not only crucial to spatial attention, but can also be involved in processing predictive information when such information is perceptually or behaviorally relevant, or salient.

## **FRONTAL EYE FIELDS**

On the other end of the dorsal attentional network is FEF. As its name implies, much of the work on FEF has been devoted to oculomotor control and visual attention, although studies have begun to document FEF involvement in other domains of cognition such as inhibitory control (e.g., Muggleton et al., 2010). Early studies such as Miller (1988) examined the effects of absolute and relative target position. A target letter was to be detected in a sequence of four letters in which one location had a higher probability of containing the target. The letter sequence was occasionally offset in horizontal position to probe whether the effects of high probability were dependent on absolute position or on the position in the sequence (relative position). It was found that target detection benefited from location probability under both types of spatial relationship. A subsequent study by Kingstone and Klein (1991) also demonstrated that people can be sensitive to the likelihood that a specific stimulus form would appear in a particular spatial location.

More recently, Geng and Behrmann (2005) investigated the role of targets' spatial probabilities in a visual conjunction search task, combined with endogenous and exogenous cues. These authors found that spatial probability indeed induced an implicit facilitation of attentional orienting. Most importantly, the facilitation from spatial probability was purely additive to the effect of the explicit endogenous cue, but interacted with the salient exogenous cue. Thus, the effect of probability in visual selection seems to occur early, at the stage where salient exogenous cues are processed. To further investigate this issue, Liu et al. (2010) used a similar setup while recording participants' eye movements. In this version of the task, participants responded with pro- and antisaccades, which refer to eye movements toward (pro) and away from (anti) the target (**Figure 2A**). Prosaccades have consistently been found to have shorter RTs than antisaccades, owing to the extra stages of suppression required in antisaccades (a difference termed the antisaccade cost; Everling and Fischer, 1998; Kristjánsson, 2007; Kristjánsson et al., 2001; Kristjánsson et al., 2004; Olk and Kingstone, 2003; Munoz and Everling, 2004), and therefore the magnitude of the antisaccade cost provides a suitable measure of the modulating effect that probability may have on oculomotor control. These authors found that under these conditions, prosaccades to the probabilistically-salient location became faster. The size of the antisaccade cost also changed to complement the magnitude of prosaccade probability. Most importantly, the saccade RTs followed the magnitude of probability saliency such that the RTs decreased gradually as the probability of a certain location increased, and vice versa (**Figure 2B**). Together, these results suggest that the oculomotor system is sensitive to multiple levels of spatial probability. It is important to note that both studies, by Geng and Behrmann (2005) and Liu et al. (2010), accounted for the effect of repetition priming by taking trial-to-trial RT facilitation into account. While the effect of repetition priming is undoubtedly present, the effect of spatial probability is independent of it.
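The antisaccade cost used as a dependent measure here is simply the difference between antisaccade and prosaccade reaction times. The sketch below shows how that cost could be tabulated across probability levels. The function names are hypothetical, and the saccadic RTs are invented for illustration: they assume that prosaccades get faster with probability while antisaccades stay roughly constant, and they are not data from Liu et al. (2010).

```python
def antisaccade_cost(pro_srt_ms, anti_srt_ms):
    """Antisaccade cost: how much slower antisaccades are than prosaccades."""
    return anti_srt_ms - pro_srt_ms


def cost_by_probability(srt_table):
    """Compute the antisaccade cost for each location-probability level.

    srt_table maps probability level -> (mean prosaccade SRT, mean
    antisaccade SRT), both in ms. Results are ordered by probability.
    """
    return {
        probability: antisaccade_cost(pro, anti)
        for probability, (pro, anti) in sorted(srt_table.items())
    }


if __name__ == "__main__":
    # Hypothetical mean SRTs (ms) per prosaccade location probability.
    illustrative_srts = {
        0.25: (260, 330),
        0.50: (245, 330),
        0.75: (230, 331),
    }
    for probability, cost in cost_by_probability(illustrative_srts).items():
        print(f"p={probability:.2f}  antisaccade cost={cost} ms")
```

Under these assumed values the cost grows with location probability only because the prosaccade term shrinks, which is why the cost can serve as an index of the probability effect on prosaccades.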

**FIGURE 2 |** … 2011). Participants were shown a central disc that cued either a prosaccade or antisaccade. This paradigm is able to manipulate levels of probability in prosaccade locations but not antisaccade locations. Behavioral results suggested that these two types of saccades can indeed be dissociated, since the effect of probability in prosaccades is not transferred to the same location in an antisaccade. **(B)** Effect of spatial probability on antisaccade cost and SRT from Liu et al. (2010, 2011). This figure shows how the magnitude of antisaccade cost correlates linearly with the level of prosaccade probability. This is because the prosaccades are facilitated by spatial probability while antisaccade SRT remained relatively similar, thereby creating bigger discrepancies between the two SRTs (antisaccade cost).

The neural mechanism behind such spatial probability learning in oculomotor behaviors likely involves FEF, as it tends to produce pretarget-related neural activity during saccade selection and saccade preparation (Schall et al., 1995; Thompson et al., 1996, 1997; Schall, 1997; Bichot and Schall, 2002; Sato and Schall, 2003; Juan et al., 2004, 2008). Note, however, that the left and right FEF have distinct functions in mediating oculomotor control. Studies have shown that the left FEF behaves much like the right PPC, in that it is mostly involved when target location is unpredictable. One notable study by Lane et al. (2012) manipulated spatial predictability via spatial priming in a visual search task while applying TMS over either left FEF, right FEF, or right PPC. These authors found that when target location was predictable, TMS only increased reaction time when it was applied over the right FEF, and not the left FEF or the right PPC. Their findings suggest that left FEF and right PPC are only involved when target location is unpredictable (also see Campana et al., 2007), whereas the right FEF is more involved in top-down visual attention that treats predictability in a task-driven manner. In addition, using the same orienting paradigm and manipulation of probability as their previous study (Liu et al., 2010), Liu and colleagues (2011) applied theta burst TMS over participants' right FEF or supplementary eye fields (SEF) for 20 s (**Figure 3A**) and found that FEF TMS, but not SEF TMS, successfully disrupted the effect of probability such that high-probability prosaccades became slower when TMS was applied. These results suggest that right FEF, but not SEF, is critical to the learning of spatial probabilities<sup>1</sup> in this orienting paradigm (**Figure 3B**). Furthermore, when a target becomes less predictable, the top-down effort in search of the new target also requires heavy FEF involvement, presumably because the neuronal buildup has to start over toward a new target that is either located at a new location or defined by one or more new features. As such, TMS over rFEF when the target suddenly becomes unpredictable (Muggleton et al., 2003) or is no longer defined by an old set of features (Muggleton et al., 2011) will also impair saccade latency. Together, these studies suggest that rFEF is critical to the processes of target selection either via pretarget neuronal buildup (when the target is predictable) or top-down attentional orienting (when the target is unpredictable). Indeed, TMS over rFEF at early and late time points during an antisaccade task revealed that FEF involvement occurs at early (target selection) and late (endpoint selection) stages (**Figure 4**), both of which are necessary components for mediating the effect of probability (Juan et al., 2008).
In addition, preparatory activity can be found in the FEF of trained monkeys in an antisaccade task, where the endpoint of the probable saccade type enjoys a lower threshold or earlier neural activity buildup (Dorris and Munoz, 1998; Dorris et al., 2000; Everling and Munoz, 2000; Connolly et al., 2005), which speeds up the process of saccade preparation and thus decreases saccade latency. fMRI data also showed that rFEF activity can be used to predict saccadic reaction time (SRT) in humans (Connolly et al., 2005). Thus, these findings suggest that the effect of location probability on SRT could be reflective of the neural firing rate within a subpopulation of neurons in the rFEF. This may account for the role of rFEF in mediating the effects of location probability because both target and endpoint selections are necessary for the benefit of location probability to surface.
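The claim that a lower threshold or an earlier buildup of activity shortens saccade latency can be illustrated with a deliberately simplified accumulator. The sketch below assumes that activity rises linearly from a baseline to a fixed threshold at a constant rate. This linear rise-to-threshold form, the function name and all parameter values (arbitrary units) are illustrative assumptions, not a model fitted in the studies cited above.

```python
def rise_to_threshold_latency(threshold, baseline, rate):
    """Time (ms) for linearly rising activity to reach threshold.

    threshold -- activity level that triggers the saccade (arbitrary units)
    baseline  -- activity level at target onset; a head start in buildup
                 is represented as a higher baseline
    rate      -- rise in activity per ms (must be positive)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if baseline >= threshold:
        return 0.0  # already at threshold when the target appears
    return (threshold - baseline) / rate


if __name__ == "__main__":
    # Arbitrary illustrative parameters.
    default = rise_to_threshold_latency(threshold=100, baseline=0, rate=0.5)
    lower_threshold = rise_to_threshold_latency(threshold=80, baseline=0, rate=0.5)
    early_buildup = rise_to_threshold_latency(threshold=100, baseline=20, rate=0.5)
    print(f"default:          {default:.0f} ms")
    print(f"lower threshold:  {lower_threshold:.0f} ms")
    print(f"earlier buildup:  {early_buildup:.0f} ms")
```

In this simplification, lowering the threshold and raising the pre-target baseline are interchangeable ways of shortening latency, so latency alone cannot distinguish between them.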

## **COMPARING FEF AND PPC**

If FEF and PPC both process predictive and probabilistic cues in the environment, how do they differ from each other in terms of timing and function? Several neuroimaging and stimulation studies that compare FEF and PPC involvement using the same paradigm may provide some clues. First, in a nonpredictive visual search array, Kalla et al. (2008) applied TMS over either FEF or PPC at various time points and found that FEF involvement took place early (0–40 ms) whereas PPC involvement took place late (120–160 ms). But as previously mentioned, FEF involvement in oculomotor control can occur both early and late (under the right conditions), covering both stages of target selection and endpoint selection. Indeed, in the aforementioned study, Lane et al. (2012) found that TMS over right FEF impaired visual search in both primed (predictable) and non-primed (unpredictable) conditions, suggesting a general search mechanism that is mediated by right FEF. Stimulation over PPC had no effect when the target was at a predictable location, which is in line with the results of Chao et al. (2011), who also found that PPC TMS affected saccade curvature only when target location was unpredictable. An fMRI study using a cuing paradigm also found that PPC activation was modulated by the perceptual salience of the stimuli (target and distractor alike), whereas FEF activation was associated with the location of spatial attention (Geng and Mangun, 2008). This suggests that FEF may be more involved in top-down spatial attention, whereas PPC is involved in processing task-relevant attentional influences driven by stimulus salience, both of which can be utilized in processing predictive cues in the environment.

<sup>1</sup>It is worth mentioning that, due to the nature of brain stimulation, one alternative account of this pattern of results is that TMS is perhaps affecting the connections between brain regions, instead of the neural locus itself. Therefore, it remains possible that some of the stimulation results reviewed here are due to an effect on the communication of FEF or PPC with other areas.

**FIGURE 3 |** … on saccade latency, results from Liu et al. (2011). Mean saccadic RTs as a function of TMS, saccade type, and probability. The top two panels indicate the FEF and SEF TMS conditions, respectively. In the FEF TMS condition, the pattern of the location probability effect was affected by TMS and also the general … the mean. Panel **(B)** shows how right FEF TMS decreased SRT in prosaccades to the high probability location, suggesting a critical role for the right FEF in mediating the effect of spatial probability. This effect was not observed in the SEF TMS condition.

## **CONCLUSION**

In this paper we have briefly reviewed how predictive information in the environment can powerfully modulate human visual attention and oculomotor control. Importantly, neurophysiological studies suggest that such learning requires a dynamic interplay between regions within the dorsal attentional network, which includes PPC and FEF. Specifically, FEF is responsive to predictive information via its top-down early preparatory neural activity buildup that biases the processes of target selection and saccade preparation; whereas PPC responds to salient bottom-up stimuli that carry predictive information, even if they are not targets, if such information matches one's current behavioral goal. Much of the work until now has emphasized the individual contributions of these regions to mediate probability learning. Future studies that disentangle the timing, roles, functional connectivities, and interactions between these regions will provide exciting new insights into how the visual system strategically adapts to the environment.

## **ACKNOWLEDGMENTS**

We are grateful to Neil G. Muggleton and Chang-Mao Chao for their constructive suggestions to earlier versions of the manuscript. This work was sponsored by the National Science Council, Taiwan (101-2628-H-008-001-MY4, 102-2420-H-008-001-MY3, 99-2410-H-008-022-MY3, 97-2511-S-008-008-MY5). Philip Tseng is supported by NSC (101-2811-H-008-014).

#### **REFERENCES**

context guides spatial attention. *Cogn. Psychol.*, 36, 28–71. doi: 10.1006/cogp.1998.0681


and frontal eye fields in spatially primed visual search. *Brain Stimul*. 5, 11–17.


in visual search tasks. *J. Exp. Psychol. Hum. Percept. Perform.* 14, 453–471. doi: 10.1037/0096-1523.14.3.453


representations for attention and action. *Neuropsychologia* 44, 2700–2716.


668–679. doi: 10.1111/j.1469-8986.2009.00807.x


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 24 May 2013; accepted: 09 July 2013; published online: 02 August 2013. Citation: Tseng P, Chang C-F, Chiau H-Y, Liang W-K, Liu C-L, Hsu T-Y, Hung DL, Tzeng OJL and Juan C-H (2013) The dorsal attentional system in oculomotor learning of predictive information. Front. Hum. Neurosci. 7:404. doi: 10.3389/fnhum.2013.00404 Copyright © 2013 Tseng, Chang, Chiau, Liang, Liu, Hsu, Hung, Tzeng and Juan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Dissociations between spatial-attentional processes within parietal cortex: insights from hybrid spatial cueing and change detection paradigms

## *Rik Vandenberghe<sup>1,2</sup>\* and Céline R. Gillebert<sup>1,3</sup>*

*<sup>1</sup> Laboratory for Cognitive Neurology, Department of Neurosciences, Katholieke Universiteit Leuven, Leuven, Belgium*

*<sup>2</sup> Neurology Department, University Hospitals Leuven, Leuven, Belgium*

*<sup>3</sup> Department of Experimental Psychology, University of Oxford, Oxford, UK*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Emiliano Macaluso, Fondazione Santa Lucia, Italy*

*Radek Ptak, University Hospital Geneva, Switzerland*

#### *\*Correspondence:*

*Rik Vandenberghe, Neurology Department, University Hospitals Leuven, Herestraat 49, 3000 Leuven, Belgium e-mail: rik.vandenberghe@uz.kuleuven.ac.be*

Spatial cueing has been used by many different groups in multiple forms to study spatial attention processes. We will present evidence obtained in brain-damaged patients and healthy volunteers using a variant of this paradigm, the hybrid spatial cueing paradigm, which, besides single-target trials with valid and invalid cues, also contains trials where a target is accompanied by a contralateral competing stimulus (competition trials). This allows one to study invalidity-related processes and selection between competing stimuli within the same paradigm. In brain-damaged patients, lesions confined to the intraparietal sulcus result in contralesional attentional deficits, both during competition and invalid trials, according to a pattern that does not differ from that observed following inferior parietal lesions. In healthy volunteers, however, selection between competing stimuli and invalidity-related processes are partially dissociable, the former relying mainly on cytoarchitectonic areas hIP1-3 in the intraparietal sulcus, the latter on cytoarchitectonic area PF in the right inferior parietal lobule. The activity profile in the more posterior inferior parietal areas PFm and PGa does not distinguish between the two types of trials. The functional account for right PF and adjacent areas is further constrained by the activity profile observed during other experimental paradigms. In a change detection task with variable target and distracter set size, for example, these inferior parietal areas show highest activity when the stimulus array consists of only one single target, while the intraparietal sulcus shows increased activity as the array contains more targets and distracters. Together, these findings lead us to the hypothesis that right PF functions as a target singleton detector, which is activated when a target stands out from the background, referring both to the temporal background (expectancy) and the momentary background (stimulus-driven saliency).

**Keywords: area PF, temporoparietal junction, intraparietal sulcus, superior parietal lobule, invalidity, attentional priority map**

## **1. INTRODUCTION**

Spatial attention encompasses a wide set of divergent processes that govern the distribution of attentional weights over locations that are, or may be, occupied by objects. A powerful concept in spatial attention research, stemming from neurophysiology and computational neurobiology, is the "attentional priority map". The attentional priority map refers to a topographic representation of attentional weights (Bushnell et al., 1981; Koch and Ullman, 1985; Colby et al., 1996; Gottlieb et al., 1998; Itti and Koch, 2001; Bisley and Goldberg, 2003; Vandenberghe and Gillebert, 2009; Ptak, 2012; Jerde and Curtis, 2013). The attentional weights depend, among other variables, on sensory evidence (Bundesen and Habekost, 2008) obtained through multiple input channels (visual, auditory, etc.). The current review will be restricted to effects obtained within the visual modality. Although attentional priorities may be sustained over a prolonged period of time (Vandenberghe et al., 2001a,b; Husain and Rorden, 2003), most evidence with regard to parietal cortex relates to its role in transitions between attentional priority maps (Vandenberghe et al., 2001a; Molenberghs et al., 2007). Here we will describe novel evidence obtained using two paradigms, the hybrid spatial cueing paradigm (Gillebert et al., 2011, 2012a, 2013) and the change detection paradigm with varying target and distracter set size (Gillebert et al., 2012b), in patients (Gillebert et al., 2011) and in the intact brain (Gillebert et al., 2012a,b, 2013), both from a localizationist and a connectionist perspective.

## **2. THE HYBRID SPATIAL CUEING TASK**

## **2.1. CONVERGING EVIDENCE FROM FUNCTIONAL IMAGING AND PATIENT LESION DATA**

Numerous experiments in humans have provided converging evidence for the distinct role of different parietal regions in spatial attention (for reviews, see Vandenberghe and Gillebert, 2009; Vandenberghe et al., 2012). Here we will focus on the hybrid spatial cueing paradigm (**Figures 1A**,**B**) which allows one to simultaneously study two key operations of spatially selective attention: selection between competing stimuli (Desimone and Duncan, 1995) and the processing of invalidly cued targets (Corbetta et al., 2000). The hybrid spatial cueing paradigm enables one to contrast the neuroanatomy of both processes within the same subjects, both in healthy volunteers and in patients with parietal brain damage, and to relate the findings to the cytoarchitectonic organization of the parietal cortex, provided that proper sensory control conditions are used (Vandenberghe et al., 2005; Molenberghs et al., 2008; Gillebert et al., 2013). Compared to baseline, the validly cued single-grating trials require interpretation of the central arrow cue (Woldorff et al., 2004; Bonato et al., 2009), assignment of a high attentional weight to the cued peripheral location, short-term maintenance of that weight, detection of the grating at the cued location and discrimination of its orientation, response selection based on a conditional-associative rule and response execution. Compared to the validly cued single-grating trials, the presence of an irrelevant contralateral distracter induces a need to select a target among nearly identical distracting stimuli on the basis of the instructional spatial cue, and to suppress undue interference by the distracter's orientation on response selection. Behaviorally, competition trials are usually associated with a cost compared to valid single-grating trials (Vandenberghe et al., 2005; Molenberghs et al., 2008). By titrating the orientation difference to be discriminated, performance measures during functional magnetic resonance imaging (fMRI) can be strictly matched between the two trial types, removing the potential confound of nonspecific differences in task difficulty (Vandenberghe et al., 2005; Molenberghs et al., 2008). The addition of a distracter also causes a sensory mismatch with the valid single-grating trials. Additional sensory control trials therefore are necessary to distinguish attentional effects from the sensory effect induced by adding a distracter (Vandenberghe et al., 2005; Molenberghs et al., 2008; Gillebert et al., 2013). The cognitive processes induced by invalidly cued trials have been intensively studied and described before [see Corbetta et al. (2008) for a review]. Briefly, with the delay durations and trial frequencies we used, when an invalidly cued grating appears, subjects have to detect the grating at the unexpected location and shift attention from the predicted location to the target location.

**FIGURE 1 | Experimental paradigms at the focus of the current review. (A,B)** Hybrid spatial cueing paradigm. **(A)** Timing. **(B)** Spatial cueing with single validly cued target, invalidly cued trials (20%) and competition trials (20%). For the competition trials proper sensory control experiments were conducted to tease out attentional and sensory effects of adding a distracter. **(C,D)** Change detection paradigm. **(C)** Timing and basic design. In reality the target set size was varied in discrete steps (1, 2, 4, or 6 targets) as well as the distracter set size (0, 2, 4, or 6 distracters) according to a factorial design. Letters were targets, numbers distracters. **(D)** Factorial design with varying targets (T) and distracters (D) within the array. The height of the bars corresponds to the number of targets (#T in VSTM) and the number of distracters (#D in VSTM) loaded into visual short-term memory (VSTM) in each of the cells of the factorial design, according to the Theory of Visual Attention.
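The trial structure of the hybrid spatial cueing paradigm can be made concrete with a short sketch. The 20% proportions for invalid and competition trials come from the description of the paradigm; everything else (the function name `make_trial_schedule`, the assumption that all remaining trials are valid single-target trials, the left/right coding, and the omission of sensory control trials) is a hypothetical simplification for illustration and not the authors' actual stimulus code.

```python
import random


def make_trial_schedule(n_trials=100, p_invalid=0.2, p_competition=0.2, seed=0):
    """Return a shuffled list of trials for one hypothetical block.

    Assumption: every trial that is neither invalid nor a competition trial
    is a validly cued single-target trial.
    """
    n_invalid = round(n_trials * p_invalid)
    n_competition = round(n_trials * p_competition)
    n_valid = n_trials - n_invalid - n_competition
    rng = random.Random(seed)
    opposite = {"left": "right", "right": "left"}

    trials = []
    for trial_type, count in (("valid", n_valid),
                              ("invalid", n_invalid),
                              ("competition", n_competition)):
        for _ in range(count):
            cue = rng.choice(["left", "right"])
            # On invalid trials the target appears opposite to the cued side.
            target = opposite[cue] if trial_type == "invalid" else cue
            # On competition trials a distracter appears contralateral to the target.
            distracter = opposite[target] if trial_type == "competition" else None
            trials.append({"type": trial_type, "cue": cue,
                           "target": target, "distracter": distracter})
    rng.shuffle(trials)
    return trials
```

Keeping invalid and competition trials equally infrequent, as in this sketch, is what allows the two trial types to be compared while expectancy is held constant.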

When subjects have to select between competing stimuli, the middle segment of the intraparietal sulcus (IPS) is consistently more active compared to single stimulus conditions (Vandenberghe et al., 2005; Molenberghs et al., 2008; Gillebert et al., 2012a, 2013). This difference in middle IPS activity between double stimulation and single grating stimulation is absent under sensory control conditions where attention is directed centrally (Vandenberghe et al., 2005; Molenberghs et al., 2008; Gillebert et al., 2013). When the functional activity map obtained in healthy volunteers by contrasting competition trials vs. single-grating trials is overlaid with the voxel-based lesion-symptom map obtained in patients with unifocal cortical stroke from a closely similar contrast, the overlap is situated at the lower bank of middle IPS (Molenberghs et al., 2008). This provides evidence that the deficit in spatially selective attention following inferior parietal lesions can be accounted for by extension of the lesions into the lower bank of the middle IPS, which is also activated in healthy controls (Molenberghs et al., 2008). Further evidence for the critical role of middle IPS in this selective attention paradigm comes from a detailed study of a case with a reversible lesion confined to middle IPS with extension into the superior parietal lobule caused by a venous sinus thrombosis, case NV (**Figure 2A**): the lesion provoked a deficit during spatial cueing when a distracter was present ipsilesionally compared to single-stimulus conditions and also during invalidly cued trials (Gillebert et al., 2011) (**Figure 2B**). The effect of adding a distracter was limited to the contralesional target conditions while the effect on invalidly cued trials was present for both ipsi- and contralesional targets (Gillebert et al., 2011; Vandenberghe et al., 2012). The shifting deficit for both left- and rightward attention may be due to the extension into the superior parietal lobule (see below) (Vandenberghe et al., 2001a; Yantis et al., 2002; Molenberghs et al., 2007). When the lesion partially regressed due to the resolution of vasogenic edema, the behavioral deficit also recovered (**Figure 2B**) (Gillebert et al., 2011).

Studies of rare focal lesions in parietal cortex have been fruitful in elucidating the critical role of specific parietal areas during

**FIGURE 2 | (A,B)** Case NV. **(A)** In green the right middle IPS lesion in case NV, affecting the horizontal segment of IPS with extension into the superior parietal lobule. **(B)** NV's accuracy (expressed as A′) obtained in the different conditions of the hybrid spatial cueing paradigm (red), compared to age-matched controls (black). NV was tested on two occasions. On day 4 (full red line) the lesion was as visualized in **(A)**, on day 107 the lesion had substantially regressed (dotted red line). For more details see Gillebert et al. (2011). **(C,D)** Case HH. **(C)** In blue the left posterior IPS lesion in case HH. **(D)** HH's performance obtained in the different conditions of the hybrid spatial cueing paradigm (red), compared to age-matched controls (black).

spatially selective attention and reorienting. Such lesions can have a size of only one or a few cm<sup>3</sup>, sparing white matter tracts and sometimes with only limited effects on connected regions at a distance (Gillebert et al., 2011). Such lesions provide a spatial resolution far beyond what can be obtained from ischemic lesions of major branches of the middle cerebral artery. In order to properly evaluate the functional effects at a distance, resting-state (Carter et al., 2010; Gillebert et al., 2011; Gratton et al., 2012) or task-related fMRI (He et al., 2007; Gillebert et al., 2011) in the same patients is often essential. The value of such a multimodal imaging approach is also clear from a second case (case HH) with a focal lesion of the descending segment of left IPS (**Figure 2C**), giving rise to a strictly lateralized contralesional spatial-attentional deficit (**Figure 2D**) (Gillebert et al., 2011). The lesion was significantly smaller than NV's lesion and confined to posterior IPS only (**Figure 3**). The lesion hit an area largely corresponding to IPS0/1. IPS0/1 is a visually responsive, retinotopically organized region (Silver and Kastner, 2009; Bressler and Silver, 2010) that shows increased activity when attention is directed to the contralateral hemispace (Yantis et al., 2002; Vandenberghe et al., 2005; Vandenberghe and Gillebert, 2009). Apart from effects of the direction of attention, IPS0/1 is also influenced by spatiotopic mnemonic factors (Sheremata et al., 2010; Jerde and Curtis, 2013). Gillebert et al. (2011) reported the first evidence of the consequences of a lesion of IPS0/1. A lesion of left IPS0/1 preserves the visual fields, leaving performance during single-target valid trials intact (**Figure 2D**). When the attentional demands are increased by adding an ipsilateral distracter, performance drops for contralesional targets. This is also true following an invalid spatial cue (**Figure 2D**).
The deficit is not due to functional effects at a distance: in the IPS0/1 lesion case, both task-related and resting-state fMRI reveal that the inferior parietal lobule and the ventral attention network (Corbetta and Shulman, 2002; Corbetta et al., 2008) are functioning within a normal range (Gillebert et al., 2011). Although no right-sided isolated IPS0/1 lesions have been reported as yet, a recent study showed that repetitive transcranial magnetic stimulation (TMS) over the right IPS0/1 of healthy volunteers impairs target discrimination in the contralateral side of space (Capotosto et al., 2013). Together with results from fMRI activity in the intact brain (Yantis et al., 2002; Vandenberghe et al., 2005; Xu and Chun, 2006; Molenberghs et al., 2008; Xu and Chun, 2009), these findings can be integrated in a functional-anatomical model where IPS is subdivided in different areas (**Figure 4**). Posterior and middle IPS may intervene at different stages of attentional selection. The effects of IPS0/1 lesions can be accounted for by a strictly lateralized loss of attentional enhancement of a visual response to contralateral stimulation under attentionally demanding conditions (**Figure 4**). The middle IPS segment, on the other hand, may be involved in calibration of attentional weights for individual visuoperceptual units. According to this hypothesis, the role of IPS0/1 in attentional enhancement is defined in purely spatial coordinates while the contribution of middle IPS occurs at a stage where individual objects that occupy specific locations have already been identified (Xu and Chun, 2006, 2009; Gillebert et al., 2012b). This hypothesis is principally founded on fMRI studies in the intact brain (Vandenberghe et al., 2005; Xu and Chun, 2006, 2009) (for review, see Vandenberghe and Gillebert, 2009) and still requires further validation. It is compatible with the performance deficits seen in the two single cases with IPS lesions. 
By themselves, the differences in behavioral deficits between NV and HH should not be overinterpreted: the attentional deficits in these two cases do not constitute a double dissociation, the lesions do not only differ in hemispheric side but also in extent and degree of involvement of the superior parietal lobule, and the posterior portion of NV's lesion overlaps substantially with HH's lesion (**Figure 3**). The different degree of laterality of the spatial-attentional deficit between the two cases does not necessarily mean that the right IPS has a more bilateral representation of space than the left (Weintraub and Mesulam, 1987). It can also be explained by the extension of NV's lesion into the medial wall of the superior parietal lobule, which has been implicated in spatial shifting regardless of hemispace or direction (Vandenberghe et al., 2001a; Molenberghs et al., 2007). Based on a recent single-pulse TMS experiment (Szczepanski and Kastner, 2013), we would predict that a right-sided IPS0/1 lesion would give a similarly lateralized left-sided deficit as that provoked by the left-sided lesion in HH to the right, but this remains to be proven.

Notably, the behavioral deficit on invalid or competition trials following focal IPS lesions does not differ statistically from the deficits following typical inferior parietal lesions that clinically lead to hemispatial neglect or visual extinction (**Figure 5**) (Gillebert et al., 2011). At first this may seem to contradict canonical findings reported by Friedrich et al. (1998), but detailed analysis of the data reported by Friedrich et al. (1998) shows that in fact the two studies are compatible. The Friedrich et al. (1998) study is often interpreted as if it localizes the pathological increase of the invalidity effect to the inferior parietal cortex as opposed to superior parietal cortex. In the Friedrich et al. (1998) study, however, the main analysis did not contrast superior with inferior parietal lesions but parietal lesions extending into the superior temporal gyrus (STG) (the "TPJ" group) with parietal lesions that do not extend into STG (the "PAR" group). The "PAR" cases also had inferior parietal damage and some of the "TPJ" lesions extended into superior parietal cortex. Furthermore, the sensitivity for detecting a shifting deficit in the "PAR" group was probably relatively low given the complex factorial design (4 factors) (for a detailed discussion, see Vandenberghe et al., 2012). A voxel-based lesion-symptom mapping study in 20 left-sided neglect patients also confirmed that IPS is one of the critical regions associated with a contralateral orienting deficit and a pathological increase of the invalidity effect for contralateral targets, along with the temporoparietal junction (TPJ) and middle frontal gyrus (Ptak and Schnider, 2011).

It is important to note that the hybrid spatial cueing paradigm isolates specific components of the spatial-attentional deficits that can be seen clinically following parietal lesions. While a deficit in the competition trials compared to the single-grating trials can occur even when the clinical extinction test is within the normal range, we have not yet encountered clinical extinction without a deficit in the competition trials (Molenberghs et al., 2008; Gillebert et al., 2011). Likewise, patients with a deficit on target cancelation whom we tested always had a deficit on the invalid trials compared to the single-grating trials (Molenberghs et al., 2008). It is worth noting that our data are principally based on patients with unifocal cortical lesions who can do computerized testing in the acute stage with a proper sitting balance, and therefore do not include patients with moderate or severe neglect.

A further parietal structure, the superior parietal lobule (SPL), has been implicated by numerous functional imaging studies (Vandenberghe et al., 2001a; Yantis et al., 2002; Shomstein and Yantis, 2004; Molenberghs et al., 2007; Serences and Yantis, 2007; Kelley et al., 2008) in the spatial displacement of the focus of attention. This is surprising as previous lesion evidence in humans provided relatively few hints for a role of superior parietal lobule in spatial shifting. According to recent evidence, however, based on the hybrid spatial cueing paradigm (Vandenberghe et al., 2012), a bilateral lesion of SPL leads to an impairment in shifting attention from the invalidly cued location to the target, regardless of its location and with preserved performance during competition trials. Medial parietal and superior parietal lesions also lead to an increased movement time during visual search (Müller-Plath et al., 2010). Recent electrophysiological recording (Brignani et al., 2009) and electrophysiological stimulation studies (Capotosto et al., 2013) have provided further evidence for the critical role of SPL in spatial shifting. A spatial shift against a sustained attention baseline provokes an event-related potential starting around 330 ms with posterior parietal distribution which does not depend on the direction of the shift, leftward or rightward (Brignani et al., 2009). Furthermore, 150 ms of repetitive TMS at 20 Hz targeting right superior parietal medial cortex 500 ms prior to onset of the shifting cue impairs target discrimination regardless of target location, in left or right visual fields (Capotosto et al., 2013).

At the moment, what is missing is statistical evidence for a double dissociation between spatial attentional processes when patients with focal lesions of parietal cortex in different locations are directly compared to each other, hence the importance of defining functional dissociations in the intact brain. The latter studies can then serve as a basis for designing paradigms in patients that may be successful in detecting double functional dissociations following lesions.

## **2.2. HYBRID SPATIAL CUEING WITHIN A CYTOARCHITECTONIC REFERENCE FRAME**

TPJ is consistently activated when reorienting attention or during breaches of expectancy (Corbetta et al., 2000). Coordinates of TPJ foci, however, vary widely between studies (Decety and Lamm, 2007; Mars et al., 2012). The inferior parietal lobule, which encompasses TPJ, is by no means homogeneous cytoarchitectonically (Caspers et al., 2006, 2011). According to cytoarchitectonic studies of postmortem brains, the angular gyrus can be subdivided into areas PGa and PGp, and the supramarginal gyrus into areas PFop, PFt, PF, PFm, and PFcm (Caspers et al., 2006, 2011) (**Figure 6A**). Are these human cytoarchitectonic areas differentially involved in competition vs. invalid trials (Gillebert et al., 2013) when controlling for expectancy (trial frequency kept at 20% of all trials for each of the two types of trials)? To answer this question, we applied a dual approach. In the first approach, starting from the cytoarchitectonic divisions, we defined volumes of interest and compared the aggregate response amplitude between the single-target trials, the competition trials and the invalid trials in the hybrid spatial cueing paradigm. In the second approach, starting from the functional activity map, we overlaid the activations on the cytoarchitectonic map in order to evaluate to what degree functional boundaries coincide with cytoarchitectonic boundaries (Gillebert et al., 2013). The main challenge is the inter-individual variability in the extent and boundaries of the cytoarchitectonic areas and the probabilistic nature of assignment of voxels to a specific cytoarchitectonic area. This variability has been estimated from a relatively small set of postmortem brains (*n* = 10). This information is incorporated in the probabilistic maps (Eickhoff et al., 2005, 2006). The variability between subjects in size and borders can be relatively high in specific areas, such as hIP1-3 (Scheperjans et al., 2008).
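To illustrate what an aggregate response amplitude within a probabilistically defined volume of interest could look like, the following minimal sketch computes a probability-weighted mean over voxels. How the volumes of interest were actually derived from the probabilistic maps is not specified here, so the function name `voi_response`, the weighting scheme, and the `threshold` value of 0.5 are all assumptions made for illustration only.

```python
def voi_response(amplitudes, area_probability, threshold=0.5):
    """Probability-weighted mean amplitude within one cytoarchitectonic area.

    amplitudes:       per-voxel response amplitudes (e.g., contrast estimates)
    area_probability: per-voxel probability of belonging to the area
    threshold:        hypothetical cut-off; voxels below it are ignored
    """
    selected = [(a, p) for a, p in zip(amplitudes, area_probability)
                if p >= threshold]
    if not selected:
        raise ValueError("no voxel reaches the probability threshold")
    total_weight = sum(p for _, p in selected)
    # Voxels that more probably belong to the area contribute more.
    return sum(a * p for a, p in selected) / total_weight
```

Weighting by probability is one way to acknowledge that voxel-to-area assignment is uncertain; a hard assignment of each voxel to its most probable area would be an equally plausible alternative.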

Right PF is the only area exclusively activated for invalid vs. valid trials and not during competition trials (Gillebert et al., 2013) (**Figure 6B**). In right PF the difference in response amplitude between invalid and valid cueing is significantly larger than the difference between competition trials and single-target trials (Gillebert et al., 2013). In contrast, cytoarchitectonic areas hIP1 and hIP3 in IPS exhibit significantly higher activity levels during competition trials compared to invalid trials (**Figure 6B**). Other inferior parietal areas, such as PFm, PGa, and PGp, were bilaterally involved both in competition and invalid trials, without any significant differences between the two trial types (**Figure 6B**). The differential activity pattern between hIP1-3 and PF provides evidence for a functional dissociation between two types of attentional processes, those related to invalidity vs. processes of selection between competing stimuli. Note that Friedrich et al. (1998) suggested the term "extinction-like" for the invalidly cued trials but according to the above evidence, selection between competing stimuli and spatial reorienting following invalid cues are anatomically dissociable processes. A probabilistic tractography study (Caspers et al., 2011) suggested that PFm and PGa corresponded to the TPJ node of the ventral attention network (Corbetta and Shulman, 2002). Instead, our fMRI findings within a cytoarchitectonic reference frame suggest that right PF is most tightly linked to the invalidity effect in the classical spatial cueing paradigm while PFm and PGa are more generally involved in invalid as well as competition trials, at least when both trial types have a low expectancy rate.

In a second step, we superimposed the activity clusters obtained by contrasting the invalid with the valid single-target trials and by contrasting the double- with the single-target trials onto the cytoarchitectonic map (Gillebert et al., 2013). Boundaries of activation did not follow boundaries between cytoarchitectonic areas. This may suggest that, for the cognitive operations fulfilled by inferior parietal cortex, there is no functional segregation within strict cytoarchitectonic boundaries. This is also apparent from the connectivity pattern of the cytoarchitectonic areas which shows gradients between areas rather than strict segregation (**Figures 6C**,**D**) (Caspers et al., 2011; Gillebert et al., 2013). There was, however, one exception: the contrast of invalid vs. valid trials yielded a right inferior parietal activity cluster that coincided relatively closely with area PF (Gillebert et al., 2013).

**FIGURE 6 |** [caption partially lost in extraction] **(A)** [...] areas as derived from Caspers et al. (2006). **(B)** Time-activity curves in a selection of cytoarchitectonic areas that showed differential effects between the competition trials, the invalid and the valid single-target trials. **(C)** Hierarchical clustering analysis based on the time courses during resting-state fMRI in the different cytoarchitectonic areas that [...] cueing paradigm. **(D)** Functional connectivity matrix. The cross-correlation matrix is sorted on the basis of the hierarchical clustering results, so that adjacent VOIs have the most similar connectivity profiles. Significant correlations (*P* < 0.05, Bonferroni-corrected for the number of pairwise comparisons) are indicated by a black dot.

## **2.3. INVALIDITY, COMPETITION, AND CONNECTIVITY**

Next, we evaluated to which degree the different cytoarchitectonic areas belong to different resting-state networks (Gillebert et al., 2013). First, we derived the time courses of each of the cytoarchitectonic areas and performed a hierarchical clustering analysis (**Figure 6C**). Right PF was the only parietal area where the time course did not cluster with any of the other parietal regions. The time courses of hIP1-3 clustered with the time courses of PFm (Gillebert et al., 2013). When we used the cytoarchitectonic areas as seeds for a resting-state connectivity analysis across the entire brain, the patterns we obtained were in line with prior evidence (**Figure 6D**): Right PF belonged to a network with inferior frontal gyrus and anterior insula, hIP1-3 and PFm were connected to prefrontal cortex, and PGa was part of the default mode network (Gillebert et al., 2013). The connectivity pattern of PFm and PGa/PGp may provide hints about their functional contribution. PFm has been implicated in the multiple demand network (Duncan, 2010) or "executive control network" (Seeley et al., 2007) while PGa/PGp probably corresponds to the inferior parietal nodes of the default mode network.
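The logic of clustering areas by the similarity of their resting-state time courses can be sketched as follows. This is a toy illustration and not the published analysis pipeline: the distance measure (1 minus the Pearson correlation), the average-linkage rule, the helper names, and the synthetic time courses in `toy_timecourses` are all assumptions, and the area labels attached to the synthetic signals are placeholders rather than real data.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long time courses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage(timecourses):
    """Agglomerative clustering; returns the merge history, closest pair first."""
    names = list(timecourses)
    dist = {(a, b): 1.0 - pearson(timecourses[a], timecourses[b])
            for a in names for b in names if a != b}

    def cluster_distance(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best, i, j = min((cluster_distance(c1, c2), i, j)
                         for i, c1 in enumerate(clusters)
                         for j, c2 in enumerate(clusters) if i < j)
        merges.append((clusters[i], clusters[j], best))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


def bonferroni_alpha(n_areas, alpha=0.05):
    """Per-test threshold when correcting for all pairwise comparisons."""
    return alpha / (n_areas * (n_areas - 1) / 2)


def toy_timecourses(seed=0, n_timepoints=200):
    """Synthetic signals: two labels share a common source, one is independent."""
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(n_timepoints)]
    return {
        "hIP1": [s + 0.3 * rng.gauss(0, 1) for s in shared],
        "PFm": [s + 0.3 * rng.gauss(0, 1) for s in shared],
        "PF": [rng.gauss(0, 1) for _ in range(n_timepoints)],
    }
```

In this toy example, the two signals built from a common source merge first and the independent signal joins last, which mirrors only the form of the result described above (an area whose time course does not cluster with the others), not its empirical basis.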

According to probabilistic tractography measures of connectivity of inferior parietal cytoarchitectonic areas, the main connections of PF are with inferior frontal gyrus, insula, and cortex surrounding the central sulcus and anterior superior parietal cortex (Caspers et al., 2011). This connection may correspond to the third branch of the superior longitudinal fascicle (Thiebaut de Schotten et al., 2011). PGp, on the most posterior end, is mainly connected with occipital and temporal cortex, as well as prefrontal cortex (Caspers et al., 2011). The connection from PGp to anterior temporal cortex may correspond to the inferior longitudinal fascicle (Schmahmann and Pandya, 2006).

How do different parietal regions interact with each other and the occipital cortex to construct the attentional priority map as an emergent property? Two recent studies (Gillebert et al., 2012a; Vossel et al., 2012) addressed this issue empirically by means of Dynamic Causal Modeling (Friston et al., 2003; Penny et al., 2004). Using the spatial cueing with single-target valid trials and competition trials, Gillebert et al. (2012a) evaluated how the addition of an irrelevant distracter within-hemifield alters effective connectivity between early visual extrastriate cortex and the middle segment of IPS. When a distracter is present, the feedback connection from middle IPS to extrastriate cortex is strengthened while no effects are seen on the feedforward connections. The strengthening of the feedback connection fits with the hypothesis that middle IPS biases the competition between stimuli in upstream visual areas (Desimone and Duncan, 1995).

Vossel et al. (2012) examined how TPJ and middle IPS differentially interact with each other and with early visual cortex during orienting and reorienting of attention in a modified spatial cueing task. They observed that the feedback connection from middle IPS to extrastriate cortex was modulated by the direction of attention (leftward or rightward). Interhemispheric connections were modulated between FEF bilaterally rather than IPS. In addition, compared to validly cued targets, invalidly cued targets increased the effective connectivity from visual cortex to right TPJ, and from right TPJ to IPS and the inferior frontal gyrus (Vossel et al., 2012). The TPJ region studied by Vossel et al. (2012) corresponds principally to area PGa and PGp (Caspers et al., 2006).

## **3. THE CHANGE DETECTION PARADIGM**

Spatial cueing paradigms in their original form or in various guises continue to engender novel insights into parietal function (Gillebert et al., 2011, 2012b, 2013; Vossel et al., 2012). Lesion studies based on spatial cueing, however, have not yet revealed a clear dissociation between parietal regions (**Figure 5B**), probably because the typical inferior parietal lesions extend beyond PF into areas such as PFm and PGa (Gillebert et al., 2013), and also into the lower bank of IPS (Molenberghs et al., 2008). Another classical paradigm probing the distribution of attentional weights as well as the capacity of visual short-term memory (VSTM) is the change detection paradigm (Luck and Vogel, 1997). It has been applied in patients (e.g., Jeneson et al., 2012) and in healthy volunteers (e.g., Todd and Marois, 2004; Xu and Chun, 2006; Mitchell and Cusack, 2008). When the change detection task is performed during fMRI, IPS activity increases with an increasing number of items encoded in VSTM, and correlates across individuals with VSTM capacity (Todd and Marois, 2004; Xu and Chun, 2006).

In an effort to disentangle the role of middle IPS in selection between competing stimuli (Vandenberghe et al., 2005; Molenberghs et al., 2007) from its role in visual short-term memory storage (Todd and Marois, 2004; Xu and Chun, 2006), Gillebert et al. (2012b) factorially varied the number of targets and distracters during change detection (**Figures 1C**,**D**). The behavioral relevance of the items was determined by alphanumerical class rather than by spatial location (**Figure 1C**). Trial-by-trial variations in the number of target and distracter items accessing VSTM were modeled mathematically based on the Theory of Visual Attention (Bundesen and Habekost, 2008; Dyrholm et al., 2011). As expected based on Todd and Marois (2004) and Xu and Chun (2006), activity in middle IPS increased asymptotically with increasing number of targets and also with increasing number of distracters (Vandenberghe et al., 2005). One of the unanticipated findings was a clear dissociation between middle IPS and right anterior IPL (encompassing mainly PF, PFm and PGa): While middle IPS increased with increasing number of targets and increasing number of distracters, anterior IPL showed highest activity when a single target was present (**Figure 7**). The double dissociation between IPS and right PF obtained with the change detection paradigm (Gillebert et al., 2012b) necessarily constrains the interpretations one can attribute to PF in selective attention. In the hybrid spatial cueing experiment right PF was activated by invalid trials but not during competition trials (Gillebert et al., 2013), which is in line with its exclusive activation during the single target/zero distracters condition in the change detection experiment (Gillebert et al., 2012b). We postulate that right PF functions as a target singleton detector, and is activated by conditions where a single target stands out from the background, both in sensory terms and in terms of what is expected.

Apart from this manifest dissociation, the experimental data also revealed that activity levels of middle IPS could not be reliably modeled purely on the basis of a VSTM storage account when the array contained multiple targets and distracters: There was a systematic undershoot of IPS activity under conditions of high target and high distracter set size compared to what one would predict based on the number of items entering VSTM (**Figures 1D**, **7**) (Gillebert et al., 2012b). This may suggest that the threshold for access to VSTM can be variably adapted depending on a trade-off between easy access of targets to VSTM vs. more difficult access of distracters (Gillebert et al., 2012b).

In the change detection experiment (Gillebert et al., 2012b), posterior IPS showed a response profile that was similar to that seen in middle IPS (**Figure 7**). There was no laterality effect as both left and right hemispace were equally likely to contain targets and the total amount of targets and distracters was also matched across conditions between left and right hemispace. The presence of many targets and distracters elicited suppressive effects that resulted in lower response amplitudes with larger arrays, indicating that this mechanism extends to relatively early stages of the visual processing stream.

... is shown for the 16 different conditions of the factorial design. For further details, see Gillebert et al. (2012b).

In another TVA-based study (Moos et al., 2012), transcranial direct current stimulation (TDCS) was applied to the right horizontal IPS segment and its consequences were mathematically modeled based on TVA. TDCS of right middle IPS (Moos et al., 2012) led to a hemifield-independent effect on parameter α. This parameter reflects the ability to select targets and ignore distracters, and is expressed mathematically as the ratio of the attentional weight of a distracter to the attentional weight of a target at the same location. These findings are in line with a model where the middle segment of IPS plays a pivotal role in the calibration of attentional weights and the compilation of an attentional priority map (Vandenberghe et al., 2005; Vandenberghe and Gillebert, 2009) (**Figure 4**).
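As a hedged illustration of the α parameter, the sketch below fixes the target weight at 1 and the distracter weight at α, so that α is the distracter-to-target weight ratio, and computes the share of the total attentional weight that falls on a single target. The function name and the simplifying assumption that weights do not vary across locations are ours, not taken from Moos et al. (2012).

```python
def relative_target_weight(n_targets, n_distractors, alpha):
    """Share of the total attentional weight held by one target.

    Target weight is fixed at 1 and distractor weight at alpha, so that
    alpha = w_distractor / w_target. Weights are assumed not to vary
    across locations, which is a simplification.
    """
    total = n_targets * 1.0 + n_distractors * alpha
    return 1.0 / total


# alpha = 0: perfect selection; alpha = 1: targets and distractors weighted alike.
perfect = relative_target_weight(n_targets=2, n_distractors=2, alpha=0.0)
partial = relative_target_weight(n_targets=2, n_distractors=2, alpha=0.5)
unselective = relative_target_weight(n_targets=2, n_distractors=2, alpha=1.0)
print(perfect, partial, unselective)
```

With two targets and two distracters, each target holds one half of the total weight when α = 0, one third when α = 0.5, and one quarter when α = 1, so a lower α corresponds to more efficient selection.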

Increased activity levels during single-target conditions in the absence of distracters also occurred in PFm and PGa in the change detection experiment (Gillebert et al., 2012b). The interpretation of PFm and PGa is more tentative at the moment. PFm and PGa exhibited a relatively nonspecific pattern during the hybrid spatial cueing experiment, being activated during both the invalid trials and the competition trials (Gillebert et al., 2013), while in the change detection experiment they were principally activated in the single target, zero distracter condition (Gillebert et al., 2012b). PGa belongs to the default-mode network (Wu et al., 2009; Uddin et al., 2010) and PFm to the multiple demand network (Duncan, 2010). Both areas have been activated by a wide variety of paradigms. The contribution of these inferior parietal areas to cognitive processes across multiple domains has been the subject of several recent reviews (e.g., Duncan, 2010; Cabeza et al., 2012; Seghier, 2013).

## **4. CONCLUSION**

Converging evidence from functional imaging of the intact brain and parietal lesion cases indicates that middle IPS (corresponding to cytoarchitectonic areas hIP1-3) has a critical role in selection between competing stimuli (Vandenberghe et al., 2005; Molenberghs et al., 2008; Vandenberghe and Gillebert, 2009; Ptak, 2012), superior parietal lobule in spatial-attentional shifts regardless of target location (Vandenberghe et al., 2001a; Yantis et al., 2002; Molenberghs et al., 2007), and right area PF in processes related to invalidity (Corbetta et al., 2000; Gillebert et al., 2013). Right PF may be particularly important when single targets stand out from the background, by virtue of a striking difference from the rest of the visual scene.

## **ACKNOWLEDGMENTS**

Supported by FWO grants G.0076.02 and G0668.07 (EuroCores) (Rik Vandenberghe), KU Leuven Research grants OT/04/41, OT/08/056, OT/12/097 and EF/05/014 (Rik Vandenberghe), and Belspo Inter-University Attraction Pole P6/29 and P7/11. Rik Vandenberghe is a senior clinical investigator of the Fund for Scientific Research (FWO), Flanders (Belgium), and Céline R. Gillebert is a Sir Henry Wellcome Trust postdoctoral research fellow.

## **REFERENCES**


*Neuroimage* 25, 1325–1335. doi: 10.1016/j.neuroimage.2004.12.034


*Nat. Rev. Neurosci.* 4, 26–36. doi: 10.1038/nrn1005


cathodal tdcs over right IPS. *J. Neurosci.* 32, 16360–16368. doi: 10.1523/JNEUROSCI.6233-11.2012


10.1177/1073858412440596. [Epub ahead of print].


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 25 June 2013; published online: 16 July 2013.*

*Citation: Vandenberghe R and Gillebert CR (2013) Dissociations between spatial-attentional processes within parietal cortex: insights from hybrid spatial cueing and change detection paradigms. Front. Hum. Neurosci. 7:366. doi: 10.3389/fnhum.2013.00366*

*Copyright © 2013 Vandenberghe and Gillebert. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Rethinking the role of the rTPJ in attention and social cognition in light of the opposing domains hypothesis: findings from an ALE-based meta-analysis and resting-state functional connectivity

#### *Benjamin Kubit<sup>1</sup> and Anthony I. Jack<sup>2</sup>\**

*<sup>1</sup> Department of Psychology, University of California, Davis, Davis, CA, USA*

*<sup>2</sup> Brain, Mind, and Consciousness Lab, Department of Cognitive Science, Case Western Reserve University, Cleveland, OH, USA*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Rogier B. Mars, University of Oxford, UK*

*Frank Van Overwalle, Vrije Universiteit Brussel, Belgium*

#### *\*Correspondence:*

*Anthony I. Jack, Brain, Mind, and Consciousness Lab, Department of Cognitive Science, Case Western Reserve University, Crawford 617 E, 10900 Euclid Ave., Cleveland, OH 44106, USA e-mail: tony.jack@gmail.com*

The right temporo-parietal junction (rTPJ) has been associated with two apparently disparate functional roles: in attention and in social cognition. According to one account, the rTPJ initiates a "circuit-breaking" signal that interrupts ongoing attentional processes, effectively reorienting attention. It is argued that this primary function of the rTPJ has been extended beyond attention, through a process of evolutionary co-option, to play a role in social cognition. We propose an alternative account, according to which the capacity for social cognition depends on a network which is both distinct from and in tension with brain areas involved in focused attention and target detection: the default mode network (DMN). Theory characterizing the rTPJ based on the area's purported role in reorienting may be falsely guided by the co-occurrence of two distinct effects in contiguous regions: activation of the supramarginal gyrus (SMG), associated with its functional role in target detection; and the transient release, during spatial reorienting, of suppression of the angular gyrus (AG) associated with focused attention. Findings based on meta-analysis and resting functional connectivity are presented which support this alternative account. We find distinct regions, possessing anti-correlated patterns of resting connectivity, associated with social reasoning (AG) and target detection (SMG) at the rTPJ. The locus for reorienting was spatially intermediate between the AG and SMG and showed a pattern of connectivity with similarities to social reasoning and target detection seeds. These findings highlight a general methodological concern for brain imaging. Given evidence that certain tasks not only activate some areas but also suppress activity in other areas, it is suggested that researchers need to distinguish two distinct putative mechanisms, either of which may produce an increase in activity in a brain area: functional engagement in the task vs. release of suppression.

**Keywords: right temporo-parietal junction (rTPJ), attention, social cognition, opposing domains hypothesis, anti-correlations, default mode network, task positive network (TPN), functional imaging methodology**

## **INTRODUCTION**

Research in cognitive neuroscience has implicated cortical regions near the right temporo-parietal junction (rTPJ) in a broad variety of tasks ranging from social interactions (Saxe and Powell, 2006) to attentional interactions with inanimate, visuo-spatial stimuli (Corbetta and Shulman, 2002; Corbetta et al., 2008). The central issue for this paper is how we may best account for observations of rTPJ involvement in attention and social processing.

## **ANATOMICAL AND FUNCTIONAL AMBIGUITY AT THE rTPJ**

The rTPJ does not have a distinct anatomical marker, but is considered to lie at the conjunction of the posterior superior temporal sulcus, the inferior parietal lobule and the lateral occipital cortex (Corbetta et al., 2008). This region of cortex has an unusually high degree of inter-individual variability in gross anatomical structure, as revealed both by careful anatomical observation (Ono et al., 1990) and quantified measures (Van Essen, 2005). Work on the cytoarchitecture of this region reveals substantial individual variation both in the size of functional regions and in the relationship between cytoarchitectonic borders and macroanatomical landmarks (Caspers et al., 2006). These factors make precise localization of functional regions near rTPJ identified using fMRI and PET challenging. A number of distinct anatomical labels have been used in the literature, including rTPJ, angular gyrus (AG), inferior parietal lobe, supramarginal gyrus (SMG), posterior temporal cortex and posterior superior temporal sulcus. These labels are not always used consistently; hence they cannot be relied upon to discriminate one functional region from another. Here we focus on a putative functional division between more posterior TPJ regions, including the AG, and more anterior TPJ regions, including the SMG.

## **ATTENTION AND THE rTPJ**

The rTPJ is thought to play a role in reorienting attention to behaviorally salient stimuli. The exact requirements for a stimulus to be considered salient remain unclear (Frank and Sabatinelli, 2012); however, the area has been shown to respond to distractors that share features with the target stimulus (Indovina and Macaluso, 2007) or are spatially informative of a target's location (Geng and Mangun, 2011). Regions near rTPJ show increased activity in response to breaches of expectation as well as identification of the target stimulus itself (Corbetta and Shulman, 2002). The most prominent theory integrating the rTPJ's function with other attentional processes suggests the area belongs to a right lateralized ventral attention network (VAN), composed of the TPJ, the middle and inferior frontal gyrus, frontal operculum, and anterior insula (Corbetta et al., 2008).

Current theory (Corbetta et al., 2002, 2008) suggests the VAN, specifically the rTPJ, plays the role of detecting unexpected but behaviorally relevant stimuli, and acts as a circuit breaker for the dorsal attention network (DAN). The DAN (Corbetta et al., 1998; Fox et al., 2005, 2006) comprises the intraparietal sulcus (IPS), superior parietal lobule, and the frontal eye fields (FEF) and is thought to be involved in top-down attentional processes. The DAN maintains visuo-spatial information with regard to the current task-defined goals, such as in response to a directional cue, while the VAN remains inhibited until a target or salient distractor is presented, at which point activity in the VAN interrupts the maintenance of attention in the DAN in order to reorient attention (Corbetta et al., 2002, 2008). Within the context of the VAN, the rTPJ has been most studied using variations on two tasks: oddball and Posner cue paradigms.

The standard oddball paradigm presents less frequent stimuli against a stream of frequent stimuli. The key feature is the novel/rare nature of the oddball targets compared to the typical or standard/frequent nature of the baseline stimulus. Visual stimuli are typically presented sequentially at a central fixation point (Bledowski et al., 2004) and in auditory tasks the stimuli are typically presented through headphones in both ears simultaneously (Stevens et al., 2005), although exceptions exist (Linden et al., 1999). As a result, the extent to which the task elicits spatial reorienting is often limited. In most instances participants are instructed to respond with a button press (Downar et al., 2001, 2002; Kiehl et al., 2005) or keep a mental count (Linden et al., 1999) of the number of target stimuli presented in the visual, auditory, and tactile sensory modalities (Linden, 2005).

The Posner cue-type experiment triggers the reorienting of attention in response to invalid cues. During the task the participant is presented with a central cue that more often than not predicts the location of a target stimulus. During invalid trials, the participant is cued to a different location than that of the target stimulus, necessitating a spatial reorienting of attention toward the target. The goal of the task is to detect the target stimulus and respond with a button press upon detection (Macaluso et al., 2002). The task has been studied in the visual (Corbetta et al., 2002) and auditory (Mayer et al., 2009) sensory modalities.
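The trial structure described above can be sketched as follows. The 80% cue validity, the two-location layout, and the reaction times are illustrative assumptions (the text only states that the cue predicts the target location more often than not), and the function names are hypothetical.

```python
import random


def make_trials(n_trials, p_valid, seed=0):
    """Build a cue/target list in which the cue is valid with probability p_valid."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        cue = rng.choice(["left", "right"])
        valid = rng.random() < p_valid
        other = "right" if cue == "left" else "left"
        trials.append({"cue": cue, "target": cue if valid else other, "valid": valid})
    return trials


def validity_effect(trials, reaction_times):
    """Mean RT on invalid trials minus mean RT on valid trials."""
    valid = [rt for t, rt in zip(trials, reaction_times) if t["valid"]]
    invalid = [rt for t, rt in zip(trials, reaction_times) if not t["valid"]]
    return sum(invalid) / len(invalid) - sum(valid) / len(valid)


trials = make_trials(n_trials=200, p_valid=0.8)
# Made-up reaction times in ms, used only to exercise the function.
toy_rts = [300.0 if t["valid"] else 350.0 for t in trials]
effect = validity_effect(trials, toy_rts)
print(effect)
```

The contrast of invalid against valid trials computed here is the behavioral counterpart of the imaging contrast that is usually taken to index reorienting.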

The oddball and Posner cue-type designs both involve the detection of unexpected (low frequency) task-relevant stimuli. Since this is a hypothesized function of the VAN, the co-localization of activations associated with both paradigms is consistent with theoretical accounts of the VAN. However, these tasks also differ in at least one important respect. Posner cue-type tasks require the reorienting of attention from one spatial location to another to respond to invalid trials. In contrast, oddball tasks don't require the participant to break their current focus of attention and make a spatial shift to a new location when a low frequency stimulus is presented.

## **SOCIAL COGNITION AND THE rTPJ**

The rTPJ has also been strongly implicated in social reasoning, specifically theory of mind (ToM) tasks. ToM refers to the ability to understand the intentions of a conspecific, i.e., to predict their actions through the attribution of beliefs and desires (Gallagher and Frith, 2003). ToM studies typically involve short stories followed by questions about the beliefs of one of the protagonists (Gallagher et al., 2000; Saxe and Powell, 2006) or the attribution of intentions to characters depicted in a comic strip (Vollm et al., 2006). The ToM condition is typically contrasted with stories describing human activity without the need for mental state attributions, such as outdated physical representations (Perner et al., 2006).

The rTPJ is part of a larger network of regions which is consistently activated by a variety of social cognition tasks which involve thinking about internal mental states, often referred to as the mentalizing network (Ochsner et al., 2004; Amodio and Frith, 2006; Saxe et al., 2006; Van Overwalle, 2009; Denny et al., 2012; Mars et al., 2012b; Schilbach et al., 2008, 2012). The regions which are most consistently associated with mentalizing are the rTPJ, the medial parietal/posterior cingulate cortex (MP/PCC) and the dorsal medial prefrontal cortex (dMPFC). There is evidence that these medial mentalizing regions play a relatively general role in social cognition, including emotion processing and introspection (Schilbach et al., 2012), whereas the function of the rTPJ appears to be more specific to the attribution of beliefs and intentions to others (Saxe and Powell, 2006; Saxe et al., 2006).

## **RELATIONSHIP BETWEEN ATTENTION AND SOCIAL COGNITION IN THE rTPJ**

The current literature remains unsettled as to the extent to which the locus of activity at the rTPJ for mental state attribution coincides with the locus of activity at the rTPJ region involved in attentional processes. Mitchell (2007) found no topographical distinction between the two processes at the group or individual level of analysis. A meta-analysis published by Decety and Lamm (2007) found overlapping yet significantly different areas recruited for social and reorienting processes. Decety and Lamm's interpretation of these findings focuses on the overlap. This is curious, since meta-analytic investigations can statistically support the claim that two conditions have distinct spatial profiles, but cannot directly speak to the issue of whether two regions do or do not have functional overlap<sup>1</sup>. Nonetheless, these researchers explain these findings by noting there may be similarities between the process involved in reorienting spatial attention and reorienting to another person's point of view (Decety and Lamm, 2007; Mitchell, 2007; Corbetta et al., 2008). In contrast, Scholz et al. (2009) found evidence of distinct activation peaks associated with ToM and attention reorienting, using both group and individual level analyses<sup>2</sup>. These authors resist the view that attention reorienting and ToM tasks share a common neural or psychological mechanism.

An important finding from work in resting state functional connectivity (rs-fcMRI) is the observation of negative correlations between cortical networks. Fox et al. (2005) identify two anticorrelated networks: the default mode network (DMN) and the task positive network (TPN). The DMN includes a region near rTPJ, the AG. The TPN overlaps the DAN and a second network called the fronto-parietal control network (FPCN) (Vincent et al., 2008)<sup>3</sup>. The TPN also includes a region near the rTPJ, the SMG (Fox et al., 2005; Jack et al., 2012). Research on the relationship between social and non-social processes in the brain suggests these antagonistic networks support two distinct cognitive domains. The opposing domains hypothesis holds that the mutually inhibitory relationship between the DMN and TPN reflects a cognitive tension between social cognition (including mentalizing and introspection) and non-social cognitive processes (typically recruited by attention demanding non-social tasks) (Jack et al., 2012). These findings suggest not just that there are at least two distinct regions near rTPJ, but also that they are in tension with each other. This claim is supported not only by resting state functional connectivity analysis, but also by the finding that the same regions are activated and suppressed (relative to a resting baseline) by different task conditions (Jack et al., 2012). The task-induced activation and deactivation of these regions is important to note, because this evidence cannot be explained away as a potential artifact of methods commonly used in functional connectivity analyses (Murphy et al., 2009). Critically, a broad range of evidence now supports the view that the maintenance of externally-oriented attention in non-social tasks suppresses activity in the DMN below resting levels (Raichle and Snyder, 2007). It follows from this that the breaking of attention may give rise to a relative increase in activity in regions associated with social cognition, even in the absence of any social processing demands and purely as a result of the termination of suppression, which allows activity to return to resting levels.
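To make the notion of anti-correlated seeds concrete, the following minimal sketch computes the Pearson correlation between two synthetic time courses that are driven in opposite directions by one latent fluctuation. The anti-correlation is built into the toy data by construction, so the example says nothing about where such correlations come from in real rs-fcMRI data; the variable names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Synthetic signals, NOT real data: one latent fluctuation with opposite sign
# in the two seeds, plus independent noise in each.
rng = random.Random(1)
latent = [rng.gauss(0, 1) for _ in range(300)]
seed_dmn_like = [s + rng.gauss(0, 0.5) for s in latent]
seed_tpn_like = [-s + rng.gauss(0, 0.5) for s in latent]

r = pearson(seed_dmn_like, seed_tpn_like)
print(round(r, 2))
```

A strongly negative r of this kind is what is meant by two seeds being anti-correlated at rest.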

rs-fcMRI has also been used as a data-driven tool to identify the borders of distinct functional regions on the basis of changes in connectivity. Initial work on this application indicates considerable variability in the degree to which clear boundaries between regions can be defined (Cohen et al., 2008); however, some areas contain very clear boundaries between contiguous regions with highly disjoint patterns of functional connectivity. One such boundary occurs in the TPJ, between the AG and SMG, in the immediate vicinity of activation foci associated with ToM tasks and with the VAN. These findings support the existence of two distinct functional networks, including a more posterior region incorporating the AG and a more anterior region incorporating the SMG, which are contiguous at the TPJ (see Figure 3 in Cohen et al., 2008). The existence of more than one region in this area is also supported by work in a distinct modality, diffusion tensor imaging, which identifies distinct regions near the rTPJ using tractography-based parcellation (Mars et al., 2012a).

## **AN ALTERNATIVE ACCOUNT**

The opposing domains hypothesis holds that regions involved in non-social attentional processing and social cognition are not only distinct, but also tend to suppress each other. How might this theory account for observations of the rTPJ's involvement in both attention and social processing? We suggest extending the opposing domains hypothesis with an auxiliary hypothesis: the breaking of attentional set that occurs during reorienting of attention leads to an increase in activity in social regions as a result of the release of suppression associated with the maintenance of focused attention. If both the opposing domains hypothesis and this auxiliary hypothesis are correct, then several predictions follow: (1) There should be distinct loci of activation associated with processes which are clearly social in nature (e.g., ToM tasks) and processes which are clearly non-social (e.g., detection of a non-social target, as occurs in oddball tasks). (2) Invalid trials in Posner-cue type experiments should lead both to an increase in activity in social regions (associated with release of suppression during reorienting) and an increase in activity in non-social regions (associated with detection of a non-social target).

The opposing domains account suggests distinct rTPJ areas are involved in social and attentional processing. Why might researchers have struggled to clearly distinguish between these putatively distinct but adjacent areas? We suggest that the region's inconsistent structural organization and variations across experimental paradigms have resulted in the misattribution of contiguous regions' response profiles to a single region. The response profile of the rTPJ, in the context of the VAN, may be falsely informed by fMRI findings that fail to account for the strong negative correlation, observed both in resting connectivity and due to tasks, between separate areas at the rTPJ. BOLD changes associated with reorienting may reflect the sum of two independent effects which occur in contiguous regions effectively simultaneously (given the temporal resolution of fMRI). The first is activation above resting baseline of the SMG associated with the detection of a low-frequency task-relevant stimulus. The second is release of deactivation in the AG, possibly only a recovery to baseline levels, which may in some paradigms be followed by a rapid return to a suppressed state due to processes involved in target detection (SMG activation) and/or re-engagement of attention (DAN activation). Although these two putative effects would reflect very different cognitive mechanisms, they may nonetheless produce similar event-related responses in immediately contiguous regions.

<sup>1</sup>This follows from the fact that meta-analytic investigations are based on information about activation peaks, which are not informative about the spatial extent of activation. Further, variations in individual anatomy and in atlas registration for different studies mean that even conditions with distinct peak loci may not be resolved and appear to overlap. On the other hand, if formal meta-analysis reveals a significant difference in location between conditions, then a secure inference can be made that the conditions have spatially distinct activation profiles, because the location of peaks is informative about the spatial distribution of activation and random variations in anatomy contribute to the error term.

<sup>2</sup>Scholz et al. (2009)'s title might be read as implying the existence of two regions that they demonstrate are functionally distinct. However, their own evidence suggests functional overlap, since their attention reorienting region is modulated by ToM and their ToM region is modulated by attention reorienting. Scholz et al. (2009) do not present a statistical analysis that addresses the issue of whether the regions they identify are functionally overlapping or distinct. This would require demonstrating an interaction with spatial location, where the spatial locations are identified on the basis of independent data. They do present a statistical analysis based on individual subject analysis which supports the claim the conditions are associated with distinct peak activations. This finding is consistent with findings we report, and with the view that there is functional overlap between ToM and reorienting.

<sup>3</sup>While the TPN was aligned with the dorsal attention network in Fox et al.'s initial papers (Fox et al., 2005, 2006), the spatial characterization of the TPN in those analyses was constrained both by negative correlations with seeds in the DMN and by positive correlations with points generated by studies of visual attention. Later studies have more simply identified areas which are negatively correlated with DMN seeds (Chang and Glover, 2009; Fox et al., 2009; Chai et al., 2012; Jack et al., 2012). These regions overlap both the DAN and FPCN.

If this account is correct, then the "circuit breaker" function which VAN theory attributes to the rTPJ may be best explained by the posterior TPJ's (including the AG) involvement in social cognition, a type of processing which is in competition with focused attention. Such an account would still suggest a possible "circuit breaker" role for the posterior TPJ; however, this role would likely be non-specific in nature, involving a tendency to suppress attentional processes in general rather than communicating specific information that might inform the re-orienting of attention. This account holds that the anterior TPJ (including the SMG), in contrast to the posterior TPJ (including the AG), is directly involved in attentional processes.

## **SUMMARY OF KEY HYPOTHESIS**

The key hypothesis we propose here, and marshal evidence to support, is as follows: Reorienting (unlike oddball) paradigms require the participant to break their attentional set, i.e., on invalid trials the participant must release sustained focused attention from its cued location to complete the task. The maintenance of focused attention is one of the cognitive processes that tend to suppress DMN regions (while activating attention regions). When focused attention is broken, this suppression is (usually only temporarily) lifted. This causes activity in the posterior TPJ (e.g., AG) to increase relative to its suppressed state, just as happens when a compressed spring is released.

While this hypothesis is novel and tentative in the context of attention reorienting tasks, there is prior evidence which broadly supports this "compressed spring" model of DMN activity. There is clear evidence that DMN regions are more suppressed for higher effort non-social tasks, and that there is a return to baseline when participants disengage, either because the task finishes or because of mind-wandering (McKiernan et al., 2003; Mason et al., 2007). In addition, there is evidence of a "rebound" effect, such that DMN activity is greater during resting periods the more it has been suppressed by a preceding working memory task (Pyka et al., 2009). We hypothesize that the sudden breaking, and subsequent refocusing, of attention that occurs in reorienting tasks produces a similar pattern, but on a shorter timescale. That is, reorienting produces a transient release of suppression whose BOLD time course looks similar to that of an above-baseline event-related response.
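The distinction between functional engagement and release of suppression can be illustrated with a toy example. The two time courses below are hand-made in arbitrary units, with no hemodynamic modeling, and the region labels in the comments are hypothetical. The point is only that an event-related measure computed against the pre-event level cannot tell the two cases apart.

```python
REST = 0.0  # resting baseline, arbitrary units


def event_response(timecourse, event_index, window=3):
    """Peak signal in the window after the event minus the sample just before it."""
    pre = timecourse[event_index - 1]
    post = timecourse[event_index:event_index + window]
    return max(post) - pre


# Hypothetical SMG-like region: at rest until the target appears, then activated.
engaged = [REST] * 5 + [1.0, 1.0, 0.5] + [REST] * 4
# Hypothetical AG-like region: suppressed by focused attention, briefly released.
released = [-1.0] * 5 + [REST, REST, -0.5] + [-1.0] * 4

engaged_response = event_response(engaged, event_index=5)
released_response = event_response(released, event_index=5)
exceeds_rest = max(released) > REST
print(engaged_response, released_response, exceeds_rest)
```

Both regions show the same event-related increase of one unit, yet the second never rises above its resting level. Separating the two cases therefore requires a resting baseline in addition to the task contrast.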

While this hypothesis is tentative, it nonetheless raises questions about the view that the AG is involved in attentional reorienting in the manner envisaged by VAN theory. In addition to having implications for VAN theory, this idea has quite broad implications for the interpretation of neuroimaging findings. The usual inference that is made from the observation that an area increases in activity concomitant with a task event is that the area plays a direct functional role in the task-related cognitive processes that occur at that moment. This is the basic logic of cognitive subtraction (Price and Friston, 1997). However, this logic has already been implicitly acknowledged as incorrect for cases where an increase in activity can be more simply explained by a decrease in suppression (McKiernan et al., 2003; Mason et al., 2007). VAN theory focuses on a region which, similar to other DMN regions, is typically deactivated compared to rest during task performance (Shulman et al., 2007). VAN theory interprets activation of this region following the well-established and intuitive logic of cognitive subtraction. Our provocative suggestion is that this logic fails to apply. Specifically, we suggest that transient increases in activity near the AG have been incorrectly attributed to that region playing an active role in attention reorienting, when the observed effect is really due to the transient release of suppression of that region<sup>4</sup>.

#### **EXPERIMENTAL DESIGN**

To test our alternative account of rTPJ involvement in attention and social cognition, we sought to localize and investigate the functional connectivity of regions associated with the detection of task-relevant infrequent stimuli, the attribution of intentions to agents, and the reorienting of attention. To do this, we use formal meta-analytic methods to distinguish the localization of activations associated with oddball, ToM and reorienting paradigms. Of particular significance is that, unlike a prior formal meta-analysis which investigated attention and social processes in rTPJ (Decety and Lamm, 2007), we distinguish oddball from reorienting tasks. We predict that oddball paradigms will preferentially recruit the anterior TPJ (e.g., SMG), ToM tasks will preferentially recruit the posterior TPJ (e.g., AG), and reorienting will tend to be localized between the AG and SMG. Next, we examine functional connectivity associated with these distinct foci. In accordance with the opposing domains hypothesis we predict very different cortical networks will be associated with ToM and oddball seeds. The reorienting seed is predicted to lie on the border between these networks, and hence correlations with this seed should reflect some combination of signals associated with the other two seeds.

<sup>4</sup>A concern the reader may have with this account is that it would appear inefficient for the brain to expend energy increasing activity in a region whose function is unrelated to task demands. However, a large body of work indicates the brain is 'inefficient' in this way: DMN activity typically increases when non-social task demands terminate (Raichle and Snyder, 2007). Hence, this concern is not specific to the account we give here.

## **MATERIALS AND METHODS**

## **LITERATURE SEARCH AND COORDINATE SELECTION**

The research articles used as a source of foci for the meta-analyses were identified in two ways. First, we gathered papers referenced in Decety and Lamm's formal meta-analysis (2007), as well as in Corbetta and Shulman's (2002) and Corbetta, Patel and Shulman's (2008) reviews. Second, additional papers were identified by performing a search on Google Scholar using the terms "fmri" or "pet"; and "reorienting," "posner," "oddball," "target detection," or "ToM."

Once a database of 50 potentially relevant papers was identified, each paper was categorized as containing either a ToM, attention reorienting, or target detection task. ToM tasks were defined as reasoning about beliefs, intentions, or thoughts. Foci of interest contrasted tasks requiring the attribution of mental states to matched tasks that did not require the participant to consider others' beliefs or intentions. Attention reorienting tasks were defined as redirecting attention toward a target stimulus after a breach of expectation. Foci of interest contrasted trials when participants had to redirect attention after being misinformed about the upcoming target stimulus' location to trials when participants were correctly informed. Target detection tasks were defined as the presentation of a distinct and infrequent stimulus during a stream of frequent stimuli. Foci of interest contrasted trials when participants encountered an oddball to non-oddball trials.

Rather than filtering out papers based on a reported coordinate's proximity to idealized rTPJ coordinates as in a prior meta-analysis (Decety and Lamm, 2007), foci tables containing analyses that reflected a given task definition were all included in the meta-analyses. All of the foci from an analysis were extracted from a paper and reported in stereotactic coordinates (*x*, *y*, *z*). If the coordinates were reported in the Montreal Neurological Institute (MNI) space, they were converted to the Talairach and Tournoux (TAL) space using the Brett transformation (Brett, 1999).
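As an illustration of the coordinate conversion step, the following is a minimal sketch of the two-part affine approximation commonly distributed as the `mni2tal` script and attributed to Brett. The coefficients are those of that widely circulated script and are an assumption here, since the text does not list them; the function name and the example coordinate are hypothetical.

```python
def mni_to_talairach(x, y, z):
    """Approximate MNI -> Talairach conversion (two-part affine, mni2tal-style).

    Assumption: coefficients follow the commonly distributed mni2tal script.
    A different transform is applied above and below the AC-PC plane (z = 0).
    """
    tx = 0.9900 * x
    if z >= 0:
        ty = 0.9688 * y + 0.0460 * z
        tz = -0.0485 * y + 0.9189 * z
    else:
        ty = 0.9688 * y + 0.0420 * z
        tz = -0.0485 * y + 0.8390 * z
    return (tx, ty, tz)


# Hypothetical MNI coordinate in the right temporo-parietal region.
tal = mni_to_talairach(54, -48, 20)
```

Because the transform is piecewise, foci just above and just below *z* = 0 are scaled slightly differently along the *z* axis.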

## **META-ANALYSES**

Separate meta-analyses were performed to localize activation for each task using activation likelihood estimation (Eickhoff et al., 2009), with a full-width-at-half-maximum (FWHM) of 10 mm, *p*-value threshold of *p* < 0.004, and a false discovery rate (FDR) threshold of *q* = 0.05. In addition, differences in activation between the three tasks were computed using difference maps (Laird et al., 2005), using 5000 permutations. The thresholded ALE maps from both analyses were visualized on a fiducial representation of a standardized brain atlas (PALS-B12 human atlas) using Caret version 5.612.

## **RESTING STATE FUNCTIONAL CONNECTIVITY ANALYSES**

For each task, the results of the meta-analyses were visualized in Caret and the centres of activation near the rTPJ were identified and used as seeds for three separate resting state functional connectivity analyses. **Table 1** lists the coordinates used as seeds for the analyses. Resting state data was retrieved from the public database NITRC on February 15, 2010. Two data sets were used: Beijing\_Zang (Zang, Y. F.; *n* = 198 [76M/122F]; ages: 18–26; *TR* = 2; no. of slices = 33; no. of timepoints = 225) and Cambridge\_Buckner (Buckner, R. L.; *n* = 198 [75M/123F]; ages: 18–30; *TR* = 3; no. of slices = 47; no. of timepoints = 119). The total combined number of subjects was 396 (245 female), aged 18–30 (mean age 21.1). The data was aligned to 711–2B atlas space. All methods were identical to those reported by Fox et al. (2005, 2006, 2009; Jack et al., 2012) and similarly employed a global gray matter regressor, except that statistical contrasts used a random effects method (Jack et al., 2012), and the resulting statistical images were whole brain corrected for multiple comparisons (*z >* 3, *n* = 17). Contrasts either used one Fisher *z*-transformed correlation image per subject entered into a single sample *t*-test, or two such images corresponding to the two seeds entered into a paired *t*-test.
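The statistical core of the seed-based contrasts described above can be sketched at the level of a single voxel. The sketch assumes that each subject contributes one seed-to-voxel Pearson correlation per seed. It does not reproduce the preprocessing, the global gray matter regressor, or the whole-brain correction, and all function names are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long time courses."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fisher_z(r):
    """Fisher r-to-z transform; undefined for |r| = 1."""
    return math.atanh(r)


def one_sample_t(values):
    """t statistic testing whether the mean of `values` differs from zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)


def paired_t(values_a, values_b):
    """Paired t statistic: a one-sample test on within-subject differences."""
    return one_sample_t([a - b for a, b in zip(values_a, values_b)])
```

Treating subjects as the unit of observation in this way is what makes the contrast a random effects analysis: the *t* statistic reflects consistency across subjects rather than the number of timepoints per subject.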

## **RESULTS**

## **META-ANALYSES**

The studies used in the primary meta-analyses are listed in **Tables 2**–**4**. In total, the reorienting category contained 14 papers (139 foci), the oddball category contained 12 papers (199 foci), and the ToM category contained 12 papers (104 foci).

#### **Table 1 | Connectivity analysis coordinates.**

*Coordinates used as seeds for each task in the resting state connectivity analyses.*

#### **Table 2 | Target detection meta-analysis studies.**

#### **Table 3 | Reorienting meta-analysis studies.**

*\*Denotes additional papers included in the secondary meta-analysis.*

In response to a reviewer's concern about whether the meta-analysis accurately represented each category, a secondary, *post hoc* meta-analysis was conducted including foci from an additional four reorienting and 16 ToM papers. A total of 18 reorienting papers (169 foci) and 28 ToM papers (239 foci) were used in the secondary analysis. Papers used in the secondary meta-analysis are listed and indicated in **Tables 2**–**4**. **Figure 1** shows the results from this secondary extended meta-analysis instead of the primary analysis. The results were highly consistent, such that the seed regions originally identified from peak significance did not need to be altered (**Figure 1**). The principal difference between the two meta-analyses was that the secondary analysis produced more extended areas of significance in the expanded categories.

**Figure 1D** displays the results of the three single-condition analyses. Each of the three conditions shows areas of activation unique to each task (see figure description for peaks of activation; **Table 5** for whole-brain peaks of activation). The ToM and reorienting regions of interest (ROIs) near the rTPJ show some overlap (purple area), with the ToM ROI extending more posteriorly at the AG and the reorienting ROI extending more anteriorly. While the peak of the reorienting ROI lay dorsal to the ToM ROI, the reorienting ROI extended in a dorsal-ventral direction such that it clearly separated a posterior TPJ region (including the AG) from an anterior TPJ region (including the SMG). Note the clearly distinct peak activation region at the rTPJ for the target detection ROI, located more anteriorly at the SMG compared to both the ToM and reorienting ROIs. **Figures 1A–C** display the results of the difference maps. All three comparisons resulted in distinct areas of peak activation for each task near the rTPJ, conforming to the same spatial distribution suggested by the initial meta-analyses. The peaks of activation clusters for each difference map from the primary analysis are listed in **Table 6**.

#### **Table 4 | Theory of mind meta-analysis studies.**

*\*Denotes additional papers included in the secondary meta-analysis.*

These findings support our hypotheses that the detection of infrequent behaviorally-relevant stimuli is associated with peak activation in the anterior TPJ (SMG), that attributing intentions to others is associated with a distinct locus of peak activation in the posterior TPJ (AG), and that tasks involving spatial reorienting demonstrate peak activation at points intermediate between these areas.

## **RESTING STATE FUNCTIONAL CONNECTIVITY ANALYSES**

**Figures 2A–C** display the results of the resting state connectivity analyses.

Consistent with our view that regions supporting ToM (e.g., AG) and regions supporting target detection (e.g., SMG) have distinct functional roles, the ToM and target detection ROIs show very different patterns of resting connectivity. There was a complete absence of overlap in either their positive or negative connectivity patterns (a direct comparison is illustrated in **Figures 3**, **4**). Consistent with our claim that the ToM region is part of the DMN, the ToM seed shows positive connectivity with the DMN, particularly MP/PCC and dMPFC regions associated with mentalizing. In addition, consistent with our claim that the ToM region has a reciprocal inhibitory relationship with the DAN, regions anti-correlated with the ToM seed show an excellent correspondence with the DAN as identified in prior publications (Fox et al., 2005, 2006).

The target detection seed demonstrates a positive relationship with the anterior insula, supplementary motor area, and anterior cingulate cortex, regions that are involved in saliency detection, effort, and task difficulty and are typically recruited during oddball tasks (Linden et al., 1999). Consistent with our claim that regions supporting target detection have a reciprocal inhibitory relationship with the DMN, regions anti-correlated with the target detection seed show an excellent correspondence with the DMN as identified in prior publications (Fox et al., 2005), including rTPJ, MP/PCC, and dMPFC regions specifically associated with mentalizing (Van Overwalle, 2009; Denny et al., 2012).

Similar to findings reported in Fox et al. (2006), our reorienting seed identified positively correlated regions in medial frontal gyrus, inferior frontal gyrus, a region in medial prefrontal cortex posterior to the dMPFC region previously mentioned, and anterior insula. Hence our positive connectivity pattern was broadly equivalent to theirs. However, the positive correlations we observed appeared relatively weaker, and we identified anti-correlations with DAN regions which were not observed by Fox et al. (2006).

Visual inspection of **Figure 2B** indicates that the reorienting seed demonstrates substantial overlap between both the positive and negative resting state correlation patterns of the ToM seed (see **Figures 3**, **4**, yellow areas) and target detection seed (see **Figures 3**, **4**, light blue areas). To further examine the hypothesis that the reorienting seed involves the combination of signals associated with the other seeds, we examined differences in connectivity between the reorienting seed and the two other seeds. If the reorienting seed corresponds to a region with a distinct functional connectivity pattern, then distinct regions should be observed which cannot be accounted for by the connectivity of the other seeds. However, this was not what we observed. Examining differences between the reorienting and target detection seeds (**Figure 2D**), we found a pattern very similar to that observed for the ToM seed (**Figure 2C**). In particular, no areas of positive connectivity were identified which could not be accounted for by hypothesizing that the reorienting seed involves the combination of signals from the ToM and target detection seeds. Examining differences between the reorienting and the ToM seeds (**Figure 2E**), we found a pattern very similar to that observed for the target detection seed (**Figure 2A**). There were two areas of positive connectivity which appeared greater than for the target detection seed, in anterior middle frontal gyrus, and inferior frontal/insula. However, these apparent positives could be accounted for by anti-correlations with the ToM seed. No areas of positive connectivity were identified which could not be accounted for by hypothesizing that the reorienting seed involves the combination of signals from the ToM and target detection seeds.

#### **Table 5 | Meta-analyses results.**

*Coordinates of clusters produced by the primary meta-analyses. Anatomical labels produced by GingerALE.*

## **DISCUSSION**

Our goal in this paper is to articulate an alternative account of the involvement of regions near the rTPJ in attention and social processing, and provide evidence which is more consistent with our account than with extant theory concerning the VAN.

## **CHALLENGES TO VAN THEORY**

Our findings are consistent with other findings which suggest there are at least two functionally distinct regions near rTPJ (Caspers et al., 2006; Cohen et al., 2008; Scholz et al., 2009; Mars et al., 2012a), and that these regions are part of two distinct networks which can be differentiated using rs-fcMRI (Fox et al., 2005; Cohen et al., 2008; Mars et al., 2012a) and by virtue of their differential engagement in attention demanding social and non-social tasks (Fox et al., 2005; Jack et al., 2012). We add to these prior observations by demonstrating that these distinct networks at the rTPJ correspond to distinct loci for target detection and ToM, using formal meta-analysis. These findings present three challenges to current theory concerning the VAN (Corbetta and Shulman, 2002; Corbetta et al., 2008).

First, contra Corbetta and Shulman (2002), our findings indicate that target detection has a distinct locus from reorienting. Current theory holds that oddball and reorienting paradigms both activate the VAN because both involve the detection of behaviorally relevant unexpected stimuli. However, we suggest this account oversimplifies reorienting of attention by equating it to a purely confirmatory process (i.e., target detection). A target is undoubtedly detected during invalid trials, but in addition, the preceding attentional set is broken and the locus of attention changed to the unexpected location. The existence of this additional process in the Posner cue-type design is supported by highly consistent findings of longer response times for invalid compared to valid trials (Corbetta et al., 2002; Hopfinger and Ries, 2005; Mayer et al., 2009). In contrast, there is no need to break attentional set in oddball paradigms. In accordance with our distinction between the two types of task, the meta-analysis identified two separate areas at the rTPJ for reorienting and target detection.

Second, contra Corbetta et al. (2008), our findings indicate that ToM paradigms recruit a neighboring but significantly distinct locus from reorienting and target detection. Our account can explain the seemingly contradictory findings of prior studies which have directly compared ToM and reorienting tasks. Importantly, both prior studies included analyses of individual participants, overcoming the problem of inter-individual differences at the rTPJ. Mitchell (2007) found no topographical distinction between the two processes, whereas Scholz et al. (2009) found evidence of distinct activation peaks associated with ToM and attention reorienting. These differences between the studies may be accounted for by differences in the methods of analysis, or by scanner resolution differences, as Scholz et al. suggest. Alternatively, they may be due to differences in the designs of the reorienting paradigms, which are likely to have altered the relative balance of contributions made by the AG and SMG networks to the reorienting event-related signal<sup>5</sup>. In fact, even using high resolution imaging with regions defined in individual participants, Scholz et al. (2009) report modulation of the ToM area associated with reorienting and modulation of the reorienting area associated with ToM. This finding is difficult to account for on Scholz et al.'s own model, which holds that the regions play wholly functionally distinct roles in reorienting and ToM. However, it is consistent with our view that ToM and target detection are functionally connected by virtue of a mutually inhibitory relationship (Jack et al., 2012). A meta-analysis published by Decety and Lamm (2007) also found a significant difference in peak activation location associated with social and attentional processes. Our results are consistent with theirs. However, they did not distinguish reorienting from target detection foci.

#### **Table 6 | Difference maps results.**

*Results from the difference maps from the primary meta-analysis. Centres of activation as reported by GingerALE for each contrast listed with papers containing foci that fell within the areas of activation. Note that a focus does not have to lie within a cluster to significantly contribute to the cluster. "Subjects represented" is the percent of subjects from the papers within the significant cluster over the total number of subjects in the given task category. "rTPJ mentioned" is the percent of papers specifically implicating the rTPJ within the significant clusters. REATTN, reorienting; ODATTN, target detection; TOM, theory of mind.*

**FIGURE 2 | Resting state connectivity results.** Results from the resting state connectivity analyses for each seed showing distinct patterns of connectivity for the **(A)** target detection, **(B)** reorienting, and **(C)** ToM seeds. The target detection seed shows a positive relationship with the TPN and a negative relationship with areas of the DMN. The ToM seed shows the opposite pattern, a positive relationship with the DMN, and a negative relationship with TPN areas. Results from the resting state connectivity contrasts showing the comparison of **(D)** reorienting and target detection connectivity and **(E)** reorienting and ToM connectivity. The contrast shown in **(D)** yields a pattern of connectivity highly similar to the ToM seed connectivity **(C)**, while the contrast shown in **(E)** yields a pattern highly similar to the target detection seed connectivity **(A)**. Left hemisphere connectivity patterns were very similar to right hemisphere connectivity patterns.

**FIGURE 4 | Negative connectivity results for all three seeds.** The ToM and target detection seeds demonstrate a complete lack of overlap between their negative resting state correlation patterns (purple areas). All three seeds show minimal overlap in negative connectivity (white areas).

Third, contra Fox et al. (2006), our findings suggest that rs-fcMRI derivations of the VAN using a reorienting seed may result from the confounding of distinct signals. To allow a meaningful comparison, we used identical rs-fcMRI methods to the prior report (Fox et al., 2006). The only differences are that: our reorienting seed is based on a larger sample of reorienting foci which we analyzed using formal meta-analysis methods, our functional connectivity findings are derived from a considerably larger sample, we used random rather than fixed effects analysis methods, and we added the use of paired *t*-tests for the purposes of comparing connectivity associated with different seeds.

The contrast between the reorienting and target detection connectivity produced a correlation pattern almost identical to that of the ToM seed, whereas the contrast between the reorienting and ToM connectivity produced a correlation pattern almost identical to that of the target detection seed. The logic of our analysis is straightforward. If the reorienting seed corresponds to a distinct functional network, then the paired *t*-tests should have revealed evidence of connectivity to regions which could not be accounted for by correlations with the ToM and target detection seeds. We do not deny the possibility that there is a distinct functional network interposed between the AG and SMG, as suggested by some recent reports (e.g., Yeo et al., 2011). However, we do not believe that the methods used in these reports are able to clearly distinguish between correlations which arise due to the summing of signals from contiguous regions and correlations which genuinely reflect the existence of a distinct network. Further, we note very low confidence estimates for networks in this region (see Figures 8, 10 in Yeo et al., 2011). Since it is more parsimonious to assume two networks are present in this region, as opposed to three (Figure 7 in Yeo et al., 2011) or six (Figure 9 in Yeo et al., 2011), we suggest this should be the null hypothesis pending the development of independently validated methods that can unequivocally distinguish between these possibilities.
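The logic of this test can be made explicit under a simplifying assumption that is introduced here for illustration only: suppose the reorienting seed time course $s_R$ is a linear mixture of the ToM seed time course $s_T$ and the target detection seed time course $s_D$, with hypothetical mixing weights $\alpha$ and $\beta$. Because covariance is linear, the correlation of the reorienting seed with any voxel time course $v$ is then a weighted sum of the other two seeds' correlations with that voxel:

$$
s_R = \alpha\, s_T + \beta\, s_D
\quad\Longrightarrow\quad
\operatorname{corr}(s_R, v) = \frac{\alpha\,\sigma_T}{\sigma_R}\,\operatorname{corr}(s_T, v) + \frac{\beta\,\sigma_D}{\sigma_R}\,\operatorname{corr}(s_D, v)
$$

Here $\sigma_T$, $\sigma_D$ and $\sigma_R$ denote the standard deviations of the three seed time courses. On this idealized view, a voxel that is uncorrelated with both the ToM and target detection seeds must also be uncorrelated with the reorienting seed, so a genuinely distinct network would show up as connectivity that this weighted sum cannot produce.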

## **CIRCUIT BREAKING**

VAN theory and our account are both consistent with a circuit breaking role for rTPJ regions which are suppressed during visual search. However, our account suggests a different type of circuit breaking. VAN theory holds that suppressed regions are involved in the filtering of unexpected stimuli and, when a task relevant unexpected stimulus is detected, send information about that stimulus to the DAN to guide the reorienting of attention (Shulman et al., 2007; Corbetta et al., 2008). Our account sees filtering and sending information about salient stimuli as potential functions of the anterior TPJ (e.g., SMG). The posterior TPJ (e.g., AG) is the primary locus of suppression, and is dedicated to tracking the intentions of perceived agents. Nonetheless, since the AG is in tension with the DAN, our account is consistent with its playing a more general circuit breaking role.

One possibility is that transient activation of the AG sends a non-specific reset signal to the DAN, akin to adding noise to a dynamic system so that it can settle into a new global minimum. However, we note that theoretical explanations proposing the role of the rTPJ as a circuit-breaker (Corbetta et al., 2008) lack confirmation of the area's purported beneficial role in resetting top-down influences from the DAN. The existing evidence shows increases in activity at rTPJ to be detrimental to target detection (Shulman et al., 2007), and a negative relationship between behavioral performance and a measure of the VAN's causal influence on the DAN (Wen et al., 2012). Research on the time course of the rTPJ and DAN, while not conclusive, suggests the rTPJ's activity follows transient activity in the DAN (DiQuattro and Geng, 2011), a result contrary to the circuit-breaker hypothesis of rTPJ function. Instead, the anterior TPJ (SMG) may be involved in updating attentional sets by working in concert with the IFG, which in turn modulates activity in the DAN (Sridharan et al., 2007; DiQuattro and Geng, 2011; Vossel et al., 2012; Weissman and Prado, 2012). Hence, we remain neutral concerning the potential circuit breaking role of the posterior TPJ (e.g., AG), awaiting evidence which more clearly distinguishes the roles of these regions. An alternative to the circuit breaker hypothesis, which is equally consistent with our account, is that disruption of a suppressive signal that originates either in the DAN or a third region such as the IFG causes the posterior TPJ (e.g., AG) to be temporarily released.

Published maps of the VAN obtained using rs-fcMRI are variable. There are notable discrepancies between two papers with overlapping authors (Fox et al., 2006; Mantini et al., 2009), most notably with regard to whether or not anti-correlations are seen with the DMN, but also with regard to regions of positive connectivity. One of the VAN maps coheres well with our SMG target detection map (Mantini et al., 2009), the other is more similar to our reorienting seed map (Fox et al., 2006). Our account can readily explain such discrepancies, which may result from small variations in the location of the seed near the border between discrete functional networks. However, another possible explanation is the presence of a third, more dorsal region at the rTPJ, between the AG and SMG. Recent work has emphasized the role of additional networks other than the VAN and DAN in attention (Petersen and Posner, 2012). One such network, the frontoparietal control network (FPCN), is involved in moment-to-moment aspects of executive control, often associated with cue-onset activity within trials, and includes an area more dorsal than the rTPJ node of the VAN. However, the extent to which this region is distinct from DAN (Dosenbach et al., 2008) and VAN (Dosenbach et al., 2006) areas near the rTPJ remains unclear. Outside of standard attentional control tasks, the FPCN is also hypothesized to support executive control in tasks that specifically recruit the DMN (Spreng et al., 2010). Spreng et al. (2012) argue that the network supports goal-directed cognition, whether it be social or visuospatial in nature, pointing to the mediatory connectivity profiles between the FPCN and DAN, as well as the FPCN and DMN, as evidence.

<sup>5</sup>Notably Scholz et al. (2009) only found a very small area of significant activation associated with attention reorienting in their group analysis, even though they had a relatively large number of participants (*n* = 21). This suggests that their implementation of the attention reorienting paradigm was different from other groups, who have identified more extensive activations.

The overlap between our reorienting connectivity areas and the FPCN is unclear. Nonetheless, our connectivity contrasts are potentially congruent with such an account. The FPCN's high degree of interconnectivity with both the TPN and DMN may be reflected in our finding that separately subtracting reorienting connectivity from AG and SMG connectivity leaves no regions that could not be explained by correlations with the AG and SMG seeds.

In summary, the number of attention networks has increased and evolved into a more complex account than simply the DAN and VAN (Corbetta and Shulman, 2002). Such a view is consistent with our account that reorienting is a complex process. However, our explanation does not require the addition of a network to explain reorienting-related activity at the rTPJ. If reorienting does rely on a third attentional network including a more dorsal rTPJ region, then our challenge to VAN theory would be restricted to the identification of a distinct region at the rTPJ involved in attention but dissociable from target detection (Corbetta and Shulman, 2011).

## **EMPIRICAL LIMITATIONS**

We acknowledge limitations to our empirical findings. First, our meta-analytic findings rely on the anatomical alignment of studies conducted using different scanners whose images have been co-registered to different atlases. Given that our sample was of a reasonable size, these differences should have led to an increase in randomly distributed noise and thus greater difficulty resolving distinct localizations. Nonetheless, the possibility of systematic error remains. Second, we have postulated that two factors contribute to reorienting responses. However, we have not directly manipulated these factors in order to establish this claim. Ideally, future work will employ high resolution imaging and paradigms that parametrically modulate these factors in order to distinguish their effects on different cortical areas. Third, we acknowledge that careful anatomical work suggests a number of distinct functional regions near rTPJ (Caspers et al., 2006) and that our group-based methods may have failed to capture important aspects of this fine-grained structure. Although our work is at a similar anatomical resolution to work that has guided VAN theory, we acknowledge that higher resolution work on individual subjects may confirm the existence of a region specific to reorienting between the AG and SMG. Hence, our account of rTPJ involvement in reorienting in terms of the combination of signals from contiguous regions associated with two wide-scale functional networks may turn out to be wrong. In that case, our challenge to VAN theory would be restricted to noting the need to differentiate between regions involved in reorienting, target detection (Corbetta and Shulman, 2002) and ToM (Corbetta et al., 2008).

## **NOVEL METHODOLOGICAL CLAIMS**

Our theoretical account of reorienting relies on two relatively novel claims. The first is that event-related BOLD effects with positive going waveforms can be attributed to the transient disengagement of suppression in a paradigm. The second is that positive connectivity maps derived from standard rs-fcMRI methods may, in some cases, fail to identify coherent functional networks. We acknowledge that further work is wanted to establish these claims. At the same time, we point to considerations which support the plausibility of these claims.

First, there is now a substantial body of work which establishes that activity levels of the default network can, in some cases, be best accounted for by the suppressive effect of task demands which are positively associated with functions instantiated in entirely distinct cortical networks (McKiernan et al., 2003; Mason et al., 2007; Buckner et al., 2008; Andrews-Hanna, 2011). If this view is accepted, it represents a relatively minor step to presume that the transient event-related release of these suppressive effects could give rise to a positive going BOLD waveform.

Second, we note that the methods of rs-fcMRI are relatively novel, and to date have only been partially validated. It has already been shown, both mathematically and in practice, that they can produce artifactual results, particularly in relation to negative correlation maps (Murphy et al., 2009)<sup>6</sup>. Although we don't know of validated examples of spurious positive correlations, they are no less mathematically plausible. The unusually high degree of inter-subject variability in anatomy and functional organization at the TPJ (Van Essen, 2005; Caspers et al., 2006) further increases the potential for signals from neighboring but functionally distinct areas to be confounded when deriving rs-fcMRI maps of this area.

## **IMPLICATIONS FOR THEORY**

A natural assumption which has guided some prior accounts has been the view that attentional reorienting is an evolutionarily basic process which has been coopted to play a role in social cognition (Decety and Lamm, 2007; Corbetta et al., 2008). However, it is important to remember that the parsing of the cognitive operations involved in tasks is a complex and partially speculative process. Reorienting may not be a basic cognitive process, but may instead be a complex process which involves contributions from different regions with computationally distinct roles. Recent accounts of the evolution of the human cortex suggest that social processing demands have played an important role in the massive evolutionary expansion of cortex, which is evident from comparisons between humans and our nearest evolutionary neighbors. Our view is guided by this work, and suggests that some observations which propose a putative role for the rTPJ in attention may be best explained by an alternative hypothesis. Namely, the view that social processing is accomplished by basic cognitive processes which evolved specifically for that purpose, which are not only distinct from but also in tension with basic attentional processes.

<sup>6</sup>This represents an important methodological concern, however the reader should note that the negative correlations we report are validated by other methods. First, a number of laboratories have observed anti-correlations using conservative methods that don't employ mean signal regression (Chang and Glover, 2009; Fox et al., 2009; Chai et al., 2012; Jack et al., 2012). Second, Jack et al. (2012) validate anti-correlations derived from resting connectivity by demonstrating that they correspond with task related activations and deactivations seen in both the DMN and TPN. Finally, it is important to note that conservative methods which do not use a global regressor likely underestimate the degree of true anti-correlations, and that findings using a global regressor appear more accurate when compared to independent evidence: The methods of Fox et al. (2005) using global normalization, which we also use here, demonstrate good correspondence with regions that are consistently deactivated during cognitively demanding non-social tasks (Raichle and Snyder, 2007).

While a synthesis of the attention literature lies beyond the scope of this paper, we suggest that some current ambiguities may be resolved by distinguishing between the functions of the anterior TPJ (e.g., SMG) and the posterior TPJ (e.g., AG). For example, a recent review on neglect proposes that the attentional deficits are a result of damage to VAN regions, disrupting communication between the left and right DANs (Corbetta and Shulman, 2011); however, the authors admit that the neural mechanisms explaining interactions between the VAN and DAN are poorly understood. Research has demonstrated deficits in sustained attention in patients with posterior parietal cortex lesions (Malhotra et al., 2009) and in target detection from TMS over the AG, not the SMG (Chambers et al., 2004). The AG region of the DMN has demonstrated abnormal functioning in patients with a variety of neurological disorders (Zhou et al., 2007; Broyd et al., 2009) as well as traumatic brain injuries (Bonnelle et al., 2011) characterized by low sustained attention. In light of our results, we suggest that the attentional deficits characteristic of neglect patients with damage to the rTPJ region may not be explainable unless the focus of neglect research is widened to include the effects of brain networks whose primary function is not attention.

In terms of social cognition, the alternative accounts we focus on here have emphasized the notion that mechanisms for external attention have been evolutionarily coopted to play a role in social cognition (Decety and Lamm, 2007; Corbetta et al., 2008). In contrast, we hypothesize that mentalizing (i.e., our capacity to represent the internal mental states of conspecifics) was built upon a system for internal attention, e.g., whose original functions were those of interoception and self-regulation. According to our account, this system evolved to be in tension with a system for representing the physical and mechanical properties of inanimate objects, which is built upon systems for external attention, e.g., perception and the manipulation of objects. Our account of mentalizing as coopting mechanisms for internal attention fits best with the anatomy of medial parts of the DMN associated with mentalizing (dMPFC and MP/PC). The evidence from rs-fcMRI and activation studies strongly suggests the AG is part of the same network as these medial regions; however, its anatomical location is less congruent with a connection to internal attention. Instead, the right AG lies near to a right lateralized system of occipital and temporal regions involved in the sensory processing of socially relevant information (Kanwisher et al., 1997; Peelen, 2004; Pelphrey, 2005). In other words, the posterior TPJ may represent a critical junction box where different types of social information are integrated, namely information that derives from internal attention (medial DMN regions) and external attention (right lateralized regions for social perception). This fits well with the posterior TPJ's more specific functional role in representing the intentions of perceived agents (Saxe and Powell, 2006; Saxe et al., 2006).

This raises an interesting question: might there be an evolutionary reason for the tension between posterior and anterior TPJ regions? While such an account would be speculative, it does seem that there are good reasons for a region with the function of posterior TPJ to have an inhibitory connection with regions involved in visual search, and for its activity to increase when an unexpected stimulus is detected. Outside the laboratory, suddenly appearing unexpected stimuli are often animals or conspecifics, which might pose a survival threat. Attempting to find one more apple is not so important as attending to the danger posed by a predator. In this scenario, there is not only an advantage to breaking the current attentional set, there is also an advantage to expediting the processing of social cues and rapidly generating a model of the agent's intentions. Hence, while there is no obvious feature of laboratory reorienting tasks which calls for the engagement of social processing, this may nonetheless occur because the engagement of social processing upon detection of a salient unexpected stimulus is adaptive as a general rule. Consistent with this speculative account, there is evidence that animate motion captures attention more rapidly than inanimate motion (Pratt et al., 2010). If this account is borne out, then it may be that information is indeed passed from social processing areas in the posterior TPJ to the DAN in order to reorient attention. Our hypothesis is that this information would derive from active anticipation of the likely actions of a perceived agent using ToM. Hence, surprisingly, many of the functions attributed to the rTPJ by the VAN account are consistent with the account offered here. The major difference is that we hypothesize these reorienting functions evolved because of evolutionary pressure for more sophisticated social processing, and our account predicts these functions will be most profitably investigated using realistic social paradigms.

Distinguishing between these accounts is clearly theoretically significant for our understanding of cortical function. In addition, it has implications for therapeutic approaches. If it is correct that attentional reorienting represents a basic process which is coopted for social cognition, then this would suggest that early intervention by training attention might be an effective treatment for individuals with social deficits, such as individuals with Autism Spectrum Disorders. On the other hand, if our account is correct, then non-social attention training programs are not likely to be effective for improving social function, and may even be detrimental.

## **CONCLUSIONS**

For more than a decade, the theory of the ventral attention system has played a leading role in the interpretation of findings which implicate the rTPJ in attention and social processing. In this paper we propose an alternative account which appeals to the interplay between two distinct regions at the rTPJ which are associated with antagonistic functional networks involved in social and nonsocial processing. We present empirical evidence which is more consistent with this alternative account than prior accounts, identifying distinct loci and functional connectivity maps associated with target detection, reorienting and ToM. We acknowledge this evidence is limited in scope, relying entirely on meta-analysis and rs-fcMRI. It does not make use of experimental manipulation of the processes under investigation, high-resolution imaging, or analysis of individual participants, all of which we expect to be critical to establishing a definitive account. However, these findings do motivate further consideration of our account, which has significant implications. First, it has the potential to make sense of a large and confusing literature on the role of the rTPJ in attention and social processing. Second, it suggests an alternative view of the evolution of brain function, in particular functions associated with social cognition. Third, our account emphasizes attempts to understand neural activity not just by reference to the immediate demands of the experimental task, but also by reference to constraints which our neural structure places on cognition. Task analysis of attention reorienting paradigms does not suggest any role for social processing. Nonetheless, we suggest that activation patterns associated with these paradigms cannot be fully understood without reference to an inbuilt neural tension between focused attention and social processing.

## **ACKNOWLEDGMENTS**

The authors would like to thank Brian Hysell and Scott Tillem for help with the meta-analysis; Kevin Barry and Avi Snyder for help with the connectivity analysis; and Karl Friston, Joy Geng, Emiliano Macaluso, Avi Snyder, and two anonymous reviewers for comments on earlier versions of the manuscript. In particular we would like to thank the anonymous reviewer who took the trouble to identify and list additional studies suitable for inclusion in the secondary meta-analysis. This work was supported in part by grants to AIJ from the Leonard Krieger Fund and the University Hospitals Case Medical Center Spitz Brain Health fund.

## **REFERENCES**

versus distinct processes. *Soc. Cogn. Affect. Neurosci.* 5, 48–58. doi: 10.1093/scan/nsp045


M., Drury, H. A., et al. (1998). A common network of functional areas for attention and eye movements. *Neuron* 21, 761–773. doi: 10.1016/S0896-6273(00)80593-0


dorsal and ventral attention systems. *Proc. Natl. Acad. Sci. U.S.A.* 103, 10046–10051. doi: 10.1073/pnas.0604187103


neural networks are recruited for self perspective inhibition and complexity of reasoning. *Neuroimage* 61, 921–930. doi: 10.1016/j.neuroimage.2012.03.012


linguistic influence on neural bases of 'Theory of Mind': an fMRI study with Japanese bilinguals. *Brain Lang.* 98, 210–220. doi: 10.1016/j.bandl.2006.04.013


brain networks account for sustained and transient activity during target detection. *Neuroimage* 44, 265–274. doi: 10.1016/j.neuroimage.2008.08.019


attention. *Brain Res.* 1121, 136–149. doi: 10.1016/j.brainres.2006.08.120


H., et al. (2009). Impact of working memory load on FMRI resting state pattern in subsequent resting phases. *PLoS ONE* 4:e7198. doi: 10.1371/journal.pone.0007198


at rest. Social cognition as the default mode of cognizing and its putative relationship to the "default system" of the brain. *Conscious. Cogn.* 17, 457–467. doi: 10.1016/j.concog.2008.03.013


and surface-based (PALS) atlas of human cerebral cortex. *Neuroimage* 28, 635–662. doi: 10.1016/j.neuroimage.2005.06.058


endogenous orienting of attention in parietal and frontal cortex. *Neuroimage* 32, 1257–1264. doi: 10.1016/j.neuroimage.2006.05.019


*Neuroimage* 49, 894–904. doi: 10.1016/j.neuroimage.2009.08.060


(2007). Functional disintegration in paranoid schizophrenia using resting-state fMRI. *Schizophr. Res.* 97, 194–205. doi: 10.1016/j.schres.2007.05.029

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 12 June 2013; published online: 10 July 2013.*

*Citation: Kubit B and Jack A I (2013) Rethinking the role of the rTPJ in attention and social cognition in light of the opposing domains hypothesis: findings from an ALE-based meta-analysis and resting-state functional connectivity. Front. Hum. Neurosci. 7:323. doi: 10.3389/fnhum.2013.00323*

*Copyright © 2013 Kubit and Jack. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Perspective: causes and functional significance of temporal variations in attention control

#### *Agatha Lenartowicz<sup>1</sup>\*, Gregory V. Simpson<sup>2</sup> and Mark S. Cohen<sup>1</sup>*

*<sup>1</sup> Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, CA, USA*

*<sup>2</sup> Attention Research Institute, San Francisco, CA, USA*

#### *Edited by:*

*Joy Geng, University of California Davis, USA*

#### *Reviewed by:*

*R. Nathan Spreng, Cornell University, USA; Francisco X. Castellanos, New York University School of Medicine, USA; Jane Couperus, Hampshire College, USA*

#### *\*Correspondence:*

*Agatha Lenartowicz, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, 760 Westwood Plaza, Suite 17-369, Los Angeles, CA 90095, USA e-mail: alenarto@ucla.edu*

Attention control describes the human ability to selectively modulate the plethora of sensory signals and internal thoughts. The neural systems of attention control have been studied extensively, warranted by the importance of this ability to daily functioning. Here, we consider an emerging theme in the study of attention control—slow temporal fluctuations. We posit that these fluctuations are functionally significant, and may reflect underlying interactions between the neural systems related to attention control. We explore thought experiments to generate different perspectives on landscapes created by the interactions between attention control networks and the sources of input to these control systems. We examine interactions of the fronto-parietal and the default mode networks in the context of internal cognition, and the noradrenergic modulatory projections in the context of arousal, and we consider the implications of these inter-network dynamics on attention states and attention disorders. Through these thought experiments we highlight the breadth of potential knowledge to be gained from the study of slow fluctuations in attention control.

**Keywords: attention control, fluctuations, network interactions, attention deficits, internal cognition**

## **ATTENTION CONTROL AND FLUCTUATIONS**

Attention control allows us to ignore distracting information so that we may focus selectively on information relevant to goal-directed behavior. Copious research has demonstrated that attention control involves "top-down" signals from association cortices, biasing activity in sensory regions to enhance the magnitude of attended signals; and evidence is building to show that top-down processes also suppress the magnitude of ignored signals (Posner and Dehaene, 1994; Desimone and Duncan, 1995; Miller and Cohen, 2001). Integral to top-down biasing is the frontoparietal network (FPN, also referred to as the "executive control network") (Corbetta and Shulman, 2002; Dosenbach et al., 2006; Fox et al., 2006; Raichle, 2011), a network encompassing dorsal and medial prefrontal cortices and superior parietal cortices, that acts to distinguish attended from ignored signals (Ruff and Driver, 2006; Gazzaley et al., 2007; Capotosto et al., 2009) (**Figure 1A**, *left panel*). Recruitment of this system is thought to occur when multiple sensory signals compete for processing resources (Braver and Cohen, 2000; Botvinick et al., 2001; Miller and Cohen, 2001) and/or at the trigger of a salient orienting signal (e.g., novel or loud sound, such as a fire alarm; Posner and Petersen, 1990; Posner and Dehaene, 1994).

Attention control also appears to be a fluctuating system. Castellanos et al. (2005) showed that the speed of response in an attention task increases and decreases over a period of 15 s (0.068 Hz) or so, and that these fluctuations are particularly pronounced in children with attention deficit hyperactivity disorder (ADHD). Monto et al. (2008) showed that the accuracy in a simple detection task fluctuates in runs of 10–100 s that track electrophysiological fluctuations of the same period. The activity of the FPN also shows fluctuations in this frequency range (Vincent et al., 2008). These findings indicate that attention control integrity varies across time, and that this variability has implications for behavior and disease. In contrast, predominant models of attention are concerned with what we term "moment-to-moment attention" (**Figure 1A**, *right panel*); they explain how attention control influences the processing of discrete, primarily external, sensory events by averaging attention signals across moments (and as such across fluctuations) in time. In the current perspective we therefore explore the emerging question, *what are the causes and functional significance of temporal variations in attention control*?
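To make the link between a fluctuation period and its frequency concrete, the following minimal sketch recovers the dominant slow frequency of a reaction-time series with a plain discrete Fourier transform. The series is simulated: the trial spacing, baseline, and amplitude are hypothetical values chosen for illustration, not data from any cited study, and the function name is likewise an assumption. A 15 s period corresponds to roughly 0.067 Hz, close to the value quoted above.

```python
import cmath
import math


def dominant_frequency(samples, sample_interval_s):
    """Return the frequency (Hz) of the largest non-DC component of a plain DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the constant (DC) offset
    best_k, best_power = 1, -1.0
    # Only bins up to the Nyquist limit carry independent information.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k / (n * sample_interval_s)


# Hypothetical series: one trial every 3 s for 300 s, with a slow 15 s
# oscillation around a 500 ms baseline (illustrative values only).
trial_interval_s = 3.0
period_s = 15.0
rts_ms = [500 + 40 * math.sin(2 * math.pi * (i * trial_interval_s) / period_s)
          for i in range(100)]

peak_hz = dominant_frequency(rts_ms, trial_interval_s)
peak_period_s = 1.0 / peak_hz
```

Averaging these reaction times into a single mean would discard exactly the slow component that the spectrum exposes, which is the contrast with "moment-to-moment" measures drawn above.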

## **SYSTEMIC AND INPUT SOURCES OF FLUCTUATIONS**

If we consider the FPN as a core system that underlies attention control, then fluctuations of attention control (proportional to the strength of modulation of the relative strengths of target and distractor signals, **Figure 1A**) likely indicate fluctuations in the efficacy of this system. Sources of these fluctuations may be classified further as either *systemic* or *input*. A systemically based fluctuation in efficacy would be defined as a limitation in FPN functionality occurring when the entire system is temporarily less active, either because of operational characteristics (e.g., the entire system is activated insufficiently) or because of negative interactions with other neural systems (e.g., its activity is suppressed by another system), with no change to the inputs. An input-based fluctuation would be defined as a misdirection of FPN activity relative to a desired goal, such as when attention control is rerouted by distracting signals, be they external signals or, as we consider here, internal thoughts, without a change in FPN activation. Therefore, a key to understanding the fluctuations in efficacy of attention control is knowledge of the conditions for, and products of, the interactions of FPN with other neural systems and inputs. We consider here two such candidates, arousal and internal cognition.

**FIGURE 1 | In a representative scenario of attention control and its fluctuations (A), (left panel) activity corresponding to an attended signal (e.g., visual cortex) is enhanced (red thermometer level is high) and activity corresponding to an ignored signal (e.g., auditory) is suppressed, due to control signals from the FPN.** The magnitude of enhancement is quantified by comparing activity for the target stimulus when it is attended relative to when it is ignored (right panel). This can be conceived as an average index of "moment-to-moment" attention, an approach that ignores any underlying fluctuations in the attended signal amplitude (blue and red dots) that may be related to functionally significant variations in attention control. Potential sources of such fluctuations are shown below **(B)**, and can be system-based (i) and input-based (ii). In the case of systemic sources a decrease in FPN activity could arise by (i), the antagonistic influence of another network (left; DMN–default mode network) or decreased modulatory input (right; LC–locus coeruleus noradrenergic inputs) that decrease activity in FPN, leading to an attenuation of control over sensory processing regions, and therefore lower indices of control as measured in the target processing region (red thermometer level is low). In the case of input-based fluctuations, FPN activity level does not change but is redirected to a different processing input (ii). In this example attention is oriented toward an internal input (e.g., a memory, indicated by ∗), resulting in a decrease in responses to other inputs, including other internal inputs (e.g., planning dinner) and the external target (visual) inputs. The identity of cortical regions that process internal inputs is unknown (?), as are the interactions of such regions with higher-order networks (e.g., DAN/DMN). We posit here one possibility, that the input cortices and higher-order networks responsible for their processing will be positively correlated in their relationship with FPN. In the example here, the involuntary capture of attention by an internal input leads to positive correlations between DMN and FPN.

## **THE CASE OF AROUSAL**

The idea of a systemic fluctuation of FPN is demonstrated readily in the effect that chemical neuromodulators have on its efficacy. All four of the primary neuromodulators, the catecholamines (dopamine and noradrenaline), acetylcholine and serotonin, have been shown to affect attention (Foote and Morrison, 1987; Coull, 1998; Briand et al., 2007; Rokem et al., 2010). For brevity we focus on the example of the noradrenergic (NE) system (for a comprehensive review see Moore and Bloom, 1979; Foote et al., 1983; Berridge and Waterhouse, 2003), often referred to as the LC-NE system because all of its cortical projections arise from a single nucleus in the brainstem, the locus coeruleus (LC). The LC-NE system is of interest because, being part of the reticular activating system (Moruzzi and Magoun, 1949), historically it has been associated with arousal (Berridge and Waterhouse, 2003). In turn, arousal is a prerequisite for attention. Indeed decreased firing of LC neurons is correlated both with drowsiness (Roussel et al., 1967; Aston-Jones and Bloom, 1981) and with poor attentional performance (Mason and Iversen, 1978; Aston-Jones et al., 1999). Excess LC firing, like excess arousal, is also detrimental to performance (Aston-Jones et al., 1999).

How do these observations contribute to systemic fluctuations of FPN? The LC-NE system has diffuse projections across cortex, with terminals that include the FPN (Moore and Bloom, 1979). The effect of NE, specifically, is to decrease spontaneous firing and increase the evoked response (Foote et al., 1975), interpreted as an increase in fidelity and gain of the neuronal response (Aston-Jones and Cohen, 2005). The LC-NE system therefore influences the responsivity of FPN to inputs (as well as of other systems) when attention control is required. This suggests that the LC-NE system could contribute to fluctuations of attention control when the LC-NE system is either under- or overactive, translating into a weaker or stronger response of the FPN as an entire system, given no change in the inputs (Aston-Jones and Cohen, 2005; for a complementary interpretation see Corbetta and Shulman, 2002; Corbetta et al., 2008). This is therefore a *systemic* not an *input* fluctuation of attention control. A weaker response of the FPN would translate into weaker modulatory control over target regions, meaning less target enhancement and less distractor suppression (**Figure 1B-i**, *right panel*).
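The description of NE as lowering spontaneous firing while raising the evoked response can be illustrated with a toy gain model. The sketch below uses a logistic unit whose slope stands in for NE level. This is a common modelling device adopted here as an assumption, not a model proposed in this article, and the function name, threshold, input levels, and gain values are all hypothetical.

```python
import math


def unit_response(net_input, gain, threshold=0.5):
    """Logistic response whose slope is scaled by a gain parameter.

    The gain parameter is an illustrative stand-in for NE level.
    """
    return 1.0 / (1.0 + math.exp(-gain * (net_input - threshold)))


# Hypothetical input levels: no stimulus versus a driving stimulus.
SPONTANEOUS_INPUT = 0.0
EVOKED_INPUT = 1.0


def evoked_contrast(gain):
    """Difference between evoked and spontaneous responses at a given gain."""
    return (unit_response(EVOKED_INPUT, gain)
            - unit_response(SPONTANEOUS_INPUT, gain))


low_ne_contrast = evoked_contrast(gain=1.0)   # weaker separation of signal from baseline
high_ne_contrast = evoked_contrast(gain=4.0)  # lower baseline, larger evoked response
```

In this sketch, raising the gain pushes the spontaneous response down and the evoked response up, so the contrast between them grows. The toy model does not capture the detrimental effect of excess LC firing noted above, which would require an additional assumption.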

## **THE CASE OF INTERNAL COGNITION**

A very different example of a seemingly systemic fluctuation is internal cognition, which refers to thinking; it encompasses mind wandering, self-evaluation, problem solving and active remembering (Giambra, 1995; Smallwood and Schooler, 2006; McVay and Kane, 2009, 2010; Schooler et al., 2011; Christoff, 2012; Smallwood, 2012), processes that have been associated with the activation of a group of functionally connected regions that include medial prefrontal cortex, posterior cingulate cortex, retrosplenial cortex, as well as medial temporal and lateral inferior parietal cortices (Binder et al., 1999; Gusnard et al., 2001; Johnson et al., 2002; Gordon et al., 2007; Mason et al., 2007; Buckner et al., 2008; Christoff et al., 2009; Andrews-Hanna et al., 2010; Stawarczyk et al., 2011). Together these regions comprise the so-called default mode network (DMN) (Shulman et al., 1997; Mazoyer et al., 2001; Raichle et al., 2001; Buckner et al., 2008).

Internal cognition is of particular interest because thinking can, in principle, interfere with attention control over external inputs both through systemic and input pathways. As a systemic influence thinking can be considered a competitor to FPN activity (**Figure 1B-i**, *left panel*). This hypothesis is supported by early findings showing that FPN activity is correlated negatively with that of the DMN (Fox et al., 2005; Fox and Raichle, 2007). In turn, DMN activity is correlated positively with lapses of external attention (Weissman et al., 2006), reflecting moments when participants are off-task (Buckner et al., 2008; Andrews-Hanna, 2012) and have decreased control over external signals (Weissman et al., 2009; Schooler et al., 2011; Smallwood et al., 2012). Accordingly, fluctuations of attention control could be interpreted as instances during which DMN activation suppresses activity in the FPN and attention control is disrupted by internal cognition, a systemic fluctuation of attention in reference to external signals.

A model based on antagonistic interaction between FPN and DMN, while a useful starting point, is likely an oversimplification of the underlying dynamics. It assumes that internal control and external control are independent, antagonistic systems, which leads to the difficult question: "*Who*" determines which type of control is "on" at any given time? Plausibly, the FPN and DMN interact within a negative feedback circuit (where each suppresses the other), and their individual engagement is determined by the strength of their relative inputs. This does not seem consistent with our ability to quickly switch between internal and external cognition—with no change in inputs. Alternatively, some other system determines whether external attention control or internal cognition is engaged (Sridharan et al., 2008; Leech et al., 2012). A more parsimonious interpretation is that FPN *is* that other system. Namely, DMN activity can be thought of as another input into FPN that is suppressed when attention is oriented externally—resulting in an apparent negative correlation between the two systems. The activation of DMN while attempting to attend to external signals would then be thought of as an input-based source of fluctuation (**Figure 1B-ii**).
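The mutual-suppression circuit considered above can be sketched as a toy rate model to show why, in such a circuit, the engaged network is fixed by the relative inputs. Everything in the block is an illustrative assumption: the two-unit form, the rectified-linear drive, the inhibition strength, and the input values are hypothetical and are not fitted to any data.

```python
def simulate_mutual_inhibition(input_fpn, input_dmn, inhibition=2.0,
                               dt=0.01, steps=5000):
    """Euler-integrate two rate units that suppress each other.

    Each unit relaxes toward its rectified net drive: its own input minus
    the other unit's activity scaled by the inhibition strength.
    """
    fpn = dmn = 0.1  # start both networks at the same low activity
    for _ in range(steps):
        d_fpn = -fpn + max(0.0, input_fpn - inhibition * dmn)
        d_dmn = -dmn + max(0.0, input_dmn - inhibition * fpn)
        fpn += dt * d_fpn
        dmn += dt * d_dmn
    return fpn, dmn


# Hypothetical inputs: the network with the stronger input ends up engaged.
fpn_a, dmn_a = simulate_mutual_inhibition(input_fpn=1.0, input_dmn=0.8)
fpn_b, dmn_b = simulate_mutual_inhibition(input_fpn=0.8, input_dmn=1.0)
```

With fixed inputs the toy circuit settles into one state and stays there, which illustrates the difficulty raised above: it offers no route to switching between internal and external cognition without a change in inputs.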

Perhaps most telling with regard to this notion is the observation that in some circumstances the DMN and FPN are correlated positively (Christoff et al., 2009; Spreng et al., 2010; Smallwood et al., 2012; Spreng et al., 2013), arguing against a strictly antagonistic relationship. In these studies, the authors proposed that the positive coupling between FPN and DMN is more appropriately interpreted as attention control working in the service of internal cognition. For instance, Spreng et al. (2010, 2013) reported a positive correlation between FPN and DMN during retrieval of autobiographical memories, but not when participants engaged in a visuospatial task. This may be interpreted as attention control being oriented toward memory retrieval, biasing which internal signals (corresponding to memories) were to be retrieved and which were to be ignored.

These observations are consistent with the existence of a single attention control system, the FPN, which can be oriented to process internal or external sources of input (**Figure 1B-ii**). Moreover, these sources of input may correspond to other potential control systems – such as the DMN. Coincidently, during a visuospatial task, Spreng et al. (2010, 2013) found that the FPN was no longer coupled with the DMN, but was instead correlated positively with activities in frontal eye fields and inferior parietal sulci, which comprise the dorsal attention network (DAN), a specialized control system involved in visuospatial attention. From this perspective, internal cognition can lead to fluctuations of attention control by reorienting FPN away from systems controlling external inputs (e.g., interaction with DAN to process visuospatial information) and toward systems controlling internal inputs (e.g., interaction with DMN in the service of memory retrieval). One hypothesis that arises here is that such orienting of the FPN would be expected to suppress external signals in general (along with inappropriate internal signals such as false memories in the autobiographical retrieval example). Direct support for this idea has been reported: Mind wandering—a well-known example of an attention lapse—is associated with decreased processing of *both* attended and ignored external signals (Weissman et al., 2006; Smallwood et al., 2008; Barron et al., 2011; Kam et al., 2011), a phenomenon referred to recently as "perceptual decoupling" (Smallwood et al., 2007; Schooler et al., 2011).

## **A LANDSCAPE OF ATTENTION CONTROL AND OUTSTANDING QUESTIONS**

The notion that multiple influences can modify the behavior of the FPN implies that attention control can take on multiple states that are determined by the context of its inputs and systemic influences, or more simply, by its system interactions. For instance, if we take the above examples of attention control inputs (internal/external) and activation (low/high arousal) and explore the product of their interactions along two axes, a landscape of attention control *states* emerges (**Figure 2**). Following the orientation axis, we see that attention control can be oriented internally or externally. We show no variability in orientation, acknowledging that it is categorical. Following the horizontal arousal axis, we see that attention control efficacy varies with arousal level with an optimum in the middle—reflecting the Yerkes-Dodson relationship exemplified by this system (Aston-Jones et al., 1999). Hence most efficacious attention states occur at the peak of this function, though the orientation can vary (e.g., focus can be directed internally such as when problem solving, or externally such as when reading a book), whereas in the extremes of the arousal axis, attention lapses occur. Scanning the resulting landscape, two important observations arise.

The first is that attention lapses can take on multiple flavors. In this analysis we observe four domains, produced by crossing arousal states with orientation of attention. If arousal is high, we predict that FPN will be excessively responsive to all stimuli and will therefore fail in discriminating between relevant and irrelevant inputs. If attention were oriented externally, this may be manifest as oversensitivity to external stimuli, whereas if attention were oriented internally, it might translate into racing thoughts (perhaps rumination). If however arousal is low, we predict that the FPN response will be sluggish, resulting in reduced responsivity to stimuli. Again discrimination between relevant and irrelevant stimuli would be compromised. In this case, if attention were oriented externally, we would expect behaviors to be driven by the most salient or most automatic responses ("bottom-up") since minimal control is applied to inputs. Similarly if attention were oriented internally, we would expect the presence of mind wandering, in which internal cognition drifts from one topic to another. This is also the state in which externally oriented attention would be vulnerable to drifting to internal content and, similarly, internal orientation could drift to external content. Note that in all four states the outward symptom would be poor attention to the task at hand, but for very different reasons.
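The landscape described above can be summarized as a small lookup over arousal and orientation. The sketch below is only a schematic restatement of the text: the Gaussian-shaped curve is a stand-in for the Yerkes-Dodson relationship, and the function names, the 0 to 1 arousal scale, and the cut-offs for "low" and "high" arousal are hypothetical choices made for illustration.

```python
import math


def control_efficacy(arousal, optimum=0.5, width=0.2):
    """Inverted-U stand-in for the arousal-efficacy relationship (peak at optimum)."""
    return math.exp(-((arousal - optimum) ** 2) / (2 * width ** 2))


def attention_state(arousal, orientation, low=0.3, high=0.7):
    """Map an arousal level (0 to 1) and an orientation to one of six states."""
    if orientation not in ("internal", "external"):
        raise ValueError("orientation must be 'internal' or 'external'")
    internal = orientation == "internal"
    if arousal < low:   # sluggish FPN response
        return "mind wandering" if internal else "stimulus-driven responding"
    if arousal > high:  # excessively responsive FPN
        return ("racing thoughts or rumination" if internal
                else "oversensitivity to external stimuli")
    # Mid-range arousal: efficacious control in either orientation.
    return "internal focus" if internal else "external focus"


# Example: the same low arousal level yields different lapses by orientation.
lapse_internal = attention_state(0.1, "internal")
lapse_external = attention_state(0.1, "external")
```

Treating orientation as a categorical argument and arousal as a continuous one mirrors the two axes of the landscape. All four lapse states would look alike as poor task performance, but the lookup keeps their proposed causes separate.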

**FIGURE 2 | By intersecting arousal and input orientation, we observe six states of attention control: focus directed internally (e.g., problem solving) or externally (e.g., reading a book), and four classes of deficits of attention control.** The region of maximal attention performance lies along the length of the orientation axis, where it intersects the midpoint of the arousal axis. At this point is the peak of the proposed arousal-attention function, described by the Yerkes-Dodson curve, when attention control efficacy is maximal. The proposed effect of LC-NE on FPN neural response is thought to result in a sensitized response when arousal is high and a sluggish response when arousal is low. The interpretation of these extremes in terms of attention deficits varies with the orientation of attention control. High arousal is interpreted as over-activity of the FPN, which could produce racing thoughts or rumination when attention is oriented internally (bottom-right), and stimulus sensitivity when attention is oriented externally (top-right). Low arousal is interpreted as under-activity of the FPN, which would result in automated, "bottom-up," responses. For internally oriented attention this may be analogous to mind wandering (bottom-left). For externally oriented attention (top-left) this may be analogous to attention responses that are based on prepotency of stimuli (e.g., tendency to read words rather than name ink color in the Stroop task) or salience (e.g., attention capture).

This perspective raises some questions regarding attention control mechanisms, especially with regard to internal cognition. The present synthesis implies that internal cognition is subject to the same rules of attention control that apply to external inputs: enhancement of relevant information and suppression of irrelevant information. Accordingly, distractions *within* internal cognition ought to be manifest much like those for external signals. As an example, consider the case where you attempt to meditate, but instead drift into thinking about work. Or imagine that you are trying to retrieve the name of a high-school friend (e.g., "Jenny"), but your memory keeps drifting to your colleague who has a similar-sounding last name (e.g., "Jensen"). In both cases, a potent unrelated thought captures your attention in the internal modality—much like when a loud sound captures your attention in the external modality. Therefore *relevant* and *irrelevant* signals may be defined for internal cognition much like external signals such as sights and sounds, and a correct response of the system would be for the FPN to suppress those irrelevant work thoughts, or the competing memory.

What does it mean for FPN to suppress an internal thought? Can thoughts be conceived as isomorphic with levels of cortical activity, much like sounds and sights are isomorphic with levels of activity in auditory and visual cortices? If so, what are these cortical regions or networks that would be modulated? Would the structure of internal thought representations require an additional control system that interacts with the FPN? More generally, how would we measure lapses that occur *within* the internal modality? While certain aspects of attention control may be preserved between internal and external modalities, it is possible that asymmetries exist. Furthermore, while we describe a landscape of attentional states, we have not addressed how transitions occur between these states. How do the observed fluctuations in network activity and in behavioral performance relate to transitions in attentional states? For example, when FPN activity is low does this create a lower barrier for attention to wander or be captured? These are important questions that beg further investigation.

Our second, and related, observation is that current models of attention control are based largely on the study of externally oriented attention. Accordingly, investigations of attention control impairments are restricted largely to the upper half of **Figure 2**, more precisely to the upper left. Interestingly, while distractibility has been ascribed to a failure of the attention control network, the cause is not transparent. While it is certainly likely that in some disorders the FPN is impaired, in other instances an apparent dysfunction of FPN may be accounted for by low arousal decreasing activation of a normally functioning FPN. The implication of this proposition is that apparent dysfunctions of FPN may arise through multiple mechanisms. An interesting test case in this regard is ADHD. Key symptoms of this disorder have been impairments of working memory and response inhibition (Barkley, 1997; Tannock, 1998), leading to the inference that prefrontal cortex function, which subsumes core nodes of the FPN, is dysfunctional (Castellanos and Tannock, 2002; Arnsten, 2006; Casey and Riddle, 2012). Yet, the disorder also has been associated with an impairment of arousal, possibly due to an underlying noradrenergic disorder (van der Meere and Sergeant, 1988; McCracken, 1991; Biederman and Spencer, 1999).


Several questions arise: are the apparent attention control symptoms mediated, at least in part, by an underlying deficit in arousal and, therefore, the sustaining or engaging of attention control (Huang-Pollock and Nigg, 2003; Huang-Pollock et al., 2005; Castellanos et al., 2006; Friedman-Hill et al., 2010)? Are the fluctuations in attention control within an individual related to interactions of FPN with the LC-NE system? For instance, the fluctuations of attention control observed by Castellanos et al. (2005), more pronounced in children with ADHD, had a period of approximately 15 s. Is this frequency correlated with the fluctuations of the LC-NE system in this group? Is the amplification of these fluctuations related to an aberration in the cellular properties of the neuromodulatory projections?

### **CONCLUSION**

Our objective in this perspective is to highlight the significance of known fluctuations of attention control. We suggest that sources of these fluctuations consist of two categories, systemic and input, and that they may be thought of as interactions between FPN and other neural systems. We have presented a possible landscape of attention control that may result from the interactions of these systems. Recognizing that this demonstration is incomplete—inevitably other neuromodulators, other networks and associated system interactions are involved—we present this short perspective to highlight the increasing emphasis on and exciting research that has emerged in describing the brain in terms of network interactions. We believe that understanding of these interactions in the context of attention control fluctuations is imminent and will lead to an improved characterization of the dynamics of attention control and of its impairments.

#### **ACKNOWLEDGMENTS**

The authors would like to acknowledge the NIH for financial support: grant# R33DA026109 and R21MH096329 (MSC). Agatha Lenartowicz is a fellow of the Third Generation Klingenstein Foundation. We would like to thank our reviewers for invaluable feedback on the manuscript and the ideas presented within.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 01 May 2013; accepted: 02 July 2013; published online: 23 July 2013.*

*Citation: Lenartowicz A, Simpson GV and Cohen MS (2013) Perspective: causes and functional significance of temporal variations in attention control. Front. Hum. Neurosci. 7:381. doi: 10.3389/fnhum.2013.00381*

*Copyright © 2013 Lenartowicz, Simpson and Cohen. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Attention as foraging for information and value

## *Sanjay G. Manohar 1,2\* and Masud Husain1,2*

*<sup>1</sup> Department of Experimental Psychology, University of Oxford, Oxford, UK*

*<sup>2</sup> Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, Oxford, UK*

#### *Edited by:*

*Simone Vossel, Wellcome Trust Centre for Neuroimaging, UK*

#### *Reviewed by:*

*Antonio Rangel, CalTech, USA Brian A. Anderson, Johns Hopkins University, USA*

#### *\*Correspondence:*

*Sanjay G. Manohar, Department of Experimental Psychology, University of Oxford, South Parks Road, Oxford, OX1 3UD, UK e-mail: sanjay.manohar@psy.ox.ac.uk*

What is the purpose of attention? One avenue of research has led to the proposal that attention might be crucial for gathering information about the environment, while other lines of study have demonstrated how attention may play a role in guiding behavior to rewarded options. Many experiments that study attention require participants to make a decision based on information acquired discretely at one point in time. In real-world situations, however, we are usually not presented with information about which option to select in such a manner. Rather we must initially search for information, weighing up reward values of options before we commit to a decision. Here, we propose that attention plays a role in both foraging for information *and* foraging for value. When foraging for information, attention is guided toward the unknown. When foraging for reward, attention is guided toward high reward values, allowing decision-making to proceed by accept-or-reject decisions on the currently attended option. According to this account, attention can be regarded as a low-cost alternative to moving around and physically interacting with the environment—"*tele*foraging"—before a decision is made to interact physically with the world. To track the timecourse of attention, we asked participants to seek out and acquire information about two gambles by directing their gaze, before choosing one of them. Participants often made multiple refixations on items before making a decision. Their eye movements revealed that early in the trial, attention was guided toward information, i.e., toward locations that reduced uncertainty about value. In contrast, late in the trial, attention was guided by expected value of the options. At the end of the decision period, participants were generally attending to the item they eventually chose. 
We suggest that attentional foraging shifts from an uncertainty-driven to a reward-driven mode during the evolution of a decision, permitting decisions to be made by an engage-or-search strategy.

**Keywords: attention, saccades, foraging, uncertainty, information, expected value, risk, bayesian updating**

## **INTRODUCTION**

Recent studies have suggested that visual attention might play a role both in acquiring information and searching for reward. Several groups have demonstrated that reward can guide attention (Ding and Hikosaka, 2007; Hickey et al., 2010; Anderson et al., 2011; Schütz et al., 2012; Camara et al., 2013). Others have argued that attention needs to be drawn to stimuli that have a high uncertainty to facilitate acquisition of information (Yu and Dayan, 2005; Hogarth et al., 2008; Gottlieb and Balan, 2010). Acquiring information by directing attention is an active, dynamic process (Ballard et al., 1995; Shinoda et al., 2001), where information is the reduction of uncertainty in our estimate of world states or future outcomes (Feldman and Friston, 2010).

Which of these two drives, reward or uncertainty, controls the shifts of attention before a decision? Information integration for decisions has been the objective of a wealth of neuroscientific studies (e.g., Platt and Glimcher, 1999; Shadlen and Newsome, 2001; Smith and Ratcliff, 2009; Basten et al., 2010; Hare et al., 2011), but surprisingly little research has focused on the dynamic control of attention while searching *for information* (Reutskaja et al., 2011; Gottlieb, 2012). In most experimental situations, observers simply choose between two options at a discrete point in time, but are not allowed to sample the environment and integrate different types of information as they might naturally, over time.

Behavioral ecology, by contrast, has concerned itself with how animals sample the environment (forage) before coming to a decision (Krebs et al., 1978; Stephens, 1987; Stephens and Krebs, 1987). Here we present a new experimental paradigm that allows us to compare how attention is directed to reward, risk, and uncertainty about reward. We then discuss a framework in which attentional guidance shifts during choice, from information-driven to reward-value-driven.

Attention influences decision processes both by selecting which information is accumulated in decision variables (Einhorn and Hogarth, 1981; Roe et al., 2001; Krajbich et al., 2010) and by biasing choice toward the attended option (Shimojo et al., 2003; Brandstätter, 2011). But what guides attention itself? Unless carefully guided, attention would be maladaptive, biasing information and choice. When attention biases choice, attending to the higher expected value (*EV*) might be beneficial; whereas when attention determines which information is gathered, then attending to uncertainty might be beneficial (Itti and Baldi, 2009). Although information-seeking may ultimately help to obtain reward, we distinguish it from "value-driven" guidance in which attention is directly attracted toward reward.

Information could drive attention in two possible ways. A *perceptual* model of attention predicts that we focus on items that have greater uncertainty in their *identity* (Feldman and Friston, 2010). However, an *action*-driven model of attention would require that we focus on items that have greater uncertainty in their *value*. In other words, attention's primary role might be to provide decision making systems with information about the *EV* of the options being considered (Gottlieb and Balan, 2010), and thereby reduce risk.

Neither of these information-driven models explains the finding that, in choice, we generally choose the item we were last attending to (Krajbich et al., 2010), at least when the attended item is more valuable than the alternatives. We suggest that this tendency, although intuitive, requires explanation, and reveals key features of the tight link between attention and choice. A parsimonious explanation of this phenomenon is to regard attention as a form of *foraging*.

Rather than simply deciding which item is better, we argue that decisions are made by an "engage or search" strategy. Unlike classical decision-making models, this captures the intuition that we rarely choose something we are not attending to (Reutskaja et al., 2011). Attentional shifts, then, can be viewed as a low-cost alternative to physically moving around an environment before engaging with the world. In other words, attention might be a mechanism of "*tele*foraging": gathering and evaluating information at a distance before physically engaging with the environment.

In such a model, when we are free to search for information, attention would be considered to be driven *both* by uncertainty and *EV*, to jointly achieve the goals of information acquisition, and option selection. Option selection is then framed as either accepting the currently attended option ("engage") or moving to the other location ("search"). From this perspective, any progressive reduction of uncertainty by guiding attention can be viewed as "foraging for information."

Foraging for food involves deciding, after each movement, whether to engage a current option, or to move off and continue the search (e.g., Charnov, 1976; Kolling et al., 2012). Foraging for information, we propose, might involve deciding *at each fixation* whether information is sufficient to support choosing of the attended option, or not. Critically, over the course of each individual fixation, we might expect the amount of information being acquired to decrease (**Figure 1**). Thus, attention might shift to a new location when the information rate drops below a threshold, in parallel with animal models of foraging for reward (Waage, 1979; Stephens and Krebs, 1987).

Viewed as foraging, information acquisition would be expected to show a characteristic timecourse. Exploration during foraging is driven by our estimates of uncertainty in a variable environment (Behrens et al., 2007), so rather than simply attending to the highest expected value, a *systematic exploration* of the options would be envisaged to occur, perhaps described by an analog to the optimal departure rule developed for animal foraging (Pyke, 1978). Furthermore, according to this view, options might also be revisited, as needed, to acquire more information (Waage, 1979; Pyke, 1984; Gill, 1988).

**FIGURE 1 | Foraging for information.** We test the view that foraging for information involves the leaky accumulation of information about the fixated item. Information acquisition involves a time-dependent, location-specific *gain in precision*. Participants should leave a location when the information gain rate falls below a threshold, in parallel with classical foraging for reward (Stephens and Krebs, 1987). The location fixated next is determined by which location has the greatest estimated information gain rate. Meanwhile information about the original item *decays*. This predicts that participants refixate the first item seen, that dwell times shorten over the course of a trial, and that longer fixations result in fewer subsequent refixations of the same item.
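The leave-when-gain-drops rule described for Figure 1 can be illustrated with a toy simulation. This is only a sketch under our own assumptions and not the authors' model: precision is a unitless number between 0 and 1, the fixated item gains precision at a rate proportional to what remains unknown, unfixated items leak precision exponentially, and the function name and every parameter value (`gain`, `leak`, `threshold`) are hypothetical.

```python
def simulate_information_foraging(n_items=4, gain=0.01, leak=0.0005,
                                  threshold=0.002, n_acquisitions=6):
    """Toy leaky-accumulation forager; returns a list of (item, dwell_steps).

    All parameters are illustrative assumptions, not values from the study.
    """
    precision = [0.0] * n_items  # 0 = nothing known, 1 = fully known
    current, arrived_at, t = 0, 0, 0
    visits = []
    while len(visits) < n_acquisitions:
        # Information gain rate at the fixated item falls as precision grows.
        rate = gain * (1.0 - precision[current])
        if rate < threshold:
            # Leave: record the dwell, then move to the other item with the
            # greatest estimated gain rate (ties go to the lowest index).
            visits.append((current, t - arrived_at))
            others = [i for i in range(n_items) if i != current]
            current = max(others, key=lambda i: gain * (1.0 - precision[i]))
            arrived_at = t
            continue
        for i in range(n_items):
            if i == current:
                precision[i] += rate                  # gain at fixation
            else:
                precision[i] -= leak * precision[i]   # decay elsewhere
        t += 1
    return visits


visits = simulate_information_foraging()
order = [item for item, _ in visits]
dwells = [dwell for _, dwell in visits]
```

With these assumed settings the toy forager samples every item once and then returns to the first item, whose stored information has decayed the most, and the return visit is shorter than the initial one. It does not reproduce every aspect of the behavior reported later, such as refixations before all items have been seen.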

But later during a decision process, the marginal information yield (reduction in uncertainty) of an attentional shift should become small (**Figure 1**) as less information is gained with each new fixation (Armel and Rangel, 2008). Therefore, according to this perspective, we would anticipate that attention becomes progressively more governed by expected value and guided toward the more valuable option. This schema allows a foraging-type *"accept or reject" decision* to be made at each fixation, culminating in the selection of an option.

An alternative way of putting this hypothesis is that under conditions of uncertainty, *information* carries salience, but as more information is acquired, *reward value* should become salient. The allocation of attention during a decision is initially uncertainty-driven, but as information is "consumed," and *EV* estimates become more precise, *EV* itself guides attention, culminating in choice of the attended option. Such dynamic changes in attentional guidance could resolve a longstanding rift in the attention literature, between those that demonstrate attention to uncertainty, vs. those showing that reward guides attention.

We designed a task specifically to examine the timecourse of attentional control before a decision is made. In our design, participants are allowed to *forage for information* from a limited set of risk and reward data for as long as they like before they make their decision. By tracking their eye movements we can obtain a measure of where, how and in what order attention is deployed over time prior to a decision. Participants viewed two gambles, on the left and right of the screen, each of which was characterized by a probability and a monetary stake, displayed numerically on a vertical axis (**Figure 2A**). They had to fixate these four numbers to acquire information about the two gambles, importantly *without* any time limit, before they chose one of the two gambles by a keypress.

After choosing, they either won or lost the stake of the chosen gamble, with the specified probability of winning. Thus, choosing a probability greater than 50% was likely to win the stake, whereas below 50% was likely to lose money. A range of expected values and risks were chosen for each gamble. One gamble was always more risky than the other, but could have a higher or lower *EV* than the safer gamble (**Figure 3**). This allowed us to describe the trajectory of attention in terms of the relative "pull" of *EV* and uncertainty (composed of gamble risk and *EV* variance).

**FIGURE 2 | (A)** Our task is a choice between two gambles, presented on the left and right hand sides of the screen. Participants freely viewed a display with four numbers, to acquire information about the two gambles. Without a time limit, they selected the preferred gamble by a keypress. Each gamble had a probability of winning vs. losing, denoted with a "%" suffix, and a monetary stake, denoted with a "£" prefix. After selection, a sound indicating win/lose was played over a loudspeaker, and the bank balance was displayed centrally. The numbers were small and were presented close to isoluminance, ensuring that fixation was necessary to identify numbers. **(B–F)** Example scan paths of the first four acquisitions from one participant, aligned so that the first saccade is to the lower left. Trajectories are classified according to the fixation pattern: each of the three saccades could either be within an option or across options. Numbers represent order of acquisition.

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

In our task, participants had to make a choice between two gambles, but were given unlimited time to come to a decision. The gambles were presented on the left and right hand sides of the screen and participants freely viewed a display with four numbers, two on either side of the screen, to acquire information about the two gambles. Each gamble was given a *probability* of winning vs. losing (denoted with a "%" suffix) and a monetary *stake* (denoted with a "£" prefix). Both the probability and stake associated with a gamble were presented separately, one above the other (location randomized). Participants selected their preferred gamble by a keypress. After selection, a sound indicating win/lose was played over a loudspeaker, and the "bank balance" was displayed centrally, which was either incremented or decremented by the chosen stake. We recruited 17 participants from an advert, mean age 41. Research was conducted with informed consent, and was approved by the Imperial College Research Ethics Committee.

#### **STIMULI**

Stimuli were displayed in Matlab and PsychToolbox on a CRT at 1024 × 768 pixels, 100 Hz. Participants had to fixate a central cross before the start of each trial. Numbers were displayed in the four quadrants of the display, at an eccentricity of 10°, with size 0.5°. Probabilities were indicated with a "%" suffix, and monetary stakes were indicated with a "£" prefix (**Figure 2A**). In order to ensure that identifying a number required fixating it, all numbers were two digits long, were masked by "#" symbols on all four sides, and were close to isoluminance with the background.

Fifty percent of the trials were *"colour-coded,"* such that probabilities were in one color, and stakes in another, with the code being consistent for each participant (counterbalanced). Participants were informed of these color contingencies before the experiment. Thus, in the color-coded trials, they could know whether each location contained a probability or a stake, in advance of fixating it. This allowed us to examine whether participants could utilize such prior knowledge to strategically fixate items of the same "dimension" (stake or probability) when looking between options.

During the decision period, an Eyelink 1000 Hz infrared eye tracker allowed us to follow the sequence in which numbers were fixated over the decision period. Participants then made a choice by pressing a left or right key with the index or middle finger of their right hand. When the choice was made, an auditory tone of high or low pitch indicated whether participants had won or lost, and after 1500 ms the running total (bank balance) amount won was displayed in the center of the screen for 1 s. Participants had to fixate a central cross for 500 ms prior to the start of the next trial. Participants completed two 64-trial blocks over 30 min and were paid based on their winnings.

We analyzed fixations in the period from display onset to choice keypress. We removed blinks and discarded fixations shorter than 50 ms and fixations off the display items. The item fixated at any time was determined with an 8° radius. Blinks accounted on average for 2.4% of decision time, and off-item fixations accounted for 3.8% of the decision time. Dwell times were calculated as the time between arriving at an item, and arriving at the next item.
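The preprocessing steps above can be sketched in Python. This is a minimal illustration, not the authors' analysis code: the 50 ms cutoff, the 8° radius and the 10° eccentricity come from the text, while the function names, the tuple layout of a fixation, the item labels and the coordinate convention (degrees from screen center, y pointing up) are assumptions made for the example.

```python
import math

ITEM_RADIUS_DEG = 8.0    # from the text: radius used to assign gaze to an item
MIN_FIXATION_MS = 50     # from the text: shorter fixations are discarded

# Assumed layout: four items at 10 degrees eccentricity, one per quadrant.
_E = 10.0 / math.sqrt(2.0)
ITEMS = {
    "upper_left": (-_E, _E), "upper_right": (_E, _E),
    "lower_left": (-_E, -_E), "lower_right": (_E, -_E),
}


def assign_item(x, y, items=ITEMS, radius=ITEM_RADIUS_DEG):
    """Return the nearest item within `radius` of gaze, or None if off-item."""
    best, best_dist = None, radius
    for name, (ix, iy) in items.items():
        dist = math.hypot(x - ix, y - iy)
        if dist <= best_dist:
            best, best_dist = name, dist
    return best


def dwell_times(fixations, items=ITEMS):
    """fixations: list of (onset_ms, duration_ms, x_deg, y_deg), blinks removed.

    Returns (item, dwell_ms) pairs, where dwell is the time from arriving at
    an item to arriving at the next item. The last item visited has no
    following arrival and is therefore omitted.
    """
    on_item = []
    for onset, duration, x, y in fixations:
        if duration < MIN_FIXATION_MS:
            continue                      # too short to count as a fixation
        item = assign_item(x, y, items)
        if item is not None:              # drop off-item fixations
            on_item.append((onset, item))
    arrivals = []
    for onset, item in on_item:
        if not arrivals or arrivals[-1][1] != item:
            arrivals.append((onset, item))
    return [(item, next_onset - onset)
            for (onset, item), (next_onset, _) in zip(arrivals, arrivals[1:])]
```

Picking the nearest item resolves the small region where the 8° circles of neighbouring items overlap. That tie-breaking choice is ours and is not described in the text.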

### **GAMBLES**

The probability and stake for each gamble gives an expected value (*EV*) and a risk (*R*). Here, risk is defined as variance or uncertainty in the outcome:

$$EV = S \cdot (2P - 1) \tag{1}$$

where *S* is stake and *P* is probability of winning. Note that the factor 2*P* − 1 incorporates the possibility of both winning and losing the stake. Probabilities under 50% yield a negative *EV*. From Equation (1), we can see that a gamble with a 50% *probability* of winning or losing has *EV* = 0. At the start of a trial, both P and S are uncertain, but after acquiring information, they will be more precisely known. Therefore, we can consider both S and P as random variables that must be estimated by the brain.

Of note, *knowing only the probability* gives information about the expectation of *EV*, whereas *knowing only the stake* does not: the expectation of *EV* remains zero. For example, knowing whether the stake is £10 or £90 makes no difference to participants' (mathematical) expectation of reward, because they could either win or lose it.
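This asymmetry can be made explicit under two assumptions that are ours rather than stated in the text: that stake and probability are independent before they are seen, and that the prior mean of an unseen probability is 50%. Taking expectations of Equation (1) over the unseen quantity gives

$$\mathbb{E}[EV \mid S] = S \cdot (2\,\mathbb{E}[P] - 1) = 0, \qquad \mathbb{E}[EV \mid P] = \mathbb{E}[S] \cdot (2P - 1)$$

so a known stake leaves the expected *EV* at zero whatever its size, whereas a known probability shifts the expected *EV* away from zero whenever *P* differs from 50%.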

Next, we can calculate gamble risk, defined as variance of reward value:

$$R = 4S^2P(1-P)\tag{2}$$

According to this equation, a probability of 50% carries the highest risk because the outcome is most uncertain, and as probabilities get closer to 0 or 100%, the outcome is more predictable, so risk falls. Notably, the expectation of risk also changes when we learn a stake (unlike our expectation of *EV*)—i.e., after seeing a £90 stake, the risk estimate is high, since the outcome value is highly variable: +£90 or −£90.
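As a quick numerical check, Equations (1) and (2) can be written as two small Python functions. The function names are ours. The third function recomputes the variance directly from the two possible outcomes (+*S* and −*S*) and is included only to confirm that it matches the closed form.

```python
def expected_value(stake, p_win):
    """Equation (1): EV = S(2P - 1), with p_win given as a fraction in [0, 1]."""
    return stake * (2.0 * p_win - 1.0)


def risk(stake, p_win):
    """Equation (2): R = 4 S^2 P (1 - P), the variance of the outcome."""
    return 4.0 * stake ** 2 * p_win * (1.0 - p_win)


def risk_by_enumeration(stake, p_win):
    """Variance computed directly from the win (+S) and lose (-S) outcomes."""
    ev = expected_value(stake, p_win)
    return p_win * (stake - ev) ** 2 + (1.0 - p_win) * (-stake - ev) ** 2
```

For a given stake, `risk` is largest at a 50% probability and falls to zero as the probability approaches 0 or 100%, in line with the description above.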

On each trial, one of the two gambles had a high risk, and the other had a low risk (**Figure 3**). Values were chosen using four trial types, where the risky *EV*/safe *EV* were +8/+8, +8/−8, −8/+8 or −8/−8. Each of the four values (two probabilities and two stakes) was then randomized by adding a uniformly distributed integer from −10 to +10. This gave a set of trials which had a spectrum from similar *EV*s to different *EV*s, and high to low *EV*s. Similarly, risks ranged from high to low, with the difference in risks ranging from 20 to 70. The risky gamble's stake was between 57 and 77, and the safe stake was 10–30.
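One way such a trial set could be generated is sketched below. The text does not state how the base values were chosen, so this sketch assumes base stakes of £67 (risky) and £20 (safe), taken as the midpoints of the reported stake ranges, and base probabilities obtained by solving Equation (1) for the target *EV* and rounding to a whole percent. The ±10 integer jitter and the four trial types are from the text. All names are hypothetical.

```python
import random

BASE_STAKE = {"risky": 67, "safe": 20}                    # assumed midpoints
TRIAL_TYPES = [(+8, +8), (+8, -8), (-8, +8), (-8, -8)]    # (risky EV, safe EV)
JITTER = 10                                               # uniform integer jitter


def base_probability_percent(target_ev, stake):
    """Solve EV = S(2P - 1) for P and round to a whole percent."""
    return round(100 * (target_ev / stake + 1) / 2)


def make_gamble(target_ev, kind, rng):
    """Build one gamble by jittering the assumed base stake and probability."""
    base_percent = base_probability_percent(target_ev, BASE_STAKE[kind])
    return {
        "stake": BASE_STAKE[kind] + rng.randint(-JITTER, JITTER),
        "percent": base_percent + rng.randint(-JITTER, JITTER),
    }


def make_trial(trial_type, rng):
    """Return the risky and safe gambles for one (risky EV, safe EV) type."""
    risky_ev, safe_ev = trial_type
    return {
        "risky": make_gamble(risky_ev, "risky", rng),
        "safe": make_gamble(safe_ev, "safe", rng),
    }


rng = random.Random(1)  # seeded only so the example is repeatable
trials = [make_trial(tt, rng) for tt in TRIAL_TYPES for _ in range(16)]
```

Under these assumptions the jittered stakes fall in the reported ranges (57–77 and 10–30). Whether the reported range of risk differences is reproduced depends on details not given in the text.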

## **RESULTS**

## **PRE-CHOICE BEHAVIOR**

During the decision period, we traced the order of acquisition of information (one subject's first 4 fixations are shown in **Figure 2B–F**). "Acquisition" was defined as a period during which gaze remained on a single number (stake or probability), before moving to a different quadrant. Each acquisition lasted between 85 and 1800 ms, and could constitute several consecutive re-fixations around one particular item. Participants visited all four locations on 89% of trials. An optimal strategy might be to make only four acquisitions—provided that working memory can store four items, as some have argued to be the case (Cowan, 2010). However, we found that participants made on average 6.6 acquisitions before coming to a decision, and sometimes required up to 14 (**Figure 4A**).

In other words, they frequently refixated items prior to making a decision. One might predict that on this task, participants would visit all four locations before refixating any of them, consistent with "inhibition" of visited locations seen in visual search (Gilchrist and Harvey, 2000; Weger and Inhoff, 2006). However, our data showed, surprisingly, that on 49% of trials participants made refixations to a previously examined location *before* they had visited all four locations.
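The two measures used here, acquisitions and refixations that occur before all four locations have been visited, can be computed from a sequence of fixated items as follows. This is a minimal sketch with hypothetical function names. It assumes the input is the ordered list of item labels for the on-item fixations in one trial.

```python
from itertools import groupby


def acquisitions(fixated_items):
    """Collapse consecutive fixations on the same item into one acquisition."""
    return [item for item, _ in groupby(fixated_items)]


def refixated_before_full_coverage(acquired, n_items=4):
    """True if an item is revisited before all n_items have been visited."""
    seen = set()
    for item in acquired:
        if item in seen:
            # First revisit: it counts as early only if some item is unseen.
            return len(seen) < n_items
        seen.add(item)
    return False


trial = ["A", "A", "B", "A", "C", "C", "D"]   # illustrative fixation sequence
acquired = acquisitions(trial)                 # ["A", "B", "A", "C", "D"]
early = refixated_before_full_coverage(acquired)
```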

Mean dwell time on each acquisition was 762 ms and this decreased systematically over the course of a trial (**Figure 4B**). In this and subsequent analyses of fixation duration, we excluded the final acquisition during which the button-press choice was made, because the duration of this final fixation was presumably not determined by attentional search processes, but rather by action initiation. Dwell time on the first item was longer when a high *stake* was fixated, compared to a low stake [stake > median of £41, mean difference 33 ms, *t*(16) = 2.18, *p* = 0.045]. The gamble's *probability* had no effect on dwell time (*p* = 0.38). Thus, at the start of a trial, gaze—and by inference, attention—appeared initially to be attracted by higher risk (since stake determines the variance in outcome) but not by higher *EV*.

**FIGURE 4 | (A)** Average histogram of the number of acquisitions (periods contiguously fixating one number) on each trial. Participants usually make four or more acquisitions, but sometimes require 14. **(B)** Dwell times decrease during the course of a trial. The final acquisition of each trial was excluded. Mixed-effects One-Way ANOVA showed a main effect of acquisition serial position in the trial, and the red bar shows pairs of significant differences (*p* < 0.05).

### **CHOICE BEHAVIOR**

Participants chose the higher-*EV* gamble on 69% of trials overall. This occurred more often on "easy" trials—i.e., when the *EV*-difference between choices was large (absolute *EV* difference > median of £11: 77 vs. 61%, main effect of *EV* difference, *p* < 0.001). The higher *EV* was chosen less often when the risk difference was large (64 vs. 74%, main effect of risk difference, *p* = 0.03). Participants took less time to choose between the options when *EV*s were similar and large. There were strong biases for participants to choose the first option they fixated (*p* < 0.001) or the last item fixated (*p* < 0.001, **Figure 5A**), consistent with previous reports (Krajbich et al., 2010). This was despite the first saccade being directed essentially randomly (probability of 25% ± 2% to each type of item, probability or stake, high or low value, *p* > 0.5), even when informative color coding (see Methods and below) was present. Logistic regression revealed that preference was governed primarily by *EV* difference but was also influenced by final fixations [both *t*(16) > 7, *p* < 0.001, **Figure 5B**]. The preferred option consistently received more fixations and longer fixations, also consistent with previous findings (Glöckner and Herbold, 2011).

(left) predicts subsequent choice, despite being uncorrelated with any of the values seen. This demonstrates that participants are reliably biased by the first information they acquire. The final acquisition (right) strongly reflects the choice that is about to be made, with an accuracy of close to 80%: participants rarely choose an option they are not attending to. **(B)** Which factors influenced choice? An 8-factor logistic regression model was fitted to each subject's choices, i.e., whether they chose the risky or safe option. We included a bias term indicating individual risk preference, *EV* and risk of each option, and also eye movement factors from panel 5A—indicating whether the first and last fixations on each trial were to the risky option. The mean fitted normalized regression coefficients are shown. Error bars are s.e.m. across subjects. Asterisks indicate a regressor is significantly different from zero using *t*-test across subjects (*p* < 0.05). The initial fixation regressor was correlated with the final fixation regressor, and did not significantly contribute to choice on this analysis.

In our experimental design, 50% of the trials were "color-coded," such that probabilities were consistently in one color, and stakes in another. Thus, in the color-coded trials, participants could know whether each location contained a probability or a stake, in advance of fixating it.

If participants used this color information to guide attention, we might expect more horizontal saccades compared to diagonal saccades when corresponding dimensions (probability or stake) were aligned horizontally, and the converse when they were aligned diagonally. We found that although horizontal saccades were always more likely than diagonal saccades, there was no effect of display alignment (*t*-test of proportion of between-option saccades that were horizontal, *p* > 0.05), indicating that participants did not use color information in attentional guidance.

Choice reaction times were significantly faster when color-coding was present [4.32 vs. 4.69 s, *F*(1, 16) = 8.88, *p* = 0.009], irrespective of whether the probabilities and stakes were horizontally or diagonally aligned. The advantage of color-coding was also evidenced by shorter durations of acquisitions (736 vs. 836 ms for the first acquisition).

#### **INFORMATION FORAGING**

To analyse the data further we next developed a method to consider how information about *EV* is acquired over multiple fixations. A foraging account of attention postulates that the rate of acquiring new information decreases as participants gain greater knowledge about the fixated target (**Figure 1**).

$$k_1 \frac{dI}{dt} = I_{\text{max}} - I \tag{3}$$

Rate of gain of information ∝ 1 − information already known

Once the information gain rate drops below the average information gain rate in the task, participants would be expected to direct attention to a new location, according to the marginal value theorem developed for foraging behavior (Charnov, 1976).

To explain refixations, we further assume that, after attention has left, the entropy of the posterior gradually rises, as information is lost. In other words, there would be a *natural decay*:

$$k_2 \frac{dI}{dt} = -I \tag{4}$$

Rate of loss of information ∝ amount of information currently known
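Equations (3) and (4) have closed-form exponential solutions, so the foraging account can be explored numerically. The following is a minimal sketch, not the authors' implementation: the time constants `K1` and `K2`, the asymptote `i_max`, and the task-average gain rate `MEAN_RATE` are hypothetical values chosen for illustration, and the leaving rule is a direct reading of the marginal value theorem described above.

```python
import math

def info_after_fixating(i0, t, k1, i_max=1.0):
    """Solution of Equation (3), k1 * dI/dt = i_max - I, with I(0) = i0."""
    return i_max - (i_max - i0) * math.exp(-t / k1)

def info_after_leaving(i0, t, k2):
    """Solution of Equation (4), k2 * dI/dt = -I, with I(0) = i0."""
    return i0 * math.exp(-t / k2)

def leaving_time(i0, k1, mean_rate, i_max=1.0):
    """Dwell time at which the gain rate (i_max - I) / k1 falls to the
    task-average gain rate. Returns 0 if the rate starts at or below it."""
    initial_rate = (i_max - i0) / k1
    if initial_rate <= mean_rate:
        return 0.0
    return k1 * math.log(initial_rate / mean_rate)

# Hypothetical parameters: seconds, and information in arbitrary units.
K1, K2, MEAN_RATE = 0.5, 5.0, 0.6

first_dwell = leaving_time(0.0, K1, MEAN_RATE)       # first visit, nothing known
known = info_after_fixating(0.0, first_dwell, K1)    # knowledge on leaving
remembered = info_after_leaving(known, 3.0, K2)      # after 3 s spent elsewhere
refix_dwell = leaving_time(remembered, K1, MEAN_RATE)
print(first_dwell, refix_dwell)
```

Under these assumed parameters a refixation starts from residual knowledge and so reaches the leaving threshold sooner than a first visit, which is the qualitative pattern the foraging account relies on.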

With the assumption of decay, refixations can be explained by a rule that moves attention toward the unknown (toward high entropy). This information foraging account predicts that:


All three predictions turned out to be borne out by the results.

One might predict that on this task, participants would visit all four locations before refixating any of them, consistent with "inhibition" of visited locations seen in visual search (Gilchrist and Harvey, 2000; Weger and Inhoff, 2006). However, our data showed, surprisingly, that on 49% of trials participants made refixations to a previously examined location *before* they had visited all four locations.

#### *Refixations go to locations fixated longer ago (P1)*

At each fixation, we calculated the recency with which each display item was previously seen, i.e., how many items ago it was last fixated. On acquisitions that were refixations, the recency of the *fixated* item was 3.13 (*SD* 0.29). This compared with a recency of 2.72 (*SD* 0.13) for the other two items that were not selected by that eye movement [*F*(1, 19) = 16.9, *p* < 0.001]. Thus, participants preferentially refixated items that had *not* been seen recently. The effect can be equally explained by foraging or inhibition of return.
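One possible way to operationalize this recency measure is sketched below. It is an illustration under assumptions rather than the analysis code used in the study: the location labels are hypothetical, and items that have not yet been seen are simply left out because the text does not say how they were treated.

```python
def recency_at(sequence, index):
    """Recency of every previously seen item just before acquisition `index`:
    1 means the item was fixated on the immediately preceding acquisition."""
    seen = {}
    for pos in range(index):
        seen[sequence[pos]] = index - pos  # later visits overwrite earlier ones
    return seen

def is_refixation(sequence, index):
    """True if the item acquired at `index` was already visited in this trial."""
    return sequence[index] in sequence[:index]

# Hypothetical trial: probability/stake (p/s) of the left/right (L/R) option.
fixations = ["pL", "sL", "pR", "pL", "sR"]

for i in range(1, len(fixations)):
    if is_refixation(fixations, i):
        recency = recency_at(fixations, i)
        target, current = fixations[i], fixations[i - 1]
        others = {k: v for k, v in recency.items() if k not in (target, current)}
        print(i, "refixated", target, "recency", recency[target], "others", others)
```

Averaging the target recency and the recency of the non-selected items across refixations would give the two quantities compared in the text.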

#### *Refixation durations compared to new fixations (P2)*

Refixations were shorter than acquisitions at unvisited locations even when they occurred at the same serial position in the trial [**Figure 6A**, *t*(16) > 2.8, *p* < 0.01 at serial positions 3, 4, 5, and 6], just as might be predicted from a foraging perspective. This finding suggests that once viewed, an item cannot hold attention for as long.

**(B)** Dwell times are longer when the item is never fixated again,

#### *Initial fixation time affects subsequent refixation duration (P3)*

Initial dwell times were shorter at a location that was later refixated, compared to locations that were not refixated, even for acquisitions at the same serial position within a trial (**Figure 6B**, *p* < 0.01 for acquisitions at serial positions 1 and 2; *p* < 0.05 at 3 and 4). Thus, items that were briefly viewed were more likely to be refixated. This is in keeping with less information being accrued on shorter acquisitions (**Figure 1**). Participants who made shorter fixations on average also made more refixations (regression of mean dwell time over first four acquisitions against 1/(number of refixations), transformed to remove positive skew, *r*<sup>2</sup> = 0.26, *p* = 0.038), confirming that less time spent on an item leads to its refixation (**Figure 6C**).

All these findings support an information-seeking model that parallels animal models of foraging. An explanation of some of these results could be that refixations are guided by the strength of some memory trace. Is there any specific evidence that *information* is in fact the driver of attention? To answer this, we must examine how information gain depends upon the actual numbers seen.

#### **BAYESIAN ESTIMATE OF EV AND RISK FOR EARLY FIXATIONS**

While information accumulation is described by Equations (3) and (4), deciding where next to look requires a normative rule governing attention. Such a rule would specify how attention is driven by the *distributions of the estimated decision variables*, as they evolve over the decision period. We postulate that visiting and re-visiting of locations *optimizes information gain*. Similar information-guidance rules for attention have previously been proposed for low-level feature searches (Renninger et al., 2007; Hou and Zhang, 2008). In the context of choice, we expect attention to be specifically guided by uncertainty in *EV*.

For the first two fixations of a gamble, we follow step-by-step the best estimate of *EV* and risk, by tracking the evolution of the Bayesian density for the *EV* and risk. We start with a flat prior, representing the lack of knowledge about the items on screen (qualitatively similar results apply if the prior is taken over all actually presented trials). After a single fixation, either the probability or the stake is known with greater precision, illustrated here as a Gaussian distribution (**Figure 7**, heatmap to left of distribution).

If a *stake is seen first*, the density over stakes is transformed from the flat prior, to a peaked posterior (**Figure 7**, left and middle columns). We approximate this as

$$
\pi \left( S = s \mid e_1 \right) \propto \pi(S = s) \cdot \mathcal{N} \left( s - e_1, \sigma \right), \tag{5}
$$

Posterior over stakes = prior over stakes × information gained,

where π(*S* = *s*) = 1/100 is the prior and π(*S* = *s* | *e*<sub>1</sub>) is the posterior over stakes after the stake value *e*<sub>1</sub> is seen.

The intuition is that participants do not know for certain what number is displayed, but a narrower distribution represents having more precise knowledge. Similar belief-updating methods have recently been used for locating targets in machine vision (Butko and Movellan, 2008) and inferring word identity in reading (Bicknell and Levy, 2010).
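The update in Equation (5) can be sketched on a discrete grid. The snippet below is illustrative only: the grid of 100 stake values is an assumption suggested by the flat prior of 1/100, σ = 15 is the value quoted later in the text, and applying the same update a second time is just one simple way to represent a repeated look at the same number.

```python
import math

VALUES = list(range(1, 101))  # assumed grid of 100 possible stake values

def update(prior, seen_value, sigma=15.0):
    """Equation (5): prior times a Gaussian centred on the seen value,
    renormalised so that the posterior sums to 1."""
    unnorm = [p * math.exp(-0.5 * ((v - seen_value) / sigma) ** 2)
              for v, p in zip(VALUES, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def spread(dist):
    """Standard deviation of a distribution defined over VALUES."""
    mean = sum(v * p for v, p in zip(VALUES, dist))
    return math.sqrt(sum(p * (v - mean) ** 2 for v, p in zip(VALUES, dist)))

flat_prior = [1.0 / len(VALUES)] * len(VALUES)
after_one_look = update(flat_prior, seen_value=70)
after_two_looks = update(after_one_look, seen_value=70)
print(spread(flat_prior), spread(after_one_look), spread(after_two_looks))
```

The shrinking spread corresponds to the narrower distribution that the text identifies with more precise knowledge of the displayed number.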

Importantly, participants can now form estimates about the *EV* and risk:

$$\pi \left( EV = \nu \mid e_1 \right) = \int \pi(P = p) \cdot \pi \left( S = \frac{\nu}{2p - 1} \,\middle|\, e_1 \right) dp \tag{6}$$

Posterior probability of *EV* being *v* = probability of *S* · (2*P* − 1) being equal to *v*;

$$\pi\left(R = r \mid e_1\right) = \int \pi(P = p) \cdot \pi\left(S = \frac{1}{2}\sqrt{\frac{r}{p(1-p)}} \,\middle|\, e_1\right) dp \tag{7}$$

Posterior probability of risk being *r* = probability of 4*S*<sup>2</sup>*P*(1 − *P*) being equal to *r*.

These follow from combining Equations (1) and (2) with the posterior of (5). This captures the notion that after seeing a high or low stake, participants update their expected winnings and risks.
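A discrete counterpart of Equations (6) and (7) is to enumerate every (stake, probability) pair on a grid and accumulate its probability mass into bins of the derived quantity. The sketch below makes several assumptions: the 1 to 100 grids, the bin counts, and the function names are hypothetical, and the forms *EV* = *S*(2*P* − 1) and risk = 4*S*<sup>2</sup>*P*(1 − *P*) are taken from the annotations beneath the equations rather than from Equations (1) and (2), which are not reproduced in this section.

```python
import math

GRID = range(1, 101)  # assumed: stakes 1..100 and probabilities 1..100 %

def belief(seen=None, sigma=15.0):
    """Flat prior over GRID, or the Gaussian-weighted posterior after one look."""
    w = [1.0 if seen is None else math.exp(-0.5 * ((v - seen) / sigma) ** 2)
         for v in GRID]
    total = sum(w)
    return [x / total for x in w]

def ev(s, p):
    return s * (2 * p - 1)

def risk(s, p):
    return 4 * s * s * p * (1 - p)

def pushforward(stake_belief, prob_belief, fn, lo, hi, n_bins):
    """Histogram of fn(stake, probability) under independent beliefs."""
    hist = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for s, ws in zip(GRID, stake_belief):
        for pc, wp in zip(GRID, prob_belief):
            k = min(int((fn(s, pc / 100.0) - lo) / width), n_bins - 1)
            hist[k] += ws * wp
    return hist

def mean_of(hist, lo, hi):
    """Mean of a histogram, using bin centres."""
    width = (hi - lo) / len(hist)
    return sum(m * (lo + (k + 0.5) * width) for k, m in enumerate(hist))

flat = belief()
ev_high = pushforward(belief(80), flat, ev, -100.0, 100.0, 40)
risk_high = pushforward(belief(80), flat, risk, 0.0, 10000.0, 50)
risk_low = pushforward(belief(20), flat, risk, 0.0, 10000.0, 50)
print(mean_of(ev_high, -100.0, 100.0),
      mean_of(risk_high, 0.0, 10000.0), mean_of(risk_low, 0.0, 10000.0))
```

With a flat belief about the probability, seeing a high stake raises the expected risk while leaving the expected *EV* close to zero, in line with the statement elsewhere in the text that a stake on its own indicates risk without informing about expected value.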

After a *second* fixation within the same gamble, participants acquire information about the probability *e*<sub>2</sub>, and the new estimated density of the probability *P* is given by

$$
\pi\left(P = p \mid e_2\right) \propto \pi(P = p) \cdot \mathcal{N}\left(p - e_2, \sigma\right),
\tag{8}
$$

with the prior π(*P* = *p*) = 1/100. Putting π(*P* = *p* | *e*<sub>2</sub>) in place of π(*P* = *p*) in Equations (6) and (7) gives the new posteriors for *EV* and risk after the second fixation, π(*EV* | *e*<sub>1</sub>, *e*<sub>2</sub>) and π(*R* | *e*<sub>1</sub>, *e*<sub>2</sub>) (**Figure 7**, right column). This posterior now incorporates the fact that participants have some knowledge about both the stake and probability to estimate what they can win.

**FIGURE 7 | Bayesian updating of expectations.** What information is obtained in the first two acquisitions? Heatmaps on the left of each panel illustrate the participant's estimated distribution of probability and stake. From this we calculate the estimated distribution of *EV*s and risks [Equations (6) and (7)]. Far left: the priors give a relatively flat distribution for *EV* and risk. First column: after the first acquisition, either a probability or stake is seen, narrowing the distribution in that dimension, and altering the density of *EV* and risk. Second column: after the first acquisition, the participant shifts attention to the other value in the same option, and his estimate of *EV* and risk improves again. As more information is accrued by fixations, the distributions become more peaked.

After the first fixation on the stake, should participants fixate the probability of the same option? We quantify how much information can be gained by looking at the probability, using an information metric. The expected information gained about *EV* (the gain from a within-option saccade, i.e., vertical saccade) could be measured in bits as the average over possible values of *e*<sub>2</sub> of

$$\text{Information} = D_{\text{KL}}\left[\pi\left(EV = \nu \mid e_1, e_2\right) \parallel \pi\left(EV = \nu \mid e_1\right)\right]$$

$$= \int \pi\left(EV = \nu \mid e_1, e_2\right) \cdot \log\left(\frac{\pi\left(EV = \nu \mid e_1, e_2\right)}{\pi\left(EV = \nu \mid e_1\right)}\right) d\nu. \tag{9}$$

Information gain = distance between probability distributions before and after seeing an item.

Intuitively, if gazing at a location could dramatically change the distribution of possible *EV*s, then that location is potentially very informative. That is, informativeness is defined as the distance between the current and possible future distributions of *EV*.
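The expected information gain can be sketched numerically as the Kullback–Leibler divergence between the *EV* distribution after a hypothetical second look and the distribution before it, averaged over what that look might reveal. This is a rough illustration under stated assumptions, not the authors' code: the 1 to 100 grids, the 40-bin discretization of *EV*, the form *EV* = *S*(2*P* − 1), and the equal weighting of all possible values of *e*<sub>2</sub> are choices made here for simplicity.

```python
import math

GRID = range(1, 101)                 # assumed grid: stakes 1..100, probabilities 1..100 %
N_BINS, LO, HI = 40, -100.0, 100.0   # arbitrary discretisation of EV

def belief(seen=None, sigma=15.0):
    """Flat prior over GRID, or the Gaussian-weighted posterior after one look."""
    w = [1.0 if seen is None else math.exp(-0.5 * ((v - seen) / sigma) ** 2)
         for v in GRID]
    total = sum(w)
    return [x / total for x in w]

def ev_hist_given_p(stake_belief):
    """One EV histogram per probability value, with EV = s * (2p - 1)."""
    width = (HI - LO) / N_BINS
    table = []
    for pc in GRID:
        x = 2 * pc / 100.0 - 1
        hist = [0.0] * N_BINS
        for s, ws in zip(GRID, stake_belief):
            hist[min(int((s * x - LO) / width), N_BINS - 1)] += ws
        table.append(hist)
    return table

def mix(table, weights):
    """EV distribution implied by a belief (weights) over the probability."""
    return [sum(w * row[k] for w, row in zip(weights, table)) for k in range(N_BINS)]

def kl_bits(post, prior):
    """D_KL[post || prior] in bits; empty posterior bins contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(post, prior) if a > 0)

def expected_gain(seen_stake, sigma=15.0):
    """Average information gained about EV by looking at the probability next."""
    table = ev_hist_given_p(belief(seen_stake, sigma))
    before = mix(table, belief())    # EV belief with a flat prior over probability
    gains = [kl_bits(mix(table, belief(e2, sigma)), before) for e2 in GRID]
    return sum(gains) / len(gains)   # assumes every e2 is equally likely

print(expected_gain(80), expected_gain(20))
```

Comparing `expected_gain` for a high and a low first-seen stake gives one way to examine how the informativeness of the unseen probability depends on the stake in this simplified setting.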

Analogous results are found when a probability is fixated first. The information gained by remaining within an option is shown in **Figure 8**, and is characterized as follows:


These features are robust to differing amounts of information per fixation (changes in σ). We took σ = 15 for the residual uncertainty about a number after it is fixated once. Note that predictions P4–P6 (predicting fixation sequence) are independent of P1–P3 (predicting fixation duration), because the Bayesian updating in its present form ignores fixation durations and decay. A composite model incorporating both decay and time-dependent updating could be used, which would generate all six predictions P1–P6, but would require fitting of accumulation and decay rate parameters. Instead, we chose to split the two aspects of the model to allow for more straightforward testing.

#### **IS THE FIRST SHIFT OF ATTENTION DRIVEN BY EV OR INFORMATION?**

After the first acquisition, attention could either be directed within the gamble to the other number (vertical saccade), or across to the gamble on the opposite side (horizontal or diagonal saccade). If attention were driven by expected value, after the first fixation, we would expect participants to look within an option after seeing a high probability, but not after a low probability, and no effect of stake size (seeing a high stake indicates a high risk, but without informing about the expected value). This prediction is illustrated by the bars in **Figure 8B**. On the other hand, if information guides attention, we should expect high stakes to cause more within-option saccades than low stakes, because the higher the stake, the more informative is the corresponding probability.

We found that participants were generally more likely to look within the current gamble (60% preponderance). If the first fixation was on a stake, participants were more likely to look within the option, compared to when they first fixated a probability [63 vs. 57%, *t*(16) = 2.17, *p* = 0.046, **Figure 8B**]. This is consistent with an information-seeking account of attention, since stakes initially provide no information about *EV*, whereas probabilities do. However, we did not find an effect of the *magnitude* of the probability or stake first fixated (*p* > 0.2). Comparing these results with optimal information-seeking (previous section) shows that, in our participants, attention seeks information according to criterion (P4), but not (P5) or (P6), for the first gaze shift.

**FIGURE 8 | (A)** Attention may be guided by *EV* or by information seeking. The two drives predict different patterns of fixation in our task. If attention were *EV*-seeking, gaze ought to remain within the current option if the first-seen item was a high probability, but not if it were a low probability. On the other hand, if attention were information-seeking, gaze ought to remain within the current option if a high stake was seen, compared to a low stake. **(B)** After the first fixation, participants may look vertically within the option, or across to the other option. Where they look next depends on what they just saw: within-option saccades are more common after seeing a stake. This is predicted when attention is information-driven, rather than *EV*-driven. Green bars represent the theoretical information gain from making a within-option saccade, calculated as *D*<sub>KL</sub>[posterior *EV* || prior *EV*] averaged over the prior on *p* and *s*, which represents how much information one could expect to learn after making a particular saccade. Yellow bars represent the Bayesian estimate of *EV* of the current option. Both green (information) and yellow (*EV*) bars are arbitrarily scaled. **(C)** On trials where the first two acquisitions were within one gamble, participants sometimes refixate the first item seen. This is more likely when the probability was high (*p* = 0.047), but there was no effect of stake (*p* > 0.05, with no interaction).

## **WHAT DRIVES REFIXATIONS ON THE SECOND SHIFT OF ATTENTION?**

Next, we examined only trials where the first two acquisitions were both within one option. At this point participants had seen *both* the stake and probability of one option. Subsequent saccades could either be vertical, refixating the first value seen, or could go across to the other option.

Equations (5)–(9) describe how informative the next shift of attention would be, given the estimates at the end of the second acquisition.

If refixation were driven by *EV*, we might expect more of these refixations when a high probability and a high stake were seen, and fewer when a low probability and a high stake were seen. Note that a pure information-seeking account would always predict moving to the other option. We found that on average participants immediately refixated in 37% of trials, and there were more refixations when the probability was high than when it was low [main effect of option probability, *F*(1, 48) = 4.15, *p* = 0.047, **Figure 8C**]. This is consistent with a *pull of the higher EV*, and demonstrates that the second shift of fixation is not simply random. As expected there was no effect of the stake size (*p* = 0.7). However, we did not find the expected interaction between the probability and stake (*p* = 0.49): high stakes did not increase the drive of probability.

In these analyses of the first and second shifts of attention, we included color coding as a factor. There was no main effect of color coding, and no interaction (*p >* 0*.*05). Since we had only expected color coding to be relevant for the first two shifts of attention, we collapsed across color conditions for the following analysis of later fixations.

## **SUBSEQUENT TIMECOURSE OF ATTENTIONAL CONTROL BY EV AND INFORMATION**

Information seeking only partly predicts the first two shifts of attention. For subsequent fixations, however, it is more effective. We can follow the acquisition sequence that participants made, iteratively applying Bayesian updating [Equations (5)–(9)]. At each fixation, we calculated the online estimate of the option *EV*s and risks, assuming a fixed amount of information about the number is acquired on each acquisition, with no forgetting. The expectation of information gain [Equation (9)] gives us the optimal next saccade to maximize information, either about *EV* or about risk. **Figure 9A** shows, for each fixation, whether or not participants fixated the "best" item in order to maximize information about *EV*, or risk. On the fourth and fifth acquisitions in a trial, attention is strongly drawn toward the higher information location, but on later acquisitions only weakly so [compared to chance, *t*(16) > 2.79, *p* < 0.05 correcting for 24 multiple comparisons].

What was the timecourse of the attentional pull of *EV*? Early acquisitions were equally likely to go to the lower or higher *EV* option, whereas later acquisitions (7th and 8th) tended toward the higher *EV* option [**Figure 9B**, *t*(16) > 2.15, corrected *p* < 0.05, up to 58% to higher *EV*; qualitatively similar results were obtained using σ = 5, 15, or 60]. Thus, value had a stronger pull later in the decision period.

The acquisition immediately after all locations had been visited was strongly drawn toward the *initially fixated* item (**Figure 10**), despite the initial fixation being at chance to each item type; this is precisely what would be expected from the decay of information.

**FIGURE 9 | Timecourse of attentional control. (A)** Early on in a trial, attention is drawn by information. There is a strong pull by information about expected value, as calculated by Bayesian updating. The y-axis shows how often participants' saccades coincide with the information-seeking prediction. This falls to chance (33%) after the sixth acquisition in a trial. There is a weak effect of information about risk. Asterisks denote acquisitions when gaze was significantly drawn toward the highest information, relative to chance (*p* < 0.05). **(B)** Participants increasingly tend to fixate on the option with higher *EV* through a trial. For both **(A,B)**, *EV* and risk estimates for participants' fixation sequences π(*EV* | *e*<sub>1</sub>, *e*<sub>2</sub>, ..., *e*<sub>i</sub>) were calculated using Bayesian updating rules using σ = 15.

To rule out possible bias due to there being more acquisitions in trials with lower and more similar *EV*s (see below), we aligned each trial's acquisition series to the *end* of the sequence, such that all the final, penultimate etc. acquisitions were grouped. Again, the effect of *EV* increased monotonically through the trial, to the final saccade which had a 62% chance of going to the higher *EV*. The final saccade correlated with choice on 80% of trials (**Figure 5**).

The results suggest that both uncertainty and *EV* can drive attentional shifts, but at different points in a trial. A possible attention-guiding rule might be to maximize some linear combination of informativeness and estimated expected value, where the weighting changes through the trial:

$$V_i = \alpha \cdot \mathbb{E}_{p,s}(I_i) + (1 - \alpha) \cdot \mathbb{E}_{p,s}(EV_{\text{option}(i)}) \tag{10}$$

value of fixating a location = α · expected information gain by fixating item + (1 − α) · expected value of option

over the three possible shifts of attention. Here *V*<sub>i</sub> represents the intrinsic worth of a given shift of attention, *I*<sub>i</sub> is its information gain given by Equation (9), and *EV*<sub>option(i)</sub> is the estimated reward *EV* of the corresponding option. The coefficient α(*t*) might begin at 1, and decrease to zero through a trial, weighting first information then value.
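The rule in Equation (10) can be illustrated with a small sketch. Everything numerical here is hypothetical: the linear schedule for α(*t*) is only one way of moving from 1 to 0 across a trial, and the information and value inputs are assumed to have been rescaled to a common 0 to 1 range, since bits and pounds are not directly comparable.

```python
def alpha(t, duration):
    """Hypothetical linear schedule: 1 at trial start, 0 from `duration` onward."""
    return max(0.0, min(1.0, 1.0 - t / duration))

def shift_values(info_gain, option_ev, a):
    """Equation (10) evaluated for each candidate shift of attention."""
    return [a * i + (1.0 - a) * v for i, v in zip(info_gain, option_ev)]

def best_shift(info_gain, option_ev, a):
    """Index of the candidate location with the highest combined worth."""
    values = shift_values(info_gain, option_ev, a)
    return max(range(len(values)), key=values.__getitem__)

# Three candidate locations with made-up, rescaled inputs.
info = [0.9, 0.4, 0.1]   # expected information gain of each shift
ev = [0.2, 0.3, 0.8]     # estimated EV of the option each location belongs to

early = best_shift(info, ev, alpha(0.0, 4.0))   # weighting is all information
late = best_shift(info, ev, alpha(4.0, 4.0))    # weighting is all value
print(early, late)
```

With these inputs the preferred location switches from the most informative one early in the trial to the one belonging to the higher-valued option late in the trial, which is the qualitative shift that the weighting is meant to capture.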

## **AMOUNT OF FORAGING FOR INFORMATION DEPENDS ON EV AND RISK**

We quantified foraging for information by the number of acquisitions (changes of fixation quadrant) before choosing. Participants made more acquisitions when the expected values of the gambles were *both low* than when they were both high [ANOVA, median split factors: mean *EV*, *EV* difference, mean risk, risk difference; main effect of mean *EV*, *F*(1, 16) = 13.4, *p* = 0.0038]. They also made more acquisitions when the *difference in expected values* of the two gambles was *small* (**Figure 11**), i.e., harder decisions led to more exploration [main effect of *EV* difference, *F*(1, 16) = 8.96, *p* = 0.0086]. This would be consistent with estimated distributions of value getting progressively sharper, or more accurate, with more information: sharper posterior distributions are required in order to distinguish between options with similar *EV*s, as predicted by diffusion and rise-to-threshold models (Carpenter and Williams, 1995; Ratcliff and Smith, 2004). When the two risks were similar, the number of acquisitions was strongly modulated by *EV* difference. But when the two *risks were very different*, *EV* had little effect [interaction of risk difference with *EV* difference, *F*(1, 16) = 5.53, *p* = 0.023] (**Figure 11**).

Are these similarity-driven refixations specifically targeted to the most informative locations? If refixations were attracted by information about individual display items, we would expect participants to refixate probabilities when the probabilities are similar, and stakes when the stakes are similar. However, this effect is not seen (**Figure 12**, left). Participants *do* make more refixations when the probability difference is small, but the extra refixations are not specifically directed to the probabilities [main effects of mean probability and probability difference, *F*(1, 112) = 22 and 28, respectively, *p* < 0.001, but no interaction with which item was refixated]. Similar stakes also increase refixations compared to different stakes, but again a *general* increase of refixation is seen, not specific to the stakes [effect of stake difference *F*(1, 112) = 0.03, **Figure 12** right]. This finding suggests that the comparison takes place not in feature-space, but in value-space: both the probabilities and stakes are counted as informative, when comparison of either is difficult.

**FIGURE 11 | Participants make more acquisitions on trials where the mean *EV* of the two gambles is low, and when the two *EV*s are similar (higher difficulty).** The presence of a large risk difference reduces the difficulty effect (interaction *p* < 0.05). High mean risk increases the difficulty effect when mean *EV*s are low, but decreases it when mean *EV*s are high (interaction *p* < 0.05).

## **DISCUSSION**

We designed a task in which participants could freely acquire information before making a decision. Two options were inspected, each of which had a monetary stake and a probability of winning vs. losing that stake. Unlike standard decision-making paradigms, we examined the trajectory of attention (indexed by eye position) before the choice is made. After freely acquiring information, participants made a button press choice. We found that they frequently refixated items, even before visiting all four locations. Early in a trial, the trajectory of attention was directed to locations with the highest information gain. Later on, attention was guided to the option of higher expected value (**Figure 9**).

integrated *EV* and risk of the options. Asterisks: 3-way ANOVA *p* < 0.05.

Why would locations be refixated? We interpret the findings in terms of foraging: choosing an option involves first approaching the option, then deciding whether to accept or reject that option. Early in the trial, under uncertainty, attention is directed to high-variance options, in an attempt to resolve their uncertainty by acquiring information. As information accumulates, however, attention becomes progressively guided by reward value, such that an "engage/search" strategy could be used to make the best choice.

The temporal pattern of the attentional trajectory provided support for an information foraging mechanism:


But is the assumption of looking toward uncertainty warranted? If attention were guided solely by information seeking, we would not observe biases of looking toward reward (Ding and Hikosaka, 2007; Milstein and Dorris, 2007; Hickey et al., 2010). On the other hand, if attention were guided solely by reward, we would not learn about our environment (Hogarth et al., 2008).

## **TWO COMPETING HYPOTHESES FOR GOAL-DRIVEN GUIDANCE OF ATTENTION: SHARPENING PERCEPTION vs. SHARPENING VALUE REPRESENTATION**

According to a *perceptual model*, attention should favor objects whose identity is uncertain. This is the prediction of models in which attention aims to improve the precision of our internal representation of causes in the world, e.g., a free energy formulation of perception. A competing model is that attention favors objects which inform us about *expected value* (Milstein and Dorris, 2007). For example, if an object is likely to indicate what the value of an option is, it should command attention. Here, attention aims to improve informed choice, and attentional trajectories are computed in terms of option-value precision, as opposed to perceptual precision. Perceptual information-seeking is agnostic of the actual numbers seen. On the other hand, *EV*-based information-seeking predicts that revisiting patterns should depend on the actual numerical values. Such effects are seen in our data (**Figures 8B,C** and **12**), consistent with the possibility that the initial trajectory of attention is computed to reduce uncertainty in *option-value* space, rather than perceptual space, using an information-maximizing principle. This could in principle be implemented using an active inference framework. This distinction provides a new way to disentangle different levels of "top-down" attentional control: in our task, the eyes are directed not simply to perceptual uncertainty, but to option value uncertainty.

Our results thus lead us to consider that *value* uncertainty is more likely to be relevant than *perceptual* uncertainty in this task. Numerical values may be subject to noisy integration similar to that of qualitative stimuli (Krajbich et al., 2012). Such a proposal would be consistent with evidence that numerical magnitude representations in the parietal lobe are limited in their precision, in contrast to precise symbolic representations present during immediate perception (Naccache and Dehaene, 2001; Brannon, 2006).

## **EXPLAINING REFIXATIONS**

Refixations, we argue, occur because of incomplete knowledge of previously visited items. This could be due to poor retention or poor acquisition. Although retention is generally considered to have a capacity of 4 or more items (Snyder and Kingstone, 2000; Gilchrist et al., 2001), a variable-precision account of working memory retention might predict refixations, particularly when combined with temporal decay (Bays and Husain, 2008; Zokaei et al., 2011). A more straightforward explanation of refixation is that participants only *acquire* a limited amount of information about each target as they fixate it. This can be expressed as incremental changes in the estimated probability density over the four display values (**Figure 7**). The gain of information may depend on fixation duration, and subsequently information may decay (**Figure 1**).

To explain refixation patterns, we invoke a concept of "*information salience*." The information content of a stimulus can be quantified as the distance between probability densities over *EV* before and after an item is identified. Thus, information content indicates the reduction in uncertainty that a stimulus might bring when identified. The concept of information salience is meant to describe the way in which attention can be captured by this informativeness, even when other accounts (inhibition of return, Posner and Cohen, 1984; Itti and Koch, 2001) predict it should not. Our task allows us to quantify mathematically what has been called "attention to the unknown" (Gottlieb, 2012), and compare it directly with other attentional biases, including perceptual salience and reward.

One long-standing candidate for explaining attention to the unknown is *inhibition of return* (IOR; Rafal et al., 1989). IOR has long been thought of as an aid to foraging in an environment (Klein and MacInnes, 1999; Gilchrist and Harvey, 2000; Klein, 2000; Hooge et al., 2005), and has inspired dynamic models of sequential attentional selection (Itti and Koch, 2001; Hou and Zhang, 2008). IOR both slows and prevents returning saccades (Bays and Husain, 2012), and in this way, may function as a novelty bias.

Of interest, one study has shown IOR to be contingent upon the occurrence of reward and dependent upon medial frontal cortex (Hodgson et al., 2002). IOR may persist for up to 5 previously attended locations (Snyder and Kingstone, 2000); its duration is increased by amphetamine and may be reduced in Parkinson's disease (Filoteo et al., 1997; Poliakoff et al., 2003). It also varies between individuals according to DAT1 gene polymorphisms (Colzato et al., 2010). Frontal dopaminergic mechanisms are thus likely to be crucial in generating the drive of spatial attention toward reward value or uncertainty.

Although IOR explains why refixations go toward locations that haven't been recently fixated, it makes no predictions about (1) the relationships with fixation durations, (2) the first couple of acquisitions, nor (3) the effect of the actual numeric values seen. However, specific predictions are made by information foraging coupled with Bayesian updating of *EV*s.

## **"DECIDING" WHERE TO LOOK**

Many authors have considered saccadic control as a surrogate for decision making (Glimcher, 2001; Gold and Shadlen, 2007). From our results we argue, in contrast, that deciding where to attend involves *different* considerations to deciding upon actions:


Computationally, a critical difference is that "deciding" compares values, whereas "attending" compares uncertainties. Information foraging thus requires different mechanisms to classical decision-making models of winner-takes-all competition between the option values (Wang, 2002; Wong et al., 2007). So long as more information is available in the environment, then for guiding attention, the *least certain* option needs to win out (Renninger et al., 2007). One implementation of this would be a neural map of uncertainty, rather than value, that guides attention analogous to maps proposed for reward (Peck et al., 2009) and salience (Koch and Ullman, 1985).

Even when attention is guided by values, we suggest that the values are integrated in a fundamentally different way. Rather than comparing option values in an accumulator (Ratcliff and McKoon, 2007), we suggest that attention is guided by value via a spatial map, which may incorporate reward expectation and history from many sources (Platt and Glimcher, 1999; Ding and Hikosaka, 2007; Milstein and Dorris, 2007), such as online value estimates. Such attentional value biases are entirely compatible with action-choice being subserved by independent comparators often used in decision models.

## **CONCERNS AND LIMITATIONS**

Although the framework advanced here has some attractions, there are also some potential concerns or limitations. First, does *EV* really carry less weight early in a trial (**Figure 9**)? At the start of a trial, participants have no information about *EV*, so it is not surprising that early fixations are not directed toward the higher *EV* option. If this is the case, perhaps the relative influence of *EV* and information does not vary through a trial, i.e., the coefficient α(*t*) in Equation (10) might in fact be constant. To address this, we used the estimated posterior for *EV* [Equation (4)] to re-analyse whether participants fixated the option that had the higher value *according to their online estimates*, and obtained results similar to **Figure 9B**. Participants looked at the higher *EV* estimates on fixations 6 and 7 (corrected *p* < 0.05), but not on earlier fixations. Thus, we conclude that attention was significantly pulled by *EV* later but not earlier in the trial. We cannot rule out, however, that earlier in the trial *EV* contributes less because the estimated *EV* differences are smaller, or that later in the trial high *EV*s are fixated as a by-product of a comparison process.

Second, the first few shifts of attention (indexed by gaze) did not show true information-guidance. The second acquisition tended to be within the same gamble as the first fixation, which contravenes predictions of pure information-seeking: information gain is maximized by looking across to the other gamble. Even more surprisingly, participants refixated recently seen items before all items had been explored. For example, sometimes both the second and third acquisitions were "within-option" movements (**Figure 2D**, "WWA"). Contrary to this, pure information-seeking mandates that attention go preferentially to previously unseen items. Refixations ought *not* to occur until after all items have been visited, even accounting for memory limitation or "decay of information." The unconstrained decision time in our task might have favored this suboptimal behavior in the first few saccades. In contrast, an information-seeking policy does explain later fixations (**Figure 9A**).

We suggest that more elaborate models of information acquisition may be needed to explain these findings, and consider three possible extensions. First, the information-accrual rate [parameter *k*<sup>1</sup> in Equation (1)] may not be constant through the decision period; in particular, it might be low for the initial acquisitions, which would also explain the longer initial fixations (**Figure 4B**). A second, more intriguing possibility is that it is easier to integrate the probability and stake of an option when they are seen consecutively, perhaps reflecting a cost for shifting the focus of attention to items within working memory (Oberauer, 2002) or a cost for switching object files (Treisman et al., 1983). This cost could appear as an additional term in the shifting rule [Equation (10)]. Our present data are not sufficient to distinguish these possibilities, but we note that "WW" patterns were commonest when fixating a high probability first, indicating that order of acquisition influences ease of integration. A third possibility is that saccades are not chosen to maximize information at the next movement, but rather, a whole sequence of subsequent saccades is chosen, to maximize information gain over several fixations. If we were to include decay into the updating model, fixation order would make a difference to information, possibly resulting in a different optimal strategy. Our current model assumes some form of bounded rationality, since we ignore the possibility of planning sequences.

Third, how much of attentional control can be explained by *EV* and information? The results showed that attention was significantly attracted to information salience early in a trial, and to high *EV* later in a trial. However, our maximal prediction accuracy was only 62% for information-seeking, and 61% for *EV*-seeking (**Figure 9**). Could other factors guide attention in our task? Of note, participants did not always choose the higher *EV*, and the final acquisition went to the chosen option on 80% of trials (**Figure 5**). It is likely that subjective preferences involve a more complex notion of utility than simple *EV*, for example incorporating risk preference or probability discounting (Kahneman and Tversky, 1979). These extra factors probably also contribute to attentional guidance *before* a choice.

In calculating whether participants fixated the most informative location, we took σ as constant. That is, we did not include the effect of fixation durations or decay, which would involve making assumptions about the information acquisition rate and forgetting rate. In particular, we did not fit any parameters to individual participants' performance. Information acquisition rate and forgetting rate may well vary from person to person (Colzato et al., 2010). On top of these factors, attentional guidance might itself be noisy. For example, a softmax rule (Luce, 1977) could be used to determine the next fixation location given the *EV*s and information gains. The observed transition from information salience to reward salience bears similarities with longer term switches between *exploration and exploitation* seen under risk (Daw et al., 2006; Cohen et al., 2007). In cases where information increases due to learning, the proportion of "noisy" choices that are *not* guided by value (i.e., the temperature) would decrease over time (Sutton, 1991; Carmel and Markovitch, 1999). In our case, rather than switching from random to model-driven choice, attention switches from uncertainty-seeking to reward-seeking.
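To make the idea of noisy guidance concrete, the following minimal sketch shows how a softmax rule could map a weighted mix of information gain and estimated *EV* onto fixation probabilities. It is an illustration only: the linear weighting by `alpha`, the `temperature` parameter, the function name, and all numeric values are assumptions introduced here, not quantities fitted or reported in the study.

```python
import math

def fixation_probabilities(info_gain, ev, alpha, temperature):
    """Softmax over a weighted mix of information gain and estimated EV.

    alpha = 1 gives purely information-driven guidance and alpha = 0 purely
    EV-driven guidance (a hypothetical weighting, for illustration only).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scores = [alpha * g + (1.0 - alpha) * v for g, v in zip(info_gain, ev)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up values for four display locations.
info_gain = [0.9, 0.1, 0.5, 0.5]
ev = [0.2, 0.8, 0.4, 0.6]

# Early in a trial (alpha near 1) the most uncertain location is favored;
# late in a trial (alpha near 0) the highest-EV location is favored.
early = fixation_probabilities(info_gain, ev, alpha=0.9, temperature=0.2)
late = fixation_probabilities(info_gain, ev, alpha=0.1, temperature=0.2)
```

Lowering `temperature` makes the rule approach a deterministic choice of the highest-scoring location, whereas raising it makes fixations approach a uniform random choice.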

Finally, throughout our analysis, we have made two assumptions: saccades are a relatively direct index of how attention is directed, and attention is focused rather than divided. Attention dissociates from eye movements in experimental conditions of enforced fixation (Juan et al., 2004); however, saccades probably entail movements of attention under most conditions (Sheliga et al., 1994; Corbetta, 1998; McPeek et al., 1999). In our displays, participants would be unable to perceive numerals that are not within a couple of degrees of fixation, as we established in pilot experiments. This enforced a serial strategy, in which dividing attention could not have been beneficial. We expect that refixations would be greatly reduced if this serial constraint were lifted, because dividing attention could facilitate both integration across dimensions and comparison within a dimension.

## **DECISION BIASES DUE TO ATTENTION**

Attention influences the decision process in a number of ways. Selecting stimulus features boosts their contribution in the stochastic progression of an ongoing decision process (Roe et al., 2001; Usher and McClelland, 2001; Kim et al., 2012). Attention may highlight supporting evidence for the favored option, generating attentional shifts within an option rather than between options (Glöckner and Herbold, 2011), but also reflecting whether a decision involves component-wise comparison or integration of value (Arieli et al., 2011). Counterproductively, attention biases choices in favor of the attended option (Krajbich et al., 2010; Brandstätter, 2011), and its influence on choice can be modeled as leaky integration of value over time, with a bias toward the attended item. These approaches show that attention powerfully modulates choice, but fail to explain how attention is itself guided.

Sampling theories make predictions about how we acquire information from the options available before a choice (Stewart et al., 2006). According to decision field theory, attention under risk is drawn in proportion to probabilities (Roe et al., 2001). But such a scenario would make attention highly inefficient at obtaining information. Optimal information gathering should not simply attend to the higher probability or expected value; rather, attention should seek uncertain options whose distribution of value has a high entropy.

It seems counterintuitive, however, to choose an option that is not being attended. Indeed, participants generally choose the option they were last attending to unless that option is much worse than the other one (Shimojo et al., 2003; Krajbich et al., 2010). But why should this be? A parsimonious explanation of this phenomenon is to regard attention as a form of foraging. Rather than deciding which item is better, decisions are made by an "engage or search" strategy. During the course of a single decision, attentional allocation dynamically switches from information-seeking to value-seeking (**Figure 9**). This accounts for the correlation of final saccades with both *EV* and choice (**Figure 5**). The decision to engage, that is, to accept or reject the currently attended option, might be subserved by a drift-diffusion model similar to that of Krajbich et al. (2012), which is driven by the difference between attended and unattended items.

But can we also explain the bias for choosing the initially fixated option (**Figure 5**)? Information foraging predicts that after visiting all four locations, participants should refixate the first item they saw. At the same time, choice-by-foraging suggests that we choose whether or not to go for the currently fixated item, at each acquisition. Therefore, if participants begin to choose too soon (i.e., by engaging rather than searching), we might expect the first item seen to be selected. According to this view, the first-viewed bias might be explained by premature engagement with the currently viewed option, perhaps linking reflection impulsivity to motor impulsivity (Evenden, 1999).

## **PREDICTIONS OF THE INFORMATION FORAGING ACCOUNT**

The foraging view of decisions suggests that as information is "depleted from the environment" (or rather, as the precision of our internal estimates approaches that of the environment), information salience no longer drives attention. At this point attention becomes driven by the estimated values. This makes two strong neurophysiological predictions. First, reward signals should propagate from stimulus-value regions early in a trial, to attentional regions later in the trial. Thus, one prediction might be that value-sensitive brain regions, such as OFC (Padoa-Schioppa and Assad, 2006; Kennerley and Wallis, 2009), encode the decision variables for each option as information is accrued, but once information acquisition begins to saturate (**Figure 1**), value signals propagate to parietal and oculomotor regions, biasing attention (Bisley and Goldberg, 2010). This permits a decision to accept or reject the currently fixated option, perhaps involving dorsomedial prefrontal cortex (Hayden et al., 2011; Kolling et al., 2012).

Second, in order to support information foraging, the most uncertain items in a display must compete for attention. Neural signals proportional to the lack of information or uncertainty should compete spatially, weighted by expectations of what information is available in the environment. Importantly, such competition would require not simply representation of a probability density, but rather an *explicit* representation of the uncertainty signal (Fiorillo et al., 2003; Knill and Pouget, 2004). Although uncertainty signals have been found in medial prefrontal regions (Grinband et al., 2006), as well as OFC (Hsu et al., 2005; Tobler et al., 2007; Kepecs et al., 2008; Schultz et al., 2008), the cellular representation of uncertainty remains unclear. We expect that during a decision, competition between such signals guides attentional selection.

## **CONCLUSION**

We used a freely-viewed choice between two gambles to examine the effects of risk and *EV* on the guidance of attention. We found that attention was initially drawn to uncertainty, and specifically depended on how the numbers seen determined uncertainty about *EV*. Toward the end of the trial, attention was drawn toward the higher *EV*, and eventually predicted choice. This suggests that attention is drawn by information-salience early in trials, and by reward-salience later in trials. We hypothesize that this reflects that choices are in fact made by a foraging mechanism of successively rejecting or accepting the currently attended option, a process which converges on the highest valued option.

## **ACKNOWLEDGMENTS**

The research was funded by a Wellcome Trust Clinical Research Training Fellowship to Sanjay Manohar. We would like to thank Steve Kennerley and Laurence Hunt for discussion that inspired the experiment. We would also like to thank our reviewers for detailed and insightful discussion.

## **REFERENCES**

spatial attention and action choices. *Exp. Brain Res.* 230, 291–300. doi: 10.1007/s00221-013-3654-6


*Dir. Psychol. Sci.* 19, 51–57. doi: 10.1177/0963721409359277


attention test: impairment in single-feature but not dual-feature visual search. *Arch. Clin. Neuropsychol.* 12, 621–634. doi: 10.1016/S0887-6177(97)00011-5


Hayden, B. Y., Pearson, J. M., and Platt, M. L. (2011). Neuronal basis of sequential foraging decisions in a patchy environment. *Nat. Neurosci.* 14, 933–939. doi: 10.1038/nn.2856

Hickey, C., Chelazzi, L., and Theeuwes, J. (2010). Reward changes salience in human vision via the anterior cingulate. *J. Neurosci.* 30, 11096–11103. doi: 10.1523/JNEUROSCI.1026-10.2010


with goal-directed choice. *Curr. Opin. Neurobiol.* 20, 262–270. doi: 10.1016/j.conb.2010.03.001


decision in the parietal cortex (area LIP) of the rhesus monkey. *J. Neurophysiol.* 86, 1916–1936.


animats," in *Proceedings of the First International Conference on Simulation of Adaptive Behavior on From Animals to Animats*, 288–296. Available online at: http://webdocs.cs.ualberta.ca/~sutton/papers/sutton-91-SAB.pdf.gz


Neural circuit dynamics underlying accumulation of time-varying evidence during perceptual decision making. *Front. Comput. Neurosci.* 1:6. doi: 10.3389/neuro.10.006.2007


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 07 October 2013; published online: 05 November 2013.*

*Citation: Manohar SG and Husain M (2013) Attention as foraging for information and value. Front. Hum. Neurosci. 7:711. doi: 10.3389/fnhum.2013.00711*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Manohar and Husain. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Reward predictions bias attentional selection

## *Brian A. Anderson\*, Patryk A. Laurent † and Steven Yantis*

*Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Clayton M. Hickey, VU University Amsterdam, Netherlands*

*Chiara D. Libera, University of Verona, Italy*

#### *\*Correspondence:*

*Brian A. Anderson, Department of Psychological and Brain Sciences, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD 21218-2686, USA e-mail: bander33@jhu.edu*

*†Present address:*

*Brain Corporation, San Diego, USA*

Attention selects stimuli for perceptual and cognitive processing according to an adaptive selection schedule. It has long been known that attention selects stimuli that are task relevant or perceptually salient. Recent evidence has shown that stimuli previously associated with reward persistently capture attention involuntarily, even when they are no longer associated with reward. Here we examine whether the capture of attention by previously reward-associated stimuli is modulated by the processing of current but unrelated rewards. Participants learned to associate two color stimuli with different amounts of reward during a training phase. In a subsequent test phase, these previously rewarded color stimuli were occasionally presented as to-be-ignored distractors while participants performed visual search for each of two differentially rewarded shape-defined targets. The results reveal that attentional capture by formerly rewarded distractors was the largest when both recently received and currently expected reward were the highest in the test phase, even though such rewards were unrelated to the color distractors. Our findings support a model in which value-driven attentional biases acquired through reward learning are maintained via the cognitive mechanisms involved in predicting future rewards.

**Keywords: selective attention, attentional capture, reward learning, reward prediction, incentive salience**

## **INTRODUCTION**

Perception is limited in its representational capacity, which gives rise to the need to perceive stimuli selectively. Selective attention controls the availability of stimuli for cognition, decision making, and action (Desimone and Duncan, 1995). Recent evidence reviewed below suggests that attentional priority is influenced by prior associations between stimuli and reward, as well as by the current reward value of stimuli. By attending to stimuli associated with the delivery of reward (e.g., nutrients), organisms maximize the opportunity to procure valuable resources that are critical to their survival and wellbeing.

The voluntary deployment of attention can be influenced by the reward value of stimuli. For example, visual search is more efficient for targets associated with the delivery of reward (e.g., Kiss et al., 2009; Kristjansson et al., 2010). Targets associated with high reward are also more robustly represented in early visual areas of the brain (Serences, 2008; Serences and Saproo, 2010).

Certain stimuli capture attention involuntarily, including physically salient stimuli (e.g., Yantis and Jonides, 1984; Theeuwes, 1992, 2010) and stimuli possessing goal-related features (e.g., Folk et al., 1992; Anderson and Folk, 2012). Recent evidence demonstrates that attention is also captured by previously rewarding stimuli. The recent delivery of high reward primes attention to a reward-associated stimulus (Hickey et al., 2010, 2011). Furthermore, we have shown that stimuli persistently capture attention after repeated pairings with reward, even when they are no longer rewarded and are otherwise inconspicuous and task-irrelevant (Anderson et al., 2011b; Anderson and Yantis, 2012, 2013).

Over the past two decades, much has been learned about the underlying neurobiology of reward. Learned reward predictions are represented in the basal ganglia (BG), such that the onset of reward-associated stimuli elicits the release of dopamine (DA) from BG neurons (Schultz et al., 1997; Waelti et al., 2001). It is also known that unexpected reward elicits phasic DA release. Once an organism learns that a stimulus predicts reward, the receipt of the expected reward no longer produces DA release in response to the reward; instead, the omission of the expected reward depresses DA activity (Schultz et al., 1997). Thus, phasic DA activity is thought to convey a signal that represents both reward prediction and reward-prediction error.

The relationship between the underlying mechanisms for processing reward and for biasing attention in favor of reward-associated stimuli is unknown. One possibility is that reward motivates the recruitment of different cognitive processes, such as memory storage and perceptual learning, in order to establish and maintain attentional biases that prove adaptive in promoting reward procurement. By such an account, value-driven attentional biases are maintained independently of the cognitive architecture that subserves reward processing. Another possibility is that value-driven attentional priority is represented and signaled by the reward processing system, which is sensitive to current reward predictions. In the present study, we adjudicate between these two accounts by measuring the magnitude of attentional capture by previously reward-associated stimuli when different levels of reward were predicted from the current task.

The experiment was modeled after the experiments of Anderson et al. (2011b) and included a training phase and a test phase. In the training phase, participants learned to associate each of two color stimuli with different amounts of monetary reward (see **Figure 1A**). The training phase was followed by a test phase that was a modified version of the additional singleton paradigm (Theeuwes, 1992) in which the target of visual search was a shape singleton (see **Figure 1B**). Reward feedback was also provided in the test phase, and one of the shape targets (e.g., a unique circle among diamonds) was probabilistically associated with more reward than the other. This reward structure allowed participants to experience reward prediction and reward-prediction error on each trial. Each item in the test phase was rendered in a different color, but color was not relevant to the task. However, one of the non-targets was sometimes rendered in a color that was associated with reward during the preceding training phase. This design allowed us to assess the magnitude of value-driven attentional capture by previously rewarded colors in the test phase, as a function of both reward prediction and reward-prediction error. The hypothesis that value-driven attentional priority is represented and signaled by the reward processing system predicts that value-driven attentional capture should be maximal when these reward signals are the largest, even though current rewards are unrelated to the previously reward-associated stimuli.

## **MATERIALS AND METHODS**

## **PARTICIPANTS**

Sixteen participants were recruited from the Johns Hopkins University community. All were screened for normal or corrected-to-normal visual acuity and color vision. Participants were provided monetary compensation based on performance that varied between \$12 and \$15 (mean = \$13.44). All procedures were approved by the Johns Hopkins University Institutional Review Board and all participants provided informed consent.

## **APPARATUS**

A Mac Mini equipped with Matlab software and Psychophysics Toolbox extensions was used to present the stimuli on a Dell P991 monitor. The participants viewed the monitor from a distance of approximately 50 cm in a dimly lit room. Manual responses were entered using a 101-key US layout keyboard.

## **STIMULI**

Each trial consisted of a fixation display, a search array, and a feedback display (see **Figure 1**). The fixation display contained a white fixation cross (0.5° × 0.5° visual angle) presented in the center of the screen against a black background, and the search array consisted of the fixation cross surrounded by six shape stimuli (each with a diameter of 2.3° visual angle) placed at equal intervals on an imaginary circle with a radius of 5°. The six shapes were each rendered in a different color (red, green, blue, cyan, pink, orange, yellow, or white).
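As a rough illustration of this display geometry, the sketch below computes six equally spaced item centers on a circle of radius 5° and converts a visual angle into an on-screen size at the approximate 50 cm viewing distance reported under Apparatus. The starting angle of the first item and the function names are assumptions made for illustration; the text does not specify them, and treating degrees of visual angle as flat screen coordinates is only an approximation.

```python
import math

VIEWING_DISTANCE_CM = 50.0  # approximate viewing distance reported under Apparatus
RADIUS_DEG = 5.0            # radius of the imaginary circle
N_ITEMS = 6                 # number of shapes in the search array

def item_positions_deg(n_items=N_ITEMS, radius=RADIUS_DEG, start_angle_deg=90.0):
    """Item centers (x, y) in degrees of visual angle, equally spaced on a circle.

    start_angle_deg is an assumption; the text does not say where the first item sits.
    """
    step = 360.0 / n_items
    positions = []
    for i in range(n_items):
        theta = math.radians(start_angle_deg + i * step)
        positions.append((radius * math.cos(theta), radius * math.sin(theta)))
    return positions

def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent on the screen of a stimulus subtending angle_deg."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

positions = item_positions_deg()
shape_diameter_cm = visual_angle_to_cm(2.3)  # roughly 2 cm at 50 cm
```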

During the training phase, all six of the shapes were circles and the target was defined as the red or green circle (exactly one of which was presented on each trial). During the test phase, the six shapes consisted of either a diamond among circles or a circle among diamonds, and the target was defined as the unique shape. On 25% of the trials in the test phase, one of the non-target shapes was red and on another 25% of the trials, one of the non-target shapes was green; these constituted the formerly rewarded distractors (these two non-target shapes are referred to as "distractors"). The target was never red or green during the test phase.

Inside the target shape, a white line segment was oriented either vertically or horizontally, and inside each of the non-targets, a white line segment was tilted at 45° to the left or to the right (randomly for each non-target). The participant was required to report whether the orientation of the line segment inside the target shape was vertical or horizontal with a corresponding key press. Correct responses were followed in both phases of the experiment by a feedback display that informed participants of the monetary reward earned on that trial, as well as the total reward accumulated thus far in the experiment.

#### **DESIGN AND PROCEDURE**

The experiment consisted of 240 trials during each of the two phases, for a total of 480 trials. Participants completed 50 practice trials prior to the training phase, and 20 practice (distractor-absent) trials prior to the test phase; behavioral data from these practice trials were not included in any analysis. In the training phase, target identity and target location were fully crossed and counterbalanced, and trials were presented in a random order. In the test phase, target identity, target location, distractor identity, and distractor location were fully crossed and counterbalanced, and trials were presented in a random order. Thus, in the test phase, the presence and identity of the distractor provided no predictive information concerning the target or reward.

In both the training and test phase, one of the two targets (e.g., red during training and diamond singleton at test) was followed by a high reward on 80% of correct trials and a low reward on the remaining 20%; the percentages were reversed for the low-reward target. High and low rewards were 6 and 2¢, respectively, during the training phase and 3 and 1¢ during the test phase (higher rewards were used in the training phase to maximize the learning of the color–reward associations). Incorrect responses or responses that were too slow were followed by feedback indicating 0¢ had been earned. Which color target and shape-singleton target was associated with high reward in their respective phase of the experiment was counterbalanced across participants, such that each combination of color and shape was used equally often. Participants were not informed of the reward contingencies, which had to be learned through experience in the task. Upon completion of the experiment, participants were given the cumulative reward they had earned.
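The reward schedule above implies a fixed long-run expected reward per correct trial for each target. The short sketch below works through that arithmetic using the probabilities and amounts stated in the text; the function and variable names are introduced here for illustration only.

```python
def expected_reward(p_high, high_cents, low_cents):
    """Expected reward (in cents) on a correct trial, given the probability of the high payout."""
    return p_high * high_cents + (1.0 - p_high) * low_cents

# Training phase: high = 6 cents, low = 2 cents.
training = {
    "high_reward_target": expected_reward(0.8, 6, 2),  # about 5.2 cents
    "low_reward_target": expected_reward(0.2, 6, 2),   # about 2.8 cents
}

# Test phase: high = 3 cents, low = 1 cent.
test_phase = {
    "high_reward_target": expected_reward(0.8, 3, 1),  # about 2.6 cents
    "low_reward_target": expected_reward(0.2, 3, 1),   # about 1.4 cents
}
```

Under this schedule the high-reward target is worth a little under twice as much per correct trial as the low-reward target in both phases.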

Each trial began with the presentation of the fixation display for a randomly varying interval of 400, 500, or 600 ms. The search array then appeared and remained on screen until a response was made or the trial timed out. Trials timed out after 800 ms in the training phase and 1200 ms in the test phase. The search array was followed by a blank screen for 1000 ms, the reward feedback display for 1500 ms, and a 1000 ms intertrial interval.

Participants made a forced-choice target identification by pressing the "z" and the "m" keys for the vertically- and horizontally-orientated targets, respectively. If the trial timed out, the computer emitted a 500 ms, 1000 Hz feedback tone.

## **DATA ANALYSIS**

Only correct responses were included in all analyses of RT, and all RTs more than three standard deviations above or below the mean of their respective condition for each participant were excluded.
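A minimal sketch of this trimming rule is given below. It assumes the rule is applied separately to each participant-by-condition cell and that the sample standard deviation is used; the text does not say whether the sample or population estimate was used, and the function name and example data are hypothetical.

```python
from statistics import mean, stdev

def trimmed_correct_rts(rts, correct, n_sd=3.0):
    """Return correct-trial RTs lying within n_sd SDs of their cell mean.

    rts and correct are parallel lists for ONE participant in ONE condition.
    """
    kept = [rt for rt, ok in zip(rts, correct) if ok]
    if len(kept) < 2:
        return kept  # the SD is undefined for fewer than two observations
    m = mean(kept)
    sd = stdev(kept)  # sample SD: an assumption, the text does not specify
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]

# Made-up data: 20 typical RTs (ms), one very slow RT, and one error trial.
rts = [500, 510, 490, 505, 495] * 4 + [2000, 480]
correct = [True] * 21 + [False]
clean = trimmed_correct_rts(rts, correct)
```

Note that with very few trials in a cell no single value can lie three sample standard deviations from the mean, so the rule only has an effect when cells contain a reasonable number of observations.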

## **RESULTS**

## **TRAINING PHASE**

There were no significant differences in RT [*t*(15) = −0.16, *p* = 0.877] or accuracy [*t*(15) = −1.04, *p* = 0.316] to report a high-reward target compared to a low-reward target (means for high-reward target: 537 ms, 90.0%; means for low-reward target: 536 ms, 91.1%). There were also no significant differences in RT [*t*(15) = 1.81, *p* = 0.091] or accuracy [*t*(15) = −1.14, *p* = 0.272] based on the color of the target (means for red: 534 ms, 91.2%; means for green: 539 ms, 89.9%). In our prior studies on reward and attention, participants have generally been faster to respond to high-reward targets than to low-reward targets (Anderson et al., 2011a, 2012; Anderson and Yantis, 2012). The present results suggest that top-down attentional control dominated performance in the training phase, such that participants searched for the two target colors with approximately equal priority. The reward feedback allowed participants to learn the color–reward contingencies, however, and the effects of these contingencies on performance in the test phase were of primary interest.

### **TEST PHASE**

We first compared RT and accuracy for trials containing a high-reward target compared to a low-reward target, as in the training phase. Again, RT [*t*(15) = −0.28, *p* = 0.785] and accuracy [*t*(15) = −0.26, *p* = 0.798] did not differ based on the value of the target (means for high-reward target: 673 ms, 89.8%; means for low-reward target: 663 ms, 90.6%). There was a highly significant effect of target shape, such that participants were substantially faster and more accurate to report circle-singleton targets compared to diamond-singleton targets [for RT: mean difference = 130 ms, *t*(15) = 14.26, *p* < 0.001, *d* = 3.57; for accuracy: mean difference = 8.9%, *t*(15) = 4.59, *p* < 0.001, *d* = 1.15]. This suggests that the circle singleton was more physically salient than the diamond singleton.

Next, to assess the effect of distractor presence, trials during the test phase were sorted according to whether they contained a non-target formerly associated with high reward (high-value distractor, 25% of trials), a non-target formerly associated with low reward (low-value distractor, 25% of trials), or neither (50% of trials). A repeated measures ANOVA revealed that RT in the three distractor conditions differed significantly [**Table 1**, *F*(2, 30) = 16.63, *p* < 0.001, η<sub>p</sub><sup>2</sup> = 0.526]. Neither the color that was associated with high reward during training [*F*(2, 24) = 2.93, *p* = 0.073] nor the shape singleton that was associated with high reward at test (*F* < 1) interacted significantly with the effect of distractor condition on RT, and the three-way interaction was also not significant [*F*(2, 24) = 1.08, *p* = 0.357], so we collapsed across these two factors. A *post-hoc* contrast revealed that RT was slower when a previously rewarded color distractor was present compared to the distractor-absent condition, indicating the presence of value-driven attentional capture by formerly rewarded but now irrelevant colors [*t*(15) = 5.72, *p* < 0.001, *d* = 1.72]; RT did not differ between the high- and low-value distractor conditions [*t*(15) = −0.63, *p* = 0.537], and the distractors captured attention regardless of their color [both *t*s > 4.50, *p*s < 0.001]. Accuracy did not differ significantly among the three conditions (*F* < 1), nor did the effect of distractor condition on accuracy interact with the color associated with high reward during training [*F*(2, 24) = 2.03, *p* = 0.153] or the shape singleton associated with high reward at test (*F* < 1), and the three-way interaction was also not significant (*F* < 1).
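For readers who wish to relate the reported effect size to the test statistic, partial eta squared can be recovered from an *F* ratio and its degrees of freedom using the standard identity η<sub>p</sub><sup>2</sup> = (*F* × df<sub>effect</sub>) / (*F* × df<sub>effect</sub> + df<sub>error</sub>). The sketch below applies this identity to the distractor-condition ANOVA; the function name is introduced here for illustration and is not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Distractor-condition effect on RT: F(2, 30) = 16.63
eta_distractor = partial_eta_squared(16.63, 2, 30)  # about 0.526
```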

According to reward-prediction error accounts, a representation of current expected reward develops based on a trial's former context (e.g., Nakahara et al., 2004). We therefore next examined how the magnitude of predicted reward on a given trial (based on the target's shape) modulated the degree to which the formerly rewarded color distractors captured attention. The predicted reward on a given trial was defined as the mean reward received over the previous five trials in which the current shape-singleton target served as the target. This computed value was used rather than the actual reward probabilities assigned to the singleton target, as previous research has shown that participants are highly sensitive to recent reward history (e.g., Serences, 2008), and this method better accounts for trials in the early part of the test phase in which participants have had little experience with the current reward contingencies. The mean reward received in the last five trials is, of course, highly correlated with the actual reward probabilities, but estimated value can vary considerably given the stochastic fluctuations in actual reward delivery, and so this method provides a potentially more sensitive index of the influence of experienced value on performance. We calculated value-driven attentional capture (slowing of RT on distractor-present vs. distractor-absent trials) separately for trials on which the current shape singleton's predicted reward fell into one of four equally spaced ranges, as shown in **Figure 2A**. The magnitude of value-driven capture differed significantly for different amounts of predicted reward [*F*(3, 45) = 2.96, *p* = 0.042, η<sub>p</sub><sup>2</sup> = 0.165], and the data were well accounted for by a linear trend in which the magnitude of capture becomes greater as predicted reward increases [*F*(1, 15) = 6.97, *p* = 0.019, η<sub>p</sub><sup>2</sup> = 0.317].
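The predicted-reward measure described above can be illustrated with a minimal sketch. The function names, the reward values, and the handling of trials with fewer than five prior same-shape trials (here: average whatever history exists) are assumptions made for illustration, not details taken from the study.

```python
from collections import defaultdict, deque


def predicted_rewards(trials, window=5):
    """Running predicted reward per trial.

    trials: iterable of (target_shape, reward_received) in presentation order.
    Returns one value per trial: the mean reward over the previous `window`
    trials on which the same shape was the target, or None with no history.
    Averaging over a shorter history early on is an assumption of this sketch.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    predictions = []
    for shape, reward in trials:
        past = history[shape]
        predictions.append(sum(past) / len(past) if past else None)
        past.append(reward)  # update only after predicting the current trial
    return predictions


def reward_bin(value, low, high, n_bins=4):
    """Index (0-based) of the equally spaced range that `value` falls into."""
    width = (high - low) / n_bins
    return min(int((value - low) / width), n_bins - 1)


# Hypothetical reward values (arbitrary units) for two shape-singleton targets.
example = [("circle", 10), ("diamond", 2), ("circle", 2), ("circle", 10)]
print(predicted_rewards(example))
```

Capture magnitude would then be computed within each bin as the mean RT on distractor-present trials minus the mean RT on distractor-absent trials.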

RT on distractor-absent trials did not differ based on predicted reward (*F <* 1), meaning that the observed changes in the magnitude of value-driven attentional capture as a function of predicted reward were not the consequence of baseline shifts in RT (the mean RTs on distractor-absent trials as a function of increasingly high predicted reward were 652, 646, 640, and 664 ms). Neither did the magnitude of value-driven attentional capture differ based on the mean reward received over the previous 5 trials without respect to the current target (*F <* 1), meaning that the effect of predicted reward on attentional capture did not reflect a more global consequence of recently received rewards.

In addition to the mean reward received over the last few trials on which a given stimulus was a target, another potentially salient reward-related signal concerns recent reward-prediction error. Positive reward-prediction error occurs when more reward is received than predicted, and negative reward-prediction error when less reward is received than predicted. Reward-prediction errors are thought to provide a teaching signal that adjusts subsequent reward predictions to reduce the discrepancy between previously predicted and received reward (e.g., Waelti et al., 2001).

Therefore, the representation of reward on a given trial can be expressed in terms of the reward-prediction error realized on the preceding trial, with the magnitude being larger following positive reward-prediction error and smaller following negative reward-prediction error.

A positive reward-prediction error was taken to occur when participants received a high reward following a singleton target that typically yields low reward, and a negative reward-prediction error was taken to occur when participants received a low reward following a singleton target that usually yields high reward. We found that the magnitude of value-driven attentional capture differed significantly following the three possible outcomes of reward prediction on the previous trial (positive prediction error, negative prediction error, and no prediction error) [**Figure 2B**, *F*(2, 30) = 4.63, *p* = 0.018, η<sub>p</sub><sup>2</sup> = 0.236]. In particular, value-driven capture was significantly greater following a positive reward-prediction error than following a negative reward-prediction error [*t*(15) = 2.63, *p* = 0.019, *d* = 0.66]; the former produced substantial value-driven attentional capture, while the latter produced no evidence of attentional capture. RT on distractor-absent trials did not differ based on the reward-prediction error on the preceding trial (*F* < 1), meaning that the observed changes in the magnitude of value-driven attentional capture were not the consequence of baseline shifts in RT (the mean RTs on distractor-absent trials following negative, no, and positive reward-prediction error were 658, 653, and 641 ms, respectively). This provides further evidence that the attentional bias toward stimuli with learned value varies as a function of ongoing task-related reward processing.
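The trial classification just described can be sketched as follows. The field names (`typical`, `received`, `distractor`, `rt`) and the function names are hypothetical, and the example data are invented solely to show the bookkeeping: each trial is labeled by the prediction error on the preceding trial, and capture is the present-minus-absent RT difference within each label.

```python
from statistics import mean


def classify_rpe(typical, received):
    """Label the outcome of one trial relative to what its target usually yields."""
    if received == "high" and typical == "low":
        return "positive"
    if received == "low" and typical == "high":
        return "negative"
    return "none"


def capture_by_previous_rpe(trials):
    """Capture (present minus absent mean RT) grouped by the preceding trial's RPE.

    trials: list of dicts with keys 'typical', 'received', 'distractor', 'rt'.
    Labels lacking either distractor-present or distractor-absent trials are skipped.
    """
    rts = {}
    for prev, cur in zip(trials, trials[1:]):
        label = classify_rpe(prev["typical"], prev["received"])
        cells = rts.setdefault(label, {True: [], False: []})
        cells[cur["distractor"]].append(cur["rt"])
    return {
        label: mean(cells[True]) - mean(cells[False])
        for label, cells in rts.items()
        if cells[True] and cells[False]
    }


# Invented example data, not values from the experiment.
demo = [
    {"typical": "low", "received": "high", "distractor": False, "rt": 600},
    {"typical": "high", "received": "high", "distractor": True, "rt": 700},
    {"typical": "low", "received": "high", "distractor": False, "rt": 650},
    {"typical": "high", "received": "low", "distractor": False, "rt": 640},
    {"typical": "high", "received": "high", "distractor": True, "rt": 660},
]
print(capture_by_previous_rpe(demo))
```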

## **DISCUSSION**

Attention selects stimuli for perceptual and cognitive processing. By attending to stimuli associated with the delivery of reward, organisms maximize the opportunity to procure valuable resources. We have previously shown that valuable stimuli capture attention involuntarily (Anderson et al., 2011a,b, 2012; Anderson and Yantis, 2012, 2013). The present study tested the hypothesis that this attentional bias for valuable stimuli is maintained via the cognitive mechanisms involved in processing rewards.

Using two different measures of ongoing reward processing, we found strong influences of both currently expected reward and recent reward-prediction error on the magnitude of value-driven attentional capture by formerly rewarded distractors. The greater the reward prediction on a given trial, the greater the distraction caused by previously rewarded but currently irrelevant stimuli. This finding is surprising because one might hypothesize participants to be most resistant to distraction when a high-reward target was available to motivate goal-directed performance. Value-driven attentional capture was also more pronounced following positive reward-prediction error (i.e., when more reward was received than expected) than following negative reward-prediction error. This finding is also somewhat surprising because one might hypothesize that the reward-prediction errors would increase attention to the target, rather than to the distractor. Instead, this result shows that when an unexpectedly high reward has been obtained, stimuli that predict high reward in both the current and past contexts tend to capture attention.

Interestingly, value-driven attentional capture was small or non-existent when predicted reward was low and recent reward-prediction error was negative, respectively. This contrasts with the magnitude of value-driven attentional capture typically observed without reward feedback during the test phase (e.g., Anderson et al., 2011a,b; Anderson and Yantis, 2012, 2013). This suggests that relatively small rewards are experienced as disappointing and result in a reduction in the attentional bias afforded to reward-associated stimuli, consistent with the small or even negative priming observed following a low reward (e.g., Della Libera and Chelazzi, 2006; Hickey et al., 2010, 2011).

These behavioral results suggest the existence of a common mechanism that represents both reward predictions and reward-prediction error, and signals incentive salience (i.e., attentional priority for formerly rewarded stimuli). One candidate for this mechanism is the phasic DA signal in the BG (Schultz et al., 1997; Waelti et al., 2001). This is consistent with recent evidence showing that the visual representation of a reward-associated cue is modulated by the receipt of unrelated reward and corresponding reward-related DA activity (Arsenault et al., 2013). If value-based attentional priority is signaled via mechanisms that overlap with the signaling of current reward, modulating the representation of current reward should produce concurrent modulations in value-driven attentional capture. By relating ongoing reward processing to value-driven attentional capture in this way, our findings provide further insight into the mechanisms underlying attentional capture by reward-associated stimuli, which, in turn, has important implications for theories linking reward learning to attentional control (e.g., Della Libera and Chelazzi, 2006, 2009; Serences, 2008; Peck et al., 2009; Raymond and O'Brien, 2009; Hickey et al., 2010, 2011; Krebs et al., 2010; Serences and Saproo, 2010; Anderson et al., 2011a,b, 2012; Della Libera et al., 2011; Anderson and Yantis, 2012, 2013; Hickey and van Zoest, 2012).

It is worth noting that in the present study, the magnitude of attentional capture by stimuli previously associated with reward did not depend on the magnitude of prior reward value experienced during training (i.e., RT did not differ between the high- and low-value distractor conditions). One possibility is that the reward associated with the color distractors was influenced by the reward received in the test phase, which was unrelated to color. Both color targets were associated with reward outcome in the training phase of the present study, and the extent to which persistent reward-related attentional biases acquired through learning should scale with the magnitude of prior reward is unclear. Previous studies show that reward associations play a direct and important role in the development of attentional biases for former targets (Anderson et al., 2011a,b, 2012; Anderson and Yantis, 2013), which, together with the observed influence of ongoing rewards, suggests that the observed attentional biases for former targets reflect an effect of reward history.

Our findings also provide further evidence for a mode of attentional control that is distinct from the well-documented stimulus-driven and goal-directed mechanisms (e.g., Folk et al., 1992; Theeuwes, 1992, 2010; Yantis, 2000; Connor et al., 2004). We show that previously reward-predictive but currently irrelevant stimuli capture attention even when they are not task relevant and not physically salient, replicating previous results (Anderson et al., 2011b; Anderson and Yantis, 2012, 2013). Our data also reveal that motivating current task goals with reward potentiates rather than minimizes attentional capture by previously valuable but currently irrelevant stimuli. If value-driven attentional capture merely reflected difficulty overcoming a previously motivated selection strategy, it would not be expected to be modulated in this way and might instead be better overcome by the motivation provided by currently expected reward. Thus, our results provide direct evidence that learned value influences attentional control in a way that does not depend on either physical salience or ongoing goals, and is instead mediated by the cognitive mechanisms involved in reward processing.

Attentional biasing of reward-associated stimuli is adaptive in many circumstances, facilitating the procurement of future rewards. However, previous reward learning and ongoing goals will at times conflict, as they do, for example, in the case of desired abstinence from a substance of abuse. Visual cues for a substance of abuse can involuntarily capture attention in drug-dependent populations (e.g., Lubman et al., 2000; Marissen et al., 2006; Field and Cox, 2008), much as the previously reward-associated distractors capture attention in the present study. This drug-related attentional bias is thought to play an important role in contributing to relapse following periods of abstinence (see Field and Cox, 2008, for a review). Our findings have implications for theories of such disordered attentional control in addiction by demonstrating that reward-related attentional biases are mediated specifically by the brain mechanisms involved in processing rewards, which are known to be directly affected by drugs of abuse (e.g., Berridge and Robinson, 1998).

## **REFERENCES**


*Psychol. Sci.* 17, 222–227. doi: 10.1111/j.1467-9280.2006.01689.x


## **ACKNOWLEDGMENTS**

We thank J. Halberda and V. Stuphorn for helpful comments and E. Wampler for assistance with data collection. The research was supported by U.S. National Institutes of Health grant R01-DA013165 to Steven Yantis and fellowship F31-DA033754 to Brian A. Anderson.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 March 2013; accepted: 23 May 2013; published online: 11 June 2013.*

*Citation: Anderson BA, Laurent PA and Yantis S (2013) Reward predictions bias attentional selection. Front. Hum. Neurosci. 7:262. doi: 10.3389/fnhum.2013.00262*

*Copyright © 2013 Anderson, Laurent and Yantis. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## What are task-sets: a single, integrated representation or a collection of multiple control representations?

#### *Dragan Rangelov<sup>1</sup>\*, Thomas Töllner<sup>1</sup>, Hermann J. Müller<sup>1,2</sup> and Michael Zehetleitner<sup>1</sup>*

*<sup>1</sup> Department Psychologie, Allgemeine und Experimentelle Psychologie I, Ludwig-Maximilians-Universität München, München, Germany <sup>2</sup> Birkbeck College, University of London, London, UK*

#### *Edited by:*

*Joy Geng, University of California Davis, USA*

#### *Reviewed by:*

*Ulrich Ansorge, University of Vienna, Austria Nachshon Meiran, Ben-Gurion University, Israel*

#### *\*Correspondence:*

*Dragan Rangelov, Department Psychologie, Allgemeine und Experimentelle Psychologie I, Ludwig-Maximilians-Universität München, Leopoldstr. 13, DE-80802 München, Germany e-mail: rangelov@psy.lmu.de*

Performing two randomly alternating tasks typically results in higher reaction times (RTs) following a task switch, relative to a task repetition. These task switch costs (TSC) reflect processes of switching between control settings for different tasks. The present study investigated whether task sets operate as a single, integrated representation or as an agglomeration of relatively independent components. In a cued task switch paradigm, target detection (present/absent) and discrimination (blue/green/right-/left-tilted) tasks alternated randomly across trials. The target was either a color or an orientation singleton among homogeneous distractors. Across two trials, the task and target-defining dimension repeated or changed randomly. For task switch trials, agglomerated task sets predict a difference between dimension changes and repetitions: joint task and dimension switches require full task set reconfiguration, while dimension repetitions permit re-using some control settings from the previous trial. By contrast, integrated task sets always require full switches, predicting dimension repetition effects (DREs) to be absent across task switches. RT analyses showed significant DREs across task switches as well as repetitions supporting the notion of agglomerated task sets. Additionally, two event-related potentials (ERP) were analyzed: the Posterior-Contralateral-Negativity (PCN) indexing spatial selection dynamics, and the Sustained-Posterior-Contralateral-Negativity (SPCN) indexing post-selective perceptual/semantic analysis. Significant DREs across task switches were observed for both the PCN and SPCN components. Together, DREs across task switches for RTs and two functionally distinct ERP components suggest that re-using control settings across different tasks is possible. The results thus support the "agglomerated-task-set" hypothesis, and are inconsistent with "integrated task sets."

**Keywords: task switching, task sets, attentional selection, perceptual processing, electroencephalography, executive control**

## **INTRODUCTION**

Surviving in an environment in which both internal and external conditions change dynamically presupposes an ability to change between control settings for old and new tasks. A successful switch implies that the set of expectations about the environment (the topic of the present special issue) which was relevant in the previous task episode is replaced by one appropriate for the task at hand. Such switching processes are usually investigated in paradigms in which two or more different tasks vary across trials, requiring a change, on task-switch trials, in the internal control settings so as to fit the current task requirements. In *cued task switching*, prior to the stimulus display's onset, a cue is presented specifying the task to be performed on the upcoming trial. Across two consecutive trials, the task can either repeat or change. Reaction times (RTs) and errors are typically elevated for task switches relative to repetitions (Allport et al., 1994; Rogers and Monsell, 1995). Such task switch costs (TSCs) imply the existence of extra, time-consuming processes invoked on task switch trials, but not (or to a lesser degree) on repetition trials. To comprehensively account for TSCs, answers to two related, yet separable questions are necessary: first, what cognitive mechanisms give rise to the TSCs, and, second, how are the representations on which these mechanisms operate organized? The present study focused on the latter issue—more precisely, on whether or not having to change *some* expectations automatically triggers a change in *all* expectations about task-relevant properties of the environment.

## **DETERMINANTS OF TSCs**

The available literature offers two dominant approaches to the question of what cognitive *mechanisms* give rise to TSCs. According to the first, TSCs reflect the extra time it takes to reconfigure control settings from the previously relevant to the currently relevant task demands (Monsell and Driver, 2000; Monsell et al., 2003). Reconfiguration is achieved by means of a special executive function (or set of functions) which is active on task switches and inactive on task repetitions. An alternative approach assumes that TSCs reflect interference between the previously relevant and the currently required control settings, which are concurrently active on task switch trials (Allport et al., 1994; Gilbert and Shallice, 2002; Waszak et al., 2003). TSCs arise because the interference is weaker on task repetition than on task switch trials. Finally, a hybrid, reconfiguration-interference account has also been proposed, postulating that TSCs reflect a mixture of reconfiguration and interference processes (Meiran, 1996, 2000, but see Meiran et al., 2008). Critically, irrespective of which mechanisms produce TSCs, all accounts assume that performance of a task is controlled by a set of representations that, following a task switch, are no longer appropriate. Thus, to meet the changed environmental demands, the control representations would have to change, too. Given this, the present study was designed to address the question of what *representations* change across task switches.

Conceptually, tasks can differ in all or some of the following respects: (i) criteria for spatial-attentional selection of the task-relevant stimulus; (ii) criteria for the identification of task-relevant stimulus properties; and (iii) task-appropriate stimulus-response (S-R) mappings. The set of cognitive representations specifying these criteria is considered collectively to constitute the task set. The available literature supports the notion of composite task set representations. For instance, Meiran (2000) demonstrated that while switching identification criteria can be performed in advance, actually performing the task is necessary for switching S-R mappings. Furthermore, Hübner et al. (2001) showed that the magnitude of TSCs increases with the number of task set components to be switched. Finally, a number of electrophysiological and imaging studies revealed neural correlates of a switch to co-vary with what task set component is being switched (Rushworth et al., 2002, 2005; Ravizza and Carter, 2008; Chiu and Yantis, 2009; Esterman et al., 2009; Hakun and Ravizza, 2012). In summary, the available evidence converges on the view that task sets consist of several dissociable representations controlling different cognitive processes in the stimulus-response chain. However, it remains unclear whether, on task switch trials, different components are changed relatively independently of each other, as would be predicted by the notion of "agglomerated task sets."

Studies investigating whether or not it is possible to change only those task set components that require a change and to reuse shared components across different tasks yielded somewhat inconsistent findings. For example, Arrington et al. (2003) asked their participants to report, on different trials, either the stimulus height, width, color, or luminance. Smaller TSCs were found for switches across similar tasks (e.g., switching from color to luminance discrimination) relative to switches across dissimilar tasks (e.g., from height to luminance discrimination), suggesting that some reusing of control representations across different tasks is possible. By contrast, Vandierendonck et al. (2008) had their participants discriminate either the parity (odd/even) or the magnitude (greater/smaller than four) of stimuli consisting of several identical digits (e.g., five instances of the digit three). On different trials, either the number of digits or the digits themselves were task-relevant. The critical comparison was between trials on which both the identification criterion (parity vs. magnitude) and the stimulus attribute (number vs. digit) switched, and trials on which only one criterion switched (e.g., from digit parity to magnitude discrimination). Although partial switches in principle allowed for old control representations to be reused, no difference was observed between full and partial switches, indicating that, following a task switch, *all* task-set components are reset. Finally, in a study very similar to that of Vandierendonck et al. (2008), Kleinsorge (2004; see also Kleinsorge and Heuer, 1999) observed partial repetition *costs*, that is, partial switches actually took more time to be implemented than full switches. Kleinsorge explained these findings by assuming a hierarchical organization of task sets, according to which having to change task set components situated earlier in the stimulus-response chain (e.g., identification criteria) would trigger a switch in all subsequent criteria (e.g., S-R mapping rules). In summary, the available literature suggests that following a task switch, task sets are sometimes reset, sometimes switched, and sometimes reused. It should be borne in mind, however, that the various studies used (i) different paradigms and (ii) different types of switches. These differences will be discussed in more detail in the General Discussion.

## **RESETTING, SWITCHING OR REUSING TASK-SET COMPONENTS**

Depending on what tasks precisely vary across trials, task switching could necessitate changes of either all components (full switches) or only those components in which the two tasks differ (partial switches). To illustrate, consider a paradigm in which stimulus displays consist of many identical items with one of them (the singleton target) being different in either color or orientation from the rest. On different trials, participants are either to simply detect the presence vs. absence of the singleton target or to discriminate its exact features, with a cue, presented prior to the display onset, announcing what task (detection or discrimination) is to be performed. Independently of the task sequence, the task-relevant dimension can also repeat or change. The dimension in which the singleton target is defined, although not informative about the response to be performed (i.e., knowing the dimension would not specify the exact response), would be informative for both spatial-attentional selection and post-selective identification processes.

Concerning target selection, the available evidence suggests that when the dimension repeats across trials, spatial selection is speeded relative to dimension changes, as elaborated in the Dimension-Weighting Account of Müller and colleagues (e.g., Found and Müller, 1996; Müller and Krummenacher, 2006; Müller, 2010). Since both singleton detection and singleton discrimination tasks would require spatial selection, spatial selection processes for *both* tasks would be sensitive to the singleton dimension. Accordingly, dimension repetition effects (DREs) would be expected across (task repetition and switch) trials of both detection and discrimination tasks.

In contrast to the shared spatial selection, post-selective identification processes should differ between the two tasks. On the one hand, fast and accurate singleton *detection* can be achieved by simply determining the presence/absence of a singleton, while information about what the singleton features precisely are is not strictly necessary for performing the task. Consistent with this, there is evidence that the response-irrelevant singleton features are not processed up to the level at which they become available for explicit report (Müller et al., 2004). On the other hand, encoding singleton features from a particular dimension is critical for the singleton *discrimination* task. With these differential task requirements in mind, it is likely that post-selective identification processes differ between tasks: the singleton dimension is important for identification in the discrimination, but not in the detection task (see Töllner et al., 2012b, for supporting EEG data).

Differences between detection and discrimination tasks would determine what *can be* and *is* reused across trials. Following performance of a target-present detection task, the dimension should have been encoded and thus be available for reuse only for spatial selection. Following a discrimination task, the dimension should have been encoded and thus be available for reuse for both the spatial selection and post-selective identification processes. Thus, the task on trial *n* − 1, or *prime* trial, determines what is *available* for reusing. By contrast, the task on trial *n*, or *probe* trial, determines what *is* reused: in the detection task, reusing dimension information would facilitate just spatial selection, while in the discrimination task reusing dimension information would facilitate both selection and identification processes. Consequently, on the hypothesis of agglomerated task sets, comparable DREs would be expected for detection → detection and detection → discrimination sequences, because across these sequences, only spatial selection criteria are available and reused. By contrast, following performance of the discrimination task, both spatial selection and identification criteria would be available for reuse, but they would be reused only on the current discrimination trial, predicting stronger DREs for discrimination → discrimination relative to discrimination → detection sequences.
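The logic of the agglomerated-task-set predictions can be restated compactly: what is reused on the probe trial is the overlap between what the prime task made available and what the probe task uses. The following is only an illustrative sketch of that reasoning; the dictionary and function names are hypothetical labels, not part of the experimental software.

```python
# Components for which dimension information is encoded by the prime task
# (and is therefore available for reuse on the next trial).
AVAILABLE_AFTER = {
    "detection": {"selection"},
    "discrimination": {"selection", "identification"},
}

# Components that can benefit from dimension information in the probe task.
USED_IN = {
    "detection": {"selection"},
    "discrimination": {"selection", "identification"},
}


def reused_components(prime_task, probe_task):
    """Components predicted to be reused under agglomerated task sets."""
    return AVAILABLE_AFTER[prime_task] & USED_IN[probe_task]


for prime in ("detection", "discrimination"):
    for probe in ("detection", "discrimination"):
        print(f"{prime} -> {probe}: {sorted(reused_components(prime, probe))}")
```

Only the discrimination → discrimination sequence yields both components, which corresponds to the prediction of a stronger DRE for that sequence than for the other three.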

In contrast to the notion of agglomerated task sets, that of integrated task sets would predict that following *any* switch, the task set would be reset; accordingly, there should not be a difference between full and partial switches. Finally, the notion of hierarchical task sets would predict a reversal of DREs, that is: switching from, e.g., detection to discrimination, would also switch the expected dimension, resulting in worse performance following dimension repetitions relative to dimension switches.

## **ERP COMPONENTS SENSITIVE TO THE SPATIAL SELECTION AND POST-SELECTIVE IDENTIFICATION COMPONENTS**

In the paradigm described above, preparatory adjustments with regard to the task-relevant (singleton) dimension are not possible since the task cue is not dimension-specific<sup>1</sup>. Consequently, the analyses of partial-switch effects for spatial selection and post-selective identification in the EEG domain have focused on areas and ERP components sensitive to the implementation of these processes.

As an index of spatial-attentional selection, parameters of the Posterior-Contralateral-Negativity (PCN, or N2-posterior-contralateral<sup>2</sup>) have been analyzed. This component manifests as an increased negativity at posterior scalp electrode sites contralateral to the singleton position, relative to ipsilateral electrode sites, in the 175–300-ms time range post-stimulus. PCN parameters are considered to reflect the dynamics of spatial selection processes (Luck and Hillyard, 1994; Eimer, 1996; Töllner et al., 2011). Importantly, PCN latencies are shorter for dimension repetitions relative to changes (Töllner et al., 2008). On the assumption of agglomerated task sets, substantial DREs are expected for both task repetitions and switches. By contrast, the assumption of integrated task sets would predict no DRE across task switches.
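The contralateral-minus-ipsilateral logic that defines the PCN can be illustrated with a minimal sketch. The function names, the sample values, and the choice of a simple mean amplitude over the 175–300-ms window are assumptions for illustration; the study's actual electrode sites, preprocessing, and latency measures are not reproduced here.

```python
def contra_minus_ipsi(left_site, right_site, target_side):
    """Difference wave: contralateral minus ipsilateral voltage, sample by sample.

    left_site / right_site: voltages (in microvolts) from homologous posterior
    electrodes over the left and right hemisphere (hypothetical inputs).
    target_side: 'left' or 'right' visual hemifield of the singleton.
    """
    if target_side == "left":
        contra, ipsi = right_site, left_site
    else:
        contra, ipsi = left_site, right_site
    return [c - i for c, i in zip(contra, ipsi)]


def mean_amplitude(wave, times_ms, start_ms=175, end_ms=300):
    """Mean of the difference wave within a post-stimulus time window."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return sum(window) / len(window)


# Invented sample values for a left-hemifield singleton.
times = [100, 200, 250, 400]
left = [0.0, -1.0, -2.0, 0.0]
right = [0.0, -3.0, -4.0, 0.0]
pcn = contra_minus_ipsi(left, right, "left")
print(mean_amplitude(pcn, times))  # -2.0
```

A negative value in the window indicates greater negativity contralateral to the singleton, which is the signature of the component.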

The second component of interest was the Sustained Posterior Contralateral Negativity (SPCN), manifesting as an increased negativity over posterior electrodes contralateral to the target, relative to ipsilateral electrodes, starting from 350–400 ms post-stimulus. The SPCN component is considered to be sensitive to the processing demands following stimulus selection (Jolicoeur et al., 2008). Importantly, the SPCN is weaker (Mazza et al., 2007; Töllner et al., 2013) in tasks that do not necessarily require perceptual analysis following stimulus selection (e.g., the singleton detection task), and more prominent (ibid.) in tasks that require stimulus analysis up to the feature level (e.g., the singleton discrimination task). Thus, the available evidence predicts a stronger SPCN for the discrimination task, which requires explicit feature discrimination, relative to the detection task, in which post-selective processing is not necessary for an accurate response. Furthermore, the hypothesis of agglomerated task sets predicts stronger DREs in the SPCN time range on trials preceded by (a trial with) the discrimination task, in which post-selective identification should be sensitive to the dimensional identity of the singleton, relative to trials preceded by the detection task, for which it is sufficient to determine that the selected item is a singleton, without necessarily identifying its precise dimension or feature properties.

## **METHODS**

## **PARTICIPANTS**

Sixteen males (mean age 29 years, 2 left-handed), all with normal color vision and normal or corrected-to-normal visual acuity, participated in the experiment for monetary compensation. All participants had extensive experience with psychophysical tasks and all were naïve as to the purpose of the experiment. Due to excessive eye blinking, two participants were excluded from the analyses.

## **APPARATUS**

Stimuli were presented on a 17-inch CRT monitor with a resolution of 1024 × 768 pixels and an 85-Hz refresh rate. Custom-written C++ code controlled stimulus presentation and recorded responses. The experiment was conducted in a dimly lit, acoustically and electromagnetically shielded room. Head-to-monitor distance was 60 cm.

## **STIMULI AND PROCEDURE**

Stimulus displays (**Figure 1**) were presented on a gray background (19 cd/m2, CIE *<sup>x</sup>* <sup>=</sup> <sup>0</sup>*.*292, *<sup>y</sup>* <sup>=</sup> <sup>0</sup>*.*307) and consisted of 38 vertical yellow (0.388, 0.520) bars arranged around three concentric—inner, middle, and outer—circles with 8, 12, and 18 items, respectively. Single bars subtended 0*.*4 × 1*.*9◦ of visual angle, and the diameters of the three circles were 5, 10, and 15◦, respectively. On target-present trials, one of the bars on the middle circle (excluding the two positions along the vertical

<sup>1</sup>Note that most of the previous EEG studies on task switching used paradigms in which preparatory adjustments of the task set component of interest were possible, to some extent. This work typically revealed involvement of prefrontal areas when preparing for a task switch (Karayanidis et al., 2003; Nicholson et al., 2005; Karayanidis et al., 2009).

<sup>2</sup>As shown by Shedden and Nordgaard (2001; Töllner et al., 2012a), the PCN is independent of both the amplitude and latency of the non-lateralized N2. Thus, to avoid potential confusion associated with the term "N2pc," we prefer the neutral term "PCN."

meridian) was replaced by either a blue (0.275, 0.541) or a green (0.211, 0.263) *color* singleton target or a right-tilted (12° clockwise from vertical) or left-tilted (12° counter-clockwise) *orientation* singleton target, matched in luminance to the distractor bars (68 cd/m²).
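To give a sense of the physical stimulus dimensions, the reported visual angles can be converted to on-screen sizes at the reported 60-cm viewing distance. The following is a minimal sketch using the standard relation size = 2 · d · tan(θ/2); the function name `visual_angle_to_cm` is introduced here for illustration, and the calculation assumes a flat screen viewed head-on, which the text does not state explicitly.

```python
import math

def visual_angle_to_cm(angle_deg, distance_cm=60.0):
    """Convert a visual angle (degrees) to a physical extent (cm).

    Assumes the extent is centered on the line of sight and lies on a
    flat surface perpendicular to it (an illustrative simplification).
    """
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

# Bar size (0.4 x 1.9 degrees) and circle diameters (5, 10, 15 degrees).
bar_width_cm = visual_angle_to_cm(0.4)
bar_height_cm = visual_angle_to_cm(1.9)
circle_diameters_cm = [visual_angle_to_cm(a) for a in (5, 10, 15)]
```

Under these assumptions, a single bar measures roughly 0.42 × 1.99 cm, and the three circles have diameters of roughly 5.2, 10.5, and 15.8 cm.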

Every trial started with a task cue (i.e., the word "detection" or "discrimination") shown for 1000 ms, followed by a 500 ± 50 ms blank screen. Next, the stimulus display appeared for 200 ms, followed by a blank screen until response. In case of response errors, the standard intertrial interval (1000 ms) was doubled. Responses were given via pressing the left and right mouse keys using the left- and right-hand thumbs, respectively. Stimulus displays were identical for both tasks, the difference being that, in the detection task, participants were required to discern the presence (on 60% of detection task trials) vs. the absence of a singleton target by pressing the corresponding response key with two possible S-R mappings: (i) target-present → R1, -absent → R2 and, respectively, (ii) target-absent → R1, -present → R2. In the discrimination task, a singleton was always present, with participants having to report the feature that distinguished it from distractors (blue, green, left-tilted, or right-tilted). Same-dimension singletons (e.g., blue and green) required different responses (e.g., R1 and R2), while singletons from different dimensions (e.g., blue and left-tilted) were mapped to the same response, resulting in four possible S-R mappings: (i) blue or left-tilted → R1, green or right-tilted → R2, (ii) green or left-tilted → R1, blue or right-tilted → R2, (iii) blue or right-tilted → R1, green or left-tilted → R2, and (iv) green or right-tilted → R1, blue or left-tilted → R2. The two possible S-R mappings in the detection task and the four S-R mappings in the discrimination task yielded eight different S-R mapping combinations, which were counterbalanced across participants.
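The counterbalancing scheme described above can be enumerated explicitly. The following is a minimal sketch, not the original experiment code (which was written in C++): all names are introduced here for illustration, and the modulo rule for assigning combinations to participants is an assumption, since the text only states that the combinations were counterbalanced.

```python
from itertools import product

# Detection task: two mappings of target presence/absence to response keys.
detection_mappings = [
    {"present": "R1", "absent": "R2"},
    {"present": "R2", "absent": "R1"},
]

# Discrimination task: features from the same dimension need different
# responses, so a mapping is fixed by choosing which color and which
# tilt are assigned to R1.
colors = ("blue", "green")
tilts = ("left-tilted", "right-tilted")

def other(pair, item):
    """Return the member of a two-element pair that is not `item`."""
    return pair[1] if item == pair[0] else pair[0]

discrimination_mappings = [
    {c: "R1", t: "R1", other(colors, c): "R2", other(tilts, t): "R2"}
    for c, t in product(colors, tilts)
]

# 2 detection mappings x 4 discrimination mappings = 8 combinations.
combinations = list(product(detection_mappings, discrimination_mappings))

def mapping_for(participant_index):
    """Hypothetical assignment rule: cycle through the 8 combinations."""
    return combinations[participant_index % len(combinations)]
```

With sixteen participants, such a cyclic rule would assign each of the eight combinations to exactly two participants.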

The task (detection vs. discrimination) and the target's dimension (color vs. orientation) were randomly selected on every trial, resulting in four task sequences (detection on both prime and probe trials; discrimination on both trials; detection on prime, discrimination on probe trial; and discrimination on prime, detection on probe trial) and two dimension sequences (repetition/change) across consecutive trials. Relevant dimensions were sampled with equal probability; however, in order to ensure comparable numbers of target-present trials across the two tasks, the detection task was made more frequent (3:2 ratio).
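The rationale for the 3:2 ratio can be checked with a short calculation. The sketch below combines the task ratio with the target-present proportions stated in the procedure (60% of detection trials, all discrimination trials); the variable names are introduced here, and the results are expected proportions only, because tasks were drawn randomly on each trial.

```python
from fractions import Fraction

# Task probabilities implied by the 3:2 detection:discrimination ratio.
p_detection = Fraction(3, 5)
p_discrimination = 1 - p_detection

# Proportion of target-present trials within each task.
p_present_given_detection = Fraction(3, 5)   # 60% of detection trials
p_present_given_discrimination = Fraction(1)  # a singleton was always present

# Expected share of all trials that are target-present, per task.
p_present_detection = p_detection * p_present_given_detection
p_present_discrimination = p_discrimination * p_present_given_discrimination
```

This gives expected shares of 36% (detection) and 40% (discrimination) of all trials, that is, roughly 691 and 768 of the 1920 experimental trials. With equal task frequencies, the shares would have been 30% and 50%.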

Prior to the experiment proper, participants received 1–4 practice blocks (80 trials per block); practice was terminated once a participant achieved ≤ 10% errors per block. All participants met the criterion after 2–4 blocks. Following practice, participants completed 1920 trials (in ca. 3 h), split into two equal-length sessions with a 15–30 min break in between.

## **EEG RECORDING AND ANALYSES**

The EEG was sampled at 1 kHz using Ag/AgCl active electrodes (actiCAP system; Brain Products, Munich) from 64 scalp sites, arranged according to the 10–10 System (American Electroencephalographic Society, 1994), and amplified using BrainAmp amplifiers (Brain Products, Munich) with a 0.1–250-Hz bandpass filter. Impedances were kept below 5 kΩ. Electrodes were online referenced to FCz and re-referenced offline to averaged mastoids. Electrodes placed at the outer canthi of the eyes and the superior and inferior orbits monitored blinks and eye movements. Non-stereotyped noise was removed by visual inspection, followed by high-pass filtering using a Butterworth infinite impulse response filter at 0.5 Hz (24 dB/Oct). An infomax independent-component analysis was run to identify and remove effects of eye movements and blinks. Next, continuous EEG was epoched into segments from −200 to 600 ms, time-locked to stimulus display onset. Baseline correction was performed based on the −200 to 0 ms pre-stimulus interval. Target-absent trials, trials preceded by a target-absent trial, error response trials, as well as trials preceded by an error were excluded from the analyses. Finally, trials with (i) signals exceeding ±60 μV, (ii) bursts of electromyographic activity (maximal permitted voltage step of 50 μV per sampling point), or (iii) activity lower than 0.5 μV within intervals of 500 ms (indicating dead channels) were removed from further analyses on an individual-channel basis. The remaining trials (mean = 111 trials per participant per condition, *SD* = 5 trials) were sorted according to experimental conditions, and averaged on an individual-channel basis.
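The epoching, baseline-correction, and artifact-rejection steps can be illustrated for a single channel as follows. This is a minimal NumPy sketch under stated assumptions, not the pipeline actually used: filtering, re-referencing, and the independent-component analysis are omitted, the function names are introduced here, and criterion (iii) is interpreted as a peak-to-peak range below 0.5 μV within any 500-ms window, which is one possible reading of the description.

```python
import numpy as np

FS = 1000                  # sampling rate in Hz (1 kHz, as reported)
PRE_MS, POST_MS = 200, 600  # epoch limits relative to stimulus onset

def epoch_channel(signal, onsets):
    """Cut -200..600 ms epochs around onsets and baseline-correct them.

    `signal` is a 1-D array in microvolts; `onsets` are sample indices
    that must lie far enough from the edges of the recording.
    """
    pre, post = PRE_MS * FS // 1000, POST_MS * FS // 1000
    epochs = np.stack([signal[o - pre:o + post] for o in onsets])
    # Subtract the mean of the pre-stimulus interval from each epoch.
    baseline = epochs[:, :pre].mean(axis=1, keepdims=True)
    return epochs - baseline

def is_artifact(epoch, max_abs=60.0, max_step=50.0, min_range=0.5, win_ms=500):
    """Apply the three rejection criteria to one single-channel epoch."""
    # (i) absolute amplitude beyond +/-60 microvolts
    if np.abs(epoch).max() > max_abs:
        return True
    # (ii) voltage step between adjacent samples above 50 microvolts
    if np.abs(np.diff(epoch)).max() > max_step:
        return True
    # (iii) near-flat signal: range below 0.5 microvolts in any 500-ms window
    win = win_ms * FS // 1000
    windows = np.lib.stride_tricks.sliding_window_view(epoch, win)
    ranges = windows.max(axis=1) - windows.min(axis=1)
    return bool((ranges < min_range).any())
```

Applying `is_artifact` separately to each channel of each trial mirrors the rejection on an individual-channel basis described above.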

## **DATA ANALYSES**

Analysis of EEG signals focused on two event-related potentials (ERPs). The ERP components were quantified by subtracting ERPs measured at lateral parieto-occipital electrodes (PO7/PO8) ipsilateral to the target's location from contralateral ERPs. The PCN peak latencies and amplitudes were defined per participant as the maximum negative-going deflection in the time period 170–270 ms post-stimulus. SPCN amplitudes were defined as the mean amplitude in the 430–510 ms post-stimulus time window.
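The component measures defined above can be sketched as follows. This is an illustrative NumPy implementation under simplifying assumptions: the function names are introduced here, the inputs are taken to be condition-averaged waveforms from the contralateral and ipsilateral electrode (PO7/PO8), and the "maximum negative-going deflection" is operationalized simply as the most negative sample in the window.

```python
import numpy as np

def lateralized_wave(contra, ipsi):
    """Contralateral-minus-ipsilateral difference wave."""
    return np.asarray(contra, dtype=float) - np.asarray(ipsi, dtype=float)

def pcn_peak(diff, times_ms, window=(170, 270)):
    """Latency (ms) and amplitude of the most negative point in the PCN window."""
    mask = (times_ms >= window[0]) & (times_ms <= window[1])
    idx = np.argmin(diff[mask])
    return times_ms[mask][idx], diff[mask][idx]

def spcn_mean(diff, times_ms, window=(430, 510)):
    """Mean amplitude of the difference wave in the SPCN window."""
    mask = (times_ms >= window[0]) & (times_ms <= window[1])
    return diff[mask].mean()
```

Both functions take the time axis in milliseconds relative to stimulus onset, so they can be applied directly to epochs spanning −200 to 600 ms.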

Mean RTs, error rates, PCN peak latencies and amplitudes, and SPCN mean amplitudes were computed for correct target-present probes for which the primes were also correct target-present trials. Dependent measures were submitted to repeated-measures ANOVAs (RANOVAs) in three different analyses. First, overall task differences and TSCs were assessed in a RANOVA with main terms for (i) task on *probe* trial (detection vs. discrimination) and (ii) task sequence (repetition vs. change) across prime and probe trials. The second analysis focused on indices of re-using processes across trials, that is, on dimension-repetition effects. Because re-use is expected to co-vary with what is *available* for re-using, which is determined on prime trials, the second set of analyses used a RANOVA with main terms for (i) task on *prime* trial, (ii) task on *probe* trial, and (iii) dimension sequence (repetition vs. change) across trials. Third, effects of response sequence on mean RTs and error rates were analyzed. Because only a small number of trials were available per cell for this analysis, the corresponding ERPs could not be examined.

## **RESULTS**

## **ANALYSES OF OVERALL TASK DIFFERENCES**

#### *Behavioral results*

Inspection of the overall mean RTs and error rates (shown in **Table 1**) revealed performance to be faster and more accurate for the detection task (mean RTs = 594 ms, mean error rate = 5.2%) than for the discrimination task (670 ms, 6.5%). Furthermore, task repetitions (598 ms, 4.5%) yielded faster and more accurate performance relative to task changes (666 ms, 7.1%), indicating substantial TSCs for both RTs and error rates ($\mathrm{TSC}_{\mathrm{RT}} = 68$ ms, $\mathrm{TSC}_{\mathrm{errors}} = 2.6\%$). Finally, switching from discrimination to detection incurred greater TSCs (81 ms, 3%) than switching from detection to discrimination (57 ms, 2.3%).

These observations were confirmed by RANOVAs of the mean RTs and error rates with main terms for task on probe trial (detection vs. discrimination) and task sequence (task repetition vs. change). The analysis of the mean RTs revealed the main effects of task, $F(1,13) = 36.44$, $p < 0.01$, $\eta_p^2 = 0.74$, and task sequence, $F(1,13) = 18.65$, $p < 0.01$, $\eta_p^2 = 0.60$, as well as their interaction, $F(1,13) = 6.18$, $p < 0.05$, $\eta_p^2 = 0.32$, to be significant. The RANOVA of the mean error rates likewise yielded a significant main effect of task sequence, $F(1,13) = 20.94$, $p < 0.01$, $\eta_p^2 = 0.62$, without, however, any of the other effects reaching significance (all $F < 2.17$, all $p > 0.16$).
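For readers who wish to relate the reported effect sizes to the test statistics, partial eta squared can be recovered from an *F* value and its degrees of freedom via the standard identity η²ₚ = (F · df_effect) / (F · df_effect + df_error). The sketch below is an illustration of that identity only; the function name is introduced here, and nothing is implied about the software used for the original analyses.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of task on mean RTs: F(1,13) = 36.44
eta_task = partial_eta_squared(36.44, 1, 13)
# Task x task sequence interaction on mean RTs: F(1,13) = 6.18
eta_interaction = partial_eta_squared(6.18, 1, 13)
```

Rounded to two decimals, these two values are 0.74 and 0.32, in line with the effect sizes reported for the respective effects.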

#### *ERP results*

The stimulus-locked ERP waves obtained for the detection and discrimination tasks are shown on **Figure 2**. **Figure 2A** depicts the time course separately for electrodes contra- and ipsilateral to the target position, averaged across target positions<sup>3</sup>

**Table 1 | Mean RTs (SEM) and percentage of errors (SEM) on probe trials for the two different tasks, dependent on the task sequence.**


<sup>3</sup>Analyses of PCN and SPCN amplitudes revealed a significant target position effect for the PCN component (3.50 and 3.25 μV for left and right targets, respectively, $F(1,13) = 4.81$, $p < 0.05$, $\eta_p^2 = 0.27$), and no difference between left and right targets in the SPCN component ($F = 1.62$, $p = 0.22$). Most importantly, target position did not interact significantly with electrode site (ipsi- vs. contralateral), nor with any other experimental manipulation (all $F < 2.16$, all $p > 0.16$), indicating that the PCN and SPCN components were not specific to one visual hemifield.

(left vs. right), while **Figure 2B** depicts the difference between contra- and ipsilateral electrodes. As can be seen from **Figure 2B**, substantial lateralization effects, confirmed by *t*-tests against zero, were observed in the PCN time range (170–270 ms) for both the detection (−2.17 μV, $p < 0.01$) and the discrimination task (−2.41 μV, $p < 0.01$). Similar to what we already reported in Töllner et al. (2012a), PCN amplitude was higher for the discrimination than for the detection task, with comparable PCN latencies across the two tasks (221 and 222 ms for detection and discrimination, respectively). The RANOVA of the PCN latencies with main terms for (i) task on probe trial and (ii) task sequence yielded no significant effects (all $F < 1$). The RANOVA of the PCN amplitudes yielded a significant main effect of task, $F(1,13) = 6.61$, $p < 0.05$, $\eta_p^2 = 0.43$; the main effect of task sequence was non-significant ($F < 1$); and the task × task sequence interaction approached significance, $F(1,13) = 3.70$, $p = 0.08$, $\eta_p^2 = 0.22$, owing to the fact that switching task tended to increase the PCN amplitude for the detection task (−2.12 μV and −2.22 μV for detection → detection and discrimination → detection sequences, respectively), while tending to decrease the amplitude for the discrimination task (−2.52 μV and −2.29 μV for discrimination → discrimination and detection → discrimination sequences, respectively).

Furthermore, as **Figure 2B** shows, and as confirmed by *t*-tests against zero, lateralization effects in the SPCN time range (430–510 ms) were observed for the discrimination task (mean amplitude = −0.36 μV, $p < 0.01$), but not for the detection task (0.14 μV, $p = 0.18$). A RANOVA of the SPCN mean amplitude with main terms for task and task sequence revealed the main effect of task to be significant, $F(1,13) = 72.74$, $p < 0.01$, $\eta_p^2 = 0.85$, with no other effects reaching significance (all $F < 1$).

## **ANALYSES OF RE-USING CONTROL SETTINGS ACROSS TASKS**

#### *Behavioral results*

**Figure 3** depicts the mean RT and error rate for a given task on the probe trial (detection, discrimination) dependent on the task on the prime trial (detection, discrimination), separately for dimension repetitions and changes (dimension sequence). As can be seen, RTs were faster for dimension repetitions (blue bars) than for dimension changes (red bars), in all conditions; error rates followed a similar pattern. These observations were confirmed by a RANOVA of the mean RTs, which revealed all three main effects (all $F > 6.18$, all $p < 0.05$), all two-way interactions (all $F > 18.17$, all $p < 0.01$), and the three-way interaction between task on prime trial, task on probe trial, and dimension sequence, $F(1,13) = 21.27$, $p < 0.01$, $\eta_p^2 = 0.62$, to be significant. An analogous RANOVA of the error rates yielded the following significant effects: main effect of dimension sequence, $F(1,13) = 11.39$, $p < 0.01$, $\eta_p^2 = 0.47$; task on probe trial × dimension sequence interaction, $F(1,13) = 33.88$, $p < 0.05$, $\eta_p^2 = 0.28$; task on prime trial × task on probe trial interaction, $F(1,13) = 20.94$, $p < 0.01$, $\eta_p^2 = 0.62$; and task on prime trial × task on probe trial × dimension sequence interaction, $F(1,13) = 8.16$, $p < 0.05$, $\eta_p^2 = 0.39$.

To further investigate the significant three-way interactions, separate RANOVAs were run dependent on the specific task on prime trials (detection and, respectively, discrimination), with main terms for task on probe trial and dimension sequence. With the detection task on prime trials (left-hand panels in **Figure 3**), the analysis of the (probe-trial) RTs revealed both main effects to be significant: task on probe trial [145-ms difference between discrimination and detection tasks, $F(1,13) = 34.04$, $p < 0.01$, $\eta_p^2 = 0.72$] and dimension sequence [23-ms difference between dimension changes and repetitions, $F(1,13) = 22.00$, $p < 0.01$, $\eta_p^2 = 0.63$]; the interaction between the two was far from significance ($F < 1$). The RANOVA of the error rates revealed a significant main effect of task on probe trials [3% difference between discrimination and detection, $F(1,13) = 16.68$, $p < 0.01$, $\eta_p^2 = 0.56$], and a marginally significant main effect of dimension sequence, with dimension repetitions yielding 1.1% more accurate performance than dimension changes, $F(1,13) = 4.35$, $p = 0.06$; the interaction between the two was far from significance ($F < 1$).

With the discrimination task on prime trials (right-hand panels in **Figure 3**), a RANOVA of the (probe-trial) RTs revealed the main effect of dimension sequence, $F(1,13) = 36.32$, $p < 0.01$, $\eta_p^2 = 0.81$, and the task on probe trial × dimension sequence interaction, $F(1,13) = 32.01$, $p < 0.01$, $\eta_p^2 = 0.71$, to be significant. The interaction was caused by the dimension repetition (vs. change) effect being much larger for discrimination → discrimination sequences (125 ms) than for discrimination → detection sequences (18 ms). For the errors (on probe trials), the main effect of dimension sequence proved significant, $F(1,13) = 7.67$, $p < 0.05$, $\eta_p^2 = 0.37$, with accuracy being 1.4% higher for dimension repetitions relative to changes. The task on probe trial × dimension sequence interaction was also significant, $F(1,13) = 8.31$, $p < 0.05$, $\eta_p^2 = 0.40$, with a stronger DRE for the discrimination relative to the detection task on the probe trial (3.2 vs. −0.3%, respectively).

In summary, significant DREs were observed in all experimental conditions. When the task on the prime trial required simple target detection, DREs were comparable for detection and discrimination tasks on the probe trial. By contrast, when the task on the prime trial required target discrimination, stronger DREs were observed for the discrimination task than for the detection task on the probe trial.

#### *ERP results*

Lateralized ERPs are depicted in **Figure 4** for the probe trials, separately for the different tasks on prime and on probe trials, as well as across different dimension sequences. As can be seen from **Figure 4**, the PCN latency was delayed for dimension changes (red lines) relative to dimension repetitions (blue lines), in all conditions. In the SPCN time range, DREs were evident only for (probe) trials following the discrimination task.

#### *PCN analyses*

For (probe) trials preceded by the detection task (left-hand side of **Figure 4**), a RANOVA of the PCN peak latencies revealed only

**FIGURE 4 | Group-averaged time course of PCN and SPCN components for the different task sequences, separately for dimension repetitions (blue) and changes (red).** Significant dimension repetition effects for the peak PCN latency and the mean SPCN amplitude are indicated. For the purpose of presentation, a 30-Hz low-pass filter was applied; data analyses were performed over individual, unfiltered data.

a significant main effect of dimension sequence, with dimension repetitions being 17 ms faster than dimension changes, $F(1,13) = 15.41$, $p < 0.01$, $\eta_p^2 = 0.54$; no other effects reached significance (all $F < 1$). An analogous RANOVA for trials preceded by the discrimination task (right-hand side) also yielded only a significant main effect of dimension sequence [14-ms DRE, $F(1,13) = 5.09$, $p < 0.05$, $\eta_p^2 = 0.23$], with no other effects reaching significance (all $F < 1$).

#### *SPCN analyses*

With the task on the prime trial requiring target detection, analysis of the (probe-trial) mean SPCN amplitudes revealed only a significant main effect of task on probe trial, with an overall stronger SPCN component for the discrimination relative to the detection task (−0.37 vs. 0.18 μV), $F(1,13) = 25.70$, $p < 0.01$, $\eta_p^2 = 0.66$; no other effects reached significance (all $F < 1.94$, all $p > 0.18$). With the discrimination task on the prime trial, an analogous analysis also yielded a significant main effect of task on probe trial, $F(1,13) = 30.16$, $p < 0.01$, $\eta_p^2 = 0.70$, with a significant SPCN for the discrimination task and a non-significant SPCN for the detection task (−0.36 vs. 0.10 μV). Importantly, the main effect of dimension sequence was also significant, $F(1,13) = 5.33$, $p < 0.05$, $\eta_p^2 = 0.29$, with the SPCN amplitude being stronger for dimension changes relative to dimension repetitions (−0.23 vs. −0.02 μV); the task on probe trial × dimension sequence interaction was far from significance ($F < 1$).

In summary, analyses of the ERPs revealed longer PCN peak latencies for dimension changes vs. repetitions independently of the task on prime or probe trial; SPCN mean amplitudes were overall larger for the discrimination than for the detection task on probe trial, while significant DREs in the SPCN time interval were observed only for trials preceded by the discrimination task.

## **ANALYSES OF RESPONSE-SEQUENCE EFFECTS**

As can be seen from **Table 2**, mean RTs were overall fastest for full repetitions (same task, dimension, and response), intermediate for partial repetitions, and slowest for full changes. By contrast, the errors varied as a function of task sequence. Importantly, though, there was no evidence that partial repetitions resulted in less accurate performance relative to full changes.

These observations were confirmed by three-way RANOVAs (task sequence × dimension sequence × response sequence) of the mean RTs and error rates, with a focus on the main effect of, and interactions involving, response sequence. The RT analysis showed the main effect of response sequence to be significant, $F(1,13) = 6.41$, $p < 0.05$, $\eta_p^2 = 0.33$, as well as the interaction of this factor with dimension sequence, $F(1,13) = 9.04$, $p < 0.01$, $\eta_p^2 = 0.41$, owing to a larger response repetition benefit when the dimension repeated (30 ms, $p < 0.01$) rather than changed (5 ms, $p = 0.49$). The task sequence × response sequence interaction was marginally significant, $F(2,26) = 1.67$, $p = 0.07$, $\eta_p^2 = 0.18$, suggestive of a larger response repetition benefit for the discrimination task (29 ms, $p < 0.01$, and 23 ms, $p < 0.01$, for detection → discrimination and discrimination → discrimination sequences, respectively) relative to the detection task (−1 ms, $p = 0.93$). The three-way interaction did not reach significance, $F = 1.67$, $p = 0.27$.

Concerning the error analysis, the main effect of response sequence was non-significant ($F < 1$). However, response sequence interacted with dimension sequence, $F(1,13) = 18.43$, $p < 0.01$, $\eta_p^2 = 0.59$, and task sequence, $F(2,26) = 18.52$, $p < 0.01$, $\eta_p^2 = 0.59$; and the response sequence × dimension sequence × task sequence interaction was significant, $F(2,26) = 8.55$, $p < 0.01$, $\eta_p^2 = 0.40$. *Post-hoc* analyses revealed the pattern of response sequence effects (repetition benefit vs. cost) to vary across task and dimension sequences. For discrimination → detection task sequences, dimension repetitions were associated with response repetition *costs* (−5.4%, $p < 0.01$), whereas dimension changes yielded response repetition *benefits* (4.1%, $p < 0.01$). By contrast, discrimination → detection task sequences resulted in response repetition benefits (3.5%, $p < 0.01$) independently of dimension sequence. Finally, discrimination → discrimination task sequences were associated with response repetition costs (−2.9%, $p < 0.05$) independently of dimension sequence.

In summary, analyses of response sequence effects on mean RTs showed either response repetition benefits or no effects of response sequence. This finding is at variance with hierarchical task sets, which predict response repetition costs following a task or dimension change. Integrated task sets, which predict no response sequence effects following a task or dimension switch, also account poorly for the present findings. The results for error rates were less consistent across task and dimension sequences, with different patterns of response sequence effects across different experimental conditions.



*<sup>a</sup>Response sequence analysis for the detection → detection task sequence was omitted because it always involved a response repetition.*

## **DISCUSSION**

The present findings demonstrate that, consistent with "agglomerated task sets," it is possible to reuse control settings across different tasks, and that the reusing is associated with distinct ERP components, depending on precisely which task set component (selection vs. identification criteria) is reused across tasks. In particular, switching tasks was revealed to be easier when the task-relevant dimension repeated relative to when it changed, as indexed by DREs on task switch trials. The notion of agglomerated task sets is fully compatible with existing computational models of cognitive control (Logan and Gordon, 2001; Meiran et al., 2008). Both these accounts postulate several independent parameters influencing processes of selection, identification, and, respectively, responding. Importantly, since these parameters are independent, it is plausible that they can also be adjusted independently—which is the core assumption of the agglomerated-task-sets hypothesis. The compatibility with the computational models of cognitive control emphasizes another property of agglomerated task sets: their computational efficiency.

Note though that the DREs reported presently are consistent not only with agglomerated task sets, but also with two relatively strong alternatives, namely: (i) switching between the detection and discrimination conditions did not involve a task switch, and (ii) the DREs are not dimension-specific, but rather feature- or response-specific. The former alternative would postulate that the two tasks used in the present study were effectively one task, in which participants always performed feature discrimination, but, depending on the cue, selected different responses. The latter alternative would imply that the substantial reusing of spatial selection and identification criteria (as revealed in the dimension-repetition effects) was not indicative of the reuse of task set components in general, but rather of the reuse of stimulus-response associations encountered on the previous trial (Hommel et al., 2001; Dreisbach et al., 2006, 2007).

The present findings argue against the hypothesis that no task switching took place in our paradigm. First, in the present study, substantial differences in mean RTs and error rates were observed between the detection and discrimination tasks, suggesting that the two conditions were performed differently. Behavioral differences were accompanied by differences in the ERPs: a reliable SPCN component was observed for the discrimination, but not for the detection task. Taken together, these findings suggest that solving the detection and, respectively, discrimination tasks involved the use of different task sets. Second, switching between the two conditions incurred substantial switch costs. Importantly, the switch cost magnitude differed, with stronger costs for discrimination → detection relative to detection → discrimination sequences. This asymmetry in the switch costs, with switching to an easier, or dominant, task (presently, detection) being more effortful than switching to a more difficult task (presently, discrimination), has frequently been reported in the task-switching literature (Allport et al., 1994; Wylie and Allport, 2000) and interpreted as an index of the interference between two concurrent task sets. Against this background, it is likely that switching between the two tasks in the present study indeed involved task switch processes.

The second alternative explanation posits that DREs critically rely on repetitions of full stimulus-response episodes across trials, rather than on more abstract criterion repetitions. Studies investigating the role of S-R repetitions have typically found performance to be very good on full-repetition trials and worse on partial-repetition trials, that is, when either the stimulus or the response changed, while the other property repeated. Somewhat counter-intuitively, performance is typically *better* on full switches, that is, when both stimulus and response change, compared to partial repetitions (Hommel, 1998, 2005; Töllner et al., 2008; Zehetleitner et al., 2012). Applied to the present study, if DREs were actually S-R repetition effects, then changing the dimension and repeating the response (i.e., partial repetitions) should have resulted in worse performance relative to changing both the dimension and the response (i.e., full switches). However, analyses of response sequence effects showed this not to be the case, arguing that the DREs reported presently are not reducible to S-R repetition effects.

That partial-repetition costs were not prominent in the present study does not necessarily imply that the processes generating these effects were inactive; rather, the paradigm and dependent measures may not have been sensitive to these processes. Previous work investigating electrophysiological correlates of DREs in a single task paradigm (Töllner et al., 2008) revealed partial repetition costs to correlate with ERP markers that were independent of the ERP markers for dimension and, respectively, response sequence effects. Importantly, the markers of partial-repetition costs were not investigated in the present study. Additionally, in a recent study of partial-repetition costs (Zehetleitner et al., 2012), numerical simulations showed that three different sequence-sensitive mechanisms (dimension-, S-R mapping-, and motor-response-specific) combine and can produce any possible RT pattern, that is, with or without manifest partial repetition costs. Given this, until the boundary conditions for partial-repetition costs to arise are fully understood, it remains possible that the present paradigm simply did not meet these conditions.

## **DIRECTION OF PARTIAL-SWITCH EFFECTS: REUSING, RESETTING, OR SWITCHING**

While the present study, together with several previous investigations (Arrington et al., 2003; Rangelov et al., 2011, 2012), showed that reusing shared task set components across different tasks is possible, findings to the contrary have also been reported. In particular, Vandierendonck et al. (2008) failed to find any partial-switch effects, while Kleinsorge and colleagues (1999, 2004) found partial-switch *costs*, relative to full switches, rather than benefits as reported presently. How can these disparate findings be reconciled?

A notable difference between the paradigms used in these investigations and the present study is in the stimulus material and the cueing procedure employed. More precisely, the previous studies used stimuli that were ambiguous with regard to both task-relevant stimulus attributes and identification criteria. Thus, cueing of both the relevant stimulus property (e.g., number or digit value in Vandierendonck et al., 2008) and identification criteria (parity vs. magnitude) was necessary on every trial (e.g., the string "number odd/even" served as a cue). This procedure produced partial overlap between cue strings on partial switches (e.g., "number odd/even" → "digit odd/even") and no overlap on full switches (e.g., "number smaller/greater" → "digit odd/even"). As has been shown by several studies (Logan and Schneider, 2006; Schneider and Logan, 2009), overlapping cues activate competing task sets, resulting in negative interference on partial-switch trials. As a consequence, any benefits from reusing shared control representations might have been masked by interference effects from overlapping cues, resulting in either non-significant partial-repetition effects or even partial-repetition costs. By contrast, the present study used cues that did not overlap between the different tasks. Furthermore, the task-relevant dimension was unambiguously specified by the stimulus displays themselves, so no dimension cues were necessary. Consequently, in the present experiment, the negative interference effects would have been minimal and, correspondingly, partial-repetition effects turned out significant.

## **ELECTROPHYSIOLOGICAL EVIDENCE FOR SEPARABLE CONTROL MECHANISMS**

In contrast to the mixed behavioral findings regarding the direction of partial-switch effects, investigation of the ERP components related to the spatial selection and post-selective processing task components offers more conclusive findings. By assuming relative autonomy of the control systems for the different cognitive processes, "agglomerated task sets" predict the existence of *multiple*, relatively independent sources of DREs. Consistent with the prediction, EEG analyses revealed *multiple* ERP components to be sensitive to dimension sequences: dimension changes (relative to repetitions) resulted in longer PCN latencies as well as larger SPCN amplitudes.

Analyses of the PCN latencies showed dimension changes (relative to repetitions) to be associated with longer latencies independently of the task sequence (see also Töllner et al., 2008, reporting similar findings in a single-task paradigm). This finding can be explained by assuming that both the detection and discrimination tasks required spatial selection processes, governed by a task set component sensitive to dimension sequences. Dimension repetitions would permit reusing control settings from the previous trial, generating DREs for the PCN latency. Importantly, on the agglomerated-task-set hypothesis, reusing settings is possible even across tasks that differ in other task set components—as evidenced by the present finding of DREs across task switches for the PCN timing.

Analyses of the SPCN amplitudes showed, consistent with our predictions, a more pronounced SPCN for discrimination relative to detection task trials. Apparently, the SPCN amplitude increases continuously with increases in demands for post-selective perceptual processing, from no SPCNs in singleton detection (present study) and singleton localization tasks (Mazza et al., 2007) through singleton discrimination tasks (present study) to strong SPCNs in compound-search tasks (Mazza et al., 2007; Töllner et al., 2013), in which one target dimension (e.g., color) is selection-relevant, whereas another dimension (e.g., orientation) is response-relevant.

Furthermore, the SPCN amplitude was sensitive to dimension sequence, but only on trials following the discrimination task. This finding is consistent with the conceptual analysis of the task sets for the detection and, respectively, the discrimination task and the functional interpretation of the SPCN component. Conceptual analysis of the detection task suggested that the singleton dimension is not mandatorily encoded in the post-selective identification settings (for the detection task) and it would not be available for reuse on the following trial, predicting no DRE for the SPCN component following detection task trials. By contrast, post-selective identification in the discrimination task necessarily involved determining the singleton dimension, predicting DREs for the SPCN component following trials of the discrimination task, consistent with the results of the present study.

The electrophysiological data are especially informative about the notion of a generalized switching mechanism [as proposed by Kleinsorge and Heuer (1999)], according to which one criterion switch enforces a switch in all criteria, rather than simple re-setting. If generalized switching were operating as a default task switch mechanism, then reconfiguring the spatial selection component should have triggered a switch in the post-selective identification component as well. This would predict an obligatory coupling between DREs for the PCN and SPCN parameters, a prediction that is not supported by the present findings. Thus, in summary, the present ERP findings are fully consistent with, and yet independent of, the behavioral findings of DREs across task switches in their support for the notion of agglomerated task sets.

## **CONCLUSIONS**

The present study investigated the nature of task-set representations, defined as a set of criteria for spatial selection, post-selective identification, and S-R mapping rules. Two alternatives were considered: task sets as integral representations and task sets as agglomerations of relatively autonomous control settings guiding different processing stages. The key property of the "agglomerated-task-set" hypothesis is that, following a task switch, different control instances can be reconfigured independently of each other, permitting the reuse of some control settings across different tasks. By contrast, the hypothesis of "integrated task sets" predicted no partial-switch effects, as changing any task set component would either reset all settings or trigger a full switch. Consistent with agglomerated task sets, our findings demonstrate substantial partial-switch effects, operationalized as DREs. Most importantly, evidence of DREs over PCN latencies and SPCN amplitudes, two functionally distinct ERP components, further supports the idea of task sets as a collection of autonomous control instances governing processes of spatial selection and perceptual/symbolic analysis, respectively.

## **ACKNOWLEDGMENTS**

This research was supported by DFG grants RA 2191/1-1 (to Dragan Rangelov) and ZE 887/3-1 (to Michael Zehetleitner and Hermann J. Müller) as well as the German-Israeli Foundation for Scientific Research and Development grant 2011/158 (to Michael Zehetleitner and Hermann J. Müller).

## **REFERENCES**


electrophysiological responses. *Exp. Brain Res.* 181, 531–536. doi: 10.1007/s00221-007-1002-4


74, 540–552. doi: 10.3758/s13414-011-0251-0252


Stimulus saliency modulates pre-attentive processing speed in human visual cortex. *PLoS ONE* 6:e16276. doi: 10.1371/journal.pone.0016276


in task-shift costs. *Cogn. Psychol.* 46, 361–413. doi: 10.1016/S0010-0285(02)00520-0


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2013; accepted: 13 August 2013; published online: 03 September 2013.*

*Citation: Rangelov D, Töllner T, Müller HJ and Zehetleitner M (2013) What are task-sets: a single, integrated representation or a collection of multiple control representations? Front. Hum. Neurosci. 7:524. doi: 10.3389/fnhum.2013.00524*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Rangelov, Töllner, Müller and Zehetleitner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Behavioral and neural interaction between spatial inhibition of return and the Simon effect

#### *Pengfei Wang<sup>1</sup>, Luis J. Fuentes<sup>2</sup>, Ana B. Vivas<sup>3</sup> and Qi Chen<sup>1</sup>\**

*<sup>1</sup> Center for Studies of Psychological Application and School of Psychology, South China Normal University, Guangzhou, China*

*<sup>2</sup> Departamento de Psicología Básica y Metodología, Facultad de Psicología, Universidad de Murcia, Murcia, Spain*

*<sup>3</sup> Psychology Department, The University of Sheffield International Faculty, City College, Greece*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*John J. McDonald, Simon Fraser University, Canada*

*Joeran Lepsien, Max Planck Institute for Human Cognitive and Brain Sciences, Germany*

#### *\*Correspondence:*

*Qi Chen, School of Psychology, South China Normal University, NO. 55, West of Zhongshan Avenue, Tianhe District, 510631 Guangzhou, China*

*e-mail: qi.chen27@gmail.com*

It has been well documented that the anatomically independent attention networks in the human brain interact functionally to achieve goal-directed behaviors. By combining spatial inhibition of return (IOR), which implicates the orienting network, with executive function tasks (e.g., the Stroop and the flanker tasks), which implicate the executive network, researchers have consistently found that interference effects are significantly reduced at cued compared to uncued locations, indicating a functional interaction between the two attention networks. However, a unique but consistent effect is observed when spatial IOR is combined with the Simon effect: the Simon effect is significantly larger at cued than at uncued locations. To investigate the neural substrates underlying this phenomenon, we orthogonally combined spatial IOR with the Simon effect in the present event-related fMRI study. Our behavioral data replicated previous results by showing a larger Simon effect at the cued location. At the neural level, we found a shared spatial representation system between spatial IOR and the Simon effect in bilateral posterior parietal cortex (PPC); spatial IOR specifically activated bilateral superior parietal cortex, while the Simon effect specifically activated bilateral middle frontal cortex. Moreover, left precentral gyrus was involved in the neural interaction between spatial IOR and the Simon effect by showing significantly higher neural activity in the "Cued\_Congruent" condition. Taken together, our results suggest that, due to the shared spatial representation system in the PPC, responses were significantly facilitated when spatial IOR and the Simon effect relied on the same spatial representations, i.e., in the "Cued\_Congruent" condition. Correspondingly, the sensorimotor system was significantly involved in the "Cued\_Congruent" condition to speed up the responses, which indirectly resulted in the enhanced Simon effect at the cued location.

**Keywords: spatial IOR, the Simon effect, fMRI, shared spatial representation, parietal cortex, frontal cortex**

## **INTRODUCTION**

It is widely accepted that there exist three functionally and anatomically independent attention networks in the human brain: the alerting network, the orienting network and the executive network (Petersen et al., 1989; Posner and Petersen, 1990; Fan et al., 2002, 2003b, 2005, 2009). The alerting network provides the ability to increase vigilance to an impending stimulus. This network consists of thalamic and some specific anterior and posterior cortical sites, and involves the cortical projection of the norepinephrine system (Fan et al., 2005, 2009; Federico et al., 2013). The orienting network is responsible for reflexively or voluntarily shifting visuospatial attention to a specific location to sample sensory input (Corbetta et al., 2000; Yantis et al., 2002; Fan et al., 2005; Kincade et al., 2005). For example, the orienting network is involved in a spatial inhibitory mechanism that prevents the attention system from re-examining previously attended locations. This mechanism was first described in Posner's spatial cuing task, in which a peripheral cue was first presented to attract spatial attention to the cue location (Posner and Cohen, 1984; Posner et al., 1985; Klein, 2000). Responses to a target immediately appearing at the cued location, compared to responses to a target at an uncued location, were both faster and more accurate. However, if the cue-target stimulus onset asynchrony (SOA) was longer than 300 ms and the cue was uninformative with regard to target location, responses to the target at the cued location would be delayed, compared to responses to the target at the uncued location. This inhibitory effect is termed inhibition of return (IOR) (Posner and Cohen, 1984); it slows down attentional reorienting to the previously attended (cued) location, and thus increases the efficiency of visual search (Zhou and Chen, 2008; McDonald et al., 2009; Tian et al., 2011). Neurally, a dorsal frontoparietal network, including bilateral frontal eye field (FEF) and the superior and inferior parietal cortex, is involved in the orienting network (Rosen et al., 1999; Corbetta and Shulman, 2002; Mayer et al., 2004; Zhou and Chen, 2008; Fan et al., 2009). The executive network manages the ability to control behavior to achieve intended goals and resolve conflict among alternative responses (Posner and Petersen, 1990). It has generally been measured by the Stroop task, the flanker task, and the Simon task (Umiltá and Nicoletti, 1990; Lu and Proctor, 1995; Botvinick et al., 2001; Fan et al., 2009). At the neural level, the executive function has been associated with anterior cingulate cortex (ACC) and lateral prefrontal cortex (MacDonald et al., 2000; Fan et al., 2005, 2009; Zhou et al., 2011).

Although there has been extensive evidence suggesting the functional and anatomical independence of the executive and the orienting networks, the attention networks need to interact in multiple ways to achieve coherent, goal-directed behaviors (Fuentes, 2004; Fuentes et al., 2012). For example, at the behavioral level, when the Stroop or flanker interference tasks are combined with the manipulation of IOR such that conflicting information can be presented at either the cued or the uncued location, the interference effects are reduced, eliminated or even reversed at the cued location (Fuentes et al., 1999; Vivas and Fuentes, 2001; Vivas et al., 2007). At the neural level, when spatial IOR, a mechanism associated with the orienting network, was orthogonally combined with non-spatial IOR, a mechanism associated with the executive network, the orienting and the executive networks interacted and compensated for each other in biasing the attention system for novelty (Chen et al., 2010). The orienting network was involved in slowing down responses to the old location only when the non-spatial IOR mechanism in the executive network was not operative (i.e., when the non-spatial feature of the target was novel); the prefrontal executive network was involved in slowing down responses to the old non-spatial representation only when the spatial IOR mechanism in the orienting network was not functioning (i.e., when the target appeared at a novel location).

One exception to the above findings, however, arises when spatial IOR is combined with the Simon effect. Although previous studies found that the Stroop and the flanker conflicts were reduced or even reversed at the inhibited (cued) location of spatial IOR, an effect attributed to an executive-dependent inhibitory tagging mechanism (Fuentes et al., 1999, 2012; Vivas and Fuentes, 2001; Fuentes, 2004), the Simon conflicts were significantly increased at the cued location (Lupiáñez et al., 1997; Pratt et al., 1997; Ivanoff et al., 2002; Hilchey et al., 2011). The Simon effect refers to the phenomenon that, even when the spatial location of stimuli is task-irrelevant, participants' responses are slower when the spatial location of the stimuli is contralateral to the predefined location of the response (i.e., incongruent condition) than when it is ipsilateral (i.e., congruent condition) (Umiltá and Nicoletti, 1990; Lu and Proctor, 1995). For example, in a color discrimination task, one of two color stimuli is presented on either the left or the right side of the computer screen, and participants are instructed to press the left-side key in response to one color and the right-side key in response to the other color. Although the spatial location of the color stimulus is irrelevant to the color discrimination task, participants' responses are slower when the spatial position of the color stimulus (left or right) is contralateral to the position of the response key (left or right) than when it is ipsilateral.

The "amplification of the Simon effect by IOR" can be interpreted as the consequence of the inhibitory mechanism: when stimuli fall at inhibited (cued) locations, access to the response system from the task-relevant dimension of the target (e.g., color) is hindered such that the competition from the task-irrelevant dimension of the target (e.g., location) is increased. On this account, the increased Simon effect at the cued location arises because the incongruent condition is particularly affected at that inhibited location. A similar but slightly different interpretation is that IOR delays both codes activated by the target (the task-relevant identity code and the task-irrelevant location code). However, the delaying effect of IOR on spatial processing (localization) is much greater than its effect on non-spatial processing (Hilchey et al., 2011), so that the responses in the "Cued\_Incongruent" condition are significantly delayed compared with the "Cued\_Congruent" condition.

However, an alternative hypothesis, the shared spatial representation account, cannot be rejected. In contrast to the Stroop effect and the flanker effect, in which the conflicts are induced between two non-spatial semantic representations, the conflicts in the Simon effect are between the response-related spatial representation activated by the task-relevant dimension of the target (e.g., color) and the response-related spatial representation activated by the task-irrelevant dimension (e.g., location) of the target. Therefore, if the Simon task is combined with the spatial IOR task, the shared spatial representation system could be activated when the aforementioned spatial representations coincide, especially when a congruent stimulus is presented at the cued location. Specifically speaking, when the target appears at the cued location and the cued location is on the same side as the response key required by the target, the shared spatial representation between the cued location of spatial IOR and the position of the response key in the Simon task could cause significantly faster responses (i.e., a facilitatory effect) in the "Cued\_Congruent" condition compared with the "Cued\_Incongruent" condition, indirectly resulting in the increased size of the Simon effect observed at the cued location.

In the present event-related fMRI study, we orthogonally combined the spatial IOR procedure with the Simon task. We aimed to investigate the neural correlates of the Simon effect and spatial IOR, and to explore the neural substrates underlying the increased size of the Simon effect at the cued location by examining the two alternative hypotheses, the inhibitory hypothesis and the shared spatial representation hypothesis. If the inhibitory hypothesis is correct, we should expect higher activation in prefrontal areas in the incongruent than in the congruent condition at the cued location, in comparison with neural activation at the uncued location. If the shared spatial representation hypothesis is correct, we should find areas of shared spatial representation [e.g., the posterior parietal cortex (PPC); Haxby et al., 1991; Sack, 2009] between the spatial task and the conflict task. In addition, we should find higher neural activation in the congruent than in the incongruent condition at the cued location, in comparison with neural activation at the uncued location.

## **MATERIALS AND METHODS**

## **PARTICIPANTS**

Sixteen undergraduate students (9 males and 7 females, 24 ± 3 years old) participated in the present study. They were all right-handed and had normal or corrected-to-normal visual acuity. None of them had a history of neurological or psychiatric disorders. All participants gave informed consent prior to the experiment in accordance with the Declaration of Helsinki, and the study was approved by the ethics committee of the School of Psychology, South China Normal University.

## **STIMULI AND EXPERIMENTAL DESIGN**

The stimuli were presented through an LCD projector onto a rear projection screen located behind the participant's head. Participants viewed the screen through an angled mirror on the head-coil. Each trial consisted of a series of displays (**Figure 1**). The default display included three horizontally arranged white boxes (1.9° × 1.9° visual angle) on a black background. The center-to-center distance between two adjacent boxes was 7.4° in visual angle. Participants were instructed to fixate on the central box throughout the experiment. At the beginning of each trial, the outlines of one of the two peripheral boxes became thicker and brighter for 100 ms, serving as a cue to attract spatial attention to one of the peripheral locations. The cue was uninformative with regard to the location of the target, i.e., the target appeared at the cued location in 50% of the total trials. After an interval of 200 ms, the outlines of the central box became thicker for 100 ms, serving as a central cue to attract attention from the cued peripheral location to the center. After another interval of 300, 400, or 500 ms, the target (a blue or yellow patch) appeared in either the cued or the uncued peripheral box for 150 ms. Note that the purpose of using variable cue–target SOAs was to prevent participants from forming time-based expectations for the target.
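The cue–target SOAs implied by this trial sequence can be worked out from the stated durations. The sketch below assumes that the events follow one another without additional gaps and that the SOA is measured from peripheral-cue onset to target onset; the constant and function names are illustrative and do not come from the study.

```python
# Event durations in ms, taken from the trial description above.
# Assumption: events are presented back to back, with no extra gaps.
PERIPHERAL_CUE_MS = 100
CUE_TO_CENTRAL_CUE_MS = 200
CENTRAL_CUE_MS = 100
CENTRAL_CUE_TO_TARGET_MS = (300, 400, 500)


def cue_target_soas():
    """Return target onset times relative to peripheral-cue onset (ms)."""
    fixed = PERIPHERAL_CUE_MS + CUE_TO_CENTRAL_CUE_MS + CENTRAL_CUE_MS
    return [fixed + gap for gap in CENTRAL_CUE_TO_TARGET_MS]


soas = cue_target_soas()
print(soas)  # [700, 800, 900]
```

Under these assumptions every SOA is well above the 300 ms range at which IOR is typically reported to emerge.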

While lying in the scanner, participants held a response pad in each hand, and the two response pads were positioned on the left and right sides of the body. The behavioral task was to discriminate the color of the target, irrespective of the location of the target. Participants were instructed to press one button with the thumb of one hand if the color of the target was blue, and the other button with the thumb of the other hand if the color of the target was yellow. The mapping between the two response hands and the color of the target was counterbalanced across participants. The spatial location of the target, though irrelevant to the color discrimination task, could be either congruent (i.e., ipsilateral) or incongruent (i.e., contralateral) with the side of the response hand.

Therefore, the present experimental design was a 2 (cue validity: cued vs. uncued) × 2 (Simon congruency: congruent vs. incongruent) event-related fMRI factorial design. There were four experimental conditions in the factorial design and 48 trials for each condition. In total, there were 256 trials, consisting of 192 experimental trials and 64 null trials. In the null trials, only the default display was presented. The inter-trial intervals (ITIs) were jittered from 2200 to 3200 ms (2200, 2450, 2700, 2950, and 3200 ms) with a mean ITI of 2700 ms. All participants completed a training session of 6 min outside the scanner before scanning.
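The trial counts and the mean ITI can be checked with a small script. This is only a sketch of how such a trial list might be assembled: the random shuffling, the seed, and the use of `(None, None)` to mark null trials are assumptions made for illustration, since the source does not describe how the trial order was generated.

```python
import itertools
import random

CUE_VALIDITY = ("cued", "uncued")
CONGRUENCY = ("congruent", "incongruent")
TRIALS_PER_CONDITION = 48
N_NULL_TRIALS = 64
ITI_LEVELS_MS = (2200, 2450, 2700, 2950, 3200)
NULL_TRIAL = (None, None)  # hypothetical marker for null trials


def build_trial_list(seed=0):
    """Return a shuffled list of (cue_validity, congruency) tuples plus null trials."""
    trials = [
        condition
        for condition in itertools.product(CUE_VALIDITY, CONGRUENCY)
        for _ in range(TRIALS_PER_CONDITION)
    ]
    trials += [NULL_TRIAL] * N_NULL_TRIALS
    random.Random(seed).shuffle(trials)  # assumed randomization scheme
    return trials


trials = build_trial_list()
n_experimental = sum(1 for trial in trials if trial != NULL_TRIAL)
mean_iti_ms = sum(ITI_LEVELS_MS) / len(ITI_LEVELS_MS)
print(len(trials), n_experimental, mean_iti_ms)  # 256 192 2700.0
```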

## **STATISTICAL ANALYSIS OF BEHAVIORAL DATA**

Incorrect responses and RTs more than three standard deviations (SD) above or below the mean RT were excluded from further analysis. Mean RTs and error rates were then calculated and submitted to a 2 (cue validity: cued vs. uncued) × 2 (Simon congruency: congruent vs. incongruent) repeated-measures ANOVA. Significant effects were further examined by planned *t*-tests.
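The RT trimming rule can be sketched as follows. The source does not specify whether the mean and SD were computed per participant, per condition, or across all trials, nor whether the sample or population SD was used; the sketch assumes a single list of correct-trial RTs and the sample SD. The RT values are invented for illustration.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=3.0):
    """Keep RTs within mean +/- n_sd * SD of the supplied RTs (sample SD assumed)."""
    centre, spread = mean(rts), stdev(rts)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread
    return [rt for rt in rts if lower <= rt <= upper]


# Hypothetical correct-trial RTs (ms): 29 typical values and one extreme value.
rts = [500 + 5 * (i % 7) for i in range(29)] + [1900]
trimmed = trim_rts(rts)
print(len(rts), len(trimmed))  # 30 29
```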

#### **DATA ACQUISITION AND PRE-PROCESSING**

A 3T Siemens Trio system with a standard head coil (Erlangen, Germany) was used to obtain T2\*-weighted echo-planar images (EPI) with blood oxygenation level-dependent (BOLD) contrast (matrix size: 64 × 64, voxel size: 3.1 × 3.1 × 3.0 mm<sup>3</sup>). Thirty-six transversal slices of 3 mm thickness that covered the whole brain were acquired sequentially with a 0.3 mm gap (*TR* = 2.2 s, *TE* = 30 ms, *FOV* = 220 mm, flip angle = 90°). The single functional run comprised 330 EPI volumes, and the first five volumes were discarded to allow for T1 equilibration effects.
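As a quick arithmetic check on the acquisition parameters, the run length follows directly from the number of volumes and the TR. The sketch below uses only the values stated above; the function name is illustrative.

```python
TR_MS = 2200        # repetition time in ms (TR = 2.2 s)
N_VOLUMES = 330     # EPI volumes acquired in the single run
N_DISCARDED = 5     # initial volumes dropped for T1 equilibration


def run_summary():
    """Return (total run time in s, volumes analysed, analysed time in s)."""
    total_s = N_VOLUMES * TR_MS / 1000
    analysed = N_VOLUMES - N_DISCARDED
    return total_s, analysed, analysed * TR_MS / 1000


print(run_summary())  # (726.0, 325, 715.0)
```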

Data were pre-processed with Statistical Parametric Mapping software SPM8 (Wellcome Department of Imaging Neuroscience, London, http://www.fil.ion.ucl.ac.uk). Images were realigned to the first volume to correct for inter-scan head movements. Then, the mean EPI image of each subject was computed and spatially normalized to the MNI single subject template using the "unified segmentation" function in SPM8. This algorithm is based on a probabilistic framework that enables image registration, tissue classification, and bias correction to be combined within the same generative model. The resulting parameters of a discrete cosine transform, which define the deformation field necessary to move individual data into the space of the MNI tissue probability maps, were then combined with the deformation field transforming between the latter and the MNI single subject template. The ensuing deformation was subsequently applied to individual EPI volumes. All images were thus transformed into standard MNI space and re-sampled to 2 × 2 × 2 mm<sup>3</sup> voxel size. The data were then smoothed with a Gaussian kernel of 8 mm full-width at half-maximum to accommodate inter-subject anatomical variability.

#### **STATISTICAL ANALYSIS OF IMAGING DATA**

Data were high-pass-filtered at 1/128 Hz and were then analyzed with a general linear model (GLM) as implemented in SPM8. Temporal autocorrelation was modeled using an AR(1) process. At the individual level, the GLM was used to construct a multiple regression design matrix that included four experimental events: (1) the target appeared at the cued location, and its response hand was ipsilateral to its location (Cued\_Congruent); (2) the target appeared at the cued location, and its response hand was contralateral to its location (Cued\_Incongruent); (3) the target appeared at the uncued location, and its response hand was ipsilateral to its location (Uncued\_Congruent); (4) the target appeared at the uncued location, and its response hand was contralateral to its location (Uncued\_Incongruent). The four events were time-locked to the target of each trial by a canonical synthetic hemodynamic response function (HRF) and its temporal and dispersion derivatives, with an event duration of 0 s. The inclusion of the dispersion derivatives took into account the different durations of neural processes induced by the variable cue–target intervals and allowed for changes in dispersion of the BOLD responses induced by different cue–target intervals. Additionally, all instructions, omissions, and error trials were separately modeled as regressors of no interest. Parameter estimates were subsequently calculated for each voxel using weighted least-squares to provide maximum likelihood estimators based on the temporal autocorrelation of the data. No global scaling was applied.

For each participant, simple main effects for each of the four experimental conditions were computed by applying appropriate baseline contrasts [i.e., the experimental conditions vs. implicit baseline (null trials) contrasts]. The four first-level individual contrast images were then fed into a 2 × 2 within-participants ANOVA at the second group level employing a random-effects model (the flexible factorial design in SPM8 including an additional factor modeling the subject means). In the modeling of variance components, we allowed for violations of sphericity by modeling non-independence across parameter estimates from the same subject, and allowed for unequal variances between conditions and between subjects using the standard implementation in SPM8. Areas of activation in the main effects and the interaction effects were identified as significant only if they passed a conservative threshold of *P* < 0.001, corrected for multiple comparisons at the cluster level with an underlying voxel level of *P* < 0.001, uncorrected (Poline et al., 1997).
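The main-effect and interaction contrasts of a 2 × 2 design can be illustrated with weight vectors over the four conditions. The sketch below is a simplified illustration and not the SPM8 flexible factorial model used in the study: it ignores the subject regressors, and the parameter estimates are hypothetical values chosen only to show the arithmetic.

```python
# Condition order follows the four events listed in the first-level model.
CONDITIONS = (
    "Cued_Congruent",
    "Cued_Incongruent",
    "Uncued_Congruent",
    "Uncued_Incongruent",
)

CONTRASTS = {
    "cued > uncued": (1, 1, -1, -1),
    "congruent > incongruent": (1, -1, 1, -1),
    # Cued (Congruent > Incongruent) > Uncued (Congruent > Incongruent)
    "interaction": (1, -1, -1, 1),
}


def apply_contrast(weights, estimates):
    """Return the weighted sum of condition estimates for one contrast."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical parameter estimates for a single voxel, in CONDITIONS order.
betas = (2.0, 1.0, 1.5, 1.5)
for name, weights in CONTRASTS.items():
    print(name, apply_contrast(weights, betas))
```

With these invented values the interaction contrast is positive because the congruent-minus-incongruent difference exists only at the cued location.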

## **RESULTS**

#### **BEHAVIORAL DATA**

Mean RTs in the four experimental conditions were submitted to a 2 (cue validity: cued vs. uncued) × 2 (Simon congruency: congruent vs. incongruent) repeated-measures ANOVA (**Figure 2A**). The main effect of cue validity was significant, *F*(1, 15) = 31.91, *p* < 0.001, η<sup>2</sup> = 0.68, indicating that RTs to the cued targets (531 ± 20 ms) were significantly slower than RTs to the uncued targets (501 ± 18 ms), i.e., a significant IOR effect. The main effect of Simon congruency was also significant, *F*(1, 15) = 22.31, *p* < 0.001, η<sup>2</sup> = 0.60, indicating that RTs in the congruent condition (502 ± 19 ms) were significantly faster than RTs in the incongruent condition (530 ± 19 ms), i.e., a significant Simon effect. More importantly, the interaction between cue validity and Simon congruency was significant, *F*(1, 15) = 10.15, *p* = 0.006, η<sup>2</sup> = 0.40 (**Figure 2A**). Planned paired *t*-tests on simple effects further showed that, on the one hand, the Simon effect was significantly larger at the cued location (40 ± 30 ms) than at the uncued location (17 ± 26 ms), *t*(15) = 3.186, *p* = 0.006. On the other hand, IOR was significantly larger in the incongruent condition (41 ± 27 ms) than in the congruent condition (18 ± 23 ms), *t*(15) = 3.186, *p* = 0.006. The error rates (**Figure 2B**) showed the same pattern as the RTs, but a further 2 × 2 repeated-measures ANOVA showed that neither the main effects of cue validity and Simon congruency nor the interaction effect was significant (all *p* > 0.1).
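The reported statistics are internally consistent, which can be verified with two standard identities: partial η² = F·df_effect / (F·df_effect + df_error), and, for a 2 × 2 within-participants interaction, F(1, df) = t². The sketch below assumes that the reported η² values are partial eta squared, which the source does not state explicitly.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Reported (F, eta squared) pairs, all with df = (1, 15).
reported = {
    "cue validity": (31.91, 0.68),
    "Simon congruency": (22.31, 0.60),
    "interaction": (10.15, 0.40),
}

for effect, (f_value, eta) in reported.items():
    recovered = round(partial_eta_squared(f_value, 1, 15), 2)
    print(effect, recovered, eta)

# The paired t-test on the interaction should satisfy t squared = F.
t_interaction = 3.186
print(round(t_interaction ** 2, 2))  # 10.15
```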

## **IMAGING DATA**

**Figure 2 |** Mean RTs **(A)** and error rates **(B)**.

### *Common and specific neural correlates underlying spatial IOR and the Simon effect*

We first identified brain regions associated with the cue validity of spatial IOR. Right PPC, extending inferior to right middle occipital cortex and superior to bilateral superior parietal cortex, showed significantly higher neural activity to targets at the cued location than at the uncued location, i.e., the main effect contrast "Cued (Congruent + Incongruent) > Uncued (Congruent + Incongruent)" (**Figure 3A**; **Table 1A**). No significant activation was found in the reverse contrast. We then calculated the brain regions activated by the main effect of the Simon congruency. Bilateral inferior frontal gyrus, bilateral middle occipital gyrus extending to right superior occipital cortex and left superior parietal cortex, and right middle temporal cortex showed significantly higher neural activity in the congruent condition than in the incongruent condition, i.e., the main effect contrast "Congruent (Cued + Uncued) > Incongruent (Cued + Uncued)" (**Figure 3B**; **Table 1B**). No significant activation was found in the reverse contrast.

Since the neural network involved in the main effect of spatial cue validity (**Figure 3A**) and the neural network involved in the main effect of the Simon congruency (**Figure 3B**) partly overlapped, we further performed a conjunction analysis and an exclusive masking procedure between them in order to isolate the common and specific neural correlates underlying the two main effects. The conjunction analysis between the main effect of spatial cue validity (cued > uncued) and the main effect of Simon congruency (congruent > incongruent) showed significant activations in bilateral PPC extending to bilateral middle occipital gyrus (**Figure 3C**; **Table 1C**).
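One common way to implement such a conjunction is the minimum-statistic approach, in which a voxel counts as jointly activated only if it exceeds the threshold in both contrasts. The source does not state which conjunction variant was used, so the following is a sketch under that assumption; the t-values and the threshold are hypothetical.

```python
def conjunction(t_map_a, t_map_b, threshold):
    """Flag voxels whose smaller t-value across the two contrasts exceeds threshold."""
    return [min(a, b) > threshold for a, b in zip(t_map_a, t_map_b)]


# Hypothetical t-values for five voxels in each main-effect contrast.
t_cued_gt_uncued = [4.2, 5.1, 1.0, 3.9, 0.2]
t_congruent_gt_incongruent = [4.5, 0.8, 4.4, 3.8, 0.1]
T_THRESHOLD = 3.73  # hypothetical voxel-level threshold

shared = conjunction(t_cued_gt_uncued, t_congruent_gt_incongruent, T_THRESHOLD)
print(shared)  # [True, False, False, True, False]
```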

**Table 1 | Brain regions showing significant relative increases of BOLD response associated with the cue validity (cued vs. uncued) and the Simon congruency (congruent vs. incongruent).**


*The coordinates (x, y, z) correspond to MNI coordinates. Displayed are the coordinates of the maximally activated voxel within a significant cluster as well as the coordinates of relevant local maxima within the cluster (in italics).*

To isolate the brain regions that were significantly involved in the main effect of spatial cue validity, but not in the main effect of the Simon congruency, the main effect contrast "Cued > Uncued" was exclusively masked by the mask contrast "Congruent > Incongruent" at a liberal threshold of *p* < 0.05, uncorrected for multiple comparisons. In this way, those voxels that reached a level of significance at *p* < 0.05 (uncorrected) in the mask contrast were excluded from the analysis. Bilateral superior parietal cortex was exclusively involved in the main effect of spatial cue validity, rather than the main effect of Simon congruency (**Figure 4A**; **Table 2A**).

To isolate the brain regions which were involved only in the main effect of the Simon congruency, but not in the main effect of cue validity, the main effect contrast "Congruent > Incongruent" was exclusively masked by the mask contrast "Cued > Uncued." Bilateral middle frontal gyrus was exclusively activated by the main effect of the Simon congruency, rather than by the main effect of cue validity (**Figure 4B**; **Table 2B**).
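The logic of exclusive masking can be sketched per voxel: a voxel survives only if it is significant in the contrast of interest and does not reach the mask threshold in the other contrast. The function name and the example values below are hypothetical and only illustrate this rule; they are not taken from the study or from SPM8.

```python
def exclusive_mask(main_significant, mask_p_values, mask_alpha=0.05):
    """Keep voxels significant in the main contrast whose mask p-value is not below mask_alpha."""
    return [
        significant and p >= mask_alpha
        for significant, p in zip(main_significant, mask_p_values)
    ]


# Hypothetical data for four voxels.
main_significant = [True, True, False, True]  # e.g., "Cued > Uncued"
mask_p = [0.30, 0.01, 0.50, 0.049]            # e.g., "Congruent > Incongruent"

specific = exclusive_mask(main_significant, mask_p)
print(specific)  # [True, False, False, False]
```

Note that a liberal mask threshold makes the exclusion stricter, because more voxels reach significance in the mask contrast and are therefore removed.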

#### *Neural interaction between spatial IOR and the Simon effect*

Left precentral gyrus (MNI: −40, 6, 48; *t* = 5.30, 576 voxels) was significantly activated by the neural interaction contrast "Cued (Congruent > Incongruent) > Uncued (Congruent > Incongruent)" (**Figure 5**). Parameter estimates in the four experimental conditions were extracted from the activated cluster. Planned paired *t*-tests on simple effects suggested that neural activity was significantly increased in the congruent condition compared to the incongruent condition when the targets appeared at the cued location, *t*(15) = 5.91, *p* < 0.001, while there was no significant difference between the congruent and incongruent conditions when the targets appeared at the uncued location, *p* > 0.1. No significant activation was found in the reverse interaction contrast.

## **DISCUSSION**

In the present fMRI study, we aimed to further investigate the interactions between spatial inhibitory processes, indexed by the spatial-based IOR phenomenon, and response-based conflict processes, indexed by the Simon task. Our previous research showed that inhibitory mechanisms triggered by presenting conflicting stimuli at locations subject to spatial IOR caused striking patterns of interactions. Concretely, Stroop and flanker interference effects were reduced, eliminated or even reversed at the cued (inhibited) location (for recent reviews, see Fuentes, 2004; Fuentes et al., 2012). However, contrary to the Stroop and flanker tasks, the Simon and the IOR procedures activate response-related spatial representations, which might be responsible for the specific pattern of interactions observed when both procedures are combined in a single experiment: the Simon interference effect is increased when stimuli are presented at the cued location (see Ivanoff et al., 2002; Hilchey et al., 2011). We replicated that pattern of interaction at the behavioral level by showing increased size of Simon effects at the cued locations. This phenomenon might arise either from delayed responses in the "Cued\_Incongruent" condition (inhibitory hypothesis) or from facilitated responses in the "Cued\_Congruent" condition (shared spatial representation hypothesis). Our neural data supported the latter interpretation and revealed the neural mechanisms of the interaction between the Simon and IOR effects.

**Table 2 | Brain regions showing significant relative increases of BOLD response associated with the cue validity (cued vs. uncued) and the Simon congruency (congruent vs. incongruent).**

*The coordinates (x, y, z) correspond to MNI coordinates. Displayed are the coordinates of the maximally activated voxel within a significant cluster as well as the coordinates of relevant local maxima within the cluster (in italics).*


Regarding the behavioral data, previous research has reported Simon congruency effects of around 20 ms (De Jong et al., 1994; Vallesi et al., 2005; Nishimura and Yokosawa, 2010), and spatial IOR of around 40 ms (Posner and Cohen, 1984; Klein, 2000) when each task is used in isolation. In our combined procedure, we report 17 ms for Simon congruency at the uncued location (Uncued\_Incongruent > Uncued\_Congruent) and 41 ms for IOR in the incongruent condition (Cued\_Incongruent > Uncued\_Incongruent). Thus, uncued locations and incongruent stimuli, the combined conditions that do not share any spatial representation, behaved as the standard conditions for each task, producing effect sizes in the standard range. Importantly, IOR was reduced to 18 ms as a consequence of response facilitation in the congruent trials when presented at the cued location (Cued\_Congruent > Uncued\_Congruent), which concurrently produced an increase in the Simon effect to 40 ms (Cued\_Incongruent > Cued\_Congruent). In brief, it seemed that the facilitated responses in the Cued\_Congruent condition, rather than delayed responses in the Cued\_Incongruent condition, produced the significant interaction between spatial IOR and the Simon effect. Note that in the present study we did not compare both conditions to a neutral condition, so a direct assessment of whether the aforementioned interaction pattern is better accounted for in terms of facilitation or inhibition is not possible. However, on the basis of the effect sizes observed in both cued-uncued locations and congruent-incongruent conditions, our present results support the shared spatial representation hypothesis. This is further supported by the neural results, as we will discuss later on.

The oculomotor theory of IOR emphasizes the link between IOR and the oculomotor system: the peripheral cue produces an automatic activation of an eye movement to that location, which generates IOR (Rafal et al., 1989; Kingstone and Pratt, 1999; Klein, 2000). Moreover, both spatial IOR and the Simon effect can be influenced by eye movements (Abrahamse and Van der Lubbe, 2008; Buetti and Kerzel, 2010; Khalid and Ansorge, 2013). In the present study, in order to minimize the effects of eye movements, we instructed the participants to fixate on the central box throughout the experiment. Due to technical limitations, however, we could not track eye movements during fMRI scanning.
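
As a quick consistency check on the behavioral effect sizes reported above for the combined procedure, the four simple effects can be compared arithmetically. The sketch below only restates the millisecond values given in the text; the variable names are illustrative and not taken from the original analysis.

```python
# Simple effects (ms) as reported in the text for the combined procedure.
simon_uncued = 17      # Uncued_Incongruent - Uncued_Congruent
simon_cued = 40        # Cued_Incongruent   - Cued_Congruent
ior_incongruent = 41   # Cued_Incongruent   - Uncued_Incongruent
ior_congruent = 18     # Cued_Congruent     - Uncued_Congruent

# Both routes from Uncued_Congruent to Cued_Incongruent should give the same total.
total_via_uncued_simon = simon_uncued + ior_incongruent
total_via_congruent_ior = ior_congruent + simon_cued

# The interaction is the same quantity whichever effect is differenced.
interaction_from_simon = simon_cued - simon_uncued
interaction_from_ior = ior_incongruent - ior_congruent

print(total_via_uncued_simon, total_via_congruent_ior)
print(interaction_from_simon, interaction_from_ior)
```

The two totals agree, and the interaction has the same size whether it is read as a larger Simon effect at the cued location or as a smaller IOR effect in congruent trials.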

Regarding neural data, we replicated brain activations that had been associated with either spatial IOR or Simon effects. Spatial IOR specifically activated the bilateral superior parietal cortex (**Figure 4A**). This finding was consistent with prior ERP and fMRI studies on spatial IOR (Zhou and Chen, 2008; Tian et al., 2011). Within the dorsal frontoparietal network, the bilateral superior parietal cortex plays an important role in voluntarily or involuntarily orienting visuospatial attention between spatial representations of external locations (Ungerleider and Mishkin, 1982; Goodale and Milner, 1992). For example, neuropsychological studies have shown that patients with superior parietal lesions were impaired in detecting the displacement of a visual stimulus and showed erratic fixation patterns in attention tasks (Phan et al., 2000; Vandenberghe et al., 2012). Neuroimaging studies with healthy adults further showed that the activity of the superior parietal cortex exhibited transient enhancement when attention was shifted between spatial locations (see Behrmann et al., 2004).

On the other hand, the Simon task specifically activated the bilateral middle frontal cortex (**Figure 4B**). Previous neuroimaging studies suggest that, compared to the congruent condition, a frontoparietal network is activated in the incongruent condition (Maclin et al., 2001; Fan et al., 2003a; Liu et al., 2004). For example, in the study by Liu et al. (2004), an arrow pointing upwards or downwards was presented on the left or right side of a central fixation point. Participants responded to one arrow with the index finger (left-most) and to the other with the middle finger (right-most) of their right hand. The incongruent condition, compared to the congruent condition, significantly activated the ACC, the dorsolateral prefrontal cortex (DLPFC), the precuneus, and the pre-supplementary motor area (pre-SMA). According to the conflict-monitoring theory (Botvinick et al., 1999, 2001, 2004), the ACC is responsible for monitoring conflict and response errors, whereas the DLPFC, which receives signals from the ACC, would be involved in modulating processing in the PPC by biasing the system toward the task-relevant information.

Our conjunction results provide unequivocal support for the shared spatial representation theory. The conjunction analysis between the two main effect contrasts "Cued > Uncued" and "Congruent > Incongruent" showed that the PPC in both hemispheres is the common site responsible for the interaction between IOR and the Simon effect (**Figure 3C**). The PPC is part of the dorsal visual stream involved in coding the spatial location of a stimulus ("where"), in contrast to the ventral visual stream, which is mainly devoted to the perceptual identification of objects ("what") (Ungerleider and Mishkin, 1982; Goodale et al., 1991; Goodale and Milner, 1992; Milner and Goodale, 1995, 2008). A large body of brain imaging literature points to a particular role for the PPC in multiple space representations (Kesner, 2009; Sack, 2009) and spatial cognition (Haxby et al., 1991; Colby and Goldberg, 1999; Landis, 2000; Marshall and Fink, 2001; Sack, 2009). For example, Andersen et al. (1997) argued that by using a specific gain mechanism, the PPC may combine different coordinate frames coming from various input spatial signals into common distributed spatial representations. Previous Transcranial Magnetic Stimulation (TMS) studies have clearly shown that the PPC plays a crucial role in spatial representation in both the IOR and the Simon task. For example, TMS over areas of the right PPC has proven able to disrupt manual IOR (Chica et al., 2011) and IOR spatial remapping (van Koningsbruggen et al., 2010). Similarly, TMS over areas of the right PPC produced a reduction of the Simon effect (Schiff et al., 2008). In these studies, the results were interpreted as a disruption of the spatial representation. In the present study, both spatial IOR and the Simon task activated spatial representations from left and right locations. In spatial IOR, visuospatial attention is oriented/reoriented between spatial representations of the two cue locations; in the Simon task, the task-irrelevant spatial locations where targets can be presented are either congruent or incongruent with the task-relevant response codes. Importantly, it is only the "Cued\_Congruent" condition in which spatial IOR and the Simon effect may share the same spatial representation in the PPC, resulting in the observed behavioral response facilitation in that condition.

Our results were not consistent with the inhibitory theory. According to that theory, the increased Simon effect at the cued location is due to responses in the incongruent condition being delayed at that inhibited location. Thus, we should have found higher neural activation in the "Cued\_Incongruent" condition than in the "Cued\_Congruent" condition, since the former involves more conflict. Instead, the bilateral middle frontal gyrus showed higher neural activity in the congruent than in the incongruent condition. In fact, one key difference between the classical Simon task and the one used in the present study is that the Simon stimuli were preceded by a spatial cue. Thus, once the attention orienting process, which shares the PPC neural network with the Simon effect, is evoked by the peripheral cue prior to the occurrence of the target, the attentional control set adopted by the bilateral middle frontal cortex might be matching the activated spatial representations with the response codes, in order to maximize the efficiency of behavioral responses. Therefore, whenever there was a match between the oriented spatial representations and the response codes, the bilateral frontal cortex detected it and showed higher neural activity (**Figure 4B**). That only occurs in the "Cued\_Congruent" condition.

In line with the previous contention, the neural interaction contrast "Cued (Congruent > Incongruent) > Uncued (Congruent > Incongruent)" suggests that the left superior precentral gyrus was significantly activated by the neural interaction between spatial IOR and the Simon effect, showing significantly enhanced neural activity in the "Cued\_Congruent" condition (**Figure 5**). Due to its topographic organization, the precentral gyrus (also known as the primary sensorimotor cortex) is traditionally considered the cortical area for voluntary movement (Ugur et al., 2005). More importantly, the superior region of the precentral gyrus is significantly involved in hand representation, object manipulation (Sastre-Janer et al., 1998; Boling et al., 1999; Rose et al., 2012) and motor execution (Stippich et al., 2002). Furthermore, the connectivity strength between the precentral and the postcentral gyrus is positively correlated with hand motor performance (Rose et al., 2012). In another fMRI study, shorter reaction times with finger button presses were found along with greater activation of the supplementary motor area and right frontal opercular cortex (Klöppel et al., 2007). These findings suggest that higher neural activity in the premotor cortex may facilitate the behavioral response. In the present study, the left precentral gyrus showed higher neural activity in the "Cued\_Congruent" condition, in correspondence with the facilitated behavioral responses observed in that condition. As shown above, this facilitation produced larger Simon effects at the cued than at the uncued location.

Taken together, by combining spatial IOR with the Simon task, we not only replicated the previous observation of larger Simon effects at the cued location of spatial IOR, but also revealed the neural mechanisms underlying this phenomenon. The key results were consistent with the shared spatial representation hypothesis. When the target appeared at the cued location and the cued location was congruent with the response code, the spatial representation system in the PPC shared between spatial IOR and the Simon effect was activated. In addition, the sensorimotor system in the precentral gyrus showed significantly enhanced neural activity, which caused significantly faster responses (i.e., a facilitatory effect) in the "Cued\_Congruent" condition compared with the "Cued\_Incongruent" condition, indirectly resulting in the increased size of the Simon effect observed at the cued location.

## **ACKNOWLEDGMENTS**

We are grateful to all our volunteers. The research reported here was supported by grants from the Natural Science Foundation of China (30970895, 31070994, 31371127) and by grants from the Spanish Ministry of Economy and Competitiveness (CSD2008-00048, PSI2010-09551-E, PSI2011-23340/PSIC). Qi Chen is supported by the Foundation for the Author of National Excellent Doctoral Dissertation of P. R. China (200907) and by the Program for New Century Excellent Talents in the University of China.

### **REFERENCES**

of hand motor activation in Broca's pli de passage moyen. *J. Neurosurg.* 91, 903–910. doi: 10.3171/jns.1999.91.6.0903


in parietal cortex. *Annu. Rev. Neurosci.* 22, 319–349. doi: 10.1146/annurev.neuro.22.1.319


*Percept. Perform.* 20, 731–750. doi: 10.1037/0096-1523.20.4.731


(1991). A neurological dissociation between perceiving objects and grasping them. *Nature* 349, 154–156. doi: 10.1038/349154a0


lesions. *Spat. Vis.* 13, 179–191. doi: 10.1163/156856800741199


*Neurol.* 64(Suppl. 2), S48–S52. doi: 10.1016/j.surneu.2005.07.049


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2013; accepted: 26 August 2013; published online: 17 September 2013.*

*Citation: Wang P, Fuentes LJ, Vivas AB and Chen Q (2013) Behavioral and neural interaction between spatial inhibition of return and the Simon effect. Front. Hum. Neurosci. 7:572. doi: 10.3389/fnhum.2013.00572*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Wang, Fuentes, Vivas and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The influence of visuospatial attention on unattended auditory 40 Hz responses

#### *Cullen Roth<sup>1,2,3</sup>, Cota Navin Gupta<sup>3</sup>, Sergey M. Plis<sup>3</sup>, Eswar Damaraju<sup>3</sup>, Siddharth Khullar<sup>3,4</sup>, Vince D. Calhoun<sup>3,5</sup> and David A. Bridwell<sup>3</sup>\**

*<sup>1</sup> Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM, USA*

*<sup>2</sup> Department of Biology, Initiative for Maximizing Student Development, University of New Mexico, Albuquerque, NM, USA*

*<sup>3</sup> The Mind Research Network, University of New Mexico, Albuquerque, NM, USA*

*<sup>4</sup> Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, USA*

*<sup>5</sup> Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA*

#### *Edited by:*

*Joy Geng, University of California Davis, USA*

#### *Reviewed by:*

*Sarah E. Donohue, Otto von Guericke University Magdeburg, Germany Giri P. Krishnan, University of California Riverside, USA*

#### *\*Correspondence:*

*David A. Bridwell, The Mind Research Network, 1101 Yale Blvd. NE, Albuquerque, NM 87106, USA e-mail: dbridwell@mrn.org*

Information must integrate from multiple brain areas in healthy cognition and perception. The present study examined the extent to which cortical responses within one sensory modality are modulated by a complex task conducted within another sensory modality. Electroencephalographic (EEG) responses were measured to a 40 Hz auditory stimulus while individuals attended to modulations in the amplitude of the 40 Hz stimulus, and as a function of the difficulty of the popular computer game Tetris. The steady-state response to the 40 Hz stimulus was isolated by Fourier analysis of the EEG. The response at the stimulus frequency was normalized by the response within the surrounding frequencies, generating the signal-to-noise ratio (SNR). Seven out of eight individuals demonstrate a monotonic increase in the log SNR of the 40 Hz responses going from the difficult visuospatial task to the easy visuospatial task to attending to the auditory stimuli. This pattern is represented statistically by a One-Way ANOVA, indicating significant differences in log SNR across the three tasks. The sensitivity of 40 Hz auditory responses to the visuospatial load was further demonstrated by a significant correlation between log SNR and the difficulty (i.e., speed) of the Tetris task. Thus, the results demonstrate that 40 Hz auditory cortical responses are influenced by an individual's goal-directed attention to the stimulus, and by the degree of difficulty of a complex visuospatial task.

#### **Keywords: EEG, 40 Hz, gamma band, SSAEP, frequency tagging, ICA, tetris, visuospatial attention**

Multiple sensory areas must integrate across the brain in order to facilitate healthy cognition and behavior. Integrating information over multiple modalities may be beneficial in some instances but disadvantageous in others. For example, information from both the auditory and visual cortex provides useful information when listening to a speaker, since each modality facilitates understanding the speaker (Macaluso et al., 2004). Alternatively, complex tasks such as reading depend upon enhanced processing within visuocognitive areas that support reading, while placing less emphasis on the processing of irrelevant senses (e.g., auditory responses to individuals talking nearby) (Welcome and Joanisse, 2012).

Within a single modality, attending to a particular aspect of a feature results in an enhanced response toward that feature and a reduced response toward unrelated aspects of that feature. For example, attending to a direction of motion results in enhanced responses within visual areas sensitive to the attended direction, and reduced responses within visual areas sensitive to the orthogonal direction of motion (Treue and Martinez-Trujillo, 1999; Saenz et al., 2002). The enhancement and suppression of relevant and irrelevant aspects of a feature have been widely demonstrated within a single modality, such as vision, but they appear less prominent across modalities (Talsma et al., 2006). In particular, the degree to which attention toward one modality (e.g., vision) modulates attention to another modality (e.g., audition) appears to depend on the difficulty and complexity of the attended task and the location in which unattended responses are measured (Hein et al., 2007).

A contributing factor to the degree to which auditory and visual tasks compete may be the degree of overlap between the brain areas involved in each task. Hein et al. (2007) demonstrate that frontal, parietal, and middle temporal responses partially overlap between simple visual detection and auditory detection. These overlapping activations indicate common brain areas involved in the perceptual and behavioral response to visual or auditory stimuli. In agreement with this suggestion, the authors show that detecting an auditory target diminishes the subsequent response to visual targets within frontal and middle temporal cortex, as well as within early visual areas. These findings are in qualitative agreement with a monotonic decrease in transient 40 Hz auditory responses when individuals attend to auditory tones, when they are passive, and when they engage in reading (Tiitinen et al., 1993). In addition, the auditory-evoked response has been demonstrated to be sensitive to visuomotor load (Yucel et al., 2005; Haroush et al., 2010), and the response to emotional auditory stimuli is diminished with increasing visual load (Mothes-Lasch et al., 2012). Other studies have failed to demonstrate this effect (Parks et al., 2011), or have demonstrated the opposite pattern of results (Regenbogen et al., 2012).

In the following experiment, we examine the modulation of auditory 40 Hz EEG responses [i.e., steady-state auditory-evoked potentials (SSAEPs)] with attention to the auditory stimuli, and as a function of the difficulty of the popular computer game Tetris. Tetris is a complex visuospatial task which requires visual detection and spatial rotation of the Tetris pieces. The task incorporates sudden visuospatial judgments with the appropriate motor movements to rotate the pieces and move them horizontally along the screen (Sims and Mayer, 2002). The requirement for attentional and perceptual resources within the task provides a promising avenue for testing the influence of attentional and perceptual load on irrelevant sensory processing (Lavie, 2005). We hypothesize that increased task load (e.g., a higher level of difficulty in the game Tetris) will be associated with enhanced recruitment of brain areas toward the visuospatial task and reduced recruitment of areas involved in processing or integrating the unattended auditory input. For further comparison, the magnitude of auditory responses during the Tetris task was compared to the magnitude of responses when individuals attend to the auditory stimuli directly.

## **MATERIALS AND METHODS**

## **PARTICIPANTS**

Nine individuals (all males) between the ages of 21 and 40 volunteered to participate in two sessions consisting of a reference session and a SSAEP session. One individual was excluded from analysis due to experimental error, bringing the total number of subjects to eight. Each individual had normal or corrected-to-normal vision and had no family history of mental illness. Written informed consent was obtained prior to the first session.

## **DESIGN AND STIMULI**

All experimental stimuli were produced and presented on a computer using MATLAB (The MathWorks). A Tetris task was displayed within a figure (6.5° by 12.5° visual angle) in the middle of the screen. The squares that comprised the Tetris pieces were 0.6° by 0.6°. Individuals pressed the left and right arrow keys to move the pieces left and right, respectively, and the spacebar to rotate the pieces counterclockwise. The down arrow key was disabled to ensure that individuals could not advance the piece further down the screen on their own, since this would create variability in the speed of the pieces and the difficulty of the task. Each completed row was worth 10 points regardless of whether one or multiple rows were completed at once in order to encourage individuals to complete rows as quickly as possible. The score was presented to the individual at the top of the screen for the duration of the 5 min trial. The task restarted if the individual received a "game over" before the trial was complete, although the scores were carried over.

The 40 Hz auditory stimulus consisted of a series of 5 ms square waves with a 20 ms pause between each square wave (**Figure 1D**). The intensity of the auditory stimulus was modulated randomly by reducing the amplitude of 1% of the square waves (selected at random) by 75%. These modulations were inserted within the auditory stimuli during all EEG recordings in order to enhance the salience of the auditory stimuli. Individuals were instructed to attend to these modulations when they attended to the auditory stimulus in condition 3. The auditory stimulus was presented to the subject continuously throughout the EEG session via headphones (70 dB; binaural; sampling rate = 44,100 Hz).
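
The stimulus timing described above (5 ms on, 20 ms off, giving a 25 ms period and hence 40 Hz) can be illustrated with a short sketch. This is not the authors' MATLAB code: the function name, the unipolar rectangular pulse shape, the rounding of onsets to the nearest sample, and the fixed random seed are all assumptions made for illustration.

```python
import random

def make_click_train(duration_s=1.0, fs=44100, on_s=0.005, off_s=0.020,
                     deviant_fraction=0.01, deviant_gain=0.25, seed=0):
    """Return (samples, deviant_indices) for a 40 Hz click train.

    Hypothetical helper: each click is a rectangular pulse of length on_s,
    followed by off_s of silence. A fraction of clicks (1% by default) is
    reduced in amplitude by 75%, i.e. scaled by deviant_gain = 0.25.
    """
    rng = random.Random(seed)
    period_s = on_s + off_s                      # 25 ms -> 40 clicks per second
    n_clicks = int(round(duration_s / period_s))
    n_deviants = int(round(deviant_fraction * n_clicks))
    deviants = set(rng.sample(range(n_clicks), n_deviants))

    samples = [0.0] * int(round(duration_s * fs))
    pulse_len = int(on_s * fs)                   # 5 ms is not a whole number of samples at 44.1 kHz
    for k in range(n_clicks):
        start = int(round(k * period_s * fs))    # onset rounded to the nearest sample
        stop = min(start + pulse_len, len(samples))
        gain = deviant_gain if k in deviants else 1.0
        for i in range(start, stop):
            samples[i] = gain
    return samples, sorted(deviants)

samples, deviants = make_click_train(duration_s=10.0)
print(len(samples), len(deviants))
```

Because 25 ms corresponds to 1102.5 samples at 44,100 Hz, the sketch rounds each onset separately so that the average click rate stays at 40 Hz instead of drifting.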

## **TETRIS REFERENCE SESSION**

Each individual participated in an initial session to measure the level of Tetris at which they showed peak performance. This reference session consisted of two blocks of the task. Each block consisted of 5 levels of Tetris (5 min per level) in ascending levels of difficulty. The difficulty of each level was set by adjusting the duration before the piece advanced 0.6° toward the bottom of the figure. The following 5 levels were used: 255, 181, 138, 94, and 50 ms. These levels correspond to movements in the Tetris pieces at approximate frequencies of 3.92, 5.52, 7.25, 10.64, and 20 Hz. The levels were chosen based upon initial pilot testing, which indicated that they capture the range of difficulties for both novice and experienced Tetris players.
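
The correspondence between step durations and movement frequencies stated above follows from taking the reciprocal of each duration. A minimal sketch, with illustrative variable names:

```python
# Step durations (ms) for the five Tetris levels, as listed in the text.
step_durations_ms = [255, 181, 138, 94, 50]

# Movement frequency in Hz is the reciprocal of the step duration in seconds.
step_rates_hz = [round(1000.0 / d, 2) for d in step_durations_ms]

print(step_rates_hz)
```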

Individuals took a brief break after the first 25 min block of 5 trials, then participated in a second identical block. The scores from the two blocks were averaged together and the level with the maximum score was used as the reference level for the speed of the Tetris pieces for that individual in the SSAEP session that followed.

## **SSAEP SESSION**

The SSAEP session began with an initial 5 min practice trial to allow individuals to become reacquainted with the Tetris controls. Individuals were instructed that they would either play Tetris or fixate on Tetris while attending to the auditory stimulus. The 40 Hz auditory stimulus was presented to the subjects throughout the session binaurally with headphones and 40 Hz cortical responses were recorded with EEG (see SSAEP recording and analysis, below). The level of difficulty of Tetris was adjusted by subtracting 50 ms from the individual subject's reference level from the first session (condition 1: difficult, **Figure 1A**) or by adding 200 ms to their reference level (condition 2: easy, **Figure 1B**). (Note that 50 ms was not subtracted from the subject with the reference level of 50 ms, since it was below the threshold of possible values.) The words "Attend Clicks" were displayed over the Tetris game to instruct participants to attend to the modulations of the auditory stimuli (condition 3, **Figure 1C**). During this condition, individuals remained fixated at the center of the screen while the Tetris pieces fell, without rotating, at the same speed as in condition 1 (i.e., the subject's reference level minus 50 ms) (e.g., as depicted in **Figure 1C**). The keyboard controls were disabled and individuals were instructed to attend to the subtle variations in the amplitude of the 40 Hz stimulus. Individuals participated in each of the three conditions three times within a single 45 min experimental session (i.e., 5 min per condition × 3 conditions × 3 repeats per condition). The order of conditions was determined for each individual using a Latin square design. The Latin square consisted of a 3 × 3 matrix with the elements 1, 2, and 3 such that no element repeated within a given row or column. The rows of the square were randomly drawn for each individual to create a 9 element vector of conditions within a given session. This ensures that individuals never participated in the same condition consecutively, and that the order of conditions was randomized across individuals.
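
The difficulty adjustment and the condition ordering described above can be sketched as follows. The particular Latin square, the function names, and the handling of the 50 ms reference level are assumptions for illustration. In addition, the sketch redraws the row order whenever the last element of one row equals the first element of the next; the text does not describe this step, but some row orders of a given square would otherwise place the same condition back to back at a row boundary.

```python
import random

# One possible 3 x 3 Latin square (an assumption; the text does not list the square used).
LATIN_SQUARE = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]

def condition_step_ms(reference_ms, condition, floor_ms=50):
    """Step duration (ms) for a condition, following the rule described in the text."""
    if condition == 2:                 # easy: reference level plus 200 ms
        return reference_ms + 200
    # Conditions 1 (difficult) and 3 (attend clicks) use the same faster speed.
    if reference_ms <= floor_ms:       # 50 ms was not subtracted at the fastest reference level
        return reference_ms
    return reference_ms - 50

def draw_condition_order(rng):
    """Concatenate the rows of the square in random order into a 9-element vector."""
    while True:
        rows = rng.sample(LATIN_SQUARE, k=3)
        order = [c for row in rows for c in row]
        # Assumed extra check: reject orders with a repeat across a row boundary.
        if all(a != b for a, b in zip(order, order[1:])):
            return order

rng = random.Random(0)
print(draw_condition_order(rng))
print([condition_step_ms(138, c) for c in (1, 2, 3)])
```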

## **SSAEP RECORDING AND ANALYSIS**

EEG responses were collected with two 4-channel CamNtech Actiwave™ mobile EEG devices (sampling rate: 256 Hz). The eight electrodes were placed on the subject's scalp at the 10–20 locations F3, Fz, F4, Cz, P3, P4, O1, and O2, with a left mastoid reference. EEG analysis was conducted in MATLAB using custom functions, built-in functions, the Statistics Toolbox, and EEGLAB (Delorme and Makeig, 2004). Temporal independent component analysis (ICA) was conducted on each individual's detrended, bandpass-filtered (∼2–50 Hz) EEG data [extended Infomax (Bell and Sejnowski, 1995; Lee et al., 1999)].

The spectral amplitude of the ICA sources was calculated within 2 s windows with 1 s overlap, generating 298 epochs for each 5 min trial. The epochs were averaged across intervals and conditions. The log amplitude of the spectral response at 40 Hz was normalized by the average log amplitude at 38 and 42 Hz, generating the SNR. This measure ensures that the 40 Hz responses are isolated to the auditory stimulus and attenuates the possibility that broadband muscle artifacts influence the observed results.
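
A minimal sketch of the windowed spectral estimate and SNR described above is given below. Several details are assumptions and not stated in the text: the logarithm is taken as base 10, amplitudes are averaged across windows before the log, and "normalized by" is read as subtracting the mean log amplitude of the two neighboring frequencies from the log amplitude at 40 Hz. The function names are hypothetical, and a single-bin discrete Fourier transform stands in for a full FFT.

```python
import cmath
import math

def split_windows(samples, fs, win_s=2.0, step_s=1.0):
    """Cut a signal into 2 s windows that advance in 1 s steps."""
    win, step = int(win_s * fs), int(step_s * fs)
    return [samples[s:s + win] for s in range(0, len(samples) - win + 1, step)]

def bin_amplitude(window, fs, freq_hz):
    """Amplitude of one DFT bin (a 2 s window gives 0.5 Hz resolution)."""
    n = len(window)
    k = freq_hz * n / fs
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(window))
    return abs(acc) / n

def log_snr(samples, fs=256, target_hz=40.0, neighbors_hz=(38.0, 42.0)):
    """Assumed definition: log10 amplitude at the target minus mean log10 amplitude of neighbors."""
    windows = split_windows(samples, fs)

    def mean_amp(freq_hz):
        return sum(bin_amplitude(w, fs, freq_hz) for w in windows) / len(windows)

    noise = sum(math.log10(mean_amp(f)) for f in neighbors_hz) / len(neighbors_hz)
    return math.log10(mean_amp(target_hz)) - noise
```

Under this reading, a source with no 40 Hz response yields a log SNR near zero, while a source with a clear spectral peak at 40 Hz yields a positive value.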

ICA sources were isolated if the log SNR of the 40 Hz response exceeded 0.3. This threshold was chosen since curves that exceed that value demonstrate a robust peak visually at 40 Hz. The isolated ICA sources were reconstructed to the original data space. The log SNR was calculated separately for each condition by averaging across each epoch [298 epochs per trial × 3 trials = 894 epochs] for each condition, separately for each subject. There was variability in the spatial location of the peak response across subjects, potentially due to the use of a left mastoid reference. Thus, subsequent analysis was conducted on the channel with the largest SNR within each subject. It is important to note that these ICA exclusion and channel selection criteria were unbiased with respect to differences across the conditions since the SNR values were examined after collapsing across all conditions.

Our goal was to determine whether cortical responses to an auditory stimulus are modulated by attention to the stimulus, and by the degree to which individuals attend to a complex visuospatial task. Differences in the response to the auditory stimuli were examined across conditions using a within-subjects One-Way ANOVA with factor "condition" and log SNR as the dependent variable. The correlation between 40 Hz responses and the speed of the Tetris pieces was examined in order to further determine whether reduced auditory responses are associated with increased task speed (i.e., difficulty).
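
For readers unfamiliar with the within-subjects One-Way ANOVA, the *F* statistic can be computed by partitioning the total sum of squares into condition, subject, and residual terms. The sketch below is a generic textbook formulation, not the authors' MATLAB code, and the function name and example values are hypothetical.

```python
def rm_anova_f(data):
    """One-way repeated-measures ANOVA.

    data: list of subjects, each a list with one value per condition.
    Returns (F, df_condition, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj      # condition-by-subject residual

    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error

# Hypothetical values for three subjects and three conditions (not data from the study).
example = [[1, 2, 4], [2, 4, 5], [3, 3, 6]]
print(rm_anova_f(example))
```

With eight subjects and three conditions, this partition gives 2 and 14 degrees of freedom.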

## **RESULTS**

## **BEHAVIORAL PERFORMANCE**

Tetris scores were collected for each individual at the end of each 5 min game. Two scores were collected for each level of difficulty and the two scores were averaged to determine the level at which each individual had the highest score. Four individuals demonstrated the highest score at speeds of 7.25 Hz, while the remaining four individuals had peak scores at 3.92, 5.52, 10.64, and 20 Hz. These levels served as a reference for the level of difficulty within the SSAEP experiment that followed. During the SSAEP session, an average of 4.55 lines and 2.05 lines were completed per minute for the difficult and the easy task, respectively.

## **INDEPENDENT COMPONENT ANALYSIS**

Temporal ICA was conducted on the EEG time course from session 2 in order to isolate temporal sources that respond to the stimulus frequency. The amplitude spectra of the temporal components indicate that each subject demonstrated a peak at 40 Hz within at least one source. In general, we observed the highest response within frontal electrodes. For example, the ICA source with the largest SNR contained the highest loading over Fz (1 subject), F3 (3 subjects), F4 (1 subject), P4 (2 subjects), or O1 (1 subject). The temporal components were selected if they had a log SNR greater than 0.3, and reconstructed into channel space. The average number of components identified was 3.125 (max = 6; min = 1). After reconstructing to channel space, the channel with the largest SNR was chosen for further analysis. Different channels were used for each individual since there was variability in the location of the peak response across individuals. **Figure 2** indicates the log amplitude for the channel with the peak response in each subject from 2 to 42 Hz. Each of the 8 individuals demonstrates a relatively smooth decline in the spectral response with a clear peak at the stimulus frequency of 40 Hz (**Figure 2**).

## **ATTENTION AND AUDITORY RESPONSES**

**FIGURE 2 | Average SSAEP response for each condition.** The spectral response is displayed for each subject for the difficult visuospatial task **(A)**, the easy visuospatial task **(B)**, and for "attend clicks" **(C)**. The spectral response was obtained from the channel with the largest response at 40 Hz after ICA reconstruction (the channel is indicated on the left of each plot). The log amplitude of the response is shown for each subject in gray, and the overall average is in black. Each individual demonstrates a response at 40 Hz, which corresponds to the frequency of the auditory stimulus. The topographic image on the upper right of each plot indicates the 10–20 locations for F3, Fz, F4, Cz, P3, P4, O1, and O2 (left mastoid reference).

We examined whether auditory responses differed when individuals attended to modulations in the auditory stimuli, when they ignored the stimulus and performed an easy visuospatial task, or when they ignored the stimulus and performed a difficult visuospatial task. The log SNR of the 40 Hz auditory responses is indicated in **Figure 3** for each of the three conditions. Seven out of eight individuals demonstrate a monotonic increase in response to the stimulus going from the difficult task to the easy task to "attend clicks." This pattern is present in the overall average (solid black lines), and is represented statistically by a within-subject one-way ANOVA indicating significant differences in log SNR across the three tasks [*F*(2, 14) = 5.09, *p* = 0.0218]. No significant differences were observed when the log amplitude of the 40 Hz response was used as the dependent measure [*F*(2, 14) = 0.21, *p* = 0.8109] or when the average log amplitude of the surrounding bins (i.e., the noise) was used as a dependent measure [*F*(2, 14) = 0.24, *p* = 0.2096].
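To make the three dependent measures concrete, the sketch below computes a log SNR as the log amplitude at the stimulus frequency minus the average log amplitude of the surrounding bins. The exact definition is an assumption here: the choice of 38 and 42 Hz as noise bins, the base-10 logarithm, and the function name are illustrative rather than taken from the study's methods.

```python
import math


def log_snr(amplitude, signal_hz=40, noise_hz=(38, 42)):
    """Log SNR of a steady-state response.

    amplitude: dict mapping frequency (Hz) to spectral amplitude.
    The noise estimate is the mean log amplitude over `noise_hz`
    (assumed neighbouring bins).
    """
    signal = math.log10(amplitude[signal_hz])
    noise = sum(math.log10(amplitude[f]) for f in noise_hz) / len(noise_hz)
    return signal - noise


# Hypothetical amplitude spectrum around the 40 Hz stimulus frequency.
spectrum = {38: 1.0, 40: 10.0, 42: 1.0}
print(log_snr(spectrum))
```

Because the measure is a difference of log amplitudes, a broadband change that scales the 40 Hz bin and its neighbours by the same factor leaves the log SNR unchanged, which is why it can differ across conditions even when neither the raw 40 Hz amplitude nor the noise does.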

The relationship between auditory responses and task difficulty was further explored by examining the correlation between the log SNR of the 40 Hz response and the speed of the Tetris pieces. Our hypothesis was that increased speed (i.e., increased difficulty) of Tetris would draw more heavily on visuospatial resources at the expense of auditory processing. Thus, increased difficulty within the Tetris task is expected to be associated with reduced cortical responses at 40 Hz. This pattern is demonstrated in **Figure 4** by plotting the log SNR against speed (Hz) for both the easy and difficult Tetris tasks. The figure demonstrates a negative relationship between log SNR and speed (Hz), which is represented by a significant negative correlation both when the results from the easy and difficult conditions were combined (*r* = −0.64; *p* = 0.0066) and separately within the difficult results (*r* = −0.79; *p* = 0.0209).
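The correlation analysis can be sketched with a plain Pearson coefficient. The speeds below are the peak-score speeds listed in the behavioral results, used here only as plausible inputs; the log SNR values paired with them are hypothetical and do not reproduce the reported coefficients.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Tetris speeds (Hz) for eight individuals, and hypothetical log SNR values.
speed_hz = [3.92, 5.52, 7.25, 7.25, 7.25, 7.25, 10.63, 20.0]
log_snr_40hz = [0.62, 0.55, 0.50, 0.47, 0.52, 0.44, 0.38, 0.20]

r = pearson_r(speed_hz, log_snr_40hz)
print(f"r = {r:.2f}")
```

A negative coefficient indicates that faster (more difficult) games go with smaller 40 Hz responses, the direction reported above.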

## **DISCUSSION**

Our results demonstrate that cortical responses to an auditory stimulus are influenced by an individual's goal-directed attention to the stimulus, and by the degree of difficulty of a visuospatial task. Importantly, we have shown that increases in the speed of the visuospatial task are correlated with decreases in the cortical response to the 40 Hz auditory stimulus. These findings are consistent with the theory that increases in visuospatial difficulty result in increased allocation of higher order resources to the visuospatial task, and reduced allocation of higher order resources to processing information within sensory modalities that are not relevant to the task (Hein et al., 2007).

**FIGURE 3 | SSAEP response for each condition.** The average 40 Hz response is indicated for each subject and condition (empty circles). The overall average response is indicated by the solid black line and the pattern of individual responses across conditions is indicated by the dotted gray lines. The response to the 40 Hz stimulus is largest when individuals attend to the stimulus, followed by when they perform the easy visuospatial task, followed by the difficult visuospatial task.

Auditory cortical responses were isolated in the current study by examining EEG responses to a 40 Hz auditory stimulus. An advantage of this approach is that the steady state response (i.e., the signal) is concentrated within the narrow frequency of the stimulus, while spectral changes due to muscle and eye artifacts (i.e., noise) will be distributed across frequencies (Silberstein, 1995). Interestingly, we found differences in the signal-to-noise ratio across conditions, but no significant differences in the noise (e.g., 38 and 42 Hz) across conditions. This finding demonstrates that the results are due to differences in the normalized 40 Hz response, and not to broadband changes in spectral amplitude that may arise from artifacts or changes in the spectral baseline.

The following additional points should be considered when interpreting the results. First, it is likely that some individuals had greater prior experience with the task than others. While individual video game experience was not collected, it is likely that increased video game experience would be associated with better performance within the Tetris task, which is reflected in the speed at which the Tetris pieces fell. Thus, the finding of a negative correlation between auditory responses (i.e., log SNR) and the speed of the pieces (**Figure 4**) is consistent with the theory that individuals with greater expertise were better able to focus on the visuospatial task and to ignore or suppress the irrelevant auditory stimuli. An additional consideration for further studies is whether the brain networks that suppress irrelevant sensory modalities overlap with the brain networks that enhance information across modalities (e.g., with multisensory integration). In addition, the present results were obtained within a population of males, which can limit the generalizability of the findings.

Auditory processing was examined within the current study by measuring EEG responses to a 40 Hz auditory stimulus. In this context, it is important to note that different sensory systems can be sensitive (i.e., generate larger responses) to different input frequencies (Regan, 1989), and different input frequencies can target brain networks with functionally distinct properties (Ding et al., 2006; Bridwell and Srinivasan, 2012). The auditory system demonstrates a 40 Hz transient response to tones (Tiitinen et al., 1993), suggesting that this frequency represents the intrinsic oscillatory frequency of auditory processing. This intrinsic nature of auditory 40 Hz responses is well supported by the finding that frequency tagged auditory responses peak at 40 Hz (Galambos et al., 1981; Picton et al., 2003), suggesting that 40 Hz auditory inputs may entrain and reveal the functional properties of the auditory system (Basar, 1999). This motivates its use in examining auditory cortical modulations with attention (Ross et al., 2004), and auditory/visual attention (Saupe et al., 2009; Keitel et al., 2011). However, it should be noted that while the auditory response peaks at 40 Hz, enhanced responses may also be observed within neighboring gamma-band frequencies (Galambos et al., 1981), and these input frequencies could potentially reveal similar modulations with attention.

Previous studies have examined the influence of visual load on the N1 and P2 ERP components (Regenbogen et al., 2012), as well as on the mismatch negativity (MMN) (Yucel et al., 2005; Haroush et al., 2010). Early evoked potentials, such as the N1, are thought to represent basic sensory processing, and these responses may be measured after averaging over repeated presentations of a basic auditory stimulus (Näätänen et al., 2011). The MMN, in contrast, represents the average response to deviant auditory tones that are presented within a series of more frequent auditory tones. In order to "detect" rare events, the brain must maintain a statistical representation or expectation of auditory stimuli, such that deviations from this expectation result in a reorientation of attention to the rare event. The MMN is therefore sensitive to more complex aspects of the auditory environment, and tends to demonstrate greater sensitivity to attentional manipulation (May and Tiitinen, 2010).

The different sensitivities of these evoked potentials to attention and early sensory processing suggest that they may demonstrate different sensitivity to visual load. Modulations within early evoked responses with visual load suggest the influence of visual load on early sensory processing, while modulations within the MMN demonstrate the influence of visual load on the auditory sensitivity to rare events (Haroush et al., 2010; Näätänen et al., 2011). The latter reflects a more complex level of auditory processing, which would potentially be disrupted to a greater degree when overlapping brain areas are utilized within visuocognitive tasks (Yucel et al., 2005). The frequency tagged auditory responses in the current study likely reflect the combination of the early auditory cortical responses as well as their integration with higher order brain areas. Thus, the frequency tagged response in our study may potentially reflect the interplay between early sensory areas and their modulation by higher level brain areas, such as the frontal cortex.

The influence of visual load has primarily been examined on auditory responses to rare tones (e.g., represented by the MMN) (Haroush et al., 2010). The influence of visual load on gamma (i.e., 40 Hz) responses was demonstrated by Tiitinen et al. (1993) with a similar experimental paradigm as the present study. Their study indicates that the 40 Hz auditory transient responses increase (compared to reading) when individuals passively ignore the tones, and further increase when they attend to the tones. An important distinction, however, is that the 40 Hz responses were elicited transiently from a brief auditory tone, while the 40 Hz responses in the current study were targeted by isolating frequency tagged 40 Hz cortical responses. A potential advantage of the frequency tagging approach is that the 40 Hz responses are robust to noise and can be distinguished from broadband increases in power that arise due to muscle tension or movement. Thus, the current approach may be useful for tracking an individual's attentional engagement within subsequent mobile EEG studies and brain-computer interface (BCI) based interventions.

An important distinction among previous studies is the degree of engagement and complexity of the visual task. For example, Tetris draws upon visuospatial resources (Sims and Mayer, 2002), reading draws upon visuocognitive processing (Welcome and Joanisse, 2012), and the visuomotor task in Yucel et al. (2005) encourages rapid visuomotor feedback and integration. Each of these studies demonstrates modulations in auditory responses with increases in task difficulty, but a series of simple detection and discrimination tasks have been unable to generate similar findings (for review see Haroush et al., 2010). These findings highlight the importance of considering the perceptual and cognitive resources required within a given task when examining its influence on attentional load (Lavie, 2010). Differences in task complexity and engagement could account in part for the inability of some studies to detect differences in auditory responses with increasing visual load. For example, simple detection and discrimination tasks may place an emphasis on automatic early visual processing without engaging higher level brain areas to the same extent as more complex tasks. While perceptual load within one modality appears to influence responses to irrelevant stimuli within the same modality (Lavie et al., 2004), there appears to be less of an influence of perceptual load within one modality on irrelevant sensory responses within another modality (Talsma et al., 2006).

Similar distinctions have been demonstrated within the load theory of attention and cognitive control (Lavie, 2005, 2010). For example, it has been demonstrated that there is reduced interference from an irrelevant distracter when individuals perform a high perceptual load task, but increased interference from visual distracters when individuals perform a high memory load task. This finding suggests that an increased memory load is accompanied by a reduction in cognitive control resources that are available for suppressing irrelevant distracters. Within the context of the current study, individuals likely recruit the complex array of brain networks that facilitate visuocognitive processing during Tetris. These brain networks likely overlapped with the brain networks involved in processing the irrelevant auditory stimulus, resulting in reduced 40 Hz auditory responses.

## **CONCLUSION**

This study investigated the influence of visuospatial attention on 40 Hz auditory cortical responses. EEG responses to a 40 Hz auditory stimulus were measured while individuals attended to modulations in the amplitude of the stimulus, and as a function of the difficulty of the visuospatial task (Tetris). The results demonstrate the influence of visuocognitive demands on the sensory processing of irrelevant auditory stimuli. We found significant differences in the log SNR of the 40 Hz responses across the three conditions, demonstrating the influence of attention on the auditory 40 Hz response. Importantly, the log SNR of the 40 Hz response was significantly correlated with the speed (i.e., difficulty) of the task, indicating that auditory responses are reduced with increasing visuospatial load. Overall, the results demonstrate that 40 Hz auditory responses are influenced by an individual's goal-directed attention to the stimulus, and by the degree of difficulty of a complex visuospatial task.

## **ACKNOWLEDGMENTS**

This work is supported by NIH grants 1R01MH094524-01A1 (awarded to J. Turner and Vince D. Calhoun), 1R01EB006841 (awarded to V. Calhoun), GM-060201 (to Cullen Roth), and a pilot grant awarded by the Mind Research Network.

## **REFERENCES**

Yucel, G., Petty, C., McCarthy, G., and Belger, A. (2005). Graded visual attention modulates brain responses evoked by task-irrelevant auditory pitch changes. *J. Cogn. Neurosci.* 17, 1819–1828. doi: 10.1162/089892905775008698

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 25 June 2013; published online: 15 July 2013.*

*Citation: Roth C, Gupta CN, Plis SM, Damaraju E, Khullar S, Calhoun VD and Bridwell DA (2013) The influence of visuospatial attention on unattended auditory 40 Hz responses. Front. Hum. Neurosci. 7:370. doi: 10.3389/fnhum.2013.00370*

*Copyright © 2013 Roth, Gupta, Plis, Damaraju, Khullar, Calhoun and Bridwell. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## The functional anatomy of attention: a DCM study

## *Harriet R. Brown\* and Karl J. Friston*

*The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK*

#### *Edited by:*

*Joy Geng, University of California Davis, USA*

#### *Reviewed by:*

*Emiliano Macaluso, Fondazione Santa Lucia, Italy; Christian Ruff, University of Zurich, Switzerland; Durk Talsma, UGent, Belgium*

#### *\*Correspondence:*

*Harriet R. Brown, Wellcome Trust Centre for Neuroimaging, Institute of Neurology, 12 Queen Square, London, WC1N 3BG, UK e-mail: harriet.brown.09@ucl.ac.uk*

Recent formulations of attention in terms of predictive coding associate attentional gain with the expected precision of sensory information. Formal models of the Posner paradigm suggest that validity effects can be explained in a principled (Bayes optimal) fashion in terms of a cue-dependent setting of precision or gain on the sensory channels reporting anticipated target locations, which is updated selectively by invalid targets. This normative model is equipped with a biologically plausible process theory in the form of predictive coding, where precision is encoded by the gain of superficial pyramidal cells reporting prediction error. We used dynamic causal modeling to assess the evidence in magnetoencephalographic responses for cue-dependent and top-down updating of superficial pyramidal cell gain. Bayesian model comparison suggested that it is almost certain that differences in superficial pyramidal cell gain, and its top-down modulation, contribute to observed responses; and we could be more than 80% certain that anticipatory effects on post-synaptic gain are limited to visual (extrastriate) sources. These empirical results speak to the role of attention in optimizing perceptual inference and its formulation in terms of predictive coding.

**Keywords: attention, active inference, predictive coding, precision, Posner, cortical gain control**

## **INTRODUCTION**

Several years ago, we suggested that attention can be understood as the selection of processing channels that convey precise or salient information within the framework of predictive coding (Feldman and Friston, 2010). The idea is that both the content of visual information and the confidence placed in that information have to be inferred during perception. In predictive coding, top-down predictions of the content are confirmed or disconfirmed by comparison with bottom-up sensory information (Rao and Ballard, 1999; Friston, 2005). However, this comparison rests on estimating the reliability or precision of sensory information or, more exactly, of the residuals or prediction error that cannot be explained. This precision may itself be context sensitive and has to be updated in exactly the same way as predictions of content (Brown and Friston, 2012a,b). This leads to a view of hierarchical perceptual synthesis in which particular processing channels are selected on the basis of cues that portend spatial locations or featural attributes that are likely to convey precise information. In neuronally plausible implementations of this hierarchical Bayesian inference (namely, generalized Bayesian filtering or predictive coding), expected precision is thought to be encoded by the post-synaptic sensitivity or gain of cells reporting prediction error (Friston and Kiebel, 2009). Given that prediction error is passed forward from sensory cortex to higher cortical areas by ascending or forward connections, the most likely candidates for reporting prediction error are the superficial pyramidal cells that are the source of ascending connections (Bastos et al., 2012). This means that one can understand attention as the top-down gain control of superficial pyramidal cells passing information that is yet to be explained (i.e., prediction error) deep into the visual hierarchy.

This normative model and its neuronal implementation have been used to simulate and reproduce both the psychophysical and electrophysiological characteristics of the Posner paradigm (Feldman and Friston, 2010). In brief, predictive cues engage top-down predictions of increased precision in the left or right hemifield that facilitate the rapid processing of (inference about) valid visual targets. However, when an invalid target is presented in the wrong hemifield, the evidence accumulation implicit in predictive coding is slower, because gain or precision acts as a synaptic rate constant. This leads to protracted reaction times and an invalidity cost. Simultaneously, the scheme infers that prior beliefs about the target have been violated and prediction errors drive higher levels to update both the deployment of attention (i.e., precision) and target predictions *per se*. This explains the classic electrophysiological correlates of the validity effects in the Posner paradigm—in which invalid targets elicit slightly attenuated P1, N1 and N2 early components and a more pronounced P3b late component (Mangun and Hillyard, 1991; Hugdahl and Nordby, 1994; Talsma et al., 2007). These two electrophysiological characteristics may reflect the initial insensitivity (low precision or gain) of early visual responses and a subsequent *post-hoc* revision of top-down precision or gain control, when prediction error cannot be resolved by predictions based upon the (invalid) cue.

In this paper, we tried to verify these explanations for electromagnetic responses to valid and invalid targets in the Posner paradigm using magnetoencephalography (MEG) and dynamic causal modeling of differences in effective connectivity. In particular, we hoped to establish that a sufficient explanation for responses evoked by valid and invalid targets would be provided by a difference in the gain or post-synaptic sensitivity of superficial pyramidal cells following a cue, and a subsequent top-down modulation of this gain from parietal and higher extrastriate sources. To do this, we needed to use dynamic causal models based on canonical microcircuits that distinguish between superficial and deep pyramidal cells (Bastos et al., 2012) and that explicitly include a top-down modulation of superficial pyramidal cells.

In what follows, we provide a brief description of the dynamic causal models used to address precision or gain control in predictive coding; describe the data and experimental design; and report the results of Bayesian model comparisons that quantify the evidence for condition-specific differences in superficial pyramidal cell gain. Our focus here is on cue-dependent differences in gain prior to the onset of a visual target and subsequent top-down modulation of that gain during target processing. In particular, we asked whether cue-dependent differences in gain, top-down modulation or both were evident in evoked electromagnetic responses—and, whether any differences in gain were restricted to visual sources or extended to the parietal cortex.

## **MATERIALS AND METHODS**

#### **DYNAMIC CAUSAL MODELING OF PREDICTIVE CODING**

In predictive coding models of inference in the brain (Mumford, 1992; Friston, 2005; Bastos et al., 2012), prediction error ascends to update representations at higher hierarchical levels. See **Figure 1** for a schematic summary. Crucially, the excitability of cells reporting prediction error corresponds (mathematically) to the precision of—or confidence in—the information they convey. This precision has been used to explain the psychophysical and electrophysiological correlates of attention and can be regarded as the basis of selective (predictive or attentional) gain—in which sensory processing channels that convey precise information are enabled.

Neurobiological implementations of predictive coding use superficial pyramidal cells to report precision-weighted prediction error: $\xi^{(i)} = \Pi^{(i)} \cdot (\tilde{\mu}^{(i)} - f(\tilde{\mu}^{(i+1)}))$, where $\tilde{\mu}^{(i)}$ corresponds to representations (posterior expectations) of states of the world at level $i$ in a cortical hierarchy and $f(\tilde{\mu}^{(i+1)})$ corresponds to the top-down predictions of these expectations, based upon expectations in the level above. The ensuing prediction error is modulated by the precision $\Pi^{(i)}$, which weights prediction errors in proportion to their (expected) reliability (c.f., known uncertainty). From our point of view, the encoding of precision at each level of the hierarchy can be associated with the strength of inhibitory recurrent connections by noting that the expression for prediction errors is the solution to the following equation describing neuronal dynamics:

$$\begin{aligned} \dot{\xi}^{(i)} &= \tilde{\mu}^{(i)} - f(\tilde{\mu}^{(i+1)}) - \exp(\gamma^{(i)}) \cdot \xi^{(i)} \\ \dot{\xi}^{(i)} &= 0 \Rightarrow \gamma^{(i)} = -\ln \Pi^{(i)} \end{aligned}$$

A more complete exposition of these dynamics can be found in Friston (2005). In this equation, $\gamma^{(i)}$ is the negative log precision.
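The link between recurrent inhibition and precision can be checked numerically. The sketch below integrates the error-unit dynamics with a simple Euler scheme, holding the unweighted prediction error fixed for illustration; the function name, step size, and numerical values are assumptions, not part of the DCM implementation.

```python
import math


def settle_prediction_error(epsilon, gamma, dt=0.001, steps=100_000):
    """Euler-integrate d(xi)/dt = epsilon - exp(gamma) * xi, starting at xi = 0.

    epsilon: unweighted prediction error, mu - f(mu_above), held constant here
    gamma:   strength of recurrent self-inhibition (negative log precision)
    """
    xi = 0.0
    for _ in range(steps):
        xi += dt * (epsilon - math.exp(gamma) * xi)
    return xi


precision = 4.0                  # hypothetical precision
gamma = -math.log(precision)     # negative log precision
epsilon = 0.5                    # hypothetical unweighted prediction error

xi = settle_prediction_error(epsilon, gamma)
print(xi, precision * epsilon)
```

At steady state the error unit settles at the precision-weighted prediction error, so weaker self-inhibition (a smaller value of the negative log precision) corresponds to higher gain on the same prediction error.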

With Dynamic Causal Modeling (Garrido et al., 2008; Bastos et al., 2012), we map this neurobiological implementation of predictive coding onto a neural mass model that is capable of simulating MEG data. The depolarization of the three excitatory cell populations in the model (superficial and deep pyramidal cells, as well as spiny stellate cells) forms the output of the model, with the main contribution coming from superficial pyramidal cells. This activity is transformed by an MEG-specific lead-field, which describes the translation from source activity to sensor perturbation.

**FIGURE 1 |** Schematic summary of the predictive coding scheme. Forward driving connections convey prediction error from a lower area to a higher area, and backward connections construct predictions (Mumford, 1992; Friston et al., 2006). These predictions try to explain away prediction error in lower levels. See Feldman and Friston (2010) for the underlying hierarchical dynamic model. State-units are in black and error-units in red. Here, neuronal populations are deployed hierarchically within three cortical areas (or macro-columns). Subscripts denote derivatives.

The four-population neural mass model used here has been described before (Brown and Friston, 2012b). In the neural mass models, $\gamma^{(i)}$, the negative log precision, corresponds to the strength of recurrent inhibitory connections on superficial pyramidal cells. This means that as precision increases, the strength of recurrent inhibition decreases. We therefore use the strength of intrinsic (recurrent) self-inhibition (on superficial pyramidal cells) as a proxy for log precision.

One new feature is introduced in this implementation of the neural mass model. To model top-down modulation of this self-inhibition we use the following form of (backward) modulatory connectivity:

$$\gamma^{(i)} = \gamma_0^{(i)} - 32 \cdot M \cdot (\sigma(V) - \sigma_0)$$

Here, $\gamma_0$ is self-inhibition when firing rates are at baseline levels, $\sigma_0 = \sigma(0)$. Firing rates $\sigma(V) \in [0, 1]$ are a sigmoid function of depolarization $V \in \mathbb{R}$ of afferent neuronal populations (deep pyramidal cells in other sources). The modulatory connection strength matrix $M$ weights the influence of other sources, such that a high value suppresses self-inhibition and (effectively) increases the gain or precision of the superficial pyramidal cells that are targeted.

In what follows, we model condition-specific (valid or invalid) effects on $\gamma$ to evaluate the evidence for cue-dependent changes in gain at the onset of target processing, and test for condition-specific changes in $M$ that mediate target-dependent changes in gain as the target is processed. Our hope was that we would find evidence for differences in baseline gain and subsequent top-down modulation, and that these would be expressed predominantly in early visual sources.
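The modulatory form above can be illustrated for a single afferent source. This sketch makes several assumptions that are not specified in the text: a logistic sigmoid for the firing-rate function, a scalar modulatory weight instead of a matrix, and a scale constant exposed as a parameter whose default of 32 is an assumed reading of the equation.

```python
import math


def sigmoid(v):
    """Logistic firing-rate function with values in [0, 1] (an assumed form)."""
    return 1.0 / (1.0 + math.exp(-v))


def self_inhibition(gamma0, m, v, scale=32.0):
    """gamma = gamma0 - scale * m * (sigma(v) - sigma(0)).

    gamma0: baseline self-inhibition (negative log precision)
    m:      modulatory (backward) connection strength from one afferent source
    v:      depolarization of the afferent deep pyramidal population
    scale:  constant in the modulatory term (assumed default)
    """
    return gamma0 - scale * m * (sigmoid(v) - sigmoid(0.0))


gamma0 = 0.0                                          # hypothetical baseline
gamma_rest = self_inhibition(gamma0, m=0.1, v=0.0)    # afferent at baseline
gamma_driven = self_inhibition(gamma0, m=0.1, v=0.5)  # afferent depolarized

# Gain (precision) is the exponential of minus the self-inhibition.
print(math.exp(-gamma_rest), math.exp(-gamma_driven))
```

With the afferent population at baseline the self-inhibition is unchanged; depolarizing it with a positive modulatory weight lowers self-inhibition and therefore raises the gain of the targeted superficial pyramidal cells.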

Specifically, we anticipated that intrinsic self-inhibition would be lower (gain would be higher) in left hemisphere sources after (invalid) cueing of the right hemifield relative to (valid) cueing of the left hemifield, where the target appeared in the left hemifield in both conditions. In other words, we hoped to show differential responses to identical targets could be explained by differences in gain induced by valid and invalid cues. Furthermore, we anticipated differences in descending modulatory effects between valid and invalid trials that would be necessary to reverse the laterality of gain control following an invalid target.

## **PARTICIPANTS**

Fourteen healthy right-handed subjects participated in the study (8 male; age 20–54). Ethical approval was obtained from the UCL Research Ethics Committee (no. 2715/001). Written informed consent was obtained from all subjects.

#### **EXPERIMENTAL PARADIGM**

All stimuli were presented using Matlab 7.1 and Cogent (http://www.vislab.ucl.ac.uk/cogent.php). Stimuli were projected onto a screen 70 cm from the subjects. During the task, subjects fixated on a central cross at all times. At the start of each trial, the cross was replaced by an arrow pointing to the bottom left or bottom right corner of the screen, or a double-headed arrow pointing to both (neutral trials). The cues subtended 1.6 degrees of visual angle. After a cue-target interval of 50, 100, 200, or 400 ms, a target appeared either where the arrow had indicated (valid) or at the other side (invalid). The target was a white circle subtending 3.1 degrees of visual angle and presented in the lower left or lower right corners of the screen at 14.7 degrees eccentricity. Participants pressed a button with their right hand as soon as the target appeared. 66% of trials were valid, 17% were invalid and 17% uninformative (neutral cue trials are not considered here). Left and right cues and targets were balanced. Catch trials, in which no target followed the cue, were randomly presented before 10% of trials. 1800 trials were collected over three sessions on two consecutive days.

#### **BEHAVIORAL DATA**

Reaction times were collected by Cogent and analyzed with IBM SPSS 20. A full factorial univariate ANOVA was performed with fixed factors "side," "validity," and "cue-target interval" and random factor "subject."

#### **DATA COLLECTION AND PROCESSING**

MEG data were obtained using a whole-head 275-channel axial gradiometer MEG system (CTF Systems). The sampling rate was 600 Hz and a low-pass filter of 150 Hz was applied. Head position was monitored using three localization coils, placed on the nasion and in front of each ear. An infrared eyetracker (Eyelink 1000) was used to monitor participants' fixation as well as to detect blinks. Stimuli were presented and behavioral data were collected with Cogent.

Data were analyzed using SPM12b for EEG/MEG. Data were down-sampled to 200 Hz and bandpass-filtered between 2 Hz and 32 Hz. Baseline-corrected epochs were extracted from the time series starting at 50 ms before target onset and ending 400 ms after target onset. Trials where the eyetracker detected a blink or saccade were excluded from analysis. Trials were then robustly averaged across cue-target intervals and participants to yield four conditions: left valid cue, right valid cue, left invalid cue, and right invalid cue. Averaging across participants can reduce the spatial precision of the MEG signal; however, as our hypotheses were not concerned with the spatial location of the signals, we chose to combine data across all participants to increase the signal-to-noise ratio of the waveforms.
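The epoching and baseline-correction step can be sketched for a single channel. This is not the SPM12 routine used in the study; the function name is hypothetical, the input is assumed to be already down-sampled to 200 Hz and filtered, and the baseline is assumed to be the mean of the 50 ms pre-target window.

```python
def extract_epoch(signal, onset, fs=200, pre_ms=50, post_ms=400):
    """Cut one baseline-corrected epoch around a target onset.

    signal: list of samples for one channel
    onset:  sample index of target onset (must be at least `pre` samples in)
    Returns samples from -pre_ms to +post_ms inclusive, with the mean of
    the pre-target window subtracted.
    """
    pre = round(pre_ms * fs / 1000)     # 10 samples at 200 Hz
    post = round(post_ms * fs / 1000)   # 80 samples at 200 Hz
    segment = signal[onset - pre : onset + post + 1]
    baseline = sum(segment[:pre]) / pre
    return [x - baseline for x in segment]


# Hypothetical channel: constant offset of 2.0, stepping to 3.0 at target onset.
channel = [2.0] * 50 + [3.0] * 100
epoch = extract_epoch(channel, onset=50)
print(len(epoch), epoch[9], epoch[10])
```

At 200 Hz the window from 50 ms before to 400 ms after target onset spans 91 samples, and subtracting the pre-target mean removes slow offsets before trials are averaged.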

#### **DATA FEATURE AND SOURCE SPECIFICATION**

We addressed our hypothesis using condition-specific grand average responses over all subjects. Intuitively, this is like treating each subject as if they were the same subject to produce an average ERP. To identify plausible sources we used a distributed source reconstruction (using four grand averages: valid right target, invalid right target, valid left target, and invalid left target) based on multiple sparse priors (with default settings).

The grand average data were bandpass filtered between 2 and 32 Hz and windowed from 0–400 ms of peristimulus time. We used a lead field based upon the standard MRI template and a boundary element model as implemented in SPM12 (Mattout et al., 2006). After source reconstruction, we quantified the power of evoked responses (over all frequencies and peristimulus time) to produce the maximum intensity projections in **Figure 2**. As one would expect, left targets activate right early visual sources and *vice versa*. Note further that early visual source responses to valid left targets are greater than responses to the same targets under invalid cues. On the basis of these reconstructions, we identified eight sources corresponding (roughly) to key maxima of source activity. These sources included bilateral early visual sources (V2); bilateral sources near the occipitotemporal-parietal junction (V5); bilateral dorsal (V3) extrastriate sources; and bilateral superior parietal sources (PC). The anatomical designation of these sources should not be taken too seriously; they are used largely as an aide-memoire for sources at various levels in the visual hierarchy, so that we can discuss the functional anatomy. Clearly, the spatial precision of source localization does not allow us to associate each source with a specific cytoarchitectonic area, and even if we could, there is sufficient intersubject variability in cortical architectures to make this association, at best, heuristic.

The distributed network constituting the DCM is shown in **Figure 3**. The parietal sources sent backward connections to the extrastriate (V3 and V5) sources that then sent backward connections to the V2 sources. These connections were reciprocated by extrinsic forward connections to produce a simple visual hierarchy with bilateral connections.

## **MODEL SPACE AND BAYESIAN MODEL COMPARISON**

The DCM analyses used data from 0 to 400 ms of peristimulus time. To de-noise the data and improve computational efficiency, we fitted the first eight canonical modes of the scalp data, given the source locations—these can be regarded as the principal components of the data that can be explained by source activity. The sources were modeled as small cortical patches of about 16 mm radius—centered on the source locations in **Figure 2**—as described by Daunizeau et al. (2006). The vertices of these sources used the same lead fields as in the source reconstruction.

Exogenous (visual target related) input was modeled as a Gaussian function with a prior peak at 120 ms (and a prior standard deviation of 16 ms). This input was delivered to V2 on the appropriate side (left for right target trials and right for left target trials). The ensuing models were optimized to explain sensor responses by adjusting their (neuronal and lead field) parameters in the usual way—this is known as model inversion or fitting. The products of this inversion are posterior estimates of (differences in) intrinsic and extrinsic connectivity and the evidence or marginal likelihood for each model considered.

*Figure caption (fragment): … connected in the distributed network shown on the right. The parietal sources sent both driving and modulatory backward … reciprocated by extrinsic forward connections to produce a simple visual hierarchy with bilateral connections.*
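As a rough illustration of the exogenous input described above, the following sketch evaluates a Gaussian bump over peristimulus time. It assumes that the 16 ms figure is the width (standard deviation) of the bump and that the peak is scaled to 1; both are assumptions made for illustration, and the function name is hypothetical rather than part of SPM.

```python
import math


def gaussian_input(t_ms, peak_ms=120.0, sd_ms=16.0):
    """Unnormalized Gaussian bump: 1.0 at peak_ms, width sd_ms (assumed)."""
    return math.exp(-0.5 * ((t_ms - peak_ms) / sd_ms) ** 2)


# Input time course over 0 to 400 ms at 5 ms steps (200 Hz sampling).
times_ms = list(range(0, 401, 5))
input_trace = [gaussian_input(t) for t in times_ms]
```

In the actual DCM, the peak latency and dispersion are free parameters with prior expectations, so the fitted input need not match these values exactly.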

Our hypothesis centered on the gain of superficial pyramidal cells. We therefore estimated a full model in which all intrinsic gains and their extrinsic (backward) modulation could differ between valid and invalid trials. To ensure the same stimuli were used for assessing these differences, we conducted two sets of analyses: one for targets presented to the left visual field and another for targets presented on the right. Each DCM estimated all intrinsic, extrinsic and modulatory connection strengths and any differences in intrinsic and modulatory connections due to invalid cuing.

After inverting the full model we then evaluated the evidence for reduced versions that constitute alternative hypotheses or models. This model space was created by partitioning connectivity differences into three subsets and considering all eight combinations. These subsets were changes in intrinsic gain in the extrastriate sources (V2, V3, and V5); changes in parietal (PC) gain; and changes in extrinsic modulatory connections. This partition was motivated by distinguishing between the effect of the cue on target-related responses—which should be apparent in changes in intrinsic gain in the visual areas—and the effect of the target *per se*—which should be apparent in changes in backward modulation of gain. To evaluate the ensuing models, we used Bayesian model comparison based upon a (variational free energy) approximation to log evidence. Having identified the model with the greatest evidence, we then examined its posterior parameter estimates. This allowed us to characterize validity effects quantitatively and to interpret them in computational (predictive coding) terms.
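The factorial model space described above (three subsets of connectivity differences, each present or absent) can be enumerated as follows. This is a minimal sketch; the factor labels are hypothetical shorthand and do not correspond to identifiers in SPM.

```python
from itertools import product

# Hypothetical labels for the three subsets of validity-related differences.
FACTORS = ("extrastriate_gain", "parietal_gain", "backward_modulation")


def build_model_space(factors=FACTORS):
    """Return every on/off combination of the given factors as a dict."""
    return [
        dict(zip(factors, flags))
        for flags in product((False, True), repeat=len(factors))
    ]


model_space = build_model_space()  # 2**3 = 8 candidate models
```

The model with all three flags set corresponds to the full model; the remaining seven are the reduced versions whose evidence is compared against it.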

## **RESULTS**

**BEHAVIORAL DATA**

The ANOVA demonstrated significant main effects of validity, subject and cue-target interval, with significant interactions between cue-target interval × validity, cue-target interval × subject, side × cue-target interval and validity × side × subject. Reaction times to validly cued targets were significantly shorter than to invalidly cued targets [left: mean (SD) 333 ms (42 ms) vs. 355 ms (44 ms), *p* < 0.001; right: mean (SD) 334 ms (42 ms) vs. 354 ms (44 ms)] (**Figure 4**).

## **ATTENTIONAL EFFECTS IN SENSOR SPACE**

The effects of attention (validity of cueing) on responses to targets presented in the left hemifield are shown—for the first two canonical modes—in **Figure 6**. Although these MEG responses are formally distinct from classic EEG results, they speak to similar effects on early and late responses: the blue lines correspond to valid trials and red lines to invalid trials. The response in the first mode shows the early response (just before 200 ms) has a reduced latency and slightly higher amplitude—consistent with an attenuation of N2 response to invalid targets, as seen in classic EEG studies (Mangun and Hillyard, 1991). In terms of late responses, the second mode shows a protracted and elevated response around 300 ms that is consistent with a P3b component, when the target location is not attended.

The solid lines report the model predictions of observed responses (broken lines) in sensor space after inversion of the DCM. These illustrate the accuracy of model inversion, capturing both the early and late differences to a considerable level of detail. Examples of the underlying source activity that generates these predictions are shown in the lower panel. These traces represent the depolarization of three excitatory populations within the left V2 source, contralateral to the visual input modeling the effects of target presentation. The dotted lines correspond to the spiny stellate and deep pyramidal populations, while the solid lines report the superficial pyramidal cells—that are the predominant contributors to sensor data. Note that this level of reconstructed neurophysiological detail rests on having a biologically plausible forward model.

Somewhat to our surprise, the differential responses to right targets were much less marked (results not shown). Furthermore, model inversion failed to converge for these conditions. Therefore, we restricted our analysis to the left target conditions. The failure to elicit clear validity effects with right targets may relate to the asymmetry of responses—and attentional gain control (see below).

## **BAYESIAN MODEL SELECTION**

A provisional Bayesian Model Comparison demonstrated that modeling the validity effect with changes in the strengths of the modulatory backwards connections only had the greatest posterior probability, justifying the investigation of these connections in the following analyses (**Figure 5**). The comparison of different explanations for the validity effects above focused on differences in the gain of superficial pyramidal cells—either intrinsic to extrastriate or parietal sources, or differences in the modulation of gain, mediated by extrinsic top-down connections. The relative log evidences for all combinations of these condition-specific differences are shown in the upper left panel of **Figure 7**. The labeling of these models indicates the presence or absence of differences in extrastriate gain, parietal gain and gain modulation. It can be seen that the model with the greatest evidence includes differences in extrastriate gain and gain modulation but not differences in parietal gain. The corresponding posterior probabilities of these models (assuming all were equally plausible *a priori*) are shown in the upper right panel. These suggest that we cannot definitively exclude differences in parietal gain; however, we can be more than 80% confident that parietal effects are not necessary to explain these data, provided we allow for validity effects on extrastriate gain and its top-down modulation.
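The step from log evidences to posterior model probabilities under equal priors, as described above, amounts to a normalized exponential (softmax). The sketch below shows that calculation; the function name is hypothetical and no values from the study are used.

```python
import math


def posterior_model_probabilities(log_evidences):
    """Posterior probabilities from log evidences, assuming flat model priors.

    The maximum is subtracted before exponentiating so that large negative
    log evidences (typical of free energy approximations) do not underflow.
    """
    max_le = max(log_evidences)
    weights = [math.exp(le - max_le) for le in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]
```

Only differences in log evidence matter here: with two models, a difference of ln 3 (about 1.1) in favor of one model gives it a posterior probability of 0.75.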

The lower panels show the same log evidences but in image format, to illustrate the relative evidence for gain effects. The image on the right is under extrinsic top-down gain modulation and suggests greater evidence than the corresponding results on the left, where modulatory effects are precluded. In both cases, the model with extrastriate—but not parietal—gain differences has the greatest evidence. Having identified the best model, we then quantified the changes in model parameters that explain the validity effect.

## **ATTENTIONAL GAIN EFFECTS**

**Figure 8** shows the differences in self-inhibition (top left panels) and backwards modulation of self-inhibition (top right panels) for the model with the highest posterior probability above. The upper panels show the differences as connectivity matrices indicating changes in connection strength. This means that differences in self-inhibition are located along the leading diagonal, while differences in backward connections are restricted to the upper diagonal elements. The middle panels show the same results but in terms of the posterior expectations for differences (in connections that changed) and their 90% confidence intervals.

As anticipated, the recurrent or self-inhibition of early visual sources showed a highly asymmetrical difference when attending to the right hemifield (during invalid trials), compared to attending to the left hemifield (during valid trials). When attending to the right hemifield the left V2 source shows a profound decrease in the self-inhibition of superficial pyramidal cells—consistent with a disinhibition or increase in gain. This is accompanied by a slight decrease in the gain or sensitivity of the left extrastriate V3 source and an increase in the right V5 source. Note that these gain differences are in place before the target is presented and, presumably, are instantiated by the cue. When the target arrives, it evokes responses throughout the visual hierarchy that modulate the gain of the lower sources. These effects are mediated by the backward modulatory connections.

With the exception of backward connections from the right parietal source, all the differences in backward modulation between valid and invalid trials are positive, speaking to an increase in gain (or a top-down disinhibition of superficial pyramidal populations). However, it is difficult to predict the changes in gain that are produced by modulatory effects, because this disinhibition could itself be inhibited when top-down afference falls below baseline firing rates. Therefore, we evaluated the changes in gain in early visual sources as a function of peristimulus time for the two conditions. This is possible because we have a biologically plausible forward or generative model that allows us to examine changes in both neuronal states and connectivity—over peristimulus time—using the posterior parameter estimates.

**Figure 8** shows the log gain or precision of the early visual sources, following target presentation for valid (lower left panel) and invalid trials (lower right panel). As expected, there is a marked asymmetry in gain modulation during the prestimulus period that is revised or updated after the target is processed through activity-dependent modulatory mechanisms. Specifically, during valid trials the gain is greater in the appropriate (right) early visual source and then reaches a peak shortly before 200 ms. This peak is complemented by a suppression of gain in the unattended (left) visual source. This can be contrasted with the gain modulation during invalid trials. Here, the attended left source starts off with a slightly higher gain. Furthermore, the unattended source is suppressed more acutely with the arrival of the target. However, after about 120 ms its gain increases markedly, to peak just before 200 ms. This redeployment of precision (cf. reorientation of attention) is the largest gain modulation in both sources and conditions. Interestingly, the gain of the left source also enjoys a slight increase but to a substantially lesser degree. In short, the top-down modulation of gain (through modulatory disinhibition of superficial pyramidal cells) appears to exert a dynamic gain control over peristimulus time and shows marked lateralization, when attention is switched from one hemifield to another.

## **DISCUSSION**

In conclusion, we have used dynamic causal modeling to characterize putative changes in the gain of superficial pyramidal cell populations that might underlie attentional (validity) effects in the Posner paradigm. Our focus on gain mechanisms was motivated by theoretical formulations of attention in terms of optimizing perceptual inference using the expected precision of particular processing streams (Feldman and Friston, 2010). This formulation rests upon predictive coding schemes that the brain might use to infer the causes of sensory consequences it has to explain (Friston and Kiebel, 2009). Our model comparison and quantitative analysis of changes in parameter estimates are remarkably consistent with theoretical predictions.

In brief, the modeling results suggest that, following a cue, sensory channels in the appropriate hemisphere are afforded more precision through the disinhibition of recurrent or self-inhibition of superficial pyramidal cells. These cells are thought to pass sensory information (prediction error) to higher levels to inform perception. When a target appears in an unattended location, the misplaced gain or sensitivity of lower areas is revised or updated by top-down modulatory influences from higher extrastriate and parietal sources. Phenomenologically, this increases the latency and reduces the amplitude of early responses to invalid targets—because they are processed by channels that have an inappropriately low gain. The resulting prediction error induces an update response that reverses the misattribution of gain, producing differences in late or endogenous response components such as the P3b. The P3b is known to be sensitive to probabilistic surprise (Mars et al., 2008; Kolossa et al., 2012) as well as to risk (Schuermann et al., 2012). These results suggest that the larger P300 in response to more unexpected events might be a result of exaggerated precision at lower levels incited by the arrival of an unexpected stimulus.

This application of dynamic causal modeling is slightly more focused than normal applications. We did not explore a large model space but focused on particular synaptic mechanisms as sufficient explanations for condition-specific responses. It is more than likely that there are many models of these differential responses that would produce equally good or better explanations. However, we chose to focus on models that were explicitly informed or constrained by computational and biophysical considerations; namely, that the effects have to be mediated by a neurobiologically plausible gain control that is consistent with normative principles of perceptual inference. This allowed us to validate the theoretical proposals empirically, while providing a principled model space within which to test specific hypotheses about the underlying wetware.

Evidence suggests that gain modulation in pyramidal cells is an important mechanism in visual attention. Electrophysiological studies have demonstrated that attention can enhance the response of visual neurons (likely to be pyramidal cells) by a multiplicative factor (McAdams and Maunsell, 1999; Treue and Martínez-Trujillo, 1999). fMRI studies demonstrate increased BOLD response for attended versus unattended stimuli (Kastner et al., 1998), even if these stimuli are predictable (Kok et al., 2012), and early visual ERPs, which are most strongly determined by pyramidal cell firing, are enhanced by attention (Rauss et al., 2011).

Interestingly, although we were almost forced to model gain control using inhibitory self-connections—because of the relative simplicity of neuronal mass models used by dynamic causal modeling—this particular mechanism makes a lot of sense in relation to current thinking about attention. Convergent evidence implicates local inhibitory processing, mediated by GABAergic neurotransmission, in attention. Drugs working at GABA receptors, such as benzodiazepines, which are positive allosteric modulators of GABA-A receptors, increase the behavioral effect of cues so that reaction time differences to validly and invalidly cued targets become larger, while overall reaction times are slowed (Johnson et al., 1995). Nicotine (an agonist at nicotinic acetylcholine receptors) also affects reaction times in the Posner paradigm, but it decreases the validity effect while increasing reaction times (Thiel et al., 2005; Meinke et al., 2006), and it is believed that the attentional effects of acetylcholine might be mediated at least partly through depression of inhibitory interneuron activity (Xiang et al., 1998; Buia and Tiesinga, 2006). These contrasting effects suggest that the inhibitory interneurons set the gain of their cortical area to determine reaction times. Increasing their effects increases reaction times due to greater overall inhibition, exaggerating the difference between high- and low-gain cortical areas, and *vice versa*. This is consistent with the "biased activation theory" of selective attention (Grabenhorst and Rolls, 2010), which suggests that GABA interneurons mediate competition between stimuli which can be biased through top-down signals (the backwards modulatory connections in this DCM).

In summary, the emerging picture is that attention may be mediated through local intrinsic or recurrent inhibitory mechanisms that form a key part of cortical gain control—and that have characteristic signatures in terms of frequency specific induced responses. This fits comfortably with the theoretical perspective provided by predictive coding—that provides a computational role for recurrent inhibition in encoding the gain or precision of prediction errors in hierarchical processing. The results presented in this paper provide an initial link between these computational imperatives and plausible mechanisms at the level of synaptic processing and hierarchical neuronal circuits.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 June 2013; accepted: 29 October 2013; published online: 02 December 2013.*

*Citation: Brown HR and Friston KJ (2013) The functional anatomy of attention: a DCM study. Front. Hum. Neurosci. 7:784. doi: 10.3389/fnhum.2013.00784*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Brown and Friston. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Post-perceptual processing during the attentional blink is modulated by inter-trial task expectancies

## *Jocelyn L. Sy<sup>1</sup>, James C. Elliott<sup>2,3</sup> and Barry Giesbrecht<sup>2,3</sup>\**

*<sup>1</sup> Department of Psychology, Vanderbilt University, Nashville, TN, USA*

*<sup>2</sup> Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA, USA*

*<sup>3</sup> Institute for Collaborative Biotechnologies, University of California, Santa Barbara, Santa Barbara, CA, USA*

#### *Edited by:*

*Simone Vossel, Wellcome Trust Centre for Neuroimaging, University College London, UK*

#### *Reviewed by:*

*Roberto Dell'Acqua, University of Padova, Italy; Theodore Zanto, University of California at San Francisco, USA*

#### *\*Correspondence:*

*Barry Giesbrecht, Department of Psychological and Brain Sciences, University of California, Santa Barbara, Santa Barbara, CA 93106, USA; e-mail: barry.giesbrecht@psych.ucsb.edu*

The selective processing of goal-relevant information depends on an attention system that can flexibly adapt to changing task demands and expectations. Evidence from visual search tasks indicates that the perceptual selectivity of attention increases when the bottom-up demands of the task increase and when the expectations about task demands engendered by trial history are violated. Evidence from studies of the attentional blink (AB), which measures the temporal dynamics of attention, also indicates that perceptual selectivity during the AB is increased if the bottom-up task demands are increased. The present work tested whether expectations about task demands engendered by trial history also modulate perceptual selectivity during the AB. Two experiments tested the extent to which inter-trial switches in task demands reduced post-perceptual processing of targets presented during the AB. Experiment 1 indexed post-perceptual processing using the event-related potential (ERP) technique to isolate the context sensitive N400 ERP component evoked by words presented during the AB. Experiment 2 indexed post-perceptual processing using behavioral performance to determine the extent to which personal names survive the AB. The results of both experiments revealed that both electrophysiological (Exp. 1) and behavioral (Exp. 2) indices of post-perceptual processing were attenuated when consecutive trials differed in terms of their perceptual demands. The results are consistent with the notion that the selectivity of attention during the AB is modulated not only by within-trial task demands, but also can be flexibly determined by trial-by-trial expectations.

**Keywords: selective attention, event-related potentials, attentional blink, expectancy**

## **INTRODUCTION**

Human selective attention is often characterized as being flexible and dynamic, continually adapting to the information processing demands imposed by the external world and our internal goals and expectations (e.g., Corbetta and Shulman, 2002; Kastner and Pinsk, 2004; Vogel et al., 2005; Ristic and Giesbrecht, 2011; Franconeri et al., 2013). The flexibility of selective attention has been investigated by measuring the processing of stimuli that compete for attentional resources using behavioral or neuroimaging methods (e.g., Yantis and Johnston, 1990; Lavie and Tsal, 1994; Vogel et al., 2005). Demonstrations of the flexibility of attention come from studies showing that selective information processing is not fixed at either early or late stages of representation, but rather is sensitive to task demands. For instance, when attentional selectivity is measured by the behavioral interference caused by information presented at task-irrelevant spatial locations during visual search, both task demands and expectations influence the flexibility of attention. Specifically, increasing the bottom-up task demands by increasing the perceptual similarity between visual search targets and distractors can reduce the behavioral interference caused by task-irrelevant stimuli, suggesting that increasing the bottom-up task demands increases the perceptual selectivity of attention (e.g., Lavie and Cox, 1997). Other studies have demonstrated that the selectivity of attention during visual search is also modulated by expectations generated by trial-by-trial task dependencies. For example, during visual search tasks in which difficulty varies from trial to trial, when the difficulty on trial<sub>*n*</sub> and trial<sub>*n*−1</sub> is different (switch trials) the amount of interference caused by stimuli presented at task-irrelevant locations can be reduced compared to when the search difficulty on consecutive trials is the same (repeat trials, e.g., Theeuwes et al., 2004).
Together these studies, as well as other similar behavioral and neuroimaging evidence (e.g., Yantis and Johnston, 1990; Handy et al., 2001; Yi et al., 2004), support the notion that the selectivity of spatial attention is not fixed, but rather flexibly adapts to both the inherent difficulty of the task as well as one's expectations about the task.

The flexibility of attention has not only been observed in spatial visual search tasks, but also in studies designed to measure the temporal dynamics of attention. The temporal dynamics of attention are typically investigated by examining the influence of selecting and identifying one target (T1) on the processing of a subsequent target (T2). These targets can either be presented within a rapid sequence of distractors (e.g., Raymond et al., 1992; Chun and Potter, 1995) or presented briefly and then masked (e.g., Duncan et al., 1994; Ward et al., 1996). Observers typically have no difficulty reporting T1, but T2 detection and/or identification is impaired when it is presented within 200–500 ms of T1 (e.g., Raymond et al., 1992). This impairment is known as the attentional blink (AB) and it is thought to represent the temporal dynamics of selection and consolidation processes (for recent reviews, see Dux and Marois, 2009; Martens and Wyble, 2010). Classic behavioral and electrophysiological studies of the AB have demonstrated that despite the severe impairment in T2 performance, semantic information about T2 survives the AB and that items presented during the AB can prime subsequent targets (e.g., Luck et al., 1996; Maki et al., 1997; Shapiro et al., 1997; Vogel et al., 1998; Rolke et al., 2001; Dux and Marois, 2008). Based on this evidence, theoretical accounts of the AB typically assume that semantic processing is preserved during the AB and that the impairment in T2 performance occurs because of a post-perceptual failure of attention (e.g., Chun and Potter, 1995; Raymond et al., 1995; Olivers and Meeter, 2008).

"fnhum-07-00627" — 2013/10/4 — 18:06 — page 1 — #1

In contrast to the studies showing spared semantic processing during the AB, more recent studies have demonstrated that semantic information about T2 does not always survive the AB. For instance, Vachon and Jolicoeur (2011) and Vachon et al. (2007) have reported both behavioral and electrophysiological evidence that semantic processing within the AB can be suppressed when there is a task-switch between T1 and T2. The reduction in semantic processing presumably occurs because the reconfiguration of the attentional-set from one task to the other is a resource-demanding process that interferes with the perceptual processing of T2 (Vachon et al., 2007; Vachon and Jolicoeur, 2011). Similarly, Giesbrecht et al. (2007, 2009) have used both electrophysiological and behavioral approaches to demonstrate that increasing T1 task load can suppress the extent to which semantic and high priority information (e.g., personal names) can survive the AB.

While the evidence from the AB showing reduced post-perceptual processing (i.e., increased selectivity) with increasing task demands parallels the results of the visual search tasks showing reduced flanker interference and increased perceptual selectivity with increased perceptual load, there is a critical difference: in the studies of the AB, the selectivity of attention is measured by post-perceptual processing of a task-relevant stimulus, whereas in the visual search task, selectivity is measured by the post-perceptual processing and subsequent interference caused by task-irrelevant stimuli. However, recent behavioral evidence has revealed that, much like in the visual search tasks described above, increasing T1-task load can reduce the interference caused by task-irrelevant flankers presented simultaneously with T2 during the AB (Elliott and Giesbrecht, 2010). Thus, when one considers the evidence together, the data are consistent with the notion that the perceptual demands of the T1 task can modulate the selectivity of attention within the AB, when it is measured by the post-perceptual processing of task-relevant information and when it is measured by the post-perceptual processing of task-irrelevant information.

The recent empirical evidence in the literature is consistent with the notion that the selectivity of attention during the AB is flexible and modulated by the T1 task demands. However, it is unclear whether the temporal dynamics of attention are modulated by expectancies generated by inter-trial dependencies of T1 task demands. To clarify this issue, we tested whether the expectancies engendered by task-demand dependencies between trials modulate post-perceptual processing during the AB. In two experiments, participants were presented with two masked targets displayed in rapid succession. In both experiments, the first target (T1) was a flanker-type stimulus consisting of a single arrow flanked by pairs of arrows pointing either in the same direction (congruent, e.g., >>>>>) or in different directions (incongruent, e.g., <<><<). We refer to the congruent and incongruent conditions as low and high T1 load, respectively (Giesbrecht et al., 2007, 2009). Unlike previous studies that have used blocked T1 load conditions to demonstrate the effects of load on post-perceptual processing of information presented during the AB (i.e., Giesbrecht et al., 2007, 2009), in the present experiment the two types of T1 load trials were randomly intermixed within experimental blocks. The random intermixing of trials allowed us to investigate the effects of inter-trial dependencies on post-perceptual processing during the AB by permitting the analysis of the data as a function of whether the T1 load on a given trial was the same as the previous trial (i.e., a T1-repeat trial) or was different than the previous trial (i.e., a T1-switch trial). In Experiment 1, post-perceptual processing during the AB was indexed by measuring the context sensitive N400 event-related potential (ERP) evoked by T2. In Experiment 2, post-perceptual processing was indexed by measuring the extent to which personal names survive the AB.
Based on studies of spatial attention (Theeuwes et al., 2004) and previous studies of the AB (Giesbrecht et al., 2007, 2009; Vachon et al., 2007; Elliott and Giesbrecht, 2010; Vachon and Jolicoeur, 2011), we predicted that the additional demands required on T1-switch trials should decrease post-perceptual processing during the AB, relative to T1-repeat trials. Consistent with this prediction, we observed that T1-switches in load resulted in less semantic processing during the AB in both experiments.
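The repeat/switch classification described above depends only on the T1 load of the current and preceding trial. A minimal sketch follows; the function name and the "low"/"high" labels are hypothetical, and the first trial of a sequence is left unlabeled because it has no predecessor (how the authors treated such trials is not stated here).

```python
def label_trial_transitions(t1_loads):
    """Label each trial as 'repeat' or 'switch' relative to the previous trial.

    The first trial gets None because it has no preceding trial.
    """
    if not t1_loads:
        return []
    labels = [None]
    for prev, curr in zip(t1_loads, t1_loads[1:]):
        labels.append("repeat" if curr == prev else "switch")
    return labels


# Example with a hypothetical five-trial sequence of T1 load conditions.
labels = label_trial_transitions(["low", "low", "high", "high", "low"])
# labels == [None, "repeat", "switch", "repeat", "switch"]
```

In practice the labeling would be applied separately within each experimental block, so that the first trial of a block is not compared with the last trial of the previous one.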

## **EXPERIMENT 1**

## **RATIONALE**

The purpose of Experiment 1 was to test if expectancies generated by inter-trial T1 task dependencies modulate the processing and availability of semantic information presented during the AB. To do so, we revisited the finding that the context-sensitive N400 ERP component survives the AB (Luck et al., 1996). A context word was presented at the beginning of each trial, followed by a masked flanker stimulus (T1) and a word (T2) that was either related or unrelated to the context word presented at the beginning of the trial. The magnitude of the context sensitive N400 ERP (e.g., Kutas and Hillyard, 1980) was quantified by computing the mean amplitude of the difference wave of unrelated–related trials between 300 and 500 ms post T2 stimulus onset (Luck et al., 1996; Vogel et al., 1998). Using a similar task in which high and low T1 load were presented in different blocks of trials, we previously demonstrated that the N400 evoked by T2 was not modulated by the AB when T1 load was low, but was completely suppressed during the AB when T1 load was high (Giesbrecht et al., 2007). The key issue in the present work is whether trial-by-trial dependencies generated when T1 load is mixed within a block of trials alters this pattern. Specifically, if semantic processing of T2 is not constrained by expectancies engendered by inter-trial T1 task dependencies, then an N400 should be observed in all conditions. However, if the attentional demand imposed by inter-trial T1-switches modulates the extent to which semantic processing occurs, then the magnitude of the N400 should be reduced during the AB under switch compared to repeated conditions.

## **MATERIALS AND METHODS**

## *Participants*

Twelve undergraduates from the University of California, Santa Barbara (UCSB) provided informed consent and were paid \$10/hour for their participation (mean age = 19; 9 female). The UCSB Human Subjects Committee approved all procedures.

## *Apparatus and stimuli*

Stimulus presentation was controlled using custom scripts written for MATLAB (Mathworks, Inc., Boston, MA, USA) and the Psychophysics Toolbox (Brainard, 1997). T1 stimuli were black and consisted of a central arrow (0.4° × 0.4°) centered between two pairs of arrows (0.4° × 1.1°). The distance between adjacent arrows was 0.15°. The complete target stimulus subtended 0.4° × 2.6°. The context word presented at the beginning of the trial and the T2 word were black and white, respectively. Both were presented in uppercase 32-point Arial font. Each character subtended approximately 0.4° × 0.4°. T1 and T2 masks were strings of black numbers and uppercase letters the same length as the respective target. All stimuli were presented on a neutral gray background and viewed on a 19-inch color monitor from a distance of 125 cm.
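As a rough guide to the physical size of these stimuli, a visual angle θ viewed from distance d corresponds to an extent of 2d·tan(θ/2). The following Python sketch applies that standard relation to the reported angles and the 125 cm viewing distance. The function name is illustrative and is not taken from the authors' MATLAB scripts.

```python
import math


def visual_angle_to_cm(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 125.0  # viewing distance reported in the text

# Angles taken from the stimulus description above.
for label, angle_deg in [("central arrow", 0.4), ("complete T1 stimulus (long side)", 2.6)]:
    size_cm = visual_angle_to_cm(angle_deg, VIEWING_DISTANCE_CM)
    print(f"{label}: {angle_deg} deg is about {size_cm:.2f} cm")
```

At this distance the small-angle approximation (size ≈ d·θ in radians) gives nearly the same values, so either form is adequate for stimuli of this size.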

## *Procedure*

Each trial began with a random fixation interval (500–1000 ms), followed by the context word (1000 ms). After the context word there was a second random delay (750–1250 ms), followed by the presentation of T1 (53.3 ms) and the T1 mask (53.3 ms; T1-mask ISI = 53.3 ms). After the temporal lag (either 320 or 920 ms) elapsed, T2 was presented (40 ms) and then masked (40 ms; T2-mask ISI = 40 ms). After a third random delay (750–1250 ms) subjects were prompted to indicate their responses for T1 and T2. Subjects were instructed to read the context word presented at the beginning of the trial, identify the direction of the T1 central arrow (left or right) and determine whether T2 was related or unrelated to the context word. All responses were unspeeded and typed on the keyboard. After the responses were recorded, fixation returned to the screen and the participant started the next trial when ready. A sample trial sequence is shown in **Figure 1**.
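The timing of the T1, mask, and T2 events within a trial can be laid out explicitly. The sketch below assumes that the T1-mask ISI runs from T1 offset to mask onset and that the lag is measured between the onsets of T1 and T2 (as defined in the Design section). The dictionary keys are illustrative names only.

```python
# Durations reported in the Procedure (ms).
T1_MS = 53.3
T1_MASK_ISI_MS = 53.3  # assumption: measured from T1 offset to mask onset
T1_MASK_MS = 53.3


def t1_t2_timeline(lag_ms: float) -> dict:
    """Event times (ms, relative to T1 onset) for a given T1-T2 onset lag."""
    t1_offset = T1_MS
    mask_onset = t1_offset + T1_MASK_ISI_MS
    mask_offset = mask_onset + T1_MASK_MS
    return {
        "t1_onset": 0.0,
        "t1_offset": t1_offset,
        "t1_mask_onset": mask_onset,
        "t1_mask_offset": mask_offset,
        "t2_onset": lag_ms,
        # Unfilled interval between the end of the T1 mask and T2 onset.
        "blank_before_t2": lag_ms - mask_offset,
    }


for lag in (320.0, 920.0):
    timeline = t1_t2_timeline(lag)
    print(lag, {name: round(value, 1) for name, value in timeline.items()})
```

Under these assumptions the T1 mask ends roughly 160 ms after T1 onset, so T2 follows the mask by about 160 ms at the short lag and about 760 ms at the long lag.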

## *Design*

There were four independent variables: T1 load, T1 inter-trial dependency, T2-relationship, and T1–T2 lag. T1 load was manipulated by the direction of the flankers relative to the central arrow and was either congruent (i.e., >>>>> or <<<<<) or incongruent (i.e., <<><< or >><>>). Because the different T1 load conditions were intermixed, each trial could be categorized as a T1-repeat trial (when T1 load on trial *n* was the same as on trial *n*−1) or a T1-switch trial (when T1 load on trial *n* was different from trial *n*−1). T2-relationship specified the semantic association, either related or unrelated, between T2 and the context word. The specific words were compiled from previously published studies and norms (Postman and Keppel, 1970; Giesbrecht et al., 2004, 2007) and consisted of 300 related word pairs. Each word pair was randomly assigned to one of the load conditions, under the constraint that across subjects each pair was assigned to each of the load conditions an equal number of times. Unrelated word lists were created by randomly shuffling the related word pairs (Giesbrecht and Kingstone, 2004; Smallwood et al., 2011). T1–T2 lag was the temporal interval between the onsets of T1 and T2 and it was either 320 or 920 ms. T2-relationship and T1–T2 lag conditions were randomly intermixed within each block. There were 600 total trials (75 trials in each condition) that were divided into 10 blocks (five for each load condition) of 60 trials. Prior to the experimental trials, subjects were given 10 practice trials.

**FIGURE 1 | (A)** A schematic illustration of the trial sequence in Experiment 1. **(B)** Mean proportion of correct responses on the first target (T1) task, plotted as a function of T1 load (high/low) and inter-trial T1-load dependency (repeat/switch). In this and subsequent figures, error bars represent the standard error of the mean calculated in a manner appropriate for within-subjects experimental designs (Loftus and Masson, 1994).
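The repeat/switch categorization described in the Design can be sketched as a comparison of each trial's T1 load with that of the preceding trial. This is a minimal illustration, not the authors' analysis code. The load sequence is hypothetical, and leaving the first trial unlabelled is an assumption, since the text does not say how trials without a predecessor were handled.

```python
from typing import List, Optional


def label_dependency(loads: List[str]) -> List[Optional[str]]:
    """Label each trial 'repeat' or 'switch' relative to the preceding trial.

    The first trial has no predecessor and is labelled None (an assumption;
    the text does not describe how such trials were treated).
    """
    if not loads:
        return []
    labels: List[Optional[str]] = [None]
    for previous, current in zip(loads, loads[1:]):
        labels.append("repeat" if current == previous else "switch")
    return labels


# Hypothetical sequence of T1 load conditions within one block.
block_loads = ["low", "low", "high", "high", "low", "high"]
print(label_dependency(block_loads))
```

Applying the function separately to each block would avoid comparing the first trial of one block with the last trial of the previous block.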

## *Recording and analysis*

Electroencephalographic (EEG) activity was recorded at 256 Hz from 32 Ag/AgCl sintered electrodes mounted in an elastic cap and placed according to the International 10/20 System. The horizontal and vertical electrooculograms (EOG) were recorded from electrodes placed 1 cm lateral to the external canthi (left and right) and above and below each eye, respectively. The data were re-referenced offline to the average of the signal recorded from electrodes placed on the left and right mastoids and then band-pass filtered (0.1–30 Hz). Trials containing ocular artifacts (blinks and eye movements) detected by EOG amplitudes exceeding ±50 µV or by visual inspection were excluded from the analysis. The average percentage of trials that were rejected was 6.9% (range 1.3–15.2%).

The average ERP waveforms in all conditions were computed time-locked to the onset of T2 and included a 200 ms prestimulus baseline and 600 ms poststimulus interval. The N400 was isolated by subtracting the resulting ERP waveforms on related trials from the ERP waveforms on unrelated trials. It is important to note that for a given subject, lag, and load condition the T2 word was exactly the same (only the context word was different); therefore, any modulations observed in the resulting difference wave cannot be attributed to physical stimulus differences. The magnitude of the N400 was quantified as the mean amplitude of the difference waves over the 300–500 ms post-T2 time window. N400 measurements were obtained from frontal, central, and parietal electrodes (F3, Fz, F4, C3, Cz, C4, P3, Pz, P4; Luck et al., 1996; Vogel et al., 1998; Giesbrecht et al., 2007). As with previous studies, the mean amplitudes included both T2 correct and T2 incorrect trials (Luck et al., 1996; Vogel et al., 1998; Giesbrecht et al., 2007). The inclusion of both correct and incorrect trials should increase the likelihood that an N400 will be observed during the AB because semantic access is more likely to occur on T2 correct trials. Thus, any observed reduction in the magnitude of the N400 during the AB is likely to be an underestimate of the true reduction of semantic processing. Unless mentioned otherwise, within-subjects ANOVAs were used for all statistical analyses, and the *p*-values were adjusted in accordance with the Greenhouse-Geisser epsilon value.
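The N400 quantification described above (baseline correction, unrelated minus related difference wave, mean amplitude over 300–500 ms) can be sketched for a single electrode as follows. This is a simplified illustration using only the Python standard library. The waveforms are synthetic, the function names are hypothetical, and treating the window bounds as inclusive is an assumption.

```python
from statistics import fmean

SFREQ_HZ = 256.0         # sampling rate reported in the text
EPOCH_START_MS = -200.0  # 200 ms prestimulus baseline
EPOCH_END_MS = 600.0     # 600 ms poststimulus interval


def sample_times_ms(start_ms, end_ms, sfreq_hz):
    """Sample times (ms relative to T2 onset) covering the epoch."""
    step = 1000.0 / sfreq_hz
    n_samples = int((end_ms - start_ms) / step) + 1
    return [start_ms + i * step for i in range(n_samples)]


def window_mean(wave, times, lo_ms, hi_ms):
    """Mean amplitude over samples with lo_ms <= t <= hi_ms."""
    return fmean(v for v, t in zip(wave, times) if lo_ms <= t <= hi_ms)


def baseline_correct(wave, times, lo_ms=-200.0, hi_ms=0.0):
    """Subtract the mean prestimulus amplitude from every sample."""
    baseline = window_mean(wave, times, lo_ms, hi_ms)
    return [v - baseline for v in wave]


def n400_mean_amplitude(unrelated, related, times, lo_ms=300.0, hi_ms=500.0):
    """Mean of the unrelated-minus-related difference wave in the N400 window."""
    unrelated_bc = baseline_correct(unrelated, times)
    related_bc = baseline_correct(related, times)
    difference = [u - r for u, r in zip(unrelated_bc, related_bc)]
    return window_mean(difference, times, lo_ms, hi_ms)


# Synthetic averaged ERPs (microvolts): a constant offset on related trials
# and an added negative deflection of 3 microvolts between 300 and 500 ms
# on unrelated trials.
times = sample_times_ms(EPOCH_START_MS, EPOCH_END_MS, SFREQ_HZ)
related_erp = [1.0 for _ in times]
unrelated_erp = [1.0 + (-3.0 if 300.0 <= t <= 500.0 else 0.0) for t in times]

print(f"N400 mean amplitude: {n400_mean_amplitude(unrelated_erp, related_erp, times):.2f}")
```

In the reported analysis this measure was obtained at each of the nine electrodes and for each condition before being entered into the ANOVA.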

## **RESULTS**

## *Behavior*

*T1 accuracy*. The mean proportion of correct T1 responses is plotted as a function of T1 load (low/high) and inter-trial dependency (repeat/switch) in **Figure 1B**. Overall mean performance was 0.78 (SEM = 0.035). There was a significant effect of T1 load, such that performance was lower when T1 load was high (*M* = 0.64, SEM = 0.068) relative to when T1 load was low (*M* = 0.92, SEM = 0.021; *F*(1,11) = 15.58, *p* < 0.003, MSE = 0.062). Neither the main effect of inter-trial dependency nor the load × inter-trial dependency interaction was significant (both *F*'s < 1).

*T2 accuracy*. The mean proportion of correct T2 responses is plotted as a function of T1 load, inter-trial dependency, and lag in **Figure 2A**. Overall performance was lower at the short lag compared to the long lag (*F*(1,11) = 9.81, *p* < 0.02, MSE = 0.013). While visual inspection of **Figure 2A** suggests that there is an interaction between inter-trial dependency and lag, such that at the short lag performance on switch trials was lower than on repeat trials, this interaction was not significant (*F*(1,11) = 2.16, *p* = 0.17, MSE = 0.013). No other effects were statistically significant.

*AB magnitude*. Two analyses were performed using AB magnitude as an index of the severity of the performance decrement caused by the T1 load and trial dependency manipulations. AB magnitude was computed by subtracting each individual's performance at the short lag (320 ms) from an optimal performance baseline (Jackson and Raymond, 2006; Giesbrecht et al., 2009). In the present experiment, the performance baseline for all conditions was the accuracy in the 920 ms lag, low load-repeat condition (i.e., the condition in which T2 accuracy should be optimal). It was appropriate to select this data point to serve as the optimal performance baseline for all conditions because the T2 stimuli were exactly the same in all conditions. It is important to note that because AB magnitude was computed relative to a single estimate of optimal performance (i.e., 920 ms lag, low load-repeat condition) instead of relative to a within-condition estimate of optimal performance (e.g., the 920 ms lag within each condition), the ANOVA on AB magnitude was not redundant with the ANOVA on T2 accuracy including lag as a factor reported in the preceding paragraph. Using this metric of AB magnitude, the first analysis tested whether the severity of the AB was modulated by trial dependency and load using a repeated measures ANOVA. The results of this analysis revealed a trend for an effect of inter-trial dependency (*F*(1,11) = 3.40, *p* < 0.1), but no other significant effects. While the ANOVA using this metric of AB magnitude as the dependent measure can indicate whether the severity of the AB is modulated by the experimental factors, it does not indicate the presence of an AB within a specific condition. Thus, the second analysis tested for the presence of an AB within each condition. To identify the presence of the AB, one-sample *t*-tests were performed, testing whether the AB magnitude in each condition was significantly different from zero (i.e., no AB). A false discovery rate correction (FDR; Benjamini and Hochberg, 1995) was applied to correct for multiple comparisons (*p* < 0.05). The results of this analysis are shown in **Figure 2B**. The key finding of this analysis was that AB magnitude was significantly different from zero in all conditions (FDR-corrected *p*'s < 0.006), except for the repeat low load condition (FDR-corrected *p* > 0.28).
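The two steps of this analysis, computing AB magnitude against a single baseline and applying a false discovery rate correction across conditions, can be sketched as below. This is an illustrative outline rather than the authors' code. All accuracies and *p*-values are hypothetical, and the step-up procedure shown is the standard Benjamini-Hochberg rule, which the text cites but does not spell out.

```python
import math
from statistics import mean, stdev


def ab_magnitude(baseline_accuracy, short_lag_accuracy):
    """Per-subject AB magnitude: baseline accuracy minus short-lag accuracy."""
    return [b - s for b, s in zip(baseline_accuracy, short_lag_accuracy)]


def one_sample_t(values, mu=0.0):
    """t statistic for a one-sample test of the mean against `mu`."""
    return (mean(values) - mu) / (stdev(values) / math.sqrt(len(values)))


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking tests rejected at FDR level `q`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) whose sorted p-value satisfies p_(k) <= (k/m) * q.
    max_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            max_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[index] = True
    return rejected


# Hypothetical per-subject accuracies. The baseline stands in for the
# 920 ms lag, low load-repeat condition; the other list is one condition
# at the 320 ms lag.
baseline = [0.90, 0.85, 0.95, 0.90]
short_lag = [0.80, 0.80, 0.80, 0.80]
magnitudes = ab_magnitude(baseline, short_lag)
print("t statistic vs. zero:", round(one_sample_t(magnitudes), 3))

# Hypothetical uncorrected p-values, one per condition.
condition_p = [0.001, 0.004, 0.005, 0.28]
print("significant after FDR:", benjamini_hochberg(condition_p))
```

Converting the *t* statistic to a *p*-value requires the *t* distribution, which is omitted here to keep the sketch within the standard library.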

## *Electrophysiology*

The ERP results are summarized in **Figure 3**. The mean N400 difference waves measured at central electrodes (C3/Cz/C4) are shown in **Figure 3A** as a function of inter-trial dependency, lag and time. The scalp topography during the N400 time window is shown in **Figure 3B**. The mean amplitude at all electrodes included in the analysis is plotted as a function of inter-trial dependency and lag in **Figure 3C**. Finally, the N400 mean amplitude is plotted as a function of inter-trial dependency, load, and lag for left, midline, and right electrodes in **Figure 3D**. The mean amplitudes were entered into a repeated measures ANOVA that included T1-load, inter-trial dependency, lag, anterior-posterior electrode position (frontal, central, parietal), and left-right electrode position (left, midline, right) as factors. The key finding that emerged from this analysis was a significant interaction between inter-trial dependency and lag (*F*(1,11) = 5.29, *p* < 0.05, MSE = 16.32). Inspection of **Figure 3C** suggests that this interaction is being driven by the fact that the N400 is not modulated by lag on repeat trials, but is on switch trials. *Post-hoc t*-tests confirmed this interpretation by revealing that there was no effect of lag on T1-repeat trials (*t*(11) = 1.07, *p* > 0.30), but the N400 was significantly smaller at the 320 ms lag than the 920 ms lag (*t*(11) = 2.93, *p* < 0.02) on T1-switch trials. This interaction is clearly visible not only in the mean amplitude data (**Figure 3C**), but also in the waveforms and scalp topographies (**Figures 3A,B**), all of which show a robust N400 on T1-repeat trials both inside and outside the AB, but a reduced N400 on T1-switch trials during the AB. There was also a three-way interaction between inter-trial dependency, lag, and electrode left-right position (*F*(2,22) = 5.34, *p* < 0.014, MSE = 0.788). This interaction (plotted with the additional factor of load in **Figure 3D**) was such that the inter-trial dependency × lag interaction (i.e., an effect of lag on switch trials, but not on repeat trials) was stronger at left electrode sites than at midline and right electrode sites. Interestingly, while there is suggestive visual evidence that the effect of switching from high to low load had a greater impact on the N400 at short temporal lags than switching from low to high load, the load × inter-trial dependency × lag interaction was not significant (*F*(1,11) = 2.92, *p* > 0.12, MSE = 22.51). The remaining main effects and interactions were also not statistically significant.

**FIGURE 3 | (A)** Unrelated–related difference waves illustrating the N400 measured at central electrodes (average at electrodes C3/Cz/C4). **(B)** Scalp topography of the N400 mean amplitude computed over the 300–500 ms time window. **(C)** Mean N400 amplitude plotted as a function of lag (320/920 ms) and inter-trial dependency. The mean amplitude was computed over the 300–500 ms post-T2 time window and averaged across all electrode sites included in the analysis (see Materials and Methods). **(D)** Mean N400 amplitude measured at left (F3/C3/P3), midline (Fz/Cz/Pz), and right electrodes (F4/C4/P4) plotted as a function of inter-trial dependency (repeat/switch), lag (320/920 ms), and T1 load (low/high).

Visual inspection of the difference ERP waveforms plotted in **Figure 3A** suggests that the baseline of the 920 ms lag waveform on repeat trials is generally more positive than the corresponding condition on switch trials. To assess the extent to which this apparent modulation in the baseline contributes to the inter-trial dependency × lag interaction, we ran a control analysis using a shorter pre-stimulus baseline interval (50 ms). The resulting rebaselined difference waves and mean amplitudes are shown in **Figure 4**. While the overall inter-trial dependency × lag interaction failed to reach significance, the inter-trial dependency × lag × electrode left-right position interaction was significant (*F*(2,22) = 4.06, *p* < 0.04, MSE = 1.69). As in the original analysis, and as can be clearly observed in the mean amplitudes shown in **Figure 4B**, this interaction was such that the inter-trial dependency × lag interaction was robust over left electrodes. In contrast, at midline and right electrodes, the primary modulator of the N400 was temporal lag. This control analysis suggests that the inter-trial dependency × lag interaction is not solely being driven by differences in the prestimulus baseline, but rather is being driven by changes that are mediated by the interaction between trial-by-trial expectancies about task demands and the attentional demands caused by the AB.

## **SUMMARY**

The key finding in Experiment 1 was that the magnitude of the N400 was attenuated during the AB on T1-switch trials, but not on T1-repeat trials. This finding suggests that post-perceptual processing during the AB was modulated by the inter-trial task-demand expectancies and that the violation of this expectancy on T1-switch trials served to increase the selectivity of attention compared to when this expectancy was not violated. Interestingly, while there was an inter-trial dependency × lag interaction there was not an interaction between dependency, load, and lag. In other words, the dependency × lag interaction described above was not modulated by load. This is interesting because it suggests that the previously reported effect of load on the N400, which serves to completely suppress the N400 during the AB (Giesbrecht et al., 2007), can be reversed by the context provided by the inter-trial dependencies.

A second result was that while inter-trial dependency did not influence T1 accuracy or overall T2 accuracy, there was suggestive evidence that dependency did affect the presence of the AB. Specifically, there was a significant AB on T1-low load switch trials, but not on T1-low load repeat trials. Interestingly, in our previous study, an AB was observed even when T1-load was low (Giesbrecht et al., 2007). The absence of a significant AB in the low-load repeat condition suggests that the expectancy generated by the inter-trial dependency causes a decrease in the difficulty of the T1 task that is sufficient to result in the absence of the AB on low-load trials. However, this result should be interpreted with caution because of the lack of an effect of dependency on T1 accuracy, the lack of an interaction between dependency and lag on T2 accuracy, and the lack of a significant main effect of trial dependency on AB magnitude.

## **EXPERIMENT 2**

## **RATIONALE**

To provide additional evidence that expectations engendered by trial-by-trial dependencies can modulate the selectivity of attention during the AB, we revisited another classic demonstration of post-perceptual processing during the AB: the finding that one's own name is not subject to the AB (Shapiro et al., 1997). Experiment 2 tested whether the extent to which one's own name survives the AB is modulated by inter-trial load. There were two key manipulations. First, both T1-load and inter-trial dependency were manipulated utilizing the same flanker task as in Experiment 1. However, because the behavioral effects on T1 performance and T2 performance were weak, we changed the T1 stimulus from black to white. The rationale was that the color change would make the flankers more salient and increase the likelihood that they would interfere with performance. Second, T2 was either the participant's own name (T2-own) or someone else's name (T2-other). If processing of high priority information during the AB is not constrained by task demands imposed by a switch trial, then there should be no AB for T2-own, but there should be an AB for T2-other, irrespective of a switch in T1 congruency. However, if switches between trials influence the extent to which high priority information is processed, then the difference in AB magnitude between T2-own and T2-other conditions should be attenuated on switch trials compared to repeat trials.

## **MATERIALS AND METHODS**

## *Participants*

Fifteen undergraduates from the University of California, Santa Barbara participated in a single 45 min session for credit in an introductory psychology class (8 female).

## *Equipment and stimuli*

The T1 and mask stimuli, equipment, and stimulus control procedures were the same as in Experiment 1. The T2 stimuli were the subject's own name and names from the database of registered birth names available from the United States Social Security Administration (http://www.socialsecurity.gov/OACT/babynames/). To provide a rough control for exposure to names other than one's own name, the 50 most popular male and female names were selected from the list of names that corresponded to the most common year of birth of the largely freshman introductory psychology class from which our sample was drawn (1987). All names were presented in black uppercase 32-point Arial font. Each character subtended 0.4° × 0.4°.

## *Design*

There were two changes in the design from Experiment 1. First, T2 was either the participant's own name or another name from the list. The participant's own name appeared on one eighth of the trials. Second, the lag between the onsets of the first and second targets ranged from 200 to 800 ms in steps of 120 ms. All variables were combined factorially and randomly intermixed.

## *Procedure*

Each trial started with a fixation cross that remained on the screen until the participant initiated the trial by pressing the space bar. After the trial was initiated, there was a random delay (500–1000 ms) followed by the presentation of T1 and its mask (duration = 53.3 ms; T1-mask ISI = 53.3 ms). After the temporal lag elapsed, T2 was presented (40 ms) and then masked (40 ms; T2-mask ISI = 40 ms). On half the trials T2 was a male name and on the other half it was a female name. At the end of the trial, participants were instructed to indicate the direction of the central arrow (left or right) and then whether the name was a male or a female name. All responses were unspeeded and typed on the keyboard. After the responses were indicated, the fixation cross reappeared, and the participant started the next trial when ready. An example of the trial sequence is shown in **Figure 5A**. Participants completed one block of 10 practice trials, followed by 10 blocks of 48 trials.
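The design parameters of Experiment 2 imply a small set of lags and trial counts, which the short sketch below works out. It assumes that the own-name proportion (one eighth) applies to the 480 experimental trials and that all 15 participants entered the analyses. The variable names are illustrative.

```python
# Lags from 200 to 800 ms in steps of 120 ms, as stated in the Design.
lags_ms = list(range(200, 801, 120))

n_blocks = 10
trials_per_block = 48
total_trials = n_blocks * trials_per_block

# Assumption: one eighth of the experimental trials showed the participant's own name.
own_name_trials = total_trials // 8
other_name_trials = total_trials - own_name_trials

# Degrees of freedom for a within-subjects main effect of lag,
# assuming all 15 participants were included.
n_participants = 15
df_lag = (len(lags_ms) - 1, (len(lags_ms) - 1) * (n_participants - 1))

print("lags (ms):", lags_ms)
print("total trials:", total_trials)
print("own-name / other-name trials:", own_name_trials, "/", other_name_trials)
print("df for lag effect:", df_lag)
```

Six lags and 15 participants give degrees of freedom of (5, 70), which agrees with the *F*(5,70) values reported for the lag effects in the Results.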

## **RESULTS**

*T1 task accuracy.* The mean proportion of correct responses is plotted as a function of inter-trial dependency, load, and name in **Figure 5B**. There was a significant main effect of inter-trial switch where T1 accuracy was worse in switch trials (*M* = 0.80) compared to repeat trials (*M* = 0.83; *F*(1,14) = 10.23, *p* < 0.007, MSE = 0.002). Overall performance was also higher on low load trials (*M* = 0.93) than on high load trials (*M* = 0.70; *F*(1,14) = 33.34, *p* < 0.001, MSE = 0.048). The only significant interaction was the inter-trial dependency × load × name interaction (*F*(1,14) = 4.87, *p* < 0.05, MSE = 0.002), which appeared to be driven by overall lower performance in the T1-switch high load condition when T2 was someone else's name.

*T2 task accuracy.* The mean proportion of correct T2 responses is shown in **Figure 6A**. Overall, inter-trial dependency modulated performance, such that overall performance was lower on switch trials than on repeat trials (*F*(1,14) = 5.53, *p* < 0.04, MSE = 0.007). Mean accuracy was also lower for T2-other (*M* = 0.82) compared to T2-own (*M* = 0.92; *F*(1,14) = 54.84, *p* < 0.001, MSE = 0.035). There was a main effect of lag, where T2 report was worse at shorter lags than longer lags (*F*(5,70) = 11.36, *p* < 0.001, MSE = 0.013).

There were two key interactions. First, the effect of lag was more severe for T2-other compared to T2-own (*F*(5,70) = 2.45, *p* < 0.05, MSE = 0.017). Second, and most critically, there was a three-way interaction between inter-trial dependency, name, and lag (*F*(5,70) = 2.46, *p* < 0.05, MSE = 0.010). *Post-hoc* repeated measures ANOVAs revealed that this interaction was driven by the modulation of the name × lag interaction as a function of task dependency. Specifically, on repeat trials, there was no effect of lag for T2-own, but a large effect of lag for T2-other (name × lag: *F*(5,70) = 3.64, *p* < 0.006, MSE = 0.018). In contrast, on switch trials, there was an effect of lag (*F*(1,14) = 42.47, *p* < 0.001, MSE = 0.02) and name (*F*(1,14) = 17.60, *p* < 0.002, MSE = 0.015), but no interaction (*F* < 1).

*AB magnitude.* To further address the influence of inter-trial task dependencies on post-perceptual processing, we performed two AB magnitude analyses similar to those performed in Experiment 1. AB magnitude was computed by subtracting mean performance during the AB (lags 200–320 ms) from an optimal performance baseline. The baseline used in Experiment 2 was the condition in which T1-load was repeated and in which T2 was presented at the longest lag (800 ms). The resulting mean AB magnitude data are shown in **Figure 6B**. In the first analysis, AB magnitude was entered into a repeated measures ANOVA. The key finding was that AB magnitude was modulated by the interaction between inter-trial dependency and name (*F*(1,14) = 6.73, *p* < 0.022, MSE = 0.043). *Post-hoc* tests revealed two interesting aspects to this interaction. First, in the T2-own condition, AB magnitude was significantly smaller on repeat trials than on switch trials (*M*<sub>T2-own, repeat</sub> = 0.05, *M*<sub>T2-own, switch</sub> = 0.109; *t*(14) = 3.04, *p* < 0.009). In contrast, in the T2-other condition there was no difference between repeat and switch trials (*M*<sub>T2-other, repeat</sub> = 0.13, *M*<sub>T2-other, switch</sub> = 0.11; *t*(14) = 1.25, *p* > 0.23). Second, on repeat trials AB magnitude was significantly larger in the T2-other condition compared to the T2-own condition (*M*<sub>T2-other, repeat</sub> = 0.13, *M*<sub>T2-own, repeat</sub> = 0.05; *t*(14) = 2.74, *p* < 0.02), but on switch trials there was no difference in AB magnitude (*M*<sub>T2-other, switch</sub> = 0.11, *M*<sub>T2-own, switch</sub> = 0.109; *t*(14) = 0.04, *p* < 0.97). In the second analysis, just as in Experiment 1, the presence of the AB in each condition was identified using one-sample *t*-tests (vs. zero). An FDR correction (Benjamini and Hochberg, 1995) was applied to correct for multiple comparisons (*p* < 0.05). The key finding of this analysis was that AB magnitude was significantly different from zero in all conditions (FDR-corrected *p*'s < 0.02), except for the own-name repeat condition under both low and high load (FDR-corrected *p* > 0.11). In addition, AB magnitude in the other-name switch condition under low load was also not significantly different from zero (FDR-corrected *p* > 0.11).

## **SUMMARY**

The survival of personally meaningful information during the AB has been used to argue that some post-perceptual information is available during the AB (Shapiro et al., 1997). Overall performance on repeat trials replicated this previous finding, showing that there is no AB in response to one's own name, but there is an AB to other people's names. There were two main findings that were novel. First was the finding that expectancies engendered by inter-trial task dependencies modulated the severity of the AB when the second target was one's own name. Second, overall there was an AB in both T2-own name and T2-other name conditions when T1 load switched from the previous trial. Together, both the mere presence of an AB for one's own name on switch trials and the fact that the severity of the AB for one's own name can be modulated by inter-trial task dependencies (i.e., AB magnitude was larger on switch relative to repeat trials) support the idea that the post-perceptual processing of high priority stimuli can be attenuated during the AB by a violation of trial-by-trial expectancies generated during the course of one's experience with a task. One exception to this pattern was in the other-name switch condition under low load, in which the test for the presence of the AB did not reach the FDR-corrected threshold. When using an uncorrected threshold the AB magnitude was different from zero (*p* < 0.04, uncorrected), suggesting that there may be a weak AB for other names on low load switch trials. A final interesting finding is that while previous work has shown that increases in T1 task demands can cause an AB for one's own name (Giesbrecht et al., 2009), the absence of an AB for one's own name on repeat high load trials is suggestive evidence that the expectancies generated by inter-trial repetitions of high load are sufficient to override the effect of load on the current trial.

## **DISCUSSION**

The purpose of the present work was to test the extent to which expectancies about task demands engendered by the trial history of T1 task load modulate post-perceptual information processing during the AB. Experiment 1 tested the magnitude of the N400 evoked by T2 words during the AB and demonstrated that when T1 task load was repeated from the previous trial, the N400 survived the AB. Importantly, when T1 task load switched from the previous trial, the N400 evoked during the AB was attenuated relative to outside the AB. Experiment 2 tested if inter-trial dependencies influenced the extent to which personal names survive the AB. The results revealed that on T1-repeat trials one's own name survived the AB, but other names did not. However, on T1-switch trials, an AB was present for both one's own name and someone else's name. This suggests that inter-trial switches of T1 load reduced the availability of highly salient information during the AB.

Previous studies have shown that manipulations of task demands within a trial can attenuate post-perceptual processing during the AB (Giesbrecht et al., 2007, 2009; Vachon et al., 2007; Vachon and Jolicoeur, 2011). The novel finding in both of the present experiments is that inter-trial dependencies of task demand, induced by repetitions and switches in T1-flanker congruency between trials, attenuated the availability of semantic information during the AB. This new finding contrasts with theoretical accounts of the AB that propose that information presented during the AB is processed to a post-perceptual level despite the impairment in report (e.g., Chun and Potter, 1995; Raymond et al., 1995; Olivers and Meeter, 2008). However, the present results are consistent with the growing literature demonstrating that the failure that gives rise to the AB can occur at either post-perceptual or perceptual (i.e., pre-semantic) stages of processing (e.g., Giesbrecht et al., 2007, 2009; Vachon et al., 2007; Vul et al., 2008; Elliott and Giesbrecht, 2010). Importantly, these more recent findings suggest that the level at which selective attention operates during the AB is flexibly determined by T1-task demands (e.g., Giesbrecht et al., 2007, 2009; Vachon et al., 2007; Elliott and Giesbrecht, 2010; Vachon and Jolicoeur, 2011).

The finding that post-perceptual processing during the AB is attenuated by inter-trial dependencies of task load parallels the finding in the visual search literature showing that post-perceptual processing of task-irrelevant information is also attenuated by inter-trial switches of task demands (e.g., Theeuwes et al., 2004). These results can be explained in the context of the conflict adaptation literature, which suggests that managing changes in conflict between consecutive trials is an effortful process that requires more top-down attentional control in order to resolve conflict either by an active reconfiguration of task set, or by an active inhibition of the previous task set, or both (e.g., Rogers and Monsell, 1995; Monsell, 2003; Rossi et al., 2009). However, it is important to distinguish switch costs in the traditional sense, defined by a change in stimulus-response rules, from the switch costs in the current experiments, where the participants performed the identical T1 task in all trials and only the perceptual difficulty changed between trials. Nevertheless, more recent work has demonstrated that perceptual switches involving changes in the number of simultaneously presented features, as in the present experiments, resulted in similar if not greater behavioral switch costs when compared with more typical task-switches (cf. Ullsperger et al., 2005; Ravizza and Carter, 2008).

The availability of post-perceptual information during the AB when T1-congruency was repeated and the reduction of post-perceptual information during the AB when T1-congruency was switched between trials can be explained with a flexible selection account of attention. Flexible selection models posit that the level of information processing at which attention selects relevant information is dependent on concurrent task demands (e.g., Yantis and Johnston, 1990; Lavie, 1995, 2005; Pashler, 1998; Lavie et al., 2004; Vogel et al., 2005). In the context of the AB, an overinvestment of attentional resources on T1 required by a highly demanding task, such as a switch in task or high T1 load within a trial, may reduce the resources available to process subsequent, rapidly presented items beyond a perceptual level (Giesbrecht et al., 2007, 2009; Vachon et al., 2007; Elliott and Giesbrecht, 2010; Vachon and Jolicoeur, 2011). Effectively, the increase in T1-task demands increases the subsequent selectivity of processing, as measured by post-perceptual processing of T2. Thus, the present results support the proposal that the level of processing during the AB is flexible and not always fixed at a post-perceptual level and, more broadly, demonstrate that the human attention system develops expectancies about task difficulty that modulate both the spatial and temporal selectivity of attention.

## **ACKNOWLEDGMENTS**

This research was generously supported by the Institute for Collaborative Biotechnologies through grant W911NF-09-0001 from the U.S. Army Research Office. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. The authors thank Megan Lewis and Noelle Baumann for their assistance with data collection for Experiment 2. We also thank the two reviewers for their constructive comments, including Reviewer 1 who suggested the control analysis reported in Experiment 1.

## **AUTHOR CONTRIBUTIONS**

Barry Giesbrecht designed and programmed the experiments. Jocelyn L. Sy, James C. Elliott, and Barry Giesbrecht collected the data. Jocelyn L. Sy and Barry Giesbrecht analyzed the data. Jocelyn L. Sy, James C. Elliott, and Barry Giesbrecht wrote the manuscript.

## **REFERENCES**


by targets and distractors during rapid serial visual presentation: does word meaning survive the attentional blink? *J. Exp. Psychol. Hum. Percept. Perform.* 23, 1014–1034. doi: 10.1037/0096-1523.23.4.1014


192, 489–497. doi: 10.1007/s00221-008-1642-z



(2004). Neural fate of ignored stimuli: dissociable effects of perceptual and working memory load. *Nat. Neurosci.* 7, 992–996. doi: 10.1038/nn1294

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 07 June 2013; accepted: 10 September 2013; published online: 08 October 2013.*

*Citation: Sy JL, Elliott JC and Giesbrecht B (2013) Post-perceptual processing during the attentional blink is modulated by inter-trial task expectancies. Front. Hum. Neurosci. 7:627. doi: 10.3389/fnhum.2013.00627*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Sy, Elliott and Giesbrecht. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Paired-pulse transcranial magnetic stimulation reveals probability-dependent changes in functional connectivity between right inferior frontal cortex and primary motor cortex during go/no-go performance

## *A. Dilene van Campen<sup>1,2</sup>\*, Franz-Xaver Neubert<sup>3</sup>, Wery P. M. van den Wildenberg<sup>1,2</sup>, K. Richard Ridderinkhof<sup>1,2</sup> and Rogier B. Mars<sup>3</sup>*

*<sup>1</sup> Department of Psychology, University of Amsterdam, Amsterdam, Netherlands*

*<sup>2</sup> Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, Netherlands*

*<sup>3</sup> Department of Experimental Psychology, University of Oxford, Oxford, UK*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Hanneke E. Den Ouden, Radboud University Nijmegen, Netherlands; New York University, USA*

*Peter Smittenaar, University College London, UK*

#### *\*Correspondence:*

*A. Dilene van Campen, Department of Psychology, University of Amsterdam, Nieuwe Prinsengracht 130, 1018 VZ Amsterdam, Netherlands e-mail: a.d.vancampen@gmail.com*

The functional role of the right inferior frontal cortex (rIFC) in mediating human behavior is the subject of ongoing debate. Activation of the rIFC has been associated both with response inhibition and with signaling action adaptation demands resulting from unpredicted events. The goal of this study is to investigate the role of rIFC by combining a go/no-go paradigm with paired-pulse transcranial magnetic stimulation (ppTMS) over rIFC and the primary motor cortex (M1) to probe the functional connectivity between these brain areas. Participants performed a go/no-go task with 20% or 80% of the trials requiring response inhibition (no-go trials) in a classic and a reversed version of the task, respectively. Responses were slower to infrequent compared to frequent go trials, while commission errors were more prevalent on infrequent compared to frequent no-go trials. We hypothesized that if rIFC is involved primarily in response inhibition, then rIFC should exert an inhibitory influence over M1 on no-go (inhibition) trials regardless of no-go probability. If, by contrast, rIFC has a role on unexpected trials other than just response inhibition, then rIFC should influence M1 on infrequent trials regardless of response demands. We observed that rIFC suppressed M1 excitability during frequent no-go trials, but not during infrequent no-go trials, suggesting that the role of rIFC in response inhibition is context dependent rather than generic. Importantly, rIFC was found to facilitate M1 excitability on all low-frequency trials, irrespective of whether the infrequent event involved response inhibition, a finding more in line with a predictive coding framework of cognitive control.

**Keywords: rIFC, go/no-go task, paired-pulse, TMS, motor cortex, prediction, inhibition**

## **INTRODUCTION**

A changing environment requires us to constantly adapt our behavior in response to new and surprising events. Suppressing unwanted actions or switching between response alternatives are important mechanisms to adapt our behavior. From a psychological perspective, the set of mechanisms responsible for this flexible behavior is frequently grouped together under the umbrella term "cognitive control" (Miller, 2000; Ridderinkhof et al., 2004; Rushworth et al., 2004; Verguts and Notebaert, 2009). Response inhibition, i.e., the suppression of response activation of the upcoming action, is traditionally seen as one hallmark of cognitive control (Logan et al., 1984; Verbruggen and Logan, 2008; van den Wildenberg et al., 2010a). More recently, it has been argued that our brain implements control by employing a predictive strategy through which it extracts statistical regularities in the environment and uses this information to optimize response strategies (Friston, 2005; Clark, 2013). Evidence for predictive modulation of brain activity consistent with this model has been reported in parietal and frontal cortex (Huettel et al., 2005), but also in the primary motor cortex (Bestmann et al., 2008). In this framework, an unpredicted stimulus results in a prediction error, typically signaling adaptation of the predicted or planned motor response ("action reprogramming," Mars et al., 2007b) and the updating of the internal representation of the environment (Mars et al., 2008; den Ouden et al., 2012).

A large body of literature has consistently identified a network of brain regions involved in cognitive control processes (Garavan et al., 1999; Ridderinkhof et al., 2004; Aron and Poldrack, 2006; Mars et al., 2007b, 2011). The right inferior frontal cortex (rIFC) in particular has been identified as a critical node within this network (for review see Ridderinkhof et al., 2011). Early studies suggested that this area is critical for the inhibition of inappropriate responses (Garavan et al., 1999; Aron et al., 2003; Rubia et al., 2003), thus specifying its role in action reprogramming and response inhibition. Consistent with this view, a series of studies recently showed that rIFC exerts an inhibitory influence over the primary motor cortex (M1) when actions need to be reprogrammed in the context of environmental information (Buch et al., 2010; Neubert et al., 2010). However, others have suggested a broader role for rIFC in action control. For instance, Verbruggen et al. (2010) suggested that different subparts of rIFC have distinct roles in detecting changes in the environment and implementing the most appropriate action. Vossel et al. (2011) showed that rIFC activity in response to unpredicted stimuli depends on the previous trial history, an interpretation consistent with a predictive coding framework of cognitive control such as discussed above.

"fnhum-07-00736" — 2013/11/11 — 20:50 — page 1 — #1

Most current studies focusing on the role of rIFC in cognitive control present participants with an environment in which one stimulus-response combination is most frequent and occasional unexpected events require the dominant response to be overridden or replaced by an alternative action. These studies thus confound the requirement to inhibit a response and the surprise inherent in the unexpected stimulus. Thus, it cannot be fully established whether rIFC is involved primarily in response inhibition or more generally in the processing of unpredicted events. The goal of the current experiment is, therefore, to examine the influence of rIFC over the motor cortex in a context in which the role of response inhibition can be disentangled from a role in the processing of unpredicted events generally.

We employ a modified version of the classical "go/no-go" task. In this task, one type of stimulus is presented frequently while another type is presented infrequently. In the standard version of this task, participants are required to respond to the frequently presented stimulus and to withhold their action to the infrequently presented stimulus. In a modified version of the task we reversed the probabilities, such that the no-go stimuli are frequent and the go stimuli are unexpected. Thus, in this context, the unpredicted stimulus signals a need to override the pre-potent tendency to refrain from action (cf. Nieuwenhuis et al., 2003) and does not require response inhibition. Importantly, this infrequent go stimulus is similar to the infrequent trials in the standard version of the task in terms of surprise. This set-up allows us to disentangle the role of rIFC in response inhibition from a role in processing unexpected events in general.

To probe the influence of rIFC on the motor cortex, the functional connectivity between rIFC and M1 is assessed using paired-pulse transcranial magnetic stimulation (ppTMS). In this procedure, a single "test" transcranial magnetic stimulation (TMS) pulse is delivered over the hand representation of M1 to elicit a motor-evoked potential (MEP) in the electromyogram (EMG) recorded from the effector muscle. On half of the trials, a "conditioning" TMS pulse over rIFC precedes the test pulse over M1. By calculating the ratio of the MEP amplitude recorded on paired-pulse (pp) and single-pulse trials, the impact of rIFC on the motor cortex can be assessed (Mars et al., 2009; Buch et al., 2010; Neubert et al., 2010; Buch et al., 2011; Catmur et al., 2011). Recent ppTMS studies showed that rIFC exerts an inhibitory influence on M1 during action reprogramming (Buch et al., 2010; Neubert et al., 2010). This inhibitory effect was found when participants had to switch between response alternatives. However, during normal action selection, a facilitatory effect of rIFC on M1 was reported.

The main aim of this study was thus to use physiological markers of the effects of rIFC on M1, as assessed by ppTMS, to disentangle the role of rIFC in response inhibition and signaling unpredicted actions in a go/no-go task. Manipulation of the probability of the no-go trials was used to differentiate inhibitory demands *per se* from the action adaptation demands resulting from unpredicted action signals. If rIFC is involved in response inhibition *per se*, we expected an influence of rIFC on M1 on no-go trials only. Alternatively, in case of a more generic override-related activation of rIFC in response to unpredicted action signals in general, similar patterns are expected for infrequent no-go and infrequent go stimuli in both experiments.

## **MATERIALS AND METHODS**

### **PARTICIPANTS**

Eleven participants (age range 20–32 years, mean 26.8 years, SD 3.7, six women) performed experiment A (the *frequent go* experiment) and nine participants (age range 20–32 years, mean 26.8 years, SD 3.9, five women) performed experiment B (the *frequent no-go* experiment). Initially, twelve participants were recruited for each experiment. One participant was excluded from the analyses of the *frequent go* experiment due to low trial numbers. One participant was excluded from the analyses of the *frequent no-go* experiment, also due to low trial numbers. Another two participants dropped out after the *frequent go* experiment because the *frequent no-go* experiment could not be completed due to time restrictions. This resulted in a total of 11 participants in the *frequent go* experiment (A) and 9 participants in the *frequent no-go* experiment (B), 8 of whom participated in both experiments. All participants had normal or corrected-to-normal vision. All participants gave informed consent and were screened for familial epilepsy or other neurological disorders. A safety questionnaire was used to assess potential risk factors of TMS. The Mid and South Buckinghamshire Research Ethics Committee approved the experimental procedures. At least 1 week before participating, participants were invited to a "taster" session, in which the whole procedure of the experiment was explained to them and they were given the opportunity to experience a few pulses of TMS, such that they could make an informed decision about their participation in the actual experiment. Participants taking part in both experiments attended the taster session only before their first participation.

## **EXPERIMENTAL SET UP**

All participants were seated approximately 85 cm in front of a computer screen and responded with the right index finger on the space bar (see **Figure 1**). A chin support system was used to prevent movement of the head during the experimental blocks. Participants wore earplugs to protect against the noise of TMS and an EEG cap on which the locations of the TMS stimulation sites were marked.

## **DESIGN AND PROCEDURE**

Stimuli were presented in white on a black background on a computer screen. A fixation cross was presented in the middle of the screen at the start of each trial for 500–750 ms (uniform distribution) and disappeared at stimulus onset (SOA). A letter T (presented in regular orientation or upside-down) was used to represent go or no-go signals, respectively. This mapping was counterbalanced over participants. The stimuli disappeared after 60 ms.

**FIGURE 1 | (A)** Participants were seated in front of the computer display and responded by pressing the space bar with the right index finger. The test coil was placed over the left M1 and the conditioning coil was placed over rIFC. EMG was recorded continuously from the right hand FDI muscle. **(B)** TMS intervals. A fixation cross appeared on the screen. The TMS pulse over M1 was applied at one of three SOAs, either 75, 125, or 175 ms. On half the TMS trials, this pulse was preceded by a pulse over rIFC 8 ms earlier.

When participants took part in both experiment A and B, the two experiments were conducted in two separate sessions on different days, separated by at least 1 week, and the stimulus-response mapping was held constant across the two sessions for that participant. In experiment A, the "*frequent go*" experiment, participants were instructed to withhold their response on 20% of the trials (no-go trials) and respond as quickly as possible with the right hand by pressing the spacebar on 80% of the trials (go trials). In experiment B, the "*frequent no-go*" experiment, participants were instructed to withhold their response on 80% of the trials (no-go trials) and respond as quickly as possible on 20% of the trials (go trials).

Each experiment consisted of one behavioral practice block, a TMS practice block, and eight experimental blocks of 120 trials each. The behavioral practice block and the TMS practice block each consisted of 60 trials. In each experimental block, TMS was delivered on 24 trials distributed over go and no-go trials: 12 single-pulse and 12 pp trials. TMS trials were mixed in a quasi-random fashion with no-TMS trials (either 4, 5, or 6 successive no-TMS trials in between TMS trials) to prevent expectancy of the TMS pulse. TMS pulses were delivered at one of three time intervals after SOA, namely 75, 125, or 175 ms. Overall, for each SOA there were 32 single-pulse trials and 32 pp trials, resulting in a total of 192 TMS trials distributed over go and no-go trials per experiment.
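The trial counts in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the numbers given above; the variable names are illustrative and are not taken from the original experiment code.

```python
# Trial bookkeeping for one experiment, using only numbers stated in the text.
N_BLOCKS = 8
TRIALS_PER_BLOCK = 120
TMS_PER_BLOCK = 24            # 12 single-pulse + 12 paired-pulse trials
SOAS_MS = (75, 125, 175)

total_trials = N_BLOCKS * TRIALS_PER_BLOCK           # all experimental trials
total_tms = N_BLOCKS * TMS_PER_BLOCK                 # all TMS trials
per_pulse_type = total_tms // 2                      # single-pulse or paired-pulse
per_soa_and_pulse = per_pulse_type // len(SOAS_MS)   # trials per SOA and pulse type
tms_fraction = total_tms / total_trials              # share of trials with TMS

print(total_tms, per_soa_and_pulse, tms_fraction)
```

Under these numbers, TMS was delivered on one fifth of the experimental trials, and each SOA received equal numbers of single-pulse and paired-pulse trials.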

#### **TRANSCRANIAL MAGNETIC STIMULATION**

Two figure-of-eight coils connected to MagStim 200 monopulse machines (The Magstim Company, Whitland, UK) were used to deliver TMS pulses. The TMS coil delivering the test pulse over left M1 was placed tangentially on the skull with the handle pointing backwards at an angle of approximately 45° (cf. O'Shea et al., 2007). The second coil delivering the conditioning pulse was placed initially over rIFC with the handle pointing forward. In a few participants the orientation of the coil was adjusted in a slightly counter-clockwise direction because the stimulation resulted in uncomfortable muscle contractions. The location of the coil over left M1 was determined as the location in which the largest MEP amplitude for any given stimulation intensity was elicited in the right first dorsal interosseous (FDI) muscle controlling the right index finger. The location of the conditioning coil was based on averaged MNI coordinates (*x* = 53, *y* = 15, *z* = 18), overlapping with previous work (Forstmann et al., 2008; Neubert et al., 2010; Verbruggen et al., 2010), transformed back into subject space for each individual, and the coil was placed using neuronavigation (MRI-aligned frameless stereotaxic neuronavigation, Brainsight, Rogue Research, Inc).

Resting motor threshold (RMT) was assessed in both the left and right FDI and was defined as the lowest intensity (expressed as % of maximum TMS stimulator output) at which an MEP with an amplitude of >50 μV was present in at least three out of five trials. The conditioning pulse intensity in the experiment was set at 110% of RMT of the right M1 elicited in the left FDI (cf. Mars et al., 2009; Buch et al., 2010). The intensity of the test pulse over left M1 was set at the value that yielded an average MEP of 1.0 mV recorded from the right FDI at rest. The inter-pulse interval for ppTMS was set at 8 ms. Mean RMT stimulator output over the right hemisphere was 35% for both experiments, established at the start of each experiment (range *frequent go* experiment: 28–44%; *frequent no-go* experiment: 30–40%). Mean left hemisphere stimulation intensity (1.0 mV) was 42% for both experiments (range *frequent go* experiment: 32–54%; *frequent no-go* experiment: 37–53%).
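The threshold criterion described above can be written out as a short sketch. The function name, the data layout (a mapping from stimulator intensity to the MEP amplitudes of the trials at that intensity), and the example values are hypothetical; the sketch assumes five trials were collected per tested intensity and says nothing about how intensities were stepped during the actual session.

```python
def resting_motor_threshold(meps_by_intensity, criterion_uv=50.0, min_hits=3):
    """Return the lowest intensity (% of maximum stimulator output) at which
    at least `min_hits` MEPs exceed `criterion_uv`, or None if none qualifies."""
    for intensity in sorted(meps_by_intensity):
        amplitudes = meps_by_intensity[intensity]
        hits = sum(1 for amplitude in amplitudes if amplitude > criterion_uv)
        if hits >= min_hits:
            return intensity
    return None


# Hypothetical MEP amplitudes (in microvolts), five trials per intensity.
example = {
    32: [10.0, 20.0, 55.0, 15.0, 30.0],    # 1 of 5 above 50 uV
    34: [60.0, 40.0, 52.0, 20.0, 45.0],    # 2 of 5
    36: [70.0, 65.0, 30.0, 80.0, 20.0],    # 3 of 5, criterion met
    38: [90.0, 85.0, 120.0, 60.0, 75.0],   # 5 of 5
}

rmt = resting_motor_threshold(example)
conditioning_intensity = 1.10 * rmt  # 110% of RMT, as used for the conditioning pulse
print(rmt, conditioning_intensity)
```

In this illustrative data set the criterion is first met at 36% of maximum output, which would give a conditioning intensity of roughly 39.6%.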

## **ELECTROMYOGRAPHIC RECORDINGS**

Two Ag-AgCl electrodes were placed over the muscle belly of the FDI and the related muscle tendon on both the right and left hand. The left-hand electrodes were only used for determining the RMT and were removed before the start of the experimental blocks. An earth electrode was attached to the bony structure of the right elbow (olecranon ulnae). EMG data were sampled at 5000 Hz using a Cambridge Electronic Design (CED) 1902 amplifier and a CED Micro1401 Mk II A/D converter. Bandpass filtering of 10–1000 Hz and additional 50 Hz notch filtering were performed using Spike2 computer software (CED, Cambridge, UK). MEP amplitude was defined as the peak-to-peak amplitude within a window of 10–40 ms after the test pulse over M1.
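As an illustration of the amplitude definition above, a peak-to-peak measure within the 10–40 ms window could be computed as follows. This is a minimal sketch under stated assumptions, not the Spike2 procedure used in the study: the function name, the representation of a trial as a plain list of EMG samples, and the synthetic trace are all hypothetical, and the trace is assumed to be already filtered.

```python
def mep_peak_to_peak(samples_uv, pulse_index, fs_hz=5000, window_ms=(10, 40)):
    """Peak-to-peak EMG amplitude in a window after the test pulse.

    samples_uv  : EMG samples of one trial (microvolts)
    pulse_index : sample index at which the test pulse over M1 was delivered
    fs_hz       : sampling rate (5000 Hz in the text)
    window_ms   : start and end of the MEP window relative to the pulse
    """
    start = pulse_index + int(round(window_ms[0] * fs_hz / 1000))
    stop = pulse_index + int(round(window_ms[1] * fs_hz / 1000))
    if stop >= len(samples_uv):
        raise ValueError("EMG trace is too short for the requested window")
    window = samples_uv[start:stop + 1]
    return max(window) - min(window)


# Synthetic trace: flat baseline, a pulse artifact just after the pulse
# (outside the window) and a biphasic deflection inside the window.
trace = [0.0] * 400
pulse_at = 100
trace[105] = 1000.0   # 1 ms after the pulse, ignored by the window
trace[200] = 400.0    # 20 ms after the pulse
trace[210] = -300.0   # 22 ms after the pulse

amplitude = mep_peak_to_peak(trace, pulse_at)
print(amplitude)
```

Starting the window 10 ms after the pulse keeps the stimulation artifact out of the measurement in this example.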

## **ANALYSES MEP DATA**

Analyses of the MEP data followed standard procedures (Mars et al., 2007a, 2009). Trials were excluded if (a) the participant responded incorrectly (commission or omission errors), (b) the participant responded prematurely (RT < 50 ms), (c) no reliable MEP was elicited (MEP < 200 μV), (d) the MEP was elicited during or after the voluntary response, artificially inflating the MEP amplitude, or (e) there was a strong precontraction of the response muscle, again artificially inflating the MEP amplitude. Exclusion criteria (d) and (e) were based on trial-by-trial inspection and on a cut-off of 10 root mean square EMG calculated during the pre-stimulus baseline (100 ms prior to the stimulus). To account for differences in overall MEP amplitude between participants, all MEP amplitudes were transformed into *z*-scores using all MEPs of that participant, over all conditions, that were retained after preprocessing with the criteria outlined above (Burle et al., 2002; van Campen et al., in press). To quantify the paired-pulse effect (PPE) of rIFC on M1, the relation between single-pulse and pp MEP amplitudes was indexed for each individual and SOA by a *z*-score per condition: PPE = (pp MEP<sub>mean</sub> − sp MEP<sub>mean</sub>) / SD<sub>pp + sp</sub>, where pp is paired-pulse TMS and sp is single-pulse TMS (Buch et al., 2010). In this way a direct comparison between single-pulse and pp TMS is given.
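The normalization and the PPE can be sketched in a few lines. The sketch rests on one loudly stated assumption: the denominator "SD pp + sp" is read here as the sample standard deviation over the combined paired-pulse and single-pulse trials of a condition, which the text does not spell out. Function names and example amplitudes are hypothetical.

```python
from statistics import mean, stdev


def zscore_participant(meps_by_condition):
    """z-score all retained MEPs of one participant using the mean and SD
    computed over all conditions together."""
    pooled = [a for trials in meps_by_condition.values() for a in trials]
    mu, sd = mean(pooled), stdev(pooled)
    return {cond: [(a - mu) / sd for a in trials]
            for cond, trials in meps_by_condition.items()}


def paired_pulse_effect(pp, sp):
    """PPE = (mean pp MEP - mean sp MEP) / SD of the combined pp and sp trials.
    ASSUMPTION: the SD is taken over the pooled trials of both pulse types."""
    return (mean(pp) - mean(sp)) / stdev(list(pp) + list(sp))


# Hypothetical MEP amplitudes (mV) for one participant, one trial type, one SOA.
raw = {
    "sp": [1.0, 1.2, 0.8, 1.0],
    "pp": [1.4, 1.6, 1.2, 1.4],
}
z = zscore_participant(raw)
ppe = paired_pulse_effect(z["pp"], z["sp"])
print(ppe)  # positive: the conditioning pulse increased the MEP ("facilitation")
```

Under this reading, a positive value indicates larger MEPs after the conditioning pulse, a negative value indicates smaller MEPs, and the measure is unaffected by a positive rescaling of the amplitudes within a participant.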

## **NORMALITY OF DATA**

RT data were roughly normally distributed. In contrast, accuracy levels (i.e., error percentages) were not normally distributed. Therefore, analyses of variance (ANOVAs) on error percentages were performed on square-root transformed percentages. Shapiro–Wilk tests for normality indicated that 3 out of 24 MEP samples (Experiment A: single-pulse TMS no-go trials at 75 ms; Experiment B: single-pulse and ppTMS go trials at 175 ms) did not comply with the assumption of normality over participants. Because (1) the majority of the MEP samples were roughly normally distributed over participants, and (2) the ANOVA procedure is quite robust against moderate violations of the normality assumption (Schmider et al., 2010), we analyzed single-pulse and pp MEP amplitudes with an overall omnibus mixed ANOVA. PPE data were normally distributed over participants, according to both Kolmogorov–Smirnov (*p* > 0.166) and Shapiro–Wilk (*p* > 0.183) tests of normality.

## **ANALYSES**

Mean RT and accuracy data were submitted to ANOVA with the between-subjects variable *Experiment* (*frequent go* vs. *frequent no-go*). An ANOVA with the within-subject factors *Trial-type* (*go* vs. *no-go trial*) and *SOA* (75, 125, and 175 ms) and the between-subjects factor *Experiment* (*frequent go* vs. *frequent no-go*) was used to analyze the single-pulse MEP data. An ANOVA with the within-subject factors *Pulse* (*single-pulse* vs. *paired-pulse*), *Trial-type* (*go* vs. *no-go trial*), and *SOA* (75, 125, and 175 ms) and the between-subjects factor *Experiment* (*frequent go* vs. *frequent no-go*) was used to analyze all MEP data. When the sphericity assumption was violated, degrees of freedom were corrected using the Greenhouse–Geisser (GG) method. Uncorrected degrees of freedom are reported for ease of reading.

It was not known at which time point in the response interval rIFC exerted its influence over the motor cortex. Therefore, we probed this interaction at the different SOAs. However, this also made our design unnecessarily conservative. Therefore, when looking at the specific effects on infrequent trials, which are the focus of the current manuscript, we followed a hierarchical procedure to investigate these effects. First, we tested the PPE on infrequent trials for each SOA in each experiment against zero. Based on these analyses we identified the 125 ms SOA in the *frequent go* experiment and the 175 ms SOA in the *frequent no-go* experiment as the moments at which rIFC exerted its influence on M1. We then tested the differential effects of trial-type in the two experiments at only these SOAs in a single ANOVA. This procedure, though, does mean that our data await replication in a separate experiment in which the SOAs are specified in the hypothesis.
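The first step of this hierarchical procedure, testing the PPE against zero at each SOA and then picking the SOA with the strongest effect, might be sketched as below. The PPE values are invented for illustration and are not the study's data; the sketch computes only the *t* statistic and its degrees of freedom, because a *p*-value would need a *t* distribution that the Python standard library does not provide. Selecting the SOA with the largest *t* is one reading of "maximum as indicated by the *t*-test".

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """One-sample t statistic against mu0; returns (t, degrees of freedom)."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical PPE values of five participants on infrequent trials, per SOA (ms).
ppe_by_soa = {
    75: [0.10, -0.20, 0.05, -0.10, 0.15],
    125: [0.40, 0.30, 0.50, 0.20, 0.35],
    175: [0.10, 0.30, -0.10, 0.20, 0.00],
}

t_by_soa = {soa: one_sample_t(values) for soa, values in ppe_by_soa.items()}

# Step 2 of the procedure would then restrict the ANOVA to this SOA.
best_soa = max(t_by_soa, key=lambda soa: t_by_soa[soa][0])
print(best_soa, t_by_soa[best_soa])
```

Because the SOA is chosen from the same data that are then analyzed, this selection step is the reason the text asks for replication with the SOAs fixed in advance.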

## **RESULTS**

Behavioral data are presented first, followed by the single-pulse MEP data, overall physiological data, and PPE analyses.

## **BEHAVIORAL DATA**

An ANOVA with the between-subjects factor *Experiment* (*frequent go* vs. *frequent no-go*) was used to analyze the RT and accuracy data. RTs on go trials were considerably faster in the *frequent go* experiment as compared to the *frequent no-go* experiment [333 vs. 431 ms, main effect *Experiment*, *F*(1,18) = 20.586, *p* < 0.001]. Fewer commission errors were made on frequent compared to infrequent no-go trials [0.3 vs. 20.8%, main effect *Experiment, F*(1,18) = 45.251, *p* < 0.001]. Commission error responses were considerably faster on infrequent compared to frequent no-go trials [304 vs. 471 ms, main effect *Experiment*, *F*(1,16) = 17.175, *p* = 0.001]. For go trials, omission error incidence was comparable across experiments [0.4 vs. 1.1%, *Experiment*, *F*(1,18) = 0.533, *p* = 0.475]. These behavioral patterns were similar to a previous study using a similar probability manipulation (Nieuwenhuis et al., 2003).

## **SINGLE-PULSE TMS**

An ANOVA with the within-subject factors *Trial-type* (*go trials* vs. *no-go trials*) and *SOA* (75, 125, or 175 ms) and the between-subjects factor *Experiment* (*frequent go* vs. *frequent no-go*) was used to analyze the *z*-scored single-pulse MEP data. As expected, MEP amplitudes were higher on go compared to no-go trials [0.042 vs. −0.101, main effect of *Trial-type*, *F*(1,18) = 6.111, *p* = 0.024]. Larger MEP amplitudes were found in the *frequent no-go* experiment [−0.088 vs. 0.029, main effect *Experiment*, *F*(1,18) = 7.984, *p* = 0.011]. In addition, the difference in MEP amplitudes between go and no-go trials was modulated by the experimental context [interaction effect, *Trial-type* × *Experiment*, *F*(1,18) = 39.867, *p* < 0.001].

Motor-evoked potential amplitudes differed depending on the time point of stimulation [−0.200, −0.010, and 0.122, main effect of *SOA*, *F*(2,36) = 7.933, *p* = 0.001]. As can be seen in **Figure 2A**, in the *frequent go* experiment, where responding to the stimulus was the predominant response and participants reacted fastest (see above, Behavioral data), the amplitude of the MEP increased over time as the stimulation occurred closer to the response. On the no-go trials, where participants were not required to make a response, this monotonic increase was not observed. This pattern of go and no-go modulation of MEP amplitude was not seen in the *frequent no-go* experiment, presumably due to the much longer response times. These effects are reflected in the significant interactions between *SOA* and *Experiment* [*F*(2,36) = 7.650, *p* = 0.002], between *SOA* and *Trial-type* [*F*(2,36) = 3.677, *p* = 0.035], and between *Trial-type*, *SOA*, and *Experiment* [*F*(2,36) = 16.963, *p* < 0.001].

In sum, the single-pulse MEP amplitudes during *frequent go* trials show the pattern normally observed during response trials and this pattern was modulated by the task manipulation with frequent no-go signals. These results suggest that our experimental manipulation was successful and hence we now turn to comparing the effects of rIFC stimulation on M1.

#### **PHYSIOLOGICAL EFFECTS OF ppTMS OVER rIFC ON M1**

An ANOVA with the within-subject factors *Pulse* (*single-pulse TMS* vs. *ppTMS*), *Trial-type* (*go* vs. *no-go trials*), and *SOA* (*75*, *125*, or *175 ms*) and the between-subjects factor *Experiment* (*frequent go* vs. *frequent no-go*) was used to analyze the *z*-scored MEP data (**Figures 2A,B**). As was observed for the single-pulse data, MEP amplitudes on go trials were larger than on no-go trials [0.067 vs. −0.113, main effect of *Trial-type*, *F*(1,18) = 16.103, *p* = 0.001], but this general pattern was different between the two experiments [*Trial-type* × *Experiment* interaction, *F*(1,18) = 40.469, *p* < 0.001]. There was no main effect of pulse [*Pulse*, *F*(1,18) = 0.144, *p* = 0.709], but there were specific effects of *Pulse* between the two experiments [*Pulse* × *Experiment* interaction, *F*(1,18) = 5.200, *p* = 0.035]. In the *frequent go* experiment, MEP amplitudes were lower following single-pulse TMS than ppTMS, whereas in the *frequent no-go* experiment this pattern was reversed. This was also reflected in the different trial-types [*Trial-type* × *Pulse* × *Experiment* interaction, *F*(1,18) = 4.446, *p* = 0.049]. Thus, the presence of a pulse over rIFC affected the amplitude of the MEP elicited by the test coil in the two experiments differently.

Investigating the SOA-specific effects, we again observed that the MEP amplitudes changed over time [−0.197, −0.019, and 0.146, main effect of *SOA*, *F*(2,36) = 13.182, *p* < 0.001]. The pattern of MEP amplitudes over time was different between the two experiments [interaction effect, *SOA* × *Experiment*, *F*(2,36) = 9.666, *p* < 0.001] and differed between go and no-go trials [interaction effect, *SOA* × *Trial-type*, *F*(2,36) = 9.424, *p* = 0.002, GG-corrected: χ<sup>2</sup> = 9.757, ε = 0.696]. Also, the pattern over time of MEP amplitudes on go and no-go trials was different between the two experiments [interaction effect, *Trial-type* × *SOA* × *Experiment*, *F*(2,36) = 15.459, *p* < 0.001].

In summary, the effects of *Pulse* indicate that preceding the test pulse over M1 by a conditioning pulse over rIFC modulated the amplitude of the MEP. This effect is specific to *Trial-type* and *SOA*. Importantly, the reported effect differed between the two experiments. In the next section, we investigate these differences more closely using planned *t*-tests of the effects of the rIFC on M1 at each SOA and trial-type and within each experiment.

#### **PAIRED-PULSE EFFECTS: ALL SOAs**

In order to get a clearer picture of the effects of the rIFC pulse on the excitability of the motor cortex, we calculated the "PPE" for each time point and each condition (see Materials and Methods).

**FIGURE 2 |** *Z***-scored MEP amplitudes of single-pulse (sp) and paired-pulse (pp) MEPs for go and no-go trials at three SOAs for (A)** *frequent go* **experiment and (B)** *frequent no-go* **experiment.** Black lines represent single-pulse MEP amplitudes and gray lines represent ppTMS MEP amplitudes. Solid lines are go trials and dotted lines are no-go trials.

The PPE has been established as a standard measure of causal influence of one cortical area over another (e.g., Civardi et al., 2001; Koch et al., 2006; O'Shea et al., 2007; Mars et al., 2009; Buch et al., 2010). If there is no effect of the rIFC pulse on the excitability of the motor cortex, i.e., on the MEP amplitude, PPE is zero. A positive PPE means that a pulse over rIFC increased the excitability of the motor cortex in that condition, termed "facilitation," and a negative PPE indicates a decrease in M1 excitability, termed "inhibition." These effects are displayed in **Figures 3A,B**.

One-sample *t*-tests of the PPE against zero (the null hypothesis of no effect of rIFC stimulation) showed that in the *frequent go* experiment the only influence of rIFC on the motor cortex was a facilitatory effect on the infrequent no-go trials at the specific time interval of 125 ms after stimulus presentation [0.353, *t*(10) = 2.298, *p* = 0.044, all other effects *p* > 0.30]. Interestingly, such a facilitatory effect was also present on the infrequent go trials in the *frequent no-go* experiment, with rIFC stimulation enhancing motor cortex excitability at the later time point of 175 ms [0.319, *t*(8) = 2.560, *p* = 0.034]. Thus, there is a facilitatory effect of rIFC on M1 on infrequent trials, independent of their response (or inhibition) requirements. Inhibitory effects of rIFC on M1 were found in the *frequent no-go* experiment on no-go trials at 125 ms [−0.266, *t*(8) = −4.230, *p* = 0.003]. A direct contrast of the facilitatory effect on infrequent no-go trials in the first experiment (the *frequent go* experiment) and the inhibitory effect on frequent no-go trials in the second experiment revealed a significant difference between the two at SOA 125 ms [0.353 vs. −0.266, *t*(18) = 3.729, *p* = 0.002, corrected for unequal variance]. No other effects reached significance (all *p* > 0.1).

#### **PAIRED-PULSE EFFECTS: EFFECTS OF LOW PROBABILITY**

The goal of the experiment was to test whether the influence of IFC on M1 differed as a function of frequency between trial-types and experiments. Specifically, we were interested to see what would happen on low-frequency trials when they were traditional no-go trials or go trials. To ensure that we were able to detect any effects present, we probed the influence of IFC on M1 at three different SOAs. Thus far, we have reported effects while looking at all these SOAs. This analysis is, however, unnecessarily conservative, since it includes time points outside those at which we expect our effect. Therefore, we now present a more focused analysis, comparing the PPE on low-frequency trials between the two experiments at the SOA at which IFC influence is strongest. This SOA was defined for each experiment as the moment at which the PPE of the low-frequency trial-type was maximal, as indicated by the *t*-test of the PPE against zero. This was the 125 ms SOA for the *frequent go* experiment and the 175 ms SOA for the *frequent no-go* experiment.
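The SOA selection step can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the helper names and the toy values are hypothetical, and "maximal" is read here as the largest signed *t*-value of the PPE against zero.

```python
import math
import statistics


def one_sample_t(values, popmean=0.0):
    """Return (t, df) for a one-sample t-test of `values` against `popmean`."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    t = (mean - popmean) / (sd / math.sqrt(n))
    return t, n - 1


def soa_of_maximum_effect(ppe_by_soa):
    """Pick the SOA whose PPE has the largest (signed) t-value against zero."""
    t_by_soa = {soa: one_sample_t(ppe)[0] for soa, ppe in ppe_by_soa.items()}
    best_soa = max(t_by_soa, key=t_by_soa.get)
    return best_soa, t_by_soa


# Hypothetical per-participant PPEs (SOA in ms -> list of values); not real data.
toy_ppe = {
    75: [0.1, -0.2, 0.0, 0.1, -0.1],
    125: [0.4, 0.2, 0.5, 0.3, 0.1],
    175: [0.0, 0.2, -0.1, 0.1, 0.0],
}
best_soa, t_values = soa_of_maximum_effect(toy_ppe)
```

Only the *t* statistic and its degrees of freedom are computed here; obtaining a *p*-value would additionally require the *t* distribution, which is not part of the Python standard library.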

We performed an ANOVA on the PPE on both go and no-go trials at those time points of maximum influence of IFC on M1. This ANOVA, with the within-subject factor *Trial-type* (*go trials* vs. *no-go trials*) and the between-subjects factor *Experiment* (*frequent go* vs. *frequent no-go*) at the SOAs selected for infrequent stimuli (125 and 175 ms, respectively), revealed an interaction between *Trial-type* and *Experiment* [*Trial-type* × *Experiment* interaction, *F*(1,18) = 10.076, *p* = 0.005]. No main effect of *Experiment* or *Trial-type* was found [*Experiment*, *F*(1,18) = 1.958, *p* = 0.179; *Trial-type*, *F*(1,18) = 1.616, *p* = 0.220]. This analysis thus shows that IFC influences M1 on different trial-types in the two experiments.
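For a design with one two-level within-subject factor and one two-level between-subjects factor, the interaction *F* equals the squared pooled-variance *t* statistic comparing the within-subject difference scores (go minus no-go PPE) between the two groups. The sketch below illustrates this equivalence. It is not the authors' analysis code, and the function and argument names are hypothetical.

```python
import math
import statistics


def interaction_f(go_a, nogo_a, go_b, nogo_b):
    """Trial-type x Experiment interaction F for a 2 x 2 mixed design.

    Each argument is a list of per-participant PPEs; *_a and *_b are the two
    independent groups (experiments). Returns (F, df_error).
    """
    diff_a = [g - n for g, n in zip(go_a, nogo_a)]
    diff_b = [g - n for g, n in zip(go_b, nogo_b)]
    n_a, n_b = len(diff_a), len(diff_b)
    # Pooled variance of the within-subject difference scores.
    pooled_var = (
        (n_a - 1) * statistics.variance(diff_a)
        + (n_b - 1) * statistics.variance(diff_b)
    ) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.fmean(diff_a) - statistics.fmean(diff_b)) / se
    return t * t, n_a + n_b - 2


# Hypothetical PPEs for three participants per group; not real data.
f_value, df_error = interaction_f(
    go_a=[0.0, 0.1, -0.1], nogo_a=[0.3, 0.5, 0.1],
    go_b=[0.3, 0.4, 0.2], nogo_b=[-0.2, -0.3, -0.1],
)
```

With group sizes of 11 and 9, as implied by the degrees of freedom of the one-sample *t*-tests reported above, the error term has 18 degrees of freedom, which matches the reported *F*(1,18).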

In conclusion, slower RTs on go trials and fewer commission errors on no-go trials were found in the *frequent no-go experiment* replicating previous results (Nieuwenhuis et al., 2003). Overall, differences in MEP amplitudes between the experiments were found indicating a different pattern of activation. Most important, facilitatory PPEs were observed not only on infrequent no-go trials, but also on infrequent go trials. The time difference between these PPEs (maximal at 125 and 175 ms after stimulus presentation, respectively) likely reflects the corresponding difference in response speed between the two contexts. In case of frequent no-go trials, an inhibitory PPE was found that peaked around 125 ms after stimulus presentation.

#### **DISCUSSION**

The main goal of this study was to investigate the role of rIFC in modulating the excitability of the primary motor cortex during cognitive control. Two main hypotheses can be formulated based on the literature, namely that rIFC functions to inhibit incorrect response tendencies or that rIFC responds to surprising stimuli in a more general sense. We manipulated the probability of no-go trials to differentiate inhibitory demands (withholding an action) from the action adaptation demands resulting from unpredicted events. Behaviorally, we observed faster responses on frequent as compared to infrequent go trials, and fewer commission errors on frequent as compared to infrequent no-go trials. Physiologically, facilitatory functional connectivity between rIFC and M1 was observed not only on infrequent no-go trials, but, crucially, also on infrequent go trials. This finding implies that rIFC influences M1 when an unpredicted stimulus signals action adaptation demands in general, rather than just response inhibition *per se*.

The behavioral findings replicate previous work (Nieuwenhuis et al., 2003). These authors analyzed behavioral and event-related brain potential effects in a go/no-go task using frequent go and frequent no-go contexts. Physiologically, these authors first replicated typical findings, reporting a larger N2 event-related potential on infrequent no-go trials compared to frequent go trials, consistent with the then-dominant account that N2 amplitude reflects inhibitory demands. However, when go and no-go probabilities were reversed, they observed that the N2 disappeared for frequent no-go trials; instead, the N2 was now largest for the infrequent go trials. This pattern is more consistent with an alternative account of the N2 in terms of conflict, arising when the predicted action has to be overridden by a competing action option designated by an infrequent signal. Their results are similar to the facilitatory PPE observed on the low-frequency stimuli in the current experiments.

The physiological effects in the present study form an extension and in part an apparent departure from the results of previous PPE studies (Buch et al., 2010; Neubert et al., 2010). In the next section we will first contrast the current findings with previous studies using single-pulse TMS over M1 and secondly compare the current findings with other paradigms in which PPE is used as an index of functional connectivity. Finally, we will discuss the current findings within the existing framework of action control.

## **SINGLE-PULSE EFFECTS**

In this study, we exploited the use of TMS as a probe of corticospinal excitability. The amplitude of the MEP elicited by a single TMS pulse over the primary motor cortex can be taken as an index of the activity within cortico-spinal neurons (Civardi et al., 2001). This was clearly reflected in the development of the MEP on successive SOAs in the *frequent go* experiment. While MEP amplitudes were not significantly modulated during no-go trials, they increased during the stimulus-response interval on go trials. This is similar to effects commonly observed in the literature (Leocani et al., 2000; Yamanaka et al., 2002; Coxon et al., 2006; van den Wildenberg et al., 2010b; Fujiyama et al., 2011). A different pattern was observed in the *frequent no-go* experiment, which is partly explained by the longer response times in that experiment, resulting in the TMS pulses effectively occurring earlier in the response period. The pattern of MEP amplitudes during the go trials in the *frequent no-go* experiment might seem slightly surprising, showing a tendency to be somewhat lower at SOA 175 ms than at baseline (75 and 125 ms), but it should be noted that it is not uncommon to see inhibitory processes within the motor cortex at work during longer stimulus-response intervals (Hasbroucq et al., 1999; Duque and Ivry, 2009). Importantly, in the current experiment, the single-pulse MEP amplitude is simply a baseline that is compared to the MEP amplitude on ppTMS trials. It is the modulation of the MEP amplitude by the preceding pulse over rIFC that is the dependent variable in this experiment.

## **PAIRED-PULSE FUNCTIONAL CONNECTIVITY**

In the current study we found facilitatory effects of rIFC on M1 on low-frequency trials, independent of whether these trials required inhibition of a response. An inhibitory effect of rIFC on M1 was found only on frequent no-go trials. Previous work probing rIFC influence over M1 showed an inhibitory effect on switch trials around 175 ms and a facilitatory effect on stay trials (Neubert et al., 2010). This inhibition seemed to be preceded by a facilitatory effect of pre-SMA on M1 on action reprogramming trials (Mars et al., 2009; Neubert et al., 2010). One might expect action reprogramming trials and no-go trials to show similar dynamics, but this was not observed in the current experiment. Although the results of the present study make sense in the context of the literature on predictive coding and rIFC (e.g., Vossel et al., 2011), the results are not directly comparable with these previous ppTMS experiments. However, a number of factors might reconcile these apparently different effects.

First, it seems likely that multiple processes occur in the time interval between stimulus and response. Rather than rIFC sending a single signal to M1, it seems more likely that there is a two-way stream of communication, with information going both from rIFC to M1 and back. It is known that rIFC interacts with M1 via both cortical and subcortical pathways (Neubert et al., 2010), the different functions of which are yet to be established. The current study is also the first to probe rIFC/M1 interactions during cognitive control outside the context of explicit action reprogramming and could therefore have tapped into previously unidentified interactions between the two cortical loci. Importantly, one should also be cautious in relating *physiological* inhibition, as indicated by an inhibitory PPE, to *cognitive* inhibition, which is what is assumed to occur during response conflict resolution. Although the results from previous action reprogramming studies conveniently showed physiological inhibition where cognitive inhibition would be hypothesized, this inference warrants caution.

It should also be noted that the effects of ppTMS, although highly consistent and replicable between subjects and sessions (Neubert et al., 2010), are quite sensitive to even small changes in stimulation site and stimulation intensity (Civardi et al., 2001). Although we have kept the stimulation intensities in the current experiment to the same levels as in the previous experiments, the location of the rIFC coil might have been more ventral. In the study by Neubert et al. (2010), the coil ended up in a location with a *z*-coordinate in MNI space of 25–30, whereas in the current study we aimed at a *z*-coordinate of 18. It is now becoming more and more apparent that the rIFC consists of a number of subdivisions, including different partitions in the dorsal-ventral dimension. It is also becoming clear that these subareas have different functions in cognitive control (Verbruggen et al., 2010). Therefore, it is not unlikely that our quite ventral stimulation position targeted a different rIFC subdivision than the previous studies of Neubert et al. (2010) and Buch et al. (2010). This provides an interesting hypothesis for future studies.

## **THE ROLE OF rIFC IN THE LARGER NETWORK**

Cognitive control relies on a network of areas, prominently involving the rIFC, pre-supplementary motor area (pre-SMA), and basal ganglia (for reviews see Mostofsky and Simmonds, 2008; Mars et al., 2011; Ridderinkhof et al., 2011). It is therefore important to consider the present results in the context of this larger network, rather than presuming that rIFC functions alone to implement cognitive control. Work in primates (Isoda and Hikosaka, 2007) and humans (Nachev et al., 2007; Forstmann et al., 2008; Mars et al., 2009; Wylie et al., 2010) confirms an essential role not only for rIFC but also for pre-SMA and STN in conflict resolution. It has been shown that all nodes of this network are connected in the human brain (Aron et al., 2007) and that disruption of one node influences the activity and functional connectivity of the remaining nodes (Neubert et al., 2010).

The current results show a role of rIFC beyond response inhibition, and it is worth considering what this means for its role in the larger network. Previous models of the interaction between the different nodes in this cognitive control network suggest that medial frontal cortex, including pre-SMA, is active before lateral frontal cortex, including rIFC (Kouneiher et al., 2009; Neubert et al., 2010). The fact that rIFC is active on all types of surprising trials dovetails with similar observations of pre-SMA (Strange et al., 2005). These results invite an interpretation along the lines of a predictive coding framework for the whole network and provide some possible clues on the function of rIFC within this network.

## **INTERPRETATION LIMITATIONS**

The current findings of IFC influence on M1 in response to infrequent stimuli provide evidence in line with the predictive coding account (Friston, 2005; Clark, 2013). However, some caution is warranted.

First, the exact timing of functional connectivity is difficult to predict. Therefore, we probed three well-documented time points (75, 125, and 175 ms), adding an additional factor to the design and making it necessarily less powerful. Therefore, in the final part of the analyses, we used a series of *t*-tests to determine at which SOA the influence of IFC over M1 was largest on the infrequent trials. This turned out to be the 125 ms SOA in the *frequent go* experiment and the 175 ms SOA in the *frequent no-go* experiment. We then performed a *Trial-type* × *Experiment* ANOVA on the PPE of these time points, showing clearly that the effect of rIFC on M1 occurs on different trial-types in the two experiments. In this way, we limit our interpretation to the trial-type and direction of the effect. We acknowledge that by analyzing the data in this fashion, we strike a delicate balance between the need to probe a range of time points so as not to miss our effect and the need to avoid a severely underpowered design. The current results should thus be seen as preliminary. Future experiments should, for instance, perform these tests on different datasets, one focusing on identifying the SOAs that should be probed and a separate dataset to investigate the effect of frequency.

Second, we probed the functional connectivity between rIFC and M1 using an inter-pulse interval of 8 ms, which is similar to the timing of previous experiments (Buch et al., 2010; Catmur et al., 2011) and is thought to afford the involvement of direct cortical pathways (Neubert et al., 2010). Additional experiments employing longer inter-pulse intervals might reveal additional effects relying on sub-cortical pathways as well.

The current study thus invites a number of follow-up experiments to investigate top-down control over the motor cortex within a predictive coding framework, replicating the present effects in a large cohort and investigating the neural pathways mediating these effects. In general, we believe ppTMS has proven itself a suitable tool in this endeavor.

## **ACKNOWLEDGEMENT**

This work was supported by an Open Competition grant (K. Richard Ridderinkhof and Wery P. M. van den Wildenberg) from the Netherlands Organization for Scientific Research (NWO).

## **REFERENCES**

suppression of impulsive behavior in Parkinson's Disease. *Brain* 133, 3611–3624. doi: 10.1093/brain/awq239

Yamanaka, K., Kimura, T., Miyazaki, M., Kawashima, N., Nozaki, D., Nakazawa, K., et al. (2002). Human cortical activities during Go/NoGo tasks with opposite motor control paradigms. *Exp. Brain Res.* 142, 301–307. doi: 10.1007/s00221-001-0943-2

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 25 June 2013; accepted: 15 October 2013; published online: 12 November 2013.*

*Citation: van Campen AD, Neubert F-X, van den Wildenberg WPM, Ridderinkhof KR and Mars RB (2013) Paired-pulse transcranial magnetic stimulation reveals probability-dependent changes in functional connectivity between right inferior frontal cortex and primary motor cortex during go/no-go performance. Front. Hum. Neurosci. 7:736. doi: 10.3389/fnhum.2013.00736*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 van Campen, Neubert, van den Wildenberg, Ridderinkhof and Mars. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## The role of auditory transient and deviance processing in distraction of task performance: a combined behavioral and event-related brain potential study

## *Stefan Berti\**

*Department for Psychology, Johannes Gutenberg-University Mainz, Mainz, Germany*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Fabrice Parmentier, University of the Balearic Islands, Spain Sabine Grimm, University Barcelona, Spain*

#### *\*Correspondence:*

*Stefan Berti, Psychologisches Institut, Johannes Gutenberg-Universität Mainz, Wallstrasse 3, D-55099 Mainz, Germany e-mail: berti@uni-mainz.de*

Distraction of goal-oriented performance by a sudden change in the auditory environment is an everyday life experience. Different types of changes can be distracting, including a sudden onset of a transient sound and a slight deviation of otherwise regular auditory background stimulation. With regard to deviance detection, it is assumed that slight changes in a continuous sequence of auditory stimuli are detected by predictive coding mechanisms, and it has been demonstrated that this mechanism is capable of distracting ongoing task performance. In contrast, it remains open whether transient detection, which does not rely on predictive coding mechanisms, can trigger behavioral distraction, too. In the present study, the effect of rare auditory changes on visual task performance is tested in an auditory-visual cross-modal distraction paradigm. The rare changes are either embedded within a continuous standard stimulation (triggering deviance detection) or are presented within an otherwise silent situation (triggering transient detection). In the event-related brain potentials, deviants elicited the mismatch negativity (MMN) while transients elicited an enhanced N1 component, mirroring pre-attentive change detection in both conditions but on the basis of different neuro-cognitive processes. These sensory components are followed by attention-related ERP components including the P3a and the reorienting negativity (RON). This demonstrates that both types of changes trigger switches of attention. Finally, distraction of task performance is observable, too, but the impact of deviants is higher compared to transients. These findings suggest different routes of distraction allowing for the automatic processing of a wide range of potentially relevant changes in the environment as a pre-requisite for adaptive behavior.

**Keywords: auditory distraction, control of attention, event-related brain potentials, mismatch negativity (MMN), P3a, reorienting negativity (RON), predictive coding**

## **INTRODUCTION**

The detection of unexpected changes in the sensory environment is a central prerequisite for flexible adaptation to new situations in a dynamic environment: Automatic processing of currently irrelevant sensory input can result in distraction of ongoing behavior, allowing for an evaluation of changes in the environment. In other words, salient changes in the environment may automatically trigger orientation of attention. This is very prominent in the auditory modality and enables scanning of the surrounding environment without physical orientation to sound locations. With this, it is possible to detect sudden onsets or changes irrespective of attentional allocation to the sound source. Importantly, the pre-attentive detection of changes in the auditory environment covers different types of changes spanning from sudden onsets of a sound (e.g., a ring-tone of a mobile phone during a lecture) to small variations within a continuous sound stream (e.g., prosodic changes in the lecturer's speech in response to the ringtone). Consequently, it was argued that different neuro-cognitive mechanisms exist for tapping the broad range of potentially relevant changes in the environment (Näätänen, 1990; Escera et al., 1998; Rinne et al., 2006; Näätänen et al., 2007; Winkler et al., 2009; Berti, 2012). The present study aims at comparing these two different situations of auditory change detection directly and at testing whether rare auditory stimuli may trigger different mechanisms of sensory processing and subsequent attentional orientation depending on whether they either are deviating from a continuous stimulation or are transient auditory events.

In recent years a number of studies have dealt with the processing of auditory stimuli or stimulus features which were irrelevant for the current task. Under special circumstances, a rare change of a task-irrelevant part of the auditory stimulation results in behavioral distraction, i.e., it is mirrored in prolonged response times and a decreased accuracy in processing the experimental task (for a review see Escera et al., 2000). In these studies the logic of the oddball paradigm is applied, which is that two types of stimuli are presented with one frequent stimulus (e.g., in 87% of the trials; the so-called standard) and one rare stimulus (e.g., in 13% of the trials; the so-called deviant). Standard and deviant stimuli can, for instance, differ in pitch (e.g., 1000 vs. 1100 Hz). However, in distraction paradigms this variation of standard and deviant pitch is task irrelevant and can be ignored because the participants' task is related to another stimulus feature. For instance, in an auditory-auditory distraction paradigm introduced by Schröger and Wolff (1998), the presented auditory stimuli differed in duration (half of the stimuli had a duration of 200 ms and the other half of 400 ms) and the participants were instructed to perform a duration discrimination task. Tone duration and pitch varied independently from each other and the presentation of the deviant pitch could not be anticipated. In an auditory-visual cross-modal distraction paradigm (see Escera et al., 1998) task-relevant visual stimuli were preceded by task-irrelevant auditory standard or deviant tones. In addition to these two types of distraction paradigm, visual-visual (e.g., Berti and Schröger, 2004), bimodal (Boll and Berti, 2009), and recently vibrotactile-visual (Parmentier et al., 2011b; Ljungberg and Parmentier, 2012) paradigms were developed which resemble the general distraction paradigm logic. 
The intriguing finding is that irrelevant stimulus features in all these different types of paradigms distract task performance, suggesting that the processing of deviants is highly important. It has been argued that distraction by deviants mirrors the openness for changes in the environment in order to enable fast switches of attention to a potentially relevant stimulus (see, for instance, Escera et al., 2000; Berti and Schröger, 2003; Berti, 2008a). This assumption was investigated by Hölig and Berti (2010) by combining the distraction logic with the task-switching logic, demonstrating that deviants indeed allow for a fast task switch and do not disrupt information processing in general. This mirrors the idea of an orienting response (OR, see Sokolov, 1963, 1990; Barry, 2009) as a basic mechanism of adaptation to changes in the environment. Finally, recent behavioral studies indicate that the automatic orientation of attention toward new information triggers an involuntary semantic evaluation of it (Parmentier, 2008; Parmentier et al., 2011c, 2013) which further supports the interpretation of distraction as a relevant adaptive mechanism.
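The behavioral distraction effect referred to above (prolonged response times and decreased accuracy on rare trials) is typically expressed as a rare-minus-frequent difference. The following sketch shows one way to compute it. The data layout, the key names, and the restriction of mean response times to correct responses are assumptions made for illustration, and the example trials are invented.

```python
import statistics


def distraction_effect(trials):
    """Rare-minus-frequent difference in mean RT (ms) and in error rate.

    `trials` is a list of dicts with hypothetical keys:
    'type' ('frequent' or 'rare'), 'rt_ms' (float), 'correct' (bool).
    Mean RT uses correct responses only (an assumption of this sketch).
    """
    result = {}
    for trial_type in ("frequent", "rare"):
        subset = [t for t in trials if t["type"] == trial_type]
        correct_rts = [t["rt_ms"] for t in subset if t["correct"]]
        result[trial_type] = {
            "mean_rt": statistics.fmean(correct_rts),
            "error_rate": 1 - len(correct_rts) / len(subset),
        }
    return {
        "rt_cost_ms": result["rare"]["mean_rt"] - result["frequent"]["mean_rt"],
        "error_cost": result["rare"]["error_rate"] - result["frequent"]["error_rate"],
    }


# Invented example trials; not data from any of the cited studies.
example_trials = [
    {"type": "frequent", "rt_ms": 500.0, "correct": True},
    {"type": "frequent", "rt_ms": 520.0, "correct": True},
    {"type": "frequent", "rt_ms": 510.0, "correct": True},
    {"type": "frequent", "rt_ms": 600.0, "correct": False},
    {"type": "rare", "rt_ms": 540.0, "correct": True},
    {"type": "rare", "rt_ms": 560.0, "correct": True},
    {"type": "rare", "rt_ms": 700.0, "correct": False},
    {"type": "rare", "rt_ms": 580.0, "correct": False},
]
effect = distraction_effect(example_trials)
```

Positive values on both measures indicate distraction; negative values would correspond to the facilitation effects by deviant stimuli that some studies have reported.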

On a functional level, it has been argued that three processing steps in the neuro-cognitive system underlie behavioral distraction (see, for instance, Escera et al., 2000; Berti, 2008a; Horváth et al., 2008), namely (1) pre-attentive change detection, (2) involuntary orienting of attention, and (3) voluntary reorienting of attention. In more detail, the initial step of change detection is assumed to automatically process the incoming sensory information and to detect novel or deviating aspects in the physical input indicating changes in the environment. In case of the detection of a change the second step is triggered, which is the orientation of attention onto the new information; this orientation of attention is assumed also to proceed automatically and constitutes involuntary allocation of attention. However, in case the change in the environment is irrelevant (as is the case in the above mentioned distraction paradigms) a reorientation of attention to the relevant aspects of the sensory input is necessary in order to perform the task at hand. Therefore, the third processing step involved in distraction is voluntary reorientation of attention. Taken together, these additional steps of involuntary and voluntary switching of attention triggered by change detection disturb the processing of the task relevant information, resulting in prolonged responses and/or increased error rates (see, for instance, Schröger, 1996; Alho et al., 1997; Escera et al., 1998). This sequence of distraction related neuro-cognitive processes is mirrored in the human event-related brain potential (ERP). 
In detail, in distraction paradigms focusing on processing of (irrelevant) auditory stimuli (as is the case in the auditory-auditory and auditory-visual paradigm) a typical sequence of components is observable in the deviant compared to the standard ERP, mirroring sensory processing and the control of attention related with the behavioral distraction effects (see, for instance, Schröger and Wolff, 1998; Escera et al., 2001; Berti et al., 2004; Berti, 2008a, 2012; Horváth et al., 2008; Hölig and Berti, 2010; for a review see Escera and Corral, 2007): Task irrelevant auditory changes elicit the mismatch negativity (MMN), P3a, and reorienting negativity (RON), with the MMN indexing the pre-attentive sensory processing (Näätänen, 1990), the P3a indexing the involuntary switch of attention onto the new information (Friedman et al., 2001), and the RON indexing the subsequent reorienting back to the task relevant information (Schröger and Wolff, 1998; Berti, 2008a). However, for several reasons this "mechanistic" view of a three step processing chain underlying distraction seems to be too simple for tapping the functional diversity of flexible adaptation to ongoing changes in the sensory environment.

The analysis of behavioral data in a number of studies demonstrated a variety of factors influencing the actual effect of irrelevant deviating sensory information on task performance, including different types of top-down effects (Berti and Schröger, 2003; Sussman et al., 2003; Munka and Berti, 2006; Wetzel and Schröger, 2007; SanMiguel et al., 2008; Ruhnau et al., 2010; Horváth and Bendixen, 2012; Parmentier and Hebrero, 2013), the degree of the change compared with the standard stimulation (Yago et al., 2001; Berti et al., 2004; Berti, 2012), or the local micro-structure of the stimulation, i.e., with regard to the pattern of standard repetitions and change (Bendixen et al., 2007; Jankowiak and Berti, 2007; Horváth et al., 2008). In addition, Parmentier et al. (2010) and Wetzel et al. (2012) also reported potential facilitation effects by deviant stimuli, raising the question why and when auditory deviants become distracting (see also Parmentier, 2008; Parmentier et al., 2011a). One potential answer to this question is that pre-attentive change detection and subsequent triggering of attentional allocation can be based on different mechanisms optimized for different potentially relevant events in the auditory environment (Näätänen, 1990; Escera et al., 1998; Rinne et al., 2006; Näätänen et al., 2007; Horváth et al., 2008; Winkler et al., 2009; Berti, 2012). For instance, Escera et al. (1998) presented two types of auditory changes within the auditory-visual distraction paradigm: a deviant sinusoidal tone with a frequency of 700 Hz and novel stimuli which comprised short environmental sounds and which were presented only once within the experiment; the standard stimuli (sinusoidal tones with a pitch of 600 Hz) were presented in 80% of the trials while deviant and novel stimuli were presented in 10% of the trials each. 
In this study, deviants and novels both distracted visual task performance but distraction effects on the behavioral and ERP levels showed interesting differences: On the one hand, deviants resulted in a stronger distraction effect than novels. This is a surprising result because novels comprise a stronger change compared with deviants. On the other hand, novels elicited an increased N1 component while deviants elicited an MMN in the ERP. Moreover, both deviants and novels elicited a subsequent P3a, but the P3a in novel trials comprised two subcomponents, an early and a later one. The differential effects of novels and deviants on the early, deviance related ERP components were replicated in a study by Berti (2012) applying the auditory-visual distraction paradigm. In this study, three types of rare (13% of the trials in each block) auditory changes were presented before the relevant visual stimulus: a pitch deviant (pitch increase of 10%), novel stimuli, and (going beyond the Escera et al., 1998, study) a short environmental sound. In detail, the short environmental sound was a sound that is similar to the novel stimuli and, therefore, differs in sensory richness compared with the sinusoidal standard stimulus. But in contrast with the novel stimuli this sound is repeated as a rare stimulus and, therefore, is not novel within the experiment. The ERPs showed an MMN elicited by the deviants, but an N1 increase plus MMN with the two other types of changes (environmental sound or novels). Interestingly, behavioral distraction effects were obtained in the conditions with the strong changes (i.e., the novels and the environmental sound) only. Finally, in a study by Rinne et al. (2006) applying the auditory-auditory distraction paradigm, intensity decrement deviants elicited an MMN only while intensity increment deviants elicited an MMN preceded by an N1 enhancement. 
Taken together, this pattern of results suggests two different mechanisms of change detection: One mechanism capable of detecting salient changes like the onset of a pronounced difference, a novel or a transient sound in the environment, which is mirrored in the N1 component, and another mechanism capable of detecting slight or small changes in the sensory input (see Escera et al., 1998; Rinne et al., 2006; Berti, 2012). According to Näätänen (1990) the first mechanism can be denoted as transient detector and the second mechanism as deviance detector (see also Näätänen et al., 2007; Winkler et al., 2009).

With regard to the deviance detection mechanism, it was assumed that the MMN mirrors the processing of a deviation of the current sensory input from a sensory memory trace built on the basis of the ongoing (standard) stimulation (see, for instance, Näätänen, 1990; Schröger, 1997; Näätänen et al., 2005). Recently, this "memory based" interpretation of the MMN was developed into a more general interpretation within the context of predictive coding theory (see, for instance, Winkler, 2007; Winkler and Czigler, 2012). In this framework, the MMN is assumed to reflect the "prediction error," which is the difference between the expected sensory input (as predicted from the previous input) and the actual sensory input. A basic prediction might be that the ongoing stimulation will be continued by a physically identical stimulus, i.e., the standard (this resembles the memory based MMN), but predictions about the upcoming stimulation might be derived on the basis of more complex rules or regularities (see Bendixen et al., 2007; for a review see Näätänen et al., 2001). However, this interpretation suggests that deviance processing as a basis for distraction relies on the existence of additional information allowing for building up a memory trace or deriving predictions regarding the sensory environment. In contrast, it remains unclear whether rare changes that are not embedded within a continuous stream of sensory information are also capable of triggering change detection and switching of attention resulting in distraction of ongoing task performance. This question is addressed in the present study by applying the auditory-visual distraction paradigm (see Escera et al., 1998). In detail, rare auditory stimuli are presented in two conditions. 
In one condition, these rare stimuli are embedded within a stream of frequent stimuli (i.e., standard stimuli; Oddball condition) while in the other condition, the same type of stimuli are presented infrequently before the task relevant visual stimulus but there is no second type of stimulus which could serve as a standard stimulation (Distractor condition). On the one hand, rare stimuli in the Oddball condition constitute typical deviant stimuli and should, therefore, result in the elicitation of deviance related behavioral and ERP distraction effects including the elicitation of an MMN. On the other hand, rare stimuli in the Distractor condition should elicit a pronounced N1 (see Näätänen and Picton, 1987). The question is whether this N1-based transient detection will also trigger a behavioral distraction effect and attention related ERP components (i.e., P3a and RON).

## **EXPERIMENT 1**

## **METHODS**

## *Participants*

Twelve healthy volunteers (age-span 22–38 years, mean age 28.4 years, 2 males) with normal or corrected to normal visual acuity participated in the study. In accordance with the Declaration of Helsinki all subjects gave written consent after the nature of the experiment was explained to them.

## *Experimental task*

The subjects' task was to decide whether a visually presented number was odd or even by pressing one of two assigned response buttons. The visual stimuli were numbers between one and eight with a presentation time of 200 ms and a stimulus-onset asynchrony (SOA) of 1200 ms. The probability of odd and even numbers was 50% each. Subjects performed this task in two conditions: In the Oddball condition every number was preceded by a 200 ms sinusoidal tone (including 5 ms rise and fall time) with an SOA of 300 ms. Importantly, the auditory stimuli could be of a standard (600 Hz, 87% of the trials; standard trial) or a deviant (660 Hz, 13% of the trials; deviant trial) frequency. In the Distractor condition a preceding tone (660 Hz) was presented only in 13% of the trials (distractor trial) while in the rest of the trials the visual stimulus was not preceded by an auditory stimulus (no-tone trial). With regard to the frequency of the different trial types in the two conditions, standard and no-tone trials will be referred to as *frequent trials* and deviant and distractor trials as *rare trials*. The tones were presented binaurally with a sound pressure level of 75 dB. In both conditions the auditory stimuli were irrelevant for the odd-even discrimination task. The trials were presented blockwise for the Oddball and the Distractor condition. Each condition consisted of 7 blocks of 100 randomized trials, with the constraint that each infrequent trial was followed by at least three frequent trials. Every block started with a fixation cross presented for 1000 ms at the middle of the screen at the position at which the visual stimuli were presented. Moreover, the experiment started with a training block consisting of 100 no-tone trials in order to practice the odd-even discrimination task. The subjects were instructed to react with a left or a right key press as fast and as accurately as possible and to reduce eye movements, blinks, and movements in general.
The response-to-key mapping and the order of conditions were randomized between subjects.
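The sequencing constraint described above (rare trials in 13% of the trials, each followed by at least three frequent trials) can be illustrated with a minimal Python sketch. It assumes exactly 13 rare trials per block of 100, which the text does not state explicitly, and the function name `make_block` is hypothetical. The way spare frequent trials are distributed is one simple choice and is not claimed to be the randomization procedure used in the study.

```python
import random


def make_block(n_trials=100, n_rare=13, min_gap=3, seed=None):
    """Return a list of 'rare'/'frequent' labels for one block.

    Every rare trial is followed by at least `min_gap` frequent trials.
    """
    rng = random.Random(seed)
    n_frequent = n_trials - n_rare
    # Frequent trials left over after reserving `min_gap` for each rare trial.
    spare = n_frequent - n_rare * min_gap
    if spare < 0:
        raise ValueError("constraint cannot be satisfied with these counts")
    # Slots: one before each rare trial, plus one at the end of the block.
    extra = [0] * (n_rare + 1)
    for _ in range(spare):
        extra[rng.randrange(n_rare + 1)] += 1
    block = []
    for i in range(n_rare):
        block += ["frequent"] * extra[i]
        block += ["rare"] + ["frequent"] * min_gap
    block += ["frequent"] * extra[n_rare]
    return block


block = make_block(seed=1)
```

Because the mandatory frequent trials are attached directly to each rare trial, the constraint holds by construction and no rejection sampling is needed.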

#### *Behavioral data analysis*

Mean reaction time (RT) and hit rate were computed for the responses. Trials with RTs shorter than 200 ms were discarded as false reactions, and mean RTs were only computed for correct trials. The first frequent trial after a rare trial was excluded from the analysis. The behavioral data were subjected to a Two-Way repeated-measures analysis of variance (ANOVA) with factors Condition (Oddball vs. Distractor) and Trial type (frequent trial vs. rare trial). In addition, distraction effects in RTs were calculated for each condition separately by subtracting the RT in frequent trials from the RT in rare trials and were tested for a significant difference from zero by two one-sample, two-tailed *t*-tests.
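As an illustration of the distraction-effect computation, the following Python sketch subtracts the mean RT in frequent trials from the mean RT in rare trials for each participant and computes the one-sample *t* statistic against zero. The RT values are hypothetical and are not data from the study, and the function names are illustrative. The *p*-value is omitted because it requires the *t* distribution (available, for instance, through `scipy.stats.ttest_1samp`).

```python
import math
import statistics


def distraction_effect(rare_rts, frequent_rts):
    """Per-participant distraction effect: rare minus frequent mean RT (ms)."""
    return [r - f for r, f in zip(rare_rts, frequent_rts)]


def one_sample_t(values, mu=0.0):
    """Return the t statistic and degrees of freedom for a test against mu."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant mean RTs in ms (illustrative values only).
rare = [440.0, 455.0, 430.0, 462.0]
frequent = [430.0, 441.0, 428.0, 450.0]

effects = distraction_effect(rare, frequent)
t, df = one_sample_t(effects)
```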

#### *EEG recording and analysis*

The electroencephalogram (EEG) was recorded with a 32-channel Neuroscan SynAmps amplifier from 19 electrodes placed on a cap according to the International 10–20 System (Jasper, 1958) and from two additional electrodes placed at the left (LM) and right mastoids (RM). The reference electrode was placed at the tip of the nose. The EEG was online filtered with a 0.05–70 Hz bandpass and a 50 Hz notch filter. Electro-oculograms (EOG) were also recorded for offline artifact correction vertically from above and below the right eye and horizontally from the outer canthi of both eyes. The EEG was offline bandpass filtered with a 1–30 Hz bandpass filter. The ERPs were computed separately for frequent and rare trials in the Oddball and Distractor condition within a time window from −200 to 800 ms relative to auditory stimulus onset in case of trials consisting of an auditory stimulus or the comparable point in time in the no-tone trial. In other words, ERPs were calculated relative to a point in time 300 ms before the onset of the task relevant visual stimuli in all trials. All epochs with extensive eye movements (i.e., whenever the standard deviation within a sliding 200 ms time window exceeded 25 μV at the EOG or at Fz) were rejected automatically from the calculation of ERPs. The 200 ms period before stimulus onset or relative to the comparable point in time in the no-tone trial served as baseline. Again, the first frequent trial after a rare trial was excluded from ERP computation. In addition, difference waves were computed separately for the two conditions by subtracting the ERPs elicited by frequent trials from the ERPs elicited by rare trials. The ERPs and difference waves are depicted at Fz, Cz, Pz, and LM because these electrodes represent the pattern of results best.
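The rejection criterion (standard deviation within a sliding 200 ms window exceeding 25 μV at the EOG or at Fz) can be sketched in Python as follows. The sampling rate is not reported in the text and is therefore a parameter here. The step size of one sample and the use of the population standard deviation are assumptions, as are the function and channel names.

```python
import statistics


def exceeds_sd_threshold(samples_uv, sfreq_hz, window_ms=200.0, threshold_uv=25.0):
    """True if the SD within any sliding window exceeds the threshold (in microvolts)."""
    win = int(round(window_ms * sfreq_hz / 1000.0))
    if win < 2 or win > len(samples_uv):
        raise ValueError("window must span at least 2 samples and fit in the epoch")
    # Assumption: the window is advanced one sample at a time.
    for start in range(len(samples_uv) - win + 1):
        if statistics.pstdev(samples_uv[start:start + win]) > threshold_uv:
            return True
    return False


def keep_epoch(channels, sfreq_hz):
    """channels maps a channel name (EOG channels, Fz) to its samples in microvolts."""
    return not any(exceeds_sd_threshold(x, sfreq_hz) for x in channels.values())


# Illustrative signals at an assumed 100 Hz: a flat trace and a step of 100 microvolts.
flat = [0.0] * 100
step = [0.0] * 50 + [100.0] * 50
clean_kept = keep_epoch({"Fz": flat, "VEOG": flat}, sfreq_hz=100.0)
blink_kept = keep_epoch({"Fz": flat, "VEOG": step}, sfreq_hz=100.0)
```

An epoch is kept only if none of the inspected channels crosses the threshold in any window.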

After artifact correction, the data sets of two participants were excluded from further ERP analysis because of an insufficient number of artifact free epochs in one or more trial types (i.e., fewer than 40 artifact free EEG epochs). After visual inspection, the ERPs of the remaining ten participants were analyzed at Fz and Cz by calculating the mean amplitude in five different time windows (70–150, 150–220, 220–390, 390–530, and 530–790 ms) separately for the two stimulus types in the two conditions. In addition, differences between rare and frequent trial ERP amplitudes were calculated in order to test for significant change-related components by means of one-sample, two-tailed *t*-tests against zero (see **Table 1**). Furthermore, the mean amplitudes in the five respective time windows were evaluated for significant effects of the factors Trial type, Condition, and Electrode (Fz vs. Cz) by Three-Way repeated-measures ANOVAs.
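The mean-amplitude measure in the five time windows can be sketched in Python as follows, assuming an epoch that starts 200 ms before stimulus onset and a known sampling rate (the rate is an assumption, since it is not reported). The function names are illustrative, and both window edges are treated as inclusive, which is also an assumption.

```python
WINDOWS_EXP1_MS = [(70, 150), (150, 220), (220, 390), (390, 530), (530, 790)]


def time_to_index(t_ms, epoch_start_ms, sfreq_hz):
    """Convert a latency in ms to a sample index within the epoch."""
    return int(round((t_ms - epoch_start_ms) * sfreq_hz / 1000.0))


def difference_wave(rare_uv, frequent_uv):
    """Rare-minus-frequent ERP, sample by sample."""
    return [r - f for r, f in zip(rare_uv, frequent_uv)]


def mean_amplitude(erp_uv, window_ms, epoch_start_ms=-200.0, sfreq_hz=1000.0):
    """Mean amplitude within a time window (both edges inclusive)."""
    lo = time_to_index(window_ms[0], epoch_start_ms, sfreq_hz)
    hi = time_to_index(window_ms[1], epoch_start_ms, sfreq_hz)
    segment = erp_uv[lo:hi + 1]
    return sum(segment) / len(segment)


# Synthetic ERPs at an assumed 1000 Hz: a 2 microvolt plateau from 70 to 150 ms.
times_ms = range(-200, 801)
rare_erp = [2.0 if 70 <= t <= 150 else 0.0 for t in times_ms]
frequent_erp = [0.0 for _ in times_ms]

diff = difference_wave(rare_erp, frequent_erp)
amplitudes = [mean_amplitude(diff, w) for w in WINDOWS_EXP1_MS]
```

The resulting per-participant amplitudes are the values that would enter the *t*-tests against zero and the repeated-measures ANOVAs.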

#### **RESULTS**

In Experiment 1, accuracy in the visual classification task was high (range of hit rate: 0.90–0.91) and did not differ between the two conditions and trial types (all *F*s in the Condition × Trial type ANOVA < 1). **Figure 1A** depicts the mean RTs obtained in Experiment 1: RTs were in the range of 420 to 432 ms and showed only small variations between the two conditions and trial types. The Two-Way ANOVA revealed only a marginally significant interaction of Condition and Trial type, while neither Condition nor Trial type yielded a significant main effect: Condition, *F*(1, 11) < 1, *p* = 0.410, η<sub>p</sub><sup>2</sup> = 0.063; Trial type, *F*(1, 11) = 1.424, *p* = 0.258, η<sub>p</sub><sup>2</sup> = 0.115; Condition × Trial type, *F*(1, 11) = 3.391, *p* = 0.093, η<sub>p</sub><sup>2</sup> = 0.236. This was also mirrored in the distraction effects (rare trial RT minus frequent trial RT; DRT), which were small in the Oddball condition and close to zero in the Distractor condition: Oddball DRT 10 ms, *t*(11) = 1.924, *p* = 0.081, Cohen's *d* = 0.163; Distractor DRT −2 ms, *t*(11) < 1.
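For readers who wish to check the reported effect sizes, partial eta squared can be recovered from an *F* ratio and its degrees of freedom through the standard identity η<sub>p</sub><sup>2</sup> = (*F* · df<sub>effect</sub>) / (*F* · df<sub>effect</sub> + df<sub>error</sub>). The following Python sketch applies this identity to the two *F* values reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for the RT ANOVA of Experiment 1.
reported = {
    "Trial type": (1.424, 1, 11),
    "Condition x Trial type": (3.391, 1, 11),
}
recovered = {
    name: round(partial_eta_squared(*args), 3) for name, args in reported.items()
}
```

The recovered values can be compared directly with the effect sizes given in the text.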

**Figure 2** summarizes the ERPs obtained in the Oddball (**Figure 2A**) and Distractor condition (**Figure 2B**). In the Oddball condition, both types of auditory stimuli elicited the N1 component, which is followed by a second negativity in the rare but not in the frequent stimulus type (see **Figure 2A**, Fz). In contrast, the rare auditory stimulus in the Distractor condition elicited a distinctive N1 followed by a positive peak around 200 ms (see **Figure 2B**, Fz). The difference between rare and frequent tones in the two conditions is depicted in the difference waves (**Figure 2C**). The difference waves illustrate that the early negative components elicited by rare auditory stimuli in the Oddball and the Distractor condition differ with regard to their peak latencies. In other words, the processing of deviant tones in the Oddball condition is mirrored in the MMN while the processing of distractor tones in the Distractor condition is mirrored in the N1. In addition, both the N1 and the MMN are followed by positive components: In the Oddball condition a P3a peaking around 280 ms is visible. In the Distractor condition two positive peaks are visible at Fz: a P2 peaking around 200 ms and a P3a peaking around 340 ms. Finally, an early phase of the RON component is obtained in the Distractor condition between 400 and 520 ms and a later phase of the RON component is obtained in both conditions between 520 and 700 ms. The statistical tests of the mean amplitudes in the difference waves (by means of *t*-tests against zero) confirm the elicitation of N1/MMN, P3a, and late RON in the Oddball condition and the elicitation of N1, P2, P3a, and early and late RON in the Distractor condition (see **Table 1**). The results of the 2 × 2 × 2 repeated-measures ANOVAs for the five time windows are summarized in **Table 2**: Main effects of the factor Trial type are obtained in all time windows while main effects of the factor Condition are confined to the three early time windows. In addition, significant interaction terms in all except the latest time window confirm the differences in processing of rare auditory stimuli in the two conditions. This difference is already mirrored in the comparison of the ERPs of the rare stimuli: In all but the latest time window, deviant ERP amplitudes differ from distractor ERP amplitudes at Fz: 70–150 ms: *t*(9) = 7.27, *p* < 0.0001, Cohen's *d* = 2.39; 150–220 ms: *t*(9) = 3.81, *p* = 0.0042, Cohen's *d* = 2.03; 220–390 ms: *t*(9) = 6.13, *p* = 0.0002, Cohen's *d* = 1.39; 390–530 ms: *t*(9) = 2.52, *p* = 0.0328, Cohen's *d* = 0.85; 530–790 ms: *t*(9) < 1.

**Table 1 | Mean amplitudes (and standard error of mean) and** *t***-statistics of deviance related differences between rare and frequent trial separately for the five time windows at Fz and Cz in Experiment 1.**

*All df's = 9; significance levels: ◦p < 0.1, \*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.*

#### **DISCUSSION**

Rare pitch changes in the Oddball condition elicited the MMN, the P3a, and the RON and resulted in a small (but statistically not significant) RT prolongation. This pattern of results can be best interpreted as distraction of the visual odd-even classification task by the task irrelevant change of the preceding auditory stimuli. By comparison, the infrequent presentation of the same stimuli in the Distractor condition elicited a pronounced N1 component followed by a P2, a late P3a, and the early and the late RON subcomponents (see Berti, 2008a). On the behavioral level, no distraction effects were obtained in the Distractor condition. The pattern of ERP results in the two conditions mirrors the findings by Escera et al. (1998) and partly by Berti (2012): The small change (i.e., the deviant) in the Oddball condition elicited an MMN component followed by the attention related ERP components P3a and RON while the strong change (i.e., the rare transient sound) elicited a pronounced N1 component followed by an early and a late fronto-central positive component (in Berti, 2012, there is only one P3a obtained) and a RON component (visible but not analyzed in Escera et al., 1998). With regard to the fronto-central positive components between 150 and 390 ms, these resemble the early and late P3a reported in Escera et al. (1998; see also Berti, 2008b) but in the present Experiment 1 the early positive component peaks around 200 ms (230 ms in Escera et al., 1998). It is therefore questionable whether the first positive component following the N1 is best described as a P2 or an early P3a. In contrast, the later positive peak around 340 ms with a frontal maximum resembles a typical P3a, and together with the subsequent RON component one may assume that the rare sinusoidal tones in the Distractor condition also resulted in attentional switching and distraction of task performance. Therefore, the present results are in line with studies demonstrating the contribution of a transient detection mechanism to distraction (see Escera et al., 1998; Rinne et al., 2006; Berti, 2012). Moreover, in contrast to the studies by Escera et al. (1998), Rinne et al. (2006), and Berti (2012) the N1 elicited by the rare auditory stimuli is not followed by an MMN. This suggests that the N1-route of change detection is also capable of triggering a switching of attention without the possibility of comparing the actual input with a prediction or memory trace and without elicitation of an MMN. On the other hand, this conclusion is weakened by an unclear pattern of results on the behavioral level: Because the RT distraction effect in the Oddball condition is only marginally significant (in a two-tailed *t*-test) and the interaction term of the factors Condition and Trial type also failed to reach statistical significance on a 5% level, it is hard to tell whether the lack of a behavioral distraction effect in the Distractor condition is due to the incapability of the rare stimuli to result in distraction of task performance or whether it merely mirrors weak statistical power in the behavioral data. This is relevant because if the lack of a behavioral distraction effect is a valid finding, one may conclude that behavioral distraction is triggered by the MMN or deviance detector route only. To elucidate this question, Experiment 1 was replicated with novels as rare stimuli because these stimuli obtained stronger distraction effects in recent studies (see, for instance, Berti, 2012).

**Table 2 | Statistical evaluation of effects of Trial type (rare vs. frequent), Condition (Oddball vs. Distractor), and Electrode (Fz vs. Cz) by means of repeated-measure ANOVAs separately for the five time windows in Experiment 1.**

*F-values and partial η<sub>p</sub><sup>2</sup>'s are summarized; all df's = 1, 9; significance levels: \*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.*

## **EXPERIMENT 2**

#### **METHODS**

#### *Participants*

Sixteen healthy volunteers (age-span 22–51 years, mean age 28.8 years, 7 males) with normal or corrected to normal visual acuity participated in the study. In accordance with the Declaration of Helsinki all subjects gave written consent after the nature of the experiment was explained to them.

## *Experimental task*

The experimental task with its instruction, timing, stimulus types, conditions, number of blocks, and numbers of trials in each block was the same as in Experiment 1. In contrast to Experiment 1, the rare stimuli in the Oddball and the Distractor condition were novel stimuli (see Escera et al., 1998): Novels are short environmental sounds with a duration of 200 ms (including 10 ms rise and fall times), each of which was presented only once in each condition. Again, novels in the Oddball and Distractor condition were presented in 13% of the trials (rare trials) while in the remaining 87% of the trials either the 600 Hz sinusoidal tone (Oddball condition) or no auditory stimulus (Distractor condition) preceded the visual stimulus (frequent trial). The analysis of the participants' performance in Experiment 2 corresponded to the behavioral data analysis of Experiment 1.

## *EEG recording and analysis*

The recording and analysis of the EEG were the same as in Experiment 1 with the following exceptions: The EEG was recorded from nine electrodes (F4, Fz, F3, Cz, Pz, O1, O2, RM, and LM) referenced to the nose plus the vertical and horizontal EOG. After artifact correction, the data sets of seven participants were excluded from further ERP analysis because of an insufficient number of artifact free epochs in one or more trial types (i.e., fewer than 40 artifact free EEG epochs). After visual inspection, the ERPs were analyzed at Fz and Cz by calculating the mean amplitude in four different time windows (100–160, 160–260, 260–420, and 420–570 ms) separately for the two stimulus types in the two conditions. In addition, differences between rare and frequent trial ERP amplitudes were calculated in order to test for significant change-related components by means of one-sample, two-tailed *t*-tests against zero (see **Table 3**). The averaged amplitudes in the four respective time windows were evaluated for significant effects of the factors Trial type, Condition, and Electrode by Three-Way repeated-measures ANOVAs.

## **RESULTS**

Performance in the visual classification task in Experiment 2 is slightly decreased compared with Experiment 1 but still quite accurate (range of hit rate: 0.86–0.88). Again, hit rate does not differ between the two conditions and trial types: Condition, *F*(1, 15) < 1; Trial type, *F*(1, 15) < 1; Condition × Trial type, *F*(1, 15) = 1.042, *p* = 0.324. **Figure 1B** summarizes the mean RTs obtained in Experiment 2, which are in the range of 472 to 503 ms. RTs tend to be shorter in the Oddball compared with the Distractor condition and are increased in rare compared with frequent trials. This is mirrored in the Two-Way ANOVA: Condition, *F*(1, 15) = 4.605, *p* = 0.049, η<sub>p</sub><sup>2</sup> = 0.235; Trial type, *F*(1, 15) = 32.301, *p* < 0.0001, η<sub>p</sub><sup>2</sup> = 0.683; Condition × Trial type, *F*(1, 15) = 5.791, *p* = 0.029, η<sub>p</sub><sup>2</sup> = 0.279. In addition, in both conditions a significant distraction effect is obtained, which is more pronounced in the Oddball than in the Distractor condition: Oddball DRT 21 ms, *t*(15) = 5.196, *p* = 0.0001, Cohen's *d* = 3.198; Distractor DRT 9 ms, *t*(15) = 2.919, *p* = 0.011, Cohen's *d* = 0.128.

The ERPs obtained in the Oddball and Distractor conditions are depicted in **Figures 3A**,**B**, respectively. The differences between the rare and the frequent trial type are highlighted in **Figure 3C**: In both conditions rare novel stimuli elicited an early negativity which is more pronounced in the Distractor than in the Oddball condition but which virtually does not differ in latency. The early negativity is followed by a positive component in both conditions but with different timings: In the Oddball condition the positivity peaks around 330 ms while in the Distractor condition it peaks around 210 ms. Subsequent to the positive peaks a late negativity is observable in both conditions at Fz between 420 and 570 ms. The existence of these fronto-central auditory change-related components is confirmed by a series of two-tailed *t*-tests (see **Table 3**) with one exception: In the Distractor condition the late negativity around 500 ms shows only a marginally significant difference from zero. **Table 4** summarizes the statistical analysis of the effects of the three factors Trial type, Condition, and Electrode by means of repeated-measures ANOVAs: In the early time windows, significant main effects and interactions of Trial type and Condition are revealed. In the 260–420 ms time window a significant main effect of the factor Trial type is obtained. Finally, in the latest time window a main effect of the factor Condition is revealed which is qualified by an interaction of Trial type and Electrode. Besides this interaction, the factor Electrode obtains a main effect in the 160–260 ms time window only and reveals significant interaction terms with the factor Condition in the 100–160 ms time window and with the factor Trial type and a three-way interaction in the 160–260 ms time window.

**Table 3 | Mean amplitudes (and standard error of mean) and** *t***-statistics of deviance related differences between rare and frequent trial separately for the four time windows at Fz and Cz in Experiment 2.**

*All df's = 8; significance levels: ◦p < 0.1, \*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.*

## **DISCUSSION**

In a nutshell, Experiment 2 replicated the findings of Experiment 1 with two major exceptions: Firstly, on a behavioral level in both conditions RT costs elicited by the novels are obtained. Importantly, the distraction effect in the Oddball condition is more pronounced than the distraction effect in the Distractor condition which is also confirmed by a Condition × Trial type interaction in the ANOVA. Secondly, novels in both conditions elicited a pronounced N1 which is more pronounced in the Distractor compared with the Oddball condition. In addition, a distinct MMN as obtained in the Oddball condition of Experiment 1 cannot be identified. The central question of Experiment 2 was whether distraction effects can be obtained in both conditions and the answer is yes. But distraction effects differ significantly with smaller RT costs in the Distractor condition, in which a prediction based processing is not possible. However, the results of Experiment 2 are best discussed together with the findings of Experiment 1.

## **GENERAL DISCUSSION**

Rare auditory stimuli can distract visual information processing irrespective of whether the auditory stimuli deviate from a continuous auditory stimulation or whether the auditory stimuli are transient sounds interspersed in an otherwise sound free stimulation. On the behavioral level, distraction is mirrored in RT prolongation but not in an increase of error rate. In addition, behavioral distraction effects are smaller for transient sounds than for deviants. The ERP results suggest that distraction of task performance is triggered by two different initial processes: For transient sounds, the initial step of change detection is mirrored by the N1, and for deviants initial change detection is mirrored by the MMN; the further processing of the two types of rare stimuli shares attentional processes as reflected in the P3a and RON components. These attention related ERP components are preceded by an additional positive peak around 200 ms in the transient sound processing. This suggests that distraction by rare and irrelevant sounds in different situations shares later processing steps (presumably linked with controlled attention) while the initiation of distraction differs with regard to the characteristics of the context or the stimulation.

With regard to the N1 and MMN results, the present findings add to the findings by Escera et al. (1998), Rinne et al. (2006), and Berti (2012) suggesting two different mechanisms of auditory change detection in the context of distraction. Here I demonstrate that the N1-route of distraction is observable when no further reference information allows for deriving predictions as a basis for change detection. On the other hand, a suddenly appearing sound within an otherwise silent environment (for example in a sound attenuated lab chamber) constitutes a strong and salient change and, therefore, does not require a reference model to be identified as a change. Interestingly, the findings in the present Distractor conditions mirror those in the study by Escera et al. (1998) where novels also constitute a strong and salient change. The present study supports the idea that the N1 reflects a transient detector mechanism (Näätänen, 1990), and from the resemblance to the findings in Escera et al. (1998) one may conclude that this transient detector also adds to the processing of the novels when these are embedded into a regular standard stimulation. This interpretation is further supported by a study by Berti (2012); however, in this study strong changes elicited an N1 enhancement irrespective of whether they were novels or not. This suggests that the triggering of the N1-route or transient detector mechanism depends on whether the stimulus is strong enough to pass a sensory threshold. Taken together, the triggering of the transient detector mechanism seems to be independent of whether the stimulus is novel or not and of whether it is embedded in a continuous stream of auditory stimulation. This is also in line with the findings by Rinne et al. (2006) reporting an additional N1 when the deviation from the standard gets stronger (i.e., intensity increments). Moreover, the Rinne et al. (2006) study also demonstrates that in the oddball-situation the N1 might be added to the MMN-route (see also Escera et al., 1998; Berti, 2012). In other words, the two mechanisms do not exclude each other but may operate in parallel (see Grimm and Escera, 2012, for a discussion of the variety of processes presumably underlying auditory change detection). The present study suggests that these two mechanisms can be triggered independently of each other because in Experiment 1 the pronounced N1 is confined to the Distractor condition while the MMN is confined to the Oddball condition. (But note the small N1 enhancement in Experiment 1 preceding the MMN in the Oddball condition.) The difference between the conditions is the existence of a continuous stimulation which may serve as a reference model and allow for deriving predictions about the environmental stimulation. Therefore, the findings of Escera et al. (1998), Rinne et al. (2006), and Berti (2012) also suggest that the predictive coding mechanism is not inactivated by the transient detector mechanism and vice versa. Again, this may suggest a more parallel processing of the transient and deviant information contained in the auditory stream in the oddball stimulation. But if the transient detector and the deviant detector operate in parallel to and independently of each other, then the question arises whether and at which point in the processing chain these routes of change detection interact.

**Table 4 | Statistical evaluation of effects of Trial type (rare vs. frequent), Condition (Oddball vs. Distractor), and Electrode (Fz vs. Cz) by means of repeated-measure ANOVAs separately for the four time windows in Experiment 2.**

*F-values and partial η<sub>p</sub><sup>2</sup>'s are summarized; all df's = 1, 8; significance levels: ◦p < 0.1, \*p < 0.05, \*\*p < 0.01, \*\*\*p < 0.001.*

Deviants in an oddball stimulation resulting in a behavioral distraction effect usually do elicit the P3a component. This component is interpreted as an electrophysiological index of involuntary attention switching (see, for instance, Friedman et al., 2001; Escera and Corral, 2007). In the present experiments rare stimuli also elicited the P3a component. In detail, rare stimuli embedded within the standard stream showed the typical fronto-central positive peak between 300 and 400 ms irrespective of whether they are deviants or novel sounds. With regard to the (non-oddball) Distractor conditions, rare stimuli elicited a biphasic positive component with two distinct positive peaks in Experiment 1; this pattern can be interpreted as early and late P3a subcomponents (see Escera et al., 1998; Berti, 2008b). Consequently, one may conclude (1) that both the N1 and the MMN route result in the elicitation of a switch of attention and (2) that within the time window of 200 to 400 ms the effects of the two processing streams converge. The latter conclusion is in line with the findings that the P3a is elicited also by visual deviants (see Berti and Schröger, 2001, 2004) and that audio-visual interaction in bimodal deviants is visible in the P3a time window (see Boll and Berti, 2009). The first conclusion is supported by the subsequent elicitation of a RON component: This component is elicited when the detected and (at least partly) attended change is irrelevant for the task at hand with the consequence that a reorientation to the task relevant information after distraction is required (see, for instance, Schröger and Wolff, 1998; Berti et al., 2004; Berti, 2008a; Horváth et al., 2008). Experiment 1, therefore, demonstrates that N1 or transient triggered change detection is capable of eliciting involuntary and voluntary attentional allocation as demonstrated by the P3a-RON complex. 
This is in line with the Preliminary Process Theory (PPT; see Barry, 2009) arguing that one route contributing to OR is also based on a transient detector. Finally, this also matches the functional interpretation of distraction as a pre-requisite for a subsequent fast switch of attention or behavioral goals: As demonstrated by Berti (2008b) and Hölig and Berti (2010), switches between different objects in working memory or between different tasks are also mirrored in the P3a.

On the other hand, the present findings also challenge the interpretation of the P3a as a unitary component reflecting involuntary switching of attention for three reasons: (1) The Distractor condition yielded a pronounced early positive peak around 200 ms preceding the "classical" P3a, (2) the elicitation of a P3a is not necessarily correlated with a behavioral distraction effect (see also Munka and Berti, 2006; Wetzel et al., 2013), and (3) the degree of the change (as indexed by the N1 and MMN components) is not systematically mirrored in the P3a amplitude. The latter point is in contrast to earlier studies demonstrating a correlation of the degree of a deviation with the ERP signs of distraction, especially P3a (see Yago et al., 2001; Berti et al., 2004). However, in the context of the present study this seems to mirror the qualitative change between the two modes of processing but not a gradual increase of distraction. In addition, the difference between the two modes of processing also seems to include the integration of an additional processing step as mirrored in the positive component around 200 ms. In other words, the sequence of ERP components in the Distractor conditions suggests that the processing of the transients does elicit a "transient P2" component in the present study which is observable in the time window of the MMN (see **Figure 2**). However, because the rare stimuli in the Distractor condition seem to be processed efficiently (mirrored in the pronounced N1 amplitude), this "transient P2" is elicited very early. In contrast, in other conditions this positive component may be elicited later and overlap with the classical P3a. Importantly, from this interpretation one must conclude that this early positive component and the later positive component within the 200–400 ms time window are independent of each other.
The existence of two independent fronto-central positive components following N1/MMN, which might be typically intermixed in one seemingly unitary P3a, would also explain why the P3a obtained in this kind of study does not fully resemble the effects mirrored especially in the earlier ERP components (see Berti et al., 2004, 2013; Horváth et al., 2008). (An interesting idea is that P3a and "transient P2" may also fully overlap, transforming the P3a into a novelty P3. But on the basis of the present study this idea remains highly speculative.) However, as discussed by Berti (2008b) and Hölig and Berti (2010), the P3a time window may mirror two different aspects of attentional control in the context of distraction: One process of (automatic) disengagement or unhitching of attention from the present task (see also Polich, 2007) and another process of controlled attention, for instance, in the service of updating of task relevant information (see also Barcelo et al., 2006). It is noteworthy that the study by Berti (2008b) reports the elicitation of P3a without oddball-like presentation (with two types of equiprobable trials), demonstrating that rareness is not a necessary condition for the P3a and supporting the conclusion that the P3a does not mirror involuntary or automatic switching of attention *per se* (see Kopp et al., 2006). However, in the context of the present study it is possible that the unhitching of attention is mirrored in the early positive component (P2) as an effect of effective processing of the auditory change (as mirrored in the pronounced N1) while the controlled allocation of attention is mirrored in the (later) P3a. If this interpretation holds, one may conclude that the process of disruption of task processing by unhitching of attentional resources takes place around 200 ms (see peak of the P2 in **Figure 2C**). It is noteworthy that the P2 in the Distractor condition and the MMN in the Oddball condition of Experiment 1 overlap.
This might be because a deviant does not result in a strong N1 enhancement (but see the small N1 difference in **Figure 2C**), which in turn does not trigger the P2-related process. But this might be compensated for by a subsequent, additional process of deviance detection based on the processing of predictions and violations of these predictions: the MMN. Importantly, if this hypothesis of two independent processes within the 200 to 400 ms time window contributing to distraction and attentional control holds, this might also explain why the N1/MMN, P3a, and RON do not form "a strongly coupled chain" as formulated by Horváth et al. (2008) and as suggested by other findings including Berti et al. (2004).

Taken together, the present study demonstrates that a variety of neuro-cognitive processes are related to distraction as a prerequisite for flexible adaptive behavior. Especially within the comparably early processing steps, two different mechanisms contributing to behavioral distraction were identified (see also Näätänen, 1990; Escera et al., 1998; Rinne et al., 2006; Näätänen et al., 2007; Winkler et al., 2009; Berti, 2012). This fits into the perspective of a recent review by Grimm and Escera (2012) stating that different mechanisms of auditory change detection are mirrored in the early ERP and suggesting a number of factors facilitating automatic change detection. In the present study at least two different mechanisms are observable: a transient detection mechanism mirrored by the N1 component and a deviant detection or predictive coding mechanism mirrored by the MMN component. As demonstrated here, both routes of change detection trigger processes of attentional control, presumably also including change detection processes not tapped by the present methodological approach (for instance, those reflected in mid-latency responses of the human ERP, see Grimm et al., 2011). However, one question remains open: Why were the distraction effects in the present study smaller (and partly absent) in the Distractor conditions? This might be due to a number of additional factors influencing the actual behavioral effect of a change, including stimulus characteristics (e.g., Parmentier et al., 2011a; Berti, 2012), characteristics of the sequence of stimulation (e.g., Bendixen et al., 2007; Jankowiak and Berti, 2007; Horváth et al., 2008), or the informational content of a stimulus (Parmentier et al., 2010; Ljungberg et al., 2012; Wetzel et al., 2012; Li et al., 2013). For instance, Jankowiak and Berti (2007) demonstrated a standard facilitation effect which adds to the degree of distraction (i.e., RT difference between standard and deviant trials).
In other words, the RT difference between standard and deviant trials is a mixture of potential RT facilitation effects in standard trials and potential RT prolongation in deviant trials. With regard to the auditory-visual distraction paradigm, the task-irrelevant auditory stimuli serve as cues for the upcoming, task-relevant visual stimulus, which facilitates processing of the visual stimulus compared with a non-cued situation (see Escera et al., 1998). In addition, it has been shown that the auditory deviant only distracts the processing of the visual information if the auditory stimulation contains information that is relevant to the experimental task (see Parmentier et al., 2010; Ljungberg et al., 2012; Wetzel et al., 2012; Li et al., 2013). This demonstrates that the auditory stimulation is processed only if it is of potential benefit for the task at hand (e.g., as a cue). Interestingly, this can also lead to facilitation effects by deviants compared to standards (Parmentier et al., 2010; SanMiguel et al., 2010a,b; Wetzel et al., 2012). With regard to the present study, one may conclude that the difference in distraction effects in Experiment 2 is due to a lack of additional facilitation effects by the cueing and, in this sense, the distraction effects in the Distractor condition mirror "pure" distraction. Interestingly, a facilitation or cueing effect in the Oddball condition is likely because in both experiments RT in the frequent condition is shorter in the Oddball compared with the Distractor condition: Experiment 1: 420 vs. 432 ms, *t*(11) = 2.243, *p* = 0.047, Cohen's *d* = 0.185; Experiment 2: 472 vs. 494 ms, *t*(15) = 3.173, *p* = 0.006, Cohen's *d* = 0.334. In addition, it is also possible that in the Distractor condition of both experiments the rare auditory stimuli serve as non-informative cues because the coupling of the stimulus onset with the onset of the visual target is too loose.
With this, one should expect no distraction effect of the rare stimuli at all (see Parmentier et al., 2010; Ljungberg et al., 2012; Wetzel et al., 2012; Li et al., 2013). If this interpretation holds, the behavioral distraction effect obtained in the Distractor condition of Experiment 2 can be interpreted as additional support for the notion that distraction by deviants and distraction by transients are based on two distinct routes. However, even though the ERP findings may sometimes suggest a straightforward coupling of the processing of deviant and novel stimuli to behavioral distraction, the pattern of influences and interactions between different processes of change detection and of attentional allocation resembles more gradual effects of the automatic processing of environmental sensory information on behavior. Further research may elucidate the interaction of neuronal mechanisms of sensory and attentional processing in order to provide us with a fuller picture of how the gradual and effective adaptation to a wide variety of dynamic changes in the environment is realized in humans.

## **REFERENCES**


visual distraction: behavioral and event-related indices. *Brain Res. Cogn. Brain Res*. 10, 265–273. doi: 10.1016/S0926-6410(00)00044-6


## **ACKNOWLEDGMENTS**

The author thanks R. Stenner for help during data acquisition for Experiment 1, J. Adams, S. Bollmann, B. Both, N. Chelidze, T. Gubo, T. Illger, E. Kovaleva, C. Linninger, and R. Litvak for help during data acquisition for Experiment 2, C. Escera for providing the set of novel stimuli, and N. Waller for correcting the English.

revisited: evidence for a hierarchical novelty system. *Int. J. Psychophysiol*. 85, 88–92. doi: 10.1016/j.ijpsycho.2011.05.012


59, 355–363. doi: 10.1027/1618-3169/a000164


value. *Cognition* 115, 504–511. doi: 10.1016/j.cognition.2010.03.002


event-related potential study. *BMC Neurosci*. 11:126. doi: 10.1186/1471-2202-11-126


change: a new distraction paradigm. *Brain Res. Cogn. Brain Res.* 7, 71–87. doi: 10.1016/S0926-6410(98)00013-5


and vMMN) linking predictive coding theories and perceptual object representations. *Int. J. Psychophysiol*. 83, 132–143. doi: 10.1016/j.ijpsycho.2011.10.001


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 April 2013; accepted: 19 June 2013; published online: 11 July 2013.*

*Citation: Berti S (2013) The role of auditory transient and deviance processing in distraction of task performance: a combined behavioral and event-related brain potential study. Front. Hum. Neurosci. 7:352. doi: 10.3389/fnhum.2013.00352*

*Copyright © 2013 Berti. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## I know what is missing here: electrophysiological prediction error signals elicited by omissions of predicted "what" but not "when"

## *Iria SanMiguel\*, Katja Saupe and Erich Schröger*

*BioCog, Institute for Psychology, University of Leipzig, Leipzig, Germany*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Carles Escera, University of Barcelona, Spain; Harriet Brown, University College London, UK*

#### *\*Correspondence:*

*Iria SanMiguel, BioCog, Institute for Psychology, University of Leipzig, Neumarkt 9-19, 04109, Leipzig, Germany e-mail: iria.sanmiguel@uni-leipzig.de*

In the present study we investigated the neural code of sensory predictions. Building on a variety of empirical findings, we set out from the proposal that sensory predictions are coded via the top-down modulation of the sensory units whose response properties match the specific characteristics of the predicted stimulus (Albright, 2012; Arnal and Giraud, 2012). From this proposal, we derive the hypothesis that when the specific physical characteristics of the predicted stimulus cannot be advanced, the sensory system should not be able to formulate such predictions, as it would lack the means to represent them. In different conditions, participants' self-paced button presses predicted either only the precise time when a random sound would be presented (random sound condition) or both the timing and the identity of the sound (single sound condition). To isolate prediction-related activity, we inspected the event-related potential (ERP) elicited by rare omissions of the sounds following the button press (see SanMiguel et al., 2013). As expected, in the single sound condition, omissions elicited a complex response in the ERP, reflecting the presence of sound prediction and the violation of this prediction. In contrast, in the random sound condition, sound omissions were not followed by any significant responses in the ERP. These results confirmed our hypothesis, and provide support for current proposals advocating that sensory systems rely on the top-down modulation of stimulus-specific sensory representations as the neural code for prediction. In light of these findings, we discuss the significance of the omission ERP as an electrophysiological marker of predictive processing and we address the paradox that no indicators of violations of temporal prediction alone were found in the present paradigm.

**Keywords: predictive coding, Bayesian surprise, temporal orienting, predictive timing, self-generation, human, auditory cortex, efference copy**

## **INTRODUCTION**

The brain anticipates upcoming sensory stimulation. This has clear advantages: for example, we react faster and more accurately to predictable events (Anllo-Vento, 1995; Mangun, 1995), and we can detect them at lower thresholds (Hawkins et al., 1990; Luck et al., 1994; Correa et al., 2005). Prediction is intricately tied to attention, and the processing of predictable events in the brain is guided by the interaction between these two processes (Kok et al., 2012). Accordingly, sensory responses to predicted stimuli may be enhanced (e.g., when attending to an expected target; Mangun and Hillyard, 1991) or attenuated (e.g., when stimuli are self-generated; Hesse et al., 2010; Timm et al., 2013), depending on their relevance for behavior. There is overwhelming empirical evidence that prediction has a pervasive influence on stimulus processing (for a review see Bendixen et al., 2012). However, it is still unclear exactly how predictions come about and, once a prediction has been formulated, what its neural representational code is. In other words, the neurophysiological basis for a variety of prediction effects is a matter of debate.

Several findings indicate that when a particular stimulus is strongly expected, brain activity in the particular areas coding for that stimulus type is modulated. For instance, functional brain imaging in attentional cuing tasks has shown that visual cortex activity is raised during expectancy of a visual target (Kastner et al., 1999). Electrophysiological studies have demonstrated that if a sound is omitted from a predictable pattern, an auditory-like response may be emitted (Raij et al., 1997; Hughes et al., 2001; Bendixen et al., 2009). A similar result is found in associative learning and conditioning studies: if two stimuli are repeatedly presented in close succession, the presentation of the first stimulus by itself can trigger responses that would usually require the presentation of the second stimulus (Den Ouden et al., 2009). Such anticipatory responses are also triggered for actions with predictable sensory consequences; for example, when left and right button presses are paired with, respectively, the presentation of faces or houses, the button press alone can trigger activity in the corresponding content-specialized visual processing area (Kuhn et al., 2010). This collection of findings seems to indicate that strongly predicting a stimulus may trigger the activation of its neural sensory representation, much like what happens during imagination or memory retrieval (Albright, 2012).

This idea sits well with the theoretical and computational approach known as predictive coding (Friston, 2005), in which the predictive activation of sensory representations plays a fundamental role. In predictive coding, predictions are formulated in higher areas of the cortical hierarchy and are sent as top-down signals to lower areas, where they induce an expected pattern of activation. The lower area receives sensory input and contrasts it with the expected activity pattern. Any mismatch between the predicted pattern and that evoked by the input is sent to the higher area as prediction error. The same procedure is repeated in multiple hierarchical cortical levels, each area computing the difference between predictions received from the higher areas and the input received from the lower areas.

In sum, the evidence supports a model in which predictions are coded by selectively modulating the neural units whose response properties match the predicted stimulus' characteristics. However, humans are clearly also able to make unspecific predictions, i.e., knowing that something will happen now without knowing exactly what will happen. For example, we can certainly notice the difference between the adequate termination of a song and its undue interruption, even if we don't know exactly how the song would have continued otherwise. This kind of prediction is difficult to reconcile with predictive coding models: If we do not have a specific sensory representation to predictively activate, then how do we predict?

In the present study we explore the neural code of sensory predictions by inducing a strong sensory prediction and unexpectedly omitting the predicted stimulus. Following predictive coding models, early sensory responses should equal the difference between the prediction and the input. For the particular case of omissions of predicted stimuli, since there is no input, the electrophysiological response observed should be an exact mirror image of the prediction, therefore giving access to its neural representational code (SanMiguel et al., 2013). We hypothesize that, if prediction indeed relies on the activation of stimulus-specific sensory representations, it should not be possible to generate predictions when the specific stimulus characteristics are unknown. Accordingly, in a situation in which we can only predict *when* a stimulus will be delivered but not precisely *what* stimulus it will be, no prediction error signals should be observed when the stimulus is omitted. Following our interrupted song example, if there is no identity prediction (we don't know how the song would continue) and no sensory input (the song stops), there should be no mismatch between the two and hence, no prediction error. If this hypothesis is correct, then it raises the additional question of how we can notice the undue interruption of the song.
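The reasoning behind the omission design can be written out in a minimal form. This is only a simplified sketch: the symbols $s$ (sensory input), $\hat{s}$ (top-down prediction) and $\varepsilon$ (prediction error) are illustrative assumptions introduced here, not notation taken from the predictive coding models cited above.

```latex
\varepsilon = s - \hat{s},
\qquad
\text{omission: } s = 0 \;\Rightarrow\; \varepsilon = -\hat{s}
```

Under these assumptions, an omission response mirrors the prediction itself, and if no stimulus-specific prediction can be formulated ($\hat{s} = 0$), an omission yields $\varepsilon = 0$, i.e., no prediction error signal.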

## **MATERIALS AND METHODS**

#### **PARTICIPANTS**

This experiment was conducted in accordance with the Declaration of Helsinki. All participants gave written informed consent for their participation after the nature of the study was explained to them. Fifteen healthy Leipzig University students (10 women, 5 men, 2 left-handed) ranging in age from 19 to 34 years (mean = 24.1 years) volunteered to participate in the experiment. Participants either received course credits or were reimbursed for their participation. All participants had normal or corrected-to-normal vision, and reported no hearing impairment or history of psychiatric or neurological disease.

#### **STIMULI AND PROCEDURE**

The experimental task was delivered with Cogent 2000 v1.29 running on Matlab. Participants sat comfortably inside an electrically shielded chamber and fixated on a fixation cross displayed on a screen placed at a distance of approximately 100 cm from their eyes. In all conditions, participants pressed a button with the thumb of their dominant hand every 600–1200 ms on a cushioned Microsoft SideWinder Plug & Play Game Pad. In the sound conditions, button presses initiated the delivery of a sound on 87% of the trials (sound trials). Sounds were omitted on the remaining 13% of the button presses (omission trials). Omission trials were randomly placed with the restriction that the first five trials of each run of trials and the two trials immediately following an omission were always sound trials.
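The placement rules for omission trials can be illustrated with a short sketch. This is not the Cogent/Matlab code used in the experiment; the function name `make_trial_sequence` and its parameters are hypothetical, and the sketch simply draws one sequence that satisfies the stated restrictions (first five trials are sound trials, and at least two sound trials follow every omission).

```python
import random


def make_trial_sequence(n_trials, n_omissions, n_lead=5, n_after=2, rng=None):
    """Return a list of 'sound'/'omission' labels obeying the placement rules.

    Hypothetical helper: the first `n_lead` trials are sound trials and every
    omission is followed by at least `n_after` sound trials (unless the
    sequence ends first).
    """
    rng = rng or random.Random()
    # Candidate slots left once the mandatory sound trials are set aside.
    slots = n_trials - n_lead - n_after * max(n_omissions - 1, 0)
    if n_omissions and n_omissions > slots:
        raise ValueError("too many omissions for the given constraints")
    # Draw distinct slots in the compressed range, then re-insert the gaps.
    compressed = sorted(rng.sample(range(slots), n_omissions))
    positions = [n_lead + q + n_after * i for i, q in enumerate(compressed)]
    sequence = ["sound"] * n_trials
    for p in positions:
        sequence[p] = "omission"
    return sequence
```

As a usage example, `make_trial_sequence(227, 29)` gives roughly 13% omissions in one block; these per-block numbers are an assumption obtained by dividing the reported totals evenly over seven blocks, since per-block counts are not stated.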

Sound stimuli consisted of 48 different common environmental sounds rated as identifiable by an independent sample of participants (see Wetzel et al., 2010). All sounds were shortened to have a duration of 200 ms, were tapered-cosine windowed (10 ms rise- and 10 ms fall-time), root mean square (RMS) matched and presented binaurally through headphones (Sennheiser HD 25-1). Participants wore soft foam earplugs during the whole experiment in order to silence any possible noise generated by the button presses. Prior to the start of the experiment, and with the earplugs inserted, participants adjusted sound volume to a loud but comfortable level while listening to the 48 sounds presented randomly with a stimulus onset asynchrony (SOA) of 800 ± 200 ms.

Two different sound conditions were performed. In the single sound condition, the same sound was presented in all trials of one block; hence both the timing and the identity of the sound could be predicted. A total of seven different sounds were used in this condition per participant, one sound per block. Across the whole participant sample, each of the 48 sounds was used in the single sound condition at least twice. In the random sound condition, a different sound was randomly selected in every trial from the complete sample of 48 sounds; hence only the timing but not the identity of the sound could be predicted. In addition to the sound conditions, a no-sound motor control condition was included in which no sounds were delivered after the button presses.

Prior to the start of the experiment, participants performed a short training without sounds to tune to the requested timing between button presses. During training, visual feedback on the timing between button presses was presented on every trial. Training could be repeated at any point during the experiment as needed if participants lost the pace. The different condition blocks were organized in pseudorandom order as follows. The experiment was divided into three parts. In the first part, one no-sound motor control block, three single and three random sound condition blocks were performed in random order. In each of the second and third parts, one no-sound motor control block and two blocks of each sound condition were performed in random order. Every block was ∼3 min long. In total, 1386 sound trials and 203 omission trials were performed for each sound condition. A total of 600 trials were performed as no-sound motor control. Blocks could be repeated if an excessive number of trials fell outside the button-press timing limits enforced. Total experimental time was around 1 h 20 min.

#### **ELECTROENCEPHALOGRAM (EEG) ACQUISITION**

The EEG was continuously acquired at a sampling rate of 500 Hz from 64 Ag/AgCl active electrodes commonly referenced to the tip of the nose, the signal amplified by BrainVision Professional BrainAmp DC amplifiers and recorded with Vision Recorder v1.10 (Brain Products GmbH, Germany). Electrodes were mounted in an elastic cap (actiCAP, Brain Products GmbH, Germany) according to the 10% extension of the international 10–20 system (Chatrian et al., 1985). Three additional electrodes were placed in order to record eye movements, one electrode on the nasion and one below each eye (see Schlögl et al., 2007). The ground electrode was placed on the forehead.

#### **EEG PREPROCESSING**

EEG preprocessing was performed with EEGlab (Delorme and Makeig, 2004). Offline, the EEG was bandpass filtered from 1 to 100 Hz (windowed sinc FIR filter, Kaiser window, Kaiser beta 5.653, filter order 908), corrected for eye movements following Schlögl et al. (2007), and lowpass filtered (25 Hz lowpass, windowed sinc FIR filter, Kaiser window, Kaiser beta 5.653, filter order 908). Remaining artefacts were rejected by applying a 75 µV maximal signal-change per epoch threshold. A −200 to +500 ms epoch was defined around each button press. No baseline correction was applied, to avoid introducing motor preparation signals present in the baseline period into the post-stimulus waveforms (Urbach and Kutas, 2006). Epochs were averaged for each condition separately. All trials outside the 600–1200 ms button-press timing limits, the first five trials of each run of trials and the two sound trials immediately following an omission trial were excluded from analysis. On average 7.2% of the trials were rejected per condition (range 1–20.5%). These rejection rates resulted in a minimum of 145 omission trials per sound condition and 482 no-sound motor control trials included in the final averages per participant.
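The epoch rejection and trial exclusion rules can be sketched as follows. This is an illustration, not the EEGlab pipeline that was used: the function names are hypothetical, the "maximal signal-change" criterion is assumed here to mean the peak-to-peak amplitude within an epoch, and the timing of a trial is assumed to be the interval since the preceding button press.

```python
def peak_to_peak_ok(epoch_uv, threshold_uv=75.0):
    """True if no channel exceeds the peak-to-peak threshold (assumed criterion).

    epoch_uv: list of channels, each a list of samples in microvolts.
    """
    return all(max(ch) - min(ch) <= threshold_uv for ch in epoch_uv)


def trials_to_keep(labels, intervals_ms, n_lead=5, n_after=2, lo=600, hi=1200):
    """Return indices of trials in one run that survive the exclusion rules.

    labels: 'sound' or 'omission' per trial.
    intervals_ms: interval since the previous press (None if undefined).
    """
    keep = []
    skip_after = 0  # sound trials still to be dropped after an omission
    for i, (label, interval) in enumerate(zip(labels, intervals_ms)):
        follows_omission = skip_after > 0
        if label == "omission":
            skip_after = n_after
        elif skip_after > 0:
            skip_after -= 1
        # Drop the first trials of the run and trials outside the timing limits.
        if i < n_lead or interval is None or not (lo <= interval <= hi):
            continue
        # Drop the sound trials immediately following an omission.
        if label == "sound" and follows_omission:
            continue
        keep.append(i)
    return keep
```

In this sketch, a trial enters the average only if its index is returned by `trials_to_keep` and its epoch passes `peak_to_peak_ok`.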

#### **EVENT-RELATED POTENTIAL (ERP) ANALYSIS**

The presence of prediction-related activity in each sound condition was first verified by comparing omission trials to the physically equivalent silent button presses in the no-sound motor control condition, where no prediction should be present. To identify time-windows and regions of interest for this comparison, we combined a priori knowledge with an assumption-free, cluster-based random permutation procedure. The analysis was constrained by a priori knowledge on the sequence of responses elicited by single sound omissions in a recent study. Following SanMiguel et al. (2013), we expected a series of three consecutive omission responses (omission N1, N2, and P3): a first negative response, present over frontotemporal scalp locations in the time period between 0 and 100 ms, followed by a second negative response between 100 and 200 ms, maximal over the frontocentral midline, and finally a broadly distributed positive deflection between 200 and 400 ms. Hence, the statistical analysis focused on three regions of interest (ROIs): left temporal (FT7, FC5, T7, C5), frontocentral midline (Fz, FCz, Cz) and right temporal (FC6, FT8, C6, T8). Given this a priori information, the omission N2 and P3 could be clearly identified on the grand-average single sound omission waveforms. Thus, time windows of interest for these two components were defined around the deflection peaks on the frontocentral midline electrodes (oN2, 144–164 ms; oP3, 278–356 ms; see **Figure 1**). Statistical analyses for these two components were carried out on the mean amplitude over the defined time-windows and over all electrodes in each ROI. For the oN2, amplitude measures on the frontocentral midline ROI were contrasted with a two-sided, paired samples *t*-test between the omission trials and the no-sound motor control, separately for the single and random sound conditions.
For the oP3, the presence of responses was tested with a condition (omission trials, no-sound motor control) × region (left temporal ROI, frontocentral midline ROI, right temporal ROI) ANOVA, separately for the single and random sound conditions.
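The mean-amplitude measure and the paired *t*-test used for the oN2 can be sketched as below. This is a minimal illustration under stated assumptions, not the analysis code of the study: the function names are hypothetical, each averaged waveform is assumed to be a list of samples spanning the −200 to +500 ms epoch at 500 Hz, and the ROI value is taken as the unweighted mean over its electrodes.

```python
import math
import statistics


def window_mean(erp, t_start_ms, t_end_ms, epoch_start_ms=-200.0, srate_hz=500.0):
    """Mean amplitude of one waveform over an inclusive time window."""
    step_ms = 1000.0 / srate_hz
    first = round((t_start_ms - epoch_start_ms) / step_ms)
    last = round((t_end_ms - epoch_start_ms) / step_ms)
    return statistics.fmean(erp[first:last + 1])


def roi_window_mean(erps_by_electrode, roi, t_start_ms, t_end_ms):
    """Average the window means over all electrodes of a region of interest."""
    return statistics.fmean(
        window_mean(erps_by_electrode[e], t_start_ms, t_end_ms) for e in roi
    )


def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for two lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1
```

For the oN2, one value per participant would be obtained with `roi_window_mean(erps, ["Fz", "FCz", "Cz"], 144, 164)` for each condition, and the two resulting lists passed to `paired_t`; the *p*-value would then be read from a *t* distribution with the returned degrees of freedom.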

**FIGURE 1 | ERPs in sound and omission trials.** Sound and omission responses in the single sound **(top)** and random sound **(bottom)** conditions, plotted for one selected electrode in each ROI (temporal left: FT7, frontocentral midline: Fz, temporal right: FT8). Both sound and omission responses are motor-corrected via subtraction of the no-sound motor control waveform. Clear omission-related responses are present only for omissions in the single sound condition and not for omissions in the random sound condition. The analysis time-windows for the oN2 (144–164 ms) and oP3 (278–356 ms) components are indicated with gray shading on the midline ROI electrode.

The time window for the oN1, however, could not be clearly identified by the same procedure, as a slow rising negativity was present on the temporal ROIs over the whole 0–100 ms time-period (see results). For this reason, a cluster-based nonparametric permutation testing procedure implemented in the FieldTrip toolbox (Oostenveld et al., 2011) was applied on the time-courses of the electrodes included in the temporal ROIs, in order to identify clusters of interest in the time domain. This procedure follows the approach described in Maris and Oostenveld (2007). Essentially, the time-courses of the two conditions were compared with a point-by-point dependent samples *t*-test and clusters of adjacent significant points (*p* < 0.1) were identified. For each cluster, a cluster-level statistic was calculated by taking the sum of all the individual *t*-statistics within that cluster. The multiple comparisons problem was addressed using non-parametric testing at the cluster level. A comparison distribution was generated by randomly permuting the values between conditions 1000 times, and computing the cluster-level statistic in each of the permutations. A cluster was considered significant if the probability of observing a larger cluster-level statistic from the shuffled data was below 5%. The single and random omission ERPs were compared to the motor control ERPs following this procedure to identify significant temporal clusters, particularly in the early (0–100 ms) time-window. On the basis of the cluster analysis, the time-window for the omission N1 was defined (oN1, 42–92 ms), and an additional earlier time-window of interest was identified, i.e., the early negativity window (eNeg, −20 to 40 ms; see results for a closer description of the selection procedure for these two windows). Confirmatory parametric testing was additionally carried out on the mean amplitude in these windows in the temporal ROIs. Separate condition (omission trials, no-sound motor control) × hemisphere (left temporal ROI, right temporal ROI) analyses of variance (ANOVAs) were performed for the random and single sound conditions in the oN1 and eNeg time-windows.
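The cluster-based permutation procedure described above can be sketched in a few functions. This is a simplified illustration and not the FieldTrip implementation: all function names are hypothetical, the cluster-forming threshold is passed as a critical *t*-value (for 15 participants, i.e., 14 degrees of freedom, a two-sided *p* < 0.1 corresponds to roughly 1.761), condition labels are swapped within participants, and, as an assumption of this sketch, each observed cluster is compared against the largest absolute cluster sum found in each permutation.

```python
import math
import random
import statistics


def paired_t_series(cond_a, cond_b):
    """Point-by-point paired t-values; inputs are participants x time points."""
    n_sub, n_time = len(cond_a), len(cond_a[0])
    t_values = []
    for j in range(n_time):
        diffs = [cond_a[s][j] - cond_b[s][j] for s in range(n_sub)]
        mean, sd = statistics.fmean(diffs), statistics.stdev(diffs)
        if sd == 0:
            t_values.append(0.0 if mean == 0 else math.copysign(math.inf, mean))
        else:
            t_values.append(mean / (sd / math.sqrt(n_sub)))
    return t_values


def find_clusters(t_values, t_crit):
    """Return (start, end, summed_t) for runs of adjacent same-sign points
    whose |t| exceeds t_crit."""
    clusters, start, sign = [], None, 0
    for j, t in enumerate(t_values + [0.0]):  # sentinel closes a trailing run
        s = (t > t_crit) - (t < -t_crit)
        if s != sign:
            if sign != 0:
                clusters.append((start, j - 1, sum(t_values[start:j])))
            start, sign = j, s
    return clusters


def cluster_permutation_test(cond_a, cond_b, t_crit, n_perm=1000, rng=None):
    """Return (start, end, summed_t, p) for every observed cluster."""
    rng = rng or random.Random()
    observed = find_clusters(paired_t_series(cond_a, cond_b), t_crit)
    null_max = []
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(cond_a, cond_b):
            # Swap the condition labels within a participant with p = 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        clusters = find_clusters(paired_t_series(perm_a, perm_b), t_crit)
        null_max.append(max((abs(c[2]) for c in clusters), default=0.0))
    return [
        (start, end, stat, sum(m >= abs(stat) for m in null_max) / n_perm)
        for start, end, stat in observed
    ]
```

Calling `cluster_permutation_test(omission_erps, motor_erps, t_crit=1.761)` on one ROI time-course per participant would return the sample indices of each cluster together with its permutation *p*-value; converting indices to latencies requires the epoch start and sampling rate.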

After the presence of prediction-related activity for omission trials was tested in each sound condition, omission trials were directly contrasted between the single and random sound conditions in each of the time-windows of interest. In the oN1 and eNeg time windows, amplitude measures for the single and random sound omissions were contrasted with a condition (single omission, random omission) × hemisphere (left temporal ROI, right temporal ROI) ANOVA. In the oN2 time window amplitude measures on the frontocentral midline ROI were contrasted with a two-sided, paired samples *t*-test between the omission responses of the single and random sound conditions. Finally, in the oP3 time window, differences between omission responses in the single and random sound conditions were tested with a condition (single omission, random omission) × region (left temporal ROI, frontocentral midline ROI, right temporal ROI) ANOVA.

#### **TOPOGRAPHIC ANALYSIS**

ERP voltage distributions for the eNeg, oN1 and oN2 ERP components were transformed into scalp current density maps (SCD) following the method described in Perrin et al. (1989). SCD maps are reference free and indicate scalp areas where current lines emerge from or converge into the scalp, allowing an easier visual estimation of the underlying generators than scalp potential maps. For SCD analyses, the maximum degree of the Legendre polynomials was chosen to be 50, and the order of splines (m) was set to 4. A smoothing parameter of 10<sup>−4</sup> was applied.

#### **SOURCE ANALYSIS**

Brain sources for the relevant ERP responses were estimated by performing brain electrical tomography analyses, using the Variable Resolution Electromagnetic Tomography (VARETA, Bosch-Bayard et al., 2001) approach. With this technique, sources are reconstructed by finding a discrete spline-interpolated solution to the EEG inverse problem: estimating the spatially smoothest intracranial primary current density (PCD) distribution compatible with the observed scalp voltages. This allows for point-to-point variation in the amount of spatial smoothness and restricts the allowable solutions to the gray matter, based on the probabilistic brain tissue maps available from the Montreal Neurological Institute (Evans et al., 1993). This procedure minimizes the possibility of "ghost sources," which are often present in linear inverse solutions (Trujillo-Barreto et al., 2004). A 3D grid of 3244 points (voxels, 7 mm grid spacing), representing possible sources of the scalp potential, and the recording array of 64 electrodes were registered with the average probabilistic brain atlas developed at the Montreal Neurological Institute. Subsequently, the scalp potential in the latency range of the relevant components was transformed into source space (at the predefined 3D grid locations) using VARETA. Statistical parametric maps (SPMs) of the PCD estimates were constructed based on a voxel-by-voxel Hotelling *T*<sup>2</sup> test between conditions in order to localize the sources of the response. For all SPMs, Random Field Theory (Worsley et al., 1996) was used to correct the activation threshold for spatial dependencies between voxels. Results are shown as 3D activation images constructed on the basis of the average brain.

## **RESULTS**

Participants were able to maintain a stable pace between button presses keeping an average of 784 ± 46 (SD) ms between presses in the random sound condition, 805 ± 44 ms in the single sound condition and 818 ± 50 ms in the no-sound motor control. A mean of 1.3 (range: 0–3) blocks were repeated per participant.

To isolate prediction-related activity we compared electrical brain responses time-locked to button presses resulting in physically identical stimulation (i.e., no sound was delivered), but differing in the degree of prediction for upcoming sounds. No prediction for a sound should be present in blocks in which button presses never caused a sound (no-sound motor control condition). In contrast, highly precise predictions about the forthcoming sound could be formulated during the single sound condition, while only predictions about the sound onset should be generated during the random sound condition. Thus, any activity elicited by omissions of single and random sounds above the no-sound motor control responses was considered a neural reflection of prediction. Sound and omission responses for each of the sound conditions are depicted in **Figure 1**. In order to identify the sound- and omission-related responses, motor activity has been subtracted from the waveforms. Thus, the plots show the difference between the sound- or omission-related potential in the sound conditions and the response in the no-sound motor control. The ERPs show clearly identifiable omission responses in the single sound condition, when a specific prediction could be formulated, while a less consistent pattern of activity is visible in the random sound condition.

In the single sound condition, large deflections corresponding to the oN2 and oP3 components can be observed, and corresponding analysis time-windows were defined around these two peaks on the central midline ROI. Statistical analysis carried out for these two components corroborated the presence of significant omission responses in the single sound condition [oN2: *t*(14) = −4.086, *p* = 0.001; oP3: *F*(1, 14) = 29.124, *p* < 0.001] but not in the random sound condition [oN2: *t*(14) = 0.255, *p* = 0.802; oP3: *F*(1, 14) = 0.190, *p* = 0.670]. In the random sound condition, a significant condition × ROI interaction was found for the oP3 time-window [*F*(2, 28) = 10.043, *p* = 0.002]; however, *post-hoc* *t*-tests in each ROI corroborated that there was no significant response elicited in any of the ROIs [temporal left: *t*(14) = 1.104, *p* = 0.288; temporal right: *t*(14) = 1.970, *p* = 0.095; midline: *t*(14) = −1.253, *p* = 0.231]. The direct statistical contrast between responses elicited in omission trials in the single and random sound conditions corroborated the presence of larger prediction-related activity for omission trials in the single sound than in the random sound condition in both time windows [oN2: *t*(14) = −4.813, *p* < 0.001; oP3: *F*(1, 14) = 17.453, *p* = 0.001]. Again, a significant interaction between condition and ROI was found for the oP3 time-window [*F*(1, 14) = 8.805, *p* = 0.006]. Nevertheless, *post-hoc* paired comparisons corroborated that omission trials in the single and random sound conditions differed significantly in all ROIs [temporal left: *t*(14) = 4.849, *p* < 0.001; temporal right: *t*(14) = 3.331, *p* = 0.005; midline: *t*(14) = 4.070, *p* = 0.001].

In the time period between 0 and 100 ms, we expected to identify the oN1 component on the temporal ROIs; however, no clear peak can be observed in this time period but rather a sustained negativity, starting even before 0 ms (see **Figure 1**). Therefore, we adopted a data-driven approach to identify additional time periods of interest where the sound omission waveforms significantly differed from the motor control, particularly in the early time period. The results of the cluster-based permutation test are depicted in **Figure 2**. In the single sound condition, significant clusters were found corresponding to the oP3 (positive cluster 1 [PC1], 266–500 ms, *p* < 0.001 and PC2, 270–398 ms, *p* < 0.001) and the oN2 (negative cluster 2 [NC2], 94–184 ms, *p* = 0.008) components. Additionally, two significant clusters were found in the early time period. NC1 (−2 to 214 ms, *p* < 0.001) covers most of the sustained negativity and the oN2 response, and NC3 (−52 to 92 ms, *p* = 0.008) covers the earlier part of the sustained negativity. In the random sound condition, a single significant cluster was found (NC1, −28 to 60 ms, *p* = 0.041). This cluster covers approximately the same time-period as NC3 in the single sound condition, and similar deflections are observable in the omission and sound waveforms of both conditions in this time window. Thus, to be able to characterize the topographies and sources of this early negativity, a representative time-window was defined (eNeg, −20 to 40 ms), centered around the coinciding portions of the significant clusters (NC1, random condition and NC3, single condition) and covering the peak of the deflection in both conditions. Finally, the oN1 time-window was identified in the time period covered by NC1 in the single sound condition, which was not included in the eNeg or the oN2. In this time-window, a differentiated deflection can be identified in the omission waveforms of both the single and the random sound conditions; thus, a representative time-window was defined around this peak (oN1, 42–92 ms). Consistent with SanMiguel et al. (2013), the oN1 time-window also includes the Na component of the T-Complex elicited by sounds.

**FIGURE 2 | Cluster analysis on temporal ROIs.** ERPs for the temporal left **(left column)** and temporal right **(right column)** ROIs (averaged over electrodes in the ROI) are plotted for sounds and omissions in the random sound **(top row)** and the single sound **(bottom row)** condition. Both sound and omission responses are motor-corrected via subtraction of the no-sound motor control waveform. Temporal clusters in which omission responses differed significantly from the motor control are indicated under the waveforms for each sound condition (NC: Negative cluster, PC: Positive cluster). Time-windows defined on the basis of the cluster analysis for the eNeg (−20 to 40 ms) and oN1 (42 to 92 ms) components are indicated with gray shading.
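The logic of a cluster-based permutation test over time can be sketched as follows. This is an illustrative outline under stated assumptions, not the analysis pipeline used in the study: it assumes one omission-minus-motor-control difference waveform per participant, a cluster-forming threshold equal to the two-tailed critical *t* for 14 degrees of freedom, cluster mass defined as the sum of *t*-values, and random sign-flipping per participant to build the null distribution. All function names and the synthetic data are hypothetical.

```python
import math
import random

def t_values(diffs):
    """One-sample t-value per time point; diffs[s][t] is subject s, sample t."""
    n = len(diffs)
    tvals = []
    for t in range(len(diffs[0])):
        column = [row[t] for row in diffs]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / (n - 1)
        tvals.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return tvals

def find_clusters(tvals, threshold):
    """Return (first_sample, last_sample, mass) for each run of same-sign
    supra-threshold t-values; mass is the sum of t-values in the run."""
    clusters, start, sign = [], 0, 0
    for i, t in enumerate(tvals + [0.0]):  # trailing 0.0 closes an open run
        s = 1 if t > threshold else -1 if t < -threshold else 0
        if s != sign:
            if sign != 0:
                clusters.append((start, i - 1, sum(tvals[start:i])))
            start, sign = i, s
    return clusters

def cluster_permutation_test(diffs, threshold=2.145, n_perm=500, seed=0):
    """Sign-flip permutation test on omission-minus-control differences."""
    rng = random.Random(seed)
    observed = find_clusters(t_values(diffs), threshold)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            flip = rng.choice((-1, 1))  # one random sign per participant
            flipped.append([flip * x for x in row])
        masses = [abs(m) for _, _, m in
                  find_clusters(t_values(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    # p-value: share of permutations whose largest cluster is at least as big
    return [(a, b, m, sum(v >= abs(m) for v in null_max) / n_perm)
            for a, b, m in observed]

# Synthetic data (not from the study): 15 participants, 20 samples,
# with a negative deflection built into samples 5 to 9.
data_rng = random.Random(1)
diffs = [[data_rng.gauss(0.0, 1.0) + (-3.0 if 5 <= t <= 9 else 0.0)
          for t in range(20)] for _ in range(15)]
results = cluster_permutation_test(diffs)
```

Each entry of `results` is a tuple of first sample, last sample, cluster mass, and permutation *p*-value, which parallels how the negative and positive clusters are reported in the text. Comparing each observed cluster against the largest cluster of every permutation is what controls for the many time points tested.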

Statistical analyses carried out on the eNeg and oN1 time-windows in the temporal ROIs corroborated the results of the cluster-based analysis. In the eNeg time-window, the omission waveforms differed significantly from the motor control in both the random [*F*(1, 14) = 9.609, *p* = 0.008] and the single [*F*(1, 14) = 9.808, *p* = 0.007] sound conditions, while the random and single sound omission waveforms did not significantly differ from each other [*F*(1, 14) = 0.387, *p* = 0.544]. In the oN1 time-window, a significant response was only elicited by single sound omissions [*F*(1, 14) = 11.208, *p* = 0.005] and not by random sound omissions [*F*(1, 14) = 3.741, *p* = 0.074], although a trend was apparent. The direct comparison between single and random sound omissions in the oN1 time-window corroborated the presence of a larger response in the single than in the random condition [*F*(1, 14) = 6.592, *p* = 0.022].

In sum, the statistical analysis of the ERP waveforms allowed the identification of a series of responses in the omission waveforms. When participants expected to hear a sound after pressing the button, an early negativity was present at the moment of the button press (eNeg), irrespective of whether a specific or a random sound was expected. Subsequent prediction-related activity, however, was only present in the single sound condition, when a specific sound was expected. In this condition, when the predicted sound was omitted, the early negativity was followed by the omission N1, N2, and P3 responses. The scalp distribution and VARETA source estimation for each of these components are characterized in **Figure 3** (eNeg, single and random condition) and **Figure 4** (oN1, oN2, single condition). The early negativity (eNeg, **Figure 3**) shows consistent topographies and sources in both the random and single sound conditions, indicating a probable motor origin. The maximum of the source estimation is located in premotor/supplementary motor areas in the left hemisphere. Given that thirteen out of fifteen participants were right-handed, the lateralization is on average contralateral to the execution hand. In the single sound condition, both the oN1 and oN2 responses (**Figure 4**) show a scalp distribution consistent with sources in auditory cortices. The VARETA source estimation yielded similar sources on superior temporal gyrus (STG) for both the oN1 and oN2, with the oN1 showing a more posterior and right-lateralized distribution compared to the oN2. In the oN1 time-window, the VARETA solution also shows a significant source of activity in the right middle frontal gyrus (rMFG).

## **DISCUSSION**

In the present study, we tested a hypothesis on the neural substrate for sensory prediction. We hypothesized that when participants expect to hear a specific sound after pressing a button, the button press triggers the predictive activation of the sound's representation in auditory cortex. If no auditory stimulation is presented, no auditory sensory responses should be observed unless a prediction is present. Hence, we inspected brain responses elicited when the self-generated sounds were omitted after the button press. Any electrophysiological responses elicited by sound omissions should be a direct consequence of predictive activity. Following predictive coding models, we assume that early sensory responses reflect the informational difference between sensory prediction and sensory input. Given that in omission trials there was no input, in this particular case the difference between prediction and input should directly reflect the neural code of the sensory prediction. Therefore, examining early responses obtained in omission trials should help answer the question of how prediction is represented. If predicting the occurrence of a particular sound is accomplished by modulating the sensory units that form the sound's sensory representation, sound omissions should trigger a mismatch between prediction and input only in those modulated units, causing them to respond, and eliciting an auditory response. However, if the specific physical characteristics of the predicted sound are unknown, the sensory system should not be able to predict it, as it would have no means to represent it. Accordingly, in this case, if the sound is omitted, no prediction-related responses should be observed. The results support this line of reasoning: when participants' button presses always generated the same sound, sound omissions elicited an auditory-like response, followed by subsequent error signals. 
Conversely, when button presses generated a different sound on every trial, the sound omission did not elicit any significant auditory responses in the ERP and subsequent error signals were also not observed. However, in both cases the motor plan appeared to carry unspecific expectation activity.

Electrophysiological activity in premotor areas contralateral to the execution hand differed at the moment of the button press depending on whether this motor act was expected to have an associated auditory consequence or not. In both the random and the single sound condition, we observed an enhanced negativity around the time of the button press, compared to the no-sound motor control. As expected, this negativity was present irrespective of whether the sound was later omitted or not (cf. **Figures 2**, **3**), as at this moment the sound had not yet been presented either way. The effect was also identical irrespective of whether a specific sound could be predicted or not, and within this time-window, no prediction-related activity was observed in auditory areas. These facts indicate that the early negativity does not represent a specific sensory prediction, but rather some form of expectation associated with the motor act, which does not carry specific information about the predicted stimulation. These characteristics fit well with the possibility that we are observing an efference copy of the motor command. According to motor control models, whenever a motor act is planned, a copy of the motor plan (i.e., the "efference copy") is generated and sent to sensory processing areas, where the sensory consequences of this motor command can be anticipated (Crapse and Sommer, 2008). In this way, intended and achieved effects of our motor commands can be compared, providing necessary feedback for achieving efficient motor control (Wolpert and Ghahramani, 2000). Alternatively, differences in the magnitude of motor activity could be due to different amounts of attention being invested in the motor act in motor-only and motor-auditory blocks.

Prediction-related responses showing specificity for the predicted stimulation were only observed in the single sound condition. When a specific sound was expected but it was not delivered, the earliest response observed in the ERPs at the time of the sound omission was a fronto-temporal N1 portraying typical characteristics of an auditory N1 response, including sources which localized to auditory cortex. Thus, we propose that the N1 response elicited by sound omissions represents the prediction of the particular sound (see also SanMiguel et al., 2013). Moreover, the elicitation of the N1 response signals that this prediction was not matched by the input, and thus it was erroneous. The computation of the prediction error is accomplished within sensory cortex, which presumably feeds forward this information to higher cortical areas. The N1 response was followed by an anterior N2 and a P3 response. Anterior N2s are typically elicited in paradigms which tap into at least one of two core concepts: the presence of deviance or the presence of errors or conflict in the context of action monitoring (Folstein and Van Petten, 2008). These two characteristics both play a defining role in omission trials in the present paradigm: omissions were rare events, which consisted in motor acts that did not have the expected consequences, therefore indicating a possible action error. Indeed, previous studies in which actions were paired with unexpected or unintended outcomes have reported similar anterior N2 responses (Gehring et al., 1993; Falkenstein et al., 2000; Katahira et al., 2008; Gentsch et al., 2009; Iwanaga and Nittono, 2010). As in the present study, the N2 is often followed by a P3, forming the N2-P3 complex. The P3 response is thought to reflect attention orienting triggered by surprising events (Friedman et al., 2001; Escera and Corral, 2007) and the updating of mental models to integrate new information (Barceló et al., 2006; Polich, 2007). 
In line with these ideas, we propose that when a sound omission was encountered, first, a prediction error was detected, signaled here by the N1 response. This error in prediction characterizes the omission as an unexpected and thus surprising event, leading to a mobilization of attentional resources to process the event in depth. Further processing was signaled by the N2-P3 complex, which reflects cognitive control measures related to the evaluation of the error's significance in context, and the adjustment of the forward model that generated the prediction in order to minimize future error.

None of these prediction-related responses were observed when a random sound was omitted. The failure to elicit an N1 response fits our hypothesis regarding the neural representation of prediction. In this condition, no specific sound representation could be modulated in sensory cortex, hence there was no difference between the (lack of) prediction and the (lack of) input and no N1 was elicited. More surprisingly, the N2-P3 complex was also absent. One could suspect that even when a specific sound cannot be predicted, participants can still notice the absence of sound as a rare event and that this would trigger similar evaluation and cognitive control processes as when a specific sound is expected. That is, even when a prediction error response cannot be elicited in early sensory cortex, parallel routes might exist to trigger higher cognitive processing of the omission. In fact, differences in motor activity at the time of the button press do indicate that some form of expectation was present also in this condition. However, there seemed to be no consequences when this expectation was violated, as no error responses were elicited in this case. Thus, the present findings argue for a serially organized system and imply that, as long as specific identity predictions cannot be formulated, deviating events are not evaluated in depth and do not trigger attentional orienting and cognitive control measures. Although somewhat puzzling, this finding is consistent with psychological studies on what people consider a surprising event. Teigen and Keren (2003) showed that the same unusual event can be rated as more or less surprising in different scenarios, depending on whether there is one highly probable alternative or the alternative possible events are also each relatively improbable. Maguire et al. 
(2011) have proposed that, rather than being a direct function of the probability of the event, subjective ratings of surprise depend on the ease with which the event can be integrated into an existing explanatory model. This proposal is conceptually similar to Itti and Baldi's formal Bayesian definition of surprise (Itti and Baldi, 2009; Baldi and Itti, 2010). Bayesian surprise quantifies how incoming data affects an observer, by measuring the difference between the observer's beliefs before and after receiving the new data. New data that is difficult to integrate into the current explanatory model (i.e., the observer's beliefs) requires that significant changes are made to the model, thus yielding a high value of Bayesian surprise. This perspective stresses the importance of the observer's beliefs: when the observer cannot make confident predictions, any event holds little surprise value, no matter how improbable it is by itself.
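Bayesian surprise, understood as the divergence between the observer's beliefs after and before the data, can be illustrated with a small discrete example. The sketch below assumes a finite set of hypotheses and measures surprise as the Kullback-Leibler divergence from prior to posterior, in bits. The hypotheses, priors, and likelihoods are invented for illustration; in particular, modeling an observer without confident predictions as one whose hypotheses assign nearly the same likelihood to the observed event is an assumption of this sketch, not a claim from the cited work.

```python
import math

def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of hypotheses."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

def bayesian_surprise(prior, likelihood):
    """KL divergence of the posterior from the prior, in bits."""
    post = posterior(prior, likelihood)
    return sum(q * math.log2(q / p) for q, p in zip(post, prior) if q > 0)

# Two hypothetical beliefs about what a button press does, same prior in both cases.
prior = [0.9, 0.1]

# Observer with sharp predictions: the hypotheses disagree strongly about
# how likely the observed event (an omission) is.
sharp_likelihood = [0.01, 0.9]

# Observer with vague predictions: the event is improbable under every
# hypothesis, but about equally so, hence it barely changes the beliefs.
vague_likelihood = [0.05, 0.055]

surprise_sharp = bayesian_surprise(prior, sharp_likelihood)
surprise_vague = bayesian_surprise(prior, vague_likelihood)
```

In this toy setting the event is improbable for both observers, yet only the observer with sharp predictions has to revise the model substantially, so only that observer registers a large surprise value.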

Hence, the absence of the N2-P3 complex for omissions in the random sound condition might be related to a low level of surprise in this scenario. The P3 response has historically been tied to the concept of surprise (Sutton et al., 1965; Donchin, 1981), and more recent studies have been able to model trial-by-trial fluctuations in P3 amplitude using various estimations of surprise values (Mars et al., 2008; Kolossa et al., 2012). However, alternative models have rather stressed the model updating aspects of P3-eliciting events. Bayesian surprise neatly encompasses both aspects. Thus, our findings are consistent with this framework and provide a possible neurophysiological basis for the computation of surprise values. In particular, the surprise value of an improbable event might critically depend on the sensory system's capabilities for representing the observer's beliefs. If the expected event is beyond the representational capabilities of low-level sensory cortices (e.g., when it is a category of stimuli that do not share any one particular physical property), then the new data (in this case the random sound omission) would encounter no model to modify, and so it would generate a low value of Bayesian surprise.

Nevertheless, it is quite unlikely that prediction did not play any role in the processing of self-generated sounds in the random sound condition. First of all, as discussed above, motor activity was altered when participants expected the button press to have an auditory consequence. Moreover, outside the laboratory, predictions can hardly ever be made with absolute certainty about the precise physical characteristics of the stimulus. Further, there is some evidence that stimulus processing is modulated by temporal predictability regardless of whether the specific identity of upcoming stimuli can be predicted (Baess et al., 2008; Lange, 2009). Possibly, a relatively unspecific prediction could still be formulated in auditory cortices in the random sound condition, based on the few available predictable physical characteristics of the upcoming stimulus (e.g., its sensory modality, spatial location). Moreover, neural units with rather unspecific response properties are present in sensory cortices, including auditory (Jones, 1998). These unspecific units could arguably be recruited to represent imprecise predictions. While there was no significant prediction-related activity in the omission ERPs of the random sound condition in the oN1, oN2, and oP3 windows, it is worth noting that the random omission time courses do not appear to be random noise either. There are visible, albeit quite small deflections for random omissions in each of these time-windows (see **Figure 2**) which are consistent with the responses found in the single sound condition also in their scalp topography (data not shown). According to predictive coding models, the responsiveness of prediction error units (i.e., sensory units) is weighted by the precision of the available prediction (Feldman and Friston, 2010). 
Hence, the lack of significant prediction-related activity in the random sound condition could be partly explained by the low precision of the prediction in this condition, leading to a down-weighting of the prediction error response. Additionally, in the random sound condition, omissions almost exclusively incur a violation of temporal prediction. The mechanisms of temporal prediction are different from those of identity prediction. According to current models, temporal prediction is a strictly modulatory process that relies on the generation of ideal windows for stimulus processing, coinciding with the occurrence of the relevant stimuli (Large and Jones, 1999; Schroeder and Lakatos, 2009). In these temporal windows, the neural responsiveness of sensory areas is increased so that stimuli arriving at the predicted point in time receive privileged processing. Therefore, temporal prediction by itself is not expected to drive any responses if no stimulation is presented, as is the case of omission trials. The present paradigm is especially sensitive to this distinction.
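One way to read the two arguments above together is as a toy computation in which the prediction error of each sensory unit is the difference between input and identity prediction, scaled by the precision of the prediction and by a temporal gain. The sketch below is a deliberately simplified illustration: the multiplicative form, the three-unit representation, and all numbers are assumptions made here, not quantities estimated in the study.

```python
def weighted_prediction_error(template, precision, temporal_gain,
                              sensory_input=None):
    """Per-unit prediction error, scaled by precision and temporal gain.

    template: predicted activation of each sensory unit (identity prediction)
    precision: confidence in the identity prediction (assumed scalar)
    temporal_gain: responsiveness boost at the expected time point (assumed)
    sensory_input: actual activation; None stands for an omission (no input)
    """
    if sensory_input is None:
        sensory_input = [0.0] * len(template)
    return [temporal_gain * precision * (s - p)
            for s, p in zip(sensory_input, template)]

# Single sound condition: one unit carries a precise identity prediction.
single = weighted_prediction_error([0.0, 1.0, 0.0], precision=2.0,
                                   temporal_gain=1.5)

# Same template held with low precision: the response is down-weighted.
imprecise = weighted_prediction_error([0.0, 1.0, 0.0], precision=0.5,
                                      temporal_gain=1.5)

# No identity prediction at all: the temporal gain has nothing to amplify.
random_sound = weighted_prediction_error([0.0, 0.0, 0.0], precision=0.5,
                                         temporal_gain=1.5)
```

Under these assumptions an omission produces a response only in the units that held an identity prediction, and a purely temporal expectation, modeled here as gain alone, produces none.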

Temporal prediction has been mostly investigated by comparing responses to task-relevant stimuli presented at expected vs. unexpected moments in time (see Nobre et al., 2007). Typically, in these studies, the physical characteristics of the stimuli are always highly predictable. Nevertheless, a few studies have been able to show that temporal prediction by itself has no effects on early perceptual processing of task-relevant visual stimuli (Miniussi et al., 1999; Griffin et al., 2002), but has a multiplicative effect when combined with identity prediction (Doherty et al., 2005). In a review of these findings, Nobre et al. (2007) conclude that perceptual influences of temporal expectations may be dependent upon other receptive-field properties of neurons. In other words, temporal prediction can only have a modulatory effect on identity predictions that low-level sensory cortices are able to represent. Omission responses are particularly suited to investigate this claim, given that there is no input to be modulated, hence all activity is purely a reflection of predictive processes. However, temporal orienting studies investigating stimulus omissions are scarce. In a rare example, Langner et al. (2011) found greater omission-related responses in basal ganglia for specific compared to non-specific expectations when temporal prediction was held constant. The authors related basal ganglia activity to Bayesian surprise, but effects on sensory cortices were not reported.

In conclusion, the present findings support a model in which identity predictions are accomplished via forward modeling, producing a template of the predicted stimulus and making use of the receptive field properties of low-level sensory units to represent it; while temporal prediction has a strictly modulatory effect, boosting the responsiveness of the sensory units that hold the identity predictions, within the time windows in which stimuli are expected. As a consequence, the present findings indicate that electrophysiological responses commonly associated with prediction error signaling (N2, P3), and the attentional and cognitive control processes associated with them, are only elicited by violations of identity prediction. These findings raise important questions regarding the role of temporal expectation. In particular, it is unclear how violations in purely temporal (without identity) expectations can be detected.

## **ACKNOWLEDGMENTS**

Funded by the German Research Foundation (DFG, Reinhart-Koselleck project SCH 375/20–1). Data for this experiment was collected by Jue Huang, Florian Joachimski and Tobias Ay as part of a university course included in their master's studies. The authors wish to also thank Laura Schäfer for assistance during data collection and Nicole Wetzel and Andreas Widmann for providing the sound stimuli. This experiment was realized using Cogent Graphics developed by John Romaya at the LON at the Wellcome Department of Imaging Neuroscience.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 31 May 2013; accepted: 10 July 2013; published online: 29 July 2013. Citation: SanMiguel I, Saupe K and Schröger E (2013) I know what is missing here: electrophysiological prediction error signals elicited by omissions of predicted "what" but not "when". Front. Hum. Neurosci. 7:407. doi: 10.3389/fnhum.2013.00407*

*Copyright © 2013 SanMiguel, Saupe and Schröger. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## The ups and downs of temporal orienting: a review of auditory temporal orienting studies and a model associating the heterogeneous findings on the auditory N1 with opposite effects of attention and prediction

## *Kathrin Lange\**

*Institut für Experimentelle Psychologie, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Floris P. De Lange, Radboud University Nijmegen, Netherlands Stefan Berti, Johannes Gutenberg University Mainz, Germany*

#### *\*Correspondence:*

*Kathrin Lange, Federal Institute for Drugs and Medical Devices, Kurt-Georg-Kiesinger Allee 3, 53175 Bonn, Germany e-mail: kathrin.lange@bfarm.de*

The temporal orienting of attention refers to the process of focusing (neural) resources on a particular time point in order to boost the processing of and the responding to sensory events. Temporal attention is manipulated by varying the task-relevance of events at different time points or by inducing expectations that an event occurs at a particular time point. Notably, the electrophysiological correlates of these manipulations at early processing stages are not identical: Auditory studies operationalizing temporal attention through task-relevance consistently found enhancements of early, sensory processing, as shown in the N1 component of the auditory event-related potential (ERP). By contrast, previous work on temporal orienting based on expectations showed mixed results: early, sensory processing was either enhanced or attenuated or not affected at all. In the present work, I will review existing findings on temporal orienting with a special focus on the auditory modality and present a working model to reconcile the previously heterogeneous results. Specifically, I will suggest that when expectations are used to manipulate attention, this will lead both to an orienting of attention and to the generation of precise predictions about the upcoming event. Attention and prediction are assumed to have opposite effects on early auditory processing, with temporal attention increasing and temporal predictions decreasing the associated ERP correlate, the auditory N1. The heterogeneous findings of studies manipulating temporal orienting by inducing expectations may thus be the consequence of differences in the relative contribution of attention and prediction processes. The model's predictions will be discussed in the context of a functional interpretation of the auditory N1 as an attention call signal, as presented in a recent model on auditory processing.

**Keywords: temporal orienting, attention, predictability, audition, ERP, N1**

## **WHAT IS SELECTIVE ATTENTION AND HOW IS IT INDUCED?**

Our sensory systems are often exposed to a large amount of input from various external and internal sources. However, not all of this input is equally relevant to our current needs and goals. In order to deal efficiently with the resources at hand, it is therefore adaptive to prioritize the processing of certain events. The process or set of processes by which prioritized processing is attained is referred to as attention. For example, Nobre (2004) referred to attentional orienting as "the set of processes by which neural resources are deployed selectively toward specific attributes of events on the basis of changing motivation, expectation, or volition in order to optimize perception and action. Operationally, orienting can be measured through the behavioral consequences of changes in stimulus salience, predictability, or relevance" (p. 157). In addition to addressing the idea of prioritized or exclusive processing, this definition points out that attention can be oriented to various attributes of a stimulus event (e.g., spatial position or pitch) and that there are different ways to operationalize attention: manipulating stimulus salience, stimulus predictability, or stimulus relevance.

Stimulus predictability and stimulus relevance, respectively, are manipulated in two of the most frequently used paradigms in attention research: probabilistic cuing (e.g., Posner et al., 1980) and filter tasks (e.g., Cherry, 1953; Hillyard et al., 1973). In probabilistic cuing, stimulus predictability is manipulated, i.e., a symbolic cue indicates for instance the most likely spatial position of the upcoming target (e.g., Posner et al., 1980). In this task, behavioral benefits for predicted targets are attributed to an orienting of attention to the expected position (because, on average, this will be beneficial for task-performance). By contrast, in filter paradigms stimulus relevance is manipulated. For example, the spatial position of an event determines whether the event is relevant for performance, that is, whether it requires an overt (e.g., a key-press) or covert (e.g., counting) response (e.g., Cherry, 1953; Hillyard et al., 1973). Because stimuli at the other spatial location never require a response, it is assumed that processing resources are strictly focused on the task-relevant position. Stimuli presented at the relevant or irrelevant position are therefore regarded as attended or unattended, respectively.

Both probabilistic cuing and filter paradigms are associated with an orienting of attention. However, the effects that are measured in the two groups of tasks need not be identical, nor do they necessarily reflect identical (sets of) processes, including (but not restricted to) attentional processes. First, effects may be quantitatively different, i.e., attention effects may be larger in filter compared to cuing tasks. This is because in filter tasks only attended stimuli require a response, thus allowing for a strict focusing of attention. By contrast, in probabilistic cuing tasks, processing resources have to be (unevenly) *divided* between expected and unexpected stimuli, because both are associated with a response. Second, filter and cuing tasks may involve qualitatively different (sets of) processes. Cuing tasks first and foremost manipulate stimulus probability and hence participants' expectations. This is assumed to lead to an orienting of attention to the expected stimulus, similar to that in filter paradigms. However, expectations may also exert effects on stimulus processing apart from those triggered by the orienting of attention. In other words, "experimental results interpreted in terms of selective attention are often confounded by expectation effects" (Rauss et al., 2012, p. 1249). Although the conclusion that cuing tasks involve both expectations and attention may be partly obvious, it is not always acknowledged and addressed in attention research (for similar arguments see Summerfield and Egner, 2009; Nobre et al., 2012). An additional consequence of expectations is that participants may (successfully) engage in predictions as to which stimulus will be presented next. Because of this, the potential confounding of attention and expectations (and hence, predictions) may become problematic when pooling findings obtained in the different paradigms. 
Attention (i.e., the focusing of processing resources) and expectations or predictions are known to be associated with opposite effects on early event-related potentials (ERPs; see also Summerfield and Egner, 2009), particularly in the auditory modality (as detailed below). As a consequence, effects measured in scalp-recorded ERPs may differ as a function of the paradigm used—as is the case in studies related to the temporal orienting of attention (as detailed below). In the present manuscript, I will explain the different effects on the scalp-recorded ERP by assuming that both cuing and filter paradigms involve a focusing of processing resources (or orienting of attention), whereas only cuing paradigms involve processes related to stimulus prediction. To keep it simple, I assume that the attention process (the focusing of processing resources) is identical between the two paradigms (although this notion is surely debatable).

In the following, I will provide a brief overview on data corroborating the opposite effects of attention and predictability on brain correlates of early processing in the auditory modality, particularly the auditory N1 (paragraph 2). I will then discuss the problem to adequately assign the probabilistic cuing paradigm to one or the other category, although ERP data roughly resemble those obtained using filter paradigms (paragraph 3). I will then turn to the field of *temporal* orienting of attention in the auditory domain. I will start by reviewing recent ERP studies that manipulated expectations for and task-relevance of stimuli presented at particular points in time, respectively (paragraph 4), manipulations classically used to trigger an orienting of attention. The results of these studies with respect to the direction of the effect the experimental manipulation had on the auditory N1 were heterogeneous. Manipulations of task-relevance lead to increases of the N1 for stimuli at the task-relevant time point, whereas temporal expectations lead to increases, decreases, or null-effects for stimuli at the expected time point. One possibility is to ascribe this controversy to different sub-processes of attention that are involved in the two classes of paradigms. However, as an alternative, I will present a working model on the auditory N1 (paragraph 5) that is based on empirical findings that attention and prediction have opposite effects on early auditory ERPs (as reviewed in paragraph 2). The model assumes that paradigms manipulating (temporal) stimulus probabilities (such as probabilistic or rhythmic cuing) first and foremost trigger overall expectations for particular stimuli. These expectations are supposed to induce prediction processes known to decrease N1 amplitudes. 
Moreover, if stimuli require additional processing (e.g., because they are response relevant), expectations may also lead to a focusing of processing resources (i.e., an orienting of attention) to the expected stimuli. The latter process is assumed to correspond to the attentional orienting induced by task-relevance in filter paradigms, which is known to increase the auditory N1. Because of the opposite effects of these two processes, the net effect that can be measured in the ERP depends on their relative contributions. The model is used to describe and explain the pattern of results of the existing auditory temporal orienting studies. Moreover, testable hypotheses can be derived from the model, which may encourage future research.

## **ATTENTION AND STIMULUS PREDICTABILITY HAVE OPPOSITE EFFECTS ON EARLY AUDITORY ERPs**

## **EFFECTS OF ATTENTION BASED ON TASK-RELEVANCE**

Most ERP studies investigating auditory attention have used filter paradigms. In these paradigms, only stimuli sharing a certain feature (e.g., a certain spatial position) have to be evaluated with respect to their possible response relevance (e.g., Hillyard et al., 1973; Schwent et al., 1976; Näätänen et al., 1978; Giard et al., 1988; Woldorff and Hillyard, 1991). Hence, it is assumed that attention is strictly focused on the task-relevant channel, which is therefore regarded as attended. Attention effects measured in these tasks consist of enhanced negativities including an early Nd (peaking between 100 and 200 ms) with a fronto-central scalp topography and a late Nd (peaking between 300 and 400 ms) with a more anterior maximum (e.g., Näätänen et al., 1978; Hansen and Hillyard, 1980; Näätänen, 1982; Alho et al., 1987; see also Näätänen and Alho, 2004 for a review). The early Nd may also encompass a modulation of the sensory-evoked N1 (e.g., Hillyard et al., 1973; Giard et al., 1988; Rif et al., 1991; Alcaini et al., 1994; Ozaki et al., 2004). This may be regarded as evidence for a gating or filter mechanism of attention (e.g., Hillyard et al., 1973; Hillyard, 1981; Kauramaki et al., 2007; see also Hillyard et al., 1998), by which the processing of attended stimuli is favored over that of unattended ones.

## **EFFECTS OF STIMULUS PREDICTABILITY**

As opposed to attention, stimulus predictability is associated with a reduction (rather than an enhancement) of early negativities. Examples include the attenuation of the N1 elicited by auditory effects of one's own motor action (Schafer and Marcus, 1973; McCarthy and Donchin, 1976; Ford et al., 2001; Houde et al., 2002; Heinks-Maldonado et al., 2005, 2006; Martikainen et al., 2005; Bäß et al., 2008; Aliu et al., 2009; Lange, 2011) or by temporally predictable auditory stimuli (e.g., Schafer et al., 1981; Clementz et al., 2002; Ford et al., 2007; Lange, 2009; see also Vroomen and Stekelenburg, 2010)<sup>1</sup>.

The motor-induced suppression of the N1 is typically explained by forward models of motor control (e.g., Miall and Wolpert, 1996; see also Sperry, 1950; Von Holst and Mittelstaedt, 1950). According to these models, whenever an action is initiated, predictions are made with respect to its sensory consequences. The actual outcome is then compared to the predicted effect: If both match, an attenuated response results. The suggested mechanisms are similar to what is suggested by the more general predictive coding framework (e.g., Friston, 2005): In a hierarchically organized sensory system, each level of the hierarchy receives both bottom–up, sensory information from the level below and top–down, predictive information from the level above. Deviations between sensory input and predictions cause an error signal, which is then propagated to higher levels to adjust the predictions. Assuming that the error signal is reflected in the scalp-recorded ERP, predictive coding can explain the reductions of negative ERP components associated with repetition suppression and sensory-predictable standards in oddball tasks: With improving predictions, the actual sensory input will more closely match top–down predictions, resulting in a smaller error signal and, hence, a reduction of the stimulus-evoked ERPs (Baldeweg, 2007). Given the similarity between ERP attenuations observed with sensory predictions and the motor-related ERP suppressions, it seems plausible to assume that similar predictive mechanisms may be applied when deriving predictions from external stimulation and from internal motor commands (for similar arguments see also Schubotz, 2007 and Sowman et al., 2012). Hence, motor and sensory predictions may constitute different sources for a single mechanism implementing top–down predictions in perceptual processing.
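The link between improving predictions and a shrinking error signal can be illustrated with a deliberately simple sketch. It assumes a single-level, delta-rule update with an arbitrary learning rate; this is a toy stand-in chosen for illustration, not the hierarchical scheme of Friston (2005), and the function name and parameter values are hypothetical.

```python
def prediction_errors(sensory_input, n_repetitions, learning_rate=0.5):
    """Toy illustration: absolute prediction error across repetitions of one stimulus.

    Assumption (not from the source): the prediction is nudged toward the
    input by a fixed fraction of the error after every presentation.
    """
    prediction = 0.0  # no expectation before the first presentation
    errors = []
    for _ in range(n_repetitions):
        error = sensory_input - prediction    # mismatch between input and prediction
        errors.append(abs(error))             # stand-in for the size of the error signal
        prediction += learning_rate * error   # adjust the prediction toward the input
    return errors


# Four repetitions of the same (arbitrary-unit) stimulus
errors = prediction_errors(sensory_input=1.0, n_repetitions=4)
print(errors)  # [1.0, 0.5, 0.25, 0.125]
```

Under these assumptions the error halves with every repetition, which mirrors, at a purely descriptive level, the reduction of early negativities to repeated, predictable standards.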

## **PROBABILISTIC CUING: CAUGHT IN THE MIDDLE BETWEEN ATTENTION AND PREDICTABILITY**

Although probabilistic cuing is typically regarded as a manipulation of attention (e.g., Mangun and Hillyard, 1991), studies using this paradigm cannot be unequivocally classified with respect to the dichotomy between attention and predictability: On the one hand, cued and uncued stimuli occur with different probabilities, i.e., these studies manipulate stimulus predictability. On the other hand, in a typical probabilistic cuing task, all stimuli require an overt response. It is therefore highly adaptive to orient attention to the most probable—hence expected—event: On average, this will yield the highest performance. Therefore, the expected event also becomes the attended event (though presumably to a lesser degree than in filter tasks). At the same time, it is reasonable to assume that expectations are used to generate (more or less) precise predictions as to which stimulus will be presented next—similar to the predictions based on internal forward models described above. It may thus be concluded that probabilistic cuing tasks confound attention and expectation/prediction (see also Summerfield and Egner, 2009; Kok et al., 2012; Rauss et al., 2012). In spite of this confound, results obtained in ERP studies employing probabilistic cuing are similar to those measured in filter paradigms (e.g., Schröger, 1993, 1994; Schröger and Eimer, 1997; see also Mangun and Hillyard, 1991 for visual results). For example, Schröger (1993) reported a similar early negativity of transient auditory attention using both a pure filter task (Experiment 1) and a filter task combined with probabilistic cuing (Experiment 2). Assuming that probabilistic cuing induces both an increase in stimulus predictability and an orienting of attention and assuming additive effects of these processes on the auditory N1, the observable ERP effect might reflect the fact that the attention-related enhancements outweighed the reductions induced by stimulus predictability. 
Hence, the direction of the probabilistic cuing effect on the auditory ERP may depend on the different factors that contribute to the orienting of attention on the one hand and to event predictability on the other. Consistent with this notion, at a descriptive level, the attention effect in the early negativity observed by Schröger (1993) was smaller when a filter task was combined with probabilistic cuing (Experiment 2; i.e., both attention and prediction are involved) compared to the use of a pure filter task (Experiment 1; only attention is induced). Findings that are consistent with this notion are also obtained in the field of auditory temporal attention, as detailed below.

Note, however, that additivity of attention and prediction is not the only possibility. A recent application of predictive coding theory to probabilistic cuing explains enhanced ERPs to attended stimuli in the Posner task by assuming *synergistic* effects of attention and prediction. According to this notion, attention is supposed to boost the precision of prediction, thus leading to a heightened weighting of sensory evidence (Feldman and Friston, 2010). This is supposed to reverse effects of "sensory silencing," and hence leads to the typical amplitude increase in ERPs to attended stimuli in probabilistic cuing tasks (see also Kok et al., 2012). Predictive coding models have been successfully applied to explain the enhanced ERPs observed in (spatial) probabilistic cuing in the visual domain (e.g., Mangun and Hillyard, 1991) and, moreover, the predictions of these models have been corroborated by recent functional Magnetic Resonance Imaging evidence (Kok et al., 2012).

<sup>1</sup>A related phenomenon is the mismatch negativity (MMN; Näätänen et al., 1978). The MMN is a relative negativity to expectancy-violating deviants compared to expectancy-matching standards, which is typically measured in oddball tasks. The MMN can be explained by sensory adaptation (for a review see May and Tiitinen, 2010) or by an involvement of top–down expectations or their violation (e.g., Schröger and Wolff, 1996; see also Todorovic et al., 2011; Todorovic and de Lange, 2012). In classical MMN studies, the MMN is mostly related to an enhanced negativity to deviants, hence measuring expectancy-violation. However, the early negativity to standard stimuli has been shown to decrease with an increasing number of repetitions (and hence increasing predictability) of the standard (Haenschel et al., 2005; see also Todorovic et al., 2011; Todorovic and de Lange, 2012). This suggests that the MMN is partly due to a reduction of negativity as a function of the predictability of the standards (see also Baldeweg, 2007 for discussion).

## **TEMPORAL ORIENTING OF ATTENTION**

In recent years, a growing number of studies investigated the temporal orienting of attention, i.e., the selection of information for prioritized processing based on the time of stimulus occurrence (e.g., Nobre and Coull, 2010). Similar to spatial attention, temporal attention has been induced by manipulating the task-relevance of stimuli at particular time points or the expectations for stimuli at particular time points. However, whereas in the spatial domain, enhancements of early ERP components are a common finding for orienting based on both task-relevance and on expectations, these two paradigms yield somewhat different results when it comes to the temporal orienting of attention.

## **ERP STUDIES OF TEMPORAL ORIENTING BASED ON TASK-RELEVANCE**

Most of the studies inducing temporal attention by manipulating stimulus relevance used a variant of the selective attention paradigm introduced by Hillyard et al. (1973). In the temporal version of this paradigm, two sounds are presented in each trial, which are separated by a shorter or longer temporal interval (e.g., 600 vs. 1200 ms; Lange et al., 2003; see also **Figure 1A**). The first sound is a cue and marks the onset of the interval. The second sound is either a frequent standard stimulus or an infrequent deviant stimulus (e.g., louder or softer than the standard, Lange et al., 2003, 2006). Participants are asked to respond to the deviants, but only if they follow the cue after a specified time interval, marking the attended time point. Sounds presented at the other time point never require a response and can be completely ignored. Which time point is attended is either indicated prior to each block of trials (for a review see Lange and Röder, 2010) or is signaled trial-by-trial by the nature of the cue (Lange, 2012a,b). In this paradigm, only stimuli at the cued time point require further evaluation and categorization as standard or deviant. Thus, processing resources are likely dedicated predominantly to this time point, and it is regarded as attended. Because attended but not unattended deviants require an overt response, it is important to control for motor confounds. For this reason, ERPs to standard stimuli are used to measure attention effects, because standard stimuli never require an overt response.

[**Figure 1**, caption fragments only: "… Experiment 1), the symbolic cue (black note symbol) indicates the time point, …" and "… **(D)** (Lange, 2010), as reflected in the variable length of the arrows."]

Most of the studies using this approach in the domain of temporal orienting have employed auditory stimuli and found evidence that temporal attention operates early in the processing chain, as evidenced by an enhancement of the auditory N1 around 100 ms post-stimulus (see **Figure 2A**; Lange et al., 2003, 2006; Lange and Röder, 2006; Röder et al., 2007; Sanders and Astheimer, 2008; Lange, 2012a,b; see also Chait et al., 2010 for related data; but see Griffin et al., 2002, Experiment 2 for visual data suggesting later effects). Hence, the data obtained with filter paradigms consistently show that temporal orienting modulates early auditory processing as reflected in the amplitude enhancement of the auditory N1.

## **ERP STUDIES OF TEMPORAL ORIENTING BASED ON TEMPORAL EXPECTATIONS**

Expectations are derived from contingencies (or probabilistic relationships) between events, as experienced in the environment. These contingencies may not only include information relating to which event is about to occur, but also relating to when a particular event is to be expected. Within this context, the studies investigating motor-induced suppression of sensory processing (reviewed above) may be characterized as establishing a temporal contingency between motor acts and sensory events (as illustrated in **Figure 3**, right). However, temporal relationships may also be established between separate sensory events. As for these sensory-sensory contingencies, it seems useful to further distinguish between discrete and periodic events (**Figure 3**, left and middle), which are used to induce expectations in probabilistic and rhythmic cuing paradigms, respectively (probabilistic cuing: e.g., Coull and Nobre, 1998; Miniussi et al., 1999; Correa et al., 2004, 2005; see also Correa et al., 2006; Lampar and Lange, 2011, Experiment 1; rhythmic cuing: e.g., Doherty et al., 2005; Lange, 2009, 2010)<sup>2</sup>.

A specific feature of expectations in the time domain calls for a further distinction: The flow of time itself provides information as to whether an event is about to occur. The conditional probability that an event will occur at a particular time point given that it has not yet occurred increases as a function of time, being low directly after the cue has been presented and approaching certainty for the latest possible time point at which an event may occur. This relationship is referred to as the hazard function and it is known to affect both reaction times (for a review see Niemi and Näätänen, 1981) and brain activity (single-cell recordings: e.g., Riehle et al., 1997; Ghose and Maunsell, 2002; Janssen and Shadlen, 2005; Schoffelen et al., 2005; Ghose and Bearl, 2010; functional imaging data: Bueti et al., 2010). Although most ERP studies investigating the temporal orienting of attention have not addressed this issue explicitly (for an exception see Correa and Nobre, 2008), it seems worth acknowledging the difference between low vs. high conditional probability when reviewing existing findings on temporal orienting research in order to explain the different effects<sup>3</sup>.
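A small worked example may make the hazard function concrete. The sketch below assumes a hypothetical design with two possible time points (short and long) and a cue validity of 80%; this value is chosen purely for illustration and is not taken from any of the cited studies. The function name `hazard` is likewise an assumption of this sketch.

```python
from fractions import Fraction


def hazard(a_priori_probabilities):
    """Conditional probability of stimulus onset at each time point,
    given that no stimulus has occurred at any earlier time point."""
    hazards = []
    remaining = Fraction(1)  # probability that the stimulus is still to come
    for p in a_priori_probabilities:
        hazards.append(p / remaining if remaining > 0 else Fraction(0))
        remaining -= p
    return hazards


VALIDITY = Fraction(4, 5)  # hypothetical cue validity (assumption)

# A priori probabilities for [short, long] when the short vs. the long interval is cued
short_cued = hazard([VALIDITY, 1 - VALIDITY])
long_cued = hazard([1 - VALIDITY, VALIDITY])

print([float(h) for h in short_cued])  # [0.8, 1.0]
print([float(h) for h in long_cued])   # [0.2, 1.0]
```

Under these assumptions, the cue changes the conditional probability at the short time point (0.8 vs. 0.2), whereas at the long time point stimulus occurrence is certain regardless of which interval was cued.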

## *Temporal relationships between discrete sensory events: probabilistic cuing*

Temporal probabilistic cuing is a temporal variant of the Posner cuing task (Posner et al., 1980). In the temporal version, a symbolic cue indicates the time point when the target is most likely to be presented (**Figure 1B**). In most studies, two different time intervals are randomly cued, a short and a long interval (e.g., 600 and 1400 ms, Miniussi et al., 1999). The target appears with a high probability at the end of the cued interval and with a low probability at the end of the other interval. While a behavioral benefit is commonly observed using this kind of task (for a review of visual data see Correa, 2010; for auditory data see also Lampar and Lange, 2011 and the behavioral experiment reported in Lange and Röder, 2006), it remains an open question whether or not temporal probabilistic cuing operates on the same processing stages as temporal attention based on task relevance. As noted above, the flow of time itself provides information with respect to whether a stimulus will be presented. Hence, when the short interval is cued but the interval is actually long (i.e., no stimulus appears at the end of the short interval), the time of stimulus presentation becomes certain. In this case, when a response is required regardless of attention (as in probabilistic cuing), it is adaptive to re-orient one's attention to the other (i.e., the long) interval, leading to similar processing resources dedicated to the ending of the long interval in attended and unattended conditions. Because of this, behavioral effects (consisting of faster responses to temporally attended stimuli) are typically restricted to stimuli presented after a short interval (e.g., Coull and Nobre, 1998; Miniussi et al., 1999; Griffin et al., 2001, 2002; Correa et al., 2004, 2006).
The associated ERP findings are heterogeneous, however: Some studies do not find any evidence that temporal attention affects early, sensory ERP components for the short interval (visual: Miniussi et al., 1999; auditory: Lampar and Lange, 2011, Experiment 1; see also **Figure 2B**, left) or found an enhancement, but in a later component than observed in spatial attention (visual: Griffin et al., 2002, Experiment 1).

<sup>2</sup>Note that some related studies cannot be classified unequivocally in this taxonomy: Correa and Nobre (2008) and Rohenkohl and Nobre (2011) share features of both probabilistic and rhythmic cuing.

<sup>3</sup>Note that the rhythmic cuing study by Rimmele et al. (2011) cannot be classified with respect to this distinction, since conditional probability is high for the regular condition but varies for the irregular condition.

Only Correa et al. (2006), who manipulated expectations in a block-wise manner, reported an enhancement of the sensory P1 component of the visual ERP around 100 ms post-stimulus. As for the long interval, where behavioral effects are typically not observed (but see Griffin et al., 2002, Experiments 1 and 2, where more than two intervals were used to overcome the problem of temporal predictability for later than expected stimuli), ERPs seemed to be unaffected by temporal attention in the visual domain. By contrast, there is evidence that the auditory N1 to attended stimuli of the long interval is reduced by temporal attention (**Figure 2B**, right; Lampar and Lange, 2011, Experiment 1). A potential explanation relates this effect to the interplay between a priori and conditional probability, which differs between the short and the long interval. This idea will be detailed below. To summarize, existing ERP studies paint a heterogeneous picture with respect to the potential effects of expectancy-based temporal attention on early, sensory processing.

## *Temporal relationships between periodic sensory events: rhythmic cuing*

In the time domain, we do not only establish expectations by assessing the probabilities of particular temporal delays between discrete stimuli as in probabilistic cuing. Being exposed to a temporally regular, repetitive sequence of stimuli such as the ticking of a clock or the flashing of a turning signal, we expect the pattern to continue and may thus anticipate the next tick of the clock or the next flashing of the light. Several recent ERP studies investigated the impact of a temporally regular stimulus sequence on the processing of subsequent stimuli. Most studies compared stimulus processing between conditions where the target followed a regular vs. an irregular sequence (e.g., visual: Doherty et al., 2005; see also Rohenkohl and Nobre, 2011; auditory: Lange, 2009, 2010; Rimmele et al., 2011), or between conditions where stimuli followed regular sequences of different tempi (auditory: Sanabria and Correa, 2013; see also Correa and Nobre, 2008).

*Rhythmic cuing in the auditory domain.* The susceptibility of early auditory processing to the temporal orienting of attention has been demonstrated by several studies manipulating temporal attention by means of task-relevance (Lange and Röder, 2010). Hence, it came as no surprise that rhythmic cuing in the auditory domain was also associated with early, sensory effects (Lange, 2009, 2010). However, the direction of the early effect varied with the specific experimental settings: Whereas a reduction of the N1 was observed in the two experiments reported in Lange (2009), an enhancement was found in a later study (Lange, 2010).

Lange (2009) presented temporally regular or temporally irregular tone sequences prior to a target tone (**Figure 1C**; see also Doherty et al., 2005, for a visual version of this task). The tones of each sequence were presented either as a scale (ascending or descending; predictable pitch) or with pitches that varied unpredictably. The target tone followed the sequence after an interval equivalent to the omission of two steps of the regular condition. Faster responding was observed in the regular compared to the irregular condition (similar to the visual study of Doherty et al., 2005). Analysis of the ERP data showed that valid temporal expectations were associated with an amplitude attenuation in the time range around 100 ms, i.e., a reduction of the auditory N1, compared to the condition where *no* expectation was induced (**Figure 2C**; but see Rimmele et al., 2011, who found an enhancement of the N1 with a similar manipulation). Crucially, because a valid expectation condition was compared to a condition without any expectation, the observed effect can be distinguished from the family of mismatch responses (for a review see Schröger, 1998), which mainly reflect responses to expectancy violations. Additionally and consistent with other findings on rhythmic cuing (Doherty et al., 2005), temporal expectations also enhanced the P3 (see also Correa and Nobre, 2008; Rohenkohl and Nobre, 2011).

*The role of temporal predictability in rhythmic cuing.* The reduction of the auditory N1 to rhythmically expected sounds (regarded as attended in the rhythmic cuing paradigm) contrasts with findings of earlier auditory temporal orienting studies, which reported enhancements of the N1 (for a review see Lange and Röder, 2010). Because of the opposite polarities of the ERP effects, one may assume that manipulations of the two paradigms (rhythmic cuing and filter tasks) reflect separable attention processes. Hence, the reduction of the N1 might constitute a specific correlate of what may be termed rhythmic attention whereas the enhancement of the N1 might be specific to attention based on task-relevance. There is, however, an alternative explanation, which is compatible with the assumption of a single attention process with a uniform effect on stimulus processing: The reduced N1 could have reflected the increased predictability of stimuli in the rhythmic compared to the arrhythmic condition. Sensory predictability is also known to be associated with attenuated N1 amplitudes (e.g., Schafer et al., 1981; Clementz et al., 2002). In Lange (2009), the final interval was of equal duration in the rhythmic and in the arrhythmic condition to eliminate the possibility of using top–down knowledge of the last interval in the rhythmic but not the arrhythmic condition. Hence, in both conditions participants were able to predict exactly when the final sound would occur. However, the estimation of an interval benefits from its frequent presentation (e.g., Drake and Botte, 1993). Therefore, prediction might have been particularly precise when the sequence was regular, because here the same interval is frequently presented. By contrast, in studies inducing an orienting of temporal attention by manipulating task-relevance, the time point of target presentation is not predictable at the onset of a trial (for a review see Lange and Röder, 2010). 
The fact that these studies consistently observed an increased N1 to temporally attended stimuli, whereas rhythmic cuing was associated with an amplitude decrease (Lange, 2009) might thus be due to differences in temporal predictability rather than fundamental differences between different ways to manipulate temporal orienting.

A follow-up study (Lange, 2010) corroborated the notion that the N1 attenuation obtained in Lange (2009) may have been due to increased temporal predictability in the rhythmic condition. This study used basically the same paradigm as Lange (2009), i.e., a regular or an irregular sequence was presented prior to a target tone. However, the new design of Lange (2010) also included targets at time points earlier or later than the time point marked by the rhythmicity of the sequence (**Figure 1D**). Hence, the sequence could *not* be reliably used to predict the timing of target onset (which had been possible in the 2009 study). Notably, the N1 to rhythmically attended stimuli was no longer attenuated in the regular compared to the irregular condition, which is consistent with the notion that the N1 attenuation observed in the earlier study (Lange, 2009) was due to temporal prediction processes rather than rhythmic attention (see also Vroomen and Stekelenburg, 2010 for similar results). Interestingly, a small but reliable enhancement of the N1 was found for the rhythmic compared to the arrhythmic condition in Lange (2010)<sup>4</sup>. This effect is consistent with earlier findings of an enhanced N1 in auditory temporal attention studies (Lange and Röder, 2010) and may thus reflect an orienting of attention in time.

*Rhythmic cuing may affect stimulus processing both by prediction and by attention.* Further analyses showed that the N1 enhancement was only observed for stimuli in the short and medium interval condition, whereas no effect (a small reduction at the descriptive level) was found for auditory targets presented after the longest interval (**Figure 2D**). Notably, for the long interval stimulus occurrence is certain due to conditional probability. Hence, this pattern of results suggests that prediction processes affected stimulus processing even in the paradigm used by Lange (2010)—when considering the contribution of conditional probability to overall predictability, as already suggested for probabilistic cuing (Lampar and Lange, 2011, long interval data).

It may therefore be hypothesized that presenting a rhythmic sequence triggers two processes with opposite effects on stimulus processing: The first is similar to what is manipulated when task-relevance is used to induce an orienting of attention and leads to an enhancement of the N1. This process dominates when target timing is uncertain because of reduced a priori probability and/or reduced conditional probability, leading to the N1 enhancement in the short interval and medium interval conditions of Lange (2010). The second process is related to the increased predictability of stimulus onset in the rhythmic condition and leads to a reduction of the N1. This process may dominate the ERP effect when the sequence can be reliably used to predict the moment of the sound's onset, either due to the fixed a priori probability (as in Lange, 2009) or because of an increased conditional probability (as in Lange, 2010, long interval; see also Lampar and Lange, 2011, long interval). Predictions may depend on the interplay between conditional probability (that stimulus occurrence becomes more and more likely with elapsing time), a priori probability (that the final interval will take a particular value), and rhythmic expectations (that the regularity of the sequence will be continued).

<sup>4</sup>At first glance, the findings of the recent study by Sanabria and Correa (2013) are inconsistent with those of Lange (2010): These authors found reduced negativities to rhythmically cued targets, although stimulus sequences did not reliably predict sound onset. However, Sanabria and Correa (2013) compared the processing of a valid and an invalid condition. Hence, the effect may either reflect a reduced negativity to validly cued targets or an enhanced negativity to invalidly cued targets (akin to a mismatch response, e.g., Schröger, 1998).

## **EXPLAINING EFFECTS OF AUDITORY TEMPORAL ORIENTING BY OPPOSITE EFFECTS OF ATTENTION AND PREDICTION**

Summarizing the core findings, the ERP effects measured in studies operationalizing temporal attention (i.e., the focusing of processing resources to a point in time) by manipulating task-relevance and expectations, respectively, are not identical: Studies operationalizing temporal attention through task-relevance consistently report enhancements of early, sensory ERP components, whereas studies manipulating temporal expectations yield mixed results, showing either enhancements or attenuations of early, sensory processing, or no effects. Studies using filter paradigms primarily focused on auditory stimuli, whereas probabilistic and rhythmic temporal cuing have been employed both in vision and in audition. Notably, results are not less heterogeneous when considering only the auditory studies, suggesting that the discrepant findings on early, sensory effects are not due to stimulus modality, although the precise role of modality still needs to be explored. Given the pattern of results of the studies reviewed above, another explanation seems likely: The discrepancies in the ERP effects may result from the fact that the increased expectations based on regularity and probability manipulations not only induced an orienting of attention to the expected point in time, but, at the same time, induced processes of predicting stimulus onset that may have counteracted the enhancing effect of attention on N1.

In the following, I will present a working model that describes N1 amplitude as a function of a single attention process on the one hand and a prediction process on the other. Assuming that previous temporal orienting studies involved processes of attention and prediction to different degrees, this model can explain most of the partly divergent findings of previous studies with respect to the auditory N1. Moreover, it leads to novel predictions concerning the interplay between attention and prediction.

## **A WORKING MODEL ON N1 EFFECTS IN TASKS RELATED TO TEMPORAL ATTENTION**

**Figure 4** depicts the main components of the model and how they might relate. The model assumes that attention (**Figure 4**, left) and temporal prediction (**Figure 4**, right) have opposite effects on the amplitude of the N1: Attention leads to an enhancement of the N1 [hence the positive (+) influence of Attention] and prediction to an attenuation [hence the negative (−) influence of Prediction; see Equation 1].

$$\text{N1} = \text{Attention} - \text{Prediction} \tag{1}$$

The orienting of attention refers to the allocation of processing resources. The orienting of attention may rely only on task-relevance (as in filter tasks), being independent of any stimulus expectations (hence the additive component γ in Equation 2). Additionally, the allocation of processing resources may follow one's (temporal) a priori expectation that a stimulus will occur at a particular point in time (Equation 2). This depends on the a priori probability that a stimulus occurs at a particular time point relative to another event (i.e., the relative proportion of trials with a given temporal relationship between a cue and a target, as in temporal probabilistic cuing), and on the overall degree of temporal regularity in the sequence of stimuli (i.e., periodic stimulus presentation, particularly in isochronous sequences, as in rhythmic cuing). Because the rhythmic regularity of a sequence is experienced almost instantaneously whereas probabilistic features are extracted only after considering a larger number of trials, it is proposed that rhythmic cuing has a stronger influence on a priori expectations than a priori probability (i.e., β > 1, see Equation 3). Most importantly, however, it is assumed that attention will only be focused on the expected time point when stimuli at this time point are relevant for task performance, because only in this case are (additional) resources needed for stimulus processing (hence the multiplicative relation between task-relevance and expectation in Equation 2). Finally, according to the model, the orienting of attention is triggered only at the onset of a trial and does not follow the increasing conditional probability during the course of the trial (in other words, conditional probabilities do not influence the orienting of attention).

$$\text{Attention} = \text{task-relevance} \times (\text{a priori expectation} + \gamma) \tag{2}$$

$$\text{A priori expectation} = \text{a priori probability} + \beta \times \text{rhythmic regularity} \tag{3}$$

Both in cuing paradigms and in filter paradigms participants are asked to (more or less) frequently respond to targets. Because this requires a certain amount of processing resources, it is reasonable to assume that participants use any information available to most efficiently allocate their processing resources (or orient their attention), either because a subset of stimuli is expected (as in cuing paradigms) or because only a subset of stimuli requires deeper processing at all (as in filter paradigms). Hence, it is assumed that probabilistic cuing (Lampar and Lange, 2011), rhythmic cuing (Lange, 2009, 2010), and filter paradigms (e.g., Lange et al., 2003; Sanders and Astheimer, 2008) should all involve an allocation of processing resources to a subset of stimuli, and hence attentional differences between conditions. The model assumes a single attention process to be involved in cuing and filter tasks. Note, however, that it is also conceivable in principle that filter and cuing tasks involve qualitatively different mechanisms. Whether one has to distinguish selection-based and expectation-based temporal attention, and whether these involve only strategic or also automatic processes, remains open. Further research is needed to explore the precise nature of the attention process (or processes) involved. Importantly, the fact that targets are relevant for response selection seems to be crucial for orienting of attention, i.e., the mere presence of expectations *alone* should not lead to an orienting of attention. For example, in self-generation paradigms expectations about action effects are generated, but these action effects typically do not require any response. Here, a differential orienting of attention compared to the control condition should not automatically be induced (for a discussion of this point see also Lange, 2011).

The second major component of the model is temporal prediction (Equation 4). Temporal prediction is a direct consequence of (1) the a priori expectation (or global expectation; specified in Equation 3) and (2) the conditional probability (or local expectation) that a stimulus will occur at a particular time point (i.e., the hazard rate).

$$\text{Prediction} = \text{a priori expectation} \times (\alpha \times \text{conditional probability}) \tag{4}$$

Increasing a priori expectations—either based on increases in the a priori probability for a stimulus at a particular time point (e.g., for valid stimuli in probabilistic cuing or for stimuli following one's own motor action) or by presenting the stimulus as part of a rhythmically regular sequence (as in rhythmic cuing)—not only lead to a biased allocation of processing resources as outlined above. They also allow more or less precise prediction of when the next stimulus is about to occur. The possibility to precisely predict the moment of stimulus occurrence is, however, not only dependent on a priori expectations, set up at the beginning of a trial, but also on the conditional probability. When events are presented with a uniform (or "aging") distribution over time—as in the studies cited above—conditional probability relates to the increasing certainty that the stimulus will occur "right now" the more time elapses. In this case, participants can more and more precisely predict the point in time, when the stimulus is about to occur. The model assumes that the conditional probability has a modulating influence on a priori expectations, i.e., its impact on prediction is only observed when there are differences in either a priori probability or rhythmic regularity. However, the influence of conditional probability on prediction is assumed to be stronger than that of the other two components (i.e., α *>* 1).
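For reference, Equations 1 to 4 can be combined by substitution into a single expression. The abbreviation $E$ for the a priori expectation is introduced here only for brevity and is not part of the original notation.

$$\text{N1} = \text{task-relevance} \times (E + \gamma) - E \times \alpha \times \text{conditional probability}, \qquad E = \text{a priori probability} + \beta \times \text{rhythmic regularity}$$

Written this way, conditional probability enters only through its product with $E$, which is why it can modulate but not replace a priori expectations. Under the additional assumption that irrelevant stimuli are coded with a task-relevance of zero, the attention term drops out entirely and only the (negative) prediction term remains.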

According to these assumptions, in a probabilistic cuing task, invalid stimuli (low a priori probability) following the cue after a short interval (low conditional probability) are particularly unpredictable, whereas valid stimuli (high a priori probability) following after a long interval (high conditional probability) are particularly predictable (e.g., Lampar and Lange, 2011). Likewise, stimuli presented as part of a periodic sequence are more predictable than stimuli presented as part of a random sequence particularly when they match the regularity of the sequence (e.g., Lange, 2009). Finally, because of learned contingencies, sounds triggered by one's own motor act can be predicted more precisely than externally triggered sounds, since predictability of the keypress is already increased (e.g., Lange, 2011, see also Hughes et al., 2013).

The model is consistent with effects observed in temporal probabilistic cuing (no effect on the auditory N1 for the short interval and a reduction for the long interval; Lampar and Lange, 2011, Experiment 1), the reduced N1 to self-generated stimuli in the self-generation paradigm (Lange, 2011), and the reduced N1 in rhythmic attention when the rhythmic sequence reliably predicts sound onset (Lange, 2009). For rhythmic attention with uncertainty, the model as presented here predicts an enhancement for the rhythmic condition that changes to a reduction over time, which is also similar to the pattern observed in Lange (2010). Finally, the model adequately describes the N1 enhancements obtained in filter paradigms (e.g., Lange et al., 2003, 2006; Sanders and Astheimer, 2008). The only temporal orienting data that cannot be easily described by the model *as is* are the results for the long interval of Lampar and Lange (2011), Experiment 2. This experiment has elements of both a probabilistic cuing task (attended stimuli are more likely than unattended ones) and of a filter task (responses are only required for attended stimuli). The model predicts an enhancement of the N1 for both the short and the long interval. However, the pattern observed is consistent with this prediction only for the short interval. For the long interval, the opposite effect was observed. Therefore, further studies are needed to identify limiting conditions or complement the model by further variables and/or further relationships between variables.
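The probabilistic cuing pattern mentioned above can be illustrated with a minimal numerical sketch of Equations 1 to 4. All parameter values (α = 2, β = 1.5, γ = 1), the probabilities assigned to the conditions, and the function and class names are illustrative assumptions, not estimates from the reviewed studies. The output is in arbitrary units, so only the sign and relative size of differences between conditions should be read from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelParams:
    # All three values are assumed for illustration; the text only
    # states that alpha > 1 and beta > 1.
    alpha: float = 2.0  # weight of conditional probability (Equation 4)
    beta: float = 1.5   # weight of rhythmic regularity (Equation 3)
    gamma: float = 1.0  # expectation-independent attention term (Equation 2)


def a_priori_expectation(p_prior, rhythm, params):
    """Equation 3: a priori probability plus weighted rhythmic regularity."""
    return p_prior + params.beta * rhythm


def attention(task_relevance, expectation, params):
    """Equation 2: task-relevance gates expectation-based orienting."""
    return task_relevance * (expectation + params.gamma)


def prediction(expectation, p_conditional, params):
    """Equation 4: conditional probability modulates a priori expectation."""
    return expectation * (params.alpha * p_conditional)


def n1_index(task_relevance, p_prior, rhythm, p_conditional,
             params=ModelParams()):
    """Equation 1: attention minus prediction (arbitrary units)."""
    expectation = a_priori_expectation(p_prior, rhythm, params)
    return (attention(task_relevance, expectation, params)
            - prediction(expectation, p_conditional, params))


# Hypothetical probabilistic cuing task: all stimuli are task-relevant,
# valid stimuli have a higher a priori probability than invalid ones,
# and conditional probability rises from the short to the long interval.
for label, p_conditional in (("short", 0.5), ("long", 1.0)):
    valid = n1_index(1.0, 0.75, 0.0, p_conditional)
    invalid = n1_index(1.0, 0.25, 0.0, p_conditional)
    print(f"{label} interval: valid minus invalid = {valid - invalid:+.2f}")
```

With these assumed values, the valid-minus-invalid difference is zero for the short interval and negative for the long interval, which matches the qualitative pattern described for Lampar and Lange (2011, Experiment 1). The exact null at the short interval depends on the chosen numbers: for task-relevant stimuli the difference equals the difference in a priori probability multiplied by (1 − α × conditional probability).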

Notably, the model also gives rise to several novel predictions for effects of task-relevance, rhythmic or probabilistic cuing, and motor-induced predictions when different tasks are combined. First, according to the model, manipulations of task-relevance should yield smaller effects on N1 when combined with rhythmic or probabilistic cuing. This is because rhythmic or probabilistic cuing increases predictions for attended stimuli: Hence, the enhancing effect of attention on the N1 will be counteracted by the decreasing effect of (valid) predictions. Because, however, both cuing manipulations are thought to increase *both* attention *and* prediction, the reduction of the N1 effect may be relatively small. Hence, one may have to make an effort to demonstrate this empirically. Notably, however, at a descriptive level, there are findings which are well in line with this assumption: In the spatial attention study by Schröger (1993), attention effects on N1 were larger when attention was manipulated by task-relevance alone (Experiment 1) compared to a condition in which task-relevance and probabilistic cuing were combined (Experiment 2). Moreover, in previous studies of temporal orienting, the N1 effects seemed to be more pronounced in studies employing pure manipulations of task-relevance (e.g., Lange et al., 2003; Sanders and Astheimer, 2008) compared to a combination of task-relevance and probabilistic cuing (Lampar and Lange, 2011). Second, both rhythmic and probabilistic cuing may consistently lead to a reduction of the N1 when only passive stimulation is used. This is because the model assumes that the impact of rhythmic and probabilistic cuing on attention depends on the (potential) task-relevance of these stimuli.
In this case, the probability manipulations will not induce an orienting of attention, while predictions are still possible—similar to earlier studies of sensory predictability or studies of motor-induced suppression (e.g., Schafer and Marcus, 1973; McCarthy and Donchin, 1976; Schafer et al., 1981; Ford et al., 2001, 2007; Clementz et al., 2002; Bäß et al., 2008). In a related fashion, the model also predicts that the motor-induced suppression of the N1 to the effects of one's own actions may be *reduced* if participants oriented their attention to the self-elicited stimuli because they were relevant to the task at hand. In this case, stimuli rendered predictable by means of the preceding motor action would not only elicit a prediction-related decrease but in addition an attention-related increase of the N1, resulting in a reduction (or even an elimination) of the overall effect. Future studies are needed to address these hypotheses and provide evidence in favor of the basic idea represented in the model. Moreover, future research may investigate how the attention and prediction processes are related when it comes to non-temporal stimulus features.
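The prediction about motor-induced suppression can be made concrete with a short sketch, again under purely illustrative assumptions: self-generated sounds are given a higher a priori probability (0.75) than externally triggered sounds (0.25), irrelevant sounds are coded with a task-relevance of 0, and α, β, and γ take arbitrary values satisfying α > 1 and β > 1. None of these numbers or function names come from the reviewed studies.

```python
def n1_index(task_relevance, p_prior, p_conditional, rhythm=0.0,
             alpha=2.0, beta=1.5, gamma=1.0):
    """Equations 1-4 combined; all default parameter values are assumed."""
    expectation = p_prior + beta * rhythm                  # Equation 3
    attention = task_relevance * (expectation + gamma)     # Equation 2
    prediction = expectation * (alpha * p_conditional)     # Equation 4
    return attention - prediction                          # Equation 1


def suppression(task_relevance, p_self=0.75, p_external=0.25,
                p_conditional=1.0):
    """N1 index for self-generated minus externally triggered sounds."""
    return (n1_index(task_relevance, p_self, p_conditional)
            - n1_index(task_relevance, p_external, p_conditional))


passive = suppression(task_relevance=0.0)   # sounds require no response
relevant = suppression(task_relevance=1.0)  # sounds are task-relevant
print(f"passive: {passive:+.2f}, task-relevant: {relevant:+.2f}")
```

In this sketch both differences are negative, but the one for task-relevant sounds is smaller in magnitude, because the attention term partly offsets the prediction term. This is the reduced suppression that the model predicts; whether it can be eliminated altogether depends on the assumed parameter values.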

## **OPEN QUESTIONS**

Although the physiological mechanisms and the functional interpretation of the heterogeneous N1 effects still need to be identified, I will briefly discuss a tentative account of a potential physiological mechanism and a functional interpretation in the following. It is conceivable that both the N1 attenuation induced by stimulus predictability and the N1 enhancement induced by attention reflect a modulation of the frontal subcomponent 3 of the auditory N1 (according to Näätänen and Picton, 1987). Näätänen and Picton (1987) already discussed the notion that the frontal subcomponent of the N1 is attenuated by temporal predictions ("knowledge of the timing of the stimulus," p. 412) and recent studies suggest that this unspecific component is also involved in motor-induced suppression of the N1 (SanMiguel et al., 2013; Timm et al., 2013). Notably, it has been suggested recently that this component may also be subject to manipulations of temporal attention (Lange, 2012a).

The processes underlying the N1—particularly its frontal subcomponent—are most likely not involved in the perceptual analysis and the identification of specific sound attributes (e.g., Davis and Zerlin, 1966; Parasuraman and Beatty, 1980; Butler, 1972; Pratt and Sohmer, 1977, see Näätänen and Picton, 1987 for a review). Rather, there is evidence that the amplitude of the N1 is related to the detection of the onset of a sound (e.g., Davis et al., 1968; Squires et al., 1973; Parasuraman and Beatty, 1980; Parasuraman et al., 1982) and to the sound's attention-catching properties, more distracting sounds being associated with a larger N1 (Campbell et al., 2003; Rinne et al., 2006). Acknowledging these and other findings, Näätänen et al. (2011) suggest that the N1 is associated with an attention call signal, triggered by a mechanism dedicated to onset-detection. According to these authors, early auditory processing engages two parallel pathways: One dedicated to onset detection and one associated with auditory feature analysis. It is assumed that the N1 (particularly the frontal subcomponent) is generated by the detection mechanism and that the main function of the underlying process is to increase the likelihood that the outcome of the feature analysis mechanism becomes available for conscious perception (e.g., Näätänen et al., 2011).

The next step for future studies is to identify the precise physiological mechanisms behind the ERP effects of attention and predictability. If temporal attention and temporal predictions indeed modulate the frontal subcomponent of the N1 (e.g., Näätänen and Picton, 1987; see also Lange, 2012a), their respective functional roles could be to enhance and reduce the attention-catching properties of sounds and hence the likelihood of conscious sound processing. From a functional point of view, such an interpretation is highly plausible: An increased attention call for task-relevant sounds is adaptive, since these stimuli typically require an overt response, hence necessitating further conscious processing and evaluation. By contrast, in studies investigating pure effects of sound predictability, processing requirements are mostly similar for predictable and unpredictable sounds. In this case, there is less need for a mechanism promoting differential conscious processing. Moreover, when thinking of the most common instance of predictable events, i.e., all kinds of sensory events that result from our own motor actions, it is even more adaptive to reduce the likelihood of conscious processing; otherwise, we would be almost constantly distracted by what seem to be by and large irrelevant events.

## **CONCLUSIONS**

In the present paper, I briefly reviewed the heterogeneous ERP data of auditory temporal orienting paradigms using either task-relevance or expectations to induce a temporal orienting of attention. In order to explain this pattern of results, I presented a working model assuming that both manipulations activate a single attention process that enhances the auditory N1. Paradigms manipulating expectations to induce attention additionally involve prediction processes, which lead to an N1 attenuation. Open questions that are of relevance with respect to the interpretation of the enhancing and reducing effects of attention and predictions on N1 amplitudes concern the physiological mechanisms underlying these effects and their functional significance. With respect to the physiological interpretation, it needs to be investigated whether prediction and attention affect the same process in opposite directions or whether they merely co-occur in time. Future studies may address this question by employing orthogonal manipulations of attention and prediction in the same experiment to test whether or not the respective effects are additive. Moreover, the physiological mechanisms underlying these effects need to be identified: Do they constitute modulations of the sensory-evoked N1 (or one of its subcomponents) or are they due to additional, endogenous voltage shifts (see, e.g., Giard et al., 2000 for a review on a similar discussion for non-temporal attention)? The answer(s) to this question will also help to pinpoint the functional interpretation of the effects. Finally, the working model as presented here is not meant to be exhaustive. Rather, it is a first approach to explain most of the partly discrepant findings of previous temporal orienting research in a parsimonious way by taking into consideration factors that differed between studies. Hence, the model needs adaptation to accurately describe the existing data, in addition to empirical evaluation of its predictions.

## **ACKNOWLEDGMENTS**

The work reviewed in the present paper has been supported by the Deutsche Forschungsgemeinschaft (DFG), grants LA 2486/1-1 and 1-2. I thank Dr. Daniela Czernochowski for her helpful comments on an earlier draft of the manuscript.

## **REFERENCES**


*Neurootol.* 7, 303–314. doi: 10.1159/000064444


production. *Psychophysiology* 42, 180–190. doi: 10.1111/j.1469-8986.2005.00272.x


pitch. *Brain Cogn.* 69, 127–137. doi: 10.1016/j.bandc.2008.06.004


auditory deflection, explained. *Psychophysiology* 47, 66–122. doi: 10.1111/j.1469-8986.2009.00856.x


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 January 2013; accepted: 23 May 2013; published online: 11 June 2013.*

*Citation: Lange K (2013) The ups and downs of temporal orienting: a review of auditory temporal orienting studies and a model associating the heterogeneous findings on the auditory N1 with opposite effects of attention and prediction. Front. Hum. Neurosci. 7:263. doi: 10.3389/fnhum.2013.00263*

*Copyright © 2013 Lange. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.*

## Emotion regulation, attention to emotion, and the ventral attentional network

## *Roberto Viviani 1,2\**

*<sup>1</sup> Department of Psychiatry and Psychotherapy III, University of Ulm, Ulm, Germany <sup>2</sup> Institute of Psychology, University of Innsbruck, Innsbruck, Austria*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Matthias Gamer, University Medical Center Hamburg-Eppendorf, Germany Hadas Okon-Singer, University of Haifa, Israel*

#### *\*Correspondence:*

*Roberto Viviani, Department of Psychiatry and Psychotherapy III, University of Ulm, Leimgrubenweg 12-14, 89075 Ulm, Germany e-mail: roberto.viviani@uni-ulm.de*

Accounts of the effect of emotional information on behavioral response and current models of emotion regulation are based on two opposed but interacting processes: automatic bottom-up processes (triggered by emotionally arousing stimuli) and top-down control processes (mapped to prefrontal cortical areas). Data on the existence of a third attentional network operating without recourse to limited-capacity processes but influencing response raise the issue of how it is integrated in emotion regulation. We summarize here data from attention to emotion, voluntary emotion regulation, and on the origin of biases against negative content suggesting that the ventral network is modulated by exposure to emotional stimuli when the task does not constrain the handling of emotional content. In the parietal lobes, preferential activation of ventral areas associated with "bottom-up" attention by ventral network theorists is strongest in studies of cognitive reappraisal. In conditions when no explicit instruction is given to change one's response to emotional stimuli, control of emotionally arousing stimuli is observed without concomitant activation of the dorsal attentional network, replaced by a shift of activation toward ventral areas. In contrast, in studies where emotional stimuli are placed in the role of distracter, the observed deactivation of these ventral semantic association areas is consistent with the existence of proactive control on the role emotional representations are allowed to take in generating response. It is here argued that attentional orienting mechanisms located in the ventral network constitute an intermediate kind of process, with features only partially in common with effortful and automatic processes, which plays an important role in handling emotion by conveying the influence of semantic networks, with which the ventral network is co-localized. 
Current neuroimaging work in emotion regulation has neglected this system by focusing on a bottom-up/top-down dichotomy of attentional control.

**Keywords: emotion regulation, attention to emotion, ventral attentional network, thought control, scrambled sentences test, dual-process models**

## **INTRODUCTION**

Emotion and emotion regulation are important issues in clinical neurosciences because disturbed affect, impulsivity, and low control capacity are common in psychopathology. Evidence gathered from diverse strands of research, behavioral as well as based on neuroimaging methods, has shown attention to be a key mechanism for the achievement of regulatory goals. A first type of studies has shown that the manipulation of attentional load by varying cognitive processing demands may alter responses to emotional stimuli (Hariri et al., 2000; Pessoa et al., 2002; Compton et al., 2003; Banich et al., 2009; Luo et al., 2010). These studies demonstrated that increasing attentional demands (for example by varying the difficulty of the task) may attenuate response to emotional stimuli, even if the task does not require diverting attention from the emotional aspect of the stimulus set. A second strand of research has used neuroimaging to investigate the neural correlates of asking individuals to attend to their own internal emotional state and modify it in a specific direction (Schaefer et al., 2002; Lévesque et al., 2003; Ochsner and Gross, 2005; Beauregard, 2007; Wager et al., 2008). This form of explicit emotional control is often referred to as "voluntary" or "deliberate" emotion regulation (Gross and Thompson, 2007). Cognitive control mechanisms may also be important to keep thoughts out of mind that are concerned with emotional issues (Wenzlaff and Wegner, 2000; Brewin and Beaton, 2002). Additional evidence on the importance of attention for regulation has been provided by studies that considered the developmental link between emergence of attentional capacity and behavioral self-control (Diamond and Gilbert, 1989; Posner and Rothbart, 1998, 2000). 
Finally, a large number of behavioral studies have documented the effects of attending to stimuli with emotional tone (for reviews, see Mathews and MacLeod, 1994; Bradley, 2009; Yiend, 2010) and the existence of emotion-congruent attentional biases in psychopathology of affect (Williams et al., 1997; Yiend, 2010), thus underscoring the importance of the interaction between emotional and attentional processes in psychopathology.

The insight emerging from functional neuroimaging has been the recognition of the existence of distinct brain circuits that are activated on the one hand by the perceptual encoding of emotional stimuli, and on the other hand by regulatory processes. The amygdala (a brain structure that is part of the limbic system, located in the medial anterior temporal lobe) has been consistently implicated in the early perception of emotionally arousing stimuli (Davis and Whalen, 2001; Dolan, 2002; Dolan and Vuilleumier, 2003; Vuilleumier, 2005; Adolphs and Spezio, 2006). Specific mechanisms have been shown to operate on this early perception to refocus attention effectively (Dolan, 2002; Dolan and Vuilleumier, 2003; Vuilleumier, 2005). These mechanisms allow emotional material to undergo preferential processing (Vuilleumier, 2005; Phelps, 2006), prioritizing the handling of information that is likely to be relevant for the goals and the survival of the individual. In contrast, the control of emotional stimuli interfering with cognitive tasks is associated with activation of the prefrontal areas that are recruited during cognitive control and mnemonic encoding of general stimuli (Elliott et al., 2000; Compton et al., 2003; Banich et al., 2009). These areas have been shown by studies of cognition to be involved together with medial prefrontal and parietal areas in a dorsal cortical network critical to cognitive control (Rypma and D'Esposito, 1999; Smith and Jonides, 1999; Wager and Smith, 2003; Owen et al., 2005; **Figure 1A**), active when attending and giving priority to stimuli according to internal goals, rather than the perceptual or emotional salience. Consistent with their prominent role in cognitive control, these same prefrontal areas are also involved in studies of voluntary emotion regulation (see Ochsner and Gross, 2008 and Ochsner et al., 2012 for review).

**FIGURE 1 | (A)** Schematic diagram of regions associated in the left hemisphere with working memory tasks (in blue) and in negative subsequent memory effects [in red, drawn after review data presented by Uncapher and Wagner (2009)]. The ventrolateral region of the prefrontal cortex is in many studies associated with executive tasks, and its belonging to processes attributed here to the dorsal network is unclear (see text). **(B)** Schematic diagram of areas associated with the dorsal (in blue) and the ventral attentional network [in yellow; drawn after the review data of Corbetta and Shulman (2002) and Corbetta et al. (2008)]. **(C)** Schematic partition of the inferior parietal region based on cortical laminar organization in man [approximate drawing based on Caspers et al. (2006)]. **(D)** Schematic partition of the parietal and the relevant prefrontal lobes in Brodmann areas. AI, anterior insula; DLPFC, dorsolateral prefrontal cortex; FEF, frontal eye fields; IPL, inferior parietal lobule; IPS, intraparietal sulcus; MLPFC, middle frontal gyrus; VLPFC, ventrolateral prefrontal cortex; SMG, supramarginal gyrus; SPL, superior parietal lobule; TPJ, temporoparietal junction.

Interestingly, theorists of emotion regulation also mention the existence of "automatic emotion regulation" forms (Mauss et al., 2007; Phillips et al., 2008) or of unconscious varieties of emotion regulation (Bargh and Williams, 2007), which contrast with the voluntary form just mentioned. The term automatic refers here to processes evoked by the stimulus and running without monitoring (Mauss et al., 2007; Gyurak et al., 2011), or initiated without awareness and not subject to strong capacity limitations (Williams et al., 2009). However, almost all existing studies have focused on forms of emotion regulation mediated by top-down regulatory processes. The resulting account of emotion regulation is based on a dual-process model (Barrett et al., 2004) that opposes automatic sensory encoding of emotionally arousing stimuli on the one hand, and on the other an integration of attentional mechanisms and prefrontal function envisaged to account more generally for cognitive control processes (Posner and Rothbart, 1998; Posner et al., 2002; Compton, 2003; Ochsner et al., 2009a; Hofmann et al., 2012).

This account is consistent with biased competition theories of attention (Desimone and Duncan, 1995; Miller and Cohen, 2001). Emotionally arousing stimuli in bottom-up perceptual channels may be viewed as particularly effective in competing for access to short-term memory (Vuilleumier, 2005; Stanley et al., 2009), requiring strong bias by top-down processes to maintain cognitive control. Accordingly, emotion dysregulation may be seen as arising from increased reactivity to emotional stimuli (mapped to increased activation of structures such as the amygdala) or from the failure to down-regulate emotional representations through the biasing activity of prefrontal areas involved in voluntary cognitive control (Posner and Rothbart, 1998; Phillips et al., 2003a,b; DeRubeis et al., 2008). In the following, we will refer to this dual-process model as the "accepted view" of emotion regulation, since it is the one that informs most neuroimaging studies of emotion regulation and its failure in psychopathological conditions. Considerable evidence has now been gathered in neuroimaging studies on the involvement of the amygdala in the psychopathology of affect and impulsivity (Rauch et al., 2000; Herpertz et al., 2001; Whalen et al., 2002; Siegle et al., 2007), while the data on the involvement of the prefrontal areas in the psychopathology of affect are less univocal (Fitzgerald et al., 2006; Taylor and Liberzon, 2007; Thomas and Elliott, 2009).

In recent years, the dual-process model has been extensively investigated by neuroimagers. An important debate has involved the influence of top-down processes at early stages of sensory encoding of emotional stimuli, or their relative automaticity (Okon-Singer et al., 2007, 2012; Pessoa, 2008; Vuilleumier and Huang, 2009; Pessoa and Adolphs, 2010; Tamietto and de Gelder, 2010; Dolcos et al., 2011). These influences suggest that the form of automaticity of early sensory encoding in the amygdala and visual pathways consists of the low level of control minimally required for its processing. This demonstrates the importance of refining the simple dichotomy between automatic and controlled processes (Neumann, 1984), as in views emphasizing the gradualism of features defining automaticity (Moors and De Houwer, 2006). Another important issue is the capacity of the amygdala and subcortical processing to prime visual cortex for processing emotionally salient stimuli, allowing the early perception of emotional stimuli to refocus attention effectively (Dolan, 2002; Dolan and Vuilleumier, 2003; Vuilleumier, 2005; Bach et al., 2011). These refinements document the complex interaction between bottom-up sensory encoding and top-down control during appraisal and regulation of emotion, but do not question the dual-process character of the model.

A possibly greater challenge to dual-process models has emerged in neuroimaging studies of spatial attention. These studies originally set out to investigate the functional properties of areas known to be involved in hemispatial neglect, a syndrome affecting patients with parietal damage (Corbetta et al., 1993). In recent years, these studies have provided increasing evidence of the existence of a ventral network in many respects opposed to, but also interacting with, the dorsal attentional network in ensuring the correct functioning of attentional processes (Corbetta and Shulman, 2002; Corbetta et al., 2008; Shulman et al., 2010). Of course, it has been held for a long time that different forms of attention may be distinguished (James, 1890), and more specifically, that executive attention and orienting may be differentiated both functionally and on the basis of the associated brain circuits (Posner, 1980; Jonides, 1981; Müller and Rabbitt, 1989; Posner and Petersen, 1990). However, the ventral network model of attentional processes differs from previous accounts because of the description of a new class of stimuli capable of triggering orienting, and of the functional differentiation of ventral and dorsal areas (as in the parietal lobe), which act concurrently and in a coordinated fashion. Furthermore, the importance of the ventral network is emphasized even when deactivated during the execution of a focused task. As argued more extensively below (Section The Dorsal and the Ventral Attentional Networks: the Issues for Emotion Regulation), the evidence on the role of the ventral network gathered in studies of cognition suggests that at least three separate processes, instead of two, are involved in the interaction between incoming stimuli and internal goals in determining the focus of attention.

The description of a ventral attentional process, concomitantly engaged and interacting with attentional processes of executive nature, raises the question of the role played by the attentional orienting mediated by the ventral network in emotion processing and emotion regulation. This question does not by itself challenge the relevance of executive processes and their neural correlates for the deliberate control of emotion, as the evidence in this respect, briefly mentioned above, is quite extensive. Rather, it draws attention to the extent to which existing observations may carry evidence for an involvement of the ventral network that has so far eluded systematic analysis. Similarly, this question does not challenge the evidence on the effect of emotional material on the early sensory processing of stimuli. What this question does challenge, however, is the adequacy of the dual-process attention-based model to account fully for the data and the phenomenology of the interaction between emotion and attention and of emotion regulation.

In this review I will first summarize findings on the ventral attention network with the aim of highlighting the questions they raise for the dual-process view of emotion regulation and its neurobiological basis. These findings draw attention to the importance of understanding the characteristics of stimuli that activate the ventral attentional network, which belong to a category referred to as "behaviorally relevant." I will then review studies of spatial attention and memory that considered the specific effect of emotional stimuli to see if they provide any evidence on their propensity to activate this network and fall within this category. In a second step, I will review studies on voluntary and spontaneous emotion regulation, showing that some of their findings are difficult to account for in the dual process view of emotion regulation, but are compatible with a model of attentional processes that includes the specific role of the ventral network. These studies suggest that the neural substrates of forms of emotion regulation that are part of a spontaneous process, or emerge in the absence of a tightly constrained task set, overlap with the ventral, but not the dorsal, network. These results motivate the thesis advanced here that the ventral attentional network implements a form of attentional control of high importance for affective functioning, and that the state of activity of the ventral network reflects the absence or existence of proactive control processes, corresponding to the absence or existence of a task set targeting the influence of emotion on response. I will also introduce recent evidence suggesting that, in the absence of proactive control, the ventral network may steer response through modalities that differ from those of voluntary emotion regulation, and that may operate when regulation takes place spontaneously, i.e., in the absence of explicit, voluntary efforts, or arises from the spontaneous elaboration of presented stimuli.

## **THE DORSAL AND THE VENTRAL ATTENTIONAL NETWORKS: THE ISSUES FOR EMOTION REGULATION**

First described in the right hemisphere in studies of spatial attention, the ventral network interrupts and resets attention to behaviorally salient stimuli, while the task of the dorsal network is to maintain the locus of attention in the face of distraction, select stimuli according to prior information or goals, and coordinate responses (Corbetta and Shulman, 2002; Corbetta et al., 2008). The dorsal network includes dorsal frontal areas near or at the frontal eye fields (FEF), related to the dorsolateral prefrontal cortex (DLPFC), and the intraparietal sulcus (IPS) and the superior parietal lobule (SPL) in the parietal lobe (**Figure 1B**, blue). The DLPFC constitutes the neural substrate of top-down control processes associated with limited capacity (Posner and Presti, 1987), executive (Kane and Engle, 2002; Baddeley, 2003) and biased attention models of cognitive control (Desimone and Duncan, 1995; Miller and Cohen, 2001). The parietal node of the dorsal network is co-activated by recruitment of these top-down processes, but (at least in studies of spatial attention) is also active when attention is reoriented by external stimuli due to their sensory salience (Corbetta et al., 1993; de Fockert et al., 2004). Unlike the dorsal network, the ventral network is not activated by representations of goals or expectations, but responds together with the dorsal network when initially unattended objects of relevance to the task are detected (Corbetta et al., 2000). Core regions of the ventral network are the inferior parietal lobule (IPL) and the adjacent temporo-parietal junction (TPJ). In the prefrontal lobe, studies of spatial attention have associated the ventrolateral prefrontal cortex/inferior frontal gyrus (VLPFC/IFG) and the anterior insula with the ventral network (Corbetta et al., 2008; **Figure 1B**, yellow).

Unlike the dorsal network, activity in the ventral network is reduced during focussed tasks relative to fixation or baseline. However, the ventral network can be briefly activated together with the dorsal network in circumstances in which attention is refocused by external stimuli of behavioral relevance (Corbetta et al., 2008). This observation gave rise to the hypothesis that deactivation has a functional interpretation as a form of filter on the perceptual and semantic encoding of the stimuli (Todd et al., 2005; Shulman et al., 2007). Because activation in the dorsal network is implicated in maintaining expectations of incoming stimuli, the deactivation may originate in the interaction with the dorsal network (Corbetta et al., 2008). Below, I refer to such task set-related processes as "proactive control."

An early hypothesis formulated by spatial attention researchers is the applicability of their findings, in suitably generalized form, to selection and attendance of thoughts and memories in a semantic space (Posner, 1980). The inferior parietal cortex, in particular, hosts multimodal semantic association areas (Downar et al., 2000), and appears to be involved in perceptual tasks of apparently diverse nature (Husain and Nachev, 2007). Some of these tasks, such as oddball or go/no-go paradigms, involve the detection of salient items within a longer sequence (Linden et al., 1999; Clark et al., 2000; Marois et al., 2000; Downar et al., 2002). This variety of sources of parietal activation is consistent with findings in non-human primates. Recordings of firing activity in parietal neurons have provided evidence of a role of parietal cortex in dynamically computing priorities guiding selection not only for spatial orienting, but also for abstract rule-based actions (Gottlieb, 2006; Andersen and Cui, 2009; Bisley and Goldberg, 2010; Freedman and Assad, 2011). These recent studies confirm the topicality of traditional views of parietal cortex as a sensory integration or multimodal association area (Critchley, 1953) hosting abstract representations of external space (Bisiach et al., 1979; Mesulam, 1981). They show that the common denominator of parietal involvement is the integration of visuospatial and behavioral information from different sources, irrespective of whether response involves eye or limb movements or goal-directed choice. These data support the extension of models developed in studies of spatial attention to more general forms of computation of priorities for the generation of responses.

In man, a dorsal/ventral parietal dissociation in the left parietal region with several features analogous to those present on the right has been identified when selecting internal representations arising during memory tasks (Cabeza et al., 2008, 2012; Ciaramelli et al., 2008; Uncapher and Wagner, 2009; see, however, Sestieri et al., 2010) and when detecting changes in stimuli in the absence of a specific task (Downar et al., 2000). In neuroimaging studies of episodic memory, activation in the IPL at memory encoding has been noted to correlate with lower later recall (**Figure 1A**, red; for a systematic review, see Uncapher and Wagner, 2009). These negative memory effects (negative correlation between memory performance and IPL activity) are thought to result from slip-ups of processing away from the information to be encoded (Otten and Rugg, 2001; Wagner and Davachi, 2001). Led by the hypothesis that the IPL may be associated with shifts of spatial attention away from the focus (Corbetta and Shulman, 2002), memory researchers have proposed an analogous role for IPL in memory (Cabeza et al., 2008; Ciaramelli et al., 2010), leading to the processing of task-irrelevant thoughts or stimulus features (see also Li et al., 2007 and Congdon et al., 2010 for analogous results in attentional tasks). In these accounts, the attentional function of the ventral network is clearly different from the maintenance of goal processing in the face of distraction or interference attributed to executive attention. Rather, IPL appears here to be associated with internal reorienting as a kind of attentional interrupt, leading to disengagement from the current focus and turning attention to a new item. In some cases, this reorienting function may contribute to performance by favoring the processing of atypical or infrequent aspects of the stimulus (Uncapher and Wagner, 2009), which may be of advantage in interaction with goal-oriented processing in a complex and varying environment. In other cases, this reorienting function may lead to performance decrements. In this body of literature, it is customary to use "top-down" and "bottom-up" attention to refer, respectively, to the goal-directed allocation of attention to anticipated memory targets and to the orienting of attention to automatically recollected memories (Cabeza et al., 2012). Here, ventral network activity is associated with processing an internally generated distracter.

However, researchers of spatial attention have also pointed out the capacity of the ventral network to steer attention irrespective of the perceptual salience of the stimulus or distracters. Here, the ventral network appears to control attention in the face of sensory salience. In dual-process accounts, top-down control fails when the high salience of a stimulus leads to winning the competition even in the face of top-down bias (Yantis and Jonides, 1984; de Fockert et al., 2004). In ventral network orienting, in contrast, it is rather the relevance to behavioral goals or long-term memory associations that determines refocusing of attention (Downar et al., 2001; Serences et al., 2005; Kelley and Yantis, 2010; Cabeza et al., 2012). This shows that the kind of salience that activates the ventral network differs from sensory salience, and appears to reflect both broad representations of active goals and long-term information. Indeed, ventral network orienting has been shown in specific conditions to override orienting to sensory salience (Indovina and Macaluso, 2007). Recent data show that reorienting can also take place to stimuli that have been made previously relevant by reward conditioning (Anderson et al., 2011; for a review of related findings, see Awh et al., 2012). This suggests that the salience encoded in ventral network areas may not only be defined online by the current task, but also by emotional experience.

Because ventral network-based refocusing of attention dissociates functionally and anatomically from both sensory salience-based orienting and executive attention (Corbetta et al., 2008; Anderson et al., 2011; Cabeza et al., 2012), a comprehensive model of attention that includes ventral network orienting must differ from a dual-process theory of control as exemplified by the accepted view of emotion regulation. In the following sections I intend to address two issues ensuing from the dissociation of dorsal and ventral networks. The first is the evidence that emotional content modulates activation in areas of the ventral network. A positive answer would suggest that emotional tone belongs to the category of "behavioral relevance" attributed to stimuli that capture attention through the ventral network. To see if emotional material tends to recruit the ventral network more than neutral material, I will review studies on the effect of emotional material on attention and memory in Section Neuroimaging Studies of Attention to Emotion and Memory below. The second issue is what kind of control function the ventral network may implement when applied to emotional content. This issue will be addressed by reviews on the differential involvement of dorsal and ventral areas in studies of emotion regulation in sections Neuroimaging Studies of Voluntary Emotion Regulation and Thought Control and Spontaneous Avoidance of Negative Material.

Although the ventral network described in studies of spatial attention extends to ventral right prefrontal areas, the present review will be restricted to the parietal region. One reason for this focus is that the involvement of this hub of the ventral network is well-documented in several tasks. In particular, a parallel dissociation between dorsal and ventral areas has been characterized not only in the right hemisphere in spatial attention studies, but also in the left in studies of memory (Cabeza et al., 2012). Another reason is the complexity of the debate on the function of ventral prefrontal regions in the left hemisphere. While some have provided evidence for the involvement of ventral prefrontal areas in short-term memory tasks, in contrast to involvement of dorsal regions in more demanding working-memory tasks (Rypma and D'Esposito, 1999; Rypma et al., 1999), others have shown its involvement in executive tasks with a high degree of interference between stimuli (Demb et al., 1995; Thompson-Schill et al., 1997; Wagner et al., 1997; Jonides et al., 1998; for a recent overview, see Schulz et al., 2009). In the right hemisphere, VLPFC/IFG is active in tasks where response needs to be inhibited (for reviews, see Aron et al., 2004; Chambers et al., 2009), but the specificity of this observation has been called into question (Sharp et al., 2010). These are complex issues that require more attention than can be given here.

The inferior parietal region is a relatively large area that in man may be differentiated further on the basis of the laminar organization of the cortex (**Figures 1C,D**; Caspers et al., 2006, 2013), corresponding to different preferential connectivity patterns (Caspers et al., 2011; Mars et al., 2011). As this differentiation suggests, there are functional specializations within this region (Hutchinson et al., 2012; Mars et al., 2011). However, the focus of the present review is on the general characterization of the cognitive processes that may be hosted here, in contrast with those associated with dorsal regions. Drawing on views from neuropsychology (Mesulam, 1981) and the non-human primate literature (Gottlieb, 2006; Andersen and Cui, 2009; Bisley and Goldberg, 2010; Freedman and Assad, 2011), I will view the parietal region as involved in computing priorities for attentional selection and choice from information from different modalities, based on both sensory salience and behavioral relevance. This view is not inconsistent with the existence of subregions that are further specialized, for example by the type of information they integrate. The notion of ventral attentional network draws on neuroimaging data in man suggesting that sensory salience and top-down control map onto the dorsal portion of the parietal lobule, and behavioral relevance onto its ventral portion.

## **NEUROIMAGING STUDIES OF ATTENTION TO EMOTION AND MEMORY**

The studies of attention to emotion that will be considered here are those where emotional stimuli are used as cues (as in spatial attention paradigms) or as distracters, or are present in the stimulus set without providing information for the task. This restriction is justified by the consideration that, if emotional content is selected on the basis of a voluntary effort, there is no reason to assume that this selection cannot take place on the basis of processes instantiated in the dorsal attentional network (i.e., the dual-process and the alternative model presented here lead to the same prediction). Studies will be considered that report on the contrast between the effect of emotional and non-emotional stimuli. This contrast provides evidence on a specific effect of emotional material on the ventral network, not just on the possible activation of the ventral network during the task.

## **SPATIAL ATTENTION TASKS**

Relatively few neuroimaging studies in healthy participants have combined spatial attention paradigms with emotional stimuli (**Table 1**). A few studies used emotional stimuli as non-informative cues presented simultaneously to elicit covert reflexive orienting to the emotionally salient cue. The rationale of these studies is to contrast emotional and neutral non-informative cues in order to investigate covert orienting to emotional cues as a manifestation of preferential processing. Two studies using emotional stimuli conditioned to aversive events found activations that extended from the IPS into the supramarginal gyrus and anteriorly toward the secondary sensory cortex (Fredrikson et al., 1995; Armony and Dolan, 2002). In contrast, a recent study in which the cue conveyed information about the magnitude of reward reported only weak or no effect on ventral parietal or temporoparietal regions (Tosoni et al., 2013). In this study, however, the cue could not be ignored, as it instructed on the position of the target. Pourtois et al. (2006) is a neuroimaging study of particular relevance to the present issue, since it looked at effects of non-informative cues explicitly to disentangle effects in the dorsal and ventral attentional systems using fearful and happy faces. They found that fearful emotional cues additionally activated a temporo-parietal-occipital region, which was associated with changes of activation in the dorsal attentional system at the presentation of the target. The locations in the temporo-parietal region were more posterior than those of the previous two studies. However, the emotional salience of the cues in the Pourtois et al. (2006) study was not acquired through conditioning, but presumably biologically determined.

In summary, these findings do suggest that emotional material in non-informative cues co-activates the ventral attentional system, albeit in different parts of the IPL depending on the nature of the emotional salience. As in the spatial attention studies, activations of the ventral and dorsal systems occur together (both the ventral and the dorsal system were activated in parietal areas, **Figure 2**).

#### **Table 1 | IPL involvement in studies of attention to emotion.**


*Lat, laterality of effect; R, right; L, left; L∼R, about equal lateralization; L<R, predominantly right-lateralized; L>R, predominantly left-lateralized; the asterisk marks lateralizations relative to deactivations. IPL finding, report of an effect in the inferior parietal lobe or in the temporo-parietal region. The symbol "Yes +" indicates that the IPL effect was the strongest across the brain. Effects reported for the comparison emotion vs. neutral (either directly, or in an interaction/simple effects contrast, as appropriate). For criteria of selection of studies in the table, see the Methods section.*

## **DISTRACTER TASKS**

A larger number of studies have investigated the effect of attention to emotion in tasks originally devised to study cognitive interference. In these studies, emotional stimuli appear as distracters. In biased competition models of attention, these stimuli require more top-down effort to be suppressed. Here, however, the focus will be on their capacity to co-recruit the IPL, the parietal hub of the ventral network associated with attentional capture on the basis of the "behavioral relevance" of the stimulus. In some studies, distracters are physically distinct, but spatially or temporally contiguous to the target stimuli; in others, the emotional tone is present as a dimension of the stimulus that is of no relevance for the formulation of the task response.

The most obvious use of emotional material is as a spatial distracter in a selection or identification task (Eriksen and Eriksen, 1974). These studies show that emotional material as a distracter does not *per se* activate the IPL (Vuilleumier et al., 2001; Ochsner et al., 2009b). However, recruitment of the IPL was reported by studies where emotional stimuli were presented as temporally dislocated distracters (**Figure 3**, in yellow). Temporal distracters have an interfering effect on reports on the identity of a stimulus if they immediately precede or follow the target stimulus (Broadbent and Broadbent, 1987). In a study using a distracting non-informative word displayed briefly prior to the target word in a rapid serial presentation (Luo et al., 2007), the emotional and the sensory salience (supra- and subliminal) were varied independently. Increasing interference from distracters from subliminal to supraliminal increased activation in the SPL, while increasing interference by adding emotional valence increased activation in the IPL in supraliminal presentations. In this study, the effect of valence in the right IPL was the largest in the contrast.

**FIGURE 2 | Lateral rendering of foci for spatial attention studies with spatial cues with emotional tone (in yellow) on the PALS atlas (Van Essen et al., 2001).** On the surface of the rendering, histological classification of the inferior parietal lobe in man (from Caspers et al., 2006; cf. **Figure 1C**). On the left, joint projection of foci from both hemispheres; on the right, separate rendering for left and right hemispheres.

Of particular relevance for the present issue is also the study by Mitchell et al. (2008), where emotional distracters of differing valence preceded and followed the target of a simple discrimination task. The IPL was activated together with the SPL by the task, but only the IPL was shown to interact with the presence or absence of emotional content in the distracters. *Post-hoc* analysis of this interaction showed that when distracter images were shown without the target, activity for negative distracters was highest and positive distracters lowest, while during the task the effect of valence was reversed. This suggests that institution of a task set resulted in different reactivity of the IPL to specific kind of distracter images, leading to suppression of the negative valence signal. Furthermore, activity in the IPL during the task was negatively correlated with amygdala reactivity to emotional valence. In this study, the only effect in the interaction was observed in the IPL. This study is consistent with the model of a filtering function of IPL for selecting relevant stimuli modified by the task set, showing that emotional stimuli are particularly effective in eliciting variations in the activity of the region. The findings are also consistent with the dissociation between this level of processing of emotional stimuli and the sensory encoding taking place in the amygdala.

In other studies where emotional stimuli appeared during the retention interval of a short-term memory task, however, no effect of emotion in the IPL was reported (Dolcos and McCarthy, 2006). Instead, these studies consistently reported relative deactivation of the IPL with emotional material (deactivation or less activation than in the neutral condition; Dolcos and McCarthy, 2006; Dolcos et al., 2008; Chuah et al., 2010; Denkova et al., 2010; Oei et al., 2011; Iordan et al., 2013; see blue foci of **Figure 4**).

The second group of studies used emotional tone as a task-irrelevant dimension of the target. Among the first studies in this group were those investigating the emotional Stroop. In the standard Stroop, the correct choice in the task and the distracting feature in the stimulus interfere directly. This task elicits strong activation in regions associated with attention and working memory (**Figure 1A**), consistent with the need to maintain task-relevant information online, exclude distracting features of the stimulus, and override a prepotent response tendency (Pardo et al., 1990; Banich et al., 2000). In the emotional Stroop, in contrast, the emotional tone of the stimulus may be a distracter only in virtue of its salience, and there is no direct conflict with response (Algom et al., 2004); accordingly, the interference-related activations in dorsal control areas are smaller (George et al., 1994). Whalen et al. (1998a) reported no effect of emotional stimuli on the IPL, but a dissociation between the ventral and dorsal anteromedial prefrontal cortex, accompanied by an activation/deactivation dissociation in the main task. Shafer et al. (2012) also reported no IPL effect of emotion in a task with many elements in common with the emotional Stroop. The finding differed in the study by Compton et al. (2003), which explicitly focused on demonstrating the existence of an emotional/cognitive dissociation in the parietal lobes, and distinguished between the effects of valence and arousal in the stimuli. They found large activations in the superior parietal lobule for the standard Stroop (**Figure 3**, in blue). In contrast, none of the emotional conditions increased activity here; instead, negative emotion was associated with increased activity in the IPL bilaterally and in the left supramarginal gyrus, which was driven by valence (**Figure 3**, in red). However, the study did not report whether the dorsal/ventral dissociation was accompanied by an analogous activation/deactivation dissociation in the main task.

**FIGURE 3 | [...] et al., 2003), and rapid serial visual presentations of emotional distracters (in yellow).**

**FIGURE 4 | Lateral rendering of foci of relative deactivation brought about by emotional stimuli used as distracters in the delay phase of a short-term memory (STM) task.**

Viewed together, the studies of this section report modulation of IPL by emotional distracters that depends on the relationship between the presence of emotional tone and the control processes activated by the task. When emotional tone was added to a distracter whose encoding was forced by the psychophysical properties of the presentation, as in the rapid serial presentation task, IPL was more active than with neutral distracters. In contrast, when emotional tone was added to distracters that could be effectively excluded by input processing, as in the retention interval of a short-term memory task, IPL was more deactivated by emotional than by neutral stimuli. However, there were also studies in which IPL did not seem to be modulated by emotion.

## **COGNITIVE/MOTOR INHIBITION**

These studies are characterized by conflict in the generation of response due to an automatic association between some aspects of the stimulus set and the wrong response. Automaticity here ensues from an overlearned response in association with the stimulus that gives rise to the conflict (Logan, 1988), or from the rapid instantiation of a response habit in the presence of a large number of trials where the correct response is always the same. In this setting, the overlearned response must be inhibited for correct task execution. Studies in this group investigate the effect of emotional tone in combination with response inhibition.

A typical representative of this kind of study is the go/no-go paradigm. Here, a stimulus requiring a response occurs frequently, while a rarely occurring stimulus requires no response. Go/no-go studies elicit activation in a complex network associated not only with allocation of attention, but also with the necessity to regulate a prepotent motor response. Several studies have associated distinct prefrontal regions with the inhibitory component (Aron et al., 2004; Chambers et al., 2009; Sharp et al., 2010). In the present review, the focus is on the ventral parietal region and its possible association with the attentional component of the task. In go/no-go studies no activation of the IPL was observed when the criterion determining the go or no-go response was the presence of emotional valence itself (Elliott et al., 2000). However, when emotional valence connoted stimuli incidentally, modulation of several cortical regions was observed, the most prominent of which were the ventromedial prefrontal cortex/orbitofrontal cortex, IFG, and right IPL (Goldstein et al., 2007; Brown et al., 2012; see **Figure 5**, red). IPL recruitment was also reported by studies adding emotional tone in the context of the conflict engendered by the standard Stroop (Krebs et al., 2011).

Another group of studies introduced a conflict between the emotional tone and another aspect of the stimulus by superimposing faces and conflicting or congruent written text. These studies reported no consistent effect in the IPL (**Table 1**).

As in the spatial attention studies, activation of IPL in studies of this group seems to be favored when emotional tone is not informative for the task at hand.

## **SUBLIMINAL PRESENTATION OF EMOTIONAL STIMULI**

The study by Luo et al. (2007) suggests that the effect of emotional valence of stimuli on the IPL requires supraliminal exposure, in contrast with the effect on the amygdala (Morris et al., 1999; Whalen et al., 2004; de Gelder et al., 2005; Liddell et al., 2005; Vuilleumier, 2005). If this is correct, then we should not observe any IPL effect (which might conceivably be associated with covert orienting) in studies of subliminal exposure to emotional material. To verify this hypothesis, neuroimaging studies where faces bearing an emotional expression were presented subliminally were examined to see if they reported IPL effects (**Table 2**). The findings of this survey were not consistent. Even if the majority of studies reported no effects in the IPL, two studies did. However, in the study by Phillips et al. (2004), effects were present for both subliminal and supraliminal presentations, although differently lateralized; furthermore, the supraliminal effects were larger. In both studies, the test statistic was well below thresholds required by multiple comparison corrections; this is in contrast with the magnitude of effects in the studies of the previous section, which were at times the strongest across the brain. One may conclude that there is no strong evidence for an effect in IPL at the presentation of subliminal emotional stimuli.

**FIGURE 5 | Lateral rendering of foci from studies investigating the effect of emotional distracters in tasks with prepotent response or motor inhibition.**

#### **Table 2 | Effects on IPL of subliminal presentation of faces with emotional expression.**

*For criteria of inclusion, see the Methods section.*

## **DECLARATIVE MEMORY OF EMOTIONAL STIMULI**

While in many respects different from studies of attention, studies of memory are included here because of the evidence for a dissociation between dorsal and ventral left parietal areas with many aspects in common with the dissociation demonstrated by studies of attention (Cabeza et al., 2008, 2012; Ciaramelli et al., 2008; Uncapher and Wagner, 2009). A possible connection between these two types of studies may be seen by viewing declarative memory tasks as selection from a set of internal representations.

At the behavioral level, the influence of emotion on memory processes is shown by enhanced accuracy and vividness of declarative memories (Kensinger, 2004). These effects may arise from effects of emotion at different stages of the memory process, from encoding to consolidation and retrieval (LaBar and Cabeza, 2006). As in the modulation of attention by emotion, the amygdala is thought to interact with prefrontal function to fine-tune memory to emotional content (LaBar and Cabeza, 2006).

Neuroimaging studies have investigated the impact of emotion on memory much more systematically than in the orienting paradigms of the previous section. A recent meta-analysis evaluated results from 15 carefully selected neuroimaging studies on successful emotional memory encoding (Murty et al., 2010). Of interest in the present context is the involvement of ventral parietal areas in emotional memory paradigms, besides the well-known involvement of amygdala, medial temporal, and prefrontal areas. Murty et al. (2010) found a significant effect in the right IPL/supramarginal gyrus associated with successful encoding of emotional relative to neutral stimuli. Commenting on the contrast with the result of the systematic review by Uncapher and Wagner (2009), where activation in this area was associated with inferior memory performance, Murty and colleagues conjectured that recruitment of the reflexive orienting process associated with IPL by emotional material might have been representative of the beneficial effects of reflexive orienting in an ecologically complex setting.

## **SUMMARY ON NEUROIMAGING STUDIES OF ATTENTION TO EMOTION**

An effect of emotion on IPL was reported by about 65% of the reviewed studies of attention to emotion. Localization was on the right or indifferent in about 50% of studies; in all cases where localization was on the left, distracters were verbal. In 15% of the studies IPL activation was the largest reported effect in the contrast. However, the involvement of IPL was complex, as some studies reported its deactivation by emotion, in contrast with the majority of findings. This may be due to considerable diversity of the studies examined here. In several cases, however, it appears that these heterogeneous findings were influenced by the relationship between the stimuli and the task set.

In the studies of spatial attention, for example, where emotional tone was added to a spatial cue, activation in the IPL was reported when the cue was not informative for detecting the target (Armony and Dolan, 2002; Pourtois et al., 2006), in contrast with the effect of cues that informed about the location of its appearance (Tosoni et al., 2013). In studies with emotional distracters, where the distracters were spatially distinct from the target and were presumably inhibited by selection, no activation in IPL was detected. However, strong IPL activations were reported by studies in which the emotional distracter immediately preceded the target at the fixation center (rapid serial presentation tasks: Luo et al., 2007; Mitchell et al., 2008). These differing results may be due to the fact that in rapid serial presentation tasks processing of the distracter cannot be avoided, as shown by the interference in identifying the target (Broadbent and Broadbent, 1987).

No consistent activation of the IPL was reported by studies where the emotional distracter and the target were combined in the same stimulus or were spatially superimposed. Similarly, no consistent evidence for a role of IPL was provided by studies of conflict in the generation of response, such as the go/no-go or standard Stroop. This may be due to the fact that the nature of the conflict here is on the side of response, not perception. A striking exception to this pattern is given by the studies by Goldstein et al. (2007), Krebs et al. (2011), and Brown et al. (2012). In all these studies, emotional tone was added to stimulus material that was used in the task generating response conflict. The presence or absence of emotional tone was not informative for making the decision, and perceptual encoding of the material was essential to the task. Here, activation of the IPL was robust.

It therefore appears that activation of the IPL by emotional distracters was contingent on the task set, additionally modulated by the locus of interference. When interference was on the perceptual side between stimuli, little activation in the IPL was seen unless processing of emotional stimuli was forced by the rapid serial visual presentation task. In studies where interference was on the response side, activation of the IPL was seen where emotional tone incidentally connoted stimuli used in the task, as in the studies by Goldstein et al. (2007), Krebs et al. (2011), and Brown et al. (2012).

These activations are consistent with a recruitment of IPL by emotional material but, considered in isolation, do not tell us unequivocally if they were due to increased top-down control in the presence of emotional distracters, to increased interference, or to attentional capture as in ventral network reorienting. Nevertheless, their ventral localization, and the tendency to occur when the emotional tone was not informative for the task, do not suggest direct involvement of top-down suppression of distracters. However, another aspect of these data speaks more decisively against interpreting IPL activation as the correlate of top-down control or increased interference, as activations of DLPFC may be. This is given by the studies of short-term memory or working memory in which emotion was associated with IPL deactivations (see Dolcos and McCarthy, 2006 and the analogous studies in **Table 1**). If IPL were the neural substrate of top-down control like SPL or DLPFC, we would expect it to be always activated, or at least not deactivated, by content designed to increase interference. In contrast, these deactivations parallel those of studies of cognition reporting an association between proactive control and deactivation of the ventral attentional network (Todd et al., 2005; Shulman et al., 2007). IPL activations may be observed when the emotional tone of distracters is embedded within the task so as to escape proactive control, as when it is not informative for the task, or when its encoding is essential for the task but the focus of control is on response. Because they were detected in comparison with neutral stimuli, the deactivations suggest that proactive control on the emotional features, brought about by their relevance to the task, may modulate the activation level in the IPL more strongly than non-emotional features of the stimuli. This conclusion is consistent with data from the effect of emotional tone on IPL recruitment in studies of declarative memory of emotion reviewed by Murty et al. (2010).

In interpreting these data, it is useful to remember that all these results emerged by contrasts between emotional and neutral distracters, which modeled at the second level the interaction between the task and emotional tone. They provide evidence consistent with the notion that emotional material may preferentially trigger activation in ventral parietal areas, as expected from stimuli of high behavioral relevance. In models developed in studies of non-human primates, the role of anterior parietal cortex is the computation of the relative behavioral relevance of stimuli to guide choice or selection on the basis of information from locations in extrapersonal space (Itti and Koch, 2001; Gottlieb, 2006; Andersen and Cui, 2009; Bisley and Goldberg, 2010; Freedman and Assad, 2011). To the extent that these ventral areas contributed to determining priorities in the handling of information in the studies reviewed here, they may also have contributed to prioritizing emotional information in specific task set and interference configurations.

There is also some indication that valence, especially negative valence, was more important to elicit the IPL effect than arousal levels (Compton et al., 2003). Emotional arousal is associated with activation of the amygdala (Whalen et al., 2002). The review of IPL effects in studies of subliminal emotional stimulation also suggests that robust effects of emotional tone in the IPL require stimuli to be presented supraliminally, in contrast with findings in the amygdala. This differentiates emotional processing in the IPL and in the amygdala.

A possible limitation of the present attempt to summarize results is the localization of many foci near the IPS. Because some degree of heterogeneity in collating data from different studies is inevitable, it is possible that some of the foci attributed to the effect of emotional intensity were located in the IPS, especially those situated more dorsally. These foci may then more appropriately be considered a correlate of activity of the dorsal network system, perhaps as a result of increasing interference from the emotional distracters. Discussion of this issue is postponed until after data from emotion regulation studies have been considered.

Notwithstanding its common occurrence, the involvement of the IPL is not mentioned very often in the discussion of findings. This is remarkable since in some studies IPL involvement was quantitatively the most extensive or the most intense in the contrast opposing emotional and neutral stimuli (Compton et al., 2003; Goldstein et al., 2007; Luo et al., 2007; Mitchell et al., 2008). A similar remark was made by Murty et al. (2010) in the discussion of their meta-analysis of effects of emotion on parietal areas in studies of declarative memory. This relative neglect of ventral parietal involvement may depend on a hypothesis-driven focus on prefrontal regions as the substrate of cognitive control processes.

While these data provide some support to the notion that emotional tone may be part of the "behavioral relevance" category that preferentially triggers ventral network reorienting, they do not inform us on the relevance of the ventral network in emotion regulation. Studies of emotion regulation will be examined in the next section.

## **NEUROIMAGING STUDIES OF VOLUNTARY EMOTION REGULATION**

In studies of voluntary emotion regulation, participants are instructed to execute a specific strategy to change their emotional reaction. Usually, but not always, the strategy involves downregulating the reaction to an emotional stimulus (most often but not invariably negative). Strategies vary between studies, including simple suppression of erotic arousal (Beauregard et al., 2001), self-distraction (Kalisch et al., 2006), distraction by execution of a demanding cognitive task (Kanske et al., 2011), and suppressing expression of emotion vs. using cognitive reappraisal (Ochsner et al., 2002, 2004; Goldin et al., 2008). When used to down-regulate emotion, cognitive reappraisal is the recontextualization or reframing of a negative stimulus in less negative terms (Ochsner and Gross, 2008). This latter strategy is the most intensively investigated in recent studies.

Studies of emotion regulation are characterized by the instruction to change one's own affective state. This is in contrast with the attention to emotion studies of the previous section, in which the instruction referred to a cognitive task that remained the same, while experimental variation was introduced by adding emotional tone to stimuli. A well-known finding of nearly all voluntary emotion regulation studies is recruitment of the dorsal prefrontal cortex in both medial and lateral aspects, complemented by activation in the VLPFC/IFG (Ochsner et al., 2012). Therefore, the evidence for the involvement of the prefrontal portion of the dorsal attentional network in voluntary emotion regulation is overwhelmingly strong. This evidence constitutes the empirical support for the dual-process view of the mechanisms underlying cognitive emotion regulation (Ochsner and Gross, 2008). The findings on the parietal lobe, divided by dorsal and ventral localization, are summarized in **Table 3**.

#### **Table 3 | Effects on SPL and IPL of voluntary emotion regulation.**

*Data refer to the contrast regulate vs. look. Yes +: the plus sign indicates that the IPL effect was the strongest in the contrast. The word "reappraise" was used for studies that cite Ochsner et al. (2002) or later work to instruct participants on reappraisal; "increase/decrease or maintain" denote studies that refer to Jackson et al. (2000) or Jackson et al. (2003) for participant instruction. For criteria of selection of studies in the table, see the Methods section.*

Inspection of this table reveals that the IPL is more often recruited than SPL by voluntary emotion regulation (logistic regression with repeated measurements, *z* = 2.9, *p* = 0.003). Furthermore, there is a tendency for the studies showing no IPL recruitment to have been carried out earlier, when instructions simply to suppress one's reaction to the stimulus were common (Beauregard et al., 2001; Lévesque et al., 2003; Kalisch et al., 2006; Ohira et al., 2006; Kim and Hamann, 2007; logistic regression on time of publication, *z* = 2.5, *p* = 0.01). Some studies reporting foci in the SPL used instructions to inhibit sexual arousal. In **Figure 6**, these foci are shown in blue, while foci detected by reappraise, suppress, or self-detach instructions are in orange. Foci reported while instructing to enhance the reaction to emotional stimuli are in light green. Studies reporting on enhancement instructions are few, and the reported foci do not appear to deviate systematically from those detected with reappraise or suppress instructions.

An issue raised by this finding is the status of some foci located in the superior part of the IPL, which may originate in the IPS and for this reason may be considered part of the dorsal network (in **Table 3**, Koenigsberg et al., 2009, 2010; New et al., 2009). However, if one looks at the IPL clusters in studies that displayed the effect on a lateral surface rendering, it becomes apparent that they extended toward the temporal lobe, usually including TPJ (Ochsner et al., 2004; McRae et al., 2008; Domes et al., 2010; Erk et al., 2010; Modinos et al., 2010; Staudinger et al., 2011; McRae et al., 2012a). Only one study (New et al., 2009) shows an effect located within and limited to the IPS.

Another finding was that the peak effect on IPL was in many studies the strongest in the contrast with the look instruction, suggesting that the effect in the parietal lobe was more marked than in the prefrontal cortex. These studies are marked by "Yes +" in **Table 3**. In several studies (Ochsner et al., 2002; Wager et al., 2008; Drabant et al., 2009), the IPL peak correlated with self-reported efficacy of emotion regulation. Interestingly, IPL also appears to be the region that best differentiated the neural correlates of reappraisal in borderline personality relative to healthy controls (Schulze et al., 2011; Lang et al., 2012; see also Koenigsberg et al., 2009). In a recent study on the effects of psychotherapy of social anxiety assessed with a cognitive reappraisal probe, the interaction between time before and after therapy and the down-regulation instruction localized in the IPL (Goldin et al., 2013).

In conclusion, studies of voluntary emotion regulation (particularly those relying on cognitive reappraisal) appear to recruit the dorsal attentional network in the prefrontal lobes, but activate the IPL, the hub of the ventral attentional network, in the parietal lobes. Notwithstanding their striking prominence, IPL effects are not often referred to in these studies, and have only recently been explicitly noticed (Ochsner et al., 2012). Instead, the interpretive framework of these studies focuses on the effect in the prefrontal areas, consistent with the dual-process view of emotion regulation. A few key studies, however, offer possible insights on the interpretation of this pattern of anterior/posterior dissociation.

In the studies by McRae et al. (2010) and Kanske et al. (2011), activations elicited by cognitive reappraisal were compared with those detected in a "distraction" condition, which consisted in the simultaneous execution of a short-term memory (McRae et al., 2010) or a demanding working memory task (Kanske et al., 2011). Starting from the seminal findings by Hariri et al. (2000) and Liberzon et al. (2000), many neuroimaging studies have shown that attentional engagement in demanding tasks affects the sensory encoding of emotional stimuli in the limbic system and the amygdala, reducing the activation that may be attributed to emotional arousal (Pessoa, 2008). In the studies by Kanske and McRae, the concurrent cognitive task activated the SPL or the IPS (in light blue in **Figure 6**), while the activation of the cognitive reappraisal condition was shifted to the IPL, with a narrow area of overlap centered on the IPS. In Kanske et al. (2011) most of the activation in the dorsal prefrontal areas was shared between the cognitive reappraisal and the concurrent cognitive task, but cognitive reappraisal additionally recruited the ventrolateral prefrontal cortex. In McRae et al. (2010) the reappraisal task recruited additional ventrolateral and antero-medial prefrontal areas. On the basis of these data, therefore, an argument may be made that an important and distinctive neural correlate of cognitive reappraisal (as opposed to turning attention elsewhere, or just blocking affect) lies in the ventral, not the dorsal, activations associated with this task. Following further this line of reasoning, one may distinguish between relatively non-specific effects of attentional recruitment on the control of emotional arousal, common to all strategies of voluntary control and associated with prefrontal areas related to working memory and the dorsal network, and contributions from ventral network areas active during cognitive reappraisal. 
As noted in the discussion of the attention to emotion section, it is difficult to interpret activation in the ventral IPL as a correlate of top-down control, as these same areas were deactivated in studies requiring suppression of emotional distracters.

## **THOUGHT CONTROL AND SPONTANEOUS AVOIDANCE OF NEGATIVE MATERIAL**

While emotional content has in most respects a facilitatory effect on attentional processes, there are specific cases where it is also known to slow down processing or induce avoidance, especially in the context of stimuli that are aversive or endowed with negative valence (Gray et al., 2002; Sagaspe et al., 2011). Of particular importance to the understanding of mood disorders is the avoidance of negative cognitions in the healthy, in contrast to what is observed in depressed individuals (Beck, 1976). The bias for positive cognitions is empirically demonstrable with the scrambled sentences task (SST, Wenzlaff, 1991). Participants are presented with a set of scrambled words from which they can assemble one of two possible sentences, depending on the order and the selection of words from the set. When the two alternative sentences have pessimistic and optimistic connotations, healthy participants spontaneously avoid the pessimistic alternative even if no reference to the valence of the sentence was given in the instruction (for example, the set "is bleak the future bright" can be recomposed in either "the future is bright" or "the future is bleak"). This bias is associated with the absence of previous depressive episodes or symptoms and predicts future episodes (Rude et al., 2002, 2003, 2010; Wenzlaff et al., 2002). It also correlates with depression scores or assessment of mood (Rude et al., 2010; Viviani et al., 2010).

While relevant to assess the tendency to negative cognitions of depression, the roots of the SST are in a highly developed cognitive model of how the control of thoughts through executive attentional processes is achieved or may fail (Wenzlaff et al., 1988; Wegner et al., 1993; Wegner, 1994). According to this model two processes, differing in the amount of resources they require, work together to promote desired mental states. A "monitoring process," not very resource demanding, is continuously active in the background to detect the emergence of undesired content. When this happens, the monitoring process triggers an "operating process," much more resource demanding, to attend to and suppress the undesired content. The operating process acts therefore like an executive process down-regulating negative emotional content in models of cognitive control of emotion. There are also similarities between the monitoring process and the concept of "bottom-up attention" to internal memories of ventral network theorists. In both cases, these processes are associated with endogenous ideas competing for inclusion in working memory, and have a potentially disruptive effect. However, the notion of monitoring process is explicitly linked to desires for mental states in the definition of the kind of salience that triggers it. According to the thought control model, avoidance of negative words in the chosen sentences in the SST is initiated spontaneously through this mechanism.

The SST is therefore of interest as a paradigm that, according to the model that inspired it, triggers a regulatory process without the influence of an explicit instruction to regulate, in contrast to voluntary emotion regulation studies. If avoidance of negative thoughts in the SST is obtained by a control process of executive nature, we should observe recruitment of the dorsal network, or of the part of it that executes this control, net of the effect of the experimenter's instruction. This is a prediction not only of the thought control model, but also of dual-process models of control, because negative words are commonly more salient than positive words (Bradley and Lang, 1994), therefore requiring more top-down control to be suppressed. A second issue raised by the SST is the neural correlate of the monitoring process, and the recruitment of ventral attentional areas that might conceivably support the parallelism between the "monitoring process" and "bottom-up" attention.

A neuroimaging study of thought control based on giving explicit instructions to participants found activation in medial dorsal areas, but no activity in DLPFC or SPL; instead, activity was modulated in the insula and IPL (Wyland et al., 2003). The study by Viviani et al. (2010) used the SST to identify the areas activated while avoiding negative content in the absence of an explicit instruction. Two factors were present in the study: emotional and neutral sentences (to control for sentence selection) and making no mention of the emotional content vs. asking participants explicitly to avoid the pessimistic alternative (to compare spontaneous and instructed avoidance). The main finding was in contrast with the prediction of the thought control model. In the spontaneous group, the dorsal attentional network was not recruited by the emotional sentences; on the contrary, the dorsal attentional network was less recruited than when sentences were neutral. In contrast, a small increase in activation in the DLPFC was seen in the group explicitly instructed to avoid the negative alternative. Importantly, a significant emotion × instruction interaction was observed (led by the decrease of activation in DLPFC in the spontaneous group). Hence, the effects on the dorsal attentional network could not be explained by the presence of emotional material or the absence of an explicit instruction alone. In the parietal lobe, the interaction showed prevalent recruitment of dorsal areas in the instructed group, and of ventral areas in the spontaneous group, when confronted with emotional material (**Figure 7**, foci in violet and green). Similar effects were observed in the medial prefrontal cortex and the posterior cingulus.

This study used an arterial spin labeling technique to identify areas activated and deactivated by the sentence-forming task. Comparison of the contrast task vs. baseline and the areas identified by the interaction showed that the focus in the anterior IPL, in the prefrontal cortex and the posterior cingulus were deactivated in the sentence forming task. The increase of IPL perfusion in the spontaneous group when exposed to emotional sentences took place within the context of a task deactivation. In contrast, in the instructed group the presence of an explicit instruction to avoid negative sentences was associated with similar deactivations in the neutral and the emotional material.

A second study on the tendency to use emotional words provided indirect evidence for a modulation of areas deactivated by the task consistent with a ventral network model (Benelli et al., 2012). In this study, participants were asked to read short textual descriptions of a scene rich with potential emotional issues. Textual descriptions varied systematically along two factors: presence or absence of emotional words, and of abstract words. The key aspect of the study was that after the scan participants were asked to give their own written account of what happened in the scene they had viewed. The account was then scored for the use of emotional or abstract words. Individual differences in the propensity to use or avoid emotional terms in the account were, as in the previous studies, related to spontaneous tendencies to avoid emotional material, since there was no explicit instruction in this respect. These after-scan scores were then regressed on the contrasts from the two factors characterizing the textual descriptions participants were reading during the scan. Individual differences in the use (or avoidance) of emotional words were significantly associated with the effect of emotional material while reading textual descriptions. Importantly, these differences showed modulation of areas deactivated by the task relative to fixation, including modulation of the IPL, while there was no correlation between the tendency to avoid emotional material and the use of prefrontal areas that are activated by the task and associated with working memory processes.

**Figure 7 | Scrambled sentences task study (Viviani et al., 2010).** In violet are foci that were more active during the instructed avoidance of negative sentences; in green the focus more active during spontaneous formation of sentences in the supramarginal gyrus. Note that the relative activation in the supramarginal gyrus is located more anteriorly than in voluntary emotion regulation studies, and is more similarly distributed to the ventral foci of studies with covert orienting to reward (Armony and Dolan, 2002, **Figure 2**) and emotional Stroop foci (**Figure 3**).

These data are consistent with the notion that avoidance of negative but arousing words in the healthy may not depend on successful executive control of emotional stimuli. Instead, avoidance of negative words in the healthy may take place naturally and without particular effort, notwithstanding their salience. This finding is of interest also because it holds in prospect the possibility to investigate empirically "automatic" forms of emotion regulation with these or similar paradigms. The hypothesis, advanced in Viviani et al. (2010), that this avoidance may be related to ventral network orienting, as an alternative to executive processes associated with the dorsal network, depends not only on the anatomical localization of the areas associated with spontaneous avoidance, but also on the evidence that ventral network orienting operates to allocate attention to items that are behaviorally relevant irrespective of both salience and the focus of endogenous attention. Furthermore, the prevalent deactivation of these areas in the contrast task vs. baseline in these studies is inconsistent with a classic top-down control process, but is consistent with the observed deactivations in the ventral attentional network, enhanced by proactive control on these semantic/secondary association areas when avoidance was voluntary.

## **DISCUSSION AND PERSPECTIVES FOR FUTURE RESEARCH**

The present review has been motivated by the contrast between theories highlighting the existence of a ventral network and those adopted by studies of attention to emotion and emotion regulation. These latter employ a dual-process model to interpret their data based on the opposition between sensory salience of stimuli competing for attention and top-down bias to influence the outcome of this competition. The evidence from ventral network studies suggests the existence of at least a third distinct brain circuit-cognitive process involved in attending to external stimuli (Corbetta et al., 2008) or to internal representations (Cabeza et al., 2008). This evidence shows that attention may be driven to information that is somehow "behaviorally relevant" even if neither salient nor currently targeted by top-down control, in contrast with the dual-process model.

A first issue of interest was the evidence in the review for the capacity of emotional information to trigger activation of the inferior parietal lobule (IPL), a key hub of the ventral network, which would speak for including emotional information in the category "behaviorally relevant" considered in spatial attention studies. All types of studies considered in the review provided evidence of IPL recruitment in the presence of emotional stimuli, albeit with different degrees of consistency. The strongest evidence was provided by studies of emotion regulation, where the IPL was much more often activated than its dorsal counterpart. However, studies of attention to emotion and memory also provided considerable evidence of IPL activation in the presence of emotional stimuli, especially considering those cases where this activation was the strongest in the brain. Furthermore, the few studies of attention to emotion that examined the effect of valence and arousal separately suggest that valence, especially negative valence, rather than emotional arousal, was specifically responsible for ventral network activity. This finding would be consistent with a distinction between limbic circuits involved in emotion processing, which are activated in a pre-attentive and relatively automatic fashion by emotional salience and mediate emotional arousal, and the ventral network, preferentially activated by valence (Brosch et al., 2011) during semantic encoding of stimuli.

A second issue was what these studies could reveal about the function of these areas in emotional processing, and the applicability of the ventral network model of attentional reorienting to characterize this function. Here, the diverse types of studies considered in the review provided information from different angles that, as will be discussed here, is broadly consistent with ventral parietal areas being concerned with computing the priority of stimuli for the generation of response, as in ventral network reorienting. This account of IPL function must explain its modulation in attention to emotion tasks, together with SPL recruitment; the preferential recruitment of IPL, instead of SPL, in emotion regulation studies; and the modulation of IPL in the tasks without explicit instruction to regulate, as well as the lack of SPL or prefrontal recruitment in these latter paradigms, in contrast with instructed paradigms.

The modulation of IPL signal in studies of attention to emotion appeared to be complex and to be influenced by the interaction of the locus of interference and the task set. These complex effects suggest that a careful analysis of the form of load and locus of interference (Kahneman and Treisman, 1984; Harris and Pashler, 2004; Lavie et al., 2004; Okon-Singer et al., 2007) is required to properly interpret attention to emotion data. The most robust findings were IPL activations in studies where the emotional information was incidental to the criteria for the response required by the task, and interference was located on the side of the response. In contrast, in studies where the conflict was at the level of the stimulus, IPL activity was seen in circumstances that favored the perceptual processing of distracters. These findings suggest that IPL activity was observed when the emotional tone of the distracter was not as such the target of top-down control.

Importantly, in some studies where suppression of distracters was effective, as in the retention interval of a working memory task, the effect of emotional tone on IPL was the opposite, i.e., IPL was more strongly deactivated than with neutral distracters. This finding is difficult to reconcile with a simple top-down control role of IPL, as in this case we would expect IPL not to be deactivated by distracters. However, the observed deactivations are compatible with ventral network orienting. Deactivations in the ventral network have also been observed in spatial attention studies, where they have been interpreted as the neural correlate of a "filter" on incoming data associated with the existence of a task set (Todd et al., 2005; Shulman et al., 2007). One possibility is that these deactivations were the neural correlates of proactive control on potential distracters, acting on the late phases of sensory encoding, when the stimulus reaches semantic association areas. Within the framework of the biased competition model (Duncan and Miller, 2002), these deactivations may be the neural correlate of top-down control associated with task-specific adaptive coding of incoming stimuli. The stronger deactivation of IPL reported in studies where emotion was a source of interference to the task suggests that emotional stimuli might be particularly effective in eliciting such deactivations when control is active, either through their valence, or perhaps because of their semantic properties (Talmi and Moscovitch, 2004).

These findings may be best understood within a model of parietal function in which computation of priorities for response choice integrates not only executive goals and stimulus salience, but also a wider class of sources of information of behavioral relevance. Awh et al. (2012) have summarized data on the interaction between attentional processes and reward-conditioned stimuli or the past history of exposure to stimuli that, as discussed here for emotional stimuli, show that a simple dichotomy between bottom-up sensory salience and top-down control is insufficient to account fully for existing observations. Their proposal is the integration of information deriving from past experience on the stimuli to compute priorities for attentional allocation (Awh et al., 2012). These priorities may be represented in the IPS, from which they would be brought forward to prefrontal areas to be mapped to motor effectors (Andersen and Cui, 2009). In the emotion regulation studies by McRae et al. (2010) and Kanske et al. (2011) described above, distraction by a cognitive task and reappraisal shared activation in the IPS, but otherwise dissociated between SPL and IPL. These studies suggest that IPS may integrate priority information from sensory salience and top-down sources in dorsal areas (Vandenberghe et al., 2001; Yantis et al., 2002) and from behavioral relevance in ventral areas of the parietal lobes.

The notion that the parietal cortex contains priority maps integrating information of different nature receives support from neurophysiological studies in non-human primates. Although with different nuances, many researchers stress that neurons in parietal areas dynamically compute abstract priority maps of stimuli in external space based on information of different nature (Platt and Glimcher, 1999; Gottlieb, 2006; Andersen and Cui, 2009; Kable and Glimcher, 2009; Bisley and Goldberg, 2010; Freedman and Assad, 2011). However, studies in non-human primates have described no dissociation between dorsal and ventral areas akin to the one considered here from neuroimaging studies in humans. Nevertheless, the priority map model is consistent with the strong modulation of IPL activation by emotion observed in the reviewed neuroimaging studies, and with including emotional valence in the behaviorally relevant category that may activate the ventral network.

In these priority map models, the role of multimodal association areas may not simply be that of a passive repository of semantic memory, but that of actively contributing to the computation of priorities for choice of response (Dorris and Glimcher, 2004; Kable and Glimcher, 2009; Freedman and Assad, 2011; Fitzgerald et al., 2013). This is consistent with the possibility that criteria other than sensory salience or effort may determine response, thus challenging the dual-process model. This possibility also challenges the "modal" view that control of response in the face of sensory salience is the prerogative of executive function (Kahneman and Treisman, 1984; Kahneman and Frederick, 2002). Representations of value in the IPL may confer priorities to stimuli and thus override sensory salience even in the absence of executive processes. In the scrambled sentences task, for example, assigning priorities between negative and positive words in the spatial array of the scrambled word set was associated with relative activation of anterior IPL when the choice of sentence was spontaneous (Viviani et al., 2010). In the presence of an explicit instruction, in contrast, the relative activation of IPL for emotional words was reduced. The role of IPL in computing priorities for response may also explain modulation of response to emotional information in paradigms without an explicit instruction on the form of this response, even without additional recruitment of prefrontal areas associated with top-down control.

The opposite of proactive control may be characterized as unconstrained reorienting to stimuli or, in an internal semantic space, forms of thinking that give precedence to spontaneously emerged representations. When production of thought is spontaneous, and no proactive control is in effect, we may accordingly expect activation of semantic association areas. In this respect, the ventral attentional network appears to be functionally similar to the default network system, explaining the apparent anatomical overlap. As has been noted, the default network system co-localizes with semantic association areas (Binder et al., 2009) and is active during spontaneous (Buckner and Carroll, 2007) or associative thinking (Bar, 2007; for a different view, see Sestieri et al., 2010). This form of cognitive process would have the properties attributed to "bottom up" attention by memory researchers (Cabeza et al., 2012), qualified by the exclusion of automatic reorienting to sensory salience. There are also considerable similarities between the properties of "bottom up" attention and the monitoring process of theories of thought control (**Table 4**).

This model may also explain why ventral activation in the left parietal lobes and TPJ is observed in studies of cognitive reappraisal. The cognitive reappraisal instruction contains an invitation for participants to imagine a favorable outcome or create a different interpretive framework for the presented emotional scene. This instruction requires participants to think original thoughts not determined strictly by stimuli or task. In contrast, studies where participants are asked to do a "reality check," self-distract by carrying out a task or looking elsewhere, or simply block internal affect, activate left IPL/TPJ less often.

The recruitment of IPL in studies of cognitive reappraisal and in the SST study suggests that mechanisms of attentional control based on executive function should be complemented with the contributions that semantic networks may give to emotion regulation (**Table 4**). This contribution may be accomplished either through their deactivation in proactive control, or through their capacity to support sophisticated semantic encoding of stimuli prior to attentional regulation. This mechanism would be consistent with differences in IPL activity found in borderline personality disorder patients probed with the cognitive reappraisal paradigm (Koenigsberg et al., 2009; Schulze et al., 2011; Lang et al., 2012), who respond to therapeutic approaches that increase the capacity to articulate and semantically encode emotional exchanges (Viviani et al., 2011). It would also be consistent with the observed effect on IPL of psychotherapy, tested with a cognitive reappraisal probe (Goldin et al., 2013).

## **OPEN QUESTIONS AND ISSUES FOR FUTURE STUDIES**

This review has not addressed several issues arising from the integration of data on neural substrates of orienting and on emotion regulation. One is the role of the prefrontal parts of the ventral network (**Figure 1B**). In particular, the VLPFC in the left hemisphere has been associated with voluntary emotion regulation (Ochsner and Gross, 2005), and in the right hemisphere with inhibition of a prepotent response (Aron et al., 2004). Recent studies, however, have cast doubts on the characterization of VLPFC as a locus of control (Sharp et al., 2010). Studies more specifically targeting the relationship between the role of VLPFC in attention and control will be required to shed light on this issue.

A second issue concerns the functional significance of deactivations. In the present review, the hypothesis that deactivations observed in the ventral network were the neural signature of "filters" for incoming stimuli was extended to the emotional domain, and related to the existence of proactive forms of control that constrain processing. However, this hypothesis rests on a relatively limited set of data. Clarifying the nature of deactivation may be of particular importance for the clinical neurosciences. A rich PET tradition of early clinical neuroimaging studies has consistently implicated the activation-deactivation balance between ventral and prefrontal dorsal areas in depression (Mayberg et al., 1999) or in processing emotional material (Drevets and Raichle, 1998; Whalen et al., 1998a; Bush et al., 2000; Perlstein et al., 2002). In the medial face of the prefrontal lobes, in particular, dorsal and ventral areas display activation and deactivation during focussed tasks, similar to the dissociation in the parietal lobes. Furthermore, the orbitofrontal cortex hosts representations of affective value of stimuli and expectations of reward or punishment that are used to generate response but are, unlike those in the parietal lobe, largely invariant to sensory features or spatial location (O'Doherty, 2007; Kable and Glimcher, 2009; Elliott et al., 2010).

A third issue concerns the importance of the ventral network for alternative forms of emotion regulation. Investigators have repeatedly noted the existence of forms of emotion regulation that cannot be adequately characterized in terms of a dual process model opposing perceptual salience on the one hand and endogenous attentional control on the other, leading them to formulate the concept of automatic forms of regulation (Bargh and Williams, 2007; Mauss et al., 2007; Phillips et al., 2008). These concepts are broadly consistent with clinical notions of involuntary mental processes handling emotional context, such as the tendency of some patients to avoid psychically painful content. However, difficulties in the operationalization of spontaneous or "automatic" forms of emotion regulation have hampered progress in their investigation. Here, I have proposed the scrambled sentences task as an empirical model of a form of emotion regulation that takes place spontaneously. However, it is unclear if this task is representative of all forms of emotion regulation that are characterized as automatic in the literature.

## **METHODS**

Because of the considerable diversity of tasks considered in this review, selection of studies based on search words proved of limited utility. These searches were complemented by existing reviews and by systematically checking references of retrieved studies (including subsequent studies citing retrieved studies using online tools such as Google Scholar). Effects in IPL were defined as activations in BA40 and BA39, while the adjacent portion of temporal area BA22 was included as effect in the TPJ. Effects in SPL were defined as activations in BA7.

Studies of attention to emotion (**Table 1**) are here understood as studies in which cued attention, short term memory or working memory tasks were investigated by varying the presence or absence of emotional tone in stimuli. In short term or working memory tasks emotional variation involved distracters or was incidental to the task. Note that these studies differ from another large category of studies, in which the cognitive load is varied on stimuli that are emotional (these studies provide evidence on the effect of recruitment of cognitive processes on emotional processing, not on the effect of emotional content on the recruitment of attentional processes). Only studies that reported data on healthy participants were considered (including studies on patient populations that reported effects in healthy participants separately). For studies presenting an emotional distracter prior to the target stimulus, studies were considered where the interval between distracter and target was less than 500 ms, based on the data by Broadbent and Broadbent (1987) showing no interference at onset asynchronies starting between 480 and 750 ms. Studies with longer asynchronies may be viewed as studies of emotion induction, an issue not considered in the present review. The following studies satisfied these criteria, but were excluded because they omitted to report a whole brain analysis, did not report the emotion vs. neutral contrast, or a combination of both, or were not easily interpretable for the issue at hand: George et al. (1994, 1997), Simpson et al. (2000), Perlstein et al. (2002), Hare et al. (2005), Williams et al. (2005), Etkin et al. (2006), Shafritz et al. (2006), Beneventi et al. (2007), Blair et al. (2007), Mitchell et al. (2007a,b), Dickie and Armony (2008), Lee et al. (2008), Berkman et al. (2009), Siman-Tov et al. (2009), Hart et al. (2010), Padmala and Pessoa (2010), Pereira et al. (2010), Kanske and Kotz (2011b), Sagaspe et al. (2011), Wessa et al. (2013).

Studies on the subliminal presentation of emotional stimuli (**Table 2**) were selected on the basis of the declared intent of the authors. Methodological studies raise issues on the presentation time that ensures that perception is subliminal (Maxwell and Davidson, 2004). However, only two neuroimaging studies would satisfy stringent criteria for subliminal perception (Liddell et al., 2005; Harmer et al., 2006). Both studies reported no effect in IPL. The following studies were excluded from **Table 2** because they omitted the whole brain analysis: Whalen et al. (2004), Pessoa et al. (2006).

In the section on emotion regulation (**Table 3**), the following studies were excluded from analysis because they omitted the whole brain analysis, did not report on the contrast considered in **Table 3**, or provided information that could not easily be interpreted in terms of the issues examined here: Schaefer et al. (2002), Ray et al. (2005), Erk et al. (2006), Banks et al. (2007), van Reekum et al. (2007), Abler et al. (2008), Drabant et al. (2009), Ochsner et al. (2009a,b), Kober et al. (2010), Campbell-Sills et al. (2011), Ichikawa et al. (2011), Schulze et al. (2011), Vrtička et al. (2011, 2012, 2013), Lang et al. (2012), Lee et al. (2012), McRae et al. (2012b), Opitz et al. (2012).

Statistical analyses involving logistic regression with repeated measurements were conducted with the function "lmer" (Bates and Maechler, 2009) available within the additional package "lme4" of the freely available software "R" (The R Foundation for Statistical Computing, Vienna, Austria), version 2.14.0. In the model to test localization in the dorsal and ventral parietal lobes (Section Neuroimaging Studies of Voluntary Emotion Regulation), positive or negative finding, as reported in **Table 3**, was regressed on location (dorsal or ventral) and study (as the grouping variable for repeated measurements). In the test on the tendency of recent studies to report ventral location, report of IPL recruitment was regressed on time of publication of the study. Significance levels reported in the text are two-sided. Models were formulated after inspecting data to quantify effects; they should therefore be understood as exploratory.
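As a reading aid, the localization model described above can be written out as a mixed-effects logistic regression. This is only a sketch under one assumption that the text does not state explicitly: that "study as the grouping variable" enters as a random intercept. The symbols $y_{ij}$, $\beta_0$, $\beta_1$, $u_j$ and $\sigma_u^2$ are introduced here for illustration.

$$
\operatorname{logit}\,\Pr(y_{ij} = 1) = \beta_0 + \beta_1\,\mathrm{location}_{ij} + u_j, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
$$

Here $y_{ij}$ codes a positive (1) or negative (0) finding for location $i$ (dorsal or ventral) in study $j$, $\beta_1$ captures the dorsal vs. ventral difference on the log-odds scale, and $u_j$ absorbs the dependence between the two observations contributed by the same study.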

Figures were prepared with the freely available software Caret (Van Essen et al., 2001; available online from the website http://brainvis.wustl.edu/wiki/index.php/Caret:About).

## **REFERENCES**


neuroimaging and neuropsychological evidence. *J. Neurosci.* 30, 4943–4956. doi: 10.1523/JNEUROSCI.1209-09.2010


an fMRI study of visual auditory oddball tasks. *Cereb. Cortex* 9, 815–823. doi: 10.1093/cercor/9.8.815


Posner, M. I., and Petersen, S. E. (1990). The attention system of the human brain. *Annu. Rev. Neurosci.* 13, 25–42. doi: 10.1146/annurev.ne.13.030190.000325


cognitive demand while regulating unpleasant emotion. *Neuroimage* 47, 852–863. doi: 10.1016/j.neuroimage.2009.05.069


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2013; accepted: 18 October 2013; published online: 07 November 2013.*

*Citation: Viviani R (2013) Emotion regulation, attention to emotion, and the ventral attentional network. Front. Hum. Neurosci. 7:746. doi: 10.3389/fnhum.2013.00746*

*This article was submitted to the journal Frontiers in Human Neuroscience.*

*Copyright © 2013 Viviani. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Varying expectancies and attention bias in phobic and non-phobic individuals

#### *Tatjana Aue<sup>1</sup>\*, Raphaël Guex<sup>1,2</sup>, Léa A. S. Chauvigné<sup>1</sup> and Hadas Okon-Singer<sup>3</sup>*

*<sup>1</sup> Swiss Center for Affective Sciences, University of Geneva, Geneva, Switzerland*

*<sup>2</sup> Department of Psychology, University of Geneva, Geneva, Switzerland*

*<sup>3</sup> Department of Psychology, University of Haifa, Haifa, Israel*

#### *Edited by:*

*Simone Vossel, University College London, UK*

#### *Reviewed by:*

*Paul Pauli, Julius-Maximilians-University, Germany*

*Roberto Viviani, University of Ulm, Germany*

#### *\*Correspondence:*

*Tatjana Aue, Swiss Center for Affective Sciences, University of Geneva, 7 Rue des Battoirs, 1205 Geneva, Switzerland*

*e-mail: tatjana.aue@unige.ch*

Phobic individuals display an attention bias to phobia-related information and biased expectancies regarding the likelihood of being faced with such stimuli. Notably, although attention and expectancy biases are core features in phobia and anxiety disorders, these biases have mostly been investigated separately and their causal impact has not been examined. We hypothesized that these biases might be causally related. Spider phobic and low spider fearful control participants performed a visual search task in which they specified whether the deviant animal in a search array was a spider or a bird. Shorter reaction times (RTs) for spiders than for birds in this task reflect an attention bias toward spiders. Participants' expectancies regarding the likelihood of these animals being the deviant in the search array were manipulated by presenting verbal cues. Phobics were characterized by a pronounced and persistent attention bias toward spiders; controls displayed slower RTs for birds than for spiders only when spider cues had been presented. More important, we found RTs for spider detections to be virtually unaffected by the expectancy cues in both groups, whereas RTs for bird detections showed a clear influence of the cues. Our results speak to the possibility that evolution has formed attentional systems that are specific to the detection of phylogenetically salient stimuli such as threatening animals; these systems may not be as penetrable to variations in (experimentally induced) expectancies as those systems that are used for the detection of non-threatening stimuli. In sum, our findings highlight the relation between expectancies and attention engagement in general. However, expectancies may play a greater role in attention engagement in safe environments than in threatening environments.

#### **Keywords: attention bias, biological preparedness, expectancy bias, fear, phobia, spiders**

## **INTRODUCTION**

The present study investigates the interplay between two important known biases in phobic and non-phobic fear<sup>1</sup>, namely, expectancy bias and attention deployment bias. Although both have been demonstrated to be core features in phobia, to date, these two phenomena have been investigated independently from each other (expectancy bias: Davey and Dixon, 1996; de Jong and Muris, 2002; Mühlberger et al., 2006; Aue and Hoeppli, 2012; attention bias: Watts et al., 1986; Öhman et al., 2001; Olatunji et al., 2008; Okon-Singer et al., 2011; see also Bar-Haim et al., 2007; Cisler and Koster, 2010; Yiend, 2010, for reviews).

Individuals with phobia or extreme fear of specific objects or animals, such as snakes or spiders, exhibit an expectancy bias when estimating the chances of encountering their feared object (de Jong and Muris, 2002; Aue and Hoeppli, 2012). Furthermore, they estimate that once they encounter their feared object, the circumstances will be more negative compared with their own estimations for objects that are less feared by them, and compared with the estimations of non-fearful controls (e.g., Davey and Dixon, 1996; Mühlberger et al., 2006).

Studies on attention bias showed that fearful or phobic individuals tend to engage attention more quickly in their feared stimuli than in unfeared stimuli (e.g., Mogg and Bradley, 2006; Vrijsen et al., 2009); moreover, these individuals are slow in disengaging attention from their feared stimuli compared with unfeared stimuli (e.g., Fox et al., 2001, 2002; Yiend and Mathews, 2001). In addition, fearful or phobic individuals show a deficient ability to ignore fear-related distractors compared with non-fearful healthy controls (e.g., Gerdes et al., 2008; Okon-Singer et al., 2011)<sup>2</sup>. Cisler and Koster (2010) suggested that this early vigilance to fear-evoking stimuli is followed by later avoidance (cf. Mogg et al., 1997; Amir et al., 1998; Rinck and Becker, 2006).

<sup>1</sup>We refer to individuals with extreme fear, but who were not clinically diagnosed as phobic, as "fearful." Phobic individuals in the context of the present paper refer to individuals who were clinically diagnosed.

<sup>2</sup>A meta-analysis (Bar-Haim et al., 2007) revealed that attention bias is consistently found when stimuli are presented outside awareness. This suggests that the bias for threat-related material in fearful and anxious participants originates at an early subconscious stage. Yet, part of it may also rely on conscious processing and top-down influences because larger effect sizes are generally observed when participants are aware of the presented stimulus material (Bar-Haim et al., 2007).

Several studies used a visual search task to explore the factors modulating engagement of attention (see review in Yiend, 2010). For neutral items, engagement of attention in certain items presented in a search array has been shown to be modulated by both bottom-up factors, such as color or motion, and top-down factors manipulated via working memory or priming prior to the search (Wolfe et al., 2003; Burra and Kerzel, 2013; Calleja and Rich, 2013; Woodman et al., 2013). A potential origin of biases in attention to threatening material could be biological preparedness (e.g., Öhman et al., 2001). Such preparedness has been hypothesized to increase bottom-up attentional capture (see Yiend, 2010, for details). Little is known, however, about the impact of prior expectancies on attention engagement in fear-relevant targets.

Although quite robust findings have been reported about expectancy and attention biases in fear and phobia, only a single study has so far examined their interrelation, to the best of our knowledge. We (Aue et al., 2013) found evidence that attention and expectancies might be intimately related in spider phobia. Viewing time for spiders in spider phobics was positively related to expectancies for encounters with these animals. Non-fearful individuals, in contrast, displayed a negative association of viewing time and encounter expectancy for spiders. These differential associations between the two groups were, however, unspecific for spiders (i.e., held also for snakes and birds), which can possibly be explained by the generally more stressful nature of the experiment for phobic individuals. Together, these findings suggest that, in potentially threatening situations, there might be a substantial difference in the co-organization of attentional and expectancy processes in phobic and low fearful control participants. Because the study data are of a correlational nature, we were unable to distinguish whether variations in expectancies were at the origin of variations in attention deployment or vice versa.

The current study investigated the directionality of biases in attention and expectancies and tested whether variations in expectancies can cause variations in attention deployment. Whereas our earlier work focused on visual avoidance (or overall viewing time; Aue et al., 2013), we were now interested in initial attention engagement toward threatening stimulus material. Specifically, we hypothesized that an individual's expectancies concerning frequencies and consequences of confrontations with threatening stimuli could sensitize the individual to certain types of stimuli. Such a top-down mechanism may then lead the individual to engage in active search and may guide his or her attention to evidence in the environment that supports the already existent expectancies (see Krizan and Windschitl, 2007, as well as Aue et al., 2012, for related links between positive cognitive biases [overoptimism or wishful thinking] and selective attention). Thus, we hypothesized that prior expectancies about the occurrence of threatening events would exert a top-down influence on the visual search for threat (i.e., attention engagement in threat-related targets).

In order to directly examine the effect of expectancies on attentional engagement, we manipulated expectancies regarding the likelihood that different types of targets would appear in a visual search task. More concretely, spider phobic and low spider fearful control participants in the present study had to search for a deviant spider or bird among eight butterflies. Before the presentation of each search array, a verbal cue informed the participants about the likelihood that the deviant stimulus would be a spider or a bird. By this means, we manipulated our participants' expectancies of encountering (i.e., seeing) spiders and birds.

Three specific hypotheses were tested: First, on the basis of earlier literature (e.g., Bar-Haim et al., 2007), we predicted the attention bias for spiders to be more pronounced in spider phobics as compared with the low spider fearful controls. Second, from the evidence for modulation of detection speed by cueing and predictability in arrays of neutral items (e.g., Wolfe et al., 2003; Burra and Kerzel, 2013), we predicted an effect of congruency in that variations in expectancies would modulate detection times in both groups of participants, with expected deviants leading to shorter reaction times (RTs). Third, we hypothesized phobic participants to be less sensitive to variations in externally imposed (or "objective") expectancies than low spider fearful controls because of the a priori conviction of the former that they incur an increased risk of encountering (or detecting) spiders (e.g., de Jong and Muris, 2002), even when objective background information regarding the likelihood of an encounter is given (Aue and Hoeppli, 2012).

## **MATERIALS AND METHODS**

### **PARTICIPANTS**

Thirty-one participants [16 spider phobic; 8 male (4 in spider phobic group)], aged between 19 and 46 years (*M* = 27.1, *SD* = 6.03), were recruited via ads placed in university buildings and on university and local websites. These ads looked for participants who were either extremely fearful of spiders (including strong physiological response and avoidance), or displayed particularly low fear of these animals. The study was embedded in a larger project investigating decision making, psychophysiological, and central nervous responses while imagining encounters with feared and non-feared animals. The ads explicitly specified these project aims. Persons interested in the study were given a telephone interview and screened with the *Diagnostic and Statistical Manual of Mental Disorders* (4th Edn., Text Rev.; American Psychiatric Association, 2000) and the *International Classification of Diseases* (10th Rev.; World Health Organization, 1992) for criteria for the presence or absence of spider phobia [adapted from Mühlberger et al. (2006)].

Apart from meeting or not meeting criteria for spider phobia, fear of spiders was also assessed by asking the participants to rate their respective fears on a scale from 0 (no fear at all) to 100 (maximal or extreme fear). Spider phobic individuals rated their fear of spiders much higher than low spider fearful control participants did, *t*(29) = 16.06, *p* < 0.000001 (*M*s = 81.4 and 18.7, respectively). Fear of spiders was further assessed after the experiment by the use of the French translation of the Fear of Spiders Questionnaire (Szymanski and O'Donohue, 1995), *t*(29) = 7.32, *p* < 0.000001 (*M*s = 86.2 and 24.3).

### **STIMULI**

Search array (attention): Stimuli consisted of (a) 30 pictures displaying spiders, all taken from a recently created picture base (Dan-Glauser and Scherer, 2011); (b) 30 birds, collected from the Internet; and (c) 100 butterflies, also collected from the Internet. In each trial, the search array consisted of a matrix of nine different animal pictures with three columns and three rows. There was an equal probability for the spiders and the birds to appear in any of the nine different locations within the matrix. The stimuli were matched for luminance and contrast and displayed in gray scale.

Cues (expectancy): Three different types of cues were presented specifying either "spider 90%," "bird 90%," or "spider bird 50%" (for half of the participants; for the other half, the latter said "bird spider 50%"). These cues specified the probability that the to-be detected deviant in a subsequently presented search array would be a spider or a bird.

In reality, the spider 90% (bird 90%) cue condition referred to a probability of 71% (69 trials) that there would actually be a spider (bird) among eight butterflies in the search array presented thereafter. In the remaining cases, either a bird (spider) was presented (23 trials) or no deviant at all (five trials). The latter trials were included to verify that the participants responded on the basis of the target perception.

In the 50% cue condition, there was an equal likelihood that either a spider or a bird would be the deviant in the subsequently shown search array (46 trials; 23 spider deviants and 23 bird deviants). In some cases, there was no deviant (five trials).
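The 71% figure follows directly from the trial counts: 69 of the 69 + 23 + 5 = 97 trials that followed a 90% cue showed the cued animal. The short Python sketch below recomputes the actual outcome proportions for each cue condition. Only the trial counts come from the text; the dictionary layout and the function name `deviant_proportions` are illustrative assumptions.

```python
# Trial counts per cue condition, as reported in the text.
# The dictionary layout and names are illustrative, not the authors' code.
TRIALS = {
    "spider 90%": {"spider": 69, "bird": 23, "none": 5},
    "bird 90%": {"spider": 23, "bird": 69, "none": 5},
    "50%": {"spider": 23, "bird": 23, "none": 5},
}


def deviant_proportions(counts):
    """Return the share of trials on which each outcome followed the cue."""
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}


for cue, counts in TRIALS.items():
    shares = deviant_proportions(counts)
    print(cue, {outcome: round(share, 2) for outcome, share in shares.items()})
```

The same calculation shows that, once the no-deviant trials are counted, each animal followed the 50% cue on 23 of 51 trials.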

### **PROCEDURE**

Upon the participants' arrival at the laboratory, the nature of the experiment was explained and written informed consent was obtained (protocol approved by the local ethics committee). The experimental task was introduced as a test of the capacity to detect spiders and birds in an array of butterflies. After participants had thoroughly read the task instructions, they performed 10 practice trials to become familiar with the task.

**Figure 1** shows the sequence of events in an experimental trial. In each trial, participants saw a fixation cross for 2000–3000 ms that was followed by a cue presented for 1500 ms. These cues referred to the probability that the to-be detected deviant in the subsequently presented search array would be a spider or a bird (see preceding section for further details regarding the expectancy cues). After the presentation of the cue, another fixation cross appeared for 2000–3000 ms. Next, the search array, consisting of nine pictures [either nine butterflies (no deviant), eight butterflies and a spider, or eight butterflies and a bird] was shown for 2500 ms. The participants had to decide whether there was no deviant, or whether the deviant was a spider or a bird.

Participants were instructed to react as fast and correctly as possible. Responses were given by pressing three different keys on the computer keyboard; the keys attributed to spiders and birds were counterbalanced across participants. A total of 244 experimental trials were presented in random order [four runs of 61 trials with short pauses in between; the frequencies of trials of different kinds (cues, deviants) were comparable between runs]. The next trial began immediately after the detection period had elapsed. The inter-trial interval was jittered around 9 s.
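The trial timeline described above can be sketched in a few lines of Python. The event durations are taken from the text; the uniform distribution of the fixation jitter and all names (`sample_trial`, `expected_trial_ms`) are assumptions made for illustration. Under that assumption the mean trial lasts 9000 ms, which would be consistent with the reported interval of about 9 s if that interval is measured from trial onset to trial onset.

```python
import random

# Durations in ms taken from the procedure; uniform jitter is an assumption.
FIXATION_RANGE_MS = (2000, 3000)
CUE_MS = 1500
SEARCH_MS = 2500


def sample_trial(rng):
    """Return one trial as a list of (event, duration_ms) pairs."""
    return [
        ("fixation", rng.uniform(*FIXATION_RANGE_MS)),
        ("cue", CUE_MS),
        ("fixation", rng.uniform(*FIXATION_RANGE_MS)),
        ("search_array", SEARCH_MS),
    ]


def expected_trial_ms():
    """Mean trial duration under the uniform-jitter assumption."""
    mean_fixation = sum(FIXATION_RANGE_MS) / 2
    return 2 * mean_fixation + CUE_MS + SEARCH_MS


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trial = sample_trial(rng)
print(sum(duration for _, duration in trial), expected_trial_ms())
```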

In a post-experimental questionnaire, the participants specified whether (a) they had paid attention to the cues; (b) it had been easier for them to detect the spider rather than the bird (and the reverse question); and (c) there was a greater risk of a spider rather than a bird being the deviant when the 50% cue had been presented (and the reverse question).

**FIGURE 1 |** … target. Participants were told that the cues described the likelihood of a spider or a bird being the deviant in the search array. They were asked to respond as quickly and accurately as possible according to the target (i.e., spider, bird, or no target).

After the participants had completed the Fear of Spiders Questionnaire, they were debriefed.

### **DEPENDENT VARIABLES**

The dependent variables consisted of the participants' RTs for the correct responses. Errors made up ∼5% of all responses (*SD* = 3%). We also analyzed differences in expectancies as assessed via the post-experimental questionnaire (see below for details).

### **DATA ANALYSIS**

#### *Reaction times*<sup>3</sup>

A 2 × 3 × 2 analysis of variance (ANOVA) with the between-participants factor *group* (spider phobic, low spider fearful control) and the within-participants factors *expectancy* [spider cue (spider 90%), bird cue (bird 90%), ambiguous cue (spider bird 50%/bird spider 50%)], and *target* (spider, bird) was performed on RTs. Significant effects were further investigated by the use of *post-hoc* Tukey tests. An α level of 0.05 (two-tailed) was applied. All reported effect sizes are partial η<sup>2</sup> and are simply noted as η<sup>2</sup>.

The hypotheses led us to expect first a stronger attention bias for spiders (i.e., a greater difference in RTs between spiders and birds) in phobics than in low spider fearful controls, reflected in a significant interaction of the factors group and target. Second, we anticipated an effect of congruency, with detection of spider targets being facilitated by spider cues and detection of bird targets being facilitated by bird cues. Therefore, a significant interaction of expectancy cue and target was predicted. The third hypothesis stated that phobic participants would be less susceptible than control participants to externally induced expectations because of the strong and comparably persistent tendency to expect encounters with spiders in the former group (e.g., de Jong and Muris, 2002; Aue and Hoeppli, 2012). Consequently, we predicted stronger congruency effects in controls than in phobics, as revealed by a significant three-way interaction of group, expectancy cue, and target.

<sup>3</sup>We also performed analyses on logarithmic RTs and excluded outliers (±3 *SD* from individual average RT). However, effects observed in the current study were not affected by these transformations of the data. Therefore, results for only the original data will be described.
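The control analyses mentioned in footnote 3 (log-transformed RTs and exclusion of RTs beyond ±3 *SD* of each participant's mean) could look roughly like the Python sketch below. It is not the authors' code: the single-pass exclusion, the use of the sample standard deviation, the function names, and the RT values are all assumptions for illustration.

```python
import math
import statistics


def exclude_outliers(rts_ms, n_sd=3.0):
    """Drop RTs farther than n_sd sample SDs from this participant's mean.

    Single pass: the mean and SD are computed once, on all RTs.
    """
    mean = statistics.mean(rts_ms)
    sd = statistics.stdev(rts_ms)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    return [rt for rt in rts_ms if low <= rt <= high]


def log_transform(rts_ms):
    """Natural-log transform, often used to reduce the right skew of RTs."""
    return [math.log(rt) for rt in rts_ms]


# Hypothetical RTs (ms) for one participant, with one very slow response.
rts = [960, 980, 1000, 1020, 1040] * 4 + [2400]
kept = exclude_outliers(rts)
print(len(rts), len(kept), round(statistics.mean(log_transform(kept)), 3))
```

Because the cut-offs are computed per participant, a response that is slow for one person can be retained for another whose RTs are generally longer or more variable.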

#### *Post-experimental questionnaire*

Group differences for the questions in the post-experimental questionnaire, to which the participants replied with either "yes" or "no," were investigated with a χ<sup>2</sup> test (*df* = 1). An α level of 0.05 (two-tailed) was applied.

## **RESULTS**

### **REACTION TIMES**

The 2 (group: spider phobic, low spider fearful control) × 3 (expectancy: spider, bird, ambiguous) × 2 (target: spider, bird) ANOVA yielded both a significant main effect of expectancy, *F*(2, 58) = 17.76, *p* < 0.000005, η<sup>2</sup> = 0.38 (*M*s = 1089.2, 1018.4, and 1041.1 ms, for spider, bird, and ambiguous, respectively), and a significant main effect of target, *F*(1, 29) = 37.23, *p* < 0.000005, η<sup>2</sup> = 0.56 (*M*s = 960.8 and 1138.4 ms, for spider and bird, respectively; see also **Figure 2**). These effects were qualified by the higher-order interactions described in the following paragraphs.

In accordance with our first hypothesis, in which we predicted a stronger attention bias for spiders being present in phobics compared with controls, the interaction group × target achieved significance, *F*(1, 29) = 17.50, *p* < 0.0005, η<sup>2</sup> = 0.38 (phobics: *M*s = 924.6 and 1224.0 ms, for spider and bird, respectively; controls: *M*s = 997.0 and 1052.8 ms). Overall, controls did not display different RTs for the detection of spiders and birds (Tukey test for this pairwise comparison: *p* > 0.54), whereas phobics demonstrated particularly slow RTs for the detection of birds (compared with RTs for spiders, as well as compared with RTs for both spiders and birds in the control group; all *p*s < 0.05).

Consistent with our second hypothesis, in which we predicted that expectancy cues would facilitate RTs with respect to congruent targets, the ANOVA revealed a significant interaction of expectancy × target, *F*(2, 58) = 23.24, *p* < 0.000001, η<sup>2</sup> = 0.44 (spider targets: *M*s = 954.1, 947.2, and 981.0 ms, for spider, bird, and ambiguous cue, respectively; bird targets: *M*s = 1089.7, 1224.4, and 1101.3 ms, for bird, spider, and ambiguous cue, respectively). Somewhat surprisingly, though, expectancy effects were limited to the detection of birds; *post-hoc* Tukey tests for this interaction showed no difference between spider detections related to the three expectancy cues (*p*s > 0.63; all other *p*s corresponding to pairwise comparisons for this interaction < 0.0005).

The interaction of group, expectancy, and target, *F*(2, 58) = 1.79, *p* = 0.18, η<sup>2</sup> = 0.06, did not reach significance. Nevertheless, on the basis of our third a priori hypothesis that experimentally induced expectancies would modulate detection times in low spider fearful controls more strongly than in phobics, we performed analyses separately for phobics and controls. Phobics displayed a significant main effect of expectancy, *F*(2, 30) = 9.00, *p* < 0.001, η<sup>2</sup> = 0.37 (*M*s = 1117.2, 1038.8, and 1067.0 ms, for spider, bird, and ambiguous, respectively), due to prolonged RTs for spider cues; a significant main effect of target, *F*(1, 15) = 36.28, *p* < 0.00005, η<sup>2</sup> = 0.71 (*M*s = 924.6 and 1224.0 ms, for spider and bird, respectively), due to faster RTs for spider compared with bird targets; and a significant interaction of both factors, *F*(2, 30) = 9.11, *p* < 0.001, η<sup>2</sup> = 0.38 (spider targets: *M*s = 928.8, 912.0, and 933.0 ms, for spider, bird, and ambiguous cue, respectively; bird targets: *M*s = 1165.6, 1305.7, and 1200.8 ms, for bird, spider, and ambiguous cue, respectively). Tukey tests revealed that expectancy cues did not differentially influence the detection of spiders (*p*s > 0.93, for the three corresponding pairwise comparisons)<sup>4</sup>. By contrast, expectancy cues clearly influenced the detection of birds in the phobic group, with spider cues leading to slower detections than both bird cues and ambiguous cues (*p*s < 0.001), and no difference between the latter two (*p* > 0.61).

<sup>4</sup>But detection of spiders was altogether more rapid than detection of birds (all *p*s < 0.0005), reflecting a strong and persistent attention bias to fear evoking compared with neutral information in phobic participants.

Similarly, in the low spider fearful controls, all effects achieved (or approached) significance: a main effect of expectancy, *F*(2, 28) = 9.34, *p* < 0.0005, η<sup>2</sup> = 0.40, again due to slowed RTs for spider cues (*M*s = 1061.2, 998.1, and 1015.4 ms, for spider, bird, and ambiguous, respectively); a main effect of target, *F*(1, 14) = 5.13, *p* < 0.07, η<sup>2</sup> = 0.22, due to faster RTs for spider compared with bird targets (*M*s = 997.0 and 1052.8 ms, for spider and bird, respectively); and an interaction of expectancy × target, *F*(2, 28) = 15.04, *p* < 0.00005, η<sup>2</sup> = 0.52 (spider targets: *M*s = 979.4, 982.5, and 1029.0 ms, for spider, bird, and ambiguous cue, respectively; bird targets: *M*s = 1013.7, 1143.1, and 1001.7 ms, for bird, spider, and ambiguous cue, respectively). Tukey tests revealed that the detection of birds was slowed when spider cues had been previously presented (*p*s < 0.005 with respect to all other conditions); no difference was observed between the remaining conditions (*p*s > 0.38). These results demonstrate that, similar to the results in the phobic group, experimentally induced expectancies did not impact RTs for spider deviants in the control group, while they did influence detection of deviant birds.
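The asymmetry between spider and bird targets can be made concrete by computing, from the cell means reported above, the cost of an invalid cue relative to a valid cue for each target in each group. The snippet below is a descriptive sketch only: the dictionary layout and the function name `invalid_cue_cost` are illustrative assumptions, the values are copied from the text, and the differences are no substitute for the Tukey tests on which the conclusions rest.

```python
# Cell means in ms, copied from the Results text; keys are (cue, target).
PHOBICS = {
    ("spider", "spider"): 928.8, ("bird", "spider"): 912.0, ("ambiguous", "spider"): 933.0,
    ("bird", "bird"): 1165.6, ("spider", "bird"): 1305.7, ("ambiguous", "bird"): 1200.8,
}
CONTROLS = {
    ("spider", "spider"): 979.4, ("bird", "spider"): 982.5, ("ambiguous", "spider"): 1029.0,
    ("bird", "bird"): 1013.7, ("spider", "bird"): 1143.1, ("ambiguous", "bird"): 1001.7,
}


def invalid_cue_cost(means, target):
    """Mean RT after the incongruent cue minus mean RT after the congruent cue."""
    other_cue = "bird" if target == "spider" else "spider"
    return round(means[(other_cue, target)] - means[(target, target)], 1)


for label, means in (("phobics", PHOBICS), ("controls", CONTROLS)):
    for target in ("spider", "bird"):
        print(label, target, invalid_cue_cost(means, target))
# phobics spider -16.8
# phobics bird 140.1
# controls spider 3.1
# controls bird 129.4
```

In both groups the cost of a spider cue preceding a bird target exceeds 100 ms, whereas the corresponding difference for spider targets is within a few tens of milliseconds of zero.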

### *Post-experimental questionnaire <sup>5</sup>*

Most phobic and control participants said they had paid attention to the cues (phobics: 12 of 14, controls: 10 of 15; no group difference: χ<sup>2</sup> = 1.43, *ns*), demonstrating that most of our participants followed task instructions quite well. In line with the RT data, showing a stronger attention bias in phobics compared with controls, most phobics indicated that it had been easier for them to detect spiders rather than birds (9 of 14), whereas less than a third of the control participants thought it had been so (4 of 15), χ<sup>2</sup> = 4.14, *p* < 0.05. Finally, more phobics than controls specified a greater risk of a spider rather than a bird having been the deviant when the 50% cue had been presented (phobics: 9 of 14, controls: 1 of 15), χ<sup>2</sup> = 9.96, *p* < 0.005.
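For readers who wish to trace how a χ<sup>2</sup> value with *df* = 1 arises from such yes/no counts, the following sketch computes the Pearson statistic for a 2 × 2 table. It assumes an uncorrected Pearson χ<sup>2</sup> (no continuity correction); the source does not state which variant was used, although this assumption reproduces the values reported for the attention-to-cues and ease-of-detection questions. The function names are illustrative, not taken from the original analysis.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Ease of detecting spiders: phobics 9 yes / 5 no, controls 4 yes / 11 no.
chi2_ease = pearson_chi2_2x2(9, 5, 4, 11)
print(round(chi2_ease, 2), chi2_p_value_df1(chi2_ease) < 0.05)  # 4.14 True

# Attention paid to cues: phobics 12 yes / 2 no, controls 10 yes / 5 no.
chi2_attn = pearson_chi2_2x2(12, 2, 10, 5)
print(round(chi2_attn, 2), chi2_p_value_df1(chi2_attn) < 0.05)  # 1.43 False
```

With cell counts this small, some analysts would prefer an exact test; the sketch simply mirrors the χ<sup>2</sup> approach named in the Data Analysis section.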

## **DISCUSSION**

First, we had hypothesized that phobic individuals would display a stronger attention bias than controls [cf. meta-analytic data reported by Bar-Haim et al. (2007), and a review of the topic in Okon-Singer et al. (2013)]. Data in the current project are supportive of this hypothesis. We observed a strong and persistent attention bias for spiders in spider phobics. An attention bias for spiders existed also in the low spider fearful control group; however, it was much smaller than in phobics and limited to specific situational requirements: Prolonged RTs for the detection of birds rather than spiders in control participants were observed only when spider expectations had been induced. Such greater context dependency of the attentional bias in controls might explain the inconsistency of results regarding the existence of an attention bias for threatening animals in healthy individuals (positive findings: e.g., Öhman et al., 2001; Lipp and Waters, 2007; null findings: e.g., Tipples et al., 2002; Lipp et al., 2004).

It is noteworthy that, in contrast to Bar-Haim et al.'s (2007) meta-analytic data, the difference between spider phobic and low spider fearful control participants in the current study was not so much based on the particularly rapid RTs of spider phobics for spiders, but more strongly on the particularly slow RTs of these participants for birds. It is possible that the phobic participants did not trust the bird cues and, due to an a priori expectancy bias (e.g., de Jong and Muris, 2002; Aue and Hoeppli, 2012) that was independent of our experimental manipulations, expected spiders with a higher likelihood than birds. Spider phobics may therefore have started to selectively and quickly scan the visual array for spiders in all trials. Such biased stimulus processing may have impeded attention engagement in birds until it had been determined that there really were no spiders. Support for such an idea comes from the responses of the phobics to the post-experimental questionnaire, in which they even retrospectively specified a greater risk of a spider rather than a bird having been the deviant when the 50% cue had been presented. The fact that participants were told that spider and bird cues indicated a likelihood of 90% for the cued animal to be the deviant target in the visual search array, but the actual likelihood was only 71%, may have further increased the distrust in the cues.

Second, we had predicted an effect of congruency in that variations in the experimentally induced expectancies would modulate attention engagement in both groups of participants, with expected deviants leading to a shortening of RTs and unexpected deviants leading to a slowing of RTs. Such an effect was demonstrated earlier for search arrays presenting neutral items (e.g., Wolfe et al., 2003; Burra and Kerzel, 2013). Burra and Kerzel (2013), for instance, examined attentional capture during a visual search task while varying the predictability of neutral targets. Their findings show that predictability influenced attention engagement by modulating the search mode (singleton vs. feature search). In line with these results, we had predicted that prior expectancies given by the cues would modulate attention engagement in the current study.

In a general sense, we were indeed able to show that experimentally induced "objective" expectancies can impact RTs related to the detection of deviant animals in a visual search array. These findings hence support the existence of a causal link between expectancies and attention engagement. However, contrary to our predictions, this link is not simple: Our ANOVA results suggest that the experimentally manipulated expectancies did not influence attention deployment for the detection of spiders in either the spider phobics or the low spider fearful controls. Yet, replicating earlier results for neutral stimulus material (e.g., Burra and Kerzel, 2013), the experimentally induced expectancies, specifically the spider cues, had an impact on the detection of birds. Thus, the deviant needed to be neutral for an increase in RTs to be detected in the invalid trials. This pattern of results suggests that the detection of threatening stimuli relies on different mechanisms than those required for the detection of neutral stimuli.

The modulation of attention capture by cues shown for bird targets is in line with models of visual detection that emphasize the role of priming and working memory representations in the modulation of visual search. For example, according to the attentional engagement theory (Duncan and Humphreys, 1989, 1992), attentional selection is modulated by templates actively maintained in memory. Similarly, the guided search model (Wolfe, 2003, 2010; Wolfe et al., 2003) argues that an a priori map guides subsequent search behavior. This top-down attention guidance may be manipulated via explicit task demands, or implicitly via expected target identity created through priming. Together, these theories suggest that if a sensory input matches a set of predefined properties, it will lead to involuntary shifts of attention. The results for birds in our study suggest that our experimental manipulation of expectancies may have influenced the content of such an a priori map (i.e., set of predefined properties), but these manipulations clearly did not affect attention engagement to threatening information.

<sup>5</sup>As a result of time limitations, two phobic participants did not complete the post-experimental questionnaire.

That there was no congruency effect for spider targets in phobics can be possibly explained by the fact that phobic participants are characterized by generally increased a priori expectancies of being presented with images of spiders (e.g., de Jong and Muris, 2002; Aue and Hoeppli, 2012). The experimentally induced expectancies in the present study may have influenced these habitual encounter expectancies only slightly, or even not at all, thereby being ineffective in producing significant changes in the habitually increased vigilance for spiders in spider phobics. If this effect were observed in phobics but not in controls, the data would be consistent with our third hypothesis that stated that phobic participants would be less sensitive than controls to variations in externally imposed (or "objective") expectancies.

Yet, contrary to our predictions, experimentally induced expectancies did not influence RTs for spiders in controls either, and these participants are not generally characterized by increased expectancies of encounters with spiders (Aue and Hoeppli, 2012). Therefore, our data do not allow the conclusion that an external induction of expectancies regarding spider encounters is generally more successful in non-fearful control participants than in phobic participants. That both groups of participants were able to follow task instructions and adopt different expectancy states according to the cues presented, however, is proven by the participants' differentiated responses to birds. Hence, the observed null finding for an influence of expectancies on spider detection speaks to specialized attention engagement toward threatening information.

Diverse kinds of specialized attention engagement in threatening information have been reported before (Öhman et al., 2001; Notebaert et al., 2011; see review in Yiend, 2010). Öhman et al. (2001) showed faster detection of threatening animals compared with flowers and mushrooms in non-phobic individuals. Interestingly, this quicker detection was amplified in phobic individuals [contrary to the current study, but in line with Bar-Haim et al.'s (2007) review; for a discussion of these inconsistencies, see first hypothesis above]<sup>6</sup>. Because an attention bias was found for phobic and non-phobic participants, the findings were interpreted as a preattentive prioritization of threat information due to biological preparedness. However, despite the observation that detection of fear-relevant animals is prioritized, it has been shown that the degree of facilitation depends on the number of distracting items in the search array, thus contradicting the idea of preattentive processing of threatening stimuli (Batty et al., 2005; Notebaert et al., 2010, 2011).

Our findings are consistent with the view of a universal (i.e., not linked to high levels of fear) evolutionary heritage for the processing of threat<sup>7</sup>. This view may explain why the detection of evolutionarily salient stimuli such as threatening animals (here: spiders) is not as penetrable to experimental expectancy manipulations as the detection of non-threatening stimuli (for similar findings using a dot-probe task in unselected participants, see Lipp and Derakshan, 2005). Such an evolutionary mechanism would be in line with the idea of biological preparedness for certain classes of stimuli (Seligman, 1971; Öhman and Mineka, 2001) and would ensure quick adaptive behavioral responses in the service of survival (e.g., Flykt et al., 2012), responses that can be initiated without requiring adequate expectancy states.

It has been proposed that fear responses can be rapidly mediated by a network of subcortical structures (the so-called fear module, including, for instance, the amygdala; Lang et al., 1998; LeDoux and Phelps, 2000; Davis and Lang, 2003). The amygdala is capable of initiating a defense response via connections to the hypothalamus and the brainstem, even without conscious processing of information regarding a threatening stimulus. Therefore, and because of the persistence of phobias despite the explicit knowledge that a feared object is not harmful, the fear module has been proposed to be "impenetrable to conscious cognitive control" (Öhman and Mineka, 2001, p. 515). The fear module may hence ensure automatic processing of survival-relevant stimuli that is independent of explicit (or externally imposed) expectancies.

However, it is also possible that the spider pictures in our study popped out due to a common physical characteristic (e.g., curved line body with eight legs). This could alternatively explain why spiders were in general detected very fast and why cue manipulation did not affect RTs for spider pictures in either group. Birds may lack such a pop-out characteristic. As a consequence, our cues may have affected the detection of birds only. Note, however, that fast capture of attention by stimuli associated with threatening animals was previously shown even when visual features had been controlled for. Batty et al. (2005) used a visual search task with conditioned stimuli. Participants with high or low fear of spiders or snakes detected abstract shapes that had been paired earlier with either a neutral picture or a picture of their feared animal. In this study, both high and low fearful participants were overall faster at detecting targets that were associated with negative animals.

In line with the authors' conclusion that visual features are not the reason for facilitated capture of attention by threatening items, we do not think that differences in visual features can fully explain the effects found in the current study. Two additional reasons further strengthen our conviction on this issue. First, the butterfly distracters in our study were selected on the basis of the assumption that they share significant features with both spiders *and* birds (e.g., wings corresponding to birds' wings; six legs + two antennas corresponding to the eight legs of spiders). About 75% of the butterflies were displayed from their side, clearly showing all legs and antennas. All images were displayed in gray scale, preventing pop-out effects based on color differences between stimulus categories. Second, if pop-out effects had been responsible for our effects, controls should have displayed an overall greater discrepancy of RTs for spiders and birds.

<sup>6</sup>Note as well that there were significant differences between the experimental paradigms used in these earlier studies and our own. For instance, the earlier studies did not use expectancy cues.

<sup>7</sup>As outlined above, such processing is not necessarily preattentive.

Subsequent studies should test for the existence of an influence of attentional processes on expectancies, for instance by manipulating vigilance to or avoidance of threat. Deviations in attention (e.g., selective visual attention) may lead to biased expectancies about future outcomes because reality is experienced in specific or selective ways. Adding eye-tracking and event-related potentials to the research tool inventory might help to identify basic mechanisms underlying phobia-related processing biases, their time course, and their interdependence. Neuroimaging may further increase knowledge by directly examining the impact of the suggested subcortical regions (e.g., in the so-called fear module; Lang et al., 1998; LeDoux and Phelps, 2000; Öhman and Mineka, 2001; Davis and Lang, 2003).

It is also important to note that our control group displayed particularly low fear of spiders and therefore might be characterized by specific responding. We cannot rule out that some of the control participants specifically liked these animals, because we did not assess independent data on the pleasantness or appeal of the animals in the current study. The question of whether participants characterized by "normal" fear of spiders exhibit the same pattern of response as phobics and low fearful controls remains to be investigated. Another aspect that should be examined is whether the effects observed by us can be reproduced for threatening stimuli other than animals that have been proven dangerous throughout evolution (e.g., guns and knives) in order to test the biological preparedness account. Finally, future studies are needed to rule out the possibility that the differences in RTs we observed in the current study are related to motor processes rather than attention engagement.

In conclusion, both phobics and low spider fearful controls showed an attention bias and less influence of expectancy cues in the detection of spiders, in line with the biological preparedness view. However, the attention bias was larger in phobics, whereas it was restricted to specific expectancy conditions (i.e., spider cues, which produced a slowing of the detection of bird targets) in controls. Taken together, our results highlight the relation between expectancies and attention engagement in general. However, the influence of expectancies may be inhibited during the processing of threatening stimulus material, and expectancies may play a greater role in safe compared with threatening environments. Thus, at first glance, our data challenge the hypothesis that expectancy bias is at the origin of attention bias. Nonetheless, we cannot rule out that our participants exhibited a priori expectancy biases (leading to a preferential processing of spider-related targets) and that these biases were too strong to be overruled by the expectancy cues used in the present study. Future research should eliminate this possibility before safe conclusions can be drawn. In general, investigating the mutual relations between expectancy and attention biases may lead to a more comprehensive model of processing of threat in health and anxiety disorders. Notably, although attention and expectancy biases are core features in phobia and anxiety disorders, these biases were mostly investigated separately and their causal impact has not been examined. The current findings add much needed data to this emerging field of combining different types of bias. Such an approach may lead to therapeutic approaches that are more effective than selective targeting of either attention or expectancy bias.

## **ACKNOWLEDGMENTS**

This research was supported by grant PZ00P1\_121590 of the Swiss National Science Foundation to Tatjana Aue.

## **REFERENCES**


outcome desirability on optimism. *Psychol. Bull.* 133, 95–121. doi: 10.1037/0033-2909.133.1.95


flight-phobic subjects in cognitive and physiological responses to disorder-specific stimuli. *J. Abnorm. Psychol.* 115, 580–589. doi: 10.1037/0021-843X.115.3.580


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 April 2013; accepted: 12 July 2013; published online: 08 August 2013. Citation: Aue T, Guex R, Chauvigné LAS and Okon-Singer H (2013) Varying expectancies and attention bias in phobic and non-phobic individuals. Front. Hum. Neurosci. 7:418. doi: 10.3389/fnhum.2013.00418*

*Copyright © 2013 Aue, Guex, Chauvigné and Okon-Singer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*