# THE COGNITIVE NEUROSCIENCE OF VISUAL WORKING MEMORY

EDITED BY: Natasha Sigala and Zsuzsa Kaldy PUBLISHED IN: Frontiers in Systems Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2017 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88945-168-5 DOI 10.3389/978-2-88945-168-5


# **THE COGNITIVE NEUROSCIENCE OF VISUAL WORKING MEMORY**

Topic Editors: **Natasha Sigala,** University of Sussex, UK **Zsuzsa Kaldy,** University of Massachusetts Boston, USA

Collage of brain activations during a working memory task seen from different angles. Full task and results description in Minati L, Sigala N (2013) PLoS ONE 8(9): e73746. doi:10.1371/journal.pone.0073746. Image copyright: Natasha Sigala

Visual working memory allows us to temporarily maintain and manipulate visual information in order to solve a task. The study of the brain mechanisms underlying this function began more than half a century ago, with Scoville and Milner's (1957) seminal discoveries with amnesic patients. This timely collection of papers brings together diverse perspectives on the cognitive neuroscience of visual working memory from multiple fields that have traditionally been fairly disjointed: human neuroimaging, electrophysiological, behavioural and animal lesion studies, investigating both the developing and the adult brain.

**Citation:** Sigala, N., Kaldy, Z., eds. (2017). The Cognitive Neuroscience of Visual Working Memory. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-168-5

# Table of Contents


#### **3. Developmental approaches**

*Neonatal Perirhinal Lesions in Rhesus Macaques Alter Performance on Working Memory Tasks with High Proactive Interference*

Alison R. Weiss, Ryhan Nadji and Jocelyne Bachevalier


Allison Fitch, Hayley Smith, Sylvia B. Guillory and Zsuzsa Kaldy


Sumie Leung, Denis Mareschal, Renee Rowsell, David Simpson, Leon Iaria, Amanda Grbic and Jordy Kaufman

*ERP markers of target selection discriminate children with high vs. low working memory capacity*

Andria Shimi, Anna Christina Nobre and Gaia Scerif

# Editorial: The Cognitive Neuroscience of Visual Working Memory

Zsuzsa Kaldy <sup>1</sup> \* and Natasha Sigala<sup>2</sup> \*

*<sup>1</sup> Department of Psychology, University of Massachusetts Boston, Boston, MA, USA, <sup>2</sup> Brighton and Sussex Medical School, University of Sussex, Brighton, UK*

Keywords: visual working memory, neuroimaging, development, prefrontal cortex, delay activity, fronto-parietal network, capacity, infants

**Editorial on the Research Topic**

#### **The Cognitive Neuroscience of Visual Working Memory**

Visual working memory (VWM) allows us to temporarily maintain and manipulate visual information in order to solve a task. The study of the brain mechanisms underlying this function began more than a half century ago, with Scoville and Milner's (1957) seminal discoveries with amnesic patients. As of 2016, more than 4000 studies have examined the brain mechanisms underlying VWM. In this Research Topic, our goal was to bring together perspectives on the cognitive neuroscience of VWM from multiple fields that have traditionally been fairly disjointed: human neuroimaging, electrophysiological and animal lesion studies, both in adults and in development.

Edited and reviewed by: *Maria V. Sanchez-Vives, Institut D'Investigacions Biomediques August Pi I Sunyer, Spain*

#### \*Correspondence:

*Zsuzsa Kaldy zsuzsa.kaldy@umb.edu Natasha Sigala n.sigala@bsms.ac.uk*

Received: *28 November 2016* Accepted: *04 January 2017* Published: *19 January 2017*

#### Citation:

*Kaldy Z and Sigala N (2017) Editorial: The Cognitive Neuroscience of Visual Working Memory. Front. Syst. Neurosci. 11:1. doi: 10.3389/fnsys.2017.00001*

The classic model of VWM posits that persistent delay activity in the prefrontal cortex (PFC) is both necessary and sufficient to mediate visual working memory. Riley and Constantinidis contribute a thorough review of relevant primate studies and provide compelling fresh evidence for this model. They survey a number of alternative models of VWM, concluding that each can mediate only a limited range of memory-dependent behaviors, and give a detailed account of the tissue characteristics that make the PFC uniquely specialized to support this function.

Further support for the classic model is provided by Boschin and Buckley, who enhance it by offering an account of the functions of the frontopolar cortex (FPC) from a series of pioneering lesion and behavioral studies in the non-human primate. Specifically, they suggest that the FPC supports the exploration and evaluation of relative values of novel alternatives, some of which may turn out to be distractors, while the dorsolateral PFC maintains, manipulates, and selects relevant information, rules and strategies for the task at hand. Mansouri et al. review the role of VWM in executive control functions, with an emphasis on abstract features and on representations of errors and conflicts that support adaptive behavioral adjustments. They note that primate performance in a Wisconsin Card Sorting Task analog is disrupted after lesions not only of the dorsolateral PFC and orbitofrontal cortex, but also of the anterior cingulate cortex. Tsutsui et al. offer an integration of findings on visuospatial WM from two animal models: primates and rodents. Both lesion and single unit studies, together with anatomical patterns of frontoparietal connectivity, indicate that the dorsolateral PFC in the macaque is analogous in function to the medial PFC in the rat.

The alternative model of PFC delay activity, which posits that it serves as a top-down signal that modulates posterior sensory areas rather than encoding stimulus information per se (see D'Esposito and Postle, 2015, for a recent review), has also received experimental support and is represented in this Research Topic. Desrochers et al. present data on the human and non-human primate rostrolateral PFC during error and conflict monitoring in task sequences. Lorenc et al.'s human neuroimaging study combines PFC disruption via TMS with behavioral data and multivariate analysis of fMRI data, and provides evidence for the causal role of PFC in top-down tuning of posterior sensory areas. This tuning was also shown to change dynamically according to current task goals.

Lara and Wallis critically review studies that report delay activity in the PFC as the neural correlate of VWM. They cast doubt on the claim that stimulus-relevant information is encoded in the PFC, and suggest long-range synchronization of oscillations as a candidate mechanism by which the PFC exerts top-down control on sensory neurons. Lee and Baker provide further evidence for the alternative model by reviewing imaging evidence for the topography of maintained information during VWM tasks. They conclude that VWM is a highly distributed process, and claim that the relevant information can be maintained in any of the systems involved in the initial stages of perceptual processing.

Wolff et al. contribute a proof-of-principle experimental EEG study that explores the possibility of exposing hidden states of VWM, employing a functional perturbation approach combined with multivariate decoding. Finally, Ambrose et al. tested the inter-individual stability of behavioral and neural VWM capacity measures. They found that while results from their two different tasks (an easier color vs. a harder shape VWM task) correlated within individuals both in behavior and in brain activity (BOLD response in the occipital and parietal cortices), there were no significant brain-behavior correlations in capacity. Both of these empirical findings open up a lot of questions for future work.

Let us now turn to the works in this Research Topic that examined VWM from a developmental perspective. In 2004, we presented a summary of what was then known about the early development of VWM in humans (Káldy and Sigala, 2004) and we also put forward a novel hypothesis. Building on the more recent alternative model of VWM organization in the adult brain that distinguishes between a fronto-parietal control network and more posterior information storage areas in the ventral visual stream, we hypothesized that young infants may rely more on the posterior areas when solving tasks that involve VWM. In the more than 10 years since the publication of that review, some significant progress has been made on the developmental emergence of VWM systems in the brain, but there are still a lot of open questions.

Fitch et al. survey what is currently known about the emergence of these systems in the first five years of life. This minireview concludes that both networks seem to be active before the end of the first year of life in humans, and a few pioneering studies have already identified VWM capacity-dependent neural activity in the occipital and parietal cortices of infants and young children.

Two empirical studies in this Research Topic that tested human infants found VWM-related activity in the ventral visual stream. Prior EEG studies have demonstrated that gamma-band power in the temporal cortex increases during periods while infants are maintaining an object representation in VWM (Kaufman et al., 2003, 2005). Here, Leung et al. have shown that this EEG signal increases with memory load (two objects vs. one). Optical imaging (fNIRS) studies reported by Wilcox and Biondi provided converging evidence. The occipital cortex (and posterior temporal cortex in younger infants) was involved during all events when infants had to maintain object representations in VWM. In addition to this, the anterior temporal cortex was selectively activated when infants maintained two distinct objects in VWM.

The medial temporal lobe (which includes the hippocampus, entorhinal, perirhinal, and parahippocampal cortices) has extensive connections with both frontal and temporal areas. Weiss et al. demonstrated the role of the perirhinal cortex in VWM development. They tested adult macaques that received neurotoxic lesions in the perirhinal cortex when they were 1–2 weeks old and found that these animals were impaired in VWM tasks that required trial-to-trial updating of visual information.

Two articles in our Research Topic have examined the complex interactions between visual attention and memory during development. Reynolds and Romano review the existing literature in infants, and conclude that while the role of sustained attention in long-term memory encoding has been well understood (see e.g., the now-classic works of Richards, 1997), the same is not true for relations between sustained attention and VWM performance in early development. They echo the conclusions of Fitch et al. that "future research should aim to examine relations between attention and working memory in infancy and early childhood using both psychophysiological and neural measures."

We know more about attention-VWM interactions in older children, thanks to, among others, the EEG/ERP studies of Scerif and her colleagues. Shimi et al. investigated the magnitude of the N2pc in 10-year-old children and adults, and found that this neural signature of visual attention during the encoding phase of the task was related to their behavioral performance during the later recognition phase. This brain-behavior relationship was demonstrated on the individual level as well: children with large attentional cue benefits and high VWM capacity elicited an adult-like ERP response following attentional selection of the to-be-encoded item, whereas children with low VWM capacity did not.

In summary, this Research Topic includes nine up-to-date literature reviews and seven novel empirical studies approaching the neural mechanisms underlying visual working memory from human developmental, neuroimaging, and non-human mammalian perspectives. Together, they describe a common brain network that involves the frontoparietal control system, various processing stages of the ventral visual stream, and the medial temporal lobe with some differences in the weights and functions of the different structures. This extensive network seems to function in early infancy, and new multi-level approaches will help elucidate the details of the developmental trajectories.

#### AUTHOR CONTRIBUTIONS

The two authors contributed equally to the preparation of this manuscript.

#### ACKNOWLEDGMENTS

ZK was supported by National Institutes of Health grant R15HD086658 and a Seed Grant from the Simons Foundation under the auspices of the Simons Center for the Social Brain at MIT (#319294).

#### REFERENCES


comparison recognition-memory paradigm. Dev. Psychol. 33, 22–31. doi: 10.1037/0012-1649.33.1.22

Scoville, W. B., and Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. J. Neurol. Neurosurg. Psychiatry. 20, 11–21.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Kaldy and Sigala. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Role of Prefrontal Persistent Activity in Working Memory

Mitchell R. Riley and Christos Constantinidis \*

*Department of Neurobiology and Anatomy, Wake Forest School of Medicine, Winston-Salem, NC, USA*

The prefrontal cortex is activated during working memory, as evidenced by fMRI results in human studies and neurophysiological recordings in animal models. Persistent activity during the delay period of working memory tasks, after the offset of stimuli that subjects are required to remember, has traditionally been thought of as the neural correlate of working memory. In the last few years several findings have cast doubt on the role of this activity. By some accounts, activity in other brain areas, such as the primary visual and posterior parietal cortex, is a better predictor of information maintained in visual working memory and working memory performance; dynamic patterns of activity may convey information without requiring persistent activity at all; and prefrontal neurons may be ill-suited to represent non-spatial information about the features and identity of remembered stimuli. Alternative interpretations about the role of the prefrontal cortex have thus been suggested, such as that it provides a top-down control of information represented in other brain areas, rather than maintaining a working memory trace itself. Here we review evidence for and against the role of prefrontal persistent activity, with a focus on visual neurophysiology. We show that persistent activity predicts behavioral parameters precisely in working memory tasks. We illustrate that prefrontal cortex represents features of stimuli other than their spatial location, and that this information is largely absent from early cortical areas during working memory. We examine memory models not dependent on persistent activity, and conclude that each of those models could mediate only a limited range of memory-dependent behaviors. We review activity decoded from brain areas other than the prefrontal cortex during working memory and demonstrate that these areas alone cannot mediate working memory maintenance, particularly in the presence of distractors. 
We finally discuss the discrepancy between BOLD activation and spiking activity findings, and point out that fMRI methods do not currently have the spatial resolution necessary to decode information within the prefrontal cortex, which is likely organized at the micrometer scale. Therefore, we make the case that prefrontal persistent activity is both necessary and sufficient for the maintenance of information in working memory.

Keywords: prefrontal cortex, monkey, neurophysiology, fMRI, neuron

# INTRODUCTION

Working memory is the ability to maintain and manipulate information in mind, over a time span of seconds (Baddeley, 2012). The memory system storing information for a few seconds was termed "short-term memory" in the classical, three-store model of memory (Atkinson and Shiffrin, 1968).

#### Edited by:

*Natasha Sigala, University of Sussex, UK*

#### Reviewed by:

*Amy F. T. Arnsten, Yale University School of Medicine, USA Julio Martinez-Trujillo, University of Western Ontario, Canada*

> \*Correspondence: *Christos Constantinidis cconstan@wakehealth.edu*

Received: *12 September 2015* Accepted: *07 December 2015* Published: *05 January 2016*

#### Citation:

*Riley MR and Constantinidis C (2016) Role of Prefrontal Persistent Activity in Working Memory. Front. Syst. Neurosci. 9:181. doi: 10.3389/fnsys.2015.00181*

The modern definition of working memory emphasizes its dynamic nature of representing and manipulating information originating from the environment or retrieved from long-term memory, rather than being a passive conduit of information into the long-term memory store (Baddeley, 2003; Smith and Kosslyn, 2007). In recent years, some authors have reserved the term "working memory" to refer specifically to complex information that needs to be manipulated; the term "visual short term memory" has been used to denote memory of simple stimuli (e.g., colored squares) that needs to be maintained without any further transformation (Todd and Marois, 2004). Although important in its own right, working memory is a core component of a number of other cognitive functions, including language, problem solving, reasoning, and abstract thought (Baddeley, 1992). Its central role in cognitive function explains the intense research interest that spans several decades.

Studies of lesions in humans and non-human primates first implicated the cortical surface of the frontal lobe as the site of working memory function (Jacobsen, 1936; Milner, 1963). Lesions of the prefrontal cortex (PFC—**Figure 1**) rendered subjects unable to perform even simple tasks requiring working memory. A wide range of impairments in tasks requiring manipulation of information in memory has been confirmed in recent lesion studies (Rossi et al., 2007; Buckley et al., 2009). Subsequently, neurophysiological experiments identified neurons that not only respond to sensory stimuli, but remain active during a period after a stimulus was no longer present; this "persistent activity" therefore provided a neural correlate of working memory (Fuster and Alexander, 1971; Funahashi et al., 1989). Visuo-spatial working memory has been a particularly fruitful model since spatial location can be varied parametrically and the activity of neurons representing each location can be studied systematically. Persistent activity in the prefrontal cortex has been shown to explain many aspects of behavioral performance in visuo-spatial working memory tasks (Qi et al., 2015b).

The role of prefrontal cortex in working memory has been re-evaluated over the past few years (Sreenivasan et al., 2014a; D'Esposito and Postle, 2015) as several sources of experimental evidence have challenged the traditional views on prefrontal persistent activity. First, neurophysiological studies have demonstrated that persistent discharges are not limited to the prefrontal cortex, but are widespread in a network of cortical and subcortical areas, thus raising questions on the role of persistent firing in the prefrontal cortex (Constantinidis and Procyk, 2004; Pasternak and Greenlee, 2005). Second, phenomena such as repetition suppression illustrate that the activity of neurons may be modulated by prior stimuli in the absence of persistent activity (Grill-Spector et al., 2006). Third, human fMRI studies have been successful in decoding information held in memory from visual cortex (Harrison and Tong, 2009) and have identified correlates of working memory capacity in the posterior parietal cortex (Todd and Marois, 2004, 2005; Xu and Chun, 2006). Therefore, alternative models based on interpretation of BOLD signals (which do not directly measure spiking activity) ascribe control processes to PFC while reserving the representation of working memory for the sensory cortices (Curtis and D'Esposito, 2003; D'Esposito and Postle, 2015).

In this review, we examine the role of prefrontal cortex in working memory. We take a position largely in favor of the classical model of working memory being represented in the persistent activity of prefrontal neurons based on evidence from neurophysiological experiments in non-human primates and critical evaluation of human imaging studies. We begin by examining the anatomical basis of working memory and the specializations of the prefrontal cortical circuit. We then review the range of phenomena accounted for by persistent activity in visuo-spatial working memory, illustrating the enduring appeal of the model. Activation during spatial working memory may be viewed as equivocal about the role of the prefrontal cortex because persistent activity might be explained by top-down control processes as well as by working memory itself. We therefore discuss the evidence of prefrontal persistent activity for other content types of working memory. We then review memory models not dependent on persistent activity and posit that these could only mediate a limited range of working memory tasks. We finally review activity decoded from brain areas other than the prefrontal cortex during working memory, concluding that the ultimate source of this activation is the prefrontal cortex, and these areas alone are not sufficient for mediating working memory maintenance.

# ANATOMICAL ORGANIZATION OF WORKING MEMORY CIRCUITS

To understand why prefrontal cortex may represent robustly remembered information, it is instructive to review the anatomical basis of persistent activity. The primary source of sustained excitation is thought to be reverberating activity through layer II/III horizontal excitatory connections between prefrontal neurons with similar stimulus tuning (Constantinidis and Wang, 2004). PFC neurons receive horizontal connections from clusters of cells (**Figure 2**), arranged in stripe-like fashion, 0.2–0.8 mm wide (Goldman-Rakic, 1984; Levitt et al., 1993; Lund and Lewis, 1993; Kritzer and Goldman-Rakic, 1995; Pucak et al., 1996). Persistent firing between layer II/III neurons also depends on glutamate stimulating NMDA receptors (Wang et al., 2013). The relatively slow time constant of NMDA receptors allows the post-synaptic neuron to remain at a relatively depolarized state for a longer interval, compared to neurons containing AMPA receptors alone; without NMDA receptors, an unrealistically high level of firing rate would be required to sustain persistent activity (Wang, 2001). Additionally, sharper tuning for spatial location arises from GABAergic interneurons, which are essential in tuning the activity to represent specific spatial information (Rao et al., 1999, 2000; Constantinidis and Goldman-Rakic, 2002).
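The argument about receptor time constants can be made concrete with a deliberately simplified calculation. The sketch below is not the network model of Wang (2001); it assumes a single-exponential synaptic decay, a perfectly regular presynaptic spike train, and illustrative time constants (about 2 ms for AMPA and about 100 ms for NMDA, both assumptions chosen only for the example). The function names are hypothetical. It asks how fast a neuron must fire for a chosen fraction of each synaptic event to survive until the next spike arrives.

```python
import math

# Illustrative decay time constants (assumptions, not values from the text).
TAU_AMPA_MS = 2.0
TAU_NMDA_MS = 100.0


def residual_fraction(tau_ms: float, rate_hz: float) -> float:
    """Fraction of one synaptic event left after one inter-spike interval.

    Assumes single-exponential decay and a regular spike train.
    """
    isi_ms = 1000.0 / rate_hz
    return math.exp(-isi_ms / tau_ms)


def rate_for_residual(tau_ms: float, target_fraction: float) -> float:
    """Firing rate (Hz) at which `target_fraction` of an event survives
    until the next spike, obtained by inverting residual_fraction."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must lie strictly between 0 and 1")
    return 1000.0 / (tau_ms * math.log(1.0 / target_fraction))


if __name__ == "__main__":
    for name, tau in (("AMPA", TAU_AMPA_MS), ("NMDA", TAU_NMDA_MS)):
        left_at_20hz = residual_fraction(tau, 20.0)
        needed_hz = rate_for_residual(tau, 0.5)
        print(f"{name}: residual at 20 Hz = {left_at_20hz:.3g}, "
              f"rate for 50% carry-over = {needed_hz:.1f} Hz")
```

Under these assumptions, half of an NMDA-mediated event carries over at roughly 14 Hz, whereas the fast channel would need roughly 720 Hz, which is the sense in which sustaining activity through fast receptors alone would demand unrealistically high firing rates.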

Several anatomical specializations endow the prefrontal cortex with unique properties in maintaining persistent activity. Prefrontal pyramidal neurons exhibit the most extensive dendritic trees and highest number of spines of any cortical neurons, some 23 times higher than the number of spines of layer III pyramidal cells in V1 (Elston, 2000, 2003). As a consequence, the spatial spread of functional interactions between neurons within the prefrontal cortex is more extensive than that of neurons within the posterior parietal cortex (Katsuki et al., 2014). Additionally, dopaminergic innervation terminates predominantly in the frontal lobe and can improve the signal-to-noise ratio of persistent activity, mainly via enhancement of the NMDA conductance (Yang and Seamans, 1996; Durstewitz et al., 2000; Seamans et al., 2001; Chen et al., 2004). Specialized GABAergic types have also been implicated in stabilizing persistent activity in the face of distraction, and physiological signatures of these neurons have been specifically identified in the prefrontal cortex (Wang et al., 2004; Zhou et al., 2012). All of these specializations suggest that the prefrontal cortex is better suited to generate and sustain persistent activity than its afferent areas (Qi et al., 2015b).

#### PERSISTENT ACTIVITY IN VISUO-SPATIAL WORKING MEMORY

The most extensively used paradigm to study visuo-spatial working memory involves the oculomotor delayed response (ODR) task (**Figure 3A**), which presents subjects with a brief stimulus and, after a delay period, requires an eye movement to its remembered location (Funahashi et al., 1989; Rao et al., 1999; Constantinidis et al., 2001a). Another common task, the delayed alternation task, similarly requires a (hand or eye) movement to one of two locations, alternating in successive trials, therefore requiring memory for the location of the preceding choice (Kubota and Niki, 1971; Niki, 1974). Persistent activity selective for the spatial location of the remembered stimulus is apparent in a population of prefrontal neurons, comprising approximately a third of the total prefrontal neurons (Qi and Constantinidis, 2013). The location of the preceding stimulus in such tasks is sometimes confounded with the preparation for the motor response; however, more complex tasks reveal that the majority of prefrontal neurons represent the former rather than the latter. For example, when a task requires monkeys to make an eye movement toward a location other than the location of the visual stimulus, the majority of prefrontal neurons represent the location of the preceding stimulus rather than the location of the impending saccade. This is the case in the delayed anti-saccade task (Funahashi et al., 1993b) and the rotational ODR task (Takeda and Funahashi, 2002).

A recent study revives the idea that persistent activity generated during ODR tasks represents motor preparation rather than memory for the stimulus (Markowitz et al., 2015). The study used two versions of the ODR task, one in which the stimulus appeared transiently (as in **Figure 3A**) and one in which it remained visible for the entire interval until the motor response. The conclusion that persistent activity represents motor preparation was predicated entirely on the assumption that memory storage is only mediated by neurons that exhibit persistent activity after the stimulus has been turned off, but do not continue to respond to the stimulus when it remains visible. Neurons exhibiting continuous activation by visual stimuli were considered "preparation" neurons, by default. This premise is tenuous. Neither direct evidence nor network models are available that would suggest that memory storage neurons are not activated continuously by a prolonged stimulus. In turn, this assumption leads to the conclusion that the activity of "storage units," thus defined, has no influence on recall performance or other aspects of behavior in a memory task (Markowitz et al., 2015). This is a questionable conclusion, in our view.

representing upper right location) are drawn in red color. Pyramidal neurons excite each other through reciprocal connections. Stripes of neurons with similar spatial tuning are repeated across the surface of the cortex. Interneurons inhibit other pyramidal neurons with different spatial tuning (memory field representing lower right location) drawn in blue color.

FIGURE 3 | (A) Sequence of events in the Oculomotor Delayed Response (ODR) task. Successive frames represent the fixation period, stimulus presentation, delay period, and saccade toward the remembered stimulus location. (B) Delayed Match to Sample task. Monkeys first foveate the fixation point and pull a lever. They are then presented with a cue stimulus. This is followed by a random (0–2) number of non-match stimuli, separated by delay periods. When a match stimulus appears at the same location as the cue, the monkeys are required to release the lever. (C) Match/Non-match task. While monkeys fixate, two stimuli are presented in sequence, separated by delay periods. After another delay period, two choice targets are shown and the monkey has to saccade to the green target if the second stimulus matched the cue, and to the blue target otherwise. (D) Schematic diagram of prefrontal activity elicited by the stimulus that is sustained during the delay period in each of the previous tasks.

Persistent activity tuned for the location of a stimulus appears in the prefrontal cortex even in tasks where the stimulus does not immediately allow planning of a movement. In the spatial delayed-match-to-sample task, subjects are required to release a lever or press a button when a stimulus appears at a previously cued location (**Figure 3B**); in the match/non-match task, the monkeys have to saccade to a green or blue response target depending on whether two stimuli presented in sequence appeared at the same location or not (**Figure 3C**). In such tasks, prefrontal neurons generate persistent activity following the presentation of the original stimulus that is tuned for its spatial location (**Figure 3D**), and not the preparation of a motor response, the direction of which is not known until later in the trial (Qi et al., 2010, 2011; Goodwin et al., 2012).

Persistent activity is not merely an epiphenomenon of spatial working memory, either. The most straightforward evidence in favor of this idea comes from analysis of error trials in the ODR task, which are characterized by lower levels of delay period activity (Funahashi et al., 1989; Zhou et al., 2013). In other words, trials in which persistent activity is diminished are more likely to result in errors. A near-linear relationship between behavioral performance and persistent activity can also be revealed in tasks that parametrically modulate the discriminability of two remembered targets (Constantinidis et al., 2001b).

Computational models provide a detailed picture of the relationship between behavioral outcomes related to working memory performance and persistent activity (**Figure 4**). Persistent activity can be sustained in such models by virtue of re-entrant connections between neurons with similar tuning for stimulus properties, so that activation after afferent input is maintained in the system (**Figure 4A**). Drifts in neuronal activity across the network of prefrontal neurons (**Figure 4B**) have been shown to predict precisely the relationship between several aspects of firing rate and the endpoint of the saccade (the spatial location being recalled by the monkey) in the ODR task (Wimmer et al., 2014). For example, persistent activity recorded from trials in which monkeys make eye movements deviating clockwise vs. counterclockwise relative to the true location of the stimulus yields slightly different tuning curves, as would be expected if the location recalled was determined by the peak of activity at the end of the delay period (**Figure 4C**). Similarly, the variability of a neuron's delay period activity (estimated by the Fano factor of spike counts, i.e., the variance divided by the mean) is maximal for inaccurate saccades to locations at the flanks of the neuron's tuning curve but lower for locations in the peak or tail (**Figure 4D**). This counterintuitive finding is also explained if one appreciates that small deviations in saccadic endpoint correspond to the bump of activity shifting in one direction or another, and that activity of a single neuron changes most rapidly if the bump traverses the flank of its tuning curve rather than its peak or tail. Finally, spike-count correlations of two simultaneously recorded neurons are lowest and negative for inaccurate saccades when the cue appears between the peaks of their tuning curves (**Figure 4E**). This result is also consistent with the idea that working memory inaccuracies are caused by drifts of persistent activity in the delay period, and when the bump attractor randomly varies around a location between the peaks of two neurons, it inevitably causes an increase in firing rate for one neuron, but a decrease for the other. Importantly, these findings do not hold for neurons that do not exhibit persistent discharges, even though the latter are more numerous in the prefrontal cortex (Wimmer et al., 2014).
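The link between bump drift and these two statistics can be illustrated with a toy simulation. The sketch below is not the network model of Wimmer et al. (2014). It assumes Gaussian tuning curves, Poisson spiking, and a bump whose end-of-delay position is jittered by Gaussian drift; every parameter value (tuning width, rates, drift magnitude, trial count) is a hypothetical choice made only for illustration.

```python
import math
import random
import statistics

def tuning(pref_deg, bump_deg, width_deg=30.0, peak=20.0, base=2.0):
    """Expected delay-period spike count when the bump sits at bump_deg (assumed Gaussian)."""
    d = pref_deg - bump_deg
    return base + peak * math.exp(-0.5 * (d / width_deg) ** 2)

def poisson(rng, lam):
    """Draw one Poisson count (Knuth's multiplication method, adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_counts(rng, pref_deg, cue_deg, drifts):
    """One spike count per trial; the bump ends the delay at cue + drift."""
    return [poisson(rng, tuning(pref_deg, cue_deg + d)) for d in drifts]

def fano(counts):
    """Fano factor: variance of the spike counts divided by their mean."""
    return statistics.variance(counts) / statistics.mean(counts)

def pearson(x, y):
    """Pearson correlation of two equally long count lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)
drifts = [rng.gauss(0.0, 8.0) for _ in range(4000)]  # hypothetical drift SD of 8 deg

# Single neuron preferring 0 deg; cue placed at its peak, flank, or tail.
fano_by_offset = {
    name: fano(simulate_counts(rng, 0.0, offset, drifts))
    for name, offset in [("peak", 0.0), ("flank", 30.0), ("tail", 90.0)]
}

# Two neurons preferring -30 and +30 deg; cue between their peaks, drift shared per trial.
counts_a = simulate_counts(rng, -30.0, 0.0, drifts)
counts_b = simulate_counts(rng, 30.0, 0.0, drifts)
r_between = pearson(counts_a, counts_b)

print(fano_by_offset, r_between)
```

Under these assumptions the Fano factor is largest when the cue falls on the flank of the tuning curve, where the firing rate changes most steeply with bump position, and the spike-count correlation is negative when the cue lies between the two preferred locations, because a shared drift raises one neuron's rate while lowering the other's.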

Persistent activity in the prefrontal cortex has also been shown to be subject to developmental changes, with lower levels of persistent activity present in older monkeys (Wang et al., 2011). This decline has been linked to alpha-adrenergic receptors. Drugs targeting these can ameliorate the effects of age-related cognitive deficits (Arnsten and Goldman-Rakic, 1985; Arnsten et al., 1988), as well as increase persistent discharges to levels seen in younger adults (Wang et al., 2011). An important concept to consider is that persistent activity is not the same as a generalized increase in neuronal excitability. For example, low doses of a nicotinic alpha-7 agonist enhance spatially tuned persistent activity but high doses produce non-specific excitation that erodes the representation of the remembered spatial location (Arnsten and Wang, 2016).

# PERSISTENT ACTIVITY IN NON-SPATIAL WORKING MEMORY

Prefrontal neurons generate discharges that represent other types of information, in addition to spatial location. Ventrolateral prefrontal cortex receives input from regions of the ventral visual pathway, most importantly the inferior temporal cortex and superior temporal gyrus (Petrides and Pandya, 1988; Webster et al., 1994). Generally, smaller populations of prefrontal neurons are tuned for object attributes such as geometric shape, color, or complex features (e.g., specific faces), than for spatial location; a regional specialization is also present, with spatial information more prevalent in the dorsolateral prefrontal cortex than the ventrolateral prefrontal cortex (Meyer et al., 2011). Nonetheless, robust, stimulus-selective persistent activity has been described in working memory tasks requiring subjects to remember the identity and features of stimuli. Examples include stimuli defined by simple, geometric shapes differing in color or luminance (Quintana et al., 1988; Hoshi et al., 1998; Constantinidis et al., 2001b; Sakagami et al., 2001; Averbeck et al., 2003; Inoue and Mikami, 2006; Genovesio et al., 2009), complex images, such as real objects and faces, or abstract pictures (Wilson et al., 1993; Miller et al., 1996; O Scalaidhe et al., 1997, 1999; Rao et al., 1997; Rainer et al., 1998; Rainer and Miller, 2000; Freedman et al., 2001; Roy et al., 2014) and the direction of motion of a random-dot stimulus that is always presented at the same location (Zaksas and Pasternak, 2006; Mendoza-Halliday et al., 2014).

In recent years, it has been recognized that persistent activity in the prefrontal cortex also represents information beyond the characteristics of stimuli. Activity may represent the abstract rules of the cognitive task subjects are required to perform (White and Wise, 1999; Wallis et al., 2001), categories (Freedman et al., 2001; Shima et al., 2007), and numerical quantities (Nieder et al., 2002). It may also be related to perceptual decisions (Kim and Shadlen, 1999; Barraclough et al., 2004), reward expectation (Leon and Shadlen, 1999), and sequences of events or actions (Averbeck et al., 2002; Inoue and Mikami, 2006; Sigala et al., 2008; Berdyyeva and Olson, 2010). Persistent activity of single neurons may represent multiple stimulus features and task variables simultaneously (Rigotti et al., 2013). For instance, persistent firing may represent different aspects of the task demands as they change over time, thus providing dynamic representations (Mante et al., 2013).

The realization that prefrontal activity is modulated by task factors to such an extent has led to a re-evaluation of the nature of information represented in persistent activity (D'Esposito and Postle, 2015). Taken to the extreme, this idea would suggest that all stimulus-selective information that appears to be represented in the prefrontal cortex is in fact related to task rules or categorical judgments between alternatives rather than representing the memoranda themselves. In an attempt to pinpoint the nature of information represented in the prefrontal cortex, some experiments have relied on working memory for stimuli defined solely by elemental properties, such as direction of motion or color, and found the ability of prefrontal cortex to represent such features wanting. In an experiment requiring subjects to remember the overall direction of motion of an initial random-dot display and decide if the direction of a following display was the same or different, prefrontal neurons exhibited only transient representation of direction information in the delay period (Zaksas and Pasternak, 2006). Another experiment that required memory for the color of a stimulus revealed that very few prefrontal neurons exhibited pure color information, as opposed to information about its location (Lara and Wallis, 2014).

Ruling out prefrontal cortex as the cortical area mediating the representation of object information in working memory based on such negative findings appears premature. More recent experiments have succeeded in revealing robust persistent activity representing direction of motion throughout the delay period of a working memory task in the prefrontal cortex (and area MST) but not in area MT of the visual cortex, although MT was robustly activated during the presentation of these stimuli (Mendoza-Halliday et al., 2014). In the case of color, too, activation of only a small proportion of prefrontal neurons, in the order of 5–15% (Lara and Wallis, 2014) may be sufficient for the representation of stimulus information. It is also possible that color-selective neurons are concentrated in specific prefrontal "patches" (Lafer-Sousa and Conway, 2013) and persistent activity representing color information may be concentrated in such modules rather than be diffused across the entire prefrontal surface.

Persistent neuronal firing in prefrontal cortex has been observed even in the absence of performance of a task, or even learning of a task, while subjects passively view stimuli. Prefrontal neurons have thus been shown to generate persistent discharges tuned for stimulus location and shape in monkeys never trained to perform a working memory (or other cognitive) task (Meyer et al., 2011; Meyers et al., 2012). The fact that prefrontal neurons generate persistent activity when not required to perform a working memory task is not incompatible with our intuition of working memory, either. We are able to recall stimuli we encounter even when we are not prompted to maintain them in memory ahead of time (Qi et al., 2015b). Consistent with this finding, recordings during passive fixation reveal persistent discharges selective for faces in the ventrolateral prefrontal cortex (O Scalaidhe et al., 1999). Prefrontal neurons also represent stimulus features even when they are irrelevant for the task at hand (Constantinidis et al., 2001b; Lauwereyns et al., 2001; Donahue and Lee, 2015). This evidence argues that persistent activity in the prefrontal cortex is sufficient to represent object-related information in working memory. In the section Alternative Working Memory Models, we will review the evidence that prefrontal cortex is also necessary for this role.

# ALTERNATIVE WORKING MEMORY MODELS

In recent years, the role of persistent activity has come into question by alternative models proposed to mediate working memory. By some accounts, information can be maintained in memory over a period of seconds through mechanisms other than persistent discharges. We will examine three categories of models here: non-spiking models dependent on synaptic mechanisms, rhythmic-spiking models conveying information based on the frequency and phase of discharges without necessarily an increase in overall activity, and dynamic-spiking models in which information is represented based on the pattern of neurons that are active without an elevation of mean firing rate across the population.

#### Non-spiking Models

Activity elicited after repeated presentation of the same stimulus is typically reduced, a phenomenon termed repetition suppression (Grill-Spector et al., 2006). As a result, the level of response to a particular stimulus in the context of a working memory task, such as the delayed match to sample task, can be informative about whether it was preceded by the same stimulus or not; match suppression may signal that the sample was the same as the match. This suppressed response to a matching stimulus is observed even though several seconds may intervene between the sample and match, and it does not require persistent activity (Miller et al., 1991, 1996). Match suppression (or enhancement, for some neurons) is observed for stimuli matching in shape, color, and form, in spatial location, or in direction of motion, in various cortical areas, including the prefrontal, posterior parietal, and inferior temporal cortex (Miller et al., 1991, 1996; Steinmetz et al., 1994; Zaksas and Pasternak, 2006; Woloszyn and Sheinberg, 2009). Furthermore, the extent of response difference to matching and non-matching stimuli has predictive power over behavioral performance, as it differs systematically in correct and error trials (Zaksas and Pasternak, 2006; Qi et al., 2012).

Computational models have been proposed that could account for such changes via mechanisms that do not depend on spike generation, but instead involve modification of synaptic strengths (Mongillo et al., 2008; Sugase-Miyamoto et al., 2008). Such mechanisms may be mediated by calcium availability at the presynaptic terminal, whose kinetics have a time constant in the scale of seconds (Mongillo et al., 2008). The duration and stability of working memory in such models may still be modulated by spiking activity.
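To make the idea of a spike-free memory trace concrete, the following sketch tracks a single synapse with short-term facilitation and depression, in the general spirit of the synaptic model cited above. It is an illustration under stated assumptions rather than the published implementation: the variable names (`u` for the calcium-dependent release fraction, `x` for available resources), the update order at each spike, and all parameter values (`U`, `TAU_F`, `TAU_D`, burst rate and length) are hypothetical choices.

```python
import math

U = 0.2       # assumed baseline release fraction
TAU_F = 1.5   # assumed facilitation (calcium decay) time constant, seconds
TAU_D = 0.2   # assumed resource recovery time constant, seconds

def relax(u, x, dt):
    """Exponential relaxation between spikes: u decays to U, x recovers to 1."""
    u = U + (u - U) * math.exp(-dt / TAU_F)
    x = 1.0 + (x - 1.0) * math.exp(-dt / TAU_D)
    return u, x

def spike(u, x):
    """Presynaptic spike: facilitation raises u, then release consumes a fraction u of x."""
    u = u + U * (1.0 - u)
    x = x - u * x
    return u, x

def efficacy_after_burst(delay_s, n_spikes=10, rate_hz=50.0):
    """Synaptic efficacy u*x measured delay_s seconds after a brief burst, with no spikes in between."""
    u, x = U, 1.0
    for i in range(n_spikes):
        if i > 0:
            u, x = relax(u, x, 1.0 / rate_hz)
        u, x = spike(u, x)
    u, x = relax(u, x, delay_s)
    return u * x

baseline = U * 1.0
for delay in (0.5, 1.0, 3.0, 10.0):
    print(delay, round(efficacy_after_burst(delay), 3), "baseline", baseline)
```

With these assumed time constants, the efficacy stays above baseline for a few seconds after the burst even though no spikes occur during the delay, and it returns to baseline once the facilitation variable has decayed.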

Repetition suppression is a robust phenomenon observed across multiple cortical areas, and the fact that the match/non-match effect differs in correct and error trials offers compelling evidence that memory performance has access to this activity. However, it is a phenomenon limited to recognition memory that may not even mediate representation of the identity of the remembered stimulus, and it cannot account for working memory performance in other tasks. It is hard to imagine an equivalent role of synaptic mechanisms for tasks such as the ODR, delayed alternation, N-back, or free recall tasks. Moreover, other computational models show that even though preference for a non-match over a match stimulus may be present in individual neurons with no persistent activity, the phenomenon may still be mediated by a network that depends on persistent activity (Engel and Wang, 2011). It is therefore still an open question whether synaptic mechanisms have a role in working memory in the absence of persistent activity.

#### Oscillatory Models

Rhythmic activity has long been implicated in hippocampal-dependent memory, and communication between the hippocampus and prefrontal cortex, in rodents (Buzsaki, 2010). In the human literature, the frequency of oscillations evident through MEG, EEG, and ECoG recordings has also been associated with distinct working memory processes (Roux and Uhlhaas, 2014). Recent neurophysiological studies in non-human primates have begun to address more specifically what role rhythmic firing may play in working memory (Siegel et al., 2009; Buschman et al., 2012; Liebe et al., 2012; Salazar et al., 2012; Brincat and Miller, 2015). The magnitude, frequency, and phase of oscillations within the prefrontal cortex and between the prefrontal cortex and other areas have been shown to be modulated depending on stimulus and task information (Buschman et al., 2012; Liebe et al., 2012). Therefore, information about the stimulus held in memory or the task to be performed may be decoded based on these parameters. For example, oscillatory synchronization between LFP signals recorded from different sites within the prefrontal cortex has been shown to be modulated based on which of two task rules a monkey is performing (Buschman et al., 2012). The coherence in rhythmic synchronization between neurons in prefrontal and posterior parietal cortex has also been reported to be content-dependent; in other words, prefrontal and parietal neurons synchronize their firing at specific frequencies, for different stimuli held in memory (Salazar et al., 2012). The phase of rhythmic activity could also differentiate information representing two sequentially presented stimuli (Siegel et al., 2009).

Oscillatory activity is not incompatible with persistent activity. For example, both robust persistent activity and gamma-band rhythmicity have been reported during the delay period of the ODR task (Pesaran et al., 2002), as well as the two-item memory task described above (Siegel et al., 2009). It is an open question whether oscillatory activity may dictate behavioral performance in working memory tasks independently of persistent activity.
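One common way to quantify the kind of oscillatory synchronization discussed in this section is a phase-locking value (PLV) computed across trials. The sketch below is a generic illustration on synthetic signals, not the analysis used in any of the cited studies. The signal generator, the 20 Hz frequency, the sampling rate, the noise level, and the two "conditions" are all hypothetical.

```python
import cmath
import math
import random

def fourier_phase(signal, freq_hz, fs_hz):
    """Phase of the signal's Fourier coefficient at a single frequency."""
    coeff = sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
                for n, v in enumerate(signal))
    return cmath.phase(coeff)

def plv(trials_a, trials_b, freq_hz, fs_hz):
    """Phase-locking value: length of the mean unit vector of per-trial phase differences."""
    diffs = [fourier_phase(a, freq_hz, fs_hz) - fourier_phase(b, freq_hz, fs_hz)
             for a, b in zip(trials_a, trials_b)]
    return abs(sum(cmath.exp(1j * d) for d in diffs) / len(diffs))

def make_trials(rng, locked, n_trials=100, freq_hz=20.0, fs_hz=500.0,
                n_samples=250, noise_sd=1.0):
    """Two noisy sinusoids per trial; when locked, site B lags site A by a fixed phase."""
    trials_a, trials_b = [], []
    for _ in range(n_trials):
        phase_a = rng.uniform(0.0, 2.0 * math.pi)
        phase_b = phase_a + 0.5 if locked else rng.uniform(0.0, 2.0 * math.pi)
        trials_a.append([math.sin(2 * math.pi * freq_hz * n / fs_hz + phase_a)
                         + rng.gauss(0.0, noise_sd) for n in range(n_samples)])
        trials_b.append([math.sin(2 * math.pi * freq_hz * n / fs_hz + phase_b)
                         + rng.gauss(0.0, noise_sd) for n in range(n_samples)])
    return trials_a, trials_b

rng = random.Random(1)
plv_locked = plv(*make_trials(rng, locked=True), freq_hz=20.0, fs_hz=500.0)
plv_unlocked = plv(*make_trials(rng, locked=False), freq_hz=20.0, fs_hz=500.0)
print(plv_locked, plv_unlocked)
```

A PLV near 1 indicates a consistent phase relationship between the two sites across trials, and a value near 0 indicates none. A condition-dependent difference in such a measure is the type of signal from which task or stimulus information could in principle be decoded.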

#### Dynamic Information Models

Information may be represented dynamically in a neuronal population without having to be rhythmic. The precise pattern of activation of different neurons at each time point during a working memory task can be used to decode the identity of the stimulus, even though overall activity during the delay period is not significantly elevated above the baseline (Stokes et al., 2013). This result provides yet another alternative mechanism of working memory representation.
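A toy example can show how stimulus identity may be decodable from the pattern of activity even when the mean population rate never departs from baseline. The sketch below is an assumption-laden illustration, not the analysis of Stokes et al. (2013): two hypothetical stimuli evoke opposite, balanced rate patterns that are reshuffled in each time bin, noise is Gaussian, and a nearest-centroid decoder is trained separately for each bin. All names and parameter values are invented for the example.

```python
import random
import statistics

N_NEURONS, N_BINS = 40, 5                    # hypothetical population size and time bins
BASELINE, DELTA, NOISE_SD = 10.0, 1.5, 4.0   # hypothetical rates in spikes/s

def make_patterns(rng):
    """One balanced +1/-1 pattern per time bin, so the population mean stays at baseline."""
    patterns = []
    for _ in range(N_BINS):
        signs = [1] * (N_NEURONS // 2) + [-1] * (N_NEURONS // 2)
        rng.shuffle(signs)
        patterns.append(signs)
    return patterns

def make_trial(rng, patterns, stim):
    """rates[bin][neuron]; stimuli +1 and -1 evoke opposite patterns around baseline."""
    return [[rng.gauss(BASELINE + stim * DELTA * s, NOISE_SD) for s in signs]
            for signs in patterns]

def centroid(trials, b):
    """Mean population vector in time bin b across a set of trials."""
    return [statistics.fmean(t[b][i] for t in trials) for i in range(N_NEURONS)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def decode_accuracy(train, test, b):
    """Nearest-centroid decoding of the stimulus from the pattern in time bin b."""
    cents = {stim: centroid(trials, b) for stim, trials in train.items()}
    hits = total = 0
    for stim, trials in test.items():
        for t in trials:
            guess = min(cents, key=lambda s: sq_dist(t[b], cents[s]))
            hits += guess == stim
            total += 1
    return hits / total

rng = random.Random(2)
patterns = make_patterns(rng)
train = {s: [make_trial(rng, patterns, s) for _ in range(50)] for s in (+1, -1)}
test = {s: [make_trial(rng, patterns, s) for _ in range(50)] for s in (+1, -1)}

accuracy = [decode_accuracy(train, test, b) for b in range(N_BINS)]
mean_rate = {s: statistics.fmean(r for t in test[s] for bin_ in t for r in bin_)
             for s in test}
print(accuracy, mean_rate)
```

In this construction the decoder performs well in every time bin while the population-averaged rate is essentially identical for both stimuli and equal to baseline, which is why averaging across neurons can hide information that a pattern-based analysis recovers.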

The existence of stimulus information that can be decoded by the dynamic pattern of activation in the prefrontal population (Stokes et al., 2013) presents challenges to the persistent activity model. We should consider, however, that the stimuli used in the Stokes et al. study are similar to those used in previous studies where persistent activity was observed (Miller et al., 1996; Rao et al., 1997; Rainer et al., 1998). It is therefore possible that a population of neurons did generate persistent activity, but that this activity was too weak to detect when all neurons were averaged together. Whether a condition can be demonstrated in which persistent activity is truly absent, and information is encoded solely by the dynamic pattern of activation across neurons whose overall activity is not modulated during working memory, remains an open question. Furthermore, dynamic firing models have yet to establish which aspects of the information that can be decoded from the dynamic representation of a stimulus can predict behavioral variables, such as recall error rates, accuracy of recall, or reaction time, to the extent that models of persistent activity have been successful in doing (Wimmer et al., 2014).

Dynamic patterns of activation across the population of neurons are not mutually exclusive with persistent activity either. Dynamic activity informative about stimulus identity and task rules has been observed even when persistent activity is present in the population (Crowe et al., 2008; Meyers et al., 2012). Different populations of neurons may also be active at different time points of the ODR task representing stimulus attributes or response preparation (Markowitz et al., 2015). One possible resolution to the two seemingly incompatible mechanisms of information representation is found by analyzing the neuronal population activity during the ODR task. Principal Component Analysis reveals a dynamic, low-dimensional representation, where stimulus location evolves dynamically in time after the cue presentation, but different locations remain constrained in separable subspaces (Roy et al., 2013). Persistent firing specific for the location of a stimulus may thus sweep the population of neurons, in a specific pattern, during the time course of a trial.

# ROLE OF OTHER AREAS IN WORKING MEMORY

Persistent discharges are not an exclusive property of the prefrontal cortex. Neurons in premotor, parietal, cingulate, and temporal association areas generate robust persistent activity, as do subcortical structures including the basal ganglia and the mediodorsal nucleus of the thalamus (Constantinidis and Procyk, 2004; Pasternak and Greenlee, 2005). The proposed alternative mechanisms of memory maintenance reviewed above, together with fMRI findings in humans, have expanded the list of potential sites of memory into even more cortical areas, as early as the primary visual cortex (Harrison and Tong, 2009). We will next review the evidence of working memory representation in the posterior parietal and inferior temporal cortex (for spatial and object memory, respectively), and in visual cortical areas, including V1.

#### Posterior Parietal (PPC) and Inferior Temporal (IT) Cortex

The posterior parietal and inferior temporal cortex represent the two main cortical afferents of the prefrontal cortex, as they are strongly interconnected with the dorsolateral and ventrolateral prefrontal cortex, respectively (Constantinidis and Procyk, 2004). Posterior parietal and dorsolateral prefrontal cortex share many functional properties with respect to spatial working memory (Rawley and Constantinidis, 2009) and both regions are activated simultaneously in human imaging studies of working memory (Jonides et al., 1993; Courtney et al., 1997; Owen et al., 1998; Ungerleider et al., 1998; Marshuetz et al., 2000; Bunge et al., 2001; Stern et al., 2001). Neurons in posterior parietal cortex also generate persistent activity (Gnadt and Andersen, 1988), and this has been shown to represent the remembered locations of visual stimuli, independent of a planned motor response (Constantinidis and Steinmetz, 1996). Tested with the ODR task, virtually identical percentages of neurons exhibiting working memory responses were observed in posterior parietal and dorsolateral prefrontal areas (Chafee and Goldman-Rakic, 1998).

Responses of IT neurons related to object memory exhibit many intriguing parallels with spatial working memory in the posterior parietal cortex. IT cortex shares a number of physiological properties with ventrolateral prefrontal cortex and both exhibit memory-related activation. IT neurons discharge in a persistent fashion after the offset of visual stimuli and their activity encodes the features of the remembered stimulus (Fuster and Jervey, 1981, 1982; Miyashita and Chang, 1988; Miller et al., 1993; Nakamura and Kubota, 1995; Naya et al., 2001; Sigala and Logothetis, 2002).

This simultaneous activation of the areas that are interconnected with the prefrontal cortex during working memory has inspired views that the prefrontal cortex does not represent a memory trace for a particular item per se, but rather an abstract representation, allocation of cognitive resources, the focus of attention, or other top-down signals (Cowan, 1988; Miller and Cohen, 2001; Hazy et al., 2006; Postle, 2006; D'Esposito, 2007). In this framework, the contents of memory may be represented in PPC and IT, instead. Evidence against this idea comes from memory tasks that require maintenance in memory of an original item through sequential presentation of distracting stimuli, such as the delayed match to sample task. Both object and spatial versions of this task have been developed. In the context of the object delayed-match-to-sample task, persistent discharges of IT neurons are interrupted by non-matching, distractor stimuli presented after the sample (Miller et al., 1993). Conversely, responses in the ventral prefrontal cortex are able to represent the actively remembered sample's feature throughout the trial regardless of the distractor stimuli displayed (Miller et al., 1996). Equivalent findings have been obtained in the posterior parietal cortex for the spatial delayed-match-to-sample task (Katsuki and Constantinidis, 2012). Posterior parietal discharges represent the most recent stimulus location and are disrupted by distracting stimuli (Constantinidis and Steinmetz, 1996). Prefrontal neurons are able to represent the location of the original stimulus held in memory even after the appearance of distractors, in various tasks (di Pellegrino and Wise, 1993; Qi et al., 2010; Suzuki and Gottlieb, 2013).

More recent studies have somewhat qualified these findings, for example demonstrating that differences between IT/PPC and prefrontal neurons in their ability to generate persistent activity that survives distractors are quantitative rather than qualitative (Woloszyn and Sheinberg, 2009; Qi et al., 2010), and that prefrontal neurons may respond better to distractors than to actively remembered stimuli, in some tasks (Jacob and Nieder, 2014; Qi et al., 2015a). Nonetheless, in the context of the working memory tasks reviewed in the preceding paragraph, performance of the task is simply not possible based on the activation of the posterior parietal or inferior temporal cortex alone. The link of prefrontal activation with performance of working memory tasks that involve sequential presentation of distracting stimuli is confirmed by human imaging studies, as well: prefrontal activation is predictive of errors when activity representing an initial item is not maintained, whereas parietal cortex is indiscriminately activated by behaviorally relevant stimuli and distractors, alike (Sakai et al., 2002). Accumulating studies ascribing different roles to the activity of prefrontal and parietal cortex in working memory (Jacob and Nieder, 2014; Qi et al., 2015a), and in functions such as attention and categorization (Swaminathan and Freedman, 2012; Crowe et al., 2013; Ibos et al., 2013), raise the alternative possibility that prefrontal and PPC/IT cortex are specialized for different aspects of working memory, as well as other cognitive functions (Katsuki and Constantinidis, 2012).

An instance of such differentiation may be the reported role of the posterior parietal cortex in determining the capacity of working memory (Todd and Marois, 2004, 2005). Activation of parietal cortex revealed by fMRI best predicts the number of simultaneous items maintained in working memory, relative to both earlier areas and the prefrontal cortex (Todd and Marois, 2004). The single-neuron basis of the phenomenon is not clear, however. Persistent discharges in the prefrontal and posterior parietal cortex reveal few differences between the two areas and no obvious neural correlate that is present only in the posterior parietal cortex and could determine capacity (Buschman et al., 2011).

The primacy of prefrontal cortex in working memory behavior is perhaps most vividly demonstrated in inactivation studies. Cooling experiments, which reversibly inactivate the underlying cortex by lowering its temperature, demonstrate much greater decreases in memory performance in the ODR task after prefrontal than posterior parietal cooling (Chafee and Goldman-Rakic, 2000), even when the areas inactivated have similar delay period activity (Chafee and Goldman-Rakic, 1998). The results of these studies parallel the effects of reversible inactivation of the frontal eye fields via muscimol injections, which similarly produce a significant impairment in memory-guided saccade performance (Sommer and Tehovnik, 1997; Dias and Segraves, 1999). In contrast, modest or no impairment was observed after muscimol inactivation of the posterior parietal cortex (Li et al., 1999; Chafee and Goldman-Rakic, 2000; Wilke et al., 2012), even though posterior parietal inactivation produces consistent deficits in tasks that require attention or selection between multiple stimuli (Wardak et al., 2002, 2004; Liu et al., 2010; Wilke et al., 2012). Small lesions to the dorsolateral prefrontal cortex also produce impairment in working memory performance for remembered stimuli in the contralateral space, an effect termed a "mnemonic scotoma" (Funahashi et al., 1993a; Funahashi, 2015). Equivalent results from localized lesions of the posterior parietal cortex are not available.

#### Visual Cortex

In recent years, human imaging studies have been successful in decoding information held in memory from the visual cortex, including the primary (Harrison and Tong, 2009; Albers et al., 2013; Xing et al., 2013) and extrastriate visual cortex (Ester et al., 2013; Sreenivasan et al., 2014b), suggesting that these areas maintain the contents of working memory (Tong and Pratte, 2012). This extraction of information has been possible with Multi-Variate Pattern Analysis (MVPA), examining the simultaneous pattern of activation of multiple voxels to different task conditions; the overall levels of activity in visual cortex may not rise above baseline during working memory (Offen et al., 2009). Imaging studies have gone as far as to determine that the size of the primary visual cortex alone is the best predictor of working memory ability (Bergmann et al., 2016). Importantly, MVPA could not decode information from the prefrontal cortex, or could not fully account for behavioral performance in the task (Harrison and Tong, 2009; Sreenivasan et al., 2014b).

This negative finding of information failing to be decoded from the prefrontal cortex during working memory, despite the known activation of prefrontal neurons in similar tasks, is telling about the interpretative limitations of these results. A tacit assumption when comparing the results of MVPA analysis across different cortical areas is that the structure of the voxel (typically in the order of 3 × 3 × 3 mm) is equivalent in the primary visual and prefrontal cortex. This is definitely not the case. Unlike the precise topography of visual space in the primary visual cortex, no retinotopic map (or other overarching organizational principle) has been revealed in the prefrontal cortex (Constantinidis and Procyk, 2004). Sampling the prefrontal cortex with chronic arrays of micro-electrodes spaced 0.4 mm from each other reveals that the same cortical location is represented multiple times across the surface, and with no obvious map of space (Leavitt et al., 2013; Kiani et al., 2015). Neurons recorded simultaneously with movable electrodes spaced as close as 0.2 mm to each other reveal only a slight bias toward similar spatial preference among neighboring prefrontal neurons (Constantinidis et al., 2001a). Precise stimulus location information is therefore represented at an extremely fine spatial scale, with the entire visual hemifield possibly represented in prefrontal modules no larger than 0.5 × 0.5 mm in surface (Constantinidis et al., 2001a). Voxels averaging cortical volumes an order of magnitude larger are thus likely to obliterate stimulus information and will predictably fail to decode the information held in working memory, even if this is robustly represented in the activity of prefrontal neurons.
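The scale argument can be made concrete with a small worked example. The sketch below compares the signal of a single voxel under two idealized layouts: an interleaved one, in which every 0.5 mm module contains neurons preferring all stimulus locations, and a coarse topographic one, in which a whole voxel prefers a single location. The tuning function, the number of locations, the one-neuron-per-location module, and the one-dimensional layout are simplifying assumptions for illustration; only the voxel and module dimensions come from the text.

```python
import math

N_LOCATIONS = 8                    # hypothetical number of stimulus locations
MODULE_MM = 0.5                    # module width from the text
VOXEL_MM = 3.0                     # voxel width from the text
NEURONS_PER_MODULE = N_LOCATIONS   # assumption: one model neuron per location per module
NEURONS_PER_VOXEL = int(VOXEL_MM / MODULE_MM) * NEURONS_PER_MODULE

def response(pref, stim):
    """Assumed Gaussian tuning over circular distance between preferred and shown location."""
    d = abs(pref - stim)
    d = min(d, N_LOCATIONS - d)
    return math.exp(-0.5 * d ** 2)

def voxel_signal(prefs, stim):
    """Voxel signal modeled as the average response of all neurons inside the voxel."""
    return sum(response(p, stim) for p in prefs) / len(prefs)

def modulation(prefs):
    """Range of the voxel signal across all stimulus locations."""
    signals = [voxel_signal(prefs, s) for s in range(N_LOCATIONS)]
    return max(signals) - min(signals)

# Interleaved layout: every module cycles through all preferred locations.
interleaved_voxel = [i % N_LOCATIONS for i in range(NEURONS_PER_VOXEL)]
# Coarse topographic layout: all neurons in this voxel prefer the same location.
topographic_voxel = [3] * NEURONS_PER_VOXEL

# Number of 0.5 x 0.5 mm modules covered by one 3 x 3 mm voxel face.
modules_per_voxel_face = (VOXEL_MM / MODULE_MM) ** 2

print(modules_per_voxel_face, modulation(interleaved_voxel), modulation(topographic_voxel))
```

In the interleaved layout the voxel average is the same for every stimulus location, so a voxel-level analysis has nothing to decode even though each model neuron is sharply tuned. In the topographic layout the voxel signal varies strongly with stimulus location.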

A recent fMRI study has in fact been successful in retrieving a feature of remembered stimuli, namely the orientation of a grating, from the prefrontal cortex during working memory (Ester et al., 2015). Such information may be represented more coarsely across the surface of the prefrontal cortex, making it possible to decode from fMRI activation patterns. In any case, these results argue directly against models of working memory that postulate solely a top-down control role for the prefrontal cortex, and place feature storage networks in the visual cortex (Ester et al., 2015).

MVPA methods still yield undeniable positive fMRI findings in the visual cortex, and it is important to consider the neural basis of this activity that yields information about the contents of working memory. Early visual areas do not generate persistent activity. A recent study comparing activity in three cortical areas in the same animals, required to remember the direction of motion of a random-dot display, found virtually no persistent discharges in visual area MT, but robust activation in parietal area MST, in addition to prefrontal persistent activation (Mendoza-Halliday et al., 2014). This suggests an abrupt generation of feature-selective persistent activity in areas beyond the visual cortex. On the other hand, a small percentage of V1 neurons exhibit suppressed levels of discharges during working memory, below background levels (Super et al., 2001). It is unclear, however, whether V1 activity can be predictive of behavior in working memory tasks, as this modulation was present for both correct and incorrect trials (Super et al., 2001). Changes in levels of activity in V1 during working memory are likely due to top-down projections from higher associative cortices, since V1 activation appears first in superficial layers (Roelfsema, 2015). A key aspect of this phenomenon is that background levels of activity in V1 are relatively "quiet," thus making it possible to capture the subtle backwash from higher cortical areas, while the higher cortical areas themselves may be too noisy to detect these small signals. fMRI activation may additionally be detecting pre-synaptic activation of V1 neurons from higher cortical areas (Logothetis and Wandell, 2004), which makes V1 activity even less likely to be the ultimate storage of working memory contents and determinant of working memory performance.

#### CONCLUSIONS AND UNRESOLVED QUESTIONS

The role of prefrontal persistent activity in working memory has been the focus of renewed attention in the past few years. This interest has been spurred by the realization that other brain areas are also active during working memory maintenance, that persistent activity may be shaped by the demands of the task rather than merely representing information, and that dynamic patterns of activity can represent information in working memory. These results have inspired alternative models of working memory maintenance in the brain.

In this review, we make the case that persistent activity in the prefrontal cortex is both necessary and sufficient to account for information held in memory, across a variety of tasks and experimental conditions. Prefrontal persistent activity is also present in working memory tasks that do not rely on spatial stimuli and can encode attributes of stimuli (such as direction of motion and shape) or task variables and rules. Computational models based on persistent activity can account for levels of performance and patterns of errors depending on neuronal discharges to a greater extent than any alternative models.

Phenomena like repetition suppression are likely to be generated by synaptic rather than spiking mechanisms and they appear to correlate with behavior. However, they can only account for a limited set of behaviors and memory functions. Similarly, rhythmic or otherwise dynamic patterns of activity across the population of prefrontal neurons may convey information about stimulus properties. Such patterns of activation are not incompatible with persistent activity, either. It is up to future research to determine whether a causal relationship exists between such mechanisms and working memory performance.

The prefrontal cortex is not the only area that represents working memory information. Posterior parietal and inferior temporal areas have long been known to be active during working memory, though they appear insufficient to sustain information, for at least some tasks. It remains an open question whether these areas are specialized for different aspects of working memory performance, or whether their activity supports the maintenance of working memory in a distributed network that requires the prefrontal cortex. The finding that information can be decoded from the primary visual cortex but not from the prefrontal cortex in fMRI studies cannot rule out a prefrontal involvement in working memory, because of interpretational limitations that have to do with the topography of stimulus representation in these areas. It remains unclear whether neuronal activity in primary visual cortex plays any role in determining working memory behavior. Future work should aim to resolve these issues.

#### AUTHOR CONTRIBUTIONS

MR and CC conceptually developed and wrote this review.

#### ACKNOWLEDGMENTS

Research reported in this paper was supported by the National Eye Institute of the National Institutes of Health under award numbers R01 EY017077 and R01 EY016773 to CC; NIMH award F31 MH104012 to MR; and by the Tab Williams Family Endowment and Harry O'Parker Neurosciences Fund at the Wake Forest School of Medicine.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Riley and Constantinidis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Differential contributions of dorsolateral and frontopolar cortices to working memory processes in the primate

#### Erica A. Boschin\* and Mark J. Buckley\*

Department of Experimental Psychology, University of Oxford, Oxford, UK

The ability to maintain and manipulate information across temporal delays is a fundamental requirement to bridge the gap between perception and action. In the case of higher-order behavior, the maintenance of rules and strategies is particularly helpful in bridging this gap. The prefrontal cortex (PFC) has long been considered critical for such processes, and research has focused on different subdivisions of PFC to gain an insight into their diverse contributions to these mechanisms. Substantial evidence indicates that dorsolateral PFC (dlPFC) is an important structure for maintaining information across delays, with cells actively firing across delays and lesions to this region causing deficits in tasks involving delayed responses and maintenance of rules online. Frontopolar cortex (FP), on the other hand, appears to show the opposite pattern of results, with cells not firing across delays and lesions to this region not affecting the same rule-based, delayed response tasks that are impaired following dlPFC lesions. The body of evidence therefore suggests that dlPFC and FP's contributions to working memory differ. In this article, we will provide a perspective on how these regions might implement distinct but complementary and interactive functions that contribute to more general temporally extended processes and support flexible, dynamic behavior.

#### Edited by:

Zsuzsa Kaldy, University of Massachusetts Boston, USA

#### Reviewed by:

Satoshi Tsujimoto, Kyoto University, Japan; Maria Medalla, Boston University, USA

#### \*Correspondence:

Erica A. Boschin erica.boschin@psy.ox.ac.uk; Mark J. Buckley buckley@psy.ox.ac.uk

Received: 20 August 2015; Accepted: 05 October 2015; Published: 29 October 2015

#### Citation:

Boschin EA and Buckley MJ (2015) Differential contributions of dorsolateral and frontopolar cortices to working memory processes in the primate. Front. Syst. Neurosci. 9:144. doi: 10.3389/fnsys.2015.00144

Keywords: prefrontal cortex, frontopolar cortex, dorsolateral prefrontal, delay, valuation

#### WORKING MEMORY AND PREFRONTAL CORTEX (PFC)

A fundamental aspect of cognition is the ability to maintain and manipulate information even when it cannot be directly perceived in the form of sensory input, for example because it is no longer accessible. Besides contributing to basic memory processes, such as the passive maintenance of information for future use, this type of cognitive processing is also essential in order to associate actions and/or stimuli with outcomes that may be temporally distant from the onset of the action or stimulus themselves. Furthermore, it is advantageous for the planning and execution of sequential behavioral plans that span longer timescales than that of a single action.

The prefrontal cortex (PFC) has long been considered critical for this cognitive ability, often referred to by the very general umbrella term "working memory". Several studies have linked PFC cells' activities with the internal representation of information, ranging from the encoding of stimulus features, to value, to more abstract rules, goals and strategies (Asaad et al., 1998, 2000; White and Wise, 1999; Wallis et al., 2001; Bunge et al., 2003; Kennerley et al., 2011), as well as with the maintenance and manipulation of information across time (Fuster and Alexander, 1971; Goldman-Rakic, 1995; Miller et al., 1996; Bunge et al., 2003; Mushiake et al., 2006; Mansouri et al., 2007). PFC damage in human patients has been linked to severe deficits in memory and planning (Bauer and Fuster, 1976; Goldman-Rakic, 2011; Fuster, 2008; Thompson-Schill et al., 2002) and such patterns of impairment have also been extensively reported in the animal literature (for a comprehensive review, see Fuster, 2008). In particular, the effects of large targeted PFC ablations on a range of tasks in non-human primates have led some authors to hypothesize a role for PFC in processing specifically temporally extended and/or temporally complex information (Wilson et al., 2010).

#### DORSOLATERAL AND FRONTOPOLAR CORTICES AND TEMPORALLY EXTENDED PREFRONTAL FUNCTIONS

Evidence suggests that, rather than being a functionally homogeneous region, PFC may comprise a network of cytoarchitecturally and functionally distinct subdivisions (Walker, 1940; Carmichael and Price, 1994; Petrides and Pandya, 2002; Petrides, 2005; Brodmann, 1909). Therefore, one question concerns whether particular subdivisions of PFC might be specifically crucial for particular processes referred to under the general rubric of working memory processes. Fuster (2008) distinguished between lateral prefrontal and medial prefrontal syndromes, with the former, but not the latter, being characterized by impairments in, amongst other functions, working memory. Indeed, a large number of findings regarding the properties of PFC cells and the effects of PFC damage on working memory tasks come from investigations into lateral PFC, and particularly the dorsolateral prefrontal (dlPFC) regions (**Figures 1A–E**) including, in the macaque, the area surrounding the principal sulcus (Petrides, 2000). Human neuroimaging studies have shown that a region anteriorly adjacent to dlPFC, namely frontopolar cortex (FP), approximately corresponding to Brodmann's area 10 (**Figures 1A–E**), is also particularly active during working memory and episodic memory tasks (Gilbert et al., 2006a,b) and it has been associated with prospective memory (PM) functions, i.e., the maintenance of information related to a future action plan across time-delays (Okuda et al., 2007; Burgess et al., 2011; Volle et al., 2011). Consistent with Fuster's distinction between lateral and medial PFC syndromes, FP's memory functions have also generally been associated with its lateral portion, which, in humans, has been found to closely resemble the macaque's dorsolateral area 46 in terms of functional connectivity with wider cerebral cortex (**Figure 1F**; Sallet et al., 2013; Neubert et al., 2014).

Nevertheless, recent studies have also begun to highlight some differences between the two regions, for example in neurophysiological profiles of cells in dlPFC vs. FP. Cells in the dorsal and lateral aspect of FP, unlike more posterior cells in dlPFC per se, do not appear to fire across temporal delays (Tsujimoto et al., 2010, 2012), which is a property generally deemed characteristic of temporally extended memory processes. It is therefore possible that dlPFC and FP might be supporting different processes contributing to more general memory functions. One way to investigate this possibility is to look at the effects of selective lesions to each of these two areas on the performance of the same type of cognitive tasks, in order to discern whether their respective contributions can be differentiated. While several experiments have investigated the effects of dlPFC lesions on various components of working memory, up until very recently, the absence of studies on the effects of targeted FP lesions had precluded such a comparison. In the light of new experimental findings, we can now begin to form some hypotheses on the potential distinct contributions of these two regions to cognition.

#### STIMULUS FEATURES

In tasks of recognition memory such as delay-matching-to-sample (DMS) or delay-non-matching-to-sample (DNMS), the subject has to maintain a memory trace of the perceptual features of a sample stimulus, in order to accurately compare them with those of a test stimulus (or stimuli) after delays of varying length. Cells in dlPFC have been shown to fire during delays in such tasks, with activity correlated to the individual properties of the sample (Miller et al., 1996; Sawaguchi and Yamane, 1999). In a series of classic studies, Fuster and colleagues showed that, in the monkey, cooling of dlPFC regions including sulcal area 46 caused deficits in spatial delayed-response and DMS tasks with increased delays, but not on simultaneous matching-to-sample tasks (Fuster and Alexander, 1970; Bauer and Fuster, 1976). Further investigations have suggested a more nuanced role for dlPFC in DMS/DNMS tasks than that of passive general maintenance of information, as lesions to dlPFC can leave performance on these tasks relatively unimpaired (Passingham, 1975; Bachevalier and Mishkin, 1986; Kowalska et al., 1991), but can affect specific processes that contribute to DMS/DNMS performance, such as visuospatial processes (Passingham, 1975; Levy and Goldman-Rakic, 2000) or the selection and manipulation of information that is maintained "online" across temporal delays in order to guide choice behavior (Petrides, 2000; Rowe et al., 2000).

While no recordings of FP cells during DMS/DNMS tasks exist to date, we recently investigated the effects of targeted lesions to the macaque's FP on both tasks, and found that, unlike dlPFC lesions, these had no effect on any aspect of the animals' performance of either task (**Figure 2A**). The FP animals were indistinguishable from controls both in reaching criterion for the tasks and in their performance across varying delays (Boschin et al., 2015). This suggests that, despite its activation during working memory tasks, FP is not essential to support the maintenance of visual information across delays, nor for guiding choice behavior based on the type of visual information and rules that underpin DMS/DNMS tasks.

#### ABSTRACT RULES AND STRATEGIES

The need to maintain or manipulate information across time is not exclusively a requirement of situations where one needs to hold a memory trace of a cue or stimulus that can no longer be directly perceived, as in the case of DMS/DNMS tasks. Even in the presence of constant sensory input, in the form of visual stimuli for example, other types of task-relevant information might be maintained, such as rules, strategies or action plans. A large body of evidence does implicate both dlPFC and FP in the encoding, maintenance and manipulation of task instructions, abstract rules and strategies (Rowe et al., 2000; Strange et al., 2001; Wallis et al., 2001; Mushiake et al., 2006; Sakai and Passingham, 2006; Christoff and Keramatian, 2007; Rowe et al., 2007; Sakai, 2007; Buckley et al., 2009; Tsujimoto et al., 2011; Mian et al., 2012), and one hypothesis about FP function posits that this area sits atop a prefrontal hierarchy where increasingly abstract information is represented in rostral vs. caudal PFC regions (Badre and D'Esposito, 2007; Koechlin and Summerfield, 2007; Badre, 2008). Therefore one possibility is that FP's role in temporally extended cognitive processing can only be uncovered when the task involves a higher level of abstraction than in DMS/DNMS.

and the target regions of interest. The connectivity profile of human medial FP (FPm) closely resembles that of medial area 10 (10m) in the macaque brain. Human lateral FP (FPl), on the other hand, appears to resemble macaque area 46, here shown in yellow (adapted from Neubert et al., 2014, with permission from Elsevier).

FIGURE 2 | Patterns of spared and impaired performance following FP lesion in the macaque (adapted from Boschin et al., 2015): tasks (right) and results (left). (A) Delayed-Matching/Delayed-Non-Matching-to-Sample: FP animals are not impaired compared to controls across several different delays. (B) Objects-in-scenes: in this task, animals learn about which of a pair of foreground objects (alphanumeric characters, indicated by the red arrows) presented within a complex scene is associated with reward. They are presented with 20 novel problems every day and in each daily session they are tested on that set of problems eight times. Animals are tested for 15 days pre-operatively and post-operatively. For control animals, the greatest improvement in performance (measured as decrease in percent error) was observed between the first and second run, indicating rapid learning. FP animals, on the other hand, did not show such substantial improvement between the first and second run, indicating a deficit in rapidly learning about the relative values of novel stimuli. (C) Successive single-problem learning. The animals learn about which of a single pair of objects (clipart images) is associated with reward, with problems presented successively. In the first run they are given forced-choice trials where the rewarded and unrewarded item are presented individually (order counter-balanced across trials), then they are tested on that problem 10 times successively. A session comprises 10 such problems and each animal completes 10 sessions pre- and post-operatively. FP animals were again impaired on rapid, one-trial learning about the relative value of novel stimuli (here measured as the decrease in percent error between the forced-choice phase and the first presentation of a problem between the two stimuli). (D) Acquisition of a new abstract rule: animals are trained to perform a simultaneous matching-to-sample task requiring them to choose a stimulus on the basis of two concurrent abstract rules ("matching" and "smaller than"). As an intermediate phase they are trained on the new "smaller-than" rule for 3 days, which is depicted in this figure. Control animals showed a significant decrease in percent error from the first to the second day of learning to apply the new "smaller than" rule. This is indicative of rapid learning about the value of the novel abstract rule. FP animals, however, did not display such an improvement.
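The rapid-learning measure described in the Figure 2 caption (the decrease in percent error from the first to the second run) can be sketched in a few lines. This is a minimal illustration, not the analysis code of Boschin et al. (2015): the function names are hypothetical, and the example outcomes are made-up values chosen only to show the arithmetic.

```python
# Hypothetical helper functions; names and example values are illustrative
# assumptions, not taken from the cited study.

def percent_error(outcomes):
    """Percent of trials answered incorrectly.

    outcomes: list of booleans, True for a correct choice.
    """
    if not outcomes:
        raise ValueError("need at least one trial")
    errors = sum(1 for correct in outcomes if not correct)
    return 100.0 * errors / len(outcomes)


def rapid_learning_index(first_run, second_run):
    """Decrease in percent error from the first run to the second run."""
    return percent_error(first_run) - percent_error(second_run)


# Made-up session of 20 problems: 10 errors on run 1, 4 errors on run 2.
run_1 = [True, False] * 10
run_2 = [True] * 16 + [False] * 4
print(rapid_learning_index(run_1, run_2))
```

A larger index corresponds to a steeper early drop in errors; on this measure, the caption reports a substantial improvement for control animals and no comparable improvement for FP animals.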

While any type of rule-based behavior benefits from reliable and consistent maintenance of rules and context across time, this type of processing is particularly useful in situations where rules or instructions are not explicitly cued on every trial and/or are not kept constant, but, rather, change dynamically. While in versions of DMS/DNMS where the rule varies from trial to trial, but is nonetheless cued, significant BOLD activity is elicited in ventral PFC but not dlPFC (Bunge et al., 2003), activity in dlPFC is found in contexts where rules are not explicitly cued and, for example, have to be inferred from stay/switch cues (Forstmann et al., 2005), have to be learnt by trial-and-error (Monchi et al., 2001; Lie et al., 2006), or have to be decided for oneself (Bengtsson et al., 2009). Furthermore, FP cells have been shown to increase activity when feedback indicates that responses are correct according to the current strategy, but only when they are not directly cued (Tsujimoto et al., 2010, 2012). Therefore, both dlPFC and FP appear to be more engaged in contexts where uncued behavioral alternatives have to be maintained and differentially selected depending on changes in contextual demands.

Variants of the Wisconsin Card Sorting Test (WCST)—where subjects are required to respond by matching a sample to one of several test items according to uncued rules that vary dynamically across the session—have proved valuable in animal and human neuropsychological studies investigating the underlying neural mechanisms supporting such behavior. In a monkey-analog of the WCST, single-cell recordings in the macaque's principal sulcus (area 46 and 9/46) have identified cells that encode and maintain a representation of the currently relevant rule both within and between trials (Mansouri et al., 2006) and, in a conflict-version of the task, a representation of the level of conflict experienced on the current and previous trials was also found in the same area (Mansouri et al., 2007). Consistent with these findings, lesions to this region impair the animal's ability to maintain the rule in memory across increasing delays (Buckley et al., 2009), as well as the ability to adapt behavior in response to varying levels of conflict (Mansouri et al., 2007). This indicates that the monkey principal sulcus is essential for supporting the maintenance and exploitation of dynamically changing task rules and task-relevant contextual information across time.

As in the case of DMS/DNMS tasks, to date no recordings have been carried out in the macaque FP during the WCST analog. However, recent findings about the effects of lesions to this area indicate that, unlike dlPFC lesions, FP damage does not impair animals on either rule maintenance or rule switching in the standard version, nor does it impair the conflict version of the task (Mansouri et al., 2015). This may be seen as further consistent with findings reporting neurons that encode rules and strategies in dlPFC but not in FP (Mansouri et al., 2006; Tsujimoto et al., 2010, 2011, 2012).

Nevertheless, while FP animals were not impaired in any aspect of the WCST analogs, FP lesions did nonetheless have an effect on performance, in the form of an enhancement compared to controls. FP lesioned animals were better at adapting their behavior following exposure to conflict and were also less susceptible to intervening distractors, regardless of salience, being better able to maintain the relevant rule in memory compared to controls (Mansouri et al., 2015). This pattern of enhancements after FP lesions, contrasted with the pattern of impairments following dlPFC lesions in the same task, suggests that, while dlPFC seems to be fundamental for maintaining and selecting the appropriate behavioral strategies, FP may play a very different role in this type of abstract and dynamic cognitive behavior.

#### EVALUATING THE RELATIVE VALUE OF NOVEL ALTERNATIVES: A PROPOSED CONTRIBUTION OF FRONTOPOLAR CORTEX TO COGNITION

We hypothesize that a key contribution of FP to cognition is in supporting the exploration and evaluation of the relative value of different alternatives, particularly when novel. This hypothesis is supported by the effects of FP lesions across a range of behavioral tasks, in particular the findings of very specific effects of such lesions on rapid learning about novel alternatives across three different tasks: an objects-in-scenes task (**Figure 2B**), a successive single-problem learning task (**Figure 2C**), and the acquisition of a new abstract rule ("smaller than") in a simultaneous visual discrimination task (**Figure 2D**; Boschin et al., 2015).

In these tasks, control animals showed a sharp decrease in errors in the early stages of choosing between new alternative scenes and objects, or acquiring a novel alternative rule, indicating that they were able to rapidly extract information about the relative value of these novel alternatives. FP lesioned animals, on the other hand, showed no such pattern of rapid learning (see **Figures 2B–D**), but were indistinguishable from controls in later stages of learning, where error rates decreased more gradually (Boschin et al., 2015). This indicates that FP might be crucial for a mechanism that aids the rapid extraction of the relative value of different behavioral options, above and beyond the kind that can be implemented through repeated, direct experience with the outcome of each alternative. This mechanism might involve the computation of internal inferences about the value of unchosen alternatives relative to the value of those that have been directly chosen. Animals with an intact FP might be at an advantage compared to animals without an FP because they are able to infer more about the potential value of unchosen options based on their experience with the chosen option.
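The proposed inference about unchosen alternatives can be made concrete with a toy example. The sketch below is not a model from Boschin et al. (2015) or any other cited study; it is a hypothetical delta-rule learner in which the function names, the learning rate, and the assumption that exactly one option of each pair is rewarded are all chosen for illustration.

```python
# Toy illustration only: a hypothetical delta-rule learner, not a model
# taken from any of the cited studies.

def update_values(values, chosen, reward, alpha=0.3, infer_unchosen=False):
    """Return new value estimates for a two-option problem after one trial.

    values: [v0, v1], current estimates between 0 and 1
    chosen: index (0 or 1) of the option that was selected
    reward: 1.0 if the chosen option was rewarded, else 0.0
    alpha: learning rate (assumed value, for illustration)
    infer_unchosen: if True, also update the unchosen option by inference
    """
    new_values = list(values)
    # Direct experience: move the chosen option toward its observed outcome.
    new_values[chosen] += alpha * (reward - new_values[chosen])
    if infer_unchosen:
        other = 1 - chosen
        # Assumption: exactly one option of each pair is rewarded, so the
        # unchosen option's outcome is the complement of the observed one.
        inferred = 1.0 - reward
        new_values[other] += alpha * (inferred - new_values[other])
    return new_values


def value_gap(values):
    """Absolute difference between the two value estimates."""
    return abs(values[0] - values[1])


start = [0.5, 0.5]  # no initial preference between the two options
direct_only = update_values(start, chosen=0, reward=1.0)
with_inference = update_values(start, chosen=0, reward=1.0, infer_unchosen=True)
print(value_gap(direct_only), value_gap(with_inference))
```

Under these assumptions, a single trial separates the two value estimates twice as much when the unchosen option is also updated, which is one simple way in which inference about unchosen options could speed up early learning while leaving slower, experience-driven learning unchanged.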

This hypothesis is consistent with the data from Mansouri et al. (2015) about the enhancing effects of FP lesions in contexts where distractors (such as free reward and novel tasks between trials of the WCST) may represent alternatives that the animal perceives as being potentially relevant to goal-directed behavior. If, as we hypothesize, FP is involved in the ongoing process of evaluating alternatives in relation to one another, it would be expected both to facilitate rapid learning about novel alternatives and to bias animals to explore the potential value of novel alternatives that turn out to be mere distractors. Therefore, animals without an FP would not be biased in such a manner and would be better able to exploit reward opportunities from ongoing goal-directed behavior when faced with distraction, as demonstrated by Mansouri et al. (2015). Similarly, they could better adapt their behavior to varying levels of conflict, in the absence of the deleterious effects of distraction (Mansouri et al., 2015). Indeed, patients with lesions to FP have been found to perform better than controls in tasks that involve concentration (Petrie, 1952; Burgess et al., 2012). This would also be consistent with Rowe et al.'s (2007) findings that patients with FP lesions made fewer errors than controls on "stay" trials, but more errors on "switch" trials, which fits the idea of an increased focus on the current task set that ignores potential alternatives. Indeed, FP activity in human subjects was recently found to be correlated with the difference in value between chosen vs. unchosen options (Boorman et al., 2009, 2011) as well as with exploratory behavior (Daw et al., 2006), and changes in FP functional connectivity were reported when subjects switched to a previously unchosen alternative (Boorman et al., 2009).

This new framework could allow for new interpretations of some influential findings regarding the activation of FP in tasks with a working memory component. For example, Volle et al. (2011) showed that patients with FP lesions were impaired on a PM task where they were asked to perform stimulus-judgments while concurrently maintaining the intention to push a button every 30 s. Importantly, they were not impaired when the PM task was explicitly cued by a visual stimulus (i.e., pressing a button whenever they saw an animal). Our hypothesis of FP function could help explain these findings in a novel way as, in the time-based PM task, patients would have had to continually maintain and assess the relative value of the two tasks (stimulus-judgment vs. button-press), which fluctuated depending on the recency of the latest button-press, whereas no such requirement was present in the event-based PM task, where the value of the prospective memory task was explicit when cued.

#### CONCLUSIONS AND FUTURE DIRECTION

Taken together, the evidence we presented can be interpreted within a theoretical framework where FP and dlPFC support distinct, but complementary and interactive, cognitive processes that can contribute to more general temporally extended functions, namely the exploration and evaluation of the value of novel behavioral alternatives and the implementation of ongoing behavior based upon what is perceived to be the contextually most relevant information, respectively. In tasks where action plans can span long timeframes and/or need to be updated dynamically in response to contextual changes, dlPFC is essential to appropriately maintain, select and manipulate information, rules and behavioral strategies, particularly in the absence of specific cues that inform the subject about the most appropriate response. In these dynamic contexts, FP can interact with dlPFC by providing the latter with information about novel valuable behavioral options that dlPFC can then encode, maintain and implement in order to flexibly adapt behavior.

Regarding generalization across species, comparative functional connectivity studies have suggested that while human medial FP resembles macaque FP, human lateral FP resembles dorsolateral area 46 in the macaque as opposed to macaque FP (Neubert et al., 2014). However, our findings (Boschin et al., 2015) are consistent with the human imaging literature about lateral and medial FP function (Boorman et al., 2009, 2011). Further, the effects of FP lesions doubly-dissociate from the effects of lesions to posteriorly adjacent dorsolateral areas in the macaque (i.e., FP lesions impair rapid scene learning but not short-term rule-memory, whereas principal sulcus lesions show the reverse pattern of impairments; see Baxter et al., 2008; Buckley et al., 2009; Boschin et al., 2015; Mansouri et al., 2015), consistent with existing literature regarding dlPFC's role in the maintenance, manipulation and selection of information, rules and strategies (e.g., Rowe et al., 2000; Petrides, 2000; Forstmann et al., 2005; Bengtsson et al., 2009). Therefore, from a functional point of view, there appears to be consistency across species about the role of these two areas in behavior. One possibility is that the differences in connectivity observed in Neubert et al.'s (2014) study were confounded by differences in the cognitive states of the subjects (i.e., anesthetized animals vs. resting awake humans). This question certainly deserves further investigation and an important part of future research will be to directly relate findings from human and animal studies in the same brain-state, ideally an active state associated with ongoing choice-behavior.

Moving forward in the exploration of the role of dlPFC and FP in these processes, the key concept is interaction. Most of the data collected so far has stemmed from the study of individual areas in isolation, but neuroimaging in humans has begun to draw attention to the highly interactive nature of activity between PFC and wider cortical networks (Sakai and Passingham, 2006; Rowe et al., 2007; Boorman et al., 2011). For example, Sakai and Passingham (2006) showed that FP appears to influence posterior regions differently depending on the intended rule to be implemented via context-dependent changes in functional connectivity between FP and different task-relevant posterior regions. Furthermore, Rowe et al. (2007) showed in a related paradigm that when FP was damaged, regions posterior to FP also interacted with each other differently. However, such data remains correlative. New experimental methodologies now offer scope to investigate how different regions causally influence the areas to which they are connected (and vice-versa) when animals engage in choice behavior, by employing a combination of simultaneous multi-neuronal recordings and reversible inactivations and/or lesions during the same behavioral tasks. Besides their functional differences, FP and dlPFC also present differences in their anatomical connections with other regions. In terms of the specific areas they are connected to, dlPFC's connections span a wide network of both cortical and subcortical structures (Masterman and Cummings, 1997; Petrides and Pandya, 1999; Yeterian et al., 2012), while FP's connections are more robust with higher-order prefrontal regions and are considerably sparser in more posterior and subcortical regions (Petrides and Pandya, 2007; Burman et al., 2011a,b; Yeterian et al., 2012). Furthermore, even for regions that are connected to both FP and dlPFC, there can be differences at the level of synaptic connectivity (Medalla and Barbas, 2010). Therefore, combining selective inactivation of FP and dlPFC with recordings should help shed light not only on their individual functions, but on how the neural dynamics in the areas interconnected with these regions are differentially affected when the former is inactivated as opposed to the latter, and how that might also affect the way the interconnected region of interest interacts with its own different target areas. For neuroscience to progress, we strongly support the notion that a paradigm shift is required away from investigating individual regions in isolation towards investigating how areas interact at the neuronal level both in the healthy brain and in the face of brain damage, dysfunction and disease.

#### ACKNOWLEDGMENTS

The preparation of this manuscript was supported by an MRC project grant to MJB.

#### REFERENCES

Fuster, J. M., and Alexander, G. E. (1970). Delayed response deficit by cryogenic depression of frontal cortex. Brain Res. 20, 85–90. doi: 10.1016/0006-8993(70)90156-3

and corticocortical connection patterns. Eur. J. Neurosci. 11, 1011–1036. doi: 10.1046/j.1460-9568.1999.00518.x


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Boschin and Buckley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Working Memory in the Service of Executive Control Functions

Farshad A. Mansouri<sup>1,2,3</sup>\*, Marcello G. P. Rosa<sup>1,2,3</sup> and Nafiseh Atapour<sup>1,3</sup>

<sup>1</sup> Department of Physiology, Monash University, Melbourne, VIC, Australia, <sup>2</sup> ARC Centre of Excellence in Integrative Brain Function, Monash University, Melbourne, VIC, Australia, <sup>3</sup> Neuroscience Program, Biomedicine Discovery Institute, Monash University, Melbourne, VIC, Australia

Working memory is a type of short-term memory with a crucial cognitive function: it supports ongoing and upcoming behaviors by allowing storage of information across delay periods. The content of this memory may typically include tangible information about features such as the shape, color or texture of an object, and its location and motion relative to the body, as well as phonological information. The neural correlate of working memory has been found in different brain areas that are involved in organizing perceptual or motor functions. In particular, neuronal activity in prefrontal areas encodes task-related information corresponding to working memory across delay periods, and lesions in the prefrontal cortex severely affect the ability to retain this type of memory. Recent studies have further expanded the scope and possible role of working memory by showing that information of a more abstract nature (including a behavior-guiding rule, or the occurrence of a conflict in information processing) can also be maintained in short-term memory, and used for adjusting the allocation of executive control in dynamic environments. It has also been shown that neuronal activity in the prefrontal cortex encodes and maintains information about such abstract entities. These findings suggest that the prefrontal cortex plays crucial roles in the organization of goal-directed behavior by supporting many different mnemonic processes, which maintain a wide range of information required for the executive control of ongoing and upcoming behaviors.

#### Edited by:

Natasha Sigala, University of Sussex, UK

#### Reviewed by:

Jose L. Pardo-Vazquez, Fundação Champalimaud, Portugal

Shintaro Funahashi, Kyoto University, Japan

#### \*Correspondence:

Farshad A. Mansouri farshad.mansouri@monash.edu

Received: 05 August 2015 Accepted: 16 November 2015 Published: 14 December 2015

#### Citation:

Mansouri FA, Rosa MGP and Atapour N (2015) Working Memory in the Service of Executive Control Functions. Front. Syst. Neurosci. 9:166. doi: 10.3389/fnsys.2015.00166

Keywords: executive control, prefrontal cortex, working memory, non-human primates, short-term memory

# SHORT-TERM STORAGE OF INFORMATION REQUIRED TO GUIDE ONGOING OR UPCOMING BEHAVIOR

The concept of working memory describes a process of short-term storage of information to support ongoing or upcoming actions, and is considered a crucial component of the executive control of goal-directed behavior (Baddeley, 1986; Fuster, 1995; Goldman-Rakic, 1995a,b). One view, emerging mostly from human studies, considers working memory as an essential intermediate stage (or buffering system) for retrieved memories, thus enabling further manipulation and integration of information involved in perceptual and mental functions (Baddeley, 1986, 2012). A related perspective, mostly focused on the neural substrate of working memories, assumes that retention of task-relevant information is essential for complex behaviors which evolve in time, in order to maintain the perception and actions in a coherent and goal-directed framework. Therefore, working memory processes appear crucial for the temporal organization of behavior (Fuster, 1997; Fuster et al., 2000), including linking processes across delays (Goldman-Rakic, 1995a,b). Related models propose that other short-term memory functions provide an intermediate stage for the buffering and exchange of information between working memory and long-term memory repositories (Ericsson and Kintsch, 1995).

Various techniques, including non-invasive imaging and cellular and molecular studies in animal models, have enriched our knowledge about the working memory process. Here, we briefly review some of the studies that have been conducted in non-human primates to examine the neural substrates and mechanisms of working memory, with emphasis on recent work that demonstrates working memory for abstract features such as rules and strategies.

### WORKING MEMORY IN NON-HUMAN PRIMATES

Single-cell recordings afford high temporal and spatial resolution for the study of information conveyed by neuronal activity. This type of research, using behaving monkeys, has provided ample evidence for the involvement of different cortical and subcortical areas in the short-term storage of information in delayed response tasks. In such studies the cognitive tasks typically include an encoding period, during which a to-be-remembered "cue" or "sample" is presented, followed by a delay period, during which information about the cue has to be maintained for successful resolution of an upcoming problem. At the end of the delay period the memory of the cue is tested by requiring an operant behavior to select a choice. Examples of cognitive tasks with such paradigms include the delayed matching to sample task, in which a choice object that matches the sample needs to be selected, and the delayed alternation task, in which an alternative action, different from a previous response, has to be selected (Fuster, 1995; Goldman-Rakic, 1995a,b, 1996). Various tasks have examined the process of working memory in different modalities (such as visual, auditory or tactile) by changing the features and modality of the to-be-remembered cue. Neural correlates of working memory have been found in many different brain areas, including those typically regarded as being involved in perceptual and motor functions.
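The three-period trial structure described above (encoding, delay, test) can be illustrated with a toy simulation of a delayed matching to sample trial. This is only a minimal sketch under stated assumptions: the function names, the object labels and the `p_retain` parameter (the probability that the sample survives the delay) are hypothetical and are not taken from any of the cited studies.

```python
import random


def run_dms_trial(objects, p_retain, rng):
    """Simulate one delayed matching to sample trial; True if the match is chosen."""
    # Encoding period: a to-be-remembered sample is presented.
    sample = rng.choice(objects)
    # Delay period: the sample is retained with probability p_retain (assumed parameter).
    retained = sample if rng.random() < p_retain else None
    # Test period: the matching object and one non-matching object are offered.
    distractor = rng.choice([o for o in objects if o != sample])
    choices = [sample, distractor]
    rng.shuffle(choices)
    # If the memory was lost, the simulated subject guesses between the two choices.
    choice = retained if retained is not None else rng.choice(choices)
    return choice == sample


def accuracy(p_retain, n_trials=2000, seed=1):
    """Fraction of correct trials for a given (hypothetical) retention probability."""
    rng = random.Random(seed)
    objects = ["A", "B", "C", "D"]  # placeholder object labels
    hits = sum(run_dms_trial(objects, p_retain, rng) for _ in range(n_trials))
    return hits / n_trials
```

With perfect retention the simulated subject always selects the match, whereas with no retention performance falls to the two-choice chance level, which is the logic by which delay-period memory is inferred from choice behavior in such tasks.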

# WORKING MEMORY OF CONCRETE ENTITIES

In a classical study, Fuster and Alexander (1971) trained monkeys to perform a delayed response task in which the monkeys had to remember a visual cue across a delay period. The authors found that a significant number of cells in prefrontal cortex and in the mediodorsal nucleus of thalamus displayed a persistent increase in activity during the delay period. This led them to conclude that this persistent activity might represent the mnemonic processes that enable short-term storage of information across the delay period. Kubota and Niki (1971) also reported persistent activity during the delay period in the context of a delayed alternation task. These pioneering studies supported the emerging idea that working memory is based on maintained representation of events and stimuli, even after their cessation, in the prefrontal neurocircuitry. Fuster (1990, 1995, 1997) subsequently suggested that such representations enable temporal linking of recent salient experiences to the upcoming action. These studies were followed by others which characterized the relationship between the delay-period activities, the preceding (to-be-remembered) stimulus features, and the intended (upcoming) action, as well as the persistence of this activity and its resistance to distraction and interruption.

In another study, Funahashi et al. (1989) examined the delay period activity in a more controlled condition, in which eye position was closely monitored and the monkeys were required to maintain information about a location in space, to guide an upcoming saccadic eye movement. Eye fixation during the delay period was crucial to rule out the possible confounds arising from different eye positions during the delay period. Their findings revealed the presence of "memory fields" within the prefrontal cortex, suggesting that separate memory-processing modules covered the visual scene in terms of temporary storage of memory. They also showed that the delay period activity was attenuated in error trials (in which the eye saccade was made erroneously in a manner that was unrelated to the previously given information), suggesting that the delay period activity was linked to correct behavioral performance. This finding was the first to link the persistent delay period activity to the overall behavior of the monkeys. In follow-up studies, the same group (Funahashi et al., 1993a,b) provided evidence to support the idea that a memory map in prefrontal cortex underlies spatial working memory. However, related studies indicated that sustained activity in the delay period was not a unique property of prefrontal neurons. Cellular activity in other cortical areas, particularly the posterior parietal cortex, also conveys information during delay periods (Gnadt and Andersen, 1988; Chafee and Goldman-Rakic, 1998). These findings raised important questions regarding the significance of delay period activity in guiding overall behavior, its relation to the required mnemonic process and other impending processes or actions, and the possible differential contributions of individual brain areas to the working memory process.

In the following years different research groups found sustained neuronal activity in delayed response tasks in various compartments of the prefrontal cortex as well as in the sensory and motor areas (di Pellegrino and Wise, 1991; Miller et al., 1993; Ferrera et al., 1994; Motter, 1994; Bodner et al., 1996; Constantinidis and Steinmetz, 1996; Miller et al., 1996; Rao et al., 1997; Asaad et al., 1998; Chelazzi et al., 1998; Rainer et al., 1998a,b; Romo et al., 1999; Fuster et al., 2000; Zaksas et al., 2001; Pardo-Vazquez et al., 2008, 2009; Rawley and Constantinidis, 2009; Sigala, 2009). These studies showed that, depending on the task demand, information about different stimulus features, from different modalities, could be maintained in working memory and represented in neuronal activity within the prefrontal cortex and sensory areas.

In a landmark study, Rao et al. (1997) trained monkeys to perform a delayed response task in which they had to make a saccade to the remembered location of an object. In each trial, the object was presented briefly at the center of the screen, and then replaced by a fixation point. During the ensuing delay period the monkeys had to retain information about the identity of this particular object (sample) in their short-term memory. Two different objects were then shown, one of which matched the previously presented sample. This was followed by another delay period (in which the monkeys had to hold information regarding the "sample location") before the appearance of four saccade targets on the screen; only then did the animals make saccades to the remembered location of the object. Therefore, in the same trial the monkeys had to retain the memory of an object and its location in two separate delay periods, respectively. This study showed that the same population of prefrontal neurons can convey information about objects and their locations, across two delay periods, depending on the task demands. Such neurons were distributed in different parts of the lateral prefrontal cortex, indicating that representations of working memory of objects and their locations are not regionally segregated.

These findings have changed the classic view of the prefrontal cortex as the powerhouse of working memory processes. Recent models suggest that short-term storage of discrete information can be achieved in the same areas that initially process the sensory information and enable perception (Pasternak and Greenlee, 2005; Zaksas and Pasternak, 2006; Lui and Pasternak, 2011; D'Esposito and Postle, 2015). An important question arising from these studies is the specific contribution of prefrontal cortex to these mnemonic processes. Different models have emerged from imaging and animal model studies to suggest that the storage of information in short-term memory can be accomplished by sensory areas; however, persistent representations in prefrontal cortex might act as a medium for additional processes on the maintained representation of stimuli, as well as the application of these to guide goal-directed behavior (Pasternak and Greenlee, 2005; D'Esposito and Postle, 2015). This view is supported by numerous studies showing that cellular activity in the prefrontal cortex during cue-presentation and/or delay period activity can convey information about the upcoming reward (Watanabe, 1986, 1996; Watanabe et al., 2002; Leon and Shadlen, 1999; Tremblay and Schultz, 2000; Kobayashi et al., 2002; Wallis and Miller, 2003), the upcoming actions (Quintana and Fuster, 1992; Asaad et al., 1998; Ferrera et al., 1999; Hoshi et al., 2000) and the task context (Sakagami and Niki, 1994; Hoshi et al., 1998; White and Wise, 1999; Wallis et al., 2001; Barraclough et al., 2004; Genovesio et al., 2005; Johnston and Everling, 2006; Mansouri et al., 2006). The findings of neuroimaging and neuropsychological studies in humans also support this emerging view regarding the contribution of the prefrontal cortex to primate cognition (Sakai and Passingham, 2003, 2006; Müller and Knight, 2006; Sreenivasan et al., 2014).

In summary, our view of the function of the prefrontal cortex as a center for working memory of task-relevant information has evolved into a more comprehensive model, which considers the prefrontal cortex as the site of dynamic and highly plastic integrative machinery for the executive control of behavior. Such integrative functions are supported by reciprocal connections between the prefrontal cortex, sensory association areas, premotor areas, and areas involved in the organization of emotions and motivations (Barbas, 2000; Burman et al., 2011, 2015; Petrides et al., 2012; Reser et al., 2013). These connections might enable prefrontal areas to select sustained neural representations in sensory areas, and link them to other task-relevant information such as reward and actions and/or retrieved memories, in order to construct an active representation of the task set required to achieve a particular goal (Miller, 1999; Miller and Cohen, 2001; Courtney, 2004; Deco and Rolls, 2005; Pasternak and Greenlee, 2005; Ranganath, 2006; Watanabe and Sakagami, 2007; Rushworth et al., 2011; Funahashi and Andreau, 2013; D'Esposito and Postle, 2015).

# WORKING MEMORY OF ABSTRACT ENTITIES WITHIN AND ACROSS TRIALS

Other studies have shown that the information contained in working memory can be of a more abstract nature. Nieder et al. (2002) and Nieder (2005) trained monkeys to perform a delayed matching to "number of items" task, in which the monkeys first observed a sample comprising several items; after the delay period they then had to decide whether the display had the same number of items. The exact physical appearance of the displays was changed, and the monkeys therefore had to maintain information about "numerosity" during the delay period. The authors found that prefrontal cell activity encoded and maintained such information, suggesting that the abstract concept of number can be held in working memory via prefrontal neurons.

In another series of studies, Mansouri and Tanaka (2002) and Mansouri et al. (2006, 2007, 2014) trained monkeys to perform a computerized analog of the Wisconsin Card Sorting Test (WCST; **Figure 1A**). In the WCST, successful adaptation to the unannounced rule changes requires maintenance of the information about the relevant rule within and across trials. The monkeys had to match a sample to one of three test items based on either color or shape. A liquid reward and a discrete visual signal (error signal) were given as feedback to correct and incorrect target selections, respectively. The relevant rule and its frequent changes were not cued, meaning that the monkeys could find it only by interpreting the feedback. These studies showed that monkeys can successfully perform the WCST analog, indicating that they could infer and memorize the relevant rule. A significant number (about 30%) of dorsolateral prefrontal neurons near the principal sulcus represented the rules within and across trials, independent of the other aspects of the task (**Figure 1C**). The magnitude of the rule-dependent activity modulation correlated with the number of errors that the monkeys made after each rule change, in the course of reestablishing high performance. This indicated a link between representation of the working memory of the rules and the efficiency of the monkeys' overall behavior in adapting to frequent rule changes. However, information regarding the rule was retained in prefrontal cell activity during error trials, when the monkeys used the irrelevant rule to guide their behavior. This suggested that even during error trials information about the relevant rule was maintained in the prefrontal neurocircuitry, but for some other reason, such as a lapse of attention or inaccessibility of the content of working memory for the decision process, the monkeys did not follow the relevant rule (Mansouri et al., 2006). Follow-up studies showed that lesions within the dorsolateral prefrontal cortex, orbitofrontal cortex or anterior cingulate cortex impaired performance of the WCST analog (Buckley et al., 2009; Kuwabara et al., 2014).

FIGURE 1 | Neuronal activity in prefrontal cortex representing abstract entities. (A) Cognitive task paradigm. In each trial, a start cue (a gray circle) appeared when an inter-trial interval (ITI) was over. The monkey had to push a bar after the onset of the start cue. This action changed the start cue to a fixation point, after which a sample stimulus replaced the fixation point. If the monkey maintained eye fixation and bar press, three test items appeared (to the left, right and below the sample). The monkeys had to touch the test item that matched the sample in color or shape. The relevant rule for matching (matching by shape or matching by color) was consistent within a block of trials. The relevant rule was not cued and changed without any notice to the monkey when a criterion of 85% correct performance was achieved. (B) Dorsolateral prefrontal cortex cell activity represented conflict level experienced in the previous trial. The rastergram indicates activities in individual correct trials. Each row corresponds to a trial and each dot represents an action potential. Activities in high-conflict trials after low-conflict trials (LH, blue) and those in high-conflict trials after high-conflict trials (HH, pink) are shown. The mean activities are aligned at sample onset. (C) Activity difference between color and shape blocks in a dorsolateral prefrontal cortex cell represented the matching rules. The line graphs on the bottom left show the averaged firing rates in color and shape blocks, aligned at the sample onset. The bar graph on the bottom right represents the mean firing rate during the Sample epoch in consecutive blocks. The red and black dots, lines, and bars indicate color and shape blocks, respectively. The bin size is 50 ms. (A,B) are adapted from Mansouri et al., 2007 (Ref. 49). (C) is adapted from Mansouri et al., 2006 (Ref. 48).
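The block structure of the WCST analog, in which the relevant rule changes without notice once a performance criterion is met, can be sketched as a short simulation. Only the two rules (color and shape), the uncued rule changes and the 85% criterion come from the text; the 20-trial window over which the criterion is evaluated and the simple win-stay/lose-shift agent are assumptions made purely for illustration, and all names are hypothetical.

```python
from collections import deque

RULES = ("color", "shape")


def other(rule):
    """Return the alternative matching rule."""
    return RULES[1] if rule == RULES[0] else RULES[0]


def simulate(n_trials=100, window=20, criterion=0.85):
    """Run an uncued rule-switching session; return (errors, rule_changes)."""
    relevant = "color"    # rule currently rewarded by the task (never cued)
    remembered = "color"  # rule the agent holds in memory across trials
    recent = deque(maxlen=window)  # outcomes since the last rule change
    errors = 0
    rule_changes = 0
    for _ in range(n_trials):
        correct = remembered == relevant
        recent.append(correct)
        if not correct:
            # Error feedback is the only signal that the rule has changed:
            # the win-stay/lose-shift agent updates its remembered rule.
            errors += 1
            remembered = other(remembered)
        # Assumed criterion: at least 85% correct over a full window of trials.
        if len(recent) == window and sum(recent) / window >= criterion:
            relevant = other(relevant)
            recent.clear()
            rule_changes += 1
    return errors, rule_changes
```

Because this idealized agent never forgets the rule between trials, it makes exactly one error after each rule change. Lengthening the retention interval or adding distraction, as in the studies discussed in the text, would correspond to letting `remembered` be lost between trials, which this sketch deliberately does not model.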

Additional studies examined the susceptibility of the working memory of the relevant rule to changes in task demand and interruptions. After the monkeys reached a high performance level with a particular rule, the inter-trial interval (ITI) was lengthened to increase the period during which the memory had to be held across trials. The monkeys' ability to remember the relevant rule was then tested in the following trial. The working memory of the rule was very vulnerable to changes in the holding period, as the performance of control monkeys (without a brain lesion) significantly decreased after the long ITI, although it still remained above the level of chance (Buckley et al., 2009). Monkeys with lesions of the dorsolateral prefrontal cortex were the most susceptible to this manipulation and their performance dropped to the chance level, whereas animals with orbitofrontal or anterior cingulate lesions could still perform the WCST above the chance level (Buckley et al., 2009). Mansouri et al. (2014, 2015) also examined the vulnerability of the working memory process to interruptions. Working memory of the rule was very vulnerable to distractions as introducing salient events such as free reward or performing a simple additional task during the ITI completely disrupted working memory and the performance dropped to the chance level in control monkeys.

These findings indicate that working memory processes maintain abstract information, and are not limited to a single trial, bridging the ITI to maintain the information that is necessary to guide behavior in the following trials. Other studies have also shown that information of task/rule might be maintained in prefrontal cell activity within and across trials (Rainer et al., 1998b; Asaad et al., 2000; Wallis et al., 2001).

# MNEMONIC PROCESSES IN CONTEXT-DEPENDENT EXECUTIVE CONTROL ADJUSTMENT

Conflict in information processing and the occurrence of errors evoke trial-by-trial modulations in behavior. It has been proposed that adaptive tuning of executive control, mediated by the dorsolateral prefrontal cortex, underlies these modulations (Botvinick et al., 2001; Carter and van Veen, 2007; Egner, 2007; Mansouri et al., 2009; Schroder and Infantolino, 2013; Wessel et al., 2014). The behavioral modulations induced by conflict and error are seen in the trial in which these first become manifest, and also in subsequent trials. It has been suggested that a mnemonic process is necessary to modulate behavior according to conflict experienced in an earlier trial, so that the required information is maintained (Mansouri et al., 2007, 2009). When the conflict-inducing task context ends, this mnemonic process should hold information about conflict during ITIs, to enable modulation of behavior in upcoming trials (Mansouri et al., 2009).

To examine the neural substrate and underlying neural mechanisms of conflict-induced behavioral adjustment, Mansouri et al. (2007, 2009) trained monkeys to perform a version of the WCST in which the level of conflict changed trial-by-trial. The monkeys' behavior was modulated by conflict in the current and following trials, and neuronal activity in dorsolateral prefrontal and orbitofrontal cortices encoded the existing conflict level. Another group of cells in the dorsolateral prefrontal cortex modulated their activity during the ITI depending on the conflict level in the previous trial, but such neurons were not observed in the orbitofrontal cortex (Mansouri et al., 2007, 2009, 2014; **Figure 1B**). This activity modulation may represent a mnemonic process that maintains information of conflict across trials. Modulation of behavior by an error in an earlier trial might also require such a mnemonic process during the ITI, and previous studies have shown that the activity of dorsolateral prefrontal cortex (Mansouri et al., 2006) and orbitofrontal cortex cells (Simmons and Richmond, 2008) maintains information of errors during the ITI.

These studies suggest that different compartments of the prefrontal cortex make dissociable contributions to mnemonic processes in the performance of the WCST. Compared to the consequence of lesions in other prefrontal and medial frontal regions, lesions in the principal sulcus led to the most significant impairment of the mnemonic processes (Mansouri et al., 2007, 2014, 2015; Buckley et al., 2009). These findings suggest that the dorsolateral prefrontal cortex might be more involved in working memory processes in the WCST. Nieder et al. (2002) and Eiselt and Nieder (2015) showed that neuronal activity in dorsolateral prefrontal and parietal areas, but not in premotor or cingulate motor areas, encodes numerosity information during sample and working memory periods, suggesting that working memory of numerosity is supported by the dorsolateral prefrontal cortex and parietal cortex.

# A BROADER PERSPECTIVE OF WORKING MEMORY

Historically (Funahashi et al., 1989; Fuster, 1995; Goldman-Rakic, 1995b; Miller and Cohen, 2001; Constantinidis and Procyk, 2004; Deco and Rolls, 2005; Pasternak and Greenlee, 2005; Ranganath, 2006; Cowan, 2008; Baddeley, 2012), a number of features have been described for working memory: (i) it has a short duration, and fades as the delay period gets longer; (ii) it is goal-oriented and its content is used to guide upcoming behavior; (iii) it is limited to a trial, being updated in each subsequent trial; (iv) it is highly vulnerable to distraction; (v) its content is a discrete feature of an object or event such as a particular color, shape or position in space; and (vi) subjects intentionally store information in working memory to solve a problem and are therefore aware of its content.

Recent studies suggest that prefrontal cortex also supports a kind of memory that maintains information about task context, in order to enable context-dependent executive control adjustment in subsequent trials. This mnemonic process shares some aspects of the concept of working memory defined in delayed response tasks in that: (i) it maintains task-relevant information for a short period; (ii) its content, which could be an abstract variable such as conflict, is updated trial-by-trial; and (iii) it is crucial for optimizing performance in a goal-directed task. However, this memory also differs from working memory in that maintaining the information is not intended and the subjects can still perform the task, although not optimally, without such information.

# CONCLUSION

Working memory is essential for the organization of goal-directed behavior, as it maintains task-relevant information. Sustained delay period activities in prefrontal cortex have been traditionally considered as neural mechanisms for encoding the working memory. However, four decades of studies on working memory indicate that this is not a unique property of the prefrontal cortex neurocircuitry, and that distributed networks including sensory systems and sub-cortical areas are also involved in the short-term storage of information. In addition, converging evidence from various experimental approaches indicates that the prefrontal cortex might selectively combine sustained representations of task-relevant events with information such as task goal, behavioral rules, conflict and actions to construct a representation of the goals and strategies required to achieve these goals.

Recent studies suggest that various kinds of short-term memories maintain task-relevant information such as errors and conflict to enable adaptive adjustments in the executive control of behavior. These are mnemonic processes in the service of executive control to optimize behavior, based on recent experiences. Prefrontal cortex cells represent these mnemonic processes, and lesions within the prefrontal cortex impair the adaptive behaviors that are dependent on these processes. The concept of working memory could be broadened to include these short-term memories that are not directly necessary to perform the task, but are used to optimize performance. During the performance of goal-directed behaviors, parallel and diverse mnemonic processes, distributed in multiple networks, might actively maintain task-relevant information to enable a rich representation of goals, actions, rules and strategies at different levels of abstraction. The prefrontal cortex could therefore play a unifying role in linking these diverse but relevant processes to optimize the use of the cognitive resources that are necessary to control goal-directed behavior.

#### FUNDING

This study was supported by the strategic grant scheme program of the School of Biomedical Sciences at Monash University and by the ARC Centre of Excellence for Integrative Brain Function at Monash University.

#### ACKNOWLEDGMENTS

We would like to thank Rowan Tweedale for making suggestions on the manuscript.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Mansouri, Rosa and Atapour. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Comparative Overview of Visuospatial Working Memory in Monkeys and Rats

Ken-Ichiro Tsutsui\*, Kei Oyama, Shinya Nakamura and Toshio Iijima

Division of Systems Neuroscience, Graduate School of Life Sciences, Tohoku University, Sendai, Japan

Neural mechanisms of working memory, particularly its visuospatial aspect, have long been studied in non-human primates. On the other hand, rodents are becoming more important in systems neuroscience, as many innovative research methods have become available for them. It remains an open question whether primates and rodents have similar neural backgrounds for working memory. In this article, we provide a comparative overview of the neural mechanisms of visuospatial working memory in monkeys and rats. In monkeys, a number of lesion studies indicate that the brain region most responsible for visuospatial working memory is the ventral dorsolateral prefrontal cortex (vDLPFC), as performance in the standard tests for visuospatial working memory, such as delayed response and delayed alternation tasks, is impaired by lesions in this region. Single-unit studies revealed a characteristic firing pattern in neurons in this area, a sustained delay activity. Further studies indicated that the information maintained in working memory, such as cue location and response direction in a delayed response, is coded in the sustained delay activity. In rats, an area comparable to the monkey vDLPFC was found to be the dorsal part of the medial prefrontal cortex (mPFC), as delayed alternation in a T-maze is impaired by its lesion. Recently, sustained delay activity similar to that found in monkeys has been found in the dorsal mPFC of rats performing the delayed response task. Furthermore, anatomical studies indicate that the vDLPFC in monkeys and the dorsal mPFC in rats have much in common; for example, both are major targets of parieto-frontal projections. Thus, several lines of evidence indicate that in both monkeys and rodents, the PFC plays a critical role in working memory.

#### Edited by:

Natasha Sigala, Brighton and Sussex Medical School, UK

#### Reviewed by:

Paula Louise Croxson, Icahn School of Medicine at Mount Sinai, USA

Anna S. Mitchell, University of Oxford, UK

#### \*Correspondence:

Ken-Ichiro Tsutsui tsutsui@m.tohoku.ac.jp

Received: 23 November 2015 Accepted: 14 November 2016 Published: 16 December 2016

#### Citation:

Tsutsui K-I, Oyama K, Nakamura S and Iijima T (2016) Comparative Overview of Visuospatial Working Memory in Monkeys and Rats. Front. Syst. Neurosci. 10:99. doi: 10.3389/fnsys.2016.00099

Keywords: monkey, rat, lesion, single-unit recording, prefrontal

# INTRODUCTION

The term "working memory" refers to the cognitive ability to actively maintain and manipulate behaviorally relevant information. The concept of working memory extends far beyond that of short-term memory as a temporary store of information, because working memory is assumed to be a workspace for processing information. Baddeley and Hitch (1974) proposed a multicomponent model of human working memory, consisting of central executive, visuospatial sketchpad and phonological loop components, to which an episodic buffer was later added as a fourth component. Our current understanding of the neural mechanisms of working memory is mainly based on neuropsychological and electrophysiological experiments carried out on monkeys, many of which focused on visuospatial functions. Recently, novel techniques derived from molecular biology have become common for rodents and have started to provide further information concerning the role of specific receptors, cell types, and neural circuits. It is increasingly necessary to integrate the knowledge obtained from monkey and rodent experiments for a deeper understanding of the neural mechanisms linking the molecular, cellular and systems levels. There is also a purely biological interest in comparing the neural background of common cognitive functions between different mammalian species. Here, we provide a comparative overview of visuospatial working memory in monkeys and rats on the systems level.

# VISUOSPATIAL WORKING MEMORY IN MONKEYS

# Neuropsychology—Lesion and Inactivation Studies

For primates, various delay tasks have been used to study the neural background of working memory (for a review see Fuster, 2008). The standard tests for visuospatial working memory are the "delayed response" and "delayed alternation" tasks, whereas those for nonspatial visual working memory are the "delayed match-to-sample" and "delayed object alternation" tasks (**Figure 1**). The prefrontal cortex (PFC) was identified as the brain region responsible for performance in visuospatial delay tasks well before the concept of working memory was established.

FIGURE 1 | Standard delay tasks for monkeys performed on computer-controlled push-button panel. (A) Delayed response. The subject is required to memorize the location of the illuminated button and press it after the delay period when both buttons are illuminated. (B) Delayed alternation. The subject is required to alternate pressing the left and right buttons with intervening delays. The subject's action in the previous trial serves as a cue in the present trial. (C) Delayed match-to-sample. The subject is required to memorize the color of the light illuminated at the cue period, and after the delay, press a button illuminated with the same color. (D) Delayed object alternation. The subject is required to alternate the choice between two colors with intervening delays. In (C,D) the color with which the two buttons are illuminated at the response period is randomized between trials. Arrows indicate the button pressed by the monkey.

The first report of a spatial delay task deficit due to a PFC lesion was made by Jacobsen (1936). Since then, a number of studies making smaller lesions within the PFC have indicated that the ventral dorsolateral prefrontal cortex (vDLPFC), i.e., the area within and around the principal sulcus (Walker's area 46), is the most critical region for visuospatial delay task performance (Mishkin, 1957; Gross, 1963; Goldman and Rosvold, 1970). Studies involving focal unilateral lesions (Funahashi et al., 1993a) or focal unilateral inactivation (Sawaguchi and Iba, 2001) in the vDLPFC of monkeys performing oculomotor delayed response tasks, with eight possible target positions arranged in a circle at 45° intervals and with 16 possible target positions (eight directions and two eccentricities), respectively, suggested that the visuospatial working memory function of the vDLPFC is topographically organized, with each hemisphere basically being responsible for the contralateral visual hemifield. The deficit induced by vDLPFC lesions or inactivation was accordingly characterized as a "mnemonic scotoma". Our recent study using low-frequency (1 Hz) repetitive transcranial magnetic stimulation (rTMS), which temporarily suppresses neural activity in the stimulated brain area, has shown that even in monkeys performing a delayed response task manually, unilateral inactivation of the vDLPFC yields visuospatial working memory deficits for the contralateral hemifield but not for the contralateral hand (Nakamura et al., 2014; Ogawa et al., 2015). In this temporary inactivation study using rTMS, monkeys were trained to manually perform a delayed response task with eight illuminable buttons arranged in a circle, similar to the targets in the oculomotor delayed response tasks of previous studies. The duration of the delay period (1.5, 4.5, 9, or 18 s) was randomized across trials. Low-frequency rTMS was applied to either the left or right vDLPFC before the daily task performance. During the daily session, the hand used (left or right) was switched multiple times. Irrespective of which hand was used, task performance was impaired in a delay-dependent manner only for targets contralateral to the stimulated hemisphere. This result supports the idea that memory coding in the vDLPFC is based on a visuospatial rather than an effector-based coordinate frame.

Tsujimoto and Postle (2012) analyzed subjects' responses in error trials in the oculomotor delayed response task with 16 possible target positions (arranged in eight directions and two eccentricities) and found that errors were made mostly by responding to the correct target position in the previous trial. On the basis of this finding, they proposed that the nature of deficits in delayed response tasks induced by vDLPFC lesions or inactivation is the susceptibility to proactive interference or perseveration rather than mnemonic scotoma. However, by analyzing the data from our study in which we examined the performance of a delayed response task while vDLPFC was inactivated by low-frequency rTMS (Nakamura et al., 2014; Ogawa et al., 2015), we found that most errors were made by responding to the target adjacent to the correct target in the current trial, suggesting the blurring of the topographically organized visuospatial working memory. We speculate that the inconsistency of results in those studies may be due to the difference of the severity of the visuospatial working memory deficit induced by experimental manipulations. A mild impairment of the vDLPFC function may induce the blurring of the memory of the current cue location, resulting in making errors by responding to a target adjacent to the correct one, whereas a severe impairment may induce almost complete disappearance of the memory of the current cue location, resulting in making errors by confusing the memory trace of the current and the previous cue locations. Thus we speculate that the result described by Tsujimoto and Postle (2012) does not contradict the idea of the topographical organization of visuospatial working memory in vDLPFC, but rather it may reflect the severity of deficits induced under their experimental conditions. 
This hypothesis should be tested in future studies by parametrically varying the severity of the deficit, for example by changing the amount of muscimol injected or the intensity of low-frequency rTMS.
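The two error patterns contrasted above (responding to a target adjacent to the correct one vs. responding to the correct target of the previous trial) can be separated with a simple trial-by-trial classification. The following is a minimal sketch, assuming a single ring of eight targets indexed 0 to 7 as in the manual task; the function names and category labels are hypothetical and are not taken from the cited studies.

```python
def circular_distance(a: int, b: int, n_targets: int = 8) -> int:
    """Smallest number of steps between two target indices on a ring."""
    d = abs(a - b) % n_targets
    return min(d, n_targets - d)


def classify_response(response, current, previous=None, n_targets=8):
    """Label one trial's response (labels are illustrative assumptions).

    response: index of the target chosen by the subject
    current:  index of the correct target in this trial
    previous: index of the correct target in the preceding trial, if any
    """
    if response == current:
        return "correct"
    is_adjacent = circular_distance(response, current, n_targets) == 1
    is_previous = previous is not None and response == previous
    if is_adjacent and is_previous:
        # The previous target happens to neighbor the current one,
        # so the two accounts cannot be told apart on this trial.
        return "ambiguous"
    if is_adjacent:
        return "adjacent"   # consistent with a blurred spatial memory
    if is_previous:
        return "previous"   # consistent with proactive interference
    return "other"
```

Trials labeled "ambiguous" are kept separate because they support neither account; counting them toward either category would bias the comparison.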

#### Electrophysiology—Unit Recording Studies

When Fuster and Alexander (1971) and Kubota and Niki (1971) independently recorded single-unit activity in monkeys performing delay tasks for the first time, they discovered a sustained increase in the firing rate of vDLPFC neurons during the delay period. Such activity was considered to be the neuron-level correlate of short-term memory and was later reinterpreted as that of working memory. Niki (1974b) found that many of the neurons with sustained delay activity exhibited different discharge rates depending on the location of the cue, e.g., a higher discharge rate for the "left" cue than for the "right" cue (**Figure 2**). It soon became an issue whether the differential activity codes the information of the cue presented or the action planned, i.e., the problem of retrospective sensory coding vs. prospective motor coding. By comparing the activity of a neuron in the standard delayed response task and in a task that requires a response in a direction different from that of the cue, Niki and Watanabe (1976) found that 70% of differential delay neurons coded the cue location, whereas the remaining 30% coded the response direction. Later, Funahashi et al. (1993b) confirmed the dominance of cue location coding over action direction coding in the vDLPFC by using oculomotor pro- and anti-saccade tasks. Thus, it was indicated that the majority of neurons in the vDLPFC are involved in the retrospective coding of visuospatial information, rather than prospective coding. By using an oculomotor delayed response task with eight possible target positions, Funahashi et al. (1989) found that the differential delay activity was finely tuned to a certain area in the visual field, normally in the contralateral hemifield. Together with the lesion and inactivation studies (Funahashi et al., 1993a; Sawaguchi and Iba, 2001), this suggests a topographic organization of the visuospatial working memory function in the vDLPFC.
Although most attention has been paid to sustained delay activity since its discovery, transient activity related to cue presentation, response execution, and reward delivery has also been reported from the early years of unit recording in the vDLPFC (Fuster, 1973; Kubota et al., 1974; Niki, 1974a; Niki and Watanabe, 1979). Transient activity during cue presentation is considered to be related to the encoding of information in working memory, whereas transient activity after the delay period may be related to the extinction of working memory content, action execution, or evaluation of the outcome of one's action (Fuster, 2008). More recently, it has been found that vDLPFC neurons show transient or sustained activity related to complicated visuospatial processes, such as route planning in a multistep maze (Mushiake et al., 2006) and perceptual categorization of arbitrarily distributed dots (Antzoulatos and Miller, 2011).

#### Anatomy of Monkey PFC

In addition to neuropsychology and electrophysiology, the anatomical connectivity, i.e., fiber projections, between brain regions provides key information for understanding brain functions on the systems level. The monkey PFC can be roughly subdivided into three areas: lateral, medial and orbital (**Figure 3**). As the lateral PFC is well interconnected with various sensory association and higher motor cortices, it may be mainly concerned with interaction with the external world, such as perception and recognition of external stimuli as well as planning and execution of motor actions. On the other hand, as the medial PFC is connected to medial temporal areas, such as the amygdala, the hippocampus, and their surrounding cortical areas, and the hypothalamus, it may be related to internal processes, such as long-term memory, emotion, and autonomic nervous system function. The orbitofrontal cortex (OFC) seems to be specifically involved in reward, punishment and association learning, as it is connected to visual, olfactory and gustatory sensory areas as well as the amygdala, the hippocampus and their surrounding cortical areas, and the hypothalamus. Such an idea of broad functional segregation of the PFC is in accordance with the results of the default-mode analysis of data obtained by PET (Kojima et al., 2009) and fMRI (Mantini et al., 2011), and cortical network analysis of resting-state fMRI data (Hutchison and Everling, 2014). The lateral PFC can be further subdivided into the vDLPFC (Walker's area 46), the ventrolateral PFC (VLPFC; Walker's areas 12 and 45), and the dorsal dorsolateral PFC (dDLPFC; lateral surface of Walker's areas 9 and 8B). The vDLPFC is mainly connected to various areas in the posterior parietal cortex (PPC), such as the superior and inferior parietal lobules (IPL), areas in the intraparietal sulcus (IPS), and the medial parietal (precuneus) cortex (Petrides and Pandya, 1984, 1999, 2006; Cavada and Goldman-Rakic, 1989). In contrast, the VLPFC is mainly connected with the temporal cortex, including the superior temporal cortex (STC), the inferior temporal cortex (ITC), and the areas in the superior temporal sulcus (STS; Webster et al., 1994; Borra et al., 2011; Saleem et al., 2014). The dDLPFC seems to function as an interface between the lateral and medial frontal cortices, as it has reciprocal connections to both.

# Functional Organization of the Lateral PFC

On the basis of our current understanding of the anatomical connections of the PFC described above, it appears quite reasonable that the lines of evidence from neuropsychological and electrophysiological studies indicate the critical involvement of the vDLPFC in visuospatial working memory. The PPC, which provides the major visual input to the vDLPFC, is the terminal region of the dorsal visual pathway. Lesions in this area cause poor performance in a "landmark test", in which subjects are required to select the target closer to a landmark object, reflecting a deficit in the visuospatial guidance of action (Mishkin and Ungerleider, 1982). Surprisingly, however, some studies have shown that inactivating the PPC produces no deficit in a delayed response task (Fuster, 1995; Chafee and Goldman-Rakic, 2000), in which the visuospatial guidance of action is also necessary. Neuronal activity in the PPC during spatial delay tasks has been reported to be similar to that in the PFC, i.e., a large proportion of PPC neurons show differential sustained activity during the delay period (Constantinidis and Steinmetz, 1996; Chafee and Goldman-Rakic, 1998; Qi et al., 2010). When tested using pro- and anti-saccade tasks, most of the PPC neurons were found to code the cue location (Gottlieb and Goldberg, 1999).

Beyond the critical involvement of the vDLPFC in visuospatial working memory, some studies have indicated the functional segregation of working memory within the PFC. Behaviorally, one can dissociate visuospatial and nonspatial object working memory by different types of delay tasks (**Figure 1**). Early lesion studies indicated that whereas the visuospatial working memory was most impaired by vDLPFC lesions (Mishkin, 1957; Gross, 1963; Goldman and Rosvold, 1970), the nonspatial visual working memory was most impaired by lesions in the VLPFC (Passingham, 1975; Mishkin and Manning, 1978). A single-unit study also showed the functional segregation between the vDLPFC and the VLPFC. That is, neurons related to visuospatial working memory were mainly found in the vDLPFC whereas those related to nonspatial visual object working memory were mainly found in the VLPFC (Wilson et al., 1993). Results of those neuropsychological and electrophysiological studies are in good agreement with anatomical connections. Namely, the vDLPFC is mainly connected to the PPC while the VLPFC is mainly connected to the ITC for visual input. However, the idea of the parallelism of the visuospatial and nonspatial working memories between vDLPFC and VLPFC may be an oversimplification (Rushworth and Owen, 1998). A number of single-unit recording studies showed that neurons related to nonspatial visual working memory were distributed not only in the VLPFC but also in the vDLPFC (Watanabe, 1986a; Quintana et al., 1988; Miller et al., 1996; Wallis and Miller, 2003b; Warden and Miller, 2010). Furthermore, other studies have indicated that the vDLPFC is concerned with abstract information beyond any sensory modality: a recent lesion study indicates that the vDLPFC is involved in working memory for abstract rule (Buckley et al., 2009). 
Additionally, a number of single-unit recording studies have reported sustained activity of neurons coding abstract rule information (Wallis and Miller, 2003a; Yamada et al., 2010).

Unlike in the case of the vDLPFC or VLPFC, only a few studies examined the function of the dDLPFC specifically. Petrides (2000) showed by selective lesioning of the dDLPFC that the contribution of this area is critical when monkeys are required to maintain more than two items in their working memory at the same time.

# VISUOSPATIAL WORKING MEMORY IN RATS

# Anatomy of Rat PFC

As the vDLPFC has been indicated as the most critical structure for visuospatial working memory in monkeys, a comparable area in rats would be the most promising candidate for having the same neural function. However, in rats, the anatomical definition of the PFC is not as clear as in monkeys (Preuss, 1995; Uylings et al., 2003). A classical criterion for the PFC in primates is the existence of a granular layer IV; therefore, the PFC has been referred to as the "frontal granular cortex", but there is no such area in the rat frontal cortex. Using another definition, i.e., the projection from the thalamic nucleus medialis dorsalis (MD), we can define the PFC as extending medially and ventrally in the anterior part of the cerebral cortex. For simplicity, we subdivide the PFC into two areas, medial and ventral. The medial and ventral areas of the PFC are referred to as the medial prefrontal cortex (mPFC) and OFC, respectively. The mPFC includes cytoarchitectonically defined areas such as frontal area 2 (Fr2), the dorsal anterior cingulate area (ACd), and the prelimbic (PL) and infralimbic (IL) areas. The OFC includes areas such as the medial, ventral, ventrolateral and lateral orbitofrontal cortices (MO, VO, VLO and LO, respectively).

It appears that, according to the inter-regional connectivity, the mPFC can be further divided into two subareas: the dorsal mPFC, which corresponds to the cytoarchitectonically defined areas Fr2 and ACd, and the ventral mPFC, which corresponds to PL and IL (**Figure 4**). Concerning the thalamo-cortical connectivity, the dorsal mPFC is reciprocally connected to the lateral part of the MD nucleus, whereas the ventral mPFC is reciprocally connected to the medial part of the MD nucleus (Uylings and van Eden, 1990). Concerning the cortico-cortical connectivity, the dorsal mPFC is reciprocally connected to the occipital, parietal and retrosplenial cortices, whereas the ventral mPFC is reciprocally connected to the rhinal cortex and amygdala (Ongur and Price, 2000; Uylings et al., 2003). The ventral mPFC can also be characterized as a medial prefrontal area that receives heavy innervation from the hippocampus (Jay and Witter, 1991; Cenquizca and Swanson, 2007). These anatomical data suggest that the dorsal mPFC is the most likely candidate for the rat brain region comparable to the vDLPFC in the monkey brain, whereas the ventral mPFC in rats may be comparable to the mPFC in monkeys.

# Neuropsychology—Lesion Studies

The most widely used task to test visuospatial working memory in rats is delayed alternation in a T or Y maze (**Figure 5**). Eight-arm radial and figure-eight mazes are also common in testing the visuospatial working memory function. Kolb et al. (1974) tested for the first time whether visuospatial memory deficits can be observed in rats after lesioning a part of the frontal lobe, using a delayed alternation task in a T-maze and a delayed response task in their original device. They found that the performance in those tasks was impaired by the mPFC lesion. Since then, a number of studies have confirmed that an mPFC lesion leads to spatial working memory deficits detected as poor performance in the delayed alternation task in the T or Y maze (Larsen and Divac, 1978; Thomas and Brito, 1980; Eichenbaum et al., 1983; Wolf et al., 1987; Sánchez-Santed et al., 1997). From those studies, it appears that the impairment in delayed alternation in the T or Y maze tends to be more severe when the lesion is limited to the dorsal part of the mPFC than when it is limited to the ventral part of the mPFC. Kesner et al. (1996) dissociated the working memory for egocentric and allocentric spaces by using a six-arm modified plus maze and demonstrated that the egocentric working memory deficit (forgetting whether one has made a right or left turn before) is induced by a dorsal mPFC lesion, whereas the allocentric working memory deficit (forgetting which arm, i.e., place, one has been in before) is induced by a ventral mPFC lesion. The difference in the spatial coordinate used in the dorsal mPFC and ventral mPFC may reflect the difference in the visuospatial information provided by their afferent connections. That is, the dorsal mPFC is mainly connected to the parietal cortex whereas the ventral mPFC is mainly connected to the hippocampus and its surrounding cortical areas (Jay and Witter, 1991; Ongur and Price, 2000; Uylings et al., 2003; Cenquizca and Swanson, 2007).

#### Electrophysiology—Unit Recording Studies

Electrophysiological activity related to visuospatial working memory has not been studied as extensively in rats as in monkeys, but several studies have shown neuronal activity in the rat mPFC that is presumably related to visuospatial working memory. Jung et al. (1998) recorded unit activity mainly in the mPFC during the performance of working memory tasks in an eight-arm radial maze and a figure-eight maze and reported transient activity related to a specific timing in a trial or a specific place in the maze. Baeg et al. (2003) recorded unit activity in the mPFC during performance in a figure-eight maze and indicated that the left or right choice at the end of the central section of the maze can be predicted from the differential activity in the central section of the maze prior to the choice. Similarly, Yang et al. (2014) recorded unit activity in the mPFC during the performance of delayed alternation in a Y maze and found choice-predicting differential activity during the delay period preceding the choice.

Such differential activity can be the retrospective coding of the choice in the previous trial or the prospective coding of the choice in the next trial. However, in both of the latter studies, the differential activity during the delay was transient in many neurons, and only a small population showed sustained activity throughout the delay period. There has been a long debate on why sustained delay activity is only rarely found in the rat mPFC during the performance of delay tasks (Baeg et al., 2003; Yang et al., 2014).

One possible reason why previous studies failed to clearly show delay activity in rats is that the visuospatial working memory coding is different between monkeys and rats. The most straightforward idea is that each of the neurons showing differential sustained delay activity carries information throughout the delay. On the other hand, Batuev et al. (1980) proposed a model in which ensembles of neurons showing transient activity at different timings can relay the information throughout the delay period. It is possible that in monkeys working memory is coded in both ways, whereas in rats it is coded mainly in the latter way. Another possibility is that the rarity of sustained delay activity in rats is derived from task differences. During neuronal recording, monkeys perform delay tasks manually as they sit in a primate chair with their head firmly fixed by a head-fixation device, whereas rats make locomotive movements without restrictions in an environment that is large relative to their body size. It is possible that in freely moving rats, prefrontal neurons that fire transiently in relation to continuous sensory inputs and continuous motor planning and execution outnumber sustained delay neurons, whereas such neurons remain silent in head-fixed monkeys. It is also possible that sustained delay activity can be interrupted by certain sensory stimulation, which may shift the attention of a subject.
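The contrast between the two coding schemes can be made concrete with a toy calculation. The sketch below assumes that each cue-selective neuron is described only by the window of delay time bins in which it is active, and checks whether an ensemble leaves any part of the delay uncovered. It is a simplified illustration of the relay idea, not an implementation of the model of Batuev et al. (1980); the function name and all window values are hypothetical.

```python
def uncovered_bins(windows, n_bins):
    """Return the delay time bins in which no neuron of the ensemble is active.

    windows: list of (start_bin, end_bin) pairs, half-open, one per neuron
    n_bins:  number of time bins spanning the delay period
    """
    covered = [False] * n_bins
    for start, end in windows:
        for t in range(max(start, 0), min(end, n_bins)):
            covered[t] = True
    return [t for t, is_covered in enumerate(covered) if not is_covered]


# Hypothetical examples for a delay divided into 20 bins.
sustained = [(0, 20)]                          # one neuron active throughout
relay = [(0, 6), (5, 11), (10, 16), (15, 20)]  # overlapping transient neurons
broken_relay = [(0, 6), (10, 16)]              # transient neurons with gaps
```

In this toy description, a single sustained neuron and an overlapping sequence of transient neurons both cover every bin of the delay, so recordings that sample only a few transient neurons would not by themselves rule out continuous maintenance at the ensemble level.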

FIGURE 6 | Delayed response task for head-fixed rats. (A) Apparatus used for experiments with head fixation (Left, top view; right, side view). A rat is laid in a prone position with its head fixed and body loosely restrained in a half-cylinder acrylic chamber. (B) Sequence of task events in a trial. At the beginning of a trial, an LED, either on the left or right, is illuminated for a short time then turned off. After a delay period, two spouts protrude towards the mouth of the rat. The correct response is to lick the spout on the same side as the LED illuminated before the delay. The duration of the delay was typically 2 s for single-unit recording. Correct responses are rewarded with a drop of sucrose from the spout. Prior to the behavioral training, the head fixation device was implanted under anesthesia. After a period for recovery from the surgery, rats were habituated to the head-fixation condition by giving free reward from the spout. Then, the rats were trained in the delayed response task. As they performed about 300 to 400 trials per day, the correct rate gradually increased and reached over 80% in 2 or 3 weeks.

To address the second possibility, we recorded single-unit activity from the mPFC of head-fixed rats performing a delayed response task (**Figure 6**). We found a considerable number of neurons showing sustained activity during the delay period, many of which were differential between "left" and "right" trials (**Figure 7**). Importantly, these sustained delay neurons appeared to be more densely distributed in the dorsal mPFC than in the ventral mPFC. We recorded from over 200 neurons from both areas and found that 17% of dorsal mPFC neurons showed differential sustained activity during the delay period of the delayed response task, whereas only 8% of ventral mPFC neurons did so. This result corresponds to the anatomical connectivity showing that the dorsal mPFC is the main target of parieto-frontal projections (Ongur and Price, 2000; Uylings et al., 2003), which may convey egocentric visuospatial information. To specify whether the recorded sustained delay activity was coding the location of the cue retrospectively or the direction of the movement prospectively, unit activity was recorded under the pro- and anti-response rules. Under the pro-response rule, a rat was required to lick the spout on the same side as the cue illuminated before the delay period. Under the anti-response rule, the rat was required to lick the spout on the side opposite to the cue illuminated before the delay period. The rule was switched every eight trials. Surprisingly, fewer than 20% of all differential delay neurons coded the cue location, whereas the rest coded the response direction. This result is the opposite of that obtained from the monkey vDLPFC, where the majority of neurons coded the cue location retrospectively during delayed response performance. Further investigation is needed to specify whether the rat mPFC primarily codes the planned action or whether the result depends on the subjects' strategy in performing the delayed response task.
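The logic of separating cue coding from response coding with pro- and anti-response rules can be sketched as follows. Under the pro rule the cue side and the response side coincide, whereas under the anti rule they are opposite, so a neuron whose preferred cue side is the same under both rules follows the cue, and one whose preferred cue side reverses follows the response. This is a minimal sketch under stated assumptions: the function names, the dictionary layout, and the `min_diff` threshold are hypothetical, and a real analysis would rely on a statistical test across trials rather than a fixed threshold on mean rates.

```python
OPPOSITE = {"left": "right", "right": "left"}


def required_response(cue, rule):
    """Side the rat must lick, given the cue side and the current rule."""
    if rule == "pro":
        return cue
    if rule == "anti":
        return OPPOSITE[cue]
    raise ValueError(f"unknown rule: {rule!r}")


def classify_delay_neuron(rates, min_diff=0.0):
    """Classify a neuron from its mean delay-period firing rates.

    rates: dict mapping (cue, rule) to a mean rate in spikes/s, with
           cue in {"left", "right"} and rule in {"pro", "anti"}
    min_diff: hypothetical threshold below which a left/right difference
              is treated as absent
    """
    pro_diff = rates[("left", "pro")] - rates[("right", "pro")]
    anti_diff = rates[("left", "anti")] - rates[("right", "anti")]
    if abs(pro_diff) <= min_diff or abs(anti_diff) <= min_diff:
        return "unclassified"
    # Same sign: preference follows the cue side under both rules.
    # Opposite sign: preference follows the lick direction instead.
    return "cue" if pro_diff * anti_diff > 0 else "response"
```

For example, a hypothetical neuron firing at 20 spikes/s on left-cue pro trials and 19 spikes/s on right-cue anti trials, but at about 5 spikes/s otherwise, is active whenever a left lick is required and would be labeled "response".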

# LIMITATIONS AND FUTURE PERSPECTIVES

In this article, we have provided a comparative overview of visuospatial working memory in monkeys and rats. Experimental neuropsychological studies over the years have indicated that the monkey vDLPFC plays critical roles in visuospatial working memory covering the peripersonal space, and that the functionally comparable rat brain region is probably the dorsal mPFC. Monkey electrophysiological studies have indicated that the sustained delay activity typically recorded in the vDLPFC may be the neural background of visuospatial working memory, and our recent study has found similar neuronal activity in the dorsal mPFC in rats. Anatomical studies indicate that the vDLPFC in monkeys and the dorsal mPFC in rats have much in common; for example, both are major targets of parieto-frontal projections. In summary, we conclude that the evidence accumulated to date from anatomical, neuropsychological, and electrophysiological studies suggests a similarity between the monkey vDLPFC and the rat dorsal mPFC in their roles in visuospatial working memory.

We should mention here the limitations of this review. First, to keep the discussion well focused, we strictly limited the subject to visuospatial working memory in monkeys and rats, which resulted in a focus on specific regions of the PFC, the monkey vDLPFC and the rat dorsal mPFC. Much evidence from human neuropsychological and neuroimaging studies indicates that the human PFC is involved not only in visuospatial working memory but also in nonspatial working memory of various modalities, as well as many other aspects of cognitive and executive control functions (e.g., Owen et al., 1996; Koechlin et al., 1999; Olesen et al., 2004; for review, Stuss and Knight, 2002; Fuster, 2008; Passingham and Wise, 2012). Monkey electrophysiological studies have shown the neural correlates of various cognitive functions besides working memory within the PFC on the single-neuron level, such as response inhibition (Watanabe, 1986b), attentional control (Sakagami and Tsutsui, 1999; Lebedev et al., 2004), categorical recognition (Freedman et al., 2001; Antzoulatos and Miller, 2011; Tsutsui et al., 2016b), numerical recognition (Nieder et al., 2002), rule-based judgments (Wallis et al., 2001; Mansouri et al., 2006; Yamada et al., 2010), value-based decision making (Barraclough et al., 2004; Cai and Padoa-Schioppa, 2014; Tsutsui et al., 2016a), and complex action planning (Mushiake et al., 2006). We do not intend to claim that the function of the entire PFC can be explained solely by working memory, and indeed we admit that even the above list of PFC functions is not at all exhaustive. Nevertheless, working memory, i.e., the active maintenance and manipulation of information, may be the key element of any higher function that the PFC is responsible for, as we discuss in the last paragraph of this section. Second, we did not intend to make an exhaustive comparative study of monkeys and rats.
Rather than comparing differences in various aspects of their physical and behavioral features, we focused on their common behavior, that is, they actively move around in the environment to explore and forage. Monkeys and rats use different types of senses to collect information from the environment; for example, whisking and sniffing may be specific to rats. Nevertheless, vision can be important in both monkeys and rats for recognizing the spatial information necessary to generate appropriate actions. In general, spatial information is supramodal, as it is established by combining information from different sensory modalities. Therefore, we consider that there can be many common aspects between different species in the neural coding of space. Indeed, by introducing the head-fixed experimental setting, we found neurons in the rat dorsal mPFC that code the location of sensory cues or the direction of an intended movement, similar to what has been found in the monkey vDLPFC.

Here, we should also mention that the function of the rat frontal cortex is still under debate, with some researchers having views quite different from ours. Wise (2008) argued in his review comparing the frontal cortices of primates and rodents that there is no brain region in the rodent frontal cortex that is comparable to the primate PFC, referring to the conventional anatomical definition of the primate PFC as the frontal "granular" cortex, which is characterized by the prominence of granule cells in layer IV. However, if we refer to the inter-regional fiber projections, which constitute large-scale neural networks, instead of the cytoarchitecture, which mainly reflects the features of a local neural network, there appears to be a common rule preserved between species: the dorsomedial, ventromedial, and orbital parts of the rat frontal cortex have cortico-subcortical and cortico-cortical projection patterns similar to those of the lateral, medial, and orbital parts of the monkey frontal cortex. As we have extensively reviewed in this article, neuropsychological and electrophysiological studies of monkeys and rats indicate that the monkey vDLPFC and rat dorsal mPFC appear to play a critical role in visuospatial working memory. By citing several monkey neuropsychological and electrophysiological studies, Wise (2008) further argued that the functional characteristic of the granular cortex in primates is not working memory, or the temporary storage of behaviorally relevant information, but the storage of "knowledge" that guides nonroutine behavior, such as rules and strategies. Indeed we admit that the rat PFC is not a replica-in-miniature of the monkey PFC, just as the monkey PFC is not that of the human PFC. Behavioral flexibility, which may be a manifestation of the PFC function, is more prominent in monkeys than in rats, and in humans than in monkeys.
However, if the rule- or strategy-based behavior was specifically associated with the granular frontal cortex, the logical expectation is that rodents that lack the granular frontal cortex should not exhibit rule- or strategy-dependent behavior. In our studies, however, the rats learned to switch between pro- and anti-licking delayed responses as frequently as every eight trials. We consider that the notion that rodents do not have any PFC at all may be an underestimation of the capacity of the rodent frontal cortex function.

For the next step of the comparative study of the visuospatial working memory, we consider that it is important to investigate the flow of information in a large-scale network in both monkeys and rats. For such a purpose, we can benefit from recent progress in analytical methods and computing power. Through network information flow analysis of various forms of neural data, not only PET and fMRI images, but also simultaneously recorded electrocorticogram (ECoG), local field potential (LFP), and single-/multiple-unit activities throughout multiple brain regions, we may reveal how different brain areas work in harmony and how information is processed throughout the neural network. Furthermore, new techniques, such as optogenetics and TMS, that enable the event-related manipulation of local neural activity during task performance would be useful to test the validity of a network information flow model. The proposed inter-cellular mechanism of sustained delay activity is a reverberating neural circuit. The simplest such circuit consists of reciprocally connected excitatory neurons. Empirically, both the monkey vDLPFC (Petrides and Pandya, 1984, 1999, 2006; Cavada and Goldman-Rakic, 1989) and rat dorsal mPFC (Ongur and Price, 2000; Uylings et al., 2003) are reciprocally connected to the PPC. They also form thalamocortical and cortico-striato-thalamo-cortical circuits, as the other frontal regions do (Alexander et al., 1986). It is quite possible that the closed-loop reverberating circuit is included in those interregional projections. We will be able to test these hypotheses both in the monkey and rat brains by applying the network information flow analysis of various kinds of neural data simultaneously obtained throughout the brain.
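The reverberating-circuit idea can be illustrated with a minimal sketch. It assumes a two-unit firing-rate model with a sigmoidal activation function; the model form, all parameter values, and the names `simulate` and `activation` are illustrative assumptions and are not taken from the studies cited above. A brief cue drives unit 1, and the question is whether activity outlasts the cue when the two units excite each other (`w = 1.0`) compared with when they are disconnected (`w = 0.0`).

```python
import math


def activation(x, theta=0.5, slope=0.1):
    """Sigmoidal rate function (assumed form; saturates between 0 and 1)."""
    return 1.0 / (1.0 + math.exp(-(x - theta) / slope))


def simulate(w, t_end=1000.0, dt=1.0, tau=20.0, cue_on=100.0, cue_off=200.0):
    """Euler-integrate two rate units coupled by mutual excitation of weight w.

    Unit 1 receives a transient cue between cue_on and cue_off (ms).
    Returns a list of (time_ms, r1, r2) tuples.
    """
    r1 = r2 = 0.0
    trace = []
    for k in range(int(t_end / dt)):
        t = k * dt
        cue = 1.0 if cue_on <= t < cue_off else 0.0
        # Each unit relaxes toward the activation of its total input.
        dr1 = (-r1 + activation(w * r2 + cue)) / tau
        dr2 = (-r2 + activation(w * r1)) / tau
        r1 += dt * dr1
        r2 += dt * dr2
        trace.append((t, r1, r2))
    return trace


coupled = simulate(w=1.0)    # reciprocal excitation present
uncoupled = simulate(w=0.0)  # no reciprocal connection

print("coupled, end of delay:   r1=%.3f r2=%.3f" % coupled[-1][1:])
print("uncoupled, end of delay: r1=%.3f r2=%.3f" % uncoupled[-1][1:])
```

Under these assumed parameters, the coupled pair remains in a high-activity state long after the cue has ended, whereas the uncoupled unit relaxes back to baseline. This is only a toy demonstration of the principle; it says nothing about which anatomical loop, if any, implements the reverberation.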

Visuospatial working memory is the most well-studied function of the PFC and has been a central topic of PFC research for a long time. However, it is only a part of a vast variety of PFC functions. Dysfunction of the PFC leads to deficits in various cognitive abilities, such as visuospatial and object working memory and attention, inhibitory control of movement, motivational and emotional regulation, prospective inference, behavioral planning, and decision making (Stuss and Knight, 2002; Fuster, 2008; Passingham and Wise, 2012). Nonetheless, we believe that the investigation of the visuospatial working memory function using standard delay tasks would lead us to a fundamental understanding of the PFC function in general. One important aspect of the visuospatial working memory is that it can encode and extinguish information whenever necessary. Not only immediate encoding of information but also immediate clearance of the memory buffer is essential for avoiding confusion regarding the memorized information between trials. Indeed, proactive interference, the interference of the past memory over the new memory, occurs owing to PFC damage. Another important aspect is the conversion of information, such as from visual to motor, in the case of a delayed response. The PFC is capable of switching between different conversion rules immediately, such as from the pro- to anti-response rule or vice versa. In humans, such function can be examined using the Wisconsin Card Sorting Test, which has been a standard neurological procedure to test the PFC function. Immediate encoding, extinction and multiple conversion of information seem to be the key features of the PFC, which cannot be observed in other cortical regions, and may be the biological background of cognitive thought processes. 
These views are still at the level of working hypotheses, but we consider that this kind of reductionist attitude would be of much importance when investigating the function of the PFC, as we normally tend to end up adding a new item to a long-lasting list of PFC functions after conducting a new study. In addition, together with studies directly testing the working hypotheses, studies showing what kind of function a certain part of the PFC is ''not'' involved in (e.g., Baxter et al., 2008; Minamimoto et al., 2010) can sometimes be more informative than so-called ''positive'' reports that further extend the list of PFC functions.

#### ETHICS STATEMENT

Animal Care and Use Committee of the Tohoku University Environmental & Safety Committee.

#### AUTHOR CONTRIBUTIONS

K-IT, KO, SN and TI wrote the manuscript while K-IT took a major role.


#### ACKNOWLEDGMENTS

This study was supported by Grants-in-Aid for Scientific Research (KAKENHI: #24223004 and #24243067 to K-IT and #50615250 to KO), a Grant-in-Aid for Scientific Research on Innovative Areas ''Adaptive Circuit Shift'' (#26112009 to K-IT), and the Strategic Research Program for Brain Sciences (SRPBS) provided by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan.



**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Tsutsui, Oyama, Nakamura and Iijima. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Monitoring and Control of Task Sequences in Human and Non-Human Primates

Theresa M. Desrochers<sup>1</sup>\*, Diana C. Burk<sup>2</sup>, David Badre<sup>1,3</sup> and David L. Sheinberg<sup>2</sup>

<sup>1</sup> Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, RI, USA, <sup>2</sup> Department of Neuroscience, Brown University, Providence, RI, USA, <sup>3</sup> Brown Institute for Brain Science, Brown University, Providence, RI, USA

Our ability to plan and execute a series of tasks leading to a desired goal requires remarkable coordination between sensory, motor, and decision-related systems. Prefrontal cortex (PFC) is thought to play a central role in this coordination, especially when actions must be assembled extemporaneously and cannot be programmed as a rote series of movements. A central component of this flexible behavior is the moment-by-moment allocation of working memory and attention. The ubiquity of sequence planning in our everyday lives belies the neural complexity that supports this capacity, and little is known about how frontal cortical regions orchestrate the monitoring and control of sequential behaviors. For example, it remains unclear if and how sensory cortical areas, which provide essential driving inputs for behavior, are modulated by the frontal cortex during these tasks. Here, we review what is known about moment-to-moment monitoring as it relates to visually guided, rule-driven behaviors that change over time. We highlight recent human work that shows how the rostrolateral prefrontal cortex (RLPFC) participates in monitoring during task sequences. Neurophysiological data from monkeys suggests that monitoring may be accomplished by neurons that respond to items within the sequence and may in turn influence the tuning properties of neurons in posterior sensory areas. Understanding the interplay between proceduralized or habitual acts and supervised control of sequences is key to our understanding of sequential task execution. A crucial bridge will be the use of experimental protocols that allow for the examination of the functional homology between monkeys and humans. We illustrate how task sequences may be parceled into components and examined experimentally, thereby opening future avenues of investigation into the neural basis of sequential monitoring and control.

Keywords: sequential control, frontal cortex, monitoring, attention, executive functions, imaging studies, TMS, electrophysiology

#### Edited by:

Natasha Sigala, University of Sussex, UK

#### Reviewed by:

Charlie R. E. Wilson, Institut national de la santé et de la recherche médicale, France

Igor Bondar, Institute of Higher Nervous Activity and Neurophysiology RAS, Russia

#### \*Correspondence:

Theresa M. Desrochers theresa\_desrochers@brown.edu

Received: 29 September 2015 Accepted: 18 December 2015 Published: 21 January 2016

#### Citation:

Desrochers TM, Burk DC, Badre D and Sheinberg DL (2016) The Monitoring and Control of Task Sequences in Human and Non-Human Primates. Front. Syst. Neurosci. 9:185. doi: 10.3389/fnsys.2015.00185

#### INTRODUCTION

We perform sequences of tasks every day. They range from the relatively simple and practiced, such as making a cup of coffee, to the more complex and infrequent such as cooking a three-course dinner for a large group of people. These sequences of tasks have common features. First, they are structured such that there is a superordinate goal that is served by multiple subordinate subgoals (Lashley, 1951). Second, the series of steps stay constant, but the specific sequences of motor actions can vary. Third, these sequences of tasks are often executed with little or no external cues as to the required order of the steps or which steps have already been completed.

These common features hint at the underlying complexity of task sequences, and begin to illustrate the distinction between sequences that are automatic or proceduralized and sequences that require control in a more supervised manner. This distinction between automatic and supervised actions has been proposed before. Norman and Shallice (1986) contrasted a ''contention scheduling'' process that selected a series of habitual actions based on their value with a ''supervisory attentional system.'' The supervisory system was capable of overriding habitual action, but did not directly select individual actions. A similar distinction between two systems has also been made in the context of avoiding errors in action (Reason, 1990). There has been some disagreement as to the exact separation between these systems (e.g., see Botvinick and Plaut, 2004; Cooper and Shallice, 2006), but there is general agreement that a lapse in supervised control may lead to the automatic execution of a nondesired action. In this review, we will refer to the scheduler of the more automatic or proceduralized sequences of actions as the ''schematic controller'', where schema are defined as sets of organized responses that can be executed as a unitary mass (Reason, 1990). The system that monitors, handles exceptions, and keeps track of progress towards a higher-level goal we will refer to as the ''supervised controller''. These controllers are networks of areas (**Figure 1A**) that most likely function in feedback loops (**Figure 1B**). We will first address the kinds of sequences that typically fall under schematic control and then shift our main focus to the neural basis of sequences of cognitive tasks and their supervised control.

Although sequential tasks can seem simple because we execute them with relative ease, they require supervisory control. Anyone who has prepared coffee, but forgot to turn on the coffee maker in the morning, or has mistakenly put the can of peas in the refrigerator and the milk in the pantry has experienced a failure of this sequential task system. The kind of control necessary to execute sequences of tasks feels intuitively understood, and cognitive control functions have typically been attributed to the frontal cortex (Stuss and Benson, 1984; Miller and Cohen, 2001; Passingham and Rowe, 2002; Badre, 2008). However, specific deficits in sequential task execution have been difficult to pinpoint with classic clinical tests of cognitive function. Patients with frontal lobe dysfunction are impaired in their higher-order planning and sequencing capabilities and are not capable of independent living, yet they perform well on conventional tests of executive function (Eslinger and Damasio, 1985; Shallice and Burgess, 1991). Similarly, deficits in sequential multistep tasks are pervasive across many neurological and psychiatric disorders, and many patients cannot function normally in everyday life (e.g., Pauls et al., 2014). Thus, a better understanding of how we perform sequences of tasks and the underlying neural circuitry would make great strides towards helping large populations of people with deficits in these functions. In addition, we would contribute to our understanding of a fundamental, yet complex, daily behavior.

Investigation of sequential task performance must occur at multiple levels (**Figure 1C**) to understand both the high-level cognitive systems and the activity patterns at the neuronal level. This review examines what we know of the frontal and striatal neural circuits involved in motor sequences, monitoring, attention, and cognitive control that are all necessary in order to complete sequences of tasks. While we mainly present studies of visually guided tasks, we posit that task sequences driven by other modalities would use similar mechanisms. We also acknowledge that there is a rich literature encompassing the role of structures outside frontal and striatal circuitry, such as the hippocampus and medial temporal lobe (MTL), typically associated with navigation (e.g., Iglói et al., 2010; Pfeiffer and Foster, 2013) and learning/memory (e.g., Schendan et al., 2003; Ross et al., 2009; Albouy et al., 2013). Similarly, recent work has implicated the MTL in representing sequential patterns in stimuli (Schapiro et al., 2013, 2014; Wang et al., 2015). Although these systems almost undoubtedly interact with frontal and striatal systems and contribute to the performance of task sequences (for example reviews, see Ranganath and Ritchey, 2012; Dehaene et al., 2015), we do not focus on them here because they are outside the scope of our review. Schematic and supervisory control functions are not typically ascribed to the hippocampus and associated structures (McDonald and Hong, 2013).

Here, we integrate the findings of human and non-human primate studies in order to outline the interplay between schematic and supervised control circuits for the execution of task sequences. As the existing literature does not yet provide a comprehensive mechanism for sequential cognitive tasks, we assert the importance of investigating task sequences in human and non-human models. Furthermore, we propose a paradigm for the study of task sequences in non-human primates that would enable a direct investigation of the neural mechanisms underlying sequences of tasks. With this review, we aim to connect previous literature on schematic and supervised sequences, and motivate the field to pursue investigation of sequential task control.

#### MOTOR SEQUENCES

There is a rich literature examining the learning and execution of motor sequences, and the systems in the brain that support these sequences. While it is possible that these same systems are engaged in sequences of tasks, the extent to which this is true remains unknown. Further, an understanding of motor sequences is necessary, as many task sequences are composed of motor sequences. Many sophisticated behaviors, such as playing a musical instrument, require the concatenation of a series of complex motor acts. The series of steps can be preplanned and is often rehearsed to reduce variability and increase accuracy. Moreover, seemingly simple acts, such as reaching and grasping, also consist of multiple steps that rely more on subcortical areas and spinal cord circuits to appropriately execute (Whishaw et al., 2008; Azim et al., 2014). Here, we will discuss the literature that has investigated the neural basis of several kinds of motor sequences: muscle activation sequences, habitual motor sequences, and supervised movement sequences. Once learned, these sequences can be executed with minimal cognitive oversight and would fall under the purview of cognitive goals. Thus understanding the neural circuitry that underlies motor sequences, even when under schematic control, is crucial for furthering the understanding of higher-level task sequences.

#### Muscle Activation Sequences

Movements that involve multiple muscle groups can be characterized as sequences, as they require the temporal control of muscle activation and inhibition. For coordinated, cyclical movements that do not require persistent attention to execute (e.g., breathing, walking, pyloric rhythm) there are central pattern generators in subcortical structures, the spinal cord and the periphery to control these behaviors, and thus limit cognitive control and cortical involvement (Marder and Bucher, 2001). Often these behaviors are innate, or once learned, are not subject to extensive modification. Overlearned movements can be offloaded to extracortical structures as they become nearly reflexive, despite the fact that they may be composed of multiple complex steps in animals (Ito, 2000; Doyon et al., 2002) and humans (Toni et al., 1998; Swett et al., 2010). While it is possible for supervisory control to override the timing and expression of automatic behaviors (e.g., telling yourself to breathe), such nearly automatic sequences do not rely on higher cortical areas for expression and thus would fall under the purview of schematic control.

#### Habitual Motor Sequences

Habits and addictive behaviors often involve the repetition of motor acts. Reward-action associations are represented by differential activity of regions throughout the brain, but particularly within the basal ganglia (**Figure 1A**). For example, the striatum is necessary for associating a particular action and reward, e.g., always press the right button for a reward (Berke et al., 2009). Stimulus-response associations represented in the striatum extend to entire sequences of actions that may become habitual. Studies in rodents have shown that neurons in the striatum mark the boundaries of action sequences (Jog et al., 1999; Barnes et al., 2005; Jin and Costa, 2010; Smith and Graybiel, 2013; Jin et al., 2014). This representation develops through learning (Jog et al., 1999; Barnes et al., 2005) and results in a sequence representation that does not include the specific, intermediate steps of the sequence (Jin et al., 2014). Studies in primates have also shown striatal activity at the boundaries of movement sequences (Fujii and Graybiel, 2003, 2005; Desrochers et al., 2015a). This striatal activity in primates also develops through learning, and activity at the end of the movement sequence may represent an integrated cost/benefit signal that can drive the acquisition of more efficient sequences (Desrochers et al., 2015a). Additionally, the basal ganglia play a critical role in the temporal control of movement sequences. Inactivation of the main motor output unit of the basal ganglia, the sensorimotor area of the globus pallidus internus, slowed the steps of a sequential out-and-back reach task, but did not interrupt the step order or completely inhibit the primates' movement (Desmurget and Turner, 2010).

These studies suggest that motor sequence storage is not the primary role of the basal ganglia for well-learned, routinized actions, and that the basal ganglia likely serve as a gating mechanism for movement and competing action plans, i.e., play a role not only in schematic control, but also in supervisory control. Such supervision would require the evaluation of an entire series of actions, which neurons in the striatum have been shown to do (e.g., Desrochers et al., 2015a). Pharmacological inactivation of the caudate in primates during a double-step saccade task revealed the existence of competing motor plans. During the caudate inactivation, the subjects exhibited an increased number of averaged saccades, curved saccades and sequence errors (Bhutani et al., 2013). Although the concept of competing motor plans has been observed in cortical neural recordings and human behavioral tasks (Cisek and Kalaska, 2005; Gallivan et al., 2015) the extent to which this competition is observed and how the competing plans are chosen is still not well understood. Additionally, habit learning can induce strong links between steps, which can cause the completion of an earlier step to become the cue for a subsequent step in a series.

#### Supervised Movement Sequences

As many sequential tasks are not composed of actions with rigid ordinal positions, and can happen on varying time scales, it is critical to study behaviors that allow for different ordering and combinations of movements, necessitating oversight by the supervisory control system. Various behavioral tasks have been used to study the execution of non-habitual motor sequences, including saccades (Zingale and Kowler, 1987; Petit et al., 1996; Grosbras et al., 2001; Isoda and Tanji, 2004), arm movements (Morasso, 1981; Wainscott et al., 2005; Overduin et al., 2008; Moisello et al., 2009; Panzer et al., 2009) and hand movements (Miyachi et al., 1997; Shima and Tanji, 2000). One particular task, the ''push-pull-turn'' task (**Figure 2A**), helped elucidate the role of the supplementary (SMA) and pre-supplementary (pre-SMA) motor areas (**Figure 1A**) in the control of sequential movements. Non-human primates learned to complete three different movements in different orders, initially with the aid of cues at each step. They were subsequently trained to perform the different motor sequences from memory, with only a single cue used to signify which sequence to execute. The investigators discovered three notable patterns of neural activity from single-unit recordings in pre-SMA and SMA: sequence-specific activity that varied during the first trial of each sequence type, position-specific activity that tracked the rank of the three movements, and interval-selective activity which varied depending on which movement had just been completed and which was next. In a separate experiment, the same investigators demonstrated that the inactivation of SMA interrupted the execution of a motor sequence, but not the execution of each individual movement (Shima and Tanji, 1998), as would be expected from an area involved in the supervision, but not direct execution of movements. 
Single-unit recordings in SMA and pre-SMA during an eight-stage sequence also showed activity related to the numerical ordering of the movement stages (Clower and Alexander, 1998). The neural coding of multiple facets (sequence, position, and interval) of movement sequences suggests mechanisms by which a supervisory controller may act (**Figure 1B**). Simultaneously, these studies provided a useful method for investigating how neuronal populations code for the phases and transitions of motor sequences.

Human and primate studies also indicate that activity in the frontal cortex plays an important role in the supervision of motor sequences. Imaging studies have demonstrated that neural activity varies depending on the type of sequence and stage of learning. When human subjects learned to complete multiple sets of key presses, both prefrontal cortex (PFC) and lateral premotor cortex activation increased during new sequence learning, while SMA activity increased during the execution of pre-learned sequences (Jenkins et al., 1994). In another study, interval and rank information were related to different levels of frontal cortical activity within the same network. Pre-SMA was more activated by interval information, while the SMA and frontal eye fields were activated more by rank-order information (Schubotz and von Cramon, 2001). On a finer spatial-temporal scale, subpopulations of neurons in the SMA, pre-SMA, dorsal lateral prefrontal cortex (DLPFC), and supplementary eye fields (SEF) exhibited rank-order activity (Berdyyeva and Olson, 2010). Neural recordings in anterior cingulate cortex (ACC) during a sequential trial-and-error problem solving task also showed activity related to rank-order (Procyk et al., 2000). These findings demonstrate that there is distributed processing across cortical areas during sequence expression. However, despite associations between particular areas and sequence task variables, it remains unclear how the responses in these cortical areas jointly represent sequence learning, intervals and rank information. The continued study of sequences of tasks would serve to demonstrate the different roles these areas play in tracking sequence expression.

continues performing the remembered sequence of movements. (B) Saccade countermanding task (a.k.a. stop-signal task). The subject is instructed to hold fixation on a central fixation point until it is extinguished and make a saccade to a target that appears in the periphery; this is called a "no-stop" trial. On a fraction of trials, after the initial fixation point is extinguished and before the peripheral target appears, the fixation point reappears in the center. This is the "stop-signal" to abort the saccade and maintain fixation on the central fixation point. On a "stop trial," maintaining fixation would be correct and executing the saccade to the periphery would be an error. The duration of the time before the fixation point reappears, the stop-signal delay, can be modulated to titrate the difficulty of the task: the longer the delay, the more difficult it is to abort the saccade. (C) Wisconsin Card Sort Task (WCST). The subject is instructed to select the visual stimulus that matches the sample stimulus based on one of two rules: color match or shape match. The current rule is determined by trial and error and remains in operation until a dimension change. The subject presses a button to select the appropriate stimulus and is given feedback after the response.

In order to fully understand how the brain is able to monitor and complete the stages of a task sequence, it is important to decouple automatic, procedural tasks and tasks requiring supervisory control by considering the attribution of errors. For example, if you realize you had forgotten to add water to the coffee maker after turning it on, it would not make sense to throw the grounds out and restart from the beginning. Rather, it would be sensible to temporarily turn off the machine, add water, and continue, despite the misordered step. This monitoring of the higher-level goal, to make the coffee, allows for flexibility in how the task is achieved. This level of executive function requires interaction among error monitoring, attention and cognitive control circuits, which we will discuss in the following sections.

#### MONITORING OF ERRORS AND CONFLICT

Many theories of executive control have emphasized the necessity of monitoring processes (e.g., Logan, 1985). Early work using event-related potentials (ERPs) described the error-related negativity (ERN) that is observed during error trials and is localized to medial frontal cortex (Gehring et al., 1993). Sequences of tasks require monitoring at both the higher-order sequence level and at each stage. There are at least two, nonexclusive alternatives for how higher-order monitoring could occur: (1) in a sustained fashion such that the sequence is constantly monitored against a reference set that determines how next to proceed; or (2) in a transient fashion at crucial choice points, such as the boundaries (beginning and end) of a sequence. There is strong evidence that online monitoring is key to handling perturbations of coordinated and goal-related movements (Todorov and Jordan, 2002; Scott, 2004; Diedrichsen et al., 2010), but it remains unknown whether a similar mechanism is used to monitor performance during a sequence of cognitive tasks. We propose that this monitoring is carried out by a supervisory controller that comprises a constellation of neural areas and that for real-world, naturalistic sequence tasks, this monitoring requires recurrent interactions between these areas in an active and dynamic way.
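The two alternatives for higher-order monitoring can be made concrete as idealized time courses over the steps of a sequence. The sketch below is purely illustrative: the function name `monitoring_profiles` and the binary 0/1 coding are assumptions introduced here, not a model proposed in the literature cited above. A sustained monitor is engaged at every step, whereas a transient monitor is engaged only at the sequence boundaries (first and last step).

```python
def monitoring_profiles(n_steps):
    """Return idealized (sustained, transient) monitoring engagement per step.

    sustained: 1 at every step of the sequence.
    transient: 1 only at the boundaries (first and last step), else 0.
    """
    if n_steps < 1:
        raise ValueError("a sequence needs at least one step")
    sustained = [1] * n_steps
    transient = [1 if i in (0, n_steps - 1) else 0 for i in range(n_steps)]
    return sustained, transient


# Hypothetical four-step task sequence.
sustained, transient = monitoring_profiles(4)
print("sustained:", sustained)
print("transient:", transient)
```

Profiles of this kind could, in principle, serve as separate predictors when asking which dynamic better describes activity in a given area; because the alternatives are nonexclusive, a mixture of both is also possible.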

Few studies have directly examined transient vs. sustained dynamics of cognitive monitoring. One such study used a hybrid fMRI design during a task switching paradigm (Braver et al., 2003). The authors found evidence for transient activity in left lateral PFC and sustained activity, elevated throughout task performance with respect to baseline/no task, in right anterior PFC; both were modulated by trial-by-trial differences in response speed. These results provide initial evidence for both kinds of control dynamics (sustained and transient) and suggest that they are separable in the brain. Another study used a wide variety of identification, matching, search, and judgment tasks and found both transient and sustained dynamics in many different frontal-cortical areas, suggesting that different monitoring dynamics were not unique to PFC (Dosenbach et al., 2006). Because these tasks share many properties with sequence tasks, we will discuss examples of them and some of the commonly reported transient cortical dynamics that are associated with aspects of these tasks. In particular, we focus on two monitoring processes: error monitoring and conflict monitoring.

Several medial frontal cortical areas (**Figure 1A**) have been implicated in error monitoring by their selective response to errors. A common task used to study errors is the countermanding task (**Figure 2B**). In this task, the participant is instructed to make a movement to a cued peripheral target following a go signal. However, on a fraction of trials, rather than completing the cued movement the participant is presented a stop signal which instructs them to abort the execution of the planned movement. Monkeys and humans perform this task similarly (Emeric et al., 2007), and error responses in this task have been localized to the ACC using event related potentials (ERP) in humans (Godlove et al., 2011; Reinhart et al., 2012), as well as local field potential (LFP; Emeric et al., 2008) and single-unit (Ito et al., 2003) recordings in monkeys. Similarly, the SEF have been implicated in error monitoring in studies using human fMRI (Curtis et al., 2005), monkey LFP (Emeric et al., 2010) and single unit (Stuphorn et al., 2000; Schall et al., 2002) recordings in the countermanding task. The nearby SMA has also been implicated (Garavan et al., 2002; Scangos et al., 2013). A recent study using human intracerebral recording concludes that the SMA is the main locus of action monitoring because it shows responses during error trials before those of more rostral or pregenual ACC (pACC; Bonini et al., 2014). Further, responses in SMA were found without correlated responses in pACC, but not the opposite. This suggests a hierarchy within the medial frontal cortical monitoring network, where activity in the SMA precedes and influences activity in the pACC. However, we note that while pACC and postgenual ACC may have related functions, they are likely not the same. In general, naming conventions for the medial cortex surrounding the cingulate sulcus have not been consistent (for a review, see Procyk et al., 2014). 
For the purpose of this review, we will refer to postgenual ACC (sometimes dorsal ACC) as simply ACC and note pACC when applicable. It is likely that the division is not simple, and further investigation with more complex tasks will be necessary to more fully distinguish the roles of all these medial frontal cortical areas in error monitoring.
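The trial logic of the countermanding task can be sketched as follows. This is a hedged illustration of how outcomes might be scored, not the analysis used in any of the cited studies: the function names `score_trial` and `error_rate_by_delay` are hypothetical, the example data are invented solely to exercise the code, and treating a withheld movement on a no-stop trial as an error is an assumption.

```python
from collections import defaultdict


def score_trial(is_stop_trial, made_movement):
    """Score one countermanding trial as 'correct' or 'error'.

    Stop trial: withholding the movement is correct, moving is an error.
    No-stop trial: moving to the target is correct (withholding is
    treated as an error here, which is an assumption of this sketch).
    """
    if is_stop_trial:
        return "error" if made_movement else "correct"
    return "correct" if made_movement else "error"


def error_rate_by_delay(stop_trials):
    """Proportion of failed stops at each stop-signal delay.

    stop_trials: iterable of (stop_signal_delay_ms, made_movement) pairs,
    all taken from stop trials.
    """
    counts = defaultdict(lambda: [0, 0])  # delay -> [errors, total]
    for delay, made_movement in stop_trials:
        counts[delay][1] += 1
        if score_trial(True, made_movement) == "error":
            counts[delay][0] += 1
    return {delay: errors / total
            for delay, (errors, total) in sorted(counts.items())}


# Invented example data: (stop-signal delay in ms, movement made?).
example = [(100, False), (100, False), (200, True),
           (200, False), (300, True), (300, True)]
print(error_rate_by_delay(example))
```

Tabulating errors by stop-signal delay in this way reflects the property of the task that longer delays make the planned movement harder to abort.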

A concept related to error monitoring is conflict monitoring. Conflict monitoring allows for further engagement of cognitive control systems to resolve incompatibilities (e.g., respond to the color of the word "BLUE" when presented in a red font as in the Stroop task) as they arise, so that subjects can respond appropriately (Botvinick et al., 2001). In complex tasks, responses to conflict have been shown in the ACC of humans (Carter et al., 1999) and in the nearby pACC of monkeys (Amemori and Graybiel, 2012). Other studies have also suggested a more general role of ACC in outcome monitoring (for review, see Botvinick et al., 2004). A study varying the amount of conflict and the level of cognitive control/integration necessary for a response found that the ACC reliably responded to both conflict and subgoaling/integration demands (Badre and Wagner, 2004), again supporting a role of ACC beyond conflict monitoring. In this study, though it was not an explicit sequence, items were presented serially and knowledge of the serial order was required to make responses, further suggesting an evaluative role in sequences of tasks. ACC was also found to be one of the few areas that was activated at the initiation of many different kinds of cognitive tasks, and activation was sustained during task performance (Dosenbach et al., 2006). Therefore, in addition to the more specific/transient monitoring functions described in ACC, it is possible that the ACC also performs a more general monitoring function. However, it remains unclear how error or conflict monitoring processes function in true multistep tasks, as there is likely simultaneous monitoring of conflict with the higher-level goal governing the sequence and conflict within steps.

Clues as to how error and conflict monitoring processes may be carried out in sequences can be garnered from how those medial frontal cortical areas involved in monitoring (SMA, ACC, and SEF) respond during sequential tasks. Early studies using positron emission tomography (PET) imaging in humans showed that the SMA was activated for pre-learned sequences of saccades (Petit et al., 1996), and the ACC was associated with the acquisition of an implicit motor sequence (Grafton et al., 1998). The ACC did not show changes during sequence transfer or retrieval, suggesting that the ACC was critical for the rapid adaptation and monitoring necessary to detect and acquire a new sequence. The SMA and pre-SMA of monkeys have also been shown to respond to sequential movements in a large body of work (for review, see Tanji, 1994). Units recorded in the SMA have been shown to respond to the serial position in a sequence (Clower and Alexander, 1998; Isoda and Tanji, 2004) and the timing interval of sequential items (Shima and Tanji, 2000). Trial history also affects SEF activity during the countermanding task, which suggests that the SEF participates in planning of sequences in order to merge task history with task goals (Curtis et al., 2005). Neurons in the SEF and SMA respond to the serial order of items in a sequence (Berdyyeva and Olson, 2010) and units in the SEF can be selective to order within particular sequences (Lu et al., 2002). In an fMRI study in humans, triple-step saccades activated both the SEF, to trigger sequences, and, more generally, the ACC (Heide et al., 2001). These studies of motor sequences suggest that medial cortical monitoring areas may also participate in the supervision of motor sequences. However, these studies can only point at a parallelism by saying that the same areas shown to selectively respond to error or conflict also, during separate tasks, respond during sequences. 
Further research must be done before we can conclude that these areas code the presence of error or conflict truly simultaneously with the properties of sequences.

Few studies have directly examined error and conflict responses of medial frontal areas in the context of motor sequences. In an fMRI study in humans, participants performed a serial reaction time (RT) task with conflict produced by introducing responses that were out of sequence. The authors reported increased activation in conflict over no-conflict trials in ACC (Ursu et al., 2009). ACC was also activated during errors, supporting a role of ACC in the evaluation and monitoring of sequences. The activity of monitoring areas does not always appear to scale simply, and may indeed interact with the control of sequences. In a pair of studies that illustrate this point, monkeys were required to touch targets in one of six sequences that were discovered by trial and error (Procyk et al., 2000). Task-related neurons in the ACC coded the serial order of sequences, irrespective of kinematics. Some neurons preferred the search phase, when the monkey was actively trying to discover which of the six sequences to perform, while others preferred replication, when the monkey was repeating the discovered (correct) sequence. Subsequent work showed that this activity in the ACC was not just error monitoring, because the majority of the cells did not respond to error (Procyk and Joseph, 2001). These studies show that while monitoring regions can also encode elements of sequences, these coding properties are not necessarily simply additive and may interact to lead to novel representations that are not yet well understood.

The interaction between monitoring and sequential control can be more closely examined through causal manipulations. There is limited evidence in this domain, but those studies that do exist strongly suggest that these medial frontal areas do not just monitor sequences, but perhaps actively control them. In humans, sequences of memory-guided saccades were disrupted by lesions to the ACC (Gaymard et al., 1998). Lesions to SEF in humans also impaired the performance of memory-guided sequences of saccades (Gaymard et al., 1990, 1993; Heide et al., 1995). Microstimulation in the SEF of monkeys perturbed the order of saccades to two remembered locations, but did not seem to perturb the memory itself (Histed and Miller, 2006), and disrupted the ability of monkeys to select three targets in sequence (Berdyyeva and Olson, 2014). Similarly, another study in monkeys showed that the execution of motor sequences, but not individual movements, was disrupted by the inactivation of SMA (Shima and Tanji, 1998). All of these areas (ACC, SEF, and SMA) have also been shown to participate in monitoring, and the disruption of sequential performance when the functioning of these areas is perturbed again suggests that they play a supervisory control role in addition to monitoring sequences.

The studies discussed in the context of the functioning of monitoring brain areas thus far have used motor sequences. In a rare study of a sequence of three-item cued tasks (rather than motor sequences) followed by a long pause, the authors found significant activation in the ACC when a sequence of three tasks was initiated (Dreher and Berman, 2002). Consequently, the authors argued that the role of the ACC was not specifically about conflict, as the first item in the sequence would have no more or less conflict than the last item in the sequence, but more related to general alerting. Though the authors did not explicitly test for sustained dynamics in their study, the activation observed at the start of sequences could also reflect the heightened activity at the start of an epoch that required monitoring (Dosenbach et al., 2006). These studies suggest that monitoring areas may participate in the supervisory control of sequences of tasks along with motor sequences, but further research will need to be done to discern the exact nature of the involvement.

Perhaps the most suggestive evidence we have thus far that medial frontal cortical areas are involved in not only monitoring, but also sequencing, comes from a study that explicitly examined monitoring of an abstract (non-motor) sequence. In this study, human participants monitored serially presented letters for the presence or absence of a particular sequence or sequences of letters (Farooqui et al., 2012). Many areas in the fronto-parietal network, such as rostrolateral prefrontal cortex (RLPFC), ACC, and pre-SMA, showed greater activation for the detection of a sub- or end-goal target than for intervening targets. Though the study did not explicitly report the significance of activity in those regions at each step in the sequence, plots of the activation in those regions of interest (ROIs) suggest that some, if not all, could also have significant activation levels at all steps in the sequence. Preliminary data from one other study show that neurons recorded in the PFC and hippocampus of rats respond during a sequence monitoring task (Quirk et al., 2014). This task requires participants to monitor a pre-learned sequence of either odors, in rats, or images, in humans, for an item that is out of sequence (Allen et al., 2014). Rats and humans showed similar behavior, suggesting that perhaps similar neural control mechanisms might be at work.

We have discussed three brain areas that are often associated with monitoring functions in the medial frontal cortex: ACC, SMA, and SEF. Activity in these three monitoring areas has been related to errors and conflict, but little is known about their direct involvement in the control of actions. Recent work in the ACC localizing feedback-related activation to individual participants' specific motor map morphology in the same region may provide inspiration for future research on this topic (Amiez et al., 2013; Amiez and Petrides, 2014). Many studies suggest that these areas may function to exert control over sequences, as responses to the ordering of stimuli and disruption of sequential performance are common findings with medial cortical recordings and manipulation. Together these studies suggest that the ACC, SMA and SEF are ideally situated to contribute to a supervisory role in task sequences, but very few studies have brought together investigation about monitoring and sequences, particularly on a more abstract task level. Though it is tempting to say that the monitoring and sequential control functions of medial frontal cortex are simply additive, it is most likely that there is an interaction between these functions and that each area has its own unique contribution to the process. Few studies have examined where these systems intersect and a small number have begun to distinguish the processing of the three areas discussed. Future work will be necessary to examine supervisory control and monitoring functions in the context of sequences of tasks.

### ATTENTION: THE INTERPLAY BETWEEN TASK RULES AND SALIENCE

The study of attention includes a vast body of literature; our interest here is to discuss the role of attention in the execution of sequential tasks. Specifically, the abstract "rules" that govern task performance must interact with lower-level task features, such as stimulus salience. Neurophysiology and imaging studies have demonstrated that attention correlates can be observed in many areas of cortex. Thus it seems more fruitful to consider shifts of attention in the context of circuits. Recent work suggests that shifting between cortical and subcortical circuits, interarea synchrony and oscillations play a major role in control of attention (for a thorough review of oscillations and attention, see Baluch and Itti, 2011; Miller and Buschman, 2013). Each of these mechanisms has its own time course and the potential to uniquely contribute to the proper execution of sequential tasks.

Attention is generally characterized as having two distinct directional influences: top-down modulation (under supervisory control) and bottom-up modulation (which can activate schematic control). The features of a visual stimulus (e.g., brightness, contrast, color) can encourage orienting to that stimulus based on salience. For example, a background with very bright distractors can increase the time it takes to find an object of interest because the features of the distractors overwhelm the features of the target. During a sequential task, such bottom-up attentional drive could either be distracting (e.g., supporting the completion of steps in the wrong order) or enhance task performance by reinforcing sequence completion (e.g., decreasing possible options during the course of a task). Successful completion of a sequence of tasks relies on a continuous balance between the information channeled in from sensory cortices and top-down information about higher-level goals.

Higher-level goals used by the supervisory control system are thought to be implemented by frontal cortical areas. The goals are thought to be represented by sustained activity in PFC during a task, and parietal areas might be the intersection of supervisory and schematic control systems (Asaad et al., 1998; Gill et al., 2000; Duncan, 2001; Wallis et al., 2001; Badre and Wagner, 2005; Sakai and Passingham, 2006). Frontal cortical neurons exhibit shorter latency than parietal areas carrying attention-related signals (Buschman and Miller, 2007; Li et al., 2010), and microstimulation of the frontal eye fields can produce top-down modulation of area V4 in the ventral visual pathway (Moore and Armstrong, 2003), which demonstrates a mechanism for top-down attentional control. Inter-area coupling, including that between FEF and V4, and PFC and V4, has been shown to correlate with performance on visual attention tasks (Gregoriou et al., 2009, 2014). In addition, human neuroimaging has shown that superior parietal regions are involved in controlling shifts of attention, and has supported the idea that such areas serve as an intersectional point between top-down and bottom-up attentional processes (Thakral and Slotnick, 2009; Greenberg et al., 2010). However, the mechanism and site of interaction between attentional systems is still actively debated, as there are multiple sites where sequential and schematic control systems interact.

In the case of task sequences, top-down information can change from one step in the sequence to the next based on the current position within the sequence. The task goals might require orientation towards one feature of the stimulus during one phase, and a different feature in the next phase. The ability to change the focus of attention appropriately can be affected by factors such as memory and trial timing. For example, memory can serve as an override of saliency. Memory-guided saccade sequences are less susceptible to distractors than cued saccade sequences (Gersch et al., 2009). Likewise, long-term memory can increase the sensitivity to the presence of a stimulus in particular spatial locations during visual search (Stokes et al., 2012). Evidence also indicates that the rhythm of trial presentation is tracked in multiple areas, including fronto-cortical areas and auditory cortex (Cutanda et al., 2015; Konoike et al., 2015). These studies suggest that top-down attention is not a static process, but can adapt to the moment-to-moment changes in task demands while maintaining the over-arching goal.

Paradigms that involve task switching and different attentional networks have clarified the interaction of types of information (e.g., rules and bottom-up priming) and the roles of prefrontal and parietal areas in attentional shifts. One study decoupled top-down and bottom-up effects by asking people to maintain two separate mental counts, each associated with particular stimuli (Gehring et al., 2003). On each trial, participants either updated the same count as the previous trial (no-switch trial) or a different count (switch trial). No-switch trials facilitated faster RTs and shorter latency event-related potentials in frontal cortex, and this effect was exaggerated when the stimulus was also repeated. When the top-down (rule for which count to update) and bottom-up (stimulus viewed) components of the tasks aligned, attentional processes worked in synchrony. Another study directly investigated the effect of a PFC lesion on an attention task and found a behavioral deficit when the cue shifted rapidly across trials (Rossi et al., 2009). However, behavior was close to normal when the cue was constant across many trials and during a pop-out task with changing targets, which did not rely on top-down control. This suggests that the attentional systems can operate individually in certain tasks, although this independence may not hold for all paradigms. Ruthruff et al. (2001) proposed that task expectancy, defined as a top-down feature, affects the time to program an upcoming response, while task-recency, defined as a bottom-up attentional feature, affects the time to execute the response. They proposed that the two attentional systems jointly produce task readiness. This remains to be validated with neurophysiological evidence, but provides a testable hypothesis for the function of attentional systems in response preparation. 
Together, these studies suggest that both top-down and bottom-up attentional systems may contribute to the execution of sequential tasks, but direct evidence of the relative contributions of each attentional system through time in sequential tasks has not yet been demonstrated. It is likely that cognitive control mechanisms mediate the attentional systems described above, and thus we focus on this topic next.

### FLEXIBLE ADAPTATION FOR GOAL-DIRECTED SEQUENCES

The elements of sequential control that we have discussed thus far (sequential movements, monitoring, and attention) must ultimately be brought together to accomplish a sequence of tasks. Cognitive control is the ability to flexibly adapt behavior and select actions based on goals. This ability becomes particularly important when completing a sequence of tasks, as not only must the correct actions be selected, but they must be selected in an appropriate order, all the while maintaining the overall goal.

The PFC has been shown to be critical for these cognitive control functions and to support the "rules" that govern goal-directed behavior in humans (Passingham and Rowe, 2002; for review, see Miller and Cohen, 2001) and in non-human primates (Wallis et al., 2001; Roy et al., 2010; Buschman et al., 2012; Rigotti et al., 2013; for review, see Fuster, 1993). The cognitive control of task sequences can be thought of as hierarchical in that multiple subgoals are created in the service of an overarching goal through time. Studies of non-sequential hierarchical control in humans have illustrated a caudal to rostral progression in the response of areas to progressively more abstract levels of the hierarchy (Koechlin et al., 2003; Badre and D'Esposito, 2007; Badre et al., 2009), which may be "gated" by the striatum (Badre and Frank, 2012). These studies suggest that the same frontal cortical areas may function similarly when the hierarchy is created by a sequence, rather than a static rule structure. In monkeys, neurons in the PFC were also found to be selective to the memory of a particular sequence of items (Warden and Miller, 2010), suggesting that these monitoring and cognitive control functions of the PFC extend into the sequential realm.

A more anterior region, RLPFC, has also been implicated in settings that have elements in common with sequential hierarchical control, including: tracking and performing operations on items presented serially (Braver and Bongiolatti, 2002; Badre and Wagner, 2004; De Pisapia et al., 2012; Nee et al., 2013); performing multiple tasks simultaneously (Gilbert et al., 2006; Dreher et al., 2008); exploring, tracking and updating reward contingencies (Daw et al., 2006; Kovach et al., 2012); the highest level of a contingent rule structure (Badre and D'Esposito, 2007); and task switching (DiGirolamo et al., 2001; Kim et al., 2012). Many of these functions share aspects of monitoring superordinate goals to provide a top-down superordinate signal over the course of several trials (Braver and Bongiolatti, 2002; Badre and Wagner, 2004; Dreher et al., 2008; De Pisapia et al., 2012; Nee et al., 2013). Complementary findings have shown the time course of RLPFC activity to be sustained over many individual actions or choices (Koechlin et al., 1999, 2003; Braver et al., 2003). There are relatively few studies of RLPFC in animals because rodents do not have cortex that is homologous to RLPFC (Preuss, 1995) and techniques have been developed only recently to record from these areas in the non-human primate (Mitz et al., 2009). Existing work has implicated RLPFC in monkeys in feedback during set shifting (Tsujimoto et al., 2010, 2012), learning the value of behaviors (Boschin et al., 2015), and shifting attention (Caspari et al., 2015). These studies suggest that the RLPFC may function similarly in the monkey and in the human, but none of these studies in monkeys or humans explicitly examine the functioning of RLPFC during sequential task control.

Many paradigms have been used to examine the flexible capabilities of frontal cortex in cognitive control. We will briefly highlight two, task switching and the Wisconsin Card Sorting Test (WCST), because of the adaptability of these paradigms to examine sequential task control. Both tasks already begin to query elements necessary for sequential control because it is only in the context of the previous task that the current task is a switch in task or "rule". In task switching, the increased time that it takes to go from one task to another is used as a marker for the engagement of cognitive control mechanisms in both humans (Rogers and Monsell, 1995; Ruthruff et al., 2001) and monkeys (Stoet and Snyder, 2003a,b, 2009; Caselli and Chelazzi, 2011). Task switching studies using fMRI in humans have revealed the activation of a wide array of areas in the frontal-parietal network such as RLPFC, PFC, and medial frontal cortex (Dove et al., 2000; Sohn et al., 2000; DiGirolamo et al., 2001; Braver et al., 2003; Kim et al., 2012; Schuck et al., 2015; for review, see Ruge et al., 2013). In monkeys, neurons that respond to particular strategies or the shift in strategies have been recorded in the PFC (Genovesio et al., 2005, 2008) and RLPFC (Tsujimoto et al., 2010, 2012). While none of these studies explicitly studied sequences of tasks, the areas that were found to be activated in task switching are also often implicated in monitoring, attention, and cognitive control, suggesting that the combination of these elements necessary for the execution of task sequences may have a neural substrate in one or more of these brain regions.

The WCST requires shifting rules or strategies where the switches are learned by trial and error and are not signaled or predictable, and thus requires tracking the rules through time. Participants must "sort" the cards according to one dimension of the stimuli presented, such as color or shape. In adaptations of this task, the equivalent is deciding which dimension is currently relevant to match to the current stimulus (**Figure 2C**). This paradigm is different than instructed task switching, but seems to engage many of the same regions. Human lesion and imaging studies have shown the involvement of PFC in shifting or feedback (Milner, 1963; Berman et al., 1995; Nagahama et al., 1996; Monchi et al., 2001; Nakahara et al., 2002) along with ACC during error trials (Lie et al., 2006). Monkeys can learn analogs of the WCST (Mansouri and Tanaka, 2003; Moore et al., 2005). As in humans, studies in monkeys have shown the PFC is involved in maintaining the current rule and monitoring performance (Mansouri et al., 2006; Buckley et al., 2009; Moore et al., 2009), the RLPFC is involved in adapting performance according to the history of conflict (Mansouri et al., 2015), and the ACC is implicated in evaluating performance (Buckley et al., 2009; Moore et al., 2009; Kuwabara et al., 2014). It is often assumed that there is functional homology between brain areas involved in performing similar tasks in monkeys and humans. In a rare study directly comparing activations found in fMRI of monkeys and humans performing the WCST, set shifting activity was localized to the PFC of both species (Nakahara et al., 2002). These findings are important because, given the limited scope of recording electrodes, there is no guarantee that they will capture the activity of those neurons most active/important for the task. Together these studies of task and set shifting implicate areas in the frontal lobes that are commonly associated with more general cognitive control. 
Understanding how exactly each of these areas is involved when any switch of set or task is executed within a sequence will require studying task sequences directly.

There are many unique demands when executing tasks sequentially, as evidenced by the fact that patients with frontal lobe damage are often unable to perform everyday task sequences on their own, despite the ability to perform normally on other tests of executive function (Eslinger and Damasio, 1985; Shallice and Burgess, 1991). For example, one patient was unable to perform complex sequences required for daily living, yet excelled at the WCST. The patient could complete tasks towards specified goals only when the tasks and goals were repeatedly presented externally. The patient also seemed unable to trigger the automatic programs necessary for self-care (e.g., feeding). However, he was capable of initiating single movements and did not have any kind of movement deficit. This then again highlights several components of task sequences that are not captured by classic tests of executive function. Task sequences require flexible allocation of resources and time to complete multiple sequential goals, and are often unguided by external cues. Therefore, successful completion of a task sequence requires organization, internal monitoring, and the interaction between neural circuits involved in schematic and supervised control. To study these elements that are unique to task sequences, it is then important to push an experimental paradigm beyond classic tests of executive function. With the large body of literature supporting task switching effects under many conditions, switching tasks in sequences is an ideal paradigm to study this kind of sequential control. When sequences of tasks are performed in everyday life, the behavior most closely resembles hierarchical task switching, as we maintain an overarching goal while accomplishing, and switching between, many subtasks. 
It also has been shown that switch costs are robust to how much preparation a participant has to switch tasks, even when which task they are to complete next is completely memory guided (Sohn and Carlson, 2000).

Behavioral evidence for the hierarchical control of task sequences came from a study that asked participants to perform simple stimulus categorization tasks according to a remembered sequence (e.g., color, shape, shape, color; Schneider and Logan, 2006). They showed increased RT costs at the first item in the sequence, over and above costs of task switching alone. This provided evidence for the hierarchical control of task sequences because, in the absence of the execution of a sequence, first-position RTs would not be elevated.
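The logic of separating a first-position cost from an ordinary switch cost can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`label_trials`, `sequence_costs`), the comparison scheme, and the RT values are hypothetical and are not taken from Schneider and Logan (2006).

```python
from statistics import mean


def label_trials(sequence, n_trials):
    """Label each trial with its task, sequence position, and switch status."""
    trials = []
    prev_task = None
    for i in range(n_trials):
        pos = i % len(sequence)
        task = sequence[pos]
        trials.append({
            "task": task,
            "position": pos + 1,  # 1-based position within the sequence
            "switch": prev_task is not None and task != prev_task,
        })
        prev_task = task
    return trials


def sequence_costs(trials, rts):
    """Return (switch_cost, initiation_cost) in the units of rts.

    Assumes a fixed, repeating sequence, so that all first-position
    trials (after the very first trial) share the same switch status.
    """
    rows = list(zip(trials, rts))[1:]  # drop trial 0: it has no previous task
    within = [(t, rt) for t, rt in rows if t["position"] > 1]
    first = [(t, rt) for t, rt in rows if t["position"] == 1]

    switch_rt = mean(rt for t, rt in within if t["switch"])
    repeat_rt = mean(rt for t, rt in within if not t["switch"])
    switch_cost = switch_rt - repeat_rt

    # Compare first-position trials with within-sequence trials that have
    # the same switch status, so the difference is not a switch effect.
    baseline = switch_rt if first[0][0]["switch"] else repeat_rt
    initiation_cost = mean(rt for _, rt in first) - baseline
    return switch_cost, initiation_cost


# Hypothetical RTs (ms) for two repetitions of color-shape-shape-color.
trials = label_trials(["color", "shape", "shape", "color"], n_trials=8)
rts = [900, 750, 650, 750, 900, 750, 650, 750]
switch_cost, initiation_cost = sequence_costs(trials, rts)
print(switch_cost, initiation_cost)
```

In this made-up example the first position of each repetition is a task repeat (color follows color), so any RT elevation there cannot be attributed to switching between tasks; it is instead attributed to initiating the sequence.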

Despite their ubiquity in everyday life, we know little about how the brain controls task sequences (Koechlin et al., 2000; Koechlin and Jubault, 2006; Farooqui et al., 2012; Desrochers et al., 2015b). In Farooqui et al. (2012), participants monitored a stream of individual letters for targets from pre-specified sequences of different lengths. The primary result was that a broad network of frontal and parietal areas, including RLPFC, PFC, ACC and pre-SMA showed increased activation at the sequence termination. This provides evidence for these areas participating in the monitoring of abstract sequences, but the task did not require selecting a new task depending on sequence position (local task switching). Rather, the task level change was always at the sequence boundary. Therefore, the question of how these areas participate in the performance of task sequences is left open.

Another study of sequential control in humans asked participants to perform a sequence of choice RT tasks vs. a simple motor sequence during fMRI (Koechlin and Jubault, 2006). In this study, the task sequence was performed only once, and the initiation and termination were cued externally. They found phasic activation at the initiation and termination of the entire sequence of tasks in the inferior frontal gyrus (BA 45), and activation related to the initiation and termination of motor sequences more posteriorly in inferior frontal gyrus (BA 44) and in the pre-SMA. This suggests a separation of those areas engaged in the performance of task sequences from those involved in motor sequences that appears to support the notion that more abstract constructs are represented more anteriorly in the brain.

Based on previous behavioral work on hierarchical task sequences (Schneider and Logan, 2006), a recent fMRI and transcranial magnetic stimulation (TMS) study asked human participants to perform remembered sequences of tasks while undergoing fMRI scanning or TMS (Desrochers et al., 2015b). This study captured aspects of sequential task behavior that previous studies did not: participants had to both perform a task at each position in the sequence, and the initiation and termination of each sequence was internally monitored. The tasks were to make color and shape judgments of simple stimuli (**Figure 3A**). On each block of trials, participants were instructed to perform the tasks in a 4-item sequence, e.g., color-color-shape-shape, and they repeated this sequence, without external cues regarding the position in the sequence, for the duration of the block (**Figure 3B**).

#### FIGURE 3 | Continued

peak event-related response (at 6 s) of the voxels included the RLPFC ROI. (D) Voxelwise contrast of the Parametric Ramp regressors over baseline (extent threshold 172 voxels, note lateral views rotated ~50°). Outline of the RLPFC, pre-PMd, and SMA/pre-SMA ROIs used in the study in black. (E) Mean difference in ER (±SEM) due to stimulation at peak SOA for RLPFC and for pre-PMd in TMS1. ER differences shown over the course of sequences: beginning (Position 1), middle (Positions 2 and 3), and end (Position 4). Asterisk indicates significant difference in the effect of stimulation at Position 4 (F(1,32) = 6.7, P < 0.01). (F) Same as (E) but for TMS2. Asterisk at Position 1 indicates a reliable difference between RLPFC and pre-PMd (F(1,28) = 6.2, P < 0.02). At Position 4, tilde indicates a marginal difference between RLPFC and pre-PMd (F(1,28) = 2.9, P < 0.1), and asterisk indicates a reliable difference between RLPFC and rostromedial prefrontal cortex (RMPFC; F(1,14) = 4.4, P < 0.05).

In this context, the authors found that in the frontal cortex, the RLPFC, PFC, pre-SMA, and medial frontal cortex showed activity that gradually increased through the four-item sequence of tasks, and then reset at each new beginning (**Figures 3C,D**). Other areas in the frontal cortex, such as pre-dorsal premotor cortex (pre-PMd), did show responses to other elements of this sequential task, but did not show the ramping pattern of activation and thus dissociated from RLPFC. Given the extent to which RLPFC has been implicated in supervisory control functions, the authors then sought to determine if the ramping pattern of activation found in the RLPFC was indeed necessary for sequential task control and what the function of this activity might be. As a causal manipulation, TMS was applied during the same sequential task. The authors showed, in two separate experiments, that the RLPFC and associated network was necessary for the supervisory control of task sequences because single-pulse TMS caused an increase in the number of errors induced as the sequence progressed (**Figures 3E,F**). These effects mirrored the ramping pattern observed in fMRI (**Figure 3C**). The effects in RLPFC also dissociated from the effects of stimulation in the pre-PMd and a second control region, the rostromedial prefrontal cortex (RMPFC). These results suggest that the RLPFC is a key node in the supervisory control network for task sequences, and that its involvement is increasingly necessary as sequences progress and uncertainty may build up as to the current position within the sequence (**Figure 1C**). Previous studies of sequential control did not report this kind of ramping dynamic within the sequence (Koechlin and Jubault, 2006; Farooqui et al., 2012), suggesting that it is under these more naturalistic conditions, where participants must remember and monitor the sequence of tasks to be performed without external cues, that these novel dynamics are revealed.
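A ramp that climbs across sequence positions and resets at each new sequence can be expressed as a simple parametric modulator. The sketch below is an assumption-laden illustration, not the analysis pipeline of Desrochers et al. (2015b): the function names and the response amplitudes are hypothetical, and in a typical fMRI analysis such a modulator would additionally be combined with event onsets and convolved with a hemodynamic response function, which is omitted here.

```python
def ramp_modulator(n_positions, n_sequences):
    """Mean-centered linear ramp that resets at the start of every sequence."""
    center = (n_positions + 1) / 2
    one_sequence = [pos - center for pos in range(1, n_positions + 1)]
    return one_sequence * n_sequences


def ramp_slope(responses, modulator):
    """Least-squares slope of responses on the modulator (with intercept)."""
    mean_resp = sum(responses) / len(responses)
    mean_mod = sum(modulator) / len(modulator)
    num = sum((m - mean_mod) * (r - mean_resp)
              for m, r in zip(modulator, responses))
    den = sum((m - mean_mod) ** 2 for m in modulator)
    return num / den


# Two repetitions of a four-item sequence.
modulator = ramp_modulator(n_positions=4, n_sequences=2)

# Hypothetical per-trial response amplitudes (arbitrary units).
ramping = [0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4]  # climbs, then resets
flat = [0.25] * 8                                    # no position effect

print(ramp_slope(ramping, modulator))
print(ramp_slope(flat, modulator))
```

A region whose responses follow the ramp yields a positive slope, whereas a region that responds equally at every position yields a slope of zero. Mean-centering the modulator keeps the ramp term separate from the average response to each trial.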

These few studies only scratch the surface of understanding sequential task control. Many questions remain as to the relative contributions of each area, how all the areas implicated in sequential control interact, and the underlying cellular mechanisms. It is in this realm that studies of nonhuman primates can be particularly informative; however, studies of sequential task control in these animals are even more rare than they are in the human. The neural mechanism underlying the ramping dynamics observed in humans may resemble the neural activity profiles that have been found in action sequences. Many regions such as the DLPFC, supplementary motor area (SMA), pre-SMA, and SEF have neurons that show selectivity to the serial position in action sequences (Niki and Watanabe, 1979; Clower and Alexander, 1998; Averbeck et al., 2003; Ryou and Wilson, 2004; Mushiake et al., 2006; Averbeck and Lee, 2007; Berdyyeva and Olson, 2010). Although these neural responses tend to be phasic at one position, some neurons may code positions later in the sequence with larger responses than earlier ones, thus producing the appearance of a ramp across the population (Averbeck et al., 2003; Berdyyeva and Olson, 2010). Examples of individual neurons that show ramping dynamics have also been found in the ACC and PFC (Niki and Watanabe, 1979). These cortical dynamics may interact with neuromodulatory mechanisms in the striatum, as the dopamine content of the striatum has been shown to ramp up as rats progress towards a goal (Howe et al., 2013). It has been suggested that two systems exist in parallel that use more spatial task-based coordinates or motor coordinates for sequential control, both containing loops through the basal ganglia and frontal cortex (Hikosaka et al., 1999) and that the neural constituents of sequential monitoring may be hierarchically organized themselves (Sigala et al., 2008). 
Further study of task sequences specifically will be necessary to illuminate these hypotheses.

In order to bridge the investigation of the neural basis of sequential task control between monkeys and humans, it is crucial to develop sequential task paradigms that can be performed by both species. It is not sufficient to assume similar tasks will be controlled by similar underlying neural mechanisms, and there are likely several levels of interactions between the neural responses in relatively simple tasks, and task sequences. As an illustrative example, in a rare study of monkeys performing sequences of tasks separated by long intervals, standard task switch effects were not observed (Avdagic et al., 2014). Techniques such as the use of fMRI in monkeys will also be key to establish functional homology between monkeys and humans, as it will allow the direct comparison of the activations present in each species (when used with the same task).

We provide here an example of a task that could be used to study sequential task control in monkeys and in humans. The paradigm merges the key aspects of the well-studied push-pull-turn, countermanding, and WCST tasks into a sequential control task. In this task, participants would be asked to match a central sample stimulus to one of three choice stimuli, according to their color or shape (**Figure 4A**). During initial training, an image displayed above the central sample stimulus would serve as a cue for the shape or color rule. In each block of trials, participants would repeat a short sequence of cued judgments (e.g., color, shape, shape; **Figure 4C**). After significant training, subjects would begin by performing a three-task sequence, and after completion of five cued sequences, would continue performing the same sequence, but without cues (**Figure 4B**) in order to complete a block of trials (**Figure 4D**). The design builds on the push-pull-turn task where a sequence of arm movements is first instructed, and then executed from memory (**Figure 2A**; Shima and Tanji, 2000). The important distinction between the push-pull-turn task and the current sequential task is that the sequence is not composed of individual movements, but of the tasks to be completed (e.g., shape, shape, color), and is completely dissociated from any motor sequence. This sequential task also builds on the capability of monkeys to flexibly adjust the current task or rule, and choose the appropriate stimulus dimensions on which to base a decision as in the WCST (**Figure 2C**; Nakahara et al., 2002). Together, these elements combine the monitoring, attention, and cognitive control requirements of task sequences into a paradigm that both monkeys and humans can perform, paving the way for future work.

# CONCLUSION

Despite the relative ease with which we complete sequences of tasks in our daily lives, they are incredibly complex and require the proper functioning of many systems in concert for their successful completion. We have discussed in this review the work in motor sequences that has provided a foundation for task sequences, and some of the major components of task sequences: monitoring, attention, and cognitive control. Very few individual studies or task paradigms bring together all of these components to study task sequences as a whole. Though often times it may be assumed that these neural systems may work similarly under sequential conditions as under nonsequential conditions, it is critical to test these assumptions to develop a direct understanding of task sequences themselves. This understanding is then important to address gaps in our understanding of how disorders that involve sequences such as Parkinson's disease, Huntington's disease, obsessive compulsive disorder, and perhaps even attention deficit disorder occur and may be treated. Simultaneously, an understanding of task sequences is important for aiding the large numbers of patients that have some form of frontal dysfunction and are unable to live independently.

We have proposed a framework that separates the control of sequences into schematic and supervisory control. The schematic controller selects sequences of movements that are more procedural and can be executed as a single unit. For example, many muscle activation sequences do not require specific attention in order to execute. While habitual motor sequences can also be executed as a unit and may be selected by the basal ganglia as part of the schematic control network, evidence suggests that the basal ganglia also take on more of a supervisory role for habitual actions, and play a central role in the formation and evaluation of these sequences. The supervisory control system is responsible for monitoring, handling any exceptions that arise, and keeping track of a higher-level goal. We have provided evidence that medial cortical areas implicated in monitoring functions may perform similar functions in sequences of tasks in the service of the supervisory controller. Attention harnesses the schematic controller in the form of bottom-up primary sensory mechanisms that are executed without conscious regulation. Top-down attentional mechanisms are at work when frontal cortical brain areas bias the activity of downstream regions to accomplish a particular goal under supervisory control.

When one has to flexibly pursue goals that may change through time, as in task sequences, the role of flexible supervisory control becomes more pronounced. Generally these flexible control functions have been assigned to rostral frontal cortical areas in non-sequential tasks where the maintenance or flexible switching among abstract rules for action is necessary. Studies of sequential motor tasks have similarly suggested that these regions track the variables needed to monitor and control both the elements of sequences and the sequences as a whole. However, few studies have examined the most abstract level of the supervisory controller: the control of sequences of tasks.

The few studies that have examined sequential task control start to give evidence of how monitoring, attention, and cognitive control may come together, but in novel ways. For example, it had been previously established that the RLPFC was selectively involved in the highest level of abstraction when completing complex tasks and could be activated in a sustained manner; however, the ramping dynamics observed through the steps in a sequence of tasks had not been observed prior to participants actually being asked to complete such a task sequence (Desrochers et al., 2015b). Thus, while the areas in the frontal cortex and striatum may all play their respective roles that are not dramatically different in sequential tasks from the functions they are associated with in non-sequential tasks or motor sequences, the dynamics of their functioning and how they connect with other areas during sequential tasks is largely uncharted territory. Further study is necessary to directly observe and manipulate the neural circuitry in sequential tasks, and tasks such as the one we have proposed that are capable of being performed by both monkeys and humans will provide a crucial bridge in understanding between the mechanisms and the actions of people.

#### AUTHOR CONTRIBUTIONS

All authors developed the topic for the manuscript. TMD and DCB completed the initial draft of the manuscript. All authors edited the manuscript.

#### ACKNOWLEDGMENTS

The authors would like to thank John Ghenne for his contribution to this work. We also thank members of the DLS and DB Labs for helpful discussions. Research reported in this publication was supported by NINDS of the NIH (R01NS065046, DB; F32NS080593, TMD), the NIMH of the NIH (T32MH019118, TMD), the NEI of the NIH (R01EY014681, DLS), and the Brown Institute for Brain Science.


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Desrochers, Burk, Badre and Sheinberg. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Effect of Disruption of Prefrontal Cortical Function with Transcranial Magnetic Stimulation on Visual Working Memory

Elizabeth S. Lorenc<sup>1</sup>\*, Taraz G. Lee<sup>2</sup>, Anthony J.-W. Chen<sup>1,3,4</sup> and Mark D'Esposito<sup>1,3,5</sup>

<sup>1</sup> Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA, <sup>2</sup> Department of Psychology, University of Michigan, Ann Arbor, MI, USA, <sup>3</sup> Department of Neurology, VA Northern California Healthcare System, Martinez, CA, USA, <sup>4</sup> Department of Neurology, University of California, San Francisco, San Francisco, CA, USA, <sup>5</sup> Department of Psychology, University of California, Berkeley, Berkeley, CA, USA

It is proposed that feedback signals from the prefrontal cortex (PFC) to extrastriate cortex are essential for goal-directed processing, maintenance, and selection of information in visual working memory (VWM). In a previous study, we found that disruption of PFC function with transcranial magnetic stimulation (TMS) in healthy individuals impaired behavioral performance on a face/scene matching task and decreased category-specific tuning in extrastriate cortex as measured with functional magnetic resonance imaging (fMRI). In this study, we investigated the effect of disruption of left inferior frontal gyrus (IFG) function on the fidelity of neural representations of two distinct information codes: (1) the stimulus category and (2) the goal-relevance of viewed stimuli. During fMRI scanning, subjects were presented face and scene images in pseudo-random order and instructed to remember either faces or scenes. Within both anatomical and functional regions of interest (ROIs), a multi-voxel pattern classifier was used to quantitatively assess the fidelity of activity patterns representing stimulus category: whether a face or a scene was presented on each trial, and goal relevance, whether the presented image was task relevant (i.e., a face is relevant in a "Remember Faces" block, but irrelevant in a "Remember Scenes" block). We found a reduction in the fidelity of the stimulus category code in visual cortex after left IFG disruption, providing causal evidence that lateral PFC modulates object category codes in visual cortex during VWM. In addition, we found that IFG disruption caused a reduction in the fidelity of the goal relevance code in a distributed set of brain regions. These results suggest that the IFG is involved in determining the task-relevance of visual input and communicating that information to a network of regions involved in further processing during VWM. 
Finally, we found that participants who exhibited greater fidelity of the goal relevance code in the non-disrupted right IFG after TMS performed the task with the highest accuracy.

Keywords: visual working memory, functional magnetic resonance imaging, transcranial magnetic stimulation, executive function, selective attention, prefrontal cortex

#### Edited by:

Natasha Sigala, University of Sussex, UK

#### Reviewed by:

Christos Constantinidis, Wake Forest University, USA John Duncan, Medical Research Council, UK

\*Correspondence: Elizabeth S. Lorenc elizabeth.lorenc@berkeley.edu

Received: 17 September 2015 Accepted: 23 November 2015 Published: 16 December 2015

#### Citation:

Lorenc ES, Lee TG, Chen AJ-W and D'Esposito M (2015) The Effect of Disruption of Prefrontal Cortical Function with Transcranial Magnetic Stimulation on Visual Working Memory. Front. Syst. Neurosci. 9:169. doi: 10.3389/fnsys.2015.00169

# INTRODUCTION

Since the human brain has an inherently limited capacity for information processing and working memory (Cowan et al., 2005), it is crucial that relevant information in the environment be filtered from the myriad of visual details that are unimportant, and often detrimental, to the task at hand (Vogel et al., 2005). It is proposed that biased competition among representations of features in the visual field is resolved via both top-down and bottom-up signals, with the top-down influence likely guided by an "attentional template" maintained in working memory (Desimone and Duncan, 1995; Desimone, 1998). There is increasing evidence that the prefrontal cortex (PFC) is one source of these top-down signals, which are essential for the privileged processing and maintenance of goal-relevant visual information within extrastriate cortex (Miller and D'Esposito, 2005; Bressler et al., 2008; Sreenivasan et al., 2014b; D'Esposito and Postle, 2015). Consistent with this view, we recently demonstrated that selective attention alters the tuning of stimulus category representations in extrastriate cortex, while the lateral PFC codes for the current task goal (i.e., "Remember Faces, Ignore Scenes"; Chen et al., 2012).

Successful filtering of relevant visual information is essential for the prioritized storage of that information in working memory for later use, and information in working memory can further guide selective attention. Evidence for top-down modulatory processes shaping neural activity has been found throughout different stages of working memory (Gazzaley and Nobre, 2012): stimulus anticipation (e.g., Bressler et al., 2008; Puri et al., 2009; Esterman and Yantis, 2010), sensory processing and gating of information to be encoded into working memory (e.g., Gazzaley, 2011; Kok et al., 2012), prioritization and manipulation of memory representations (e.g., Nee and Jonides, 2009; Tamber-Rosenau et al., 2011; Kuo et al., 2014), and memory retrieval (Nobre et al., 2008).

Lesion studies provide evidence for the role of frontal cortex as one source of top-down signals that can modulate processing in sensory regions during working memory. Fuster et al. (1985) were the first to investigate the effect of PFC cooling on spiking activity in inferotemporal (ITC) neurons during a delayed-match-to-sample task. During the delay period—when persistent stimulus-specific ITC activity is observed—cooling caused attenuated spiking profiles and a loss of stimulus-specificity in ITC neurons. In humans, Barceló et al. (2000) found that lateral PFC lesions caused reduced extrastriate activity in the lesioned hemisphere and correspondingly lateralized behavioral deficits. In addition, Sauseng et al. (2011) found that TMS disruption of right frontal eye field function in healthy participants impaired the shifting of visuospatial attention, and yielded corresponding changes in electrocorticographic measures of neural dynamics. Finally, we previously demonstrated that TMS disruption of lateral PFC function impaired performance on a face/scene matching task, while reducing category-specific tuning in extrastriate cortex (Lee and D'Esposito, 2012). These results provide important causal evidence for the role of the PFC in shaping the tuning of information processed in extrastriate cortex, and provide insight into the dynamic nature of top-down modulation of visual areas by the PFC in accordance with task goals.

The present study uses a set of multi-voxel pattern classification analyses to further investigate the effects of PFC disruption on the neural representation of stimulus category and goal-relevance information codes. Immediately after continuous theta-burst TMS to the left inferior frontal gyrus (IFG) or a control region (left somatosensory cortex), participants underwent MRI scanning while performing a face/scene matching task, in which the relevant stimulus category (faces or scenes) varied by block. With this approach, we investigated the effect of frontal cortex disruption on the fidelity, as indexed by decoding accuracy, of two distinct types of visual working memory (VWM) representations: (1) stimulus category: whether a face or a scene was presented on each trial and (2) goal relevance, whether the presented image was task relevant (i.e., a face is relevant in a ''Remember Faces'' block, but irrelevant in a ''Remember Scenes'' block). First, we hypothesized that disruption of top-down control signals emanating from the left IFG would reduce the fidelity of the stimulus category code within extrastriate cortex. Second, given that PFC likely maintains a code for goal relevance, we hypothesized that PFC disruption would reduce the fidelity of this information code in this disrupted PFC region, as well as other areas that depend on information from this disrupted region.

# MATERIALS AND METHODS

Analyses were applied to unpublished and published data (Lee and D'Esposito, 2012).

# Participants

Data from 24 participants (8 male, age range 18–38) were analyzed in this study. Data from 15 participants have not been previously published and data from nine participants were published in Lee and D'Esposito (2012). Although the Lee and D'Esposito study originally included 12 participants, three of those participants were excluded due to methodological issues specific to the current analyses. All procedures were approved by the UC Berkeley Committee for the Protection of Human Subjects, and participants gave their written informed consent before the study and were compensated monetarily for their participation.

# Cognitive Task

In the MRI scanner, participants viewed a series of pseudorandomly interleaved face and natural scene images in a jittered, event-related design with 3, 5 or 7 s in between the onset of each 600 ms stimulus presentation (Chen et al., 2008, 2011; **Figure 1**). In separate scanning runs, participants performed a 1-back matching task within the faces only ("Remember Faces") or scenes only ("Remember Scenes") behavioral conditions. Participants responded to each image with a button press indicating a 1-back "match" or "non-match" within the relevant category, and they also indicated "non-match" for all images of the irrelevant category. Participants also completed runs in which they were required to perform the 1-back matching task within both stimulus categories simultaneously, and runs in which they simply categorized each stimulus as a face or a scene, but these conditions were not of interest for the present analyses. Each participant completed five 20-trial 2 min runs of each behavioral condition, each of which contained four matches. To ensure that the pattern classification analyses were balanced and unbiased, both "match" and "non-match" and correct and incorrect trials were included in each of the following analyses.

Task-relevant images are outlined here, but outlines were not shown to participants. (NM = nonmatch, M = match). Figure modified from Lee and D'Esposito (2012).

# Transcranial Magnetic Stimulation

Detailed descriptions of the TMS methods used in this study have been published previously (Lee and D'Esposito, 2012). Immediately before each of two MRI scan sessions, nine participants underwent 40 s of continuous theta burst TMS, either to the left inferior frontal gyrus ("IFG TMS") or to the left postcentral gyrus ("Control TMS").

There was an average of 8 days between the IFG TMS and Control TMS scan sessions, with a range of 2–18 days. After the exclusion of three participants of the original 12 (see ''Participants'' Section), a total of two participants first underwent IFG TMS followed by Control TMS, and seven first underwent Control TMS. Given that each participant completed five 20-trial runs of each behavioral condition in an initial functional magnetic resonance imaging (fMRI) scan session prior to the two TMS/fMRI scans, it is unlikely that order effects account for the findings reported below. Moreover, re-analysis of the data accounting for order found no evidence of a systematic difference in TMS effects in the two order groups.

Left IFG TMS targets were defined functionally in a separate scan session with the same behavioral task, using a statistical contrast of all attended images vs. all ignored images, regardless of stimulus type, across all task conditions. Left postcentral gyrus TMS targets were anatomically defined using the Duvernoy brain atlas (Duvernoy, 1999) as a reference, and drawn as spheres with a radius of 5 mm centered 10 mm away from the midline and 5 mm from the top edge of the brain. TMS sites were identified in native space for each participant, and the corresponding MNI coordinates are listed in **Table 1**.

Continuous theta burst TMS, which provides localized activity disruption for up to 60 min after stimulation (Huang et al., 2005), consists of 50 Hz TMS pulse triplets administered every 200 ms (5 Hz) for a total duration of 40 s.
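The timing of this protocol can be made concrete with a short sketch. The function below is purely illustrative (its name and parameters are hypothetical and not part of any stimulator software); it simply enumerates the pulse onsets implied by 50 Hz triplets delivered every 200 ms for 40 s.

```python
def ctbs_pulse_times(duration_s=40.0, burst_rate_hz=5.0,
                     pulses_per_burst=3, intra_burst_hz=50.0):
    """Return pulse onset times (in seconds) for a theta-burst train.

    Illustrative helper only: a burst starts every 1 / burst_rate_hz seconds,
    and pulses inside a burst are spaced 1 / intra_burst_hz seconds apart.
    """
    n_bursts = round(duration_s * burst_rate_hz)
    times = []
    for burst in range(n_bursts):
        burst_onset = burst / burst_rate_hz
        for pulse in range(pulses_per_burst):
            times.append(burst_onset + pulse / intra_burst_hz)
    return times


pulse_times = ctbs_pulse_times()
n_pulses = len(pulse_times)  # 200 bursts x 3 pulses
```

Under these parameters the train contains 200 bursts and therefore 600 pulses, with 20 ms between pulses inside a triplet.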

# Functional MRI Acquisition and Preprocessing

MRI data were acquired in the UC Berkeley Henry H. Wheeler, Jr. Brain Imaging Center with a Siemens TIM/Trio 3T MRI scanner with a 12-channel receive-only head coil. Functional data were obtained using a one-shot T2<sup>∗</sup>-weighted echo-planar imaging (EPI) sequence sensitive to blood oxygenation level-dependent (BOLD) contrast (TR, 1000 ms; TE, 32 ms; field of view, 230 mm; matrix size, 64 × 64; in-plane resolution, 3.5 × 3.5 mm). Each functional volume contained 18 contiguous 5 mm-thick axial slices separated by a 0.5 mm interslice gap. Whole-brain MP Flash T1-weighted scans were acquired for anatomical localization and normalization.

Functional MRI data were then subjected to standard preprocessing with AFNI (Cox, 1996) and custom Matlab (v2011b, The MathWorks, Inc., Natick, MA, USA) scripts. Motion correction and volume registration of each EPI run to the anatomical scan was carried out in a single resampling step by align\_epi\_anat.py (Saad et al., 2009), by first aligning the mean of the middle EPI to the anatomical data and then aligning each volume to that mean EPI with a 12-parameter affine registration. Next, AFNI's 3dDeconvolve tool was used to compute an ordinary least squares regression with 15 double gamma canonical hemodynamic response function regressors: eight stimulus regressors, one for each combination of stimulus category and memory condition (i.e., a face in "Remember Faces", a scene in "Remember Faces", etc.), six motion parameter regressors (x, y, z, roll, pitch, yaw), and a quintic polynomial baseline regressor. Then, the resulting β-weighted estimated baseline component (motion + polynomial baseline) was calculated with AFNI's 3dSynthesize tool and subtracted from the original time series. Finally, each run was z-scored temporally, voxel-wise, in preparation for multi-voxel pattern analysis (MVPA).
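As an illustration of the final preprocessing step only, the sketch below z-scores one run's time series separately for each voxel. It is not the authors' Matlab code: the function name is hypothetical, the data are toy values, and the use of the population standard deviation (and the handling of constant voxels) is an assumption.

```python
from statistics import fmean, pstdev


def zscore_run(run_data):
    """Temporally z-score one run, voxel by voxel.

    run_data[t][v] is the signal at time point t in voxel v. Each voxel's
    time series is centred on its own mean and divided by its own standard
    deviation (population SD here, an assumption). Constant voxels are set
    to 0.0 to avoid division by zero.
    """
    columns = list(zip(*run_data))
    means = [fmean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in run_data
    ]


# Toy run: three time points, two voxels (the second voxel is constant).
toy_run = [[100.0, 5.0], [102.0, 5.0], [104.0, 5.0]]
z_run = zscore_run(toy_run)
```

Because the scaling is applied within each run, every voxel contributes on a comparable scale to the classifier regardless of run-to-run differences in signal level.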

#### Multi-Voxel Pattern Classification Analyses

In all of the following pattern classification analyses, we determined the fidelity of neural codes representing the category of each stimulus (face or scene), which we call the "stimulus category" code, and the relevance of each stimulus to the current task goal ("remember faces" or "remember scenes"), which we call the "goal relevance" code.

#### Stimulus Category Code

A classifier was trained to distinguish multi-voxel activity patterns evoked by the presentation of a face from those evoked by presentation of a scene, regardless of the relevance of the stimulus category to the current task condition (Chen et al., 2011). Based on our unpublished data which found that the coding of stimulus category information peaks just over 5 s after stimulus onset, this code was examined using BOLD signal from the EPI volume collected 5–6 s post stimulus onset.

#### Goal Relevance Code

A classifier was trained to distinguish multi-voxel activity patterns representing the relevance of each stimulus to the current task set (i.e., Relevant: a face in "Remember Faces" or a scene in "Remember Scenes", vs. Irrelevant: a scene in "Remember Faces" or a face in "Remember Scenes"). Based on our unpublished data which found that the coding of goal relevance information peaks later than the stimulus category code, about 6.5 s after stimulus onset, this code was examined using the BOLD signal from the EPI volume collected 6–7 s post stimulus onset.
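The two label sets can be derived from the same trials, as the following minimal sketch shows. The trial list, dictionary, and function names are hypothetical and serve only to make the two definitions explicit.

```python
# Which stimulus category is task relevant in each block type.
RELEVANT_CATEGORY = {"Remember Faces": "face", "Remember Scenes": "scene"}


def goal_relevance_label(stimulus, block):
    """Label a trial by whether its stimulus matches the block's relevant category."""
    return "relevant" if stimulus == RELEVANT_CATEGORY[block] else "irrelevant"


# Hypothetical trials: (stimulus category, block condition).
trials = [
    ("face", "Remember Faces"),
    ("scene", "Remember Faces"),
    ("face", "Remember Scenes"),
    ("scene", "Remember Scenes"),
]

# Stimulus category code: the label ignores the block condition.
category_labels = [stimulus for stimulus, _ in trials]
# Goal relevance code: the label depends on stimulus and block together.
relevance_labels = [goal_relevance_label(s, b) for s, b in trials]
```

Note that each relevance class contains both a face and a scene trial, so a classifier cannot separate relevant from irrelevant trials by stimulus category alone.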

#### Regions of Interest—Anatomical

A priori regions of interest (ROIs) were defined anatomically, by first registering each participant to MNI152 space (Grabner et al., 2006) and then back-projecting masks from the AAL atlas (Tzourio-Mazoyer et al., 2002) into the participant's native space. Anatomical ROIs included: left and right middle frontal gyrus (MFG), IFG (which includes pars opercularis, pars triangularis, and pars orbitalis), and extrastriate cortex (parahippocampal, lingual, and fusiform gyri).

# Regions of Interest—Functional

Functional ROIs were created from a dataset of 24 participants. This included previously unpublished data from 15 participants, and published data from nine participants who performed the behavioral task in the scanner prior to undergoing TMS (Lee and D'Esposito, 2012). To create "stimulus category" and "goal relevance" ROIs, we conducted whole-brain Gaussian Naïve Bayes searchlight analyses separately within each participant using the Searchmight toolbox (Pereira and Botvinick, 2011). Each 27-voxel cubic searchlight was iteratively moved throughout every voxel in the brain, following a leave-one-run-pair-out (one "Remember Faces" and one "Remember Scenes" run) cross-validation structure. The mean classification accuracy across all five cross-validation folds was assigned to the center voxel of each searchlight position, forming a stimulus category and a goal-relevance accuracy map for each participant. These accuracy maps were then spatially smoothed with an 8 mm FWHM Gaussian kernel, warped to MNI space, and then entered into a second-level group analysis in which the mean decoding accuracy at each voxel was tested against 50% chance accuracy with a one-sample t-test. The resulting t-map was used to threshold the mean across-subjects accuracy map at a stringent false-discovery-rate-corrected alpha level of 0.0001.
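The cross-validation logic for a single searchlight can be sketched as follows. This is a self-contained toy version, not the Searchmight implementation: the classifier is a minimal Gaussian Naïve Bayes with equal priors, the data are simulated, and all function names are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def fit_gnb(patterns, labels):
    """Store, per class, the mean and variance of every feature (voxel)."""
    model = {}
    for cls in sorted(set(labels)):
        rows = [p for p, lab in zip(patterns, labels) if lab == cls]
        cols = list(zip(*rows))
        # A small variance floor avoids division by zero.
        model[cls] = ([fmean(c) for c in cols],
                      [pvariance(c) + 1e-6 for c in cols])
    return model


def predict_gnb(model, pattern):
    """Return the class with the highest Gaussian log-likelihood (equal priors)."""
    def loglik(cls):
        means, variances = model[cls]
        return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                   for x, m, v in zip(pattern, means, variances))
    return max(model, key=loglik)


def leave_one_run_pair_out(trials, n_pairs):
    """trials: list of (run_pair_index, pattern, label). Mean accuracy over folds."""
    fold_accuracies = []
    for held_out in range(n_pairs):
        train = [(p, lab) for r, p, lab in trials if r != held_out]
        test = [(p, lab) for r, p, lab in trials if r == held_out]
        model = fit_gnb([p for p, _ in train], [lab for _, lab in train])
        hits = [1.0 if predict_gnb(model, p) == lab else 0.0 for p, lab in test]
        fold_accuracies.append(fmean(hits))
    return fmean(fold_accuracies)


# Simulated data: 5 run pairs x 40 trials, 27 voxels (one 3 x 3 x 3 searchlight).
rng = random.Random(0)
trials = []
for pair in range(5):
    for _ in range(40):
        label = rng.choice(["face", "scene"])
        shift = 1.0 if label == "face" else 0.0
        trials.append((pair, [rng.gauss(shift, 1.0) for _ in range(27)], label))

accuracy = leave_one_run_pair_out(trials, n_pairs=5)
```

In a full searchlight analysis this accuracy would be written to the center voxel and the procedure repeated at every voxel position.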

# ROI Pattern Classification Analysis

Within anatomical (MFG, IFG, and extrastriate cortex) and functional ROIs, a regularized logistic regression classifier (Princeton Multi-Voxel Pattern Analysis toolbox v1.1; http://code.google.com/p/princeton-mvpa-toolbox/) was used to test for TMS-induced changes in the fidelity of codes representing stimulus category and goal relevance. All MVPA analyses were run with an iterative cross-validation procedure in which all but one pair of runs (one "Remember Faces" + one "Remember Scenes") were used to train the classifier, and the held-out pair was then used as a test set to assess classifier accuracy. Non-parametric permutation tests were used to test for above-chance classification, as well as to test for significant differences between information code fidelity (indexed by classifier accuracy) in the two TMS conditions. More specifically, 1000 sets of permuted class labels were pre-generated, following the cross-validation structure of the original analysis. Then, single-subject null classifier accuracy distributions were created separately for each ROI and TMS condition, each time using the same 1000 sets of permuted class labels. Finally, the single-subject classifier accuracies for each of the 1000 sets of permuted labels were averaged across subjects, to create a null distribution of mean classifier accuracies against which to test the observed mean classifier accuracies.
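The structure of this permutation scheme can be sketched in a few lines. The sketch assumes that "following the cross-validation structure" means shuffling labels within each run, which is our reading rather than a detail stated above; the single-subject null accuracies are simulated placeholders standing in for classifiers retrained on each permuted label set, and all names are hypothetical.

```python
import random
from statistics import fmean


def permute_within_runs(labels, runs, rng):
    """Shuffle class labels separately inside each run (assumed structure)."""
    permuted = list(labels)
    for run in sorted(set(runs)):
        idx = [i for i, r in enumerate(runs) if r == run]
        shuffled = [labels[i] for i in idx]
        rng.shuffle(shuffled)
        for i, lab in zip(idx, shuffled):
            permuted[i] = lab
    return permuted


def group_null_distribution(subject_null_acc):
    """Average null accuracies across subjects, permutation by permutation.

    subject_null_acc[s][k] is subject s's accuracy under permuted label set k;
    the same k indexes the same label set for every subject.
    """
    n_perm = len(subject_null_acc[0])
    return [fmean([subj[k] for subj in subject_null_acc]) for k in range(n_perm)]


# Pre-generate 1000 permuted label sets once, to be reused for every ROI.
labels = ["face", "scene"] * 20
runs = ["faces_run"] * 20 + ["scenes_run"] * 20
rng = random.Random(0)
label_sets = [permute_within_runs(labels, runs, rng) for _ in range(1000)]

# Placeholder null accuracies for 9 subjects (simulated chance performance).
sim = random.Random(1)
subject_null_acc = [
    [sum(sim.random() < 0.5 for _ in range(40)) / 40 for _ in range(1000)]
    for _ in range(9)
]
group_null = group_null_distribution(subject_null_acc)
```

Averaging permutation by permutation, rather than pooling all values, keeps the group null on the same scale as the observed group-mean accuracy.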

To test whether classification accuracy was significantly above chance within each ROI and TMS condition, we calculated the fraction of the null classifier accuracy distribution that exceeded the observed classifier accuracy. This allowed for the calculation of empirical p-values for each ROI and TMS condition, which were then Bonferroni corrected for multiple comparisons. Finally, we tested whether classification accuracy decrements after IFG TMS as compared to control TMS were greater than what would be expected by chance. First, a null ''TMS condition difference'' distribution was created for each ROI by subtracting the classifier accuracy in each permutation of the IFG TMS condition data from the classifier accuracy in the matching Control TMS condition permutation, and averaging across all eight participants. The p-value of the resulting TMS condition difference within each ROI was calculated as the fraction of the null TMS condition difference distribution that exceeded the true TMS condition difference. Finally, these empirical p-values were Bonferroni corrected for multiple comparisons. These analyses were repeated for each information code.
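These steps reduce to a few small computations, sketched below with toy numbers. The function names are hypothetical, and the p-value follows the wording above (the fraction of null values that strictly exceed the observed value); some analysts prefer the slightly more conservative (count + 1) / (n + 1) estimator.

```python
def empirical_p(observed, null_values):
    """Fraction of the null distribution that exceeds the observed value."""
    return sum(v > observed for v in null_values) / len(null_values)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def tms_difference_null(control_null, ifg_null):
    """Group-mean (Control TMS - IFG TMS) accuracy difference per permutation.

    control_null[s][k] and ifg_null[s][k] are subject s's null accuracies for
    permuted label set k in each TMS condition.
    """
    n_subj = len(control_null)
    n_perm = len(control_null[0])
    return [
        sum(control_null[s][k] - ifg_null[s][k] for s in range(n_subj)) / n_subj
        for k in range(n_perm)
    ]


# Toy example: two subjects, three permutations.
control_null = [[0.5, 0.6, 0.4], [0.5, 0.4, 0.6]]
ifg_null = [[0.5, 0.5, 0.5], [0.4, 0.5, 0.6]]
diff_null = tms_difference_null(control_null, ifg_null)

p_example = empirical_p(0.53, [0.48, 0.50, 0.52, 0.55])
corrected = bonferroni([0.01, 0.2, 0.5])
```

The same `empirical_p` function serves both tests: once against the null of mean accuracies, and once against the null of condition differences.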

# Whole-Brain Searchlight Classification Analysis

This analysis was designed to investigate whether brain regions outside our initially hypothesized regions also code stimulus category and/or goal relevance, and to test whether the fidelity of these information codes is affected by left IFG disruption with TMS. Using the Searchmight toolbox (Pereira and Botvinick, 2011), we conducted a whole-brain Gaussian Naive Bayes searchlight analysis separately within each participant and TMS condition (IFG TMS, control TMS). Each 125-voxel cubic searchlight was iteratively moved throughout every voxel in the brain, with the mean classification accuracy across all cross-validation folds assigned to the center voxel of the searchlight. This yielded one accuracy map per TMS condition per participant, and each participant's IFG TMS accuracy map was then subtracted from the control TMS accuracy map to create a "true TMS condition difference" accuracy map. The resulting difference maps were normalized to MNI space, spatially smoothed with an 8 mm FWHM Gaussian kernel, and then entered into a non-parametric group analysis similar to that proposed by Stelzer et al. (2013). More specifically, 100 sets of permuted labels were generated, and used to create 100 null searchlight accuracy maps per participant. Then, 100,000 average group maps were created via a bootstrapping procedure: on each of the 100,000 iterations, 1 of the 100 maps was drawn randomly with replacement from each participant, and the resulting maps were averaged across participants. Next, the "true TMS condition difference" mean accuracy map was thresholded voxelwise, with each voxel only passing the threshold if its true value exceeded 99.5% of the 100,000 values in the null distribution. Finally, we performed a cluster correction procedure in which the cluster size threshold was determined empirically from our 100,000 null group maps. First, the size of the largest contiguous cluster (composed of voxels sharing faces, not just edges or corners) in each of the 100,000 null group maps was calculated and recorded. Then, any clusters in the "true TMS condition difference" map larger than 99.5% of the maximum clusters from the null maps were considered significant.
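The cluster correction step can be sketched as follows. This is a simplified, assumption-laden illustration rather than the study's implementation: maps are represented as small nested lists of booleans (voxels that passed the voxelwise threshold), and the helper names `cluster_sizes` and `cluster_is_significant` are hypothetical. Clusters are defined by face adjacency only, matching the description above.

```python
from collections import deque

# Offsets to the six face-sharing neighbours of a voxel (no edges or corners).
FACE_NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cluster_sizes(mask):
    """Sizes of all face-connected clusters in a 3D boolean mask (nested lists)."""
    nx, ny, nz = len(mask), len(mask[0]), len(mask[0][0])
    seen = set()
    sizes = []
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if not mask[x][y][z] or (x, y, z) in seen:
                    continue
                # Breadth-first flood fill from this unvisited suprathreshold voxel.
                queue = deque([(x, y, z)])
                seen.add((x, y, z))
                size = 0
                while queue:
                    cx, cy, cz = queue.popleft()
                    size += 1
                    for dx, dy, dz in FACE_NEIGHBORS:
                        px, py, pz = cx + dx, cy + dy, cz + dz
                        inside = 0 <= px < nx and 0 <= py < ny and 0 <= pz < nz
                        if inside and mask[px][py][pz] and (px, py, pz) not in seen:
                            seen.add((px, py, pz))
                            queue.append((px, py, pz))
                sizes.append(size)
    return sizes


def cluster_is_significant(size, null_max_sizes, quantile=0.995):
    """True if `size` is larger than at least `quantile` of the null maxima."""
    smaller = sum(1 for s in null_max_sizes if s < size)
    return smaller / len(null_max_sizes) >= quantile


# Toy example: the two diagonal voxels share only an edge, so they form
# separate clusters under the face-adjacency rule.
diagonal = [[[True], [False]], [[False], [True]]]
print(cluster_sizes(diagonal))

# Null maxima would come from the largest cluster in each thresholded null group map.
null_masks = [diagonal, [[[True], [True]], [[False], [True]]]]
null_max_sizes = [max(cluster_sizes(m), default=0) for m in null_masks]
print(null_max_sizes)
```

Recording only the maximum cluster size per null map is what controls the family-wise error rate at the cluster level: a true cluster must be larger than nearly all of the largest clusters that arise by chance anywhere in the brain.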

# RESULTS

These analyses were applied to data previously published (Lee and D'Esposito, 2012). In the previous article, univariate, spatial similarity, and functional connectivity analyses indicated that left IFG disruption reduced category-specific tuning in extrastriate cortex and impaired performance on a face/scene matching task. In addition, activity in the non-disrupted right IFG, and connectivity between this region and extrastriate cortex, predicted resistance to behavioral impairment from left IFG disruption.

In the current study, to assess the effects of lateral PFC disruption on the neural representation of the active maintenance of information codes during working memory, we examined multi-voxel patterns of activity within a priori anatomical and functional ROIs as well as across the whole brain. Specifically, we examined two distinct types of representations: (1) stimulus category, i.e., whether a face or a scene was presented on each trial of a face/scene matching task, and (2) goal relevance, i.e., whether the presented image was task relevant (a face is relevant in a "Remember Faces" block, but irrelevant in a "Remember Scenes" block). Then, we compared the fidelity of these representations following left IFG TMS with that following left post-central gyrus TMS (control site).

#### Exploratory Searchlight MVPA Analyses

In an independent dataset in which participants did not undergo TMS (n = 24), a whole-brain Gaussian Naïve Bayes searchlight classifier (Pereira and Botvinick, 2011) was used to identify brain regions reliably representing each information code (i.e., stimulus category and goal relevance). Nine of these subjects later participated in the TMS experiment, but the data used in this exploratory searchlight analysis were separate from the data later analyzed for TMS effects.

As predicted, a stimulus category code was reliably identified in extrastriate cortex, but also within primary visual cortex and parietal cortex. To identify category-selective ROIs to test for TMS effects, we selected voxels within these areas using a highly stringent FDR-corrected alpha level of 0.0001 (**Figure 2A**). Anatomical coordinates of these ROIs are presented in **Table 2**.

A goal relevance code was reliably identified in a bilateral set of regions including lateral and medial PFC, premotor cortex, superior parietal cortex, and striatum (**Figure 2B**). To identify goal-relevance ROIs to test for TMS effects, we selected voxels within these areas using a highly stringent FDR-corrected alpha level of 0.0001. Voxel clusters were identified in IFG, supplementary motor area, precentral sulcus/precentral gyrus, inferior parietal lobule, angular gyrus, and left caudate nucleus. Anatomical coordinates of these ROIs are presented in **Table 3**.
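For readers unfamiliar with FDR-based voxel selection, the following sketch shows one common way to apply it. The text does not state which FDR procedure was used, so the Benjamini-Hochberg step-up rule here is an assumption, and the function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha):
    """Return booleans marking which tests survive FDR control at level `alpha`.

    Assumes the Benjamini-Hochberg step-up rule: find the largest rank k with
    p_(k) <= (k / m) * alpha and keep every test whose rank is at most k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            keep[idx] = True
    return keep


# Toy voxelwise p-values; a real searchlight map would contain many thousands.
voxel_p = [0.00001, 0.5, 0.00004, 0.2]
print(benjamini_hochberg(voxel_p, alpha=0.0001))
```

Because the critical value grows with rank, a voxel can survive even when its p-value exceeds the threshold applied to the smallest p-value, which makes the procedure less conservative than a Bonferroni correction at the same alpha.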

FIGURE 2 | Whole-brain searchlight analysis (FDR corrected, alpha p < 0.0001). (A) Brain regions that reliably represent stimulus category. (B) Brain regions that reliably represent goal relevance. Axial slice depicts voxels identified in the bilateral anterior insula/frontal operculum and left caudate nucleus.

# Effect of Left IFG TMS on Stimulus Category Code

#### ROI-Based Analyses

A stimulus category code was reliably identified within functional ROIs defined from the whole-brain searchlight analysis following control site TMS (**Figure 3**). Mean classification accuracies were 63% in these ROIs in both hemispheres (both significant after Bonferroni correction for multiple comparisons; permutation test p's < 0.02 corrected, p's < 0.001 uncorrected). In addition, a stimulus category code was reliably identified in these ROIs after left IFG TMS [mean classification accuracies of 59% (left) and 60% (right), p's < 0.02 corrected, p's < 0.001 uncorrected]. Although the effect was small, decoding accuracy of the stimulus category code in the left visual cortex functional ROI was reduced by left IFG TMS (p = 0.01, uncorrected). Decoding accuracy of the stimulus category code in the right visual cortex functional ROI was not significantly affected by left IFG TMS (p = 0.08, uncorrected), but the more restricted anatomical extrastriate ROI exhibited a significant decrease in stimulus category code decoding accuracy after IFG TMS (TMS effect: p = 0.03, uncorrected).

TABLE 2 | Anatomical locations and MNI coordinates of the centers of mass of clusters used as stimulus category functional ROIs.

multiple comparisons, and tildes indicate p < 0.05 without Bonferroni correction. Error bars depict ± standard error of the mean. Dashed line indicates chance classification accuracy.

A stimulus category code was not reliably identified in the anatomical MFG or IFG ROIs after either control TMS (left MFG: p = 0.08; right MFG: p = 0.56; left IFG: p = 0.49; right IFG: p = 0.39, all p's uncorrected) or after left IFG TMS (left MFG: p = 0.49; right MFG: p = 0.56; left IFG: p = 0.457; right IFG: p = 0.39, all p's uncorrected, **Figure 3**). There were also no significant differences between the TMS conditions in these four anatomical ROIs (all p's > 0.18, uncorrected).

TABLE 3 | Anatomical locations and MNI coordinates of the centers of mass of clusters used as goal relevance functional ROIs.


#### Searchlight Analyses

Following left IFG TMS, the whole-brain searchlight analysis identified a number of significant clusters in bilateral occipital and parietal cortex, and left superior medial gyrus, that exhibited a significant decrease in stimulus category code decoding accuracy (**Figure 4A**). Anatomical coordinates of these regions are presented in **Table 4**. Voxels within left fusiform gyrus, intraparietal, middle occipital, and parieto-occipital sulci, and in the right calcarine sulcus and cuneus exhibited spatial overlap with the category-selective regions identified in the independent exploratory searchlight analysis for identifying the stimulus category code (**Figure 2A**). Voxels in the superior medial gyrus were not identified in the exploratory searchlight analysis.

While it is unclear how to interpret increases in classification accuracy following IFG TMS as compared to control TMS, we found significant increases in stimulus category code decoding accuracy in the bilateral insula, right IFG, right superior temporal gyrus, and left middle temporal gyrus. In none of these regions was stimulus category reliably coded in the independent, no-TMS dataset used for functional ROI definition (see "Regions of interest—functional" Section).

FIGURE 4 | Regions that exhibited a significant decrease in (A) stimulus category and (B) goal relevance code decoding accuracy following left IFG TMS, as compared to control TMS. All depicted voxels are significant at the alpha (p < 0.005) level, and only voxel clusters larger than 99.5% of the null distribution of cluster sizes are shown here. White outlines depict the regions that showed above-chance classification in the exploratory searchlight analysis used to identify the stimulus category and goal relevance codes (see "Exploratory Searchlight MVPA Analyses" Section), voxelwise uncorrected p < 0.001.

TABLE 4 | Anatomical locations and MNI coordinates of the centers of mass of voxel clusters showing significant decreases in stimulus category code decoding accuracy after left IFG TMS.

∗ Indicates clusters within regions that also reliably represented the stimulus category code in a searchlight analysis on an independent, no-TMS dataset.

# Effect of Left IFG TMS on Goal Relevance Code

#### ROI-Based Analyses

A goal relevance code was reliably identified following control site TMS within all of the functional ROIs defined from the whole-brain searchlight analysis in an independent dataset: bilateral IFG, MFG, supplementary motor area, precentral sulcus/precentral gyrus/inferior frontal junction (IFJ), anterior insula/frontal operculum, parietal cortex, and left caudate nucleus (caudate p = 0.03, uncorrected, all other permutation test p's < 0.02 corrected, p's < 0.001 uncorrected; **Figure 5**). After left IFG TMS, however, the goal relevance decoding accuracy was significantly reduced in the left IFG (TMS effect: p = 0.04, uncorrected), the right MFG (TMS effect: p < 0.01 corrected, p < 0.001 uncorrected) and the bilateral precentral sulcus/precentral gyrus/IFJ functional ROI (TMS effect: p = 0.01, uncorrected). We further examined the significant effect of left IFG TMS on the goal relevance code in the bilateral precentral sulcus/precentral gyrus/IFJ ROI by performing the classification analyses separately within each hemisphere. While goal relevance was represented with high reliability in the left and right ROIs both after control TMS and after left IFG TMS (all p's ≤ 0.004 corrected), the left IFG TMS marginally reduced decoding accuracy in both hemispheres (left TMS effect: p = 0.07, right TMS effect: p = 0.05, both uncorrected).

Following left IFG TMS, there was no significant decrease in goal relevance decoding accuracy in the right IFG (TMS effect: p = 0.34), left MFG (TMS effect: p = 0.10), bilateral supplementary motor area (TMS effect: p = 0.19), bilateral anterior insula/frontal operculum (TMS effect: p = 0.31), left caudate nucleus (TMS effect: p = 0.82) or bilateral parietal cortex ROI (TMS effect: p = 0.17).

#### Searchlight Analyses

Following left IFG TMS, as compared to control site TMS, the whole-brain searchlight analysis identified several brain regions that exhibited a significant decrease in goal relevance decoding accuracy (**Figure 4B**). These regions were found throughout the frontal, parietal and occipital cortex (**Table 5**). Mirroring the ROI-based analysis, significant reductions were found in the left IFG, precentral sulcus, middle occipital gyrus/intra-parietal sulcus and right middle temporal gyrus, and calcarine gyrus. Anatomical coordinates of these regions are presented in **Table 5**.

We found significant increases in goal relevance decoding accuracy in the right insula, left middle temporal gyrus, and left lingual gyrus, although none of these regions exhibited significant coding of goal relevance in the independent no-TMS dataset (see "Regions of interest—functional" Section).

#### Behavioral Analyses

TABLE 5 | Anatomical locations and MNI coordinates of the centers of mass of voxel clusters exhibiting significant decreases in goal relevance code decoding accuracy after left IFG TMS.

∗ Indicates clusters within regions that reliably represented the goal relevance code in a searchlight analysis of an independent, no-TMS dataset.

Across both the "Remember Faces" and "Remember Scenes" conditions, participants performed the face/scene matching task with 92.9% mean accuracy after control site TMS. After left IFG TMS, mean accuracy was reduced to 90.1%. We tested for a brain-behavior relationship within the ROIs that showed a significant effect of TMS on decoding accuracy of the stimulus category code (left category-selective visual cortex functional ROI) and the goal relevance code (right MFG, left IFG, and bilateral precentral sulcus/IFJ), using an independent samples t-test on a median split of TMS-induced behavioral accuracy decrement (i.e., accuracy after control TMS minus accuracy after IFG TMS). Although this analysis was under-powered given the small number of subjects, no significant differences between the most- and least-impaired participants were found in the TMS effect on the stimulus category code in the left visual cortex functional ROI (t(5.45) = 0.62, p = 0.56), or on the goal relevance code in the right MFG (t(6.95) = −0.95, p = 0.37), left IFG (t(4.02) = −0.69, p = 0.53), or bilateral precentral sulcus (t(6.4) = 1.22, p = 0.27).
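The median-split comparison can be sketched in a few lines of Python. The fractional degrees of freedom reported above suggest an unequal-variance (Welch) form of the independent samples t-test, but that is an inference rather than something the text states. The helper names, the handling of ties at the median, and the example numbers are likewise hypothetical.

```python
import math
from statistics import mean, median, variance


def median_split(scores, values):
    """Split `values` by whether the paired score is above the median score.

    Assumption: participants exactly at the median go to the lower group.
    """
    cut = median(scores)
    high = [v for s, v in zip(scores, values) if s > cut]
    low = [v for s, v in zip(scores, values) if s <= cut]
    return high, low


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up data: behavioral decrement (control minus IFG accuracy) and the
# TMS effect on decoding accuracy in one ROI, for eight participants.
behav_decrement = [0.01, 0.05, 0.02, 0.06, 0.00, 0.04, 0.03, 0.07]
decoding_effect = [0.02, 0.05, 0.01, 0.03, 0.04, 0.06, 0.00, 0.02]
most_impaired, least_impaired = median_split(behav_decrement, decoding_effect)
t_stat, dof = welch_t(most_impaired, least_impaired)
print(round(t_stat, 3), round(dof, 2))
```

A median split discards information about how large each participant's decrement was, which is one reason such a test has limited power in small samples.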

In our previous analysis of this dataset (Lee and D'Esposito, 2012), we found that increased activity in the right (nondisrupted) IFG after TMS predicted resistance to the behavioral impairment caused by TMS. To further clarify this result, we tested for a relationship between behavioral accuracy after IFG TMS and decoding accuracy of the goal relevance code in this region. Across the large right IFG anatomical ROI, we found a significant positive correlation, such that those participants who showed high accuracy on the task exhibited reliable coding of goal relevance in the right IFG (Spearman's rho = 0.65, p = 0.04). As expected given the reduction of the goal relevance code in the MFG after left IFG TMS, there was no such relationship in either the left or the right MFG (left MFG: rho = 0.35, p = 0.36; right MFG: rho = 0.57, p = 0.11).
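Spearman's rho, used for the brain-behavior correlations above, is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with hypothetical function names and made-up data. It does not compute a p-value, and it is not the analysis code from the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up example: task accuracy after IFG TMS and decoding accuracy in one ROI.
task_accuracy = [0.85, 0.88, 0.90, 0.91, 0.93, 0.95]
decoding_accuracy = [0.52, 0.55, 0.51, 0.58, 0.60, 0.57]
print(spearman_rho(task_accuracy, decoding_accuracy))
```

Because it depends only on rank order, the statistic is insensitive to outliers and to monotonic rescaling of either variable, which is useful when accuracy values cluster near ceiling.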

# DISCUSSION

A growing body of evidence suggests that the prioritized processing and storage of information in VWM relies on top-down modulation of visual areas by the PFC (Miller and D'Esposito, 2005; Bressler et al., 2008; Sreenivasan et al., 2014a; D'Esposito and Postle, 2015). Here, we add causal evidence that the lateral PFC provides top-down signals that modulate the category-selectivity of visual cortex during VWM. In addition, we provide evidence that integration of an overarching task goal with incoming visual information is at least partially subserved by the left IFG, from which this information is likely transmitted to other regions responsible for further VWM processing.

In this study, we first conducted a set of multi-voxel pattern analyses to identify brain regions that code for stimulus category and goal relevance during a face/scene matching task. Second, we determined how the fidelity of these codes (as indexed by multi-voxel pattern analysis classifier decoding accuracy) is affected by disruption of the lateral PFC. As predicted, we found that stimulus category information was represented most reliably in extrastriate cortex, extending to early visual cortex and posterior parietal cortex. After left IFG disruption, there was a moderate reduction in the fidelity of the stimulus category code within these regions in both hemispheres. This finding is consistent with two previous studies that investigated the remote effects of disrupted lateral PFC function on visual cortical activity during VWM. The first (Miller et al., 2011) found that disruption of PFC function, both with TMS in healthy participants and in patients with lateral PFC lesions due to stroke, reduces the distinctiveness of extrastriate cortex responses to face and scene stimuli. The second, using a different type of analysis of the data used in the present study (Lee and D'Esposito, 2012), also found that PFC disruption with TMS in healthy individuals causes a reduction in visual category selectivity in extrastriate cortex. Importantly, the participants for whom the lateral PFC disruption reduced the tuning of extrastriate responses to faces and scenes the most showed the greatest impairments in behavioral accuracy. While numerous correlational studies, both in humans (e.g., Gazzaley et al., 2004, 2007; Nee and Jonides, 2009; Tamber-Rosenau et al., 2011; Kuo et al., 2014) and in non-human primates (e.g., Freedman et al., 2003), have provided indirect evidence for top-down modulation of visual cortex by the PFC during VWM, the use of transient PFC disruption with TMS contributes important causal evidence for this model of cognitive control.

While the stimulus category code presumably arises largely as a result of "bottom-up" visual processing, the coding of goal relevance depends on the integration of a high-level task goal with bottom-up stimulus category information. This bridge between task goal and incoming visual information, while crucial for successful VWM performance, has not been well characterized. In the current analyses, we found that the goal relevance of incoming visual information, as determined by the current task set, was coded reliably in a distributed network of regions thought to be important for cognitive control, selective attention, and working memory (Dosenbach et al., 2006; Harding et al., 2015; also see Lückmann et al., 2014, for a review of these regions in attentional orienting in working memory). These goal-relevance regions included the IFG, MFG, precentral sulcus/IFJ, supplementary motor area, left striatum, and parietal and extrastriate cortices.

Following left IFG TMS, the fidelity of the goal relevance code decreased within that region, as predicted. In addition, the left IFG TMS also disrupted the goal relevance code in the MFG, suggesting that MFG relies on input from, or reciprocal communication with, the left IFG for the selective processing and maintenance of visual information. Previous functional connectivity analyses have suggested that the MFG plays a key role in VWM distractor resistance and in protecting items in memory from interference (Sakai et al., 2002; Postle, 2005), while the IFG may play a stronger role in determining the level of attention to allocate to incoming stimuli, based on task goals (Clapp et al., 2010). Considering these and the present findings, it is possible that the IFG is involved in determining whether an incoming stimulus is goal relevant, and gating information transfer to MFG accordingly, to aid in the protection of current items in memory from interference. Consistent with this proposed model, Feredoes et al. (2011) found that disruption of right MFG function with TMS during the presentation of distractors in a delayed recognition task caused increased activity in visual regions selective for the category of the remembered item.

After left IFG TMS, we also found a significant decrease in the fidelity of a goal relevance code within bilateral precentral sulcus/IFJ. A previous human fMRI/ERP study demonstrated that TMS to right IFJ before a similar delayed recognition task impaired task accuracy, and the size of the behavioral decrement was predicted by the degree to which top-down modulation of early visual cortex activity by the IFJ was impaired (Zanto et al., 2011). In addition, in a human MEG study, it was found that attention to different object categories induced gamma synchrony between the IFJ and the extrastriate regions most selective for those categories (Baldauf and Desimone, 2014). Moreover, the gamma activity in IFJ slightly preceded activity in extrastriate regions, which was interpreted as evidence that the IFJ directs visual processing via gamma synchrony with category-selective visual areas. Therefore, in the context of our results, it is possible that top-down modulation of visual areas by the lateral PFC is accomplished via processing of goal-relevance information in bilateral precentral sulcus/IFJ, from which goal-directed attention (Asplund et al., 2010) may be deployed to shape bilateral extrastriate cortical response selectivity (e.g., Chen et al., 2012). Further, it is likely that other brain regions, such as the frontal eye fields (e.g., Taylor et al., 2007), contribute additional top-down signals that aid VWM.

Finally, left IFG disruption did not significantly reduce the fidelity of the goal relevance code in the right IFG. However, participants who exhibited greater fidelity of the goal relevance code in this region after TMS performed the task with the highest accuracy. These findings are consistent with our original analyses of this dataset (Lee and D'Esposito, 2012). In that study, we found that increased functional connectivity between the right IFG and the right extrastriate cortex before TMS, and increased activity in the non-disrupted IFG after TMS, predicted resistance to the behavioral VWM impairment caused by TMS. Therefore, the current analysis provides additional insight into a potential compensatory mechanism, whereby reliable coding of goal relevance in a region homologous to the disrupted PFC area can provide protection against behavioral VWM impairment.

# AUTHOR CONTRIBUTIONS

MD, AJ-WC, TGL, and ESL participated in study conception and design, TGL collected the data, ESL conducted the analyses, and ESL drafted the manuscript with critical revisions from TGL, AJ-WC, and MD.

# FUNDING

This work was supported by the National Institutes of Health Grants MH63901 and NS40813, the National Science Foundation Major Research Instrumentation Program BCS-0821855, and the VA Office of Rehabilitation Research and Development.

# ACKNOWLEDGMENTS

We thank Terence Nycum for his work developing the cognitive task, and for his contributions to the multi-voxel pattern analyses, and Joshua Hoffman for his assistance with data collection.

# REFERENCES


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Lorenc, Lee, Chen and D'Esposito. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Role of Prefrontal Cortex in Working Memory: A Mini Review

Antonio H. Lara<sup>1</sup>\* and Jonathan D. Wallis<sup>2,3</sup>

<sup>1</sup> Department of Neuroscience, Columbia University Kolb Research Annex, New York, NY, USA, <sup>2</sup> Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA, USA, <sup>3</sup> Department of Psychology, University of California at Berkeley, Berkeley, CA, USA

A prominent account of prefrontal cortex (PFC) function is that single neurons within the PFC maintain representations of task-relevant stimuli in working memory. Evidence for this view comes from studies in which subjects hold a stimulus across a delay lasting up to several seconds. Persistent elevated activity in the PFC has been observed in animal models as well as in humans performing these tasks. This persistent activity has been interpreted as evidence for the encoding of the stimulus itself in working memory. However, recent findings have posed a challenge to this notion. A number of recent studies have examined neural data from the PFC and posterior sensory areas, both at the single neuron level in primates, and at a larger scale in humans, and have failed to find encoding of stimulus information in the PFC during tasks with a substantial working memory component. Strong stimulus related information, however, was seen in posterior sensory areas. These results suggest that delay period activity in the PFC might be better understood not as a signature of memory storage per se, but as a top down signal that influences posterior sensory areas where the actual working memory representations are maintained.

#### Edited by:

Natasha Sigala, University of Sussex, UK

#### Reviewed by:

David J. Freedman, University of Chicago, USA
Bradley R. Postle, University of Wisconsin–Madison, USA

> \*Correspondence: Antonio H. Lara ahl2143@columbia.edu

Received: 28 August 2015 Accepted: 27 November 2015 Published: 18 December 2015

#### Citation:

Lara AH and Wallis JD (2015) The Role of Prefrontal Cortex in Working Memory: A Mini Review. Front. Syst. Neurosci. 9:173. doi: 10.3389/fnsys.2015.00173

Keywords: working memory, attention, executive function, prefrontal cortex, frontoparietal network

# INTRODUCTION

A widely held view of prefrontal cortex (PFC) function is that it encodes task relevant information in working memory (Goldman-Rakic, 1987; Miller and Cohen, 2001; Baddeley, 2003). This account originates from decades of work that showed strong neural activity in PFC during the delay period of working memory tasks (Fuster and Alexander, 1971; Funahashi et al., 1993a; Wilson et al., 1993; Levy and Goldman-Rakic, 2000). This delay period activity has two key properties. First, it is specific to the stimulus being remembered, consistent with it containing information about the content of working memory. Second, it only encodes stimuli that are relevant to the task at hand: it is resistant to distractors (Miller et al., 1996; Sakai et al., 2002) and task irrelevant information is not encoded in working memory (Rainer et al., 1998). These properties of delay period activity have been observed at the single-neuron level in monkeys as well as on a larger scale in human imaging studies (Courtney et al., 1998; Zarahn et al., 1999; Curtis et al., 2004). In monkeys, single neurons recorded from PFC maintain stimulus information across the delay period, even when distracting stimuli are presented in the middle of the delay (Miller et al., 1996). The delay period activity is thought to reflect the stimulus currently in memory (Fuster, 1973; Funahashi et al., 1993a; Wilson et al., 1993; Procyk and Goldman-Rakic, 2006). In humans, multiple studies using various imaging techniques have also shown an increase in delay period activity in PFC. For example, using functional magnetic resonance imaging (fMRI) sustained activation was measured in the lateral PFC while subjects kept spatial locations in working memory across delays of several seconds (Courtney et al., 1998).

The necessity of PFC delay activity for working memory is demonstrated by studies showing that lesions to PFC produce strong deficits in working memory tasks both in monkeys (Fuster and Alexander, 1971; Bauer and Fuster, 1976; Funahashi et al., 1993b; Wilson et al., 1993; Levy and Goldman-Rakic, 2000) and humans (Müller et al., 2002; Tsuchida and Fellows, 2009; Voytek and Knight, 2010). In addition, disruption of delay period activity with microstimulation increases the rate of errors (Wegener et al., 2008). Furthermore, the longer the delay, the greater the error rate, consistent with a failure of working memory to retain stimulus information. These findings have formed the basis for the prevailing view that PFC is the site where information about the stimulus to be remembered is stored in working memory (for a recent review, see D'Esposito and Postle, 2015). However, recently there has been a growing body of work that has cast doubt on this theory (Druzgal and D'Esposito, 2001; Curtis and D'Esposito, 2003; Postle et al., 2003; Ranganath et al., 2004; Sreenivasan et al., 2014a,b; Postle, 2015). In this mini-review, we will briefly discuss the evidence against the prevalent theory and review emerging evidence for an alternate proposal for the role of PFC in working memory.

# IS PFC THE SITE OF WORKING MEMORY STORAGE?

Some of the first evidence that contradicted the view that PFC represents stimulus information in working memory came from neuroimaging studies in humans. Researchers showed that delay period activity in PFC did not encode information specific to the stimulus being held in working memory (Curtis and D'Esposito, 2003; Riggall and Postle, 2012), while the converse was true for posterior sensory areas (Ester et al., 2009; Harrison and Tong, 2009; Serences et al., 2009; Emrich et al., 2013). These findings are important because they confirm that PFC is active during the delay period. However, they also suggest that PFC does not contain information about the stimulus, as would be expected if PFC were the site of working memory storage. In addition to evidence from imaging studies, it has been reported that lesions of PFC do not always impair working memory storage. Patients with large lesions localized to the lateral PFC showed no deficits on tests of verbal and memory span or delayed recognition (D'Esposito and Postle, 1999). A similar result was found in monkeys with lesions of the ventral PFC (Rushworth et al., 1997).

In trying to reconcile these discrepant findings, Curtis and D'Esposito (2003) proposed an alternate role for delay period activity in PFC: "the [dorsal lateral] PFC does not store representations of past sensory events or future responses. Instead, its activation is an extra-mnemonic source of top-down biasing control over posterior regions that actually store the representations." A similar proposal was put forward by Postle (2006), based on similar lines of evidence from lesion, imaging and electrophysiology studies. In his influential review, Postle argued that "the retention of information in working memory is associated with sustained activity in the same brain regions that are responsible for the representation of that information in non-working memory situations"; this implies "that the PFC is not a substrate for the storage of information in working memory" (Postle, 2006). Instead, according to Postle, the contribution of PFC to working memory could be any of the control processes (e.g., attentional selection, flexible control, etc.) that are also required when performing a working memory task.

Until recently, however, there was little electrophysiological evidence to support these views. In an early study, Lebedev et al. (2004) trained monkeys to maintain one spatial location in working memory while they also attended to a different location that would provide the go cue for making a saccade to the remembered location. They found two populations of neurons in PFC: one population encoded the location where the monkeys were attending while the other population encoded the spatial location in working memory (Lebedev et al., 2004). This was one of the first demonstrations that PFC neurons can play a different role in a working memory task that is not strictly maintenance per se. Additional evidence for an alternate role of PFC in working memory tasks comes from recent work in which researchers used multivariate pattern analysis of neuronal data recorded during performance of a delayed paired-associate task (Stokes et al., 2013). During initial stimulus presentation, PFC population activity encoded information related to the stimulus, yet this information did not persist into the memory period. During subsequent stimulus presentations, PFC population activity first encoded the physical properties of the new stimulus and shortly thereafter it switched to code whether it was a target or a distractor. Thus, PFC does not maintain stimulus information in working memory per se, yet it has access to that information and can reliably encode whether subsequent stimuli are targets or distractors.

Our own work has demonstrated further evidence that PFC is not necessarily involved in maintaining stimulus information in working memory (Lara and Wallis, 2014). We trained monkeys to perform a multi-item working memory task in which they had to remember the color of one or two colored squares. We used a large set of colors and the discriminations could be very difficult, often involving subtle changes in the shade of color. The difficulty of the discriminations required that monkeys maintain a very precise representation of the sample colors in working memory in order to successfully perform the task. Despite the difficulty of the task, monkeys could perform the task significantly above chance level. Surprisingly, however, we found that the overwhelming majority of PFC neurons failed to encode the color of the stimuli in working memory. Instead, the strongest signals reflected the passage of time and the spatial location of the stimuli. Both of these signals could play an important role in organizing behavior towards the performance of the task, but they do not reflect the contents of working memory.

On further analysis, we found that when monkeys had to maintain two colors in working memory, they tended to make small eye movements (microsaccades) to one or other of the items. These microsaccades had behavioral consequences and appeared to reflect covert attention. If the animals covertly attended an item, it was stored with a more precise representation in working memory. The animals appeared to be shifting their attention between the items in order to cope with the increased task difficulty. In this situation, neural activity strongly reflected the locus of covert attention. These results directly support the ideas put forward by Postle (2006). Even though the key requirement of the task was to maintain color information in working memory, there was very little evidence that PFC neurons encoded color. But this did not mean that PFC was uninvolved in the task. Instead PFC neurons encoded attentional control signals that helped improve the animals' performance.

In addition to the emerging neurophysiological evidence discussed above, a recent lesion study bolsters the case against the prevalent view of PFC function in working memory. Pasternak et al. (2015) trained monkeys to perform a delayed-match-to-sample task using random dot stimuli of varying motion coherence. The researchers found that lesions of the lateral PFC produced moderate deficits in the monkeys' ability to remember the direction of motion of stimuli presented on the contralesional side. However, this deficit did not depend on the specific features of the stimuli that led to the remembered direction of motion (e.g., motion coherence), indicating that PFC was not involved in coding the specifics of the motion stimulus. Furthermore, deficits were much more pronounced when the sample and test stimuli appeared in different locations compared to when they appeared in the same location. Thus, PFC lesions seemed to disrupt the ability of the monkeys to rapidly shift their attention at the time of the test. Pasternak and colleagues interpreted these results as evidence that PFC plays a role in attending to stimuli and accessing motion information stored in other areas.

# SENSORY CORTICES PLAY A CRITICAL ROLE IN WORKING MEMORY

If PFC is not responsible for storing information in working memory, then it is important to identify those brain areas that are responsible for this process. There is strong evidence from electrophysiological and functional imaging studies that sensory cortices play a crucial role (Pasternak and Greenlee, 2005). A large number of electrophysiology studies have examined single neuron activity in most sensory cortices including visual (Miller et al., 1993; Motter, 1994), auditory (Gottlieb et al., 1989), and even gustatory cortex (Lara et al., 2009). For example, working memory related activity has been reported in area V4 in a task where monkeys had to remember the color or luminance of a stimulus (Motter, 1994). A number of functional imaging studies have also reported working memory activity in sensory cortices. For example, in a study in which participants had to remember the orientation of a grating (Ester et al., 2009; Harrison and Tong, 2009; Serences et al., 2009; Emrich et al., 2013), orientation specific activation patterns were observed in the pooled activity of early visual areas V1–V4.

If posterior sensory areas are responsible for keeping information in working memory while PFC plays a role in attending to or selecting this information, then there must be a mechanism by which PFC and posterior sensory areas can interact. This assumption is not outlandish since it is known that PFC has reciprocal connections with nearly all sensory cortices (Pandya and Barnes, 1987). What is the nature of the interaction? One possibility is that PFC and posterior areas share information through long-range coupling of ongoing oscillatory activity present in both areas (Engel et al., 2001; Fries, 2009; Canolty and Knight, 2010). Indeed, there is a large body of work both in monkeys and in humans that has revealed an important role of oscillatory activity during working memory tasks (Vogel and Machizawa, 2004; McCollough et al., 2007; Ikkai et al., 2010; Johnson et al., 2011; Myers et al., 2014). For example, in monkeys, strong oscillatory activity in the local field potential (LFP) has been seen in lateral intra-parietal cortex during the performance of a delayed saccade task (Pesaran et al., 2002), and in V4 of monkeys performing a delayed match to sample task (Tallon-Baudry et al., 2004; Lee et al., 2005). There have also been reports of strong oscillatory activity in the LFP of PFC during the delay period (Siegel et al., 2009; Lara and Wallis, 2014) of delayed match to sample tasks.

In humans, extensive work using electroencephalography (EEG), electrocorticography (ECoG) and magnetoencephalography (MEG) has revealed increased ongoing oscillatory activity during working memory tasks both in frontal and posterior areas (for a review, see Roux and Uhlhaas, 2014). In a recent study, participants were asked to remember the spatial locations of either three red discs, three red discs while ignoring three blue discs or six red discs (Roux et al., 2012). In all conditions there was increased oscillatory MEG activity in the alpha and gamma frequency bands. In PFC, activity in the gamma-band (which is thought to reflect local processing; von Stein and Sarnthein, 2000) predicted the amount of task relevant information in working memory. A linear classifier using gamma-band activity from PFC could successfully classify trials with three targets and three distractors in the same category as trials with only three discs and not as six disc trials. Thus, the classifier correctly ignored the task irrelevant discs. In contrast, gamma-band activity in the inferior parietal lobule also reflected spatial information during the delay period, but the classifier failed to identify distractor trials as three item trials. Thus, it appears that while gamma-band activity in both PFC and parietal cortex reflects the stimuli currently in memory, only in PFC is the information discriminated as either task relevant or task irrelevant. A similar result was seen in monkeys where ventral intraparietal cortex population activity robustly encoded the number of target stimuli in a delayed-match-to-numerosity task even in the face of distractors (Jacob and Nieder, 2014). In contrast, PFC population briefly encoded distractors, but target numerosity information was quickly restored and the strength of the restored information predicted correct performance in a trial. 
Again, this suggests that PFC is not simply involved in the storage of information, but reflects control processes such as monitoring and selection.

# INTERACTIONS BETWEEN PFC AND SENSORY CORTEX

In order to fully understand the nature of the interaction between PFC and posterior sensory cortices, it is important to measure neural activity in both areas simultaneously. A number of recent studies have managed to do this during the performance of working memory tasks. A recent study examined the interaction between V4 and lateral PFC using simultaneous LFP and single neuron recordings in monkeys performing a visual working memory task (Liebe et al., 2012). In this study, researchers found that the theta-band phase locking value, a measure that quantifies the amount of synchrony between theta oscillations in V4 and PFC, was significantly enhanced during the delay period. The phase of PFC oscillations led V4 by about 15 ms, which suggests that the observed coupling is asymmetric and sufficiently fast to support functional interactions between the two areas. Indeed, when they looked at the timing of the spikes from each area, they found that during the delay, spike times were reliably locked to the phase of the ongoing delta-band oscillations in the more distant area (i.e., PFC spikes were phase locked to V4 delta-band LFP and vice versa). Importantly, these effects were stronger in trials in which monkeys successfully maintained information in working memory, and weaker in trials in which monkeys failed to remember the stimulus. These results suggest that synchronous activity in PFC and V4 could provide a mechanism through which information is shared between these two distant areas during working memory maintenance.
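The phase locking value and the phase lead described above can be made concrete with a small simulation. The following is only a minimal sketch under stated assumptions, not the analysis pipeline of Liebe et al. (2012): the 6 Hz frequency, the noise level, the signal names and the FFT-based analytic signal (used here in place of a band-pass filter followed by a Hilbert transform) are all illustrative choices.

```python
import numpy as np

def analytic_signal(x):
    """Return the analytic signal of a real 1-D array using the FFT."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                     # keep the DC term once
    if n % 2 == 0:
        weights[1:n // 2] = 2.0          # double the positive frequencies
        weights[n // 2] = 1.0            # keep the Nyquist term once
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def phase_difference(x, y):
    """Instantaneous phase of x minus instantaneous phase of y (radians)."""
    return np.angle(analytic_signal(x)) - np.angle(analytic_signal(y))

def phase_locking_value(x, y):
    """Length of the mean phase-difference vector: 0 = no locking, 1 = perfect."""
    return np.abs(np.mean(np.exp(1j * phase_difference(x, y))))

def mean_phase_lead(x, y):
    """Mean phase by which x leads y (radians); positive means x leads."""
    return np.angle(np.mean(np.exp(1j * phase_difference(x, y))))

# Hypothetical signals: a 6 Hz "PFC" oscillation that leads "V4" by 15 ms.
rng = np.random.default_rng(0)
fs, freq, lead_s = 1000.0, 6.0, 0.015
t = np.arange(2000) / fs
pfc = np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(t.size)
v4_locked = np.sin(2 * np.pi * freq * (t - lead_s)) + 0.1 * rng.standard_normal(t.size)
v4_unlocked = np.sin(2 * np.pi * 7.0 * t) + 0.1 * rng.standard_normal(t.size)

plv_locked = phase_locking_value(pfc, v4_locked)
plv_unlocked = phase_locking_value(pfc, v4_unlocked)
# Convert the mean phase lead into a time lead at the oscillation frequency.
lead_ms = 1000.0 * mean_phase_lead(pfc, v4_locked) / (2 * np.pi * freq)
```

In this toy case the locked pair yields a phase locking value near one and a recovered lead close to the simulated 15 ms, whereas the pair oscillating at different frequencies yields a value near zero. Real LFP data would first require band-pass filtering around the frequency of interest.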

A similar flow of information was recently observed between PFC and posterior parietal cortex (Salazar et al., 2012). In this study, researchers made simultaneous spike and LFP recordings from PFC and posterior parietal cortex while monkeys performed a spatial delayed match to sample task. They calculated a coherence selectivity index designed to measure how much mutual information about the memorized stimulus there is between PFC and parietal electrodes. An increase in mutual information about sample identity and location was observed during the delay period. Furthermore, Wiener-Granger causality showed that the flow of information was primarily from parietal cortex to PFC. These results are consistent with the idea that the storage of information is taking place in sensory cortex and PFC can access that information through synchronization of oscillatory field potentials. A similar phenomenon was reported in a recent study where researchers simultaneously recorded neural activity from lateral PFC and lower level visual areas MT and MST while monkeys performed a delayed match to sample task (Mendoza-Halliday et al., 2014). During the delay period, increased selective spiking activity was seen in MST and lateral PFC but not in MT. This sustained spiking could conceivably reflect the maintenance of stimulus information in working memory in both brain areas. However, an alternative possibility is that MST maintains a strong representation of the stimulus in working memory, which is then read out and integrated with other higher order signals by PFC. The behavioral task does not permit these two possibilities to be distinguished. However, even though there was no increase in spiking activity in MT during the delay period, stimulus information was present in the LFP amplitude from this area. 
Moreover, there was increased synchrony between low frequency LFP oscillations in MT and lateral PFC spikes, consistent with a top-down interaction between the PFC and early sensory neurons during the maintenance period.
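The logic of the Wiener-Granger causality analysis mentioned above can be illustrated with a simple time-domain sketch. This is an assumption-laden toy example rather than the method of Salazar et al. (2012): the autoregressive coefficients, the model order and the two simulated signals are hypothetical, and the measure is the log ratio of residual variances from ordinary least-squares fits.

```python
import numpy as np

def lagged_design(series, order):
    """Columns are series[t-1], ..., series[t-order] for t = order .. n-1."""
    n = len(series)
    return np.column_stack([series[order - k:n - k] for k in range(1, order + 1)])

def granger_causality(source, target, order=2):
    """ln(restricted residual variance / full residual variance).

    The restricted model predicts the target from its own past only; the full
    model adds the past of the source. Larger values mean that the source's
    past improves the prediction of the target.
    """
    y = target[order:]
    own_past = lagged_design(target, order)
    both_pasts = np.column_stack([own_past, lagged_design(source, order)])

    def residual_variance(predictors):
        X = np.column_stack([np.ones(len(y)), predictors])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return np.var(y - X @ beta)

    return np.log(residual_variance(own_past) / residual_variance(both_pasts))

# Hypothetical signals in which "parietal" activity drives "PFC" one step later.
rng = np.random.default_rng(1)
n = 3000
parietal = np.zeros(n)
pfc = np.zeros(n)
for t in range(1, n):
    parietal[t] = 0.5 * parietal[t - 1] + rng.standard_normal()
    pfc[t] = 0.3 * pfc[t - 1] + 0.6 * parietal[t - 1] + rng.standard_normal()

gc_parietal_to_pfc = granger_causality(parietal, pfc)
gc_pfc_to_parietal = granger_causality(pfc, parietal)
```

Because the simulated influence runs only from the parietal signal to the PFC signal, the first measure is clearly positive while the second stays near zero. Published analyses of LFP data typically use spectral variants of this measure and statistical testing against surrogate data.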

Long-range synchronization of oscillatory field potentials is likely not the whole story. There is also the possibility of a more direct interaction via cortico-cortical synaptic connections between PFC and posterior sensory neurons (Petrides and Pandya, 1984). In a recent study, Crowe et al. (2013) recorded single neuron activity simultaneously from PFC and posterior parietal cortex neurons while monkeys were engaged in a categorization task. Both PFC and parietal neurons have been shown to play an important role in categorization tasks of this kind (Freedman et al., 2001; Miller et al., 2002; Wallis and Miller, 2003; Freedman and Assad, 2006; Ferrera et al., 2009; Swaminathan and Freedman, 2012). They found that the pattern of firing in PFC was strongly correlated with the pattern of firing in posterior parietal cortex at different time lags. Crucially, there was significantly stronger correlation between the pattern of PFC activity at one time and PPC activity at a later time, compared to the opposite direction. These results reflect selective top-down transmission of information from prefrontal to parietal neurons via a mechanism that does not necessarily involve synchronization of ongoing oscillatory activity. Although these results were found in a categorization task, a similar phenomenon could be at play during working memory. Furthermore, the exact direction of the interaction may depend on the precise cognitive process being performed. For example, accessing sensory information may involve information flowing from parietal cortex to PFC ("bottom-up"), while selective attention and filtering may involve information flowing in the reverse direction ("top-down"). Recent studies of sensorimotor processing have shown such bidirectional interactions within the fronto-parietal network (Siegel et al., 2015).
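The asymmetry in time-lagged correlations described above can be sketched in a few lines. The example below is a simplified illustration and not the analysis of Crowe et al. (2013): the two time series stand in for a decoded signal from each area, and the 5-sample delay and noise level are assumptions chosen for clarity.

```python
import numpy as np

def lagged_correlation(source, target, lag):
    """Pearson correlation between source[t] and target[t + lag], lag >= 0."""
    if lag == 0:
        return np.corrcoef(source, target)[0, 1]
    return np.corrcoef(source[:-lag], target[lag:])[0, 1]

# Hypothetical signals: the "PPC" series is a delayed, noisy copy of "PFC".
rng = np.random.default_rng(2)
n, delay = 2000, 5
pfc_signal = rng.standard_normal(n)
ppc_signal = 0.5 * rng.standard_normal(n)
ppc_signal[delay:] += pfc_signal[:-delay]

# PFC now versus PPC later ("top-down" direction).
top_down = lagged_correlation(pfc_signal, ppc_signal, delay)
# PPC now versus PFC later ("bottom-up" direction).
bottom_up = lagged_correlation(ppc_signal, pfc_signal, delay)
```

Here the correlation is high only when the PFC series is paired with the later PPC series, which is the signature interpreted as top-down transmission. With autocorrelated neural signals, the comparison between the two directions would need to be evaluated across a range of lags and against an appropriate null distribution.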

One potential challenge to the view outlined in this review is the recent work by Ester et al. (2015). They required subjects to maintain very precise representations of oriented gratings in working memory, and showed that orientation information could be decoded from the BOLD signal in localized frontoparietal subregions. However, an important caveat in interpreting these kinds of results is that information can be decoded even when neurons are not representing that information. For example, orientation information can be decoded from the retina in principle even though no individual neuron is representing orientation information. In an analogous way, it is possible that orientation information could be decoded from the pattern of activity in PFC neurons responsible for activating the correct representation in posterior sensory cortex even though individual PFC neurons are not tuned for this information in their firing rate. On the other hand, if PFC neurons responsible for precise sensory representations are localized to small subregions it is possible that these representations are missed by standard sampling methods used in single-unit neurophysiology studies. This possibility could be excluded by recording neural activity at multiple scales, such as combining ECoG and single unit methods (Lewis et al., 2015).

# CONCLUSION

In recent years there has been a steady stream of work that has challenged the widely held view that PFC stores task relevant information in working memory. Early evidence against this view came mainly from fMRI studies in humans and it culminated in the alternate view, most clearly enunciated by Postle (2006), that sensory information is maintained in working memory by the same sensory neurons that represent that information when it is present in the sensory environment. The role of PFC is not to store information in working memory, but rather to actively focus attention on the relevant sensory representation, select information and perform executive functions that are necessary to control the cognitive processing of the information (Postle, 2006). There is growing neurophysiological and lesion evidence in support of this view.

More work is needed to shed light on the precise nature of the interaction between PFC and sensory areas during working memory. The use of modern large-scale recording methods (Kipke et al., 2008) and analysis techniques (Cunningham and Yu, 2014) has the potential to allow the tracing of the flow of information from sensory areas to PFC and back again during working memory tasks. Equally important, however, is to lay in place a theoretical framework that will allow the interpretation of these data. One promising idea is to try to understand how neuronal activity is related to the internal state of the brain above and beyond any coding for external factors. This approach forms the basis of the dynamical-systems framework, which has recently been adopted to understand the neural mechanisms underlying motor control (Shenoy et al., 2013). Given that executive processes like working memory and attention are, by their very nature, internal, dynamical processes, using a dynamical-systems approach in their study has the potential to shed light on how the brain internally generates (i.e., without relying on external inputs) the patterns of activity that are required for such a complex repertoire of executive abilities.

#### FUNDING

This work was supported by NIMH grant R01-MH097990 and NIDA grant R01-DA19028 to JDW.




**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Lara and Wallis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multi-Voxel Decoding and the Topography of Maintained Information During Visual Working Memory

#### Sue-Hyun Lee<sup>1,2</sup> and Chris I. Baker<sup>2</sup>\*

<sup>1</sup> Department of Bio and Brain Engineering, College of Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, <sup>2</sup> Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA

The ability to maintain representations in the absence of external sensory stimulation, such as in working memory, is critical for guiding human behavior. Human functional brain imaging studies suggest that visual working memory can recruit a network of brain regions from visual to parietal to prefrontal cortex. In this review, we focus on the maintenance of representations during visual working memory and discuss factors determining the topography of those representations. In particular, we review recent studies employing multi-voxel pattern analysis (MVPA) that demonstrate decoding of the maintained content in visual cortex, providing support for a "sensory recruitment" model of visual working memory. However, there is some evidence that maintained content can also be decoded in areas outside of visual cortex, including parietal and frontal cortex. We suggest that the ability to maintain representations during working memory is a general property of cortex, not restricted to specific areas, and argue that it is important to consider the nature of the information that must be maintained. Such information-content is critically determined by the task and the recruitment of specific regions during visual working memory will be both task- and stimulus-dependent. Thus, the common finding of maintained information in visual, but not parietal or prefrontal, cortex may be more of a reflection of the need to maintain specific types of visual information and not of a privileged role of visual cortex in maintenance.

#### Keywords: working memory, short term memory, multivoxel pattern analysis (MVPA), visual imagery, visual working memory, fMRI

#### Edited by:

Zsuzsa Kaldy, University of Massachusetts Boston, USA

#### Reviewed by:

Bradley R. Postle, University of Wisconsin-Madison, USA; Thomas B. Christophel, Charité – Universitätsmedizin Berlin, Germany; Hoi Chung Leung, Stony Brook University, USA

#### \*Correspondence:

Chris I. Baker, bakerchris@mail.nih.gov

Received: 17 September 2015; Accepted: 08 January 2016; Published: 15 February 2016

#### Citation:

Lee S-H and Baker CI (2016) Multi-Voxel Decoding and the Topography of Maintained Information During Visual Working Memory. Front. Syst. Neurosci. 10:2. doi: 10.3389/fnsys.2016.00002

#### INTRODUCTION

Working memory commonly refers to our ability to maintain and manipulate stimulus representations, typically for a short period of time, in the absence of the ongoing presence of that stimulus (Baddeley and Hitch, 1974). An everyday example is holding a phone number in mind prior to pressing the buttons on the phone. In vision, working memory can involve diverse types of maintained content from complex forms such as faces and objects to fine visual details such as specific orientations or colors. The neural basis of visual working memory has long been the subject of debate and while multiple brain areas, from visual cortex, including primary visual cortex (V1) and the middle temporal area (MT), to the parietal, temporal and prefrontal cortices have been implicated in visual working memory (Wager and Smith, 2003), the functional roles these regions play have been controversial. Typically, theories have distinguished different processes that might be involved in visual working memory (Eriksson et al., 2015), making a distinction between stimulus representation or storage and executive or top down control, and have tried to map those distinctions onto specific brain regions. Various accounts posit that there is a working memory system separate from other memory or perception systems (e.g., Baddeley, 2012), that prefrontal cortex is involved in both maintenance and executive control (e.g., Funahashi et al., 1989, 1993; Chafee and Goldman-Rakic, 1998; Constantinidis et al., 2001), or that information is maintained in posterior cortex with prefrontal cortex primarily involved in top-down control of those regions (for recent review, see D'Esposito and Postle, 2015). In this review, we will focus on recent evidence from human functional magnetic resonance imaging (fMRI) studies identifying the substrates of maintained representations during visual working memory.

The terms "visual working memory" and "visual short-term memory" are often used interchangeably. One of the key components of working memory is indeed the short-term maintenance of visual representations. However, working memory is often used to describe not just maintenance of representations, but internal manipulation of those representations as well (for recent discussion, see Marois, 2015; Postle, 2015a). In this review, we will refer to "visual working memory", following many of the studies that we cite, although our primary focus is on the maintenance of visual representations. Such maintenance can occur in many different contexts. For example, a participant might be asked to remember a stimulus that is briefly flashed on the screen (e.g., Serences et al., 2009). Alternatively, a participant might be cued to recall a recently presented stimulus, out of two or more alternatives, and then asked to remember that stimulus over a delay period (e.g., Harrison and Tong, 2009). However, the representations that are being maintained need not be accessed from recent sensory experience, but can also be retrieved from long-term memory, allowing further manipulation of the remembered content in such a way that makes it useful for ongoing behavior. In this light, visual working memory may share mechanisms with visual imagery (Albers et al., 2013; Tong, 2013) and even the accessing of conceptual knowledge (Martin, 2007, 2015).

In this review, we will highlight that to understand the engagement of particular regions during working memory, it is important to consider the nature of the stimulus representations that are being maintained. We will use the term "information" to refer to the specific aspects of the presented stimulus that are relevant to task performance and must therefore be remembered over the delay period. Thus, "information" does not necessarily refer to the entire stimulus itself or even to sensory properties of the stimulus. The maintained information could be one aspect of a visually presented stimulus (e.g., color, but not orientation, of a grating stimulus), or an abstraction from the stimulus (e.g., category). Further, the same information could be contained in very different underlying representations. For example, stimulus position could be maintained either in a visual representation (e.g., in V1) or a motor representation for an upcoming eye movement.

The fMRI studies we focus on have employed multivoxel pattern analysis (MVPA) techniques to decode maintained representations during the delay periods of working memory tasks. By "decoding" we simply mean that the BOLD response measured with fMRI has been used to infer the information that is represented. Many of these studies have revealed maintained representations in visual cortex (e.g., V1-V4, MT), supporting a role of sensory, not prefrontal, cortex in maintenance. However, there is some evidence for maintenance outside of visual cortex (including posterior parietal and prefrontal cortex) and here, we suggest that the ability to maintain information is a general property of cortex, not limited to specific regions. We argue that the predominance of studies revealing maintained representations in early visual cortex reflects the stimuli and tasks that have been probed. Specifically, the recruitment of any region will reflect the particular information that must be maintained as determined by the task context and the behavioral goals. Thus, working memory is best understood as a highly distributed process wherein information can be maintained in any system engaged in the initial perceptual processing. This includes not just sensory cortex, but any region contributing to the initial percept, including parietal and frontal areas.

# DECODING MAINTAINED REPRESENTATIONS

The notion that information is maintained in sensory regions during visual working memory has been referred to as the "sensory recruitment" hypothesis (Pasternak and Greenlee, 2005). Early support for this view came from perceptual discrimination studies in which participants had to detect whether a sample stimulus (varying in spatial frequency, orientation, or motion stimulus) matched a test stimulus presented after a brief delay (Dupont et al., 1998; Magnussen and Greenlee, 1999). Irrelevant stimuli presented during the delay were found to interfere with discrimination performance in a feature-selective manner, suggesting that the mechanisms involved in maintaining the representation of the sample stimulus are linked to those involved in perceptual processing (Magnussen et al., 1991; Magnussen and Greenlee, 1992).

However, physiology (e.g., Funahashi et al., 1989, 1993; Miller et al., 1996; Constantinidis et al., 2001) and early fMRI (e.g., Zarahn et al., 1997; Courtney et al., 1998; Jha and McCarthy, 2000; Leung et al., 2002) studies shifted the emphasis away from sensory cortex to prefrontal cortex with the observation of elevated activity during the delay period that spanned intervening stimuli. While it was appealing to equate maintained activity with maintained representations, the mere presence of elevated activity does not indicate the nature of the underlying processing (Curtis and D'Esposito, 2003; Sreenivasan et al., 2014a). Further, such increased activity can also be found in posterior brain areas (Ranganath and D'Esposito, 2005) for both simple (Greenlee et al., 2000) and complex (Courtney et al., 1997; Druzgal and D'Esposito, 2003; Ranganath et al., 2004; Oh and Leung, 2010) visual features.

An alternative approach, focusing on the capacity limit of working memory, highlighted the potential role of parietal cortex. In particular, regions in parietal cortex exhibit activity which tracks the number of items held in memory and correlates with apparent capacity limitations (Linden et al., 2003; Todd and Marois, 2004, 2005; Vogel and Machizawa, 2004; Xu and Chun, 2006; Harrison et al., 2010). Further, Mitchell and Cusack (2008) found correlation with capacity-based regressors not only in parietal cortex but also in some prefrontal areas. While these findings suggest a link between parietal (and possibly prefrontal) cortex and working memory capacity, they do not indicate that the representations are maintained in these regions.

Recent fMRI studies have now provided more compelling evidence for the sensory recruitment model by focusing on whether the responses in a given region are specific to the maintained information (D'Esposito and Postle, 2015). Such studies have taken advantage of the development of MVPA techniques (for reviews, see Norman et al., 2006; Serences and Saproo, 2012; Haynes, 2015), which focus on the patterns of response across voxels rather than the average magnitude (see **Table 1** for a summary of studies). In these studies, the BOLD responses in a given region are used to infer or "decode" the nature of the underlying representation. For example, Harrison and Tong (2009) presented participants with two serially presented gratings, followed by a retro-cue ("1" or a "2") indicating whether they had to remember the first or second grating. A test grating was presented after a further delay of 11 s and participants had to indicate whether it was rotated clockwise or anticlockwise relative to the cued grating. There were three key findings. First, during the delay period, the patterns of BOLD response in early visual cortex (V1-V4) could be used to decode the orientation of the grating held in memory, suggesting that early visual cortex holds a specific representation of the maintained orientation. Second, this decoding was possible even when there was no elevated activity during the delay period, suggesting that elevated activity is not necessary for the maintenance of orientation information. Third, the patterns of response observed during the delay period were similar to those evoked by physically presented gratings, suggesting that the maintained representations are strongly related to perceptual representations in these areas.
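The distinction between pattern information and average response magnitude can be illustrated with simulated data. The following is a minimal sketch under stated assumptions and does not reproduce the analysis of any study cited here: the voxel patterns are synthetic, the two conditions stand in for two remembered orientations, and a correlation-based nearest-centroid classifier with leave-one-run-out cross-validation stands in for whichever classifier a given study used.

```python
import numpy as np

rng = np.random.default_rng(3)
n_runs, n_voxels = 8, 100

# Hypothetical condition-specific voxel biases with identical mean activity.
templates = 0.7 * rng.standard_normal((2, n_voxels))
templates -= templates.mean(axis=1, keepdims=True)

# patterns[run, condition, voxel]: template plus independent noise per run.
patterns = templates[None, :, :] + rng.standard_normal((n_runs, 2, n_voxels))

def correlation(a, b):
    """Pearson correlation between two voxel patterns."""
    return np.corrcoef(a, b)[0, 1]

def leave_one_run_out_accuracy(data):
    """Classify each held-out pattern by its most correlated training centroid."""
    runs, conditions, _ = data.shape
    correct = 0
    for test_run in range(runs):
        training = np.delete(data, test_run, axis=0)
        centroids = training.mean(axis=0)      # conditions x voxels
        for true_label in range(conditions):
            held_out = data[test_run, true_label]
            similarity = [correlation(held_out, c) for c in centroids]
            correct += int(np.argmax(similarity) == true_label)
    return correct / (runs * conditions)

accuracy = leave_one_run_out_accuracy(patterns)
# Difference in average magnitude between the two conditions.
mean_difference = abs(patterns[:, 0].mean() - patterns[:, 1].mean())
```

In this simulation the two conditions are classified well above the 50% chance level even though their average response magnitude is essentially the same, which mirrors the logic of decoding maintained content in the absence of elevated delay-period activity.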

Support for the maintenance of representations in early visual cortex has also been provided by an alternative approach in which the response properties of individual voxels are explicitly modeled. For example, Ester et al. (2013) fit a model (often termed an encoding model) of orientation selectivity, based on a set of eight orientation-selective response functions ("channels"), to each voxel in early visual areas (following the approach of Brouwer and Heeger, 2009, 2011). Then, based on the response pattern across voxels (in independent data), they could reconstruct images reflecting the information content in a given area during the delay period of the task. This analysis revealed graded response profiles in V1 and V2 that peaked for the remembered orientation and were only present when explicit memory was required.
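The two steps of such an encoding-model analysis (estimating channel weights for each voxel, then inverting the model on independent data) can be sketched as follows. This is an illustrative toy example, not the implementation of Ester et al. (2013) or Brouwer and Heeger: the shape of the basis functions (half-wave rectified cosines raised to the fifth power), the simulated voxel weights, the noise level and all variable names are assumptions.

```python
import numpy as np

N_CHANNELS = 8
CENTERS = np.arange(N_CHANNELS) * 180.0 / N_CHANNELS   # channel centers (deg)

def channel_responses(orientations_deg, power=5):
    """Idealized channel responses (trials x channels) for given orientations.

    Orientation is periodic over 180 degrees, hence the doubled angle.
    """
    orientations = np.asarray(orientations_deg, dtype=float)
    delta = np.deg2rad(orientations[:, None] - CENTERS[None, :])
    return np.maximum(np.cos(2.0 * delta), 0.0) ** power

rng = np.random.default_rng(4)
n_voxels, n_train, n_test, noise_sd = 60, 200, 20, 0.5
true_weights = rng.standard_normal((N_CHANNELS, n_voxels))  # hypothetical

# Training data: voxel responses are weighted sums of channel responses.
train_orientations = rng.uniform(0.0, 180.0, n_train)
C_train = channel_responses(train_orientations)
B_train = C_train @ true_weights + noise_sd * rng.standard_normal((n_train, n_voxels))

# Step 1: estimate the channel-to-voxel weights by least squares.
W_hat, *_ = np.linalg.lstsq(C_train, B_train, rcond=None)   # channels x voxels

# Independent test data for a single remembered orientation.
remembered = 67.5
C_test = channel_responses(np.full(n_test, remembered))
B_test = C_test @ true_weights + noise_sd * rng.standard_normal((n_test, n_voxels))

# Step 2: invert the estimated weights to recover channel responses.
C_hat = B_test @ np.linalg.pinv(W_hat)                       # trials x channels
profile = C_hat.mean(axis=0)
peak_orientation = CENTERS[np.argmax(profile)]
```

The reconstructed channel profile is graded and peaks at the channel centered on the simulated remembered orientation. In real data, the training and test sets would come from separate scanning runs so that the weights are estimated independently of the delay-period responses being reconstructed.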

The ability to decode maintained orientation information in early visual cortex during visual working memory has now been replicated multiple times, supporting the three key findings described above (Ester et al., 2009, 2015; Serences et al., 2009; Sneve et al., 2012; Albers et al., 2013; Pratte and Tong, 2014). Further, the precision of the orientation representations in early visual cortex, measured as memory load is varied, reflects behavioral performance (Ester et al., 2013; see also Emrich et al., 2013). Beyond orientation, decoding of maintained representations has also been reported in early visual cortex for contrast (Xing et al., 2013), location (Sprague et al., 2014), motion (Riggall and Postle, 2012; Emrich et al., 2013), color (Serences et al., 2009), and color patterns (Christophel et al., 2012, 2015).

In all of these cases, the information that can be decoded during visual working memory is the kind of information (e.g., orientation, color, contrast) that is well represented by the underlying stimulus feature-selectivity in early visual cortex. Similarly, other areas of visual cortex with more specialized feature-selectivity during perception have demonstrated maintenance of information corresponding to that selectivity. For example, decoding of simple (Riggall and Postle, 2012; Emrich et al., 2013) and complex motion information (Christophel and Haynes, 2014) has been reported in the human MT complex (MT+) that is highly selective for stimulus motion. Further, in studies that have tested working memory for complex images such as objects, scenes and faces, decoding of maintained information has been reported in category-selective occipitotemporal cortex (Linden et al., 2012; Han et al., 2013; Lee et al., 2013; Nelissen et al., 2013; Sreenivasan et al., 2014b). However, it is important to note that in many of these cases while the task required within-category information (e.g., individual faces or scenes), decoding was at the level of category (e.g., faces vs. scenes, see **Table 1**). Thus, the ability to maintain representations appears to be a general property of visual cortex, with regions maintaining representations of those stimuli that match their underlying stimulus-selectivity.

It is important to realize, however, that the maintenance of content during delay periods is not simply a passive reflection of stimulus properties. The nature of the information maintained is critically dependent on the task, which determines the specific information that is required for successful performance. For example, Serences et al. (2009) presented colored oriented gratings and varied whether color or orientation was relevant for the discrimination to be made after the delay. They found that both orientation and color could be decoded from V1 during the delay, but only when that specific feature information was task-relevant. Similarly, while there is some evidence that orientation information is maintained throughout V1, not just in the part of the retinotopic map corresponding to the stimulus location in the visual field (Ester et al., 2009, 2015), location-specific orientation information can be decoded when both location and orientation are task-relevant (Pratte and Tong, 2014). Consistent with this, Lee et al. (2013) reported decoding of object identity in high-level visual cortex only when the visual properties of the presented stimuli were task-relevant.


#### TABLE 1 | Summary of studies demonstrating multi-voxel decoding of information during visual working memory.


Studies are organized first by date and then alphabetically by first author. Across studies, a wide range of visual stimuli have been employed, from oriented gratings to high-level stimuli such as faces, objects and scenes. We list both the task-relevant information as well as the information that could be decoded. In many cases, these are the same, but there are also some studies in which the level of decoding differed from the task-relevant information. For example, in several of the studies employing high-level visual stimuli, the task required maintenance of information about within-category exemplars (e.g., different faces or scenes), but the decoding was at the level of category (e.g., faces vs. scenes). In the final column, we list the major regions in which information could be decoded. Studies differed in how regions were identified (e.g., region-of-interest vs. searchlight analyses) and we adopt the level of description provided in the published studies. We ascribe decoding to particular functional regions (e.g., V1, MT, FFA) only if those regions were specifically localized. Further, note that we do not give any information about tested regions in which information could not be decoded. For this information, we refer readers back to the original cited papers. EBA, Extrastriate Body Area; FFA, Fusiform Face Area; IPS0–4, retinotopically-defined regions in and around the intra-parietal sulcus (iIPS, inferior intra-parietal sulcus; sIPS, superior intra-parietal sulcus); LOC, object-selective Lateral Occipital Complex; LO1, lateral occipital area 1; MT+, motion-selective areas including both the middle temporal (MT) and medial superior temporal (MST) areas; OFA, Occipital Face Area; PCS, precentral sulcus; PPA, Parahippocampal Place Area; RSC, scene-selective Retrosplenial Complex; TOS, scene-selective region near the Transverse Occipital Sulcus; V1-V4, retinotopically defined regions of early visual cortex.

In contrast to the ability to decode maintained information in visual cortex during working memory, studies investigating parietal and frontal cortex have often failed to find any evidence for maintained representations. For example, while Riggall and Postle (2012) could decode maintained information about motion direction in early visual cortex and MT+, this was not possible in frontal and parietal areas, even when selecting those areas that showed elevated activity during the delay. Similarly, Emrich et al. (2013) found that the ability to decode multiple items in memory decreased significantly with increasing load in early visual cortex and MT+, but could not decode remembered items in parietal cortex, even in those areas that showed load-sensitive delay period activity. These results argue strongly for the sensory recruitment model and suggest that neither elevated nor load-sensitive delay activity is a sufficient marker for maintained representations in working memory.

However, these failures to find evidence for maintained representations outside visual cortex should be treated cautiously since some studies have reported positive results (Christophel et al., 2012, 2015; Jerde et al., 2012; Lewis-Peacock and Postle, 2012; Han et al., 2013; Christophel and Haynes, 2014; Naughtin et al., 2014; Sprague et al., 2014; Ester et al., 2015). For example, in studies of working memory for colored patterns and motion flow patterns, Christophel and colleagues (Christophel et al., 2012, 2015; Christophel and Haynes, 2014) reported decoding of maintained information not only in early visual cortex but also in posterior parietal cortex. Further, decoding of stimulus position has been reported in both parietal and frontal cortex (Jerde et al., 2012; Sprague et al., 2014). While these results appear to disagree with the sensory recruitment model, they are potentially explained by considering the nature of the information that must be maintained and the underlying functional properties of the regions. Specifically, the novel stimuli employed by Christophel and colleagues are defined by the relative spatial position of the color or moving elements, precisely the kind of information that parietal cortex is generally thought to process during perception (Kravitz et al., 2011). Similarly, stimulus position is well represented in parietal and frontal cortex, related to sensory attention and motor behavior, making these regions a good substrate for maintaining representations of position in addition to early visual cortex. Taking into account that information may be maintained in brain regions more directly concerned with action, it has been suggested that "sensorimotor recruitment" rather than "sensory recruitment" may be a more appropriate way to think about maintained representations (D'Esposito and Postle, 2015).

Earlier we highlighted that the ability to maintain representations appears to be a general property of visual cortex. Given the evidence just discussed, it may be that this ability is not limited to visual cortex, but that any particular cortical region can be recruited for maintenance, depending on the nature of the information maintained. To test this idea, we presented participants sequentially with two visual objects before presenting a retro-cue (indicating which sample to hold in memory) and then asked them to perform one of two different tasks after a delay period (Lee et al., 2013). In the visual task, participants were asked to indicate whether an object fragment presented after the delay belonged to the cued object or not, requiring the maintenance of visual features. In contrast, in the non-visual task, participants were asked to indicate whether a whole object presented after the delay was from the same subcategory or not, requiring the maintenance of the name or subcategory of the object. A separate behavioral experiment confirmed the nature of the information being maintained in the two tasks: visual object distractors presented in the delay period impaired performance on the visual task more than the non-visual task, and word distractors showed the opposite pattern. During the maintenance of visual properties, we found that object identity could be decoded from occipitotemporal but not prefrontal cortex. In contrast, during the maintenance of non-visual properties (object category or name), we found that object identity could be decoded from prefrontal but not occipitotemporal cortex. These results confirm that information can be maintained in both prefrontal and visual cortex, but this maintenance is task-dependent and is stronger when the nature of the information matches the underlying functional properties of the region, even for the same sample object. Further, the magnitude of activity in both regions was not modulated by task, providing further evidence that the magnitude of response during the delay period is dissociable from the presence or absence of maintained information.

One key prediction of the suggestion that information is maintained in regions that have functional properties matching the nature of that information is that there should be a correspondence between regions engaged during working memory and those engaged during perception of the same stimuli. For example, we suggested above that the decoding of maintained representations in posterior parietal cortex reported by Christophel et al. (2012, 2015) might reflect the complex visuospatial nature of their stimuli. We would therefore predict that those same regions should show strong decoding of the patterns during perception. Unfortunately, this was not tested in those studies. Similarly, it is unclear whether the parietal and frontal regions reported by Ester et al. (2015) also show decoding of orientation during perception.

More generally, it is possible that any region containing stimulus information during perception could maintain that information during working memory. In this context it is important to consider that, with sufficient power, stimulus-related responses for a simple visual stimulation plus attention control task are observed in the vast majority of the brain (Gonzalez-Castillo et al., 2012). If information can be widely distributed during perception, then the same may be true of maintenance during working memory. The failure to find more distributed maintained representations could reflect lack of power. As is always the case, the current null results should be treated very cautiously. In our own work, showing task-dependent decoding during the delay in occipitotemporal and prefrontal cortex (Lee et al., 2013), the critical result is the relative strength of decoding, not the presence or absence of decoding in either task.

Overall, multivoxel decoding studies have provided strong support for the role of visual cortex in the maintenance of information during visual working memory. However, the ability to maintain representations is not just limited to visual cortex and may be a general property of cortex with the nature of the information maintained determining which regions are engaged. In some cases (e.g., position, orientation), the information may be well represented in multiple regions and the decoding of maintained content may be highly distributed. In other cases (e.g., faces, objects) the information may be maintained only in regions with more specialized functional properties. Critically, the ability to maintain information is dissociable from the presence or absence of delay activity and elevated activity may reflect separate functions related to attention, motor preparation or executive control.

#### LIMITATIONS OF MULTIVOXEL DECODING

Despite the advantages of decoding approaches for the study of maintenance during visual working memory, we need to be very cautious in interpreting the results (for discussion, see Serences and Saproo, 2012; Haynes, 2015).

First, although MVPA can provide evidence that there are distinct representations during visual working memory, it does not indicate what the nature of those representations is (Sligte et al., 2013). For example, Christophel and Haynes (2014) demonstrated decoding of maintained information about motion flowfields in MT+, posterior parietal cortex and somatosensory cortex. It is unlikely that the underlying neural representations are similar in these three areas, but all three areas show distinct responses to the different flowfields that may reflect different aspects of the stimuli or associated cognitive processing.

Second, the success of MVPA depends on the spatial arrangement of responses across voxels and may require the presence of large-scale maps (Freeman et al., 2011). Thus in V1, properties such as position and orientation can be readily decoded. The failure to find decoding for particular information in a given region could simply reflect heterogeneous organization of that information across the cortex rather than its absence.
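The dependence on spatial arrangement can be illustrated with a toy simulation. The sketch below is not an analysis from any of the cited studies: the voxel counts, neuron counts, noise level and the correlation-based nearest-centroid classifier are all assumptions chosen for illustration. It compares a "clustered" layout, in which all neurons in a voxel share a preference, with an "intermixed" layout, in which preferences largely cancel within each voxel.

```python
import numpy as np


def voxel_bias(n_voxels, n_neurons, clustered, rng):
    """Mean preference (+1 or -1) of the neurons pooled in each voxel."""
    if clustered:
        # Large-scale map: all neurons within a voxel share one preference.
        return rng.choice([-1.0, 1.0], size=n_voxels)
    # Fine-scale intermixing: preferences largely cancel within a voxel.
    return rng.choice([-1.0, 1.0], size=(n_voxels, n_neurons)).mean(axis=1)


def simulate_runs(bias, n_runs, rng, noise_sd=1.0):
    """One noisy pattern per condition per run; conditions differ in the sign of the bias."""
    labels = np.tile([0, 1], n_runs)
    runs = np.repeat(np.arange(n_runs), 2)
    signal = np.where(labels[:, None] == 0, bias, -bias)
    data = signal + rng.normal(scale=noise_sd, size=signal.shape)
    return data, labels, runs


def leave_one_run_out_accuracy(data, labels, runs):
    """Correlation-based nearest-centroid classifier, cross-validated over runs."""
    n_correct = 0
    for run in np.unique(runs):
        train = runs != run
        centroids = [data[train & (labels == c)].mean(axis=0) for c in (0, 1)]
        for pattern, label in zip(data[~train], labels[~train]):
            r = [np.corrcoef(pattern, c)[0, 1] for c in centroids]
            n_correct += int(np.argmax(r) == label)
    return n_correct / len(labels)


rng = np.random.default_rng(1)  # fixed seed so the toy example is repeatable
acc = {}
for name, clustered in [("clustered", True), ("intermixed", False)]:
    bias = voxel_bias(n_voxels=100, n_neurons=2000, clustered=clustered, rng=rng)
    data, labels, runs = simulate_runs(bias, n_runs=8, rng=rng)
    acc[name] = leave_one_run_out_accuracy(data, labels, runs)
```

In this simulation the same information is present at the neuronal level in both layouts, yet only the clustered layout leaves a voxel-level signal large enough to classify reliably, which is why a null decoding result cannot be read as absence of information.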

Reconstruction of stimuli based on an underlying encoding model (Serences and Saproo, 2012) has the advantage of an explicit model of the underlying neural responses, making the presence of decoding more interpretable. Further, since the model is fit at the individual voxel level, the method is not dependent on the large-scale organization of information. However, this approach is dependent on the specific a priori assumptions made in generating the model. The assumption of orientation tuning is very reasonable for early visual cortex, but it is much more challenging to generate a model for higher cognitive functions.
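As a rough illustration of the encoding-model logic described above, the following sketch fits and inverts a simple orientation channel model on simulated data. It is a minimal sketch under stated assumptions, not the procedure of any cited study: the six channels, the cos^6 tuning shape, the simulated voxel weights and the noise level are all hypothetical.

```python
import numpy as np

N_CHANNELS = 6  # assumed number of orientation channels
CENTERS = np.arange(N_CHANNELS) * 180.0 / N_CHANNELS  # channel centers in degrees


def channel_responses(orientations_deg):
    """Idealized tuning: cos^6 of the angular distance to each center (period 180 deg)."""
    diff = np.deg2rad(np.asarray(orientations_deg)[:, None] - CENTERS[None, :])
    return np.cos(diff) ** 6


def fit_weights(train_bold, train_ori):
    """Least-squares estimate of channel-to-voxel weights (channels x voxels)."""
    design = channel_responses(train_ori)
    weights, *_ = np.linalg.lstsq(design, train_bold, rcond=None)
    return weights


def reconstruct(test_bold, weights):
    """Invert the fitted model to estimate channel responses for new patterns."""
    return test_bold @ np.linalg.pinv(weights)


def decode_orientation(channel_est):
    """Circular mean of the channel centers, weighted by the estimated responses."""
    z = channel_est @ np.exp(2j * np.deg2rad(CENTERS))
    return np.rad2deg(np.angle(z)) / 2 % 180


def simulate(n_trials, true_weights, rng, noise_sd=0.1):
    """Simulated voxel patterns generated from the same channel model plus noise."""
    ori = rng.uniform(0, 180, n_trials)
    bold = channel_responses(ori) @ true_weights
    return ori, bold + rng.normal(scale=noise_sd, size=bold.shape)


rng = np.random.default_rng(0)
true_weights = rng.normal(size=(N_CHANNELS, 50))  # hypothetical voxel tuning
train_ori, train_bold = simulate(120, true_weights, rng)
test_ori, test_bold = simulate(40, true_weights, rng)

weights_hat = fit_weights(train_bold, train_ori)
decoded = decode_orientation(reconstruct(test_bold, weights_hat))
error = np.abs((decoded - test_ori + 90) % 180 - 90)  # circular error in degrees
mean_error = error.mean()
```

Because the weights are fit voxel by voxel, nothing here relies on a large-scale map. The reconstruction is, however, only as good as the assumed tuning functions, which is the a priori dependence noted above.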

#### RELATIONSHIP TO NON-HUMAN PRIMATE STUDIES

In this section, we want to briefly discuss how the human multivoxel decoding results we have reviewed relate to findings in non-human primate literature, which have often focused on prefrontal cortex, and not visual cortex, as critical for the maintenance of information (for recent discussion, see also Postle, 2015b).

First, while there is strong evidence from the fMRI studies we have reviewed for maintained representations in early visual cortex (e.g., Harrison and Tong, 2009; Serences et al., 2009) and MT+ (e.g., Riggall and Postle, 2012), there is only limited evidence for maintained signals in non-human primate V1 (Supèr et al., 2001) and MT (Bisley et al., 2004; Zaksas and Pasternak, 2006). One account could be that these varying results reflect the very different nature of the signals recorded: single-unit spiking data from non-human primates vs. population threshold and sub-threshold neural activity reflected in the BOLD response. Consistent with this view, a recent study found that the amplitude of local field potential (LFP) oscillations in macaque MT does reflect the maintained motion direction (Mendoza-Halliday et al., 2014). However, it is worth noting that the same study did find evidence for maintained representations of motion direction in firing rate in MST in addition to lateral prefrontal cortex (Mendoza-Halliday et al., 2014).

Second, while non-human primate studies have often reported stimulus-selective sustained activity in prefrontal cortex (e.g., Funahashi et al., 1989; Freedman et al., 2003), some fMRI decoding studies have failed to find evidence for maintained representations in human prefrontal cortex (e.g., Riggall and Postle, 2012; Emrich et al., 2013). Our emphasis on the nature of the maintained information could explain some of the discrepancy since the "cat" vs. "dog" category task employed by Freedman et al. (2003) may require abstract category information similar to that required in our non-visual task, which emphasized object name or category and revealed decoding in prefrontal cortex (Lee et al., 2013). However, as in posterior areas, the different nature of the signals measured with fMRI and neurophysiological recordings may also help explain the apparent discrepancies. Recent work has started to emphasize the dynamics of firing rate changes in monkey prefrontal cortex (Stokes, 2015) and a population level re-analysis of the data collected by Freedman and colleagues (Meyers et al., 2008) revealed a complex relationship over time between information in single neurons and that in the population as a whole. Further, neurophysiological recordings have revealed that a broad range of different types of task features are reflected in the responses of prefrontal neurons (Stokes et al., 2013; Lara and Wallis, 2014; Postle, 2015b) and it may be difficult to tease these apart in the population-level measures reflected in the fMRI BOLD signals.

Finally, another potential account of the apparent discrepancy between the human and monkey studies is highlighted by a recent study of monkeys with unilateral prefrontal lesions (Pasternak et al., 2015). These monkeys exhibited a contralesional deficit in maintaining motion information across a delay, which was substantially more pronounced when rapid allocation of spatial attention was required. This deficit was delay specific, supporting a role of prefrontal cortex in maintenance. Combined with the direction-selective signals recorded in prefrontal cortex during the delay period (Zaksas and Pasternak, 2006), this result might suggest a role for prefrontal cortex in maintaining the motion information necessary for this task. However, the deficit in the lesioned monkeys was not dependent on the specific stimulus features (coherence of the sample stimulus), suggesting it did not involve sensory information. Instead, given the pronounced impact of rapidly shifting attention, the authors suggest that the role of prefrontal cortex lies in attending and accessing the task-relevant motion signals that are maintained elsewhere. Thus, the single unit neurophysiology data from non-human primate prefrontal cortex may be more associated with attentional signals than stimulus properties, while the multivoxel decoding data in human posterior cortex primarily reflects maintained sensory representations. Support for a specific role of prefrontal cortex in representing attentional context has also been provided by at least one multi-voxel decoding study (Nelissen et al., 2013).

### RELATIONSHIP TO VISUAL MENTAL IMAGERY

As we described earlier, the representation of information during visual working memory may be highly related to visual imagery. In both cases, visual information is represented in the absence of that information in the environment. The nature of the representations during visual imagery has been much debated (for review, see Pearson and Kosslyn, 2015). Recent evidence from multi-voxel decoding studies has provided strong support for the depictive (picture-like) view of visual imagery, which suggests that visual imagery of a stimulus induces neural activation patterns similar to those generated by visual perception of the same stimulus (Stokes et al., 2009; Reddy et al., 2010; Cichy et al., 2012; Lee et al., 2012; Johnson and Johnson, 2014; Naselaris et al., 2015). For example, we trained participants to remember pictures of 10 common objects before placing them in the MRI scanner (Lee et al., 2012). During scanning, participants were cued with the name of the object and on interleaved trials were either presented with the picture of the object or asked to visually imagine the picture as vividly as possible. During imagery trials we found that we could decode the specific object the participant was imagining from responses in visual cortex. Furthermore, the patterns of response elicited during imagery were similar to those elicited during perception and it was possible to decode between imagery and perception, suggesting that perception and imagery share similar substrates, much like the maintenance of information during visual working memory.

In comparing results from working memory with those from mental imagery it is worth noting that working memory paradigms involving a retro-cue, which requires the retrieval of previously presented information, are not that dissimilar from the paradigms used in mental imagery. The major difference is the time between presentation of the visual stimulation and the cue for retrieval.

To directly compare working memory and mental imagery, Albers et al. (2013) asked participants to perform two different tasks. In both cases, participants were first presented with a task cue followed by two serially presented gratings and then a second cue indicating which grating was relevant for that trial. In the working memory task, participants simply had to remember the cued grating over a delay period. Following the delay a probe stimulus was presented and participants indicated whether the probe was rotated clockwise or anticlockwise relative to the cued grating. In contrast, in the mental imagery task, participants had to mentally rotate the cued grating (with direction and angle indicated by the initial task cue) and then indicate whether the probe was rotated clockwise or anticlockwise relative to the imagined grating. Here the imagined grating is an internally generated mental image that is novel rather than a remembered one. While Albers et al. (2013) refer to this as mental imagery, since the rotated image was never actually physically present, this task could also be interpreted as a short-term memory with manipulation task (i.e., requiring the working of "working memory"). They found that in V1-V3 they could decode orientation during the delay on both working memory and mental imagery trials. Furthermore, they could decode between tasks and there was also generalization to representations estimated during perception. These results suggest a common internal representation for visual working memory and mental imagery that is similar to that evoked during perception (Tong, 2013). Similar results were obtained by Christophel et al. (2015) with their color patterns, showing that transformed versions of the memorized stimulus could also be decoded from the same regions (early visual and posterior parietal cortex) as the original memorized stimulus.

In contrast to these results, Saad and Silvanto (2013) argued that working memory and visual imagery are partly dissociable processes. They asked participants to hold a grating in mind (visual short-term memory condition) or project it as a mental image on the computer screen (imagery condition), and compared the effect of each on visual perception. They found that both visual short-term memory (working memory) and imagery conditions were correlated with visual perception. However, while the subjective strength of visual imagery was negatively associated with visual perception, a positive correlation pattern was found for visual memory, suggesting a dissociation. An alternative explanation is that the bottom-up visual input (the screen), which is combined with the mental image (the grating) in the imagery condition but not in the visual short-term memory condition, may interfere with the visual stimuli in the perception task. Thus, this dissociation may not reflect a difference in the nature of the maintained signals between imagery and working memory, but rather an interference effect between bottom-up visual inputs (Saad and Silvanto, 2013).

#### CONCLUSION

In this article, we have reviewed fMRI studies employing multivoxel decoding during working memory. These studies have revealed maintained stimulus representations during delays that are unrelated to elevated activity levels. While these studies have often highlighted the role of early visual cortex, this may in part reflect the simple stimuli commonly employed and not any privileged role of early visual cortex in the process of maintenance. We have highlighted studies reporting decoding of maintained information outside of visual cortex and suggest that the distribution of representations during visual working memory is dependent on the information maintained, reflecting both the stimulus and the task. Thus, even prefrontal cortex may exhibit maintained representations for some types of information. Further, we suggest there should be correspondence between regions containing information during perception and those containing information during working memory and that any region that contains information during perception may potentially contribute to maintained representations during working memory. While we have focused on the maintenance of information, it is important to remember that there are many other aspects of working memory task performance that regions may contribute to, including stimulus-response mappings, match-nonmatch status of a trial, motor programs and decision criteria. Importantly we suggest that there may not be a sharp divide between regions involved in maintenance and regions involved in representing these aspects of task performance, but that these functions can co-exist in the same regions.

#### AUTHOR CONTRIBUTIONS

S-HL and CIB planned, discussed and wrote this article together.

#### FUNDING

S-HL and CIB were supported by the Intramural Research Program of NIMH (MH002909-08).




**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Lee and Baker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Revealing hidden states in visual working memory using electroencephalography

Michael J. Wolff<sup>1,2</sup>, Jacqueline Ding<sup>3</sup>, Nicholas E. Myers<sup>2,3</sup> and Mark G. Stokes<sup>2,3\*</sup>

*<sup>1</sup> Department of Experimental Psychology, University of Groningen, Groningen, Netherlands, <sup>2</sup> Oxford Centre for Human Brain Activity, University of Oxford, Oxford, UK, <sup>3</sup> Department of Experimental Psychology, University of Oxford, Oxford, UK*

It is often assumed that information in visual working memory (vWM) is maintained via persistent activity. However, recent evidence indicates that information in vWM could be maintained in an effectively "activity-silent" neural state. Silent vWM is consistent with recent cognitive and neural models, but poses an important experimental problem: how can we study these silent states using conventional measures of brain activity? We propose a novel approach that is analogous to echolocation: using a high-contrast visual stimulus, it may be possible to drive brain activity during vWM maintenance and measure the vWM-dependent impulse response. We recorded electroencephalography (EEG) while participants performed a vWM task in which a randomly oriented grating was remembered. Crucially, a high-contrast, task-irrelevant stimulus was shown in the maintenance period in half of the trials. The electrophysiological response from posterior channels was used to decode the orientations of the gratings. While orientations could be decoded during and shortly after stimulus presentation, decoding accuracy dropped back close to baseline in the delay. However, the visual evoked response from the task-irrelevant stimulus resulted in a clear re-emergence in decodability. This result provides important proof-of-concept for a promising and relatively simple approach to decode "activity-silent" vWM content using non-invasive EEG.

#### Edited by:

*Natasha Sigala, University of Sussex, UK*

#### Reviewed by:

*Olaf Hauk, MRC Cognition and Brain Sciences Unit, UK; Tatiana Pasternak, University of Rochester, USA; Keiichi Kitajo, RIKEN Brain Science Institute, Japan*

#### \*Correspondence:

*Mark G. Stokes, Department of Experimental Psychology, University of Oxford, 9 South Parks Road, Oxford, OX1 3UD, UK; mark.stokes@psy.ox.ac.uk*

Received: *04 May 2015* Accepted: *20 August 2015* Published: *03 September 2015*

#### Citation:

*Wolff MJ, Ding J, Myers NE and Stokes MG (2015) Revealing hidden states in visual working memory using electroencephalography. Front. Syst. Neurosci. 9:123. doi: 10.3389/fnsys.2015.00123*

Keywords: EEG, multivariate pattern analysis, dynamic coding, hidden state, visual working memory

# Introduction

Visual working memory (vWM) is essential for high-level cognition. By keeping task-relevant information in mind, vWM provides a functional basis for complex behaviors based on time-extended goals and contextual contingencies. Some of the most influential models of vWM are built on the intuitive notion that maintenance is directly related to the persistence of stationary activity states, representing specific content in vWM from the moment of encoding until that content is needed for behavior (Goldman-Rakic, 1995; Curtis and D'Esposito, 2003). Persistent activity models have obvious appeal: vWM effectively preserves a freeze-frame snapshot of past experience until it is no longer required. However, there are gaps in the argument for persistent activity models of vWM.

Accumulating evidence suggests that vWM is not always accompanied by persistent delay activity (Sreenivasan et al., 2014). For example, a recent study in non-human primates showed that content-specific delay activity can be effectively abolished during dual task interference, even though vWM-guided behavior is relatively spared (Watanabe and Funahashi, 2014). Robust delay activity returned when attention was refocused on the vWM task. Similarly, human studies using non-invasive brain imaging suggest that activity patterns during maintenance delays correspond only to attended items (Lewis-Peacock et al., 2011). Unattended items do not seem to have a corresponding activity state, even though such unattended items are still maintained in vWM (Olivers et al., 2011; Larocque et al., 2014). As in the non-human primate study, the activity state of unattended items becomes apparent once attention is directed to them (Lewis-Peacock et al., 2011; Lewis-Peacock and Postle, 2012).

These results suggest that delay activity is not strictly necessary for maintenance in vWM. Dissociating vWM-performance from persistent delay activity implies that some form of "activity-silent" neural state contributes to maintenance in vWM (Stokes, 2015). For example, a synaptic model of vWM proposes that information is encoded in item-specific patterns of functional connectivity (Mongillo et al., 2008; Sugase-Miyamoto et al., 2008). Essentially, activity patterns during encoding drive content-specific changes in short-term synaptic plasticity (Zucker and Regehr, 2002). Although the temporary synaptic trace is effectively "activity silent," this hidden neural state can be read out from the network during processing of a memory probe. Mongillo et al. (2008) focused on known mechanisms of short-term synaptic plasticity; however, other neurophysiological factors could also pattern hidden states for vWM-guided behavior (Buonomano and Maass, 2009). The key principle is that activity-dependent changes in the hidden neural state could be important for maintaining information in vWM.

One reason that persistent-activity models of vWM have been so pervasive in the past is that it is much easier to find confirmatory evidence with conventional measures, such as elevated delay-period firing (Fuster and Alexander, 1971) or pattern decoding during the delay period (Harrison and Tong, 2009). Disconfirmatory evidence is essentially a null effect. Therefore, to evaluate the possible contributions of hidden states to vWM maintenance, it is necessary to develop measures that are capable of revealing them. Previously, we found that a neutral task-irrelevant stimulus presented during a vWM delay period generated vWM-specific patterns of activity in monkey prefrontal cortex (PFC; Stokes et al., 2013). We suggested that this context-dependent response pattern could reflect differences in hidden state. For illustration, consider echolocation (e.g., sonar), where a simple impulse (e.g., "ping") is used to probe hidden contours of unseen structure. Analogously, the impulse response to neural perturbation should co-depend on the pattern of input activity and the hidden state of the network. If the input pattern is held constant, we can attribute differences in the output to underlying changes in hidden state.
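The logic of the impulse approach can be made concrete with a deliberately simple linear toy model. This is only a sketch of the reasoning: the network sizes, the linear readout and the way memory-specific weight changes are generated are all hypothetical, not a model proposed in the cited work.

```python
import numpy as np


def impulse_response(weights, drive):
    """Output of a linear network: depends on both the input and the hidden weights."""
    return weights @ drive


rng = np.random.default_rng(0)
n_in, n_out = 8, 16
baseline = rng.normal(size=(n_out, n_in))

# Hypothetical hidden states: the same network after encoding memory item A or B.
weights_a = baseline + 0.3 * rng.normal(size=(n_out, n_in))
weights_b = baseline + 0.3 * rng.normal(size=(n_out, n_in))

no_input = np.zeros(n_in)  # delay period without stimulation
impulse = np.ones(n_in)    # fixed, memory-independent impulse stimulus

# Without input the two states are indistinguishable ("activity-silent").
silent_a = impulse_response(weights_a, no_input)
silent_b = impulse_response(weights_b, no_input)

# With an identical impulse, any difference in output reflects the hidden state.
evoked_a = impulse_response(weights_a, impulse)
evoked_b = impulse_response(weights_b, impulse)
```

Here `silent_a` and `silent_b` are identical, whereas `evoked_a` and `evoked_b` differ, so the difference in the evoked response can only be attributed to the difference in weights.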

In the current study, we develop this idea further using a task-irrelevant visual stimulus (or "impulse stimulus") to drive a vWM-specific impulse response function that could be measured non-invasively using EEG. Participants performed a two-alternative vWM discrimination task that requires precise maintenance of the orientation of a memory item during a delay interval (Bays and Husain, 2008). Critically, on a subset of trials we presented a fixed high-contrast impulse stimulus designed to drive neural activity in the visual system. We predicted that the evoked response should differentiate the memory condition (i.e., the remembered orientation), even in the absence of vWM-discriminative delay activity.

To anticipate the results, multivariate decoding at posterior electrodes accurately discriminated the orientation of the memory item during stimulus encoding. Consistent with previous evidence for dynamic coding in neural populations (Meyers et al., 2008; Stokes et al., 2013) and scalp-level patterns (Cichy et al., 2011), the discriminative patterns were dynamic during stimulus processing. After the initial dynamic trajectory, discrimination decayed to near-baseline levels during the delay period. Importantly, the impulse stimulus reactivated vWM-specific activity patterns, consistent with the hypothesis that vWM content could be stored in an "activity-silent" neural format. Interestingly, although the impulse response pattern differentiated the vWM-stimulus, the discriminative pattern did not match the patterns during memory encoding. This experiment provides a novel proof-of-concept of a potentially powerful method for inferring hidden neural states.

# Methods

#### Participants

Twenty-four healthy adults (12 female, mean age 22.2 years, range 18–38 years) were included in the experiment and analyses. During recruitment, four additional participants were excluded from all analyses due to excessive eye-movements and eye-blinks (more than 20% of trials were contaminated). All participants received a monetary compensation of £10/h and gave written informed consent. The study was approved by the Central University Research Ethics Committee of the University of Oxford.

#### Apparatus and Stimuli

The experimental stimuli were generated and controlled with the freely available MATLAB extension Psychophysics Toolbox (Brainard, 1997) and presented at a 100 Hz refresh rate and a resolution of 1680×1050 on a 22′′ Samsung SyncMaster 2233RZ. A USB keyboard was used for response input. The viewing distance was set at 64 cm.

A gray background (RGB = [150 150 150]) was maintained throughout the experiment. Memory items were circular sinewave gratings presented at a 20% contrast. The memory probes were circular, 100% contrast gratings underlying a square-form function. The radius and spatial frequency were fixed for both types of stimuli (2.88° and 0.62 cycles per degree, respectively), and the phase was randomized. The memory items' orientations were uniformly distributed, and the angle difference between memory item and probe within each trial was uniformly distributed across 20 angle differences (±4°, ±5°, ±7°, ±9°, ±12°, ±15°, ±20°, ±26°, ±34°, ±45°). The impulse item was a high-contrast, black-and-white round "bull's-eye" of the same size and spatial frequency as the memory items and probes. All stimuli were presented centrally. Accuracy feedback was given with high (880 Hz) and low (220 Hz) tones for correct and incorrect responses, respectively.

#### Procedure

Participants were seated in a comfortable chair and the keyboard was placed either on their lap or on a table in front of the participants. The participants' task was to memorize the orientation of the presented low-contrast grating and to press the "m" key with the right index finger if the probe was rotated clockwise and the "c" key with the left index finger if the probe was rotated counter-clockwise relative to the previously presented memory item. They were instructed to respond as quickly and as accurately as possible.

Each trial began with the presentation of a fixation cross, which stayed on the screen until probe presentation. After 1000 ms the memory item was presented for 200 ms. In half of the trials (i.e., "long" trials), the following delay period was 2600 ms, after which the probe was presented for 200 ms. In the delay period at either 1170 ("early-impulse" trials) or 1230 ms ("late-impulse" trials) after the memory item, the impulse stimulus was presented for 200 ms (**Figure 1A**), which the participants were instructed to ignore. The temporal jitter was introduced to allow us to test whether any effect on stimulus decoding was specifically time-locked to the impulse. In the other half of trials ("short" trials), the response probe was presented 1200 ms after the memory item (**Figure 1B**). These short trials were included to ensure that participants would pay attention throughout the delay period of the long trials. After probe offset, the screen remained blank until response input. A feedback tone was then played for 100 ms and the next trial automatically began after 500 ms. Every 24 trials a performance summary screen, with the average accuracy and median reaction time of all trials thus far, was shown. Participants could use this moment to take short breaks. The trial conditions were randomized across the entire session and participants completed 1600 trials in total (400 early-impulse trials, 400 late-impulse trials, and 800 short trials) over a time period of approximately 165 min (including breaks).
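The timing described above can be summarized as event onsets relative to memory-item onset. The following is a minimal sketch, not the authors' presentation code: the names `trial_events` and `ms_to_frames` are hypothetical, and it assumes that the 1170, 1230, 1200, and 2600 ms intervals are counted from memory-item offset (an assumption that matches the 1400 ms and 2800 ms epoch lengths reported in the preprocessing section, but is not stated explicitly).

```python
MEMORY_MS = 200    # memory item duration
IMPULSE_MS = 200   # impulse stimulus duration
REFRESH_HZ = 100   # monitor refresh rate reported in the text


def trial_events(trial_type):
    """Return event onsets/offsets in ms relative to memory-item onset.

    Assumption: all delays are measured from memory-item offset.
    """
    events = {"memory_on": 0, "memory_off": MEMORY_MS}
    if trial_type == "short":
        events["probe_on"] = MEMORY_MS + 1200
    elif trial_type in ("early", "late"):
        gap = 1170 if trial_type == "early" else 1230
        events["impulse_on"] = MEMORY_MS + gap
        events["impulse_off"] = events["impulse_on"] + IMPULSE_MS
        events["probe_on"] = MEMORY_MS + 2600
    else:
        raise ValueError("trial_type must be 'short', 'early', or 'late'")
    return events


def ms_to_frames(duration_ms):
    """Convert a duration in ms to a whole number of screen refreshes."""
    return round(duration_ms * REFRESH_HZ / 1000)


for kind in ("short", "early", "late"):
    print(kind, trial_events(kind))
```

Under this reading, the two impulse onsets differ by 60 ms and straddle the time at which the probe appears on short trials.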

#### Behavioral Analysis

Memory performance was analyzed with the freely available MATLAB extension MemToolbox (Suchow et al., 2013). The standard mixture model of visual working memory (Zhang and Luck, 2008) was fit separately for each participant (N = 24) and trial-length condition. The model assumes that the distribution of response errors has two distinct causes: (1) Pure guesses, which result in a uniform distribution of errors across all angle differences in the forced-choice paradigm. (2) Variability in the precision of the remembered item, which, even though the item is memorized, can result in errors at particularly small angle differences between memory item and probe. Although the main purpose of this analysis was simply to confirm that our participants could reliably memorize the low-contrast memory item in this experiment, for completeness we also performed paired-samples t-tests on guess rate and memory variability between trial-length conditions.
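To make the two error sources concrete, the sketch below computes the predicted proportion of correct clockwise/counter-clockwise judgments for a given angle difference. It is an illustration only and not MemToolbox's implementation: it assumes a Gaussian (rather than circular) memory-error distribution with standard deviation `sd_deg` in degrees, and the function names and parameter values are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_correct(angle_diff_deg, guess_rate, sd_deg):
    """Predicted accuracy for a probe rotated by angle_diff_deg.

    Guesses are correct half the time; otherwise the response is correct
    whenever the memory error does not exceed the probe rotation.
    Assumption: Gaussian memory noise, which is only reasonable when
    sd_deg is small relative to the 180 degree orientation space.
    """
    p_memory = normal_cdf(abs(angle_diff_deg) / sd_deg)
    return guess_rate * 0.5 + (1.0 - guess_rate) * p_memory


# Illustrative parameter values only
for diff in (4, 12, 45):
    print(diff, round(p_correct(diff, guess_rate=0.07, sd_deg=4.5), 3))
```

In this simplified form, the guess rate sets the ceiling on accuracy at large angle differences, while the standard deviation controls how quickly accuracy rises from chance at small ones.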

#### EEG Acquisition

The EEG was recorded using a NeuroScan SynAmps RT amplifier and Scan 4.5 software (Compumedics NeuroScan, Charlotte, NC) from 61 Ag/AgCl sintered surface electrodes (EasyCap, Herrsching, Germany) laid out according to the extended international 10–20 system (Sharbrough et al., 1991) at a 1000 Hz sampling rate. The anterior midline frontal electrode (AFz) was reserved as the ground. Electrooculography (EOG) was recorded from electrodes placed below and above the right eye and from electrodes placed to the left of the left eye and to the right of the right eye. Impedances were kept below 5 kΩ. Data were filtered online using a 200 Hz low-pass filter and the electrodes were referenced to the right mastoid.

FIGURE 1 | (caption fragment) In the other half of the trials, determined randomly, the probe was presented instead of the impulse after the first delay.

#### EEG Preprocessing

Offline, the signal was re-referenced to the average of both mastoids, down-sampled to 250 Hz with 16-bit precision and band-pass filtered (0.1 Hz high-pass and 40 Hz low-pass) using EEGLAB (Delorme and Makeig, 2004). Because we were only interested in posterior electrodes for this study, re-referencing to the global average could unnecessarily introduce additional noise from frontal channels. Nevertheless, for completeness, we confirmed that the results are qualitatively similar using both reference schemes. The data were then epoched from −200 to 1400 ms relative to the onset of the memory item for the short, no-impulse trials, and from −200 to 2800 ms for the long, impulse trials. Both long and short epochs were then baseline-corrected using the 200 ms prior to memory item onset. Subsequent artifact detection and trial rejection were performed via visual inspection and focused exclusively on the EOG channels and the 17 posterior channels of interest included in the analyses (P7, P5, P3, P1, Pz, P2, P4, P6, P8, PO7, PO3, POz, PO4, PO8, O1, Oz, O2). Trials containing saccadic eye-movements at any point in time, blinks during stimulus presentation, or other non-stereotyped artifacts were rejected from all further analyses. Impulse trials were subsequently re-epoched to two shorter epochs, time-locked to the memory item (−200 to 1400 ms) or to the impulse stimulus (−200 to 1400 ms). Finally, the data were smoothed with a Gaussian kernel (SD = 8 ms).
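The baseline-correction and smoothing steps can be sketched for a single channel as follows. This is an illustration under stated assumptions rather than the authors' EEGLAB pipeline: the function names are hypothetical, the 8 ms kernel SD is converted to 2 samples at 250 Hz, the kernel is truncated at ±4 SD, and edges are handled by renormalizing the truncated kernel (the text does not say how edges were treated).

```python
import math


def baseline_correct(signal, times_ms, start_ms=-200, end_ms=0):
    """Subtract the mean of the pre-stimulus window from every sample."""
    window = [v for v, t in zip(signal, times_ms) if start_ms <= t < end_ms]
    baseline = sum(window) / len(window)
    return [v - baseline for v in signal]


def gaussian_smooth(signal, sd_samples=2.0):
    """Smooth with a Gaussian kernel; sd_samples = 8 ms * 250 Hz / 1000.

    Assumption: kernel truncated at 4 SD and renormalized at the edges.
    """
    radius = int(math.ceil(4 * sd_samples))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sd_samples) ** 2) for k in offsets]
    n = len(signal)
    smoothed = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < n:
                num += w * signal[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# One epoch sampled every 4 ms (250 Hz) from -200 to 1396 ms
times = list(range(-200, 1400, 4))
raw = [5.0 + (1.0 if 100 <= t < 300 else 0.0) for t in times]
clean = gaussian_smooth(baseline_correct(raw, times))
```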

#### EEG Analysis

#### Multivariate Pattern Analysis

To determine whether the pattern of the EEG signal across the posterior channels of interest contained information about the remembered item, we used the Mahalanobis distance (Mahalanobis, 1936; De Maesschalck et al., 2000) to perform pair-wise comparisons between sets of trials in which orthogonal orientations were presented.

Trials were divided across four angle bins two times and only orthogonal angle bins were compared in the multivariate analysis (0° to 45° vs. 90° to 135°; 45° to 90° vs. 135° to 180°; −22.5° to 22.5° vs. 67.5° to 112.5° and 22.5° to 67.5° vs. 112.5° to 157.5°). For illustration, see **Figure 2** for the event-related potentials of occipital electrodes (O1, Oz, and O2) for each pairwise comparison between orthogonal angle-bins.
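The two binning schemes can be expressed compactly by treating orientation as periodic over 180°. The sketch below is illustrative: `angle_bin` and `orthogonal_bin` are hypothetical names, and the choice of half-open bins (lower edge included, upper edge excluded) is an assumption, since the text does not specify how boundary values were assigned.

```python
def angle_bin(orientation_deg, offset_deg=0.0, width_deg=45.0):
    """Return the bin index (0-3) of an orientation.

    offset_deg is the lower edge of bin 0: 0.0 for the first binning
    scheme and -22.5 for the second. Orientation wraps at 180 degrees.
    """
    wrapped = (orientation_deg - offset_deg) % 180.0
    return int(wrapped // width_deg) % 4


def orthogonal_bin(bin_index):
    """Return the bin whose range is rotated by 90 degrees."""
    return (bin_index + 2) % 4


for theta in (10.0, 50.0, 100.0, 170.0):
    first = angle_bin(theta)
    second = angle_bin(theta, offset_deg=-22.5)
    print(theta, first, orthogonal_bin(first), second, orthogonal_bin(second))
```

Because each trial receives one bin label per scheme, it contributes to two pairwise comparisons, which is why two distance-difference estimates per trial are later averaged.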

We used a leave-one-trial-out cross-validation approach to calculate, on each trial, the multivariate dissimilarity (Mahalanobis distance) of that trial to the average of all other trials in the same angle bin, relative to the dissimilarity of that trial to the average of all trials in the orthogonal angle bin. Mahalanobis distances of the test trial were computed for each time point as follows:

$$\begin{array}{rcl}D1 &=& \sqrt{\left(\text{Train angle } 1 - \text{Test trial}\right)^{T} \ast pC^{+} \ast \left(\text{Train angle } 1 - \text{Test trial}\right)}\\D2 &=& \sqrt{\left(\text{Train angle } 2 - \text{Test trial}\right)^{T} \ast pC^{+} \ast \left(\text{Train angle } 2 - \text{Test trial}\right)}\end{array}$$

where "Train angle 1" and "Train angle 2" are row vectors containing the average signals of angle bins 1 and 2 (excluding the test trial) of each channel, and "pC+" is the pseudo inverse of the error covariance matrix. The error covariance was estimated by pooling over the covariances of each angle condition, estimated from all trials within each condition (excluding the test trial) using a shrinkage estimator that is more robust than the sample covariance for data sets with many variables and/or few observations (Ledoit and Wolf, 2004; Kriegeskorte et al., 2006).

FIGURE 2 | Event-related potentials of each angle bin averaged over the occipital channels (O1, Oz, and O2). Illustrated are all pairwise orthogonal angle bin comparisons that were made in the multivariate analysis of the memory item epoch (A) and impulse epoch (B). Light-gray and dark-gray bars represent the presentation of memory item and impulse stimuli, respectively.

The variables "Train angle 1," "Train angle 2," and "pC+" are all part of the training set, against which "Test trial," a row vector containing the signal of each channel of the left-out test trial, is tested. This was done by computing the difference between the two Mahalanobis distances between "Test trial" and "Train angle 1" (D1) and "Test trial" and "Train angle 2" (D2). The same-angle bin distance was always subtracted from the orthogonal-angle bin distance (so if the "Test trial" was part of angle bin 1 then D1 would be subtracted from D2). If the signal indeed contained information about the memory item at that time point, this distance difference should be positive (because the orthogonal-angle bin distance should be higher than the same-angle bin distance). See **Figure 3** for a schematic overview of the analysis. This procedure was performed for all trials and all previously defined angle bin comparisons, resulting in two equivalent estimates of distance differences per trial. Observed distances were then averaged over the two estimates, and across trials, to derive a single value for each time point and each participant for subsequent statistical testing and plotting.
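The distance-difference computation for one held-out trial at one time point can be sketched as below. This is a simplified illustration, not the authors' script: it uses the plain pooled sample covariance instead of the shrinkage estimator, and it solves a linear system (assuming the covariance is invertible) instead of forming a pseudo-inverse. All function names and the toy two-channel data are hypothetical.

```python
import math


def mean_vec(trials):
    """Channel-wise mean over a list of trials (each a list of channels)."""
    return [sum(col) / len(trials) for col in zip(*trials)]


def pooled_cov(groups):
    """Pooled covariance of residuals around each group's own mean."""
    dim = len(groups[0][0])
    cov = [[0.0] * dim for _ in range(dim)]
    dof = 0
    for trials in groups:
        mu = mean_vec(trials)
        for x in trials:
            r = [a - b for a, b in zip(x, mu)]
            for i in range(dim):
                for j in range(dim):
                    cov[i][j] += r[i] * r[j]
        dof += len(trials) - 1
    return [[c / dof for c in row] for row in cov]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def mahalanobis(u, v, cov):
    """sqrt((u - v)^T cov^-1 (u - v)), assuming cov is invertible."""
    d = [a - b for a, b in zip(u, v)]
    return math.sqrt(sum(di * yi for di, yi in zip(d, solve(cov, d))))


def distance_difference(test, same_bin, other_bin):
    """Orthogonal-bin distance minus same-bin distance for one test trial.

    same_bin must already exclude the held-out test trial.
    """
    cov = pooled_cov([same_bin, other_bin])
    d_same = mahalanobis(mean_vec(same_bin), test, cov)
    d_other = mahalanobis(mean_vec(other_bin), test, cov)
    return d_other - d_same


# Toy data: two channels, four training trials per bin
bin1 = [[1.0, 0.0], [1.2, 0.3], [0.8, -0.2], [1.1, 0.1]]
bin2 = [[-1.0, 0.1], [-0.9, -0.3], [-1.2, 0.2], [-1.1, 0.0]]
held_out = [0.9, 0.1]  # a left-out trial that belongs to bin1
print(distance_difference(held_out, bin1, bin2) > 0)
```

A positive value indicates that the held-out trial lies closer to its own bin than to the orthogonal bin once channel covariance is taken into account.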

#### Cross-temporal Analysis

To explore the dynamics of information processing, and to test if the informative signal cross-generalizes to other time points (King and Dehaene, 2014), we computed a cross-temporal extension of the Mahalanobis analysis described above. The difference between condition-specific distances was computed as described above. However, instead of training and testing only on the same equivalent time points, train/test sliding windows were decoupled: The training data consisting of "Train angle 1," "Train angle 2" and the corresponding pseudo inverse of the covariance matrix (as described above) at train time Y was used to compute the distances to the test-trial at test time X (e.g., Stokes et al., 2013). After computing the distance differences for all possible train-test time combinations and averaging across all test trials, the results were combined into a cross-temporal matrix in which differences along the diagonal correspond directly to the time-resolved analyses already discussed, but off-diagonal coordinates reflect the extent to which the underlying discriminative neural patterns cross-generalize between train-test time points. This cross-temporal analysis was carried out within each trial epoch separately (memory-item and impulse), as well as across epochs, where the train data was taken from the impulse epoch and tested on all trials within the memory item epoch and vice versa, resulting in four cross-temporal discrimination matrices.
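The decoupling of train and test times can be illustrated with a toy example. The sketch below is deliberately simplified and is not the analysis used in the study: it substitutes Euclidean distance for the Mahalanobis distance to stay short, and all names and data are hypothetical. Trials are stored as `trial[time][channel]`.

```python
import math


def mean_pattern(trials, t):
    """Average channel pattern at time index t across trials."""
    n_channels = len(trials[0][t])
    return [sum(tr[t][c] for tr in trials) / len(trials)
            for c in range(n_channels)]


def euclid(u, v):
    """Euclidean distance (a stand-in for the Mahalanobis distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cross_temporal(same_bin, other_bin, test_trial, dist=euclid):
    """Return matrix[train_t][test_t] of other-minus-same distances."""
    n_times = len(test_trial)
    matrix = []
    for train_t in range(n_times):
        mu_same = mean_pattern(same_bin, train_t)
        mu_other = mean_pattern(other_bin, train_t)
        matrix.append([
            dist(mu_other, test_trial[test_t]) - dist(mu_same, test_trial[test_t])
            for test_t in range(n_times)
        ])
    return matrix


# Toy "dynamic code": the condition is carried by channel 0 at the first
# time point and by channel 1 at the second time point.
same = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
other = [[[-1.0, 0.0], [0.0, -1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
test = [[1.0, 0.0], [0.0, 1.0]]
for row in cross_temporal(same, other, test):
    print(row)
```

In this toy case the discrimination values are positive on the diagonal and zero off the diagonal, which is the signature described in the text for patterns that do not cross-generalize over time.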

#### Univariate Analysis

To explore to what extent the differences in the EEG signal between memory items are driven by amplitude rather than pattern differences, we performed the univariate equivalent to the multivariate analysis described above. Instead of calculating the difference between the same- and orthogonal-angle bin Mahalanobis distances, the difference between the absolute same- and orthogonal-angle bin voltage differences averaged across all 17 posterior channels was computed.
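The contrast with the multivariate measure can be illustrated with a small sketch. It assumes that voltages are first averaged across channels and that the absolute differences are then taken on these averages; the wording above leaves the order of these two steps open, and the function names are hypothetical.

```python
def channel_mean(pattern):
    """Mean voltage across channels for one pattern."""
    return sum(pattern) / len(pattern)


def univariate_difference(test, same_bin_mean, other_bin_mean):
    """Orthogonal-bin minus same-bin absolute difference in mean voltage."""
    v = channel_mean(test)
    d_same = abs(channel_mean(same_bin_mean) - v)
    d_other = abs(channel_mean(other_bin_mean) - v)
    return d_other - d_same


# Conditions that differ only in pattern (equal mean amplitude)
print(univariate_difference([1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]))
# Conditions that differ in mean amplitude
print(univariate_difference([2.0, 2.0], [2.0, 2.0], [0.0, 0.0]))
```

The first call returns zero even though the two patterns are clearly distinct, which is the kind of information a pattern-based measure can detect and a mean-amplitude measure cannot.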

#### Significance Testing

FIGURE 3 | (caption fragment) … train angle 1 (D1) and train angle 2 (D2) illustrated in two-dimensional space. The pooled covariance is computed from the training data. When the test trial belongs to angle bin 2, D2i is subtracted from D1i (top); when it belongs to angle bin 1, D1j is subtracted from D2j (bottom). This procedure is repeated for each trial and time-point and the resulting distance differences are averaged across all trials.

Statistics of one-dimensional EEG-analyses were inferred non-parametrically (Maris and Oostenveld, 2007) with sign-permutation tests. For each time-point, the decoding value of each participant was randomly multiplied by 1 or −1. The resulting distribution was used to calculate the p-value of the null-hypothesis that the mean discrimination-value was equal to 0. Cluster-based permutation tests were then used to correct for multiple comparisons across time using 10,000 permutations, with a cluster-forming threshold of p < 0.01. The significance threshold was set at p < 0.05 and all tests were two-sided. Significance tests were carried out separately for the memory item (0–1400 ms) and the impulse (0–800 ms). The sample size of all tests was 24.
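The sign-flipping step at a single time point can be sketched as follows. This is a minimal illustration, not the authors' code: it omits the cluster-based correction across time, the function name and the `(count + 1) / (n_perm + 1)` convention for the p-value are assumptions, and the example values are synthetic.

```python
import random


def sign_permutation_p(values, n_perm=10000, seed=0):
    """Two-sided p-value for H0: mean(values) == 0, via random sign flips.

    values holds one decoding value per participant at one time point.
    """
    rng = random.Random(seed)
    n = len(values)
    observed = abs(sum(values) / n)
    count = 0
    for _ in range(n_perm):
        flipped_mean = sum(v * rng.choice((1, -1)) for v in values) / n
        if abs(flipped_mean) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Synthetic example: 24 participants with consistently positive values
consistent = [0.5 + 0.01 * i for i in range(24)]
print(sign_permutation_p(consistent))
```

When the participant values are consistently of one sign, almost no random sign assignment yields a mean as extreme as the observed one, so the p-value approaches its lower bound of 1 / (n_perm + 1).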

#### Data Sharing

In accordance with the principles of open evaluation in science (Walther and van den Bosch, 2012), all data and fully annotated analysis scripts from this study are publicly available at http://datasharedrive.blogspot.co.uk/2015/05/revealing-hidden-states-in-working.html.

We also hope these data and analyses will provide a valuable resource for future re-use by other researchers. In line with the OECD Principles and Guidelines for Access to Research Data from Public Funding (Pilat and Fukasaku, 2007), we have made every effort to provide all necessary task/condition information within a self-contained format to maximize the re-use potential of our data. We also provide fully annotated analysis scripts that were used in this paper. Any further queries can be addressed to the corresponding author.

#### Results

#### Behavioral Results

Visual working memory performance (**Figure 4A**) was modeled separately for short and long trials, each consisting of 800 trials. Guess rates for short (M = 0.074, SD = 0.048) and long trials (M = 0.073, SD = 0.047) did not differ significantly [t(23) = 0.182, p = 0.858]. On the other hand, the standard deviation of remembered items (sd) was significantly different between trial length conditions [t(23) = 2.458, p = 0.022]: sd was lower for short trials (M = 4.272, SD = 1.318) than for long trials (M = 4.927, SD = 1.292; **Figure 4B**). Whether this decrease in precision in long trials is due to the increase in trial duration (Zhang and Luck, 2009) or the possible interference effect of the impulse stimulus (Magnussen et al., 1991) cannot be concluded, as the present study was not designed to address this issue.

The very low guess rates in both conditions provided evidence that the participants had little difficulty reliably memorizing the low-contrast angle stimuli. Because most errors were attributed to noise in mnemonic precision rather than absolute forgetting, we included both incorrect and correct trials in all EEG analyses.

#### Memory Item Discrimination during and after Item Presentation

The averaged trial-wise difference in Mahalanobis distances between across- and within-angle conditions enabled us to decode the memory items from the EEG signal of the posterior channels as a function of time. A statistically significant cluster emerged 68 ms after memory item onset, and lasted until the end of this epoch (1400 ms, cluster p < 0.001; **Figure 5A**, cyan). Because the impulse analysis was only based on 50% of trials, we also analyzed the memory encoding effect only on corresponding long trials (**Figure 5A**, blue), enabling a power-matched comparison between the memory item- and impulse-epoch. This revealed several significant decoding clusters: 76–632 ms (p < 0.001), 668–720 ms (p = 0.023), 756–788 ms (p = 0.047), 876–936 ms (p = 0.016), and 964–1000 ms (p = 0.036).

#### Memory Item Discrimination during and after Impulse Presentation

The same analysis as above was performed on the subsequent epoch for long trials, time-locked to the impulse onset. Significant temporal clusters of above-chance discrimination were detected at 140–408 ms (p < 0.001) and 424–508 ms (p = 0.005) after impulse onset (**Figure 5B**, blue, bottom).

FIGURE 5 | Multivariate discrimination of the memory item across time. (A) Memory item epoch. The discrimination for both trial types (in cyan), and exclusively for the long trials used in the impulse response analysis (in blue). Significant positive clusters are marked with bars in the corresponding colors. (B) Impulse epoch. The discrimination of the memory item is shown for long trials (in blue), with significant positive clusters marked by the corresponding bar along the bottom. Significant increases in discrimination compared to the mean discrimination 100 ms prior to impulse onset are indicated with dark-blue bars at the top. Light-gray and dark-gray bars represent memory item and impulse presentation, respectively. Error bars are standard deviations from the permuted null-distributions.

#### Decoding Accuracy Increases Significantly after Impulse Presentation

Since the decoding accuracy does not seem to drop completely to chance levels in the initial delay period, we also tested whether the presentation of the impulse results in a significant increase in discriminability. To this end, we subtracted the mean discriminability between −100 and 0 ms prior to impulse onset from the discrimination values after impulse onset. Two significant clusters were identified: 188–232 ms (p = 0.012) and 364–404 ms (p = 0.016). These results confirm that discrimination accuracy increased significantly after impulse presentation (**Figure 5B**, blue, top).

#### The Memory Item and Impulse Show Dynamic Coding

The cross-temporal analysis of the memory item epoch using both long and short trials showed a dynamic coding pattern. Discrimination was greatest when trained and tested on the same time-points, as opposed to different time-points (**Figure 6**, lower left). The impulse response, though weaker than the memory item response, suggested a dynamic coding pattern as well (**Figure 6**, upper right).

#### Memory Item and Impulse Coding Do Not Cross-generalize

We saw no evidence for cross-generalization between the neural patterns evoked by the memory stimulus and the impulse response, either when the training set was taken from the impulse epoch and tested on the memory item epoch (**Figure 6**, top left), or the other way around (**Figure 6**, bottom right).

#### Discrimination Accuracy is Time-locked to Impulse Onset

The increased discrimination accuracy shortly after the impulse could in principle be explained by a probe expectancy effect. Because the memory probe is presented on half the trials at this point, participants might prepare to respond to the probe. This could result in a more "active" maintenance of the memory item (e.g., Watanabe and Funahashi, 2007), which in turn could improve decoding accuracy. Although we do not find any evidence for a progressive ramp-up in discriminability at this time, this does not rule out a very precise form of temporal expectation.

To address this potential issue directly, we had introduced a very subtle temporal variability in the presentation of the impulse stimulus. Our reasoning was as follows: If discriminability is tightly time-locked to the variable onset of the impulse, rather than to the expected onset of the probe relative to the memory item, we can sensibly attribute the observed boost in discriminability to the presentation of the impulse stimulus.

We therefore plotted the cross-temporal matrices of the discrimination of the early and late impulse onset trials separately (**Figure 7A**) time-locked to memory item onset, where the training data of both matrices was based on all impulse trials time-locked to impulse onset. As is apparent from the figure, the highest discrimination effect is not along the diagonal (where the test and train times correspond to the mean impulse onset and the actual impulse onset of all trials, respectively). Rather, for the early impulse trials, discrimination is highest when the training time is shifted by +30 ms, while a −30 ms shift is best for the late impulse trials. We then plotted and analyzed the discriminations of the early and late impulse trials based on these shifted training times (**Figure 7B**). Three positive significant clusters were found both in the early-onset condition (1544–1664 ms, p = 0.003; 1704–1776 ms, p = 0.007; 1792–1828 ms, p = 0.028) and in the late-onset condition (1568–1744 ms, p < 0.001; 1784–1836 ms, p = 0.012; 1860–1908 ms, p = 0.016). As is apparent from both the figure and the significant clusters, the time course of the late impulse onset trials is clearly later than the early onset trials.

To more directly test for the expected 60 ms latency shift in discrimination accuracy corresponding to the onset difference of the two impulse stimuli, we computed the Pearson's correlation between discrimination values of the time window from 1370 to 2170 ms of the early impulse onset condition with different time windows of the same length of the decoding values of the late impulse onset condition. Correlation coefficients were computed between the same time windows (0 ms difference) as well as for each 4 ms step up to a difference of 120 ms, resulting in 31 correlation values for each participant in total (**Figure 7C**). The mean correlation clearly peaked at a 60 ms difference and a cluster-corrected permutation test on the Fisher transformed correlation values showed that only the correlation coefficients between a time-difference of 32 to 100 ms were significantly positive across subjects (p < 0.001). These results provide clear evidence that the decoding time-course was time-locked to the onset of the impulse.
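The shifted-window correlation can be sketched as follows. This is an illustration with synthetic time-courses rather than the authors' script: the function names are hypothetical, time is expressed in samples (one sample is 4 ms at 250 Hz, so 30 samples span 120 ms and give 31 shifts), and the synthetic "late" curve is simply the "early" curve delayed by 15 samples (60 ms).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher transform; only defined for -1 < r < 1."""
    return math.atanh(r)


def lagged_correlations(early, late, start, length, max_shift):
    """Correlate a fixed early window with late windows shifted 0..max_shift."""
    ref = early[start:start + length]
    return [pearson(ref, late[start + s:start + s + length])
            for s in range(max_shift + 1)]


# Synthetic decoding time-courses: a bump, and the same bump 15 samples later
early = [math.exp(-0.5 * ((i - 60) / 10.0) ** 2) for i in range(300)]
late = [math.exp(-0.5 * ((i - 75) / 10.0) ** 2) for i in range(300)]
rs = lagged_correlations(early, late, start=20, length=200, max_shift=30)
best_shift = max(range(len(rs)), key=rs.__getitem__)
print(best_shift * 4, "ms")
```

With real data the per-participant coefficients would be Fisher-transformed before group statistics; the transform is not applied to the synthetic curves here because their peak correlation is exactly one, where the transform is undefined.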

#### Memory Item Discrimination is Not Simply Driven by Mean Amplitude Difference

The univariate analysis that was based on the averaged signal of all posterior electrodes showed significant memory item discrimination only shortly after memory item onset, where a single short significant cluster was present (140–168 ms, p = 0.022). No significant discrimination could be made within the impulse epoch (**Figure 8**).

#### Discussion

We report the results of a novel method to recover visual working memory states that are otherwise hidden to EEG using a functional perturbation approach. We presented a high-energy visual impulse stimulus during the vWM delay period and measured the visual evoked response. Critically, we found that the impulse response carried significant information about the contents in vWM. Using multivariate analysis, we could decode the orientation of the previous memory item from the impulse-driven visual response. This provides important proof-of-principle evidence for the feasibility of exploring hidden neural states with non-invasive EEG, with important implications for working memory (Stokes, 2015).

We used Mahalanobis distances to compute the multivariate dissimilarity between the evoked response during maintenance of specific orientations. The Mahalanobis distance is superior to Euclidean distance (Stokes et al., 2013) because it accounts for the covariance structure of the noise between features (Kriegeskorte et al., 2006). In the current study, features were EEG sensors, which are known to be highly correlated. Analysis of the evoked response to the memory stimulus clearly validated this multivariate method as a powerful approach for decoding task-relevant parametric dimensions. Robust orientation discrimination was observed in the EEG activity as early as 68 ms after the presentation of the memory stimulus. Decoding peaked at around 160 ms, before decaying into the memory delay period. Despite returning almost to baseline prior to the onset of the impulse stimulus, we observed a robust "reactivation" in decodability of the memory item that peaked at 200 and 360 ms after the impulse stimulus.

FIGURE 7 | (caption fragment) … correlations (Fisher's z) between the decoding time-course for the early and late impulse onset trials as a function of different temporal shifts. Mean correlation peaks at 60 ms. The blue bar illustrates the significant positive cluster of correlations. Error bars are standard deviations of the permuted null distributions.

The impulse onset was temporally jittered by ±30 ms. The rationale for introducing this variability was to control for the possibility that reactivation could be explained by temporal expectation. On half the trials, the response probe was presented instead of the impulse stimulus. This was to ensure that participants were attending throughout the delay period. However, previous studies have shown that temporal expectation can also result in a ramp-up of item-specific delay activity (Takeda and Funahashi, 2004; Watanabe et al., 2009; Barak et al., 2010). Ramp-up activity could reflect a build-up of temporal expectation (Nobre et al., 2007), which could trigger attention-related pre-activation of the task-relevant template, as previously observed in monkey PFC (Rainer et al., 1999) and the human visual system (Stokes et al., 2009). Jittering the impulse onset time allowed us to differentiate the relative contribution of temporal expectation and of the impulse response. This subtle temporal offset allowed us to test whether reactivation was indeed time-locked to the impulse stimulus, or whether decodability was better explained by the temporal structure of the task.

Visual inspection of the decodability time-course locked to the impulse probe already suggests that temporal expectation is not a plausible account. It would be surprising if template-reactivation could be so precise over an interval as long as 1.2 s. Moreover, plotting the impulse response for the different impulse onset times relative to the onset of the memory stimulus provides an estimate of the time-locking to the stimulus onset (**Figure 7B**). As expected, the decodability profiles appear offset by approximately 60 ms. Finally, a correlation analysis of the decodability time-courses between impulse onsets confirmed that the correlation peaked at an offset of 60 ms. Overall, this pattern of results is consistent with the prediction that a neutral stimulus presented during the delay period drives activity in the memory network, resulting in a patterned response that systematically reflects the representational characteristics of the information in working memory (i.e., orientation).

Previous studies have argued that early visual cortex is important for vWM (Pasternak and Greenlee, 2005). For example, Harrison and Tong conducted an fMRI study using a very similar paradigm to the current design (Harrison and Tong, 2009). Using multivariate analyses, they found significant decoding during the delay period despite an absence of above-baseline activity levels. This suggests that subtle activity patterns in fMRI could also reflect hidden states (patterned spontaneous activity). Computational modeling provides evidence that spontaneous spiking activity should be patterned by the hidden state (Sugase-Miyamoto et al., 2008). Moreover, we previously found evidence for significant pattern separation in monkey PFC, despite activity levels that were no greater than the pretrial baseline (Stokes et al., 2013). Increasing the overall level of activity increased the pattern separation in that study. Future research could explore the relationship between spontaneous activity patterns measured with fMRI, single unit recording, and EEG.

It is also possible that the activity observed by Harrison and Tong (2009) actually reflected attentional preparation (Stokes et al., 2009) or imagery-related activity (Stokes et al., 2011; Albers et al., 2013). Indeed, it is almost impossible to separate potential non-working memory contributions in their design (Stokes, 2011). In the current study, we clearly dissociate impulse-driven decoding from temporal expectation. Moreover, visual imagery is unlikely to be triggered so rapidly by the impulse stimulus. It would be important for future research to explore the relationship between discriminating stimulus-driven and non-driven activity as a function of attention and imagery to further pinpoint the relative contribution of different neural states to these separable, but interrelated cognitive functions.

We also observed evidence for dynamic coding of the memory stimulus. Cross-temporal analyses clearly revealed superior discrimination along the diagonal axis, reflecting within-time generalization, relative to off-diagonal coordinates representing cross-temporal generalization. This is the hallmark pattern for dynamic coding, indicating that the discriminative patterns vary over time (King and Dehaene, 2014). Previously, Cichy and colleagues observed a similar pattern in MEG data during perceptual categorization (Cichy et al., 2011), consistent with similar results from intracranial recordings in monkey visual (IT; Meyers et al., 2008), parietal (Crowe et al., 2010) and prefrontal cortices (Meyers et al., 2008; Stokes et al., 2013). There was also some evidence for a dynamic coding pattern in the impulse response, suggesting that the impulse response might be best conceptualized as a memory-specific trajectory, although future research would need to clarify this interpretation.

Interestingly, we found no evidence for cross-generalization between the neural patterns evoked by the memory stimulus and the impulse response. Again, this could be interpreted as an extension of dynamic coding. The same task parameters are represented in both epochs (i.e., memory orientation), but using independent coding schemes. Epoch-independent coding schemes could be optimal for structured high-level representations (Sigala et al., 2008). However, this result could also reflect a fundamental difference in patterns of activity that modulate hidden states, and the patterns of activity that are emitted from a particular impulse stimulus. Indeed, the current results are consistent with the hypothesis that the impulse response should be an interaction between the input pattern and the current hidden state, rather than a simple "reactivation." Readout of the hidden state from the EEG response only requires a systematic relationship between the impulse response and the hidden state. By contrast, downstream cortical areas that read out the hidden state to generate a response might need to learn how to decode a time- and context-varying hidden state to access a memorized orientation. Recent theoretical models have shown that unsupervised read-out of dynamically changing states is in principle possible (Sussillo and Abbott, 2009; Sussillo, 2014).

Although this proof-of-principle experiment does not provide the definitive test for "activity-silent" working memory, the results are nonetheless consistent with a number of key predictions. First, memory-discriminative information effectively returns to baseline after initial encoding. Although this is essentially a null effect, the decay function is consistent with studies decoupling persistent content-specific delay activity and memory-guided behavior (Sreenivasan et al., 2014). Secondly, impulse-driven reactivation is consistent with a context-dependent response of a memory-configured hidden state (Mongillo et al., 2008; Sugase-Miyamoto et al., 2008). Finally, the dynamic trajectory during memory encoding is also consistent with a more general dynamic coding framework for working memory (Stokes, 2015).

Irrespective of any particular theoretical framework, the current experiment also provides an important demonstration of combining a functional perturbation approach with multivariate decoding to reveal otherwise hidden neural states. Activity states that we usually measure with non-invasive recordings only provide an incomplete picture of the diversity of neural states underlying cognition. This might be especially true for more tonic cognitive states, such as working memory, attention, or task set. Activity-silent representations pose an obvious problem for contemporary neuroscience, which is dominated by measurement and analysis of activity states. The ultimate success of future research will depend on new approaches to existing measurement techniques to probe diverse neural states, including "activity-silent" states. We believe that this paper provides an important proof-of-principle toward an accessible non-invasive approach. Non-invasive brain stimulation could be used in combination with EEG to probe hidden states (Bortoletto et al., 2015). The advantage of transcranial magnetic stimulation is that the response profile of distinct brain networks can be targeted specifically (Rosanova et al., 2009), but with the major disadvantage that the stimulation artifact effectively precludes analysis of the initial local response to the perturbation. While this is less problematic for measuring context-dependent changes in effective connectivity between distant brain areas (Taylor et al., 2007), this limitation could easily obscure the kind of effect studied here.

In conclusion, we provide a useful proof-of-principle demonstration of the utility of combining a functional perturbation approach with EEG to reveal otherwise silent neural states. Although these results are consistent with a dynamic coding framework that suggests visual working memory could be encoded in an "activity-silent" state, the main purpose of the experiment was to develop a powerful tool for exploring cognitive states that cannot otherwise be differentiated with EEG. Future experiments will be able to exploit this novel approach in more complex experimental designs to tease apart the key coding principles underlying visual working memory.

#### Acknowledgments

This study was funded by the Wellcome Trust (to NEM), Medical Research Council (to MGS), and the National Institute for Health Research Oxford Biomedical Research Centre Programme based at the Oxford University Hospitals Trust, Oxford University. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health. We would like to thank Janina Jochim for assistance collecting EEG data.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Wolff, Ding, Myers and Stokes. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Feature-Based Change Detection Reveals Inconsistent Individual Differences in Visual Working Memory Capacity

Joseph P. Ambrose<sup>1</sup>, Sobanawartiny Wijeakumar<sup>2</sup>, Aaron T. Buss<sup>3</sup> and John P. Spencer<sup>2</sup>\*

<sup>1</sup> Department of Applied Mathematics and Computational Sciences, University of Iowa, Iowa City, IA, USA, <sup>2</sup> School of Psychology, University of East Anglia, Norwich, UK, <sup>3</sup> Department of Psychology, University of Tennessee, Knoxville, TN, USA

Visual working memory (VWM) is a key cognitive system that enables people to hold visual information in mind after a stimulus has been removed and compare past and present to detect changes that have occurred. VWM is severely capacity limited to around 3–4 items, although there are robust individual differences in this limit. Importantly, these individual differences are evident in neural measures of VWM capacity. Here, we capitalized on recent work showing that capacity is lower for more complex stimulus dimensions. In particular, we asked whether individual differences in capacity remain consistent if capacity is shifted by a more demanding task, and, further, whether the correspondence between behavioral and neural measures holds across a shift in VWM capacity. Participants completed a change detection (CD) task with simple colors and complex shapes in an fMRI experiment. As expected, capacity was significantly lower for the shape dimension. Moreover, there were robust individual differences in behavioral estimates of VWM capacity across dimensions. Similarly, participants with a stronger BOLD response for color also showed a strong neural response for shape within the lateral occipital cortex, intraparietal sulcus (IPS), and superior IPS. Although there were robust individual differences in the behavioral and neural measures, we found little evidence of systematic brain-behavior correlations across feature dimensions. This suggests that behavioral and neural measures of capacity provide different views onto the processes that underlie VWM and CD. Recent theoretical approaches that attempt to bridge between behavioral and neural measures are well positioned to address these findings in future work.

Keywords: change detection, fMRI, individual differences, visual working memory, working memory capacity

# INTRODUCTION

Visual working memory (VWM) is a core cognitive system with a highly limited capacity of 3–4 items (Luck and Vogel, 1997). VWM plays a key role in much of visual cognition, comparing percepts that cannot be simultaneously foveated and identifying changes in the world when they occur (Vogel et al., 2001). VWM capacity limitations are reliably associated with individual differences in a host of cognitive functions (Conway et al., 2003), and VWM deficits have been observed in clinical populations, including children diagnosed with autism (Steele et al., 2007) as well as children born preterm (Vicari et al., 2004). VWM appears to be particularly predictive of individual differences in cognitive performance. By some estimates, individual differences in VWM capacity account for up to 40% of the variance in global fluid intelligence (Fukuda et al., 2010).

#### Edited by:

Natasha Sigala, University of Sussex, UK

#### Reviewed by:

Andreas Nieder, University of Tübingen, Germany; Mark Stokes, University of Oxford, UK

> \*Correspondence: John P. Spencer j.spencer@uea.ac.uk

Received: 15 December 2015; Accepted: 29 March 2016; Published: 19 April 2016

#### Citation:

Ambrose JP, Wijeakumar S, Buss AT and Spencer JP (2016) Feature-Based Change Detection Reveals Inconsistent Individual Differences in Visual Working Memory Capacity. Front. Syst. Neurosci. 10:33. doi: 10.3389/fnsys.2016.00033

What neural mechanisms underlie VWM? Research has shown that a distributed network of frontal and posterior cortical regions underlies performance in VWM tasks. In particular, VWM representations are actively maintained in the intraparietal sulcus (IPS), the dorsolateral prefrontal cortex (DLPFC), the ventral-occipital cortex (VOC) for color stimuli, and the lateral-occipital complex (LOC) for shape stimuli (Todd and Marois, 2004, 2005). In addition, there is suppression of the temporo-parietal junction (TPJ) during the delay interval, and activation of the anterior cingulate cortex (ACC) during the comparison phase (Todd et al., 2005; Mitchell and Cusack, 2008).

One of the more striking findings in the fMRI literature is that the BOLD response increases as the memory load is varied from 1 to 3 items and then asymptotes at higher loads (Harrison et al., 2010). This occurs within critical parts of the VWM network including the IPS and VOC (Todd and Marois, 2004). What is striking about these data is that they correspond with behavioral estimates of VWM capacity: estimates suggest that people can hold approximately 3–4 items in VWM (Luck and Vogel, 1997; Vogel et al., 2001). Thus, there is an apparent correspondence between neural capacity as indicated by the asymptotic BOLD pattern and behavioral capacity as indicated by measures such as Pashler's K (Pashler, 1988).

Evidence supporting this relationship comes from Todd and Marois (2005). They found a significant correlation between behavioral estimates of capacity and a normalized BOLD signal in posterior parietal cortex measured at the set size associated with each participant's capacity. There was also a significant correlation between behavioral capacity and neural capacity in VOC during the maintenance phase of one experiment. These data are consistent with ERP data from Vogel and Machizawa (2004) showing similar correlations over parietal and occipital cortex. Interestingly, correlations with behavioral capacity estimates were not pervasive: no significant correlations with behavior were observed in anterior cingulate cortex or in middle frontal gyrus.

Given the specificity of these findings to two neural loci, we sought to examine the robustness of the relationship between behavioral estimates of capacity and neural estimates of capacity, taking advantage of recent findings. In particular, Song and Jiang (2006) examined the neural bases of VWM by examining performance in a change detection (CD) task as people remembered colors, shapes, or both feature dimensions. Consistent with Alvarez and Cavanagh (2004), they found capacity differences for colors and shapes: participants remembered 3–4 colors but only 1–2 shapes. They also found the neural asymptotic pattern for both color and shape stimuli across multiple sites within the VWM network, with a stronger BOLD response for shapes than for colors.

These data set the stage for the individual differences approach in the present study. In particular, we asked whether the correlation between behavioral capacity and neural capacity for simple colors also holds for shapes despite dramatic differences in capacity for the two stimulus dimensions. That is, will individuals with a high capacity for colors also have high capacity for shapes and, critically, will correlations between behavioral and neural capacity measures hold despite dramatic differences in capacity across dimensions? Such a result would suggest a very strong link between behavioral capacity and neural capacity.

To test this question, we used a within-subjects design. Participants completed a VWM task with simple colors on one fMRI scanning day, and a VWM task with shapes on a second scanning day. We chose to use shapes from Drucker and Aguirre's (2009) study on shape similarity because these shapes have good psychometric properties (Zahn and Roskies, 1972) and have been well localized with fMRI. We estimated participants' VWM capacity along each dimension from their behavioral performance and examined whether behavioral estimates of capacity across dimensions were robust within individuals. Similarly, we measured neural capacity for each dimension across 30 ROIs identified from a recent meta-analysis of the VWM fMRI literature (Wijeakumar et al., 2015) as well as from Drucker and Aguirre (2009), and examined whether neural estimates of capacity across dimensions were robust within individuals. Finally, we examined correlations between the behavioral and neural capacity measures to determine whether there were robust individual differences between brain and behavior and whether these relationships remained robust across dimensions despite large differences in VWM capacity.

# MATERIALS AND METHODS

#### Participants

Twenty right-handed native English-speaking subjects took part in the experiment (age range 25 ± 4 years; 11 men, 9 women). All participants were recruited from the University of Iowa campus and community. All participants had normal or corrected-to-normal vision. All participants signed an informed consent document approved by the Ethics Committee at the University of Iowa.

We acknowledge that the low sample size is a limitation of this study. However, we note that this limitation is common in fMRI studies due to resource limitations. For example, the motivating studies by Alvarez and Cavanagh (2004) and Song and Jiang (2006) had sample sizes of 12 and 6, respectively.

#### Procedure

The experimental paradigms were created using E-prime version 2.0 and were run on an HP computer (Windows 7). We used two variants of a CD task. In the Color CD task, the shapes of the stimuli were held constant. Participants were shown a memory array of 1–6 colored stimuli (Set Size). After a brief delay, they were shown a test array that was either the same array (Same condition) or an array where one of the stimuli had a different color (Different condition). In the Shape CD task, the colors of the stimuli were held constant. Participants were shown a memory array of 1–6 stimuli. After a delay, they were shown either the same array (Same condition) or an array where one of the stimuli had a different shape (Different condition). Participants were asked to indicate if the items were the same or different using the index or middle finger buttons on a right-handed manipulandum box. At the start of the task, they were informed which button to push to indicate a Same response versus a Different response. There were no practice trials, but participants were shown example sequences during screening to familiarize them with the task before entering the scanner.

Colors were equally distributed in CIELAB 1976 color space. Shapes were based on Drucker and Aguirre's (2009) RFC-defined stimuli. Sets of eight possible colors and shapes used in the task were generated so that each color and shape were separated by 45° in feature space. Items were randomly selected from this pool to construct the stimulus array on each trial. The changed feature was also drawn from this pool during Different trials.
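As an illustration of the 45° spacing, the following minimal sketch places eight colors at equal angular steps on a circle in the a\*b\* plane of CIELAB. The lightness, circle center, and radius are hypothetical values chosen only for this example, since the text does not report them, and the function name `lab_circle` is likewise an assumption.

```python
import math

# Hypothetical parameters: the text does not report lightness, center, or radius.
L_STAR = 70.0
CENTER_A, CENTER_B = 0.0, 0.0
RADIUS = 40.0
N_ITEMS = 8


def lab_circle(n_items=N_ITEMS, radius=RADIUS):
    """Return n_items (L*, a*, b*) triplets equally spaced on a circle."""
    step = 360.0 / n_items  # 45 degrees when n_items is 8
    colors = []
    for i in range(n_items):
        angle = math.radians(i * step)
        a = CENTER_A + radius * math.cos(angle)
        b = CENTER_B + radius * math.sin(angle)
        colors.append((L_STAR, a, b))
    return colors


colors = lab_circle()
```

Converting these coordinates to display RGB values would require a calibrated color transform, which is outside the scope of this sketch.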

Each trial began with the presentation of a fixation cross for 2500 ms, followed by the memory array for 500 ms, then a blank screen delay for 1200 ms, and finally the test array for 1500 ms. The inter-trial interval was jittered between 1000 ms (50% of trials), 2500 ms (25%), and 3500 ms (25%). Participants were instructed to respond as quickly and accurately as possible. If a response was not entered within the duration of the test array's presentation, 'No Response Detected' was displayed on the screen, and the trial was excluded from analysis.

#### Design

Participants completed a total of four runs each for the Color and Shape CD tasks. Each set of runs occurred over a single scanning block with separate dimensions on separate days. The order of the scanning days (Color first versus Shape first) was counterbalanced across participants. Each run consisted of 20 randomized trials (10 Same, 10 Different) at each set size (SS1–6) completed in increasing order. The goal of increasing set size across blocks was to maximize stability in the measurements of performance at each set size. Moreover, we hoped that the systematic ordering would help participants remain engaged throughout the experiment.

#### Image Acquisition and Processing

A 3T Siemens TIM Trio magnetic resonance imaging system with a 12-channel head coil located at the University of Iowa's Magnetic Resonance Research Facility was used. Anatomical T1-weighted volumes were collected using an MP-RAGE sequence. Functional BOLD imaging was acquired using an axial 2D echo-planar gradient echo sequence with the following parameters: TE = 30 ms, TR = 2000 ms, flip angle = 70°, FOV = 240 mm × 240 mm, matrix = 64 × 64, slice thickness/gap = 4.0/1.0 mm, and bandwidth = 1920 Hz/pixel. Each run was approximately 16 min and collected 491 volumes.
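The reported run length can be checked against the trial timing with simple arithmetic. The sketch below uses only the values stated in the Procedure, Design, and Image Acquisition text; it assumes that the 20 trials at each of the six set sizes all fall within a single run, and it uses the expected inter-trial interval implied by the stated jitter proportions.

```python
# Values taken from the Procedure, Design, and Image Acquisition text.
TR_S = 2.0
VOLUMES_PER_RUN = 491
TRIAL_EVENTS_MS = [2500, 500, 1200, 1500]  # fixation, memory array, delay, test array
ITI_MS = {1000: 0.50, 2500: 0.25, 3500: 0.25}  # interval -> proportion of trials
TRIALS_PER_RUN = 20 * 6  # assumption: 20 trials at each of 6 set sizes per run

scan_s = TR_S * VOLUMES_PER_RUN                      # total acquisition time
trial_ms = sum(TRIAL_EVENTS_MS)                      # fixed events within one trial
mean_iti_ms = sum(ms * p for ms, p in ITI_MS.items())  # expected jittered interval
task_s = TRIALS_PER_RUN * (trial_ms + mean_iti_ms) / 1000.0

print(f"scan duration: {scan_s / 60:.1f} min")           # 16.4 min
print(f"expected task duration: {task_s / 60:.1f} min")  # 15.4 min
```

Under these assumptions the expected task time is about one minute shorter than the acquisition, which is consistent with the stated run length of approximately 16 min.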

Head movement during the experiment was restricted using foam padding inserted between the observer's head and the head coil. The tasks were presented using E-prime software and a high-resolution projection system. The stimuli subtended a visual angle of 3.2–4.2°. In each trial, the stimuli were randomly arranged among six equidistant positions centered on a virtual circle with a visual angle of 6.7° from the center of the screen. Responses were recorded by a manipulandum strapped to the participants' hands. The timing of the presented stimuli was synchronized to the trigger pulse from the MRI scanner. Data were analyzed using Analysis of Functional NeuroImages (AFNI) software. Standard preprocessing was used that included slice timing correction, outlier removal, motion correction, and spatial smoothing (Gaussian FWHM = 8 mm).

#### Methods of Analysis

Behavioral performance was assessed using Pashler's K, which provides a behavioral index of VWM capacity at each set size (Pashler, 1988). Formally, this is given by the formula $K = N \frac{h - f}{1 - f}$, where N is the set size, h is the hit rate (rate of correct different trials), and f is the false alarm rate (rate of incorrect same trials). Note that Pashler's K is the measure of choice when using a whole array test. Each participant was assigned a capacity value for each dimension by selecting the maximum K value across set sizes for that dimension. Given that point estimates can provide a noisy estimate of performance when values are quite comparable (as we expected would be the case at high set sizes), we also fit the K function with linear and quadratic functions for each dimension and selected the functional form that fit the data best. We then used the coefficient estimates from the fit as a secondary behavioral measure.
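The capacity computation described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names `pashler_k` and `max_k` are hypothetical, and the hit and false alarm rates are made-up values for one imaginary participant.

```python
def pashler_k(set_size, hit_rate, false_alarm_rate):
    """Pashler's K = N * (h - f) / (1 - f)."""
    if false_alarm_rate >= 1.0:
        raise ValueError("K is undefined when the false alarm rate is 1")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


def max_k(rates_by_set_size):
    """Return (set size, K) for the set size with the largest K."""
    ks = {n: pashler_k(n, h, f) for n, (h, f) in rates_by_set_size.items()}
    best = max(ks, key=ks.get)
    return best, ks[best]


# Made-up (hit rate, false alarm rate) pairs for one hypothetical participant.
example = {1: (0.98, 0.02), 2: (0.95, 0.05), 3: (0.90, 0.10),
           4: (0.80, 0.15), 5: (0.70, 0.20), 6: (0.60, 0.25)}
best_set_size, capacity = max_k(example)
```

For these illustrative rates, K peaks at set size 5 with a value of about 3.1 and then declines at set size 6, which is the kind of increasing-then-decreasing profile that motivates taking the maximum across set sizes.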

ROI-based analyses were carried out using 10 mm spherical regions defined using coordinates from regions of interest from the VWM literature (see e.g., Pessoa et al., 2002; Todd and Marois, 2004; Harrison et al., 2010). In particular, we focused on 21 ROIs from a recent meta-analysis (Wijeakumar et al., 2015); nine more were added from Drucker and Aguirre (2009) to examine cortical regions that might be selective for processing stimulus shape. Average beta values were extracted for each ROI (1–30), set size (1–6), and feature (Color, Shape) for each participant. Only trials with correct responses were included in the analyses as the number of incorrect trials for some of the lower set sizes was too small to analyze.

A 2-factor (set size, feature) ANOVA was carried out on data from each ROI to identify ROIs that showed a change in the BOLD response across set sizes. We then conducted additional analyses on the set of ROIs with Set Size or Set Size × Feature interactions. In particular, for each included ROI, we computed the maximum BOLD signal across set sizes for each dimension and the BOLD signal at the set size that matched the maximum K value for each subject and dimension. Finally, we examined correlations within and between the behavioral and neural measures using Pearson's correlation to examine whether behavioral estimates of capacity and neural estimates of capacity are correlated within individuals and across dimensions.
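The two neural measures and the correlation step can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' pipeline: the helper names `neural_measures` and `pearson_r` are hypothetical, all numbers are made up, and Pearson's r is computed directly from its definition so that the example has no external dependencies.

```python
import math


def neural_measures(bold_by_set_size, max_k_set_size):
    """Return (maximum BOLD across set sizes, BOLD at the Max K set size)."""
    return max(bold_by_set_size.values()), bold_by_set_size[max_k_set_size]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up beta values for one ROI and one hypothetical participant.
betas = {1: 0.10, 2: 0.22, 3: 0.35, 4: 0.41, 5: 0.39, 6: 0.30}
max_bold, bold_at_max_k = neural_measures(betas, max_k_set_size=3)

# Made-up group data: behavioral Max K and Max BOLD for five hypothetical participants.
r = pearson_r([2.1, 2.8, 3.0, 3.6, 4.2], [0.30, 0.35, 0.33, 0.45, 0.50])
```

Note that the two neural measures coincide whenever a participant's BOLD response peaks at the same set size as their K function, which is one reason the measures would be expected to correlate strongly within a dimension.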

# RESULTS

#### Behavioral Results

K values were estimated for each set size, participant, and stimulus dimension. **Figure 1** shows these K values across participants for the color (left panel) and shape (right panel) dimensions. As is evident, there were differences across stimulus dimensions. In the Color CD task, participants generally had higher K values (note that we scaled the panels differently to highlight the individual differences across participants). Indeed, across the sample, the Max K value for color was significantly greater than the Max K for Shape, t(19) = 13.495, p < 0.001. The Color K values were also less variable across set sizes showing a clear increasing and then decreasing pattern. By contrast, performance in Shape CD declined less at higher set sizes, reflecting the difficulty participants had with the Shape CD task beyond the lowest set sizes.

The other key result from **Figure 1** is that participants showed clear individual differences. To examine whether these individual differences were consistent across dimensions, we correlated the Max K values across dimensions. There was a significant correlation, r = 0.64, p < 0.005, indicating that participants with a high capacity for colors generally also had a high capacity for shapes (see **Figure 2A**).

One limitation of the Max K measure is that it only considers a single value of the K function to represent each participant's performance. As an alternative, we fit the data in **Figure 1** with linear and quadratic functions, obtaining coefficient values describing the linear or quadratic fit for each participant and dimension. For Color, we determined that quadratic functions generally provided a better fit of participants' data than linear functions (the F-change statistic was significant for the quadratic fit for 13 of 20 participants). For Shape, we found that linear functions provided the most parsimonious description of the K functions (only 1 F-change statistic was significant for the quadratic fit). Based on these results, we carried forward the two coefficients from the quadratic fit of Color K and one coefficient from the linear fit of Shape K for each participant for further analysis.
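The comparison between linear and quadratic fits can be sketched as an F-change test on the residual sums of squares. The text does not describe the software or exact procedure used, so the following is only one plausible reading: the helper names `polyfit_sse` and `f_change` are hypothetical, the K values are made up, and the critical value applies to the case of six set sizes (1 and 3 degrees of freedom at an alpha of 0.05).

```python
def polyfit_sse(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients low to high, SSE)."""
    n = degree + 1
    # Normal equations (X'X) c = X'y for the polynomial design matrix.
    A = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coef = [0.0] * n
    for r in reversed(range(n)):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, n))) / A[r][r]
    sse = sum((yi - sum(cf * xi ** p for p, cf in enumerate(coef))) ** 2
              for xi, yi in zip(x, y))
    return coef, sse


def f_change(x, y):
    """F statistic for adding a quadratic term to a linear fit."""
    _, sse_lin = polyfit_sse(x, y, 1)
    coef_quad, sse_quad = polyfit_sse(x, y, 2)
    df_resid = len(x) - 3  # residual df of the quadratic model
    return (sse_lin - sse_quad) / (sse_quad / df_resid), coef_quad


set_sizes = [1, 2, 3, 4, 5, 6]
k_values = [0.95, 1.85, 2.60, 3.00, 2.90, 2.50]  # made-up K function
F, coef_quad = f_change(set_sizes, k_values)
F_CRIT = 10.13  # critical F(1, 3) at alpha = 0.05
quadratic_preferred = F > F_CRIT
```

With only six set sizes per participant the quadratic model leaves three residual degrees of freedom, so the added term must reduce the error substantially before the test favors it. For the made-up increasing-then-decreasing K function above, the quadratic term is negative and clearly passes that threshold.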

Given that Max K is the most commonly used measure of capacity in the literature, we correlated the quadratic (Color) and linear (Shape) coefficients with Max K to examine the relationship between these measures. There was a significant positive correlation, r = 0.50, p < 0.05, between the quadratic coefficient and Max K for Color – participants with a strong negative quadratic coefficient who generally performed poorly at high set sizes had lower Max K, while participants with less negative quadratic coefficients (e.g., near −0.1) had higher Max K (see **Figure 2C**). Note that the linear and quadratic coefficients for Color were negatively correlated, r = −0.82, p < 0.001 (see **Figure 2D**). This linear term serves to shift the peak of the quadratic function so that the fit does not fall off until the K function does – around Set Size 4. For Shape, there was a significant positive correlation, r = 0.76, p < 0.001, between the linear coefficient and Max K (see **Figure 2B**). Thus, participants with higher capacity tended to show an increase in performance across set size while lower capacity subjects showed no improvement or a decline across set size.

#### fMRI Results

As a preliminary step in the fMRI analysis, we determined which of the 30 ROIs identified from the VWM literature were responsive to the memory load manipulation. To this end, we conducted a two-factor (Set Size, Dimension) ANOVA on data from each ROI. Eight ROIs (five from the meta-analysis, three from Drucker and Aguirre, 2009) showed a significant effect of Set Size or an interaction between Set Size and Dimension – left Temporo-Parietal Junction (LTPJ), left Occipital Cortex (LOCC), left Ventral Occipital Cortex (LVOC), right Intraparietal Sulcus (RIPS), right Superior Intraparietal Sulcus (RsIPS), right face-selective Middle Fusiform Gyrus (RfsMFG), and left and right V3a (LV3a, RV3a). Only average beta values from these eight ROIs were included in further analyses.

**Figure 3** shows average percent signal change across the set size manipulation for each cluster. LTPJ was the only cluster to show a decline in the BOLD response across Set Size, F(5,95) = 2.71, p < 0.05, replicating findings reported by Todd and Marois (2005). Note that there were no significant differences in the LTPJ response across stimulus dimensions. Additionally, V3a showed a very gradual increase in the BOLD response across set size, F(5,95) = 2.68, p < 0.05. Once again, there were no significant differences in the V3a response across stimulus dimensions, although the BOLD response was generally higher for Shape than for Color [F(1,95) = 3.29, p = 0.085].

FIGURE 2 | … values across participants. (B) Scatterplot of relationship between Shape Max K and linear coefficient of fits of Shape K functions for each individual across set sizes. (C) Scatterplot of relationship between Color Max K and quadratic coefficients of fits of Color K functions for each individual across set size. (D) Scatterplot showing linear versus quadratic coefficients for quadratic fits of Color K functions across set size.

The remaining five clusters showed an increasing pattern across set size, with a decline at set size 6. Data from these clusters were analyzed together in a three-factor ANOVA with Set Size, Dimension, and Cluster as factors. There was a significant main effect of SS, F(5,380) = 4.48, p < 0.001, and a significant SS × Dimension interaction, F(5,380) = 2.40, p < 0.05. The interaction effect is shown in **Figure 4**. The BOLD response for the Color dimension rises more steeply and remains high across set sizes 3–6. By contrast, the BOLD response for the Shape dimension rises more gradually and falls off dramatically at set size 6. Post hoc tests determined that the BOLD response for the Color dimension was significantly greater than the Shape dimension at SS3 and SS6, p < 0.05. This is consistent with behavioral results that showed greater Max K for Color than for Shape.

In the previous section, we reported that individual differences in Max K for color were correlated with individual differences in Max K for Shape. Do these individual differences hold at the level of the brain as well? To investigate this issue, we measured the maximum BOLD response within each cluster across set sizes for each participant and dimension as well as the BOLD response at the set size at which the maximum K value occurred. We then correlated the neural measures. As can be seen in **Table 1**, the Max signal and Max K signal measures are highly correlated within dimensions for 14 of 16 comparisons across clusters. The two comparisons that did not reach significance were both along the shape dimension.

FIGURE 3 | Average percent BOLD signal change across set size for each ROI that demonstrated a significant effect of Set Size. Error bars depict ± 1/2 SE.

The measures were also compared across dimensions. There were significant cross-dimension correlations in VOC, RIPS, and RsIPS (see **Figure 5**). In VOC and RIPS, the Max BOLD responses were correlated across dimensions, while in RsIPS, multiple significant correlations were observed. Thus, in these areas, participants with stronger neural responses when remembering items that varied along one dimension, also tended to have stronger neural responses when remembering stimuli along the other dimension.

#### Brain-Behavioral Correlations

The central question in this study was whether individual differences in behavioral capacity were correlated with individual differences in neural capacity and, further, whether these correlations held despite differences in capacity across stimulus dimensions. To examine this question, we correlated the five behavioral measures (Max K for Shape, Max K for Color, the linear coefficient for Shape, and the linear and quadratic coefficients for Color) with the four neural measures (Max BOLD for Shape/Color, BOLD at Max K SS for Shape/Color) within the eight clusters showing statistically robust differences in the neural response across set sizes. **Table 2** shows the results.

The first striking result is that there were no significant brain-behavior correlations with the Max K measures. The absence of any significant correlations between the standard behavioral capacity measure (K) and neural capacity measures is not consistent with previous findings (Vogel and Machizawa, 2004; Todd and Marois, 2005).

One limitation of Max K is that it is a point estimate of a function. In this context, it is interesting that there were multiple significant correlations between the neural data and coefficients from the curve fits. Nevertheless, brain-behavior correlations for the curve fits for Shape were all in the opposite direction of what was expected (see light gray shading). In particular, the four significant correlations with the linear coefficient for Shape were negative, that is, the stronger the BOLD response for Shape, the shallower the slope of the K function for Shape across set sizes. As with the Max K measure, there were no significant correlations between the behavioral curve fits and the neural measures for Color.

# DISCUSSION

The central goal of this study was to investigate the relationship between behavioral estimates of VWM capacity and neural estimates of VWM capacity using an individual differences approach. In particular, we conducted an fMRI experiment where we varied the complexity of the stimulus dimensions participants had to remember. Based on findings from Song and Jiang (2006), we expected that this would shift VWM capacity between dimensions. The question was whether high capacity individuals for one dimension would remain high capacity individuals for the second dimension, and, further, whether brain-behavior correlations would remain robust across this shift in capacity.

Behavioral results from this study were consistent with the expected shift in VWM capacity across dimensions. In particular, capacity for colors was higher and less variable than capacity for shape. In addition, there were robust individual differences in capacity across dimensions: participants with a high capacity for color also had high capacity for shape. Thus, we succeeded in shifting behavioral capacity across dimensions, replicating findings from Song and Jiang (2006; see also, Alvarez and Cavanagh, 2004). We also calculated secondary measures of behavioral capacity by fitting participants' K functions to linear and quadratic functions – quadratic for Color, linear for Shape. These novel behavioral measures were correlated with Max K. In particular, Max K was positively correlated with the quadratic fit coefficients for Color and linear coefficients for Shape, and negatively correlated with the linear fit coefficients for Color.

<sup>∗</sup>Correlation is significant at the 0.05 level. <sup>∗∗</sup>Correlation is significant at the 0.01 level.

We then used an ROI approach to identify brain areas that showed a statistically robust change over set size. ANOVA results replicated several key effects in the VWM and change detection literatures. In particular, we replicated the suppression in LTPJ as the memory load increased (Todd and Marois, 2005). We also found load-dependent responses in RIPS, RsIPS, LOCC, and LVOC (see, e.g., Todd and Marois, 2004; Song and Jiang, 2006; Harrison et al., 2010). Moreover, when the V3a areas were analyzed together, we found a weak dimension effect (p = 0.085) with a stronger neural response for Shape versus Color. This is consistent with findings from Drucker and Aguirre (2009). Results from the group analyses also revealed that Color showed a more robust neural response across set sizes than Shape. In particular, BOLD activation rose more quickly over set sizes and reached a more robust asymptote in LOCC, LVOC, RIPS, RsIPS, and RfsMFG. These results are not consistent with Song and Jiang's (2006) results – they reported greater BOLD activation for shapes than colors in superior parietal lobule, lateral occipital complex, and frontal eye fields. It is possible that this reflects differences in the shapes used across studies. Moreover, Song and Jiang presented variations in color and shape on each trial, asking participants to selectively attend to one dimension or the other. By contrast, we held one dimension constant while varying the other. Although our findings across dimensions clearly differ, there was a consistency across studies: Song and Jiang found a reduction in the BOLD response for shape at high set sizes, similar to the decrease observed at set size 6 here. This reduction in the BOLD response at high set sizes has also been observed with young children (Buss et al., 2014).

TABLE 2 | Correlations between behavioral and neural measures (light gray shading indicates a correlation in a direction opposite of what was expected).

<sup>∗</sup>Correlation is significant at the 0.05 level.

To analyze individual differences at the neural level, we extracted Max BOLD and BOLD at Max K SS measures from the ROI data. Within dimension, these measures were highly correlated with each other across all ROIs. Moreover, there were robust individual differences across dimensions in VOC, RIPS, and RsIPS: participants with strong neural responses to Color also had strong neural responses to Shape. Thus, individual differences at the neural level were preserved across dimensions even though there was a significant reduction in capacity moving from Color to Shape. RIPS and RsIPS have been shown in previous studies to represent the spatial positions of objects in VWM (Harrison et al., 2010), possibly binding features together by virtue of their shared spatial positions. If these areas provide a general index of bound object representations, it might explain the robust correlations across dimensions, in that high-capacity individuals would be expected to have robust object representations regardless of the featural content.

In the final analysis step, we examined whether individual differences in the behavioral measures were related to individual differences in the neural measures. There were no significant correlations with Max K; this was surprising given previous results (Vogel and Machizawa, 2004; Todd and Marois, 2005). Note that Todd and Marois (2005) reported significant correlations between Max K and a normalized BOLD signal in both IPS and VOC. We examined whether normalizing our BOLD data would have an impact; this was not the case. We also examined whether curve fitting of the BOLD data across set size might yield a replication of Todd and Marois' findings. Once again, this was not the case.

We did find several cases of the opposite correlational pattern where a stronger BOLD response for shape was correlated with an index of lower behavioral capacity. It is possible that this reflects selective color processing in two of these areas – LOCC and LVOC. That is, if these areas are selective for color processing, one might expect that greater BOLD activation on shape trials in these areas might be indicative of poorer performance. By contrast, given that RV3a is a shape-selective area, it is not clear why we found a negative correlation between behavioral and neural capacity for shape VWM in this area.

Note that our report is not the first to find a mixed pattern of results when comparing individual differences in VWM capacity across behavioral and neural levels. Todd and Marois (2005) reported robust correlations between behavioral estimates of capacity and IPS activity; however, correlations with VOC activity were only significant in one experiment. Correlations with BOLD responses from all other ROIs were not significant. Critically, both studies had relatively limited sample size for investigations of individual differences (20 in the present report; 17 in Todd and Marois, 2005). This may have contributed to the sparse brain-behavior correlations.

Our conclusion from the present study is that there is a complex relationship between behavioral capacity and neural capacity. This is consistent with recent theoretical work. For instance, Johnson et al. (2014) used a dynamic neural field model of VWM to bridge between the behavioral and neural levels. Their model successfully reproduced patterns of behavioral data across set sizes in detail, including performance on correct and incorrect trials (see also, Johnson et al., 2009a,b). They also found an asymptote in neural activation over set sizes for some neural measures. Nevertheless, there was not a one-to-one relationship between behavioral estimates of capacity and the number of neural representations actively maintained by the model. That is, models with a behavioral capacity of 3–4 items often actively maintained 4–6 items in VWM.

Importantly, recent work has demonstrated that dynamic field models can provide useful insights into individual differences as well (see Perone and Spencer, 2013, 2014). Moreover, we have developed a method to simulate hemodynamics directly from dynamic field models (Buss et al., 2013). These two innovations suggest that dynamic field theory could be a useful theoretical framework to explore the relationship between behavioral and neural VWM capacity in greater detail. This will be a target of future work.

In summary, our results provide evidence that individual differences in both behavioral and neural measures are preserved across shifts in capacity created by processing simple versus complex features. Further, our results provide some evidence that higher capacity individuals determined by behavioral measures are also higher capacity individuals at the neural level. Nevertheless, there is clearly a complex relationship between behavioral estimates of capacity and neural estimates of capacity. Future work will be needed to clarify this relationship, and we suggest that recent neurally grounded theories of VWM might prove useful on this front.

#### AUTHOR CONTRIBUTIONS

JA is the primary author of the manuscript. JS and AB designed the experiment. JA, JS, and SW analyzed and interpreted data from the experiment. JS, SW, and AB all provided important revisions to the manuscript.

# FUNDING

This work was supported by NSF grant BCS-1029082.

# ACKNOWLEDGMENTS

We acknowledge support from NSF grant BCS-1029082. We would also like to thank Vince Magnotta for his help with collecting and processing data.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Ambrose, Wijeakumar, Buss and Spencer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neonatal Perirhinal Lesions in Rhesus Macaques Alter Performance on Working Memory Tasks with High Proactive Interference

#### Alison R. Weiss <sup>1</sup> \*, Ryhan Nadji <sup>1</sup> and Jocelyne Bachevalier <sup>1, 2</sup>

*<sup>1</sup> Department of Psychology, Emory University, Atlanta, GA, USA, <sup>2</sup> Division of Developmental Cognitive Neuroscience, Yerkes National Primate Research Center, Atlanta, GA, USA*

The lateral prefrontal cortex is known for its contribution to working memory (WM) processes in both humans and animals. Yet, recent studies indicate that the prefrontal cortex is part of a broader network of interconnected brain areas involved in WM. Within the medial temporal lobe (MTL) structures, the perirhinal cortex, which has extensive direct interactions with the lateral and orbital prefrontal cortex, is required to form active/flexible representations of familiar objects. However, its participation in WM processes has not been fully explored. The goal of this study was to assess the effects of neonatal perirhinal lesions on maintenance and monitoring WM processes. As adults, animals with neonatal perirhinal lesions and their matched controls were tested in three object-based (non-spatial) WM tasks that tapped different WM processing domains, e.g., maintenance only (Session-Unique Delayed Nonmatching-to-Sample, SU-DNMS), and maintenance and monitoring (Object-Self-Order, OBJ-SO; Serial Order Memory Task, SOMT). Neonatal perirhinal lesions transiently impaired the acquisition of SU-DNMS at a short (5 s) delay, but not when re-tested with a longer delay (30 s). The same neonatal lesions severely impacted acquisition of the OBJ-SO task, and the impairment was characterized by a sharp increase in perseverative errors. By contrast, neonatal perirhinal lesions spared the ability to monitor the temporal order of items in WM as measured by the SOMT. Unlike the SU-DNMS and OBJ-SO tasks, which re-use the same stimuli across trials and thus produce proactive interference, the SOMT uses novel objects on each trial and is devoid of interference. Therefore, the impairment of monkeys with neonatal perirhinal lesions on the SU-DNMS and OBJ-SO tasks is likely to be caused by an inability to solve working memory tasks with high proactive interference. The sparing of performance on the SOMT demonstrates that neonatal perirhinal lesions do not alter working memory processes *per se* but rather impact processes modulating impulse control and/or behavioral flexibility.

#### Edited by:

*Zsuzsa Kaldy, University of Massachusetts Boston, USA*

#### Reviewed by:

*Philip Browning, National Institutes of Health, USA Toshiyuki Hirabayashi, National Institute of Radiological Sciences, Japan*

\*Correspondence:

*Alison R. Weiss alison.weiss@emory.edu*

Received: *13 August 2015* Accepted: *04 December 2015* Published: *05 January 2016*

#### Citation:

*Weiss AR, Nadji R and Bachevalier J (2016) Neonatal Perirhinal Lesions in Rhesus Macaques Alter Performance on Working Memory Tasks with High Proactive Interference. Front. Syst. Neurosci. 9:179. doi: 10.3389/fnsys.2015.00179*

Keywords: excitotoxic lesion, self-ordered task, serial order memory, perseveration, proactive interference

# INTRODUCTION

Working memory (WM) refers to the psychological and neural processes responsible for keeping active a limited set of cognitive representations, and the executive capacity that acts upon those transiently stored representations. In other words, representations of objects, places, ideas, goals, or rules are maintained in WM and flexibly cooperate with processes that monitor or manipulate the representations being kept "in mind." Domain-specific models of WM have proposed that the lateral prefrontal cortex has a topographical organization according to specific WM processes. Evidence from human functional imaging (Ungerleider et al., 1998; D'Esposito et al., 1999; Owen et al., 1999; Petrides, 2000; Cannon et al., 2005) and lesion studies in monkeys (Mishkin et al., 1969; Passingham, 1975; Mishkin and Manning, 1978; Kowalska et al., 1991; Petrides, 1991, 1995) strongly supports a distinction between the ventrolateral PFC (vlPFC) associated with maintenance processes and dorsolateral PFC (dlPFC) associated with monitoring/manipulation processes. However, more recent studies suggest that the prefrontal cortex is part of a broader network of interconnected brain areas involved in WM (see for review Constantinidis and Procyk, 2004). Specifically, medial temporal lobe (MTL) structures are also recruited during WM tasks (Kimble and Pribram, 1963; Diamond et al., 1989; Petrides, 1991, 1995, 2000; Davachi and Goldman-Rakic, 2001; Stern et al., 2001; Ranganath et al., 2004; Libby et al., 2012; Warren et al., 2012). In a recent report, Heuer and Bachevalier (2011) demonstrated that neonatal damage to the hippocampus in monkeys resulted in severe loss of WM-monitoring abilities, but spared WM-maintenance abilities. 
Given that the only direct inputs of the hippocampus to the PFC target the ventromedial PFC via the fornix, but not the vlPFC or dlPFC (Cavada et al., 2000; Croxson et al., 2005), bottom-up information from the hippocampus to the dlPFC will need to be realized via a multisynaptic pathway. Yet, the dlPFC projects back to the posterior hippocampus (Goldman-Rakic et al., 1984; Morris et al., 1999) providing a potential top-down mechanism regulating hippocampal-dependent WM processes.

Another MTL structure well positioned to play a prominent role in WM processes is the perirhinal cortex (PRh), which has direct reciprocal connections not only with the hippocampus but also with lateral and orbital PFC fields (Suzuki and Amaral, 1994; Lavenex et al., 2002; Saunders et al., 2005). In addition, electrophysiological and functional imaging studies have reported increased activity in PRh during object-based WM tasks, and PRh neurons of adult macaques are highly activated during WM tasks requiring the temporary maintenance of object representations (i.e., small-set delayed-match-to-sample). Such neuronal changes were not observed in other temporal visual areas, such as area TE (Lehky and Tanaka, 2007). Likewise, 2-deoxyglucose imaging studies indicate increased activity in PRh (but not the entorhinal cortex) during a delayed object alternation task; a task requiring the maintenance and monitoring of information in WM (Davachi and Goldman-Rakic, 2001). Taken together, these results point to a unique contribution of the PRh to performance on tasks that require the active/flexible representation of familiar objects.

Although the critical contribution of the PRh to recognition and stimulus-stimulus association memory has been well documented (Murray et al., 1993; Brown and Aggleton, 2001; Lavenex et al., 2004; Lee et al., 2006; Warburton and Brown, 2010), its participation in WM processes remains to be fully investigated. In a longitudinal developmental study aimed at tracking the long-term effects of neonatal PRh cortex lesions on memory processes, we recently demonstrated that these early-onset lesions yielded severe recognition memory deficits that emerged in infancy and persisted until adulthood (Zeamer et al., 2015; Weiss and Bachevalier, 2016). In the present study, we tested whether the same neonatal PRh lesions will result in WM deficits and whether the deficits will encompass both maintenance and monitoring WM processes. As they reached adulthood, animals with neonatal PRh lesions and their controls were successively tested in three object-based working memory tasks previously used to assess the effects of neonatal hippocampal lesions on WM processes (Heuer and Bachevalier, 2011, 2013).

# MATERIAL AND METHODS

### Subjects

Fifteen adult rhesus macaques (Macaca mulatta), nine females and six males, participated in this study. Between postnatal days 10–12, the animals underwent surgery to create bilateral lesions of the perirhinal cortex, or sham operations. Six infant monkeys (three females, three males) were given MRI-guided ibotenic acid injections into perirhinal areas 35 and 36 (Group Neo-PRh), seven monkeys (five females, two males) underwent the same surgical procedures without any injections (Group Neo-C), and two additional monkeys (one female, one male) served as un-operated controls. At the time of this study, all animals were 6–7 years old and housed individually in a room with a 12 h light/dark cycle (7 AM/PM). Monkeys were fed Purina Old World Primate chow (formula 5047) and supplemented with fresh fruit enrichment. During behavioral testing, chow was restricted and the weight of the animals was monitored and maintained at or above 85% of the full feed weight. Water was given ad libitum. One cohort of subjects was born at the YNPRC breeding colony (Lawrenceville, Georgia), and a second cohort was born at the breeding colony of the University of Texas, M.D. Anderson Cancer Center Science Park (Bastrop, TX). At both institutions, all animals received similar rearing and behavioral procedures, including social interactions with age-matched peers and human caregivers as described previously (for detailed description see Goursaud and Bachevalier, 2007; Raper et al., 2013).

All animals had received extensive, but similar, cognitive testing before they participated in this experiment, including tests of incidental recognition memory (visual paired comparison at 1, 6, and 18 months; Zeamer et al., 2015), oddity learning (3 and 15 months), concurrent discrimination learning with devaluation (48 months), and object and spatial recognition memory (60 months; Weiss and Bachevalier, 2016).

All protocols were approved by the Institutional Animal Care and Use Committee at Emory University in Atlanta, Georgia and conformed to the NIH Guide for the care and use of Laboratory Animals (National Research Council (US), 2011).

#### Neuroimaging and Surgical Procedures

All neuroimaging and surgical procedures were described in detail by Zeamer et al. (2015) and are briefly summarized below. To determine injection coordinates prior to surgical procedures and assess lesion extent post-surgery, subjects were given MRIs immediately prior to surgery and 6–8 days post-surgery. At both time points, animals were sedated (10 mg/kg of 7:3 Ketamine Hydrochloride, 100 mg/ml, and Xylazine, 20 mg/ml, administered i.m.) and intubated to allow inhalation of isoflurane (1–2%, v/v) and maintain an appropriate plane of anesthesia for the duration of the scan. An IV drip (0.45% NaCl and dextrose) was provided for normal hydration and the animal's head was restrained in a stereotaxic apparatus. Vital signs (heart and respiration rates, blood pressure, body temperature, and expired CO2) were constantly monitored during the scan and surgical procedures. The brain was imaged with a 3T Siemens Magnetom Trio system (Siemens Medical Solutions, Malvern, PA at YNPRC) using a 5-cm surface coil and two sets of images were obtained: (1) high-resolution structural images [3D T1-weighted fast spoiled gradient (FSPGR)-echo sequence, TE = 2.6 ms, TR = 10.2 ms, 25° flip angle, contiguous 1 mm sections, 12 cm FOV, 256 × 256 matrix]; and (2) Fluid Attenuated Inversion Recovery (FLAIR) images [TE = 140 ms, TR = 1000 ms, inversion time (TI) = 2200 ms, contiguous 3 mm sections, 12 cm FOV, 256 × 256 matrix; image sequences acquired in three series offset 1 mm posterior]. The pre-surgical T1-weighted images were used to calculate the injection sites and all pre- and post-surgical images were used to estimate the extent of PRh damage as well as damage to adjacent structures.

Following the pre-surgical scans, animals were maintained with Isoflurane gas (1–2%, v/v, to effect) during the surgical procedures, which were performed under deep anesthesia using aseptic conditions. The scalp was shaved and cleaned with chlorhexidine diacetate (Nolvasan, Pfizer). A long-lasting local anesthetic, Bupivacaine Hydrochloride (Marcaine 25%, 1.5 ml), was injected along the planned midline incision of the scalp, which extended from the occipital to the orbital ridge. After retraction of the galea, bilateral craniotomies (1 cm wide × 2.5 cm long) were made with an electric drill above the areas to be injected, and bone wax (Ethicon, Inc., Somerville, NJ; 2.5 g size) was applied as necessary to prevent bleeding. The dura was opened and injections of 0.4 µl ibotenic acid (Biosearch Technologies, Novato, CA, 10 mg/ml in PBS, pH 7.4, at a rate of 0.4 µl/min) were made 2 mm apart along the rostral-caudal length of the perirhinal cortex bilaterally. Sham-operated controls (Neo-C) underwent the same procedures; however, once the dura was cut, no injections were made.

The dura, galea, and skin were closed in anatomical layers and the animal was removed from isoflurane, extubated, and closely monitored until complete recovery from anesthesia. Analgesic (acetaminophen, 10 mg/kg, p.o.) was given QID for 3 days after surgery. Additionally, animals received dexamethasone sodium phosphate (0.4 mg/kg, i.m.) to reduce edema, and Cephazolin (25 mg/kg, i.m.) once a day starting 12 h prior to surgery and ending 7 days after to prevent infection.

# Lesion Assessment

Histological evaluations are unavailable, as all animals are currently participating in other experiments. Hence, lesion extent was estimated using the MRI images following methods described in detail in earlier publications (Málková et al., 2001; Nemanic et al., 2002). Briefly, coronal FLAIR images acquired 1 week post-surgery were used to examine areas with water hypersignals (edema) induced by cell death. Areas of hypersignal seen in each coronal section were drawn onto corresponding coronal sections of a normal 1-week-old rhesus monkey brain (J. Bachevalier, unpublished atlas) using Adobe Photoshop. These images were then imported into ImageJ® and the surface area of hypersignals in brain regions of interest (PRh, visual area TE/TEO, entorhinal cortex, parahippocampal cortex, amygdala, and hippocampus) was calculated in pixels<sup>2</sup> and multiplied by image thickness (1 mm) to obtain the lesion volume. The percent of damage to each structure was obtained by dividing the volume of the lesion for a given structure by the volume of that same structure in the control atlas and multiplying by 100.
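The volume-ratio calculation described above can be illustrated with a short Python sketch. This is not the authors' analysis code: the function names and the numerical values are hypothetical, and the sketch assumes that the lesion and atlas volumes are expressed in the same units. The weighted-damage helper follows the formula given in the note to Table 1, W% = (L% × R%)/100.

```python
from typing import Sequence


def lesion_percent(hypersignal_areas: Sequence[float],
                   section_thickness: float,
                   atlas_volume: float) -> float:
    """Percent damage to one structure.

    hypersignal_areas: hypersignal surface area measured on each coronal section.
    section_thickness: thickness of each section (1 mm in the text).
    atlas_volume: volume of the same structure in the control atlas,
                  assumed here to be in the same units as the lesion volume.
    """
    lesion_volume = sum(hypersignal_areas) * section_thickness
    return 100.0 * lesion_volume / atlas_volume


def weighted_damage(left_percent: float, right_percent: float) -> float:
    """Weighted bilateral damage, W% = (L% x R%) / 100 (Table 1 note)."""
    return left_percent * right_percent / 100.0


# Hypothetical example values, for illustration only.
example_areas = [120.0, 150.0, 130.0]
example_percent = lesion_percent(example_areas, 1.0, 800.0)
example_weighted = weighted_damage(80.0, 70.0)
```

The weighted index is small whenever either hemisphere is largely spared, so it emphasizes bilaterally symmetric damage more than the simple average does.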

# Apparatus and Stimuli

All behavioral tasks were conducted using the Wisconsin General Testing Apparatus (WGTA) located in a dark room with a white-noise generator. Monkeys were transferred from their home cages and positioned in the WGTA facing a tray with 3 recessed food wells (2 cm diameter, 1 cm deep, spaced 13 cm apart). Correct responses were rewarded with preferred food rewards (e.g., mini-marshmallows, jelly beans, M&Ms).

# Session-Unique Delayed Nonmatching-to-Sample (SU-DNMS)

Session-Unique Delayed Nonmatching-to-Sample (SU-DNMS) measured the maintenance of information in working memory and used training procedures described in Heuer and Bachevalier (2011). For each daily training session, a new pair of objects was selected from a collection of 1000 junk objects without replacement. Each trial consisted of two phases: sample and choice. During the sample phase, the monkey was presented with a single object covering a reward, followed by a delay of 5 s. In the choice phase, two objects, the sample object and the second object, were presented and the monkey was rewarded for selecting the object that was not rewarded during the sample phase. Following a 30 s inter-trial interval, the same two objects were used for the next trial as well as for all 30 trials of the daily session. The object serving in the sample phase varied on each trial using a pseudorandom sequence. In the first trial, the two objects were novel, but as the daily session progressed, the two stimuli became highly familiar and generated proactive interference. Thus, in SU-DNMS, familiarity/novelty judgments could not be used to guide responses; rather, subjects were required to generate responses based on recency memory and inhibit responses based on recognition memory. Learning criterion was set at 90% or better (27 out of 30) in one session, followed by a performance of 80% or better (24 out of 30) in the next training session. Training was discontinued after a maximum of 1000 trials if criterion was not met. Once subjects met learning criterion at the 5 s delay, testing was continued in the same way using a 30 s delay and a 30 s inter-trial interval. At this longer delay, subjects performed 20 trials per day, again using a novel pair of objects each day, until a learning criterion of 85% averaged over two consecutive testing sessions was achieved, or to a maximum of 500 trials.
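As an illustration of the 5 s learning criterion, the following Python sketch scans a sequence of daily scores for a session with at least 27 of 30 correct that is immediately followed by a session with at least 24 of 30 correct. The function name and the convention of returning the 1-based index of the second criterion session are assumptions made for this example; the text does not specify how criterion sessions were counted.

```python
from typing import Optional, Sequence


def session_completing_criterion(correct_counts: Sequence[int],
                                 first_min: int = 27,
                                 second_min: int = 24) -> Optional[int]:
    """Return the 1-based index of the session that completes criterion.

    Criterion: one session with >= first_min correct, followed by a session
    with >= second_min correct. Returns None if criterion is never met.
    """
    for i in range(len(correct_counts) - 1):
        if (correct_counts[i] >= first_min
                and correct_counts[i + 1] >= second_min):
            return i + 2  # i is 0-based; the following session completes criterion
    return None


# Hypothetical daily scores (number correct out of 30 trials).
example_scores = [20, 25, 27, 23, 28, 24]
example_result = session_completing_criterion(example_scores)
```

In this hypothetical record, the third session reaches 27 but is followed by 23, so criterion is completed only by the fifth and sixth sessions.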

The total number of errors (incorrect choices) until meeting criterion at each delay was used as a measure of learning. We also examined how the errors were distributed between the two objects across the daily trials. If errors were distributed equally between the objects, it suggested that the cause of the errors was an impaired ability to maintain information in working memory. On the other hand, if errors were biased toward one object, it instead suggested that the cause of the errors was an impairment of non-mnemonic processes important to support task performance. To test this proposal, we computed an Object Error Distribution Ratio by calculating the absolute value of percent errors made for each object during each daily session minus 50% [|(# Errors per Object/Total Errors in Session) × 100% − 50%|]. These values ranged from 0 to 50, where 0 represented an equal distribution of errors between the two objects and 50 represented a complete bias toward one of the objects.
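A minimal Python sketch of the Object Error Distribution Ratio is given below. The function name is hypothetical, and returning `None` for a session without errors is an assumption of this example, since the ratio is undefined in that case and the text does not say how such sessions were handled.

```python
from typing import Optional


def object_error_distribution_ratio(errors_object_a: int,
                                    errors_object_b: int) -> Optional[float]:
    """|percent of session errors made on one object - 50|.

    0 means errors were split evenly between the two objects;
    50 means all errors involved a single object.
    """
    total_errors = errors_object_a + errors_object_b
    if total_errors == 0:
        return None  # assumption: ratio left undefined for error-free sessions
    percent_a = 100.0 * errors_object_a / total_errors
    return abs(percent_a - 50.0)


# Hypothetical session: 4 errors on one object, 1 on the other.
example_ratio = object_error_distribution_ratio(4, 1)
```

Because only two objects are used in a session, the value is the same whichever object is taken as the reference.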

#### Object Self-Ordered Task (OBJ-SO)

This task measured both maintenance and monitoring WM processes, and procedures replicated those described in Heuer and Bachevalier (2011). A set of three new objects, not used in the SU-DNMS task, was selected for the OBJ-SO task. During each daily testing session, monkeys chose three objects, one at a time, during three successive trials. At the start, all three objects were presented covering each of the three food wells with a food reward (Trial 1). Once the monkey made a first choice, the position of the objects on the tray was shuffled and only the two objects unselected in Trial 1 were baited in Trial 2. After the second choice, the positions of the objects were once again shuffled and only the single object left unselected in Trials 1 and 2 was baited on Trial 3. The same three objects were used in all daily testing sessions and were presented at 10 s inter-trial intervals. If, at any time during Trial 2 or 3, the monkey selected an unbaited object, this initial error was scored as a primary error and a correction procedure was initiated. Correction procedures involved reordering the objects and re-presenting them to the monkey until a rewarded object was selected. The number of times the correction procedure was repeated indicated the number of perseverative errors. For analyses, primary and perseverative errors were calculated separately for Trial 2 and Trial 3. Additionally, the percent of errors on Trial 3 that were "repeats" of the errors made on Trial 2 was also tabulated as a measure of impulsive responding.

Learning criterion for the OBJ-SO task was met when subjects scored 85% correct across 10 consecutive daily sessions (three primary errors or fewer), or testing was discontinued if subjects reached a maximum of 50 daily sessions. Thus, in OBJ-SO monkeys were rewarded for making choices based on the temporal sequence of their own object selections in previous trials of the daily testing session.

# Serial Order Memory Task (SOMT)

Similar to the OBJ-SO task, the SOMT assessed both maintenance and monitoring WM processes and was delivered using procedures described by Heuer and Bachevalier (2013). A pool of new objects was selected for each trial of this task from another collection of 1000 junk objects that differed in size, shape, color, and texture. The objects were divided into 25 bins of 40 objects each and each bin was selected for testing one at a time until all 25 bins were used before re-using the first bin. Thus, objects only reappeared about once per month. A trial of SOMT consisted of two phases: the sample phase and the test phase. In the sample phase, a list of objects was presented one at a time at 10 s intervals covering the baited center food-well. After displacing the last object of the list and retrieving the food rewards, there was a 10 s delay after which the test phase began. In the test phase, two of the objects from the list were selected and covered the lateral food-wells. The monkey was rewarded for displacing the object that occurred earliest in the list. After a 30 s inter-trial interval, the next trial began using a new set of objects. A total of 10 trials were given for each daily session.

The monkeys were first trained to criterion using lists of three objects. Training progressed in stages: during Stage 1, the test phase paired the first and third objects (1v3), Stage 2 paired the first and second (1v2), and Stage 3 paired the second and third (2v3). The monkey was required to score 80% (8/10) correct during a daily session before moving to the next stage. If the monkey scored 70% (7/10), then that stage was repeated the following session. If the monkey scored 60% or less (6/10), then it was moved back to the previous stage. Once the monkey completed the three-object version, it moved on to a four-object version including six stages in which the orders of object pairings in the test phase were as follows: 1v4, 1v3, 1v2, 2v4, 3v4, and 2v3. It is worth noting that only discrimination problems including objects 2v3 required the animals to maintain the order of the objects presented in the list, since with training monkeys could learn that for the other discrimination problems Objects 1 were always rewarded and Objects 4 were never rewarded. After completing training on the four-object SOMT, monkeys were tested with probe trials.
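The stage-progression rule can be summarized with a short Python sketch. The names below are hypothetical, and one detail is an assumption of this example rather than something stated in the text: a monkey scoring 6/10 or less on the first stage is kept on the first stage, because there is no earlier stage to return to.

```python
THREE_OBJECT_STAGES = ["1v3", "1v2", "2v3"]
FOUR_OBJECT_STAGES = ["1v4", "1v3", "1v2", "2v4", "3v4", "2v3"]


def next_stage_index(current_index: int, n_correct: int) -> int:
    """Index of the stage to run in the next 10-trial daily session.

    >= 8 correct: advance to the next stage.
    7 correct:    repeat the current stage.
    <= 6 correct: move back one stage (assumed to stay at stage 0 if already there).
    An index equal to the number of stages means the version is complete.
    """
    if n_correct >= 8:
        return current_index + 1
    if n_correct == 7:
        return current_index
    return max(current_index - 1, 0)


# Hypothetical run through the three-object version.
index = 0
for score in [9, 7, 5, 8, 8, 10]:
    if index == len(THREE_OBJECT_STAGES):
        break
    index = next_stage_index(index, score)
completed_three_object = index == len(THREE_OBJECT_STAGES)
```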

Probe trials were administered to assess the ability of the monkeys to track the serial position of objects presented in sequence. This training was identical to the four-object version described above, except that half of the trials (five trials) were judgments between 1v4, and the other half (five trials) were judgments between 2v3. These two trial types were randomized within a daily session so that the monkey could not anticipate which temporal judgments would occur on each trial. Probe trials, therefore, required the monkeys to track ALL of the stimuli in the list. Ten probe trials were administered daily for three consecutive days, resulting in a total of 15 trials of each type. A ratio score was calculated by dividing the total number of correct responses on "inner" pairings (2v3 trials) by the total number of correct responses on "outer" pairings (1v4 trials). A ratio score above or below 1 indicated superior performance on one type of temporal discrimination over another, whereas a score equal to 1 indicated equivalent performance on both trial types.
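The ratio score can be written out as a brief Python sketch. The function name is hypothetical, and returning `None` when no outer pairings were answered correctly is an assumption of this example, because the ratio is undefined in that case.

```python
from typing import Optional


def inner_outer_ratio(correct_inner: int, correct_outer: int) -> Optional[float]:
    """Correct responses on inner (2v3) trials divided by correct responses
    on outer (1v4) trials, each out of 15 probe trials in the text."""
    if correct_outer == 0:
        return None  # assumption: ratio left undefined without correct outer trials
    return correct_inner / correct_outer


# Hypothetical animal: 9 of 15 inner and 12 of 15 outer trials correct.
example_ratio = inner_outer_ratio(9, 12)
```

A value below 1, as in this hypothetical case, would indicate better performance on the outer (1v4) than on the inner (2v3) discriminations.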

#### Data Analyses

Scores of the control animals from the Texas cohort (n = 5) and control animals of the Georgia cohort (n = 4; see Subjects) were compared across all measures using independent sample t-tests. None reached significance, and so these groups were collapsed in a single control group for all subsequent analyses.

Data obtained from SU-DNMS and OBJ-SO followed a normal distribution, and so repeated measures ANOVAs were used to compare the scores of the Neo-PRh and Neo-C groups. For SU-DNMS, 2 × 2 ANOVAs (Group × Delay, 5 vs. 30 s) using Delay as the repeated factor were performed on the two parameters (errors to reach criterion, object error distribution ratio). For OBJ-SO, primary and perseverative errors were analyzed with a three-way ANOVA (Group × Error Type × Trial) with repeated measures for the last two factors. Finally, independent sample t-tests were used for both tasks to compare the performance of Neo-PRh and Neo-C groups on each measure.

Data from SOMT did not follow a normal distribution, with the exception of the Inner:Outer ratio score. Both nonparametric and parametric analyses were used for all measures. Given the similar pattern of results obtained with both analyses, only the parametric tests will be reported in the "Results" section below. For number of sessions to criterion, a 2 × 2 ANOVA (Group × Object-Pairing) with repeated measures for the second factor was performed. When sphericity was violated, degrees of freedom were adjusted using the Greenhouse-Geisser correction. Finally, group differences on probe trials (Inner:Outer ratio) were assessed using an independent sample t-test.

Correlations between extent of neonatal PRh lesions or unintended damage to adjacent areas and scores on the three tasks were performed with Pearson correlation. Lastly, for all ANOVAs, effect sizes are reported using eta squared (η<sup>2</sup>) and calculated by dividing the sums of squares for the effect of interest by the total sums of squares (Cohen, 1973; Levine and Hullett, 2002; Keppel and Wickens, 2004). For all t-tests, effect sizes are reported using Cohen's d and calculated by dividing the difference between the means of the two groups by the pooled standard deviations (Rosnow and Rosenthal, 1996).
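The two effect-size definitions can be made concrete with a small Python sketch that uses only the standard library. The function names and data are hypothetical. The text does not give the exact pooling formula, so the sketch assumes the common degrees-of-freedom-weighted pooled standard deviation.

```python
import math
from statistics import mean, variance
from typing import Sequence


def eta_squared(ss_effect: float, ss_total: float) -> float:
    """Eta squared: sum of squares for the effect / total sum of squares."""
    return ss_effect / ss_total


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d: difference between group means / pooled standard deviation.

    Assumes the pooled variance weights each group's sample variance by its
    degrees of freedom (n - 1); each group needs at least two scores.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


# Hypothetical scores, for illustration only.
example_d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
example_eta = eta_squared(28.0, 100.0)
```

With unequal group sizes, as in the present design, the weighting by degrees of freedom gives the larger group more influence on the pooled estimate.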

# RESULTS

#### Lesion Assessment

Detailed lesion assessments for all Neo-PRh animals have been published in Zeamer et al. (2015) and percentage of damage to the PRh and adjacent structures is given for each subject of Group Neo-PRh in **Table 1**. Briefly, all Neo-PRh animals received extensive bilateral damage to the PRh, averaging 73.6% (min = 67.1%, max = 83.3%). Unintended damage occurred in all cases, mostly in the entorhinal cortex (ERh) (average = 20.6%, min = 5.4%, max = 34.5%), but also minimally in area TE (average = 2.5%, min = 0.1%, max = 7.11%). Four of the six Neo-PRh subjects had negligible damage to the anterior hippocampus (average = 0.8%), and three of the six subjects had minimal damage to the amygdala (average = 2.5%). The PRh lesion of a representative case (Neo-PRh-4) is illustrated in **Figure 1** and two additional cases can be seen in previous publications (see Zeamer et al., 2015, **Figure 2** for case Neo-PRh-3, and Weiss and Bachevalier, 2016, **Figure 1** for case Neo-PRh-2).

#### SU-DNMS

The numbers of trials and errors to reach the learning criterion at each delay, 5 and 30 s, as well as the Object Error Distribution Ratios are reported in **Table 2**. All animals reached criterion at both the short and long delays, although animals with Neo-PRh lesions made roughly twice as many errors (Mean: 73 at 5 s delay and 34.8 at 30 s delay) as controls (Mean: 30.2 at 5 s delay and 18.4 at 30 s delay; see **Figure 2**). These group differences were confirmed by a significant group effect on the number of errors to reach criterion [F(1, 13) = 5.156, p = 0.041, η <sup>2</sup> = 0.28]. Planned comparisons revealed that the group difference at the 5 s delay was significant [t(13) = 2.207, p = 0.046, d = 1.12], but not at the 30 s delay [t(13) = −0.811, p = 0.432, d = 0.42]. Furthermore, although both groups improved their performance from the 5 to the 30 s delay (see **Figure 2**), the delay effect and the interaction (Group × Delay) were not reliable [F(1, 13) = 2.803, p = 0.118, η <sup>2</sup> = 0.14; F(1, 13) = 0.783, p = 0.392, η <sup>2</sup> = 0.05], indicating that the magnitude of improvement was similar for both groups.

*L%, percent damage to left hemisphere; R%, percent damage to right hemisphere; X%, average damage to both hemispheres; W%, weighted damage to both hemispheres [W% = (L% × R%)/100]. PRh, perirhinal cortex; ERh, entorhinal cortex. Lesion extents from cases Neo-PRh-1 thru Neo-PRh-6 were previously reported in Zeamer et al. (2015).*
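The lesion-extent indices defined in the Table 1 note can be illustrated with a short Python sketch. The function names and example percentages are hypothetical; W% follows the formula given in the note, and X% is assumed to be the simple mean of the two hemispheres.

```python
def average_damage(left_pct: float, right_pct: float) -> float:
    """X%: mean damage across hemispheres (assumed to be a simple average)."""
    return (left_pct + right_pct) / 2


def weighted_damage(left_pct: float, right_pct: float) -> float:
    """W%: weighted bilateral damage, W% = (L% x R%) / 100."""
    return (left_pct * right_pct) / 100


# Hypothetical lesion extents, not values from Table 1.
print(average_damage(80.0, 70.0))   # symmetric-ish lesion
print(weighted_damage(80.0, 70.0))
print(weighted_damage(100.0, 0.0))  # purely unilateral lesion
```

Because W% is a product, it drops toward zero when one hemisphere is largely spared, so it indexes how bilateral a lesion is rather than how large it is overall.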

The Object Error Distribution Ratio (**Table 2**) was also higher in animals with Neo-PRh lesions than controls at both delays, indicating a tendency to preferentially select one object over the other [F(1, 13) = 3.782, p = 0.075, η <sup>2</sup> = 0.23]. Neither the delay effect nor the interactions between the two factors reached significance [F(1, 13) = 0.100, p = 0.756, η <sup>2</sup> = 0.01 and F(1, 13) = 0.150, p = 0.705, η <sup>2</sup> = 0.01, respectively]. Yet, planned comparisons indicated that the group difference was significant at the 5 s delay but not at the 30 s delay [t(13) = 2.561, p = 0.024, d = 1.42 and t(13) = 1.143, p = 0.273, d = 0.61, respectively].

Additionally, errors made during the first block of 10 trials and last block of 10 trials in each daily session of the SU-DNMS task were tallied separately to determine if the monkeys tended to make more errors at the end of the session. A Group × Trial-Block (first-last) ANOVA with repeated measures for the second factor revealed a significant main effect of Group at the 5 s delay [F(1, 13) = 5.107, p = 0.042, η <sup>2</sup> = 0.282], but not at the 30 s delay [F(1, 13) = 0.754, p = 0.401, η <sup>2</sup> = 0.055], and a significant effect of Trial-Block at the 5 s delay [F(1, 13) = 5.084, p = 0.042, η <sup>2</sup> = 0.272] but not at the 30 s delay [F(1, 13) = 3.672, p = 0.078, η <sup>2</sup> = 0.218]. None of the interactions were significant [5 s: F(1, 13) = 0.640, p = 0.438, η <sup>2</sup> = 0.034; 30 s: F(1, 13) = 0.142, p = 0.712, η <sup>2</sup> = 0.008]. Thus, both groups of monkeys tended to make more errors on the last 10 trials than on the first 10 trials at 5 s delay, but not at 30 s delay.

and 30 s for animals with neonatal perirhinal lesions (filled bars) and controls (open bars). \**p* < 0.05.

#### OBJ-SO

Control animals reached criterion in an average of 12.7 testing days. In contrast, all but one of the six animals with Neo-PRh lesions (Neo-PRh-5) failed to reach criterion within the limit of testing (50 testing days), resulting in a group average of 43 testing days [t(13) = −3.454, p = 0.004, d = 1.81; see **Table 2**]. As shown in **Figures 3A,B**, this learning impairment was also reflected by a greater number of primary and perseverative errors on Trial 2 and Trial 3 made by Neo-PRh animals as compared to the Neo-C animals [Primary errors: t(13) = −3.444, p = 0.004, d = 1.68 and t(13) = −2.647, p = 0.020, d = 1.41 for Trial 2 and Trial 3, respectively; Perseverative errors: t(5.736) = −2.836, p = 0.031, d = 1.61 and t(13) = −2.901, p = 0.012, d = 1.50, for Trial 2 and Trial 3, respectively].

The three-way ANOVA (Group × Error Type × Trial) revealed significant main effects of Group [F(1, 13) = 9.597, p = 0.008, η <sup>2</sup> = 0.42] and Trial [F(1, 13) = 22.716, p < 0.001, η <sup>2</sup> = 0.55], but not of Error Type [F(1, 13) = 2.819, p = 0.117, η <sup>2</sup> = 0.15]. The three-way interaction also reached significance [F(1, 13) = 10.545, p = 0.006, η <sup>2</sup> = 0.21]. Thus, although both groups made more primary and perseverative errors on Trial 3 than on Trial 2, Group Neo-C had a similar increase in primary and perseverative errors across trials. By contrast, for Group Neo-PRh, the increase in perseverative errors from Trial 2 to Trial 3 was greater in magnitude than the increase in primary errors [Group × Trial interaction: F(1, 13) = 7.217, p = 0.019, η <sup>2</sup> = 0.13 and F(1, 13) = 2.172, p = 0.164, η <sup>2</sup> = 0.07, for Perseverative and Primary Errors, respectively].

Finally, to determine whether the increase of errors in animals with Neo-PRh lesions was due to impulsive reactivity, we assessed the animals' tendency to select in Trial 3 the same incorrect object they selected in Trial 2. The percent of errors on Trial 3 that repeated the errors on Trial 2 did not significantly differ between groups [t(13) = −0.435, p = 0.671, d = 0.24].

#### TABLE 2 | Performance on the SU-DNMS and OBJ-SO tasks.

*For Session Unique Delayed Non-Match to Sample (SU-DNMS), scores are number of trials and errors to criterion and the error distribution ratio at each delay. For the Object Self-Ordered task (OBJ-SO), scores are number of sessions and errors to criterion. Neo-C-2 and Neo-C-8 were not tested on SU-DNMS or OBJ-SO. Data from Neo-C-1 thru Neo-C-6 and Neo-C-11 previously reported in Heuer and Bachevalier (2011). Data from Neo-H-1 thru Neo-H-6 used for comparison in Section Comparisons with Neonatal Hippocampal Lesions and also reported in Heuer and Bachevalier (2011).*

#### SOMT

The numbers of sessions to reach criterion at each stage of object pairings on the three-Object and four-Object versions of this task are reported in **Table 3**. All monkeys acquired the task within the maximum number of sessions (20 per stage). On the three-Object version, the effects of Group (Neo-C vs. Neo-PRh), Object-Pairing stages (i.e., 1v3, 1v2, 2v3) and their interaction did not reach significance [F(1, 12) = 0.827, p = 0.381, η <sup>2</sup> = 0.064; F(1.230, 14.758) = 3.312, p = 0.083, η <sup>2</sup> = 0.216; F(1.230, 14.758) = 0.023, p = 0.920, η <sup>2</sup> = 0.002, respectively]. A similar pattern emerged on the four-Object version [Group: F(1, 12) = 3.197, p = 0.099, η <sup>2</sup> = 0.210; six Object-Pairing stages: F(2.503, 30.040) = 0.490, p = 0.659, η <sup>2</sup> = 0.036; Group × Object-Pairing interaction: F(2.503, 30.040) = 1.007, p = 0.392, η <sup>2</sup> = 0.075]. Therefore, both groups performed similarly on the three-Object and four-Object versions of the task.

Results of the probe trials are reported in **Table 3**. The Inner:Outer ratio scores of the Neo-PRh group averaged 0.84, indicating slightly better performance on 1v4 pairings than on 2v3 pairings. The Neo-C group averaged 0.97, indicating approximately equal performance on both pairings. However, the group difference was not significant [t(11) = −1.375, p = 0.197, d = 0.76].
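As a rough illustration of how such a probe score behaves, the sketch below assumes the Inner:Outer ratio is simply the number of correct choices on inner (2v3) problems divided by the number of correct choices on outer (1v4) problems. The function name and the counts are hypothetical and are not taken from the study's data.

```python
def inner_outer_ratio(inner_correct: int, outer_correct: int) -> float:
    """Correct inner (2v3) choices divided by correct outer (1v4) choices.

    A value near 1 means similar accuracy on both pairings; a value
    below 1 means fewer correct choices on the inner pairing.
    """
    if outer_correct <= 0:
        raise ValueError("outer_correct must be positive")
    return inner_correct / outer_correct


# Hypothetical counts chosen only to show the scale of the score.
print(inner_outer_ratio(21, 25))  # fewer correct inner than outer choices
print(inner_outer_ratio(24, 24))  # equal performance on both pairings
```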

#### Correlations

Finally, none of the correlations between the average bilateral extent of PRh damage and scores on each of the three working memory tasks reached significance (all ps > 0.05), indicating that greater extent of lesions was not related to performance on any of the tasks (see Supplemental Materials for details).

#### Comparisons with Neonatal Hippocampal Lesions

To investigate how the pattern of deficits after the Neo-PRh lesions contrasts with that previously reported after neonatal hippocampal (Neo-H) lesions, scores of Neo-PRh and Neo-C groups on the three working memory tasks were compared to those obtained by the Neo-H group (Heuer and Bachevalier, 2011, 2013). As shown in **Table 2**, Neo-H lesions appear to affect SU-DNMS acquisition (50 and 16 errors for 5 and 30 s, respectively) to a smaller degree than Neo-PRh lesions (73 and 35 errors for 5 and 30 s, respectively). However, differences between the three groups did not reach significance [5 s errors: F(2, 20) = 1.262, p = 0.307, η <sup>2</sup> = 0.123; 30 s errors: F(2, 20) = 0.574, p = 0.573, η <sup>2</sup> = 0.060]. In contrast, the Neo-PRh group was as impaired in learning the OBJ-SO task as the Neo-H group (see **Table 2**), the two groups averaging 43 and 44 sessions to reach criterion, respectively, as compared to 13 sessions for the controls [F(2, 20) = 7.164, p = 0.005, η <sup>2</sup> = 0.443; Neo-PRh vs. Neo-H: t(18) = 0.130, p = 0.898, d = 0.070; Neo-PRh vs. Neo-C: t(18) = 3.236, p = 0.005, d = 1.810; Neo-H vs. Neo-C: t(18) = −3.094, p = 0.006, d = 1.568]. Finally, comparisons between the effects of Neo-H lesions and Neo-PRh lesions on the SOMT (**Table 3**) indicated that the Neo-H group required more sessions (five sessions) to complete the 2v3 phase of the four-Object version than the Neo-PRh group (three sessions) or controls (one session) [F(2, 19) = 5.336, p = 0.016, η <sup>2</sup> = 0.386; Neo-PRh vs. Neo-H: t(17) = −2.026, p = 0.059, d = 1.025; Neo-PRh vs. Neo-C: t(17) = 1.083, p = 0.294, d = 0.537; Neo-H vs. Neo-C: t(17) = −3.249, p = 0.005, d = 2.114].
This impairment of temporal order memory for the inner items of a list by the Neo-H group was also apparent in Probe trials, where Neo-H monkeys had lower Inner:Outer ratios (0.68) than the Neo-PRh monkeys (0.84) or Controls (0.97) [F(2, 18) = 5.350, p = 0.017, η <sup>2</sup> = 0.401; Neo-PRh vs. Neo-H: t(16) = 1.870, p = 0.080, d = 1.038; Neo-PRh vs. Neo-C: t(16) = −1.324, p = 0.204, d = 0.757; Neo-H vs. Neo-C: t(16) = −3.265, p = 0.005, d = 1.806].

# DISCUSSION

This study investigated the effects of neonatal PRh-lesions on WM processes when animals reached adulthood. The results indicate that neonatal PRh-lesions slightly, but only transiently, impaired WM maintenance processes measured by the SU-DNMS task and impaired WM maintenance/monitoring processes measured by the OBJ-SO task. In contrast to both SU-DNMS and OBJ-SO tasks that generated high proactive interference, performance on the SOMT that was devoid of proactive interference was not altered by the neonatal PRh lesions. The results suggest that neonatal PRh lesions may impact the ability to resolve proactive interference and/or inhibit perseverative responding rather than affecting working memory processes per se. These findings will be discussed in turn.

#### Maintenance

Monkeys with Neo-PRh lesions initially learned SU-DNMS more slowly than controls. However, the mild impairment at the short delay was not evident with further training at the longer delay of 30 s. The same groups of animals were tested on several other memory tasks from infancy through adulthood, and their performance on these tasks can help us reject several interpretations of the transient impairment in the SU-DNMS task. For example, animals with neonatal PRh lesions did not differ from controls in learning a trial-unique delayed nonmatching task, indicating no significant impact of the Neo-PRh lesions on perceptual abilities, formation of object representation, learning reward contingencies, or motivation to perform a task (Weiss and Bachevalier, 2016). Furthermore, the impairment at the 5 s delay of the SU-DNMS could not be explained by an inability to maintain object representation across the short delay, given the normal performance at the longer delay of 30 s. However, one distinct feature of the SU-DNMS task that has not been addressed with prior memory tasks given to these groups of animals, but that could be relevant to their impairment in the SU-DNMS, is the increased interference encountered by the animals while responding to successive trials. Indeed, in contrast to all other memory tasks previously performed by the animals, SU-DNMS uses the same two stimuli on every trial of a daily session, generating increased proactive interference as the animals progress through the task. Thus, the learning impairment observed in animals with Neo-PRh lesions at the 5 s delay could be the result of difficulties learning to resolve or inhibit interference. Interestingly, the mild and transitory impairment of the Neo-PRh subjects during the SU-DNMS task is reminiscent of that reported earlier by Eacott and colleagues after rhinal (perirhinal and entorhinal) cortex lesions in adulthood (Eacott et al., 1994). In this latter study, adult monkeys with rhinal lesions were tested in a matching-to-sample task using four stimuli and showed transient impairment especially at the shortest delays used and not at the longer delays, and then performed normally when re-tested with only two stimuli. This similar pattern of transient deficits after the early-onset and late-onset lesions suggests very little recovery of SU-DNMS performance after the early-onset PRh lesions.

#### TABLE 3 | Performance on the SOMT task.

*Scores are the numbers of sessions to criterion for each of the object pairings in the 3-object and 4-object versions of the Serial Order Memory Task (SOMT). Probe ratios are correct choices for "inner" (2v3) problems over correct choices for "outer" (1v4) problems. Neo-C-2, Neo-C-8, and Neo-C-10 were not tested on the SOMT, and Neo-C-11 was not given the SOMT Probe trials. Data from Neo-C-1 thru Neo-C-6 previously reported in Heuer and Bachevalier (2013). Data from animals Neo-H-1 thru Neo-H-6 used for comparison in Section Comparisons with Neonatal Hippocampal Lesions and also reported in Heuer and Bachevalier (2013).*

A large body of work has already demonstrated that the hippocampus may be critical for reducing proactive interference (Shapiro and Olton, 1994; Butterly et al., 2012; but see Aggleton et al., 1986; Bachevalier et al., 2013). Given that the majority of sensory inputs reaching the hippocampus are relayed through the perirhinal cortex, the Neo-PRh lesions could have prevented the hippocampus from receiving this flow of information and yielded decreased resistance to interference. However, this explanation seems implausible given that direct damage to the hippocampus does not impair performance on the SU-DNMS (Heuer and Bachevalier, 2011). An alternative explanation may relate to the important interconnections of the perirhinal cortex with the ventrolateral PFC (vlPFC) and orbital frontal cortex (OFC; Lavenex et al., 2002; Petrides and Pandya, 2002). Both vlPFC and OFC lesions in adult monkeys yield deficits in rule-learning that were attributed to perseverative interference generated from competition between well-established responses (Butter, 1969; Passingham, 1975; Mishkin and Manning, 1978; Dias et al., 1996; Meunier et al., 1997; Baxter et al., 2008, 2009). Furthermore, like Neo-PRh monkeys, monkeys with vlPFC lesions require more trials than controls to acquire the DNMS rule and tend to make perseverative errors, but after learning the task, they perform normally on subsequent tests with longer delays (Kowalska et al., 1991). Monkeys with OFC lesions are similarly slow to acquire the DNMS rule, yet their deficit is not overcome with additional training (Meunier et al., 1997). Thus, the deficit in learning the SU-DNMS at the short delay may have resulted from a disconnection of the vlPFC from the PRh, preventing vlPFC from accessing object representations generated by PRh. Yet, the learning deficit in the SU-DNMS after the neonatal PRh lesions was only transitory, as was the learning deficit following vlPFC lesions.
This improvement in performance suggests that with further training, animals with such lesions can overcome or suppress their perseverative habits, presumably, by developing alternate strategies supported by other PFC areas, such as the OFC. A recent study investigating the effects of neonatal lesions to the vlPFC and OFC separately or in combination demonstrated that, in the absence of a functional vlPFC in infancy, the OFC can take over and support learning skills (Malkova et al., 2015).

#### Monitoring

In comparison to the transient impairment on the WM maintenance task, SU-DNMS, the same neonatal PRh lesions severely impacted acquisition of the OBJ-SO task in all but one of the Neo-PRh monkeys. Furthermore, the source of errors during OBJ-SO acquisition differed between the Neo-PRh and Neo-C groups. The Neo-PRh group made more primary errors than the controls, but the increase in primary errors from Trial 2 to Trial 3 was similar for both groups. In addition, although the Neo-PRh group also made more perseverative errors than controls, the increase in perseverative errors from Trial 2 to Trial 3 was greater in magnitude for animals with Neo-PRh lesions than for controls. This pattern of results indicates that monkeys with neonatal PRh lesions may be unable to monitor the order of self-generated responses. Alternatively, like the mild learning impairment reported above for the SU-DNMS task, the inability of animals with Neo-PRh lesions to solve the OBJ-SO task could also be due to an inability to suppress interference. The OBJ-SO task uses the same three stimuli from trial to trial, and across all daily sessions, resulting in high levels of interference. Thus, as reported above for the SU-DNMS, the severe impairment on the OBJ-SO task after Neo-PRh lesions could be due to an inability to monitor information in WM and/or to an inability to resolve interference.

To distinguish between these alternative interpretations, the animals were tested in the SOMT, a WM task that requires the ability to monitor the sequence of object presentations but uses novel objects in each trial. In the SOMT, use of trial-unique stimuli was intended to minimize the impact of interference, and so performance should depend only on the ability to monitor the temporal order of stimuli. Neo-PRh monkeys acquired the SOMT rules similarly to controls, requiring approximately the same number of sessions at each learning stage. During Probe trials, Neo-PRh and Neo-C monkeys made similar numbers of correct choices for temporal judgments between Object 1 and Object 4 as they did for temporal judgments between Object 2 and Object 3, resulting in roughly equivalent Inner:Outer Ratio scores. Thus, as measured with the SOMT, neonatal PRh lesions appear to spare the ability to monitor items in WM. Therefore, the severe impairments of the same monkeys in OBJ-SO are likely to be caused by impairment in cognitive processes other than WM. Indeed, the increase in perseverative errors found in animals with Neo-PRh lesions while performing WM tasks with high proactive interference may have instead been caused by a lack of impulse control and/or impaired behavioral flexibility.

#### Comparison with the Neonatal Hippocampal Lesions (Neo-H)

The pattern of deficits in the three working memory tasks after the Neo-PRh lesions contrasted with that reported after the Neo-H lesions (Heuer and Bachevalier, 2011, 2013). Unlike Neo-PRh lesions, Neo-H lesions did not impact the ability to maintain information in memory but resulted in severe impairment in both tasks measuring monitoring WM processes. Taken together, these data indicate that the perirhinal cortex and the hippocampus play different roles in supporting the development of WM processes; i.e., the hippocampus supports monitoring WM processes, whereas the perirhinal cortex supports the resolution of proactive interference.

#### Conclusions

The present results suggest that the perirhinal cortex may be particularly important to resolve interference. Yet, it is not clear whether the deficits resulted from direct damage to the PRh or from downstream effects of the neonatal PRh lesions on the normal maturation of other neural structures, especially those with protracted anatomical and functional development, such as the PFC (Fuster, 2002; Overman et al., 2004; Conklin et al., 2007; Kolb et al., 2010; Perlman et al., 2015). Developmental studies in rodents (Tseng et al., 2009) and monkeys (Bertolino et al., 1997; Chlan-Fourney et al., 2000; Meng et al., 2013) have already demonstrated significant morphological and neurochemical changes in the lateral PFC as a result of early damage to the MTL structures. Given that the lateral PFC is critical for performance on WM tasks, the WM deficits after the neonatal PRh lesions may have resulted from maldevelopment of the PFC following disruption of inputs it receives from the PRh rather than damage to PRh per se. Disentangling these alternative interpretations will require the replication of the current experiments in a group of monkeys that will have received the same PRh lesions in adulthood.

#### AUTHOR CONTRIBUTIONS

Data acquisition by AW and RN. Experimental design, data analysis, and manuscript preparation by AW, RN, and JB.

#### FUNDING

This work was supported by the National Institute of Mental Health (MH-58846 to JB and T32-HD071845 to AW), the National Science Foundation (NSF-GRFP DGE-1444932 to AW), and the National Center for Research Resources P51RR165, currently supported by the Office of Research Infrastructure Programs/OD P51OD11132.

#### ACKNOWLEDGMENTS

We thank the veterinary and animal husbandry staff at YNPRC for expert care and handling of the animals, the image core facility for their support during the MR imaging, and the members of the Bachevalier lab for their help with the surgical procedures.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnsys.2015.00179


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Weiss, Nadji and Bachevalier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Development of Attention Systems and Working Memory in Infancy

#### Greg D. Reynolds\* and Alexandra C. Romano

Developmental Cognitive Neuroscience Laboratory, Department of Psychology, University of Tennessee, Knoxville, TN, USA

In this article, we review research and theory on the development of attention and working memory in infancy using a developmental cognitive neuroscience framework. We begin with a review of studies examining the influence of attention on neural and behavioral correlates of an earlier developing and closely related form of memory (i.e., recognition memory). Findings from studies measuring attention utilizing looking measures, heart rate, and event-related potentials (ERPs) indicate significant developmental change in sustained and selective attention across the infancy period. For example, infants show gains in the magnitude of the attention related response and spend a greater proportion of time engaged in attention with increasing age (Richards and Turner, 2001). Throughout infancy, attention has a significant impact on infant performance on a variety of tasks tapping into recognition memory; however, this approach to examining the influence of infant attention on memory performance has yet to be utilized in research on working memory. In the second half of the article, we review research on working memory in infancy focusing on studies that provide insight into the developmental timing of significant gains in working memory as well as research and theory related to neural systems potentially involved in working memory in early development. We also examine issues related to measuring and distinguishing between working memory and recognition memory in infancy. To conclude, we discuss relations between the development of attention systems and working memory.

#### Edited by:

Zsuzsa Kaldy, University of Massachusetts Boston, USA

#### Reviewed by:

Gaia Scerif, University of Oxford, UK; John Spencer, University of East Anglia, UK

\*Correspondence: Greg D. Reynolds, greynolds@utk.edu

Received: 26 September 2015; Accepted: 08 February 2016; Published: 03 March 2016

#### Citation:

Reynolds GD and Romano AC (2016) The Development of Attention Systems and Working Memory in Infancy. Front. Syst. Neurosci. 10:15. doi: 10.3389/fnsys.2016.00015

Keywords: infancy, visual attention, recognition memory, working memory, event-related potentials, heart rate

# THE DEVELOPMENT OF ATTENTION SYSTEMS AND WORKING MEMORY IN INFANCY

What are the mechanisms that support the ability to retain information for a period of time before acting on it? When does this ability emerge in human development? What role does the development of attention play in this process? Answers to these questions are not only important for furthering our understanding of working memory, but are also fundamental to understanding cognitive development at a broader level. We delve into these questions from a developmental cognitive neuroscience perspective with a particular focus on the impact of the development of attention systems on recognition memory and working memory. In the sections that follow, we present a selective review of research in which psychophysiological and neuroscience techniques have been combined with behavioral tasks to provide insight into the effects of infant attention on performance on recognition memory tasks. We begin our review with a focus on infant attention and recognition memory because the combined measures used in this line of work provide unique insight into the influence of sustained attention on memory. To date, this approach has yet to be utilized to examine relations between attention and working memory in early development. In the second half of the article, we review research on working memory in infancy with a focus on studies utilizing behavioral and neuroscience measures (for more exhaustive reviews, see Cowan, 1995; Nelson, 1995; Pelphrey and Reznick, 2003; Rose et al., 2004; Bauer, 2009; Rovee-Collier and Cuevas, 2009). We also focus on recent research findings that shed light on neural systems potentially involved in attention and working memory in infancy (for excellent reviews on attention and working memory relations in childhood, see Astle and Scerif, 2011; Amso and Scerif, 2015). 
Because the human infant is incapable of producing verbal or complex behavioral responses and also cannot be given instructions on how to perform a given task, by necessity, many of the existing behavioral studies on infant working memory have been built upon look duration or preferential looking tasks traditionally used to tap into infant visual attention and recognition memory. Thus, it is difficult to draw distinct lines when determining the relative contribution of these cognitive processes to performance on these tasks in the infancy period (but see Perone and Spencer, 2013a,b). We conclude with a section examining potential relations between attention and working memory and propose that the development of attention systems plays a key role in the timing of significant gains in working memory observed in the second half of the first postnatal year.

### INFANT VISUAL ATTENTION AND RECOGNITION MEMORY

Much of what we know about the early development of visual attention comes from a large body of research on recognition memory in infancy. Because the defining feature of recognition memory is differential responsiveness to novel stimuli in comparison to familiar (or previously viewed) stimuli (Rose et al., 2004), the majority of behavioral research in the area has utilized the visual paired comparison (VPC) task. This task involves the simultaneous presentation of two visual stimuli. Look duration to each stimulus during the paired comparison is measured. Under the framework of Sokolov's (1963) comparator model, longer looking to a novel stimulus in comparison to a familiar stimulus (i.e., a novelty preference) is indicative of recognition of a fully encoded familiar stimulus. In contrast, familiarity preferences are indicative of incomplete processing and continued encoding of the familiar stimulus. The underlying assumption is that infants will continue to look at a stimulus until it is fully encoded, at which point attention will be shifted toward novel information in the surrounding environment.
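To make the logic of the paired-comparison measure concrete, the sketch below scores a single VPC test as the proportion of total looking time directed to the novel stimulus. This proportion score is a common convention in the infant literature rather than something specified in the text above, and the function names, the 0.5 chance level, and the looking times are all hypothetical. In practice, preferences are evaluated statistically at the group level rather than by a simple cutoff on one infant's score.

```python
def novelty_preference(novel_look_s: float, familiar_look_s: float) -> float:
    """Proportion of total looking time spent on the novel stimulus."""
    total = novel_look_s + familiar_look_s
    if total <= 0:
        raise ValueError("total looking time must be positive")
    return novel_look_s / total


def classify_preference(score: float, chance: float = 0.5) -> str:
    """Descriptive label only: above chance = novelty, below = familiarity."""
    if score > chance:
        return "novelty"
    if score < chance:
        return "familiarity"
    return "none"


# Hypothetical looking times in seconds for one paired-comparison test.
score = novelty_preference(novel_look_s=6.0, familiar_look_s=4.0)
print(score, classify_preference(score))
```

Under the comparator-model reading described above, a score above chance would be taken as evidence that the familiar stimulus was fully encoded, and a score below chance as evidence of continued encoding.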

Thus, infant look duration has been a widely used and highly informative behavioral measure of infant attention that also provides insight into memory in early development. Findings from these studies indicate that older infants require less familiarization time to demonstrate novelty preferences than younger infants; and within age groups, increasing the amount of familiarization results in a shift from familiarity preferences to novelty preferences (Rose et al., 1982; Hunter and Ames, 1988; Freeseman et al., 1993). Older infants also show evidence of recognition with longer delays between familiarization and testing. For example, Diamond (1990) found that 4-month-olds demonstrate recognition with up to 10 s delays between familiarization and testing, 6-month-olds demonstrate recognition with up to 1 min delays, and 9-month-olds demonstrate recognition with up to 10 min delays. These findings indicate that with increasing age, infants are able to process visual stimuli more efficiently and subsequently recognize those stimuli after longer delays. Unfortunately for infancy researchers, look duration and attention are not isomorphic. For example, it is not uncommon for infants to continue looking at a stimulus when they are no longer actively paying attention; therefore, looking measures alone do not provide a particularly accurate measure of infant attention. This phenomenon is most prevalent in early infancy and has been referred to as attention capture, obligatory attention, and sticky-fixation (Hood, 1995; Ruff and Rothbart, 1996).

Richards and colleagues (Richards, 1985, 1997; Richards and Casey, 1992; Courage et al., 2006; for review, Reynolds and Richards, 2008) have utilized the electrocardiogram to identify changes in heart rate that coincide with different phases of infant attention. During the course of a single look, infants will cycle through four phases of attention—stimulus orienting, sustained attention, pre-attention termination, and attention termination. The most relevant of these phases are sustained attention and attention termination. Sustained attention is manifested as a significant and sustained decrease in heart rate from prestimulus levels that occurs when infants are actively engaged in an attentive state. Attention termination follows sustained attention and is manifested as a return of heart rate to prestimulus levels. Although the infant is still looking at the stimulus during attention termination, she/he is no longer engaged in an attentive state. Infants require significantly less time to process a visual stimulus if heart rate is measured online and initial exposure is given during sustained attention (Richards, 1997; Frick and Richards, 2001). In stark contrast, infants given initial exposure to a stimulus during attention termination do not demonstrate evidence of recognition of the stimulus in subsequent testing (Richards, 1997).
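The heart-rate logic described above can be sketched as a simple beat-by-beat labeling routine. This is a simplified illustration, not the scoring procedure used by Richards and colleagues: the function name, the use of the prestimulus median as baseline, and the `run_length` criterion are all assumptions, and the sketch collapses the four phases into three by omitting pre-attention termination.

```python
from statistics import median


def label_attention_phases(prestim_hr, stim_hr, run_length=5):
    """Label each post-onset beat as 'orienting', 'sustained', or 'termination'.

    Assumptions (hypothetical criteria):
    - baseline is the median prestimulus heart rate;
    - sustained attention begins once `run_length` consecutive beats fall
      below baseline;
    - attention termination begins once `run_length` consecutive beats are
      back at or above baseline.
    The phase label changes on the beat that completes the run.
    """
    baseline = median(prestim_hr)
    phase = "orienting"
    below = above = 0
    labels = []
    for hr in stim_hr:
        if hr < baseline:
            below, above = below + 1, 0
        else:
            above, below = above + 1, 0
        if phase == "orienting" and below >= run_length:
            phase = "sustained"
        elif phase == "sustained" and above >= run_length:
            phase = "termination"
        labels.append(phase)
    return labels


# Hypothetical heart rates in beats per minute.
prestim = [140, 142, 141]
during_look = [141, 138, 137, 136, 142, 143, 144]
print(label_attention_phases(prestim, during_look, run_length=2))
```

A labeling of this kind is what would allow stimulus exposure to be timed to a particular phase, as in the studies that presented stimuli during sustained attention versus attention termination.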

# THE GENERAL AROUSAL/ATTENTION SYSTEM

Richards (2008, 2010) has proposed that sustained attention is a component of a general arousal system involved in attention. Areas of the brain involved in this general arousal/attention system include the reticular activating system and other brainstem areas, thalamus, and cardio-inhibitory centers in frontal cortex (Reynolds et al., 2013). Cholinergic inputs to cortical areas originating in the basal forebrain are also involved in this system (Sarter et al., 2001). Activation of this system triggers cascading effects on the overall state of the organism which foster an optimal range of arousal for attention and learning. These effects include decreased heart rate (i.e., sustained attention), motor quieting, and release of acetylcholine (ACh) via corticopetal projections. Ruff and Rothbart's (1996) and Ruff and Capozzoli's (2003) description of "focused attention" in children engaged in toy play as being characterized by motor quieting, decreased distractibility, and intense concentration coupled with manipulation/exploration would be considered a behavioral manifestation of this general arousal/attention system.

The general arousal/attention system is functional in early infancy but shows considerable development across infancy and early childhood with increased magnitude of the HR response, increased periods of sustained attention, and decreased distractibility occurring with increasing age (Richards and Cronise, 2000; Richards and Turner, 2001; Reynolds and Richards, 2008). These developmental changes most likely have a direct influence on performance on working memory tasks. The general arousal/attention system is non-specific in that it functions to modulate arousal regardless of the specific task or function the organism is engaged in. The effects of the system on arousal and attention are also general and do not vary in a qualitative manner depending on cognitive task; thus, sustained attention would be expected to influence recognition memory and working memory in a similar manner. This non-specific attention system directly influences functioning of three specific visual attention systems that also show considerable development in the infancy period. These specific attention systems are: the reflexive system, the posterior orienting system, and the anterior attention system (Schiller, 1985; Posner and Peterson, 1990; Johnson et al., 1991; Colombo, 2001).

#### THE DEVELOPMENT OF ATTENTION SYSTEMS IN THE BRAIN

At birth, newborn visual fixation is believed to be primarily involuntary, exogenously driven, and exclusively under the control of a reflexive system (Schiller, 1985). This reflexive system includes the superior colliculus, the lateral geniculate nucleus of the thalamus, and the primary visual cortex. Many newborn fixations are reflexively driven by direct pathways from the retina to the superior colliculus (Johnson et al., 1991). Infant looking is attracted by basic but salient stimulus features processed via the magnocellular pathway that can generally be discriminated in the peripheral visual field, such as high-contrast borders, motion, and size.

Looking and visual fixation stay primarily reflexive for the first 2 months, until the end of the newborn period, when the posterior orienting system reaches functional onset. The posterior orienting system is involved in the voluntary control of eye movements, and shows considerable development from 3 to 6 months of age. Areas of the brain involved in the posterior orienting system include: posterior parietal areas, pulvinar, and frontal eye-fields (Posner and Peterson, 1990; Johnson et al., 1991). The posterior parietal areas are believed to be involved in disengaging fixation and the frontal eye-fields are key for initiating voluntary saccades. In support of the view that the ability to voluntarily disengage and shift fixation shows significant development across this age range, **Figure 1** shows results from a look duration study by Courage et al. (2006) in which infant look duration dropped significantly to a wide range of stimuli from 3 to 6 months of age (i.e., 14–26 weeks of age).

At around 6 months of age, the anterior attention system reaches functional onset and infants begin the drawn-out process of developing inhibitory control and higher order attentional control (i.e., executive attention). Not only do infants have better voluntary control over their visual fixations, they can now inhibit attention to distractors and maintain attention for more prolonged periods when it is called for. As can be seen in **Figure 1**, Courage et al. (2006) found that from 6 to 12 months of age (i.e., 26–52 weeks), infants continue to show brief looks to basic, geometric patterns but begin to show longer looking toward more complex and engaging stimuli such as Sesame Street or human faces. This indicates the emergence of some rudimentary level of attentional control at around 6 months of age. Given that several models emphasize some aspect of attentional control as a core component of working memory (e.g., Baddeley, 1996; Kane and Engle, 2002; Klingberg et al., 2002; Cowan and Morey, 2006; Astle and Scerif, 2011; Amso and Scerif, 2015), it stands to reason that the emergence of attentional control at around 6 months of age would contribute significantly to the development of working memory.

The theoretical models for the attention systems discussed above are largely based on findings from comparative research with monkeys, adult neuroimaging studies, or symptomology of clinical patients with lesions to certain areas of the brain.

**Figure 1.** Infant look duration across age for different stimulus types (Courage et al., 2006). Arrows indicate exact test age.

Unfortunately, developmental cognitive neuroscientists are highly limited in non-invasive neuroimaging tools available for use in basic science with infant participants. However, we have conducted multiple studies utilizing event-related potentials (ERPs) along with heart rate measures of attention and behavioral measures of recognition memory (Reynolds and Richards, 2005; Reynolds et al., 2010). Findings from these studies provide insight into potential areas of the brain involved in attention and recognition memory in infancy.

The ERP component which is most clearly related to infant visual attention is the Negative central (Nc) component. The Nc is a high amplitude, negatively-polarized component that occurs from 400 to 800 ms post stimulus onset at frontal and midline leads (see **Figure 2**). Nc has been found to be greater in amplitude to: oddball compared to standard stimuli (Courchesne et al., 1981), novel compared to familiar stimuli (Reynolds and Richards, 2005), mother's face compared to a stranger's face (de Haan and Nelson, 1997), and a favorite toy compared to a novel toy (de Haan and Nelson, 1999). These findings indicate that regardless of novelty or familiarity, Nc is greater in amplitude to the stimulus that grabs the infant's attention the most (Reynolds et al., 2010). Additionally, Nc is greater in amplitude when infants are engaged in sustained attention (as measured by heart rate) than when infants have reached attention termination (Richards, 2003; Reynolds et al., 2010; Guy et al., in press). The Nc is also ubiquitous in ERP research utilizing visual stimuli with infant participants. Taken together, these findings indicate that Nc reflects amount of attentional engagement.

In order to determine the cortical sources of the Nc component, Reynolds and Richards (2005) and Reynolds et al. (2010) conducted cortical source analysis on scalp-recorded ERP. Cortical source analysis involves computing a forward solution for a set of dipoles, and comparing the simulated topographical plots produced by the forward solution to the topographical plots obtained from observed data. The forward solution is iterated until the best fitting solution is found. The results of the cortical source analysis can then be mapped onto structural MRIs. **Figure 3** shows the results of our source analysis of the Nc component measured during brief stimulus ERP presentations and also during performance of the VPC task. As can be seen in **Figure 3**, the cortical sources of the Nc were localized to areas of prefrontal cortex (PFC) for all age groups including 4.5-month-olds. Areas which were common dipole sources included inferior and superior PFC, and the anterior cingulate. The distribution of the dipoles also became more localized with increasing age. These findings support the proposal that PFC is associated with infant attention, and indicate that there is overlap in brain areas involved in both recognition memory and working memory tasks. Neuroimaging research with older children and adults indicates that there is a neural circuit including parietal areas and PFC involved in working memory (e.g., Goldman-Rakic, 1995; Fuster, 1997; Kane and Engle, 2002; Klingberg et al., 2002; Crone et al., 2006).
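The logic of fitting a forward solution to an observed topography can be sketched with a toy example. The code below assumes a single dipole in an infinite homogeneous conductor, noise-free simulated data, and a coarse grid of candidate locations; all names, electrode positions, and values are hypothetical, and a real analysis would use a realistic head model and an iterative optimizer rather than a grid scan.

```python
import math


def lead_field(source, electrodes):
    """Potential at each electrode per unit dipole moment along x, y, z.

    Assumed forward model (constants dropped):
    V = m . (r_e - r_s) / |r_e - r_s|**3
    """
    rows = []
    for e in electrodes:
        d = [e[i] - source[i] for i in range(3)]
        dist3 = math.sqrt(sum(c * c for c in d)) ** 3
        rows.append([c / dist3 for c in d])
    return rows


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_moment(rows, observed):
    """Least-squares dipole moment at a fixed location (normal equations)."""
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * v for r, v in zip(rows, observed)) for i in range(3)]
    d = det3(a)
    # Cramer's rule: replace column k of `a` with `b`.
    return [det3([[b[r] if c == k else a[r][c] for c in range(3)]
                  for r in range(3)]) / d for k in range(3)]


def residual_variance(source, observed, electrodes):
    """Share of observed variance left unexplained by the best dipole here."""
    rows = lead_field(source, electrodes)
    m = fit_moment(rows, observed)
    predicted = [sum(r[i] * m[i] for i in range(3)) for r in rows]
    err = sum((v - p) ** 2 for v, p in zip(observed, predicted))
    return err / sum(v * v for v in observed)


# Hypothetical electrode positions on the upper half of a unit sphere.
polar = [math.radians(a) for a in (20, 50, 80)]
azimuth = [math.radians(a) for a in range(0, 360, 45)]
electrodes = [(math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))
              for t in polar for p in azimuth]

# Simulated "observed" topography from a known dipole.
true_source = (0.2 * 1, 0.2 * 2, 0.2 * 2)
true_moment = (0.0, 0.5, 1.0)
observed = [sum(r[i] * true_moment[i] for i in range(3))
            for r in lead_field(true_source, electrodes)]

# Scan candidate locations and keep the one with the smallest residual.
candidates = [(0.2 * i, 0.2 * j, 0.2 * k)
              for i in range(-2, 3) for j in range(-2, 3) for k in range(0, 4)]
best_source = min(candidates,
                  key=lambda s: residual_variance(s, observed, electrodes))
```

Because the dipole moment enters the forward model linearly, it can be solved in closed form at each candidate location, so only the location has to be searched.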

The late slow wave (LSW) ERP component is associated with recognition memory in infancy. The LSW shows a reduction in amplitude with repeated presentations of a single stimulus (de Haan and Nelson, 1997, 1999; Reynolds and Richards, 2005; Snyder, 2010; Reynolds et al., 2011). As shown in the two lower ERP waveforms in **Figure 2**, the LSW occurs from about 1–2 s post stimulus onset at frontal, temporal, and parietal electrodes. By examining the LSW, Guy et al. (2013) found that individual differences in infant visual attention are associated with utilization of different processing strategies when encoding a new stimulus. Infants who tend to demonstrate brief but broadly distributed fixations (referred to as short lookers; e.g., Colombo and Mitchell, 1990) during exposure to a novel stimulus subsequently showed evidence of discriminating hierarchical patterns based on changes in the overall configuration of individual elements (or local features). In contrast, infants who tend to demonstrate longer and more narrowly distributed visual fixations (referred to as long lookers) showed evidence of discriminating patterns based on changes in local features but not based on changes in the overall configuration of local features. Furthermore, research utilizing heart rate measures of attention during performance on a recognition memory ERP task has provided informative findings regarding relations between attention and memory. Infants are more likely to demonstrate differential responding to familiar and novel stimuli in the LSW when heart rate indicates they are engaged in sustained attention (Richards, 2003; Reynolds and Richards, 2005).

No studies to date have utilized cortical source analysis to examine cortical sources of the LSW. Late-latency and long duration ERP components can be more problematic for cortical source analysis due to greater variability in the latency of the component across participants and trials, and the likely contribution of multiple cortical sources to the ERP component observed in the scalp-recorded EEG. However, research with non-human primates and neuroimaging studies with older children and adults indicate the role of a medial temporal lobe circuit in recognition memory processes. Cortical areas involved in this circuit include the hippocampus and parahippocampal cortex; entorhinal and perirhinal cortices; and the visual area TE (Bachevalier et al., 1993; Begleiter et al., 1993; Fahy et al., 1993; Li et al., 1993; Zhu et al., 1995; Desimone, 1996; Wiggs and Martin, 1998; Xiang and Brown, 1998; Wan et al., 1999; Brown and Aggleton, 2001; Eichenbaum et al., 2007; Zeamer et al., 2010; Reynolds, 2015). Regardless of the potential areas involved in recognition memory in infancy, attention is clearly an integral component of successful performance on recognition memory tasks. Performance on recognition memory tasks is influenced by the development of each of the attention systems described above and it stands to reason that these attention systems would influence performance on working memory tasks in a similar manner. Furthermore, working memory and recognition memory are closely related and some of the tasks used to measure maintenance of items in working memory (i.e., visual short term memory, VSTM) in infancy are slightly modified recognition memory tasks. Thus, distinctions between working memory and recognition memory can be particularly difficult to make during the infancy period.

# THE DEVELOPMENT OF WORKING MEMORY IN INFANCY

Similar to work on attention and recognition memory, research on the early development of working memory has focused on the use of behavioral measures (looking and reaching tasks) with infant participants. Neuroscience models of early working memory development have also largely relied on findings from comparative research, clinical cases, and neuroimaging with older children and adults. However, there is a rich and growing tradition of cognitive neuroscience models and research on working memory development. In the sections that follow, we focus specifically on developmental cognitive neuroscience research on working memory in infancy (for more exhaustive reviews on memory development, see Cowan, 1995; Nelson, 1995; Pelphrey and Reznick, 2003; Courage and Howe, 2004; Rose et al., 2004; Bauer, 2009; Rovee-Collier and Cuevas, 2009).

Much of the research on working memory in infancy has focused on tasks similar to the Piagetian A-not-B task, and generally all tasks involve some delayed response (DR) with the correct response requiring some level of attentional control. The A-not-B and other DR tasks typically involve the presentation of two or more wells. While the participant watches, an attractive object is placed in one of the wells and the participant's view of the object is then occluded. Following a brief delay, the participant is allowed to retrieve the object from one of the wells. In the A-not-B task, after multiple successful retrieval trials, the location of the hidden object is reversed (again while the participant observes). The classic A-not-B error occurs when the participant continues to reach for the object in the original hiding location after observing the reversal of the hiding location.

Diamond (1985, 1990) has attributed perseverative reaching on the A-not-B task to a lack of inhibitory control in younger participants and attributes higher success rates in older infants (8–9 months) to further maturation of dorsolateral prefrontal cortex (DLPFC). It has been noted (Diamond, 1990; Hofstadter and Reznick, 1996; Stedron et al., 2005) that participants occasionally look to the correct location after reversal but continue to reach to the incorrect (previously rewarded) location. Hofstadter and Reznick (1996) found that when gaze and reach differ in direction, infants are more likely to direct their gaze to the correct location. Thus, poor performance in the A-not-B reaching task may be influenced by immature inhibitory control of reaching behavior as opposed to a working memory deficiency. Alternatively, Smith et al. (1999) conducted a systematic series of experiments using the A-not-B task and found that several factors other than inhibition contribute to perseverative reaching, including infant posture, direction of gaze, preceding activity, and long-term experiences in similar tasks. However, using an oculomotor version of the DR task, Gilmore and Johnson (1995) found that infants as young as 6 months of age were able to demonstrate successful performance. Similarly, using a peek-a-boo looking version of the DR task, Reznick et al. (2004) found evidence of a developmental transition at around 6 months of age associated with improved working memory performance.

In several studies utilizing looking versions of the DR task, significant development has been found to occur from 5 to 12 months of age. With increasing age, infants show higher rates of correct responses, and infants can tolerate longer delays and still demonstrate successful responses (Hofstadter and Reznick, 1996; Pelphrey et al., 2004; Cuevas and Bell, 2010). Bell and colleagues (e.g., Bell and Adams, 1999; Bell, 2001, 2002, 2012; Bell and Wolfe, 2007; Cuevas and Bell, 2011) have integrated EEG measures in looking versions of the A-not-B task in a systematic line of work on the development of working memory. Bell and Fox (1994) found developmental change in baseline frontal EEG power was associated with performance improvement on the A-not-B task. Power changes from baseline to task in the 6–9 Hz EEG frequency band also correlate with successful performance for 8-month-old infants (Bell, 2002). Additionally, higher levels of frontal-parietal and frontal-occipital EEG coherence as well as decreased heart rate from baseline to task are all associated with better performance on the looking version of the A-not-B task (Bell, 2012).

Taken together, these findings provide support for the role of a frontal-parietal network in working memory tasks in infancy which is consistent with findings from neuroimaging studies with older children and adults showing recruitment of DLPFC, ventrolateral prefrontal cortex (VLPFC), intraparietal cortex, and posterior parietal cortex (Sweeney et al., 1996; Fuster, 1997; Courtney et al., 1997; D'Esposito et al., 1999; Klingberg et al., 2002; Crone et al., 2006; Scherf et al., 2006). For example, Crone et al. (2006) utilized fMRI during an object working memory task with children and adults and found that VLPFC was involved in maintenance processes for children and adults, and DLPFC was involved in manipulation of items in working memory for adults and children older than 12. The youngest group of children tested (8–12 years of age) did not recruit DLPFC during item manipulation, and did not perform as well as adolescents and adults on the task.

The change-detection task is used to examine capacity limits for number of items an individual can maintain in VSTM, and the analogous change-preference task is used to measure capacity limits with infant participants. Similar to the VPC task, the change-preference task capitalizes on infants' tendency to prefer novel or familiar stimuli. Two sets of stimuli are briefly and repeatedly presented to the left and right of midline with items in one set of stimuli changing across each presentation and items in the other set remaining constant. Infant looking to the left and right stimulus set is measured and greater looking to the changing set side is utilized as an index of working memory. Set size is manipulated to determine capacity limits for participants of different ages. Ross-Sheehy et al. (2003) found a capacity increase from 1 to 3 items across 6.5–12.5 months of age. The authors proposed that the increase in capacity limits on this task across this age range is driven in part by development of the ability to bind color to location. In a subsequent study, the authors (Ross-Sheehy et al., 2011) found that providing infants with an attentional cue facilitated memory for items in a stimulus set. Ten-month-olds demonstrated enhanced performance when provided with a spatial cue and 5-month-olds demonstrated enhanced performance when provided with a motion cue. These findings demonstrate that spatial orienting and selective attention influence infant performance on a VSTM task, and support the possibility that further development of the posterior orienting system influences maintenance processes involved in working memory in infancy.
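The looking measure described above can be expressed as a simple preference score. The sketch below is illustrative only: the function names, the comparison against a chance level of 0.5 with a one-sample t statistic, and the looking times are all assumptions, not the analysis reported in the cited studies.

```python
import math
from statistics import mean, stdev


def change_preference(look_changing_s, look_constant_s):
    """Proportion of total looking time directed to the changing stream."""
    total = look_changing_s + look_constant_s
    if total <= 0:
        raise ValueError("no looking time recorded")
    return look_changing_s / total


def t_against_chance(scores, chance=0.5):
    """One-sample t statistic for a group of preference scores vs. chance."""
    return (mean(scores) - chance) / (stdev(scores) / math.sqrt(len(scores)))


# Invented looking times in seconds: (changing side, constant side).
set_size_1 = [(6.0, 4.0), (6.2, 3.8), (5.8, 4.2), (6.1, 3.9)]
scores = [change_preference(c, k) for c, k in set_size_1]
t_value = t_against_chance(scores)
```

Under this scheme, a set size would count as within capacity when the group's scores reliably exceed 0.5, and capacity would be read off as the largest such set size.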

Spencer and colleagues (e.g., Spencer et al., 2007; Simmering and Spencer, 2008; Simmering et al., 2008; Perone et al., 2011; Simmering, 2012) have utilized dynamic neural field (DNF) models to explain developmental changes in the change-preference task. Using the DNF model, Perone et al. (2011) did simulation tests of the spatial precision hypothesis (SPH), predicting that the increased working memory capacity limits found to develop during infancy are based on the strengthening of excitatory and inhibitory projections between a working memory field, perceptual field, and an inhibitory layer. According to the DNF model, the perceptual field consists of a population of neurons with receptive fields for certain feature dimensions (e.g., color, shape), and activation in the working memory layer leads to inhibition of similarly tuned neurons in the perceptual field. The results of their simulation experiments were very similar to past behavioral findings and provided support for the SPH in explaining the increases in capacity limits that have been found to occur with increasing age in infancy.
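The core intuition behind a neural field account, that stronger excitatory and inhibitory interactions let activation outlast the stimulus, can be sketched with a toy simulation. The code below is a one-layer Amari-style field, not the multi-layer architecture of Perone et al. (2011); the function name, the `coupling` parameter, and every numerical value are arbitrary assumptions chosen for illustration.

```python
import math


def simulate_field(coupling, n=61, steps_on=60, steps_off=150):
    """Return field activation after a stimulus and a stimulus-free delay.

    Dynamics (Euler steps): tau * du/dt = -u + h + stimulus + sum_j w(i-j) f(u_j)
    with local excitation, global inhibition, and a sigmoid output f.
    `coupling` scales both the excitatory and inhibitory interaction strength.
    """
    h, dt_over_tau, beta = -3.0, 0.1, 4.0
    c_exc, c_inh, sigma = 3.0 * coupling, 0.5 * coupling, 3.0
    kernel = [c_exc * math.exp(-(d * d) / (2 * sigma ** 2)) - c_inh
              for d in range(n)]
    center = n // 2
    stimulus = [6.0 * math.exp(-((i - center) ** 2) / (2 * 3.0 ** 2))
                for i in range(n)]
    u = [h] * n
    for step in range(steps_on + steps_off):
        f = [1.0 / (1.0 + math.exp(-beta * x)) for x in u]
        inp = stimulus if step < steps_on else [0.0] * n
        new_u = []
        for i in range(n):
            interaction = sum(kernel[abs(i - j)] * f[j] for j in range(n))
            new_u.append(u[i] + dt_over_tau * (-u[i] + h + inp[i] + interaction))
        u = new_u
    return u


weak = simulate_field(coupling=0.15)    # weak interactions
strong = simulate_field(coupling=1.0)   # strengthened interactions
peak_survives_weak = max(weak) > 0
peak_survives_strong = max(strong) > 0
```

With these arbitrary settings, the field with weak interactions relaxes back to its resting level once the stimulus is removed, whereas the field with strong interactions keeps a localized peak above threshold through the delay, which is the sense in which the field maintains an item.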

Findings from studies utilizing the change-preference task provide insight into capacity limits in VSTM during infancy. However, this task simply requires identification of novel items or objects based on maintenance of a memory representation over very brief delays (i.e., less than 500 ms). Given that delays between familiarization and testing on infant recognition memory tasks are typically very brief and the length of the delay is often not specified, it is particularly difficult to determine whether or not recognition memory performance is based on short-term memory or long-term memory. Recall that 4-month-olds only demonstrate recognition with up to 10 s delays (Diamond, 1990). Thus, it is also difficult to determine whether or not performance on the change-preference task taps into maintenance of items in working memory or simply measures recognition memory. Alternatively, one could argue that performance on recognition memory tasks with brief delays may be driven by working memory. Interestingly, Perone and Spencer (2013a,b) again utilized the DNF model to simulate infant performance on recognition memory tasks. The results of the simulations indicated that increasing the efficiency of excitatory and inhibitory interactions between the perceptual field and a working memory field in their model led to novelty preferences on VPC trials with less exposure to the familiar stimulus. These simulated results are similar to the developmental trends found to occur with increasing age across infancy in empirical studies utilizing the VPC task (e.g., Rose et al., 1982; Hunter and Ames, 1988; Freeseman et al., 1993). The authors concluded that development of working memory is a significant factor in the increased likelihood that older infants will demonstrate novelty preferences on recognition memory tasks when compared to younger infants.

In order to investigate working memory in infancy, Káldy and Leslie (2003, 2005) conducted a series of experiments with infants that involved both identification and individuation for successful performance. Individuation involves item or object identification combined with entering the identified information into existing memory representations. Infants were familiarized with two objects of different shapes presented repeatedly in the middle of a stage. The side position of the objects was alternated across presentations in order to require infants to integrate object shape with location on a trial by trial basis. During the test phase, the objects were presented in the center of the stage as in familiarization and then placed behind occluders on the same side of the stage. After a delay, the occluders were removed. On change trials, removal of the occluders revealed that the different shaped objects were reversed in location. On no-change control trials, the objects remained in the same location upon removal of the occluders. Longer looking on change trials indicated individuation of the object based on identifying the change in object shape from the location it was in prior to occlusion. Results indicated that while 9-month-olds could identify changes in object location for both objects (Káldy and Leslie, 2003), 6-month-olds were only able to bind object to location for the last object that was moved behind the occluder in the test phase (Káldy and Leslie, 2005). The authors concluded that the younger infants' memory maintenance was more susceptible to distraction of attention. Káldy and Leslie (2005) also proposed that the significant improvements on this task between 6 and 9 months of age are related to further development of medial temporal lobe structures (i.e., entorhinal cortex, parahippocampal cortex) which allows older infants to continue to hold objects in working memory in the presence of distractors.

Thus, Káldy and Leslie (2003, 2005) and Káldy and Sigala (2004) have proposed an alternative model of working memory development which emphasizes the importance of medial temporal lobe structures more so than PFC. They argue that the majority of working memory models emphasizing the importance of DLPFC for working memory are confounding the response inhibition required in typical working memory tasks (e.g., the A-not-B task) with true working memory processes. To further address this limitation, Káldy and colleagues (Káldy et al., 2015) designed a delayed match retrieval task which involves location-object binding but requires less response inhibition than the classic version of the A-not-B task. Infants are shown two cards, each with pictures of different objects or patterns on them. The cards are turned over and then a third card is placed face up which matches one of the face down cards. Infants are rewarded with an attractive stimulus for looks toward the location of the matching face down card. The authors tested 8- and 10-month-olds on this task and found the 10-month-olds performed significantly above chance levels. Eight-month-olds performed at chance levels but showed improvement across trials. Thus, similar to previous work, significant gains in working memory performance are found to occur in the second half of the first postnatal year on the delayed match retrieval task.

Regarding Káldy and Sigala's (2004) view that too much emphasis has been placed on the importance of PFC for infant working memory, results from the DNF simulations done by Perone et al. (2011) also support the possibility that areas involved in visual processing and object recognition could account for successful working memory performance on the change-preference task without requiring significant PFC contributions to attentional control. However, in a recent exploratory study utilizing functional near infrared spectroscopy (fNIRS) to measure the BOLD response of infant participants during an object-permanence task, Baird et al. (2002) observed activation of frontal areas during the task. Receptors were only applied to frontal sites, however, which limits the conclusion that the increased frontal activity during this task was unique or of particular functional significance in comparison to other brain regions. In a later study, Buss et al. (2014) utilized fNIRS to image cortical activity associated with visual working memory capacity in 3- and 4-year-old children. In this study, receptors were applied over frontal and parietal locations. Frontal and parietal channels in the left hemisphere showed increased activation when working memory load was increased from 1 to 3 items. Results supported the possibility that young children utilize a frontal-parietal working memory circuit similar to adults. Both of these findings from fNIRS studies provide preliminary support for the role of PFC in working memory during early development.

Luciana and Nelson (1998) emphasize the critical role the PFC plays in integrating sensorimotor traces in working memory to guide future behavior. According to Luciana and Nelson, the A-not-B task may actually overestimate the functional maturity of the PFC in infant participants because it does not require the accurate integration of sensorimotor traces in working memory. They propose the integration of sensorimotor traces should be considered a core process in working memory definitions. The majority of working memory definitions include executive control components, and persistent activity in DLPFC has been linked with control functions involved in the manipulation of information for the purpose of goal-directed action (e.g., Curtis and D'Esposito, 2003; Crone et al., 2006). Thus, the exact contribution of PFC to working memory functions in early development remains unclear. What is clear from the extant literature is that infants beyond 5–6 months of age are capable of demonstrating basic yet immature aspects of working memory, and significant improvement in these basic functions occurs from 5–6 months onward (e.g., Diamond, 1990; Gilmore and Johnson, 1995; Hofstadter and Reznick, 1996; Káldy and Leslie, 2003, 2005; Káldy and Sigala, 2004; Pelphrey et al., 2004; Reznick et al., 2004; Cuevas and Bell, 2010).

# THE DEVELOPMENT OF ATTENTION SYSTEMS AND WORKING MEMORY

Similar to recognition memory, the improvements in working memory performance which occur after 5–6 months of age are likely influenced by further development of the attention systems previously discussed. The majority of the working memory studies discussed above examined visuospatial working memory. Performance on all of these working memory tasks involves voluntary eye movements and controlled scanning of the stimuli involved in the task. Thus, functional maturity of the posterior orienting system would be key for successful performance on these tasks. This system shows significant development from 3 to 6 months of age (Johnson et al., 1991; Colombo, 2001; Courage et al., 2006; Reynolds et al., 2013). This timing coincides with the time frame at which infants begin to demonstrate above chance performance on working memory tasks. For example, Gilmore and Johnson (1995) reported successful performance on an oculomotor DR task for 6-month-old infants, and Reznick et al. (2004) describe 6 months of age as a time of transition for performance on a peek-a-boo version of the DR task.

Successful performance on working memory tasks involves more than just voluntary control of eye movements. Working memory tasks also involve attentional control and inhibition. These cognitive functions are both associated with the anterior attention system (Posner and Peterson, 1990), which shows significant and protracted development from 6 months on. Several studies have shown significant improvement on DR and change-preference tasks from 5 to 12 months of age (Hofstadter and Reznick, 1996; Ross-Sheehy et al., 2003; Pelphrey et al., 2004; Cuevas and Bell, 2010), an age range that overlaps with the functional onset of the anterior attention system. Given that some models emphasize the role of PFC and attentional control as being critical for working memory (e.g., Baddeley, 1996; Kane and Engle, 2002; Klingberg et al., 2002), further development of the anterior attention system would be critical for working memory development (for further discussion of attention and memory relations in childhood and adulthood, see Awh and Jonides, 2001; Awh et al., 2006; Astle and Scerif, 2011; Amso and Scerif, 2015).

The general arousal/attention system shows significant developmental change across infancy and early childhood characterized by gains in both the magnitude and duration of periods of sustained attention (Richards and Cronise, 2000; Richards and Turner, 2001; Reynolds and Richards, 2008). Infants are more likely to demonstrate evidence of recognition memory if initial exposure to the test stimulus occurs during sustained attention or if the infant is engaged in sustained attention during the recognition test (e.g., Richards, 1997; Frick and Richards, 2001; Reynolds and Richards, 2005; Reynolds et al., 2010). It stands to reason that these developmental gains in sustained attention would also facilitate improved performance on working memory tasks. This reasoning is supported by Bell's (2012) finding that infants who show decreased heart rate from baseline to task also show enhanced performance on the A-not-B task. Studies utilizing the heart rate phases (Richards and Casey, 1992) during infant working memory tasks would provide greater insight into the effects of sustained attention on working memory performance.

Relations between arousal and attention are complex and change throughout development. The significant and sustained decrease in heart rate associated with attention is most likely limited to infancy and early childhood; however, individual differences in heart rate variability (HRV) are related to attention and cognitive performance throughout development (Porges, 1992; Suess et al., 1994; Reynolds and Richards, 2008). Relatively little work has examined the influence of arousal aspects of attention on working memory in later development. An exception would be the work by Thayer and colleagues (Hansen et al., 2003; Thayer et al., 2009) examining relations between HRV and working memory in adults. Their findings indicate that individual differences in baseline HRV are associated with performance on working memory tasks. Individuals with high baseline HRV perform better on working memory tasks than individuals with low baseline HRV, and the advantage is specific to tasks requiring executive function (Thayer et al., 2009). Thus, attention and arousal appear to influence working memory throughout development; however, the dynamics of these relations are complex and would be expected to change significantly with age.

The development of attention and the development of working memory are closely related. Significant gains on working memory tasks overlap in developmental timing with key periods for development of sustained attention, the posterior orienting system, and the anterior attention system. There is also significant overlap in neural systems involved in attention and working memory. The cortical sources of the Nc ERP component associated with infant visual attention have been localized to areas of PFC (Reynolds and Richards, 2005; Reynolds et al., 2010). Similarly, research with fNIRS indicates that frontal and parietal areas are involved in working memory performance for infants (Baird et al., 2002) and preschoolers (Buss et al., 2014). Given the substantial overlap in developmental timing and neural systems involved in both attention and working memory, future research should aim to examine relations between attention and working memory in infancy and early childhood using both psychophysiological and neural measures. A multi-level analysis approach would be ideal for addressing the controversy regarding the relative contribution of PFC, parietal cortex, and medial temporal lobe structures to working memory performance. Attention plays a key role in successful working memory performance, and the development of attention systems most likely influences the development of working memory. Bidirectional effects are common throughout development, and thus of equal interest is the potential influence of working memory on further development of attention systems in infancy and early childhood.

# AUTHOR CONTRIBUTIONS

After discussions about potential directions for the article, the authors (GDR and ACR) settled on the overall content to include and outline to follow for the article. ACR provided recommendations on potential content for several of the major sections of the article. GDR incorporated much of ACR's work into the article when he wrote the original draft, and subsequently incorporated further input from ACR into the final version of the manuscript.

# ACKNOWLEDGMENTS

Research reported in this article and the writing of this article were supported by the National Institute of Child Health and Human Development Grant R21-HD065042, and the National Science Foundation Developmental and Learning Sciences Division Grant 1226646 to GDR.




**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Reynolds and Romano. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Off to a Good Start: The Early Development of the Neural Substrates Underlying Visual Working Memory

Allison Fitch†, Hayley Smith†, Sylvia B. Guillory and Zsuzsa Kaldy\*

Department of Psychology, University of Massachusetts Boston, Boston, MA, USA

Current neuroscientific models describe the functional neural architecture of visual working memory (VWM) as an interaction of the frontal-parietal control network and more posterior areas in the ventral visual stream (Jonides et al., 2008; D'Esposito and Postle, 2015; Eriksson et al., 2015). These models are primarily based on adult neuroimaging studies. However, VWM undergoes significant development in infancy and early childhood, and the goal of this mini-review is to examine how recent findings from neuroscientific studies of early VWM development can be reconciled with this model. We surveyed 29 recent empirical reports that present neuroimaging findings in infants, toddlers, and preschoolers (using EEG, fNIRS, rs-fMRI) and neonatal lesion studies in non-human primates. We conclude that (1) both the frontal-parietal control network and the posterior cortical storage areas are active from early infancy; (2) this system undergoes focalization and some reorganization during early development; and (3) the MTL plays a significant role in this process as well. Motivated by both theoretical and methodological considerations, we offer some recommendations for future directions for the field.

#### Edited by:

Lionel G. Nowak, Université Toulouse III - Paul Sabatier and Centre National de la Recherche Scientifique, France

#### Reviewed by:

Luis J. Fuentes, University of Murcia, Spain Emmanuel Procyk, French Institute of Health and Medical Research, France

#### \*Correspondence:

Zsuzsa Kaldy zsuzsa.kaldy@umb.edu †Co-first authors.

Received: 20 January 2016 Accepted: 02 August 2016 Published: 18 August 2016

#### Citation:

Fitch A, Smith H, Guillory SB and Kaldy Z (2016) Off to a Good Start: The Early Development of the Neural Substrates Underlying Visual Working Memory. Front. Syst. Neurosci. 10:68. doi: 10.3389/fnsys.2016.00068

Keywords: visual working memory, frontoparietal network, ventral stream, early development, neonatal lesions in primates, infants, preschoolers

# INTRODUCTION

Working memory is a limited-capacity system for the maintenance and manipulation of information in service of ongoing tasks. The classic model of working memory (WM, Baddeley and Hitch, 1974) distinguishes the central executive system and two different sensory buffers for the temporary storage of visual and auditory information (an additional system, the episodic buffer, was later added: Baddeley, 1986). This multicomponent model has framed essentially all research on WM for more than 20 years. More recent "state-based" WM models (Cowan, 1988; Oberauer, 2002; McElree, 2006), however, question basic assumptions of the multicomponent model, claiming that there are no separate WM-specific storage systems in the brain; instead, representations held in WM are temporarily activated long-term memory (LTM) representations. According to this view, storage of sensory information involves posterior cortices; visual WM (VWM) representations, for example, have been localized in various stages of the ventral stream, starting in the occipital cortex (Harrison and Tong, 2009; Serences et al., 2009) and continuing to inferior temporal cortex (Miller et al., 1991). Maintenance and manipulation of WM representations (the functions of the central executive) depend upon a frontal-parietal network (Awh and Jonides, 2001; Curtis and D'Esposito, 2003), in particular, anterior insula, lateral prefrontal cortex (PFC), dorsal anterior cingulate cortex, and areas within and surrounding the intraparietal sulcus (Seeley et al., 2007).

**Abbreviations:** DR, delayed response; DNMS, delayed non-match to sample; DTI, diffusion tensor imaging; EEG, electroencephalography; fNIRS, functional near infrared spectroscopy; fMRI, functional magnetic resonance imaging; HR, heart rate; dlPFC, dorsolateral prefrontal cortex; LTM, long-term memory; MTL, medial temporal lobe; Neo-HC, neonatally lesioned in the hippocampus; Neo-PRh, neonatally lesioned in the perirhinal cortex; Obj-SO, object self-ordered pointing task; PRh, perirhinal cortex; rs-fMRI, resting state fMRI; SOMT, Serial Order Memory Task; VoE, Violation of Expectation; vlPFC, ventrolateral prefrontal cortex; VWM, visual working memory; WM, working memory.

This conceptualization of WM is grounded in an extensive body of neuroscientific research, the majority of which has been conducted with human adults (for reviews, see Jonides et al., 2008; D'Esposito and Postle, 2015; Eriksson et al., 2015). WM undergoes significant postnatal development, with far-reaching consequences for cognitive development in general (Bull et al., 2008). Behavioral studies have shown that the ability to hold information in VWM emerges in infancy (Káldy and Leslie, 2003, 2005; Ross-Sheehy et al., 2003; Zosh and Feigenson, 2012), and gradually improves throughout childhood (Riggs et al., 2006; Cowan et al., 2010; Simmering, 2012) and adolescence (Isbell et al., 2015). It is outside of the scope of this mini-review to provide a comprehensive overview of the entire behavioral literature (see Kibbe, 2015; Cowan, 2016; Reynolds and Romano, 2016, in this Research Topic); instead, we will examine whether recent findings from neuroscientific studies of early VWM development can be fit into the adult model above.

We limit our focus to studies that examine VWM in the first 5 years of life. While there is an abundant fMRI literature on children older than 6–7 years of age (e.g., Geier et al., 2009; von Allmen et al., 2014), this method currently cannot be used with very young children, and here we focus on what is known about these mechanisms before this age. The studies reviewed here employ a variety of neurophysiological methods (primarily electroencephalography, EEG, and functional Near-Infrared Spectroscopy, fNIRS) in human infants and young children (**Table 1**) and lesions in young primates (**Table 2**).

Structural and functional brain development progresses in parallel. Both classic brain anatomical studies of synaptic density (Huttenlocher and Dabholkar, 1997) and more recent structural connectivity studies using DTI (Qiu et al., 2015) found a posterior-to-anterior progression during the first few years of life, with white matter developing in the occipital and temporal cortices before frontal areas. While our focus in this mini-review is on the functional development of the system underlying VWM, we will also discuss a few groundbreaking studies where researchers were able to link behavioral performance in a VWM task with myelination of a specific network (Short et al., 2013; Meng et al., 2014).

# NEURODEVELOPMENT OF THE HUMAN VWM SYSTEM: INFANCY (0–2 YEARS)

Many of the neuroimaging studies examining infant VWM development employed the classic A-not-B task in conjunction with optical imaging (fNIRS) or EEG. In this task, an object is hidden at one of two locations and the infant is allowed to manually search for it. Once the infant repeatedly succeeds at one location, the object is then hidden at the other location. In the looking-based version of this task, looking times to the two locations are contrasted.

In one of the first studies to measure regional blood-flow changes in infants using fNIRS, Baird et al. (2002) found that prefrontal cortex (PFC) activity increased with success on an object maintenance task. More recently, EEG power and coherence measures from the entire scalp have been used to examine VWM task-related and age-related changes in the frontal-parietal network of infants (Bell and Wolfe, 2007; Cuevas and Bell, 2011; Bell, 2012; Cuevas et al., 2012a,b,c). Cuevas et al. (2012a), for example, found that frontal EEG power and heart rate predicted VWM performance in infants at 10 months, but not at 5 months. In another study, successful performance on the A-not-B task was found to be related to increased frontal-parietal coherence at 8 months (Bell, 2012; Cuevas et al., 2012b). These findings suggest that the frontal-parietal network supports successful VWM performance between 8 and 10 months.

During the infancy period, functional connectivity of the VWM network appears to become less diffuse with age. Cuevas et al. (2012a) found an increase in EEG coherence relative to baseline across the entire scalp in 5-month-olds but only between the medial frontal and occipital electrode sites in 10-month-olds. This finding is additionally supported by the observation of increased focalization of frontal-parietal network activity between 8 months and 4.5 years of age, which may reflect more efficient communication (Bell and Wolfe, 2007).

Resting-state fMRI (rs-fMRI) has been used to identify functional connections between brain regions in the absence of any task. This latter aspect makes this method particularly attractive for studies of early development, as infants can be scanned during sleep. In a pioneering study, Alcauter et al. (2014) tracked the development of resting-state networks in infants from birth to 2 years of age and their VWM performance. In addition to significant gains in synchrony among prefrontal and parietal regions at age one, it was found that connectivity between the thalamus and the salience network (which includes the insula, the cingulate, and frontal cortices, and is considered a sub-network of the frontal-parietal network in adults, see Elton and Gao, 2014) at age one predicted VWM performance at age two. In a DTI tractography study, the same group found that myelination of the tracts connecting frontal and parietal cortices predicted VWM performance in 1-year-old infants (Short et al., 2013). These studies thus corroborate the EEG findings that frontal-parietal connectivity is present before the end of the first year, and is related to VWM development. However, because salience network activity is functionally dissociated from WM performance in adults (Seeley et al., 2007; Elton and Gao, 2014), it is likely this network undergoes functional reorganization between toddlerhood and adulthood.

The involvement of posterior cortical areas in infant VWM has primarily been examined using more modern behavioral paradigms, such as Violation-of-Expectation (VoE), in conjunction with fNIRS or EEG. Using fNIRS, Wilcox and colleagues found that the anterior temporal cortex showed consistent activation when infants noticed a change in the features of an object that they held in mind when it reappeared from behind an occluder (thus, this feature change "violated" their expectations; Wilcox et al., 2005, 2008, 2009, 2010, 2014). Task-related activation in the posterior temporal cortex gradually decreased from 5 to 12 months, and the occipital cortex was active during all object maintenance tasks. This decrease in activation in posterior temporal cortex may reflect functional reorganization of object processing areas over the course of development (Wilcox et al., 2012, 2014; Wilcox and Biondi, 2016). Converging evidence for maintenance-related activity in posterior storage areas has been reported by Kaufman et al. (2003, 2005) using EEG. They found that increased gamma-band (20–60 Hz) activity in the right temporal cortex of 6-month-olds was associated with the maintenance of object representations behind an occluder (Kaufman et al., 2003, 2005). More recently, Kaufman and colleagues showed that the same response was higher in the right occipital cortex when infants kept two vs. one object in VWM (Leung et al., 2016). This result raises the possibility of finding a load-dependent neural signature of information storage in infant VWM.

In sum, the literature concerning the neural substrates of VWM systems in infants points toward an early emerging frontal-parietal network; one that is present and active even before age one (Bell, Cuevas; connectivity studies). Studies by Wilcox, Kaufman and their colleagues found storage-related VWM activity in the temporal and occipital cortices as well, which may mirror similar findings in adults in the ventral visual stream (for a recent review, see Lee and Baker, 2016).

#### NEURODEVELOPMENT OF THE HUMAN VWM SYSTEM: EARLY CHILDHOOD (3–5 YEARS)

To date, only a handful of neuroimaging studies have examined VWM during the early childhood period, and all used fNIRS. The lack of neuroimaging (both structural and functional) conducted with this notoriously challenging age range is primarily due to practical limitations: Preschool-age children require special experimental designs as they are rarely willing to participate for an extended time, and they often do not follow verbal instructions reliably. One notable limitation of three of the four fNIRS studies reviewed below is that hemodynamic responses were measured only in the frontal areas (or in Buss et al., 2014, in the frontal and the parietal cortices). Thus, conclusions were necessarily constrained to these regions.

Tsujimoto et al. (2004) found that lateral PFC activity in 5.5-year-old children was very similar to adults' during a change detection task. In this task, one of the most widely used paradigms in adult VWM research, participants are briefly presented with a set of to-be-remembered items, and following a short delay are tested on whether or not the items have changed (Pashler, 1988; Luck and Vogel, 1997). Using the same task with a small longitudinal sample, Tsujii et al. (2009) found that between 5 and 7 years of age, increased VWM performance correlated with right lateralization of frontal activity.

lesions during the first 2 weeks of life; Neo-PRh adults, adult macaques who received neurotoxic lesions in the perirhinal region during the first 2 weeks of life; L, longitudinal; C, cross-sectional; "one," only one age group tested.

More recently, Buss et al. (2014) found that the frontal-parietal network was active in 3- and 4-year-olds during a change detection task, where load was systematically manipulated. Overall, they demonstrated greater involvement of parietal cortical areas relative to frontal areas, as well as increased parietal activity in 4-year-olds relative to 3-year-olds. Prior studies found that, in adults, activity in the parietal cortex was load-dependent for small set sizes, and leveled off at the behaviorally-defined capacity limit (Todd and Marois, 2004; Palva et al., 2011). In 3- and 4-year-olds this activity was load-dependent, but continued to increase beyond the observed capacity limit, a finding that warrants further investigation. In a similar investigation of delay-dependent activity, Perlman et al. (2016) manipulated the length of delays (2 vs. 6 s) and found age-dependent activation in lateral PFC in children between 3 and 7 years of age, and that children recruited this area more during longer delays. As the ventrolateral PFC is involved in maintenance, this finding suggests increased active rehearsal of information with age.

In sum, it appears that the frontal-parietal network becomes increasingly adult-like throughout early childhood. Increased recruitment of prefrontal and parietal areas points to increased focalization of the frontal-parietal system, while increased lateralization to the right hemisphere suggests adult-like specialization of this network for visuospatial tasks (Thomason et al., 2009). Because recordings were not made from the temporal and occipital areas, at the current time we cannot draw any conclusions about the involvement of the posterior cortices. The paucity of research in this age range creates a gap in our understanding of the development of VWM.

# NEURODEVELOPMENT OF THE NON-HUMAN PRIMATE VWM SYSTEM: EFFECTS OF NEONATAL LESIONS

Both the frontal-parietal network and the posterior storage areas (e.g., IT) have multiple connections to the medial temporal lobe (MTL; Lavenex et al., 2002). While most current neuroscientific methods used in young children (fNIRS, EEG) do not allow access to these deep structures, primate lesion studies have provided a wealth of findings about the role of these structures in early development. Unlike adult lesion studies, which can only provide information about the relative contribution of a brain structure in a fully-formed system, neonatal lesion studies have the advantage of examining the downstream effects of a lesion on the developing system<sup>1</sup>. In the following section, we will focus on the role of the MTL in the development of the frontal-parietal network.

Heuer and Bachevalier (2011) examined the contribution of the hippocampus to the development of VWM abilities. Here they utilized a delayed response task (also widely used in classic behavioral studies with infants; e.g., Diamond and Doar, 1989), where participants are presented with one object (the sample), followed by a delay, and then a choice between a matching object and a non-matching object. In the delayed-nonmatch-to-sample (DNMS) version of this task, participants are rewarded for selecting the non-matching object. Results showed that adult macaques that received neonatal hippocampal lesions (henceforth: Neo-HC) performed as well as sham-operates on a DNMS task (requires maintenance and putatively relies on the vlPFC, see Petrides, 1995). However, these macaques failed to even meet training criterion on an object self-ordered pointing task (Obj-SO) in which participants selected baited food wells in a different order on successive trials (requires manipulation, specifically, monitoring serial order, and putatively relies on the dlPFC; Petrides, 1995).

Follow-up studies using other dlPFC-associated VWM tasks have provided supporting evidence: Neo-HC macaques made significantly more errors than controls on a serial-order memory (SOMT) task (Heuer and Bachevalier, 2013), and in a foraging task were more likely than controls to return to boxes they had already visited, especially if that box previously contained the animal's preferred food (Glavis-Bloom et al., 2013). Thus early hippocampal lesions lead to deficits in VWM manipulation, but not in maintenance. The finding that early hippocampal damage leads to deficits on a task that taps dlPFC has been replicated in human patients who suffered hypoxic-ischaemic events early in life (Geva et al., 2016).

In addition to hippocampal lesions, Weiss et al. (2016) found that neonatal lesions to another area of MTL, the perirhinal cortex, impacted VWM performance on tasks believed to rely on the vlPFC. In their study, macaques with neonatal lesions of perirhinal cortex (Neo-PRh) were impaired on a DNMS task at short delays, as well as an Obj-SO task; both repeated stimuli across trials, and thus required trial-to-trial updating of information in VWM. In contrast, Neo-PRh animals performed well on a task that used novel stimuli across trials (SOMT), thus did not require updating, suggesting that the perirhinal cortex is not involved in manipulation of WM contents per se, but rather interference resolution or associated executive functions (e.g., inhibition).

These findings suggest that the MTL gives rise to the development of PFC-associated VWM skills, such as manipulation and interference resolution, likely through reciprocal neuroanatomical connections (Goldman-Rakic et al., 1984; Aggleton et al., 2015). Two recent connectivity studies provide converging evidence for this. Early hippocampal damage led to both reduced white matter (Meng et al., 2014) and decreased resting-state connectivity (Meng et al., 2016) between the dlPFC and the medial PFC and several posterior areas, such as IT and V4, in adult macaques. These anatomical and functional impairments correlated with poorer performance on the SOMT (Meng et al., 2014, 2016). This correlation underscores the importance of the hippocampus, as well as the frontal-parietal network in the development of VWM abilities: By adulthood, Neo-HC macaques had not developed compensatory mechanisms for VWM. This stands in stark contrast to a similar neonatal lesion study demonstrating compensatory mechanisms for rule learning and recognition memory following lesions to the vlPFC (Malkova et al., 2016).

<sup>1</sup>The earliest neuroscientific studies of the development of the frontal cortex used these techniques as well (Goldman, 1971; Miller et al., 1973), and demonstrated the role of both the dorsolateral and the ventrolateral PFC (dlPFC and vlPFC) in VWM. By connecting findings in PFC-lesioned macaques and human infants, Diamond and Goldman-Rakic (1989) laid one of the first building blocks of developmental cognitive neuroscience.

#### SUMMARY AND FUTURE DIRECTIONS

The goal of this mini-review was to examine the neurophysiological evidence regarding the early emergence of the VWM network that involves both the frontal-parietal control network and the posterior storage areas that have been identified in adults. Our first conclusion is that both of these systems seem to be active from as early as the second half of the first year in humans.

A handful of longitudinal and cross-sectional studies reviewed here point to a gradual focalization of the frontal-parietal system throughout development (see the works of Bell and her colleagues). We also see some evidence for the functional reorganization of the network during the early life period: for example, a shift away from the salience network from infancy to adulthood (Alcauter et al., 2014), and increasing reliance on the parietal cortex during the preschool years (Buss et al., 2014). These changes may reflect a specialization within the network. Furthermore, findings from non-human primates have demonstrated the significance of the medial temporal lobe in the development of the lateral PFC (see the works of Bachevalier and her colleagues).

Related to the emergence of posterior information storage areas, a number of studies found object-maintenance related activity in both occipital and temporal lobes in infancy (see the works of Kaufman and his colleagues). Studies on VWM mechanisms in early childhood have not recorded from these posterior areas, so our understanding of how these areas support VWM in this age range is, at the moment, limited.

A defining characteristic of VWM is its limited capacity. This functional characteristic can serve as a signature to identify VWM storage-related mechanisms: In these structures, activity is expected to gradually increase with the number of to-be-remembered items, and then remain constant when capacity limit has been reached (e.g., Palva et al., 2011). In studies that aim to find this signature, the adult cognitive neuroscience literature has adopted a useful psychophysical measure to quantify VWM capacity (Cowan's k<sup>2</sup>). Research on school-age children has recently begun to examine how memory load affects the recruitment of different parts of the VWM system using this measure (e.g., Shimi et al., 2014; Kharitonova et al., 2015). Importantly, this approach has already been applied successfully in preschoolers (Buss et al., 2014, reviewed above).
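As an illustration of how this capacity measure behaves, the following minimal Python sketch computes Cowan's k from the formula given in the footnote to this article, k = N × (H + CR − 1), for a single-probe change detection task. The function name `cowan_k` and the hit and correct rejection rates in `observed` are hypothetical values chosen for illustration only; they are not data from any study reviewed here.

```python
def cowan_k(set_size, hit_rate, correct_rejection_rate):
    """Estimate capacity as k = N * (H + CR - 1) for one set size.

    set_size: number of items presented (N)
    hit_rate: proportion of change trials correctly detected (H)
    correct_rejection_rate: proportion of no-change trials correctly rejected (CR)
    """
    for name, rate in (("hit_rate", hit_rate),
                       ("correct_rejection_rate", correct_rejection_rate)):
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {rate}")
    return set_size * (hit_rate + correct_rejection_rate - 1.0)


# Hypothetical (H, CR) pairs per set size, invented for this sketch.
observed = {
    1: (0.95, 0.95),
    2: (0.90, 0.90),
    4: (0.70, 0.75),
    6: (0.60, 0.65),
}

# One k estimate per set size; a plateau across larger set sizes is the
# behavioral counterpart of the capacity-limit signature described above.
estimates = {n: cowan_k(n, h, cr) for n, (h, cr) in observed.items()}

for n in sorted(estimates):
    print(f"set size {n}: k = {estimates[n]:.2f}")
```

In this made-up example the estimate rises with set size for small arrays and then stops increasing, which is the pattern a capacity-limited store would be expected to produce. Note that chance performance (H + CR = 1) yields k = 0 at any set size.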

Based on both theoretical and methodological considerations, the ideal design to study neurodevelopmental change in the VWM system has the following attributes:


Some of the studies to date have two of these features, but none have all three. Because of its versatility and low task demands, the change detection paradigm is the best positioned to meet criterion (a) in the near future. Thus, a crucial open question for future studies is how neural activity in the VWM network changes in children under 3 years of age using this task. As well, future studies with preschool-age children that record from posterior cortices (using whole-brain nets, see e.g., Sato et al., 2012) should elucidate the role of these structures in VWM beyond infancy.

Despite all the methodological challenges that are involved in studying brain functions in infants, young children, and young primates, research on early VWM neurodevelopment has gotten off to an exciting start. Several different physiological methods have already yielded converging results, and recent advances in neuroimaging methods (e.g., Cutini and Brigadoi, 2014; Graham et al., 2015), will likely lead to an expansion of research in the near future. We look forward to an exciting period in the study of the early developmental unfolding of the VWM system.

#### AUTHOR CONTRIBUTIONS

AF and HS contributed equally to this manuscript by selecting and summarizing relevant studies and writing multiple sections of this review. SG contributed significantly to the section on EEG and rs-fMRI studies. ZK has developed the theoretical perspective with the help of AF and HS. All four authors contributed to the writing and editing of the paper.

# FUNDING

The authors were supported by National Institutes of Health's grant R15HD086658 and a Seed Grant from the Simons Foundation under the auspices of the Simons Center for the Social Brain at MIT (#319294) awarded to ZK.

<sup>2</sup>The formula is $k = N \times (H + CR - 1)$, where N is the number of items presented, H is hit rate, CR is correct rejection rate (Cowan et al., 2005).

#### REFERENCES

Aggleton, J. P., Wright, N. F., Rosene, D. L., and Saunders, R. C. (2015). Complementary patterns of direct amygdala and hippocampal projections to the macaque prefrontal cortex. Cereb. Cortex 25, 4351–4373. doi: 10.1093/cercor/bhv019

Alcauter, S., Lin, W., Smith, J. K., Short, S. J., Goldman, B. D., Reznick, J. S., et al. (2014). Development of thalamocortical connectivity during infancy and its cognitive correlations. J. Neurosci. 34, 9067–9075. doi: 10.1523/JNEUROSCI.0796-14.2014


mechanisms of encoding and maintenance in visual STM. J. Cogn. Neurosci. 26, 864–877. doi: 10.1162/jocn\_a\_00526


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Fitch, Smith, Guillory and Kaldy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Functional Activation in the Ventral Object Processing Pathway during the First Year

Teresa Wilcox \* and Marisa Biondi

*Infant Cognition Lab, Department of Psychology, Texas A&M University, College Station, TX, USA*

Infants' capacity to represent objects in visual working memory changes substantially during the first year of life. There is a growing body of research focused on identifying neural mechanisms that support this emerging capacity, and the extent to which visual object processing elicits different patterns of cortical activation in the infant as compared to the adult. Recent studies have identified areas in temporal and occipital cortex that mediate infants' developing capacity to track objects on the basis of their featural properties. The current research (Experiments 1 and 2) assessed patterns of activation in posterior temporal cortex and occipital cortex using fNIRS in infants 3–13 months of age as they viewed occlusion events. In the occlusion events, either the same object or featurally distinct objects emerged to each side of a screen. The outcome of these studies, combined, revealed that in infants 3–6 months, posterior temporal cortex was activated to all events, regardless of the featural properties of the objects and whether the event involved one object or two (featurally distinct) objects. Infants 7–8 months showed a waning posterior temporal response and by 10–13 months this response was negligible. Additional analysis showed that the age groups did not differ in their visual attention to the events and that changes in HbO were better explained by age in days than head circumference. In contrast to posterior temporal cortex, robust activation was obtained in occipital cortex across all ages tested. One interpretation of these results is that they reflect pruning of the visual object-processing network during the first year. The functional contribution of occipital and posterior temporal cortex, along with higher-level temporal areas, to infants' capacity to keep track of distinct entities in visual working memory is discussed.

#### Edited by:

*Natasha Sigala, University of Sussex, UK*

#### Reviewed by:

*Jordy Kaufman, Swinburne University of Technology, Australia Lauren Emberson, Princeton University, USA*

#### \*Correspondence:

*Teresa Wilcox twilcox@tamu.edu*

Received: *15 September 2015* Accepted: *07 December 2015* Published: *05 January 2016*

#### Citation:

*Wilcox T and Biondi M (2016) Functional Activation in the Ventral Object Processing Pathway during the First Year. Front. Syst. Neurosci. 9:180. doi: 10.3389/fnsys.2015.00180*

Keywords: infants, object processing, object processing pathway, ventral temporal cortex, cortical development

# INTRODUCTION

Infants' capacity to track the identity of visual objects, that is, to form coherent representations of objects that persist in the absence of direct visual input, changes substantially during the first year of life. Over the last 25 years developmental scientists have made significant progress toward understanding the nature and development of infants' capacity to represent objects in visual working memory. For example, investigations have revealed important changes in the type of information that infants include in their visual object representations, infants' capacity to integrate discordant sources of information, and the extent to which infants use this information to interpret physical events (Leslie et al., 1998; Tremoulet et al., 2000; Wilcox and Schweinle, 2002; Wang and Baillargeon, 2008; Baillargeon et al., 2012; Kaldy et al., 2015). There is also a growing body of research on the mechanisms that support and facilitate the changes that have been observed (Wang and Baillargeon, 2008; Wu et al., 2011; Baillargeon et al., 2012). One approach is to study the cognitive and cortical architecture on which the development of these capacities depends. With the introduction of more sophisticated neuroimaging and behavioral techniques that can be used with human infants in the experimental setting, the opportunities to apply a developmental cognitive neuroscience approach have expanded (Karmiloff-Smith, 2010; Wilcox and Biondi, 2015).

In the adult, a number of cortical networks have been identified as important to visual object working memory. Of most interest to the current research are networks that support the processing of objects on the basis of their featural properties. Initial studies conducted with non-human primates, and subsequent studies conducted with adult humans, have revealed hierarchically organized networks in ventral areas of the cortex. For example, areas in the primary visual cortex respond to specific features, such as lines, orientation, or color (Bartels and Zeki, 2000; Tootell et al., 2003; Orban et al., 2004), whereas areas in the occipito-temporal cortex (e.g., lateral occipital complex) integrate these features and code objects as wholes, independent of visual perspective (Malach et al., 1995; Grill-Spector, 2003; Kanwisher, 2003; Kourtzi and Connor, 2011). Moving posterior to anterior in the temporal cortex, object representations become more abstract, with anterior temporal cortex being important to higher-level object processing, such as object identification and categorization (Humphreys et al., 1999; Devlin et al., 2002; Peelen and Caramazza, 2012). Recently, investigators have begun to explore the functional development of this network. In a series of studies, for example, infants aged 3–12 months were shown a shape-difference, color-difference, or control event like those depicted in **Figure 1** (Wilcox et al., 2010, 2012, 2014a). Behavioral studies have revealed that early in the first year infants use the shape difference to individuate objects, but it is not until the end of the first year that infants use a color difference (Wilcox, 1999; Wilcox and Chapa, 2004). A similar developmental hierarchy has been observed in object segregation and identification tasks, which require related (but not identical) processes (Needham, 2000; Tremoulet et al., 2000). In the Wilcox et al. 
studies, functional near-infrared spectroscopy (fNIRS) was used to assess patterns of cortical activation during infants' processing of these events. Optodes were placed over three left ventral areas (**Figures 2A**, **3**): occipital cortex (near O1 of the 10–20 International System for EEG recording), posterior temporal cortex (near T5 of the 10–20 System), and anterior temporal cortex (near T3 of the 10–20 System). Cortical activation was also measured from optodes placed over parietal cortex (near P3 of the 10–20 System) but this dorsal area is not of theoretical interest here. The main prediction was straightforward: a different pattern of activation would be obtained to events that engage, as compared to those that fail to engage, the individuation process. Whereas occipital cortex and posterior temporal cortex (low- and mid-level object processing areas) would be activated in response to all events, anterior temporal cortex (a higher level object processing area) would be activated only in response to events in which infants individuate by feature.

saw 2 complete event cycles during each test trial.

As predicted, these studies revealed consistent, robust activation in occipital cortex to all three events at all ages tested between 3 and 12 months (see also Wilcox et al., 2005, 2008, 2009). Also as predicted, anterior temporal activation was obtained only in response to events in which infants individuate-by-feature. For example, infants 3–9 months, who use shape but not color information to individuate objects (Wilcox, 1999; Wilcox and Chapa, 2004), showed activation in the anterior temporal cortex when viewing the shape difference but not the color difference event. In contrast, infants 11–12 months, who use shape and color information to individuate objects (Wilcox and Chapa, 2004; Wilcox et al., 2007), showed activation in the anterior temporal cortex when viewing the shape difference and the color difference event. The control event did not activate anterior temporal cortex in any age group. This pattern of results (activation only when infants interpret featural differences as signaling the presence of distinct objects) implicates the anterior temporal cortex as central to the individuation process. This conclusion is supported by evidence that when infants younger than 11 months (who do not spontaneously individuate-by-color) are shown events prior to test that prime them to individuate-by-color, activation in anterior temporal cortex is obtained (Wilcox et al., 2014b).

FIGURE 2 | Configuration and placement of optodes. (A) Top: Configuration of the emitters (red circles) and detectors (black squares) and the nine corresponding channels in the headgear used by Wilcox et al. (2010, 2012, 2014a). Emitter-detector distances were all 2 cm. Bottom: Approximate location of the nine channels from which data were collected on a schematic of an infant's head in relation to the 10–20 International EEG system. Each detector read from a single emitter except for the detector between T3 and T5, which read from both emitters. The light was frequency modulated to prevent "cross-talk." Experiment 1 focused on data collected at channels 4 and 5 (posterior temporal cortex) and channels 8 and 9 (occipital cortex), which are in bold. (B) Top: Configuration of the emitter (red circle) and detectors (black squares) and the six corresponding channels in the headgear used in Experiment 2. Emitter-detector distances were either 2 cm (channels 2, 4, and 5) or 3 cm (channels 1, 3, and 6). For statistical analyses (see text), the channels were grouped into three regions within the posterior temporal cortex: Region I (channels 1 and 2), Region II (channels 3 and 4), and Region III (channels 5 and 6).

What was unexpected was the pattern of activation observed in posterior temporal cortex. Activation in this cortical area appears to be age-related and independent of test event. For example, infants about 3–6 months show activation in posterior temporal cortex in response to all three test events and the magnitude of the response does not vary by event (Wilcox et al., 2010, 2012). Other research has reported that activation in posterior temporal cortex in this age group is (a) specific to objects and not non-object visual stimuli such as reversing checkerboard patterns or faces (Watanabe et al., 2008, 2010; Lloyd-Fox et al., 2009; Honda et al., 2010), and (b) independent of the properties of the objects involved (Watanabe et al., 2008, 2010). Collectively, these data implicate the posterior temporal cortex as important to mid-level visual object processing (at least in younger infants); this area responds selectively to objects, although the information associated with those objects is limited. In contrast to the robust responses observed in younger infants, by about 7 months posterior temporal activation appears to wane and by about 12 months is typically not observed (Wilcox et al., 2012, 2014a), suggesting that the ventral object-processing network undergoes functional reorganization during the first year. Further exploration of this phenomenon is warranted, however, for a number of reasons. First, the studies from which these results are drawn used slightly different age groups, some of which overlap; hence, firm conclusions about the age-related differences require re-analysis of the data on the basis of age. Second, headgear configuration, including source-detector distances, remained constant across age while head circumference increased with age, raising the question of whether the change in HbO responses observed could be better explained by changes in head circumference (and hence the cortical areas being assessed) than by age. Finally, only two measurement channels in posterior temporal cortex were used, leaving a portion of posterior temporal cortex unassessed.

The goal of the current research was to assess the conclusion that activation in posterior temporal cortex during visual object processing wanes during the first year. The current research took two approaches. First, in Experiment 1 data from previously conducted studies in which infants saw shape-difference, color-difference, or control events were compiled into a single data set. Analyses of the responses at the two measurement channels in the posterior temporal cortex were conducted for each of three age groups: 3–6, 7–9, and 11–12 months. As a comparison, we analyzed hemodynamic responses at the two measurement channels in the occipital cortex, which have been reported to remain stable over the first year. In addition, we examined the correlation between hemodynamic responses, age, and head circumference, which has not been reported previously. Second, a new study was conducted in which optical imaging data were collected from six, rather than two, measurement channels in left posterior temporal cortex. Two age groups were tested in this experiment: 4- to 6-month-olds and 10- to 12-month-olds. This allowed us to assess the extent to which posterior temporal cortex might be involved in visual object processing in older infants, but had not been evident in previous studies because only two measurement channels were used.

# EXPERIMENT 1


The data for Experiment 1 were drawn from three previously published papers (Wilcox et al., 2010, 2012, 2014a) that had used a similar experimental protocol to assess activation in response to the three events displayed in **Figure 1**. In these research reports, fNIRS data were collected while infants viewed a shape-difference, color-difference, or control event. We compiled the data from four measurement channels, two each in the occipital and posterior temporal cortex (**Figure 2A**), into a single database containing hemodynamic responses along with age in days and head circumference. Although all of these data were reported in previously published manuscripts, not all of the data were subjected to statistical analyses. For example, in Wilcox et al. (2014a) mean HbO responses in occipital and posterior temporal cortex were reported but not included in data analyses because they were not directly relevant to the research hypothesis. This approach allowed us to assess, using a large sample, the extent to which responses in posterior temporal cortex, as compared to occipital cortex, differed by age when controlling for head circumference.

#### Materials and Methods

#### Participants

The sample included 198 infants belonging to three age groups: 99–207 days (n = 93, M days = 170.5, 53 males, and 39 females); 213–280 days (n = 50, M days = 236.1, 32 males, and 18 females); and 339–391 days (n = 55, M days = 356.5, 34 males, and 21 females). These will be referred to as 3- to 6-month-olds (young age group), 7- to 9-month-olds (intermediate age group), and 11- to 12-month-olds (old age group), respectively. This sample included the infants tested in Wilcox et al. (2010) and Wilcox et al. (2012) and the 7- to 8-month-olds tested in Wilcox et al. (2014a)<sup>1</sup>. All data were collected using a between-subjects design. The number of infants who viewed the shape-difference, color-difference, and control event in each age group was the following: young age group (shape n = 32, color n = 31, control n = 30), intermediate age group (shape n = 21, color n = 6, control n = 23), and old age group (shape n = 19, color n = 18, control n = 18). In each age group, an additional 22, 13, and 23 infants were tested, respectively, but excluded from analyses because of poor optical signal, failure to attend to the display, procedural problems, or crying. The percentage of infants who were tested but failed to contribute data did not differ significantly for the young (19.1%) and intermediate (20.6%) age groups, nor for the intermediate and old (29.5%) age groups, p > 0.05 (Z-test). The attrition rates reported in Experiment 1 and Experiment 2 are within the range of those typically reported in infant fNIRS studies (Lloyd-Fox et al., 2010). Experiment 1 and Experiment 2 were carried out in accordance with the recommendations and approval of the Institutional Review Board, Division of Research, Texas A&M University with written informed consent from the parents/guardians of all infant participants. All parents/guardians gave written informed consent in accordance with the Declaration of Helsinki.

Infants were recruited from commercially produced lists, birth announcements in the local newspaper, and through social media. Parents were offered \$5 or a lab T-shirt for participation. This study was carried out in accordance with the recommendations of the Institutional Review Board of Texas A&M University with written informed consent from parents of all participants. All parents gave written informed consent in accordance with the Declaration of Helsinki.

#### Task and Procedure

Infants sat on their parent's lap or in a Bumbo® seat in a quiet and darkened room and watched the event to which they were assigned for four test trials, in a puppet-stage apparatus. Trained experimenters produced the test events live following a precise script. For the infants tested in Wilcox et al. (2010, 2012), test trials were 20 s in duration; for the infants tested in Wilcox et al. (2014a) test trials were 24 s in duration. Because analysis of the optical imaging data requires baseline recordings of the measured intensity of refracted light, each test trial was preceded by a 10 s baseline interval during which time a curtain covered the front opening and stage of the apparatus. The curtain was raised to begin each test trial.

<sup>1</sup> Infants from Wilcox et al. (2010) and Wilcox et al. (2012) saw either a shape-difference, color-difference, or control event. However, only 17 infants from these two studies were aged 7–9 months. To increase the number of intermediate aged infants included in the sample, infants aged 7–9 months were also drawn from Wilcox et al. (2014a), for a final intermediate age sample size of n = 50. Infants in Wilcox et al. (2014a) were not presented with a color-difference event; hence, most infants in the intermediate group saw either a shape-difference or control event.

Looking behavior was monitored by two independent observers who watched the infants through peepholes in cloth-covered frames attached to the side of the apparatus. Interobserver agreement averaged 95% across all infants tested.

#### Instrumentation

The imaging equipment contained four fiber optic cables that delivered near-infrared light to the scalp of the participant (emitters), eight fiber optic cables that detected the diffusely reflected light at the scalp (detectors), and an electronic control box that served as the source of the near-infrared light and the receiver of the reflected light. The control box produced light at wavelengths of 690 nm, which is more sensitive to deoxygenated blood, and 830 nm, which is more sensitive to oxygenated blood, with two laser-emitting diodes (TechEn Inc). Laser power emitted from the end of the diode was 4 mW. Light was square wave modulated at audio frequencies of approximately 4–12 kHz. Each laser had a unique frequency so that synchronous detection could uniquely identify each laser source from the photodetector signal. Each emitter delivered both wavelengths of light and each detector responded to both wavelengths. The signals received by the control box were processed and relayed to a Dell desktop computer. A custom computer program recorded and analyzed the signal. Prior to test, infants were fitted with a custom-made headgear that secured the fiber optics to the scalp.

Configuration of the sources and detectors within the headgear, placement of the sources and detectors on the infant's head, and location of the measurement channels are displayed in **Figure 2A**. Source-detector separation was 2 cm. The headgear was not elastic so the distance between sources and detectors remained fixed. The headgear was placed on the infant's head using O1 as the anchor. For the purpose of this paper, we report only the data collected at O1 and T5. Head circumference of the infants tested ranged from 41 to 49 cm. Hence, the distance between O1 and T5 (1/5 of the head circumference) ranged from 8.2 to 9.8 cm. Although head circumference did vary, the area of the skull (and underlying neural structures) affected was relatively small and, importantly, was smaller than the separation between each source and detector.
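The distance range quoted above is simple arithmetic on the 1/5-of-circumference rule stated in the text. A minimal Python sketch of that calculation follows; the function name is illustrative and is not part of the authors' software.

```python
def o1_t5_distance_cm(head_circumference_cm: float) -> float:
    """Approximate O1-T5 scalp distance as 1/5 of head circumference,
    following the rule stated in the text."""
    return head_circumference_cm / 5.0


# Smallest and largest head circumferences reported for Experiment 1
for hc in (41.0, 49.0):
    print(f"HC = {hc:.0f} cm -> O1-T5 distance = {o1_t5_distance_cm(hc):.1f} cm")
```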

#### Processing of fNIRS Data

The fNIRS data were processed, for each of the four detectors separately, using the same protocol (see Wilcox et al., 2010). Briefly, the raw signals were acquired at the rate of 200 samples per second, digitally low-pass-filtered at 10 Hz, a principal components analysis was used to design a filter for systemic physiology and motion artifacts, and the data were converted to relative concentrations of oxygenated (HbO) and deoxygenated (HbR) blood using the modified Beer-Lambert law. Changes in HbO and HbR were examined using the following time epochs: the 2 s prior to the onset of the test event, the 20 s (data from Wilcox et al., 2010, 2012) or 24 s (data from Wilcox et al., 2014a) test event, and the 10 s following the test event. The mean optical signal from −2 to 0 s (baseline) was subtracted from the signals and other segments of the time epoch were interpreted relative to this zeroed baseline.
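The baseline-zeroing step described above can be sketched as follows. This is an illustration only, not the authors' custom program: it assumes a single-trial trace stored as a plain list that begins 2 s before event onset and is sampled at the stated 200 samples per second. The function name and the toy values are hypothetical.

```python
from statistics import mean

SAMPLE_RATE_HZ = 200  # acquisition rate stated in the text


def baseline_correct(signal, baseline_s=2.0, rate_hz=SAMPLE_RATE_HZ):
    """Subtract the mean of the pre-onset baseline from every sample.

    `signal` is assumed to start `baseline_s` seconds before event onset,
    so the first baseline_s * rate_hz samples form the -2 to 0 s baseline.
    """
    n_base = int(round(baseline_s * rate_hz))
    base = mean(signal[:n_base])
    return [x - base for x in signal]


# Toy trial (made-up values): 2 s of baseline at 0.5, then 1 s at 0.8
trial = [0.5] * 400 + [0.8] * 200
corrected = baseline_correct(trial)
```

After this step the baseline segment averages zero, so later segments of the epoch can be read as changes relative to baseline.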

Optical signals were averaged across trials and then infants for each event. Trials objectively categorized as containing motion artifacts (a change in the filtered intensity greater than 5% in 1.20 s during the 2 s baseline and test event) and in which infants failed to attend to the event were eliminated from the mean. These criteria eliminated 51 (of a possible 372), 41 (of a possible 200), and 56 (of a possible 220) trials in the young, intermediate, and old age groups, respectively. The percentage of missing trials was significantly greater for the intermediate than young age group, z = −2.108, p = 0.035 (two-tailed), but did not differ significantly for the intermediate and old age group, z = −1.203, p > 0.05. These data indicate that around 7–8 months it becomes more difficult for infants to successfully complete a full complement of test trials. It is interesting to note that whereas the age groups did not differ significantly in their attrition rates (reported in Section Participants), they did differ in the quantity of data that was collected within a test session. We suspect that once infants become independently mobile and can actively engage in reaching and object manipulation without trunk support, around 7 months of age, they become less cooperative in experiments that involve watching objects. This makes collection of fNIRS data, which is sensitive to motion artifacts, more challenging in older infants.
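The text does not say which form of Z-test was used for the comparison of eliminated trials. As an illustration, the sketch below assumes a pooled two-proportion z-test; with the trial counts given above, this assumption yields z values close to those reported (about −2.11 and −1.20). The function name is hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-tailed p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal distribution
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


# Eliminated trials / possible trials, as reported in the text
z_young_vs_mid, p_young_vs_mid = two_proportion_z(51, 372, 41, 200)
z_mid_vs_old, p_mid_vs_old = two_proportion_z(41, 200, 56, 220)
```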

#### Results and Discussion

#### Looking Time Data

For each age group, duration of looking time data (in seconds) were averaged across trials and infants for each event and a one-way ANOVA was conducted with event as the between-subjects factor<sup>2</sup>. The main effect of event was not significant for any of the three age groups (p > 0.05). The mean (standard deviation) looking times of the young, intermediate, and old age groups were 16.52 s (2.90 s), 17.99 s (1.55 s), and 16.99 s (2.45 s), respectively.

#### Hemodynamic Responses

For each age group, relative changes in HbO were averaged, for each event and channel, over 7–20 s (infants tested in Wilcox et al., 2010, 2012) or 7–24 s (infants tested in Wilcox et al., 2014a). This interval was chosen because the first emergence of the object to the right of the screen began at 5 s and, allowing 2 s for the hemodynamic response to become initiated, hemodynamic changes should be detectable by 7 s and persist until the end of the trial (see Wilcox et al., 2010 for supporting evidence). Statistical analyses are reported here for HbO responses only, which are more robust than HbR responses (Strangman et al., 2002). However, HbR data are reported in Supplementary Materials.
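The window-averaging step can be illustrated with a short sketch. It assumes a baseline-corrected single-trial trace stored as a list, sampled at 200 samples per second and beginning 2 s before event onset; these layout details and the function name are assumptions for illustration, not a description of the authors' software.

```python
from statistics import mean


def window_mean(hbo, start_s, end_s, rate_hz=200, pre_onset_s=2.0):
    """Mean of a trace between start_s and end_s (seconds after event onset).

    The trace is assumed to begin `pre_onset_s` seconds before onset.
    """
    i0 = int(round((pre_onset_s + start_s) * rate_hz))
    i1 = int(round((pre_onset_s + end_s) * rate_hz))
    return mean(hbo[i0:i1])


# Toy trace (made-up values): flat at 0 until 7 s after onset, then 0.4 until 20 s
trace = [0.0] * int((2 + 7) * 200) + [0.4] * int(13 * 200)
mean_hbo_7_20 = window_mean(trace, 7, 20)
```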

<sup>2</sup>Recall that the infants in the intermediate age group who were drawn from Wilcox et al. (2014a) had a test trial length of 24 s rather than 20 s. The looking time data of these infants was adjusted for trial length (duration of looking × 0.833).

For each age group, preliminary analyses were conducted to assess the extent to which mean HbO responses could be explained by event or sex. In all analyses, no main effects or interactions involving these factors were obtained (p > 0.05). Hence, in subsequent analyses the data were collapsed across event and sex.

Two sets of analyses were performed on HbO responses. First, for each age group, mean responses obtained at channels 4 and 5 (posterior temporal cortex) and channels 8 and 9 (occipital cortex) were compared to 0. The outcome of these analyses, including Cohen's d effect sizes (Cohen, 1988), are reported in **Table 1**. For the young age group, a significant increase in HbO was obtained in both occipital channels (large effect sizes) and in both posterior temporal channels (large/medium effect sizes). For the intermediate age group, a significant increase in HbO was obtained in both occipital channels (large/medium effect sizes) and in both posterior temporal cortex channels (medium/small effect sizes). For the older infants, a significant increase in HbO was obtained in both occipital channels (large effect sizes) and in one posterior temporal channel (small effect size). In sum, significant activation, with medium to large effect sizes, was obtained in all occipital channels at all ages. In contrast, whereas strong activation was obtained in the posterior temporal cortex in the youngest age group it waned over time, and by 11–12 months only one channel showed activation and the magnitude of this response, as indicated by the effect size, was small and of little practical significance.
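For readers who wish to see how this kind of comparison against zero is computed, the sketch below shows a one-sample t statistic, the corresponding Cohen's d, and the Benjamini and Hochberg (1995) step-up procedure. It uses only the Python standard library and made-up values. Cohen's d is taken here as the mean divided by the sample standard deviation, which is an assumption about the formula used. Converting t to a p-value requires a t-distribution and is omitted.

```python
import math
from statistics import mean, stdev


def one_sample_t_and_d(values, mu=0.0):
    """Return (t statistic, Cohen's d) for a one-sample comparison against mu."""
    n = len(values)
    m, sd = mean(values), stdev(values)  # stdev uses the n - 1 denominator
    d = (m - mu) / sd
    t = d * math.sqrt(n)
    return t, d


def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which p-values pass the BH step-up test."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            passed[idx] = True
    return passed


# Made-up mean HbO values for one channel (not data from the study)
hbo = [0.2, 0.5, 0.1, 0.4, 0.3]
t, d = one_sample_t_and_d(hbo)

# Made-up p-values for four channels (not data from the study)
flags = benjamini_hochberg([0.001, 0.03, 0.035, 0.20])
```

In the toy p-values, 0.03 exceeds its own rank threshold (0.025) but still passes because a larger p-value (0.035) meets its threshold (0.0375); this is the step-up property of the procedure.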

*One sample t-tests were used to compare mean responses at each of the four channels, within the two cortical areas, to zero. Two-tailed p-values that passed the Benjamini and Hochberg (1995) test for multiple comparisons are indicated by asterisks:* \**p* < *0.05;* \*\**p* < *0.01;* \*\*\**p* < *0.001. Cohen's d values of 0.2, 0.5, and 0.8 are considered small, medium, and large effect sizes, respectively (Cohen, 1988).*

*Partial correlations controlled for head circumference. One-tailed p-values that passed the Benjamini and Hochberg (1995) test are indicated by asterisks:* \**p* < *0.05;* \*\**p* < *0.01;* \*\*\**p* < *0.001.*

Next, correlational and partial correlational analyses were conducted to determine the relation between HbO responses at each of the four channels, age in days, and head circumference (HC). The correlational analyses (**Table 2**) revealed a significant negative correlation between age in days and HbO responses obtained in channels 4 and 8. Age in days and HC were positively correlated, as expected. Partial correlation analyses (**Table 2**) revealed that the negative correlation between age in days and HbO responses in channels 4 and 8 remained significant, even when controlling for HC. Plots of the partial correlations are displayed in **Figure 4**. These plots illustrate the fact that while the partial correlations were significant in channels 4 and 8, the effect sizes are relatively small. The negative correlation between age in days and HbO responses at channel 4 was predicted and is consistent with the group result reported above. The fact that age was not significantly, negatively correlated with HbO responses obtained in channel 5 was unexpected. This outcome suggests that HbO responses did not decrease linearly during the first year, but instead dropped exponentially at some point in time. The reported effect sizes (**Table 1**) suggest that the decline was greatest between the young and the intermediate age group, which is evident to some extent on the plots of the partial correlations. Finally, we were surprised by the negative correlation between age in days and HbO responses in channel 8. This outcome suggests that while occipital responses are robust at all ages, there may be subtle age-related changes that are not easily identifiable in smaller sample sizes. We will return to this in the General Discussion.
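As an illustration of what controlling for head circumference means here, the sketch below computes a first-order partial correlation from the three pairwise Pearson correlations. This is the standard textbook formula, not necessarily the routine the authors ran. All values are made up, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Made-up values (not study data)
age_days = [100, 150, 200, 250, 300, 350]
head_circ_cm = [41.0, 42.5, 43.0, 45.0, 46.5, 47.0]
hbo = [0.9, 0.7, 0.8, 0.4, 0.3, 0.2]

r_age_hbo = pearson_r(age_days, hbo)
r_age_hbo_given_hc = partial_r(age_days, hbo, head_circ_cm)
```

If the age and HbO correlation were driven entirely by head circumference, the partial correlation would shrink toward zero; a partial correlation that stays clearly negative indicates an age effect beyond head size.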

The results of Experiment 1 confirm that age-related changes in posterior temporal activation during visual object processing are marked and cannot be explained by changes in head circumference. As expected, age-related changes in occipital cortex were not evident in the group analyses and associated effect sizes; however, correlational analyses suggested that subtle age-related changes might exist. These findings will be discussed in more detail in the General Discussion. Experiment 2 was conducted to explore the extent to which age-related changes are observed in posterior temporal cortex when a larger array of channels is used.

# EXPERIMENT 2

Infants aged 4–6 and 10–12 months were presented with shape-difference, color-difference, and control events. These are all newly collected data. We focused on the younger and older infants because the change in activation in posterior temporal cortex is most pronounced between these two age groups. The headgear (**Figure 2B**) was designed to assess activation in posterior temporal cortex in areas nearby (but not identical to) the areas assessed in Experiment 1. Three of these channels had a source-detector distance of 2 cm and the other three channels had a source-detector distance of 3 cm. This allowed us to determine the extent to which activation was obtained at areas close, laterally and in depth, to those assessed in Experiment 1.

#### Materials and Methods

#### Participants

Infants aged 125–208 days (n = 18, M days = 168, 12 males, and 6 females) and aged 314–400 days (n = 16, M days = 352, 10 males, and 6 females) were tested. For ease of description these will be referred to as 4- to 6-month-olds (young age group) and 10- to 12-month-olds (old age group). All infants viewed shape-difference, color-difference, and control events. Given fewer measurement channels and improvements in headgear design, we were able to implement a within-subjects, rather than between-subjects, design. In the young and old age group an additional 8 and 13 infants were tested, respectively, but excluded from analyses because of poor optical signal, failure to attend to the display, procedural problems, or crying. The percentage of infants who were tested but failed to contribute data did not differ significantly for the young (30.8%) and old (38.5%) age groups, p > 0.05 (Z-test). The attrition rates observed in Experiment 2 are higher than those observed in Experiment 1, most likely due to the greater number of trials with which infants were presented (i.e., a lengthier experimental protocol). The race/ethnicity of the infants as reported by their parents was Caucasian (n = 29), Hispanic (n = 3), or mixed race/other (n = 2). Infants were recruited from commercially produced lists, birth announcements in the local newspaper, and social media websites. Parents were offered \$5 or a lab T-shirt for participation. This study was carried out in accordance with the recommendations of the Institutional Review Board of Texas A&M University with written informed consent from parents of all participants. All parents gave written informed consent in accordance with the Declaration of Helsinki.

#### Task and Procedure

The task and procedure were identical to that of Experiment 1 except that infants viewed all three events, for a total of 12 test trials (20 s each), rather than viewing one of the three events for 4 test trials. Infants saw the events in one of three randomly assigned orders (shape, control, color; control, shape, color; or color, shape, control). Inter-observer agreement averaged 91% across all infants tested.

#### Instrumentation

Instrumentation was similar to that of Experiment 1 with the exception of the headgear configuration (**Figure 2B**). One source, anchored at T5, and six detectors were used to create six measurement channels. Three of the detectors were placed 2 cm from the source and each of these had a corresponding detector placed 3 cm from the source, allowing for measurement at two cortical depths in three regions. The headgear was not elastic so the distance between the source and detectors remained fixed. The mean head circumference for the younger and older groups was 42.6 cm (SD = 1.31) and 46.6 cm (SD = 1.72), respectively. Hence, for the two age groups the mean difference in the distance between O1 and T5 (1/5 of the head circumference) was 0.8 cm.

#### Processing of fNIRS Data

The fNIRS data were processed, for each detector separately, using a procedure similar to that of Experiment 1. Optical signals were averaged across trials and then infants for each event. Trials objectively categorized as containing motion artifacts and in which infants failed to attend to the event for at least 3 s were eliminated from the mean (we used a more liberal looking-time criterion than in prior studies to increase data retention; this more liberal criterion did not alter the outcome of the HbO analyses). On the basis of these criteria, in the younger group 22 (of 208 possible) trials were eliminated from analysis and in the older group 54 (of 216 possible) trials were eliminated. The number of missing trials (in relation to total number of trials) differed significantly for the two age groups, z = −3.48, p < 0.0002 (two-tailed test). As in Experiment 1, the older and younger age groups did not differ significantly in attrition rates, but they did differ in the quantity of data collected within a test session.

#### Results

#### Looking Time Data

For each age group, duration of looking time data (in seconds) were averaged across trials and infants for each event and a one-way repeated-measures ANOVA was conducted with event as a within-subjects factor. The main effect of event was not significant for either age group (p > 0.05). The mean (standard deviation) looking times of the young and old age groups were 15.77 s (2.13 s) and 16.14 s (2.82 s), respectively.

#### Hemodynamic Responses

For each age group, relative changes in HbO were averaged, for each event and channel, over 7–20 s. Next, preliminary analyses were conducted to assess the extent to which mean HbO responses could be explained by event or sex. No main effects or interactions involving these factors were obtained (p > 0.05) in either age group. Hence, in subsequent analyses the data were collapsed across event and sex.
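The window-averaging step can be sketched as follows. This is a minimal illustration, not the authors' processing code: the 10 Hz sampling rate, the function name, and the toy trace are assumptions made for the example; only the 7–20 s window comes from the text.

```python
def window_mean(samples, sampling_rate_hz, start_s, end_s):
    """Mean of the samples whose timestamps fall in [start_s, end_s].

    samples[0] is assumed to be recorded at t = 0 s (trial onset).
    """
    first = int(round(start_s * sampling_rate_hz))
    last = int(round(end_s * sampling_rate_hz))
    window = samples[first:last + 1]
    if not window:
        raise ValueError("window lies outside the recording")
    return sum(window) / len(window)

# Hypothetical HbO trace sampled at 10 Hz for 30 s: flat at 0, then 0.5 from 7 s on.
rate_hz = 10
trace = [0.0] * (7 * rate_hz) + [0.5] * (23 * rate_hz)
mean_hbo = window_mean(trace, rate_hz, 7, 20)
```

One such value per infant, event, and channel would then feed the group-level analyses.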

Preliminary analyses were also conducted to examine the extent to which HbO responses obtained at 2 and 3 cm source-detector distances differed. For each age group, mean responses obtained at each of the 6 channels (i.e., three pairs of channels, each pair including a 2 and 3 cm source-detector distance) in posterior temporal cortex were compared to 0 (see Supplementary Materials for HbO and HbR responses at each of the six channels). The outcome of these analyses indicates that very similar hemodynamic responses, with similar effect sizes, were obtained at channels 1 and 2, channels 3 and 4, and channels 5 and 6. This pattern held for both age groups. Hence, for the main analyses HbO data were averaged across the two channels of each pair to create 3 regions of interest (ROIs): I, II, and III, respectively. For illustrative purposes, the hemodynamic response curves at each of the six channels, grouped into the three areas of interest, are displayed in **Figure 5**.
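The pooling of channel pairs into regions of interest amounts to a simple average. A minimal sketch, assuming one mean HbO value per channel; the dictionary layout and the numbers are hypothetical, while the channel-to-ROI mapping follows the text:

```python
# Channel pairs as described in the text: each ROI pools a 2 cm and a 3 cm channel.
ROI_CHANNELS = {"I": (1, 2), "II": (3, 4), "III": (5, 6)}

def roi_means(channel_hbo):
    """Average HbO across the two channels of each pair.

    channel_hbo maps channel number (1-6) to a mean HbO value.
    """
    return {
        roi: sum(channel_hbo[ch] for ch in channels) / len(channels)
        for roi, channels in ROI_CHANNELS.items()
    }

# Hypothetical per-channel values for one infant (not data from the study).
example = {1: 0.10, 2: 0.14, 3: 0.02, 4: 0.06, 5: -0.01, 6: 0.03}
rois = roi_means(example)
```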

For each age group, mean responses obtained at each of the three ROIs were compared to 0. The outcome of these analyses (**Table 3**) revealed that for the young infants, a significant increase in HbO was obtained in all three ROIs, with medium to large effect sizes. For the older infants, no significant change in HbO was obtained in any of the ROIs. The effect sizes obtained with the older infants in Experiment 2 were equivalent to or smaller than those obtained with the older infants in posterior temporal cortex in Experiment 1. These results confirm and extend those of Experiment 1 by revealing that across many different channels, young but not old infants show activation in posterior temporal cortex during visual object processing.
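The comparison of each ROI mean to zero, and the effect size reported alongside it in **Table 3**, can be illustrated with a short sketch. It computes the one-sample t statistic and Cohen's d from a list of per-infant responses; the data are hypothetical and the function name is an assumption. Converting t to a p-value requires the t distribution, which is left to a statistics library.

```python
import math
import statistics

def one_sample_t_and_d(values, mu=0.0):
    """Return (t, df, d) for a one-sample comparison of `values` against `mu`."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample SD (n - 1 denominator)
    if sd == 0:
        raise ValueError("standard deviation is zero")
    d = (mean - mu) / sd               # Cohen's d for a one-sample design
    t = d * math.sqrt(n)               # equivalent to (mean - mu) / (sd / sqrt(n))
    return t, n - 1, d

# Hypothetical HbO responses for one ROI (not data from the study).
hbo = [0.2, 0.4, 0.1, 0.5, 0.3]
t_stat, df, cohen_d = one_sample_t_and_d(hbo)
```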

Correlation analyses for HC, age in days, and HbO responses were not conducted because HC and age in days (collapsed across the two age groups) were bi-modal in their distribution.

#### GENERAL DISCUSSION

The research reported here clearly shows that younger and older infants demonstrate different patterns of activation in posterior temporal cortex during a visual object-processing task. Experiments 1 and 2, combined, revealed that when viewing moving, occluded objects infants aged 3–6 months show robust activation in posterior temporal cortex, measured at eight different channels surrounding T5, whereas infants aged 10–12 months showed little if any activation in any of these channels. Hemodynamic responses did not vary by event: regardless of the featural characteristics of the objects, and whether the same object or two different objects were seen to each side of the occluder, the same pattern of results was obtained. Additional data reported in Experiment 1 revealed that (a) an intermediate age group consisting of 7- to 9-month-olds showed HbO responses in posterior temporal cortex at a magnitude lesser than the young age group and greater than the old age group and (b) age-related changes in posterior temporal responses across the first year are better explained by age in days than head circumference.

FIGURE 5 | … infants of Experiment 2. The number above each plot refers to the channel number and the Roman numerals refer to areas of interest. All regions are thought to lie within the posterior temporal cortex. The plot shows the mean HbO (red lines), HbR (blue lines), and HbT (green lines) curves in µM cm. Time is on the x-axis: the first black line on each plot denotes the onset of the 20 s test trial and the second black line denotes the onset of the 10 s baseline interval.

The pattern of results obtained in the occipital cortex contrasted sharply with that obtained in posterior temporal cortex. Experiment 1 revealed strong hemodynamic responses in occipital cortex in all age groups (young, intermediate, old) to all test events. This outcome suggests that posterior temporal and occipital cortex play unique roles in visual object processing and, most relevant to the present discussion, that the contribution of posterior temporal cortex to infants' processing of moving occluded objects changes considerably during the first year.

### Explaining Age-Related Change in Posterior Temporal Cortex

How do we interpret the age-related response obtained in posterior temporal cortex? One possibility is that these results reflect structural changes in the brain (e.g., increased density of neural tissue) or skull (e.g., increased skull thickness) that impede our ability to detect HbO responses. There are a number of reasons to question the viability of this explanation, the most notable being that activation has been obtained in posterior temporal cortex in infants older than 6 months (a) during these occlusion sequences but under different experimental conditions (Wilcox et al., 2014b) and (b) during other types of object processing tasks (Biondi and Wilcox, 2014, 2015). If structural changes interfered with our ability to measure activation in posterior temporal cortex, we would not expect to obtain responses in other experimental contexts.

An alternative, and more likely, possibility is that these results reflect functional maturation of the ventral object-processing pathway. In the adult, ventral object processing networks are not only hierarchically organized, but also distributed in their organization. For example, processing of inanimate objects elicits activation in a distributed network of areas in the lateral occipital complex (LOC) and ventral temporal cortex (as well as intraparietal sulcus) and this pattern is distinct from that activated in response to animate objects (Haxby et al., 2001; Xu and Chun, 2006; Xu, 2009; Naughtin et al., 2014; Jacques et al., 2015). It is possible that object processing networks are not as discretely organized in the young infant, but become refined with time and experience. There are two lines of evidence that support the idea of functional pruning in ventral object processing areas. First, areas in the occipital cortex become more selective in their response to visual stimuli between 2 and 3 months of age; whereas some responses are widely distributed around 2 months they become localized to posterior areas of the occipital cortex by 3 months (Watanabe et al., 2008, 2010). Second, there is evidence from nonhuman primate studies that the neural pathway critical to visual object recognition memory, which projects from the inferior temporal cortex to medial temporal lobe structures, has an abundance of connections early in infancy. By adulthood, some connections are eliminated entirely or become more refined in their distribution (Webster et al., 1991; Bachevalier and Mishkin, 1994). These two examples, although drawn from cortical areas that mediate different object-processing functions in the ventral pathway, provide evidence for the importance of functional pruning during infancy. There are a number of mechanisms by which this pruning could occur, including intrinsic neurobiological factors, early experience with the external environment, and self-organizing principles that lead to select patterns of connectivity within and between cortical areas (Bachevalier and Hagger, 1991; Homae et al., 2010; Johnson, 2010; Kolb et al., 2014).

TABLE 3 | Mean (SD) HbO responses for the young and old age groups of Experiment 2.

*One sample t-tests were used to compare mean responses at each of the three ROIs to zero. One-tailed p-values that passed the Benjamini and Hochberg (1995) test are indicated by asterisks:* \**p* < *0.05;* \*\**p* < *0.01;* \*\*\**p* < *0.001. Effect sizes as measured by Cohen's d are also reported (Cohen, 1988).*

We are less sure of how to explain the negative correlation between age and the magnitude of the response obtained in occipital channel 8. We cannot rule out a "structural change" explanation as we did for the posterior temporal cortex. In our studies we typically obtain significant HbO responses in occipital cortex to all occlusion events at all ages tested; we have not observed age-related changes in response to these or related visual events. In addition, in Experiment 1 the negative relation between age in days and HbO revealed in the correlation and partial correlation analyses was not reflected in the group analyses: we obtained significant activation, with large effect sizes, in occipital channels at all age groups tested. These results suggest that the HbO responses observed in occipital cortex are so robust from an early age that a decline in the magnitude of the response over the first year does not lead to a qualitative change in the outcome of the statistical analyses. Since there is no evidence for functional pruning of occipital areas for the processing of these events, at least not at the ages tested, we favor a structural change explanation for the negative correlation we observed between age in days and HbO responses. In other words, we hypothesize that the negative correlation obtained in occipital cortex reflects a different process than that observed in posterior temporal cortex. Of course, further investigation is needed to test this hypothesis.

### Object Processing and Visual Working Memory

Arguably, infants' processing and interpretation of occlusion sequences like those used in the current experiments draws heavily on visual working memory. Infants must keep track of objects, and their unique numerical identities, as the objects move in and out of view behind the occluding screen. We know from previous work that anterior temporal cortex, in addition to occipital and posterior temporal cortex, plays a unique role in infants' processing of these events (Wilcox et al., 2010, 2012, 2014a). Activation is obtained in anterior temporal cortex in response to the occlusion sequences when the individuation process is engaged, that is, when infants interpret the event as involving two numerically distinct objects. Activation is not obtained in anterior temporal cortex when the individuation process is not engaged. What are currently open to speculation are the specific processes mediated by these cortical areas: occipital cortex, posterior temporal cortex, and anterior temporal cortex. On the basis of what is currently known about adults' tracking of visual objects, we suspect that occipital cortex (and posterior temporal cortex in the younger infants) mediates short-term storage of occluded objects. For example, fMRI studies with adults have revealed that areas in LOC encode objects as whole entities rather than as parts (Malach et al., 1995; Grill-Spector, 2003; Kanwisher, 2003; Kourtzi and Connor, 2011), and are activated when feature sets change (Xu and Chun, 2006; Xu, 2009). However, it does not appear as though object features are bound to the objects at this stage in the processing (Xu, 2009; Naughtin et al., 2014). There is also evidence that LOC does not mediate the initiation and formation of distinct object representations, but is instead responsible for keeping track of already formed representations (Naughtin et al., 2014). Collectively, our data suggest that in the infant, anterior temporal cortex mediates the formation of distinct object representations, whereas the occipital cortex (and posterior temporal cortex in younger infants) is responsible for tracking those distinct entities through occlusion. The extent to which occipital areas are involved in infants' representation of feature sets, and the cortical basis of feature binding, are open to debate. The charge of future research is to identify the ontogeny of cortical networks that support object representation, individuation, and identification. This endeavor will shed light on principles of brain development, such as the conditions under which networks are pruned, and can enhance our understanding of the cognitive architecture that supports acquisition of object knowledge during the first year.

# AUTHOR CONTRIBUTIONS

TW contributed to the conception and design of the work, data analysis and interpretation, and preparing the manuscript. MB contributed to data acquisition and interpretation, and aided significantly in preparation of the manuscript. TW and MB both agree to be accountable for the work and had final approval of the submitted version.

### ACKNOWLEDGMENTS

The open access publishing fees for this article have been covered by the Texas A&M University Online Access to Knowledge (OAK) Fund, supported by the University Libraries and the Office of the Vice President for Research. We thank Melissa Wallace Klapuch, Amy Hirshkowitz, Laura Hawkins, and the staff of the Infant Cognition Lab at Texas A&M University for help with data collection and management, and the infants and parents who so graciously participated in the research. This work was supported by grants NSF BCS-0642996 and NIH R01-HD057999 to TW.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnsys.2015.00180


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Wilcox and Biondi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Corrigendum: Functional Activation in the Ventral Object Processing Pathway during the First Year

Teresa Wilcox \* and Marisa Biondi

*Infant Cognition Lab, Department of Psychology, Texas A&M University, College Station, TX, USA*

Keywords: infants, object processing, object processing pathway, ventral temporal cortex, cortical development

#### **A corrigendum on**

**Functional Activation in the Ventral Object Processing Pathway during the First Year** by Wilcox, T., and Biondi, M. (2016). Front. Syst. Neurosci. 9:180. doi: 10.3389/fnsys.2015.00180

**Table 3** of the Wilcox and Biondi (2016) article contained the incorrect p-values, which we hereby rectify. The original table contained two-tailed p-values rather than one-tailed p-values. The legend of **Table 3** indicated that one-tailed values were reported. We therefore re-submit **Table 3** with the correct one-tailed p-values.

#### Edited and reviewed by:

*Natasha Sigala, University of Sussex, UK*

\*Correspondence: *Teresa Wilcox twilcox@tamu.edu*

Received: *04 April 2016* Accepted: *18 April 2016* Published: *03 May 2016*

#### Citation:

*Wilcox T and Biondi M (2016) Corrigendum: Functional Activation in the Ventral Object Processing Pathway during the First Year. Front. Syst. Neurosci. 10:38. doi: 10.3389/fnsys.2016.00038*

# AUTHOR CONTRIBUTIONS

TW contributed to the conception and design of the work, data analysis and interpretation, and manuscript preparation. MB contributed to data acquisition and interpretation, and manuscript preparation.

# REFERENCES

Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statis. Soc. 57, 289–300. doi: 10.2307/2346101

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd Edn. Hillsdale, NJ: Lawrence Erlbaum Associates.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Wilcox and Biondi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.



*One sample t-tests were used to compare mean responses at each of the three ROIs to zero. One-tailed p-values that passed the Benjamini-Hochberg (Benjamini and Hochberg, 1995) test are indicated by asterisks:* \* *p* < *0.05;* \*\* *p* < *0.01;* \*\*\* *p* < *0.001. Effect sizes as measured by Cohen's d are also reported (Cohen, 1988).*
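The relationship between the two-tailed and one-tailed values, and the Benjamini-Hochberg step-up rule named in the legend, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis: the p-values are hypothetical (they are not the Table 3 values), the false discovery rate of 0.05 and the function names are chosen for the example, and the halving rule assumes a symmetric test statistic with the effect in the predicted direction.

```python
def one_tailed_from_two_tailed(p_two_tailed, effect_in_predicted_direction=True):
    """Convert a two-tailed p-value from a symmetric test (e.g., a t-test)."""
    half = p_two_tailed / 2.0
    return half if effect_in_predicted_direction else 1.0 - half

def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test passes at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            passed[idx] = True
    return passed

# Hypothetical two-tailed p-values for three ROIs (not the values in Table 3).
two_tailed = [0.012, 0.080, 0.030]
one_tailed = [one_tailed_from_two_tailed(p) for p in two_tailed]
passed_one_tailed = benjamini_hochberg(one_tailed)
passed_two_tailed = benjamini_hochberg(two_tailed)
```

With these made-up inputs, the second test passes only when the one-tailed values are used, which shows why the choice of tail matters for which entries carry asterisks.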

# Oscillatory Activity in the Infant Brain and the Representation of Small Numbers

Sumie Leung<sup>1</sup>, Denis Mareschal<sup>2</sup>, Renee Rowsell<sup>1</sup>, David Simpson<sup>1</sup>, Leon Iaria<sup>1</sup>, Amanda Grbic<sup>1</sup> and Jordy Kaufman<sup>1</sup>\*

<sup>1</sup> School of Health Sciences, Faculty of Health, Arts and Design, Swinburne University of Technology, Hawthorn, VIC, Australia, <sup>2</sup> Centre for Brain and Cognitive Development, Department of Psychological Sciences, Birkbeck, University of London, London, UK

Gamma-band oscillatory activity (GBA) is an established neural signature of sustained occluded object representation in infants and adults. However, it is not yet known whether the magnitude of GBA in the infant brain reflects the quantity of occluded items held in memory. To examine this, we compared GBA of 6–8 month-old infants during occlusion periods after the representation of two objects vs. that of one object. We found that maintaining a representation of two objects during occlusion resulted in significantly greater GBA relative to maintaining a single object. Further, this enhancement was located in the right occipital region, which is consistent with previous object representation research in adults and infants. We conclude that enhanced GBA reflects neural processes underlying infants' representation of small numbers.

#### Edited by:

Zsuzsa Kaldy, University of Massachusetts Boston, USA

#### Reviewed by:

Teresa Wilcox, Texas A&M University, USA Veronica Mazza, University of Trento, Italy

#### \*Correspondence:

Jordy Kaufman jkaufman@swin.edu.au

Received: 17 September 2015 Accepted: 18 January 2016 Published: 08 February 2016

#### Citation:

Leung S, Mareschal D, Rowsell R, Simpson D, Iaria L, Grbic A and Kaufman J (2016) Oscillatory Activity in the Infant Brain and the Representation of Small Numbers. Front. Syst. Neurosci. 10:4. doi: 10.3389/fnsys.2016.00004

Keywords: gamma-band activity, object permanence, small numbers, infancy, electroencephalogram, object processing

# INTRODUCTION

How and whether infants appreciate that an out-of-sight object continues to exist remains a fundamental question in child psychology and developmental cognitive neuroscience. Based on Piaget's original observations that infants under 9 months do not reach for hidden objects (Piaget, 1954), it was widely held that infants lack object permanence. However, recent studies measuring infants' looking behavior suggested that infants as young as 2.5 months of age expect the continued existence of hidden objects (Wang et al., 2005), as they look longer at events that violate the permanence and solidity of objects than at events that do not have such violations. Electrophysiological and neuroimaging studies have revealed several possible underlying neural mechanisms for object retention in around 6-month-old infants (e.g., Csibra et al., 2000; Baird et al., 2002; Kaufman et al., 2003, 2005; Wilcox et al., 2005; Wilcox and Biondi, 2015).

One of these mechanisms is gamma-band synchronized neural activity (GBA), which underlies infants' object tracking ability (Kaufman et al., 2003, 2005; Southgate et al., 2008). Specifically, increased GBA at infants' posterior temporal cortex was observed whenever an object was occluded (Kaufman et al., 2003). Importantly, this increase in GBA was not associated with the object's disappearing state per se, but occurred most prominently when the manner of disappearance was consistent with the object's continued existence (Kaufman et al., 2005). Such findings are similar to the enhanced GBA observed during a period in which adults needed to hold an object representation in short-term memory (Tallon-Baudry et al., 1998). This enhancement has also been demonstrated to be specific to holding hidden objects in infants' memory, as such an increase was not seen with hidden faces (Southgate et al., 2008).

Although the importance of GBA for infants' object processing has been established, it is not yet known whether the magnitude of this GBA relates to the amount of information infants maintain during object occlusion. Behavioral studies that examined infants' object working memory capacity have been mainly divided into two lines of research: "how many" and "what", with the former focusing on the number of individual objects that infants could track, and the latter focusing on the number of specific objects infants could identify (see Kibbe and Leslie, 2013). In the "how many" studies, infants as young as 4 months old could keep track of more than one hidden object at a time (Wynn, 1992; Mareschal and Johnson, 2003), and they had an upper limit of about three objects in the first year of life (Feigenson and Carey, 2003, 2005). These studies required infants to use spatiotemporal cues to individuate objects. They did not need to identify a distinct feature of the object (Xu and Carey, 1996; Leslie et al., 1998; Xu, 1999). In contrast, the "what" studies showed that infants of 6.5 months and younger could only hold the identity of one single item in short-term memory (Káldy and Leslie, 2003, 2005; Ross-Sheehy et al., 2003), as this line of research required infants to recall featural information to individuate objects (Wilcox and Baillargeon, 1998; Wilcox, 1999; Wilcox and Schweinle, 2002; Wilcox et al., 2010).

The different upper limits in infants' ability to retain the quantity vs. the identity of objects could be explained by how the brain processes different traits of an object differently, and the immaturity of these processes in infants. There are two routes for visual object processing: the dorsal route mainly processes spatial and temporal object information involved in guided action, such as location, whereas the ventral route mainly processes information that identifies an object (e.g., Ungerleider and Mishkin, 1982; Livingstone and Hubel, 1988; Milner and Goodale, 1995). Although these routes are no longer thought to be as independent as they once were (see, for example, Merigan and Maunsell, 1993; Puce et al., 1998; Humphreys and Jane Riddoch, 2003; Puce and Perret, 2003), numerous developmental authors invoke the dual stream hypothesis as one of the most important heuristic frameworks for understanding early human infant-object interactions (e.g., Leslie et al., 1998; Xu et al., 1999; Atkinson, 2000; Johnson et al., 2001; Wilcox and Schweinle, 2002; Káldy and Sigala, 2004). Of note is the finding that 4-month-old infants are capable of recalling the feature (via the ventral route) or the location (via the dorsal route) of an object separately, but are unable to recall the combined feature and location information, suggesting that their ability to integrate information processed separately by the dorsal and ventral visual processing routes during occlusions is limited (Mareschal et al., 1999; Kaufman et al., 2003; Mareschal and Johnson, 2003; Mareschal and Bremner, 2006).

Infants' attenuated GBA for hidden faces led researchers to believe that the GBA during occlusion does not reflect the ventral route of visual processing (Southgate et al., 2008). However, it has not been examined whether the GBA observed in the previous occlusion studies (Kaufman et al., 2003, 2005; Southgate et al., 2008) underlies the activity of the dorsal route, which processes spatiotemporal information that allows infants to individuate objects. The aim of the present study is to answer the question of whether the amount of GBA reflects the number of items that become occluded. If so, this could indicate that the GBA observed in the previous occlusion studies reflects the processing of spatiotemporal information. As previous studies have shown an increase of brain activity in the alpha- and gamma-band when adults were asked to hold more items in their memory (Howard et al., 2003; Palva et al., 2011; Spitzer et al., 2014), we hypothesize that the GBA observed in infants' object tracking would increase with the number of objects being occluded.

#### MATERIALS AND METHODS

#### Participants

Twenty-eight full-term 6–8 month-olds (M = 212 days; 14 male, 14 female) participated in this experiment. An additional 13 infants were tested but were excluded from further analysis due to insufficient trial counts (fewer than 10 trials per condition) caused by fussiness or motion artifact. The study was approved by the Human Research Ethics Committee, Swinburne University of Technology, and written informed consent was obtained from the parents of all infant participants.

### Data Acquisition

Infants sat in a dimly-lit room on a parent's lap, 60 cm from the stimulus monitor. EEG was recorded with Netstation 4.3.1 acquisition software and an NA300 amplifier from a Hydrocel Geodesic Sensor Net comprising 124 electrodes. Online, EEG data were sampled at 500 Hz and were referenced to the vertex electrode. Infants' looking behavior was monitored and video-recorded simultaneously with the EEG data.

#### Paradigm

The experiment began with a stationary digital color photo of either two objects shown side by side, one object on the left side of the monitor, one object on the right side of the monitor, or no object. The object(s) were fully visible for 780 ms ("fully-visible period"). A gray screen then moved gradually upwards until it covered the object(s) entirely and was fully "up"; this process took 600 ms. The objects remained completely occluded for 600 ms ("complete-occlusion period"). The gray screen then started to come down and revealed the next object(s), a process that also took 600 ms (see **Figure 1**). An experimenter who monitored the infants' looking behavior would pause the experiment and play a movie to re-engage infants' attention to the monitor before resuming the experiment. The conditions were presented in pseudorandom order, with the 2-object and 1-object stimuli being presented no more than three times in a row, and the no-object stimulus never being presented twice in a row. The purpose of having a no-object stimulus was to introduce randomness to the paradigm, thus there were fewer no-object presentations. An average of 53.36 (range: 31–83) and 50.86 (range: 29–92) stimuli were presented for the 1-object and 2-object conditions, respectively, while the average number of presentations of the no-object stimulus was 32.14 (range: 7–60). A researcher monitored infants' looking behavior via video link from another room, and whenever an infant looked away, would play a cartoon on the screen (with sound) to attempt to re-engage attention. The study was resumed when the infants looked at the screen again, and continued as long as the infants were happy.
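The ordering constraints described above (no more than three consecutive 1-object or 2-object stimuli, and never two consecutive no-object stimuli) can be sketched with a constrained random draw. This is a hypothetical generator, not the presentation script used in the study: the sampling weights, seed, and function names are assumptions, and the text says only that no-object stimuli were less frequent.

```python
import random

# Maximum number of consecutive presentations allowed per condition (from the text).
MAX_RUN = {"1-object": 3, "2-object": 3, "no-object": 1}

def run_length(sequence, condition):
    """Length of the run of `condition` at the end of `sequence`."""
    count = 0
    for item in reversed(sequence):
        if item != condition:
            break
        count += 1
    return count

def make_sequence(n_trials, weights=None, seed=None):
    """Draw a stimulus order that respects the run-length limits in MAX_RUN."""
    rng = random.Random(seed)
    # Hypothetical sampling weights: no-object stimuli are drawn less often.
    weights = weights or {"1-object": 2, "2-object": 2, "no-object": 1}
    sequence = []
    for _ in range(n_trials):
        allowed = [c for c in MAX_RUN if run_length(sequence, c) < MAX_RUN[c]]
        chosen = rng.choices(allowed, weights=[weights[c] for c in allowed])[0]
        sequence.append(chosen)
    return sequence

def respects_limits(sequence):
    """True if no condition exceeds its maximum run length anywhere in the sequence."""
    for i in range(len(sequence)):
        if run_length(sequence[: i + 1], sequence[i]) > MAX_RUN[sequence[i]]:
            return False
    return True

order = make_sequence(60, seed=1)
```

Because at most one condition can be at its run limit at any point, the list of allowed conditions is never empty and the draw cannot stall.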

# Data Analysis

EEG data were bandpass filtered (1–100 Hz, 12 dB/octave, 50 Hz notch). As we were interested in GBA as a function of the number of objects being occluded, we grouped the data into two stimulus conditions, 2-object and 1-object, and we analyzed the GBA during the period in which the screen was fully up and stationary and the objects were fully occluded (herein referred to as the "complete-occlusion period"). For each of the stimulus conditions, EEG data were segmented from 1018 ms before the time when the screen was fully "up" (herein referred to as "screen-up") to 982 ms post screen-up, and an independent component analysis (ICA) was applied to remove eye movement and blink artifacts for the whole segment. An automatic rejection was then applied, where segments with EEG amplitude variations larger than 200 µV between 182 ms pre screen-up and 818 ms post screen-up were rejected. Segments were rejected if infants looked less than a total of 200 ms during the fully-visible period and less than a total of 300 ms during the complete-occlusion period. This resulted in an average of 29.65 (SD = 11.90) and 28.05 (SD = 15.10) segments for the 1-object and 2-object conditions, respectively. There were at least 10 accepted segments for each of the conditions (1-object and 2-object) for each infant. In this paradigm, no baseline correction was used because (1) our two conditions are comparable and independent of each other, especially as our expected effect is a tonic, rather than a phasic, modulation of GBA; and (2) there is no period prior to the occlusion period that is the same in the two conditions, as the periods prior to occlusion contain either one or two visible objects, respectively. We therefore used the 1-object condition as the "baseline" for the 2-object condition.
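The segmentation and automatic rejection steps can be sketched for a single channel. This is a minimal illustration under stated assumptions, not the Netstation pipeline: the "amplitude variation" criterion is interpreted here as the peak-to-peak range (maximum minus minimum) within the rejection window, the segment is treated as a half-open interval, and the function names are hypothetical. The sampling rate, latencies, and threshold come from the text.

```python
SAMPLING_RATE_HZ = 500          # from the text
SEGMENT_MS = (-1018, 982)       # relative to screen-up, from the text
REJECTION_MS = (-182, 818)      # window checked for large amplitude variations
THRESHOLD_UV = 200.0

def ms_to_samples(ms):
    """Convert a latency in ms to a sample offset at SAMPLING_RATE_HZ."""
    return round(ms * SAMPLING_RATE_HZ / 1000)

def extract_segment(channel, screen_up_index):
    """Slice one channel from SEGMENT_MS[0] up to (not including) SEGMENT_MS[1]."""
    start = screen_up_index + ms_to_samples(SEGMENT_MS[0])
    stop = screen_up_index + ms_to_samples(SEGMENT_MS[1])
    if start < 0 or stop > len(channel):
        raise ValueError("segment extends beyond the recording")
    return channel[start:stop]

def is_rejected(segment):
    """True if max - min within the rejection window exceeds THRESHOLD_UV."""
    offset = -ms_to_samples(SEGMENT_MS[0])        # index of screen-up in the segment
    lo = offset + ms_to_samples(REJECTION_MS[0])
    hi = offset + ms_to_samples(REJECTION_MS[1])
    window = segment[lo:hi]
    return max(window) - min(window) > THRESHOLD_UV

# Hypothetical flat recording (in µV) with screen-up at sample 1000.
recording = [0.0] * 2000
segment = extract_segment(recording, 1000)
```

At 500 Hz the 2000 ms segment spans 1000 samples, with screen-up at index 509 of the segment.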

Induced GBA was obtained by applying a continuous wavelet transformation to the accepted segments of each electrode (Morlet wavelets with 21 frequency steps in the 30–50 Hz range). Average wavelet coefficients for each infant were calculated by taking the mean spectral amplitude (in µV) across segments during the complete-occlusion period, in two 300 ms bins (0–300 ms; 300–600 ms). Given that we previously found that the object permanence GBA is located in the right posterior temporal cortex (Kaufman et al., 2003, 2005), we first grouped 48 posterior channels into six different regions: Temporal-Parietal-Left (TPL); Temporal-Parietal-Central (TPC); Temporal-Parietal-Right (TPR); Occipital-Left (OL); Occipital-Central (OC); Occipital-Right (OR) (see **Figure 2**). We then calculated the mean gamma-band wavelet coefficients of the eight electrodes in each of these regions.
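The wavelet step can be illustrated with a small pure-Python sketch of a complex Morlet transform applied to one segment. It is not the software used in the study: the seven-cycle wavelet width, the truncation at three standard deviations, the amplitude normalization, and all function names are assumptions. The 21 frequencies are taken here as 1 Hz steps from 30 to 50 Hz, which is what 21 steps would imply if both endpoints are included.

```python
import cmath
import math

GAMMA_FREQS_HZ = [30 + k for k in range(21)]   # assumed 1 Hz spacing, 30-50 Hz

def morlet(freq_hz, sampling_rate_hz, n_cycles=7.0):
    """Complex Morlet wavelet, scaled so that a unit-amplitude sinusoid at
    freq_hz yields an amplitude close to 1."""
    sigma_t = n_cycles / (2.0 * math.pi * freq_hz)          # temporal SD in seconds
    half = int(math.ceil(3.0 * sigma_t * sampling_rate_hz))
    times = [k / sampling_rate_hz for k in range(-half, half + 1)]
    envelope = [math.exp(-t * t / (2.0 * sigma_t ** 2)) for t in times]
    scale = 2.0 / sum(envelope)
    return [scale * g * cmath.exp(2j * math.pi * freq_hz * t)
            for g, t in zip(envelope, times)]

def amplitude_at(signal, index, wavelet):
    """Magnitude of the wavelet coefficient centred on signal[index]."""
    half = len(wavelet) // 2
    if index - half < 0 or index + half >= len(signal):
        raise ValueError("wavelet does not fit around this sample")
    acc = 0j
    for k, w in enumerate(wavelet):
        acc += signal[index - half + k] * w.conjugate()
    return abs(acc)

def mean_band_amplitude(signal, sampling_rate_hz, start_ms, end_ms, freqs):
    """Mean amplitude over `freqs` and over the bin [start_ms, end_ms),
    with latencies measured from the first sample of `signal`."""
    first = round(start_ms * sampling_rate_hz / 1000)
    last = round(end_ms * sampling_rate_hz / 1000)
    total, count = 0.0, 0
    for f in freqs:
        wavelet = morlet(f, sampling_rate_hz)
        for i in range(first, last):
            total += amplitude_at(signal, i, wavelet)
            count += 1
    return total / count

# Hypothetical test signal: a unit-amplitude 40 Hz cosine sampled at 500 Hz for 2 s.
fs = 500
signal = [math.cos(2.0 * math.pi * 40.0 * n / fs) for n in range(1000)]
amp_40 = amplitude_at(signal, 500, morlet(40, fs))   # wavelet matched to the signal
amp_30 = amplitude_at(signal, 500, morlet(30, fs))   # wavelet 10 Hz away
```

For a segment that starts 1018 ms before screen-up, the two analysis bins would correspond to 1018–1318 ms and 1318–1618 ms from the segment start; passing `GAMMA_FREQS_HZ` as `freqs` gives the band average for one segment, which would then be averaged across segments and across the eight electrodes of a region.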

# Statistical Analysis

To determine whether there was any effect due to the number of objects, a repeated measures analysis of variance (ANOVA) was employed, with the gamma-band wavelet coefficient during the complete-occlusion period as the dependent variable and Condition (2-Object; 1-Object), Region (TPL; TPC; TPR; OL; OC; OR), and Latency (Early: 0–300 ms; Late: 300–600 ms) as the independent variables. Greenhouse-Geisser correction was applied if the assumption of sphericity was violated. Where significant interactions were found, post hoc analyses were performed with Bonferroni correction for Type I error.
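The Bonferroni step for the post hoc comparisons can be sketched as follows, assuming one contrast per region (six in total). The uncorrected p-values are hypothetical and chosen only for illustration; the function name is an assumption.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical uncorrected p-values for six regional contrasts (not from the study).
regions = ["TPL", "TPC", "TPR", "OL", "OC", "OR"]
uncorrected = [0.40, 0.25, 0.30, 0.08, 0.60, 0.002]
corrected = dict(zip(regions, bonferroni(uncorrected)))
significant = [region for region, p in corrected.items() if p < 0.05]
```

With six comparisons, an uncorrected p-value must fall below about 0.0083 to remain under 0.05 after correction.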

# RESULTS

The 2-Object condition generated more GBA than the 1-Object condition overall (F(1,27) = 26.43, p < 0.001), and this interacted with Region (F(5,135) = 152.24, p < 0.001). There was also a significant Region effect (F(5,135) = 66.91, p < 0.001). Examining the significant interaction between Condition and Region, post hoc analyses for each of the six regions revealed that the 2-Object condition elicited more GBA than the 1-Object condition only at the Occipital Right region (F(1,27) = 11.50, corrected p = 0.012); no difference between the two conditions was found at any of the other five regions (see **Figure 3**).

# DISCUSSION

The most meaningful finding of this study of young infants was that maintaining a representation of two objects during occlusion resulted in significantly greater GBA relative to maintaining a single object. Importantly, this enhancement was observed during the object occlusion period, in which there were no visible differences between the two conditions, thus demonstrating that these differences reflect distinct cognitive demands rather than perceptual processing. Similar to the enhanced GBA observed in adults when their working memory load increases (Howard et al., 2003; Palva et al., 2011), the current results support the hypothesis that the amount of GBA reflects the amount of perceptual history infants maintain after the objects were occluded.

The increase in GBA in the current study was in the right occipital region, which was more posterior than that reported in related earlier work (Kaufman et al., 2005), where GBA in the right temporal region during the occlusion period was higher than that during the disintegration period. However, taking together our current and previous results, the topographic distribution of the GBA during object occlusion in infants is similar to that in the left occipitotemporal area that Tallon-Baudry et al. (1998) observed in adults, in which subjects were told to keep an object in mind.

Interesting questions are raised, however, by the topographic differences between the current findings and those of Kaufman et al. (2003), who observed a marked gamma activity increase more specific to temporal cortex. This might be because GBA in that region is specific to holding any hidden object(s) in mind, regardless of how many; any gamma change would therefore become unobservable when we contrasted the two occluded conditions.

Another possibility, which we think is more likely, is that GBA in temporal cortex arises from the process of attempting to track the motion of an occluded object, whereas the current study involved representing occluded stationary objects only. We favor this explanation because of consistent evidence from both Southgate et al.'s (2008) work with infants and Tallon-Baudry et al.'s (1998) work with adults. Both of these studies involved the representation of stationary objects and resulted in similar topography to that of the infants described here. Future studies designed to differentiate the motion of occluded objects as opposed to occluded stationary objects will be needed to confirm this notion. Interestingly, the neural differences that we report between the 1- and 2-object conditions are strikingly similar to what Southgate et al. (2008) report when comparing activity during the occlusion of a single toy to the occlusion of a single face. Future studies are also needed to clarify what this fascinating similarity might represent.

As the GBA revealed here is generally consistent with prior work with occluded objects, it is worth reflecting on what this activity reveals about the neural processes underlying infant representation of small number. Our favored interpretation of this is that this type of brain activity underlies our early ability to represent small numbers (e.g., Wynn, 1992). However, we cannot at this point rule out the possibility that this activity is at least partially influenced by the total amount of visual input received prior to the occlusion period.

For example, it is possible that occluding a single large object would result in the same pattern of activity as two smaller objects.

While additional research is necessary to definitively disentangle these possibilities, theoretical accounts of the role of GBA as well as behavioral studies with infants and adults suggest otherwise. For example, Cordes and Brannon (2008) specifically investigated size and number representations of young infants. Their results clearly showed that infants spontaneously represent number even when cues such as object size are available. This work is consistent with both infant work (e.g., Feigenson and Carey, 2005) and adult work demonstrating that number representation often can take precedence over size representation (e.g., Gallivan et al., 2011). Moreover, it is important to note that in our two-object displays the objects were not contiguous. Given young infants' use of contiguity to visually individuate objects (Kaufman and Needham, 2010), it is reasonable to assume that the brain activity reported here reflects individual object representation rather than total amount of visual input.

It is worth acknowledging that microsaccadic activity could present a potential confounding factor, as some research (e.g., Yuval-Greenberg et al., 2008) has suggested that this muscle-based activity can erroneously be measured as brain-based. However, we do not think this is an issue in the current study, because the reported differences occur when infants in the two conditions are viewing the identical scene (i.e., an occluding screen in the upright position). Thus, any GBA difference observed is best explained by the difference that defines the two conditions: the number of objects prior to occlusion.

A number of important questions follow from this research, the most obvious being: how does GBA reflect larger numbers of occluded objects? Moreover, knowing that GBA can distinguish one from two hidden objects opens up opportunities for future research examining neural signatures for object individuation. In conjunction with the current research, such studies should reveal a much richer picture of how neurodevelopment relates to cognitive change in preverbal infants.

# AUTHOR CONTRIBUTIONS

SL, DM and JK prepared the manuscript; SL, RR, JK and AG did the analysis; JK, DM, DS, LI and SL designed the study; RR, AG and LI did the recruitment and testing.

# FUNDING

This research was supported under Australian Research Council's Discovery Projects funding scheme (project number DP110101598). The Eric O. Baker Charitable Fund provided funding for the equipment used in this research. Finally, DM is partially funded by a Royal Society-Wolfson Research Merit award.

# ACKNOWLEDGMENTS

The authors thank the infants and parents who took the time to participate in this study. We thank Joanne Tarasuik, Angela Mayes and Leila Dafner for their assistance in recruiting and testing participants and the Swinburne Babylab interns for their assistance in all aspects of running this study.

# REFERENCES

Atkinson, J. (2000). The Developing Visual Brain. Oxford: Oxford University Press.


language. Acta Psychol. (Amst) 102, 113–136. doi: 10.1016/s0001-6918(99)00029-3


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Leung, Mareschal, Rowsell, Simpson, Iaria, Grbic and Kaufman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# ERP markers of target selection discriminate children with high vs. low working memory capacity

Andria Shimi <sup>1</sup> \*, Anna Christina Nobre<sup>2</sup> and Gaia Scerif <sup>1</sup>

<sup>1</sup> Attention, Brain, and Cognitive Development Lab, Department of Experimental Psychology, University of Oxford, Oxford, UK, <sup>2</sup> Brain and Cognition Lab, Oxford Centre for Human Brain Activity, Department of Psychiatry, University of Oxford, Oxford, UK

Selective attention enables enhancing a subset out of multiple competing items to maximize the capacity of our limited visual working memory (VWM) system. Multiple behavioral and electrophysiological studies have revealed the cognitive and neural mechanisms supporting adults' selective attention of visual percepts for encoding in VWM. However, research on children is more limited. What are the neural mechanisms involved in children's selection of incoming percepts in service of VWM? Do these differ from the ones subserving adults' selection? Ten-year-olds and adults used a spatial arrow cue to select a colored item for later recognition from an array of four colored items. The temporal dynamics of selection were investigated through EEG signals locked to the onset of the memory array. Both children and adults elicited significantly more negative activity over posterior scalp locations contralateral to the item to-be-selected for encoding (N2pc). However, this activity was elicited later and for longer in children compared to adults. Furthermore, although children as a group did not elicit a significant N2pc during the time-window in which N2pc was elicited in adults, the magnitude of N2pc during the "adult time-window" related to their behavioral performance during the later recognition phase of the task. This in turn highlights how children's neural activity subserving attention during encoding relates to better subsequent VWM performance. Significant differences were observed when children were divided into groups of high vs. low VWM capacity as a function of cueing benefit. Children with large cue benefits in VWM capacity elicited an adult-like contralateral negativity following attentional selection of the to-be-encoded item, whereas children with low VWM capacity did not. These results corroborate the close coupling between selective attention and VWM from childhood and elucidate further the attentional mechanisms constraining VWM performance in children.

Keywords: selective attention, encoding, visual working memory, development, ERPs, contralateral posterior negativity, N2pc

# INTRODUCTION

Temporary storage of information is essential in order to act on our ever-changing visual world. However, our visual working memory (VWM), the system responsible for keeping information in an "on-line" state, is highly limited to about four items (Cowan, 2001; Todd and Marois, 2004). Yet, at any given moment we are faced with multiple items competing for representation.

Edited by: Zsuzsa Kaldy, UMass Boston, USA

Reviewed by: Roberto Dell'Acqua, University of Padova, Italy; Yoav Kessler, Ben-Gurion University of the Negev, Israel; Carlos M. Gómez, University of Seville, Spain

> \*Correspondence: Andria Shimi andria.shimi@psy.ox.ac.uk

Received: 19 August 2015 Accepted: 23 October 2015 Published: 05 November 2015

#### Citation:

Shimi A, Nobre AC and Scerif G (2015) ERP markers of target selection discriminate children with high vs. low working memory capacity. Front. Syst. Neurosci. 9:153. doi: 10.3389/fnsys.2015.00153

To maintain adaptive behavior, we need to represent only the most relevant information at any time. Visual selective attention allows us to select and process the items that are most relevant to current goals by shifting our focus to locations or objects. Influential theories of attention have postulated that a key mechanism for resolving the competition among items is selectivity, the ability to attend to the most relevant information and ignore the irrelevant (e.g., Desimone and Duncan, 1995; Desimone, 1998; Corbetta et al., 2000; Kastner and Ungerleider, 2001).

Indeed, over the last decade, research findings have not only highlighted the dynamic interplay between selective attention and VWM (Corbetta et al., 2002; Mayer et al., 2007; Chun and Johnson, 2011; Ikkai and Curtis, 2011; Fusser et al., 2012; Gazzaley and Nobre, 2012; Cohen et al., 2014) but critically, have demonstrated that individual differences in the efficiency of selective attention underpin differences between individuals with high vs. low VWM capacity, both in young and late adulthood (Vogel and Machizawa, 2004; Gazzaley et al., 2005; Vogel et al., 2005; McNab and Klingberg, 2008; Jost et al., 2011; Linke et al., 2011). Taken together, these findings have shown that the mechanisms responsible for the selection and encoding of representations into VWM underpin efficient storage and higher VWM capacity.

VWM increases dramatically with age (Gathercole, 1999; Cowan et al., 2005) with accompanied maturational changes in the brain (Kwon et al., 2002; Klingberg et al., 2002; Luna et al., 2004; Crone and Ridderinkhof, 2011; Jolles et al., 2011; Barriga-Paulino et al., 2014). Driven by the advances in the adult cognitive neuroscience literature and given that selective attention also undergoes dramatic improvement during childhood (Plude et al., 1994; Scerif, 2010; Johnson, 2011; Stevens and Bavelier, 2012), recent developmental research has also started examining the influence of visual attention mechanisms on the developing VWM system (Olesen et al., 2007; Cowan et al., 2010; Ross-Sheehy et al., 2011; Sander et al., 2011; Wendelken et al., 2011; Astle et al., 2012, 2014; Markant and Amso, 2013; Shimi et al., 2014a,b; Shimi and Scerif, 2015), rather than focusing solely on increases in VWM storage. Extending the adult findings to the developmental domain, in a recent study, Shimi et al. (2014a) demonstrated that age-related differences in the temporal dynamics of attentional orienting mechanisms before or after encoding items in VWM contributed to differences in VWM performance between children and adults. Importantly, individual differences in the temporal dynamics of the preparatory attentional orienting mechanisms that bias the encoding of relevant items into VWM discriminated children with high vs. low VWM capacity.

Despite this growing body of work within developmental science, our knowledge about the interactions between selective attention and VWM in children remains primarily focused on behavioral performance, rather than on the underlying neural circuits. Multiple electrophysiological studies have investigated the neural mechanisms supporting adults' selective attention of incoming percepts in the service of encoding in VWM. However, an understanding of analogous processes in children is significantly more limited. Similarly, knowledge about children's selective attention in service of VWM is disproportionately limited compared with knowledge about selective attention for sensory processing. In the sensory, rather than the memory domain, research has shown that the speed and the efficiency of the ability to select the relevant stimulus among competing items improve with age, possibly reflecting the protracted development of neural networks controlling selective attention (for reviews, see Ridderinkhof and van der Stelt, 2000; Stevens and Bavelier, 2012). Thus, here, we examined whether the neural correlates of attentional selection during encoding into VWM operate differently in childhood compared to adulthood. In addition, we examined whether individual differences in children's efficiency of attentional selection of the relevant item during encoding relate to individual differences in VWM capacity. This is typically assessed in more traditional behavioral terms, through explicit recognition memory at the end of the trial sequence. However, the event-related brain potentials (ERP) method can track electrical brain responses on a millisecond-by-millisecond resolution, and it is therefore ideally suited for investigating attentional mechanisms leading to later accurate memory at different processing stages (Hillyard and Anllo-Vento, 1998; Luck et al., 2000), in both children and adults.
This high temporal resolution has important implications for increasing an understanding of the neural mechanisms supporting VWM across development. For example, we have previously seen that neural activity elicited when guiding attention to a location of an upcoming target via an attentional cue, i.e., an initial stage within the information processing stream [as reflected in ERP components such as the Early Directing Attention Negativity (EDAN), the Anterior Directing Attention Negativity (ADAN), and the Late Directing Attention Positivity (LDAP)], differed not only between children and adults, but also between children of high vs. low VWM capacity (Shimi et al., 2014a). Here, we asked the following complementary question: Do age group and individual differences in neural activity hold for a subsequent stage within the information processing stream, i.e., selecting efficiently the relevant to-be-encoded item in VWM?

We examined this question by focusing on a well-known lateralized electrophysiological marker of attentional selection of a target item among multiple competing items, the N2pc (Luck and Hillyard, 1994a,b; Eimer, 1996; Hickey et al., 2009). N2pc is an enhanced negativity elicited over posterior scalp sites contralateral to the side of the attended item, typically within the latency range of ∼200–350 ms post-stimulus. N2pc has been heavily studied within the adult population in the sensory domain (Woodman and Luck, 1999; Hopf et al., 2004; Kiss et al., 2008; Mazza et al., 2009; Woodman et al., 2009) and more recently in the VWM domain (Nobre et al., 2008; Astle et al., 2009; Kuo et al., 2009; Shimi et al., 2014a). In contrast, in children, studies of N2pc are exceptionally scarce: to our knowledge, only two studies have examined N2pc in typically developing children to date. One of these studies investigated the selection of sensory targets among distractors using a visual search paradigm (Couperus and Quirk, 2015), and the other study found the N2pc to be elicited by children when they retrospectively searched their VWM (Shimi et al., 2014a); in both cases, similarities and differences emerged between children and adults in the topography and latency of the N2pc respectively. Thus far, no published study has examined whether N2pc is involved in attentional selection during the encoding of information in VWM in childhood; and if so, whether it resembles the spatiotemporal characteristics of the N2pc involved in attentional selection during VWM encoding in adulthood. Here, by measuring N2pc, we examined: (1) whether children and adults elicit similar neural activity when selecting a target item among competing items, for encoding in VWM and (2) whether this neural activity relates to individual differences in VWM capacity in children.

# MATERIALS AND METHODS

# Participants

Seventeen typically developing children (5 males and 12 females), aged 10–11 years old (M = 10.2 years old, SD = 0.39), and 15 healthy adults (8 males and 7 females), 21–34 years old (M = 26.4 years old, SD = 3.76), participated in the study. Children were recruited from local primary schools via an opt-in procedure and adults were recruited among University postgraduate students. All participants were right-handed and had normal or corrected-to-normal vision. The study had ethical approval from the Central University Research Ethics Committee of the University of Oxford. Prior to testing, adult participants and parents of child participants signed a consent form whereas children assented to participate in the study verbally. For their participation, adults received monetary compensation and children received an appreciation certificate. One adult participant was excluded from the analyses due to significantly below-chance behavioral performance. The same sample of participants was included in complementary analyses to those reported here, which focused on activity associated with attentional cues, rather than VWM arrays (Shimi et al., 2014a). We chose 10–11 year-olds as our age-comparison group to the adult group because a few developmental studies have shown that some cognitive control abilities reach the adult mature state around 10–11 years of age whereas other cognitive control abilities continue to develop until later in adolescence (e.g., Huizinga et al., 2006). Based on this, 10–11 year-olds could either be similar to adults or still developing, making them an interesting target age group to study the developmental state of selective attention and WM processes. Also, taking into account the large variability that may exist in children's data, we opted for a narrow age group that would provide more statistical power and maximize the likelihood of separating age-related and individual differences.

# Task and Stimuli

The full study design was described in detail elsewhere (Shimi et al., 2014a). Here, we describe only the trial types related to the focus of the current paper. These are illustrated in **Figure 1**. Participants viewed arrays of four colored items, followed by a single colored probe item after a variable delay. They were instructed to indicate whether the probe was present among the initial four items by pressing a mouse button (left for present and right for absent). Items comprised identical line drawings of familiar objects and cartoons (e.g., basketballs), each subtending 1.64° × 2.05° of visual angle from a distance of 100 cm and centered at 2.87° lateral and 2.87° azimuthal eccentricity from a central fixation point. The items were presented in different colors (drawn from a set of seven colors: white, red, magenta, orange, yellow, green, and blue) on a black background. On half of the trials, the memory array was preceded by a spatial cue (white arrow; 0.82° × 0.82°) that guided the participants' attention to one of the upcoming items of the array and was fully informative (100%) of the location of a target probe, should this appear in the memory array (cued trials henceforth). The cue was equally likely to point to one of the four possible locations. On the other half of the trials, a spatially uninformative white square (0.82° × 0.82°) was presented before the array (neutral trials henceforth), and served the purpose of controlling for the non-spatial alerting effects that the spatial cue may engender.

Participants completed 12 practice trials to familiarize themselves with the task, followed by 192 test trials divided into blocks of 48 trials each, with 67% of trials containing the probe in the memory array ("probe present") and 33% of trials not containing it ("probe absent"). Cued and neutral trials were intermixed randomly within each block.

#### Procedure

Participants were comfortably seated in a dimly illuminated, electrically shielded room, and were given written and verbal instructions along with examples on cards. On practice trials, participants received verbal feedback from the experimenter and visual feedback (correct, incorrect, no response) on the screen after each trial, whereas on test trials, participants received feedback about the number of correct responses every 16 trials and at the end of each block. Prior to the beginning of each block, participants were advised and reminded to pay attention to the cue as it would help them decide whether the probe item reappeared. They held the mouse with their right hand and were advised to respond as quickly and accurately as possible while maintaining their gaze on the fixation point throughout the trial. They were also asked to blink as little as possible, preferably after they responded, and to try to remain still during task performance. Participants were monitored throughout the task via a camera to ensure that they were engaged in the task and that they were not moving or blinking excessively. All participants completed all test blocks except one child who completed one block fewer due to fatigue and loss of interest in the task. Self-paced breaks were inserted between blocks.

# EEG Recording and Processing

EEG was recorded continuously using a NuAmp amplifier (Neuroscan, Inc.) from 19 silver/silver chloride electrodes mounted on an elastic cap and positioned according to the International 10–20 system (AEEGS, 1991). The montage included four midline scalp sites (Fz, FCz, Cz, Pz) and five scalp sites over each hemisphere (F3/F4, C3/C4, P3/P4, PO7/PO8, O1/O2). Additional electrodes were used as ground and reference sites. The electrode placed at AFz on the midline served as the ground. The EEG was referenced on-line to the FCz electrode and then re-referenced off-line to the algebraic average of the left and the right mastoids. Blinks and eye movements were monitored by deriving bipolar recordings from electrodes placed on the outer canthi of both eyes (HEOG) and from one electrode placed below the right eye and F4 (VEOG). Electrode impedances were kept below 5 kΩ. The ongoing brain activity at all scalp sites was sampled every 1 ms (1000 Hz analog-to-digital sampling rate) and filtered with a band-pass of 0.50–70 Hz.

**Figure 1** (caption fragment): ... minimal trial attrition across age-groups). Participants had to respond whether the probe was present in the array or not by pressing mouse buttons. Bottom row shows Cowan's K (left panel) and median RT (right panel) scores on cued and neutral trials for 10-year-olds and adults. Error bars represent ±95% confidence intervals.

The EEG data were then filtered off-line with a low-pass filter of 40 Hz and the continuous EEG was segmented into epochs, time-locked to the onset of the memory array in cued trials. Given that we were interested in neural activity that was lateralized with respect to the side of the to-be-encoded item, epochs from leftward and rightward trials were combined with an averaging procedure that preserved the spatial location of the electrodes relative to the position of the to-be-encoded item (i.e., contralateral or ipsilateral). Epochs started 100 ms prior to- and ended 600 ms after stimulus onset. ERP amplitude values were baseline corrected relative to a −100–50 ms stimulus interval. Epochs containing excessive noise or drift (±100 µV for adults and ±150 µV for children) at any electrode were excluded from subsequent analyses. Furthermore, epochs containing blinks or eye movements (±50 µV for adults and ±100 µV for children) were rejected. The thresholds for each age-group were chosen based on previous ERP parameters used with adults (e.g., Murray et al., 2011) and with children (e.g., Melinder et al., 2010) and to be in line with the previous ERP parameters used with the same sample of participants (Shimi et al., 2014a). Due to skull differences (Scerif et al., 2006) as well as other physiological differences between children and adults (e.g., brain tissue) and given that children's spectral power is higher than adults' (Barriga-Paulino et al., 2011), different artifact rejection thresholds are required in order to refrain from excluding clean EEG trials from the children's data. In addition, all epochs were visually inspected for any residual artifacts, which were all manually eliminated, an additional check that was especially important for lateralized eye-movements, as these may capture overt rather than covert attention. 
This artifact rejection procedure resulted in retaining approximately 82% of overall trials for adults and 85% of overall trials for children. Finally, trials with incorrect behavioral responses were discarded. In order to maintain an acceptable signal-to-noise ratio, the minimum accepted number of trials per participant was set to 20; on average, 70 trials were retained for adults and 74 for children.
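The threshold-based part of the artifact rejection described above can be sketched as follows. This is an illustrative simplification, not the authors' processing code: the data layout (a dictionary mapping channel names to lists of voltages in µV), the function name `epoch_is_clean`, and the toy values are all assumptions. Only the threshold values come from the text, and the manual visual inspection step is not modeled.

```python
# Rejection limits in microvolts, taken from the thresholds stated in the text.
SCALP_LIMIT_UV = {"adult": 100.0, "child": 150.0}  # excessive noise or drift
EOG_LIMIT_UV = {"adult": 50.0, "child": 100.0}     # blinks or eye movements
EOG_CHANNELS = {"HEOG", "VEOG"}


def epoch_is_clean(epoch, group):
    """Return True if no sample exceeds the limit for its channel type.

    `epoch` maps a channel name to a list of baseline-corrected voltages (µV);
    this layout is a hypothetical choice made for the example.
    """
    for channel, samples in epoch.items():
        limits = EOG_LIMIT_UV if channel in EOG_CHANNELS else SCALP_LIMIT_UV
        if any(abs(value) > limits[group] for value in samples):
            return False
    return True


# Toy epoch: a 120 µV deflection at PO7 and quiet EOG channels.
toy_epoch = {
    "PO7": [5.0, 120.0, -8.0],
    "PO8": [3.0, -4.0, 6.0],
    "HEOG": [10.0, -12.0, 8.0],
    "VEOG": [20.0, 15.0, -5.0],
}
adult_ok = epoch_is_clean(toy_epoch, "adult")  # 120 µV exceeds the 100 µV adult limit
child_ok = epoch_is_clean(toy_epoch, "child")  # within the 150 µV child limit
```

The same toy epoch is rejected under the adult limits and retained under the child limits, which is the practical consequence of using age-specific thresholds.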

#### ERP Analyses

The aim of this experiment was to examine children's neural correlate of selecting one out of multiple items for encoding into VWM, its relation with behavior, and whether it resembles the neural correlate observed in adults. Hence, the ERP analyses focused on epochs locked to the memory arrays presented after an attentional cue guided participants' attention to the item that they should encode in VWM. We targeted a well-known lateralized ERP marker of attentional selection, namely N2pc, and we quantified it as the mean voltage difference between contralateral and ipsilateral sites relative to the side of the to-be-selected item (target henceforth). Based on the previous findings, N2pc was expected to occur, and was therefore measured, at posterior electrodes, PO7/8 and O1/2. We examined the presence of N2pc only for memory arrays in cued trials, as there was no specific lateralized item to be encoded in arrays of neutral trials; rather, participants had to encode all four items. The time windows for analyzing the N2pc for each age group were selected on the basis of the following latency analysis: lateralized voltage differences were tested in successive time-bins in steps of 40 ms intervals between 260 and 400 ms following visual inspection of the two group average waveforms. Effects were considered significant if a p < 0.05 criterion was exceeded for 40 ms and persisted over at least two successive time bins in a given region. This exploratory analysis for each age group guided the selection of the time window with which to test for the presence of an N2pc effect. A two-way repeated measures ANOVA was then conducted on the mean amplitude of the neural activity in the longer time window merging the time-bins in which the effects were found significant and sustained, testing the effects of electrodes (PO7/8 and O1/2) and visual hemifield (contralateral and ipsilateral to the target).
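The N2pc quantification described above (mean contralateral minus mean ipsilateral voltage within a time window) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the waveforms are plain lists of voltages, the function names are hypothetical, and the toy data are constructed so that the effect is known in advance. It does not reproduce the statistical testing of successive time bins.

```python
def mean_in_window(waveform, times_ms, start_ms, end_ms):
    """Mean voltage of the samples whose time lies in [start_ms, end_ms]."""
    samples = [v for v, t in zip(waveform, times_ms) if start_ms <= t <= end_ms]
    if not samples:
        raise ValueError("no samples fall inside the requested window")
    return sum(samples) / len(samples)


def n2pc_amplitude(contra, ipsi, times_ms, start_ms, end_ms):
    """Contralateral minus ipsilateral mean voltage; negative values indicate an N2pc."""
    return (mean_in_window(contra, times_ms, start_ms, end_ms)
            - mean_in_window(ipsi, times_ms, start_ms, end_ms))


# Toy waveforms sampled every 20 ms from -100 to 600 ms (not real data).
times = list(range(-100, 601, 20))
ipsi = [0.0 for _ in times]
contra = [-1.5 if 260 <= t <= 320 else 0.0 for t in times]

effect = n2pc_amplitude(contra, ipsi, times, 260, 320)
no_effect = n2pc_amplitude(contra, ipsi, times, 400, 500)
```

In practice the contralateral and ipsilateral waveforms would first be obtained by averaging left-target and right-target trials so that electrode side is coded relative to the target, as described in the EEG processing section.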

#### Behavioral Analyses

Separate mixed-design ANOVAs were performed on d-prime, K, and median RT scores with trial type (cued, neutral) as the within-subject variable and the age group (10-year-olds, adults) as the between-subject variable. D-prime and Cowan's K measures converged, so for brevity we report statistics only for K. Cowan's K is a memory capacity measure that reflects the number of stored items in memory (Pashler, 1988; Cowan, 2001) and here was calculated using the formula: K = S (set size of the initial array) × (hit rate − false alarm rate). Hit rate was defined as the conditional probability that the participants responded probe present when the probe was indeed present and false alarm rate was defined as the conditional probability that the participants responded probe present when in fact the probe was absent. Extreme scores (e.g., perfect hit rate) were adjusted using the formula 1−(1/2N) as recommended by Macmillan and Creelman (2005) where N = the number of total trials in a condition. RTs were computed for probe-present trials and for correct responses only because incorrect responses and absent trials may be influenced by multiple non-attentional processes (as discussed in Griffin and Nobre, 2003). In addition, we explored functional links between behavioral performance and neural activity in children via split-half paired-sample t-tests on high- and low-memory capacity groups (as a function of cueing benefit in K) separately.
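The K formula and the adjustment for extreme scores can be written out as a short sketch. The function names are illustrative. The mapping of a rate of exactly 0 to 1/(2N) is the usual companion to the 1−(1/2N) rule and is an assumption here, since the text only gives the perfect-score case. The example uses N = 96, assuming half of the 192 test trials fall in each condition.

```python
def adjust_rate(rate, n_trials):
    """Pull rates of exactly 0 or 1 away from the extremes.

    A rate of 1 becomes 1 - 1/(2N), as stated in the text. Mapping a rate
    of 0 to 1/(2N) is an assumed companion convention.
    """
    if rate >= 1.0:
        return 1.0 - 1.0 / (2 * n_trials)
    if rate <= 0.0:
        return 1.0 / (2 * n_trials)
    return rate


def cowans_k(hit_rate, false_alarm_rate, n_trials, set_size=4):
    """Cowan's K = S * (hit rate - false alarm rate), after adjusting extreme rates."""
    hits = adjust_rate(hit_rate, n_trials)
    false_alarms = adjust_rate(false_alarm_rate, n_trials)
    return set_size * (hits - false_alarms)


# Hypothetical performance values (not the study's data).
k_typical = cowans_k(hit_rate=0.9, false_alarm_rate=0.1, n_trials=96)
k_ceiling = cowans_k(hit_rate=1.0, false_alarm_rate=0.0, n_trials=96)
```

With a set size of four, K is bounded above by 4, and the adjustment keeps a flawless performer slightly below that ceiling.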

# RESULTS

#### Behavioral Results

There were significant main effects of age group, F(1,29) = 14.65, p = 0.001, with overall higher K scores in adults (M = 3.24) compared with children (M = 2.33), and trial type, F(1,29) = 96.41, p < 0.001, with significantly higher K scores in cued (M = 3.46) than in neutral trials (M = 2.11). The interaction of age group × trial type did not reach significance, F(1,29) = 1.82, p = 0.19, suggesting that benefits from cues in accuracy did not differ significantly between children and adults.

The analysis on median RTs to probes accurately reported as present in the memory array showed significant main effects of age group, F(1,29) = 35.27, p < 0.001, and trial type F(1,29) = 51.72, p < 0.001, as well as a significant interaction of age group × trial type, F(1,29) = 7.47, p = 0.011. Analyses of simple main effects for the age-group × trial type interaction revealed that the interaction was driven by a smaller RT benefit drawn from cues by adults (M = 170) than children (M = 378, p = 0.008). A subsequent difference-scores analysis was carried out to interpret the interaction independently of baseline differences on neutral trials, and taking overall slowing in RT into account by treating RT differences as proportions of neutral RTs [(neutral-cued)/neutral]. The effect on scaled RTs did not remain significant (p = 0.25), thus suggesting that the larger RT benefits in children depended on overall slowing in baseline responses by the children. **Figure 1** shows behavioral results.
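The scaled difference score used above, (neutral − cued)/neutral, can be illustrated with a minimal sketch. The raw benefits of 170 and 378 match the group means reported in the text, but the neutral and cued RTs below are hypothetical values chosen only to make the arithmetic transparent, and the function name is illustrative.

```python
def scaled_rt_benefit(neutral_rt, cued_rt):
    """Cueing benefit expressed as a proportion of the neutral-trial RT."""
    if neutral_rt <= 0:
        raise ValueError("neutral_rt must be positive")
    return (neutral_rt - cued_rt) / neutral_rt


# Hypothetical median RTs: the raw benefits differ (170 vs. 378),
# but each is the same proportion of its own neutral baseline.
adult_scaled = scaled_rt_benefit(neutral_rt=850.0, cued_rt=680.0)
child_scaled = scaled_rt_benefit(neutral_rt=1890.0, cued_rt=1512.0)
```

Under these assumed baselines both groups show a proportional benefit of 0.2, which is the kind of pattern that would make a raw RT interaction disappear once overall slowing is taken into account.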

#### ERP Results

#### Adults

For adults, there was significant enhanced negativity contralateral to the position of the target in the memory array between 260 and 320 ms at PO7/8 and O1/2 sites, F(1,13) = 6.03, p = 0.029, reflecting the N2pc. **Figure 2** illustrates the neural activity elicited during attentional selection of the target item for encoding into VWM for adults.

#### Children

The statistical analysis of the children's ERP amplitudes showed effects that resembled those of adults in topography but differed in timing. There was a significantly enhanced negativity contralateral to the position of the target in the memory array between 280 and 380 ms at PO7/8 and O1/2 sites, F(1,16) = 4.74, p = 0.045, signifying the N2pc. **Figure 3** illustrates the neural activity elicited during attentional selection of the target item for encoding into VWM for children.
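As a rough illustration of how an N2pc effect of this kind is typically quantified, the sketch below takes the mean amplitude in a time window and subtracts the ipsilateral from the contralateral waveform. It assumes waveforms that have already been averaged over trials and over the PO7/8 and O1/2 electrode pairs; the function names and the toy waveforms are hypothetical, and the exact procedure used in the study may differ.

```python
def mean_amplitude(times_ms, amplitudes_uv, start_ms, end_ms):
    """Mean amplitude over samples with start_ms <= t <= end_ms."""
    window = [a for t, a in zip(times_ms, amplitudes_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested window")
    return sum(window) / len(window)


def n2pc_amplitude(times_ms, contra_uv, ipsi_uv, start_ms, end_ms):
    """Contralateral minus ipsilateral mean amplitude in the window.

    Negative values indicate greater contralateral negativity.
    """
    return (mean_amplitude(times_ms, contra_uv, start_ms, end_ms)
            - mean_amplitude(times_ms, ipsi_uv, start_ms, end_ms))


# Toy waveforms sampled every 20 ms (illustrative only, not study data).
times = list(range(200, 401, 20))  # 200, 220, ..., 400 ms
ipsi = [1.0] * len(times)
# Contralateral waveform dips by 1 microvolt between 260 and 320 ms.
contra = [1.0 if t < 260 or t > 320 else 0.0 for t in times]

adult_window = n2pc_amplitude(times, contra, ipsi, 260, 320)  # adult window
child_window = n2pc_amplitude(times, contra, ipsi, 280, 380)  # child window
```

The same waveform yields a smaller difference in the later, longer window. This shows why the choice of measurement window matters when groups differ in the latency of the component.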

#### Electrophysiological Predictors of VWM Capacity in Children

Subsequently, we examined whether children's ability to deploy attentional selection in the service of encoding into VWM related to their VWM capacity. We chose to examine this in the time window in which the N2pc was elicited in adults (i.e., 260–320 ms), in order to investigate whether children who demonstrate "adult-like" neural activity during attentional selection at encoding show a greater cueing benefit in VWM capacity. Previous results have shown that the large variability in children's VWM capacity is explained by some children demonstrating an "adult-like" neural profile in their efficiency of preparatory attention whereas others do not (Shimi et al., 2014a). By examining a similar question here, results can demonstrate functional links between the efficiency of attentional selection at encoding and later VWM performance in childhood, a question that has not been investigated before.

We carried out median-split analyses by dividing children into high- and low-capacity groups (on the basis of K benefit). This allowed us to carry out paired-sample t-tests between contralateral and ipsilateral ERP amplitudes, and therefore to explore the presence of an "adult-like" N2pc in each capacity group separately. Splitting the children into those who showed a large vs. small cue benefit following spatial cues in terms of K revealed a significantly enhanced negativity contralateral to the position of the target in the memory array between 260 and 320 ms at PO7/8 and O1/2 sites, F(1,8) = 5.77, p = 0.04, i.e., an N2pc, for the large cue benefit group. In contrast, there was no statistically significant N2pc in the small cue benefit group, F(1,7) = 0.27, p = 0.62 (**Figure 4**).
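The median-split logic described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, participant labels, and values are hypothetical, and placing children whose benefit equals the median in the high group is an assumption rather than a detail reported in the text. For a comparison with two within-subject levels, the squared paired t statistic equals F with (1, n − 1) degrees of freedom, which is one way to relate a paired comparison to the F values reported here.

```python
from math import sqrt
from statistics import mean, median, stdev


def median_split(k_benefit):
    """Split participant ids into (high, low) groups by K benefit.

    Values equal to the median go to the high group (an assumption).
    """
    cutoff = median(k_benefit.values())
    high = [p for p, k in k_benefit.items() if k >= cutoff]
    low = [p for p, k in k_benefit.items() if k < cutoff]
    return high, low


def paired_t(contra, ipsi):
    """Paired-sample t statistic and degrees of freedom."""
    diffs = [c - i for c, i in zip(contra, ipsi)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical K benefit (cued K minus neutral K) per child.
k_benefit = {"c1": 0.2, "c2": 1.5, "c3": 0.9, "c4": 2.1, "c5": 0.4}
high, low = median_split(k_benefit)

# Hypothetical mean amplitudes (microvolts) in the 260-320 ms window.
t, df = paired_t(contra=[-1.0, -2.0, -3.0], ipsi=[0.0, 0.0, 0.0])
f_equivalent = t ** 2  # F(1, df) for a two-level within-subject contrast
```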

#### DISCUSSION

The aims of this study were to identify the ERP correlates of children's attentional selection of a target item, among multiple competing items, during encoding in VWM, and to test whether these resemble the neural correlates involved in adults' selective encoding in VWM. Results showed that both children and adults elicited a significant negativity contralateral to the item to be encoded in VWM, i.e., both age groups elicited an N2pc. The observed ERP component had a similar topographical distribution between the two age groups but differed in latency. Importantly, individual differences in the extent to which the N2pc at encoding was "adult-like" related to variation in VWM performance at the end of the trial in children.

Despite overall better VWM performance and higher VWM capacity for adults compared to children, all participants benefitted from cues before encoding. This suggests that when the memory array appeared, both children and adults largely selected the item to be probed for encoding in VWM. This behavioral finding was corroborated by the neural activity participants elicited following memory array onset: all participants elicited greater negativity at posterior scalp sites (O1/2 and PO7/8) contralateral to the target item, and this neural activity shared the typical spatiotemporal characteristics of the N2pc. The N2pc has been associated with visual search and spatial selection of targets among distractors in incoming percepts (e.g., Eimer, 1996; Luck et al., 1997; Hopf et al., 2000; Hickey et al., 2009) as well as with search and detection of targets held in VWM (e.g., Kuo et al., 2009; Dell'Acqua et al., 2010; Shimi et al., 2014a). Obtaining an N2pc here suggests that selective attention during VWM encoding both in childhood and in adulthood involves spatially selecting the target item from the memory array for later recognition. In combination with recent findings where preparatory shifts of attentional orienting did not elicit an N2pc (Shimi et al., 2014a), our current result is consistent with past adult studies suggesting that the N2pc does not simply index the generalized attentional deployment in visual space towards anticipated target locations, but rather reflects spatial attentional selection of target objects (Kiss et al., 2008; Woodman et al., 2009). The current study extends these observations to the VWM domain both for childhood and adulthood. Following top-down modulations from fronto-parietal areas during preparatory orienting of attention (Murray et al., 2011; Eimer, 2014a,b; Shimi et al., 2014a), the prioritization of a visual percept during encoding in VWM seems to include sensory regions of visual cortex, with posterior parietal occipital cortex coding the attended percept more specifically. This finding is noteworthy for developmental cognitive neuroscientists studying attention and/or VWM, as no prior study has examined the temporal dynamics involved in children's attentional selection during VWM encoding. Although a few other developmental studies have examined gating mechanisms in VWM (Sander et al., 2011; Astle et al., 2014), these have focused on an ERP component that follows the N2pc, i.e., the contralateral delay activity (CDA), which has mainly been investigated in the context of modulation by the number of items currently maintained in VWM, and not in terms of the deployment of selective attention to a specific stimulus for encoding.

Even though the N2pc was elicited in both age groups, the ERP component differed in latency and duration between the two age groups; that is, children as a group elicited the N2pc later and for longer than adults. This finding suggests that, although both children and adults can select the target item among multiple competing items during encoding in VWM, at least when appropriate attentional cues that guide selective attention are provided, the two age groups nonetheless differ in their ability to do so. It seems that, at the group level, children are slower than adults and need more time to selectively and efficiently encode the relevant item in VWM among irrelevant information. This result is in line with findings from the sensory domain showing that the speed and the efficiency of selection of a relevant stimulus among competing items improve with age (Ridderinkhof and van der Stelt, 2000). Therefore, our result extends previous findings relating selective attention for perception to the VWM domain. This neural change in attentional efficiency during encoding in VWM from childhood to adulthood may be the outcome of richer myelination of axons taking place across development, which may have an effect on axonal transmission and subsequently on the speed and efficiency of cognitive processing (Giedd et al., 1999; Klingberg et al., 1999; Casey et al., 2005; Craik and Bialystok, 2006).

Finally, despite the reliable presence of the N2pc in children at the group level (which provided a clear neural index of their ability to focus attention and to select the target item for encoding in VWM and for later recognition), our second key finding is the high degree of variability across children in the ability to attend to and encode targets in VWM. Children who demonstrated an "adult-like" neural modulation during the encoding phase of the target item benefitted the most from attention cues that pointed to the item to be probed, and thus to the item that they should encode in VWM. In other words, high-capacity children who elicited an N2pc sharing the spatio-temporal characteristics of the ERP component observed in adults (i.e., the N2pc was elicited earlier and for a shorter period of time) showed a large attention benefit in their recognition memory performance for cued trials, compared to neutral trials. In contrast, low-capacity children did not show a robust differentiation in the adult N2pc time window, and showed a small attention benefit in behavioral terms. It is now well accepted that younger and older adults' ability to regulate access to VWM in a goal-directed manner is vital for protecting VWM capacity from irrelevant information (Vogel and Machizawa, 2004; Gazzaley et al., 2005; Vogel et al., 2005; McNab and Klingberg, 2008; Zanto and Gazzaley, 2009; Murray et al., 2011). Extending recent developmental neuroscience findings that have shown that individual differences in preparatory neural activity prior to encoding information in VWM distinguish children with high vs. low VWM capacity (Shimi et al., 2014a), our current findings demonstrate that individual differences in neural activity underlying selective attention during VWM encoding also discriminate children of high vs. low VWM capacity. Children's ability to deploy selective attention and to encode only the relevant item in VWM, which ultimately results in higher VWM capacity, is mediated by faster and more efficient neural processing that approximates the adults' neural profile. Future directions may include the investigation of other possible behavioral correlates of adult-like selection markers: for example, it is possible that children with higher VWM capacity and an N2pc also score highly on measures of intelligence, although we did not measure these here. Nonetheless, to our knowledge, this is the first study to show correlations between the mechanisms of selective attentional deployment to a specific target item during VWM encoding (N2pc) and VWM capacity in childhood.

In conclusion, the current findings demonstrate that efficient deployment of selective attention goes hand in hand with efficient VWM encoding both in childhood and in adulthood. Although behavioral data do not seem sensitive enough to capture age group differences in processing speed of VWM encoding, the underlying neural pattern demonstrates that, from childhood, children with more refined skills in selective attention exhibit higher VWM capacity. These findings provide new insights into the relatively recent developmental cognitive neuroscience literature examining attentional contributions to increases in VWM capacity. Future studies examining the developmental trajectories of selective attention in the service of VWM capacity can shed light on the maturation of the N2pc and related behavioral parameters.

# ACKNOWLEDGMENTS

AS was supported by a Bodossaki Foundation scholarship, St. Peter's College, University of Oxford and by an A.G. Leventis Foundation scholarship. GS was supported by a Scholar Award of the James S. McDonnell Foundation. AC Nobre was supported by a Wellcome Trust Senior Investigator Award (104571/Z/14/Z). Also, special thanks to the Brain and Cognition Lab for help and/or intellectual input to the study.

# REFERENCES


memory. J. Neurosci. 29, 8032–8038. doi: 10.1523/jneurosci.0952-09.2009


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Shimi, Nobre and Scerif. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.