# NEURAL SIGNAL ESTIMATION IN THE HUMAN BRAIN

EDITED BY: Christopher W. Tyler, Clare Howarth and Lora T. Likova PUBLISHED IN: Frontiers in Neuroscience

#### *Frontiers Copyright Statement*

*© Copyright 2007-2016 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.*

*The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.*

*Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.*

*Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.*

*As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.*

> *All copyright, and all rights therein, are protected by national and international copyright laws.*

*The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.*

ISSN 1664-8714 ISBN 978-2-88919-923-5 DOI 10.3389/978-2-88919-923-5


# **NEURAL SIGNAL ESTIMATION IN THE HUMAN BRAIN**

Topic Editors:

**Christopher W. Tyler,** City University London, UK

**Clare Howarth,** The University of Sheffield, UK

**Lora T. Likova,** Smith-Kettlewell Eye Research Institute, USA

The Blood Brain Barrier and Astrocytes type 1. Image by Ben Brahim Mohammed (2010). https://commons.wikimedia.org/wiki/File:Blood\_Brain\_Barriere.jpg This file is licensed under the Creative Commons Attribution 3.0 Unported license.

The ultimate goal of functional brain imaging is to provide optimal estimates of the neural signals flowing through the long-range and local pathways mediating all behavioral performance and conscious experience. In functional Magnetic Resonance Imaging (fMRI), despite its impressive spatial resolution, this goal has been somewhat undermined by the fact that the fMRI response is essentially a blood-oxygenation level dependent (BOLD) signal that only indirectly reflects the nearby neural activity. The vast majority of fMRI studies restrict themselves to describing the details of these BOLD signals and deriving non-quantitative inferences about their implications for the underlying neural activity.

This Frontiers Research Topic welcomed empirical and theoretical contributions that focus on the explicit relationship of non-invasive brain imaging signals to the causative neural activity. The articles presented in the resulting eBook aim both to highlight the importance of non-invasive estimation of neural signals in the human brain and to improve it. To achieve this aim, the following issues are targeted:

(1) The spatial limitations of source localization when using MEG/EEG.

(2) The coupling of the BOLD signal to neural activity. Articles discuss how animal studies are fundamental in increasing our understanding of BOLD fMRI signals, analyze how non-neuronal cell types may contribute to the modulation of cerebral blood flow, and use modeling to improve our understanding of how local field potentials are linked to the BOLD signal.

(3) The contribution of excitatory and inhibitory neuronal activity to the BOLD signal.

(4) Assessment of neural connectivity through the use of resting state data, computational modeling and functional Diffusion Tensor Imaging (fDTI) approaches.

**Citation:** Tyler, C. W., Howarth, C., Likova, L. T., eds. (2016). Neural Signal Estimation in the Human Brain. Lausanne: Frontiers Media. doi: 10.3389/978-2-88919-923-5

# Table of Contents


*the brain: a problem for interpreting BOLD studies but potentially a new window on the underlying neural activity*

Richard B. Buxton, Valerie E. M. Griffeth, Aaron B. Simon and Farshad Moradi

# **SECTION 3: Advanced brain activity analyses**


# **SECTION 4: New approaches to neural connectivity**

*The spatial structure of resting state connectivity stability on the scale of minutes*

Javier Gonzalez-Castillo, Daniel A. Handwerker, Meghan E. Robinson, Colin Weir Hoy, Laura C. Buchanan, Ziad S. Saad and Peter A. Bandettini

*Cortical connective field estimates from resting state fMRI activity*

Nicolás Gravel, Ben Harvey, Barbara Nordhjem, Koen V. Haak, Serge O. Dumoulin, Remco Renken, Branislava Ćurčić-Blake and Frans W. Cornelissen


René C. W. Mandl, Hugo G. Schnack, Marcel P. Zwiers, René S. Kahn and Hilleke E. Hulshoff Pol

# Editorial: Neural Signal Estimation in the Human Brain

#### Christopher W. Tyler<sup>1,2</sup>\*, Clare Howarth<sup>3</sup> and Lora T. Likova<sup>1</sup>

*<sup>1</sup> Division of Optometry and Vision Sciences, Centre for Applied Vision Research, City University London, London, UK, <sup>2</sup> Smith-Kettlewell Eye Research Institute, San Francisco, CA, USA, <sup>3</sup> Department of Psychology, University of Sheffield, Sheffield, UK*

Keywords: human brain, neural signal estimation, fMRI, MEG, EEG

**The Editorial on the Research Topic**

#### **Neural Signal Estimation in the Human Brain**

The ultimate goal of functional brain imaging is to estimate the neural signals that flow through the brain, mediating behavior and conscious experience during the spectrum of activities controlled by the nervous system. Although various brain imaging techniques are in routine use, determining the underlying neural activity remains a challenge (Lopes da Silva, 2010). Despite its impressive spatial resolution, functional Magnetic Resonance Imaging (fMRI) measures a blood-oxygenation-level-dependent (BOLD) signal and, hence, only indirectly reflects the nearby neural activity. The interpretation of these signals is further complicated because it is sometimes unclear what aspects of "neural activity" BOLD represents. "Neural activity" could refer to spiking activity, subthreshold activation, or synaptic currents, in both excitatory and inhibitory neurons, to name a few. Although early findings suggested that BOLD was directly proportional to average neuronal firing rates (Heeger et al., 2000; Rees et al., 2000), in the cortex BOLD fMRI signals are marginally better correlated with LFPs (reflecting slow waveforms of neural activity) than with MUAs (reflecting spiking; Logothetis et al., 2001), suggesting that they may preferentially reflect inputs and intracortical processing (Viswanathan and Freeman, 2007; Rauch et al., 2008). Further work is needed to better understand what aspect of neural activity is reflected by BOLD signals in cases where there is dissociation between LFPs and the BOLD signal, as occurs in the hippocampus (as discussed in Ekstrom, 2010).

The vast majority of fMRI studies describe only the properties of BOLD signals and make only qualitative inferences about their implications for the underlying neural activity. Conversely, in electrical and magnetic forms of non-invasive brain imaging, the recorded signal derives directly from the functional activity of neurons (though with varying degrees of transmission from the neural origin to the scalp recording sites). However, accurate localization of these signals remains remarkably elusive, because the complexity of brain activation for even the simplest of tasks tends to confound attempts to resolve the local neural components contributing to the recorded scalp responses. To improve our estimates of neural signals using non-invasive brain imaging techniques, this Frontiers Research Topic invited empirical and theoretical contributions focusing on the explicit relationship of brain imaging signals to causative neural activity.

The submitted contributions responded to the challenge of neural signal estimation in a variety of ways including: advanced analyses of the neural implications of magnetoencephalographic (MEG) and electroencephalographic (EEG) signals, derivations of the pathway for BOLD signal generation from the underlying neural activation signals through animal recording, human BOLD modeling studies, detailed assessment of local BOLD response components and resting-state activation, and interpretation of the new field of functional diffusion tensor imaging in terms of neural activation.

#### Edited by:

*Russell A. Poldrack, Stanford University, USA*

#### Reviewed by:

*Kamil Uludag, Maastricht University, Netherlands*

#### \*Correspondence:

*Christopher W. Tyler cwt@ski.org*

#### Specialty section:

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience*

Received: *19 January 2016* Accepted: *11 April 2016* Published: *29 April 2016*

#### Citation:

*Tyler CW, Howarth C and Likova LT (2016) Editorial: Neural Signal Estimation in the Human Brain. Front. Neurosci. 10:185. doi: 10.3389/fnins.2016.00185*

Although EEG and MEG, which are commonly used to measure human neural activity, have high temporal resolution, spatial localization of the signal source is difficult to achieve. Cicmil et al. highlighted the limits on localizing MEG signal sources by testing the ability of several reconstruction approaches to localize the source of retinotopic MEG signals in the human brain and found that none of the approaches for assessing angular position were suitable for resolving annular stimuli spanning different retinal eccentricities (unless restricted in angular position). A second contribution to such electrical signal analysis is the time-frequency approach to source localization and functional connectivity from simultaneous MEG/EEG signals proposed by Zerouali et al. Although this analysis specifically targeted sleep spindles, the work has broader implications for the functional integration of MEG and EEG signals and their source localization within the brain. This analysis revealed that functional connectivity across the cortex evolved during the spindles from short-range intra-hemispheric connections to longer-range inter-hemispheric connections, suggesting an integrative role for these dynamic features of neural activity.

Several contributions focused on estimating the properties of the underlying neural sources that generate BOLD fMRI signals. Martin reviews the need for accurate neurovascular models of the coupling between neural activity and the local BOLD signal from animal studies. Animal studies have the striking advantage of allowing a wide variety of technical approaches to the analysis of neurovascular coupling. Martin evaluates 16 of these, from single-neuron electrophysiology to tissue oxygen voltammetry, considering both their advantages and limitations and highlighting the key areas in which our understanding of fMRI signals has been improved through the use of animal models. Howarth takes up the issue of whether cortical astrocytes (glial cells), and calcium transients within them, are involved in the vascular response to neuronal activity based on the recent debate regarding whether evoked glial calcium signals occur quickly enough to account for the dynamics of neurovascular coupling. Indeed, the exact mechanisms by which astrocytes respond to changes in neuronal activity and trigger the intracellular events regulating the resulting vascular response underlying the fMRI BOLD signal remain unclear. To take an analytic approach to this question, Tyler et al. evaluate four models for the neurovascular coupling between local field potentials recorded in cortex and BOLD signals recorded simultaneously in an adjacent location, for a range of stimulus durations. The results imply that the BOLD response is most closely coupled with metabolic demand derived from the neuronal input waveform, suggesting that the astrocytic signaling is responsive to the neurotransmitter metabolism of the dendritic arborization rather than to the neuron's spiking activity.

Further studies focus on contributions to the positive and negative components of the neurovascular relationships. Buxton et al. assess the coupling ratio of blood flow and oxygen metabolism to different kinds of neural activation, finding that blood flow variations are more closely coupled with stimulus-driven variations than with endogenous variations in neural activity (e.g., those driven by attention, adaptation, and generalized excitability). Variations in oxygen metabolism, on the other hand, are more closely coupled with endogenous neural variations. The authors suggest that these differences in coupling ratio reflect differential proportions of excitatory and inhibitory contributions of the neural signal to cortical BOLD signals, and hence provide a new window into the assessment of neural activity. A related topic is addressed by Chen, who uses stimulus-driven manipulations of activation and suppression to assess the excitatory and inhibitory contributions to the evoked BOLD signal. The stimuli were designed to have invariant local effects, but differential long-range interactions were found according to configural relationships of local orientations, which should produce no differences in BOLD signal in the absence of neural interactions. One component of the BOLD suppression was dependent on the orientation-specific inhibitory effect of the long-range interactions, while a second appeared to be a general negative BOLD response to adjacent contrast stimulation independent of the stimulus configuration. Thus, BOLD response properties can be used to identify targeted aspects of the underlying neural organization.

Three papers focus on advanced methods of decomposing the neural connectivity and reorganization in the brain from the distribution of BOLD signals. Gonzalez-Castillo et al. take the novel approach of analyzing the time-course of resting-state BOLD signals across the cortex to assess the stability of neural connectivity. The most stable connections were between homologous (symmetric) interhemispheric local regions, with stability persisting for several minutes. The more variable connections were found to correspond primarily to occipitofrontal connections across the traditional resting-state networks, which can be interpreted as corresponding to transient visual imagery. Gravel et al. take resting-state analysis a step further to develop the concept of local cortical connective fields. These are neural organizations analogous to neuronal receptive fields, but defined in terms of connectivity among cortical regions, rather than connectivity of the neuron to a sensory surface. In combination with the population receptive field mapping developed by this group for the analysis of the visual cortex, resting-state BOLD connectivity can be interpreted in visual space. This approach allows visuotopic maps to be reconstructed using resting-state data recorded in the visual cortex, enabling these authors to show that the local resting-state connectivity from visual area V1 to both V2 and V3 was invariant with eccentricity with a scale of ∼2 mm, substantially smaller than the population receptive fields for visual input in these cortical areas. This work suggests that it is possible to obtain some neural properties from resting-state fMRI data.

Concentrating on the example of motor learning, Yang et al. extend the analysis of BOLD activation maps. Learning may generate not only changes in the strength of activation in predefined regions of interest, but also changes in the spatial distribution of the activation across the cortex. To address this issue, the authors measure the changes in spatial distribution of activation following a simple motor learning task. Dimension reduction via singular-value decomposition was able to capture aspects of the neural reorganization produced by this form of motor learning. These findings validate the capability of computational modeling to determine properties of neural connectivity and reorganization from BOLD signal analysis.

The final two papers are concerned with a new functional form of Diffusion Tensor Imaging (DTI). DTI is a well-established technique for assessing the anatomical organization of the fiber pathways in vivo from the local anisotropy of the diffusion directions of water molecules within brain tissue. Functional DTI, on the other hand, assesses changes in this kind of anisotropy as a result of some functional manipulation of the state of the brain. Autio and Roberts raise concerns about contamination of this form of functional analysis by leakage of BOLD signal activation from adjacent gray matter into the voxels designated as fiber pathways. Mandl et al., whose previous paper on functional changes in fractional anisotropy in the optic radiations during visual stimulation was the subject of the Autio and Roberts critique, argue that such partial voluming would only occur at the ends of fiber tracts where they meet with the cortical regions that they are connecting, whereas the reported changes in fractional anisotropy occurred throughout the tracts.

In summary, functional imaging techniques are increasingly used to infer neural activity within the human brain. This special issue improves our ability to estimate these neural signals noninvasively and points us in the direction of the remaining issues that must be addressed before we can fully understand functional imaging signals.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication.

# FUNDING

CH was a Vice Chancellor's Advanced Fellow at the University of Sheffield and currently holds a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (grant number: 105586/Z/14/Z).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Tyler, Howarth and Likova. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Localization of MEG human brain responses to retinotopic visual stimuli with contrasting source reconstruction approaches

#### *Nela Cicmil<sup>1</sup>\*, Holly Bridge<sup>2</sup>, Andrew J. Parker<sup>1</sup>, Mark W. Woolrich<sup>2,3</sup> and Kristine Krug<sup>1</sup>*

*<sup>1</sup> Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK*

*<sup>2</sup> Nuffield Department of Clinical Neuroscience, FMRIB Centre, John Radcliffe Hospital, University of Oxford, Oxford, UK*

*<sup>3</sup> Department of Psychiatry, Oxford Centre for Human Brain Activity, Warneford Hospital, University of Oxford, Oxford, UK*

#### *Edited by:*

*Christopher W. Tyler, Smith-Kettlewell Institute, USA*

#### *Reviewed by:*

*Xi-Nian Zuo, Chinese Academy of Sciences, China Kevin C. Chan, University of Pittsburgh, USA*

#### *\*Correspondence:*

*Nela Cicmil, Department of Physiology, Anatomy and Genetics, University of Oxford, Sherrington Building, Parks Road, Oxford, OX1 3PT, UK e-mail: nela.cicmil@live.com*

Magnetoencephalography (MEG) allows the physiological recording of human brain activity at high temporal resolution. However, spatial localization of the source of the MEG signal is an ill-posed problem as the signal alone cannot constrain a unique solution and additional prior assumptions must be enforced. An adequate source reconstruction method for investigating the human visual system should place the sources of early visual activity in known locations in the occipital cortex. We localized sources of retinotopic MEG signals from the human brain with contrasting reconstruction approaches (minimum norm, multiple sparse priors, and beamformer) and compared these to the visual retinotopic map obtained with fMRI in the same individuals. When reconstructing brain responses to visual stimuli that differed by angular position, we found reliable localization to the appropriate retinotopic visual field quadrant by a minimum norm approach and by beamforming. Retinotopic map eccentricity in accordance with the fMRI map could not consistently be localized using an annular stimulus with any reconstruction method, but confining eccentricity stimuli to one visual field quadrant resulted in significant improvement with the minimum norm. These results inform the application of source analysis approaches for future MEG studies of the visual system, and indicate some current limits on localization accuracy of MEG signals.

**Keywords: magnetoencephalography (MEG), brain imaging, source localization, retinotopy, vision (ocular), fMRI**

### **INTRODUCTION**

Magnetoencephalography (MEG) measures magnetic fields emitted by neuronal electrical activity and thus allows the noninvasive recording of neuronal signals with millisecond temporal resolution (Hämäläinen et al., 1993). MEG has the potential to extend findings from electrophysiological studies in the visual systems of animals by recording neuronal activity across the whole brain in human viewers as they respond to visual stimuli. The high temporal resolution of MEG can complement results from functional MRI (fMRI), a human neuroimaging method that has good spatial resolution (approximately 1 mm) but provides an indirect measure of neuronal function with low temporal resolution relative to neuronal spiking activity (Logothetis et al., 2001; Logothetis and Wandell, 2004).

Although the magnetic fields measured by MEG pass through brain, skull and skin with minimal smearing [in contrast to the electrical potentials measured by electroencephalography (EEG)], localization of brain sources of MEG signals remains an ill-posed problem. The number of independent measurements of the signal is limited to a few hundred sensors, whilst the number of possible spatial configurations of cortical sources giving rise to that signal is several orders of magnitude greater; hence, MEG measurements alone cannot constrain a unique solution to the inverse problem of source reconstruction (Hämäläinen et al., 1993).
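The ill-posedness described above can be stated compactly with a linear forward model. The notation is an assumption introduced here for illustration and does not appear in the original text: $\mathbf{b}$ is the vector of $M$ sensor measurements at one time point, $\mathbf{j}$ the vector of $N$ candidate source amplitudes, $\mathbf{L}$ the $M \times N$ lead-field matrix, and $\mathbf{n}$ the sensor noise.

$$
\mathbf{b} = \mathbf{L}\,\mathbf{j} + \mathbf{n}, \qquad M \ll N .
$$

Because $M \ll N$, the matrix $\mathbf{L}$ has a non-trivial null space: for any $\mathbf{j}_0$ with $\mathbf{L}\,\mathbf{j}_0 = \mathbf{0}$, the source patterns $\mathbf{j}$ and $\mathbf{j} + \mathbf{j}_0$ produce identical measurements, so the data alone cannot distinguish between them.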

A current approach to overcome this limitation is to impose prior constraints on the source solution, informed by assumptions about the brain activity patterns that give rise to the MEG signal. Different approaches to source reconstruction have been developed, incorporating different prior assumptions. The minimum norm estimate constrains the source solution by requiring that absolute activity amplitudes across the brain be as small as possible on average (Dale and Sereno, 1993; Hämäläinen and Ilmoniemi, 1994). Additionally, sources can be limited to the cortical mantle and a depth-weighting parameter used to counter the implicit bias of these assumptions toward superficial, spatially spread currents (Lin et al., 2006). On the other hand, brain activity can be assumed to be sparse, i.e., occurring in discrete cortical "patches", which in certain tasks may have a bilaterally correlated response (Pascual-Marqui et al., 1994). These sparseness and correlation parameters can be inferred from the data using Bayesian techniques, for example in the multiple sparse priors approach (Mattout et al., 2007; Friston et al., 2008; Henson et al., 2009). Related algorithms have been the basis of other source reconstruction approaches (Moradi et al., 2003; Poghosyan and Ioannides, 2007; Cottereau et al., 2011).
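The minimum norm idea can be illustrated with a toy example. The sketch below is not the implementation used in any of the cited software packages: the lead field, the data, and all function names are hypothetical, the problem is reduced to 2 sensors and 3 sources, and depth weighting and cortical constraints are omitted. It only shows that, among the many source patterns that reproduce the same measurements, the minimum norm estimate picks the one with the smallest overall amplitude.

```python
"""Toy minimum norm estimate for an underdetermined linear forward model.

All numbers are made up for illustration; nothing here comes from the study.
"""


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def inverse_2x2(m):
    """Invert a 2 x 2 matrix (sufficient for this 2-sensor toy problem)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def minimum_norm(lead_field, data, reg=0.0):
    """Return j = L^T (L L^T + reg * I)^-1 b for a 2-sensor lead field."""
    lt = transpose(lead_field)
    gram = matmul(lead_field, lt)          # L L^T, a 2 x 2 matrix
    for i in range(2):
        gram[i][i] += reg                  # optional Tikhonov regularization
    b_col = [[v] for v in data]
    j_col = matmul(lt, matmul(inverse_2x2(gram), b_col))
    return [row[0] for row in j_col]


def squared_norm(v):
    """Sum of squared amplitudes of a source pattern."""
    return sum(x * x for x in v)


# Hypothetical lead field: 2 sensors (rows) x 3 sources (columns).
L = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 1.0]                             # hypothetical sensor data

j_mne = minimum_norm(L, b)

# Two other source patterns that explain the same data exactly.
j_alt_a = [1.0, 1.0, 0.0]
j_alt_b = [0.0, 0.0, 1.0]
```

Here `j_mne`, `j_alt_a`, and `j_alt_b` all reproduce `b` when multiplied by `L`, but `squared_norm(j_mne)` is the smallest of the three. The example also shows the cost of the assumption: activity is spread across all three sources even if only the third was truly active.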

Alternatively, a spatial filtering algorithm known as beamforming can be employed to estimate the time-course of activity at each source location, independently of all other sources, and can be extended to evaluate signals within a frequency band of interest (van Veen et al., 1997; Robinson and Vrba, 1999; Barnes et al., 2006; Hillebrand and Barnes, 2011). Neuronal responses may oscillate at a particular frequency due to the internal properties of the processing networks involved (Wang, 2010), or a rhythmic change in the presented stimulus can evoke brain responses in a particular frequency band (Cottereau et al., 2011). In both cases, such frequency-related information can be used to focus source analysis onto a subspace of the measured MEG signal.
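As a sketch of the spatial filtering idea, one common textbook form of the linearly constrained minimum variance (LCMV) beamformer for a source with fixed orientation is given below. The symbols are assumptions introduced here: $\mathbf{l}_\theta$ is the lead-field vector for source location $\theta$, $\mathbf{C}$ is the sensor data covariance matrix (optionally computed from band-pass filtered data), and $\mathbf{b}(t)$ is the vector of sensor measurements at time $t$. The exact variant implemented in the software used in this study may differ.

$$
\mathbf{w}_\theta = \frac{\mathbf{C}^{-1}\,\mathbf{l}_\theta}{\mathbf{l}_\theta^{\top}\,\mathbf{C}^{-1}\,\mathbf{l}_\theta},
\qquad
\hat{s}_\theta(t) = \mathbf{w}_\theta^{\top}\,\mathbf{b}(t).
$$

These weights minimize the output variance $\mathbf{w}_\theta^{\top}\mathbf{C}\,\mathbf{w}_\theta$ subject to the unit-gain constraint $\mathbf{w}_\theta^{\top}\mathbf{l}_\theta = 1$, so each location is estimated separately from all others. Computing $\mathbf{C}$ from data restricted to a frequency band of interest focuses the filter on that band.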

For visual neuroscience research, MEG source reconstruction methods should assign sources of early visual responses to occipital cortex and resolve activity arising from different occipital locations. However, with many contrasting reconstruction approaches available, it is not yet clear which prior assumptions are most appropriate for localizing MEG signals arising from the human visual system, specifically those from early cortical visual areas V1, V2, and V3.

The current gold standard for high spatial resolution of human visual brain activity is fMRI, which has been used to identify the retinotopic boundaries between visual areas, allowing comparison of responses along the visual hierarchy (Engel et al., 1994; Sereno et al., 1995; DeYoe et al., 1996; Wandell et al., 2005). Retinotopic mapping in early visual cortical areas of the human brain follows well-established patterns. In angular retinotopy, upper visual field locations are represented in ventral subregions of early visual areas, whilst lower visual field locations are represented in dorsal subregions. Left and right visual field locations are represented in the respective contralateral cortical hemispheres. For visual field eccentricity, the foveal region is represented at the occipital pole and representations of increasingly peripheral locations radiate anteriorly (Engel et al., 1994; DeYoe et al., 1996; Wandell et al., 2005). Comparison of the sources of the MEG signals of visual brain responses, as reconstructed by different reconstruction approaches, to fMRI retinotopic maps or regions of interest (ROIs) in the same individual should reveal which approaches can accurately localize signals arising from the visual system.
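The angular retinotopy rules stated above can be written as a small lookup, which is convenient when checking whether a reconstructed source falls in the expected cortical quadrant. The function name and the string labels are hypothetical and chosen only for this illustration.

```python
def expected_cortical_quadrant(vertical, horizontal):
    """Map a visual field quadrant to its expected early-visual-cortex location.

    vertical:   "upper" or "lower" visual field
    horizontal: "left" or "right" visual field
    Returns a (hemisphere, subregion) tuple.
    """
    # Upper visual field -> ventral subregions; lower -> dorsal subregions.
    subregion = {"upper": "ventral", "lower": "dorsal"}[vertical]
    # Each visual hemifield is represented in the contralateral hemisphere.
    hemisphere = {"left": "right", "right": "left"}[horizontal]
    return hemisphere, subregion


# Example: a stimulus in the upper-left visual field quadrant.
upper_left = expected_cortical_quadrant("upper", "left")
```

For the upper-left quadrant this returns the right hemisphere and the ventral subregion, in line with the mapping described in the text.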

A number of studies that have evaluated MEG source reconstruction methods have compared the reconstruction of simulated electromagnetic data to their assumed sources (Hämäläinen and Ilmoniemi, 1994; Hauk, 2004; Lin et al., 2006; Trujillo-Barreto et al., 2008; Hillebrand and Barnes, 2011) and/or quantified goodness of reconstruction with a fitness measure such as model evidence rather than source localization accuracy (Mattout et al., 2007; Friston et al., 2008; Henson et al., 2009). A few studies have evaluated localization accuracy of one specific MEG source reconstruction method for real recorded visual responses, by comparing the source locations either to individuals' fMRI maps (Moradi et al., 2003; Poghosyan and Ioannides, 2007; Sharon et al., 2007; Cottereau et al., 2011) or to indirect indicators of retinotopic mapping, such as anatomical landmarks (Brookes et al., 2010; Perry et al., 2011).

We further this approach by reconstructing, for the first time, the sources of real recorded MEG signals from human viewers with three contrasting localization approaches and evaluating these reconstructions against fMRI retinotopic maps from the same individuals. Source localizations of responses to stimuli that differed either in angular retinotopy or eccentricity were compared to their independently established cortical locations in early visual areas V1, V2, V3, and V3A, defined for the individual participants by fMRI. We used large stimuli and assessed the accuracy of the extent of cortical activations rather than just one focal point in early visual areas. We focused on three methods included in freely available software packages: minimum norm (Minimum Norm Estimate, MGH/MIT Martinos Centre for Biomedical Imaging; Dale et al., 2000; Gramfort et al., 2014), multiple sparse priors (MSP in SPM8 software, FIL Methods Group, UCL; Litvak et al., 2011), and beamforming (adapted from SPM8 to work with Elekta Neuromag data; Woolrich et al., 2011). The beamformer was applied separately to early visual evoked responses and to ongoing oscillatory responses related to the stimulus flicker rate; minimum norm and multiple sparse priors were used to reconstruct early evoked responses only. A number of recent studies have incorporated information from fMRI retinotopic mapping to aid the localization of the MEG signal by placing spatial priors on the source solutions (Yoshioka et al., 2008; Hagler et al., 2009; Cottereau et al., 2012; Hagler and Dale, 2013). In contrast, our investigation focused on the reconstruction of sources from MEG signals alone, so the individual fMRI map provided an independent localization comparison.

Any justification for a combination of MEG and fMRI data needs to be based on a clear understanding of the contribution of each signal to the combined estimate. Our contribution here is based upon analyzing the quality of spatial localization of the MEG signal, using current standard methods.

#### **MATERIALS AND METHODS**

#### **PARTICIPANTS**

Eight participants (6 female, 2 male; mean age 31.4 ± 12.6 years, range 22–58 years) took part in the experiment, although not all participants completed all measurements. Further details are given later. All participants had normal, or corrected to normal, visual acuity. The participants had no neurological or psychiatric illness, no brain injury, and were not taking any medications that might affect the nervous system. The research was approved by the University of Oxford's Central University Research Ethics Committee (CUREC), in accordance with the regulatory standards of the Code of Ethics of the World Medical Association (Declaration of Helsinki). Written informed consent was obtained from all participants who were not investigators of the project.

#### **MEG RETINOTOPY**

#### *Data collection and pre-processing*

*Stimuli.* Visual stimuli were projected onto a back-projection screen in the MEG scanner in front of the participant with a Panasonic® DLP (Digital Light Processing) based projector (PT-D7700E). Refresh rate was 60 Hz (all MEG data were lowpass filtered at 40 Hz prior to source reconstruction, see below). Distance between viewers' eyes and screen was 1500 mm and projected screen size was 390 × 290 mm, corresponding to 14.8 × 11.0° of visual angle. Accurate stimulus onset times were recorded with a photodiode (sampling rate 1000 Hz) placed over a small black square (8 × 8 mm) located in the bottom-left corner of the stimulus screen; this square flashed to white for 100 ms on the first frame of each stimulus onset (the photodiode blocked this flash from being seen by the participant). Participants passively viewed stimuli whilst maintaining central fixation.
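The stated visual angle follows from the screen size and viewing distance. The short check below is a sketch only: the function name is illustrative, and it assumes the screen is centered on the line of sight.

```python
import math

def visual_angle_deg(size_mm: float, distance_mm: float) -> float:
    """Full visual angle (degrees) subtended by an extent centered on the line of sight."""
    return math.degrees(2.0 * math.atan((size_mm / 2.0) / distance_mm))

# Values from the text: 390 x 290 mm projected screen viewed from 1500 mm.
width_deg = visual_angle_deg(390.0, 1500.0)
height_deg = visual_angle_deg(290.0, 1500.0)
print(f"{width_deg:.1f} x {height_deg:.1f} deg")
```

Both values round to the figures quoted above (14.8 and 11.0 degrees).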

Black-and-white checkerboard quadrant stimuli were presented to 6 participants with a Cambridge Research Systems VSG 2/5 graphics generator run with a Dell laptop (Subjects 1–4), or with Presentation® (Neurobehavioral Systems, Inc.) running on a Samsung R710 laptop (Centrino 2 P7450 processor, nVIDIA GeForce 9300M graphics card) (Subjects 5 and 6). Stimulus parameters were identical in both set-ups. Each quadrant extended 0–5.4° eccentricity, presented either in the upper left (UL), upper right (UR), lower left (LL) or lower right (LR) visual field. Quadrants contained 6 checks along the radius and the arc, decreasing in size by a factor of 1/*d*, where *d* is distance to apex. A black fixation point (radius 0.25°) was present at the apex. Each stimulus was presented for 1000 ms with no inter-stimulus interval. Each block of quadrant stimuli consisted of 25 full-cycle rotations (UR, UL, LL, LR positions). Six blocks were collected per participant.

Black-and-white checkerboard concentric ring and quarter-ring stimuli were presented with Presentation® software, as above, for all participants. Rings had 12 checks around the circumference and 3 checks along the radius, and were presented at three eccentricities: ECC 1 (0–0.75°), ECC 2 (1.0–2.0°), and ECC 3 (3.0–5.4°). These eccentricity bands were selected to activate areas of similar size across cortex according to foveal magnification ratios, and extend approximately 3 cm into the calcarine sulcus; doubling maximum ring size would have further increased this extent by approximately 1 cm only (Wandell et al., 2005; Horton, 2006). Quarter-rings were formed from rings by masking out all but either the upper right or lower right quadrant of the visual field, resulting in 6 quarter-ring stimuli (upper right: U-ECC 1, U-ECC 2, and U-ECC 3; lower right: L-ECC 1, L-ECC 2, and L-ECC 3). Ring and quarter-ring stimuli were presented for 1000 ms in a pseudo-randomized order with a variable inter-stimulus interval of 600, 800, or 1000 ms (selected pseudo-randomly). Datasets for rings were recorded for 7 participants (Subjects 1–3 and 5–8) with 4 blocks of 150 stimuli per participant. Datasets for quarter-rings were recorded for 5 participants (Subjects 1–2 and 6–8) with 5 blocks of 180 stimuli per participant.

All stimuli cycled through complete black-to-white-to-black or white-to-black-to-white contrast reversal at a rate of 4 Hz, i.e., the presented checkerboard pattern changed every 125 ms. This induces oscillatory brain responses at the second harmonic, a rate of 8 Hz. Stimuli were presented on a mid-gray background (mean luminance, 25 cd/m<sup>2</sup>); Michelson contrast was 99%.

*MEG scanner and data acquisition.* MEG data were collected with an Elekta Neuromag VectorView® MEG scanner at the Oxford Centre for Human Brain Activity (OHBA), Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford, U.K. The scanner comprises 306 MEG-channel sensors (102 magnetometers, 204 planar gradiometers). Sensors were tuned prior to each MEG recording session to limit noise levels to approximately 2.5 fT/cm. Sensors that became very noisy during a recording block would be individually re-tuned at the next inter-block break, using the Neuromag automatized heating process or by eye, as necessary. Continuous MEG data were recorded at 1000 Hz sampling rate (0.3–330 Hz bandpass filter). Prior to data acquisition, all metal and other potential sources of electromagnetic interference were removed from participants. Quality of recording was confirmed by visual inspection of 1–2 min of MEG recording during quiet sitting prior to the start of the experiment. Electro-oculogram (EOG) and electrocardiogram (ECG) timeseries were recorded simultaneously with MEG to track potential noise sources and artifacts. Four head position indicator (HPI) coils were attached to the participant's head and a Polhemus stylus and digitizer device were used to record the locations of fiducial points (right and left pre-auricular points (RPA, LPA) and nasion), the HPI coils, and between 40 and 80 extra digitizer points on the head surface. Prior to the recording of each stimulus block, head location in the scanner was measured with an automatic process that detected the coils. Continuous HPI recorded any head movements during data acquisition.

*Preprocessing and HPI correction.* Data were preprocessed with Elekta Neuromag® MaxFilter software (version 2.1, May 2009). MaxFilter software reduces noise in the data by suppressing magnetic interference coming from outside and inside the sensor array, using signal-space separation (SSS). The MaxMove subcommand was used to spatially co-register MEG recordings across blocks to the median head position for each individual. MaxMove continuous HPI movement compensation was also applied. Data were then epoched according to the onset of each visual stimulus (−500 to 1000 ms peri-onset).

*Artifact removal.* MEG channels with constant high noise levels as identified by visual inspection were rejected from further analysis. A maximum of two such channels was removed per participant and scan. Eye-related artifacts such as blinks were identified as deviations in the EOG recording trace. Epochs containing artifacts arising from the eyes or intermittent sensor noise were removed from further analysis. Peak-to-peak threshold for removal of eye blinks and overt eye movements was within the range 100–200 × 10<sup>−6</sup> V. Maximum noise level threshold for magnetometer and gradiometer activity was within the range 2–3 × 10<sup>−12</sup> T and 1.5–2 × 10<sup>−10</sup> T/m, respectively. In both cases, the specific threshold depended on the artifact amplitudes recorded for each individual. After artifact removal, in all cases there remained at least 95 trials per stimulus per participant.
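As an illustration of amplitude-based epoch rejection, the sketch below drops any epoch in which a channel type exceeds its limit. It is not the pipeline used in the study: the limits are hypothetical values picked from within the ranges above, the toy epochs are invented, and treating the MEG noise limits as peak-to-peak amplitudes is an assumption.

```python
from typing import Dict, List

def peak_to_peak(samples: List[float]) -> float:
    """Difference between the largest and smallest sample of one trace."""
    return max(samples) - min(samples)

def keep_epoch(epoch: Dict[str, List[float]], limits: Dict[str, float]) -> bool:
    """True if every channel type stays at or below its peak-to-peak limit."""
    return all(peak_to_peak(epoch[ch]) <= lim for ch, lim in limits.items())

# Hypothetical limits chosen inside the ranges given in the text (V, T, T/m).
limits = {"eog": 150e-6, "mag": 2.5e-12, "grad": 1.75e-10}

clean_mag = [0.0, 0.5e-12, -0.4e-12]
clean_grad = [0.0, 3e-11, -2e-11]
epochs = [
    {"eog": [0.0, 20e-6, -10e-6], "mag": clean_mag, "grad": clean_grad},    # clean
    {"eog": [0.0, 180e-6, -40e-6], "mag": clean_mag, "grad": clean_grad},   # blink-like EOG
    {"eog": [0.0, 20e-6, -10e-6], "mag": [0.0, 2.0e-12, -1.5e-12], "grad": clean_grad},  # noisy sensor
]

kept = [i for i, ep in enumerate(epochs) if keep_epoch(ep, limits)]
print(kept)
```

Only the first toy epoch survives; the second fails the EOG limit and the third the magnetometer limit.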

#### *Source reconstruction of MEG signals*

Brain sources of MEG signals were localized using three different reconstruction approaches. The following sections detail the source space configurations, reconstruction approaches, and statistical methods used. **Table 1** provides a summary of these details along with the resultant localization accuracies for responses to quadrant stimuli.

*Anatomical MRI data collection.* Anatomical magnetic resonance imaging (aMRI) data were collected with a 3.0 Tesla TIM Trio scanner, located at the University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR). One T1 scan was taken for each participant using a standard structural magnetization-prepared rapid gradient echo (MPRAGE) sequence (130 Hz/pixel, flip angle = 8°, TR/TE/TI = 2040 ms/4.7 ms/900 ms). Orientation of scan acquisition was transverse (192 × 1 mm slices) with an in-plane resolution of 1 × 1 mm.

**Table 1 | Source reconstruction method details for all localization accuracy comparisons.**

*\*Gaussian-weighted average over the time period.*

*Source space modeling and HPI-MRI alignment.* Individuals' anatomical surfaces, to which MEG data were co-registered, were created from the aMRI data with the Freesurfer software *recon-all* process (default parameters) (http://surfer.nmr.mgh.harvard.edu; Dale et al., 1999; Fischl et al., 1999). Correct segmentation of white/gray matter for cortical surfaces was confirmed by eye. FreeSurfer's *watershed* algorithm was used to reconstruct the inner skull, outer skull and outer skin surfaces from the individuals' aMRI data and to estimate the boundary element model (BEM) compartments. BEM compartments are used to specify the model for the electrical conductivity geometry of the head. A "single shell" forward model based upon this BEM was used in all source reconstruction methods.

Minimum norm reconstructions were implemented with MNE software (see Minimum norm estimate (MNE) reconstruction), which creates each individual's source space based upon each individual's cortical surface. Individuals' source spaces contained 10242 sources per hemisphere (corresponding to 3.1 mm source spacing) for all participants except Subjects 2, 3, and 4, for whom the anatomical scan and cortical surface reconstructions permitted a maximum of 4098 sources per hemisphere (corresponding to 4.9 mm source spacing). The specific resolution for each individual was limited by the *mne\_setup\_source\_space* command, which constructs the triangulated dipole grid from the reconstructed white matter surface, in the MNE analysis pipeline. Source reconstruction with multiple sparse priors assumptions was implemented with SPM8 software (see Multiple sparse priors (MSP) reconstruction). This software constructs the cortical surface meshes for the source space by inverse normalization of the canonical mesh derived from the MNI152 template brain (Mattout et al., 2007; Henson et al., 2009). These source spaces contain 4098 sources per hemisphere, corresponding to a source spacing of approximately 4.9 mm (as advised by SPM8 Manual, Section 14.3, Source space modeling, p. 121). Beamformer source reconstruction did not confine activity to the cortical mesh but estimated it within the cranial volume. A source spacing of 4 mm was selected to lie reasonably within the range of resolutions utilized within the other reconstruction approaches. **Table 1** lists the source space used for each reconstruction approach.
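The relation between source count and source spacing can be approximated by treating each source as covering an equal patch of cortex. The sketch below assumes a cortical surface area of 1000 cm<sup>2</sup> per hemisphere, which is a nominal figure and not a measurement from these participants; the function name is illustrative.

```python
import math

# Assumption: nominal cortical surface area of 1000 cm^2 per hemisphere, in mm^2.
HEMI_AREA_MM2 = 1000.0 * 100.0

def approx_spacing_mm(n_sources: int, area_mm2: float = HEMI_AREA_MM2) -> float:
    """Approximate source spacing as the side of the square patch owned by one source."""
    return math.sqrt(area_mm2 / n_sources)

spacings = {n: round(approx_spacing_mm(n), 1) for n in (10242, 4098)}
print(spacings)
```

Under this assumption the two source counts used here give approximately 3.1 mm and 4.9 mm, matching the spacings quoted above.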

Digitized fiducial points, HPI coils and remaining digitizer points were used to align the coordinate frame of the MEG data and the structural MRI data. Locations of fiducial points were first specified on the aMRI volume and an automatic alignment procedure, using an iterative closest point algorithm (ICP), non-linearly converged the frames to optimal alignment. The beamformer utilized the same co-registration as created in the SPM8 software for the multiple sparse priors method. Coregistration for the minimum norm reconstruction was run in the MNE software package using identical positional information and equivalent ICP alignment.

*First response peak (FRP).* The time window for all source reconstructions of the early evoked response was centered on 83 ms, representing the ascension of the FRP, which was qualitatively determined by eye. This FRP was used for all participants except for Subject 7, for whom 93 ms was used, as evoked responses for this participant were 10 ms slower to rise.

*Minimum norm estimate (MNE) reconstruction.* Data were analyzed with MNE software (Minimum Norm Estimate, MGH/MIT Martinos Centre for BioMedical Imaging; Hämäläinen and Ilmoniemi, 1994; Dale et al., 2000; Gramfort et al., 2014), time-locked to stimulus onset and averaged. A noise covariance matrix (NCM) was calculated from -500 to 0 ms prior to each stimulus onset; for quadrant stimuli, this necessarily comprised the final 500 ms of the previous stimulus presentation. Source reconstructions were performed on data bandpass filtered 1–40 Hz for 0–1000 ms post-stimulus, combining magnetometer and gradiometer measurements. Anatomically constrained dynamic statistical parametric mapping (dSPM) inverse solutions (based upon F-statistics calculated using baseline variance estimates) were generated at each cortical vertex (Dale et al., 2000). These dSPM source estimates were averaged across a 20 ms time window, centered on the FRP.
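For orientation, the minimum norm and dSPM estimators can be written in their standard textbook form (following Dale et al., 2000). The symbols are introduced here for illustration and are not taken from the study: $\mathbf{y}(t)$ is the sensor data, $\mathbf{L}$ the lead field, $\mathbf{R}$ the source covariance prior, $\mathbf{C}$ the noise covariance, and $\lambda$ a regularization parameter whose value is not stated in the text.

$$
\hat{\mathbf{j}}(t) = \mathbf{W}\,\mathbf{y}(t), \qquad
\mathbf{W} = \mathbf{R}\,\mathbf{L}^{\top}\left(\mathbf{L}\,\mathbf{R}\,\mathbf{L}^{\top} + \lambda^{2}\,\mathbf{C}\right)^{-1}
$$

$$
z_{i}(t) = \frac{\hat{j}_{i}(t)}{\sqrt{\left[\mathbf{W}\,\mathbf{C}\,\mathbf{W}^{\top}\right]_{ii}}}
$$

The second expression is the noise normalization that turns the current estimate at source $i$ into a dSPM statistic. When source orientation is left free, the squared normalized components at each location are combined, which yields an F-like statistic of the kind reported here.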

*Beamformer (early evoked response).* Data were analyzed with an LCMV (linearly constrained minimum variance) beamformer (adapted from SPM8 to work with SSS MaxFiltered Elekta Neuromag data; Woolrich et al., 2011), using lead fields calculated from the SPM8 neuroimaging analysis package (FIL Methods Group, UCL; Friston et al., 2008; Litvak et al., 2011). The beamformer data covariance matrix and weights were averaged over all trials, and used to produce separate reconstructed sources for each trial. These were then combined in a trial-wise General Linear Model to produce a t-statistic for each source location. For quadrant stimuli, the t-statistic described the trial-wise difference between responses to a particular quadrant compared to the other quadrants, as no inter-stimulus interval baseline was available. For rings and quarter-rings, the t-statistic described the difference between responses to the stimulus vs. average baseline activity −250 to 0 ms prior to stimulus onset. Sources were reconstructed for 0–1000 ms post-stimulus, bandpass filtered at 1–40 Hz, combining magnetometers and gradiometers. Resultant t-statistic images were averaged across a time window of 20 ms, centered on the FRP.
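In its common unit-gain scalar form, an LCMV beamformer estimates activity at location $\mathbf{r}$ with a spatial filter built from the data covariance. The notation below is illustrative; the source orientation and covariance regularization used by the adapted SPM8 implementation are not specified in the text.

$$
\mathbf{w}(\mathbf{r}) = \frac{\mathbf{C}_{y}^{-1}\,\mathbf{l}(\mathbf{r})}{\mathbf{l}(\mathbf{r})^{\top}\,\mathbf{C}_{y}^{-1}\,\mathbf{l}(\mathbf{r})}, \qquad
\hat{s}(\mathbf{r}, t) = \mathbf{w}(\mathbf{r})^{\top}\,\mathbf{y}(t)
$$

Here $\mathbf{C}_{y}$ is the data covariance matrix, $\mathbf{l}(\mathbf{r})$ the lead field for a source at $\mathbf{r}$, and $\mathbf{y}(t)$ the sensor data. The filter passes activity from $\mathbf{r}$ with unit gain, since $\mathbf{w}(\mathbf{r})^{\top}\mathbf{l}(\mathbf{r}) = 1$, while minimizing output variance from all other sources.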

*Beamformer (time-frequency).* Time-frequency decomposition source analysis was performed within the 7–9 Hz frequency band, centered on 8 Hz, i.e., the 2nd harmonic of the stimulus contrast-reversal frequency. The 2nd harmonic is used because each contrast-reversal cycle of the stimulus involves two contrast changes (from black to white then white to black) and visual brain areas respond to each such change (Campbell and Kulikowski, 1972; Cottereau et al., 2011). A pilot frequency decomposition analysis on sensor activity confirmed this band contained the greatest power. A time window of 200–1000 ms was selected for source reconstruction to avoid the FRP yet utilize maximum available data for reconstruction. Resultant t-statistic images were averaged across the time window. All other parameters were identical to the initial evoked response beamformer analysis above.
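The reasoning behind the 8 Hz band can be illustrated with a toy simulation. The sketch assumes that every contrast change evokes the same transient regardless of polarity; the transient shape is invented for illustration and is not fitted to any recorded data.

```python
import cmath
import math

FS_HZ = 1000        # MEG sampling rate from the text
REVERSAL_MS = 125   # pattern changes every 125 ms (4 Hz full-cycle reversal rate)
N_SAMPLES = 1000    # one second of simulated data

def transient(t_ms: int) -> float:
    """Hypothetical response to a single contrast change (arbitrary units)."""
    return t_ms * math.exp(-t_ms / 20.0)

# The same transient restarts at every contrast change, so the signal repeats every 125 ms.
signal = [transient(n % REVERSAL_MS) for n in range(N_SAMPLES)]

def power_at(freq_hz: float, x: list) -> float:
    """Squared magnitude of the normalized DFT coefficient at a single frequency."""
    coeff = sum(v * cmath.exp(-2j * math.pi * freq_hz * n / FS_HZ)
                for n, v in enumerate(x))
    return abs(coeff) ** 2 / len(x) ** 2

powers = {f: power_at(f, signal) for f in (4, 8, 12)}
print(powers)
```

Because the simulated signal repeats every 125 ms, its power sits at 8 Hz and multiples thereof, while the 4 Hz and 12 Hz components are numerically zero.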

*Multiple sparse priors (MSP) reconstruction.* Data were analyzed with the MSP analysis algorithm available in the SPM8 M/EEG analysis package (FIL Methods Group, UCL; Friston et al., 2008; Litvak et al., 2011). MSP contains bilaterally symmetrical a priori assumptions based upon functional anatomy, which are selected or deselected by the reconstruction algorithm according to the presence or absence of bilateral correlation components in the data (Friston et al., 2008). Sources were reconstructed separately for each trial and a t-statistic was calculated across trials to indicate significance of source activity, as for the beamformer. Time window of source reconstruction was 40 ms wide, centered on the FRP, combining magnetometers and gradiometers. Source activity results were averaged over this time window, weighted by a Gaussian centered on the FRP.

The SPM8 analysis package was also used to run reconstructions with the IID (independently and identically distributed priors) reconstruction option, which corresponds to the minimum norm approach but does not incorporate the same depth weighting and anatomical constraints as the MNE software. All other factors were identical between IID and MSP reconstructions. Since the SPM8 source reconstruction procedure reconstructs variance around the mean signal, MSP and IID reconstructions were also run using a 150 ms time window (50–200 ms post-stimulus), to encompass a greater amount of the response to stimulus onset. This wider time window did not result in improved source localization accuracy (**Table 1**). Therefore the shorter time window was used for the main comparisons in the present study.

#### *Morphing 3D source images to the individual's cortical surface.*

The beamformer and MSP methods output source reconstructions in MNI152 volumetric standard space. These were converted to individuals' cortical surface format (Freesurfer) to enable comparison with individuals' fMRI retinotopic maps. The *flirt* command from FSL (FMRIB, Oxford; Jenkinson et al., 2012) generated a transformation matrix from MNI152 volumetric space to Freesurfer volumetric space and then transformed the 3D source images to Freesurfer space. Freesurfer volume images were then converted to cortical surface format with the *mri\_vol2surf* command (Freesurfer). These surface files were then morphed onto the cortical anatomy of the individual participant with *mri\_surf2surf* (Freesurfer). The correspondence between the volumetric standard space results and the native space output of the morphing procedure was carefully checked and confirmed by eye at every stage for each individual subject.

#### **FUNCTIONAL MRI RETINOTOPY**

#### *Stimuli*

Retinotopic quadrant and ring stimuli used for fMRI data collection were presented with the Cambridge Research Systems VSG 2/5 graphics generator with a Dell laptop. Visual stimulus parameters were identical to those used for MEG unless otherwise stated below. The quadrant stimulus rotated through 30° every TR (4000 ms) to produce the traveling-wave brain signals necessary for analysis with standard fMRI retinotopy software (Wandell, 1999). Similarly, concentric rings expanded every TR (4000 ms), taking 8 steps to cover the visual field 0–11.5°. Hence, although the timing of visual stimulus presentation differed between fMRI and MEG data acquisitions, identical spatial points of the stimuli in the two cases could be selected, enabling direct comparisons between the brain source locations.

#### *fMRI data acquisition*

Retinotopic fMRI data were acquired according to standard methods with a 3 Tesla whole-body Siemens TIM Trio scanner and a 12-channel receive-only head coil, located at the University of Oxford Centre for Clinical Magnetic Resonance Research (OCMR). EPI sequence parameters were: TE = 30 ms; TR = 4000 ms; 40 2-mm slices; 2 × 2 mm in-plane resolution; matrix = 64 × 64. For angular mapping, each run consisted of 6 cycles through 12 angular locations, corresponding to 72 volumes acquired continuously (288 s); 4 runs were collected. For eccentricity mapping, each run consisted of 6 cycles through 8 eccentricities, corresponding to 48 volumes (192 s); 3–4 runs were collected. A reduced (40 2-mm slices) T1-weighted image (3D FLASH) was also included in each functional session, acquired coronally at an in-plane resolution of 1 × 1 mm. These slices were in the same planes as the retinotopic functional images, and were used to register functional retinotopy data to the whole brain structural MRI.

#### *fMRI retinotopy mapping*

The fMRI retinotopic maps were generated for individual participants according to standard procedures, using either mrTools software (HeegerLab; http://www.cns.nyu.edu/heegerlab/wiki/) or mrVista software (Stanford; http://white.stanford.edu/software). Retinotopic BOLD activity maps were displayed on flat renderings of the occipito-temporal-parietal region, allowing borders between visual areas to be identified and traced. For angular retinotopy, dorsal (lower visual field) and ventral (upper visual field) subregions were defined on the left and right hemisphere for areas V1, V2, and V3. Area V3A was also defined on each hemisphere. For eccentricity, regions of interest (ROIs) representing the eccentricity bands for ECC 1, ECC 2, and ECC 3 stimuli were delineated across areas V1, V2, and V3. This fMRI retinotopic mapping procedure and combination of parameters have been used to map retinotopic visual areas across a significant number of individual subjects (Bridge and Parker, 2007; Minini et al., 2010). The definitions of areas V1–V3A according to this procedure are reliable insofar as, when combined with additional subjects, they result in a plausible probabilistic map for the location of each visual area (Bridge, 2011). On qualitative assessment, this localization of areas V1, V2, and V3 (ventral) also overlaps almost completely with probabilistic maps constructed using cytoarchitectonic, post-mortem definitions (Rottschy et al., 2007). Therefore we are confident that this mapping approach provides "ground truth" to the same extent as any currently available retinotopic mapping procedure in MRI.

#### **MEG-fMRI COMPARISONS**

#### *Source localization accuracy*

To evaluate MEG source localization accuracy relative to fMRI, we calculated the percentage of active vertices inside a particular visual cortical area that were localized to the retinotopically expected subregion of that area. The retinotopically expected subregion was defined in each individual, according to their fMRI-defined retinotopic map, and was evaluated for each stimulus. For example, to evaluate localization accuracy for an upper right (UR) quadrant in area V1, we calculated the percentage of active vertices within V1 that were localized into the left ventral subregion, which is the retinotopically expected location for that visual field stimulus. Localization accuracies for quadrant stimuli were averaged across stimuli and participants for each ROI. For rings, eccentricity band ROIs corresponding to stimulus eccentricities were defined across areas V1, V2, and V3 combined. Of the active vertices located across all the eccentricity bands, we calculated the percentage that localized into the retinotopically expected band, separately for each cortical hemisphere, for each participant. The same procedure was used for quarter-rings, where retinotopic subregions were defined by both the angular visual field location and stimulus eccentricity.
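The accuracy measure described above can be sketched with sets of vertex indices. The function name, the toy vertex indices, and the use of `None` to mark a reconstruction with no active vertices in the area are all illustrative choices, not details of the analysis code used in the study.

```python
from typing import Optional, Set

def localization_accuracy(active: Set[int], area: Set[int],
                          expected: Set[int]) -> Optional[float]:
    """Percentage of active vertices inside `area` that fall in the `expected` subregion.

    Returns None when no active vertex lies inside the area, so that the case
    can be excluded from the accuracy analysis.
    """
    in_area = active & area
    if not in_area:
        return None
    return 100.0 * len(in_area & expected) / len(in_area)

# Toy example: V1 spans vertices 0-39; its left ventral subregion spans vertices 0-9.
v1 = set(range(0, 40))
v1_left_ventral = set(range(0, 10))
active = {2, 3, 5, 7, 12, 55, 60}   # vertices 55 and 60 lie outside V1 and are ignored

acc = localization_accuracy(active, v1, v1_left_ventral)
print(acc)
```

In this toy case five active vertices lie in V1 and four of them fall in the expected subregion, giving 80%.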

As many of the resultant localization accuracy values were not normally distributed (MATLAB's Lilliefors test, *p* < 0.05), non-parametric Wilcoxon signed-rank tests were used to calculate whether the sources were significantly localized into the retinotopically expected subregions (*p* < 0.05). For quadrants, chance level was 25% for visual areas V1, V2, and V3, for which we defined four angular subregions each (dorsal and ventral subregions in the left and right hemisphere), and 50% for V3A for which we defined two subregions only (left and right hemisphere). For rings, chance level was 33% as three eccentricity bands were defined. For quarter-rings, chance level for angular localization into the retinotopic quadrant was 25% and chance level for eccentricity localization into the eccentricity band within the quadrant was 33%.
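The chance levels follow directly from the number of subregions defined for each comparison. The sketch below assumes that a randomly placed active vertex is equally likely to fall in any subregion, irrespective of subregion size; the dictionary keys are illustrative labels.

```python
def chance_level_percent(n_subregions: int) -> float:
    """Chance accuracy if active vertices fall in each subregion with equal probability."""
    return 100.0 / n_subregions

chance = {
    "quadrants_V1_V2_V3": chance_level_percent(4),  # dorsal/ventral x left/right
    "quadrants_V3A": chance_level_percent(2),       # left/right only
    "rings": chance_level_percent(3),               # three eccentricity bands
}
print(chance)
```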

Within each stimulus set, a Bonferroni multiple-comparison correction was applied to the statistical tests across visual areas and MEG reconstruction methods. A Kruskal-Wallis test was used to ascertain whether localization accuracy was different between source reconstruction methods.

If a source reconstruction approach resulted in no active vertices in the relevant early visual area for a particular stimulus and individual, this was excluded from the accuracy analyses. Aside from area V3A for quadrants (MNE: 16 rejections; beamformer: 12; MSP: 10; IID: 5; time-frequency beamformer: 14) and eccentricity reconstructions for lower-field quarter-rings (beamformer: 5 rejections), there were on average only 1 or 2 such failed localizations per participant in each stimulus set for each source reconstruction method. We recalculated all results with the failed reconstructions included (*data not shown*); this slightly reduced overall localization accuracy rates, as expected, but did not affect the overall conclusions of the study.

#### *Threshold for active vertices*

The MEG source reconstruction methods output a statistical activity value, either F (MNE) or *t* (beamformer, MSP), for each point in the source space. This reflects that, in the context of source reconstruction algorithms, each potential source location has a probabilistic contribution to the MEG sensor signal because both noise in the data and the ill-posedness of the inverse problem preclude a unique, determined solution. A cut-off threshold must therefore be designated, which controls whether or not a particular cortical vertex is considered "active" in any given source reconstruction result. A non-systematic designation may affect the retinotopic MEG-fMRI comparisons in unexpected ways and thereby render unfair the comparison between reconstruction methods.

We defined the cut-off threshold in terms of the percentage of highest-responding vertices, across the cortex, that are designated "active". For example, a threshold of 1% indicates that only vertices with activity values in the top 1% are designated "active". We systematically calculated localization accuracy (as described above) as a function of cut-off threshold for each reconstruction method and visual stimulus, across visual areas V1, V2, and V3, individually for each participant. For a given threshold, if a stimulus resulted in zero "active" vertices in early visual areas for a particular participant, the accuracy result was set to zero. The optimal threshold was defined as the cut-off threshold which produced the most accurate source reconstructions for a given reconstruction approach. We then used this optimal threshold, set independently for each participant and for each reconstruction approach, in the localization accuracy evaluations and comparisons presented in the study. Optimal thresholds, converted to statistical values, for quadrant stimuli (averaged across participants) were *F* = 27.8 (MNE), *t* = 9.7 (beamformer for evoked response), *t* = 10.9 (beamformer for 7–9 Hz time-frequency window), *t* = 4.2 (MSP) and *t* = 2.1 (IID). For ring stimuli these were *F* = 17.5 (MNE), *t* = 10.8 (beamformer for evoked response) and *t* = 6.85 (MSP). For quarter-rings, average optimum thresholds across participants were *F* = 9.0 (MNE) and *t* = 8.3 (beamformer) for angular retinotopy, and *F* = 7.2 (MNE) and *t* = 7.7 (beamformer) for eccentricity mapping.
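The threshold sweep can be sketched as follows. This is a minimal illustration on invented data: the function names, the candidate thresholds, and the toy statistic values are hypothetical, and ties between thresholds are resolved here by taking the first candidate, which the text does not specify.

```python
from typing import Dict, Iterable, Set, Tuple

def top_percent_active(values: Dict[int, float], percent: float) -> Set[int]:
    """Vertices whose statistic lies in the top `percent` of all vertices."""
    n_keep = max(1, round(len(values) * percent / 100.0))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n_keep])

def accuracy(active: Set[int], rois: Set[int], expected: Set[int]) -> float:
    """Percentage of active ROI vertices in the expected subregion (0 if none are active)."""
    in_rois = active & rois
    if not in_rois:
        return 0.0   # as in the text: zero active vertices gives an accuracy of zero
    return 100.0 * len(in_rois & expected) / len(in_rois)

def optimal_threshold(values: Dict[int, float], rois: Set[int], expected: Set[int],
                      candidates: Iterable[float]) -> Tuple[float, Dict[float, float]]:
    """Return the candidate threshold with the highest accuracy, plus all scores."""
    candidates = list(candidates)
    scores = {p: accuracy(top_percent_active(values, p), rois, expected)
              for p in candidates}
    best = max(candidates, key=lambda p: scores[p])
    return best, scores

# Toy data: 100 vertices; early visual ROIs are 0-39, expected subregion is 0-9.
values = {i: 100.0 - i for i in range(100)}
values[50] = 200.0   # strongest vertex lies outside the early visual ROIs
rois, expected = set(range(40)), set(range(10))

best, scores = optimal_threshold(values, rois, expected, candidates=[1, 5, 20])
print(best, scores)
```

In this toy case the 1% threshold keeps only the vertex outside the ROIs (accuracy 0), 5% keeps four ROI vertices that all lie in the expected subregion (100%), and 20% admits vertices from neighboring subregions (about 53%), so 5% is selected.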

#### **RESULTS**

#### **LOCALIZATION OF VISUALLY EVOKED RESPONSES TO ANGULAR RETINOTOPIC STIMULI**

Contrast-reversing checkerboard quadrant stimuli were presented to six human observers and the evoked brain responses were measured with MEG. Quadrants of visual stimulation were located in the upper left (UL), upper right (UR), lower right (LR) or lower left (LL) visual field. In response to stimulus onset, occipital and parietal MEG sensors showed large deflections at 60–100 ms. Subsequent responses to the contrast reversals of the stimulus are seen throughout the stimulus duration (**Figure 1A**). Scalp topography of MEG gradiometer sensor activity shows how responses vary by stimulus location, roughly according to the expected retinotopic pattern (**Figure 1B**).

Cortical sources of the first response peak (FRP) of the visually evoked response were localized with three reconstruction approaches: minimum norm estimate (MNE), beamformer (BF), and multiple sparse priors (MSP). Each of these approaches incorporates a different set of prior assumptions to solve the inverse problem of source reconstruction. We found that the MNE approach consistently localized sources of the MEG signals to the hemisphere contralateral to the quadrant location. A consistent dorsal-ventral distinction for lower and upper field stimuli was also present, in line with the pattern expected from the individuals' fMRI-defined retinotopic maps (**Figures 2A,B,E,F**). The multiple sparse priors (MSP) and beamformer approaches both resulted in localizations that generally followed this retinotopic pattern with some deviations (**Figures 2C,D,G,H**). For example, the MSP reconstruction of responses to the UR quadrant of Subject 1 (**Figure 2C**) localized sources to the dorsal occipital lobes, instead of the ventral left lobe.

**FIGURE 1 | (A)** Measured responses to upper and lower right quadrant stimuli, from a MEG channel located over the occipital cortex (gradiometer channel 1922), for Subject 1. Traces were time-locked to the onset of the visual stimulus (time = 0) and averaged. Changes to the stimulus contrast occurred every 125 ms following the onset of the visual stimulus (vertical black lines). Deflections of evoked responses to the upper and lower quadrants show opposite polarities, as might be expected from oppositely oriented current sources in the lower and upper calcarine banks respectively (Wandell et al., 2005). Source reconstructions were performed either on the visually evoked response (first response peak (FRP): centered at 83 ms) or upon the ongoing stimulus-induced oscillations at 8 Hz (200–1000 ms). **(B)** Gradiometer topographic maps (T/m) of averaged evoked responses at 83 ms post-stimulus for Subject 1 (S1) and Subject 4 (S4). Insets indicate stimulus locations. Black vertical and horizontal lines are presented to aid visualization. Peak responses tended to be over the hemisphere contralateral to the visual stimulus. Upper visual field stimuli evoked activation further back over the occipital pole than lower field stimuli.

**FIGURE 2 | Locations of the fMRI- and MEG-measured brain responses to quadrant stimuli. (A,E)** Individual fMRI-defined ventral and dorsal subregions of early visual areas, which respectively represent upper field and lower field quadrant stimuli, for Subject 1 (S1) and Subject 4 (S4). Blue = area V1; Red = area V2; Yellow = area V3; Green = area V3A. Insets indicate stimulus locations. **(B–D)** Retinotopic source reconstructions of MEG responses to quadrant stimuli around the first response peak (FRP) for Subject 1. **(F–H)** Retinotopic source reconstructions of MEG responses to quadrant stimuli around the first response peak (FRP) for Subject 4. F-statistic results for the minimum norm estimate approach (MNE), calculated using pre-stimulus variance estimates, are plotted on the individuals' inflated cortical surface **(B,F)**. The t-statistic results for the multiple sparse priors (MSP) approach **(C,G)** and beamformer **(D,H)**, calculated with a contrast of responses across stimuli, are displayed volumetrically on the MNI152 template brain.

Localization accuracy of each reconstruction approach was evaluated by assuming that the fMRI retinotopic map is a gold standard, and by calculating the percentage of active cortical vertices localized into the retinotopically defined subregion of each early visual area (V1, V2, V3, V3A) for each participant. These percentages were calculated for each participant based on their own fMRI-defined map and were then averaged over stimuli and participants. We systematically calculated the localization accuracy as a function of the cut-off threshold for including active vertices into the analysis for each reconstruction method and visual stimulus, across visual areas V1, V2, and V3, individually for each participant (see Methods, Threshold for active vertices). The optimal threshold that produced the most accurate source reconstructions for a given participant and method was used.

Angular retinotopic localization accuracy measured in this way was significant for all four early visual areas for MNE, for 3 of 4 visual areas with beamforming, and for 2 of 4 visual areas with MSP (Wilcoxon signrank tests: *p* < 0.05, Bonferroni corrected for multiple comparisons; **Figure 3**, **Table 2**). On average, MNE was most successful in localizing the highest percentage of active vertices to the expected retinotopic subregions (mean areas V1–V3 combined: 77.9%; V3A: 100%), followed by the beamformer (mean areas V1–V3 combined: 69.1%; V3A: 97.4%), followed by MSP (mean V1–V3: 54.9%; V3A: 64.9%). Localization accuracies of the three reconstruction methods were significantly different (Kruskal-Wallis tests: areas V1–V3 combined: chi<sup>2</sup> = 18.4, *p* < 0.001; V3A: chi<sup>2</sup> = 14.9, *p* < 0.001).
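As an illustration of the kind of test reported above, the following sketch compares a set of accuracy values against a chance level with an exact two-sided Wilcoxon signed-rank test and applies a Bonferroni-corrected criterion for four visual areas. The accuracy values, the 25% chance level, and the function names are invented for the example; the sketch also assumes that no two absolute differences are tied, which keeps the exact null distribution simple.

```python
# Hypothetical sketch: exact two-sided Wilcoxon signed-rank test against a
# chance level, followed by a Bonferroni-corrected decision.

def wilcoxon_signed_rank_exact(values, chance):
    """Return (W_plus, two-sided exact p) for values tested against chance."""
    diffs = [v - chance for v in values if v != chance]  # zero differences are dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    if len({abs(d) for d in diffs}) != n:
        raise ValueError("tied absolute differences: this sketch assumes none")
    # W_plus is the sum of the ranks (1..n) of the positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    # counts[s] = number of subsets of ranks {1..n} whose sum is s (null distribution).
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    total = 2 ** n
    p_low = sum(counts[: w_plus + 1]) / total
    p_high = sum(counts[w_plus:]) / total
    return w_plus, min(1.0, 2.0 * min(p_low, p_high))


def bonferroni_alpha(alpha, n_tests):
    """Per-test criterion after Bonferroni correction."""
    return alpha / n_tests


# Invented accuracy values (%) and an assumed chance level of 25%.
toy_accuracies = [62.0, 71.5, 80.1, 55.3, 90.2, 67.8, 73.4, 58.9, 84.6, 76.0]
toy_chance = 25.0
toy_w, toy_p = wilcoxon_signed_rank_exact(toy_accuracies, toy_chance)
toy_significant = toy_p < bonferroni_alpha(0.05, 4)
```

With four comparisons the per-test criterion becomes 0.0125. A consequence of the exact test is that the smallest attainable two-sided p-value is 2/2<sup>n</sup>, so very small samples cannot reach a corrected criterion regardless of effect size.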

To further investigate the factors contributing to the different localization accuracy values for minimum norm vs. multiple sparse priors, source reconstruction was carried out in the SPM8 software using the IID (independently and identically distributed priors) source reconstruction. The IID algorithm corresponds to a minimum norm assumption, but does not incorporate the same depth weighting and anatomical constraints as the MNE software. IID localization accuracy was better than chance for all 4 visual areas tested and the mean accuracy values were slightly higher for IID than MSP (IID: mean areas V1–V3 combined: 57.3%; V3A: 84.8%). However, this difference was not significant (Kruskal-Wallis: areas V1–V3: chi<sup>2</sup> = 0.1, *p* = 0.75; area V3A: chi<sup>2</sup> = 3.58, *p* = 0.058). This suggests that the depth weighting and anatomical constraints of the MNE implementation convey some advantage for retinotopic mapping. Increasing the time window to capture a wider section of the visually evoked response did not improve the MSP localization accuracy for angular retinotopy (see **Table 1**).

#### **Table 2 | Localization accuracy for angular mapping.**

*Percentage of active vertices that localized to the fMRI-defined subregion of individuals' retinotopic maps for each early visual area, for quadrant stimuli. Results, averaged across stimuli and participants (*±*SD), are shown for MNE, beamformer (evoked and time-frequency approaches) and MSP reconstruction methods.*

#### **BEAMFORMING SOURCE RECONSTRUCTION OF STIMULUS-INDUCED OSCILLATIONS**

Visual stimuli underwent contrast reversal at a rate of 4 Hz, evoking ongoing oscillations in brain responses at a rate of 8 Hz (**Figure 1A**). A time-frequency (TF) beamformer was focused on the 7–9 Hz frequency band of measured brain responses, 200–1000 ms post-stimulus onset. This excluded the first response peak (FRP) of the MEG response. Beamformer localization accuracy was similar regardless of whether the FRP (described in Localization of visually evoked responses to angular retinotopic stimuli) or the 7–9 Hz frequency band signals were used (time-frequency beamformer: mean areas V1–V3 combined: 66.0%; V3A: 83.2%; Kruskal-Wallis: areas V1–V3: chi<sup>2</sup> = 0.18, *p* = 0.67; area V3A: chi<sup>2</sup> = 0.15, *p* = 0.70; **Table 2**). Both approaches resulted in localization at a level significantly better than chance for 3 of the 4 early visual areas tested, although the regions that failed to reach significance were different for the two approaches. The use of the stimulus frequency tag therefore resulted in source localizations that were as good as, but not significantly better than, the application of the beamformer to the FRP.

#### **LOCALIZATION OF VISUALLY EVOKED RESPONSES TO ECCENTRICITY-VARYING STIMULI**

To investigate whether retinotopic localizations could be obtained for eccentricity-varying stimuli as well as for angular stimuli, brain responses to contrast-reversing concentric rings were measured with MEG in seven participants. Three ring eccentricities were used: ECC 1 (0–0.75°), ECC 2 (1.0–2.0°), and ECC 3 (3.0–5.4°).

Unlike quadrants, rings were bilateral stimuli, extending over both halves of the visual field and were expected to activate both cortical hemispheres simultaneously. At the level of MEG sensor topography, evoked responses to rings did not show a clear spatial pattern according to stimulus eccentricity and some responses appeared unilaterally biased (**Figure 4**).

The three MEG source reconstruction approaches were used to reconstruct sources of the visually evoked responses to ring stimuli. Localization accuracy of responses to each type of ring stimulus into the individual participant's fMRI-defined eccentricity band was evaluated across visual areas V1, V2, and V3 combined, then averaged across participants. For Subject 6, the minimum norm estimate (MNE) reconstruction of sources to concentric rings followed the expected posterior-anterior progression in the early visual areas of the calcarine region as stimulus eccentricity increased (**Figure 5**). However, this result was the exception; retinotopic sources for responses to rings were not consistently obtained across the other six participants with any reconstruction approach. Localization accuracy was not significantly better than chance for ECC 2 stimulus responses for any source reconstruction method (**Figure 6**, **Table 3**). The minimum norm approach (MNE) localized sources accurately to the expected eccentricity band for ECC 3 stimuli and the beamformer for ECC 1 stimuli (**Figure 6**). The accuracies of the three reconstruction methods were significantly different from each other (Kruskal-Wallis: chi<sup>2</sup> = 10.7, *p* < 0.01).
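The between-method comparison reported above relies on the Kruskal-Wallis test, whose statistic is referred to a chi-square distribution. The sketch below computes the statistic for three groups, for which the chi-square distribution has two degrees of freedom and its upper tail reduces to exp(-H/2). The accuracy values and function names are invented for illustration, and the chi-square approximation is rough for samples as small as those in the toy data.

```python
import math

# Hypothetical sketch: Kruskal-Wallis H statistic for exactly three groups,
# with the chi-square (df = 2) approximation for the p-value.

def average_ranks(values):
    """Return (ranks, tie_sizes); tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_three_groups(group_a, group_b, group_c):
    """Return (H, p) with tie correction; p uses the df = 2 chi-square tail."""
    groups = [group_a, group_b, group_c]
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    h /= (1.0 - tie_term)
    p = math.exp(-h / 2.0)  # exact chi-square survival function for df = 2
    return h, p


# Invented accuracy values (%) for three reconstruction methods.
toy_mne = [81.0, 77.5, 90.2]
toy_beamformer = [70.1, 66.4, 72.3]
toy_msp = [52.0, 58.7, 49.9]
toy_h, toy_kw_p = kruskal_wallis_three_groups(toy_mne, toy_beamformer, toy_msp)
```

The closed-form p-value holds only for three groups; a comparison of two groups (df = 1) or more than three would need a general chi-square tail function.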

**FIGURE 5 | Spatial patterns of MNE source reconstructions of responses to ring stimuli for Subject 6.** These reconstructions followed the expected retinotopic posterior-anterior progression with increasing stimulus eccentricity. F-statistic results are presented for left and right inflated cortical surfaces (see **Figure 1**). Insets show the corresponding stimulus locations. Sup., superior; post., posterior; ant., anterior; inf., inferior.

#### **EFFECT OF CONFINING ECCENTRICITY-VARYING STIMULI TO A VISUAL FIELD QUADRANT**

To investigate the discrepancy in the success of retinotopic localization of visual responses to quadrants vs. concentric rings, five of the seven participants who were scanned with eccentricity-varying stimuli were re-scanned with an amended stimulus set (quarter-rings), which consisted of the checkerboard ring stimuli confined to either the upper or lower quadrant of the right visual hemifield. Quarter-rings were located within the same eccentricity bands (and hence retinotopic brain representations) as the corresponding ring stimuli. Therefore, there were 6 quarter-ring stimuli: three presented in the upper right visual field quadrant (U-ECC 1, U-ECC 2, U-ECC 3) and three presented in the lower right visual field quadrant (L-ECC 1, L-ECC 2, L-ECC 3). MEG sensor topographies show that activations tend to lie over the left cortical hemisphere, as expected from angular retinotopy, but again no clear topography by eccentricity is discernible (**Figure 7**). We evaluated localization accuracy of sources of brain responses to quarter-rings reconstructed by minimum norm estimates and the beamformer for evoked responses. MSP reconstructions did not consistently localize activity into the early visual areas for any participant (*data not shown*), so we focus on MNE and the beamformer in the present analysis.

#### **Table 3 | Localization accuracy for eccentricity mapping.**

*Percentage of active vertices that localized into the fMRI-defined subregion of individuals' retinotopic maps across early visual areas V1, V2, and V3, for ring stimuli. Results, averaged across stimuli and participants (*±*SD), are shown for MNE, beamformer, and MSP reconstruction methods.*

#### *Angular retinotopy with quarter-rings*

We first confirmed that brain responses to quarter-rings were adequately mapped according to angular retinotopy, by calculating the percentage of active vertices localized into the expected subregion of early visual areas V1, V2, and V3, combined together. For example, brain sources of responses to upper field quarter-ring stimuli are expected to localize to the left ventral subregion, whilst lower field quarter-ring stimuli are expected to localize to the left dorsal subregion. Results are presented separately for upper or lower visual field locations, averaged over stimuli and participants (**Figure 8A**). Reconstructing sources with the minimum norm approach (MNE) resulted in both upper and lower field stimuli sources localized into the fMRI-defined quadrant subregion at levels better than chance (Wilcoxon signrank: *p* < 0.001). Localization accuracy was comparable to that of quadrant stimuli reported with the MNE method above (mean over all stimuli: 73.3%, **Table 4**). For the beamformer, responses to upper field stimuli were well localized according to angular retinotopy (*p* < 0.001; mean: 75.4%) whilst responses to lower field stimuli were not localized better than chance level to the expected dorsal subregions (mean: 42.9%, *p* = 0.090, *n* = 15; **Table 4**). Beamformer reconstructions were, however, mapped according to angular retinotopy at a level better than chance when considered over all stimuli (mean over all stimuli: 59.0%, *p* < 0.001, *n* = 30).

**FIGURE 7 | Gradiometer topography maps (T/m) of the averaged evoked response to quarter-ring stimuli at 83 ms post-stimulus (Subject 2). Left panels:** show the responses to upper field quarter-ring stimuli. Quarter-ring eccentricities U-ECC 1, U-ECC 2, and U-ECC 3 are presented at the top, middle, and bottom, respectively. **Right panels:** show the responses to lower field quarter-ring stimuli. Quarter-ring eccentricities L-ECC 1, L-ECC 2, and L-ECC 3 are presented at the top, middle, and bottom, respectively. Insets indicate stimulus locations.

#### *Eccentricity localization with quarter-rings*

We then evaluated the localization accuracy of responses to quarter-ring stimuli into the expected eccentricity band (within the expected angular retinotopic cortical subregion). Although mean localization accuracy values were numerically similar for both reconstruction methods, accuracy was significantly better than chance for the MNE method (mean across all stimuli: 51.4%; Wilcoxon signrank test: *p* < 0.01, *n* = 14) but did not reach significance for the beamformer (mean across all stimuli: 49.0%; Wilcoxon signrank test: *p* = 0.065, *n* = 10). On average, localization accuracy values were higher for lower visual field stimuli (mean across methods: 57.5%) than for upper visual field stimuli (mean across methods: 43.1%), but this difference did not reach significance (Kruskal-Wallis: chi<sup>2</sup> = 3.42, *p* = 0.064; **Table 4**). Only MNE reconstructions of brain responses to lower field stimuli were significantly better than chance when considered on their own (**Figure 8B**). Average accuracy values were close to those obtained for ring stimuli but not as high as those for angular retinotopy (quadrant stimuli) for the same reconstruction approaches (**Table 4**).

**FIGURE 8 | (A)** Source localization accuracy, according to angular retinotopy, of evoked responses to quarter-ring stimuli with MNE (black) and beamformer (BF, gray) methods. Bars show percentage of active vertices localized to the fMRI-defined retinotopic subregion across early visual areas V1, V2, and V3. Error bars show s.e.m. Black dashed line indicates chance accuracy level for each early visual area ROI. ∗ indicates *p* < 0.0125 for Wilcoxon signrank test of localization accuracy compared to chance (*p* < 0.05 with Bonferroni correction for 4 multiple comparisons). **(B)** Source localization accuracy, according to eccentricity, of evoked responses to quarter-ring stimuli, with MNE (black) and beamformer (BF, gray). Bars show the percentage of active vertices localized to the corresponding fMRI-defined eccentricity band, considered within the corresponding angular subregion of early visual areas V1, V2, and V3. Error bars and statistical comparisons as for **(A)**.

#### **Table 4 | Localization accuracy for quarter-ring stimuli.**

*Percentage of active vertices in early visual cortex that localized into the fMRI-defined subregion of individuals' retinotopic maps, for quarter-ring stimuli. Results are reported for upper or lower visual field and averaged across stimuli and participants (*±*SD) according to either angular (expected quadrant) or eccentricity retinotopy (expected eccentricity band).*

#### **DISCUSSION**

#### **SOURCE LOCALIZATION ACCURACY OF VISUAL RESPONSES TO STIMULI VARYING BY ANGULAR LOCATION**

Minimum norm estimate (MNE), beamformer, and multiple sparse priors (MSP) source reconstruction methods were used to reconstruct sources of visual brain responses to angular retinotopic stimuli (quadrants). Source localization accuracy was defined by how accurately the different MEG reconstruction methods could match fMRI retinotopic maps for each individual. On average, localization accuracy was higher for MNE source reconstructions than for the beamformer, which in turn was higher than MSP. The MNE approach assumes that source amplitudes are minimal whilst brain sources are many and independently distributed (Dale et al., 2000; Gramfort et al., 2014). Our results show that this approach produces, in conjunction with specific depth-weighting and anatomical constraints, reliable source reconstructions of retinotopic activity in early visual cortex.

The beamformer, on the other hand, uses a spatial filtering algorithm to estimate the time course of activity at each brain source independently (van Veen et al., 1997; Hillebrand and Barnes, 2011). The difference in localization accuracy between MNE and beamformer may be due to the different reconstruction algorithm. However, the MNE and beamformer methods implemented in the analysis packages used here additionally differ in their utilization of anatomical information; MNE uses the individual's cortical surface as the source space for reconstruction and hence imposes an additional constraint on the solution of the inverse problem, whilst the beamformer evaluates signals independently throughout the cranial volume (see **Table 1**). Lack of a cortical anatomical prior may have contributed to the lower spatial resolution of the beamformer compared to MNE.

The multiple sparse priors (MSP) approach showed a trend of localizing sources to the expected angular subregions of early visual areas, but this reached significance only in areas V2 and V3. On average, localization accuracy was lower for MSP than for MNE and the beamformer. This was surprising, as a previous study had shown adequate source localizations of responses to visual face stimuli with the MSP assumptions, with results superior to those of SPM8's minimum norm implementation (IID) when goodness of reconstruction was evaluated by Bayesian model evidence (Henson et al., 2009). The MSP assumptions may have worked well when applied to brain responses to faces because the expected responding regions (fusiform face areas) are large, bilateral clusters, matching the MSP prior assumptions of sparseness and bilateral components, based upon functional anatomy, which are selected by the algorithm according to the correlations present in the data (Friston et al., 2008). However, this pattern may not adequately reconstruct brain activity patterns for the angular retinotopy stimuli used in this study, which are biased unilaterally and spread irregularly over occipital areas in different individuals. Our finding is in line with a recent result of Cottereau et al. (2012) in their evaluation of the use of fMRI maps as spatial *priors* for source reconstruction of simulated MEG data arising from early visual sources. Cottereau et al. evaluated source reconstructions by calculating both the focalization error (the ratio between the estimated and theoretical energies of the current at the simulated sources) and the relative energy (the ratio between the normalized energies contained in the estimation of the active sources and the global distribution). They report that although the MSP approach had slightly better relative energy estimates, it also had higher focalization errors when compared to the minimum norm (MNE equivalent).

The source space used for MSP reconstruction in the SPM8 software was the inverse-normalized template cortical mesh (Mattout et al., 2007; Henson et al., 2009; Litvak et al., 2011), rather than the individual's cortical template, which was used for MNE (see **Table 1**). A key advantage of this approach is that it facilitates group level analysis and also facilitates the inclusion of fMRI priors for MEG analysis, which are typically defined in the template space. Previous studies have demonstrated that source reconstruction of evoked responses is not impaired by the use of the inverse-normalized template rather than the individual's cortical mesh (Mattout et al., 2007; Henson et al., 2009). This suggests that it is the assumptions of multiple sparse priors that underlie the difference in source localization accuracy between the MSP and the MNE methods. An alternative approach within the SPM8 software is IID (independent and identically distributed sources), which corresponds to minimum norm assumptions. Implementation of IID on the same MEG data in SPM8 software gives localization accuracy better than chance for all four early visual areas, compared to just two early visual areas for MSP.

However, mean localization accuracy values were not significantly different between IID and MSP and were generally lower for IID than for the MNE approach. This may be due to further differences between IID and MNE implementations, such as the use of depth-weighting in MNE to counteract the superficial bias of minimum norm assumptions (Lin et al., 2006). It could also be due to the differences in use of anatomical data for source space specification. Variability in individuals' cortical surface geometry around the tightly folded early visual areas may mean that the use of the individual's mesh rather than the inverse-normalized cortical template mesh makes a significant contribution to accurate localization of responses in experiments investigating the visual system. This could also apply to other tightly folded brain regions. Future updates to the IID and MSP reconstruction algorithms could include the option to utilize the individual's cortical surface, rather than the inverse-normalized template, for source space modeling.

Cottereau et al. (2011) reconstructed retinotopic sources accurately into early visual areas V1, V2, and V3, by using a stimulus contrast reversal frequency tag. We tested whether use of an ongoing frequency tag may be an improvement on using the first response peak (FRP). With the beamformer, we found similar retinotopic localization accuracy when analyzing source power at the second harmonic of the stimulus contrast reversal frequency as compared to the reconstruction of FRP. In both cases, the localization accuracies were significantly better than chance for 3 of the 4 early visual areas. Cottereau et al. (2011) used a faster stimulus contrast-reversal rate (7.5 Hz; second harmonic: 15 Hz) and a wider time window for source reconstruction (5600 ms), such that they focused on localizing a "steady state" visual response. In the current study, stimulus contrast-reversal rate was 4 Hz (second harmonic: 8 Hz) and the time window was 800 ms long, perhaps resulting in a noisier power estimate that might have limited the localization accuracy. Therefore, a steady-state response longer than the one utilized in the present study may be necessary for the frequency-tag information to improve source reconstruction.
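The difference between the two analysis windows can be made concrete with simple arithmetic. The sketch below uses only the reversal rates and window lengths quoted above and derives two quantities: the number of cycles of the second harmonic that fit in the window, and the nominal Fourier frequency resolution (the reciprocal of the window length). The function name is hypothetical, and treating 1/T as the relevant resolution assumes a Fourier-type power estimate over the whole window.

```python
# Hypothetical sketch: how many cycles of the tagged response fit in each
# analysis window, and the nominal frequency resolution 1/T of that window.

def frequency_tag_summary(reversal_rate_hz, window_ms):
    """Summarize a frequency-tag analysis window (illustrative only)."""
    window_s = window_ms / 1000.0
    second_harmonic_hz = 2.0 * reversal_rate_hz
    return {
        "second_harmonic_hz": second_harmonic_hz,
        "cycles_in_window": second_harmonic_hz * window_s,
        "frequency_resolution_hz": 1.0 / window_s,
    }


# Values taken from the text: 4 Hz over 800 ms (present study) and
# 7.5 Hz over 5600 ms (Cottereau et al., 2011).
present_study = frequency_tag_summary(4.0, 800.0)
cottereau_2011 = frequency_tag_summary(7.5, 5600.0)
```

Under these assumptions the 800 ms window holds 6.4 cycles of the 8 Hz response with a resolution of 1.25 Hz, whereas the 5600 ms window holds 84 cycles of the 15 Hz response with a resolution of about 0.18 Hz.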

#### **SOURCE LOCALIZATION ACCURACY OF VISUAL RESPONSES TO STIMULI VARYING BY ECCENTRICITY**

Concentric rings are commonly used to map eccentricity in early visual areas with fMRI (DeYoe et al., 1994; Engel et al., 1994; Wandell, 1999). None of the reconstruction methods consistently localized responses to the appropriate eccentricity bands in early visual areas at a level better than chance. This was unexpected, especially for MNE and beamformer approaches, which had reliably localized visual responses to angular retinotopy. Bilateral, eccentricity-varying visual stimuli may present a unique set of challenges to MEG source reconstruction. Concentric rings are full-field visual stimuli and so are expected to synchronously activate both upper and lower calcarine banks in both the left and the right cortical hemispheres, which may result in some interference or cancelation of equal and opposite magnetic fields arising from opposing cortical surfaces. Moreover, spatially extended and correlated source activity cannot be spatially filtered by the beamformer as easily (Hansen et al., 2010). Assumptions of multiple sparse priors (MSP) might have been expected to be more appropriate for localizing ring stimuli as they incorporate priors of bilaterality; however, this was not found to be the case.

To evaluate whether the bilateral extent of the ring stimuli limited the retinotopic localization by eccentricity, MEG signals were recorded with "quarter-ring" stimuli, which were confined to either the upper or lower quadrant of the right visual field. With MNE and beamformer approaches, the corresponding brain sources were generally well localized according to angular retinotopy. But with regard to localization to the expected eccentricity band, average accuracy values remained low, close to those obtained for whole rings. The MNE reconstruction method localized sources at a level better than chance when considered over all quarter-ring stimuli and for lower field stimuli alone.

There are a number of reasons that may explain the limitations of MEG source localizations to eccentricity stimuli of varying sizes. The representation of eccentricity in early visual areas varies along the lateral-medial and posterior-anterior axes of the brain, such that foveal stimuli are represented at the occipital pole and the representation of more peripheral locations progresses medially and anteriorly along the banks of the calcarine sulcus (Wandell et al., 2005). As a result, the greatest changes in MEG signal amplitude by stimulus eccentricity may occur in the same sensors due to nearer vs. deeper sources. By contrast, quadrant stimuli are represented in different hemispheres and may be expected to activate quite different sets of sensors. Indeed, when inspecting sensor topography, no consistent pattern could be seen by stimulus eccentricity, although this could be seen for angular stimuli. In MEG source reconstruction, there is inherent ambiguity in discerning low-amplitude superficial activity from higher-amplitude deep activity, which might explain the poor eccentricity results found here.

Sharon et al. (2007) used the MNE reconstruction approach to localize MEG responses to visual stimuli according to both angular and eccentric retinotopic position in occipital cortex. Their visual stimuli were small Gabor patches constructed from Gaussians of 1.2 or 1.7° full-width at half-maximum, thereby similar in extent to our quarter-ring stimuli U-ECC 1/L-ECC 1 (radius 0.75°) and U-ECC 2/L-ECC 2 (radius 1.0°). Sharon et al. defined localization error as the mean distance in the 3D volume between the centers-of-mass of the MEG and fMRI activity clusters. For the reconstruction of MEG signals alone, the localization error over six participants was found to be approximately 10 mm. While in their Figures 2, 3, the localizations of MEG responses alone are mostly associated with the correct bank of the calcarine, the example MEG sources in Figure 2 do not unambiguously show the expected progression anteriorly or medially according to eccentricity. Only analysis of the centers of gravity of the source localizations shows a slight trend to vary with stimulus eccentricity in the expected retinotopic pattern. As the radii of our quarter-ring stimuli were of similar magnitude to those of Sharon et al. (2007), it seems unlikely that size alone can account for any discrepancy in localization between the two studies. On the other hand, Sharon et al. (2007) presented each stimulus to viewers in a total of 500 trials, rather than the 95–125 trials in the present study. It may therefore be that a much greater signal to noise ratio obtained by averaging over a much larger number of trials is necessary to successfully localize MEG signals by eccentricity, compared with angular retinotopy.

#### **LIMITATIONS**

Localization accuracy of the different MEG analysis methods was evaluated by calculating what percentage of the active vertices in early visual areas V1, V2, V3, and V3A were located in the expected subregion according to fMRI retinotopy in the same individuals (see also Cottereau et al., 2011; Supplementary Data). We ignored the incidence of active vertices in areas such as LO, V4, V3B, hMT+, which were outside the areas studied here. An alternative way to test localization accuracy would be to calculate the percentage of cortical vertices, in a retinotopically expected subregion, that are "active" in response to the corresponding stimulus, relative to the total number of vertices in that subregion. However, this value would be difficult to interpret even with perfect MEG source reconstruction, because MEG sensors are blind to sources located at certain parts of the cortex, such as the crests of gyri, due to the geometry of magnetic fields of the brain relative to the orientation of the sensor array (Hansen et al., 2010). Nevertheless, future attempts at an anatomically corrected analysis of this type would be interesting. Alternatively, it would be possible to compare MEG and fMRI source localizations in the 3D volume, for example by computing the distance between the center of mass of the fMRI and the active vertices in the MEG source result (e.g., Sharon et al., 2007). We decided against this method because, for the large visual stimuli used here, this approach would not utilize all of the information available from the fMRI maps and a few peak responding vertices would not be indicative of the entire reconstruction result. Additionally, this measure of localization can be misleading for anything other than point stimuli.
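The center-of-mass alternative discussed above can be illustrated with a short sketch, which also shows why the measure can mislead for spatially extended or bilateral activity. The coordinates, weights, and function names are invented for the example and are not taken from the study or from Sharon et al. (2007).

```python
import math

# Hypothetical sketch: distance between the centers of mass of an MEG source
# estimate and an fMRI activation, both given as weighted 3D points (mm).

def center_of_mass(coords_mm, weights):
    """Weighted mean position of a set of 3D points."""
    total = sum(weights)
    return tuple(
        sum(w * c[axis] for c, w in zip(coords_mm, weights)) / total
        for axis in range(3)
    )


def localization_error_mm(meg_coords, meg_weights, fmri_coords, fmri_weights):
    """Euclidean distance between the MEG and fMRI centers of mass."""
    return math.dist(
        center_of_mass(meg_coords, meg_weights),
        center_of_mass(fmri_coords, fmri_weights),
    )


# Focal toy example: the two centers of mass are 5 mm apart.
toy_error = localization_error_mm(
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0],
    [(1.0, 3.0, 4.0)], [1.0],
)

# Bilateral toy example: two equal clusters 20 mm apart have a center of
# mass midway between them, where neither cluster has any activity.
toy_bilateral_com = center_of_mass(
    [(-10.0, 0.0, 0.0), (10.0, 0.0, 0.0)], [1.0, 1.0]
)
```

In the bilateral case the single summary point lies 10 mm from both clusters, so a small distance to an fMRI center of mass would say little about whether the spatial pattern was reconstructed correctly.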

An important assumption of this study was that fMRI retinotopy correctly localizes the true sources of brain responses in individual participants. Although there is a wealth of histological and lesion evidence to suggest that retinotopic mapping measured by fMRI corresponds to the true patterns (Holmes, 1918, 1945; Horton and Hoyt, 1991; Bridge et al., 2005; Bridge and Clare, 2006), there may be unknown differences between the exact locations of the sources of brain activity measured by MEG and fMRI. These methods detect different underlying processes (electrophysiological vs. metabolic) and the time-scale on which these processes change is different (milliseconds vs. seconds). MEG signals most likely arise from synchronous synaptic current in cortical pyramidal cells (Hämäläinen et al., 1993). There is ongoing research into the electrophysiological correlates of the fMRI BOLD signal but it seems to be linked to the local field potential (LFP), which is also a measure of total synaptic activity in cortical cells (Logothetis et al., 2001). However, early visual cortex, especially striate cortex, is well perfused and blood vessels are likely to be spatially close to their neuronal sources (Engel et al., 1997). In all, these arguments indicate that an fMRI-MEG comparison is appropriate for evaluating MEG localization accuracy.

Some localization error must necessarily arise from inaccuracies in the source space specification from the anatomical MRI and its co-registration to the MEG coordinate space (Hillebrand and Barnes, 2011; Perry et al., 2011). Fully evaluating the effects of this was out of the scope of the present study. We argue that these effects are not expected to underlie the main results of the cross-method comparisons reported here. For example, the identical MEG-fMRI co-registration method and forward model specification were used for the beamformer and MSP approaches within the SPM8 software package, and a near-identical method was also used for MNE. The same anatomical surfaces and digitizer points were used for all reconstructions.

#### **CONCLUSIONS**

MEG source reconstructions with prior assumptions of many independent, distributed sources of small amplitude (in connection with individual anatomical mesh data), or with prior assumptions of a spatial filtering (beamformer) approach, seem well matched to localize the irregularly patterned, unilateral responses of retinotopic subregions of the early visual areas. On the other hand, the sparse priors of the MSP method may be better matched for large, cluster-like source distributions that are bilateral, such as responses to face stimuli in the fusiform gyrus (Henson et al., 2009). Sources in early visual areas are more accurately localized according to angular retinotopy rather than eccentricity. Stimuli should be confined to a visual field quadrant (i.e., not bilateral). Further work is necessary to tease out the quantitative contributions of different prior assumptions and source space constructions. However, researchers aiming to localize brain activity arising from the early visual regions should take spatial extent into account when designing the stimulus and should carefully match the analysis method and software package used to the expected distribution of the signal.

#### **ACKNOWLEDGMENTS**

Nela Cicmil held a DPAG/Usher-Cunningham Studentship. Kristine Krug and Holly Bridge are Royal Society University Research Fellows. We would like to thank Lori Minini and Betina Ip for help with the fMRI data collection and analysis, and Sven Braeutigam and Jennifer Swettenham for help with MEG data collection and analysis. We would like to thank Gareth Barnes for advice on SPM8 MEEG software implementation. This work was funded by the Wellcome Trust and the Royal Society.

#### **REFERENCES**


Mattout, J., Henson, R., and Friston, K. J. (2007). Canonical source reconstruction for MEG. *Comput. Intell. Neurosci.* 2007:67613. doi: 10.1155/2007/67613


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 March 2014; accepted: 08 May 2014; published online: 27 May 2014. Citation: Cicmil N, Bridge H, Parker AJ, Woolrich MW and Krug K (2014) Localization of MEG human brain responses to retinotopic visual stimuli with contrasting source reconstruction approaches. Front. Neurosci. 8:127. doi: 10.3389/fnins.2014.00127*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Cicmil, Bridge, Parker, Woolrich and Krug. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A time-frequency analysis of the dynamics of cortical networks of sleep spindles from MEG-EEG recordings

#### *Younes Zerouali<sup>1</sup>\*, Jean-Marc Lina<sup>1,2</sup>, Zoran Sekerovic<sup>3,4</sup>, Jonathan Godbout<sup>3</sup>, Jonathan Dube<sup>3,4</sup>, Pierre Jolicoeur<sup>4</sup> and Julie Carrier<sup>3,4</sup>*

*<sup>1</sup> Department of Electrical Engineering, Ecole de Technologie Supérieure, Montreal, QC, Canada*

*<sup>2</sup> Centre de Recherches Mathématiques, Université de Montréal, Montreal, QC, Canada*

*<sup>3</sup> Center for Advanced Research in Sleep Medicine, Hôpital du Sacré-Coeur, Montreal, QC, Canada*

*<sup>4</sup> Department of Psychology, Université de Montréal, Montreal, QC, Canada*

#### *Edited by:*

*Christopher W. Tyler, The Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*Pavan Ramkumar, Aalto University, Finland*

*David Haynor, University of Washington, USA*

#### *\*Correspondence:*

*Younes Zerouali, Department of Electrical Engineering, Ecole de Technologie Supérieure, 1100, Notre-Dame West, Montreal, QC H3C 1K3, Canada e-mail: youness.zerouali-boukhal.1@ens.etsmtl.ca*

Sleep spindles are a hallmark of NREM sleep. They result from a widespread thalamo-cortical loop and involve synchronous cortical networks that are still poorly understood. We investigated whether brain activity during spindles can be characterized by specific patterns of functional connectivity among cortical generators. For that purpose, we developed a wavelet-based approach aimed at imaging the synchronous oscillatory cortical networks from simultaneous MEG-EEG recordings. First, we detected spindles on the EEG and extracted the corresponding frequency-locked MEG activity in the form of an analytic ridge signal in the time-frequency plane (Zerouali et al., 2013). Second, we performed source reconstruction of the ridge signal within the Maximum Entropy on the Mean framework (Amblard et al., 2004), yielding a robust estimate of the cortical sources producing the observed oscillations. Lastly, we quantified functional connectivity among cortical sources using phase-locking values. The main innovations of this methodology are (1) to reveal the dynamic behavior of functional networks resolved in the time-frequency plane and (2) to characterize functional connectivity among MEG sources through phase interactions. We show, for the first time, that the switch from fast to slow oscillatory mode during sleep spindles is required for the emergence of specific patterns of connectivity. Moreover, we show that earlier synchrony during spindles was associated with mainly intra-hemispheric connectivity whereas later synchrony was associated with global long-range connectivity. We propose that our methodology can be a valuable tool for studying the connectivity underlying neural processes involving sleep spindles, such as memory, plasticity or aging.

**Keywords: wavelet ridges, source localization, maximum entropy on the mean, phase synchrony, functional connectivity, sleep spindles**

### **INTRODUCTION**

It is believed that the characteristic patterns of spontaneous bioelectrical activity that occur during sleep, originating either from focal cortical regions or large-scale networks, reflect essential neural processes that modify the long-term functionality of the awake brain (e.g., brain plasticity, memory enhancement, see Walker and Stickgold, 2006). Among them, sleep spindles constitute a hallmark of non-rapid eye movement (NREM) sleep. A spindle is a transient high-amplitude oscillation seen in the electroencephalogram (EEG), typically lasting approximately 500–1500 ms within the sigma band (10–16 Hz). Sleep spindles reflect the sequential activation of the reticular and dorsal thalamic nuclei, followed by neocortical targets (Steriade et al., 1985, 1987). Early animal research pointed at hyperpolarizing potentials in the thalamic reticular (RE) nucleus as the neurophysiological trigger of spindle sequences (Steriade et al., 1987). Subsequently, it was demonstrated that cortico-thalamic feedback is also crucial to initiate and terminate spindle oscillations (Destexhe et al., 1998; Golshani et al., 2001; Timofeev et al., 2001; Timofeev and Bazhenov, 2005; Bonjean et al., 2011).

Cortical synchrony is a key factor involved in sustaining spindle oscillations (Timofeev and Bazhenov, 2005). Neural modeling first suggested that cortical feedback on RE cells could result in a large-scale synchronous network of spindle oscillations over the cortex (Destexhe et al., 1998). Thalamo-cortical synchronous oscillations (12–14 Hz) were subsequently measured in situ in cats (Timofeev and Bazhenov, 2005). It was observed that termination of a spindle is characterized by desynchronization of responses between cortical and thalamocortical neurons (Steriade et al., 1998; Timofeev et al., 2001).

In EEG recordings, the mean frequency of spindles varies across the scalp. Spindles are usually slower at more anterior sites ("slower" spindles: 11–13 Hz) and typically faster at more posterior sites ("faster" spindles: 14–16 Hz; Jankel and Niedermeyer, 1985; Jobert et al., 1992). Interestingly, Andrillon et al. (2011) showed that faster spindles observed at electrode Cz usually emerge around 500 ms before the onset of slower spindles at frontal sites. The scalp topography of spindle frequency may reflect distinct neurophysiological processes (Timofeev and Chauvette, 2013). According to this suggestion, higher-frequency and earlier spindles would reflect initial thalamocortical interactions, predominant in central regions, whereas lower-frequency and later spindles would reflect secondary cortico-cortical interactions, spreading over frontal regions.

Recent studies also reported that intra-spindle frequency is not stable in time. For most spindles, the dynamics are characterized by a progressive frequency slowing, even at posterior EEG electrode sites (Schonwald et al., 2011). Analyzing high- and low-frequency spindles separately, Urakami (2008) showed that the shift in frequency over time is well explained by two dipolar sources located deep in the postcentral and precentral regions, bilaterally. However, the synchronous neural networks involved in sleep spindles, and the dynamics of their deployment over time, have never been characterized.

This article presents a new methodology to characterize the neural generators of EEG spindles from the perspective of cortical synchrony as measured with MEG. We considered frequency-locking among MEG sensors within a time window around spindles marked on the simultaneous EEG. MEG frequency-locking consists of transient synchronous events (SEs) during which the activity recorded by a subset of sensors oscillates at the same frequency. There are two main reasons to consider MEG frequency-locking to understand cortical activity during EEG spindles. First, MEG recordings are spatially less corrupted by spurious correlations than EEG (absence of a reference electrode, no spatial blurring from conduction on the scalp). Second, the source localization of oscillatory patterns is more tractable in MEG, where an adequate model of data generation does not involve current propagation through inhomogeneous tissues.

In the present work, we localized the cortical generators of the frequency-locked MEG events during EEG spindles. In addition we characterized for these events the cortical distribution of power and the cortico-cortical functional connectivity networks. To do such analyses in a unified framework, dedicated to transient oscillatory patterns like spindles, we developed a novel approach based on analytical (i.e., complex) time-frequency representations of the data from which the information related to synchrony was extracted. We identified the neural generators related to this information extracted from the MEG recordings for each spindle. The complex signal thus inferred on the sources has both information about power (amplitude) and phase, from which coupling between sources could be estimated. In addition, the frequency at which frequency-locking occurred allowed us to distinguish fast and slow rhythmic components within spindles.

Using this approach, our main results are: (1) eighty percent of EEG spindles showed at least one significant MEG frequency-locked event; (2) within spindles, the central frequency of early frequency-locked activity was mainly distributed around 14 Hz (fast) whereas it was distributed around 12 Hz (slow) for late frequency-locked activity; (3) early frequency-locking, no matter its frequency, emerged mainly from parietal regions whereas late frequency-locking emerged from a much broader set of regions, localized mainly in frontal, parietal, and occipital areas; (4) overall long-range synchronization was lower for early than for late frequency-locking whereas short-range synchronization was higher for early than for late frequency-locking; (5) the cortical network for late frequency-locking involved larger numbers of connections (particularly interhemispheric) than for early frequency-locking.

### **MATERIALS AND METHODS**

#### **PROTOCOL, MEG RECORDINGS, AND ANATOMICAL MRI**

Brain activity of 8 healthy subjects was recorded during sleep, using simultaneous MEG and EEG for a maximum period of 90 min following a period of 26 h of sleep deprivation (to ensure a good probability of sleeping in the MEG laboratory). From this group, 5 young subjects were kept in the present study (see **Table 1**). Recordings were conducted at the Centre de Recherche en Neuropsychologie et en Cognition (CERNEC) of Université de Montréal using a 275-channel CTF-VMS whole-head magnetometer. Subjects arrived 1 h prior to their habitual bedtime and stayed awake until 2 h after their habitual wake time. During this sleep deprivation (under the supervision of a research assistant), activity was limited to reading or surfing the Internet. The protocol was approved by the ETS ethics board and by the Comité d'Ethique de la Recherche of IUGM. Written informed consent was obtained from all subjects.

The MEG recordings were split into consecutive runs of 18 min. Sleep EEG was recorded simultaneously using 56 scalp electrodes referenced to the left mastoid with a CTF EEG system integrated with the MEG system. Electrodes were positioned using the 10–10 system. In addition, the horizontal (HEOG) and the vertical (VEOG) components of the electro-oculogram were recorded using two pairs of electrodes, one pair at the outer canthi and one pair above and below the left eye, respectively. MEG and EEG were digitized at 1200 Hz with an anti-aliasing low-pass filter at 300 Hz (30 dB/octave) and a high-pass filter of about 0.02 Hz. MEG signals were de-noised using the CTF [CTF MEG, Coquitlam (BC), Canada] third-order synthetic gradiometer algorithm. The EEG was manually scored for sleep stages according to standard criteria (American Academy of Sleep Medicine manual, Iber, 2007). EEG spindle detection was performed visually on Cz by an experienced sleep technician. A sleep spindle was detected when a burst of oscillatory brain activity (12–14 Hz) was visible on the band-pass filtered (1–30 Hz) NREM EEG for at least 0.5 s (Rechtschaffen and Kales, 1968).

A high-resolution anatomical MRI scan was acquired at the Unité de Neuroimagerie fonctionelle de l'Institut Universitaire de Gériatrie de Montréal using a T1-weighted 3D MPRAGE Fast sequence (slab: 160, voxel size: 1.0 × 1.0 × 1.2 mm, TR/TE: 2300/2.94 ms, TI: 900, FOV: 256) on a 3T Siemens MAGNETOM Trio scanner [Siemens Medical Solutions, Malvern (PA), USA]. A mesh representation of the white/gray matter interface with 8000 vertices (sources) was extracted from the MRI scan for each subject using Brainvisa (Cointepas et al., 2001). The spatial resolution of the mesh was 5.5 ± 2.8 mm and the orientation of the sources was constrained to be normal to the surface. The forward model *G* (see Section Imaging Cortical Synchrony) that was used for the source localization was obtained from a spherical head model computed using Brainstorm (Tadel et al., 2011).

#### **WAVELET ANALYSIS**

We consider the continuous wavelet representation of the multivariate data *M* (*t*),

$$w^{(m)}(a, b) = \int_{-\infty}^{+\infty} M(t)\, \overline{\Psi_{ab}(t)}\, \mathrm{d}t \tag{1}$$

with the wavelet defined as usual as

$$
\Psi_{ab}(t) = \frac{1}{\sqrt{a}}\, \Psi\left(\frac{t - b}{a}\right) \tag{2}
$$

where Ψ(*t*) is a complex-valued analytic wavelet of the Morse type (see Appendix II). Ψ<sub>*ab*</sub>(*t*) is a short-time oscillatory function scaled by factor *a* and translated in time by *b* samples. Each wavelet coefficient *w*<sup>(*m*)</sup>(*a*, *b*), where *m* refers to the data space, thus describes the oscillatory behavior of the signals *M*(*t*) at scale *a* and around time sample *b*. The scaling factor *a* was sampled at 256 scales, thus yielding a spectral resolution of ≈0.4 Hz in the sigma band. It is noteworthy that this signal representation is highly redundant and neighboring wavelet coefficients are correlated. The next section describes how we can retrieve frequency-locking information from such a redundant representation.
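As a minimal sketch of Equations (1) and (2), the Python code below computes complex wavelet coefficients for a single channel by direct correlation. It is an illustration under stated assumptions, not the routines used in this work: a complex Morlet wavelet stands in for the Morse wavelet, and the function names, the parameter `w0`, and the truncation of the wavelet support are all hypothetical choices.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (illustrative stand-in for a Morse wavelet)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)

def cwt(signal, freqs, fs, w0=6.0):
    """Wavelet coefficients w(a, b) of Eq. (1), one row per analysis frequency."""
    n = len(signal)
    coeffs = np.empty((len(freqs), n), dtype=complex)
    for k, f in enumerate(freqs):
        a = w0 / (2.0 * np.pi * f)               # scale (s) tuned to frequency f
        half = int(np.ceil(4.0 * a * fs))        # keep about +/- 4 envelope widths
        tau = np.arange(-half, half + 1) / fs
        psi = morlet(tau / a, w0) / np.sqrt(a)   # Eq. (2) with b = 0
        # np.convolve flips its kernel, so the time-reversed conjugate yields
        # the correlation with conj(psi); dividing by fs approximates the integral.
        kernel = np.conj(psi[::-1])
        coeffs[k] = np.convolve(signal, kernel, mode="same") / fs
    return coeffs
```

The signal is assumed to be longer than the longest wavelet kernel; coefficients near the edges of the signal are affected by truncation and would normally be discarded.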

#### **FREQUENCY-LOCKING IN THE SENSORS SPACE**

From a signal representation in the time-frequency (t-f) plane, one can extract the instantaneous frequency by computing wavelet ridges (Mallat, 2008). The procedure for a univariate signal is illustrated in **Figure 1**. At each time sample *b*, we locate on the wavelet scalogram (**Figure 1A**) the local maxima in amplitude (i.e., the energy). The frequency of such maxima defines the instantaneous frequency of one oscillator present in the signal. Contiguous maxima along time are then chained into "ridge lines" *a* = *r*(*b*). The location of all ridge lines in the t-f plane is called a "ridge map" (**Figure 1B**), which is a binary representation of the oscillatory modes present in the signal (Delprat et al., 1994). As illustrated in **Figure 1** with a simulated spindle, the complex wavelet signal (**Figure 1D**) along the ridge line (**Figure 1B**) mostly reproduces the original oscillatory signal (**Figure 1C**; real part shown in **Supplementary Figure 1**).
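The local-maximum step of the ridge procedure can be sketched as follows. This is a simplified illustration: it marks amplitude maxima along the frequency axis at each time sample but does not implement the chaining of contiguous maxima into ridge lines, and the function names are hypothetical.

```python
import numpy as np

def ridge_map(coeffs):
    """Binary map: True where |w(a, b)| is a strict local maximum along frequency."""
    amp = np.abs(coeffs)                 # rows: frequency bins, columns: time samples
    ridges = np.zeros(amp.shape, dtype=bool)
    # Interior bins only: compare each bin with its two neighbours in frequency.
    ridges[1:-1] = (amp[1:-1] > amp[:-2]) & (amp[1:-1] > amp[2:])
    return ridges

def instantaneous_frequency(coeffs, freqs):
    """Frequency of the strongest ridge point at each time sample (NaN if none)."""
    amp = np.where(ridge_map(coeffs), np.abs(coeffs), -np.inf)
    best = np.argmax(amp, axis=0)
    has_ridge = np.isfinite(amp.max(axis=0))
    return np.where(has_ridge, np.asarray(freqs, dtype=float)[best], np.nan)
```

Here `coeffs` is assumed to be a complex array of wavelet coefficients with one row per frequency bin, and `freqs` lists the frequency of each row.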

We extend this approach to multivariate (i.e., multichannel) MEG signals as illustrated in **Figure 2**. We first compute the ridge map of each sensor (**Figure 2A**), then we sum them to obtain a "multivariate ridge map" (**Figure 2B**), the values of which reflect the number of sensors sharing common local maxima, i.e., instantaneous frequencies. On the multivariate ridge map, we track common oscillatory modes as multivariate ridge curves *a* = *r*<sup>(*m*)</sup>(*t*). Each curve may vary in frequency over time and reflects an episode of frequency-locking among sensors. From now on, the term "ridge" refers to a multivariate ridge curve *a* = *r*<sup>(*m*)</sup>(*t*).
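The summation over sensors can be sketched in a few lines. The array layout and the `min_sensors` parameter below are assumptions made for illustration; the tracking of ridge curves over time is reduced here to picking, at each time sample, the frequency bin shared by the most sensors.

```python
import numpy as np

def multivariate_ridge_map(sensor_ridge_maps):
    """Sum binary ridge maps over sensors; input shape (n_sensors, n_freqs, n_times)."""
    return np.asarray(sensor_ridge_maps, dtype=bool).sum(axis=0)

def dominant_ridge(mv_map, min_sensors=2):
    """Per time sample, the bin shared by the most sensors (-1 if below min_sensors)."""
    best = mv_map.argmax(axis=0)
    count = mv_map.max(axis=0)
    return np.where(count >= min_sensors, best, -1)
```

Each entry of the resulting map counts how many sensors have a ridge point at that time-frequency location, which is the quantity color-coded in a multivariate ridge map.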

#### **STATISTICALLY SIGNIFICANT FREQUENCY-LOCKING**

We now define the strength of a ridge as the time average of the number of frequency-locked sensors at each time sample of the ridge. To define the minimal strength for a ridge to be considered a spindle-specific synchronous event, we use a thresholding procedure based on the rationale that synchrony must be stronger during a spindle than during baseline activity. We thus detect ridges *r*<sup>(*b*)</sup>(*t*) during a baseline window preceding a spindle (−1.5 to −0.5 s with respect to the marker) and compute their strength. Using an FDR approach, we build a cumulative distribution of ridge strength during baseline and set the cutoff such that *p* ≤ 0.05. The ridge strength cutoff is determined for each spindle, and only ridges above the cutoff are considered "synchronous events" (SEs).
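The thresholding logic can be sketched as below. This is a simplification under stated assumptions: the cutoff is taken as the empirical (1 − α) quantile of baseline ridge strengths, which is one plausible reading of the cumulative-distribution cutoff and not necessarily the exact FDR procedure; all function names are hypothetical.

```python
import numpy as np

def ridge_strength(mv_map, ridge_bins):
    """Time-averaged number of frequency-locked sensors along a ridge.

    mv_map: multivariate ridge map (n_freqs, n_times);
    ridge_bins: frequency-bin index of the ridge at each of its time samples.
    """
    ridge_bins = np.asarray(ridge_bins)
    return mv_map[ridge_bins, np.arange(ridge_bins.size)].mean()

def strength_cutoff(baseline_strengths, alpha=0.05):
    """Empirical (1 - alpha) quantile of the baseline ridge strengths."""
    return np.quantile(np.asarray(baseline_strengths, dtype=float), 1.0 - alpha)

def synchronous_events(strengths, cutoff):
    """Indices of ridges whose strength is strictly above the cutoff."""
    return np.flatnonzero(np.asarray(strengths) > cutoff)
```

In this sketch the ridge is assumed to start at the first column of `mv_map`; a ridge starting later would need its time offset added to the column index.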

#### **NON-LINEAR FILTERING OF MEG SIGNALS**

Spindles typically exhibit a succession of synchronous events (SEs), the first and last of which are termed *early* and *late* SEs, respectively (see **Figure 2C**). For each of these events, indexed by *r*, we construct an analytic ridge signal *w*<sub>*r*</sub><sup>(*m*)</sup>(*t*) (where *m* stands for multivariate) that consists of the complex wavelet coefficients of *all* *N*<sub>*s*</sub> sensors at frequencies along the line *a* = *r*<sup>(*m*)</sup>(*t*):

$$w_r^{(m)}(t) = w^{(m)}\left(r^{(m)}(t),\, t\right) \tag{3}$$

This ridge signal over the whole set of sensors is complex-valued and only exists during periods of frequency-locking between a subset of sensors. *w*<sub>*r*</sub><sup>(*m*)</sup>(*t*) is an oscillatory component of *M*(*t*) of the form *w*<sub>*r*</sub><sup>(*m*)</sup>(*t*) = *A*(*t*)*e*<sup>*i*φ(*t*)</sup>, where *A*(*t*) is the instantaneous amplitude and φ(*t*) is the instantaneous phase (Zerouali et al., 2013). This approach is analogous to the Hilbert-Huang Transform (HHT), which computes the instantaneous phase of empirical modes of the data. However, although it can successfully separate brain rhythms from EEG recordings (Bajaj and Pachori, 2012), the HHT is not readily usable for extracting synchronous components. It is noteworthy that the number of SEs that can be extracted from *M*(*t*) can vary and even be null if the underlying neural generators are all asynchronous. We treat each spindle as a distinct event and quantify 4 characteristics of the SEs on a spindle-by-spindle basis: (1) the presence or absence of SEs, (2) the number of SEs, (3) the summed duration of the SEs, and (4) the onset time of the first SE.

**FIGURE 2 | Real spindle: (A) average wavelet power over all MEG sensors.** The EEG onset is at time equal to 0. For the same spindle, **(B)** is the multivariate ridge map obtained by summing the individual ridge maps over all MEG sensors. The colors indicate the number of sensors frequency-locked at a particular time-frequency point. **(C)** Displays the multivariate ridge mask produced after data-driven thresholding of the multivariate ridge plane **(B)**. The mean power of this spindle is 12 Hz but the multivariate ridges **(B)** show synchrony above this value and even before the EEG onset (*t* = 0). In this particular case, we observe 3 multivariate ridge lines during the spindle (the discontinuity along the frequency axis reflects the limit in spectral resolution of the decomposition), with frequency starting around 12.6 Hz (early event) and ending at 11.13 Hz (late event).
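To make the ridge signal of Equation (3) concrete, the sketch below samples the wavelet coefficients of every sensor along a given ridge and splits the result into amplitude and phase. The array layout and function names are assumptions for illustration only.

```python
import numpy as np

def ridge_signal(coeffs, ridge_bins):
    """Eq. (3): coefficients of all sensors sampled along the ridge a = r(t).

    coeffs: complex array (n_sensors, n_freqs, n_times);
    ridge_bins: frequency-bin index of the ridge at each time sample.
    """
    ridge_bins = np.asarray(ridge_bins)
    times = np.arange(ridge_bins.size)
    return coeffs[:, ridge_bins, times]      # shape (n_sensors, n_times)

def amplitude_and_phase(w):
    """Split w(t) = A(t) exp(i phi(t)) into amplitude A and instantaneous phase phi."""
    return np.abs(w), np.angle(w)
```

Because the same frequency bin is read out for all sensors at each time sample, sensors that are not frequency-locked still contribute coefficients, only with lower amplitude.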

#### **IMAGING CORTICAL SYNCHRONY**

Given a ridge signal *w*<sub>*r*</sub><sup>(*m*)</sup>(*t*) of length *T*<sub>*r*</sub>, we localize its cortical generators by solving the inverse problem associated with the following linear but ill-posed generative model:

$$w_r^{(m)}(t) = G\, w_r^{(q)}(t) + \varepsilon_r(t) \tag{4}$$

where *w*<sub>*r*</sub><sup>(*q*)</sup>(*t*) is the *N*<sub>*q*</sub> × *T*<sub>*r*</sub> analytic source signal to be estimated, ε<sub>*r*</sub>(*t*) is noise and *G* is the *N*<sub>*s*</sub> × *N*<sub>*q*</sub> forward operator projecting source activity onto the sensor space. We emphasize here that although the ridge line is a non-linear filter, the ridge signal *w*<sub>*r*</sub><sup>(*m*)</sup>(*t*) itself is linear with respect to the data *M*(*t*), since the wavelet transform is a linear operation. The linear operator *G* is thus valid for ridge signals. In the present work, the *N*<sub>*q*</sub>-dimensional *w*<sub>*r*</sub><sup>(*q*)</sup>(*t*) is estimated through the Maximum Entropy on the Mean, as developed by Amblard et al. (2004) and validated by Grova et al. (2006). It is noteworthy that *w*<sub>*r*</sub><sup>(*q*)</sup>(*t*) is an analytic source signal, which provides access to the true phase of the sources. All routines used for this article are coded in Matlab [The MathWorks Inc., Natick (MA), USA], interfaced with Brainstorm, and distributed as an open-access toolbox (http://neuroimage.usc.edu/brainstorm).
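The role of linearity in Equation (4) can be illustrated with a deliberately simple linear inverse. The sketch below uses a Tikhonov-regularized minimum-norm estimate, which is a swapped-in technique and not the Maximum Entropy on the Mean solver used in this work; the function name and the regularization weight `lam` are hypothetical.

```python
import numpy as np

def minimum_norm_sources(w_m, G, lam=1e-2):
    """Tikhonov-regularized minimum-norm estimate (NOT the MEM solver).

    w_m: complex ridge signal (n_sensors, n_times);
    G: real forward operator (n_sensors, n_sources);
    lam: hypothetical regularization weight, relative to the mean sensor gain.
    """
    n_sensors = G.shape[0]
    gram = G @ G.T
    gram = gram + lam * np.trace(gram) / n_sensors * np.eye(n_sensors)
    # A real linear operator acts on real and imaginary parts alike,
    # so the estimated sources stay complex-valued and carry a phase.
    return G.T @ np.linalg.solve(gram, w_m)
```

Any linear inverse operator shares the property shown here: applying it to the complex ridge signal is the same as applying it separately to the real and imaginary parts, which is why an analytic source signal with a well-defined phase is obtained.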

#### **GROUP-LEVEL SYNCHRONOUS NETWORKS**

In order to perform group analyses, we first projected the time courses *w*<sub>*r*</sub><sup>(*q*)</sup>(*t*) from the individual anatomy space onto the MNI brain template using routines implemented in Brainstorm (Tadel et al., 2011). On this common template, we characterized source activity inferred from the SEs from two different perspectives: (1) the power, proportional to the square of the amplitude of source activity during an SE, and (2) the connectivity, to infer functional networks emerging through phase synchrony. These two properties of the sources are complementary by definition, since phase synchrony and power are theoretically independent (but see Ghuman et al., 2010 for a link between source SNR and synchrony detectability). We note here that while power during SEs was computed at the source level, phase synchrony addressed connectivity within and among 88 parcels, each including around 200 sources (227 ± 136). For that purpose, we performed an initial clustering of cortical sources into 88 parcels derived from the Tzourio-Mazoyer anatomical atlas (**Supplementary Figure 5**). We computed both short-range and long-range connectivity based on these parcels. Short-range connectivity was computed as pairwise source connectivity within each parcel, whereas long-range connectivity was computed using local average signals within parcels.

#### **POWER OF SYNCHRONOUS SOURCES**

For each source *n* on the template, we quantified the source power underlying the SEs *r* detected for a subject *s* (hence the notation *n*,*r*;*s* in the next equation). First, we computed the mean energy *E*<sup>(*q*)</sup>:

$$E_{n,r;s}^{(q)} = \frac{1}{T_r} \sum_{t=1}^{T_r} \left| w_{n,r;s}^{(q)}(t) \right|^2 \tag{5}$$

where *T*<sub>*r*</sub> is the number of time samples in the SE *r*. Given that the wavelet coefficients *w*<sub>*n*,*r*;*s*</sub><sup>(*q*)</sup>(*t*) are approximately zero-mean fluctuations, *E*<sub>*n*,*r*;*s*</sub><sup>(*q*)</sup> can be seen as a measure of source variance. We also computed the mean energy *E*<sub>*n*,*b*;*s*</sub><sup>(*q*)</sup> of the sources along ridges *b* located during a baseline period (−1.5 to −0.5 s before the EEG spindle marker). The null hypothesis (H0) in our statistical test was that source variance has the same distribution during SEs as during baseline. We assessed this hypothesis using Fisher's test on a group statistic *F*. For each subject *s*, we ran 100 iterations in which we selected a subset *R*<sub>*i*,*s*</sub> of 12 SEs and a subset *B*<sub>*i*,*s*</sub> of 12 ridges in the baseline periods to compute the F-statistic as follows,

$$F_{n,i,s} = \frac{\sum_{r \in R_{i,s}} E_{n,r;s}^{(q)}}{\sum_{b \in B_{i,s}} E_{n,b;s}^{(q)}}, \quad i = 1, \dots, 100 \tag{6}$$

Given that our subjects displayed at least 42 SEs (see **Table 1**), we could generate at least 2.9 × 10<sup>5</sup> unique subsets *R*<sub>*i*,*s*</sub> and *B*<sub>*i*,*s*</sub> (21 SEs for each onset, late or early, with 12 choices per combination). For each subject, the average F-statistic over the 100 iterations, *F*<sub>*n*,*s*</sub>, was then computed. Finally, we averaged the statistics *F*<sub>*n*,*s*</sub> over subjects in order to obtain the group-level average statistic *F*<sub>*n*</sub>. We then derived the threshold *F*<sup>*T*</sup>(12, 12) = 21.02 such that any source *n* with *F*<sub>*n*</sub> > *F*<sup>*T*</sup>(12, 12) is significantly activated at a Bonferroni-corrected 5% level (*p* = 0.05/15028).
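The energy of Equation (5), the resampled ratio of Equation (6), and the subset count quoted above can be sketched as follows. The denominator is taken over baseline ridges, as the text describes; the function names, array layouts, and the random seed are assumptions for illustration.

```python
import math
import numpy as np

def mean_energy(w_q):
    """Eq. (5): time-averaged squared modulus; w_q has shape (n_sources, n_times)."""
    return np.mean(np.abs(w_q) ** 2, axis=-1)

def resampled_f(e_se, e_base, n_iter=100, subset=12, seed=0):
    """Average over n_iter random draws of the ratio in Eq. (6), for one subject.

    e_se: energies during SEs, shape (n_se, n_sources);
    e_base: energies along baseline ridges, shape (n_base, n_sources).
    """
    rng = np.random.default_rng(seed)
    f = np.empty((n_iter, e_se.shape[1]))
    for i in range(n_iter):
        r = rng.choice(e_se.shape[0], size=subset, replace=False)
        b = rng.choice(e_base.shape[0], size=subset, replace=False)
        f[i] = e_se[r].sum(axis=0) / e_base[b].sum(axis=0)
    return f.mean(axis=0)

# Number of distinct 12-element subsets of 21 SEs, quoted as about 2.9e5.
n_subsets = math.comb(21, 12)
```

The binomial coefficient evaluates to 293,930, which matches the order of magnitude of 2.9 × 10<sup>5</sup> unique subsets.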

#### **SYNCHRONY AMONG SOURCES**

At this point, the source signals *w*<sub>*r*</sub><sup>(*q*)</sup>(*t*) are in a common anatomical space, so we discard the subject index. For each ridge signal *r* (recall that this signal is multivariate, with dimensions *N*<sub>sources</sub> × *N*<sub>bins</sub>), we then computed pairwise synchrony ξ between parcels *i* and *j* using:

$$\xi_{i,j}^{(r)} = \left| \frac{1}{T_r} \sum_{t=1}^{T_r} \frac{w_{r,i}^{(q)}(t)\, w_{r,j}^{(q)*}(t)}{\left| w_{r,i}^{(q)}(t) \right| \left| w_{r,j}^{(q)}(t) \right|} \right| \tag{7}$$

where *T*<sub>*r*</sub> is the length of ridge *r* and *w*<sub>*r*,*j*</sub><sup>(*q*)∗</sup>(*t*) denotes the complex conjugate of *w*<sub>*r*,*j*</sub><sup>(*q*)</sup>(*t*). This definition of synchrony is equivalent to the phase-locking value (PLV, Lachaux et al., 1999) and provides added robustness to round-off error. For each pair (*i*,*j*), we thus computed *R* synchrony values, where *R* was the total number of ridges for a particular condition, then we averaged those values to obtain the mean pairwise synchrony. For simplicity, we explain the synchrony computation and thresholding for a single pair of regions, but the same computations were performed for all pairs.
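Equation (7) translates almost directly into code. The minimal sketch below assumes two complex time courses of equal length with no exactly zero samples; the function name is hypothetical.

```python
import numpy as np

def plv(w_i, w_j):
    """Eq. (7): modulus of the time-averaged unit phasor of w_i * conj(w_j)."""
    cross = w_i * np.conj(w_j)
    return np.abs(np.mean(cross / np.abs(cross)))
```

Two signals with a constant phase lag give a value of 1 regardless of their amplitudes, whereas signals whose phase difference drifts uniformly over a full cycle give a value close to 0.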

We assessed the statistical significance of synchrony strength using a non-parametric approach aimed at estimating the distribution of estimated synchrony under the null hypothesis, for each pair of parcels (*i*,*j*). To do this, we used a shuffling approach that randomly permutes the identity of ridges, thus yielding:

$$\xi_{i,j}^{(r,u)} = \left| \frac{1}{T} \sum_{t=1}^{T} \frac{w_{r,i}^{(q)}(t)\, w_{u,j}^{(q)*}(t)}{\left| w_{r,i}^{(q)}(t) \right| \left| w_{u,j}^{(q)}(t) \right|} \right| \tag{8}$$

where *r* ≠ *u* and *T* = min(*T*<sub>*r*</sub>, *T*<sub>*u*</sub>). By permuting all ridges for a particular condition, we constructed *R* shuffled values of ξ<sub>*i*,*j*</sub><sup>(*r*,*u*)</sup>. We repeated this operation 100 times in order to ensure the statistical robustness of our null distribution. The null hypothesis was that the distribution of phase synchrony within a given ridge was equivalent to that generated from random combinations of the signals across ridges. The distribution of ξ<sub>*i*,*j*</sub><sup>(*r*)</sup> was then compared to the distribution under the null hypothesis and we derived a statistical threshold using the false discovery rate technique (see **Supplementary Figure 2** for an illustration). This technique consists of finding the synchrony value ξ<sub>*i*,*j*</sub><sup>*T*</sup> that ensures an arbitrary false positive rate (herein set to 5%). First, PLV scores ξ<sub>*r*,*i*,*j*</sub> were transformed to *z*<sub>*r*,*i*,*j*</sub> using Fisher's transform, *z*<sub>*r*,*i*,*j*</sub> = 0.5 [ln(1 + ξ<sub>*r*,*i*,*j*</sub>) − ln(1 − ξ<sub>*r*,*i*,*j*</sub>)]. Then we computed the average z-scores *z*<sub>*i*,*j*</sub>, which were inverse z-transformed to ξ<sub>*i*,*j*</sub> = [exp(2*z*<sub>*i*,*j*</sub>) − 1] / [exp(2*z*<sub>*i*,*j*</sub>) + 1]. Finally, we consider a pair of regions (*i*,*j*) as significantly synchronous if the average of ξ<sub>*r*,*i*,*j*</sub> across the SEs of each class is at least ξ<sub>*i*,*j*</sub><sup>*T*</sup>. It is important to note here that the average PLV values and the PLV thresholds, derived respectively with Equations (7) and (8), are computed specifically for each condition [(early, late) × (slow, fast)].
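The surrogate synchrony of Equation (8) and the Fisher averaging step can be sketched as follows. This is an illustration under assumptions: each ridge is paired with one randomly drawn different ridge per pass, the FDR thresholding itself is not implemented, and all function names are hypothetical. Fisher's transform and its inverse are written with `arctanh` and `tanh`, which are algebraically identical to the logarithmic and exponential forms given above.

```python
import numpy as np

def plv(w_i, w_j):
    """Phase-locking value between two complex time courses of equal length."""
    cross = w_i * np.conj(w_j)
    return np.abs(np.mean(cross / np.abs(cross)))

def surrogate_plv(ridges_i, ridges_j, rng):
    """Eq. (8): pair parcel i of ridge r with parcel j of a different ridge u.

    ridges_i, ridges_j: lists of complex 1-D arrays, one per ridge (at least two).
    """
    n = len(ridges_i)
    values = np.empty(n)
    for r in range(n):
        u = rng.choice([k for k in range(n) if k != r])   # enforce u != r
        t = min(ridges_i[r].size, ridges_j[u].size)       # T = min(T_r, T_u)
        values[r] = plv(ridges_i[r][:t], ridges_j[u][:t])
    return values

def fisher_mean(xi):
    """Average PLVs in Fisher z-space: z = arctanh(xi), then map back with tanh."""
    return np.tanh(np.mean(np.arctanh(np.asarray(xi, dtype=float))))
```

A PLV of exactly 1 maps to an infinite z-value, so in practice such values would need to be clipped slightly below 1 before averaging.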

### **RESULTS**

#### **MEG FREQUENCY-LOCKING DURING SPINDLES (SEs)**

**Figure 3** shows a number of descriptive statistics for the SEs observed at the MEG sensor level. More than 80% of EEG spindles for each subject had at least one significant MEG SE and the average was 92% (see **Figure 3A**). We note that frequency-locking was mostly sampled with 2 ridges per spindle for subjects 1, 3, 6, and 8 (Mean = 1.7 ± 1.1), while subject 7 had an average of about 5 ridges per spindle (Mean = 4.9 ± 3.0) (see **Figure 3B**). Ridges had a median duration of about 500 ms, which did not vary much across subjects, as shown in **Figure 3C**.

#### **TIMING OF MEG SEs DURING SPINDLES**

We examined when MEG ridges were first observed within spindles. **Figure 4** shows the relative frequency of onset times. The first SEs from all spindles were pooled, and we estimated the probability density of their onset times with respect to the EEG spindle marker at Cz. We observed that frequency-locking is initiated roughly between 250 ms before and 400 ms after the EEG marker, with the main peak of the distribution at 110 ms after the marker.

#### **CENTRAL FREQUENCY OF SEs IN SPINDLES**

**Figure 5** shows the distribution of central frequencies of all MEG SEs within EEG spindles (dashed line). The central frequency is defined here as the average instantaneous frequency along an SE. The distribution is bimodal with a main peak centered at 13.9 Hz and a lower peak around 11.5 Hz. Note that the spectral resolution of this analysis was limited to ∼0.4 Hz due to the discrete and inhomogeneous (i.e., with exponentially spaced spectral bins) wavelet scaling. Taking into account the spectral resolution of the analysis, we can state that the main frequency mode for MEG synchrony is between 13.4 and 14.3 Hz, and the lower mode is between 11.1 and 11.9 Hz.

Among all SEs, we selected subsets of *early* and *late* events. Interestingly, the central frequency of early SEs, which are the first detected ridges relative to spindle onset, is mainly distributed around 14 Hz (blue curve). On the other hand, the central frequency of late SEs, which are the last detected ridges, is mainly distributed around 12 Hz (red curve).

#### **ACTIVATION MAPS**

**Supplementary Figure 3** illustrates cortical activations associated with SEs that take place either early or late relative to spindle onset. These maps are displayed using Otsu's visualization threshold and allow a qualitative description of cortical activity linked to synchrony (Otsu, 1979). We can see that cortical energy is mainly distributed over the perirolandic cortex, bilaterally, for early synchrony. On the other hand, cortical energy is more broadly distributed for late synchrony and spans frontal, perirolandic, temporal, and occipital regions. It thus seems that cortical synchrony during spindles is initiated in fairly focal perirolandic regions and extends progressively to further regions.

red, the 25th and 75th percentiles at the end of the box, and the "whiskers" indicate the minimum and maximum scores in the sample. The + in **(B)** indicate outliers. **(C)** Median total duration of SEs per spindle for each subject, along with the 25th and 75th percentiles. The + in **(C)** indicate outliers.

**FIGURE 4 | Probability density plot of the onset time of the first synchrony event (SE) in MEG relative to the spindle onset time at Cz in the EEG, for each participant.** Each point on the graph shows the density of early SEs for a given onset time and subject. The black line is the spline interpolation of the empirical probability distribution. The blue dashed line indicates the first "plateau" and the red dashed line marks the mode of the distribution.

As was shown in **Figure 5**, the central frequency of early SEs is mainly high but it can be low, and the reverse is true for late synchrony (mainly low, but can be high). Thus, the observed differences in cortical activation could be due either to the timing (early vs. late) or to the frequency (low vs. high) of synchrony. In order to disentangle the effects of these two factors, we pooled SEs with respect to each combination of timing and frequency. We first verified that, based only on the chronology of the synchronous events for each spindle, the early and late events unambiguously sample the early and late parts of the spindles. This is shown in **Figure 6**. Using this approach, the results in **Figure 7** suggest that early SEs, no matter their frequency, emerge mainly from perirolandic regions. In addition, late synchrony emerges from a much broader set of regions, localized mainly in frontal, parietal, and occipital areas.

#### **SIGNIFICANT REGIONS OF CORTICAL SYNCHRONY DURING SLEEP SPINDLES**

**Figure 8** displays regions of significant projected power on cortical sources during SEs when the results were corrected for multiple comparisons using non-parametric statistical thresholding to Bonferroni-corrected *p* < 0.05. For early fast SEs, significant activations were found bilaterally, although stronger over the left hemisphere, in the postcentral gyrus, extending to the caudal part of the superior frontal gyrus, and in the left superior parietal lobule. In turn, for late slow SEs, activations were found, bilaterally, in the medial frontal gyrus, in the superior frontal gyrus, in the inferior parietal lobule and in the precuneus.

#### **SHORT- AND LONG-RANGE SYNCHRONY DURING SLEEP SPINDLES**

We examined separately short- and long-range synchronization during the early and late parts of spindles using measures of phase-locking value. Descriptive statistics for this analysis are displayed in **Figure 6B**. Overall short-range synchronization, that is, the averaged phase-locking value between pairs of sources within the same region, was significantly lower for late (0.63) than for early (0.77) synchrony [two-sample *t*-test, *t*(3009) = 7.64, *p* < 0.0001]. On the other hand, long-range synchronization, that is, the mean phase-locking value between all pairs of sources across distinct regions, was significantly higher for late (0.48) than for early (0.41) synchrony [two-sample *t*-test, *t*(7654) = −38.87, *p* < 0.0001]. In particular, interhemispheric connections were denser in late synchrony, as the median PLV was increased by 0.085 in the latter condition [two-sample *t*-test, *t*(3870) = 17.42, *p* < 0.0001, data not plotted]. Also, the intrahemispheric increase in median long-range PLV was much more marked in the right [ΔPLV = 0.12, *t*(1890) = 14.17, *p* < 0.0001, data not plotted] than in the left [ΔPLV = 0.01, *t*(1890) = 4.61, *p* < 0.0001, data not plotted] hemisphere.
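As an illustration of the quantity reported above, the sketch below computes a phase-locking value between two complex-valued signals as the modulus of the time-averaged unit phasor of their phase difference. This is the standard PLV definition, not necessarily the exact estimator used in this study; the function name `phase_locking_value`, the sampling rate, and the toy signals are assumptions made for the example.

```python
import cmath
import math


def phase_locking_value(z1, z2):
    """PLV between two equally long complex (analytic) signals, in [0, 1]."""
    if len(z1) != len(z2) or not z1:
        raise ValueError("signals must be non-empty and of equal length")
    # Unit phasor of the instantaneous phase difference at each sample.
    phasors = [
        cmath.exp(1j * (cmath.phase(a) - cmath.phase(b)))
        for a, b in zip(z1, z2)
    ]
    return abs(sum(phasors) / len(phasors))


# Toy signals (hypothetical): 2 s sampled at 250 Hz.
n, fs = 500, 250.0
t = [k / fs for k in range(n)]
a = [cmath.exp(1j * 2 * math.pi * 13.0 * tk) for tk in t]
b = [cmath.exp(1j * (2 * math.pi * 13.0 * tk + 0.8)) for tk in t]  # fixed lag
c = [cmath.exp(1j * 2 * math.pi * 11.0 * tk) for tk in t]          # other frequency

plv_locked = phase_locking_value(a, b)  # constant phase difference
plv_drift = phase_locking_value(a, c)   # phase difference drifts over full cycles
print(f"locked: {plv_locked:.3f}, drifting: {plv_drift:.3f}")
```

A constant phase lag gives a PLV close to 1 whatever the size of the lag, whereas a steadily drifting phase difference averages toward 0.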

#### **SYNCHRONOUS NETWORKS DURING SPINDLES**

Recall from Section Group-Level Synchronous Network that we divided cortical regions into 88 distinct parcels. Phase-locking values (PLVs) were computed between all possible pairs of sources within each parcel to obtain short-scale synchrony values. In addition, we computed the average signal in each parcel and computed PLVs between all possible pairs of parcels. Parcels were manually assigned to the frontal, parietal, temporal, mesial, or occipital regions. **Supplementary Figure 6** shows a schematic representation of connectivity among and within cortical parcels, each represented by a node. Long-range pairwise PLV values greater than 0.8 are depicted, and statistically significant links are in bold. Statistical significance of the PLV value for a pair was determined using the approach described in Section Synchrony Among Sources. We computed, within each condition [(early, late) × (fast, slow)], the null distribution of large-scale synchrony in the absence of SEs, i.e., using ridge signals from the baseline. From that distribution, we derived the FDR threshold above which synchrony is significant at the 5% level. Short-range (within-parcel) synchrony is coded with the node color and is not thresholded statistically.

**FIGURE 6 | (A)** Onset time descriptive statistics (median, quartiles, and extrema) for early or late events, with fast or slow oscillations, relative to the time of spindle onset defined on the EEG at Cz. **(B)** Phase-locking value descriptive statistics for long-range and short-range synchrony displayed for both early (columns 1, 2) and late (columns 3, 4) synchronous events. The horizontal bar and asterisks indicate a statistically significant difference with *p* < 0.001 (see text for details). The numbers of points in the distributions of short-range PLV (source pairs) and long-range PLV (region pairs) are 3,053,790 and 3828, respectively. See text for how events were classified.

*F*-value (see Section Power of Synchronous Sources) of the significantly activated sources; insignificant ones are set to 0.

Cortical networks involved a larger number of significant pairwise connections for late synchrony (99) than for early synchrony (31). In particular, interhemispheric connections were denser in late (8) than in early (1) synchrony (**Supplementary Figure 6**).

In order to disentangle the effects of timing versus frequency, we analyzed separately the 4 combinations of these two factors. We show the statistically significant PLV links in **Figure 9** for late slow and early fast synchrony, where we observed significant pairwise connections. There were no significant connections in the other two conditions (early slow, late fast). Interestingly, late slow synchrony involved a larger number of connections (137) than early fast synchrony (31). Finally, significant interhemispheric synchrony was observed only in late slow synchrony. As a confirmatory analysis, we verified that this pattern was also observable in individual subjects' connectivity profiles (see **Supplementary Figure 7**). We found that this effect was observable in 4 out of 5 subjects, whereas the last subject showed an overall low number of interhemispheric links.

#### **DISCUSSION**

In this work, we addressed the dynamics of neuronal networks during sleep spindles from the angle of phase synchrony. We proposed an original source imaging approach to reveal the cortico-cortical functional connectivity associated with transient synchronous events occurring during sleep spindles. We discuss the present work in two steps: (1) the validation of the proposed ridge-based methodology against consensus knowledge on spindles and (2) the interpretation of new findings in relation to hypothesized functional roles of spindles.

#### **VALIDATION OF RIDGES FOR THE STUDY OF SPINDLES**

The following sections are intended to validate the use of frequency-locking for characterizing the dynamics of cortical activity during sleep spindles. We argue and provide supporting evidence that frequency-locking during spindles reveals spectral and topographical properties that were previously reported by studies on the signal amplitude during spindles. In addition, we show that imaging the power of cortical sources underlying frequency-locking during spindles yields activations within regions that were previously shown to be involved in spindles using a variety of imaging techniques. The results discussed in this first section will allow us to argue that amplitude-based and synchrony-based features of spindles reflect similar neurophysiological processes.

#### *Detectability of frequency-locking spindles*

We used a wavelet-ridge framework to detect and quantify frequency-locking during spindles. Using this framework, we observed significant MEG SEs in the vast majority of spindles and subjects, and the method allowed us to measure the duration of spindle-related frequency-locked activity with remarkable consistency across subjects. We see two main reasons why wavelet ridges should be favored for studying frequency-locking during spindles. (1) We observed that the central frequency of SEs detected on MEG sensors is higher early than late within spindles. (2) It was shown that cortical sources vary during the time course of spindles recorded in MEG (Dehghani et al., 2011), which is consistent with the observation that spindles are observed on different MEG sensors over time (Hao et al., 1992; Zierewicz et al., 1999). Frequency-locking recorded with MEG thus reflects a non-stationary process.

Therefore, global measures computed over the entire duration of spindles, such as magnitude-squared coherence, cannot capture the complexity of the dynamics underlying synchrony during spindles, which may explain why they yield low (0.22) synchrony values (Dehghani et al., 2010; Bonjean et al., 2012). Another approach based on autoregressive modeling and partial cross-coherence also yielded low values (−0.29 to 0.38) for average MEG synchrony (Langheim et al., 2006). However, instead of capturing the complexity of MEG synchrony, this latter approach filters out non-stationary components of MEG signals and estimates coherence on the residue. In contrast, wavelet ridges are particularly well suited to reveal patterns of frequency-locking that change over time and space, because their detection is more robust to spectral or spatial perturbations (Amor et al., 2005).

#### *MEG spindle dynamics*

Our results showed that frequency-locking has a higher frequency when it appears at the beginning of a spindle and a lower frequency when it appears at the end, with a clear boundary at 13 Hz. This corroborates previous studies reporting that intra-spindle frequency is frequently characterized by a progressive slowing of oscillatory activity (Schonwald et al., 2011). We also observed a typical 500 ms delay between early and late synchrony. Using automatic spindle detection based on signal energy, Dehghani et al. (2011) showed that spindles in MEG could arise up to 200 ms before their EEG counterpart. Interestingly, from the perspective of synchrony, a similar delay can be observed between the onset of spindles visible on the EEG and MEG synchrony (MEG often earlier). On average, however, MEG synchrony arises 110 ms after EEG spindle onset.

By localizing the complex ridge signal, we efficiently target the sources that generate frequency-locking during MEG spindles. The ridge signal is thus more appropriate for the study of functional connectivity, as will be discussed in the next section. From the perspective of average power, we find different cortical activation maps for ridges with higher versus lower central frequency. Earlier and faster SEs emerged mainly from centroparietal regions bilaterally, but only the postcentral gyrus and the superior parietal lobule survived statistical thresholding. Other groups also linked fast spindles to centro-parietal sources using dipolar source modeling (Manshanden et al., 2002; Urakami, 2008), distributed source modeling (Anderer et al., 2001), spatial filtering (Gumenyuk et al., 2009), and fMRI (Schabus et al., 2007). On the other hand, later and slower SEs emerged, bilaterally, from frontal (medial and superior gyri) and parietal (precuneus, inferior parietal lobule) regions. Activation of the medial frontal lobe for slow spindles was also observed using distributed source modeling (Anderer et al., 2001) and fMRI (Schabus et al., 2007). We note here that despite the small sample size in our study (5 subjects), our source localization yields highly significant activity with remarkable concordance with the literature.

In addition, it was reported that frontal activity linked to slow spindles shows fair inter-subject variability both at the sensor level (Doran, 2003) and at the source level (Anderer et al., 2001); group analyses would thus tend to dampen activity in this region. Inter-subject variability could also be explained by a lower signal-to-noise ratio (SNR) for signals generated by deep/mesial sources, which affects the performance of any source localizer (Hämäläinen and Ilmoniemi, 1994). The significant group activation in the medial frontal gyrus could thus be explained by the higher resistance of ridge-based source localization to lower SNR (Zerouali et al., 2013).

#### **NEW INSIGHTS FROM FUNCTIONAL CONNECTIVITY**

#### *Sources of synchrony: connectivity*

As discussed in Section Group-Level Synchronous Networks, short-range connectivity was assessed using pair-wise synchrony within parcels (3,053,790 pairs in total) while long-range connectivity was defined as pair-wise synchrony among regions (3828 pairs). We observe that short-range spindle synchrony (99.9% of all cortical pairwise associations) was significantly higher for earlier than for later SEs, while the reverse was true for long-range synchrony (higher for later SEs). This observation supports the view that short- and long-range synchronies are somewhat antagonistic. Indeed, short-range synchrony must be weak for a network to synchronize massively over long-range distances (Langheim et al., 2006), and strong short-range synchrony, such as during slow wave sleep, prevents TMS-induced electrical waves from propagating and reaching far cortical targets (Massimini et al., 2005). We note, however, that our values of short-range synchrony are corrupted by current leakage during source reconstruction. Indeed, due to the ill-posed nature of the source imaging inverse problem, source extension is usually overestimated, thus creating artificially high PLV values (Schoffelen and Gross, 2009; Hillebrand et al., 2012).

Our most important result is that, regardless of the timing of frequency-locking (early vs. late SEs), we observed strong frontotemporal connectivity, bilaterally. However, inter-hemispheric connectivity was weak during early SEs but was significantly strengthened during later SEs. Also, although highly significant, the quantitative variations in long-range functional connectivity are weak (ΔPLV = 0.03). In our work, a 6% (ΔPLV/PLVearly) increase in the global synchronization level of the cortex yielded an approximately 220% [(99 − 31)/31] increase in the number of significant long-range connections. This is an interesting observation since it supports the view that the reinforcement of long-range connections of the functional networks during spindles is a low-cost mechanism. Cost-efficiency is an important feature of small-world networks, such as brain networks, which optimize the balance between local and long-range connectivity in order to minimize wiring cost while preserving efficient information flow (Bassett and Bullmore, 2006). It is worth mentioning that the null hypothesis models the synchrony among uncoupled oscillators with similar frequency contents (due to the narrow-band spectrum as displayed in **Figure 5**). It was computed by shuffling the time series in source space, separately in each condition. Alternatively, we could have modeled the null hypothesis as asynchronous events at the sensor level. This could have been done by shuffling the ridge masks among spindles in the data space. On a qualitative basis, we observe that both approaches yield equivalent thresholds, and thus similar connectivity graphs. In addition, it would be of interest to compare the connectivity changes highlighted by our statistical thresholding of connectivity matrices to other dimension-reduction strategies, such as minimum spanning trees (Tewarie et al., 2014).

Taken together, our results suggest that functional connectivity undergoes important changes during spindles, evolving from a pattern of short-range and intra-hemispheric connections to more long-range and inter-hemispheric connections. This transition from local to global networks during spindles is one of the most important new discoveries from our work.

#### *Sources of synchrony: dynamics*

Most spindles started with a faster oscillation that decelerated to a slower oscillation at the end of the spindle. This suggests that fast and slow stages of spindles are two manifestations of the same oscillator, which we view as a neural system endowed with functional capabilities, that varies in frequency over a dynamic range. The fast/slow spindle classification thus may result solely from the relative durations of the fast and slow regimes.

One puzzling observation is that early SEs can be either fast or, although infrequently, slow and the reverse is true for late SEs. We thus asked what is the fundamental property underlying the two classes of spindles, timing or frequency? We found that, for both early and late synchrony, cortical power has a consistent distribution regardless of frequency. On the other hand, functional connectivity patterns are inconsistent with respect to either timing or frequency alone, early slow and late fast synchrony being much reduced compared to the early fast and late slow synchrony.

It is noteworthy that we observe a link between the frequency at which the functional network oscillates and its spatial extent. Indeed, we showed that early SEs, which are characterized by a high frequency (>13 Hz), involve lower large-scale connectivity than late SEs, which are characterized by a lower frequency (<13 Hz). Despite a small frequency range, this result is consistent with evidence suggesting that fast rhythms (i.e., gamma) support local synchrony among neurons within a cortical patch while slower rhythms (i.e., beta, alpha, theta) support distant synchrony (von Stein and Sarnthein, 2000). The coupling mechanism between frequency and spatial extent was shown to rely on the firing properties of interneurons in a mathematical model of coupled networks. Indeed, a qualitative change in interneuron firing (spike doublet) was shown to cause a switch in oscillating frequency from the gamma to the beta range (Ermentrout and Kopell, 1998). Interestingly, using a similar model, it was shown that quantitative changes in the level of self-inhibition of interneurons could tune the oscillating frequency within the lower beta range (12–20 Hz, Kopell et al., 2000). Accordingly, we can hypothesize that, during the time course of a spindle, the levels of self-inhibition of interneurons of the thalamocortical network increase, thus causing the oscillation frequency to slow down.

In the light of previous findings, our results show that, although frequency does not impact on the sources involved in synchrony, the connectivity of the network is certainly dependent on appropriate time-frequency dynamics that might be modulated through self-inhibitory properties of interneurons.

#### *Implications for studies on the functional role of spindles*

The implication of spindles in the consolidation of memory has been suggested by a wealth of studies and is now widely accepted as unequivocal (Walker and Stickgold, 2006). Procedural learning and declarative memory are associated with spindle density and sigma power (Morin et al., 2008; Schabus et al., 2007; Tamaki et al., 2009; Barakat et al., 2011; Fogel et al., 2012). Generators of the oscillatory regime and functional connectivity underlying early and late synchrony may underlie the role of spindles in brain plasticity. Future research should investigate how overnight procedural and declarative memory consolidation would influence generators and functional connectivity of early and late spindle synchrony. This research should also be performed in an older population, which not only shows reduced spindle density, but also reduced spindle amplitude, duration, and a trend for faster spindle mean frequency. Age-related differences in overnight memory consolidation (Spencer et al., 2007; Aly and Moscovitch, 2010; Wilson et al., 2012) may be linked to modifications in functional connectivity of spindle synchrony.

# **CONCLUSION**

In this paper, we studied sleep spindles as a sequence of transient synchronous events using MEG recordings. The methodology we developed targets specifically cortical synchronous oscillations. It involves a non-linear filtering of MEG signals using wavelet ridges, yielding ridge signals on the sensors that embed the synchronous component buried in MEG recordings. Our approach is endowed with a high sensitivity to spindle activity, since synchrony can be detected regardless of energy, and high specificity due to a controlled selection of synchronous events. We were thus able to extract statistically robust patterns of functional connectivity despite having tested only five participants. We were able to show that functional connectivity undergoes dynamical changes with respect to time-frequency features of the spindles. Future research will focus on the effect of aging and learning on such functional connectivity.

#### **ACKNOWLEDGMENT**

We acknowledge the referees for their careful reading of the manuscript. This work was supported by grants from Quebec Brain Imaging Network (Jean-Marc Lina, Julie Carrier, and Pierre Jolicoeur) and NSERC discovery program (Jean-Marc Lina, Julie Carrier).

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnins.2014.00310/abstract

**Supplementary Figure 1 | Superimposition of the real part of the ridge signal with the original signal of Figure 1.**

**Supplementary Figure 2 | Example of FDR statistical thresholding of PLVs distribution.** The blue and red curves represent the distributions of PLVs during an SE and during a baseline period, respectively. The threshold (vertical dashed line) is set such that the ratio between the suprathreshold area under the red curve and the suprathreshold area under the blue curve is equal to an arbitrary value. We chose to set the FDR threshold at 5%, which amounts to tolerating 5% false positives.
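The thresholding rule described in this caption can be sketched as follows, assuming both distributions are available as empirical samples: the threshold is the smallest candidate value at which the baseline tail fraction divided by the SE tail fraction drops to the chosen level. The function names `tail_fraction` and `fdr_threshold`, the choice of the smallest qualifying candidate, and the synthetic PLV samples are assumptions for illustration only.

```python
import random


def tail_fraction(values, tau):
    """Fraction of values at or above the threshold tau."""
    return sum(v >= tau for v in values) / len(values)


def fdr_threshold(se_plvs, baseline_plvs, q=0.05):
    """Smallest candidate threshold whose baseline/SE tail ratio is <= q."""
    for tau in sorted(set(se_plvs)):
        # The SE tail is never empty because tau is itself an SE value.
        ratio = tail_fraction(baseline_plvs, tau) / tail_fraction(se_plvs, tau)
        if ratio <= q:
            return tau
    return None  # no candidate reaches the requested ratio


def clipped_gauss(rng, mu, sigma):
    """Gaussian sample clipped to the valid PLV range [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mu, sigma)))


# Synthetic samples (hypothetical): baseline PLVs lower than SE PLVs.
rng = random.Random(0)
baseline = [clipped_gauss(rng, 0.4, 0.1) for _ in range(2000)]
during_se = [clipped_gauss(rng, 0.7, 0.1) for _ in range(2000)]

tau = fdr_threshold(during_se, baseline, q=0.05)
ratio_at_tau = tail_fraction(baseline, tau) / tail_fraction(during_se, tau)
print(f"threshold = {tau:.3f}, tail ratio = {ratio_at_tau:.3f}")
```

Because the tail ratio need not decrease monotonically with the threshold, a production implementation might prefer the largest threshold above which the ratio stays below the chosen level; the sketch keeps the simpler rule.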

**Supplementary Figure 3 | Activation maps associated with early (upper left) and late (upper right) SEs.** The maps are displayed with Otsu's threshold for easier visual comparison.

**Supplementary Figure 4 | Unthresholded activation maps associated with each of the 4 categories of SEs.** Normalization and color code are the same as **Figure 7**.

**Supplementary Figure 5 | Cortical parcels used in the computation of large-scale functional connectivity.** Parcels are grossly derived from the Tzourio-Mazoyer atlas and registered with the MNI template. Color-coding indicates brain lobes: frontal (red), parietal (blue), temporal (cyan), medial (green), and occipital (orange).

**Supplementary Figure 6 | Connectivity profile associated with early (left) and late (right) SEs.** Inter-region synchrony is depicted with curved lines linking two nodes. Color coding and statistical thresholds are the same as in **Figure 9**.

**Supplementary Figure 7 | Connectivity profiles associated with early (upper row) and late (bottom row) SEs for each subject.** No statistical threshold was computed on these profiles since subject-based analysis suffers from low degrees of freedom; all links displayed reflect PLV values over 0.9.

**Supplementary Figure 8 | Morse wavelet parameterized with β = 4 and γ = 4. (A)** Wavelet representation in the time domain. The thin black and dashed lines represent respectively the real and imaginary parts of the complex wavelet and the thick line represents its envelope. **(B)** Representation of the wavelet in the Fourier domain over the positive part of its spectrum.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 January 2014; accepted: 13 September 2014; published online: 28 October 2014.*

*Citation: Zerouali Y, Lina J-M, Sekerovic Z, Godbout J, Dube J, Jolicoeur P and Carrier J (2014) A time-frequency analysis of the dynamics of cortical networks of sleep spindles from MEG-EEG recordings. Front. Neurosci. 8:310. doi: 10.3389/fnins.2014.00310*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Zerouali, Lina, Sekerovic, Godbout, Dube, Jolicoeur and Carrier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Contributions and complexities from the use of *in vivo* animal models to improve understanding of human neuroimaging signals

# *Chris Martin\**

*Department of Psychology, The University of Sheffield, Sheffield, UK*

#### *Edited by:*

*Christopher W. Tyler, Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*Afonso C. Silva, NINDS, USA Fahmeed Hyder, Yale University, USA*

#### *\*Correspondence:*

*Chris Martin, Department of Psychology, The University of Sheffield, Western Bank, Sheffield S10 2TP, UK e-mail: c.martin@sheffield.ac.uk*

Many of the major advances in our understanding of how functional brain imaging signals relate to neuronal activity over the previous two decades have arisen from physiological research studies involving experimental animal models. This approach has been successful partly because it provides opportunities to measure both the hemodynamic changes that underpin many human functional brain imaging techniques and the neuronal activity about which we wish to make inferences. Although research into the coupling of neuronal and hemodynamic responses using animal models has provided a general validation of the correspondence of neuroimaging signals to specific types of neuronal activity, it is also highlighting the key complexities and uncertainties in estimating neural signals from hemodynamic markers. This review will detail how research in animal models is contributing to our rapidly evolving understanding of what human neuroimaging techniques tell us about neuronal activity. It will highlight emerging issues in the interpretation of neuroimaging data that arise from *in vivo* research studies, for example spatial and temporal constraints to neuroimaging signal interpretation, or the effects of disease and modulatory neurotransmitters upon neurovascular coupling. We will also give critical consideration to the limitations and possible complexities of translating data acquired in the typical animal models used in this area to the arena of human fMRI. These include the commonplace use of anesthesia in animal research studies and the fact that many neuropsychological questions that are being actively explored in humans have limited homologs within current animal models for neuroimaging research. Finally, we will highlight approaches, both in experimental animal models (e.g. imaging in conscious, behaving animals) and human studies (e.g. combined fMRI-EEG), that mitigate these challenges.

**Keywords: neurovascular, functional magnetic resonance imaging, rodent, neuroimaging, hemodynamic**

# **INTRODUCTION**

"One of the difficulties in understanding the brain is that it is like nothing so much as a lump of porridge."

Richard L. Gregory (Gregory, 1997)

Human functional brain imaging techniques now play a prominent role in much neuroscience and psychological research. In contrast to many other organs in the body, assigning function, or aspects of function, in space or time to particular components of the brain on the basis of even a very detailed analysis of its structure alone is extremely difficult. Although we have learned much from neuropsychological research in human subjects or experimental animal research studies, there can be little doubt that the possibilities for understanding afforded by any technique that can provide a spatiotemporal readout of changes in brain function are immense. It is thus perhaps unsurprising that the rate of publication of scientific papers incorporating functional neuroimaging now exceeds 10 per day (Kim and Ogawa, 2012). Beyond academic communities, these tools and the images they produce have also captured public interest and provided renewed opportunities for neuroscientists to engage with publics on both scientific matters and upon issues at the science-society interface (Racine et al., 2005).

For more than 20 years, the non-invasive neuroimaging technique of functional magnetic resonance imaging (fMRI) based on blood oxygen level dependent (BOLD) signal changes has been used to estimate neural signals in the human brain (Ogawa et al., 1990). Prior to this, positron emission tomography (PET) firmly established the possibilities for human neuroimaging based on surrogate hemodynamic markers. Since the human application of these hemodynamic-based neuroimaging techniques, substantial effort has been directed toward improving our understanding of functional brain imaging signals, the hemodynamic changes that give rise to them, and the relationship of these changes to underlying neuronal activity (**Figure 1**). Although it is widely appreciated amongst scientific communities that the relationships between activity within heterogeneous neuronal populations and neuroimaging signals are complex, rather indirect, and incompletely understood, many research papers continue to report neuroimaging signals as "*measures of neuronal activity*," which they are not. In addition, it has been shown that the presence of brain "activation" images in scientific papers results in higher ratings of scientific quality even where the images provide no additional information to the reader (McCabe and Castel, 2008). It is thus increasingly important that we both build our understanding of precisely what functional neuroimaging signals do tell us about brain activity, and ensure that improved frameworks for the interpretation of brain imaging data are effectively communicated. In this review, we will use the terms neuroimaging or functional brain imaging to refer to hemodynamic imaging techniques as applied to human subjects, rather than the broader definition which includes electroencephalography (EEG) or magnetoencephalography (MEG).

Neuroimaging signals arise because of a coupling between changes in neural activity, metabolism, and hemodynamics (blood flow, oxygenation, and volume) in the brain, termed neurovascular coupling (**Figure 1**; Villringer, 1997; Logothetis and Wandell, 2004; Logothetis, 2008). Studies of neurovascular and neurometabolic coupling have therefore been central to the progress that has been made so far in investigating what neuroimaging signals tell us about neuronal activity. Important insights have arisen from research using animal models, in our laboratory and elsewhere, in which detailed measurements of functional brain imaging signals and/or the hemodynamic and neuronal events underpinning them can be made (Mathiesen et al., 2000; Norup Nielsen and Lauritzen, 2001; Smith et al., 2002; Devor et al., 2003, 2005; Martindale et al., 2003; Jones et al., 2004; Sheth et al., 2004; Berwick et al., 2005, 2008; Hewson-Stoate et al., 2005; Martin et al., 2006a,b, 2012; Hillman et al., 2007; Franceschini et al., 2008; Boorman et al., 2010; Kim et al., 2010). It is possible, in animal models, to explore in detail the changes that occur in each component of the complex system linking neuronal activity to neuroimaging signal changes, as illustrated in **Figure 1**. Thus, a number of landmark papers (e.g. Logothetis et al., 2001) have now established the general validity of fMRI signal changes as indicators of altered neuronal activity: there is overwhelming empirical evidence that increases in BOLD fMRI signals in healthy cortical structures reflect increased neuronal activity in those structures. However, the boundaries of this broad statement and limits to its generalizability are becoming increasingly important as neuroimaging is applied to study the whole brain in both health and disease contexts.

The focus of this paper will therefore be upon the exceptions, complexities and remaining uncertainties around neuroimaging signal interpretation, as revealed predominantly by *in vivo* experimental animal research into neuroimaging signals and neurovascular coupling. We shall begin by outlining the main approaches to investigating neuroimaging signals and neurovascular coupling in experimental animal models, including a brief overview of the major techniques. We shall then review some of the key research questions that these approaches allow us to address along with some of the main insights provided thus far. We shall review recent research highlighting areas where the relationships between neuronal activity and hemodynamic changes are more complex and discuss the implications of this for neural signal estimation in human neuroimaging. Finally we will turn attention to the experimental animal models themselves, their limitations, as well as new possibilities to investigate neurovascular coupling directly in human subjects.

# **INVESTIGATING NEUROIMAGING SIGNALS USING EXPERIMENTAL ANIMAL MODELS**

Research using experimental animal models to investigate neuroimaging signals and their relationship to neuronal activity has approached the issue from two converging perspectives. On the one hand there has been an emphasis on determining the parametric relationships between neuronal activity and neuroimaging signals, including characterization of the mathematical relationships between signals, estimation of the hemodynamic impulse response function (HIRF, see **Figure 2** and section Characterization of hemodynamic impulse response functions), and the development of comprehensive biophysical models of neurovascular coupling. We shall refer to this approach as being concerned with "parametric neurovascular coupling," emphasizing the concurrent measurement of signals in different components of the neurovascular-neuroimaging system, illustrated schematically in **Figure 1** (outer processes). On the other hand, attempts have been made to determine the physiological mechanisms of neurovascular coupling, with a focus on the key chemical mediators and modulators, and the relative involvement of different cell types including astrocytes, neurons, pericytes, endothelial cells, and vascular smooth muscle cells. We shall refer to this approach as being concerned with "physiological neurovascular coupling," emphasizing the biochemical and physiological mechanisms that mediate the relationships between the neuronal, metabolic, and hemodynamic components, illustrated in **Figure 1** (inner detail). The former approach is more directly relevant to the development of strategies to analyze neuroimaging data and our ability to make general judgments regarding the effects of experimental manipulations on the magnitude of underlying neuronal signals, whereas the latter is critical if we are to improve our detailed understanding of the neurophysiological events that these signals represent. Convergence of both perspectives is important for progressing our ability to estimate neural signals from functional brain imaging data acquired in an increasingly broad range of scenarios. With this distinction in mind, we shall now briefly walk through the principal methodologies being applied in this field and outline how they contribute to either approach. A summary of the main techniques used to improve understanding of neuroimaging signals in experimental animal models is provided in **Table 1** and for a fuller review the reader is directed to recent publications by Devor et al. (2012, 2014) and Zhao et al. (2014c).

#### **fMRI FOR INTERROGATING NEUROIMAGING SIGNALS**

To some extent it is "by definition" that investigations of neurovascular coupling usually require measurement of both the neural and vascular components of the overall system. In parametric neurovascular coupling research this is predominantly the case and so experimental designs in which multiple data acquisition methods can be combined have become the workhorse of this field. Because of the high degree of inter-trial (and inter-subject) variability inherent in hemodynamic measurements in particular, concurrent acquisition of neuronal and hemodynamic data provides significant statistical and data modeling advantages (in addition to reducing animal numbers). Combining neuroimaging techniques that are used in humans (such as fMRI) with electrophysiological techniques to measure neuronal signals would appear the most parsimonious approach to elucidating neuroimaging-neural signal relationships. These data provide direct insight into the quantitative relationships between neuroimaging and neural signals. Furthermore, small-animal fMRI systems, through a combination of reduced bore size (and accordingly other hardware including gradient coils), smaller radiofrequency coils and increased field strength, are routinely able to resolve voxels approximately an order of magnitude smaller than obtainable in human fMRI. High field strength fMRI in animal models has enabled researchers to study the fine detail of hemodynamic responses, with sub-second temporal precision and in-plane voxel sizes of <100 μm (see section Heterogeneous distribution of fMRI signals sources revealed by high field strength fMRI). However, combining this approach with direct measures of neuronal activity is problematic for a number of reasons. Firstly, electrodes to record neuronal activity will introduce artifacts into the imaging data.
These can be minimized by careful electrode design, choice of materials, and image acquisition parameters, but it is very difficult to completely avoid distortion of imaging data at precisely the point where it is of most interest (where the electrode is positioned). Secondly, fMRI in particular makes use of large, rapidly changing electromagnetic fields as part of the data acquisition process which can severely distort the electrophysiological measurement of neural signals. Whilst an increasing number of laboratories have been able to find ways around this (e.g. Huttunen et al., 2008), it is technically challenging and often requires substantial "development time" on an MRI system, making it a costly process. Thirdly, there are also limits to the temporal and spatial resolution of PET and, to a lesser extent, fMRI. Although high-field strength fMRI is now able to achieve impressive results, as discussed further below, convergence upon the spatiotemporal scales at which neuronal activity operates or blood flow is directly regulated remains elusive. Finally, a major barrier to the use of non-invasive imaging techniques for research in this field is cost: preclinical MRI facilities (for rats/mice) may charge >\$3000 per day, whereas facility costs are typically an order of magnitude lower for using many of the alternate *in vivo* approaches as summarized in **Table 1**.

**Table 1 | Overview of principal methods used to investigate neuroimaging signals and neurovascular coupling.**

#### **OPTICAL METHODS FOR IMPROVED RESOLUTION AND MULTIMODAL DATA ACQUISITION**

Because of these limitations, and because more invasive approaches can be used in experimental animal models, methods of making hemodynamic measurements that require more direct brain access have become well established. An increasingly wide range of techniques make use of the light absorption and/or scattering properties of brain tissue, and specifically the hemoglobin present in the vascular system, in order to obtain high spatial and temporal resolution readouts of hemodynamic changes. These techniques usually require visualization of the brain either through a craniotomy or a thin cranial window (the skull is thinned to translucency over the imaged brain tissue). One popular technique, intrinsic signal optical imaging (Malonek and Grinvald, 1996), measures changes in the concentration of oxy- and deoxy-hemoglobin and is able to resolve changes at the level of tens of microns and tens of milliseconds (**Figure 3**). This technique is particularly amenable to combination with other methods, such as the implantation of microelectrodes to simultaneously record neuronal activity, or the addition of other probes to record blood flow changes or tissue oxygenation. Another advantage of this approach is that the cost and complexity of the imaging system are much reduced compared to MRI systems. However, a major disadvantage of many standard optical imaging approaches is the limited depth penetration due to light scattering and absorption by tissue: systems exploiting light in the visible wavelength range are limited in signal acquisition to just the first few hundred microns of cortical tissue.
The related techniques of near infrared spectroscopy, which uses light at longer wavelengths to penetrate several centimeters of tissue, and diffuse optical tomography, which uses an EEG-style array of detectors (and sources), have the advantage that they are used in both human and animal studies (they can penetrate the skull), but there is a severe trade-off in terms of spatial resolution due to light scattering effects. This class of methodologies provides information relevant to both parametric and physiological neurovascular coupling approaches (although typically individual research studies will emphasize one or the other approach). This is because fairly detailed measurement of different aspects of the hemodynamic and neuronal responses can be combined with pharmacological manipulations to enable interrogation of relevant biochemical pathways (e.g. see reviews by Carmignoto and Gomez-Gonzalo, 2010; Cauli and Hamel, 2010).

[...] topographically preserved with mapping in somatosensory cortex **(B)** of individual whiskers to individual cortical columns **(C)**. **(A–C)** adapted from Chen-Bee et al. (2012). **(D)** Measurement of total hemoglobin concentration changes during stimulation of individual whiskers (A1–E1) produces spatiotemporal activation maps that allow spatial discrimination of activation [...] surface (vasculature) and cortical histological sections with the stimulated barrel highlighted in black to verify anatomical specificity. The contour around each stimulated barrel is the activated total hemoglobin region defined as all pixels with 50% of the peak response from a mean image of the last 4 s of the 16 s stimulation period. **(D,E)** adapted with permission from Berwick et al. (2008).

#### **MICROSCOPY METHODS FOR CELLULAR LEVEL RESOLUTION AND RESEARCH INTO MECHANISMS**

Two-photon laser scanning microscopy (2PLSM) is an increasingly popular technique that is able to record both neuronal and hemodynamic events at cellular level resolution (see Shih et al., 2012a, for an excellent recent review), thus enabling physiological neurovascular coupling to be investigated. Fluorescent reporter molecules are used to provide information about intracellular events, vascular responses, blood flow and other phenomena and these can be measured at extremely high spatial resolution due to the very narrow focal plane of 2PLSM. Additionally, the use of laser light with wavelengths in the near infrared range is able to achieve better tissue penetration than standard optical imaging techniques (up to 1 mm). It is also possible to "stimulate" neurophysiological events, for instance using calcium uncaging by photolysis (e.g. Takano et al., 2006), thus providing opportunities for very fine levels of control and recording of neurovascular function. This microscopic approach is often combined with highly specific pharmacological or electrophysiological manipulations which may be applied to individual cells. It can be performed *in vivo* or *in vitro*, and increasingly research groups will deploy the technique in both modes to address a specific question in order to combine the enhanced control, manipulation and data quality advantages of the *in vitro* approach with the demonstration of functional relevance that is best achieved *in vivo*. 2PLSM is therefore becoming established as a "gold standard" method for addressing physiological neurovascular coupling research questions. Disadvantages of 2PLSM include the requirement for the objective lens to be very close to the imaged tissue sample, making the combination of 2PLSM with other recording methods technically challenging.
This could be a problem for relating the microscopic findings from 2PLSM to the more macroscopic vascular and neuronal events that neuroimaging measures or makes inferences about, respectively, and which is typically a strength of parametric neurovascular coupling approaches.

#### **MONITORING NEURONAL ACTIVITY**

A major advantage of using animal models in this context is that it is possible to combine methods that measure hemodynamic changes (including non-invasive imaging tools such as small animal MRI) with other, usually invasive methods that can measure the changes in neuronal activity that underlie the hemodynamic events. For this latter purpose, electrodes can be implanted directly into the brain to record the activity of neurons and in some cases multi-site electrode probes are used which can capture activity at multiple locations in the brain or along the length of a functional unit such as the cortical column (Berwick et al., 2008). From these recordings it is possible to resolve both local field potentials (LFPs) (an aggregate measure of excitatory and, to a lesser extent, inhibitory synaptic activity in a local cell population, primarily reflecting input) and spiking activity (action potentials, primarily reflecting output activity). Other, optical approaches are able to produce high resolution 2- or 3-D maps of cellular activity (Akkin et al., 2010), often in combination with reporter dyes (Devor et al., 2007). Although these invasive approaches provide measurements of neuronal activity that share signal sources with non-invasive electrophysiological techniques that are used in human subjects, such as EEG or MEG, direct comparisons are not straightforward because of large differences in the spatial resolutions achievable and other factors. In this respect, further research is needed to improve understanding of the relationships between the different measures of neuronal activity made in humans and animal models to underpin translation of findings between studies (e.g. Buzsáki et al., 2012).

The direct brain access, experimental control and stability afforded by the use of experimental animal models have enabled many combinations of the techniques or approaches described above and in **Table 1** to be used in order to directly interrogate neuroimaging signals and neurovascular coupling (see Zhao et al., 2014c). For example, in our own laboratory we have established concurrent optical imaging and high field fMRI in rodents in order to provide insights into the hemodynamic constituents of BOLD fMRI signal changes (**Figure 4**). Working outside of the MRI environment we have been able to combine a wider range of measurement techniques for concurrent recording of various neuronal, metabolic, and hemodynamic signals in rodent somatosensory cortex (**Figure 5**), providing a multimodal readout of neurovascular function. The multimodal measurement techniques, when combined with other experimental tools such as transgenic animals or pharmacological manipulations, can provide new insights which bridge the "parametric" and "physiological" neurovascular coupling research approaches. This is especially the case when the resultant data are made available for the development of biophysical models which attempt to capture the mathematical relationships between important biological parameters (e.g. Zheng et al., 2010; Brodersen et al., 2011; Rosa et al., 2011).

**FIGURE 4 | Concurrent fMRI and optical imaging spectroscopy.** An oblique slice covering the dorsal surface of the brain (top left) is first used to identify a coronal (top center) or topographic slice (top right) containing the whisker barrel cortex for fMRI data acquisition. Apparatus allowing concurrent optical imaging is illustrated (top), consisting of a specially adapted MRI-compatible endoscope to transmit light to and from the brain surface. Optical imaging data is shown in the bottom panels, including the raw gray scale imaging of the cortical surface visualized through a thin cranial window (left and center), and activation induced changes in deoxyhemoglobin concentration (right) that correspond well to the concurrently acquired fMRI data (top right). Adapted with permission from Kennerley et al. (2012).

#### **POSSIBILITIES FOR ACUTE AND RELIABLE LONGER TERM DATA ACQUISITION**

Acute experiments involving animal models might be conducted over many hours, during which time baseline physiological parameters are carefully monitored and maintained within a narrow range. This enhanced window for data acquisition enables a more intensive exploration of the effects of manipulating independent variables (such as stimulation intensity, duration, frequency, repetition rate, multiple stimulus types) upon measured responses than would be possible in human subjects. Within such a time window it is also possible to explore the effects of pharmacological or other manipulations on a within-subjects basis, obtaining neurovascular coupling readouts pre-, during-, and post-treatment. Because animals can be fixed with respect to the imaging apparatus more rigidly and for longer periods than would be possible in human subjects, test-retest reliability is very high. Chronic experimental designs are also possible where data from individual subjects can be acquired repeatedly over many weeks or months (Weber et al., 2006; Silva et al., 2011; Brydges et al., 2013; Martin et al., 2013). Although such approaches are also used in human neuroimaging studies, the high degree of experimental control possible in animal studies is particularly advantageous in light of the many factors which in humans can alter neurovascular coupling (e.g. as discussed elsewhere in this paper).

#### **WHOLE BRAIN ACCESS**

Human neuroimaging studies frequently report on responses to stimuli occurring across the brain, in both cortical and subcortical regions. In animal models it is possible to elicit controlled changes in neuronal activity in specific structures, including deep brain structures which are not easily accessible to non-invasive stimulation techniques, in order to refine our understanding of neurovascular coupling across the whole brain. In addition, as many brain structures are in receipt of multiple, convergent inputs that may or may not involve different cell types or neurotransmitters, it is possible, by independently stimulating convergent pathways, to improve our understanding of what the composite neuroimaging signal is revealing about input patterns (e.g. see Enager et al., 2009; Krautwald and Angenstein, 2012; Krautwald et al., 2013).

# **KEY AREAS OF INSIGHT FROM** *IN VIVO* **EXPERIMENTAL ANIMAL RESEARCH STUDIES**

In the following sections, we focus on four major themes that emerge from state-of-the-art research studies using animal models which target key questions in relation to understanding neuroimaging signals.

#### **HETEROGENEOUS DISTRIBUTION OF fMRI SIGNALS SOURCES REVEALED BY HIGH FIELD STRENGTH fMRI**

Research carried out on high field strength, preclinical MRI systems has provided important insights into the hemodynamic composition of neuroimaging signals as well as the localization of separable signal components to specific vascular or neuronal architectures across the cortical laminae. This is a question not just relating to anatomical localization, but given our extensive understanding of how neural computation is distributed within the cortical columnar structure, goes to the core of precisely "what" fMRI signals reveal about neuronal activity in the context of cortical neuroimaging.

Recent studies in rodents and non-human primates have enabled the cortical hemodynamic response to sensory stimulation to be elucidated as a function of cortical depth. Goense et al. (2012) studied both positive and negative BOLD signals, in addition to the commensurate CBV and CBF changes, in the primary visual cortex of anesthetized macaques. Although positive BOLD signals were associated with increases in both CBV and CBF, the depth profile varied for the BOLD, CBV, and CBF changes. Whilst BOLD increases were maximal at the cortical surface, CBF increases were maximal at approximately layer IV whilst CBV increases occurred with relative uniformity throughout the cortical layers. More intriguingly, negative BOLD signals occurred in regions where CBV increased and CBF decreased. Maximal negative BOLD responses occurred across the middle cortical layers whereas CBV responses were largest around layer IV and CBF responses were largest at the surface. This pattern of results suggests that the hemodynamic response to changes in neuronal activity that is measured in most neuroimaging studies (which lack the ability to resolve cortical layers) is an aggregate of a complex set of blood volume, flow and oxygenation response functions which appear to be heavily dependent upon an interaction of vascular and neuronal architecture. Goense et al. further conclude that these data provide evidence for differing neurovascular coupling mechanisms operating at different cortical layers and underlying positive and negative BOLD signal changes. These data are in broad agreement with Shih et al. (2013) who used CBV-weighted fMRI at 11.7T in anesthetized rats to resolve hemodynamic changes at a resolution that revealed cortical column-like structure. CBV changes were again found to be maximal at ∼layer IV.
In this study, the laminar profile of neuronal activity was also measured (in a separate group of animals) and this revealed a mismatch between hemodynamic and neuronal activation-depth profiles. A further challenge to the specificity of BOLD fMRI signals to neuronal activity at the level of cortical laminae was reported by Herman et al. (2013), who calculated the laminar profile of changes in oxidative metabolism (CMRO2) in response to somatosensory stimulation in anesthetized rodents using a multimodal "calibrated fMRI" approach. Here, BOLD and CBV changes appeared spatially uncoupled to separately measured neuronal activity whereas CBF and CMRO2 changes were relatively well coupled to LFPs and multi-unit activity (spiking) respectively.
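The text does not state which model underlies the "calibrated fMRI" estimate of CMRO2. As an illustration only, the sketch below assumes the widely used Davis formulation, in which the fractional BOLD change depends on the CBF and CMRO2 ratios (activation relative to baseline). The function names and the value of the calibration constant `M` are hypothetical; `alpha = 0.38` and `beta = 1.5` are commonly quoted literature defaults rather than values taken from the studies reviewed here.

```python
def davis_bold(cbf_ratio, cmro2_ratio, M=0.08, alpha=0.38, beta=1.5):
    """Fractional BOLD change under the (assumed) Davis model.

    dS/S0 = M * (1 - cmro2_ratio**beta * cbf_ratio**(alpha - beta))
    M is the calibration constant; its value here is purely illustrative.
    """
    return M * (1.0 - cmro2_ratio ** beta * cbf_ratio ** (alpha - beta))


def cmro2_from_bold_cbf(bold_change, cbf_ratio, M=0.08, alpha=0.38, beta=1.5):
    """Invert the model above for the CMRO2 ratio, given BOLD and CBF."""
    return ((1.0 - bold_change / M) ** (1.0 / beta)
            * cbf_ratio ** (1.0 - alpha / beta))


# Round trip with made-up ratios: 40% CBF increase, 15% CMRO2 increase.
bold = davis_bold(cbf_ratio=1.4, cmro2_ratio=1.15)
recovered = cmro2_from_bold_cbf(bold, cbf_ratio=1.4)
```

The point of the sketch is that a BOLD change alone does not determine the metabolic change: a concurrent CBF (or CBV) measurement and a calibration step are needed before CMRO2 can be inferred.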

On the one hand the data obtained using these methods are problematic for fMRI: the neuronal activity mapping potential of BOLD signals in particular appears to be confounded by the neurovascular macrostructure of the cortical column (the spatial relationship of micro- and macro-vasculature to the neuronal and neurometabolic activity "sinks"). On the other hand, these data represent a new set of constraints that researchers can build into mathematical modeling and data analysis tools in order to improve the accuracy of inferences about neuronal activity. In addition, the increasing availability of higher field strength systems for research use and a wider use of multimodal fMRI acquisition approaches (i.e., acquiring CBF and/or CBV data in addition to BOLD signals) will mitigate the uncertainty inherent in BOLD signals. An important new direction for high field strength fMRI research will be to improve understanding of neurovascular and hemodynamic variability across heterogeneous subcortical structures, for instance across the thalamus, hippocampus, or adjacent component structures of the basal ganglia. Since fMRI methods can be applied in both humans and animal models, these efforts will be best served by designing studies that are as far as possible analogous between species (i.e., in terms of stimulation paradigms, pulse sequences, anatomical focus), with animal models presenting an opportunity for more detailed investigation of key findings using additional, invasive techniques (as described above).

#### **CHARACTERIZATION OF HEMODYNAMIC IMPULSE RESPONSE FUNCTIONS (HIRF)**

The "hemodynamic impulse response function" (illustrated in **Figure 2**) is widely used as a canonical model in fMRI data analysis tools and as such a detailed understanding of it, and of how it is affected by brain region, health and disease status, and pharmacological or physiological manipulations, is very important (Gitelman et al., 2003; Martindale et al., 2003; Handwerker et al., 2012). In essence, the HIRF refers to the hemodynamic response that results from a single neuronal event: by convolving this function with either measured neuronal activity or more commonly an estimate of the neuronal input function, an estimate of the neuroimaging signal change attributable to the stimulus is obtained. The HIRF is often approximated as a composite of two gamma functions which may be specified by a relatively small number of parameters. In fMRI analysis these parameters can either be fixed or allowed to vary within a predefined range in order to optimize the identification of responsive voxels. Although the general form of the temporal HIRF has been well characterized and there is recent evidence that it is stable in the context of robust alterations in hemodynamic baseline parameters (Kennerley et al., 2012), in human studies as well as in animal experiments it has been shown that there are many factors that can influence the parameter values that specify the HIRF (**Figure 6**), as reviewed by Handwerker et al. (2012). A further complication is introduced by Hirano et al. (2011), who argue that a discrepancy between the BOLD and CBF HIRF that becomes apparent when moving from very brief to longer stimuli is related to differential contributions of venous and arterial hemodynamic changes to the neuroimaging signals at different time points.
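The convolution described above can be sketched in a few lines. The example below is illustrative only: it builds an HIRF as the difference of two gamma densities and convolves it with a boxcar stimulus to obtain a predicted signal time course. The function names, the shape parameters (6 and 16), the undershoot ratio (1/6), and the sampling interval are all assumptions chosen to give a plausible-looking response, not values taken from the text.

```python
import math

import numpy as np


def double_gamma_hirf(t, peak_shape=6.0, undershoot_shape=16.0,
                      undershoot_ratio=1.0 / 6.0, scale=1.0):
    """Difference of two gamma densities evaluated at times t (seconds).

    All default parameter values are illustrative assumptions.
    """
    t = np.asarray(t, dtype=float)

    def gamma_pdf(x, shape):
        # Gamma probability density; zero for non-positive times.
        out = np.zeros_like(x)
        pos = x > 0
        out[pos] = (x[pos] ** (shape - 1) * np.exp(-x[pos] / scale)
                    / (math.gamma(shape) * scale ** shape))
        return out

    return (gamma_pdf(t, peak_shape)
            - undershoot_ratio * gamma_pdf(t, undershoot_shape))


def predict_signal(neural_input, hirf, dt):
    """Convolve a neuronal input time course with the HIRF (same length out)."""
    full = np.convolve(neural_input, hirf) * dt
    return full[: len(neural_input)]


dt = 0.1                      # sampling interval in seconds (assumed)
t = np.arange(0, 30, dt)      # 30 s of kernel support
hirf = double_gamma_hirf(t)

stimulus = np.zeros(600)      # 60 s time course
stimulus[50:70] = 1.0         # 2 s block of input starting at t = 5 s
predicted = predict_signal(stimulus, hirf, dt)
```

Allowing the shape parameters or the undershoot ratio to vary within bounds, rather than fixing them, corresponds to the flexible-HIRF option mentioned in the text.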

Even less well-understood is the spatial structure of the HIRF, yet specification of this is equally important in the context of spatiotemporal neuroimaging methods and the signal processing required. Within the cortex, the distinct neuronal and vascular architecture that is present in different layers is known to be a source of variability in hemodynamic responses, as described in the previous section. There have however been very few experimental studies investigating the spatial relationships between neuronal and hemodynamic signal changes across the cortical surface or in sub-cortical structures. One exception is Vazquez et al. (2013), who investigated the hemodynamic point-spread function using a multimodal stimulation and data-acquisition approach and showed a linear relationship between the spatial extent of neuronal and hemodynamic responses. One reason for the lack of research into the spatial correspondence of neuronal and hemodynamic changes that is required to inform development of a spatial HIRF is the limited availability of methods to simultaneously provide spatiotemporal readouts in both measurement modalities (e.g. see Baraghis et al., 2011). One solution is to investigate spatial hemodynamic changes in the context of neural systems with very well-known spatial response properties, such as the primary visual cortex (e.g. Aquino et al., 2014). In the context of optical imaging techniques it may also be possible to take advantage of rapid changes in light scattering that occur (due to cell swelling) when neurons are activated, or to separate optical signal components (e.g. from intrinsic metabolic and hemodynamic markers) using multichannel acquisition systems (Zhao et al., 2014b). In section Relating signals in time and space, we review other research into the temporal and spatial relationships between neuronal and hemodynamic signal changes which will also be important in the refinement of the canonical HIRF and how it is used.

Understanding the sources of HIRF variability could lead to improvements in neuroimaging experimental design, data analysis or interpretation. This understanding could also produce possibilities for differences in the HIRF to be exploited as biomarkers for the neuronal, vascular, or neurovascular perturbations of function associated with many brain diseases. Future goals could include the development of a framework for neuroimaging data analysis that applies empirically derived constraints on a flexible HIRF according to (e.g.) brain region, subject age, health status, etc. Alternately, methods for obtaining subject or brain region specific HIRF estimates in human neuroimaging could be investigated (Kang et al., 2003; Handwerker et al., 2012) with the aim of developing a rapid scanning protocol which can help "tune-up" the canonical HIRF to be used in the analysis of the main study neuroimaging data.
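One simple way such a subject- or region-specific estimate might be obtained is a finite impulse response (FIR) fit, in which the response at each post-stimulus lag is estimated by least squares without assuming a particular shape. The sketch below is a hedged illustration of that general idea, not the method of the cited studies; the function name, the synthetic kernel, the event times, and the noise level are all hypothetical.

```python
import numpy as np


def estimate_hirf_fir(signal, events, n_lags):
    """Least-squares FIR estimate of an impulse response.

    signal : measured time course (1-D array)
    events : same-length array, 1 at stimulus onsets and 0 elsewhere
    n_lags : number of post-event samples to estimate
    """
    signal = np.asarray(signal, dtype=float)
    events = np.asarray(events, dtype=float)
    n = len(signal)
    # Column k of the design matrix is the event train delayed by k samples.
    design = np.zeros((n, n_lags))
    for k in range(n_lags):
        design[k:, k] = events[: n - k]
    coeffs, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return coeffs


# Synthetic demonstration (all values made up for illustration).
dt = 0.5                                        # sampling interval, seconds
lag_times = np.arange(40) * dt                  # 20 s of response
true_kernel = lag_times ** 5 * np.exp(-lag_times) / 120.0
events = np.zeros(300)
events[[10, 57, 101, 160]] = 1.0                # brief, well-separated stimuli
rng = np.random.default_rng(0)
clean = np.convolve(events, true_kernel)[: len(events)]
measured = clean + 0.02 * rng.standard_normal(len(events))
estimated = estimate_hirf_fir(measured, events, n_lags=40)
```

An estimate of this kind could then be used to constrain or replace the canonical parameters in the main analysis, at the cost of needing enough stimulus repetitions to average down the noise.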

#### **INVESTIGATIONS OF NEUROVASCULAR COUPLING MECHANISMS**

A full review of insights into the biochemical or physiological mechanisms known to underpin neurovascular coupling is beyond the scope of this paper and there are a number of excellent recent reviews which address this issue directly (e.g. Attwell et al., 2010; Carmignoto and Gomez-Gonzalo, 2010; Cauli and Hamel, 2010; Hillman, 2014). In overview, combining the methodologies detailed in section Investigating neuroimaging signals using experimental animal models and **Table 1** with pharmacological manipulations has been useful in elucidating, for instance through selective inhibition, key mechanisms, molecules, and mediators involved in neurovascular coupling. It is partly through this approach for instance that the roles of nitric oxide (Akgoren et al., 1994; Lindauer et al., 1999; Kitaura et al., 2007) and cyclooxygenases (Niwa et al., 2000, 2001; Lecrux et al., 2012) in neurovascular coupling have been elucidated, or the relationships between cerebral blood flow changes and oxidative metabolism have been probed (Leithner et al., 2010). This approach also enables the impact of therapeutic or other drugs upon neurovascular coupling or brain imaging signals to be investigated, either to explore these effects directly or to inform fMRI data interpretation in (for example) patient groups (Choi et al., 2006; Lindauer et al., 2010; Chin et al., 2011; Sander et al., 2013).

Most recently, possibilities for using optogenetic tools to help dissect the cellular-specific contribution to BOLD signals and neurovascular coupling have become available. Work using these methods is in its earliest stages. It has been applied to investigate the spatial correspondence of hemodynamic and neuronal responses to light-activation vs. sensory stimulation (Vazquez et al., 2013; Li et al., 2014) and in addition Lee et al. (2010) combined optogenetic stimulation with fMRI in mice and demonstrated a complex pattern of both positive and negative BOLD signal changes attributable to specific activation of cortical pyramidal cells (Lee et al., 2010; Urban et al., 2012). As in other areas adopting optogenetic approaches, there is a proliferation of new viral vectors and transgenic lines that allow targeting of specific component parts of the overall system (for example, astrocytes, Figueiredo et al., 2011) and as such it seems likely that these approaches will provide important insights into the contributions of elements of the neurovascular "circuit" to hemodynamic and fMRI signal changes. Whilst there are a finite number of cell types which contribute to neurovascular coupling, the number of pharmacological components of the key signaling pathways is much larger and, by way of their interactions, substantially more complex. Combining new optogenetic tools and in particular inhibitory techniques (e.g. via halorhodopsin) with pharmacological manipulations will perhaps provide the most important insights, as this will allow interrogation of cell-specific biochemical pathways. It should also be noted that hemodynamic signal changes, in particular when detected using fMRI, may be vulnerable to localized heating effects caused by laser power, as recently reported by Christie et al. (2012), and as such the use of optogenetic tools in neurovascular research will require careful validation.

A long-standing debate in the literature, which has recently become more prominent and which is likely to benefit from the development of optogenetic tools, concerns the relative contributions of neurons, astrocytes, and pericytes to neurovascular coupling. Whilst the release of neurotransmitters and vasoactive molecules from neurons is well established as a primary factor in initiating vascular responses, how to attribute the subsequent biochemical cascade resulting in locally increased blood flow to specific molecules and cell types remains uncertain. One recent paper provided evidence for stimulus induced vasodilation occurring independently of the astrocyte-located mechanism thought to be critical for initiating the blood flow response (Nizar et al., 2013), yet conflicting results were subsequently reported by Lind et al. (2013). For an in-depth review of the complex role of astrocytes in neurovascular coupling, see Howarth (2014). Two very recent publications are likely to initiate new lines of enquiry in this field. Firstly, Hall et al. (2014) outline a major and previously neglected role for pericytes in highly localized and dynamic control of blood flow in the brain. Secondly, in a review, Hillman (2014) suggests that vascular endothelial cells may represent an additional overlooked mechanism in the regulation of brain blood flow and the generation of neuroimaging signals. The increasingly apparent complexity of neurovascular coupling may suggest a degree of physiological redundancy in the regulation of brain blood flow, an evolutionary consequence of the very limited oxygen available for sustaining the brain's high metabolic demand should blood flow be perturbed. A finding by Leithner et al.
(2010) that 30% of the hemodynamic response to evoked neuronal activity remains even when the major known biochemical pathways mediating neurovascular coupling are simultaneously inhibited by a cocktail of pharmacological agents, and that this reduction had little impact on neuronal activity, certainly seems indicative of a system about which there is much to discover.

These uncertainties are important for the interpretation of neuroimaging signals in a number of ways. Knowing the biochemical or cellular substrates of hemodynamic signal changes, the relative contributions of direct neuron-vascular signaling pathways or signaling mediated via other cell types, will provide basic insights into the information contained within neuroimaging signals about (for example) neural computation or functional connectivity. Furthermore, variation in the contribution or functioning of these pathways between brain regions, disease states, age, and many other conditions could alter the provision of such information, and therefore have implications for how we interpret neuroimaging data in many situations.

In summary of this section, research into neurovascular coupling and the neurophysiological basis of neuroimaging signals is advancing beyond a broad-brush validation of neuroimaging signals as a marker for neuronal activity changes. It is now detailing a wider empirical landscape that will be needed for the interpretation of functional brain imaging signals acquired in increasingly diverse experimental and empirical contexts. The next sections highlight research in these areas that challenges simplistic interpretations of fMRI signals, suggesting that a more nuanced and contextually informed approach to signal analysis and data interpretation is indeed now required.

#### **INTERPRETING FUNCTIONAL BRAIN IMAGING SIGNALS AS NEURONAL ACTIVITY**

In **Figure 6** we summarize the factors which may influence what neuroimaging signals tell us about neuronal activity and which may differ between many conditions including brain regions, subjects, patient groups, and time-points. With respect to the interpretation of functional MRI signals, neuronal activity is usually broadly classified into two types: LFPs or spiking activity. Much research and discussion has focused upon the relative contributions of these two types of activity to BOLD signals. Although it is generally recognized that neuroimaging signals correlate best with LFPs (Logothetis et al., 2001; Viswanathan and Freeman, 2007), spiking activity has also been shown to correlate closely in many (Logothetis et al., 2001; Jones et al., 2004), but not all, contexts so far investigated (Caesar et al., 2003; Thomsen et al., 2004; Rauch et al., 2008). Another question that has been extensively investigated using experimental animals is whether brain hemodynamic responses are linearly or non-linearly related to changes in specific aspects of neuronal activity. Research in which simultaneous neuronal and hemodynamic measures are made in cortex has shown that both linear and non-linear patterns of coupling are found, which may be attributable to a range of factors (Jones et al., 2004, 2008; Sheth et al., 2004; Hewson-Stoate et al., 2005; Martin et al., 2006b; Hoffmeyer et al., 2007; Zhang et al., 2009; Liu et al., 2010; Magri et al., 2011). The issues of linearity and contributions of spiking or synaptic activity become more complex when the relative contributions of different neuronal input pathways (e.g. Enager et al., 2009) and neuron types to neurovascular coupling are considered. For example, it has been shown that both excitatory (glutamatergic) and inhibitory (GABAergic) neurons can evoke positive BOLD signals, whilst activity amongst inhibitory neurons alone can produce negative BOLD signals (Lauritzen et al., 2012).

It is important to note that much of the research conducted to investigate the relative contributions of neuronal activity types to neuroimaging signals has focused almost exclusively on cerebral and, to a lesser extent, cerebellar cortical structures. Human neuroimaging, on the other hand, is applied to study the whole brain, and signals originating from subcortical structures are tacitly interpreted in the same way as cortical signals, despite the relative lack of empirical research to support such an approach. Indeed, the research that has been conducted suggests that neurovascular coupling is brain region dependent (Sloan et al., 2010; Devonshire et al., 2012), a finding that is perhaps unsurprising given we know that brain structures differ substantially in their cytoarchitecture, vascular density, involvement of different neurons and neurotransmitter systems, and other factors (see **Figure 6**). We argue that as human neuroimaging continues to provide surrogate markers for activation across the whole brain, experimental animal studies will be needed to probe the regional heterogeneity of neurovascular coupling and the relative contributions of the various components of neuronal activity. Key questions to ask of specific brain structures include:


#### **RELATING SIGNALS IN TIME AND SPACE**

Detailed investigation of the spatiotemporal evolution of the hemodynamic response in experimental animal models has demonstrated that as it propagates through the capillary network and begins to include larger contributions from upstream or downstream changes in arterioles and venules respectively, the estimation of neuronal signal changes from hemodynamic proxies inevitably becomes more difficult (Hirano et al., 2011; Yu et al., 2012). For instance, a recent study investigated the spatial correlation between neuronal activity and fMRI BOLD responses in the mouse somatosensory cortex using a combination of sensory stimulation and channelrhodopsin-mediated activation (Li et al., 2014). Focal activation of neurons using laser light produced neuronal responses that were tightly confined to the area of stimulation (∼0.5 mm), whereas the hemodynamic responses to the same stimulus extended to a much larger area (>3 mm). Work by Vazquez et al. (2013) indicates a closer spatial correspondence of neuronal and hemodynamic changes where hemodynamics are measured using optical methods. This highlights how the ability of BOLD fMRI to spatially resolve neuronal activity changes is in part limited by biophysical factors such as large-vessel signal contributions to BOLD signals (see Kim and Ogawa, 2012, for a full review). In any case, it is evident that measuring the fine detail of the hemodynamic point-spread function, which may itself depend upon various factors (**Figure 6**), is important for mapping neuronal events.
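The effect of a broad hemodynamic point-spread function on spatial mapping can be illustrated with a minimal one-dimensional sketch. It assumes, purely for illustration, that the hemodynamic profile is the neuronal profile convolved with a Gaussian kernel; the Gaussian shape, the 3 mm kernel width, and all variable names are assumptions chosen to echo the figures quoted above and are not taken from the cited studies.

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a single-peaked profile y(x)."""
    above = x[y >= 0.5 * y.max()]
    return above[-1] - above[0]

# Position along the cortex in mm (hypothetical 1-D cut through the activated site).
x = np.arange(-10.0, 10.0, 0.01)

# Neuronal activity confined to roughly 0.5 mm, as in the focal activation example.
neuronal = (np.abs(x) <= 0.25).astype(float)

# ASSUMPTION: Gaussian hemodynamic point-spread function with a 3 mm FWHM.
psf_fwhm_mm = 3.0
sigma = psf_fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
psf = np.exp(-0.5 * (x / sigma) ** 2)
psf /= psf.sum()  # unit area, so the kernel redistributes rather than scales the signal

# Hemodynamic profile modeled as the neuronal profile blurred by the PSF.
hemodynamic = np.convolve(neuronal, psf, mode="same")

neuronal_width = fwhm(x, neuronal)
hemodynamic_width = fwhm(x, hemodynamic)
print(f"neuronal FWHM: {neuronal_width:.2f} mm, hemodynamic FWHM: {hemodynamic_width:.2f} mm")
```

Under these assumptions the width of the hemodynamic profile is set almost entirely by the kernel rather than by the neuronal source, which is why an accurate estimate of the point-spread function matters more than the nominal voxel size when inferring the spatial extent of neuronal activity.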

Inverted hemodynamic responses corresponding to negative BOLD signal changes have also been studied in detail in animal models using optical imaging techniques. In both awake and anesthetized rats, a center-surround response profile has been observed, where focal increases in cerebral blood flow, volume and oxygenation are accompanied by an inverted response annulus which itself appears to have a neuronal origin (Devor et al., 2007; Boorman et al., 2010; Martin et al., 2012). Although these "negative surround" responses are detectable using high field strength small animal fMRI (Kennerley et al., 2012), it is less likely that these small changes would be readily detectable in human fMRI, even though they may represent an important component of the overall response. The equivalent problem may also exist in the temporal domain. Our own work in awake rodents has revealed that the temporal hemodynamic response function may have additional complexity which is masked by the commonplace use of anesthesia in *in vivo* research studies (see Discussion below and Martin et al., 2013). Specifically, we find evidence for a more dynamic, oscillatory hemodynamic response in cortex, yet once again the limitations of (in this case temporal) resolution in typical human fMRI would make it unlikely that such changes would be detected (Martin et al., 2013). A recent theoretical work supports both of these empirical results, predicting the occurrence of both an adjacent region of negative BOLD response and temporal signal oscillations (Aquino et al., 2014).

Improvements in our understanding of the fine detail of the spatiotemporal hemodynamic response function would have two main benefits. Firstly, by enabling the derivation or estimation of a more accurate spatiotemporal HIRF (e.g. one that accounts for spatiotemporal hemodynamic oscillations, Aquino et al., 2014), we would be better able to estimate the magnitude and spatial extent of the underlying neuronal responses in standard fMRI studies. Secondly, as the spatial and temporal resolution of neuroimaging increases with greater availability of higher field strength magnets and other technological improvements, determining the finer grain detail of the spatiotemporal HIRF becomes more important to ensure parallel improvements in the accuracy of mapping neuronal activity in space and time using hemodynamic proxies. In both cases, it is clear that more research is needed to determine the spatiotemporal correspondence of hemodynamic responses to underlying neuronal activity, as well as how this changes across the brain.
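To make the role of the response function in signal estimation concrete, the following minimal sketch treats the temporal case only. It assumes a linear, time-invariant model in which the measured signal is neuronal activity convolved with a response function, and it uses a generic double-gamma shape as a stand-in; the function `double_gamma_hrf`, its parameters, the noise level, and the stimulus timing are all illustrative assumptions, not the HIRF of any study cited here.

```python
import math
import numpy as np

def double_gamma_hrf(t, a1=6.0, a2=16.0, c=1.0 / 6.0):
    """ASSUMED response shape: a gamma-shaped peak minus a scaled, later undershoot."""
    t = np.asarray(t, dtype=float)
    peak = t ** (a1 - 1) * np.exp(-t) / math.gamma(a1)
    undershoot = t ** (a2 - 1) * np.exp(-t) / math.gamma(a2)
    return peak - c * undershoot

dt = 0.1                                   # sampling interval in seconds
t = np.arange(0.0, 60.0, dt)               # 60 s of simulated recording
hrf = double_gamma_hrf(np.arange(0.0, 32.0, dt))

# Hypothetical neuronal event: 1 s of activity starting at t = 5 s.
neuronal = ((t >= 5.0) & (t < 6.0)).astype(float)

# Forward model: predicted hemodynamic signal = neuronal activity convolved with the HRF.
predicted = np.convolve(neuronal, hrf)[: t.size] * dt

# Simulated measurement with a known response amplitude plus a little noise.
rng = np.random.default_rng(0)
true_amplitude = 2.0
measured = true_amplitude * predicted + rng.normal(0.0, 0.01, t.size)

# Least-squares estimate of the response amplitude given the assumed HRF.
amplitude_hat = float(predicted @ measured / (predicted @ predicted))
peak_latency = float(t[np.argmax(predicted)] - 5.0)
print(f"estimated amplitude: {amplitude_hat:.2f}, peak latency: {peak_latency:.1f} s")
```

The amplitude estimate is only as good as the assumed response shape. If the true response contained oscillatory components that the assumed function lacks, the same regression would return a biased amplitude, which is the practical motivation for characterizing the response function in finer detail.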

#### **QUANTIFICATION, BASELINES, AND NEUROENERGETICS**

A major limiting factor for the interpretation of neuroimaging signals, especially those studies using BOLD fMRI, is the fact that BOLD signal changes are not quantitative. Signal changes are expressed as a percentage of baseline values which are themselves known to be influenced by a wide range of neurophysiological and general physiological factors. Because the BOLD signal is a product of cerebral blood flow, cerebral blood volume, and oxygen consumption (**Figure 1**), baseline changes affecting any or all of these properties will impact upon measured BOLD signals in ways that may not necessarily reflect commensurate changes in neuronal activity. For example, commonly ingested substances such as caffeine can alter BOLD baseline, in part through effects on oxidative metabolism and cerebral blood flow (Mulderink et al., 2002; Griffeth et al., 2011), as can aging, disease, or a range of pharmacological manipulations (e.g. D'esposito et al., 2003; Iannetti and Wise, 2007). In addition, Jones et al. (2008) showed that experimental alterations in baseline neuronal activity intended to emulate switching between different cortical arousal states affected neurovascular coupling in a rodent model. As discussed in section Limitations of animal models for neurovascular research, a major concern for the use of animal models in this field is the effects of anesthetic agents on baseline parameters.
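The baseline dependence of percentage signal change can be shown with simple arithmetic. The sketch below uses hypothetical signal values in arbitrary units and makes the simplifying assumption that the evoked change is identical in absolute terms across the two baseline states; in practice a baseline shift is also likely to alter the evoked change itself.

```python
def percent_signal_change(baseline, activated):
    """Signal change expressed as a percentage of the baseline value."""
    return 100.0 * (activated - baseline) / baseline

# ASSUMPTION: the same absolute evoked increase (arbitrary units) in both states.
evoked_increase = 2.0

# Hypothetical baselines: a reference state and a state with a lowered baseline signal.
reference = percent_signal_change(100.0, 100.0 + evoked_increase)
lowered = percent_signal_change(80.0, 80.0 + evoked_increase)

print(f"reference baseline: {reference:.2f}%  lowered baseline: {lowered:.2f}%")
```

Here an identical absolute response reads as 2% against one baseline and 2.5% against the other, so a between-condition difference in percentage change need not imply a difference in the underlying evoked response.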

One approach to tackling these difficulties has been to develop calibrated fMRI methods for use in humans which are able to provide quantitative measures of metabolic changes in response to task or stimulus conditions (see review by Hoge, 2012). This approach effectively reduces the uncertainty inherent in standard fMRI studies by providing a measurement that is more directly linked to neuronal activity changes, moving the research imperative from neurovascular coupling to neurometabolic coupling and neuroenergetics. Research in experimental animal models is making important contributions in this area and this endeavor is supported by recent evidence that the brain's energy budget, that is the attribution of brain energy consumption to different neuronal processes (Attwell and Laughlin, 2001), is preserved across mammalian species (Hyder et al., 2013). A full review of neurometabolic coupling and neuroenergetics is beyond the scope of the present paper (see Hyder and Rothman, 2012), however an important early finding from studies in rat was that energy use by neurons (oxidative glucose consumption) is linearly correlated to excitatory neuronal activity (glutamate release, Sibson et al., 1998). Understanding the relationship between oxidative glucose consumption and specific components of neuronal activity is therefore an important objective for interpreting quantitative neuroimaging signals. It is partly because of this and the knowledge that the metabolism of neurons and astrocytes is closely related (e.g. Pellerin and Magistretti, 1994; Bélanger et al., 2011), that research to understand the fine detail of neuron-astrocyte communication is so important for understanding functional brain imaging signals. 
Studies in which the activity of neurons and astrocytes can be differentiated and/or specifically manipulated, for example using optogenetic approaches combined with 2PLSM will be significant in this regard (Li et al., 2013a) and combination of these approaches with metabolic readouts, for instance using molecular oxygen sensors (Lecoq et al., 2011) or more macroscopic techniques (e.g. see Devor et al., 2012) will advance understanding considerably.
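As a sketch of how calibrated fMRI converts signal changes into a metabolic estimate, the example below uses the Davis model, a formulation commonly used in the calibrated fMRI literature but not spelled out in the text above. The calibration constant `M`, the exponents `alpha` and `beta`, and the flow and metabolism ratios are illustrative assumptions rather than values drawn from any cited study.

```python
def davis_bold(cbf_ratio, cmro2_ratio, M=0.08, alpha=0.38, beta=1.5):
    """Fractional BOLD change predicted by the Davis model.

    cbf_ratio and cmro2_ratio are activation/baseline ratios.
    ASSUMPTION: M, alpha and beta are illustrative values only.
    """
    return M * (1.0 - cbf_ratio ** (alpha - beta) * cmro2_ratio ** beta)

def cmro2_ratio_from_bold(bold_frac, cbf_ratio, M=0.08, alpha=0.38, beta=1.5):
    """Invert the Davis model to estimate the CMRO2 ratio from BOLD and CBF."""
    return ((1.0 - bold_frac / M) / cbf_ratio ** (alpha - beta)) ** (1.0 / beta)

# Hypothetical activation: 50% flow increase accompanied by a 20% rise in oxygen metabolism.
bold = davis_bold(cbf_ratio=1.5, cmro2_ratio=1.2)

# Given the measured BOLD and flow changes, recover the metabolic change.
recovered = cmro2_ratio_from_bold(bold, cbf_ratio=1.5)
print(f"fractional BOLD change: {bold:.4f}, recovered CMRO2 ratio: {recovered:.3f}")
```

The inversion requires an independent flow measurement and a calibration constant, which is why this approach yields a quantity more directly tied to metabolism than the BOLD percentage change alone, at the cost of additional measurements and modeling assumptions.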

#### **IMPACT OF DISEASE AND MODULATORY NEUROTRANSMITTERS UPON THE INTERPRETATION OF FUNCTIONAL BRAIN IMAGING SIGNALS**

As functional brain imaging is increasingly being applied to investigate brain function in clinical populations (e.g. Diamond et al., 2007; Karmonik et al., 2010; O'brien et al., 2010; Sundermann et al., 2014), it is important to be able to conduct a detailed exploration of the impact of brain diseases upon neuroimaging signals and their relationship to neuronal activity in animal models. This includes the use of both transgenic lines and pharmacologically or surgically induced disease states (Sanganahalli et al., 2013; Serres et al., 2014). Such investigations may become more important as neurovascular breakdown becomes increasingly implicated in a range of disease conditions (Zlokovic, 2010, 2011), suggesting that the empirical basis of assumptions concerning neuroimaging signal interpretation, which have been established through research primarily in healthy animals, may not apply. If neuroimaging techniques are to be used to investigate the effects of therapeutic drugs on brain function biomarkers, it will also be important to delineate the (neuro)vascular and neuronal effects of these drugs. For example, a study in human Alzheimer's patients suggested that acetylcholinesterase inhibitors produced alterations in cortical vascular response to stimuli that were independent of changes in the underlying neuronal response (Rosengarten et al., 2009). In addition, the impact upon these signals of other perturbations of normal brain function, including normal aging (D'esposito et al., 2003; Rosengarten et al., 2003) and the ingestion of substances known to alter neurovascular and/or hemodynamic function such as caffeine (Pelligrino et al., 2010; Diukova et al., 2012) or alcohol (Luchtmann et al., 2013), can also be explored in depth (Meno et al., 2005; Diukova et al., 2012).

Because functional brain imaging signals are the final consequence of neuronal, neurometabolic and hemodynamic events, they are vulnerable to disease-related perturbations operating at a number of levels. Although alterations in neuronal function as a result of pathology are in general likely to be reflected within altered neuroimaging signals, a key question however is to what extent these signal changes also reflect alterations in the normal translation of neuronal events to hemodynamic changes? The use of neuroimaging data to estimate differences in neural signals within subjects, between groups or experimental conditions, or across experimental animals, tacitly assumes preserved or at least non-systematically altered neurovascular coupling. There is however accumulating evidence that many brain diseases feature altered neurovascular coupling and that in estimating neuronal signals using hemodynamic based imaging techniques, these effects must be taken into account.

#### **NEUROIMAGING IN NEURODEGENERATIVE DISEASE**

There is a growing consensus that changes in the function of the neurovascular unit are a critical component of brain disease development and progression (Benarroch, 2007; Zacchigna et al., 2008; Iadecola, 2010; Grammas, 2011; Zlokovic, 2011). In the case of neurodegenerative disease, alterations in neurovascular function have been detected at early stages, preceding the onset of clinical indicators (Bookheimer et al., 2000; Ruitenberg et al., 2005; Knopman and Roberts, 2010; Sheline et al., 2010; Zlokovic, 2011), and it is now well established that cerebrovascular dysfunction is a major risk factor for many brain diseases (Girouard and Iadecola, 2006; Iadecola, 2010; Toledo et al., 2013). A number of studies have identified changes in vascular reactivity, hemodynamic responses or neurovascular coupling in animal models of neurodegenerative disease (Rancillac et al., 2012; Sanganahalli et al., 2013). In relation to this, neuroinflammation has been identified as a key neurodegenerative disease process involving neurovascular unit disruption. Whilst acute neuroinflammatory responses may have a neuroprotective function, chronic inflammatory responses within the central nervous system are associated with neuronal damage and may not only "fan the flames" of many CNS disorders (Frank-Cannon et al., 2009), but may precede and even play a causal role in these diseases (Hauss-Wegrzyniak et al., 1998; Qin et al., 2007; Gao et al., 2011; Cunningham, 2013).

Work carried out in specific animal models of disease has also suggested disease-related alterations in neurovascular function that may affect our ability to interpret fMRI signal changes. In the area of neurodegeneration for instance, a widely used transgenic model is one in which the amyloid-beta precursor protein (APP) is over-expressed, leading to a toxic accumulation of amyloid-beta protein. Such models recapitulate many, but not all (for a review see Balducci and Forloni, 2011), of the major features of human AD, including an age-dependent neurovascular impairment and vascular dysfunction (Lecrux and Hamel, 2011). Work in these mice shows, amongst other things, impaired hemodynamic responses to neuronal activation, altered resting cerebral blood flow and cerebrovascular autoregulation, and impaired metabolic activity (Iadecola et al., 1999; Niwa et al., 2002; Nicolakakis et al., 2008). Additionally, ApoE4 mice model the known association in humans between AD (and cerebral amyloid angiopathy, CAA) and possession of the apolipoprotein E ε4 allele (ApoE4). Expression of ApoE4 has been linked to higher risk of AD with earlier onset (Blacker et al., 1997) and is believed to be related to dysfunctional clearance of amyloid-beta from the brain (Castellano et al., 2011; Hawkes et al., 2012). It has been suggested that the *APOE* ε4 allele may alter neurovascular function through amyloid-beta exerting a time- and concentration-dependent toxic effect on rat microvascular endothelial cells (Folin et al., 2006). In addition, fMRI readouts of brain hemodynamic signals have found that in human ApoE4 carriers, cerebral blood flow, task-related brain activity, and functional connectivity appear altered several decades prior to any clinical indicators of dementia (Filippini et al., 2009, 2011).

In addition to Alzheimer's disease and other neurodegenerative conditions, alterations in neurovascular coupling have also been reported in a number of other diseases and pathological states (Girouard and Iadecola, 2006; Hamilton et al., 2010) including stroke and ischemia (Lin et al., 2011; Baker et al., 2013; Jackman and Iadecola, 2014), hypertension and hypotension, spreading depression (Hamzei et al., 2003; Nagaoka et al., 2006; Del Zoppo, 2010; Ayata, 2013; Fordsmann et al., 2013) as well as in normal aging (D'esposito et al., 2003). There is also accumulating evidence for longer term effects of systemic health challenges upon brain microcirculatory regulation and neurovascular coupling. For example, a recent study demonstrated impairment of neurovascular coupling in animals fed a high fat diet over a period of several weeks (Li et al., 2013b), and systemic infection has been shown to produce alterations in cerebrovascular function (Puntener et al., 2012) and the shape of hemodynamic responses (Couch et al., 2013). Finally, many brain diseases involve alterations in the function of specific neurotransmitter systems, and we refer the reader to the next section for an exploration of the possible effects of such alterations.

Overall, there is substantial scope for disease conditions, and potentially the health status of individuals more generally, to impact upon how we should interpret functional neuroimaging data. By understanding more specifically how these alterations impact upon neurovascular function and the relationships between neuronal and hemodynamic changes, it may be possible to either optimize data analysis strategies to account for these effects or build consideration of these changes into our discussions of neuroimaging findings. Lastly, we speculate that early alterations in neurovascular function associated with many diseases may be detectable through concurrent measurement of neuronal and hemodynamic activity in the brain. As such, these neurovascular changes may provide novel disease biomarkers, measurable using multimodal techniques that are now becoming available for use in humans. We return to this in the last section of this review.

#### **NEUROTRANSMITTER AND NEUROPHARMACOLOGICAL MODULATIONS OF NEUROVASCULAR COUPLING**

Although the established view of the role of specific neurotransmitters in neurovascular coupling processes emphasizes predominantly glutamate and GABA (e.g. Logothetis, 2008), recent evidence has emerged suggesting that other neurotransmitters may also play a role. The serotonin, noradrenaline, dopamine, and acetylcholine systems each feature neurons that project widely throughout cortical and subcortical structures, in addition to forming key elements of certain specific structure-structure connections. A fundamental challenge for the use of hemodynamic imaging methods in contexts where function in these neurotransmitter systems (and relevant structures) is of interest arises from the fact that they are vasoactive (Choi et al., 2006; Hamel, 2006; Martin and Sibson, 2008; Jenkins, 2012; Shih et al., 2012b; Toussay et al., 2013), and therefore able to elicit hemodynamic effects which may not be directly related to their effects upon neuronal activity. We will focus here on the emerging evidence for effects of dopamine and serotonin neurotransmission upon neurovascular coupling and the interpretation of neuroimaging signals, although similar lines of evidence exist for modulation of neuroimaging signals by other neurotransmitters including acetylcholine (Hamel, 2006; Rosengarten et al., 2006; Kocharyan et al., 2008) and noradrenaline (Toussay et al., 2013).

The neurotransmitter dopamine (DA) and the functional anatomy of the dopaminergic system have been intensively studied for many years. This is due in part to the known involvement of this neurotransmitter in a range of cognitive, affective and motor functions in healthy brain as well as a wide range of diseases including Parkinson's disease, schizophrenia, drug addiction, ADHD, and pain disorders, to name a few. Unsurprisingly, there is a rapid proliferation of non-invasive brain imaging studies of both healthy and disease-related brain processes that are associated with alterations in DA function, for instance, fMRI studies of Parkinson's disease (Hacker et al., 2012), risk-taking behavior (Kohno et al., 2013), schizophrenia (Yoon et al., 2013), and reward prediction error (Chowdhury et al., 2013). This literature, which addresses pharmacological, disease- or task-related, or genetic alterations in DA function, rests on the assumption that differences in fMRI signals between conditions or subjects are attributable to effects of the altered DA function on neuronal activity. We suggest that sufficient data to support this assumption do not yet exist. Although arguments have been made that fMRI signals may provide an adequate biomarker of DA release (Knutson and Gibbs, 2007), this does not address the issue of how to interpret stimulus/task evoked changes in neuronal activity in the context of altered DA function. In addition, we know that the long established vasoactive properties of dopamine are capable of producing complex effects upon the relationship between DA signaling, neurovascular coupling and fMRI responses in subcortical structures (Devonshire et al., 2004; Choi et al., 2006; Shih et al., 2009; Jenkins, 2012; Mandeville et al., 2013).

Because fMRI is non-quantitative, alterations in baseline parameters including metabolism, cerebral blood flow, and oxygenation render signal interpretation vulnerable to the effects of such changes upon both evoked responses and resting-state measurements. As it has been shown that the magnitude of stimulus-evoked BOLD responses is dependent upon such baseline parameters (Shulman et al., 2007; Lu et al., 2008), the effects of changes in dopaminergic neurotransmission upon local baseline conditions such as cerebral blood flow and brain metabolism are an important potential confound. In relation to this, direct regulation of brain microvasculature tone by dopamine has been demonstrated (Krimer et al., 1998; Choi et al., 2006; Kowianski et al., 2013) and such vasomotor regulation has the capacity to modulate measured fMRI signal changes (Tian et al., 2010). Because these effects occur independently of neurogenic neurovascular coupling, they potentially confound the interpretation of fMRI data. For instance, Arthurs et al. (2004) measured the effects of the selective dopamine D2 receptor antagonist sulpiride on electrophysiological and fMRI responses in human subjects and, using path analysis, estimated that 84% of the drug effect on fMRI responses to stimulation occurred via direct effects upon hemodynamics, rather than neuronal responses themselves.

fMRI is also widely used to investigate the role of 5-HT in normal function, disease processes, and therapeutic drug effects in a wide range of human and animal models (Hariri et al., 2002; Hariri and Weinberger, 2003; Del-Ben et al., 2005, 2008; Mckie et al., 2005; Stark et al., 2006, 2008; Rao et al., 2007; Tanaka et al., 2007; Graeff and Del-Ben, 2008; Munafo et al., 2008). Within human fMRI studies, responses in structures receiving serotonergic input appear to be profoundly influenced by synaptic 5-HT, with up to 42% of fMRI response variability attributable to availability of the 5-HT transporter (Rhodes et al., 2007). Windischberger et al. (2010) report a powerful modulation of BOLD responses to stimuli by selective serotonin reuptake inhibitors (SSRIs) that was specific to regions receiving dense serotonergic projections. However, many research studies point to a complex role for serotonin in neurovascular coupling (Cauli et al., 2004; Hamel, 2006), in vascular reactivity (Bonvento et al., 1997), and in the interpretation of BOLD fMRI signals in terms of spiking or synaptic activity (Rauch et al., 2008). Additionally, the well-established influence of ascending serotonergic projections on the cerebral vasculature and microvascular tone (Toda and Fujita, 1973; Dieguez et al., 1981; Cohen et al., 1996, 1999) provides strong potential for modulation of vascular compliance and therefore BOLD signal dynamics (Behzadi and Liu, 2005; Boas et al., 2008). This severely complicates (a) the interpretation of neuroimaging signals from structures receiving serotonergic input, and (b) comparisons of task-evoked responses between conditions involving altered 5-HT function.

# **LIMITATIONS OF ANIMAL MODELS FOR NEUROVASCULAR RESEARCH**

Although work in animal models is essential to advance the capabilities of hemodynamic neuroimaging methods to estimate neuronal activity in humans, there are a number of important limitations associated with their use. A major concern is the use of anesthesia in the vast majority of animal research carried out in this field. Anesthesia is used for two main purposes. Firstly, as many of the techniques used to study neuronal, neurovascular and hemodynamic processes are invasive, anesthesia is necessary in order to prevent suffering associated with (for instance) the placement of recording electrodes. Secondly, anesthesia is often used in order to prevent movement of the animal during data collection. It is possible to conduct both imaging (e.g. using fMRI or optical techniques) and electrophysiological experiments in unanesthetized animals, and indeed this approach has been taken by a number of laboratories (see below). For small animal imaging in particular, therefore, anesthesia is frequently used chiefly to prevent movement, avoiding the need for restraint and/or lengthy animal training protocols. Because all general anesthetics have at least moderate effects upon normal physiological regulation (for example blood pressure, blood oxygenation, thermoregulation), it is usually necessary to perform additional invasive procedures (even if the imaging technique is itself non-invasive) in order to enable normal physiology to be restored, monitored and maintained.

Using anesthesia in animal research models has important consequences for translation of findings from *in vivo* experimental research to the human neuroimaging arena (where subjects are rarely imaged under anesthesia). Previous research in our laboratory and elsewhere has indicated that anesthetic agents disrupt neurovascular coupling in a number of ways (Lahti et al., 1999; Nakao et al., 2001; Brevard et al., 2003; Sicard et al., 2003; Martin et al., 2006b, 2012; Luo et al., 2007; Tsurugizawa et al., 2010; Fukuda et al., 2013), including via alterations in baseline hemodynamic parameters (Shulman et al., 1999; Hyder et al., 2002). A recent study comparing the effects of four different anesthetics on BOLD fMRI responses in mice found that response differences could largely be explained by differing systemic effects of the stimuli attributable to the different anesthetic conditions (Schroeter et al., 2014).

A relatively consistent finding is that anesthetics significantly delay the hemodynamic response function (Martin et al., 2006b; Huttunen et al., 2008; Franceschini et al., 2010). Although a recent study comparing fMRI BOLD responses in awake and anesthetized marmosets found the opposite effect (more protracted responses in awake animals, Liu et al., 2013), the authors suggest this was in turn the result of the much larger responses observed and consequent increases in venous drainage time. A further complication is that different anesthetic agents disrupt neurovascular coupling in different ways. This may be particularly problematic for pharmacological neuroimaging studies. For example, Du et al. (2009) demonstrated anesthetic-dependent effects upon hemodynamic changes induced by cocaine, including alterations in the coupling between different hemodynamic measures. A study investigating the role of nitric oxide in neurovascular coupling found differences between awake and anesthetized conditions that were in turn attributable to the deleterious effects of the anesthetic agent upon the magnitude of stimulus-evoked cerebral blood flow changes (Nakao et al., 2001). For a more in-depth discussion of the effects of anesthesia in studies of neurovascular coupling, the reader is directed to a recent review paper (Masamoto and Kanno, 2012).

Much human fMRI engages subjects in complex cognitive tasks which involve activity within, and communication between, networked structures. Models of many relevant cognitive and affective processes such as learning, memory and attention have been established in awake behaving animals, as well as capabilities to study disease-relevant alterations in these functions. An additional problem with the use of anesthesia in animal models, therefore, is that it limits the possibilities for investigating neurovascular function, and therefore improving our understanding of neuroimaging signals, in the context of these "higher-order" brain functions. Although a number of laboratories have developed procedures for conducting fMRI or other hemodynamic measurement procedures in awake animals (Martin et al., 2002, 2013; Brevard et al., 2003; Sicard et al., 2003; Chin et al., 2011; Desai et al., 2011; Brydges et al., 2013; Liu et al., 2013; Pisauro et al., 2013; Takuwa et al., 2013), and a few studies also report on the neuronal signals underlying the hemodynamic and/or fMRI responses (Lipton et al., 2006; Martin et al., 2006b; Goense and Logothetis, 2008; Sirotin and Das, 2009; Desai et al., 2011; Liu et al., 2013), this work continues to focus almost exclusively on cortical structures. In addition, the effects of stress, especially where animals are restrained, must be carefully taken into account as this is likely to have a range of general physiological and neurophysiological effects. This is particularly the case where animal models of neuropsychiatric illness or related drug treatments are being studied. It will also be a challenge to establish animal models for cognitive or behavioral neuroscience in the context of the apparatus generally required for investigation of neuroimaging signals and neurovascular coupling (as outlined in section Investigating neuroimaging signals using experimental animal models and **Table 1**), where restraint is required. 
To reduce restraint stress and allow more sophisticated experimental designs, methods for fixing experimental animals with respect to data acquisition apparatus but permitting engagement in cognitive tasks have recently been reported (e.g. Dombeck et al., 2007), and one study reports a method whereby rats voluntarily engage with head-restraint apparatus to allow functional brain imaging (using 2PLSM).

#### **FUTURE RESEARCH CHALLENGES AND POSSIBILITIES**

We suggest that a key contribution of future research studies in animal models will be the investigation of neuronal-hemodynamic-neuroimaging signal relationships in subcortical structures, using both awake and anesthetized animals. To optimize the translation of neurovascular coupling data from animal studies to improving neural signal estimation in human fMRI, it will also be important to study neurovascular coupling in the context of information processing or in behavioral paradigms that more closely reflect research designs used in human subjects. A major challenge in this respect will be the use of restraint in most of the current awake animal studies. A small number of laboratories have already successfully investigated affective or cognitive processes using fMRI in awake rats (e.g. see review by Ferris, 2014; also Brydges et al., 2013; Zhao et al., 2014a) and the combination of these approaches with either concurrent measurement of neuronal data, or careful synthesis of new hemodynamic data with existing data regarding underlying neuronal activity, will provide important insights. Methods have also been recently developed for use in rodents that utilize virtual reality technology in order to provide head-fixed animals with a pseudo-environment with which they can interact (Harvey et al., 2009; Scott et al., 2013). Adaptations of these techniques may further provide key insights into the neuronal events that fMRI signals report on whilst the brain is engaged in relatively complex tasks.

Technological developments are increasing the possibilities to investigate the neuronal basis of hemodynamic neuroimaging signals directly in humans. For example, combining EEG with fMRI (EEG-fMRI) is an approach that has primarily been deployed to marry the relatively high spatial resolution of fMRI with the high temporal resolution of EEG for cognitive neuroscience research purposes (Huster et al., 2012; Laufs, 2012). This approach may also provide insights into neurovascular coupling in human subjects (Wan et al., 2006; Diukova et al., 2012; Huster et al., 2012; Mayhew et al., 2013; Mullinger et al., 2013), although it will pose a number of challenges because the spatial and temporal domains from which each technique samples data are very different. Other approaches include simultaneous near infra-red spectroscopy (NIRS) and EEG (Moosmann et al., 2003), combined MEG and NIRS (e.g. Mackert et al., 2004) or diffuse optical imaging combined with MEG (Ou et al., 2009). Most recently, Fabiani et al. (2014) report on a combined optical spectroscopy, event-related potential and fMRI study of change in neurovascular function in normal aging.

#### **SUMMARY**

As non-invasive brain imaging techniques are used to address an increasingly diverse range of questions relevant to both brain function and dysfunction, it will become more important that our understanding of the neurophysiological basis of these signals is specific to the brain structure, disease context, pharmacological, or task-related effects under investigation. Fortunately a wide range of experimental tools, including combinations of methods that optimize our ability to probe neuroimaging and neuronal signal relationships, are now available in experimental animals. It will be important that these approaches continue to develop to enable neurovascular coupling to be probed in contexts established in behavioral neuroscience. Simultaneously, technical advances in human research studies that allow neurovascular coupling to be probed directly will help detail a more comprehensive empirical framework for estimating neural signals in the human brain from hemodynamic proxies.

#### **ACKNOWLEDGMENTS**

Chris Martin is a Royal Society University Research Fellow at the University of Sheffield. I would like to thank the Royal Society and the Wellcome Trust (Research Project Grant: WT093223AIA) for financial support.

#### **REFERENCES**


risk for Alzheimer's disease. *N. Engl. J. Med.* 343, 450–456. doi: 10.1056/NEJM200008173430701


mammalian species and activity levels. *Proc. Natl. Acad. Sci. U.S.A.* 110, 3549–3554. doi: 10.1073/pnas.1214912110


BOLD responses in the rat dentate gyrus. *J. Cereb. Blood Flow Metab.* 32, 291–305. doi: 10.1038/jcbfm.2011.126


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 March 2014; accepted: 01 July 2014; published online: 19 August 2014. Citation: Martin C (2014) Contributions and complexities from the use of in vivo animal models to improve understanding of human neuroimaging signals. Front. Neurosci. 8:211. doi: 10.3389/fnins.2014.00211*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Martin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The contribution of astrocytes to the regulation of cerebral blood flow

# *Clare Howarth\**

*Department of Psychology, University of Sheffield, Sheffield, UK*

#### *Edited by:*

*Lora T. Likova, The Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*Wei Chen, University of Minnesota, USA*

*Gabor Petzold, German Center for Neurodegenerative Diseases, Germany*

#### *\*Correspondence:*

*Clare Howarth, Department of Psychology, University of Sheffield, Western Bank, Sheffield, S.Yorkshire, S10 2TP, UK e-mail: c.howarth@sheffield.ac.uk*

In order to maintain normal brain function, it is critical that cerebral blood flow (CBF) is matched to neuronal metabolic needs. Accordingly, blood flow is increased to areas where neurons are more active (a response termed functional hyperemia). The tight relationships between neuronal activation, glial cell activity, cerebral energy metabolism, and the cerebral vasculature, known as neurometabolic and neurovascular coupling, underpin functional MRI (fMRI) signals but are incompletely understood. As functional imaging techniques, particularly BOLD fMRI, become more widely used, their utility hinges on our ability to accurately and reliably interpret the findings. A growing body of data demonstrates that astrocytes can serve as a "bridge," relaying information on the level of neural activity to blood vessels in order to coordinate oxygen and glucose delivery with the energy demands of the tissue. It is widely assumed that calcium-dependent release of vasoactive substances by astrocytes results in arteriole dilation and the increased blood flow which accompanies neuronal activity. However, the signaling molecules responsible for this communication between astrocytes and blood vessels are yet to be definitively confirmed. Indeed, there is controversy over whether activity-induced changes in astrocyte calcium are widespread and fast enough to elicit such functional hyperemia responses. In this review, I will summarize the evidence which has convincingly demonstrated that astrocytes are able to modify the diameter of cerebral arterioles. I will discuss the prevalence, presence, and timing of stimulus-induced astrocyte calcium transients and describe the evidence for and against the role of calcium-dependent formation and release of vasoactive substances by astrocytes. I will also review alternative mechanisms of astrocyte-evoked changes in arteriole diameter and consider the questions which remain to be answered in this exciting area of research.

**Keywords: astrocyte, neurovascular coupling, cerebral blood flow, calcium, functional hyperemia**

#### **INTRODUCTION**

For normal functioning of the brain to be maintained it is critical that increases in neuronal energy demands are met by changes in local blood flow with high temporal and spatial resolution. This necessitates close connections between neurons, glia, and the energy metabolism and blood supply of the brain. Increased neuronal activity is accompanied by an increase in local cerebral blood flow (CBF), a phenomenon termed functional hyperemia. It is this increase in CBF and oxygenation which underlies BOLD functional MRI (fMRI). BOLD fMRI is commonly used as a surrogate measure of neural activity. A valid interpretation of such data requires a thorough understanding of the cellular basis of the BOLD signal. While a coupling between cerebral energy consumption and neuronal activity was originally suggested over a century ago (Roy and Sherrington, 1890), the exact relationship remains an active area of research. Although neuronal activity-induced increases in blood flow are due, at least in part, to the direct action of neurons [via glutamate-evoked release of nitric oxide (NO)] on arteriole smooth muscle (Fergus and Lee, 1997), over the past decade there has been extensive research (Zonta et al., 2003; Mulligan and MacVicar, 2004; Filosa et al., 2006; Takano et al., 2006) determining the role which astrocytes, and activity-induced Ca2+ signals within astrocytes, may play (as discussed in recent reviews by Attwell et al., 2010; Petzold and Murthy, 2011).

Being situated in the synaptic cleft and having multiple endfeet which are apposed to smooth muscle cells (**Figure 1A**), astrocytes can act as a "bridge," relaying information about changes in synaptic activity between neurons and the vasculature, ensuring that neuronal energy demands are met.

#### **INITIAL** *IN VITRO* **EVIDENCE DEMONSTRATED THAT ASTROCYTES CAN REGULATE ARTERIOLE DIAMETER**

Initial studies revealing a potential role of astrocytes in neurovascular coupling were performed *in vitro* using acute brain slices and whole mount retina. This *in vitro* research has resulted in convincing evidence that astrocytes are able to control vascular diameter (**Figure 1B**). During neuronal activity, glutamate is released and acts via neuronal NMDA receptors to activate neuronal nitric oxide synthase (nNOS), resulting in the release of NO. NO acts on smooth muscle cells, increasing blood flow via a cGMP pathway (Fergus and Lee, 1997). However, in addition to triggering neuronal NO-evoked effects on the vasculature, neuronally released glutamate can act on astrocyte metabotropic glutamate receptors (mGluR), raising astrocyte [Ca2+]i (Zonta et al., 2003; Takano et al., 2006). Over a decade ago, observations of astrocyte soma and endfeet [Ca2+]i signals which were well-timed with vessel diameter changes in response to mGluR activation were the first evidence that astrocytes may contribute to neurovascular coupling (Zonta et al., 2003). This work implicated cyclooxygenase enzymes (COX) in the downstream signaling pathway leading from increased astrocyte [Ca2+]i to vessel dilation. An increase in astrocytic [Ca2+]i can result in the production of arachidonic acid (AA) via phospholipase A2 (PLA2), a Ca2+-sensitive enzyme highly expressed in astrocytes (Farooqui et al., 1997; Cahoy et al., 2008). AA is subsequently metabolized to COX and cytochrome P450 epoxygenase derivatives [prostaglandin E2 (PgE2) and epoxyeicosatrienoic acids (EETs), respectively]. These vasoactive metabolites can be released from the astrocyte endfeet, apposed to arterioles, resulting in activation of smooth muscle K+ channels and vasodilation [although see Dabertrand et al. (2013), who suggest that PgE2 may constrict, rather than dilate, isolated parenchymal arterioles].

In addition to AA being metabolized within the astrocyte, it can diffuse to arteriole smooth muscle, producing the vasoconstrictor 20-HETE via ω-hydroxylases (Roman, 2002). Shortly after the demonstration that astrocyte [Ca2+]i increases were closely linked to vasodilations, two-photon photolysis of caged calcium directly within the somata of astrocytes was used to trigger a [Ca2+]i transient within the astrocyte, which evoked vasoconstriction (Mulligan and MacVicar, 2004). Pharmacology experiments revealed the importance of PLA2 and it was proposed that 20-HETE, a vasoconstrictor, was generated from AA, which was formed in the astrocytes. 20-HETE inhibits smooth muscle K+ conductances to depolarize and contract smooth muscle cells (Lange et al., 1997). Thus, astrocyte [Ca2+]i entry can trigger either vasodilation (Zonta et al., 2003; Filosa et al., 2004) or vasoconstriction (Mulligan and MacVicar, 2004) depending on which signaling pathway dominates (**Figure 2**).

The retina is an ideal system in which to study blood flow regulation in response to local signals as its low density of blood vessels requires the ability to efficiently match the local blood supply to local neuronal metabolic needs (Funk, 1997). The observation that glial [Ca2+]i transients were closely correlated in time with changes in arteriole diameter was extended to the case of the retina where both vasodilations and constrictions were reported to be evoked by either physiological light stimulation or uncaging of Ca2+ in Müller cells (Metea and Newman, 2006). In agreement with the previous findings in hippocampal slices (Mulligan and MacVicar, 2004), 20-HETE was implicated as the vasoconstrictor molecule in the retina. However, in contrast to findings in cortical slices (Zonta et al., 2003), the data suggested that conversion of AA to EETs, rather than to PgE2, caused arteriole dilations in the retina. The hunt was on to find the variable which selects a dilatory response over a constrictive one and vice versa.

While *in vitro* studies have several advantages, including the ability to control various cellular elements, there are technical limitations to this approach which are worth noting. A lack of myogenic tone, due to a lack of perfusion and intraluminal pressure (Iadecola and Nedergaard, 2007), can result in vessels being maximally dilated. To compensate for this loss of tone, in many studies, slices are pre-treated with a vasoconstrictor (Zonta et al., 2003; Filosa et al., 2004; Metea and Newman, 2006). However, preconstriction has been shown to alter the direction of arteriolar responses (Mulligan and MacVicar, 2004). Furthermore, many experiments are carried out at non-physiological temperatures, e.g., with brain slices maintained at room temperature (Mulligan and MacVicar, 2004; Gordon et al., 2008).

**FIGURE 2 | Astrocyte calcium-dependent vasoactive signaling pathways.** Neuronally released glutamate can act on astrocyte mGluRs, activating PLC, and increasing astrocyte [Ca2+]i, activating PLA2 resulting in the release of AA from the plasma membrane. AA can be metabolized within the astrocyte to form PgE2 or EETs which are released and act on smooth muscle cells, evoking vasodilation. Alternatively, AA can be released and act on smooth muscle cells where it is metabolized to the vasoconstrictor 20-HETE. ATP can activate Ca2+-mediated downstream vasoactive pathways either by acting on P2Y receptors and activating PLC or via P2X7 receptors, increasing [Ca2+]i. An alternative vasoactive pathway downstream of the [Ca2+]i increase is the activation of BKCa channels and subsequent efflux of the vasodilator K+.

#### **HOW IS THE DIRECTION OF ARTERIOLE DIAMETER CHANGE DETERMINED?**

NO, which can bind to the heme moiety and inactivate cytochrome P450 enzymes (Fleming, 2001; Roman, 2002), was suggested to determine the direction of retinal arteriole diameter change (Metea and Newman, 2006). While in the brain neural activity and the resulting NO production has been shown to correspond to increases in blood flow (Akgoren et al., 1994), in the retina the occurrence of vasoconstrictions dominated as NO levels increased (Metea and Newman, 2006). This finding was in agreement with pharmacological inhibition of NO synthase, which converted astrocyte-evoked vasoconstrictions to vasodilations in brain slices (Mulligan and MacVicar, 2004). A possible explanation for this observation is that preconstriction of vessels by L-NAME, which was used to inhibit NO synthase, increases the basal tone of vessels and, hence, may predispose them to dilate to other factors (Blanco et al., 2008). Many of the enzymes suggested to be responsible for signaling downstream of the increase of astrocyte [Ca2+]i are sensitive to NO (e.g., CYP4A which produces 20-HETE) (Fleming, 2001; Roman, 2002) suggesting that a complex relationship may exist between NO levels and neurovascular coupling signaling pathways. Differing basal NO levels may exist in different preparations, hence pathways may be inhibited to varying degrees. This may explain why some groups reported only constrictions (Mulligan and MacVicar, 2004) while others reported constrictions and dilations (Metea and Newman, 2006).

Metabolic factors, such as partial pressure of oxygen (pO2) (Offenhauser et al., 2005) and the extracellular lactate concentration (Hu and Wilson, 1997) change rapidly within the parenchyma during neural activity. Gordon et al. (2008) performed experiments in acute brain slices and proposed that such metabolic factors may play a role in determining the direction of arteriole diameter changes. The level of oxygen present in the aCSF (artificial CSF) used in these experiments was found to determine the direction of arteriole diameter change in response to uncaging calcium within the soma of astrocytes (Gordon et al., 2008). At higher levels of O2 (aCSF bubbled with 95% O2 and 5% CO2, typical of acute brain slice experiments), vasoconstrictions were triggered, while at lower O2 levels vasodilations dominated (**Figure 1B**). The lower O2 level (aCSF bubbled with 20% O2), resulted in a pO2 which mimics the lower end of physiological measurements *in vivo* (Offenhauser et al., 2005). At the lower oxygen levels used, both lactate and adenosine levels were increased compared to under conditions of higher O2 and vasodilation was proposed to be dominant due to two mechanisms. Firstly, as uptake of PgE2 by the prostaglandin transporter is inhibited by extracellular lactate (Chan et al., 2002), there is an accumulation of extracellular PgE2 following [Ca2+]i-evoked PgE2 release by astrocytes, thus facilitating the vasodilatory response. Secondly, the increased levels of adenosine were proposed to be acting on A2A receptors on the smooth muscle itself, blocking Ca2<sup>+</sup> channels (Murphy et al., 2003) and preventing vasoconstriction. In agreement with these findings, in *ex vivo* retina, the incidence of light-evoked vasoconstrictions was lower in 21% O2 compared to 100% O2. Additionally, at the lower oxygen level, a PgE2 component of vasodilation became salient (Mishra et al., 2011). Whether such a mechanism plays a functional role *in vivo* remains to be proven. 
Although changing tissue pO2 by breathing high or low oxygen has been shown to change basal CBF and arteriole diameter in the direction predicted by *in vitro* experiments (McCalden et al., 1984; Mishra et al., 2011), hyperoxia had no effect on light-evoked dilations or flow in the retina *in vivo* (Mishra et al., 2011). Furthermore, an increased tissue pO2 failed to alter the functional hyperemia response to sensory stimulation (Lindauer et al., 2010). Lin et al. (2010) recently published human NMR spectroscopy studies showing that CBF increases were positively correlated with lactate production while being negatively correlated with the percentage change in oxygen consumption (CMRO2). These findings suggest that task-induced CBF responses are mediated by factors other than the demand for oxygen. In order to test the *in vivo* relevance of the findings of Gordon et al. (2008), it may be more appropriate to test the end effectors predicted by their experiments, i.e., lactate and adenosine.

#### **ALTERNATIVE MECHANISMS OF ASTROCYTE CONTROL OF CBF**

In addition to the mGluR-evoked mechanisms of CBF regulation, there is evidence for a further glutamate-dependent pathway. In the olfactory bulb, intrinsic optical signal (IOS) changes (used as a proxy for CBF measurements) in response to odor stimulation were found to be unaffected by blocking either AMPA/NMDA receptors or mGluRs (Gurden et al., 2006). However, the increase in CBF was reduced when glial glutamate transporters were blocked. This work was expanded by Schummers et al. (2008) who demonstrated that, in visual cortex, the astrocytic [Ca2+]i signal and the change in IOS in response to a visual stimulus were significantly reduced when glial glutamate transporters were blocked. Furthermore, blocking glial glutamate transporters reduced odor-evoked increases in both erythrocyte velocity and flux in the olfactory bulb [even after controlling for potentially higher receptor activity after transporter blockade (Petzold et al., 2008)]. In contrast to experiments in the visual cortex (Schummers et al., 2008), however, Petzold et al. (2008) observed no significant change of the calcium response in astrocyte somata when blocking glial glutamate uptake. While further experimentation is needed to resolve the signaling molecules which underlie this mechanism of CBF control, these data suggest that calcium-independent vasodilatory pathways may exist. Indeed, IP3-independent stimulation-induced vasodilation has recently been observed in the cortex of IP3 knockout mice (Nizar et al., 2013). The role of astrocyte Ca2+ signaling in the regulation of CBF is currently hotly debated and will be discussed later in this review.

In contrast to brain slices, glutamate is largely ineffective in evoking glial [Ca2+]i increases in the retina. In retina, neuron-to-glia signaling, and resulting vasoactivity, is mediated by neuronal release of ATP and activation of purinergic P2Y receptors (Newman, 2005; Metea and Newman, 2006). Activation of P2Y receptors (which are highly expressed in astrocyte endfeet: Simard et al., 2003) activates phospholipase C (PLC) and the downstream calcium-dependent signaling pathways discussed above (**Figure 2**). ATP can also act on glial P2X7 receptors, resulting in an increase in astrocyte [Ca2+]i (Carrasquero et al., 2009; Habbas et al., 2011) and triggering the formation and release of vasoactive substances (**Figure 2**). In addition to neuronally released ATP, calcium-dependent ATP exocytosis by glial cells may occur (Pangrsic et al., 2007; Blum et al., 2008). ATP which is released into the extracellular space is rapidly hydrolyzed to form adenosine (Xu and Pelligrino, 2007) which has been shown to be vasodilatory in both the cerebral cortex and cerebellum, and is thought to be involved in functional hyperemia *in vivo* (Dirnagl et al., 1994; Akgoren et al., 1997; Shi et al., 2008).

Increases in extracellular concentrations of K+ cause vasodilation in cerebral arterioles (Kuschinsky and Wahl, 1978). Although the original hypothesis of "astrocyte K+ siphoning" (Paulson and Newman, 1987) has been disproved (Metea et al., 2007), a calcium-dependent mechanism by which astrocytes may contribute to the regulation of CBF via K+ has been demonstrated (Filosa et al., 2006). BKCa channels in astrocyte endfeet were shown to be activated following neuronal activity-evoked increases in astrocytic [Ca2+]i via mGluR activation. The resulting local increase in extracellular K+ activated Kir channels (Kir2.1) on the smooth muscle cell, hyperpolarizing the cell and leading to vasodilation. This work is consistent with *in vivo* studies inhibiting BKCa channels (Gerrits et al., 2002) and Kir channels (Leithner et al., 2010), both of which were found to result in an attenuation of the CBF increase evoked by somatosensory activation. However, as glial membrane potentials are close to the equilibrium potential for K+ (Kuffler et al., 1966), increasing K+ conductance may not result in an increased net efflux of K+. Furthermore, as the contribution of endfeet K+ efflux (via glial Kir4.1 channels) has been disproved in the retina (Metea et al., 2007), its role in the cortex needs to be verified.
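
The caveat about the K+ equilibrium potential can be made explicit with a standard conductance-based description. This is a textbook sketch rather than a formulation taken from the cited studies, and the symbols $I_K$ (net K+ current), $g_K$ (K+ conductance), $V_m$ (glial membrane potential), and $E_K$ (K+ equilibrium potential) are introduced here purely for illustration:

$$
I_K = g_K \,(V_m - E_K), \qquad E_K = \frac{RT}{F}\,\ln\frac{[\mathrm{K}^+]_o}{[\mathrm{K}^+]_i}
$$

If $V_m \approx E_K$, the driving force $(V_m - E_K)$ is close to zero, so raising $g_K$ (for example, by opening BKCa channels) changes $I_K$ very little. Under this simple description, a substantial net K+ efflux would require $V_m$ to be depolarized relative to $E_K$.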

#### **DO ASTROCYTES PLAY A ROLE IN THE REGULATION OF CBF** *IN VIVO***?**

Several experimental models have been used to investigate the role of astrocytes in the regulation of CBF *in vivo* including: uncaging of Ca2+ within astrocytes, somatosensory stimulation, pharmacological inhibition, and genetic deletion.

When Ca2+ was uncaged within astrocyte endfeet, triggering an increase in astrocyte [Ca2+]i, dilation of an adjacent arteriole was observed (**Figures 1C,D**) (Takano et al., 2006). In agreement with the suggestion that AA conversion to PgE2 underlies the dilation, inhibition of COX-1 but not COX-2 enzymes blocked the vasodilations. However, controversy remains regarding the role of COX-1 in neural activity-evoked vasodilation. While COX-1 inhibition (with a high dose of SC560) can inhibit the CBF response to odorant stimulation in the olfactory bulb (Petzold et al., 2008) or uncaging of Ca2+ in astrocytes in the cortex (Takano et al., 2006), lower doses of SC560 have no effect on the CBF response to whisker stimulation (Niwa et al., 2001; Lecrux et al., 2011; Liu et al., 2012). Furthermore, genetic deletion of COX-1 had no effect on functional hyperemia (Niwa et al., 2001). In contrast, pharmacological inhibition or genetic knockout of COX-2 attenuates the CBF response to neuronal activation (Niwa et al., 2000). As COX-2 is more highly expressed in neurons than astrocytes, these data have led to the suggestion that neuronal COX activity may underlie the component of functional hyperemia which is mediated by COX products. Recent data suggest that photolysis of caged Ca2+ might artifactually produce vasodilation via glutamate-permeable anion channels. Activation of these channels (either by calcium or astrocytic volume changes following photolysis) leads to glutamate release and an mGluR-mediated increase in mEPSC frequency. Photolysis-induced astrocytic glutamate release activates neuronal mGluRs and NMDA/AMPA receptors resulting in K+ efflux and neuronal depolarization (and potentially smooth muscle cell hyperpolarization due to increased extracellular K+) (Wang et al., 2013). This effect may explain the differing effects of COX-1 inhibition on sensory stimulus-mediated vasodilation vs. photolysis-mediated vasodilation.
In addition, regional heterogeneity of COX-1 expression (as has been found for nNOS) may offer a further explanation for the differing effects of COX-1 inhibition which have been observed.

Although some groups have used sensory stimuli to investigate the signaling pathways underlying astrocyte-mediated CBF changes (e.g., Zonta et al., 2003; Petzold et al., 2008), much of the evidence for astrocytic mGluR-mediated vasodilations is based on *in vitro* work using tissue from juvenile rodents (e.g., Zonta et al., 2003; Mulligan and MacVicar, 2004). A role for mGluR-mediated vasodilations in adult rodents remains contentious. Recent research has suggested that expression levels of mGluR5 alter with development, being undetectable beyond postnatal week 3 (Sun et al., 2013). In agreement with this finding, Calcinaghi et al. (2011), using a highly specific mGluR5 blocker, found no evidence for a role of mGluR5 in the onset or maintenance of CBF increases in the whisker barrel of adult anesthetized rats in response to brief whisker stimulation. Furthermore, blockade of mGluRs in the olfactory bulb had no effect on the hemodynamic response to odor stimulation (Gurden et al., 2006). However, in contradiction to these results, mGluR5-antagonist sensitive sensory stimulation-evoked astrocyte [Ca2+]i transients in the barrel cortex of adult mice have been reported (Wang et al., 2006; Lind et al., 2013). In the olfactory bulb, Petzold et al. (2008) reported that the mGluR5 antagonist, MPEP, decreased vasodilations, supporting the idea that functional hyperemia is mediated, at least in part, by mGluR5, which, within the glomerular layer, is expressed exclusively in astrocytes. Vasodilations were also reduced by inhibiting COX-1, suggesting that the functional hyperemia mediated by astrocytic mGluR5 depends on COX-1 activity. It remains unclear, therefore, under what conditions mGluR5 plays a role in neurovascular coupling.

Several additional factors may explain the discrepancies observed in different studies. Regional differences in expression of mGluR5 and/or the importance of mGluR-mediated signaling for the regulation of CBF may exist (MPEP reduces fMRI responses to hindpaw stimulation in rat primary cortex by only 18%, compared to 66% in striatum: Sloan et al., 2010). mGluR5 may be upregulated in reactive astrocytes (Aronica et al., 2000), suggesting that the role of astrocytic mGluR5 in neurovascular coupling may be associated more with non-physiological conditions. The recruitment of astrocyte calcium-mediated vasodilation may depend upon the frequency of stimulation used. Wang et al. (2006) demonstrated that astrocyte calcium signals in the barrel cortex of mice were a function of frequency, with signals rarely evoked by a 1 Hz whisker stimulation and peaking in response to 5 Hz stimulation (although this may only occur in the anesthetized state, see Thrane et al., 2012). Furthermore, recent imaging of neuronal and astrocytic calcium signals in the rat somatosensory cortex has shown that high frequency activation of the forepaw (a 10 Hz but not a 1 Hz stimulus) leads to a late component of vasodilation that is correlated with increased astrocyte calcium and increased CBF as measured by fMRI BOLD signals (Schulz et al., 2012). The findings discussed throughout this review suggest that there is a complex interaction of many factors (both astrocytic and neuronal) determining how CBF is controlled, both basally and in response to neural activity. The task of studying the cellular functionality of astrocytes and/or neurons is thus a challenging one.

In addition to the vasodilations described above, there is *in vivo* evidence for astrocyte [Ca2+]i transients resulting in vasoconstriction. Two-photon imaging of astrocytes bulk loaded with calcium indicator dyes revealed that vasoconstrictions of penetrating cortical arterioles occurred during spreading depression (SD) at the onset of the fast astrocytic Ca2+ wave (Chuquet et al., 2007). Inhibiting either PLA2 or the refilling of internal calcium stores reduced the SD-induced vasoconstriction, suggesting that astrocytes mediate SD-induced vasoconstrictions via PLA2-mediated AA release.

In summary, the evidence suggests that in response to neural activity, astrocyte [Ca2+]i increases and vasoactive messengers are released from astrocytic endfeet. Thus, astrocytes may evoke changes in arteriole diameter and regulate CBF.

#### **ARE ACTIVITY-EVOKED ASTROCYTE CALCIUM TRANSIENTS WIDESPREAD AND FAST ENOUGH TO CONTRIBUTE TO NEUROVASCULAR COUPLING?**

Although a large body of evidence has been acquired over the past decade suggesting that astrocytes are potential mediators of functional hyperemia, the idea remains controversial. The presence, prevalence, and timing of astrocyte Ca2<sup>+</sup> signaling in response to neural activity and its role in the regulation of CBF is currently hotly debated. In a recent review, Cauli and Hamel (2010) discuss the relative timings of astrocytic and neuronal calcium responses to neuronal activity. Rapid calcium events are thought to reflect activation of ionotropic receptors (which are expressed frequently in neurons), while slower calcium responses are proposed to reflect activation of metabotropic receptors (expressed by astrocytes and neurons) and the release of calcium from intracellular stores. These calcium signal dynamics agree with the observation that calcium events in neurons often precede those in astrocytes (Wang et al., 2006; Schummers et al., 2008; Nizar et al., 2013). These data would suggest that astrocytes may only contribute to functional hyperemia in the late phase of the response. Recent studies have suggested that arteriole dilations resulting from neural activity may not only precede astrocytic [Ca2+]i signals (Nizar et al., 2013) but can, in fact, occur in the absence of glial [Ca2+]i signals (Schulz et al., 2012).

Using *in vivo* 2-photon imaging of astrocytes, Wang et al. (2006) reported whisker stimulation-evoked astrocyte [Ca2+]i transients in the barrel cortex which peak several seconds post stimulation. Such transients are too slow to trigger the hemodynamic response to neural activity, which occurs anywhere from a few hundred milliseconds to a couple of seconds after the onset of neuronal activity (Kleinfeld et al., 1998; Devor et al., 2003; Zonta et al., 2003). This idea is supported by evidence suggesting that there is a long lag time between the onset of stimulation and astrocyte [Ca2+]i transients (Schulz et al., 2012; Thrane et al., 2012) and that, following forepaw stimulation, the onset of astrocyte calcium responses may lag behind the onset of arteriole dilation at the same depth within the cortex (Nizar et al., 2013). In this last study (as is common in such studies), bulk loading of the calcium indicator dye, Oregon Green Bapta-1 (OGB-1) was used to measure calcium signals in both neurons and astrocytes. During data analysis the astrocyte region of interest (ROI) was minimized in order to avoid contamination from neuropil signals, which were suggested to account for the initial rapid calcium transients sometimes observed within an astrocyte ROI. Such rapid transients were not observed in astrocytes when using the calcium indicator dye Fluo-4, which was absent in neurons. The difficulty in determining with 100% certainty whether a calcium signal is within an astrocyte, astrocytic process, or neuropil highlights the need for the development both of improved sensitivity of 2-photon detection and of better dye localization. However, other studies also using *in vivo* 2-photon microscopy, IOS and bulk loading of calcium indicator dyes, contradict these findings. 
Within the olfactory bulb glomerulus, odor stimulation resulted in a local increase in CBF which was strongly correlated, both spatially and temporally, with an increase in astrocytic [Ca2+]i (Petzold et al., 2008). More recently, Lind et al. (2013) used signal-enhancing analysis of Ca2<sup>+</sup> activity to give higher sensitivity to fast Ca2<sup>+</sup> signals. This study demonstrates that, in contrast to the small proportion of astrocytes previously reported to exhibit fast [Ca2+]i transients (Winship et al., 2007), in the whisker barrel cortex of adult mice 66% of astrocyte somata and 70% of processes exhibit a stimulus-evoked [Ca2+]i elevation with rapid onset (peak ∼100 ms) and short duration which precedes local vasodilations (Lind et al., 2013). While stimulus-evoked [Ca2+]i transients occurring concurrently in neurons and astrocytes correlated with synaptic activity, only the astrocytic signals correlated with hemodynamic changes. Astrocytic calcium transients consisted of a fast response and, in ∼10% of astrocytes, slow augmentation. The authors suggest that it is this slow component that has been previously reported by other studies and that it is their improved analysis method which enables the fast component to be detected.

#### **ARE SUBCELLULAR CA2<sup>+</sup> TRANSIENTS IMPORTANT?**

In brain slices, it has been shown that calcium signals can occur in astrocytic processes in the absence of changes in the cell body (Di Castro et al., 2011). It may be that subcellular astrocyte calcium transients, e.g., those in the endfeet rather than those in the soma, are important for the regulation of CBF (McCaslin et al., 2011; Dunn et al., 2013; Lind et al., 2013). Devor's group reported that the onset of [Ca2+]i transients in endfeet (which may precede those in the soma: Wang et al., 2006) was delayed relative to the onset of arteriole dilation at the same cortical depth (Nizar et al., 2013). However, Lind et al. (2013) demonstrated fast [Ca2+]i transients within endfeet which preceded local vasodilation. In order to investigate [Ca2+]i transients in the astrocytic soma and/or processes, these studies, along with those of other groups (e.g., Dunn et al., 2013), utilized bulk loading of calcium indicator dye which lacks cellular specificity. The development of targeted expression of genetically induced calcium indicators will allow better dye localization and may result in the reliable detection of fast subcellular [Ca2+]i transients. Such subcellular transients could result in the release of vasoactive substances, hence playing a role in the regulation of CBF. Although this technique has yet to reveal results *in vivo*, membrane-bound genetic calcium indicators have been shown to detect local, subcellular, calcium rises in cultured astrocytes (Shigetomi et al., 2010a,b).

Finally, the majority of published neurovascular coupling studies have been performed in the cortex of anesthetized animals. Anesthetics may disrupt important features of neurovascular coupling, thus acting as a confound in understanding the cellular mechanisms underlying the regulation of CBF in response to neural activity (Martin et al., 2012). Three commonly used anesthetic combinations (ketamine/xylazine, isoflurane, and urethane) have been found to significantly suppress sensory-evoked astrocyte [Ca2+]i transients in mice (Thrane et al., 2012). Sensory-evoked [Ca2+]i transients were found to be more delayed with a slower rise time and longer duration in anesthetized animals compared to awake animals (Thrane et al., 2012). Further studies in awake rodents, such as those performed by Martin et al. (2012), are required in order to fully investigate the role of astrocytes, and their sensory-evoked [Ca2+]i transients, in neurovascular coupling.

#### **CONCLUSIONS**

The work outlined here demonstrates that astrocytes are capable of eliciting both vasoconstriction and vasodilation of brain arterioles. A popular hypothesis of astrocytic control of CBF in response to neural activity has been that neuronally released glutamate acts on astrocytic mGluRs to raise astrocytic [Ca2+]i, initiating downstream production of AA and the formation and release of vasoactive substances (Zonta et al., 2003; Mulligan and MacVicar, 2004; Takano et al., 2006; Petzold et al., 2008). However, recent studies have called into question the role of mGluR5 and IP3-mediated downstream pathways in the functional hyperemia response (Gurden et al., 2006; Calcinaghi et al., 2011; Nizar et al., 2013; Sun et al., 2013). Evidence from the retina suggests that neuron-glia signaling may be mediated by neuronally released ATP acting on glial P2Y receptors rather than via activation of mGluRs by glutamate (Newman, 2005; Metea and Newman, 2006). Indeed, it has been shown that astrocyte [Ca2+]i signals can be evoked by ATP in the cerebral cortex (Sun et al., 2013) and in cerebellar slices (Piet and Jahr, 2007; Habbas et al., 2011). Alternative hypotheses of astrocyte control of vessel diameter also include the efflux of K<sup>+</sup> through Ca2+-activated K+ channels in astrocyte endfeet (Filosa et al., 2006), although the functional, *in vivo*, significance of this pathway remains to be demonstrated. The role of astrocyte [Ca2+]i transients in the control of CBF *in vivo* during functional hyperemia remains controversial. An inability to observe Ca2<sup>+</sup> transients that are fast enough for neurovascular coupling has called into question the impact of astrocytes on CBF regulation in response to neural activity (Nizar et al., 2013). 
Recent advances in data analysis techniques resulting in a higher sensitivity to fast Ca2<sup>+</sup> signals may have overcome this problem (Lind et al., 2013), providing direct evidence for the existence of Ca2<sup>+</sup> responses which are rapid enough to contribute to neurovascular coupling. It is, however, worth considering that while we study Ca2<sup>+</sup> because we can currently visualize it, Ca2+-independent mechanisms such as those involving glutamate transport (Gurden et al., 2006; Petzold et al., 2008; Schummers et al., 2008) may play an important role in astrocyte-mediated regulation of CBF. A role for astrocytes in the control of CBF in pathology also remains a possibility (Chuquet et al., 2007). While the evidence suggests that astrocytes are important players in neurovascular coupling and functional hyperemia, the questions of whether astrocytes play a dominant role in triggering fast hemodynamic responses and, in particular, under what circumstances astrocytic Ca2+-mediated pathways are responsible, remain open. The exact mechanisms by which astrocytes are able to sense changes in neuronal activity and trigger the intracellular events regulating the resulting vascular response which underlies the fMRI BOLD signal remain unclear. Indeed, which pathway predominates may often result from the experimental model used. Other issues which remain to be solved are: what is the functional significance of astrocytic [Ca2+]i transients in awake animals? Under what circumstances are mGluR-mediated vasodilation and constriction important? What are the messengers underlying neurovascular coupling in healthy and diseased brain? Do slow astrocyte [Ca2+]i signals contribute to the sustained hemodynamic response? Research on this topic must continue. 
New technologies such as targeted genetic encoding of calcium indicators, optogenetics, and transgenic mouse lines allowing astrocyte physiology specifically to be altered will help us move forward with this research. Only by fully understanding the cellular mechanisms underlying functional hyperemia and the resulting BOLD signal will we be able to accurately interpret the BOLD fMRI signal in health and disease.

#### **ACKNOWLEDGMENTS**

Clare Howarth is a Vice Chancellor's Advanced Fellow at the University of Sheffield. I would like to thank Anusha Mishra and Fergus O'Farrell for their comments on the manuscript.

#### **REFERENCES**




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 February 2014; accepted: 18 April 2014; published online: 09 May 2014. Citation: Howarth C (2014) The contribution of astrocytes to the regulation of cerebral blood flow. Front. Neurosci. 8:103. doi: 10.3389/fnins.2014.00103*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Howarth. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Analysis of Neural-BOLD Coupling Through Four Models of the Neural Metabolic Demand

Christopher W. Tyler\*, Lora T. Likova and Spero C. Nicholas

*Smith-Kettlewell Institute, San Francisco, CA, USA*

The coupling of the neuronal energetics to the blood-oxygen-level-dependent (BOLD) response is still incompletely understood. To address this issue, we compared the fits of four plausible models of neurometabolic coupling dynamics to available data for simultaneous recordings of the local field potential and the local BOLD response recorded from monkey primary visual cortex over a wide range of stimulus durations. The four models of the metabolic demand driving the BOLD response were: direct coupling with the overall LFP; rectified coupling to the LFP; coupling with a slow adaptive component of the implied neural population response; and coupling with the non-adaptive intracellular input signal defined by the stimulus time course. Taking all stimulus durations into account, the results imply that the BOLD response is most closely coupled with metabolic demand derived from the intracellular input waveform, without significant influence from the adaptive transients and nonlinearities exhibited by the LFP waveform.

#### Edited by:

*Clare Howarth, The University of Sheffield, UK*

#### Reviewed by:

*Amir Shmuel, McGill University, Canada; Kevin Matthews Aquino, University of Sydney, Australia*

> \*Correspondence: *Christopher W. Tyler cwt@ski.org*

#### Specialty section:

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience*

Received: *14 April 2014* Accepted: *16 October 2015* Published: *15 December 2015*

#### Citation:

*Tyler CW, Likova LT and Nicholas SC (2015) Analysis of Neural-BOLD Coupling Through Four Models of the Neural Metabolic Demand. Front. Neurosci. 9:419. doi: 10.3389/fnins.2015.00419*

Keywords: fMRI, metabolic coupling, neural signal estimation, human brain, multimodal imaging, BOLD, local field potentials

# INTRODUCTION

The goal of functional Magnetic Resonance Imaging (fMRI) is to estimate properties of the neural signals in the brain during the spectrum of activities controlled by the nervous system. However, the recorded fMRI signal is a response to the metabolic demands of supporting the nearby neural activity (Thompson et al., 2003, 2004). It is therefore important to understand as much as possible about the pathway coupling the recorded fMRI response to the dynamics of the neural activity giving rise to it. The theoretical development of the neural/BOLD coupling logic is based on that of Tyler and Likova (2011) although the present application to monkey joint fMRI/local-field-potential data is entirely novel.

# Neural/Astrocyte Coupling

It is widely accepted that the origin of the metabolic demand driving the blood-oxygen-level-dependent (BOLD) signal recorded in fMRI is the energetic load deriving from transmitter release at the synaptic inputs to each neuron (Logothetis, 2002, 2003; Logothetis and Wandell, 2004; Shmuel et al., 2006; Carmignoto and Gómez-Gonzalo, 2010). The transmitter release is tightly coupled to the activation of the post-synaptic receptors on the recipient cell membrane and consequently to the energetic demands of the synaptic activation of the transmitter molecules for future release, the majority of synapses being glutamatergic (Magistretti, 2006). The synaptic origin of the energetic demand driving the BOLD signal is thus coupled to the net transmitter signal impinging on the cells, and hence to the intracellular potential in these cells. The majority of these energetic demands are met either by glycolysis of glutamate to glutamine in the neighboring astrocytes (Shank and Aprison, 1979; Wang and Floor, 1994; Bélanger et al., 2011; see **Figure 1**), or by oxidative phosphorylation in the neuronal mitochondria (Attwell and Laughlin, 2001; Hall et al., 2012; Pellerin and Magistretti, 2012).

It should be mentioned, however, that the existence of a direct interneuron pathway for vasodilation and vasoconstriction has also been proposed (Dirnagl et al., 1993; Ma et al., 1996), although the proportion of the effects specific to this direct pathway remains a matter of debate (Lindauer et al., 1996; Attwell et al., 2010). Indeed, we are unaware of any studies on this issue that provide evidence of interneuron control of vascular diameter having the fast (∼5 s) time constant sufficient to account for the BOLD response dynamics in the human brain *in vivo*.

# Source of BOLD Waveform Variability

It is well known that there are substantial variations in the BOLD waveform in different cortical regions recorded during the same task (Handwerker et al., 2004, 2012; Fox et al., 2005), which have often been interpreted as due to variations in the local hemodynamics among cortical regions. Two points should be made in this regard. One is that differences in hemodynamics are largely attributable to differences in density of the arterial supply and draining veins overlying the cortical parenchyma (Handwerker et al., 2012), which indeed are expected to have different dynamics from the local capillaries within the cortex. However, this is an issue that can be addressed by accurate segmentation and the appropriate choice of voxel sizes to exclude extra-parenchymal signals and restrict the recorded BOLD responses to cortical space. To our knowledge, none of the papers evaluating the regional variations in BOLD waveform have implemented this strategy.

The other important point is that none of the studies of regional variations in BOLD waveform have assessed the role of neural variations in temporal waveform in this phenomenon. Neural waveform variations among neurons of different types and even the same types in different cortical regions are well-established (e.g., Hegdé and Van Essen, 2004, 2006). Such variations in the source signal can readily give rise to variations in the consequent BOLD waveforms, even on a different (longer) timescale (see Tyler et al., 2008; Tyler and Likova, 2011, 2014). Given this neurophysiological evidence, it is arbitrary and prejudicial to attribute all BOLD waveform variations purely to hemodynamics. There must be a neural component to this variation that needs to be acknowledged in all analyses of BOLD variations across regions.

Indeed, the logic of the known variations in neural signals poses the question whether any of the regional BOLD variation can be securely attributed to hemodynamic causes. All studies of regional BOLD variation to date have employed paradigms in which the BOLD responses are mediated by neural signals, whether in response to external stimulation or intrinsic neural interactions. As such, the BOLD responses were subject to the known functional variation of neural activity across regions of cortical specialization, and hence of potential temporal variation. Only if the neural signals were determined to be equal by direct measurement, or the BOLD signals across cortical regions were generated by a post-neural input, such as nitric oxide infusion in the region of the blood vessels, could the variation be convincingly attributed to hemodynamic factors. However, in order to follow the first course, it is necessary to determine the aspect of the neural signals that is responsible for generating BOLD response dynamics, which is the topic of the present paper based on a novel analysis of simultaneously recorded local field potential (LFP) and BOLD signals from monkey cortex.

# Nonlinearity of the BOLD Time Course

The neural and BOLD response time courses were measured simultaneously in response to rotating checkerboard stimuli in a study by Logothetis (2003) in behaving monkeys. The neural time course was recorded in terms of the LFP, with the BOLD signal being recorded from 16 adjacent voxels (since the presence of the electrode prevented recording from the actual voxel containing it). Representative results are shown in **Figure 2**.

Two points are noteworthy. One is that the LFP timecourse (black curves) does not exactly match the stimulus timecourse (black box function) despite the author's efforts to do so by providing a continuously moving, high contrast target. The timecourse has the initial transient ubiquitously seen in single-unit recordings, followed by a sustained plateau that shows a gradual adaptation effect. The off-response shows a similar (inverted) transient, but only minimal evidence of the plateau. As a result, the overall LFP response is nonlinearly related to the stimulus in a manner that can be captured by a parallel-channel model of the sum of several component neural responses, but not by a serial model of convolution with any single form of temporal impulse response.

Logothetis' concern was not, however, with the linearity or otherwise of the LFP, but with its relation to the BOLD response. The BOLD time course was predicted on the basis of convolution of the recorded LFP waveforms with an estimated impulse response function. The function that provided a good fit for short duration stimuli, however, showed significant deviations from the measured data at long durations (**Figure 2**), predicting a substantially stronger BOLD response than was actually recorded at the longest duration, in particular.

This result implies that the neurometabolic coupling is not well-described by a linear convolution process, but has further nonlinearities built into it that need to be taken into account in an attempt to infer the neural signal on the basis of local BOLD response recordings.

The LFP recordings in **Figure 2** make it apparent that the LFP waveform has a complex time course that can be approximated by two exponentials with time constants of about 1 s and >30 s, respectively. Relative to the usual time courses of neural transients, of about 50 ms, these are remarkably prolonged neural processes on the time scale of the recorded BOLD signal from the same general region of cortex (blue trace).

The importance of this adaptation effect is emphasized by the fact that the recorded LFP signal does not fully match the predicted BOLD activation (red curve), and therefore a more comprehensive model is required, going beyond the standard General Linear Model (GLM) of convolution of a metabolic kernel with the stimulus time course. We note that a corresponding adaptation effect in the neural response to flickering stimulation was inferred by Pfeuffer et al. (2003) from the pattern of variations in BOLD response amplitude as a function of stimulus duration.

FIGURE 2 | Time course of the local field potentials (black trace), BOLD (blue trace), and predicted BOLD (red trace) to a continuous dynamic stimulus (black rectangle) of 3, 6, 12, and 24 s duration (A–D, respectively; from Logothetis, 2003, with permission). The prediction was generated by linear convolution of the recorded LFP signal with a hemodynamic response function (see Logothetis, 2002, for details).

# THEORETICAL ANALYSIS

# Analysis of Neural/BOLD Coupling Nonlinearities

The widespread utilization of the general linear model in fMRI analyses may be taken to imply that it is an adequate approximation to the BOLD signal behavior under typical recording conditions, but a detailed analysis reveals some limitations of this model. As a starting point of the analysis, we have developed a specific model structure of the processes leading to the BOLD paramagnetic signal of fMRI recordings (Tyler and Likova, 2011). This model goes beyond the linear convolution analyses of Friston (1997) and Friston et al. (1998, 2000) in incorporating multiple forms of neural signal within each voxel and recognizing an explicit glial aspect to the metabolic coupling pathway.

In general terms, the stimulus impinging on the subject generates a sequence of neural responses starting with the transduction into a neural signal within the sensory receptors, which then propagates to the brain and activates various populations of neurons within the voxels then being analyzed by the fMRI technique. For instance, the signals arriving from the retina generate synaptic activation of the populations of cortical cells, which generates a local energetic demand for the restoration of the neurotransmitter molecules carrying the activation signals. The chain of cortical metabolic processing, illustrated in the block diagram of **Figure 3**, progresses from the local metabolic demand generated by the neural events at the synapse through the metabolic coupling mediated by the neighboring astrocyte glial cells as a whole to the processes of oxygen delivery by the adjacent capillaries that is detected by the imaging methodology. It is important to emphasize that the astrocyte metabolic processes are slow relative to the intracellular signal dynamics, about as slow as the processes of hemodynamic oxygen supply. The time constant of the astrocyte responses is known to be of the order of several seconds (Kelly and Van Essen, 1974; Filosa et al., 2004; Metea and Newman, 2006; Schummers et al., 2008), and it is clear that there must be a substantial pre-hemodynamic component from these slow responses. Kelly and Van Essen (1974) and Schummers et al. (2008) also show that the slower glial responses are as strongly tuned to local stimulus orientation as are the neural responses, implying a tight functional coupling between them. However, at present too little is known of their dynamics and/or nonlinearities to securely assign precise time constants to the astrocytic component relative to the hemodynamic component.

# Specifying the Model Framework

The model framework is slightly modified from that in Tyler and Likova (2011). We treat the neural responses within each voxel for a given stimulus S(t) as generated by sets of homogeneous populations with similar signal waveforms N<sub>i</sub>(t) within each population (see **Figure 3**). For generality, it is assumed that these neural signal waveforms are generated by a nonlinear transduction from the input stimulus. The transduction from each neural population response to the local metabolic demand M<sub>i</sub>(t) is further assumed for generality to be nonlinear. The overall metabolic demands G(t) within a voxel are met primarily by the surrounding astrocytes, which support the required neural energy consumption over time and space and make a complementary metabolic demand G(t) on the adjacent vasculature. This integrated metabolic demand stimulates the vascular hemodynamic processes H(t) that provide the requisite oxygen and glucose exchange to replenish the energy depletion in the astrocytes. The last three stages constitute the metabolic response that determines the ratio of oxygenated to deoxygenated hemoglobin in the blood complement of a given voxel that is estimated through the paramagnetic reaction as the BOLD signal Y(t). These post-neural processing stages are often modeled as a linear metabolic response kernel (mrk) convolved with the presumed neural signal.

TABLE 1 | Mathematical model of the operations involved in the generation of the BOLD signal from the input stimulus.

The terms of the conceptual model in **Figure 3** are related by a series of mathematical operations specified in **Table 1** (modified from Tyler and Likova, 2011). The three operators are: (i) linear convolution (⊗), (ii) a nonlinear amplitude relation (f[ ]), and (iii) a multiple linear integrator (Σ). Note that each stage of the model is treated as the linear convolution of the output signal from the previous stage with a temporal response kernel designated by the lower-case initial of the respective process, i.e., the neural response function n(t), the metabolic response function m(t), the glial response function g(t), the true hemodynamic response function h(t), the paramagnetic response function p(t) that generates the BOLD signal, and an approximate metabolic response kernel mrk(t). This last process corresponds to a linear approximation of the metabolic coupling relation implied by the previous three stages. The linear integration across multiple parallel elements within the voxel provided by the glial coupling stage corresponds to a nonlinear process in the context of a single-channel solution.

#### Nonlinearities

Unlike the example in **Figure 2**, however, typical LFP responses show a much weaker transient at offset than onset (see **Figure 6**, column 1), which implies the presence of an adaptation process decreasing the transient component over time. Such adaptation can be readily modeled by the nonlinear process of an exponential decay with time constant γ multiplying the response over time (after convolution with the stimulus), as shown in the first line of **Table 1**.
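As a minimal sketch of this adaptation scheme, the Python fragment below convolves a boxcar stimulus with a brief neural kernel and then multiplies the result by an exponential decay with time constant γ. The kernel shape, all parameter values, and the choice to start the decay clock at stimulus onset are illustrative assumptions, not values taken from **Table 1**.

```python
import numpy as np

def adapted_response(stim, kernel, gamma, dt):
    """Convolve the stimulus with a neural kernel, then apply exp(-t/gamma)."""
    t = np.arange(len(stim)) * dt
    linear = np.convolve(stim, kernel)[:len(stim)] * dt   # causal linear stage
    return linear * np.exp(-t / gamma)                    # adaptive decay stage

dt = 0.01                                   # time step (s)
t = np.arange(0, 20, dt)                    # 20 s record
stim = (t < 12).astype(float)               # 12 s boxcar starting at t = 0
tau_n = 0.05                                # assumed neural time constant (s)
kernel = np.exp(-np.arange(0, 1, dt) / tau_n) / tau_n
response = adapted_response(stim, kernel, gamma=10.0, dt=dt)   # gamma is hypothetical

onset_level = response[50]       # response 0.5 s after stimulus onset
offset_level = response[1150]    # response 0.5 s before stimulus offset
```

Under these assumptions the level just before offset is well below the level just after onset, so the step at offset is smaller than the step at onset, which is the qualitative asymmetry described above.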

**Table 1** thus invokes three kinds of nonlinearities in the overall model: an amplitude nonlinearity (lines 1 and 2), an adaptive temporal nonlinearity (the exponential term in line 1), and a multiple summatory nonlinearity (line 3). Nevertheless, these stages are typically inaccessible; for practical purposes, they are therefore approximated by the linear model form in the last line of the table, in which a function representing the neural metabolic demand evoked by the neural response to the stimulus presentation is convolved with the metabolic response kernel (mrk).

# Nonlinear Model of the Local Field Potential (LFP)

#### The Neural Signal

A comprehensive model of the BOLD response therefore requires an accurate model of the intracellular potential dynamics coupled to stimulation. If the excitatory and inhibitory transmitter release are symbolized by ψ<sub>e</sub> and ψ<sub>i</sub>, respectively, we can specify the relationships between the synaptic input and the intracellular potential V<sub>I</sub>(t) as follows:

$$\begin{aligned} V_I(t) &= \sum_{\eta_e} \psi_e(t) - \sum_{\eta_i} \psi_i(t) + \mathbf{n}_0(0, \sigma_0) \\ &= \operatorname{stim}(t) \otimes \left( \eta_e t^{k_e} e^{-t/\tau_e} - \eta_i t^{k_e} e^{-t/\tau_i} \right) + \mathbf{n}_0(0, \sigma_0) \end{aligned} \tag{1}$$

where η<sub>e</sub> and η<sub>i</sub> are the numbers of excitatory and inhibitory transmitter molecules, respectively (or, strictly, the number of ionic charges carried by the net inflow of transmitter molecules per unit time) and n<sub>0</sub>(0, σ<sub>0</sub>) is the cumulated noise of the intracellular signal from quantal, thermal, and transmitter sources.

To avoid complications, we do not specify the contributory components of the intracellular noise. For example, the quantal component will decrease in standard deviation as luminance level is increased, and the transmitter source may decrease in standard deviation as the activation level decreases, but we assume the totality of noise sources add up to a constant Gaussian noise source to a first approximation. This assumption has been evaluated in detail by Carandini (2004) in coupled intracellular and extracellular recordings. His model provides an accurate quantitative account of the strong signal-dependence of the variability of the extracellular spike rate (Tolhurst et al., 1981; Vogels et al., 1989) in terms of a purely additive Gaussian intracellular noise passing through the threshold-like nonlinearity of the spike generation process. Thus, the additive Gaussian noise assumption for the intracellular signal governing the metabolic demand is fully compatible with the signal-dependent properties of neural spike noise.
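The compatibility argument can be illustrated with a small simulation. The sketch below adds Gaussian noise of constant standard deviation to a mean intracellular potential and passes the result through a threshold-linear rule, which is assumed here as a simple stand-in for the spike-generation nonlinearity; the function name, the threshold, and the noise level are hypothetical and are not taken from Carandini (2004).

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is reproducible

def spike_rate_stats(mean_v, sigma=1.0, threshold=0.0, n_trials=20000):
    """Mean and variance of a thresholded potential with additive Gaussian noise."""
    v = mean_v + sigma * rng.standard_normal(n_trials)   # noise SD is constant
    rate = np.maximum(v - threshold, 0.0)                # assumed threshold-linear output
    return rate.mean(), rate.var()

mean_potentials = [-2.0, -1.0, 0.0, 1.0]   # arbitrary units relative to threshold
stats = [spike_rate_stats(m) for m in mean_potentials]
rate_means = [s[0] for s in stats]
rate_vars = [s[1] for s in stats]

for m, mu, var in zip(mean_potentials, rate_means, rate_vars):
    print(f"mean potential {m:+.1f}: mean rate {mu:.3f}, rate variance {var:.3f}")
```

Although the intracellular noise has the same standard deviation at every potential, the variance of the thresholded output grows with the mean output over this range, which is the signal-dependent behavior referred to in the text.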

The constants η<sub>e</sub> and η<sub>i</sub> are specified for every individual cell and will vary substantially among cell types. Indeed, they will vary substantially with the placement of the intracellular (e.g., patch-clamp) recording site in relation to the synaptic inputs of the cell. However, for the present purposes, the relevant values are the average values integrated over large volumes of cortex leading to the local metabolic demand that underlies the BOLD signal, as reflected in the local field potential (LFP) recorded at a site in the extracellular medium.

As is highlighted by the data of **Figure 2**, there are adaptive effects in the neural response with a complex time course that can be approximated by two exponentials with time constants of about 1 s and >30 s, respectively. These are remarkably prolonged neural processes on the time scale of the recorded BOLD signal from the same general region of cortex (blue trace) as indicated by the fact that the recorded LFP signal does not fully match the predicted BOLD activation (red curve). The negative LFP signal in **Figure 2** following stimulus offset has a similar (but inverted) time course to that following the stimulus onset, implying that the adaptation effect is a subtractive inhibition rather than solely a multiplicative form of fatigue (which would have no negative rebound). If such a gain control were purely multiplicative, the amplitude of signal change at offset would be substantially less than that at onset, whereas the two amplitudes are similar within about 10%. Thus, the adaptive inhibition must be predominantly subtractive rather than multiplicative gain control and may correspond to the tonic intracellular hyperpolarization suggested by Carandini and Ferster (1997, 2000) to be the mechanism for pattern adaptation. However, it is adapting essentially to a dynamic input modulation, and hence the sustained LFP signal should be treated as deriving from a full-wave rectified transform of the intracellular potential.

Formally, the neural signal for the present analysis is considered to be the extracellular voltage V<sub>j</sub>(t) in each jth subpopulation of neurons with homogeneous response characteristics and is related to the intracellular voltage according to

$$V + \tau\_j \frac{dV}{dt} = \alpha\_j V\_I, \quad \text{where} \quad V = V\_j \,(t - \Delta t) \tag{2}$$

and where τ<sub>j</sub> and ζ<sub>j</sub> are the time constants of the two exponentials, Δt is an onset delay, and α<sub>j</sub> is a scaling factor, for a given neural population j.

Solving Equation (2) for V<sub>j</sub>(t) and restricting it to positive t gives:

$$V\_j(t - \Delta t) = \frac{\alpha\_j}{\tau\_j} V\_I(t - \Delta t) \otimes e^{-(t - \Delta t)/\tau\_j}, \ t > \Delta t$$

$$= 0, \qquad t < \Delta t \tag{3}$$
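The first-order relation of Equations (2) and (3) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the function name `first_order_response`, the boxcar input, and all numerical values are assumptions chosen for illustration, and the update rule assumes the input is held constant over each sample interval.

```python
import math

def first_order_response(v_in, dt, tau, alpha=1.0, delay=0.0):
    """Response of V + tau*dV/dt = alpha*V_I to a sampled input (hypothetical helper).

    v_in  : list of input samples V_I, assumed constant over each step
    dt    : sample interval in seconds
    tau   : time constant in seconds
    delay : onset delay (Delta t) in seconds, rounded to whole samples
    """
    decay = math.exp(-dt / tau)
    shift = int(round(delay / dt))
    out = []
    v = 0.0
    for n in range(len(v_in)):
        out.append(v)
        # Input driving the step from sample n to n+1 (zero before the delay).
        drive = v_in[n - shift] if n >= shift else 0.0
        # Exact solution of the first-order equation for a constant input.
        v = v * decay + alpha * drive * (1.0 - decay)
    return out

# Illustrative values only: a boxcar stimulus sampled at 10 ms, tau = 1 s.
dt = 0.01
stimulus = [1.0 if i < 1200 else 0.0 for i in range(2000)]
response = first_order_response(stimulus, dt, tau=1.0, alpha=1.0, delay=0.05)
```

With these assumed values the output stays at zero until the delay has elapsed and then rises exponentially toward α times the input level, which is the behavior that the convolution with the exponential kernel in Equation (3) describes.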

Thus, the neural input for the contributions of the various neural populations to the LFP for the model of **Table 1** is:

$$m(t) = \sum\_{j>1} V\_j \left( t \right) \tag{4}$$

together with a sustained component given by:

$$n\_1\left(t\right) = \int V\_1\left(t\right)\tag{5}$$

Finally, the mrk for the metabolic coupling relation in the last line of **Table 1** is assumed to be a gamma function of the form:

$$mrk(t) = \alpha\_M t^k \cdot e^{-t/\tau\_M} \tag{6}$$

where α<sub>M</sub>, k, and τ<sub>M</sub> are the characteristic constants of the mrk dynamics.
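As a minimal sketch of Equation (6), the gamma-function kernel can be evaluated directly, and setting its time derivative to zero shows that it peaks at t = kτ<sub>M</sub>. The function names and the numerical values of k and τ<sub>M</sub> below are illustrative assumptions, not the fitted parameters of the study.

```python
import math

def mrk(t, alpha_m, k, tau_m):
    """Gamma-function kernel alpha_M * t**k * exp(-t / tau_M), taken as zero for t < 0."""
    if t < 0:
        return 0.0
    return alpha_m * t**k * math.exp(-t / tau_m)

def mrk_peak_time(k, tau_m):
    """Setting d/dt [t**k * exp(-t/tau_M)] = 0 gives t_peak = k * tau_M."""
    return k * tau_m

# Illustrative values only (assumed, not taken from the fits).
k, tau_m = 5, 1.0
t_peak = mrk_peak_time(k, tau_m)
peak_value = mrk(t_peak, alpha_m=1.0, k=k, tau_m=tau_m)
```

Under this form the peak latency scales jointly with the order k and the time constant τ<sub>M</sub>, so a reported peak latency constrains only their product.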

To implement the additive (parallel-process) model of Equation (3) (shown in **Figure 4** for a qualitative fit to the data of **Figure 2**), the two decay components had time constants of 1 and 60 s ("slow" and "fast" components, red and green curves in **Figure 4A**). These processes were convolved with a neural signal derived from the sum of the two components after convolution of the two components with the rectangular form of the continuous stimulus for 3 and 12 s, the latter corresponding to the responses in **Figure 2**. This model captures the qualitative features of the LFP data (**Figures 4B,C**, black curves) with the sum of the two component responses (red and green curves in **Figure 4C**). Again, it is difficult to obtain such a combination of the two component slopes with a purely serial model, because this would imply a convolution of the two exponentials, which would necessarily result in a function dominated by the slower process rather than allowing both processes full expression.
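The contrast between the parallel and serial arrangements can be sketched with the impulse responses of two unit-area exponentials. This is an illustration under stated assumptions rather than the model of the paper: the function names and equal weights are hypothetical, and the serial kernel uses the standard closed form for the convolution of two exponentials with different time constants.

```python
import math

def parallel_kernel(t, tau_fast, tau_slow, w_fast=0.5, w_slow=0.5):
    """Weighted sum of two unit-area exponentials (additive, parallel processes)."""
    return (w_fast * math.exp(-t / tau_fast) / tau_fast
            + w_slow * math.exp(-t / tau_slow) / tau_slow)

def serial_kernel(t, tau_fast, tau_slow):
    """Convolution of the same two exponentials (serial processes), tau_fast != tau_slow."""
    return (math.exp(-t / tau_slow) - math.exp(-t / tau_fast)) / (tau_slow - tau_fast)

# Time constants of 1 and 60 s, as quoted in the text.
tau_fast, tau_slow = 1.0, 60.0
late_serial = serial_kernel(10.0, tau_fast, tau_slow)
late_slow_only = math.exp(-10.0 / tau_slow) / (tau_slow - tau_fast)
```

In this sketch the parallel kernel starts with a large fast-decaying term followed by a slow tail, whereas the serial kernel starts at zero and, a few fast time constants after onset, is almost indistinguishable from the slow exponential alone.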

# Neurometabolic Coupling

As will become evident, we will need a range of models of neurometabolic coupling to account for the variety of data available. We therefore develop four options as to what aspect of the neural signal is coupled through the metabolic demand to the BOLD response (see **Figure 5**). All four options assume that the coupling to generate the BOLD response can be approximated as a linear process of convolution with the mrk (last line of **Table 1**), with the nonlinearities occurring in terms of the predominant aspect of the neural signal and the early stages of the metabolic chain that is assumed to be driving the BOLD response. Thus, the coupling of the mrk with an LFP model response is assumed to be linear (as in Friston et al., 1998).

# LFP Coupling

The first model option (**Figure 5**, top row) is the original concept that the LFP represents the net neural signal in the voxel, which generates the metabolic demand that drives the metabolic recovery processes through the blood supply (Lippert et al., 2010), as mediated by the intervening glial cells. The net neural signal contributing to the LFP is the input for a given cortical area as well as its local intracortical processing, including the activity of excitatory and inhibitory interneurons and the effect of neuromodulatory pathways (Logothetis, 2003, 2008; Magri et al., 2012). The LFP model for this option is specified in the first line of **Table 1**, which incorporates a slow adaptive process in addition to the fast and slow decay components of Equation (4).

# Slow Adaptive Coupling

Instead of assuming that the MRK input derives from the whole LFP, it may be assumed to be specific primarily to the slow adaptive component of the model (**Figure 5**, second row), with the fast component attributable to spiking activity, which would have little impact on the BOLD response due to its low metabolic requirements (Logothetis, 2002, 2003, 2008; Logothetis and Wandell, 2004). Thus, the mrk is assumed to be solely the sustained component of Equation (5) followed by the adaptive process of line 1 of **Table 1**.

# Neurotransmitter Input Coupling

An alternative option is the assumption that all the observed LFP adaptation is a function of extracellular signal diffusion after the metabolic demand has been defined by the neurotransmitter processes (**Figure 5**, third row). Under this assumption, the neurometabolic coupling would be with a non-adaptive sustained neurotransmitter response to the input signal, as proposed by Logothetis (2002, 2003, 2008) and specified in Equation (5). In particular, this hypothesis implies that there would be no transient off-response component contributing to the BOLD signal.

# Rectified LFP Coupling

A final option (**Figure 5**, fourth row) is that any deviation of the LFP from zero (either positive or negative) is mediated by the release of some form of neurotransmitter and represents a metabolic demand (Sotero and Trujillo-Barreto, 2007; Tyler and Likova, 2011), as specified in Equation (1) with η<sup>i</sup> taking the value of −1. This assumption implies that the release of any neurotransmitter in the form of either excitatory or inhibitory synaptic coupling would constitute a neurometabolic load that generated a positive neurometabolic demand. A simplified model of such a demand would thus be represented by a rectified version of the nonlinear LFP (Rect LFP), although it is possible that this would still underestimate the metabolic demand due to electrical cancellation of the positive and negative components in different parts of the cell. Nevertheless, the rectified LFP would constitute a lower bound of the neurometabolic demand, and in particular would convey its characteristic of having no negative aspects. This simplified model can therefore be used as an initial assay of whether the rectification approach has merit, with possible elaboration if it provides a better fit than the other models.
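The rectification idea can be sketched in a few lines. This is a simplified illustration only: the sample values are hypothetical, chosen to mimic an onset transient, a sustained plateau, and a negative offset rebound, and the function name is an assumption.

```python
def rectified_demand(lfp):
    """Full-wave rectification: any deviation from zero counts as positive demand."""
    return [abs(v) for v in lfp]

# Hypothetical LFP samples (arbitrary units), not recorded data.
lfp = [0.0, 1.0, 0.4, 0.3, 0.3, -0.6, -0.2, 0.0]
demand = rectified_demand(lfp)
```

In this sketch the negative rebound after stimulus offset contributes a positive demand, so the resulting signal has no negative portions, which is the property that distinguishes this option from direct LFP coupling.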

# METHODS

As specified in the previous section, these four hypothetical forms of coupling have all been proposed in the literature. Here we may now compare their performance within primary visual cortex (V1) of macaque monkeys from LFP data made available to us by Nikos Logothetis from the study described in **Figure 2** (see Logothetis, 2003, for details), with seven recording durations (2, 3.2, 4.3, 6.4, 12.8, 13.4, and 25.7 s). The LFP bandwidth was 10–300 Hz. The stimuli were large-field rotating checkerboards, alternating in direction every 2 s, designed to avoid response adaptation as much as possible. There were a total of 28 datasets, which are averaged for each available duration to provide the average data for the seven durations shown in **Figure 6**.

The model fitting was implemented through the Matlab fminsearch function for optimization of a parametrized function to data, with the mean squared error as the variable to be minimized. For the full LFP model of Equation (4), we needed to include a sustained (non-adaptive) component (Equation 5) in addition to the two adaptive components (see line 1 of **Table 1**) in order to capture the characteristics of the response; thus i = 1, 2. To fit the LFP model of Equation (4) to these data, the four dynamic parameters of τ<sub>i</sub>, ζ<sub>i</sub>, their onset delay Δt and their adaptation time constant γ, were optimized for the fit to the mean responses simultaneously across all seven durations, together with amplitude of each component as a free parameter at each duration, making a total of 4 + 7 = 11 free parameters. For each duration, n = 64 and the residual variances for the LFP fits are specified in each panel of the first column of **Figure 6**. Thus the 64 × 7 = 448 data points of the average LFP data are fit with a model of 11 free parameters. The component weights of the resulting three components (green curves) are shown in the remaining columns of **Figure 6**, with the overall LFP waveforms (dashed blue curves) for comparison.

For the full model fits to the BOLD waveforms, the optimized LFP fit for each duration was convolved with an mrk according to Equation (6), with k and τ<sub>M</sub> optimized to all durations simultaneously, together with an amplitude parameter α<sub>M</sub> and baseline shift parameter for each of the 7 durations (2 + 2 × 7 = 16 free parameters). Since the BOLD sampling interval was 250 ms, the dataset of 160 × 7 = 1120 data points was being fit with the 16 free parameters for each of the four models of metabolic demand shown in **Figure 5** (given the LFP fit as the input function for each duration). The presence of 160 samples at each duration implies that individual fits are significant at p < 0.001 of the F-test, providing Bonferroni correction to p < 0.02 for multiple applications to 16 fits if they account for more than 61% of the variance (i.e., if the residual variance is less than 39% of the overall variance).

FIGURE 6 | Left column: Overall neural model fits (blue curves) to the average LFP responses (red curves) at each of 7 durations. Proportion of variance unaccounted for (*R*<sup>2</sup>) shown as insets. Three right columns: Optimized sustained, fast and slow adaptive components (green curves for each duration) required to provide the overall neural model fits (blue curves). Note that residual variance (1 − *R*<sup>2</sup>) is less than 3% in all cases, and must be considered to have fully characterized the LFP dynamics of V1.

Moreover, for the ratio between any two variances to be significant, the ratio has to exceed 1.63 on the F-test for significance at p < 0.001 (which provides an appropriate level of Bonferroni correction for the test validity at p < 0.05 over the multiple applications of 6 pairwise comparisons among the 4 models, times 7 durations, or a total of 42 test applications).
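The two decision rules stated in the text can be written out as a short sketch. The thresholds (39% residual variance, variance ratio of 1.63, per-test p of 0.001) are those quoted above; the function names and the example variances in the usage note are hypothetical.

```python
def fit_is_significant(residual_var, total_var, max_residual_fraction=0.39):
    """A fit counts as significant if it leaves less than 39% of the variance unexplained."""
    return residual_var / total_var < max_residual_fraction

def models_differ(residual_a, residual_b, min_ratio=1.63):
    """Two fits differ significantly if their residual variances differ by more than 1.63x."""
    larger = max(residual_a, residual_b)
    smaller = min(residual_a, residual_b)
    return larger / smaller > min_ratio

# Bonferroni bound: per-test p multiplied by the number of test applications.
n_pairs = 4 * 3 // 2      # 6 pairwise comparisons among 4 models
n_tests = n_pairs * 7     # times 7 durations = 42
family_p_bound = 0.001 * n_tests
```

For example, with assumed residual variances of 0.10 and 0.05 the ratio is 2.0 and `models_differ` returns `True`, whereas 0.10 against 0.08 gives a ratio of 1.25 and returns `False`. The bound of 42 × 0.001 = 0.042 is what keeps the family-wise level below 0.05.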

# MODELING RESULTS

The first aspect of the study was to fit the model of Equation (4) to the average LFPs across duration, as shown in **Figure 6**. This model fit had the twofold goal of (a) providing a low-free-parameter characterization of the LFP waveform and of (b) defining its component structure in terms of the components developed in Equations (2–5) and **Figure 5**. The specific model components were thus a sustained component matching the stimulus input, a fast adaptive component and a slower adaptive component. (Note that the adaptation gives the latter two components a much reduced offset transient relative to their onset transients at long durations; **Figure 6**, columns 3 and 4.) The optimal dynamic parameters are specified in **Table 2**.

The neural model fits to the LFP waveforms show that the three-component model has the appropriate structure to match all the evident features of the waveform, accounting for an average of 98% of the variance. Except at short durations, all three components are approximately equally weighted in the combined model. It might be possible to capture the data with the same component weights across duration, but the goal of the study is not LFP modeling per se, so it was not relevant to pursue this issue.

Fits of the four models for the metabolic demand to the BOLD responses at each duration are shown in **Figure 7**, based on the components of the LFP fits in **Figure 6**, together with their optimized mrk (top row). Note that the BOLD mrk parameters in **Figure 7** were allowed to vary across the four models (as there is no prior on the relationships among the models), but held constant over the 7 stimulus durations, as the metabolic parameters are not expected to be affected by the nature of the stimulus. The time constants of the optimized mrk waveforms in terms of peak latency were 4.8, 9.3, 6.6, and 2.7 s for the four models, respectively, based on a 5th-order gamma function model.

As specified in Methods, the individual fits are significant at p < 0.03 if the residual variance is <39%. Thus, all the fits are significant except for several of those for the 3.2 and 4.3 s durations.

For the specific comparisons among the different models, the statistically significant cases may be assessed as any having a ratio of the residual variances greater than 1.63 between model fits at a given stimulus duration, as described in Methods.

Across the durations, each of the model fits is significantly worse than for the Input model at a few durations (residual variances shown in bold), particularly those for the Slow Adapt model, and no model has significantly better fits than the Input model at any duration, with the exception of the LFP model at one duration, 12.8 s (**Figure 7**). At the longest duration, the Input model fits are significantly better than those for all three other models. Thus, taken together, the net result is that the Input model provides the best fit overall across the 7 stimulus durations.

# DISCUSSION

Taking all durations into account, the results of this modeling study imply that the BOLD response is most closely coupled with the neurotransmitter input waveform defined by the sustained response close to the boxcar waveform of the stimulus time course, without the transients and adaptive nonlinearities exhibited by the LFP waveform. The best-fitting BOLD mrk was a 5th-order gamma function with a peak time of 4.8 s and no inhibitory rebound, accounting for more than 90% of the variance at the three longest durations (which would correspond to correlations between the model and the data of >0.95). In practice, of course, the inputs to V1 voxels would have passed through several stages of neural processing in the visual pathway, including transmission delays and temporal integration, but these effects are evidently too small to be resolved on the time scale of the available analysis. Also, it should be noted that the initial transients characteristic of most neuronal responses are specifically minimized by the design of the stimuli, which provided continuous movement alternating in direction every 2 s, and hence that the initial neural response should be expected to closely match the stimulus specification. In this context, it is actually surprising to find the LFP exhibiting the pronounced initial transient that is evident in **Figure 6**, since the stimulus was specifically designed to minimize such deviations from the input boxcar waveform in the form of directional adaptation. However, the present data and model fits imply that any longer-term adaptation to this kind of motion stimulus is happening beyond the stage of the neural inputs to V1, as there is no tendency on average for the BOLD response to decline at the longest durations, and hence it must derive from a non-adapting component of the neural response in V1.

Thus, the net conclusion from this study agrees with that of Logothetis (2002, 2003, 2008), that the form of the BOLD signal is most compatible with the input to the neuronal response, i.e., with the energetics of the primary neural activation that requires a glutamatergic metabolic response. It is noteworthy that this is the coupling that involves the briefest estimated mrk, because this is the metabolic demand with the least transient input of the four. In fact, the mrk peak for this case is occurring at only 4.8 s, a fairly typical value for the general understanding for human BOLD responses. (Note, however, that this value cannot be compared directly with the HRF of the standard approach, as the HRF incorporates all preceding neural dynamics, whereas the mrk is restricted to the metabolic response kernel by the assumptions of the analysis.)

Moreover, the model mrk had no delay parameters. As can be seen from the examples in **Figure 7** (first column), there is no visible tendency for the rise of the BOLD onset to lag the model fits. This result suggests that there is no inherent BOLD delay relative to the gamma-function model of the mrk in relation to neural activation beyond that implied by the order of the gamma function required to account for the full BOLD waveform. Any further delays that may be needed in a range of GLM analyses of the gamut of tasks in the literature may be attributed to neural processing delays.

It should be emphasized that the linear convolution of the mrk stage required for the present fits implies (although it does not prove) that any further complexity or cortical diversity of the measured BOLD dynamics, as reported by Fox et al. (2005), Handwerker et al. (2012) or Likova and Tyler (2007), for example, is attributable to variations in the underlying neural signals rather than to variations in the BOLD HRF per se. On this basis, the results further imply that the use of stimuli that allow neural adaptation prior to arrival in the cortex, and hence an adaptive waveform for the cortical input (wherever in the cortex that may be), would show an adaptive BOLD response in that region of cortex. Moreover, a neural input that had a negative rebound in the signal arriving at the cortex would show a negative rebound in the BOLD response. For example, the rotating noise stimulus of the Logothetis study analyzed here was changed in direction every 2 s to minimize adaptation effects. If instead it had been maintained in direction for the full 40 s time period, classic motion adaptation would have been expected during the stimulus presentation, with a negative rebound corresponding to the motion aftereffect. Such behavior was indeed reported by Tootell et al. (1995). Evidence in favor of even stronger adaptation effects in a purely transient noise paradigm is provided by Likova and Tyler (2007).

# CONCLUSION

The good quality of the full model fits to the combined LFP and BOLD data as a function of duration provides a principled assessment of the nature of the neural/BOLD coupling behavior underlying BOLD fMRI and provides structured insights into the nature of the neural signal components contributing to the BOLD response dynamics. In general, the results are consistent with previous work employing a linear convolution of the stimulus waveform with a gamma-function model of the BOLD dynamics, but they provide further insight into the nature of the underlying processes involved. In particular, they reveal that no negative rebound of the BOLD response is required to account for the recorded BOLD waveforms.

In relation to the first stage of the model process, the extremely high quality of the model fits to the LFP data provides strong evidence that the LFP component model has the appropriate component structure to account for the mechanisms contributing to the recorded LFP dynamics. This question was not the focus of the present paper, but we note that there are surprisingly few modeling studies attempting to characterize the mechanisms of neural response dynamics, particularly in the case of LFPs, and propose this model structure as the starting point for more targeted studies of this issue.

In relation to the question of assessing the neural signals contributing to BOLD responses throughout the brain, a key tool in this enterprise is an accurate model structure for the likely neural responses in any local volume of cortex. The parameters of such a model can allow for optimization to the range of responses encountered across stimulus conditions, cortical regions and individual brains. The success of the present analysis helps to provide validation that this is an achievable goal, and should encourage similar efforts for a wider range of stimulus conditions to determine how far the present model can be generalized and what other aspects need to be included to characterize the full range of such constraints.

# ACKNOWLEDGMENTS

Thanks to Nikos Logothetis for providing the joint neurophysiological/fMRI data analyzed in this study. Funded by Congressionally Directed Medical Research Program Grant DM102524.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Tyler, Likova and Nicholas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Variability of the coupling of blood flow and oxygen metabolism responses in the brain: a problem for interpreting BOLD studies but potentially a new window on the underlying neural activity

#### *Richard B. Buxton\*, Valerie E. M. Griffeth , Aaron B. Simon and Farshad Moradi*

*Department of Radiology, Center for Functional MRI, University of California, San Diego, La Jolla, CA, USA*

#### *Edited by:*

*Clare Howarth, The University of Sheffield, UK*

#### *Reviewed by:*

*Anand Joshi, University of Southern California, USA Kevin Murphy, Cardiff University, UK*

#### *\*Correspondence:*

*Richard B. Buxton, Department of Radiology, Center for Functional MRI, University of California, San Diego, W. M. Keck Building, 0677, 9500 Gilman Drive, La Jolla, CA 92093-0677, USA e-mail: rbuxton@ucsd.edu*

Recent studies from our group and others using quantitative fMRI methods have found that variations of the coupling ratio of blood flow (CBF) and oxygen metabolism (CMRO2) responses to a stimulus have a strong effect on the BOLD response. Across a number of studies an empirical pattern is emerging in the way CBF and CMRO2 changes are coupled to neural activation: if the stimulus is modulated to create a stronger response (e.g., increasing stimulus contrast), CBF is modulated more than CMRO2; on the other hand, if the brain state is altered such that the response to the same stimulus is increased (e.g., modulating attention, adaptation, or excitability), CMRO2 is modulated more than CBF. Because CBF and CMRO2 changes conflict in producing BOLD signal changes, this finding has an important implication for conventional BOLD-fMRI studies: the BOLD response exaggerates the effects of stimulus variation but is only weakly sensitive to modulations of the brain state that alter the response to a standard stimulus. A speculative hypothesis is that variability of the coupling ratio of the CBF and CMRO2 responses reflects different proportions of inhibitory and excitatory evoked activity, potentially providing a new window on neural activity in the human brain.

**Keywords: cerebral blood flow (CBF), cerebral metabolic rate of oxygen (CMRO2), blood oxygenation level dependent (BOLD), functional magnetic resonance imaging (fMRI), inhibitory/excitatory neural activity**

#### **THE CHALLENGE OF INTERPRETING THE BOLD RESPONSE IN A QUANTITATIVE WAY**

Functional magnetic resonance imaging (fMRI) based on the detection of blood oxygenation level dependent (BOLD) signal changes has had an enormous influence on human neuroscience studies, providing a sensitive and noninvasive tool for detecting a change in neural activity in response to a stimulus or during spontaneous neural fluctuations. The basic physical phenomenon underlying the BOLD effect is that deoxyhemoglobin is paramagnetic, and its presence reduces the MR signal slightly (Buxton, 2013). If the blood becomes more oxygenated, the MR signal goes up. Note, though, that this phenomenon by itself is not enough to explain why the BOLD effect happens: one could easily imagine that CBF and CMRO2 increase by the same fraction in response to neural activity changes, which would not change blood oxygenation. The existence of the BOLD effect depends also on a second, physiological phenomenon: when neural activity increases CBF increases much more than CMRO2—*decreasing* the local oxygen extraction fraction—and the decreased concentration of deoxyhemoglobin creates the BOLD response. While it is widely understood that the BOLD response is not directly related to neural activity, there is nevertheless a tendency to think of it as a relatively simple two-step process: increased neural activity leads to a CBF change, which then produces a BOLD signal change. In this perspective article we argue that this view is too simplistic, because it leaves out the important role played by CMRO2: when neural activity increases, the CBF increase tends to wash out deoxyhemoglobin, while the CMRO2 increase tends to create more deoxyhemoglobin. For this reason, the BOLD signal depends strongly on the coupling ratio *n*, the ratio of the fractional changes in CBF and CMRO2. For example, the same change in CBF will produce a larger BOLD response when *n* is large.

For this reason, interpreting the BOLD response in terms of the underlying neural activity is not just a question of understanding neurovascular coupling; we must also understand neuro-metabolic coupling. Local neural activity includes both synaptic and spiking activity, and both excitatory and inhibitory activity. The basic problem is that we currently do not have a good quantitative understanding of how each of these aspects of neural activity drives CBF and CMRO2. Current thinking is that the acute CBF response to a stimulus is not driven directly by the change in energy metabolism, but rather by signals related to the neural activity itself (Attwell and Iadecola, 2002). This essentially feed-forward mechanism provides a way to avoid a potentially dangerous drop in tissue O2 concentration by increasing CBF in anticipation of a greater need for oxygen (Buxton, 2010). The need for a relatively fast CBF response is that there is very little O2 available in tissue to serve as a buffer [tissue O2 in gray matter would be depleted in about 1 s for normal CMRO2 (Buxton, 2010)], and a quick increase in CMRO2 could lead to a sharp drop in available O2 in the tissue unless CBF also quickly rises. This means that we must think of CBF and CMRO2 as being driven in parallel by neural activity, but potentially by different aspects of that activity.

These physiological considerations emphasize the difficulty of interpreting the BOLD response in a quantitative way. Most fMRI investigators would support the view that if a local BOLD signal change is detected in response to a stimulus, it suggests that there is some underlying change in neural activity, the basis of using the BOLD response as a mapping signal. However, if we focus on questions comparing BOLD responses under different conditions, the interpretation becomes more problematic: does a change of the underlying neural activity in response to a stimulus necessarily lead to a BOLD signal change? Or, if the BOLD response is different comparing two conditions, does the magnitude of the difference reflect the magnitude of the underlying physiological differences? These are more difficult questions to answer, and reflect a key shift from simply asking where activation occurs to asking how much activation occurs. The difficulty in making this shift is part of the reason for the lack of clinical impact of fMRI, despite the clear potential to provide information on brain dysfunction. The most established fMRI application in a clinical setting is in pre-surgical planning (Chakraborty and McEvoy, 2008), where the basic question is with regard to the location of activity, reflecting the success of fMRI as a mapping tool. For many clinical and neuroscience applications, though, the part of the brain of interest is already known, and the important question is: what is the level of neural activity of that brain area under different conditions?

We take this as the fundamental challenge for fMRI: how can we interpret the magnitude of the BOLD signal in a quantitative way in terms of the underlying physiological activity? Based on the studies discussed below, our conclusion is that the BOLD response alone is ambiguous, and cannot be interpreted reliably as a quantitative reflection of the underlying physiology. Fortunately, though, the combination of BOLD imaging with arterial spin labeling (ASL) methods and a calibrated BOLD approach makes it possible to isolate the effects of CBF and CMRO2 (Davis et al., 1998; Hoge, 2012; Pike, 2012). This quantitative fMRI approach provides a much richer context for assessing the underlying physiology of brain activation and offers the potential of revealing more about the underlying neural activity than BOLD imaging alone.

### **THE COMPLEXITY OF THE BOLD RESPONSE**

From a quantitative viewpoint, we can look at the BOLD response as driven by a CBF change, but strongly modulated by two additional physiological factors: the CBF/CMRO2 coupling ratio *n*, discussed above, and the amount of deoxyhemoglobin present in the baseline state (**Figure 1**). In order to clarify the complexity of the BOLD signal, we introduced a simple heuristic model for the BOLD response (Δ*S*), based on a more detailed model (Griffeth and Buxton, 2011), that approximately captures the different factors involved (Griffeth et al., 2013):

$$
\Delta S = A \left( 1 - 1/n - \alpha\_V \right) \left( 1 - F\_0/F \right) \tag{1}
$$

**FIGURE 1 |** […] oxygen metabolism (CMRO2), with increased blood flow (CBF) driven by aspects of the neural response. The BOLD response is primarily driven […] is a simple model for the BOLD response in terms of these physiological changes.

**Frontiers in Neuroscience** | Brain Imaging Methods June 2014 | Volume 8 | Article 139 |

The scaling factor *A* is proportional to the total amount of deoxyhemoglobin present in the baseline state, and so depends on the baseline oxygen extraction fraction and venous blood volume, and also depends on technical factors related to the data acquisition (magnetic field strength and echo time). The baseline CBF is denoted *F*<sub>0</sub>, and the activated CBF is denoted *F*. The nonlinear dependence on *F* reflects the ceiling effect on the BOLD response: even a very large flow is limited in its effect because it can only reduce the finite amount of deoxyhemoglobin present in the baseline state. The parameter α*<sub>V</sub>* describes the effect of a change in venous blood volume with activation, which changes the total blood volume containing deoxyhemoglobin. Typical values of the parameters for a strong activation in visual cortex are *A* = 0.12, *F*/*F*<sub>0</sub> = 1.4 (40% flow increase), *n* = 2 (20% CMRO2 increase), and α*<sub>V</sub>* = 0.2 (Chen and Pike, 2009), giving a BOLD signal change of about 0.01 (1%).
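The arithmetic behind the quoted 1% signal change can be checked with a short sketch of Equation (1). The function and variable names are illustrative assumptions; the parameter values are the typical ones quoted in the text.

```python
def bold_change(A, n, alpha_v, flow_ratio):
    """Heuristic BOLD model: Delta S = A * (1 - 1/n - alpha_V) * (1 - F0/F).

    flow_ratio is F/F0, the activated CBF relative to baseline.
    """
    return A * (1.0 - 1.0 / n - alpha_v) * (1.0 - 1.0 / flow_ratio)

def coupling_ratio(cbf_fraction, cmro2_fraction):
    """n: fractional CBF change divided by fractional CMRO2 change."""
    return cbf_fraction / cmro2_fraction

# Typical values quoted in the text for strong visual activation.
n = coupling_ratio(0.40, 0.20)           # 40% CBF and 20% CMRO2 increase give n = 2
typical = bold_change(A=0.12, n=n, alpha_v=0.2, flow_ratio=1.4)

# Ceiling effect: even an extremely large flow increase cannot exceed this limit.
ceiling = 0.12 * (1.0 - 1.0 / n - 0.2)
```

With these values the product is 0.12 × 0.3 × (1 − 1/1.4), or about 0.0103, which matches the stated change of roughly 1%. The ceiling term shows that the response saturates at 0.036 for this choice of *A*, *n*, and α<sub>V</sub>, however large the flow becomes.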

Caffeine provides a useful test for exploring the complexities involved in the BOLD response because it has both neural and vascular effects through inhibition of adenosine receptors, and thus affects multiple factors in Equation (1). Adenosine has the somewhat counterintuitive effect of inhibiting neural activity but increasing CBF, which is most likely a protective mechanism limiting O2 demand while trying to increase O2 delivery. We thus expect administration of caffeine to reduce CBF but potentially to increase CMRO2 as the effects of adenosine are blocked. In our study (Perthen et al., 2008; Griffeth et al., 2011) we used a calibrated BOLD experimental design that made it possible to refer all changes to the pre-caffeine baseline state, allowing us to look at both baseline changes due to caffeine and also the response to a visual stimulus before and after caffeine (**Figure 2A**). The primary findings were that baseline CBF was reduced by 25% due to caffeine, consistent with earlier studies (Chen and Parrish, 2009a), while baseline CMRO2 increased, and in addition the absolute CMRO2 response to the visual stimulus was increased by 60% post-caffeine [consistent with findings in (Chen and Parrish, 2009b)]. The latter result is consistent with the idea that caffeine led to increased excitability, in the sense that the same stimulus elicited a much stronger evoked response.

#### **FIGURE 2 | Pattern of variation of the coupling ratio of CBF and CMRO2 responses.** Data from three studies of visual cortex show how responses are modulated by: **(A)** ingestion of 200 mg caffeine (Perthen et al., 2008; Griffeth et al., 2011); **(B)** increasing stimulus contrast (Liang et al., 2013); and **(C)** increasing attention to a fixed stimulus (Moradi et al., 2012). For the caffeine data **(A)**, changes are as a percentage of pre-caffeine baseline state, and the plots for CBF (middle column) and CMRO2 (right column) show both the baseline shift due to caffeine (the shift of the bottom of the bars) as well as the change in the activation state due to the visual stimulus response (the shift of the top of the bars). Note that the relative BOLD responses (left column) for the two conditions within each experiment (pre- vs. post-caffeine, low contrast vs. high contrast, and unattended vs. attended) do not quantitatively reflect the underlying CMRO2 response for those conditions. The BOLD response was unchanged with caffeine, despite a large change in the CMRO2 response to the stimulus, and the BOLD response greatly overestimated the CMRO2 change when stimulus contrast was changed and greatly underestimated the CMRO2 change when attention was modulated.

The surprising result, given these large changes in the underlying physiology, was that the BOLD response to the visual stimulus was unchanged by caffeine. The origin of this negative finding illustrates the complexity involved in interpreting the BOLD response, in this case because two effects were present but acting in opposite directions. The baseline shift, decreasing CBF with increasing CMRO2, would increase baseline levels of deoxyhemoglobin, creating a larger value of *A*. However, the increased neural excitability, with a larger change in CMRO2 compared to CBF in response to the visual stimulus, decreased the value of *n*. In our study population these two effects mutually cancelled, leaving the BOLD signal unchanged. In short, this example shows that large physiological changes, detected with quantitative fMRI methods, can be missed when looking only at BOLD responses.
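The cancellation described above can be made concrete with a small numerical sketch. It assumes that Equation (1), which is not reproduced in this passage, has the simple form ΔS/S<sub>0</sub> = *A*(1 − *F*<sub>0</sub>/*F*)(1 − α<sub>V</sub> − 1/*n*); this form reproduces the typical 1% BOLD change quoted for *A* = 0.12, *F*/*F*<sub>0</sub> = 1.4, *n* = 2, and α<sub>V</sub> = 0.2. The function name `bold_change` and the reduced value of *n* are hypothetical, chosen for illustration rather than taken from the study data, and the flow ratio is held fixed for simplicity.

```python
def bold_change(a, flow_ratio, n, alpha_v):
    """Fractional BOLD change under the ASSUMED simple model
    dS/S0 = A * (1 - F0/F) * (1 - alpha_v - 1/n)."""
    return a * (1.0 - 1.0 / flow_ratio) * (1.0 - alpha_v - 1.0 / n)


# Typical strong visual activation, using the parameter values quoted in the text.
typical = bold_change(a=0.12, flow_ratio=1.4, n=2.0, alpha_v=0.2)

# Hypothetical shift in coupling: n falls from 2.0 to 1.8 (illustrative value only),
# which on its own weakens the BOLD response.
n_pre, n_post, alpha_v = 2.0, 1.8, 0.2
weaker = bold_change(a=0.12, flow_ratio=1.4, n=n_post, alpha_v=alpha_v)

# Scaling factor A that would exactly offset the drop in n at the same flow ratio.
a_offset = 0.12 * (1.0 - alpha_v - 1.0 / n_pre) / (1.0 - alpha_v - 1.0 / n_post)
restored = bold_change(a=a_offset, flow_ratio=1.4, n=n_post, alpha_v=alpha_v)

print(f"typical: {typical:.4f}, n reduced: {weaker:.4f}, "
      f"A needed: {a_offset:.3f}, restored: {restored:.4f}")
```

Under these assumptions, an increase in *A* of roughly 23% is enough to hide a drop in *n* from 2.0 to 1.8, so two substantial physiological changes leave the BOLD change at about 1%.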

#### **THE VARIABILITY OF FLOW/METABOLISM COUPLING**

The caffeine example raises a basic question: how variable is the CBF/CMRO2 coupling ratio under different conditions? For the past several years we have tried to address this question with a series of calibrated BOLD studies in human visual cortex. While we (Ances et al., 2008) and others (Chiarelli et al., 2007) have found different coupling ratios in different brain regions, our goal in these studies was to specifically test whether the coupling ratio changes for the same brain region under different conditions. For several conditions we found the coupling ratio *n* to be unchanged, in good agreement with earlier pioneering studies using the calibrated BOLD approach by Hoge et al. (1999). In particular, one scenario in which we expected to see coupling differences was comparing color and luminance stimuli designed to preferentially stimulate blob and interblob regions. Anatomically, these regions are defined by different concentrations of cytochrome oxidase, suggesting different capacities for oxidative metabolism. However, we found no evidence for a coupling difference when the stimuli were adjusted to evoke similar magnitudes of response (Leontiev et al., 2013).

However, in several other studies we found evidence for significant variability of the CBF/CMRO2 coupling ratio (**Figure 2**). In these studies we found that *n* was smaller for a weak stimulus compared with a stronger stimulus (varying contrast of the stimulus) (Liang et al., 2013), for an attended stimulus compared to the same stimulus when unattended (Moradi et al., 2012), and with adaptation to a sustained stimulus compared to the initial response (Moradi and Buxton, 2013). Put another way, compared to the CBF response these data are consistent with the CMRO2 response rounding off more as the stimulus intensity increases, responding more strongly to attention, and adapting more quickly to a sustained stimulus. Based on these studies an interesting empirical pattern is beginning to emerge for how CBF and CMRO2 respond to different types of neural activity. If the stimulus is modulated to create a stronger response (e.g., increasing stimulus contrast), CBF is modulated more than CMRO2 (*n* increases); on the other hand, if the brain state is altered such that the response to the same stimulus is increased (e.g., modulating attention, adaptation, or excitability with caffeine), CMRO2 is modulated more than CBF (*n* decreases). Because CBF and CMRO2 changes conflict in producing BOLD signal changes, this finding has an important implication for conventional BOLD-fMRI studies: the BOLD response exaggerates the effects of stimulus variation but is only weakly sensitive to modulations of the brain state that alter the response to a standard stimulus.

These effects are not small, as illustrated in **Figure 2**. Changing the stimulus contrast created a modest change in the evoked CMRO2 response but the BOLD response modulation was about twice as large. In contrast, attention created a large amplification of the CMRO2 response, with only a modest change in the BOLD response. Going back to our caffeine study, despite a large change in the CMRO2 response to the stimulus, there was no change in the BOLD response. In short, the BOLD signal could exaggerate the underlying change in CMRO2 or miss it entirely. Note that these effects are all consistent with our understanding of the conflicting effects of CBF and CMRO2 changes on the BOLD response, with relatively small changes in *n* having a large effect. The intriguing physiological phenomenon is that the coupling ratio is not fixed within a brain region, but varies under different conditions. This clearly presents a problem for the interpretation of the BOLD response alone, but these results also show that quantitative fMRI methods can provide a deeper probe of the physiology of brain activation, and raise the question: does the CBF/CMRO2 coupling ratio tell us something about the underlying evoked neural activity?
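To illustrate how sensitive the BOLD response is to the coupling ratio, the sketch below holds the CMRO2 response fixed at 20% and varies *n*. It assumes the simple model form ΔS/S<sub>0</sub> = *A*(1 − *F*<sub>0</sub>/*F*)(1 − α<sub>V</sub> − 1/*n*) for Equation (1), with *n* taken as the ratio of the fractional CBF change to the fractional CMRO2 change (so that *n* = 2 pairs a 40% flow increase with a 20% CMRO2 increase). The function name and the range of *n* values are illustrative assumptions, not values reported by the studies.

```python
def bold_for_fixed_cmro2(cmro2_change, n, a=0.12, alpha_v=0.2):
    """BOLD change for a given fractional CMRO2 change and coupling ratio n,
    under the ASSUMED model dS/S0 = A * (1 - F0/F) * (1 - alpha_v - 1/n)."""
    flow_ratio = 1.0 + n * cmro2_change  # F/F0 implied by the coupling ratio
    return a * (1.0 - 1.0 / flow_ratio) * (1.0 - alpha_v - 1.0 / n)


# Same 20% CMRO2 response, different (hypothetical) coupling ratios.
responses = {n: bold_for_fixed_cmro2(0.20, n) for n in (1.5, 2.0, 2.5, 3.0)}
for n, bold in responses.items():
    print(f"n = {n:.1f}: BOLD change = {100 * bold:.2f}%")
```

With these assumed parameters an identical 20% CMRO2 response yields BOLD changes ranging from roughly 0.4% (*n* = 1.5) to 2.1% (*n* = 3), which is why the BOLD magnitude alone is a poor index of the underlying metabolic change.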

## **NEURAL ACTIVITY: WHAT COSTS ENERGY AND WHAT DRIVES BLOOD FLOW?**

Our basic assumption is that CMRO2 is the physical parameter closest to the underlying neural activity in that it reflects the net energy cost of that activity. This assumption is important to make explicit, because it is complicated by the dissociation of glucose metabolism and oxygen metabolism in the brain (Fox et al., 1988). For reasons that are not well understood, glucose metabolism increases more than oxygen metabolism with increased neural activity. Nevertheless, most of the energy required in terms of adenosine triphosphate (ATP) generation to support the neural activity is thought to come from oxidative metabolism of pyruvate, with the contribution from glycolysis as a small fraction (Buxton and Frank, 1997; Lin et al., 2010).

The primary energy cost of neural activity is the restoration of sodium and calcium gradients partially degraded by neural activity (Attwell and Laughlin, 2001; Buxton, 2013). These ions are maintained in a state far from thermodynamic equilibrium, with high extracellular concentrations and low intracellular concentrations. An action potential arriving at an excitatory synapse triggers a chain of events that leads to the opening of sodium channels on the post-synaptic dendrite. The sodium then flows through the channel due to the electrochemical gradient, creating an excitatory inward synaptic current that partially depolarizes the membrane potential. This in turn leads to the opening of voltage sensitive calcium channels, creating an influx of calcium ions (Lauritzen, 2005). If the net excitatory current into the postsynaptic cell reaches the soma with sufficient strength an action potential is generated. Importantly, none of this signaling process requires energy, because each step is downhill in a thermodynamic sense. The energy cost is in restoring the ion gradients by pumping sodium and calcium back out of the cell, requiring ATP as the source of free energy for this thermodynamically uphill process. For this reason, excitatory neural activity has a high energetic cost. While there is an energy cost associated with clearing neurotransmitter from the synaptic cleft and repackaging it in the pre-synaptic terminal, this is thought to be less than 10% of the total energy cost of synaptic activity (Attwell and Laughlin, 2001). There is also a cost in generating and propagating the action potential, and although this cost is estimated to be about half of the energy cost for the rat brain, the higher number of synapses each axon projects to in the primate brain shifts the dominant energy cost to recovery from synaptic activity rather than action potential production. Estimates for the primate brain are that excitatory synaptic activity accounts for about 3/4 of the energy costs of neural signaling (Attwell and Iadecola, 2002).

Inhibitory synaptic activity is likely to have a much lower energy cost. Inhibitory activity can take several forms, but the simplest is the opening of chloride channels. The extracellular medium has a higher concentration of both sodium and chloride than the intracellular medium. However, because chloride ions are negatively charged, their distribution is close to equilibrium with the negative intracellular electric potential. The membrane potential reflects the balance of open channels for different ions, and opening more chloride channels tends to peg the membrane potential at the chloride equilibrium potential, effectively reducing the effect of simultaneous excitatory sodium currents. When GABA, the primary inhibitory neurotransmitter in the cortex, is released there will again be the energy cost associated with clearing and repackaging the neurotransmitter, but there is no large energy cost for post-synaptic ion pumping: chloride ions are already in a near equilibrium distribution, and there is no large sodium flux as there is for excitatory activity.

Blood flow is driven strongly by aspects of excitatory synaptic activity, a well-matched feed-forward system given that the dominant energy cost is excitatory activity. In contrast, the role of inhibitory interneurons in the control of CBF presents an intriguingly complex picture (Cauli et al., 2004). Some classes of interneurons have a constricting effect on blood vessels, acting to reduce CBF. However, other classes of interneurons have a vasodilatory effect, increasing CBF. In particular, one of the most potent vasodilators known, nitric oxide (NO), is released by inhibitory interneurons (Estrada and DeFelipe, 1998). As with the effects of adenosine, discussed above in the context of our caffeine experiment, this is an example of an agent that has opposite effects on CBF and CMRO2: acting to increase CBF while also acting to inhibit neural activity and thus reduce CMRO2.

#### **DOES CBF/CMRO2 COUPLING REFLECT THE BALANCE OF INHIBITORY AND EXCITATORY NEURAL ACTIVITY?**

The observation that there are examples of inhibitory mechanisms that have a larger effect on increasing CBF than on increasing CMRO2 (or even act to reduce CMRO2) suggests a speculative hypothesis: the coupling ratio *n* of CBF and CMRO2 responses to a stimulus tracks with the ratio of inhibitory to excitatory activity in the neural response. In this picture, when there is a strong involvement of inhibitory activity, CBF is increased relative to CMRO2 because of the vasodilatory effect of the inhibitory mechanisms, and thus *n* is larger. In our experiments we had no direct information on the balance of excitatory and inhibitory activity, but we can imagine plausible scenarios based on this hypothesis. For our attention experiment, the visual stimulus was either the focus of the task or a distractor for another task the subject was asked to perform; we hypothesize that inhibition of the response to the stimulus in the latter unattended case would lead to a larger *n*, as observed (Moradi et al., 2012). With adaptation, we hypothesize that increased involvement of inhibitory mechanisms over time would act to reduce the CMRO2 response while continuing to push up the CBF response, as observed (Moradi and Buxton, 2013). In the caffeine experiment, before caffeine was given adenosine was more effective, tending to increase the balance of inhibitory and excitatory activity and boost the CBF response but suppress the CMRO2 response (Griffeth et al., 2011). With increasing contrast of a visual stimulus, animal studies of the behavior of different cellular types found a flattening of the response of simple regularly spiking neurons (thought to be glutamatergic excitatory cells) but continued increasing activity of simple fast spiking neurons (thought to be inhibitory GABAergic neurons) (Contreras and Palmer, 2003), suggesting a greater proportional involvement of inhibitory activity as contrast increases, consistent with our finding of increased *n* (Liang et al., 2013).

This hypothesis is speculative, but suggests the possibility of a new direction in which quantitative fMRI may be able to provide information on the underlying activity. Note that this information is in addition to the magnitude of the overall evoked response, as reflected in the CMRO2 response. The overall response depends on the balance of excitatory and inhibitory activity in a nonlinear way, and the overall response magnitude (the CMRO2 response) could be large for either a weaker stimulus with no inhibition or a stronger stimulus with more involvement of inhibitory mechanisms. If this hypothesis is true, then the ratio of CBF and CMRO2 responses could provide an index of the involvement of inhibitory neural activity that could distinguish these cases.

In conclusion, the BOLD response is a complex phenomenon, and the magnitude of the BOLD response cannot be taken as a quantitative reflection of underlying activity. Our studies suggest a pattern in which the BOLD magnitude exaggerates the physiological changes when the stimulus strength is changed, but underestimates or completely misses those changes when the brain state is modulated to change the response to the same stimulus. This is a problem for interpreting BOLD imaging alone, but quantitative fMRI methods offer a way to untangle the ambiguities of the BOLD response. Current work in our group is focused on developing approaches to apply these methods to analyze dynamic responses (Simon et al., 2013) and to make the calibration easier to apply by eliminating the need to breathe special gas mixtures (Blockley et al., 2012). Potentially, quantitative fMRI methods provide two candidate measurements of neural activity: the overall evoked response, as reflected in the CMRO2 change; and the balance of evoked inhibitory and excitatory activity, as reflected in the coupling ratio of the CBF and CMRO2 responses. We emphasize though, that this picture is speculative, based on two elements: (1) a limited set of experiments in human primary visual cortex to explore the variability of CBF/CMRO2 coupling; (2) limited understanding of the role of inhibitory mechanisms on CBF control (most of which comes from brain slice experiments, rather than *in vivo* experiments) and very little understanding of effects of inhibitory activity on CMRO2 (although the theoretical arguments are plausible). Each of these elements requires much more experimental attention to test whether there is any truth in this speculative hypothesis.

#### **ACKNOWLEDGMENTS**

This work was supported by NIH grants NS036722, NS081405, and EB000790. The authors would like to thank Anna Devor for helpful discussions of these ideas.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 04 March 2014; accepted: 19 May 2014; published online: 11 June 2014. Citation: Buxton RB, Griffeth VEM, Simon AB and Moradi F (2014) Variability of the coupling of blood flow and oxygen metabolism responses in the brain: a problem for interpreting BOLD studies but potentially a new window on the underlying neural activity. Front. Neurosci. 8:139. doi: 10.3389/fnins.2014.00139*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Buxton, Griffeth, Simon and Moradi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Corrigendum: Variability of the coupling of blood flow and oxygen metabolism responses in the brain: a problem for interpreting BOLD studies but potentially a new window on the underlying neural activity

*Richard B. Buxton<sup>1</sup>\*, Valerie E. M. Griffeth<sup>1</sup>, Aaron B. Simon<sup>1</sup>, Farshad Moradi<sup>1</sup> and Amir Shmuel<sup>2</sup>*

*<sup>1</sup> Department of Radiology, Center for Functional MRI, University of California, San Diego, La Jolla, CA, USA*

*<sup>2</sup> Departments of Neurology and Neurosurgery, Physiology and Biomedical Engineering, Montreal Neurological Institute Brain Imaging Centre, McGill University, Montreal QC, Canada*

*\*Correspondence: rbuxton@ucsd.edu*

#### *Edited and reviewed by:*

*Clare Howarth, The University of Sheffield, UK*

**Keywords: fMRI, BOLD, cerebral blood flow, cerebral metabolic rate of oxygen, inhibition**

#### **A corrigendum on**

**Variability of the coupling of blood flow and oxygen metabolism responses in the brain: a problem for interpreting BOLD studies but potentially a new window on the underlying neural activity**

*by Buxton, R. B., Griffeth, V. E. M., Simon, A. B., and Moradi, F. (2014). Front. Neurosci. 8:139. doi: 10.3389/fnins.2014.00139*

Through an oversight the author list of the published version of this paper failed to reflect the important contributions of Amir Shmuel. For all aspects of scientific attribution he should be considered to be the final author on the paper, so that the appropriate author list is: Richard B. Buxton, Valerie E. M. Griffeth, Aaron B. Simon, Farshad Moradi, and Amir Shmuel.

# **AUTHOR CONTRIBUTIONS**

Richard B. Buxton and Amir Shmuel conceptualized the hypothesis that CBF and CMRO2 may be driven differentially by excitatory and inhibitory activity and formulated experiments to test the idea. Valerie E. M. Griffeth, Aaron B. Simon, and Farshad Moradi conducted experiments and contributed ideas on how their results could be interpreted within the hypothesis. Richard B. Buxton wrote the manuscript, and all authors contributed to editing the final version.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 10 July 2014; accepted: 22 July 2014; published online: 23 September 2014.*

*Citation: Buxton RB, Griffeth VEM, Simon AB, Moradi F and Shmuel A (2014) Corrigendum: Variability of the coupling of blood flow and oxygen metabolism responses in the brain: a problem for interpreting BOLD studies but potentially a new window on the underlying neural activity. Front. Neurosci. 8:241. doi: 10.3389/fnins.2014.00241*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Buxton, Griffeth, Simon, Moradi and Shmuel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Partitioning two components of BOLD activation suppression in flanker effects

*Chien-Chung Chen<sup>1,2</sup>\**

*<sup>1</sup> Department of Psychology, National Taiwan University, Taipei, Taiwan*

*<sup>2</sup> Neurobiology and Cognitive Science Center, National Taiwan University, Taipei, Taiwan*

#### *Edited by:*

*Christopher W. Tyler, Smith-Kettlewell Institute, USA*

#### *Reviewed by:*

*Justin L. Gardner, RIKEN Brain Science Institute, Japan*

*Pavan Ramkumar, Aalto University, Finland*

#### *\*Correspondence:*

*Chien-Chung Chen, Department of Psychology, National Taiwan University, 1, Sec. 4, Roosevelt Rd., Taipei 106, Taiwan e-mail: c3chen@ntu.edu.tw*

The presence of a visual stimulus not only increases the blood oxygenation level dependent (BOLD) activation in its retinotopic regions in the visual cortex but also suppresses the activation of the nearby regions. Here we investigated whether there are multiple components for such lateral effects by using the m-sequence paradigm to measure the stimulus spatial configuration specific BOLD activation. The central target (2 cyc/deg grating) was centered on a fixation point while the flanking stimulus was placed 2° away and was located on axes that were either collinear or orthogonal to the target's orientation. Three types of flankers were used: gratings whose orientation was the same as the central stimulus, gratings which were orthogonal to the stimulus, and random dots. The onset and offset of each stimulus were determined by shifted copies of an 8-bit long m-sequence. The duration of each state of the sequence was 2 s, or 1 TR. The first order activation, computed as the waveform recorded following on-states minus that recorded after off-states, determined the retinotopic regions for each stimulus. We then computed BOLD activation waveforms for the target under various flanker conditions. All flankers reduced the activation to the target. The suppressive effect was largest following the presence of the iso-orientation collinear flankers. Our result suggests two types of BOLD signal suppression: general suppression, which occurs whenever a flanker is presented and is insensitive to the spatial configuration of the stimuli, and spatial configuration dependent suppression, which may be related to the collinear flanker effect.

**Keywords: collinearity, m-sequence, lateral interaction, flanker effect, spatial configuration**

# **INTRODUCTION**

The visual response to a stimulus can be modulated by another stimulus. For instance, in the Ebbinghaus effect, a target circle surrounded by large circles appears to be smaller than the same target surrounded by small circles; in simultaneous contrast (Wallach, 1948; Gilchrist, 2006), a patch of gray on a dark background appears brighter than the same patch on a bright background; and, in particular, in the flanker effect (Polat and Sagi, 1993, 1994; Chen and Tyler, 2001, 2008), the visibility of a low contrast periodic pattern (target) increases when it is flanked by collinear and iso-oriented patterns (flankers). Such lateral modulation of visual performance may have a neurophysiological basis. Whereas a visual cortical neuron only responds to visual stimuli projected onto its receptive field (Hubel and Wiesel, 1962; DeAngelis et al., 1993), this response can be modulated by the presence of other visual stimuli presented outside its classical receptive field (Blakemore and Tobin, 1972; Nelson and Frost, 1985; Knierim and Van Essen, 1992; Sillito et al., 1995; Polat et al., 1998; Sengpiel et al., 1998; Kapadia et al., 1999, 2000; Chen et al., 2001; Freeman et al., 2001; Angelucci et al., 2002; Cavanaugh et al., 2002).

In psychophysical or electrophysiological experiments, the lateral effect can be measured by comparing the visual performance or cell response to a visual target in the presence of a spatial context with those without a context. This approach, however, may not be directly applicable to an fMRI study. The blood oxygenation level dependent (BOLD) activation in the early visual cortex does show a retinotopic property, i.e., the activation of a particular voxel corresponds to the presence of visual stimuli at a certain location (Engel et al., 1997; Tootell et al., 1998). Thus, at first glance, it might be possible simply to measure the context effect by comparing the activation of a set of voxels to a target projected to their corresponding retinotopic locations on the display, with and without the presence of a context outside their corresponding retinotopic locations. Indeed, several fMRI studies have done just that (Zenger-Landolt and Heeger, 2003; Tajima et al., 2010; Wade and Rowland, 2010). However, the results from such an experimental paradigm may not reflect the true neural mechanisms of the context effect. For instance, it is known that the presence of a visual stimulus produces not only an increment of BOLD activation in the corresponding retinotopic regions for that stimulus but also a sustained reduction in BOLD activation in the neighboring brain regions (Logothetis, 2002; Shmuel et al., 2002, 2006; Smith et al., 2004; Chen et al., 2005). Such negative BOLD may not have a neurophysiological origin but, as discussed by Shulman et al. (1997) and Shmuel et al. (2002), it may be caused by "blood steal," i.e., the activation to the visual stimuli draws fresh blood to the corresponding retinotopic region, thus reducing it in the neighboring regions. The reduction of fresh blood, in turn, causes a reduction in BOLD activation. Hence, there is a possibility that the effect of a context stimulus on the activation of a target brain region is simply caused by the regions responsive to that context stimulus drawing blood away. That is, the measurement of the context effect in the target region may be contaminated by a hemodynamic cause and thus cannot reflect the nature of the lateral interactions between neural mechanisms.

The possibility of the involvement of a hemodynamic factor in the BOLD activation illustrates the risk of studying context effects. Even without the hemodynamic factor, the presentation of the context stimulus may cause an overall change in the neural activity due to, say, an increment in the stimulus size. Thus, the experimental result may tag a neural mechanism that is unrelated to the context effect in perception. To avoid this risk, a better strategy is to compare activation to stimuli that are known to cause a difference in perception.

The flanker effect is strongly configuration dependent. At the behavioral level, the detection threshold for a Gabor target is reduced by the presence of Gabor flankers only if the flanker has the same orientation (Polat and Sagi, 1993; Chen and Tyler, 2002) and is placed on the collinear axis of the target orientation (Polat and Sagi, 1993, 1994; Solomon and Morgan, 2000; Chen and Tyler, 2008). A flanker with an orthogonal orientation or which is placed away from the collinear axis has little, if any, effect on target detection. Electrophysiological evidence also shows that the response of a visual cortical neuron is best modulated if the context has an orientation similar to the preferred orientation of the cell (Blakemore and Tobin, 1972; Nelson and Frost, 1985) and is placed on the collinear axis (Kapadia et al., 2000).

Here, we exploited the configuration dependency of the flanker effect. We tested the BOLD activation of the brain region responsive to a central target in the presence of flankers with different orientations and locations. A BOLD activation caused by the visual context effect should show a dependency on the spatial configuration. That is, a flanker which has the same orientation and is placed on the collinear axis should produce the largest change in BOLD activation from that to the target alone. On the other hand, a lateral effect that is not related to the visual context effect, such as those with hemodynamic origins or an overall change in neural activity, should be indifferent to the spatial configuration of the stimuli.

In our experiment, there could be more than one stimulus component on the display. To separate the effect of different stimulus components, we used an M-sequence technique (Sutter, 2001; Buracas and Boynton, 2002) to control the experimental sequence. An M-sequence is a temporal binary (e.g., 0/1) random sequence that determines the state of a stimulus; in our experiment, the onset and the offset of the stimulus components. This type of binary sequence is generated in such a way as to consist of the same number of zero and one events and all possible combinations of zero and one events within a pre-designated length. That is, there is no bias on any states of the stimulus. Furthermore, an M-sequence also has the property that any temporal shift of the sequence is always orthogonal to the original sequence. Thus, one can assign each stimulus component to an M-sequence, which is a shift-copy of the M-sequences for other stimulus components. In this way, the occurrences of any stimulus component combinations, such as target alone, flanker 1 alone, target+flanker1, etc., are the same and therefore there is no bias toward any stimulus combination. In addition, since all M-sequences used in the experiments are orthogonal to each other, one can extract the effect of one stimulus component without being contaminated by the effect of the other components. With these properties, we were able to have multiple flankers in one fast event-related run and thus keep our experiment to a reasonable length.
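The properties listed above can be checked numerically. The sketch below generates a 255-state binary sequence from an 8-bit linear feedback shift register and tests its balance and its correlation with shifted copies. The feedback taps (corresponding to the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1), the seed, and the function name `m_sequence` are assumptions made for illustration, since the generating polynomial used in the experiment is not stated here. Because the sequence length is odd, the balance holds to within one event, and under ±1 coding the circular correlation with a shifted copy is −1 out of 255 rather than exactly zero.

```python
import numpy as np


def m_sequence(n_bits=8, taps=(4, 3, 2, 0), seed=1):
    """Binary maximum-length sequence of 2**n_bits - 1 states.

    Uses the recurrence s[k + n_bits] = XOR of s[k + t] for t in taps.
    The default taps are an ASSUMED choice (a primitive polynomial of degree 8).
    """
    length = 2 ** n_bits - 1
    s = [(seed >> i) & 1 for i in range(n_bits)]  # non-zero initial register
    for k in range(length - n_bits):
        s.append(sum(s[k + t] for t in taps) % 2)
    return np.array(s, dtype=int)


seq = m_sequence()
n_on, n_off = int(seq.sum()), int((1 - seq).sum())

# Circular correlation of the +/-1 coded sequence with shifted copies of itself.
pm = 2 * seq - 1
corr = {shift: int(np.dot(pm, np.roll(pm, shift))) for shift in (0, 64, 128)}

print(f"length={len(seq)}, on-states={n_on}, off-states={n_off}")
print(f"correlation by shift: {corr}")
```

A correlation of −1 against a peak of 255 is what allows the response to one stimulus component to be extracted with negligible contamination from components driven by shifted copies.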

# **METHODS**

#### **PARTICIPANTS**

Eight healthy volunteers, aged from their early 20s to their early 40s, participated in this study. One participant was the author of this paper while the others were naïve to the purpose of the experiment and were compensated financially for the hours of the experiment. Informed consent was obtained from each participant before scanning. The experiment was approved by the IRB of the National Taiwan University Hospital.

#### **EQUIPMENT AND DATA ACQUISITION**

All stimuli were delivered with MR-compatible goggles (Resonance Technology, USA) mounted on the head of the participants. The resolution of the goggles was 800 × 600 with a dot size of 0.096° visual angle. The frame rate was 60 Hz. All the stimuli were generated on a PC-compatible computer with the Psychophysics Toolbox (Brainard, 1997) under the MATLAB (The Mathworks, Natick, MA, USA) environment. The visual acuity of the participants was corrected to normal by a set of convex lenses mounted on the goggles, in front of the display.

The magnetic resonance images were collected on a Bruker 30/90 Medspec 3T scanner (Bruker Medical, Ettlingen, Germany) with a cylindrical head coil. The functional images (T2*-weighted BOLD) were acquired with an echo-planar imaging sequence (Stehling et al., 1991) with *TR* = 2000 ms, *TE* = 33 ms, flip angle = 90°, and voxel resolution = 3 × 3 × 3 mm. The images were collected in 20 transverse planes parallel to the AC-PC (anterior commissure-posterior commissure) line with a 19.2 cm FOV and an image matrix of 64 × 64. A set of anatomical images (T1-weighted, 256 × 256) was acquired in identical planes.

For the functional data, before statistical analysis, we first used SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/) software to correct for the timing difference between slices in a volume, and realigned the images acquired at different time points to remove head motion artifacts. The realigned images, as well as the anatomic images, were then normalized to a standard template with SPM8. The normalized images were fed to the mrVista software (Wandell et al., 2000) for co-registration and visualization after statistical analysis.

#### **STIMULUS**

As shown in **Figure 1**, there were three components in a stimulus. The first component was the central stimulus, or the target, which was a sinusoidal grating with a 45° orientation, presented through a circular aperture with a 2° radius. The second component was a collinear flanker located on the two ends of the target along the axis that passed through the center and was parallel to the orientation of the target. The third component was a side flanker located on the axis orthogonal to the orientation of the target. The flankers were either a sinusoidal grating or random dots presented through a fan aperture. The aperture in each quadrant extended from 2.5 to 6° visual angle from the center of the display in radius and spanned 70° in azimuth.

There were three types of stimuli. In the iso-orientation condition, the flankers contained sinusoidal gratings at a 45° orientation. In the orthogonal condition, the flankers contained sinusoidal gratings at a 135° orientation. The gratings had a spatial frequency of 4 cyc/deg and a contrast of 95%. In the random dot condition, the flankers were random dots whose luminance was drawn from a uniform distribution that had the same range and mean as the luminance distribution of the pattern stimuli. All stimulus components were presented on a gray background of mean luminance.

#### **PROCEDURE**

We used a fast event-related design. The stimulus was updated every 2 s (1TR). The sequence of the presentation of image components was determined by m-sequences. The method of generating m-sequences has been discussed by Sutter (2001). We used 8-bit m-sequences for the experiment. The duration of each state was 2 s. We used three shift-copies of the same sequence in each run. The original sequence controlled the onset and offset of the central target. The second sequence, which was constructed by shifting the first sequence by 64 states, controlled the collinear flanker. The third sequence, which was constructed by shifting the first sequence by 128 states, controlled the side flanker. The state value 1 meant that the image component was presented, while the state value 0 meant that it was not presented. In each state, the stimulus was presented for 1 s, followed by a 1 s blank in which only the gray background of mean luminance and the fixation point were shown on the display. All stimulus components, when presented, counter-phase flickered (that is, the luminance of each pixel alternating between positive and negative polarity about the mean luminance) at 4 Hz.
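The sequence construction described above can be sketched with a linear feedback shift register. This is only an illustration under stated assumptions: the feedback polynomial (x^8 + x^4 + x^3 + x^2 + 1), the seed, and the function names `m_sequence` and `shifted` are hypothetical choices, since the text does not say which generator was used. In this standard construction an 8-bit register yields 2^8 − 1 = 255 states; the text does not specify how the sequence relates to the 256 states of a run, so the sketch stops at 255.

```python
def m_sequence(n_bits=8, taps=(4, 3, 2, 0), seed=1):
    """Binary maximum-length sequence from a linear feedback shift register.

    The recurrence is s[n + n_bits] = XOR of s[n + k] for k in taps. The
    defaults correspond to x^8 + x^4 + x^3 + x^2 + 1, an assumed polynomial.
    """
    # Initial register contents: the n_bits lowest bits of `seed`.
    state = [(seed >> i) & 1 for i in range(n_bits)]
    if not any(state):
        raise ValueError("seed must give a non-zero register")
    seq = []
    for _ in range(2 ** n_bits - 1):
        seq.append(state[0])            # emit the oldest bit
        feedback = 0
        for k in taps:
            feedback ^= state[k]        # XOR of the tapped positions
        state = state[1:] + [feedback]  # shift the register by one state
    return seq


def shifted(seq, shift):
    """Cyclically shifted copy: element i is seq[(i + shift) % len(seq)]."""
    shift %= len(seq)
    return seq[shift:] + seq[:shift]


# 1 = component presented in that 2 s state, 0 = component absent.
target = m_sequence()
collinear = shifted(target, 64)   # collinear flanker: shift of 64 states
side = shifted(target, 128)       # side flanker: shift of 128 states
```

Because shifted copies of one m-sequence are nearly uncorrelated with each other, the three components can be presented concurrently and still be separated in the analysis.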

A circular fixation point (0.26° diameter) was placed at the center of the display throughout the experiment. At the beginning of each state, there was a 1/10 chance that the color of the fixation would change from red to green or *vice versa*. The observer was to press a button to indicate the change in fixation color. All observers achieved at least 80% accuracy in this fixation task.

For each participant, there were three functional runs, one for each of the iso-orientation, orthogonal orientation, and random dot conditions respectively. Each run started with a 6 s (3 TR) blank period followed by 256 (2<sup>8</sup>) m-sequence states (512 s). The data from the first 6 s were not included in data analysis to avoid the start transient. The order of the three functional runs was randomized for each observer.

# **RESULT**

#### **THE FIRST ORDER ACTIVATION AND ROI SELECTION**

**Figure 2** shows the first order activation for each stimulus component on flat maps for one observer. The flat maps here had their center near the occipital poles and extended 80 *mm* in radius around that point. The areas delineated by colored borders are the first-tier retinotopic areas (V1–3), identified from rotating-wedge data acquired for that observer in a previous study (Chen et al., 2007).

The first order activation here is the BOLD activation to the presence of a stimulus component. The first order activation from an m-sequence can be extracted with a linear regression method (Buracas and Boynton, 2002). We first convolved the sequence for each image component with a difference-of-gamma (DOG) hemodynamic response function,

$$g(t) = w_1 \times \left[ (t_1/\alpha_1)^{\beta_1} \times e^{-t_1/\alpha_1} \right] - w_2 \times \left[ (t_2/\alpha_2)^{\beta_2} \times e^{-t_2/\alpha_2} \right] \tag{1}$$

where *t* is time in seconds, *t*<sub>1</sub> = *t* − 6 and *t*<sub>2</sub> = *t* − 12. The values of the parameters, given by Chen and Tyler (2008), are α<sub>1</sub> = 5.4, α<sub>2</sub> = 10.8, β<sub>1</sub> = 6, β<sub>2</sub> = 12, *w*<sub>1</sub> = 1 and *w*<sub>2</sub> = 0.35. Those parameters were shown (Chen and Tyler, 2008) to provide a good fit to the hemodynamic response function following a 1 s sensory stimulation measured by Glover (1999). The convolved sequences, along with a unity vector (a column of ones), were used as regressors. In this way, we were able to acquire the baseline activation (the regression coefficient for the unity vector) and the activation amplitude to the presence of each of the three image components for each voxel. The activation of a voxel to an image component was considered significant if the *t*-statistic of the regression coefficient for the corresponding sequence reached 4.72. This criterion was equivalent to a two-tailed α-level of about 10<sup>−6</sup> for each individual voxel and a Bonferroni-corrected α-level of 0.01, based on the number of gray matter voxels.
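The regression step can be illustrated with a small, noise-free simulation. Everything in this sketch is hypothetical: the on/off sequences are random bits rather than the m-sequences of the experiment, `kernel` is an arbitrary response shape sampled once per TR rather than Equation (1), and `convolve` and `least_squares` are illustrative helpers, not the software used in the study. It shows only how one amplitude per image component, plus a baseline, falls out of a single least-squares fit.

```python
import random


def convolve(seq, kernel):
    """Causal convolution of seq with kernel, truncated to len(seq) samples."""
    return [
        sum(kernel[k] * seq[n - k] for k in range(len(kernel)) if n - k >= 0)
        for n in range(len(seq))
    ]


def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


rng = random.Random(0)
n_states = 255
# Hypothetical on/off sequences: target, collinear flanker, side flanker.
events = [[rng.randint(0, 1) for _ in range(n_states)] for _ in range(3)]
# Hypothetical response kernel, one sample per TR (not Equation 1).
kernel = [0.0, 0.3, 1.0, 0.8, 0.3, -0.1, -0.2, -0.1]
regressors = [convolve(e, kernel) for e in events]
# Design matrix: three convolved sequences plus a column of ones.
X = [[r[n] for r in regressors] + [1.0] for n in range(n_states)]

# Simulated voxel: three component amplitudes and a baseline, no noise.
true_beta = [1.0, -0.3, -0.2, 100.0]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]
beta_hat = least_squares(X, y)
```

With real data the fit would also yield a standard error for each coefficient, from which the *t*-statistic compared against the 4.72 criterion is formed.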

The central target produced activation in the foveal confluence region (**Figure 2A**). The two flankers, on the other hand, produced activation in the peripheral region (**Figures 2B,C**). The regions activated by the central target were used as regions of interest (ROIs) for the subsequent analysis. These ROIs responded little to the flankers alone. As shown in **Figure 2**, the areas activated by the flankers (**Figures 2B,C**) had no overlap with these ROIs. The amplitude of the BOLD activation in these ROIs to flankers never reached a statistically significant level (α-level 0.01). Hence, our result in the foveal ROIs cannot be explained by an intrusion from the flankers. Notice that, since it is difficult, if not impossible, to separate the foveal responses in different early visual areas, we opted to treat all voxels activated by the target in the early visual cortex in each hemisphere as one ROI.

#### **LATERAL EFFECTS ON BOLD ACTIVATION**

**Figure 3** shows the BOLD activation produced by the presence of the target in the various flanker conditions in the left and right hemisphere ROIs. In each panel, blue symbols and curves denote the BOLD activation following the stimulus events in which only the target was presented; red symbols and curves, the target and the collinear flanker; and magenta symbols and curves, the target and the side flankers. The smooth curves are fits of the DOG function [Equation (1)] with the amplitude *w*<sub>1</sub> as the free parameter. The BOLD activation for each voxel was the time-locked average following a specific event. The activation was then averaged across all voxels in an ROI, before being averaged across participants. The error bars denote the standard error of individual difference.

FIGURE 3 | BOLD activation to the target in the various flanker conditions, averaged across participants and voxels. The left and right columns show activations in the left and right hemisphere ROIs respectively. In each panel, blue symbols and curves denote the target presented alone; red symbols and curves, the target and the collinear flanker; and magenta symbols and curves, the target and the side flankers. The smooth curves are fits of the difference-of-gamma function Equation (1). The error bars denote one standard error of individual difference.
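The averaging just described (time-locked to an event, then across voxels, then across participants) can be sketched as follows. The time course, onsets and per-participant peak values are made-up numbers for illustration, and `time_locked_average` and `mean_and_sem` are hypothetical helper names.

```python
def time_locked_average(signal, onsets, window):
    """Average the `window` samples that follow each onset index."""
    segments = [signal[t:t + window] for t in onsets if t + window <= len(signal)]
    if not segments:
        raise ValueError("no complete segment available")
    return [sum(seg[k] for seg in segments) / len(segments) for k in range(window)]


def mean_and_sem(values):
    """Mean and standard error of the mean across participants."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, (var / n) ** 0.5


# Hypothetical voxel time course (one sample per TR) with a response
# added after each event onset.
signal = [0.0] * 40
onsets = [2, 14, 26]
for t in onsets:
    for k, amp in enumerate([0.0, 1.0, 2.0, 1.0]):
        signal[t + k] += amp

avg = time_locked_average(signal, onsets, window=4)

# Hypothetical peak amplitudes of three participants (ROI averages).
peak_mean, peak_sem = mean_and_sem([2.0, 1.0, 3.0])
```

In the experiment an "event" is a particular combination of image components (for example, target alone, or target plus collinear flanker), so the onsets would be the states in which that combination occurred.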

Without flankers, the BOLD activation to the central target in the ROIs showed a typical biphasic shape and peaked at 6 s after stimulus onset. The presence of the flankers reduced the amplitude of the BOLD activation to the target. In the iso-orientation condition, with the side flankers, the peak activation dropped on average by 34% [*t*(7) = 3.07, *p* = 0.009] and 27% [*t*(7) = 3.21, *p* = 0.007] in the left and right hemisphere respectively. The activation with the collinear flankers was only half of that without flankers [*t*(7) = 4.02, *p* = 0.003 for the left and *t*(7) = 4.39, *p* = 0.002 for the right hemisphere]. Thus, while the presence of either flanker reduced the peak activation, the effect was greater in the collinear flanker condition than in the side flanker condition. The difference between the collinear and the side flanker was significant in both the left [*t*(7) = 2.87, *p* = 0.01] and the right hemisphere [*t*(7) = 1.98, *p* = 0.04]. Notice that the flankers also reduced the BOLD activation in the undershoot region of the waveform. However, there was no systematic difference between the side and the collinear flankers in this respect.

In the orthogonal orientation condition (**Figure 3B**), the presence of either collinear or side flankers reduced the BOLD activation to the target. However, there was little, if any, difference in activation amplitude between the two flanker conditions. The result for the random dot condition (**Figure 3C**) was similar to that of the orthogonal orientation condition. That is, the presence of the flankers reduced the BOLD activation to the target by a similar amount regardless of the location of the flankers.

To summarize our results, **Figure 4** shows the peak activation in all flanker conditions. As shown above, the flanker location effect, or the activation difference produced by the collinear and side flankers, was significant only in the iso-orientation condition. There was little, if any, difference in activation amplitude between the two flanker locations in either the orthogonal or the random dot condition. The orientation effect, or the activation difference produced by the iso-orientation and orthogonal flankers, was pronounced at the collinear location. The difference was statistically significant in the left hemisphere [*t*(7) = 2.01, *p* = 0.04] but did not reach significance in the right hemisphere [*t*(7) = 1.58, *p* = 0.08]. There was no orientation effect at the side location.

#### **DISCUSSION**

Despite a very short time interval between events (2 s), we were able to obtain a reliable measurement of BOLD activation to stimulus components (**Figure 2**) and various combinations of them (**Figure 3**) with m-sequences. Hence, the m-sequence technique is indeed a useful and efficient tool for measuring brain activity to multiple visual inputs with fMRI.

In this study, we showed that BOLD activation to a target in the early visual cortical regions was suppressed by flankers presented outside the corresponding retinotopic locations of those regions. Such suppression occurred regardless of the orientation (iso-orientation, orthogonal orientation), composition (grating or random dot) or location (collinear or side) of the flankers. The suppression effect was greatest when the iso-orientation flankers were presented at the collinear location. Other than the iso-orientation collinear flankers, the suppression effect from all other flankers was similar. Hence, there seem to be at least two types of lateral suppression in the early visual cortex: one is a general suppression that occurs whenever a stimulus component is presented and the other is a spatial configuration specific suppression that occurs only when the iso-orientation collinear flankers are presented.

The configuration specific effect is consistent with the well-known collinear lateral interaction phenomenon, that is, that the visibility of a target periodic pattern can be altered by the presence of an iso-orientation flanker whose stripes are collinear with those of the target (Polat and Sagi, 1993, 1994; Zenger and Sagi, 1996; Solomon et al., 1999; Chen and Tyler, 2001, 2002, 2008). This collinear flanker effect is reduced as the orientation of the flanker deviates from that of the target (Polat and Sagi, 1993; Chen and Tyler, 2002), or as the flankers move away from the collinear axis toward the sides (Polat et al., 1997; Solomon et al., 1999; Chen and Tyler, 2008). Single cell recording also shows similar configuration effects (Polat et al., 1998; Kapadia et al., 2000). There is also anatomic evidence showing that V1 neurons send their fibers to contact V1 neurons in other hypercolumns with the same orientation preference (Bosking et al., 1997). Hence, there is convergent evidence for a collinear lateral interaction that is reflected in our configuration specific effect.

Many psychophysics studies demonstrate the collinear lateral interaction by showing that the detection threshold to the target decreases with the presence of collinear flankers (Polat and Sagi, 1993; Huang et al., 2012). That is, the effect of the flankers is to facilitate target detection. At first glance, this collinear facilitation contradicts our suppressive effect. However, it is known that collinear lateral interaction is contrast dependent. Polat et al. (1998, also see Chen et al., 2001) showed that the presence of collinear flankers not only increased the firing rate of the primary visual cortical neurons at low target contrast, but also decreased it at high contrast. At the behavioral level, indeed, the presence of collinear flankers reduces contrast detection and discrimination thresholds at low contrasts. However, it also increases the contrast discrimination threshold at high contrast, suggesting a reduction of internal response to the target by the flankers (Chen and Tyler, 2001, 2008; Wu and Chen, 2010). That is, the collinear lateral interaction is suppressive at high contrasts. Our stimuli had a contrast of 80%, well into the suppressive range reported in the previous studies (Polat et al., 1998; Chen and Tyler, 2001).

Some may argue that our collinear effect may be due to a preference for radial orientation in the visual cortical activation. That is, the BOLD activation of the visual cortex to a pattern whose orientation points to the fixation (radial) is greater than that whose orientation is orthogonal to the radial orientation (Sasaki et al., 2006; Freeman et al., 2013). In our experiment, the iso-orientation collinear flanker was a radial stimulus while the iso-orientation side flanker was not. Hence, the larger lateral effect produced by the collinear flanker might just reflect the greater cortical activation to the iso-orientation collinear flankers. However, notice that the iso-orientation collinear flankers were not the only radial stimuli in our experiment. The orthogonal side flankers were also radial stimuli. Yet, we found no difference in the lateral effect produced by the side flankers (radial) and the collinear flankers (not radial) in the orthogonal condition. Hence, the radial bias of the cortical response cannot explain our result.

Different factors may underlie the general lateral suppression in our result. It is known that the presence of a visual stimulus not only produces an increment of BOLD activation in the corresponding retinotopic brain regions for that stimulus; there is also a sustained reduction in BOLD activation in the neighboring brain regions (Logothetis, 2002; Shmuel et al., 2002; Smith et al., 2004; Chen et al., 2005). One hypothesis is that negative BOLD activation may be of hemodynamic origin (Shulman et al., 1997; Shmuel et al., 2002). For instance, the presence of a visual stimulus could increase the activation of certain cortical regions, which in turn would lead to an increment of cerebral blood flow (CBF) to those cortical regions. This local increment in CBF could result in a redistribution of blood and thus a decrement of CBF in neighboring cortical regions. As a result, one may observe a decrement in BOLD activation in voxels corresponding to the visual fields outside the stimulus. Recent evidence, however, is against this "blood steal" theory. Shmuel et al. (2002, 2006) show that negative BOLD activation is correlated with the local field potential, suggesting a neural origin. Smith et al. (2004) found that negative BOLD activation can occur in a different hemisphere from the one with positive activation. Such extended signal reduction is unlikely to be hemodynamic in origin, given different blood vessels supplying the two hemispheres.

There is also evidence that the general lateral suppression of BOLD activation may be caused by the response of broadly tuned visual mechanisms. It is known that after staring at a gray region surrounded by a dynamic patterned background (adapter), observers perceive a twinkling aftereffect in the location of the gray region when the pattern stimulus is removed (Ramachandran and Gregory, 1991; Hardage and Tyler, 1995). That is, the aftereffect is induced in a region that had never received any stimulation during either the adapting or the test phases. Chen et al. (2005) showed that negative BOLD activation is positively correlated with the aftereffect. That is, while the BOLD activation in the stimulated brain region went up and down with the onset and offset of the visual stimulus respectively, the activation in the unstimulated region actually decreased after the stimulus onset and rebounded after the stimulus offset. Furthermore, the amplitude of the rebound in the unstimulated region increased with the strength of the aftereffect. Thus, such negative BOLD activation should reflect the lateral inhibition in the visual system. Notice that the percept of the twinkle aftereffect is similar regardless of the pattern of the adapter. Hence, such lateral inhibition can be induced by a wide range of stimuli.

Several studies (Zenger-Landolt and Heeger, 2003; Tajima et al., 2010; Wade and Rowland, 2010) have measured the BOLD activation to a central grating surrounded by another grating. The common result was that the BOLD activation to the target can be suppressed by the presence of a surrounding ring. For a more quantitative analysis of this surround effect, Wade and Rowland (2010) measured the BOLD activation to targets of various contrasts and found that their result could be fit with a model assuming a broadly tuned lateral interaction mechanism. These broadly tuned lateral interactions are consistent with the general lateral suppression we found in this study.

With a model-based approach, Zuiderbaan et al. (2012) and Greene et al. (2014) showed that the BOLD activation in V1–3 to a visual stimulus can be best described by a model with a population receptive field (i.e., the receptive field of a unit of gray matter) with excitatory and inhibitory regions. This result may imply a lateral interaction among neural mechanisms. Notice that their results were based on an analysis of single voxels while our result was manifested in ROIs with dozens of voxels. Given the difference in scale, it is difficult to make a direct comparison between the two sets of results. A further model that can relate the activation of a single voxel to that of a group of voxels is needed before we can have a comprehensive treatment of the results from these different paradigms.

In conclusion, the presence of any flankers can produce a suppressive effect on BOLD activation to the central stimulus. Furthermore, it is the iso-oriented collinear flankers that create the greatest suppression. Thus, our results suggest two types of lateral suppression in BOLD activation: the first is a general suppression, which may relate to a neural mechanism with a broad tuning property, such as the one underlying "negative BOLD," and the second is a spatial configuration dependent suppression, which may be related to the collinear flanker effect.

#### **ACKNOWLEDGMENTS**

Supported by NSC 99-2410-H-002-081-MY3 and 102-2420-H-002-018-MY3 to Chien-Chung Chen. We thank Dr. Chen, Jyh-Hung and the NTU MRI/MRS Laboratory for allowing us to use the MRI scanner and Ms. Tseng, Runng-Yu for assisting in data collection.

#### **REFERENCES**


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 14 March 2014; accepted: 23 May 2014; published online: 08 July 2014. Citation: Chen C-C (2014) Partitioning two components of BOLD activation suppression in flanker effects. Front. Neurosci. 8:149. doi: 10.3389/fnins.2014.00149 This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# On tests of activation map dimensionality for fMRI-based studies of learning

Juemin Yang<sup>1</sup> \*, Lior Shmuelof <sup>2</sup> , Luo Xiao<sup>1</sup> , John W. Krakauer <sup>3</sup> and Brian Caffo<sup>1</sup>

*<sup>1</sup> Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA, <sup>2</sup> Department of Brain and Cognitive Sciences, Ben-Gurion University of the Negev, Beersheba, Israel, <sup>3</sup> Departments of Neurology and Neuroscience, Johns Hopkins University, Baltimore, MD, USA*

#### Edited by:

*Russell A. Poldrack, University of Texas, USA*

#### Reviewed by:

*Lei Wang, Northwestern University Feinberg School of Medicine, USA Anand Joshi, University of Southern California, USA Christopher W. Tyler, Smith-Kettlewell Eye Research Institute, USA*

#### \*Correspondence:

*Juemin Yang, Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street Baltimore, MD 21205, USA juyang@jhsph.edu*

#### Specialty section:

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience*

Received: *13 December 2013* Accepted: *26 February 2015* Published: *14 April 2015*

#### Citation:

*Yang J, Shmuelof L, Xiao L, Krakauer JW and Caffo B (2015) On tests of activation map dimensionality for fMRI-based studies of learning. Front. Neurosci. 9:85. doi: 10.3389/fnins.2015.00085*

A methodology for investigating learning is developed using activation distributions, as opposed to standard voxel-level interaction tests. The approach uses tests of dimensionality to consider the ensemble of paired changes in voxel activation. The developed method allows for the investigation of non-focal and non-localized changes due to learning. In exchange for increased power to detect learning-based changes, this procedure sacrifices the localization information gained via voxel-level interaction testing. The test is demonstrated on an arc-pointing motor task for the study of motor learning, which served as the motivation for this methodological development. The proposed framework considers activation distribution, while the specific proposed test investigates linear tests of dimensionality. This paper includes: the development of the framework, a large scale simulation study, and the subsequent application to a study of motor learning in healthy adults. While the performance of the method was excellent when model assumptions held, complications arose in instances of massive numbers of null voxels or varying angles of principal dimension across subjects. Further analysis found that careful masking addressed the former concern, while an angle correction successfully resolved the latter. The simulation results demonstrated that the study of linear dimensionality is able to capture learning effects. The motivating data set used to illustrate the method evaluates two similar arc-pointing tasks, each over two sessions, with training on only one of the tasks in between sessions. The results suggest different activation distribution dimensionality when considering the trained and untrained tasks separately. Specifically, the untrained task evidences greater activation distribution dimensionality than the trained task. However, the direct comparison between the two tasks did not yield a significant result. The nature of the indication for greater dimensionality in the untrained task is explored and found to be non-linear variation in the data.

Keywords: canonical variates analysis, cognitive learning, BOLD fMRI, statistical parametric mapping, interaction test

# 1. Introduction

This manuscript considers settings where task-related activation may be present before and after learning, yet the distribution of activated voxels changes. For context, consider the motivating study for the work, where two motor tasks of equal difficulty were performed in a scanner over two sessions. Training for one of the tasks occurred in between the sessions, while the other task served as a control. Current methodology would use random effects statistical parametric mapping (SPM; Friston et al., 2011) to test for a differential effect of training between tasks to study learning. However, this approach suffers from considering only voxel-level activation, or change in activation, in isolation. In contrast, learning may induce changes in activation distribution, i.e., the distribution of intensities of BOLD responses to the paradigm. Moreover, the study of activation distributions offers many potential benefits over voxel-level testing, including: the elimination of multiplicity concerns, robustness to registration, and sensitivity to hypotheses of particular interest in the study of learning.

Analysis of dimensionality of fMRI task-based activation maps (Worsley et al., 1997; Zarahn, 2002) provides a starting framework. The proposed procedure considers the distribution of activation maps and tests their dimensionality using eigenvalue decompositions. To illustrate the goals of the test, consider our motivating example. Learning could manifest itself in many ways in the collection of voxels that are activated. For example, BOLD contrast estimates of the activated voxels could be identical across sessions, increased or decreased, change from activated to not (and vice versa) or uncorrelated. The test of dimensionality should be considered one of several possible probes to interrogate such hypotheses.

Our investigation includes a large scale simulation study of brain activation maps. The simulation results demonstrate that the study of dimensionality in a framework similar to Zarahn (2002) is able to capture learning effects. The motivating data set is used to illustrate the method, which is applied to the trained and untrained tasks separately and then jointly.

# 2. Methods

Subjects performed an fMRI motor task in two scanning sessions, with training between them. A second, similarly difficult, fMRI motor task was performed at the two sessions, but had no training in between. We focus on activation maps within an appropriately selected spatial mask, such as one encapsulating the primary motor cortex. Let $\hat{\gamma}_{ijk}(v)$ be the subject- (represented by index $i = 1, \ldots, N$), session- ($j = 1, 2$), task- ($k = 1, 2$) and voxel- ($v = 1, \ldots, V$) specific estimates of task activation. These are obtained by voxel-wise regression of an HRF-convolved task paradigm in registered space (see Lindquist, 2008; Lindquist et al., 2009, for descriptions and discussion), conducted separately for each subject's visit.

This paper is concerned with the statistical analysis of, and hypotheses associated with, the collection of subject-specific activation maps, represented by the $V \times 2$ matrix $\hat{\Gamma}_{ik} = \{\hat{\gamma}_{i1k}(v), \hat{\gamma}_{i2k}(v)\}_{v=1}^{V}$.

A conceptual model is considered where the activation maps are estimates of assumed true activation maps, $\Gamma_{ik} = \{\gamma_{i1k}(v), \gamma_{i2k}(v)\}_{v=1}^{V}$. Thus, variation in the elements of $\Gamma_{ik}$ is (intra-subject) biological variation in the hemodynamic BOLD response to the paradigm. In contrast, variation in $\hat{\Gamma}_{ik}$ includes this biological variation, as well as all of the variation and biases that occur in the practical process of computing the BOLD paradigm response.

Both $\hat{\Gamma}_{ik}$ and $\Gamma_{ik}$ also vary across subjects. Consider the $V \times 2$ matrix $A_k = \{\beta_{1k}(v), \beta_{2k}(v)\}_{v=1}^{V}$ as representing the population average of voxel-level activation. Here $\beta_{jk}(v) = E(\gamma_{ijk}(v))$, $j = 1, 2$. A non-zero $\beta_{jk}(v)$ indicates that, on average, subjects activated at that particular location. Treating $v$ as being meaningfully consistent across subjects requires that appropriate template-based (or equivalent) registration has been performed. The matrix $\hat{A}_k$ is thus a data-level estimate of $A_k$, obtained by taking empirical means across subjects at each voxel.

A straightforward investigation of learning for the first (trained) task arises from a sharp null hypothesis test of:

$$H_0: \beta_{21}(v) - \beta_{11}(v) = \beta_{22}(v) - \beta_{12}(v),$$

conducted separately, voxel-by-voxel. This tests the difference in the longitudinal change in the BOLD response between the trained and untrained tasks. Comparing longitudinal learning effects with a reference (untrained) task addresses non-learning based biases across sessions. The test in question is normally conducted with standard interaction tests—perhaps accounting for subject-level correlation (see Diggle et al., 2002, for a general treatment of correlated data). Typically, the test is performed separately at each voxel, via so-called Statistical Parametric Mapping (SPM). Significance is usually ascertained with super-threshold voxel level statistics using random field theory (see Friston et al., 2011, and the references therein) or via resampling statistics (Nichols and Holmes, 2001).
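One simple way to form such an interaction statistic at a single voxel is a one-sample *t* test on each subject's difference of session differences. The sketch below is a hedged illustration, not the SPM implementation: the function name `interaction_t` and the numbers are hypothetical, and it ignores the multiplicity correction across voxels that random field theory or resampling would supply.

```python
import math


def interaction_t(gamma):
    """One-sample t statistic for the session-by-task interaction at one voxel.

    gamma[i] = (g11, g21, g12, g22) holds subject i's activation estimates,
    indexed as (session, task): g21 is session 2 of task 1 (trained), etc.
    Returns the t statistic and its degrees of freedom.
    """
    # Per-subject contrast: change in the trained task minus change in the
    # untrained task; its expectation is zero under H0.
    contrasts = [(g21 - g11) - (g22 - g12) for g11, g21, g12, g22 in gamma]
    n = len(contrasts)
    mean = sum(contrasts) / n
    var = sum((c - mean) ** 2 for c in contrasts) / (n - 1)
    return mean / math.sqrt(var / n), n - 1


# Hypothetical estimates for four subjects at a single voxel.
gamma = [
    (1.0, 2.0, 1.0, 1.0),
    (0.5, 2.5, 0.5, 1.5),
    (1.0, 4.0, 1.0, 1.0),
    (2.0, 4.0, 1.0, 1.0),
]
t_stat, dof = interaction_t(gamma)
```

Working with per-subject contrasts is one way of accounting for the subject-level correlation mentioned above, since each subject contributes a single number to the test.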

This SPM approach has several benefits for the study of learning. However, it also has limitations. Notably, the approach suffers from multiplicity issues and concentrates only on focal and localized interaction hypotheses, one voxel at a time. Moreover, it is highly dependent on accurate co-registration across subjects. Little information is gained from the ensemble of voxels, except through smoothing during preprocessing.

As an alternative, consider examining the activation distribution. Let $D = A_2 - A_1 = \{\beta_{21}(v) - \beta_{11}(v), \beta_{22}(v) - \beta_{12}(v)\}$ be the $V \times 2$ matrix of longitudinal changes in the contrasts of interest, with its associated estimate, $\hat{D}$. The SPM approach tests whether the two entries of each row of $D$ are the same. Suppose one instead assumes that elements of $D$ arise from a bivariate distribution and interest is in the ensemble of voxel-specific pairs, instead of individual voxels.

**Figure 1** is a conceptual diagram showing possible shapes associated with the distribution of voxel pairs. The conceptual model is informed by the idea of Gaussian mixture models (see McLachlan and Peel, 2000, for an introduction). The mixture model is governed by four major areas: (A) voxels that were "activated" (had a change across sessions) only in the trained task, (B) voxels that were activated in both tasks, (C) voxels that were activated only in the untrained task, and (D) voxels that were not activated in either task.

It is the shape of (B) that is of primary interest. For instance, any shift in (B) above the diagonal line represents training-based learning. If the shape is spherical, there is no correlation between training status and change in activation across sessions. In contrast, the more ellipsoidal the shape, the greater the correlations in activation extent across sessions.

FIGURE 1 | Conceptual diagram for fMRI activation distributions based on the motivating study of motor learning. Shaded areas represent learning-based (inter-session) differences between a trained (Y axis) and untrained (X axis) task. Across all panels, Area (A) represents voxels with change in activation across sessions only in the trained task, (B) represents voxels with change in activation across sessions in both the trained and untrained task, (C) represents voxels with change in activation across sessions only in the untrained task, and (D) represents no change in activation for both tasks. The four panels (I–IV) represent different potential shapes of the activation distributions for (B), with (I, II) showing a two dimensional shape and (III, IV) showing an approximately one dimensional shape. In (I, III) inter-session differences are symmetrically represented, whereas in (II, IV) one task had a uniformly greater increase.

While acknowledging that SPM operates voxel-by-voxel, and that **Figure 1** displays voxel groups, the SPM approach would investigate each point's distance from the diagonal line, assessing significance relative to inter-subject variability. Therefore, given enough data, the SPM approach would conceptually reject for voxels in groups (A) and (C) in the cases represented by all panels. However, it would reject most of the voxels in group B in panels II and IV only. The approach would reject few of the voxels in (B) for panels I and III. Contrast this with the shape and dimensionality of (B) being constant for panels I and II together and III and IV together. Thus, to the extent that learning represents itself as changes in the shape of the activation distribution, the voxel-wise approach would not tell the complete story.

Instead, we view the shape of the bivariate distributions of points in group (B) as informative for studying changes in task activation. One key attribute is its intrinsic dimensionality (1 vs. 2 dimensional). Ignoring groups (A), (C), and (D), one would conclude that (B) is two dimensional in panels I and II and intrinsically one dimensional in III and IV. The dimensionality of (B) is useful for differentiating whether changes in intensity or distribution account for activation changes following learning.
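As a rough illustration of intrinsic dimensionality, the sketch below generates two hypothetical clouds of voxel pairs, one scattered in two dimensions (as in panels I and II) and one lying close to a line (as in panels III and IV), and compares the share of variance carried by the first principal component. The cloud parameters and the helper name `first_pc_share` are assumptions chosen for illustration; they are not taken from the study.

```python
import numpy as np

def first_pc_share(points):
    """Fraction of total variance carried by the first principal component."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / (len(points) - 1)
    eigvals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    return eigvals[-1] / eigvals.sum()

rng = np.random.default_rng(0)
n_voxels = 500  # hypothetical size of group (B)

# Two dimensional cloud: independent changes in the two tasks.
cloud_2d = rng.normal(loc=2.0, scale=1.0, size=(n_voxels, 2))

# Approximately one dimensional cloud: changes tied together along a line.
t = rng.uniform(1.0, 3.0, size=n_voxels)
cloud_1d = np.column_stack([t, t]) + rng.normal(scale=0.01, size=(n_voxels, 2))

share_2d = first_pc_share(cloud_2d)
share_1d = first_pc_share(cloud_1d)
print(f"first-PC share, 2D cloud: {share_2d:.2f}")
print(f"first-PC share, 1D cloud: {share_1d:.2f}")
```

A share near one half indicates that neither direction dominates, whereas a share near one indicates an essentially one dimensional cloud.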

The use of principal components to investigate the dimensionality of learning builds upon an existing literature on the use of dimensionality testing in the study of activation maps (Worsley et al., 1997). Specifically, Zarahn (2002) and Moeller and Habeck (2006) considered it within the context of functional imaging. The aim of this work is to study the goals, limitations and hypotheses of tests of dimensionality of fMRI activation maps. A test of one vs. two dimensions on the set $\hat{D}$, that is, a test of rank($\hat{D}$), investigates the null hypothesis

$$H_0: \beta_{21}(v) - \beta_{11}(v) = c\,\{ \beta_{22}(v) - \beta_{12}(v) \}$$

for unspecified c and collectively for all voxels v.
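The link between this null hypothesis and rank can be written out as a short worked equation. The shorthand d(v) is introduced only for this illustration and is not part of the original notation:

$$d(v) = \beta_{22}(v) - \beta_{12}(v), \qquad D = \begin{pmatrix} c\,d(1) & d(1) \\ \vdots & \vdots \\ c\,d(V) & d(V) \end{pmatrix} = \begin{pmatrix} d(1) \\ \vdots \\ d(V) \end{pmatrix} \begin{pmatrix} c & 1 \end{pmatrix}.$$

Under the null hypothesis every row of D is a multiple of (c, 1), so rank(D) ≤ 1 and the 2 × 2 matrix D′D has at most one non-zero eigenvalue. The test therefore asks whether the second eigenvalue of the estimated counterpart is larger than noise alone would explain.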

Let $\hat{A}_k = \frac{1}{N}\sum_{i=1}^{N}\hat{A}_{ik}$ and recall that $\hat{D} = \hat{A}_2 - \hat{A}_1$. Following the existing work on tests of dimensionality in fMRI, we use root tests of the second eigenvalue (see Mardia et al., 1980) to investigate the hypotheses of one dimension vs. two. A simulation-based investigation of this test follows. The simulation study considers: the strength of the effect, the intrinsic dimensionality (considering power and error rates), and the impact of biological and measurement variation, including variation in the angle of the subject-specific principal direction.

# 3. Materials and Simulation

### 3.1. Motivating Data Set

A motor learning study served as motivation for this work, though we emphasize that the methodology generally applies to any study of change in activation. The goal of the motor study centered on investigating skilled motor learning via the Arc Pointing Task (APT) (Shmuelof et al., 2012), where the task was designed to better understand neural correlates of motor skill acquisition. The subjects completed two similarly demanding motor tasks of drawing an arc within reference lines by moving their left wrist (non-dominant in all cases). The interior circles in **Figure 2** represent the starting and end points of the path. Subjects were directed to stay within the lines of the outer circles while tracing the arc. Subjects were scanned while performing the tasks at baseline and again 5 days later, with training on just one of the two tasks in the interim. Comparison of fMRI activation (or any measurement of motor function) from baseline to follow-up captures both effects related to motor learning and those related to changes between sessions. Comparison with the otherwise similar untrained task as a reference eliminates additive inter-session biases unrelated to learning.

The specifics of the study are as follows. Thirteen right-handed subjects (8 females, 18–27 years of age) engaged in the above described motor tasks, none having performed these tasks previously. Subjects participated in a 5 day protocol consisting of daily behavioral sessions in the lab and two fMRI scans on the baseline and final days (1 and 5, respectively). During scanning, subjects performed the APT. Horizontal (trained) and vertical (untrained, control) APT movements were performed in separate block design experiments before and after training for the horizontal task. Six movements were performed in 18 blocks (repeated 6 times), at a slow speed (1.5 s per movement). Subjects received online feedback regarding the position of the cursor, but no further information about their success or failure, or about their movement speed. In the trained task, targets were presented on the horizontal line (same configuration as during the behavioral task in the lab) and in the untrained task, targets were aligned vertically. Movements were always in the clockwise direction. Subjects performed the movements with their (non-dominant) left wrist, while lying on their back, and receiving visual feedback of their movements through goggles (Resonance Technology, Los Angeles, CA). Further details on the experimental paradigm can be found in Shmuelof et al. (2014).

Data were acquired on a Philips Intera 3T scanner using a Philips SENSE head coil. The functional scans were collected using a gradient echo EPI, with voxel size of 3 × 3 × 3 mm (240 × 240 × 240 mm matrix), TR = 2 s, flip angle = 77°, axial slices, TE = 25 ms. Forty slices were gathered in an interleaved sequence at a thickness of 3 mm (no gap). Ninety-six volumes were accumulated in each experimental run. The first 2 volumes were discarded to allow magnetization to reach equilibrium. A single T1-weighted anatomical scan was also obtained for each subject (MPRAGE, 1 mm<sup>3</sup>).

Functional data were preprocessed using SPM5 (http://www.fil.ion.ucl.ac.uk/spm/software/spm5/). Before statistical analysis, the data were also corrected for slice timing acquisition and head motion, re-sliced to 2 × 2 × 2 mm voxels using a fourth degree B-spline interpolation, and transformed into a Talairach standard space (Talairach and Tournoux, 1988). A general linear model was used for data analysis, followed by calculation of beta maps. Scatter plots of beta before training and after training are shown in **Figures 3, 4**.

By comparing the trained and untrained tasks, the population impact of learning was estimated by considering differences in the change in activation maps over sessions. Using the developed notation, the collections compared are {β21(v) − β11(v)}<sub>v=1,...,V</sub> and {β22(v) − β12(v)}<sub>v=1,...,V</sub>, where, as previously noted, the first index indicates session (baseline and fifth day) and the second indicates task (horizontal and vertical). The test of dimensionality then considers whether the changes in activated voxels after training are uncorrelated with the changes in the untrained (but otherwise similar) task. Under Gaussian assumptions, absence of correlation among activated voxels implies that the extent of activation is unrelated between sessions.

All subjects gave written, informed consent and received a small compensation for participating in the study, which was approved by the Columbia University Institutional Review Board.

### 3.2. Simulation Study

Assume there are V = V<sub>1</sub> + V<sub>2</sub> voxels in total: V<sub>1</sub> that are significantly different across sessions (group B in **Figure 1**) and referred to as "activated," and V<sub>2</sub> that are not (group D in **Figure 1**). Under this working example, the term activated implies a nonzero change in the contrast values across sessions. Thus, π = V<sub>2</sub>/V is the percentage of non-activated voxels.

The simulation model is:

$$b_{iv} \overset{iid}{\sim} N\left\{ \begin{pmatrix} \beta_{21}(v) - \beta_{11}(v)\\ \beta_{22}(v) - \beta_{12}(v) \end{pmatrix}, I\sigma^2 \right\} = N\{\delta(v), I\sigma^2\}, \tag{1}$$

where δ(v) = {δ1(v), δ2(v)} = {β21(v) − β11(v), β22(v) − β12(v)} and biv = {b1iv, b2iv} is a subject-specific realization plus noise. The generation of the δ(v) parameters varied across simulation settings, and is described separately for each case below.
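A minimal sketch of drawing data from model (1) is given below. The function name `simulate_subjects`, the toy values of δ(v), and the random seed are assumptions made for illustration only.

```python
import numpy as np

def simulate_subjects(delta, n_subjects, sigma, rng):
    """Draw b_iv ~ N(delta(v), sigma^2 I) independently per subject and voxel.

    delta: (V, 2) array of true inter-session changes for the two tasks.
    Returns an array of shape (n_subjects, V, 2).
    """
    noise = rng.normal(scale=sigma, size=(n_subjects,) + delta.shape)
    return delta[None, :, :] + noise

rng = np.random.default_rng(1)
delta = np.array([[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]])  # toy delta(v), V = 3
b = simulate_subjects(delta, n_subjects=12, sigma=1.0, rng=rng)
d_hat = b.mean(axis=0)  # voxel-specific mean across subjects
print(b.shape, d_hat.shape)
```

Averaging `b` over the subject axis gives the voxel-specific mean used as the estimate of the V × 2 matrix of δ(v).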

In all simulation settings, the estimate of the V × 2 matrix of the δ(v), labeled $\hat{D}$, was obtained via the voxel-specific mean across subjects. Following Worsley et al. (1997), the V × 2 matrix Z denotes $\hat{\delta}$ divided by its standard error. That is, $Z_k(v) = \mathrm{Var}\{\hat{\delta}_k(v)\}^{-1/2}\,\hat{\delta}_k(v)$ is the entry in row v and column k of Z. Here the variance was calculated across subjects separately for each voxel.

The cross-product matrix is then

$$S = \sum_{v=1}^{V} Z(v)' Z(v) / V.$$

The Lawley/Hotelling trace statistic is:

$$S_q = \sum_{j=q+1}^{h} \lambda_j / (h-q),$$

where λ<sub>j</sub>, j = 1, 2, …, h are the eigenvalues of S, h is the total number of eigenvectors and q is the testing rank. Under independence and Gaussian assumptions, S<sub>q</sub> follows an F distribution under the null hypothesis, where the first q principal components capture all of the signal. In our case, h = 2, q = 1 and the test statistic is simply the second eigenvalue of S.
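The computation of Z, S and S<sub>q</sub> can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the function name `trace_statistic` is hypothetical, the variance of the mean is taken to be the across-subject sample variance divided by N, and the toy data are generated only to exercise the function. Because the degrees of freedom of the reference F distribution are not stated here, the sketch stops at the statistic and does not compute a p-value.

```python
import numpy as np

def trace_statistic(b, q=1):
    """Compute S, its eigenvalues, and S_q from data b of shape (N, V, 2)."""
    n_subjects, n_voxels, _ = b.shape
    delta_hat = b.mean(axis=0)                     # (V, 2) voxel-wise means
    # Assumed form of Var{delta_hat}: sample variance across subjects / N.
    var_mean = b.var(axis=0, ddof=1) / n_subjects
    z = delta_hat / np.sqrt(var_mean)              # (V, 2) standardized map Z
    s = z.T @ z / n_voxels                         # sum_v Z(v)'Z(v) / V, 2 x 2
    eigvals = np.linalg.eigvalsh(s)[::-1]          # sorted largest first
    h = eigvals.size
    s_q = eigvals[q:].sum() / (h - q)
    return s, eigvals, s_q

rng = np.random.default_rng(3)
n_subjects, n_voxels = 12, 240
delta = np.zeros((n_voxels, 2))
delta[:40] = rng.uniform(0.5, 1.5, size=(40, 1))   # same change in both tasks
b = delta + rng.normal(size=(n_subjects, n_voxels, 2))
s, eigvals, s_q = trace_statistic(b)
print(eigvals, s_q)
```

With h = 2 and q = 1 the returned `s_q` coincides with the second eigenvalue, matching the statement above.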

#### 3.2.1. Simulation Under the Null Hypothesis

The first simulation setting considers the hypothesis of unidimensionality; that is, whether δ1(v) = cδ2(v), where c is constant across subjects. The parameter δ1(v) for the activated voxels was simulated as uniformly distributed in [min, max], with the range [min, max] varied from [0, 1] to [10, 15]. Note that for voxels inactive at both time points, δ1(v) = 0. Thus, δ1(1), …, δ1(V<sub>1</sub>) ≠ 0 while δ1(V<sub>1</sub> + 1), …, δ1(V) = 0. Note that δ2(v) = cδ1(v) regardless of null status.
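The generation of δ(v) under the null can be sketched as below. The function name `null_delta` is hypothetical; the values V<sub>1</sub> = 40 and V<sub>2</sub> = 200 and c = 1 are those quoted for the null scenarios, and [0, 1] is the first of the quoted [min, max] ranges.

```python
import numpy as np

def null_delta(v1, v2, lo, hi, c, rng):
    """delta(v) under the null: activated voxels on a line, inactive ones at 0."""
    delta1 = np.concatenate([rng.uniform(lo, hi, size=v1), np.zeros(v2)])
    delta2 = c * delta1                      # delta2(v) = c * delta1(v) for all v
    return np.column_stack([delta1, delta2])

rng = np.random.default_rng(2)
v1, v2 = 40, 200
delta = null_delta(v1, v2, lo=0.0, hi=1.0, c=1.0, rng=rng)
pi_inactive = v2 / (v1 + v2)                 # share of non-activated voxels
print(delta.shape, round(pi_inactive, 3))
```

By construction the resulting V × 2 matrix has rank one, which is the unidimensional structure the test takes as its null.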

**Figure 5** shows example data for a simulated subject as well as the estimated statistics. The null simulation varied according to the following: (i) distance of the activated voxels from the inactivated ones, as well as the range of activation, (controlled by min and max); (ii) the percentage of inactivated voxels (π); and (iii) the number of subjects (N). For all of the null hypothesis scenarios, c = 1. The type I error rates correspond to the percentage of rejections of the Lawley/Hotelling trace statistic for each simulation setting. The specifics of each scenario are described below while the results are shown in **Table 1**.

**Simulation under variation in the distance:** In this scenario, N = 12, V<sub>1</sub> = 40, V<sub>2</sub> = 200, and σ = 1. Five scenarios for each pair of min and max were considered. The results suggest that the type I error is not significantly affected by the distance of the activated voxels from the inactivated ones.


#### 3.2.2. Simulation Under the Alternative Hypothesis

There are a variety of ways in which the null hypothesis can fail to be true; herein, several key departures were analyzed. First, consider a straightforward departure, where **Figure 1** holds, with sets (A) and (C) both empty. The extent of spherical and elliptical variation around the principal axis is evaluated. However, other departures could also be present. Most importantly, the null could be true for each subject, but with a varying angle along the principal axis. In addition, a non-trivial percentage of voxels changing activation status (i.e., sets (A) and (C) from **Figure 1** being non-empty) would similarly represent a departure from the null hypothesis. The simulation scenarios for these parameters are described below.

The number of subjects remains N = 12 while min = 0.5, max = 1.5, V<sub>1</sub> = 40, and V<sub>2</sub> = 200.

**Simulation under basic alternatives:** Two basic alternative settings were considered. In the first, the δ(v) were simulated as two dimensional, yet one dimension dominates the other. This method of simulation added orthogonal variation around the line used in the simulation under the null hypothesis. Specifically, the activated voxels have Gaussian variation orthogonal to the major axis (see **Figure 6A**). This was done in lieu of simulating a bivariate Gaussian with a non-zero correlation to consider an even, non-concentrated spread along the major axis. Simulations using a bivariate normal yielded similar results. In the second setting the correlation was assumed to be zero (see **Figure 6B**).
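The first basic alternative can be sketched as follows. The function name `alternative_delta` is hypothetical, and treating σ<sub>b</sub> as the standard deviation of the displacement orthogonal to the identity line (c = 1) is an assumption about how the spread was parameterized.

```python
import numpy as np

def alternative_delta(v1, v2, lo, hi, sigma_b, rng):
    """Activated voxels spread along the identity line plus orthogonal noise."""
    t = rng.uniform(lo, hi, size=v1)                 # position along the major axis
    major = np.column_stack([t, t])                  # points on the null line, c = 1
    ortho = np.array([1.0, -1.0]) / np.sqrt(2.0)     # unit vector orthogonal to it
    offsets = rng.normal(scale=sigma_b, size=v1)[:, None] * ortho
    active = major + offsets
    return np.vstack([active, np.zeros((v2, 2))])    # inactive voxels stay at 0

rng = np.random.default_rng(5)
delta_alt = alternative_delta(40, 200, 0.5, 1.5, sigma_b=0.2, rng=rng)
delta_line = alternative_delta(40, 200, 0.5, 1.5, sigma_b=0.0, rng=rng)
print(np.linalg.matrix_rank(delta_alt), np.linalg.matrix_rank(delta_line))
```

Setting `sigma_b` to zero recovers the rank-one null configuration, while any positive value makes the matrix of δ(v) genuinely two dimensional.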


### 3.3. Simulation Results

**Table 1** displays the results across the simulation settings. All tests were performed at a nominal 5% error rate.

#### TABLE 1 | Results of the simulation studies.



*Shown are type I error rates and power across simulation settings.*

FIGURE 6 | … differences for each task in the motivating study. In (A) the voxels have Gaussian variation added orthogonally to the major axis. In (B) there is no relationship.

FIGURE 7 | Example simulation for the setting when the principal axis differs across subjects. The axes are the two dimensional bivariate simulated data representing inter-session differences for each task in the motivating study. The gray line is a reference identity line, while the red line is the axis of principal direction.

#### 3.3.1. Simulations Under the Null Hypothesis

Adherence to the specified nominal error rate was remarkably consistent as parameter settings varied. When varying the distance, the test showed only slight liberalism (Type I error rate larger than the nominal) across settings. Only for unrealistically small activation sets did the test demonstrate liberalism when altering the activation set size. In addition, varying the number of subjects had little impact. Adherence to the nominal error rate was acceptable, even at very low numbers of subjects.

#### 3.3.2. Simulations Under the Alternative Hypothesis

Under the basic alternative, where the true voxel states possessed a strong (but not perfectly linear) correlation, power varied as expected. Under a strong correlation (σ<sub>b</sub> close to 0), power trended to the nominal type I error rate. Encouragingly, power quickly trended to one as the true relationship moved away from a dominant dimension. As expected, the power tended to 1 as the sample size increased (confirming the relevant asymptotics). However, the sample size needed to be relatively large to have adequate power at the modest value of σ<sub>b</sub> = 0.2.

In the case where no dimension dominated under the basic alternative of absence of correlation, power changed significantly with the spread of activation, σ<sup>b</sup> . When the angle of principal direction varied, power suffered dramatically. To address this, a first stage subject-specific principal components rotation was investigated. This appeared to improve power in settings where the null and non-null voxels were more clearly delineated, but continued to exhibit low power (11%) when the distance was large (min = 10, max = 15). A non-trivial fraction of voxels changing activation status had a negative impact on power.
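The text does not spell out how the first stage rotation was carried out, so the following is only one possible reading, offered as a sketch. It rotates each subject's V × 2 map so that its leading principal axis lies along a common direction (here the identity line). The function name `align_principal_axis`, the use of the uncentered second-moment matrix, and the choice of target angle are all assumptions.

```python
import numpy as np

def align_principal_axis(x, target_angle=np.pi / 4):
    """Rotate a (V, 2) map so its leading principal axis lies at target_angle."""
    second_moment = x.T @ x / x.shape[0]      # uncentered: axis through the origin
    _, eigvecs = np.linalg.eigh(second_moment)
    lead = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    if lead @ x.sum(axis=0) < 0:              # resolve the sign ambiguity
        lead = -lead
    theta = target_angle - np.arctan2(lead[1], lead[0])
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    return x @ rot.T                          # apply the rotation to every row

rng = np.random.default_rng(6)
t = rng.uniform(0.5, 1.5, size=40)
angle = np.deg2rad(60.0)                      # hypothetical subject-specific angle
subject_map = np.column_stack([t * np.cos(angle), t * np.sin(angle)])
aligned = align_principal_axis(subject_map)
print(np.allclose(aligned[:, 0], aligned[:, 1]))
```

Applying such a rotation to each subject before averaging removes between-subject differences in the principal direction, at the cost of discarding whatever information the angle itself carries.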

# 4. Data Analysis of the Motivating Data Set

This section investigates the impact of training on activation using the APT data described in Section 3.1 and represented in **Figures 3**, **4**, which show estimated beta maps. The null hypothesis implies that the data points lie close to the principal line. Notably, a distinction between the null and alternative hypotheses is difficult to ascertain graphically. However, it is apparent that the axis of principal direction varies by subject. Next, dimensionality is tested in three ways: first considering only the (trained) horizontal task, then only the (untrained) vertical task, and then comparing both. When considering the trained task in isolation we are testing H<sub>0</sub>: β21(v) = cβ11(v), then H<sub>0</sub>: β22(v) = cβ12(v) for the untrained, and H<sub>0</sub>: β21(v) − β11(v) = c{β22(v) − β12(v)} when comparing trained and untrained. (The paper used the last of these as the primary motivating example.) In **Table 2**, the results before and after angle correction are shown.

### 4.1. Motor Learning Data Results

The axis of principal direction varied by subject (see **Figures 3**, **4**). Before correcting for the principal angle, the tests of dimensionality were not significant for either the horizontal or the vertical task. However, after correcting the principal angle by subject, the p-values of the tests were substantially reduced. Focusing only on the tasks separately, the test of dimensionality yielded a p-value of 0.05 for the vertical task and 0.16 for the horizontal one. When comparing across tasks, the p-value was 0.36. Thus, the untrained task has a significant second dimension that does not appear to be present in the trained task. Inspecting the data, excess variability in the trained task appears to be due to bimodal changes in activation. It is not surprising that the comparison across tasks was non-significant, given the increased variability obtained from taking differences and the issues of power for the test.

#### TABLE 2 | P-values of the tests of dimensionality for the motor learning data set.

*The first row considers Session 1 vs. Session 2 for the horizontal task (H<sub>0</sub>: β21(v) = cβ11(v)). The second row does the same for the vertical task (H<sub>0</sub>: β22(v) = cβ12(v)). The third considers inter-session differences across tasks (H<sub>0</sub>: β21(v) − β11(v) = c{β22(v) − β12(v)}). P-values are given with and without having performed an angle correction.*

# 5. Discussion

### 5.1. Simulation Results

The simulation results suggest that tests of dimensionality are a reasonable exploratory testing procedure for investigating the distribution of paired activation maps. However, their confirmatory performance was hindered by instances with low power in situations that could be realistically seen in practice. The adherence to the nominal type I error rate, on the other hand, was uniformly acceptable across simulation settings. Thus, a rejection from this test is likely informative, while a failure to reject is less so.

The low power cases occurred where there is substantial variability in the principal axis, or where activation status changed. This latter condition created confusion between noise and signal, with the test attributing signal variability to noise. Of the two cases, careful masking could eliminate concern over changing activation status. However, variability in the principal axis is likely the norm and could arise from a number of plausible biological, technological and processing causes. The straightforward refinement of a first stage subject-level principal component rotation improves the power.

### 5.2. General Discussion

This manuscript posited a different paradigm for statistically evaluating learning using task-related BOLD fMRI activation maps. At its core, the primary advance is the supposition of using the bivariate distribution of the activation maps, or changes in activation maps, when comparing tasks over sessions. Under this framework, changes in the distribution of activated voxels are key, not voxel level changes in activation extent, as would be evaluated in voxel-level parametric mapping interaction tests. An unintended benefit of this distributional approach in this setting is avoiding the familiar issue of having to determine interactions where main effects are not present.

The intended benefit of increasing power over voxel-level interaction tests was found to be true, provided assumptions hold. For example, **Figure 9** provides a simulation example where the alternative hypothesis of the test of dimensionality is both true and detected (P-value of 0.03). However, only 11% of the voxels would satisfy a voxel level test of significance. We emphasize the different nature of the hypotheses interrogated by these approaches, so comparisons of power should be taken with a grain of salt.

Evaluating distributional differences for learning-based activation tests a different scientific hypothesis than voxel level testing. In our example, the question was how BOLD activation, or changes in activation, relate between trained and untrained tasks. Investigating activation distributions is less sensitive to the requirement of focal localization of effects compared to interaction testing. For example, two small spatially separated significant interaction regions may have different voxel-level interaction significance than a single contiguous region of the same aggregate size. In contrast, the distribution may not change. Conversely, evaluating contrast map distributions does not provide the benefits of localization to inform results.

It is worth emphasizing that the investigation of activation distribution represents a complementary procedure to voxel-level testing and does not represent a form of omnibus test to be performed prior to it. Thus, it is perhaps not useful to generate a single analytic pipeline, whereby omnibus distributional tests are followed by voxel level contrasts of interest.

An interesting next direction in this line of research would consider full models of the joint distribution of {β11(v), β12(v), β21(v), β22(v)}. This could be accomplished using a Bayesian random effects approach via mixtures of Gaussian random variables. However, the feasibility, applicability and gain of such an approach over simpler solutions remain unknown. A tantalizing possible benefit would be robustness to inter-subject registration to a template. In contrast, interaction tests focus on localization and as such, place a heavy burden on accurate inter-subject registration. A full random effect mixture model could possibly remove the need for inter-subject registration, or at least remove the need for non-affine registration.

The far simpler approach discussed in this manuscript addresses dimensionality. The results show that the operating characteristics of the approach are viable, if modeling assumptions are met. Particularly encouraging was the robustness to variation in the distance of the center of activation from null voxels. However, its sensitivity to the angle of the principal axis is a core issue, as such variation is clear from the data.

In the real data analysis it is noteworthy that the vertical and horizontal tasks differed in their respective tests of dimensionality. Particularly, the null hypothesis was not rejected in the trained task (horizontal) while it was in the untrained task (vertical). However, there does appear to be more apparent non-Gaussianity in the vertical task, suggesting a component of the rejection is related to a form of dimensionality not well-covered by the model. The contrast test comparing vertical vs. horizontal was not significant. Therefore, it cannot be concluded that the activation distribution given by the inter-session differences across tasks is not linear. For all three cases, the data analysis suggests large variability in the subject-specific principal axes, a setting where low power was evidenced in the simulation study. Thus, the null results are perhaps indicative of low power.

# References


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Yang, Shmuelof, Xiao, Krakauer and Caffo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

#### *Javier Gonzalez-Castillo<sup>1</sup>\*, Daniel A. Handwerker<sup>1</sup>, Meghan E. Robinson<sup>1,2</sup>, Colin Weir Hoy<sup>1</sup>, Laura C. Buchanan<sup>1</sup>, Ziad S. Saad<sup>3</sup> and Peter A. Bandettini<sup>1</sup>*

*<sup>1</sup> Section on Functional Imaging Methods, Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA*

*<sup>2</sup> Translational Research Center for TBI and Stress Disorders (TRACTS), VA Boston Healthcare System, Boston, MA, USA*

*<sup>3</sup> Scientific and Statistical Computing Core, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA*

#### *Edited by:*

*Christopher W. Tyler, Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*Xi-Nian Zuo, Chinese Academy of Sciences, China R. Matthew Hutchison, Western University, Canada*

#### *\*Correspondence:*

*Javier Gonzalez-Castillo, Section on Functional Imaging Methods, National Institute of Mental Health, National Institutes of Health, Building 10, Room 1D80, 10 Center Dr, Bethesda, MD 20892, USA e-mail: javier.gonzalez-castillo@nih.gov*

Resting state functional MRI (rsfMRI) connectivity patterns are not temporally stable, but fluctuate in time at scales shorter than most common rest scan durations (5–10 min). Consequently, connectivity patterns for two different portions of the same scan can differ drastically. To better characterize this temporal variability and understand how it is spatially distributed across the brain, we scanned subjects continuously for 60 min, at a temporal resolution of 1 s, while they rested inside the scanner. We then computed connectivity matrices between functionally-defined regions of interest for non-overlapping 1 min windows, and classified connections according to their strength, polarity, and variability. We found that the most stable connections correspond primarily to inter-hemispheric connections between left/right homologous ROIs. However, only 32% of all within-network connections were classified as most stable. This shows that resting state networks have some long-term stability, but confirms the flexible configuration of these networks, particularly those related to higher order cognitive functions. The most variable connections correspond primarily to inter-hemispheric, across-network connections between non-homologous regions in occipital and frontal cortex. Finally we found a series of connections with negative average correlation, but further analyses revealed that such average negative correlations may be related to the removal of CSF signals during pre-processing. Using the same dataset, we also evaluated how similarity of within-subject whole-brain connectivity matrices changes as a function of window duration (used here as a proxy for scan duration). Our results suggest scanning for a minimum of 10 min to optimize within-subject reproducibility of connectivity patterns across the entire brain, rather than a few predefined networks.

**Keywords: fMRI, connectivity dynamics, stability, rest, sliding window analysis**

### **INTRODUCTION**

In recent years, the functional magnetic resonance imaging (fMRI) research community has undertaken a slow, yet constant shift in attention from functional localization (where in the brain a specific function resides) to functional connectivity (how different brain regions interact with each other). Today, it is well established that some brain regions are tuned primarily to perform specific tasks (e.g., motor cortex controls the movement of body parts, visual cortex analyzes incoming visual stimuli, etc.). Still, this one-to-one relationship soon diffuses as one moves beyond primary cortices into association cortex to understand the neuronal correlates of higher cognitive functions such as emotions, speech, or attention. Moreover, it is increasingly common to discover variations in functional connectivity, rather than in specific functional modules, that seem to differentiate complex mental conditions (see Greicius, 2008 for a review) such as autism (Just et al., 2007; Gotts et al., 2012), depression (Sheline et al., 2010), and Alzheimer's Disease (Wang et al., 2013a).

One well-known, non-invasive approach to the study of functional connectivity in the human brain is resting state fMRI (rsfMRI; Biswal et al., 1995). In this technique, the spatial cofluctuation of Blood Oxygenation Level Dependent (BOLD) signals is recorded while subjects rest quietly in the scanner in the absence of any specific task demands, and these data are used to explore patterns of functional connectivity at the system level (see Lowe, 2010 for a historical review). More importantly, rsfMRI is not only a powerful research tool, but it has great potential for clinical applications given its experimental simplicity, short scanning durations, richness of information, ease of sharing, and low requirement for subject compliance. Nevertheless, for clinicians to be able to rely on rsfMRI-based biomarkers to diagnose or intervene, several challenges with respect to reproducibility and interpretation must be resolved (Castellanos et al., 2013). Although overall patterns of rsfMRI-based functional connectivity have proven to be reliable across scans, subjects, and even institutions, quantitative measures with the potential to become biomarkers (e.g., the strength of a given connection) are not yet sufficiently reliable, as they depend on factors such as scan condition (e.g., eyes closed vs. eyes open; Yan et al., 2009; Van Dijk et al., 2010; McAvoy et al., 2012), scan duration (Birn et al., 2013), and specific pre-processing steps used during the analysis (Murphy et al., 2009; Power et al., 2012). Despite these dependences, some rsfMRI connectivity metrics such as regional homogeneity (ReHo; Zuo et al., 2013), amplitude of spontaneous low frequency oscillations (Zuo et al., 2010), and several measures of centrality (Zuo et al., 2012) have been shown to have encouraging test–retest reliability. Nevertheless, one additional factor that poses interesting questions regarding how to best record and quantify rsfMRI-based metrics is the recently observed dynamic behavior of rsfMRI connectivity patterns (Chang and Glover, 2010).

Several recent studies have shown how patterns of rsfMRI connectivity vary substantially even over the duration of a single scan (Chang and Glover, 2010; Handwerker et al., 2012; Tagliazucchi et al., 2012; Hutchison et al., 2013b), thereby calling into question the assumption of temporal stationarity even over short timescales (see Hutchison et al., 2013a for a review). Similarly, other studies have explored how scan duration affects the reproducibility of rsfMRI connectivity patterns (Van Dijk et al., 2010; Birn et al., 2013). However, most of these studies have focused their analysis on a handful of representative connections and networks. Given the large variability of functional roles and connection strengths across the human brain connectome, it can be expected that optimal scan acquisition strategies and reliability of biomarker measurements will depend greatly on the connections of interest. For example, Allen et al. (2014) recently reported a series of rsfMRI networks, labeled the "Zone of Instability," that exhibit significantly greater temporal variability in functional connectivity. These regions with the greatest instability correspond primarily to dorsal attention areas, default mode regions, and superior occipital areas. Still, Allen and colleagues' exploration of dynamic behavior was constrained by the duration of the resting scans (5 min and 4 s) and their temporal resolution (2 s), which limit both the quality of functional connectivity estimates (given the low number of available data points) and the domain of functional connectivity configurations that occur during such short scan periods.

The purpose of the current study is to further explore and characterize rsfMRI connectivity dynamics, and in that manner extend some of the findings of Allen et al. (2014) and others (Tagliazucchi et al., 2012; Hutchison et al., 2013b). To overcome the above-mentioned limitations resulting from short scan durations, in this study rsfMRI data were collected in 12 participants who were scanned continuously for 60 min at a temporal resolution of 1 s. Using these data, we evaluated pair-wise connections over the scale of minutes, investigating their polarity, strength, and variability. We evaluated the spatial distribution of three categories of connections (namely stable positive connections, variable positive connections, and negative connections) and whether assignment of connections to these three groups was consistent across subjects. Using a sliding window approach, we found that most stable positive connections correspond mainly to symmetric, inter-hemispheric, within- and across-network connections; while most variable positive connections correspond primarily to inter- and intra-hemispheric, across-network connections between occipital and frontal regions. Negative connections correspond primarily to those between two medial subcortical regions and fronto-parietal regions. We also evaluated how window length, a proxy for scan duration, affects the degree of similarity in whole-brain, within-subject connectivity patterns. We found two regimes in terms of how similarity changes with scan duration. For short scan durations (approximately less than 10 min) similarity of whole-brain connectivity patterns decreases quickly as scan duration shortens. For longer durations, although similarity increases with scan length, it does so at a much lower rate.

# **MATERIALS AND METHODS**

#### **DATA ACQUISITION**

Twelve healthy volunteers (7 males; age: 30.17 ± 10.22 years) participated in this study after providing written consent in agreement with a protocol approved by the NIH Protocol Review Board. Subjects were scanned continuously in a General Electric 3T MRI scanner for 60 min while relaxing with their eyes closed. A 32-channel receive-only head coil was used. The resting scans were acquired using a gradient-recalled echo-planar imaging (EPI) sequence (*TR* = 1 s, *TE* = 27 ms, FOV = 24/21 cm, image matrix = 64 × 64/72 × 72, slice thickness = 4.0 mm, slice spacing = 0.0 mm, flip angle = 60°, number of slices = 23, number of acquisitions = 3600, ASSET Acceleration = 2). In addition, a high-resolution T1 spoiled gradient echo (SPGR) scan was acquired for alignment and presentation purposes (sagittal prescription, number of slices per slab = 176, slice thickness = 1 mm, FOV = 256 mm, image matrix = 256 × 256) in each subject.

Respiration and cardiac traces were also collected during the resting scans using a respiration belt and a pulse oximeter, in all subjects except one. Both physiological traces were acquired with a sampling rate of 50 Hz.

In order to achieve a temporal resolution of 1 s, it was necessary to restrict our spatial coverage. In particular, with the current data, we cannot draw any conclusions regarding connections involving the cerebellum, temporal poles, or ventral temporal regions. New technological developments, such as multi-slice acquisition techniques (Feinberg and Setsompop, 2013), may soon be able to eliminate this limitation (Smith et al., 2012).

#### **DATA PRE-PROCESSING**

Data pre-processing was conducted with the AFNI software package (Cox, 1996). Pre-processing steps include: discarding of initial 10 volumes to allow for magnetic homogenization; despiking (with AFNI *3dDespike*); physiological noise correction (in all subjects but one) including regressors for the RETROICOR (Glover et al., 2000), RVT (Birn et al., 2006), and RHR (Chang et al., 2009) models; slice time correction (AFNI program *3dTshift*); head motion correction (AFNI program *3dvolreg*) and transformation into MNI space (AFNI program *@auto\_tlrc*) in a single interpolation step; and spatial smoothing (FWHM = 6 mm). In addition, mean, linear trends, signal from local white matter (WM), signal from the lateral ventricles (CSF), motion estimates, the first derivative of motion estimates, and a series of sine and cosine functions to remove all frequencies outside the range (0.01–0.25 Hz) were regressed out in a single regression step (AFNI program *3dTproject*). This last regression step permits us to account for potential hardware instabilities and remaining physiological noise (ANATICOR; Jo et al., 2010, 2013; Gotts et al., 2013). During this regression step, time points with motion greater than 0.4 mm were removed from the data (scrubbing) and replaced by values obtained via linear interpolation in time. On average, 1649 degrees of freedom (DOF) remain after the abovementioned regression and censoring steps (**Table 1** shows motion, number of interpolated volumes, and remaining DOFs for each subject).
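The censoring step can be illustrated with a minimal sketch in plain Python. It is not the AFNI implementation: the function name `interpolate_censored`, the per-volume `motion` values, and the handling of censored volumes at the edges of the run are assumptions made for illustration. Only the 0.4 mm threshold and the use of linear interpolation in time come from the text.

```python
def interpolate_censored(signal, motion, threshold=0.4):
    """Replace samples whose motion exceeds `threshold` (mm) by linear
    interpolation between the nearest retained neighbours in time.

    Hypothetical helper: a censored sample with no retained neighbour on
    one side is filled with the nearest retained value (an assumption).
    """
    keep = [i for i, m in enumerate(motion) if m <= threshold]
    if not keep:
        raise ValueError("all time points were censored")
    out = list(signal)
    for t in range(len(signal)):
        if motion[t] <= threshold:
            continue  # retained volume, leave untouched
        before = [i for i in keep if i < t]
        after = [i for i in keep if i > t]
        if before and after:
            lo, hi = before[-1], after[0]
            frac = (t - lo) / (hi - lo)
            out[t] = signal[lo] + frac * (signal[hi] - signal[lo])
        elif before:
            out[t] = signal[before[-1]]
        else:
            out[t] = signal[after[0]]
    return out


signal = [1.0, 2.0, 50.0, 60.0, 5.0]
motion = [0.1, 0.2, 0.9, 0.7, 0.1]  # mm; volumes 2 and 3 exceed 0.4 mm
cleaned = interpolate_censored(signal, motion)
```

In this toy run the two high-motion volumes are replaced by values lying on the straight line between volumes 1 and 4, while the retained volumes are unchanged.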

Spatial transformation matrices to go back and forth between the original EPI space, T1-anatomical space, and MNI standard space were also computed for each subject using AFNI programs *3dAllineate* and *@auto\_tlrc*. These matrices were subsequently used for presentation purposes and to bring publicly available atlases into each subject's functional data space (see below).

#### **BRAIN PARCELLATION**

In order to parcellate the brain into a limited number of spatially contiguous, functionally homogeneous, non-overlapping regions of interest (ROIs), we used the publicly available template of 150 ROIs associated with the Craddock Atlas (Craddock et al., 2012) (**Figure 1A**). An ROI-based approach was selected over a voxel-wise approach to help with interpretation, minimize the contribution of small errors in alignment to between-subject comparisons, and ease computational load. Nevertheless, despite using a functionally-based atlas with relatively small ROIs, some level of functional inhomogeneity should be expected when combining voxels into a single time-series (Zuo et al., 2013).

For each subject, we first brought this MNI atlas template into each subject's EPI space. Subsequently, we removed ROIs (20 ROIs from cerebellum, midbrain, and lower temporal cortex) that did not have at least 10 voxels within the imaged field of view for all 12 subjects (**Figure 1B**).

**Table 1 | Motion, number of censored time points, and remaining DOFs after bandpass filtering, regression of nuisance signals, and censoring in each subject.**

*Participant SBJ09 was excluded from all sliding window analyses due to the large number of data points that required interpolation due to head movement according to the criteria set during pre-processing.*

In order to group the remaining 130 non-overlapping ROIs into functionally relevant networks, we used the functional network taxonomy published by Laird et al. (2011), excluding two artifactual networks (ICNs 19 and 20 identified as artifactual by Laird and colleagues) and two networks not covered by our scanning FOV (ICNs 5 and 14). Each ROI was assigned to one of the 16 remaining networks described by Laird and colleagues by identifying the network with maximal spatial overlap with that ROI (**Figure 1C**). Within each network, ROIs in connectivity matrices appear sorted according to decreasing degree of overlap with that network. **Table 2** shows detailed information regarding which Laird et al. (2011) networks were used, the labeling scheme used in the remainder of this paper, how many ROIs were assigned to each of these networks, and the color assigned to the nodes of each network in the result figures.
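The maximal-overlap rule can be sketched as follows. The representation is an assumption: ROIs and networks are treated as sets of voxel indices, which presumes the network maps have already been binarized in some way that the text does not specify. The names `assign_network`, `networks`, and `roi` are hypothetical.

```python
def assign_network(roi_voxels, network_voxels):
    """Return the label of the network sharing the most voxels with the ROI.

    `roi_voxels` is a set of voxel indices; `network_voxels` maps a network
    label to its set of voxel indices (hypothetical binarized maps).
    """
    overlaps = {label: len(roi_voxels & voxels)
                for label, voxels in network_voxels.items()}
    return max(overlaps, key=overlaps.get)


networks = {
    "AUD": {1, 2, 3, 4},
    "VS2": {4, 5, 6, 7, 8},
}
roi = {3, 4, 5, 6}          # shares 2 voxels with AUD and 3 with VS2
label = assign_network(roi, networks)
```

The same overlap counts could then be used to sort ROIs within a network by decreasing degree of overlap, as done for the connectivity matrices.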

#### **ROI REPRESENTATIVE TIME SERIES EXTRACTION**

For each ROI, the principal singular vector (computed with AFNI program *3dmaskSVD*) across all voxels in the ROI was used as the representative time series. This resulted in 130 time series of interest with 3590 time points in each subject. The average and standard deviation of the Pearson's correlation between each ROI's representative time series and all voxels in the ROI, across all subjects and all ROIs, was 0.61 ± 0.08.

#### **CONNECTIVITY MATRIX BASED ON WHOLE TIME SERIES: STATIONARY ANALYSIS**

For each subject, we computed an overall correlation matrix (130 × 130) under the assumption of temporal stationarity, using all available 3590 time points. In these matrices, connectivity between two given ROIs is measured in terms of their Pearson's correlation (*r*). These matrices are symmetric, with *r* = 1 along the diagonal. All information is therefore contained in the 8385 values that form the upper triangular region. In the remainder of this manuscript we use the term "connectivity snapshot" to refer to a vector that contains only these uniquely informative values.
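As a small illustration of the "connectivity snapshot", the sketch below computes pairwise Pearson correlations and keeps only the upper triangle, excluding the diagonal. It uses plain Python and toy data; the function names are illustrative and not taken from the authors' pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def connectivity_snapshot(roi_series):
    """Upper-triangular correlations (diagonal excluded) as a flat list."""
    n_roi = len(roi_series)
    return [pearson(roi_series[i], roi_series[j])
            for i in range(n_roi) for j in range(i + 1, n_roi)]


def n_unique_connections(n_roi):
    """Number of unique ROI pairs: n (n - 1) / 2."""
    return n_roi * (n_roi - 1) // 2


toy = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
]
snapshot = connectivity_snapshot(toy)  # pairs (0,1), (0,2), (1,2)
```

For 130 ROIs the pair count is 130 × 129 / 2 = 8385, which matches the number of unique values quoted in the text.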

Binarized (connected/not-connected) versions of these connectivity matrices were also obtained using the following criterion: a cell in the matrix is given a value of 1 (connected) only if the corresponding correlation value for that cell is statistically significant at *p* < 0.05, corrected for multiple comparisons according to the Bonferroni criterion, taking into account the number of unique connections in the matrix (i.e., *p* < 0.05/8385). Otherwise, the cell is given a zero (not-connected) in these binary matrices. Even though the correct DOFs (**Table 1**) were used when computing the significance of the correlations prior to the multiple comparison correction, the significance level is approximate due to the unknown relationship between signal and noise in rsfMRI.
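To give a feel for how strict this threshold is, the following sketch converts the Bonferroni-corrected significance level into a critical correlation value. Several choices here are assumptions rather than the authors' procedure: it uses the Fisher z normal approximation instead of an exact test, it is two-sided, and it treats the DOF value as n − 2. The helper names are hypothetical.

```python
import math
from statistics import NormalDist


def critical_r(dof, n_tests, alpha=0.05):
    """Smallest |r| that is significant after Bonferroni correction.

    Fisher z normal approximation (an assumption of this sketch), with
    `dof` treated as n - 2 so that the standard error of z is
    1 / sqrt(dof - 1).
    """
    alpha_corrected = alpha / n_tests
    z_crit = NormalDist().inv_cdf(1.0 - alpha_corrected / 2.0)  # two-sided
    return math.tanh(z_crit / math.sqrt(dof - 1))


def binarize(snapshot, dof, alpha=0.05):
    """1 for connections whose |r| reaches the corrected threshold, else 0."""
    r_crit = critical_r(dof, n_tests=len(snapshot), alpha=alpha)
    return [1 if abs(r) >= r_crit else 0 for r in snapshot]


# Average DOF reported in the text and the 8385 unique connections.
r_crit = critical_r(dof=1649, n_tests=8385)
toy_binary = binarize([0.5, 0.01, -0.4], dof=1649)
```

With long scans and many retained degrees of freedom, even modest correlations clear the corrected threshold, which is consistent with a large fraction of connections being marked as significant.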

#### **SELECTION OF CONNECTIONS OF INTEREST FOR SLIDING WINDOW ANALYSIS**

**FIGURE 1 | (A)** Depiction of the 150-ROI Craddock Atlas on top of five sagittal slices in the MNI stereotaxic space. **(B)** Depiction of the remaining 130 ROIs from the atlas considered in this study. ROIs eliminated from the original atlas correspond mainly to the cerebellum and inferior temporal regions that were not part of the imaging FOV for all 12 participants. **(C)** Grouping of the remaining ROIs according to the Laird et al. (2011) functional network templates.

For our exploratory analysis of rsfMRI dynamics, we studied connections that showed significant correlation values in the stationary analysis for at least seven participants (half of the sample plus one). This selection step reduced the number of pairwise connections under study from the original 8385 to 5032 connections (see **Figure 3**).
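The selection rule for connections of interest (significant in the stationary analysis for at least seven of the twelve participants) amounts to a simple count across subjects. A minimal sketch with toy binary data follows; `select_connections` and the toy matrix are illustrative names and values only.

```python
def select_connections(binary_snapshots, min_subjects=7):
    """Indices of connections significant in at least `min_subjects` subjects.

    `binary_snapshots` holds one 0/1 list per subject, all of equal length.
    """
    n_conn = len(binary_snapshots[0])
    counts = [sum(subject[c] for subject in binary_snapshots)
              for c in range(n_conn)]
    return [c for c, k in enumerate(counts) if k >= min_subjects]


# 12 toy subjects, 3 connections: significant in 12, 7 and 6 subjects.
toy = [[1, 1 if s < 7 else 0, 1 if s < 6 else 0] for s in range(12)]
selected = select_connections(toy)
```

Only the first two toy connections survive; the third falls one subject short of the threshold.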

#### **WHOLE-BRAIN, WITHIN-SUBJECT CONNECTIVITY MATRIX SIMILARITY vs. WINDOW DURATION**

In order to evaluate how the within-subject similarity of whole-brain connectivity patterns changes as a function of window length, we segmented our 60 min of data (minus the first 10 discarded seconds) into temporally non-overlapping windows with durations ranging from 30 s to 19.5 min in steps of 30 s. The number of available non-overlapping windows decreases with increasing window duration. A maximum duration of 19.5 min was chosen so that at least three different windows were available for the analysis in each individual.

For each subject and window duration, we first computed connectivity matrices for each non-overlapping window. We then computed the average correlation between all available matrices for a given duration and subject. This average number permits us to describe within-subject similarity between connectivity matrices for a given duration. We finally computed an average value across all subjects, for each window duration, to obtain an aggregate measure of within-subject similarity for our population of subjects (**Figure 4**).
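The windowing arithmetic and the similarity measure can be sketched as below. The sketch assumes that "average correlation between all available matrices" means the mean Pearson correlation over all pairs of window-level connectivity snapshots; the text does not say whether a Fisher transform was applied before averaging, so none is used here. Function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def window_bounds(n_points, window_len):
    """Start/stop indices of consecutive non-overlapping windows."""
    n_win = n_points // window_len
    return [(k * window_len, (k + 1) * window_len) for k in range(n_win)]


def mean_pairwise_similarity(snapshots):
    """Average Pearson correlation over all pairs of connectivity snapshots."""
    pairs = list(combinations(snapshots, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# 3590 retained time points at TR = 1 s; durations from 30 s to 19.5 min.
durations = list(range(30, 1171, 30))
windows_per_duration = {d: len(window_bounds(3590, d)) for d in durations}
toy_similarity = mean_pairwise_similarity(
    [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 2.0, 3.5]])
```

With 3590 time points, a 19.5 min (1170 s) window yields exactly three non-overlapping windows, which is why that duration is the upper limit.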

#### **CONNECTION STABILITY ANALYSIS**

For each subject, we computed sliding window correlations with a window length of 60 s and a window step of 60 s (to avoid overlap). There are two reasons for choosing this 60 s window duration: (1) to have a sufficiently large number of data points per window to compute meaningful correlation values; and (2) because recent studies have shown that functional connectivity is related to both cognition (Shirer et al., 2012) and electrocortical measures (Tagliazucchi et al., 2012) at similar temporal scales. Nevertheless, to evaluate the extensibility of these results to other window durations, we also performed the same analysis using non-overlapping windows of 120 and 180 s durations.

**Table 2 | Summary of correspondence between Craddock Atlas ROIs and Laird Network Templates.**

A 20% tapering of the time series was performed prior to computation of the correlation. For 60 s windows, the sliding window analysis produced for each participant (s) a matrix Cs (connection, window) with 5032 connections × 59 windows (not 60 due to the 10 s discarded at the beginning of the scan) that contains information about the evolution of connectivity strength over time for all connections under scrutiny (**Figure 2A**).
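A rough sketch of a tapered, non-overlapping sliding-window correlation is given below. The text does not specify the taper shape or exactly how it was applied, so the following are assumptions: a cosine (Tukey-style) taper in which 20% of each window in total ramps between 0 and 1, applied to the demeaned segment of each time series before a normalized inner product is taken. All function names are hypothetical.

```python
import math


def tukey_taper(n, fraction=0.2):
    """Cosine taper: `fraction` of the samples (split between both ends)
    ramps smoothly between 0 and 1; the rest stay at 1."""
    ramp = int(n * fraction / 2)
    w = [1.0] * n
    for k in range(ramp):
        value = 0.5 * (1.0 - math.cos(math.pi * (k + 0.5) / ramp))
        w[k] = value
        w[n - 1 - k] = value
    return w


def tapered_correlation(x, y, taper):
    """Normalized inner product of two demeaned, tapered window segments."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    xs = [(a - mx) * t for a, t in zip(x, taper)]
    ys = [(b - my) * t for b, t in zip(y, taper)]
    num = sum(a * b for a, b in zip(xs, ys))
    den = math.sqrt(sum(a * a for a in xs) * sum(b * b for b in ys))
    return num / den


def sliding_window_correlation(x, y, window_len=60, step=60):
    """One correlation value per window; step == window_len avoids overlap."""
    taper = tukey_taper(window_len)
    starts = range(0, len(x) - window_len + 1, step)
    return [tapered_correlation(x[s:s + window_len], y[s:s + window_len], taper)
            for s in starts]


# Two synthetic series with 3590 samples (TR = 1 s), for illustration only.
x = [math.sin(0.05 * t) for t in range(3590)]
y = [math.sin(0.05 * t + 0.3) + 0.1 * math.cos(0.9 * t) for t in range(3590)]
series = sliding_window_correlation(x, y)
```

With 3590 samples and 60 s windows this produces 59 values per connection, in line with the 59 windows mentioned in the text.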

#### *Most stable/variable connections*

Subsequently, for each row of this matrix, we computed the coefficient of variation (CVAR) as follows:

$$\text{CVAR}(i,s) = \frac{\text{stdev}\left(C_s(i,:)\right)}{\text{mean}\left(C_s(i,:)\right)} \tag{1}$$

where s is a given subject and i is a given connection (**Figure 2A**). In order to compute this summary metric we transformed correlation values into Fisher's *Z*-scores, computed the summary statistics, and then transformed these back from Fisher's *Z*-scores into correlation values.
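One possible reading of this computation is sketched below: the windowed correlations of a connection are Fisher-transformed, the mean and standard deviation are computed on the z values, each statistic is mapped back to the correlation scale, and the ratio is taken. The exact order of operations and the use of the sample standard deviation are assumptions; `cvar` is an illustrative name.

```python
import math
from statistics import mean, stdev


def cvar(window_correlations):
    """Coefficient of variation of one connection's windowed correlations.

    Summary statistics are computed on Fisher z values and mapped back to
    the correlation scale before taking the ratio (one reading of the text).
    """
    z = [math.atanh(r) for r in window_correlations]
    return math.tanh(stdev(z)) / math.tanh(mean(z))


stable = cvar([0.80, 0.82, 0.78, 0.81])       # strong, steady connection
variable = cvar([0.10, 0.60, 0.30, 0.75])     # positive but fluctuating
negative = cvar([-0.30, -0.10, -0.25, -0.35])  # negative mean correlation
```

Because the standard deviation is never negative, the sign of CVAR is the sign of the mean correlation, so a negative CVAR identifies connections whose average windowed correlation is negative.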

In addition, the median and standard deviation of CVAR values across all subjects and connections were computed, and connections whose CVAR was outside one standard deviation of this median were removed from further analyses (**Figure 2B**). This threshold condition eliminated 9 ± 5 (mean ± standard deviation) connections per subject. After removal of outlier connections, the Cs matrices were sorted according to their CVAR values (**Figure 2C**). We then classified all remaining connections into one of three groups (**Figure 2D**). First, we divided the pool of connections into those with positive or negative CVAR. Then, within the pool of connections with positive CVAR, we further subdivided these into two subgroups: 50% of the positive CVAR connections with the highest CVAR values went into one subgroup (most variable), and the remaining half went into the other subgroup (most stable). In summary, this process forces every non-outlier connection to be part of one of these three groups: connections with negative CVAR, most stable positive connections (lowest positive CVAR), and most variable positive connections (highest positive CVAR).


To aggregate results across subjects while giving maximum attention to connections with a similar pattern of correlation across participants, we generated a new group-level classification matrix in which a given connection was marked as being of one of the three types mentioned above if and only if that connection was classified in the same manner in all participants (**Figure 2E**, top). In addition, to examine the effect of this threshold, matrices were also generated showing the number of subjects in which connections were classified in each group (**Figure 7**). To evaluate the presence of patterns of interest in the spatial distribution of these three types of connections, we used AFNI program *SUMA* (Saad and Reynolds, 2012) to visualize each of these three groups in a 3D brain space (**Figure 2E**, bottom).
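The per-subject classification and the unanimous group-level rule can be sketched together. The sketch assumes that every subject's CVAR list is aligned to the same set of connections (outlier handling is omitted), that a CVAR of exactly zero counts as positive, and that with an odd number of positive connections the extra one is labeled stable; these tie-breaks are arbitrary choices of this illustration, as are the function names.

```python
def classify(cvars):
    """Label each connection 'negative', 'stable' or 'variable'.

    Positive-CVAR connections are ranked from highest to lowest; the top
    half is 'variable' and the remainder 'stable'.
    """
    labels = ["negative"] * len(cvars)
    positive = sorted((i for i, c in enumerate(cvars) if c >= 0),
                      key=lambda i: cvars[i], reverse=True)
    n_variable = len(positive) // 2
    for rank, i in enumerate(positive):
        labels[i] = "variable" if rank < n_variable else "stable"
    return labels


def consensus(labels_per_subject):
    """Group-level label per connection if all subjects agree, else None."""
    return [labels[0] if len(set(labels)) == 1 else None
            for labels in zip(*labels_per_subject)]


subject_1 = classify([0.05, 0.90, -0.40, 0.10, 0.70])
subject_2 = classify([0.08, 0.60, -0.20, 0.50, 0.07])
group = consensus([subject_1, subject_2])
```

In this toy case the first three connections receive the same label in both subjects and are kept at the group level, while the last two are classified differently across subjects and are dropped.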

#### *Permutation analysis for group-level connection identification*

**FIGURE 2 | Sliding-Window methods. (A)** Example running window connectivity matrix for one representative subject on the left, and its associated vector of CVAR values on the right. The thresholds used to discard connections on the basis of excessive CVAR are depicted as red dashed lines. Eight connections that were discarded for this particular subject are marked as red dots. **(B)** Sliding window connectivity matrix and CVAR vector after removal of outlier connections. Now there are 5023 connections, instead of 5032, for this representative subject. **(C)** Sliding window connectivity matrix and CVAR vector after sorting connections according to their CVAR. Connections with negative CVAR are at the bottom of the graph, while connections with positive CVAR are on the top. The further a connection is from the horizontal axis where CVAR is the closest to zero (black dashed line), the more variable the strength of that connection across time. **(D)** Classification of connections in three possible groups for three other representative subjects, shown both as sorted sliding window connectivity matrices (left) and in a single matrix form (right) where the color of the cell for a connection denotes its group assignment according to our criteria. The three groups are: connections with negative CVAR (blue); lowest positive CVAR/most stable connections (green); largest positive CVAR/least stable connections (red). **(E)** Aggregated results across subjects. We do this by only selecting connections classified the same way across all 11 participants that were included in the sliding window analysis. Connections of the three types are shown both in matrix view (top) and in brain space (bottom).

In order to determine the probability that results of the connection grouping procedure described above would occur due to chance, we conducted a permutation test in which the labels of all connections in each subject were randomly shuffled. Using the same group sizes for each subject from the real data, the connections for each group were then selected within that subject. The number of connections classified in the same group across all subjects was then counted. This procedure was repeated 5000 times to obtain a distribution of the number of connections that would be classified in the same group in all subjects based only on chance.
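The permutation procedure for group-level connection identification can be sketched as follows. Shuffling each subject's label list preserves that subject's group sizes while breaking the link between labels and connections. For brevity the sketch counts unanimous connections over all three groups together rather than per group, uses toy sizes, and fixes a random seed; these are choices of this illustration, not details from the text.

```python
import random


def count_unanimous(labels_per_subject):
    """Number of connections given the same label by every subject."""
    return sum(1 for labels in zip(*labels_per_subject)
               if len(set(labels)) == 1)


def permutation_null(labels_per_subject, n_perm=5000, seed=0):
    """Null distribution of the number of unanimously classified connections."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        shuffled = []
        for labels in labels_per_subject:
            perm = list(labels)
            rng.shuffle(perm)  # keeps this subject's group sizes fixed
            shuffled.append(perm)
        null.append(count_unanimous(shuffled))
    return null


# Toy data: 11 subjects that agree perfectly on 45 connections.
template = ["negative"] * 5 + ["stable"] * 20 + ["variable"] * 20
subjects = [list(template) for _ in range(11)]
observed = count_unanimous(subjects)
null = permutation_null(subjects, n_perm=200)
```

Comparing `observed` with the upper tail of `null` gives an empirical estimate of how often the observed level of agreement would arise by chance alone.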

# **RESULTS**

#### **STATIONARY ANALYSES RESULTS**

**Figure 3A** shows the static connectivity matrices for four representative subjects computed using the complete time series (3590 time points). Although there is some degree of similarity in the overall structure of the matrices across subjects (e.g., within-network connections are stronger than between-network connections in all subjects; connectivity between MV2 and VS3 is also stronger in many subjects), there are clear differences in terms of the strength of many individual connections. From a quantitative point of view, the average correlation between the different subjects' connectivity snapshots (upper triangle of the matrix excluding the diagonal) is *r* = 0.53 ± 0.07.

**Figure 3B** shows binarized (connected/unconnected) versions of the connectivity matrices presented in **Figure 3A**. The average and standard deviation of the number of statistically significant connections for the current sample was 5198 ± 747 (out of 8385 possible connections). **Figure 3C** shows another matrix view of the data where the value in each cell is the number of subjects for which that particular connection is statistically significant under the criteria described above. Finally, **Figure 3D** shows a binarized version of this aggregate view (**Figure 3C**), by marking with gray color only the connections that were classified as statistically significant in at least seven (more than half of the study population) subjects. There are a total of 5032 connections that pass this group-level threshold. All remaining results, with the exception of the whole-brain within-subject similarity vs. scan duration analysis (section Similarity of Whole-Brain Connectivity as a Function of Window Duration), were conducted using only this subset of 5032 connections.

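**FIGURE 3 | (A)** Static connectivity matrices for four representative subjects. **(B)** Binarized connectivity matrices for the same four representative subjects after statistical thresholding. **(C)** Number of subjects for which each connection is statistically significant. **(D)** Connections that are statistically significant in at least seven subjects.

#### **SIMILARITY OF WHOLE-BRAIN CONNECTIVITY AS A FUNCTION OF WINDOW DURATION**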

**Figure 4** shows how within-subject similarity of connectivity patterns across the whole brain decreases as window duration shortens. For durations longer than 10 min, the rate of decrease is relatively slow. It is for durations shorter than approximately 6 min that within-subject similarity decreases at a faster rate. This behavior was consistent across subjects.

#### **HISTOGRAMS OF SLIDING-WINDOW CORRELATIONS**

**Figure 5A** shows histograms of correlation values across time (bin width = 0.25) for all connections in one representative subject (SBJ01) as black traces. Visual inspection reveals no clear boundaries between different connection types, but a continuum of behavior in which connections span a wide range of mean and standard deviation values. Peaks can be observed at all centers of histogram bins. This is not the result of individual histograms having many peaks (temporal evolution of connectivity strength following multimodal distributions), but of the overlap of approximately 5000 histograms with a wide range of means and standard deviations. To show that individual histograms do not present such sharp profiles, but are mostly uni-modal in shape, a subset of 11 randomly selected histograms is highlighted with dashed colored lines in **Figure 5A**. **Figure 5B** shows the same histograms as **Figure 5A**, but this time histograms have been colored according to their membership in one of the three groups defined in terms of CVAR (blue = negative CVAR; red = most variable positive CVAR; green = most stable positive CVAR). Despite the lack of clear boundaries between histograms, the classification criteria based on the CVAR were able to generate three compact groups of connections in all subjects (**Figure 5C** shows a second representative subject). An additional observation is that most stable positive connections, as defined with the CVAR criteria, are connections with high mean connection strength across time (green histograms peak primarily at the right of the graphs).

#### **MOST VARIABLE POSITIVE CONNECTIONS**

**Figures 6A,B** show the 23 connections classified as most variable in all participants for a window duration of 60 s. **Table 3** summarizes the distribution of such connections across different networks. All 23 connections correspond to connections between ROIs from different networks (**Table 3**). Primarily, most variable connections correspond to non-symmetric, inter-hemispheric connections between occipital (visual networks) and frontal regions (fronto-parietal networks). A similar general pattern was observed for window durations of 2 (**Figure 6C**) and 3 (**Figure 6D**) min. The total number of connections in this pool was 13 for 2 min windows, and 14 for 3 min windows.

In addition, **Figure 7A** shows a non-thresholded version of **Figure 6B**, where the color of each connection represents the number of subjects for which that connection was classified as most variable. Connections marked as most variable for seven or more subjects are colored with different shades of red. These connections still correspond primarily to inter-network connections. Moreover, they tend to correspond primarily to connections between occipital (visual networks) and fronto-parietal networks, as well as connections between nodes of EI3 and all other networks.

#### **MOST STABLE POSITIVE CONNECTIONS**

**FIGURE 5 | (A)** Histograms of sliding-window correlation values for all connections in one representative subject (SBJ01). To show the mostly uni-modal shape of individual histograms, 11 randomly selected histograms are highlighted using dashed colored lines. **(B)** Same histograms as in **(A)**, but this time each histogram is colored according to the group of its connection (negative CVAR, most variable positive CVAR, or most stable positive CVAR). Grouping of connections shows a compact profile, with all connections from the same group clustering together. **(C)** Same as **(B)** for a second representative subject.

**Figures 8A,B** show the 364 connections classified as most stable in all participants for a window duration of 60 s. **Table 4** summarizes the distribution of these connections within and across different networks. Roughly 40% of the connections, 148, correspond to within-network connections and the remaining 216 to across-network connections. A large percentage of stable positive connections are symmetric, inter-hemispheric connections. This pattern becomes more apparent if we restrict our analysis only to connections in the bottom 25% and 12.5% of positive CVAR values (**Figure 9**). When window duration was increased to 2 (**Figure 8C**) and 3 (**Figure 8D**) min, a similar spatial pattern arose. The total number of positive stable connections was 344 for 2 min windows, and 334 for 3 min windows.

In addition, **Figure 7B** shows a non-thresholded version of **Figure 8B**, where connections classified as most stable for seven or more subjects appear with different shades of green. Most stable connections under these less stringent conditions correspond primarily to within-network connections, although several clusters of most stable connections can be observed between the AUD and SPP networks, between the four MV networks, and between MV3-4 and visual and auditory regions.

**Figure 10** shows a summary view of the matrix in **Figure 8B**. For each square, we show the percentage of connections that fall within the most stable positive pool. Therefore, squares in the diagonal show the percentage of within-network connections that were classified as most stable. For example, MV3 and VS2 are the two most cohesive networks, with 100 and 70% of all possible within-network connections being consistently stable across time. Squares outside the diagonal show the percentage of all possible connections between two given networks that fall within the pool of most stable connections. We can see how MV1, MV3, and MV4 (red dashed outlines) have a substantial number of stable communication pathways among each other. The same is true for the SPP and the AUD networks (green dashed outlines). All percentages in this figure have been corrected to take into account only the 5032 connections that passed our stationary significance criterion.

#### **NEGATIVE CONNECTIONS**

**Figures 11A,B** show the 32 connections with negative CVAR in all participants for a window duration of 60 s. **Table 5** summarizes the distribution of such connections across different networks. All negative connections correspond to across-network connections. In particular, 26 connections involve two regions from the Emotion/Interoception network #2 (EI2). This pattern of negative CVAR connections primarily involving regions from the EI2 network is also very apparent in **Figure 7C**, where connections marked as negative CVAR connections in seven or more subjects appear marked in different shades of blue. When window duration was increased to 2 (**Figure 11C**) and 3 (**Figure 11D**) min a similar connectivity map was also produced. The total number of negative connections was 32 for 2 min windows, and 30 for 3 min windows.

# **DISCUSSION**

Using 60 min resting scans with a temporal resolution of 1 s and a sliding window analysis approach, we divided functional connections in our data into three groups based on similarity of patterns of temporal variability across our study population. Sorting and grouping of connections was done according to the coefficient of variation (CVAR) of connectivity strength across time. The CVAR is a common measure of spread for Gaussian-like distributions that accounts for differences in the mean and has a simple interpretation (i.e., the larger the CVAR, the bigger the spread of the distribution of values around the mean). Connectivity strength histograms (**Figure 5**) showed that distributions follow mostly uni-modal, bell-like shapes with different levels of spread, suggesting that the use of CVAR is a valid first approximation to estimate variability for the temporal evolution of connection strength. To aggregate results at the group level, we decided to focus our attention only on connections classified in the same manner across all participants. A permutation analysis (5000 repetitions) revealed that the number of connections randomly found in any of the three groups, when following the above-mentioned criteria to combine results across subjects, is less than four connections. Finally, to evaluate the role that regional differences in signal-to-noise ratios may have played in our study, we also computed average temporal signal-to-noise ratio (TSNR) across subjects for all ROIs entering the analysis. We found no clear relationship between ROI TSNR values and participation in connections of a given type (most variable, most stable, or negative CVAR). These results suggest that the simple criteria used in this study provide reasonable descriptions of the patterns of temporal variability in resting state connectivity, and that these results are reproducible across subjects and capture true structure present in the data (i.e., not found purely by chance).

*Connection counts are divided in two groups: connections between two ROIs that are part of the same network (within) and connections between ROIs that are part of different networks (across).*

**FIGURE 7 | Number of subjects for which a given connection was classified as most variable (A), most stable (B), and with negative CVAR (C).** Connections that were consistently classified in the same group for all 11 subjects are marked with a black outline. These are the same connections shown in **Figure 6** (most variable), **Figure 8** (most stable), and **Figure 11** (negative CVAR). Connections that were classified in the same group for seven or more subjects appear in different shades of red (most variable), green (most stable), or blue (negative CVAR) in the corresponding panel.

**FIGURE 8 | (A)** Most stable positive connections for window length = 60 s. Connections classified as most stable in all 11 participants are shown over 3D renderings of a brain surface. **(B)** The same information shown as a 2D matrix. Colors corresponding to networks on the axes of the matrix are used to color nodes of that network in brain space. **(C)** Most stable connections for window length = 120 s. **(D)** Most stable connections with window length = 180 s.


**Table 4 | Absolute (#) and relative (%) number of connections with positive low CVAR (most stable) for each network.**

*Connection counts are divided in two groups: connections between two ROIs in the network of interest (within) and connections between one ROI in the network of interest and one ROI not in the network of interest (across).*

**FIGURE 9 | (A)** Most stable positive connections when only connections within the lowest 25% of CVAR values are selected in each subject. **(B)** Most stable positive connections when only connections within the lowest 12.5% of CVAR values are selected in each subject. As the selection criterion becomes more stringent, a smaller number of connections make it to the group level maps presented here. When fewer connections are present, the symmetric inter-hemispheric pattern becomes clearer.

The connections that reliably fall in each category have very distinct spatial patterns when plotted in brain space. In particular, most temporally stable connections (low positive CVAR) correspond mainly to symmetric, inter-hemispheric connections both within- and across-networks; most temporally variable connections (high positive CVAR) correspond mainly to non-symmetric, inter-hemispheric, across-network connections between occipital and frontal regions; and connections with negative CVAR correspond mainly to connections between two medial ventral subcortical regions and bilateral fronto-parietal regions. These general patterns were observed for non-overlapping window durations ranging from 1 to 3 min. We discuss the findings related to each of these categories in detail below.

#### **MOST STABLE POSITIVE CONNECTIONS**

The most stable positive connections form the largest of the three connection pools, with approximately one order of magnitude more connections than the other two groups (364 most stable connections vs. 23 and 32 in the other two groups). Moreover, most stable connections are not only more consistent across subjects and fluctuate less, but also fluctuate around higher correlation values than least stable connections (green histograms cluster on the right hand side, which corresponds to stronger positive correlation values; see **Figures 5B,C**). These two observations suggest that while being classified as most variable or negative may depend to a larger extent on subject-dependent factors (e.g., on-going cognition, awareness levels, etc.), most stable connections are so because of an underlying source largely independent of these factors. One such source could be anatomical connectivity. Several studies have shown a good correspondence between BOLD resting state connectivity patterns and underlying direct anatomical connections as measured in Diffusion Tensor Imaging (DTI) (Greicius et al., 2009; Van Den Heuvel et al., 2009) and in primate electrophysiology and tracer studies (Margulies et al., 2009; Wang et al., 2013b). Additionally, computational modeling studies have shown that structural connections provide robust predictions of functional connectivity, although the reverse is not always true (Honey et al., 2009; Deco et al., 2011). Relating to the current study, Honey et al. (2009) observed that ROI pairs with direct anatomical connectivity (as measured by diffusion spectrum imaging tractography) had more stable functional connectivity both within and across rsfMRI sessions. In agreement with their findings, many of the most stable connections identified here are symmetric, inter-hemispheric connections between left/right homologous regions that are known to have direct connections via the corpus callosum. However, it should be noted that stable functional connectivity patterns can also be supported by indirect anatomical connections (Tyszka et al., 2011; O'Reilly et al., 2013).

Approximately 40% of the most stable connections correspond to those between two nodes of the same network (within-network connections). Still, that accounts for only 32% of all within-network connections, which confirms prior observations suggesting that resting-state networks are not as temporally stable in their configuration as originally assumed (Chang and Glover, 2010; Handwerker et al., 2012; Smith et al., 2012; Tagliazucchi et al., 2012; Hutchison et al., 2013b). Our data also show that levels of temporal cohesion vary substantially across networks. The four most temporally cohesive networks were MV4 (100% of its 6 within-network connections fall in the most stable group), VS2 (70%), MV3 (50%), and AUD (50%) (**Figure 10** and **Table 4**). The MV4 network, which primarily covers bilateral dorsal parietal cortex (BA5), has been shown to have a preference for motor execution and learning (Laird et al., 2011). The MV3 network, which sits laterally to MV4 and covers mainly primary and supplementary motor cortex for upper extremities, was found to be strongly associated with tasks involving hand movement (Laird et al., 2011). Additionally, networks VS2 (which covers posterior and inferior portions of occipital cortex) and AUD (which covers the transverse temporal gyri) correspond to primary visual and auditory cortices. Taken together, our results suggest that primary sensory-motor networks are among the most temporally stable with respect to their internal connectivity patterns. On the other end of the spectrum, VS1 (11.76%), FPR (12.50%), SPP (14.20%), and EI4 (15%) were the networks with the lowest percentage of within-network connections that were consistently stable across all subjects. These networks span a wide range of regions involved in complex higher-order functions such as visual identification of complex visual stimuli (VS1), attention control and reasoning (FPR), speech production (SPP), and emotion discrimination (EI4).
It may be that performance of these more complex tasks relies on a broader and more dynamic set of connectivity configurations, and that these tasks and their configurations occur less often during rest. In agreement with these findings, Mueller et al. (2013) found that inter-subject variability in stationary patterns of global functional connectivity was lowest in unimodal cortical areas similar to the sensory-motor systems found to be most stable here.

*Connection counts are divided in two groups: connections between two ROIs in the network of interest (within) and connections between one ROI in the network of interest and one ROI not in the network of interest (across).*


Regarding most stable between-network connections, we found two sets of networks to be the most stably interconnected. The first group consists of networks MV1, MV3, and MV4 (red outlines in **Figure 10**). The second group consists of SPP and AUD (green outlines in **Figure 10**). These groups of networks were found to be tightly connected in terms of their functional role when matched against thousands of activity patterns from task-based studies included in the BrainMap database (Fox et al., 2005). MV1, MV3, and MV4 were found to consistently participate in a variety of experiments related to motor and visuo-spatial integration and coordination (Laird et al., 2011). Moreover, MV3 and MV4 (the two networks with the largest percentage of inter-network stable connections) failed to split into two separate entities in a prior similar study that used a smaller subsample of the BrainMap database (Smith et al., 2009). In the case of the SPP and AUD networks, their functional relationship was not as strong, but both networks heavily contribute to language-related tasks. These reported agreements between network groupings based on functionality (as measured by paradigm and behavioral domain) and levels of stable inter-connectivity suggest that networks that share a common functional space (e.g., motor-visual integration, language) also share stable communication pathways, despite appearing as separate entities in resting state analyses that do not focus on the dynamic aspects of connectivity. Nonetheless, it is worth noticing that the other two multi-network functional spaces defined by Laird et al. (2011), namely emotion/interoception and visual, did not show such a clear pattern of stable interconnectivity between networks.

#### **MOST VARIABLE POSITIVE CONNECTIONS**

Most variable positive connections correspond primarily to inter-network, inter-hemispheric connections involving nodes from the fronto-parietal networks (FPR: 9 connections; FPL: 1 connection) and the visual networks (VS3: 7 connections; VS2: 3 connections; VS1: 5 connections). It has been previously shown that the fronto-parietal network is composed of flexible hub regions that can reconfigure their functional connectivity rapidly in order to adapt and participate in a great variety of externally driven tasks (Cole et al., 2013). Our results suggest that such flexibility can also be observed during undirected cognition while resting, and not solely in situations requiring highly adaptive task control. Moreover, a recent study showed that subjects engage and transition between many different mental activities while resting in the scanner (Delamillieure et al., 2010). The three most common mental activities reported by this pool of 180 subjects were visual imagery, inner speech, and somatosensory awareness. All but one of the across-network connections involving the fronto-parietal network also involve nodes from the visual and SPP networks, which are directly related to these mental activities commonly reported by subjects after rest scans. Lastly, additional connections belonging to this category outside the fronto-parietal network correspond primarily to connections between occipital regions and nodes from the DMN, motor/visuospatial networks, and the emotion/interoception networks (as described by Laird et al., 2011). Some of these areas, in particular DMN and heteromodal occipital regions, overlap with areas described as part of the "Zone of Instability" (regions with more temporally variable connections between them) by Allen et al. (2014).

Although high temporal variability makes these connections a difficult target for study, the fact that such high volatility was consistent across all subjects in our pool suggests that these connections may constitute good targets for some technical and clinical applications. First, the pool of 23 connections identified as most variable across all subjects may constitute a good set of "worst-case scenario" targets for reproducibility studies and/or optimization of parameters such as scan duration. They could help obtain conservative bound values for such parameters. Moreover, the ability of certain regions to flexibly reconfigure their connectivity patterns has been shown to be directly related to the capacity to learn new motor skills (Bassett et al., 2011). Finally, Mueller et al. (2013) recently showed that areas with the largest levels of inter-subject variability in stationary global connectivity patterns correspond primarily to heteromodal association cortex in lateral pre-frontal cortex, the temporal-parietal junction, fronto-parietal control regions, and attention network areas (as defined by Yeo et al., 2011). They also reported a large degree of overlap between these regions of high functional connectivity variability and a brain map obtained from a meta-analysis of areas that predict individual differences in several cognitive and behavioral domains (e.g., personality traits, intelligence, memory performance, etc.). Many of the connections classified as most variable in our study are between ROIs located in the areas and networks of high variability reported by Mueller and colleagues. This suggests that short-term temporal variability in connectivity patterns (as observed here) may be partially responsible for the inter-subject differences in functional connectivity observed at longer temporal scales, which may in turn be related to individual differences in cognition and behavior.
Given the consistently high temporal instability of these connections across all our healthy subjects, it would be interesting to study if temporal variability is somehow impaired or increased in populations with some level of cognitive decline, and in that manner evaluate the potential diagnostic power of the dynamic behavior of rsfMRI connectivity.

#### **NEGATIVE CONNECTIONS**

Of the 32 connections with negative CVAR in all participants, 26 correspond to connections involving two medial ROIs that are part of the EI2 network. The first ROI (with 21 negative connections) spans a large range of small anatomical structures, including the mammillary bodies, the hypothalamus, medial portions of the caudate, the fornix, and the third ventricle. The second ROI (with 5 negative connections) is located just posterior to the first and covers large portions of the bilateral thalamus. Correlation maps between each ROI's representative time series and all ROI voxels (**Figure 12**) show that the voxels contributing most to the representative time series fall primarily within or around the third ventricle. This is particularly true for the ROI with 21 negative CVAR connections. This pattern suggests that negative correlations between these ROIs and other brain regions are not the result of anti-correlation between GM structures within the ROIs and other brain regions, but a result of the regression of CSF signals during pre-processing (Saad et al., 2012). Here, the CSF signals may have been contaminated by signals from neighboring tissues because of the relatively large voxel size used. In fact, when the removal of CSF signal is omitted from the analysis pipeline, only three connections with negative CVAR remain, thereby supporting the potential artifactual origin of the average negative behavior observed for these connections. Conversely, the general patterns described for the other two connection types (most stable and most variable) remain consistent when CSF is not removed during the analysis.

It is also worth noting that while omitting the step concerning the removal of CSF signals led to the disappearance of the majority of connections with an average negative correlation (and therefore negative CVAR), we nevertheless observed many connections alternating between positive and negative connectivity for short periods, regardless of CSF signal removal. This is in agreement with prior observations of this phenomenon in studies on functional connectivity dynamics (Chang and Glover, 2010; Hutchison et al., 2013b).
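The link between the sign of the average windowed correlation and the sign of CVAR can be illustrated with a short sketch. It assumes that CVAR is the coefficient of variation (standard deviation divided by mean) of the correlation values obtained in non-overlapping windows; the function name `windowed_cvar` and the synthetic time series are hypothetical and are not the pipeline used in the study.

```python
import numpy as np

def windowed_cvar(ts_a, ts_b, win_len):
    """Coefficient of variation of windowed Pearson correlations.

    ts_a, ts_b : 1D arrays holding two ROI time series of equal length.
    win_len    : window length in samples (non-overlapping windows).
    Returns (cvar, per-window correlations).
    """
    n_win = len(ts_a) // win_len
    r = np.empty(n_win)
    for w in range(n_win):
        sl = slice(w * win_len, (w + 1) * win_len)
        r[w] = np.corrcoef(ts_a[sl], ts_b[sl])[0, 1]
    # Assumed definition: std / mean, so a negative mean gives a negative CVAR.
    return r.std(ddof=1) / r.mean(), r

# Synthetic example (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
n = 2400
shared = rng.standard_normal(n)
roi_a = shared + 0.3 * rng.standard_normal(n)    # strongly coupled to roi_b
roi_b = shared + 0.3 * rng.standard_normal(n)
roi_c = -shared + 3.0 * rng.standard_normal(n)   # weakly anti-correlated with roi_a

cvar_stable, _ = windowed_cvar(roi_a, roi_b, win_len=120)
cvar_negative, _ = windowed_cvar(roi_a, roi_c, win_len=120)
```

Under this definition, a strongly and consistently correlated pair yields a small positive CVAR, whereas a pair whose windowed correlations average below zero yields a negative CVAR regardless of how much they fluctuate.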

#### **STABILITY OF WITHIN-SUBJECT CONNECTIVITY PATTERNS vs. WINDOW DURATION**

In addition to classifying connections in the three above-mentioned groups, we also evaluated how window length (used here as a proxy for scan duration) affects the within-subject similarity of whole-brain connectivity patterns. We found two general regimes. For durations below approximately 6 min, similarity of within-subject whole-brain connectivity matrices decreases quickly as window length decreases. Conversely, for durations above 10 min, the rate at which similarity increases with scan duration is much slower. This result suggests that if stability is a factor of interest (e.g., in longitudinal studies), using longer scans is desirable, particularly above approximately 10 min. Most previous studies of rsfMRI reproducibility have used shorter scans and focused on a handful of connections when evaluating the temporal stability of rsfMRI as a function of scan duration. Van Dijk et al. (2010) concluded that stable measures of connectivity can be obtained with scans as short as 5 min. This conclusion was based on how scan duration affected average within- and between-network correlations for only three networks (default mode, dorsal attention, and a reference network consisting of auditory, motor, and visual regions). Nevertheless, Birn et al. (2013) more recently concluded that increasing scan length from 5 to 13 min greatly improved reproducibility. In this case, the authors studied all potential connections between 17 different ROIs. Using a completely different approach, Anderson et al. (2011) found that obtaining functional connectivity "fingerprints" that uniquely identified each participant required a minimum of approximately 15 min of data. Despite differences in scanning and analytical procedures, our results are in better agreement with those of Anderson et al. (2011) and Birn et al. (2013), which are based on larger samples of connections.
This suggests that a minimum of approximately 10 min is desirable for good reproducibility, and that reproducibility keeps increasing at a lower rate for yet longer scan durations. Collectively, these results also highlight how suggested scan duration will depend on the target networks under analysis.
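The dependence of within-subject similarity on window length can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes that similarity is measured as the Pearson correlation between the vectorized upper triangles of connectivity matrices computed in non-overlapping windows, and the function name `mean_window_similarity` and the synthetic data are hypothetical. Because the synthetic data are stationary, the sketch captures only the estimation-noise component of the effect, not genuine connectivity dynamics.

```python
import numpy as np

def mean_window_similarity(data, win_len):
    """Mean pairwise similarity of windowed connectivity matrices.

    data    : array of shape (n_timepoints, n_rois).
    win_len : window length in samples (non-overlapping windows).
    """
    n_t, n_rois = data.shape
    n_win = n_t // win_len
    iu = np.triu_indices(n_rois, k=1)
    # One vectorized connectivity pattern (upper triangle) per window.
    vecs = np.array([
        np.corrcoef(data[w * win_len:(w + 1) * win_len].T)[iu]
        for w in range(n_win)
    ])
    sim = np.corrcoef(vecs)  # window-by-window similarity matrix
    return sim[np.triu_indices(n_win, k=1)].mean()

# Synthetic, stationary data (hypothetical): 10 ROIs driven by 4 latent sources.
rng = np.random.default_rng(1)
n_t, n_rois = 1200, 10
mixing = rng.standard_normal((4, n_rois))
data = rng.standard_normal((n_t, 4)) @ mixing + rng.standard_normal((n_t, n_rois))

sim_short = mean_window_similarity(data, win_len=30)
sim_long = mean_window_similarity(data, win_len=300)
```

With these assumptions, longer windows give more similar connectivity patterns simply because each correlation is estimated from more samples, which is one reason the similarity curve rises steeply at short durations and flattens at longer ones.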

#### **LIMITATIONS OF THE STUDY**

In this study we did not record any measure of vigilance (e.g., eye tracking system, concurrent EEG recordings). Given the duration of the scans and that subjects were instructed to keep their eyes closed, it is very likely that our subjects went through some periods of sleep or decreased vigilance during the 60 min scans, despite being instructed to stay awake. Changes in vigilance or sleep are known to affect connectivity patterns measured with fMRI (Horovitz et al., 2009; Tagliazucchi et al., 2012). To partially evaluate the effect of this potential confound, we performed the analysis again using the first and last halves of the time series separately, under the assumption that periods of drowsiness will become more frequent as scanning progresses. When the data was split in this manner, the spatial patterns of connectivity per connection category and the bulk differences in number of connections per category remain very similar to those reported for the whole-run analysis (see Supplementary Figure 1). This suggests that although the classification of specific connections may be affected by this factor, the overall patterns discussed above remain present. Nevertheless, a better-controlled experiment with information about when these changes in vigilance occur may help better elucidate the origin of the patterns observed here. Also, restricting the analysis to periods of equal vigilance levels may help increase the number of patterns found to be common across subjects.

Another important factor to consider is how ROI and network templates used during the analysis affect interpretation of the data. We used a functionally-based atlas for the purpose of aggregating voxels into functionally homogeneous regions. Functionally-based atlases have been shown to outperform anatomically-based atlases at reproducing functional connectivity patterns present at the voxel level (Craddock et al., 2012) and when attempting to decode cognitive states based on measures of connectivity (Shirer et al., 2012). In particular, the 150 ROI atlas was selected because it provided a good compromise between ROI size (sufficient functional homogeneity), computational tractability, and interpretability of the results. Using more fine-grained ROIs may allow detection of additional patterns of interest, and additional studies should be conducted to evaluate the robustness of the results presented here against the use of different parcellation schemes (Yeo et al., 2011; Shirer et al., 2012).

In a similar manner, the Laird et al. (2011) ICN templates were chosen to aid with interpretation given their behavioral correlates. Our discussion regarding the temporal stability of within- and across-network communication pathways heavily relies on the assignment of ROIs to these networks. Differences in network definition, and subsequent distribution of ROIs across them, may affect the conclusions. As of today, the fMRI community still debates which is the most informative decomposition level, or levels, to study resting state connectivity, as the configuration of networks heavily depends on this parameter (Abou-Elseoud et al., 2010). Moreover, there is an avid debate regarding the actual configuration of the well-studied default mode network (Buckner et al., 2008; Liu and Duyn, 2013). Comparative analyses between measures of temporal stability, such as the ones presented here, and network definitions obtained at different decomposition levels may help determine the most appropriate levels of brain parcellation.

#### **CONCLUSIONS**

We used a sliding window analysis to attempt a basic characterization of BOLD resting state connectivity dynamics. We found three well-differentiated sets of connections, whose temporal variability patterns were reproducible across all participants and have distinct spatial patterns. First, most stable connections were found to correspond primarily to symmetric, inter-hemispheric connections both within and across networks. We found that primary sensory-motor networks seem to be more temporally stable in their connectivity patterns than those more closely related to higher order cognitive processes. Second, most variable connections were found to correspond primarily to non-symmetric, inter-hemispheric, across-network connections between occipital and frontal regions. The number of connections consistently among the most variable group across all subjects was much lower than the number of connections among the most stable, suggesting subject-dependent, ongoing cognitive variables have a strong effect on the configuration of flexible connections in the brain. Finally, a small set of connections was found to have negative average connectivity across time, though a large percentage of these were identified as potential artifacts. All these general patterns were present for window lengths ranging from 1 to 3 min.

We also used the current dataset to evaluate how whole-brain, within-subject similarity of connectivity patterns varies as a function of window duration. This applies to studies where the focus is not on the dynamic behavior of connections, but on overall stable patterns that arise when full scans enter the analysis. Our results suggest that in order to maximize similarity of overall whole-brain connectivity, rest scans should last as long as possible, with clear stability benefits for 10 min rather than 5 min scans.

# **ACKNOWLEDGMENTS**

This research was possible thanks to the support of the NIMH-IRP. Portions of this study utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, MD (http://biowulf.nih.gov). Also, we would like to acknowledge Dr. Gang Chen from the Statistical and Scientific Computing Core for his valuable input.

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnins.2014.00138/abstract

# **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 28 February 2014; accepted: 18 May 2014; published online: 11 June 2014. Citation: Gonzalez-Castillo J, Handwerker DA, Robinson ME, Hoy CW, Buchanan LC, Saad ZS and Bandettini PA (2014) The spatial structure of resting state connectivity stability on the scale of minutes. Front. Neurosci. 8:138. doi: 10.3389/fnins.2014.00138*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Gonzalez-Castillo, Handwerker, Robinson, Hoy, Buchanan, Saad and Bandettini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Cortical connective field estimates from resting state fMRI activity

### *Nicolás Gravel<sup>1,2,5\*</sup>, Ben Harvey<sup>3</sup>, Barbara Nordhjem<sup>1</sup>, Koen V. Haak<sup>4</sup>, Serge O. Dumoulin<sup>3</sup>, Remco Renken<sup>5</sup>, Branislava Ćurčić-Blake<sup>5</sup> and Frans W. Cornelissen<sup>1,5</sup>*

*<sup>1</sup> Laboratory of Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands*

*<sup>2</sup> Laboratorio de Circuitos Neuronales, Centro Interdisciplinario de Neurociencia, Pontificia Universidad Católica de Chile, Santiago, Chile*

*<sup>3</sup> Experimental Psychology, Helmholtz Institute, Utrecht University, Utrecht, Netherlands*

*<sup>4</sup> Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands*

*<sup>5</sup> NeuroImaging Center, University Medical Center Groningen, University of Groningen, Netherlands*

#### *Edited by:*

*Christopher W. Tyler, The Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*Javier Gonzalez-Castillo, National Institute of Mental Health, USA*

*Jonathan Winawer, New York University, USA*

#### *\*Correspondence:*

*Nicolás Gravel, Laboratory of Experimental Ophthalmology, Neuroimaging Center, University Medical Center Groningen, University of Groningen, Antonius Deusinglaan 2, Groningen, 9713 AW, Netherlands e-mail: n.gravel@umcg.nl*

One way to study connectivity in visual cortical areas is by examining spontaneous neural activity. In the absence of visual input, such activity remains shaped by the underlying neural architecture and, presumably, may still reflect visuotopic organization. Here, we applied population connective field (CF) modeling to estimate the spatial profile of functional connectivity in the early visual cortex during resting state functional magnetic resonance imaging (RS-fMRI). This model-based analysis estimates the spatial integration between blood-oxygen level dependent (BOLD) signals in distinct cortical visual field maps using fMRI. Just as population receptive field (pRF) mapping predicts the collective neural activity in a voxel as a function of response selectivity to stimulus position in visual space, CF modeling predicts the activity of voxels in one visual area as a function of the aggregate activity in voxels in another visual area. In combination with pRF mapping, CF locations on the cortical surface can be interpreted in visual space, thus enabling reconstruction of visuotopic maps from resting state data. We demonstrate that V1 ➤ V2 and V1 ➤ V3 CF maps estimated from resting state fMRI data show visuotopic organization. Therefore, we conclude that—despite some variability in CF estimates between RS scans—neural properties such as CF maps and CF size can be derived from resting state data.

**Keywords: RS-fMRI, population receptive fields, connective field modeling, connectivity mapping, visuotopic maps**

#### **INTRODUCTION**

The human visual cortex is a highly complex and interconnected system operating at various temporal and spatial scales, and as such, non-invasive assessment of the neural correlates of human visual processing is of great importance. A significant contribution toward understanding human visual processing can be made by studying cortico-cortical interactions between different visual areas (Heinzle et al., 2011; Haak et al., 2013; Raemaekers et al., 2013). One way to study these neural correlates is by examining spontaneous blood-oxygen level dependent (BOLD) co-fluctuations during resting state (Heinzle et al., 2011; Raemaekers et al., 2013). Given that resting state BOLD fluctuations are partly shaped by the underlying functional and neuroanatomical organization (Biswal et al., 1997; Logothetis, 1998; Raichle et al., 2001; Boly et al., 2007; Deco et al., 2011; Hutchison et al., 2013a; Wang et al., 2013), analysis of resting state activity offers a possibility to examine intrinsic functional connectivity of the visual system as well as the extent of variability of these processes.

Although functional magnetic resonance imaging (fMRI) indirectly measures neural activity, accurate methods to map neural response selectivity in the early visual cortex from the BOLD signal have been developed (Engel et al., 1997; Smith et al., 2001; Dumoulin and Wandell, 2008). With these methods, the unifying concept of classical receptive field (Hubel and Wiesel, 1962) has found its place in fMRI, under the definition of population receptive field (pRF). The term pRF was first used to describe population encoding in macaque early visual areas (Victor et al., 1994). Used in fMRI, the term describes the aggregate responses of fMRI recording sites (voxels) to presented stimuli, in terms of the position and size of the visual field area to which each recording site responds.

The parametric modeling approach of the pRF technique has allowed non-invasive investigation of neural response selectivity, its cortical organization, and the computational properties of the visual system. A recent complementary method, called connective field (CF) modeling (Haak et al., 2013), extends this type of analysis to model cortico-cortical interactions in terms of spatially localized patterns of functional connectivity. Specifically, this method enables characterization of a recording site in terms of aggregate cortical activity in another brain area, thus extending the concept of receptive field from a description of preferred locations in visual (stimulus) space to preferred locations on the cortical surface.

CF modeling was originally conceived as a method to analyze responses evoked by visual field mapping (VFM) stimuli, though the analysis does not use a description of the stimulus. As such, it could in principle be applied to explore cortico-cortical connectivity profiles during different experimental conditions as well as resting state. To realize this potential, a number of questions must be addressed. In this paper, we try to provide answers to at least four of them. First, how do we measure CF models in the presence of substantial physiological measurement noise? Second, how much scan time is sufficient to achieve accurate discrimination of CF models obtained from resting state data? Third, how do CF parameters obtained from resting state compare to those obtained from stimulus-evoked activity? Fourth, to what extent do CF parameters vary between resting state scans?

While previous studies have examined cortico-cortical interactions in the early visual cortex during resting state (Heinzle et al., 2011; Raemaekers et al., 2013), our current study focuses on the application of the CF method. These previous studies used model-free approaches whereas the CF method is a model-based approach. To the extent that the model adequately describes the underlying neuronal activity, model-based approaches provide summary descriptions of aggregate neural activity, which is another reason to examine the application of the CF method to analyze resting state fMRI data.

# **MATERIALS AND METHODS**

#### **PARTICIPANTS**

We recruited four subjects with normal visual acuity (age: S1 = 26, S2 = 30, S3 = 31, S4 = 40 years old). Experimental procedures were approved by the medical ethics committee of the University Medical Center Utrecht.

#### **STIMULUS**

Visual stimuli were presented by back-projection onto a 15.0 × 7.9 cm gamma-corrected screen inside the MRI bore. The subject viewed the display through prisms and mirrors, and the total distance from the subject's eyes (in the scanner) to the display screen was 36 cm. Visible display resolution was 1024 × 538 pixels. The stimuli were generated in Matlab (Mathworks, Natick, MA, USA) using the PsychToolbox (Brainard, 1997; Pelli, 1997). The mapping paradigm consisted of drifting bar apertures at various orientations, which exposed a 100% contrast checkerboard moving parallel to the bar orientation. After each horizontal or vertical bar orientation pass, 30 s of mean-luminance stimulus were displayed. Subjects fixated a dot in the center of the visual stimulus. The dot changed colors between red and green at random intervals. To ensure attention was maintained, subjects pressed a button on a response box every time the color changed (detailed procedures can be found in Dumoulin and Wandell, 2008; Harvey and Dumoulin, 2011). The radius of the stimulation area covered 6.25° of visual angle from the fixation point.
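As a quick consistency check on the stimulus geometry, the visual angle subtended by the screen can be computed from the reported dimensions and viewing distance. The sketch below assumes that fixation was at the screen center and that the stimulation radius was bounded by half the screen height; both are inferences on our part and are not stated in the text. The function name `visual_angle_deg` is hypothetical.

```python
import math

def visual_angle_deg(extent_cm, distance_cm):
    """Angle (in degrees) subtended at the eye by an extent measured from fixation."""
    return math.degrees(math.atan(extent_cm / distance_cm))

screen_w_cm, screen_h_cm = 15.0, 7.9   # screen size reported in the text
viewing_distance_cm = 36.0             # eye-to-screen distance reported in the text

# Assumption: fixation at the screen center, radius limited by half the height.
radius_deg = visual_angle_deg(screen_h_cm / 2, viewing_distance_cm)
half_width_deg = visual_angle_deg(screen_w_cm / 2, viewing_distance_cm)

print(f"half-height: {radius_deg:.2f} deg, half-width: {half_width_deg:.2f} deg")
```

Under these assumptions the half-height comes out at roughly 6.26°, in line with the reported 6.25° radius, while the half-width is close to 11.8°.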

#### **RESTING STATE**

During the resting state scans, the stimulus was replaced with a black screen and subjects closed their eyes. We chose this so that there was no visual input; neither from outside the stimulus area (hence eyes closed) nor from light coming through the eyelids (hence the black screen). The lights in the scanning room were off and blackout blinds removed light from outside the room. The room was in complete darkness.

#### **DATA ACQUISITION**

Functional T2\*-weighted 2D echo planar images were acquired on a 7 Tesla scanner (Philips, Best, Netherlands) using a 32 channel head coil at a voxel resolution of 1.98 × 1.98 × 2.00 mm, with a field of view of 190 × 190 × 50 mm. TR was 1500 ms, TE was 25 ms, and flip angle was 80°. The volume orientation differed between subjects, though in all cases it was approximately perpendicular to the calcarine sulcus. High resolution T1-weighted structural images were acquired at 7T using a 32 channel head coil at a resolution of 0.49 × 0.49 × 0.80 mm, with a field of view of 252 × 252 × 190 mm. TR was 7 ms, TE was 2.84 ms, and flip angle was 8°. We compensated for intensity gradients across the image using an MP2RAGE sequence, dividing the T1 by a co-acquired proton density scan of the same resolution, with a TR of 5.8 ms, a TE of 2.84 ms, and a flip angle of 1°. In total, eight 240-volume functional scans were acquired, comprising 5 resting state scans (RS) and 3 interleaved VFM scans. The first scan was a RS scan. Physiological data were not collected.

#### **PREPROCESSING**

First, the T1-weighted structural volumes were resampled to 1 mm isotropic voxel resolution. Gray and white matter were automatically segmented using FreeSurfer and hand-edited in ITKGray to minimize segmentation errors (Teo et al., 1997). The cortical surface was reconstructed at the white/gray matter boundary and rendered as a smoothed 3D mesh (Wandell et al., 2007). Motion correction within and between scans was applied for the VFM and the RS scans (Nestares and Heeger, 2000). To remove DC baseline drift from the resting state signals and reduce high-frequency nuisance signals from physiological variation, time courses were band-pass filtered with a high-pass discrete cosine transform (DCT) filter with a cut-off frequency of 0.01 Hz and a low-pass 4th order Butterworth filter with a cut-off frequency of 0.1 Hz. Finally, functional data were aligned to the anatomical scans (Nestares and Heeger, 2000) and interpolated to the anatomical segmentation space.
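The temporal filtering step can be sketched as follows. This is a minimal illustration assuming NumPy and SciPy, not the implementation used in the study (which the text does not detail). The helper names `dct_highpass` and `butter_lowpass`, the order in which the two filters are applied, and the use of zero-phase filtering via `filtfilt` are our assumptions; only the cut-off frequencies, the Butterworth order, the TR, and the number of volumes come from the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt

TR = 1.5        # s, repetition time reported for the functional scans
FS = 1.0 / TR   # sampling rate in Hz

def dct_highpass(ts, cutoff_hz, tr):
    """High-pass filter by regressing out low-frequency DCT-II basis functions."""
    n = len(ts)
    # DCT-II basis function k has frequency k / (2 * n * tr) Hz.
    n_basis = int(np.floor(2 * n * tr * cutoff_hz)) + 1  # includes the constant term
    t = np.arange(n)
    basis = np.column_stack(
        [np.cos(np.pi * (2 * t + 1) * k / (2 * n)) for k in range(n_basis)]
    )
    beta, *_ = np.linalg.lstsq(basis, ts, rcond=None)
    return ts - basis @ beta

def butter_lowpass(ts, cutoff_hz, fs, order=4):
    """4th order Butterworth low-pass; filtfilt (zero phase) is an assumption
    and doubles the effective attenuation of the filter."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fs)
    return filtfilt(b, a, ts)

# Synthetic 240-volume time course: in-band signal + slow drift + fast nuisance.
t = np.arange(240) * TR
signal_band = np.sin(2 * np.pi * 0.05 * t)             # inside the 0.01-0.1 Hz band
drift = 2.0 * np.sin(2 * np.pi * 0.002 * t) + 5.0      # baseline offset and slow drift
fast = np.sin(2 * np.pi * 0.25 * t)                    # above the low-pass cut-off
raw = signal_band + drift + fast

clean = butter_lowpass(dct_highpass(raw, 0.01, TR), 0.1, FS)
```

With 240 volumes and a TR of 1.5 s, the DCT stage removes the constant term plus seven cosine components below 0.01 Hz; the Nyquist frequency is about 0.33 Hz, so the 0.1 Hz low-pass cut-off is well within the representable range.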

#### **ANALYSIS**

#### *Population receptive field mapping*

Early visual areas V1, V2, and V3 were mapped using the pRF method (Dumoulin and Wandell, 2008). The method uses a parameterized forward model of the underlying neuronal population, a description of the hemodynamic response function (HRF), and the stimulus aperture. The model we chose corresponds to a circular Gaussian characterized by three parameters: x and y (position) and size (σ). A set of candidate pRF models is combined with the stimulus aperture to generate predictions of the neural responses each candidate pRF would produce. Subsequent convolution of each predicted neural response time course with the HRF gives a set of candidate predicted fMRI response time courses, one for each combination of pRF parameters. The best-fitting predicted fMRI time course and its associated pRF parameters are then chosen to summarize the response of each recording site (Dumoulin and Wandell, 2008).
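
The forward-model logic described above can be sketched as follows. This is a toy illustration under stated assumptions, not the implementation used in the study: the stimulus is a coarse stepping bar, the HRF is an arbitrary short kernel, the goodness of fit is taken as the squared Pearson correlation, and all function names are hypothetical.

```python
import math

def gaussian_prf(x0, y0, sigma, grid):
    """Circular Gaussian pRF sampled at each (x, y) position of the grid."""
    return [math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
            for x, y in grid]

def predict_bold(prf, apertures, hrf):
    """pRF-aperture overlap per volume, then causal convolution with the HRF."""
    neural = [sum(w * a for w, a in zip(prf, frame)) for frame in apertures]
    return [sum(hrf[j] * neural[t - j] for j in range(min(t + 1, len(hrf))))
            for t in range(len(neural))]

def variance_explained(pred, data):
    """Squared Pearson correlation (assumed fit metric; negative fits score 0)."""
    n = len(data)
    mp, md = sum(pred) / n, sum(data) / n
    spp = sum((p - mp) ** 2 for p in pred)
    sdd = sum((d - md) ** 2 for d in data)
    spd = sum((p - mp) * (d - md) for p, d in zip(pred, data))
    return 0.0 if spp == 0 or sdd == 0 or spd <= 0 else spd ** 2 / (spp * sdd)

def fit_prf(measured, candidates, grid, apertures, hrf):
    """Return the candidate (x0, y0, sigma) whose prediction best fits the data."""
    def score(c):
        pred = predict_bold(gaussian_prf(*c, grid), apertures, hrf)
        return variance_explained(pred, measured)
    return max(candidates, key=score)

# Toy stimulus: a one-sample-wide bar stepping across x, then across y.
coords = range(-5, 6)
grid = [(x, y) for x in coords for y in coords]
apertures = [[1.0 if x == b else 0.0 for x, y in grid] for b in coords]
apertures += [[1.0 if y == b else 0.0 for x, y in grid] for b in coords]
hrf = [0.0, 0.5, 1.0, 0.5, 0.2]  # toy kernel, not a canonical HRF

# Simulated recording site with a known pRF, recovered by grid search.
measured = predict_bold(gaussian_prf(2, -1, 1.5, grid), apertures, hrf)
candidates = [(x0, y0, s) for x0 in range(-4, 5) for y0 in range(-4, 5)
              for s in (0.5, 1.5, 3.0)]
best = fit_prf(measured, candidates, grid, apertures, hrf)
```

Because the simulated data are noise-free and the true parameters are among the candidates, the grid search recovers them exactly. With real data the best candidate only approximates the underlying pRF.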

#### *Connective field mapping*

CF model parameters were estimated for both the VFM and RS scans using the CF modeling method described by Haak et al. (2013). CF models summarize the activity of each recording site in a target region of interest (ROI) in terms of the aggregate activity contributed by a set of recording sites in a source ROI (Haak et al., 2013). Specifically, the BOLD activity over a particular part of a source region (the CF) is integrated (summed) to yield the BOLD activity at a target recording site, whose neural response we are trying to describe. As we aim to determine the source CF for all target recording sites within an ROI simultaneously, we define a target visual field map ROI (i.e., V2 or V3). As candidate source CFs are limited to a particular visual field map, this map is referred to as the source ROI (here, always V1). First, a discrete parameter space of 2-dimensional Gaussians of different candidate sizes (σ) is generated for each candidate location (each recording site inside the source ROI, V1), giving a set of candidate V1-referred CF models. In the next step, similarly to the pRF approach, a candidate predicted time course is generated for each candidate CF model by calculating the Gaussian-weighted sum of the measured signals from the candidate CF (including the preferred recording site and its neighbors). These candidate time course predictions are compared to the measured time course of each recording site in the target ROI (V2 and V3), and the best-fitting prediction and its associated V1-referred CF parameters are chosen for each target recording site. Furthermore, because CF preferred locations on the V1 cortical surface are associated with preferred visual field positions during pRF mapping, coordinates in visual space can be inferred for target recording sites. This allows the reconstruction of visuotopic maps even in the absence of stimuli.
Note that the size of a CF represents the Gaussian spread along the cortical surface (in mm), with distance measured as the shortest path between pairs of vertices in the 3D mesh associated with the gray/white matter border. The location and size of the ROIs are defined during pRF mapping. These parameters (location and size of the source ROI) may restrict CF position but not CF size. By emphasizing the spatial profile of functional connectivity, a CF allows the examination of spatially localized connectivity patterns among brain areas. As with most functional connectivity measures, CF models do not infer the temporal order of the responses in target and source recording sites.
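
The CF fitting procedure can be summarized in a short sketch. It is a simplified illustration, assuming a precomputed matrix of cortical distances between source vertices and using the squared Pearson correlation as the variance explained. The function names, the toy one-dimensional "cortex", and the candidate sizes are hypothetical.

```python
import math
import random

def cf_prediction(center, sigma, dist, source_ts):
    """Gaussian-weighted sum of source time courses around a center vertex."""
    weights = [math.exp(-dist[center][v] ** 2 / (2 * sigma ** 2))
               for v in range(len(source_ts))]
    n_vols = len(source_ts[0])
    return [sum(w * ts[t] for w, ts in zip(weights, source_ts))
            for t in range(n_vols)]

def variance_explained(pred, data):
    """Squared Pearson correlation (assumed fit metric; negative fits score 0)."""
    n = len(data)
    mp, md = sum(pred) / n, sum(data) / n
    spp = sum((p - mp) ** 2 for p in pred)
    sdd = sum((d - md) ** 2 for d in data)
    spd = sum((p - mp) * (d - md) for p, d in zip(pred, data))
    return 0.0 if spp == 0 or sdd == 0 or spd <= 0 else spd ** 2 / (spp * sdd)

def fit_cf(target_ts, dist, source_ts, sigmas):
    """Exhaustive search over candidate centers and sizes: (center, sigma, VE)."""
    best = (None, None, -1.0)
    for center in range(len(source_ts)):
        for sigma in sigmas:
            pred = cf_prediction(center, sigma, dist, source_ts)
            ve = variance_explained(pred, target_ts)
            if ve > best[2]:
                best = (center, sigma, ve)
    return best

# Toy source ROI: 6 vertices on a line, 1 mm apart, with random time courses.
rng = random.Random(0)
n_src, n_vols = 6, 60
dist = [[abs(i - j) for j in range(n_src)] for i in range(n_src)]
source_ts = [[rng.gauss(0, 1) for _ in range(n_vols)] for _ in range(n_src)]

# Toy target site driven by a CF centered on vertex 3 with sigma = 1 mm.
target_ts = cf_prediction(3, 1.0, dist, source_ts)
center, sigma, ve = fit_cf(target_ts, dist, source_ts, sigmas=(0.5, 1.0, 2.0))
```

The search returns the center vertex and size that generated the toy target signal. On real data, the visual field position of a target site would then be read from the pRF parameters of the winning V1 vertex.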

#### *Discriminability criterion*

By emphasizing local over long-range functional connectivity, biologically inspired models like pRF and CF are generally robust to global effects (e.g., physiological noise). Nevertheless, evaluation of model significance can be frustrated by the noisy and non-stationary nature of the time series obtained during resting state. To overcome this issue and assess the statistical significance of CF models estimated from the RS data, we apply a strategy based on surrogate data testing.

First, we distinguish the contribution of topographically organized BOLD co-fluctuations from that of spatially uncorrelated random BOLD fluctuations. This distinction allows us to define a criterion in terms of model discriminability: the degree to which models driven by topographically organized BOLD co-fluctuations can be told apart from models driven by spatially uncorrelated random BOLD fluctuations. To determine model discriminability, we estimated null distributions from the variance explained (VE) of CF models obtained from surrogate V1 BOLD time courses. To generate these surrogate BOLD signals, artificial time courses were produced with the iterative amplitude adjusted Fourier transform (iAAFT) method (Schreiber and Schmitz, 1996; Venema et al., 2006). This method randomizes the phase of the original signal but preserves its autocorrelation, linear structure, and amplitude distribution. As a result, the spatial correlation between BOLD time courses in the source region is lost, while their fundamental statistical properties are preserved. Each CF model estimation was accompanied by an estimation based on surrogate time courses. For the present analysis, the null distributions obtained from 240 volumes (each RS scan) were comparable across subjects and target ROIs (V2 and V3); therefore, we combined all estimates into one null distribution and used the 5th percentile as the discrimination threshold.
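
The surrogate generation step can be sketched as below. This is a minimal iAAFT illustration for a single short time course, using a plain O(N²) discrete Fourier transform to stay dependency-free. The fixed iteration count, the seed handling, and the toy autoregressive input are assumptions; a practical implementation would use an FFT and a convergence criterion.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform (adequate for short toy series)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def iaaft(x, n_iter=20, seed=0):
    """Surrogate with the amplitude distribution of x and roughly its spectrum."""
    rng = random.Random(seed)
    sorted_x = sorted(x)
    target_amp = [abs(c) for c in dft(x)]
    s = list(x)
    rng.shuffle(s)  # destroys temporal structure; the loop restores the spectrum
    for _ in range(n_iter):
        # Step 1: impose the original Fourier amplitudes, keep current phases.
        spec = [a * cmath.exp(1j * cmath.phase(c))
                for a, c in zip(target_amp, dft(s))]
        s = [v.real for v in idft(spec)]
        # Step 2: impose the original amplitude distribution by rank ordering.
        order = sorted(range(len(s)), key=lambda i: s[i])
        remapped = [0.0] * len(s)
        for rank, i in enumerate(order):
            remapped[i] = sorted_x[rank]
        s = remapped
    return s

# Toy autocorrelated series standing in for one V1 BOLD time course.
rng = random.Random(1)
x = [0.0]
for _ in range(63):
    x.append(0.8 * x[-1] + rng.gauss(0, 1))
surrogate = iaaft(x)
```

Applying this independently to every source time course (with a different seed each time) keeps each signal's own statistics while removing the correlations between neighboring recording sites, which is the property the null distribution relies on.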

Second, we estimated the amount of data that is sufficient to discriminate RS-based CF models by examining the dependence of discrimination accuracy on data quantity. To this end, CF models were calculated for different amounts of RS data (both for original and for surrogate data). Segments of 40, 80, 120, 160, 200, and 240 volumes starting from the beginning of each RS scan were used. Next, VE estimates (adjusted for the degrees of freedom associated with each number of volumes) were grouped according to their corresponding segment length, yielding original and null VE distributions for each number of volumes. These distributions allow the application of a receiver operating characteristic (ROC) analysis. By assessing the performance of a binary classifier as its discrimination threshold is varied, ROC analysis provides quantitative measures of model discrimination performance. To discriminate CF models attributed to genuine BOLD co-fluctuations from those attributed to random BOLD activity, the corresponding VE cutoff threshold is moved from 0 to 1 across the original and the null distributions, producing a contingency matrix of true positives (*hits*), false positives (*false alarms*), true negatives (*correct rejections*), and false negatives (*misses*). Using the contingency matrix, values of true positive rate (*sensitivity*) and false positive rate (*1-specificity*) are computed and plotted as ROC curves. In ROC space, a diagonal line corresponds to random discrimination. The area under the ROC curve (AUC) is commonly used to quantify classifier discriminability, with a value of 0.5 corresponding to random, and a value of 1 to perfect, classification. We chose informedness as our discriminability index, which corresponds to twice the area between the curve and the diagonal: 2 × AUC − 1 (Hanley and McNeil, 1982; Fawcett, 2006). It has the advantage that 0 represents random, and 1 represents perfect, classification.
Finally, we estimated the dependence of discrimination accuracy on the VE cutoff threshold by calculating the F1 score for each number of volumes.
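
The ROC quantities defined above can be computed as in the following sketch. The VE values are made-up toy numbers, and the function names and threshold grid are illustrative assumptions.

```python
def roc_points(original_ve, null_ve, thresholds):
    """(false positive rate, true positive rate) for each VE cutoff."""
    return [(sum(v >= thr for v in null_ve) / len(null_ve),
             sum(v >= thr for v in original_ve) / len(original_ve))
            for thr in thresholds]

def auc(points):
    """Trapezoidal area under the ROC curve."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def f1_score(original_ve, null_ve, thr):
    """F1 = 2TP / (2TP + FP + FN) at a single VE cutoff."""
    tp = sum(v >= thr for v in original_ve)   # hits
    fp = sum(v >= thr for v in null_ve)       # false alarms
    fn = len(original_ve) - tp                # misses
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Toy VE values: models from original data vs. models from surrogate data.
original_ve = [0.3, 0.5, 0.7, 0.9]
null_ve = [0.2, 0.4, 0.6, 0.8]
# Cutoffs 0.00 ... 1.01, so that the curve reaches the (0, 0) corner.
thresholds = [i / 100 for i in range(102)]

area = auc(roc_points(original_ve, null_ve, thresholds))
informedness = 2 * area - 1
f1_at_035 = f1_score(original_ve, null_ve, 0.35)
```

For these toy distributions the AUC is 0.625, giving an informedness of 0.25, and the F1 score at a cutoff of 0.35 VE is 0.6. Fully separated distributions would give an AUC and informedness of 1.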

#### *Spatial analysis*

In the spatial domain, we estimate CF size change and position scatter during RS using VFM-based size and position as reference.

First, to assess CF position variability in the RS, we assume that CFs are topographically organized. This implies that neural activity in neighboring cortical locations in the target ROI may correlate with neural activity in neighboring cortical locations inside the source ROI that represent the same portions of visual space, as shown by VFM. This assumption allows us to estimate position variability as position scatter of V1-referred CFs by calculating their displacement on the V1 cortical surface with respect to their VFM-based reference positions.

We proceeded as follows: for each recording site in the target ROI, position scatter was calculated as the shortest distance along the cortical manifold between the VFM-based center position and the RS-based position. This distance was computed in millimeters using Dijkstra's algorithm (Dijkstra, 1959). Estimates whose associated models scored a VE above the discrimination threshold (0.35 VE) were retained. To quantify the variability in position scatter, the median (to assess central tendency) and the median absolute deviation (MAD; to assess dispersion) were calculated for each RS scan and subject. To assess RS scan-to-scan variability, we also calculated these values for all RS scan pairs. In order to determine a possible influence of cortical distance (e.g., shared vasculature, spatial blurring), we examined position scatter as a function of the distance between CF centers and their associated recording sites in the target area. We then examined position scatter as a function of VFM-based reference eccentricity. Finally, agreement in eccentricity estimates was quantified by calculating linear correlation coefficients for VFM- and RS-based eccentricities.
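
The position scatter computation can be sketched as follows, assuming the cortical mesh is available as a weighted graph whose edge lengths are in millimeters. The graph, the center assignments, and the function names are toy assumptions for illustration.

```python
import heapq
from statistics import median

def geodesic_distances(graph, start):
    """Dijkstra's algorithm; graph maps vertex -> [(neighbor, edge_length_mm)]."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph[v]:
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

def position_scatter(graph, vfm_centers, rs_centers, ve, ve_threshold=0.35):
    """Cortical displacement per target site, keeping models above threshold."""
    return [geodesic_distances(graph, vfm)[rs]
            for vfm, rs, v in zip(vfm_centers, rs_centers, ve)
            if v > ve_threshold]

def median_and_mad(values):
    """Median and median absolute deviation of a list of displacements."""
    med = median(values)
    return med, median(abs(v - med) for v in values)

# Toy mesh with four vertices; the direct 0-2 edge is longer than the 0-1-2 path.
graph = {
    0: [(1, 1.0), (2, 3.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(0, 3.0), (1, 1.0), (3, 2.0)],
    3: [(2, 2.0)],
}
# Three target sites: VFM-based center, RS-based center, and VE of the RS model.
scatter = position_scatter(graph, [0, 0, 0], [1, 2, 3], [0.5, 0.6, 0.2])
med, mad = median_and_mad(scatter)
```

The third site is dropped because its VE falls below 0.35; the remaining displacements are 1 mm and 2 mm, giving a median of 1.5 mm and a MAD of 0.5 mm.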

Second, we examined differences in size for V1 ➤ V2 and V1 ➤ V3 models between RS- and VFM-based estimates. RS-based size estimates for V1 ➤ V2 and V1 ➤ V3 from all participants were grouped by map combination and compared to those obtained based on VFM using a two-sample Kolmogorov–Smirnov test (KS-test). Subsequently, we examined the relation between RS-based CF size and VFM-reference eccentricity by binning eccentricity in bins of 1° and calculating linear fits over the bin means with bootstrapped confidence intervals (1000 iterations).
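
The statistic underlying the two-sample KS comparison is the largest gap between the two empirical cumulative distributions. A minimal sketch is given below; the sample values are toy numbers, the function name is hypothetical, and the p-value computation is omitted.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over pooled values."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
               for x in a + b)

# Toy CF sizes (mm) standing in for RS-based and VFM-based estimates.
rs_sizes = [1.0, 2.0, 3.0, 4.0]
vfm_sizes = [3.0, 4.0, 5.0, 6.0]
d_stat = ks_statistic(rs_sizes, vfm_sizes)
```

Here the statistic is 0.5: half of the toy RS-based sizes lie below the smallest VFM-based size. Identical samples give 0 and non-overlapping samples give 1.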

#### **RESULTS**

#### **DERIVING CONNECTIVE FIELD MODELS BASED ON RESTING STATE fMRI DATA**

Our first analysis concerned two questions: whether CF models could be obtained in the presence of substantial physiological measurement noise, and whether the models obtained could be discriminated based on the contribution of genuine spontaneous BOLD co-fluctuations. **Figure 1** shows the distributions of VE for actual (*blue*) and surrogate (*black*) RS data. We used the VE of CFs obtained from surrogate RS data as the null distribution (240 volumes, TR: 1.5 s). The VE cutoff threshold was estimated based on the 5th percentile of the null distributions and lies around 0.35 VE for all subjects. The majority of the models have a VE that exceeds this cutoff threshold. Importantly, this analysis demonstrates that the estimation of CF models based on genuine spontaneous BOLD co-fluctuations is possible even in the presence of substantial physiological measurement noise. Nevertheless, we cannot determine the effect that these confounds exert on the estimation of CF parameters.

In addition, we examined the dependence of discrimination accuracy on the number of volumes included in the analysis. To do so, we calculated VE (adjusted for degrees of freedom) for actual and surrogate data for various numbers of volumes and applied a ROC analysis. **Figure 2** summarizes the results of the analysis for a single subject (Subject 3). First, it shows the VE distributions for actual (*black*) and surrogate data (*red*) as a function of the number of volumes included in the analysis. VE drops with the number of volumes, but drops more sharply for the surrogate data (**Figure 2A**). The resulting ROC curves are shown in **Figure 2B**; they show detection probability as a function of false alarm probability for each number of volumes. Detection probability increases with the number of volumes. **Figure 2D** shows discrimination accuracy (F1 score) as a function of the VE threshold for each number of volumes analyzed.

This analysis also indicates that CF modeling could be based on even shorter scan periods while retaining reasonable discrimination accuracy. However, fewer models would then be expected to lie above threshold. Finally, it must be noted that, even though this analysis provides a strategy to optimize modeling accuracy by adjusting the VE cutoff threshold, in the remaining analyses we use a threshold of 0.35 VE, which corresponds to the 5th percentile of the null distribution obtained after grouping the VE of surrogate RS-based models from all scans and subjects.

#### **SPATIAL ASPECTS OF RESTING STATE CONNECTIVE FIELD MAP ESTIMATION**

The next question we address is whether the topographical maps based on RS data have similar characteristics to those based on VFM data (our current reference), and how variable the results are between RS scans. To provide an impression of this variability, **Figure 3** shows both VFM- and RS-derived CF maps for a single participant (maps for other participants are shown in the Supplementary Material). V2 and V3 CF parameter maps (V1-referred) are plotted on a smoothed 3D mesh representing gray matter along the cortical surface. Eccentricity, polar angle, and size (σ) are plotted in three columns. In the top row of panels, CF parameters estimated based on VFM data are shown. These maps serve as our reference. In the lower rows of panels, the same parameters are plotted for all RS scans. As shown previously (Haak et al., 2013), the VFM-derived maps show a clear retinotopic organization (note that in the context of CF modeling, eccentricity and polar angle maps are inferred from pRF mapping and associated with each recording site in the source region, in this case V1). In some RS scans, the eccentricity and polar angle maps resemble the VFM-based reference, although some variability can be observed (**Figure 3**, RS4, RS5). To quantify the variability of the individual maps, the median position displacement in CF cortical location (relative to the VFM reference and between all RS scan pairs; in mm) and the MAD were calculated for RS1 to RS5 (values are reported in the legend of **Figure 3**). These values confirm the impression that RS4 and RS5 most clearly resemble the visuotopic organization observed in the VFM-based maps (results are shown for participant 3; those for the other participants are shown in the Supplementary Material).

**Figure 4A** plots the change in V1-referred CF center position between the RS-based and the VFM-based reference position as a function of VE (of the RS model). CFs with higher VE show smaller cortical displacements. The majority of CFs (as indicated by the heat map) have a high VE and show relatively small displacements. **Figure 4B** shows a distance effect for V1 ➤ V2 (*R* = 0.90, *p* < 0.0001) but not for V1 ➤ V3 (*R* = 0.11, *p* < 0.0001). **Figure 4C** shows that there are no systematic deviations from the median cortical displacement as a function of eccentricity. **Figure 4D** shows good agreement between RS- and VFM-based eccentricities (V1 ➤ V2: *R* = 0.97, *p* < 0.0001; V1 ➤ V3: *R* = 0.70, *p* < 0.0001).

**Figure 5** shows VFM- and RS-based V1-referred CF size distributions for V2 and V3 (data grouped over all scans and participants, *N* = 4). RS-based CF sizes tend to be smaller than those estimated based on VFM data (V1 ➤ V2: *p* < 0.0001, KS-test = 0.240; V1 ➤ V3: *p* < 0.0001, KS-test = 0.0001). Moreover, we cannot confirm a difference in RS-based CF size estimates between V1 ➤ V2 and V1 ➤ V3 (*p* = 0.0065, KS-test = 0.015).

**Figure 6** plots the relationship between CF size and eccentricity for VFM- and RS-based estimates. The left panel shows that VFM-based CF size estimates for V1 ➤ V2 do not increase significantly with eccentricity (*black line*), whereas those for V1 ➤ V3 do (*yellow line*). The right panel shows that RS-based CF size estimates for V2 (*black line*) and V3 (*yellow line*) do not increase significantly with eccentricity.

Together, the analyses shown in **Figures 5** and **6** show that RS-based CF size estimates are smaller than those estimated based on VFM. In RS, CF size does not appear to increase with eccentricity, nor along the visual hierarchy.

### **DISCUSSION**

#### **CONNECTIVE FIELD MODELS CAN BE ESTIMATED BASED ON RESTING STATE DATA**

We have shown that connective field (CF) modeling can be based on resting state (RS) data. This indicates that spontaneous blood-oxygen level dependent (BOLD) co-fluctuations in the early visual cortex preserve fine-grained topographic connectivity structure. While this preservation of topographic connectivity corroborates results of previous studies (Heinzle et al., 2011; Raemaekers et al., 2013), our study goes beyond these by examining both the topography and the spatial properties of the functional connections. In order to assess the statistical significance of our CF estimates, we determined a variance explained (VE) cutoff threshold taking into account the VE of CF models based on surrogate RS data (**Figure 1**). This involves disrupting the phase correlations across recording sites in the source region of interest (ROI) in order to destroy the local structure of BOLD co-fluctuations. Furthermore, we examined the dependence of discrimination accuracy on the amount of data and found six minutes of scanning (240 volumes using a TR of 1.5 s at 7T) to be more than sufficient to achieve good discrimination (**Figure 2**).

#### **AGREEMENT BETWEEN RESTING STATE AND VISUAL FIELD MAPPING BASED CONNECTIVE FIELD PARAMETERS**

Although data obtained during RS provide different information than data obtained during stimulation, a comparison of the maps estimated from RS to those estimated based on visual field mapping (VFM) reveals a fairly close agreement between the two (**Figure 3**). Some RS maps show patterns of visuotopic organization that agree well with their VFM reference (**Figure 3**, RS4, RS5). Nevertheless, we observed substantial variability in CF model parameters for different RS scans. We quantified the degree of agreement by measuring CF position scatter as the cortical displacement between RS- and VFM-based CF cortical positions and show that the median cortical displacement reflects the agreement observed in **Figure 3** (data for other subjects are shown in the Supplementary Material). Besides the observed variability in visuotopic organization, CF size estimates obtained for RS scans were generally smaller than those obtained for VFM (**Figure 5**). Moreover, contrary to estimates based on VFM, RS-based CF size did not increase with eccentricity, nor along the visual hierarchy (**Figure 6**).

**FIGURE 3 |** CF models from subject 3. From left to right: eccentricity, polar angle, and size. **Top panel** corresponds to visual field mapping (VFM)-based estimates. **Lower panels** show parameter estimates for each resting state (RS) scan. For V1 ➤ V2 CF models, the position displacement in CF cortical location (in mm) between VFM- and RS-based estimates for RS1 to RS5 is: median (MAD) = 10.0 (5.4); 8.5 (5); 5.8 (3.7); 3.8 (3.4); and 4.1 (3.0), respectively [total = 5.4 (3.9)]. Corresponding position displacement values between RS4 and RS5 (the RS scans with lowest displacement): 4.1 (3.1); between RS1 and RS2 (the RS scans with highest displacement): 8.5 (5.8); between RS1 and RS4: 10.5 (6.6); when grouping results for all RS scan pairs: 8.6 (5.9). For V1 ➤ V3 CF models, the corresponding values are: 13.6 (6.3); 14.4 (6.8); 7.9 (5.4); 6.7 (5.5); and 7.1 (4.2) [total = 8.7 (5.5)]. Eccentricity and polar angle are inferred from V1 pRF mapping (see Materials and Methods for details). Data are for V1 ➤ V2 and V1 ➤ V3 models estimated for subject 3 (data for other subjects are included in the Supplementary Material). A threshold of 0.35 VE was applied. Median cortical displacements reflect the agreement between RS and VFM maps and between different RS maps.

#### **SPATIAL CHANGES: POSSIBLE MECHANISMS**

In the absence of visual input, changes in CF size and variability in CF position may reflect a reduction in the amount of spatial integration and selectivity, respectively. Possible mechanisms underlying these changes in CFs may involve temporal restructuring of corticothalamic network activity in a state-dependent way (Mastronarde, 1989; Wörgötter et al., 1998; Andolina et al., 2007; Britz and Michel, 2011), as well as intracortical processing mediated by horizontal connections and feedback signals from higher cortical stages (Rao and Ballard, 1999; Steriade, 2000; Llinás and Steriade, 2006; Botelho et al., 2014; Schmid and Keliris, 2014). Decreased corticothalamic feedback and cortical lateral inhibition in the absence of visual input likely play a role in the shrinkage of CFs, as well as in the reduced visuotopic organization observed in the higher-scatter CF maps. These input changes might adjust the balance between excitation and inhibition in cortical neuronal populations that eventually shapes cortico-cortical connectivity as a function of stimulation, behavioral context, and physiological state (Kosslyn et al., 1995; Lehmann et al., 1998; Rao and Ballard, 1999; Steriade, 2000; Martínez-Trujillo and Treue, 2004; Slotnick et al., 2005; Womelsdorf et al., 2008; Greenberg et al., 2012; Haak et al., 2012). During resting state, a variety of ongoing processes may modulate connectivity between visual areas. In particular, the transitional period from wakefulness to sleep leads to a progressive inhibition of synaptic transmission through thalamic relay neurons (Steriade, 2000; Llinás and Steriade, 2006), which is another possible cause of the changes observed.

**FIGURE 4 | Position scatter for V1-referred connective fields for a single subject. (A)** Joint histogram of cortical displacement in V1-referred CF centers as a function of adjusted VE. The goodness of fit tends to decrease with larger displacements (colorbar depicts frequency of voxels after grouping data from all RS scans; the number of voxels that entered the analysis is 1622 for V1 ➤ V2 and 1467 for V1 ➤ V3). **(B)** Position scatter as a function of the distance from the target voxel. A cortical distance effect can be seen in V1 ➤ V2 (*R* = 0.90, *p* < 0.0001) but not in V1 ➤ V3 (*R* = 0.11, *p* < 0.0001). **(C)** No systematic deviations from the median distance are observed for eccentricity. Data were binned in eccentricity bins of 0.25°; points represent the median of each bin and error bars the median absolute deviation for the corresponding bin. **(D)** There is good agreement between RS-based eccentricity and VFM reference eccentricity (V1 ➤ V2: *R* = 0.97, *p* < 0.0001; V1 ➤ V3: *R* = 0.70, *p* < 0.0001). Data were binned in eccentricity bins of 0.25°; points represent the median of each bin and error bars the median absolute deviation for the corresponding bin. Data are from subject 3. A cutoff threshold of 0.5 VE (F1 score ∼0.85) was applied in **(B–D)**.

**FIGURE 6 | Relation between eccentricity and V1-referred connective field size in visual areas V2 (black) and V3 (yellow) grouped over participants (***N* **= 4).** Resting state based size estimates do not increase with eccentricity. Eccentricity was binned in intervals of 1°. Dots indicate the mean of VE-weighted CF size for each bin. Linear fits were calculated for these means. Dashed lines correspond to the 95% bootstrap confidence interval of the linear fit (1000 iterations). A cutoff threshold of 0.35 VE was applied.

Another reason to speculate that there may be differences between the RS and VFM results is related to the origin of the BOLD signal. Given that the majority of the brain's energy budget is devoted to ongoing intrinsic activity (i.e., RS), the metabolic costs of the adjustment between excitation and inhibition may be reflected in the BOLD signal. The relative contribution of excitation and inhibition to the BOLD signal changes between the RS and VFM scans. Inhibitory functions, which may be supported more by oxidative mechanisms than by excitatory signaling, may contribute less to the measured BOLD signal (Buzsaki et al., 2007). As a consequence, resting state BOLD co-fluctuations may provide a different picture of the neural connections.

#### **LIMITATIONS AND FUTURE DIRECTIONS**

The current study assesses CF properties in four healthy participants. Even though the results are consistent between participants, further studies involving more participants are advised. Moreover, the CF models were estimated based on entire RS scans. As such, they only estimate average CF properties and do not capture temporal variations in these. To establish the possible neural mechanisms underlying the observed changes in CF properties, further research is still necessary. In its current implementation, the present method cannot determine the precise factors that contribute to this variability. Large-scale network interactions, physiological processes, and measurement noise might all influence the variability observed. It is important to note, however, that biologically inspired methods like pRF and CF modeling that emphasize local connectivity are generally robust to global effects like physiological noise.

In future studies, extending the present analysis with dynamic functional connectivity metrics (Sakoğlu et al., 2010; Kiviniemi et al., 2011; Allen et al., 2012; Hutchison et al., 2013b) might help to disclose relevant temporal and spatial repertoires in various experimental conditions, allowing the study of phenomena that unfold over time, such as attention, contextual modulation, and object recognition. Adding independent measures of neural activity like electroencephalography (Yuan et al., 2012) or other neurophysiological recordings seems a promising path to capture relevant temporal variations in neural activity. Future analyses could also take into account simultaneously recorded physiological data and draining veins in the preprocessing of the data, as these are known to influence resting state functional connectivity estimates (Birn et al., 2001; Logothetis et al., 2009; Winawer et al., 2010; Heinzle et al., 2011; Haak et al., 2013). Lastly, it should be noted that some of the possible mechanisms underlying changes in CF properties are based on animal models (Wörgötter et al., 1998; Steriade, 2000; Haupt et al., 2004; Llinás and Steriade, 2006; Andolina et al., 2007; Womelsdorf et al., 2008). Because certain experimental manipulations are not possible in human subjects, comparative approaches between humans and animal models are needed to bridge the gap in RS-fMRI investigations (Hutchison and Everling, 2012; Mantini et al., 2012). Examining the correspondence of functional and anatomical connectivity in homologous brain architectures will help to further elucidate the mechanisms underlying neural activity.

### **CONCLUDING REMARKS**

We have shown that CF estimates can be obtained based on RS data. Good agreement can be observed between RS- and VFM-based maps, and between different RS-based maps. This implies that local functional connectivity in visual cortical areas during resting state, as measured with CF modeling, may reflect the underlying neural architecture. However, we found that CF estimates may vary between RS scans even for high VE scans. The present study cannot determine to what extent this variability is explained by genuine changes in the neural properties of the visual system or by various external sources of noise. Nevertheless, we show that neural properties such as CF maps and CF size can be derived from RS data.

#### **ACKNOWLEDGMENTS**

Nicolás Gravel was supported by the (Chilean) National Commission for Scientific and Technological Research (BECAS CHILE and millennium center for neuroscience CENEM NC10 001 F). Ben Harvey, Barbara Nordhjem, Serge O. Dumoulin, and Frans W. Cornelissen were supported by the Netherlands Organization for Scientific Research (NWO Brain and Cognition grant 433-09-233). BC-B was supported by grant ERC StG 2012-312787\_DRASTIC (awarded to A. Aleman). We would like to thank Tomas Ossandon and Daan Wesselink for their valuable suggestions for the analysis.

#### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fnins.2014.00339/abstract

#### **REFERENCES**


perception in humans. *Proc. Natl. Acad. Sci. U.S.A.* 104, 12187–12192. doi: 10.1073/pnas.0611404104


**Conflict of Interest Statement:** The Review Editor Jonathan Winawer declares that, despite having collaborated with authors Koen V. Haak, Ben Harvey, Serge O. Dumoulin, Remco Renken, and Frans W. Cornelissen two years ago, the review process was handled objectively and no conflict of interest exists. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 05 May 2014; accepted: 06 October 2014; published online: 31 October 2014. Citation: Gravel N, Harvey B, Nordhjem B, Haak KV, Dumoulin SO, Renken R, Ćurčić-Blake B and Cornelissen FW (2014) Cortical connective field estimates from resting state fMRI activity. Front. Neurosci. 8:339. doi: 10.3389/fnins.2014.00339 This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Gravel, Harvey, Nordhjem, Haak, Dumoulin, Renken, Ćurčić-Blake and Cornelissen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Interpreting functional diffusion tensor imaging

#### *Joonas Arttu Autio<sup>1,2,3</sup> and R. Edward Roberts<sup>4</sup>\**

*<sup>1</sup> Medical Research Center Oulu, Oulu University Hospital and University of Oulu, Oulu, Finland*

*<sup>2</sup> Department of Diagnostics, Faculty of Medicine, University of Oulu, Oulu, Finland*

*<sup>3</sup> Department of Diagnostic Radiology, Oulu University Hospital, Oulu, Finland*

*<sup>4</sup> Division of Brain Sciences, Academic Department of Neuro-otology, Imperial College London, Charing Cross Hospital Campus, London, UK*

*\*Correspondence: ed.roberts@imperial.ac.uk*

#### *Edited by:*

*Christopher W. Tyler, The Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*Robert F. Dougherty, Stanford University, USA*

**Keywords: functional diffusion tensor imaging, fractional anisotropy, BOLD, MRI, behavior**

#### **A commentary on**

#### **Functional diffusion tensor imaging at 3 tesla**

*by Mandl, R. C. W., Schnack, H. G., Zwiers, M. P., Kahn, R. S., and Hulshoff Pol, H. E. (2013). Front. Hum. Neurosci. 7:817. doi: 10.3389/fnhum.2013.00817*


In this issue Mandl and colleagues replicated the findings of a previous study (Mandl et al., 2008) in which they explored task-related changes in fractional anisotropy (FA) along white matter (WM) tracts using functional diffusion tensor imaging (fDTI). They report increased FA in WM of thalamocortical pathways during tactile stimulation and in the optic radiations during visual stimulation, while only minor changes in mean diffusivity (MD) and blood oxygenation level dependent (BOLD) contrast were observed. Mandl and colleagues suggest that fDTI might provide a novel window on previously inaccessible WM information transfer. These findings, in addition to a number of previous reports of changes in MD with close temporal proximity to behavioral stimuli, could have a significant impact on our understanding of brain function (Aso et al., 2009; Baslow et al., 2012). However, at the present time there has been no rigorous validation of the methodology or thorough explanation of the physiological basis for the effects (Miller et al., 2007; Jin and Kim, 2008; Yacoub et al., 2008). In this commentary we discuss the possible explanations for the functional FA observations and how future studies could begin to explore these effects.

The most likely explanation for the observed increase in FA is that it reflects changes in the BOLD fMRI signal. It is well established that neuronal activation is associated with a decrease in the transverse relaxation rate (*R*<sub>2</sub>), observed as an increase in the gray matter (GM) magnetic resonance signal (Ogawa et al., 1990). In contrast, WM BOLD activation is very rarely reported. It follows that the relative GM/WM BOLD signal ratio is very likely to *increase* during a stimulus-induced positive BOLD period, and *decrease* during the post-stimulation negative BOLD period. Since GM and WM have different FA values, a change in the relative GM/WM ratio may affect FA quantification. In contrast, since GM and WM have similar MD values, a change in the GM/WM ratio would probably not influence MD. The very small BOLD signal changes observed in this study would seem to argue against this explanation, but they could be a consequence of the method of analysis: because voxels along the entire tract length were taken into account, areas of WM proximal to GM regions at tract termination points might have been more strongly influenced by a GM BOLD effect than those in the main body of the tract.
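As a rough illustration of why a shift in the GM/WM weighting would be expected to affect FA more than MD, the sketch below evaluates the standard FA and MD definitions for the diffusivities quoted in the simulation that follows. Treating GM as perfectly isotropic and WM as axially symmetric are simplifying assumptions made here, and the function names are illustrative only.

```python
import math

def mean_diffusivity(l1, l2, l3):
    """Mean diffusivity: the average of the three tensor eigenvalues."""
    return (l1 + l2 + l3) / 3.0

def fractional_anisotropy(l1, l2, l3):
    """Standard FA definition computed from the three tensor eigenvalues."""
    md = mean_diffusivity(l1, l2, l3)
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    if den == 0.0:
        return 0.0
    return math.sqrt(1.5 * num / den)

# Diffusivities in mm^2/s, taken from the simulation parameters in the text.
# Assumption: WM is axially symmetric (one parallel, two equal radial values).
WM = (1.5e-3, 0.4e-3, 0.4e-3)
# Assumption: GM is perfectly isotropic, so all eigenvalues equal ADC_gm.
GM = (0.937e-3, 0.937e-3, 0.937e-3)

if __name__ == "__main__":
    for name, eig in (("WM", WM), ("GM", GM)):
        print(name,
              "FA = %.3f" % fractional_anisotropy(*eig),
              "MD = %.3e mm^2/s" % mean_diffusivity(*eig))
```

Under these assumptions the two tissues differ far more in FA than in MD, which is the asymmetry the argument above relies on.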

To test this hypothesis we simulated the effect that a partial volume of gray matter would have on parallel and transverse diffusivity using published parameters: relaxation rates *R*<sub>2,gm</sub> = 14.12 1/s, *R*<sub>2,gm,activation</sub> = 14.00 1/s, and *R*<sub>2,wm</sub> = 12.34 1/s, estimated from the relation Δ*R*<sub>2</sub> = −(Δ*S*/*S*)/*TE* (Donahue et al., 2006; Miller et al., 2007); *ADC* values *ADC*<sub>gm</sub> = 0.937 × 10<sup>−3</sup> mm<sup>2</sup>/s, *ADC*<sub>wm,parallel</sub> = 1.5 × 10<sup>−3</sup> mm<sup>2</sup>/s, and *ADC*<sub>wm,radial</sub> = 0.4 × 10<sup>−3</sup> mm<sup>2</sup>/s (Kiselev and Il'yasov, 2007; Qiu et al., 2008); gray matter fraction (*f*<sub>gm</sub>), white matter fraction (*f*<sub>wm</sub> = 1 − *f*<sub>gm</sub>), TE (78 ms) and *b*-value (1000 s/mm<sup>2</sup>) (Mandl et al., 2008), using the equation below:

$$\begin{split} \frac{\Delta S}{S} &= \left(\frac{S_{\text{activation}}}{S_{\text{baseline}}} - 1\right) \cdot 100\,\% \\ &= \left(\frac{f_{gm} \cdot e^{-R_{2,gm,\text{activation}} \cdot TE - ADC_{gm} \cdot b} + f_{wm} \cdot e^{-R_{2,wm} \cdot TE - ADC_{wm} \cdot b}}{f_{gm} \cdot e^{-R_{2,gm} \cdot TE - ADC_{gm} \cdot b} + f_{wm} \cdot e^{-R_{2,wm} \cdot TE - ADC_{wm} \cdot b}} - 1\right) \cdot 100\,\% \end{split}$$
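The two-compartment signal model above can be evaluated numerically with a few lines of code. The following is a minimal sketch under stated assumptions, not the authors' simulation code: it assumes that the WM diffusivity entering the equation is the parallel or radial value depending on the diffusion-encoding direction, and that the BOLD-only case corresponds to *b* = 0. The function and constant names are hypothetical.

```python
import math

# Parameters quoted in the text.
TE = 0.078               # echo time in s (78 ms)
B_VALUE = 1000.0         # b-value in s/mm^2
R2_GM = 14.12            # GM relaxation rate at baseline, 1/s
R2_GM_ACT = 14.00        # GM relaxation rate during activation, 1/s
R2_WM = 12.34            # WM relaxation rate, 1/s (assumed unchanged by the task)
ADC_GM = 0.937e-3        # mm^2/s
ADC_WM_PARALLEL = 1.5e-3 # mm^2/s
ADC_WM_RADIAL = 0.4e-3   # mm^2/s

def voxel_signal(f_gm, r2_gm, adc_wm, b):
    """Relative signal of a voxel holding a GM fraction f_gm and WM fraction 1 - f_gm."""
    f_wm = 1.0 - f_gm
    gm = f_gm * math.exp(-r2_gm * TE - ADC_GM * b)
    wm = f_wm * math.exp(-R2_WM * TE - adc_wm * b)
    return gm + wm

def percent_signal_change(f_gm, adc_wm, b=B_VALUE):
    """Delta S / S in percent between GM activation and baseline."""
    baseline = voxel_signal(f_gm, R2_GM, adc_wm, b)
    activation = voxel_signal(f_gm, R2_GM_ACT, adc_wm, b)
    return (activation / baseline - 1.0) * 100.0

if __name__ == "__main__":
    f_gm = 0.2  # 20% gray matter partial volume
    print("parallel encoding: %.3f %%" % percent_signal_change(f_gm, ADC_WM_PARALLEL))
    print("radial encoding:   %.3f %%" % percent_signal_change(f_gm, ADC_WM_RADIAL))
    # Assumption: the BOLD-only change is the same model with b = 0.
    print("b = 0 (BOLD only): %.3f %%" % percent_signal_change(f_gm, ADC_WM_PARALLEL, b=0.0))
```

The sketch returns percentage signal changes only. Converting these into changes in the fitted diffusivities or FA would require a tensor fit, which is not attempted here.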

**Figure 1** illustrates that the signal changes are substantial even with a modest 20% gray matter partial volume, with a 0.28% increase in parallel diffusivity, a 0.11% reduction in transverse diffusivity, and a BOLD change of 0.18%. This suggests that small BOLD changes could provide a physiological explanation for the changes observed. However, this possibility would still not explain the differences in observed time courses between the two stimulation types. Although changes in the GM BOLD signal would appear to be the most likely explanation, it is still unclear to what extent, and precisely how, they could affect FA measurements in central white matter pathways.

A more technical consideration is the possible effect of image noise and partial volumes on FA quantification (Basser and Jones, 2002; Rudrapatna et al., 2012). At 2.5 × 2.5 × 7 mm<sup>3</sup> resolution, it is likely that several WM voxels could be contaminated with volumes of GM, even after using standardized white matter templates. Noise in MRI acquisitions is thought to cause an overestimation of FA in both isotropic and anisotropic structures (Pierpaoli and Basser, 1996), and it is also well known that stimulation-evoked BOLD responses demonstrate substantial trial-to-trial fluctuations. Therefore, could the trial-to-trial BOLD response fluctuations impose an apparent increase in the MR noise level and cause a functional FA overestimation? Although this is a possibility, the very low BOLD signal changes indicate that it is unlikely. The specificity of the results to pathways previously associated with tactile or visual function, and the replication of prior results (Mandl et al., 2008), suggest that partial volume or noise effects cannot fully explain these findings.

A final possibility is that FA increases may reflect activity-evoked glial swelling associated with increases in extracellular potassium levels (Ransom et al., 1985). Such activity would predict an increase in Na<sup>+</sup>/K<sup>+</sup>-ATPase utilization to recover post-activation transmembrane ion gradients, which in turn might translate into changes in vascular oxygenation levels. However, the extant evidence from BOLD fMRI and PET studies does not support a metabolic explanation for the observed effects. *In vitro* studies in the rat brain, which are free from confounding vascular effects, show that massive depolarization and increases in metabolism have a minimal effect upon WM ADC quantification (Anderson et al., 1996). Thus, the lack of convincing evidence for WM activation is in line with the emerging view that WM energy consumption is predominantly dedicated to non-signaling related ATP consumption and maintenance of resting potentials (Harris and Attwell, 2012).

In order to advance the use of functional DTI, a more detailed exploration of the origin of the observed changes is vital. Describing even the basic WM, GM, and CSF model, with contributions from blood and *R*<sub>2</sub> excluded, requires 18 separate parameters (Basser and Jones, 2002). This level of complexity places significant limitations on the interpretation of a functional FA change. We therefore recommend caution when interpreting the origin of fDTI signals, as at the present time the picture is far from clear. Future investigations should: (1) exclude activated BOLD voxels from FA analyses to ameliorate the impact of possible BOLD or noise effects and (2) investigate the effect of hypercapnia on FA quantification in humans, since this is not associated with a substantial increase in neuronal information processing. Such experiments may help disentangle the impact of vascular effects upon functional FA quantification and extend our understanding of signal changes in WM using fDTI.

#### **ACKNOWLEDGMENTS**

This work was partly funded by the UK Medical Research Council.

# **REFERENCES**

Anderson, A. W., Zhong, J., Petroff, O. A. C., Szafer, A., Ransom, B. R., Prichard, J. W., et al. (1996). Effects of osmotically driven cell volume changes on diffusion-weighted imaging of the rat optic nerve. *Magn. Reson. Med.* 35, 162–167. doi: 10.1002/mrm.1910350206


FMRI at high b value. *Proc. Natl. Acad. Sci. U.S.A.* 104, 20967–20972. doi: 10.1073/pnas.0707257105


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 December 2013; accepted: 21 March 2014; published online: 11 April 2014.*

*Citation: Autio JA and Roberts RE (2014) Interpreting functional diffusion tensor imaging. Front. Neurosci. 8:68. doi: 10.3389/fnins.2014.00068*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Autio and Roberts. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Do we measure gray matter activation with functional diffusion tensor imaging?

#### *René C. W. Mandl<sup>1</sup>\*, Hugo G. Schnack<sup>1</sup>, Marcel P. Zwiers<sup>2</sup>, René S. Kahn<sup>1</sup> and Hilleke E. Hulshoff Pol<sup>1</sup>*

*<sup>1</sup> Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, Netherlands*

*<sup>2</sup> Centre for Cognitive Neuroimaging, Donders Institute for Brain, Cognition, and Behaviour, Radboud University Nijmegen, Nijmegen, Netherlands*

*\*Correspondence: r.mandl@umcutrecht.nl*

#### *Edited by:*

*Christopher W. Tyler, The Smith-Kettlewell Eye Research Institute, USA*

#### *Reviewed by:*

*R. Edward Roberts, Imperial College London, UK*

**Keywords: white matter activation, BOLD signal, partial volume effects, signal leaking, fiber tracts**

#### **A commentary on**

#### **Interpreting functional diffusion tensor imaging**

*by Autio, J. A., and Roberts, R. E. (2014). Front. Neurosci. 8:68. doi: 10.3389/fnins.2014.00068*

In their comment on our second fDTI article "Functional Diffusion Tensor Imaging at 3 Tesla" (Mandl et al., 2013), Autio and Roberts (2014) suggest that BOLD signal originating from gray matter could in part explain the reported task-related FA changes in white matter. The rationale is that the relative contribution from activated gray matter to the measured signal increases in voxels containing both gray and white matter. Because the ADC value for gray matter lies between the parallel and perpendicular ADC values for white matter, this increased contribution could effectively lead to an increase in the measured parallel ADC and a decrease in the measured perpendicular ADC, and hence an increase in FA. Indeed, contamination by signal "leaking" from gray matter into white matter has been one of our major concerns in both our fDTI articles, together with the effects of motion.

However, we think that the mechanism proposed by Autio and Roberts does not contribute to the task-related FA changes we reported. First, the use of the non-parametric sign test in our first fDTI paper (Mandl et al., 2008) ensures that a few voxels alone (e.g., the end points of the tract touching active gray matter) cannot result in activation of a complete tract. Second, the global shift of the histograms presented in Figure 5 of Mandl et al. (2008) shows that a large part of the white matter voxels in the active tracts contribute to the measured task-related FA change. Of course this in itself does not rule out the proposed mechanism, because the active tracts could be (for a large part) surrounded by active gray matter voxels. This may for instance be the case for the optic radiations. These tracts are relatively short and are for a large part adjacent to (possibly active) gray matter voxels. However, this certainly is not the case for the active thalamo-cortical tracts, as can be seen in the supplementary movie (Mandl et al., 2008, Movie S1). This movie shows the combined fDTI and BOLD fMRI results for the tactile experiment in a single subject (subject nr 5). It can be readily seen that the hypothesized partial voluming with possibly active gray matter could only occur at the endpoints of the fiber bundle. Third, in the second fDTI paper (Mandl et al., 2013) we introduced a time lag between the stimulus and the start of the acquisition of an fDTI volume to make the measurement less sensitive to relatively fast varying signal changes (e.g., BOLD-related signal changes). Still, similar effects were reported for the tactile experiment.

Taken together, we conclude that, although the mechanism hypothesized by Autio and Roberts is intriguing and more experiments are needed to gain better insight into the underlying mechanisms, it cannot explain our measured task-related changes in FA in functional diffusion tensor imaging.

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 April 2014; accepted: 08 May 2014; published online: 27 May 2014.*

*Citation: Mandl RCW, Schnack HG, Zwiers MP, Kahn RS and Hulshoff Pol HE (2014) Do we measure gray matter activation with functional diffusion tensor imaging? Front. Neurosci. 8:126. doi: 10.3389/fnins.2014.00126*

*This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience.*

*Copyright © 2014 Mandl, Schnack, Zwiers, Kahn and Hulshoff Pol. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*